Search results for "informatics"
showing 10 items of 2542 documents
FASTdoop: A versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications
2017
Summary: MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need for efficiently managing sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers…
Are the Myokines the Mediators of Physical Activity-Induced Health Benefits?
2016
BACKGROUND: The concept of the muscle as a secretory organ, developed during the last decades, partially answers the question of how the crosstalk between skeletal muscle and distant tissues occurs. The beneficial effects of exercise transcend simply improved skeletal muscle functionality: systemic responses to exercise have been observed in distal organs like heart, kidney, brain and liver. Increasing data have accumulated regarding the synthesis, the kinetics of release and the biological roles of muscular cytokines, now called myokines. The most recent techniques have meaningfully improved the identification of the muscle cell secretome, but several issues regarding the extent of se…
The colored longest common prefix array computed via sequential scans
2018
Due to the increased availability of large datasets of biological sequences, the tools for sequence comparison are now relying on efficient alignment-free approaches to a greater extent. Most of the alignment-free approaches require the computation of statistics of the sequences in the dataset. Such computations become impractical in internal memory when very large collections of long sequences are considered. In this paper, we present a new conceptual data structure, the colored longest common prefix array (cLCP), that makes it possible to efficiently tackle several problems with an alignment-free approach. In fact, we show that such a data structure can be computed via sequential scans in semi-exter…
Q-nexus: a comprehensive and efficient analysis pipeline designed for ChIP-nexus
2016
Background: ChIP-nexus, an extension of the ChIP-exo protocol, can be used to map the borders of protein-bound DNA sequences at nucleotide resolution, requires less input DNA and enables selective PCR duplicate removal using random barcodes. However, the use of random barcodes requires additional preprocessing of the mapping data, which complicates the computational analysis. To date, only a very limited number of software packages are available for the analysis of ChIP-exo data, which have not yet been systematically tested and compared on ChIP-nexus data. Results: Here, we present a comprehensive software package for ChIP-nexus data that exploits the random barcodes for selective removal …
Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions.
2020
Genetic diseases are driven by aberrations of the human genome. Identification of such aberrations including structural variations (SVs) is key to our understanding. Conventional short-reads whole genome sequencing (cWGS) can identify SVs to base-pair resolution, but utilizes only short-range information and suffers from high false discovery rate (FDR). Linked-reads sequencing (10XWGS) utilizes long-range information by linkage of short-reads originating from the same large DNA molecule. This can mitigate alignment-based artefacts especially in repetitive regions and should enable better prediction of SVs. However, an unbiased evaluation of this technology is not available. In this study, w…
Skeletal Dysplasia Mutations Effect on Human Filamins’ Structure and Mechanosensing
2016
Cells’ ability to sense mechanical cues in their environment is crucial for fundamental cellular processes, leading defects in mechanosensing to be linked to many diseases. The actin cross-linking protein Filamin has an important role in the conversion of mechanical forces into biochemical signals. Here, we reveal how mutations in Filamin genes known to cause Larsen syndrome and Frontometaphyseal dysplasia can affect the structure and therefore function of Filamin domains 16 and 17. Employing X-ray crystallography, the structure of these domains was first solved for the human Filamin B. The interaction seen between domains 16 and 17 is broken by shear force as revealed by steered mo…
Lipoproteins in atherosclerosis process
2019
Background: Dyslipidaemia is a recognized risk factor for atherosclerosis; however, new evidence brought to light by trials investigating therapies to enhance HDL-cholesterol has suggested an increased atherosclerotic risk when HDL-C is high. Results: Several studies highlight the central role in atherosclerotic disease of dysfunctional lipoproteins; oxidised LDL-cholesterol is an important feature, according to the “oxidation hypothesis”, of atherosclerotic lesions; however, there is today a growing interest in dysfunctional HDL-cholesterol. The aim of our paper is to review the functions of modified and dysfunctional lipoproteins in atherogenesis. Conclusion: Taking into account the central ro…
2016
We determine knotting probabilities and typical sizes of knots in double-stranded DNA for chains of up to half a million base pairs with computer simulations of a coarse-grained bead-stick model: Single trefoil knots and composite knots which include at least one trefoil as a prime factor are shown to be common in DNA chains exceeding 250,000 base pairs, assuming physiologically relevant salt conditions. The analysis is motivated by the emergence of DNA nanopore sequencing technology, as knots are a potential cause of erroneous nucleotide reads in nanopore sequencing devices and may severely limit read lengths in the foreseeable future. Even though our coarse-grained model is only based on …
Retrospective Proteomic Screening of 100 Breast Cancer Tissues.
2017
The present investigation has been conducted on one hundred tissue fragments of breast cancer, collected and immediately cryopreserved following the surgical resection. The specimens were selected from patients with invasive ductal carcinoma of the breast, the most frequent and potentially aggressive type of mammary cancer, with the objective of increasing the knowledge of breast cancer molecular markers potentially useful for clinical applications. The proteomic screening, by 2D-IPG and mass spectrometry, allowed us to identify two main classes of protein clusters: proteins expressed ubiquitously at high levels in all patients, and proteins expressed sporadically among the same patients. Wit…
MiasDB: A Database of Molecular Interactions Associated with Alternative Splicing of Human Pre-mRNAs.
2016
Alternative splicing (AS) is pervasive in human multi-exon genes and is a major contributor to expansion of the transcriptome and proteome diversity. The accurate recognition of alternative splice sites is regulated by information contained in networks of protein-protein and protein-RNA interactions. However, the mechanisms leading to splice site selection are not fully understood. Although numerous databases have been built to describe AS, molecular interaction databases associated with AS have only recently emerged. In this study, we present a new database, MiasDB, that provides a description of molecular interactions associated with human AS events. This database covers 938 interactions …