Search results for "Data"
showing 10 items of 12992 documents
Block Sorting-Based Transformations on Words: Beyond the Magic BWT
2018
The Burrows-Wheeler Transform (BWT) is a word transformation introduced in 1994 for Data Compression, and later results have helped make it a fundamental tool for the design of self-indexing compressed data structures. The Alternating Burrows-Wheeler Transform (ABWT) is a more recent transformation, studied in the context of Combinatorics on Words, that works in a similar way, using an alternating lexicographical order instead of the usual one. In this paper we study a more general class of block sorting-based transformations. The transformations in this new class prove to be interesting combinatorial tools that offer new research perspectives. In particular, we show that all the tra…
A multilevel statistical toolkit to study animal social networks: Animal Network Toolkit (ANT) R package
2018
Abstract. How animals interact and develop social relationships in response to individual attributes and to sociodemographic and ecological pressures is of great interest. New methodologies, in particular Social Network Analysis, allow us to address these types of questions. However, the number of methodologies developed to that end and the speed at which they emerge make their use difficult. Moreover, the lack of communication between the different software packages developed to answer the same or different research questions is a source of confusion. The R package Animal Network Toolkit (ANT) was developed with the aim of implementing in one package the many different social network analysis tech…
Combining multiple hypothesis testing with machine learning increases the statistical power of genome-wide association studies
2016
Mieth, Bettina et al.
SpaceScanner: COPASI wrapper for automated management of global stochastic optimization experiments
2017
Abstract. Motivation: Due to their universal applicability, global stochastic optimization methods are popular for designing improvements of biochemical networks. The drawbacks of global stochastic optimization methods are: (i) no guarantee of finding global optima, (ii) no clear optimization run termination criteria and (iii) no criteria to detect stagnation of an optimization run. The impact of these drawbacks can be partly compensated by manual work that becomes inefficient when the solution space is large due to combinatorial explosion of adjustable parameters or for other reasons. Results: SpaceScanner uses parallel optimization runs for automatic termination of optimization tasks in case…
Partitioned learning of deep Boltzmann machines for SNP data.
2016
Abstract. Motivation: Learning the joint distributions of measurements, and in particular identification of an appropriate low-dimensional manifold, has been found to be a powerful ingredient of deep learning approaches. Yet, such approaches have hardly been applied to single nucleotide polymorphism (SNP) data, probably due to the high number of features typically exceeding the number of studied individuals. Results: After a brief overview of how deep Boltzmann machines (DBMs), a deep learning approach, can be adapted to SNP data in principle, we specifically present a way to alleviate the dimensionality problem by partitioned learning. We propose a sparse regression approach to coarsely screen…
Reactome diagram viewer: data structures and strategies to boost performance
2017
Abstract. Motivation: Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. For web-based pathway visualization, Reactome uses a custom pathway diagram viewer that has evolved over the past years. Here, we present comprehensive enhancements in usability and performance based on extensive usability testing sessions and technology developments, aiming to optimize the viewer towards the needs of the community. Results: The pathway diagram viewer version 3 achieves consistently better performance, loading and rendering 97% of the diagrams in Reactome in less than 1 s. Combining the multi-layer HTML5 canvas strategy with a space partit…
Small RNA-seq analysis of circulating miRNAs to identify phenotypic variability in Friedreich's ataxia patients.
2018
Abstract. Friedreich’s ataxia (FRDA; OMIM 229300), an autosomal recessive neurodegenerative mitochondrial disease, is the most prevalent hereditary ataxia. FRDA patients also show non-neurological features such as scoliosis, diabetes, and cardiac complications. Hypertrophic cardiomyopathy, which is found in two thirds of patients at the time of diagnosis, is the primary cause of death in these patients. Here, we used small RNA-seq of microRNAs (miRNAs) purified from plasma samples of FRDA patients and controls. Furthermore, we present the rationale, experimental methodology, and analytical procedures for dataset analysis. This dataset will facilitate the identificatio…
ParDRe: faster parallel duplicated reads removal tool for sequencing studies
2016
This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record [insert complete citation information here] is available online at: https://doi.org/10.1093/bioinformatics/btw038 [Abstract] Summary: Current next generation sequencing technologies often generate duplicated or near-duplicated reads that (depending on the application scenario) do not provide any interesting biological information but can increase memory requirements and computational time of downstream analysis. In this work we present ParDRe, a de novo parallel tool to remove duplicated and near-duplicated reads through the clustering of S…
Gene-based and semantic structure of the Gene Ontology as a complex network
2012
The last decade has seen the advent and consolidation of ontology-based tools for the identification and biological interpretation of classes of genes, such as the Gene Ontology. The information accumulated over time and included in the GO is encoded in the definition of terms and in the setting up of semantic relations amongst terms. This approach might be usefully complemented by a bottom-up approach based on the knowledge of relationships amongst genes. To this end, we investigate the Gene Ontology from a complex network perspective. We consider the semantic network of terms naturally associated with the semantic relationships provided by the Gene Ontology consortium and a gene-based …
L1-Penalized Censored Gaussian Graphical Model
2018
Graphical lasso is one of the most widely used estimators for inferring genetic networks. Despite its widespread use, there are several fields in applied research where the limits of detection of modern measurement technologies make the use of this estimator theoretically unfounded, even when the assumption of a multivariate Gaussian distribution is satisfied. Typical examples are data generated by polymerase chain reactions and flow cytometers. The combination of censoring and high dimensionality makes inference of the underlying genetic networks from these data very challenging. In this article, we propose an $\ell_1$-penalized Gaussian graphical model for censored data and derive two EM-like algorithm…