Search results for "Probability."
showing 10 items of 3396 documents
SNVSniffer: an integrated caller for germline and somatic single-nucleotide and indel mutations
2016
Various approaches to calling single-nucleotide variants (SNVs) or insertion-or-deletion (indel) mutations have been developed based on next-generation sequencing (NGS). However, most of them are dedicated to a particular type of mutation, e.g. germline SNVs in normal cells, somatic SNVs in cancer/tumor cells, or indels only. In the literature, efficient and integrated callers for both germline and somatic SNVs/indels have not yet been extensively investigated. We present SNVSniffer, an efficient and integrated caller identifying both germline and somatic SNVs/indels from NGS data. In this algorithm, we propose the use of Bayesian probabilistic models to identify SNVs and investigate a mult…
Assessing statistical significance in multivariable genome wide association analysis
2016
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whe…
Two-Stage Bayesian Approach for GWAS With Known Genealogy
2019
Genome-wide association studies (GWAS) aim to assess relationships between single nucleotide polymorphisms (SNPs) and diseases. They are one of the most popular problems in genetics, and have some peculiarities given the large number of SNPs compared to the number of subjects in the study. Individuals might not be independent, especially in animal breeding studies or genetic diseases in isolated populations with highly inbred individuals. We propose a family-based GWAS model in a two-stage approach comprising a dimension reduction and a subsequent model selection. The first stage, in which the genetic relatedness between the subjects is taken into account, selects the promising SNPs. The se…
Stagewise pseudo-value regression for time-varying effects on the cumulative incidence
2015
In a competing risks setting, the cumulative incidence of an event of interest describes the absolute risk for this event as a function of time. For regression analysis, one can either choose to model all competing events by separate cause-specific hazard models or directly model the association between covariates and the cumulative incidence of one of the events. With a suitable link function, direct regression models allow for a straightforward interpretation of covariate effects on the cumulative incidence. In practice, where data can be right-censored, these regression models are implemented using a pseudo-value approach. For a grid of time points, the possibly unobserved binary event s…
Protein-protein interactions can be predicted using coiled coil co-evolution patterns
2016
AbstractProtein-protein interactions are sometimes mediated by coiled coil structures. The evolutionary conservation of interacting orthologs in different species, along with the presence or absence of coiled coils in them, may help in the prediction of interacting pairs. Here, we illustrate how the presence of coiled coils in a protein can be exploited as a potential indicator for its interaction with another protein with coiled coils. The prediction capability of our strategy improves when restricting our dataset to highly reliable, known protein-protein interactions. Our study of the co-evolution of coiled coils demonstrates that pairs of interacting proteins can be distinguished from no…
Evidence for the implication of the histone code in building the genome structure
2018
International audience; Histones are punctuated with small chemical modifications that alter their interaction with DNA. One attractive hypothesis stipulates that certain combinations of these histone modifications may function, alone or together, as a part of a predictive histone code to provide ground rules for chromatin folding. We consider four features that relate histone modifications to chromatin folding: charge neutralisation, molecular specificity, robustness and evolvability. Next, we present evidence for the association among different histone modifications at various levels of chromatin organisation and show how these relationships relate to function such as transcription, repli…
SpaceScanner: COPASI wrapper for automated management of global stochastic optimization experiments
2017
Abstract Motivation Due to their universal applicability, global stochastic optimization methods are popular for designing improvements of biochemical networks. The drawbacks of global stochastic optimization methods are: (i) no guarantee of finding global optima, (ii) no clear optimization run termination criteria and (iii) no criteria to detect stagnation of an optimization run. The impact of these drawbacks can be partly compensated by manual work that becomes inefficient when the solution space is large due to combinatorial explosion of adjustable parameters or for other reasons. Results SpaceScanner uses parallel optimization runs for automatic termination of optimization tasks in case…
Partitioned learning of deep Boltzmann machines for SNP data.
2016
Abstract Motivation Learning the joint distributions of measurements, and in particular identification of an appropriate low-dimensional manifold, has been found to be a powerful ingredient of deep leaning approaches. Yet, such approaches have hardly been applied to single nucleotide polymorphism (SNP) data, probably due to the high number of features typically exceeding the number of studied individuals. Results After a brief overview of how deep Boltzmann machines (DBMs), a deep learning approach, can be adapted to SNP data in principle, we specifically present a way to alleviate the dimensionality problem by partitioned learning. We propose a sparse regression approach to coarsely screen…
FLYCOP: metabolic modeling-based analysis and engineering microbial communities
2018
10 p.-5 fig.-2 tab.
MetaCache: context-aware classification of metagenomic reads using minhashing.
2017
Abstract Motivation Metagenomic shotgun sequencing studies are becoming increasingly popular with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes corresponding software tools suffer from either long runtimes, large memory requirements or low accuracy. Results We introduce MetaCache—a novel software for read classification using the big data technique minhashing. Our…