Search results for "Probability."

showing 10 items of 3396 documents

SNVSniffer: an integrated caller for germline and somatic single-nucleotide and indel mutations

2016

Various approaches to calling single-nucleotide variants (SNVs) or insertion-or-deletion (indel) mutations have been developed based on next-generation sequencing (NGS). However, most of them are dedicated to a particular type of mutation, e.g. germline SNVs in normal cells, somatic SNVs in cancer/tumor cells, or indels only. In the literature, efficient and integrated callers for both germline and somatic SNVs/indels have not yet been extensively investigated. We present SNVSniffer, an efficient and integrated caller identifying both germline and somatic SNVs/indels from NGS data. In this algorithm, we propose the use of Bayesian probabilistic models to identify SNVs and investigate a mult…

0301 basic medicine; Somatic cell; Bayesian probability; Biology; Polymorphism Single Nucleotide; Germline; 03 medical and health sciences; Gene Frequency; INDEL Mutation; Structural Biology; Modelling and Simulation; Indel calling; Genetic variation; Humans; Allele; Indel; Molecular Biology; Ovarian Neoplasms; Genetics; Research; Applied Mathematics; Computational Biology; High-Throughput Nucleotide Sequencing; SNP calling; Somatic SNV calling; Cystadenocarcinoma Serous; Computer Science Applications; Germ Cells; 030104 developmental biology; Bayesian model; Modeling and Simulation; Mutation (genetic algorithm); Female; Multinomial distribution; Algorithms; BMC Systems Biology
researchProduct

Assessing statistical significance in multivariable genome wide association analysis

2016

Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whe…

0301 basic medicine; Statistics and Probability; 1303 Biochemistry; Genotype; Operations research; Library science; Polymorphism Single Nucleotide; Biochemistry; German; 03 medical and health sciences; 10007 Department of Economics; Political science; Genome-Wide Association Analysis; 1312 Molecular Biology; 1706 Computer Science Applications; Cluster Analysis; Humans; Computer Simulation; 2613 Statistics and Probability; Molecular Biology; European research; Genetics and Population Analysis; Computational Biology; Reproducibility of Results; Original Papers; language.human_language; Computer Science Applications; 330 Economics; Computational Mathematics; Phenotype; 030104 developmental biology; Computational Theory and Mathematics; Linear Models; language; 2605 Computational Mathematics; Genome-Wide Association Study; 1703 Computational Theory and Mathematics
researchProduct

Two-Stage Bayesian Approach for GWAS With Known Genealogy

2019

Genome-wide association studies (GWAS) aim to assess relationships between single nucleotide polymorphisms (SNPs) and diseases. They are one of the most popular problems in genetics, and have some peculiarities given the large number of SNPs compared to the number of subjects in the study. Individuals might not be independent, especially in animal breeding studies or genetic diseases in isolated populations with highly inbred individuals. We propose a family-based GWAS model in a two-stage approach comprising a dimension reduction and a subsequent model selection. The first stage, in which the genetic relatedness between the subjects is taken into account, selects the promising SNPs. The se…

0301 basic medicine; Statistics and Probability; Bayesian probability; Population; Single-nucleotide polymorphism; Genome-wide association study; Computational biology; Estadística; Biology; Kinship coefficient; Model selection; 01 natural sciences; Beta-thalassemia; 010104 statistics & probability; 03 medical and health sciences; Beta-thalassemia disorder; Models; Robust prior distribution; Regularization; Discrete Mathematics and Combinatorics; 0101 mathematics; Stage (cooking); Genetic association; Genome-wide association; Model selection; Variable-selection; Probability and statistics; Bayes factor; Regression; Bayes factor; 030104 developmental biology; Phenotype; Statistics Probability and Uncertainty; Gaussian Markov random field
researchProduct

Stagewise pseudo-value regression for time-varying effects on the cumulative incidence

2015

In a competing risks setting, the cumulative incidence of an event of interest describes the absolute risk for this event as a function of time. For regression analysis, one can either choose to model all competing events by separate cause-specific hazard models or directly model the association between covariates and the cumulative incidence of one of the events. With a suitable link function, direct regression models allow for a straightforward interpretation of covariate effects on the cumulative incidence. In practice, where data can be right-censored, these regression models are implemented using a pseudo-value approach. For a grid of time points, the possibly unobserved binary event s…

0301 basic medicine; Statistics and Probability; Carcinoma Hepatocellular; Time Factors; Epidemiology; Computer science; Feature selection; Biostatistics; 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; Risk Factors; Statistics; Covariate; Econometrics; Humans; Computer Simulation; Cumulative incidence; Registries; 0101 mathematics; Event (probability theory); Models Statistical; Incidence; Liver Neoplasms; Absolute risk reduction; Regression analysis; Regression; 030104 developmental biology; Regression Analysis; Jackknife resampling; Algorithms; Statistics in Medicine
researchProduct

Protein-protein interactions can be predicted using coiled coil co-evolution patterns

2016

Protein-protein interactions are sometimes mediated by coiled coil structures. The evolutionary conservation of interacting orthologs in different species, along with the presence or absence of coiled coils in them, may help in the prediction of interacting pairs. Here, we illustrate how the presence of coiled coils in a protein can be exploited as a potential indicator for its interaction with another protein with coiled coils. The prediction capability of our strategy improves when restricting our dataset to highly reliable, known protein-protein interactions. Our study of the co-evolution of coiled coils demonstrates that pairs of interacting proteins can be distinguished from no…

0301 basic medicine; Statistics and Probability; Computational biology; Correlated evolution; General Biochemistry Genetics and Molecular Biology; Protein Structure Secondary; Protein–protein interaction; Conserved sequence; Evolution Molecular; 03 medical and health sciences; Protein-protein interaction; Modelling and Simulation; Immunology and Microbiology(all); Coiled coil; Genetics; Coiled coil; Physics; Medicine(all); 030102 biochemistry & molecular biology; General Immunology and Microbiology; Agricultural and Biological Sciences(all); Models Genetic; Biochemistry Genetics and Molecular Biology(all); Applied Mathematics; A protein; Proteins; General Medicine; 030104 developmental biology; Modeling and Simulation; General Agricultural and Biological Sciences; Journal of Theoretical Biology
researchProduct

Evidence for the implication of the histone code in building the genome structure

2018

Histones are punctuated with small chemical modifications that alter their interaction with DNA. One attractive hypothesis stipulates that certain combinations of these histone modifications may function, alone or together, as a part of a predictive histone code to provide ground rules for chromatin folding. We consider four features that relate histone modifications to chromatin folding: charge neutralisation, molecular specificity, robustness and evolvability. Next, we present evidence for the association among different histone modifications at various levels of chromatin organisation and show how these relationships relate to function such as transcription, repli…

0301 basic medicine; Statistics and Probability; Computational biology; General Biochemistry Genetics and Molecular Biology; Histones; 03 medical and health sciences; chemistry.chemical_compound; Transcription (biology); Animals; Humans; Histone code; Nucleosome; [PHYS]Physics [physics]; biology; Genome Human; Applied Mathematics; Robustness (evolution); General Medicine; Chromatin; Chromatin; Histone Code; 030104 developmental biology; Histone; chemistry; Modeling and Simulation; biology.protein; Human genome; DNA; Biosystems
researchProduct

SpaceScanner: COPASI wrapper for automated management of global stochastic optimization experiments

2017

Motivation: Due to their universal applicability, global stochastic optimization methods are popular for designing improvements of biochemical networks. The drawbacks of global stochastic optimization methods are: (i) no guarantee of finding global optima, (ii) no clear optimization run termination criteria and (iii) no criteria to detect stagnation of an optimization run. The impact of these drawbacks can be partly compensated by manual work that becomes inefficient when the solution space is large due to combinatorial explosion of adjustable parameters or for other reasons. Results: SpaceScanner uses parallel optimization runs for automatic termination of optimization tasks in case…

0301 basic medicine; Statistics and Probability; Computer science; 0206 medical engineering; Computational Biology; 02 engineering and technology; computer.software_genre; Models Biological; Biochemistry; Computer Science Applications; Set (abstract data type); 03 medical and health sciences; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Stochastic optimization; Data mining; Molecular Biology; computer; Software; 020602 bioinformatics; Combinatorial explosion; Bioinformatics
researchProduct

Partitioned learning of deep Boltzmann machines for SNP data.

2016

Motivation: Learning the joint distributions of measurements, and in particular identification of an appropriate low-dimensional manifold, has been found to be a powerful ingredient of deep learning approaches. Yet, such approaches have hardly been applied to single nucleotide polymorphism (SNP) data, probably due to the high number of features typically exceeding the number of studied individuals. Results: After a brief overview of how deep Boltzmann machines (DBMs), a deep learning approach, can be adapted to SNP data in principle, we specifically present a way to alleviate the dimensionality problem by partitioned learning. We propose a sparse regression approach to coarsely screen…

0301 basic medicine; Statistics and Probability; Computer science; Machine learning; computer.software_genre; 01 natural sciences; Biochemistry; Polymorphism Single Nucleotide; Machine Learning; 010104 statistics & probability; 03 medical and health sciences; symbols.namesake; Joint probability distribution; Humans; 0101 mathematics; Molecular Biology; Statistical hypothesis testing; Artificial neural network; business.industry; Gene Expression Regulation Leukemic; Deep learning; Univariate; Computational Biology; Manifold; Computer Science Applications; Data set; Computational Mathematics; 030104 developmental biology; ComputingMethodologies_PATTERNRECOGNITION; Computational Theory and Mathematics; Leukemia Myeloid; Boltzmann constant; symbols; Data mining; Artificial intelligence; business; computer; Software; Curse of dimensionality; Bioinformatics (Oxford, England)
researchProduct

FLYCOP: metabolic modeling-based analysis and engineering microbial communities

2018

10 p.-5 fig.-2 tab.

0301 basic medicine; Statistics and Probability; Computer science; Metabolite; Auxotrophy; 030106 microbiology; Microbial Consortia; Eccb 2018: European Conference on Computational Biology Proceedings; Evolutionary engineering; medicine.disease_cause; Biochemistry; 03 medical and health sciences; chemistry.chemical_compound; medicine; Escherichia coli; Metabolic modeling; Molecular Biology; Escherichia coli; 2. Zero hunger; biology; Microbiota; Systems; Biological evolution; Synechococcus; biology.organism_classification; Computer Science Applications; Computational Mathematics; Multicellular organism; 030104 developmental biology; Computational Theory and Mathematics; chemistry; Metabolic Engineering; Biochemical engineering; Software; Bioinformatics
researchProduct

MetaCache: context-aware classification of metagenomic reads using minhashing.

2017

Motivation: Metagenomic shotgun sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, corresponding software tools suffer from either long runtimes, large memory requirements or low accuracy. Results: We introduce MetaCache, a novel software for read classification using the big data technique minhashing. Our…

0301 basic medicine; Statistics and Probability; Computer science; Sequence analysis; Context (language use); Biochemistry; Genome; 03 medical and health sciences; chemistry.chemical_compound; 0302 clinical medicine; RefSeq; Humans; Molecular Biology; Information retrieval; Shotgun sequencing; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; chemistry; Metagenomics; Metagenomics; 030217 neurology & neurosurgery; DNA; Algorithms; Software; Reference genome; Bioinformatics (Oxford, England)
researchProduct