Search results for "Probability"

showing 10 items of 2176 documents

Reducing sample size in experiments with animals: historical controls and related strategies

2015

Reducing the number of animal subjects used in biomedical experiments is desirable for ethical and practical reasons. Previous reviews of the benefits of reducing sample sizes have focused on improving experimental designs and methods of statistical analysis, but reducing the size of control groups has been considered rarely. We discuss how the number of current control animals can be reduced, without loss of statistical power, by incorporating information from historical controls, i.e. subjects used as controls in similar previous experiments. Using example data from published reports, we describe how to incorporate information from historical controls under a range of assumptions that mig…

0301 basic medicine; Computer science; Design of experiments; Control (management); Control subjects; 01 natural sciences; General Biochemistry Genetics and Molecular Biology; Statistical power; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Sample size determination; Statistics; Range (statistics); Statistical analysis; 0101 mathematics; General Agricultural and Biological Sciences; Statistical hypothesis testing; Biological Reviews
researchProduct

Retract p < 0.005 and propose using JASP, instead

2018

Seeking to address the lack of research reproducibility in science, including psychology and the life sciences, a pragmatic solution has been raised recently: to use a stricter p < 0.005 standard for statistical significance when claiming evidence of new discoveries. Notwithstanding its potential impact, the proposal has motivated a large mass of authors to dispute it from different philosophical and methodological angles. This article reflects on the original argument and the consequent counterarguments, and concludes with a simpler and better-suited alternative that the authors of the proposal knew about and, perhaps, should have made from their Jeffreysian perspective: to use a Bayes …

0301 basic medicine; Data Sharing; Open science; Computer science; research evidence; General Biochemistry Genetics and Molecular Biology; 03 medical and health sciences; 0302 clinical medicine; Argument; Frequentist inference; Order (exchange); practical significance; Bayes factors; Prior probability; replicability; p-value; General Pharmacology Toxicology and Pharmaceutics; reproducibility; statistical significance; Potential impact; General Immunology and Microbiology; Perspective (graphical); Bayes factor; Articles; General Medicine; Opinion Article; Epistemology; 030104 developmental biology; p-values; 030217 neurology & neurosurgery; F1000Research
researchProduct

Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms

2018

Motivation: Information-theoretic and compositional/linguistic analyses of genomes have a central role in bioinformatics, even more so since the associated methodologies are also becoming very valuable for epigenomic and metagenomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data now available in applications demands resorting to parallel and distributed computing. Indeed, those types of algorithms have been developed to collect k-mer statistics in…

0301 basic medicine; Epigenomics; genomic analysis; hadoop; distributed computing; Statistics and Probability; Computer science; Big data; Sequence assembly; Genome; Biochemistry; Domain (software engineering); Set (abstract data type); 03 medical and health sciences; distributed computing; Software; Computational Theory and Mathematic; Animals; Cluster Analysis; Humans; A-DNA; k-mer counting distributed computing hadoop map reduce; Molecular Biology; Epigenomics; Bacteria; business.industry; k-mer counting; Eukaryota; Linguistics; Computer Science Applications1707 Computer Vision and Pattern Recognition; Genomics; Sequence Analysis DNA; Computer Science Applications; Computational Mathematics; 030104 developmental biology; map reduce; Computational Theory and Mathematics; Distributed algorithm; genomic analysis; Kernel (statistics); Metagenome; hadoop; business; Algorithm; Algorithms; Software
researchProduct
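
The k-mer statistics described in the abstract above can be illustrated on a single machine. The following is a minimal sketch, not the distributed Hadoop algorithm the paper refers to; the function name `count_kmers` and the choice to skip windows containing ambiguous bases are assumptions made for illustration.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count how many times each k-mer over {A,C,G,T} occurs in a DNA sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    counts = Counter()
    # Slide a window of length k over the sequence, one position at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


counts = count_kmers("ACGTACGTNACG", 3)
```

In a MapReduce setting the same counting would typically be split into a map step that emits k-mers per input chunk and a reduce step that sums them, but that decomposition is not shown here.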

FASTdoop: A versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications

2017

Summary: MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need for efficiently managing sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers…

0301 basic medicine; FASTQ format; Statistics and Probability; Computer science; Sequence analysis; media_common.quotation_subject; Information Storage and Retrieval; Bioinformatics; computer.software_genre; Genome; Biochemistry; Domain (software engineering); 03 medical and health sciences; Computational Theory and Mathematic; Humans; Genomic library; Quality (business); DNA sequencing; FASTQ; NGS; FASTQ; DNA sequencing; Molecular Biology; media_common; Gene Library; Sequence; Database; Settore INF/01 - Informatica; Genome Human; Computer Science Applications1707 Computer Vision and Pattern Recognition; Genomics; Sequence Analysis DNA; FASTQ; File format; Computer Science Applications; Statistics and Probability; Biochemistry; Molecular Biology; Computer Science Applications1707 Computer Vision and Pattern Recognition; Computational Theory and Mathematics; Computational Mathematics; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; NGS; Database Management Systems; computer
researchProduct

2018

Genome-wide association studies have become a powerful method to link point mutations (e.g. single nucleotide polymorphisms (SNPs)) to a certain phenotype or a disease. However, their power to detect SNPs associated with polygenic diseases such as Alzheimer's Disease (AD) is limited, since they can only infer the pairwise relation of single SNPs to the phenotype and ignore possible effects of various SNP combinations. The common method to probe these possible complex genetic patterns is to compute a measure called linkage disequilibrium (LD). Despite the fact that several predictive patterns found with LD could successfully be applied to medical diagnosis, this measure still holds several dra…

0301 basic medicine; Linkage (software); education.field_of_study; Linkage disequilibrium; Population; Posterior probability; Genomics; Single-nucleotide polymorphism; Computational biology; Biology; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; SNP; education; Categorical variable; 030217 neurology & neurosurgery; Genomics and Computational Biology
researchProduct
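
The linkage disequilibrium measure mentioned in the abstract above is commonly computed for a pair of biallelic loci as D = p_AB - p_A * p_B, with the normalised form r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)). The sketch below uses this standard textbook definition; it is not taken from the paper, and the function name and the 0/1 allele coding are assumptions made for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r^2) for two biallelic loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) pairs,
    with alleles coded as 0 or 1 (hypothetical coding for this sketch).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n           # frequency of allele 1 at locus 1
    p_b = sum(b for _, b in haplotypes) / n           # frequency of allele 1 at locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either locus is monomorphic.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


d_linked, r2_linked = linkage_disequilibrium([(1, 1)] * 4 + [(0, 0)] * 4)
d_indep, r2_indep = linkage_disequilibrium([(1, 1), (1, 0), (0, 1), (0, 0)])
```

Because this measure is pairwise, it illustrates the limitation the abstract points to: it says nothing directly about combinations of more than two SNPs.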

A Dirichlet Autoregressive Model for the Analysis of Microbiota Time-Series Data

2021

Growing interest in understanding microbiota dynamics has motivated the development of different strategies to model microbiota time series data. However, all of them must tackle the fact that the available data are high-dimensional, posing strong statistical and computational challenges. In order to address this challenge, we propose a Dirichlet autoregressive model with time-varying parameters, which can be directly adapted to explain the effect of groups of taxa, thus reducing the number of parameters estimated by maximum likelihood. A strategy has been implemented which speeds up this estimation. The usefulness of the proposed model is illustrated by application to a case study.

0301 basic medicine; Mathematical optimization; Multidisciplinary; Article Subject; General Computer Science; Computer science; Maximum likelihood; QA75.5-76.95; 01 natural sciences; Dirichlet distribution; 010104 statistics & probability; 03 medical and health sciences; symbols.namesake; 030104 developmental biology; Autoregressive model; Electronic computers. Computer science; symbols; 0101 mathematics; Time series; Complexity
researchProduct

Genome-scale analysis of evolutionary rate and selection in a fast-expanding Spanish cluster of HIV-1 subtype F1.

2018

This work is aimed at assessing the presence of positive selection and/or shifts of the evolutionary rate in a fast-expanding HIV-1 subtype F1 transmission cluster affecting men who have sex with men in Spain. We applied Bayesian coalescent phylogenetics and selection analyses to 23 full-coding region sequences from patients belonging to that cluster, along with 19 other epidemiologically unrelated F1 sequences. A shift in the overall evolutionary rate of the virus, explained by positively selected sites in the cluster, was detected. We also found one substitution in Nef (H89F) that was specific to the cluster and experienced positive selection. These results suggest that fast tran…

0301 basic medicine; Microbiology (medical); Genotype; Bayesian probability; Genome scale; Epitopes T-Lymphocyte; HIV Infections; Genome Viral; Biology; Disease cluster; Microbiology; Article; law.invention; Men who have sex with men; Coalescent theory; Evolution Molecular; Subtype F1; 03 medical and health sciences; Sex Factors; law; Phylogenetics; Databases Genetic; Genetics; Humans; Selection Genetic; Selection; Molecular Biology; Antigens Viral; Ecology Evolution Behavior and Systematics; Selection (genetic algorithm); Phylogeny; Recombination Genetic; Genomics; Men who have sex with men; 030104 developmental biology; Infectious Diseases; Transmission (mechanics); Evolutionary biology; Spain; HIV-1; Transmission cluster; Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases
researchProduct

Toward a direct and scalable identification of reduced models for categorical processes.

2017

The applicability of many computational approaches depends on the identification of reduced models defined on a small set of collective variables (colvars). A methodology for scalable probability-preserving identification of reduced models and colvars directly from the data is derived. It does not rely on the availability of the full relation matrices at any stage of the resulting algorithm, allows for a robust quantification of reduced model uncertainty, and allows us to impose a priori available physical information. We show two applications of the methodology: (i) to obtain a reduced dynamical model for polypeptide dynamics in water and (ii) to identify diagnostic rules from a standar…

0301 basic medicine; Multidisciplinary; business.industry; Computer science; Dimensionality reduction; Bayesian inference; Machine learning; computer.software_genre; 01 natural sciences; Reduction (complexity); 010104 statistics & probability; 03 medical and health sciences; Identification (information); 030104 developmental biology; Physical information; Physical Sciences; A priori and a posteriori; Artificial intelligence; Data mining; 0101 mathematics; Cluster analysis; business; Categorical variable; computer; Proceedings of the National Academy of Sciences of the United States of America
researchProduct

Identifying Prognostic SNPs in Clinical Cohorts: Complementing Univariate Analyses by Resampling and Multivariable Modeling

2016

Clinical cohorts with time-to-event endpoints are increasingly characterized by measurements of a number of single nucleotide polymorphisms that is an order of magnitude larger than the number of measurements typically considered at the gene level. At the same time, the size of clinical cohorts often is still limited, calling for novel analysis strategies for identifying potentially prognostic SNPs that can help to better characterize disease processes. We propose such a strategy, drawing on univariate testing ideas from epidemiological case-control studies on the one hand, and multivariable regression techniques as developed for gene expression data on the other hand. In particular, we focus on …

0301 basic medicine; Multivariate analysis; Microarrays; Test Statistics; Gene Expression; lcsh:Medicine; Bioinformatics; 01 natural sciences; Hematologic Cancers and Related Disorders; Cohort Studies; 010104 statistics & probability; Mathematical and Statistical Techniques; Resampling; Medicine and Health Sciences; lcsh:Science; Statistical Data; Univariate analysis; Multidisciplinary; Simulation and Modeling; Multivariable calculus; Regression analysis; Hematology; Myeloid Leukemia; Prognosis; Regression; Bioassays and Physiological Analysis; Oncology; Research Design; Physical Sciences; Statistics (Mathematics); Research Article; Acute Myeloid Leukemia; Permutation; Single-nucleotide polymorphism; Computational biology; Biology; Research and Analysis Methods; Polymorphism Single Nucleotide; 03 medical and health sciences; Leukemias; Genetics; Humans; Statistical Methods; 0101 mathematics; Discrete Mathematics; lcsh:R; Univariate; Cancers and Neoplasms; Biology and Life Sciences; Models Theoretical; 030104 developmental biology; Combinatorics; Case-Control Studies; Multivariate Analysis; lcsh:Q; Mathematics; PLOS ONE
researchProduct

Melanoma-Nevus Discrimination Based on Image Statistics in Few Spectral Channels

2016

The purpose of this paper is to offer a method for discriminating cutaneous melanoma from benign nevus, based on analysis of a skin lesion image. At the core of the method is the calculation of the mean and standard deviation of pixel optical density values for a few narrow spectral bands. The calculated values are compared with discriminating thresholds derived from a set of images of benign nevi and melanomas with known diagnosis. Classification is done by applying a weighted majority rule to the results of thresholding. Verification against the available multispectral images of 32 melanomas and 94 benign nevi has shown that the method using three spectral bands provided zero false negative and four false posi…

0301 basic medicine; Nevi and melanomas; Contextual image classification; Image classification; melanoma detection; Multispectral image; Spectral bands; biomedical optical imaging; medicine.disease; 01 natural sciences; Thresholding; Standard deviation; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Cutaneous melanoma; Statistics; multispectral imaging; medicine; Nevus; 0101 mathematics; Electrical and Electronic Engineering; Mathematics; Elektronika ir Elektrotechnika
researchProduct
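
The classification scheme outlined in the abstract above (per-band mean and standard deviation, thresholding, weighted majority rule) can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the band names, threshold values, weights, and the assumption that a feature above its threshold votes for melanoma are all hypothetical.

```python
from statistics import mean, pstdev


def band_features(optical_density):
    """Return (mean, standard deviation) of pixel optical density values for one band."""
    return mean(optical_density), pstdev(optical_density)


def classify_lesion(bands, thresholds, weights):
    """Weighted majority vote over thresholded per-band features.

    bands:      dict band name -> list of pixel optical density values
    thresholds: dict (band, feature) -> threshold (hypothetical values)
    weights:    dict (band, feature) -> vote weight (hypothetical values)
    Assumption: a feature above its threshold votes for melanoma.
    """
    melanoma_weight = 0.0
    total_weight = 0.0
    for band, pixels in bands.items():
        m, s = band_features(pixels)
        for feature, value in (("mean", m), ("std", s)):
            key = (band, feature)
            total_weight += weights[key]
            if value > thresholds[key]:
                melanoma_weight += weights[key]
    # Melanoma wins only if it holds more than half of the total vote weight.
    return "melanoma" if melanoma_weight > total_weight / 2 else "nevus"


thresholds = {("b1", "mean"): 0.5, ("b1", "std"): 0.1,
              ("b2", "mean"): 0.5, ("b2", "std"): 0.1}
weights = {("b1", "mean"): 2.0, ("b1", "std"): 1.0,
           ("b2", "mean"): 1.0, ("b2", "std"): 1.0}

label_a = classify_lesion({"b1": [0.8, 1.0, 1.2], "b2": [0.2, 0.2, 0.2]},
                          thresholds, weights)
label_b = classify_lesion({"b1": [0.2, 0.2, 0.2], "b2": [0.2, 0.2, 0.2]},
                          thresholds, weights)
```

In practice the thresholds and weights would be derived from images with known diagnosis, as the abstract describes; here they are fixed by hand only to make the example self-contained.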