Search results for "Statistics & Probability"

showing 10 items of 436 documents

Stagewise pseudo-value regression for time-varying effects on the cumulative incidence

2015

In a competing risks setting, the cumulative incidence of an event of interest describes the absolute risk for this event as a function of time. For regression analysis, one can either choose to model all competing events by separate cause-specific hazard models or directly model the association between covariates and the cumulative incidence of one of the events. With a suitable link function, direct regression models allow for a straightforward interpretation of covariate effects on the cumulative incidence. In practice, where data can be right-censored, these regression models are implemented using a pseudo-value approach. For a grid of time points, the possibly unobserved binary event s…

0301 basic medicine; Statistics and Probability; Carcinoma Hepatocellular; Time Factors; Epidemiology; Computer science; Feature selection; Biostatistics; 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; Risk Factors; Statistics; Covariate; Econometrics; Humans; Computer Simulation; Cumulative incidence; Registries; 0101 mathematics; Event (probability theory); Models Statistical; Incidence; Liver Neoplasms; Absolute risk reduction; Regression analysis; Regression; 030104 developmental biology; Regression Analysis; Jackknife resampling; Algorithms; Statistics in Medicine

researchProduct

Partitioned learning of deep Boltzmann machines for SNP data.

2016

Motivation: Learning the joint distributions of measurements, and in particular identification of an appropriate low-dimensional manifold, has been found to be a powerful ingredient of deep learning approaches. Yet, such approaches have hardly been applied to single nucleotide polymorphism (SNP) data, probably due to the high number of features typically exceeding the number of studied individuals. Results: After a brief overview of how deep Boltzmann machines (DBMs), a deep learning approach, can be adapted to SNP data in principle, we specifically present a way to alleviate the dimensionality problem by partitioned learning. We propose a sparse regression approach to coarsely screen…

0301 basic medicine; Statistics and Probability; Computer science; Machine learning; computer.software_genre; 01 natural sciences; Biochemistry; Polymorphism Single Nucleotide; Machine Learning; 010104 statistics & probability; 03 medical and health sciences; symbols.namesake; Joint probability distribution; Humans; 0101 mathematics; Molecular Biology; Statistical hypothesis testing; Artificial neural network; business.industry; Gene Expression Regulation Leukemic; Deep learning; Univariate; Computational Biology; Manifold; Computer Science Applications; Data set; Computational Mathematics; 030104 developmental biology; ComputingMethodologies_PATTERNRECOGNITION; Computational Theory and Mathematics; Leukemia Myeloid; Boltzmann constant; symbols; Data mining; Artificial intelligence; business; computer; Software; Curse of dimensionality; Bioinformatics (Oxford, England)

researchProduct

L1-Penalized Censored Gaussian Graphical Model

2018

Graphical lasso is one of the most widely used estimators for inferring genetic networks. Despite its widespread use, there are several fields in applied research where the limits of detection of modern measurement technologies make the use of this estimator theoretically unfounded, even when the assumption of a multivariate Gaussian distribution is satisfied. Typical examples are data generated by polymerase chain reactions and flow cytometry. The combination of censoring and high dimensionality makes inference of the underlying genetic networks from these data very challenging. In this article, we propose an $\ell_1$-penalized Gaussian graphical model for censored data and derive two EM-like algorithm…

0301 basic medicine; Statistics and Probability; FOS: Computer and information sciences; graphical lasso; Computer science; Gaussian; Normal Distribution; Inference; Multivariate normal distribution; 01 natural sciences; Methodology (stat.ME); 010104 statistics & probability; 03 medical and health sciences; symbols.namesake; Graphical Lasso; Expectation–maximization algorithm; Humans; Computer Simulation; Gene Regulatory Networks; Graphical model; 0101 mathematics; Statistics - Methodology; Estimation theory; Reverse Transcriptase Polymerase Chain Reaction; Estimator; expectation-maximization algorithm; General Medicine; Censoring (statistics); High-dimensional data; high-dimensional data; Gaussian graphical model; 030104 developmental biology; symbols; censored data; Censored data; Expectation-Maximization algorithm; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Algorithm; Algorithms

researchProduct

Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks.

2016

Factorial Gaussian graphical models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order – some entries of the precision matrix are a priori zeros – or equal dependency strengths across time lags – some entries of the precision matrix are assumed to be equal. The precision matrix is then estimated by $\ell_1$-penalized maximum likelihood, imposing a further constraint on the absolute value…

0301 basic medicine; Statistics and Probability; Factorial; Dependency (UML); Computer science; Gaussian; Normal Distribution; penalized inference; sparse networks; computer.software_genre; Machine learning; 01 natural sciences; Normal distribution; 010104 statistics & probability; 03 medical and health sciences; symbols.namesake; Sparse networks; Genetics; Computer Simulation; Gene Regulatory Networks; Graphical model; 0101 mathematics; gene-regulatory system; Molecular Biology; Probability; Markov chain; Models Genetic; Penalized inference; business.industry; Model selection; graphical model; Gene-regulatory systems; Computational Mathematics; 030104 developmental biology; symbols; A priori and a posteriori; Data mining; Artificial intelligence; Graphical models; Settore SECS-S/01 - Statistica; business; computer; Neisseria; Algorithms; Statistical applications in genetics and molecular biology

researchProduct

Prioritizing covariates in the planning of future studies in the meta-analytic framework

2016

Science can be seen as a sequential process where each new study adds evidence to the existing knowledge. To have the best prospects of making an impact in this process, a new study should be designed optimally, taking into account the previous studies and other prior information. We propose a formal approach to covariate prioritization, i.e., the decision about which covariates to measure in a new study. The decision criteria can be based on conditional power, change of the p-value, change in lower confidence limit, Kullback-Leibler divergence, Bayes factors, Bayesian false discovery rate or difference between prior and posterior expectation. The criteria can also be used for decis…

0301 basic medicine; Statistics and Probability; False discovery rate; Computer science; Bayesian probability; Bayes factor; General Medicine; Multiple-criteria decision analysis; 01 natural sciences; Confidence interval; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Sample size determination; Covariate; Econometrics; 0101 mathematics; Statistics Probability and Uncertainty; Divergence (statistics); Biometrical Journal

researchProduct

A generalization of Kingman's model of selection and mutation and the Lenski experiment.

2017

Kingman’s model of selection and mutation studies the limit type value distribution in an asexual population of discrete generations and infinite size undergoing selection and mutation. This paper generalizes the model to analyze the long-term evolution of Escherichia coli in the Lenski experiment. Weak assumptions for fitness functions are proposed and the mutation mechanism is the same as in Kingman’s model. General macroscopic epistasis can be designed through the fitness functions. Convergence to the unique limit type distribution is obtained.

0301 basic medicine; Statistics and Probability; Generalization; Population; Biology; 01 natural sciences; Models Biological; General Biochemistry Genetics and Molecular Biology; 010104 statistics & probability; 03 medical and health sciences; Statistics; Escherichia coli; Applied mathematics; Quantitative Biology::Populations and Evolution; Limit (mathematics); 0101 mathematics; Selection Genetic; education; Selection (genetic algorithm); education.field_of_study; Fitness function; General Immunology and Microbiology; Applied Mathematics; General Medicine; Quantitative Biology::Genomics; Biological Evolution; 030104 developmental biology; Distribution (mathematics); Modeling and Simulation; Mutation (genetic algorithm); Mutation; Epistasis; General Agricultural and Biological Sciences; Mathematical biosciences

researchProduct

Variance component analysis to assess protein quantification in biomarker discovery. Application to MALDI-TOF mass spectrometry.

2017

Controlling the technological variability on an analytical chain is critical for biomarker discovery. The sources of technological variability should be modeled, which calls for specific experimental design, signal processing, and statistical analysis. Furthermore, with unbalanced data, the various components of variability cannot be estimated with the sequential or adjusted sums of squares of usual software programs. We propose a novel approach to variance component analysis with application to the matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) technology and use this approach for protein quantification by a classical signal processing algori…

0301 basic medicine; Statistics and Probability; MALDI-TOF; experimental design; Biometry; protein quantification; Quantitative proteomics; Variance component analysis; [ CHIM ] Chemical Sciences; 01 natural sciences; Signal; technological variability; 010104 statistics & probability; 03 medical and health sciences; statistical analysis; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [CHIM.ANAL]Chemical Sciences/Analytical chemistry; Component (UML); [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; biomarker discovery; sum of squares type; 0101 mathematics; Biomarker discovery; signal processing; Mathematics; Signal processing; Analysis of Variance; [ PHYS ] Physics [physics]; Noise (signal processing); Proteins; General Medicine; Variance (accounting); [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; 030104 developmental biology; Spectrometry Mass Matrix-Assisted Laser Desorption-Ionization; Linear Models; variance components; [ CHIM.ANAL ] Chemical Sciences/Analytical chemistry; Statistics Probability and Uncertainty; Biological system; Algorithms; Biomarkers; Biometrical journal. Biometrische Zeitschrift

researchProduct

A heuristic, iterative algorithm for change-point detection in abrupt change models

2017

Change-point detection in abrupt change models is a very challenging research topic in many fields of both methodological and applied statistics. Due to strong irregularities, discontinuity and non-smoothness, likelihood-based procedures are awkward; for instance, usual optimization methods do not work, and grid search algorithms represent the most widely used approach for estimation. In this paper a heuristic, iterative algorithm for approximate maximum likelihood estimation is introduced for change-point detection in piecewise constant regression models. The algorithm is based on iterative fitting of simple linear models, and appears to extend easily to more general frameworks, such as models i…

0301 basic medicine; Statistics and Probability; Mathematical optimization; Iterative method; Heuristic (computer science); Linear model; 01 natural sciences; Piecewise constant model Approximate maximum likelihood Model linearization Grid search limitations; 010104 statistics & probability; 03 medical and health sciences; Computational Mathematics; Discontinuity (linguistics); 030104 developmental biology; Hyperparameter optimization; Covariate; Piecewise; 0101 mathematics; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Change detection; Mathematics

researchProduct

A graphical model selection tool for mixed models

2017

Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one among a potentially very large set of candidate statistical models. We propose a graphical representation that extends to the class of mixed models the deviance plot proposed in the literature within the framework of classical and generalized linear models. Once a reduced number of models has been selected, this graphical representation makes it possible to identify important covariates, focusing only on the fixed-effects component and assuming the random part is properly specified. Nevertheless, we also suggest a standalone figure representing th…

0301 basic medicine; Statistics and Probability; Mixed model; Model selection; Feature selection; 01 natural sciences; Task (project management); Deviance plot Penalized Weighted Residual Sum of Squares Variable selection; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Modeling and Simulation; Statistics; Graphical model; 0101 mathematics; Selection (genetic algorithm); Mathematics

researchProduct

Multiplicity- and dependency-adjusted p-values for control of the family-wise error rate

2016

Under the multiple testing framework, we propose the multiplicity- and dependency-adjustment method (MADAM) which transforms test statistics into adjusted p-values for control of the family-wise error rate. For demonstration, we apply the MADAM to data from a genetic association study.

0301 basic medicine; Statistics and Probability; Word error rate; Multiplicity (mathematics); Familywise error rate; Madam; 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Statistics; Multiple comparisons problem; Šidák correction; Per-comparison error rate; 0101 mathematics; Statistics Probability and Uncertainty; Mathematics; Statistical hypothesis testing; Statistics & Probability Letters

researchProduct