Search results for "Statistica"

showing 10 items of 5969 documents

L1-Penalized Censored Gaussian Graphical Model

2018

Graphical lasso is one of the most widely used estimators for inferring genetic networks. Despite its popularity, there are several fields in applied research where the limits of detection of modern measurement technologies make the use of this estimator theoretically unfounded, even when the assumption of a multivariate Gaussian distribution is satisfied. Typical examples are data generated by polymerase chain reactions and flow cytometers. The combination of censoring and high dimensionality makes inference of the underlying genetic networks from these data very challenging. In this article, we propose an $\ell_1$-penalized Gaussian graphical model for censored data and derive two EM-like algorithm…

0301 basic medicine; Statistics and Probability; FOS: Computer and information sciences; graphical lasso; Computer science; Gaussian; Normal Distribution; Inference; Multivariate normal distribution; 01 natural sciences; Methodology (stat.ME); 010104 statistics & probability; 03 medical and health sciences; symbols.namesake; Graphical Lasso; Expectation–maximization algorithm; Humans; Computer Simulation; Gene Regulatory Networks; Graphical model; 0101 mathematics; Statistics - Methodology; Estimation theory; Reverse Transcriptase Polymerase Chain Reaction; Estimator; expectation-maximization algorithm; General Medicine; Censoring (statistics); High-dimensional data; high-dimensional data; Gaussian graphical model; 030104 developmental biology; symbols; censored data; Censored data; Expectation-Maximization algorithm; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Algorithm; Algorithms

researchProduct

Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks.

2016

Factorial Gaussian graphical models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order – some entries of the precision matrix are a priori zeros – or equal dependency strengths across time lags – some entries of the precision matrix are assumed to be equal. The precision matrix is then estimated by $\ell_1$-penalized maximum likelihood, imposing a further constraint on the absolute value…

0301 basic medicine; Statistics and Probability; Factorial; Dependency (UML); Computer science; Gaussian; Normal Distribution; penalized inference; sparse networks; computer.software_genre; Machine learning; 01 natural sciences; Normal distribution; 010104 statistics & probability; 03 medical and health sciences; symbols.namesake; Sparse networks; Genetics; Computer Simulation; Gene Regulatory Networks; Graphical model; 0101 mathematics; gene-regulatory system; Molecular Biology; Probability; Markov chain; Models Genetic; Penalized inference; business.industry; Model selection; graphical model; Gene-regulatory systems; Computational Mathematics; 030104 developmental biology; symbols; A priori and a posteriori; Data mining; Artificial intelligence; Graphical models; Settore SECS-S/01 - Statistica; business; computer; Neisseria; Algorithms; Statistical applications in genetics and molecular biology

researchProduct

Variance component analysis to assess protein quantification in biomarker discovery. Application to MALDI-TOF mass spectrometry.

2017

Controlling the technological variability on an analytical chain is critical for biomarker discovery. The sources of technological variability should be modeled, which calls for specific experimental design, signal processing, and statistical analysis. Furthermore, with unbalanced data, the various components of variability cannot be estimated with the sequential or adjusted sums of squares of usual software programs. We propose a novel approach to variance component analysis with application to the matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) technology and use this approach for protein quantification by a classical signal processing algori…

0301 basic medicine; Statistics and Probability; MALDI-TOF; experimental design; Biometry; protein quantification; Quantitative proteomics; Variance component analysis; [ CHIM ] Chemical Sciences; 01 natural sciences; Signal; technological variability; 010104 statistics & probability; 03 medical and health sciences; statistical analysis; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [CHIM.ANAL]Chemical Sciences/Analytical chemistry; Component (UML); [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; biomarker discovery; sum of squares type; 0101 mathematics; Biomarker discovery; signal processing; Mathematics; Signal processing; Analysis of Variance; [ PHYS ] Physics [physics]; Noise (signal processing); Proteins; General Medicine; Variance (accounting); [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; 030104 developmental biology; Spectrometry Mass Matrix-Assisted Laser Desorption-Ionization; Linear Models; variance components; [ CHIM.ANAL ] Chemical Sciences/Analytical chemistry; Statistics Probability and Uncertainty; Biological system; Algorithms; Biomarkers; Biometrical journal. Biometrische Zeitschrift

researchProduct

A heuristic, iterative algorithm for change-point detection in abrupt change models

2017

Change-point detection in abrupt change models is a very challenging research topic in many fields of both methodological and applied statistics. Due to strong irregularities, discontinuity and non-smoothness, likelihood-based procedures are awkward; for instance, usual optimization methods do not work, and grid search algorithms represent the most widely used approach for estimation. In this paper a heuristic, iterative algorithm for approximate maximum likelihood estimation is introduced for change-point detection in piecewise constant regression models. The algorithm is based on iterative fitting of simple linear models, and appears to extend easily to more general frameworks, such as models i…

0301 basic medicine; Statistics and Probability; Mathematical optimization; Iterative method; Heuristic (computer science); Linear model; 01 natural sciences; Piecewise constant model Approximate maximum likelihood Model linearization Grid search limitations; 010104 statistics & probability; 03 medical and health sciences; Computational Mathematics; Discontinuity (linguistics); 030104 developmental biology; Hyperparameter optimization; Covariate; Piecewise; 0101 mathematics; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Change detection; Mathematics

researchProduct
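
Illustration for the change-point entry above: the abstract names grid search as the most widely used estimation approach for abrupt change models. The sketch below shows that baseline for a single change-point in a piecewise constant mean model. It is not the heuristic, iterative algorithm the paper introduces. The function name and the assumption of Gaussian errors with constant variance (under which minimizing the residual sum of squares coincides with maximum likelihood) are illustrative choices, not taken from the paper.

```python
def fit_single_change_point(y):
    """Grid search for one change-point in a piecewise constant mean model.

    Hypothetical helper: tries every split index k (1 <= k < n), fits a
    separate mean to y[:k] and y[k:], and keeps the split with the smallest
    residual sum of squares (RSS).
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    best_k, best_rss = None, float("inf")
    for k in range(1, n):
        left, right = y[:k], y[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        rss = sum((v - mean_left) ** 2 for v in left)
        rss += sum((v - mean_right) ** 2 for v in right)
        if rss < best_rss:  # strict '<' keeps the earliest best split
            best_k, best_rss = k, rss
    return best_k, best_rss


# The mean jumps from about 1 to about 5 after the fourth observation.
series = [1.0, 1.2, 0.8, 1.1, 4.9, 5.2, 5.0, 4.8]
split_index, rss = fit_single_change_point(series)
```

Each candidate split costs O(n) here, so the search is O(n^2) overall; the cost grows quickly with several change-points, which is the kind of limitation that motivates alternatives to grid search.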

LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs

2019

Novel 3D protein descriptors based on bilinear, quadratic and linear algebraic maps in $\mathbb{R}^n$ are proposed. The latter employs the kth 2-tuple (dis)similarity matrix to codify information related to covalent and non-covalent interactions in these biopolymers. The calculation of the inter-amino acid distances is generalized by using several dissimilarity coefficients, where normalization procedures based on the simple stochastic and mutual probability schemes are applied. A new local-fragment approach based on amino acid-types and amino acid-groups is proposed to characterize regions of interest in proteins. Topological and geometric macromolecular cutoffs are defined using local and…

0301 basic medicine; Statistics and Probability; Normalization (statistics); Generalization; Quantitative Structure-Activity Relationship; General Biochemistry Genetics and Molecular Biology; 03 medical and health sciences; 0302 clinical medicine; Linear regression; Amino Acids; Mathematics; General Immunology and Microbiology; Applied Mathematics; Statistical parameter; Proteins; General Medicine; Collinearity; Structural Classification of Proteins database; Support vector machine; 030104 developmental biology; Modeling and Simulation; Test set; Linear Models; General Agricultural and Biological Sciences; Algorithm; Software; 030217 neurology & neurosurgery; Journal of Theoretical Biology

researchProduct

The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of…

2015

Motivation: Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions about the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. Results: We contribute to closing this important methodological gap …

0301 basic medicine; Statistics and Probability; Nucleosome organization; Computational biology; Biology; Type (model theory); Biochemistry; Genome; DNA sequencing; 03 medical and health sciences; Computational Theory and Mathematic; Nucleosome; Molecular Biology; Sequence (medicine); Genetics; Genome; Settore INF/01 - Informatica; Eukaryota; Computer Science Applications1707 Computer Vision and Pattern Recognition; Statistical model; DNA; Chromatin; Nucleosomes; Computer Science Applications; Chromatin; Settore BIO/18 - Genetica; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Computational Mathematic; Bioinformatics

researchProduct

Reference genome assessment from a population scale perspective: an accurate profile of variability and noise.

2017

Motivation: Current plant and animal genomic studies are often based on newly assembled genomes that have not been properly consolidated. In this scenario, misassembled regions can easily lead to false-positive findings. Although quality control scores are included within genotyping protocols, they are usually employed to evaluate individual sample quality rather than reference sequence reliability. We propose a statistical model that combines quality control scores across samples in order to detect incongruent patterns at every genomic region. Our model is inherently robust since common artifact signals are expected to be shared between independent samples over misassembled regions …

0301 basic medicine; Statistics and Probability; Quality Control; Genotype; Computer science; media_common.quotation_subject; Population; Genomics; Bioinformatics; computer.software_genre; Biochemistry; Genome; 03 medical and health sciences; Genetic variation; Animals; Humans; Quality (business); Allele; education; Molecular Biology; Genotyping; Reliability (statistics); media_common; Protocol (science); education.field_of_study; Genome; Models Statistical; Genetic Variation; Reproducibility of Results; Genomics; Genome Analysis; Original Papers; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Data mining; computer; Software; Reference genome

researchProduct

Evolutionary distances corrected for purifying selection and ancestral polymorphisms.

2019

Evolutionary distance formulas that take into account effects due to ancestral polymorphisms and purifying selection are obtained on the basis of the full solution of the Jukes–Cantor and Kimura DNA substitution models. In the case of purifying selection, two different methods are developed. It is shown that avoiding the dimensional reduction implicitly carried out in the conventional model solving is instrumental in incorporating the quoted effects into the formalism. The problem of estimating the numerical values of the model parameters, as well as those of the correction terms, is not addressed.

0301 basic medicine; Statistics and Probability; Time Factors; ADN; Model parameters; General Biochemistry Genetics and Molecular Biology; 03 medical and health sciences; Negative selection; 0302 clinical medicine; Quantitative Biology::Populations and Evolution; Statistical physics; Selection Genetic; Molecular clock; Phylogeny; Mathematics; Polymorphism Genetic; General Immunology and Microbiology; Applied Mathematics; General Medicine; Models biològics; Quantitative Biology::Genomics; Biological Evolution; Formalism (philosophy of mathematics); 030104 developmental biology; Dimensional reduction; Modeling and Simulation; Mutation; General Agricultural and Biological Sciences; 030217 neurology & neurosurgery; Evolució (Biologia); Journal of theoretical biology

researchProduct
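
Illustration for the evolutionary-distance entry above: the paper extends the Jukes–Cantor and Kimura models with corrections for purifying selection and ancestral polymorphisms. The abstract does not give those corrected formulas, so the sketch below shows only the classical, uncorrected Jukes–Cantor distance, d = -(3/4) ln(1 - (4/3) p), where p is the observed proportion of differing sites. The function names and the example sequences are hypothetical.

```python
import math


def proportion_differences(seq_a, seq_b):
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def jukes_cantor_distance(p):
    """Classical Jukes-Cantor distance (expected substitutions per site).

    This is the textbook formula without the selection or polymorphism
    corrections discussed in the entry; it is undefined for p >= 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Made-up aligned sequences that differ at 2 of 10 sites.
p_observed = proportion_differences("ACGTACGTAC", "ACGTTCGTAA")
distance = jukes_cantor_distance(p_observed)
```

The estimated distance exceeds the raw proportion of differences because the formula accounts for multiple substitutions at the same site.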

Multiplicity- and dependency-adjusted p-values for control of the family-wise error rate

2016

Under the multiple testing framework, we propose the multiplicity- and dependency-adjustment method (MADAM), which transforms test statistics into adjusted p-values for control of the family-wise error rate. For demonstration, we apply the MADAM to data from a genetic association study.

0301 basic medicine; Statistics and Probability; Word error rate; Multiplicity (mathematics); Familywise error rate; Madam; 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Statistics; Multiple comparisons problem; Šidák correction; Per-comparison error rate; 0101 mathematics; Statistics Probability and Uncertainty; Mathematics; Statistical hypothesis testing; Statistics & Probability Letters

researchProduct

Pitfalls of hypothesis tests and model selection on bootstrap samples: Causes and consequences in biometrical applications

2015

The bootstrap method has become a widely used tool applied in diverse areas where results based on asymptotic theory are scarce. It can be applied, for example, for assessing the variance of a statistic, a quantile of interest or for significance testing by resampling from the null hypothesis. Recently, some approaches have been proposed in the biometrical field where hypothesis testing or model selection is performed on a bootstrap sample as if it were the original sample. P-values computed from bootstrap samples have been used, for example, in the statistics and bioinformatics literature for ranking genes with respect to their differential expression, for estimating the variability of p-v…

0301 basic medicine; Statistics and Probability; education.field_of_study; Computer science; Model selection; Bootstrap aggregating; Population; General Medicine; Asymptotic theory (statistics); 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; 030104 developmental biology; Resampling; Statistics; Econometrics; 0101 mathematics; Statistics Probability and Uncertainty; education; Null hypothesis; Quantile; Statistical hypothesis testing; Biometrical Journal

researchProduct
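
Illustration for the bootstrap entry above: the abstract lists assessing the variance of a statistic as one standard use of the bootstrap. The sketch below shows that use with the nonparametric bootstrap. It does not reproduce the practice the paper cautions about (running tests or model selection on a bootstrap sample as if it were the original sample). The function name, the default of 1000 resamples, and the fixed seed are illustrative assumptions.

```python
import random
import statistics


def bootstrap_variance(data, statistic, n_resamples=1000, seed=0):
    """Nonparametric bootstrap estimate of the variance of `statistic`.

    Hypothetical helper: draws `n_resamples` samples of size len(data) with
    replacement, evaluates the statistic on each, and returns the sample
    variance of those replicates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_resamples):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return statistics.variance(replicates)


# Variance of the sample mean for the values 1..20.
values = list(range(1, 21))
var_of_mean = bootstrap_variance(values, statistics.mean)
```

For the sample mean, the result should be close to the plug-in value (population variance of the data divided by n, about 1.66 here); for statistics such as the median no such closed form is available, which is where the bootstrap is most useful.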