Search results for "Bust"

showing 10 items of 1000 documents

Pathway analysis of high-throughput biological data within a Bayesian network framework

2011

Abstract Motivation: Most current approaches to high-throughput biological data (HTBD) analysis either perform individual gene/protein analysis or gene/protein set enrichment analysis for a list of biologically relevant molecules. Bayesian Networks (BNs) capture linear and non-linear interactions, handle stochastic events accounting for noise, and focus on local interactions, which can be related to causal inference. Here, we describe for the first time an algorithm that models biological pathways as BNs and identifies pathways that best explain given HTBD by scoring fitness of each network. Results: The proposed method takes into account the connectivity and relatedness between nodes of the p…

Statistics and Probability; Computer science; High-throughput screening; Gene regulatory network; computer.software_genre; Models Biological; Biochemistry; Synthetic data; Biological pathway; Bayes' theorem; Humans; Gene Regulatory Networks; Carcinoma Renal Cell; Molecular Biology; Gene; Biological data; Microarray analysis techniques; Gene Expression Profiling; Bayesian network; Robustness (evolution); Bayes Theorem; Pathway analysis; Kidney Neoplasms; High-Throughput Screening Assays; Computer Science Applications; Gene expression profiling; Computational Mathematics; Computational Theory and Mathematics; Causal inference; Data mining; computer; Algorithms; Software; Bioinformatics


Fast Estimation of the Median Covariation Matrix with Application to Online Robust Principal Components Analysis

2017

The geometric median covariation matrix is a robust multivariate indicator of dispersion which can be extended without any difficulty to functional data. We define estimators, based on recursive algorithms, that can be simply updated at each new observation and are able to deal rapidly with large samples of high dimensional data without being obliged to store all the data in memory. Asymptotic convergence properties of the recursive algorithms are studied under weak conditions. The computation of the principal components can also be performed online and this approach can be useful for online outlier detection. A simulation study clearly shows that this robust indicat…

Statistics and Probability; Computer science; Mathematics - Statistics Theory; Statistics Theory (math.ST); 01 natural sciences; 010104 statistics & probability; Matrix (mathematics); Dimension (vector space); Geometric median; Stochastic gradient; FOS: Mathematics; 0101 mathematics; L1-median; 010102 general mathematics; Estimator; [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH]; Covariance; [ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH]; Functional data; MSC: 62G05 62L20; Principal component analysis; Projection pursuit; Anomaly detection; Recursive robust estimation; Statistics Probability and Uncertainty; Algorithm
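
Illustrative sketch (not taken from the paper): the abstract above describes recursive estimators that are updated at each new observation without storing the data. One common way to do this is an averaged stochastic gradient recursion; the version below is a minimal sketch under that assumption. The function name `recursive_median_and_mcm` and the step-size parameters `c` and `alpha` are hypothetical choices made for this example, and the truncated abstract does not confirm that the authors use exactly this recursion.

```python
import numpy as np

def recursive_median_and_mcm(stream, c=1.0, alpha=0.66):
    """One pass over `stream` (an iterable of 1-D arrays of equal length).

    Returns averaged estimates of the geometric median and of the
    median covariation matrix. Assumed step size: gamma_n = c / n**alpha.
    """
    it = iter(stream)
    x0 = np.asarray(next(it), dtype=float)
    d = x0.size
    m = x0.copy()            # current iterate for the geometric median
    m_bar = x0.copy()        # running average of the median iterates
    v = np.zeros((d, d))     # current iterate for the covariation matrix
    v_bar = np.zeros((d, d)) # running average of the matrix iterates

    for n, x in enumerate(it, start=1):
        x = np.asarray(x, dtype=float)
        gamma = c / n**alpha

        # Matrix step: move towards the rank-one matrix built from the
        # deviation of x from the current averaged median estimate.
        y = np.outer(x - m_bar, x - m_bar)
        diff_v = y - v
        norm_v = np.linalg.norm(diff_v)  # Frobenius norm
        if norm_v > 0:
            v = v + gamma * diff_v / norm_v
        v_bar = v_bar + (v - v_bar) / (n + 1)

        # Median step: move a distance gamma towards the new observation.
        diff = x - m
        norm = np.linalg.norm(diff)
        if norm > 0:
            m = m + gamma * diff / norm
        m_bar = m_bar + (m - m_bar) / (n + 1)

    return m_bar, v_bar
```

Each update costs O(d^2) and only the current iterates are kept in memory, which is the property the abstract emphasises. Principal components could then be read off an eigendecomposition of the returned matrix.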


MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study

2019

Abstract Motivation: Metagenomes offer a glimpse into the total genomic diversity contained within a sample. Currently, however, there is no straightforward way to obtain a non-redundant list of all putative homologs of a set of reference sequences present in a metagenome. Results: To address this problem, we developed a novel clustering approach called ‘metagenomic clustering by reference library’ (MCRL), where a reference library containing a set of reference genes is clustered with respect to an assembled metagenome. According to our proposed approach, reference genes homologous to similar sets of metagenomic sequences, termed ‘signatures’, are iteratively clustered in a greedy fashion, re…

Statistics and Probability; Contig; Computer science; Robustness (evolution); Computational biology; Original Papers; Biochemistry; Computer Science Applications; Set (abstract data type); Computational Mathematics; Computational Theory and Mathematics; Metagenomics; Reference genes; Gene family; Human virome; Cluster analysis; Molecular Biology; Bioinformatics


The affine equivariant sign covariance matrix: asymptotic behavior and efficiencies

2003

We consider the affine equivariant sign covariance matrix (SCM) introduced by Visuri et al. (J. Statist. Plann. Inference 91 (2000) 557). The population SCM is shown to be proportional to the inverse of the regular covariance matrix. The eigenvectors and standardized eigenvalues of the covariance matrix can thus be derived from the SCM. We also construct an estimate of the covariance and correlation matrix based on the SCM. The influence functions and limiting distributions of the SCM and its eigenvectors and eigenvalues are found. Limiting efficiencies are given in multivariate normal and t-distribution cases. The estimates are highly efficient in the multivariate normal case and perform …

Statistics and Probability; Covariance function; affine equivariance; influence function; Multivariate normal distribution; robustness; Computer Science::Human-Computer Interaction; Efficiency; estimators; Estimation of covariance matrices; Scatter matrix; Statistics; Affine equivariance; Applied mathematics; CMA-ES; Multivariate sign; Covariance and correlation matrices; Robustness; multivariate median; Mathematics; principal components; Influence function; Numerical Analysis; Multivariate median; Covariance matrix; covariance and correlation matrices; discriminant-analysis; Covariance; Computer Science::Other; dispersion matrices; efficiency; Law of total covariance; multivariate location; tests; Statistics Probability and Uncertainty; eigenvectors and eigenvalues; Eigenvectors and eigenvalues; multivariate sign; Journal of Multivariate Analysis


Sign test of independence between two random vectors

2003

A new affine invariant extension of the quadrant test statistic of Blomqvist (Ann. Math. Statist. 21 (1950) 593) based on spatial signs is proposed for testing the hypothesis of independence. In the elliptic case, the new test statistic is asymptotically equivalent to the interdirection test by Gieser and Randles (J. Amer. Statist. Assoc. 92 (1997) 561) but is easier to compute in practice. Limiting Pitman efficiencies and simulations are used to compare the test to the classical Wilks’ test.

Statistics and Probability; Discrete mathematics; Statistics::Theory; Multivariate random variable; Extension (predicate logic); robustness; Quadrant test; Pitman efficiency; Test (assessment); Exact test; Statistics; Chi-square test; Test statistic; Sign test; affine invariance; Statistics Probability and Uncertainty; Independence (probability theory); Mathematics; Wilks’ test


Booms, Busts and normal times in the housing market

2015

We assess the existence of duration dependence in the likelihood of an end in housing booms, busts, and normal times. Using data for 20 industrial countries and a continuous-time Weibull duration model, we find evidence of positive duration dependence suggesting that housing market cycles have become longer over the last decades. Then, we extend the baseline Weibull model and allow for the presence of a change-point in the duration dependence parameter. We show that positive duration dependence is present in booms and busts that last less than 26 quarters, but that does not seem to be the case for longer phases of the housing market cycle. For normal times, no evidence of change-points is fo…

Statistics and Probability; Economics and Econometrics; Housing booms and busts; Social Sciences; Duration dependence; Boom; Weibull model; Economics; Duration (project management); Baseline (configuration management); Weibull distribution; Science & Technology; Actuarial science; Ciências Sociais::Economia e Gestão; housing booms and busts duration analysis Weibull model duration dependence change-points; Settore SECS-P/02 Politica Economica; Duration analysis; 8. Economic growth; Change points; Change-points; Demographic economics; :Economia e Gestão [Ciências Sociais]; Statistics Probability and Uncertainty; Social Sciences (miscellaneous)


Symmetrised M-estimators of multivariate scatter

2007

Abstract: In this paper we introduce a family of symmetrised M-estimators of multivariate scatter. These are defined to be M-estimators only computed on pairwise differences of the observed multivariate data. Symmetrised Huber's M-estimator and Dümbgen's estimator serve as our examples. The influence functions of the symmetrised M-functionals are derived and the limiting distributions of the estimators are discussed in the multivariate elliptical case to consider the robustness and efficiency properties of estimators. The symmetrised M-estimators have the important independence property; they can therefore be used to find the independent components in the independent component analysis (ICA).

Statistics and Probability; Elliptical distribution; Influence function; Multivariate statistics; Numerical Analysis; Estimator; Efficiency; M-estimator; Independent component analysis; Efficient estimator; Scatter matrix; Mathematics::Category Theory; Statistics; Applied mathematics; Statistics Probability and Uncertainty; Robustness; Independence (probability theory); Mathematics; Journal of Multivariate Analysis
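
Illustrative sketch (not taken from the paper): the abstract above defines symmetrised M-estimators as M-estimators computed on pairwise differences, with Dümbgen's estimator as one example. Dümbgen's estimator is commonly described as Tyler's M-estimator of shape applied to those differences, so the sketch below uses Tyler's fixed-point iteration. The function names `tyler_shape` and `dumbgen_shape`, the trace normalisation, and the stopping parameters `n_iter` and `tol` are assumptions made for this example.

```python
import numpy as np

def tyler_shape(z, n_iter=200, tol=1e-8):
    """Tyler's M-estimator of shape for already-centred rows of `z`.

    Fixed-point iteration, rescaled so that trace(V) equals the dimension p.
    """
    n, p = z.shape
    v = np.eye(p)
    for _ in range(n_iter):
        # d2[i] = z_i^T V^{-1} z_i for every row z_i
        d2 = np.einsum("ij,ij->i", z @ np.linalg.inv(v), z)
        # Weighted sum of outer products z_i z_i^T / d2[i]
        v_new = (p / n) * (z / d2[:, None]).T @ z
        v_new *= p / np.trace(v_new)
        if np.linalg.norm(v_new - v) < tol:
            return v_new
        v = v_new
    return v

def dumbgen_shape(x):
    """Symmetrised shape estimate: Tyler's estimator on pairwise differences."""
    i, j = np.triu_indices(x.shape[0], k=1)
    diffs = x[i] - x[j]
    # Drop zero differences (tied observations) to avoid dividing by zero.
    diffs = diffs[np.any(diffs != 0, axis=1)]
    return tyler_shape(diffs)
```

Working on differences removes the need for a location estimate, since the differences are symmetric about the origin. The number of pairs grows quadratically with the sample size, so this direct version only suits small samples.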


A Note on Robust Intensity Estimation for Point Processes

1992

A robust intensity estimator based on independent marking is derived. A simulation study shows that the new estimator also works in cases where the usual estimators based on distance methods do not work. Some truncated distributions are derived.

Statistics and Probability; Estimator; General Medicine; Trimmed estimator; Point process; Truncated distribution; Distribution (mathematics); Robustness (computer science); Statistics; Applied mathematics; Statistics Probability and Uncertainty; Minimax estimator; Invariant estimator; Mathematics; Biometrical Journal


Latin hypercube sampling with inequality constraints

2010

In some studies requiring predictive and CPU-time consuming numerical models, the sampling design of the model input variables has to be chosen with caution. For this purpose, Latin hypercube sampling has a long history and has shown its robustness capabilities. In this paper we propose and discuss a new algorithm to build a Latin hypercube sample (LHS) taking into account inequality constraints between the sampled variables. This technique, called constrained Latin hypercube sampling (cLHS), consists of performing permutations on an initial LHS to honor the desired monotonic constraints. The relevance of this approach is shown on a real example concerning the numerical w…

Statistics and Probability; FOS: Computer and information sciences; Economics and Econometrics; Mathematical optimization; Design of Experiments; 020209 energy; Monotonic function; Sample (statistics); Mathematics - Statistics Theory; 02 engineering and technology; Statistics Theory (math.ST); 01 natural sciences; Statistics - Computation; 010104 statistics & probability; Robustness (computer science); [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST]; Sampling design; 0202 electrical engineering electronic engineering information engineering; FOS: Mathematics; [ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST]; 0101 mathematics; Dependence; Uncertainty analysis; Latin hypercube sampling; Computation (stat.CO); Mathematics; Applied Mathematics; Computer experiment; Function (mathematics); [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH]; [ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH]; Modeling and Simulation; Social Sciences (miscellaneous); Analysis
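
Illustrative sketch (not taken from the paper): the abstract above builds its constrained method by permuting an initial Latin hypercube sample. The sketch below shows only that starting point, a plain unconstrained LHS on the unit cube; it does not implement the cLHS permutation step, whose details are not given in the truncated abstract. The function name `latin_hypercube` and its parameters are hypothetical.

```python
import numpy as np

def latin_hypercube(n, d, rng=None):
    """Draw an n-point Latin hypercube sample in [0, 1)^d.

    Each axis is split into n equal strata, and every stratum of every
    axis contains exactly one point.
    """
    rng = np.random.default_rng(rng)
    # One independent random permutation of the strata 0..n-1 per dimension.
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    # Place each point uniformly at random inside its assigned stratum.
    return (strata + rng.random((n, d))) / n
```

Because each column is an independent permutation of the strata, reordering the values within a column preserves the Latin hypercube property, which is what makes a permutation-based constraint step possible.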


Robust estimation and inference for bivariate line-fitting in allometry.

2011

In allometry, bivariate techniques related to principal component analysis are often used in place of linear regression, and primary interest is in making inferences about the slope. We demonstrate that the current inferential methods are not robust to bivariate contamination, and consider four robust alternatives to the current methods -- a novel sandwich estimator approach, using robust covariance matrices derived via an influence function approach, Huber's M-estimator and the fast-and-robust bootstrap. Simulations demonstrate that Huber's M-estimators are highly efficient and robust against bivariate contamination, and when combined with the fast-and-robust bootstrap, we can make accurat…

Statistics and Probability; Heteroscedasticity; Analysis of Variance; Covariance matrix; Robust statistics; Estimator; General Medicine; Bivariate analysis; Covariance; Biostatistics; Statistics::Computation; Efficient estimator; Principal component analysis; Statistics; Econometrics; Statistics::Methodology; Body Size; Statistics Probability and Uncertainty; Mathematics; Probability; Biometrical journal. Biometrische Zeitschrift
