Search results for "Bust"
showing 10 items of 1000 documents
Pathway analysis of high-throughput biological data within a Bayesian network framework
2011
Motivation: Most current approaches to high-throughput biological data (HTBD) analysis perform either individual gene/protein analysis or gene/protein set enrichment analysis for a list of biologically relevant molecules. Bayesian Networks (BNs) capture linear and non-linear interactions, handle stochastic events accounting for noise, and focus on local interactions, which can be related to causal inference. Here, we describe for the first time an algorithm that models biological pathways as BNs and identifies pathways that best explain given HTBD by scoring fitness of each network. Results: The proposed method takes into account the connectivity and relatedness between nodes of the p…
Fast Estimation of the Median Covariation Matrix with Application to Online Robust Principal Components Analysis
2017
The geometric median covariation matrix is a robust multivariate indicator of dispersion which can be extended without any difficulty to functional data. We define estimators, based on recursive algorithms, that can be simply updated at each new observation and are able to deal rapidly with large samples of high dimensional data without being obliged to store all the data in memory. Asymptotic convergence properties of the recursive algorithms are studied under weak conditions. The computation of the principal components can also be performed online and this approach can be useful for online outlier detection. A simulation study clearly shows that this robust indicat…
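The recursive, one-observation-at-a-time updating described in the abstract above can be illustrated with a small sketch. It uses a standard stochastic-gradient recursion for the geometric median of vectors, which is a simpler stand-in and not the paper's median covariation matrix estimator; the function names and the step-size constants `c` and `alpha` are assumptions chosen for illustration.

```python
import math
import random


def update_median(m, x, step):
    """One recursive update of a geometric-median estimate m with observation x."""
    diff = [xi - mi for xi, mi in zip(x, m)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        # x coincides with the estimate, so the unit direction is undefined
        return list(m)
    # move a distance `step` along the unit vector pointing from m to x
    return [mi + step * d / norm for mi, d in zip(m, diff)]


def online_geometric_median(stream, c=1.0, alpha=0.75):
    """Process observations one at a time; only the current estimate is stored."""
    it = iter(stream)
    m = list(next(it))  # initialise at the first observation
    for n, x in enumerate(it, start=1):
        m = update_median(m, x, c / n ** alpha)  # decreasing step size (assumed form)
    return m


if __name__ == "__main__":
    rng = random.Random(0)
    # hypothetical data: standard normal points, with every 10th point an outlier
    data = [[20.0, 20.0] if i % 10 == 9 else [rng.gauss(0, 1), rng.gauss(0, 1)]
            for i in range(2000)]
    print(online_geometric_median(data))
```

Because each update moves the estimate by a bounded step regardless of how far away the new observation is, a single extreme point cannot drag the estimate arbitrarily far, which is the intuition behind the robustness of median-type estimators.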
MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study
2019
Motivation: Metagenomes offer a glimpse into the total genomic diversity contained within a sample. Currently, however, there is no straightforward way to obtain a non-redundant list of all putative homologs of a set of reference sequences present in a metagenome. Results: To address this problem, we developed a novel clustering approach called ‘metagenomic clustering by reference library’ (MCRL), where a reference library containing a set of reference genes is clustered with respect to an assembled metagenome. According to our proposed approach, reference genes homologous to similar sets of metagenomic sequences, termed ‘signatures’, are iteratively clustered in a greedy fashion, re…
The affine equivariant sign covariance matrix: asymptotic behavior and efficiencies
2003
We consider the affine equivariant sign covariance matrix (SCM) introduced by Visuri et al. (J. Statist. Plann. Inference 91 (2000) 557). The population SCM is shown to be proportional to the inverse of the regular covariance matrix. The eigenvectors and standardized eigenvalues of the covariance matrix can thus be derived from the SCM. We also construct an estimate of the covariance and correlation matrix based on the SCM. The influence functions and limiting distributions of the SCM and its eigenvectors and eigenvalues are found. Limiting efficiencies are given in multivariate normal and t-distribution cases. The estimates are highly efficient in the multivariate normal case and perform …
Sign test of independence between two random vectors
2003
A new affine invariant extension of the quadrant test statistic of Blomqvist (Ann. Math. Statist. 21 (1950) 593) based on spatial signs is proposed for testing the hypothesis of independence. In the elliptic case, the new test statistic is asymptotically equivalent to the interdirection test by Gieser and Randles (J. Amer. Statist. Assoc. 92 (1997) 561) but is easier to compute in practice. Limiting Pitman efficiencies and simulations are used to compare the test to the classical Wilks’ test.
Booms, Busts and normal times in the housing market
2015
We assess the existence of duration dependence in the likelihood of an end in housing booms, busts, and normal times. Using data for 20 industrial countries and a continuous-time Weibull duration model, we find evidence of positive duration dependence suggesting that housing market cycles have become longer over the last decades. Then, we extend the baseline Weibull model and allow for the presence of a change-point in the duration dependence parameter. We show that positive duration dependence is present in booms and busts that last less than 26 quarters, but that does not seem to be the case for longer phases of the housing market cycle. For normal times, no evidence of change-points is fo…
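Duration dependence in a Weibull model is usually read off the hazard function. The sketch below assumes the common parameterisation h(t) = gamma * p * t**(p - 1), which the abstract does not spell out; the function name and the parameter values are hypothetical.

```python
def weibull_hazard(t, gamma, p):
    """Weibull hazard h(t) = gamma * p * t**(p - 1) for a phase that has lasted t periods.

    gamma is a scale constant and p is the duration dependence parameter
    (both names are assumptions for this sketch).
    """
    if t <= 0 or gamma <= 0 or p <= 0:
        raise ValueError("t, gamma and p must all be positive")
    return gamma * p * t ** (p - 1)


if __name__ == "__main__":
    # hypothetical values, with durations measured in quarters
    for p in (0.5, 1.0, 1.5):
        early = weibull_hazard(4, 0.01, p)
        late = weibull_hazard(20, 0.01, p)
        print(f"p={p}: hazard at 4 quarters={early:.4f}, at 20 quarters={late:.4f}")
```

Under this parameterisation, p > 1 gives a hazard that rises with the age of the phase (positive duration dependence), p = 1 gives a constant hazard, and p < 1 gives a falling one.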
Symmetrised M-estimators of multivariate scatter
2007
In this paper we introduce a family of symmetrised M-estimators of multivariate scatter. These are defined to be M-estimators computed only on pairwise differences of the observed multivariate data. Symmetrised Huber's M-estimator and Dümbgen's estimator serve as our examples. The influence functions of the symmetrised M-functionals are derived and the limiting distributions of the estimators are discussed in the multivariate elliptical case to consider the robustness and efficiency properties of the estimators. The symmetrised M-estimators have the important independence property; they can therefore be used to find the independent components in the independent component analysis (ICA).
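The idea of computing a scatter estimate only on pairwise differences can be sketched with the plainest possible case. The example below symmetrises the ordinary covariance matrix, not Huber's M-estimator or Dümbgen's estimator from the abstract; all function names are hypothetical.

```python
from itertools import combinations


def pairwise_differences(data):
    """All differences x_i - x_j for i < j."""
    return [[a - b for a, b in zip(x, y)] for x, y in combinations(data, 2)]


def symmetrised_covariance(data):
    """Average of d d^T / 2 over pairwise differences d; no location estimate is needed."""
    diffs = pairwise_differences(data)
    p = len(data[0])
    return [[sum(d[r] * d[c] for d in diffs) / (2 * len(diffs)) for c in range(p)]
            for r in range(p)]


def sample_covariance(data):
    """Usual sample covariance with divisor n - 1, for comparison."""
    n, p = len(data), len(data[0])
    mean = [sum(x[k] for x in data) / n for k in range(p)]
    return [[sum((x[r] - mean[r]) * (x[c] - mean[c]) for x in data) / (n - 1)
             for c in range(p)] for r in range(p)]


if __name__ == "__main__":
    data = [[1.0, 2.0], [2.0, 4.5], [4.0, 3.0], [7.0, 8.0]]  # hypothetical data
    print(symmetrised_covariance(data))
    print(sample_covariance(data))
```

For the ordinary covariance the two computations agree up to rounding, because the sum of outer products of the pairwise differences equals n times the sum of outer products of the centred observations. Working with differences removes the need to estimate a centre, since each difference is symmetric about zero.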
A Note on Robust Intensity Estimation for Point Processes
1992
A robust intensity estimator based on independent marking is derived. A simulation study shows that the new estimator also works in cases where the usual estimators based on distance methods do not. Some truncated distributions are derived.
Latin hypercube sampling with inequality constraints
2010
In some studies requiring predictive and CPU-time consuming numerical models, the sampling design of the model input variables has to be chosen with caution. For this purpose, Latin hypercube sampling has a long history and has shown its robustness capabilities. In this paper we propose and discuss a new algorithm to build a Latin hypercube sample (LHS) taking into account inequality constraints between the sampled variables. This technique, called constrained Latin hypercube sampling (cLHS), consists of performing permutations on an initial LHS to honor the desired monotonic constraints. The relevance of this approach is shown on a real example concerning the numerical w…
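For orientation, the initial (unconstrained) Latin hypercube sample that the abstract takes as its starting point can be built as in the sketch below. This is the textbook construction on the unit hypercube, not the cLHS permutation algorithm proposed in the paper; the function name and arguments are assumptions.

```python
import random


def latin_hypercube(n, d, rng=None):
    """Return n points in [0, 1)^d with exactly one point per stratum in every dimension."""
    if rng is None:
        rng = random.Random()
    columns = []
    for _ in range(d):
        # draw one value inside each of the n equal-width strata of [0, 1)
        col = [(i + rng.random()) / n for i in range(n)]
        # an independent random permutation per dimension pairs the strata across dimensions
        rng.shuffle(col)
        columns.append(col)
    return [[columns[k][i] for k in range(d)] for i in range(n)]


if __name__ == "__main__":
    for point in latin_hypercube(5, 2, random.Random(1)):
        print(point)
```

Each column is a permutation of the n strata, so reordering the values within a column still yields a valid Latin hypercube sample. That property is what makes a permutation-based approach to honoring constraints between variables possible.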
Robust estimation and inference for bivariate line-fitting in allometry.
2011
In allometry, bivariate techniques related to principal component analysis are often used in place of linear regression, and primary interest is in making inferences about the slope. We demonstrate that the current inferential methods are not robust to bivariate contamination, and consider four robust alternatives to the current methods: a novel sandwich estimator approach, using robust covariance matrices derived via an influence function approach, Huber's M-estimator and the fast-and-robust bootstrap. Simulations demonstrate that Huber's M-estimators are highly efficient and robust against bivariate contamination, and when combined with the fast-and-robust bootstrap, we can make accurat…