Search results for "Dimensionality reduction"
Showing 10 of 120 documents
Sample size planning for survival prediction with focus on high-dimensional data
2011
Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to control the difference between optimal and expected prediction error. Prediction is carried out by Cox proportional hazards models. The general approach considers censoring as well as low-dimensional and high-dimensional explanatory variables. For dimension reduction in the high-dimensional setting, a variable selection step is inserted. If not all informative variables are included…
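The pipeline this abstract describes (a variable selection step inserted before a Cox proportional hazards fit) can be sketched as follows. This is a minimal illustration, assuming the lifelines library, a simple univariate Wald-statistic screening rule of my own choosing, and placeholder column names; it does not reproduce the paper's sample-size formulas or its selection procedure.

```python
import pandas as pd
from lifelines import CoxPHFitter

def screen_and_fit(df: pd.DataFrame, duration_col: str, event_col: str, n_keep: int = 10):
    """Univariate screening of high-dimensional covariates, then a Cox PH fit
    on the selected variables (an assumed stand-in for the paper's selection step)."""
    covariates = [c for c in df.columns if c not in (duration_col, event_col)]
    scores = {}
    for c in covariates:
        m = CoxPHFitter().fit(df[[c, duration_col, event_col]],
                              duration_col=duration_col, event_col=event_col)
        scores[c] = abs(m.summary.loc[c, "z"])          # univariate Wald statistic
    selected = sorted(scores, key=scores.get, reverse=True)[:n_keep]
    cph = CoxPHFitter().fit(df[selected + [duration_col, event_col]],
                            duration_col=duration_col, event_col=event_col)
    return selected, cph
```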
Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?
2017
Principal component analysis (PCA) is a method of choice for dimension reduction. In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to perform the PCA of streaming data and/or massive data. Despite the wide availability of recursive algorithms that can efficiently update the PCA when new data are observed, the literature offers little guidance on how to select a suitable algorithm for a given application. This paper reviews the main approaches to online PCA, namely, perturbation techniques, incremental methods and stochastic optimisation, and compares the most widely employed techniques in terms of statistical a…
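Of the three families compared in the review, the stochastic-optimisation approach is the simplest to sketch. Below is a minimal numpy illustration of an Oja-style update for streaming data; the function name, learning rate and toy data are assumptions for illustration, not the paper's benchmark code.

```python
import numpy as np

def oja_pca(stream, k, lr=1e-2, seed=0):
    """Online PCA by stochastic optimisation (Oja-style updates).
    stream: iterable of 1-D observation vectors (assumed centred); k: subspace dimension."""
    rng = np.random.default_rng(seed)
    W = None
    for x in stream:
        x = np.asarray(x, dtype=float)
        if W is None:
            W = rng.standard_normal((x.size, k))
            W, _ = np.linalg.qr(W)               # orthonormal start
        W += lr * np.outer(x, x @ W)             # gradient step towards the top subspace
        W, _ = np.linalg.qr(W)                   # re-orthonormalise
    return W

# Toy stream: 10,000 observations in 50 dimensions with decaying variances
rng = np.random.default_rng(1)
data = rng.standard_normal((10000, 50)) * np.linspace(3.0, 0.1, 50)
W = oja_pca(iter(data), k=5)
```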
A review of second‐order blind identification methods
2021
Second-order source separation (SOS) is a data analysis tool which can be used for revealing hidden structures in multivariate time series data or as a tool for dimension reduction. Such methods are nowadays increasingly important as more and more high-dimensional multivariate time series data are measured in numerous fields of applied science. Dimension reduction is crucial, as modeling such high-dimensional data with multivariate time series models is often impractical because the number of parameters describing dependencies between the component time series is usually too high. SOS methods have their roots in the signal processing literature, where they were first used to separate source sign…
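As an illustration of what an SOS method does, here is a minimal numpy sketch of an AMUSE-style procedure (whiten the series, then eigendecompose one symmetrised lagged autocovariance); the names and the single-lag choice are assumptions, not code from the review.

```python
import numpy as np

def amuse(X, lag=1):
    """Second-order source separation via one lagged autocovariance.
    X: (T, p) multivariate time series. Returns estimated sources and unmixing matrix."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    white = vecs @ np.diag(vals ** -0.5) @ vecs.T        # whitening matrix
    Z = Xc @ white.T
    A = Z[:-lag].T @ Z[lag:] / (len(Z) - lag)            # lagged autocovariance
    A = (A + A.T) / 2                                    # symmetrise
    _, U = np.linalg.eigh(A)
    W = U.T @ white                                      # unmixing matrix
    return Xc @ W.T, W
```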
Intensity estimation for inhomogeneous Gibbs point process with covariates-dependent chemical activity
2014
Recent development of intensity estimation for inhomogeneous spatial point processes with covariates suggests that kerneling in the covariate space is a competitive intensity estimation method for inhomogeneous Poisson processes. It is not known whether this advantageous performance is still valid when the points interact. This happens, for example, in the simplest common case where the objects represented as points have a spatial extent. In this paper, kerneling in the covariate space is extended to Gibbs processes with covariates-dependent chemical activity and inhibitive interactions, and the performance of the approach is studied through extensive simulation experiments. It is demonstr…
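For the Poisson baseline that motivates the paper, "kerneling in the covariate space" can be sketched as an estimate of the intensity as a function of a single covariate (a ratio of a kernel sum over the points to a kernel integral over the window); the Gibbs extension with covariates-dependent chemical activity studied here is not reproduced. The names, the Gaussian kernel and the grid approximation of the window integral are assumptions.

```python
import numpy as np

def covariate_intensity(z_at_points, z_on_window, window_area, bandwidth, z_eval):
    """Kernel estimate of intensity as a function of a 1-D covariate.
    z_at_points: covariate values at the observed points;
    z_on_window: covariate values on a dense grid over the observation window
    (used to approximate the integral in the denominator)."""
    def k(d):
        return np.exp(-0.5 * (d / bandwidth) ** 2) / (bandwidth * np.sqrt(2 * np.pi))

    num = k(z_eval[:, None] - z_at_points[None, :]).sum(axis=1)
    den = window_area * k(z_eval[:, None] - z_on_window[None, :]).mean(axis=1)
    return num / den   # rho(z); the fitted intensity at a location u is rho(Z(u))
```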
Some extensions of multivariate sliced inverse regression
2007
Multivariate sliced inverse regression (SIR) is a method for achieving dimension reduction in regression problems when the outcome variable y and the regressor x are both assumed to be multidimensional. In this paper, we extend the existing approaches, based on the usual SIR-I which only uses the inverse regression curve, to methods using properties of the inverse conditional variance. Contrary to the existing ones, these new methods are not blind to symmetric dependencies and rely on SIR-II or SIRα. We also propose their corresponding pooled slicing versions. We illustrate the usefulness of these approaches through simulation studies.
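For reference, a minimal numpy sketch of the classical univariate SIR-I step that these extensions build on: slice the response, average the standardised predictors within each slice, and eigendecompose the weighted covariance of the slice means. The SIR-II and SIRα variants discussed in the paper replace this matrix with conditional-variance-based alternatives and are not shown; names and defaults are illustrative.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=2):
    """Classical SIR-I estimate of the e.d.r. directions for a univariate response."""
    n, p = X.shape
    mu = X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    root_inv = vecs @ np.diag(vals ** -0.5) @ vecs.T     # Sigma^{-1/2}
    Z = (X - mu) @ root_inv
    order = np.argsort(y)
    M = np.zeros((p, p))
    for s in np.array_split(order, n_slices):            # slice by quantiles of y
        m = Z[s].mean(axis=0)
        M += (len(s) / n) * np.outer(m, m)                # weighted slice means
    _, U = np.linalg.eigh(M)
    return root_inv @ U[:, -n_directions:]               # back to the original scale
```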
Asymptotics for pooled marginal slicing estimator based on SIRα approach
2005
Pooled marginal slicing (PMS) is a semiparametric method, based on the sliced inverse regression (SIR) approach, for achieving dimension reduction in regression problems when the outcome variable y and the regressor x are both assumed to be multidimensional. In this paper, we consider the SIRα version (combining the SIR-I and SIR-II approaches) of the PMS estimator and we establish the asymptotic distribution of the estimated matrix of interest. Then the asymptotic normality of the eigenprojector on the estimated effective dimension reduction (e.d.r.) space is derived as well as the asymptotic distributions of each estimated e.d.r. direction and its corresponding eigenvalue.
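A minimal sketch of the pooling idea behind PMS, assuming standardised predictors and pooling only the SIR-I slice-mean matrices across the components of a multivariate response; the SIRα combination with SIR-II analysed in the paper is omitted, and the uniform weights are an assumption.

```python
import numpy as np

def pooled_marginal_slicing(Z, Y, n_slices=10):
    """Pooled marginal slicing: average univariate SIR-I matrices over the
    components of a multivariate response (Z assumed standardised)."""
    n, p = Z.shape
    q = Y.shape[1]
    M = np.zeros((p, p))
    for j in range(q):                                   # one marginal SIR per response
        order = np.argsort(Y[:, j])
        for s in np.array_split(order, n_slices):
            m = Z[s].mean(axis=0)
            M += (1.0 / q) * (len(s) / n) * np.outer(m, m)
    vals, U = np.linalg.eigh(M)
    return U[:, ::-1], vals[::-1]                        # e.d.r. directions and eigenvalues
```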
Dimension reduction for time series in a blind source separation context using R
2021
Multivariate time series observations are increasingly common in multiple fields of science but the complex dependencies of such data often translate into intractable models with a large number of parameters. An alternative is given by first red…
A semiparametric approach to estimate reference curves for biophysical properties of the skin
2006
Reference curves that take one covariate into account, such as age, are often required in medicine, but simple, systematic and efficient statistical methods for constructing them are lacking. Classical methods are based on parametric fitting (polynomial curves). In this chapter, we describe a new methodology for the estimation of reference curves for data sets, based on nonparametric estimation of conditional quantiles. The derived method should be applicable to all clinical or, more generally, biological variables that are measured on a continuous quantitative scale. To avoid the curse of dimensionality when the covariate is multidimensional, a new semiparametric approach is proposed. Th…
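The one-covariate building block behind such reference curves (a kernel-weighted conditional quantile) can be sketched as below; the chapter's semiparametric step for a multidimensional covariate is not shown, and the Gaussian kernel, bandwidth and names are illustrative assumptions.

```python
import numpy as np

def kernel_conditional_quantile(x, y, x_grid, tau=0.95, bandwidth=1.0):
    """Nonparametric reference curve: kernel-weighted empirical quantile of y
    at each value of the covariate (e.g. age)."""
    order = np.argsort(y)
    y_sorted = y[order]
    curve = np.empty(len(x_grid))
    for i, x0 in enumerate(x_grid):
        w = np.exp(-0.5 * ((x[order] - x0) / bandwidth) ** 2)   # kernel weights
        cw = np.cumsum(w) / w.sum()                             # weighted CDF of y
        curve[i] = y_sorted[min(np.searchsorted(cw, tau), len(y) - 1)]
    return curve
```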
ERP denoising in multichannel EEG data using contrasts between signal and noise subspaces
2009
In this paper, a new method intended for ERP denoising in multichannel EEG data is discussed. The denoising is done by separating ERP/noise subspaces in multidimensional EEG data by a linear transformation, followed by dimension reduction achieved by ignoring the noise components during the inverse transformation. The separation matrix is found based on the assumption that ERP sources are deterministic for all repetitions of the same type of stimulus within the experiment, while the other noise sources do not obey this determinacy property. A detailed derivation of the technique is given together with an analysis of the results of its application to a real high-density EEG data set. The inter…
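One common way to realise this kind of signal/noise subspace contrast is a generalised eigendecomposition of the trial-averaged (deterministic) covariance against the residual covariance; the sketch below is an illustration of that idea under my own assumptions, not the paper's exact separation-matrix estimator.

```python
import numpy as np
from scipy.linalg import eigh

def erp_denoise(epochs, n_keep=3):
    """Keep only the ERP subspace of multichannel EEG epochs.
    epochs: (n_trials, n_channels, n_samples) single-trial data."""
    evoked = epochs.mean(axis=0)                         # deterministic part across repetitions
    residual = epochs - evoked                           # trial-to-trial noise
    n_ch = epochs.shape[1]
    C_sig = np.cov(evoked)                               # channel covariance of the ERP
    C_noise = np.cov(residual.transpose(1, 0, 2).reshape(n_ch, -1))
    _, W = eigh(C_sig, C_noise)                          # filters ordered by signal/noise contrast
    W = W[:, ::-1][:, :n_keep]                           # keep the ERP subspace
    A = np.linalg.pinv(W.T)                              # patterns for back-projection
    return np.stack([A @ (W.T @ trial) for trial in epochs])
```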
Multi-class pairwise linear dimensionality reduction using heteroscedastic schemes
2010
Linear dimensionality reduction (LDR) techniques have been increasingly important in pattern recognition (PR) because they permit a relatively simple mapping of the problem onto a lower-dimensional subspace, leading to simple and computationally efficient classification strategies. Although the field has been well developed for the two-class problem, the corresponding issues encountered when dealing with multiple classes are far from trivial. In this paper, we argue that, as opposed to the traditional LDR multi-class schemes…
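As a baseline for comparison, the classical homoscedastic Fisher criterion for LDR can be sketched as below; the pairwise heteroscedastic (e.g. Chernoff-distance-based) schemes argued for in the paper let each class keep its own covariance, which this sketch deliberately does not do.

```python
import numpy as np
from scipy.linalg import eigh

def fisher_lda_directions(X, y, n_dims=2):
    """Fisher LDA: directions maximising between-class scatter relative to
    within-class scatter (homoscedastic baseline)."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    p = X.shape[1]
    Sw, Sb = np.zeros((p, p)), np.zeros((p, p))
    for c in classes:
        Xc = X[y == c]
        Sw += (len(Xc) - 1) * np.cov(Xc, rowvar=False)   # within-class scatter
        d = Xc.mean(axis=0) - mu
        Sb += len(Xc) * np.outer(d, d)                   # between-class scatter
    _, V = eigh(Sb, Sw)                                  # generalised eigenproblem
    return V[:, ::-1][:, :n_dims]                        # leading discriminant directions
```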