Search results for "Dimensionality reduction"

Showing 10 of 120 documents

Sample size planning for survival prediction with focus on high-dimensional data

2011

Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to control the difference between optimal and expected prediction error. Prediction is carried out by Cox proportional hazards models. The general approach considers censoring as well as low-dimensional and high-dimensional explanatory variables. For dimension reduction in the high-dimensional setting, a variable selection step is inserted. If not all informative variables are included…
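
As an illustration of the inserted variable-selection step only (not the paper's sample-size formulas), a minimal sketch assuming the lifelines package, illustrative column names 'time'/'event', and a simple univariate p-value screen before the final Cox fit:

```python
# Hedged sketch: univariate Cox screening as the dimension-reduction step,
# then a Cox proportional hazards fit on the retained variables.
# Column names and the 0.01 cutoff are illustrative assumptions; requires lifelines.
import pandas as pd
from lifelines import CoxPHFitter

def screen_and_fit(df, duration_col="time", event_col="event", alpha=0.01):
    covariates = [c for c in df.columns if c not in (duration_col, event_col)]
    keep = []
    for c in covariates:
        uni = CoxPHFitter().fit(df[[c, duration_col, event_col]],
                                duration_col=duration_col, event_col=event_col)
        if uni.summary.loc[c, "p"] < alpha:      # keep marginally informative variables
            keep.append(c)
    final = CoxPHFitter().fit(df[keep + [duration_col, event_col]],
                              duration_col=duration_col, event_col=event_col)
    return keep, final
```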

Statistics and Probability; Clustering high-dimensional data; Clinical Trials as Topic; Lung Neoplasms; Models, Statistical; Kaplan-Meier Estimate; Epidemiology; Proportional hazards model; Dimensionality reduction; Gene Expression; Feature selection; Biostatistics; Prognosis; Brier score; Sample size determination; Carcinoma, Non-Small-Cell Lung; Sample Size; Censoring (clinical trials); Statistics; Humans; Proportional Hazards Models; Mathematics; Statistics in Medicine
researchProduct

Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?

2017

Principal component analysis (PCA) is a method of choice for dimension reduction. In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to perform the PCA of streaming data and/or massive data. Despite the wide availability of recursive algorithms that can efficiently update the PCA when new data are observed, the literature offers little guidance on how to select a suitable algorithm for a given application. This paper reviews the main approaches to online PCA, namely, perturbation techniques, incremental methods and stochastic optimisation, and compares the most widely employed techniques in terms of statistical a…
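
A minimal sketch of one stochastic-optimisation approach of the kind such reviews compare, Oja's subspace rule for updating the leading principal subspace one observation at a time; the learning-rate schedule and the centring of the stream are left to the caller and are assumptions here:

```python
import numpy as np

def oja_update(Q, x, lr):
    # One stochastic-gradient (Oja) step for the leading k-dimensional principal
    # subspace: Q is p x k with orthonormal columns, x is one (centred) observation.
    y = Q.T @ x                             # project the new sample
    Q = Q + lr * np.outer(x - Q @ y, y)     # gradient step toward the PC subspace
    Q, _ = np.linalg.qr(Q)                  # re-orthonormalise
    return Q

# Usage sketch: Q = np.linalg.qr(np.random.randn(p, k))[0], then call
# oja_update(Q, x_t, lr_t) for each incoming observation x_t.
```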

Statistics and Probability; Computer science; Computation; Dimensionality reduction; Incremental methods; 02 engineering and technology; Missing data; 01 natural sciences; 010104 statistics & probability; Data explosion; Streaming data; Principal component analysis; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 0101 mathematics; Statistics, Probability and Uncertainty; Algorithm; Eigendecomposition of a matrix; International Statistical Review
researchProduct

A review of second‐order blind identification methods

2021

Second-order source separation (SOS) is a data analysis tool which can be used for revealing hidden structures in multivariate time series data or as a tool for dimension reduction. Such methods are nowadays increasingly important as more and more high-dimensional multivariate time series data are measured in numerous fields of applied science. Dimension reduction is crucial, as modeling such high-dimensional data with multivariate time series models is often impractical because the number of parameters describing dependencies between the component time series is usually too high. SOS methods have their roots in the signal processing literature, where they were first used to separate source sign…
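
A minimal sketch of the simplest SOS estimator (AMUSE-style: whitening with the lag-0 covariance followed by an eigendecomposition of a single symmetrised lagged autocovariance); full second-order blind identification methods such as SOBI jointly diagonalise several lags, which is not shown here, and the lag choice below is an assumption:

```python
import numpy as np

def amuse(X, lag=1):
    # X is (T, p): T time points, p observed series.
    X = X - X.mean(axis=0)
    C0 = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(C0)
    vals = np.maximum(vals, 1e-12)
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T          # whitening matrix
    Z = X @ W
    C_tau = (Z[:-lag].T @ Z[lag:]) / (len(Z) - lag)    # lagged autocovariance
    C_tau = (C_tau + C_tau.T) / 2                      # symmetrise
    _, U = np.linalg.eigh(C_tau)
    sources = Z @ U                                    # estimated latent series
    unmixing = U.T @ W                                 # applies to centred observations
    return sources, unmixing
```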

Statistics and Probability; Computer science; business.industry; Dimensionality reduction; Second order blind identification; Pattern recognition; Artificial intelligence; business; Blind signal separation; WIREs Computational Statistics
researchProduct

Intensity estimation for inhomogeneous Gibbs point process with covariates-dependent chemical activity

2014

Recent developments in intensity estimation for inhomogeneous spatial point processes with covariates suggest that kerneling in the covariate space is a competitive intensity estimation method for inhomogeneous Poisson processes. It is not known whether this advantageous performance still holds when the points interact. In the simplest common case this happens, for example, when the objects presented as points have a spatial dimension. In this paper, kerneling in the covariate space is extended to Gibbs processes with covariates-dependent chemical activity and inhibitive interactions, and the performance of the approach is studied through extensive simulation experiments. It is demonstr…
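
A minimal sketch of kerneling in the covariate space for the Poisson baseline case only (the Gibbs extension with inhibitive interactions is not shown); the Gaussian kernel and fixed bandwidth are illustrative assumptions:

```python
import numpy as np

def kernel_intensity_in_covariate_space(z_points, z_grid, cell_area, h):
    # z_points: covariate values at the observed points.
    # z_grid:   covariate values on a fine grid covering the observation window.
    # cell_area: area of one grid cell; h: kernel bandwidth (assumed, no data-driven rule).
    def k(u):                                   # Gaussian kernel
        return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

    def rho(z):
        num = k(z - z_points).sum()             # smoothed point count at covariate level z
        den = cell_area * k(z - z_grid).sum()   # window "area" at that covariate level
        return num / den

    # intensity surface lambda(u) = rho(Z(u)), evaluated at every grid cell
    return np.array([rho(z) for z in z_grid])
```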

Statistics and Probability; Dimensionality reduction; Nonparametric statistics; Poisson distribution; Point process; symbols.namesake; Dimension (vector space); Covariate; symbols; Econometrics; Statistics::Methodology; Statistical physics; Statistics, Probability and Uncertainty; Smoothing; Mathematics; Parametric statistics; Statistica Neerlandica
researchProduct

Some extensions of multivariate sliced inverse regression

2007

Multivariate sliced inverse regression (SIR) is a method for achieving dimension reduction in regression problems when the outcome variable y and the regressor x are both assumed to be multidimensional. In this paper, we extend the existing approaches, which are based on the usual SIR-I and use only the inverse regression curve, to methods that use properties of the inverse conditional variance. Unlike the existing methods, these new methods are not blind to symmetric dependencies and rely on SIR-II or SIRα. We also propose their corresponding pooled slicing versions. We illustrate the usefulness of these approaches in simulation studies.
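
For reference, a minimal sketch of basic SIR-I with slicing on a univariate response; the SIR-II/SIRα extensions discussed above would additionally use the within-slice covariances Var(x | y), which are not computed here:

```python
import numpy as np

def sir(X, y, n_slices=10, n_directions=2):
    # SIR-I: standardise x, slice the response, eigendecompose the
    # between-slice covariance of the slice means.
    n, p = X.shape
    mu, Sigma = X.mean(axis=0), np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    Z = (X - mu) @ Sigma_inv_sqrt                       # standardised predictors
    slices = np.array_split(np.argsort(y), n_slices)    # slice on the response
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)                          # inverse regression curve E[z | slice]
        M += (len(idx) / n) * np.outer(m, m)
    _, eigvecs = np.linalg.eigh(M)
    return Sigma_inv_sqrt @ eigvecs[:, -n_directions:]   # estimated e.d.r. directions
```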

Statistics and Probability; Multivariate statistics; Applied Mathematics; Dimensionality reduction; Inverse; Outcome variable; Modeling and Simulation; Statistics; Sliced inverse regression; Statistics::Methodology; Statistics, Probability and Uncertainty; Conditional variance; Regression problems; Mathematics; Regression curve; Journal of Statistical Computation and Simulation
researchProduct

Asymptotics for pooled marginal slicing estimator based on SIRα approach

2005

Pooled marginal slicing (PMS) is a semiparametric method, based on the sliced inverse regression (SIR) approach, for achieving dimension reduction in regression problems when the outcome variable y and the regressor x are both assumed to be multidimensional. In this paper, we consider the SIRα version (combining the SIR-I and SIR-II approaches) of the PMS estimator and we establish the asymptotic distribution of the estimated matrix of interest. Then the asymptotic normality of the eigenprojector on the estimated effective dimension reduction (e.d.r.) space is derived, as well as the asymptotic distributions of each estimated e.d.r. direction and its corresponding eigenvalue.
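
As a schematic only (one common convention; the exact weights and normalisations are in the paper and not reproduced here), the SIRα matrix of interest for the k-th component of the multivariate response combines the SIR-I and SIR-II matrices, and pooled marginal slicing averages these marginal matrices before extracting the e.d.r. directions as leading eigenvectors:

$$M_\alpha^{(k)} = (1-\alpha)\, M_{\mathrm{I}}^{(k)} + \alpha\, M_{\mathrm{II}}^{(k)}, \qquad \widehat{M}_{\mathrm{PMS}} = \sum_{k=1}^{q} w_k\, \widehat{M}_\alpha^{(k)}, \qquad w_k \ge 0,\ \sum_{k} w_k = 1.$$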

Statistics and Probability; Numerical Analysis; Dimensionality reduction; Statistics; Sliced inverse regression; Asymptotic distribution; Estimator; Regression analysis; Statistics, Probability and Uncertainty; Marginal distribution; Effective dimension; Eigenvalues and eigenvectors; Mathematics; Journal of Multivariate Analysis
researchProduct

Dimension reduction for time series in a blind source separation context using R

2021

Multivariate time series observations are increasingly common in multiple fields of science but the complex dependencies of such data often translate into intractable models with large number of parameters. An alternative is given by first red…

Statistics and Probability; Series (mathematics); Stochastic volatility; Computer science; blind source separation; supervised dimension reduction; R; signal processing (signaalinkäsittely); Dimensionality reduction; signal analysis (signaalianalyysi); Context (language use); Covariance; Blind signal separation; QA273-280; time series analysis (aikasarja-analyysi); R language (R-kieli); Dimension (vector space); multivariate methods (monimuuttujamenetelmät); Blind source separation; Statistics, Probability and Uncertainty; Time series; Algorithm; Software; Supervised dimension reduction
researchProduct

A semiparametric approach to estimate reference curves for biophysical properties of the skin

2006

Reference curves which take one covariable, such as age, into account are often required in medicine, but simple, systematic and efficient statistical methods for constructing them are lacking. Classical methods are based on parametric fitting (polynomial curves). In this chapter, we describe a new methodology for the estimation of reference curves for data sets, based on nonparametric estimation of conditional quantiles. The derived method should be applicable to all clinical or, more generally, biological variables that are measured on a continuous quantitative scale. To avoid the curse of dimensionality when the covariate is multidimensional, a new semiparametric approach is proposed. Th…
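
A minimal sketch of the nonparametric building block (a kernel-weighted estimate of the conditional distribution function, inverted at level tau to give a conditional quantile, i.e. one point of a reference curve); the semiparametric dimension-reduction step for a multidimensional covariate is not shown, and the Gaussian kernel and bandwidth are assumptions:

```python
import numpy as np

def kernel_conditional_quantile(x_obs, y_obs, x0, tau, h):
    # Nadaraya-Watson weights around x0, cumulated over the sorted responses,
    # give a conditional CDF estimate that is then inverted at level tau.
    w = np.exp(-0.5 * ((x_obs - x0) / h) ** 2)
    w = w / w.sum()
    order = np.argsort(y_obs)
    cdf = np.cumsum(w[order])                        # weighted conditional CDF at sorted y's
    idx = min(np.searchsorted(cdf, tau), len(y_obs) - 1)
    return y_obs[order][idx]

# Usage sketch: evaluate over a grid of x0 values to trace the tau-level reference curve.
```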

Statistics::Theory; Kernel density estimation; computer.software_genre; 01 natural sciences; 010104 statistics & probability; 0502 economics and business; Covariate; Sliced inverse regression; Applied mathematics; Statistics::Methodology; Semiparametric regression; 0101 mathematics; [SHS.ECO] Humanities and Social Sciences/Economics and Finance; 050205 econometrics; Mathematics; Parametric statistics; Dimensionality reduction; 05 social sciences; Nonparametric statistics; [SDV.SPEE] Life Sciences [q-bio]/Santé publique et épidémiologie; 3. Good health; C140; C630; Data mining; computer; Quantile
researchProduct

ERP denoising in multichannel EEG data using contrasts between signal and noise subspaces

2009

In this paper, a new method intended for ERP denoising in multichannel EEG data is discussed. The denoising is done by separating the ERP and noise subspaces of the multidimensional EEG data with a linear transformation, followed by dimension reduction in which the noise components are ignored during the inverse transformation. The separation matrix is found under the assumption that ERP sources are deterministic across all repetitions of the same type of stimulus within the experiment, while the other noise sources do not share this deterministic property. A detailed derivation of the technique is given, together with an analysis of the results of its application to a real high-density EEG data set. The inter…
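
A hypothetical DSS-style sketch of the signal/noise-subspace idea (reproducible, trial-averaged activity defines the ERP subspace; the remaining components are treated as noise and zeroed before the inverse transform); the paper's exact separation-matrix estimate may differ:

```python
import numpy as np

def erp_subspace_denoise(trials, n_keep):
    # trials: array of shape (n_trials, n_channels, n_samples).
    X = trials - trials.mean(axis=2, keepdims=True)       # per-channel baseline removal
    cat = X.transpose(1, 0, 2).reshape(X.shape[1], -1)    # channels x (trials*samples)
    C = cat @ cat.T / cat.shape[1]                         # total covariance
    vals, vecs = np.linalg.eigh(C)
    vals = np.maximum(vals, 1e-12 * vals.max())
    W = np.diag(vals ** -0.5) @ vecs.T                     # whitening (forward) transform
    avg = X.mean(axis=0)                                   # trial average = deterministic part
    Ca = (W @ avg) @ (W @ avg).T                           # power of the averaged, whitened data
    _, U = np.linalg.eigh(Ca)
    U = U[:, ::-1]                                         # strongest deterministic components first
    T = U.T @ W                                            # full linear separation matrix
    comps = np.einsum('cd,ndt->nct', T, X)                 # components, per trial
    comps[:, n_keep:, :] = 0.0                             # drop the noise subspace
    return np.einsum('cd,ndt->nct', np.linalg.pinv(T), comps)   # back to channel space
```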

Underdetermined system; Noise reduction; Inverse; Electroencephalography; Dyslexia; Event-related potential; medicine; Humans; Child; Evoked Potentials; Mathematics; Language Tests; medicine.diagnostic_test; business.industry; General Neuroscience; Dimensionality reduction; Brain; Signal Processing, Computer-Assisted; Pattern recognition; Linear subspace; Linear map; Acoustic Stimulation; Data Interpretation, Statistical; Linear Models; Speech Perception; Artificial intelligence; Artifacts; business; Algorithms; Software; Journal of Neuroscience Methods
researchProduct

Multi-class pairwise linear dimensionality reduction using heteroscedastic schemes

2010

Accepted version of an article published in the journal Pattern Recognition. Published version on Sciverse: http://dx.doi.org/10.1016/j.patcog.2010.01.018 Linear dimensionality reduction (LDR) techniques have become increasingly important in pattern recognition (PR) because they permit a relatively simple mapping of the problem onto a lower-dimensional subspace, leading to simple and computationally efficient classification strategies. Although the field has been well developed for the two-class problem, the corresponding issues encountered when dealing with multiple classes are far from trivial. In this paper, we argue that, as opposed to the traditional LDR multi-class schemes…
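
To illustrate only the pairwise construction, a sketch that computes one Fisher discriminant direction per class pair and stacks them into a projection matrix; the paper's heteroscedastic pairwise criteria and weighting schemes are not reproduced here, so the homoscedastic criterion below is an assumption:

```python
import numpy as np

def pairwise_ldr(X, y, reg=1e-6):
    # One direction per class pair, stacked row-wise; reduce with X @ A.T.
    classes = np.unique(y)
    dirs = []
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            Xa, Xb = X[y == a], X[y == b]
            Sw = np.cov(Xa, rowvar=False) + np.cov(Xb, rowvar=False)
            Sw += reg * np.eye(X.shape[1])              # regularise for stability
            w = np.linalg.solve(Sw, Xa.mean(axis=0) - Xb.mean(axis=0))
            dirs.append(w / np.linalg.norm(w))
    return np.vstack(dirs)                              # (n_pairs, p) projection matrix
```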

VDP::Mathematics and natural science: 400::Mathematics: 410::Applied mathematics: 413; business.industry; VDP::Mathematics and natural science: 400::Information and communication science: 420::Algorithms and computability theory: 422; Dimensionality reduction; Decision tree; Pattern recognition; Bayes classifier; Linear discriminant analysis; Linear subspace; Weighting; Artificial Intelligence; Signal Processing; Pairwise comparison; Computer Vision and Pattern Recognition; Artificial intelligence; business; Algorithm; Software; Subspace topology; Mathematics
researchProduct