Search results for "Variable"

showing 10 items of 1674 documents

The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression.

2020

This paper focuses on hypothesis testing in lasso regression, where the goal is to judge the statistical significance of regression coefficients in a model with many covariates. To obtain reliable p-values, we propose a new lasso-type estimator based on induced smoothing, which makes it relatively easy to obtain an appropriate covariance matrix and Wald statistic. Simulation experiments show that our approach performs well compared with recent inferential tools in the lasso framework. Two real data analyses illustrate the proposed framework in practice.

Statistics and Probability; Statistics::Theory; Induced smoothing; Epidemiology; Computer science; Feature selection; Wald test; 01 natural sciences; asthma research; Statistics::Machine Learning; 010104 statistics & probability; 03 medical and health sciences; Health Information Management; Lasso (statistics); Linear regression; sparse models; Statistics::Methodology; Computer Simulation; 0101 mathematics; sandwich formula; 030304 developmental biology; Statistical hypothesis testing; 0303 health sciences; Covariance matrix; lung function; Regression analysis; Statistics::Computation; sparse model; Research Design; Algorithm; Smoothing; variable selection
Statistical methods in medical research
researchProduct

Clusters of effects curves in quantile regression models

2018

In this paper, we propose a new method for finding similarity of effects based on quantile regression models. Clustering of effects curves (CEC) techniques are applied to quantile regression coefficients, which are one-to-one functions of the order of the quantile. We adopt the quantile regression coefficients modeling (QRCM) framework to describe the functional form of the coefficient functions by means of parametric models. The proposed method can be utilized to cluster the effect of covariates with a univariate response variable, or to cluster a multivariate outcome. We report simulation results, comparing our approach with the existing techniques. The idea of combining CEC with QRCM per…

Statistics and Probability; Statistics::Theory; Multivariate statistics; 05 social sciences; Univariate; Functional data analysis; 01 natural sciences; Quantile regression; Quantile regression coefficients modeling; Multivariate analysis; Functional data analysis; Curves clustering; Variable selection; 010104 statistics & probability; Computational Mathematics; 0502 economics and business; Parametric model; Covariate; Statistics::Methodology; Applied mathematics; 0101 mathematics; Statistics Probability and Uncertainty; Cluster analysis; Settore SECS-S/01 - Statistica; 050205 econometrics; Mathematics; Quantile

Nonlinear parametric quantile models

2020

Quantile regression is widely used to estimate conditional quantiles of an outcome variable of interest given covariates. This method can estimate one quantile at a time without imposing any constraints on the quantile process other than the linear combination of covariates and parameters specified by the regression model. While this is a flexible modeling tool, it generally yields erratic estimates of conditional quantiles and regression coefficients. Recently, parametric models for the regression coefficients have been proposed that can help balance bias and sampling variability. So far, however, only models that are linear in the parameters and covariates have been explored. This paper …

Statistics and Probability; Statistics::Theory; quantile regression; Epidemiology; parametric; 010501 environmental sciences; 01 natural sciences; quantile regression coefficients models; 010104 statistics & probability; Outcome variable; Health Information Management; Covariate; Econometrics; Humans; Statistics::Methodology; Computer Simulation; 0101 mathematics; Child; 0105 earth and related environmental sciences; Parametric statistics; Mathematics; Models Statistical; Forced oscillation technique; integrated loss function; parametric quantile regression; quantile regression coefficients models; Child; Computer Simulation; Humans; Regression Analysis; Models Statistical; Nonlinear Dynamics; Statistics::Computation; Quantile regression; Nonlinear system; Nonlinear Dynamics; integrated loss function; Regression Analysis; Quantile
Statistical Methods in Medical Research
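
As a small illustration of the quantile estimation mentioned in this abstract (not the paper's nonlinear parametric model), the sketch below shows that minimizing the standard check (pinball) loss over candidate values recovers a sample quantile. The function names and the toy data are assumptions made for this example only.

```python
import numpy as np

def check_loss(residuals, tau):
    """Sum of the check (pinball) loss rho_tau(r) = r * (tau - 1[r < 0])."""
    residuals = np.asarray(residuals, dtype=float)
    return np.sum(residuals * (tau - (residuals < 0)))

def sample_quantile_by_check_loss(y, tau):
    """Return the observed value that minimizes the total check loss.

    Searching over the observations is enough here because the loss is
    piecewise linear with kinks only at the data points.
    """
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - q, tau) for q in y]
    return y[int(np.argmin(losses))]

# Toy data (hypothetical): the integers 1..9 in arbitrary order.
y = np.array([7, 2, 9, 4, 1, 8, 3, 6, 5])
median = sample_quantile_by_check_loss(y, 0.5)
lower_quartile = sample_quantile_by_check_loss(y, 0.25)
```

Quantile regression replaces the single candidate value with a function of the covariates and minimizes the same loss, one quantile level at a time.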

Eleccion de variables en regresion lineal un problema de decision [Selection of variables in linear regression: a decision problem]

1986

A general structure for the problem of variable selection in regression is proposed using the decision theory framework. In particular, some results for the choice of the best linear normal homoscedastic model are obtained when the main purpose is either to specify the predictive distribution of the response variable or to obtain a point estimate of it. Our results are compared with the most widely used classical ones.

Statistics and Probability; Variable (computer science); Distribution (number theory); Decision theory; Statistics; Structure (category theory); Point estimation; Statistics Probability and Uncertainty; Regression; Selection (genetic algorithm); Mathematics
Trabajos de Estadistica

Asymptotic efficiency of the calibration estimator in a high-dimensional data setting

2022

In finite population survey sampling, auxiliary information is commonly used to improve Horvitz-Thompson estimators, and calibration has been used extensively by national statistical agencies over the last decades for that purpose. Calibration makes estimators consistent with known totals of the auxiliary variables and reduces variance when the calibration variables explain the variable of interest. High-dimensional auxiliary data sets are now common, and adding too many calibration variables may increase the variance of calibration estimators. In this paper we study the asymptotic efficiency of the calibration estimator…

Statistics and Probability; Variance inflation factor; Auxiliary variables; Variable (computer science); Calibration (statistics); Applied Mathematics; Statistics; Estimator; Variance (accounting); Statistics Probability and Uncertainty; Population sampling; Mathematics
Journal of Statistical Planning and Inference

Fourth Moments and Independent Component Analysis

2015

In independent component analysis it is assumed that the components of the observed random vector are linear combinations of latent independent random variables, and the aim is then to find an estimate for a transformation matrix back to these independent components. In the engineering literature, there are several traditional estimation procedures based on the use of fourth moments, such as FOBI (fourth order blind identification), JADE (joint approximate diagonalization of eigenmatrices), and FastICA, but the statistical properties of these estimates are not well known. In this paper various independent component functionals based on the fourth moments are discussed in detail, starting wi…

Statistics and Probability; jade; Multivariate random variable; General Mathematics; Mathematics - Statistics Theory; Statistics Theory (math.ST); 02 engineering and technology; Estimating equations; 01 natural sciences; 010104 statistics & probability; Transformation matrix; FastICA; FOS: Mathematics; 0202 electrical engineering electronic engineering information engineering; Affine equivariance; Applied mathematics; 0101 mathematics; Linear combination; Mathematics; Component (thermodynamics); kurtosis; 020206 networking & telecommunications; FOBI; Independent component analysis; JADE; FastICA; Statistics Probability and Uncertainty; Random variable
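
To make the fourth-moment idea in this abstract concrete, here is a minimal sketch of the textbook FOBI recipe: whiten the data, then eigen-decompose the matrix of fourth moments E[||z||^2 z z^T]. It is not taken from the paper; the function name `fobi`, the simulated sources, and the mixing matrix are assumptions for illustration, and the method only separates sources whose kurtosis values differ.

```python
import numpy as np

def fobi(x):
    """Estimate an unmixing matrix with FOBI (fourth order blind identification).

    x: array of shape (n_samples, n_components).
    Returns (unmixing, estimated_sources).
    """
    xc = x - x.mean(axis=0)
    cov = np.cov(xc, rowvar=False)
    # Symmetric inverse square root of the covariance matrix (whitening).
    evals, evecs = np.linalg.eigh(cov)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = xc @ whitener
    # Fourth-moment matrix: average of ||z_i||^2 * z_i z_i^T.
    r2 = np.sum(z ** 2, axis=1)
    b = (z * r2[:, None]).T @ z / z.shape[0]
    # Its eigenvectors give the rotation that completes the unmixing.
    _, u = np.linalg.eigh(b)
    unmixing = u.T @ whitener
    return unmixing, xc @ unmixing.T

# Hypothetical example: one uniform and one Laplace source (different kurtosis).
rng = np.random.default_rng(0)
n = 50_000
sources = np.column_stack([rng.uniform(-1, 1, n), rng.laplace(size=n)])
mixing = np.array([[1.0, 0.5], [0.3, 1.0]])
x = sources @ mixing.T
unmixing, estimated = fobi(x)
```

The estimated components are recovered only up to sign and order, which is why any comparison with the true sources should use absolute correlations.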

Contributed discussion on article by Pratola

2016

The author should be commended for his outstanding contribution to the literature on Bayesian regression tree models. The author introduces three innovative sampling approaches which allow for efficient traversal of the model space. In this response, we add a fourth alternative.

Statistics and Probability; model selection; Markov Chain Monte Carlo (MCMC); Bayesian regression tree; Computer science; Big data; Bayesian regression tree (BRT) models; ComputingMilieux_LEGALASPECTSOFCOMPUTING; birth–death process; Machine learning; Sequential Monte Carlo methods; 01 natural sciences; population Markov chain Monte Carlo; 010104 statistics & probability; big data; 0502 economics and business; Bayesian Regression Trees (BART); 0101 mathematics; 050205 econometrics; Bayesian treed regression; Multiple Try Metropolis algorithms; INFERÊNCIA ESTATÍSTICA (statistical inference); Applied Mathematics; Model selection; 05 social sciences; Rejection sampling; Data science; Variable-order Bayesian network; Tree (data structure); Tree traversal; Markov chain Monte Carlo; continuous time Markov process; Artificial intelligence; Bayesian linear regression; communication-free; Gibbs sampling
Bayesian Analysis

Residuenanalyse des Unabhängigkeitsmodells Zweier Kategorialer Variablen [Residual analysis of the independence model for two categorical variables]

1985

For the ‘cellwise’ analysis of independence of two categorical variables, Haberman (1973) proposes the method of ‘adjusted residuals’. Fuchs and Kenett (1980) use the largest adjusted residual in absolute value as a measure of the deviation from the null hypothesis.

Statistics; Independence (mathematical logic); Absolute value (algebra); Residual; Null hypothesis; Measure (mathematics); Categorical variable; Mathematics
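
The adjusted residuals named in this abstract can be sketched as follows, assuming the usual definition for a two-way table under independence: the raw residual divided by the square root of e_ij (1 - r_i/n)(1 - c_j/n), where e_ij is the expected count and r_i, c_j are the row and column totals. The function name and the 2x2 counts are made up for illustration, and no critical values for the maximum are given because the abstract does not state them.

```python
import numpy as np

def adjusted_residuals(table):
    """Adjusted residuals of a two-way contingency table under independence."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    row = table.sum(axis=1, keepdims=True)   # row totals, shape (r, 1)
    col = table.sum(axis=0, keepdims=True)   # column totals, shape (1, c)
    expected = row @ col / n
    variance = expected * (1 - row / n) * (1 - col / n)
    return (table - expected) / np.sqrt(variance)

# Hypothetical 2x2 table of counts.
counts = [[20, 10], [10, 20]]
resid = adjusted_residuals(counts)
# Largest adjusted residual in absolute value, used as an overall deviation measure.
max_abs = np.abs(resid).max()
```

In a 2x2 table every adjusted residual has the same absolute value, and its square equals the Pearson chi-square statistic, which gives a quick sanity check on the computation.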

On the Ambiguous Consequences of Omitting Variables

2015

This paper studies what happens when we move from a short regression to a long regression (or vice versa), when the long regression is shorter than the data-generation process. In the special case where the long regression equals the data-generation process, the least-squares estimators have smaller bias (in fact zero bias) but larger variances in the long regression than in the short regression. But if the long regression is also misspecified, the bias may not be smaller. We provide bias and mean squared error comparisons and study the dependence of the differences on the misspecification parameter.

Statistics::Machine Learning; Statistics::Theory; C51; C52; Bias; Misspecification; Least-squares estimators; ddc:330; Statistics::Methodology; C13; Mean squared error; Omitted variables; Statistics::Computation
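
The abstract's point that a longer but still misspecified regression need not have smaller bias can be illustrated with a small population calculation. This is a sketch under stated assumptions, not the paper's derivation: regressors have mean zero with covariance matrix `sigma`, the error is uncorrelated with all regressors, and the large-sample bias of the included coefficients is Sigma_II^{-1} Sigma_IO beta_O. The function name and all numbers are hypothetical.

```python
import numpy as np

def included_coefficient_bias(sigma, beta, included):
    """Large-sample bias of OLS coefficients when some regressors are omitted.

    sigma: covariance matrix of all regressors in the data-generation process.
    beta: true coefficients.
    included: indices of the regressors kept in the fitted regression.
    """
    sigma = np.asarray(sigma, dtype=float)
    beta = np.asarray(beta, dtype=float)
    included = list(included)
    omitted = [j for j in range(len(beta)) if j not in included]
    s_ii = sigma[np.ix_(included, included)]
    s_io = sigma[np.ix_(included, omitted)]
    return np.linalg.solve(s_ii, s_io @ beta[omitted])

# Hypothetical process: x1 is correlated with x2 and x3, whose effects offset.
sigma = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.0],
                  [0.5, 0.0, 1.0]])
beta = np.array([1.0, 1.0, -1.0])

short_bias = included_coefficient_bias(sigma, beta, [0])     # y on x1 only
long_bias = included_coefficient_bias(sigma, beta, [0, 1])   # y on x1 and x2
```

In this constructed case the two omitted effects cancel in the short regression, so the coefficient on x1 is unbiased there, while adding x2 alone removes one of the offsetting terms and leaves a bias of -2/3 on x1.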

On the ambiguous consequences of omitting variables

2015

This paper studies what happens when we move from a short regression to a long regression (or vice versa), when the long regression is shorter than the data-generation process. In the special case where the long regression equals the data-generation process, the least-squares estimators have smaller bias (in fact zero bias) but larger variances in the long regression than in the short regression. But if the long regression is also misspecified, the bias may not be smaller. We provide bias and mean squared error comparisons and study the dependence of the differences on the misspecification parameter.

Statistics::Theory; Mean squared error; jel:C52; Regression dilution; jel:C51; Local regression; jel:C13; Regression analysis; Omitted-variable bias; Cross-sectional regression; Statistics::Computation; Omitted variables; Misspecification; Least-squares estimators; Bias; Mean squared error; Statistics::Machine Learning; Statistics; Econometrics; Statistics::Methodology; Regression diagnostic; Nonlinear regression; Mathematics