Search results for "regression"

showing 10 items of 2619 documents

On the usage of joint diagonalization in multivariate statistics

2022

Scatter matrices generalize the covariance matrix and are useful in many multivariate data analysis methods, including well-known principal component analysis (PCA), which is based on the diagonalization of the covariance matrix. The simultaneous diagonalization of two or more scatter matrices goes beyond PCA and is used more and more often. In this paper, we offer an overview of many methods that are based on a joint diagonalization. These methods range from the unsupervised context with invariant coordinate selection and blind source separation, which includes independent component analysis, to the supervised context with discriminant analysis and sliced inverse regression. They also enco…

Statistics and Probability; Scatter matrices; Multivariate statistics; Context (language use); 010103 numerical & computational mathematics; 01 natural sciences; Blind signal separation; 010104 statistics & probability; Sliced inverse regression; 0101 mathematics; B- ECONOMIE ET FINANCE; Supervised dimension reduction; Mathematics; Numerical Analysis; business.industry; Covariance matrix; Pattern recognition; riippumattomien komponenttien analyysi; matemaattinen tilastotiede; Linear discriminant analysis; Invariant component selection; Independent component analysis; monimuuttujamenetelmät; Principal component analysis; Dimension reduction; Blind source separation; Artificial intelligence; Statistics Probability and Uncertainty; business
researchProduct

Inferential tools in penalized logistic regression for small and sparse data: A comparative study.

2016

This paper focuses on inferential tools in the logistic regression model fitted by the Firth penalized likelihood. In this context, the Likelihood Ratio statistic is often reported to be the preferred choice as compared to the ‘traditional’ Wald statistic. In this work, we consider and discuss a wider range of test statistics, including the robust Wald, the Score, and the recently proposed Gradient statistic. We compare all these asymptotically equivalent statistics in terms of interval estimation and hypothesis testing via simulation experiments and analyses of two real datasets. We find that the Likelihood Ratio statistic does not appear to be the best inferential device in the Firth penal…

Statistics and Probability; Score test; PRESS statistic; Epidemiology; Statistics as Topic; Score; Wald test; Logistic regression; 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Health Information Management; Statistics; Econometrics; Humans; 030212 general & internal medicine; 0101 mathematics; Statistic; Mathematics; Likelihood Functions; Models Statistical; Logistic regression firth penalized likelihood sandwich formula score statistic gradient statistic; Logistic Models; Likelihood-ratio test; Data Interpretation Statistical; Sample Size; Ancillary statistic; Settore SECS-S/01 - Statistica; Statistical methods in medical research

Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling

2016

We introduce a score-type statistic to test for a non-zero regression coefficient when the relevant term involves a nuisance parameter present only under the alternative. Despite the non-regularity and complexity of the problem and unlike the previous approaches, the proposed test statistic does not require the nuisance to be estimated. It is simple to implement by relying on the conventional distributions, such as Normal or t, and it is justified in the setting of probabilistic coherence. We focus on testing for the existence of a breakpoint in segmented regression, and illustrate the methodology with an analysis on data of DNA copy number aberrations and gene expression profiles from…

Statistics and Probability; Score test; score test; Nuisance variable; piecewise linear; threshold value; computer.software_genre; 01 natural sciences; non-standard inference; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Statistics; Linear regression; Test statistic; Nuisance parameter; 0101 mathematics; Segmented regression; Statistic; Mathematics; Applied Mathematics; Probabilistic logic; Breakpoint detection; Modeling and Simulation; Data mining; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; computer; 030217 neurology & neurosurgery; Journal of Statistical Computation and Simulation

Estimating growth charts via nonparametric quantile regression: a practical framework with application in ecology.

2013

We discuss a practical and effective framework to estimate reference growth charts via regression quantiles. Inequality constraints are used to ensure both monotonicity and non-crossing of the estimated quantile curves, and penalized splines are employed to model the nonlinear growth patterns with respect to age. A companion R package is presented and relevant code discussed to encourage the dissemination and application of the proposed methods.

Statistics and Probability; Settore BIO/07 - Ecologia; Statistics::Theory; Ecology (disciplines); Nonparametric statistics; Monotonic function; Regression; Statistics::Computation; Quantile regression; Nonlinear system; R package; Statistics; Econometrics; Statistics::Methodology; Growth charts Nonparametric regression quantiles Penalized splines P. oceanica modelling R software; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; General Environmental Science; Mathematics; Quantile

The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression.

2020

This paper focuses on hypothesis testing in lasso regression, when one is interested in judging the statistical significance of the regression coefficients in a regression equation involving many covariates. To get reliable p-values, we propose a new lasso-type estimator relying on the idea of induced smoothing, which makes it possible to obtain an appropriate covariance matrix and Wald statistic relatively easily. Some simulation experiments reveal that our approach exhibits good performance when contrasted with the recent inferential tools in the lasso framework. Two real data analyses are presented to illustrate the proposed framework in practice.

Statistics and Probability; Statistics::Theory; Induced smoothing; Epidemiology; Computer science; Feature selection; Wald test; 01 natural sciences; asthma research; Statistics::Machine Learning; 010104 statistics & probability; 03 medical and health sciences; Health Information Management; Lasso (statistics); Linear regression; sparse models; Statistics::Methodology; Computer Simulation; 0101 mathematics; sandwich formula; 030304 developmental biology; Statistical hypothesis testing; 0303 health sciences; Covariance matrix; lung function; Regression analysis; Statistics::Computation; sparse model; Research Design; Algorithm; Smoothing; variable selection; Statistical methods in medical research

Clusters of effects curves in quantile regression models

2018

In this paper, we propose a new method for finding similarity of effects based on quantile regression models. Clustering of effects curves (CEC) techniques are applied to quantile regression coefficients, which are one-to-one functions of the order of the quantile. We adopt the quantile regression coefficients modeling (QRCM) framework to describe the functional form of the coefficient functions by means of parametric models. The proposed method can be utilized to cluster the effect of covariates with a univariate response variable, or to cluster a multivariate outcome. We report simulation results, comparing our approach with the existing techniques. The idea of combining CEC with QRCM per…

Statistics and Probability; Statistics::Theory; Multivariate statistics; 05 social sciences; Univariate; Functional data analysis; 01 natural sciences; Quantile regression; Quantile regression coefficients modeling Multivariate analysis Functional data analysis Curves clustering Variable selection; 010104 statistics & probability; Computational Mathematics; 0502 economics and business; Parametric model; Covariate; Statistics::Methodology; Applied mathematics; 0101 mathematics; Statistics Probability and Uncertainty; Cluster analysis; Settore SECS-S/01 - Statistica; 050205 econometrics; Mathematics; Quantile

Design-based estimation for geometric quantiles with application to outlier detection

2010

Geometric quantiles are investigated using data collected from a complex survey. Geometric quantiles are an extension of univariate quantiles in a multivariate set-up that uses the geometry of multivariate data clouds. A very important application of geometric quantiles is the detection of outliers in multivariate data by means of quantile contours. A design-based estimator of geometric quantiles is constructed and used to compute quantile contours in order to detect outliers in both multivariate data and survey sampling set-ups. An algorithm for computing geometric quantile estimates is also developed. Under broad assumptions, the asymptotic variance of the quantile estimator is derived an…

Statistics and Probability; Statistics::Theory; TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES; Statistics::Applications; ComputingMethodologies_SIMULATIONANDMODELING; Applied Mathematics; MathematicsofComputing_NUMERICALANALYSIS; Univariate; InformationSystems_DATABASEMANAGEMENT; Estimator; Statistics::Computation; Quantile regression; Horvitz–Thompson estimator; Computational Mathematics; Delta method; Computational Theory and Mathematics; TheoryofComputation_ANALYSISOFALGORITHMSANDPROBLEMCOMPLEXITY; Outlier; Consistent estimator; Statistics; Statistics::Methodology; Mathematics; Quantile; Computational Statistics & Data Analysis

Nonlinear parametric quantile models

2020

Quantile regression is widely used to estimate conditional quantiles of an outcome variable of interest given covariates. This method can estimate one quantile at a time without imposing any constraints on the quantile process other than the linear combination of covariates and parameters specified by the regression model. While this is a flexible modeling tool, it generally yields erratic estimates of conditional quantiles and regression coefficients. Recently, parametric models for the regression coefficients have been proposed that can help balance bias and sampling variability. So far, however, only models that are linear in the parameters and covariates have been explored. This paper …

Statistics and Probability; Statistics::Theory; quantile regression; Epidemiology; parametric; 010501 environmental sciences; 01 natural sciences; quantile regression coefficients models; 010104 statistics & probability; Outcome variable; Health Information Management; Covariate; Econometrics; Humans; Statistics::Methodology; Computer Simulation; 0101 mathematics; Child; 0105 earth and related environmental sciences; Parametric statistics; Mathematics; Models Statistical; Forced oscillation technique integrated loss function parametric quantile regression quantile regression coefficients models Child Computer Simulation Humans Regression Analysis Models Statistical Nonlinear Dynamics; Statistics::Computation; Quantile regression; Nonlinear system; Nonlinear Dynamics; integrated loss function; Regression Analysis; Quantile; Statistical Methods in Medical Research

Building up adjusted indicators of students' evaluation of university courses using generalized item response models

2012

This article advances a proposal for building up adjusted composite indicators of the quality of university courses from students’ assessments. The flexible framework of Generalized Item Response Models is adopted here for controlling the sources of heterogeneity in the data structure that make evaluations across courses not directly comparable. Specifically, it allows us to: jointly model students’ ratings to the set of items which define the quality of university courses; explicitly consider the dimensionality of the items composing the evaluation form; evaluate and remove the effect of potential confounding factors which may affect students’ evaluation; model the intra-cluster variabilit…

Statistics and Probability; Structure (mathematical logic); Computer science; media_common.quotation_subject; adjusted indicators explanatory item response models multidimensional latent traits multilevel models evaluation of university courses potential confounding factors; Regression analysis; Data structure; Affect (psychology); Multilevel data; ComputingMilieux_COMPUTERSANDEDUCATION; Econometrics; Mathematics education; Quality (business); Settore SECS-S/05 - Statistica Sociale; Statistics Probability and Uncertainty; Set (psychology); Settore SECS-S/01 - Statistica; media_common; Curse of dimensionality

Sample size in cluster-randomized trials with time to event as the primary endpoint

2011

In cluster-randomized trials, groups of individuals (clusters) are randomized to the treatments or interventions to be compared. In many of those trials, the primary objective is to compare the time for an event to occur between randomized groups, and the shared frailty model fits clustered time-to-event data well. Members of the same cluster tend to be more similar than members of different clusters, causing correlations. As correlations affect the power of a trial to detect intervention effects, the clustered design has to be considered in planning the sample size. In this publication, we derive a sample size formula for clustered time-to-event data with constant marginal baseline hazards…

Statistics and Probability; Time Factors; Endpoint Determination; Substance-Related Disorders; Epidemiology; Psychological intervention; Biostatistics; Time-to-Treatment; law.invention; Correlation; Random Allocation; Randomized controlled trial; law; Statistics; Clinical endpoint; Econometrics; Cluster Analysis; Humans; Poisson Distribution; Baseline (configuration management); Randomized Controlled Trials as Topic; Mathematics; Event (probability theory); Likelihood Functions; Models Statistical; Term (time); Sample size determination; Sample Size; Regression Analysis; Substance Abuse Treatment Centers; Statistics in Medicine