Search results for "score"
showing 10 items of 852 documents
Sample size planning for survival prediction with focus on high-dimensional data
2011
Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to control the difference between optimal and expected prediction error. Prediction is carried out by Cox proportional hazards models. The general approach considers censoring as well as low-dimensional and high-dimensional explanatory variables. For dimension reduction in the high-dimensional setting, a variable selection step is inserted. If not all informative variables are included…
On Rao Score and Pearson X² Statistics in Generalized Linear Models
2005
The identity of the Rao score and Pearson X² statistics is well known in the areas where the latter was first introduced: goodness-of-fit in contingency tables and binary responses. We show in this paper that the same identity holds when the two statistics are used for testing goodness-of-fit of Generalized Linear Models. We also highlight the connections that exist between the two statistics when they are used for the comparison of nested models. Finally, we discuss some merits of these unifying results.
Adaptive reference-free compression of sequence quality scores
2014
Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…
A note on adjusted responses, fitted values and residuals in Generalized Linear Models
2014
Adjusted responses, adjusted fitted values and adjusted residuals are known to play in Generalized Linear Models the role played in Linear Models by observations, fitted values and ordinary residuals. We think this parallelism, which was widely recognized and used in the early literature on Generalized Linear Models, has been somewhat overlooked in more recent presentations. We revise this parallelism, systematizing and proving some results that are either scattered or not satisfactorily spelled out in the literature. In particular, we formally derive the asymptotic dispersion matrix of the (scaled) adjusted residuals, by proving that in Generalized Linear Models the fitted values are asym…
A differential-geometric approach to generalized linear models with grouped predictors
2016
We propose an extension of the differential-geometric least angle regression method to perform sparse group inference in a generalized linear model. An efficient algorithm is proposed to compute the solution curve. The proposed group differential-geometric least angle regression method has important properties that distinguish it from the group lasso. First, its solution curve is based on the invariance properties of a generalized linear model. Second, it adds groups of variables based on a group equiangularity condition, which is shown to be related to score statistics. An adaptive version, which includes weights based on the Kullback-Leibler divergence, improves its variable selection fea…
Premature conclusions about the signal‐to‐noise ratio in structural equation modeling research: A commentary on Yuan and Fang (2023)
2023
In a recent article published in this journal, Yuan and Fang (British Journal of Mathematical and Statistical Psychology, 2023) suggest comparing structural equation modeling (SEM), also known as covariance-based SEM (CB-SEM), estimated by normal-distribution-based maximum likelihood (NML), to regression analysis with (weighted) composites estimated by least squares (LS) in terms of their signal-to-noise ratio (SNR). They summarize their findings in the statement that “[c]ontrary to the common belief that CB-SEM is the preferred method for the analysis of observational data, this article shows that regression analysis via weighted composites yields parameter estimates with much smaller stan…
Coupled variable selection for regression modeling of complex treatment patterns in a clinical cancer registry.
2013
For determining a manageable set of covariates potentially influential with respect to a time-to-event endpoint, Cox proportional hazards models can be combined with variable selection techniques, such as stepwise forward selection or backward elimination based on p-values, or regularized regression techniques such as component-wise boosting. Cox regression models have also been adapted for dealing with more complex event patterns, for example, for competing risks settings with separate, cause-specific hazard models for each event type, or for determining the prognostic effect pattern of a variable over different landmark times, with one conditional survival model for each landmark. Motivat…
Nearly exact sample size calculation for powerful non-randomized tests for differences between binomial proportions
2015
In the case of two independent samples, it turns out that among the procedures taken into consideration, Boschloo's technique of raising the nominal level in the standard conditional test as far as admissible performs best in terms of power against almost all alternatives. The computational burden entailed in exact sample size calculation is comparatively modest for both the uniformly most powerful unbiased randomized and the conservative non-randomized version of the exact Fisher-type test. Computing these values yields a pair of bounds enclosing the exact sample size required for the Boschloo test, and it seems reasonable to replace the exact value with the middle of the corresponding inter…
The “ThreePlusOne” Likelihood-Based Test Statistics: Unified Geometrical and Graphical Interpretations
2014
The presentation of the well known Likelihood Ratio, Wald and Score test statistics in textbooks appears to lack a unified graphical and geometrical interpretation. We present two simple graphical representations on a common scale for these three test statistics, and also the recently proposed Gradient test statistic. These unified graphical displays may favour better understanding of the geometrical meaning of the likelihood based statistics and provide useful insights into their connections.
Inferential tools in penalized logistic regression for small and sparse data: A comparative study.
2016
This paper focuses on inferential tools in the logistic regression model fitted by the Firth penalized likelihood. In this context, the Likelihood Ratio statistic is often reported to be the preferred choice as compared to the ‘traditional’ Wald statistic. In this work, we consider and discuss a wider range of test statistics, including the robust Wald, the Score, and the recently proposed Gradient statistic. We compare all these asymptotically equivalent statistics in terms of interval estimation and hypothesis testing via simulation experiments and analyses of two real datasets. We find out that the Likelihood Ratio statistic does not appear the best inferential device in the Firth penal…