Search results for "score"

showing 10 items of 852 documents

Sample size planning for survival prediction with focus on high-dimensional data

2011

Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to control the difference between optimal and expected prediction error. Prediction is carried out by Cox proportional hazards models. The general approach considers censoring as well as low-dimensional and high-dimensional explanatory variables. For dimension reduction in the high-dimensional setting, a variable selection step is inserted. If not all informative variables are included…

Statistics and Probability, Clustering high-dimensional data, Clinical Trials as Topic, Lung Neoplasms, Models Statistical, Kaplan-Meier Estimate, Epidemiology, Proportional hazards model, Dimensionality reduction, Gene Expression, Feature selection, Kaplan-Meier Estimate, Biostatistics, Prognosis, Brier score, Sample size determination, Carcinoma Non-Small-Cell Lung, Sample Size, Censoring (clinical trials), Statistics, Humans, Proportional Hazards Models, Mathematics, Statistics in Medicine

researchProduct

On Rao Score and Pearson X² Statistics in Generalized Linear Models

2005

The identity of the Rao score and Pearson X² statistics is well known in the areas where the latter was first introduced: goodness-of-fit in contingency tables and binary responses. We show in this paper that the same identity holds when the two statistics are used for testing goodness-of-fit of Generalized Linear Models. We also highlight the connections that exist between the two statistics when they are used for the comparison of nested models. Finally, we discuss some merits of these unifying results.
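
As a rough illustration of the identity in its simplest setting, the binary-response case that the abstract describes as well known (not the Generalized Linear Model result of the paper itself), the Python sketch below computes both statistics for a single binomial sample. The function names and the example counts (37 successes in 100 trials, null proportion 0.5) are assumptions chosen purely for illustration.

```python
import math

def rao_score_binomial(x, n, p0):
    """Rao score statistic for H0: p = p0, given x successes in n trials."""
    score = (x - n * p0) / (p0 * (1 - p0))   # derivative of the log-likelihood at p0
    information = n / (p0 * (1 - p0))        # Fisher information at p0
    return score ** 2 / information

def pearson_x2_binomial(x, n, p0):
    """Pearson X^2 statistic: sum of (observed - expected)^2 / expected over both cells."""
    observed = [x, n - x]
    expected = [n * p0, n * (1 - p0)]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data, assumed for illustration only.
score_stat = rao_score_binomial(37, 100, 0.5)
pearson_stat = pearson_x2_binomial(37, 100, 0.5)
print(score_stat, pearson_stat, math.isclose(score_stat, pearson_stat))
```

Both functions reduce algebraically to (x - n*p0)^2 / (n*p0*(1 - p0)), so the two values agree up to floating-point rounding.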

Statistics and Probability, Contingency table, Proper linear model, statistic, Linear model, Score, Rao score, Generalized linear mixed model, Hierarchical generalized linear model, Quasi-likelihood, Statistics, Statistics Probability and Uncertainty, linear models, Generalized estimating equation, Mathematics


Adaptive reference-free compression of sequence quality scores

2014

Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…

Statistics and Probability, FOS: Computer and information sciences, Computer science, media_common.quotation_subject, Reference-free, computer.software_genre, Biochemistry, DNA sequencing, Set (abstract data type), Redundancy (information theory), BWT, Computer Science - Data Structures and Algorithms, Code (cryptography), Animals, Humans, Quality (business), Data Structures and Algorithms (cs.DS), Quantitative Biology - Genomics, Caenorhabditis elegans, Molecular Biology, media_common, Genomics (q-bio.GN), Sequence, Genome, Settore INF/01 - Informatica, reference-free compression, High-Throughput Nucleotide Sequencing, Genomics, Sequence Analysis DNA, Data Compression, compression, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, FOS: Biological sciences, Data mining, quality score, Metagenomics, computer, BWT; compression; quality score; reference-free compression, Algorithms, Reference genome


A note on adjusted responses, fitted values and residuals in Generalized Linear Models

2014

Adjusted responses, adjusted fitted values and adjusted residuals are known to play in Generalized Linear Models the role played in Linear Models by observations, fitted values and ordinary residuals. We think this parallelism, which was widely recognized and used in the early literature on Generalized Linear Models, has been somewhat overlooked in more recent presentations. We revise this parallelism, systematizing and proving some results that are either scattered or not satisfactorily spelled out in the literature. In particular, we formally derive the asymptotic dispersion matrix of the (scaled) adjusted residuals, by proving that in Generalized Linear Models the fitted values are asym…

Statistics and Probability, Generalized linear model, Covariance matrix, Linear model, Linear prediction, Wald test, Uncorrelated, Adjusted Residual, Wald test-statistic, Rao score test-statistic, Decomposition (computer science), Parallelism (grammar), Linear Model, Applied mathematics, Statistics Probability and Uncertainty, Settore SECS-S/01 - Statistica, Generalized Linear Model, Mathematics, Statistical Modelling


A differential-geometric approach to generalized linear models with grouped predictors

2016

We propose an extension of the differential-geometric least angle regression method to perform sparse group inference in a generalized linear model. An efficient algorithm is proposed to compute the solution curve. The proposed group differential-geometric least angle regression method has important properties that distinguish it from the group lasso. First, its solution curve is based on the invariance properties of a generalized linear model. Second, it adds groups of variables based on a group equiangularity condition, which is shown to be related to score statistics. An adaptive version, which includes weights based on the Kullback-Leibler divergence, improves its variable selection fea…

Statistics and Probability, Generalized linear model, Statistics::Theory, Mathematical optimization, Proper linear model, General Mathematics, ORACLE PROPERTIES, Generalized linear model, SPARSITY, Generalized linear array model, 01 natural sciences, Generalized linear mixed model, CONSISTENCY, 010104 statistics & probability, Score statistic., LEAST ANGLE REGRESSION, Linear regression, ESTIMATOR, Applied mathematics, Differential geometry, 0101 mathematics, Divergence (statistics), Mathematics, Variance function, Differential-geometric least angle regression, PATH ALGORITHM, Applied Mathematics, Least-angle regression, Score statistic, 010102 general mathematics, Agricultural and Biological Sciences (miscellaneous), Group lasso, GROUP SELECTION, Statistics Probability and Uncertainty, General Agricultural and Biological Sciences, Settore SECS-S/01 - Statistica


Premature conclusions about the signal‐to‐noise ratio in structural equation modeling research : A commentary on Yuan and Fang (2023)

2023

In a recent article published in this journal, Yuan and Fang (British Journal of Mathematical and Statistical Psychology, 2023) suggest comparing structural equation modeling (SEM), also known as covariance-based SEM (CB-SEM), estimated by normal-distribution-based maximum likelihood (NML), to regression analysis with (weighted) composites estimated by least squares (LS) in terms of their signal-to-noise ratio (SNR). They summarize their findings in the statement that “[c]ontrary to the common belief that CB-SEM is the preferred method for the analysis of observational data, this article shows that regression analysis via weighted composites yields parameter estimates with much smaller stan…

Statistics and Probability, Henseler-Ogasawara specification, effect size, statistical methods, partial least squares structural equation modeling, General Medicine, structural equation models, regression analysis, Arts and Humanities (miscellaneous), sum scores, covariance-based structural equation modeling, composite model, regression analysis with weighted composites, factor score regression, General Psychology


Coupled variable selection for regression modeling of complex treatment patterns in a clinical cancer registry.

2013

For determining a manageable set of covariates potentially influential with respect to a time-to-event endpoint, Cox proportional hazards models can be combined with variable selection techniques, such as stepwise forward selection or backward elimination based on p-values, or regularized regression techniques such as component-wise boosting. Cox regression models have also been adapted for dealing with more complex event patterns, for example, for competing risks settings with separate, cause-specific hazard models for each event type, or for determining the prognostic effect pattern of a variable over different landmark times, with one conditional survival model for each landmark. Motivat…

Statistics and Probability, Male, Niacinamide, Boosting (machine learning), Carcinoma Hepatocellular, Epidemiology, Computer science, Score, Feature selection, Antineoplastic Agents, computer.software_genre, Decision Support Techniques, Neoplasms, Covariate, Humans, Registries, Aged, Proportional Hazards Models, Proportional hazards model, Phenylurea Compounds, Liver Neoplasms, Regression analysis, Confounding Factors Epidemiologic, Middle Aged, Sorafenib, Prognosis, Regression, Cancer registry, Data Interpretation Statistical, Regression Analysis, Data mining, computer, Statistics in medicine


Nearly exact sample size calculation for powerful non-randomized tests for differences between binomial proportions

2015

In the case of two independent samples, it turns out that among the procedures taken into consideration, Boschloo's technique of raising the nominal level in the standard conditional test as far as admissible performs best in terms of power against almost all alternatives. The computational burden entailed in exact sample size calculation is comparatively modest for both the uniformly most powerful unbiased randomized and the conservative non-randomized version of the exact Fisher-type test. Computing these values yields a pair of bounds enclosing the exact sample size required for the Boschloo test, and it seems reasonable to replace the exact value with the middle of the corresponding inter…

Statistics and Probability, Score test, Exact statistics, Binomial test, symbols.namesake, Exact test, McNemar's test, Sample size determination, Statistics, symbols, Sign test, Statistics Probability and Uncertainty, Fisher's exact test, Mathematics, Statistica Neerlandica


The “ThreePlusOne” Likelihood-Based Test Statistics: Unified Geometrical and Graphical Interpretations

2014

The presentation of the well known Likelihood Ratio, Wald and Score test statistics in textbooks appears to lack a unified graphical and geometrical interpretation. We present two simple graphical representations on a common scale for these three test statistics, and also the recently proposed Gradient test statistic. These unified graphical displays may favour better understanding of the geometrical meaning of the likelihood based statistics and provide useful insights into their connections.
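
A minimal numerical sketch of the four statistics named above, assuming a single binomial sample with the null hypothesis expressed on the log-odds scale. The model, the function name, and the example counts are illustrative assumptions and are not taken from the paper. The gradient statistic is computed here as the score at the null value multiplied by the difference between the estimate and the null value, which is its usual form for a scalar parameter.

```python
import math

def likelihood_based_statistics(x, n, p0):
    """Return LR, Wald, score and gradient statistics for H0: p = p0.

    The parameter is the log-odds theta = log(p / (1 - p)); requires 0 < x < n.
    """
    p_hat = x / n
    theta_hat = math.log(p_hat / (1 - p_hat))   # maximum likelihood estimate
    theta_0 = math.log(p0 / (1 - p0))           # null value

    def loglik(p):
        return x * math.log(p) + (n - x) * math.log(1 - p)

    score_at_null = x - n * p0                  # d loglik / d theta at theta_0
    info_at_null = n * p0 * (1 - p0)            # Fisher information at theta_0
    info_at_mle = n * p_hat * (1 - p_hat)       # Fisher information at theta_hat

    return {
        "lr": 2 * (loglik(p_hat) - loglik(p0)),
        "wald": (theta_hat - theta_0) ** 2 * info_at_mle,
        "score": score_at_null ** 2 / info_at_null,
        "gradient": score_at_null * (theta_hat - theta_0),
    }

# Hypothetical data, assumed for illustration only.
stats = likelihood_based_statistics(37, 100, 0.5)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under the null hypothesis all four are referred to a chi-squared distribution with one degree of freedom; in this example they take similar but not identical values.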

Statistics and Probability, Score test, Interpretation (logic), Theoretical computer science, Scale (ratio), General Mathematics, Likelihood ratio Wald Score Gradient statistic geometrical interpretation graphical display, Simple (abstract algebra), Likelihood-ratio test, Statistics, Statistical inference, Test statistic, Statistics Probability and Uncertainty, Settore SECS-S/01 - Statistica, Statistical hypothesis testing, Mathematics, The American Statistician


Inferential tools in penalized logistic regression for small and sparse data: A comparative study.

2016

This paper focuses on inferential tools in the logistic regression model fitted by the Firth penalized likelihood. In this context, the Likelihood Ratio statistic is often reported to be the preferred choice as compared to the ‘traditional’ Wald statistic. In this work, we consider and discuss a wider range of test statistics, including the robust Wald, the Score, and the recently proposed Gradient statistic. We compare all these asymptotically equivalent statistics in terms of interval estimation and hypothesis testing via simulation experiments and analyses of two real datasets. We find out that the Likelihood Ratio statistic does not appear the best inferential device in the Firth penal…

Statistics and Probability, Score test, PRESS statistic, Epidemiology, Statistics as Topic, Score, Wald test, Logistic regression, 01 natural sciences, 010104 statistics & probability, 03 medical and health sciences, 0302 clinical medicine, Health Information Management, Statistics, Econometrics, Humans, 030212 general & internal medicine, 0101 mathematics, Statistic, Mathematics, Likelihood Functions, Models Statistical, Logistic regression firth penalized likelihood sandwich formula score statistic gradient statistic, Logistic Models, Likelihood-ratio test, Data Interpretation Statistical, Sample Size, Ancillary statistic, Settore SECS-S/01 - Statistica, Statistical methods in medical research
