RESEARCH PRODUCT

Empirical study of the dependence of the results of multivariable flexible survival analyses on model selection strategy

Amel Mahboubi, Christine Binquet, Jean Faivre, Valérie Jooste, Michal Abrahamowicz, Catherine Quantin, Claire Bonithon-Kopp

subject

Male; Statistics and Probability; Epidemiology; Age at diagnosis; Adenocarcinoma; Empirical research; Risk Factors; Stomach Neoplasms; Covariate; Statistics; Econometrics; Humans; Registries; Survival analysis; Aged; Parametric statistics; Mathematics; Models, Statistical; Model selection; Multivariable calculus; Age Factors; Middle Aged; Prognosis; Multivariate Analysis; Female; France; Log-linear model

description

Flexible survival models, which avoid assumptions about proportionality of hazards (PH) or linearity of the effects of continuous covariates, bring the issues of model selection to a new level of complexity. Each ‘candidate covariate’ requires inter-dependent decisions regarding (i) its inclusion in the model, and the representation of its effect on the log hazard as (ii) either constant over time or time-dependent (TD) and, for continuous covariates, (iii) either loglinear or non-loglinear (NL). Moreover, ‘optimal’ decisions for one covariate depend on the decisions regarding the others. Thus, an efficient model-building strategy is necessary. We carried out an empirical study of the impact of the model selection strategy on the estimates obtained in flexible multivariable survival analyses of prognostic factors for mortality in 273 gastric cancer patients. We used 10 different strategies to select alternative multivariable parametric as well as spline-based models, allowing flexible modeling of non-parametric (TD and/or NL) effects. We employed 5-fold cross-validation to compare the predictive ability of the alternative models. All flexible models indicated significant non-linearity and changes over time in the effect of age at diagnosis. Conventional ‘parametric’ models suggested the lack of a period effect, whereas more flexible strategies indicated a significant NL effect. Cross-validation confirmed that flexible models predicted mortality better. The resulting differences in the ‘final model’ selected by the various strategies also had an impact on the risk prediction for individual subjects. Overall, our analyses underline (a) the importance of accounting for significant non-parametric effects of covariates and (b) the need for developing accurate model selection strategies for flexible survival analyses. Copyright © 2008 John Wiley & Sons, Ltd.
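To illustrate why decisions (i) to (iii) make model selection combinatorially harder, the following minimal Python sketch enumerates the possible representations of each candidate covariate and counts the resulting candidate models. It is an illustration only, not the procedure used in the study: the covariate names and types, and the helper functions `covariate_options` and `count_candidate_models`, are hypothetical assumptions introduced here.

```python
from itertools import product


def covariate_options(is_continuous):
    """List the possible representations of one candidate covariate.

    Decision (i): excluded or included.
    Decision (ii): effect constant over time or time-dependent (TD).
    Decision (iii), continuous covariates only: loglinear or non-loglinear (NL).
    """
    options = [("excluded", None, None)]
    time_forms = ["constant", "TD"]
    # Decision (iii) does not apply to categorical covariates.
    shape_forms = ["loglinear", "NL"] if is_continuous else ["n/a"]
    for time_form, shape_form in product(time_forms, shape_forms):
        options.append(("included", time_form, shape_form))
    return options


def count_candidate_models(covariates):
    """Count all combinations of per-covariate representations."""
    total = 1
    for is_continuous in covariates.values():
        total *= len(covariate_options(is_continuous))
    return total


# Hypothetical covariate set (name -> is the covariate continuous?).
covariates = {"age": True, "period": True, "sex": False}

print(len(covariate_options(True)))        # 5 options for a continuous covariate
print(len(covariate_options(False)))       # 3 options for a categorical covariate
print(count_candidate_models(covariates))  # 5 * 5 * 3 = 75 candidate models
```

Under these assumptions, even three covariates yield 75 candidate models, and the count grows multiplicatively with each added covariate, which is one way to see why an exhaustive search quickly becomes impractical and a model-building strategy is needed.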

https://doi.org/10.1002/sim.3447