Overrating Classifier Performance in ROC Analysis in the Absence of a Test Set: Evidence from Simulation and Italian CARATkids Validation

RESEARCH PRODUCT

Giuliana Ferrante, Giovanni Viegi, Giovanna Cilluffo, Luciana Indinnimeo, Ilaria Baiardini, Salvatore Fasola, Stefania La Grutta, João Fonseca, Laura Montalbano

subject

Male; 020205 medical informatics; performance estimators; media_common.quotation_subject; Health Informatics; 02 engineering and technology; 03 medical and health sciences; 0302 clinical medicine; Health Information Management; Surveys and Questionnaires; Statistics; true predictive performance; Allergic Rhinitis; 0202 electrical engineering, electronic engineering, information engineering; medicine; Humans; sample split; Computer Simulation; 030212 general & internal medicine; Child; Asthma; Normality; Mathematics; media_common; Advanced and Specialized Nursing; Receiver operating characteristic; asthma control test; optimal cutoff; simulation study; Discriminant validity; Reproducibility of Results; Estimator; medicine.disease; Rhinitis, Allergic; Confidence interval; ROC Curve; Test set; Female; Classifier (UML)

description

Background: The use of receiver operating characteristic curves, or “ROC analysis,” has become quite common in biomedical research to support decisions. However, sensitivity, specificity, and misclassification rates are still often estimated on the training sample, overlooking the risk of overrating the test performance.

Methods: A simulation study was performed to highlight the inferential implications of splitting (or not splitting) the dataset into training and test sets. The classifier was assumed to be normally distributed given the disease status, and Youden's criterion was used to identify the optimal cutoff. An ROC analysis with sample split was then applied to assess the discriminant validity of the Italian version of the Control of Allergic Rhinitis and Asthma Test (CARATkids) questionnaire for children with asthma and rhinitis, for which recent studies may have reported liberal performance estimates.

Results: The simulation study showed that both single split and cross-validation (CV) provided unbiased estimators of sensitivity, specificity, and misclassification rate, therefore allowing computation of confidence intervals. For the Italian CARATkids questionnaire, the misclassification rate estimated by fivefold CV was 0.22 (95% confidence interval 0.14 to 0.30), indicating an acceptable discriminant validity.

Conclusions: Splitting the dataset into training and test sets avoids overrating the test performance in ROC analysis. Validated through this method, the Italian CARATkids is valid for assessing disease control in children with asthma and rhinitis.
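The contrast between estimating performance on the training sample and on a held-out test set can be illustrated with a small simulation. The sketch below is not the authors' code: the function names, the balanced class design, the sample sizes, the mean shift delta = 1, and the number of replications are all assumptions chosen for illustration. It draws a normally distributed classifier given disease status, picks the cutoff by Youden's criterion, and compares the estimated misclassification rate with the true rate of the fitted cutoff, once without a split and once with a single split.

```python
import math
import numpy as np


def draw(rng, n_per_class, delta):
    """Binormal sample: healthy ~ N(0, 1), diseased ~ N(delta, 1)."""
    healthy = rng.normal(0.0, 1.0, n_per_class)
    diseased = rng.normal(delta, 1.0, n_per_class)
    scores = np.concatenate([healthy, diseased])
    labels = np.concatenate([np.zeros(n_per_class, int), np.ones(n_per_class, int)])
    return scores, labels


def youden_cutoff(scores, labels):
    """Cutoff maximizing J = sensitivity + specificity - 1 (positive if score >= cutoff)."""
    best_j, best_c = -np.inf, None
    for c in np.unique(scores):
        pred = scores >= c
        sens = np.mean(pred[labels == 1])
        spec = np.mean(~pred[labels == 0])
        j = sens + spec - 1.0
        if j > best_j:
            best_j, best_c = j, c
    return float(best_c)


def misclassification(scores, labels, cutoff):
    """Observed fraction of subjects classified incorrectly at the given cutoff."""
    return float(np.mean((scores >= cutoff) != (labels == 1)))


def true_error(cutoff, delta):
    """Exact misclassification rate of a cutoff under the assumed balanced binormal model."""
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 0.5 * phi(cutoff - delta) + 0.5 * (1.0 - phi(cutoff))


def simulate(n_per_class=25, delta=1.0, n_rep=500, seed=0):
    """Mean bias (estimate minus truth) without a split and with a single split."""
    rng = np.random.default_rng(seed)
    resub_bias, split_bias = [], []
    for _ in range(n_rep):
        s_tr, y_tr = draw(rng, n_per_class, delta)
        s_te, y_te = draw(rng, n_per_class, delta)
        s_all, y_all = np.concatenate([s_tr, s_te]), np.concatenate([y_tr, y_te])

        # No split: cutoff chosen and evaluated on the same data.
        c_all = youden_cutoff(s_all, y_all)
        resub_bias.append(misclassification(s_all, y_all, c_all) - true_error(c_all, delta))

        # Single split: cutoff chosen on the training half, evaluated on the test half.
        c_tr = youden_cutoff(s_tr, y_tr)
        split_bias.append(misclassification(s_te, y_te, c_tr) - true_error(c_tr, delta))
    return float(np.mean(resub_bias)), float(np.mean(split_bias))


bias_resub, bias_split = simulate()
print(f"mean bias without split: {bias_resub:+.3f}")
print(f"mean bias with single split: {bias_split:+.3f}")
```

Under these assumptions the no-split estimate should be optimistic on average (negative bias), while the single-split estimate should be close to unbiased, at the price of a cutoff fitted on fewer observations.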

year: 2019-01-01

10.1055/s-0039-1693732 http://hdl.handle.net/10447/385149