
RESEARCH PRODUCT

Estimating person parameters via item response model and simple sum score in small samples with few polytomous items: A simulation study

Christian Meesters, Jochen Hardt, Philipp Schwall

subject

Statistics and Probability; Analysis of Variance; Scale (ratio); Epidemiology; Item analysis; Skew; Polytomous Rasch model; Missing data; 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Simple (abstract algebra); Skewness; Sample Size; Statistics; Item response theory; Humans; Regression Analysis; Computer Simulation; 030212 general & internal medicine; 0101 mathematics; Correlation of Data; Mathematics

description

Background: Item response theory (IRT) is becoming increasingly popular for item analysis. Theoretical considerations and simulation studies suggest that parameter estimates become precise only when many items are used in large samples. Method: A simulation study focusing on a single scale was performed on data with (a) n = 40, 60, 80, 120, 200, 300, 500, and 900 cases and (b) 4, 8, 16, or 32 items. The items were (c) symmetrically distributed or skewed (skewness 0, 1, and 2). Item loadings were (d) homogeneous or heterogeneous and (e) low or high. Half of the items (f) either had a correlated error or did not. The number of response categories (g) was four or five. Each item had 10% missing values. The ability estimate from the IRT model and the simple sum score served as criteria for evaluating the results. Results: The ability estimate from the IRT model outperformed the sum score when there were many items, the items were skewed, and the item loadings were heterogeneous and high. The sum score outperformed the ability estimate when there were few items, the items were not skewed, and the item loadings were homogeneous and low. However, convergence rates were sometimes low in small samples. Correlated errors negatively affected both the ability estimate and the sum score. Conclusion: With skewed item distributions and heterogeneous item loadings, using an IRT model is recommended. However, with few items, many cases are required; conversely, with few cases, many items are required. With few items and few cases, the sum score performs better.
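The design factors (a) to (g) listed in the description can be illustrated with a minimal Python sketch. It is not the authors' simulation code. It assumes the factors are fully crossed (the description does not say so), and it generates one cell of symmetric items only, scoring them with the simple sum score. The function names, the loading value of 0.7, the equally spaced thresholds, the completely-at-random missingness, and the person-mean handling of missing responses are all hypothetical choices made for illustration. No IRT model is fitted here.

```python
import itertools
import math
import random

# Factor levels as listed in the description; crossing them fully is an assumption.
DESIGN = {
    "n_cases": [40, 60, 80, 120, 200, 300, 500, 900],
    "n_items": [4, 8, 16, 32],
    "skewness": [0, 1, 2],
    "loading_pattern": ["homogeneous", "heterogeneous"],
    "loading_level": ["low", "high"],
    "correlated_error": [False, True],
    "n_categories": [4, 5],
}


def design_cells(design=DESIGN):
    """Return every combination of factor levels as a list of dicts."""
    keys = list(design)
    return [dict(zip(keys, combo)) for combo in itertools.product(*design.values())]


def pearson(x, y):
    """Plain Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sum_score_recovery(n_cases, n_items, n_categories,
                       loading=0.7, missing_rate=0.10, seed=0):
    """Simulate one symmetric, homogeneous cell and return the correlation
    between the true ability and the sum score.

    Hypothetical choices: loading 0.7, equally spaced thresholds,
    missingness completely at random, person-mean scaling of observed items.
    """
    rng = random.Random(seed)
    # Symmetric thresholds, e.g. [-1, 0, 1] for four categories.
    thresholds = [k - n_categories / 2 for k in range(1, n_categories)]
    residual_sd = math.sqrt(1 - loading ** 2)
    abilities, scores = [], []
    for _ in range(n_cases):
        theta = rng.gauss(0, 1)
        observed = []
        for _ in range(n_items):
            latent = loading * theta + residual_sd * rng.gauss(0, 1)
            category = sum(latent > t for t in thresholds)  # 0 .. n_categories - 1
            if rng.random() >= missing_rate:  # response is observed
                observed.append(category)
        if observed:  # skip persons with no observed response at all
            abilities.append(theta)
            scores.append(n_items * sum(observed) / len(observed))
    return pearson(abilities, scores)


if __name__ == "__main__":
    print(len(design_cells()), "design cells under full crossing")
    print(round(sum_score_recovery(200, 8, 4), 3), "ability vs. sum score correlation")
```

Under these assumptions, the same function can be called for each sample size and item count to see how the sum score's recovery of the true ability changes across cells. Comparing it with an IRT ability estimate would require fitting a polytomous model with a dedicated package.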

https://doi.org/10.1002/sim.8280