Search results for "item analysis"
showing 7 items of 17 documents
Metric properties of an instrument for assessing the Subjective Social Value of Education: the SECS-EVALNEC VSE-Estudiantes-Secundaria scale
2018
The study analyzes the results of administering an instrument intended to measure the Subjective Social Value of Education. Working with a group of 341 students in Compulsory Secondary Education in Spain, it presents the metric properties of the instrument using Classical Test Theory and Item Response Theory methods to check its reliability and goodness of fit. After checking the dimensionality of the scale and the functioning of the items, the data show that all items fit the reference model adequately. On this basis, we present a proposal for a unidimensional SECS-EVALNEC VSE-Estudiantes-Secundaria scale…
Threats to validity when using open-ended items in international achievement studies : Coding responses to the PISA 2012 problem-solving test in Finl…
2015
Open-ended (OE) items are widely used to gather data on student performance in international achievement studies. However, several factors may threaten validity when using such items. This study examined Finnish coders’ opinions about threats to validity when coding responses to OE items in the PISA 2012 problem-solving test. A total of 6 discussions during 6 coder practice sessions (on 6 OE items) and an interview between 5 coders were audiorecorded and analyzed by means of content analysis, and 3 main threats to validity were found: (1) unclear and complex questions; (2) arbitrary and illogical coding rubrics; and (3) unclear and ambiguous responses. Suggestions are given as to how to res…
Estimating person parameters via item response model and simple sum score in small samples with few polytomous items: A simulation study
2018
Background: Item Response Theory (IRT) is becoming increasingly popular for item analysis. Theoretical considerations and simulation studies suggest that parameter estimates become precise only when many items are used in large samples. Method: A simulation study focusing on a single scale was performed on data with (a) n = 40, 60, 80, 120, 200, 300, 500, and 900 cases utilizing (b) 4, 8, 16, or 32 items. The items were (c) symmetrically distributed vs. skewed (skewness 0, 1, and 2). Item loadings were (d) homogeneous vs. heterogeneous. Item loadings were (e) low vs. high. Half of the items had (f) a correlated error or not. The number of response categories (g) was four vs. five. A to…
Attempt to Construct a Scale for the Measurement of the Effect of Suggestion on Perception
1975
A scale based on experimental methods has been prepared for measuring the effects of indirect suggestion upon perception. Three categories are included: (1) distorting the interpretation of presented stimuli, (2) inducing sense-impressions in the absence of adequate stimuli, and (3) producing insensitivity to stimuli that are objectively present. Test situations were designed for tactual, auditory, and visual perception. The scale was tested on a sample of 112 students from the 11th and 12th grades of a large city high school (58 girls and 54 boys). Most of the item intercorrelations were positive and many significantly so. Eliminating the 9 lowest items of 21 left 12 for a reduced matrix,…
Test fairness: a DIF analysis of an L2 vocabulary test
2000
The purpose of this study is to analyse gender-uniform differential item functioning (DIF) in a second language (L2) vocabulary test with the tools of item response theory (the separate calibration t-method) and to study potential gender impact on the test performance measured by different item composites. The results of the study show that despite the fact that there are test items with indications of DIF in favour of either females or males, the test as a whole is not gender-biased. In spite of this, it was demonstrated that some item composites are gender-biased. In view of item bank building and use, it means that some of the tests constructed on the basis of an item bank might be bias…
Development of Computerized Adaptive Testing for Emotion Regulation
2020
Emotion regulation (ER) plays a vital role in individuals’ well-being and successful functioning. In this study, we attempted to develop a computerized adaptive test (CAT) to evaluate ER efficiently, namely the CAT-ER. The initial CAT-ER item bank comprised 154 items from six commonly used ER scales, which were completed by 887 participants recruited in China. We conducted unidimensionality testing, item response theory (IRT) model comparison and selection, and IRT item analysis including local independence, item fit, differential item functioning, and item discrimination. Sixty-three items with good psychometric properties were retained in the final CAT-ER. Then, two CAT simulation stud…
Distractor Efficiency in an Item Pool for a Statistics Classroom Exam: Assessing Its Relation With Item Cognitive Level Classified According to Bloom…
2018
Multiple-choice items are one of the most commonly used tools for evaluating students’ knowledge and skills. A key aspect of this type of assessment is the presence of functioning distractors, i.e., incorrect alternatives intended to be plausible for students with lower achievement. To our knowledge, no work has investigated the relationship between distractor performance and the complexity of the cognitive task required to give the correct answer. The aim of this study was to investigate this relation, employing the first three levels of Bloom’s taxonomy (Knowledge, Comprehension, and Application). Specifically, it was hypothesized that items classified into a higher level of Bloom’s class…