6533b820fe1ef96bd12798b5

RESEARCH PRODUCT


subject

05 social sciences; Comparability; 050301 education; computer.software_genre; 050105 experimental psychology; Education; Test (assessment); Convergent validity; Homogeneous; Educational assessment; Mathematics education; 0501 psychology and cognitive sciences; Content knowledge; Psychology; 0503 education; computer

description

To validly assess teachers’ pedagogical content knowledge (PCK), performance-based tasks with open-response formats are required. Automated scoring is considered an appropriate approach to reduce the resource intensity of human scoring and to achieve more consistent scoring results than human raters. The focus is on the comparability of human and automated scoring of PCK for economics teachers. The answers of (prospective) teachers (N = 852) to six open-response tasks from a standardized and validated test were scored by two trained human raters and by the engine "Educational SCoRIng TOolkit" (ESCRITO). The average agreement between human and computer ratings, κw = .66, suggests convergent validity of the scoring results. The results of the single-factor analysis of variance show a significant influence of the answers for each homogeneous subgroup (students n = 460, trainees n = 230, in-service teachers n = 162) on the automated scoring. Findings are discussed in terms of implications for the use of automated scoring in educational assessment and its potentials and limitations.
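The agreement figure κw is a weighted Cohen's kappa between the human and the automated ratings. The abstract does not state whether linear or quadratic weights were used, and the scoring engine's internals are not shown here; the following is only a minimal, self-contained sketch of how such a weighted kappa can be computed from two parallel rating vectors (the function name and example data are illustrative, not taken from the study):

```python
def weighted_kappa(rater_a, rater_b, n_categories, weights="linear"):
    """Weighted Cohen's kappa for two raters on ordinal categories 0..n_categories-1.

    weights="linear" penalizes disagreement by |i - j|,
    weights="quadratic" by (i - j) ** 2.
    """
    n = len(rater_a)
    k = n_categories
    # Observed contingency table of joint ratings.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[a][b] += 1
    row = [sum(obs[i]) for i in range(k)]              # marginals of rater A
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]  # marginals of rater B

    def w(i, j):
        d = abs(i - j)
        return d if weights == "linear" else d * d

    # Weighted observed disagreement vs. weighted chance-expected disagreement.
    num = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    den = sum(w(i, j) * row[i] * col[j] / n for i in range(k) for j in range(k))
    return 1.0 - num / den
```

With perfect agreement the function returns 1.0, and ratings that agree only at chance level yield values near 0; a value of .66, as reported, indicates substantial agreement between human and automated scoring.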