AUTHOR=Lesterhuis Marije, Bouwer Renske, van Daal Tine, Donche Vincent, De Maeyer Sven TITLE=Validity of Comparative Judgment Scores: How Assessors Evaluate Aspects of Text Quality When Comparing Argumentative Texts JOURNAL=Frontiers in Education VOLUME=Volume 7 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/education/articles/10.3389/feduc.2022.823895 DOI=10.3389/feduc.2022.823895 ISSN=2504-284X ABSTRACT=The advantage of comparative judgement is that it is particularly suited to assessing multidimensional and complex constructs such as text quality. This is because assessors are asked to compare texts holistically and to judge the quality of each text in a pairwise comparison based on the most salient and critical differences. In addition, the resulting rank order is based on the judgements of all assessors and thus represents their shared consensus. To select the right number of assessors, the question is to what extent assessors' individual conceptualizations of text quality prevail in the aspects they base their judgements on, or whether comparative judgement minimizes the differences between assessors. In other words, can we detect types of assessors who tend to consider certain aspects of text quality more often than others? A total of 64 assessors compared argumentative texts, after which they provided decision statements on which aspects of text quality had informed their judgement. These decision statements were coded on six overarching themes of text quality: argumentation, organization, language use, language conventions, source use, references and layout. Using a multilevel latent class analysis, four different types of assessors could be distinguished: narrowly-focused, broadly-focused, source-focused and language-focused. However, the analysis also showed that all assessor types mainly focused on argumentation and organization, and that assessor type only partly explained whether an aspect of text quality was mentioned in a decision statement.
We conclude that comparative judgement is a strong method for comparing complex constructs like text quality: first, because the rank order combines different views on text quality, but foremost because the method of comparative judgement minimizes differences between assessors.