ORIGINAL RESEARCH article

Front. Educ.

Sec. Assessment, Testing and Applied Measurement

Volume 10 - 2025 | doi: 10.3389/feduc.2025.1538486

Evaluating accuracy and bias of different comparative judgment equating methods against traditional statistical equating

Provisionally accepted
  • Office of Qualifications and Examinations Regulation (Ofqual), Coventry, United Kingdom

The final, formatted version of the article will be published soon.

Traditional common-item or common-person statistical equating cannot always be used for standard maintaining or linking between test forms. In some contexts, comparative judgment (CJ) methods, which capture expert judgment of the quality of student work on different test forms, have been trialed for this purpose. While the plausibility, reliability and replicability of CJ outcomes have been shown to be high, little research has established the accuracy of CJ, that is, the agreement between CJ outcomes and outcomes established by robust statistical equating. We report on the accuracy of outcomes from several trials and replications of different CJ methods and associated analytical approaches, compared with operational IRT statistical equating, and demonstrate largely close alignment between the two. We also compare the CJ methods and analytical approaches in terms of outcome precision, replicability and evidence of bias in expert judgment (that is, a tendency to prefer student work on easier test forms). Finally, we discuss the advantages and disadvantages of the different CJ methods and analytical approaches, and their potential for informing standard maintaining in different contexts.

Keywords: comparative judgment, standard maintaining, equating, linking, comparability, performance standards, assessment

Received: 02 Dec 2024; Accepted: 05 Aug 2025.

Copyright: © 2025 Curcin and Lee. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Milja Curcin, Office of Qualifications and Examinations Regulation (Ofqual), Coventry, United Kingdom

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.