
ORIGINAL RESEARCH article

Front. Educ.

Sec. Assessment, Testing and Applied Measurement

Volume 10 - 2025 | doi: 10.3389/feduc.2025.1616879

Construct Comparability and the Limits of Post Hoc Modelling: Insights from International Baccalaureate Multi-Language Assessments

Provisionally accepted
  • 1International Baccalaureate, Cardiff, United Kingdom
  • 2University of Oxford, Oxford, England, United Kingdom

The final, formatted version of the article will be published soon.

Construct comparability was investigated across different subjects in the International Baccalaureate (IB) Diploma Programme (DP). A Rasch Partial Credit Model (PCM) was applied to historical assessment data to generate statistical measures of the relative "difficulty" of IB subjects and languages. Specifically, the analysis centred on different language versions of literature assessments, where exams differ in content but are designed to assess the same target constructs. Rasch analyses were conducted sequentially on three subsets of data. Three different conceptualizations of the linking construct were compared, with the aim of narrowing the definition to increase the validity of the comparisons. These ranged from different DP subjects being linked by "general academic ability" to English, Spanish and Chinese language versions of literature being linked by the more relevant construct of "literary analysis". Ultimately, the Rasch analyses produced three different rank orders of "difficulty" for the assessments, illustrating the limitations of post hoc construct comparability investigations. Whilst literary analysis is the most theoretically defensible linking construct in this context, the approach relies on bilingual students taking different language versions of the assessments and therefore has limited operational applicability. There are also conceptual limitations, as bilingual examinees are not representative of all students in DP cohorts. Further research is recommended into how cohort characteristics can impact performance, as well as how constructs are defined for use across linguistic and cultural subgroups. Such investigations are crucial to avoid construct bias being introduced in the earliest stages of assessment design.
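For readers unfamiliar with the Partial Credit Model named above, the following is a minimal Python sketch of how the standard PCM turns a person ability and a set of step thresholds into category probabilities and an expected score. It is illustrative only and is not the authors' implementation; the function names `pcm_category_probabilities` and `expected_score`, and the threshold values in the usage note, are hypothetical and do not come from the article.

```python
import math


def pcm_category_probabilities(theta, thresholds):
    """Return P(X = k) for k = 0..m under the Rasch Partial Credit Model.

    theta      -- person ability in logits (illustrative value, not from the article)
    thresholds -- step difficulties delta_1..delta_m in logits (hypothetical values)
    """
    # The log-numerator for category k is the sum over j <= k of (theta - delta_j);
    # category 0 has an empty sum, i.e. 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))

    # Subtract the largest term before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, thresholds):
    """Expected mark on the item for a person with ability theta."""
    probs = pcm_category_probabilities(theta, thresholds)
    return sum(k * p for k, p in enumerate(probs))
```

Under this sketch, two items with the same mark range can be compared by their expected scores at a fixed ability. For example, `expected_score(0.0, [0.0, 2.0])` is lower than `expected_score(0.0, [-1.0, 1.0])`, which is the sense in which the first item would be called relatively more "difficult".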
Table notes: * English subjects are highlighted in blue, Chinese in orange and Spanish in green. ** Inf indicates an effectively infinite threshold due to insufficient data.

Keywords: assessment, comparability, cross-lingual, International Baccalaureate, Rasch modelling, test adaptation

Received: 23 Apr 2025; Accepted: 09 Jul 2025.

Copyright: © 2025 Badham, Meadows and Baird. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Louise Badham, International Baccalaureate, Cardiff, United Kingdom

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.