ORIGINAL RESEARCH article
Front. Educ.
Sec. Assessment, Testing and Applied Measurement
Volume 10 - 2025 | doi: 10.3389/feduc.2025.1616879
Construct Comparability and the Limits of Post Hoc Modelling: Insights from International Baccalaureate Multi-Language Assessments
Provisionally accepted
- 1 International Baccalaureate, Cardiff, United Kingdom
- 2 University of Oxford, Oxford, England, United Kingdom
Construct comparability was investigated across different subjects in the International Baccalaureate (IB) Diploma Programme (DP). A Rasch Partial Credit Model (PCM) was applied to historical assessment data to generate statistical measures of the relative "difficulty" of IB subjects and languages. Specifically, the analysis centred on different language versions of literature assessments, where exams differ in content but are designed to assess the same target constructs. Rasch analyses were conducted sequentially on three subsets of data. Three different conceptualizations of the linking construct were compared, with the aim of narrowing the definition to increase the validity of the comparisons. These ranged from linking different DP subjects by "general academic ability" to linking English, Spanish and Chinese language versions of literature by the more relevant construct of "literary analysis". Ultimately, the Rasch analyses produced three different rank orders of "difficulty" for the assessments, illustrating the limitations of post hoc construct comparability investigations. Whilst literary analysis is the most theoretically defensible linking construct in this context, the approach relies on bilingual students taking different language versions of the assessments and therefore has limited operational applicability. There are also conceptual limitations, as bilingual examinees are not representative of all students in DP cohorts. Further research is recommended into how cohort characteristics can affect performance, as well as how constructs are defined for use across linguistic and cultural subgroups. Such investigations are crucial to avoid introducing construct bias in the earliest stages of assessment design.
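As a rough illustration of the model named above, the sketch below computes category probabilities under a standard Rasch Partial Credit Model for a single polytomous item. It is a minimal sketch, not the authors' analysis code: the function names (`pcm_probabilities`, `expected_score`) and the example threshold values are hypothetical, and no values are taken from the study.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the Partial Credit Model.

    theta      -- person ability (logits)
    thresholds -- step difficulties delta_1..delta_m (logits); hypothetical values
    Returns a list of m + 1 probabilities for scores 0..m.
    """
    # Cumulative sums of (theta - delta_k); score 0 has an empty sum of 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, thresholds):
    """Expected item score at ability theta."""
    probs = pcm_probabilities(theta, thresholds)
    return sum(score * p for score, p in enumerate(probs))


# Illustrative (made-up) thresholds for a 0-3 mark item.
example_thresholds = [-1.0, 0.0, 1.5]
example_probs = pcm_probabilities(0.5, example_thresholds)
```

In this formulation, an assessment with higher thresholds yields a lower expected score at the same ability, which is the sense in which Rasch-based comparisons rank assessments by relative "difficulty". That ranking is only meaningful if the ability scale represents the same construct across the assessments being linked.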
* English subjects are highlighted in blue, Chinese in orange and Spanish in green.
** Inf indicates an effectively infinite threshold due to insufficient data.
Keywords: assessment, comparability, cross-lingual, International Baccalaureate, Rasch modelling, test adaptation
Received: 23 Apr 2025; Accepted: 09 Jul 2025.
Copyright: © 2025 Badham, Meadows and Baird. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Louise Badham, International Baccalaureate, Cardiff, United Kingdom
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.