ORIGINAL RESEARCH article
Front. Psychol.
Sec. Quantitative Psychology and Measurement
Volume 16 - 2025 | doi: 10.3389/fpsyg.2025.1640864
This article is part of the Research Topic: Objective Measurement of Subjective Beliefs: Improving the Usefulness of Elicitation and Assessment Methods
A Transformer-Based Embedding Approach to Developing Short-Form Psychological Measures
Provisionally accepted
Department of Psychology, Jeonbuk National University, Jeonju, Republic of Korea
Developing short-form psychological measures is essential for reducing respondent burden and for saving time and economic resources. Existing approaches to short-form development typically require administering the full scale and then apply factor analysis or machine learning techniques to the response data. In contrast, the present study proposes a novel, data-independent method for item reduction using transformer-based semantic embeddings. Items from the International Personality Item Pool 50-item Big-Five Factor Markers (IPIP-50) were embedded using the sentence-t5-xxl model to generate dense semantic representations. These embeddings were clustered via K-means, and representative items were selected based on their proximity to cluster centroids. The resulting short form, consisting of 30 items, preserved the original five-factor structure and demonstrated strong psychometric properties. When compared with existing item reduction techniques (namely, Classical Test Theory and a Genetic Algorithm), the proposed method achieved comparable levels of reliability, convergent validity, and predictive performance. These findings highlight the potential of transformer-based embedding approaches not only for efficient item reduction but also for informing item development. The results support the feasibility of a resource-efficient, linguistically grounded alternative to data-dependent reduction methods.
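The selection step described above (cluster the item embeddings, then keep the items nearest each centroid) can be illustrated with a minimal sketch. It is not the authors' implementation. The two-dimensional vectors are made-up stand-ins for real sentence-t5-xxl embeddings, and the function names, the parameters `k` and `per_cluster`, the use of squared Euclidean distance, and the plain Lloyd's-algorithm K-means are all assumptions chosen for illustration. The abstract does not state the number of clusters or the items kept per cluster.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to the given vector."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(vector, centroids[c]))


def kmeans(vectors, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest_centroid(v, centroids) for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels, centroids


def select_short_form(items, vectors, k, per_cluster, seed=0):
    """Keep the `per_cluster` items closest to each cluster centroid."""
    labels, centroids = kmeans(vectors, k, seed=seed)
    selected = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        members.sort(key=lambda i: squared_distance(vectors[i], centroids[c]))
        selected.extend(members[:per_cluster])
    return [items[i] for i in sorted(selected)]  # original item order


# Illustrative item texts with made-up 2-D "embeddings" (assumption:
# real embeddings would come from the sentence encoder instead).
ITEMS = [
    "Am the life of the party.",
    "Talk to a lot of different people at parties.",
    "Feel comfortable around people.",
    "Get stressed out easily.",
    "Worry about things.",
    "Get upset easily.",
]
VECTORS = [
    [1.0, 0.0], [0.9, 0.1], [0.7, 0.2],
    [0.0, 1.0], [0.1, 0.9], [0.2, 0.7],
]

SHORT_FORM = select_short_form(ITEMS, VECTORS, k=2, per_cluster=1)
print(SHORT_FORM)
```

Because selection depends only on the item texts' vectors, no response data enters this step, which is the sense in which the approach is data-independent. In practice the vectors would be high-dimensional embeddings, and the choice of `k` and `per_cluster` would have to follow the full article rather than this toy setup.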
Keywords: short-form development, item reduction, transformer-based embedding, semantic clustering, psychological measures
Received: 04 Jun 2025; Accepted: 21 Jul 2025.
Copyright: © 2025 Jung and Seo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Jang-Won Seo, Department of Psychology, Jeonbuk National University, Jeonju, Republic of Korea
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.