Association Analysis Between Polygenic Risk Scores and Traits: Practical Guidelines and Tutorial with an Illustrative Data Set of Schizophrenia

Irigoien, Itziar; Mas Bermejo, Patricia; Papiol, Sergi; Barrantes-Vidal, Neus; Rosa, Araceli; Arenas, Concepcion

doi:10.3389/fpsyt.2025.1621972

METHODS article

Front. Psychiatry

Sec. Behavioral and Psychiatric Genetics

Volume 16 - 2025 | doi: 10.3389/fpsyt.2025.1621972

This article is part of the Research Topic: Insights in Behavioral and Psychiatric Genetics

Provisionally accepted

Itziar Irigoien¹

Patricia Mas Bermejo²

Sergi Papiol³,⁴

Neus Barrantes-Vidal⁵

Araceli Rosa²

Concepcion Arenas²*

¹Euskal Herriko Unibertsitatea (UPV/EHU), San Sebastian, Spain
²University of Barcelona, Barcelona, Spain
³University Hospital, LMU, Munich, Germany
⁴Institute of Psychiatric Phenomics and Genomics (IPPG), Munich, Germany
⁵Universitat Autonoma de Barcelona, Barcelona, Spain

The final, formatted version of the article will be published soon.

Most methodological papers on Polygenic Risk Scores (PRSs) explain the laborious process of computing the PRS in great depth. As a last step, they generally state that a possible association between a PRS and a trait of interest should be tested with regression models (linear or logistic, depending on the data type), adjusting for covariates (e.g., sex, age, clinical information, or genetic ancestry-based principal components). When covariates are included, the quantities usually reported are the increment in the variance explained when the PRS is added to the model and the significance of the PRS term. However, studying the association between PRSs and a trait is a complex task that requires proper modeling and analysis, since interactions and model validation conditions are crucial aspects. Even though excellent papers explain how to use and interpret the results of such regression models, important information from the previously calculated PRS may sometimes be lost, partly because analyses are automated. This guide is intended to fill a gap in association studies between PRSs and a trait and to facilitate the analysis so that statistically correct results are obtained. It contains a motivating real data case, analyzed exhaustively, to illustrate how to approach a real analysis. It is also accompanied by four examples, called Working Examples, which present different situations the researcher may encounter, together with the R code for analyzing all of these data sets and the corresponding application of the steps in this guide.
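As a reading aid, the two quantities mentioned in the abstract (the increment in variance explained and the significance of the PRS term) can be written down for the linear case as a comparison of two nested models. This is only a sketch under assumed notation, not the authors' own formulation: the symbols $y_i$ (trait), $\mathrm{PRS}_i$, $c_{ik}$ (the $k$-th covariate of individual $i$), $\beta$, $\gamma_k$ and $\Delta R^2$ are introduced here for illustration.

```latex
% Assumed notation (not taken from the article):
% y_i = trait, PRS_i = polygenic risk score, c_{ik} = k-th covariate, K covariates.
\begin{align}
  \text{Null model:} \quad
    y_i &= \beta_0 + \sum_{k=1}^{K} \gamma_k\, c_{ik} + \varepsilon_i \\
  \text{Full model:} \quad
    y_i &= \beta_0 + \beta_{\mathrm{PRS}}\, \mathrm{PRS}_i
           + \sum_{k=1}^{K} \gamma_k\, c_{ik} + \varepsilon_i \\
  \text{Increment in variance explained:} \quad
    \Delta R^2 &= R^2_{\text{full}} - R^2_{\text{null}} \\
  \text{Test of the PRS term:} \quad
    H_0 &: \beta_{\mathrm{PRS}} = 0
    \quad \text{versus} \quad
    H_1 : \beta_{\mathrm{PRS}} \neq 0
\end{align}
```

For a binary trait, the same comparison would be made with logistic models, where the left-hand side becomes the log-odds of the trait and a pseudo-$R^2$ takes the place of $R^2$. Interaction terms between the PRS and a covariate, which the abstract flags as a crucial aspect, would enter the full model as additional products such as $\mathrm{PRS}_i \cdot c_{ik}$.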

Keywords: polygenic risk score, statistical analysis, covariates, schizophrenia, psychotic-like experiences

Received: 02 May 2025; Accepted: 28 Jul 2025.

Copyright: © 2025 Irigoien, Mas Bermejo, Papiol, Barrantes-Vidal, Rosa and Arenas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Concepcion Arenas, University of Barcelona, Barcelona, Spain

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.