METHODS article

Front. Genet.
Sec. Applied Genetic Epidemiology
Volume 15 - 2024 | doi: 10.3389/fgene.2024.1203577
This article is part of the Research Topic Methods in Applied Genetic Epidemiology 2022

Longitudinal method comparison: Modeling polygenic risk for posttraumatic stress disorder over time in individuals of African and European ancestry

Provisionally accepted
Kristin Passero 1, Jennie G. Noll 2, Claire Selin 2, Molly A. Hall 1,3,4*
  • 1 Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States
  • 2 Department of Human Development and Family Studies, College of Health and Human Development, The Pennsylvania State University, University Park, Pennsylvania, United States
  • 3 Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States
  • 4 Penn State Cancer Institute, College of Medicine, The Pennsylvania State University, Hershey, Pennsylvania, United States

The final, formatted version of the article will be published soon.

    Cross-sectional data allow investigation of how genetics influence health at a single timepoint, but understanding how the genome impacts phenotype development requires repeated measures data. Ignoring the dependency inherent in repeated measures can inflate false positive rates, so such data call for methods other than general or generalized linear models. Many methods can accommodate longitudinal data, including the commonly used linear mixed model and generalized estimating equation, as well as the less popular fixed effects model, cluster-robust standard error adjustment, and aggregate regression. We simulated longitudinal data and applied these five methods alongside a naïve linear regression, which ignored dependency and served as a baseline, to compare their power, false positive rate, and estimation accuracy and precision. Results showed that naïve linear regression and fixed effects models incurred high false positive rates when analyzing a predictor that is fixed over time, making them unviable for studying time-invariant genetic effects. Linear mixed models maintained low false positive rates and unbiased estimation. Generalized estimating equations were similar to linear mixed models in power and estimation but, like cluster-robust standard errors, had increased false positives when the sample size was low. Aggregate regression produced biased estimates when predictor effects varied over time. To show how method choice affects downstream results, we performed longitudinal analyses in an adolescent cohort of African and European ancestry. We examined how polygenic risk, traumatic events, exposure to sexual abuse, and income predicted the development of posttraumatic stress symptoms using four approaches: linear mixed models, generalized estimating equations, cluster-robust standard errors, and aggregate regression. While directions of effect were generally consistent, coefficient magnitudes and statistical significance differed across methods. Through our in-depth comparison of longitudinal methods, we found that linear mixed models and generalized estimating equations were applicable in most scenarios requiring longitudinal modeling, but that no two approaches produced identical results even when fit to the same data. Since discrepancies can result from methodological choices, it is crucial that researchers determine their model a priori, refrain from testing multiple approaches to obtain favorable results, and use as similar a method as possible when seeking to replicate results.
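    The false positive inflation described above for naïve linear regression can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the sample size, number of waves, variance components, the 1.96 normal-approximation cutoff, and the function names are all assumptions chosen for illustration. It generates repeated measures with a subject-level random intercept and a time-invariant predictor whose true effect is zero, then compares how often a naïve regression on all observations and an aggregate regression on per-subject means reject the null.

```python
import numpy as np


def ols_slope_z(x, y):
    """Slope divided by its conventional OLS standard error (simple regression with intercept)."""
    n = len(x)
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = np.sum(xc ** 2)
    beta = np.sum(xc * yc) / sxx
    resid = yc - beta * xc
    sigma2 = np.sum(resid ** 2) / (n - 2)
    return beta / np.sqrt(sigma2 / sxx)


def false_positive_rates(n_subjects=200, n_waves=4, n_sims=1000, seed=0):
    """Empirical rejection rates under the null for naive and aggregate regression.

    All settings are illustrative assumptions, not values from the article.
    """
    rng = np.random.default_rng(seed)
    crit = 1.96  # two-sided 5% cutoff, normal approximation
    naive_hits = 0
    aggregate_hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n_subjects)              # time-invariant predictor, true effect = 0
        u = rng.normal(size=n_subjects)              # subject-level random intercept
        e = rng.normal(size=(n_subjects, n_waves))   # wave-level noise
        y = u[:, None] + e                           # outcomes: one row per subject

        # Naive regression: stack every observation and ignore the clustering.
        x_long = np.repeat(x, n_waves)
        y_long = y.ravel()
        naive_hits += abs(ols_slope_z(x_long, y_long)) > crit

        # Aggregate regression: one averaged outcome per subject.
        aggregate_hits += abs(ols_slope_z(x, y.mean(axis=1))) > crit
    return naive_hits / n_sims, aggregate_hits / n_sims


if __name__ == "__main__":
    naive_fpr, aggregate_fpr = false_positive_rates()
    print(f"naive regression false positive rate:     {naive_fpr:.3f}")
    print(f"aggregate regression false positive rate: {aggregate_fpr:.3f}")
```

    Under these assumed settings the naïve regression treats correlated observations from one subject as independent, so its standard error is too small and it rejects far more often than the nominal 5%, while the per-subject average stays near the nominal rate. The sketch covers only a time-invariant predictor with a constant (null) effect; it does not reproduce the bias the abstract reports for aggregate regression when predictor effects vary over time.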

    Keywords: longitudinal analysis methods, repeated measures, simulation study, polygenic risk scores, posttraumatic stress disorder, longitudinal method comparison

    Received: 11 Apr 2023; Accepted: 15 Apr 2024.

    Copyright: © 2024 Passero, Noll, Selin and Hall. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Molly A. Hall, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, 16802, Pennsylvania, United States

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.