AUTHOR=Song Zeyuan, Gunn Sophia, Monti Stefano, Peloso Gina M., Liu Ching-Ti, Lunetta Kathryn, Sebastiani Paola

TITLE=Learning Gaussian graphical models from correlated data

JOURNAL=Frontiers in Systems Biology

VOLUME=Volume 5 - 2025

YEAR=2025

URL=https://www.frontiersin.org/journals/systems-biology/articles/10.3389/fsysb.2025.1589079

DOI=10.3389/fsysb.2025.1589079

ISSN=2674-0702

ABSTRACT=Gaussian Graphical Models (GGMs) are a type of network model that uses partial correlation rather than correlation to represent complex relationships among multiple variables. The advantage of partial correlation is that it shows the relation between two variables after “adjusting” for the effects of the other variables, which leads to more parsimonious and interpretable models. There are well-established procedures to build GGMs from a sample of independent and identically distributed observations. However, many studies include clustered and longitudinal data that result in correlated observations, and ignoring this correlation among observations can inflate the Type I error. In this paper, we propose a cluster-based bootstrap algorithm to infer GGMs from correlated data. We use extensive simulations of correlated data from family-based studies to show that the proposed bootstrap method does not inflate the Type I error while retaining statistical power compared to alternative solutions when there is a sufficient number of clusters. We apply our method to learn the Gaussian Graphical Model that represents complex relations between 47 Polygenic Risk Scores generated using genome-wide genotype data from the Long Life Family Study. By comparing it to conventional methods that ignore within-cluster correlation, we show that our method controls the Type I error well without power loss.
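
The idea of a cluster-based bootstrap for partial correlations can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the percentile confidence interval, and the rule "keep an edge when the interval excludes zero" are assumptions chosen for illustration only. The one feature taken from the abstract is that whole clusters (for example, families) are resampled rather than individual observations, so within-cluster correlation is preserved in every bootstrap replicate.

```python
import numpy as np

def partial_correlations(X):
    """Partial correlation matrix from the (pseudo-)inverse sample covariance."""
    precision = np.linalg.pinv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def cluster_bootstrap_ggm(X, cluster_ids, n_boot=200, alpha=0.05, seed=0):
    """Illustrative cluster bootstrap (assumed procedure, not the paper's code).

    Resamples whole clusters with replacement, re-estimates the partial
    correlations on each replicate, and keeps an edge when the percentile
    interval for that partial correlation excludes zero.
    """
    rng = np.random.default_rng(seed)
    cluster_ids = np.asarray(cluster_ids)
    clusters = np.unique(cluster_ids)
    rows_by_cluster = [np.flatnonzero(cluster_ids == c) for c in clusters]
    p = X.shape[1]
    boot = np.empty((n_boot, p, p))
    for b in range(n_boot):
        # Draw as many clusters as in the data, with replacement.
        picked = rng.integers(0, len(clusters), size=len(clusters))
        rows = np.concatenate([rows_by_cluster[i] for i in picked])
        boot[b] = partial_correlations(X[rows])
    lower = np.quantile(boot, alpha / 2, axis=0)
    upper = np.quantile(boot, 1 - alpha / 2, axis=0)
    edges = (lower > 0) | (upper < 0)
    np.fill_diagonal(edges, False)  # no self-loops
    return edges, lower, upper
```

Calling `cluster_bootstrap_ggm(X, cluster_ids)` with an observations-by-variables array `X` and one cluster label per row returns a boolean adjacency matrix together with the interval bounds. Resampling rows individually instead of by cluster would treat correlated relatives as independent, which is the situation the abstract warns can inflate the Type I error.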