AUTHOR=Nguchu Benedictor Alexander, Han Yifei, Wang Yanming, Shaw Peter

TITLE=Identifying network state-based Parkinson’s disease subtypes using clustering and support vector machine models

JOURNAL=Frontiers in Psychiatry

VOLUME=Volume 16 - 2025

YEAR=2025

URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2025.1453852

DOI=10.3389/fpsyt.2025.1453852

ISSN=1664-0640

ABSTRACT=Introduction: Parkinson’s disease (PD) heterogeneity poses challenges to current efforts to identify the best therapeutic targets. Methods: Here, we employ K-means and hierarchical clustering algorithms on data from the Parkinson’s Progression Markers Initiative (PPMI) to identify network-specific patterns that describe PD subtypes using the optimal number of brain features. The features were specifically the gray matter volume and dopaminergic features of the neostriatum, i.e., the caudate, putamen, and anterior putamen. We use machine learning (ML) algorithms, including Random Forest, Logistic Regression, and Support Vector Machine (SVM), to evaluate the diagnostic power of the brain features and network patterns in differentiating the PD subtypes and distinguishing PD from healthy controls (HC). Finally, we assess whether PD subtypes described through network-specific patterns are dependent on the APOE genotype. Results: Using data from 2396 subjects, we show that PD (n=2037) is highly associated with APOE ϵ2/ϵ4. Our findings reveal a significant dopamine transporter (DAT) deficit in the left and right structures of the caudate, putamen, and anterior putamen in subjects with PD compared to subjects with scans without evidence of dopaminergic deficit (SWEDD, n=137) or HC (n=222), and that APOE ϵ2/ϵ4 may accelerate DAT deficits and brain alterations in both PD and SWEDD. Furthermore, clinical symptoms of PD in SWEDD subjects, which are hardly validated by DAT scan data, can be explained by variations in APOE genotypes and other brain features beyond DAT. We show the existence of three network states across the whole dataset, with the first network state describing the HC subjects and the remaining two describing the two PD subtypes: one typified by a mildly sparsely connected network (patterns) and the other characterized by more intensified network sparsity. We also show that the two subtypes of PD are characterized by distinctly different levels of total gray matter volume and DAT deficit. ML models show that features extracted from brain structure and network patterns can serve as reliable biomarkers for PD and its subtypes, with the highest performance (100% AUC, 99.3% accuracy, 0.993 F1) demonstrated by the fine-tuned SVM model. Conclusion: Our findings suggest that, while PD is generally associated with a larger DAT deficit in specific brain structures of the neostriatum, it exhibits intrinsic heterogeneity across individuals, which may stem from genetic factors. Such heterogeneity can be characterized by ML models and optimally mapped into network states, providing new insights to consider when developing personalized drugs.
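
The clustering step named in the Methods can be illustrated with a minimal K-means sketch. This is not the authors' pipeline: the two features, the three synthetic groups, the deterministic farthest-point initialization, and all function names below are assumptions chosen only to show how K-means partitions subjects into three groups, in the spirit of the three network states reported in the Results. No PPMI data is used.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption, not the paper's method):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the labels stop changing or max_iter is reached."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each subject goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Synthetic, well-separated groups in two hypothetical standardized
# features (stand-ins for, e.g., a DAT measure and gray matter volume).
rng = random.Random(0)
group_centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [
    (cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
    for cx, cy in group_centers
    for _ in range(30)
]

labels, centroids = kmeans(points, k=3)
print("cluster sizes:", [labels.count(j) for j in range(3)])
```

In practice the number of clusters would be chosen with a validity criterion and the resulting labels passed to a classifier such as an SVM, as the abstract describes; both steps are omitted here for brevity.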