ORIGINAL RESEARCH article

Front. Psychiatry

Sec. Mood Disorders

Machine Learning Framework for Depression Subtype Grouping: Integrating High-Resolution Imaging and Clinical Symptom Analysis via Correlation and Clustering

Provisionally accepted
  • 1Icahn School of Medicine at Mount Sinai, New York, United States
  • 2Icahn School of Medicine at Mount Sinai Friedman Brain Institute, New York, United States
  • 3University of Oxford, Oxford, United Kingdom
  • 4University of Nebraska Omaha, Omaha, United States
  • 5University of Missouri, Columbia, United States

The final, formatted version of the article will be published soon.

Introduction: Major depressive disorder (MDD) affects approximately one in six individuals over their lifetime, and many patients experience treatment-resistant depression, defined as an inadequate response to at least one antidepressant treatment. Current classification strategies for depression rely primarily on clinical assessments of symptom severity, which are prone to reader bias and test-retest variability. Moreover, these symptom-based subtypes have shown limited utility in predicting treatment response. This study introduces a data-driven, unbiased classification framework that integrates clinical features with features derived from high-resolution magnetic resonance imaging (MRI). Using canonical correlation analysis (CCA) and hierarchical clustering, the approach identifies distinct subtypes of MDD, offering a more objective and potentially predictive alternative to traditional methods.

Materials & Methods: Sixty-four participants with MDD who were experiencing a major depressive episode and were not undergoing treatment completed a battery of 11 clinical symptom severity assessments and were scanned with 7T T1-weighted MRI (TE/TR = 3.62/6000 ms; 320 × 240 × 240 array size; 224 × 168 mm² field of view (FOV); 0.7 mm isotropic voxels). The images were automatically segmented with the FreeSurfer 6.0 package, and the 87 resulting imaging features, together with the 11 clinical measures, were processed through CCA to derive highly correlated clinical-imaging phenotypes. The optimal number of clusters for this dataset was determined using the silhouette method and other methods. Participants with MDD were plotted on axes consisting of the highly correlated clinical-imaging phenotypes derived from CCA and subsequently grouped through hierarchical clustering.

Results: CCA identified three highly correlated (r > 0.9) clinical-imaging variable pairs. The first, an anhedonia-related phenotype, showed high loadings from anhedonia severity and brainstem features. The second phenotype was associated with childhood trauma and anhedonia, with the right frontal pole as the primary imaging feature. The third phenotype linked general distress and perceived stress with the right superior temporal lobe. Hierarchical clustering along these canonical axes revealed two distinct clusters: one characterized by high childhood trauma scores and the other showing scores comparable to those of healthy controls.

Conclusion: Taken together, this study presents a novel machine learning (ML) framework for classifying depression using CCA and clustering.
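The pipeline summarized in the abstract (CCA on paired imaging and clinical features, a silhouette-based choice of cluster count, then hierarchical clustering along the canonical axes) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything here is an assumption made for illustration: the data are synthetic, the feature counts (10 imaging, 5 clinical) are smaller than the study's 87 and 11 so that the toy problem stays well-conditioned, the linkage is average linkage (the abstract does not name one), each participant's coordinate on a canonical axis is taken as the mean of the paired imaging and clinical variates, and the helper names `cca`, `agglomerative`, and `silhouette` are hypothetical.

```python
import numpy as np

def cca(X, Y, n_components=3):
    """Classical CCA via QR + SVD. Returns paired canonical variates and
    their correlations (assumes centered X and Y have full column rank)."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    return Qx @ U[:, :n_components], Qy @ Vt.T[:, :n_components], s[:n_components]

def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering (linkage is an assumption)."""
    D = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a is unaffected
    labels = np.empty(len(points), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    D = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    scores = []
    for i, k in enumerate(labels):
        same = labels == k
        if same.sum() == 1:
            scores.append(0.0)
            continue
        a = D[i, same].sum() / (same.sum() - 1)   # mean distance within own cluster
        b = min(D[i, labels == j].mean() for j in set(labels.tolist()) if j != k)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Synthetic stand-in data: 64 participants with a hypothetical two-group structure.
rng = np.random.default_rng(0)
n = 64
group = np.repeat([0, 1], n // 2)
latent = rng.normal(size=(n, 3)) + 3.0 * group[:, None]
imaging = latent @ rng.normal(size=(3, 10)) + 0.5 * rng.normal(size=(n, 10))
clinical = latent @ rng.normal(size=(3, 5)) + 0.5 * rng.normal(size=(n, 5))

# Step 1: derive three correlated clinical-imaging variate pairs.
img_scores, clin_scores, corr = cca(imaging, clinical)

# Step 2: place participants on the canonical axes (mean of each pair; an assumption).
axes = (img_scores + clin_scores) / 2

# Step 3: choose the cluster count by silhouette, then cluster hierarchically.
silhouettes = {k: silhouette(axes, agglomerative(axes, k)) for k in range(2, 6)}
best_k = max(silhouettes, key=silhouettes.get)
labels = agglomerative(axes, best_k)

print("canonical correlations:", np.round(corr, 3))
print("silhouette by k:", {k: round(v, 3) for k, v in silhouettes.items()})
print("chosen k:", best_k, "cluster sizes:", np.bincount(labels))
```

In practice, library routines such as scikit-learn's `CCA` and SciPy's hierarchical clustering functions would likely replace the hand-written helpers. Note also that when the number of features approaches or exceeds the number of participants, plain CCA can return near-perfect correlations by construction, which is why the sketch keeps the feature counts small.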

Keywords: Biological subtypes, Canonical correlation analysis (CCA), Childhood Trauma Questionnaire (CTQ), FreeSurfer, hierarchical clustering, machine learning based classification, Major depressive disorder (MDD), ultrahigh field (UHF) magnetic resonance imaging (MRI).

Received: 17 Nov 2025; Accepted: 28 Jan 2026.

Copyright: © 2026 Verma, Jacob, Morris, Xing, Lin, Murrough, Balchandani and Delman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Gaurav Verma

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.