ORIGINAL RESEARCH article

Front. Psychiatry, 30 January 2026

Sec. Molecular Psychiatry

Volume 17 - 2026 | https://doi.org/10.3389/fpsyt.2026.1712225

Integrated transcriptomic and machine learning analysis reveals novel diagnostic biomarkers for adolescent major depressive disorder

  • 1. Psychiatric Department, First Affiliated Hospital of Kunming Medical University, Kunming, China

  • 2. Mental Health Institute of Yunnan, First Affiliated Hospital of Kunming Medical University, Kunming, China

  • 3. Yunnan Clinical Research Center for Mental Health, Kunming, China

  • 4. Yunnan Clinical Center for Mental Health, Kunming, China

  • 5. Department of Clinical Laboratory, First Affiliated Hospital of Kunming Medical University & Yunnan Province Clinical Research Center for Laboratory Medicine, Kunming, China

  • 6. Department of Neurosurgery, Second Affiliated Hospital of Kunming Medical University, Kunming, China

Abstract

Introduction:

The lack of objective biomarkers and mechanistic understanding of adolescent Major Depressive Disorder (MDD) impedes early diagnosis and targeted intervention.

Methods:

To elucidate peripheral molecular biomarkers for adolescent MDD, we performed RNA sequencing on peripheral blood mononuclear cells (PBMCs) from 15 adolescents with MDD and 15 age- and sex-matched healthy controls. Differential expression analysis and protein-protein interaction (PPI) network construction were utilized to identify key regulatory genes. The expression of core targets was validated using RT-qPCR and ELISA. To establish a robust diagnostic model, an integrated feature selection strategy combining Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine-Recursive Feature Elimination (SVM-RFE), and Random Forest algorithms was applied to screen candidate biomarkers.

Results:

Transcriptomic profiling identified 367 differentially expressed genes characterized by a dual signature of innate immune activation and compensatory hypoxic responses. Eight core hub genes were identified and experimentally validated, revealing a dichotomous expression pattern: upregulation of erythroid-related and inflammatory factors (SLC4A1, HBB, GYPA, IL6) and downregulation of neurotrophic and remodeling factors (IGF1, CSF2, MMP9, CXCR1). Notably, lower expression levels of MMP9 and CXCR1 were significantly correlated with higher Hamilton Depression Rating Scale (HAMD) scores, indicating greater symptom severity. The multi-algorithm machine learning approach identified a consensus three-gene diagnostic panel comprising SLC4A1, IGF1, and MMP9, which achieved a high classification accuracy with an Area Under the Curve (AUC) of 0.867.

Conclusion:

This study delineates a systemic molecular landscape of adolescent MDD defined by the coexistence of hypoxic compensation and neurotrophic/remodeling failure. The identified three-gene biosignature (SLC4A1, IGF1, MMP9) offers a promising, objective tool for the early diagnosis of adolescent depression, highlighting the immune-metabolic interface as a critical avenue for future precision medicine.

1 Introduction

MDD is a widespread psychiatric condition characterized by persistent depressed mood, anhedonia, and cognitive impairment, substantially compromising social functioning and quality of life (1, 2). As reported by the World Health Organization (WHO), more than 300 million people are affected by MDD globally, with increasing incidence and recurrence rates exacerbating the burden on healthcare and economic systems worldwide (3, 4).

Despite extensive investigation, the precise molecular underpinnings of MDD remain elusive, impeding early diagnosis and effective management. Emerging evidence implicates dysregulation of neurotransmitter systems (5, 6), inflammatory activation (7–10), and impaired neuroplasticity (11, 12) as pivotal contributors to its pathogenesis. However, these mechanistic insights, while foundational, have not yet translated into reliable biomarkers, underscoring the need for more comprehensive, system-level approaches.

Advances in transcriptomic profiling via RNA sequencing (RNA-seq) (13) have enabled systematic exploration of the molecular architecture of psychiatric diseases. RNA-seq facilitates genome-wide identification of differentially expressed genes (DEGs) in pathological states, offering insights into the involved biological pathways and regulatory networks. This methodology is particularly advantageous for detecting peripheral biomarkers, as blood-derived transcripts can reflect systemic and central nervous system (CNS) alterations in a minimally invasive manner.

Prior transcriptomic studies conducted primarily in adults, mostly using PBMCs, have yielded important insights, revealing immune dysregulation and peripheral inflammation, including elevated pro-inflammatory mediators such as IL-6 and MMP9 (14–16). However, the applicability of these adult-derived signatures to adolescents is highly uncertain. Adolescent MDD presents distinct etiological challenges, arising during a critical neurodevelopmental window shaped by dynamic hormonal changes, ongoing brain maturation, and unique psychosocial stressors. This developmental disparity is critical; molecules integral to growth and immune regulation, such as insulin-like growth factor-1 (IGF1) and granulocyte-macrophage colony-stimulating factor (CSF2), may exert unique roles in adolescent pathophysiology (17). Consequently, the scarcity of reliable, age-specific biomarkers for adolescent MDD hinders early identification and intervention, often resulting in chronicity and adverse long-term outcomes.

To address this critical gap, we conducted blood-based RNA sequencing in adolescents with MDD to characterize systemic molecular alterations. Specifically, we aimed to: 1) identify adolescent-specific differentially expressed genes (DEGs); 2) elucidate their functional implications via pathway enrichment analyses; and 3) determine disease associations to reveal broader systemic relevance. By delineating the unique molecular signatures of adolescent MDD, this work seeks to provide a more integrative understanding of the disorder, identify potential biomarkers for early diagnosis, and reveal novel therapeutic targets tailored to this vulnerable population.

2 Methods

2.1 Participants and procedure

This study recruited 15 adolescents (aged 12–17) with DSM-5–defined MDD and HAMD-17 scores ≥17 from the Department of Psychiatry, and 15 age- and sex-matched healthy controls from the Pediatric Health Center of the First Affiliated Hospital of Kunming Medical University between June and December 2024. All diagnoses were confirmed by attending psychiatrists. Written informed consent was obtained from participants and their legal guardians. Exclusion criteria included organic comorbidities, inflammatory or autoimmune diseases, schizophrenia spectrum disorders, bipolar disorder, and substance use disorders; for healthy controls, additional exclusions were any personal or family history of psychiatric illness. Clinical assessments were conducted by trained professionals using standardized procedures. All procedures were approved by the institutional review board of the First Affiliated Hospital of Kunming Medical University and conducted in accordance with the 2013 Declaration of Helsinki. Clinical and demographic characteristics are listed in Table 1.

Table 1

Characteristic | MDD (n = 15) | HC (n = 15) | t/χ²/z | p-value
Gender (male/female) | 8/7 | 7/8 | χ² = 0 | 1.000
Age (years), mean ± SD | 14.5 ± 1.7 | 14.7 ± 1.7 | t = 0.43 | 0.670
HAMD-17 score, median (IQR) | 27 (23.5–30.5) | 3 (1.5–4.5) | z = 0 | < 0.001

Demographic and clinical profiles of the MDD and HC groups.

HC, healthy control; HAMD, Hamilton Depression Rating Scale; MDD, major depressive disorder.

2.2 PBMCs isolation

PBMCs were isolated from 2 mL of freshly collected whole blood diluted 1:1 with phosphate-buffered saline. The diluted sample was gently layered onto 3 mL of Ficoll–Paque PLUS and centrifuged at 3000 g for 20 min at 18–20°C. Following density gradient separation, the mononuclear cell layer was carefully aspirated and washed. Cells were then subjected to a low-speed spin (60–100 g, 10 min) to remove residual debris. After discarding the supernatant, cell pellets were lysed in TRIzol and stored at −80°C for subsequent analyses.

2.3 RNA extraction, library preparation and sequencing

Total RNA was isolated from PBMCs using TRIzol reagent (MJZol total RNA extraction kit). RNA quality and integrity were evaluated by measuring the A260/A280 ratio with a NanoDrop ND-2000 (Thermo Scientific, USA) and determining RNA integrity numbers using an Agilent 5300 Bioanalyzer (Agilent Technologies, USA). Only samples that passed predefined quality control thresholds were advanced to library construction. Paired-end libraries were generated using the Illumina Stranded mRNA Prep, Ligation kit (Illumina, USA) following the manufacturer’s instructions. Briefly, 1 μg of total RNA was enriched for mRNA using oligo(dT) magnetic beads and subsequently fragmented in the first-strand synthesis buffer. First-strand cDNA was synthesized using random primers and reverse transcriptase, followed by second-strand synthesis with DNA polymerase I, RNase H, and dNTPs. The resulting double-stranded cDNA was end-repaired, adaptor-ligated, and PCR-amplified. Purified libraries were assessed for quality on an Agilent 5300 Bioanalyzer before sequencing on an Illumina NovaSeq platform using the NovaSeq Reagent Kit.

2.4 Data preprocessing

Raw image files were processed with Illumina BCL Convert (version 3.9.3) to generate FASTQ-format reads. Quality control included adapter trimming and removal of low-quality sequences, yielding high-quality clean reads. Clean reads were aligned to the reference genome using HISAT2 (version 2.2.1). Gene-level quantification was performed with RSEM (version 1.3.3), and transcript abundance was expressed as fragments per kilobase of transcript per million mapped reads (FPKM) to enable standardized comparison across samples.
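As a rough illustration of the FPKM normalization mentioned above, the following sketch applies the textbook formula to a single gene. The function name and the example numbers are hypothetical, and RSEM itself works with effective transcript lengths, so this should be read as an approximation of the idea rather than the pipeline's exact computation.

```python
def fpkm(fragments: float, length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    kilobases = length_bp / 1_000                      # transcript length in kb
    millions = total_mapped_fragments / 1_000_000      # library size in millions
    return fragments / kilobases / millions


# Hypothetical gene: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 20_000_000)
```

Dividing by both length and library size is what makes values comparable across genes of different sizes and samples of different sequencing depth.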

2.5 Bioinformatics analysis and screening strategy

Differential expression analysis was conducted with the edgeR R package (version 4.4.2) in a discovery framework. Genes meeting nominal thresholds (p < 0.05 and |log2FC| > 1) were carried forward to downstream network and validation analyses. This criterion was designed to capture high-magnitude regulators that might be penalized by FDR correction due to inter-individual heterogeneity. Expression patterns were visualized using the edgeR (version 4.4.2) and pheatmap (version 1.0.13) R packages. Functional enrichment analysis of GO terms was conducted via Metascape. The PPI network was mapped using the STRING database and visualized in Cytoscape. Key modules and hub genes were identified using the MCODE and CytoNCA plugins (Mediator Number Centered algorithm), respectively.
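The nominal screening rule can be sketched as a simple filter over a results table. This is only an illustration of the thresholding step, not the edgeR analysis itself; the function name, gene labels, and numbers are hypothetical, and the cutoffs are passed as arguments rather than hard-coded.

```python
from typing import Dict, List, Tuple


def select_degs(
    results: Dict[str, Tuple[float, float]],
    p_cutoff: float,
    lfc_cutoff: float,
) -> Tuple[List[str], List[str]]:
    """Split genes passing both cutoffs into up- and downregulated lists.

    `results` maps gene -> (log2 fold change, nominal p-value).
    """
    up: List[str] = []
    down: List[str] = []
    for gene, (log2fc, p_value) in results.items():
        if p_value < p_cutoff and abs(log2fc) > lfc_cutoff:
            (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Hypothetical results table; labels and values are made up for illustration.
toy_results = {
    "geneA": (2.4, 0.003),   # passes both cutoffs, upregulated
    "geneB": (-2.6, 0.020),  # passes both cutoffs, downregulated
    "geneC": (0.6, 0.001),   # significant but small effect
    "geneD": (3.1, 0.200),   # large effect but not nominally significant
}
up_genes, down_genes = select_degs(toy_results, p_cutoff=0.05, lfc_cutoff=1.0)
```

Because the rule uses nominal rather than FDR-adjusted p-values, the resulting list is best treated as a discovery set for follow-up validation.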

2.6 Quantitative real-time polymerase chain reaction

To ensure the technical reliability of the transcriptomic profiles, qPCR validation was performed in the same discovery cohort (15 adolescents with MDD and 15 matched healthy controls). Total RNA was extracted from PBMCs using TRIzol reagent, and the expression of selected DEGs was quantified by qPCR. Relative mRNA levels were calculated using the 2^(−ΔΔCt) method. The validation panel targeted the following genes: solute carrier family 4 member 1 (SLC4A1), hemoglobin subunit beta (HBB), insulin-like growth factor 1 (IGF1), colony stimulating factor 2 (CSF2), matrix metallopeptidase 9 (MMP9), C-X-C motif chemokine receptor 1 (CXCR1), interleukin 6 (IL6), and glycophorin A (GYPA). Primers were designed using NCBI Primer-BLAST. The primers for the eight core genes are listed in Supplementary Table S1.
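For readers unfamiliar with the ΔΔCt calculation, a minimal sketch follows. It assumes a single reference (housekeeping) gene and roughly 100% amplification efficiency; the reference gene is not named in this section, and the Ct values below are hypothetical.

```python
def relative_expression(
    ct_target_sample: float,
    ct_ref_sample: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Fold change of a target gene in a sample relative to a control (2^-ddCt)."""
    delta_ct_sample = ct_target_sample - ct_ref_sample      # normalize to reference gene
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control     # compare sample to control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the sample
# after normalization, which corresponds to a four-fold increase.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
```

A value above 1 indicates higher expression in the sample than in the control, and a value below 1 indicates lower expression.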

2.7 Enzyme-linked immunosorbent assays

Serum levels of the proteins encoded by the validated hub genes (SLC4A1, HBB, IGF1, CSF2, MMP9, CXCR1, IL6, and GYPA) were quantified using commercial enzyme-linked immunosorbent assay (ELISA) kits. All assays were performed strictly according to the manufacturers’ protocols. Detailed information regarding the specific kits, including manufacturers and catalog numbers, is provided in Supplementary Table S2.

2.8 Screening and identification of diagnostic biomarkers using machine learning

To identify robust diagnostic biomarkers for MDD and mitigate the risk of overfitting due to the limited sample size (N = 30), we employed an integrated feature selection strategy combining three distinct machine learning algorithms: LASSO, SVM-RFE, and Random Forest (RF). For LASSO regression, we implemented a Stratified Stability Selection approach using the glmnet R package (version 4.1.10). Unlike standard cross-validation, which can be unstable in small datasets, this method involved 100 bootstrap iterations. In each iteration, stratified sampling was performed to maintain a balanced ratio of case (MDD) to control samples (15 vs. 15). Genes that were selected in more than 60% of the iterations were considered stable features. Concurrently, the SVM-RFE algorithm was applied using the caret R package (version 7.0.1) to rank features based on their contribution to model accuracy. To ensure a rigorous evaluation, we utilized Leave-One-Out Cross-Validation (LOOCV), which is well suited to small sample sizes, to determine the feature subset that yielded the highest classification accuracy. Additionally, the Random Forest algorithm was utilized (randomForest package [version 4.7.1.2], ntree = 2000) to evaluate feature importance based on the Mean Decrease Gini index. Finally, the intersecting genes identified by all three algorithms were determined using a Venn diagram. These overlapping genes were defined as the final hub diagnostic markers. A logistic regression model was constructed using these markers, and its predictive performance was evaluated using Receiver Operating Characteristic (ROC) curve analysis.
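The resampling logic of stratified stability selection, and the final intersection step, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the selector shown is a simple effect-size filter standing in for the LASSO fit (it is not LASSO), and all function names, gene labels, and data are hypothetical.

```python
import random
from statistics import mean, pstdev
from typing import Callable, Dict, List, Set, Tuple

Sample = Dict[str, float]
Selector = Callable[[List[Sample], List[Sample]], Set[str]]


def stratified_bootstrap(cases: List[Sample], controls: List[Sample],
                         rng: random.Random) -> Tuple[List[Sample], List[Sample]]:
    """Resample each group separately so the case:control ratio is preserved."""
    return ([rng.choice(cases) for _ in cases],
            [rng.choice(controls) for _ in controls])


def effect_size_selector(cases: List[Sample], controls: List[Sample],
                         threshold: float = 1.0) -> Set[str]:
    """Stand-in selector (NOT LASSO): keep genes with a large standardized difference."""
    selected = set()
    for gene in cases[0]:
        a = [s[gene] for s in cases]
        b = [s[gene] for s in controls]
        spread = pstdev(a + b)
        if spread > 0 and abs(mean(a) - mean(b)) / spread > threshold:
            selected.add(gene)
    return selected


def stability_selection(cases: List[Sample], controls: List[Sample],
                        selector: Selector, n_iter: int = 100,
                        min_freq: float = 0.6, seed: int = 0):
    """Return genes selected in more than `min_freq` of bootstrap iterations."""
    rng = random.Random(seed)
    counts = {gene: 0 for gene in cases[0]}
    for _ in range(n_iter):
        boot_cases, boot_controls = stratified_bootstrap(cases, controls, rng)
        for gene in selector(boot_cases, boot_controls):
            counts[gene] += 1
    freqs = {gene: count / n_iter for gene, count in counts.items()}
    return {g for g, f in freqs.items() if f > min_freq}, freqs


def consensus(*gene_sets: Set[str]) -> Set[str]:
    """Genes chosen by every method (the centre of the Venn diagram)."""
    return set.intersection(*gene_sets)


# Hypothetical data: "signal" separates the groups, "noise" does not.
toy_cases = [{"signal": 10.0 + 0.1 * i, "noise": float(i % 3)} for i in range(15)]
toy_controls = [{"signal": 0.1 * i, "noise": float(i % 3)} for i in range(15)]
stable_genes, selection_freqs = stability_selection(
    toy_cases, toy_controls, effect_size_selector)
```

In practice the selector would be replaced by a penalized regression fit on each bootstrap sample; the frequency threshold then plays the role of the 60% rule described above.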

2.9 Statistical analysis

Statistical analyses were conducted using the R computing environment (version 4.4.2). Continuous variables are expressed as mean and standard deviation (SD) for normally distributed data or median with interquartile range (IQR) for non-normally distributed data. Data normality was assessed using the Shapiro–Wilk test. For between-group comparisons, the independent two-sample t-test was employed for normally distributed continuous variables, whereas the Mann–Whitney U test was utilized for non-normally distributed data. Given the limited sample size (N = 30) and potential non-normal distribution of gene expression data, partial Spearman’s rank correlation analysis was performed to evaluate the relationship between hub gene expression and HAMD scores. Age and gender were included as covariates to control for potential confounding effects. The ppcor R package (version 1.1) was used for the analysis. All statistical tests were two-sided, and a p-value < 0.05 was considered statistically significant.
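A partial Spearman correlation can be understood as a partial Pearson correlation computed on ranks. The sketch below uses the recursive partial-correlation formula and returns only the coefficient; it is meant to convey the idea and is not a reimplementation of the ppcor package, which also reports test statistics and p-values. All names and data are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates) -> float:
    """Correlation of x and y after removing the covariates, one at a time."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, covariates) -> float:
    return partial_corr(average_ranks(x), average_ranks(y),
                        [average_ranks(c) for c in covariates])


# Hypothetical data: expression falls monotonically as the severity score rises.
toy_score = [3, 5, 19, 24, 28, 33]
toy_expression = [9.1, 8.7, 6.0, 5.2, 4.9, 3.0]
toy_age = [14, 12, 15, 13, 17, 16]
toy_sex = [0, 1, 0, 1, 1, 0]
rho = partial_spearman(toy_score, toy_expression, [toy_age, toy_sex])
```

Working on ranks makes the estimate insensitive to monotone transformations of the expression values, which is helpful when normality is doubtful.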

3 Results

3.1 Global transcriptomic profiling and functional characterization

To systematically characterize the peripheral molecular signatures of adolescent depression, we performed RNA-sequencing on PBMCs from 15 adolescents with MDD (HAMD-17 scores: 19–33) and 15 age- and sex-matched healthy controls (HAMD-17 scores: 0–6). Differential expression analysis identified 367 dysregulated genes (p < 0.05, |Log2FC| > 1), comprising 176 upregulated and 191 downregulated transcripts in the MDD cohort (Figures 1A, B). Functional enrichment analysis uncovered a dual signature of systemic immune dysregulation and metabolic stress. Specifically, biological processes were predominantly enriched in innate immune responses, such as myeloid leukocyte and neutrophil activation. Concurrently, molecular function terms were enriched for hemoglobin complex and oxygen carrier activities. Together, these results point to pathways related to immune activation and oxygen transport, which may reflect systemic metabolic stress in adolescent MDD (Supplementary Figure S1).

Figure 1

Panel A shows a volcano plot with points indicating gene expression changes. Red points (176) are upregulated, and blue points (191) are downregulated. Panel B displays a heatmap with hierarchical clustering, where rows represent genes and columns indicate control and MDD groups. Blue to red gradient shows expression levels.

Differential gene expression in PBMCs from adolescents with MDD and healthy controls. (A) Overview of transcriptomic alterations and relative expression patterns in MDD compared with healthy controls. (B) Identification of 367 differentially expressed genes in PBMCs (p < 0.05, |log2FC| > 1), including upregulated genes (red) and downregulated genes (blue).

3.2 Topological analysis of PPI networks and experimental validation of hub genes

To identify key regulatory nodes characterized by high connectivity and substantial magnitude of dysregulation, we constructed a PPI network based on the network discovery set (367 nominally significant DEGs). We employed a dual-algorithm strategy to ensure robustness: CytoNCA prioritized the top 25 nodes based on subgraph centrality, while MCODE detected highly interconnected functional modules. The intersection of these independent approaches converged on a consensus list of 8 core hub genes (Figure 2), comprising erythroid-related factors (SLC4A1, HBB, GYPA) and immune/growth regulators (IL-6, CSF2, MMP9, CXCR1, IGF1).

Figure 2

PPI network analysis of differentially expressed genes. (A) A global view of the densely connected network. (B, C, D) Significant modules or "hub" gene clusters identified within the network. Panel B highlights a core cluster (red nodes), while Panel C and D show additional functional modules (orange and yellow-orange nodes) representing highly interconnected gene groups.

Topological analysis of PPI networks. (A) Overview of the global PPI network constructed from the differentially expressed genes (N = 367). Red nodes indicate upregulated genes, and orange nodes indicate downregulated genes. (B–D) Identification of key functional modules using the MCODE algorithm. (B) The erythroid-related module (e.g., SLC4A1, HBB, GYPA). (C) The growth factor and cytokine regulatory module (e.g., IL6, IGF1, MMP9). (D) The immune trafficking module (e.g., CXCR1, CSF2). These hub clusters reveal the core molecular machinery underlying the immuno-metabolic dysregulation in MDD.

To validate these in silico findings, we performed orthogonal validation using RT-qPCR and ELISA. Both assays yielded consistent results, revealing a distinct dichotomous expression pattern. In the upregulated cluster, expression levels of SLC4A1, HBB, GYPA, and IL-6 were significantly elevated in the MDD group, suggesting a compensatory enhancement of oxygen-carrying capacity and erythropoiesis. Conversely, in the downregulated cluster, IGF1, CSF2, MMP9, and CXCR1 were significantly reduced, indicating a suppression of specific cell growth and inflammatory trafficking pathways. Collectively, these validation data support the view that the identified core hubs reflect a complex shift involving enhanced erythroid function and altered immune modulation (Figure 3).

Figure 3

Panel A shows violin plots comparing the relative mRNA expression levels of various genes, including CSF2, CXCR1, GYPA, HBB, IGF1, IL-6, MMP9, and SLC4A1, between a control group and a specified condition. Panel B displays similar violin plots for the Protein levels of the same genes, comparing the control group with a condition labeled MDD. Significant differences are indicated by asterisks. Blue represents the control group, and red represents the condition group.

Experimental validation of core hub genes. (A) Validation of mRNA expression levels in PBMCs using RT-qPCR. (B) Validation of protein levels in plasma using ELISA. Data are presented as mean ± SEM. Statistical significance was determined by unpaired t-test (*p < 0.05, **p < 0.01, ***p < 0.001).

3.3 Association of hub gene expression and protein levels with depressive symptom severity

To investigate the clinical relevance of the identified hub genes, we assessed the associations between their expression levels and depressive symptom severity (HAMD scores) using partial Spearman correlation analyses adjusted for age and sex. At the transcriptomic level, significant associations were specific to IL-6 and MMP9 (P < 0.01) (Figure 4A). MMP9 showed a consistent negative correlation at both mRNA (Adj.r = -0.52) and protein levels (Adj.r = -0.72), indicating a convergent transcript–protein association with symptom severity. Intriguingly, IL-6 mRNA was positively correlated with symptom severity (Adj.r = 0.70), contrasting with its protein level. At the proteomic level, the analysis revealed broader consistency; all eight candidate proteins correlated significantly with HAMD scores (P < 0.05) (Figure 4B). HBB (Adj.r = 0.83, P < 0.001) and SLC4A1 (Adj.r = 0.83, P < 0.001) demonstrated robust positive associations, while IL-6 protein (Adj.r = -0.87, P < 0.001) was negatively correlated. The discordance between transcript and protein levels of IL-6 suggests post-transcriptional regulation, which is further discussed below.

Figure 4

Scatter plots showing the relationship between HAMD scores and mRNA or protein levels for various genes, adjusted for age and sex. mRNA plots (top) include CSF2, CXCR1, GYPA, HBB, IGF1, IL6, MMP9, and SLC4A1, with blue trend lines. Protein plots (bottom) show the same genes with yellow trend lines, noting stronger correlations. Red represents MDD and blue HC groups.

Associations between hub gene biomarkers and depression severity at proteomic and transcriptomic levels. (A) Partial Spearman’s rank correlation analysis between peripheral blood mRNA expression levels (RNA-seq) and HAMD scores. The regression line is shown in blue. (B) Partial Spearman’s rank correlation analysis between ELISA of 8 hub genes and HAMD scores. The regression line is shown in gold. Note: The analysis was performed in the entire study cohort (N = 30), including Healthy Controls (n=15) and patients with MDD (n=15). All correlation coefficients (Adj.R) and P-values (Adj.p) were adjusted for age and sex as covariates. The regression lines with 95% confidence intervals (shaded areas) indicate the trends of the associations. Blue dots represent HC subjects, and red dots represent MDD patients.

3.4 Identification and validation of diagnostic biomarkers for MDD

To identify robust diagnostic biomarkers for MDD within the limited sample size (N = 30), we implemented a multi-algorithmic machine learning approach integrating LASSO regression, SVM-RFE, and Random Forest based on the 8 candidate hub genes. First, LASSO stability selection with 100 stratified bootstrap iterations identified 3 stable genes with selection probabilities exceeding 0.6 (Figures 5A, B). Second, the SVM-RFE algorithm with leave-one-out cross-validation (LOOCV) achieved optimal classification accuracy with 7 features (Figure 5C). Third, the Random Forest model ranked the candidates based on the Gini index (Figure 5D). To ensure reliability, we intersected the candidate genes identified by these three independent methods. The Venn diagram revealed that 3 genes (SLC4A1, IGF1, and MMP9) were consistently selected (Figure 5E) and were thus retained as the final diagnostic biomarkers. This combined diagnostic model achieved an AUC of 0.867 (Figure 5F).
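The AUC reported for the combined model has a simple interpretation: it is the probability that a randomly chosen patient receives a higher model score than a randomly chosen control. A minimal sketch of that pairwise computation is shown below; the scores are hypothetical stand-ins for predicted probabilities from a logistic regression model.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case/control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities for three patients and three controls.
toy_auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Here eight of the nine case/control pairs are ordered correctly, so the AUC is 8/9.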

Figure 5

Panel A shows a graph of coefficients against negative log lambda values for various features. Panel B displays a plot of misclassification error versus negative log lambda, highlighting the minimum error. Panel C is a line graph depicting accuracy against feature count. Panel D presents a dot plot of mean decrease in Gini for different features. Panel E is a Venn diagram comparing features selected by LASSO, SVM, and RF methods. Panel F illustrates ROC curves for different models, showing sensitivity versus specificity, with AUC values for each model.

Screening and validation of diagnostic biomarkers for MDD based on machine learning. (A, B) LASSO regression analysis. (A) LASSO coefficient profiles of the candidate genes. (B) Selection of the optimal lambda parameter using LOOCV. (C) SVM-RFE algorithm for feature selection. The plot shows the accuracy changes with the number of features; the red dot indicates the maximum accuracy (0.90) with 7 genes. (D) Random Forest feature importance ranking. Genes are ranked by the Mean Decrease Gini index; the top genes contribute most to the classification. (E) Venn diagram showing the intersection of candidate genes identified by LASSO (n=3), SVM-RFE (n=7), and Random Forest (n=8). Three common genes were identified. (F) ROC curve analysis for the 3 hub genes (SLC4A1, IGF1, and MMP9). The AUC values indicate the diagnostic performance of each gene.

Subsequently, we validated these findings using ELISA data. Notably, SVM-RFE achieved a classification accuracy of 1.000 using IGF1 alone (Supplementary Figure S3C). Given this near-perfect discrimination within this cohort (AUC = 1.000), we conducted a permutation test with 1,000 iterations to rigorously assess the risk of overfitting. As shown in Supplementary Figure S3F, the observed AUC was significantly distinct from the random permutation distribution (P < 0.001).
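A label-permutation test of this kind can be sketched as follows. The sketch shuffles group membership over fixed scores and recomputes the AUC each time; it is an assumption that this mirrors the procedure used here, since the exact implementation is not described in this section. A stricter variant would repeat feature selection inside every permutation. Names and data are hypothetical.

```python
import random
from typing import Sequence, Tuple


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Fraction of case/control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            wins += 1.0 if case > control else 0.5 if case == control else 0.0
    return wins / (len(case_scores) * len(control_scores))


def permutation_test_auc(case_scores: Sequence[float],
                         control_scores: Sequence[float],
                         n_perm: int = 1000,
                         seed: int = 0) -> Tuple[float, float]:
    """Return the observed AUC and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = roc_auc(case_scores, control_scores)
    pooled = list(case_scores) + list(control_scores)
    n_cases = len(case_scores)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # equivalent to shuffling the group labels
        if roc_auc(pooled[:n_cases], pooled[n_cases:]) >= observed:
            at_least_as_large += 1
    # The +1 terms count the observed labelling itself, so p is never exactly 0.
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical, perfectly separated scores for 15 cases and 15 controls.
toy_case_scores = [float(i) for i in range(15, 30)]
toy_control_scores = [float(i) for i in range(15)]
observed_auc, perm_p = permutation_test_auc(toy_case_scores, toy_control_scores)
```

With 1,000 permutations the smallest attainable p-value is 1/1001, which is why results of this kind are typically reported as P < 0.001 rather than as an exact value.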

4 Discussion

The absence of objective biological benchmarks for adolescent MDD remains a critical bottleneck in early identification and precision medicine. In the present study, we performed comprehensive transcriptomic profiling of PBMCs to unravel the peripheral molecular landscape underlying adolescent major depressive disorder (MDD). Our data revealed a distinctive dual signature characterized by innate immune activation and by what may be a compensatory transcriptional response related to oxygen transport or metabolic stress. Through topological network analysis and experimental validation, we identified a consensus set of eight core hub genes. These genes exhibited a dichotomous expression pattern: an upregulation of pro-inflammatory and erythroid-related factors (IL6, SLC4A1, HBB, GYPA) contrasted with a suppression of neurotrophic and remodeling factors (IGF1, CSF2, MMP9, CXCR1). Furthermore, by integrating three independent machine learning algorithms, we constructed a diagnostic model based on SLC4A1, IGF1, and MMP9, which demonstrated high accuracy (AUC = 0.867) in distinguishing adolescent MDD patients from healthy controls.

Functional enrichment analyses revealed that these molecular alterations converge on critical physiological axes, specifically immune activation (18) and inflammatory signaling (19), alongside metabolic and signal transduction perturbations (20–22). This suggests that the functional activation of immune cells, particularly neutrophils, plays a pivotal role in MDD. Recent studies have increasingly highlighted the link between MDD and a chronic low-grade inflammatory state (23–25). The activation of innate immune cells, such as neutrophils (26, 27) and macrophages (28–30), mediates neuroinflammatory responses through the release of inflammatory factors, which in turn affect central nervous system (CNS) function.

A salient and novel finding of our study is the coordinated upregulation of erythroid-related genes (SLC4A1, HBB, GYPA) in the PBMCs of MDD patients. Under physiological conditions, these genes are predominantly expressed in the erythroid lineage; their ectopic or elevated expression in peripheral mononuclear cells suggests a systemic response to metabolic stress or hypoxia. Emerging evidence links depression to mitochondrial dysfunction and oxidative stress (31–33), which may lead to a state of “pseudohypoxia” at the cellular level (34, 35). SLC4A1 (Band 3) functions as a critical regulator of systemic pH and CO2 transport. Its dysregulation in blood transcriptomics often reflects a compensatory response to chronic metabolic acidosis or oxidative stress, serving as a peripheral sensor of hypoxic burden (36, 37). We postulate that the upregulation of hemoglobin complex genes may act as a compensatory mechanism to enhance oxygen-carrying capacity and mitigate cellular hypoxia induced by the bioenergetic deficits (38, 39) often seen in depression. This finding adds a new dimension to the pathophysiology of MDD, suggesting that the disorder involves systemic adaptations to metabolic demand beyond central nervous system signaling. Although PBMC preparations are generally depleted of erythrocytes, we cannot fully exclude subtle shifts in cell composition. However, the consistent upregulation across multiple erythroid-related genes and their association with clinical severity argue against a simple contamination effect.

In contrast to the upregulation of hypoxic and inflammatory markers (IL-6), we observed a significant downregulation of IGF1, MMP9, CSF2, and CXCR1. This “downregulated cluster” points towards a deficit in neurotrophic support and immune-mediated repair. IGF-1 is a potent neurotrophic factor essential for neurogenesis and synaptic plasticity (40); its peripheral reduction aligns with the “neurotrophic hypothesis” of depression, reflecting a deficit that may impair the maturation of stress-regulatory circuits during adolescence. However, peripheral IGF-1 levels are not consistently associated with depression in clinical populations (41–43), emphasizing the need to consider confounding factors such as disease duration, medication use, age, and whether the depression is first-episode or recurrent.

The observed discordance between IL-6 transcript and protein levels suggests the involvement of post-transcriptional regulatory mechanisms. IL-6 expression is tightly regulated at multiple levels, including mRNA stability, translational efficiency, and protein secretion, allowing for rapid and context-dependent immune responses. In peripheral immune cells, IL-6 mRNA can be transiently upregulated without a proportional increase in circulating protein levels, particularly under conditions of chronic or low-grade inflammation. Moreover, circulating IL-6 protein levels are influenced not only by cellular production but also by clearance dynamics and receptor-mediated consumption. Transcriptomic and proteomic measurements represent snapshots of distinct regulatory layers that may not be synchronized in cross-sectional designs. Therefore, the inverse or weak correspondence between IL-6 mRNA and protein observed in this study likely reflects complex regulatory processes rather than technical inconsistency.

Crucially, our partial correlation analysis (adjusted for age and gender) revealed that lower expression of MMP9 was significantly associated with more severe depressive symptoms (higher HAMD scores). Traditionally, elevated MMP9 is viewed as a marker of acute neuroinflammation and BBB disruption (44). However, MMP9 is pleiotropic: beyond its proteolytic activity in inflammation, it plays a constitutive and indispensable role in synaptic plasticity, specifically in the conversion of pro-BDNF to mBDNF and the maintenance of L-LTP (45, 46).

In our chronic cohort, the observed downregulation may represent a “plasticity deficit” state rather than an active inflammatory state. The concurrent low levels of IGF-1 further suggest a failure of neurotrophic support. We propose a working model in which chronic depressive states may evolve from an initial phase dominated by inflammatory activation toward a subsequent plasticity-deficient state. Within this framework, reduced baseline levels of MMP9 and IGF1 may reflect an impaired capacity for synaptic remodeling and neuronal support. Consequently, the observed negative association between MMP9 expression and symptom severity may represent a “burnt-out” adaptive state, characterized by insufficient molecular resources to sustain synaptic plasticity and resilience, ultimately favoring neurotoxic dominance (47). Importantly, this model further suggests that therapeutic strategies aimed at restoring synaptic remodeling capacity may preferentially benefit patients exhibiting lower baseline MMP9 levels.

To translate these molecular signatures into clinical utility, we employed a multi-algorithm feature selection pipeline (LASSO, SVM-RFE, and Random Forest) to identify robust biomarkers, prioritizing stability despite the limited sample size. We identified a consensus diagnostic panel comprising SLC4A1, IGF1, and MMP9. This model achieved an AUC of 0.867, suggesting that combining markers from distinct biological pathways may yield a more robust diagnostic fingerprint than single markers alone. Biologically, this three-gene signature suggests a coherent pathological trajectory: SLC4A1 captures the systemic physiological toll (metabolic/hypoxic stress); IGF1 signals the withdrawal of essential neurotrophic support; and MMP9 reflects the ultimate failure of the machinery required for synaptic adaptation and repair. Together, they describe a system under stress that has lost the capacity to remodel and recover. Although IGF1 demonstrated perfect classification in this dataset, this result should be interpreted conservatively, as small sample sizes are prone to overestimating predictive performance, even under cross-validation. Therefore, the observed accuracy likely represents an upper-bound estimate rather than generalizable performance. The strength of IGF1 in this study lies in its biological consistency across transcriptomic and proteomic layers, rather than its apparent standalone diagnostic performance.
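To make the caution about small samples concrete, the following minimal Python sketch computes an AUC from pairwise comparisons and attaches a percentile bootstrap interval. It is not the analysis performed in this study: the scores are synthetic and hypothetical, the function names are illustrative, and the percentile bootstrap is one assumed way of expressing uncertainty.

```python
import random


def auc(scores_pos, scores_neg):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_auc_ci(scores_pos, scores_neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC, resampling within each group."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        boot_pos = [rng.choice(scores_pos) for _ in scores_pos]
        boot_neg = [rng.choice(scores_neg) for _ in scores_neg]
        stats.append(auc(boot_pos, boot_neg))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Synthetic, hypothetical model scores for 15 cases and 15 controls (NOT study data).
cases = [0.91, 0.84, 0.77, 0.69, 0.88, 0.52, 0.73, 0.95,
         0.61, 0.80, 0.45, 0.86, 0.66, 0.58, 0.79]
controls = [0.32, 0.41, 0.55, 0.28, 0.63, 0.37, 0.49, 0.71,
            0.22, 0.44, 0.35, 0.57, 0.30, 0.68, 0.26]

point = auc(cases, controls)
low, high = bootstrap_auc_ci(cases, controls)
print(f"AUC = {point:.3f}, 95% bootstrap interval = ({low:.3f}, {high:.3f})")
```

With only 15 subjects per group, the interval around a point estimate is typically wide, which is why an apparently high or even perfect AUC in a cohort of this size is better read as an optimistic estimate than as expected performance in new patients.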

Several limitations of this study should be acknowledged. First, the sample size was relatively modest (N = 30). Although we employed strict resampling procedures, including Stratified Stability Selection and LOOCV, to minimize overfitting, validation in larger, multi-center cohorts is necessary to confirm the generalizability of the three-gene panel. Second, the cross-sectional design prevents causal inference; it remains to be determined whether the erythroid compensation is a driver of MDD or a downstream consequence of chronic physiological stress. Third, while PBMCs serve as a valuable window into systemic physiology, they may not fully recapitulate the region-specific synaptic alterations within the CNS.

5 Conclusion

In summary, our study delineates a systemic molecular signature of adolescent MDD defined by the coexistence of hypoxic compensation (SLC4A1 high) and neurotrophic/remodeling failure (IGF1/MMP9 low). We identified a robust three-gene candidate biosignature (SLC4A1, IGF1, MMP9) that demonstrates promising discrimination between patients and controls. Furthermore, the specific correlation of downregulated MMP9 and CXCR1 with increased symptom severity provides new insights into the link between immune-plasticity deficits and disease progression. These findings highlight the potential of targeting the immune-metabolic interface for future precision medicine approaches in adolescent depression.

Statements

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Ethics statement

The studies involving humans were approved by the Institutional Review Board of the First Affiliated Hospital of Kunming Medical University. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants’ legal guardians/next of kin. Written informed consent was obtained from the minor(s)’ legal guardian/next of kin for the publication of any potentially identifiable images or data included in this article.

Author contributions

RY: Conceptualization, Formal Analysis, Funding acquisition, Methodology, Software, Writing – original draft, Investigation. LJ: Conceptualization, Formal Analysis, Methodology, Writing – original draft. JP: Investigation, Writing – original draft. KL: Investigation, Writing – original draft. YX: Investigation, Writing – original draft. YH: Investigation, Writing – original draft. ZH: Investigation, Writing – original draft. QQ: Investigation, Writing – original draft. JL: Conceptualization, Funding acquisition, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This study was supported by the National Natural Science Foundation of China (72264019, 82360670), the Union Project of Yunnan Science and Technology Bureau and Kunming Medical University (202101AY070001-051, 202401AY070001-001), Yunnan Fundamental Research Projects (202201AS070098), the First-Class Discipline Team of Kunming Medical University (2024XKTDTS16, 2024XKTDYS02), the Yunnan Clinical Center for Mental Health Scientific Research Project (YWLCYXZX20221005), and the Academician Song Weihong Workstation in Yunnan Province (202305AF150180).

Acknowledgments

The authors would like to thank all the adolescents who participated in this study and wish them a speedy recovery.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2026.1712225/full#supplementary-material

References

  • 1

    Monroe SM Harkness KL . Major depression and its recurrences: life course matters. Annu Rev Clin Psychol. (2022) 18:329–57. doi: 10.1146/annurev-clinpsy-072220-021440

  • 2

    Hauenstein EJ . Depression in adolescence. J Obstetric Gynecol Neonatal nursing: JOGNN. (2003) 32:239–48. doi: 10.1177/0884217503252133

  • 3

    Woody CA Ferrari AJ Siskind DJ Whiteford HA Harris MG . A systematic review and meta-regression of the prevalence and incidence of perinatal depression. J Affect Disord. (2017) 219:86–92. doi: 10.1016/j.jad.2017.05.003

  • 4

    Evans-Lacko S Aguilar-Gaxiola S Al-Hamzawi A Alonso J Benjet C Bruffaerts R et al . Socio-economic variations in the mental health treatment gap for people with anxiety, mood, and substance use disorders: results from the WHO World Mental Health (WMH) surveys. Psychol Med. (2018) 48:1560–71. doi: 10.1017/S0033291717003336

  • 5

    Guo N Wang X Xu M Bai J Yu H Zhang L . PI3K/AKT signaling pathway: Molecular mechanisms and therapeutic potential in depression. Pharmacol Res. (2024) 206:107300. doi: 10.1016/j.phrs.2024.107300

  • 6

    Luscher B Maguire JL Rudolph U Sibille E . GABAA receptors as targets for treating affective and cognitive symptoms of depression. Trends Pharmacol Sci. (2023) 44:586–600. doi: 10.1016/j.tips.2023.06.009

  • 7

    Nerurkar L Siebert S McInnes IB Cavanagh J . Rheumatoid arthritis and depression: an inflammatory perspective. Lancet Psychiatry. (2019) 6:164–73. doi: 10.1016/S2215-0366(18)30255-4

  • 8

    Beurel E Toups M Nemeroff CB . The bidirectional relationship of depression and inflammation: double trouble. Neuron. (2020) 107:234–56. doi: 10.1016/j.neuron.2020.06.002

  • 9

    Wu A Zhang J . Neuroinflammation, memory, and depression: new approaches to hippocampal neurogenesis. J Neuroinflamm. (2023) 20:283. doi: 10.1186/s12974-023-02964-x

  • 10

    Kofod J Elfving B Nielsen EH Mors O Köhler-Forsberg O . Depression and inflammation: Correlation between changes in inflammatory markers with antidepressant response and long-term prognosis. Eur Neuropsychopharmacol. (2022) 54:116–25. doi: 10.1016/j.euroneuro.2021.09.006

  • 11

    Tartt AN Mariani MB Hen R Mann JJ Boldrini M . Dysregulation of adult hippocampal neuroplasticity in major depression: pathogenesis and therapeutic implications. Mol Psychiatry. (2022) 27:2689–99. doi: 10.1038/s41380-022-01520-y

  • 12

    Price RB Duman R . Neuroplasticity in cognitive and psychological mechanisms of depression: an integrative model. Mol Psychiatry. (2020) 25:530–43. doi: 10.1038/s41380-019-0615-x

  • 13

    Joo MK Lee JW Woo JH Kim HJ Kim DH Choi JH . Regulation of colonic neuropeptide Y expression by the gut microbiome in patients with ulcerative colitis and its association with anxiety- and depression-like behavior in mice. Gut Microbes. (2024) 16:2319844. doi: 10.1080/19490976.2024.2319844

  • 14

    Sun L Ren C Leng H Wang X Wang D Wang T et al . Peripheral blood mononuclear cell biomarkers for major depressive disorder: A transcriptomic approach. Depress Anxiety. (2024) 2024:1089236. doi: 10.1155/2024/1089236

  • 15

    Martinelli S Anderzhanova EA Bajaj T Wiechmann S Dethloff F Weckmann K et al . Stress-primed secretory autophagy promotes extracellular BDNF maturation by enhancing MMP9 secretion. Nat Commun. (2021) 12:4643. doi: 10.1038/s41467-021-24810-5

  • 16

    Knight JM Costanzo ES Singh S Yin Z Szabo A Pawar DS et al . The IL-6 antagonist tocilizumab is associated with worse depression and related symptoms in the medically ill. Trans Psychiatry. (2021) 11:58. doi: 10.1038/s41398-020-01164-y

  • 17

    Nguyen HD . Resveratrol, endocrine disrupting chemicals, neurodegenerative diseases and depression: genes, transcription factors, microRNAs, and sponges involved. Neurochem Res. (2023) 48:604–24. doi: 10.1007/s11064-022-03787-7

  • 18

    Bhatt S Nagappa AN Patil CR . Role of oxidative stress in depression. Drug Discov Today. (2020) 25:1270–6. doi: 10.1016/j.drudis.2020.05.001

  • 19

    Yang C Tiemessen KM Bosker FJ Wardenaar KJ Lie J Schoevers RA . Interleukin, tumor necrosis factor-α and C-reactive protein profiles in melancholic and non-melancholic depression: A systematic review. J Psychosom Res. (2018) 111:58–68. doi: 10.1016/j.jpsychores.2018.05.008

  • 20

    Cruz-Pereira JS Rea K Nolan YM O’Leary OF Dinan TG Cryan JF . Depression’s unholy trinity: dysregulated stress, immunity, and the microbiome. Annu Rev Psychol. (2020) 71:49–78. doi: 10.1146/annurev-psych-122216-011613

  • 21

    Colasanto M Madigan S Korczak DJ . Depression and inflammation among children and adolescents: A meta-analysis. J Affect Disord. (2020) 277:940–8. doi: 10.1016/j.jad.2020.09.025

  • 22

    Yang W Yin H Wang Y Wang Y Li X Wang C et al . New insights into effects of Kaixin Powder on depression via lipid metabolism related adiponectin signaling pathway. Chin Herbal Medicines. (2023) 15:240–50. doi: 10.1016/j.chmed.2022.06.012

  • 23

    Duan L Song L Qiu C Li J . Effect of the sEH inhibitor AUDA on arachidonic acid metabolism and NF-κB signaling of rats with postpartum depression-like behavior. J Neuroimmunol. (2023) 385:578250. doi: 10.1016/j.jneuroim.2023.578250

  • 24

    Alhaddad A Radwan A Mohamed NA Mehanna ET Mostafa YM El-Sayed NM et al . Rosiglitazone mitigates dexamethasone-induced depression in mice via modulating brain glucose metabolism and AMPK/mTOR signaling pathway. Biomedicines. (2023) 11:860. doi: 10.3390/biomedicines11030860

  • 25

    Wang J Zhou D Dai Z Li X . Association between systemic immune-inflammation index and diabetic depression. Clin Interventions Aging. (2021) 16:97–105. doi: 10.2147/CIA.S285000

  • 26

    Poletti S Mazza MG Benedetti F . Inflammatory mediators in major depression and bipolar disorder. Trans Psychiatry. (2024) 14:247. doi: 10.1038/s41398-024-02921-z

  • 27

    Suneson K Lindahl J Chamli Hårsmar S Söderberg G Lindqvist D . Inflammatory depression-mechanisms and non-pharmacological interventions. Int J Mol Sci. (2021) 22:1640. doi: 10.3390/ijms22041640

  • 28

    Zhou M Liu YWY He YH Zhang JY Guo H Wang H et al . FOXO1 reshapes neutrophils to aggravate acute brain damage and promote late depression after traumatic brain injury. Military Med Res. (2024) 11:20. doi: 10.1186/s40779-024-00523-w

  • 29

    Jiang R Noble S Rosenblatt M Dai W Ye J Liu S et al . The brain structure, inflammatory, and genetic mechanisms mediate the association between physical frailty and depression. Nat Commun. (2024) 15:4411. doi: 10.1038/s41467-024-48827-8

  • 30

    Zhou X Luo F Shi G Chen R Zhou P . Depression and macrophages: A bibliometric and visual analysis from 2000 to 2022. Medicine. (2023) 102:e34174. doi: 10.1097/MD.0000000000034174

  • 31

    Xia X Li K Jiang B Zou W Wang L . Mitochondrial dysfunction in depression: Mechanisms and targeted therapy strategies. Asian J Psychiatr. (2025) 112:104694. doi: 10.1016/j.ajp.2025.104694

  • 32

    Black CN Bot M Scheffer PG Cuijpers P Penninx BW . Is depression associated with increased oxidative stress? A systematic review and meta-analysis. Psychoneuroendocrinology. (2015) 51:164–75. doi: 10.1016/j.psyneuen.2014.09.025

  • 33

    Li W Zhu L Chen Y Zhuo Y Wan S Guo R . Association between mitochondrial DNA levels and depression: a systematic review and meta-analysis. BMC Psychiatry. (2023) 23:866. doi: 10.1186/s12888-023-05358-8

  • 34

    Gomes AP Price NL Ling AJ Moslehi JJ Montgomery MK Rajman L et al . Declining NAD(+) induces a pseudohypoxic state disrupting nuclear-mitochondrial communication during aging. Cell. (2013) 155:1624–38. doi: 10.1016/j.cell.2013.11.037

  • 35

    Song Y Cao H Zuo C Gu Z Huang Y Miao J et al . Mitochondrial dysfunction: A fatal blow in depression. BioMed Pharmacother. (2023) 167:115652. doi: 10.1016/j.biopha.2023.115652

  • 36

    Kalli AC Reithmeier RAF . Organization and dynamics of the red blood cell band 3 anion exchanger SLC4A1: insights from molecular dynamics simulations. Front Physiol. (2022) 13:817945. doi: 10.3389/fphys.2022.817945

  • 37

    Remigante A Morabito R Marino A . Band 3 protein function and oxidative stress in erythrocytes. J Cell Physiol. (2021) 236:6225–34. doi: 10.1002/jcp.30322

  • 38

    Tian Z Li Y Jin F Xu Z Gu Y Guo M et al . Brain-derived exosomal hemoglobin transfer contributes to neuronal mitochondrial homeostasis under hypoxia. Elife. (2025) 13:RP99986. doi: 10.7554/eLife.99986

  • 39

    Haase VH . Regulation of erythropoiesis by hypoxia-inducible factors. Blood Rev. (2013) 27:41–53. doi: 10.1016/j.blre.2012.12.003

  • 40

    Agis-Balboa RC Fischer A . Generating new neurons to circumvent your fears: the role of IGF signaling. Cell Mol Life Sci. (2014) 71:21–42. doi: 10.1007/s00018-013-1316-2

  • 41

    Qiao X Yan J Zang Z Xi L Zhu W Zhang E et al . Association between IGF-1 levels and MDD: a case-control and meta-analysis. Front Psychiatry. (2024) 15:1396938. doi: 10.3389/fpsyt.2024.1396938

  • 42

    Chen M Zhang L Jiang Q . Peripheral IGF-1 in bipolar disorder and major depressive disorder: a systematic review and meta-analysis. Ann Palliat Med. (2020) 9:4044–53. doi: 10.21037/apm-20-1967

  • 43

    Fernández-Pereira C Agís-Balboa RC . The insulin-like growth factor family as a potential peripheral biomarker in psychiatric disorders: A systematic review. Int J Mol Sci. (2025) 26:2561. doi: 10.3390/ijms26062561

  • 44

    Rempe RG Hartz AMS Bauer B . Matrix metalloproteinases in the brain and blood-brain barrier: Versatile breakers and makers. J Cereb Blood Flow Metab. (2016) 36:1481–507. doi: 10.1177/0271678X16655551

  • 45

    Nagy V Bozdagi O Matynia A Balcerzyk M Okulski P Dzwonek J et al . Matrix metalloproteinase-9 is required for hippocampal late-phase long-term potentiation and memory. J Neurosci. (2006) 26:1923–34. doi: 10.1523/JNEUROSCI.4359-05.2006

  • 46

    Mizoguchi H Nakade J Tachibana M Ibi D Someya E Koike H et al . Matrix metalloproteinase-9 contributes to kindled seizure development in pentylenetetrazole-treated mice by converting pro-BDNF to mature BDNF in the hippocampus. J Neurosci. (2011) 31:12963–71. doi: 10.1523/JNEUROSCI.3118-11.2011

  • 47

    Rodríguez-Moreno A Kohl MM Reeve JE Eaton TR Collins HA Anderson HL et al . Presynaptic induction and expression of timing-dependent long-term depression demonstrated by compartment-specific photorelease of a use-dependent NMDA receptor antagonist. J Neurosci. (2011) 31:8564–9. doi: 10.1523/JNEUROSCI.0274-11.2011

Summary

Keywords

adolescent major depressive disorder, transcriptomics, biomarkers, peripheral blood mononuclear cells, machine learning, immune-metabolic dysregulation

Citation

Yang R, Jiang L, Pan J, Lian K, Xie Y, He Y, Huang Z, Qi Q and Lu J (2026) Integrated transcriptomic and machine learning analysis reveals novel diagnostic biomarkers for adolescent major depressive disorder. Front. Psychiatry 17:1712225. doi: 10.3389/fpsyt.2026.1712225

Received

24 September 2025

Revised

26 December 2025

Accepted

02 January 2026

Published

30 January 2026

Volume

17 - 2026

Edited by

Cátia Santa, BioMed X GmbH, Germany

Reviewed by

Abraham Weizman, Tel Aviv University, Israel

Zhong XiaoGang, Affiliated Rehabilitation Hospital of Chongqing Medical University, China

Copyright

*Correspondence: Jin Lu,
