ORIGINAL RESEARCH article

Front. Med.

Sec. Pulmonary Medicine

Volume 12 - 2025 | doi: 10.3389/fmed.2025.1603140

Machine learning-based integration develops an immune-derived signature for diagnosing High-altitude pulmonary hypertension

Provisionally accepted
Dan Yang1,2, Qian Li1, Feng Yang1*, Rui Wang1, Peng Jiang1, Jialin Wu1, Xi Yang1, Yixuan Huang2, Yuqiang Liu1, Shishang Wang1, Junqiang Gou1, Zhangfeng Sun2, Junjie Ma1, Yanhui Qin1, Wu Li1*, Dongfeng Yin1*
  • 1General Hospital of Xinjiang Military Region, Ürümqi, China
  • 2Xinjiang Medical University, Ürümqi, Xinjiang Uyghur Region, China

The final, formatted version of the article will be published soon.

Background: High-altitude pulmonary hypertension (HAPH) is a common disease in high-altitude regions, where implementing gold-standard diagnostic methods remains logistically challenging. Methods: In this retrospective analysis, we used an integrative multi-omics approach combining single-cell RNA sequencing (scRNA-seq, n = 10), bulk RNA sequencing (RNA-seq, n = 126), and proteomic profiling (n = 42) to characterize immune microenvironment remodeling in HAPH. We then established a machine learning-based diagnostic model and validated the HAPH-associated signature by quantitative PCR. Results: scRNA-seq analysis using the ratio of observed to expected cell numbers (Ro/e) and contribution scoring first demonstrated the pivotal role of myeloid lineages in HAPH pathogenesis. Pseudotime trajectory analysis of the myeloid subsets further revealed 2,615 differentially expressed genes (DEGs) associated with HAPH progression. We also identified 144 DEGs from bulk RNA-seq data and 77 from proteomic data between the HAPH and control groups. Multi-omics analysis then yielded 22 candidate biomarkers, which were further refined with ensemble machine learning algorithms. Evaluation of 113 algorithm combinations showed that a six-gene random forest (RF) model (HEMGN, HBG2, MYL9, ANK1, UBE2O, RBPMS2) achieved the best diagnostic performance, with an area under the curve (AUC) of 0.995 in the training cohort (n = 55) and 0.773 in the external validation cohorts (n = 71). Quantitative PCR confirmed significant overexpression of these biomarkers in HAPH compared with controls (P < 0.05). Conclusions: Our findings propose a minimally invasive, blood-derived immune signature for HAPH diagnosis, providing a practical framework for early detection in resource-constrained high-altitude populations.
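
For readers less familiar with the AUC values reported in the abstract, the following minimal sketch shows one standard way to compute an AUC from predicted scores: the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not the authors' code, data, or model outputs.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for a case (e.g. HAPH), 0 for a control.
    scores: model output per sample; higher means "more likely a case".
    Tied pairs contribute 0.5, matching the usual rank-based definition.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): three cases, three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Under this definition, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is why the gap between the training and external validation values in the abstract is informative about how well the model generalizes.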

Keywords: High-altitude pulmonary hypertension, single-cell RNA sequencing, Multi-omics integration, machine learning, Non-invasive diagnosis

Received: 15 Apr 2025; Accepted: 15 Aug 2025.

Copyright: © 2025 Yang, Li, Yang, Wang, Jiang, Wu, Yang, Huang, Liu, Wang, Gou, Sun, Ma, Qin, Li and Yin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Feng Yang, General Hospital of Xinjiang Military Region, Ürümqi, China
Wu Li, General Hospital of Xinjiang Military Region, Ürümqi, China
Dongfeng Yin, General Hospital of Xinjiang Military Region, Ürümqi, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.