
ORIGINAL RESEARCH article

Front. Med.

Sec. Pulmonary Medicine

Volume 12 - 2025 | doi: 10.3389/fmed.2025.1666574

This article is part of the Research Topic: Complex Interplay Between Lung Diseases and Multisystem Disorders: Pathogenesis, Management, and Outcome

Diagnosis model for assessing chronic thromboembolic pulmonary hypertension in high-altitude pulmonary embolism patients: A machine learning approach

Provisionally accepted
  • 1 Tongji University Dongfang Hospital, Shanghai, China
  • 2 Tongren Hospital Shanghai Jiaotong University School of Medicine, Shanghai, China
  • 3 Shigatse People’s Hospital, Shigatse, China
  • 4 Shanghai Tenth People's Hospital, Shanghai, China

The final, formatted version of the article will be published soon.

Background: Patients with pulmonary embolism (PE) at high altitude face an increased risk of developing chronic thromboembolic pulmonary hypertension (CTEPH). This study aimed to establish a diagnostic model for CTEPH in PE patients at high altitude to optimize early screening.

Methods: A retrospective cohort of CTEPH and PE patients was selected according to inclusion and exclusion criteria. Clinical data encompassing biochemical profiles, echocardiography, and CT angiography (CTA) were collected, yielding 103 candidate variables. Features were screened using the Boruta algorithm, and predictive models were then developed with seven machine learning algorithms. The optimal model was identified based on area under the curve (AUC). The optimal model, a Random Forest, was subsequently interpreted through Shapley Additive Explanations (SHAP) to quantify feature contributions.

Results: Among 57 PE patients, 44% met echocardiographic criteria for pulmonary hypertension following PE. The key predictors identified were right atrial diameter, right ventricular diameter, Vessel-Grade (vessel grade of embolization), and Sup-inferior (superior or inferior location of embolization). The Random Forest model had the highest AUC (0.842). An enlarged right heart, embolization of small vessels, and superior pulmonary artery embolism increased the risk of CTEPH, while normal right heart structure and isolated inferior pulmonary embolism reduced it.

Conclusions: The Random Forest model demonstrated potential for detecting CTEPH in PE patients, enabling early and rapid assessment of pulmonary hypertension.
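As an illustration of the model-selection step described in the Methods, the following is a minimal sketch of how candidate models might be ranked by AUC. It assumes that "area under the curve" refers to the receiver operating characteristic (ROC) curve, which the abstract does not state explicitly. The function names (`roc_auc`, `best_model_by_auc`), the model names, and all labels and scores are hypothetical toy values; they are not the study's data, code, or results.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def best_model_by_auc(
    labels: Sequence[int], model_scores: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the name and AUC of the candidate with the highest AUC."""
    aucs = {name: roc_auc(labels, s) for name, s in model_scores.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs[best]


# Hypothetical toy data: 1 = CTEPH, 0 = no CTEPH (not taken from the study).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_scores = {
    "random_forest": [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.1, 0.7],
    "logistic_regression": [0.6, 0.4, 0.5, 0.5, 0.3, 0.7, 0.2, 0.6],
}

best_name, best_auc = best_model_by_auc(labels, model_scores)
print(best_name, best_auc)
```

In practice, scores would come from held-out or cross-validated predictions rather than from the training data, and with a cohort of this size the uncertainty around each AUC estimate would likely matter as much as the ranking itself.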

Keywords: Predictive learning models, machine learning, Pulmonary Embolism, Pulmonary Disease, Chronic Obstructive

Received: 15 Jul 2025; Accepted: 22 Sep 2025.

Copyright: © 2025 Fan, Ma, Zhang, Yang, Zhakeer, Huang, Yu, Zeng and Mi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Qing Yu, daisyyuqing@163.com
Yanxi Zeng, 2411211@tongji.edu.cn
Ma Mi, 793541008@qq.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.