ORIGINAL RESEARCH article
Front. Oncol.
Sec. Gastrointestinal Cancers: Gastric and Esophageal Cancers
From Complex Algorithms to Clinical Practice: A Multicenter Machine Learning Model and Simplified Decision Tree for Predicting Cachexia Risk in Gastric Cancer
Provisionally accepted
- 1 The 958th Army Hospital of the Chinese People's Liberation Army, Chongqing, China
- 2 958 Hospital of the People's Liberation Army, Chongqing, China
- 3 Xijing Digestive Disease Hospital, Fourth Military Medical University, Xi'an, China
- 4 Air Force Medical University Tangdu Hospital, Xi'an, China
Background: Cachexia is a frequent metabolic syndrome that severely compromises survival in gastric cancer (GC). Early diagnosis is paramount, but existing screening methods are limited by complexity and suboptimal accuracy. There is an urgent need for an efficient, data-driven tool derived from routine clinical parameters.
Methods: In this multicenter retrospective study, we analyzed data from three independent hospitals. Variables were selected using univariable and multivariable analyses. We constructed and compared multiple machine learning (ML) models to predict cachexia risk. Discriminative ability, calibration, and clinical net benefit were evaluated using the area under the receiver operating characteristic curve (AUC), calibration plots, and decision curve analysis (DCA), respectively.
Results: The study included 1,570 GC patients (cachexia prevalence: 30.3%). Patients were divided into training (n=920), internal testing (n=350), and external validation (n=300) cohorts. Cachexia was significantly associated with poor nutritional status, elevated inflammation, and inferior overall survival (P < 0.01). The Random Forest (RF) model performed best and remained stable across the internal test set (AUC=0.898) and the external validation set (AUC=0.913). To enhance clinical utility, we further derived a simplified decision tree model based on three accessible markers: CA19-9, CEA, and albumin. This simplified tool retained high diagnostic accuracy (AUC > 0.783) and demonstrated significant positive net benefits in DCA.
Conclusion: We established and externally validated a high-performance ML model for predicting GC-associated cachexia. The derived simplified decision tree offers clinicians a convenient, highly generalizable tool for identifying high-risk patients from routine laboratory tests, enabling earlier precision nutritional management.
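For readers less familiar with the evaluation metrics named in the Methods, the following is a minimal sketch of how AUC and the DCA net benefit are commonly computed. It is not the authors' code: the labels and risk scores are a hypothetical toy cohort of ten patients, the function names are illustrative, and the net-benefit formula is the standard one, NB = TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating every patient whose predicted risk is at
    or above `threshold` (a probability strictly between 0 and 1)."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def treat_all_net_benefit(labels: Sequence[int], threshold: float) -> float:
    """Reference strategy in DCA: classify every patient as high risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy cohort (NOT study data): 1 = cachexia, 0 = no cachexia.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical predicted risks from some model.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]

model_auc = auc(labels, scores)
model_nb = net_benefit(labels, scores, threshold=0.5)
treat_all_nb = treat_all_net_benefit(labels, threshold=0.5)
```

In a decision curve, the model's net benefit is plotted across a range of threshold probabilities and compared with the "treat all" line and the "treat none" line (net benefit of zero); a model is considered clinically useful over the thresholds where its curve lies above both.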
Keywords: cachexia, decision tree, external validation, gastric cancer, machine learning, nutritional assessment, prediction model
Received: 14 Dec 2025; Accepted: 16 Feb 2026.
Copyright: © 2026 Zhao, Deng, Guo, Qiao, Wu, Zhao, Zeng, Zhao, Song, Hou and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Qianyong Yang
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
