ORIGINAL RESEARCH article

Front. Public Health

Sec. Public Health Policy

Volume 13 - 2025 | doi: 10.3389/fpubh.2025.1614938

This article is part of the Research Topic: Changing Healthcare through Innovation in Clinical Management and Healthcare Policy Strategies: Focus on Quality Improvement for the Patient

Comparative Models on Low Multiplier DRG Classification for Advanced Lung Cancer

Provisionally accepted
Mingming Yu1, Mingming Yu2*
  • 1Shanghai Institute of Electronic Information Technology, Shanghai, China
  • 2Zhejiang College, Tongji University, Shanghai, Shanghai Municipality, China

The final, formatted version of the article will be published soon.

Abstract [Objective] This study aimed to compare the performance of machine learning models in predicting low multiplier DRGs for advanced lung cancer, and to identify the optimal algorithm along with the key influencing factors. [Methods] Prediction models for low multiplier DRGs in advanced lung cancer were developed using four machine learning algorithms: logistic regression, hybrid naive Bayes, support vector machine (SVM), and random forest. Model performance was evaluated, and key contributing features were identified. [Results] First, the random forest algorithm achieved the highest AUC, accuracy, and precision across all three ER groups, indicating robust performance. Second, cost-related features and length of hospital stay (LoS), which reflect "resource consumption", contributed significantly more to the prediction of low multiplier DRGs than demographic factors such as gender and age. [Conclusion] Based on comorbidity severity, the DRG classification for advanced lung cancer patients receiving internal medicine treatment under ER1 appeared reasonably structured and provided a valid basis for subgroup comparisons. Additionally, the predictive models' findings showed potential signs of upcoding and intentional underuse of reimbursable medications, highlighting the need to monitor examination fee reductions across ER1 subgroups and to track medication costs in ER11 throughout the hospital stay. Lastly, in predicting low multiplier DRGs, larger datasets improve model stability. Model choice should align with the analytical goal: random forest offers higher precision and robustness, while logistic regression or SVM may be preferred for higher recall.
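The evaluation metrics named in the abstract (AUC, accuracy, precision, recall) can be illustrated with a minimal Python sketch. It is not the authors' code: the labels, the two score lists, the 0.5 decision threshold, and all function names are hypothetical and chosen only to show how one model can score higher on precision while another scores higher on recall. A label of 1 stands for a low multiplier case.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = low multiplier case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute AUC, accuracy, precision, and recall at an assumed decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "auc": auc(y_true, scores),
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical toy data: 3 low multiplier cases (1) and 5 other cases (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical scores from a conservative model (few positive predictions).
scores_conservative = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.45, 0.05]

# Hypothetical scores from a more sensitive model (more positive predictions).
scores_sensitive = [0.9, 0.7, 0.6, 0.55, 0.2, 0.1, 0.65, 0.3]

report_conservative = evaluate(y_true, scores_conservative)
report_sensitive = evaluate(y_true, scores_sensitive)

for name, report in [("conservative", report_conservative), ("sensitive", report_sensitive)]:
    print(name, {k: round(v, 3) for k, v in report.items()})
```

On this toy data the conservative model flags fewer cases and makes no false alarms, whereas the sensitive model catches every low multiplier case at the cost of more false alarms. Which behavior is preferable depends on whether missed cases or false alarms are costlier for the analytical goal.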

Keywords: machine learning, Advanced lung cancer, Low Multiplier DRGs, Prediction model, Upcoding

Received: 20 Apr 2025; Accepted: 28 Aug 2025.

Copyright: © 2025 Yu and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Mingming Yu, Zhejiang College, Tongji University, Shanghai, 200092, Shanghai Municipality, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.