- 1Department of Medical Record Management, Tongliang District People’s Hospital, Chongqing, China
- 2Department of General Practice, Tongliang District People’s Hospital, Chongqing, China
- 3Department of Clinical Laboratory, Tongliang District People’s Hospital, Chongqing, China
Background: The prognosis of patients with primary lung cancer remains poor. Therefore, this study aimed to develop and validate a predictive model to evaluate the overall survival (OS) of these patients.
Methods: A retrospective analysis was conducted on the data of 1,308 patients with primary lung cancer who received treatment and follow-up at our hospital from 2016 to 2022. The entire cohort was randomly divided in a 7:3 ratio into a derivation cohort (n=915) and a validation cohort (n=393). A prognostic nomogram was constructed using Cox-least absolute shrinkage and selection operator regression analysis to predict the OS probabilities at 1-, 3-, and 5-years. Kaplan–Meier curves and log-rank tests were used to analyze and compare OS among different patient subgroups. The model was comprehensively evaluated through the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis (DCA).
Results: Age, gender, red blood cell count, serum potassium, albumin-globulin ratio, and prothrombin time activity were identified as prognostic indicators for predicting OS in patients with primary lung cancer. In the derivation cohort, the AUCs at 1-, 3-, and 5-years were 0.739 (95% confidence interval [CI]: 0.702–0.776), 0.727 (95% CI: 0.690–0.764), and 0.675 (95% CI: 0.629–0.721), respectively. In the validation cohort, the AUCs at 1-, 3-, and 5-years were 0.770 (95% CI: 0.712–0.827), 0.784 (95% CI: 0.732–0.837), and 0.717 (95% CI: 0.646–0.789), respectively. The calibration curve and DCA results confirmed the model’s good predictive power.
Conclusion: In this study, we developed and validated an OS prediction model for patients with primary lung cancer. Providing personalized predictions with multiple outcomes increases the information available to patients and clinicians.
1 Introduction
Lung cancer is one of the most prevalent malignant tumors globally. The American Cancer Society projected 1,918,030 new cancer cases and 609,360 cancer-related deaths in the United States (US) in 2022, with approximately 350 deaths per day from lung cancer (1). Lung cancer holds the highest mortality rate among malignant tumors worldwide and ranks first in both incidence and mortality rates in China (2). In 2020, there were 539,100 new cases, and 471,500 people died from lung cancer (3). Additionally, lung cancer is one of the primary causes of disability-adjusted life years (4, 5). The low 5-year survival rate of lung cancer compels clinicians to analyze and assess patients’ prognoses to optimize treatment strategies (6). Personalized prognostic diagnostics can guide tailored treatments, significantly advancing the development of precision medicine (7). Consequently, an accurate and practical prognostic model is essential.
Most current risk assessment models focus primarily on integrating disease-related dimensions, such as Tumor-Node-Metastasis (TNM) staging, imaging features, and specific pathological or genetic markers (8–10). Furthermore, several models have been developed based on routine laboratory indexes, such as inflammatory markers or albumin levels. However, models that systematically screen and integrate a broad panel of pre-treatment, routinely available laboratory parameters, without relying on specialized pathology, advanced imaging, or genomic data, to build a parsimonious and immediately deployable prognostic tool are less common. The novelty of this study lies in its exclusive focus on this universally accessible data domain. Its primary advantage over existing high-performance models is not necessarily superior predictive accuracy, but superior practicality and accessibility. It is designed as a “first-line” risk stratification tool that can be generated from a standard admission blood panel and basic demographics, providing rapid prognostic insight in settings where, or at a time when, detailed staging, advanced imaging (e.g., PET-CT), or genetic profiling results are not yet available. This addresses a distinct clinical gap where simplicity, speed, and broad applicability are paramount. Indeed, biomarkers and laboratory indicators play an indispensable role in facilitating rapid diagnosis and accurate prediction of short-term prognosis (11). Richlitzki et al. reported that C-reactive protein (CRP) is a significant predictor of overall survival (OS) in patients with stage III non-small cell lung cancer (12). In a multicenter, observational, retrospective study (13), researchers developed a prognostic model designed to discern early differences between lung cancer and benign nodules, leveraging routine clinical and laboratory data.
The XGBoost model demonstrated exceptional performance in differentiating advanced from early-stage lung cancer, achieving an area under the curve (AUC) as high as 0.913. Other laboratory parameters, including urea nitrogen, serum albumin, serum total protein, and neutrophil count, are also pivotal in the diagnostic and prognostic assessment of lung cancer (14–16). In addition to laboratory indicators, recent advancements in artificial intelligence (AI) have shown promise in improving diagnostic and prognostic accuracy in lung cancer (17–19). For instance, AI models have been applied to histopathological images for survival prediction and to CT imaging data for prognostication in non-small cell lung cancer. Furthermore, privacy-preserving distributed modeling approaches, such as federated learning, are being explored to develop robust prognostic models without compromising patient data privacy (20–22). These AI-driven methodologies, when integrated with conventional laboratory indicators, hold significant potential to enhance the accuracy of lung cancer diagnosis and prognosis.
While non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC) are distinct pathological subtypes with different prognoses and treatment paradigms, this study focused on primary lung cancer as a combined entity. This approach was taken for two primary reasons. First, our objective was to develop a pragmatic, initial risk-stratification tool based on universally available routine data prior to the completion of comprehensive pathological subtyping, which can sometimes be delayed in clinical practice. Second, we aimed to identify common, systemic physiological derangements (reflected in laboratory parameters) that influence prognosis across the spectrum of lung cancer, thereby providing a broadly applicable clinical instrument.
Therefore, this study compiled a large set of routine laboratory data from patients with lung cancer, aiming to investigate the risk factors affecting their prognosis and to develop a prognostic prediction model to assist in clinical decision-making. This approach will enable physicians to more accurately assess patients’ conditions and provide a basis for implementing personalized treatment plans.
2 Methods
2.1 Study design and patients
This study adhered to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines. All patients with lung cancer admitted to the People’s Hospital of Tongliang District between January 1, 2016, and December 31, 2022, were included in this study. The entire cohort of 1,308 patients was then randomly divided into a derivation cohort (70%, n=915) and an internal validation cohort (30%, n=393) using a computer-generated random sequence. This split was performed to ensure independent assessment of the model’s performance. This study was approved by the Ethics Committee of the People’s Hospital of Tongliang District (Ethical Approval No. TLLLKY2023041) and was conducted following the Declaration of Helsinki. Informed consent for participation was not required due to the retrospective design of the study, which complied with national legislation and institutional requirements.
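The 7:3 random split can be sketched as follows. This is a minimal illustration only: the seed, the use of Python, and rounding the derivation size down are assumptions, since the text states only that a computer-generated random sequence was used.

```python
import random

def split_cohort(n_patients, derivation_fraction=0.7, seed=2023):
    """Randomly split patient indices into derivation and validation sets.

    The seed and the floor rounding are illustrative assumptions.
    """
    indices = list(range(n_patients))
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    rng.shuffle(indices)
    n_derivation = int(n_patients * derivation_fraction)  # rounds down
    return indices[:n_derivation], indices[n_derivation:]

derivation, validation = split_cohort(1308)
print(len(derivation), len(validation))
```

With 1,308 patients, rounding down reproduces the reported cohort sizes of 915 and 393.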
2.2 Inclusion and exclusion criteria
The inclusion criteria were as follows: (i) age ≥ 18 years and (ii) a diagnosis of lung cancer confirmed through histopathological examination. The exclusion criteria were as follows: (i) lung cancer not being the first primary cancer; (ii) survival time unknown or < 1 month; (iii) unknown marital status or age. The inclusion and exclusion criteria are outlined in Figure 1.
2.3 Data collection
A total of 26 candidate variables were selected. All laboratory indicators were collected as baseline measurements, specifically from the first complete blood tests performed at the time of or immediately following the confirmed diagnosis of primary lung cancer, and prior to the initiation of any definitive anti-cancer treatment (such as surgery, chemotherapy, radiotherapy, or targeted therapy). This approach ensured that the variables reflected the patient’s pre-treatment physiological state. Specifically, we explored age, gender, marital status, white blood cell (WBC) count, red blood cell (RBC) count, neutrophil-lymphocyte ratio (NLR), lymphocyte-monocyte ratio (LMR), platelet-lymphocyte ratio (PLR), serum creatinine, serum chloride, serum potassium, serum sodium, serum calcium, serum phosphorus, serum magnesium, basophil percentage, eosinophil percentage, total bilirubin (TBIL), direct bilirubin (DBIL), albumin-globulin ratio (AGR), aspartate aminotransferase/alanine aminotransferase (AST/ALT), alkaline phosphatase (ALP), urea, uric acid (UA), prothrombin time activity (PTA), and hypersensitive C-reactive protein (hs-CRP).
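Several of the candidate variables are ratios derived from routine results. A minimal sketch of how they might be computed is given below; the function name, argument names, and patient values are hypothetical, and globulin is assumed to be total protein minus albumin, a common convention that the text does not state explicitly.

```python
def derived_ratios(neutrophils, lymphocytes, monocytes, platelets,
                   albumin, total_protein):
    """Compute NLR, LMR, PLR, and AGR from routine blood results.

    Cell counts are in 10^9/L and proteins in g/L. Globulin is taken as
    total protein minus albumin (an assumption made for this sketch).
    """
    globulin = total_protein - albumin
    return {
        "NLR": neutrophils / lymphocytes,
        "LMR": lymphocytes / monocytes,
        "PLR": platelets / lymphocytes,
        "AGR": albumin / globulin,
    }

# Hypothetical patient values, for illustration only
ratios = derived_ratios(neutrophils=4.5, lymphocytes=1.5, monocytes=0.5,
                        platelets=240.0, albumin=40.0, total_protein=70.0)
print(ratios)
```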
2.4 Statistical analyses
Statistical analyses were performed using the Statistical Package for the Social Sciences (version 22.0) and R (version 4.3.3) software. Categorical data are expressed as frequencies and percentages, and the chi-square test was used for group comparisons. Continuous data with a normal distribution are expressed as mean ± standard deviation, and intergroup differences were analyzed using the t-test. For continuous data that did not follow a normal distribution, the median and interquartile range (M [Q1-Q3]) were reported, and comparisons were made using the Mann–Whitney U test. To handle missing data and maximize statistical power, we employed multivariate imputation by chained equations (MICE). We performed 10 imputations, and convergence was assessed using the Gelman-Rubin statistic, with values close to 1 indicating satisfactory convergence.
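For readers unfamiliar with the Gelman-Rubin statistic, the sketch below computes it for a set of equal-length chains by comparing between-chain and within-chain variance. The chains are made-up traces (for example, the mean of an imputed variable across iterations), not data from this study, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains."""
    n = len(chains[0])                               # iterations per chain
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])   # W
    between = n * variance(chain_means)                    # B
    pooled = (n - 1) / n * within + between / n
    return sqrt(pooled / within)

# Made-up traces: the first pair mixes well, the second pair does not
mixed = [[4.1, 3.9, 4.0, 4.2, 3.8], [4.0, 4.1, 3.9, 3.8, 4.2]]
separated = [[4.1, 3.9, 4.0, 4.2, 3.8], [6.1, 5.9, 6.0, 6.2, 5.8]]
print(gelman_rubin(mixed), gelman_rubin(separated))
```

Chains that explore the same distribution give a value near 1, whereas chains stuck in different regions give a value well above 1.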
The Cox-least absolute shrinkage and selection operator (LASSO) regression analysis was used to identify independent prognostic predictors. The optimal λ value was determined using the one-standard-error rule, which selects the largest λ value within one standard error of the minimum cross-validation error. Based on the results of multivariate Cox regression analysis, independent risk predictors were utilized to create a prognostic nomogram to predict the probability of OS at 1-, 3-, and 5-years. The score assignment for each indicator in the nomogram was calculated by assigning each predictor variable a score based on its hazard ratio (HR) from the Cox regression model, summing the individual scores of all predictor variables to obtain a total score for each patient, and then mapping this total score to the corresponding survival probabilities at 1-, 3-, and 5-years using the nomogram. The Kaplan–Meier (KM) curve and log-rank tests were used to analyze and compare the OS of patients in different subgroups. The model was comprehensively evaluated through the area under the receiver operating characteristic (ROC) curve (AUC), calibration curves, and decision curve analysis (DCA). To further validate our model, we compared it with two other machine learning-based survival models: Random Survival Forests (RSF) and XGBoost survival analysis. These models were implemented using the R packages ‘randomForestSRC’ and ‘xgboost’, respectively. The performance of these models was evaluated using the same metrics as our Cox-LASSO model. All statistical tests were two-tailed, with a significance level set at P < 0.05.
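The one-standard-error rule described above can be illustrated with a short sketch. The λ grid, cross-validation errors, and standard errors are invented for illustration and are not the values from this study; the function name is hypothetical.

```python
def lambda_one_se(lambdas, cv_errors, cv_ses):
    """Largest lambda whose CV error is within one SE of the minimum error."""
    best = min(range(len(lambdas)), key=lambda i: cv_errors[i])
    threshold = cv_errors[best] + cv_ses[best]
    eligible = [lam for lam, err in zip(lambdas, cv_errors) if err <= threshold]
    return max(eligible)

# Invented cross-validation results (e.g., partial likelihood deviance)
lambdas = [0.200, 0.100, 0.050, 0.020, 0.005]
cv_errors = [12.90, 12.60, 12.45, 12.40, 12.42]
cv_ses = [0.07, 0.06, 0.06, 0.06, 0.06]

chosen = lambda_one_se(lambdas, cv_errors, cv_ses)
print(chosen)
```

Choosing the largest eligible λ applies the strongest penalty that is still statistically indistinguishable from the best fit, which favors a sparser model.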
3 Results
3.1 Patient characteristics
A total of 1,308 patients with primary lung cancer selected through stringent screening procedures and inclusion/exclusion criteria were included in this study. The patients were divided into derivation and validation cohorts in a 7:3 ratio, comprising 915 patients in the derivation cohort and 393 in the validation cohort. Among these, 957 were males (73.17%) and 351 were females (26.83%). Most patients were married (85.24%). Table 1 provides a detailed summary of the basic information of the patients with primary lung cancer included in this study. The results of the multiple imputation for missing data are detailed in Supplementary Tables 1 and 2. These results indicate that, in both the derivation and validation cohorts, there were no statistically significant differences before and after imputation for any of the indicators with missing values.
3.2 Identification of the predictive factors for OS
As displayed in Figure 2, the Cox-LASSO model selected lambda.1se, corresponding to a λ value of 0.05501995, and six predictors: age, gender, RBC count, serum potassium, AGR, and PTA. The multivariate Cox regression model indicated that age (HR = 1.047, 95% CI: 1.033–…, P < 0.001), gender (male) (HR = 2.465, 95% CI: 1.766–…, P < 0.001), RBC count (HR = 0.787, 95% CI: 0.645–…, P = 0.018), serum potassium (HR = 0.712, 95% CI: 0.569–…, P = 0.003), AGR (HR = 0.357, 95% CI: 0.244–…, P < 0.001), and PTA (HR = 1.017, 95% CI: 1.011–…, P < 0.001) were the prognostic indicators for predicting OS in patients with primary lung cancer (Figure 3).
Figure 2. Feature selection by Cox-LASSO. (A) Cox-LASSO coefficient profiles (y-axis) of features; (B) 10-fold cross-validation for tuning parameter selection in the Cox-LASSO model.
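Hazard ratios for continuous predictors in Section 3.2 refer to a one-unit change, so the effect of a larger difference follows from the log-linear form of the Cox model. The sketch below rescales two of the reported HRs; it assumes that age was entered in years and AGR as a continuous ratio, which the text does not state explicitly, and the function name is hypothetical.

```python
from math import exp, log

def scaled_hazard_ratio(hr_per_unit, delta):
    """Hazard ratio implied by a change of `delta` units in the predictor."""
    return exp(log(hr_per_unit) * delta)

age_10_years = scaled_hazard_ratio(1.047, 10)    # 10 years older
agr_half_unit = scaled_hazard_ratio(0.357, 0.5)  # AGR higher by 0.5
print(round(age_10_years, 3), round(agr_half_unit, 3))
```

Under these assumptions, a 10-year age difference corresponds to roughly a 58% higher hazard, and an AGR higher by 0.5 corresponds to roughly a 40% lower hazard.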
3.3 Development of prediction model for OS
We developed a nomogram for OS. Each variable was assigned a point based on HR. Then, by summing the total score of each variable and locating the score on the total points scale, we obtained a nomogram predicting 1-, 3-, and 5-year OS. The nomogram containing independent predictive factors for predicting 1-, 3-, and 5-year OS of patients with primary lung cancer is illustrated in Figure 4.
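The point assignment can be sketched in the conventional way for Cox nomograms: each coefficient (the natural log of the HR) is multiplied by the predictor's range, and the predictor with the widest effect is scaled to 100 points. The HRs below are those reported in Section 3.2, whereas the predictor ranges, patient values, and function names are hypothetical, so the resulting points will not match Figure 4.

```python
from math import log

def nomogram_scale(hazard_ratios, ranges):
    """Maximum points per predictor; the widest effect is scaled to 100."""
    spans = {name: abs(log(hr)) * (ranges[name][1] - ranges[name][0])
             for name, hr in hazard_ratios.items()}
    widest = max(spans.values())
    return {name: 100 * span / widest for name, span in spans.items()}

def patient_points(values, hazard_ratios, ranges):
    """Points for one patient; protective predictors score in reverse."""
    max_points = nomogram_scale(hazard_ratios, ranges)
    points = {}
    for name, x in values.items():
        low, high = ranges[name]
        fraction = (x - low) / (high - low)
        if hazard_ratios[name] < 1:   # higher value means lower risk
            fraction = 1 - fraction
        points[name] = max_points[name] * fraction
    return points

hazard_ratios = {"age": 1.047, "male": 2.465, "rbc": 0.787,
                 "potassium": 0.712, "agr": 0.357, "pta": 1.017}
# Hypothetical plotting ranges, not taken from the study
ranges = {"age": (18, 95), "male": (0, 1), "rbc": (2.0, 6.0),
          "potassium": (2.5, 6.0), "agr": (0.5, 2.5), "pta": (40, 140)}
patient = {"age": 70, "male": 1, "rbc": 4.0,
           "potassium": 4.0, "agr": 1.2, "pta": 90}

points = patient_points(patient, hazard_ratios, ranges)
print(round(sum(points.values()), 1))
```

Mapping the total score to 1-, 3-, and 5-year survival probabilities additionally requires the baseline survival function of the fitted model, which is not reported in the text and is therefore omitted here.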
3.4 Performance of the prediction model for OS
The time-dependent ROC curves demonstrated that the AUCs at 1-, 3-, and 5-years were 0.739 (95% CI: 0.702–0.776), 0.727 (95% CI: 0.690–0.764), and 0.675 (95% CI: 0.629–0.721), respectively, in the derivation cohort. In the validation cohort, the AUCs at 1-, 3-, and 5-years were 0.770 (95% CI: 0.712–0.827), 0.784 (95% CI: 0.732–0.837), and 0.717 (95% CI: 0.646–0.789), respectively (Figures 5A, B). Additionally, calibration curves (Figures 6A, B) indicated that the nomogram has good predictive accuracy in both the derivation and validation cohorts. The DCA curves (Figures 7A–F) also revealed that the nomogram has good clinical utility in both cohorts. To further evaluate our model, we compared it with RSF and XGBoost survival analysis models. The AUCs for the RSF model at 1-, 3-, and 5-years were 0.745, 0.725, and 0.673, respectively (Supplementary Figure 1). For the XGBoost model, the AUCs at 1-, 3-, and 5-years were 0.716, 0.660, and 0.591, respectively (Supplementary Figure 1). Collectively, these results indicate that our Cox-LASSO model performs comparably to these machine learning models, demonstrating a slight advantage at certain time points. To enhance the interpretability of our model, we have included a visualization of feature importance based on the coefficients from our Cox-LASSO model (Supplementary Figure 2).
Figure 7. Decision curves of the model. (A) 1-year in the derivation cohort; (B) 3-year in the derivation cohort; (C) 5-year in the derivation cohort; (D) 1-year in the validation cohort; (E) 3-year in the validation cohort; (F) 5-year in the validation cohort.
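The quantity plotted in a decision curve is the net benefit at a given threshold probability. The following sketch computes it for a binary outcome at a fixed time horizon; it assumes complete follow-up (no censoring) for simplicity, and the predicted risks and outcomes are invented for illustration.

```python
def net_benefit(predicted_risk, died, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(died)
    true_pos = sum(1 for p, y in zip(predicted_risk, died)
                   if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(predicted_risk, died)
                    if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)

def net_benefit_treat_all(died, threshold):
    """Reference strategy in which every patient is treated."""
    prevalence = sum(died) / len(died)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Invented predicted risks of death and observed outcomes (1 = died)
risk = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
died = [1, 1, 0, 1, 0, 0, 0, 1]

model_nb = net_benefit(risk, died, threshold=0.5)
all_nb = net_benefit_treat_all(died, threshold=0.5)
print(model_nb, all_nb)
```

A model is considered clinically useful at a threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.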
3.5 Sensitivity analysis
To further validate the stability of our predictive model, we conducted a sensitivity analysis based on marital status. The model was evaluated in three subgroups: Married/Living with partner, Never married, and Widowed/Divorced/Separated. The results showed that the model maintained good predictive performance across all subgroups, with AUCs ranging from 0.650 to 0.750 at 1-, 3-, and 5-years (Supplementary Figure 3). This analysis demonstrates the robustness of our model in different demographic subgroups.
We observed differences in LMR, PLR, and AST/ALT between the derivation and validation cohorts. To ensure that these differences do not affect the stability of our predictive model, we conducted sensitivity analyses. We re-verified the model after excluding patients with abnormal indicators (LMR, PLR, and AST/ALT outside the normal range). The results showed that the model maintained good predictive performance, with AUCs at 1-, 3-, and 5-years remaining stable (Supplementary Figure 4). This analysis confirms that the model’s stability is not significantly affected by the observed differences in these indicators.
3.6 KM survival analysis
We conducted survival analysis on the six prognostic factors separately. The results of the KM analysis revealed that the OS of male patients was significantly lower than that of female patients (P < 0.001), while patients aged < 65 years exhibited a considerably higher OS (P < 0.001) (Figures 8A, B). Additionally, we observed reduced OS in patients with lower RBC counts (P = 0.003), lower serum potassium levels (P < 0.001), lower AGR (P < 0.001), and higher PTA (P = 0.005) (Figures 8C–F).
Figure 8. KM curves for the predictors. (A) Gender; (B) Age; (C) RBC count; (D) Serum potassium; (E) AGR; (F) PTA.
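The KM estimator underlying these curves multiplies, at each event time, the proportion of at-risk patients who survive. A minimal sketch with invented follow-up data (times in months, 1 = death, 0 = censored) is shown below; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who died or were censored at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Invented follow-up data, for illustration only
times = [6, 12, 12, 20, 30, 36]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
print(curve)
```

Censored patients contribute to the at-risk count until their last follow-up but do not produce a step in the curve.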
4 Discussion
In this study, we evaluated several prognostic variables associated with OS in patients with primary lung cancer. Our findings indicated that a straightforward prediction model based on six prognostic factors—age, gender, RBC count, serum potassium, AGR, and PTA—can effectively predict the 1-, 3-, and 5-year OS of these patients, with AUCs in the derivation cohort of 0.739 (95% CI: 0.702–0.776), 0.727 (95% CI: 0.690–0.764), and 0.675 (95% CI: 0.629–0.721), respectively.
Advanced age is a significant risk factor affecting the long-term survival of patients with primary lung cancer. The poor prognosis in elderly patients is associated with digestive dysfunction, reduced physiological reserve, poor tolerance to surgery and chemotherapy, and the presence of multiple comorbidities (23–25). Chen et al. (26) demonstrated significant differences in clinical characteristics and prognosis among patients with lung cancer across various age groups. Compared to patients aged 40 years or younger, those over 80 years have a higher risk of mortality. Huh et al. (27) also indicated that one of the strongest risk factors for lung cancer mortality was age ≥ 65 years.
Gender significantly impacts the survival status of patients with primary lung cancer. In this study, the mortality rate among male patients was much higher than that among female patients (1-year mortality rate: male to female ratio of 1.014:1; 3-year mortality rate: male to female ratio of 1.603:1; 5-year mortality rate: male to female ratio of 1.923:1), which was also confirmed in a cohort from the US (28). A nationwide cohort study of non-small cell lung cancer in Switzerland also demonstrated that male survival rates were significantly worse than those of females at various clinical stages (29). Another study from Korea indicated that female gender is a better prognostic factor for patients with small cell lung cancer, even after comprehensive adjustment for all prognostic variables (adjusted HR: 0.51, 95% CI: 0.34–0.77) (30). The significant gender disparity in survival observed in our cohort (male: 73.17%) aligns with these global trends but must be interpreted considering potential local confounders. The pronounced male predominance in our study population may reflect, in part, regional differences in smoking prevalence and occupational exposures. Furthermore, while our multivariable model adjusted for key laboratory parameters, residual confounding from unmeasured factors—such as detailed smoking history, specific treatment modalities received (e.g., types of surgery or chemotherapy regimens), socio-economic status, and comorbidities—could contribute to the observed association between male gender and poorer prognosis. Future studies with more granular data are needed to disentangle the independent effect of gender from these closely linked lifestyle and clinical factors. The higher mortality rate in males may be related to adverse habits such as smoking and drinking (31, 32). Conversely, females are less likely to smoke and drink than males, resulting in lower exposure to risk factors in terms of both time and intensity.
Our study identified four common laboratory indicators (RBC count, serum potassium, AGR, and PTA) that can serve as prognostic variables for OS in patients with primary lung cancer. RBCs are responsible for transporting oxygen to various parts of the body. A low RBC count can limit oxygen supply, potentially influencing tumor growth rate and invasiveness, thereby affecting treatment efficacy and patient survival (33). Furthermore, a low RBC count may be associated with chronic inflammatory conditions, where the inflammatory microenvironment can promote tumor development and metastasis (34). Regarding serum potassium, while its level is tightly regulated, variations within the normal range may reflect underlying metabolic or nutritional status. Our finding that lower serum potassium is associated with worse prognosis warrants further investigation into its potential role in the context of cancer cachexia or treatment-related metabolic disturbances (35, 36). Numerous studies have examined the role of the AGR in lung cancer prognosis. Yao et al. (37) demonstrated a significant correlation between pre-treatment AGR and prognosis in patients with advanced non-small cell lung cancer. Duran et al. (38) identified low AGR as a strong predictor of long-term mortality risk in patients with lung adenocarcinoma. Zhou et al. (39) indicated that an AGR below 1.29 was an independent predictor of OS deterioration in patients with small-cell lung cancer. Patients with late-stage lung cancer often exhibit dysfunctions in coagulation and fibrinolysis systems, with chemotherapy potentially exacerbating these abnormalities (40). Changes in fibrinolysis parameters are particularly sensitive indicators of such exacerbations. PTA is a crucial marker for assessing coagulation function, and an elevated PTA may indicate the activation status of the coagulation system.
In addition to laboratory indicators, multimodal approaches that integrate laboratory and imaging data have shown promise in improving diagnostic and prognostic accuracy in thoracic oncology. For instance, Rani et al. proposed a multi-modal bone suppression, lung segmentation, and classification approach for accurate COVID-19 detection using chest radiographs (41). This approach demonstrates the potential of integrating advanced image preprocessing pipelines with laboratory data to enhance diagnostic accuracy. Similarly, Rani et al. introduced a spatial feature and resolution maximization GAN for bone suppression in chest radiographs, which further highlights the importance of multimodal data integration in thoracic oncology (42). These studies suggest that combining laboratory indicators with advanced imaging techniques can provide a more comprehensive assessment of patient prognosis.
The primary innovation of this study lies in its methodological focus and pragmatic design. Unlike models that depend on specialized or delayed diagnostic information, we systematically derived and validated a parsimonious prognostic tool using exclusively pre-treatment, routine laboratory data and basic demographics. This approach identifies a core set of systemic physiological markers (RBC, potassium, AGR, PTA) that provide independent prognostic value across a broad lung cancer population. Clinically, this model offers several practical implications. First, it serves as an immediate, accessible risk stratification instrument that can be calculated at the time of initial diagnosis, providing early prognostic insight to clinicians and patients while awaiting more definitive staging and molecular profiling. Second, by highlighting parameters like AGR and PTA, it draws attention to the prognostic relevance of nutritional-inflammation status and coagulation function, which may be modifiable therapeutic targets. Finally, the nomogram format facilitates personalized communication of survival probabilities at 1, 3, and 5 years, aiding in shared decision-making and the tailoring of follow-up intensity. While not a replacement for comprehensive staging, this tool adds a valuable, rapid-assessment layer to the clinical management pathway.
Several limitations should be noted: (i) Although the present study is based on real-world data, the inherent nature of a single-center retrospective analysis introduces potential bias, necessitating further validation through multicenter prospective studies; (ii) we did not include several critical prognostic variables, most notably the tumor stage, detailed histology (NSCLC vs. SCLC subtype), specific genetic mutation status (e.g., EGFR, ALK), and the treatment modality received by the patient (e.g., surgery, type of chemotherapy/immunotherapy, radiotherapy). We recognize that this omission is a major limitation, as these factors are fundamental determinants of prognosis in lung cancer. Consequently, the reliability and clinical applicability of our current model are constrained; it is not intended to replace comprehensive staging and molecular-based prognostic systems. Instead, its utility lies as a preliminary, complementary risk-assessment tool that leverages universally available data. Future studies integrating these crucial clinical-pathological variables with the baseline laboratory parameters are essential to develop a more robust and comprehensive predictive model; (iii) our study did not conduct repeated measurements at different time points to observe the dynamic changes of these laboratory parameters. Given the potential value of dynamic monitoring, future research should focus on collecting longitudinal data to explore the association between changes in laboratory parameters (e.g., the magnitude of the decrease in serum potassium level) and OS. This approach could provide a more nuanced understanding of disease progression and improve the accuracy of prognostic models. To address some of these limitations, we conducted a sensitivity analysis based on marital status, which showed that the model maintained good predictive performance across different subgroups.
To address the baseline differences observed in LMR, PLR, and AST/ALT between the two cohorts, we conducted a supplementary analysis and sensitivity analyses. The differences may be attributed to variations in patient inclusion time and treatment plans. Our sensitivity analyses confirmed that the model’s stability is not significantly affected by these differences, ensuring the robustness of our predictive model.
5 Conclusion
This population-based study identified independent predictive factors associated with OS in patients with primary lung cancer. Additionally, we developed and validated a novel, robust, and reliable predictive model that provides more accurate individualized survival estimates and can be utilized for clinical counseling.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The Ethics Committee of Tongliang District People’s Hospital approved the study. Written informed consent for participation was not required for this study due to its retrospective design, and the study was undertaken in accordance with national legislation and institutional requirements.
Author contributions
MC: Formal Analysis, Writing – original draft, Data curation, Conceptualization, Funding acquisition, Supervision. HC: Writing – original draft, Formal Analysis, Data curation. ZY: Writing – original draft, Formal Analysis. XH: Formal Analysis, Writing – original draft.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This study was funded by Chongqing medical scientific research project (grant number 2024MSXM079).
Acknowledgments
We would like to thank all the participants of this project and investigators for collecting the data. We would like to thank Editage (www.editage.com) for English language editing.
Conflict of interest
The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2025.1708848/full#supplementary-material
References
1. Siegel RL, Miller KD, Fuchs HE, and Jemal A. Cancer statistics, 2022. CA Cancer J Clin. (2022) 72:7–33. doi: 10.3322/caac.21708
2. Xia C, Dong X, Li H, Cao M, Sun D, He S, et al. Cancer statistics in China and United States, 2022: profiles, trends, and determinants. Chin Med J (Engl). (2022) 135:584–90. doi: 10.1097/CM9.0000000000002108
3. Cao W, Chen H-D, Yu Y-W, Li N, and Chen W-Q. Changing profiles of cancer burden worldwide and in China: a secondary analysis of the global cancer statistics 2020. Chin Med J (Engl). (2021) 134:783–91. doi: 10.1097/CM9.0000000000001474
4. Collaborators GA. The burden and trend of diseases and their risk factors in Australia, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Public Health. (2023) 8:e585–e99. doi: 10.1016/S2468-2667(23)00123-8
5. Kuang Z, Wang J, Liu K, Wu J, Ge Y, Zhu G, et al. Global, regional, and national burden of tracheal, bronchus, and lung cancer and its risk factors from 1990 to 2021: findings from the global burden of disease study 2021. EClinicalMedicine. (2024) 75:102804. doi: 10.1016/j.eclinm.2024.102804
6. Ahvonen J, Luukkaala T, Laitinen T, and Jukkola A. Survival with lung cancer in Finland has not improved during 2007-2019-a single center retrospective population-based real-world study. Acta Oncol. (2023) 62:571–8. doi: 10.1080/0284186X.2023.2213444
7. Chen M, Copley SJ, Viola P, Lu H, and Aboagye EO. Radiomics and artificial intelligence for precision medicine in lung cancer treatment. Semin Cancer Biol. (2023) 93:97–113. doi: 10.1016/j.semcancer.2023.05.004
8. Kurose Y, Azuma Y, Iyoda A, and Tochigi N. Prognostic impact based on tumor diameter of pathological N1 lymph node metastases for non-small cell lung cancer. J Thorac Dis. (2024) 16:5878–89. doi: 10.21037/jtd-24-792
9. Peng Z, Wang Y, Qi Y, Hu H, Fu Y, Li J, et al. Application of prediction model based on CT radiomics in prognosis of patients with non-small cell lung cancer. BMC Cancer. (2025) 25:1273. doi: 10.1186/s12885-025-14544-8
10. Yang F, Jia X, Ma Z, Liu S, Liu C, Chen D, et al. Exploring the prognostic role of microbial and genetic markers in lung squamous cell carcinoma. Sci Rep. (2025) 15:4499. doi: 10.1038/s41598-025-88120-2
11. Seo D, Choi BH, La JA, Kim Y, Kang T, Kim HK, et al. Multi-biomarker profiling for precision diagnosis of lung cancer. Small. (2024) 20:e2402919. doi: 10.1002/smll.202402919
12. Richlitzki C, Wiesweg M, Metzenmacher M, Guberina N, Pöttgen C, Hautzel H, et al. C-reactive protein as robust laboratory value associated with prognosis in patients with stage III non-small cell lung cancer (NSCLC) treated with definitive radiochemotherapy. Sci Rep. (2024) 14:13765. doi: 10.1038/s41598-024-64302-2
13. Wei W, Wang Y, Ouyang R, Wang T, Chen R, Yuan X, et al. Machine learning for early discrimination between lung cancer and benign nodules using routine clinical and laboratory data. Ann Surg Oncol. (2024) 31:7738–49. doi: 10.1245/s10434-024-15762-3
14. Peng X, Huang Y, Fu H, Zhang Z, He A, and Luo R. Prognostic value of blood urea nitrogen to serum albumin ratio in intensive care unit patients with lung cancer. Int J Gen Med. (2021) 14:7349–59. doi: 10.2147/IJGM.S337822
15. Cao J, Luo F, Zeng K, Ma W, Lu F, Huang Y, et al. Predictive value of high preoperative serum total protein and elevated hematocrit in patients with non-small-cell lung cancer after radical resection. Nutr Cancer. (2022) 74:3533–45. doi: 10.1080/01635581.2022.2079683
16. Wang Y, Li D, Li Q, Basnet A, Efird JT, and Seki N. Neutrophil estimation and prognosis analysis based on existing lung squamous cell carcinoma datasets: the development and validation of a prognosis prediction model. Transl Lung Cancer Res. (2024) 13:2023–37. doi: 10.21037/tlcr-24-411
17. Yin X, Liao H, Yun H, Lin N, Li S, Xiang Y, et al. Artificial intelligence-based prediction of clinical outcome in immunotherapy and targeted therapy of lung cancer. Semin Cancer Biol. (2022) 86:146–59. doi: 10.1016/j.semcancer.2022.08.002
18. Guo S-B, Cai X-Y, Meng Y, Huang W-J, and Tian X-P. AI model using clinical images for genomic prediction and tailored treatment in patients with cancer. Lancet Oncol. (2025) 26:e126. doi: 10.1016/S1470-2045(25)00008-7
19. Zhao Y, Xiong S, Ren Q, Wang J, Li M, Yang L, et al. Deep learning using histological images for gene mutation prediction in lung cancer: a multicentre retrospective study. Lancet Oncol. (2025) 26:136–46. doi: 10.1016/S1470-2045(24)00599-0
20. Field M, Vinod S, Delaney GP, Aherne N, Bailey M, Carolan M, et al. Federated learning survival model and potential radiotherapy decision support impact assessment for non-small cell lung cancer using real-world data. Clin Oncol. (2024) 36:e197–208. doi: 10.1016/j.clon.2024.03.008
21. Field M, Thwaites DI, Carolan M, Delaney GP, Lehmann J, Sykes J, et al. Infrastructure platform for privacy-preserving distributed machine learning development of computer-assisted theragnostics in cancer. J BioMed Inform. (2022) 134:104181. doi: 10.1016/j.jbi.2022.104181
22. Subashchandrabose U, John R, Anbazhagu UV, Venkatesan VK, and Ramakrishna MT. Ensemble federated learning approach for diagnostics of multi-order lung cancer. Diagnostics. (2023) 13:3053. doi: 10.3390/diagnostics13193053
23. Chaudhary H, Stewart CM, Webster K, Herbert RJ, Frick KD, Eisele DW, et al. 127:631–41. doi: 10.1002/lary.26311
24. Chen X, Hou L, Shen Y, Wu X, Dong B, and Hao Q. The role of baseline sarcopenia index in predicting chemotherapy-induced undesirable effects and mortality in older people with stage III or IV non-small cell lung cancer. J Nutr Health Aging. (2021) 25:878–82. doi: 10.1007/s12603-021-1633-3
25. Molina-Garrido M, Guillen-Ponce C, Munoz-Sanchez M, Hernandez AO, Ruiperez CO, Crespo JS, et al. Chemotherapy-induced changes in the physiologic reserve of elderly patients diagnosed with cancer. J Clin Oncol. (2012) 30:e19590. doi: 10.1200/jco.2012.30.15_suppl.e19590
26. Chen X, Han X, Zhou H, Liang Y, Huang Z, Li S, et al. The clinical characteristics and prognosis of different age patients with lung cancer. Cancer Manag Res. (2020) 12:8445–50. doi: 10.2147/CMAR.S240318
27. Huh Y, Sohn YJ, Kim HR, Chun H, Kim HJ, and Son KY. Sex differences in prognosis factors in patients with lung cancer: A nationwide retrospective cohort study in Korea. PloS One. (2024) 19:e0300389. doi: 10.1371/journal.pone.0300389
28. Brouwer AF, Engle JM, Jeon J, and Meza R. Sociodemographic survival disparities for lung cancer in the United States, 2000-2016. J Natl Cancer Inst. (2022) 114:1492–500. doi: 10.1093/jnci/djac144
29. Radkiewicz C, Dickman PW, Johansson ALV, Wagenius G, Edgren G, and Lambe M. Sex and survival in non-small cell lung cancer: A nationwide cohort study. PloS One. (2019) 14:e0219206. doi: 10.1371/journal.pone.0219206
30. Lim JH, Ryu J-S, Kim JH, Kim H-J, and Lee D. Gender as an independent prognostic factor in small-cell lung cancer: Inha Lung Cancer Cohort study using propensity score matching. PloS One. (2018) 13:e0208492. doi: 10.1371/journal.pone.0208492
31. Dugué P-A, Yu C, Hodge AM, Wong EM, Joo JE, Jung C-H, et al. Methylation scores for smoking, alcohol consumption, and body mass index and risk of seven types of cancer. Int J Cancer. (2023) 153:489–98. doi: 10.1002/ijc.34513
32. Shi M, Luo C, Oduyale OK, Zong X, LoConte NK, and Cao Y. Alcohol consumption among adults with a cancer diagnosis in the all of us research program. JAMA Netw Open. (2023) 6:e2328328. doi: 10.1001/jamanetworkopen.2023.28328
33. Chen X-X, Zhao S-T, Yang X-M, He S-C, and Qian F-H. Additional diagnostic value of the monocyte to red blood cell count ratio and the product of lymphocyte count and albumin concentration in lung cancer management. Oncol Lett. (2023) 25:135. doi: 10.3892/ol.2023.13721
34. Yang Z, He H, He G, Zeng C, and Hu Q. Investigating causal effects of hematological traits on lung cancer: A mendelian randomization study. Cancer Epidemiol Biomarkers Prev. (2023) 33:96–105. doi: 10.1158/1055-9965.EPI-23-0725
35. Capitani C, Altadonna GC, Santillo M, and Lastraioli E. Ion channels in lung cancer: biological and clinical relevance. Front Pharmacol. (2023) 14:1283623. doi: 10.3389/fphar.2023.1283623
36. Alasiri G. Comprehensive analysis of KCNJ14 potassium channel as a biomarker for cancer progression and development. Int J Mol Sci. (2023) 24:2049. doi: 10.3390/ijms24032049
37. Yao Y, Zhao M, Yuan D, Gu X, Liu H, and Song Y. Elevated pretreatment serum globulin albumin ratio predicts poor prognosis for advanced non-small cell lung cancer patients. J Thorac Dis. (2014) 6:1261–70. doi: 10.3978/j.issn.2072-1439
38. Duran AO, Inanc M, Karaca H, Dogan I, Berk V, Bozkurt O, et al. Albumin-globulin ratio for prediction of long-term mortality in lung adenocarcinoma patients. Asian Pac J Cancer Prev. (2014) 15:6449–53. doi: 10.7314/APJCP.2014.15.15.6449
39. Zhou T, Zhan J, Hong S, Hu Z, Fang W, Qin T, et al. Ratio of C-reactive protein/albumin is an inflammatory prognostic score for predicting overall survival of patients with small-cell lung cancer. Sci Rep. (2015) 5:10481. doi: 10.1038/srep10481
40. Abbas M, Kassim SA, Wang Z-C, Shi M, Hu Y, and Zhu H-L. Clinical evaluation of plasma coagulation parameters in patients with advanced-stage non-small cell lung cancer treated with palliative chemotherapy in China. Int J Clin Pract. (2020) 74:e13619. doi: 10.1111/ijcp.13619
41. Rani G, Misra A, Dhaka VS, Buddhi D, Sharma RK, Zumpano E, et al. A multi-modal bone suppression, lung segmentation, and classification approach for accurate COVID-19 detection using chest radiographs. Intell Syst Appl. (2022) 16:200148. doi: 10.1016/j.iswa.2022.200148
Keywords: nomogram, overall survival, prediction model, primary lung cancer, prognostic
Citation: Cai M, Chen H, Yan Z and He X (2026) Common laboratory parameters as predictors of prognosis in primary lung cancer. Front. Oncol. 15:1708848. doi: 10.3389/fonc.2025.1708848
Received: 30 September 2025; Revised: 15 December 2025; Accepted: 22 December 2025; Published: 12 January 2026.
Edited by:
Michael N. Kammer, Université Toulouse 1 Capitole, France
Reviewed by:
Eugenio Vocaturo, National Research Council (CNR), Italy
Xiuyu Cai, Sun Yat-sen University Cancer Center (SYSUCC), China
Copyright © 2026 Cai, Chen, Yan and He. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Mingchun Cai, cmc187303961@163.com
Hao Chen2