ORIGINAL RESEARCH article

Front. Oncol., 03 December 2025

Sec. Gynecological Oncology

Volume 15 - 2025 | https://doi.org/10.3389/fonc.2025.1661153

Machine learning-based prediction of clinical outcomes in cervical cancer using routine hematological indices: development and web implementation

Gaigai Bai†, Fanghua Chen†, Junjun Qiu*, Keqin Hua*
  • Obstetrics & Gynecology Hospital of Fudan University, Shanghai Key Lab of Reproduction and Development, Shanghai Key Lab of Female Reproductive Endocrine Related Diseases, Shanghai, China

Background: Cervical cancer prognosis critically depends on tumor invasiveness, yet existing predictive tools lack accessibility and generalizability. We aimed to develop predictive models using comprehensive hematological profiling of routine tests to assess invasiveness and survival, improving clinical decision-making.

Methods: We conducted a retrospective analysis of 512 cervical cancer patients who underwent radical surgery. A panel of hematological indices was evaluated, including inflammatory markers, coagulation parameters, and metabolic indicators. Machine learning (ML) algorithms innovatively integrated with traditional regression were employed for feature selection and model development. Models were internally validated by bootstrap methods for discrimination (AUC/C-index) and calibration. Clinical utility was assessed by decision curve analysis (DCA). Web-based Shiny applications of these models were deployed.

Results: Using routine hematological indices selected from ML-based methods, we identified the optimal variable set for each clinical outcome prediction model based on C-index comparisons. The multivariable analyses of these variables identified hematological parameters associated with cervical cancer progression and prognosis. TG, HGB, Eosinophil count, TCLR, and NAR acted as protective factors, while LDL, WBC, FAR, DDI, FLR, ENLR, SII and platelet count were risk factors linked to advanced disease features. In addition, Tbil and DDI were consistent risk factors for both recurrence-free survival (RFS) and overall survival (OS). The models assessed invasiveness risk and survival risk in two critical periods: pre-surgery and post-surgery. The AUC values for predicting locally advanced cervical cancer (LACC), uterine body invasion (UBI), lymph node positivity (LNP), adjuvant therapy (ADT), parauterine invasion (PUI), and vaginal invasion (VI) were 0.714, 0.781, 0.781, 0.719, 0.756, and 0.700, respectively. For OS, the pre-surgery and post-surgery models achieved C-index of 0.875 and 0.906, while the RFS models yielded 0.790 and 0.863, respectively. All models showed AUC ≥ 0.7, strong calibration, and positive net benefit on DCA. Interactive web tools were implemented based on these models.

Conclusions: Comprehensive hematological profiling enables accurate prediction of cervical cancer invasiveness and survival during different decision-making periods. Our ML-enhanced, web-implemented models can enhance risk stratification and clinical decisions, particularly in resource-limited settings.

1 Introduction

Cervical cancer remains a significant global health burden, ranking as the fourth most common malignancy and cause of cancer-related mortality among women worldwide (1). Prognosis varies substantially depending on tumor stage, lymph node metastasis, and therapeutic response, with 5-year survival rates declining from 92% in localized disease to below 20% in metastatic cases (2). Despite advances in treatment, recurrence rates persist at 15-30% within 5 years post-treatment, underscoring the need for refined prognostic stratification (2).

In clinical practice, decision-making for cervical cancer management relies heavily on assessing key prognostic factors, including tumor stage, lymph node positivity (LNP), parauterine invasion (PUI), uterine body invasion (UBI), and vaginal invasion (VI). These factors not only guide therapeutic strategies but also predict surgical outcomes and survival. For instance, tumor stage determines surgical candidacy and forms the basis of initial treatment planning, while PUI and LNP are pivotal in guiding adjuvant treatment decisions, as recommended by current clinical guidelines (3). Moreover, these invasion-related parameters (PUI, LNP, UBI, and VI) also provide essential metrics for evaluating surgical complexity. Besides, the question of whether adjuvant therapy (ADT) is required adds complexity to clinical decision-making and often becomes a source of patient anxiety, making it a key concern in treatment discussions. Consequently, accurate assessment of these parameters becomes critical for both therapeutic planning and patient counseling.

However, despite the availability of numerous predictive tools for cervical cancer, critical gaps limit their clinical translation. First, many models depend on advanced imaging techniques and specialized expertise, which are often inaccessible in resource-limited settings (4, 5). Second, most tools lack patient-friendly interfaces, hindering their use in shared decision-making (6). Third, although several hematological biomarkers (such as the neutrophil-to-lymphocyte ratio) are clinically applicable, many molecular biomarkers, such as circulating tumor DNA or proteomic signatures, rely on specialized assays and remain less available in routine clinical settings (7). Additionally, many predictive tools for cervical cancer have been developed using datasets from limited geographic or institutional sources, which may not fully capture population diversity or clinical practice variations (8). These limitations highlight an urgent need for practical, accessible tools that can predict multiple clinical outcomes in cervical cancer management.

Emerging evidence suggests that inflammation, nutrition, and metabolic status play critical roles in cancer progression, as demonstrated by extensive basic and clinical research (9–11). Routine clinical tests, such as complete blood count, liver and kidney function tests, and coagulation profiles, provide valuable insights into patients’ inflammatory, nutritional, and metabolic status. Notably, several derived indicators, including the neutrophil-to-lymphocyte ratio (NLR), mean platelet volume-to-platelet count ratio (MPV/PC), and lymphocyte-to-monocyte ratio (LMR), have shown significant prognostic relevance in cervical cancer (12–14). Additionally, the systemic immune-inflammatory index (SII), derived from platelet count and NLR, has emerged as a potential predictor of progression-free survival in patients receiving immunotherapy (15). Despite these advances, the translation of these findings into clinical practice remains challenging, primarily due to the difficulty of selecting representative and practical indicators from a wide array of available options.

To address these challenges, this study aimed to develop clinically meaningful predictive models using routinely collected clinical data, selected by machine learning combined with traditional methods. Specifically, we evaluated risks of locally advanced cervical cancer (LACC), PUI, UBI, LNP, VI, ADT, recurrence-free survival (RFS), and overall survival (OS) in two critical periods: pre-surgery and post-surgery for cervical cancer patients. Additionally, we created user-friendly, web-based tools to integrate these models into clinical practice. These tools will not only enhance healthcare providers’ ability to assess disease severity but also empower patients with personalized risk insights, facilitating informed decision-making and improving patient outcomes.

2 Methods

2.1 Data collection

Patient data were collected from the electronic medical records of the Obstetrics and Gynecology Hospital of Fudan University between December 2017 and December 2018. Data were extracted by two independent researchers using standardized data collection forms. The inclusion criteria were as follows: (1) diagnosis of cervical cancer confirmed by pathology; (2) primary radical surgery; and (3) availability of complete follow-up data in the database of our center (the last follow-up was in November 2023). The exclusion criteria were: (1) patients who received neoadjuvant therapy; (2) patients with missing clinical information; (3) patients who had cervical carcinoma in situ; and (4) pregnant patients, those with severe comorbidities (such as infectious diseases, liver cirrhosis, or blood diseases), or those with other concurrent cancers. A flowchart outlining the inclusion and exclusion process for patient selection is shown in Figure 1. This study was approved by the Ethics Committee of Obstetrics and Gynecology Hospital of Fudan University (2025-25). Informed consent was waived as the study used retrospective, anonymized data without identifiable patient information.

Figure 1
Flowchart illustrating the inclusion and exclusion process of patients between December 2017 and December 2018. Out of 600 patients, 88 were excluded due to various reasons such as neoadjuvant therapy and missing data, leaving 512 included. The chart outlines six steps: data collection using electronic medical records, variable selection through methods like RSF and LASSO, model construction and comparison, and evaluation and application of models using web-based applications. It highlights the development of six outcome models, with final variable selection using traditional and multiple regression methods.

Figure 1. Research framework of this study.

We collected routine blood test results, lipid profiles, liver function indices, and coagulation function indices from the hospital laboratory database. Clinicopathological factors, including age, pregnancy history, menopausal status, and family history of cancer, were obtained from inpatient medical records. The human papillomavirus (HPV) infection status of each patient was retrieved from either inpatient records or the HPV genotyping report. The American Society of Anesthesiologists Physical Status Classification System (ASA) score was extracted from anesthesia records. Histopathological results were obtained from pathology reports, and tumor staging was determined by gynecologists according to the FIGO 2018 criteria. To determine the most predictive blood biomarkers for clinical outcomes, we conducted a comprehensive literature review to identify composite indicators used in cancer prognosis prediction and subsequently calculated these indices for each individual (16–21). The calculated parameters are listed in Supplementary Table 1. Figure 1 shows the study framework.
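As an illustration of how composite indices of this kind are derived from raw laboratory values, the following is a minimal Python sketch (the study itself used R). It covers only the ratios whose definitions are spelled out in this article (NLR, LMR, SII, FAR, FLR, ENLR, TCLR, NAR); the full list is in Supplementary Table 1. The class name `BloodPanel`, the field names, and the units in the comments are assumptions made for the example, and SII is computed with its commonly used form, platelet count × NLR.

```python
from dataclasses import dataclass


@dataclass
class BloodPanel:
    # Hypothetical container; units are illustrative assumptions only.
    neutrophil: float         # 10^9/L
    lymphocyte: float         # 10^9/L
    monocyte: float           # 10^9/L
    eosinophil: float         # 10^9/L
    platelet: float           # 10^9/L
    fibrinogen: float         # g/L
    albumin: float            # g/L
    total_cholesterol: float  # mmol/L


def composite_indices(p: BloodPanel) -> dict[str, float]:
    """Compute ratio-based indices from one patient's routine blood results."""
    if min(p.lymphocyte, p.monocyte, p.albumin) <= 0:
        raise ValueError("lymphocyte, monocyte and albumin must be positive")
    nlr = p.neutrophil / p.lymphocyte
    return {
        "NLR": nlr,                                         # neutrophil / lymphocyte
        "LMR": p.lymphocyte / p.monocyte,                   # lymphocyte / monocyte
        "SII": p.platelet * nlr,                            # platelet x NLR
        "FAR": p.fibrinogen / p.albumin,                    # fibrinogen / albumin
        "FLR": p.fibrinogen / p.lymphocyte,                 # fibrinogen / lymphocyte
        "ENLR": p.eosinophil * p.neutrophil / p.lymphocyte, # (eosinophil x neutrophil) / lymphocyte
        "TCLR": p.total_cholesterol / p.lymphocyte,         # total cholesterol / lymphocyte
        "NAR": p.neutrophil / p.albumin,                    # neutrophil / albumin
    }
```

For a toy panel with neutrophils 4.0, lymphocytes 2.0, and platelets 250.0, this gives NLR = 2.0 and SII = 500.0. The values are invented for the example and are not patient data.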

2.2 Variable selection for different clinical outcomes

To ensure the practicality of the predictive models, we defined the following usage scenarios: (1) predicting the risk of LACC (defined as FIGO stage IB3-IVA), UBI (confirmed by pathology), LNP (confirmed by pathology), ADT (obtained from medical records and follow-up data), PUI (confirmed by pathology), VI (confirmed by pathology), OS, and RFS for pre-surgery patients; and (2) predicting OS and RFS for post-surgery patients with known pathological results. We employed a two-step approach for variable selection, as depicted in Figure 1. The first step focused on selecting blood index variables. Random Forest (RF), Least Absolute Shrinkage and Selection Operator (LASSO) regression, and stepwise regression were used in this step to combine the strengths of machine learning and traditional statistical approaches. RF identifies variables with high predictive importance in nonlinear settings, LASSO regression performs penalized selection to handle multicollinearity, and stepwise regression selects variables based on statistical significance and information criteria (22, 23). This combined approach was intended to provide robust and stable feature identification. The feature selection process included: (1) RF, in which we ranked the blood index variables by variable importance (vimp) and selected the more important ones while accounting for collinearity, assessed by the variance inflation factor (VIF), with VIF < 10 taken to indicate no multicollinearity; (2) LASSO, which used the minimum criterion for variable selection; and (3) stepwise regression, which identified the optimal combination of blood index variables for each outcome according to the smallest Akaike information criterion (AIC). In the second step, we combined clinicopathological factors with the blood index variables selected in the first step and performed both univariable + multivariable and multivariable-only logistic/Cox regression.
The univariable + multivariable regression analysis was itself a two-stage selection. First, candidate blood indices were screened using univariable analyses; variables with p ≥ 0.1 were considered unlikely to be associated with the outcome and were excluded. To avoid omitting variables with potential clinical relevance or borderline statistical significance, variables that met this relaxed threshold (p < 0.1) in univariable screening were retained as candidate predictors and entered into the multivariable model. In the multivariable regression, the final set of variables was selected based on a p-value threshold of < 0.05.
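The two-stage screening logic can be sketched as follows, assuming the p-values have already been obtained from fitted models. This is a Python illustration only, not the authors' R code: the function name `two_stage_selection` and the callback `fit_multivariable` are hypothetical, and the callback stands in for fitting a logistic or Cox model and returning one p-value per variable.

```python
from typing import Callable, Mapping, Sequence


def two_stage_selection(
    univariable_p: Mapping[str, float],
    fit_multivariable: Callable[[Sequence[str]], Mapping[str, float]],
    screen_alpha: float = 0.1,   # relaxed threshold for univariable screening
    final_alpha: float = 0.05,   # threshold for the multivariable model
) -> tuple[list[str], list[str]]:
    """Return (candidates kept after screening, variables kept in the final model)."""
    # Stage 1: keep variables whose univariable p-value is below the relaxed threshold.
    candidates = sorted(v for v, p in univariable_p.items() if p < screen_alpha)
    if not candidates:
        return [], []
    # Stage 2: fit one multivariable model on the candidates (delegated to the
    # caller) and keep variables that remain significant at the final threshold.
    multivariable_p = fit_multivariable(candidates)
    final = [v for v in candidates if multivariable_p[v] < final_alpha]
    return candidates, final
```

The thresholds default to the values stated in the text (0.1 for screening, 0.05 for the final model). Passing the model-fitting step as a callback keeps the sketch independent of any particular regression library.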

2.3 Model construction and internal validation

After the variable selection procedure, we obtained six candidate variable sets for each outcome (LACC, UBI, LNP, ADT, PUI, VI, pre-surgery RFS, post-surgery RFS, pre-surgery OS, post-surgery OS). We then constructed logistic regression models for predicting LACC, PUI, UBI, LNP, ADT, and VI, and Cox models for predicting survival outcomes. Candidate models were compared by the area under the curve (AUC) for logistic regression models and the concordance index (C-index) for Cox regression models; models with a high AUC/C-index and a relatively simple variable set were selected as the final models. After determining the final model for each outcome, we conducted internal validation by bootstrapping (1,000 resamples). Validation covered two aspects, discrimination and calibration accuracy, which were evaluated by receiver operating characteristic (ROC) curves and calibration plots, respectively. An AUC ≥ 0.7 was considered acceptable discrimination. To evaluate goodness of fit in the calibration plots, the Spiegelhalter Z-test was conducted, with a p-value > 0.05 indicating no significant difference between predicted and actual probabilities. Additionally, a Brier score between 0 and 0.25 was considered indicative of good model accuracy. To further assess clinical practicality, we conducted decision curve analysis (DCA) and plotted the DCA curves. Moreover, to improve the practicality and availability of the models, we deployed them on https://www.shinyapps.io/ using R software, so that users can input their information and obtain the corresponding predictions.
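For readers less familiar with these metrics, the sketch below shows, in standard-library Python, how the AUC and the Brier score are computed from binary outcomes and predicted probabilities, together with a simple bootstrap percentile interval for the AUC. It is an illustration under stated assumptions rather than the authors' procedure: the study used R, and its bootstrap validation may have refitted the model in each resample, whereas this sketch only resamples fixed predictions. All function names are hypothetical.

```python
import random
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random case outranks a random non-case."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def bootstrap_auc_interval(y_true, y_score, n_resamples=1000, seed=0, level=0.95):
    """Percentile interval for the AUC of fixed predictions."""
    n = len(y_true)
    if not 0 < sum(y_true) < n:
        raise ValueError("both outcome classes are required")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lo = stats[int((1 - level) / 2 * (n_resamples - 1))]
    hi = stats[int((1 + level) / 2 * (n_resamples - 1))]
    return lo, hi
```

With toy outcomes `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, `auc` returns 0.75, because three of the four case/non-case pairs are ranked correctly.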

2.4 Statistical methods

All data were processed using R software (R 4.3.1). Categorical data are presented as numbers and frequencies. Normally distributed continuous data are described by means and standard deviations (SD), and skewed continuous data by medians and interquartile ranges (IQR). Survival curves were compared by the log-rank test, and the Benjamini-Hochberg method was used to control the false discovery rate in group comparisons. Prior to logistic and Cox regression modeling, we assessed multicollinearity using variance inflation factors (VIF < 10). Statistical significance is denoted as follows: * for p < 0.05, ** for p < 0.01, *** for p < 0.001, and **** for p < 0.0001.
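The Benjamini-Hochberg adjustment mentioned above can be written in a few lines. The following Python sketch is for illustration only; it follows the standard step-up definition, in which each sorted p-value is multiplied by m/rank and monotonicity is enforced from the largest p-value downward. The function name is an assumption.

```python
from typing import Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> list[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, so each adjusted value
    # is never larger than the adjusted value of any larger raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For the toy input `[0.01, 0.04, 0.03, 0.005]`, the adjusted values are approximately `[0.02, 0.04, 0.04, 0.02]`.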

3 Results

3.1 Patient characteristics

A total of 600 patients were initially screened based on the inclusion and exclusion criteria. Of these, 88 were excluded, leaving 512 patients for final inclusion (Figure 1). The average age of the included patients was 47.61 years. Among them, 347 (67.77%) were infected with HPV types 16/18, while 32 (6.25%) were HPV negative. The majority of the patients, 452 (88.28%), reported no family history of cancer, and 161 (31.45%) were menopausal. Regarding cancer stage, 385 (75.20%) patients were classified as FIGO (2018) stage I, 55 (10.74%) as stage II, and 72 (14.06%) as stage III. Squamous cell carcinoma (SCC) was the most prevalent histological type, diagnosed in 408 (79.69%) patients. We then analyzed the distribution of the clinical outcomes of interest in the study population. Among the 512 patients, 150 (29.30%) were diagnosed with LACC. The invasion patterns were distributed as follows: UBI was observed in 51 cases (9.96%), lymph node metastasis was present in 71 patients (13.87%), and PUI was identified in 24 cases (4.69%). VI was detected in 102 patients, accounting for 19.92% of the cohort. Regarding therapeutic interventions, 194 patients (37.89%) received adjuvant treatment following primary therapy (Supplementary Table 2). The blood indices and composite indices of the included patients are shown in Supplementary Table 2.

3.2 Survival outcomes of patients

We subsequently evaluated the survival and recurrence outcomes of these patients. During the follow-up period, 36 patients died, and 51 experienced recurrence. The 1-, 3-, and 5-year OS rates were 99.22%, 94.53%, and 92.88%, respectively, while the corresponding RFS rates were 95.12%, 90.62%, and 90.06% (Figures 2A, B). The 1-, 3-, and 5-year OS and RFS rates stratified by histological type and FIGO stage are shown in Table 1. Prognosis varied significantly according to FIGO stage, with FIGO I patients demonstrating the most favorable outcomes, followed by FIGO II and FIGO III patients (p < 0.001) (Figures 2C, D). Significant differences in survival outcomes were also observed among patients with different histological types. Specifically, patients with SCC had a significantly better prognosis than those with adenocarcinoma (ACC), as evidenced by higher overall survival and recurrence-free survival rates (p = 0.0023 and p < 0.0001, respectively) (Figures 2E, F). Additionally, patients with LACC, UBI, LNP, PUI, or VI had a significantly poorer prognosis than those without these characteristics (all p < 0.0001) (Figures 2G–P).
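Survival rates at fixed time points such as these are typically read off a Kaplan-Meier curve. The minimal Python sketch below shows the product-limit calculation, assuming follow-up times in months and an event indicator of 1 for death or recurrence and 0 for censoring. The function names and the toy data in the usage note are illustrative and are not taken from the study.

```python
from typing import Sequence


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> list[tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk  # product-limit step
            curve.append((t, surv))
        n_at_risk -= leaving  # events and censored subjects leave the risk set
    return curve


def survival_at(curve: list[tuple[float, float]], t: float) -> float:
    """Read the step function: survival probability at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s
```

With toy times `[6, 12, 12, 20, 30]` and events `[1, 0, 1, 1, 0]`, the estimated survival is 0.8 after month 6, about 0.6 after month 12, and about 0.3 after month 20.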

Figure 2
A series of Kaplan-Meier survival plots displaying different survival probabilities over 60 months across various groupings. Plots include overall survival (OS) and recurrence-free survival (RFS) for various categories: “All”, “FIGO”, “Histology”, “LACC”, “UBI”, “LNP”, “PUI”, and “VI”. Each plot compares survival between two or more groups, with statistical significance indicated by p-values. Distinct color-coded lines represent different groups.

Figure 2. Kaplan-Meier survival analysis of cervical cancer patients. (A, B) Overall survival (OS) and recurrence-free survival (RFS) for the entire cohort. (C, D) Survival stratified by FIGO (2018) stage. (E, F) Survival stratified by histological type. (G, H) Survival in locally advanced cervical cancer (LACC) patients and Non-LACC patients. (I, J) Survival based on uterine body invasion (UBI). (K, L) Survival stratified by lymph node positivity (LNP). (M, N) Survival according to parauterine invasion (PUI). (O, P) Survival based on vaginal invasion (VI).

Table 1. Survival outcomes stratified by histological type and FIGO stage.

3.3 Variable selection and models construction

We selected the variables for each outcome according to the methods described in the Methods section (Figure 1). In the first step, blood index variable selection, the importance ranking from the RF method is shown in Supplementary Table 3, and the LASSO plots are provided in Supplementary Figures 1 and 2. In the second step, we combined the clinicopathological factors with the selected blood indices and chose the final variable set from the six candidate sets (Supplementary Table 4). To identify the best variable set, we calculated the C-index for each model; the final model was the one with a higher C-index and a relatively simpler variable set, with the final variables shown in Table 2. Additionally, our analysis revealed that lipid profiles, coagulation indices, and total bilirubin (Tbil) were prominently associated with the predicted outcomes. To further explore the prognostic relevance of hematological indices in cervical cancer, we performed uni- and multivariable regression analyses for each clinical outcome (Supplementary Table 5). The results revealed distinct hematological patterns associated with disease progression and prognosis. For LACC prediction, triglycerides (TG) and hemoglobin (HGB) acted as protective factors, whereas low-density lipoprotein cholesterol (LDL), white blood cell count (WBC), and the fibrinogen-to-albumin ratio (FAR) were identified as independent risk factors. In the model predicting UBI, HGB remained a protective factor, while FAR served as a risk factor. Regarding LNP prediction, D-dimer (DDI), the fibrinogen-to-lymphocyte ratio (FLR), the (eosinophil × neutrophil)-to-lymphocyte ratio (ENLR), and SII were risk factors, whereas eosinophil count, the total cholesterol-to-lymphocyte ratio (TCLR), and the neutrophil-to-albumin ratio (NAR) exhibited protective effects. For ADT prediction, FLR and platelet count were identified as independent risk factors.
In survival analyses, both Tbil and DDI were consistent and independent risk factors for RFS and OS. Collectively, these findings indicate that a subset of routinely available hematological and biochemical parameters are significantly associated with distinct pathological characteristics and prognosis in cervical cancer.
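Because candidate variable sets were compared by C-index, a short illustration of that metric may help. The Python sketch below implements a simplified Harrell's C for right-censored data: a pair of patients is comparable when the one with the shorter follow-up had the event, and it is concordant when that patient also has the higher predicted risk. Pairs with tied follow-up times are skipped, which is a simplification; the function name is an assumption, and this is not the authors' R implementation.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Fraction of comparable pairs in which higher risk means earlier event."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:  # subject i failed before j's follow-up ended
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

With toy times `[5, 10, 15, 20]`, events `[1, 1, 0, 1]`, and risk scores `[0.9, 0.4, 0.6, 0.1]`, four of the five comparable pairs are concordant, giving a C-index of 0.8.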

Table 2. C-index of 6 variable sets for different clinical outcomes.

3.4 Performance of models

Subsequently, we performed internal validation of the models by bootstrapping (1,000 resamples). The AUCs of the models predicting LACC, UBI, LNP, ADT, and PUI were 0.701, 0.765, 0.739, 0.707, and 0.722, respectively, all meeting the prespecified threshold for acceptable discrimination (AUC ≥ 0.7). The model predicting VI showed moderate discrimination, with an AUC of 0.690, slightly below this threshold (Figure 3A). To evaluate predictive accuracy, we drew calibration plots, which showed close agreement between the calibration curves and the ideal curves (all p > 0.05). Furthermore, the Brier scores of the models ranged from 0.044 to 0.205, indicating high accuracy (Figure 3B). DCA was then used to evaluate the clinical practicality of the models. As the DCA plots show, these models can provide potential clinical benefit, as a positive net benefit was observed across a wide range of threshold probabilities (Figure 3C).
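The net benefit plotted in a decision curve has a simple closed form: at a threshold probability pt, net benefit = TP/n − (FP/n) × pt/(1 − pt), where patients with predicted risk at or above pt are treated as high risk. The Python sketch below computes this quantity for a model and for the "treat all" reference strategy; the "treat none" strategy has net benefit 0 by definition. Function names and the toy data in the usage note are illustrative only.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of classifying patients with risk >= threshold as high risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: every patient is treated as high risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

With toy outcomes `[1, 0, 1, 0, 0]`, predicted risks `[0.8, 0.3, 0.6, 0.7, 0.1]`, and a threshold of 0.5, the model's net benefit is about 0.2, whereas treating everyone gives about −0.2. A decision curve repeats this calculation over a grid of thresholds.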

Figure 3
Image showing three panels labeled A, B, and C with multiple graphs each. Panel A displays ROC curves for six variables: LACC, UBI, LNP, ADT, PUI, and VI, along with their respective AUC values. Panel B features calibration plots with actual versus predicted probabilities for the same variables, including metrics like Brier score. Panel C contains decision curve analysis graphs illustrating standardized net benefit against high-risk thresholds for each variable, comparing nomogram with all and none scenarios.

Figure 3. The performance of Logistic regression models. (A) The receiver operating characteristic (ROC) curve, (B) calibration curve and (C) decision curve analysis (DCA) plot of the models predicting locally advanced cervical cancer (LACC), uterine body invasion (UBI), lymph node positivity (LNP), adjuvant treatment (ADT), parauterine invasion (PUI) and vaginal invasion (VI).

For the survival and recurrence prediction models, the results demonstrated that four models exhibited good discrimination ability. Specifically, the 1-, 3-, and 5-year AUC values were as follows: 0.840, 0.810, and 0.813 for the model predicting RFS for pre-surgery patients; and 0.945, 0.882, and 0.900 for the model predicting RFS for post-surgery patients; 0.989, 0.890, and 0.865 for the model predicting OS for pre-surgery patients; 0.997, 0.918, and 0.904 for the model predicting OS for post-surgery patients (Figure 4A). The calibration curves demonstrated excellent agreement between the models’ predicted survival probabilities and the actual overall survival rates at 1, 3, and 5 years, indicating the models’ high accuracy (Figure 4B). DCA demonstrated that all four models provided significant net benefits across a range of threshold probabilities at 1, 3, and 5 years, indicating their potential clinical utility (Figure 4C).

Figure 4
Three panels display graphs related to survival outcomes for patients at two treatment stages (pre- and post- surgery). Panel A shows ROC curves for pre-surgery and post-surgery RFS and OS with AUC values. Panel B includes calibration plots of actual versus predicted probabilities for RFS and OS at one, three, and five years. Panel C presents decision curve analysis for net benefit versus risk threshold, indicating different models, including one, three, and five-year DCAs. Each panel aims to assess model performance over time.

Figure 4. The performance of Survival predicting models. (A) The receiver operating characteristic (ROC) curve (B) calibration curve and (C) decision curve analysis (DCA) plot of survival predicting models.

3.5 The application of models

To enhance the practicality and clinical applicability of our predictive models, we deployed them as user-friendly web-based tools. These tools are accessible through the website listed in Table 3, where users can input their personal health data. All deployed applications are fully anonymized and do not store any patient-identifiable information, in strict accordance with institutional privacy policies and ethical standards. The web interface includes clear instructions and definitions for each input field to facilitate accurate data entry. Upon submission, users receive individualized risk estimates for the relevant clinical outcomes. For logistic regression models, predicted risks along with 95% confidence intervals (CIs) are presented both as forest plots and in numerical format. For Cox models, outputs include survival curves, forest plots at specific time points, and numerical summaries. Additionally, a summary of model performance metrics is provided on the output page to aid interpretation. Sample user interfaces are shown in Figure 5. These interactive tools empower patients to actively participate in their healthcare decision-making and foster more effective collaboration with their healthcare providers.
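How an application of this kind might turn a fitted logistic model into a risk estimate with a 95% CI can be sketched as follows. The article does not describe how the deployed Shiny apps compute their intervals, so this Python example simply assumes a Wald interval on the log-odds scale, a common choice; the function name and its inputs (a linear predictor and its standard error, both of which would come from the fitted model) are hypothetical.

```python
import math


def risk_with_ci(linear_predictor: float, standard_error: float, z: float = 1.96):
    """Return (risk, lower, upper) from a log-odds estimate and its standard error."""
    if standard_error < 0:
        raise ValueError("standard error must be non-negative")

    def expit(x: float) -> float:
        # Inverse logit: maps log-odds to a probability in (0, 1).
        return 1.0 / (1.0 + math.exp(-x))

    # Build the interval on the log-odds scale, then transform it, so the
    # bounds always stay between 0 and 1.
    lower = expit(linear_predictor - z * standard_error)
    upper = expit(linear_predictor + z * standard_error)
    return expit(linear_predictor), lower, upper
```

For a linear predictor of 0 and a standard error of 0.5, the estimated risk is 0.5 and the interval is symmetric around it on the probability scale. Computing the interval before the inverse-logit transform is a design choice that avoids bounds outside the 0 to 1 range.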

Table 3. Web-based Shiny applications of different outcomes prediction in cervical cancer.

Figure 5
Panel A shows a graphical summary for uterine body invasion prediction, displaying a plot of the 95% confidence interval for response probability. Parameters include age, HPV status, HIS, HGB, and FAR. Panel B presents a pre-surgery overall survival prediction, with a survival probability plot over time and a 95% confidence interval for survival probability. Parameters include HPV status, Tbil, MPV, Platelet, and DDI_grade. Abbreviations and units are provided for both panels.

Figure 5. The sample interface of Logistic regression models (A) and survival predicting models (B).

4 Discussion

In the context of the high incidence of cervical cancer, the complexity of its treatment, and the limited availability of applicable predictive models in clinical practice, this study developed practicable predictive models using data from routine clinical examinations. By integrating machine learning and traditional methods, we identified key blood-based variables and clinicopathological factors to construct models for predicting clinical outcomes (LACC, PUI, LNP, UBI, VI, ADT, OS, and RFS) at two critical periods: pre-surgery and post-surgery. To enhance the practical utility of these models, we created interactive Shiny apps accessible to both patients and doctors. Overall, our study not only provides valuable clinical tools but also facilitates patient understanding of their disease status, thereby improving doctor-patient communication.

With the rapid advancement of science and technology, an increasing number of predictive models have been developed to forecast critical outcomes and prognoses in cervical cancer. Traditionally, these models have primarily relied on clinical factors. For example, Guo et al. constructed a predictive model for para-aortic lymph node metastasis based on pathological features of lymph nodes, tumor size, and histological type (24). In recent years, the application of machine learning and deep learning techniques has gained traction in cervical cancer prediction. An example is the deep learning model developed by Wu et al., which utilizes MRI images to predict the risk of cervical cancer recurrence (25). Meanwhile, progress in molecular oncology has enabled the development of molecular-based predictive models. These models have been employed to forecast responses to chemotherapy and immunotherapy, as well as the likelihood of distant metastases in cervical cancer patients (26–29). Despite these advancements, only a few predictive models have been successfully integrated into clinical practice. This gap is primarily attributed to the poor accessibility and limited clinical utility of current predictive models.

Our study builds upon and extends previous research that has applied machine learning approaches to routine blood analyses for cervical cancer (30, 31). Compared with these studies, our work integrates both classical machine learning algorithms and traditional statistical methods for variable selection, providing a balanced framework that enables comprehensive screening and optimal variable combinations for model development. In addition, we incorporated clinical treatment timelines by constructing predictive models for key clinical outcomes at different stages (preoperative and postoperative), thereby enhancing clinical relevance and applicability. Moreover, the models rely on easily obtainable clinical and laboratory parameters, and we further developed an interactive web-based interface designed for both healthcare providers and patients. This tool facilitates disease severity assessment, follow-up planning, and shared decision-making, supporting more individualized patient management.

Furthermore, our study identified several hematological markers implicated in the progression of cervical cancer. Hematological parameters such as lymphocyte, monocyte, neutrophil, and eosinophil counts, which have previously been correlated with the prognosis of colorectal, breast, and other malignancies, were confirmed in our research to hold prognostic significance in cervical cancer as well (19, 32, 33). In particular, we found that coagulation markers (platelet count, DDI, and fibrinogen) play a substantial role in predicting clinical outcomes in cervical cancer, including LACC, UBI, LNP, ADT, VI, RFS, and OS. This association, scarcely documented in prior research, underscores the prognostic weight of a hypercoagulable state and supports the rationale for anticoagulation therapy in cervical cancer management. Our investigation also highlighted the relevance of lipid metabolism indicators (TC, TG, HDL, LDL) to disease severity in cervical cancer. Intriguingly, we identified Tbil as a risk factor for prognosis. While the prognostic utility of Tbil has been established in hepatocellular carcinoma, cholangiocarcinoma, and non-small cell lung cancer, leading to the development of associated prognostic scores (34–36), its prognostic relevance in cervical cancer remains underexplored. This gap calls for further mechanistic and clinical studies of the role of Tbil in cervical cancer, to clarify its potential as a biomarker and therapeutic target. By leveraging comprehensive clinical data, our predictive models not only identified novel blood indices, such as coagulation markers and lipid metabolism indicators, but also demonstrated their value in cervical cancer. These findings enhance the clinical applicability of our study, providing a foundation for risk stratification and personalized treatment strategies.

This study has several limitations. First, the single-center, retrospective design may introduce bias. Second, owing to limited data availability, we were unable to perform external validation. Third, the pre-surgery models were developed using data from patients who ultimately underwent radical surgery at our institution, which may introduce bias when predicting outcomes for patients who did not receive radical surgery. Future research should incorporate more comprehensive datasets to enhance the generalizability of our findings and support the development of more intelligent, user-friendly, and clinically applicable predictive models.

In summary, we developed and internally validated clinically applicable prediction models for cervical cancer patients at two therapeutic stages (pre-surgery and post-surgery) by integrating multidimensional blood indices with clinicopathological features. The models showed AUC values of at least 0.70, good calibration, and positive net benefit on DCA, supporting their potential for personalized clinical decision-making. To facilitate real-world application, we deployed the models as interactive Shiny applications, enhancing accessibility and usability in clinical settings. This study not only provides a scalable predictive framework but also underscores the translational potential of routinely collected clinical data in advancing cervical cancer management.
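For readers unfamiliar with DCA, the following is a minimal sketch of the standard net-benefit calculation that underlies a decision curve. It is not the authors' code: the function names and the toy data are hypothetical, and it assumes binary outcomes coded as 0/1 together with predicted risks between 0 and 1.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating every patient whose predicted risk is >= threshold.

    Standard definition: TP/n - (FP/n) * threshold / (1 - threshold).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(y_true) == 0 or len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equally long")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # weight given to a false positive
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat everyone regardless of predicted risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(y_true) == 0:
        raise ValueError("y_true must be non-empty")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy, made-up data purely for illustration (not study data).
toy_outcomes = [1, 0, 1, 0]
toy_risks = [0.9, 0.2, 0.6, 0.7]
nb_model = net_benefit(toy_outcomes, toy_risks, threshold=0.5)
nb_all = net_benefit_treat_all(toy_outcomes, threshold=0.5)
```

A decision curve plots `net_benefit` across a range of thresholds against the treat-all and treat-none (net benefit of zero) strategies; a model is considered useful at thresholds where its curve lies above both.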

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by the Ethics Committee of Obstetrics and Gynecology Hospital of Fudan University. The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants’ legal guardians/next of kin because the study used retrospective, anonymized data without identifiable patient information.

Author contributions

GB: Writing – original draft, Visualization, Conceptualization, Validation, Formal analysis, Data curation, Writing – review & editing, Methodology. FC: Data curation, Visualization, Validation, Methodology, Investigation, Writing – original draft. JQ: Writing – review & editing, Supervision, Resources, Funding acquisition. KH: Visualization, Resources, Supervision, Conceptualization, Funding acquisition, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This work was supported by the National Natural Science Foundation of China (Grant Nos. 82173188 and 82472993), the Medical Innovation Research of Shanghai Science and Technology (21Y11906900), the Shanghai Oriental talent youth project, and the Huangpu youth talent project.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2025.1661153/full#supplementary-material

Abbreviations

ADT, Adjuvant therapy; AKP, Alkaline phosphatase; ALB, Albumin; ASA, American Society of Anesthesiologists Physical Status Classification System; CONUT, Controlling nutritional status score; DDI, D-Dimer; ELR, The ratio of eosinophil to lymphocyte; ENLR, The ratio of (eosinophil × neutrophil) to lymphocyte; FAR, The ratio of fibrinogen to albumin; FLR, The ratio of fibrinogen to lymphocyte; HDL, High-density lipoprotein cholesterol; HDLR, The ratio of HDL to lymphocyte; HGB, Hemoglobin; HPV, Human papillomavirus; LACC, Locally advanced cervical cancer; LDL, Low-density lipoprotein cholesterol; LDLR, The ratio of LDL to lymphocyte; LMR, The ratio of lymphocyte to monocyte; MPV, Mean platelet volume; NAR, The ratio of neutrophil to albumin; NLR, The ratio of neutrophil to lymphocyte; PDW, Platelet distribution width; PLR, The ratio of platelet to lymphocyte; PUI, Parauterine invasion; PVPR, The ratio of MPV to platelet; SII, Systemic immune inflammation index; SIS, Systemic inflammation score; Tbil, Total bilirubin; TC, Total cholesterol; TCLR, The ratio of TC to lymphocyte; TG, Triglycerides; TGLR, The ratio of TG to lymphocyte; UBI, Uterine body invasion; VI, Vaginal invasion; WBC, White blood cell
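As an illustration of how the ratio-based indices listed above can be derived from routine test values, here is a minimal sketch. The dictionary keys (such as `"neutrophil"` or `"fibrinogen"`) and the example values are hypothetical, and units are assumed to match those used when the models were built. The ratios follow the definitions in the abbreviation list; SII is not defined there, so the sketch assumes the commonly used formula platelet × neutrophil / lymphocyte.

```python
from typing import Dict


def derived_indices(labs: Dict[str, float]) -> Dict[str, float]:
    """Compute ratio-based hematological indices from raw laboratory values.

    Key names are hypothetical placeholders, not taken from the study's code.
    """
    # Every denominator must be strictly positive to avoid division errors.
    for key in ("lymphocyte", "monocyte", "albumin", "platelet"):
        if labs[key] <= 0:
            raise ValueError(f"{key} must be positive")

    neu = labs["neutrophil"]
    lym = labs["lymphocyte"]
    eos = labs["eosinophil"]
    plt = labs["platelet"]
    fib = labs["fibrinogen"]
    alb = labs["albumin"]

    return {
        "NLR": neu / lym,            # neutrophil to lymphocyte
        "PLR": plt / lym,            # platelet to lymphocyte
        "LMR": lym / labs["monocyte"],  # lymphocyte to monocyte
        "ELR": eos / lym,            # eosinophil to lymphocyte
        "ENLR": eos * neu / lym,     # (eosinophil x neutrophil) to lymphocyte
        "FAR": fib / alb,            # fibrinogen to albumin
        "FLR": fib / lym,            # fibrinogen to lymphocyte
        "NAR": neu / alb,            # neutrophil to albumin
        "PVPR": labs["mpv"] / plt,   # MPV to platelet
        "TCLR": labs["tc"] / lym,    # total cholesterol to lymphocyte
        "SII": plt * neu / lym,      # assumed common definition of SII
    }


# Made-up values for illustration only (not patient data).
example_labs = {
    "neutrophil": 4.0, "lymphocyte": 2.0, "monocyte": 0.5, "eosinophil": 0.1,
    "platelet": 250.0, "fibrinogen": 3.0, "albumin": 40.0, "mpv": 10.0, "tc": 5.0,
}
example_indices = derived_indices(example_labs)
```

The remaining lipid ratios (TGLR, HDLR, LDLR) follow the same pattern, dividing the lipid value by the lymphocyte count.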

References

1. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. (2024) 74:229–63. doi: 10.3322/caac.21834

2. Zheng M, Huang W, Wang D, Huang L, Ren Y, Gao Q, et al. Prognostic assessment of cervical cancer based on biomarkers: the interaction of ERRα and immune microenvironment. Virol J. (2025) 22:47. doi: 10.1186/s12985-025-02664-3

3. Abu-Rustum NR, Yashar CM, Arend R, Barber E, Bradley K, Brooks R, et al. NCCN guidelines® Insights: cervical cancer, version 1.2024. J Natl Compr Canc Netw. (2023) 21:1224–33. doi: 10.6004/jnccn.2023.0062

4. Jiang Y, Wang C, and Zhou S. Artificial intelligence-based risk stratification, accurate diagnosis and treatment prediction in gynecologic oncology. Semin Cancer Biol. (2023) 96:82–99. doi: 10.1016/j.semcancer.2023.09.005

5. Li J, Zhou H, Lu X, Wang Y, Pang H, Cesar D, et al. Preoperative prediction of cervical cancer survival using a high-resolution MRI-based radiomics nomogram. BMC Med Imaging. (2023) 23:153. doi: 10.1186/s12880-023-01111-5

6. Collins GS, Chester-Jones M, Gerry S, Ma J, Matos J, Sehjal J, et al. Clinical prediction models using machine learning in oncology: challenges and recommendations. BMJ Oncol. (2025) 4:e000914. doi: 10.1136/bmjonc-2025-000914

7. Lizano M, Carrillo-García A, de la Cruz-Hernández E, Castro-Muñoz LJ, and Contreras-Paredes A. Promising predictive molecular biomarkers for cervical cancer (Review). Int J Mol Med. (2024) 53:50. doi: 10.3892/ijmm.2024.5374

8. He B, Chen W, Liu L, Hou Z, Zhu H, Cheng H, et al. Prediction models for prognosis of cervical cancer: systematic review and critical appraisal. Front Public Health. (2021) 9:654454. doi: 10.3389/fpubh.2021.654454

9. Zitvogel L, Pietrocola F, and Kroemer G. Nutrition, inflammation and cancer. Nat Immunol. (2017) 18:843–50. doi: 10.1038/ni.3754

10. Ravasco P. Nutrition in cancer patients. J Clin Med. (2019) 8:1211. doi: 10.3390/jcm8081211

11. Muscaritoli M, Arends J, and Aapro M. From guidelines to clinical practice: a roadmap for oncologists for nutrition therapy for cancer patients. Ther Adv Med Oncol. (2019) 11:1758835919880084. doi: 10.1177/1758835919880084

12. Trinh H, Dzul SP, Hyder J, Jang H, Kim S, Flowers J, et al. Prognostic value of changes in neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR) and lymphocyte-to-monocyte ratio (LMR) for patients with cervical cancer undergoing definitive chemoradiotherapy (dCRT). Clin Chim Acta. (2020) 510:711–6. doi: 10.1016/j.cca.2020.09.008

13. Du J-Q, Zhang F, Wang C-Q, Zhu J-F, Xu L-X, Yang Y-H, et al. Effects of peripheral blood neutrophil/lymphocyte ratio levels and their changes on the prognosis of patients with early cervical cancer. Front Oncol. (2023) 13:1139809. doi: 10.3389/fonc.2023.1139809

14. Deng Q, Long Q, Liu Y, Yang Z, Du Y, and Chen X. Prognostic value of preoperative peripheral blood mean platelet volume/platelet count ratio (MPV/PC) in patients with resectable cervical cancer. BMC Cancer. (2021) 21:1282. doi: 10.1186/s12885-021-09016-8

15. Chen Q, Zhai B, Li J, Wang H, Liu Z, Shi R, et al. Systemic immune-inflammatory index predict short-term outcome in recurrent/metastatic and locally advanced cervical cancer patients treated with PD-1 inhibitor. Sci Rep. (2024) 14:31528. doi: 10.1038/s41598-024-82976-6

16. Crooks CJ, West J, Jones J, Hamilton W, Bailey SER, Abel G, et al. COLOFIT: development and internal-external validation of models using age, sex, faecal immunochemical and blood tests to optimise diagnosis of colorectal cancer in symptomatic patients. Aliment Pharmacol Ther. (2025) 61:852–64. doi: 10.1111/apt.18459

17. Li Y, Yu M, Yang M, and Yang J. The association of systemic immune-inflammation index with incident breast cancer and all-cause mortality: evidence from a large population-based study. Front Immunol. (2025) 16:1528690. doi: 10.3389/fimmu.2025.1528690

18. Kim D, Im M, Ryang S, Kim M, Jeon YK, Kim SS, et al. Association of the preoperative controlling nutritional status (CONUT) score with clinicopathological characteristics in patients with papillary thyroid carcinoma. Endocrinol Metab (Seoul). (2024) 39:856–63. doi: 10.3803/EnM.2024.2006

19. Constantinescu A-E, Bull CJ, Jones N, Mitchell R, Burrows K, Dimou N, et al. Circulating white blood cell traits and colorectal cancer risk: A Mendelian randomisation study. Int J Cancer. (2024) 154:94–103. doi: 10.1002/ijc.34691

20. Megyesfalvi E, Ghimessy A, Bauer J, Pipek O, Saghi K, Gellert A, et al. Diagnostic and prognostic relevance of inflammatory markers in surgically treated thymic epithelial tumors: An international multicenter study. Lung Cancer. (2025) 200:108111. doi: 10.1016/j.lungcan.2025.108111

21. Chen X, Zhao Y, Wang Y, Ye Y, Xu S, Zhou L, et al. Fluctuations in serum lipid levels during neoadjuvant treatment as novel predictive and prognostic biomarkers for locally advanced breast cancer: a retrospective analysis based on a prospective cohort. Lipids Health Dis. (2024) 23:261. doi: 10.1186/s12944-024-02140-x

22. Breiman L. Random forests. Mach Learn. (2001) 45:5–32. doi: 10.1023/A:1010933404324

23. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodol). (1996) 58:267–88. doi: 10.1111/j.2517-6161.1996.tb02080.x

24. Guo M, He M, Dang Y, Lei L, Li Q, Huang Y, et al. Predictors of para-aortic lymph node metastasis based on pathological diagnosis via surgical staging in patients with locally advanced cervical cancer: A multicenter study. Cancer Lett. (2025) 616:217545. doi: 10.1016/j.canlet.2025.217545

25. Wu J, Li J, Huang B, Dong S, Wu L, Shen X, et al. ConvXGB: A novel deep learning model to predict recurrence risk of early-stage cervical cancer following surgery using multiparametric MRI images. Transl Oncol. (2025) 52:102281. doi: 10.1016/j.tranon.2025.102281

26. Fang J, Wang Y, Li C, Liu W, Wang W, Wu X, et al. A hypoxia-derived gene signature to suggest cisplatin-based therapeutic responses in patients with cervical cancer. Comput Struct Biotechnol J. (2024) 23:2565–79. doi: 10.1016/j.csbj.2024.06.007

27. Lukovic J, Pintilie M, Han K, Fyles AW, Bruce JP, Quevedo R, et al. An immune gene expression risk score for distant metastases after radiotherapy for cervical cancer. Clin Cancer Res. (2024) 30:1200–7. doi: 10.1158/1078-0432.CCR-23-2085

28. Gui Z, Ye Y, Li Y, Ren Z, Wei N, Liu L, et al. Construction of a novel cancer-associated fibroblast-related signature to predict clinical outcome and immune response in cervical cancer. Transl Oncol. (2024) 46:102001. doi: 10.1016/j.tranon.2024.102001

29. Zhang X, Li J, Yang L, Zhu Y, Gao R, Zhang T, et al. Targeted proteomics-determined multi-biomarker profiles developed classifier for prognosis and immunotherapy responses of advanced cervical cancer. Front Immunol. (2024) 15:1391524. doi: 10.3389/fimmu.2024.1391524

30. Su J, Lu H, Zhang R, Cui N, Chen C, Si Q, et al. Cervical cancer prediction using machine learning models based on routine blood analysis. Sci Rep. (2025) 15:22655. doi: 10.1038/s41598-025-08166-0

31. Zhao H, Wang Y, Sun Y, Wang Y, Shi B, Liu J, et al. Hematological indicator-based machine learning models for preoperative prediction of lymph node metastasis in cervical cancer. Front Oncol. (2024) 14:1400109. doi: 10.3389/fonc.2024.1400109

32. Li B, Che L, Li H, Min F, Ai B, Wu L, et al. Peripheral blood immunoinflammatory biomarkers: prospective predictors of postoperative long-term survival and chronic postsurgical pain in breast cancer. Front Immunol. (2025) 16:1531639. doi: 10.3389/fimmu.2025.1531639

33. Onesti CE, Josse C, Boulet D, Thiry J, Beaumecker B, Bours V, et al. Blood eosinophilic relative count is prognostic for breast cancer and associated with the presence of tumor at diagnosis and at time of relapse. Oncoimmunology. (2020) 9:1761176. doi: 10.1080/2162402X.2020.1761176

34. Kim K-P, Kim KM, Ryoo B-Y, Choi W-M, Cha WC, Kang M, et al. Prognostic efficacy of the albumin-bilirubin score and treatment outcomes in hepatocellular carcinoma: A large-scale, multi-center real-world database study. Liver Cancer. (2024) 13:610–28. doi: 10.1159/000539724

35. Omouri-Kharashtomi M, Alemohammad SY, Moazed N, Afzali Nezhad I, and Ghoshouni H. Prognostic value of albumin-bilirubin grade in patients with cholangiocarcinoma: a systematic review and meta-analysis. BMC Gastroenterol. (2025) 25:19. doi: 10.1186/s12876-025-03596-6

36. Song L, Xu Q, Zhong T, Guo W, Lin S, Jiang W, et al. Potential utility of albumin-bilirubin and body mass index-based logistic model to predict survival outcome in non-small cell lung cancer with liver metastasis treated with immune checkpoint inhibitors. Chin Med J (Engl). (2025) 138:478–80. doi: 10.1097/CM9.0000000000003385

Keywords: cervical cancer, risk prediction, prognosis, machine learning, shiny

Citation: Bai G, Chen F, Qiu J and Hua K (2025) Machine learning-based prediction of clinical outcomes in cervical cancer using routine hematological indices: development and web implementation. Front. Oncol. 15:1661153. doi: 10.3389/fonc.2025.1661153

Received: 07 July 2025; Revised: 11 November 2025; Accepted: 19 November 2025; Published: 03 December 2025.

Edited by:

Kazim Yalcin Arga, Marmara University, Türkiye

Reviewed by:

Lanchun Lu, The Ohio State University, United States
Medi Kori, Marmara University, Türkiye

Copyright © 2025 Bai, Chen, Qiu and Hua. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Junjun Qiu, qiu_junjun@fudan.edu.cn; Keqin Hua, huakeqin@fudan.edu.cn

These authors have contributed equally to this work and share first authorship
