Ensemble machine learning models for predicting bone metastasis in bladder cancer

Yu, Zhan Jiang; Xu, Xiang Da; Zou, Xin Chang; Su, Pei De; Chao, Hai Chao; Zeng, Tao

doi:10.3389/fonc.2025.1653506

ORIGINAL RESEARCH article

Front. Oncol., 25 September 2025

Sec. Genitourinary Oncology

Volume 15 - 2025 | https://doi.org/10.3389/fonc.2025.1653506

This article is part of the Research Topic: Urothelial Neoplasms: An Integrated Approach to Prevention, Diagnostics, and Personalized Therapy

Zhan Jiang Yu¹†

Xiang Da Xu¹†

Xin Chang Zou¹

Pei De Su²

Hai Chao Chao²

Tao Zeng²*

¹The Second Affiliated Hospital, Jiangxi Medical College, Nanchang University, Nanchang, China
²Department of Urology, Second Affiliated Hospital of Nanchang University, Nanchang, China

Background and purpose: The occurrence of bone metastasis (BM) in advanced bladder cancer (BC) often signifies a poor prognosis. Currently, the accurate prediction of BM in BC remains a challenge. This study develops predictive models using machine learning algorithms to predict bladder cancer bone metastasis (BCBM) and aid in personalized clinical decisions.

Patients and methods: We reviewed and analyzed data from patients diagnosed with BC between 2010 and 2015 in the Surveillance, Epidemiology, and End Results (SEER) database. In addition, we included 327 patients treated at the Second Affiliated Hospital of Nanchang University and Jiangxi Cancer Hospital as an external validation cohort. Independent risk factors for BM in patients with BC were identified through univariate and multivariate logistic regression analyses. These features were then integrated into seven machine learning algorithms to build predictive models: logistic regression (LR), support vector machine (SVM), gradient boosting machine (GBM), neural network (NN), random forest (RF), extreme gradient boosting (XGB), and k-nearest neighbors (KNN). The performance of these models was evaluated using the area under the receiver operating characteristic curve (AUC), along with accuracy, sensitivity (recall), and specificity.

Results: A total of 22,114 patients diagnosed with BC were included in this study, with 537 (2.4%) patients developing BM. The identified independent risk factors for BCBM included age, race, tumor histology, tumor grade, T stage, N stage, the presence of brain metastasis, liver metastasis, and lung metastasis, and history of radiotherapy. Among the seven developed machine learning models, the tree-based GBM model exhibited the best performance in the test set, achieving AUC, accuracy, sensitivity, and specificity values of 0.855, 0.813, 0.733, and 0.815, respectively. The GBM model also demonstrated robust performance in the external validation set, achieving an AUC of 0.766 and accuracy of 0.945. According to Shapley additive explanations (SHAP), the most significant feature in the GBM prediction model is the T stage, followed by the N stage and radiotherapy.

Conclusion: The GBM model offers a precise and personalized approach to predicting BCBM, potentially enhancing clinical decision-making and the efficiency of BM screening in patients with BC.

Introduction

Bladder cancer (BC) is the second most common urogenital cancer (1). Worldwide, it ranks as the ninth most prevalent cancer, with approximately 614,000 new cases and 220,000 deaths reported in 2022 (2). BC is characterized by a high rate of recurrence and metastasis (3). Metastatic bladder cancer (mBC) primarily spreads to the lymph nodes, the bones, the lungs, and the liver (4). Approximately 10%–15% of patients with BC are diagnosed with metastasis at the initial presentation (5), with the bone being the most common site of metastasis (6, 7). Bone metastasis (BM) can lead to skeletal-related events (SREs), which often result in complications such as pain, hypercalcemia, spinal cord compression, pathological fractures, and neurological deficits. These complications significantly diminish the patient’s quality of life (8) and adversely affect survival (9), with the 1-year survival rate for patients with bladder cancer bone metastasis (BCBM) as low as 21% (10). The TNM staging system established by the American Joint Committee on Cancer (AJCC) is widely used to predict metastasis risk and prognosis across cancer types (11). However, the TNM system does not account for additional risk factors such as age, gender, and previous treatment history, which have been shown to be valuable in predicting BC metastasis (12, 13). Consequently, the predictive accuracy of the TNM staging system for BM may be limited. Many patients with BC may not receive a timely diagnosis of BM, potentially missing optimal treatment windows and leading to poorer prognosis. Therefore, accurately predicting the occurrence of BM in patients with BC is of great significance.

In recent years, artificial intelligence (AI) models based on machine learning (ML) algorithms have been increasingly integrated into clinical practice (14, 15). As a key branch of AI, ML can automatically extract features from large datasets and construct high-precision prediction models whose performance improves as the algorithms are iteratively optimized.

In medical research, the construction and validation of models based on ML can uncover potential patterns in large clinical datasets, providing valuable tools for early diagnosis and prognosis assessment. ML has been widely applied in the prognostic evaluation of prostate cancer, kidney cancer, and gastrointestinal cancer, as well as in studies of organ metastasis (16, 17). The rapid advancement of health big data in biomedical science has revealed the significant potential of ML applications in understanding disease and in health management (18).

Currently, there are limited studies exploring ML models for the prediction of BCBM. In this study, we evaluated seven ML algorithms and observed that, among them, the gradient boosting machine (GBM) model showed relatively better performance. This study extracted data on patients with BC, as well as their clinical and pathological characteristics, from the Surveillance, Epidemiology, and End Results (SEER) database for the years 2010–2015. Accurate and reliable ML models to predict BCBM were constructed, which could assist clinicians in promptly identifying patients with BM. This approach aims to provide personalized clinical strategies for patients and promote the rational allocation of medical resources.

Methods

Ethics statement

The SEER database is a publicly available, anonymized cancer registry where all patient data have been de-identified. Therefore, this study was exempt from ethics review and patient consent requirements.

Patient selection and variables

All data were extracted from the SEER database using SEER*Stat software (version 8.4.4). This database covers approximately 28% of the US population and includes 17 population-based cancer registries, providing clinicopathological, demographic, and survival outcome information. The case listing was based on the dataset Incidence - SEER Research Data, 17 Registries, Nov 2023 Sub (2000–2021). Subjects with BC were identified using site codes C67.0–C67.9. Patients with histologically confirmed malignant BC diagnosed between 2010 and 2015 were selected. The exclusion criteria were as follows: 1) patients under the age of 18 years; 2) patients with unknown AJCC T or N staging; 3) patients with unknown race or histological grade; 4) patients with unknown bone, brain, liver, or lung metastasis status; 5) patients with unknown radiotherapy or chemotherapy information; and 6) patients with two or more primary tumors. The flowchart for the case screening is shown in Figure 1.

The external validation cohort comprised 327 patients with pathologically confirmed BC diagnosed between 2016 and 2023, among whom 11 developed BM. The final follow-up was completed in November 2024. This study was approved by the Institutional Review Boards of the Second Affiliated Hospital of Nanchang University and Jiangxi Cancer Hospital, with a waiver of informed consent granted.

A total of 13 variables related to patient demographics and clinicopathological characteristics were extracted for analysis. The demographic variables included age, sex, and race, while the clinicopathological variables included tumor histology type, tumor grade, T stage, N stage, radiotherapy, chemotherapy, brain metastasis, BM, lung metastasis, and liver metastasis. Patient age was categorized into three subgroups (<60 years, 60–80 years, and >80 years) and tumor grade into two subgroups. The histological types were classified into transitional cell carcinoma, squamous cell carcinoma, adenocarcinoma, and other types. All cancer patients exhibited histopathological and morphological evidence consistent with the International Classification of Diseases for Oncology, Third Edition (ICD-O-3), and all BC patients were staged according to the AJCC 7th Edition guidelines and the SEER staging information.

Figure 1

Flowchart depicting the selection process for a study on bladder cancer patients from the SEER database (2010–2015). Of 103,482 patients, 81,368 were excluded (multiple tumors; unknown grade, stage, metastatic status, or race; or age under 18 years). The remaining 22,114 patients form the study population and were divided into a training set (n = 15,480) and a test set (n = 6,634). A separate cohort of 327 patients served as the external validation set. All three sets were used to assess the value of the machine learning models.

Figure 1. Study flowchart of case screening.

Data processing and feature engineering

All statistical analyses and data descriptions were conducted using R version 4.4.1 and SPSS version 27. The continuous variable age was converted into a categorical variable, which was then label encoded. Logistic regression analysis was performed in R on the variables collected from the SEER database to identify features suitable for the ML models. Variables significantly associated with BCBM were first identified through univariate logistic regression analysis (p < 0.05). These variables were then entered into a multivariate logistic regression analysis, and the ML models were built using the variables that remained statistically significant (p < 0.05). Correlation analysis was conducted to examine the relationships between the selected variables. To compare the importance of each feature, feature importance in each ML model was computed by permutation importance. Finally, Shapley additive explanations (SHAP) were used to quantify the contribution of each feature to the model predictions and to rank the features, giving decision-makers a transparent and interpretable view of how each feature affects the predicted outcome. Because the dataset is imbalanced, which may affect model performance, the synthetic minority oversampling technique (SMOTE) was applied to the training set to mitigate the impact of class imbalance.
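
The oversampling step can be illustrated with a minimal sketch. The study was run in R and its SMOTE settings are not reported in this section, so the Python code below is only an illustration under assumptions: the function name `smote`, the neighbor count `k`, and the toy rows are hypothetical. SMOTE creates each synthetic minority row by interpolating between an existing minority row and one of its nearest minority neighbors.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between
    a minority row and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical label-encoded rows (T stage, N stage) for BM-positive patients.
bm_positive = [[3.0, 1.0], [4.0, 2.0], [4.0, 1.0], [2.0, 0.0]]
new_rows = smote(bm_positive, n_new=6, k=2)
```

Interpolating label-encoded categories yields non-integer values, so an implementation working on categorical features would need to round them or use a categorical variant of SMOTE. How the original analysis handled this is not stated here.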

Model construction and evaluation

The data from the SEER database were randomly divided into a training set and a test set at a ratio of 7:3. In this study, seven ML algorithms were selected, including three tree-based models [random forest (RF), GBM, and extreme gradient boosting (XGB)]; a linear model (logistic regression, LR); a kernel-based model (support vector machine, SVM); a distance-based model (k-nearest neighbors, KNN); and a neural network (NN). External validation was subsequently conducted to further evaluate the generalizability of the model. The evaluation indicators for the ML algorithms included the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. The ML models were developed using the caret framework in R software. The relevant parameters of the models can be found in Supplementary Table S1.
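
The split and the evaluation indicators can be sketched in a few lines of plain Python. This is an illustration, not the caret code used in the study: the helper names, the seed, the 0.5 decision threshold, and the eight toy patients are all assumptions. The AUC is computed from its rank interpretation, the probability that a randomly chosen BM-positive patient receives a higher score than a randomly chosen BM-negative patient.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle row indices and cut them into training and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_fraction)
    return indices[:cut], indices[cut:]


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall), and specificity from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """Chance that a random positive outscores a random negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# A 7:3 cut of the 22,114 SEER patients.
train_idx, test_idx = split_indices(22114)

# Hypothetical labels (1 = BM) and predicted probabilities for eight patients.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.1, 0.4, 0.3, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
metrics = classification_metrics(y_true, y_pred)
roc_auc = auc(y_true, scores)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality over all thresholds, which is one reason the two can diverge on imbalanced data.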

Results

Patient characteristics and metastasis

A total of 22,114 patients with BC were included in this study. At the time of initial diagnosis, 21,577 patients (97.6%) had no BM, while 537 patients (2.4%) had BM. The patients were randomly divided into a training set (n = 15,480) and a test set (n = 6,634) at a 7:3 ratio. In the external validation cohort, 316 patients (96.6%) showed no evidence of BM, while 11 patients (3.4%) developed BM. The characteristics of all cohorts are presented in Tables 1 and 2.

Table 1

Table 1. Clinical and pathological characteristics of the training and test sets.

Table 2

Table 2. Clinical and pathological characteristics of the external validation set.

Feature filter

A total of 10 independent risk factors related to BM were identified through univariate and multivariate logistic regression analyses. These included age, race, tumor histology, tumor grade, T stage, N stage, radiotherapy, brain metastasis, lung metastasis, and liver metastasis (p < 0.05) (Table 3). Among these, the three risk factors with the largest odds ratios (ORs) were brain metastasis (OR = 5.98, 95%CI = 2.37–15.14), liver metastasis (OR = 5.89, 95%CI = 4.05–8.56), and lung metastasis (OR = 5.87, 95%CI = 4.25–8.09). Based on these features, seven different models were developed in this study using ML algorithms.
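
For readers less familiar with how these figures arise, an odds ratio is the exponential of a logistic regression coefficient, and a Wald 95% confidence interval is symmetric on the log-odds scale. The sketch below assumes the reported intervals are Wald intervals (not stated in the text) and backs out the implied coefficient and standard error for brain metastasis. The function names are hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a log-odds coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_ci(lower, upper, z=1.96):
    """Recover the coefficient and standard error implied by a Wald CI.
    Assumes the interval is symmetric on the log scale."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Brain metastasis: reported OR = 5.98, 95% CI = 2.37 to 15.14.
beta, se = coef_from_ci(2.37, 15.14)
or_point, or_low, or_high = odds_ratio_ci(beta, se)
```

The recovered point estimate lands close to the reported 5.98, which is consistent with a Wald interval. The wide interval for brain metastasis reflects a large standard error, as expected when few patients carry the factor.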

Table 3

Table 3. Univariate and multivariate logistic regression analyses of the variables.

Correlation analysis and feature importance

Spearman’s correlation analysis was used to evaluate the correlation between factors and examine the independence of the data characteristics. As shown in Figure 2, the correlation heatmap illustrates no significant correlation among the 10 variables filtered using logistic regression. Figure 3 displays the importance of the features extracted from the different ML algorithms. Notably, in the majority of the predictive models, T stage consistently emerged as the most influential feature, underscoring its critical role in predicting BM in BC. In contrast, tumor histology, tumor grade, race, and brain metastasis contributed relatively little to the model across most algorithms, with no significant differences in their importance. In the GBM model, the features ranked from the highest to the lowest importance were: T stage, N stage, lung metastasis, radiotherapy, liver metastasis, age, race, tumor histology, tumor grade, and brain metastasis. The SHAP values were then calculated for each variable in the GBM model, with the SHAP bar graph (Figure 4A) illustrating the importance of each feature. The results indicated that T stage, N stage, and radiotherapy are the most significant contributors to the GBM model. Both methods were consistent in identifying T stage and N stage as the top two characteristics, while the bottom four—race, tumor histology, tumor grade, and brain metastasis—were also nearly identical. A summary plot of the SHAP values is presented in Figure 4B, which explains the impact of each feature on the model predictions.
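
The idea behind SHAP can be made concrete with an exact Shapley computation on a toy model. The sketch below is not the SHAP implementation used in the study: the risk function, its coefficients, the patient, and the reference values are invented for illustration, and absent features are replaced by a single reference value, which is one of several possible conventions. Each feature's Shapley value is its average marginal contribution over all orders in which features can be added.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating every feature subset."""
    n = len(x)

    def value(subset):
        # Features in the subset take the patient's value; the rest
        # are held at the reference value.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(rest, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score with a T stage x N stage interaction."""
    t_stage, n_stage, radiotherapy = z
    return (0.02 + 0.03 * t_stage + 0.04 * n_stage
            + 0.02 * radiotherapy + 0.01 * t_stage * n_stage)


patient = [4, 2, 1]    # label-encoded T stage, N stage, radiotherapy
reference = [1, 0, 0]  # assumed reference patient
phi = shapley_values(toy_risk, patient, reference)
```

The values in `phi` sum to the difference between the patient's score and the reference score, which is the property that lets a SHAP bar chart be read as a decomposition of each prediction. Exact enumeration grows exponentially with the number of features, so practical SHAP tools rely on approximations or model-specific shortcuts.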

Figure 2

A correlation matrix visualized with colored and sized circles, representing correlation values between medical variables. Darker and larger circles indicate stronger correlations. Variables include Grade, T stage, N stage, and others. A color scale on the right ranges from negative one to one.

Figure 2. Heat map of the correlation of features.

Figure 3

Seven bar charts compare feature importance scores across different machine learning models: GBM, KNN, Logistic, Neural Network, Random Forest, SVM, and XGBoost. Each chart ranks features such as T stage, N stage, radiotherapy, and metastases by their impact on model predictions. The T stage consistently ranks high across most models, while other features like brain metastasis have lower scores. The arrangement and length of bars visually represent the importance of each feature in the respective models.

Figure 3. Feature importance of the different models.

Figure 4

Panel A is a horizontal bar chart showing the mean SHAP values for various features related to GBM, such as T stage and N stage, with T stage having the highest value. Panel B is a scatter plot with colored points indicating feature values in relation to the SHAP values, depicting the same features. A color gradient from yellow to purple represents high to low feature values.

Figure 4. Interpretability of the gradient boosting machine (GBM) model assessed using the SHAP method. (A) SHAP bar chart showing the importance of each feature based on the mean SHAP values. (B) SHAP summary plot showing the impact of each feature on the model predictions. Individual dots symbolize patients, and different colors represent different levels of influence on the model output. SHAP, Shapley additive explanations.

Model performance and subgroup analysis

Figure 5 and Table 4 present the performance of the seven prediction models. The training set, balanced using SMOTE, was employed to train the models, while the test set was used to evaluate the accuracy and generalization ability of the models. To further validate the generalizability of the GBM model, external validation was performed using an independent cohort. Seven ML models were developed using the identified risk factors. After a comprehensive comparison, the GBM model demonstrated the best predictive value, achieving the highest AUC value of 0.855, along with accuracy, sensitivity (recall), and specificity values of 0.813, 0.733, and 0.815, respectively. The GBM model demonstrated favorable performance in the external validation cohort, achieving an AUC of 0.766 and an accuracy of 0.945 (Supplementary Table S2). The discrepancy between the model accuracy and AUC may be attributed to sample imbalance. Given that only 11 cases of BM were available, the model likely exhibited bias toward the majority class. This results in superficially high accuracy while limiting the model’s ability to identify minority class samples, consequently compromising the AUC performance. The confusion matrices for the GBM model in both the training and test sets are displayed in Figure 6. The predictive performance of the GBM model was compared with that of TNM staging to evaluate whether the model could provide more accurate and clinically meaningful predictions. As shown in Figure 7, the GBM model demonstrated superior performance to TNM staging alone, achieving an AUC of 0.855 compared with the lower AUC of TNM staging. This suggests that the GBM model may better capture features associated with the risk of BM. Stratified analyses of the model predictions were conducted to evaluate its fairness across demographic subgroups (Figure 8). Patients were stratified by gender, race, and age, with the model performance metrics calculated separately for each subgroup. 
The results showed comparable predictive performance between genders (AUC of 0.865 for male vs. 0.831 for female patients). Racial subgroup analysis revealed AUCs of 0.859 (white), 0.781 (black), and 0.847 (other). Age-stratified performance demonstrated AUCs of 0.920 (<60 years), 0.840 (60–80 years), and 0.788 (>80 years). While some inter-subgroup variability was observed, the model maintained clinically acceptable performance across all demographic strata.
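
The gap between accuracy and AUC noted above can be checked with simple arithmetic on numbers already given in this article. The sketch uses the external cohort size (327 patients, 11 with BM) and the rounded test-set percentages from the Figure 6B description. The variable names are illustrative, and because the percentages are rounded the derived values are approximate.

```python
# External validation cohort: 327 patients, 11 of whom developed BM.
n_total, n_bm = 327, 11

# Accuracy of a rule that labels every patient as "no BM".
majority_accuracy = (n_total - n_bm) / n_total

# Test-set confusion matrix in percent of patients (Figure 6B description).
tp, fp, fn, tn = 1.8, 18.1, 0.6, 79.5

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)
# Positive predictive value: share of flagged patients who truly have BM.
ppv = tp / (tp + fp)
```

Under these figures, always predicting "no BM" would already score about 0.966 accuracy in the external cohort, so the reported 0.945 carries little information on its own. In the test set roughly 9% of flagged patients would truly have BM, which is why sensitivity, specificity, and AUC are the more informative indicators when only 2%–3% of patients are positive.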

Figure 5

Three ROC curve graphs labeled A, B, and C compare the performance of various machine learning models: Logistic, SVM, GBM, Neural Network, Random Forest, Xgboost, and KNN. Each curve represents sensitivity versus 1-specificity, with AUC and confidence intervals provided in the legends. Different colors distinguish each model's performance, showing their predictive accuracy across the datasets.

Figure 5. Receiver operating characteristic (ROC) curves of the prognostic models based on machine learning in the training set (A), test set (B), and the external validation set (C).

Table 4

Table 4. Test set predictive performance of the different models.

Figure 6

Two panels, A and B, displaying confusion matrices for prediction versus target outcomes. Panel A: 38.4% true positives, 6.9% false positives, 11.6% false negatives, 43.1% true negatives. Panel B: 1.8% true positives, 18.1% false positives, 0.6% false negatives, 79.5% true negatives.

Figure 6. Confusion matrices of the gradient boosting machine (GBM) model in the training set (A) and the test set (B).

Figure 7

Panel A and B show ROC curves for models predicting outcomes. The y-axis is sensitivity, the x-axis is one minus specificity. In both panels, the GBM model (blue) has higher AUC values compared to T_stage (red) and N_stage (green), with Panel A showing AUCs of 0.908, 0.686, and 0.631, respectively, and Panel B showing AUCs of 0.855, 0.657, and 0.663, respectively. The GBM model consistently outperforms the others.

Figure 7. Performance comparison between the gradient boosting machine (GBM) model and TNM staging alone in both the training set (A) and the test set (B).

Figure 8

Eight ROC curve graphs labeled A to H show sensitivities against one-specificity. Graph A (Male) has an AUC of 0.865, B (Female) 0.831, C (White) 0.840, D (Black) 0.813, E (Other) 0.847, F (<60) 0.838, G (60–80) 0.841, H (>80) 0.784. Each graph includes a diagonal reference line.

Figure 8. Stratified analysis of the gradient boosting machine (GBM) model performance by gender (A, B), race (C–E), and age (F–H) subgroups in the test set.

Discussion

BC is a potentially fatal tumor of the urinary tract that can be classified into non-muscle-invasive bladder cancer (NMIBC), muscle-invasive bladder cancer (MIBC), or clinical metastatic disease (19). The 5-year survival rate for mBC is only 5% (20). Among patients with BM from urogenital cancers, those with BCBM have the worst prognosis (21). The early identification of BM in BC could help improve clinical outcomes, but the available prediction methods have certain limitations. In this study, a GBM model was developed to assess the risk of BM in patients with BC. The model provides individualized risk stratification based on patient-specific characteristics (e.g., age, tumor stage, and histologic subtype), thereby informing personalized clinical decision-making. Therapeutic strategies may then be tailored to the risk category: individuals at high risk might benefit from intensified multimodal regimens combining chemotherapy, immunotherapy, and targeted agents, while patients at low risk could potentially undergo bone imaging surveillance at reduced frequency. Such measures may help alleviate financial burden, enhance quality of life, and mitigate metastasis-related complications.

Currently, the treatment strategies for BC are rapidly evolving. Immunotherapies and targeted therapies have transformed the treatment paradigm, offering broader and more effective therapeutic options for patients. Particularly noteworthy are the latest antibody–drug conjugates (ADCs), which have demonstrated significant benefits in BC (22, 23). The BM prediction model (GBM) developed in this study can provide decision-making support for ADC-based treatment strategies. For patients predicted to be at high risk of BM, we recommend direct adoption of combination therapy with ADCs and immune checkpoint inhibitors (ICIs). Studies have indicated that patients with metastatic predisposition who receive ADC+ICI combination therapy achieve a 1-year disease-free survival (DFS) rate of 97.4%, while the overall pathological downstaging rate reaches 75.5% (24), supporting the advantage of this combined approach.

AI is a research field that utilizes computers to simulate human intelligence, which has been successfully utilized in various domains, including autonomous driving, facial recognition, and music creation (25–27). ML, as a subset of AI, can assist clinicians in making better clinical decisions, thereby improving patient care and overall health (28). Tsai et al. (29) conducted a diagnostic study involving 1,336 patients with cystitis, BC, renal cancer, uterine cancer, and prostate cancer. The authors combined clinical laboratory data with ML methods to establish a diagnostic model for BC. Key indicators included calcium, alkaline phosphatase (ALP), albumin, urinary ketones, urethral occult blood, creatinine, alanine aminotransferase (ALT), and diabetes. Of the five models constructed in the study, LightGBM exhibited the best predictive performance, achieving an AUC value of 0.923 and an accuracy of 87.6%, demonstrating the potential of using clinical laboratory data for cancer detection. Xiong et al. (30) conducted a retrospective study involving 105 patients with BC. By comparing the performance of clinical models, radiomic models, and clinical–radiomic fusion models, the authors found that ML models combining radiomic features with clinical variables could more accurately predict the clinical staging of BC. Liosis et al. (31) developed an elastic net ML prediction model that identified gene markers related to BC treatment response and disease progression and predicted both outcomes in patients. Zheng et al. (32) created an ML algorithm based on pathological sections of MIBC to accurately quantify the tumor–stroma ratio (TSR) in patients. Their study showed a significant correlation between a low TSR and poorer overall survival, providing an automated TSR quantification method that reduces the subjectivity and inter-observer variability associated with traditional visual assessment methods.

Despite significant progress in the construction and utilization of various models for the diagnosis, staging, treatment, and prediction of the prognosis of BC, there remains considerable room for improvement in the development of models that predict BCBM. For instance, Fan et al. (33) constructed a nomogram based on traditional logistic models to predict BCBM, identifying age, lung metastasis, liver metastasis, brain metastasis, N stage, T stage, histological type, pathological grading, primary tumor sites, and race as independent risk factors for BM in patients with BC. That study did not include patients’ previous treatment information, which could be considered in future model refinements. Zhang et al. (10) identified risk factors for BM in patients with BC, including age, race, marital status, T stage, N stage, tumor grading, lung metastasis, liver metastasis, and brain metastasis, but did not construct a corresponding predictive model.

In summary, while previous studies have developed nomogram models based on LR for predicting BM in patients with BC, these traditional approaches may have limitations in handling complex datasets. Our ML-based method offers an alternative approach that could potentially provide additional insights for clinical decision-making (34, 35). The existing prediction models for BCBM have shown varying performance. Identifying the risk factors for BCBM remains important for risk stratification and clinical management. In this study, ML algorithms were applied to analyze potential associations between clinical factors and BCBM risk, with the aim of developing an improved predictive approach (36).

Based on a big data analysis of the SEER database, this study identified independent risk factors related to BM through logistic regression analysis. A total of 12 clinically relevant variables associated with BCBM were included, namely, age, gender, race, tumor histology, tumor grade, T stage, N stage, radiotherapy, chemotherapy, brain metastasis, liver metastasis, and lung metastasis. Using multiple logistic regression analysis, 10 independent risk factors related to BM were identified: brain metastasis, lung metastasis, liver metastasis, radiotherapy, tumor grade, tumor histology, T stage, N stage, race, and age. BC exhibits diverse histological subtypes, including transitional cell carcinoma, squamous cell carcinoma, adenocarcinoma, and other subtypes. These variants demonstrated significant differences in biological behavior and prognostic outcomes (37). In this study, the limited number of BM-positive cases may have precluded comprehensive stratification to fully capture the heterogeneous impact of the histological subtypes on metastatic risk. Nevertheless, SHAP analysis confirmed their non-negligible contribution to the predictive model. Notably, chemotherapy was not identified as an independent risk factor for BM. This may be attributed to its predominant use in the advanced stage or in patients with mBC, who inherently exhibit a higher baseline risk of BM. Consequently, while chemotherapy appeared associated with BM in the univariable analysis, its effect became non-significant in the multivariable analysis after adjusting for T stage, N stage, and the presence of other metastases (e.g., liver/lung). These variables were incorporated into the model, enabling the development of an ML-based predictive tool. Model performance was assessed using standard metrics such as AUC, accuracy, sensitivity, and specificity on the test set. 
The GBM model demonstrated an AUC of 0.855, with a sensitivity of 0.733 and a specificity of 0.815, showing improved predictive capability compared with the other models developed in the study. These results suggest that this model may help identify patients with BC at an increased risk for BM. Furthermore, the subgroup analysis revealed diminished predictive performance of the model in two specific populations: black patients and those aged over 80 years. This reduction may be attributable to data limitations and potential selection biases inherent in the study design. The GBM model, an ensemble learning algorithm, iteratively builds decision trees, each fitted to correct the prediction errors of the trees before it. Its ability to capture complex nonlinear relationships makes it well suited to disease prognosis and risk stratification (38). Using the SHAP method, we determined that T stage, N stage, radiotherapy, age, lung metastasis, and liver metastasis are important predictors of BCBM. Comparing the feature rankings from the ML model with the SHAP analysis results showed that T stage and N stage consistently ranked as the top two features, indicating their significant contribution to model predictions. In addition, four variables (radiotherapy, age, liver metastasis, and lung metastasis) ranked among the top six in importance across both methods, highlighting their value in predicting BCBM. Notably, radiotherapy emerged as a significant risk factor in the multivariate logistic regression analysis and ranked third in the SHAP bar chart, following T stage and N stage. This result may be related to the potential of radiotherapy to alter the tumor microenvironment and disrupt the normal synthesis and folding of endoplasmic reticulum (ER) proteins, thereby promoting tumor aggressiveness and metastatic potential (39).
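
The error-correcting behavior of gradient boosting can be shown with a deliberately small sketch. It is not the model fitted in this study: for brevity it uses squared-error loss, one-split trees (stumps), and a single made-up feature, whereas the study's GBM is a classifier over ten features. All names, the learning rate, and the toy data are assumptions.

```python
def fit_stump(xs, residuals):
    """Find the one-split tree on a single feature with least squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, n_trees=20, learning_rate=0.3):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps, errors = [], []
    for _ in range(n_trees):
        # Residuals are the errors left by the trees built so far.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for x, p in zip(xs, preds)]
        errors.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, stumps, errors


# Hypothetical data: one ordinal feature and an observed risk per level.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0.1, 0.1, 0.2, 0.2, 0.6, 0.7, 0.9, 0.9]
base, stumps, errors = fit_gbm(xs, ys)
```

The list `errors` holds the training mean squared error after each added tree. It never increases from one tree to the next, which is the iterative correction described above. The learning rate shrinks each tree's contribution and is one of the main controls against overfitting.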

This study has several advantages. Firstly, an ML-based prediction model that can accurately predict BCBM was established, offering a more reliable alternative to traditional nomogram prediction models. Secondly, this research further explored the relationships among different independent high-risk factors, providing new directions for future clinical studies. Thirdly, for interpretability, SHAP values were used to show how each feature affected the predictions, helping to explain the model’s behavior. Finally, the generalizability of the model was independently evaluated using an external validation cohort, thereby mitigating potential performance overestimation due to data-splitting bias or overfitting.

However, this study has several limitations. Firstly, as a large retrospective SEER-based study, it may be subject to selection bias, particularly through the exclusion of patients with missing data. These patients might have a higher BM risk or distinct clinical characteristics that the model did not adequately learn, which could compromise the prediction accuracy for such subgroups in clinical practice. Secondly, SEER lacks detailed treatment variables such as chemotherapy regimens and dosages, which reduces the clinical credibility of the predictions and precludes analysis of treatment effects. Future studies should integrate electronic health records (EHRs) with chemotherapy and radiotherapy planning systems. Thirdly, established BC risk factors (e.g., smoking and occupational or environmental exposures) are unavailable in SEER and were therefore excluded from the model, limiting the prediction accuracy. Fourthly, the hospital-reported diagnoses in SEER carry a risk of misclassification: BM may be underreported in asymptomatic patients without confirmatory imaging, and discrepancies between clinical and pathological T/N staging may exist. Fifthly, SEER does not track post-metastasis survival or skeletal-related events (SREs), which hinders assessment of whether early prediction improves outcomes. Although the reliability of the model was validated using AUC, accuracy, sensitivity, and specificity, and its generalizability was confirmed through external validation, its predictive capability remains limited and requires validation in prospective clinical trials. Finally, the external validation dataset exhibits both class imbalance and geographic homogeneity (it originates from a single region), which resulted in performance fluctuations and predictive bias in the external cohort. Furthermore, disproportionate representation across subgroups may contribute to diminished predictive accuracy for specific demographic strata.

With the rapid development of AI technology, the combination of AI with radiomics now plays a significant role in precision medicine (40) and is widely applied in the diagnosis, risk stratification, and treatment of various tumors, including BC, liver cancer, lung cancer, and parotid cancer (41–45). Radiomics contributes to the diagnosis, treatment, and prognosis of patients with BC, enabling timely interventions and thereby improving patients' quality of life (46, 47). In future research, we plan to apply ML in conjunction with radiomics to predict BCBM. We believe that, as AI technology continues to advance, ML will become increasingly prevalent in biomedical science, with substantial potential for clinical translation and for reshaping future medical practice (48–50).

Clinical implementation and challenges

The GBM model developed in this study demonstrated good performance in predicting BM in patients with BC. We plan to implement it as an interactive risk calculator for clinical practice: after a BC diagnosis, the patient's clinical characteristics are entered and the calculator returns a preliminary BM risk score between 0 and 1 (e.g., 0.30 indicates a 30% risk). Patients at high risk would be prioritized for imaging examinations to assist clinical decision-making (see the model card in the Supplementary Material for details). However, several barriers to clinical integration remain. Firstly, clinical data integration is challenging because data are fragmented across different information systems, with inconsistent formats and missing values, which may compromise the quality of the input data. Secondly, a multidisciplinary team of clinicians, data scientists, and other experts is needed to develop implementation strategies, determine risk thresholds, create clinical guidelines and workflows, and obtain regulatory approval and ethical clearance. Thirdly, given the severe consequences of BM and the need for cost-effective care, action thresholds should be established through cost–benefit analysis so that decisions based on model-predicted probabilities minimize the expected cost. In addition, clinicians accustomed to traditional approaches might be skeptical of the new model, questioning its reliability and perceiving it as interfering with clinical autonomy, while the complex algorithm and multiple input features may hinder interpretability and clinician trust. To enable developers, clinicians, regulatory agencies, and other stakeholders to quickly understand the model's applicable scope and potential risks, we have created a model card (see Supplementary Table S3).
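One common way to turn the cost–benefit reasoning above into an action threshold can be sketched as follows. The sketch assumes only two kinds of error matter: referring a patient without BM for imaging (a false positive) and not referring a patient with BM (a false negative). The cost values are hypothetical placeholders, not estimates from this study, and the function names are illustrative. Referral has the lower expected cost when (1 − p) × C_FP ≤ p × C_FN, which gives the threshold p ≥ C_FP / (C_FP + C_FN).

```python
def action_threshold(cost_false_positive, cost_false_negative):
    """Risk at or above which referral has the lower expected cost."""
    if cost_false_positive <= 0 or cost_false_negative <= 0:
        raise ValueError("costs must be positive")
    return cost_false_positive / (cost_false_positive + cost_false_negative)

def expected_cost(risk, refer, cost_false_positive, cost_false_negative):
    """Expected cost of one decision for a patient with predicted risk `risk`."""
    if refer:
        # Referral is wasted only when the patient has no BM (probability 1 - risk).
        return (1.0 - risk) * cost_false_positive
    # Not referring is costly only when BM is present (probability risk).
    return risk * cost_false_negative

def recommend_imaging(risk, cost_false_positive, cost_false_negative):
    return risk >= action_threshold(cost_false_positive, cost_false_negative)

# Hypothetical relative costs: a missed BM is taken to be nine times as costly
# as an unnecessary imaging examination. These numbers are assumptions.
COST_FP = 1.0
COST_FN = 9.0

threshold = action_threshold(COST_FP, COST_FN)
for risk in (0.05, 0.30):
    decision = "refer for imaging" if recommend_imaging(risk, COST_FP, COST_FN) else "routine follow-up"
    print(f"risk={risk:.2f} threshold={threshold:.2f} -> {decision}")
```

Under these assumed costs, a patient with a predicted risk of 0.30 would be referred and a patient with a risk of 0.05 would not. The actual costs, and therefore the threshold, would need to be set by the multidisciplinary team and rely on the model's probabilities being well calibrated.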

Conclusion

In this study, we developed an ML model to predict BM in BC using 10 routinely available clinical features. Among the tested models, the GBM algorithm showed the highest predictive performance, including in the external validation cohort. These results suggest that the GBM model may aid in the clinical assessment of metastasis risk and inform treatment decisions.

Data availability statement

Publicly available datasets were analyzed in this study. These data can be found here: https://seer.cancer.gov/data-software/.

Author contributions

ZY: Conceptualization, Data curation, Methodology, Validation, Writing – original draft. XX: Conceptualization, Data curation, Formal Analysis, Validation, Writing – original draft. XZ: Conceptualization, Formal Analysis, Methodology, Project administration, Supervision, Validation, Writing – review & editing. PS: Conceptualization, Investigation, Software, Supervision, Writing – review & editing. HC: Formal Analysis, Investigation, Project administration, Software, Supervision, Validation, Writing – review & editing. TZ: Funding acquisition, Project administration, Resources, Validation, Visualization, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This research was funded by the National Natural Science Foundation of China (no. 82260598) and the Jiangxi Provincial Academic and Technical Leader Training Program in Major Disciplines (no. 20225BCJ22009).

Acknowledgments

We are extremely grateful to Dr. Huang Jianbiao from Jiangxi Cancer Hospital for providing the clinicopathological data on bladder cancer and for his valuable insights and critical scientific discussions on the research. We are grateful to Xiao Pang for the technical support he provided for this research.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer LC declared a shared parent affiliation with the authors to the handling editor at the time of review.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2025.1653506/full#supplementary-material

References

1. Siegel RL, Giaquinto AN, and Jemal A. Cancer statistics, 2024. CA: A Cancer J Clin. (2024) 74:12–49. doi: 10.3322/caac.21820

2. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer J Clin. (2024) 74:229–63. doi: 10.3322/caac.21834

3. Stellato M, Santini D, Cursano MC, Foderaro S, Tonini G, Procopio G, et al. Bone metastases from urothelial carcinoma. The dark side of the moon. J Bone Oncol. (2021) 31:100405. doi: 10.1016/j.jbo.2021.100405

4. Tran L, Xiao JF, Agarwal N, Duex JE, and Theodorescu D. Advances in bladder cancer biology and therapy. Nat Rev Cancer. (2021) 21:104–21. doi: 10.1038/s41568-020-00313-1

5. Luzzago S, Palumbo C, Rosiello G, Pecoraro A, Deuker M, Tian Z, et al. The effect of radical cystectomy on survival in patients with metastatic urothelial carcinoma of the urinary bladder. J Surg Oncol. (2019) 120:1266–75. doi: 10.1002/jso.25717

6. Tao L, Pan X, Zhang L, Wang J, Zhang Z, Zhang L, et al. Marital status and prognostic nomogram for bladder cancer with distant metastasis: A SEER-based study. Front Oncol. (2020) 10:586458. doi: 10.3389/fonc.2020.586458

7. Shinagare AB, Ramaiya NH, Jagannathan JP, Fennessy FM, Taplin ME, and Van den Abbeele AD. Metastatic pattern of bladder cancer: correlation with the characteristics of the primary tumor. AJR Am J Roentgenol. (2011) 196:117–22. doi: 10.2214/AJR.10.5036

8. Fornetti J, Welm AL, and Stewart SA. Understanding the bone in cancer metastasis. J Bone Miner Res: Off J Am Soc Bone Miner Res. (2018) 33:2099–113. doi: 10.1002/jbmr.3618

9. Selvaggi G and Scagliotti GV. Management of bone metastases in cancer: a review. Crit Rev Oncol Hematol. (2005) 56:365–78. doi: 10.1016/j.critrevonc.2005.03.011

10. Zhang C, Liu L, Tao F, Guo X, Feng G, Chen F, et al. Bone metastases pattern in newly diagnosed metastatic bladder cancer: A population-based study. J Cancer. (2018) 9:4706–11. doi: 10.7150/jca.28706

11. Burke HB. Outcome prediction and the future of the TNM staging system. J Natl Cancer Inst. (2004) 96:1408–9. doi: 10.1093/jnci/djh293

12. Zou XC, Rao XP, Huang JB, Zhou J, Chao HC, and Zeng T. Predicting distant metastasis of bladder cancer using multiple machine learning models: a study based on the SEER database with external validation. Front Oncol. (2024) 14:1477166. doi: 10.3389/fonc.2024.1477166

13. Shi S, Peng G, Luo L, and Li D. Predictive nomograms for risk and prognostic factors in metastatic bladder cancer: a population-based study. Trans Cancer Res. (2023) 12:3284–302. doi: 10.21037/tcr-23-1229

14. Jones OT, Calanzani N, Saji S, Duffy SW, Emery J, Hamilton W, et al. Artificial intelligence techniques that may be applied to primary care data to facilitate earlier diagnosis of cancer: systematic review. J Med Internet Res. (2021) 23:e23483. doi: 10.2196/23483

15. Yin J, Ngiam KY, and Teo HH. Role of artificial intelligence applications in real-life clinical practice: systematic review. J Med Internet Res. (2021) 23:e25759. doi: 10.2196/25759

16. Peng ZH, Tian JH, Chen BH, Zhou HB, Bi H, He MX, et al. Development of machine learning prognostic models for overall survival of prostate cancer patients with lymph node-positive. Sci Rep. (2023) 13:18424. doi: 10.1038/s41598-023-45804-x

17. Wang Z, Xu C, Liu W, Zhang M, Zou J, Shao M, et al. A clinical prediction model for predicting the risk of liver metastasis from renal cell carcinoma based on machine learning. Front Endocrinol. (2023) 13:1083569. doi: 10.3389/fendo.2022.1083569

18. Zhuang Y, Chen YW, Shae ZY, and Shyu C-R. Generalizable layered blockchain architecture for health care applications: development, case studies, and evaluation. J Med Internet Res. (2020) 22:e19029. doi: 10.2196/19029

19. Compérat E, Amin MB, Cathomas R, Choudhury A, De Santis M, Kamat A, et al. Current best practice for bladder cancer: a narrative review of diagnostics and treatments. Lancet. (2022) 400:1712–21. doi: 10.1016/S0140-6736(22)01188-6

20. Patel VG, Oh WK, and Galsky MD. Treatment of muscle-invasive and advanced bladder cancer in 2020. CA: A Cancer J Clin. (2020) 70:404–23. doi: 10.3322/caac.21631

21. Owari T, Miyake M, Nakai Y, Morizawa Y, Itami Y, Hori S, et al. Clinical features and risk factors of skeletal-related events in genitourinary cancer patients with bone metastasis: A retrospective analysis of prostate cancer, renal cell carcinoma, and urothelial carcinoma. Oncology. (2018) 95:170–8. doi: 10.1159/000489218

22. Hu J, Chen J, Ou Z, Chen H, Liu Z, Chen M, et al. Neoadjuvant immunotherapy, chemotherapy, and combination therapy in muscle-invasive bladder cancer: A multi-center real-world retrospective study. Cell Rep Med. (2022) 3:100785. doi: 10.1016/j.xcrm.2022.100785

23. Hu J, Yan L, Liu J, Chen M, He Y, Fan B, et al. Neoadjuvant immunotherapy driven bladder preservation for muscle invasive bladder cancer. iMeta. (2025) 4:e70063. doi: 10.1002/imt2.70063

24. Hu J, Yan L, Liu J, Chen M, Liu P, Deng D, et al. Efficacy and biomarker analysis of neoadjuvant disitamab vedotin (RC48-ADC) combined immunotherapy in patients with muscle-invasive bladder cancer: A multi-center real-world study. iMeta. (2025) 4:e70033. doi: 10.1002/imt2.70033

25. Khatua A, Khatua A, Chi X, and Cambria E. Artificial intelligence, social media and supply chain management: the way forward. Electronics. (2021) 10:2348. doi: 10.3390/electronics10192348

26. Molas G and Nowak E. Advances in emerging memory technologies: from data storage to artificial intelligence. Appl Sci. (2021) 11:11254. doi: 10.3390/app112311254

27. Kikon A and Deka PC. Artificial intelligence application in drought assessment, monitoring and forecasting: a review. Stoch Environ Res Risk Assess. (2022) 36:1197–214. doi: 10.1007/s00477-021-02129-3

28. Bhavsar KA, Singla J, Al-Otaibi YD, Song OY, Zikria YB, and Bashir AK. Medical diagnosis using machine learning: A statistical review. Computers Mater Continua. (2021) 67:107–25. doi: 10.32604/cmc.2021.014604

29. Tsai IJ, Shen WC, Lee CL, Wang DR, and Lin CY. Machine learning in prediction of bladder cancer on clinical laboratory data. Diagn (Basel Switzerland). (2022) 12:203. doi: 10.3390/diagnostics12010203

30. Xiong S, Fu Z, Deng Z, Li S, Zhan X, Zheng F, et al. Machine learning-based CT radiomics enhances bladder cancer staging predictions: A comparative study of clinical, radiomics, and combined models. Med Phys. (2024) 51:5965–77. doi: 10.1002/mp.17288

31. Liosis KC, Marouf AA, Rokne JG, Ghosh S, Bismar TA, and Alhajj R. Genomic biomarker discovery in disease progression and therapy response in bladder cancer utilizing machine learning. Cancers. (2023) 15:4801. doi: 10.3390/cancers15194801

32. Zheng Q, Jiang Z, Ni X, Yang S, Jiao P, Wu J, et al. Machine learning quantified tumor-stroma ratio is an independent prognosticator in muscle-invasive bladder cancer. Int J Mol Sci. (2023) 24:2746. doi: 10.3390/ijms24032746

33. Fan Z, Huang Z, Hu C, Tong Y, and Zhao C. Risk factors and nomogram for newly diagnosis of bone metastasis in bladder cancer. Medicine. (2020) 99:e22675. doi: 10.1097/MD.0000000000022675

34. Naik K, Goyal RK, Foschini L, Chak CW, Thielscher C, Zhu H, et al. Current status and future directions: the application of artificial intelligence/machine learning for precision medicine. Clin Pharmacol Ther. (2024) 115:673–86. doi: 10.1002/cpt.3152

35. Goecks J, Jalili V, Heiser LM, and Gray JW. How machine learning will transform biomedicine. Cell. (2020) 181:92–101. doi: 10.1016/j.cell.2020.03.022

36. Buch VH, Ahmed I, and Maruthappu M. Artificial intelligence in medicine: current trends and future possibilities. Br J Gen Pract: J R Coll Gen Pract. (2018) 68:143–4. doi: 10.3399/bjgp18X695213

37. Claps F, Biasatti A, Di GianFrancesco L, Ongaro L, Giannarini G, Pavan N, et al. The prognostic significance of histological subtypes in patients with muscle-invasive bladder cancer: an overview of the current literature. J Clin Med. (2024) 13:4349. doi: 10.3390/jcm13154349

38. Dash TK, Chakraborty C, Mahapatra S, and Panda G. Gradient boosting machine and efficient combination of features for speech-based detection of COVID-19. IEEE J Biomed Health Inf. (2022) 26:5364–71. doi: 10.1109/JBHI.2022.3197910

39. Nie Z, Chen M, Wen X, Gao Y, Huang D, Cao H, et al. Endoplasmic reticulum stress and tumor microenvironment in bladder cancer: the missing link. Front Cell Dev Biol. (2021) 9:683940. doi: 10.3389/fcell.2021.683940

40. Hosny A, Parmar C, Quackenbush J, Schwartz LH, and Aerts JHWL. Artificial intelligence in radiology. Nat Rev Cancer. (2018) 18:500–10. doi: 10.1038/s41568-018-0016-5

41. Zheng Y, Zhou D, Liu H, and Wen M. CT-based radiomics analysis of different machine learning models for differentiating benign and malignant parotid tumors. Eur Radiol. (2022) 32:6953–64. doi: 10.1007/s00330-022-08830-3

42. Mao B, Zhang L, Ning P, Ding F, Wu F, Lu G, et al. Preoperative prediction for pathological grade of hepatocellular carcinoma via machine learning–based radiomics. Eur Radiol. (2020) 30:6924–32. doi: 10.1007/s00330-020-07056-5

43. Jiang C, Luo Y, Yuan J, You S, Chen Z, Wu M, et al. CT-based radiomics and machine learning to predict spread through air space in lung adenocarcinoma. Eur Radiol. (2020) 30:4050–7. doi: 10.1007/s00330-020-06694-z

44. Wei Z, Xv Y, Liu H, Li Y, Yin S, Xie Y, et al. A CT-based deep learning model predicts overall survival in patients with muscle invasive bladder cancer after radical cystectomy: a multicenter retrospective cohort study. Int J Surg. (2024) 110:2922–32. doi: 10.1097/JS9.0000000000001194

45. Bizzarri FP, Nelson AW, Colquhoun AJ, and Lobo N. Utility of fluorodeoxyglucose positron emission tomography/computed tomography in detecting lymph node involvement in comparison to conventional imaging in patients with bladder cancer with variant histology. Eur Urol Oncol. (2025):S2588931125000975. doi: 10.1016/j.euo.2025.03.019

46. Gavi F, Foschi N, Fettucciari D, Russo P, Giannarelli D, Ragonese M, et al. Assessing trifecta and pentafecta success rates between robot-assisted vs. open radical cystectomy: A propensity score-matched analysis. Cancers. (2024) 16:1270. doi: 10.3390/cancers16071270

47. Palermo G, Bizzarri FP, Scarciglia E, Sacco E, Moosavi Seyed K, Russo P, et al. The mental and emotional status after radical cystectomy and different urinary diversion orthotopic bladder substitution versus external urinary diversion after radical cystectomy: A propensity score-matched study. Int J Urol. (2024) 31:1423–8. doi: 10.1111/iju.15586

48. Hofer IS, Burns M, Kendale S, and Wanderer J. Realistically integrating machine learning into clinical practice: A road map of opportunities, challenges, and a potential future. Anesth Analg. (2020) 130:1115–8. doi: 10.1213/ANE.0000000000004575

49. Liu L, Zhang R, Shi Y, et al. Automated machine learning for predicting liver metastasis in patients with gastrointestinal stromal tumor: a SEER-based analysis. Sci Rep. (2024) 14:12415. doi: 10.1038/s41598-024-62311-9

50. Lee C, Light A, Alaa A, Thurtle D, Van Der Schaar M, and Gnanapragasam VJ. Application of a novel machine learning framework for predicting non-metastatic prostate cancer-specific mortality in men using the Surveillance, Epidemiology, and End Results (SEER) database. Lancet Digit Health. (2021) 3:e158–65. doi: 10.1016/S2589-7500(20)30314-9

Keywords: machine learning, bladder cancer, SEER database, bone metastasis, predictive value

Citation: Yu ZJ, Xu XD, Zou XC, Su PD, Chao HC and Zeng T (2025) Ensemble machine learning models for predicting bone metastasis in bladder cancer. Front. Oncol. 15:1653506. doi: 10.3389/fonc.2025.1653506

Received: 25 June 2025; Accepted: 02 September 2025;
Published: 25 September 2025.

Edited by:

Sanja Stifter-Vretenar, Skejby Sygehus, Denmark

Reviewed by:

Jure Murgic, Sisters of Charity Hospital, Croatia
Jiao Hu, Central South University, China
Luca Ongaro, Royal Free Hospital, United Kingdom
Luyao Chen, The First Affiliated Hospital of Nanchang University, China

Copyright © 2025 Yu, Xu, Zou, Su, Chao and Zeng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Tao Zeng, taozeng40709@sina.com

^†These authors have contributed equally to this work
