Prediction of metastatic risk of renal clear cell carcinoma based on CT radiomics analysis

Wang, Xueyi; Yang, Youchang; Wu, Jiaojiao; Tang, Xiaoqiang; Wang, Yao

doi:10.3389/fonc.2025.1576956

ORIGINAL RESEARCH article

Front. Oncol., 06 June 2025

Sec. Genitourinary Oncology

Volume 15 - 2025 | https://doi.org/10.3389/fonc.2025.1576956

This article is part of the Research Topic: Advancing Cancer Imaging Technologies: Bridging the Gap from Research to Clinical Practice Volume II

Xueyi Wang¹

Youchang Yang²

Jiaojiao Wu³

Xiaoqiang Tang⁴

Yao Wang¹*

¹Department of Radiology, Wujin Hospital Affiliated with Jiangsu University, The Wujin Clinical College of Xuzhou Medical University, Changzhou, China
²Department of Radiology, Qilu Hospital (Qingdao), Cheeloo College of Medicine, Shandong University, Qingdao, China
³Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
⁴Department of Radiology, The Affiliated Changzhou No. 2 People’s Hospital of Nanjing Medical University, Changzhou, China

Objective: To investigate the value of radiomics models for non-invasively assessing the risk of metastasis in patients with clear cell renal cell carcinoma (ccRCC).

Methods: This study retrospectively enrolled 273 ccRCC patients from three hospitals, with 57 cases allocated as an independent test cohort. High-throughput radiomics features (n=2,264) were extracted from triphasic CT (non-enhanced, corticomedullary, and nephrographic phases) using Pyradiomics. Three monophasic radiomics models were developed following dimensionality reduction, with feature contributions quantified via the Shapley Additive exPlanations (SHAP) framework to enhance interpretability. A triphasic radiomics model was subsequently established by ensembling phase-specific prediction probabilities. Metastasis risk factors identified through univariate/multivariate logistic regression informed a clinical predictor model. The final combined model integrated triphasic radiomics signatures with clinical parameters, visualized through a nomogram. Diagnostic performance was evaluated via ROC analysis, while calibration curves validated prediction consistency.

Results: In this study, SHAP analysis revealed that radiomics features quantifying intratumoral heterogeneity (e.g., necrosis patterns in medullary-phase CT) synergized with clinical factors (tumor size >3 cm, creatinine levels) to drive predictions. Key biological insights included threshold effects of necrosis volume (linked to hypoxia) and tumor diameter (critical threshold: 3 cm), aligning with known metastatic pathways. The clinical model achieved an area under the ROC curve (AUROC) of 0.752 (95% confidence interval [CI]: 0.679-0.826) in the training dataset and 0.681 (95% CI: 0.529-0.833) in the testing dataset. Among the single-phase radiomics models, the CT_Medullary model demonstrated good prediction performance, with an AUROC of 0.785 (95% CI: 0.645-0.924) in the testing dataset. The three-phased CT model exhibited improved diagnostic performance, with a testing AUROC rising to 0.812 (95% CI: 0.680-0.943). Notably, the combined model integrating clinical and radiomics features yielded the best prediction, achieving a further improvement in testing AUROC to 0.824 (95% CI: 0.704-0.944).

Conclusion: Radiomics technology provides a quantitative, objective method for predicting the risk of metastasis in patients with ccRCC. Nonetheless, clinical indicators remain irreplaceable.

1 Introduction

Renal cell carcinoma (RCC) accounts for approximately 90% of renal malignancies, with clear cell renal cell carcinoma (ccRCC) being the most common subtype (1, 2). Surgery is the most effective radical treatment, but studies have shown that approximately 30% of ccRCC patients present with local recurrence or metastasis at initial presentation (3, 4). ccRCC does not respond well to radiotherapy and chemotherapy, and the 5-year survival rate for patients with metastatic ccRCC is only 10% (5). Therefore, accurate assessment of the risk of ccRCC recurrence and metastasis after surgery is extremely important for patient prognosis. Currently, the prognosis of patients with RCC is mainly predicted by tumor size, TNM staging system, Fuhrman classification, WHO/ISUP classification, and other clinicopathological features with limited accuracy, and patients with RCC of the same stage and/or pathological classification often have different prognoses (6). Therefore, new markers are urgently needed to improve the efficacy of predicting ccRCC recurrence and metastasis for accurate and personalized clinical decision-making.

Computed tomography (CT) is a widely used non-invasive imaging modality for tumor staging and assessment of tumor aggressiveness in ccRCC patients (7). Radiomics, a promising and emerging technique, enables the transformation of medical images into vast amounts of image-related features that can be analyzed in model-building algorithms (8–10). To date, radiomics has been successfully applied in several areas of RCC, including prediction of Fuhrman grades and response to therapy in ccRCC and discrimination of RCC subtypes. However, most studies have focused on developing models based solely on texture analysis, neglecting the importance of clinical risk factors and radiological features that could improve predictive performance (11–14). It is worth noting that multimodal data fusion is expected to enhance diagnostic performance, probably because information from different modalities can be complementary. Such fusion has already shown excellent capabilities in treatment and prognostic prediction for glioma, ovarian cancer, and breast cancer, among others (15–17).

The purpose of this study was to develop and validate a radiomics nomogram incorporating CT radiological features and clinical factors to predict the risk of metastasis in ccRCC.

2 Materials and methods

2.1 Patients

The clinical and imaging data of patients diagnosed with ccRCC between April 2013 and March 2021 at Shandong University Qilu Hospital, Jinan Campus (Hospital A), Shandong University Qilu Hospital, Qingdao Campus (Hospital B), and Changzhou No. 2 People’s Hospital (Hospital C) were retrospectively reviewed. Patients were categorized based on the presence or absence of metastasis at 3 years postoperatively. Ethical approval was granted by the institutional review boards of the three hospitals to access their clinical and imaging records for this study. Due to the retrospective nature of the research, written informed consent was not required. The inclusion criteria were: (i) pathologically confirmed ccRCC; and (ii) complete patient data, including three-phased CT scans (i.e., non-enhanced, cortical enhanced, and medullary enhanced phases) and laboratory results. The exclusion criteria were: (i) inability to evaluate the patient’s imaging; (ii) incomplete general or laboratory data; and (iii) a history of other malignancies. Ultimately, 273 patients were included in the study, of whom 89 developed metastases.

2.2 Image acquisition and preprocessing

All subjects underwent contrast-enhanced CT scanning. Patients at Hospital A were examined using up to five different CT helical/spiral scanners, including General Electric Medical Systems, Philips, Siemens, Canon Medical Systems, and a Toshiba 512-row detector. Patients at Hospital B were examined using three different CT helical/spiral scanners: a 256-slice CT (GE Revolution CT, GE Healthcare, USA), a dual-source CT (SOMATOM Definition, Siemens Healthineers, Germany), and a dual-source CT (SOMATOM Force, Siemens Healthineers, Germany). Hospital C employed a Definition Flash CT scanner (Siemens, Germany) for the initial examination. The scan range extended from the upper pole to the lower pole of the kidney, encompassing the entire kidney. The scanning parameters employed in the three hospitals included in this study are presented in Supplementary Table 1. Subsequently, the three-phase CT images (non-enhanced, cortical enhanced, and medullary enhanced phases) of each patient were evaluated for quality by two radiologists, and any discrepancies were resolved by a senior radiologist with over two decades of diagnostic experience.

The image preprocessing procedures involved the following steps: (1) pixel resampling to a resolution of 1 × 1 mm²; (2) grey-level normalization using the ± 3 sigma method; and (3) grey-level discretization into 64 distinct levels.
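The grey-level steps (2) and (3) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes that the "± 3 sigma method" means clipping intensities to mean ± 3 standard deviations and rescaling linearly, and that discretization uses a fixed bin count. The function names and the sample intensities are hypothetical, and the 1 × 1 mm² resampling step is not shown.

```python
import statistics

def normalize_3sigma(values):
    """Clip to mean +/- 3 SD, then rescale linearly to [0, 1] (assumed reading of the method)."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    lo, hi = mu - 3 * sd, mu + 3 * sd
    if hi == lo:  # constant region: no contrast to preserve
        return [0.0 for _ in values]
    return [(min(max(v, lo), hi) - lo) / (hi - lo) for v in values]

def discretize(normalized, n_levels=64):
    """Map values in [0, 1] to integer grey levels 1..n_levels (fixed bin count)."""
    return [min(int(v * n_levels) + 1, n_levels) for v in normalized]

# Hypothetical intensities (HU) sampled from one ROI, sorted for readability.
roi_hu = [-20.0, 15.0, 32.0, 48.0, 55.0, 61.0, 90.0, 400.0]
levels = discretize(normalize_3sigma(roi_hu))
```

Because both steps are monotonic, the rank order of voxel intensities is preserved while the dynamic range is compressed to 64 levels, which is what texture matrices such as GLSZM operate on.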

2.3 Tumor delineation

All CT images in DICOM format were imported into ITK-SNAP v3.6.0 (www.itksnap.org) for annotation of ccRCC lesions, maintaining their original size and resolution. The region of interest (ROI) was delineated by two experienced radiologists (R1 and R2) on cortical enhanced phase CT images. The radiologists jointly reviewed the images to define the three-dimensional (3D) ROI covering the entire lesion. Following this, the non-enhanced, medullary enhanced, and cortical enhanced phase images were aligned using rigid registration to correct for any motion between acquisitions. The ROIs delineated on the cortical enhanced phase images were then mapped to the other two phases and reviewed by a radiologist before further radiomics analysis. The accuracy of the registration was assessed using the Dice similarity coefficient (DSC), which measures the similarity between the registered mask of the moving image (transformed original mask) and the reference segmentation on the fixed image. A DSC value of 0.80 or higher was taken to indicate good registration performance.
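For reference, the DSC between two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for masks represented as sets of voxel coordinates; the two block-shaped masks are hypothetical and stand in for a registered mask and a reference segmentation shifted by one voxel.

```python
def dice_coefficient(mask_a, mask_b):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two collections of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:  # two empty masks are treated as identical
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical masks: a 10 x 10 x 5 block and the same block shifted by one voxel in x.
registered = {(x, y, z) for x in range(10) for y in range(10) for z in range(5)}
reference = {(x, y, z) for x in range(1, 11) for y in range(10) for z in range(5)}
dsc = dice_coefficient(registered, reference)  # 2 * 450 / (500 + 500) = 0.9
```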

2.4 Radiomics analysis

The radiomics analysis pipeline, encompassing radiomics feature extraction, feature selection, model construction, and performance evaluation, was conducted using the uAI Research Portal (uRP, United Imaging Intelligence) (18). To improve model interpretability, the Shapley additive explanations (SHAP) method was applied by assigning each feature an importance value in the prediction, providing insights into how the model makes decisions.

2.4.1 Feature extraction and selection

For each imaging modality (i.e., non-enhanced CT, cortical enhanced CT, and medullary enhanced CT), 2,264 radiomic features were extracted from each ROI in compliance with the Image Biomarker Standardization Initiative (IBSI) (19), encompassing first-order statistics, shape-based features, and texture features. Additionally, each participant had 42 clinical characteristics, including demographic information, biological data, and ccRCC characteristics. To ensure the model’s generalizability, feature selection and model construction were conducted on the training dataset and validated on an independent testing dataset. Among the 273 participants, 216 individuals from Hospital A and Hospital B comprised the training dataset, while the remaining 57 participants from Hospital C formed the independent testing dataset.

To select the most valuable radiomics features for constructing three single-phased CT models, feature standardization was initially performed to eliminate the magnitude differences between various features. Only features with intraclass correlation coefficient (ICC) values greater than 0.75 in both intra-observer and inter-observer agreement tests were retained. The feature selection strategy was customized for each model to balance robustness and performance. For the CT_Cortical model, an F-test (P < 0.05) was first applied to exclude statistically non-significant features, followed by least absolute shrinkage and selection operator (LASSO) regression (α = 0.08) to further reduce multicollinearity. For the CT_Medullary and CT_Non-enhanced models, minimum redundancy maximum relevance (mRMR) was employed to directly optimize feature relevance-redundancy trade-offs, as these phases exhibited stronger inter-feature correlations. For the clinical model, univariate logistic regression (P < 0.05) identified statistically significant predictors, and mRMR refined the subset by removing redundant variables. These methods were applied sequentially (not independently), with the workflow for each model optimized through grid search and cross-validation. In accordance with Harrell’s guideline, the number of selected features should not exceed 10% of the size of the smallest group (the metastasis group) in the training dataset, which is equivalent to 10 EPP (events per candidate predictor parameter) (20). Consequently, the final number of features in each constructed model was limited to no more than 7. Detailed parameters in feature selection for models are summarized in Supplementary Table 2.
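The feature ceiling implied by the 10 EPP rule can be checked with a few lines of arithmetic. The sketch below uses the 72 metastatic cases in the training dataset and the per-model feature counts (6, 4, 7, and 4) reported in Section 3; the function name is hypothetical.

```python
def max_features(n_events, events_per_parameter=10):
    """Largest number of candidate parameters allowed under an events-per-parameter rule."""
    return n_events // events_per_parameter

n_metastasis_training = 72  # metastatic cases in the training dataset (Section 3.1)
cap = max_features(n_metastasis_training)  # 72 // 10 = 7

# Feature counts reported in Section 3.2 for the four single-modality models.
selected = {"CT_Cortical": 6, "CT_Medullary": 4, "CT_Non-enhanced": 7, "clinical": 4}
within_cap = {name: n <= cap for name, n in selected.items()}
```

Under this reading, the CT_Non-enhanced model sits exactly at the ceiling of 7 and the other three models fall below it.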

2.4.2 Model construction and validation

Based on the selected features, various data preprocessing techniques were employed for feature standardization, such as Z-score scaler, max_abs scaler, L2 normalization, and quantile transformer. To ensure algorithmic diversity and robustness, six machine learning classifiers were evaluated: random forest (RF), logistic regression (LR), decision tree (DT), Bagging DT, support vector machine (SVM), and partial least squares-discriminant analysis (PLS-DA). These classifiers were selected to represent distinct computational paradigms, such as tree-based, linear, and ensemble methods. Multiple candidate models were generated by combining feature subsets, preprocessing methods, and classifiers. For each classifier, hyperparameters were optimized via grid search on the training dataset using 5-fold cross-validation, with the area under the receiver operating characteristic curve (AUROC) as the optimization metric. The final model for each modality was selected based on the highest cross-validated AUROC in the training dataset and subsequently applied to the testing dataset, ensuring strict separation between model development and validation phases to prevent data leakage.
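The selection loop described above (grid search scored by mean AUROC over stratified 5-fold cross-validation) can be sketched in plain Python. This is an illustration of the procedure only: the study ran it inside the uRP platform, and the one-dimensional k-nearest-neighbour scorer, the synthetic data, and all function names here are stand-ins chosen to keep the example self-contained, not any of the six classifiers listed.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Assign sample indices to k folds, spreading each class evenly across folds."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def knn_score(x, train_x, train_y, k):
    """Toy classifier: fraction of positives among the k nearest training values."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def grid_search_cv(x, y, grid, n_folds=5, seed=0):
    """Return the hyperparameter with the highest mean cross-validated AUROC."""
    folds = stratified_folds(y, n_folds, random.Random(seed))
    results = {}
    for k in grid:
        fold_aucs = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(x)) if i not in held]
            tx, ty = [x[i] for i in train], [y[i] for i in train]
            scores = [knn_score(x[i], tx, ty, k) for i in held_out]
            fold_aucs.append(auroc([y[i] for i in held_out], scores))
        results[k] = sum(fold_aucs) / len(fold_aucs)
    return max(results, key=results.get), results

# Synthetic training data: 30 positives and 60 negatives with one informative feature.
rng = random.Random(42)
y = [1] * 30 + [0] * 60
x = [rng.gauss(1.0, 1.0) if label else rng.gauss(0.0, 1.0) for label in y]
best_k, cv_auc = grid_search_cv(x, y, grid=[3, 7, 15])
```

The same folds are reused for every grid value so that candidates are compared on identical splits, and the held-out testing dataset never enters the loop.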

Four single-modality models were constructed based on selected features from their respective modalities: the CT_Cortical model, CT_Medullary model, CT_Non-enhanced model, and clinical model. RF outperformed the other classifiers for the CT_Cortical and CT_Medullary models, likely due to its inherent noise robustness and suitability for high-dimensional radiomics data. To investigate whether combining information from three-phase CT images could improve predictions, a multi-phased CT model was created by integrating predicted probabilities from the three single-modality CT models and passing these to another classifier. Additionally, the potential of combining radiomics with clinical information was explored by developing a combined model, which integrated the predicted probabilities from both the multi-phased CT model and the clinical model. Each of the six final models was identified as the optimal configuration for its respective input features, balancing performance and generalizability through iterative parameter tuning (Supplementary Table 3).
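The probability-level fusion can be illustrated with a small second-level model that takes the three phase-specific probabilities as its only inputs. The text does not name the second-level classifier at this point, so logistic regression fitted by gradient descent is an assumption made here for illustration; the probabilities, labels, and function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict_proba(w, b, xi):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)

# Hypothetical first-level outputs per patient: [p_cortical, p_medullary, p_non_enhanced].
phase_probs = [
    [0.82, 0.91, 0.70], [0.65, 0.72, 0.58], [0.74, 0.60, 0.81], [0.55, 0.80, 0.66],
    [0.20, 0.15, 0.35], [0.40, 0.30, 0.25], [0.33, 0.45, 0.28], [0.10, 0.22, 0.18],
]
metastasis = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(phase_probs, metastasis)
high = predict_proba(w, b, [0.80, 0.85, 0.75])  # all three phases suggest metastasis
low = predict_proba(w, b, [0.15, 0.20, 0.25])   # all three phases suggest no metastasis
```

The combined model follows the same pattern one level up, with the multi-phased CT probability and the clinical model probability as the two inputs.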

After selecting the optimized model with superior performance and robustness, the model was applied to the testing dataset to validate its generalizability. The receiver operating characteristic (ROC) curve was first plotted, allowing for the quantitative calculation of the AUROC. Similarly, precision-recall (PR) curves, which are suitable for imbalanced samples, were plotted to visualize the discrimination efficiency, and the area under the PR curve (AUPR) was calculated. To quantitatively assess the consistency between the actual labels and predicted categories, an additional five metrics were calculated from confusion matrices: accuracy, sensitivity, specificity, precision, and F1 score. Additionally, calibration curves were employed to compare the predictive outputs with the actual outcomes. Decision curves were utilized to demonstrate the clinical net benefit of multi-modality models.
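The five confusion-matrix metrics follow directly from the four cell counts. In the sketch below, the counts are not taken from the paper's figures: they are inferred from the testing cohort size (57 patients, 17 with metastasis) together with the accuracy of 0.702 and sensitivity of 0.765 reported for the CT_Medullary model in Section 3.2, and should be read as an illustration rather than as reported data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Inferred counts (assumption): 13 of 17 metastatic and 27 of 40 non-metastatic cases correct.
metrics = classification_metrics(tp=13, fp=13, tn=27, fn=4)
```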

2.4.3 Model interpretability with SHAP and nomogram

SHAP (Shapley Additive exPlanations) is a Python library designed to interpret the prediction outcomes of sophisticated machine learning models based on game theory (21). The foundation of SHAP lies in the concept of Shapley values. These values assign an importance score to each feature for a specific prediction, providing a measure of how much each feature contributes to the model’s output, thereby enabling researchers to peek into the “black box” of complex models. The sign of a SHAP value indicates the direction of a feature’s influence: a positive SHAP value means that the feature pushes the prediction higher, while a negative SHAP value means that the feature pushes it lower. In our study, we used the SHAP library to analyze the impact of each factor on the model’s prediction and to explore how these features interact.
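To make the Shapley value concrete, the sketch below computes exact values for a toy three-feature risk function by enumerating all feature subsets, with absent features replaced by a baseline value. This is the textbook definition rather than the SHAP library's own algorithms, which rely on faster model-specific or sampling-based estimators; the risk function, its coefficients, and the patient values are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(x)

    def value(coalition):
        return predict([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_risk(features):
    """Invented risk score with a diameter-necrosis interaction (illustration only)."""
    diameter_cm, necrosis, creatinine = features
    return (0.05 * diameter_cm + 0.20 * necrosis + 0.001 * creatinine
            + 0.02 * diameter_cm * necrosis)

patient = [7.0, 1.0, 110.0]   # hypothetical patient
baseline = [4.0, 0.0, 80.0]   # hypothetical reference values
phi = shapley_values(toy_risk, patient, baseline)
base_value = toy_risk(baseline)
```

The contributions sum to the gap between the patient's prediction and the baseline prediction (`base_value + sum(phi)` equals `toy_risk(patient)`), which is the additive property that force plots visualize.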

Alongside SHAP, we applied a nomogram to visually display how factors interact and contribute to the model’s prediction, making it easier to understand the model’s decision-making process. The nomogram combines different predictors into a single diagram, in which each variable’s contribution to the prediction is represented by a scale, and by aligning values on these scales, users can estimate the outcome.
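A nomogram of this kind is a graphical rescaling of a logistic regression: each predictor's contribution is converted to points so that the most influential predictor spans 0 to 100, and the total points map back to a probability. The sketch below shows that conversion under stated assumptions; the coefficients, intercept, and patient values are hypothetical and are not the fitted values behind the paper's nomograms.

```python
import math

def build_nomogram(coefs, intercept, ranges):
    """Return (points_fn, probability_fn) for a logistic model with the given coefficients."""
    spans = {k: abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs}
    max_span = max(spans.values())
    # Reference value = end of the range where the predictor contributes least.
    refs = {k: ranges[k][0] if coefs[k] >= 0 else ranges[k][1] for k in coefs}
    lp_at_zero_points = intercept + sum(coefs[k] * refs[k] for k in coefs)

    def points(name, value):
        return coefs[name] * (value - refs[name]) / max_span * 100

    def probability(total_points):
        linear_predictor = lp_at_zero_points + total_points * max_span / 100
        return 1 / (1 + math.exp(-linear_predictor))

    return points, probability

# Hypothetical second-level model over the three phase-specific probabilities.
coefs = {"CT_Cortical": 1.8, "CT_Medullary": 3.0, "CT_Non_enhanced": 1.2}
intercept = -3.0
ranges = {name: (0.0, 1.0) for name in coefs}
patient = {"CT_Cortical": 0.6, "CT_Medullary": 0.8, "CT_Non_enhanced": 0.5}

points, probability = build_nomogram(coefs, intercept, ranges)
total = sum(points(name, value) for name, value in patient.items())  # 36 + 80 + 20
nomogram_prob = probability(total)
direct_prob = 1 / (1 + math.exp(-(intercept + sum(coefs[k] * patient[k] for k in coefs))))
```

Reading the nomogram and evaluating the logistic model directly give the same probability, so the diagram adds transparency without changing the prediction.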

2.5 Statistical analysis

Shapiro-Wilk tests were conducted to assess the normality of continuous variables. Continuous variables were expressed as mean ± standard deviation if approximately normally distributed or as median (25th, 75th percentiles) for asymmetric distributions. Categorical variables were presented as counts (proportions). For comparisons between two groups (i.e., metastatic vs. non-metastatic groups), normally distributed continuous variables were analyzed using t-tests and non-normally distributed variables with Mann-Whitney U tests, whereas categorical variables were compared via chi-square tests or Fisher’s exact tests. For comparisons across three cohorts (training, internal validation, and external validation cohorts), one-way ANOVA or Kruskal-Wallis H tests were applied to continuous variables depending on their distribution, and chi-square tests or Fisher’s exact tests were used for categorical variables as appropriate. The classification performance of different models was quantitatively compared using seven metrics: AUROC, AUPR, accuracy, sensitivity, specificity, precision, and F1 score. When comparing the AUROC of multiple-modality models with single-modality models, the AUROC for each model was computed over 1,000 bootstrap resamples using R (fbroc package), and statistical analyses were performed using Kruskal-Wallis H tests followed by Dunnett’s multiple comparisons tests. To qualitatively compare the classification performance and clinical benefit of different models, four visualization figures (ROC curve, PR curve, calibration curve, and decision curve) were generated. All statistical analyses were conducted using SPSS (version 26.0, https://www.ibm.com/spss) and R (version 4.2.2, https://www.R-project.org). A two-tailed p < 0.05 was considered statistically significant.
All figures were created using GraphPad Prism 9 (https://www.graphpad.com/), OriginPro 2021 (https://www.originlab.com/), R (version 4.2.2), and Adobe Illustrator 2023 (https://www.adobe.com/products/illustrator.html).
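The bootstrap step for AUROC can be sketched as follows. The study used the fbroc package in R; the Python version here is only an illustration of the general idea, using a percentile interval over 1,000 resamples, and may differ from fbroc in resampling details. The scores are simulated and the function names are hypothetical; the cohort size (57 patients, 17 events) mirrors the testing dataset.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for AUROC to be defined
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round(n_boot * alpha / 2)]
    upper = stats[round(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

# Simulated predictions for a cohort shaped like the testing dataset.
rng = random.Random(1)
labels = [1] * 17 + [0] * 40
scores = [rng.gauss(0.6, 0.2) if y else rng.gauss(0.4, 0.2) for y in labels]
point_estimate = auroc(labels, scores)
ci_low, ci_high = bootstrap_auroc_ci(labels, scores)
```

With only 17 events, intervals of this kind are wide, which is consistent with the broad testing-set confidence intervals reported in the Results.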

3 Results

3.1 Clinical characteristics of the patients

The patients’ demographic baseline characteristics were summarized in Table 1. There were 273 ccRCC patients (197 men and 76 women), 216 patients in the training and internal validation cohort and 57 patients in the external validation cohort. The incidence of ccRCC metastasis was 33.33% (72 out of 216) and 29.82% (17 out of 57) in the training and testing cohorts, respectively. Detailed clinical variable comparisons across the training, internal validation, and external validation cohorts were provided in Supplementary Table 4. Significant differences were observed in variables such as necrosis, capsule presence, smoking, and drinking habits (all p < 0.05), suggesting potential heterogeneity among cohorts. These differences were accounted for during model development through standardized feature normalization.

Table 1. Patient clinical characteristics.

3.2 Classification performance of single-modality radiomics models

Four single-modality models were constructed using selected features from their respective modalities: the CT_Cortical model, CT_Medullary model, CT_Non-enhanced model, and clinical model. As presented in Supplementary Table 2 and Supplementary Figure 1, the number of selected features for these four models was 6, 4, 7, and 4, respectively. Notably, four clinical features (maximum diameter, creatinine, necrosis, and vascular invasion) were identified as independent risk factors for metastasis in ccRCC and were incorporated into the development of the clinical model. The distribution of the maximum diameter of lesions in the training and testing datasets is illustrated in Supplementary Figure 2.

The classification performance of these single-modality models was illustrated in Figure 1. In the training dataset, the AUROC values for the single-modality radiomics models based on cortical enhanced, medullary enhanced, and non-enhanced phase images were 0.782 (95% confidence interval [CI]: 0.717-0.847), 0.834 (95% CI: 0.779-0.889) and 0.785 (95% CI: 0.723-0.848), respectively. Correspondingly, in the independent testing dataset, the AUROC values were 0.751 (95% CI: 0.620-0.883), 0.785 (95% CI: 0.645-0.924), and 0.724 (95% CI: 0.583-0.866). Notably, the CT_Medullary model demonstrated relatively superior prediction performance. Decision curve analysis further revealed that both the CT_Cortical and CT_Medullary models provided clinical net benefits across threshold probabilities in both training and testing datasets (Supplementary Figure 3). Moreover, the clinical model had an AUROC of 0.752 (95% CI: 0.679-0.826) in the training dataset and 0.681 (95% CI: 0.529-0.833) in the testing dataset. Similar trends were also observed in the PR curves and corresponding AUPR values, with the CT_Medullary model achieving the highest AUPR value. Additionally, calibration curves were employed to assess how well a classification model’s predicted probabilities corresponded to actual outcomes, with a lower Brier score indicating a more accurate model. Furthermore, five other metrics were calculated from the confusion matrices (Table 2). Evidently, the CT_Medullary model had the best predictive performance among the four single-modality models, with an accuracy of 0.702 and a sensitivity of 0.765 in the testing dataset.
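The Brier score mentioned above is the mean squared difference between predicted probabilities and observed outcomes (0 or 1). A minimal sketch with made-up predictions shows why lower is better: confident, correct probabilities score close to 0, while hedged probabilities score higher even when they rank patients identically.

```python
def brier_score(labels, probs):
    """Mean squared error between predicted probabilities and binary outcomes."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)

outcomes = [1, 0, 1, 0]                 # hypothetical observed outcomes
confident = [0.9, 0.1, 0.8, 0.2]        # hypothetical well-calibrated, sharp predictions
hedged = [0.6, 0.4, 0.6, 0.4]           # same ranking, less informative probabilities

brier_confident = brier_score(outcomes, confident)  # (0.01 + 0.01 + 0.04 + 0.04) / 4
brier_hedged = brier_score(outcomes, hedged)        # (0.16 * 4) / 4
```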

Figure 1. Classification performance of four single-modality models. (a) CT_Cortical model, (b) CT_Medullary model, (c) CT_Non-enhanced model, and (d) clinical model. Column 1: Receiver operating characteristic (ROC) curves displayed true positive rate (Y-axis) versus false positive rate (X-axis), with a dashed diagonal representing random classification performance. Column 2: Precision-recall (PR) curves illustrated precision (positive predictive value, Y-axis) against recall (sensitivity, X-axis). Column 3: Calibration curves compared binned predicted probabilities (X-axis) against observed event frequencies (Y-axis), aligned to a perfect-calibration dashed diagonal. Column 4: Confusion matrices used color gradients to show classification outcomes, with larger numbers being darker.

Table 2. Classification performance of six models in the training and testing datasets.

3.3 Model interpretability with SHAP

To improve model interpretability, the SHAP method was used to calculate the Shapley values of each selected feature for the prediction of every observation in single-modality models (Figure 2). Positive SHAP values indicated an increased risk of developing metastasis in ccRCC patients. As shown in Figures 2a-c, the three single-modality radiomics models involved 6, 4, and 7 essential features in the global interpretation. The feature contributing most to distinguishing metastasis from non-metastasis was “recursivegaussian_glszm_ZoneEntropy” in the CT_Cortical model, “log_glszm_log-sigma-0-5-mm-3D-SmallAreaHighGrayLevelEmphasis” in the CT_Medullary model, and “log_glszm_log-sigma-4-0-mm-3D-LargeAreaHighGrayLevelEmphasis” in the CT_Non-enhanced model. In the clinical model, “maximum diameter” exhibited the largest mean absolute SHAP value (Figure 2d). This demonstrated the similarity between radiomics and clinical information, both of which emphasized the importance of tumor size. In the individual interpretation, one patient was chosen from the independent testing dataset to demonstrate how the SHAP method could be applied to explain individual model predictions. The SHAP force plot showed each feature’s positive and negative effects on the predictive outcome in a single case. A predictor’s importance was demonstrated by the size of its arrow, where a larger arrow indicated a more important predictor. The base value represented the average model output before any feature contributions were added, while f(x) represented the final predicted probability for that patient.

Figure 2. SHapley Additive exPlanations (SHAP) analysis of four single-modality models for clear cell renal cell carcinoma (ccRCC) metastasis prediction: (a) CT_Cortical model, (b) CT_Medullary model, (c) CT_Non-enhanced model, and (d) clinical model. Left panel: Summary plots demonstrating directional feature impacts, where horizontal axis SHAP values quantified metastasis risk contribution (rightward = risk-increasing, leftward = risk-reducing), with vertical ordering reflecting global feature importance. Color gradients (purple-to-orange) represented feature values (purple = low, orange = high; e.g., elevated “recursivegaussian_glszm_ZoneEntropy” correlated with increased SHAP impact in the CT_Cortical model). The bar charts on the right side of the summary plots ranked features by mean absolute SHAP values (|SHAP|), where bar length corresponded to cumulative impact magnitude across samples. Right panel: Force plots for a representative metastatic ccRCC testing patient, showing consistent baseline risk (base value) versus model-specific decision trajectories (arrows), with final prediction probabilities (f(x)) differing across models due to modality-specific feature contributions (positive/negative impacts shown in orange/purple).

3.4 Construction and evaluation of multi-modality models

Considering the unique information provided by different imaging modalities, multimodal data can harness complementary information to enhance predictive performance. Inspired by this concept, multi-modality fusion models were constructed by integrating the predictive probabilities derived from multiple single-modality models to boost performance. Initially, the multi-phased CT model was developed by integrating the predicted probabilities from three radiomics models. The model achieved an AUROC of 0.837 (95% CI: 0.782-0.892) in the training dataset and 0.812 (95% CI: 0.680-0.943) in the testing dataset. As illustrated in Figure 3, it demonstrated superior performance compared to any individual single-phase model. The enhanced classification performance may be attributed to the fact that three-phase CT images provide a comprehensive view of the lesion information, enabling a more robust prediction of metastasis. Additionally, to better understand the decision-making process of the model, a nomogram was constructed based on the three probabilities (Figure 3i). For a given patient, every variable corresponded to a point score, and the total points corresponded to the probability of metastasis.

Figure 3. Classification performance and clinical interpretability of the multi-phased CT model. (a) Receiver operating characteristic (ROC) curves demonstrated diagnostic accuracy in training (grey line) and testing (pink line) datasets, with Y-axis representing true positive rate and X-axis representing false positive rate. (b, c) Area under the ROC curve (AUROC) comparison analyses among four models (CT_Cortical, CT_Medullary, CT_Non-enhanced, multi-phased CT) in training (b) and testing (c) datasets, where asterisks (***) indicated statistical significance (p < 0.001). (d) Precision-recall (PR) curves quantified positive predictive value (precision, Y-axis) versus sensitivity (recall, X-axis). (e) Calibration curves assessed agreement between predicted probabilities (X-axis) and observed metastasis frequencies (Y-axis), with dashed diagonal denoting perfect calibration. (f) Decision curve analysis evaluated clinical net benefit (Y-axis) across threshold probabilities (X-axis) in the testing dataset. (g) Confusion matrix used to visualize true vs. predicted classifications in the testing dataset. (h) Radar plot showed quantitative metrics in the testing dataset including AUROC, area under the PR curve (AUPR), sensitivity (SEN), specificity (SPE), accuracy (ACC), precision (PRE), and F1 score. (i) Nomogram integrated predicted probabilities from three single-phase CT models, where vertical red lines marked sample-specific scores for a representative patient, and total points translated to metastasis probability (bottom axis).

To further investigate the complementary ability of radiomics with clinical information, a combined model was constructed by integrating predicted probabilities of the multi-phased CT model with the clinical model. This model achieved an AUROC of 0.969 (95% CI: 0.950-0.988) in the training dataset and 0.824 (95% CI: 0.704-0.944) in the testing dataset (Figure 4). Pairwise DeLong’s tests were performed to statistically validate its superiority over other models. In the training dataset, the combined model showed significantly higher AUROC values compared to all single-modality models and the multi-phased CT model (all p < 0.001). In the testing dataset, while its performance differences against single-modality models were less pronounced, it still demonstrated statistical superiority over the clinical model (p < 0.05). These results, visualized as a comparison heatmap in Supplementary Figure 4, indicated that the three-phase CT radiomics model and the clinical model provide complementary insights into lesion information from various perspectives, thereby enhancing diagnostic performance. Additionally, a nomogram was created that integrated predicted probabilities from four single-modality models, which can help to clarify the decision-making process of the model (Figure 4i).

Figure 4. Classification performance and clinical interpretability of the combined model (multi-phased CT + clinical). (a) Receiver operating characteristic (ROC) curves demonstrated diagnostic accuracy in training (grey line) and testing (purple line) datasets, with Y-axis representing true positive rate and X-axis representing false positive rate. (b, c) Area under the ROC curve (AUROC) comparison analyses among three models (clinical, multi-phased CT, and combined) in training (b) and testing (c) datasets, where asterisks indicated statistical significance (*p < 0.05, ***p < 0.001). (d) Precision-recall (PR) curves quantified positive predictive value (precision, Y-axis) versus sensitivity (recall, X-axis). (e) Calibration curves assessed agreement between predicted probabilities (X-axis) and observed metastasis frequencies (Y-axis), with dashed diagonal denoting perfect calibration. (f) Decision curve analysis evaluated clinical net benefit (Y-axis) across threshold probabilities (X-axis) in the testing dataset. (g) Confusion matrix used to visualize true vs. predicted classifications in the testing dataset. (h) Radar plot showed quantitative metrics in the testing dataset including AUROC, area under the PR curve (AUPR), sensitivity (SEN), specificity (SPE), accuracy (ACC), precision (PRE), and F1 score. (i) Nomogram integrated predicted probabilities from four single-modality models, where vertical red lines marked sample-specific scores for a representative patient, and total points translated to metastasis probability (bottom axis).

4 Discussion

The findings of this study suggest that the unimodal radiomics model, which relies solely on CT image features, and the clinical prediction model, which is based on clinical-radiological features, exhibit limited efficacy in predicting the metastatic risk in patients with ccRCC. However, the integration of radiomics features with clinical data has been demonstrated to significantly enhance the predictive performance. This improvement is mechanistically supported by SHAP (SHapley Additive exPlanations) analysis, which revealed that radiomics features related to intratumoral heterogeneity synergized with clinical factors like tumor size and creatinine levels to drive model predictions. The dominance of radiomics features in SHAP global interpretations underscores their ability to quantify subtle tumor microenvironment characteristics, such as necrosis and hypoxia, which are not fully captured by clinical variables alone. This finding offers valuable insights for the development of personalized treatment strategies for ccRCC patients in clinical practice. Compared to prior studies focusing on single-phase CT or clinical models alone (11–13, 22), our multimodal fusion approach achieved superior performance (testing AUROC: 0.824 vs. 0.68–0.816 in existing literature (22–24)), demonstrating the unique advantage of leveraging complementary information from multiphase CT and clinical-pathological factors.
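To make the attribution idea behind SHAP concrete, the following sketch computes exact Shapley values for a model with a handful of inputs. It is a simplified illustration rather than the SHAP library or the authors’ pipeline: “absent” features are replaced by a single baseline value (SHAP typically averages over background data), and the logistic model, its coefficients, and the three input names are hypothetical.

```python
import math
from itertools import combinations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Enumerates all feature coalitions, so it is only practical for a
    small number of features.
    """
    d = len(x)

    def value(coalition):
        # Features in the coalition take the patient's value, others the baseline
        z = [x[i] if i in coalition else baseline[i] for i in range(d)]
        return predict(z)

    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = math.factorial(k) * math.factorial(d - k - 1) / math.factorial(d)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def risk(z):
    """Hypothetical logistic model: radiomics score, tumor size (cm), creatinine."""
    lp = -6.0 + 3.0 * z[0] + 0.4 * z[1] + 0.02 * z[2]
    return 1.0 / (1.0 + math.exp(-lp))


patient = [0.8, 7.0, 110.0]     # hypothetical patient
reference = [0.3, 4.0, 80.0]    # hypothetical baseline (e.g. cohort averages)
phi = shapley_values(risk, patient, reference)
print([round(v, 4) for v in phi])
print(round(sum(phi), 4), round(risk(patient) - risk(reference), 4))
```

The two numbers on the last line agree because Shapley values are additive: the contributions sum to the difference between the patient’s prediction and the baseline prediction, which is what allows a per-feature reading of a single risk estimate.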

The observed variation in AUC values across imaging modalities likely reflects distinct pathophysiological insights captured during different contrast phases. For instance, the CT_Medullary model outperformed cortical and non-enhanced models (testing AUROC: 0.785 vs. 0.751/0.724), potentially attributable to its enhanced sensitivity to vascular invasion patterns and hypoxia-induced necrosis during the medullary phase—a period when contrast washout highlights tumor microvascular heterogeneity. This aligns with our SHAP analysis identifying medullary-phase features as key contributors, which may quantify focal necrosis clusters associated with aggressive phenotypes.

Ficarra et al. (25) demonstrated that tumor necrosis, as assessed by the Mayo Clinic Staging, Size, Grade, and Necrosis (SSIGN) scoring system, serves as a significant prognostic factor in the clinical management of ccRCC patients. This finding aligns with our study. SHAP dependence plots further elucidated that radiomics signatures linked to necrosis (e.g., high gray-level emphasis features in medullary-phase CT) exhibited threshold effects, mirroring the biological transition to aggressive phenotypes when necrosis exceeds critical levels. Previous research has established that necrosis occurs when tumor cells exhibit higher metabolic activity relative to angiogenesis levels, resulting in inadequate oxygen and nutrient supply (26). Our investigation revealed that tumor size significantly influences the risk of developing distant metastasis in RCC patients, with larger tumors being more likely to develop such metastasis. SHAP analysis quantified this relationship, showing a sharp increase in metastatic risk contribution for tumors exceeding 3 cm—a threshold consistent with Zastrow et al.’s observations (27). These results are consistent with Hutterer et al.’s (28) study, which developed a nomogram to predict RCC distant metastasis and identified tumor size as a critical risk factor.

The association between intravascular tumor thrombus formation and metastatic risk corroborates previous findings. SHAP local explanations highlighted cases where thrombus-related radiomics features overrode contradictory clinical variables, emphasizing the model’s ability to prioritize imaging biomarkers in context-specific scenarios. Additionally, our study indicated that an elevated creatinine level is an independent risk factor for metastasis. The near-linear positive correlation between creatinine SHAP values and metastatic risk aligns with its role as a marker of renal dysfunction, which may promote systemic metabolic dysregulation conducive to tumor dissemination (26). The clinical prediction model was developed using the three identified independent risk factors and achieved AUC values of 0.752 (95% CI: 0.679 to 0.826) in the training dataset and 0.681 (95% CI: 0.529 to 0.833) in the validation dataset.

Capitanio et al. (23) constructed a predictive model for lymph node metastasis (LNM) in kidney cancer, achieving an accuracy of 86.9%. Marconi et al. (24) created prognostic models for survival rates in patients with distant metastases, reporting AUROC values of 0.68 (95% CI: 0.62-0.74) for the preoperative assessment and 0.73 (95% CI: 0.68-0.78) for the postoperative assessment. Bai et al. (22) used MRI images to develop a radiomics nomogram for predicting outcomes in patients with distant metastasis, achieving an AUROC value of 0.816 in the external validation cohort. By comparison, our multimodal fusion model achieved AUROC values of 0.969 (95% CI: 0.950 to 0.988) and 0.824 (95% CI: 0.704 to 0.944) in the training and test sets, respectively, exceeding these benchmarks. SHAP analysis provided critical transparency: it demonstrated how clinical factors (e.g., tumor size) contextualize radiomics patterns (e.g., texture entropy), resolving discrepancies seen in single-modality models. This finding suggests that the integration of multiple sources of information enhances the predictive capability of the model.

Notably, while Bai et al. (22) achieved comparable performance using MRI-based radiomics (external AUROC: 0.816), our CT-based multimodal model offers distinct practical advantages. Firstly, CT remains the first-line imaging modality for ccRCC staging globally, ensuring broader clinical applicability. Secondly, the integration of multiphase CT captures dynamic contrast kinetics, providing insights into tumor angiogenesis and interstitial pressure gradients that MRI cannot replicate. Lastly, our SHAP-driven nomogram enhances interpretability compared to “black-box” models in prior studies, enabling clinicians to weigh imaging vs. clinical factors case-specifically.
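For readers unfamiliar with how a nomogram turns a logistic model into a points scale, the sketch below shows the usual construction: the predictor with the largest possible effect spans 0 to 100 points, the other axes are scaled proportionally, and the total points map back to a probability. The intercept, coefficients, predictor names, and the helper `build_nomogram` are hypothetical and do not reproduce the fitted model in this study.

```python
import math


def build_nomogram(intercept, coefs, ranges):
    """Return (points_fn, probability_fn) for a logistic regression model.

    coefs:  {predictor: coefficient}, with at least one non-zero coefficient
    ranges: {predictor: (min, max)} axis range drawn on the nomogram
    """
    # Predictor value that earns 0 points (the lowest-risk end of each axis)
    zero_ref = {k: (ranges[k][0] if coefs[k] >= 0 else ranges[k][1]) for k in coefs}
    # The largest possible change in the linear predictor is assigned 100 points
    max_span = max(abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs)
    points_per_unit = 100.0 / max_span
    lp_at_zero = intercept + sum(coefs[k] * zero_ref[k] for k in coefs)

    def points(name, value):
        return coefs[name] * (value - zero_ref[name]) * points_per_unit

    def probability(total_points):
        lp = lp_at_zero + total_points / points_per_unit
        return 1.0 / (1.0 + math.exp(-lp))

    return points, probability


# Hypothetical coefficients for four predicted-probability inputs
coefs = {"non_enhanced": 1.0, "corticomedullary": 3.0,
         "nephrographic": 2.0, "clinical": 2.0}
ranges = {name: (0.0, 1.0) for name in coefs}
points, probability = build_nomogram(-4.0, coefs, ranges)

patient = {"non_enhanced": 0.5, "corticomedullary": 0.5,
           "nephrographic": 0.5, "clinical": 0.5}
total = sum(points(name, value) for name, value in patient.items())
print(f"total points = {total:.1f}, predicted risk = {probability(total):.3f}")
```

The points scale is only a rescaling of the model’s linear predictor, so reading the nomogram by hand gives the same probability as evaluating the logistic equation directly.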

Although the results of this study are encouraging, it is crucial to acknowledge its limitations. Firstly, the retrospective nature of the study may introduce selection bias, potentially compromising the accuracy of the prediction model. Therefore, prospective trials are necessary to validate these findings. Secondly, while all patients underwent CT imaging using contrast agents, it remains unclear whether radiomics features extracted from CT images vary based on different contrast agents and if such variations affect the performance of the final model. Thirdly, although SHAP provided interpretability, causality between specific radiomics features and metastatic pathways remains hypothetical; future studies integrating molecular profiling with SHAP-driven hypotheses are needed. Lastly, despite enrolling patients from three hospitals, the sample size is relatively small. Further studies with larger sample sizes are essential to confirm the accuracy and reliability of the model.

In conclusion, the present study introduced a multimodal fusion model that demonstrated robust performance in predicting the risk of metastasis in ccRCC patients. The SHAP framework not only validated the biological plausibility of feature contributions but also bridged the gap between model complexity and clinical interpretability. The utilization of this model by clinicians has the potential to facilitate more informed and precise treatment decisions.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the Ethics Committee of Qilu Hospital, Shandong University (Qingdao). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

XW: Writing – original draft, Writing – review & editing. YY: Data curation, Writing – review & editing. JW: Data curation, Software, Writing – review & editing. XT: Writing – review & editing. YW: Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

Author JW was employed by the company Shanghai United Imaging Intelligence Co., Ltd.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2025.1576956/full#supplementary-material

References

1. Motzer R, Agarwal N, Beard C, Bolger GB, Boston B, Carducci MA, et al. NCCN clinical practice guidelines in oncology: kidney cancer. J Natl Compr Cancer Netw JNCCN. (2009) 7:618–30. doi: 10.6004/jnccn.2009.0043

2. Xing T and He H. Epigenomics of clear cell renal cell carcinoma: mechanisms and potential use in molecular pathology. Chin J Cancer Res Chung-Kuo Yen Cheng Yen Chiu. (2016) 28:80–91. doi: 10.3978/j.issn.1000-9604.2016.02.09

3. Quivy A, Daste A, Harbaoui A, Duc S, Bernhard JC, Gross-Goupil M, et al. Optimal management of renal cell carcinoma in the elderly: a review. Clin Interv Aging. (2013) 8:433–4. doi: 10.2147/CIA.S30765

4. Znaor A, Lortet-Tieulent J, Laversanne M, Jemal A, and Bray F. International variations and trends in renal cell carcinoma incidence and mortality. Eur Urol. (2015) 67:519–30. doi: 10.1016/j.eururo.2014.10.002

5. Kunath F, Schmidt S, Krabbe LM, Miernik A, Dahm P, Cleves A, et al. Partial nephrectomy versus radical nephrectomy for clinical localised renal masses. Cochrane Database Syst Rev. (2017) 5:CD012045. doi: 10.1002/14651858.CD012045.pub2

6. Chen J, Cao N, Li S, and Wang Y. Identification of a risk stratification model to predict overall survival and surgical benefit in clear cell renal cell carcinoma with distant metastasis. Front Oncol. (2021) 11:630842. doi: 10.3389/fonc.2021.630842

7. Ljungberg B, Albiges L, Abu-Ghanem Y, Bensalah K, Dabestani S, Fernández-Pello S, et al. European association of urology guidelines on renal cell carcinoma: the 2019 update. Eur Urol. (2019) 75:799–810. doi: 10.1016/j.eururo.2019.02.011

8. Lambin P, Rios-Velazquez E, Leijenaar R, Carvalho S, van Stiphout RG, Granton P, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer (Oxford Engl 1990). (2012) 48:441–6. doi: 10.1016/j.ejca.2011.11.036

9. Dong D, Zhang F, Zhong LZ, Fang MJ, Huang CL, Yao JJ, et al. Development and validation of a novel MR imaging predictor of response to induction chemotherapy in locoregionally advanced nasopharyngeal cancer: A randomized controlled trial substudy (NCT01245959). BMC Med. (2019) 17:190. doi: 10.1186/s12916-019-1422-6

10. Lambin P, Leijenaar RTH, Deist TM, Peerlings J, de Jong EEC, van Timmeren J, et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. (2017) 14:749–62. doi: 10.1038/nrclinonc.2017.141

11. Shu J, Tang Y, Cui J, Yang R, Meng X, Cai Z, et al. Clear cell renal cell carcinoma: CT-based radiomics features for the prediction of fuhrman grade. Eur J Radiol. (2018) 109:8–12. doi: 10.1016/j.ejrad.2018.10.005

12. Shu J, Wen D, Xi Y, Xia Y, Cai Z, Xu W, et al. Clear cell renal cell carcinoma: machine learning-based computed tomography radiomics analysis for the prediction of WHO/ISUP grade. Eur J Radiol. (2019) 121:108738. doi: 10.1016/j.ejrad.2019.108738

13. Scrima A, Lubner MG, Abel EJ, Havighurst TC, Shapiro DD, Huang W, et al. Texture analysis of small renal cell carcinomas at MDCT for predicting relevant histologic and protein biomarkers. Abdominal Radiol (New York). (2019) 44:1999–2008. doi: 10.1007/s00261-018-1649-2

14. Li Y, Wei D, Liu X, Fan X, Wang K, Li S, et al. Molecular subtyping of diffuse gliomas using magnetic resonance imaging: comparison and correlation between radiomics and deep learning. Eur Radiol. (2022) 32:747–58. doi: 10.1007/s00330-021-08237-6

15. Boehm KM, Aherne EA, Ellenson L, Nikolovski I, Alghamdi M, Vázquez-García I, et al. Multimodal data integration using machine learning improves risk stratification of high-grade serous ovarian cancer. Nat Cancer. (2022) 3:723–33. doi: 10.1038/s43018-022-00388-9

16. Liu H, Chen Y, Zhang Y, Wang L, Luo R, Wu H, et al. A deep learning model integrating mammography and clinical factors facilitates the Malignancy prediction of BI-RADS 4 microcalcifications in breast cancer screening. Eur Radiol. (2021) 31:5902–12. doi: 10.1007/s00330-020-07659-y

17. Kwon BR, Shin SU, Kim SY, Choi Y, Cho N, Kim SM, et al. Microcalcifications and peritumoral edema predict survival outcome in luminal breast cancer treated with neoadjuvant chemotherapy. Radiology. (2022) 304:310–9. doi: 10.1148/radiol.211509

18. Wu J, Xia Y, Wang X, Wei Y, Liu A, Innanje A, et al. uRP: An integrated research platform for one-stop analysis of medical images. Front Radiol. (2023) 3:1153784. doi: 10.3389/fradi.2023.1153784

19. Zwanenburg A, Vallières M, Abdalah MA, Aerts HJWL, Andrearczyk V, Apte A, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. (2020) 295:328–38. doi: 10.1148/radiol.2020191145

20. Riley RD, Ensor J, Snell KIE, Harrell FE Jr, Martin GP, Reitsma JB, et al. Calculating the sample size required for developing a clinical prediction model. BMJ. (2020) 368:m441. doi: 10.1136/bmj.m441

21. Michalski A, Duraj K, and Kupcewicz B. Leukocyte deep learning classification assessment using Shapley additive explanations algorithm. Int J Lab Hematol. (2023) 45:297–302. doi: 10.1111/ijlh.14031

22. Bai X, Huang Q, Zuo P, Zhang X, Yuan J, Zhang X, et al. MRI radiomics-based nomogram for individualised prediction of synchronous distant metastasis in patients with clear cell renal cell carcinoma. Eur Radiol. (2021) 31:1029–42. doi: 10.1007/s00330-020-07184-y

23. Capitanio U, Abdollah F, Matloob R, Suardi N, Castiglione F, Di Trapani E, et al. When to perform lymph node dissection in patients with renal cell carcinoma: a novel approach to the preoperative assessment of risk of lymph node invasion at surgery and of lymph node progression during follow-up. BJU Int. (2013) 112:E59–66. doi: 10.1111/bju.12125

24. Marconi L, de Bruijn R, van Werkhoven E, Beisland C, Fife K, Heidenreich A, et al. External validation of a predictive model of survival after cytoreductive nephrectomy for metastatic renal cell carcinoma. World J Urol. (2018) 36:1973–80. doi: 10.1007/s00345-018-2427-z

25. Ficarra V, Novara G, Galfano A, Brunelli M, Cavalleri S, Martignoni G, et al. The ‘Stage, Size, Grade and Necrosis’ score is more accurate than the University of California Los Angeles Integrated Staging System for predicting cancer-specific survival in patients with clear cell renal cell carcinoma. BJU Int. (2009) 103:165–70. doi: 10.1111/j.1464-410X.2008.07901.x

26. Shinagare AB, Krajewski KM, Braschi-Amirfarzan M, and Ramaiya NH. Advanced renal cell carcinoma: role of the radiologist in the era of precision medicine. Radiology. (2017) 284:333–51. doi: 10.1148/radiol.2017160343

27. Zastrow S, Phuong A, von Bar I, Novotny V, Hakenberg OW, and Wirth MP. Primary tumour size in renal cell cancer in relation to the occurrence of synchronous metastatic disease. Urol Int. (2014) 92:462–7. doi: 10.1159/000356325

28. Hutterer GC, Patard JJ, Jeldres C, Perrotte P, de La Taille A, Salomon L, et al. Patients with distant metastases from renal cell carcinoma can be accurately identified: external validation of a new nomogram. BJU Int. (2008) 101:39–43. doi: 10.1111/j.1464-410X.2007.07170

Keywords: clear cell renal cell carcinoma, CT, predicted, radiomics, metastasis

Citation: Wang X, Yang Y, Wu J, Tang X and Wang Y (2025) Prediction of metastatic risk of renal clear cell carcinoma based on CT radiomics analysis. Front. Oncol. 15:1576956. doi: 10.3389/fonc.2025.1576956

Received: 14 February 2025; Accepted: 19 May 2025;
Published: 06 June 2025.

Edited by:

Abhishek Mahajan, The Clatterbridge Cancer Centre, United Kingdom

Reviewed by:

Guozheng Zhang, Quzhou City People’s Hospital, China
Ali Hajj Ali, Dana–Farber Cancer Institute, United States

Copyright © 2025 Wang, Yang, Wu, Tang and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yao Wang