
ORIGINAL RESEARCH article

Front. Oncol., 15 October 2025

Sec. Genitourinary Oncology

Volume 15 - 2025 | https://doi.org/10.3389/fonc.2025.1640159

This article is part of the Research Topic: Leveraging Artificial Intelligence for Biomarker Discovery in Prostate Cancer

Multimodal integration of [18F]PSMA-1007 PET/CT semiquantitative parameters and clinicopathological data for predicting prostate cancer metastasis

JiaYing Yang1†, ZhiLong Ma1†, HaiTong Hao2, Jian Chen3, ZhiYong Lv4*‡, Qian Zhao1*‡, YanMei Li1*‡
  • 1Nuclear Medicine Department, General Hospital of Ningxia Medical University, Yinchuan, Ningxia, China
  • 2College of Basic Medical Sciences, Ningxia Medical University, Yinchuan, Ningxia, China
  • 3Nuclear Medicine Department, International Medical Center Hospital, Xi'an, Shaanxi, China
  • 4Urinary Surgery Department, General Hospital of Ningxia Medical University, Yinchuan, Ningxia, China

Background: Prostate cancer is one of the most prevalent malignant tumors of the male genitourinary system. The occurrence of metastasis significantly influences treatment strategies and prognosis. However, current risk assessments for metastatic disease primarily rely on single imaging or pathological indicators, which are often limited by suboptimal accuracy and considerable individual variability.

Objective: This study aimed to develop a high-performance predictive model for prostate cancer metastasis by integrating semiquantitative parameters from [18F]PSMA-1007 PET/CT with key clinicopathological features.

Methods: We retrospectively analyzed data from prostate cancer patients, including PSMA PET/CT-derived features (SUVmax, SUVmean, PSMA-TVp, TL-PSMAp) and clinicopathological variables (age, tPSA, Gleason score). Five machine learning algorithms (Logistic Regression, Support Vector Machine, Random Forest, Naive Bayes, and XGBoost) were evaluated for metastasis prediction performance. Model performance was assessed using accuracy, sensitivity, precision, and area under the ROC curve (AUC). Shapley additive explanations (SHAP) were applied to interpret the most effective model.

Results: Among the five algorithms, the XGBoost model achieved an accuracy of 90.32%, sensitivity of 90.0%, specificity of 94.74%, and an AUC of 0.8977. SHAP analysis identified PSMA-TVp and TL-PSMAp as the most important predictors, followed by SUVmax, tPSA, and Gleason score. These findings highlight the key role of PSMA-derived tumor burden in metastasis prediction. Force plots further revealed the individual-level contributions of features, supporting the model’s clinical interpretability.

Conclusion: The XGBoost-based multimodal model integrating PET/CT semiquantitative parameters with clinicopathological data demonstrated excellent accuracy and interpretability in predicting prostate cancer metastasis. This approach has strong potential for clinical application and may provide a valuable tool for personalized treatment decision-making.

1 Introduction

Prostate cancer (PCa) is one of the most common malignancies among men worldwide and remains a leading cause of cancer-related death, ranking sixth in global male cancer mortality (1). With population aging, the incidence of prostate cancer continues to rise annually, posing a significant challenge to global public health. Clinical studies have shown that patients with metastatic prostate cancer have a markedly reduced 5-year survival rate of approximately 31%, substantially lower than that of patients with localized disease (2). However, the biological behavior of prostate cancer is highly heterogeneous, leading to vastly different progression trajectories and therapeutic responses among patients even at the same clinical stage (2–4). Therefore, early and accurate identification of patients at high risk of metastasis is critical for improving treatment outcomes and prolonging survival, and is of great clinical importance for precision diagnosis and therapy in prostate cancer.

The total prostate-specific antigen (tPSA) can be used for prostate cancer risk stratification and prediction of distant metastasis; however, its specificity is limited, which may lead to unnecessary prostate biopsies in some patients (5). Magnetic Resonance Imaging (MRI) has played a significant role in improving the detection rate and local staging of prostate cancer. Nevertheless, it may still miss approximately 20% of clinically significant cancers and has limited sensitivity and specificity in detecting lymph node metastases (6).

By contrast, prostate-specific membrane antigen (PSMA) is a transmembrane glycoprotein that is highly overexpressed in prostate cancer cells, particularly in advanced or castration-resistant stages. PSMA-targeted PET/CT imaging has demonstrated outstanding sensitivity and specificity in the diagnosis, staging, recurrence detection, and treatment evaluation of prostate cancer (6–10). Compared to conventional imaging techniques, PSMA PET/CT offers significant advantages in detecting small lesions and identifying recurrent disease even in cases with low tPSA levels, thus providing a reliable basis for precision therapy (11, 12). In addition, PSMA PET/CT enables the acquisition of multiple semiquantitative parameters that reflect tumor PSMA expression and volumetric characteristics, such as maximum standardized uptake value (SUVmax), mean standardized uptake value (SUVmean), prostate PSMA-tumor volume (PSMA-TVp), and prostate total lesion PSMA (TL-PSMAp). These quantitative metrics provide an objective basis for evaluating tumor aggressiveness and metastatic potential. Specifically, SUVmax indicates the peak uptake within the most active part of the lesion and is easy to obtain and compare across patients. However, it neglects tumor volume and heterogeneity, potentially underestimating tumor burden. SUVmean reflects the average PSMA ligand uptake across the lesion, offering insight into overall PSMA expression, although it varies with the definition of the volume of interest (VOI). PSMA-TVp and TL-PSMAp, as composite indicators of tumor burden, quantify the PSMA-avid tumor volume and the total PSMA uptake of the prostate lesion, respectively. These parameters offer a more comprehensive assessment of the tumor’s global PSMA ligand uptake and biological behavior and have been reported in numerous studies to be closely associated with tumor staging, metastasis, and prognosis (13–16).
However, in current clinical practice, interpretation of these imaging parameters largely relies on empirical assessment, lacking systematic analysis and quantitative predictive methodologies. This limitation hinders their full potential in personalized metastatic risk stratification.

With the rapid advancement of artificial intelligence, particularly machine learning (ML) techniques, there is now a promising opportunity to build personalized risk prediction models by integrating multimodal medical data (17–19). Machine learning algorithms excel in handling high-dimensional, multi-variable, and non-linear data relationships, and have shown great success in early detection, metastasis prediction, and prognosis evaluation across various solid tumors, such as breast and lung cancer (20–23). In the context of prostate cancer, combining machine learning with PSMA PET/CT-derived semiquantitative metrics, clinical features, and pathological data could enable the development of high-performance predictive models, allowing for early identification of high-risk patients and supporting more precise stratified management and clinical decision-making.

In this study, we propose a multimodal data fusion strategy that integrates clinicopathological features (e.g., age, Gleason score) and PSMA PET/CT semiquantitative metrics (e.g., SUVmax, PSMA-TVp) to construct a machine learning-based predictive model for assessing the risk of metastasis in prostate cancer patients. The goal is to enhance the early identification of high-risk individuals, facilitate personalized treatment planning, and provide an intelligent decision-support tool for risk stratification. Ultimately, this approach aims to shift the paradigm of prostate cancer management from empirical judgment to data-driven precision medicine, with significant clinical value and application potential.

2 Materials and methods

2.1 Study population

This retrospective study included a total of 295 patients with histologically confirmed prostate cancer (PCa) who underwent [18F]PSMA-1007 PET/CT imaging at our institution between January 2020 and February 2022. Inclusion criteria were: (1) complete clinical data; (2) transrectal ultrasound-guided prostate biopsy or radical prostatectomy with definitive pathological diagnosis. Exclusion criteria were: (1) presence of other malignancies; (2) an interval of more than one month between serum total prostate-specific antigen (tPSA) testing, pathological biopsy, and [18F]PSMA-1007 PET/CT imaging; (3) severe hepatic or renal dysfunction; (4) prior anti-tumor treatment before imaging; (5) lack of PSMA uptake in the primary tumor. After screening, 101 patients met all criteria and were included in the final analysis. The selection process is shown in Figure 1.

Figure 1
Flowchart depicting patient selection and exclusion criteria for a study. It starts with 295 scans from 295 patients. Exclusions were due to lack of definitive results (25 scans), other cancer diagnoses (17 scans), diagnoses of hyperplasia or prostatitis (68 scans), testing interval exceeding a month (49 scans), and prior treatments or history (35 scans). This results in 101 scans, split into 70 for training and 31 for validation.

Figure 1. Flowchart illustrating the inclusion and exclusion criteria for patient selection in the study.

All study procedures were conducted in accordance with the Declaration of Helsinki and were approved by the hospital ethics committee (Approval Nos. 2020–083 and 2020-876). Written informed consent was obtained from all participants.

2.2 Image acquisition

PET/CT imaging was performed using a GE Discovery VCT scanner (64-slice CT), with routine quality control to ensure stable performance. [18F]PSMA-1007 was synthesized using the PET-IFB-X5 automated module from Shaanxi Zhengze Biotechnology Co., Ltd., and its radiochemical purity was confirmed by high-performance liquid chromatography to be ≥95%. The injected dose of [18F]PSMA-1007 was 4.0 MBq/kg. Whole-body PET/CT scans were performed 60–90 minutes post-injection. Spiral CT scans were first acquired from the skull vertex to mid-thigh with the following parameters: 140 kV tube voltage, 150 mA tube current, 0.875 mm pitch, 3.75 mm slice thickness, and a 512×512 matrix. Subsequently, PET images were acquired over the same range using a 3D acquisition mode with a 128×128 matrix, 2.5 minutes per bed position, and 6–7 bed positions in total. All PET images were attenuation-corrected using the corresponding CT data and reconstructed for image fusion and further analysis.

2.3 Image interpretation

All images were independently reviewed in a double-blind manner by two board-certified nuclear medicine physicians with extensive diagnostic experience. Discrepancies were resolved through joint discussion to reach a consensus diagnosis.

On visual inspection, lesions showing focal PSMA uptake higher than the surrounding normal tissue in the prostate were considered positive. A circular region of interest (ROI) was manually drawn on axial images around the lesions, and the positive volume was delineated using a fixed threshold method set at 40% of the SUVmax. The maximum standardized uptake value (SUVmax), the mean standardized uptake value (SUVmean), PSMA-TVp and TL-PSMAp were recorded for each lesion.
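The fixed-threshold delineation described above can be sketched in a few lines. This is only an illustration, not the workstation software used in the study: the flat list of voxel SUVs and the voxel volume are hypothetical inputs, and the sketch assumes the common convention that total lesion PSMA equals SUVmean multiplied by PSMA tumor volume, which the text does not spell out. Whether voxels exactly at the threshold are included is also an assumption.

```python
def psma_parameters(roi_suv, voxel_volume_ml, fraction=0.40):
    """Semiquantitative parameters from the voxel SUVs inside a manually drawn ROI.

    roi_suv and voxel_volume_ml are hypothetical inputs used for illustration.
    """
    suv_max = max(roi_suv)
    threshold = fraction * suv_max
    # Fixed-threshold delineation: keep voxels at or above 40% of SUVmax.
    lesion = [v for v in roi_suv if v >= threshold]
    suv_mean = sum(lesion) / len(lesion)
    psma_tv = len(lesion) * voxel_volume_ml  # PSMA-avid volume in mL
    tl_psma = suv_mean * psma_tv             # assumed definition: SUVmean x PSMA-TV
    return {
        "SUVmax": suv_max,
        "SUVmean": suv_mean,
        "PSMA-TV": psma_tv,
        "TL-PSMA": tl_psma,
    }


# Five made-up voxel values; the threshold is 4.0, so three voxels remain.
params = psma_parameters([1.0, 2.0, 5.0, 8.0, 10.0], voxel_volume_ml=0.5)
```

Because the threshold depends on SUVmax, a single hot voxel changes both the delineated volume and the derived SUVmean, which is one reason volume-based parameters are sensitive to how the ROI is drawn.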

Criteria for Lymph Node Metastasis: On [18F]PSMA-1007 PET/CT, focal abnormal radiotracer uptake outside of physiological uptake regions (e.g., salivary glands, liver, gallbladder, prostate, kidneys, intestines) was interpreted as metastatic unless located in known false-positive sites such as axillary, mediastinal, or inguinal lymph nodes. The number of metastatic lymph nodes and their corresponding SUVmax, SUVmean, PSMA-TVp and TL-PSMAp were recorded.

Criteria for Bone Metastasis: Focal areas of increased PSMA uptake in bone were considered metastatic if they could not be attributed to fractures, degenerative changes, or other benign bone conditions (24).

Final diagnoses were established based on histopathological findings from surgery or biopsy when available, or through clinical follow-up. For lesions not amenable to tissue diagnosis (e.g., bone or distant metastases), a comprehensive judgment was made based on synchronous imaging findings and clinical follow-up data.

2.4 Statistical analysis

All statistical analyses were conducted using Python 3.10. Continuous variables were compared using either Student’s t-test or the Mann–Whitney U test, depending on data distribution. Categorical variables were compared using Pearson’s χ² test or Fisher’s exact test. The dataset was randomly divided into training and validation sets at a ratio of 7:3. For classification model evaluation, receiver operating characteristic (ROC) curves were plotted using probability scores ranging from 0 to 1, and the area under the curve (AUC) was calculated to assess discriminatory performance. A p-value < 0.05 was considered statistically significant.
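As a rough illustration of what the AUC measures, the sketch below computes it directly from predicted probabilities as the chance that a randomly chosen metastatic case receives a higher score than a randomly chosen non-metastatic case, which is equivalent to the area under the ROC curve. The labels and scores are invented for the example, and the function name is hypothetical; the article does not state which library routine was used.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # tied scores count as half a win
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = metastasis, 0 = no metastasis.
y_true = [1, 1, 1, 0, 0]
y_prob = [0.9, 0.8, 0.3, 0.4, 0.1]
auc = roc_auc(y_true, y_prob)  # 5 of 6 positive/negative pairs are ordered correctly
```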

3 Results

3.1 Comparison of basic patient information and clinicopathological characteristics

A total of 101 patients with prostate cancer (PCa) were retrospectively enrolled in this study, with a mean age of 68 years. Among them, 97.03% were diagnosed with acinar adenocarcinoma, while the remaining subtypes included one case each of signet ring cell carcinoma, intraductal carcinoma, and foamy gland adenocarcinoma. As summarized in Table 1, the Gleason scores ranged from 6 to 10, with 60.39% (61/101) scoring greater than 8. The total prostate-specific antigen (tPSA) levels ranged from 5.42 to 100.0 ng/mL, with 59.41% (60/101) ≥20 ng/mL. Among the cohort, 34 patients showed no evidence of metastasis, while 67 patients had confirmed metastatic disease, including 53 cases of lymph node metastasis, 52 cases of bone metastasis, and 8 cases of visceral metastasis, all of which were pulmonary.

Table 1. Baseline feature distribution and dataset partitioning.

Figure 2 illustrates the distribution of each feature in the metastasis and non-metastasis groups. The metastasis group had significantly higher tPSA, Gleason score, SUVmax, SUVmean, PSMA-TVp, and TL-PSMAp than the non-metastasis group (all p < 0.05). However, there was no statistically significant difference in age between the two groups (p = 0.096). For details, see Supplementary Table 1.

3.2 Performance evaluation of machine learning models

To ensure the scientific rigor and effectiveness of model training, the dataset was randomly divided into a training set and a validation set at a 7:3 ratio. The training set included 70 patients, while the validation set included 31 patients. Among the training cohort, 47 patients (67.14%) presented with metastatic disease; in the validation cohort, 20 patients (64.52%) had confirmed metastases. To assess feature distribution balance, we performed homogeneity tests on clinical, pathological, and imaging characteristics between the two cohorts. The results demonstrated no statistically significant differences between the training and validation sets in terms of age, tPSA, Gleason score, SUVmax, SUVmean, PSMA-TVp, and TL-PSMAp (p > 0.05), as shown in Table 1. These findings confirm that the two groups were well-balanced across key features, thereby minimizing potential bias introduced by data partitioning during model development and evaluation.
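A minimal sketch of a random 7:3 split that reproduces the reported cohort sizes is shown below. The seed, the function name, and the rule of rounding the validation share up are assumptions made for illustration; the article does not describe these details.

```python
import math
import random


def split_7_3(patient_ids, seed=0):
    """Randomly split identifiers into (training, validation) at roughly 7:3."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)       # seeded so the split is reproducible
    n_val = math.ceil(0.3 * len(ids))      # 101 patients -> 31 in validation
    return ids[n_val:], ids[:n_val]


train_ids, val_ids = split_7_3(range(101))
```

With 101 patients this yields 70 training and 31 validation cases. A purely random split does not guarantee similar metastasis rates in the two subsets, which is why a homogeneity check such as the one reported here is a sensible follow-up.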

In the training cohort, five commonly used classification algorithms (Logistic Regression, Support Vector Machine (SVM), Random Forest, Extreme Gradient Boosting (XGBoost), and Naive Bayes) were employed to construct predictive models. All models were trained using the same set of input features, with the presence or absence of metastasis serving as the output label. In the validation cohort, model performance was evaluated using multiple metrics. Receiver Operating Characteristic (ROC) curves were plotted, and the Area Under the Curve (AUC) was calculated to assess classification performance. In addition, accuracy, sensitivity, and specificity were used as complementary evaluation metrics. The ROC curves for each model are shown in Figure 3A. To further quantify the overall performance of each model across multiple metrics, a Composite Score was introduced, calculated as follows: Composite Score = 0.4 × AUC + 0.3 × F1 score + 0.1 × Accuracy + 0.1 × Sensitivity + 0.1 × Specificity. This score integrates both discriminative power and classification effectiveness into a single weighted summary. Figure 3B shows the radar chart of the six major performance indicators, highlighting the superior overall performance of the XGBoost model.
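The metrics and the weighted Composite Score can be written out as a short sketch. The confusion-matrix counts and the AUC value below are hypothetical and are not taken from the study; the function names are also illustrative.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard metrics from confusion-matrix counts (metastasis = positive class)."""
    sensitivity = tp / (tp + fn)                  # true positive rate
    specificity = tn / (tn + fp)                  # true negative rate
    precision = tp / (tp + fp)                    # positive predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


def composite_score(auc, f1, accuracy, sensitivity, specificity):
    """Weighted sum with the weights stated in the text (0.4, 0.3, 0.1, 0.1, 0.1)."""
    return (0.4 * auc + 0.3 * f1 + 0.1 * accuracy
            + 0.1 * sensitivity + 0.1 * specificity)


# Hypothetical counts and AUC, chosen only to make the arithmetic easy to follow.
m = classification_metrics(tp=40, fn=10, fp=10, tn=40)
score = composite_score(0.85, m["f1"], m["accuracy"],
                        m["sensitivity"], m["specificity"])
```

Note that specificity (TN / (TN + FP)) and precision (TP / (TP + FP)) share the same false-positive count but use different denominators, so they generally differ whenever the two classes are unequal in size.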

Figure 2
Seven violin plots labeled A to G compare variables against metastasis status: N (no) and Y (yes). Plots depict distribution differences for Age, Gleason Score, tPSA, SUVmax, SUVmean, PSMA-TVp, and TL-PSMAp. Each plot shows distinct shapes and spreads, indicating variable measurements related to metastasis presence or absence.

Figure 2. Distribution of individual features between metastatic and non-metastatic prostate cancer patients. (A–G) Violin plots showing the distribution of key features (age, tPSA, Gleason score, SUVmax, SUVmean, PSMA-TVp, and TL-PSMAp) in the metastatic and non-metastatic groups. Each subplot shows the kernel density estimation and boxplot for a given feature, allowing visualization of both the value distribution and central tendency. The x-axis represents metastasis status, while the y-axis indicates the corresponding feature values.

Furthermore, to analyze the types and patterns of classification errors, confusion matrices of the five models in the validation cohort were generated (Figures 3C–G). Among them, the XGBoost model demonstrated the best performance in predicting prostate cancer metastasis, achieving an AUC of 0.8977, an accuracy of 90.32%, a sensitivity of 90.0%, and a specificity of 94.74% in the validation set. The Naive Bayes model ranked second, with the same AUC (0.8977) and a slightly higher sensitivity (95.0%), though its accuracy (87.10%) and specificity (86.36%) were slightly lower. In contrast, the Logistic Regression, Support Vector Machine, and Random Forest models showed relatively inferior performance across the evaluated metrics.

In summary, all models demonstrated a certain degree of discriminative ability in the validation set; however, nonlinear ensemble models such as XGBoost exhibited superior generalization and robustness when integrating multiple features. Combined with the performance visualizations in Figure 3, these results suggest that ensemble learning methods hold greater potential for clinical application in predicting prostate cancer metastasis risk.

3.3 Feature contribution and interpretability of the XGBoost model

Based on the performance evaluation of all models, we ultimately selected the XGBoost model for predicting prostate cancer metastasis, as it demonstrated the best overall performance. To further explore the model’s decision-making process and the contribution of each input feature, we employed the SHAP method for interpretability analysis. SHAP is a game-theory-based model interpretation technique that assigns each feature a clear “contribution value,” quantifying both the direction and magnitude of its impact on model output. Compared to traditional feature importance analysis, SHAP not only reflects the global importance of features but also supports fine-grained explanations at the individual level. In this study, we used SHAP to interpret the XGBoost model from both global and individual perspectives.
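To make the idea of a per-feature “contribution value” concrete, the sketch below computes exact Shapley values by brute force for a toy scoring function. Everything here is hypothetical: the toy model, its three inputs, and the all-zero baseline stand in for the fitted XGBoost model and its reference data. SHAP implementations for tree models use far more efficient algorithms; this version only illustrates the definition.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline values.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical risk score with one interaction term (not the study's model)."""
    return 2.0 * z[0] + 1.0 * z[1] + 0.5 * z[0] * z[2]


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction for this sample and the prediction at the baseline, which is the additivity property that force plots visualize. The interaction term is shared equally between the two features involved.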

Figure 4A presents the SHAP summary plot for the XGBoost model, illustrating the global feature importance rankings and their influence on metastasis prediction. The horizontal axis represents SHAP values (i.e., the impact on the prediction), while the vertical axis lists the seven input features. Each dot represents one sample’s SHAP value for that feature, with the color gradient from blue to red corresponding to low to high feature values.

Figure 3
Panel A displays a radar chart comparing Logistic, SVM, RandomForest, XGBoost, and NaiveBayes across metrics like accuracy, sensitivity, specificity, F1, AUC, and composite score. Panel B presents an ROC curve with AUC values for each model: Logistic (0.8909), SVM (0.9000), RandomForest (0.8841), XGBoost (0.8977), and NaiveBayes (0.8977). Panels C to G show confusion matrices for XGBoost, NaiveBayes, Logistic, SVM, and RandomForest models, respectively, displaying predicted versus actual values.

Figure 3. Performance evaluation of different models on the validation set. (A) Receiver operating characteristic (ROC) curves of the five models, along with their corresponding area under the curve (AUC) values; (B) Radar plots illustrating the comparative performance of five classification algorithms in terms of accuracy, sensitivity, specificity, and other key metrics; (C–G) Confusion matrices depicting the classification results of each model on the validation set.

The SHAP analysis revealed that PSMA-TVp and TL-PSMAp, as key PET parameters reflecting the overall tumor burden based on PSMA expression, contributed the most to the prediction of prostate cancer metastasis. This suggests that the tumor’s overall PSMA expression holds significant predictive value for metastasis risk. SUVmax, representing the highest uptake intensity of the most active part of the lesion, showed a slightly lower contribution than the more comprehensive PSMA-TVp and TL-PSMAp, possibly because of its sensitivity to image noise and lesion heterogeneity. Traditional clinical indicators such as tPSA and Gleason score also demonstrated strong discriminative power in the model, indicating that fundamental serological and histopathological features still play a stable role in prediction. Although SUVmean can reflect the overall PSMA expression of the lesion, its importance was slightly lower, possibly because of its sensitivity to volume of interest (VOI) delineation. Age had the least predictive contribution in the model, consistent with the lack of statistical difference between groups, suggesting its limited value in distinguishing metastasis risk within this cohort.

To further illustrate the model’s decision-making mechanism at the individual level, two non-metastatic patients and two metastatic patients were randomly selected, and SHAP force plots were generated for each case (Figure 4, subplots B–E). Subplots B and C correspond to the non-metastatic patients. As shown in the plots, PSMA-TVp, TL-PSMAp, and SUVmax were all at relatively low levels, contributing negatively to the prediction outcome and steering the model toward a “non-metastatic” classification. Although tPSA in subplot B and the Gleason score in subplot C showed some positive influence on the prediction, the overall SHAP value contributions still supported a “non-metastatic” result. Subplots D and E illustrate the SHAP explanations for the two metastatic patients. In these cases, PSMA-TVp, TL-PSMAp, SUVmax, and tPSA were markedly elevated, exhibiting strong red positive forces that drove the model decisively toward a “metastatic” prediction. These results indicate that features reflecting high PSMA expression and tumor burden play a critical role in the model’s decision-making process, supporting their clinical potential in identifying the risk of prostate cancer metastasis.

Figure 4
A series of charts showing SHAP analysis for model output related to metastasis risk. Panel A displays global feature importance with features like TL-PSMAp, GleasonScore, and Age impacting the model. Panels B to E represent SHAP force plots explaining individual predictions, with red indicating higher risk and blue indicating lower risk. Each plot includes feature values like GleasonScore and tPSA, alongside impact on the model's output.

Figure 4. SHAP-based interpretability analysis of the XGBoost model. (A) SHAP summary plot illustrating the global importance and directional impact of each feature on the model’s prediction; (B, C) SHAP force plots for two non-metastatic patients, showing how low metabolic values drive predictions toward the “non-metastatic” class; (D, E) SHAP force plots for two metastatic patients, where higher SUVmax, PSMA-TVp, and TL-PSMAp values strongly contribute to the model’s prediction of metastasis.

These individual-level explanations demonstrate that the model not only possesses strong overall predictive performance but also clearly reveals the key features and their directional influence at the level of individual patients. This enhances the interpretability and clinical applicability of the model. Combined with the SHAP analysis results, it is evident that PSMA PET/CT-derived parameters such as PSMA-TVp and TL-PSMAp play a dominant role in predicting prostate cancer metastasis, outperforming the traditional SUVmax metric. Meanwhile, clinicopathological variables such as the Gleason score and tPSA also show significant predictive value, suggesting that imaging biomarkers and pathological indicators offer complementary strengths in this task.

In summary, the SHAP-based interpretability analysis not only confirmed the critical role of PSMA-avid tumor burden-related parameters in predicting prostate cancer metastasis but also highlighted the potential of the XGBoost model in providing individualized risk assessments. This approach holds promise for supporting data-driven clinical decision-making and guiding stratified management and treatment strategies for prostate cancer.

4 Discussion

The occurrence of prostate cancer metastasis is directly related to treatment decisions and prognosis assessment. To enhance clinical prediction capabilities, this study developed a multimodal metastasis risk prediction model based on machine learning by integrating semiquantitative parameters from PSMA PET/CT, clinical variables, and pathological features. This approach provides a potentially useful tool for precision diagnosis and treatment of prostate cancer. The proposed model performed well in the validation cohort, achieving an accuracy of 90.32%, sensitivity of 90.0%, specificity of 94.74%, and an area under the curve (AUC) of 0.8977. These metrics indicate strong discriminative ability and suggest that the model can support clinicians in identifying patients at high risk of metastasis. Previous studies have predominantly focused on lesion-level prediction using PSMA PET/CT imaging features or on evaluating treatment response following radioligand therapy (25–27). In contrast, this study moves beyond single-modality prediction by systematically integrating heterogeneous data sources. According to the D’Amico risk classification, metastasis can be observed in at least half of patients categorized as high risk, underscoring its clinical relevance. Nevertheless, this system does not incorporate molecular imaging information (28, 29). Combining complementary multidimensional features substantially improved the model’s predictive performance. This multimodal fusion approach not only enhances risk stratification accuracy but also offers a novel technical pathway for intelligent prediction of metastatic PCa, further expanding the application scope of machine learning in prostate cancer management.

As an emerging technology, machine learning is still in its early stages of clinical application but has already demonstrated broad potential in biomedical research. The predictive model developed in this study provides supporting evidence for the clinical translation of machine learning methods in urology. XGBoost, a type of ensemble learning algorithm, has shown superior performance in various medical prediction tasks due to its powerful nonlinear modeling capabilities and adaptability to high-dimensional, heterogeneous data (30–32). In our study, XGBoost outperformed other models such as Random Forest and Support Vector Machine in assessing the risk of prostate cancer metastasis. Unlike traditional models that rely on a single imaging or clinicopathological feature, our approach integrates imaging, clinical, and pathological data to enhance the model’s ability to identify complex patterns of metastasis. This multi-source data integration strategy allows for a more comprehensive representation of tumor biology and individual patient characteristics, and its effectiveness has also been demonstrated in fields such as head and neck cancer and cardiovascular disease (33, 34).

The SHAP framework, as a leading tool for model interpretability, helps open up the “black-box” mechanisms within machine learning models. In this study, SHAP analysis of the XGBoost model’s decision-making process revealed that PSMA-TVp and TL-PSMAp made the largest marginal contributions, suggesting that volumetric parameters exhibit greater stability and discriminative power in predicting prostate cancer metastasis. Compared with SUVmax, which reflects only the highest uptake within a single voxel, and SUVmean, which averages uptake across the lesion, PSMA-TVp and TL-PSMAp integrate both PSMA expression level and lesion volume, offering a more comprehensive representation of tumor burden. This allows them to demonstrate superior discriminative capacity and robustness under complex biological conditions. These findings are consistent with previous studies and further support their potential clinical value in capturing tumor heterogeneity and identifying distant metastasis (35, 36). Although SUVmax, a conventional PSMA PET/CT parameter, retained some importance in the model, its interpretive capacity was limited because it reflects only the local peak uptake, making it susceptible to noise interference (37, 38). Clinical and pathological variables such as tPSA and Gleason score also contributed significantly to the prediction task, indicating that fundamental serological and histological grading information provides important complementary value to the model (39, 40). In contrast, age showed the lowest contribution, which aligns with its lack of statistical difference between groups, suggesting its limited predictive value for metastasis risk within this study cohort. Overall, the XGBoost model, when applied to high-dimensional multimodal data, tends to prioritize variables with stable global explanatory power. This highlights the importance of incorporating volumetric PSMA-avid tumor parameters to enhance model performance. Future research should consider giving priority to such comprehensive indicators in clinical applications to improve model generalizability and decision-making utility.

This study has several limitations. First, it is a single-center retrospective analysis with a relatively limited sample size (n=101). Although the cohort was well balanced across clinical variables, this may limit the generalizability of our findings to broader and more heterogeneous patient populations. Larger prospective, multicenter studies are warranted to validate and refine the predictive performance of our model in diverse clinical settings. Second, although SHAP analysis was employed to enhance model interpretability, the misclassification mechanisms for borderline cases require further investigation. Third, the current model is primarily based on structured features. Future studies may consider integrating raw imaging data, genomic information, longitudinal dynamic indicators, and additional clinical risk stratification systems such as the D’Amico classification to further enhance predictive accuracy and robustness. Additionally, the clinical deployment and user interaction workflows of the model remain to be designed and optimized to ensure feasibility and usability in real-world medical settings.
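One way a borderline case might be examined is through per-patient Shapley attributions, which split the difference between a patient's score and a reference score across the input features. The sketch below computes exact Shapley values by brute-force enumeration for a toy scoring function. It is not the study's XGBoost model and does not use the SHAP library's tree algorithm; the weights, the patient values, and the reference values are all invented for illustration.

```python
from itertools import combinations
from math import factorial

FEATURES = ["PSMA_TVp", "TL_PSMAp", "SUVmax", "tPSA", "Gleason"]


def toy_score(x):
    """Hypothetical risk score; every weight here is invented."""
    return (
        0.04 * x["PSMA_TVp"]
        + 0.002 * x["TL_PSMAp"]
        + 0.01 * x["SUVmax"]
        + 0.005 * x["tPSA"]
        + 0.05 * x["Gleason"]
        + 0.0005 * x["PSMA_TVp"] * x["SUVmax"]  # one interaction term
    )


def shapley_values(score, case, baseline):
    """Exact Shapley values of `case` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    """
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {
                    g: (case[g] if g in subset else baseline[g]) for g in FEATURES
                }
                with_f = dict(without)
                with_f[f] = case[f]
                total += weight * (score(with_f) - score(without))
        phi[f] = total
    return phi


# Invented borderline patient and invented cohort reference values.
case = {"PSMA_TVp": 12.0, "TL_PSMAp": 90.0, "SUVmax": 14.0, "tPSA": 28.0, "Gleason": 8}
baseline = {"PSMA_TVp": 6.0, "TL_PSMAp": 40.0, "SUVmax": 10.0, "tPSA": 15.0, "Gleason": 7}

phi = shapley_values(toy_score, case, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the gap between the patient's score and the reference score, so each feature's share of a borderline prediction can be read directly. Brute-force enumeration is feasible only for a handful of features, which is why tree-specific algorithms are normally used in practice.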

5 Conclusions

In conclusion, the XGBoost model accurately predicted prostate cancer metastasis, with PET parameters PSMA-TVp, TL-PSMAp, and SUVmax contributing more prominently than traditional clinical indicators such as Gleason score and tPSA.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Ethics statement

The studies involving humans were approved by the Ethics Committee of the General Hospital of Ningxia Medical University (KYLL-2023-0119). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

ZM: Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Writing – original draft, Writing – review & editing. JY: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. QZ: Funding acquisition, Investigation, Resources, Supervision, Writing – review & editing. YL: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Resources, Supervision, Writing – review & editing. ZL: Conceptualization, Methodology, Software, Writing – original draft, Writing – review & editing. JC: Data curation, Investigation, Writing – review & editing. HH: Data curation, Formal analysis, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This work was supported by Ningxia Natural Science Foundation (2024AAC03558, 2024AAC03648, 2021AAC03385, 2025AAC030751).

Acknowledgments

This study would not have been possible without the help of all the clinicians who participated in our department.

Conflict of interest

The authors declare that this study was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2025.1640159/full#supplementary-material

Abbreviations

PCa, Prostate Cancer; csPCa, Clinically Significant Prostate Cancer; PI-RADS, Prostate Imaging Reporting and Data System; PET/CT, Positron Emission Tomography/Computed Tomography; [18F]F-PSMA-1007, Fluorine-18–labeled Prostate-Specific Membrane Antigen-1007; PSMA, Prostate-Specific Membrane Antigen; SUVmax, Maximum Standardized Uptake Value; ROC, Receiver Operating Characteristic; AUC, Area Under the Curve; tPSA, the total prostate-specific antigen; SUVmean, Mean Standardized Uptake Value; PSMA-TVp, prostate PSMA-tumor volume; TL-PSMAp, prostate total lesion PSMA; MRI, Magnetic Resonance Imaging; SHAP, SHapley Additive exPlanations; XGBoost, Extreme Gradient Boosting; ML, Machine Learning; ROI, Region of Interest; SVM, Support Vector Machine.

References

1. Belal SL, Frantz S, Minarik D, Enqvist O, Wikström E, Edenbrandt L, et al. Applications of artificial intelligence in PSMA PET/CT for prostate cancer imaging. Semin Nucl Med. (2024) 54:141–9. doi: 10.1053/j.semnuclmed.2023.06.001

2. Siegel RL, Kratzer TB, Giaquinto AN, Sung H, and Jemal A. Cancer statistics, 2025. CA Cancer J Clin. (2025) 75:10–45. doi: 10.3322/caac.21871

3. Hu J and Han B. Interpretation and research advances on molecular biomarkers in prostate cancer from 2020 International Society of Urological Pathology consultation conference report. Zhonghua Bing Li Xue Za Zhi. (2021) 50:172–6. doi: 10.3760/cma.j.cn112151-20200922-00732

4. Ku SY, Gleave ME, and Beltran H. Towards precision oncology in advanced prostate cancer. Nat Rev Urol. (2019) 16:645–54. doi: 10.1038/s41585-019-0237-8

5. Li Y, Han D, Wu P, Ren J, Ma S, Zhang J, et al. Comparison of (68)Ga-PSMA-617 PET/CT with mpMRI for the detection of PCa in patients with a PSA level of 4–20 ng/ml before the initial biopsy. Sci Rep. (2020) 10:10963. doi: 10.1038/s41598-020-67385-9

6. Giesel FL, Knorr K, Spohn F, Will L, Maurer T, Flechsig P, et al. Detection efficacy of (18)F-PSMA-1007 PET/CT in 251 patients with biochemical recurrence of prostate cancer after radical prostatectomy. J Nucl Med. (2019) 60:362–8. doi: 10.2967/jnumed.118.212233

7. Lopci E, Saita A, Lazzeri M, Lughezzani G, Colombo P, Buffi NM, et al. (68)Ga-PSMA positron emission tomography/computerized tomography for primary diagnosis of prostate cancer in men with contraindications to or negative multiparametric magnetic resonance imaging: A prospective observational study. J Urol. (2018) 200:95–103. doi: 10.1016/j.juro.2018.01.079

8. Zeng T, Xie Y, Chai K, and Sang H. The application of prostate specific membrane antigen in the diagnosis and treatment of prostate cancer: status and challenge. Onco Targets Ther. (2024) 17:991–1015. doi: 10.2147/OTT.S485869

9. Dai H, Huang S, Tian T, Hou N, Zeng H, Wei Q, et al. Clinical value of dual tracer PET imaging with (68)Ga-PSMA and (18)F-FDG in patients with metastatic prostate cancer. Sichuan Da Xue Xue Bao Yi Xue Ban. (2024) 55:1063–70. doi: 10.12182/20240960201

10. Gu ZY, Cheng C, and Zuo CJ. Research progress of prostate-specific membrane antigen PET/MR in the diagnosis of prostate cancer. Int J Radiol Nucl Med. (2023) 47:33–8. doi: 10.3760/cma.j.cn121381-202204010-00261

11. Liu YC, Zhang XJ, Liu JJ, Wang Y, Wang RM, Xu BX, et al. Application value of 18F-PSMA-3Q PET/CT in prostate cancer patients with low postoperative PSA levels. Chin J Nucl Med Mol Imaging. (2023) 43:201–5. doi: 10.3760/cma.j.cn321828-20221220-00374

12. Hao YX, Ma L, Zhai LP, Cao XM, and Zhang WC. Significance of PSMA ligand PET imaging for lymph node dissection in prostate cancer. J Clin Urol. (2021) 36:993–7. doi: 10.13201/j.issn.1001-1420.2021.12.016

13. Park SY, Cho A, Yu WS, Lee CY, Lee JG, Kim DJ, et al. Prognostic value of total lesion glycolysis by 18F-FDG PET/CT in surgically resected stage IA non-small cell lung cancer. J Nucl Med. (2015) 56:45–9. doi: 10.2967/jnumed.114.147561

14. Liao S, Penney BC, Zhang H, Suzuki K, and Pu Y. Prognostic value of the quantitative metabolic volumetric measurement on 18F-FDG PET/CT in Stage IV nonsurgical small-cell lung cancer. Acad Radiol. (2012) 19:69–77. doi: 10.1016/j.acra.2011.08.020

15. Liao C, Deng Q, Zeng L, Guo B, Li Z, Zhou D, et al. Baseline and interim (18)F-FDG PET/CT metabolic parameters predict the efficacy and survival in patients with diffuse large B-cell lymphoma. Front Oncol. (2024) 14:1395824. doi: 10.3389/fonc.2024.1395824

16. Chato L and Regentova E. Survey of transfer learning approaches in the machine learning of digital health sensing data. J Pers Med. (2023) 13:1703. doi: 10.3390/jpm13121703

17. Hossain MI, Maruf MH, Khan MAR, Prity FS, Fatema S, Ejaz MS, et al. Heart disease prediction using distinct artificial intelligence techniques: performance analysis and comparison. Iran J Comput Sci. (2023) 6:397–417. doi: 10.1007/s42044-023-00148-7

18. Berros N, Mendili FE, Filaly Y, and Idrissi YEBE. Enhancing digital health services with big data analytics. Big Data Cogn Computing. (2023) 7:64. doi: 10.3390/bdcc7020064

19. Cai Z, Poulos RC, Liu J, and Zhong Q. Machine learning for multi-omics data integration in cancer. iScience. (2022) 25:103798. doi: 10.1016/j.isci.2022.103798

20. Dong J, Lei R, Ma F, Yu L, Wang L, Xu S, et al. Machine learning-based prediction of distant metastasis risk in invasive ductal carcinoma of the breast. PloS One. (2025) 20:e0310410. doi: 10.1371/journal.pone.0310410

21. Duan H, Zhang Y, Qiu H, Fu X, Liu C, Zang X, et al. Machine learning-based prediction model for distant metastasis of breast cancer. Comput Biol Med. (2024) 169:107943. doi: 10.1016/j.compbiomed.2024.107943

22. Singh GAP and Gupta PK. Performance analysis of various machine learning-based approaches for detection and classification of lung cancer in humans. Neural Computing Appl. (2019) 31:6863–77. doi: 10.1007/s00521-018-3518-x

23. Mittlmeier LM, Brendel M, Beyer L, Albert NL, Todica A, Zacherl MJ, et al. Feasibility of different tumor delineation approaches for 18F-PSMA-1007 PET/CT imaging in prostate cancer patients. Front Oncol. (2021) 11:663631. doi: 10.3389/fonc.2021.663631

24. Zhu W, Tang Y, Qi L, Gao X, Hu S, Chen MF, et al. Machine learning models for enhanced diagnosis and risk assessment of prostate cancer with 68Ga-PSMA-617 PET/CT. Eur J Radiol. (2025) 186:112063. doi: 10.1016/j.ejrad.2025.112063

25. Moazemi S, Erle A, Khurshid Z, Lütje S, Muders M, Essler M, et al. Decision-support for treatment with (177)Lu-PSMA: machine learning predicts response with high accuracy based on PSMA-PET/CT and clinical parameters. Ann Transl Med. (2021) 9:818. doi: 10.21037/atm-20-6446

26. Yi Z, Hu S, Lin X, Zou Q, Zou M, Zhang Z, et al. Machine learning-based prediction of invisible intraprostatic prostate cancer lesions on (68)Ga-PSMA-11 PET/CT in patients with primary prostate cancer. Eur J Nucl Med Mol Imaging. (2022) 49:1523–34. doi: 10.1007/s00259-021-05631-6

27. Yu LX, Nazia H, and Jeremie C. An investigation of XGBoost-based algorithm for breast cancer classification. Mach Learn Appl. (2021) 6:100154. doi: 10.1016/j.mlwa.2021.100154

28. Has Simsek D, Sanli Y, Engin MN, Erdem S, and Sanli O. Correction to: Detection of metastases in newly diagnosed prostate cancer by using 68Ga-PSMA PET/CT and its relationship with modified D'Amico risk classification. Eur J Nucl Med Mol Imaging. (2021) 48:1701–5. doi: 10.1007/s00259-021-05286-3

29. Ulas Babacan O, Hasbek Z, and Seker K. The relationship between D'Amico and ISUP risk classifications and (68)Ga-PSMA PET/CT SUVmax values in newly diagnosed prostate cancers. Curr Oncol. (2024) 31:5307–17. doi: 10.3390/curroncol31090391

30. Kabiraj S, Raihan M, Alvi N, Afrin M, Akter L, Sohagi SA, et al. Breast cancer risk prediction using XGBoost and random forest algorithm. Kharagpur, India: 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT). (2020) 1–4. doi: 10.1109/ICCCNT49239.2020.9225451

31. Guan X, Du Y, Ma R, Teng N, Ou S, Zhao H, et al. Construction of the XGBoost model for early lung cancer prediction based on metabolic indices. BMC Med Inform Decis Mak. (2023) 23:107. doi: 10.1186/s12911-023-02171-x

32. Zhu J, Liu H, Liu X, Chen C, and Shu M. Cardiovascular disease detection based on deep learning and multi-modal data fusion. BioMed Signal Process Control. (2025) 99:106882. doi: 10.1016/j.bspc.2024.106882

33. Farooq A, Sharma U, and Mishra D. Enhanced survival prediction in head and neck cancer using convolutional block attention and multimodal data fusion. (2024) 2410.21831. doi: 10.48550/arXiv.2410.21831

34. Wen JN, Yang QQ, Cheng C, and Zuo CJ. Differential study of metabolic volume parameters of 68Ga-PSMA-11 PET/CT in treatment-naïve prostate cancer patients with different risk stratifications. Int J Radiol Nucl Med. (2021) 45:9. doi: 10.3760/cma.j.cn121381-202204010-00261

35. Xie Y, Li C, Zhang LL, Yu F, Zang SM, Wang SQ, et al. Tumor burden assessment of primary lesions in treatment-naïve prostate cancer using 68Ga-PSMA-I&T PET/CT. J South Med Univ. (2022) 42:1143–8. doi: 10.12122/j.issn.1673-4254.2022.08.05

36. Ren JZ, Zhao F, Huo ZW, Wang XH, Liu Y, and Yang GR. Correlation between metabolic parameters of 18F-FDG PET/CT and tumor markers in prostate cancer bone metastases. Oncoradiology. (2018) 27:637–43. doi: CNKI:SUN:YXYX.0.2018-01-001

37. Schmuck S, von Klot CA, Henkenberens C, Sohns JM, Christiansen H, Wester HJ, et al. Initial experience with volumetric (68)Ga-PSMA I&T PET/CT for assessment of whole-body tumor burden as a quantitative imaging biomarker in patients with prostate cancer. J Nucl Med. (2017) 58:1962–8. doi: 10.2967/jnumed.117.193581

38. Lee JW, Kang CM, Choi HJ, Lee WJ, Song SY, Lee JH, et al. Prognostic value of metabolic tumor volume and total lesion glycolysis on preoperative 18F-FDG PET/CT in patients with pancreatic cancer. J Nucl Med. (2014) 55:898–904. doi: 10.2967/jnumed.113.131847

39. Li YM, Li Y, Chen J, Yang PF, Dong SY, and Li J. Diagnostic value of 18F-PSMA-1007 PET/CT combined with clinicopathological factors for predicting prostate cancer metastasis. Radiologic Pract. (2023) 38:925–30. doi: 10.13609/j.cnki.1000-0313.2023.07.020

40. Lawal IO, Ndlovu H, Kgatle M, Mokoala K, and Sathekge MM. Prognostic value of PSMA PET/CT in prostate cancer. Semin Nucl Med. (2024) 54:46–59. doi: 10.1053/j.semnuclmed.2023.07.003

Keywords: [18F]PSMA-1007, positron emission tomography/computed tomography, predicting prostate cancer metastasis, multimodal prediction, machine learning, SHAP

Citation: Yang J, Ma Z, Hao H, Chen J, Lv Z, Zhao Q and Li Y (2025) Multimodal integration of [18F]PSMA-1007 PET/CT semiquantitative parameters and clinicopathological data for predicting prostate cancer metastasis. Front. Oncol. 15:1640159. doi: 10.3389/fonc.2025.1640159

Received: 03 June 2025; Accepted: 25 September 2025;
Published: 15 October 2025.

Edited by:

Neil Mendhiratta, George Washington University, United States

Reviewed by:

Kerim Şeker, Sivas Cumhuriyet University Faculty of Medicine, Türkiye
Ozge Ulas, Gaziosmanpaşa University, Türkiye

Copyright © 2025 Yang, Ma, Hao, Chen, Lv, Zhao and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: YanMei Li, YW1heTUwNTlAMTYzLmNvbQ==; Qian Zhao, Y2VjaWxpYV9oaEAxMjYuY29t; ZhiYong Lv, THZsdTE5OTZAc2luYS5jb20=

†These authors have contributed equally to this work and share first authorship

‡These authors have contributed equally to this work and share last authorship
