Application of radiomics-based prediction model to predict preoperative lymph node metastasis in prostate cancer: a systematic review and meta-analysis

Zheng, Yanghuang; Du, Yuelin; Zhang, Biao; Zhang, Helin; Shang, Panfeng; Hou, Zizhen

doi:10.3389/fonc.2025.1577794

SYSTEMATIC REVIEW article

Front. Oncol., 20 June 2025

Sec. Surgical Oncology

Volume 15 - 2025 | https://doi.org/10.3389/fonc.2025.1577794

This article is part of the Research Topic “Artificial Intelligence in Clinical Oncology: Enhancements in Tumor Management”


Yanghuang Zheng^†

Yuelin Du^†

Biao Zhang

Helin Zhang

Panfeng Shang^*

Zizhen Hou^*

Department of Urology, Lanzhou University Second Hospital, Lanzhou, Gansu, China

Background: This study aims to comprehensively evaluate the accuracy and efficacy of radiomics models based on imaging equipment in predicting prostate cancer (PCa) lymph node metastasis (LNM).

Methods: We systematically searched PubMed, Embase, Cochrane Library, Web of Science, and Sinomed databases from their establishment until July 2024. The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria and the Radiomics Quality Score (RQS) tools were utilized to assess the quality of the studies. Indicators such as the pooled area under the curve (AUC), sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio were computed to evaluate the predictive effect of radiomics technology on LNM of PCa.

Results: A total of 1860 PCa patients whose lymph node status was confirmed by histological examination were included in this meta-analysis. The radiomics model for predicting LNM in PCa showed a pooled AUC value of 0.88 (95% confidence interval (CI) [0.85 - 0.91]), with a sensitivity and specificity of 0.81 (95% CI [0.64 - 0.91]) and 0.85 (95% CI [0.75 - 0.91]), respectively. The positive likelihood ratio was 5.43 (95% CI [3.34 - 8.84]), the negative likelihood ratio was 0.22 (95% CI [0.12 - 0.43]), and the diagnostic odds ratio was 24.21 (95% CI [10.59 - 55.32]). The meta-analysis showed significant heterogeneity among the included studies. No threshold effect was detected. The subgroup analysis showed that the least absolute shrinkage and selection operator regression algorithm was associated with higher diagnostic sensitivity, with a pooled sensitivity of 0.96 (95% CI [0.90 - 1.00]) (p = 0.02), whereas the random forest algorithm was associated with lower sensitivity, with a pooled sensitivity of 0.48 (95% CI [0.16 - 0.80]) (p = 0.01). Models whose radiomics features were not screened by intraclass correlation coefficient showed lower diagnostic specificity, 0.73 (95% CI [0.53 - 0.92]) (p = 0.04). The pooled specificity for studies with an RQS score ≥ 17 was 0.77 (95% CI [0.65 - 0.88]) (p = 0.01), and higher RQS scores were associated with lower diagnostic specificity.

Conclusions: The predictive model based on radiomics features has the potential to serve as an auxiliary approach for predicting preoperative LNM of PCa.

Systematic review registration: https://www.crd.york.ac.uk/prospero/, identifier PROSPERO CRD42024575818.

1 Introduction

Prostate cancer (PCa) is the fifth most frequently diagnosed cancer worldwide, accounting for 17% of all cancer cases and ranking as the second most common cancer in men (1). Incidence rates vary significantly across regions, ranging from 6.4 to 82.8 per 100,000 individuals (1). Accurate lymph node staging is crucial for evaluating the patient’s prognosis, risk of recurrence, and potential for salvage therapy. Studies report that the recurrence rate among PCa patients with lymph node involvement at initial diagnosis ranges from 1.3% to 12%, which is closely associated with increased mortality (2). Therefore, early determination of lymph node status in PCa patients is critical (3).

Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) are key modalities for detecting lymph node metastasis (LNM) in PCa, but their diagnostic accuracy remains limited (4, 5). The introduction of Positron Emission Tomography - Computed Tomography (PET-CT) has significantly enhanced accuracy by approximately 27% compared to traditional imaging equipment in detecting PCa and lymph node status. However, it also presents challenges such as reduced diagnostic sensitivity and anomalous uptake in nerve nodes (6, 7). Furthermore, the determination of lymph node status is often influenced by the spatial resolution of imaging equipment and subjective factors of the pathologist. The primary predictive models for LNM in PCa include the Memorial Sloan Kettering Cancer Center (MSKCC) model and the Briganti nomograms (2012, 2017, and 2019 editions), which aid treatment decisions but have limitations such as relatively low area under the curve (AUC) values and limited specificity (8–13). While pelvic lymph node dissection (Plnd) or extended pelvic lymph node dissection (Eplnd) remains the gold standard for confirming LNM, these procedures involve prolonged operative times and risks such as lymph leakage and lymphocele formation. Consequently, the indication for pelvic lymphadenectomy remains contentious.

The invasiveness of tumors is related to their heterogeneity. Radiomics technology can encode the subtle heterogeneity into quantifiable features (14, 15). By integrating these features through artificial intelligence algorithms and traditional modeling, radiomics facilitates the development of predictive models for disease status and prognosis (16). Unlike conventional histopathological biopsy, this method offers a non-invasive means of identifying the disease state and is widely utilized in medical research. Various imaging features hold potential value for evaluating the staging of PCa and lymph node status (17). Moreover, quantitative radiomic features can enhance medical decision support systems and improve clinical decision-making (18). Several studies have applied radiomics to predict LNM in PCa (19); however, the lack of standardized radiomics workflows limits model robustness and reproducibility (20).

This study aims to systematically review and comprehensively summarize existing research on the use of radiomics for evaluating LNM in PCa, focusing on diagnostic performance, sensitivity, and specificity. It seeks to provide clinicians with a potential reference tool for assessing LNM status and improving the accuracy of early diagnosis.

2 Methods

2.1 Study protocol and registration

This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement and the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) guidelines (21, 22). The protocol was registered in the International Prospective Register of Systematic Reviews (PROSPERO) database (registered number: CRD42024575818).

2.2 Literature search

To obtain more relevant research data, we conducted a comprehensive literature search in PubMed, Web of Science, Embase, and the Cochrane Library database, covering the time range from the establishment of each database to research published up to July 20, 2024. Additionally, the SinoMed database was searched to further ensure the inclusion of pertinent articles. During the search process, we employed a combination of Medical Subject Headings (MeSH) terms and keywords to conduct our search. The specific search terms used were as follows: (“radiomics” OR “radiomic” OR “Artificial Intelligence”[Mesh] OR “Artificial intelligence” OR “deep learning” OR “machine learning” OR “convolutional neural network” OR “automatic detection”) AND (“Magnetic Resonance Imaging”[Mesh] OR “Tomography, X-Ray Computed”[Mesh] OR “CT” OR “MRI”) AND (“Lymphatic Metastasis”[Mesh] OR “lymph node metastasis” OR “Lymph node” OR “LNM”) AND (Neoplasms, Prostatic OR Neoplasm, Prostatic OR Prostatic Neoplasm OR Prostate Neoplasms OR Neoplasms, Prostate OR Neoplasm, Prostate OR Prostate Neoplasm OR Prostate Cancer OR Cancer, Prostate OR Cancers, Prostate OR Prostate Cancers OR Cancer of Prostate OR Cancer of the Prostate OR Prostatic Cancer OR Cancer, Prostatic OR Cancers, Prostatic OR Prostatic Cancers) OR (“Prostatic Neoplasms”[Mesh]). The specific search strategies implemented in each database are detailed in Supplementary S1.

2.3 Literature screening

Duplicate records were first removed from the initial dataset, and titles and abstracts were then reviewed. To address selective reporting bias, two authors (YH.Z. and YL.D.) independently assessed the titles and abstracts to determine which studies met the inclusion criteria for full-text review. Discrepancies in study selection were resolved through consultation with a third reviewer (the corresponding author, PF.S.). The inclusion criteria and the literature search strategy followed the PICO standard to ensure an exhaustive and impartial search, as follows:

P (Population): Patients who underwent radical prostatectomy combined with pelvic lymph node dissection and whose PCa was confirmed by histopathological examination. I (Intervention): Patients underwent CT and MRI imaging examinations prior to the diagnosis of PCa. C (Comparator): Histopathologic results were used as the reference standard against which the performance of radiomics models was compared. O (Outcomes): The performance of radiomics models was assessed through key metrics, including AUC, sensitivity, specificity, positive and negative likelihood ratios, and diagnostic odds ratios.

The exclusion criteria were as follows: (1) irrelevant titles and abstracts; (2) unqualified publication types, such as case reports, review articles, editorials, letters, errata, conference abstracts, and animal experiments. All studies that failed to comply with these criteria were excluded to ensure the reliability and quality of the meta-analysis data.

2.4 Data extraction

Data extraction was conducted independently by two authors (YH.Z. and YL.D.), who used WPS Office software (version 6.10.1) to record the data in an electronic spreadsheet. Any discrepancies were resolved through consultation with the corresponding author (PF.S.). The extracted data encompassed: (1) General study information (first author’s name, publication year, country); (2) Parameters related to radiomics techniques (imaging equipment, tumor lesion segmentation method, region of interest (ROI) dimensions, imaging feature extraction software, imaging feature types); (3) Details about the development and validation of the prediction model (clinical characteristics including the number of patients, the number of lymph nodes, positive rate of lymph nodes, lymph node dissection procedure, study design, number of centers; Intraclass Correlation Coefficient (ICC) or not; standardization or not; classifier model; and model validation method); (4) Performance evaluation indicators for the prediction model such as AUC value, sensitivity, specificity along with their respective 95% confidence intervals (95% CI) as well as true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). For each study, we extracted the highest AUC value reported for the validation or test set of the radiomics-based predictive model. For single-center studies, the AUC value was taken from the validation set or the test set. For studies with multi-center data, the highest AUC value from the external validation set was used.

2.5 Quality assessment

The Radiomics Quality Score (RQS) checklist and QUADAS-2 were employed to evaluate the included studies (22, 23). Two authors (YH.Z. and YL.D.) independently conducted the assessments, with any discrepancies resolved through consultation with the corresponding author (PF.S.). The RQS checklist, proposed by Lambin et al. in 2017, is a specialized tool for assessing the quality of radiomics research. It evaluates 16 components across six key domains to measure the methodological rigor of the radiomics workflow. Complementing the radiomics focus, the QUADAS-2 tool addresses issues related to applicability and bias risk in diagnostic accuracy studies. The details of each study can be found in Supplementary S2, Supplementary S3.

2.6 Statistical analysis

All statistical analyses and graphical representations were performed using STATA (version 18.0), RStudio (version 4.3.1), and OriginPro (version 2022), incorporating the R packages “metamisc” and “metafor”. Summary receiver operating characteristic (SROC) curves were constructed from 2 × 2 contingency table data to evaluate diagnostic test performance. The area under the curve (AUC) was used as a metric to assess the predictive models’ accuracy. Diagnostic metrics including sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, and diagnostic score were calculated with their corresponding 95% confidence intervals. Missing data were estimated using either the confusion matrix calculator or R Studio-based methods. Detailed formulas and procedures can be found in Supplementary S4.
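As an illustration of how a 2 × 2 table relates to the diagnostic metrics listed above, the following is a minimal Python sketch, not the procedure given in Supplementary S4. The function names, the 0.5 continuity correction, and the example study (40 node-positive and 100 node-negative patients) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int
    fp: int
    fn: int
    tn: int


def reconstruct_table(sensitivity, specificity, n_positive, n_negative):
    """Back-calculate a 2 x 2 table from reported sensitivity and specificity.

    Counts are rounded to the nearest integer, so the result is an estimate.
    """
    tp = round(sensitivity * n_positive)
    tn = round(specificity * n_negative)
    return TwoByTwo(tp=tp, fp=n_negative - tn, fn=n_positive - tp, tn=tn)


def diagnostic_metrics(table, correction=0.5):
    """Return sensitivity, specificity, PLR, NLR and DOR for one study.

    A continuity correction is added to every cell when any cell is zero
    (a common convention, assumed here rather than taken from the article).
    """
    cells = [table.tp, table.fp, table.fn, table.tn]
    if 0 in cells:
        tp, fp, fn, tn = (c + correction for c in cells)
    else:
        tp, fp, fn, tn = cells
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    plr = sens / (1 - spec)
    nlr = (1 - sens) / spec
    return {"sensitivity": sens, "specificity": spec,
            "plr": plr, "nlr": nlr, "dor": plr / nlr}


# Hypothetical study: 40 node-positive and 100 node-negative patients.
table = reconstruct_table(0.80, 0.85, n_positive=40, n_negative=100)
metrics = diagnostic_metrics(table)
print(table)
print({name: round(value, 2) for name, value in metrics.items()})
```

Because the reconstructed counts are rounded, metrics recomputed from them can differ slightly from the values a study reports.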

The Q-test and I² statistic were combined to assess heterogeneity among study results, with heterogeneity classified as very low (0–25%), low (25–50%), moderate (50–75%), and high (>75%). Based on the degree of heterogeneity, either a fixed-effect or random-effects meta-analysis model was employed.

In subgroup analysis, multiple covariates were evaluated to identify sources of heterogeneity: whether clinical characteristics were incorporated, calibration method of the model, study design, imaging equipment, tumor lesion segmentation method, ROI dimension, imaging feature extraction software, lymph node dissection procedure, whether ICC was assessed, whether features were standardized, classifier model, model validation method, and RQS score (at or above the median versus below it).

Continuous numerical variables were also examined in the meta-analysis as potential sources of heterogeneity, including the number of patients, size of validation cohorts, number of lymph node-positive cases, and lymph node positivity rate (a total of four items).

A stepwise sensitivity analysis was conducted by sequentially omitting one study at a time to evaluate the influence of individual studies on the overall estimate.
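The leave-one-out procedure can be sketched in Python as follows. This is a simplified illustration that assumes inverse-variance fixed-effect pooling of log diagnostic odds ratios; the study labels, effect sizes, and variances are hypothetical, and the pooling model used in the actual analysis may differ.

```python
import math


def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def leave_one_out(labels, effects, variances):
    """Re-pool the estimate with each study omitted in turn."""
    results = {}
    for i, label in enumerate(labels):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        results[label] = pool_fixed(kept_effects, kept_variances)
    return results


# Hypothetical log diagnostic odds ratios and variances (not the article's data).
labels = ["Study A", "Study B", "Study C", "Study D"]
log_dor = [2.0, 3.0, 2.5, 4.0]
var = [0.25, 0.50, 0.25, 1.00]

overall, _ = pool_fixed(log_dor, var)
loo = leave_one_out(labels, log_dor, var)
for label, (est, se) in loo.items():
    lower, upper = est - 1.96 * se, est + 1.96 * se
    print(f"omit {label}: {est:.2f} [{lower:.2f}, {upper:.2f}] "
          f"(shift {est - overall:+.2f})")
```

A study is usually regarded as influential when omitting it moves the pooled estimate outside the confidence interval of the overall estimate.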

Deeks’ funnel plot was used to examine potential publication bias, and Egger’s test was used to quantify the risk of such bias. Additionally, we used a Fagan plot to assess clinical utility, converting the pre-test probability of LNM into post-test probabilities. Statistical significance was defined as P < 0.05.

3 Results

3.1 Study screening and selection

Through our systematic search strategy, we identified 431 studies from 5 databases. After removal of duplicates, 355 records remained for screening. On review of titles and abstracts, 291 records were excluded for non-compliance with the PICO criteria. The remaining 64 articles underwent detailed screening and evaluation, and 53 of them were excluded because they did not meet the publication type requirements. Ultimately, 11 articles that conformed to the PICO criteria and whose full texts were accessible were incorporated into the systematic review. One of these was excluded from the meta-analysis because of data reuse (24), so a total of 10 articles were included in the meta-analysis (19, 25–33). Figure 1 presents the entire literature inclusion process.

Figure 1

Flowchart illustrating study selection process: From 431 records identified, 76 duplicates were removed. After screening 355 records, 291 were excluded for irrelevance. Reports sought and assessed numbered 64; 11 were included in a systematic review, 10 in a meta-analysis. Reports not retrieved were zero; 53 and 1 were excluded for unqualified publication types and data repetition, respectively.

Figure 1. The PRISMA flowchart for the study screening process.

3.2 Study characteristics and data

These 11 articles were published between 2019 and 2023 and involved a total of 1,931 patients (Table 1). Of the included studies, 5 were from Europe, 5 from Asia, and 1 from North America; all of the Asian studies originated from China. All the suspected metastatic lymph nodes were resected by Plnd and/or Eplnd method before the operation and confirmed histopathologically as LNM of PCa. Most of the studies were retrospective in design, with only 1 being prospective. Only 2 studies had multi-center data sources, while 9 were from a single center. In the radiomics workflow, 7 studies used MRI to obtain the original images, while the rest used PET-CT. For tumor lesion segmentation, manual segmentation was commonly employed to define the tumor in three-dimensional space. Seven studies utilized the open-source software PyRadiomics for feature extraction, and all studies extracted representative texture features. Furthermore, Bourbonne et al. and Lai et al. conducted ICC evaluations of radiomics features to ensure imaging feature accuracy (19, 28). Six studies standardized the extracted imaging feature values during data processing. For model building and validation, machine learning algorithms were used in 6 studies, traditional linear algorithms in 4 studies, and deep learning (DL) algorithms in only 1. To enhance model robustness, 7 studies employed cross-validation and 2 used bootstrapping methods; the remaining 2 did not specify the validation method. Internal validation was predominantly used, although Hou et al., Luining et al., and Peeken et al. incorporated external validation to bolster model reliability (26, 27, 31).

Table 1

Table 1. The table of the characteristics and data included in the study.

3.3 Data quality assessment

Assessment with the QUADAS-2 tool showed that none of the studies was rated as low risk in every domain. Eight studies exhibited a high risk of bias in the index test domain, while all 11 studies showed a low risk of bias in both the patient selection and reference standard domains. The risk of bias in the flow and timing domain was unclear in all 11 studies, primarily due to inadequate reporting of the interval between the index test and the reference standard. In terms of applicability concerns, all studies were deemed to pose low risk. The specific details are depicted in Figure 2.

Figure 2

A table and bar charts show risk of bias and applicability concerns across multiple studies. The top table lists studies with assessments for patient selection, index test, reference standard, and flow and timing, marked by smiley faces for low risk, sad faces for high risk, and question marks for unclear risk. The bar charts below visually represent the distribution of low, high, and unclear risks for each category, with color coding: green for low, red for high, and blue for unclear.

Figure 2. The summary of the quality assessment of the included study following QUADAS-2.

Among the 11 studies included in the systematic review, the range of RQS scores was 13 to 21 (Table 2); and among the 10 studies included in the meta-analysis, the range of RQS scores was also 13 to 21, with an average ± standard deviation of 16.4 ± 2.94 and a median of 16.5 (Figure 3).

Table 2

Table 2. The table of RQS scores for each study.

Figure 3

Circular chart categorizing various topics under different names, each marked with a color-coded score from one to seven. Names like Peeken, Liu, and Lai are segmented with topics such as validation and open science. Score legend on the right shows colors representing scores from one (gray) to seven (blue).

Figure 3. The figure of the RQS scores of the studies included in the meta-analysis. The diverse color scales on the right side of the figure denote distinct scores. The scores ascend from top to bottom. The project score of 0 is not presented in the figure.

3.4 Data analysis

The pooled diagnostic indicators showed that prediction models based on radiomics technology had good diagnostic performance in predicting LNM in PCa patients preoperatively. The AUC was 0.88 (95% CI [0.85 - 0.91]), sensitivity was 0.81 (95% CI [0.62 - 0.91]), specificity was 0.83 (95% CI [0.73 - 0.90]), positive likelihood ratio (PLR) was 4.69 (95% CI [3.11 - 7.10]), negative likelihood ratio (NLR) was 0.23 (95% CI [0.11 - 0.48]), diagnostic odds ratio (DOR) was 20 (95% CI [9 - 45]), and diagnostic score was 3 (95% CI [2.19 - 3.81]).
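The pooled indicators are linked by simple identities: the DOR is the ratio PLR/NLR, and the diagnostic score is the natural logarithm of the DOR. The short Python sketch below checks the values reported in this section against those identities. It is a plausibility check only; pooled likelihood ratios from a meta-analytic model need not equal the values implied by the pooled sensitivity and specificity, so small discrepancies are expected.

```python
import math

# Pooled estimates as reported in this section.
sensitivity, specificity = 0.81, 0.83
plr, nlr = 4.69, 0.23
dor, dor_low, dor_high = 20, 9, 45

# Likelihood ratios implied by the pooled sensitivity and specificity.
plr_implied = sensitivity / (1 - specificity)
nlr_implied = (1 - sensitivity) / specificity

# The DOR is the ratio of the two likelihood ratios.
dor_implied = plr / nlr

# The diagnostic score is the natural logarithm of the DOR.
score = math.log(dor)
score_low, score_high = math.log(dor_low), math.log(dor_high)

print(f"implied PLR {plr_implied:.2f}, implied NLR {nlr_implied:.2f}")
print(f"implied DOR {dor_implied:.1f}")
print(f"diagnostic score {score:.2f} [{score_low:.2f}, {score_high:.2f}]")
```

The implied values agree with the reported PLR, NLR, DOR, and diagnostic score to within rounding of the published inputs.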

The forest plot illustrating the combined sensitivity and specificity is depicted in Figure 4, while the SROC curve is presented in Figure 5. Detailed information on diagnostic likelihood ratios, diagnostic scores, and diagnostic odds ratios can be found in Supplementary S5, Supplementary S6.

Figure 4

Forest plot displaying sensitivity and specificity of various studies. The left panel shows sensitivity with values ranging from 0.18 to 1.00, and the right panel shows specificity ranging from 0.73 to 1.00. Combined sensitivity is 0.81, and specificity is 0.83, with confidence intervals shown for each study. The plot includes a red dashed line for the mean and diamond markers for combined values.

Figure 4. Forest plot of sensitivity and specificity.

Figure 5

SROC curve graph with sensitivity on the y-axis and specificity on the x-axis. Black solid line represents the ROC curve. Red diamond indicates the summary operating point with sensitivity 0.81 and specificity 0.83. AUC is 0.88. Dotted and dashed lines show ninety-five percent prediction and confidence contours. Data points are marked with circles.

Figure 5. The SROC curve for the prediction of lymph node metastasis of prostate cancer based on radiomics technology.

3.5 Heterogeneity test

The results of Cochran’s Q and Higgins I² tests indicate a high level of heterogeneity in pooled sensitivity and specificity, with a Q value of 52.31 (p < 0.01) and an I² value of 82.79% for sensitivity, and a Q value of 48.14 (p < 0.01) and an I² value of 81.30% for specificity. The Spearman correlation coefficient of 0.45 (p > 0.05) suggests the absence of a threshold effect.
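The I² values follow from Cochran’s Q through the standard relation I² = (Q − df)/Q × 100%, with df equal to the number of studies minus one. The Python sketch below applies that relation to the Q values reported here for the 10 studies in the meta-analysis. The function names are illustrative, and assigning each boundary value (25%, 50%, 75%) to the lower band is an assumption, since the bands given in the Methods share their endpoints.

```python
def i_squared(q, n_studies):
    """Higgins I² (%) from Cochran's Q; negative values are truncated to zero."""
    df = n_studies - 1
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_band(i2):
    """Bands as defined in the statistical analysis section of the Methods."""
    if i2 > 75:
        return "high"
    if i2 > 50:
        return "moderate"
    if i2 > 25:
        return "low"
    return "very low"


# Q values reported above; 10 studies entered the meta-analysis.
reported_q = {"sensitivity": 52.31, "specificity": 48.14}
for label, q in reported_q.items():
    i2 = i_squared(q, n_studies=10)
    print(f"{label}: I² = {i2:.2f}% ({heterogeneity_band(i2)})")
```

With nine degrees of freedom, both Q values reproduce the reported I² figures and fall in the high-heterogeneity band.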

3.6 Subgroup analysis

The subgroup analysis revealed that the Least Absolute Shrinkage and Selection Operator (LASSO) regression algorithm was associated with significantly higher diagnostic sensitivity, with a combined sensitivity of 0.96 (95% CI [0.90 - 1.00]) (p = 0.02), whereas the random forest (RF) algorithm was associated with lower sensitivity, with a combined sensitivity of 0.48 (95% CI [0.16 - 0.80]) (p = 0.01). Studies in which imaging features were not screened by ICC showed reduced diagnostic specificity, with a combined specificity of 0.73 (95% CI [0.53 - 0.92]) (p = 0.04). The combined specificity for studies with an RQS score ≥ 17 was 0.77 (95% CI [0.65 - 0.88]) (p = 0.01), and higher RQS scores were associated with lower diagnostic specificity. In addition, the combination of radiomics technology with clinical characteristics, calibration method of the model, study design, imaging feature extraction software, imaging equipment, lymph node dissection procedure, model validation method, segmentation method, ROI dimension, and standardization did not exhibit statistically significant effects on sensitivity and specificity. Prospective and retrospective designs demonstrated similar discriminatory ability in prediction models. Moreover, differences between PyRadiomics and other feature extraction software in prediction models were nonsignificant.

The following subgroups showed higher pooled AUC values: incorporation of clinical characteristics, the fold-bootstrap method, the LASSO classifier, no ICC calculation, combined Plnd and Eplnd, internal validation, 3D tumor lesions, RQS score ≥ 17, manual segmentation, and data standardization. However, none of these differences was statistically significant. For further details please refer to Table 3 and Supplementary S7, Supplementary S8.

Table 3

Table 3. The table of subgroup analysis results in the included study.

3.7 Sensitivity analysis

The sensitivity analysis revealed no significant changes upon systematically removing one study at a time as shown in Figure 6.

Figure 6

Forest plot showing meta-analysis estimates with each named study omitted. The x-axis ranges from 2.57 to 12.06, including marks at 3.60, 6.14, and 10.48. Studies listed on the y-axis: Bourbonne, Cysouw, Hou, Lai, Liu, Liu-2, Luining, Peeken, Zamboglou, and Zheng, with years ranging from 2019 to 2023. Each study has a point estimate and confidence interval lines.

Figure 6. The figure for sensitivity analysis was calculated using the stepwise rejection method. CI, Confidence Interval.

3.8 Meta-regression

Meta-regression showed no significant association between the AUC value and the number of patients, the size of the validation set, the number of positive lymph nodes, or the lymph node positivity rate (p = 0.56, p = 0.78, p = 0.43, and p = 0.57, respectively) (Figure 7).

Figure 7

Four scatter plots labeled a to d show relationships between different variables, each with effect size on the y-axis. Plot a: number of patients on the x-axis, a slight negative correlation. Plot b: number of validation set on the x-axis, a slight positive correlation. Plot c: number of lymph nodes positive on the x-axis, a moderate positive correlation. Plot d: positive rate of lymph nodes on the x-axis, a moderate positive correlation. Circles represent data points, with sizes indicating varying data weights.

Figure 7. The figure of meta-regression between AUC value and numerical variables. The number of patients (a), the number of validation set (b), the number of lymph nodes positive (c), and the positive rate of lymph nodes (d) (p = 0.56, p = 0.78, p = 0.43, p = 0.57, respectively).

3.9 Publication bias

Deeks’ funnel plot asymmetry test did not reveal statistically significant bias (p = 0.23). Similarly, Egger’s test indicated no substantial publication bias in the included studies (p = 0.613), as illustrated in Figure 8.

Figure 8

Panel a shows Deeks' Funnel Plot Asymmetry Test with a regression line plotted against the diagnostic odds ratio, having a p-value of 0.23. Panel b illustrates a funnel plot with pseudo ninety-five percent confidence limits, featuring a scatter of study points within a triangular area formed by dashed lines indicating confidence boundaries.

Figure 8. Assessment of publication bias using Deeks’ test and Egger’s test. (a) Funnel plot for publication bias assessed with Deeks’ asymmetry test. (b) Funnel plot of all the studies included in the meta-analysis; each point represents an independent study.

3.10 Clinical utility

The combined findings indicate a pre-test probability of LNM of 26%. The post-test probability of LNM is 62% after a positive radiomics model prediction and 8% after a negative prediction, as illustrated in Figure 9.

Figure 9

Graph depicting a likelihood ratio nomogram with pre-test probability on the left, likelihood ratio in the center, and post-test probability on the right. A solid red line represents a positive likelihood ratio of 5, increasing post-test probability from 26% to 62%. A dashed blue line denotes a negative likelihood ratio of 0.23, decreasing post-test probability to 8%.

Figure 9. The Fagan nomogram of the radiomics model in the detection of lymph node metastasis of prostate cancer. The Fagan nomogram demonstrated the performance of the radiomics model in the detection of lymph node metastasis in prostate cancer. The pre-test probability of having lymph node metastasis was 26%, yielding a post-test probability of 62% with a positive test and 8% with a negative test.
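The Fagan nomogram is a graphical form of Bayes’ theorem on the odds scale: post-test odds equal pre-test odds multiplied by the likelihood ratio. The Python sketch below applies that calculation to the pre-test probability reported in section 3.10 and the pooled likelihood ratios reported in section 3.4. The function name is illustrative.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


pre_test = 0.26          # pre-test probability reported in section 3.10
plr, nlr = 4.69, 0.23    # pooled likelihood ratios reported in section 3.4

after_positive = post_test_probability(pre_test, plr)
after_negative = post_test_probability(pre_test, nlr)
print(f"after a positive prediction: {after_positive:.1%}")
print(f"after a negative prediction: {after_negative:.1%}")
```

With these rounded inputs the calculation gives about 62% after a positive prediction and between 7% and 8% after a negative one; the small gap from the reported 8% presumably reflects rounding of the published likelihood ratio and pre-test probability.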

4 Discussion

This study presents a systematic review and meta-analysis of ten studies that developed radiomics-based predictive models for LNM in PCa. To our knowledge, this is the first comprehensive evaluation of radiomics technology in predicting LNM in PCa. The pooled results indicate that the models performed well in predicting LNM (AUC 0.88 [95% CI: 0.85 - 0.91]), with high sensitivity (0.81 [95% CI: 0.62 - 0.91]) and specificity (0.83 [95% CI: 0.73 - 0.90]). Using radiomics-based predictive models to predict LNM of PCa before surgery may provide a useful reference for clinical decision-making.

Currently, imaging equipment such as CT and MRI primarily identifies abnormal lymph nodes by visual assessment of their size, shape, and contrast-enhanced regions. PET-CT can identify LNM through abnormal radiotracer uptake (e.g., short axis diameter of lymph node > 10 mm on CT, maximum standardized uptake value ≥ 2.5 on PET/CT). However, the involvement of many subjective factors in the evaluation can easily bias the diagnostic results (34, 35). Although widely validated, existing mainstream nomogram models have yet to demonstrate significant predictive performance. Bandini et al., Hueting et al., Oderda et al., and Gandaglia et al. each validated different nomogram models using large datasets, revealing variable predictive accuracy (8–10, 12). The variation might arise from differences in the composition of patients in the validation datasets and from the fact that the datasets originate from different regions. Di et al. reported that among high-risk PCa patients, all four evaluated models systematically overestimated the risk of LNM to varying degrees (36). Specifically, the MSKCC, Briganti 2012, Briganti 2017, and Briganti 2019 nomograms exhibited similar predictive performance for LNM, with respective AUC values of 0.526, 0.548, 0.555, and 0.573. Moreover, these nomograms displayed relatively high sensitivity (0.973, 0.991, 0.973, and 0.959, respectively) yet exhibited extremely low specificity (0.078, 0.093, 0.140, and 0.183, respectively). These results suggest that there is scope for improvement in the existing models in terms of accurately predicting LNM.

Image-based radiomics techniques partially address this limitation by enabling quantitative feature analysis for assessing disease progression. A bibliometric analysis of recent publications on PCa and radiomics technology within the Web of Science database over the last five years reveals a close association between PCa and radiomics as well as key concepts such as AUC, deep learning, and biomarkers (Figure 10). Numerous studies have now successfully constructed predictive models for predicting LNM in PCa based on extracted imaging features from CT scans, MRIs, or PET-CT scans (37). However, there is currently a lack of robust evidence supporting the ability of radiomics technology to diagnose diseases, and there are fewer systematic reviews and meta-analyses on the application of radiomics in PCa LNM.

Figure 10

Network diagram showing interconnected terms related to cancer research, such as “deep learning,” “biopsy,” and “cancer patient.” Nodes vary in color and size, reflecting connections and relevance over time from 2021.8 to 2022.6, as indicated by a color scale.

Figure 10. A bibliometric network map of the application of radiomics technology in prostate cancer. This figure summarizes publications from the last five years on prostate cancer and radiomics technology. The color bar in the lower right corner runs from white to purple, representing the progression from earlier to more recent publications. Each circle represents a specific theme or keyword, with its size corresponding to publication frequency. The figure was created using VOSviewer (version 1.6.20, www.vosviewer.com) based on scientific articles in the Web of Science database.

In this study, radiomics technology demonstrated good sensitivity and specificity in predicting LNM. Our findings align with those of Abbaspour et al., who conducted a meta-analysis of 36 studies on the diagnostic performance of radiomics for predicting LNM in colorectal cancer and reported a pooled AUC, sensitivity, and specificity of 0.814 (95% CI [0.78 - 0.85]), 0.77 (95% CI [0.69 - 0.84]), and 0.73 (95% CI [0.67 - 0.78]), respectively (38). Li et al. included 12 studies in their analysis of radiomics for predicting LNM in cervical cancer, with a pooled AUC, sensitivity, and specificity of 0.83 (95% CI [0.76 - 0.89]), 0.80 (95% CI [0.72 - 0.87]), and 0.76 (95% CI [0.72 - 0.80]), respectively (39). Although our findings show higher diagnostic performance than these two studies, this does not imply that radiomics is more effective in predicting LNM of PCa than in predicting metastasis of other tumors. Such differences may reflect heterogeneity among pathological tumor types, variability in imaging equipment and features, and challenges in reproducing modeling methodologies (23).

In terms of clinical practicality, Fagan nomogram analysis shows that a positive result from the radiomics-based model raises the post-test probability of LNM to 62%, while a negative result lowers it to 8%. Compared with the widely used MSKCC and Briganti nomograms (2012, 2017, and 2019 editions), the radiomics model demonstrates higher diagnostic efficacy (a better AUC value), and because it is non-invasive, it may reduce patients’ reliance on prostate biopsy and thereby avoid related complications. However, the post-test probability after a negative result (8%) is still higher for the radiomics model than for the MSKCC nomogram (5%) and the Briganti nomograms (minimum 2%), and this false-negative risk limits the reliability of the radiomics model as a stand-alone clinical tool. Therefore, at this stage, the radiomics model cannot completely replace the MSKCC or Briganti nomograms, but it can serve as a supplementary tool for clinical decision-making.
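
The arithmetic behind a Fagan nomogram can be sketched in a few lines of Python: the pre-test probability is converted to odds, multiplied by the likelihood ratio, and converted back to a probability. The sketch below uses the pooled likelihood ratios reported in our Results (5.43 and 0.22). The pre-test probability of 0.25 is a placeholder assumption chosen only for illustration, so the printed values will not necessarily match the 62% and 8% quoted above. The function name `post_test_probability` is likewise our own.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Bayes' rule in odds form, as read off a Fagan nomogram."""
    if not 0 < pre_test_prob < 1:
        raise ValueError("pre_test_prob must lie strictly between 0 and 1")
    if likelihood_ratio <= 0:
        raise ValueError("likelihood_ratio must be positive")
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


LR_POSITIVE = 5.43  # pooled positive likelihood ratio from the Results
LR_NEGATIVE = 0.22  # pooled negative likelihood ratio from the Results
ASSUMED_PRE_TEST = 0.25  # hypothetical pre-test probability (assumption)

after_positive = post_test_probability(ASSUMED_PRE_TEST, LR_POSITIVE)
after_negative = post_test_probability(ASSUMED_PRE_TEST, LR_NEGATIVE)
print(f"Post-test probability after a positive result: {after_positive:.1%}")
print(f"Post-test probability after a negative result: {after_negative:.1%}")
```

Changing `ASSUMED_PRE_TEST` shows how strongly the post-test probabilities depend on the baseline risk of the population being tested, which is one reason such figures should not be transferred between cohorts without care.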

During model construction, owing to the high dimensionality of radiomics features, the random forest (RF) algorithm has an advantage in handling complex nonlinear relationships, whereas the least absolute shrinkage and selection operator (LASSO) algorithm is better suited to fitting linear relationships (23). Nevertheless, in this study, the pooled sensitivity of the LASSO algorithm was 0.96 (95% CI [0.90 - 1.00]) (p = 0.02), while that of RF was 0.48 (95% CI [0.16 - 0.80]) (p = 0.01). Despite the statistical significance of this difference, substantial variations in training and validation set sizes, data distributions, and hyperparameter optimization strategies across studies may have influenced model generalizability. For example, in the studies by Cysouw et al., Hou et al., and Luining et al., the validation sets of the RF models (14, 50, and 51 samples, respectively) were markedly smaller than those of the LASSO models in the studies by Liu et al. and Peeken et al. (208 and 102 samples, respectively). Furthermore, in small-sample studies, the RF algorithm is more susceptible to overfitting, which degrades performance on the validation set, whereas the regularization in LASSO allows it to perform more stably under small-sample conditions (40). The RF algorithm is also more sensitive to hyperparameters such as tree depth and node size; if tuning is inadequate or data are limited, its performance may be underestimated. Although the current results indicate that the LASSO algorithm may be more robust, the strength of evidence for this conclusion is limited by the heterogeneity of the datasets and by methodological differences.

Currently, no single modeling approach is optimal, and each method has inherent limitations, such as the assumption of feature independence in logit models, the requirement for feature discretization in Bayesian networks, and the dependence on network configuration in deep learning (DL). In the future, ensuring that developed models are fully reproducible will matter more than pursuing predictive ability alone (23, 41).
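
One reason small validation sets weaken such comparisons is that a sensitivity estimated from few node-positive patients carries a wide confidence interval. The Python sketch below illustrates this with the Wilson score interval for a proportion. The counts of node-positive patients are hypothetical and are not taken from any included study, and the function name `wilson_interval` is our own.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical numbers of node-positive patients in a validation set,
# each with the same observed sensitivity of 0.80 (assumption).
for n_positive in (5, 15, 50):
    detected = round(0.8 * n_positive)
    low, high = wilson_interval(detected, n_positive)
    print(f"{detected}/{n_positive} detected: 95% CI {low:.2f} to {high:.2f}")
```

With the same observed sensitivity, the interval narrows markedly as the number of node-positive patients grows, so a low or high sensitivity from a very small validation set should be interpreted cautiously.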

Model validation methods include internal validation and external validation. At present, the predictive performance of most developed radiomics models is assessed by internal validation. According to the literature, external validation is recommended for datasets with more than 50 samples, while resampling methods are preferable for smaller datasets (42). Among the various resampling methods, cross-validation and the bootstrap are the most widely used. Cross-validation mainly evaluates the predictive performance of the model by computing statistics on the held-out samples and thereby assessing the model’s predictive ability (43). The bootstrap method focuses primarily on the statistical evaluation of the model, rather than on assessing predictions for the samples left out in each iteration (44–46). In our subgroup analysis, studies using the bootstrap method showed higher pooled sensitivity than those using cross-validation (0.97 vs. 0.72). However, this does not imply that the bootstrap is the best way to validate models. Model performance on new data largely depends on the quality of the training data. Furthermore, the best model is not the one that fits the calibration data best or gives the best validation results, but the one that predicts new samples reliably and with stable results (47).
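
The difference between the two resampling schemes can be seen in how they draw sample indices. The following Python sketch is a minimal illustration using only the standard library; it is not the procedure used by any included study, and the function names `k_fold_indices` and `bootstrap_indices` are our own. In k-fold cross-validation every sample is held out exactly once, whereas each bootstrap round draws samples with replacement, so some samples repeat and others are left out (the out-of-bag samples).

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is held out exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def bootstrap_indices(n_samples, n_rounds, seed=0):
    """Yield (in_bag, out_of_bag) index lists; in_bag is drawn with replacement."""
    rng = random.Random(seed)
    for _ in range(n_rounds):
        in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
        drawn = set(in_bag)
        out_of_bag = [j for j in range(n_samples) if j not in drawn]
        yield in_bag, out_of_bag


for train, test in k_fold_indices(10, 5):
    print("CV train:", sorted(train), "test:", sorted(test))

for in_bag, out_of_bag in bootstrap_indices(10, 3):
    print("Bootstrap in-bag:", sorted(in_bag), "out-of-bag:", out_of_bag)
```

The out-of-bag indices are returned here only to show which samples each resample omits; whether and how they are used for evaluation depends on the bootstrap variant chosen.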

The intraclass correlation coefficient (ICC) is a reliability statistic based on analysis of variance; it is commonly applied to continuous numerical data and typically takes values between 0 and 1 (48). In radiomics model construction, the ICC is used to assess the reproducibility and robustness of imaging features extracted from tumor lesions across different observers, segmentation methods, and time points, covering both inter-observer and intra-observer variability. Previous reviews have indicated that most imaging features are highly robust to inter-observer or intra-observer variability, regardless of the ICC threshold set and the type of tumor (49). In line with these findings, our study suggests that imaging features that have not undergone ICC screening may reduce diagnostic specificity, with a pooled specificity of 0.73 (95% CI [0.53 - 0.92]) (p = 0.04). Nevertheless, the ICC threshold has not yet been standardized, and numerous sources of heterogeneity exist, including scanner parameters, image resolution, tumor segmentation methods, and feature extraction software, which has so far precluded a quantitative evaluation of ICC (50).
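
As a concrete illustration of how an ICC is obtained from analysis-of-variance mean squares, the Python sketch below computes two single-measurement forms described by Shrout and Fleiss (48): ICC(2,1), which measures absolute agreement, and ICC(3,1), which measures consistency. Rows stand for lesions and columns for observers or segmentations. The small rating matrix is illustrative and does not come from any included study, the function name `icc_single` is our own, and which ICC form each included study used is not restated here.

```python
def icc_single(ratings):
    """Return (ICC(2,1), ICC(3,1)) for a complete subjects-by-raters table."""
    n = len(ratings)            # number of subjects (e.g., lesions)
    k = len(ratings[0])         # number of raters (e.g., observers)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 rows and >= 2 columns")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # Absolute agreement: rater offsets count as disagreement.
    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    # Consistency: systematic rater offsets are ignored.
    icc_3_1 = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return icc_2_1, icc_3_1


# Illustrative feature values: 6 lesions measured by 4 observers.
FEATURE_VALUES = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

agreement, consistency = icc_single(FEATURE_VALUES)
print(f"ICC(2,1) = {agreement:.2f}, ICC(3,1) = {consistency:.2f}")
```

The gap between the two values for the same data shows that the choice of ICC form, in addition to the threshold, affects which features are retained, which adds to the standardization problem noted above.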

Among the 10 studies included in this analysis, the mean RQS was 16.5, which is below both the 50% threshold (18/36) and the 60% threshold (21.6/36) of the maximum score. Nevertheless, this mean RQS exceeded those reported in other studies (38, 51). Subgroup analysis revealed that the pooled specificity for studies with RQS ≥ 17 was 0.77 (95% CI [0.65 - 0.88]) (p = 0.01), with specificity decreasing as the score increased, possibly owing to inter-rater variability in RQS scoring and the difficulty of reproducing RQS assessments (52). Furthermore, the RQS tool currently applies only to traditional radiomics workflows and lacks corresponding evaluation criteria for DL research. Studies that use DL algorithms to identify LNM status are steadily increasing (53–55). However, DL data processing differs from the classic feature processing, selection, and model tuning procedures in radiomics. For instance, traditional radiomics features are based on manual segmentation, whereas DL algorithms extract features directly from images and have a black-box nature, so the “feature reproducibility” and “feature interpretability” scoring items of the RQS tool cannot be applied to them directly. Additionally, DL algorithms mainly improve data reliability through augmentation methods such as image rotation and flipping, for which the RQS tool has no corresponding evaluation criteria. Therefore, the RQS tool needs further improvement to keep pace with algorithmic progress.

This study has several limitations. First, the number of articles meeting the inclusion criteria is limited, and some data were derived through calculation. Second, most included studies were retrospective, with relatively few prospective studies, potentially weakening the strength of evidence. Third, while the number of studies on predictive models based on radiomics features has increased annually, there are very few multi-center or cross-regional external validation studies of the developed models, which may undermine the reproducibility and credibility of these models. Fourth, imaging features may be influenced by factors such as imaging equipment and protocols, contrast agent types, tumor segmentation methods, feature extraction software, and modeling methods. Such variability may contribute to differences in diagnostic performance and underscores the need for authoritative, standardized operational guidelines. Finally, despite the widespread use of the RQS as a quality assessment tool, there is currently no fully unified expert consensus on quality assessment procedures for radiomics workflows; further improvements are necessary.

5 Conclusions

The promising diagnostic capability of radiomics technology in preoperatively predicting LNM in PCa holds potential clinical relevance for guiding treatment decisions. Nevertheless, the current limitations associated with this technology may restrict its immediate clinical applicability. Further comprehensive research is warranted to validate the findings of this study and facilitate the integration of this technology into clinical practice.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Author contributions

YZ: Data curation, Formal analysis, Writing – original draft. YD: Data curation, Formal analysis, Writing – review & editing. BZ: Investigation, Writing – review & editing. HZ: Methodology, Writing – review & editing. PS: Supervision, Writing – review & editing. ZH: Supervision, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2025.1577794/full#supplementary-material

References

1. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer J Clin. (2024) 74:229–63. doi: 10.3322/caac.21834

2. Siegel RL, Miller KD, and Jemal A. Cancer statistics, 2018. CA: A Cancer J Clin. (2018) 68:7–30. doi: 10.3322/caac.21442

3. Mottet N, Bellmunt J, Bolla M, Briers E, Cumberbatch MG, De Santis M, et al. EAU-ESTRO-SIOG guidelines on prostate cancer. Part 1: Screening, diagnosis, and local treatment with curative intent. Eur Urol. (2017) 71:618–29. doi: 10.1016/j.eururo.2016.08.003

4. Franklin A, Yaxley WJ, Raveenthiran S, Coughlin G, Gianduzzo T, Kua B, et al. Histological comparison between predictive value of preoperative 3-T multiparametric MRI and 68Ga-PSMA PET/CT scan for pathological outcomes at radical prostatectomy and pelvic lymph node dissection for prostate cancer. BJU Int. (2021) 127:71–9. doi: 10.1111/bju.v127.1

5. Petersen LJ, Nielsen JB, Langkilde NC, Petersen A, Afshar-Oromieh A, De Souza NM, et al. (68)Ga-PSMA PET/CT compared with MRI/CT and diffusion-weighted MRI for primary lymph node staging prior to definitive radiotherapy in prostate cancer: a prospective diagnostic test accuracy study. World J Urol. (2020) 38:939–48. doi: 10.1007/s00345-019-02846-z

6. Jilg CA, Drendel V, Rischke HC, Beck TI, Reichel K, Kronig M, et al. Detection rate of (18)F-choline PET/CT and (68)Ga-PSMA-HBED-CC PET/CT for prostate cancer lymph node metastases with direct link from PET to histopathology: dependence on the size of tumor deposits in lymph nodes. J Nucl Med. (2019) 60:971–7. doi: 10.2967/jnumed.118.220541

7. Hofman MS, Lawrentschuk N, Francis RJ, Tang C, Vela I, Thomas P, et al. Prostate-specific membrane antigen PET-CT in patients with high-risk prostate cancer before curative-intent surgery or radiotherapy (proPSMA): a prospective, randomised, multicentre study. Lancet. (2020) 395:1208–16. doi: 10.1016/S0140-6736(20)30314-7

8. Bandini M, Marchioni M, Pompe RS, Tian Z, Gandaglia G, Fossati N, et al. First North American validation and head-to-head comparison of four preoperative nomograms for prediction of lymph node invasion before radical prostatectomy. BJU Int. (2018) 121:592–9. doi: 10.1111/bju.2018.121.issue-4

9. Hueting TA, Cornel EB, Somford DM, Jansen H, van Basten JA, Pleijhuis RG, et al. External validation of models predicting the probability of lymph node involvement in prostate cancer patients. Eur Urol Oncol. (2018) 1:411–7. doi: 10.1016/j.euo.2018.04.016

10. Oderda M, Diamand R, Albisinni S, Calleris G, Carbone A, Falcone M, et al. Indications for and complications of pelvic lymph node dissection in prostate cancer: accuracy of available nomograms for the prediction of lymph node invasion. BJU Int. (2021) 127:318–25. doi: 10.1111/bju.v127.3

11. Diamand R, Oderda M, Albisinni S, Fourcade A, Fournier G, Benamran D, et al. External validation of the Briganti nomogram predicting lymph node invasion in patients with intermediate and high-risk prostate cancer diagnosed with magnetic resonance imaging-targeted and systematic biopsies: A European multicenter study. Urol Oncol. (2020) 38:847.e9–e16. doi: 10.1016/j.urolonc.2020.04.011

12. Gandaglia G, Martini A, Ploussard G, Fossati N, Stabile A, De Visschere P, et al. External validation of the 2019 Briganti nomogram for the identification of prostate cancer patients who should be considered for an extended pelvic lymph node dissection. Eur Urol. (2020) 78:138–42. doi: 10.1016/j.eururo.2020.03.023

13. Lucciola S, Pisciotti ML, Frisenda M, Magliocca F, Gentilucci A, Del Giudice F, et al. Predictive role of node-rads score in patients with prostate cancer candidates for radical prostatectomy with extended lymph node dissection: comparative analysis with validated nomograms. Prostate Cancer Prostatic Dis. (2023) 26:379–87. doi: 10.1038/s41391-022-00564-z

14. Ganeshan B and Miles KA. Quantifying tumour heterogeneity with CT. Cancer Imaging: Off Publ Int Cancer Imaging Society. (2013) 13:140–9. doi: 10.1102/1470-7330.2013.0015

15. Davnall F, Yip CSP, Ljungqvist G, Selmi M, Ng F, Sanghera B, et al. Assessment of tumor heterogeneity: An emerging imaging tool for clinical practice? Insights into Imaging. (2012) 3:573–89. doi: 10.1007/s13244-012-0196-6

16. Zhang Y, Li X, Lv Y, and Gu X. Review of value of CT texture analysis and machine learning in differentiating fat-poor renal angiomyolipoma from renal cell carcinoma. Tomography (Ann Arbor Mich). (2020) 6:325–32. doi: 10.18383/j.tom.2020.00039

17. Ferro M, de Cobelli O, Musi G, Del Giudice F, Carrieri G, Busetto GM, et al. Radiomics in prostate cancer: An up-to-date review. Ther Adv Urology. (2022) 14:17562872221109020. doi: 10.1177/17562872221109020

18. Lambin P, van Stiphout RGPM, Starmans MHW, Rios-Velazquez E, Nalbantov G, Aerts HJWL, et al. Predicting outcomes in radiation oncology–multifactorial decision support systems. Nat Rev Clin Oncol. (2013) 10:27–40. doi: 10.1038/nrclinonc.2012.196

19. Bourbonne V, Jaouen V, Nguyen TA, Tissot V, Doucet L, Hatt M, et al. Development of a radiomic-based model predicting lymph node involvement in prostate cancer patients. Cancers. (2021) 13:5672. doi: 10.3390/cancers13225672

20. Vickers AJ. Prediction models: Revolutionary in principle, but do they do more good than harm? J Clin Oncol. (2011) 29:2951–2. doi: 10.1200/JCO.2011.36.1329

21. Page MJ, Moher D, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. PRISMA 2020 explanation and elaboration: updated guidance and exemplars for reporting systematic reviews. BMJ. (2021) 372:n160. doi: 10.1136/bmj.n160

22. Whiting PF. QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. (2011) 155:529. doi: 10.7326/0003-4819-155-8-201110180-00009

23. Lambin P, Leijenaar RTH, Deist TM, Peerlings J, De Jong EEC, Van Timmeren J, et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. (2017) 14:749–62. doi: 10.1038/nrclinonc.2017.141

24. Liu X, Zhang Y, Sun Z, Wang X, Zhang X, and Wang X. Prediction of pelvic lymph node metastasis in prostate cancer using radiomics based on T2-weighted imaging. J Cent South Univ (Med Sci). (2022) 47:1025–36. doi: 10.11817/j.issn.1672-7347.2022.210692

25. Cysouw MCF, Jansen BHE, Van De Brug T, Oprea-Lager DE, Pfaehler E, De Vries BM, et al. Machine learning-based analysis of [(18)F]DCFPyL PET radiomics for risk stratification in primary prostate cancer. Eur J Nucl Med Mol Imaging. (2021) 48:340–9. doi: 10.1007/s00259-020-04971-z

26. Luining WI, Oprea-Lager DE, Vis AN, Van Moorselaar RJA, Knol RJJ, Wondergem M, et al. Optimization and validation of 18F-DCFPyL PET radiomics-based machine learning models in intermediate- to high-risk primary prostate cancer. PloS One. (2023) 18:e0293672. doi: 10.1371/journal.pone.0293672

27. Hou Y, Bao J, Song Y, Bao M-L, Jiang K-W, Zhang J, et al. Integration of clinicopathologic identification and deep transferrable image feature representation improves predictions of lymph node metastasis in prostate cancer. EBioMedicine. (2021) 68:103395. doi: 10.1016/j.ebiom.2021.103395

28. Lai S and Shilei Z. Diagnostic value of radiomics based on fat-suppressed T2-weighted imaging magnetic resonance imaging for pelvic metastatic lymph nodes of prostate cancer. J China Med University. (2021) 50:230–4. doi: 10.12007/j.issn.0258-4646.2021.03.008

29. Liu X, Tian J, Wu J, Zhang Y, Wang X, Zhang X, et al. Utility of diffusion weighted imaging-based radiomics nomogram to predict pelvic lymph nodes metastasis in prostate cancer. BMC Med Imaging. (2022) 22:190. doi: 10.1186/s12880-022-00905-3

30. Liu X, Wang X, Zhang Y, Sun Z, Zhang X, and Wang X. Preoperative prediction of pelvic lymph nodes metastasis in prostate cancer using an ADC-based radiomics model: comparison with clinical nomograms and PI-RADS assessment. Abdom Radiol. (2022) 47:3327–37. doi: 10.1007/s00261-022-03583-5

31. Peeken JC, Shouman MA, Kroenke M, Rauscher I, Maurer T, Gschwend JE, et al. A CT-based radiomics model to detect prostate cancer lymph node metastases in PSMA radioguided surgery patients. Eur J Nucl Med Mol Imaging. (2020) 47:2968–77. doi: 10.1007/s00259-020-04864-1

32. Zamboglou C, Carles M, Fechter T, Kiefer S, Reichel K, Fassbender TF, et al. Radiomic features from PSMA PET for non-invasive intraprostatic tumor discrimination and characterization in patients with intermediate- and high-risk prostate cancer - a comparison study with histology reference. Theranostics. (2019) 9:2595–605. doi: 10.7150/thno.32376

33. Zheng H, Miao Q, Liu Y, Mirak SA, Hosseiny M, Scalzo F, et al. Multiparametric MRI-based radiomics model to predict pelvic lymph node invasion for patients with prostate cancer. Eur Radiol. (2022) 32:5688–99. doi: 10.1007/s00330-022-08625-6

34. Zacho HD and Petersen LJ. Bone flare to androgen deprivation therapy in metastatic, hormone-sensitive prostate cancer on 68Ga-prostate-specific membrane antigen PET/CT. Clin Nucl Med. (2018) 43:e404–e6. doi: 10.1097/RLU.0000000000002273

35. Kopp D, Kopp J, Bernhardt E, Manka L, Beck A, Gerullis H, et al. 68Ga-prostate-specific membrane antigen positron emission tomography-computed tomography-based primary staging and histological correlation after extended pelvic lymph node dissection in intermediate-risk prostate cancer. Urol Int. (2022) 106:56–62. doi: 10.1159/000515651

36. Di Pierro GB, Salciccia S, Frisenda M, Tufano A, Sciarra A, Scarrone E, et al. Comparison of four validated nomograms (Memorial Sloan Kettering Cancer Center, Briganti 2012, 2017, and 2019) predicting lymph node invasion in patients with high-risk prostate cancer candidates for radical prostatectomy and extended pelvic lymph node dissection: clinical experience and review of the literature. Cancers (Basel). (2023) 15:1683. doi: 10.3390/cancers15061683

37. Faiella E, Vaccarino F, Ragone R, D’Amone G, Cirimele V, Piccolo CL, et al. Can machine learning models detect and predict lymph node involvement in prostate cancer? A comprehensive systematic review. J Clin Med. (2023) 12:7032. doi: 10.3390/jcm12227032

38. Abbaspour E, Karimzadhagh S, Monsef A, Joukar F, Mansour-Ghanaei F, and Hassanipour S. Application of radiomics for preoperative prediction of lymph node metastasis in colorectal cancer: a systematic review and meta-analysis. Int J Surg. (2024) 110:3795–813. doi: 10.1097/JS9.0000000000001239

39. Li L, Zhang J, Zhe X, Tang M, Zhang X, Lei X, et al. A meta-analysis of MRI-based radiomic features for predicting lymph node metastasis in patients with cervical cancer. Eur J Radiol. (2022) 151:110243. doi: 10.1016/j.ejrad.2022.110243

40. Vabalas A, Gowen E, Poliakoff E, and Casson AJ. Machine learning algorithm validation with a limited sample size. PloS One. (2019) 14:e0224365. doi: 10.1371/journal.pone.0224365

41. Parmar C, Grossmann P, Bussink J, Lambin P, and Aerts H. Machine learning methods for quantitative radiomic biomarkers. Sci Rep. (2015) 5:13087. doi: 10.1038/srep13087

42. Westad F and Marini F. Validation of chemometric models – a tutorial. Analytica Chimica Acta. (2015) 893:14–24. doi: 10.1016/j.aca.2015.06.056

43. Franca T, Goncalves D, and Cena C. ATR-FTIR spectroscopy combined with machine learning for classification of PVA/PVP blends in low concentration. Vib Spectrosc. (2022) 120:103378. doi: 10.1016/j.vibspec.2022.103378

44. Kucheryavskiy S, Zhilin S, Rodionova OY, and Pomerantsev AL. Procrustes cross-validation — a bridge between cross-validation and independent validation set. Anal Chem. (2020) 92:11842–50. doi: 10.26434/chemrxiv.12327803.v1

45. Esbensen KH and Geladi P. Principles of Proper Validation: use and abuse of re-sampling for validation. J Chemometrics. (2010) 24:168–87. doi: 10.1002/cem.v24:3/4

46. Pomerantsev AL and Rodionova OY. Procrustes cross-validation of short datasets in PCA context. Talanta. (2021) 226:122104. doi: 10.1016/j.talanta.2021.122104

47. Lopez E, Etxebarria-Elezgarai J, Amigo JM, and Seifert A. The importance of choosing a proper validation strategy in predictive models. A tutorial with real examples. Analytica Chimica Acta. (2023) 1275:341532. doi: 10.1016/j.aca.2023.341532

48. Shrout PE and Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychol Bull. (1979) 86:420–8. doi: 10.1037/0033-2909.86.2.420

49. Xue C, Yuan J, Lo GG, Chang ATY, Poon DMC, Wong OL, et al. Radiomics feature reliability assessed by intraclass correlation coefficient: a systematic review. Quantitative Imaging Med Surgery. (2021) 11:4431–60. doi: 10.21037/qims-21-86

50. Zhuang M, Li X, Qiu Z, and Guan J. Does consensus contour improve robustness and accuracy in 18F-FDG PET radiomic features? EJNMMI Phys. (2024) 11:48. doi: 10.1186/s40658-024-00652-0

51. Ren J, Li Y, Liu XY, Zhao J, He YL, Jin ZY, et al. Diagnostic performance of ADC values and MRI-based radiomics analysis for detecting lymph node metastasis in patients with cervical cancer: A systematic review and meta-analysis. Eur J Radiol. (2022) 156:110504. doi: 10.1016/j.ejrad.2022.110504

52. Akinci D’Antonoli T, Cavallo AU, Vernuccio F, Stanzione A, Klontzas ME, Cannella R, et al. Reproducibility of radiomics quality score: An intra- and inter-rater reliability study. Eur Radiology. (2023) 34:2791–804. doi: 10.1007/s00330-023-10217-x

53. Bian Y, Zheng Z, Fang X, Jiang H, Zhu M, Yu J, et al. Artificial intelligence to predict lymph node metastasis at CT in pancreatic ductal adenocarcinoma. Radiology. (2023) 306:160–9. doi: 10.1148/radiol.220329

54. Chen M, Kong C, Lin G, Chen W, Guo X, Chen Y, et al. Development and validation of convolutional neural network-based model to predict the risk of sentinel or non-sentinel lymph node metastasis in patients with breast cancer: a machine learning study. EClinicalMedicine. (2023) 63:102176. doi: 10.1016/j.eclinm.2023.102176

55. Bedrikovetski S, Dudi-Venkata NN, Kroon HM, Seow W, Vather R, Carneiro G, et al. Artificial intelligence for pre-operative lymph node staging in colorectal cancer: a systematic review and meta-analysis. BMC Cancer. (2021) 21:1058. doi: 10.1186/s12885-021-08773-w

Keywords: lymph node metastasis, machine learning, magnetic resonance imaging, positron emission tomography - computed tomography, prostate cancer, radiomics

Citation: Zheng Y, Du Y, Zhang B, Zhang H, Shang P and Hou Z (2025) Application of radiomics-based prediction model to predict preoperative lymph node metastasis in prostate cancer: a systematic review and meta-analysis. Front. Oncol. 15:1577794. doi: 10.3389/fonc.2025.1577794

Received: 16 February 2025; Accepted: 02 June 2025;
Published: 20 June 2025.

Edited by:

Sharon R. Pine, University of Colorado Anschutz Medical Campus, United States

Reviewed by:

Vittorio Canale, Cardinal Massaia Asti Hospital, Italy
Kang Hu, Soochow University, China

Copyright © 2025 Zheng, Du, Zhang, Zhang, Shang and Hou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Zizhen Hou, houzizhen@126.com; Panfeng Shang, shangpf@lzu.edu.cn

^†These authors have contributed equally to this work and share first authorship
