SYSTEMATIC REVIEW article

Front. Immunol., 10 February 2026

Sec. Cancer Immunity and Immunotherapy

Volume 17 - 2026 | https://doi.org/10.3389/fimmu.2026.1753166

This article is part of the Research Topic: Novel Immune Markers and Predictive Models for Diagnosis, Immunotherapy and Prognosis in Lung Cancer

CT-based radiomics in predicting the efficacy of preoperative neoadjuvant chemoimmunotherapy for non-small cell lung cancer: a systematic review and meta-analysis

  • 1Department of Graduate School, Beijing University of Chinese Medicine, Beijing, China
  • 2Department of Oncology, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
  • 3Capital Medical University, Beijing, China
  • 4Beijing Hospital of Traditional Chinese Medicine, Beijing, China
  • 5China Academy of Chinese Medical Sciences, Beijing, China
  • 6Faculty of Chinese Medicine and State Key Laboratory of Mechanism and Quality of Chinese Medicine, Macao University of Science and Technology, Macao, Macao SAR, China

Introduction: Neoadjuvant chemoimmunotherapy significantly improves surgical resection rates, major pathological response (MPR) rates, pathological complete response (pCR) rates, and survival rates in patients with resectable non-small cell lung cancer (NSCLC). Through a systematic review and meta-analysis, we examined the diagnostic value of CT-based predictive models for predicting the outcomes of neoadjuvant chemoimmunotherapy in NSCLC.

Methods: PubMed, Embase, Web of Science, China National Knowledge Infrastructure, and Wanfang were systematically searched up to January 12, 2026. To assess study risk of bias and quality, we employed the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool and the Radiomics Quality Score version 2.0 (RQS 2.0). The diagnostic accuracy of radiomics for predicting pathological response to neoadjuvant chemoimmunotherapy in NSCLC patients was evaluated by calculating the area under the curve (AUC), sensitivity, specificity, and accuracy for each study.

Results: The meta-analysis included 17 studies with 4,510 individual subjects. The pooled AUC, sensitivity, and specificity of internal validation models were 0.81, 0.79, and 0.69, respectively. The pooled AUC, sensitivity, and specificity of external validation models were 0.80, 0.75, and 0.73, respectively. Subgroup analyses revealed that models using deep learning (DL) algorithms demonstrated superior sensitivity (internal: 0.79, 95% CI: 0.73-0.85; external: 0.77, 95% CI: 0.72-0.82) and specificity (internal: 0.79, 95% CI: 0.74-0.85; external: 0.73, 95% CI: 0.68-0.78) compared to those using machine learning (ML). Models predicting MPR exhibited higher sensitivity in internal validation (0.82, 95% CI: 0.77-0.86), while showing higher specificity in external validation (0.76, 95% CI: 0.72-0.81). In contrast, models predicting pCR demonstrated the opposite pattern. Features selected using the intraclass correlation coefficient (ICC) demonstrated significantly higher pooled sensitivity (internal: 0.85, 95% CI: 0.80-0.89; external: 0.81, 95% CI: 0.76-0.87) and specificity (internal: 0.70, 95% CI: 0.63-0.78; external: 0.77, 95% CI: 0.71-0.82) compared to non-ICC-selected features. When stratified by the median Radiomics Quality Score (RQS ≥ 41.07%), higher-scoring studies were associated with lower pooled sensitivity (internal: 0.78, 95% CI: 0.73-0.84; external: 0.71, 95% CI: 0.66-0.76) but a trend toward higher specificity. Finally, models based on two-dimensional regions of interest (2D ROI) demonstrated higher pooled sensitivity (internal: 0.86, 95% CI: 0.80-0.92; external: 0.87, 95% CI: 0.79-0.96) and specificity in external validation (0.80, 95% CI: 0.68-0.91).

Conclusion: Due to its good diagnostic accuracy, widespread use, and low cost, CT-based radiomics can be used to predict the efficacy of neoadjuvant chemoimmunotherapy in NSCLC preoperatively.

Systematic Review Registration: https://www.crd.york.ac.uk/prospero/, identifier (CRD420251174128).

1 Introduction

Lung cancer is one of the most commonly diagnosed cancers and the leading cause of cancer-related deaths globally. Non-small cell lung cancer (NSCLC) accounts for approximately 85% of all lung cancer cases. Surgery is the standard treatment for resectable NSCLC. Unfortunately, even after radical resection, early-stage disease still carries a high risk of recurrence and mortality (1). A meta-analysis demonstrated that neoadjuvant chemotherapy significantly improved overall survival (OS), time to distant recurrence, and recurrence-free survival (RFS) in patients with resectable NSCLC (2). However, compared to surgery alone, neoadjuvant chemotherapy yields only a 5% to 6% absolute difference in 5-year RFS and OS rates, a figure that remains unsatisfactory. In the context of early-stage or locally advanced NSCLC, neoadjuvant chemoimmunotherapy has emerged as a significant research direction in solid tumor treatment. Neoadjuvant therapy combining PD-1/PD-L1 blockade with chemotherapy has demonstrated significant improvements in resection rate, major pathological response (MPR) rate, pathological complete response (pCR) rate, and survival rates among patients with resectable NSCLC (3, 4). A meta-analysis demonstrated that patients receiving neoadjuvant or perioperative chemotherapy combined with immunotherapy exhibited significantly superior event-free survival (EFS) compared to those receiving neoadjuvant chemotherapy alone [hazard ratio (HR) 0.58, 95% confidence interval (CI): 0.51-0.66] (5).

In most trials involving neoadjuvant chemoimmunotherapy for NSCLC, MPR serves as a surrogate endpoint for OS and disease-free survival (DFS). The MPR rates observed in current neoadjuvant chemoimmunotherapy clinical studies vary significantly, ranging from 18% to 83% (6). Clinical trials have established that MPR and pCR serve as key surrogate endpoints for evaluating chemoimmunotherapy efficacy. However, a substantial proportion of patients fail to benefit from neoadjuvant chemoimmunotherapy. The occurrence of immune-related adverse events and relatively low clinical response rates in patients with NSCLC underscore the limitations of neoadjuvant chemoimmunotherapy (7, 8). High intratumoral heterogeneity may be a primary driver of this therapeutic disparity (9, 10). The peritumoral microenvironment, characterized by abundant tumor-infiltrating lymphocytes and tumor-associated macrophages, can modulate responses to immunotherapy (11, 12). Current clinical approaches to obtaining information on intratumoral heterogeneity and the peritumoral microenvironment involve invasive tissue biopsies, limiting the feasibility of repeated needle biopsies during neoadjuvant chemoimmunotherapy. Consequently, there is an urgent need for an alternative method to predict response to neoadjuvant treatment. Accurate prediction of neoadjuvant chemoimmunotherapy outcomes enables timely formulation of appropriate treatment strategies, which avoids potential undertreatment or overtreatment and prevents unnecessary complications associated with surgery. Programmed Death-Ligand 1/Programmed Death-1 (PD-L1/PD-1) represents the most critical immunotherapy target in NSCLC. However, recent studies demonstrate that PD-L1 expression exhibits significant dynamic changes throughout disease progression. This may result in initial test results failing to accurately reflect the immune status at the time of treatment, thereby compromising predictive reliability. The lack of standardized strategies for PD-L1 testing (including inconsistent timing of assessment, number of biopsies, and specimen interpretation criteria) further limits its predictive value. In summary, reliable biomarkers for predicting pathological response after neoadjuvant therapy in resectable NSCLC are currently unavailable (13).

Computed tomography (CT), as a non-invasive imaging modality, has been widely adopted for preoperative evaluation of NSCLC and has demonstrated applicability in assessing pathological response to NSCLC treatment. However, numerous studies indicate that tumor size measurements based on CT scans cannot reliably predict pathological response following neoadjuvant therapy for resectable NSCLC (14). In the NADIM trial, 33% of patients with stable disease and 73% of those with radiographic partial response achieved pCR (15). This discrepancy is typically attributable to false positives. Lymphocytic infiltration can obscure lesions, as radiographic findings may fail to reflect actual tumor regression. To enhance the accuracy of predicting which patients will achieve postoperative MPR in NSCLC following neoadjuvant chemoimmunotherapy, the integration of multidimensional and high-throughput imaging features is essential.

With the advancement of artificial intelligence (AI), extracting radiomics or deep learning (DL) features from CT scans can provide additional information to enhance the diagnosis, treatment, and prognosis of lung cancer. A multicenter study predicted major pathological response to neoadjuvant chemotherapy and immunotherapy in non-small cell lung cancer using DL. After integrating clinical features into the DL score, the combined model demonstrated satisfactory performance in both internal validation (AUC: 0.77, 95% CI: 0.64-0.89) and external validation cohorts (AUC: 0.75, 95% CI: 0.62-0.87) (16). Liu et al. developed and validated a radiomics-based nomogram to predict major pathological response to neoadjuvant immunochemotherapy in potentially resectable NSCLC patients. The radiological-clinical combined model demonstrated excellent discriminatory performance, achieving AUCs of 0.84 (95% CI, 0.74-0.93) and 0.81 (95% CI, 0.63-0.98) (17). However, aggregated results derived from different predictive models, outcome measures, and model construction approaches appear to be inconsistent. Furthermore, the lack of standardized radiomics workflows limits the robustness and reproducibility of these models.

This study aims to systematically review and comprehensively summarize the application of CT in predicting the efficacy of preoperative neoadjuvant chemoimmunotherapy for NSCLC, focusing on its diagnostic performance, sensitivity, and specificity. It seeks to provide clinicians with a potential reference tool for evaluating the efficacy of neoadjuvant chemoimmunotherapy, thereby facilitating the development of personalized treatment strategies.

2 Materials and methods

2.1 Study protocol and registration

This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (18), ensuring a structured and transparent methodology. The review protocol was registered and approved in the International Prospective Register of Systematic Reviews (PROSPERO) database (Registration ID: CRD420251174128).

2.2 Literature search strategy

In accordance with the PRISMA statement, two authors (HC and BF) independently performed a comprehensive database search using PubMed, Embase, Web of Science, China National Knowledge Infrastructure, and Wanfang as primary sources. The search covered the period from each database’s inception to January 12, 2026. A combination of Medical Subject Headings (MeSH) and keywords associated with CT, NSCLC, neoadjuvant therapy, chemoimmunotherapy, and prediction was employed. Additional target literature was identified by reviewing the references of included studies. The specific search strategy is detailed in Supplementary Material.

2.3 Inclusion criteria and screening

Inclusion and exclusion criteria were established based on the PICOS framework. The inclusion criteria were as follows: (1) Population (P): patients with NSCLC; (2) Intervention (I): an AI algorithm applied to CT imaging to predict neoadjuvant chemoimmunotherapy efficacy; (3) Comparator (C): patients not achieving pCR or MPR after neoadjuvant therapy; (4) Outcomes (O): efficacy (MPR/pCR status) presented in a 2×2 diagnostic contingency table; (5) Study design (S): machine learning (ML) or DL studies evaluating the diagnostic value of imaging for NSCLC, published in peer-reviewed journals.

The exclusion criteria were: (1) insufficient outcome data for analysis; (2) conference papers, case reports, systematic reviews, etc.; (3) unfinished studies or published research including unfinished data; (4) duplicate reports; (5) studies for which the full text was unavailable.

2.4 Study selection and data extraction

Two authors (HC and BF) recorded data in standardized spreadsheets. Any discrepancies were resolved through consultation with a third author (NQ). Extracted data included: (1) study characteristics: first author, publication year, country, data source, study design; (2) patient characteristics: sample size, training/validation set distribution, tumor stage; (3) radiomics-related parameters: imaging modality, tumor lesion segmentation method, region of interest (ROI) size, feature extraction software, feature type, and AI method; (4) validation methods; (5) model performance metrics: AUC values, sensitivity, specificity with 95% confidence intervals (95% CIs), and true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).

For studies that did not directly report sensitivity, specificity, or 2×2 tables, data were extracted from ROC curves using GetData Graph Digitizer 2.24 software (19). To mitigate selection bias, AUC values were derived from all validation set data within prediction models based on radiomics features, with stratified reporting for internal and external validation (20). Where two thresholds were used in model construction, the model built using the fixed threshold was selected.
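When a study reports only sensitivity and specificity (for example, values read off a digitized ROC curve), the 2×2 table can be approximated from the group sizes. The following is a minimal sketch of that back-calculation, not the authors' actual procedure; the function name, the `TwoByTwo` class, and the example numbers are hypothetical, and rounding to whole patients is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Diagnostic contingency table for one validation dataset."""
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def reconstruct_table(sensitivity: float, specificity: float,
                      n_responders: int, n_non_responders: int) -> TwoByTwo:
    """Approximate TP/FP/TN/FN from reported sensitivity and specificity.

    Assumption: counts are rounded to the nearest whole patient, so the
    reconstructed table may differ slightly from the true one.
    """
    tp = round(sensitivity * n_responders)
    tn = round(specificity * n_non_responders)
    return TwoByTwo(tp=tp, fp=n_non_responders - tn,
                    tn=tn, fn=n_responders - tp)


# Hypothetical dataset: 50 responders (MPR or pCR) and 50 non-responders.
example = reconstruct_table(0.80, 0.70, n_responders=50, n_non_responders=50)
print(example)
```

Because of the rounding step, the sensitivity and specificity recomputed from the reconstructed table should be checked against the reported values before pooling.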

2.5 Quality assessment

The Radiomics Quality Score version 2.0 (RQS) checklist and the QUADAS-2 tool were employed to evaluate the included studies (21, 22). Two authors (XQ and CQ) conducted independent assessments, resolving discrepancies through consultation with a third author (WH). The RQS 2.0 framework (accessed November 23, 2025, at https://www.radiomics.world/rqs2), proposed by Lambin and colleagues, was used to evaluate the quality of radiomics research reports across 42 assessment dimensions within 9 key domains (23, 24). The maximum achievable score is 56 points (100%). The QUADAS-2 tool was used to assess risk of bias and applicability in patient selection, index test, reference standard, and flow/timing. Responses were recorded as “yes,” “no,” or “unclear” in RevMan 5.4.

2.6 Data analysis

This meta-analysis employed STATA software (version 14) and RevMan (version 5.4) for statistical analysis. A bivariate random-effects model was used to pool data across different validation datasets. Pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and their corresponding 95% CIs were calculated. Summary receiver operating characteristic (SROC) curves were constructed, and the area under the curve (AUC) was calculated, to assess overall diagnostic performance (25). Diagnostic accuracy based on AUC values was categorized as: 0.90-1 (excellent), 0.80-0.90 (good), 0.70-0.80 (fair), 0.60-0.70 (poor), and 0.50-0.60 (very poor). Threshold effects were evaluated by calculating Spearman’s rank correlation coefficient between logit(sensitivity) and logit(1-specificity).
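The relationship between sensitivity, specificity, the likelihood ratios, and the diagnostic odds ratio (DOR) can be illustrated with a short sketch. It applies the standard definitions to point estimates only; it does not reproduce the bivariate model, so values computed this way can differ slightly from model-based pooled estimates. The function names are hypothetical, and treating the lower bound of each AUC band as inclusive is an assumption, since the stated bands share their boundaries.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (PLR, NLR, DOR) from a sensitivity/specificity pair."""
    plr = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    nlr = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                           # diagnostic odds ratio
    return plr, nlr, dor


def auc_category(auc: float) -> str:
    """Map an AUC to the qualitative bands used in this section.

    Assumption: the lower bound of each band is inclusive.
    """
    if not 0.5 <= auc <= 1.0:
        raise ValueError("AUC is expected to lie between 0.5 and 1.0")
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    if auc >= 0.6:
        return "poor"
    return "very poor"


# Illustrative inputs: a sensitivity of 0.79 and a specificity of 0.69.
plr, nlr, dor = likelihood_ratios(0.79, 0.69)
print(f"PLR={plr:.2f}, NLR={nlr:.2f}, DOR={dor:.1f}, AUC band={auc_category(0.81)}")
```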

Heterogeneity was assessed using I² and Q statistics, with I² values categorized as low (0-50%), moderate (50-75%), or high (>75%). Forest plots displayed sensitivity and specificity across studies.
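As a rough illustration of how the two heterogeneity statistics relate, I² can be derived from Cochran's Q as I² = max(0, (Q - df) / Q) × 100, where df is the number of datasets minus one. The sketch below uses generic inverse-variance weights and hypothetical inputs; it is not the routine used by the statistical software, and assigning exactly 50% to the moderate band is an assumption.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q: float, df: int) -> float:
    """Percentage of total variation attributed to between-study heterogeneity."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_label(i2: float) -> str:
    """Bands from the text; exactly 50% is treated as moderate (assumption)."""
    if i2 > 75:
        return "high"
    if i2 >= 50:
        return "moderate"
    return "low"


# Hypothetical logit-sensitivity estimates and their variances.
effects = [1.1, 1.6, 0.7, 1.9]
variances = [0.04, 0.05, 0.03, 0.06]
q = cochran_q(effects, variances)
i2 = i_squared(q, df=len(effects) - 1)
print(f"Q={q:.2f}, I2={i2:.1f}% ({heterogeneity_label(i2)})")
```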

Meta-regression analyses and subgroup analyses of the radiomics model were performed to compare studies using different datasets, including clinical characteristics, model calibration methods, study design, data source, imaging modality, tumor lesion segmentation method, RQS score, ROI size, feature extraction software, model combination characteristics, AI method, and model validation methods.

To assess the influence of individual studies on the overall estimate, a sensitivity analysis was conducted by sequentially excluding one study at a time using a univariate diagnostic odds ratio (DOR) model. Any identified outliers were reanalyzed to validate the robustness of the results. Publication bias was assessed using Deeks’ funnel plot asymmetry test (26). Statistical significance was defined as P < 0.05. Clinical utility was assessed using a Fagan plot to calculate post-test probability for a given prior probability. A random-effects model was used for all pooled analyses to accommodate expected heterogeneity among studies.
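The leave-one-out procedure can be sketched as follows. This is a simplified illustration rather than the software routine used here: it pools log DOR values with a DerSimonian-Laird random-effects estimator (a swapped-in, commonly used choice), applies a 0.5 continuity correction to every cell, and runs on hypothetical 2×2 tables.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Return (log DOR, variance); 0.5 is added to every cell (assumption)."""
    a, b, c, d = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pooled_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    return sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)


def leave_one_out(tables):
    """Pooled DOR after omitting each dataset in turn."""
    stats = [log_dor(*t) for t in tables]
    results = []
    for i in range(len(stats)):
        rest = stats[:i] + stats[i + 1:]
        effects = [e for e, _ in rest]
        variances = [v for _, v in rest]
        results.append(math.exp(pooled_random_effects(effects, variances)))
    return results


# Hypothetical (TP, FP, FN, TN) tables for three validation datasets.
tables = [(40, 15, 10, 35), (30, 10, 10, 30), (45, 20, 15, 40)]
for omitted, dor in enumerate(leave_one_out(tables), start=1):
    print(f"omit dataset {omitted}: pooled DOR = {dor:.2f}")
```

If the pooled DOR stays within a narrow range regardless of which dataset is omitted, no single dataset is driving the overall estimate.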

3 Results

3.1 Screening and selection of articles

A systematic literature search, conducted according to a predefined strategy, identified 2,134 articles. After removing 207 duplicate records, 1,927 publications underwent title and abstract screening, which excluded 1,875 irrelevant studies and left 52 records. As shown in Figure 1, a further 20 records were excluded, and the full text of the remaining 32 papers was assessed for eligibility. After comprehensive review, 15 articles were excluded for inconsistency with the study objectives. Ultimately, 17 articles (6, 16, 17, 27-40) conforming to the PICOS inclusion criteria were included in this study. The screening process is illustrated in the PRISMA flow diagram (Figure 1).

Figure 1
Flowchart of a systematic review process. Starting with 2134 records identified through database searching, 1927 remain after removing duplicates. After screening, 52 records are left; 20 are excluded. Of 32 full-text articles assessed, 15 are excluded due to issues like wrong population or intervention, leaving 17 studies for systematic review and meta-analysis.

Figure 1. Flowchart demonstrating the process of selecting studies.

3.2 Study and patient characteristics

These 17 articles were published between 2022 and 2025, involving a total of 4,510 patients and 126 datasets (Tables 1, 2). All included studies were conducted in China. The vast majority employed a retrospective design, with only one ambispective cohort study (38). Ten studies (6, 16, 29, 30, 32, 33, 36, 38-40) involving 48 datasets utilized multicenter data sources, while 14 studies (6, 16, 17, 27, 28, 30-32, 34-37, 39, 40), comprising 78 datasets, employed single-center data sources.

Table 1. General characteristics information of studies included in the systematic review.

Table 2. Radiomics-related information of studies included in the systematic review.

Regarding efficacy assessment, 11 studies (6, 16, 17, 27-29, 31, 34, 36, 39), involving 81 datasets, developed predictive models for MPR following neoadjuvant chemoimmunotherapy, while 6 studies (30, 32, 33, 35, 37, 40), comprising 45 datasets, predicted pCR.

Among the 17 eligible studies, different AI algorithms were employed for modeling: 8 studies (16, 29, 30, 32, 36, 38-40), incorporating 69 datasets, utilized DL algorithms, while 11 studies (6, 17, 27, 28, 30, 31, 33-37), with 57 datasets, employed ML algorithms.

In radiomics workflows, 9 studies (6, 27, 29, 32-34, 37, 39, 40), involving 74 datasets, extracted features exclusively from contrast-enhanced CT images, while the remaining studies (16, 17, 28, 30-32, 35, 36, 38) did not restrict the use of contrast-enhanced CT.

For tumor lesion segmentation, manual segmentation was commonly employed to delineate ROIs within tumors. Han et al. (27), Ye et al. (40), and Fan et al. (35) adopted semi-automatic or fully automatic methods. Twelve studies (6, 16, 17, 27, 30, 31, 33, 34, 36-38, 40) depicted ROIs as 3D images, while 4 studies (28, 29, 32, 35) used 2D images. Gan et al. (39) used a blend of 2D and 3D images during model construction.

For radiomic feature extraction, 5 studies (16, 17, 27, 30, 37) used 3D Slicer, 9 studies (6, 28, 31-34, 36, 39, 40) used ITK-SNAP, one study (35) employed the Deepwise Multimodal Research Platform, and 2 studies (29, 38) did not report the software used.

LASSO was the most commonly used feature selection method. The primary classification methods employed include: LR (6, 16, 17, 27-31, 33, 35, 39), SVM (6, 28, 34, 35), RF (28, 31, 34, 36), and CNN (16, 29, 38). Fifteen studies (6, 16, 27-36, 38-40) standardized extracted image feature values during data processing, while 9 studies (17, 27, 28, 31-33, 36, 37, 39) used the intraclass correlation coefficient (ICC) to assess consistency among feature extractions.

Additionally, 9 studies (6, 16, 17, 27, 28, 30, 31, 37, 38), incorporating 25 datasets, combined clinical factors with radiomics features for model construction. External validation, incorporating 62 datasets, was used in 13 studies (6, 16, 27-30, 32-34, 36, 38-40), while 14 studies (6, 16, 17, 27, 28, 30-32, 34-37, 39, 40), with 64 datasets, employed internal validation.

3.3 Quality assessment

We assessed the quality of the selected studies using the QUADAS-2 tool (Figure 2). Overall, the methodological quality was acceptable. In the patient selection domain, 9 studies had an unclear risk of bias due to insufficient description of consecutive patient enrollment, while 2 studies showed a high risk of bias due to missing information on both consecutive enrollment and the applicable time period. In the index test domain, two studies had an unclear risk of bias due to a lack of information on blinding during the assessment process, while others showed a low risk. All studies clearly defined the reference standards. Regarding the flow and timing domain, five studies reported insufficient information on the interval between radiomics analysis and the application of the reference standard. All studies demonstrated a low risk of bias in clinical applicability.

Figure 2
Table and bar chart displaying risk of bias and applicability concerns for various studies from 2022 to 2025. Symbols used are green plus signs for low risk, yellow question marks for unclear risk, and red minus signs for high risk. Most studies show low risk, with a few indicating high or unclear risk, particularly in the categories of patient selection and flow and timing.

Figure 2. The summary of the quality assessment of the included study following QUADAS-2.

We used the RQS 2.0 checklist shown in Supplementary Table S3 to evaluate the quality of all radiomics studies. The scores reveal systematic methodological shortcomings. Across 9 domains, the median overall RQS was 42.31% (range 34.62%-53.57%), with a mean quality score of 42.57%. Only 1 study provided prospective data for radiomics models. Furthermore, very few studies scored well in the applicability and sustainability domain or the clinical deployment domain.

3.4 Diagnostic test accuracy analysis

The overall radiomics model demonstrated good diagnostic performance in predicting the efficacy of neoadjuvant chemoimmunotherapy in NSCLC. On internal validation, the pooled AUC was 0.81 (95% CI: 0.77-0.84), with a sensitivity of 0.79 (95% CI: 0.74-0.83, I² = 82.78%) and a specificity of 0.69 (95% CI: 0.64-0.75, I² = 82.12%). The positive likelihood ratio (PLR) was 2.60 (95% CI: 2.20-3.00), the negative likelihood ratio (NLR) was 0.31 (95% CI: 0.26-0.37), and the diagnostic odds ratio (DOR) was 8 (95% CI: 6-11). On external validation, the pooled values were as follows: AUC 0.80 (95% CI: 0.76-0.83), sensitivity 0.75 (95% CI: 0.70-0.79, I² = 83.05%), specificity 0.73 (95% CI: 0.68-0.77, I² = 77.27%), PLR 2.7 (95% CI: 2.4-3.1), NLR 0.35 (95% CI: 0.30-0.41), and DOR 8 (95% CI: 6-10). The forest plots of sensitivity and specificity are shown in Figure 3, and the SROC curves are depicted in Figure 4.

Figure 3
Two forest plots labeled A and B compare sensitivity and specificity across multiple studies. Each plot displays study IDs with corresponding sensitivity or specificity values and confidence intervals. The data points are represented as squares with horizontal lines indicating confidence intervals, centered on a vertical line marking reference values. Each plot concludes with combined sensitivity or specificity values.

Figure 3. Forest plots of SEN and SPE with corresponding 95% CIs of CT-based radiomics in predicting the efficacy of neoadjuvant chemoimmunotherapy for non-small cell lung cancer on (A) internal validation and (B) external validation.

Figure 4
Two SROC curves illustrating prediction and confidence contours for sensitivity and specificity. Panel A shows SENS = 0.79 and SPEC = 0.69 with an AUC of 0.81. Panel B displays SENS = 0.75 and SPEC = 0.73 with an AUC of 0.80. Each plot includes observed data points within 95% confidence and prediction contours.

Figure 4. SROC curve with corresponding 95% CIs of CT-based radiomics in predicting the efficacy of neoadjuvant chemoimmunotherapy for non-small cell lung cancer on (A) internal validation and (B) external validation.

3.5 Subgroup analysis

The I² statistic revealed high heterogeneity in both pooled sensitivity (internal: I² = 82.78%; external: I² = 83.05%) and pooled specificity (internal: I² = 82.12%; external: I² = 77.27%). In internal validation, the Spearman correlation coefficient between logit(sensitivity) and logit(1-specificity) was 0.54 (P < 0.001). In external validation, the coefficient was 0.47 (P = 0.001), indicating a moderate threshold effect across studies. Subgroup analyses were conducted to identify potential sources of heterogeneity and threshold effect, with groups defined as shown in Tables 3 and 4. The meta-regression analyses indicated that AI algorithm, imaging method, efficacy endpoint, ICC-based feature selection, RQS, and ROI type may contribute to the heterogeneity.
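The threshold-effect check described above correlates logit(sensitivity) with logit(1-specificity) across datasets; a strong positive correlation suggests that differences between datasets partly reflect different operating thresholds. A minimal pure-Python sketch follows. The function names and the input values are hypothetical, and the significance test is omitted.

```python
import math


def logit(p: float) -> float:
    """Log-odds; p must lie strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; inputs must not be constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def threshold_effect_rho(sensitivities, specificities):
    """Spearman rho between logit(sensitivity) and logit(1 - specificity)."""
    x = [logit(s) for s in sensitivities]
    y = [logit(1.0 - s) for s in specificities]
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-dataset estimates.
sens = [0.82, 0.75, 0.90, 0.68, 0.79]
spec = [0.66, 0.74, 0.58, 0.80, 0.71]
print(f"Spearman rho = {threshold_effect_rho(sens, spec):.2f}")
```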

Table 3. Results of meta-regression and subgroup analyses of radiological model in internal datasets.

Table 4. Results of meta-regression and subgroup analyses of radiological model in external datasets.

In terms of data sources, in internal validation, multicenter studies (n=13) demonstrated lower sensitivity (0.73 [95% CI, 0.62-0.84]) than single-center studies (n=51; 0.80 [95% CI, 0.75-0.85]; P < 0.001). In external validation, the specificity of multicenter studies (n=35) was lower than that of single-center studies (n=27; 0.74 [95% CI, 0.68-0.80]; P < 0.001).

Regarding AI algorithms, in internal validation, models using DL algorithms (n=30) exhibited higher diagnostic sensitivity (0.79 [95% CI, 0.73-0.85]) than those using ML algorithms (n=34; 0.79 [95% CI, 0.73-0.84]; P < 0.001). However, their effect on specificity was not significant (P = 0.96), although DL showed a higher point estimate for specificity. In external validation, this pattern shifted: DL models (n=39) demonstrated superior specificity (0.73 [95% CI, 0.68-0.78]) compared to ML models (n=23; 0.72 [95% CI, 0.65-0.79]; P < 0.001). Sensitivity also showed a trend toward superiority for DL (0.77 vs 0.68, P = 0.08).

By imaging method, in internal validation, models constructed using enhanced CT alone (n=34) demonstrated lower sensitivity than the remaining models (0.76 [95% CI, 0.70-0.83] vs 0.81 [95% CI, 0.75-0.87]; P < 0.001). This trend persisted in the external validation datasets (0.73 [95% CI, 0.67-0.79] vs 0.77 [95% CI, 0.70-0.84]; P < 0.001). Furthermore, the specificity of enhanced CT models (n=40) was also significantly lower (0.71 [95% CI, 0.66-0.76] vs 0.75 [95% CI, 0.69-0.81]; P < 0.001).

Regarding efficacy assessment, in internal validation, models constructed to predict MPR (n=39) following neoadjuvant chemoimmunotherapy demonstrated superior sensitivity (0.82 [95% CI, 0.77-0.86] vs 0.74 [95% CI, 0.66-0.81]; P = 0.02) but lower specificity (0.67 [95% CI, 0.60-0.74] vs 0.73 [95% CI, 0.65-0.80]; P < 0.001) than models predicting pCR (n=25). In external validation, MPR-predicting models (n=42) demonstrated slightly lower sensitivity (0.74 [95% CI, 0.68-0.80] vs 0.76 [95% CI, 0.69-0.84]; P < 0.001) but higher specificity (0.76 [95% CI, 0.72-0.81] vs 0.66 [95% CI, 0.59-0.73]; P = 0.03) than pCR-predicting models.

Studies combining radiomics with clinical factors demonstrated lower sensitivity in internal validation (0.77, 95% CI: 0.61-0.87 vs. 0.79, 95% CI: 0.74-0.84; P < 0.001) and lower specificity in external validation (0.66, 95% CI: 0.54-0.78 vs. 0.74, 95% CI: 0.70-0.78; P < 0.001).

Imaging features selected based on ICC demonstrated significantly improved sensitivity and specificity in both internal and external validation datasets. In internal validation, sensitivity was 0.85 (95% CI: 0.80-0.89) vs 0.71 (95% CI: 0.64-0.78) (P = 0.04), and specificity was 0.70 (95% CI: 0.63-0.78) vs 0.69 (95% CI: 0.61-0.76) (P = 0.03). In external validation, sensitivity was 0.81 (95% CI: 0.76-0.87) vs 0.70 (95% CI: 0.63-0.76) (P = 0.05), and specificity was 0.77 (95% CI: 0.71-0.82) vs 0.70 (95% CI: 0.64-0.76) (P = 0.01).

In internal validation, the pooled sensitivity for studies with an RQS score over 41.07% was slightly lower (0.78, 95% CI: 0.73-0.84) compared to those with a lower score (0.79, 95% CI: 0.71-0.87; P < 0.001). This trend was more pronounced in external validation, where higher RQS scores were associated with lower sensitivity (0.71, 95% CI: 0.66-0.76 vs. 0.86, 95% CI: 0.79-0.92; P < 0.001), indicating that diagnostic sensitivity decreased as the RQS score increased.

In internal validation, models using 2D ROIs demonstrated higher sensitivity (0.86 [95% CI, 0.80-0.92]) than those using 3D ROIs (0.76 [95% CI, 0.70-0.81]). The diagnostic accuracy of 2D ROIs was further supported in external validation (sensitivity: 0.87, 95% CI: 0.79-0.96; specificity: 0.80, 95% CI: 0.68-0.91).

3.6 Sensitivity analyses

As shown in Figure 5, sensitivity analysis indicates no significant changes after each systematic exclusion of a study.

Figure 5
Two side-by-side forest plots labeled (A) and (B) display meta-analysis random-effects estimates in exponential form from various studies. Each plot features horizontal lines with central points and confidence intervals. The studies are listed vertically on the left, and the x-axis shows the scale of estimates. Plot (A) ranges from 5.65 to 7.36, while plot (B) ranges from 5.82 to 9.40.

Figure 5. Sensitivity analysis performed by sequentially excluding one study at a time on (A) internal validation and (B) external validation.

3.7 Publication bias

To assess publication bias, Deeks’ funnel plot asymmetry test was conducted (Figure 6). The results showed no substantial evidence of publication bias in either internal validation (P = 0.39) or external validation (P = 0.46).

Figure 6
Two funnel plots labeled A and B display the results of Deeks' Funnel Plot Asymmetry Test with corresponding p-values of 0.39 and 0.46. Each plot shows data points representing studies plotted against the diagnostic odds ratio, with a regression line for asymmetry detection. The y-axis is labeled as 1 over the square root of effective sample sizes (ESS). A legend indicates circles represent studies and dashed lines are regression lines.

Figure 6. Funnel plot based on radiomics model in predicting the efficacy of neoadjuvant chemoimmunotherapy for non-small cell lung cancer on (A) internal validation and (B) external validation.

3.8 Clinical utility

Based on the Fagan nomogram, in internal validation, with a pre-test probability of 51% and a positive likelihood ratio (PLR) of approximately 3, a positive radiomics model result raised the probability of an NSCLC patient achieving MPR or pCR after neoadjuvant chemoimmunotherapy to 73%. Conversely, a negative result, with a negative likelihood ratio (NLR) of 0.31, lowered this probability to 24%. In external validation, with a pre-test probability of 44% and a PLR of approximately 3, a positive result raised the probability to 68%, whereas a negative result, with an NLR of 0.35, lowered it to 22% (Figure 7).
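The post-test probabilities read from a Fagan nomogram follow from Bayes' rule on the odds scale: post-test odds = pre-test odds × likelihood ratio. The sketch below applies that rule to the pre-test probabilities given above and to the unrounded likelihood ratios reported in Section 3.4 (PLR 2.60 and NLR 0.31 internally, PLR 2.7 and NLR 0.35 externally). The function name is hypothetical.

```python
def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


scenarios = {
    "internal, positive result": (0.51, 2.60),
    "internal, negative result": (0.51, 0.31),
    "external, positive result": (0.44, 2.7),
    "external, negative result": (0.44, 0.35),
}
for label, (prior, lr) in scenarios.items():
    posterior = post_test_probability(prior, lr)
    print(f"{label}: {posterior:.0%}")
```

With these inputs the four scenarios come out at 73%, 24%, 68%, and 22%, matching the values shown in Figure 7; using a PLR rounded to 3 would give slightly higher positive post-test probabilities.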

Figure 7
Graphs A and B display likelihood ratios with pre-test and post-test probabilities. In both, the left y-axis shows pre-test probability percentages and the right y-axis shows post-test probabilities. The red arrows represent positive likelihood ratios, while the blue dashed lines represent negative likelihood ratios. Panel A starts with a prior probability of fifty-one percent, increasing to seventy-three percent post-test positive, and decreasing to twenty-four percent for negative. Panel B begins at forty-four percent, rising to sixty-eight percent post-test positive, and decreasing to twenty-two percent for negative.

Figure 7. The Fagan nomogram of the radiomics model in predicting the efficacy of neoadjuvant chemoimmunotherapy for non-small cell lung cancer on (A) internal validation and (B) external validation.

4 Discussion

Neoadjuvant chemoimmunotherapy is an emerging treatment strategy for lung cancer, primarily aimed at suppressing tumor immune escape, activating the body’s immune response, and eliminating tumor cells (41). Compared to traditional neoadjuvant chemotherapy, neoadjuvant chemoimmunotherapy enhances the feasibility of curative surgery and improves long-term survival, representing a significant advancement in lung cancer treatment (15, 42–44). However, despite significant progress, a substantial proportion of patients fail to benefit from this approach (45).

Accurate prediction of treatment response is critical for stratifying and selecting those who will derive the greatest benefit from neoadjuvant chemoimmunotherapy while avoiding immune-related adverse events (irAEs). In clinical practice, CT-based RECIST assessment is the standard method for monitoring systemic therapy responses (46). However, due to the specific tumor response patterns induced by immunotherapy—such as pseudoprogression, hyperprogression, and delayed response—pathological responses in 41% to 45% of NSCLC patients undergoing neoadjuvant therapy do not correlate with imaging evaluations (47).

Radiomics technology can extract features from lung cancer CT images to establish predictive models for evaluating the efficacy of novel anticancer drugs in lung cancer, enabling early identification of drug resistance and providing guidance for optimizing and adjusting treatment regimens (48). Recent studies have confirmed the effectiveness of radiomics in predicting pathological response to neoadjuvant chemoimmunotherapy in resectable NSCLC patients. For instance, a meta-analysis demonstrated that ΔSUVmax serves as a reliable early predictor of MPR to neoadjuvant immunochemotherapy in NSCLC (49). CT holds broader applicability in NSCLC and remains an indispensable modality for evaluating the efficacy of neoadjuvant chemoimmunotherapy. However, meta-analyses examining CT-based radiomics for predicting the efficacy of neoadjuvant chemoimmunotherapy in NSCLC are currently absent.

4.1 Key findings

This study conducted a systematic review and meta-analysis of 17 studies establishing CT-based predictive models for the efficacy of neoadjuvant chemoimmunotherapy in NSCLC. To our knowledge, this represents the first comprehensive evaluation of radiomics techniques applied to this specific treatment context. The pooled results indicate that these models demonstrate good performance in predicting neoadjuvant treatment efficacy (SROCin = 0.81 [0.77-0.84], SROCex = 0.80 [0.76-0.83]). The consistency between internal validation models and external validation models reflects their robustness. Utilizing radiomics to predict the efficacy of neoadjuvant chemoimmunotherapy in NSCLC may provide valuable guidance for clinical decision-making.

We observed substantial heterogeneity in the pooled sensitivity (I²in = 82.78%, I²ex = 83.05%) and specificity (I²in = 82.12%, I²ex = 77.27%). This may be attributed to methodological differences and variations in radiomics workflows across studies, as shown in Table 3. Several factors (the integration of clinical parameters, choice of AI algorithm, efficacy evaluation metric, use of ICC, RQS score, and ROI dimensions) collectively exerted statistically significant effects on the results and contributed to the observed variability.
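For readers unfamiliar with the I² statistic, the sketch below shows one common way to obtain it: Cochran's Q is computed on logit-transformed proportions with inverse-variance weights, and I² is the share of Q that exceeds its degrees of freedom. This is a simplified univariate illustration under stated assumptions; the software used for the pooled analysis may compute I² differently, and the function name `i_squared_logit` and the 0.5 continuity correction are hypothetical choices.

```python
import math


def i_squared_logit(events_totals):
    """I² (in percent) for a set of proportions, e.g. per-study sensitivities.

    events_totals: list of (events, total) pairs, such as
    (true positives, number of responders) for sensitivity.
    """
    thetas, weights = [], []
    for events, total in events_totals:
        non_events = total - events
        # Assumed convention: continuity correction when a cell is zero.
        if events == 0 or non_events == 0:
            events += 0.5
            non_events += 0.5
        thetas.append(math.log(events / non_events))            # logit of the proportion
        weights.append(1.0 / (1.0 / events + 1.0 / non_events))  # inverse variance

    pooled = sum(w * t for w, t in zip(weights, thetas)) / sum(weights)
    q = sum(w * (t - pooled) ** 2 for w, t in zip(weights, thetas))  # Cochran's Q
    df = len(thetas) - 1
    if q <= df:
        return 0.0  # I² is truncated at zero
    return 100.0 * (q - df) / q
```

Values above roughly 75% are conventionally read as considerable heterogeneity, which is the range reported here.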

A sensitivity analysis demonstrated that excluding any single study from the pool of 17 did not significantly impact the pooled sensitivity and specificity, confirming the robustness of our findings. This result enhances the reliability of our conclusions regarding the effectiveness of radiomics in predicting neoadjuvant chemoimmunotherapy efficacy. Regarding clinical application, in internal validation, a positive radiomics model result (PLR = 3) increased the probability of an effective response from a pre-test probability of 51% to a post-test probability of 73%. Conversely, a negative model result (NLR = 0.31) reduced the probability of an effective response to 24%. In external validation, a positive result (PLR = 3) increased the probability from a pre-test probability of 44% to 68%, while a negative result (NLR = 0.35) reduced it to 22%.
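The leave-one-out procedure behind this sensitivity analysis can be sketched as follows. For brevity the example pools effect sizes with a fixed-effect inverse-variance average, which is a simplification: the analysis shown in Figure 5 used random-effects estimates. The function names are hypothetical.

```python
def pooled_fixed_effect(effects, variances):
    """Inverse-variance weighted mean of study effect sizes."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, variances):
    """Re-pool the estimate with each study omitted in turn.

    Returns a list whose i-th entry is the pooled estimate without study i.
    """
    effects = list(effects)
    variances = list(variances)
    results = []
    for i in range(len(effects)):
        remaining_effects = effects[:i] + effects[i + 1:]
        remaining_variances = variances[:i] + variances[i + 1:]
        results.append(pooled_fixed_effect(remaining_effects, remaining_variances))
    return results
```

If every leave-one-out estimate stays close to the full pooled estimate, no single study is driving the result, which is the pattern described above.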

4.2 MPR and pCR: complementary predictive values

Among numerous neoadjuvant immunotherapy trials for solid tumors, pathological response has been widely recognized as a surrogate endpoint (50). Unlike other cancers such as breast and bladder cancer, the rate of complete pathological response (pCR) following chemotherapy for NSCLC remains relatively low, ranging from 9% to 63%. Conversely, 27% to 86% of NSCLC patients achieve MPR following neoadjuvant immunotherapy (51). This suggests that setting MPR as a predictive endpoint yields a more balanced training sample and identifies nearly twice as many patients likely to benefit from neoadjuvant therapy compared to pCR. The clinical significance of MPR prediction lies in its strong association with survival benefit and its utility as a surrogate endpoint for assessing treatment efficacy in thoracic oncology (52).

A key strength of this study is the consistent definition of critical pathological endpoints (pCR and MPR) across the literature, which enhances the comparability of results and the reliability of the pooled estimates. Our meta-regression analysis revealed a significant trade-off between models based on these two endpoints, though the specific advantage was dependent on the validation setting. In internal validation, MPR-based models demonstrated superior sensitivity, while pCR-based models showed greater specificity. Interestingly, this pattern shifted in external validation, where MPR-based models exhibited stronger specificity, whereas pCR-based models tended toward higher sensitivity. This dichotomy and its context-dependence likely stem from inherent differences in their pathological definitions and the challenges of generalizing predictive models. MPR, defined as a continuum of residual viable tumor (≤10%), captures progressive biological responses that may be more continuous and thus more susceptible to detection by evolving radiomics features during treatment. Conversely, pCR represents an absolute state of no viable tumor cells. While this stringent criterion yields highly specific signals, it may fail to detect microscopic residual disease on imaging, resulting in reduced sensitivity. The observed performance reversal in external datasets may reflect greater heterogeneity in imaging protocols and case mix, which differentially impacts the detection thresholds for a graded (MPR) versus a binary (pCR) outcome.

Notably, the superior specificity of the pCR-based model in internal validation highlights its unique value as a definitive “all-or-nothing” endpoint. Achieving pCR serves as a potent prognostic indicator, frequently associated with the deepest molecular response and optimal long-term survival rates. Consequently, its precise identification (i.e., high specificity) is crucial for dose-reduction trials or selecting candidates for curative intent. For thoracic surgeons, preoperative MPR prediction aids risk stratification, informing decisions on optimal surgical timing and adjuvant therapy planning. This non-invasive approach reduces reliance on invasive biopsies, aligning with the growing demand for personalized treatment optimization in locally advanced NSCLC. By accurately reflecting tumor response to therapy, MPR is regarded as a key prognostic marker for resectable NSCLC. Studies demonstrate a significant correlation between MPR and long-term overall survival (OS) in NSCLC patients receiving neoadjuvant chemotherapy and immunotherapy (53). This underscores MPR’s utility as a survival surrogate endpoint and tool for evaluating neoadjuvant efficacy in clinical trials.

In clinical decision-making, prioritizing MPR or pCR as a predictive endpoint should align with specific treatment objectives. If the goal is to maximize identification of all potential responders to avoid undertreatment, the more sensitive MPR model should be preferred. Conversely, when the goal is to select patients most likely to achieve deep, curative responses for intensive monitoring or dose-reduction strategies, the higher-specificity pCR model is more appropriate. Ultimately, integrating both endpoints through sequential or combined modeling, assessing tumor response comprehensively (MPR) while confirming superior efficacy (pCR), would best balance the needs of precision treatment.

4.3 Advantages of deep learning algorithms

Our subgroup analysis demonstrated that prediction models constructed using DL algorithms outperformed those based on ML algorithms. This finding is consistent with a previous meta-analysis of CT-based radiomics models predicting air-space spread in lung cancer, where the DL subgroup demonstrated higher pooled sensitivity (0.87 vs. 0.81, P < 0.001) and comparable pooled specificity (0.77 vs. 0.75, P < 0.001) (54).

As a complex subset of ML, DL has the potential to uncover subtle imaging features not readily apparent to the human eye, providing complementary information for predicting neoadjuvant immunotherapy efficacy (55). Its application in tumor radiomics analysis plays an increasingly vital role in diagnosis, treatment decision-making, and prognosis (56, 57). For instance, Lin et al. (58) conducted a retrospective analysis of clinical and imaging data from 62 NSCLC patients undergoing neoadjuvant immunotherapy. They extracted radiomic and DL features from lung cancer lesions to construct an integrated model combining clinical characteristics, radiomic features, and DL functions for accurate response prediction. She et al. (16) developed a DL model to predict MPR, achieving an AUC of 0.72 in an external validation cohort.

The primary limitation of DL models is their lack of interpretability, posing a persistent challenge for deploying this “black-box” technology in clinical practice. However, recent research has begun to address this issue by visualizing the image regions the network deems important, thereby elucidating the predictive process of these models (16, 29). Peng et al. recruited 309 LUSC patients from multiple medical institutions and developed a ResNet-50 model. Using the Grad-CAM method to visualize CT images, they identified critical regions for predicting MPR. The DL model demonstrated excellent predictive accuracy, achieving an AUC of 0.95 (95% CI: 0.98-1.00) and a sensitivity of 0.90 (95% CI: 0.81-0.98) (29).

In summary, AI-based radiomics and DL models demonstrate significant potential in predicting the efficacy of neoadjuvant chemoimmunotherapy treatment for lung cancer. These models can assist clinicians in early identification of patient response and resistance, thereby providing valuable insights for optimizing and transforming immunotherapy treatment plans.

4.4 Optimization of radiology workflow processes

To mitigate the risk of overestimating model performance, this meta-analysis exclusively used validation data from radiomics studies, synthesizing internal and external validation results separately. Currently, most developed radiomics models assess predictive performance through internal validation. Literature indicates that external validation is recommended for datasets exceeding 50 samples, while revalidation methods are preferred for smaller datasets (59). In this study, three investigations employed cohorts recruited at different time points from the same research center as external validation sets. The AUC values obtained from external validation were generally lower than those from internal validation or training cohorts, a pattern consistent with findings in previous studies.

In radiomics model development, ICC is used to assess the repeatability and robustness of imaging features extracted from tumor lesions across different individuals, segmentation methods, and time points. This assessment encompasses inter-observer and intra-observer variability. Previous reviews indicate that most imaging features exhibit high robustness to inter-observer or intra-observer variability. Consistent with these findings, our study suggests that using features without ICC filtering may lead to reduced diagnostic sensitivity and specificity. Nevertheless, numerous sources of heterogeneity exist for ICC values, including imaging scanner parameters, imaging resolution, tumor segmentation methods, and feature extraction software. Furthermore, threshold settings remain unstandardized, precluding any quantitative evaluation of ICC to date (60).

Within the internal dataset, the 2D model demonstrated high sensitivity (0.86) but low specificity (0.55). This suggests the model may enhance detection by overfitting to local features of specific two-dimensional slices in the training data, at the cost of misclassifying non-target patterns. However, in the external dataset, a small number of 2D models (n = 6) reported both high sensitivity (0.87) and specificity (0.80). This discrepancy may stem from the 2D analysis’s heavy reliance on the operator’s selection of the “most representative” slice. During internal validation, this selective process may have optimized for sensitivity at the cost of specificity. In certain ideal external scenarios, however, a similarly well-chosen representative slice may prove equally effective, resulting in better-balanced performance.
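The ICC-based feature filtering described in this subsection can be sketched as below. The example computes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), from a table of feature values (one row per lesion, one column per observer or segmentation), and keeps features whose ICC meets a cut-off. Both the choice of ICC form and the 0.75 threshold are assumptions made for illustration; as noted above, thresholds are not standardized, and the included studies may have used other variants. The function names are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a two-way ANOVA decomposition.

    ratings: list of rows, one per lesion; each row holds the feature value
    obtained by each observer (or repeated segmentation).
    """
    n = len(ratings)        # number of lesions
    k = len(ratings[0])     # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def filter_stable_features(feature_ratings, threshold=0.75):
    """Keep feature names whose ICC(2,1) is at least `threshold` (assumed cut-off)."""
    return [name for name, ratings in feature_ratings.items()
            if icc_2_1(ratings) >= threshold]
```

Features that fail the cut-off would be dropped before model building, so that only measurements that are reproducible across observers enter feature selection.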

Among the 17 studies included in this analysis, the average RQS score was 42.57%, indicating significant room for improvement. The original RQS framework established benchmarks for evaluating the quality and reporting of radiomics research. However, advancements in AI have introduced new complexities to the field, raising the need to address challenges such as fairness, generalizability, accessibility, and interpretability. RQS 2.0 accounts for this evolution by distinguishing between supervised learning and DL approaches. Through its integration with the Radiomics Readiness Levels (RRL), it provides a more comprehensive framework that not only assesses the scientific quality of radiomics studies but also establishes checkpoints to facilitate their effective translation into clinical practice (23). In a subgroup analysis stratified by the median RQS score (41.07%), studies with higher scores demonstrated a lower pooled diagnostic sensitivity (internal validation: 0.78 [95% CI: 0.73-0.84]; external validation: 0.71 [95% CI: 0.66-0.76]) but a trend toward improved specificity (internal validation: 0.71 [95% CI: 0.65-0.78]; external validation: 0.75 [95% CI: 0.71-0.79]). This pattern is potentially attributable to inter-rater variability during RQS scoring and challenges in RQS scoring reproducibility (61). Furthermore, few studies scored in the areas of Prospective Validity, Applicability and Sustainability, and Clinical Deployment. This highlights directions for improvement in future research.

4.5 Limitations

This study has several limitations that warrant careful consideration when interpreting the findings. First, all included studies were conducted in China, which raises concerns regarding the generalizability of the results—a concept known as spectrum bias. The external validity of our findings to non-Chinese populations remains unproven. Spectrum effects may arise from differences in the mix of disease stages, the proportion of squamous cell carcinoma versus adenocarcinoma (which have distinct biological behaviors and may respond differently to chemoimmunotherapy), as well as variations in regional treatment protocols and healthcare systems. Therefore, the predictive performance of the summarized radiomics models requires rigorous validation in multinational, prospective cohorts before they can be applied to other ethnic or geographic populations.

Second, significant technical heterogeneity existed across the studies, representing a fundamental challenge in radiomics research. Variations in CT scanner manufacturers, models, acquisition parameters (e.g., slice thickness, tube voltage), and reconstruction kernels directly affect the extraction and reproducibility of radiomics features. This scanner and protocol heterogeneity likely introduces substantial noise into the pooled analysis and may degrade model performance and robustness when applied to external datasets using different equipment. Future multicenter studies should employ standardized imaging protocols or advanced harmonization techniques (e.g., ComBat) to mitigate this issue.

Third, clinical and methodological heterogeneity was evident among the included studies. The use of ambispective or retrospective designs introduces risks of selection and verification bias. While our meta-regression attempted to explore some of these factors, these analyses were exploratory and limited by the number of available studies. Future research should prioritize prospective designs with prespecified, standardized protocols for patient stratification, treatment, and outcome assessment to develop more reliable and subtype-specific predictive models.

5 Conclusion

Radiomics technology demonstrates strong diagnostic capability in predicting the efficacy of neoadjuvant chemoimmunotherapy in NSCLC. However, current limitations associated with this technology may restrict its direct clinical application. Further comprehensive studies are needed to validate the findings of this research and promote the clinical application of AI and imaging in this field in the future.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Author contributions

HC: Project administration, Formal Analysis, Methodology, Data curation, Writing – review & editing, Writing – original draft, Conceptualization, Software, Investigation. BF: Writing – original draft. MY: Conceptualization, Writing – original draft. DW: Writing – review & editing, Conceptualization. CQ: Investigation, Writing – original draft. NQ: Writing – original draft, Formal Analysis. XQ: Data curation, Writing – original draft. WH: Writing – review & editing, Funding acquisition.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the National Key Research and Development Program of China (No. 2023YFC3503301) and the High Level Chinese Medical Hospital Promotion Project (No. HLCMHPP2023085).

Acknowledgments

We are grateful to all the authors whose work was included in this study.

Conflict of interest

The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2026.1753166/full#supplementary-material

References

1. Li Y, Yan B, and He S. Advances and challenges in the treatment of lung cancer. BioMed Pharmacother. (2023) 169:115891. doi: 10.1016/j.biopha.2023.115891

2. NSCLC Meta-analysis Collaborative Group. Preoperative chemotherapy for non-small-cell lung cancer: a systematic review and meta-analysis of individual participant data. Lancet (London England). (2014) 383:1561–71. doi: 10.1016/S0140-6736(13)62159-5

3. Wang M, Herbst RS, and Boshoff C. Toward personalized treatment approaches for non-small-cell lung cancer. Nat Med. (2021) 27:1345–56. doi: 10.1038/s41591-021-01450-2

4. Sorin M, Prosty C, Ghaleb L, Nie K, Katergi K, Shahzad MH, et al. Neoadjuvant chemoimmunotherapy for NSCLC: A systematic review and meta-analysis. JAMA Oncol. (2024) 10:621–33. doi: 10.1001/jamaoncol.2024.0057

5. Rossi G, Barcellini L, Tagliamento M, Tanda ET, Garassino MC, Blondeaux E, et al. Immunotherapy for resectable NSCLC: neoadjuvant/perioperative followed by surgery over surgery followed by adjuvant. Systematic review and meta-analysis with subgroup analyses. ESMO Open. (2025) 10:105759. doi: 10.1016/j.esmoop.2025.105759

6. Han D, Zhao J, Hao S, Fu S, Wei R, Zheng X, et al. Integrative radiomics analysis of peri-tumoral and habitat zones for predicting major pathological response to neoadjuvant immunotherapy and chemotherapy in non-small cell lung cancer. Transl Lung Cancer Res. (2025) 14:1168–84. doi: 10.21037/tlcr-2024-1131

7. Lahiri A, Maji A, Potdar PD, Singh N, Parikh P, Bisht B, et al. Lung cancer immunotherapy: progress, pitfalls, and promises. Mol Cancer. (2023) 22:40. doi: 10.1186/s12943-023-01740-y

8. Liu L, Bai H, Wang C, Seery S, Wang Z, Duan J, et al. Efficacy and safety of first-line immunotherapy combinations for advanced NSCLC: A systematic review and network meta-analysis. J Thorac Oncol. (2021) 16:1099–117. doi: 10.1016/j.jtho.2021.03.016

9. Passaro A, Brahmer J, Antonia S, Mok T, and Peters S. Managing resistance to immune checkpoint inhibitors in lung cancer: treatment and novel strategies. J Clin Oncol. (2022) 40:598–610. doi: 10.1200/JCO.21.01845

10. Ricciuti B, Lamberti G, Puchala SR, Mahadevan NR, Lin J-R, Alessi JV, et al. Genomic and immunophenotypic landscape of acquired resistance to PD-(L)1 blockade in non-small-cell lung cancer. J Clin Oncol. (2024) 42:1311–21. doi: 10.1200/JCO.23.00580

11. Raez LE, Brice K, Dumais K, Lopez-Cohen A, Wietecha D, Izquierdo PA, et al. Liquid biopsy versus tissue biopsy to determine front line therapy in metastatic non-small cell lung cancer (NSCLC). Clin Lung Cancer. (2022) 24:120–9. doi: 10.1016/j.cllc.2022.11.007

12. Liam C-K, Mallawathantri S, and Fong KM. Is tissue still the issue in detecting molecular alterations in lung cancer? Respirology. (2020) 25:933–43. doi: 10.1111/resp.13823

13. John N, Schlintl V, Sassmann T, Lindenmann J, Fediuk M, Wurm R, et al. Longitudinal analysis of PD-L1 expression in patients with relapsed NSCLC. J Immunother Cancer. (2024) 12:e008592. doi: 10.1136/jitc-2023-008592

14. Provencio M, Serna-Blasco R, Nadal E, Insa A, García-Campelo MR, Casal Rubio J, et al. Overall Survival and Biomarker Analysis of Neoadjuvant Nivolumab Plus Chemotherapy in Operable Stage IIIA Non-Small-Cell Lung Cancer (NADIM phase II trial). J Clin Oncol. (2022) 40:2924–33. doi: 10.1200/JCO.21.02660

15. Provencio M, Nadal E, Insa A, García-Campelo MR, Casal-Rubio J, Dómine M, et al. Neoadjuvant chemotherapy and nivolumab in resectable non-small-cell lung cancer (NADIM): an open-label, multicentre, single-arm, phase 2 trial. Lancet Oncol. (2020) 21:1413–22. doi: 10.1016/S1470-2045(20)30453-8

16. She Y, He B, Wang F, Zhong Y, Wang T, Liu Z, et al. Deep learning for predicting major pathological response to neoadjuvant chemoimmunotherapy in non-small cell lung cancer: A multicentre study. EBioMedicine. (2022) 86:104364. doi: 10.1016/j.ebiom.2022.104364

17. Liu C, Zhao W, Xie J, Lin H, Hu X, Li C, et al. Development and validation of a radiomics-based nomogram for predicting a major pathological response to neoadjuvant immunochemotherapy for patients with potentially resectable non-small cell lung cancer. Front Immunol. (2023) 14:1115291. doi: 10.3389/fimmu.2023.1115291

18. Page MJ, Moher D, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. PRISMA 2020 explanation and elaboration: updated guidance and exemplars for reporting systematic reviews. BMJ. (2021) 372:n160. doi: 10.1136/bmj.n160

19. Z, C, C H. Diagnostic accuracy of CT and PET/CT radiomics in predicting lymph node metastasis in non-small cell lung cancer. J Evidence-Based Med. (2016) 16:159–64. doi: 10.1007/s00330-024-11036-4

20. Li Y, Deng J, Ma X, Li W, and Wang Z. Diagnostic accuracy of CT and PET/CT radiomics in predicting lymph node metastasis in non-small cell lung cancer. Eur Radiol. (2024) 35:1966–79. doi: 10.1007/s00330-024-11036-4

21. Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. (2011) 155:529–36. doi: 10.7326/0003-4819-155-8-201110180-00009

22. Lambin P, Leijenaar RTH, Deist TM, Peerlings J, de Jong EEC, van Timmeren J, et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. (2017) 14:749–62. doi: 10.1038/nrclinonc.2017.141

23. Lambin P, Woodruff HC, Mali SA, Zhong X, Kuang S, Lavrova E, et al. Radiomics Quality Score 2.0: towards radiomics readiness levels and clinical translation for personalized medicine. Nat Rev Clin Oncol. (2025) 22:831–46. doi: 10.1038/s41571-025-01067-1

24. Zheng Y, Du Y, Zhang B, Zhang H, Shang P, and Hou Z. Application of radiomics-based prediction model to predict preoperative lymph node metastasis in prostate cancer: a systematic review and meta-analysis. Front Oncol. (2025) 15:1577794. doi: 10.3389/fonc.2025.1577794

25. Rutter CM and Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Stat Med. (2001) 20:2865–84. doi: 10.1002/sim.942

26. Deeks JJ, Macaskill P, and Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol. (2005) 58:882–93. doi: 10.1016/j.jclinepi.2005.01.016

27. Han X, Wang M, Zheng Y, Wang N, Wu Y, Ding C, et al. Delta-radiomics features for predicting the major pathological response to neoadjuvant chemoimmunotherapy in non-small cell lung cancer. Eur Radiol. (2024) 34:2716–26. doi: 10.1007/s00330-023-10241-x

28. Huang D, Lin C, Jiang Y, Xin E, Xu F, Gan Y, et al. Radiomics model based on intratumoral and peritumoral features for predicting major pathological response in non-small cell lung cancer receiving neoadjuvant immunochemotherapy. Front Oncol. (2024) 14:1348678. doi: 10.3389/fonc.2024.1348678

29. Peng J, Xie B, Ma H, Wang R, Hu X, and Huang Z. Deep learning based on computed tomography predicts response to chemoimmunotherapy in lung squamous cell carcinoma. Aging Dis. (2024) 16:1674–90. doi: 10.14336/AD.2024.0169

30. Qu W, Chen C, Cai C, Gong M, Luo Q, Song Y, et al. Non-invasive prediction for pathologic complete response to neoadjuvant chemoimmunotherapy in lung cancer using CT-based deep learning: a multicenter study. Front Immunol. (2024) 15:1327779. doi: 10.3389/fimmu.2024.1327779

31. Wang F, Yang H, Chen W, Ruan L, Jiang T, Cheng L, et al. A combined model using pre-treatment CT radiomics and clinicopathological features of non-small cell lung cancer to predict major pathological responses after neoadjuvant chemoimmunotherapy. Curr Probl Cancer. (2024) 50:101098. doi: 10.1016/j.currproblcancer.2024.101098

32. Ye G, Wu G, Qi Y, Li K, Wang M, Zhang C, et al. Non-invasive multimodal CT deep learning biomarker to predict pathological complete response of non-small cell lung cancer following neoadjuvant immunochemotherapy: a multicenter study. J Immunother Cancer. (2024) 12:e009348. doi: 10.1136/jitc-2024-009348

33. Ye G, Wu G, Zhang C, Wang M, Liu H, Song E, et al. CT-based quantification of intratumoral heterogeneity for predicting pathologic complete response to neoadjuvant immunochemotherapy in non-small cell lung cancer. Front Immunol. (2024) 15:1414954. doi: 10.3389/fimmu.2024.1414954

34. Bao X, Peng Q, Bian D, Ni J, Zhou S, Zhang P, et al. Short-term intra- and peri-tumoral spatiotemporal CT radiomics for predicting major pathological response to neoadjuvant chemoimmunotherapy in non-small cell lung cancer. Eur Radiol. (2025) 35:6052–64. doi: 10.1007/s00330-025-11563-8

35. Fan S, Xie J, Zheng S, Wang J, Zhang B, Zhang Z, et al. Non-invasive CT based multiregional radiomics for predicting pathologic complete response to preoperative neoadjuvant chemoimmunotherapy in non-small cell lung cancer. Eur J Radiol. (2025) 189:112171. doi: 10.1016/j.ejrad.2025.112171

36. Geng Z, Li K, Mei P, Gong Z, Yan R, Huang Y, et al. Multichannel deep learning prediction of major pathological response after neoadjuvant immunochemotherapy in lung cancer: a multicenter diagnostic study. Int J Surg. (2025) 111:6614–26. doi: 10.1097/JS9.0000000000002821

37. Xiong D, Li J, Li L, Xu F, Hu T, Zhu H, et al. Delta-radiomics features combined with haematological index predict pathological complete response after neoadjuvant immunochemotherapy in resectable non-small cell lung cancer. Clin Radiol. (2025) 86:106906. doi: 10.1016/j.crad.2025.106906

38. Zheng J, Yan Z, Wang R, Xiao H, Chen Z, Ge X, et al. NeoPred: dual-phase CT AI forecasts pathologic response to neoadjuvant chemo-immunotherapy in NSCLC. J Immunother Cancer. (2025) 13:e011773. doi: 10.1136/jitc-2025-011773

39. Gan X, He J, Zhang W, Chen W, Liu S, Li W, et al. Attention-guided framework for integrative omics and temporal dynamics in predicting major pathological response in neoadjuvant immunochemotherapy for NSCLC. J Immunother Cancer. (2025) 13:e012526. doi: 10.1136/jitc-2025-012526

40. Ye G, Wei Z, Han C, Wu G, Wong C, Liang Y, et al. AI-derived longitudinal and multi-dimensional CT classifier for non-small cell lung cancer to optimize neoadjuvant chemoimmunotherapy decision: a multicentre retrospective study. EClinicalMedicine. (2025) 89:103551. doi: 10.1016/j.eclinm.2025.103551

41. Kang J, Zhang C, and Zhong W-Z. Neoadjuvant immunotherapy for non-small cell lung cancer: State of the art. Cancer Commun (Lond). (2021) 41:287–302. doi: 10.1002/cac2.12153

42. Forde PM, Chaft JE, Smith KN, Anagnostou V, Cottrell TR, Hellmann MD, et al. Neoadjuvant PD-1 blockade in resectable lung cancer. N Engl J Med. (2018) 378:1976–86. doi: 10.1056/NEJMoa1716078

43. Shu CA, Gainor JF, Awad MM, Chiuzan C, Grigg CM, Pabani A, et al. Neoadjuvant atezolizumab and chemotherapy in patients with resectable non-small-cell lung cancer: an open-label, multicentre, single-arm, phase 2 trial. Lancet Oncol. (2020) 21:786–95. doi: 10.1016/S1470-2045(20)30140-6

44. Gao S, Li N, Gao S, Xue Q, Ying J, Wang S, et al. Neoadjuvant PD-1 inhibitor (Sintilimab) in NSCLC. J Thorac Oncol. (2020) 15:816–26. doi: 10.1016/j.jtho.2020.01.017

45. Awad MM, Forde PM, Girard N, Spicer J, Wang C, Lu S, et al. Neoadjuvant nivolumab plus ipilimumab versus chemotherapy in resectable lung cancer. J Clin Oncol. (2025) 43:1453–62. doi: 10.1200/JCO-24-02239

46. Seymour L, Bogaerts J, Perrone A, Ford R, Schwartz LH, Mandrekar S, et al. iRECIST: guidelines for response criteria for use in trials testing immunotherapeutics. Lancet Oncol. (2017) 18:e143–e52. doi: 10.1016/S1470-2045(17)30074-8

47. Liang W, Cai K, Chen C, Chen H, Chen Q, Fu J, et al. Expert consensus on neoadjuvant immunotherapy for non-small cell lung cancer. Transl Lung Cancer Res. (2020) 9:2696–715. doi: 10.21037/tlcr-2020-63

48. Dercle L, McGale J, Sun S, Marabelle A, Yeh R, Deutsch E, et al. Artificial intelligence and radiomics: fundamentals, applications, and challenges in immunotherapy. J Immunother Cancer. (2022) 10:e005292. doi: 10.1136/jitc-2022-005292

49. Deng H, Deng Z, Huang Y, Jiang Y, Lin Y, Wang R, et al. Diagnostic performance of 18F-FDG PET/CT metabolic parameters for early prediction of pathological response in NSCLC treated with neoadjuvant immuno(chemo)therapy: A systematic review and meta-analysis. Eur J Nucl Med Mol Imaging. (2025) 53:715–27. doi: 10.1007/s00259-025-07497-4

50. Topalian SL, Taube JM, and Pardoll DM. Neoadjuvant checkpoint blockade for cancer immunotherapy. Science. (2020) 367:eaax0182. doi: 10.1126/science.aax0182

51. Ulas EB, Dickhoff C, Schneiders FL, Senan S, and Bahce I. Neoadjuvant immune checkpoint inhibitors in resectable non-small-cell lung cancer: a systematic review. ESMO Open. (2021) 6:100244. doi: 10.1016/j.esmoop.2021.100244

52. Deutsch JS, Cimino-Mathews A, Thompson E, Provencio M, Forde PM, Spicer J, et al. Association between pathologic response and survival after neoadjuvant therapy in lung cancer. Nat Med. (2023) 30:218–28. doi: 10.1038/s41591-023-02660-6

53. Weissferdt A, Pataer A, Vaporciyan AA, Correa AM, Sepesi B, Moran CA, et al. Agreement on major pathological response in NSCLC patients receiving neoadjuvant chemotherapy. Clin Lung Cancer. (2020) 21:341–8. doi: 10.1016/j.cllc.2019.11.003

54. Chen L, Lan X, Huang Y, Tao J, Huang X, Su Y, et al. CT-based radiomics models for predicting spread through air space in lung cancer: A systematic review and meta-analysis. Eur J Radiol. (2025) 190:112249. doi: 10.1016/j.ejrad.2025.112249

55. Parekh VS and Jacobs MA. Deep learning and radiomics in precision medicine. Expert Rev Precis Med Drug Dev. (2019) 4:59–72. doi: 10.1080/23808993.2019.1585805

56. Dong D, Fang MJ, Tang L, Shan XH, Gao JB, Giganti F, et al. Deep learning radiomic nomogram can predict the number of lymph node metastasis in locally advanced gastric cancer: an international multicenter study. Ann Oncol. (2020) 31:912–20. doi: 10.1016/j.annonc.2020.04.003

57. Zhang L, Dong D, Zhang W, Hao X, Fang M, Wang S, et al. A deep learning risk prediction model for overall survival in patients with gastric cancer: A multicenter study. Radiother Oncol. (2020) 150:73–80. doi: 10.1016/j.radonc.2020.06.010

58. Lin Q, Wu HJ, Song QS, and Tang YK. CT-based radiomics in predicting pathological response in non-small cell lung cancer patients receiving neoadjuvant immunotherapy. Front Oncol. (2022) 12:937277. doi: 10.3389/fonc.2022.937277

59. Westad F and Marini F. Validation of chemometric models - a tutorial. Anal Chim Acta. (2015) 893:14–24. doi: 10.1016/j.aca.2015.06.056

60. Zhuang M, Li X, Qiu Z, and Guan J. Does consensus contour improve robustness and accuracy in 18F-FDG PET radiomic features? EJNMMI Phys. (2024) 11:48. doi: 10.1186/s40658-024-00652-0

61. Akinci D’Antonoli T, Cavallo AU, Vernuccio F, Stanzione A, Klontzas ME, Cannella R, et al. Reproducibility of radiomics quality score: an intra- and inter-rater reliability study. Eur Radiol. (2023) 34:2791–804. doi: 10.1007/s00330-023-10217-x

Keywords: computed tomography, meta-analysis, neoadjuvant chemoimmunotherapy, non-small cell lung cancer, prediction model

Citation: Chen H, Fan B, Yuan M, Wang D, Qiao C, Qiu N, Quan X and Hou W (2026) CT-based radiomics in predicting the efficacy of preoperative neoadjuvant chemoimmunotherapy for non-small cell lung cancer: a systematic review and meta-analysis. Front. Immunol. 17:1753166. doi: 10.3389/fimmu.2026.1753166

Received: 24 November 2025; Revised: 18 January 2026; Accepted: 23 January 2026;
Published: 10 February 2026.

Edited by:

Zhenwei Shi, Guangdong Academy of Medical Sciences, China

Reviewed by:

Harish RaviPrakash, AstraZeneca, United States
Guanchao Ye, First Affiliated Hospital of Zhengzhou University, China

Copyright © 2026 Chen, Fan, Yuan, Wang, Qiao, Qiu, Quan and Hou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Wei Hou, houwei1964@163.com

These authors have contributed equally to this work and share first authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.