- ¹Respiratory and Critical Care Medicine Department, Lanzhou University Second Hospital, Lanzhou, Gansu, China
- ²Second Clinical Medical College of Lanzhou University, Lanzhou, Gansu, China
- ³Radiology Department, Lanzhou University Second Hospital, Lanzhou, Gansu, China
Background and purpose: To explore the predictive value of a model based on clinical and contrast-enhanced computed tomography (CT) radiomic features for the early prediction of immunotherapy efficacy in patients with advanced non-small cell lung cancer (NSCLC).
Methods: This retrospective study included 144 patients with advanced NSCLC who received immunotherapy at Lanzhou University Second Hospital between January 2023 and December 2024. Clinical data and CT images were collected from each patient. All patients underwent imaging examinations to evaluate the efficacy of immunotherapy after the second treatment cycle. Patients who achieved complete response (CR) or partial response (PR) were considered to be in the reactive group, while those who experienced stable disease (SD) or progressive disease (PD) were considered to be in the non-reactive group. The participants were randomly divided into a training set (n = 115) and a testing set (n = 29) at a ratio of 8:2. Radiomic features were extracted from pre-treatment contrast-enhanced CT venous phase images. Feature reduction was performed using the Spearman rank correlation coefficient and the least absolute shrinkage and selection operator (LASSO) algorithm. The best radiomics signature was built using multiple machine learning algorithms and combined with clinical features to build a nomogram model. The area under the receiver operating characteristic curve (AUC), calibration curve, and decision curve analysis (DCA) were used to evaluate the model’s predictive performance, calibration, and clinical net benefit.
Results: Three clinical features (C-reactive protein, baseline tumor size, and programmed death-ligand 1 expression) and seven radiomics features (one first-order feature and six texture features) were selected for the model. The radiomic signature performed best with the extremely randomized trees (Extra Trees) algorithm. The radiomic signature and the nomogram model demonstrated superior predictive performance and clinical net benefit compared to the clinical model in both the training and testing sets (AUCs, training vs. testing: radiomics, 0.926 vs. 0.848; nomogram, 0.953 vs. 0.788; clinical, 0.882 vs. 0.742), with statistically significant differences (P < 0.05).
Conclusion: The integrated clinical-radiomics nomogram establishes a robust framework for early prediction of immunotherapy efficacy in advanced NSCLC, offering valuable support for personalized treatment decisions.
1 Introduction
Non-small cell lung cancer (NSCLC) is the leading cause of cancer-related death worldwide. The five-year survival rate for patients with advanced NSCLC is less than 5% owing to the loss of surgical opportunities, the limited efficacy of radiotherapy and chemotherapy, and significant toxicity (1, 2). In recent years, immunotherapy has emerged as a first-line treatment for patients with advanced NSCLC without driver mutations, improving the five-year survival rate to 23.5% (3, 4). However, the predictive value of the biomarkers currently used in clinical practice, such as programmed cell death ligand 1 (PD-L1) and tumor mutational burden (TMB), is limited by sampling constraints and tumor spatial heterogeneity. Furthermore, PD-L1 expression and the efficacy of immunotherapy are influenced by treatment regimens, tumor heterogeneity, and the tumor microenvironment, so only a few patients derive long-term benefit (5). Accurately identifying the patients who are sensitive to immunotherapy is therefore necessary to guide treatment decisions in advanced NSCLC and, ultimately, to extend survival.
Radiomics uses computer software to extract high-throughput quantitative features from conventional imaging data. It avoids the limitations of invasive tissue biopsy, such as incomplete sampling of tumor spatial heterogeneity and poor reproducibility. It can comprehensively reflect tumor biology and provide safer and more reliable guidance for patient follow-up and prognosis monitoring. In 2018, Sun et al. (6, 7) first demonstrated that radiomics could be used to predict the efficacy of immunotherapy. Since then, radiomics research has focused on predicting prognosis, evaluating efficacy, and monitoring immunotherapy-related adverse reactions in different tumors. This study aimed to explore the value of clinical characteristics and pre-treatment contrast-enhanced CT radiomics features for the early prediction of immunotherapy efficacy in advanced NSCLC. The goal was to establish an efficacy prediction model that provides an early, non-invasive, and accurate tool for individualized immunotherapy decisions.
2 Materials and methods
2.1 Patients
This retrospective study included 144 patients with advanced non-small cell lung cancer (NSCLC) who received immune checkpoint inhibitor (ICI) treatment at Lanzhou University Second Hospital between January 2023 and December 2024. The inclusion criteria were as follows: (1) TNM stage IIIb to IV; (2) pathologically confirmed NSCLC; (3) at least two cycles of a first-line PD-1 inhibitor combined with chemotherapy; and (4) ECOG PS score of 0–2. The exclusion criteria were as follows: (1) radiotherapy or surgery before immunotherapy; (2) maximum tumor diameter <5 mm or an unclear tumor border (due to lung infection, lung collapse, etc.) that affected image segmentation; (3) other malignant tumors; (4) an interval of more than 4 weeks between pre-treatment enhanced CT and immunotherapy; and (5) incomplete data or loss to follow-up. The enrollment process is illustrated in Figure 1. The study was approved by the Ethics Committee (2024A-306), and informed consent was waived.
Figure 1. Patient flow diagram. The study dataset was randomly divided into training and testing sets at a ratio of 8:2. NSCLC, non-small cell lung cancer; CT, computed tomography.
2.2 Chest enhanced CT
Imaging protocol: The scanners used in this study were the Siemens Somatom Force CT, Philips Brilliance iCT, GE Discovery HD750, and GE Revolution CT. The scanning parameters are listed in Table 1. Patients were scanned in the supine position with both arms raised above the head and were asked to hold their breath after deep inhalation. The scanning range extended from the cranium to the inferior border of the ilium. Subsequently, 1.5 mL of a contrast agent containing 320 mg of iodine per milliliter (IsoTec, Bayer AG) was administered via a high-pressure injector. Arterial phase images were acquired 30 s after injection of the contrast agent, and venous phase images were acquired at 60 s.
2.3 Data collection and efficacy evaluation of immunotherapy
All patients in the clinical dataset received platinum-based chemotherapy combined with an immune checkpoint inhibitor (ICI); the immunotherapy regimen consisted of a PD-1 inhibitor. Baseline characteristics were collected from the hospital’s electronic medical records system and included age, race, sex, height, weight, body mass index (BMI), smoking history, family history of cancer, TNM stage, pathological type, Eastern Cooperative Oncology Group performance status (ECOG PS) score, and baseline tumor size (BTS). The following serum markers were also recorded: C-reactive protein (CRP), albumin, lactate dehydrogenase (LDH), carcinoembryonic antigen (CEA), neuron-specific enolase (NSE), cytokeratin 19 fragment antigen 21-1 (CYFRA21-1), progastrin-releasing peptide (ProGRP), and squamous cell carcinoma antigen (SCC). PD-L1 expression was detected using immunohistochemistry (IHC) on lung tissue samples obtained via bronchoscopy or CT-guided needle biopsy. At least 100 tumor cells (TCs) had to be evaluated, and the tumor proportion score (TPS) was used for quantitative analysis (8). TPS was categorized as negative (TPS < 1%), low (TPS 1%-49%), or high (TPS ≥ 50%).
CT images were retrieved from the picture archiving and communication system (PACS) in Digital Imaging and Communications in Medicine (DICOM) format. The images were obtained during the chemotherapy and immunotherapy treatment periods, specifically during the first four weeks of intravenous infusion. The images were analyzed using image analysis software.
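The TPS categories above can be written as a small lookup. The following is a minimal sketch only; the function name `categorize_tps` is hypothetical, and assigning values between 49% and 50% to the low category is our assumption, since the text gives only the integer ranges.

```python
def categorize_tps(tps_percent: float) -> str:
    """Map a PD-L1 tumor proportion score (percent) to the categories in the text.

    negative: TPS < 1%, low: TPS 1%-49%, high: TPS >= 50%.
    Assumption: non-integer values below 50% are treated as low.
    """
    if not 0 <= tps_percent <= 100:
        raise ValueError("TPS must be a percentage between 0 and 100")
    if tps_percent < 1:
        return "negative"
    if tps_percent < 50:
        return "low"
    return "high"
```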
Efficacy evaluation of immunotherapy: The primary endpoint of this study was the final immunotherapy response, determined based on the initial radiological assessment followed by necessary confirmatory procedures. All patients underwent baseline contrast-enhanced chest CT and routine laboratory tests within 4 weeks before initiating immunotherapy. The first follow-up imaging evaluation was performed at 6–8 weeks after the initial treatment. Tumor response was strictly evaluated according to the iRECIST criteria (9). Complete response (iCR) was defined as the disappearance of all target lesions. Partial response (iPR) was defined as a ≥30% decrease in the sum of diameters of all target lesions relative to baseline. Stable disease (iSD) was defined as a change in the sum of diameters ranging from −30% to +20% (Figure 2). Unconfirmed progressive disease (iUPD) was defined as a ≥20% increase in the sum of diameters. For patients assessed as iUPD, a comprehensive evaluation was conducted by a respiratory physician to decide whether to continue the original treatment regimen. If treatment was continued, a confirmatory imaging scan was performed 4–6 weeks later. The final outcome (iCPD, iSD, or iPR) determined from this confirmatory scan was recorded as the study endpoint for that patient (Figure 3). For analysis purposes, patients were categorized into two groups: responders (best overall response of iCR or iPR, including those converting from iUPD to iPR) and non-responders (best overall response of iSD or iCPD).
Figure 2. Imaging findings at the first on-treatment evaluation demonstrating Partial Response (PR) and Stable Disease (SD). PR: A 68-year-old male with right lung SCC. Baseline scan shows a target lesion (48.41 mm) (A). At first follow-up, the lesion is not measurable at the same level (B); the residual lesion measures 25.01 mm (C). SD: A 48-year-old male with right lung SCC. Baseline scan shows a target lesion (84.61 mm) (D). At first follow-up, the lesion measures 76.26 mm (E).
Figure 3. Individualized management and serial imaging for a patient initially assessed with iUPD. A 63-year-old man with right lung adenocarcinoma. Baseline scan demonstrates the target lesion (86.63 mm) (A). After 2 cycles of chemoimmunotherapy, the first follow-up showed iUPD (B). Despite apparent progression, treatment was continued because pseudoprogression was suspected. After a third cycle, a subsequent scan revealed significant tumor shrinkage to 52.39 mm, indicating that the initial iUPD was not confirmed (C). After 6 cycles, tumor size was 50.78 mm, confirming a best overall response of iPR (D).
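The response thresholds described in Section 2.3 can be illustrated with a short calculation on the lesion sizes reported in Figures 2 and 3. This is a simplified sketch, not the study's assessment procedure: it applies only the percentage cut-offs stated in the text relative to baseline, and the function names are hypothetical. Confirmation of iUPD on a later scan is outside its scope.

```python
def percent_change(baseline_mm: float, followup_mm: float) -> float:
    """Percentage change in the sum of target-lesion diameters versus baseline."""
    if baseline_mm <= 0:
        raise ValueError("baseline diameter must be positive")
    return (followup_mm - baseline_mm) / baseline_mm * 100.0


def classify_response(baseline_mm: float, followup_mm: float) -> str:
    """Apply the cut-offs stated in the text (simplified, baseline-referenced)."""
    if followup_mm == 0:
        return "iCR"   # disappearance of all target lesions
    change = percent_change(baseline_mm, followup_mm)
    if change <= -30:
        return "iPR"   # decrease of at least 30%
    if change >= 20:
        return "iUPD"  # increase of at least 20%, still unconfirmed
    return "iSD"


def is_responder(category: str) -> bool:
    """Responders are iCR or iPR; iSD and iCPD are non-responders."""
    return category in {"iCR", "iPR"}


# Lesion sizes taken from the Figure 2 and Figure 3 captions.
for label, base, follow in [("Fig. 2 PR case", 48.41, 25.01),
                            ("Fig. 2 SD case", 84.61, 76.26),
                            ("Fig. 3 after third cycle", 86.63, 52.39)]:
    print(label, round(percent_change(base, follow), 1), classify_response(base, follow))
```

With these inputs the changes are roughly -48%, -10%, and -40%, which matches the PR, SD, and iPR labels given in the captions.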
2.4 Image analysis
2.4.1 Image preprocessing and segmentation
The images were resampled to a uniform voxel size of 1×1×1 mm³ using linear interpolation. Gray levels were discretized via min-max normalization into 25 fixed intensity bins (10). The preprocessed images were then segmented using the ITK-SNAP software (http://www.itksnap.org). Initially, a junior physician (Yue HOU) manually delineated the region of interest (ROI) on the transverse plane displaying the maximum tumor diameter, using a lung window setting (width: 1500 HU; level: -500 HU). All these initial ROIs were subsequently reviewed and revised by a senior respiratory specialist (Tianming ZHANG). The delineation was guided by the tumor-lung interface, carefully excluding necrotic or calcified areas, adjacent vessels, and peripheral non-tumor regions (Supplementary Figure 1). To assess the inter- and intra-observer consistency of the segmentation method, two senior experts (Tianming ZHANG and Kaibo ZHU) independently delineated the lesions in a randomly selected cohort of 30 images following the same protocol.
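To make the gray-level discretization step concrete, the sketch below maps intensities inside an ROI to 25 bins by min-max scaling. It is one plausible reading of "min-max normalization into 25 fixed intensity bins", not the study's code; the function name is hypothetical and the study itself used PyRadiomics for this step.

```python
def discretize_fixed_bins(values, n_bins=25):
    """Map intensities to integer gray levels 1..n_bins via min-max scaling.

    Assumption: bins are equally wide between the ROI minimum and maximum,
    and the maximum intensity is assigned to the last bin.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant ROI carries a single gray level.
        return [1 for _ in values]
    width = (hi - lo) / n_bins
    # int() floors the non-negative ratio; min() clamps the maximum into bin n_bins.
    return [min(int((v - lo) / width) + 1, n_bins) for v in values]
```

For example, `discretize_fixed_bins([0, 50, 100])` assigns the three intensities to bins 1, 13, and 25.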
2.4.2 Radiomics feature extraction, screening, and modeling
Imaging features were extracted from the ROIs using the Python PyRadiomics package (http://pyradiomics.readthedocs.io/). Z-score normalization was first applied to standardize the data. The inter- and intra-observer consistency of the features extracted by the two experts was then evaluated, and only radiomic features demonstrating high reproducibility (ICC ≥ 0.75) were retained. A Mann–Whitney U test was performed on all features, and those with a p-value below 0.05 were kept. Subsequently, to minimize redundancy, features exhibiting a Spearman’s correlation coefficient greater than 0.9 were excluded. The dataset was randomly divided into a training set (115 cases) and a testing set (29 cases), approximating an 8:2 ratio. On the training set, feature dimensionality was reduced using the least absolute shrinkage and selection operator (LASSO). The optimal hyperparameter (α) for LASSO was determined via five-fold cross-validation applied exclusively to the training set. The final feature set, consisting of the features with nonzero coefficients, was obtained by fitting a LASSO model on the entire training set using this optimal α. The selected imaging features were used to construct a Rad-score via multiple machine learning algorithms, including logistic regression (LR), support vector machines (SVM), random forest (RF), and extremely randomized trees (ET). Similarly, significant clinical features (P < 0.05) were incorporated into a clinical model. The best-performing algorithm was selected to integrate the Rad-score and clinical features into a combined prediction model (8, 11–13), which was ultimately presented as a nomogram for clinical application. The overall workflow is depicted in Figure 4.
Figure 4. Workflow for the radiomics analysis. ROI, region of interest; MSE, mean squared error; ROC, receiver operator characteristic; DCA, decision curve analysis.
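The redundancy-removal step (excluding features with a Spearman correlation above 0.9) can be sketched in plain Python as follows. This is illustrative only: the text does not state which member of a correlated pair was dropped, so the greedy rule of keeping the first feature encountered is our assumption, and all function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_redundant(features, threshold=0.9):
    """Greedy filter: keep a feature only if |rho| <= threshold with all kept ones.

    `features` maps feature name -> list of values (one per patient).
    Assumption: the first feature of a correlated pair is the one retained.
    """
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

As a usage example, with toy features `{"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 11], "c": [5, 1, 4, 2, 3]}`, feature `b` is perfectly rank-correlated with `a` and is dropped, while `c` (ρ = -0.3 with `a`) is kept.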
2.5 Internal validation of the nomogram
To provide a robust and unbiased estimate of the nomogram’s generalizability and to address potential overfitting, the model was internally validated using a repeated stratified 5-fold cross-validation scheme on the training set (n=115). With 50 repetitions, this process yielded 250 performance estimates. These estimates were used to comprehensively assess the model’s stability and predictive performance.
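The following sketch shows how 50 repeats of stratified 5-fold cross-validation produce 250 train/test splits on a 115-patient training set. It generates the splits only and is not the study's implementation; the function name, the random seed, and the label coding (1 = non-responder, 0 = responder, using the 43/72 counts reported in Section 3.1) are our assumptions.

```python
import random


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=50, seed=0):
    """Yield (train_indices, test_indices) pairs, n_splits per repeat.

    Within each repeat, the indices of every class are shuffled and dealt
    round-robin into the folds so that each fold keeps the class ratio.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    all_idx = set(range(len(labels)))
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            for pos, idx in enumerate(shuffled):
                folds[pos % n_splits].append(idx)
        for test in folds:
            test_set = set(test)
            yield sorted(all_idx - test_set), sorted(test_set)


# Training-set composition from Section 3.1: 43 non-responders, 72 responders.
labels = [1] * 43 + [0] * 72
splits = list(repeated_stratified_kfold(labels))
print(len(splits))  # 50 repeats x 5 folds = 250 splits
```

Each of the 250 splits would then be used to refit the model on the training indices and compute one AUC on the held-out indices.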
2.6 Statistical analysis
Statistical analyses were performed using R software version 4.3.3. Normally distributed quantitative data were expressed as mean ± standard deviation (X ± s), and intergroup comparisons were conducted using the independent samples t-test. Non-normally distributed quantitative data were expressed as median (interquartile range) [M(P25,P75)], and intergroup comparisons were performed using the Mann-Whitney U test. Categorical data were expressed as percentages (%). Intergroup comparisons were performed using chi-square (χ2) and Fisher’s exact tests. Differences were considered statistically significant at P < 0.05. Receiver operating characteristic (ROC) curves were plotted, and the area under the curve (AUC) was calculated to assess the predictive performance of the models. The sensitivity, specificity, and accuracy were also computed. The DeLong test was used to compare the performance differences among the different models. Calibration curves were used to assess the consistency between the predicted probabilities and actual outcomes. The Hosmer-Lemeshow test was used to evaluate the goodness-of-fit of the models. Decision curve analysis (DCA) was conducted to assess the clinical utility of the models (11–14). Furthermore, to enable a direct quantitative comparison at specific clinical decision points, we calculated the net benefit, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for each model at the key thresholds of 0.2, 0.3, and 0.5.
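For readers less familiar with the metrics listed above, the sketch below computes the AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (ties count one half), together with sensitivity, specificity, PPV, and NPV at a given threshold. The analyses in this study were run in R; this Python version is illustrative only, with hypothetical function names and toy data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold):
    """Confusion-matrix metrics when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data, not study data.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
print(threshold_metrics(toy_labels, toy_scores, 0.3))
```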
3 Results
3.1 Comparing the clinical characteristics of patients
According to the inclusion and exclusion criteria, this study included 144 patients, aged 36–86 years, with a median age of 61.00 years. Among them, 88 patients (61.11%) were 60 years or older, and 56 patients (38.89%) were younger than 60 years. There were 132 males and 12 females; 97 patients had squamous cell carcinoma and 47 had adenocarcinoma. The patients were randomly divided into a training set of 115 patients (72 with a response and 43 without a response) and a testing set of 29 patients (18 with a response and 11 without a response). CRP, PD-L1, and BTS showed statistically significant differences in the training set (P < 0.05). The other characteristics, including demographic characteristics (age, gender, race, etc.), smoking history, family history of cancer, and TNM stage, showed no statistically significant differences (P > 0.05), as shown in Table 2.
Table 2. Characteristics of patients included in the study and P values revealing statistical differences between the study cohorts.
3.2 Feature extraction and radiomics signature construction
A total of 107 radiomics features were extracted, including 18 first-order features, 14 shape features, 5 NGTDM features, 16 GLSZM features, 16 GLRLM features, 14 GLDM features, and 24 GLCM features. Following univariate screening with the Mann-Whitney U test (p < 0.05), 24 significant features were identified. The percentage of each feature and its statistical values are presented in Figure 5. Subsequent removal of highly correlated features (Spearman’s |ρ| > 0.9) reduced the number to 12. A LASSO regression model applied to these 12 features selected the final 7 most predictive features; the features, their coefficients, and the cross-validated MSE are shown in Figure 6. The selected features were then used to construct prediction models using machine learning methods, namely LR, SVM, random forest, and Extra Trees. Supplementary Table 1 summarizes all models evaluated for predicting immunotherapy efficacy. Among them, the Extra Trees algorithm demonstrated superior performance, achieving the highest AUC values in both the training set (0.926; 95% CI: 0.882–0.970) and the testing set (0.848; 95% CI: 0.695–1.000), as illustrated in Figure 7. Consequently, it was selected to build the final radiomics signature. A radiomics score (Rad-score) was constructed using the LASSO logistic regression model with the following formula:
Rad-score = 0.375
+ 0.075783 × original_firstorder_Maximum
+ 0.009822 × original_gldm_SmallDependenceLowGrayLevelEmphasis
− 0.042976 × original_glrlm_LongRunHighGrayLevelEmphasis
− 0.007042 × original_glszm_GrayLevelNonUniformity
− 0.033593 × original_glszm_SizeZoneNonUniformity
+ 0.046212 × original_glszm_SmallAreaLowGrayLevelEmphasis
+ 0.036860 × original_ngtdm_Coarseness
The probability of non-response was expressed as P(Non-reactive = 1) = 1/(1 + e^(−Z)), where Z = −2.80967098 + 3.16488672 × Clinic_Sig + 2.82782853 × Rad_Sig.
Figure 6. Coefficients of 5-fold cross-validation (A). MSE of 10-fold cross-validation (B). The histogram of the Rad-score based on the selected features (C).
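The two formulas in Section 3.2 can be evaluated directly. The sketch below uses the coefficients reported in the text; the function names are hypothetical, the feature values are assumed to be the z-score-normalized inputs described in Section 2.4.2, and the text does not specify how Clinic_Sig and Rad_Sig are scaled, so the inputs used in the example are purely illustrative.

```python
import math

# Coefficients as reported in Section 3.2.
RAD_INTERCEPT = 0.375
RAD_COEFS = {
    "original_firstorder_Maximum": 0.075783,
    "original_gldm_SmallDependenceLowGrayLevelEmphasis": 0.009822,
    "original_glrlm_LongRunHighGrayLevelEmphasis": -0.042976,
    "original_glszm_GrayLevelNonUniformity": -0.007042,
    "original_glszm_SizeZoneNonUniformity": -0.033593,
    "original_glszm_SmallAreaLowGrayLevelEmphasis": 0.046212,
    "original_ngtdm_Coarseness": 0.036860,
}


def rad_score(features):
    """Linear Rad-score; `features` maps feature name -> (assumed z-scored) value."""
    return RAD_INTERCEPT + sum(coef * features[name] for name, coef in RAD_COEFS.items())


def nomogram_probability(clinic_sig, rad_sig):
    """P(Non-reactive = 1) from the logistic formula reported in the text."""
    z = -2.80967098 + 3.16488672 * clinic_sig + 2.82782853 * rad_sig
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative inputs only (not patient data).
average_patient = {name: 0.0 for name in RAD_COEFS}
print(rad_score(average_patient))      # equals the intercept when all z-scores are 0
print(nomogram_probability(0.5, 0.5))
```

Because both signature coefficients are positive, higher Clinic_Sig or Rad_Sig values raise the predicted probability of non-response.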
3.3 Comparing clinical, radiomic signature, and nomogram model
The clinical, radiomics, and nomogram models all demonstrated good specificity in the training set, with AUCs of 0.882 (95% CI: 0.820–0.943), 0.926 (95% CI: 0.882–0.970), and 0.953 (95% CI: 0.921–0.986), respectively; the nomogram achieved the highest predictive performance. In the testing set, both the clinical and radiomics models showed good fit, whereas the nomogram may have been overfitted, with AUCs of 0.742 (95% CI: 0.538–0.947), 0.848 (95% CI: 0.695–1.000), and 0.788 (95% CI: 0.610–0.965), respectively. Although the radiomics model outperformed the nomogram in the testing set, the difference was not statistically significant (P > 0.05), as shown in Figure 8. Calibration assessment indicated room for improvement in the absolute accuracy of model predictions. Although the Hosmer–Lemeshow test P-values for the clinical model, radiomics signature, and nomogram in the testing set were all greater than 0.05 (0.354, 0.515, and 0.086, respectively), the calibration curves (Figure 9) revealed visible deviations between predicted and observed outcomes. Moreover, the nomogram’s P-value (0.086) approached the threshold of statistical significance. These findings suggest that the calibration of the models is suboptimal and that they may be more suitable for risk stratification than for precise probability estimation. Thus, the predicted probabilities should be interpreted as indicating a relative risk range rather than exact point estimates.
Figure 8. AUC comparison of clinical, radiomics and combined models. The combined model had the best performance in the training set, whereas the radiomics model had the highest AUC in the testing set.
The decision curves for the clinical, radiomics, and nomogram models are presented in Figure 10. Quantitative clinical utility analysis demonstrated that at the probability threshold of 0.3, the nomogram achieved a net benefit of 0.318 with perfect sensitivity (100%) and specificity of 79.2%, significantly outperforming simple clinical baseline models including PD-L1 high expression (net benefit: 0.020), BTS <50 mm (net benefit: -0.063, indicating net harm), and low CRP (net benefit: 0.024), as detailed in Supplementary Figure 4 and Table 2. The clinical application of the nomogram model is illustrated in Figure 11, where the total score reflects the probability of achieving PD or SD as the initial treatment response, providing quantifiable advantages for clinical decision-making. The comprehensive performance metrics for all final models are detailed in Supplementary Table 2.
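The net benefit reported above follows the standard decision-curve definition, NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the probability threshold. The sketch below reproduces the reported value of 0.318 at pt = 0.3 under stated assumptions: the counts (43 true positives, 15 false positives, n = 115) are not taken from the paper but are inferred by us from the reported 100% sensitivity and 79.2% specificity, assuming the evaluation was on the training set with non-response as the event.

```python
def net_benefit(tp, fp, n, threshold):
    """Decision-curve net benefit at a given probability threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return tp / n - fp / n * threshold / (1 - threshold)


# Assumed counts (inferred, not reported): 43/43 events detected (sensitivity 100%),
# 57/72 non-events correctly excluded (specificity 79.2%), so FP = 15.
nomogram_nb = net_benefit(tp=43, fp=15, n=115, threshold=0.3)
treat_all_nb = net_benefit(tp=43, fp=72, n=115, threshold=0.3)
print(round(nomogram_nb, 3), round(treat_all_nb, 3))
```

Under the same assumptions, classifying every patient as a non-responder would give a net benefit of about 0.106 at this threshold, which puts the nomogram's value in context.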
3.4 Unbiased performance estimate by internal validation
The notable performance drop of the nomogram from the training set (AUC = 0.953) to the initial testing set (AUC = 0.788) suggested potential overfitting. To robustly estimate the model’s generalizable performance and address this concern, we performed a rigorous internal validation using 50 repeats of stratified 5-fold cross-validation on the training cohort (n=115).
This validation yielded a mean AUC of 0.783 (95% CI: 0.777 – 0.790), which aligns closely with the initial test set performance (AUC = 0.788). The high consistency between these independent assessments strongly indicates that the model’s generalizable discrimination ability is approximately 0.78, not the overly optimistic 0.953 observed on the training set. Furthermore, the low coefficient of variation (3.02%) and the detailed distribution of performance across all repetitions (Supplementary Figure 2) confirm the satisfactory stability of the model.
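The summary statistics quoted above (mean AUC, coefficient of variation, and a 95% CI) can be obtained from a list of cross-validated AUCs as sketched below. The text does not state how the interval was computed; this example uses a normal-approximation interval for the mean, which is our assumption, and the AUC values are toy numbers rather than study results.

```python
import math
from statistics import mean, stdev


def summarize_auc(aucs):
    """Mean, sample SD, coefficient of variation (%), and normal-approximation CI."""
    m = mean(aucs)
    s = stdev(aucs)
    half_width = 1.96 * s / math.sqrt(len(aucs))
    return {
        "mean": m,
        "sd": s,
        "cv_percent": 100.0 * s / m,
        "ci95": (m - half_width, m + half_width),
    }


# Toy values for illustration only.
summary = summarize_auc([0.76, 0.78, 0.80])
print(summary)
```

Note that an interval built this way describes the uncertainty of the mean across resamples, which is why it can be much narrower than the confidence interval of a single test-set AUC.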
4 Discussion
In China, the incidence of lung cancer is increasing, and the burden of cancer is expected to continue to increase over the next 20 years. Despite the increasing number of treatment options, the prognosis for patients with lung cancer remains poor (2, 15). The objective response rate to immunotherapy varies among patients with NSCLC (16). Identifying early the patients who will benefit from immunotherapy, and promptly switching non-responders to a different treatment plan, is crucial for reducing the disease burden on patients. This study developed and validated a model based on clinical and pre-treatment enhanced CT imaging features that outperformed traditional clinical prediction models. This model can provide an early, noninvasive predictive tool for immunotherapy decisions in advanced NSCLC.
Immunotherapy outcomes are influenced by a multifactorial interplay of tumor-related factors (e.g., histology, metastatic pattern, BTS), host-related factors (e.g., non-specific inflammatory markers, PS score, PD-L1 expression, prognostic nutritional index), and treatment-related parameters (e.g., line of therapy, treatment strictness) (16). From this broad prognostic spectrum, we identified three core clinical variables (CRP, BTS, and PD-L1 expression) to construct a streamlined clinical model. CRP is an acute-phase protein synthesized by hepatocytes under the influence of inflammatory factors. High concentrations of CRP (>10 μg/mL) are associated with metastasis and prognosis of NSCLC and other late-stage tumors (17, 18). Regardless of the cut-off value used for CRP or whether it is treated as a continuous or categorical variable, its baseline level is significantly associated with the prognosis of patients with advanced NSCLC receiving immunotherapy (19). Similarly, BTS directly quantifies tumor burden and functions as an independent prognostic factor, with larger size (e.g., >50 mm) correlating with significantly lower disease control rates (20, 21). PD-L1 expression has been widely used in clinical practice as a biomarker of immunotherapy efficacy. Both domestic and international guidelines recommend PD-1 inhibitors alone or in combination with chemotherapy as a first-line treatment for patients with NSCLC with negative driver gene mutations, PD-L1 high expression (TPS ≥ 50%), or PD-L1 low expression (1% ≤ TPS < 50%) (3, 22, 23). Our data confirmed their discriminative power, with responders showing significantly lower CRP and BTS and higher PD-L1 positivity. In addition, elevated LDH levels, which are linked to lactic acid production, may be associated with tumor development and have been associated with reduced treatment effectiveness and poor prognosis in NSCLC (24, 25). However, the predictive value of LDH in this study was limited. Further rigorous clinical research is required to verify its potential value.
Radiomics provides a more objective, efficient, and dynamic approach than traditional manual image interpretation for assessing lesion characteristics and treatment efficacy (26, 27), with previous NSCLC immunotherapy studies demonstrating predictive value through test AUCs of 0.78-0.84 (28–31). Notably, Zhang et al. (31) confirmed through a systematic review that pre-treatment CT radiomic texture features can capture intrinsic tumor heterogeneity for efficacy prediction. They further revealed that tumors assessed as progressive at the first evaluation were characterized by abundant stroma and defective vascular structures, which impeded the infiltration of immune cells. An important methodological consideration in such studies is the strategy for lesion selection. In this regard, Wu et al. (32) demonstrated the superiority of the single-largest-lesion approach over the target-lesions method, as the latter is constrained by inter-lesional heterogeneity and clinical impracticality. Accordingly, our study adopted the single-largest-lesion approach, selecting the largest pulmonary tumor as the target lesion to ensure robustness. Building on this foundation and employing multiple machine learning algorithms, we achieved a competitive predictive performance, with AUCs of 0.926 and 0.848 in the training and testing sets, respectively.
Numerous studies have combined radiomics with diverse sets of clinical and serological markers. For instance, Miguel-Perez et al. (33) prospectively developed a combined model based on plasma PD-L1 dynamics and six radiomic features, demonstrating its potential as a biomarker for immunotherapy response. Furthermore, previous domestic and international studies (34–38) have incorporated varying clinical parameters—such as age, metastasis sites, systemic immune-inflammation index (SII), or drug types—consistently showing that combined models outperform those using clinical or radiomic features alone. In contrast to these approaches, our clinical model was deliberately constructed using a selective and distinct set of biomarkers (CRP, PD-L1, BTS). The rationale for this parsimonious feature set was rigorously validated through a formal sensitivity analysis (detailed results are provided in Supplementary Figure 3), which showed that incorporating seven additional common clinical covariates—namely, smoking history, ECOG PS score, neutrophil-to-lymphocyte ratio (NLR), metastasis pattern, histology, platelet-to-lymphocyte ratio (PLR), and prognostic nutritional index (PNI)—did not improve performance but slightly reduced the AUC from 0.737 to 0.712. Feature importance analysis further reaffirmed the dominant contributions of CRP, BTS, and PD-L1, indicating that other factors provided largely redundant prognostic information. This confirms that our refined clinical feature set is both optimally predictive and non-redundant. The subsequent integration of this robust clinical model with a radiomics signature (comprising one first-order and six texture features) yielded a high-performing nomogram, offering a non-invasive and efficient tool to support treatment decision-making in cancer patients.
The optimal delineation strategy for regions of interest in radiomic analysis, whether based on a single representative slice or the entire tumor volume, remains a subject of methodological discussion (39–41). While 3D segmentation theoretically represents spatial heterogeneity more comprehensively, 2D delineation demonstrates superior stability across different scanners and multicenter datasets, along with higher inter-observer agreement (42). Combined with its closer alignment to clinical workflows through operational simplicity and standardization potential, these considerations led us to adopt a two-dimensional annotation approach based on the largest tumor cross-section. Our analysis focused specifically on the core tumor parenchyma to ensure feature robustness. We acknowledge that extending the analysis to peritumoral regions represents a promising direction for future research to achieve a more comprehensive characterization of tumor biology.
The nomogram demonstrated a marked performance drop from the training set (AUC = 0.953) to the initial testing set (AUC = 0.788), indicating overfitting; however, rigorous repeated cross-validation yielded a mean AUC of 0.783, closely aligning with the testing set performance and confirming a robust, generalizable discrimination ability of approximately 0.78. In contrast, calibration assessment revealed suboptimal accuracy, with visible curve deviations and a borderline Hosmer-Lemeshow test P-value (0.086) in the testing set. These discrepancies may be attributable not only to the limited validation cohort sample size but also to the inherent complexity of integrating multimodal features and potential unmodeled patient heterogeneity. Consequently, the current model is more reliable for risk stratification than for precise probability estimation. To define its clinical utility, decision curve analysis identified a probability threshold of 0.3 as optimal, at which the nomogram demonstrated substantially superior net benefit over all simple clinical benchmarks (PD-L1 expression, BTS, and CRP), confirming its significant added value for guiding immunotherapy strategies. Future efforts should focus on large-scale, external validation cohorts and algorithm refinement, incorporating post-hoc verification to enhance predictive calibration and secure broader clinical applicability.
An intriguing observation in this study was the notable difference in generalizability between the integrated nomogram and the standalone radiomics signature. Although both models demonstrated competent performance during training, the nomogram exhibited a more pronounced performance decline upon external validation, suggesting potential overfitting. This discrepancy appears to stem from two primary factors. First, while incorporating clinical variables aimed to enhance predictive accuracy, it may have introduced noise or cohort-specific biases, thereby promoting overfitting to non-generalizable patterns. Second, the increased model complexity from integrating multi-domain features inherently elevates overfitting risk, particularly with limited sample sizes. Moreover, radiomics features derived directly from tumor lesions may capture tumor heterogeneity more specifically and directly than indirect clinical parameters such as PD-L1 expression, BTS, and CRP, potentially accounting for their superior predictive efficacy. Although the radiomics model’s advantage in the testing set was not statistically significant (P > 0.05), its consistent performance and conceptual simplicity suggest its potential as a more robust and translatable biomarker across diverse populations. Despite demonstrating overfitting tendencies, the combined model remains clinically relevant and warrants further validation in larger, prospective cohorts.
This study has certain limitations: (1) It was a single-center, retrospective study, and PD-L1 detection using specimens from different collection sites may have affected the results, potentially leading to selection bias; (2) Manual segmentation of 2D images (central cross-sectional views of target lesions) was time-efficient and easier to perform than delineating entire tumors. However, this approach may have missed tumor regions not captured in the maximal cross-section, and lesions with indistinct borders were challenging to delineate; (3) The pre-treatment contrast-enhanced CT scans were acquired using multiple scanning devices. Although image resampling and fixed gray-level discretization were applied to minimize inter-scanner heterogeneity, variations in scanning parameters still pose a challenge to feature reproducibility. Crucially, the lack of scanner metadata in our retrospective dataset prevented us from applying the ComBat method, a powerful tool for batch effect correction, to quantify and adjust for scanner variations. Future studies should prospectively collect key information such as scanner models and scanning parameters in a standardized manner to lay the groundwork for applying more reliable harmonization methods; (4) The study performed radiomics analysis only on primary tumors, excluding pulmonary metastases and metastases in other organs, and did not assess patient survival outcomes. These factors may have introduced bias. Future efforts should include multicenter data for external validation, standardize lesion segmentation criteria, adopt semi-automated methods to enhance segmentation efficiency, and extend follow-up periods to optimize the model’s predictive performance, reliability, and generalizability.
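The gray-level discretization step mentioned in limitation (3) can be sketched as follows. This is a minimal Python illustration of fixed-bin-width discretization, under the assumption that "fixed" refers to a fixed bin width. The 25 HU width, the function name, and the sample intensities are hypothetical and are not the study's settings.

```python
import math


def discretize_fixed_bin_width(values, bin_width=25.0):
    """Map intensities to 1-based gray levels using a fixed bin width.

    Bin edges sit at multiples of bin_width, so the same intensity interval
    has the same width in every scan; levels are shifted to start at 1.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if not values:
        return []
    # Index of the bin that contains the lowest intensity in the region.
    offset = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - offset + 1 for v in values]


# Hypothetical intensities (in HU) sampled from one region of interest.
roi_hu = [-12.0, 3.5, 24.9, 25.0, 61.0, 110.0]
gray_levels = discretize_fixed_bin_width(roi_hu, bin_width=25.0)
print(gray_levels)
```

Fixing the bin width rather than the number of bins keeps the intensity resolution constant across scans. It does not, however, remove scanner-dependent differences in the intensities themselves, which is why harmonization methods such as ComBat are still of interest.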
5 Conclusion
In conclusion, we developed an early prediction framework for immunotherapy efficacy in advanced NSCLC using pretreatment clinical and contrast-enhanced CT radiomics features. The radiomics signature provides a robust basis for patient stratification, while the integrated nomogram shows potential for improved predictive performance, pending validation in larger cohorts.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by the ethics committee of Lanzhou University Second Hospital (2024A-306). The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants’ legal guardians/next of kin because this was a retrospective study without any clinical intervention or control group. All collected data were handled in strict accordance with the principles of confidentiality, were not used for commercial purposes, and were used solely for research related to this project.
Author contributions
YH: Writing – original draft, Writing – review & editing, Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Project administration. TZ: Writing – original draft, Writing – review & editing, Funding acquisition, Project administration, Resources, Supervision. KZ: Writing – review & editing, Methodology, Validation. JJ: Writing – review & editing, Data curation. HW: Writing – review & editing, Project administration, Funding acquisition, Resources, Supervision.
Funding
The author(s) declared that financial support was received for this work and/or its publication. Funding project: Gansu Provincial Natural Science Fund Project (No. 24JRRA371); Lanzhou University Second Hospital Comprehensive Innovative Program Project (No. CY2024-LC-B03), Zhongke Innovation and Prevention of Chronic Diseases Research Institute.
Conflict of interest
The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2025.1711402/full#supplementary-material
Keywords: immunotherapy, machine learning, non-small cell lung cancer, radiomics, response prediction
Citation: Hou Y, Zhang T, Zhu K, Jiang J and Wang H (2025) Early prediction of immunotherapy efficacy for advanced NSCLC based on clinical and pre-treatment contrast-enhanced CT radiomics features. Front. Oncol. 15:1711402. doi: 10.3389/fonc.2025.1711402
Received: 23 September 2025; Revised: 01 December 2025; Accepted: 05 December 2025; Published: 19 December 2025.
Edited by:
Sunyi Zheng, Tianjin Medical University Cancer Institute and Hospital, China
Reviewed by:
Harish RaviPrakash, AstraZeneca, United States
Weisi Yan, University of Kentucky, United States
Copyright © 2025 Hou, Zhang, Zhu, Jiang and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Hong Wang, ldeyhxkwh@163.com