
ORIGINAL RESEARCH article

Front. Oncol., 12 January 2026

Sec. Cancer Imaging and Image-directed Interventions

Volume 15 - 2025 | https://doi.org/10.3389/fonc.2025.1706104

This article is part of the Research Topic: Artificial Intelligence Advancing Lung Cancer Screening and Treatment.

Development and validation of an LDCT-based deep learning radiomics nomogram for predicting postoperative recurrence of stage Ia lung adenocarcinoma

Haimei Lan1†, Chaosheng Wei1†, Yiming Luo2, Mingzhuang Liao1, Hongfeng Liang1, Jianli Qin1, Jixing Yi1, Fengming Xu1, Dandan Huang1, Meiqing Zhang1, Qing Feng1*, Tao Li1*
  • 1Department of Radiology, Liuzhou Workers’ Hospital, Guangxi, China
  • 2Department of Radiation Therapy, Guangxi Medical University Cancer Hospital, Guangxi, China

Objective: This study aimed to develop a deep learning radiomics nomogram (DLRN) based on low-dose computed tomography (LDCT) plain scan images to accurately predict the risk of postoperative recurrence in patients with stage Ia lung adenocarcinoma (LUAD).

Methods: We collected cases at Center 1 from January 2010 to December 2020 of patients who underwent surgery and were pathologically diagnosed with stage Ia LUAD, and additionally collected patients meeting the same criteria at Center 2 from January 2015 to December 2018 for external validation. Deep learning and radiomics feature extraction were performed on the LDCT images of all patients. For both the deep learning and radiomics approaches, we tested multiple models and selected the best one based on the results of the internal validation cohort. Finally, we constructed a nomogram combining deep learning features, radiomics features and clinical data. We then used receiver operating characteristic (ROC) curves to assess the diagnostic performance of these models. The calibration of each model was evaluated using calibration curves, while the clinical value of each model was assessed through decision curve analysis (DCA).

Results: At Center 1, we enrolled a total of 233 eligible patients, who were randomly divided into a training cohort (163 patients) and an internal validation cohort (70 patients) at a 7:3 ratio; a further 89 patients were enrolled at Center 2. Internal validation results identified ResNet50 and Logistic Regression (LR) as the optimal models for the deep learning and radiomics approaches, respectively. The area under the curve (AUC) values for the combined model were 0.972 (95% CI: 0.949-0.995) in the training cohort, 0.925 (95% CI: 0.845-1.000) in the internal validation cohort, and 0.915 (95% CI: 0.853-0.976) in the external validation cohort. Compared with the single models, it demonstrated the best performance.

Conclusion: A preoperative DLRN based on LDCT plain scan images exhibits good predictive value for postoperative recurrence in patients with stage Ia LUAD. The present study developed a novel prognostic assessment method intended to assist clinicians in refining adjuvant treatment plans for patients with stage Ia LUAD, thus facilitating personalised prognostic management.

1 Introduction

Lung cancer remains the leading cause of cancer death globally (1). Non-small cell lung cancer (NSCLC) accounts for approximately 80% of all lung cancer cases, and lung adenocarcinoma (LUAD) is the most common histological type (2, 3). Recently, the widespread adoption of low-dose computed tomography (LDCT) screening for lung cancer has led to the detection of numerous early-stage LUAD patients (4–6). For these patients, surgical resection is the preferred option for achieving radical treatment (3, 7). However, postoperative recurrence is the most common cause of death and a major factor affecting long-term survival after surgery. Studies have found that the postoperative recurrence rate in patients with stage Ia LUAD is approximately 10%-20% (8). Therefore, assessing the recurrence of stage Ia LUAD after surgery is crucial for formulating personalized and effective treatment strategies.

Radiomics is a practical quantitative medical imaging technique that analyses high-throughput features derived from manually delineated tumour regions (9, 10). This technology has been extensively studied in post-operative recurrence prediction and survival analysis for LUAD (11). Traditional manual radiomics approaches merely extract surface image features from annotated regions, failing to fully capture tumour heterogeneity and thus limiting the field’s potential (12). The autonomous extraction of quantitative features from medical images using deep learning techniques represents a novel direction in radiomics development. Convolutional neural network (CNN)-based deep learning has become a promising approach for predicting postoperative recurrence in lung adenocarcinoma (LUAD), demonstrating significant clinical value (13). However, the effective application of deep learning requires substantial training data, and medical datasets are often limited in scale.

In recent years, deep transfer learning (DTL) technology has emerged as a focal point of scholarly research. Its core principle lies in accomplishing novel tasks by fine-tuning models pre-trained on large datasets, thereby enabling the application of deep learning techniques even with limited data. Concurrently, radiomics and deep learning have become rapidly advancing frontier technologies (14). Numerous researchers employ a transfer learning (TL) approach involving pre-trained CNNs to address overfitting issues arising from limited datasets (15, 16). Furthermore, the integration of DTL classification networks with traditional manual radiomics frameworks has gained traction in medical research (17, 18). Nevertheless, its application in postoperative recurrence studies of stage Ia LUAD remains relatively constrained, whereas the predominant research focus has been on its role in therapeutic decision-making for mid-to-advanced stage lung cancer (19–22).

Despite the growing body of research literature on early-stage LUAD, there remains a lack of studies employing radiomics and deep learning techniques on LDCT images to predict postoperative recurrence in stage Ia LUAD patients. Moreover, LDCT represents the most promising imaging modality for early screening of LUAD, effectively reducing mortality rates among lung cancer patients. This study aims to develop and validate a deep learning radiomics (DLR) signature based on LDCT images, and to explore its predictive efficacy for postoperative recurrence in stage Ia LUAD patients.

2 Materials and methods

This study introduces a DLR model using LDCT to predict the likelihood of recurrence after stage Ia LUAD surgery. First, handcrafted features were extracted from CT images using the pyradiomics package. Second, deep learning features were extracted from the maximum cross-section of the region of interest (ROI). These features were further enhanced using TL techniques, leveraging a pre-trained ResNet50 model. A radiomics signature and a corresponding nomogram were then developed and validated on an independent cohort. Figure 1 depicts the workflow of the radiomics analysis conducted in this research.


Figure 1. The workflow of LDCT-based DLRN.

2.1 Study population and follow-up

This retrospective study received approval from two institutional review boards (Approval No.: KY2025613, KY20251082) and was exempted from the requirement for patient informed consent. Data were collected on cases of surgically resected, pathologically confirmed stage Ia LUAD from Center 1 (January 2010 to December 2020) and from Center 2 (January 2015 to December 2018). Inclusion criteria: (1) medical history indicating a solitary lung cancer; (2) postoperative pathological diagnosis of invasive stage Ia LUAD; (3) CT examination performed within 2 weeks before surgery. Exclusion criteria: (1) receipt of other non-surgical treatments, such as radiotherapy or chemotherapy, before surgery; (2) patients who could not be followed up after surgery; (3) severe respiratory or motion artifacts blurring the CT images and impairing tumor observation; (4) absence of preoperative LDCT plain scan images of the lungs. Figure 2 illustrates the detailed recruitment methodology.


Figure 2. The patient recruitment process and distribution in the training and validation cohorts.

The main objective of this study was to evaluate recurrence-free survival (RFS). Patients were followed up for 5 years after surgery. If the tumor recurred within these five years, the patient was assigned to the recurrence group; otherwise, the patient was assigned to the non-recurrence group. Follow-up took place every six months for the first two years and once a year thereafter, and included CT, MRI or PET/CT scanning as well as telephone consultation. Following standard practice, recurrence was divided into local recurrence and distant metastasis. Reappearance of the tumor in N1 or N2 lymph nodes, the mediastinum, the primary lung site or the pleura was considered local recurrence. Distant metastasis referred to spread of cancer to the adrenal gland, kidney, bone, brain, liver, contralateral lung, skin or N3 lymph nodes (23).

2.2 CT image acquisition

All CT examinations were conducted during a breath-hold at deep inspiration, with the patient’s arms raised above the head, and the acquisition spanned from the lung apices to the lung bases. The scanner at Center 1 was a SIEMENS SOMATOM Definition Flash (Stellar), and the scanner at Center 2 was a General Electric Optima CT670. The CT scanning parameters for both machines were as follows: tube voltage 120 kV, tube current 30 mA, detector collimation 128×0.6 mm, matrix 512×512, and slice thickness 5 mm. The images were reconstructed using iterative reconstruction techniques, and the acquired CT data were saved in DICOM format. The CT images were preprocessed through standardization, including grayscale value discretization and image resampling, followed by Gaussian filtering.
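The preprocessing chain described above (resampling, grayscale handling, Gaussian filtering) can be reproduced with standard tooling; the sketch below uses SimpleITK, and the isotropic spacing and Gaussian sigma are illustrative assumptions rather than values reported in this study.

```python
import SimpleITK as sitk

def preprocess_ct(dicom_dir, new_spacing=(1.0, 1.0, 1.0), sigma=0.5):
    """Load a DICOM series, resample it, and apply Gaussian filtering.

    new_spacing and sigma are illustrative choices; the exact values used
    in the study are not reported.
    """
    reader = sitk.ImageSeriesReader()
    reader.SetFileNames(reader.GetGDCMSeriesFileNames(dicom_dir))
    image = reader.Execute()

    # Preserve the physical extent of the volume when changing the voxel spacing.
    orig_spacing, orig_size = image.GetSpacing(), image.GetSize()
    new_size = [int(round(osz * ospc / nspc))
                for osz, ospc, nspc in zip(orig_size, orig_spacing, new_spacing)]
    resampled = sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkBSpline,
                              image.GetOrigin(), new_spacing, image.GetDirection(),
                              0, image.GetPixelID())

    # Gaussian smoothing to suppress noise before feature extraction.
    return sitk.SmoothingRecursiveGaussian(resampled, sigma)
```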

2.3 ROI acquisition

In our study, two experienced radiologists delineated the ROI along the tumor edges on the training dataset using ITK-SNAP, avoiding the pleural wall, large bronchi, and blood vessels during delineation. Any inconsistencies between the contours were resolved by a radiologist with more than 20 years of experience in radiological diagnosis.

2.4 Intraclass correlation coefficient

We used the intraclass correlation coefficient (ICC) to assess the radiologists’ consistency in delineating the ROI. A random sample of 100 patients was selected from the training cohort, with the two physicians independently performing ROI delineation. An ICC value exceeding 0.75 was interpreted as indicating high consistency.
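A minimal sketch of this agreement check is shown below, assuming a long-format table holding one row per patient, reader, and feature; the pingouin call and the column names are illustrative, while the two-reader design and the 0.75 cut-off follow the text.

```python
import pandas as pd
import pingouin as pg

def stable_features(long_df: pd.DataFrame, threshold: float = 0.75):
    """Return the radiomics features whose two-reader ICC exceeds the threshold.

    long_df is assumed (hypothetical layout) to have columns
    'patient', 'reader', 'feature', 'value', one row per patient x reader x feature.
    """
    keep = []
    for feat, sub in long_df.groupby("feature"):
        icc = pg.intraclass_corr(data=sub, targets="patient",
                                 raters="reader", ratings="value")
        # Two-way random-effects, absolute-agreement, single-rater ICC (ICC2).
        icc2 = icc.loc[icc["Type"] == "ICC2", "ICC"].iloc[0]
        if icc2 > threshold:
            keep.append(feat)
    return keep
```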

2.5 Radiomics procedure

Handcrafted features fall into three distinct groups: geometric, intensity-based, and textural characteristics. These features are typically derived through established techniques such as the Gray-Level Co-occurrence Matrix (GLCM), Gray-Level Run-Length Matrix (GLRLM), Gray-Level Size Zone Matrix (GLSZM), and Neighboring Gray-Tone Difference Matrix (NGTDM). In this study, we extracted a comprehensive set of 1,834 handcrafted features, comprising 14 geometric attributes, 360 intensity-related measurements, and 1,460 textural descriptors. All features were obtained using the built-in routines of PyRadiomics (http://pyradiomics.readthedocs.io).
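Extraction of handcrafted features of this kind is usually driven by a PyRadiomics extractor. The sketch below is illustrative only: the bin width, the enabled filtered image types, and the file paths are assumptions, since the study does not report its exact extractor configuration.

```python
from radiomics import featureextractor

# Illustrative extractor settings; the study's exact configuration is not reported.
settings = {"binWidth": 25, "interpolator": "sitkBSpline"}
extractor = featureextractor.RadiomicsFeatureExtractor(**settings)

# Enable the feature classes named in the text (shape, first-order, GLCM, GLRLM,
# GLSZM, NGTDM) plus wavelet/LoG filtered images to approach the reported feature count.
extractor.enableImageTypes(Original={}, Wavelet={}, LoG={"sigma": [1.0, 3.0, 5.0]})
extractor.enableAllFeatures()

# image_path / mask_path point to the LDCT volume and the delineated ROI mask.
features = extractor.execute("image_path.nii.gz", "mask_path.nii.gz")
numeric_features = {k: v for k, v in features.items()
                    if not k.startswith("diagnostics")}
```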

The extracted features were standardized using Z-score normalization to ensure comparability. We then performed t-tests to assess statistical significance, keeping only those features with p-values under the 0.05 threshold. To address potential multicollinearity issues, we calculated Pearson’s correlation coefficients between feature pairs and eliminated any with correlations surpassing 0.9. The feature selection process was finalized through the least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation, which helped determine the ideal regularization parameter (λ). This approach produced a streamlined collection of highly predictive and meaningful features.
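The selection cascade just described (z-scoring, two-sample t-tests at p < 0.05, pruning of feature pairs with |r| > 0.9, then LASSO with 10-fold cross-validation) can be sketched as follows; the helper name and data layout are assumed.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

def select_features(X: pd.DataFrame, y: np.ndarray):
    """X: patients x radiomics features; y: 1 = recurrence, 0 = no recurrence."""
    # 1. Z-score normalization so features are on a comparable scale.
    Xz = pd.DataFrame(StandardScaler().fit_transform(X), columns=X.columns)

    # 2. Keep features whose group means differ (two-sample t-test, p < 0.05).
    pvals = Xz.apply(lambda col: ttest_ind(col[y == 1], col[y == 0]).pvalue)
    Xz = Xz.loc[:, pvals < 0.05]

    # 3. Drop one member of every feature pair with |Pearson r| > 0.9.
    corr = Xz.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape), k=1).astype(bool))
    redundant = [c for c in upper.columns if (upper[c] > 0.9).any()]
    Xz = Xz.drop(columns=redundant)

    # 4. LASSO with 10-fold cross-validation to pick the regularization strength.
    lasso = LassoCV(cv=10, random_state=0).fit(Xz, y)
    selected = Xz.columns[lasso.coef_ != 0]
    return Xz[selected], lasso.alpha_
```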

Machine learning algorithms, including Logistic Regression (LR) and Support Vector Machine (SVM), among others, were used to construct the radiomics risk models. Comparative analyses were conducted to assess the performance of each model.

2.6 Deep learning procedure

We used the maximum cross-section of the ROI for each case as the representative image. To simplify the algorithmic analysis and minimize background noise, we retained only the smallest bounding rectangle encompassing the ROI, expanded by an additional 10 pixels based on recent research emphasizing the significance of peritumoral regions.
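A simple NumPy sketch of this cropping step is given below: it picks the slice with the largest tumour cross-section and crops to the ROI bounding box expanded by a 10-pixel margin. The array layout (slices on the first axis) is an assumption.

```python
import numpy as np

def crop_max_slice(volume: np.ndarray, mask: np.ndarray, margin: int = 10):
    """Select the slice with the largest ROI area and crop it to the ROI
    bounding box expanded by `margin` pixels (peritumoral context)."""
    # Slice index with the maximum tumour cross-sectional area.
    areas = mask.sum(axis=(1, 2))
    k = int(np.argmax(areas))
    img, msk = volume[k], mask[k]

    rows, cols = np.any(msk, axis=1), np.any(msk, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]

    # Expand by the margin while staying inside the image bounds.
    r0, c0 = max(r0 - margin, 0), max(c0 - margin, 0)
    r1 = min(r1 + margin, img.shape[0] - 1)
    c1 = min(c1 + margin, img.shape[1] - 1)
    return img[r0:r1 + 1, c0:c1 + 1]
```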

We performed Z-score normalization on the images input into the model to unify the intensity distribution across the RGB channels. During the training process, a real-time data augmentation strategy was also employed, including methods such as random cropping, horizontal flipping, and vertical flipping. For validation images, we limited processing to normalization.
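Assuming the cropped patches are fed to a torchvision-style input pipeline, the augmentation and normalization described above might look like the following; the crop size and channel statistics are illustrative assumptions.

```python
from torchvision import transforms

# Channel-wise statistics computed on the training images (illustrative values).
MEAN, STD = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),      # random cropping
    transforms.RandomHorizontalFlip(),      # horizontal flipping
    transforms.RandomVerticalFlip(),        # vertical flipping
    transforms.ToTensor(),
    transforms.Normalize(MEAN, STD),        # per-channel Z-score style normalization
])

val_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(MEAN, STD),        # validation images: normalization only
])
```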

During the research process, we explored the performance of the classic VGG11, ResNet18, ViT, DenseNet121 and ResNet50 networks. A comparison of these models identified ResNet50 as the algorithm best aligned with the research goals (Figure 3).


Figure 3. ROC curves of the different deep learning models in the training (A), internal validation (B) and external validation (C) cohorts.

In this study, transfer learning was employed to ensure the model’s effectiveness despite the limited dataset and the heterogeneity among patients. This approach starts by loading pre-trained ImageNet weights to improve the model’s adaptability to different datasets. An important component of our approach is the continuous adjustment of the learning rate to optimize generalization performance; to achieve this, we utilized a cosine decay learning rate schedule:

$$\eta_t = \eta_{min}^{i} + \frac{1}{2}\left(\eta_{max}^{i} - \eta_{min}^{i}\right)\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right)$$

The minimum learning rate is initialized as $\eta_{min}^{i}=0$, whereas the upper bound is fixed at $\eta_{max}^{i}=0.01$. Here, $T_i=30$ represents the total epoch count during iterative model training. Other key hyperparameters include Stochastic Gradient Descent (SGD) for optimization and softmax cross-entropy serving as the objective function.
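Putting the stated hyperparameters together (ImageNet-pretrained ResNet50, SGD with $\eta_{max}^{i}=0.01$, cosine decay to $\eta_{min}^{i}=0$ over $T_i=30$ epochs, softmax cross-entropy), a PyTorch training sketch could look as follows; the data loader and the two-class output head are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

def fine_tune_resnet50(train_loader, epochs=30, lr_max=0.01):
    """Fine-tune an ImageNet-pretrained ResNet50 for binary recurrence prediction."""
    model = models.resnet50(pretrained=True)        # newer torchvision: weights=...
    model.fc = nn.Linear(model.fc.in_features, 2)   # recurrence vs. non-recurrence

    criterion = nn.CrossEntropyLoss()                # softmax cross-entropy
    optimizer = torch.optim.SGD(model.parameters(), lr=lr_max)
    # Cosine decay from lr_max down to eta_min = 0 over T_i = `epochs` epochs.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs,
                                                           eta_min=0.0)
    for _ in range(epochs):
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()                             # update the learning rate per epoch
    return model
```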

Feature early fusion: we integrated the deep learning features ($features_{DTL}$) and the handcrafted radiomics features ($features_{rad}$) using a feature concatenation operation ($\oplus$), combining them into a single comprehensive feature vector:

$$feature_{fusion} = features_{DTL} \oplus features_{rad}$$

We obtained 2048-dimensional deep features from the penultimate layer of the trained ResNet50 model, and also extracted 1834-dimensional handcrafted radiomics features from the lesion area, covering shape, first-order and texture features. To keep the downstream model compact, we applied principal component analysis (PCA) to the 2048-dimensional deep features and compressed them into 512 dimensions. We then standardized the 512-dimensional deep features and the 1834-dimensional radiomics features with the Z-score method. Finally, the processed feature vectors were concatenated into a unified 2346-dimensional fusion vector, which was fed directly to the downstream logistic regression classifier for training and testing. We chose this early fusion strategy so that the classifier could simultaneously exploit the complementary information contained in the deep features and the handcrafted features. These preprocessing steps, including PCA and standardization, are critical for achieving balanced and effective fusion.
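A sketch of this early-fusion step, following the dimensions given in the text (2048-dimensional deep features compressed by PCA toward 512 components, 1834 handcrafted features, z-scoring of both blocks, concatenation, logistic regression); the function and variable names are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

def build_dlr_classifier(deep_train, rad_train, y_train):
    """deep_train: (n, 2048) penultimate-layer ResNet50 features;
    rad_train: (n, 1834) handcrafted radiomics features; y_train: recurrence labels."""
    # Compress deep features to (at most) 512 principal components; sklearn caps
    # the number of components at the number of training samples.
    pca = PCA(n_components=min(512, deep_train.shape[0])).fit(deep_train)
    deep_pc = pca.transform(deep_train)

    # Z-score each feature block independently before fusing.
    sc_deep = StandardScaler().fit(deep_pc)
    sc_rad = StandardScaler().fit(rad_train)

    # Early fusion: concatenate into a single feature vector per patient
    # (512 + 1834 = 2346 dimensions in the setting described in the text).
    fused = np.concatenate([sc_deep.transform(deep_pc),
                            sc_rad.transform(rad_train)], axis=1)

    clf = LogisticRegression(max_iter=1000).fit(fused, y_train)
    return pca, sc_deep, sc_rad, clf
```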

PCA transforms the original variables into a set of uncorrelated principal components by identifying the orthogonal directions that capture the greatest variance in the data. This approach is reliable because it minimizes reconstruction error while retaining most of the information, and it also reduces the influence of redundant features and noise. To test the reliability of the selected features, we performed bootstrap validation with 500 resamples. In each resample we recorded which features were selected and then computed the Jaccard stability index, which was 0.78. Eight features were particularly stable, being selected in more than 80% of the resamples. We also found that when LASSO was used for feature selection with the regularization parameter λ constrained to the range [0.01, 0.1], the selected feature combinations were almost identical across resamples, whereas wrapper-based methods were highly sensitive to their initial parameter settings. Finally, we plotted a stability-performance trade-off curve and found that a selection frequency threshold of 0.75 yielded the most stable feature combination without degrading model performance, as the AUC remained above 0.85.
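The stability analysis described above (500 bootstrap resamples, recording the LASSO-selected feature set in each, then summarising overlap with a Jaccard index and per-feature selection frequencies) can be sketched as follows; this is an assumed re-implementation, not the authors' code.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LassoCV
from sklearn.utils import resample

def bootstrap_feature_stability(X, y, n_boot=500, seed=0):
    """X: (n_samples, n_features) array; y: binary recurrence labels.
    Returns the mean pairwise Jaccard index of the selected feature sets
    and the selection frequency of each feature."""
    rng = np.random.RandomState(seed)
    selected_sets, counts = [], np.zeros(X.shape[1])

    for _ in range(n_boot):
        Xb, yb = resample(X, y, random_state=rng)       # bootstrap resample
        coef = LassoCV(cv=10).fit(Xb, yb).coef_
        idx = set(np.flatnonzero(coef))                  # non-zero LASSO coefficients
        selected_sets.append(idx)
        counts[list(idx)] += 1

    # Average Jaccard similarity over all pairs of bootstrap selections.
    jaccards = [len(a & b) / len(a | b)
                for a, b in combinations(selected_sets, 2) if a | b]
    return float(np.mean(jaccards)), counts / n_boot
```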

2.7 Signature building

Within the model, the CNN’s output probabilities are referred to as the deep learning (DTL) signature.

We constructed the DLR signature through an early-fusion (pre-fusion) algorithm: deep learning features were first fused with radiomics features, and then feature selection and model building followed the standard procedures of traditional radiomics.

To strengthen the practical utility of our findings, we performed both univariable and stepwise multivariable analyses of the clinical variables to pinpoint statistically significant predictors. These key features were then merged with the predictions generated by our DLR model to construct a linear LR model, ultimately yielding what we term the Combined Signature. For enhanced clinical interpretation, this composite score was presented visually through an easy-to-use nomogram.
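A minimal sketch of how the retained clinical predictors and the DLR output could be merged into the Combined Signature; the column names are hypothetical and stand in for the variables reported as significant in the results.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def build_combined_signature(df: pd.DataFrame, y,
                             clinical_cols=("pathological_type", "nodule_type")):
    """df holds the clinical predictors retained after univariable/multivariable
    screening (hypothetical column names) plus 'dlr_prob', the DLR model output."""
    X = df[list(clinical_cols) + ["dlr_prob"]]
    model = LogisticRegression(max_iter=1000).fit(X, y)
    # The linear predictor of this model is what the nomogram visualizes.
    return model
```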

2.8 Statistical analysis

All data analyses were performed using Python 3.7.12 on the OnekeyAI platform (version 4.9.1). For clinical characteristics, continuous variables are expressed as mean ± standard deviation, while categorical variables are presented as percentages. Table 1 shows the clinical baseline characteristics. Independent risk factors were identified through univariate and multivariate analyses. The DLR model was built using LASSO regression and evaluated with ROC analysis. For both the clinical model and the DLR model, the optimal model was determined from the ROC curve. The calibration of the models was assessed using calibration curves, and the clinical utility was evaluated using DCA curves. The DeLong test was employed to assess the differences among models. All reported p-values are from two-tailed tests, with a p-value less than 0.05 considered statistically significant.
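Decision curve analysis reduces to evaluating the net benefit of a model across threshold probabilities; the standard calculation is sketched below (a generic re-implementation, not code released with the study).

```python
import numpy as np

def net_benefit(y_true, y_prob, thresholds=np.linspace(0.01, 0.99, 99)):
    """Net benefit NB(t) = TP/N - FP/N * t / (1 - t) at each threshold t.
    The 'treat all' reference is prevalence - (1 - prevalence) * t / (1 - t);
    'treat none' is zero everywhere."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    n = len(y_true)
    nb = []
    for t in thresholds:
        pred = y_prob >= t                      # classify positive above the threshold
        tp = np.sum(pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        nb.append(tp / n - fp / n * t / (1 - t))
    return thresholds, np.array(nb)
```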


Table 1. Baseline characteristics of the cohorts.

3 Results

3.1 Clinical features

We performed detailed univariate and multivariate evaluations of the clinical characteristics, which were crucial in developing our final fusion model. Significant features identified through univariable and multivariable screening were used to construct the Clinical Signature.

Univariate and multivariate analyses identified sex, squamous cell carcinoma-associated antigen (SCCA), pathological type and nodule type as independent risk factors for postoperative recurrence of stage Ia LUAD (Table 2).


Table 2. Univariable and multivariable analysis of clinical features.

3.2 Radiomics signature

We assessed model discrimination across cohorts using the area under the curve (AUC) as the key measure. In the internal validation cohort, LR showed the best performance (AUC = 0.809), outperforming SVM (AUC = 0.801) and Adaptive Boosting (AdaBoost) (AUC = 0.795). The AUC values for k-Nearest Neighbor (KNN) and Extreme Gradient Boosting (XGBoost) were relatively lower and close to each other, at 0.761 and 0.772, respectively (Table 3).


Table 3. Metric results for the radiomics signature.

The AUC analysis indicates that LR performs well in both the training and validation cohorts, maintaining robust discriminative capability. While all models exhibit a decline in AUC from the training to the validation cohorts, LR shows the highest AUC, suggesting good generalizability and stability. The decline in AUC for all models in the validation cohorts highlights the challenge of maintaining performance on unseen data and emphasizes the need for robust validation and strategies to mitigate overfitting.

3.3 Grad-CAM visualization

We employed the Gradient-weighted Class Activation Mapping (Grad-CAM) technique to visualize the decision-making basis of the deep learning model, in order to investigate its recognition capability for different classes of cases. Figure 4 presents the visualization results based on Grad-CAM, where the highlighted areas indicate the image features on which the model’s decision is based, and these features are associated with postoperative recurrence of stage Ia LUAD. By identifying image regions that are crucial for the model’s decision-making, this analysis enhances the interpretability of the model, thereby improving its reliability and clinical trustworthiness as an auxiliary tool.


Figure 4. Grad-CAM visualization of the ResNet50 model applied to stage Ia LUAD; the red areas indicate the regions on which the ResNet50 decision is based.

3.4 Signature comparison

The AUC values for the various signatures indicate different levels of discriminatory power. The Combined signature outperformed all others, achieving the highest AUCs in the training (0.972), internal validation (0.925) and external validation (0.915) cohorts, clear evidence of its ability to differentiate between classes. The DLR signature followed, with AUCs of 0.885 (training), 0.853 (internal validation) and 0.803 (external validation), indicating robust classification performance. The Rad signature delivered similar results, with AUCs of 0.876, 0.809 and 0.796 in the respective cohorts. The Clinic signature showed the weakest discrimination, with AUCs of 0.802 (training), 0.741 (internal validation) and 0.807 (external validation). The DTL signature sat in the middle, with AUCs of 0.844, 0.758 and 0.810, reflecting moderate predictive performance (Table 4 and Figure 5).


Table 4. Metrics of the different signatures.


Figure 5. ROC curves of the different models in the training (A), internal validation (B) and external validation (C) cohorts.

Based on the AUC analysis, the “Combined” signature emerges as the most robust indicator, demonstrating high discriminatory power in both training and validation cohorts. The “DLR” and “Rad” signatures also exhibit strong performance, although slightly lower than the “Combined” signature. The “Clinic” signature, in contrast, displays the weakest discriminatory ability. These findings highlight the potential utility of the “Combined” signature in clinical or diagnostic applications, given its superior ability to differentiate between classes. These preliminary results require further validation in larger and more representative populations.

Figure 6 compares the calibration curves of the models; the results show that the combined model achieved the best calibration.


Figure 6. Calibration curves of the different signatures in the training (A), internal validation (B) and external validation (C) cohorts.

To evaluate the statistical significance of performance differences, the DeLong test was applied to compare the models on the training and validation datasets (Figure 7). Our combined model exhibited a significant enhancement compared to other models.


Figure 7. DeLong test results in the training (A), internal validation (B) and external validation (C) cohorts.

3.5 Clinical use

DCA: Figure 8 illustrates the decision curve analysis for the training, internal and external validation cohorts. The findings reveal that our combined model provides considerable improvements in predicting the likelihood of recurrence in stage Ia LUAD following surgery. Moreover, it consistently outperforms other signatures by delivering a greater net benefit, underscoring its effectiveness.


Figure 8. DCA curves of the different models in the training (A), internal validation (B) and external validation (C) cohorts.

3.6 Construction of nomogram

We established a final model that integrates the independent clinical risk factors with the predictions of the DLR model, visualized as a deep learning radiomics nomogram (DLRN). The nomogram indicates that the DLR factor plays a crucial role in the stratified prediction of postoperative recurrence risk (Figure 9).


Figure 9. The nomogram for predicting postoperative recurrence of stage Ia LUAD.

4 Discussion

Our study reveals that DLR features based on LDCT images exhibit distinct efficacy compared with clinical feature models in predicting postoperative recurrence, and that integrating them with clinical features further enhances the predictive value for postoperative recurrence in patients with stage Ia LUAD. The nomogram integrating radiomic features, deep learning features, and independent clinical risk factors demonstrated outstanding diagnostic performance in the training, internal validation and external validation cohorts. This validates the incremental value of the combined model in association with individualized disease-free survival in patients with stage Ia LUAD after surgery.

Radiomics has emerged as a continually evolving key direction in medical image analysis, providing a novel way to convert medical images into quantitative features that can reveal tumor-related biological information; analyzing these features can help optimize clinical decision-making (9). The technology transcends the limitations of human visual perception, achieving a comprehensive representation of tumor-related information by capturing more detailed heterogeneous characteristics of tumors (24). In a study led by Su and colleagues, a fusion of clinical and radiomics techniques anchored by CT scans was used to gauge the likelihood of bone metastasis in LUAD patients; the integrated approach provided excellent diagnostic performance (AUC of 0.866, 95% confidence interval 0.786 to 0.947) and has the potential to enhance individualized treatment planning (25). Xie et al. constructed and validated a radiomics nomogram for prognostic prediction and for identifying surgical patients with stage I LUAD who may benefit from adjuvant chemotherapy (26). Zhang et al. found that radiomics has the potential to assess the prognosis of patients with pathological stage I LUAD (≤3 cm), which may take a step forward in precision medicine (27).

In addition, an increasing number of scholars are utilizing deep learning techniques to predict the prognosis of patients undergoing surgery for LUAD and have noted the practicality of this method (28–30). For instance, Sasaki et al. employed deep convolutional neural networks (DCNNs) to predict postoperative recurrence of LUAD from preoperative CT images, confirming that DCNNs utilizing preoperative CT images can effectively predict postoperative recurrence in patients undergoing LUAD surgery (13). Zhu et al. confirmed that deep learning algorithms can replace manual methods for tumor measurement and outperform manual measurement in the prognostic stratification of patients with LUAD (31). In a study by Wang et al. employing radiomics to predict early recurrence, peritumoural deep learning features were extracted and a radiomics model incorporating tumour spread through air spaces was established; the results demonstrated its potential in guiding adjuvant postoperative treatment strategies (32). Peng et al. applied the ResNet50 architecture to evaluate the major pathological response (MPR) in patients with lung squamous cell carcinoma (LUSC) after neoadjuvant chemoimmunotherapy (NCI), dividing the data into a discovery set (n=200), validation set 1 (n=60), and validation set 2 (n=49). The ResNet50 model trained on enhanced CT images was developed and validated for predicting MPR, achieving AUCs of 0.95 and 0.90 in the first and second validation sets, respectively (33).

An increasing number of studies are integrating DTL classification networks with conventional handcrafted radiomics features (34, 35). However, the use of DLR features based on LDCT images to assess postoperative recurrence in patients with stage Ia LUAD has not previously been reported. In this study, the deep learning-radiomics-clinical combined model we proposed yielded results superior to those previously reported, indicating that the risk of postoperative recurrence in patients undergoing surgery for stage Ia LUAD can be predicted from preoperative LDCT images with a combined model. Our approach requires only LDCT plain scan images, which we believe is a unique advantage because it enables individualized, noninvasive preoperative assessment of postoperative recurrence.

Univariate and multivariate regression analyses revealed that sex, SCCA level, pathological type, and nodule type are independent predictors of postoperative recurrence in patients with stage Ia LUAD and can be used to establish a clinical model. Miyoshi et al. retrospectively analyzed the clinicopathological characteristics of 809 patients with stage Ia LUAD and found that patients with ground-glass opacity (GGO) nodules had significantly higher survival rates than those without GGO components (5-year overall survival rate: 97% vs. 84%, p < 0.0001); across the T1a, T1b, and T1c stages, patients with ground-glass nodules exhibited higher survival rates than those with solid nodules (36). Moreira et al. confirmed that a high-grade subtype (grade 3, predominantly solid/micropapillary components) is a stronger predictor of recurrence than the TNM stage alone (37), signifying that the pathological subtype has evolved from a qualitative description into a quantitative, standardized prognostic grading tool. This indicates that pathological type and nodule type have a significant impact on the prognosis of lung adenocarcinoma, which was also confirmed in this study. To date, no studies have reported sex and SCCA as independent risk factors for recurrence in LUAD; as these factors were not retained as significant in our subsequent feature assessment, they were excluded from the final model construction. We therefore developed a nomogram that combines the independent predictive factors with the DLR features to predict recurrence in patients with stage Ia LUAD. Notably, compared with the single deep learning, radiomics, and clinical models, the nomogram demonstrates a significant improvement in AUC. It can provide a valuable therapeutic window for patients with stage Ia LUAD who may experience recurrence and aids in formulating more rational and effective treatment plans.

Furthermore, the DeLong test was applied to the AUCs of the models. In the training cohort, the AUC of the combined model differed significantly from that of the clinical model. These results indicate that the deep learning-radiomics-clinical combined model performs better than the single models.

The limitations of this study are that it is a retrospective analysis and that the dataset is small. A small sample size is a prevalent and significant challenge in medical imaging research; it can lead to inadequate statistical power, unreliable results, distorted estimates of model performance, and failure of internal validation. While an ideal design would be a prospective longitudinal cohort study (38), conducting such a study in patients with stage Ia LUAD faces significant challenges because of the long waiting period required for survival outcome data. Although generalisability would be more convincingly demonstrated through large-scale, independent, prospective studies, our DCA assesses clinical relevance and confirms that the identified nomogram holds significant potential for clinical application in predicting postoperative outcomes. Moreover, these results were also well validated in the external validation cohort.

In summary, the identified DLR features predict postoperative recurrence in stage Ia LUAD, a finding confirmed through external validation. Compared with other clinical risk factors, the DLRN described herein plays a pivotal role in the personalized assessment of postoperative recurrence. Although the conclusions of this small-scale study are preliminary and intended mainly to stimulate new ideas and directions, the critical next step is comprehensive verification through large-scale, multi-center prospective studies. Our main goal is to establish a reliable imaging-based index or framework, test it widely in real-world settings, and ultimately provide a solid basis for its future inclusion in clinical decision support tools.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding authors.

Ethics statement

The studies involving humans were approved by Liuzhou Workers’ Hospital and Guangxi Medical University Cancer Hospital, Guangxi, China (Approval No.: KY2025613, KY20251082). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

HML: Data curation, Formal analysis, Methodology, Writing – original draft. CW: Data curation, Formal analysis, Investigation, Writing – original draft. YL: Data curation, Formal analysis, Software, Writing – original draft. ML: Methodology, Software, Supervision, Writing – original draft. HFL: Methodology, Software, Supervision, Writing – original draft. JQ: Methodology, Resources, Supervision, Writing – original draft. JY: Methodology, Software, Supervision, Writing – original draft. FX: Methodology, Software, Supervision, Writing - original draft. DH: Data curation, Investigation, Methodology, Software, Writing – original draft. MZ: Investigation, Software, Supervision, Writing – original draft. QF: Resources, Validation, Visualization, Writing – review & editing. TL: Project administration, Resources, Validation, Visualization, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This study was supported by three Liuzhou Science and Technology Projects (Grant numbers: 2019BJ10607/2025SB0501A001/2025RB0302A020).

Acknowledgments

Some of our experiments were carried out on the OnekeyAI platform. We thank OnekeyAI and its developers for their scientific research work.

Conflict of interest

The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Siegel RL, Giaquinto AN, and Jemal A. Cancer statistics, 2024. CA Cancer J Clin. (2024) 74:12–49. doi: 10.3322/caac.21820


2. Loo PS, Thomas SC, Nicolson MC, Fyfe MN, and Kerr KM. Subtyping of undifferentiated non-small cell carcinomas in bronchial biopsy specimens. J Thorac Oncol. (2010) 5:442–7. doi: 10.1097/JTO.0b013e3181d40fac


3. Nicholson AG, Gonzalez D, Shah P, Pynegar MJ, Deshmukh M, Rice A, et al. Refining the diagnosis and EGFR status of non-small cell lung carcinoma in biopsy and cytologic material, using a panel of mucin staining, TTF-1, cytokeratin 5/6, and P63, and EGFR mutation analysis. J Thorac Oncol. (2010) 5:436–41. doi: 10.1097/JTO.0b013e3181c6ed9b


4. De Luca GR, Diciotti S, and Mascalchi M. The pivotal role of baseline LDCT for lung cancer screening in the era of artificial intelligence. Arch Bronconeumol. (2025) 61:359–67. doi: 10.1016/j.arbres.2024.11.001


5. Benzaquen J, Hofman P, Lopez S, Leroy S, Rouis N, Padovani B, et al. Integrating artificial intelligence into lung cancer screening: a randomised controlled trial protocol. BMJ Open. (2024) 14:e074680. doi: 10.1136/bmjopen-2023-074680


6. Lee JH, Chae KJ, Lu MT, Chang YC, Lee S, Goo JM, et al. External testing of a deep learning model for lung cancer risk from low-dose chest CT. Radiology. (2025) 316:e243393. doi: 10.1148/radiol.243393


7. Chansky K, Detterbeck FC, Nicholson AG, Rusch VW, Vallières E, Groome P, et al. The IASLC lung cancer staging project: external validation of the revision of the TNM stage groupings in the eighth edition of the TNM classification of lung cancer. J Thorac Oncol. (2017) 12:1109–21. doi: 10.1016/j.jtho.2017.04.011


8. Zeng D, Chen Z, Li M, Yi Y, Hu Z, Valeria B, et al. Survival benefit of surgery vs radiotherapy alone to patients with stage IA lung adenocarcinoma: a propensity score-matched analysis. Eur J Med Res. (2025) 30:173. doi: 10.1186/s40001-025-02436-3


9. Gillies RJ, Kinahan PE, and Hricak H. Radiomics: images are more than pictures, they are data. Radiology. (2016) 278:563–77. doi: 10.1148/radiol.2015151169


10. Zwanenburg A, Vallières M, Abdalah MA, Aerts Hjwl, Andrearczyk V, Apte A, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. (2020) 295:328–38. doi: 10.1148/radiol.2020191145


11. Choe J, Lee SM, Do KH, Kim S, Choi S, Lee JG, et al. Outcome prediction in resectable lung adenocarcinoma patients: value of CT radiomics. Eur Radiol. (2020) 30:4952–63. doi: 10.1007/s00330-020-06872-z


12. Gao W, Wang W, Song D, Yang C, Zhu K, Zeng M, et al. A predictive model integrating deep and radiomics features based on gadobenate dimeglumine-enhanced MRI for postoperative early recurrence of hepatocellular carcinoma. Radiol Med. (2022) 127:259–71. doi: 10.1007/s11547-021-01445-6


13. Sasaki Y, Kondo Y, Aoki T, Koizumi N, Ozaki T, Seki H, et al. Use of deep learning to predict postoperative recurrence of lung adenocarcinoma from preoperative CT. Int J Comput Assist Radiol Surg. (2022) 17:1651–61. doi: 10.1007/s11548-022-02694-0


14. Zhang Y, Hong D, McClement D, Oladosu O, Pridham G, Slaney G, et al. Grad-CAM helps interpret the deep learning models trained to classify multiple sclerosis types using clinical brain magnetic resonance imaging. J Neurosci Methods. (2021) 353:109098. doi: 10.1016/j.jneumeth.2021.109098


15. Bo L, Zhang Z, Jiang Z, Yang C, Huang P, Chen T, et al. Differentiation of brain abscess from cystic glioma using conventional MRI based on deep transfer learning features and hand-crafted radiomics features. Front Med (Lausanne). (2021) 8:748144. doi: 10.3389/fmed.2021.748144


16. Feng B, Huang L, Liu Y, Chen Y, Zhou H, Yu T, et al. A transfer learning radiomics nomogram for preoperative prediction of borrmann type IV gastric cancer from primary gastric lymphoma. Front Oncol. (2021) 11:802205. doi: 10.3389/fonc.2021.802205


17. Ning Z, Luo J, Li Y, Han S, Feng Q, Xu Y, et al. Pattern classification for gastrointestinal stromal tumors by integration of radiomics and deep convolutional features. IEEE J BioMed Health Inform. (2019) 23:1181–91. doi: 10.1109/jbhi.2018.2841992


18. Paul R, Hawkins SH, Schabath MB, Gillies RJ, Hall LO, and Goldgof DB. Predicting Malignant nodules by fusing deep features with classical radiomics features. J Med Imaging (Bellingham). (2018) 5:11021. doi: 10.1117/1.Jmi.5.1.011021


19. Tan AC and Tan DSW. Targeted therapies for lung cancer patients with oncogenic driver molecular alterations. J Clin Oncol. (2022) 40:611–25. doi: 10.1200/jco.21.01626


20. Zhang X, Lu B, Yang X, Lan D, Lin S, Zhou Z, et al. Prognostic analysis and risk stratification of lung adenocarcinoma undergoing EGFR-TKI therapy with time-serial CT-based radiomics signature. Eur Radiol. (2023) 33:825–35. doi: 10.1007/s00330-022-09123-5


21. Gong J, Bao X, Wang T, Liu J, Peng W, Shi J, et al. A short-term follow-up CT based radiomics approach to predict response to immunotherapy in advanced non-small-cell lung cancer. Oncoimmunology. (2022) 11:2028962. doi: 10.1080/2162402x.2022.2028962


22. Nardone V, Boldrini L, Grassi R, Franceschini D, Morelli I, Becherini C, et al. Radiomics in the setting of neoadjuvant radiotherapy: A new approach for tailored treatment. Cancers (Basel). (2021) 13:20210717. doi: 10.3390/cancers13143590


23. Potter AL, Costantino CL, Suliman RA, Haridas CS, Senthil P, Kumar A, et al. Recurrence after complete resection for non-small cell lung cancer in the national lung screening trial. Ann Thorac Surg. (2023) 116:684–92. doi: 10.1016/j.athoracsur.2023.06.004


24. Aerts HJ. The potential of radiomic-based phenotyping in precision medicine: A review. JAMA Oncol. (2016) 2:1636–42. doi: 10.1001/jamaoncol.2016.2631


25. Su Q, Wang B, Guo J, Nie P, and Xu W. CT-based radiomics and clinical characteristics for predicting bone metastasis in lung adenocarcinoma patients. Transl Lung Cancer Res. (2024) 13:721–32. doi: 10.21037/tlcr-24-38


26. Xie D, Wang TT, Huang SJ, Deng JJ, Ren YJ, Yang Y, et al. Radiomics nomogram for prediction disease-free survival and adjuvant chemotherapy benefits in patients with resected stage I lung adenocarcinoma. Transl Lung Cancer Res. (2020) 9:1112–23. doi: 10.21037/tlcr-19-577


27. Zhang L, Lv L, Li L, Wang YM, Zhao S, Miao L, et al. Radiomics signature to predict prognosis in early-stage lung adenocarcinoma (≤3 cm) patients with no lymph node metastasis. Diagn (Basel). (2022) 12:20220806. doi: 10.3390/diagnostics12081907


28. Li G, Luo Q, Wang X, Zeng F, Feng G, and Che G. Deep learning reveals cuproptosis features assist in predict prognosis and guide immunotherapy in lung adenocarcinoma. Front Endocrinol (Lausanne). (2022) 13:970269. doi: 10.3389/fendo.2022.970269


29. Kim PJ, Hwang HS, Choi G, Sung HJ, Ahn B, Uh JS, et al. A new model using deep learning to predict recurrence after surgical resection of lung adenocarcinoma. Sci Rep. (2024) 14:6366. doi: 10.1038/s41598-024-56867-9


30. Lin X, Liu K, Li K, Chen X, Chen B, Li S, et al. A CT-based deep learning model: visceral pleural invasion and survival prediction in clinical stage IA lung adenocarcinoma. iScience. (2024) 27:108712. doi: 10.1016/j.isci.2023.108712


31. Zhu Y, Chen LL, Luo YW, Zhang L, Ma HY, Yang HS, et al. Prognostic impact of deep learning-based quantification in clinical stage 0-I lung adenocarcinoma. Eur Radiol. (2023) 33:8542–53. doi: 10.1007/s00330-023-09845-0


32. Wang Y, Ding Y, Liu X, Li X, Jia X, Li J, et al. Preoperative CT-based radiomics combined with tumour spread through air spaces can accurately predict early recurrence of stage I lung adenocarcinoma: a multicentre retrospective cohort study. Cancer Imaging. (2023) 23:83. doi: 10.1186/s40644-023-00605-3


33. Peng J, Xie B, Ma H, Wang R, Hu X, Huang Z, et al. Deep learning based on computed tomography predicts response to chemoimmunotherapy in lung squamous cell carcinoma. Aging Dis. (2024) 16:1674–90. doi: 10.14336/ad.2024.0169


34. Wei Z, Liu H, Xv Y, Liao F, He Q, Xie Y, et al. Development and validation of a CT-based deep learning radiomics nomogram to predict muscle invasion in bladder cancer. Heliyon. (2024) 10:e24878. doi: 10.1016/j.heliyon.2024.e24878


35. Zhang YY, Mao HM, Wei CG, Chen T, Zhao WL, Chen LY, et al. Development and validation of a biparametric MRI deep learning radiomics model with clinical characteristics for predicting perineural invasion in patients with prostate cancer. Acad Radiol. (2024) 31:5054–65. doi: 10.1016/j.acra.2024.07.013


36. Miyoshi T, Aokage K, Katsumata S, Tane K, Ishii G, and Tsuboi M. Ground-glass opacity is a strong prognosticator for pathologic stage IA lung adenocarcinoma. Ann Thorac Surg. (2019) 108:249–55. doi: 10.1016/j.athoracsur.2019.01.079


37. Moreira AL, Ocampo PSS, Xia Y, Zhong H, Russell PA, Minami Y, et al. A grading system for invasive pulmonary adenocarcinoma: A proposal from the international association for the study of lung cancer pathology committee. J Thorac Oncol. (2020) 15:1599–610. doi: 10.1016/j.jtho.2020.06.001


38. Park BJ, Kim TH, Shin S, Kim HK, Choi YS, Kim J, et al. Recommended change in the N descriptor proposed by the international association for the study of lung cancer: A validation study. J Thorac Oncol. (2019) 14:1962–9. doi: 10.1016/j.jtho.2019.07.034


Keywords: deep learning radiomics nomogram, low-dose computed tomography, lung adenocarcinoma, predictive model, ResNet50

Citation: Lan H, Wei C, Luo Y, Liao M, Liang H, Qin J, Yi J, Xu F, Huang D, Zhang M, Feng Q and Li T (2026) Development and validation of an LDCT-based deep learning radiomics nomogram for predicting postoperative recurrence of stage Ia lung adenocarcinoma. Front. Oncol. 15:1706104. doi: 10.3389/fonc.2025.1706104

Received: 15 September 2025; Accepted: 16 December 2025; Revised: 11 December 2025;
Published: 12 January 2026.

Edited by:

Sunyi Zheng, Tianjin Medical University Cancer Institute and Hospital, China

Reviewed by:

Ziye Yan, Zhejiang Normal University, China
Chenyang Xu, Shandong University, China

Copyright © 2026 Lan, Wei, Luo, Liao, Liang, Qin, Yi, Xu, Huang, Zhang, Feng and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Qing Feng, 245268195@qq.com; Tao Li, li966511@163.com

†These authors have contributed equally to this work and share first authorship
