ORIGINAL RESEARCH article
Front. Oncol.
Sec. Thoracic Oncology
This article is part of the Research Topic: Artificial Intelligence Advancing Lung Cancer Screening and Treatment
Machine Learning-Based Differentiation of Lung Squamous Cell Carcinoma and Adenocarcinoma Using Clinical-Semantic and Radiomic Features
Provisionally accepted
- 1 Tianjin Medical University Cancer Institute and Hospital, Tianjin, China
- 2 Tianjin University of Traditional Chinese Medicine, Tianjin, China
Purpose: To evaluate and compare the predictive performance of machine learning methods using clinical-semantic, radiomic, and combined features in distinguishing squamous cell carcinoma (SCC) from adenocarcinoma (ADC) in non-small cell lung cancer (NSCLC).
Methods: A total of 399 patients with pathologically confirmed NSCLC were retrospectively enrolled in 2017 and randomly divided into a training set (n=279) and a validation set (n=120). Clinical factors, semantic features, and radiomic features were collected and screened using the minimum redundancy maximum relevance (mRMR) method and the least absolute shrinkage and selection operator (LASSO). Three models (clinical-semantic, radiomic, and combined) were each constructed with four classifiers for histologic subtype prediction. The models were trained on the training cohort, and their performance was evaluated on the independent validation cohort using accuracy, sensitivity, specificity, F1 score, precision, and area under the receiver operating characteristic curve (AUC).
Results: After feature selection, 10 representative features were retained: 4 clinical-semantic and 6 radiomic. In the validation cohort, the support vector machine (SVM) classifier demonstrated promising predictive performance. The combined model integrating clinical-semantic and radiomic features (AUC = 0.871) showed potential in distinguishing NSCLC pathological subtypes, outperforming the models based solely on clinical-semantic features (AUC = 0.594) or radiomic features (AUC = 0.713). It achieved an accuracy of 0.892, a sensitivity of 0.758, a specificity of 0.943, an F1 score of 0.794, and a precision of 0.833. However, the AUC differences were not statistically significant, highlighting the need for further multi-center prospective validation.
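The validation metrics reported in the Results are all functions of a single confusion matrix, so they can be cross-checked against each other. The sketch below is a minimal illustration in Python. The counts `tp=25, fn=8, tn=82, fp=5` are an assumption: the abstract does not report a confusion matrix or state which subtype was treated as the positive class. They are simply one set of integers summing to the validation cohort size (n=120) that reproduces the five reported values after rounding.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall on the positive class
    specificity = tn / (tn + fp)            # recall on the negative class
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "precision": precision,
    }


# ASSUMPTION: these counts are inferred for illustration, not taken from the article.
# They sum to 120 (the validation cohort size given in the Methods).
counts = {"tp": 25, "fn": 8, "tn": 82, "fp": 5}
metrics = classification_metrics(**counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts the script prints 0.892, 0.758, 0.943, 0.794, and 0.833, matching the reported accuracy, sensitivity, specificity, F1 score, and precision. This shows only that the reported figures are mutually consistent; the actual confusion matrix should be taken from the full article.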
Conclusion: In this study, the SVM-based combined model, which integrated clinical-semantic and radiomic features, demonstrated promising performance among the combined models built with the four classifiers in distinguishing between ADC and SCC. However, due to the study's single-center, retrospective design and the lack of statistically significant differences in AUC for some models, the findings should be interpreted with caution. These results show potential but require future multi-center prospective validation before clinical application.
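Because the models are compared by AUC, it may help to recall what that number measures: the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. Unlike accuracy or sensitivity, it does not depend on a decision threshold. The following is a minimal pure-Python sketch of that definition. The function name `auc_from_scores` and the toy labels and scores are hypothetical and are not data from the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels; scores: classifier outputs,
    where a higher score means "more likely positive". Ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 3 positive and 4 negative cases.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
toy_auc = auc_from_scores(labels, scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the cohort size, which is acceptable for a validation set of 120 patients. Testing whether two AUCs differ significantly requires a separate procedure (for example, a paired comparison on the same cases), which this sketch does not attempt.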
Keywords: squamous cell carcinoma, adenocarcinoma, radiomics, machine learning, computed tomography
Received: 16 Oct 2025; Accepted: 11 Nov 2025.
Copyright: © 2025 Li, Yang, Zhang, Liang, Liu, Zheng, Wang and Ye. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Zhaoxiang Ye, zye@tmu.edu.cn
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
