ORIGINAL RESEARCH article

Front. Oncol.

Sec. Cancer Imaging and Image-directed Interventions

This article is part of the Research Topic: Artificial Intelligence Advancing Lung Cancer Screening and Treatment

Risk Factors for Misclassification in Predicting EGFR Mutation Status Using PET/CT Imaging Features in Non-Small Cell Lung Cancer Patients

Provisionally accepted
Jiali Li1, Zihang Zeng2, Jie Chen1, Tianxing Fang1, Hongjun Liu1, Yong He1*
  • 1Department of Nuclear Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China
  • 2Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, Wuhan, China

The final, formatted version of the article will be published soon.

Objective: This study aimed to develop 10 machine learning models based on PET/CT radiomic features to predict epidermal growth factor receptor (EGFR) mutations in patients with non-small cell lung cancer (NSCLC), and to identify risk factors contributing to model misclassification.

Methods: This study included 277 NSCLC patients from Zhongnan Hospital of Wuhan University who underwent pre-treatment 18F-FDG PET/CT and EGFR mutation testing. A PET-CT signature (PCS)-nomogram was developed by comparing 10 machine learning algorithms for EGFR prediction. Leave-one-out cross-validation generated model-specific EGFR mutation probabilities for each patient, and performance disparities were analyzed across clinical subgroups. Model performance was assessed using receiver operating characteristic (ROC) curves, Youden's index, decision curve analysis, and DeLong's test.

Results: The PCS-nomogram constructed with the partial least squares generalized linear model (plsRglm) algorithm achieved the best performance for predicting EGFR mutations in NSCLC patients (training cohort: AUC 0.80; validation cohort: AUC 0.82). Smoking history caused statistically significant performance deterioration in 7 of 10 machine learning models (|ΔYouden's index| ≥ 0.1). The PCS model demonstrated higher predictive performance in never-smokers than in smokers (AUC 0.90 vs. 0.64, P < 0.05).

Conclusion: A plsRglm-based PCS-nomogram was proposed for non-invasive prediction of EGFR mutations in NSCLC patients. Radiomics-based EGFR mutation prediction performed better in never-smokers than in smokers.
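
As a rough illustration of the subgroup comparison reported in the Results, the sketch below computes AUC and Youden's index (J = sensitivity + specificity - 1) separately for two subgroups and flags a gap of at least 0.1. It is a minimal pure-Python sketch, not the authors' pipeline: the function names, the toy labels and scores, and the choice to read ΔYouden's index as the difference between the two subgroups' maximal J are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_index(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, Optional[float]]:
    """Return (max J, threshold), predicting 'mutant' when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_j, best_t = -1.0, None
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0  # sensitivity + specificity - 1
        if j > best_j:
            best_j, best_t = j, t
    return best_j, best_t


# Hypothetical toy data (NOT from the study): 1 = EGFR-mutant, 0 = wild-type,
# scores = predicted mutation probabilities from some model.
never_labels = [1, 1, 1, 0, 0, 0]
never_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.4]
smoker_labels = [1, 1, 1, 0, 0, 0]
smoker_scores = [0.6, 0.5, 0.3, 0.4, 0.2, 0.7]

auc_never = auc(never_labels, never_scores)
auc_smoker = auc(smoker_labels, smoker_scores)
j_never, t_never = youden_index(never_labels, never_scores)
j_smoker, t_smoker = youden_index(smoker_labels, smoker_scores)

# Assumed definition: gap between the subgroups' maximal Youden's indices.
delta_j = abs(j_never - j_smoker)
flagged = delta_j >= 0.1  # threshold taken from the abstract

print(f"AUC never-smokers={auc_never:.2f}, smokers={auc_smoker:.2f}")
print(f"|delta J|={delta_j:.2f}, flagged={flagged}")
```

The sketch covers only the descriptive part of the comparison. Testing whether two AUCs differ significantly, as the abstract does with DeLong's test, needs a dedicated statistical routine and is not shown here.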

Keywords: EGFR mutation, misclassification, non-small cell lung cancer, 18F-FDG PET/CT, radiomics

Received: 10 Sep 2025; Accepted: 14 Nov 2025.

Copyright: © 2025 Li, Zeng, Chen, Fang, Liu and He. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Yong He, vincentheyong@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.