Systematic Review of Artificial Intelligence and Radiomics for Preoperative Prediction of Extranodal Extension and Lymph Node Metastasis in Oropharyngeal Cancer

Stawarz, Katarzyna; Gorzelnik, Anna; Klos, Wojciech; Korzon, Jacek; Kissin, Filip; Bieńkowska-Pluta, Karolina; Stawarz, Grzegorz; Rusetska, Natalia; Zwolinski, Jakub

doi:10.3389/fonc.2025.1717641

SYSTEMATIC REVIEW article

Front. Oncol.

Sec. Head and Neck Cancer

Provisionally accepted

Katarzyna Stawarz¹*

Anna Gorzelnik¹

Wojciech Klos¹

Jacek Korzon¹

Filip Kissin¹

Karolina Bieńkowska-Pluta¹

Grzegorz Stawarz²

Natalia Rusetska¹

Jakub Zwolinski¹

¹Maria Sklodowska-Curie National Research Institute of Oncology, Warsaw, Poland
²Praski Hospital in Warsaw, Warsaw, Poland

The final, formatted version of the article will be published soon.

Background: Preoperative identification of extranodal extension (ENE) and cervical lymph node metastasis (LNM) in oropharyngeal cancer guides treatment escalation and de-escalation. Artificial intelligence (AI) and radiomics offer promise for nodal assessment, but clinical utility and reporting quality remain variable.

Methods: This systematic review followed PRISMA guidelines. We systematically searched PubMed, Scopus, and Web of Science for studies published between 2020 and 2025. Eleven studies were eligible: four core studies addressing ENE (n=2) or LNM prediction (n=2), and seven supportive studies on segmentation, lymphatic spread modeling, MRI radiomics, and outcomes modeling. Extracted variables included study characteristics, performance metrics, validation, calibration, and unit of analysis. Risk of bias was assessed using PROBAST; reporting quality was evaluated with TRIPOD. Due to heterogeneity and limited study numbers, no meta-analysis was performed; results were narratively synthesized. For ENE, we report study-level accuracy, decision-curve analysis (DCA), and per-1,000 management impact.

Results: All core studies were CT-based. The task-specific deep-learning (DL) ENE model achieved AUC 0.86 with balanced operating points, while the generalist large vision-language model (LVLM) reached sensitivity 1.00 with specificity 0.34. DCA favored the DL model across thresholds 0.10–0.40, showing fewer unnecessary dissections per 1,000 patients than a treat-all strategy or the LVLM. For LNM, discrimination was high (AUC 0.865–0.919), calibration was reported, and one study included external validation, though threshold-level sensitivity and specificity were not reported. External validation was reported in 25% of core studies, calibration in 50%; TRIPOD adherence was 74.5% overall, with frequent under-reporting of blinding and missing-data handling.

Conclusions: AI and radiomics show promising potential for preoperative prediction of ENE and LNM in oropharyngeal cancer. Task-specific deep-learning models achieve balanced discrimination, while generalist LVLMs provide high recall at lower specificity. For LNM, encouraging performance is reported, but limited external validation and absent standardized thresholds still preclude clinical use. Broader validation and harmonized reporting are essential before translation into practice.

Registration/Protocol: Not registered; methods followed PRISMA/TRIPOD/PROBAST guidance.
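The decision-curve comparison in the Results rests on the standard net-benefit formula, NB = TP/N - (FP/N) × pt/(1 - pt), where pt is the threshold probability. The Python sketch below is illustrative only and is not the review's analysis. It uses the one operating point stated in the abstract (LVLM: sensitivity 1.00, specificity 0.34) alongside a treat-all strategy. The ENE prevalence of 0.30 is a hypothetical placeholder, not a figure from the review, and the function names are invented for this example. The DL model is omitted because the abstract gives its AUC but not a specific operating point.

```python
"""Illustrative decision-curve arithmetic; not the review's actual analysis."""


def net_benefit(sensitivity, specificity, prevalence, threshold):
    """Net benefit of a binary decision rule at a given threshold probability.

    NB = TP/N - FP/N * pt / (1 - pt), with TP/N and FP/N expressed through
    sensitivity, specificity, and prevalence.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence                    # TP per patient
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # FP per patient
    return true_pos - false_pos * threshold / (1.0 - threshold)


def false_positives_per_1000(specificity, prevalence):
    """Expected false-positive calls per 1,000 patients assessed."""
    return 1000.0 * (1.0 - specificity) * (1.0 - prevalence)


# ASSUMPTION: hypothetical ENE prevalence, chosen only for illustration.
ASSUMED_PREVALENCE = 0.30

# (sensitivity, specificity) per strategy.
STRATEGIES = {
    "treat-all": (1.00, 0.00),  # every patient is managed as ENE-positive
    "LVLM": (1.00, 0.34),       # operating point stated in the abstract
}

if __name__ == "__main__":
    for pt in (0.10, 0.20, 0.30, 0.40):  # threshold range cited in the abstract
        for name, (sens, spec) in STRATEGIES.items():
            nb = net_benefit(sens, spec, ASSUMED_PREVALENCE, pt)
            fp = false_positives_per_1000(spec, ASSUMED_PREVALENCE)
            print(f"pt={pt:.2f}  {name:<9}  NB={nb:+.3f}  FP/1000={fp:.0f}")
```

Under the assumed 30% prevalence, the LVLM operating point corresponds to roughly 462 false-positive calls per 1,000 patients versus 700 for treat-all. These figures change with the prevalence chosen and should not be read as results of the review.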

Keywords: oropharyngeal cancer, head and neck cancer, extranodal extension, lymph node metastasis, radiomics, deep learning, decision-curve analysis, TRIPOD

Received: 02 Oct 2025; Accepted: 17 Nov 2025.

Copyright: © 2025 Stawarz, Gorzelnik, Klos, Korzon, Kissin, Bieńkowska-Pluta, Stawarz, Rusetska and Zwolinski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Katarzyna Stawarz, kasiaworek1@gmail.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.