ORIGINAL RESEARCH article

Front. Immunol., 05 June 2025

Sec. Cancer Immunity and Immunotherapy

Volume 16 - 2025 | https://doi.org/10.3389/fimmu.2025.1580200

Machine learning for predicting distant metastasis in nasopharyngeal carcinoma patients

Hong Sun1, Jijie Zhu2, Ling Li1, Xiu Xin1, Jingchao Yan1*, Taomin Huang1*
  • 1Department of Pharmacy, Eye & ENT Hospital, Fudan University, Shanghai, China
  • 2Shanghai University of Medicine & Health Sciences, Shanghai, China

Background: Distant metastasis is the main cause of treatment failure and death in patients with nasopharyngeal carcinoma (NPC). The aim of this study was to explore the risk factors for distant metastasis in NPC patients using machine learning (ML) methods.

Methods: We collected data from NPC patients who were treated at the Eye Ear Nose Throat Hospital of Fudan University between September 2017 and June 2024. Seven ML methods were employed to construct the predictive models. By comparing the predictive performance of different ML models, the best one was selected to establish a predictive model for distant metastasis of NPC. The SHapley Additive exPlanation (SHAP) method was utilized to ascertain the ranking of feature importance and to provide explanations for the predictive model.

Results: A total of 1,845 NPC patients were included in this study. Among the seven models, Logistic Regression (LR) performed best in the test dataset (Area Under the ROC Curve [AUC] = 0.8499). SHAP analysis indicated that the most important variables for distant metastasis in NPC patients were targeted therapy, immunotherapy, N stage, Epstein-Barr virus (EBV), hypertension, T stage, lymphocyte count (LY) and lactate dehydrogenase (LDH) level.

Conclusion: Targeted therapy, N stage, immunotherapy, EBV, hypertension, T stage, LY and LDH level are significantly associated with the risk of distant metastasis in NPC and could be used to identify high-risk populations for distant metastasis in NPC patients. For high-risk patients, early interventions such as targeted therapy and immunotherapy might be considered to reduce the risk of distant metastasis in NPC.

1 Introduction

Nasopharyngeal carcinoma (NPC), a subset of head and neck cancers, originates from the epithelial cells of the nasopharynx (1, 2). The development of NPC is associated with a variety of factors, including genetic susceptibility, infection with the Epstein-Barr virus (EBV), and environmental factors such as smoking (3–8). The incidence of NPC varies markedly by region, and the disease is most prevalent in East and Southeast Asia (9). Early-stage NPC is treated mainly with radiotherapy and chemotherapy and has a good prognosis (10). However, most patients are diagnosed at an advanced stage, when the prognosis is poor and the risk of distant metastasis is higher. Over the past few decades, the survival rate of patients with locally advanced NPC has improved through successful chemoradiation strategies. Despite these advances, approximately 30% of NPC patients still experience recurrence or metastatic disease (11). Distant metastasis is the main cause of treatment failure and death in NPC patients (12).

Patients with metastatic NPC are usually advised to undergo platinum-based chemotherapy as first-line treatment. However, although the combination of gemcitabine and cisplatin has been frequently used in recent years, it offers only limited short-term benefits. Programmed Cell Death Protein 1 (PD-1) and Programmed Cell Death Ligand 1 (PD-L1) inhibitors have brought new hope to patients with metastatic NPC, but challenges and limitations remain, such as drug resistance and the economic burden associated with long-term use (13, 14). Moreover, although tislelizumab plus chemotherapy appeared to be the optimal choice among PD-1 inhibitor plus chemotherapy regimens for the first-line treatment of recurrent or metastatic NPC, the benefit was still limited in many patients (15, 16). Overall, the prognosis for patients with metastatic NPC is poor. Therefore, exploring the risk factors for distant metastasis in NPC and constructing a prediction model is important for improving the prognosis of NPC patients. In this study, we aimed to construct and validate a machine learning (ML) model to predict the risk of distant metastasis in NPC patients. The SHapley Additive exPlanation (SHAP) method was used to elucidate feature importance and interpret the model’s predictions, thereby assessing the model’s practical utility in predicting distant metastasis in NPC patients.

2 Materials and methods

2.1 Study design

The study design is shown in Figure 1. Data from 1,845 NPC patients were analyzed. The candidate variables comprised demographic variables, treatment regimens, comorbidities, lifestyle variables and laboratory indicators, together with the outcome variable (distant metastasis). The dataset was split into training and test subsets at a ratio of 7:3. The Least Absolute Shrinkage and Selection Operator (LASSO) method was applied to the training dataset to select the most significant features, which were then included in seven ML models. The optimal model was selected according to the performance of each ML model and used to establish a predictive model for distant metastasis in NPC. The SHAP method was used to rank feature importance and to explain the predictive model’s outputs. Univariate and multivariate analyses were performed on the entire cohort to identify independent risk factors. The association between the independent continuous factors and distant metastasis was assessed with a restricted cubic spline (RCS) model.

Figure 1. Study design.

2.2 Study population

This study collected data from patients treated at the Eye Ear Nose Throat Hospital of Fudan University between September 2017 and June 2024. The inclusion criteria were as follows: 1) NPC was diagnosed by pathology; 2) metastatic lesions were diagnosed by imaging and pathology; 3) demographic data, laboratory, imaging and pathological information, and treatment regimens were completely recorded. The exclusion criteria were: 1) laboratory, pathological, or imaging data were absent; 2) the patient had multiple primary malignancies. Ultimately, this study included data from 1,845 NPC patients. In accordance with the regulations of China and the principles of the Declaration of Helsinki, the research was approved by the Ethics Committee of the Eye Ear Nose Throat Hospital of Fudan University (2024222). This study was registered on ChiCTR (ChiCTR2500095104). Given the retrospective nature of the study and the anonymization of all data, the requirement for informed consent from the patients was waived.

2.3 Data collection

The data of variables analyzed in this research was sourced from the electronic medical records of patients in our hospital. The collected data were as follows: 1) demographic information: age, gender, and body mass index (BMI); 2) pathological information: Tumor Node Metastasis (TNM) stage (American Joint Committee on Cancer 8th edition), tumor differentiation, gene expression; 3) laboratory indicators (first set of tests after NPC admission): alanine aminotransferase (ALT), aspartate aminotransferase (AST), albumin (ALB), globulin (GLOB), blood urea nitrogen (BUN), creatinine (CREA), lactate dehydrogenase (LDH), white blood cell (WBC) count, hemoglobin (HGB) level, platelet (PLT) count, neutrophil count (NE), lymphocyte count (LY); 4) treatment regimens: radiotherapy, chemotherapy, targeted therapy, immunotherapy (If the patient had distant metastasis, only the treatment regimen before metastasis was collected); 5) comorbidities: hypertension, diabetes, hepatitis B, etc.; 6) lifestyle: smoking history, drinking history; 7) EBV infection.

2.4 Data processing and model building

Variables with more than 10% missing data were excluded. The proportion of missing data for each variable is shown in Supplementary Figure S1. Missing values were imputed using the random forest algorithm with its standard settings. Random forest imputation was chosen for its ability to handle complex interactions between variables and its robustness across missing-data scenarios. In addition, multiple imputation was used for sensitivity analysis. The dataset was then partitioned into a training set and a test set at a ratio of 7:3. The LASSO method was used to screen the significant features. Seven widely used ML methods were employed: Logistic Regression (LR) (17), Random Forest (RF) (18), K-Nearest Neighbor (KNN) (19), Support Vector Machine (SVM) (20), Neural Network (NNET) (21), eXtreme Gradient Boosting (XGBoost) (22), and Light Gradient Boosting Machine (LightGBM) (23). Each ML model was trained on the training dataset and validated on the test set. Cross-validation was used to assess the performance of each model. Receiver operating characteristic (ROC) curves and decision curve analysis (DCA) were used to compare the ML models, and the best-performing model was selected to establish the predictive model. Subsequently, the SHAP method was applied to the best-performing model to interpret the role of the features and their clinical significance. The detailed code for data processing and model building is provided in the Supplementary Materials.
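The variable exclusion rule and the 7:3 partition can be sketched in a few lines. This is a minimal illustration in Python rather than the R code in the Supplementary Materials. The function names, the toy records, the hypothetical variable "marker_x", the random seed and the rounding of the cut point are all assumptions made for the example, and the imputation step is omitted.

```python
import random


def drop_sparse_variables(rows, max_missing=0.10):
    """Keep only variables whose share of missing values is at most max_missing."""
    n = len(rows)
    keep = [
        name for name in rows[0]
        if sum(1 for row in rows if row[name] is None) / n <= max_missing
    ]
    return [{name: row[name] for name in keep} for row in rows]


def split_train_test(rows, train_fraction=0.7, seed=2025):
    """Shuffle a copy of the rows and cut it at train_fraction (7:3 by default)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Toy records; "marker_x" is a hypothetical variable missing in 25% of rows.
records = [{"ldh": 180.0 + i, "marker_x": None if i < 5 else 1.0} for i in range(20)]
filtered = drop_sparse_variables(records)
train_rows, test_rows = split_train_test(filtered)
print(sorted(filtered[0]), len(train_rows), len(test_rows))
```

Fixing the seed keeps the partition reproducible, which matters when feature selection is run on the training subset only.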

2.5 Statistical analysis

The ML models were established using R software (version 4.4.2) and related packages such as “xgboost”, “lightgbm”, and “randomForest”. The discrimination of the ML models was evaluated by ROC curve analysis. DCA was used to quantify the net benefit of applying a model across a range of threshold probabilities, thereby evaluating its clinical utility. Continuous variables were expressed as the mean ± standard deviation (SD) and compared using the t-test. Categorical variables were presented as counts with percentages and compared using the chi-square test. Independent risk factors were identified through univariate and multivariate logistic regression analyses. The association between the independent continuous factors and distant metastasis was evaluated using an RCS model. A two-tailed P value of less than 0.05 was considered statistically significant.
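The two evaluation quantities can be made concrete with a short sketch. The AUC is computed as the probability that a randomly chosen patient with metastasis receives a higher score than a randomly chosen patient without, and the net benefit at threshold probability pt uses the standard decision-curve definition TP/n − (FP/n) × pt/(1 − pt). The code is illustrative Python, not the R implementation used in the study, and the labels and probabilities are made-up toy values.

```python
def auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def net_benefit(labels, probs, threshold):
    """Decision-curve net benefit of treating everyone with probability >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy example: 1 = distant metastasis, 0 = no metastasis (invented values).
toy_labels = [0, 0, 0, 1, 0, 1, 1, 0]
toy_probs = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.80, 0.15]
print(auc(toy_labels, toy_probs))
print(net_benefit(toy_labels, toy_probs, 0.25))
print(net_benefit(toy_labels, [1.0] * len(toy_labels), 0.25))  # "treat all" reference
```

A decision curve repeats the net-benefit calculation over a grid of thresholds and compares the model against the "treat all" and "treat none" (net benefit 0) strategies.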

3 Results

3.1 Characteristics of study participants

A total of 1,845 NPC patients were analyzed in this study, of whom 162 developed distant metastasis. The cohort comprised 1,348 males (73.06%) and 497 females (26.94%). In terms of lifestyle factors, 812 individuals (44.01%) reported smoking, and 606 (32.85%) reported alcohol consumption. Concerning tumor stage and differentiation, the majority of patients presented with advanced T3/T4 stage (1,426, 77.29%), N2/N3 stage (1,176, 63.74%), and undifferentiated carcinoma (1,616, 87.59%). Regarding therapeutic approaches, 1,204 patients (65.26%) underwent targeted therapy, while 246 (13.33%) received immunotherapy. Almost all patients received radiotherapy or radiotherapy combined with chemotherapy. Comorbid conditions included hypertension in 507 patients (27.48%), diabetes in 156 (8.46%), hepatitis B in 61 (3.31%), and a history of other tumors in 44 (2.38%). Genetically, most patients showed positive expression of CKpan, P40, P63, EGFR, and EGER, whereas P16 expression was negative in the majority. In terms of EBV infection, 1,352 patients (73.28%) tested positive. The mean age of the patients was 52.34 years, and the mean BMI was 23.81 kg/m².

3.2 Independent risk factors and dose-response relationship

We investigated the independent risk factors for distant metastasis in NPC patients. Through univariate logistic regression analysis, 11 potential risk factors were pinpointed to be significantly associated with distant metastasis in NPC (P < 0.05; Table 1). Subsequent multivariate logistic regression analysis revealed 9 factors that were independently associated with the risk of distant metastasis in NPC patients (P < 0.05; Table 1). These independent risk factors were gender, T stage, N stage, targeted therapy, immunotherapy, hypertension, EBV, LDH and LY.

Table 1. Results of the univariate and multivariate logistic regression analyses.

Based on the findings from the multivariate logistic regression analysis, we investigated the relationship between LDH and LY levels and the risk of distant metastasis in NPC by RCS analysis. Before examining the dose-response association, we adjusted for potential confounding factors and tested for non-linearity. The dose-response curves suggested a nonlinear association between LDH level and distant metastasis in NPC (P-overall < 0.001, P-non-linear = 0.010) (Figure 2). The risk of distant metastasis in NPC increased rapidly when the LDH level exceeded 239 U/L. There was no significant nonlinear association between LY level and distant metastasis in NPC (P-overall > 0.05, P-non-linear > 0.05).

Figure 2. Restricted cubic spline (RCS) plots. (A) LDH; (B) LY.
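For readers unfamiliar with RCS models, the sketch below builds the restricted cubic spline basis in its usual truncated-power form: cubic between knots and constrained to be linear beyond the outermost knots. Entering these columns into a logistic regression in place of a single linear term allows a nonlinear dose-response curve. The knot positions are hypothetical, since the article does not report the knots used, and the code is illustrative Python rather than the R analysis.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns k - 1 columns for k knots: x itself plus k - 2 nonlinear terms,
    each scaled by the squared distance between the outer knots.
    """
    def cube_plus(u):
        return u ** 3 if u > 0 else 0.0

    t = sorted(knots)
    if len(t) < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")
    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2
    columns = [float(x)]
    for j in range(len(t) - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / last_gap
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / last_gap)
        columns.append(term / scale)
    return columns


# Hypothetical knots on the LDH scale (U/L); not taken from the article.
ldh_knots = [150, 200, 240, 300]
for ldh in (120, 220, 260, 400):
    print(ldh, rcs_basis(ldh, ldh_knots))
```

The test for non-linearity reported alongside such curves asks whether the coefficients of the nonlinear columns are jointly zero.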

3.3 Selection of most important features and model development

The LASSO method was used to screen the significant features (Figure 3). LASSO regression identified the eight most important features: T stage, N stage, targeted therapy, immunotherapy, hypertension, EBV, LDH and LY. The ROC curves for the training and test datasets are presented in Figure 4A and Figure 4B, respectively. In the test dataset, LR achieved the highest Area Under the ROC Curve (AUC) (Figure 4B). DCA also indicated that the LR model performed best in the test dataset (Supplementary Figure S2). We therefore selected the LR model to establish the predictive model for distant metastasis in NPC.

Figure 3. LASSO regression analysis. (A) LASSO regression coefficient paths; (B) LASSO regression cross-validation error plot.

Figure 4. Model evaluation metrics and curves. (A) ROC curves for training dataset; (B) ROC curves for test dataset.
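The way LASSO screens features can be illustrated with a small L1-penalized logistic regression. The sketch uses proximal gradient descent with soft-thresholding, which sets weak coefficients exactly to zero. It is not the solver used in the study: the algorithm, the step size, the penalty values and the toy data (one informative feature, one uninformative feature) are all assumptions chosen for illustration.

```python
import math


def soft_threshold(value, amount):
    """Shrink value toward zero by amount; values within [-amount, amount] become 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def lasso_logistic(X, y, lam, step=0.1, iters=2000):
    """Minimise mean log-loss + lam * sum(|beta_j|) by proximal gradient descent."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(iters):
        residuals = []
        for row, target in zip(X, y):
            z = intercept + sum(b * v for b, v in zip(beta, row))
            residuals.append(1.0 / (1.0 + math.exp(-z)) - target)
        intercept -= step * sum(residuals) / n  # the intercept is not penalised
        for j in range(p):
            grad = sum(r * row[j] for r, row in zip(residuals, X)) / n
            beta[j] = soft_threshold(beta[j] - step * grad, step * lam)
    return intercept, beta


# Toy data: column 0 tracks the outcome, column 1 is unrelated to it.
X_toy = [[-1, -1], [-1, 1], [-1, -1], [-1, 1], [1, -1], [1, 1], [1, -1], [1, 1]]
y_toy = [0, 0, 0, 1, 1, 1, 1, 0]
print(lasso_logistic(X_toy, y_toy, lam=0.1))
print(lasso_logistic(X_toy, y_toy, lam=1.0))
```

In practice the penalty strength is chosen by cross-validation, which is what a cross-validation error plot such as Figure 3B summarises. Features whose coefficients remain non-zero at the chosen penalty are retained.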

3.4 Model explanation

We used the SHAP method to interpret the final model’s predictions by assessing the contribution of each feature to the predicted outcome. The SHAP summary bar plot ranked the features by mean SHAP value in descending order: targeted therapy, N stage, immunotherapy, EBV, hypertension, T stage, LY and LDH level (Figure 5A). The SHAP summary dot plot showed the strength and direction of each feature’s impact on the model prediction (Figure 5B). N2/3 stage, EBV positivity, T3/4 stage and high LDH levels were significantly associated with an increased risk of distant metastasis in NPC. In contrast, targeted therapy, N0/1 stage, immunotherapy, comorbid hypertension, T1/2 stage and a high LY level were associated with a significantly reduced risk of distant metastasis in NPC.

Figure 5. Model explanation. (A) SHAP summary bar plot; (B) SHAP summary dot plot; (C) Nomogram to predict the probability of distant metastasis in NPC.
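For a logistic regression model, SHAP values on the log-odds scale have a simple closed form if the features are treated as independent: each feature contributes its coefficient times the deviation of its value from the background mean. The sketch below uses that closed form to show how a bar-plot ranking (mean absolute SHAP value) and per-patient contributions arise. The coefficients, feature names and background rows are hypothetical and are not the fitted model from this study. The independence assumption is also a simplification.

```python
def linear_shap(coefs, background, x):
    """SHAP values on the log-odds scale for a linear model with independent features."""
    means = [sum(column) / len(column) for column in zip(*background)]
    return [beta * (value - mean) for beta, value, mean in zip(coefs, x, means)]


def mean_abs_shap(coefs, background):
    """Global importance: mean absolute SHAP value of each feature over the background rows."""
    per_row = [linear_shap(coefs, background, row) for row in background]
    return [sum(abs(v) for v in column) / len(column) for column in zip(*per_row)]


# Hypothetical coefficients for two binary features (not the fitted model).
feature_names = ["targeted_therapy", "n_stage_n2_3"]
coefs = [-1.2, 0.8]
background = [[0, 0], [1, 0], [1, 1], [0, 1]]
importance = mean_abs_shap(coefs, background)
ranking = [name for _, name in sorted(zip(importance, feature_names), reverse=True)]
print(ranking)
print(linear_shap(coefs, background, [1, 1]))
```

The per-patient values sum to the difference between that patient’s log-odds and the average log-odds over the background data, which is the additivity property that makes SHAP summaries interpretable.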

In addition, a nomogram for predicting the risk of distant metastasis in NPC was constructed from the LR model fitted to the training dataset (Figure 5C). The calibration plot of the LR model is shown in Supplementary Figure S3. To illustrate the impact of each variable on distant metastasis more intuitively, we performed a multivariate analysis on the dataset used to construct the nomogram and displayed the results as a forest plot (Supplementary Figure S4). In the nomogram, each of the eight variables (targeted therapy, N stage, immunotherapy, EBV, hypertension, T stage, LY and LDH level) is assigned a score on the point scale axis, and the total score is the sum of these individual scores. Locating the total score on the total point scale at the bottom gives the estimated probability of distant metastasis in NPC.
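The point scale of a logistic-regression nomogram is a linear rescaling of each predictor’s contribution to the log-odds, so the procedure described above can be sketched directly. In the usual construction, the predictor with the widest effect range spans 0 to 100 points and the others are scaled relative to it. The intercept, coefficients and value ranges below are hypothetical placeholders, not the values behind Figure 5C.

```python
import math


def nomogram_points(coefs, ranges, patient):
    """Map each predictor value to points; the widest effect range spans 0 to 100."""
    widest = max(abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs)
    points = {}
    for name, beta in coefs.items():
        lowest = min(beta * ranges[name][0], beta * ranges[name][1])
        points[name] = 100.0 * (beta * patient[name] - lowest) / widest
    return points


def probability_from_points(total_points, intercept, coefs, ranges):
    """Convert total points back to the linear predictor, then to a probability."""
    widest = max(abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs)
    baseline = intercept + sum(
        min(coefs[k] * ranges[k][0], coefs[k] * ranges[k][1]) for k in coefs
    )
    linear_predictor = baseline + widest * total_points / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical model: intercept, coefficients and predictor ranges are invented.
toy_intercept = -3.0
toy_coefs = {"n_stage_n2_3": 0.9, "targeted_therapy": -1.1, "ldh": 0.004}
toy_ranges = {"n_stage_n2_3": (0, 1), "targeted_therapy": (0, 1), "ldh": (100, 500)}
toy_patient = {"n_stage_n2_3": 1, "targeted_therapy": 0, "ldh": 300}

toy_points = nomogram_points(toy_coefs, toy_ranges, toy_patient)
toy_total = sum(toy_points.values())
print(toy_points, toy_total)
print(probability_from_points(toy_total, toy_intercept, toy_coefs, toy_ranges))
```

Because the mapping is linear, the probability read from the total point scale is identical to the probability obtained by entering the patient’s values into the logistic regression equation.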

3.5 Sensitivity analysis

The missing data were estimated by multiple imputation for sensitivity analysis. The results of the sensitivity analysis are displayed in Supplementary Figures S5–S8 and indicate that the results were robust and reliable.

4 Discussion

To date, several studies have addressed the prediction of distant metastasis risk in NPC, but most had small sample sizes (24–27). One of them was based on mining the SEER database, which cannot accurately represent the real circumstances of Chinese NPC patients (27). In addition, the SEER database usually lacks some data, such as comorbidities. To our knowledge, apart from the SEER database study, this study has the largest sample size to date among studies examining distant metastasis in NPC. In the present study, we identified several significant risk factors for distant metastasis in NPC. Seven ML models were then used to analyze and predict the risk of distant metastasis in NPC. After the performance of the ML models was compared, the LR model performed best and was selected to develop the prediction model for distant metastasis in NPC.

Our results identified eight factors significantly associated with the risk of distant metastasis in NPC (targeted therapy, N stage, immunotherapy, EBV, hypertension, T stage, LY and LDH level). N2/3 stage, EBV positivity, T3/4 stage, and elevated LDH levels were significantly correlated with an increased risk of distant metastasis in NPC. Conversely, the administration of immunotherapy and targeted therapy, along with N0/1 stage, T1/2 stage, hypertension, and a high LY level, were associated with a significantly reduced risk of distant metastasis in NPC. These factors could serve as important reference indicators for clinicians assessing the risk of distant metastasis in NPC patients, helping to identify high-risk populations and implement early interventions. Considering these factors in combination may have greater predictive value than considering any single factor alone and may more accurately reflect the patient’s risk of distant metastasis.

For high-risk patients, early intervention measures (such as immunotherapy and targeted therapy) might have potential benefits. Immunotherapy such as PD-1 inhibitors could suppress tumors through various mechanisms, including enhancing anti-tumor immune responses, synergistic effects of combination therapies, modulating the tumor microenvironment, and impacting tumor metastasis (28). For patients with advanced NPC, immunotherapy could improve prognosis. The combination of PD-1 inhibitors and chemotherapy is the first-line treatment for metastatic NPC. In this study, although few patients (246 cases) received immunotherapy, only 2 of them developed distant metastasis. Therefore, for NPC patients at high risk of distant metastasis, early intervention with immunotherapy (PD-1 inhibitor) could reduce the risk of distant metastasis. In addition to immunotherapy, targeted therapy could also significantly reduce the risk of distant metastasis in NPC. Studies have shown that over 90% of NPC patients overexpress the epidermal growth factor receptor (EGFR), and high expression of EGFR is closely associated with the aggressiveness, metastasis, resistance to radiotherapy and chemotherapy, and poor prognosis of NPC (29). Drugs such as nimotuzumab could selectively inhibit the proliferation of tumor cells by targeting EGFR, thereby improving prognosis (30). Future research could further explore the specific efficacy and safety of these early interventions in different risk groups.

Interestingly, our study indicated that NPC patients with hypertension had a lower risk of distant metastasis, with an OR of 0.35 (0.21-0.56, P<0.001) in the univariate analysis and an OR of 0.36 (0.22-0.61, P<0.001) in the multivariate analysis. Some studies have reported that hypertension is a risk factor for several types of cancer, such as renal cell carcinoma and early cervical cancer (31, 32). One study indicated that hypertension was related to an increased risk of EBV reactivation in NPC (33). In addition, a previous study indicated that captopril could inhibit lung tumor growth and metastasis (34). Given the high collinearity between hypertension and the use of antihypertensive medications, hypertension in this study should be understood as referring to the use of antihypertensive medications. However, which types of antihypertensive agents could reduce the risk of distant metastasis in NPC, and through which mechanisms, still requires further research. We will conduct a prospective study and in-depth mechanistic explorations in the future to further elucidate this issue.
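The reported odds ratio and confidence interval can be related back to the underlying logistic regression coefficient with a few lines of arithmetic. The sketch below assumes that the interval is a symmetric 95% Wald interval on the log-odds scale, which the article does not state explicitly. Under that assumption it recovers the coefficient, its standard error, the z statistic and the two-sided P value from the multivariate estimate for hypertension (OR 0.36, 0.22-0.61).

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover beta, SE, z and two-sided P from an OR and its Wald confidence interval."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p_value


beta, se, z, p_value = wald_summary(0.36, 0.22, 0.61)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p_value:.1e}")
```

Because the published OR and interval are rounded to two decimals, the recovered values are approximate. They are nevertheless consistent with the reported P<0.001.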

There are several limitations in the present study. Firstly, this study was conducted retrospectively in a single center, which limits generalizability. Despite the strict inclusion and exclusion criteria, potential biases that could influence the results are difficult to eliminate. We are considering temporal validation to further assess the model’s performance over time. Multicenter prospective studies with large sample sizes are also needed in the future. Secondly, because this is a retrospective study, some important information is missing, such as the details of drugs used for comorbidities and the time of distant metastasis. For example, for patients with hypertension, we only know that almost all of them had taken antihypertensive drugs; for most patients, the specific type of antihypertensive medication was not recorded. We were also unable to draw Kaplan-Meier curves for metastasis-free survival stratified by nomogram-based risk groups. Thirdly, the model was developed using data from Chinese patients, and whether it can be applied to other populations needs further study. Moreover, time is an important factor to consider, as some patients who have not yet experienced distant metastasis may develop it in the future. Therefore, prospective studies are needed for more in-depth research. In addition, although machine learning models possess substantial theoretical predictive capabilities, their implementation in practical clinical environments faces various challenges, such as clinicians’ understanding of and trust in model outputs and the effective use of models in real-time clinical decision-making. Furthermore, the LR model performed the best in this study (AUC = 0.8499). Although this performance is quite satisfactory, the AUC value might be influenced by factors such as the distribution of features in the dataset and the sample size. In future clinical practice, it is necessary to continuously update and optimize the predictive models to adapt to evolving medical knowledge and technological advancements, thereby providing more precise support for the individualized treatment of NPC patients. Despite these limitations, the outstanding performance of our final prediction model remains undiminished.

5 Conclusion

In conclusion, targeted therapy, immunotherapy, N stage, EBV, hypertension, T stage, LY and LDH levels are significantly related to the risk of distant metastasis in NPC patients and could be used to identify high-risk populations for the distant metastasis in NPC. The identification of these risk factors could help clinicians develop more precise treatment plans based on individual patient characteristics, thereby improving therapeutic outcomes and reducing the risk of distant metastasis. With the implementation of multicenter studies, the identification of new features, and the application of more advanced machine learning technologies, it is anticipated that the predictive ability and intervention outcomes for distant metastasis in NPC will be further enhanced, thereby improving patient prognosis.

Data availability statement

The raw data presented in the study are included in the article/Supplementary Material. Any other data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by The Ethics Committee of the Eye Ear Nose Throat Hospital of Fudan University. The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent from participants or their legal guardians/next of kin, given the retrospective nature of the study and the anonymization of all data.

Author contributions

HS: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Visualization, Writing – original draft, Writing – review & editing. JZ: Data curation, Resources, Writing – review & editing. LL: Resources, Writing – review & editing. XX: Resources, Writing – review & editing. JY: Conceptualization, Supervision, Writing – review & editing. TH: Conceptualization, Supervision, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Acknowledgments

We are grateful to all the patients who participated in this study.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2025.1580200/full#supplementary-material

References

1. Chen Y, Chan A, Le Q, Blanchard P, Sun Y, and Ma J. Nasopharyngeal carcinoma. Lancet. (2019) 394:64–80. doi: 10.1016/S0140-6736(19)30956-0

2. Cantù G. Nasopharyngeal carcinoma. A “different” head and neck tumour. Part B: treatment, prognostic factors, and outcomes. Acta Otorhinolaryngol Ital. (2023) 43:155–69. doi: 10.14639/0392-100X-N2223

3. Chang ET, Ye W, Zeng Y, and Adami H. The evolving epidemiology of nasopharyngeal carcinoma. Cancer Epidemiology Biomarkers Prev. (2021) 30:1035–47. doi: 10.1158/1055-9965.EPI-20-1702

4. Su ZY, Siak PY, Lwin YY, and Cheah S. Epidemiology of nasopharyngeal carcinoma: current insights and future outlook. Cancer Metastasis Rev. (2024) 43:919–39. doi: 10.1007/s10555-024-10176-9

5. Friborg JT, Yuan JM, Wang R, Koh WP, Lee HP, and Yu MC. A prospective study of tobacco and alcohol use as risk factors for pharyngeal carcinomas in Singapore Chinese. Cancer. (2007) 109:1183–91. doi: 10.1002/cncr.22501

6. Roy Chattopadhyay N, Das P, Chatterjee K, and Choudhuri T. Higher incidence of nasopharyngeal carcinoma in some regions in the world confers for interplay between genetic factors and external stimuli. Drug Discoveries Ther. (2017) 11:170–80. doi: 10.5582/ddt.2017.01030

7. Tsao S, Tsang C, and Lo K. Epstein–Barr virus infection and nasopharyngeal carcinoma. Philos Trans R Soc Lond B Biol Sci. (2017) 372:20160270. doi: 10.1098/rstb.2016.0270

8. Diao H, Xue WQ, Wang TM, Yang DW, Deng CM, Li DH, et al. The interaction and mediation effects between the host genetic factors and Epstein–Barr virus VCA-IgA in the risk of nasopharyngeal carcinoma. J Med Virol. (2023) 95:e29224. doi: 10.1002/jmv.29224

9. Guo R, Mao Y, Tang L, Chen L, Sun Y, and Ma J. The evolution of nasopharyngeal carcinoma staging. Br J Radiol. (2019) 92:20190244. doi: 10.1259/bjr.20190244

10. Juarez-Vignon Whaley JJ, Afkhami M, Sampath S, Amini A, Bell D, and Villaflor VM. Early stage and locally advanced nasopharyngeal carcinoma treatment from present to future: where are we and where are we going? Curr Treat Options Oncol. (2023) 24:845–66. doi: 10.1007/s11864-023-01083-2

11. Jiromaru R, Nakagawa T, and Yasumatsu R. Advanced nasopharyngeal carcinoma: current and emerging treatment options. Cancer Manag Res. (2022) 14:2681–89. doi: 10.2147/CMAR.S341472

12. Cao C, Luo J, Gao L, Yi J, Huang X, Wang K, et al. Clinical outcomes and patterns of failure after intensity-modulated radiotherapy for T4 nasopharyngeal carcinoma. Oral Oncol. (2013) 49:175–81. doi: 10.1016/j.oraloncology.2012.08.013

13. Ma Y, Chen X, Wang A, Zhao H, Lin Q, Bao H, et al. Copy number loss in granzyme genes confers resistance to immune checkpoint inhibitor in nasopharyngeal carcinoma. J Immunother Cancer. (2021) 9:e2014. doi: 10.1136/jitc-2020-002014

14. Duarte HS, Da Veiga CRP, Da Veiga CP, Wainstein AJA, Da Silva WV, and Drummond-Lage AP. Does it fit in your pocket? economic burden of PD-1 inhibitors’ toxicity in the supplementary health system: evidence from Brazil. BMC Health Serv Res. (2023) 23:781. doi: 10.1186/s12913-023-09736-6

15. Yang Y, Pan J, Wang H, Zhao Y, Qu S, Chen N, et al. Tislelizumab plus chemotherapy as first-line treatment for recurrent or metastatic nasopharyngeal cancer: A multicenter phase 3 trial (RATIONALE-309). Cancer Cell. (2023) 41:1061–72. doi: 10.1016/j.ccell.2023.04.014

16. Sun H, Bu F, Li L, Zhang X, Xin X, Yan J, et al. Efficacy and safety of immune checkpoint inhibitors combined with chemotherapy as first-line treatment for recurrent or metastatic nasopharyngeal carcinoma: A network meta-analysis of randomized controlled trials. Ann Pharmacother. (2024) 58:349–59. doi: 10.1177/10600280231188171

17. Wang Q, Yu S, Qi X, Hu Y, Zheng W, Shi J, et al. Overview of logistic regression model analysis and application. Zhonghua Yu Fang Yi Xue Za Zhi. (2019) 53:955–60. doi: 10.3760/cma.j.issn.0253-9624.2019.09.018

18. Jin Y, Lan A, Dai Y, Jiang L, and Liu S. Development and testing of a random forest-based machine learning model for predicting events among breast cancer patients with a poor response to neoadjuvant chemotherapy. Eur J Med Res. (2023) 28:394. doi: 10.1186/s40001-023-01361-7

19. Liu C, Yang H, Feng Y, Liu C, Rui F, Cao Y, et al. A K-nearest neighbor model to predict early recurrence of hepatocellular carcinoma after resection. J Clin Transl Hepatol. (2022) 10:600–07. doi: 10.14218/JCTH.2021.00348

20. Pal S, Peng Y, Aselisewine W, and Barui S. A support vector machine-based cure rate model for interval censored data. Stat Methods Med Res. (2023) 32:2405–22. doi: 10.1177/09622802231210917

21. Huang J, Zhou Y, Zhang H, and Wu Y. A neural network model to screen feature genes for pancreatic cancer. BMC Bioinf. (2023) 24:193. doi: 10.1186/s12859-023-05322-z

22. Li Y, Zou Z, Gao Z, Wang Y, Xiao M, Xu C, et al. Prediction of lung cancer risk in Chinese population with genetic-environment factor using extreme gradient boosting. Cancer Med. (2022) 11:4469–78. doi: 10.1002/cam4.4800

23. Ramalingam K, Yadalam PK, Ramani P, Krishna M, Hafedh S, Badnjević A, et al. Light gradient boosting-based prediction of quality of life among oral cancer-treated patients. BMC Oral Health. (2024) 24:349. doi: 10.1186/s12903-024-04050-x

24. Hua H, Deng Y, Li S, Li S, Li F, Xiao B, et al. Deep learning for predicting distant metastasis in patients with nasopharyngeal carcinoma based on pre-radiotherapy magnetic resonance imaging. Comb Chem High Throughput Screen. (2023) 26:1351–63. doi: 10.2174/1386207325666220919091210

25. Zhang L, Wu X, Liu J, Zhang B, Mo X, Chen Q, et al. MRI-based deep-learning model for distant metastasis-free survival in locoregionally advanced nasopharyngeal carcinoma. J Magn Reson Imaging. (2021) 53:167–78. doi: 10.1002/jmri.27308

26. Chen X, Li Y, Li X, Cao X, Xiang Y, Xia W, et al. An interpretable machine learning prognostic system for locoregionally advanced nasopharyngeal carcinoma based on tumor burden features. Oral Oncol. (2021) 118:105335. doi: 10.1016/j.oraloncology.2021.105335

27. Sun Y, Tan J, Li C, Yu D, and Chen W. Creating an interactive database for nasopharyngeal carcinoma management: applying machine learning to evaluate metastasis and survival. Front Oncol. (2024) 14:1456676. doi: 10.3389/fonc.2024.1456676

28. Liu X, Shen H, Zhang L, Huang W, Zhang S, and Zhang B. Immunotherapy for recurrent or metastatic nasopharyngeal carcinoma. NPJ Precis Oncol. (2024) 8:101. doi: 10.1038/s41698-024-00601-1

29. Ma X, Huang J, Wu X, Li X, Zhang J, Xue L, et al. Epidermal growth factor receptor could play a prognostic role to predict the outcome of nasopharyngeal carcinoma: A meta-analysis. Cancer Biomark. (2014) 14:267–77. doi: 10.3233/CBM-140401

30. Liang R, Yang L, and Zhu X. Nimotuzumab, an anti-EGFR monoclonal antibody, in the treatment of nasopharyngeal carcinoma. Cancer Control. (2021) 28:1–06. doi: 10.1177/1073274821989301

31. Kim CS, Han K, Choi HS, Bae EH, Ma SK, and Kim SW. Association of hypertension and blood pressure with kidney cancer risk. Hypertension. (2020) 75:1439–46. doi: 10.1161/HYPERTENSIONAHA.120.14820

32. Shen T, Zhao J, Li W, Wang X, Gao Y, Wang Z, et al. Hypertension and hyperglycaemia are positively correlated with local invasion of early cervical cancer. Front Endocrinol (Lausanne). (2023) 14:1280060. doi: 10.3389/fendo.2023.1280060

33. Lin K, Zeng Z, Li X, Chen W, Lin D, Xie S, et al. Association between hypertension and Epstein-Barr virus reactivation among the population in a high-risk area for nasopharyngeal carcinoma. Virus Res. (2023) 331:199117. doi: 10.1016/j.virusres.2023.199117

34. Attoub S, Gaben AM, Al Salam S, Al Sultan MAH, John A, Nicholls MG, et al. Captopril as a potential inhibitor of lung tumor growth and metastasis. Ann N Y Acad Sci. (2008) 1138:65–72. doi: 10.1196/annals.1414.011

Keywords: nasopharyngeal carcinoma, machine learning, distant metastasis, predictive model, immunotherapy, targeted therapy

Citation: Sun H, Zhu J, Li L, Xin X, Yan J and Huang T (2025) Machine learning for predicting distant metastasis in nasopharyngeal carcinoma patients. Front. Immunol. 16:1580200. doi: 10.3389/fimmu.2025.1580200

Received: 20 February 2025; Accepted: 19 May 2025;
Published: 05 June 2025.

Edited by:

Claudine Kieda, Military Institute of Medicine, Poland

Reviewed by:

Catharina Lisson, Ulm University Medical Center, Germany
Jingwei Zhao, Shanghai Jiao Tong University, China

Copyright © 2025 Sun, Zhu, Li, Xin, Yan and Huang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Taomin Huang, taominhuang@126.com; Jingchao Yan, jingchao.yan@fdeent.org

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.