AUTHOR=Fang Yutong, Zheng Rongji, Xiao Yefeng, Zhang Qunchen, Liu Junpeng, Wu Jundong TITLE=Machine learning-based diagnostic and prognostic models for breast cancer: a new frontier on the clinical application of natural killer cell-related gene signatures in precision medicine JOURNAL=Frontiers in Immunology VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2025.1581982 DOI=10.3389/fimmu.2025.1581982 ISSN=1664-3224 ABSTRACT=Background: Breast cancer (BC) remains a leading cause of cancer-related mortality among women worldwide. Natural killer (NK) cells play a crucial role in the innate immune system and exhibit significant anti-tumor activity. However, the role of NK cell-related genes (NRGs) in BC diagnosis and prognosis remains underexplored. With the advent of machine learning (ML) techniques, predictive modeling based on NRGs may offer a new avenue for precision oncology. Methods: We collected transcriptomic and clinical data from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Differentially expressed genes (DEGs) were identified, and key prognostic NRGs were selected using univariate and multivariate Cox regression analyses. We constructed ML-based diagnostic models using 12 algorithms and compared their performance to identify the optimal diagnostic model. Additionally, a prognostic risk model was developed using LASSO-Cox regression, and its performance was validated in independent cohorts. To explore the potential mechanisms underlying the prognostic differences between high-risk and low-risk patient groups, as well as their drug treatment sensitivities, we conducted functional enrichment analysis, tumor microenvironment analysis, immunotherapy prediction, drug sensitivity analysis, and mutation analysis. Results: ULBP2, CCL5, PRDX1, IL21, NFATC2, CD2, and VAV3 were identified as key NRGs for the construction of ML models. 
Among the 12 ML diagnostic models, the Random Forest (RF) model performed best, robustly distinguishing BC from normal tissues in both the training (TCGA) and validation (GEO) cohorts. For the prognostic model, the risk score based on LASSO-Cox regression effectively separated high-risk from low-risk patients: patients in the high-risk group had significantly poorer overall survival (OS) than those in the low-risk group, a result that was validated in the GEO cohorts. Patients in the high-risk group displayed increased tumor proliferation, immune evasion, and reduced immune cell infiltration, correlating with poorer prognosis and lower response rates to immunotherapy. Furthermore, drug sensitivity analysis indicated that high-risk patients were more sensitive to Thapsigargin, Docetaxel, AKT inhibitor VIII, Pyrimethamine, and Epothilone B, while showing greater resistance to drugs such as I-BET-762, PHA-665752, and Belinostat. Conclusion: This study provides a comprehensive analysis of NRGs in BC and establishes reliable ML-based diagnostic and prognostic models. The findings highlight the clinical relevance of NRGs in BC progression, immune regulation, and therapy response, offering potential targets for personalized treatment strategies.
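
The risk stratification summarized in the abstract can be illustrated with a minimal sketch. It assumes the usual form of a Cox-type risk score (a weighted sum of gene expression values) and a median cutoff for the high-risk and low-risk groups; the abstract states neither the fitted coefficients, the retained genes, nor the cutoff, so the coefficients, patient IDs, and expression values below are hypothetical and serve only to show the arithmetic.

```python
from statistics import median

# Hypothetical coefficients for illustration only: the abstract does not report
# the fitted LASSO-Cox coefficients or which of the seven NRGs were retained.
HYPOTHETICAL_COEFS = {"ULBP2": 0.30, "CCL5": -0.20, "CD2": -0.15}


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefs.items())


def stratify(scores, cutoff=None):
    """Label each patient 'high' or 'low' risk.

    A median cutoff is assumed when none is given; scores strictly above
    the cutoff are labeled 'high'.
    """
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for three hypothetical patients.
patients = {
    "P1": {"ULBP2": 2.0, "CCL5": 1.0, "CD2": 1.0},
    "P2": {"ULBP2": 0.5, "CCL5": 3.0, "CD2": 2.0},
    "P3": {"ULBP2": 1.0, "CCL5": 1.0, "CD2": 1.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores)

for pid in patients:
    print(pid, round(scores[pid], 3), groups[pid])
```

In a real analysis the coefficients would come from the fitted penalized Cox model, and the cutoff chosen on the training cohort would be reused unchanged on the validation cohorts.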