AUTHOR=Tu Siyuan, Yin Yulian, Ma Lina, Chen Hongfeng, Ye Meina
TITLE=Diagnosis of non-puerperal mastitis based on “whole tongue” features: non-invasive biomarker mining and diagnostic model construction
JOURNAL=Frontiers in Cellular and Infection Microbiology
VOLUME=Volume 15 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/cellular-and-infection-microbiology/articles/10.3389/fcimb.2025.1602883
DOI=10.3389/fcimb.2025.1602883
ISSN=2235-2988
ABSTRACT=
Background: Non-puerperal mastitis (NPM) arises from heterogeneous factors ranging from autoimmune dysregulation to occult infections. Biopsy establishes the diagnosis reliably but is invasive. Imaging has limited specificity and may lead to diagnostic delays, patient discomfort, and suboptimal management. Inspired by non-invasive tongue diagnosis in traditional Chinese medicine, this study integrated tongue-coating microbiota profiling and AI-quantified tongue image phenotyping to establish an objective, non-invasive diagnostic framework for NPM.
Methods: A total of 100 NPM patients from the Breast Surgery Department of Longhua Hospital and 100 healthy volunteers were included. Their clinical characteristics, tongue images, and tongue-coating microbiota data were collected. Tongue image features were quantified and extracted via deep learning (detection, segmentation, and classification). The microbiota composition was assessed using 16S rRNA gene sequencing (V3–V4 region) and bioinformatic pipelines (QIIME2, DADA2). Based on clinical, imaging, and microbial features, three machine learning models were trained to distinguish NPM patients from healthy volunteers: logistic regression (LR), support vector machine (SVM), and gradient boosting decision tree (GBDT).
Results: The GBDT model achieved the best diagnostic performance (AUROC = 0.98, accuracy = 0.95, and specificity = 0.95), ahead of the LR model on specificity (AUROC = 0.98, accuracy = 0.95, and specificity = 0.90) and of the SVM model on all three metrics (AUROC = 0.87, accuracy = 0.80, and specificity = 0.75).
Integration of clinical characteristics, tongue image features, and bacterial profiles (at the genus/family level) yielded the highest accuracy, whereas models using a single class of features showed lower discriminatory ability (AUROC = 0.90–0.91). Key predictors included Campylobacter (12%), waist–hip ratio (11%), and Alloprevotella (6%).
Conclusions: By integrating clinical characteristics, tongue image features, and tongue-coating microbiota profiles, the multimodal GBDT model demonstrates high diagnostic accuracy, supporting its utility for early screening and diagnosis of NPM.
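
The abstract reports three metrics per model: AUROC, accuracy, and specificity. As a minimal sketch of how such metrics are conventionally computed for a binary classifier (not the authors' evaluation code), the Python below derives them from labels and predicted scores. The toy labels, scores, and the 0.5 decision threshold are illustrative assumptions and are unrelated to the study data; label 1 stands for NPM and 0 for healthy.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = positive (NPM)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of all cases classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(y_true, y_pred):
    """Fraction of true negatives (healthy) classified as negative."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def auroc(y_true, scores):
    """AUROC as the probability that a random positive case scores higher
    than a random negative case (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), specificity(y_true, y_pred), auroc(y_true, scores))
```

Because AUROC is computed from the ranking of scores while accuracy and specificity depend on a chosen threshold, two models can share the same AUROC and accuracy yet differ in specificity, as the LR and GBDT figures above do.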