Diagnosis of non-puerperal mastitis based on “whole tongue” features: non-invasive biomarker mining and diagnostic model construction

Tu, Siyuan; Yin, Yulian; Ma, Lina; Chen, Hongfeng; Ye, Meina

doi:10.3389/fcimb.2025.1602883

ORIGINAL RESEARCH article

Front. Cell. Infect. Microbiol., 28 July 2025

Sec. Oral Microbes and Host

Volume 15 - 2025 | https://doi.org/10.3389/fcimb.2025.1602883


Siyuan Tu†, Yulian Yin†, Lina Ma, Hongfeng Chen*, Meina Ye*

Department of Breast Surgery, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China


Abstract

Background:

Non-puerperal mastitis (NPM) arises from heterogeneous factors ranging from autoimmune dysregulation to occult infection. Biopsy establishes the diagnosis reliably but is invasive. Imaging has limited specificity, which may lead to diagnostic delays, patient discomfort, and suboptimal management. Inspired by non-invasive tongue diagnosis in traditional Chinese medicine, this study integrated tongue-coating microbiota profiling and AI-quantified tongue image phenotyping to establish an objective, non-invasive diagnostic framework for NPM.

Methods:

A total of 100 NPM patients from the Breast Surgery Department of Longhua Hospital and 100 healthy volunteers were included. Their clinical characteristics, tongue images, and tongue-coating microbiota data were collected. Tongue image features (detection, segmentation, and classification) were extracted and quantified via deep learning. The microbiota composition was assessed using 16S rRNA gene sequencing (V3–V4 region) and bioinformatic pipelines (QIIME2, DADA2). Based on clinical, imaging, and microbial features, three machine learning models, namely logistic regression (LR), support vector machine (SVM), and gradient boosting decision tree (GBDT), were trained to distinguish NPM patients from healthy volunteers.

Results:

The GBDT model achieved the best diagnostic performance (AUROC = 0.98, accuracy = 0.95, and specificity = 0.95). It matched the LR model in AUROC and accuracy but had a higher specificity (LR: AUROC = 0.98, accuracy = 0.95, and specificity = 0.90), and it outperformed the SVM model on all three metrics (AUROC = 0.87, accuracy = 0.80, and specificity = 0.75). Integration of clinical characteristics, tongue image features, and bacterial profiles (at the genus/family level) yielded the highest accuracy, whereas models using a single class of features showed a lower discriminatory ability (AUROC = 0.90–0.91). Key predictors included Campylobacter (12%), waist–hip ratio (11%), and Alloprevotella (6%).

Conclusions:

Integrating clinical characteristics, tongue image features, and tongue-coating microbiota profiles, the multimodal GBDT model demonstrates a high diagnostic accuracy, supporting its utility for early screening and diagnosis of NPM.

Introduction

Non-puerperal mastitis (NPM) is a group of inflammatory breast diseases that includes mammary duct ectasia, idiopathic granulomatous mastitis (IGM), periductal mastitis, and tuberculous mastitis (Kasales et al., 2014; Scott, 2022; Shi et al., 2022). Although NPM is detected in only 4% to 5% of biopsies for benign breast diseases (Shi et al., 2022), its morbidity has kept rising over the last two decades, and it now occurs in adult women of all ages with a prolonged and recurrent course (Verghese and Ravikanth, 2012; Yuan et al., 2022). However, the etiology of NPM remains elusive, which hampers early diagnosis and subsequent treatment (Gopalakrishnan et al., 2015). Because of its heterogeneous etiology (e.g., microbial infections (Li et al., 2022; Tariq et al., 2022) and autoimmune responses (Chougule et al., 2015)), ambiguous clinical features (symptoms resembling those of invasive ductal carcinoma and inflammatory breast cancer) (Chen et al., 2023), and nonspecific imaging findings (non-mass enhancement or irregular rim enhancement with blurred margins) (Fazzio et al., 2016), making a definite diagnosis of NPM remains a clinical challenge.

Histopathological analysis is the gold standard for diagnosing NPM (Liang et al., 2022). However, core needle biopsy carries complications (e.g., bleeding, sinus formation, and pain), and because only limited lesion tissue is sampled, NPM can still be mistaken for malignant disease (Yuan et al., 2022). Meanwhile, deep learning models are making the diagnosis of NPM less invasive, more convenient, and less expensive. A nomogram based on multiparametric sonogram and radiomics features (lesion diameter, orientation, echogenicity, shape and tubular extension features, and the American College of Radiology Breast Imaging Reporting and Data System score) differentiates IGM from invasive breast cancer (IBC) well (Ma et al., 2023a); however, this model is not highly stable because sonographic variables vary among sonographers. Magnetic resonance imaging (MRI)-based whole-lesion histogram and texture analysis can differentiate IGM from IBC with an accuracy of 79.9%, but this analysis depends on high-quality manual segmentation, different MR systems, and single-shot diffusion weighted imaging (Zhao et al., 2020). MRI can rule out malignancy with a high sensitivity, but its specificity decreases in the absence of mass enhancement (Soylu et al., 2023). Accordingly, new non-invasive biomarkers and better-performing models are urgently needed for the diagnosis of NPM.

Tongue-coating microbiota are involved in the progression of systemic diseases (Shapira et al., 2013), such as rheumatic immunological disorders, respiratory, circulatory, urinary, and digestive system diseases as well as dental caries and other oral ailments (Gao et al., 2018). Mechanistic studies have revealed that some differentially enriched tongue-coating microbial species can serve as disease biomarkers, providing scientific evidence supporting the value of tongue diagnosis, a method in traditional Chinese medicine (TCM). The connection between tongue microbiota and NPM is still controversial (Betal and Macneill, 2011; Le Fleche-Mateos et al., 2012; Renshaw et al., 2011). Various diseases can be defined through tongue diagnosis (Liu et al., 2023), including IGM (Chen et al., 2022), indicating the possibility of using the “whole tongue” to diagnose NPM. However, no research has analyzed the diagnostic potential of tongue-coating microbiota for NPM.

Here we created a gradient boosting decision tree (GBDT) model, which encompassed significant clinical characteristics, “whole tongue” imaging, and microbiota features, and evaluated its clinical value in the early screening of NPM (Figure 1).

Figure 1

Materials and methods

Data

From April 2021 to November 2023, a total of 101 NPM patients from the Breast Surgery Department of Longhua Hospital Affiliated to Shanghai University of TCM and 103 healthy volunteers were recruited. All patients were pathologically diagnosed with NPM by needle biopsy or postoperative histopathological analysis. Healthy control participants had no clinically diagnosed diseases and were not taking medications. Exclusion criteria were (1) intake of glucocorticoids or antibiotics within the preceding month, (2) duodenal ulcer, gastric ulcer, gastrorrhagia, or another gastrointestinal disorder, (3) severe primary diseases or mental illness, (4) immune diseases, such as rheumatoid arthritis, systemic lupus erythematosus, and autoimmune skin diseases, (5) infection confirmed within the preceding 3 months, (6) concurrent acute periodontal disease, (7) eating, drinking, brushing teeth, or smoking before sampling, and (8) other abnormalities that might affect the tongue microbiota. All included participants received a face-to-face interview, underwent tongue imaging, and provided tongue coating samples. Finally, 100 participants were included in each group. The study protocol was approved by the Ethics Committee of Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine (2021LCSY047). All participants provided written informed consent.

A questionnaire survey was performed to collect clinical characteristics, including age, height, weight, waist circumference, hip circumference, systolic pressure (SP), and diastolic pressure (DP). Body mass index (BMI) was computed as weight in kilograms divided by the square of height in meters. Waist–hip ratio (WHR) was computed as waist circumference divided by hip circumference.
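The two derived indices can be illustrated with a minimal Python sketch. The function names and the example participant are hypothetical and are not taken from the study data; only the two formulas come from the text above.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight in kilograms / (height in meters) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def waist_hip_ratio(waist_cm: float, hip_cm: float) -> float:
    """WHR = waist circumference / hip circumference (same unit for both)."""
    if hip_cm <= 0:
        raise ValueError("hip circumference must be positive")
    return waist_cm / hip_cm


# Hypothetical participant (illustrative values only).
bmi = body_mass_index(weight_kg=60.0, height_m=1.60)
whr = waist_hip_ratio(waist_cm=80.0, hip_cm=100.0)
print(f"BMI = {bmi:.2f} kg/m^2, WHR = {whr:.2f}")
```

Note that height must be entered in meters for the BMI formula; entering centimeters would shrink the result by a factor of 10,000.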

Tongue image acquisition and quantitative analysis

Before the tongue-coating microbiota were sampled, trained researchers captured tongue images with a tongue diagnosis device (GMSX001, Shanghai National Health Company, Shanghai, China) (Figure 2), which contains a SONY IMX179 photosensitive chip and a closed light source with a color temperature of 5,600 K, an illumination of 1,200 lx, and a color rendering index greater than 85 Ra. All images were saved in JPG format. Each tongue was imaged at least twice. Images showing fogging, underexposure, overexposure, a stained tongue coating, or an abnormal tongue shape were removed.

Figure 2

We extracted the color and texture features of the tongue by applying Nahefa Cloud System V2.0 developed by Shanghai National Health Company. After color correction and image segmentation, the system automatically distinguished the tongue body from the tongue coating. The tongue image quantification system was constructed based on techniques of deep learning object detection (Wang, 2023), deep learning image segmentation (Chen et al., 2018), and deep learning image classification (He, 2016).

Three attending physicians in TCM labeled the tongue features according to TCM diagnostics. After labeling, three TCM experts spot-checked the labeling quality. The models were trained once the labeling was judged acceptable. Each model was evaluated with its own metric: mean average precision (mAP) for the detection model, accuracy (acc) for the classification model, and mean intersection over union (mIoU) for the segmentation model. Tongue and tongue coating colors, tongue coating texture, and tongue shape were calculated using the deep learning image classification model, which had been trained on the diagnostic results of medical experts (Figure 3).
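As an illustration of the segmentation metric named above, the following sketch computes mIoU for two small label masks. It is a generic definition and not the evaluation code of the Nahefa system; the masks and the class coding (0 = background, 1 = crack) are hypothetical.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], truth: Sequence[int], n_classes: int) -> float:
    """Mean intersection over union across classes present in either mask."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    ious = []
    for c in range(n_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


# Flattened hypothetical 2 x 4 masks: 0 = background, 1 = crack.
truth = [0, 0, 1, 1, 0, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0, 1, 0]
miou = mean_iou(pred, truth, n_classes=2)
print(f"mIoU = {miou:.2f}")
```

In this toy case each class has an intersection of 3 pixels and a union of 5 pixels, so both per-class IoU values and their mean equal 0.6.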

Figure 3

The tongue coating texture and tongue shape were calculated as follows:

The tongue surface area was divided into nine parts.
Each part was given a score by its own classification model.
The nine parts’ sum score was divided by “nine times the feature level” in order to obtain the quantitative value of the tongue features.
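The three steps above can be sketched in Python as follows. This is a minimal illustration that assumes "feature level" means the number of grades (the maximum score) of the feature, which the source does not define explicitly; the function name and the regional scores are hypothetical.

```python
from typing import Sequence


def tongue_feature_value(region_scores: Sequence[float], feature_level: int) -> float:
    """Sum of the nine regional scores divided by (9 x feature level).

    `feature_level` is ASSUMED to be the maximum grade of the feature;
    the source text does not spell this out.
    """
    if len(region_scores) != 9:
        raise ValueError("expected one score for each of the nine regions")
    if feature_level <= 0:
        raise ValueError("feature level must be positive")
    return sum(region_scores) / (9 * feature_level)


# Hypothetical scores for the nine tongue regions on a 3-level feature.
scores = [2, 3, 3, 2, 3, 3, 1, 2, 2]
value = tongue_feature_value(scores, feature_level=3)
print(f"quantitative feature value = {value:.3f}")
```

Under this assumption, if every regional score lies between 0 and the feature level, the resulting value lies between 0 and 1.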

The model detecting tongue indentation and spots was trained by manually using rectangular boxes to annotate abnormal-pixel positions. The tongue crack segmentation model was trained by manually marking the crack area (Figure 4).

Figure 4

The tongue features were then compared statistically between the two groups.

High-throughput sequencing for tongue-coating microbiota

The tongue-coating microbiota of each participant were sampled using sterile swabs, disposable mouth mirrors, cryopreservation tubes, ice packs, and a portable incubator. Participants were instructed beforehand not to brush their teeth or eat after getting up in the morning. On the day of sampling, participants had to be free of physical discomfort and must not have drunk, smoked, or chewed sweets before sampling. The participant first rinsed the mouth with sterile water three times (10 mL each time) to remove food debris. The researcher then rolled a sterile swab forward along the middle of the participant’s tongue three times (each wipe approximately 2 cm long) and repeated this movement twice. Afterward, the swab was transferred into a cryopreservation tube and immediately transported to a -80°C freezer in a portable incubator filled with ice packs. This process was completed within an hour. Repeated freezing and thawing of samples was avoided. The samples were placed in a portable Styrofoam box with dry ice, sent to the laboratories at Majorbio Bio-Pharm Technology Co. Ltd. (Shanghai, China) within a month, and stored in a freezer at -80°C until nucleic acid extraction.

Total DNA was extracted from tongue-coating microbiota samples using the E.Z.N.A.® Soil DNA Kit (Omega Bio-tek, Norcross, GA, USA) according to the manufacturer’s instructions. DNA quality and concentration were determined by agarose gel electrophoresis (1%) and a NanoDrop® ND-2000 spectrophotometer (Thermo Scientific Inc., USA). The V3–V4 hypervariable region of the bacterial 16S rRNA gene was amplified with the primer pair 338F (5′-ACTCCTACGGGAGGCAGCAG-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′) on an ABI GeneAmp® 9700 PCR thermocycler (ABI, CA, USA) (Liu et al., 2016). PCR amplification comprised initial denaturation at 95°C for 3 min; 27 cycles of denaturation at 95°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 45 s; a final extension at 72°C for 10 min; and a hold at 10°C. The 20 μL PCR reaction mixture contained 4 μL of 5× Fast Pfu buffer, 2 μL of 2.5 mM dNTPs, 0.8 μL of each primer (5 μM), 0.4 μL of Fast Pfu polymerase, 10 ng of template DNA, and ddH₂O to the final volume. All samples were amplified in triplicate. The PCR product was extracted from a 2% agarose gel, purified using the AxyPrep DNA Gel Extraction Kit (Axygen Biosciences, Union City, CA, USA), and quantified using a Quantus™ Fluorometer (Promega, USA). Purified amplicons were pooled in equal amounts and paired-end sequenced on an Illumina MiSeq PE300/NovaSeq PE250 platform (Illumina, San Diego, CA, USA) (Figure 5).

Figure 5

After demultiplexing, the resultant sequences were quality-filtered with Fastp (0.19.6) (Chen et al., 2018) and merged with FLASH (v1.2.11) (Magoc and Salzberg, 2011). The high-quality sequences were then denoised using the DADA2 plugin (Callahan et al., 2016) in the QIIME2 pipeline (version 2020.2) (Bolyen et al., 2019) with default parameters, which resolves sequences at single-nucleotide resolution based on the error profiles within the samples. DADA2-denoised sequences are usually called amplicon sequence variants (ASVs). The number of sequences from each sample was rarefied to 20,000 to minimize the impact of sequencing depth on alpha and beta diversity. Taxonomic classifications were assigned to ASVs using the naive Bayesian classifier and the SILVA 16S rRNA database (v138) with a confidence threshold of 70%. The metagenomic function was predicted using PICRUSt2 (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) (Douglas et al., 2020) based on the representative sequences and abundances of the ASVs. A series of statistical and visual analyses were carried out, including ASV analysis, species taxonomy analysis, community diversity analysis, species difference analysis, and model prediction analysis (Figure 6).
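Rarefaction to a fixed depth amounts to drawing reads at random without replacement. The sketch below shows the idea with the Python standard library only; it is not the Majorbio or QIIME2 implementation, and the ASV names, counts, and seed are hypothetical.

```python
import random
from collections import Counter
from typing import Dict


def rarefy(counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Subsample `depth` reads without replacement from one sample."""
    total = sum(counts.values())
    if total < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # Expand the count table into one label per read, then draw reads.
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


# Hypothetical sample with 25,000 reads rarefied to the depth used in the study.
sample = {"ASV1": 15000, "ASV2": 9000, "ASV3": 1000}
rarefied = rarefy(sample, depth=20000)
print(sum(rarefied.values()))
```

Samples with fewer reads than the chosen depth cannot be rarefied and raise an error here; how such samples were handled in the study is not stated.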

Figure 6

Statistical analysis

Clinical data were analyzed with R v4.2.1. Normally distributed data are described as mean ± SD (standard deviation), and other data as median with lower and upper quartiles. Differences between the two groups were assessed with Student’s t-test for continuous variables and the chi-square test for categorical variables. A P-value less than 0.05 was considered statistically significant.
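As a worked example, the pooled-variance Student's t statistic can be recomputed from summary statistics alone. The sketch below uses the age summary reported in Table 1 (healthy: 30.42 ± 6.18; NPM: 32.34 ± 4.85; 100 participants per group) and assumes that the pooled-variance form was the one applied. The P-value uses a normal approximation to the t distribution, which is an illustrative shortcut and not the method used in R.

```python
import math
from typing import Tuple


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Student's t statistic (pooled variance) and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df


# Age summary from Table 1: healthy group first, NPM group second.
t_stat, df = pooled_t(30.42, 6.18, 100, 32.34, 4.85, 100)

# Two-sided P-value by normal approximation (reasonable for df = 198).
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))
print(f"t = {t_stat:.3f}, df = {df}, approximate P = {p_approx:.3f}")
```

The statistic comes out near 2.44 with 198 degrees of freedom, and the approximate P-value is close to the 0.015 reported for age in Table 1.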

The quantitative and visual analyses of tongue images were realized by both GMSX001, a tongue diagnosis instrument, and software Nahefa Cloud System V2.0. Nahefa Cloud System V2.0 is constructed based on techniques of deep learning object detection (Wang, 2023), deep learning image segmentation (Chen et al., 2018), and deep learning image classification (He, 2016).

The tongue-coating microbiota data were subjected to bioinformatic analysis on the Majorbio Cloud platform. Mothur v1.30.2 and R v3.3.1 were used to analyze microbial diversity and calculate alpha diversity indices, including the Sobs, Chao, Ace, Shannon, and Simpson indices and Good’s coverage, based on ASV information (Schloss et al., 2009). The similarity among the microbial communities in different samples was determined by principal coordinate analysis (PCoA), principal component analysis (PCA), and non-metric multidimensional scaling analysis (NMDS) based on Bray–Curtis dissimilarity using Qiime software. The Wilcoxon rank-sum test was used to analyze the difference in microbial community structure between groups. To identify the significantly abundant bacterial taxa (phylum to genus) among the different groups, linear discriminant analysis (LDA) effect size (LEfSe) (Segata et al., 2011) was performed (LDA score > 3, P < 0.05).
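Several of the indices named above have simple closed forms, sketched below in plain Python. These are textbook definitions and not the Mothur or Qiime code. The Simpson index is written in its dominance form, an assumption that would be consistent with the small values in Table 3; the count vectors are hypothetical.

```python
import math
from typing import Sequence


def sobs(counts: Sequence[int]) -> int:
    """Observed richness: number of ASVs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts: Sequence[int]) -> float:
    """Shannon index H = -sum(p_i * ln p_i)."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def simpson_dominance(counts: Sequence[int]) -> float:
    """Simpson index D = sum(n_i * (n_i - 1)) / (N * (N - 1))."""
    n = sum(counts)
    return sum(c * (c - 1) for c in counts) / (n * (n - 1))


def goods_coverage(counts: Sequence[int]) -> float:
    """Good's coverage = 1 - (number of singleton ASVs / total reads)."""
    return 1 - sum(1 for c in counts if c == 1) / sum(counts)


def bray_curtis(a: Sequence[int], b: Sequence[int]) -> float:
    """Bray-Curtis dissimilarity between two samples over the same ASVs."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Hypothetical ASV counts for two samples.
s1 = [50, 30, 15, 4, 1]
s2 = [10, 40, 30, 20, 0]
print(sobs(s1), round(shannon(s1), 3), round(simpson_dominance(s1), 3),
      goods_coverage(s1), bray_curtis(s1, s2))
```

With the dominance form, a lower Simpson value indicates a more even community, which is the opposite direction to the Shannon index.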

Classical machine learning methods, including logistic regression (LR), support vector machine (SVM), and GBDT, were used to construct the NPM diagnostic models. Using stratified sampling, healthy participants and NPM patients were randomly assigned in an 8:2 ratio to a training set (n = 160) or an internal test set (n = 40) to compare the performance of the models. Predictive ability was assessed by the area under the receiver operating characteristic curve (AUROC) and decision curve analysis (DCA). The importance of each feature was derived from the machine learning diagnostic model. Python v3.7 and scikit-learn v1.0.2 were used for modeling.
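A stratified 8:2 split keeps the class ratio identical in both subsets. The sketch below reproduces the set sizes described above using the Python standard library; it is not the authors' code, and the seed and function name are hypothetical. In scikit-learn, a comparable result would typically be obtained with `train_test_split` and its `stratify` argument.

```python
import random
from typing import List, Sequence, Tuple


def stratified_split(labels: Sequence[int], test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train indices, test indices), sampling each class separately."""
    rng = random.Random(seed)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# 100 healthy participants (label 0) and 100 NPM patients (label 1).
labels = [0] * 100 + [1] * 100
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With 100 participants per class, the test set contains 20 healthy participants and 20 NPM patients, and the training set contains 80 of each.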

Reporting guidelines

This study strictly adhered to the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines for reporting observational data collection and analysis. Complete checklists are provided in Supplementary Table S1.

Results

Participant characteristics

A total of 200 participants were included, including 100 NPM patients and 100 healthy people. The clinical characteristics of all participants are shown in Table 1. The average age (SD) was 30.42 (6.18) years in the healthy group and 32.34 (4.85) years in the NPM group. The NPM group showed higher BMI, WHR, and SP (P < 0.001) but no significant difference in DP.

Table 1

Items	Health	NPM	P-value
Number (person)	100 (50%)	100 (50%)
Age (years)	30.42 (6.18)	32.34 (4.85)	0.015
Weight (kg)	55.40 [50, 62.25]	63.25 [55, 70]	<0.001
Height (m)	1.62 [1.60, 1.67]	1.60 [1.58, 1.64]	0.011
BMI (kg/m²)	20.96 [19.47, 23.42]	23.71 [21.64, 26.82]	<0.001
SP (mmHg)	109 [103, 117]	119.50 [110, 125.25]	<0.001
DP (mmHg)	71.41 (9.18)	73.87 (9.79)	0.068
Hip (cm)	92.50 [88, 99]	97.25 [93.38, 104.12]	<0.001
Waist (cm)	74 [68, 80]	84.50 [79, 91.62]	<0.001
WHR	0.79 (0.05)	0.86 (0.06)	<0.001

Clinical characteristics of included participants.

NPM, non-puerperal mastitis; BMI, body mass index; SP, systolic pressure; DP, diastolic pressure; WHR, waist–hip ratio.

Tongue image features

Tongue image features included tongue color, tongue coating color, tongue coating thickness, tongue shape, tongue spot, tongue crack, and tongue indentation. To unify the data format, the tongue color and tongue coating color were recorded as standard chromaticity values in the Lab color space specified by the International Commission on Illumination. In this space, “L” represents the lightness of the pixel, an increasing “A” value indicates a shift from green toward red, and an increasing “B” value indicates a shift from blue toward yellow (Billmeyer, 1983). These features were further subdivided according to the scores calculated by Nahefa Cloud System V2.0.

The NPM group showed lower tongue coating lightness (L) and fewer tongue spots but a yellower and thicker tongue coating than the healthy group (P < 0.05). No significant difference was observed in tongue color, tongue shape, tongue crack, or tongue indentation between groups (Table 2).

Table 2

Items	Health (N = 100)	NPM (N = 100)	P-value
Tongue coating color-L	105.15 (14.09)	99.99 (13.37)	0.009
Tongue coating color-A	139.30 [137.85,140.32]	138.50 [137.50,140.46]	0.207
Tongue coating color-B	105.39 [103.25,107.41]	106.91 [103.40,110.95]	0.016
Tongue color-L	49.55 [46.34, 53.50]	49.43 [46.35, 52.14]	0.458
Tongue color-A	15.80 [13.66, 17.07]	14.42 [12.43, 16.92]	0.104
Tongue color-B	4.26 [3.43, 5.27]	3.65 [0.78, 6.13]	0.126
Tongue coating thickness	0.72 [0.56, 0.86]	0.83 [0.74, 0.89]	<0.001
Tongue shape	0.85 [0.83, 1.20]	0.84 [0.50, 0.87]	0.094
Tongue spot	35.50 [23.00, 54.25]	27.50 [12.00, 46.25]	0.003
Tongue crack	0.02 [0.00, 0.20]	0.00 [0.00, 0.09]	0.118
Tongue indentation	3.00 [1.00, 4.00]	3.00 [2.00, 4.00]	0.449

Scores of tongue image features.

NPM, non-puerperal mastitis.

Tongue-coating microbiota profiles

ASVs were clustered at 97% similarity using Qiime2 (v2022.2) and subsampled to the minimum number of sequences per sample. A total of 10,889 ASVs were generated, with 1,276 and 8,519 ASVs in the NPM group and healthy group, respectively. The tongue-coating microbiota were further classified into one domain, one kingdom, 18 phyla, 29 classes, 73 orders, 129 families, 243 genera, and 542 species using the classify-sklearn (naïve Bayesian) algorithm. Pan/core analysis and rarefaction curve analysis suggested that the volume of sequencing data was large enough for further analysis (Figures 7A–C). The Wilcoxon rank-sum test showed significant differences in community richness, diversity, and coverage indices between the two groups (all P < 0.05, Table 3). In addition, PCA, PCoA, and NMDS showed a significant difference in the distribution and dispersion along the PC1/NMDS1 and PC2/NMDS2 axes as well as in the aggregation area between the two groups (P = 0.001), indicating a significant difference in microbial composition between the two groups (Figures 7D–F).

Figure 7

Table 3

Diversity index	NPM	Health	P-value	FDR
Sobs	134.62 ± 40.13	199.91 ± 74.78	<0.001	<0.001
Chao	137.27 ± 42.34	240.30 ± 111.00	<0.001	<0.001
Ace	137.46 ± 42.44	255.67 ± 131.68	<0.001	<0.001
Shannon	3.68 ± 0.32	3.55 ± 0.41	0.019	0.019
Simpson	0.05 ± 0.02	0.07 ± 0.04	0.006	0.007
Good’s coverage	1.00 ± 0.0005	1.00 ± 0.0003	<0.001	<0.001

α-Diversities in NPM and healthy groups.

NPM, non-puerperal mastitis; FDR, false discovery rate.

Wilcoxon rank-sum test was further performed to evaluate the difference between NPM and healthy groups at each taxonomic level. Differences in microbiota species were assessed using LEfSe with the Kruskal–Wallis sum-rank test and LDA scores >3. The microbiota profile of the NPM group was significantly different from that of the healthy group, with differences in nine phyla, 13 classes, 15 orders, 15 families, 15 genera, and 15 species (all P < 0.05, Figure 8).

Figure 8

To identify the differential bacterial taxa, we performed LEfSe analysis (multi-group comparison strategy: all-against-all, LDA > 3, P < 0.05, Figures 9A, B). Compared with the healthy group, the NPM group showed increases in the phyla Bacteroidota and Patescibacteria; the classes Bacteroidia, Saccharimonadia, Clostridia, Campylobacteria, and Gracilibacteria; the orders Bacteroidales, Campylobacterales, Pseudomonadales, and Absconditabacteriales_SR1; the families Prevotellaceae, Saccharimonadaceae, Fusobacteriaceae, Campylobacteraceae, and Moraxellaceae; and the genera Prevotella, Alloprevotella, TM7x, Fusobacterium, Campylobacter, Megasphaera, and Moraxella. The NPM group showed decreases in the phyla Proteobacteria and Actinobacteriota; the classes Gammaproteobacteria, Bacilli, Actinobacteria, and Alphaproteobacteria; the orders Lactobacillales, Micrococcales, Pasteurellales, Actinomycetales, and Staphylococcales; the families Streptococcaceae, Micrococcaceae, Pasteurellaceae, Actinomycetaceae, Carnobacteriaceae, and Gemellaceae; and the genera Streptococcus, Rothia, Haemophilus, Actinomyces, Granulicatella, and Gemella.

Figure 9

Performances of machine learning models in diagnosing NPM

To classify NPM patients, LR, SVM, and GBDT models were built using a combination of 44 features (clinical characteristics, tongue image features, and tongue-coating microbiota features). The features were selected through an integrated approach involving expert evaluation, literature review, and deep learning-based calculation. All models were trained and evaluated on the same training and test sets. The LR and GBDT models performed best (AUROC of 0.98), outperforming the SVM model (AUROC of 0.87; Table 4; Figure 10A).

Table 4

Model	Precision	Recall	Accuracy	Specificity	Sensitivity	AUROC
LR	0.95	0.95	0.95	0.90	1.00	0.98
SVM	0.80	0.80	0.80	0.75	0.85	0.87
GBDT	0.95	0.95	0.95	0.95	0.95	0.98

Performance of the three machine learning models.

AUROC, area under the receiver operating characteristic curve; LR, logistic regression; SVM, support vector machine; GBDT, gradient boosting decision tree.
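The entries in Table 4 can be cross-checked by rebuilding each confusion matrix from the reported sensitivity and specificity. The sketch below assumes that the 40-participant test set contained 20 NPM patients and 20 healthy participants, which follows from the stratified 8:2 split but is not stated per class in the source. Under that assumption, the Precision and Recall columns are consistent with macro-averaged values, which would explain why Recall differs from Sensitivity for the LR model; the source does not say which averaging was used.

```python
from typing import Dict


def metrics_from_rates(sensitivity: float, specificity: float,
                       n_pos: int = 20, n_neg: int = 20) -> Dict[str, float]:
    """Rebuild confusion-matrix counts and derived metrics.

    n_pos and n_neg are ASSUMED test-set class sizes (20 each).
    """
    tp = round(sensitivity * n_pos)
    fn = n_pos - tp
    tn = round(specificity * n_neg)
    fp = n_neg - tn
    precision_pos = tp / (tp + fp)  # precision for the NPM class
    precision_neg = tn / (tn + fn)  # precision for the healthy class
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "accuracy": (tp + tn) / (n_pos + n_neg),
        "macro_precision": (precision_pos + precision_neg) / 2,
        "macro_recall": (tp / n_pos + tn / n_neg) / 2,
    }


# (sensitivity, specificity) pairs as reported in Table 4.
table4 = {"LR": (1.00, 0.90), "SVM": (0.85, 0.75), "GBDT": (0.95, 0.95)}
results = {name: metrics_from_rates(*rates) for name, rates in table4.items()}
for name, m in results.items():
    print(name, round(m["accuracy"], 2), round(m["macro_precision"], 2),
          round(m["macro_recall"], 2))
```

For all three models, the recomputed accuracy, macro precision, and macro recall round to the values listed in Table 4.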

Figure 10

We next performed DCA to evaluate the clinical usefulness of the three models. The SVM model provided the least net benefit, whereas the GBDT model provided the greatest. At threshold probabilities between 0.6 and 1.0, the net benefit of the GBDT model was notably higher than that of the other two models at every threshold (Figure 10B).
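Decision curve analysis plots net benefit against the threshold probability. A minimal sketch of the standard net benefit formula, NB = TP/n - (FP/n) × pt/(1 - pt), is given below; the labels and predicted probabilities are hypothetical and are not the study's predictions.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating everyone with predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical labels (1 = NPM) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.65, 0.3, 0.2, 0.1]
nb = net_benefit(y_true, y_prob, threshold=0.6)
print(f"net benefit at threshold 0.6 = {nb:.4f}")
```

The weight pt/(1 - pt) grows without bound as the threshold approaches 1, so the formula is undefined at exactly 1.0 and a decision curve can only approach that endpoint.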

To evaluate GBDT models built on different feature sets, we compared seven models (A–G) based on clinical characteristics, tongue image features, bacterial genera, bacterial families, bacterial species, and combinations of these (Table 5). Models based on a single class of features or on clinical characteristics combined with tongue image features performed similarly (AUROC of 0.90 to 0.92, models A–D). Models E, F, and G, which combine bacterial genera, species, or families with tongue image features and clinical characteristics, demonstrated a higher diagnostic accuracy, indicating that bacterial features could improve the accuracy of the GBDT model. Models E and G, which combine clinical characteristics and tongue image features with bacterial genera (model E) or families (model G), demonstrated the highest accuracy (0.95), specificity (0.95), and sensitivity (0.95). However, model F, which incorporated clinical characteristics, tongue image features, and bacterial species, performed slightly worse (accuracy of 0.88 and sensitivity of 0.80). With the same amount of data, the diagnostic accuracy of the model decreased as the number of feature dimensions increased.

Table 5

Model	Features	Accuracy	Sensitivity	Specificity	AUROC
A	Clinical characteristics	0.85	0.80	0.90	0.90
B	Tongue image features	0.80	0.85	0.75	0.91
C	Bacterial species	0.85	0.85	0.85	0.90
D	Clinical characteristics + tongue image features	0.85	0.85	0.85	0.92
E	Clinical characteristics + tongue image features + bacterial genera	0.95	0.95	0.95	0.98
F	Clinical characteristics + tongue image features + bacterial species	0.88	0.80	0.95	0.98
G	Clinical characteristics + tongue image features + bacterial families	0.95	0.95	0.95	0.98

Performance of the GBDT models based on different combinations of features.

GBDT, gradient boosting decision tree; AUROC, area under the receiver operating characteristic curve.

The features most strongly associated with NPM risk in model E were Campylobacter (12%), WHR (11%), and waist circumference (10%), followed by Alloprevotella (6%), tongue coating color-L (5%), TM7x (4%), age (3%), Rothia (3%), BMI (3%), tongue color-L (2%), and tongue color-B (2%, Figure 11).

Figure 11

Discussion

NPM may arise from various etiological factors, ranging from infection to autoimmune disorders (Gopalakrishnan et al., 2015). The management of NPM is a thorny issue, and any misdiagnosis may lead to overtreatment, such as mastectomy (Bani-Hani et al., 2004). Our study is the first multi-modal analysis integrating tongue-coating microbiota and tongue image features from NPM patients and healthy people. We identified a cluster of microbial species and a list of tongue phenotypes associated with NPM. In addition, by combining the clinical, tongue image, and tongue-coating microbiota features, we established a GBDT model that showed a strong ability to screen for NPM. This model is non-invasive, simple, and accurate, which makes it well suited to large-scale NPM screening.

In our study, the mean WHR of NPM patients reached 0.86, suggesting an association between central obesity and NPM risk. WHR, the ratio of waist circumference to hip circumference, is an effective measure of central obesity and is used to relate body fat distribution to the risk of various metabolic diseases. Even in a subject with a normal BMI, a higher WHR increases the risk of premature death (Gazarova et al., 2022). According to the World Health Organization standard, the WHR of women should not exceed 0.85 (Nishida et al., 2010). Our results showed that WHR and waist circumference differed significantly between the two groups. Studies have shown that obesity is a risk factor for NPM (Jiao et al., 2023). Obesity may directly damage immune function in the breast (Liu et al., 2017). Adipose tissue accumulates during development, but excessive accumulation may lead to hypoxia that increases the production of inflammatory factors and decreases that of anti-inflammatory factors, thus triggering inflammatory responses (Nishimura et al., 2008). Obesity also favors the development of mild chronic inflammation. Adipokines secreted by adipose tissue, such as visfatin, leptin, and acylated proteins, may disrupt neuroendocrine activities, thus inducing systemic inflammatory and immune responses (Fan et al., 2014; Liu et al., 2017). In addition, interferon-γ secreted by adipose tissue can act directly on estrogen receptors in the breast, thereby dysregulating estrogen and progesterone levels and evoking local immune responses and hypersensitivity (Brown, 2014).

In this study, blood pressure was within the normal range in both groups, although SP was slightly higher in the NPM group. Similarly, one study showed that the incidence of hypertension is slightly higher in NPM patients than in patients with benign breast masses (OR, 2.221; 95% CI, 1.318–3.741; P = 0.003) (Shi et al., 2022). The association between hypertension and NPM needs further study. Additionally, hypertension may increase the risk of breast cancer by 15% in women (Han et al., 2017), which may be explained by the fact that breast cancer and hypertension are driven by similar pathophysiological pathways, such as chronic inflammation mediated by adipose tissue (Balkwill et al., 2005; Largent et al., 2006; Li et al., 2005).

Moreover, we combined tongue image and tongue-coating microbiota features for diagnosing NPM. As a fundamental TCM method, tongue diagnosis is a convenient and non-invasive way to reveal pathological changes in internal organs and to warn of disease at an early stage (Han et al., 2016; Zhang and Zhang, 2015). Tongue diagnosis is still used to evaluate patients’ physical condition and disease stage (Huang et al., 2022; Jiang et al., 2021; Zhang et al., 2022). However, it is subjective, and its accuracy can be affected by many factors, such as the lighting in the clinic. Machine learning allows an objective evaluation of the tongue condition. Classical machine learning algorithms are powerful in analyzing structured data (Bini, 2018) and image features, and they can also mine complex datasets (Dobrescu et al., 2020; Gao et al., 2020). In the present study, between-group differences were observed in the quantitative features of tongue images. The NPM group showed a yellower and thicker tongue coating than the healthy group. Tongue coating consists of accumulated exfoliated mucosal cells, debris, and proliferating microorganisms (Negrato and Tarzia, 2010). Medical studies have shown that the tongue coating is associated with the occurrence and prognosis of various diseases (Ali et al., 2021; Chen et al., 2024). According to TCM theory, a thick tongue coating usually accompanies phlegm-dampness and blood stasis (Anastasi et al., 2009; Kirschbaum, 2010), while a yellow tongue coating reflects interior heat (Jiang et al., 2012; Ye et al., 2016). These tongue features are consistent with the TCM pathology of NPM, which manifests as a combination of heat, phlegm, and blood stasis.

Moreover, our results showed that the NPM group had fewer tongue spots than the healthy group. Tongue spots originate in the fungiform papillae, which enlarge and protrude to form awn-like spikes (Shahbake et al., 2005). In TCM, tongue spots indicate heat in the blood or excess heat in the internal organs (Wang et al., 2022). The number of tongue spots has been used in evaluating breast cancer (Lo et al., 2013). In this study, most patients had long-standing NPM, which, in TCM terms, may have depleted Qi and blood to the point that fewer tongue spots could form.

Compared with tongue diagnosis, indices of the tongue-coating microbiota are more objective for diagnosing NPM. We found significant between-group differences at multiple taxonomic levels, including nine phyla, 13 classes, 15 orders, 15 families, 15 genera, and 15 species. The differing genera included Actinomyces, Alloprevotella, Campylobacter, Fusobacterium, Gemella, Granulicatella, Haemophilus, Megasphaera, Moraxella, Prevotella, Rothia, Streptococcus, and TM7x. Among them, Campylobacter, Alloprevotella, TM7x, and Rothia were most closely associated with NPM risk in the model.

Oral Campylobacter species, also termed “emerging Campylobacter species”, can cause infections that may have been underreported (Costa and Iraola, 2019). In addition to periodontitis, oral Campylobacter species have been associated with extraoral infections and conditions, including gastroenteritis, inflammatory bowel disease, Barrett’s esophagus, appendicitis, Crohn’s disease, ulcerative colitis, empyema thoracis, cerebral microbleeds in stroke patients, peritonitis, and bone abscesses (Castano-Rodriguez et al., 2017; Kaakoush et al., 2015; Lam et al., 2011; Shiga et al., 2020; Warren et al., 2013). Beyond their direct pathogenicity, oral microbes and their metabolites can enter the systemic circulation, thereby inducing and aggravating inflammation (Gao et al., 2022). Pathogenic oral bacteria can induce the production of proinflammatory factors. IL-6, whose level correlates positively with the abundance of Alloprevotella (Ye et al., 2024), is upregulated in both the serum and breast tissues of NPM patients (Liu et al., 2024). The enrichment of Alloprevotella in diarrheal irritable bowel syndrome suggests that Alloprevotella may exert pro-inflammatory effects (Tang et al., 2023).

TM7x, a member of the phylum Saccharibacteria (TM7), is involved in the host immune response (Domenech et al., 2013; He et al., 2015). In vivo, TM7x may directly repress the inflammatory response by forming a biofilm that hinders immune activation (Domenech et al., 2013). TM7x also inhibits the expression of TNF-α induced in macrophages by its host bacterium XH001, thus achieving immune escape (He et al., 2015). The genus Rothia comprises Gram-positive aerobic bacteria commonly found in the oral cavity and respiratory tract. These bacteria can act as opportunistic pathogens and contribute to a range of infections, including endocarditis, pneumonia, peritonitis, and septicemia, particularly in immunocompromised individuals (Fatahi-Bafghi, 2021). Given the close connections between Campylobacter, Alloprevotella, TM7x, Rothia, and inflammatory diseases, further research is needed to determine whether these bacteria contribute to NPM as opportunistic pathogens or by triggering a systemic immune response and the production of pro-inflammatory factors.

Deep learning technology, owing to its ability to process large amounts of data and identify relationships hidden deep inside biological data, has been used in the biochemical analysis of natural products, disease diagnosis, and treatment (Ma et al., 2023b; Seetharam et al., 2019). In this study, we constructed GBDT, SVM, and LR models for the diagnosis of NPM. GBDT, a classic algorithm proposed by Friedman of Stanford University, performs well in classification, regression, and feature selection. It can quantify the importance of each feature in a classification or regression model by averaging that feature’s weight across the decision trees (Friedman, 2001). In the present study, the GBDT model performed best. Both the GBDT and LR models showed high precision and recall and low false-negative and false-positive rates in detecting NPM. However, the interactions among multimodal data (tongue images, microbiota, and clinical features) may follow highly complex nonlinear patterns. GBDT may capture such intricate relationships more accurately, whereas the linear assumptions of LR might constrain diagnostic performance. Across the different combinations of clinical, tongue image, and tongue-coating microbiota features, all of the GBDT models were highly sensitive, accurate, and specific for NPM, suggesting strong discriminative and predictive performance.
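The evaluation terms used in this paragraph can be made concrete with a small sketch. The code below is illustrative only and is not the authors' pipeline: the names `ConfusionCounts`, `diagnostic_metrics`, and `mean_feature_importance` are hypothetical, and the numbers in the usage example are invented rather than taken from the study. It shows how precision, recall, and the false-positive and false-negative rates follow from a 2x2 confusion matrix, and how a feature's importance can be averaged over the trees of an ensemble.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary NPM-vs-healthy classifier (hypothetical layout)."""
    tp: int  # NPM patients predicted as NPM
    fp: int  # healthy volunteers predicted as NPM
    tn: int  # healthy volunteers predicted as healthy
    fn: int  # NPM patients predicted as healthy


def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c):
    """Derive the standard rates from a 2x2 confusion matrix."""
    return {
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),  # also called sensitivity
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
        "false_negative_rate": _ratio(c.fn, c.fn + c.tp),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
    }


def mean_feature_importance(per_tree):
    """Average each feature's score over all trees.

    `per_tree` is a list with one dict per tree, mapping feature name to
    that tree's importance score; a feature a tree never uses counts as 0.
    """
    totals = {}
    for tree in per_tree:
        for name, score in tree.items():
            totals[name] = totals.get(name, 0.0) + score
    return {name: total / len(per_tree) for name, total in totals.items()}


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(diagnostic_metrics(ConfusionCounts(tp=18, fp=2, tn=18, fn=2)))
    # Invented per-tree scores for three hypothetical features.
    print(mean_feature_importance([
        {"coating_thickness": 0.6, "Campylobacter": 0.4},
        {"coating_thickness": 0.2, "BMI": 0.8},
    ]))
```

Reporting the false-negative rate alongside precision matters for a screening tool, because a missed NPM case and a false alarm carry different clinical costs.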

However, our study has some limitations. Given the potential heterogeneity of NPM and the inherent variability of tongue features, the sample in each group was relatively small, which may make the models unstable. Future work will involve external validation of the model in larger cohorts. The limited interpretability of GBDT remains a critical concern for clinical adoption. The association between the non-invasive biomarkers and NPM remains to be investigated in clinical and experimental studies. Tongue image features may vary with tongue position and other factors, which calls for standard operating procedures for image acquisition.

Conclusion

The GBDT model incorporating clinical characteristics, “whole tongue” images, and tongue-coating microbiota may serve as a reliable tool for the early screening and diagnosis of NPM.

Statements

Data availability statement

The data presented in the study are deposited in the NCBI repository, accession number PRJNA1291263. Further inquiries can be directed to the corresponding authors.

Ethics statement

The studies involving humans were approved by The Ethics Review Board of Longhua Hospital Affiliated to Shanghai Traditional Chinese Medicine University (2021LCSY047). The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

Author contributions

ST: Data curation, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing. YY: Conceptualization, Investigation, Methodology, Writing – original draft, Writing – review & editing. LM: Investigation, Validation, Writing – review & editing. HC: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – review & editing. MY: Conceptualization, Project administration, Resources, Supervision, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was partially supported by the National Natural Science Foundation of China (82104854); the Second Major Clinical Research Project of “Three-year Action Plan for Promoting Clinical Skills and Clinical Innovation in Municipal Hospitals” (SHDC2020CR2051B); and the Sailing Program, Scientific and Innovative Action Plan of Shanghai (20YF1449800).

Acknowledgments

Tongue image data collection and analysis for this project was supported by Shanghai National Health Company.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcimb.2025.1602883/full#supplementary-material

Supplementary Table 1

STROBE checklist of items that should be included in the reports of observational studies.

References

1
Ali M. M., Al K. S., Al-Qadhi G. (2021). Tongue-coating microbiome as a cancer predictor: a scoping review. Arch. Oral. Biol. 132, 105271. doi: 10.1016/j.archoralbio.2021.105271
2
Anastasi J. K., Currie L. M., Kim G. H. (2009). Understanding diagnostic reasoning in tcm practice: tongue diagnosis. Altern. Ther. Health Med. 15, 18–28.
3
Balkwill F., Charles K. A., Mantovani A. (2005). Smoldering and polarized inflammation in the initiation and promotion of malignant disease. Cancer Cell 7, 211–217. doi: 10.1016/j.ccr.2005.02.013
4
Bani-Hani K. E., Yaghan R. J., Matalka I. I., Shatnawi N. J. (2004). Idiopathic granulomatous mastitis: time to avoid unnecessary mastectomies. Breast J. 10, 318–322. doi: 10.1111/j.1075-122X.2004.21336.x
5
Betal D., Macneill F. A. (2011). Chronic breast abscess due to mycobacterium fortuitum: a case report. J. Med. Case Rep. 5, 188. doi: 10.1186/1752-1947-5-188
6
Billmeyer J. F. W. (1983). Color science: concepts and methods, quantitative data and formulae, 2nd ed., By gunter wyszecki and w. S. Stiles, john wiley and sons, new yor. Color Res. Appl. 8, 262–263. doi: 10.1002/col.5080080421
7
Bini S. A. (2018). Artificial intelligence, machine learning, deep learning, and cognitive computing: what do these terms mean and how will they impact health care? J. Arthroplasty 33, 2358–2361. doi: 10.1016/j.arth.2018.02.067
8
Bolyen E., Rideout J. R., Dillon M. R., Bokulich N. A., Abnet C. C., Al-Ghalith G. A., et al. (2019). Reproducible, interactive, scalable and extensible microbiome data science using qiime 2. Nat. Biotechnol. 37, 852–857. doi: 10.1038/s41587-019-0209-9
9
Brown K. A. (2014). Impact of obesity on mammary gland inflammation and local estrogen production. J. Mammary Gland Biol. Neoplasia 19, 183–189. doi: 10.1007/s10911-014-9321-0
10
Callahan B. J., McMurdie P. J., Rosen M. J., Han A. W., Johnson A. J., Holmes S. P. (2016). Dada2: high-resolution sample inference from illumina amplicon data. Nat. Methods 13, 581–583. doi: 10.1038/nmeth.3869
11
Castano-Rodriguez N., Kaakoush N. O., Lee W. S., Mitchell H. M. (2017). Dual role of helicobacter and campylobacter species in ibd: a systematic review and meta-analysis. Gut 66, 235–249. doi: 10.1136/gutjnl-2015-310545
12
Chen J., Sun Y., Li J., Lyu M., Yuan L., Sun J., et al. (2024). In-depth metaproteomics analysis of tongue coating for gastric cancer: a multicenter diagnostic research study. Microbiome 12, 6. doi: 10.1186/s40168-023-01730-8
13
Chen J., Yang J., Qin Y., Sun C., Xu J., Zhou X., et al. (2022). Tongue features of patients with granulomatous lobular mastitis. Med. (Baltimore) 101, e31327. doi: 10.1097/MD.0000000000031327
14
Chen L. C., Papandreou G., Kokkinos I., Murphy K., Yuille A. L. (2018). Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 40, 834–848. doi: 10.1109/TPAMI.2017.2699184
15
Chen S., Zhou Y., Chen Y., Gu J. (2018). Fastp: an ultra-fast all-in-one fastq preprocessor. Bioinformatics 34, i884–i890. doi: 10.1093/bioinformatics/bty560
16
Chen W., Zhang D., Zeng Y., Cui J., Yu J., Wang J., et al. (2023). Clinical characteristics and microbiota analysis of 44 patients with granulomatous mastitis. Front. Microbiol. 14. doi: 10.3389/fmicb.2023.1175206
17
Chougule A., Bal A., Das A., Singh G. (2015). Igg4 related sclerosing mastitis: expanding the morphological spectrum of igg4 related diseases. Pathology 47, 27–33. doi: 10.1097/PAT.0000000000000187
18
Costa D., Iraola G. (2019). Pathogenomics of emerging campylobacter species. Clin. Microbiol. Rev. 32, e00072-18. doi: 10.1128/CMR.00072-18
19
Dobrescu A., Giuffrida M. V., Tsaftaris S. A. (2020). Doing more with less: a multitask deep learning approach in plant phenotyping. Front. Plant Sci. 11. doi: 10.3389/fpls.2020.00141
20
Domenech M., Ramos-Sevillano E., Garcia E., Moscoso M., Yuste J. (2013). Biofilm formation avoids complement immunity and phagocytosis of streptococcus pneumoniae. Infect. Immun. 81, 2606–2615. doi: 10.1128/IAI.00491-13
21
Douglas G. M., Maffei V. J., Zaneveld J. R., Yurgel S. N., Brown J. R., Taylor C. M., et al. (2020). Picrust2 for prediction of metagenome functions. Nat. Biotechnol. 38, 685–688. doi: 10.1038/s41587-020-0548-6
22
Fan L., Strasser-Weippl K., Li J. J., St L. J., Finkelstein D. M., Yu K. D., et al. (2014). Breast cancer in China. Lancet Oncol. 15, e279–e289. doi: 10.1016/S1470-2045(13)70567-9
23
Fatahi-Bafghi M. (2021). Characterization of the rothia spp. and their role in human clinical infections. Infect. Genet. Evol. 93, 104877. doi: 10.1016/j.meegid.2021.104877
24
Fazzio R. T., Shah S. S., Sandhu N. P., Glazebrook K. N. (2016). Idiopathic granulomatous mastitis: imaging update and review. Insights Imaging 7, 531–539. doi: 10.1007/s13244-016-0499-0
25
Gao J., French A. P., Pound M. P., He Y., Pridmore T. P., Pieters J. G. (2020). Deep convolutional neural networks for image-based convolvulus sepium detection in sugar beet fields. Plant Methods 16, 29. doi: 10.1186/s13007-020-00570-z
26
Gao X., Hu Y., Tao Y., Liu S., Chen H., Li J., et al. (2022). Cymbopogon citratus (dc.) Stapf aqueous extract ameliorates loperamide-induced constipation in mice by promoting gastrointestinal motility and regulating the gut microbiota. Front. Microbiol. 13. doi: 10.3389/fmicb.2022.1017804
27
Gao L., Xu T., Huang G., Jiang S., Gu Y., Chen F. (2018). Oral microbiomes: more and more importance in oral cavity and whole body. Protein Cell 9, 488–500. doi: 10.1007/s13238-018-0548-1
28
Gažarová M., Bihari M., Lorková M., Lenártová P., Habánová M. (2022). The use of different anthropometric indices to assess the body composition of young women in relation to the incidence of obesity, sarcopenia and the premature mortality risk. Int. J. Environ. Res. Public Health 19, 12449. doi: 10.3390/ijerph191912449
29
Gopalakrishnan N. C., Jacob P., Menon R. R. (2015). Inflammatory diseases of the non-lactating female breasts. Int. J. Surg. 13, 8–11. doi: 10.1016/j.ijsu.2014.11.022
30
Han H., Guo W., Shi W., Yu Y., Zhang Y., Ye X., et al. (2017). Hypertension and breast cancer risk: a systematic review and meta-analysis. Sci. Rep. 7, 44877. doi: 10.1038/srep44877
31
Han S., Yang X., Qi Q., Pan Y., Chen Y., Shen J., et al. (2016). Potential screening and early diagnosis method for cancer: tongue diagnosis. Int. J. Oncol. 48, 2257–2264. doi: 10.3892/ijo.2016.3466
32
He K., Zhang X., Ren S., Sun J. (2016). “Deep Residual Learning for Image Recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA. 770–778. doi: 10.1109/CVPR.2016.90
33
He X., McLean J. S., Edlund A., Yooseph S., Hall A. P., Liu S. Y., et al. (2015). Cultivation of a human-associated tm7 phylotype reveals a reduced genome and epibiotic parasitic lifestyle. Proc. Natl. Acad. Sci. U.S.A. 112, 244–249. doi: 10.1073/pnas.1419038112
34
Huang Y. S., Wu H. K., Chang H. H., Lee T. C., Huang S. Y., Chiang J. Y., et al. (2022). Exploring the pivotal variables of tongue diagnosis between patients with acute ischemic stroke and health participants. J. Tradit Complement Med. 12, 505–510. doi: 10.1016/j.jtcme.2022.04.001
35
Friedman J. H. (2001). Greedy function approximation: a gradient boosting machine. Ann. Stat 29, 1189–1232. doi: 10.1214/aos/1013203451
36
Jiang T., Guo X. J., Tu L. P., Lu Z., Cui J., Ma X. X., et al. (2021). Application of computer tongue image analysis technology in the diagnosis of nafld. Comput. Biol. Med. 135, 104622. doi: 10.1016/j.compbiomed.2021.104622
37
Jiang B., Liang X., Chen Y., Ma T., Liu L., Li J., et al. (2012). Integrating next-generation sequencing and traditional tongue diagnosis to determine tongue coating microbiome. Sci. Rep. 2, 936. doi: 10.1038/srep00936
38
Jiao Y., Chang K., Jiang Y., Zhang J. (2023). Identification of periductal mastitis and granulomatous lobular mastitis: a literature review. Ann. Transl. Med. 11, 158. doi: 10.21037/atm-22-6473
39
Kaakoush N. O., Castano-Rodriguez N., Man S. M., Mitchell H. M. (2015). Is campylobacter to esophageal adenocarcinoma as helicobacter is to gastric adenocarcinoma? Trends Microbiol. 23, 455–462. doi: 10.1016/j.tim.2015.03.009
40
Kasales C. J., Han B., Smith J. J., Chetlen A. L., Kaneda H. J., Shereef S. (2014). Nonpuerperal mastitis and subareolar abscess of the breast. Ajr Am. J. Roentgenol 202, W133–W139. doi: 10.2214/AJR.13.10551
41
Kirschbaum B. (2010). Atlas of chinese tongue diagnosis. 2nd ed (Seattle: Eastland Press).
42
Lam J. Y., Wu A. K., Ngai D. C., Teng J. L., Wong E. S., Lau S. K., et al. (2011). Three cases of severe invasive infections caused by campylobacter rectus and first report of fatal C. rectus infection. J. Clin. Microbiol. 49, 1687–1691. doi: 10.1128/JCM.02487-10
43
Largent J. A., McEligot A. J., Ziogas A., Reid C., Hess J., Leighton N., et al. (2006). Hypertension, diuretics and breast cancer risk. J. Hum. Hypertens. 20, 727–732. doi: 10.1038/sj.jhh.1002075
44
Le Fleche-Mateos A., Berthet N., Lomprez F., Arnoux Y., Le Guern A. S., Leclercq I., et al. (2012). Recurrent breast abscesses due to corynebacterium kroppenstedtii, a human pathogen uncommon in caucasian women. Case Rep. Infect. Dis. 2012, 120968. doi: 10.1155/2012/120968
45
Li J. J., Fang C. H., Hui R. T. (2005). Is hypertension an inflammatory disease? Med. Hypotheses 64, 236–240. doi: 10.1016/j.mehy.2004.06.017
46
Li X., Yuan J., Fu A., Wu H. L., Liu R., Liu T. G., et al. (2022). New insights of corynebacterium kroppenstedtii in granulomatous lobular mastitis based on nanopore sequencing. J. Invest. Surg. 35, 639–646. doi: 10.1080/08941939.2021.1921082
47
Liang Y., Zhan H., Krishnamurti U., Harigopal M., Sun T. (2022). Further characterization of clinicopathologic features of cystic neutrophilic granulomatous mastitis. Am. J. Clin. Pathol. 158, 488–493. doi: 10.1093/ajcp/aqac074
48
Liu Q., Li Y., Yang P., Liu Q., Wang C., Chen K., et al. (2023). A survey of artificial intelligence in tongue image for disease diagnosis and syndrome differentiation. Digit Health 9, 589834756. doi: 10.1177/20552076231191044
49
Liu R., Luo Z., Dai C., Wei Y., Yan S., Kuang X., et al. (2024). Corynebacterium parakroppenstedtii secretes a novel glycolipid to promote the development of granulomatous lobular mastitis. Signal Transduct Target Ther. 9, 292. doi: 10.1038/s41392-024-01984-0
50
Liu C., Zhao D., Ma W., Guo Y., Wang A., Wang Q., et al. (2016). Denitrifying sulfide removal process on high-salinity wastewaters in the presence of halomonas sp. Appl. Microbiol. Biotechnol. 100, 1421–1426. doi: 10.1007/s00253-015-7039-6
51
Liu L., Zhou F., Wang P., Yu L., Ma Z., Li Y., et al. (2017). Periductal mastitis: an inflammatory disease related to bacterial infection and consequent immune responses? Mediators Inflammation 2017, 5309081. doi: 10.1155/2017/5309081
52
Lo L. C., Cheng T. L., Chiang J. Y., Damdinsuren N. (2013). Breast cancer index: a perspective on tongue diagnosis in traditional chinese medicine. J. Tradit Complement Med. 3, 194–203. doi: 10.4103/2225-4110.114901
53
Ma S., Liu J., Li W., Liu Y., Hui X., Qu P., et al. (2023b). Machine learning in tcm with natural products and molecules: current status and future perspectives. Chin. Med. 18, 43. doi: 10.1186/s13020-023-00741-9
54
Ma Q., Lu X., Qin X., Xu X., Fan M., Duan Y., et al. (2023a). A sonogram radiomics model for differentiating granulomatous lobular mastitis from invasive breast cancer: a multicenter study. Radiol. Med. 128, 1206–1216. doi: 10.1007/s11547-023-01694-7
55
Magoc T., Salzberg S. L. (2011). Flash: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957–2963. doi: 10.1093/bioinformatics/btr507
56
Negrato C. A., Tarzia O. (2010). Buccal alterations in diabetes mellitus. Diabetol. Metab. Syndr. 2, 3. doi: 10.1186/1758-5996-2-3
57
Nishida C., Ko G. T., Kumanyika S. (2010). Body fat distribution and noncommunicable diseases in populations: overview of the 2008 who expert consultation on waist circumference and waist-hip ratio. Eur. J. Clin. Nutr. 64, 2–5. doi: 10.1038/ejcn.2009.139
58
Nishimura S., Manabe I., Nagasaki M., Seo K., Yamashita H., Hosoya Y., et al. (2008). In vivo imaging in mice reveals local cell dynamics and inflammation in obese adipose tissue. J. Clin. Invest. 118, 710–721. doi: 10.1172/JCI33328
59
Renshaw A. A., Derhagopian R. P., Gould E. W. (2011). Cystic neutrophilic granulomatous mastitis: an underappreciated pattern strongly associated with gram-positive bacilli. Am. J. Clin. Pathol. 136, 424–427. doi: 10.1309/AJCP1W9JBRYOQSNZ
60
Schloss P. D., Westcott S. L., Ryabin T., Hall J. R., Hartmann M., Hollister E. B., et al. (2009). Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75, 7537–7541. doi: 10.1128/AEM.01541-09
61
Scott D. M. (2022). Inflammatory diseases of the breast. Best Pract. Res. Clin. Obstet Gynaecol 83, 72–87. doi: 10.1016/j.bpobgyn.2021.11.013
62
Seetharam K., Kagiyama N., Sengupta P. P. (2019). doi: 10.1530/ERP-18-0081
63
Segata N., Izard J., Waldron L., Gevers D., Miropolsky L., Garrett W. S., et al. (2011). Metagenomic biomarker discovery and explanation. Genome Biol. 12, R60. doi: 10.1186/gb-2011-12-6-r60
64
Shahbake M., Hutchinson I., Laing D. G., Jinks A. L. (2005). Rapid quantitative assessment of fungiform papillae density in the human tongue. Brain Res. 1052, 196–201. doi: 10.1016/j.brainres.2005.06.031
65
Shapira I., Sultan K., Lee A., Taioli E. (2013). Evolving concepts: how diet and the intestinal microbiome act as modulators of breast malignancy. Isrn Oncol. 2013, 693920. doi: 10.1155/2013/693920
66
Shi L., Wu J., Hu Y., Zhang X., Li Z., Xi P. W., et al. (2022). Biomedical indicators of patients with non-puerperal mastitis: a retrospective study. Nutrients 14, 4816. doi: 10.3390/nu14224816
67
Shiga Y., Hosomi N., Nezu T., Nishi H., Aoki S., Nakamori M., et al. (2020). Association between periodontal disease due to campylobacter rectus and cerebral microbleeds in acute stroke patients. PloS One 15, e239773. doi: 10.1371/journal.pone.0239773
68
Soylu B. F., Esen I. G., Kayadibi Y., Tasdelen I., Alver D. (2023). Idiopathic granulomatous mastitis or breast cancer? A comparative mri study in patients presenting with non-mass enhancement. Diagnostics (Basel) 13, 1475. doi: 10.3390/diagnostics13081475
69
Tang B., Hu Y., Chen J., Su C., Zhang Q., Huang C. (2023). Oral and fecal microbiota in patients with diarrheal irritable bowel syndrome. Heliyon 9, e13114. doi: 10.1016/j.heliyon.2023.e13114
70
Tariq H., Menon P. D., Fan H., Vadlamudi K. V., Pandeswara S. L., Nazarullah A. N., et al. (2022). Detection of corynebacterium kroppenstedtii in granulomatous lobular mastitis using real-time polymerase chain reaction and sanger sequencing on formalin-fixed, paraffin-embedded tissues. Arch. Pathol. Lab. Med. 146, 749–754. doi: 10.5858/arpa.2021-0061-OA
71
Verghese B. G., Ravikanth R. (2012). Breast abscess, an early indicator for diabetes mellitus in non-lactating women: a retrospective study from rural India. World J. Surg. 36, 1195–1198. doi: 10.1007/s00268-012-1502-7
72
Wang C. Y., Bochkovskiy A., Mark Liao H. Y. (2023). “Yolov7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors,” in 2023 ieee/cvf conference on computer vision and pattern recognition (Cvpr), Vancouver, BC, Canada. 7464–7475. doi: 10.1109/CVPR52729.2023.00721
73
Wang X., Luo S., Tian G., Rao X., He B., Sun F. (2022). Deep learning based tongue prickles detection in traditional chinese medicine. Evid Based Complement Alternat Med. 2022, 5899975. doi: 10.1155/2022/5899975
74
Warren R. L., Freeman D. J., Pleasance S., Watson P., Moore R. A., Cochrane K., et al. (2013). Co-occurrence of anaerobic bacteria in colorectal carcinomas. Microbiome 1, 16. doi: 10.1186/2049-2618-1-16
75
Ye J., Cai X., Yang J., Sun X., Hu C., Xia J., et al. (2016). Bacillus as a potential diagnostic marker for yellow tongue coating. Sci. Rep. 6, 32496. doi: 10.1038/srep32496
76
Ye J., Lv Y., Xie H., Lian K., Xu X. (2024). Whole-genome metagenomic analysis of the oral microbiota in patients with obstructive sleep apnea comorbid with major depressive disorder. Nat. Sci. Sleep 16, 1091–1108. doi: 10.2147/NSS.S474052
77
Yuan Q. Q., Xiao S. Y., Farouk O., et al. (2022). Management of granulomatous lobular mastitis: an international multidisciplinary consensus (2022 edition). Mil Med. Res. 9, 20. doi: 10.1186/s40779-022-00380-5
78
Zhang G., He X., Li D., Tian C., Wei B. (2022). Automated screening of covid-19-based tongue image on chinese medicine. BioMed. Res. Int. 2022, 6825576. doi: 10.1155/2022/6825576
79
Zhang B., Zhang H. (2015). Significant geometry features in tongue image analysis. Evid Based Complement Alternat Med. 2015, 897580. doi: 10.1155/2015/897580
80
Zhao Q., Xie T., Fu C., Chen L., Bai Q., Grimm R., et al. (2020). Differentiation between idiopathic granulomatous mastitis and invasive breast carcinoma, both presenting with non-mass enhancement without rim-enhanced masses: the value of whole-lesion histogram and texture analysis using apparent diffusion coefficient. Eur. J. Radiol. 123, 108782. doi: 10.1016/j.ejrad.2019.108782

Keywords

non-puerperal mastitis, tongue diagnosis, tongue microbiota, high-throughput sequencing, machine learning model

Citation

Tu S, Yin Y, Ma L, Chen H and Ye M (2025) Diagnosis of non-puerperal mastitis based on “whole tongue” features: non-invasive biomarker mining and diagnostic model construction. Front. Cell. Infect. Microbiol. 15:1602883. doi: 10.3389/fcimb.2025.1602883

Received

30 March 2025

Accepted

25 June 2025

Published

28 July 2025

Volume

15 - 2025

Edited by

Angela Brown, Lehigh University, United States

Reviewed by

Zeyan Li, Renmin Hospital of Wuhan University, China

Divya Gopinath, Ajman University, United Arab Emirates

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Hongfeng Chen, chhfluk@126.com; Meina Ye, yemeina2002@126.com

†These authors have contributed equally to this work
