Immunity and Extracellular Matrix Characteristics of Breast Cancer Subtypes Based on Identification by T Helper Cells Profiling

Background The therapeutic effect of immune checkpoint inhibitors on tumors is not only related to CD8+ effector T cells but also sufficiently related to CD4+ helper T (TH) cells. The immune characteristics of breast cancer, including gene characteristics and tumor-infiltrating lymphocytes, have become significant biomarkers for predicting prognosis and immunotherapy response in recent years. Methods Breast cancer samples from The Cancer Genome Atlas (TCGA) database and triple-negative breast cancer (TNBC) samples from GSE31519 in the Gene Expression Omnibus (GEO) database were extracted and clustered based on gene sets representing TH cell signatures. CIBERSORT simulations of immune cell components in the tumor microenvironment and gene set enrichment analyses (GSEAs) were performed in the different clusters to verify the classification of the subtypes. The acquisition of differentially expressed genes (DEGs) in the different clusters was further used for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. The clinical information from different clusters was used for survival analysis. Finally, the surgical tissues of TNBC samples were stained by immunofluorescence staining and Masson’s trichrome staining to explore the correlation of TH cell subtypes with extracellular matrix (ECM). Results The breast cancer samples from the datasets in TCGA database and GEO database were classified into TH-activated and TH-silenced clusters, which was verified by the immune cell components and enriched immune-related pathways. The DEGs of TH-activated and TH-silenced clusters were obtained. In addition to TH cells and other immune-related pathways, ECM-related pathways were found to be enriched by DEGs. Furthermore, the survival data of TCGA samples and GSE31519 samples showed that the 10-year overall survival (p-value < 0.001) and 10-year event-free survival (p-value = 0.162) of the TH-activated cluster were better, respectively. 
Fluorescent labeling of TH cell subtypes and staining of the collagen area of surgical specimens further illustrated the relationship between TH cell subtypes and ECM in breast cancer, among which high TH1 infiltration was related to low collagen content (p-value < 0.001), while high TH2 and Treg infiltration contained more abundant collagen (p-value < 0.05) in TNBC. With regard to the relationship of TH cell subtypes, TH2 was positively correlated with Treg (p-value < 0.05), while TH1 was negatively correlated with both of them. Conclusions The immune and ECM characteristics of breast cancer subtypes based on TH cell characteristics were revealed, and the relationship between different TH cell subsets and ECM and prognosis was explored in this study. The crosstalk between ECM and TH cell subtypes formed a balanced TME influencing the prognosis and treatment response in breast cancer, which suggests that the correlation between TH cells and ECM needs to be further emphasized in future breast cancer studies.


INTRODUCTION
The emergence of immunotherapy has brought the treatment of solid tumors into a new era (1), especially in breast cancer, which has the highest incidence (2). Currently, many clinical studies of immune checkpoint inhibitors (ICIs) focus mainly on how CD8+ cytotoxic T cells (CTLs) regulate the tumor immune microenvironment (TIME) and on the possible mechanisms of immune-related targets, for example, IMpassion 130 (3), KEYNOTE-119 (4), and the FUTURE trial (5) for advanced breast cancer and KEYNOTE-173 (6) and IMpassion 031 (7) for early breast cancer. As the principal component of tumor-infiltrating lymphocytes (TILs), T lymphocytes play a key role in the occurrence and development of breast cancer, especially in triple-negative breast cancer (TNBC) (8,9). Among TILs, not only CTLs but also CD4+ helper T (TH) cells directly or indirectly exert protumorigenic and/or antitumorigenic immune effects by affecting other immune cells, especially CTLs, through the inflammatory molecules secreted by different TH cell subtypes and the modulation of signal transduction (10,11). It is precisely these varied and interlaced immunoregulatory properties of TH cells that make optimizing immunotherapy based on them clinically significant. Based on previous studies (12-15), considering TH1 (T helper type 1), TH2 (T helper type 2), TH17 (T helper type 17), Tfh (T follicular helper), and Treg (CD4+ regulatory T) cells together, as a group of CD4+ TH cells that act independently yet set off chain reactions with other TILs, is meaningful for avoiding tumor immune escape and improving the efficacy of immunotherapy for breast cancer.
Apart from immune cells, cancer-associated fibroblasts (CAFs) are also prominent components of the tumor microenvironment (TME) (16,17). The activation of cytokines such as interleukin-1 (IL-1) and IL-6 and of immune-related pathways such as the Janus kinase-signal transducer and activator of transcription (JAK-STAT) and nuclear factor kappa-B (NF-κB) pathways plays an important role in the generation and recruitment of CAFs (18,19). In fibrosis-related diseases, including tumors, each TH cell subtype has a different regulatory effect on fibroblast-induced collagen synthesis through inflammatory factors, including interferon-γ (IFN-γ) (20). In turn, CAFs regulate the activation and function of TH cells by secreting cytokines and chemokines (16,21). The extracellular matrix (ECM), built from collagen produced by CAFs in the TME, affects both the differentiation and the spatial distribution of T cells (22), and thereby modulates antitumorigenic immunity by influencing the dialog between T cells and tumor cells.
Using TIME-related characteristics as biomarkers for breast cancer prognosis and for predicting immunotherapy efficacy, many studies (23-26) have proposed prognostic models and detailed breast cancer typing in recent years. Most of them used a total score of the immune microenvironment as the indicator for classifying samples (27,28). In contrast, this study uses the gene characteristics of TH cells, a heterogeneous and distinctive cell population, to explore the characteristics of the TIME and ECM in breast cancer and to provide a reference for advancing immunotherapy in the field of TH cells.

Datasets and Clinical Samples
RNA-sequencing data of 1,097 breast cancer patients from The Cancer Genome Atlas (TCGA) database and the corresponding clinical data were extracted. Microarray gene expression data (Affymetrix U133A array) of 64 TNBC patients from GSE31519, for which clinical survival data are available, were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). The data from these public databases were used for cluster analysis based on TH cell characteristics and for differential analysis and survival analysis among the different clusters.
Formalin-fixed paraffin-embedded (FFPE) surgical tissue sections (4 µm thick) from 30 TNBC patients admitted to the Department of Medical Oncology, The First Affiliated Hospital of Xi'an Jiaotong University during 2016-2020 were used for immunofluorescent staining and Masson's trichrome staining to analyze the different TH subtypes and the collagen content in the TIME of TNBC patients. The detailed clinical information of the patients is given in Supplementary Table S1, in which disease-free survival (DFS) was defined as the time from surgery to the occurrence of the first metastasis.

Identification of Subtypes Based on TH Cell Characteristics in Breast Cancer
Consensus cluster analysis was employed to identify breast cancer subtypes based on TH cell characteristics obtained from single-sample gene set enrichment analysis (ssGSEA) (29,30). The gene sets representing five TH cell subtypes (TH1, TH2, TH17, Tfh, and Treg cells) that were used for clustering are shown in Supplementary Table S2. After duplicate samples were removed, the "ConsensusClusterPlus" package (http://www.bioconductor.org/) was used to divide the datasets (1,090 samples in the TCGA dataset and 64 in the GSE31519 dataset, as shown in Supplementary Tables S3, S4, respectively) into k = 2-9 subgroups by agglomerative hierarchical consensus clustering, and the optimal clustering was obtained by comparing the consensus matrix and the cumulative distribution function (CDF) value.
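The consensus matrix and CDF criterion described above can be illustrated with a minimal sketch. The Python below is not the ConsensusClusterPlus implementation used in this study. It only shows, assuming that each resampling run yields a cluster label for every sample drawn in that run, how pairwise co-clustering frequencies and their empirical CDF could be computed. The function names and the toy input are hypothetical.

```python
from itertools import combinations


def consensus_matrix(runs, n_samples):
    """Fraction of resampling runs in which two samples share a cluster.

    runs: list of dicts {sample_index: cluster_label}; each dict holds only
    the samples that were drawn in that run (hypothetical input format).
    """
    together = [[0] * n_samples for _ in range(n_samples)]
    sampled = [[0] * n_samples for _ in range(n_samples)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    matrix = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        matrix[i][i] = 1.0  # a sample always co-clusters with itself
        for j in range(n_samples):
            if i != j and sampled[i][j]:
                matrix[i][j] = together[i][j] / sampled[i][j]
    return matrix


def consensus_cdf(matrix, thresholds):
    """Empirical CDF of the off-diagonal consensus values."""
    n = len(matrix)
    values = [matrix[i][j] for i, j in combinations(range(n), 2)]
    return [sum(v <= t for v in values) / len(values) for t in thresholds]


# Toy example: three resampling runs over four samples.
runs = [{0: "a", 1: "a", 2: "b"}, {0: "a", 1: "a", 3: "b"}, {1: "x", 2: "y", 3: "y"}]
matrix = consensus_matrix(runs, 4)
print(consensus_cdf(matrix, [0.5, 1.0]))
```

A clean clustering concentrates consensus values near 0 and 1, so its CDF is flat across intermediate thresholds. Comparing that shape across k is the idea behind the CDF comparison mentioned above.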

Evaluation of TIME in Breast Cancer Samples From Datasets
The CIBERSORT deconvolution algorithm was used to estimate the fraction of 22 immune cell types in each sample to evaluate the TIME in breast cancer (31); the estimates were calculated via the online calculator (https://cibersort.stanford.edu/). The CIBERSORT results were filtered by p-value < 0.05 to obtain more reliable estimates, and the samples that passed this filter were used in the subsequent differential analysis between clusters (846 in the TCGA dataset and 63 in the GSE31519 dataset, as shown in Supplementary Tables S5, S6, respectively). Principal component analysis (PCA) was used to distinguish the immune cell components of the different clusters through dimensionality reduction, pattern recognition, and exploratory visualization.

Extraction of Differentially Expressed Genes, Gene Ontology, and KEGG Pathway Enrichment Analysis of DEGs
TH-related differentially expressed genes (DEGs) were extracted from the gene expression profiles of the different clusters and analyzed using the R packages "limma," "impute," and "edgeR." DEGs were filtered by logFCfilter=2 or 1 and fdrFilter=0.05. Gene Ontology (GO) analysis was conducted for the high-throughput annotation of the biological processes (BPs), cellular components (CCs), and molecular functions (MFs) of the DEGs between clusters, while KEGG analysis was conducted to characterize the DEGs at the molecular and pathway levels. The R packages "DOSE," "clusterProfiler," "org.Hs.eg.db," and "enrichplot" were used in the GO and KEGG pathway enrichment analyses of DEGs (p-value < 0.05).
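The thresholding step can be sketched in a few lines. The Python below is an illustration only, not the R code used in this study. It assumes that the false discovery rate is obtained by Benjamini-Hochberg adjustment of raw p-values and that a gene is kept when |logFC| exceeds the cutoff and the adjusted p-value is below fdrFilter. Neither detail is stated in the text, and the function names and toy records are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def filter_degs(records, log_fc_filter=2.0, fdr_filter=0.05):
    """Keep genes with |logFC| > log_fc_filter and adjusted p < fdr_filter.

    records: list of dicts with keys "gene", "logFC", and "p" (assumed layout).
    """
    fdr = benjamini_hochberg([r["p"] for r in records])
    return [
        r["gene"]
        for r, q in zip(records, fdr)
        if abs(r["logFC"]) > log_fc_filter and q < fdr_filter
    ]


# Toy example with made-up gene labels and values.
records = [
    {"gene": "A", "logFC": 2.5, "p": 0.005},
    {"gene": "B", "logFC": -3.0, "p": 0.01},
    {"gene": "C", "logFC": 0.5, "p": 0.03},
    {"gene": "D", "logFC": 2.2, "p": 0.9},
]
print(filter_degs(records))
```

Using the absolute value of logFC keeps both up- and downregulated genes, which matches the up- and downregulated DEGs reported for the two clusters.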

Survival Analysis
The R packages "survival" and "survminer" were used to plot Kaplan-Meier (K-M) curves for the TCGA and GSE31519 clusters based on TH cell characteristics, and the log-rank test was employed to assess the significance of differences in overall survival (OS) and event-free survival (EFS). Similarly, K-M survival analysis and log-rank tests were conducted to compare DFS between the different groups of TH cell subtypes in the 30 TNBC samples.
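For readers unfamiliar with the K-M estimator, a minimal sketch follows. It is not the "survival" package implementation; it only illustrates the product-limit formula, in which the survival estimate is multiplied by (1 - d/n) at each time with d events among n patients still at risk. The function name and toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each patient.
    events: 1 (or True) if the event occurred, 0 (or False) if censored.
    Returns a list of (time, survival) pairs at each time with an event.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy example: five patients, two of them censored.
print(kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]))
```

Censored patients lower the number at risk for later time points without producing a step in the curve, which is why the curve above has no step at the last, censored, follow-up time.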

Statistical Analysis
The statistical analyses of data from the public databases were performed using R software (http://www.r-project.org/) and Bioconductor (http://bioconductor.org/). The Wilcoxon test was used to compare any two groups of dataset samples. The R packages "pheatmap," "ggplot2," and "vioplot" were used to visualize the differential results. The statistical analyses of the 30 TNBC samples collected by us were performed using GraphPad Prism 8.0.0 (GraphPad Software Inc., San Diego, CA, USA). The independent-samples t-test was used to compare the different groupings of TH cell subtypes, and Pearson's correlation was used to assess the correlation between different TH cell subtypes; the data in these analyses were all normally distributed. All p-values were two-sided, and a p-value < 0.05 was considered statistically significant.
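The two statistics named above can be written out explicitly. The sketch below is illustrative and is not the R or GraphPad Prism code used in this study. It computes the Mann-Whitney U statistic that underlies the two-sample Wilcoxon rank-sum test (with midranks for ties, and without the p-value step) and Pearson's correlation coefficient r, whose square is the R2 value reported for a correlation. All function names and inputs are hypothetical.

```python
from math import sqrt


def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_u(x, y):
    """Mann-Whitney U statistic for group x (rank sum minus its minimum)."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy example with made-up values.
print(rank_sum_u([1, 2], [2, 3]))
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]) ** 2)
```

U ranges from 0 (every value in x lies below every value in y) to len(x) * len(y) (the reverse), so values near either end indicate a strong shift between the two groups.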

Identification of Subtypes by TH Cells and the Corresponding TIME in Breast Cancer
The transcriptome and clinical data of 1,097 breast cancer samples from the TCGA database were extracted, and the mRNA expression of the 1,090 samples remaining after the removal of duplicates was analyzed. These 1,090 samples were clustered according to the expression of TH-cell-related gene characteristics. Among the clustering results for k = 2-9, when the samples were divided into two clusters (k = 2, Cluster 1 included 654 samples and Cluster 2 included 436 samples, see Supplementary Table S3 for details), the consensus matrix had a relatively even distribution, less matrix overlap, and a smoothly decreasing CDF value (Figure 1A; Supplementary Figures S1A-D). In addition, we extracted the transcriptomic data of 64 TNBC samples from the GSE31519 dataset of the GEO database for the same cluster analysis (Figure 1B; Supplementary Figures S2A-D), for which division into two clusters was again optimal (k = 2, Cluster 1 included 38 samples and Cluster 2 included 26 samples; see Supplementary Table S4 for details).
The deconvolution algorithm CIBERSORT was used to verify the immune microenvironment of the two clusters obtained from the TCGA dataset by simulating the fraction of 22 kinds of immune cells in the TIME of each sample (see Supplementary Table S5 for details). Differential analysis of immune cell components showed that Cluster 2 had more resting memory CD4 T cells, activated memory CD4 T cells, Tfh cells, gamma delta T cells (p-value < 0.001), and Treg cells (p-value = 0.023) than Cluster 1 (Figures 1C, D). Therefore, Cluster 1 was defined as the TH-silenced cluster, and Cluster 2 was defined as the TH-activated cluster. In the TCGA samples, the correlation of the 22 immune cell components can be seen in Figure 1E, in which CD8 T cells had a strong positive correlation with Treg cells and activated memory CD4 T cells, and M1 macrophages were positively correlated with Tfh cells and resting memory CD4 T cells, while M2 macrophages showed the opposite correlation. Similarly, CIBERSORT was used to verify the TIME of the samples in the GSE31519 dataset (see Supplementary Table S6 for details), from which we found that Tfh cells, Treg cells, and M1 macrophages were more abundant in Cluster 2 than in Cluster 1, while M2 macrophages and resting mast cells were less abundant, and the differences in other immune cells were not statistically significant (Figure 1F). Therefore, Cluster 1 in the GSE31519 dataset was likewise defined as the TH-silenced cluster, and Cluster 2 was defined as the TH-activated cluster.

Verification of the Characteristics of the Two Clusters Identified by TH Cells
The 1,090 samples classified into the TH-silenced and TH-activated clusters were evaluated for immune cell components using PCA. The two clusters were clearly separated (Figure 2A). The transcriptome data were then used for KEGG pathway-related GSEA, and the 15 pathways with the highest normalized enrichment scores (NES) in each cluster were selected for display. Activated immune-related pathways were enriched in the TH-activated cluster, including the T-cell receptor signaling, B-cell receptor signaling, and cytokine and chemokine signaling pathways (Figure 2B). This further verified that samples in the TH-activated cluster had stronger immunogenicity and immune activity than those in the TH-silenced cluster. The pathways enriched in the TH-silenced cluster included glucose metabolism-, amino acid metabolism-, and fatty acid synthesis-related pathways (Figure 2C).

The Pathways Enriched by DEGs of the Two Clusters Suggested a Correlation Between TH Cells and ECM
Samples from the two clusters of the TCGA database were used for differential gene expression analysis. With logFCfilter=2 and fdrFilter=0.05, 148 DEGs were obtained (see Supplementary Table S7 for details). The differential expression of these DEGs in the two clusters and the distribution of clinical information, including TNM stage, stage, age, and gender, are shown in Figure 2D. GO enrichment analysis (Figure 2E) showed that the BP terms of the DEGs were most significantly related to the activation of T cells and the differentiation and activation of lymphocytes, while the MF terms were also mostly cytokine related. In addition to these immune-activation GO terms, which were enriched with upregulation in the TH-activated cluster versus the TH-silenced cluster, collagen-containing ECM was enriched in the CC category, indicating that the synthesis of collagen and the construction of ECM may be correlated with TH cell activation. A total of 310 DEGs were obtained between the transcriptional data of the samples in the two clusters from the GSE31519 dataset (see Supplementary Table S8, logFCfilter=1 and fdrFilter=0.05). The up- and downregulated expression of the 100 most prominent DEGs between the two clusters is shown in the heatmap in Figure 3A. Furthermore, GO enrichment analysis of the DEGs showed that ECM-related pathways were enriched in BP, CC, and MF when comparing the TH-activated cluster with the TH-silenced cluster, while cytokine activity was enriched in MF (Figures 3B, C). The possible association between ECM and TH cells was supported again by KEGG pathway enrichment analysis of these DEGs (Figures 3D, E), in which the immunity-related cytokine-cytokine receptor interaction and JAK-STAT signaling pathways were enriched with upregulation, and ECM-receptor interaction was also enriched.

The Correlation Between Different TH Cell Subtypes and ECM
To further explore the content of tumor-infiltrating TH cell subtypes in breast cancer tissues, we performed immunofluorescence staining on paraffin tissue sections of primary foci from 30 TNBC patients, in which CD4+T-bet+ TH1, CD4+GATA3+ TH2, and CD4+CD25+ Treg cells were labeled (see Supplementary Table S1 for clinical information and the corresponding TH cell content of these patients). Collagen was assessed by Masson's trichrome staining, and breast cancer tissues with higher tumor-infiltrating TH1 cell content contained less collagen, as shown in Figure 4A. In contrast, as shown in Figures 4B, C, breast cancer tissues with higher tumor-infiltrating TH2 cell and Treg content contained more collagen. Furthermore, Pearson correlation analysis was conducted on the contents of the three TH cell subtypes, and it was found that the content of TH2 was positively correlated with that of Treg (R2 = 0.3477, p-value < 0.001), while TH1 was negatively correlated with TH2 and Treg (p-value > 0.05), as shown in Figure 4D (the content of TH cell subtypes at each data point was the average value of three random fluorescence staining fields). Figures 4E-G show the t-test comparisons between the two groups for the TH1 grouping (p-value < 0.001), TH2 grouping (p-value = 0.0246), and Treg grouping (p-value = 0.0326).

The Prognosis of Different Groupings Based on Total TH Cells and TH Cell Subtypes
To verify that the TH-activated and TH-silenced clusters were prognostically relevant, K-M survival analysis was performed on the samples from the public databases. The 10-year OS of the 1,090 samples from the TCGA database was compared between the two clusters, and the OS of the TH-activated cluster was longer (Figure 5A, p-value < 0.001). The 10-year EFS of the 64 TNBC samples from the GSE31519 dataset was also compared between the two clusters; the EFS of the TH-activated cluster was longer, although this difference did not reach statistical significance (Figure 5B, p-value = 0.162). Based on the fluorescence staining results for TH cells in the 30 TNBC samples mentioned above, the DFS of patients in the different groupings of TH cell subtypes was used to plot K-M survival curves. Patients with high TH1 cell infiltration had a better prognosis with longer DFS, as shown in Figure 5C

DISCUSSION
The significance of TILs as biomarkers for predicting treatment response and tumor progression has been widely discussed in breast cancer, as in other solid tumors. Many studies (32) have classified breast cancer into high or low TIL subtypes to guide the treatment and prognosis of patients. In addition, a large number of studies (27,33) in recent years have focused on classifying breast cancer subtypes by an immune score based on inflammatory factors and immune-cell-related genes. Among breast cancer subtypes, TNBC has been more widely treated with immunotherapy because of the emergence of effective biomarkers such as PD-L1 and the well-founded classification into fully inflamed (FI), stroma restricted (SR), margin restricted (MR), and immune desert (ID) subtypes (34). Therefore, in addition to all breast cancer samples of the TCGA database, a TNBC dataset from the GEO database was incorporated in this study, and the verification of TH cell subtypes was also conducted in TNBC samples. In other solid tumors, 65 combinations of T-cell markers have been used as indicators of the generation and early metastasis of colorectal cancer (35), while immune-related genes have also been used as gene sets for subtyping squamous cell carcinoma and lymphoma (36,37). Regarding subtyping and prognosis prediction in breast cancer based on T cells, a study (38) showed that a gene score based on CD8+ T cells was associated with survival, especially in TNBC. In this study, a breast cancer dataset and a TNBC dataset were each divided into a TH-silenced cluster and a TH-activated cluster based on the gene characteristics of TH cells, and the two datasets had similar distributions of TH cell-related gene characteristics.
The TH-activated cluster was characterized by upregulated immune-related gene characteristics, more activated immune-related pathways, and better prognosis, suggesting that TH cells could serve as an independent biomarker for breast cancer with no less clinical significance than other prominent immune cell components, such as CTLs.
The role of TH cells in the TIME is related to the inflammatory factors they secrete; these factors can not only affect other types of immune cells but also mediate interactions among the different TH cell subtypes. As an example, IFN-γ secreted by TH1 cells can further induce the activation of STAT1 and STAT4 in T cells, thereby promoting TH1 differentiation through positive feedback and inhibiting TH2 and TH17 differentiation (39). Conversely, IL-4 secreted by TH2 cells can promote TH2 differentiation through positive feedback while inhibiting TH1 differentiation (40). The balance between IFN-γ and IL-4 therefore reflects the TH1-TH2 balance in the TIME, which corresponds to the balance of pro- and antitumorigenic immune effects. Tregs induced by transforming growth factor-β (TGF-β) in the TIME activate the STAT5 signaling pathway by binding IL-2 with high affinity through CD25 and exert opposite regulatory effects on TH1 and TH2 cells by secreting IL-10, IL-35, and TGF-β (41). In this study, correlation analysis of TH1, TH2, and Treg cells in the TIME of TNBC also showed that TH2 was positively correlated with Treg, while TH1 was negatively correlated with both of them, which is consistent with a network of promotion and restriction among TH cell subtypes that balances pro- and antitumorigenic immunity. In addition, a study (42) showed that different subtypes of peripheral blood Tregs in breast cancer patients have different effects on the secretion of intratumor TH1- and TH2-related inflammatory factors, which suggests that TH cells in peripheral blood also deserve attention.
In this study, ECM-related pathways were enriched in the TH-activated cluster, and the relationships between different TH cell subtypes and ECM were detected in clinical specimens. It was found that high TH1 infiltration was related to low collagen content, while the TME with high TH2 and Treg infiltration contained more abundant collagen in TNBC. A study (43) has shown that IFN-γ can upregulate the expression of matrix metalloproteinases (MMPs) that degrade ECM components, such as MMP-2, MMP-7, MMP-9, and MMP-13. Therefore, IFN-γ secreted by TH1 cells may be a cause of ECM remodeling, and the antagonistic relationship between TH2 and TH1 cells may lead to the opposite effect on ECM. Furthermore, IL-13 secreted by TH2 cells is considered to be positively related to ECM formation in many fibrotic diseases (44). The effects of Treg cells on fibrosis differ between diseases, and studies (45) have shown that the decrease in IL-10 and TGF-β levels caused by Treg depletion may be the crucial step leading to reduced fibrosis. In addition, a study (46) reported that the expression level of fibrosis-related transcription factors in tissue-resident Treg cells was increased in renal fibrosis. In breast cancer, research on tissue-resident memory T (TRM) cells (47) as an emerging target in immunotherapy suggests that the combination of tissue-resident Treg cells and fibrosis-related biomarkers has clinical significance. On the other hand, in terms of the effect of ECM on TH cells in the TME, a study (48) found that eliminating FAP+ CAFs in vivo can shift the polarization of TH2 cells toward TH1 cells; moreover, in breast cancer, the CAF-S1 subtype could achieve immunosuppression by recruiting CD4+CD25+ Treg cells and increasing their differentiation (49). Therefore, the crosstalk between ECM and TH cell subtypes persistently accumulates, forming a balanced TME in breast cancer.
Similarly, a recent study (50) that used single-cell sequencing to show the association between CAF clusters and immunotherapy resistance highlighted a positive feedback loop between specific CAF-S1 clusters and Treg cells and revealed correlations between different CAF clusters and CD8+ and CD4+ T cells, which indicates that identifying specific clusters can inform treatment and prognosis in cancer. Different TH cell subtypes carry different prognostic implications for solid tumors, as shown by studies (51, 52) based on characteristic gene expression data from clinical samples and animal experiments. This study also found that the DFS of the 30 TNBC patients was longer with high TH1 infiltration, low TH2 infiltration, and low Treg infiltration. Studies that use ECM characteristics to identify different prognoses in breast cancer have gradually appeared over the past 10 years, and it is generally accepted that a stiff TME with abundant ECM indicates a poor prognosis (53). In addition, different subtypes of TNBC have different prognoses and treatment strategies according to the spatial heterogeneity of CD8+ TILs (54), which raises the question of whether the distribution of CD4+ TH cells in the TIME also has crucial clinical value. Correspondingly, studies relating the spatial distribution of TH cells to ECM (22) further point to the significance of TH cell characteristics and ECM as a new combined biomarker for breast cancer. The intrinsic mechanism by which cytokines secreted by CD4+ TH cells influence ECM remodeling will be our future field of study; it is a crucial dimension for explaining the correlation of the TME with prognosis in breast cancer and for finding potential combined therapies in the clinic.

CONCLUSION
Overall, the immune and ECM characteristics of breast cancer subtypes based on TH cell characteristics were revealed in this study by analyzing the datasets, and the relationship of different TH cell subsets with ECM and prognosis was explored in clinical TNBC samples. The accumulated crosstalk between ECM and TH cell subtypes formed a balanced TME in breast cancer, which suggests that the combination of TH cell characteristics and ECM as a new biomarker needs to be further emphasized in future breast cancer clinical studies.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the First Affiliated Hospital of Xi'an Jiaotong University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
YaZ, QT, and BW contributed to the study design and performed the experiments. HG and LZ contributed to data