Integrative Analysis of Gene Expression and DNA Methylation Depicting the Impact of Obesity on Breast Cancer

Obesity has been reported to be a risk factor for breast cancer, but how obesity affects breast cancer (BC) remains unclear. Although body mass index (BMI) is the most commonly used reference for obesity, it is insufficient to evaluate the obesity-related pathophysiological changes in breast tissue. The purpose of this study is to establish a DNA-methylation-based biomarker for BMI (DM-BMI) and explore the connection between obesity and BC. Using DNA methylation data from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO), we developed DM-BMI to evaluate the degree of obesity in breast tissues. In tissues from non-BC and BC population, the DM-BMI model exhibited high accuracy in BMI prediction. In BC tissues, DM-BMI correlated with increased adipose tissue content and BC tissues with increased DM-BMI exhibited higher expression of pro-inflammatory adipokines. Next, we identified the gene expression profile relating to DM-BMI. Using Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, we observed that the DM-BMI-related genes were mostly involved in the process of cancer immunity. DM-BMI is positively correlated with T cell infiltration in BC tissues. Furthermore, we observed that DM-BMI was positively correlated with immune checkpoint inhibitors (ICI) response markers in BC. Collectively, we developed a new biomarker for obesity and discovered that BC tissues from obese individuals exhibit an increased degree of immune cell infiltration, indicating that obese BC patients might be the potential beneficiaries for ICI treatment.


INTRODUCTION
Breast cancer (BC) is the most commonly diagnosed cancer and the second leading cause of cancer death for women in the world (Bray et al., 2018). It was reported in previous studies that obesity, characterized by excess adipose tissues, is a risk factor for BC (Pierobon and Frankenfeld, 2013;Sung et al., 2019). For premenopausal women, obesity is connected with increased risk of hormone receptor (HR) negative BC, while for postmenopausal women, it is connected with increased risk of HR positive BC (Suzuki et al., 2009;Picon-Ruiz et al., 2017). Moreover, several studies showed that obese patients exhibited more aggressive tumor features (such as larger tumor size, lymph node metastasis, shorter disease-free survival, and greater risk of mortality) compared with non-obese patients in BC (Copson et al., 2015;Jiralerspong and Goodwin, 2016). Although it has been observed in previous studies that the adipose tissue in obese individuals increasingly secrets adipokines (including hormones, growth factors, and cytokines) and contributes to an environment promoting cancer proliferation and metastasis (Khandekar et al., 2011;Maguire et al., 2021), how obesity impacts BC requires further studies.
Body mass index (BMI), defined as a person's weight in kilograms divided by the square of height in meters, is the most commonly used method for obesity evaluation. However, it is more like a surrogate measure for body fatness for obesity should be calculated using the excess accumulation of adipose tissues rather than body mass (Prentice and Jebb, 2001). As there is heterogeneity in the body distribution, function, and tissue composition of adipose tissue among BC patients, a total body mass index is insufficient to evaluate the degree of obesity in local tissue. Moreover, BMI is only able to reflect the gaining of weight, with no indication in pathophysiological changes during the process of obesity (Bosello et al., 2016). Thus, developing new biomarkers to evaluate the obesity status of BC tissues is helpful to assess the impact of obesity on BC.
It is well known that obesity is affected by multiple factors (including environmental factors, genetic predisposition, and the individual lifestyle) (Conway and Rene, 2004;Bray et al., 2016). Recently, increased evidences showed that DNA methylation is also involved in the process of obesity (Ling and Rönn, 2019;Samblas et al., 2019). DNA methylation is an epigenetic mechanism which regulates gene expression through chromatin structure changes. Equally influenced by environmental factors, genetic predisposition and the individual lifestyle, the level of gene methylation is dynamically changing in setting up stable gene expression profiles to adapt to the process of obesity (Samblas et al., 2019;Cabre et al., 2021). A previous study analyzing the whole genome methylation and gene expression in nondiseased breast showed that obesity is connected with the genome-wide methylation changes in human tissue (Hair et al., 2015a). In addition, Hair et al. observed that obesity is significantly correlated with genome-wide hyper-methylation in ER-positive BCs (Hair et al., 2015b). Thus, changes of genomewide DNA methylation could be a reflection of the biological changes in breast tissue during the process of obesity. The goal of our study is to capture the obesity-related genomic changes and explore the impact of obesity on BC tissues. We developed DNA methylation-based BMI index (DM-BMI) to evaluate the degree of obesity in breast tissues and validated the accuracy of DM-BMI in breast tissues from non-BC and BC population. Furthermore, we assessed the correlation among DM-BMI, obesity adipose tissue content, and the expression of adipokines in BC tissues. Next, we identified the DM-BMIrelated gene expression profile. Using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) database, we observed that the DM-BMI-related genes are significantly involved in the process of cancer immunity. Using Estimate and Cibersort algorithm, we observed a positive correlation between DM-BMI and immune cell infiltration. Finally, we assessed the correlation between DM-BMI and biomarkers of response to immune checkpoint inhibitors (ICI) (Shah et al., 2012) and observed that DM-BMI positively correlated with ICI response in BC.

Data collection and processing
The training set includes genome-wide methylation data of 221 normal breast tissues in GEO (GSE88883 and GSE101961) while the validation sets includes data of 44 normal breast tissues (Validation Set 1). Data of 70 tumor-adjacent breast tissues (Validation Set 2) in GEO (GSE67919 and GSE74214) were used to develop the DM-BMI score. BMI data of the above cases are listed in Supplementary Materials S1, S2. The DNA methylation and expression data of 76 cases with matched tumor and tumor-adjacent breast tissues and the data of 775 cases with unmatched tumor tissues were collected from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/ ). Clinical features of BC patients in TCGA-BRCA are listed in Table 1. Genome-wide methylation data was profiled using Illumina Infinium HumanMethylation450 BeadChips Assay. For DNA methylation data, β value ranging from 0 (no DNA methylation) to 1 (complete DNA methylation) was used to define the methylation level of each probe. Probes with missing value in over 50% samples were removed while the probes with missing value in less than 50% of samples were imputed with the k-nearest neighbors (knn) imputation method. Probes located on the sex chromosome and probes containing known single nucleotide polymorphisms (SNPs) were removed. Eventually, 301,998 probes were included in this study. BMIQ normalization for type I and II probe correction was performed. Data from multiple studies was integrated and the Combat algorithm was performed to remove the batch effects. The above processes were carried out using the R package ChAMP (Tian et al., 2017).
For gene expression data, background correction and normalization were carried out using the R package limma (Ritchie et al., 2015).

Calculation of DNA-methylation based body mass index
To improve the predictive accuracy of the model, the BMI value was transformed to F(BMI) before further analysis, which is shown as follows: The parameter healthy.BMI was set to 25, referring to the upper limit of BMI in healthy population.
A lasso regression was used to regress the DM-BMI in the form of F(BMI) based on the BMI data and DNA methylation data with 301,998 probes; 42 probes were selected in the lasso regression model as BMI predictors according to the lambda.min value (Supplementary Figure S1A). The coefficient values of each probe are shown in Supplementary Figure S1B. The lasso regression analysis was carried out using R package glmnet (alpha was set to 1, and the lambda value was identified by performing a 10-fold cross validation to the training data).

Analysis of intra-sample adipose tissue content
Adipose tissue accounts for a large proportion of breast tissue composition. Based on DNA methylation, we used a deconvolution algorithm to calculate the proportion of adipose tissue in breast and BC tissues. Teschendorff et al. (2016) provided a deconvolution algorithm to model cell subpopulations in breast tissues based on DNA methylation data. Illumina 450k DNA methylation data of human mammary epithelial cells (GSE40699) and adipose tissue (GSE48472) were used as reference profiles. Data were processed as previously described. Probes with an absolute difference in beta-value between the human mammary epithelial cells and the averaged adipose tissue >0.7 were selected for the evaluation of intra-sample adipose tissue content. Data for adipose tissue content are listed in Supplementary Material S3.

Characteristics analyses of body mass index predictors
Forty-two probes were selected in the lasso regression model as BMI predictors. The distribution of 42 probes on chromosome, CpG island, and TSS regions was assessed using the R package ChAMP. BMI predictors, differentially methylated between BC tissues and tumor-adjacent breast tissues in the TCGA database, were identified using the R package Champ. The survival correlation of BMI predictors was assessed using TCGA BRCA data. Correlation between the methylation level of BMI predictors and DM-BMI was assessed.
Functional and clinical characteristics analysis of DM-BMI-related gene profile DM-BMI of TCGA-BC tissues were calculated using DNA methylation data. Spearman correlation coefficient was used to assess the correlation of DM-BMI and clinical characteristics (menopause status, hormone status, copy number variation and gene mutation) in BC. Gene expression profile related to DM-BMI was identified; functional analysis of the related genes was process by GO and KEGG analysis. Besides, we performed gene set variation analysis (GSVA) to identify DM-BMI related gene signature using gene expression data in TCGA (Hänzelmann et al., 2013). The above procedures were performed using the R software.

Evaluation of correlations between DM-BMI and the immune microenvironment in breast cancer
Tumor mutation burden (TMB) was defined as the number of non-synonymous mutations/Mb of genome. As previously reported (Chalmers et al., 2017), TMB of BC tissues in TCGA was calculated as (whole exome non-synonymous mutations)/ 38 (Mb).
The level of tumor-infiltrating immune cells and stromal cells in each tissue were evaluated by ESTIMATE algorithm (Yoshihara et al., 2013). The proportion of 22 immune cells in each tissue was evaluated using the CIBERSORT algorithm (http://cibersort.stanford.edu/) (Gentles et al., 2015). Correlations between DM-BMI and ESTIMATE/CIBERSORT scores were calculated using Spearman correlation coefficient. The data for TMB, ESTIMATE, and CIBERSORT analysis are listed in Supplementary Material S4.

Development and validation of DM-BMI in breast, tumor-adjacent, and breast cancer tissues
A total of 221 breast tissues from non-BC cases (GEO database) were selected as the training cohort to develop the DNAmethylation-based BMI (DM-BMI) prediction model ( Figure 1). The median BMI and median age of the training cohort were 28.24 (6.07-53.74) and 37 (17-82). Fifty lasso regression models based on DNA methylation data of the training cohort (301,998 probes per sample) were constructed. A model with minimum mean-squared error was selected based on the Lambda value (Supplementary Figure S1A). Forty-two probes were included and their coefficients are shown in Supplementary Figure S1B and Supplementary Material S5. We used Spearman correlation coefficient and paired t-test to assess the predictive accuracy of the DM-BMI model. DM-BMI showed a significant correlation with BMI ( Figure 2A) while paired t-test revealed that there was no significant difference between DM-BMI and BMI (t = −0.384, df = 220, p-value = 0.702). Using a deconvolution algorithm, we evaluated the breast tissue composition and observed that the increased DM-BMI was significantly correlated with higher proportion of adipose tissue ( Figure 2B). These results showed the high accuracy of DM-BMI for BMI prediction.
Next, 44 normal breast cases (Validation Set 1) and 70 tumoradjacent breast tissues (Validation Set 2) were enrolled for external validation (Figure 1). The median BMI and median age of Validation Set 1 were 27.1 (14.6-62.7) and 44 (13-80); median BMI and median age of Validation Set 2 were 28.65 (16.5-53.4) and 56 (29-84). In both Validation Sets 1 and 2, DM-BMI showed positive correlation with BMI ( Figures 2C,E). Paired t-test revealed that there was no significant difference between DM-BMI and BMI (t = −0.253, df = 43, p-value = 0.801, Furthermore, we assessed the DM-BMI of paired tumor and tumor-adjacent breast tissues in TCGA database. The tumor tissues exhibited a higher level of DM-BMI compared with its paired tumor-adjacent tissues ( Figure 2G). In BC tissues, DM-BMI is positively correlated with adipose tissue proportion ( Figure 2H). What is known about the 42 body mass index predictors?
As DM-BMI was significantly correlated with obesity status, which has been suggested to regard as a risk factor for BC, we further assessed the relevance between BMI predictors and BC. As hyper-methylation of CpG island at gene promoter regions often causes gene silencing, we first evaluated the distribution of BMI predictors. Among 22 pairs of chromosomes (Chr), Chr1 and 16 are the most common region for BMI predictor distribution; 45.2% of BMI predictors were located at the gene body regions while only 23.8% of them were located at the promoter regions (TS1500 and TS200). Furthermore, we observed that only a few parts of BMI predictors were located at CpG islands ( Figure 3A).
Next, we assessed the methylation variation of BMI predictors between tumor and tumor-adjacent breast tissues. Twenty-six differentially methylated probes (DMP) were identified: 22 BMI predictors were hyper-methylated in tumor tissues compared with the tumor-adjacent breast tissues; 4 were hypo-methylated in tumor ( Figure 3B). Three of 42 BMI predictors were correlated with better OS for BC patients, while 2 of them were located at gene-coding regions ( Figure 3C). In BC tissues, the correlation between the methylation level of 42 BMI predictors and DM-BMI was evaluated. Eleven of them were significantly correlated with DM-BMI (correlation coefficient >0.3 or < −0.3; Figures 3D,E); 35 of 42 BMI predictors were matched to the human gene region. Through the integrative analysis of DNA methylation and expression data, the negative correlation between methylation level and gene expression was observed in 22 of 42 BMI predictors ( Figure 3E).

Functional and clinical characteristics analysis of DM-BMI related gene profile in breast cancer
Later we explored the biological significance of DM-BMI in breast cancer tissues. The survival correlation of DM-BMI was evaluated in BCs and the subgroup of BCs with cancer therapy (chemotherapy, endocrine-therapy, anti-HER2 therapy, and radiation-therapy). DM-BMI was consistently correlated with higher mortality risk in the whole cohort of BC patients and subgroups of patients with chemotherapy, endocrine-therapy or radiation-therapy, respectively ( Table 2). Tissues from patients with postmenopausal status and TP53-mutation exhibited a significantly higher level of DM-BMI ( Figures 4A,B). Apart from that, an increasing level of DM-BMI was correlated with an increased proportion of ERBB2 and MYC amplification ( Figures 4C,D).
Previous studies showed that adipokines produced by obese adipose tissues leads to obesity-mediated inflammation and BC progression. In BC tissues, sections of adipose tissue were positively correlated with DM-BMI. Expression of six proinflammatory adipokines was positively correlated with DM-BMI while the expression of two anti-inflammatory adipokines was negatively correlated with DM-BMI ( Figure 4E). These results indicated that obesity increased the expression of proinflammatory adipokines in BC tissues.
Furthermore, we assessed the DM-BMI (obesity)-related gene expression profile and mRNA expression of 10,032 genes significantly correlated with DM-BMI. To evaluate the biological effect of obesity on BC, we performed a GSEA analysis of DM-BMI-related genes using KEGG and GO database. GO analysis showed that gens positively correlated with DM-BMI were significantly involved in antigen process and presentation, immune cell activation, MHC protein binding, and immune receptor activity in BC ( Figure 4F). KEGG consistently showed that DM-BMI-related genes were significantly enriched in tumor-immunity-related pathway (which includes antigen processing and presentation, NK cellmediated cytotoxicity, T cell differentiation, and PD-1 checkpoint pathway) ( Figure 4G). These results indicated that the obesityrelated gene profile is involved in the regulation of immune response in BC.

DM-BMI correlated with T-cell infiltration and immune checkpoint inhibitor response markers in breast cancer
We evaluated the correlation between obesity and immune response to BC. Gene mutation which changes the proteincoding sequence and leads to the expression of abnormal proteins was suggested to be the driving factor for cancer development. Also, the abnormal protein derived from tumor mutation might rouse immune response. In BC tissues, we observed a positive correlation between DM-BMI and TMB ( Figure 5A). Using the Estimate algorithm, we evaluated the degree of immune cell infiltration in TCGA BC tissues. FIGURE 3 | methylation level of BMI predictors. r, Spearman correlation coefficient; 39 BMI predictors positively correlated with DM-BMI (BMI predictors with correlation coefficient >0.3 were marked as red dot; n = 10), and five BMI predictors negatively correlated with DM-BMI (BMI predictors with Spearman correlation coefficient < −0.3 were marked as green dot; n = 1). (E) Venn diagram of DMP; DM-BMI-correlated and expression-correlated BMI predictors. BMI predictors differentially methylated between tumor and tumor-adjacent tissues were labeled as blue; methylation levels of the BMI predictors correlated with DM-BMI were labeled as red; methylation levels of BMI predictors negatively correlated with gene expression were labeled as green.  Interestingly, we found a positive correlation between DM-BMI and Estimate-immune score while no significant correlation was observed between DM-BMI and Estimate-Stromal score ( Figure 5B). Next, we calculated the relative abundance of 22 immune cell types in TCGA BC tissues. Among them, the content of CD8-T, CD4 memory-activated T, T follicular helper, and regulatory T cells (Treg) were positively correlated with DM-BMI, indicating a more intense T cell-mediated immune response in BC tissues with increased DM-BMI ( Figures 5C,D). As the representative of immunotherapy, the ICI therapy suppressed BC progression by activating T cell-mediated immune response. Thus, we examined whether DM-BMI predicted the tumor response to ICI. Five markers for ICI response and four markers for ICI resistance were selected to evaluate the tumor response. In BC tissues, DM-BMI is positively correlated with IFNG, IFNG.GS, CD274, CD8, and APS, indicating that BC tissues with increased DM-BMI exhibited a better response to ICI ( Figures 5A,E). Moreover, DM-BMI was negatively correlated with two ICI resistance markers (CAFs and TAM-M2). All those results indicated that BC tissue at obesity status might exhibit a more intense response to ICI based on markers of ICI response.

DISCUSSION
Based on the DNA methylation pattern, we developed an obesity evaluation model (DM-BMI) in the study. We then further identified the obesity-related gene expression profile based on the DM-BMI model. Although obesity has been shown to be a BC risk factor for many years, most studies have been focusing on the correlation between obesity and clinical prognosis while studies about the biological and genomic impact of obesity on breast cancer were limited. The adipose tissue, as the major agent mediating obesity-related biological effects, is an important starting point for the study of the obesity impact. In both breast and BC tissue, we observed a positive correlation between the proportion adipose tissue content and the DM-BMI. Previous studies reported that the expansion of adipose tissue, accompanied by the dysregulation of the endocrine function (adipokine secretion) of the adipose cells, was driven by an increase in size of adipose or by a formation of a new adipose cell (Ghaben and Scherer, 2019;Quail and Dannenberg, 2019). As the DM-BMI increased, we observed an increased expression of pro-inflammatory adipokines but a decreased expression of anti-inflammatory adipokines which might synergistically induce obesity-mediated inflammation through the activation of the NF-κB pathway and a pro-oncogenic environment.
In addition to the expansion of adipose tissue, we also observed a promoting effect of obesity on immune response in BC tissue. For those obese individuals, adipose tissue expands with the increasing demand of oxygen, which induces the development of the hypoxia environment (Iwamoto et al., 2018). The activation of hypoxia signaling increases the expression of adipokines, especially the pro-inflammation adipokines (including CCL2, CXCL8, CXCL10, IL-18, IL-1α and Oncostatin M), which is involved in the recruitment of tumor-associated immune cells (Taylor and Colgan, 2017;Hou et al., 2020;McGettrick and O'Neill, 2020). Moreover, previous studies showed that adipocytes could fuel immune cells by releasing exosome-sized and lipid-filled vesicles (Flaherty et al., 2019;Zhang et al., 2020). In BC tissues, we observed that DM-BMI is positively correlated with the degree of M1 macrophages and activated dendritic cell and T cell infiltration, indicating an increase activity of tumor immune response. As T cell exhaustion is the key for tumor immune escape, previous studies have indicated that T cell exhaustion could be reversed by immune checkpoint inhibition (such as PD-1 inhibition) and replenishing activated T cells (such as CAR-T) (Bajgain et al., 2018;Bassez et al., 2021). Interestingly, in obese BC tissues, we found an increase content of CD4 +CD8 + and follicular helper T cells, which may be the result of an increased secretion of immune chemokines in adipose tissue. Although we observed a positive correlation between DM-BMI and regulatory T cell (Treg), coupled with a subset of immune cell with immunosuppressive activity, they can be interpreted as a negative feedback regulation by the immune system to maintain the stability of the immune environment after the activation of the immune response (von Boehmer and Daniel, 2013). Furthermore, our study revealed that DM-BMI is positively correlated with ICI response markers in BC tissues. These results suggest that obese BC patients may benefit from ICI. Recently, Wang et al. consistently reported that obesity is concerned with increased response of PD-1/PD-L1 blockade in an animal melanoma tumor model (Wang et al., 2019). However, as our findings were mainly supported by database analysis, data from clinical samples treated with ICI treatment are still required to validate the correlation between DM-BMI and ICI response.
With the increasing number of obese patients, the impact of obesity on the treatment of breast cancer has aroused more and more attention. We observed that the increased DM-BMI was correlated with higher mortality risk in patients with chemotherapy, hormone therapy, and radiation therapy. Although no evidence pointed out that obesity will induce drug-resistance in cancer cell, dose of chemotherapy and radiation might still be routinely reduced in obese individuals as doctors usually limit body surface area under 2 m 2 to reduce toxicity when calculating the dose of chemotherapy (Lyman, 2012;Picon-Ruiz et al., 2017). As is highly expressed of aromatase, adipose tissue is an endocrine organ, which is an important site for estrogen biosynthesis, especially for postmenopausal women (Liedtke et al., 2012). For obese BC patients, increased expression of aromatase might cause the resistance to endocrine-therapy.
Because of the limitation of BMI in obesity evaluation, several imaging methods have been developed for obesity evaluation (including: bioimpedance analysis instruments, dual-energy x-ray absorptiometry, computed tomography, and magnetic resonance imaging) (Karlsson et al., 2013;Neamat-Allah et al., 2014). Although these new methods enabled the precise quantification of adipose tissue, the operational complexity did limit their application (Nimptsch et al., 2019). Developing new methods to evaluate the degree of obesity is of great value.
Recently, an increased number of studies indicated that environmental factors (such as dietary pattern and lifestyle) will induce changes in the DNA methylation pattern predisposing to obesity and obesity and, likewise, results in genome-wide methylation variation (Wahl et al., 2017;Samblas et al., 2019). Moreover, biomarkers based on DNA methylation have been shown to be effective in obesity evaluation while most of them were only applied in blood samples (Samblas et al., 2019). Thus, we developed a DNAmethylation-based biomarker (DM-BMI) for obesity evaluation in breast tissue. In both normal breast and tumor-adjacent breast tissue, DM-BMI showed a significant correlation in both BMI and the content of adipose tissue. In addition, we also observed that DM-BMI was positively correlated with the degree of proinflammatory adipokine and immune cell infiltration in BC tissues. All those data indicated that DM-BMI is an effective biomarker to evaluate the biological changes in tumor tissues of obese patients.
The identification of BMI predictors naturally causes the assumption that these CpGs are critical regulators of obesity. In our study, only 11 of 42 BMI predictors were significantly correlated with DM-BMI while the others exhibited negligible correlation with DM-BMI. Although DNA methylation level was negatively correlated with gene expression in over half of BMI predictors, 45.2% of BMI predictors were located at the body region of gene sequence. How the CpGs located at the body region regulate gene expression remains unclear. As a previous study reported, the variations of DNA methylation pattern are the consequence, rather than cause, of adiposity (Wahl et al., 2017). Whether these BMI predictors are regulators of obesity or imprints of the biological process remained to be further investigated.

CONCLUSION
Collectively, we established a new biomarker for obesity evaluation and discovered that BC tissues of obese individuals exhibit an increased degree of immune cell infiltration, indicating that obese BC patients might be the potential beneficiaries for ICI treatment.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material.

AUTHOR CONTRIBUTIONS
XH and FX designed the overall project; ZX and XL analyzed the data and wrote the manuscript; ZX and LY collected and analyzed the data; XL, LW, and YX did the statistical analysis. All the authors have read and approved the final manuscript.

FUNDING
The study was supported by funds from the National Natural Science Foundation of China (81974444 to XX).