A Novel Prognostic Biomarker of Luminal Breast Cancer: High CD39 Expression Is Related to Poor Survival

Background CD39 is one of the functional surface markers for T regulatory cells, the prognostic role and immune-related effects of CD39 in luminal breast cancer (BC) patients has not been evaluated yet. The aim of the current study was to explore the association between CD39 expression and clinic pathological characteristics and the prognosis in luminal BC patients. Methods Clinical information and RNA-sequencing (RNA-Seq) expression data were extracted from The Cancer Genome Atlas (TCGA). Patients were divided into a high or low CD39 expression group by the optimal cutoff value (4.18) identified from the receiver operating characteristic curve analysis. The relationships between CD39 expression and clinic pathological features were evaluated by the corresponding statistical tests. Survival analyses were applied to evaluate the overall survival between the high and low CD39 expression groups in luminal BC. Furthermore, Gene Expression Omnibus datasets were used for external data validation. Gene set enrichment analysis (GSEA) was also performed, and CIBERSORT was used to analyze the immune cell populations. Results Analysis of 439 cases of tumor data showed that CD39 was overexpressed in luminal BC. The multivariable analysis suggested that CD39 expression was an independent prognostic factor for luminal BC patients. GSEA suggested that CD39 might play an important role in luminal BC progression through immune regulation. Analysis of immune cell patterns revealed high CD39 expression correlated to a higher proportion of CD8+ T cells and M2 macrophages. Conclusion This study demonstrates that CD39 expression correlates with the prognosis of luminal BC through TCGA database mining. Further studies are warranted further to elucidate this potential novel therapeutic strategy for BC.


INTRODUCTION
Breast cancer (BC) is the most familiar malignant neoplasm in women worldwide, with multiple molecular subtypes (Siegel and Miller, 2020). Tumor size, histological grade, lymph node stage, and hormone receptor status were the common risk factors of BC (Chia et al., 2008;Soerjomataram et al., 2008). However, BC's propensity of giving rise to distant metastases also depends on the molecular subtype. Luminal A-like tumors were defined as estrogen receptor-positive (ER+), human epidermal growth factor receptor 2 negatives (HER2−), progesterone receptor (PR) ≥ 20%, Ki67 < 14% which has "low" recurrence risk assessed by gene assays (Goldhirsch et al., 2013). Luminal B-like tumors were defined as ER+, HER2−, and at least one of the following: PR negative < 20%, Ki67 ≥ 20%, and which has "high" recurrence risk based on gene assays. It was reported that the most common subtype of BC is the ERα+ subtype (luminal A or luminal B), which comprises 80% of all BC (Metzger-Filho et al., 2013). Bone is the most common metastatic site in luminal subtypes (Kennecke et al., 2010). Although BC has a better prognosis and a higher overall survival (OS) rate, especially the luminal subtypes (Kennecke et al., 2010;Wu et al., 2017), it remains a challenge to reduce BC's bone metastasis and mortality. Hence, identifying novel molecular signature to predict BC's prognosis, especially the most common luminal ERα+ subtype, is of great importance.
Ectonucleoside triphosphate diphosphohydrolase 1 (ENTPD1/CD39) is not the only ATP-degrading enzyme that is expressed by tumor cells; for example, nucleoside triphosphatase, cancer-related (NTPCR) is also expressed by some tumors which can bind to extracellular ATP and then convert it to adenosine (Moesta et al., 2020). Its identification derives from studies on genetics, biochemistry, pharmacology, and immunology revealing the broad immunosuppressive effects of adenosine (Wolberg et al., 1975;Huang et al., 1997;Ohta and Sitkovsky, 2001;Ohta et al., 2006). CD39 is widely expressed in immune cells and non-immune cells (Lei et al., 2015). CD39 is also detected in some tumor cells, and intratumoral immune cells demonstrate elevated CD39 expression (Moesta et al., 2020). CD39 dysregulation has been proved to be associated with many malignancies, including melanoma (Dzhandzhugazyan et al., 1998), leukemia (Pulte et al., 2011), pancreatic cancer (Künzli et al., 2007), ovarian cancer (Häusler et al., 2011), and colon cancer (Künzli et al., 2011). In BC patients' specimens, CD39 + CD8 + T cells were expressed in tumors or metastatic lymph nodes rather than non-invaded lymph nodes or peripheral blood (Canale et al., 2018). However, the prognostic role and immune-related effects of CD39 in luminal BC patients has not been evaluated yet.
Here, for the first time, we explored whether CD39 is a factor affecting luminal BC's prognosis. We explored the association between CD39 and clinic features and the prognosis using The Cancer Genome Atlas (TCGA)-BRCA level 3 data, and Gene Expression Omnibus (GEO) dataset GSE86166 was applied as external test data to validate the prognostic performance. Moreover, GSE 45827 was used to validate CD39 high expression in tumor tissue. Subsequently, we performed gene set enrichment analysis (GSEA) to explore the related regulatory network signaling pathways of CD39 in BC. To investigate the influence of CD39 on the tumor microenvironment, CIBERSORT was employed to analyze tumor-infiltrating immune cells (TIICs) associated with CD39 expression.

Patient Samples in TCAG and GEO Datasets
The RNA-sequencing (RNA-Seq) expression data and the clinical data of luminal BC patients were acquired from TCGA 1 . The analysis process was conducted using the RNA-Seq by Expectation-Maximization (RSEM) expression values. GEO datasets were acquired from the GEO database 2 .

Gene Set Enrichment Analysis
To investigate the influence of CD39 on pathway-level changes in BC tissues, we conducted GSEA to explore whether a priori defined set of genes demonstrated significant difference in expression between the constructed high and low CD39 expression groups (the grouping method is shown in the section "Statistical Analysis") in the TCGA cohort. The pathway with a normal P-value < 0.01 and a false discovery rate (FDR) < 0.01 was considered to be significantly enriched.

Analysis of Tumor-Infiltrating Immune Cells
CIBERSORT, a computational method developed by Newman et al. (2015), was employed to analyze the TIIC fractions of BC samples based on TCGA. The standardized processed data set of gene expression was uploaded in the CIBERSORT website 3 . Monte Carlo sampling was conducted for the deconvolution of each sample to improve the algorithm's accuracy. Only samples with a CIBERSORT P < 0.05 were enrolled in analysis. Moreover, the Wilcoxon rank-sum tests were performed for the comparisons of the TIIC fractions between the constructed CD39 expression high and low groups.

Statistical Analysis
All statistical analyses and plots were performed using R (v.3.5.1). A comparison of targeted gene expression in BC and normal tissues was conducted using the Wilcoxon rank-sum test using GEO datasets (GSE45827), which included 141 BC patients, and visualized in box plots. The Kruskal-Wallis test and Wilcoxon rank-sum tests were used to evaluate the relationships between CD39 and clinicopathological features.
Time-dependent receiver operating characteristic (ROC) curve analysis was used to determine the optimal cutoff value of CD39 in prognosis analysis in both training and validation datasets. Patients were then divided into two groups (i.e., high expression and low expression) in terms of cutoff value. We used a Chi-square test or Fisher's exact test to determine the link between CD39 groups and clinicopathological features. Kaplan-Meier method was used to plot the survival curves and the difference was compared by the Log-rank test. The relevant clinicopathological characteristics were screened by univariate analysis, and multivariable analysis was then performed to investigate association between CD39 and the survival of luminal BC patients. We evaluated the prediction performance of the constructed CD39 groups using an external GEO dataset (accession ID: GSE86166), which included the expression levels of 247 luminal patients. Additionally, the comparisons of three adenosine receptors (ARs) (ADORA2A, ADORA2B, and ADORA3) in BC and normal tissues were analyzed based on GSE45827 dataset using the Wilcoxon rank-sum test and visualized in box plots, Pearson correlation coefficients between three ARs and CD39 were also estimated, separately. The workflow of our analysis is presented in Figure 1.

Clinic Pathological Features of Patients
The analysis was performed in the TCGA-BRCA level 3 data. The clinic pathological features, including sample type, histological

CD39 Expression in Luminal BC
CD39 expression differences were depicted in boxplots according to patient age ( Compared to normal breast tissues, CD39 was higher in BC in the microarray GSE45827 dataset (P = 0.0009, Figure 3).

The Association Between CD39 Expression and Clinicopathological Features in Luminal BC
The high or low CD39 expression groups were determined by the optimal cut-off value (4.18) using ROC curve analysis. Chi-square test and Fisher's exact tests were used to compare the differences in clinic pathological features between the two groups. Several clinicopathological characteristics demonstrated  Values with "Not available," "Not Evaluated," "Unknown," and "Indeterminate" were excluded. Bold values of P < 0.05 indicate statistically significant correlations to be significantly associated with high CD39 expression ( Table 2), such as patient age (P < 0.0001), N classification (N0 vs N2-N3-NX) (P = 0.0263).

High CD39 Is an Independent Prognostic Factor of Luminal BC
We used Kaplan-Meier method to explore prognostic significance of CD39 expression in luminal BC patients. High CD39 was associated with poor OS in luminal BC patients (P = 0.029, Figure 4A). Validation using microarray datasets GSE86166 indicated similar prognostic prediction power of CD39 expression in BC (P = 0.0049, Figure 4B). The relevant clinic pathological characteristics screened by multivariable analysis were used to assess the influence of CD39 expression on luminal BC patients, high CD39 was an independent prognostic factor associated with poor OS in the luminal molecular type subgroup (training: P = 0.0185; HR: 2.310, 95% CI: 1.151-4.637, validation: P = 0.0458; HR: 1.602, 95% CI: 1.009-2.543, Table 3).

Gene Interactions and Enrichment Analysis of CD39
We conducted GSEA to find the different key signaling pathways between low and high CD39 expression datasets. Several significant different pathways (FDR < 0.05, normalized P < 0.05) were found in the enrichment of the MSigDB Collection (c2.cp.v6.2.symbols). Table 4 and Figure 5 demonstrated the most significantly enriched signaling pathways based on their normalized enrichment scores (NES). CD39 was related to the integrin1 pathway, ECM-receptor interaction pathway, AMB2neutrophils pathway, CXCR4 pathway, core matrisome, focal adhesion, reactome integrin cell surface interactions, AVB3 integrin pathway.

The Correlation Between Tumor-Infiltrating Immune Cells and CD39 Expression in Luminal Breast Cancer
We then investigated whether CD39 expression influenced immune cell infiltration in BC. The CIBERSORT algorithm was performed and 439 luminal tumor samples in the TCGA cohort (total population = 1090) with a P-value < 0.05 were eligible for this study. Figure 6A demonstrated the immune infiltration landscape in BC obtained from 439 tumors arranged by CD39 from low to high. To further investigate the influence of CD39 on TIICs, CD39 within the last/top quarter was specified as the low/high group. There were significant intragroup and intergroup variations in the proportions of TIICs. The high CD39 group showed a higher fraction of CD8 + T (P = 0.005, Figure 6B) cells and macrophage M2 cells (P = 0.003, Figure 6B) than the low expression group. In contrast, the M0 macrophages fraction was relatively lower (P < 0.000, Figure 6B). Meanwhile, the high CD39 expression group consisted of a higher proportion of CD4 + memory resting T cells and naïve B cells, which had an immunosuppressive phenotype.

DISCUSSION
We demonstrated that CD39 was higher in tumors compared to normal tissues from the TCGA data. CD39 expression was significantly associated with patient age, histological type, PR, and vital status. High CD39 was linked to poor luminal BC survival, and the multivariable Cox analyses confirmed that CD39 was an independent prognostic divisor in luminal BC using the TCGA cohort. GEO datasets validated the same results. CD39 expression has been identified to be associated with survival of luminal BC patients based on TCGA database mining for the first time.
It is known that in the tumor microenvironment, eATP can enhance immune responses and contribute to cancer cell death. CD39 can sequential hydrolysed eATP to AMP which can further degraded to anti-inflammatory adenosine by CD73 (Xiao et al., 2019). Recently, several CD39 antibodies had been developed in consideration of the potential value of CD39 as a therapeutic target (Bastid et al., 2013) mediated by reducing immunosuppressive adenosine or the increase in eATP (Moesta et al., 2020).
CD4 + T and CD8 + T cells has a pivotal role in cancer protective immunity. In BC, CD4 + T cells infiltration in the tumor microenvironment positively correlates with decreased OS (Matkowski et al., 2009). Our results revealed that the BC tissue with high CD39 expression had a higher scale of CD8 + T cells and M2 macrophages, however the M0 macrophages scale was relatively lower, suggesting that CD39 can promote TAMs toward M2 differentiation. Meanwhile, the high CD39 expression group comprised a higher proportion of naïve B cells and CD4 + memory resting T cells that presented immunosuppressive phenotypes, so CD39 is actually an exhaustion marker of T cells (Duhen et al., 2018;Simoni et al., 2018;Thelen et al., 2018). This is consistent with previous research, which revealed that more resting memory CD4 + T cells existed in ER+ cancers .
The enrichment of CD4 + T cell infiltration have significant implications for prognosis . CD39 may increase the resting memory of CD4 + cells by several mechanisms: (a) the function of anti-tumor T cells might be decreased by the adenosine generated by CD39; (b) CD39generating adenosine can promote apoptosis of T-cells; (c) macrophages and dendritic cells can be polarized into immunosuppressive regulatory cells resulting in a T cells limitation; (d) A2A receptor activation induces T regulatory cells and myeloid-derived suppressor cells (MDSCs) cell expansion and increases the immunity suppression; (e) CD39-generating adenosine can interact with specific G-protein-coupled receptors-A1, A2A, A2B, and A3 which can weak   Frontiers in Genetics | www.frontiersin.org anti-tumor immunity through enhance suppressive immune cells and attenuate the protective immune cells (Stagg and Smyth, 2010;Vigano et al., 2019). Deficient in the adenosine receptor A2A was related to the adenosinergic pathway in tumor immunity. Therefore, in this study, we performed an investigation into the genes' expression on the adenosinergic pathway based on TCGA datasets. ADORA2A and ADORA3 expression were significantly upregulated in BC, positively correlated with high CD39 expression. Thus, an over activated CD39-adenosine axis might contribute to BC immune escape and progression. The different molecular subtypes of BC, including HER2enriched, basal-like, luminal A, and luminal B, showed different prognosis (Sorlie et al., 2003). Luminal B subtype occupied for nearly 40% of BCs (Metzger-Filho et al., 2013). Luminal B BC is characterized by a low ER expression, a low PR expression, and a high histological grade (Harbeck et al., 2013). It generally demonstrated an aggressive behavior and has a prognosis similar to the HER2-enriched and base-like subtypes (Tran and Bedard, 2011). CD39-adenosine axis overactivated can play a vital role in luminal BC through immune escape pathways.
Further studies are needed to validate the link between CD39 and tumor immunity. CD39 may be an effective therapeutic target in luminal BC, as long as it is confirmed by further fundamental and clinical studies.
This study also has several limitations. Firstly, the OS of BC patients may be impacted by the other risk factors which were not collected in TCGA database, such as tumor size, treatments et al. Secondly, luminal BC comprises of luminal A and luminal B subtypes which may have significantly different OS, but it was hard to separate according to the current TCGA database. Third, it is better to involve the same covariates in training and validation datasets, but unfortunately, age was not collected in the validation set.

CONCLUSION
In the present analysis, CD39 expression correlates with the prognosis of luminal BC. This is the first study to demonstrate that CD39 is related to luminal BC survival based the immune pathway through TCGA database mining to the best of our knowledge. Further studies are warranted further to elucidate this potential novel therapeutic strategy for luminal BC.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ supplementary material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Zhongshan Hospital, Fudan University. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
XN and LH: conceptualization. WW: methodology. LH: software and data analysis. XN and XL: writing-original draft preparation. LH, ZH, JM, and WY: writing-review and editing. All authors have read and agreed to the published version of the manuscript.