rs12537 Is a Novel Susceptibility SNP Associated With Estrogen Receptor Positive Breast Cancer in Chinese Han Population

Genetic testing is widely used in breast cancer and has identified many susceptibility genes and single nucleotide polymorphisms (SNPs). However, for many SNPs, evidence of an association with breast cancer is weak, the underlying risk estimates are imprecise, and reliable subtype-specific risk estimates are lacking. A recent genome-wide long non-coding RNA (lncRNA) association study in the Chinese Han population reported a suggestive genetic association between rs12537 and breast cancer. The present study aimed to investigate the association between rs12537 and clinical phenotypes. We used the clinical information of 5,634 breast cancer patients and 6,308 healthy controls from that earlier study, and compared genotypes between groups with the χ2 test. The genotype distribution of rs12537 was not significantly associated with family history (p = 0.8945), menopausal status (p = 0.3245) or HER-2 status (p = 0.2987), but it was significantly associated with ER status (p = 0.004006) and PR status (p = 0.01379). Most importantly, compared with healthy controls, the rs12537 variant was significantly associated with ER positive breast cancer, and the p-value reached genome-wide significance (p = 1.66E-08 < 5.00E-08). Furthermore, we found that the rs12537 associated gene MTMR3 showed lower expression but higher promoter methylation in breast cancer tissues. In conclusion, our findings indicate that rs12537 is a novel susceptibility SNP for ER positive breast cancer in the Chinese Han population and that it may influence the methylation of MTMR3.


INTRODUCTION
The burden of breast cancer is increasing worldwide. Among the 19.3 million new cancer cases reported by GLOBOCAN 2020, breast cancer accounts for 11.7% (1). China is undergoing a cancer transition with an increasing burden of breast cancer, with a reported breast cancer incidence of 18.41%. Female breast cancer patients in China account for approximately 18% of breast cancer deaths worldwide (2). Many sequencing approaches, such as genome-wide association studies (GWASs), exome sequencing and lncRNA sequencing, are used to identify SNPs, loci and genes related to the occurrence, development, prognosis and drug resistance of breast cancer (3)(4)(5)(6)(7).
Breast cancer is a heterogeneous and polygenic disease, and breast cancer susceptibility SNPs and genes are closely related to molecular subtypes and clinical phenotypes (8)(9)(10). However, for many SNPs, evidence of an association with cancer is often weak, and accurate estimates of the cancer risks associated with variants are often not available (11). In our previous study, we performed a genome-wide lncRNA association study and reported a suggestive SNP, rs12537 (p = 8.84E-07), which may be associated with breast cancer susceptibility (12). The rs12537 variant has been reported to be associated with IgA nephropathy in Han Chinese, and with rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE) in Egyptian patients (13,14). Moreover, the rs12537 variant was also found to be associated with a significantly increased risk of gastric cancer (15,16). However, the relationship between rs12537 and breast cancer remains unclear.
To investigate the association of rs12537 on 22q12.2 with breast cancer susceptibility and with clinical phenotypes, including family history, menopausal status, estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER-2) and molecular subtypes, we conducted a genotype-phenotype analysis of breast cancer patients in the Chinese Han population. Moreover, we analyzed the relationship between rs12537 associated genes and breast cancer based on public databases.

Subjects
We collected the genotyping data from our previous study (including GWAS stage data and replication stage genotyping data) and clinical data (including age of onset, family history, menopausal status, ER, PR and HER-2) of a total of 5,634 patients (12). Immunohistochemical analysis was employed to evaluate the ER, PR and HER-2 status of breast tissue biopsies. Each case was diagnosed and confirmed by at least two oncologists, and clinical information was collected by investigators through a comprehensive clinical check-up. We also collected the genotyping data and age of 6,308 healthy controls, who were clinically determined to be free of breast cancer, other neoplastic diseases and systemic disorders, and to have no family history of cancer (including first-, second- and third-degree relatives). All participants provided written informed consent. This study was approved by the institutional ethics committee of each hospital and was conducted according to the Declaration of Helsinki principles.

Statistical Analysis
To identify which phenotypes were associated with SNP rs12537, we performed case-control and case-only analyses to examine the risk conferred by the suggestive SNP on different phenotypes of breast cancer. PLINK 1.07 software (developed by Shaun Purcell and others) and SPSS 16.0 (IBM, https://www.ibm.com) were used to perform chi-square tests and logistic regression analysis to explore the correlation between rs12537 and breast cancer susceptibility, as well as the different phenotypes of breast cancer. Allele and genotype frequencies were calculated by direct counting, the χ2 significance test was carried out, and the relative risk was evaluated by the odds ratio (OR) and 95% confidence interval (95% CI). Differences were considered statistically significant at p < 0.05. There was no significant deviation from Hardy-Weinberg equilibrium in the controls (p > 0.05) at each stage.
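The quantities named above (allelic χ2 test, OR with 95% CI, and a Hardy-Weinberg check in controls) can be illustrated with a minimal Python sketch. This is not the PLINK or SPSS implementation and need not match their internals; the function names and the allele and genotype counts in the usage lines are hypothetical and are not data from this study. The sketch assumes a Pearson χ2 test without continuity correction and a Wald interval on the log odds ratio.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def chi2_1df_pvalue(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def allelic_association(case_minor, case_major, ctrl_minor, ctrl_major):
    """Allelic chi-square test, odds ratio and Wald 95% CI from a 2x2 allele table."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table (no continuity correction)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf method)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (
        math.exp(math.log(odds_ratio) - Z_95 * se_log_or),
        math.exp(math.log(odds_ratio) + Z_95 * se_log_or),
    )
    return chi2, chi2_1df_pvalue(chi2), odds_ratio, ci


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A by direct counting
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_1df_pvalue(chi2)


# Hypothetical counts, for illustration only
chi2, p, odds_ratio, ci = allelic_association(300, 700, 380, 620)
print(f"chi2 = {chi2:.3f}, p = {p:.3g}, OR = {odds_ratio:.4f}, "
      f"95% CI: {ci[0]:.4f}-{ci[1]:.4f}")
print("HWE in controls (chi2, p):", hwe_chi2(75, 230, 195))
```

An OR below 1 with a confidence interval that excludes 1 indicates that the counted allele is less frequent in the first group than in the second; a Hardy-Weinberg p-value above 0.05 indicates no significant deviation from equilibrium.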

Sample Characteristics
All subjects involved in this study were from our earlier genome-wide lncRNA association study (12).

Genotypic and Phenotype Analysis
To further explore the relationship between the suggestive SNP rs12537 and breast cancer susceptibility, we combined our GWAS and replication data to perform a genotype-phenotype analysis based on the clinical information we collected. The results show that rs12537 is not related to family history of cancer, menopausal status, HER-2 status or the four molecular subtypes of breast cancer. However, there is a statistical difference between PR positive and PR negative patients in the rs12537 variant (p = 0.01379, OR = 0.8536, 95% CI: 0.7525-0.9638) (Table 2 and Supplementary Table 1).
We also performed genotype-phenotype analysis on the other three SNPs, rs9397435, rs11066150 and rs62112521, in Chinese Han women (12), but found no correlation between these three SNPs and the clinical characteristics of breast cancer (Supplementary Table 2). Notably, ER positive patients and healthy controls differed significantly in the rs12537 variant (p = 1.66E-8, OR = 0.7744, 95% CI: 0.7085-0.8464), and this difference reached the genome-wide significance threshold (p < 5.00E-8). Moreover, there was also a statistical difference between ER positive and ER negative patients in the rs12537 variant (p = 0.004006, OR = 0.8309, 95% CI: 0.7323-0.9427) (Table 2). To exclude the influence of clinical features other than ER status on the results, we further compared the clinical characteristics of ER positive patients, ER negative patients and healthy controls (Supplementary Table 3), and found that age, family history and HER-2 expression were not correlated with ER status. Thus, rs12537 is a novel ER positive breast cancer associated SNP variant in Chinese Han women.
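As a rough consistency check on the ER positive versus control comparison, an approximate Wald z statistic and two-sided p-value can be recovered from the reported OR and 95% CI alone. The Python sketch below assumes the interval is a symmetric Wald interval on the log scale; the function name is hypothetical, and the result is only expected to approximate the reported χ2 p-value, not reproduce it exactly. The 5.00E-8 threshold is the conventional genome-wide level, commonly motivated as 0.05 divided by roughly one million independent tests.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide significance threshold


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z and two-sided p from an OR and its 95% Wald CI."""
    # The CI spans 2 * z_crit standard errors on the log-odds scale
    se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Values reported in the text for ER positive patients vs healthy controls
z, p = wald_p_from_or_ci(0.7744, 0.7085, 0.8464)
print(f"z = {z:.2f}, p = {p:.2e}, "
      f"genome-wide significant: {p < GENOME_WIDE_ALPHA}")
```

Under these assumptions the recovered p-value is of the same order of magnitude as the reported 1.66E-8 and falls below the 5.00E-8 threshold.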
The rs12537 associated gene, MTMR3, was reported to be associated with RA and SLE, gastric cancer and breast cancer (14,15,19). Therefore, we investigated the expression of MTMR3 in The Cancer Genome Atlas (TCGA) database using UALCAN (20). Compared with normal tissues, MTMR3 expression was lower in primary tumor tissues (p < 1E-12), whereas the promoter methylation level was higher (p = 2.66E-02). MTMR3 expression was not associated with overall survival (OS) (p = 0.44) (Figures 1A-C). Moreover, based on Kaplan-Meier Plotter (www.kmplot.com) (21), we found that high MTMR3 expression was associated with better relapse-free survival (RFS) (p = 2.2E-06) (Figure 1D), but there was no correlation between MTMR3 expression and OS, post-progression survival (PPS) or distant metastasis-free survival (DMFS) in breast cancer patients (Supplementary Figure 1).

DISCUSSION
Breast cancer is a complex multifactorial disease, with high incidence, strong invasiveness, metastasis and heterogeneity (1,22,23). A large number of sequencing studies have identified more than 200 susceptibility SNPs/genes (24). By combining sequencing analysis with the clinical characteristics of breast cancer patients, more SNPs/genes that have a stronger correlation with clinical characteristics were identified, which provides important theoretical support for precision treatment of breast cancer (8,11,25).
It is reported that more than 60% of breast cancers, including Luminal A and Luminal B breast cancers, are ER positive (8,10). ER positive breast cancer is a highly heterogeneous disease comprising different histological and mutational patterns, with varied clinical courses and responses to systemic treatment. GWASs have identified many ER positive breast cancer associated SNPs, such as rs112545418, rs17132398 in 4p16, rs116638271, rs77274510 and rs117564384 in 11q13 and rs10941679 in 5p12 (26,27). In our previous study, we independently designed a lncRNA array and performed the first genome-wide lncRNA association study on Han Chinese women, identifying a novel breast cancer-associated susceptibility SNP, rs11066150, a previously reported SNP, rs9397435, and two suggestive SNPs, rs12537 and rs62112521 (12). However, the present study revealed that rs11066150, rs9397435 and rs62112521 had no relationship with the clinical characteristics of breast cancer (Supplementary Table 2). In the present research, we identified rs12537 as a novel susceptibility SNP in ER positive breast cancer in Han Chinese women. This is the first time that rs12537 has been reported to be associated with ER positive breast cancer. However, only 4,263 patients with ER data (65.05% ER positive, 34.56% ER negative) and 6,308 healthy controls were included in this study, and a larger and better-matched population (including age, family history, menopausal status, ER, PR and HER-2) may be needed for further verification.
The SNP rs12537 is located in the miR-181a-binding site in the 3' UTR of the MTMR3 gene (15), and this T/C variant in MTMR3 was reported to be associated with IgA nephropathy, RA, SLE and gastric cancer (13)(14)(15)(16). MTMR3 is an autophagy-related gene involved in the negative regulation of autophagy initiation (24). Among SLE patients, rs12537 T/T carriers had lower serum MTMR3 expression and higher miR-181a expression than carriers of other genotypes, and this interaction may lead to increased autophagy (14). In gastric cancer, rs12537 CT genotype carriers had lower MTMR3 mRNA expression than CC genotype carriers (15). Ectopic expression of miR-181a mimics or introduction of MTMR3 small interfering RNA resulted in increased cell proliferation, colony formation, migration and invasion, as well as suppression of apoptosis, in gastric cancer (28).
DNA methylation plays a crucial role in the development and progression of cancers, and methylation markers are potential candidate biomarkers for cancers (29). Based on the TCGA database, we found that MTMR3 expression was lower in breast cancer tissues than in normal tissues, whereas the promoter methylation level was higher. However, MTMR3 expression had no correlation with overall survival. Here, we hypothesize that the rs12537 variant in ER positive breast cancer patients could regulate the methylation of MTMR3; further studies are required to fully understand the mechanism.
In conclusion, the results of our study show that rs12537 is a novel susceptibility SNP in ER positive breast cancer in the Chinese Han population. Moreover, the rs12537 associated gene MTMR3 shows lower expression but higher promoter methylation in breast cancer. Considering that we did not find a correlation between MTMR3 expression and overall survival based on the TCGA database, multicenter studies involving a larger number of cases and genotypic data are needed to verify this result.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Anhui Medical University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
JX, YC, and BZ conceived of the idea. JX, GL, and MC conducted statistical analyses. WL and YW collected clinical data. JX and GL wrote the manuscript with inputs from all authors. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
The authors would like to thank all the participating patients and healthy controls, as well as all the doctors and nurses who have contributed to this work.