Correlation of ERCC5 polymorphisms and linkage disequilibrium associated with overall survival and clinical outcome to chemotherapy in breast cancer

Purpose: ERCC5 is a DNA endonuclease and nucleotide excision repair gene; its mutations lead to a lack of activity by this enzyme, causing oxidative DNA damage. This study aimed to assess the role of four selected single nucleotide polymorphisms (SNPs) in ERCC5, and their linkage disequilibrium, in relation to survival and clinical outcomes in breast cancer. Patients and methods: Four SNPs (rs751402, rs17655, rs2094258, and rs873601) of the ERCC5 gene were analyzed using the PCR-RFLP technique, followed by sequencing, in 430 breast cancer (BC) cases and 430 cancer-free individuals. Statistical analysis was performed using MedCalc 17 and SPSS version 24, while bioinformatic analysis of linkage disequilibrium was performed using Haploview software 4.2. Results: Multivariate analysis showed that the rs751402 and rs2094258 polymorphisms were significantly associated with an elevated risk of BC (P < 0.001), while the other two SNPs, rs17655 and rs873601, did not show any association (P > 0.001). Survival analysis revealed that rs751402 and rs2094258 were associated with longer overall survival (P < 0.001) than rs17655 and rs873601. Moreover, rs751402 and rs2094258 were also associated with significantly longer overall survival (log-rank test, P < 0.005) for all three survival functions (positive family history, ER+PR status, and use of contraceptives), while rs17655 and rs873601 did not show any significant association. Only rs873601 showed a strong negative correlation with all the chemotherapeutic groups. Conclusion: The current results suggest that variations in ERCC5 may contribute to BC development, that these genetic anomalies may be associated with cancer risk, and that they may be used as biomarkers of clinical outcome.


Introduction
Breast cancer (BC) is one of the most common malignancies and the primary cause of death among females worldwide (1). One in nine women in Pakistan is affected by this disease (2). The mechanisms underlying breast carcinogenesis are not yet fully understood. Various polymorphisms of genes involved in DNA damage responses play a significant role in cancer development and proliferation. Genes associated with DNA repair pathways are considered candidate genes for cancer susceptibility because reduced repair efficiency may induce carcinogenesis (3). One of the DNA repair pathways is the nucleotide excision repair (NER) pathway, which is significantly associated with cancer risk. Maintaining genomic stability and preventing the propagation of errors in the genome requires efficient DNA repair, and the NER pathway helps in the repair of bulky lesions such as thymine dimers generated by ultraviolet radiation (4). ERCC5, also known as xeroderma pigmentosum group G (XPG), is a vital constituent of the NER mechanism. It encodes an endonuclease enzyme, which makes a structure-specific 3′ incision at damaged DNA sites. It can also act non-enzymatically by participating in a 5′ incision with the help of the ERCC1/XPF heterodimer (5). ERCC5 is expressed in different tissues and cell lines, and its deficiency leads to genomic instability, DNA repair faults, and impaired modulation of gene transcription; it thereby contributes to DNA damage and higher breast cancer susceptibility. Regulation of DNA repair is a vital feature in various steps of carcinogenesis. Single nucleotide polymorphisms (SNPs) in ERCC5 may change its activity or expression and thereby affect DNA repair function, which can alter the effects of cancer treatment, as treatment outcomes depend on the genetic variant present (6,7).
Many studies have shown that XPG polymorphisms are linked with various cancers, including gastric, lung, breast, and colorectal cancer (8-11). However, to our knowledge, only a limited number of studies have examined the association of these particular ERCC5 polymorphisms (rs751402, rs17655, rs2094258, and rs873601) with BC and with patients' response to chemotherapy. To investigate the possible influence of ERCC5 on BC, a case-control study was designed to evaluate the involvement of these selected polymorphisms. Our study examines the correlation of ERCC5 polymorphisms with various clinicopathological factors, overall survival rates with different survival functions, linkage disequilibrium, and therapeutic outcomes of different chemotherapeutic drugs among breast cancer patients. Linkage disequilibrium analysis was conducted to explore the combined effects of these ERCC5 germline variants on breast carcinogenesis. It is expected that the data generated in the present study will help health practitioners make treatment decisions or provide the best advice based on an assessment of risk.

Subjects and ethical considerations
The study was approved by the ethical committees of the Institute of Nuclear Medicine, Oncology, and Radiotherapy (INOR) Hospital, Abbottabad, Pakistan, and Fatima Jinnah Women University, Rawalpindi, Pakistan. The sample size was evaluated using a sample size calculator provided by the World Health Organization and validated manually by. Blood samples and demographic details were collected between 2019 and 2022 from 430 histologically confirmed breast cancer patients (mean age 47.32 ± 11.7 years) and 430 healthy controls (mean age 46.3 ± 14.03 years, P = 0.005), all of whom signed informed consent to participate in the study. A questionnaire was designed to collect the clinicopathological details of patients.

Single nucleotide polymorphism selection
Four potential SNPs (rs751402, rs17655, rs2094258, and rs873601) were selected from the National Center for Biotechnology Information SNP database (http://www.ncbi.nlm.nih.gov/) and SNPinfo (http://snpinfo.niehs.nih.gov/), combined with previously described studies on the characteristics of the East Asian population in HapMap, using a minor allele frequency (MAF) threshold of > 5%. SNP rs17655 is a nonsynonymous SNP (nsSNP) located in exon 15, while the remaining three SNPs are located in regulatory regions of ERCC5 (the 3′ untranslated region (UTR), the 5′ UTR promoter region, and the 5′ near-gene region). rs2094258 in the 5′ near-gene region was predicted to affect transcription factor binding site activity, rs751402 lies in the 5′ UTR promoter region, and rs873601 in the 3′ UTR may influence splicing and miRNA binding sites.
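The MAF criterion used for SNP selection can be illustrated with a minimal sketch. The function names and the genotype counts below are hypothetical and chosen only for illustration; they are not data from this study or from HapMap.

```python
def minor_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Minor allele frequency from the three genotype counts at a biallelic SNP."""
    n_individuals = n_hom_ref + n_het + n_hom_alt
    if n_individuals <= 0:
        raise ValueError("at least one genotyped individual is required")
    # Each individual carries two alleles; heterozygotes carry one alternate allele.
    alt_freq = (2 * n_hom_alt + n_het) / (2 * n_individuals)
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


def passes_maf_threshold(genotype_counts, threshold=0.05):
    """True if the SNP's MAF exceeds the threshold (5% by default)."""
    return minor_allele_frequency(*genotype_counts) > threshold


# Hypothetical counts: 60 reference homozygotes, 30 heterozygotes, 10 alternate homozygotes.
example_maf = minor_allele_frequency(60, 30, 10)
```

With these illustrative counts the alternate allele accounts for 50 of 200 alleles, so the SNP would pass a 5% threshold.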

DNA extraction and polymorphism screening
Blood samples were collected in EDTA vacutainers and stored at −20°C until further use. Genomic DNA was isolated from blood samples by the standard phenol-chloroform method (5) and stored in a refrigerator at 4°C until further analysis. Qualitative analysis of DNA was performed using conventional electrophoresis on a 1% agarose gel and a spectrophotometer. Genotyping of the ERCC5 germline variants rs751402, rs17655, rs2094258, and rs873601 was performed by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP), following a method modified by Guo et al. (12). Primers were obtained from the published literature and are listed in Supplementary Table S1 along with their respective references.

Statistical and survival analysis
Clinicopathological details, demographic characteristics, and ERCC5 variants were compared between BC patients and healthy controls using Pearson's chi-square test (χ²) and Fisher's exact test. Conditional logistic regression was applied to assess the associations between ERCC5 SNPs, clinicopathological details, and breast cancer risk by computing odds ratios (ORs) and 95% confidence intervals (95% CIs). Genotype frequency distributions were tested for Hardy-Weinberg equilibrium (HWE). Patient follow-up was performed every 6 months, and the homozygous wild-type genotype was taken as the reference for all four ERCC5 SNPs. The overall survival (OS), survival distributions, and OS with three survival functions were estimated using the Kaplan-Meier method and compared with log-rank tests. The survival distributions among the different classes of chemotherapy drugs were also assessed. Patients were classified based on the chemotherapeutic drugs administered. Taxanes, cytotoxic agents, and a combination of chemotherapeutic drugs were administered to all patients treated with chemotherapy. The frequency of chemotherapeutic drugs in breast cancer with both SNPs was analyzed by the chi-square test. The correlation between SNPs and chemotherapeutic drugs was also assessed. Linkage disequilibrium analysis was performed using Haploview software 4.2. The significance level was set at P < 0.05. All statistical analyses were performed using IBM SPSS version 24 and MedCalc 17.
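Two of the quantities named above, the odds ratio with its 95% CI and the HWE test, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure run in SPSS or MedCalc: the odds ratio here is the unadjusted estimate from a 2x2 table with a Woolf (log-based) interval rather than a conditional logistic regression, and the function names and example counts are hypothetical.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Unadjusted odds ratio and Woolf (log-based) confidence interval from a 2x2 table."""
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit statistic (1 df) and P value for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n <= 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p in (0.0, 1.0):
        raise ValueError("the site must be polymorphic")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 20 variant carriers and 10 non-carriers among cases, 10 and 20 among controls.
example_or, example_lower, example_upper = odds_ratio_ci(20, 10, 10, 20)
```

A P value above the chosen significance level from `hwe_chi_square` indicates no detectable departure from HWE in the genotyped group.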

Subject characteristics
The current study aimed to assess the genetic variations in the DNA excision repair protein ERCC-5 of the nucleotide excision repair pathway in 430 BC patients and 430 healthy controls. The demographic details and genotype frequencies of ERCC5 in patients with BC and healthy controls are shown in Table 1. The demographic parameters studied included family history, age, cancer staging, chemotherapeutic drug type,

Association of ERCC5 germline variants and clinicopathological parameters
To associate the genotype frequency of the assessed SNPs with clinicopathological factors, we applied logistic regression and χ² tests. In this analysis, clinicopathological factors such as family history, marital status, ER status, PR status, and menopausal status were considered independent factors, and the genotype of all evaluated SNPs was considered a dependent variable, as illustrated in Table 2. The distribution frequency of the homozygous variant type and heterozygous variant type of ERCC5 rs751402 was only associated with patients who used

Linkage disequilibrium analysis
Linkage disequilibrium (LD) among the evaluated polymorphisms of the ERCC5 gene was calculated using Haploview software, as shown in Figure 1. LD values are displayed as r² and D′ values. Site 1 represents rs751402, site 2 represents rs17655, site 3 represents rs2094258, and site 4 represents rs873601. Sites 3 and 4 (rs2094258 and rs873601, respectively) exhibited stronger LD among cancer patients than among healthy controls.
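For readers unfamiliar with the two LD measures, the following sketch computes D, D′, and r² for one pair of biallelic sites. It assumes that phased haplotype counts are already available, which is a simplification: Haploview estimates haplotype frequencies from unphased genotypes. The function name and the counts are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic sites from the four haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total <= 0:
        raise ValueError("at least one haplotype is required")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first site
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second site
    variance_term = p_A * (1 - p_A) * p_B * (1 - p_B)
    if variance_term == 0:
        raise ValueError("both sites must be polymorphic")
    # D is the deviation of the observed AB haplotype frequency from independence.
    d = p_AB - p_A * p_B
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    r_squared = d * d / variance_term
    return d, d_prime, r_squared


# Hypothetical haplotype counts for a pair of sites.
example_d, example_d_prime, example_r2 = pairwise_ld(40, 10, 0, 50)
```

D′ reaches 1 whenever at least one of the four haplotypes is absent, whereas r² reaches 1 only when the two sites carry the same information, which is why the two measures can differ for the same pair of sites.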

Survival distributions for different chemotherapeutic drug classes
Chemotherapeutic drug data were available for only 236 patients, possibly because the remaining patients were not taking those drugs or had incomplete records. We categorized the chemotherapeutic drugs administered into classes: cytotoxic drugs, taxanes, others, cytotoxic drugs plus taxanes, and all three given in combination. Patients were followed up every six months to inquire about their health condition, to monitor the effectiveness of the drugs, and for survival analysis. Genetic analysis was conducted to evaluate the association of SNPs with the response to a particular chemotherapeutic drug type. The outcomes are summarized in Table 3. We were unable to find any association between the respective chemotherapeutic drugs and rs17655 (P > 0.001), whereas rs751402 showed a significant association (P < 0.001). Survival differences between drugs were compared using the Breslow, Tarone-Ware, and log-rank tests, none of which showed a significant difference in overall survival among the drugs administered (log-rank test, P = 0.09; median = 18; 95% CI = 14.9-21.08) (Figure 2).
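The Kaplan-Meier estimate that underlies these survival curves can be sketched as follows. This is an illustrative implementation under stated assumptions, not the SPSS or MedCalc routine used in the study; the function names, the follow-up times, and the event indicators are hypothetical.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier curve as a list of (time, survival probability) at each event time.

    durations: follow-up time for each patient.
    events: 1 (or True) if death was observed, 0 (or False) if the patient was censored.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    records = sorted(zip(durations, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        time = records[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time.
        while i < len(records) and records[i][0] == time:
            if records[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((time, survival))
        # Both deaths and censored patients leave the risk set afterwards.
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First event time at which estimated survival is 0.5 or lower, or None if never reached."""
    for time, survival in curve:
        if survival <= 0.5:
            return time
    return None


# Hypothetical follow-up times with event indicators (1 = death, 0 = censored).
example_curve = kaplan_meier([6, 12, 12, 18, 24], [1, 1, 0, 1, 0])
example_median = median_survival(example_curve)
```

Censored patients contribute to the risk set up to their last follow-up but never lower the curve themselves, which is how the method uses incomplete follow-up without treating it as an event.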

Discussion
The present study was designed to associate single nucleotide polymorphisms of ERCC5 (rs751402, rs2094258, rs17655, and rs873601) with breast cancer and associated risk factors. A significantly higher rate of variants at rs751402 and rs2094258 was observed in breast cancer patients than in noncancerous individuals, while the other two evaluated SNPs did not show any association. Only rs17655 is located in an exonic region, whereas the remaining three are located in regulatory regions of ERCC5. The present study found an elevated BC risk with a positive family history, in line with previously reported literature (13,14). It also found that increased BC risk was linked to late menopause and early menarche, which is concordant with the literature (15). Regulation of the NER pathway is essential for maintaining genome integrity, and ERCC5 is a multifunctional gene that encodes a structure-specific endonuclease (16). Studies have found an association between ERCC5 genetic variations and different cancers (8). The current study found a significant association between variant types of rs751402 and rs2094258 and an elevated risk of breast cancer. At present, very few studies are available on these polymorphisms and breast cancer. A significant correlation of variant genotypes of rs751402, rs2094258, and rs873601 with colorectal cancer susceptibility has been reported (17). Pongsavee and Wisuwan (4) also reported a significant association of rs751402 with breast cancer in a Thai population. Wang et al. (7) described no association of rs17655 with BC among the Han population of northwest China and a significant association of rs751402 with breast carcinogenesis. A meta-analysis showed that rs873601 was significantly associated with overall risk, and another meta-analysis showed that this polymorphism is involved in the development and severity of colorectal cancer (8,18). Guo et al. (12) investigated the role of rs17655 and rs751402 in the development of gastric cancer in a Chinese population and found that the mutant genotype of rs751402 significantly increased gastric cancer risk compared to the wild type, but rs17655 did not. Several meta-analyses have found that the rs17655 polymorphism might not confer susceptibility to breast cancer, although the results remain inconsistent (6,19).
Survival in our study was short, which we attribute to delayed medical aid, a diverse medical history, and an advanced disease stage. Patients who receive medical aid early have higher 5-year survival rates than those with delayed presentation (88% and 12%, respectively) (20). Most patients in the present study presented with advanced disease because they came from rural areas, mostly lacked knowledge of the disease, and faced financial constraints that limited their access to therapeutic options and a good diet. Seeking medical aid early, before the disease reaches an advanced stage, implies a better prognosis and ultimately improves survival rates.

Conclusion
In conclusion, two SNPs (rs751402 and rs2094258) may play a role in the etiology of breast cancer in Pakistan. This is the first report of the association between ERCC5 (rs751402, rs2094258, rs17655, and rs873601) and breast cancer risk in Pakistan. The literature in this area is limited; therefore, studies with larger sample sizes are needed to obtain more conclusive results. Furthermore, late menopause, positive ER/PR status, and a positive family history are contributing factors to breast cancer development.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Figure caption: Survival distributions for different chemotherapeutic drug classes among breast cancer patients.

Ethics statement
The studies involving human participants were reviewed and approved by the Research Ethics Committee of Fatima Jinnah Women University, Rawalpindi, Pakistan. The patients/participants provided their written informed consent to participate in this study.