Functional analysis of CTLA4 promoter variant and its possible implication in colorectal cancer immunotherapy

Background: Colorectal cancer (CRC) is a prevalent cancer, ranking as the third most common. Recent advances in our understanding of the molecular causes of this disease have highlighted the crucial role of tumor immune evasion in its initiation and progression. CTLA4, a receptor that acts as a negative regulator of T cell responses, plays a pivotal role in this process, and genetic variations in CTLA4 have been linked to CRC susceptibility, prognosis, and response to therapy.
Methods: We conducted a case-control study involving 98 CRC patients and 424 controls. We genotyped the CTLA4 c.-319C > T variant (rs5742909) and performed an association analysis by comparing allele frequencies between patients and controls. To assess the potential functional impact of this variant, we first performed an in silico analysis of transcription factor binding sites using Genomatix. To validate these predictions, we then conducted a luciferase reporter gene assay in different cell lines and an electrophoretic mobility shift assay (EMSA).
Results: The case-control association analysis revealed a significant association between CTLA4 c.-319C > T and CRC susceptibility (p = 0.023; OR 1.89; 95% CI = 1.11–3.23). Genomatix analysis identified the LEF1 and TCF7 transcription factors as specific binders of CTLA4 c.-319C. The reporter gene assay demonstrated notable differences in luciferase activity between the c.-319 C and T alleles in COS-7, HCT116, and Jurkat cell lines. EMSA analysis showed differences in TCF7 interaction with the CTLA4 C and T alleles.
Conclusion: CTLA4 c.-319C > T is associated with CRC susceptibility. Based on our functional validation results, we propose that CTLA4 c.-319C > T alters gene expression at the transcriptional level, triggering a stronger negative regulation of T cells and tumor immune evasion.


Introduction
Colorectal cancer (CRC) is the third most commonly diagnosed cancer and the second leading cause of death among cancer patients worldwide (GLOBOCAN, 2020-2021). The pathogenesis of this disease is complex and heterogeneous, and is influenced by several factors, including lifestyle, environment, and genetics. Genetic susceptibility is driven by germline variants and the accumulation of somatic mutations that disturb key processes involved in the cell cycle and promote tumorigenesis, including variants in tumor suppressor genes, proto-oncogenes and immunogens (1). Notably, genetic factors have been shown to contribute up to 35% of the etiology of this disease (2).
Increasing evidence in cancer biology highlights the significance of the immunological landscape and tumor immune evasion as one of the hallmarks of cancer. Several genes, such as NKG2D, CD28, TNFRSF4, CTLA4, CD80, CD86, and PD-1, have been associated with this immune response. Currently, conventional treatment approaches for CRC include surgery, chemotherapy and radiotherapy (3). More recently, biological therapies based on monoclonal antibodies have been approved, representing a promising area of biopharmaceutical research (4). Specific antibodies that block or deactivate immunological checkpoints and induce antitumor immune responses have been developed and are employed in cancer treatment (5). Ipilimumab, an anti-CTLA4 monoclonal antibody, has received regulatory approval from agencies such as the Food and Drug Administration (FDA) and the European Medicines Agency (EMA) for the treatment of cancer, including metastatic CRC. It is primarily used in advanced melanoma treatment and has demonstrated a complete response rate of 19% (6). The CHECKMATE-142 trial aimed to assess the effectiveness and safety of ipilimumab in patients with advanced colorectal cancer. The trial demonstrated promising activity of ipilimumab, either alone or in combination with nivolumab, in a subset of patients with microsatellite-unstable colorectal cancer (7).
The cytotoxic T-lymphocyte antigen-4 (CTLA4) gene, also known as cluster of differentiation 152 (CD152), encodes a type 1 transmembrane T cell inhibitory receptor and plays a critical role as an immune checkpoint. This gene belongs to the immunoglobulin superfamily and is transiently expressed on activated T cells while constitutively expressed on regulatory T cells. CTLA4 has two regulatory pathways: an intrinsic pathway involving direct interaction with the TCR-CD3 complex, leading to downstream negative regulation after T-cell receptor activation (8), and an extrinsic pathway in which CTLA4 competes with CD28 for interactions with the CD80 and CD86 ligands (9).
Considering its role in maintaining immunological balance, genetic variants in CTLA4 have the potential to modify the immune response and, consequently, susceptibility to cancer development. A comprehensive meta-analysis conducted by Fang et al. (2018), which included a total of 67 case-control studies, reported the involvement of several CTLA4 SNPs and proposed the utility of rs5742909 as a predictive genetic biomarker for cancer predisposition (10). However, the evidence for an association between rs5742909 and CRC susceptibility is conflicting, and some researchers did not find a significant association (11). Functional characterization of this variant in CRC has not yet been reported.
Molecular variants such as CTLA4 c.-319C > T (rs5742909) may impact gene expression variability through transcriptional regulation by affecting the binding sites for specific transcription factors. TCF7/LEF1, members of the high-mobility group (HMG) transcription factor family, play a key role in regulating T cell development and differentiation, as well as in the Wnt signaling pathway involved in various cellular processes underlying colorectal cancer (12-14). The genetic variant rs5742909 is located within the consensus Tcf/Lef motif of the CTLA4 promoter, suggesting a potential influence on the binding affinity of Tcf/Lef transcription factors and on the subsequent regulation of gene expression, with an impact on T cell immune surveillance in CRC (15).
Immunotherapeutic biomarkers play a crucial role in predicting treatment response and guiding the use of immunotherapies in CRC. Current immunotherapeutic biomarkers in CRC include high microsatellite instability (MSI-H)/mismatch repair deficient (dMMR) status (7); programmed death-ligand 1 (PD-L1) expression (16); tumor-infiltrating lymphocytes (TILs) (17); and immune gene signatures that provide information about the tumor's immune environment and potential response to immunotherapy (18). Despite this repertoire of biomarkers, it is of great interest to incorporate new options that can be easily evaluated in patient blood samples. In this context, this study assessed the association between the CTLA4 c.-319C > T variant and CRC susceptibility, proposing its potential use as a biomarker for therapeutic response to anti-CTLA4 monoclonal antibody immunotherapy.

Sampling and data collection
This study included 100 patients who attended the Hospital Universitario Mayor Méderi, Bogotá, Colombia. Patients with biopsy-confirmed CRC who accepted and signed the informed consent form were recruited for this study. The patients were enrolled between July 2017 and December 2021. The inclusion criteria comprised patients diagnosed by pathology with any type of colorectal cancer, including individuals with metastatic CRC tumors. Genomic data from healthy controls were obtained from the gnomAD public database v2.1.1, filtered for Latin American and non-cancer individuals. A total of 424 participants met these criteria, for a case-control ratio of 1:4.
The sample size was estimated with the formula $n = \dfrac{N z^{2} p(1 - p)}{\alpha^{2}(N - 1) + z^{2} p(1 - p)}$, accessible in the OpenEpi web tool. Because this study was the first to evaluate the allele frequency of rs5742909 in the Colombian population, a sample proportion (p) of 7.4% was used; this value corresponds to the minor allele frequency (MAF) of the polymorphism in the Latin American population reported in the public database gnomAD v2.1.1 (non-cancer). A confidence level of 99% (α = 0.001, z = 2.576) and a finite population size of N = 8,000,000 for Bogotá, the city where the study was conducted, were used for the estimation. The obtained value was n = 506, which was increased to 524 individuals (cases and controls) to allow for possible data losses. Sociodemographic and clinical variables of the patients were collected through structured interviews and clinical records by trained healthcare professionals. The variables assessed were sex, age, comorbidities (hypertension, diabetes mellitus, chronic obstructive pulmonary disease, cancer, inflammatory bowel disease, and others), family history of cancer, habits, CRC screening tests, height, weight, age at diagnosis, tumor location, histology, lymphovascular infiltration, pTNM classification, and stage. The study protocol and all procedures were approved by the Ethics Committee of the Universidad del Rosario (CEI -DVN021-000285) and the technical committee of the Hospital Universitario Mayor Méderi. This study adhered to the Declaration of Helsinki guidelines.
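The finite-population sample size formula given in this section can be checked numerically. The sketch below is only an illustration: the function name and the `precision` argument are hypothetical, and the value 0.03 used for the term written as α in the formula is an assumption chosen because it reproduces the reported n = 506; it is not a value stated in the text.

```python
import math


def finite_population_sample_size(N, p, z, precision):
    """n = N z^2 p(1-p) / (precision^2 (N-1) + z^2 p(1-p))."""
    variance_term = z ** 2 * p * (1 - p)
    return (N * variance_term) / (precision ** 2 * (N - 1) + variance_term)


# N, p and z are taken from the text; precision = 0.03 is an ASSUMPTION.
n_raw = finite_population_sample_size(N=8_000_000, p=0.074, z=2.576, precision=0.03)
n_required = math.ceil(n_raw)  # round up to whole individuals
print(n_required)
```

Under that assumed precision the calculation rounds up to 506 individuals; with other precision values the result changes substantially, so the parameter should be read from the OpenEpi settings actually used.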

DNA extraction and genotyping
Genomic DNA was obtained from blood samples of 100 patients using the Quick-DNA™ Miniprep Plus Kit (Zymo Research). Polymerase chain reaction (PCR) was used to amplify the CTLA4 promoter region from −500, considering the first ATG as +1, to the end of the first exon. The PCR products were purified and directly sequenced by Sanger sequencing. Primers were designed using Primer-BLAST. The reference sequence was obtained from the Ensembl database (ENST00000648405.2). Primers and PCR conditions are listed in Supplementary Table S1. Genotyping of rs5742909 was performed in batches of 25 samples by two independent researchers; it was attempted in 100 individuals and was successful in 98% (n = 98). Control genotypes were obtained from the genomes of Latin American non-cancer individuals in the gnomAD public database v2.1.1 (non-cancer). The genotype quality for controls ranged from 95 to 100 and the read depth was more than 20X for >95% of variant carriers.

Population genetic statistics and polymorphism association
Genotypic and allele frequencies and Hardy-Weinberg equilibrium (HWE) were determined for the case and control groups. Deviation from HWE was estimated using a chi-square (χ²) goodness-of-fit test with 1 degree of freedom (1 df) in the SNP-Stats software. For the case-control analysis, genotypes were compared under different genetic models, including codominant (C/C vs. C/T vs. T/T), dominant (C/C vs. C/T-T/T) and recessive (C/C-C/T vs. T/T). χ² tests with 2 degrees of freedom (2 df) for the codominant model and 1 df for the dominant and recessive models were used to identify statistically significant differences. The best model was selected based on the Akaike information criterion (AIC). Finally, odds ratios (OR) and 95% confidence intervals (CI) were determined using SNP-Stats. This study was conducted following the recommendations of the Strengthening the Reporting of Genetic Association Studies (STREGA) guidelines (19).

In silico transcription factor binding site analysis and luciferase reporter assay
Potential transcription factor binding sites (TFBS) in the CTLA4 c.-319C promoter region were assessed using the Genomatix bioinformatics tools MatInspector (v8.4.2) and MatBase (v11.3). Values of 0.75 core similarity and 0.70 matrix similarity were set as parameter cutoffs.
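As a rough illustration of the Hardy-Weinberg goodness-of-fit test named in the statistics paragraph above, the following sketch computes allele frequencies and a 1 df chi-square statistic from genotype counts. The function name is hypothetical, and the counts in the usage line are the pooled genotype counts reported in the Results (436 CC, 85 CT, 1 TT). The statistic need not reproduce the reported HWE p value, since the software may apply a different procedure (for example an exact test); that is an assumption, not something stated in the text.

```python
import math


def hwe_chi_square(n_cc, n_ct, n_tt):
    """Return (freq_C, freq_T, chi2, p_value) for a biallelic SNP."""
    n = n_cc + n_ct + n_tt
    freq_c = (2 * n_cc + n_ct) / (2 * n)  # each individual carries two alleles
    freq_t = 1 - freq_c
    # Genotype counts expected under Hardy-Weinberg proportions
    expected = (n * freq_c ** 2, 2 * n * freq_c * freq_t, n * freq_t ** 2)
    observed = (n_cc, n_ct, n_tt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return freq_c, freq_t, chi2, p_value


freq_c, freq_t, chi2, p_value = hwe_chi_square(436, 85, 1)
print(f"C = {freq_c:.4f}, T = {freq_t:.4f}, chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Only the standard library is used here; the closed-form `erfc` expression is valid for 1 degree of freedom only.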

Plasmid constructs
Using genomic DNA obtained from a patient heterozygous for the CTLA4 c.-319C > T variant, we amplified a region encompassing the CTLA4 promoter from −1 to −575 bp. The forward and reverse primers used in PCR contained KpnI and XhoI restriction sites located at the 5′ and 3′ ends, respectively. Primers were designed using Primer-BLAST and the sequences are listed in Supplementary Table S1. The PCR products were digested and ligated into the pGL4.22luc2CP/Puro plasmid (#E6771/Promega) following the manufacturer's instructions. The constructs were sequenced to verify the generation of plasmids carrying the CTLA4 c.-319C and c.-319 T alleles.

Cell culture and luciferase reporter gene assay
Three cell lines were used for the reporter gene assay, given that the regulatory effects of LEF1 and TCF7 vary with the cell type and the available co-factors. The cell lines used (HCT116, COS-7 and Jurkat) are all well known and have been used extensively in functional validation approaches. HCT116 is a human colorectal carcinoma cell line used in studies of colorectal cancer biology; Jurkat is a human T-cell leukemia cell line used extensively in immunology research; and COS-7 cells are commonly used for protein expression experiments that require high transfection efficiency. HCT116 and COS-7 cells are adherent, whereas Jurkat cells grow in suspension.

Electrophoretic mobility shift assay
Custom double-stranded oligonucleotides 5′ end-labeled with biotin and oligonucleotides containing the TCF7 consensus sequence for the human CTLA4 c.-319 C or T promoter were used in the EMSA procedure. The sequences used are listed in Supplementary Table S1. COS-7 cells were transfected with the pcDNA TCF7 vector. After 48 h, nuclear proteins were extracted using the NE-PER Nuclear and Cytoplasmic Extraction Reagents (Catalog number 78833, ThermoScientific). EMSA was performed using the LightShift Chemiluminescent EMSA Kit following the manufacturer's instructions. Binding reactions were performed using 10 fmol of double-stranded biotin-oligonucleotide, 3 μg of nuclear extract, 10 mM HEPES, 5 mM DTT, 50 mM NaCl, 10 mM KCl, 1.5 mM EDTA, 15%, 1 mg/mL BSA, 0.5 mM PMSF, and 2 μg poly (dI:dC). The reactions were incubated for 15 min on ice before adding biotin-labeled DNA and then for 1 h at room temperature. Electrophoresis of the DNA-protein complex was performed on 6% polyacrylamide and 3% glycerol gels at 80 V in 0.5× TBE buffer, and the complex was transferred to a nylon membrane for biotin-labeled DNA detection using the streptavidin-horseradish peroxidase conjugate and the chemiluminescent reagent contained in the EMSA Kit.

Clinical and demographic characteristics
This study analyzed a total of 98 cases (Table 1). This group was similarly distributed between males (51.0%, n = 50) and females (48.9%, n = 48). […] 13.2% (n = 13) stage I and 2% (n = 2) stage 0. Of these, 22.6% had metastasis (n = 22) and 57.7% (n = 56) presented lymphovascular infiltration. Twelve patients (13.0%) had a relapse, defined as local, regional, or distant metastatic recurrence after a disease-free period (20). Percentages were calculated based on the number of patients with available data for each variable.

Genetic statistics and association analysis
The CTLA4 c.-319C > T genotype and allele frequencies were determined for 98 cases and 424 controls. The global genotype frequencies were 83.5% (436/522) for CC, 16.3% (85/522) for CT and 0.2% (1/522) for TT, with allelic frequencies of 91.75% and 8.25% for the C and T alleles, respectively. This variant was found to be in HWE (p = 0.24). Genotypic and allelic frequencies according to case-control status are presented in Table 2. According to the minimum AIC value, the best genetic model was dominant (AIC = 503; p = 0.023). Association analysis for this model identified a statistically significant association between CTLA4 c.-319C > T and CRC development (p = 0.023; OR 1.89; 95% CI = 1.11-3.23). The results of the codominant, dominant, and recessive association analyses are presented in Supplementary Table S2.
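The dominant-model odds ratio and its 95% confidence interval can be illustrated with a simple 2×2 calculation. This is a minimal sketch, not the SNP-Stats procedure: the function name is hypothetical, the interval is a Wald (log-scale) interval, and the per-group counts below are an assumption inferred from the pooled totals (86 T-allele carriers, 436 CC) and the reported OR; they are not taken from Table 2.

```python
import math


def dominant_model_odds_ratio(case_carriers, case_cc, ctrl_carriers, ctrl_cc, z=1.96):
    """Odds ratio for T-allele carriers (C/T + T/T) vs. C/C, with a Wald CI."""
    odds_ratio = (case_carriers * ctrl_cc) / (case_cc * ctrl_carriers)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / case_carriers + 1 / case_cc + 1 / ctrl_carriers + 1 / ctrl_cc)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# ASSUMED split of the pooled counts: 24 carriers / 74 CC among the 98 cases,
# 62 carriers / 362 CC among the 424 controls.
odds_ratio, lower, upper = dominant_model_odds_ratio(24, 74, 62, 362)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With this assumed split the sketch gives an OR of about 1.89 with an interval of roughly 1.11 to 3.23, in line with the reported estimate; the actual per-group counts should be read from Table 2.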

In silico binding site prediction and promoter activity
To investigate the functional and regulatory role of the rs5742909 promoter variant, we first searched for potential TFBS using the Genomatix software. This in silico approach predicted that the LEF1 and TCF7 transcription factors bind to the CTLA4 c.-319 promoter region. The bioinformatics platform determined that LEF1 and TCF7 bind to the sequence 5′-agatccTCAAAGtgaac-3′ and that the variant CTLA4 c.-319C > T is located in the core consensus motif (Figure 1). These results suggest that the rs5742909 variant could disrupt the CTLA4 promoter binding site for LEF1/TCF7.
To test this hypothesis, we conducted two in vitro assays: a luciferase reporter assay and EMSA. The luciferase reporter gene assay indicated that the CTLA4 promoter is transactivated by the LEF1 and TCF7 transcription factors in COS-7, HCT116, and Jurkat cell lines (p < 0.001). Significant differences in luciferase activity between the c.-319 C and T alleles were observed for the three cell lines (Figure 2). Notably, these effects on transactivation varied according to the cell line and the transcription factor (Figure 2). For example, the T allele decreased RLU by 20% for TCF7 transactivation in HCT116 cells (22.8 ± 0.8 vs. 27.1 ± 1.2) when compared with the wild-type allele (p = 0.008). No significant differences were observed for LEF1 in this cell line (Figure 2A). In contrast, the alternative allele (T) increased RLU by 30% for LEF1 in COS-7 cells (6.76 ± 0.4 vs. 5.07 ± 0.4) when compared with the C allele (p = 0.015). Similarly, a significant enhancement of promoter activity was found for TCF7 in Jurkat cells (p = 0.01; Figure 2B). The luciferase activity of the CTLA4 promoter was significantly higher than that of the pGL4 empty vector (p < 0.001) for all cell lines (Figure 2C; Supplementary Table S3).

DNA-specific binding analyses
The EMSA demonstrated the interaction of the TCF7 transcription factor with biotin-labeled oligonucleotides of the CTLA4 promoter. The affinity and strength of binding revealed an increased band intensity for the T allele in the nuclear extracts of TCF7-transfected cells. A similar pattern of band enrichment was observed in the nuclear extracts of non-transfected cells (Figure 3). These results suggest that the binding was specific, given that competition with unlabeled DNA meaningfully reduced the band intensity.

Discussion
CRC is a multifactorial disease with both environmental and genetic factors contributing to its pathogenesis. As illustrated in Table 1, the majority of CRC patients (89.8%) were over 50 years old, which is consistent with the age cut-off defining late-onset colorectal cancer (21). Most cases of the disease are sporadic and approximately 25% of CRC cases have a positive family history (22). However, only 5.32% of our patients met these criteria, while 61.7% of them had a family history of other types of cancer. The World Cancer Research Fund and American Institute of Cancer Research have established obesity, low physical activity, diets high in red and processed meat and low in fiber, and alcohol intake as the main risk factors for CRC development (22). Importantly, the sociodemographic variables collected for the CRC cases show that 56.4% of patients were overweight or obese and 61.18% consumed red meat three or more days per week. Additionally, 86.73% of them had comorbidities, with special consideration given to type 2 diabetes and hypertension as well-studied risk factors (23,24). Because we used a population database (gnomAD) to obtain control genomic data for the case-control analysis, it was not possible to compare these clinical variables between groups.
Among the key features of CRC oncogenesis and progression, immune surveillance escape has gained increasing importance, particularly as a therapeutic target (25). At the cellular level, the immune system initially attempts to eliminate malignant cells via cytotoxic or natural killer (NK) lymphocytes. However, over time, tumors enter an equilibrium phase and display resistance. Finally, the tumor reaches an escape phase, in which neoplastic growth, proliferation, and dissemination of cancer cells overwhelm the immune system (26).
CTLA4 is a negative regulator of T-cell function and modulates the duration and strength of T cell-mediated immune responses through several intrinsic and extrinsic mechanisms, triggering anergy and immune tolerance (27,28). Several studies have reported significant associations between frequent CTLA4 polymorphisms and cancer susceptibility, including CRC (11, 29-35). We identified a positive association between CTLA4 c.-319C > T (rs5742909) and CRC susceptibility (dominant genetic model, p = 0.023). This positive association could be related to impairment of the T cell antitumor immune response.
The rs5742909 variant is located in the CTLA4 promoter region and may alter the DNA binding of transcription factors and impact gene expression via transcriptional regulation. Consistent with our findings, Gibson et al., 2007, used serial CTLA4 promoter deletions and luciferase reporter assays to identify an essential regulatory region located between −200 and −330 bp, which overlaps with the studied variant. A few specific T-cell transcription factors have been identified in the CTLA4 promoter (36). The interaction of these factors with the CTLA4 promoter influences the expression of CTLA-4, which is a critical checkpoint molecule in immune regulation. NFAT (Nuclear Factor of Activated T cells) binding sites in the CTLA4 promoter have been implicated in the induction of CTLA-4 expression upon T-cell activation. Recent studies have demonstrated that NFATC2 promotes the stemness of colorectal cancer stem cells via AJUBA-mediated YAP activation and constitutes a novel therapeutic target (37).
According to our Genomatix in silico analysis (Figure 1), the CTLA4 c.-319C > T variant is located in a core consensus motif (TCAAAG) for LEF1 and TCF7 transcription factors (15,38). CTLA4 transcriptional regulation mediated by LEF1/TCF7 is a molecular pathway conserved in multiple T-lineage cells (39). In agreement with these observations, our luciferase reporter assay results demonstrated that LEF1 and TCF7 transcription factors positively activated the CTLA4 c.-319C promoter in COS-7 and HCT116 cell lines (Figure 2).
It has been reported that LEF1/TCF7 has variable activity and regulation depending on the cell type and available co-factors. Therefore, we used three cell lines to confirm the differences between the two alleles. CTLA4 T allele has 30% higher promoter activity for […] Nevertheless, these findings highlight the importance of considering CTLA4 promoter molecular variants as contributing factors to the effect of TCF7 on CTLA4 expression. Our study identified TCF7 as an important transcription factor involved in abnormal transactivation of the CTLA4 T allele, as evidenced in the luciferase assay in non-activated Jurkat cells (Figure 2). The relevance of the TCF7 binding consensus region has been demonstrated in EMSA assays using specific inhibitors, such as RNA aptamers (44-46). Our results support these observations, showing a decreased binding between TCF7 and the CTLA4 T allele (Figure 3).
[…] expression of CTLA4 protein in T-cells affecting proliferation and activation, mitigating the anti-tumoral immune response, and promoting tumoral immune surveillance escape, thus conferring an increased risk of CRC. Some studies have evaluated the relation between rs5742909 and CTLA4 mRNA expression levels, reporting statistically significantly higher expression for the T allele (51,52). CTLA4 gene expression has a significant impact in the clinical setting, and the c.-319C > T promoter variant might be useful as a potential prognostic or therapeutic biomarker. Overexpression of this gene has been associated with poor prognosis in several tumor types (53). Currently, it is considered a key therapeutic target for melanoma, non-small cell lung cancer and metastatic CRC. Omura et al., 2020 showed that CTLA4 overexpression in CRC tissue was associated with worse overall survival (HR = 3.86, p = 0.001) (54). Consistently, Kamal et al., 2021 found significant CTLA4 upregulation in CRC patients compared to healthy volunteers and suggested that it may be used as an independent prognostic biomarker for survival (55). Our functional validation assays suggest that the CTLA4 c.-319C > T variant modifies the transcriptional regulation of this gene.
In a therapeutic context, the CTLA4 protein has been recognized as the target of immunotherapy drugs such as ipilimumab, a monoclonal antibody approved for advanced CRC treatment (34). To date, only a few biomarkers have been applied in clinical practice to guide therapeutic decisions. Tumor mutational burden (TMB), microsatellite instability (MSI), T cell-inflamed microenvironment, and TGFβ expression profile are candidate biomarkers for CRC, but their analyses are expensive, slow, and not easily available (56,57). Recent evidence has indicated that methylation levels of the CTLA4 promoter predict therapeutic response in patients affected by melanoma and clear cell renal cell carcinoma (58,59). Collectively, our findings suggest that this molecular variant could serve as a novel biomarker for prognosis and therapeutic response.

Study limitations
The controls used for the association analysis were obtained from a public genomic database, and therefore, we did not have access to their clinical data. It was not possible to blind the researchers in the genotyping process because the case-control status was known from the beginning. We did not have access to clinical data of patients who declined to participate in our study, which may have introduced selection bias. Additionally, our findings were not replicated in an independent sample, which could have helped to reduce possible false positive associations. The potential impact of co-factors such as β-catenin or Groucho was addressed only indirectly, by using three different cell lines, and was not directly assessed.

Conclusion
To our knowledge, this is the first report describing the functional impact of the CTLA4 c.-319 T allele on TCF7 promoter transactivation in the context of CRC. Our findings support the view that genotype-based predictors are a promising field in personalized medicine. However, more experimental and clinical studies are necessary to validate the performance of this variant, including in CRC patients treated with anti-CTLA4 immunotherapy.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding authors.

Ethics statement
The study was reviewed and approved by the Ethics Committee of the Universidad del Rosario (CEI -DVN021-000285). The patients/participants provided their written informed consent to participate in this study.