Investigation of the association between the genetic polymorphisms of the co-stimulatory system and systemic lupus erythematosus

Human leukocyte antigen genes have been shown to have the strongest association with autoimmune disease (AD). However, non-HLA genes would be risk factors of AD. Many genes encoding proteins that are related to T- and B-cell function have been identified as susceptibility genes of systemic lupus erythematosus (SLE). In this study, we explored the correlation between SLE and the genetic polymorphisms of co-stimulatory/co-inhibitory molecules, including CTLA4, CD28, ICOS, PDCD1, and TNFSF4. We found that there were nine single-nucleotide polymorphisms (SNPs) associated with SLE, namely, rs11571315 (TT vs. CT vs. CC: p < 0.001; TT vs. CT: p = 0.001; p = 0.005; TT vs. CT +CC: p < 0.001; TT+CT vs. CC: p = 0.032), rs733618 (CC vs. CT vs. TT: p = 0.002; CC vs. CT: p = 0.001; CC vs. TT: p = 0.018; CC vs. CT + TT: p = 0.001), rs4553808 (AA vs. AG: p < 0.001), rs62182595 (GG vs. AG vs. AA: p < 0.001; GG vs. AG: p < 0.001; GG vs. AG+AA: p < 0.001), rs16840252 (CC vs. CT vs. TT: p < 0.001; CC vs. CT: p < 0.001; CC vs. CT + TT: p < 0.001), rs5742909 (CC vs. CT: p = 0.027; CC vs. CT + TT: p = 0.044), rs11571319 (GG vs. AG vs. AA: p < 0.001, GG vs. AG: p < 0.001; GG vs. AG+AA: p < 0.001), rs36084323 (CC vs. CT vs. TT: p = 0.013, CC vs. TT: p = 0.004; CC vs. CT + TT: p = 0.015; CC +CT vs. TT: p = 0.015), and rs1234314 (CC vs. CG vs. GG: p = 0.005; CC vs. GG: p=0.004; CC+ CG vs. GG: p=0.001), but not in CD28 and ICOS by using the chi-square test. Additionally, rs62182595 and rs16840252 of CTLA and rs1234314 and rs45454293 of TNFSF4 were also associated with SLE in haplotypes. These SLE-related SNPs also had an association with several diseases. It was indicated that these SNPs may play an important role in immune regulation and pathogenic mechanisms.


Introduction
Autoimmune diseases (ADs) are a heterogeneous group of diseases that involve the connective tissues, skin, subcutaneous tissues, muscles, joints, and various internal organs. In a Taiwanese population, the incidence of systemic lupus erythematosus (SLE) was 7.2 per 100,000 person-years (1), particularly affecting the women of childbearing age. Because SLE is a complex and multifactorial disorder, its course is difficult to accurately judge. If AD cannot be treated at the first time, it will lead to poor prognosis. Moreover, data showed that the heritability of SLE was up to 43.9% (2). Thus, we tried to investigate the correlation between genetic polymorphism and SLE.
SLE is a multisystem autoimmune disease characterized by the production of autoantibodies. The inappropriate T-celldependent expansion of autoreactive B cells is considered to play a role in the production of pathogenic autoantibodies against autoantigens (3). The pathogenesis of SLE is still unclear yet. An increasing number of studies showed that the overactivation of autoreactive T cells is the main factor in AD (4,5). In general, the autoreactive T cells that can recognize the selfantigen will go to apoptosis through positive and negative selection in the thymus (6). However, even if such a rigorous selection mechanism is complied, a small number of T cells escape into the peripheral blood after the selection, in which these escaped T cells will recognize self-antigens and attack to own cells and tissues. In the peripheral blood system, autoreactive T cells are regulated by regulatory T cells (Treg). When these autoreactive T cells cannot be controlled by Treg, also known as loss of immune tolerance, it will lead to become one of the main causes of autoimmune diseases.
To date, it has been found that human leukocyte antigen (HLA) genes generally have the strongest association with AD (7). However, other genes located outside of the HLA region may be a risk factor of AD. Several genes involved in the etiology of SLE have been widely studied, and the single-nucleotide polymorphisms (SNPs) of these genes encoding proteins related to T-and B-cell functions have been considered as the susceptibility loci of SLE. Studies showed that T-cell activation was strictly regulated by signals from co-stimulatory and coinhibitory molecules (8). Among them, the most popular hot genes are cytotoxic T-lymphocyte-associated protein 4 (CTLA4) and CD28. CD28 is continuously expressed on naïve and mature T cells. After binding with CD80/CD86 on antigen-presenting cells (APCs), it provides a stimulatory signal to promote activation of T cells, giving them the ability to attack (9).
CTLA4 is a co-inhibitory molecule, which is induced to express on the surface of T cells after CD28 interacts with CD80/ CD86 on APCs and stimulates T-cell activation. By competing with CD28 for CD80/CD86, it plays a role in inhibition of T-cell activation. Thus, the balance between CTLA4 and CD28 is an important key for T-cell activation or inhibition (10). Both programmed cell death protein 1 (PDCD1; PD1) and CTLA4 are immune checkpoints. When the PD1 protein binds to its ligand on APCs, it will induce the activation of an immunereceptor tyrosine-based inhibitory motif (ITIM) in the cytoplasmic tail of PD1, so as to inhibit the activation of T cells (11). Inducible co-stimulator (ICOS), like CD28, acts as a co-stimulatory receptor for T cells. It is essential for the activation of T cells and further promotes the humoral immune response (12). In addition, it was found that ICOS was highly expressed on T cells of SLE patients (13). When tumor necrosis factor superfamily member 4 (TNFSF4; OX40L) binds to its receptor (OX40), it can also provide a signal to promote T-cell activation, and studies have shown that OX40L can stimulate T-cell response and promote the pathogenesis of SLE (14). Therefore, these five genes were considered in the present study. Moreover, it is worth noting that the susceptible SNP of the disease has ethnic variations. Therefore, the aim of this study was to find out the SLE susceptibility loci of a Taiwanese population and apply it to the clinic to assist physicians in diagnosis.

Methods and material Study subjects
The Institutional Review Board of Chang Gung Memorial Hospital has reviewed and approved the study. The approval ID was 202002097B0. All study subjects signed informed consent and performed in accordance with relevant guidelines and regulation. There were 75 SLE patients who participated in the present study, in which the average onset age was 32.99 ± 1.49, and there were 66 women (88%) and nine men (12%). The inclusion criteria for SLE were based on the diagnostic criteria established by the American College of Rheumatology: 1) malar rash, 2) discoid lupus erythematosus with desquamation, 3) UV sensitivity on skin, 4) oral ulcer, 5) non-erosive arthritis, 6) serositis, pleurisy, or pericarditis, 7) kidney lesions, 8) nervous system lesions, 9) hematological lesions, 10) immunological lesions, 11) positive antinuclear antibody (ANA) test. If four of the 11 items are met, they can be diagnosed as SLE. In the definition of laboratory diagnosis, it is mainly to detect autoantibodies. ANA positivity exists in more than 95% of SLE patients. However, ANA will increase with age, and normal people may also have positive results, so the clinically meaningful results of ANA need to be ≥1:160x. Furthermore, 75 volunteers without immune abnormalities were collected for a case-control study, in which the average age was 31.83 ± 0.93 and there were 60 women (80%) and 15 men (20%).

DNA extraction and sequencing
Three milliliters of peripheral blood of autoimmune disease patients and healthy controls was collected in EDTA-coated vacuum tubes, and QIAamp DNA Blood Mini Kit (Qiagen, Valencia, California, USA) was used to extract the genomic DNA. After PCR reaction, the concentration and purity of extracted DNA were measured using a UV spectrometer. The PCR mixture contained 50 ng of DNA, 7.5 µl of HotStarTaq DNA Polymerase (Qiagen GmbH, Hilden, Germany) or 2X Tag polymerase, each 1 µl of forward and reverse primers (10 mM), and 14.5 µl of ddH 2 O. The primer pairs of each gene region and the PCR programs are shown in Table 1. Then, the BigDye Terminator Cycle Sequencing Kit (Thermo Fisher, Waltham, Massachusetts, USA) and the ABI PRISM Genetic Analyzer (Thermo Fisher, Waltham, Massachusetts, USA) were used for direct sequencing according to the manufacturer's instructions.

Selection of candidate SNPs
Because the abnormal expression level of co-stimulatory/coinhibitory molecules may contribute to autoimmune diseases (15), and the SNP variation located in the promoter region may alter the expression level of mRNA expression, we focused on the promoter region to look for the SNPs related to SLE. In addition, we also searched the SNPs related to autoimmune diseases that were published in previous studies through NCBI, including rs4404254 and rs4675379 located in the three prime untranslated regions (3′ UTR) of ICOS gene (16), rs2227982 located in exon 5 and rs10204525 located in the 3′UTR of PDCD1 gene (17), rs3087243 located in the 3′UTR of CTLA4 gene (18), and rs3116496 located in the intron 3 of CD28 gene (19). These SNPs were selected as candidate SNPs, and the region from the 500 bp upstream to 500 bp downstream of these candidate SNPs was amplified, so as to find the SNPs related to SLE in the Taiwanese population.

Statistical analysis
All the allele frequencies of each SNP in the control group were in accordance with the Hardy-Weinberg equilibrium (HWE) ( Table 2). The allele frequency and genotype frequency between patients and healthy controls were determined by chi-squared test and Fisher's exact test and were given an odds ratio with a 95% confidence interval. These statistical data were calculated by SPSS 17.0. D′ was used to estimate the linkage disequilibrium (LD) by comparing the observed and expected frequencies of one haplotype involved in alleles at different loci. The block was defined as it scarcely had evidence for historical recombination in this region (20). The figures of LD were produced by Haploview 4.2 (https://www.broadinstitute.org/haploview/haploview).

Hardy-Weinberg equilibrium test
The genotype frequencies based on controls were checked for Hardy-Weinberg equilibrium (HWE). It was found that the rs11571305 and rs10932035 of ICOS violated HWE, which indicated that these SNPs were not sufficient to represent the genotype distribution of the population, so the subsequent analysis results of these SNPs were less credible (Table 2). Thus, these SNPs were not analyzed and discussed.

The analysis of allele frequencies
The allele frequencies of the six SNPs in the CTLA4 gene were significantly different between cases and controls, including those of rs11571315 (p < 0.001), rs733618 (p = 0.007), rs4553808 (p < 0.001), rs62182595 (p < 0.001), rs16840252 (p < 0.001), and rs11571319 (p = 0.001). For the subjects with a minor allele at these six loci, they had a lower risk (0.045-0.532 times) of SLE. Moreover, in the PDCD1 gene and TNFSF4 gene, each SNP had significance between the SLE group and control group, including rs36084323 (p = 0.003) and rs1234314 (p = 0.008). For the subjects with a minor allele at these two loci, they had a higher risk (1.881 and 2.096 times, respectively) of SLE ( Table 2).

The analysis of genotype frequencies
There were seven SNPs in the CTLA4 gene, one SNP in the PDCD1 gene, and one SNP in the TNFSF4 gene that had statistical significance ( Table 3). The genotype frequency (CC vs. CT vs. TT) of rs11571315 was significantly different between cases and controls (p < 0.001). Compared to TT, the subjects with a CT genotype at rs11571315 would have a lower risk of SLE (OR = 0.301, 95% CI = 0.142-0.641, p = 0.001), and those with CC would have a 0.17 times risk of SLE (95% CI = 0.044-0.654, p = 0.005). Additionally, the genotype frequencies were analyzed through the dominant model (AA vs. Aa +aa) and recessive model (AA +Aa vs. aa), where "A" was referred to as the allele with higher frequency in the population, also known as major allele. Based on the dominant model, the subjects who had at least one C allele (CT+CC) at rs11571315 would have a 0.267 times risk of SLE (95% CI = 0.132-0.539, p < 0.001), which also had significance based on the recessive model.  The position was obtained from Genome Assembly GRCh38.p13. rs: reference SNP; HWE: Hardy-Weinberg equilibrium; 95% CI: 95% confidence interval; P a values of allele frequency were counted from Chi-square test or Fisher's exact test. In the column of "Allele", the bold was referred to minor allele, and the minor allele was referred to the allele with lower frequency in the population containing cases and controls. "*" was expressed as p<0.05.

Discussion
According to the results of SNP analysis, we found that there were seven SNPs of the CTLA4 gene, one SNP of the PDCD1 gene, and one SNP of the TNFSF4 gene associated with SLE.
Additionally, these SLE-related SNPs also had an association with other autoimmune disorders and cancers. The information is summarized in Supplementary Table 1. It is increasingly being appreciated that multiple autoimmune diseases share common susceptibility genes. It was indicated that these SNPs may be a key factor driving immune abnormalities in the immune regulation and pathogenic mechanism of disorders.
For CTLA4, rs11571315 was associated with transfusion reaction (21), polycystic ovary syndrome (22), etc. Yao et al. showed that rs11571315 was significant with the expression level of CTLA4 (23). rs733618 was associated with Graves' disease (24,25), non-small cell lung cancer (26), etc. A meta-analysis showed that the allele and genotype frequencies of rs733618 were associated with SLE. Moreover, compared to the CC genotype, people with CT+TT had a lower risk of SLE in an Asian population (27). This was similar to our results. rs4553808 was associated with several autoimmune diseases (28-30), cancers (31,32), and prognosis post transplantation (33, 34). Wang et al. showed that the G-allele frequency of rs4553808 was higher in breast cancer patients than in controls (35). On the contrary, our result showed that the G-allele frequency was higher in SLE patients than in controls. This result suggested that rs4553808 had opposite effects on different diseases. rs62182595 was also associated with polycystic ovary syndrome (22). rs16840252 was associated with hepatocellular carcinoma (36), antineutrophil cytoplasmic antibody-associated vasculitis (37), rheumatoid arthritis (38), etc., in which rs16840252 had the most significance in haplotype. rs5742909 was associated with cervical cancer (39), schizophrenia (40), chronic liver diseases (41), Hashimoto's thyroiditis (42), etc. Shojaa et al. showed that rs5742909 was associated with SLE pathogenesis in an Iranian population, where the CC genotype was associated with SLE, while the CT genotype and T allele were more frequent in controls than in SLE cases (43). Our result was similar to theirs. The rs11571319 located in the 3′UTR was associated with rheumatoid arthritis (44), primary biliary cirrhosis (45), asthma (46), etc. the SNPs in the 3′UTR may interfere with mRNA stability and then alter the translation level of proteins (47). For PDCD1 and TNFSF4, rs36084323 was associated with cancer risk (48). Ishizaki et al. showed that the rs36084323 G The linkage disequilibrium (LD) analysis of the CD28 gene. There was one haplotype block shown in the CD28 gene, including rs1879877 and rs3181096. The linkage disequilibrium (LD) analysis of the CTLA4 gene. There was one haplotype block shown in the CTLA4 gene, including rs62182595 and rs16840252. The linkage disequilibrium (LD) analysis of the ICOS gene. There were two haplotype blocks shown in the ICOS gene. One block included rs11889352 and rs11883722, and the other contained rs10932036, rs4404254, rs10932037, and rs10932038. The linkage disequilibrium (LD) analysis of the PDCD1 gene. There was one haplotype block shown in the PDCD1 gene, including rs10204525, rs2227981, rs2227982, and rs6705653. allele had a significantly higher promoter activity than the A allele (49). rs1234314 was associated with allergic rhinitis (50), coronary artery disease (51), etc. It was shown that the CC genotype of rs1234314 provided a protective effect against allergic rhinitis (50), which was the same as our result. These SNPs were all located in the non-coding region. Thus, they may affect the immunity by altering the expression of the gene. Additionally, the results showed that most of the significant SNPs were in the CTLA4 gene. This may be because CTLA4 is mainly expressed on Tregs. Moreover, the autoreactive T-cell response was strictly controlled by Tregs in the periphery. The abnormality of any one of the co-stimulatory/co-inhibitory signals would lead to these autoreactive T cells' uncontrolled expansion, causing the development of AD (52). Moreover, haplotype analysis showed that A rs62182595 T rs16840252 , A r s 6 2 1 8 2 5 9 5 C r s 1 6 8 4 0 2 5 2 , G r s 6 2 1 8 2 5 9 5 T r s 1 6 8 4 0 2 5 2 , and G rs1234314 G rs45454293 haplotypes may decrease the risk of SLE. This suggested the possibility of an interaction between the two SNPs in one haplotype in SLE susceptibility, in view of the high LD between polymorphisms.
Abnormal expression and function of co-stimulatory/coinhibitory molecules have been described to be associated with aberrant T-cell activation in SLE patients, which results in reduction in the T-cell activation threshold and loss of peripheral immune tolerance (53). These data further advance our understanding of the complex immunopathogenesis of SLE and provide additional support for the emerging concept of shared genes in multiple autoimmune diseases.
In summary, our results showed that several SNPs of immune regulatory genes were significant with SLE, which were almost located in non-coding regions. It should be further verified whether SNP variations affect gene expression or protein function. Moreover, epigenetic markers can also affect the expression level of a gene, leading to abnormal CD4+ T-cell function, which in turn increases the risk of autoimmune disease. Consequently, epigenetic markers should be considered together in the future.

Limitation
The co-stimulatory system is involved in the regulation of T-cell activation. The expression level of co-stimulatory/coinhibitory molecules will affect the degree of T-cell activation. Therefore, we focused on the promoter region of genes. Due to limited funds, we searched the literature related to autoimmune diseases and selected the hot SNPs related to autoimmune diseases as candidate SNPs for discussion. We explored the 500-bp flanking region of the candidate SNPs, especially on the promoter region, rather than whole-genome region. In addition, because of insufficient genomic DNA and failure of PCR reaction, not every sample had complete SNP data available. The linkage disequilibrium (LD) analysis of the TNFSF4 gene. There was one haplotype block shown in the TNFSF4 gene, including rs45454293 and rs1234314.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Ethics statement
The Institutional Review Board of Chang Gung Memorial Hospital has reviewed and approved the study. The approval ID was 202002097B0. The patients/participants provided their written informed consent to participate in this study.

Funding
This study was supported by grants to D-PC from the Ministry of Science and Technology (110-2320-B-182A-007).
The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.