Compelling Evidence Linking CD40 Gene With Graves’ Disease in the Chinese Han Population

Mutations in CD40 have been widely reported to be risk factors for Graves’ disease (GD). The gene, along with its cognate ligand CD40L, may regulate pro-inflammatory and immune responses. Rs1883832, located at the -1 position of the Kozak sequence, is the most well-studied single nucleotide polymorphism (SNP) of CD40 and has been confirmed to predispose carriers to GD, regardless of ethnicity. Our genome-wide association study (GWAS) indicated that several SNPs in the vicinity of CD40, including rs1883832, were associated with GD in the Han Chinese population. To identify the most consequential SNP and its underlying pathogenic mechanism, we performed a two-stage refined study on 8,171 patients with GD and 7,906 controls, and found that rs1883832 was the most significantly GD-associated SNP in the CD40 gene region (P_Combined = 9.17 × 10⁻¹¹, OR = 1.18). By searching the cis-expression quantitative trait locus database and using quantitative RT-PCR, we further found that the rs1883832 genotype can influence CD40 gene transcription. Furthermore, we demonstrated that rs1883832 is a susceptibility locus for pTRAb+ GD patients. In conclusion, the current study provides robust evidence that rs1883832 can regulate CD40 gene expression and affect serum TRAb levels, which ultimately contributes to the development of GD.

The CD40 gene is a member of the tumor necrosis factor (TNF) receptor superfamily. The encoded protein is mainly expressed on B cells and other antigen-presenting cells (APCs) and is essential for mediating immune and inflammatory responses, including T cell-dependent immunoglobulin class switching, B cell proliferation and activation, and memory B cell development (2,18). It has been reported that CD40 is also expressed on some nonimmune cells, such as pancreatic beta cells, endothelial cells, and thyroid epithelial cells (19)(20)(21)(22). CD40 interacts with its ligand, CD40L (CD154), and plays a prominent role in many autoimmune diseases, such as rheumatoid arthritis (RA), multiple sclerosis (MS), and systemic lupus erythematosus (SLE) (23)(24)(25). Most investigators agree that GD is an organ-specific autoimmune disorder mediated by B and T cells, arising from a complex interplay of factors that lead to the loss of immune tolerance to thyroid antigens and to the initiation of a sustained autoimmune reaction (2,26). Given that CD40 plays an important role in humoral and cellular immunity, it is plausible that CD40 contributes to GD development.
Linkage analyses have confirmed the genetic association of CD40 with GD. Rs1883832, the most well-studied SNP, located at position -1 in the Kozak sequence of the CD40 gene, was reported to be associated with GD with a relative risk ranging from 1.22 to 1.93 in different ethnicities (12,27-29). Moreover, investigators have verified that the C allele of rs1883832 could promote the translation of CD40 and lead to GD (30). In the Chinese Han population, Wang et al. genotyped rs1883832 in 196 GD cases and 122 controls and concluded that there is an association between rs1883832 and GD susceptibility (OR = 1.57) (31). However, most studies have used candidate gene strategies or have performed replication studies to verify the relationship between CD40 and GD. The generalizability of the Wang et al. findings is constrained by the relatively small sample size and by recruitment from a single center. Therefore, their results need to be confirmed in a large-scale, multi-center genetic study, potentially using a different strategy.
Genetic underpinnings are of fundamental importance for determining individual differences in immune and inflammatory responses. Fairfax et al. (32) assessed the correlation between SNP genotypes and gene transcription levels in monocytes under lipopolysaccharide (LPS) or interferon-γ (IFN-γ) stimulation compared with those in the naïve state. They established a database containing SNPs and the correlated genes within a 1 Mb window, the cis-expression quantitative trait locus (cis-eQTL) database (32). Molecular, cellular, and environmental factors all contribute to the pathogenesis of autoimmune diseases. This database makes it possible to explore functional genetic variants and their modulated genes.
Our group conducted a genome-wide association study (GWAS) and imputation analysis in 1,536 GD patients and 1,516 controls, confirmed that HLA, CTLA4, FCRL3, and TSHR were susceptibility genes for GD, and identified two new GD risk loci in RNASET2 and GDCG4P14 in the Chinese Han population (8). Rs1883832 showed a weak association with GD in our GWAS data. To identify the causal loci in the CD40 gene region and the underlying pathogenic mechanism, we performed a two-stage genetic analysis in a large cohort (n = 16,077) and provided compelling evidence for the association of rs1883832 with GD. We also confirmed the relationship between rs1883832 genotypes and CD40 transcriptional levels by searching the aforementioned cis-eQTL database and the Genotype-Tissue Expression (GTEx) database, and by qRT-PCR.

Subjects
All individuals were of Chinese Han ethnicity and were recruited by The China Consortium for the Genetics of Autoimmune Thyroid Disease. This study was performed in accordance with the principles of the Declaration of Helsinki. Approval was granted by the local ethics committees of all partner hospitals, and all participants provided written informed consent prior to participation. In the initial GWAS stage, 1,536 GD cases and 1,516 sex-matched controls were included, and 1,442 GD cases and 1,468 controls remained after quality control (8). In the replication stage, we recruited a further 6,729 cases and 6,438 controls. The inclusion and exclusion criteria were as previously described (8,15). The plasma level of thyroid stimulating hormone receptor autoantibody (TRAb) in GD patients treated with antithyroid drugs (ATD) for ≥ 1 year was measured with an enzyme-linked immunosorbent assay (ELISA) kit, and we defined TRAb ≥ 1.5 U/L as persistent TRAb-positive (pTRAb+) and TRAb < 1.5 U/L as non-persistent TRAb-positive (pTRAb−) (8,16). Sample characteristics are presented in Table 1.

The cis-eQTL database established by Fairfax et al. was searched for SNPs correlated with CD40 expression (32). The eQTL analysis was also conducted using the Genotype-Tissue Expression (GTEx) portal (https://gtexportal.org/home/). The GTEx project has generated rich transcriptome data in a variety of human tissue types, thus providing insights into the regulatory role of genetic variation (33). By searching these two databases, we identified the expression SNPs (eSNPs) for CD40 and analyzed whether these eSNPs were associated with Graves' disease based on the GWAS data.

Statistical Analysis
The association analyses for the GWAS and replication stages were performed using the PLINK Cochran-Armitage trend test and logistic regression (34). For the combined stage and the pTRAb+/− subset analyses, we used the Cochran-Mantel-Haenszel stratification test (34). The forward and two-locus logistic regression analyses were conducted using the R software environment and PLINK. The regional plots were generated with LocusZoom (http://locuszoom.sph.umich.edu/). The CD40 expression data were analyzed using ANOVA and the unpaired Student's t-test.
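As an illustration of the trend test named above, the following is a minimal sketch of a 1-df Cochran-Armitage test under an additive model. It is not the PLINK implementation; the function name `cochran_armitage_trend`, the additive scores (0, 1, 2), and the genotype counts are assumptions chosen for illustration and are not data from this study.

```python
import math


def cochran_armitage_trend(case_counts, control_counts, scores=(0, 1, 2)):
    """Return (chi-square, p-value) for the 1-df Cochran-Armitage trend test.

    case_counts / control_counts: genotype counts ordered by the number of
    risk alleles carried (0, 1, 2 copies), i.e. an additive model.
    """
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    n = sum(totals)
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)

    score_sum = sum(t * c for t, c in zip(scores, totals))
    # Centered cross-product between genotype score and case status.
    s_xy = sum(t * r for t, r in zip(scores, case_counts)) - score_sum * n_cases / n
    # Centered sum of squares of the genotype score.
    s_xx = sum(t * t * c for t, c in zip(scores, totals)) - score_sum ** 2 / n

    chi2 = n * n * s_xy ** 2 / (n_cases * n_controls * s_xx)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (TT, CT, CC); not taken from the study.
chi2, p = cochran_armitage_trend((10, 20, 30), (30, 20, 10))
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

The statistic is written here in its correlation form (N times the squared correlation between genotype score and case status), which is algebraically the same as the usual trend-test formula.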

Refining Association Study in the 20q13.12 Region
Evidence for the strong linkage of CD40 to GD had been reported through whole genome linkage scanning, and rs1883832 appeared to be the causal SNP of GD in the vicinity of CD40 (12,27-29,31).
To analyze the relationship between CD40 and GD in depth, we performed a genome-wide association study (GWAS) and imputation analysis of 1,442 GD cases and 1,468 controls after quality control (Table 1). Because SNPs surrounding the CD40 gene exhibited significant association with GD, we selected a 400 kb region on 20q13.12, which includes CD40, and conducted a refining study aimed at providing more compelling evidence for GD-associated SNPs. After filtering out SNPs with minor allele frequency (MAF) < 0.01, 761 SNPs remained, of which 49 were associated with GD (P_GWAS < 0.05) (Figure 1A and Table S1). By analyzing the linkage disequilibrium (LD) structure of the 49 GD-associated SNPs, we selected eight tag SNPs to further narrow the linked chromosomal regions (Figure 1B and Table S2). Of these, forward logistic regression analysis showed that rs79200351, rs6074069, and rs1883832 yielded evidence of association (P_forward = 0.004, 0.004, and 0.008, respectively). These three SNPs are located in three different LD blocks (Figure 1B), and of them, rs1883832 displayed the strongest association with GD (P = 9.77 × 10⁻³). Two-locus logistic regression indicated that, unlike the other two candidates, rs1883832 improved the regression models of the other seven tag SNPs while acting as an independent GD-associated SNP (Figure 2). Therefore, only rs1883832 was tested for replication in an additional 6,729 GD cases and 6,438 controls (Table 2), and it showed a consistent association with GD in both stages (P_Replication = 4.95 × 10⁻¹⁰, OR = 1.18; P_Combined = 9.17 × 10⁻¹¹, OR = 1.18). Taken together, we concluded that rs1883832 is the most strongly GD-associated SNP in the 20q13.12 region.
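The combined-stage result pools the GWAS and replication stages with the Cochran-Mantel-Haenszel test. A minimal sketch of that calculation on allele counts is given below, treating each stage as one stratum. The function name `cmh_test` and the example counts are hypothetical, no continuity correction is applied (an assumption; the software used in the study may differ), and the numbers are not the study's data.

```python
import math


def cmh_test(tables):
    """Cochran-Mantel-Haenszel test over a list of 2x2 allele-count tables.

    Each table is (a, b, c, d): risk and other allele counts in cases,
    then risk and other allele counts in controls, for one stratum.
    Returns (pooled odds ratio, chi-square, p-value); no continuity
    correction is applied.
    """
    or_numerator = or_denominator = 0.0
    observed_minus_expected = variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        or_numerator += a * d / n
        or_denominator += b * c / n
        # Expected count and hypergeometric variance of cell a under the null.
        observed_minus_expected += a - (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    odds_ratio = or_numerator / or_denominator
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value


# Two hypothetical strata (e.g. discovery and replication); not study data.
pooled_or, chi2, p = cmh_test([(30, 10, 20, 20), (30, 10, 20, 20)])
print(f"pooled OR = {pooled_or:.2f}, chi2 = {chi2:.2f}, p = {p:.4f}")
```

Stratifying by stage keeps a difference in allele frequency between the two sample sets from being mistaken for a case-control difference.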
The Effects of rs1883832 on CD40 Expression
Jacobson et al. reported that the C allele of rs1883832 predisposes carriers to GD by increasing the translation efficiency of CD40 mRNA, rather than by acting at the transcriptional level (30). However, their study was based on a small sample, from which a comprehensive conclusion cannot be drawn. We therefore considered that the relationship between rs1883832 and CD40 mRNA expression level still needed to be determined. First, by searching cis-eQTL data from the GTEx database, we found that rs1883832 is correlated with CD40 expression in a cluster of tissues, including the thyroid and whole blood (Figure 3A). The risk allele C of rs1883832 was associated with significantly higher CD40 mRNA levels in both 670 whole blood and 574 thyroid samples from the GTEx database, as assessed by cis-eQTL analysis (P = 1.6 × 10⁻¹³ and 2.4 × 10⁻¹⁴; normalized effect size (NES) = 0.220 and 0.347, respectively) (Figures 3A, B). Furthermore, we searched the cis-eQTL database built by Fairfax et al. (32) and found that 77 SNPs were correlated with CD40 expression. After quality control, 40 of these SNPs were included in our own GWAS data, and eight of them were associated with GD (P < 0.05, Table 3). SNPs that influence gene expression are called eSNPs. Interestingly, rs1569723, rs1883832, rs4810485, and rs6074022 exhibited stronger association with CD40 expression and GD than rs4810486, rs4813003, rs1883835, and rs2143699. Compared with the naïve state, rs1569723, rs1883832, rs4810485, and rs6074022 showed weaker correlation with CD40 expression when monocytes were stimulated by IFN-γ or LPS. After analyzing the LD structure of the eight eSNPs, we found that rs1883832 could capture rs1569723, rs4810485, and rs6074022 with r² > 0.8, and rs4813001 (one of the eight tag SNPs mentioned earlier) could capture rs4810486, rs4813003, rs1883835, and rs2143699 with r² > 0.8.
However, rs4813001 was filtered out by forward logistic regression in the GWAS stage, and rs1883832 could significantly improve the regression model with rs4813001 (Figure 2). Finally, we evaluated the CD40 mRNA level and its correlation with rs1883832 genotypes in different subtypes of peripheral blood mononuclear cells (PBMCs), including CD4+ T cells, CD8+ T cells, CD14+ monocytes, and CD19+ B-cells. The results showed that all PBMC subtypes expressed CD40, with CD19+ B-cells exhibiting the highest expression level (Figure 3C), and that the CC/CT genotypes of rs1883832 had higher CD40 expression than the TT genotype in CD19+ B-cells (P < 0.05; Figure 3D). Thus, we speculated that rs1883832 and its linked SNPs probably contribute to GD susceptibility by regulating CD40 expression.
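The statement that one SNP "captures" another with r² > 0.8 refers to the squared allelic correlation between two loci. A minimal sketch of that calculation is shown below; it assumes phased haplotype counts are available and that both SNPs are polymorphic. The function names `ld_r_squared` and `is_tagged` and the counts are hypothetical illustrations, not the study's LD pipeline or data.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Squared correlation (r^2) between two biallelic SNPs.

    Arguments are phased haplotype counts: A/a are the alleles of the
    first SNP and B/b the alleles of the second. Both SNPs must be
    polymorphic, otherwise the denominator is zero.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    # D is the deviation of the haplotype frequency from independence.
    d = p_AB - p_A * p_B
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))


def is_tagged(r2, threshold=0.8):
    """True when a SNP is captured by the tag SNP at the given r^2 cut-off."""
    return r2 > threshold


# Hypothetical haplotype counts; not taken from the study.
r2 = ld_r_squared(55, 5, 3, 37)
print(f"r2 = {r2:.3f}, tagged = {is_tagged(r2)}")
```

When only unphased genotypes are available, haplotype frequencies have to be estimated first, which this sketch does not attempt.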

Association Analysis of rs1883832 in pTRAb+/− GD Subsets
The serum level of thyroid stimulating hormone receptor autoantibodies (TRAb) is a diagnostic and prognostic factor in GD and can help in monitoring the efficacy of antithyroid drug (ATD) therapy. In this study, we performed an association analysis of rs1883832 in the pTRAb+/− GD subsets to investigate whether rs1883832 could affect the serum level of TRAb, thereby possibly affecting the maintenance of hyperthyroidism. GD patients with TRAb > 1.5 U/L were defined as the persistent TRAb+ (pTRAb+) group (n = 2,389), and those with TRAb < 1.5 U/L as the persistent TRAb− (pTRAb−) group (n = 1,004). Our results showed that rs1883832 was associated with GD in pTRAb+ patients (P = 3.75 × 10⁻⁵) but not in pTRAb− patients (P = 0.06) (Table 4). Furthermore, we performed a heterogeneity analysis between the 2,389 pTRAb+ patients and the 1,004 pTRAb− patients and found no significant heterogeneity for rs1883832 between the two subgroups (P_h = 0.35).
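The text does not name the test behind P_h. Since the Discussion describes it as a comparison of allele frequencies between the two patient subgroups, one plausible reading is a 1-df allelic chi-square test, sketched below. The choice of test, the function name `allelic_chi_square`, and the allele counts are assumptions for illustration only.

```python
import math


def allelic_chi_square(group1, group2):
    """Pearson chi-square (1 df) comparing allele counts between two groups.

    Each group is (risk_allele_count, other_allele_count).
    Returns (chi-square, p-value).
    """
    a, b = group1
    c, d = group2
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for two patient subgroups; not study data.
chi2, p_h = allelic_chi_square((30, 10), (20, 20))
print(f"chi2 = {chi2:.3f}, P_h = {p_h:.4f}")
```

A non-significant result from such a test means the two subgroups cannot be distinguished by allele frequency, which is a weaker claim than showing that the effect is the same in both.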

DISCUSSION
Using our GWAS data and a tag SNP selection strategy, we conducted a refined association study of the region surrounding the CD40 gene and found that rs1883832 was the SNP most strongly associated with GD. Based on the GTEx database and the cis-eQTL database built by Fairfax et al. (32), we analyzed the SNPs relevant to CD40 expression and found that rs1883832 and its tightly linked SNPs exhibited a strong association with CD40 expression. In addition, as a functional polymorphism located at the -1 position in the Kozak sequence of the CD40 gene, rs1883832 was associated with CD40 mRNA expression level in PBMCs in our study. These results indicate that rs1883832 probably regulates CD40 expression, thereby contributing to the pathogenesis of GD. Furthermore, the allele frequency of rs1883832 did not differ significantly between pTRAb+ and pTRAb− patients (P_h = 0.35), but the association analysis showed that rs1883832 was associated with GD only in pTRAb+ patients (P = 3.75 × 10⁻⁵). Therefore, we concluded that rs1883832 participated in the development of GD.

CD40 is a type I integral membrane glycoprotein and a cell-surface member of the TNF receptor superfamily that functions in B-cell proliferation and activation, T cell priming, antigen presentation, immunoglobulin isotype switching, germinal center development, and humoral immune memory (18,35). The interaction of CD40 with its ligand CD154 (CD40L) can trigger immune and inflammatory responses, and it has been found to be associated with several autoimmune diseases such as GD and Hashimoto's thyroiditis (2), systemic lupus erythematosus (SLE) (36), rheumatoid arthritis (RA) (37), and multiple sclerosis (38). Linkage and association analyses have provided robust evidence that CD40 could confer susceptibility to GD. Tomer et al. first reported that CD40 and the rs1883832 CC genotype were both associated with GD in a Caucasian population (12).
Thereafter, the rs1883832 CC genotype was shown to be a susceptibility variant in Japanese GD patients (12,28), and the C allele was demonstrated to raise the risk of GD in the Chinese Han population (39,40). However, these studies used candidate gene strategies or performed replication studies in relatively small samples, which limits their statistical power. In contrast, through a GWAS strategy and a two-stage refined association analysis in large Chinese cohorts, we provide more reliable and unbiased evidence that rs1883832 is probably the causal SNP in the CD40 gene region.

Table footnote: Naive, monocytes without LPS or IFN treatment; P_LPS2, P value of each SNP-regulated CD40 gene expression in monocytes treated with LPS for 2 hours; P_LPS24, P value of monocytes treated with LPS for 24 hours; P_IFN-γ24, P value of monocytes treated with IFN-γ for 24 hours.

Given that rs1883832 is located at the -1 position of the Kozak consensus sequence of CD40, variation at this site is expected to affect gene translation (41). Jacobson et al. used a battery of methods, including in vitro transcription-translation assays, surface expression analysis of cells transfected with the two alleles, and analyses of B cells from individuals with different SNP genotypes, to demonstrate that the C allele of rs1883832 was correlated with increased CD40 translational efficiency compared with the T allele (30). However, they also concluded that there was no correlation between rs1883832 genotype and CD40 mRNA levels, based on quantitative RT-PCR data from the purified B cells (cultured for 16 h with or without interferon-γ) of 11 individuals (30). These results might require further confirmation by increasing the sample size and/or using other methods.

Recently, CD40 expression has been detected on thyroid epithelial cells (19,42), and several studies have explored the mechanism by which CD40 contributes to GD pathogenesis. Jacobson et al. reported a stronger association of the rs1883832 CC genotype with GD patients who had persistently elevated thyroid antibodies than with those who were thyroid antibody negative, and observed significant expression of CD40 mRNA and the corresponding protein in the thyroid, the GD target tissue (43). They proposed that CD40 overexpression on thyrocytes augments thyroid-directed autoimmunity through two possible mechanisms.
The extrinsic mechanism is based on the fact that thyrocytes can present self-peptides within HLA class II molecules to intra-thyroidal T cells under certain conditions, which allows surface molecules such as CD40 to transmit co-stimulatory signals and ultimately leads to T cell activation. In the intrinsic mechanism, activation of the CD40 signaling pathway within thyrocytes could alter their physiology and lead to inflammation and autoimmunity (43). Using a transgenic mouse model overexpressing CD40 in the thyroid, Huber et al. demonstrated that thyroidal CD40 overexpression augmented the production of thyroid-specific antibodies through the activation of downstream cytokines and chemokines (most notably IL-6), resulting in more severe experimental autoimmune GD (44). Recently, a pilot study on a small cohort of 13 GD patients demonstrated that specific CD40 haplotypes composed of six SNPs were associated with higher CD40 mRNA levels and clinical response to Iscalimab (an anti-CD40 monoclonal antibody). As one of the key SNPs, rs1883832 could differentiate responders from non-responders: the C allele was associated with response to Iscalimab and the T allele with no response (45). Therefore, we evaluated the relationship between rs1883832 genotype and CD40 mRNA levels in PBMCs in a study of 95 healthy people and confirmed that individuals with the CC and CT genotypes had higher CD40 mRNA levels in CD19+ B-cells. Our results provide more insight into how genetic variation at the CD40 locus could affect the clinical response to CD40-targeted therapies.

Polymorphisms located within immune regulator gene regions are associated with a variety of diseases (30). By exposing primary monocytes from 432 healthy volunteers to IFN-γ or LPS and mapping gene expression as a quantitative trait, Fairfax et al. established a cis-eQTL database that can help to understand the nature and functional consequences of genetic variation (32).
Using this database, we found that eight GD-associated SNPs were correlated with CD40 expression in the naïve or stimulated state (Table 3). However, rs1883832 with its highly linked SNPs (r² > 0.8) exhibited a stronger association than the other seven SNPs under all conditions.

TRAb is implicated in GD pathogenesis, and its presence in serum is diagnostic for Graves' disease (46). TRAb is also related to the extrathyroidal manifestations of GD, such as Graves' ophthalmopathy and pretibial dermopathy (47,48). Our group has performed association analyses for several SNPs in pTRAb+/− subgroups (8,16), but so far no studies have reported the relationship between rs1883832 and TRAb. Therefore, our results provide a new view of the involvement of rs1883832 in GD pathogenesis.
In summary, through a refined study including 8,171 GD patients and 7,906 controls, our research provides compelling evidence that rs1883832 is the most significant GD-associated SNP located within the CD40 gene region in the Chinese Han population, and is also a susceptibility locus for pTRAb+ GD patients. Furthermore, monocytes and CD19+ B-cells carrying different rs1883832 genotypes showed distinct CD40 mRNA levels, which indicates that rs1883832 and its highly linked SNPs probably affect the CD40 gene at the transcriptional level. Considering that rs1883832 genotypes also influence the translation of CD40, we propose that rs1883832 alters CD40 gene expression at both the transcriptional and translational levels, which ultimately contributes to the etiology of Graves' disease.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are publicly available. These data can be found here: https://wwwdev.ebi.ac.uk/eva/?eva-study=PRJEB48200.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committees of all partner hospitals, including the Shanghai Ninth People's Hospital, the First Affiliated Hospital of Bengbu Medical College, the Linyi Hospital, and the Xuzhou Central Hospital. The patients/participants provided their written informed consent to participate in this study.