Common Variants in NUS1 and GP2 Genes Contributed to the Risk of Gestational Diabetes Mellitus

Background: Recently, the NUS1 and GP2 genes were reported to be associated with the risk of type 2 diabetes (T2D) in a Japanese population. Given that genetic factors contribute to the pathogenesis of both T2D and gestational diabetes mellitus (GDM), we conducted this study to systematically examine the relationship of the NUS1 and GP2 genes with susceptibility to GDM in a Chinese Han population. Methods: A total of 4,250 subjects, comprising 1,282 patients with GDM and 2,968 controls, were recruited, and 20 tag single nucleotide polymorphisms (SNPs) (10 from NUS1 and 10 from GP2) were selected for genotyping. Association analyses were conducted for GDM and its related biomedical indexes, including fasting glucose and HbA1c levels. Results: Two SNPs, rs80196932 from NUS1 (P=2.93×10⁻⁵) and rs117267808 from GP2 (P=5.68×10⁻⁵), were significantly associated with the risk of GDM. Additionally, SNP rs80196932 was significantly associated with HbA1c level in both patients with GDM (P=0.0009) and controls (P=0.0003), while SNP rs117267808 was significantly associated with fasting glucose level in both patients with GDM (P=0.0008) and controls (P=0.0007). Serum levels of the NUS1 and GP2 proteins were measured in the study subjects, and significant differences were identified among groups with different genotypes of SNP rs80196932 and rs117267808, respectively. Conclusions: Our findings indicate that the NUS1 and GP2 genes contribute to the risk of GDM. These findings may improve our understanding of the etiology of GDM and, in turn, could facilitate the development of novel medicines and treatments for GDM.


INTRODUCTION
Gestational diabetes mellitus (GDM) is diabetes diagnosed for the first time during pregnancy (1)(2)(3). It is the most common metabolic disorder in pregnant women (1). The reported prevalence of GDM varies from 1% to >30% because of a lack of consensus on diagnostic criteria (4). The prevalence of GDM is highest in the Middle East and North Africa (median, 15.2%) and lowest in Europe (median, 6.1%) (4)(5)(6). As with type 2 diabetes (T2D), insulin is the primary medical treatment for GDM when lifestyle intervention does not control glucose effectively (4). For most women, a diagnosis of GDM is potentially associated with anxiety related to maternal and fetal health (7). Depression has also been reported to be more frequent in women with GDM than in pregnant women without GDM (8)(9)(10). Additionally, GDM has been reported to be associated with an increased risk of later maternal diabetes and cardiovascular disease (11).
Elucidating the pathological mechanisms underlying GDM could facilitate the development of novel treatments and prevention strategies. Similar to T2D, GDM is considered a complex disorder with contributions from multiple genetic and environmental factors (12,13). To date, two genome-wide association (GWA) studies of GDM have been published, and multiple loci have been reported to be associated with susceptibility to GDM, including IGF2BP2, CDKAL1, SLITRK6, NUMBL, LTBP4, and SNRPGP16 (14,15). Most of the reported significant loci had been identified in previous GWA studies of T2D.
A recent GWA study reported genetic polymorphisms associated with the risk of T2D in two novel loci, the NUS1 and GP2 genes, in a Japanese population (16). Suzuki et al. identified that the T allele of single nucleotide polymorphism (SNP) rs80196932 in NUS1 is associated with a reduced risk of T2D. Additionally, they found that the A allele of SNP rs117267808 in GP2 is associated with an increased risk of T2D. The NUS1 gene encodes a transmembrane domain protein that is a subunit of cis-prenyltransferase (17). The GP2 gene encodes glycoprotein 2, an integral membrane protein (18). This protein binds pathogens such as bacteria and therefore plays a significant role in the immune response (18). Previous evidence has indicated that T2D and GDM might share a common genetic basis (19)(20)(21). Given this shared genetic contribution to pathogenesis, we hypothesized that NUS1 and GP2 may also contribute to the risk of GDM. To examine this hypothesis, we performed a case-control study in a sample of women with Chinese Han ancestry. The aim of this study was to systematically examine the relationship of common DNA variants in NUS1 and GP2 with susceptibility to GDM. The results may help illuminate the pathological mechanisms involved in GDM.

Study Subjects
We conducted a case-control study to evaluate genetic association. Both patients with GDM and healthy controls were enrolled from the Ninth Hospital of Xi'an and Northwest Women and Children's Hospital from April 2015 to June 2019. All study subjects were unrelated women at 24 to 28 weeks of gestation. Individuals with metabolism-related disorders, including hypertension, diabetes, previous polycystic ovary syndrome, autoimmune disease, preeclampsia, and pregnancy-induced hypertension, were excluded. Additionally, women with alcohol abuse or multiple gestation were not included in this study. Controls were healthy pregnant women without any maternal or fetal disorders. The diagnosis of GDM was made according to the guidelines proposed by the International Association of the Diabetes and Pregnancy Study Groups (IADPSG) in 2010. Fasting whole-blood samples were collected for analysis of basic biochemical data, including fasting glucose and HbA1c profiles. Another 5 ml of maternal venous blood was drawn from each enrolled case upon admission for genotyping experiments. Demographic and clinical variables, including age, prepregnancy BMI, and family history of diabetes mellitus, were collected by questionnaire. This study was performed in accordance with the ethical guidelines of the Declaration of Helsinki (version 2002) and was approved by the Ethics Committee of the Ninth Hospital of Xi'an. Written consent forms were collected from all study subjects.

SNP Selection and Experimental Methods
To investigate the potential contributions of the NUS1 and GP2 genes to GDM risk, tag SNPs of the two genes were selected for genotyping. We first extracted 88 SNPs located within the genetic regions of NUS1 and GP2 with minor allele frequency (MAF) ≥0.05 based on Han Chinese data from the 1000 Genomes Project (https://www.internationalgenome.org/). Then, tag SNPs were selected using r² > 0.8 as the criterion. Finally, 20 tag SNPs (10 for NUS1 and 10 for GP2) were selected and genotyped.
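The tagging step can be illustrated with a minimal sketch. The source states only that tag SNPs were chosen at r² > 0.8; the greedy strategy, the use of the squared genotype-dosage correlation as the r² estimate, the function names, and the toy data below are all assumptions made for illustration, not the authors' procedure.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as unlinked
    return cov * cov / (var_x * var_y)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Choose tags so that every SNP is a tag or has r^2 > threshold with one."""
    snps = list(genotypes)
    # captures[s] holds the SNPs that s would tag, including s itself.
    captures = {s: {s} for s in snps}
    for a, b in combinations(snps, 2):
        if r_squared(genotypes[a], genotypes[b]) > threshold:
            captures[a].add(b)
            captures[b].add(a)
    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs.
        best = max(snps, key=lambda s: len(captures[s] & untagged))
        tags.append(best)
        untagged -= captures[best]
    return tags


# Toy dosages for four hypothetical SNPs in six individuals (not study data).
toy = {
    "snpA": [0, 1, 2, 0, 1, 2],
    "snpB": [0, 1, 2, 0, 1, 2],  # identical to snpA
    "snpC": [2, 1, 0, 2, 1, 0],  # perfectly anti-correlated with snpA
    "snpD": [0, 0, 1, 2, 2, 1],  # uncorrelated with the other three
}
tags = greedy_tag_snps(toy)
print(tags)
```

In this toy example a single SNP stands in for the three mutually correlated markers, while the uncorrelated marker must be genotyped directly. Dedicated tools typically estimate r² from phased haplotype frequencies rather than from dosage correlation, so results may differ in real data.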
We extracted genomic DNA from the peripheral blood samples of the study subjects using a DNA extraction kit (Tiangen Biotech Co. Ltd, Beijing, China). SNP genotyping was performed on the Sequenom MassARRAY platform (Sequenom, San Diego, CA, USA). Raw genotyping data were processed with Sequenom Typer 4.0 software. To evaluate the accuracy of the genotyping experiment, 5% of the samples were randomly selected for replication, and a 100% concordance rate was obtained. In addition, serum levels of the NUS1 and GP2 proteins in the study subjects were measured using enzyme-linked immunosorbent assay (ELISA) kits manufactured by Westtang Biotech Inc. (Shanghai, China).

Statistical Analysis
Demographic and clinical characteristics of the study subjects were compared between patients with GDM and controls by Student's t-test or χ² test. Hardy-Weinberg equilibrium (HWE) was tested in controls for all SNPs using the χ² test. Single-marker-based association analyses were conducted in allelic and genotypic modes. Logistic models were fitted for significant hits obtained from single-marker-based association analyses, with age and prepregnancy BMI included as covariates. Linkage disequilibrium (LD) blocks were constructed using genetic data from the selected SNPs. Haplotype-based association analysis was conducted within each LD block. In addition to the risk of GDM, fasting glucose and HbA1c levels were also analyzed as phenotypes using linear models stratified by the disease status of the study subjects. We performed ANOVA to compare serum levels of the NUS1 and GP2 proteins among groups with different genotypes of SNP rs80196932 and rs117267808. PLINK was used for genetic association analysis (22). Haploview was used for visualization of the LD blocks (23). Bonferroni correction was applied for multiple comparisons, and 0.0025 was chosen as the P value threshold for single-marker-based association analyses.
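The allelic test and the multiple-testing threshold described above can be sketched as follows. This is an illustrative reimplementation, not the PLINK code used in the study. It assumes a Pearson χ² test with 1 degree of freedom and no continuity correction, a Woolf (log-based) 95% confidence interval, and a family-wise α of 0.05, which is inferred here from the stated threshold of 0.0025 across 20 SNPs. The allele counts are hypothetical.

```python
import math


def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Chi-square (1 df), P value, odds ratio and Woolf 95% CI for a 2x2 allele table."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 df, via the error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (
        math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
        math.exp(math.log(odds_ratio) + 1.96 * se_log_or),
    )
    return chi2, p_value, odds_ratio, ci


# Bonferroni threshold: assumed alpha of 0.05 divided by the 20 genotyped SNPs.
n_tests = 20
bonferroni_threshold = 0.05 / n_tests

# Hypothetical allele counts (minor/major in cases, minor/major in controls).
chi2, p_value, odds_ratio, ci = allelic_test(300, 1700, 400, 1600)
significant = p_value < bonferroni_threshold
print(f"chi2={chi2:.2f}, P={p_value:.2e}, OR={odds_ratio:.2f}, "
      f"95% CI=({ci[0]:.2f}, {ci[1]:.2f}), significant={significant}")
```

Covariate adjustment for age and prepregnancy BMI, as done in the logistic models, is outside the scope of this sketch.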

Bioinformatics Analysis
To further investigate the potential functional consequences of the significant SNPs, expression quantitative trait loci (eQTL) data were extracted from the GTEx database (https://www.gtexportal.org/home/) (24). eQTL data from 47 types of human tissue were extracted and examined.

Demographic and Clinical Characteristics of the Study Subjects
A total of 4,250 study subjects, comprising 1,282 patients with GDM and 2,968 controls, were recruited (Table 1). No significant differences in age or prepregnancy BMI were identified between patients with GDM and controls. Significant differences were identified for parity (P=0.0069), abnormal pregnancy (P=0.0030), and family history of diabetes mellitus (P=0.0038) between patients with GDM and controls.

Significant SNPs Contributing to the Risk of GDM
No significant deviations from HWE were identified for the 20 SNPs in controls (Supplemental Table S1). We identified two significant SNPs, rs80196932 from NUS1 and rs117267808 from GP2, associated with the risk of GDM (Table 2 and Supplemental Table S2). The C allele of rs80196932 was significantly associated with a reduced risk of GDM [OR (95%CI)=0.74 (0.64-0.85), χ²=17.47, P=2.93×10⁻⁵]. On the other hand, the A allele of SNP rs117267808 was significantly associated with an increased risk of GDM [OR (95%CI)=1.41 (1.19-1.67), χ²=16.21, P=5.68×10⁻⁵]. Dose-response relationships were observed for both significant SNPs in the genotypic analyses. These significant signals remained after adjusting for age and prepregnancy BMI using logistic models (Table 2 and Supplemental Table S3).

Significant Haplotypes Associated With the Risk of GDM
Six LD blocks were constructed using genetic data of the selected SNPs (Supplemental Figure S1). Significant haplotypes were identified for both NUS1 and GP2 (Supplemental Table S4). In NUS1, a two-SNP block including rs80196932 and rs9767451 was significantly related to the disease status of GDM (χ²=30.30, P=2.64×10⁻⁷). In GP2, a three-SNP block including rs117267808, rs141536185, and rs4430753 was significantly related to the disease status of GDM (χ²=29.33, P=1.91×10⁻⁶). Not surprisingly, both significant LD blocks contained the significant hits found in the single-marker-based analyses.
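As a rough consistency check on the single-marker statistics reported for rs80196932 and rs117267808, the P values can be recomputed from the reported χ² values, and an approximate χ² can be recovered from each odds ratio and its confidence interval. The sketch below assumes that the allelic tests have 1 degree of freedom and that the intervals are symmetric on the log scale with a critical value of 1.96. Neither assumption is stated in the source, and the helper names are hypothetical.

```python
import math


def chi2_sf_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def z_from_or_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate Wald z statistic implied by an odds ratio and its 95% CI."""
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(odds_ratio) / se_log_or


# Values as reported in the text for the two significant SNPs.
reported = {
    "rs80196932": {"chi2": 17.47, "or": 0.74, "ci": (0.64, 0.85)},
    "rs117267808": {"chi2": 16.21, "or": 1.41, "ci": (1.19, 1.67)},
}

checks = {}
for snp, r in reported.items():
    p_from_chi2 = chi2_sf_1df(r["chi2"])
    z = z_from_or_ci(r["or"], *r["ci"])
    checks[snp] = (p_from_chi2, z * z)
    print(f"{snp}: P from chi2 = {p_from_chi2:.2e}, "
          f"z^2 implied by OR and CI = {z * z:.1f}")
```

Under these assumptions the recomputed values fall close to the reported ones. Small differences are expected because the published odds ratios and interval bounds are rounded to two decimals.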

Both rs80196932 and rs117267808 Are Significantly Associated With Diabetes-Related Biomedical Indexes
In addition to the risk of GDM, we also investigated the association between the significant genetic markers and diabetes-related biomedical indexes, including fasting glucose and HbA1c levels (Table 3). Age and prepregnancy BMI were included in the linear model as covariates. This analysis was conducted separately for patients with GDM and controls, and the results from the two groups were consistent. SNP rs80196932 was significantly associated with the HbA1c level but not with the fasting glucose level. The C allele of rs80196932 was significantly associated with a reduced HbA1c level in both patients with GDM (t-statistic=-3.34, P=0.0009) and controls (t-statistic=-3.66, P=0.0003). On the other hand, SNP rs117267808 was significantly associated with the fasting glucose level but not the HbA1c level. The A allele of rs117267808 was significantly associated with an increased fasting glucose level in both patients with GDM (t-statistic=3.36, P=0.0008) and controls (t-statistic=3.38, P=0.0007).

eQTL Signals Identified for SNP rs80196932 in NUS1
Significant eQTL signals were discovered for SNP rs80196932 in the NUS1 gene in multiple types of human tissue (Figure 1 and Supplemental Table S5). The most significant eQTL signal was obtained from pancreatic tissue, a key target tissue in GDM. The C allele of rs80196932 was significantly associated with increased expression of the NUS1 gene in pancreatic tissue (Supplemental Figure S2). No significant eQTL signals were found for SNP rs117267808 in the GP2 gene (Supplemental Table S6).

Association of Serum NUS1 and GP2 Level With Significant SNPs
Genotypes of SNP rs80196932 and rs117267808 were significantly associated with serum levels of NUS1 and GP2, respectively. The C allele of SNP rs80196932 was significantly associated with a decreased serum level of the NUS1 protein (Figure 2A). The A allele of SNP rs117267808 was significantly associated with a decreased serum level of the GP2 protein (Figure 2B). Further stratification analysis indicated that these trends were in the same direction in both GDM cases and controls (Supplemental Table S7).

DISCUSSION
In this study, we identified genetic polymorphisms of the NUS1 and GP2 genes contributing to the risk of GDM in a sample of women with Chinese Han ancestry. Although both NUS1 and GP2 have been reported to contribute to the risk of T2D in a recent GWA study in a Japanese population (16), our study is the first to link the two loci to GDM in samples with Chinese Han ancestry. The direction of effect for both SNPs on GDM discovered in the present study was consistent with that reported in the recent T2D GWA study, although the effect sizes of both SNPs were much smaller in the recent GWA study. SNP rs80196932 is located in the 5' untranslated region of NUS1. According to the RegulomeDB database, it is located in an evolutionarily conserved region that overlaps with DNase I hypersensitive sites and active transcriptional start sites (25). Additionally, in the bioinformatics analysis, we identified multiple significant eQTL signals for this SNP in the NUS1 gene in various types of human tissue, including the pancreas. Therefore, this SNP may be more than a surrogate and may be a susceptibility variant with true functional consequences.
The eQTL evidence showed that the C allele of SNP rs80196932 was significantly associated with increased expression of the NUS1 gene in human pancreatic tissue. However, at the protein level, we found that the C allele of SNP rs80196932 was significantly associated with a decreased serum level of the NUS1 protein in the study subjects. This discordant finding could have several explanations. First, the eQTL signals were extracted from human pancreatic tissue, whereas the protein level of NUS1 was measured in serum. This spatial difference might affect gene expression and translation. In addition, post-transcriptional regulation and post-translational modifications, such as phosphorylation and ubiquitination, might complicate the relationship between gene expression level and protein level. Although it is beyond the scope of the current study to examine the pathological mechanisms of NUS1 in GDM, we can still obtain some clues by integrating evidence from the present study. According to our findings, the C allele of SNP rs80196932 was associated with a lower risk of GDM and with higher expression of NUS1. Previous studies have indicated that NUS1 is essential for protein glycosylation and intracellular cholesterol trafficking (26), and an animal model-based study showed that mouse embryonic fibroblasts with conditional knockdown of Nus1 accumulated free cholesterol (17). On the other hand, plasma cholesterol levels have long been considered to be associated with the risk of diabetes (27,28). We therefore speculate that the C allele of SNP rs80196932 reduces the risk of GDM by increasing the expression level of NUS1 and, in turn, reducing free cholesterol levels. More studies are still needed to unravel the pathological mechanisms linking NUS1 and GDM.
SNP rs117267808 is an intronic DNA variant of the GP2 gene. Although we identified a significant relationship between genotypes of rs117267808 and the serum level of the GP2 protein, no significant eQTL signals for rs117267808 in GP2 were found. Therefore, the functional consequences of this DNA variant for the GP2 gene are still not clear. It is probable that rs117267808 is a surrogate for some underlying DNA variants with true effects. A recent study showed that GP2 is a marker of human pancreatic progenitor cells that secrete digestive enzymes and have the plasticity to be converted into insulin-producing beta cells (29). However, we were not able to illuminate the mechanisms linking GP2 to GDM pathology in the present study.
Our association analysis of diabetes-related biomedical indexes indicated that, although NUS1 and GP2 both contribute to the risk of GDM, they might be involved in different mechanisms of GDM. SNP rs80196932 from NUS1 and SNP rs117267808 from GP2 were associated with different biomedical indexes: NUS1 was related to HbA1c, while GP2 was related to fasting glucose. This difference can be partially explained by the difference in the functions of the two genes. Although SNP results alone are insufficient to draw solid conclusions (30)(31)(32), the protein product of GP2 has been reported to be causally related to insulin-producing beta cells and therefore could have direct effects on the fasting glucose level (29). On the other hand, NUS1 is involved in protein glycosylation and intracellular cholesterol trafficking. A recent study demonstrated a correlation between HbA1c and serum lipid profiles in T2D patients (33). Therefore, NUS1 might be associated with HbA1c levels through its effect on cholesterol trafficking.
Our study has several limitations. Population stratification is a major confounding factor in population-based genetic association studies. Because this study was based on a small number of candidate SNPs, we could not address this issue using standard methods such as principal component analysis. However, we applied additional inclusion criteria to at least partially address this potential issue. In the subject enrollment process, we recruited only native individuals with no immigration history in the last three generations. In this way, we could, at least partly, restrict the genetic heterogeneity of the study subjects. Another potential limitation is that the eQTL evidence in the present study was extracted from a publicly available database based on individuals without GDM. However, gene expression levels could differ between controls and patients with GDM. Thus, caution is advised in interpreting the eQTL results. Last but not least, because this study genotyped only a limited set of candidate SNPs, we might have missed novel susceptibility variants for GDM.
In summary, we report novel association signals between genetic polymorphisms of the NUS1 and GP2 genes and GDM risk. Associations with diabetes-related biomedical indexes were also identified. Our findings shed light on the roles of NUS1 and GP2 in the pathogenic mechanisms of GDM, which should be further elucidated in future investigations.

DATA AVAILABILITY STATEMENT
The datasets generated during the current study are not publicly available due to local legislation, but are partially available from the corresponding author on reasonable request (Email: yuniuxjtu@163.com).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the Ninth Hospital of Xi'an. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
Authors YN and TZ conceived and designed the study. TZ and LZ carried out candidate SNP selection and statistical analyses. SW, JL, YC, LM, JF, and YN conducted subject screening. TZ, LZ, and YN contributed to the collection and preparation of control DNA samples. TZ and YN jointly drafted the first version of the manuscript. All authors contributed to the article and approved the submitted version.