Genome-Wide Association Study on Immunoglobulin G Glycosylation Patterns

Immunoglobulin G (IgG), a glycoprotein secreted by plasma B-cells, plays a major role in the human adaptive immune response and are associated with a wide range of diseases. Glycosylation of the Fc binding region of IgGs, responsible for the antibody’s effector function, is essential for prompting a proper immune response. This study focuses on the general genetic impact on IgG glycosylation as well as corresponding subclass specificities. To identify genetic loci involved in IgG glycosylation, we performed a genome-wide association study (GWAS) on liquid chromatography electrospray mass spectrometry (LC–ESI-MS)—measured IgG glycopeptides of 1,823 individuals in the Cooperative Health Research in the Augsburg Region (KORA F4) study cohort. In addition, we performed GWAS on subclass-specific ratios of IgG glycans to gain power in identifying genetic factors underlying single enzymatic steps in the glycosylation pathways. We replicated our findings in 1,836 individuals from the Leiden Longevity Study (LLS). We were able to show subclass-specific genetic influences on single IgG glycan structures. The replicated results indicate that, in addition to genes encoding for glycosyltransferases (i.e., ST6GAL1, B4GALT1, FUT8, and MGAT3), other genetic loci have strong influences on the IgG glycosylation patterns. A novel locus on chromosome 1, harboring RUNX3, which encodes for a transcription factor of the runt domain-containing family, is associated with decreased galactosylation. Interestingly, members of the RUNX family are cross-regulated, and RUNX3 is involved in both IgA class switching and B-cell maturation as well as T-cell differentiation and apoptosis. Besides the involvement of glycosyltransferases in IgG glycosylation, we suggest that, due to the impact of variants within RUNX3, potentially mechanisms involved in B-cell activation and T-cell differentiation during the immune response as well as cell migration and invasion involve IgG glycosylation.

Immunoglobulin G (IgG), a glycoprotein secreted by plasma B-cells, plays a major role in the human adaptive immune response and are associated with a wide range of diseases. Glycosylation of the Fc binding region of IgGs, responsible for the antibody's effector function, is essential for prompting a proper immune response. This study focuses on the general genetic impact on IgG glycosylation as well as corresponding subclass specificities. To identify genetic loci involved in IgG glycosylation, we performed a genome-wide association study (GWAS) on liquid chromatography electrospray mass spectrometry (LC-ESI-MS)-measured IgG glycopeptides of 1,823 individuals in the Cooperative Health Research in the Augsburg Region (KORA F4) study cohort. In addition, we performed GWAS on subclass-specific ratios of IgG glycans to gain power in identifying genetic factors underlying single enzymatic steps in the glycosylation pathways. We replicated our findings in 1,836 individuals from the Leiden Longevity Study (LLS). We were able to show subclass-specific genetic influences on single IgG glycan structures. The replicated results indicate that, in addition to genes encoding for glycosyltransferases (i.e., ST6GAL1, B4GALT1, FUT8, and MGAT3), other genetic loci have strong influences on the IgG glycosylation patterns. A novel locus on chromosome 1, harboring RUNX3, which encodes for a transcription factor of the runt domain-containing family, is associated with decreased galactosylation. Interestingly, GWAS on IgG Glycosylation Patterns Frontiers in Immunology | www.frontiersin.org February 2018 | Volume 9 | Article 277 members of the RUNX family are cross-regulated, and RUNX3 is involved in both IgA class switching and B-cell maturation as well as T-cell differentiation and apoptosis. Besides the involvement of glycosyltransferases in IgG glycosylation, we suggest that, due to the impact of variants within RUNX3, potentially mechanisms involved in B-cell activation and T-cell differentiation during the immune response as well as cell migration and invasion involve IgG glycosylation.
To deepen our understanding of glycan biosynthesis and its role in the pathophysiology of many diseases, it is imperative, however, that we identify all factors involved in glycosylation pathways. The best described glycoprotein so far is immunoglobulin G (IgG) (17). Its glycosylation is thought to have important regulatory functions in the immune response (18) and has been associated with various diseases, such as rheumatoid arthritis (19) and different types of cancers (10,11). Also within the healthy population, a high interindividual variability in IgG glycosylation patterns is observed, that is, partly attributable to a heritable component (14,20). With the development of high-throughput glycosylation techniques, it has now become feasible to analyze glycosylation profiles and their relation with genetics at a population level. A first genome-wide association study (GWAS) by Lauc et al. (21)., including 2,247 individuals from four European cohorts (CROATIA-Vis, CROATIA-Korcula, Orkney Complex Disease Study and Northern Swedish Population Health Study), identified four loci encoding glycosyltransferases associated with IgG N-glycans. The authors likewise propose that five additional loci are involved in IgG glycosylation showing that a GWAS can be used to identify genetic loci controlling glycosylation of a single plasma protein (21). They replicated the association of two of their loci, MGAT3 and B4GALT1, in a cohort of MALDI-TOF MS (matrix-assisted laser desorption/ionization time-of-flight mass spectrometry) measured glycan data from the Leiden Longevity Study (LLS) (22). A recent study by Shen et al. (23) used a multiphenotype approach to analyze the genetic background of IgG glycosylation. Here, the authors examine IgG glycan structures measured by ultra-performance liquid chromatography [(UPLC) (24)] in a multivariate way and thereby detect five novel genetic loci that are associated with combinations of IgG glycan traits.
In contrast to UPLC, used by Lauc et al. and Shen et al., the liquid chromatography electrospray mass spectrometry (LC-ESI-MS) method allows for subclass-specific quantification of N-linked glycans. It has been shown that the IgG subclasses, IgG1-IgG4, not only differ in their structure, especially within the hinge region of the glycoprotein, but also in their effector functions (17,25). Besides differences in the number of disulfide bonds and the length and flexibility of the hinge region, glycosylation profiles also differ between the four IgG subclasses (26). While IgG2 is characterized by a higher degree of core-fucosylation and a low level of galactosylation, IgG1 shows a particularly high level of galactosylation for both neutral and sialylated structures (26). IgG4, on the other hand, shows a high level of core-fucosylated complexes with bisecting N-acetylglucosamine (GlcNAc) (26). How these subclass-specific glycosylation profiles are realized and what their specific contributions are in the pathophysiology of diseases remains largely illusive.
Previous GWAS on serum metabolite levels have indicated that analyzing enzyme substrate-product ratios benefits in gain by power for detecting associated genetic loci over analyzing single metabolites (27). Due to the LC-ESI-MS method, we are able to derive different types of IgG glycan traits to address the genetic background of the IgG glycan synthesis, including withinsubclass ratios representing the addition of one monosaccharide at a time, i.e., a single pathway step within IgG glycan synthesis. The same approach was utilized to validate pathway steps inferred by a network-based approach in Benedetti et al. (28). Here, the authors included GWAS data on ratios of IgG glycan structures representing specific, established and newly predicted enzymatic pathway steps. The GWAS data as well as additional laboratory experiments verified the hypotheses drawn from the network analysis. In contrast to Benedetti et al. we extend the list of ratios to all possible one-step pathway steps independent of any prior selection. Furthermore, to challenge the assumption of similar genetic control of glycan biosynthesis for all IgG subclasses (16), we additionally compute subclass-specific IgG glycan traits containing between-subclass ratios and subclass-specific IgG glycan proportions. Furthermore, we include summarizing traits for IgG glycan structures to capture general trends associated with variations in genetic loci as well as additional biologically meaningful glycosylation traits.
By means of a GWAS including these newly derived traits of the LC-ESI-MS measured IgG glycopeptides in our discovery cohort from the KORA F4 study (n = 1,823), and a replication of the results for the same glycan panel in an additional 1,800 samples from LLS, we want to further investigate the underlying genetic control of IgG glycosylation.  We conducted an age-and sex-adjusted genome-wide association  scan on 376 glycan traits, including 50 initial measured IgG glycopeptides, 155 summarizing derived traits, 95 within-subclass  ratios, 40 between-subclass ratios, and 36 glycan proportions (see Figure 1 for an overview; Table S1 in Supplementary Material; and Section "Materials and Methods" for further details). In our discovery cohort, KORA F4 (n = 1,823, study characteristics in Table S2  phenotypic traits and set our Bonferroni-corrected replication significance threshold to 2.33 × 10 -6 . From the 21,476 associations available for replication, we replicated 15,342 associations between 159 traits and 718 SNPs, which are displayed in Figure 2 (network representation) and Figure S1 (Manhattan plot) and Table S5 (all replicated results) in Supplementary Material. Table 1 summarizes the mentioned results. This table presents the associated genomic loci with p-values and effect sizes from both cohorts and associated IgG glycan traits and their directions of association.
The replicated traits cover all types of glycan traits and all IgG subclasses: 22 (out of 50) initial IgG glycopeptides, 87 (out of 155) summarizing derived traits, 39 (out of 95) within-subclass ratios, 6 (out of 40) between-subclass ratios, and 5 (out of 36) glycan proportions. Effects for all replicated associations are in the same direction and of similar magnitude as in the discovery cohort (part 2 in Figure 1).
The   ]. An overview of the associated traits per locus can be found in Table S8 in Supplementary Material and is shortly given in Table 1. With our study, we can confirm six of the loci associated with UPLC-measured IgG glycan traits (21,23) being associated with LC-ESI-MS-measured IgG glycan structures in a comparable way (see the Supplementary note and Table S13 in Supplementary Material for additional details). In addition, we detect a novel locus at RUNX3 (chromosome 1p36.11).
On chromosome 1, three SNPs (rs16830188, rs10903120, and rs11270291) have significant impact on glycan traits. A multivariate analysis in KORA F4 reveals that the three SNPs describe one locus, with rs16830188 being the most influential SNP (see Table S10 in Supplementary Material). These SNPs are in high linkage disequilibrium (LD) (r 2 ≥ 0.5) and are flanking the gene RUNX3 (see Figure S6A in Supplementary Material). The T-allele of the most significant marker for all associated glycan traits, rs16830188, is associated with an increase in agalactosylated structures and a decrease in mono-and digalactosylated structures. In addition, this SNP has the largest effect sizes for all associated glycan traits and explains 1.4 to 3.5% of the variance of the associated traits (see Table S3 in Supplementary Material). The genetic variants within RUNX3 especially affect IgG glycan traits from IgG2 and IgG4, illustrating the merit of the subclassspecific analysis.
In contrast to UPLC, LC-ESI-MS is suited for quantifying subclass-specific IgG glycan structures and thus for analyzing within-subclass ratios that represent single pathway steps in IgG glycan synthesis, as well as between-subclass ratios and glycan proportions. Using QQ-plots to compare the associations obtained with initial IgG glycan traits versus within-subclass ratios, we clearly demonstrate a gain in power for the latter analytical approach (Figure 3). These ratios even outperform the summarizing derived traits. Subclass specificity assessed by glycan proportions and between-subclass ratios perform almost as good as initial measured IgG glycopeptides, except for associations with very low p-values (1 × 10 -18 ) ( Figure S3A in Supplementary Material). In addition, we performed meta-analyses on replicated SNP-glycan associations of the two cohorts and statistically compared the strength of the associations of the same glycan trait for different subclasses.
Except for four IgG glycan ratios (IgG2_G2/IgG2_G1, IgG1_ G2N/IgG1_G1N, IgG1_G1NS1/IgG1_G1N, and IgG1_G1S1/ IgG1_G1), the associations of within-subclass ratios support the known functions of the glycosyltransferases within the IgG glycan synthesis across the subclasses (see Figures  bisecting GlcNAc over structures without bisecting GlcNAc associate with SNPs within the N-acetylglucosaminyltransferase MGAT3 locus; ratios of sialylated over non-sialylated glycan traits associate to variants within the sialyltransferase ST6GAL1 locus and ratios of fucosylated over non-fucosylated structures associate with fucosyltransferase FUT8. Besides known pathway steps in IgG glycan synthesis and inferred reactions from network analyses (28), the results hint at hitherto unknown enzymatic reaction steps catalyzed by the known glycosyltransferases, e.g., associations between variants of FUT8 and IgG1_G0FN/IgG1_G0N.
Subclass comparisons of meta-analyzed data revealed that 22 glycan traits are significantly different associated with 432 SNPs on 4 chromosomes ( Table S7 in Supplementary Material contains all results from the statistical tests, an overview is given in Table S8 in Supplementary Material and, graphically, in Figures S2 and S4F in Supplementary Material). In addition, as stated before, associations to five between-subclass ratios and six glycan proportions were replicated. Taken together, the major difference of IgG glycan traits on different subclasses lies in bisecting and fucosylation. We found 13 IgG glycans to be significantly different associated with SNPs at FUT8 between IgG1 and IgG2. In addition, the neutral glycan traits G0n, G1n, and G2n showed significantly different behaviors between IgG1 and IgG2 as well as between IgG2 and IgG4 as the T-allele of the strongest SNP in this locus (FUT8), rs11158592, was significantly negative associated with these traits in IgG1 and IgG2 but not in IgG4.
Furthermore, for the association between SNPs within MGAT3 and IgG glycan traits, we detected two traits being significantly different between IgG1 and IgG2, 10 glycan traits differing between IgG2 and IgG4, and only one glycan trait being different for IgG1 and IgG4 (G1FN). Almost all of the significantly differing glycan structures contain a bisecting GlcNAc. In addition, within-subclass ratios representing the addition of a GlcNAc (G1FN/G1F, G2FN/G2F, and G0FN/G0F) are significantly differently associated with the MGAT3 locus for the IgG subclasses.

DiscUssiOn
Our study attempts to deepen the knowledge of genetic influence on IgG glycosylation and to disclose possible subclass specificity in the synthesis pathways. We used the LC-ESI-MS-measured glycopeptides in both the discovery and replication cohort. In contrast to the UPLC and MALDI-TOF MS data used in Ref. (21,23), we quantify subclass-specific attached N-glycans, and the traits are comparable between the discovery and replication study.
With the new analytical method, LC-ESI-MS, we confirmed the association of IgG glycosylation to six of the loci previously identified with UPLC (21, 23) and, moreover, detect a novel locus, RUNX3, on chromosome 1p36.11. Unfortunately, we could not verify any of the additional proposed loci proposed by Shen et al. (23), probably due to power reasons and difference in statistical methodology (multivariate vs univariate approach). In addition, it has to be highlighted, however, that IgG glycan traits originating from the two analytical methods cannot be combined straightforwardly (see the Supplementary note and Table S10 in Supplementary Material).
RUNX3 encodes for a transcription factor of the runt domaincontaining family. It is located on chromosome 1 and three variants within this locus are associated with 10 phenotypic traits. All three SNPs are in high LD to each other (r 2 ≥ 0.5). RUNX3 and other transcription factors of the runt-family have a large impact on hematopoiesis (29). Methylation of RUNX3 promoters has an impact on several diseases (30)(31)(32), as well as on inflammation and immune response (33)(34)(35). In particular, RUNX3 could be linked to B-cell maturation (36).
In addition, the transcription factor has been shown to contribute greatly to the regulation of apoptosis in cancer metastasis in general (37) and in the differentiation of T-cells to CD4+ and CD8+ T-cells in particular (38)(39)(40)(41). While IgG is secreted by differentiated B-cells, it nonetheless has been shown that IgG1 glycosylation is dependent on B cell stimuli during their differentiation. These stimuli include T-cell derived cytokines and metabolites (42). By influencing T-cell differentiation, RUNX3 could likely indirectly influence the glycosylation of antibodies produced by B-cells. Thus, T-cell differentiation may stimulate B-cell activation and influence the glycosylation of their secreted antibodies.
The opposing effect directions for structures with and without attached galactose lead to the hypothesis that the RUNX3 locus plays an important role in galactosylation. There is a striking overlap between glycan traits associated with the RUNX3 locus and the B4GALT1 locus, supporting this hypothesis. Interestingly, a similar feature as for RUNX3, namely altering the differentiation process of T-cells, is attributed to the enzymes of the Ikaros family including IKZF1 (43,44). However, in our study no glycan traits overlapped for the two loci. Potentially, the two transcription factors regulate different glycosyltransferases.
While the other six loci have been described before in Ref. (21,23) to be associated with N-glycan biosynthesis, variants in the RUNX3 locus are novel candidates from our study. Since only glycan traits from less abundant IgG2 (IgG2/3 in KORA) and IgG4 were associated (17), it is reasonable to assume that the reason why the locus on chromosome 1 could not be detected before by the UPLC is because this technique does not provide information about N-glycosylation that is subclass specific, but instead results in total IgG N-glycans quantification and thus, a larger sample size may have been needed for UPLC data. Indeed, subclass-specific analyses reveal this association presumably due to higher power for the subclass-specific associations.
The IgG glycan traits based upon the two analytical methods, UPLC and LC-ESI-MS are not entirely comparable. A benefit of the LC-ESI-MS method is the subclass-specific IgG glycosylation measurements, with the drawback of non-separable IgG2 and IgG3 in the discovery cohort, which is due to the identical peptide moieties (E293EQFNSTFR301) of their tryptic Fc glycopeptides in Caucasians (45). Nevertheless, we were able to compare SNP associations for similar glycan traits between the subclasses and examine the IgG glycan synthesis separately for each subclass.
The within-subclass ratios representing enzymatic pathway steps are mainly associated with the assumed genetic loci coding for known glycosyltransferases (16)   Supplementary Material). These traits contain not only wellknown enzymatic reactions within IgG glycan synthesis ( Figure  S4A in Supplementary Material), inferred reactions based on network analysis as in Ref. (28) but also all possible ratios representing the addition of one monosaccharide at a time ( Figure  S4B in Supplementary Material). Comparing the results from the known enzymatic steps and other possible one-step pathway relations suggests the existence of several of the latter (see Figure  S3B in Supplementary Material), even in addition to the pathway steps supposed by Benedetti et al. (28). However, few of the within-subclass ratios are associated with variants from different genetic loci (see Figure 2). The comparison of IgG glycan traits across subclasses leads to the hypothesis that fucosylation catalyzed by Fut8 and the addition of bisecting GlcNAc supported by Mgat3 is realized to different extent between the IgG subclasses. Fucosylation seems to be especially different between IgG1 and IgG2 while bisection is mostly differing between IgG2 and IgG4. For more details, see the supplementary note in Data Sheet S1 Supplementary Material. Still, functional studies are needed to elucidate the underlying mechanisms, especially with regard to subclass specificity. Indeed, it has been shown that specific subclasses and their attached glycan structure are highly relevant as biomarkers for diseases and even more when used in antibody therapy (46)(47)(48)(49)(50).
The obtained results help to broaden our knowledge on the pathway steps of IgG glycan synthesis in general, and, specifically the differences for each IgG subclass. While the initial glycan traits outperform the between-subclass ratios and glycan proportions, the findings from comparing SNP-glycan associations across subtypes (Table S5 and Figures S2 and S4F in Supplementary Material) hint at altered glycan synthesis for the different IgG subclasses.
cOnclUsiOn Summarizing, our analysis yields 159 phenotypic traits based on LC-ESI-MS measured IgG glycopeptide structures being significantly associated with 718 genetic variants on seven distinct loci. For UPLC-measured IgG glycans, six out of the seven loci have been shown to influence IgG glycosylation (21,23). The new gene found to be associated with LC-ESI-MS measured IgG glycopeptide traits is RUNX3 on chromosome 1. Ratios of IgG glycans representing enzymatic pathway steps within the N-glycan biosynthesis are predominantly associated with genetic variants within regions of a priori suggested genes encoding for known glycosyltransferases. Subclass comparisons point to specific behavior of variants covering the MGAT3 locus on chromosome 22 and the FUT8 locus on chromosome 14.

Discovery cohort-KOra F4
The KORA F4 study, conducted in 2006-2008, is an independent population-based health survey (51) and was performed as a follow-up of the KORA S4 study (1999)(2000)(2001) (52). The study followed the recommendations of the Declaration of Helsinki and was approved by the local ethical committees. In the F4 follow-up, a total of 3,080 persons participated of whom 1,823 individuals were available for the genome-wide association scan of IgG glycopeptides traits. Genotyping was realized with the Affymetrix Axiom Chip (53,54). Prephasing was done by SHAPEIT v2 and imputation was carried out by IMPUTE v2.3.0 using 1000 Genome (phase 1 integrated haplotypes CEU) as a reference panel. SNPs were non-monomorphic and filtered based on their call rate (98%), their minor allele frequency (>1%) and were excluded if they significantly aberrated from the Hardy-Weinberg Equilibrium (p < 5 × 10 −6 ). All individuals were of European ancestry and samples with mismatching phenotypic and genetic gender were excluded. These criteria led to a total of 18,185,628 SNPs. After the analysis, we additionally excluded SNPs with imputation quality defined by IMPUTE lower than 30%. A total of 1,823 individuals from the KORA F4 cohort were used for discovery. The samples include 935 women and 888 men ranging from 32 to 81 years, with mean age of 62.56 years (SD = 9.89) (see Table S2 in Supplementary Material for more details).

replication cohort-lls
The LLS followed the recommendations of the Declaration of Helsinki, the study protocol was approved by the local medical ethical committee and good clinical practice guidelines were maintained.
The LLS examined long-lived siblings of European descent together with their offspring and the partners of the offspring. Families with at least two long-lived siblings (age ≥89 for man, age ≥91 for women) were recruited. This age category represented <0.5% of the Dutch population in 2001 (22). In total, 944 longlived individuals (age range 89-104), 1,671 of their offspring (age range 39-81), and 744 partners thereof (60 years, 36-79) were included (55). DNA genotyping for LLS was performed at baseline as described in detail in Ref. (56) with the Illumina Human660W and Illumina OmniExpress arrays. Genotype imputation was performed using IMPUTE v2.2 (beta) with the 1000 Genome (phase 1 integreated haplotypes CEU) as reference panel. Quality control included SNP-wise call rate (95%), their minor allele frequency (>1%) and derivation from the Hardy-Weinberg equilibrium (p < 1 × 10 −4 ). As for KORA F4, we excluded SNPs with imputation quality lower than 30% as provided by IMPUTE. For the current genome-wide association analysis with IgG glycopeptide measurements, 1,836 samples of offspring and their partners were available.

IgG Isolation
As described in Ref. (20), IgG was isolated from plasma by affinity chromatography using 96-well protein G monolithic

LC-ESI-MS/MS Analysis of IgG Tryptic Glycopeptides
For the KORA F4 study, tryptic glycopeptides were analyzed on nanoACQUITY UPLC system (Waters, USA) coupled to Compact mass spectrometer (Bruker Daltonics, Germany) via a capillary electrophoresis electrospray (ESI) interface (Agilent Technologies, Santa Clara, CA, USA). A sheath liquid (50% isopropanol, 20% proprionic acid) was pumped at a flow rate of 2 µL/min. Nine µL of IgG tryptic glycopeptides was loaded on Acclaim PepMap100 C8 (5 mm × 300 μm i.d.) trap column. The glycopeptides were washed 1 min with 0.1% TFA (solvent A) at a flow rate of 40 µL/min and separated on an HALO C18 nanoLC column (50 mm × 75 μm i.d., 2.7 µm HALO fused core particles) (Advanced Materials Technology, USA) at 30°C, using a 3.5 min gradient from 19 to 25% solvent B (80% ACN) at 1 µL/ min flow rate. Mass spectra were acquired from 500 to 2,000 m/z GWAS on IgG Glycosylation Patterns Frontiers in Immunology | www.frontiersin.org February 2018 | Volume 9 | Article 277 units with two averages at a frequency of 0.5 Hz. The quadrupole ion energy and collision energy were set to 4 eV. NanoACQUITY UPLC system was operated under MassLynx software version 4.1 and the Bruker micrOTOF-Q was operated under HyStar software, version 3.2. Data extraction was performed using an in-house Python script. In short, data were m/z recalibrated based on a subset of hand-picked analytes having a high signalto-noise ratio and the expected isotopic distribution. Intensities for the top four isotopologues were extracted using a 10 ppm m/z window. Retention times were aligned toward the cohort median and retention time bins were determined for the analytes. All of the signals belonging to a single analyte for every sample were summed up. For the LLS study, the IgG glycopeptide samples were analysed using an Ultimate 3000 RSLCnano liquid chromatography system (Dionex, Sunnyvale, CA, USA) coupled to a Maxis Impact quadrupole time-of-flight-MS (micOTOF-Q, Bruker Daltonics), as described previously (57). Samples were run over a trap column (Acclaim PepMap100 C18, 5 mm × 300 µm i.d., Dionex, Sunnyvale, CA, USA) and a separation column (Ascentis Express C18 nanoLC, 50 mm× 75 µm i.d., 2.7 µm HALO fused core particles; Supelco, Bellefonte, PA, USA). A linear gradient was used with a flow rate of 0.9 µL/min, with solvent A consisting of 0.1% TFA and B of 95% ACN: t = 0, 3% solvent B; t = 2, 6%; t = 4.5, 18%; t = 5, 30%; t = 7, 30%; t = 8, 0%; t = 11, 0%. The LC was coupled to the MS via a sheath-flow electrospray (ESI) interface (Agilent Technologies, Santa Clara, CA, USA). A sheath flow, consisting of 50% isopropanol, 20% proprionic acid, and 30% MilliQ-purified water, was applied with a flow rate of 2 µL/min, along with nitrogen gas at 4 L/min. Mass spectra were acquired within an m/z range of 600-2,000 at a frequency of 0.5 Hz. LC-MS data were examined and calibrated using Compass Data Analysis 4.2 (Bruker Daltonics), and retention time alignment was done using Msalign. In-house developed software Xtractor 2D (see http://ms-utils.org/ Xtractor/) was used to extract signal intensity data. For each type of glycopeptide, the background-subtracted signal intensity of the first three isotopic peaks in both 2+ and 3+ charge state were summed.
For the following analyses we used the most prominent measureable 20 glycopeptides in subclasses IgG1 and IgG2 (a mixture of IgG2 and IgG3 in KORA F4, IgG2 only in LLS) and the most prominent and identifiable 10 fucosylated glycopeptides in IgG4, since peaks belonging to afucosylated IgG4 glycans overlapped with those of earlier eluting and much higher abundant IgG1 glycans.

Preprocessing of igg glycopeptides
Glycosylation is highly differing between individuals. Absolute values of peaks obtained by the LC-ESI-MS method are not comparable. We normalize glycopeptides per subclass by total area normalization as defined in the R-package "glycanr" (R-package version 0.3.0) (58), taking their relative abundance within subclasses as phenotypes and input variables for ratios. Batch correction per subclass was performed with the ComBat (59) algorithm of the R-package "sva" (R-package version 3.14.0) (60). To meet the assumptions for ComBat batch correction, samples were log-transformed before applying the algorithm and exponentiated afterward to regain the original scale. Derived traits have been computed from batch corrected glycan measurements.
Summarizing derived traits per subclass were computed as described in S1 using the ildt function from glycanr package (58). Ratios within subclasses were defined as product over substrate for all possible one-step reactions in the pathways, based on the assumption that single sugar molecules can only be added and not removed (61). Ratios between subclasses were calculated as described in Data Sheet S1 in Supplementary Material. Here, we do not assume an actual product-substrate relationship. For ratios including glycopeptides traits from IgG4, we renormalized the glycopeptide traits on all corresponding fucosylated traits only. All ratios were log-transformed.
For the glycan proportions, i.e., normalization per specific glycopeptide trait in total Fc IgG glycopeptides, we also used the total area normalized traits as input. We only calculated per glycan normalization for core-fucosylated glycopeptides as others are not available for IgG4. In addition, we computed the sums per IgG subclass as ratios of these sums. Again, the ratios were log-transformed before any further analyses. For the discovery cohort, characteristics of all IgG glycan traits can be found in Table S2 in Supplementary Material.
From these 50 initial glycopeptides, summarizing traits per subclass were derived [as seen in Table S1 in Supplementary Material and in Ref. (24)], including, e.g., "Percentage of IgG1 Fucosylation" (sum of all fucosylated glycan traits in IgG) or the "ratio of afucosylated monosialylated structures with and without bisecting GlcNAc in total IgG1 glycans" (the ratio of sum of afucosylated monosialylated structures with bisecting GlcNAcs over the sum of afucosylated monosialylated structures without bisecting GlcNAcs in total IgG1 glycan traits).
In addition, we included all one-step pathway ratios of product over substrate possible within each subclass, e.g., IgG1_G0F/ IgG1_G0 (see Figure S4B in Supplementary Material). The ratios describe reactions that are already known to be part of the IgG glycosylation biosynthesis as well as reactions that can be derived on the assumption of the addition of one monosaccharide at a time, but which are hitherto unknown.
To analyze differing glycosylation pathways for the subclasses, we included ratios of glycopeptides across subclasses in our analysis, e.g., IgG1_G0/IgG2_G0. Glycopeptide traits being used for ratios with IgG4 were additionally normalized on their respective fucosylated glycopeptides only.
For detecting genetic influence on the abundances of the IgG subclasses, we additionally normalized the traits "per glycan" [e.g., IgG1_G0F/(IgG1_G0F + IgG2_G0F + IgG4_G0F)] and included the newly normalized glycopeptides, the subclass-specific sums, and ratios thereof in the analyses. These traits describe the relative All variants in LD with r 2 ≥ 0.5 were assigned to the same LD-block. Hereby, SNPs could be assigned to more than one LD-block. We then merged LD-blocks that were overlapping position-wise (see Figures S5A-F in Supplementary Material). This step takes care of SNPs for which no LD information was available but which are still situated in a highly heritable genetic region. The remaining (now larger) LD blocks can thus be clearly separated by positions on the chromosome and can be more easily observed by specific regional plots (see Figure  S6 in Supplementary Material). This approach helps to summarize variants within loci. However, it does not account for functional similarity between markers or their relevance on IgG glycopeptide traits.
Information on the position of SNPs and their genetic features are obtained from the UCSC Genome Browser on Human [February 2009 (GRCh37/hg19)] (66).
For replicated SNPs on chromosome 1, we additionally performed multivariate linear models with the data from the discovery cohort, KORA F4, and settings as in the original model. For associated glycopeptide traits, models included one up to all three of the replicated SNPs on chromosome 1. We compared the significance of the single SNPs in the joint models as well as the added explained variance of the glycan traits.

eThics sTaTeMenT
The KORA study was carried out in accordance with the recommendations of good clinical practice guidelines and the guidelines of the KORA study group, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the KORA study group. The LLS study was carried out in accordance with the recommendations of good clinical practice guidelines and the guidelines of the medical ethics committee of the Leiden University Medical Center, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

acKnOWleDgMenTs
The KORA study was initiated and financed by the Helmholtz Zentrum München-German Research Center for Environmental Health, which is funded by the German Federal Ministry of abundance of the attached glycopeptide structure for the specific subclass.
All traits declared as "ratios" were log-transformed before any statistical analysis.
For a complete list of all 376 phenotypic traits, please see Table  S1 in Supplementary Material.

genome-Wide association analysis and Meta-analysis
We performed a genome-wide association analysis in the discovery cohort, KORA F4. First, each phenotype (see "IgG glycan traits" for explanation) was adjusted for sex and age and regression residuals were inverse normal rank transformed to assure normal distribution. The transformed residuals were used for association analysis in a linear model performed with snptest v2.5.1 software (62) using an additive genetic model.
The threshold determining the suggestive SNPs was set to 5 × 10 −8 . Given that we performed 376 different GWASs, the genome-wide significant threshold for the discovery was additionally adjusted for the number of initially measured glycopeptide traits and thus set to 1 × 10 −9 (5 × 10 −8 /50 initially measured glycopeptides). All derived traits and different ratios are a function of initially measured glycan traits and are therefore dependent on these traits. We acknowledge that there might be less independent traits within initially measured glycans, but we decided to be more conservative and correct for 50 tests.
For the LLS cohort, we computed linear regression models for the suggestive associations from the discovery only. Similar to the discovery cohort, each trait was adjusted for sex and age first, and the obtained residuals were normally rank transformed. For LLS, a score test including family information (63) was conducted with the C++ program QTassoc [see http://www.lumc.nl/uh, under GWAS Software (63)]. The Bonferroni-corrected replication threshold was set to 2.33 × 10 −6 (0.05/number of replicated associations).
For all replicated trait-SNP associations (p < 2.33 × 10 −6 ), we additionally performed an inverse variance-weighted fixed effects meta-analysis on the summary statistics of the two cohorts. The meta-analysis was performed using the software METAL (64). We included similar traits for all three IgG subclasses. Based on the results from the meta-analysis, we performed t-tests to determine statistical significant differences between the same traits for the different IgG subclasses. In the subclass comparisons, we included all glycopeptide traits being available in at least two of the subclasses. We then analyzed parallels between the significantly different glycopeptide traits for the IgG subclasses and the replicated associations for ratios of IgG glycopeptides from different subclasses.

relation of replicated snPs
For all replicated SNPs, we obtain the information on their LD from SNiPA (65) (last update version from November 2015, setting: GRCh 37, 1000 Genomes Phase 3 v5, European, Ensembl82). As for some variants, LD information is missing, we next generated LD-blocks of replicated markers. GWAS