Large-Scale Analysis of Apolipoprotein CIII Glycosylation by Ultrahigh Resolution Mass Spectrometry

Apolipoprotein-CIII (apo-CIII) is a glycoprotein involved in lipid metabolism and its levels are associated with cardiovascular disease risk. Apo-CIII sialylation is associated with improved plasma triglyceride levels and its glycosylation may have an effect on the clearance of triglyceride-rich lipoproteins by directing these particles to different metabolic pathways. Large-scale sample cohort studies are required to fully elucidate the role of apo-CIII glycosylation in lipid metabolism and associated cardiovascular disease. In this study, we revisited a high-throughput workflow for the analysis of intact apo-CIII by ultrahigh-resolution MALDI FT-ICR MS. The workflow includes a chemical oxidation step to reduce methionine oxidation heterogeneity and spectrum complexity. Sinapinic acid matrix was used to minimize the loss of sialic acids upon MALDI. MassyTools software was used to standardize and automate MS data processing and quality control. This method was applied on 771 plasma samples from individuals without diabetes allowing for an evaluation of the expression levels of apo-CIII glycoforms against a panel of lipid biomarkers demonstrating the validity of the method. Our study supports the hypothesis that triglyceride clearance may be regulated, or at least strongly influenced by apo-CIII sialylation. Interestingly, the association of apo-CIII glycoforms with triglyceride levels was found to be largely independent of body mass index. Due to its precision and throughput, the new workflow will allow studying the role of apo-CIII in the regulation of lipid metabolism in various disease settings.


INTRODUCTION
Lipid metabolism is regulated by complex biological mechanisms in which apolipoproteins (proteins embedded in lipoprotein particles) modulate the transport and availability of blood lipids (Mahley et al., 1984). Apolipoprotein-CIII (apo-CIII) is a 79 amino acid glycoprotein present on the surface of triglyceride-rich lipoproteins and is an inhibitor of lipoprotein lipase (LPL), an enzyme that hydrolyzes triglycerides into fatty acids (Shachter, 2001; Larsson et al., 2013, 2017). Apo-CIII has been associated with increased monocyte adhesion to the endothelium (Kawakami et al., 2006) and enhanced binding of apoB-containing lipoproteins to vascular proteoglycans (Olin-Lewis et al., 2002). High apo-CIII levels are associated with hypertriglyceridemia (Kohan, 2015; Dai et al., 2019; Taskinen et al., 2019) and with increased cardiovascular disease risk in the general population (Wyler Von Ballmoos et al., 2015; Rosenson et al., 2016; Dai et al., 2019) and in diabetes mellitus (Juntti-Berggren and Berggren, 2017; Christopoulou et al., 2019). Recently, clinical interest in this protein has increased due to the promising results obtained from antisense oligonucleotide-based therapies for the reduction of apo-CIII and triglyceride levels (Pollin et al., 2008; Rocha et al., 2017; Reyes-Soffer et al., 2019; Taskinen et al., 2019).
Apo-CIII exists in four major proteoforms: one nonglycosylated form (apo-CIII 0a) and three O-glycosylated variants with a core 1 (T-antigen) glycan structure, which is either non-sialylated (apo-CIII 0c), monosialylated (apo-CIII 1) or disialylated (apo-CIII 2) (Nedelkov, 2017; Ramms and Gordts, 2018). Low-abundance fucosylated, non-sialylated apo-CIII forms have also been described (Nicolardi et al., 2013a). It has been shown that not only the levels of apo-CIII but also the specific glycoforms and their relative expression control triglyceride metabolism (Yassine et al., 2015; Koska et al., 2016). For example, an inverse association between the apo-CIII 2/apo-CIII 1 ratio and triglyceride levels has been confirmed by two independent studies (Koska et al., 2016; Kegulian et al., 2019). It has also been shown that sialylation modulates the apo-CIII affinity for hepatic receptors that clear lipoprotein particles (Kegulian et al., 2019) and that different proteoforms of apo-CIII may affect the inhibition of LPL (Holleboom et al., 2011) and the interaction of LDL with the vascular wall (Hiukka et al., 2009). Since the association of different apo-CIII proteoforms with specific cardiometabolic endpoints has not been fully elucidated, further research in large sample cohorts is warranted.
We have developed a high-throughput method based on magnetic-bead extraction and matrix-assisted laser desorption/ionization (MALDI) and ultrahigh-resolution Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry (MS) for the analysis of serum apo-CIII proteoforms (Nicolardi et al., 2013a; Nicolardi et al., 2013b). Apo-CIII contains methionine residues, which can be (partially) oxidized during biological processes in vivo (Stadtman et al., 2003), sample processing and freeze-thaw cycles (Borges et al., 2014). The presence of different oxidoforms increases the complexity of mass spectra, which complicates MS data processing and affects the repeatability of measurements. Although analyte oxidation may not pose a serious challenge in MALDI MS analysis of single samples, it can seriously impact the precision and accuracy of quantitative measurements in large sample cohorts.
In the current study, we have applied a modified workflow employing a previously established MALDI FT-ICR MS method preceded by a chemical oxidation step for complete oxidation of apo-CIII methionine residues. This results in highly reproducible high-throughput measurements for relative quantification of apo-CIII proteoforms in a large number of plasma samples varying in protein oxidation levels. Furthermore, we have adopted sinapinic acid (SPA) as a MALDI matrix to minimize the loss of sialic acid induced by MALDI. The high-throughput quantitation software, MassyTools (Jansen et al., 2015), was here further developed to facilitate semi-automated MS data processing for intact proteins. The validity of the new workflow was tested on a clinical cohort comprised of 771 plasma samples, which allowed the evaluation of the relationship between apo-CIII glycoforms and metabolic biomarkers, such as BMI, cholesterol, and triglyceride levels.

Clinical Samples
Blood plasma samples from a group of individuals without diabetes of the DiaGene Study were used. The DiaGene Study is a case-control study comprising 1886 type-2 diabetes patients and 854 controls without diabetes, from the areas of Eindhoven and Veldhoven, in the Netherlands. The study is described in detail elsewhere (Van Herpt et al., 2017). For the current study, after quality control, apo-CIII glycosylation data were available for 771 samples; for 746 of these, data on clinical characteristics were also available. All participants gave their written informed consent. This study was approved by the Medical Ethics Committees of the Erasmus University Medical Center, Catharina Hospital and Maxima Medical Center.
Clinical information and blood samples were obtained at baseline, as described previously (Van Herpt et al., 2017). Triglyceride and cholesterol concentrations were measured using standard clinical chemistry assays and reported by the collecting clinic. Non-high-density lipoprotein (non-HDL) cholesterol was calculated by subtracting the high-density lipoprotein (HDL) cholesterol from the total cholesterol. Body mass index (BMI) was calculated by dividing the body mass (in kg) by the square of the body length (in m). Triglyceride concentrations were logarithmically transformed before linear regression analysis because of their non-normal distribution.
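The derived clinical variables described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the function names are hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the base of the transformation.

```python
import math


def bmi(mass_kg: float, height_m: float) -> float:
    """Body mass index: body mass (kg) divided by the squared body length (m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return mass_kg / height_m ** 2


def non_hdl_cholesterol(total_chol: float, hdl_chol: float) -> float:
    """Non-HDL cholesterol: total cholesterol minus HDL cholesterol (same units)."""
    return total_chol - hdl_chol


def log_triglycerides(tg: float) -> float:
    """Log-transform a triglyceride concentration before regression.

    Assumption: natural logarithm; the source does not specify the base.
    """
    if tg <= 0:
        raise ValueError("triglyceride concentration must be positive")
    return math.log(tg)
```

A change of logarithm base would only rescale the regression coefficients by a constant factor; it would not change the direction or significance of an association.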

High-Throughput RP-C18 Solid-Phase Extraction of Plasma Proteins
Plasma standards (VisuCon-F) were randomized over cohort sample plates. 10 µL of human blood plasma was transferred from the cohort sample plates into 96-well skirted PCR plates (4ti-0960/C, 4titude, Dorking, United Kingdom). 15 µL of an oxidizing solution (12% H2O2/0.5% TFA in water) was added to each sample. The plate was sealed with a pierce foil seal (4ti-0521, 4titude Ltd., Wotton, Surrey, United Kingdom) and incubated for 1 h at 37°C. Subsequently, the plate was cooled at 4°C for 30 min and centrifuged briefly at 800 × g. The pierce foil was removed and the plate was transferred onto a liquid handling robot (Hamilton, Bonaduz, Switzerland), where solid-phase extraction (SPE) was carried out as follows. The RP-C18 beads were activated by three washes using acetonitrile (ACN) and trifluoroacetic acid (TFA) solutions in water (a first wash using 50% ACN/0.1% TFA followed by two washes with 0.1% TFA). Next, plasma samples were transferred to the activated beads and incubated for 10 min at room temperature. The incubation was followed by three washes: one wash using 15% ACN and two washes with 0.1% TFA. Proteins were eluted by adding 15 µL of 50% ACN/0.1% TFA in water and incubating for 5 min at room temperature. For MALDI spotting, 2 µL of sample eluate was mixed with either 16 µL of sinapinic acid (SPA) solution (1.3 g/L in 2:1 v/v ethanol/acetone) or 15 µL of alpha-cyano-4-hydroxycinnamic acid (HCCA) solution (1.4 g/L in 2:1 v/v ethanol/acetone). 1.5 µL of each sample mix was spotted in duplicate onto a MALDI AnchorChip target plate (800 µm anchor diameter; Bruker Daltonics, Bremen, Germany) and allowed to air-dry before MALDI MS analysis.

MALDI FT-ICR Mass Spectrometry and MS Data Analysis
All MALDI MS experiments were performed on a 15 T solariX XR FT-ICR mass spectrometer (Bruker Daltonics) equipped with a Smartbeam II™ laser system (355 nm wavelength) and a ParaCell detector. All spectra were acquired in the m/z range 3495-30,000, from the average of ten scans of 200 laser shots (at 500 Hz) each, using 524,288 data points. The analyzer parameters were set as previously reported (Van Der Burgt et al., 2019). Briefly, measurements were performed with high trapping potentials (up to 8.5 V) and high ParaCell DC biases (up to 8.8 V) and with a sweep excitation power of 57% for 13.5 µs. A laser power of 20% and "medium" laser focus were used for MS measurements using HCCA, while a laser power of 30% and "ultra-large" focus were used with SPA. Details on MS data processing and statistical analysis can be found in the Supplementary Material (Additional experimental details: MS data processing and statistical analysis).

RESULTS AND DISCUSSION
The Controlled Oxidation of Methionine Residues Reduces the Complexity of Mass Spectra
A common event observed in proteomics studies is the oxidation of methionine residues due to biological and pathological processes occurring in vivo (Stadtman et al., 2003), sample storage and processing (Borges et al., 2014). These reactions are so common that, in bottom-up studies, methionine oxidation is often included in the database search as a variable or even fixed modification. However, the peptides generated by enzymatic digestion (e.g. using trypsin) are generally small and often do not contain methionine residues. Although (partial) oxidation of methionine residues increases the number of peptides in a digest, these unwanted reactions do not significantly affect the analysis (Lao et al., 2015; Hains and Robinson, 2017; Bettinger et al., 2020). In a clinical setting, the use of fresh samples may be an ideal approach, especially for diagnostic purposes based on profiling of intact proteins. In cohort studies, however, the collection of large numbers of clinical samples, their storage, transfer between institutions and multiple use may lead to oxidation processes that affect the analysis of intact proteins by increasing the heterogeneity of proteoforms detected in a spectrum. The higher complexity increases the chance of overlapping signals and reduces the sensitivity of the measurements because the signal is spread over a larger number of species. This results in mass spectra characterized by interfering species and very low-abundance analyte peaks, which do not meet the spectral quality criteria required for inclusion in quantitative statistical analysis.
Apo-CIII contains two methionine residues which can be oxidized to form methionine sulfoxide (MetO) and methionine sulfone (MetO 2), although the latter form requires harsher oxidizing conditions (Kim et al., 2015; Lim et al., 2019). Previously, MALDI-TOF MS methods have been used in analyses of apo-CIII proteoforms. In such low-resolution methods, apo-CIII oxidoforms cannot be resolved; however, their presence results in the broadening and distortion of the signals of apo-CIII proteoforms, which can eventually overlap or interfere with signals of other proteins, affecting their quantification. The chance of signal interference increases when SPE is used for the enrichment of apo-CIII, as it leads to the co-enrichment of other small plasma proteins. The application of more specific enrichment methods, such as immunocapture, may help to reduce signals interfering with the various apo-CIII oxidoforms, but was not implemented in the present study for reasons of simplicity. Of note, apo-CIII proteoforms have been analyzed by methods employing LC systems (Kailemia et al., 2018; Olivieri et al., 2018). Despite certain advantages over MALDI-TOF MS, such as absolute quantification and enhanced resolution, the throughput of this approach remains relatively low. Previously, we developed a method for the analysis of apo-CIII proteoforms using ultrahigh-resolution MALDI FT-ICR MS (Nicolardi et al., 2013a). Apo-CIII proteoforms were mainly detected as singly charged ions. Thus, apo-CIII oxidoforms (carrying one or two MetO residues) were detected at +15.995 Th and +31.990 Th from the non-oxidized forms (Figure 1). The degree of methionine oxidation of apo-CIII can vary greatly, but the complete oxidation of apo-CIII (i.e. 100% conversion of the two methionine residues to MetO) is not commonly observed (Supplementary Figure S1). Therefore, for each of the four major proteoforms of apo-CIII, two additional oxidoforms were observed in MALDI FT-ICR MS spectra, resulting in twelve proteoforms (Figure 1).
In addition to that, we were able to detect C-terminal alanine cleaved and fucosylated proteoforms (Supplementary Table S1).
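The arithmetic behind these numbers can be made explicit. The sketch below assumes only the monoisotopic mass of oxygen (about 15.9949 Da) and the fact, stated above, that the ions are singly charged; the function and variable names are hypothetical. Each conversion of methionine to MetO adds one oxygen atom, so the m/z shift is the number of added oxygens times the oxygen mass divided by the charge, and four glycoforms with zero, one or two MetO residues give twelve species.

```python
# Monoisotopic mass of 16O in Da (standard value, rounded to six decimals).
OXYGEN_MONOISOTOPIC = 15.994915

# The four major apo-CIII proteoforms named in the text.
GLYCOFORMS = ["apo-CIII 0a", "apo-CIII 0c", "apo-CIII 1", "apo-CIII 2"]


def oxidation_shift(n_oxygen: int, charge: int = 1) -> float:
    """m/z shift (in Th) caused by n_oxygen added oxygen atoms at a given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return n_oxygen * OXYGEN_MONOISOTOPIC / charge


def enumerate_oxidoforms(glycoforms, n_met: int = 2):
    """All (glycoform, number of MetO) pairs for a protein with n_met methionines."""
    return [(g, k) for g in glycoforms for k in range(n_met + 1)]


if __name__ == "__main__":
    for glycoform, n_meto in enumerate_oxidoforms(GLYCOFORMS):
        print(f"{glycoform} + {n_meto} MetO: +{oxidation_shift(n_meto):.3f} Th")
```

For a doubly charged ion the same oxidation would appear at half the spacing, which is one reason the charge state has to be known before oxidoforms are assigned.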
To reduce sample complexity, we included an oxidation step with hydrogen peroxide to perform a controlled oxidation of both apo-CIII methionine residues to MetO (Figure 1). While the implementation of the oxidation step added 2 h to the analysis workflow, it reduced the heterogeneity of the spectra and facilitated MS data processing using MassyTools software (see Implementation of MassyTools Software for High-Throughput MS Data Processing) (Jansen et al., 2015). The efficiency of the controlled oxidation was tested on 136 standard and 771 clinical plasma samples. The relative intensities of the non-, mono- and di-oxidized forms of apo-CIII 0a, apo-CIII 0c, apo-CIII 1 and apo-CIII 2 are reported in Supplementary Tables S2, S3. Oxidation rates over 90% were found for apo-CIII 1 and apo-CIII 2. Oxidation efficiency seemed to be lower for apo-CIII 0a and apo-CIII 0c; however, close inspection of the spectra revealed the presence of interfering species that contributed to the signal of the non- and mono-oxidized forms of apo-CIII 0a and apo-CIII 0c, thus increasing their apparent relative intensity (Supplementary Figure S2). Therefore, the controlled oxidation was considered efficient for all four proteoforms, providing consistent oxidation rates across standard and clinical plasma samples. These results supported our strategy of using only the signal of the di-oxidized apo-CIII proteoforms for further statistical analysis. The good efficiency and repeatability of the oxidation step thus allowed us to assess associations between apo-CIII glycosylation and different lipid markers on the basis of the di-oxidized forms alone.
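One plausible way to express the oxidation rate of a single proteoform is the share of its total signal that is found in the di-oxidized form. The sketch below is written under that assumption; the exact definition used for Supplementary Tables S2, S3 is not given in the main text, and the function name and example intensities are hypothetical.

```python
def oxidation_fractions(non_ox: float, mono_ox: float, di_ox: float) -> dict:
    """Relative intensities of the non-, mono- and di-oxidized forms of one proteoform.

    Assumption: each fraction is the intensity of that oxidoform divided by the
    summed intensity of all three oxidoforms of the same proteoform.
    """
    total = non_ox + mono_ox + di_ox
    if total <= 0:
        raise ValueError("summed intensity must be positive")
    return {"non": non_ox / total, "mono": mono_ox / total, "di": di_ox / total}


if __name__ == "__main__":
    # Invented intensities, for illustration only.
    fractions = oxidation_fractions(non_ox=2.0, mono_ox=6.0, di_ox=92.0)
    print(f"di-oxidized fraction: {fractions['di']:.0%}")
```

Under this definition, an interfering species that overlaps the non- or mono-oxidized peak inflates the denominator and lowers the apparent oxidation rate, which matches the behaviour described for apo-CIII 0a and apo-CIII 0c.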

Minimizing Sialic Acid Loss Using Sinapinic Acid as MALDI Matrix
In our previously reported ultrahigh-resolution MALDI FT-ICR MS method for the analysis of apo-CIII proteoforms, HCCA was used as a MALDI matrix (Nicolardi et al., 2013a; Nicolardi et al., 2013b). This compound was chosen to increase the sensitivity for other serum peptides and small proteins present in C18-SPE eluates obtained from the high-throughput enrichment step using magnetic beads. However, it is known that sialic acid loss can result from in-source decay fragmentation of glycan structures, even when these are linked to peptides and proteins. In fact, previous reports on the analysis of apo-CIII by MALDI-TOF MS were based on the use of a MALDI matrix colder than HCCA, namely SPA (Kegulian et al., 2019; Koska et al., 2016; Yassine et al., 2015). The use of SPA allowed us to minimize the loss of sialic acid, as evidenced by an increased relative intensity of both the mono- and the disialylated apo-CIII proteoforms, and led to reproducible apo-CIII glycosylation profiles (Figure 2).
TABLE 1 | Inter- and intra-plate variability. Relative peak intensities, standard deviation (SD) and coefficient of variation (CV) are given for the intra- and inter-plate variability based on 136 plasma standards.

Implementation of MassyTools Software for High-Throughput MS Data Processing
One of the advantages of using ultrahigh-resolution MS is that measurements at isotopic resolution provide more spectral information than broad-peak detection in linear-mode MALDI-TOF MS. Previously, we showed that the goodness of the observed isotopic distributions can be used as a quality control parameter for the selection of high-quality spectra generated from the analysis of a large cohort of samples (Nicolardi et al., 2010). This concept was then implemented in a more powerful software, namely MassyTools, developed for the high-throughput processing of MALDI mass spectra (Jansen et al., 2015). MassyTools allows the determination of a series of quality control parameters that can be used to curate MS data at different levels. Mass spectra with unacceptable internal calibration quality or low intensity were discarded first. Then, the quality of the signal of each apo-CIII proteoform was assessed using the signal-to-noise ratio (S/N) and mass measurement error (MME) determined for the most intense peak within an isotopic distribution. Additionally, the quality of this distribution (i.e. the isotopic pattern quality, IPQ) was taken into account. The distributions of these parameters over 136 standard and 771 clinical plasma samples are reported in Supplementary Figures S3, S4; Supplementary Table S5. The analytes passing the curation process were then used for statistical analysis.
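The analyte-level curation described above amounts to a per-peak filter on three quality parameters. The sketch below illustrates the idea only: it does not call MassyTools, the class and function names are hypothetical, the cut-off values are placeholders rather than the thresholds used in this study (those are given in the Supplementary Material), and it assumes that a lower IPQ value means a better match to the theoretical isotopic pattern.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeakQC:
    analyte: str     # e.g. "apo-CIII 1"
    s_n: float       # signal-to-noise ratio of the most intense isotopic peak
    mme_ppm: float   # mass measurement error in ppm (signed)
    ipq: float       # isotopic pattern quality; assumed: lower is better


def passes_curation(peak: PeakQC,
                    min_s_n: float = 9.0,           # placeholder threshold
                    max_abs_mme_ppm: float = 10.0,  # placeholder threshold
                    max_ipq: float = 0.25) -> bool: # placeholder threshold
    """Return True if the peak meets all three quality criteria."""
    return (peak.s_n >= min_s_n
            and abs(peak.mme_ppm) <= max_abs_mme_ppm
            and peak.ipq <= max_ipq)


def curate(peaks: List[PeakQC]) -> List[PeakQC]:
    """Keep only the analyte signals that pass curation."""
    return [p for p in peaks if passes_curation(p)]
```

Keeping the thresholds as function arguments makes it straightforward to test how sensitive the number of retained analytes is to each criterion.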
As assessed on 136 standard plasma samples, which were distributed over 17 MALDI target plates measured over 28 days, the method provided good repeatability for relative quantitation of all four proteoforms, with CVs in a range of 1-18% for average intra-plate and 6-16% for inter-plate variability (Table 1). While we reduced in-source decay by selecting SPA as a MALDI matrix, we expect that partial sialic acid loss from apo-CIII 1 during MS analysis may lead to a slight, artificial increase in the apo-CIII 0c glycoform abundance. Hence, fluctuations in the extent of sialic acid loss may contribute to the larger CVs for apo-CIII 0c.
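The coefficient of variation is the standard deviation divided by the mean, expressed as a percentage. The sketch below shows one common way to separate intra- and inter-plate variability and is an assumption about the procedure, not a description of it: the average intra-plate CV is taken as the mean of the per-plate CVs, and the inter-plate CV as the CV of the per-plate means. Function names and example values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def cv_percent(values: List[float]) -> float:
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / m


def intra_inter_plate_cv(plates: Dict[str, List[float]]) -> Tuple[float, float]:
    """Return (average intra-plate CV, inter-plate CV) for one proteoform.

    `plates` maps a plate identifier to the relative intensities measured for
    the plasma standards on that plate (at least two values per plate).
    """
    intra = mean(cv_percent(v) for v in plates.values())
    inter = cv_percent([mean(v) for v in plates.values()])
    return intra, inter


if __name__ == "__main__":
    # Invented relative intensities of one proteoform on three plates.
    example = {"plate 1": [0.50, 0.52, 0.49],
               "plate 2": [0.55, 0.53, 0.56],
               "plate 3": [0.48, 0.50, 0.47]}
    intra, inter = intra_inter_plate_cv(example)
    print(f"intra-plate CV: {intra:.1f}%, inter-plate CV: {inter:.1f}%")
```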

Associations Between Apo-CIII Sialylation and Lipid Markers
We used this approach to determine the non-glycosylated and the glycosylated non-sialylated, mono-sialylated and disialylated apo-CIII glycoforms within a cohort of 746 individuals without diabetes (cohort characteristics in Table 2) and to test their association with a range of metabolic biomarkers. We found that disialylated apo-CIII 2 was associated with overall improved lipid profiles and decreased BMI (Table 3; Supplementary Table S6), which is in accordance with some of the previous reports (Koska et al., 2016; Kegulian et al., 2019). A subgroup analysis in participants not using statins or fibrates did not change these associations (Supplementary Tables S7A,B).
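Relative quantitation of the four glycoforms can be illustrated as a total-signal normalization followed by the apo-CIII 2/apo-CIII 1 ratio discussed in the literature cited above. This is a sketch under assumptions: the normalization scheme actually used is described in the Supplementary Material, and the function names and intensities below are invented for illustration.

```python
from typing import Dict


def relative_abundances(intensities: Dict[str, float]) -> Dict[str, float]:
    """Scale the glycoform intensities of one sample so that they sum to 1."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("summed intensity must be positive")
    return {name: value / total for name, value in intensities.items()}


def sialylation_ratio(intensities: Dict[str, float]) -> float:
    """Ratio of disialylated to monosialylated apo-CIII (apo-CIII 2 / apo-CIII 1)."""
    return intensities["apo-CIII 2"] / intensities["apo-CIII 1"]


if __name__ == "__main__":
    # Invented di-oxidized signal intensities for a single sample.
    sample = {"apo-CIII 0a": 10.0, "apo-CIII 0c": 10.0,
              "apo-CIII 1": 50.0, "apo-CIII 2": 30.0}
    print(relative_abundances(sample))
    print(sialylation_ratio(sample))
```

Because relative abundances sum to one, an increase in one glycoform necessarily lowers the others, which should be kept in mind when associations of opposite sign are interpreted.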
So far, the inhibitory effect of apo-CIII on LPL has been linked to total apo-CIII concentration, but not to the relative proportion of apo-CIII glycoforms (Olivieri et al., 2018). Recent studies proposed that the presence of apo-CIII on triglyceride-rich lipoproteins (TRLs) alters the affinity between TRLs and their receptors in the liver. Kegulian et al. demonstrated in mice that the degree of apo-CIII sialylation directs TRLs to different hepatic clearance pathways (Kegulian et al., 2019). In detail, apo-CIII 1-enriched very low-density lipoproteins (VLDLs) are preferentially cleared by the faster-acting low-density lipoprotein (LDL) receptor (LDLR) and LDL receptor-related protein 1 (LRP1), whereas apo-CIII 2 directs VLDLs to syndecan 1 (SDC1) receptors, which are characterized by a slower but larger-capacity metabolism of TRLs. The same study also showed that a 13-week antisense oligonucleotide treatment for apo-CIII, which, as expected, reduced plasma triglyceride (TG) levels, also altered the relative abundances of these two glycoforms, leading to an increase of apo-CIII 2 and a decrease of apo-CIII 1. The increase of the apo-CIII 2/apo-CIII 1 ratio in response to the antisense oligonucleotide therapy was explained by the differing capacity and clearance speed of the hepatic TRL receptors. In support of this, we observed in our cohort study of individuals without diabetes a positive association of the relative abundance of the apo-CIII 1 glycoform with TG levels, and a negative association for apo-CIII 2 (Figure 3; Table 3).
Our study supports the hypothesis that triglyceride clearance may be regulated, or at least strongly influenced, by apo-CIII glycosylation, specifically sialylation. However, other aspects have to be considered. For instance, defective LDLR/LRP1-driven metabolic pathways might lead to decreased clearance of TGs. Expression and stability of LDLR and LRP1 in the liver might be affected by naturally occurring genetic variants (Oldoni et al., 2018; Paththinige et al., 2018; Reyes-Soffer et al., 2019; Xu et al., 2020). Moreover, it has been shown in mice that a high-fat diet can lead to down-regulated expression of hepatic LRP1 by causing hyperglycemia with a high level of plasma triglycerides (Kim et al., 2014). In humans, obesity is associated with increases in plasma triglycerides (Howard et al., 2003; Franssen et al., 2011). We hypothesized that the association of apo-CIII glycoforms with triglycerides could be confounded by BMI. Surprisingly, after adjustment for BMI, the direction of effect and goodness-of-fit did not evidently change (Supplementary Tables S8A,B). This indicates that the association of apo-CIII sialylation with triglycerides is largely independent of BMI, and that obesity-associated physiological changes are not what determines apo-CIII sialylation and its association with triglycerides.
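Adjusting for BMI means adding BMI as a second predictor to the linear model and checking whether the coefficient of the glycoform changes. The sketch below implements ordinary least squares from scratch via the normal equations so that it runs without third-party packages; it is an illustration under assumptions, not the statistical code of the study, whose models and covariates are detailed in the Supplementary Material. The function name is hypothetical.

```python
from typing import List, Sequence


def ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares with an intercept.

    X holds one row of predictor values per individual. The function returns
    [intercept, b1, b2, ...] by solving the normal equations (X'X) b = X'y
    with Gauss-Jordan elimination and partial pivoting.
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    for c in range(p):
        pivot_row = max(range(c, p), key=lambda k: abs(A[k][c]))
        if abs(A[pivot_row][c]) < 1e-12:
            raise ValueError("predictors are collinear; model is not identifiable")
        A[c], A[pivot_row] = A[pivot_row], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for k in range(p):
            if k != c:
                factor = A[k][c]
                A[k] = [vk - factor * vc for vk, vc in zip(A[k], A[c])]
    return [A[i][p] for i in range(p)]


if __name__ == "__main__":
    # Invented data: relative abundance of a glycoform, BMI, and log(TG).
    glyco = [0.40, 0.45, 0.50, 0.55, 0.60, 0.52]
    bmi = [22.0, 27.0, 24.0, 30.0, 26.0, 23.0]
    log_tg = [0.10, 0.25, 0.30, 0.52, 0.55, 0.33]
    unadjusted = ols([[g] for g in glyco], log_tg)
    adjusted = ols([[g, b] for g, b in zip(glyco, bmi)], log_tg)
    print("glycoform coefficient, unadjusted:", unadjusted[1])
    print("glycoform coefficient, BMI-adjusted:", adjusted[1])
```

If the glycoform coefficient keeps its sign and approximate size after BMI is added, BMI is unlikely to be a major confounder of the association, which is the pattern reported above.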
Expression levels of apo-CIII were not investigated in this study. The differences in apo-CIII glycosylation profiles observed between individuals may be caused by varying expression levels of apo-CIII (Olivieri et al., 2018) or apo-CIII glycoforms (Holleboom et al., 2011), or the accumulation of certain glycoforms due to dysfunctional clearance pathways, based on recent findings by Kegulian et al. (Kegulian et al., 2019). It may also be a combination of the listed factors, which should be explored in further research. Nevertheless, from the results of this study, we cannot determine whether apo-CIII sialylation influences triglyceride levels, or vice versa. Further studies are needed to elucidate the genetic and environmental factors that determine apo-CIII sialylation in health and disease.

CONCLUSION
Apo-CIII is a novel potential drug target in the management of cardiovascular disease, an interest driven by multiple studies demonstrating that plasma levels of apo-CIII are predictive of coronary heart disease and the risk of disease-related events (Borén et al., 2020). Previously, it has been shown that sialylated apo-CIII glycoforms are differentially cleared by hepatic receptors and that a higher apo-CIII 2/apo-CIII 1 ratio is associated with improved triglyceride levels (Kegulian et al., 2019). In humans, the production rates of these two glycoforms are comparable (Mauger et al., 2006); therefore, varying apo-CIII 2/apo-CIII 1 ratios between individuals in healthy and disease groups might suggest various dysfunctional mechanisms involved in their production and clearance. This is the first large-scale study of apo-CIII glycosylation by ultrahigh-resolution mass spectrometry. Clinical cohort studies employing large numbers of individuals will provide more insight into this topic, and the development of highly robust and accurate analytical methods enabling such large-scale studies is warranted.
Here, we present a workflow for high-throughput MALDI FT-ICR MS analysis of apo-CIII glycosylation in human plasma samples varying in protein oxidation levels. The controlled oxidation of apo-CIII methionine residues, the use of sinapinic acid as a MALDI matrix, and the use of MassyTools software for semi-automated, standardized spectra processing have been implemented to achieve highly repeatable measurements of intact apo-CIII proteoforms. The new analytical workflow allowed us to overcome the problem of the high spectral heterogeneity produced by methionine oxidation, thus allowing the robust screening of a large cohort of plasma samples for the relative quantitation of apo-CIII proteoforms. Importantly, the evaluation of quality parameters derived from the mass spectra was implemented to minimize biases and ensure the accuracy of the collected data. The cohort analysis confirmed that the level of apo-CIII sialylation is strongly associated with lipid biomarkers, especially with triglyceride levels. The relation between relative abundances of apo-CIII glycoforms and cardiovascular disease development should be further explored. More insight into the role of apo-CIII glycosylation in disease pathophysiology could provide new drug targets. Also, the understanding of the mechanisms of existing drugs might improve when apo-CIII glycosylation is considered. The methods presented will enable such large-scale studies.
Frontiers in Chemistry | www.frontiersin.org | May 2021 | Volume 9 | Article 678883

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Ethics Committees of the Erasmus University Medical Center, Catharina Hospital and Maxima Medical Center. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
DD carried out all experiments and sample cohort analysis, partially assisted by SN, MB, and JN. DD and VD performed MS data processing. BJ provided a modified MassyTools script. AN performed association analysis assisted by DD. MW, MH and VD designed and supervised the study. ES and MH provided the study sample cohort. DD, SN and MW prepared the manuscript with significant contributions from all authors.

FUNDING
This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 722095.