Genome-wide association study of blood lipid levels in Southern Han Chinese adults with prediabetes

Background: Dyslipidemia is highly prevalent among individuals with prediabetes, further exacerbating their cardiovascular risk. However, the genetic determinants underlying diabetic dyslipidemia in Southern Han Chinese remain largely unexplored. Methods: We performed a genome-wide association study (GWAS) of blood lipid traits in 451 Southern Han Chinese adults with prediabetes. Fasting plasma lipids, including triglycerides (TG), total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C), were assayed. Genotyping was conducted using the Precision Medicine Diversity Array and Gene Titan platform, followed by genotype imputation using IMPUTE2 with the 1000 Genomes Project (Phase 3, Southern Han Chinese) as reference. Single nucleotide polymorphisms (SNPs) associated with lipid levels were identified using mixed linear regression, with adjustment for covariates. Results: We identified 58, 215, 74, and 81 novel SNPs associated with TG, TC, HDL-C, and LDL-C levels, respectively (P < 5 × 10⁻⁵). Several implicated loci were located in or near genes involved in lipid metabolism, including SRD5A2, PCSK7, PITPNC1, IRX3, BPI, and LBP. Pathway enrichment analysis highlighted lipid metabolism and insulin secretion. Conclusion: This first GWAS of dyslipidemia in Southern Han Chinese with prediabetes identified novel genetic variants associated with lipid traits. Our findings provide new insights into genetic mechanisms underlying heightened cardiovascular risk in the prediabetic stage. Functional characterization of implicated loci is warranted.


Introduction
Diabetes mellitus is a prevalent chronic metabolic disease characterized by elevated levels of blood glucose (1). In 2021, 529 million people worldwide were living with diabetes mellitus, which remains a substantial public health issue (2). The global burden of prediabetes is also substantial and growing (3). Prediabetes confers not only a heightened risk of developing diabetes mellitus, but also an independent risk of cardiovascular disease (4,5). Compared with healthy normoglycemic individuals, patients with prediabetes exhibit a greater propensity for atherogenic dyslipidemia, marked by elevated triglycerides (TG), total cholesterol (TC), and low-density lipoprotein cholesterol (LDL-C), alongside decreased high-density lipoprotein cholesterol (HDL-C) (6). This lipid profile significantly promotes atherosclerotic cardiovascular disease in the prediabetic population (7). Elucidating the pathogenic mechanisms underlying dyslipidemia in prediabetes is therefore imperative to inform strategies for complication prevention and management.
Lipid metabolism is a complex physiological process under polygenic regulation. While genome-wide association studies (GWASs) have uncovered several lipid-related loci in normoglycemic (8-11) and diabetes mellitus populations (12), focused genetic analyses in prediabetic cohorts remain scarce. Given the pronounced dyslipidemia seen in prediabetes, studying the genetic characteristics of blood lipids in this population may reveal variants specific to it. Such variants can provide insights into disease susceptibility, aid risk stratification, and guide targeted interventions to mitigate cardiovascular risk.
In this study, we performed a GWAS of lipid levels in 451 Han Chinese adults with prediabetes using the Precision Medicine Diversity Array. We aimed to identify novel loci associated with TG, TC, HDL-C, and LDL-C that may be specific to the prediabetic state. Significant variants were also functionally annotated. Our findings can facilitate personalized dyslipidemia management and cardiovascular risk assessment in this high-risk population.

Study design and participants
To identify genetic loci associated with lipid traits, including TG, TC, HDL-C, and LDL-C, in the context of prediabetes among Southern Han Chinese, we conducted a GWAS (Figure 1). A total of 451 patients with prediabetes were recruited from the National Diabetes Prevalence Survey conducted by the Chinese Medical Association in 2017. Prediabetes was defined according to standard diagnostic criteria: 5.6 mmol/L ≤ fasting plasma glucose (FPG) < 7.0 mmol/L, or 7.8 mmol/L ≤ 2-hour postprandial glucose (2hPG) < 11.1 mmol/L, or 5.7% ≤ hemoglobin A1C (HbA1C) ≤ 6.4% (13,14). All participants were at least 18 years old and had resided in the selected community for at least five years. Exclusion criteria were pregnancy, severe illness (e.g., cancer, kidney disease, acute infections), and cognitive impairment. Blood lipid levels were measured in a central laboratory using a Roche Modular autoanalyzer (Roche Diagnostics, Indianapolis, IN, USA).
This study was approved by the Ethical Committee of the Hainan Affiliated Hospital of Hainan Medical University (Med-Eth-Re (2019) 18) and adhered to ethical standards set forth by the committee and the Declaration of Helsinki. All participants provided written informed consent after being fully informed of the study's purpose.
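The diagnostic thresholds above can be expressed as a small screening function. This is a minimal sketch rather than the study's actual recruitment procedure: the function name and argument names are hypothetical, and the rule that any value in the diabetic range (FPG ≥ 7.0 mmol/L, 2hPG ≥ 11.1 mmol/L, or HbA1C ≥ 6.5%) rules out prediabetes is an assumption that the text does not state explicitly.

```python
from typing import Optional


def is_prediabetes(fpg: Optional[float] = None,
                   two_h_pg: Optional[float] = None,
                   hba1c: Optional[float] = None) -> bool:
    """Apply the prediabetes ranges quoted in the text.

    fpg and two_h_pg are in mmol/L, hba1c is in percent.
    None means the measurement is unavailable.
    """
    # Assumption (not stated in the text): a value in the diabetic range
    # on any available measure excludes the participant.
    if fpg is not None and fpg >= 7.0:
        return False
    if two_h_pg is not None and two_h_pg >= 11.1:
        return False
    if hba1c is not None and hba1c >= 6.5:
        return False

    # Ranges taken from the study definition; any single one suffices.
    fpg_hit = fpg is not None and 5.6 <= fpg < 7.0
    pg_hit = two_h_pg is not None and 7.8 <= two_h_pg < 11.1
    a1c_hit = hba1c is not None and 5.7 <= hba1c <= 6.4
    return fpg_hit or pg_hit or a1c_hit
```

For example, `is_prediabetes(fpg=6.0)` is true, whereas a participant with FPG 6.0 mmol/L and 2hPG 11.1 mmol/L is rejected under the stated assumption.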

DNA extraction and genotyping
Fasting peripheral blood (5 mL) was collected from each participant in ethylenediaminetetraacetic acid (EDTA) tubes and stored at −20°C. Genomic DNA was extracted from whole blood using the GoldMag-Mini Whole Blood Genomic DNA Purification Kit (GoldMag Co. Ltd., Xi'an, China) per manufacturer's protocol. DNA concentration and purity were

FIGURE 1 The flow chart of this study design.

Genotype imputation and quality control
Additional SNPs were imputed using IMPUTE2 software and the 1000 Genomes Project Phase III (Southern Han Chinese) reference panel, increasing genomic coverage to 9,378,219 SNPs. Post-imputation quality control removed non-biallelic variants and SNPs with imputation quality score ≤ 0.4, call rate < 90%, or HWE p-value < 5 × 10⁻⁶. The final imputed dataset contained 1,752,717 SNPs.
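The post-imputation filters can be summarized as a single predicate over per-variant summary statistics. The sketch below is illustrative only: the `VariantStats` record and its field names are hypothetical, and it assumes the imputation quality score, call rate, and HWE p-value have already been computed by upstream tools.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Hypothetical per-variant summary; field names are illustrative."""
    snp_id: str
    n_alleles: int    # 2 for a biallelic variant
    info: float       # imputation quality score
    call_rate: float  # fraction of samples with a genotype call
    hwe_p: float      # Hardy-Weinberg equilibrium p-value


def passes_qc(v: VariantStats,
              min_info: float = 0.4,
              min_call_rate: float = 0.90,
              min_hwe_p: float = 5e-6) -> bool:
    """Keep a variant only if it survives every filter named in the text."""
    return (
        v.n_alleles == 2                  # drop non-biallelic variants
        and v.info > min_info             # drop info <= 0.4
        and v.call_rate >= min_call_rate  # drop call rate < 90%
        and v.hwe_p >= min_hwe_p          # drop HWE p < 5e-6
    )


def filter_variants(variants):
    """Return the subset of variants that pass quality control."""
    return [v for v in variants if passes_qc(v)]
```

Note that the boundary cases follow the wording of the filters: a score of exactly 0.4 is removed, whereas a call rate of exactly 90% is retained.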

Statistical analysis
Extreme outlier values for lipid traits were removed based on the 3σ principle. Lipid indicators were then standardized using the RNOmni package in R software. Mixed linear regression analysis was performed to identify genetic variants associated with lipid traits (TG, TC, HDL-C, LDL-C) under an additive model with adjustment for age, gender, and smoking using Golden Helix SNP & Variation Suite software (version 8.7). Genome-wide significance was defined as P < 5 × 10⁻⁵. Pearson correlation analysis was conducted in SPSS 20.0 (IBM Corp, Armonk, NY) to assess associations between lipid indicators, with p < 0.05 considered statistically significant.
The lead SNPs tagged distinct association signals in or near genes with known roles in lipid metabolism and transport, such as SRD5A2, PCSK7, SNAP25, PITPNC1, IRX3, BPI, and LBP. Meanwhile, KEGG enrichment analysis revealed that the genes mapped by the lead SNPs were enriched in several key pathways, including ether lipid metabolism, insulin secretion, Wnt signaling pathway, cardiac muscle contraction, MAPK signaling pathway, fatty acid metabolism, nitrogen metabolism, and metabolic pathways (Figure 3). In addition, we conducted interaction analysis and Gene Ontology (GO) enrichment analysis on the genes (SRD5A2, MICAL2, SORCS2, and MATN1) significantly associated with lipid traits in our GWAS results. The interaction analysis using the STRING database identified several potential protein-protein interactions among the lipid-associated genes (Figure 4). The GO enrichment analysis revealed significant enrichment of several GO terms, including regulation of hormone levels, exocytosis, nerve growth factor signaling pathway, and extracellular matrix organization (Figure 5; Supplementary Table 5). In the next stage, we will follow up these SNPs in larger cohorts and perform functional characterization to elucidate the mechanisms underlying the genetic associations with lipid levels.
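The two preprocessing steps can be illustrated as follows. This is a sketch under stated assumptions, not the study's code: the analysis was run in R, and it is assumed here that the standardization was the rank-based inverse normal transformation provided by RNOmni, with the Blom offset k = 0.375 and average ranks for ties. The function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def remove_3sigma_outliers(values):
    """Drop values more than three sample standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= 3 * s]


def rank_inverse_normal(values, k=0.375):
    """Rank-based inverse normal transform: z = Phi^-1((r - k) / (n - 2k + 1)).

    Tied values receive the average of their ranks (assumed tie handling).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - k) / (n - 2 * k + 1)) for r in ranks]
```

The transform depends only on ranks, so the resulting trait is approximately standard normal regardless of the skew in the raw lipid values, which is why it is often applied before linear association testing.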
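To make the enrichment statistics concrete, the sketch below computes the GeneRatio described in the KEGG figure legend together with a one-sided hypergeometric p-value. The source does not state which tool or test produced the enrichment results, so the hypergeometric test is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb


def gene_ratio(k: int, n: int) -> float:
    """Genes from the input list found in the pathway, divided by list size."""
    return k / n


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes in the pathway,
    n: genes in the input list, k: input genes found in the pathway.
    """
    upper = min(n, K)
    # Sum the upper tail of the hypergeometric distribution exactly.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

As a toy case, with a background of 10 genes of which 3 belong to a pathway, an input list of 4 genes containing 2 pathway members has a GeneRatio of 0.5 and an enrichment p-value of 70/210, or about 0.33.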

Discussion
Our GWAS identified numerous novel SNPs significantly associated with lipid traits, including 58 SNPs for TG, 215 SNPs for TC, 74 SNPs for HDL-C, and 81 SNPs for LDL-C, in a Chinese Han prediabetic population. Several lead SNPs were located in or near genes, such as SRD5A2, PCSK7, PITPNC1, IRX3, BPI, and LBP, that previous studies have shown to play a role in lipid metabolism (15-19). Meanwhile, enrichment analysis revealed that the genes in or near the lead SNPs were significantly enriched in the ether lipid metabolism and insulin secretion pathways, and in GO terms including regulation of hormone levels and exocytosis.
Most individuals go through a prediabetes stage before progressing to full-blown diabetes. Recent research indicates that the long-term complications of diabetes already manifest in individuals with prediabetes. These complications include cardiovascular disease, which is a leading cause of death in diabetes patients (20). It is well established that dyslipidemia, including elevated TG levels and altered cholesterol profiles, plays a significant role in the development of cardiovascular disease. For instance, plasma TG levels of ≥2 mmol/L have been found to be associated with a twofold increase in mortality from cardiovascular disease (21). In diabetic patients, elevated TC levels have been linked to an increased risk of cardiovascular disease (22). Increases in LDL-C levels have also been associated with a higher risk of myocardial infarction (23). Additionally, there is a U-shaped association between HDL-C levels and the risk of cardiovascular disease mortality, with a potential interaction between HDL-C levels and glycemic status (24). Therefore, by conducting a GWAS to identify the genetic variations associated with lipid levels in individuals with prediabetes, we can gain insights into the genetic basis of lipid metabolism in prediabetes patients.
Notably, our study identified six SNPs (rs10207755, rs522638, rs523349, rs632148, rs534999, and rs558803) in SRD5A2 that showed associations with TG levels in individuals with prediabetes. The SRD5A2 gene encodes steroid 5-alpha reductase 2 and has been shown to suppress lipogenesis by inhibiting cortisol effects (15). The SRD5A2 variant rs523349 was previously linked with a higher risk of progression and death in patients with metastatic prostate cancer (25). Additionally, SNPs in the SIK3-PAFAH1B2 intergenic region and PCSK7, along with rs6032829 in SNAP25, were associated with TG. These genes have demonstrated associations with TG levels in prior genome-wide studies in a Korean population (26). Meanwhile, significant associations have been detected between the SNAP25 polymorphism rs363050 and higher fasting glucose and HbA1c and lower insulin in patients with type 2 diabetes mellitus (T2DM) (27). WWOX rs17797882 was found to be significantly associated with an increased level of TG in T2DM in the Han Chinese population (28). However, there is no overlap between the genomic loci in our study and those in the earlier investigation. This discrepancy may be attributed to the distinct focus of our study on prediabetes, whereas the previous research specifically addressed T2DM. The differences in the studied populations and conditions highlight the need for targeted investigations tailored to the specific stages and conditions within the spectrum of diabetes-related disorders.
We observed that the PITPNC1 SNP rs2011941 was associated with TC levels in individuals with prediabetes. PITPNC1, a member of the phosphatidylinositol transfer protein family, promotes the thermogenesis of brown adipose tissue (17). In our study, we also identified an intergenic SNP (rs9929651) between IRX3 and CRNDE, as well as multiple SORCS2 SNPs, that showed significant associations with HDL-C levels in individuals with prediabetes. Previous studies have reported a negative correlation between IRX3 expression levels and serum TC, TG, and LDL-C levels (18). Additionally, SORCS2 has been found to facilitate the release of endostatin from astrocytes and regulate post-stroke angiogenesis (29). A previous study also observed that PSMD6 rs831571 was significantly associated with decreased HDL-C, while WWOX rs17797882 was significantly associated with an increased level of HDL-C in T2DM in the Han Chinese population (28). These findings highlight the importance of considering specific genes and pathways in the context of prediabetes and T2DM. However, additional studies are needed to validate these associations and elucidate the underlying mechanisms involved.
Furthermore, several SNPs were significantly associated with LDL-C levels, including rs7254892 in NECTIN2, rs1126757 in IL11, rs547687426 in BPI, rs186778434 in LBP, and rs13045621 near MAFB. These genes have established functional roles in regulating vascular inflammation, fibrosis, and lipid metabolism based on findings from recent studies. For example, recent studies have revealed pro-angiogenic, anti-apoptotic, and anti-inflammatory properties of NTN1, as well as a role in preventing vascular dysfunction in diabetes (30). The NECTIN2 cell adhesion molecule has been shown to participate in lipid metabolism and serves as a potential biomarker for disease progression in carotid artery stenosis (31). Additionally, the cytokines IL6 and IL11 are upregulated under glucotoxic conditions, and IL6/IL11-mediated islet fibrosis may contribute to dysfunction in T2DM (32). Furthermore, the BPI-like proteins LBP and BPI are involved in lipid homeostasis and have been implicated in cardiovascular pathogenesis (19). However, the functional effects of many other lead SNPs located in or near novel candidate genes still need to be explored and validated experimentally. Elucidating the mechanisms by which these uncharacterized genes influence lipid traits will provide new insights into dyslipidemia in the prediabetic state.

FIGURE 3 Scatter plot of KEGG enrichment of genes in or near which lead SNPs are located. (A) Genes associated with TG levels. (B) Genes associated with TC levels. (C) Genes associated with HDL-C levels. (D) Genes associated with LDL-C levels. GeneRatio is the number of genes enriched in the KEGG entry divided by the total number of given genes.
In our study, we found that genes significantly linked to lipid levels in prediabetes are involved in regulating various pathways, including ether lipid metabolism, steroid hormone biosynthesis, cell adhesion molecules, Wnt signaling pathway, fatty acid biosynthesis, MAPK signaling pathway, fatty acid metabolism, and metabolic pathways. Additionally, our GO enrichment analysis showed that proteins interacting with SRD5A2, MICAL2, and MATN1 may be engaged in hormone and steroid metabolism, regulation of exocytosis and vesicle-mediated transport, and extracellular structure organization, respectively. These findings suggest that the genes associated with these pathways play an important role in regulating lipid levels in individuals with prediabetes. Further study of the mechanisms governing these pathways may help uncover potential therapeutic targets and elucidate the molecular mechanisms underlying disease development, providing a basis for further research into the molecular regulation of prediabetes.
This study has several limitations. First, the sample size was small, with only 451 patients, which reduces power to detect SNPs with minor effects. Expanding the cohort would improve statistical power. Second, only Southern Han Chinese were included, limiting generalizability to all Chinese individuals with prediabetes. Multi-center studies across diverse populations are needed. Third, environmental and lifestyle factors, which could influence blood lipids, were not well controlled. Future studies should collect detailed lifestyle data and consider gene-gene and gene-environment interactions. Fourth, as discovery research, only preliminary SNP screening was done, without functional validation of novel hits. Experimental studies are needed to verify biological roles. Fifth, this study did not identify SNPs related to dyslipidemia itself in prediabetes. Further research will stratify patients with prediabetes according to lipid levels to identify such SNPs. Finally, comparison with healthy controls was lacking; thus, it remains unclear whether the identified SNPs are specific to prediabetes. Comparisons with normoglycemic cohorts would aid in elucidating pathogenic mechanisms underlying dyslipidemia. Addressing these limitations in follow-up studies will provide more robust insights into the genetic architecture of lipid abnormalities in prediabetes.

Conclusions
This GWAS provides confirmatory evidence for associations of multiple lipid loci (58 SNPs for TG, 215 SNPs for TC, 74 SNPs for HDL-C, and 81 SNPs for LDL-C) in a prediabetic Southern Han Chinese population, while also uncovering novel SNP associations warranting further functional characterization. The findings yield insights into the genetic mechanisms of dyslipidemia in prediabetes.

FIGURE 2 Manhattan plots and quantile-quantile plots of association of lipid traits in the GWAS. Chromosomes are shown on the x-axis, and the −log10 of the p-value on the y-axis. The red line represents the genome-wide significance cutoff of 5 × 10⁻⁵. TC, total cholesterol; TG, triglyceride; HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol; QQ, quantile-quantile.

FIGURE 4 Protein-protein interaction (PPI) networks. (A) PPI network diagram of interaction with MICAL2 protein. (B) PPI network diagram of interaction with SRD5A2 protein. (C) PPI network diagram of interaction with SORCS2 protein. (D) PPI network diagram of interaction with MATN1 protein.

TABLE 1 Basic characteristics of participants.

TABLE 2 Lead SNPs associated with lipid traits in the GWAS.