Interaction of energy and sulfur microbial diet and smoking status with polygenic variants associated with lipoprotein metabolism

Introduction Hypo-high-density lipoprotein cholesterolemia (hypo-HDL-C) contributes to the development of cardiovascular diseases. The hypothesis that the polygenic variants associated with hypo-HDL-C interact with lifestyle factors was examined in 58,701 middle-aged Korean adults who participated in the Korean Genome and Epidemiology Study (KoGES). Methods Participants were categorized into the Low-HDL (case; n = 16,980) and Normal-HDL (n = 41,721) groups. The participants in the Low-HDL group were selected using the guideline-based cutoffs for hypo-HDL-C (<40 mg/dL for men and <50 mg/dL for women) and included those taking medication for dyslipidemia. The genes associated with hypo-HDL-C were determined through a genome-wide association study (GWAS) in a city hospital-based cohort, and the results were validated in the Ansan/Ansung study. The genetic variants for the single nucleotide polymorphism (SNP)-SNP interaction were selected using a generalized multifactor dimensionality reduction analysis, and the polygenic risk score (PRS) generated was evaluated for interaction with lifestyle parameters. Results The participants with hypo-HDL-C showed a 1.45- and 1.36-fold higher association with myocardial infarction and stroke, respectively. The High-PRS with four SNPs, namely ZPR1_rs3741297, CETP_rs708272, BUD13_rs180327, and ALDH1A2_rs588136, and that with the 11q23.3 haplotype were positively associated with hypo-HDL-C by about 3 times, which was a 2.4-fold higher association than the PRS of 24 SNPs with p < 5×10⁻⁸. The risk alleles of CETP_rs708272 and ALDH1A2_rs588136 were linked to increased expression in the heart and decreased expression in the brain, respectively. The selected SNPs were linked to the reverse cholesterol transport pathway, triglyceride-rich lipoprotein particle remodeling pathway, cholesterol storage, and macrophage-derived foam cell differentiation regulation. The PRS of the 4-SNP model interacted with energy intake and smoking status, while that of the haplotype interacted with the glycemic index of the diet, sulfur microbial diet, and smoking status. Discussion Adults with a genetic risk for hypo-HDL-C need to modulate their diet and smoking status to reduce their risk.


Introduction
Dyslipidemia, defined as an imbalance of lipids including cholesterol, low-density lipoprotein cholesterol (LDL-C), triglycerides (TG), and high-density lipoprotein cholesterol (HDL-C), is a risk factor for cardiovascular disease (CVD) worldwide (1). The global burden of dyslipidemias has increased over the last 30 years (2). In Korea, the incidence has risen from 9.0% in 2007 to 20.7% in 2018 (3). Dyslipidemia is highly prevalent in patients with metabolic syndrome (MetS) and type 2 diabetes (T2D), and this combination further increases the CVD risk (2). Hypo-HDL-cholesterolemia (hypo-HDL-C), hyper-LDL-cholesterolemia (hyper-LDL-C), and hypertriglyceridemia are influenced by a varied set of genetic and environmental factors (4). Therefore, the genetic factors and their interaction with lifestyle factors that influence hypo-HDL-C development may differ among Asians.
Cholesterol and triglycerides are emulsified with proteins and phospholipids to form lipoproteins that serve as vehicles for transporting cholesterol throughout the body. Under normal circumstances, LDL-C transports cholesterol into the peripheral tissues. However, when LDL-C is in excess, it can deposit cholesterol into the blood vessels, especially arteries, leading to a series of inflammatory changes in the vessel wall, resulting in CVD. HDL-C is synthesized in the liver and small intestines as 'nascent' HDL and contains a variety of lipid-binding proteins called apolipoproteins (Apo), including ApoA1, ApoA2, ApoA4, ApoA5, ApoC1, ApoC2, ApoC3, and ApoE. 'Nascent' HDL is secreted into the bloodstream via the transporter ATP Binding Cassette Subfamily A Member 1 (ABCA1) (5). The Apo proteins play specific roles in the reverse cholesterol transport function of HDL (5). ApoA1 and ApoC1 activate the enzyme lecithin-cholesterol acyl transferase (LCAT), and ApoA5, ApoC2, and ApoC3 modulate lipoprotein lipase (LPL) activity. The liver can also take up HDL-C through ApoE, although it is a well-known ligand for the LDL receptor. The 'mature' HDL-C mediates cholesterol transport from non-hepatic tissues, especially arteries, to the liver. The returned cholesterol is metabolized and excreted through the bile (5). Therefore, reverse cholesterol transport via apolipoproteins in HDL-C is essential to reduce plaque formation in the blood vessels.
Plaque formation occurs in the blood vessels with the accumulation of LDL-C and its subsequent oxidation, followed by the inflammatory process of recruitment of monocytes-macrophages, uptake of oxidized LDL-C, and transformation of macrophages into foam cells. On the other hand, HDL-C particles protect against plaque formation by removing lipid buildup from the vessels and preventing inflammation (6). In persons without a history of CVD, serum HDL-C concentration is inversely associated with future CVD risk, especially atherosclerosis. However, the inverse association may not apply to all patients with metabolic disorders and a history of CVD (7). Furthermore, serum HDL-C concentration may not always accurately represent the benefits of HDL function. This suggests that the functionality of HDL in reverse cholesterol transport can be influenced differently by various HDL subclasses with distinct particle sizes and compositions rather than solely by serum HDL concentration (8). Nevertheless, the guidelines for CVD recommend that serum HDL-C be maintained at a high level through lifestyle modifications.
Since genetic backgrounds significantly influence serum HDL-C, the interaction of genetic variants with lifestyle factors should be considered to modulate HDL-C. Several genetic variants have been reported to be associated with hypo-HDL-C. These include variants associated with the following genes: low-density lipoprotein receptor (LDLR), proprotein convertase subtilisin-kexin type 7 (PCSK7), APOA5, SID1 transmembrane family member 2 (SIDT2), ABCA1, cholesterol ester transfer protein (CETP), APOA1, tyrosine-protein phosphatase non-receptor type 11 (PTPN11), rabphilin 3A (RPH3A), and oligoadenylate synthetase 3 (OAS3) (9)(10)(11). However, studies on the role of single genetic variants that affect hypo-HDL-C have been conducted only with small sample sizes. A few studies have attempted to explore genetic variants and their interactions to evaluate the role of the polygenic risk score (PRS) in HDL-C function and the PRS interaction with lifestyle factors to influence hypo-HDL-C. We hypothesized that the polygenic variants associated with hypo-HDL-C interacted with lifestyle factors. The hypothesis was examined in 58,701 middle-aged Korean adults who participated in the Korean Genome and Epidemiology Study (KoGES) and validated in 13,598 adults in the combined regional and rural cohorts. The results can be used to modulate lifestyle factors to prevent hypo-HDL-C in genetically susceptible adults at risk of CVD.

Participants and setting
The KoGES aimed to establish a scientific basis for the implementation of customized treatments and preventive medicine by identifying risk factors for chronic diseases common among Koreans. Among several cohorts in KoGES, a large city hospital-based cohort (n = 58,701) and the Ansan/Ansung plus rural cohorts (n = 13,598) included the measurement of genetic variants, and their volunteers were used as the participants in the present study. The Ansan/Ansung plus rural cohorts were used as a replicate study for the genetic result. The participants aged 40-74 years were recruited during the years 2010-2014 (12). The institutional review boards (IRB) of the Korea National Institute of Health and Hoseo University approved the KoGES and the present study (KBP-2015-055 and 1041231-150811-HR-034-01, respectively). All participants signed written informed consent.

Demographic, anthropometric, and biochemical parameters of the participants
On their initial visit to the hospital, the participants filled out survey forms for demographic information and lifestyles. Gender, education (<, =, or > high school), income (<monthly 2,000, 3,000, 4,000, or over 5,000 USD), physical activity, and smoking history were collected as the categorical variables. Alcohol and coffee consumption was recorded during a health interview. Current and past smokers were defined as smoking at least 20 cigarettes in the past six months and not smoking for at least the past six months, respectively (13). Daily alcohol and coffee intakes were calculated by multiplying the frequency of consumption by the amount consumed at one time (13). Regular physical activity was defined as more than 30 min of moderate physical activity for three or more days per week. Height and weight were measured as described previously (12). Body fat and skeletal muscle masses were estimated using a prediction model generated by a machine learning algorithm from the Ansan/Ansung cohort based on the measurements made using the Inbody 3.0 (Cheonan, Korea) equipment, which uses the bioelectric impedance analysis method (14). The skeletal muscle index (SMI) was calculated by dividing the appendicular skeletal muscle mass (ASM) by height squared. Insulin resistance was calculated with the homeostatic model assessment for insulin resistance (HOMA-IR), and it was also predicted using a prediction model made from a machine learning approach (15). Systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured three times with a sphygmomanometer under resting conditions, and the measurements were averaged. After fasting for more than 12 h, the serum total cholesterol, HDL-C, triglycerides, creatinine concentrations, alanine aminotransferase (ALT) and aspartate aminotransferase (AST) activities, and plasma glucose concentrations were measured using a Hitachi 7600 Automatic Analyzer (Hitachi, Tokyo, Japan). Blood HbA1c and serum high-sensitive C-reactive protein (hs-CRP) were measured using a ZEUS 9.9 automatic analyzer (Takeda, Tokyo, Japan) and a high-sensitivity ELISA kit (Thermofisher, Waltham, MA, USA), respectively.
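The two derived indices described above can be sketched as follows. The SMI formula follows the text directly; the HOMA-IR expression is the commonly used one for glucose in mg/dL and insulin in µU/mL, and the units used in the study are an assumption here. Function and parameter names are illustrative only.

```python
def skeletal_muscle_index(asm_kg: float, height_m: float) -> float:
    """SMI = appendicular skeletal muscle mass (kg) / height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return asm_kg / height_m ** 2


def homa_ir(fasting_glucose_mg_dl: float, fasting_insulin_uU_ml: float) -> float:
    """HOMA-IR for glucose in mg/dL (assumed units): glucose x insulin / 405."""
    return fasting_glucose_mg_dl * fasting_insulin_uU_ml / 405.0


if __name__ == "__main__":
    # Hypothetical participant: ASM 20 kg, height 1.65 m
    print(round(skeletal_muscle_index(20.0, 1.65), 2))
    # Hypothetical fasting glucose 95 mg/dL and insulin 8 uU/mL
    print(round(homa_ir(95.0, 8.0), 2))
```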

Definition of hypo-HDL-C
Hypo-HDL-C was defined as HDL-C <40 mg/dL for men (n = 4,173) and <50 mg/dL for women or the current use of antidyslipidemic medication (n = 12,807) (16). Participants were categorized into the Low-HDL and Normal-HDL groups based on the above definition. There were 16,980 and 41,721 participants in the Low-HDL and Normal-HDL groups, respectively.
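The case definition above can be expressed as a small classification rule. This is a minimal sketch of the stated cutoffs; the function name and the "male"/"female" coding are illustrative assumptions.

```python
# Sex-specific HDL-C cutoffs (mg/dL) taken from the definition in the text
HDL_CUTOFF_MG_DL = {"male": 40.0, "female": 50.0}


def is_hypo_hdl(hdl_mg_dl: float, sex: str, on_lipid_medication: bool = False) -> bool:
    """Return True when a participant belongs to the Low-HDL group.

    A participant is a case when HDL-C is below the sex-specific cutoff
    or when antidyslipidemic medication is currently used.
    """
    cutoff = HDL_CUTOFF_MG_DL[sex]
    return on_lipid_medication or hdl_mg_dl < cutoff


if __name__ == "__main__":
    print(is_hypo_hdl(38.0, "male"))          # below the 40 mg/dL cutoff
    print(is_hypo_hdl(55.0, "female", True))  # medicated, so counted as a case
```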
Food intake using a semi-quantitative food frequency questionnaire (SQFFQ)
The usual food intake during the past 12 months was measured using an SQFFQ with 106 food items commonly consumed by Koreans and validated with three-day food records of the four seasons (16,17). Food intake was calculated by multiplying the frequency of each food consumption item by the amount consumed daily, as described previously. The food intake was expressed as grams/day. The daily intake of energy, carbohydrates, fats, proteins, vitamins, and minerals was calculated from the SQFFQ results using the computer-aided nutritional analysis program CAN-Pro 2.0 designed by the Korean Nutrition Society.

Dietary patterns by principal component analysis, dietary inflammatory index (DII), glycemic index (GI), and sulfur microbial diet index
As reported previously, food items in the SQFFQ were divided into 30 predefined food groups, which were used to constitute the dietary patterns using principal component analysis (PCA). Based on eigenvalues >1.5 and the orthogonal rotation procedure (varimax), four dietary patterns were defined (15). Each dietary pattern was named after the foods with factor-loading values ≥0.40 or the predominant contributors (15). Supplementary Table S1 lists the foods in each dietary pattern. The groups were named the Korean-balanced diet (KBD), plant-based diet (PBD), Western-style diet (WSD), or rice-based diet (RBD) groups.
DII is an index of the pro-inflammatory potential of dietary components. As the intake of garlic, ginger, saffron, and turmeric was not recorded, they were excluded from the DII computation. DII was calculated by multiplying the dietary inflammatory scores of the 38 food and nutrient components by their daily intakes, and the sum of 38 items was divided by 100, as described previously (18).
The GI and glycemic load (GL) were calculated using the relevant equations. The GI of the same food can vary due to differences in the types of the food and its nutritional composition. The GI values listed for common Korean foods were used (19). The 43 gut microbes related to sulfur metabolism were selected, and food groups positively or negatively associated with the gut microbes were identified in a previous study (20). The food groups with a positive association were processed meats, liquor, and low-calorie drinks, and those with a negative association were beer, fruit juices, legumes, other vegetables, and sweets or desserts. Sulfur microbial diet scores were calculated by multiplying the beta coefficient of each food group by the amount consumed and summing the products (20).
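The indices in this subsection share one structure, a weighted sum over foods. The sketch below assumes the conventional definitions (GL = GI × available carbohydrate / 100, and dietary GI as the carbohydrate-weighted mean GI), since the text only refers to "the relevant equations". The beta coefficients and food-group names in the usage example are hypothetical and are not the values from reference (20).

```python
from typing import Dict, Iterable, Tuple


def glycemic_load(gi: float, carb_g: float) -> float:
    """Conventional glycemic load of one food: GI x available carbohydrate (g) / 100."""
    return gi * carb_g / 100.0


def dietary_gi(foods: Iterable[Tuple[float, float]]) -> float:
    """Carbohydrate-weighted mean GI over (gi, carb_g) pairs."""
    foods = list(foods)
    total_carb = sum(carb for _, carb in foods)
    if total_carb == 0:
        raise ValueError("total carbohydrate intake is zero")
    return sum(gi * carb for gi, carb in foods) / total_carb


def weighted_diet_score(intakes: Dict[str, float], betas: Dict[str, float]) -> float:
    """Sum of beta coefficient x intake over food groups (missing intakes count as 0)."""
    return sum(beta * intakes.get(group, 0.0) for group, beta in betas.items())


if __name__ == "__main__":
    # Hypothetical betas: positive for processed meats, negative for legumes
    betas = {"processed_meat": 0.5, "legumes": -0.2}
    intakes = {"processed_meat": 2.0, "legumes": 3.0}  # servings/day, illustrative
    print(weighted_diet_score(intakes, betas))
```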
Genotyping using a Korean Chip, quality control, and genome-wide association study (GWAS)
The participants' genotypes in the Ansan/Ansung, rural, and city hospital-based cohorts were measured at the Center for Genome Science at the Korea National Institute of Health. The genotypes were measured in the genomic DNA isolated from whole blood using a Korean Chip (Affymetrix, Santa Clara, CA) designed for assessing the disease-related single nucleotide polymorphisms (SNPs) in Koreans (18). The inclusion criteria were ≥98% genotyping accuracy, <4% missing genotype call rate, <30% heterozygosity, and no gender bias. The genetic variants met the Hardy-Weinberg equilibrium (HWE) criterion at p > 0.05 and minor allele frequency (MAF) at >1% (18).
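The HWE and MAF filters can be illustrated with genotype counts for one biallelic SNP. This sketch uses the textbook chi-square goodness-of-fit test with one degree of freedom; the study does not state which HWE test was applied (PLINK, for example, offers an exact test), so the choice of test is an assumption.

```python
import math


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """MAF from genotype counts AA, AB, BB."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return min(p, 1 - p)


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of the 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa: int, n_ab: int, n_bb: int) -> bool:
    """Apply the thresholds stated in the text: HWE p > 0.05 and MAF > 1%."""
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) > 0.05 and minor_allele_frequency(n_aa, n_ab, n_bb) > 0.01
```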
GWAS was conducted between the Low-HDL (n = 16,980) and Normal-HDL (n = 41,721) groups using the PLINK open-source whole genome association analysis toolset. The Manhattan and quantile-quantile (Q-Q) plots, drawn using the Fastman library in the R program, showed the quality of the genetic variants selected from the GWAS (12). The Manhattan plot displayed the negative logarithms of the association p-values for serum HDL-C concentration. The Q-Q plot displayed the quantile distribution of observed p-values (on the y-axis) versus the expected p-values (on the x-axis) for the genetic variants between the Low-HDL and Normal-HDL groups. The Q-Q plot indicated the goodness of fit between the actual and theoretical data distributions, and the lambda value of the Q-Q plot was calculated. The pathways linked to the genetic variants for serum HDL-C concentrations were selected using the MAGMA gene-set analysis in the SNP2GENE of the FUMA web application, available through the git repository.¹ Statistical significance was set at a Bonferroni-corrected p-value <0.05.
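The lambda value mentioned above is commonly computed as the ratio of the median observed chi-square statistic to the median of the 1-df chi-square distribution (about 0.4549). The sketch below follows that common definition; whether the study used exactly this estimator is an assumption.

```python
from statistics import NormalDist, median
from typing import Iterable

# Median of the chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values: Iterable[float]) -> float:
    """Genomic inflation factor (lambda) from association p-values in (0, 1]."""
    nd = NormalDist()
    chi2 = []
    for p in p_values:
        if not 0 < p <= 1:
            raise ValueError("p-values must lie in (0, 1]")
        # Two-sided p-value -> z-score -> 1-df chi-square statistic
        z = nd.inv_cdf(p / 2)
        chi2.append(z * z)
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Uniformly spaced p-values mimic the null and give lambda close to 1
    null_p = [(i - 0.5) / 999 for i in range(1, 1000)]
    print(round(genomic_inflation(null_p), 3))
```

A lambda close to 1, such as the value reported in the Results, indicates little inflation of the test statistics.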

Selection of the genetic variants to influence hypo-HDL-C and the optimal model with SNP-SNP interactions
The procedure to select genetic variants for hypo-HDL-C risk and to generate the best model with the SNP-SNP interactions is presented in Figure 1. Genetic variants associated with hypo-HDL-C risk were evaluated to select 4,233 SNPs in the urban hospital-based cohort (p < 5×10⁻⁵). Among the genetic variants, those not meeting the criteria for HWE and MAF were removed (n = 681). The linkage disequilibrium (LD) analyses were conducted on the SNPs of the 3,552 genetic variants in the same chromosome using Haploview 4.2 in PLINK. The genetic variants having an LD score of D′ ≥ 0.2 were eliminated because they provided the same information on the genetic impact. The gene names of the remaining 154 genetic variants were searched using g:Profiler,² and 56 SNPs were identified by gene names. The pathways involved in the genetic variants were identified, and 24 SNPs were selected.³ The optimal SNP-SNP interaction model was identified using the generalized multifactor dimensionality reduction (GMDR).
¹ https://github.com/Kyoko-wtnb/FUMA-webapp/
Ten genetic variants interacted with each other and were selected by GMDR from among the 24 genetic variants associated with hypo-HDL-C risk. The optimal SNP-SNP interaction model was selected in a sign rank test of trained balanced accuracy (TRBA) and testing balanced accuracy (TEBA) while adjusting for the covariates using a GMDR program and a p-value threshold of 0.05 (12). The covariates used were age, gender, residence area, body mass index, education, and income for model 1, and the model 1 covariates plus energy intake, alcohol intake, regular exercise, and smoking status for model 2. The ten-fold cross-validation was also checked for cross-validation consistency (CVC) because the sample size was larger than 1,000 (12). The 10 out of 10 scores in the CVC indicated perfect cross-validation criteria.
² https://biit.cs.ut.ee/gprofiler/snpense
³ http://genemania.org/
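The two model-selection quantities named in the paragraph above, balanced accuracy and CVC, can be sketched as follows. This is not the GMDR program itself, only an illustration of the metrics; the function names and the model labels in the usage example are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity from a case/control confusion table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def cross_validation_consistency(best_models: Sequence[str]) -> Tuple[str, int]:
    """Most frequently selected model across folds and how many folds chose it."""
    model, count = Counter(best_models).most_common(1)[0]
    return model, count


if __name__ == "__main__":
    # Hypothetical fold results: the same model is chosen in all ten folds (CVC 10/10)
    folds = ["4-SNP"] * 10
    print(cross_validation_consistency(folds))
    print(balanced_accuracy(tp=80, fn=20, tn=60, fp=40))
```

Balanced accuracy is preferred over raw accuracy here because the Low-HDL and Normal-HDL groups differ considerably in size.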

Haplotype analysis and polygenic risk score
The haplotype was considered to show the genetic impact of the hypo-HDL-C risk when the primary genetic variants were located on the same chromosome. The LD of the selected SNPs met the criteria (D′ < 0.2). The haplotypes and their frequencies were analyzed using the GPLINK software (21).
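The D′ criterion can be illustrated with the standard definition of normalized linkage disequilibrium for two biallelic loci. The inputs are allele and haplotype frequencies; the function name and example values are illustrative and are not taken from the study data.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Normalized LD coefficient |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # monomorphic locus: LD is undefined, reported as 0 here
    return abs(d) / d_max


if __name__ == "__main__":
    # Hypothetical frequencies; a pair would be retained when the value is < 0.2
    print(d_prime(p_ab=0.27, p_a=0.5, p_b=0.5))
```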
The risk allele number of each SNP was counted to generate the PRS of the optimal models. For example, the genetic score for the SNP was 2, 1, and 0 when the participants had AA, AG, and GG of one SNP, respectively, and the A allele was the risk allele. The PRS of the best model was assessed by summing the number of the risk alleles from each selected SNP in the best gene-gene interaction model (21,22). The PRSs in the three and six SNP models were divided into three categories according to the number of risk alleles. They were classified as Low-PRS, Middle-PRS, and High-PRS when the number of risk alleles in the PRS was 0-2 (n = 19,686), 3-4 (n = 30,513), and ≥5 (n = 3,629) in the three-SNP model and 0-5 (n = 27,212), 6-7 (n = 20,375), and ≥8 (n = 1,822) in the six-SNP model, respectively. Among the best models to meet the p-value of the sign test and CVC, the model with the lowest SNP number (three-SNP model) was used to explore its interaction with the lifestyle parameters.
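The unweighted PRS described above (a count of risk alleles, then a three-level categorization) can be sketched as follows. The SNP identifiers and risk alleles in the usage example are placeholders, not the study's variants; the cutoffs 0-2, 3-4, and ≥5 are those stated for the smaller model.

```python
from typing import Dict


def risk_allele_count(genotype: str, risk_allele: str) -> int:
    """Number of risk alleles (0, 1, or 2) in a two-letter genotype such as 'AG'."""
    return genotype.upper().count(risk_allele.upper())


def polygenic_risk_score(genotypes: Dict[str, str], risk_alleles: Dict[str, str]) -> int:
    """Unweighted PRS: sum of risk-allele counts over the SNPs in the model."""
    return sum(risk_allele_count(genotypes[snp], allele) for snp, allele in risk_alleles.items())


def prs_category(score: int, low_max: int, middle_max: int) -> str:
    """Classify a PRS as Low-, Middle-, or High-PRS using inclusive upper bounds."""
    if score <= low_max:
        return "Low-PRS"
    if score <= middle_max:
        return "Middle-PRS"
    return "High-PRS"


if __name__ == "__main__":
    # Placeholder SNP names and risk alleles, for illustration only
    risk = {"snp1": "A", "snp2": "A", "snp3": "A"}
    person = {"snp1": "AA", "snp2": "AG", "snp3": "GG"}
    score = polygenic_risk_score(person, risk)
    print(score, prs_category(score, low_max=2, middle_max=4))
```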

Expression quantitative trait locus (eQTL) analysis
The eQTL analysis is a direct approach to estimating the candidate gene expression with the genetic variants at risk loci. Gene expressions corresponding to the genetic variants related to the hypo-HDL-C risk were determined by eQTL analysis in the Genotype-Tissue Expression (GTEx) eQTL calculator.⁴

Molecular docking of the gene having missense mutation with food compounds and molecular dynamics simulation (MDS)
The wild and mutated protein structures were generated in the Protein Data Bank (PDB) format from the Iterative Threading Assembly Refinement (I-TASSER) website.⁵ The proteins were converted into PDBQT files, i.e., PDB files with partial charge (Q) and atom type (T), using AutoDock Tools 1.5.6 (Molecular Graphics Laboratory, Scripps Research Institute, FL, USA) (23). The active sites of the proteins were searched using the ProteinsPlus website.⁶ The active functional pockets and the mutated sites were also included in the active site for molecular docking. Food compounds (n = 20,000) were converted to the PDBQT file format, and water molecules attached to the ligands were removed (23). Food components having < −10 kcal/mol binding energy between the proteins and food components were selected (24). The lower the binding free energy, the tighter the binding and affinity.
The conformational changes in the protein structures were examined using MDS to detect the changes in their activity. After the top docking poses with the selected food components were added, simulations were conducted on the docked complexes between the protein and food components. The Chemistry at Harvard Macromolecular Mechanics (CHARMM) force field was added to the docked complex in the "Simulation" part, and the protein was solvated by "Solvation." The "Standard Dynamics Cascade" was used to set the molecular dynamics simulation parameters for the protein added to the solvent system. The root mean square deviation (RMSD), root mean square fluctuations (RMSF), and hydrogen bond values were determined after the 10 ns simulation.
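For readers unfamiliar with the RMSD metric, a minimal sketch is given below. It assumes the two coordinate sets are already superimposed and atom-matched; simulation packages normally perform the alignment step first, which is omitted here. The coordinates in the usage example are made up.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(reference: Sequence[Coord], frame: Sequence[Coord]) -> float:
    """Root mean square deviation between two matched, pre-aligned coordinate sets.

    The result has the same length unit as the inputs (for example, angstroms).
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(reference, frame)
    )
    return math.sqrt(squared / len(reference))


if __name__ == "__main__":
    # Two hypothetical atoms displaced by 3 units along z
    start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    later = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    print(rmsd(start, later))
```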

Statistical analysis
Statistical analysis was performed using SAS (version 9.3; SAS Institute, Cary, NC, USA). The sample size was determined by satisfying the significance at α = 0.05, β = 0.99, and 1.05 odds ratio in the logistic analysis using the G*Power calculator. The sample size of 57,801 was sufficient to achieve the significance. Frequency distributions were used for the descriptive statistics for categorical variables between the Low-HDL and Normal-HDL groups, and a Chi-square test was applied to determine statistical significance. Descriptive statistics of the continuous variables were determined as the adjusted means with standard deviations after adjusting for the covariates linked to dyslipidemia. The gender and HDL groups were used as the main effects, and their interactions were evaluated in a two-way analysis of covariance (ANCOVA) (19). Multiple comparisons of the groups were conducted using Tukey's test.
The association of hypo-HDL-C with the biochemical parameters was evaluated using a logistic regression analysis after adjustment for covariates. The odds ratios (ORs) and 95% confidence intervals (CIs) of hypo-HDL-C with each biochemical parameter were calculated. The covariate set 1 was age, residence area, survey year, body mass index (BMI), education, and income. Set 2 was the covariates in set 1, plus energy intake, physical activity, smoking status, and alcohol consumption, and the covariate set 3 was the covariates of set 2, plus blood HbA1c and serum triglyceride concentration. In the two-way ANCOVA, when the interaction terms between the PRS and lifestyle-related parameters were statistically significant, each lifestyle-related parameter was categorized into the Normal-HDL or Low-HDL groups with the designated cutoff. The adjusted ORs and 95% CIs of hypo-HDL-C with PRS were also calculated by adjusted logistic regression analysis with covariate set 3 between the Normal-HDL and Low-HDL groups. The significant differences between the Low-HDL and Normal-HDL groups were analyzed using the χ² test in the low and high groups of lifestyle-related parameters.
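As a simplified illustration of an odds ratio with a 95% CI, the sketch below computes the crude (unadjusted) OR from a 2×2 table with the Woolf (log) interval. The study itself used covariate-adjusted logistic regression in SAS, so this is a stand-in for the concept rather than the authors' procedure; the counts in the usage example are invented.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Crude odds ratio and Woolf confidence interval from a 2x2 table.

    a: exposed cases       b: exposed controls
    c: unexposed cases     d: unexposed controls
    z: normal quantile (1.96 for a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts: High-PRS vs Low-PRS by Low-HDL vs Normal-HDL
    or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
    print(round(or_, 2), round(lo, 2), round(hi, 2))
```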

Characteristics of the participants
The participants in the Low-HDL group were older, less educated, and earned a lower income than those in the Normal-HDL group, but this difference was restricted only to the women (Table 1). The participants in the Low-HDL group had higher BMI, waist circumferences, and fat mass than those in the Normal-HDL group for both genders, but only women had a lower SMI (Table 1). Serum glucose and blood HbA1c concentrations and insulin resistance were higher in the participants in the Low-HDL group than those in the Normal-HDL group. The participants in the Low-HDL group had a higher incidence of dyslipidemia and hypertension (Table 1). The incidence of myocardial infarction, stroke, and cardiovascular disease was higher in the Low-HDL group than in the Normal-HDL group for both genders (Table 1). The participants in the Low-HDL group had about 1.4 times the risk of cardiovascular diseases, including myocardial infarction and cerebrovascular stroke, compared with those in the Normal-HDL group (Figure 2).

Lifestyles, including nutrient intake, and dietary patterns
There was no difference in the daily energy intake between the Low-HDL and the Normal-HDL groups. The participants of both genders in the Low-HDL group consumed more carbohydrates and less fat than those in the Normal-HDL group (Table 2). Protein, fiber, calcium, sodium, and vitamin D intakes were lower in the women in the Low-HDL group than in those in the Normal-HDL group (Table 2). The DII, GI of the food consumed, and flavonoid intake did not differ between the Low-HDL and Normal-HDL groups. Coffee intake was lower in the Low-HDL than the Normal-HDL group only in women, and alcohol intake was lower for both genders (Table 2). Fewer participants exercised regularly in the Low-HDL than in the Normal-HDL group for both genders, and the number of male smokers was much higher in the Low-HDL group (Table 2).

Genetic variants associated with hypo-HDL-C
The statistical significance of the genetic variants associated with hypo-HDL-C is shown in a Manhattan plot (Supplementary Figure S1A). Lambda, a genome inflation factor for genetic variants linked to hypo-HDL-C, was calculated by comparing the observed and expected p-values. This comparison was shown in the Q-Q plot, and the lambda was 1.083, indicating no inflation of the genetic variants (Supplementary Figure S1B).
The PRS of 24 genetic variants associated with hypo-HDL-C satisfied the inclusion criteria, such as p < 5×10⁻⁸ for the GWAS, D′ < 0.2 for LD, p ≥ 0.05 for HWE, and ≥0.01 for MAF. In the PRS containing the risk alleles of 24 genetic variants, serum HDL-C was lower in the high-PRS containing up to 12-15 risk alleles, but it continuously decreased with the PRS containing up to 37 risk alleles (Figure 3A). Among the 24 genetic variants, ten were selected as having similar pathways and interactions with each other using the GeneMANIA site. The characteristics of the 10 genetic variants are shown in Table 3. These 10 genetic variants were associated with hypo-HDL-C at p < 5×10⁻¹² in the city hospital-based cohort (n = 58,701) and at p < 5×10⁻⁵ in the Ansan/Ansung plus rural cohorts (n = 13,598).

PRS for the genetic variants interacting with each other or for the haplotype
The optimal model with genetic variants interacting with each other was found using GMDR. The 4-SNP and 6-SNP models met the criteria for TEBA at p < 0.05 and 10/10 CVC. The 4-SNP model included BUD13_rs180327, ZPR1_rs3741297, ALDH1A2_rs588136, and CETP_rs708272, and the 6-SNP model added LPL_rs325 and ABCA1_rs1883025 to the genetic variants of the 4-SNP model (Supplementary Table S2). Among the 10 genetic variants, 4 genetic variants in chromosome 11 were part of the haplotype, and its PRS was calculated. Four genetic variants in chromosome 11 showed D′ < 0.2 of LD, as shown in Supplementary Figure S2. As the PRS for the 4-SNP, 6-SNP, and haplotype 11q23.3 models increased, the serum HDL-C was lowered (Figures 3B-D). However, the decline in serum HDL-C was greater in the 4-SNP model than in the 6-SNP model. In haplotype 11q23.3, the genetic variants with D′ ≥ 0.2 were removed (Supplementary Figure S2).

Gene expression by eQTL according to genetic variants
Gene expression in different tissues was determined for only some of the genetic variants selected for hypo-HDL-C. The risk alleles of ALDH1A2_rs588136 and SIK3_rs7115583 had a lower expression than the non-risk alleles in the cortex of the brain and skeletal muscle (Figure 4). However, the risk alleles of CETP_rs708272 and ABCA1_rs1883025 had a higher expression than the non-risk alleles in the atrial appendage of the heart and skeletal muscle (Figure 4).

The binding affinity of kuwanol E to APOE_rs429358
The wild-type (130Cys) and mutated (130Arg) APOE_rs429358 exhibited different levels of binding free energy to specific food components (Table 4). Neoacrimarine H, viniferifuran, morellinol, 22-deoxocucurbitacin D, cucurbitacin B, yuccaol A, yuccaol C, pregeijerene, and plantacyanin showed a low binding energy of < −10 kcal/mol to both wild and mutated types of APOE_rs429358 (Table 4). However, some food components showed a different binding affinity to wild and mutated types of APOE_rs429358. For example, the binding energy of APOE_rs429358 with kuwanol E was −7.3 kcal/mol with the wild type but −10.1 kcal/mol with the mutated type (Table 4). Kuwanol E's binding to the wild and mutated types of APOE_rs429358 is presented in Figures 5A,B.
Figures 5C,D show the root mean square deviation (RMSD) and root mean square fluctuation (RMSF) for APOE_rs429358 wild and mutated types binding to kuwanol E. RMSD for APOE_rs429358 mutated type binding with kuwanol E was sustained close to 3 Å during 100 nanoseconds (Figure 5C). RMSF for APOE_rs429358 wild-type binding with kuwanol E also did not exceed 3 nm in the RMSF graph (Figure 5D). These results suggest that kuwanol E was more stably bound to the APOE_rs429358 mutated type than the wild type.

Metabolic functions of the genetic variants associated with hypo-HDL-C
The genetic variants associated with hypo-HDL-C were involved in reverse cholesterol transport, triglyceride-rich

Interaction of genetic variants and lifestyle parameters in hypo-HDL-C
In the interaction between the PRS and lifestyles, the PRS of the 4-SNP model interacted with energy intake (p = 0.04) and smoking status (p = 0.0006; Table 6). HDL-C was much lower in the participants in the high-PRS group with a low-energy intake than in those with a high-energy intake (Figure 6A). Former and current smokers had a lower serum HDL concentration than non-smokers, but the PRS effect was smaller in the smokers than in the non-smokers (Figure 6B). These results indicated that a low-energy intake and non-smoking status did not improve the hypo-HDL-C status in the participants with high-PRS.
The PRS of the haplotype in 11q23.3 interacted with the glycemic index (p = 0.003), sulfur microbial diet (p = 0.005), and smoking status (p < 0.0001) to influence hypo-HDL-C status (Table 6). Serum HDL concentration was lower in participants with a high-GI diet than in those with a low-GI diet regardless of haplotype (hap)-PRS, but among participants with a low-GI diet, those with a high-hap-PRS had a much lower serum HDL concentration (Figure 6C). In both the low- and high-sulfur microbial diets, participants with a high-hap-PRS had a lower HDL-C than those with a low-hap-PRS, while the PRS impact was greater in the low-sulfur microbial diet than in the high-sulfur microbial diet (Figure 6D). Smokers had lower serum HDL concentrations than non-smokers, and the non-smokers with a high-hap-PRS showed remarkably lower serum HDL concentrations than those with a low-hap-PRS (Figure 6E). Therefore, a low-GI diet, a low-sulfur microbial diet, and non-smoking status could not offset the impact of a high-hap-PRS on decreasing serum HDL concentration.

Discussion
The prevalence of hypo-HDL-C is estimated to be approximately 15-25% in the adult population worldwide (1). Hypo-HDL-C is an independent risk factor for CVD and is associated with an increased risk of coronary artery disease, stroke, and peripheral arterial disease. The risk factors for hypo-HDL-C include age, gender, lifestyle factors (smoking, lack of physical activity, poor diet, and obesity), family history, and certain medical conditions. Age above 55 years, the female gender, higher BMI, waist circumferences, fat mass, hyperglycemia, insulin resistance, hypertension, and hypertriglyceridemia were risk factors for hypo-HDL-C in the present study. Among lifestyle factors, the risk factors were observed to be a high carbohydrate and sodium intake, low protein, vitamin D, coffee, and alcohol intake, low exercise, and smoking. In addition to lifestyle factors, hypo-HDL-C was associated with genetic factors, and the interaction between genetic and lifestyle factors influenced hypo-HDL-C.
In the present study, the PRS of the 4-SNP model and the 11q23.3 haplotype were positively associated with hypo-HDL-C by about 3 times. This was a higher association than the PRS of the 24-SNP model with p < 5×10⁻⁸. These results suggest that the genetic variants interacting with each other showed a better association with hypo-HDL-C when they were pooled. The selected genetic variants indicated the essential pathways that influence hypo-HDL-C. The SNPs associated with hypo-HDL-C identified in the present study were involved in generating HDL-C, reverse cholesterol transport, triglyceride transport, and cholesterol metabolism in macrophage and foam cells. Pre-β-HDL is formed and secreted from the liver and intestines into the bloodstream. It interacts with ABCA1 to form disc-shaped nascent HDL by the efflux of phosphatidylcholine and cholesterol, and phospholipids from the macrophage and foam cells are transferred into nascent HDL-C through ABCA1 (7). The intracellular cholesterol contents control the expression of the ABCA1 gene. The nascent HDL-C is converted into mature spherical HDL through the esterification of free cholesterol by LCAT during reverse cholesterol transport. CETP, mainly released from the liver, is bound to HDL-C in the blood, and it facilitates the movement of cholesteryl ester and triglycerides between HDL, LDL, and very low-density lipoprotein (VLDL). Moreover, increased HDL-C content and activity decrease triglycerides in the serum. Therefore, the expression and mutation of APOA1, ABCA1, LCAT, CETP, and LPL are involved in HDL-C and triglyceride homeostasis (25). The present study showed that their genetic variants were associated with hypo-HDL-C.
Interestingly, hypo-HDL-C is genetically linked to triglyceride transfer from chylomicrons and VLDL. The APOA1/C3/A4/A5-ZPR1-BUD13 gene cluster is located on chromosome 11q23.3 and modulates LPL activity (26, 27). The cluster is also linked to HDL metabolism. The best model with SNP-SNP interaction included 2 SNPs from this cluster (ZPR1_rs3741297 and BUD13_rs180327). The PRS of the 4-SNP model and the 11q23.3 haplotype were positively associated with not only hypo-HDL-C but also hypertriglyceridemia. However, the PRS of the 4-SNP model and the 11q23.3 haplotype were not associated with other biochemical and anthropometric measurements. This suggests that triglyceride metabolism is closely linked to HDL metabolism. Girona et al. (28) have reported that the triglyceride content in HDL is strongly inversely related to HDL-C and positively associated with the triglyceride contents in chylomicrons and VLDL. HDL with high triglyceride content is small and has a severely abnormal structure (29). Small HDL or HDL with high triglycerides is a marker of cardiovascular disease (29). Therefore, the genetic variants involved in triglyceride metabolism may be linked to the triglyceride movement of not only chylomicrons and VLDL but also HDL.
ApoE mediates the binding of lipoproteins, especially VLDL and chylomicron remnants or lipid complexes in the plasma or interstitial fluids, to specific cell-surface receptors (30). In the present study, APOE_rs429358, a missense mutation (cys130arg), was positively associated with hypo-HDL-C. Consistent with the present study, APOE_rs429358 and rs7412 polymorphisms have been positively associated with hypo-HDL-C, hyper-LDL-C, and hypertriglyceridemia in an earlier study. Furthermore, the lipid profile was linked to cognition in the aging Chinese population (31) and in Eastern Europe, as observed in the Health, Alcohol, and Psychosocial Factors in Eastern Europe (HAPIEE) study (32). Therefore, APOE_rs429358 is associated with dyslipidemia, indicating that it impacts ApoE activity, thereby modulating the lipid profile.
However, no study has investigated the change in ApoE binding energy with food components according to the rs429358 mutation. In the present study, the wild-type and mutated-type proteins showed similar binding energies to some food components and different binding energies to others. Among 20,000 food components, some, mainly polyphenols, had a binding energy of < −10 with the wild and mutated types of APOE_rs429358 (cys130arg). Two food components (solacauline and isomorellic acid) had low binding energy with wild-type APOE, and three (quercetin 3-O-xylosylglucuronide, gambogic acid, and kuwanol E) had low binding energy with mutated-type APOE. The differences in binding energy between the wild and mutated types were due to conformational changes of the ApoE expressed by the gene with the missense mutation. The changes modified intermolecular binding affinities, such as the conventional hydrogen bonds between food components and the ApoE protein.
Kuwanol E, known to be present in Morus alba, had a lowered binding affinity with the mutated type rather than the wild type. Previous studies have shown that Morus alba intake protects against dyslipidemia and hepatic steatosis in rats and humans (33)(34)(35). Lowering the binding energy of ApoE improves its activity by lowering serum HDL-C concentrations.
The relationship between dietary fat and carbohydrate intake and HDL-C is somewhat paradoxical with respect to CVD risk (36). HDL-C decreases when dietary saturated fat is replaced with polyunsaturated and monounsaturated fat (36). Furthermore, the switch from dietary fat to dietary carbohydrate reduces HDL-C. The present study showed that high carbohydrate and low fat intake, regardless of the type of fat, decreased HDL-C (p < 0.001). Although fiber intake was lower in the Low-HDL group than in the Normal-HDL group (p < 0.05), the dietary glycemic index and sulfur diet index did not differ between the Low-HDL and Normal-HDL groups. Vitamin D intake is a well-known regulator of calcium homeostasis, blood pressure, and glycemia (37). Vitamin D significantly enhances HDL-C to reduce atherosclerotic cardiovascular disease risk scores (38, 39). Consistent with the previous studies (40), vitamin D and calcium intakes were higher in the Normal-HDL group than in the Low-HDL group.
Previous studies have consistently reported that individuals with higher sulfur microbial diet scores, which reflect a high intake of processed meats, liquor, low-calorie drinks, beer, sweets, and desserts and a low intake of mixed vegetables, face an elevated risk of early-onset adenomas and colorectal cancer, with the risk being 1.31 and 1.25 times higher, respectively (41, 42). Furthermore, recent research based on data from the UK Biobank has highlighted the influence of sulfur microbial diet scores on obesity risk and metabolic processes (20). However, the present study observed no significant differences in sulfur microbial diet scores between the Low-HDL and Normal-HDL groups. This discrepancy can be attributed to the distinctive dietary habits of Koreans, whose average sulfur microbial diet scores were notably lower (approximately −35 for men and −55 for women) than those of Europeans (approximately −0.5). This variation arises because Koreans consume considerably fewer processed meats and more legumes and vegetables than their European counterparts (20). Additionally, our study revealed an interaction between sulfur microbial diet scores and the PRS that affected serum HDL concentration. Among individuals who adhered to a low sulfur microbial diet, those with medium-hap-PRS exhibited a reduced genetic influence on their serum HDL levels. The impact of the haplotype was less pronounced with a low sulfur microbial diet, underscoring the interplay between diet and genetics. Some studies have demonstrated that genetic variants related to hypo-HDL-C interact with lifestyle factors to modulate HDL-C. For example, total fat intake interacted with the LPL_rs13702 polymorphism to impact HDL-C (interaction p = 0.041). Individuals with the risk allele (G) of LPL_rs13702 have significantly higher HDL-C when consuming a high-fat diet (>92 g/day) than those on a low-fat diet (p = 0.033) (43). The risk allele of the haplotype in the 12q23 has a positive association with hypo-HDL-C by 1.65 times compared to its non-risk alleles. The risk allele of the haplotype interacts with protein, saturated fat, and polyunsaturated fatty acid intake (16). However, the PRS with 4-SNP and the haplotype 11q23.3 had no interaction with protein and fat intake. On the other hand, there was an interaction of the PRS with 4-SNP and haplotype 11q23.3 with energy intake and the sulfur microbial diet, respectively. The PRS of 4-SNP and the haplotype did not interact with protein and fat intake but did interact with the sulfur microbial diet, which is high in meats, mainly processed meats, and low in vegetables. Therefore, the genetic variants are associated with the intake of saturated fat and proteins.
The present study is novel as it has shown that HDL-C was associated with not only cholesterol but also triglyceride transfer from triglyceride-rich lipoproteins. Furthermore, HDL-C was related to the regulation of macrophage-derived foam cell differentiation. The PRS of the 4-SNP model and haplotype 11q23.3 interacted with energy intake and the sulfur microbial diet score, respectively, to influence low HDL-C. The limitations of the study were as follows: First, the data originated from a cross-sectional study; although it was well designed and the data collection and measurement were well controlled, the results could not establish cause and effect. Second, HDL subclasses, including HDL particle size, composition, and functionality, were not measured to understand the genetic impact on HDL metabolism and its interaction with lifestyles. Third, daily food intake was estimated from the SQFFQ, including 106 common Korean foods and dishes. The SQFFQ included some bias for the usual food intake, although it was checked with a three-day record four times. Fourth, genetic variants were estimated with a customized K-chip for Koreans (Axiom Biobank plus Genotyping Array, KNIHv1.1) (44). The genetic variants might not include some genetic variants related to metabolic diseases.
In conclusion, adults with hypo-HDL-C had a 1.4-fold higher risk of CVD. A high PRS of ZPR1_rs3741297, CETP_rs708272, BUD13_rs180327, and ALDH1A2_rs588136, or of the haplotype 11q23.3, was positively associated with the risk of hypo-HDL-C by about 3 times. The PRS of the 4 SNPs and of the haplotype interacted with energy intake and sulfur microbial diet scores, affecting hypo-HDL-C. The wild type of APOE_rs429358 (cys130) showed binding energies to polyphenols that differed somewhat from those of the mutated type (130arg). Therefore, adults with a genetic risk for hypo-HDL-C need to modulate their diet to reduce their risk. Our study demonstrates the clinical relevance of genetic variants associated with hypo-HDL-C in a large cohort of middle-aged Asian adults. These findings highlight the diagnostic value of incorporating genetic risk assessment into managing hypo-HDL-C and, eventually, cardiovascular disease. Individuals identified with a genetic predisposition to hypo-HDL-C can benefit from targeted interventions, including lifestyle modifications (low sulfur microbial and low glycemic diets and not smoking) and early screening, to mitigate their increased risk of myocardial infarction and stroke. Implementing personalized risk assessments based on genetic factors has the potential to enhance preventive strategies and improve patient outcomes in clinical practice.

FIGURE 1 Flow chart to generate the polygenic risk score (PRS) associated with hypo-HDL-cholesterolemia by SNP-SNP interaction and haplotype and its interaction with lifestyle factors. Korean adults aged over 40 were categorized according to the guideline-based cutoffs of HDL-C < 40 mg/dL for men and < 50 mg/dL for women, plus the current use of anti-dyslipidemic medication, in the period 2010-2014. There were 16,980 and 41,721 participants in the Low-HDL and Normal-HDL groups, respectively.
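
The grouping rule in this caption can be written as a short function. The following is a minimal Python sketch, not the authors' code: the `Participant` record, its field names, and the `"M"`/`"F"` sex coding are assumptions made for illustration; only the cutoffs (40 mg/dL for men, 50 mg/dL for women) and the medication criterion come from the text.

```python
from dataclasses import dataclass

# Guideline-based HDL-C cutoffs (mg/dL) stated in the text.
HDL_CUTOFF_MG_DL = {"M": 40.0, "F": 50.0}


@dataclass
class Participant:
    # Hypothetical record layout; field names are illustrative assumptions.
    sex: str                   # "M" or "F"
    hdl_mg_dl: float           # serum HDL-C concentration
    on_lipid_medication: bool  # current anti-dyslipidemic medication use


def is_low_hdl(p: Participant) -> bool:
    """Return True if the participant falls in the Low-HDL (case) group."""
    if p.sex not in HDL_CUTOFF_MG_DL:
        raise ValueError(f"unknown sex code: {p.sex!r}")
    # Cases are those below the sex-specific cutoff or on medication.
    return p.on_lipid_medication or p.hdl_mg_dl < HDL_CUTOFF_MG_DL[p.sex]
```

For example, a woman with an HDL-C of 45 mg/dL is assigned to the Low-HDL group, whereas a man with the same value is not unless he takes anti-dyslipidemic medication.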

FIGURE 2 Adjusted odds ratios (ORs) and 95% confidence intervals (CIs) of hypo-HDL-cholesterolemia with cardiovascular diseases. Covariate set 1 included age, gender, body mass index, residence area, education, and income; covariate set 2 contained those in set 1 plus energy intake, exercise, alcohol consumption, smoking, and incidence of osteoporosis; and covariate set 3 included those in set 2 plus blood HbA1c and serum triglyceride concentrations.
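
The ORs in this figure are adjusted for the covariate sets listed above. As a simpler illustration of what an OR and its 95% CI express, the Python sketch below computes the unadjusted OR from a 2×2 table with a Wald interval on the log scale. This is a swapped-in, simplified calculation and not the adjusted analysis used in the study; the function name and the counts in the usage note are hypothetical.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts, `odds_ratio_wald_ci(20, 80, 10, 90)` gives an OR of 2.25. An association is conventionally read as statistically significant at the 5% level when the interval excludes 1.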

FIGURE 5 Molecular docking and molecular dynamic simulation (MDS) of Kuwanol E on Apolipoprotein E (APOE) wild type (cys130) and mutated type (130arg) in rs429358. (A) The interaction force between Kuwanol E and APOE_rs429358 wild type. (B) The interaction force between Kuwanol E and APOE_rs429358 mutated type. (C) Root mean square deviation (RMSD) of Kuwanol E on APOE_rs429358 wild and mutated types. (D) Root mean square fluctuations (RMSF) of Kuwanol E on APOE_rs429358 wild and mutated types.

TABLE 1
General, anthropometric, and biochemical characteristics according to gender and hypo-HDL.

TABLE 2
Daily nutrient intake according to gender and hypo-HDL-cholesterolemia.
Serum HDL-C concentration according to the PRS of 4 SNPs in haplotype 11q23.3 selected from the SNP-SNP interaction by GMDR. (E) Adjusted odds ratios (ORs) and 95% confidence intervals (CIs) of hypo-HDL-cholesterolemia with the 24-SNP PRS, 4-SNP PRS, 6-SNP PRS, and PRS of the haplotype 11q23.3. PRS was generated as the sum of the number of risk alleles in each SNP generated from the SNP-SNP interaction and haplotype. They were classified as Low-PRS, Middle-PRS, and High-PRS according to the ranges 0-3, 4-5, and ≥ 6 in the four-SNP model and 0-5, 6-7, and ≥ 8 in the six-SNP model, respectively. Covariate set 1 included age, gender, body mass index, residence area, education, and income; covariate set 2 contained those in set 1 plus energy intake, exercise, alcohol consumption, smoking, and incidence of osteoporosis; and covariate set 3 included those in set 2 plus blood HbA1c and serum triglyceride concentrations.
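
The PRS construction described in this caption (an unweighted sum of risk alleles, then three categories) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes each SNP contributes 0, 1, or 2 risk alleles, and the function names, the model keys `"4-SNP"` and `"6-SNP"`, and the genotype in the usage note are hypothetical. The cutoffs are those given in the caption.

```python
# Upper bounds of the Low-PRS and Middle-PRS ranges, per the caption:
# four-SNP model: 0-3, 4-5, >= 6; six-SNP model: 0-5, 6-7, >= 8.
PRS_CUTOFFS = {"4-SNP": (3, 5), "6-SNP": (5, 7)}


def polygenic_risk_score(risk_allele_counts: dict) -> int:
    """Sum the number of risk alleles over all SNPs (unweighted PRS)."""
    for snp, count in risk_allele_counts.items():
        # Assumption: additive coding, 0, 1, or 2 risk alleles per SNP.
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk allele count must be 0, 1, or 2")
    return sum(risk_allele_counts.values())


def prs_category(score: int, model: str = "4-SNP") -> str:
    """Map a PRS to Low-PRS, Middle-PRS, or High-PRS for the given model."""
    low_max, middle_max = PRS_CUTOFFS[model]
    if score <= low_max:
        return "Low-PRS"
    if score <= middle_max:
        return "Middle-PRS"
    return "High-PRS"
```

For a hypothetical genotype `{"ZPR1_rs3741297": 2, "CETP_rs708272": 1, "BUD13_rs180327": 2, "ALDH1A2_rs588136": 1}`, the score is 6, which falls in the High-PRS category of the four-SNP model.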

TABLE 3
Characteristics of genetic variants related to hypo-HDL-cholesterolemia from generalized multifactor dimensionality reduction analysis.

TABLE 4
Binding energy of food components to APOE wild type (WT) and mutated type (MT) in rs429358.

TABLE 5
Pathways related to genetic variants for serum HDL concentration.

TABLE 6
Adjusted odds ratios for the hypo-HDL risk by polygenic risk scores (PRS) of the best model for gene-gene interaction or haplotype in 11q23.3 after covariate adjustments according to the patterns of lifestyles. Values represent adjusted odds ratios and 95% confidence intervals. PRS with 4 SNPs of the best GMDR model or haplotype in 11q23.3 was divided into three categories according to the number of risk alleles: ≤ 3, 4-5, and ≥ 6 for Low-PRS, Middle-PRS, and High-PRS, respectively. The reference was the Low-PRS. Covariates included age, sex, education, income, energy intake, residence areas, daily activity, alcohol intake, smoking status, blood HbA1c, and serum triglyceride concentrations. Cutoff of each variable: 100% for estimated energy requirement (EER)¹, 33rd percentiles², and 20 g/day³. GI, glycemic index; DII, dietary inflammation index. ¹ < Estimated energy requirement defined in dietary reference index; ² < 75th percentiles; < 20 g/day; ³ < 20 g daily alcohol intake.