Genetic Polymorphism of Vitamin D Family Genes CYP2R1, CYP24A1, and CYP27B1 Are Associated With a High Risk of Non-alcoholic Fatty Liver Disease: A Case-Control Study

Background Previous studies have highlighted the important role of vitamin D and calcium pathway genes in immune modulation, cell differentiation and proliferation, and inflammation regulation, all closely implicated in the pathogenesis of non-alcoholic fatty liver disease (NAFLD). Objective This study aims to investigate whether 11 candidate single nucleotide polymorphisms (SNPs) in vitamin D and calcium pathway genes (CYP2R1, CYP24A1, and CYP27B1) are associated with the risk of NAFLD. Methods In this case-control study, a total of 3,023 subjects were enrolled, including 1,114 NAFLD cases and 1,909 controls. Eleven genetic variants in CYP2R1, CYP24A1, and CYP27B1 genes were genotyped. Logistic regression analysis was used to assess the effects of these variants on NAFLD risk. The functional annotations of positive SNPs were further evaluated by bioinformatics analysis. Results After adjusting for age, gender, and metabolic measures, we identified that CYP24A1 rs2296241 variant genotypes (recessive model: OR, 1.316; 95% CI, 1.048–1.653; p = 0.018), rs2248359 variant genotypes (recessive model: OR, 1.315; 95% CI, 1.033–1.674; p = 0.026), and CYP27B1 rs4646536 variant genotypes (additive model: OR, 1.147; 95% CI, 1.005–1.310; p = 0.042) were associated with an elevated risk of NAFLD. In combined effects analysis, we found that NAFLD risk significantly increased among patients carrying more rs2296241-A, rs2248359-T, and rs4646536-T alleles (ptrend = 0.049). Multivariate stepwise analysis indicated that age, visceral obesity, ALT, γ-GT, hypertriglyceridemia, hypertension, low HDL-C, hyperglycemia, and unfavorable alleles were independent predictors of NAFLD (all p < 0.05). The area under the receiver operating characteristic curve was 0.789 for all the above factors. 
Conclusion The polymorphisms of vitamin D family genes CYP24A1 (rs2296241 and rs2248359) and CYP27B1 (rs4646536) were associated with NAFLD risk in the Chinese Han population, which might provide new insight into NAFLD pathogenesis and tools for screening high-risk populations.


INTRODUCTION
Non-alcoholic fatty liver disease (NAFLD) has emerged as a major chronic liver disease, afflicting more than one-quarter of adults worldwide (Younossi et al., 2016). Previous studies have shown that NAFLD can not only cause liver-related complications, such as non-alcoholic steatohepatitis (NASH), cirrhosis, and hepatocellular carcinoma (HCC) (Lindenmeyer and McCullough, 2018), but also increase the risk of extrahepatic diseases, such as cardiovascular disease (CVD) (Targher et al., 2010), type 2 diabetes mellitus (T2DM) (Tilg et al., 2017), and chronic kidney disease (Byrne and Targher, 2020). Its high global prevalence and poor prognosis have made NAFLD a serious public health threat.
The mechanisms underlying NAFLD are far from clear (Tarantino et al., 2019). Previous studies have indicated that NAFLD is a multi-factorial disease associated with a high frequency of metabolic comorbidities (Italian Association for the Study of the Liver (Aisf), 2017). Experts have reached a consensus to rename NAFLD as metabolic-associated fatty liver disease (MAFLD), a more appropriate overarching term (Eslam et al., 2020). Dynamic interactions among insulin resistance, lipid metabolism, genetic variation, and other environmental factors shape the susceptibility to and progression of this disease (Buzzetti et al., 2016; Eslam and George, 2016).
The aim of this study was to investigate the association of candidate SNPs in vitamin D and calcium pathway genes (CYP2R1, CYP24A1, and CYP27B1) with the risk of NAFLD in the Chinese Han population. Our findings may deepen our understanding of NAFLD and provide strategies for the screening, prevention, and individualized treatment of NAFLD patients.

Study Participants and Design
The participants of this case-control study were recruited from a community (Nanjing, Jiangsu, China) from July to September 2018. NAFLD was diagnosed according to the Guideline of prevention and treatment for non-alcoholic fatty liver disease: a 2018 update (National Workshop on Fatty Liver and Alcoholic Liver Disease, Chinese Society of Hepatology, Association et al., 2018). The diagnostic criteria were as follows: (1) no history of alcohol consumption, or no excessive drinking (ethanol intake less than 210 g/week for men and 140 g/week for women in the past 12 months); (2) absence of drug-induced hepatitis, hepatitis C virus genotype 3 infection, hepatolenticular degeneration, and other specific diseases that could result in fatty liver; (3) mildly to moderately increased serum levels of transaminase and γ-glutamyl transpeptidase (γ-GT) (less than 5 times the upper limit of normal), usually presenting as an increase in alanine aminotransferase (ALT); (4) metabolic syndrome constituents, such as visceral obesity, hyperglycemia, blood lipid disorder, and hypertension; (5) imaging results meeting the diagnostic criteria of diffuse fatty liver; and (6) histological findings of liver biopsy meeting the pathological diagnostic criteria of fatty liver disease. Since liver biopsy was difficult to obtain, we used the liver imaging methods mentioned in the guideline. NAFLD was diagnosed if criteria 1-4 coexisted with criterion 5. The hepatic ultrasound examination was performed using a LOGIQ-E9 ultrasound system (General Electric Healthcare, Milwaukee, WI, United States).
Participants diagnosed with NAFLD formed the case group. Non-NAFLD controls were randomly selected from the same community during the study period to form the control group. Cases and controls were frequency-matched, so the gender and age distributions of the two groups were considered similar. Inclusion criteria were: (1) signed informed consent; and (2) age between 18 and 60 years. Exclusion criteria were: (1) use of antihypertensive, antidiabetic, lipid-lowering, or hypouricemic agents within 24 h before the physical examination; (2) infection, acute or chronic gastrointestinal diseases, autoimmune diseases, or malignant tumors; (3) a history of other viral hepatitis, alcoholic liver disease, or primary liver cancer; (4) excessive alcohol consumption (≥ 30 g/day in males and ≥ 20 g/day in females); (5) liver transplantation within the previous year or complications of advanced liver disease (varicose veins, ascites, etc.); (6) drug-induced fatty hepatitis; or (7) a history of psychiatric disorders.
After a literature review, we assumed that the frequency of gene mutation in the general population was 10-30%, the odds ratio (OR) was 1.5, the two-sided α was 0.05, and the power of the test (1-β) was 80%. Sample size was estimated with NCSS-PASS 11.0 software (Dawson edition; Kaysville, UT, United States). The sample size of 1,114 NAFLD cases and 1,909 controls in this study was considered large enough to yield reliable results.
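The assumptions above can be turned into a rough sample size check. The sketch below is illustrative only: the study used NCSS-PASS, whose exact procedure is not described here, whereas this code applies a textbook normal-approximation formula for an unmatched case-control design. The function name `cases_needed` and the 1:1 default control-to-case ratio are assumptions introduced for this example.

```python
from math import ceil
from statistics import NormalDist


def cases_needed(p0, odds_ratio, alpha=0.05, power=0.80, ratio=1.0):
    """Approximate number of cases for an unmatched case-control study.

    p0         : assumed variant frequency among controls
    odds_ratio : odds ratio the study should be able to detect
    ratio      : controls per case (assumed 1.0 here, not taken from the paper)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    # Variant frequency among cases implied by p0 and the odds ratio.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    # Frequency pooled over cases and controls, weighted by the ratio.
    p_bar = (p1 + ratio * p0) / (1 + ratio)
    n = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (ratio + 1)
         / (ratio * (p1 - p0) ** 2))
    return ceil(n)


# Evaluate both ends of the assumed 10-30% frequency range.
for freq in (0.10, 0.30):
    print(freq, cases_needed(freq, odds_ratio=1.5))
```

Under this approximation the lower end of the frequency range is the more demanding one, so it is the value to compare against the number of enrolled cases.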
The study was performed in accordance with the World Medical Association Declaration of Helsinki on ethical principles in medical research involving human subjects and was approved by the Institutional Ethics Review Committee of Nanjing Medical University (Nanjing, China). Written informed consent was obtained before blood test and genetic analysis.

Data and Blood Sample Collection
The demographic and clinical characteristics of all participants were collected from self-designed questionnaires and electronic medical records. All participants underwent abdominal ultrasound and blood biochemical tests. Five milliliters of ethylenediaminetetraacetic acid (EDTA)-anticoagulated venous blood were collected from fasting participants via the antecubital vein in the morning. The serum and blood cells in each blood sample were separated and frozen at −80 °C within 2 h until further serological tests and genotyping assays.
We used a magnetic bead method (blood genomic extraction kit; Pangu Genome Nanotechnology Co., Ltd.; Nanjing, China) to isolate genomic DNA from EDTA-anticoagulated blood samples. Genotyping was performed with a TaqMan allelic discrimination assay on the LightCycler 480 II Real-Time PCR System (Roche, Switzerland). Detailed information on primers and probes is shown in Supplementary Table 1. The following measures were implemented to control data quality: (1) genotyping was performed blind, so that all technicians were unaware of the clinical data of the participants; and (2) repeated experiments were conducted in a random 10% of samples, with a repeatability of 100%. The genotyping success rate of all SNPs was above 95%. All tests were carried out in accordance with the manufacturer's instructions.

In silico Analysis
To further explore the potential functions of the gene variants, we performed bioinformatics analysis using the following online databases: (1) NCBI dbSNP, to determine the chromosomal location of each variant and its transcriptional regulation information; (2) RegulomeDB, to view the scores of all variant sites, where SNPs scoring 1-3 might act in transcriptional regulation; (3) the UCSC Genome Browser, to check whether the variant sites were located on histone modification peaks; and (4) the RNAfold WebServer, to predict the effect of the positive SNPs on mRNA secondary structure.

Statistical Analyses
All data analyses were performed using IBM SPSS Statistics for Windows 23.0 (IBM Corp, Armonk, NY, United States) and R software v3.4.3 (http://www.r-project.org/). Distributions of demographic and clinical characteristics between the case and control groups were compared by the Chi-square (χ²) test (for categorical variables) or by Student's t-test or the Mann-Whitney U-test (for continuous variables). Logistic regression analysis with adjustment for gender, age, aspartate aminotransferase (AST), ALT, γ-GT, triglyceride (TG), total cholesterol (TC), and high-density lipoprotein cholesterol (HDL-C) was used to calculate the odds ratio (OR) and 95% confidence interval (CI) [...] (Woods et al., 2015). The combined effect of three independent SNPs (CYP24A1-rs2296241, CYP24A1-rs2248359, and CYP27B1-rs4646536) was analyzed using the Cochran-Armitage trend test. Subgroup analysis was performed for positive SNPs, and the Q-test was performed to assess the heterogeneity between subgroups. Multivariate stepwise logistic regression analysis was used to determine the independent predictive factors for NAFLD. A receiver operating characteristic (ROC) curve was used to represent the risk prediction model for NAFLD, with the area under the ROC curve (AUROC) indicating its predictive power. A two-tailed p-value < 0.05 was regarded as statistically significant in all analyses.
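To illustrate how an odds ratio and its confidence interval arise under a given genetic model, the sketch below computes an unadjusted OR with a Wald 95% CI from genotype counts. It is a simplified illustration rather than the study's procedure: the reported estimates come from covariate-adjusted logistic regression in SPSS and R, the genotype counts used here are hypothetical, and the function name `odds_ratio_ci` is an assumption introduced for this example.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(cases, controls, model="recessive", level=0.95):
    """Unadjusted OR and Wald CI from genotype counts.

    cases / controls : (reference homozygote, heterozygote, variant homozygote)
    model            : "recessive" compares variant homozygotes with the rest;
                       "dominant" compares variant carriers with reference homozygotes
    """
    if model == "recessive":
        a, b = cases[2], cases[0] + cases[1]
        c, d = controls[2], controls[0] + controls[1]
    elif model == "dominant":
        a, b = cases[1] + cases[2], cases[0]
        c, d = controls[1] + controls[2], controls[0]
    else:
        raise ValueError("model must be 'recessive' or 'dominant'")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical genotype counts, not the study data.
hypothetical_cases = (400, 500, 200)
hypothetical_controls = (800, 850, 250)
print(odds_ratio_ci(hypothetical_cases, hypothetical_controls, "recessive"))
```

An adjusted analysis would instead fit a logistic model with the genotype indicator and the covariates listed above, and exponentiate the genotype coefficient and its confidence limits.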

Basic Characteristics of Study Subjects
A total of 3,023 participants were enrolled in our study, including 1,114 NAFLD cases and 1,909 controls. The distribution of demographic and clinical characteristics in the two study groups is summarized in Table 1. No significant differences were observed in age or gender between the control and NAFLD groups (all p > 0.05). However, there were significant differences in body mass index (BMI), waist circumference (WC), systolic blood pressure (SBP), diastolic blood pressure (DBP), TG, TC, HDL-C, low-density lipoprotein cholesterol (LDL-C), glucose (GLU), γ-GT, ALT, AST, direct bilirubin (DBIL), and total bilirubin (TBIL) (all p < 0.001).

Associations Between CYP2R1, CYP24A1, and CYP27B1 SNPs and NAFLD Risk
The genotype distributions of the 11 SNPs between the two study groups and the results of the logistic regression analysis are shown in [...]
Table note: Variables are numbers of combined unfavorable alleles (rs2296241-A, rs2248359-T, and rs4646536-T). Bold type indicates statistically significant results. a Logistic regression analyses adjusted for gender, age, AST, ALT, γ-GT, TG, TC, and HDL-C. b The p-trend value was analyzed by the Cochran-Armitage trend test.
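The combined-effect analysis relies on the Cochran-Armitage trend test, which asks whether the proportion of cases rises steadily across groups ordered by the number of unfavorable alleles. The following is a minimal sketch of the standard two-sided form of the test, assuming equally spaced scores by default. The function name `cochran_armitage` and the counts in the example are hypothetical and are not taken from the study tables.

```python
from math import sqrt
from statistics import NormalDist


def cochran_armitage(cases, totals, scores=None):
    """Two-sided Cochran-Armitage trend test.

    cases[i]  : number of cases in allele-count group i
    totals[i] : number of subjects (cases plus controls) in group i
    scores[i] : ordinal score of group i (defaults to 0, 1, 2, ...)
    Returns (z statistic, two-sided p-value).
    """
    if scores is None:
        scores = list(range(len(totals)))
    n_total = sum(totals)
    p_case = sum(cases) / n_total
    # Score-weighted deviation of observed from expected case counts.
    t_stat = sum(s * (r - n * p_case) for s, r, n in zip(scores, cases, totals))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_case * (1 - p_case) * (sum_ns2 - sum_ns ** 2 / n_total)
    z = t_stat / sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts for three ordered allele-load groups.
z, p = cochran_armitage(cases=[10, 20, 30], totals=[100, 100, 100])
print(z, p)
```

A positive z indicates that the case proportion increases with the group score, which is the direction reported for the unfavorable alleles.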

Functions of Positive SNPs
The RegulomeDB scores for CYP24A1-rs2296241, CYP24A1-rs2248359, and CYP27B1-rs4646536 were 4, 4, and 1d, respectively (http://www.regulomedb.org/results). The UCSC prediction showed that rs2296241, rs2248359, and rs4646536 were all enriched near the H3K4me1 marker (Figure 2). The effect of rs2296241, rs2248359, and rs4646536 on the secondary structure of CYP24A1 and CYP27B1 mRNA, as predicted by the RNAfold WebServer, is shown in Figure 2. The arrows indicate the position of the mutation (50 bases upstream and 50 bases downstream of the mutation). The minimum free energies (MFE) of the G and A alleles of CYP24A1-rs2296241 were estimated at −10.9 and −7.5 kcal/mol, respectively. The MFE of the C and T alleles of CYP24A1-rs2248359 were estimated at −21.5 and −18.6 kcal/mol, respectively. The MFE of the A and T alleles of CYP27B1-rs4646536 were both estimated at −30.5 kcal/mol.
FIGURE 1 | The ROC curve for the influence factors of Table 4. Abbreviations: NAFLD, non-alcoholic fatty liver disease; AUROC, area under the receiver operating curve. The response variable is NAFLD risk and the diagnostic test variable is a combination of age, visceral obesity, ALT, γ-GT, hypertriglyceridemia, hypertension, low HDL-C, hyperglycemia, and unfavorable alleles with the coefficients taken from the regression analysis.

DISCUSSION
In this case-control study, we explored the associations of genetic polymorphisms in the vitamin D family genes CYP2R1, CYP24A1, and CYP27B1 with the risk of NAFLD in the Chinese Han population. Our results showed that the unfavorable alleles CYP24A1 rs2296241-A, rs2248359-T, and CYP27B1 rs4646536-T were associated with an increased risk of NAFLD. The combination of clinical factors and unfavorable alleles exhibited a desirable predictive value for the risk of NAFLD.
NAFLD is a complex metabolic disorder closely associated with obesity, T2DM, and metabolic syndrome (Younossi, 2019). Vitamin D is a pleiotropic hormone involved in immune-inflammatory and metabolic processes (Charoenngam and Holick, 2020). Accumulating evidence suggests that vitamin D deficiency is highly prevalent among the general Asian population (Cheng et al., 2017; Cho et al., 2019). Several recent meta-analyses have suggested that low vitamin D levels may affect the progression of NAFLD, while in chronic liver diseases, enzymatic conversion (hydroxylation) is disrupted in the liver through a variety of mechanisms, leading to low serum vitamin D levels (Zhu and DeLuca, 2012; Zhang et al., 2019; Liu et al., 2020). SNPs in genes involved in the vitamin D metabolic process could affect vitamin D status. Associations of CYP2R1, CYP24A1, and CYP27B1 gene polymorphisms with vitamin D deficiency have been reported in multiple populations (Gibson et al., 2018; Arai et al., 2019). In this study, we genotyped three target genes (CYP2R1, CYP24A1, and CYP27B1) encoding hydroxylases in the vitamin D metabolic pathway. No relationship was found between CYP2R1 and NAFLD. Similar negative results were found in previous studies among different populations (Gibson et al., 2018). In contrast, our results showed that genetic variants of CYP24A1 and CYP27B1 were associated with an increased risk of NAFLD. To our knowledge, this is the first study revealing a relationship between CYP24A1 and CYP27B1 SNPs and NAFLD risk.
Although no previous studies have shown a relationship between polymorphisms of CYP24A1 or CYP27B1 and NAFLD, these variants have been extensively investigated in other diseases, such as organ-specific autoimmune endocrine diseases (Ma et al., 2020), CVD (Qian et al., 2020), metabolic diseases (Yu et al., 2020), and multiple cancers (Hibler et al., 2015; Torkko et al., 2020). Our results also showed that participants carrying a greater combined load of unfavorable alleles (rs2296241-A, rs2248359-T, and rs4646536-T) had a higher NAFLD risk, in a dose-dependent manner. These findings may provide useful information for risk assessment or possible diagnostic markers for NAFLD.
FIGURE 2 | Continued. The arrows indicate the position of the mutation (50 bases upstream and 50 bases downstream of the mutation). The minimum free energies (MFE) for the G and A alleles of rs2296241 were estimated at −10.9 and −7.5 kcal/mol, respectively. The MFE for the C and T alleles of rs2248359 were estimated at −21.5 and −18.6 kcal/mol, respectively. The MFE for the A and T alleles of rs4646536 were both estimated at −30.5 kcal/mol by the RNAfold WebServer.
Frontiers in Genetics | www.frontiersin.org | August 2021 | Volume 12 | Article 717533
In the model combining unfavorable alleles with other clinical factors (binary age, visceral obesity, ALT, γ-GT, hypertriglyceridemia, hypertension, low HDL-C, hyperglycemia), gender was not significant, but the remaining nine were independent risk factors of NAFLD. These results are largely consistent with NAFLD guidelines and some previous studies. Abnormal ALT and γ-GT levels, hypertriglyceridemia, hypertension, low HDL-C levels, and hyperglycemia have been recognized as predictors of metabolic syndrome (National Workshop on Fatty Liver and Alcoholic Liver Disease, Chinese Society of Hepatology, Association et al., 2018). Age and BMI are established predictors of liver fibrosis and have been included in the NAFLD fibrosis score formula (Angulo et al., 2007). WC and γ-GT are included in the non-invasive Fatty Liver Index (FLI) model for risk prediction of NAFLD (Bedogni et al., 2006). Moreover, the AUROC of our model combining the above nine variables was 0.789, indicating a desirable predictive value. Given this potential, the predictive model combining clinical and genetic factors might provide a new avenue for early screening in high-risk populations. Cost-benefit analysis should be considered in future studies.
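For readers less familiar with the AUROC, it can be read as the probability that a randomly chosen NAFLD case receives a higher predicted risk than a randomly chosen control. The sketch below computes this quantity directly from pairwise comparisons. It is an illustration under stated assumptions: the function name `auroc` and the predicted scores are hypothetical, and the study's own ROC analysis was carried out in SPSS and R rather than with this code.

```python
def auroc(case_scores, control_scores):
    """AUROC as the share of case-control pairs ranked correctly.

    A pair counts as 1 when the case has the higher score and as 0.5
    when the two scores are tied.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks from a fitted model, not study data.
hypothetical_case_risks = [0.9, 0.8, 0.4]
hypothetical_control_risks = [0.7, 0.3, 0.2]
print(auroc(hypothetical_case_risks, hypothetical_control_risks))
```

A value of 0.5 corresponds to a model that ranks cases and controls no better than chance, and 1.0 to perfect separation.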
In this study, we used multiple bioinformatics databases to predict the function of positive SNPs. The RegulomeDB scores for CYP24A1-rs2296241, CYP24A1-rs2248359, and CYP27B1-rs4646536 were 4, 4, and 1d, respectively, which indicated that these loci have potential functions and may regulate the expression of CYP24A1 and CYP27B1 by changing multiple regulatory motifs and interfering with protein-binding activity (Boyle et al., 2012). The UCSC Genome Browser results showed that rs2296241, rs2248359, and rs4646536 were involved in promoter and enhancer modification in different cell lines, especially in the vicinity of enhancer elements (H3K4me1 marker) of multiple cell lines. Moreover, they were also related to changes in transcription factor-binding modules. In addition, the MFEs of the rs2296241 G and A alleles (−10.9 vs. −7.5 kcal/mol) and of the rs2248359 C and T alleles (−21.5 vs. −18.6 kcal/mol) were different, suggesting that mutation of rs2296241 and rs2248359 may affect the transcription of CYP24A1. Further research is warranted to identify the function of these polymorphisms in the vitamin D metabolic pathway.
Several potential limitations need to be considered. First, this is a single-center study, and the study population may not be fully representative. In response, gender and age were matched in the design stage, and multivariate and stratified analyses were carried out to control the influence of confounding factors. Second, we only chose three hydroxylase genes in the vitamin D metabolic pathway, which may not fully capture the relationship between genetic variants and NAFLD risk. More genetic loci are needed to confirm the effect of genetic variation on the risk of NAFLD. It is necessary to further explore the impact of polygenic loci and their combination with other environmental factors on NAFLD risk in a multicenter population of different races.

CONCLUSION
CYP24A1 (rs2296241, rs2248359) and CYP27B1 (rs4646536) variants are associated with a high risk of NAFLD in the Chinese Han population. The combination of unfavorable SNPs and metabolic-related indicators shows high efficiency in predicting the risk of NAFLD. These findings might provide new insight into NAFLD pathogenesis and a new tool for early screening of high-risk populations.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Ethics Review Committee of Nanjing Medical University (Nanjing, China). The patients/participants provided their written informed consent to participate in this study.