Genetic polymorphisms in CYP4F2 may be associated with lung cancer risk among females and non-smokers in the Chinese population

Background Our study aimed to explore the potential association of CYP4F2 gene polymorphisms with lung cancer (LC) risk. Methods Five variants in CYP4F2 were genotyped using Agena MassARRAY in 507 cases and 505 controls. Genetic models and haplotypes based on logistic regression analysis were used to evaluate the potential association between CYP4F2 polymorphisms and LC susceptibility. Results rs12459936 was linked to an increased risk of LC in non-smoking participants (allele: OR = 1.38, p = 0.035; homozygote: OR = 2.00, p = 0.035; additive: OR = 1.40, p = 0.034) and in females (allele: OR = 1.64, p = 0.002; homozygote: OR = 2.57, p = 0.006; heterozygous: OR = 2.56, p = 0.001; dominant: OR = 2.56, p < 0.002; additive: OR = 1.67, p = 0.002). Conversely, LC risk was significantly decreased for rs3093110 in non-smoking participants (heterozygous: OR = 0.56, p = 0.027; dominant: OR = 0.58, p = 0.035), and for rs3093193 (allele: OR = 0.66, p = 0.016; homozygote: OR = 0.33, p = 0.011; recessive: OR = 0.38, p = 0.021; additive: OR = 0.64, p = 0.014), rs3093144 (recessive: OR = 0.20, p = 0.045), and rs3093110 (allele: OR = 0.54, p = 0.010; heterozygous: OR = 0.50, p = 0.014; dominant: OR = 0.49, p = 0.010; additive: OR = 0.54, p = 0.011) in females. Conclusions The study showed that CYP4F2 variants were associated with LC susceptibility, with evidence suggesting that this association may be modified by gender and smoking status.


Introduction
Lung cancer (LC) has been one of the most common causes of cancer-related death worldwide over the past few decades, with an estimated 2.1 million new diagnoses in 2018, accounting for 12% of all new cancer cases (1). In recent years, the incidence of LC in China has followed the global trend and risen rapidly, and LC has become the main cause of cancer-related death in China (2). LC mortality in China is predicted to increase by about 40% between 2015 and 2030 (3). Despite advances in early detection, most LC patients are diagnosed at a late stage, resulting in a 5-year overall survival rate of only 10% to 15% (4). The burden of LC on society is growing and cannot be ignored. Various factors can predispose people to LC, with smoking being the most prevalent. Other potential risk factors include gender, age, race, ethnicity, and especially single nucleotide polymorphisms (SNPs) (5,6).
Cytochrome P450s (CYPs) are phase I drug-metabolizing enzymes; the human genome encodes 57 CYP proteins, which are responsible for the metabolism of numerous endogenous and xenobiotic compounds (7). CYP4F2, a member of the CYP450 superfamily, encodes an ω-hydroxylase that catalyzes the first step of the vitamin E metabolic pathway (8), as well as the ω-hydroxylation of arachidonic acid (AA) to generate 20-hydroxyeicosatetraenoic acid (20-HETE) (9). 20-HETE is known to promote tumorigenesis by increasing a variety of proinflammatory mediators, cytokines, and chemokines. Previous studies have demonstrated that elevated expression of CYP4F2 enzymes and 20-HETE is closely related to ovarian cancer (10). We hypothesized that CYP4F2 might be involved in tumorigenesis and tumor development by accelerating the production of 20-HETE. Additionally, Geng et al. reported that rs1558139 and rs2108622 of CYP4F2 are associated with hypertension, and that the association between rs1558139 and hypertension is particularly strong in men (11). Nevertheless, studies investigating the association between CYP4F2 polymorphisms and LC risk are lacking.
In this case-control study, five SNPs (rs3093203, rs3093144, rs12459936, rs3093110, and rs3093193) in CYP4F2 were genotyped on the Agena MassARRAY platform. Gender- and smoking-stratified analyses of the correlation between CYP4F2 variants and LC risk were performed.

Study subjects
A total of 507 newly diagnosed LC patients (353 males and 154 females) were randomly recruited from the Second Affiliated Hospital of Xi'an Jiaotong University for this case-control association analysis of CYP4F2 polymorphisms and LC risk. None of the patients had a history of any other cancer or had received chemotherapy before blood sampling. The control group comprised 505 unrelated healthy individuals (354 males and 151 females) from the physical examination center of the hospital. Information about all subjects, including age, gender, height (cm), weight (kg), smoking status, drinking status, tumor stage, and lymph node metastasis, was collected from questionnaires and clinical data. Peripheral blood samples were collected from all study subjects into vacutainer tubes containing EDTA. Genomic DNA was then isolated from the blood samples using the GoldMag-Mini Purification Kit (GoldMag Co. Ltd., Xi'an, China) and stored at −80°C. DNA concentration and purity were determined with a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA).

SNP selection and genotyping
In this study, five SNPs (rs3093203, rs3093144, rs12459936, rs3093110, and rs3093193) in CYP4F2 were selected according to previously published studies on the association between CYP4F2 polymorphisms and disease susceptibility (12)(13)(14). The genotype distributions of the candidate SNPs in controls met Hardy-Weinberg equilibrium (HWE) (p >0.05). All the candidate SNPs had a minor allele frequency (MAF) of >5% in the Han Chinese in Beijing (CHB) population from the 1000 Genomes Project (http://www.internationalgenome.org/). The primers for the five SNPs were designed with Agena Bioscience Assay Design Suite version 2.0 software. The polymorphisms were genotyped using the Agena MassARRAY platform (Agena Bioscience, San Diego, CA, USA) with iPLEX Gold chemistry. Agena Bioscience TYPER version 4.0 software was used for data management and analysis of the genotyping results.
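The HWE check mentioned above can be illustrated with a short sketch. This is a generic 1-degree-of-freedom chi-square goodness-of-fit test, not necessarily the exact procedure used in the study; the function name `hwe_chi_square` and the genotype counts in the usage note are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts (AA, AB, BB) and returns (chi2, p_value).
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, with hypothetical control counts of 25, 50, and 25, the observed and expected counts coincide, so the statistic is 0 and the p-value is 1, which would satisfy the p >0.05 criterion.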

Expression analysis
We extracted the data for CYP4F2 expression in normal lung tissues and lung squamous cell carcinoma (LUSC) tissues under different subgroups from the TCGA database and analyzed them via UALCAN (http://ualcan.path.uab.edu/index.html), which is an interactive web resource for tumor subgroup gene expression analysis and survival analysis.

Statistical analysis
SPSS version 22.0 software was used for statistical analysis. HWE was assessed in the control group by the chi-square test. Differences in the continuous characteristic (age) and the categorical variable (gender) between patients with LC and controls were assessed by Student's t-test and the Pearson chi-square test, respectively. The correlation between CYP4F2 variants and LC susceptibility was evaluated by logistic regression analysis adjusted for age and gender using PLINK software (version 1.07) under multiple genetic models (allele, genotype, dominant, recessive, and additive). Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated to assess the relationship between CYP4F2 SNPs and LC risk (OR = 1: no impact; OR <1: protective factor; OR >1: risk factor). Finally, PLINK (version 1.07) and Haploview (version 4.2) software were used to analyze the pairwise linkage disequilibrium (LD) among the five SNPs and to generate an LD map based on D' and r-squared values. The SNPStats software (https://www.snpstats.net/start.htm) was used to estimate the correlation between CYP4F2 haplotypes and LC risk. All p-values were two-sided, and p <0.05 was considered statistically significant.
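As a rough illustration of how an OR and its 95% CI are interpreted, the following sketch computes an unadjusted OR with a Wald interval from a 2 × 2 table of allele counts. It is a simplification: the study used logistic regression adjusted for age and gender in PLINK, which this code does not reproduce. The function name `odds_ratio_ci` and the counts in the usage note are hypothetical.

```python
import math

def odds_ratio_ci(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Arguments are allele counts in cases and controls; z = 1.96 gives
    an approximate 95% interval. All four counts must be positive.
    """
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / case_minor + 1 / case_major
                   + 1 / ctrl_minor + 1 / ctrl_major)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

With hypothetical counts of 20 minor and 80 major alleles in cases versus 10 and 90 in controls, the OR is 2.25. Whether such an allele would be called a risk factor depends on whether the interval excludes 1.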

Participant characteristics
The mean ages of 507 LC patients and 505 unrelated healthy controls were 61.30 ± 8.32 years and 58.91 ± 9.58 years, respectively (Table 1). In our study, there were no statistically significant differences in age (p = 0.525) and gender (p = 0.870) distribution between cases and controls.

Basic information about the selected SNPs in CYP4F2
The basic information about the five SNPs in CYP4F2 (rs3093203, rs3093144, rs12459936, rs3093110, and rs3093193) among cases and controls is shown in Table 2, including gene, SNP ID, position, alleles, HWE, and OR (95% CI). The five SNPs in controls were in accordance with HWE (p >0.05). We further evaluated the association between the five SNPs and LC susceptibility by logistic regression (Table 2). Four genetic models (genotype, dominant, recessive, and additive) were also applied to analyze the association by logistic regression adjusted for age and gender (Table S1). No significant association was found between these five SNPs in CYP4F2 and LC susceptibility under the allelic or genetic models.
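The dominant, recessive, and additive models differ only in how a genotype is coded as a predictor before regression. The sketch below shows the conventional coding in terms of the number of minor alleles carried. It is assumed, not confirmed by the text, that the analysis followed this standard convention, and the function name `encode_genotype` is hypothetical.

```python
def encode_genotype(minor_allele_count, model):
    """Code a genotype (0, 1, or 2 minor alleles) under a genetic model.

    additive:  0, 1, 2 (per-allele effect)
    dominant:  1 if at least one minor allele is carried, else 0
    recessive: 1 only if two minor alleles are carried, else 0
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        return minor_allele_count
    if model == "dominant":
        return int(minor_allele_count >= 1)
    if model == "recessive":
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown model: {model}")
```

Under this coding a heterozygote contributes 1 under the additive and dominant models and 0 under the recessive model, which is why the same SNP can reach significance under one model but not another.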

Stratification analysis by smoking status
The smoking-stratified analysis (Table 3)

Stratification analysis by gender
In addition, the analysis stratified by gender (

Bioinformatics analysis of CYP4F2 expression in LC
The expression level of CYP4F2 in normal and LUSC tissues and its effect on patient survival were analyzed using the UALCAN online tool based on the TCGA database, as shown in Figure 2. The expression level of CYP4F2 differed significantly between normal and LUSC tissues (p <0.001). In addition, the expression level of CYP4F2 was higher in non-smoking LUSC patients than in normal tissues and in smoking patients (p <0.001). The expression level was higher in males than in females (p <0.001). Moreover, a high expression level of CYP4F2 was significantly related to poor prognosis in non-smoking LUSC patients (p = 0.033).

Discussion
In our study, the association between five variants in CYP4F2 and LC risk in the Chinese Han population was examined. Association analyses revealed that CYP4F2 rs12459936 increased susceptibility to LC in non-smoking individuals and females. In contrast, rs3093110 showed a protective effect on LC susceptibility in non-smokers and females. Two further SNPs (rs3093193 and rs3093144) were also associated with a decreased risk of LC in females.
The CYP4F2 gene, a member of the CYP450 superfamily located on chromosome 19p13.12, has been shown to be expressed at higher levels in certain types of cancerous tissues, such as thyroid, ovarian, breast, and colon (10). Eun et al. reported that low expression of CYP4F2 may contribute to the progression of hepatocellular carcinoma (HCC) and decrease survival rates due to its involvement in various metabolic pathways (15). A similar study showed that CYP4F2 expression was higher in pancreatic ductal adenocarcinoma (PDA) patients than in normal controls and was negatively correlated with age (16). Database prediction found that CYP4F2 was highly expressed in lung cancer tissues. The expression of CYP4F2 was higher in men than in women and higher in non-smokers than in smokers. (18), while another study has revealed that the antagonist of 20-HETE, WIT002, is able to inhibit tumor growth in a renal cell carcinoma cell line (19). This suggests that CYP4F2 polymorphisms may be related to susceptibility to LC by affecting the metabolism of 20-HETE, although further verification is required. Studies have also indicated a significant association between CYP4F2 polymorphisms and a variety of diseases, including ischemic stroke and other cardiovascular and cerebrovascular diseases (12,20). Our study focused on the association between CYP4F2 polymorphisms and susceptibility to LC. Five sites were selected for statistical analyses: rs3093203, rs3093193, rs12459936, rs3093144, and rs3093110. However, none of these loci was significantly associated with LC susceptibility in the overall population under the allelic model or any of the genetic models. A true association with LC risk may have gone undetected owing to the limited sample size. To further examine factors that may modify the association, we conducted stratified analyses.
Tobacco has long been recognized as an independent risk factor for tumorigenesis, as it contains many carcinogens, such as nitrosamines, polycyclic aromatic hydrocarbons, and volatile organic compounds (21). However, our analysis stratified by smoking revealed that the rs12459936 and rs3093110 loci were significantly associated with LC susceptibility in the non-smoking population (increased risk for rs12459936 and decreased risk for rs3093110) but not in the smoking population.
In addition, gender has been found to have a notable impact on the toxicity of, and response to, therapeutic treatments in many types of cancer. The underlying cause of this difference is likely a complex interplay of several factors, including sex hormones, which have been shown to affect the self-renewal of tumor stem cells, the tumor microenvironment, the immune system, and metabolism (22). It is well established that the immune system differs considerably between men and women. In general, women have a stronger immune system than men, leading to distinct sex-based differences in both innate and adaptive immune responses. These disparities likely contribute to differences in cancer susceptibility between males and females (23). In our analysis stratified by gender, rs3093193 and rs3093110 had a protective role against LC in females, whereas rs12459936 was associated with an increased risk.
Taken together, our study observed that variants in CYP4F2 were associated with LC susceptibility. However, our research had some limitations. First, the potential functional implications of CYP4F2 polymorphisms were not addressed in this study. The expression data for CYP4F2 in LC cases were sourced from a public database. To properly elucidate the genetic mechanism of CYP4F2 in LC, expression analysis of CYP4F2 mRNA and annotation of the functional significance of variants are necessary. Second, the sample size was relatively small. In future work, we will address these gaps and expand the sample size to explore the molecular mechanism by which CYP4F2 polymorphisms affect the development of LC.
Figure caption: Haplotype block map for SNPs in the CYP4F2 gene. The numbers inside the diamonds indicate the D′ value × 100 for pairwise analyses.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

Ethics statement
Our study complied with the Declaration of Helsinki, and the study protocol was approved by the Second Affiliated Hospital of Xi'an Jiaotong University. All participants were informed about the study and provided written informed consent.

Author contributions
HS designed this study and drafted the manuscript. YZ performed the DNA extraction and genotyping. YW revised the manuscript and performed the data analysis. PF and YL performed sample collection and information recording. HS conceived and supervised the study. All authors contributed to the article and approved the submitted version.