Causal associations between dietary factors and colorectal cancer risk: a Mendelian randomization study

Background: Previous epidemiological studies have found a link between colorectal cancer (CRC) and human dietary habits. However, the inherent limitations and unavoidable confounding factors of observational studies may lead to inaccurate or doubtful results, and the causal effect of dietary factors on CRC remains elusive.
Methods: We conducted two-sample Mendelian randomization (MR) analyses using datasets from the IEU Open GWAS project. The exposure datasets included alcoholic drinks per week, processed meat intake, beef intake, poultry intake, oily fish intake, non-oily fish intake, lamb/mutton intake, pork intake, cheese intake, bread intake, tea intake, coffee intake, cooked vegetable intake, cereal intake, fresh fruit intake, salad/raw vegetable intake, and dried fruit intake. The inverse variance weighted (IVW) method was employed as the primary analytical approach. The weighted median, MR-Egger, weighted mode, and simple mode methods were applied as complementary analyses. Heterogeneity and pleiotropy analyses were performed to assess the robustness of the results.
Results: The MR results revealed that alcoholic drinks per week [odds ratio (OR): 1.565; 95% confidence interval (CI): 1.068–2.293; p = 0.022], non-oily fish intake (OR: 0.286; 95% CI: 0.095–0.860; p = 0.026), fresh fruit intake (OR: 0.513; 95% CI: 0.273–0.964; p = 0.038), cereal intake (OR: 0.435; 95% CI: 0.253–0.746; p = 0.003), and dried fruit intake (OR: 0.522; 95% CI: 0.311–0.875; p = 0.014) were causally associated with the risk of CRC. No other significant relationships were found. The sensitivity analyses suggested the absence of heterogeneity or pleiotropy, supporting the reliability of the MR results.
Conclusion: This study indicated that alcoholic drinks were associated with an increased risk of CRC, while non-oily fish intake, fresh fruit intake, cereal intake, and dried fruit intake were associated with a decreased risk of CRC. It also indicated that the other dietary factors included in this research were not associated with CRC. To our knowledge, the current study is the first to establish the link between comprehensive diet-related factors and CRC at the genetic level, offering novel clues for interpreting the genetic etiology of CRC and providing new perspectives for the clinical practice of gastrointestinal disease prevention.


Introduction
Colorectal carcinoma (CRC) is the third most commonly diagnosed cancer worldwide, accounting for 9.4% of cancer-related fatalities globally (1). CRC patients exhibit clinical manifestations including changes in bowel habits, occult or overt rectal bleeding, abdominal pain, and anemia. However, in the early phase, patients are primarily asymptomatic or exhibit minor symptoms resembling those of common bowel diseases. By the time perceptible abnormalities appear, the cancer has often already progressed to an advanced stage or even metastasized. The 5-year survival rate decreases from approximately 90% for localized primary tumors to 14% for metastatic CRC (2). With the incidence increasing constantly worldwide (1), CRC poses a significant challenge to public health globally. Individuals affected by CRC, including patients and their families, face both physical and financial adversity (3). Furthermore, CRC patients face psychological distress, including anxiety and depression (4). Over time, these prolonged physical and mental problems may worsen patients' quality of life. In addition, this disease not only presents a severe threat to personal health but also consumes substantial social and medical resources and heavily burdens society and healthcare systems (5). Clarifying the pathogenesis and etiology, including potential risk and protective factors, is of great significance for the clinical practice of disease prevention and management.
Although the cause of CRC is still unclear, several studies have revealed risk factors involved in the progression of this gastrointestinal disease. The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) indicated that the incidence rates of CRC increased with age, surging particularly in individuals aged 50-54 years and older (6). Additionally, a genome-wide association study identified 155 high-confidence effector genes that were functionally related to CRC risk, such as ARHGEF4, GNA12, LRIG1, GAB1, and CNIH2. These genes have multiple functions and affect tumor biology through various biological processes, including proliferation, homeostasis, migration, cell adhesion, immunity, and microbial interactions (7). Previous studies also found that environmental risk factors, such as sedentary behavior (RR: 1.30; 95% CI: 1.22-1.39) (8) and smoking (RR: 1.17; 95% CI: 1.15-1.20) (9), could potentially affect the risk of CRC. Notably, in the realm of diet and nutrition, many experimental and epidemiological studies have made significant findings. For instance, diets low in milk or calcium have been identified as primary contributors to CRC disability-adjusted life years (6). Moreover, nutritional supplements, such as omega-3 and arginine supplementation, have been found to modify the risk of CRC development (10).
According to previous studies, alcohol intake (11), red meat intake, processed meat intake (12), vegetable intake, and fruit intake (13) were associated with the pathophysiology of CRC. The potential mechanisms are complicated and may involve direct biological effects on epithelial cells, modifications of inflammatory and immune reactions, and diet-induced regulation of the composition and abundance of human gut microbes (14).
Current observational and meta-analysis studies on dietary factors and CRC face inherent limitations. Firstly, the sample sizes are typically small, affecting the reliability of results. Additionally, potential confounders may interfere with the interpretation of findings. Due to these factors, it is challenging for these studies to conclusively demonstrate the epidemiological link between dietary habits and CRC risk. Hence, more robust and high-quality evidence is necessary to bridge the existing research gap.
Since the relationship between dietary factors and CRC has not been explored using genetic instruments, we hypothesized that there was a causal association between dietary factors and CRC. Similar in logic to randomized controlled trials, Mendelian randomization (MR) is a novel research method that uses single-nucleotide polymorphisms (SNPs) as instrumental variables (IVs) to infer causal relationships between risk factors and health outcomes (15). This methodology draws upon Mendel's second law of genetics. It involves categorizing the study cohort according to the presence of specific genetic variants and subsequently comparing the occurrence of outcomes across these categories (16). SNPs are randomly allocated during meiosis. This helps to reduce the influence of confounding factors and the possibility of reverse causation, as genetic variants exist before the onset of the disease (17,18). Recent MR studies suggest that dietary habits have a significant effect on several cardiovascular diseases (16) and five major mental disorders (19). Through MR studies, more diet-related factors for various diseases could be investigated. Herein, we used a two-sample MR design to investigate the possible association of CRC with dietary factors.

Study design
A flowchart (Figure 1) presents our study design concisely, including the procedure of selecting IVs, conducting MR analyses using five methods, and carrying out sensitivity analyses. To provide a better understanding of our study design, it is important to detail the foundation of MR, which consists of three essential assumptions. The first assumption is that the SNPs employed as IVs are closely related to the exposure. The second assumption is that the selected IVs are not associated with any confounding factors. The third assumption requires that the genetic variants affect the risk of the health outcome only via the exposure of interest (15). When these three assumptions hold, the MR results are not distorted by confounding factors such as population characteristics, environment, and socioeconomic status. Also, because genetic variation is determined before the outcome occurs, reverse causality can be ruled out, thus compensating for the limitations of traditional methods. The two-sample MR analysis was performed to identify the causal relationship between traits using publicly available genetic datasets from several genome-wide association studies (GWAS).
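As a brief illustration of how the three assumptions translate into an estimate, the per-SNP causal effect in a two-sample design is commonly written as a Wald ratio. The notation below is an assumption introduced here for illustration and does not come from the original text: for SNP j, the SNP-exposure effect is \hat{\beta}_{X,j} and the SNP-outcome effect (on the log-odds scale) is \hat{\beta}_{Y,j}. The standard error shown is the usual first-order approximation, which ignores uncertainty in the SNP-exposure estimate.

```latex
\hat{\theta}_j = \frac{\hat{\beta}_{Y,j}}{\hat{\beta}_{X,j}},
\qquad
\operatorname{se}\!\left(\hat{\theta}_j\right) \approx
\frac{\operatorname{se}\!\left(\hat{\beta}_{Y,j}\right)}{\left\lvert \hat{\beta}_{X,j} \right\rvert},
\qquad
\mathrm{OR}_j = \exp\!\left(\hat{\theta}_j\right)
```

The ratio is interpretable as a causal effect only if the SNP is associated with the exposure, is independent of confounders, and affects the outcome solely through the exposure.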

Data sources
Dietary factors employed in our study covered drinks intake (alcoholic drinks per week, tea intake, and coffee intake), vegetable and fruit intake (salad/raw vegetable intake, cooked vegetable intake, fresh fruit intake, and dried fruit intake), meat intake (pork intake, beef intake, lamb/mutton intake, poultry intake, oily fish intake, non-oily fish intake, and processed meat intake), staple diet intake (bread intake and cereal intake), and dairy product intake (cheese intake). These GWAS summary-level data originated from the UK Biobank and were obtained via the IEU Open GWAS project, supported by the MRC Integrative Epidemiology Unit (IEU) at the University of Bristol. The GWAS summary-level data for CRC originated from the European Bioinformatics Institute and were likewise extracted via the IEU Open GWAS project. More information about the original datasets is shown in Table 1 and Supplementary Table S1. All the data used in this work are publicly available and were obtained from studies with the consent of participants and the relevant ethical approval. As a result, this study did not require the ethical approval of an institutional review board.

Genetic variants
To meet the three assumptions of the MR analysis, the following quality control steps were applied to screen the related SNPs. We selected SNPs closely associated with the dietary factors at the genome-wide significance level (p < 5 × 10⁻⁸). We then performed clumping [distance window of 10,000 kb, linkage disequilibrium (LD) coefficient r² < 0.001] to avoid LD between SNPs and to ensure the independence of the genetic variants (20). If an SNP associated with a dietary factor was not found in the CRC dataset, proxy SNPs were allowed with a minimum LD R² = 0.8 (21). Palindromic SNPs were retained only if the minor allele frequency (MAF) was < 0.3 (22). Notably, if the allele frequency of such an SNP is close to 0.5, the minor allele can hardly be pinpointed exactly, as there is sampling variance around the allele frequency. To enhance the accuracy of our study, we excluded such SNPs at the outset of the MR analyses. In addition, to measure the strength of the screened IVs and ensure their close relationships with the exposures, we calculated the F-statistic and the proportion of variance explained (R²) for each SNP. Genetic variants with F-statistics < 10 were considered weak instruments and were removed from our MR analysis (23). Finally, MR-PRESSO tests were employed to detect potential horizontal pleiotropy, and any identified outliers were excluded to prevent the impact of pleiotropy (24).
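The screening rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline (which used R): the `Snp` record, the function names, and the toy SNPs are hypothetical, the F-statistic uses the common single-SNP approximation F = (beta/se)², R² is approximated as F/(N − 2 + F), and LD clumping is omitted because it requires an LD reference panel.

```python
from dataclasses import dataclass

@dataclass
class Snp:
    rsid: str
    effect_allele: str
    other_allele: str
    eaf: float    # effect allele frequency in the exposure GWAS
    beta: float   # SNP-exposure effect estimate
    se: float     # standard error of beta
    pval: float   # SNP-exposure association p-value

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def f_statistic(snp):
    # Assumed approximation for a single SNP: F = (beta / se)^2
    return (snp.beta / snp.se) ** 2

def r_squared(snp, n):
    # Assumed approximation of variance explained, given sample size n
    f = f_statistic(snp)
    return f / (n - 2 + f)

def is_palindromic(snp):
    # A/T and C/G SNPs read the same on both strands
    return COMPLEMENT.get(snp.effect_allele) == snp.other_allele

def keep_instrument(snp, p_max=5e-8, f_min=10.0, maf_max=0.3):
    maf = min(snp.eaf, 1 - snp.eaf)
    if snp.pval >= p_max:                       # not genome-wide significant
        return False
    if f_statistic(snp) < f_min:                # weak instrument
        return False
    if is_palindromic(snp) and maf >= maf_max:  # strand cannot be inferred
        return False
    return True

# Hypothetical SNPs, for illustration only
snps = [
    Snp("rs_a", "A", "G", 0.25, 0.030, 0.004, 1e-12),
    Snp("rs_b", "A", "T", 0.48, 0.028, 0.004, 1e-10),  # palindromic, MAF near 0.5
    Snp("rs_c", "C", "T", 0.10, 0.010, 0.004, 1e-3),   # fails the p-value threshold
    Snp("rs_d", "C", "G", 0.12, 0.030, 0.005, 1e-9),   # palindromic, low MAF
]
kept = [s.rsid for s in snps if keep_instrument(s)]
print(kept)
```

The thresholds are exposed as keyword arguments so that the same function can be reused with other cut-offs.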

Statistical analysis
We first performed the inverse variance weighted (IVW) test, which is regarded as having the strongest ability to detect causation (25), and applied it as the primary method to identify the causal effect of diet-related factors on CRC. The evidence from the IVW method was complemented with the MR-Egger, weighted mode, weighted median, and simple mode methods. The conclusion was considered more credible, stable, and precise when the results of these methods were consistent (26). For the IVW and MR-Egger models, Cochran's Q test was conducted to assess heterogeneity (27); a Cochran's Q test p < 0.05 indicated the existence of heterogeneity.
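To make the IVW estimate and Cochran's Q concrete, the following is a minimal Python sketch, assuming a fixed-effect IVW with first-order weights. It is not the implementation in the TwoSampleMR package; the function names and the toy summary statistics are hypothetical.

```python
import math

def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate, its SE, Cochran's Q, and Q's degrees of freedom."""
    # Per-SNP Wald ratios: SNP-outcome effect divided by SNP-exposure effect
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    # First-order inverse-variance weight of each ratio: (beta_exp / se_out)^2
    weights = [(be / so) ** 2 for be, so in zip(beta_exp, se_out)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations of the ratios from the pooled estimate
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return estimate, se, q, len(ratios) - 1

def two_sided_p(estimate, se):
    """Two-sided p-value from a normal approximation."""
    return math.erfc(abs(estimate / se) / math.sqrt(2))

# Hypothetical summary statistics (log-odds scale for the outcome)
beta_exp = [0.10, 0.20, 0.15]
beta_out = [0.05, 0.10, 0.075]
se_out = [0.02, 0.02, 0.02]

estimate, se, q, df = ivw(beta_exp, beta_out, se_out)
odds_ratio = math.exp(estimate)
p_value = two_sided_p(estimate, se)
print(f"OR = {odds_ratio:.3f}, Q = {q:.3f} on {df} df")
```

Under the null hypothesis of homogeneity, Q is compared with a chi-square distribution on `df` degrees of freedom; with SciPy available, `scipy.stats.chi2.sf(q, df)` would give the corresponding p-value.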
In addition to the MR-PRESSO test described earlier, we used the MR-Egger intercept test to detect directional pleiotropy. An intercept not significantly different from zero (p > 0.05) indicated that the IVs did not affect CRC through pathways other than the exposure (28). Leave-one-out analysis was applied to assess whether the causal estimate was driven by any single SNP (29). Statistical analyses were carried out in R using the "TwoSampleMR" (20) and "MR-PRESSO" (24) packages. The significance threshold for causation was p < 0.05.
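The MR-Egger intercept can be illustrated as a weighted linear regression of SNP-outcome effects on SNP-exposure effects. The sketch below is a plain weighted least-squares version under assumed conventions (SNPs oriented so that exposure effects are positive, weights of 1/se²); the function name and toy data are hypothetical, and it does not reproduce the exact variance handling of any R package.

```python
import math

def mr_egger(beta_exp, beta_out, se_out):
    """Return (intercept, slope, intercept_se) from a weighted regression."""
    if len(beta_exp) < 3:
        raise ValueError("MR-Egger needs at least three SNPs")
    # Orient every SNP so that its exposure effect is positive
    x = [abs(be) for be in beta_exp]
    y = [bo if be >= 0 else -bo for be, bo in zip(beta_exp, beta_out)]
    w = [1.0 / so ** 2 for so in se_out]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx                    # causal estimate allowing for pleiotropy
    intercept = y_bar - slope * x_bar    # average directional pleiotropic effect
    # Residual variance of the weighted fit, used for the intercept SE
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    sigma2 = rss / (len(x) - 2)
    intercept_se = math.sqrt(sigma2 * (1.0 / sw + x_bar ** 2 / sxx))
    return intercept, slope, intercept_se

# Hypothetical data lying exactly on y = 0.01 + 0.5 * x; one SNP has a negative sign
beta_exp = [0.10, -0.20, 0.30, 0.40]
beta_out = [0.06, -0.11, 0.16, 0.21]
se_out = [0.02, 0.02, 0.02, 0.02]

intercept, slope, intercept_se = mr_egger(beta_exp, beta_out, se_out)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
```

In practice the intercept divided by its standard error is referred to a t distribution with n − 2 degrees of freedom to obtain the p-value of the intercept test.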

Selection of instrumental variables
The causal associations of dietary factors with CRC were analyzed for 17 different exposures. The number of SNPs employed in our study ranged from 7 to 62. The F-statistics were greater than 10 for all the IVs (range: 32.539 to 80.012), suggesting that the selected IVs were strongly associated with the exposures. The number of European participants included in the exposure datasets ranged from 335,394 to 461,981. The outcome dataset, sourced from the European Bioinformatics Institute, covered 11,895 European-descent CRC cases and 14,695 European-descent controls. Because the exposure and outcome datasets were both of European descent, there was little potential bias from population stratification. More detailed information is presented in Table 1. Because the MR-PRESSO global test was non-significant (p > 0.05), no outlier was removed through MR-PRESSO.

Sensitivity analysis
No heterogeneity was detected by Cochran's Q tests (p > 0.05 for all results). The MR-Egger intercept test indicated that, except for the analysis of cheese intake and CRC, no statistically significant horizontal pleiotropy was observed (Figure 3 and Supplementary Table S3). Leave-one-out results suggested that no single SNP drove the significant MR findings (Figure 4). Overall, the sensitivity analyses supported the reliability of our results.
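The leave-one-out procedure can be sketched as follows: the IVW estimate is recomputed once per SNP with that SNP removed, and a result that shifts markedly flags an influential variant. This is a minimal Python illustration with hypothetical function names and toy data, using a fixed-effect IVW as an assumed stand-in for the estimator.

```python
def ivw_estimate(beta_exp, beta_out, se_out):
    # Fixed-effect IVW: weighted mean of Wald ratios with weights (beta_exp / se_out)^2
    weights = [(be / so) ** 2 for be, so in zip(beta_exp, se_out)]
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)

def leave_one_out(rsids, beta_exp, beta_out, se_out):
    """Map each SNP id to the IVW estimate obtained after dropping that SNP."""
    results = {}
    for i, rsid in enumerate(rsids):
        keep = [j for j in range(len(rsids)) if j != i]
        results[rsid] = ivw_estimate(
            [beta_exp[j] for j in keep],
            [beta_out[j] for j in keep],
            [se_out[j] for j in keep],
        )
    return results

# Hypothetical data in which rs_3 has a larger ratio than the other SNPs
rsids = ["rs_1", "rs_2", "rs_3"]
beta_exp = [0.10, 0.20, 0.20]
beta_out = [0.05, 0.10, 0.20]
se_out = [0.02, 0.02, 0.02]

full = ivw_estimate(beta_exp, beta_out, se_out)
loo = leave_one_out(rsids, beta_exp, beta_out, se_out)
for rsid, est in loo.items():
    print(f"without {rsid}: {est:.3f} (all SNPs: {full:.3f})")
```

In this toy example, dropping rs_3 moves the estimate the most, which is the pattern a leave-one-out plot is designed to reveal.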

Discussion
We executed two-sample MR analyses using large-scale GWAS summary statistics. These analyses assessed the genetic evidence for causal associations between 17 genetically predicted diet-related factors and CRC risk. Specifically, we found suggestive evidence that weekly alcoholic drinks may elevate CRC risk, while a higher intake of non-oily fish, cereals, fresh fruit, and dried fruit may reduce it. Apart from these five exposures, there was no evidence that the other dietary factors affected CRC risk significantly. Clarifying these relationships is important for developing nutritional recommendations for CRC management and prevention.
The relationship between dietary factors and CRC remains controversial. Previously, some observational studies indicated that alcohol intake was a risk factor for colorectal cancer. For instance, a nested case-control study in South Asia revealed that current or former drinkers had a higher risk of CRC (OR: 5.4; 95% CI: 1.1-27.8; p = 0.043) (30). Similar conclusions were reported using other methods and in other regions (31,32). However, a previous European MR study found no evidence of a pronounced relationship (OR: 1.60; 95% CI: 0.85-3.04; p = 0.146) (33). That study used only 3 IVs representing weekly alcohol consumption, which explained only 0.2% of the genetic variation, and this might have led to weak statistical power and a lack of robustness. Our study, using 32 SNPs in total, preliminarily demonstrated that alcoholic drinks per week were causally associated with about a 56.5% increase in the risk of CRC in European individuals. Some experimental evidence indicated that alcohol might contribute to the development of CRC by disrupting the composition of the gut microbiota. Possible acetaldehyde accumulation in Ruminococcus and Coriobacterium located in the colorectum may contribute to mutagenesis and enable carcinogenesis (34). Simultaneously, alcohol metabolites might trigger DNA-adduct formation, lipid peroxidation, and oxidative stress, leading to the initiation of cancer-promoting cascades (35). Additionally, an epigenetic analysis and a gene-alcohol interaction analysis revealed that alcohol consumption could affect DNA methylation by regulating the expression of the COLCA1/COLCA2 gene, which would also increase CRC risk (36). Further investigations are necessary to identify the role of alcohol intake in the genetic and metabolic effects of CRC.
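The "about 56.5%" figure follows directly from the reported odds ratio, and the reported confidence interval can be used to check the p-value. The short Python sketch below shows the arithmetic, assuming a symmetric 95% CI on the log-odds scale and a normal approximation; the function name is hypothetical.

```python
import math

def summarize_or(odds_ratio, ci_low, ci_high):
    """Return (percent change in odds, SE of log-OR, two-sided p-value)."""
    log_or = math.log(odds_ratio)
    # A 95% CI spans 2 * 1.96 standard errors on the log scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))
    pct_change = (odds_ratio - 1) * 100
    return pct_change, se, p

# Reported estimate for alcoholic drinks per week: OR 1.565, 95% CI 1.068-2.293
pct, se, p = summarize_or(1.565, 1.068, 2.293)
print(f"{pct:.1f}% higher odds, SE(log OR) = {se:.3f}, p = {p:.3f}")
```

The recovered p-value is close to the reported p = 0.022, which suggests the estimate, interval, and p-value are mutually consistent.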
The findings on fruit intake and CRC risk are also inconsistent. A European prospective investigation covering 2,819 incident CRC cases showed that fruit consumption was inversely associated with CRC when the highest and lowest EPIC-wide quintiles of consumption were compared over an 8.8-year follow-up (HR: 0.86; 95% CI: 0.75-1.00; p trend = 0.04) (37). Similarly, a cohort study of Chinese males reached the same conclusion (HR: 0.67; 95% CI: 0.48-0.95; p trend = 0.03) (38). In contrast, a meta-analysis of 16 cohort studies found no significant association (39). Notably, these conclusions might not be reliable due to the inherent drawbacks of the observational study design. By reducing the influence of underlying confounding factors and examining fresh and dried fruit separately, our MR analyses suggested that both fresh fruit intake (OR: 0.513; 95% CI: 0.273-0.964; p = 0.038) and dried fruit intake (OR: 0.522; 95% CI: 0.311-0.875; p = 0.014) were genetically associated with a lower risk of CRC. The causal relationship may be attributable to several physiological mechanisms. Specifically, apigenin, a flavonoid widely present in fruits, targets the K433 site of PKM2 and thereby restricts glycolysis in HCT-8 and LS-174T cells, exerting anti-CRC effects in vivo and in vitro and markedly attenuating tumor growth (40). Moreover, anthocyanins are phenolic pigments that give red and purple fruits their vivid colors. They have been shown to protect against CRC by suppressing the activity and expression of DNA methyltransferase enzymes (DNMT1 and DNMT3B) and demethylating WNT upstream regulators (CDKN2A, SFRP2, SFRP5, and WIF1) (41). Further explorations are necessary to confirm the causality and investigate the concrete mechanisms.
To date, the role of cereal intake in CRC has been widely studied, and a number of epidemiological studies have yielded similar conclusions. A meta-analysis of 7 European studies suggested a 10% decreased risk of CRC for each 10 g/day intake of cereal, with larger reductions at higher intake (42). A prospective study of the UK Biobank concluded that intake of fiber from breakfast cereals was a statistically significant protective factor against CRC in the multivariable model (HR: 0.86; 95% CI: 0.76-0.98; p trend = 0.005) (43). Our results further supported a significant causal effect of cereal consumption (OR: 0.435; 95% CI: 0.253-0.746; p = 0.003) against the development of CRC. Mechanistic studies reported that cereal foods could increase stool bulk, dilute fecal carcinogens, and decrease transit time. These processes could protect the lining of the colorectum against carcinogens (44), which supports our finding. Specifically, the regulatory effects of cereal foods on CRC development were mediated by activating AHR and GPCRs and inhibiting STAT3 phosphorylation (45). Similarly, other cereal components, including vitamins, phytoestrogens, and trace minerals, have also been associated with a lower risk of CRC (46). Further anticarcinogenic mechanisms underlying high levels of cereal intake could be investigated in the future.
In contrast, only a limited number of clinical studies have focused on non-oily fish and CRC. A large European cohort investigation observed an inverse association with CRC incidence (HR: 0.91; 95% CI: 0.83-1.00; p trend = 0.016) (47), which was compatible with our present study (OR: 0.286; 95% CI: 0.095-0.860; p = 0.026). Additionally, pathophysiological evidence suggested that the ω-3 polyunsaturated fatty acids (PUFAs) contained in fish might regulate eicosanoid metabolism (48). Eicosapentaenoic acid (EPA), a type of ω-3 PUFA, was shown to decrease the number and size of colorectal tumors by inhibiting COX-2, reducing β-catenin nuclear translocation, and increasing apoptosis (49). ω-3 PUFAs could also promote higher gut microbial diversity, thus improving the body's metabolic and immune functions and eventually reducing CRC risk (34,50). Subsequent high-quality analyses are required to establish potential causalities and biological mechanisms.
Notably, some foods of animal origin, such as dairy products and eggs, are susceptible to contamination by persistent organic pollutants (POPs), including polychlorinated dibenzo-p-dioxins (PCDDs), polychlorinated dibenzofurans (PCDFs) (51,52), and polychlorinated biphenyls (PCBs) (53,54). Long-term exposure to those POPs could damage the immune system and interfere with endocrine functions, thus causing a range of adverse health effects, especially cancer (51-54). Given that dietary intake is the primary route of exposure for humans, contaminated food of animal origin poses a significant risk.

FIGURE 3 Scatter plots were used to visualize the causal effect between alcoholic drinks per week (A), non-oily fish intake (B), fresh fruit intake (C), cereal intake (D), dried fruit intake (E) and colorectal cancer. The x-axis shows the SNP effect and SE on dietary factors. The y-axis shows the SNP effect and SE on colorectal cancer. The regression lines for the inverse-variance weighted (IVW) method, the MR-Egger regression method, the weighted median, the weighted mode, and the simple mode are shown. The slope of each straight line indicates the magnitude of the causal association. SNP, single nucleotide polymorphism; SE, standard error.

This work has several important strengths. First, to our knowledge, this is the first work to examine the causal associations between CRC and diet-related factors using the two-sample MR method. This method addressed the debate among prior epidemiologic studies and avoided inherent deficiencies of traditional observational research, such as reverse causality and confounding. It also provided novel insights and methods for assessing the health benefits associated with dietary patterns. Second, benefiting from the large-scale GWAS database, the massive sample size of our analyses and the strength of each IV (F-statistic > 10) supported the statistical validity of the current study. Moreover, we restricted the participants of this study to European-descent individuals, which minimized the potential bias induced by population stratification. Finally, five MR methods and diverse sensitivity analyses were applied to assess the consistency of the causal effects and yielded similar results, supporting the robustness and stability of our findings.

Some possible limitations of this study should also be considered. First, Mendel's second law is not universally applicable to all genetic variants because not all genes determining traits segregate independently. The inherent presence of developmental compensation also contributes to the potential inaccuracy of Mendelian randomization studies. Second, all analyses conducted in the current study were based on European participants. Thus, it remains to be seen whether our results can be extrapolated to non-European populations. Third, due to the lack of GWAS data stratified by sex and age, we could not perform a sex- or age-stratified analysis. Furthermore, owing to the limited details provided by the original research, it was difficult to assess the generalizability of the study results across different exposure periods and levels. Similarly, diet-related information obtained from surveys may be prone to recall bias, which could reduce the reliability of our results. Additionally, given the complexity of dietary habits, we were unable to distinguish the impacts of diverse dietary combinations. Hence, it was challenging to identify the specific role of the dietary factors of interest in the etiology and pathogenesis of CRC.

Further investigation will focus on conducting more comprehensive studies to gather high-quality evidence regarding the specific mechanisms through which dietary factors affect CRC risk. This involves expanding the scope of research to include a broader range of dietary factors, identifying potential biomarkers that could help in understanding the effect of diet on CRC development, exploring genetic predispositions that may modify the impact of dietary factors, and conducting longitudinal studies to track dietary habits over time and their direct correlation with CRC incidence.

Conclusion
Based on the GWAS summary data of CRC and European dietary habits, this study was implemented to identify the potential associations of colorectal cancer with 17 dietary factors using genetic instruments. The causal relationship between alcoholic drinks per week and an increased risk of CRC, and the inverse causal relationships of non-oily fish intake, cereal intake, fresh fruit intake, and dried fruit intake with CRC, were determined by performing two-sample MR analyses. To our knowledge, the current study is the first to build the link between comprehensive diet-related factors and CRC at the genetic level, offering novel clues for interpreting the genetic etiology of CRC and providing new perspectives for managing gastrointestinal diseases. The results also prompt future explorations, including longitudinal studies and nutritional interventions; highlight the importance of interdisciplinary collaboration for clinical diagnostics, comprehensive patient care, and genetic counseling and education; and help develop public health recommendations and tailored nutrition and prevention strategies.

FIGURE 1 Study design and workflow.

FIGURE 2 Forest plots of the MR results (IVW method) to present the causal associations between 17 dietary factors and CRC risk. OR, odds ratio; CI, confidence interval.

FIGURE 4 Plots of the "leave-one-out" sensitivity analyses to demonstrate the impact of individual SNPs on the results. The x-axis shows MR "leave-one-out" sensitivity analyses for alcoholic drinks per week (A), non-oily fish intake (B), fresh fruit intake (C), cereal intake (D), and dried fruit intake (E) on colorectal cancer. The y-axis shows the analyses for the effect of "leave-one-out" of SNPs on colorectal cancer. The black point on the bottom line of each panel indicates the IVW estimate using all SNPs. MR, Mendelian randomization; SNP, single nucleotide polymorphism.

TABLE 1 Information of the exposures and outcome datasets.

TABLE 2 The results of Mendelian randomization analyses.

Frontiers in Nutrition, frontiersin.org, doi: 10.3389/fnut.2024.1388732