Garlic consumption and colorectal cancer risk in US adults: a large prospective cohort study

Objective To clarify the inconsistent findings of epidemiological studies on the association between dietary garlic consumption and colorectal cancer (CRC) incidence, by prospectively assessing the association in a large US population.
Methods Data from 58,508 participants (aged 55–74) in the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial were analyzed. Dietary data were collected using a validated questionnaire. Multivariable Cox regression analysis was used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). Restricted cubic spline regression was used to investigate the non-linear relationship, and subgroup analysis was conducted to examine potential effect modifiers.
Results During a median follow-up of 12.05 years, 782 CRC cases were documented, including 456 proximal colon cancer cases, 322 distal CRC cases, and 4 CRC cases with an unknown site. Moderate dietary garlic consumption was significantly associated with a reduced risk of overall CRC (HR quintile 3 vs. 1: 0.70, 95% CI: 0.54 to 0.91, p = 0.007, P for trend: 0.434), exhibiting a U-shaped dose-response pattern, and also with overall CRC in males in the stratified Cox regression model (Model 2: HR quintile 3 vs. 1: 0.57, 95% CI: 0.40 to 0.81, p = 0.002), but not in females. The protective association was more pronounced in men, Caucasians, and those with lower alcohol consumption. Notably, these protective effects were observed for overall distal CRC (HR quintile 3 vs. 1: 0.62, 95% CI: 0.42 to 0.93, p = 0.021; and HR quintile 4 vs. 1: 0.63, 95% CI: 0.43 to 0.92, p = 0.018, P for trend: 0.208) and for distal CRC in males (HR quintile 3 vs. 1: 0.40, 95% CI: 0.22 to 0.71, p = 0.002, P for trend: 0.696), but not for proximal CRC.
Conclusion Moderate consumption of dietary garlic is associated with a decreased CRC risk in the US population, with variations based on CRC anatomic subsites. Further in-depth prospective studies are needed to validate these findings in different populations and to explore subsite-specific associations.


Introduction
Colorectal cancer (CRC) is the second most common cause of cancer death in the United States, with an estimated 153,020 new cases and 52,550 fatalities expected in 2023, including a concerning number among those under 50 years old (1). In addition to genetic factors, over half of CRC cases are linked to modifiable lifestyle risk factors, including obesity, physical inactivity, alcohol drinking, and smoking (2). A diet high in plant-based foods has also been associated with a reduced likelihood of developing the disease (3). Specifically, garlic (Allium sativum L.) has shown an inverse association with CRC risk in case-control studies, although findings from cohort studies remain controversial (4,5).
The existing evidence is largely derived from case-control studies, which are susceptible to recall bias and unable to establish a temporal association. Moreover, no prospective cohort study has assessed the non-linear relationship between garlic consumption and CRC over time in the US population. To fill this research gap, we conducted a prospective cohort study using data from the Prostate, Lung, Colorectal, and Ovarian (PLCO) cancer screening trial to comprehensively explore the association between dietary garlic consumption and CRC incidence.

Data source and study population
The PLCO cancer screening trial was a large-scale, multicenter randomized controlled study sponsored by the United States National Cancer Institute (NCI) to determine whether specific screening examinations reduce mortality from PLCO cancers in US adults aged 55-74 years. Details of the study design and methodology have been described elsewhere (28). Briefly, a total of 154,887 participants, including 76,678 men and 78,209 women, were recruited between 1993 and 2001 from ten screening centers (Washington, Denver, Marshfield, Detroit, Minneapolis, Birmingham, Pittsburgh, Honolulu, Salt Lake City, and St Louis) across the United States. Upon enrollment, they were randomly assigned to either a control group or an intervention group. The PLCO trial was conducted with approval from the Institutional Review Boards of the National Cancer Institute (NCI). Each of the ten study centers obtained approval from their local Institutional Review Boards. All participants provided written informed consent before enrollment in the study.
In the present study, we started from 77,443 participants in the intervention arm, for whom data on garlic consumption (g/day) were collected with the DQX questionnaire at baseline (T0). Participants were then sequentially excluded if they (1) did not return the baseline questionnaire (n = 1,833) or had any history of CRC before the baseline questionnaire (n = 22); (2) had an incomplete DQX questionnaire (n = 12,410) or an invalid DQX that was missing the completion date, was completed before the date of death, had ≥ 8 missing frequency responses, or indicated extremely high or low calorie intake (i.e., top 1% or bottom 1%) (n = 1,809); (3) had a history of any cancer before DQX entry (n = 2,874); or (4) had no follow-up time after the DQX (n = 77). Ultimately, our cohort consisted of 58,508 eligible participants (Figure 1).

Data collection
Participants in the study completed a comprehensive baseline questionnaire, providing self-reported data on demographics, lifestyle factors, and medical history, including sex, race, trial arm, body mass index (BMI), educational level, marital status, aspirin use, cigarette smoking, family history of CRC, history of colon comorbidities, history of colorectal polyps, and diabetes history. BMI was calculated as weight in kilograms divided by the square of height in meters. Dietary data at baseline (T0), encompassing alcohol consumption, dietary energy intake, and dietary food or nutrient intake, were collected using a 137-item self-administered food-frequency questionnaire known as the DQX. The DQX questionnaire was derived from 2 previously validated food frequency questionnaires (FFQs) developed for epidemiologic and clinical use. It included items from the 61-item semiquantitative Willett FFQ, which have been shown to provide adequate information on individual nutrient intake over 1 year, and other items from the Block FFQ developed from the Second National Health and Nutrition Examination Survey (29,30). During the dietary survey, participants were instructed to recall the average frequency of consuming each food item listed in the FFQ over the past year. Dietary intake of energy and nutrients was calculated by multiplying the amount of energy and nutrients in the standard portion size of each food item by the reported frequency. The values were then summed across all food items, utilizing the United States Department of Agriculture's 1994-1996 Continuing Survey of Food Intakes by Individuals or the widely employed Nutrition Data Systems for Research nutrient database (31). The Healthy Eating Index-2005, a metric for assessing diet quality, was calculated following the methodology outlined in the literature (32). Physical activity levels were assessed using the DQX questionnaire, specifically as the number of hours currently spent in vigorous activities.
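The intake calculation described above (amount per standard portion multiplied by reported frequency, summed over food items) and the BMI formula can be illustrated with a minimal sketch. The food items, portion values, and the names `FoodItem`, `daily_intake`, and `bmi` are hypothetical and chosen only for illustration; they are not taken from the DQX or from the nutrient databases cited in the text.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    name: str
    servings_per_day: float      # reported frequency converted to servings per day
    kcal_per_serving: float      # energy in one standard portion (made-up value)
    garlic_g_per_serving: float  # garlic in one standard portion (made-up value)


def daily_intake(items, attribute):
    """Sum (amount per standard portion x daily frequency) over all food items."""
    return sum(getattr(item, attribute) * item.servings_per_day for item in items)


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by squared height in meters."""
    return weight_kg / height_m ** 2


# Hypothetical questionnaire answers for one participant.
items = [
    FoodItem("garlic bread", 0.5, 150.0, 1.0),
    FoodItem("pasta with sauce", 1.0, 400.0, 0.5),
    FoodItem("green salad", 2.0, 50.0, 0.0),
]

energy_kcal = daily_intake(items, "kcal_per_serving")    # 75 + 400 + 100
garlic_g = daily_intake(items, "garlic_g_per_serving")   # 0.5 + 0.5 + 0.0
print(energy_kcal, garlic_g, round(bmi(81.0, 1.8), 1))
```

The same summation applies to any nutrient column; only the per-portion values change.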

Ascertainment of colorectal cancer
The main outcome measure of the study was the occurrence of CRC, determined through annual reviews of participants' medical records, which provided updates on cancer diagnoses, including the date of detection and cancer site. A standardized form was used to review relevant medical records to ensure the accuracy of the

Statistical analysis
To address missing data in twelve covariates (Supplementary Table 1), and thereby enhance statistical power while minimizing potential bias, we performed multiple imputation with the random forest algorithm (R package "missRanger"), assuming that the data were missing at random. The imputed data set included all variables used in the statistical analyses. We also repeated the analyses in participants with complete data for comparison.
We employed Cox proportional hazards regression to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) for the association between dietary garlic consumption and CRC incidence, using follow-up time as the underlying time metric. Garlic intake was adjusted for energy using the residual method (33) and categorized into quintiles, with the lowest quintile serving as the reference group. To assess linear trends in risk estimates across quintiles of energy-adjusted garlic consumption, we assigned the median value of each quintile to the corresponding participants, creating an ordinal variable. This ordinal variable was treated as a continuous variable in regression models, and the p-value for linear trend was obtained using the Wald test. In Cox models, we selected covariates for multivariable analyses based on our causal knowledge and used a directed acyclic graph (DAG) to visualize the relationships among exposure, outcome, and potential confounders (DAGitty version 3.0; www.dagitty.net/) (Supplementary Figure 1). A total of 9 potential confounders were identified, and we verified the proportional hazards (PH) assumption using the Schoenfeld residual test (all p-values for the global test > 0.05; listed in Supplementary Table 2). Among them, the variable "sex" violated the PH assumption; therefore, we also conducted stratified Cox regression to control for the time-varying effect of "sex." Furthermore, we applied the Marginal Structural Model (MSM) to adjust for potentially time-varying dietary exposure or variables in the Cox regression models (34-36). Specifically, the Crude model was unadjusted; Model 1 adjusted for age (years) and sex (male vs. female); Model 2 adjusted for age (years), sex (male vs. female), race (white, non-Hispanic vs. black, non-Hispanic vs. Hispanic vs. others), physical activity (none vs. ≤ 1 h/week vs. ≥ 2 h/week), diabetes (no vs. yes), cigarette smoking (never vs. current vs. former), BMI (kg/m²), alcohol consumption (g/day), and energy from diet (kcal/day); Model 3 adjusted for the covariates in Model 2 using the Marginal Structural Model; Model 4 adjusted for all variables at baseline, including age (years), sex (male vs. female), marital status (married vs. unmarried), race (white, non-Hispanic vs. black, non-Hispanic vs. Hispanic vs. others), education level (≤ some college vs. college graduate vs. postgraduate), physical activity (none vs. ≤ 1 h/week vs. ≥ 2 h/week), multivitamin use (no vs. yes), aspirin use (no vs. yes), diabetes (no vs. yes), cigarette smoking (never vs. current vs. former), pack-years (continuous), BMI (kg/m²), family history of colorectal cancer (no vs. yes vs. possibly), alcohol consumption (g/day), history of colorectal polyps (no vs. yes), history of colon comorbidities (no vs. yes), and energy from diet (kcal/day); Model 5 adjusted for all the covariates in Model 4 using the Marginal Structural Model.
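The exposure coding described above (energy adjustment by the residual method, quintile grouping, and quintile medians as the trend variable) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses the common variant of the residual method that adds back the intake predicted at mean energy, and the helper names and the ten data points are hypothetical.

```python
from statistics import mean, median


def energy_adjust(intake, energy):
    """Residual method: regress intake on energy, keep the residuals, and add
    the intake predicted at mean energy (assumed variant of the method)."""
    mean_energy, mean_intake = mean(energy), mean(intake)
    sxx = sum((e - mean_energy) ** 2 for e in energy)
    sxy = sum((e - mean_energy) * (y - mean_intake) for e, y in zip(energy, intake))
    slope = sxy / sxx
    intercept = mean_intake - slope * mean_energy
    at_mean_energy = intercept + slope * mean_energy
    return [y - (intercept + slope * e) + at_mean_energy
            for e, y in zip(energy, intake)]


def quintile_labels(values):
    """Label each value 1..5 by rank so that groups are of near-equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, index in enumerate(order):
        labels[index] = rank * 5 // len(values) + 1
    return labels


def trend_scores(values, labels):
    """Replace each value by the median of its quintile (ordinal trend variable)."""
    medians = {q: median(v for v, lab in zip(values, labels) if lab == q)
               for q in set(labels)}
    return [medians[lab] for lab in labels]


# Hypothetical intakes for ten participants.
energy = [1500, 1800, 2000, 2200, 2500, 1600, 1900, 2100, 2400, 3000]  # kcal/day
garlic = [0.2, 0.5, 0.3, 0.9, 1.2, 0.1, 0.4, 2.0, 0.6, 1.5]            # g/day

adjusted = energy_adjust(garlic, energy)
labels = quintile_labels(adjusted)
scores = trend_scores(adjusted, labels)
```

Because residuals are centered on the regression line, energy-adjusted intakes can be negative for participants who eat little garlic relative to their energy intake, which is consistent with an adjusted range that starts below zero. In a trend test, `scores` would enter the Cox model as a single continuous covariate.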
We conducted subgroup analyses to evaluate whether the association between garlic consumption and CRC incidence was modified by age (< median vs. ≥ median), sex (male vs. female), race (white, non-Hispanic vs. black, non-Hispanic vs. others), BMI (< 25 kg/m² vs. ≥ 25 kg/m²), smoking status (current/former vs. never), and alcohol consumption (no/light/moderate vs. heavy). Alcohol consumption was categorized as light, moderate, or heavy. Light consumption was defined as up to 6 g/day. Moderate consumption was defined as more than 6 and up to 28 g/day for males and more than 6 and up to 14 g/day for females, and heavy consumption was defined as more than 28 g/day for males and more than 14 g/day for females (37). To assess effect modification, we used a likelihood ratio test comparing models with and without interaction terms to obtain a P for interaction. Furthermore, we also categorized garlic consumption into tertiles and repeated the above-mentioned subgroup analyses to minimize the bias from small case numbers in each stratum.
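The sex-specific alcohol cut-offs above translate directly into a small classification rule. The sketch below is illustrative only: the function names are hypothetical, and treating an intake of 0 g/day as the "none" category is an assumption, since the text does not define the non-drinker group explicitly.

```python
def alcohol_category(grams_per_day, sex):
    """Classify daily alcohol intake using the sex-specific cut-offs in the text."""
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    moderate_upper = 28.0 if sex == "male" else 14.0
    if grams_per_day <= 0:
        return "none"        # assumption: non-drinkers are those reporting 0 g/day
    if grams_per_day <= 6.0:
        return "light"
    if grams_per_day <= moderate_upper:
        return "moderate"
    return "heavy"


def alcohol_subgroup(grams_per_day, sex):
    """Two-level grouping used in the subgroup analysis."""
    category = alcohol_category(grams_per_day, sex)
    return "heavy" if category == "heavy" else "no/light/moderate"


print(alcohol_category(20.0, "male"), alcohol_category(20.0, "female"))
```

The same 20 g/day intake is moderate for a man and heavy for a woman, which is why the grouping has to be applied after stratifying the cut-off by sex.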
A series of sensitivity analyses were conducted as follows: (1) excluding participants with extreme energy intake from diet (< 800 or > 4,000 kcal/day for men and < 500 or > 3,500 kcal/day for women); (2) excluding participants with a history of diabetes; (3) excluding participants with extreme BMI (top 1% and bottom 1% of BMI); (4) excluding participants within the first 2 years of follow-up; (5) repeating the analysis for participants with complete data; and (6) additionally adjusting for the Healthy Eating Index-2015 to examine whether the observed association was influenced by diet quality.
We employed restricted cubic spline functions with 4 knots (5th, 35th, 65th, and 95th percentiles) to explore potential non-linear relationships between energy-adjusted dietary garlic consumption and CRC incidence. Participants with dietary intakes below the 1st percentile or above the 99th percentile were excluded from the dose-response analyses to minimize potential bias from extreme values. Furthermore, we assessed the significance of non-linearity by testing the null hypothesis that the regression coefficient of the second spline was equal to zero. All statistical analyses were performed using R software (version 4.2.1), with a two-tailed significance level set at P < 0.05.
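To make the 4-knot restricted cubic spline concrete, the sketch below builds a spline basis with knots at the 5th, 35th, 65th, and 95th percentiles of an intake variable. It uses the truncated-power form commonly attributed to Harrell, without the optional rescaling some software applies; whether the authors' R code used exactly this parameterization is an assumption, and the helper names and the evenly spaced toy data are hypothetical.

```python
def percentile(sorted_values, p):
    """Percentile (p in 0..100) of a sorted list, using linear interpolation."""
    position = (len(sorted_values) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * weight


def rcs_basis(x, knots):
    """Restricted cubic spline basis: one linear term plus len(knots) - 2
    non-linear terms, constrained to be linear beyond the outer knots."""
    last, previous = knots[-1], knots[-2]
    span = last - previous

    def cube(u):
        return max(u, 0.0) ** 3

    terms = [x]
    for knot in knots[:-2]:
        terms.append(
            cube(x - knot)
            - cube(x - previous) * (last - knot) / span
            + cube(x - last) * (previous - knot) / span
        )
    return terms


# Toy intake values from 0 to 10 g/day in steps of 0.1.
values = [i / 10 for i in range(101)]
knots = [percentile(values, p) for p in (5, 35, 65, 95)]
design_row = rcs_basis(4.2, knots)  # one row of the spline design matrix
print(knots, len(design_row))
```

With 4 knots the basis has three columns: the linear term and two non-linear terms. A test of non-linearity asks whether the coefficients on the non-linear columns differ from zero once the basis enters the Cox model; the sketch only constructs the basis and does not fit a model.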

Participants' baseline characteristics
We identified a total of 58,508 participants for this study. During a median follow-up of 12.05 years, there were 782 cases of colorectal cancer and 324 deaths. Energy-adjusted dietary garlic consumption ranged from −1.15 to 11.68 g/day (median value: 0.42 g/day), whereas the unadjusted dietary garlic consumption ranged from 0 to 11.33 g/day (median value: 0.34 g/day). Table 1 summarizes the baseline characteristics of these participants by quintiles of energy-adjusted garlic consumption.
Compared to participants with the lowest level of garlic intake, those in the highest quintile of garlic intake were generally more physically active, consumed more alcohol, had higher educational levels, reported more pack-years of smoking, and had a lower overall caloric intake from diet. Meanwhile, participants with moderate garlic intake (quintiles 2, 3, and 4) were predominantly female, consumed less alcohol, included a lower proportion of current smokers, and had lower dietary energy intake.

Garlic consumption and CRC incidence
In this study, which accumulated 676,471 person-years of follow-up, we identified 782 CRC cases, including 456 proximal colon cancer cases, 322 distal CRC cases (comprising distal colon and rectal cancer), and 4 CRC cases with an unspecified location. The overall incidence rate was 1.16 cases per 1,000 person-years.
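The overall incidence rate follows directly from the case count and person-time reported above. The short check below reproduces the arithmetic; the function name is hypothetical.

```python
def incidence_rate_per_1000(cases, person_years):
    """Crude incidence rate expressed per 1,000 person-years."""
    return cases / person_years * 1000


proximal, distal, unknown_site = 456, 322, 4
total_cases = proximal + distal + unknown_site       # the subsite counts sum to 782

rate = incidence_rate_per_1000(total_cases, 676_471)
print(total_cases, round(rate, 2))
```

Rounded to two decimals, the result matches the 1.16 cases per 1,000 person-years stated in the text.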
To investigate the relationship between dietary garlic consumption and the incidence of overall colorectal cancer, as well as its subsites, we employed multivariable Cox regression models, as shown in Table 2. For overall colorectal cancer, the fully adjusted model (Model 2) indicated that moderate consumption (quintile 3) was associated with a 30% lower risk of overall CRC incidence (HR quintile 3 vs. 1: 0.70, 95% CI: 0.54 to 0.91, p = 0.007, P for trend: 0.434) compared with the lowest consumption (quintile 1) of energy-adjusted dietary garlic. Considering the potential time-varying effects of the dietary exposure and covariates, we applied the Marginal Structural Model in further analyses and found a similar inverse association in Model 3 (HR quintile 3 vs. 1: 0.74, 95% CI: 0.59 to 0.94, p = 0.013). In addition, we included all baseline variables in Model 4 (for which the global test of the PH assumption gave p < 0.05) as a comparative analysis versus Model 2 and used the MSM to control for time-varying variables (Model 5), but we did not observe a significant change (Model 4: HR quintile 3 vs. 1: 0.71, 95% CI: 0.55 to 0.92, p = 0.009; Model 5: HR quintile 3 vs. 1: 0.73, 95% CI: 0.58 to 0.93, p = 0.009). Next, we conducted Cox regression stratified by sex and found a consistent inverse association in males (Model 2: HR quintile 3 vs. 1: 0.57, 95% CI: 0.40 to 0.81, p = 0.002), but not in females. Similarly, we observed an inverse association between moderate garlic consumption and overall distal CRC (HR quintile 3 vs. 1: 0.62, 95% CI: 0.42 to 0.93, p = 0.021; and HR quintile 4 vs. 1: 0.63, 95% CI: 0.43 to 0.92, p = 0.018, P for trend: 0.208) and distal CRC in males (HR quintile 3 vs. 1: 0.40, 95% CI: 0.22 to 0.71, p = 0.002, P for trend: 0.696), but not in females. In contrast, we detected a suggestive but not significant inverse association between garlic consumption and proximal colon cancer in both males and females.
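A reported hazard ratio, its 95% CI, and its p-value are linked through the Wald statistic on the log scale, so the figures above can be cross-checked by hand. The sketch below recovers the standard error from the CI width and recomputes an approximate two-sided p-value. It assumes a symmetric normal-approximation interval on the log-HR scale, and because the published HRs and limits are rounded, the recomputed values are only approximate. The function name is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the log-scale standard error, z statistic, and two-sided p-value
    from a hazard ratio and its 95% confidence limits."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under normality
    return se, z, p


# Overall CRC, quintile 3 vs. 1 (Model 2), as reported in the text.
se_all, z_all, p_all = wald_from_ci(0.70, 0.54, 0.91)
# Overall CRC in males, quintile 3 vs. 1 (Model 2).
se_men, z_men, p_men = wald_from_ci(0.57, 0.40, 0.81)

risk_reduction_pct = round((1 - 0.70) * 100)  # the "30% lower risk" in the text
print(round(p_all, 3), round(p_men, 3), risk_reduction_pct)
```

Both recomputed p-values round to the reported 0.007 and 0.002, so the HRs, intervals, and p-values in this paragraph are mutually consistent.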

Dose-response analyses
We utilized restricted cubic spline plots to visualize the relationships between the dietary intake of garlic and the risk of CRC (overall CRC and its subsites: proximal colon cancer and distal CRC) across the full range of consumption levels. We observed a U-shaped curve between energy-adjusted garlic consumption and the risk of overall CRC (for both sexes: Figure 2A: P for non-linearity = 0.016; for males: Figure 2B: P for non-linearity = 0.011), and the risk of distal CRC (for both sexes: Figure 2D: P for non-linearity = 0.007; for males: Figure 2E: P for non-linearity < 0.001). However, no significant non-linear relationship was observed with the risk of proximal colon cancer, or with CRC at any location in females (Figures 2C, F-I: all P for non-linearity > 0.05).

Discussion
association of dietary garlic with CRC incidence (16,17), while others have found no such association (23,24). Interestingly, when separately analyzed based on study type, the results showed that garlic was associated with reduced CRC risk in the case-control studies but not in cohort studies (23). Meanwhile, a meta-analysis that only included cohort studies showed no association of colorectal cancer incidence with raw and cooked garlic or garlic supplements (24). This suggests that different study designs may have different effects on the results. It is imperative to interpret the conclusions with caution because the evidence of an inverse association predominantly comes from case-control studies, which are potentially vulnerable to recall bias and selection bias. Moreover, the pooled studies exhibited significant heterogeneity. Prospective data on the impact of garlic on CRC incidence remain limited. Our study identified an inverse association between dietary garlic consumption and CRC incidence. This finding contrasts with recent prospective cohort studies, which reported no significant association between CRC incidence and either garlic intake or garlic supplement use (26,27). The inconsistency might be attributed to variations in adjustments for potential confounders, sample sizes, and population heterogeneity. Furthermore, our study unveiled a U-shaped dose-response relationship, a phenomenon yet to be explored by previous studies. Several factors might explain this observation, including the potential threshold effect of garlic's protective compounds, the modulating effect of alcohol consumption, and the influence of dietary energy intake (38-40).
Additionally, we noted a similar U-shaped inverse association with distal CRC risk, but no such association was evident for proximal colon cancer risk. This observation aligns with findings from the Iowa Women's Health Study (IWHS), which involved 41,837 women (41). In our study, which involved 58,508 participants of both sexes, we further corroborated the protective effect of dietary garlic intake on distal CRC. The heterogeneous protective effect of dietary garlic intake on different CRC subsites offers valuable insights into the etiologic heterogeneity of CRC, potentially associated with the distinct molecular and microbial profiles of the proximal and distal colon (42, 43). Thus, gut microbiota may act as a potential mediator between diet and site-specific CRC risk (44,45). This may be explained by the fact that different regions of the gastrointestinal tract vary widely in terms of transit time, pH, exposure to oxygen, nutrient availability, mucosal surfaces, and interactions with the immune system, all of which affect microbial colonization (46). For example, there is a marked difference in the mucosal microbiota between patients who develop right- versus left-sided CRCs (43, 47-49), including in the presence of bacterial biofilms, defined as mucin layers with admixed bacteria on the luminal surface of the colonic epithelium, which can invade the mucus layer of the colon and may be pathogenic when they make direct contact with the mucosal epithelial cells. Invasive bacterial biofilms were found in 89% of right-sided CRCs but in only 12% of left-sided CRCs (50). In addition, evidence indicates that the short-chain fatty acids acetate, propionate, and butyrate function in the suppression of inflammation and cancer, whereas other microbial metabolites, such as secondary bile acids, promote carcinogenesis (46). For instance, one study showed that butyrate is a more important source of energy for the distal than for the proximal colonic mucosa (51), which may be a relevant biological mechanism explaining the site-specific difference. Furthermore, garlic contains bioactive compounds, such as organosulfur compounds, that have been shown to have antimicrobial and anti-inflammatory properties. These compounds may influence the gut microbiota composition and function (52,53), potentially leading to a protective effect against distal colorectal tumors. However, the findings related to the anatomical subsites of CRC require further experimental validation.
Interestingly, our findings highlighted a more pronounced reduction in CRC risk among men than among women, following a U-shaped dose-response pattern. This disparity might be attributed to differences in dietary and behavioral habits between the sexes, such as alcohol consumption and smoking patterns. Additionally, biological factors, especially hormonal differences, might play a pivotal role (54). However, our result contrasts with a prospective cohort of older US adults in which 579 men and 551 women were diagnosed with CRC (27). That study indicated that daily garlic consumption had no significant association with CRC risk in men, whereas the association was nearly inverse in women. Although we cannot rule out the possibility of a chance finding, our results are derived from a study of 58,508 participants, providing more robust evidence. The specific mechanism explaining the sex disparities in CRC tumorigenesis remains undetermined.
In addition, racial and ethnic disparities in CRC risk are commonly documented in the literature, which shows a lower incidence and mortality of CRC among Caucasians (55,56). These disparities may be attributed to differences in socioeconomic characteristics, dietary patterns, surveillance, and genetic and environmental factors. We also found that the inverse association was more pronounced in white participants, but considering that over 90% of participants were non-Hispanic White, this finding should be interpreted with caution and needs to be validated by future studies.
Our study, based on a large-scale, multi-center randomized trial with 154,887 participants recruited from 10 screening centers, has an adequate observation period, ensuring a substantial number of outcome events and minimizing the bias of reverse causality. However, certain limitations persist. First, using the self-reported DQX to categorize food items may introduce non-differential misclassification bias. Second, despite thorough adjustments, we could not eliminate all potential unmeasured confounders. Third, our one-time baseline assessment of food consumption might not capture changes in dietary habits over time, though significant shifts in adults' dietary habits over short periods are rare. Using only the baseline diet might weaken the observed associations compared with using cumulative averages. Fourth, our findings may not apply universally because they are derived from a US population. Fifth, our study did not investigate garlic supplements due to data limitations.

Conclusion
In US adults, moderate dietary garlic consumption shows a U-shaped dose-response association with a decreased risk of CRC. The protective association is particularly evident in distal CRC cases, but not in proximal colon cancer cases. Interestingly, the protective effect is more pronounced in men than in women, in Caucasians, and among participants with lower alcohol consumption. Our findings underscore the importance of a healthy diet in mitigating the global burden of CRC.

Table notes. Values are hazard ratios (95% confidence intervals). a: P for interaction was calculated by comparing models with and without interaction terms (sex stratification). b: A total of 782 colorectal cancer cases were identified, including 456 proximal colon cancer cases, 322 distal CRC (that is, distal colon and rectal cancer) cases, and 4 CRC cases with an unknown site. c: Incidence rate was calculated per 1,000 person-years. d: Crude model adjusted for none. DAG refers to the directed acyclic graph used to identify potential confounders. MSM refers to the Marginal Structural Model used to adjust the time-varying dietary exposure or variables that do not meet the PH assumption in the Cox regression models. Models 1 to 5 are adjusted as defined in the Statistical analysis section.

doi: 10.3389/fnut.2023.1300330

transverse colon, and splenic flexure colon cancer, while distal CRC included descending colon cancer, sigmoid colon cancer, rectosigmoid junction cancer, and rectal cancer. The follow-up duration was calculated from the date of DQX completion until the first instance of CRC diagnosis, participant dropout, CRC-related death, or the end of the follow-up period, which extended until 31 December 2009.
FIGURE 1
The flowchart of subjects identified in our study. BQ, baseline questionnaire; DQX, dietary questionnaire; PLCO, Prostate, Lung, Colorectal, and Ovarian.

TABLE 1
Baseline characteristics of the study population according to quintiles of energy-adjusted garlic consumption (g/day) in 58,508 participants.

TABLE 2
Association between energy-adjusted dietary garlic consumption (g/day) and colorectal cancer incidence in the PLCO cancer screening trial.