Associations between colorectal cancer risk and dietary intake of tomato, tomato products, and lycopene: evidence from a prospective study of 101,680 US adults

Background Previous epidemiological studies have yielded inconsistent results regarding the effects of dietary tomato, tomato products, and lycopene on the incidence of colorectal cancer (CRC), possibly due to variations in sample sizes and study designs. Methods The current study used multivariable Cox regression, subgroup analyses, and restricted cubic spline functions to investigate correlations between CRC incidence and mortality and raw tomato, tomato salsa, tomato juice, tomato catsup, and lycopene intake, as well as effect modifiers and nonlinear dose-response relationships in 101,680 US adults from the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial. Results During follow-up 1100 CRC cases and 443 CRC-specific deaths occurred. After adjustment for confounding variables, high consumption of tomato salsa was significantly associated with a reduced risk of CRC incidence (hazard ratio comparing the highest category with the lowest category 0.8, 95% confidence interval 0.65–0.99, p for trend = 0.039), but not with a reduced risk of CRC mortality. Raw tomatoes, tomato juice, tomato catsup, and lycopene consumption were not significantly associated with CRC incidence or CRC mortality. No potential effect modifiers or nonlinear associations were detected, indicating the robustness of the results. Conclusion In the general US population a higher intake of tomato salsa is associated with a lower CRC incidence, suggesting that tomato salsa consumption has beneficial effects in terms of cancer prevention, but caution is warranted when interpreting these findings. Further prospective studies are needed to evaluate its potential effects in other populations.


Introduction
Colorectal cancer (CRC) is a significant global public health challenge, and the third most prevalent cancer in the United States with an estimated 147,950 new cases and 53,200 deaths in 2020 (1). Unhealthy lifestyle factors including heavy alcohol consumption, cigarette smoking, physical inactivity, excess body weight, and dietary choices may contribute to nearly half of CRC cases (2). Emerging evidence suggests that high consumption of red or processed meat (3,4), trans-fatty acids (5) and dietary supplements containing aristolochic acid (6) may increase the risk of CRC, whereas consumption of calcium (7,8), whole grains and fiber (9), fruit and vegetables (10), and dairy products (11) may decrease the risk. It would therefore be beneficial to establish a primary prevention strategy after clarifying associations between different dietary components and CRC incidence.
Tomatoes and tomato products are recognized as a component of a healthy diet (12). Epidemiological studies have shown that higher intake of tomato, tomato products, and/or lycopene may reduce the risk of various cancers, including hepatocellular carcinoma (13), prostate cancer (14), pancreatic cancer (15), gastric cancer (16), and ovarian cancer (17). Nevertheless, associations between tomato/tomato product intake and CRC risk remain unclear due to limited sample sizes and inconsistent study results (18,19). Meta-analyses have yielded conflicting results with regard to associations between lycopene intake and the incidence of CRC (20,21). Notably, these studies did not differentiate between raw and processed tomatoes, which may have different effects on CRC risk. Dose-response relationships between tomato or lycopene intake and mortality have not been investigated. A recent study investigated relationships between the intake of raw tomatoes, tomato catsup, or lycopene and all-cause and cause-specific mortality, but no such analysis has been done to examine their relationship with CRC (22).
To provide evidence to fill this gap, we conducted a comprehensive, prospective cohort study using data from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial, which was a multicenter randomized controlled study involving approximately 155,000 participants. The aim of the current study was to investigate potential correlations between the risk of CRC incidence and mortality and the consumption of tomatoes, tomato products, and lycopene. Additionally, we sought to examine the possible dose-response relationships and nonlinear associations between the intake of tomato products/lycopene and CRC risk. We also aimed to enhance the generalizability of our findings by conducting subgroup analyses.
Materials and methods

Data source and study population
The PLCO study design and methodology have been previously described (23). Briefly, it was a multicenter randomized controlled trial aimed at determining whether specific screening examinations reduce mortality from PLCO cancers. Approximately 155,000 participants aged 55-74 years were recruited between 1993 and 2001 via ten screening centers across the United States, and were randomly assigned to either a control group or an intervention group upon entry, in accordance with a detailed plan. The study was approved by the NCI's Institutional Review Boards and each study center, and all enrolled participants signed informed consent forms.
Participants were excluded if they (1) did not return the baseline questionnaire (n = 4918) or had any history of CRC before the baseline questionnaire (n = 34); (2) had an incomplete dietary history questionnaire (DHQ) (n = 33,230) or an invalid DHQ that was missing the completion date, was completed before the date of death, had ≥ 8 missing frequency responses, or indicated extremely high or low calorie intake (i.e., top 1% or bottom 1%) (n = 5,221); (3) had a history of any cancer before DHQ entry (n = 9,682); or (4) had no follow-up time after the DHQ (n = 122). Ultimately 101,680 eligible participants were included in our cohort (Figure 1).
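As a quick consistency check on the exclusion cascade above, the reported counts can be summed and added back to the final cohort. This is a minimal sketch that assumes the exclusions were applied sequentially and do not overlap (an assumption, not something the text states); the dictionary keys are shorthand labels introduced here for illustration.

```python
# Exclusion counts as reported in the text; the labels are shorthand, not the authors' wording.
exclusions = {
    "no baseline questionnaire": 4918,
    "CRC history before baseline questionnaire": 34,
    "incomplete DHQ": 33230,
    "invalid DHQ": 5221,
    "any cancer before DHQ entry": 9682,
    "no follow-up time after DHQ": 122,
}
final_cohort = 101680  # analytic cohort reported in the text

# Assumption: exclusions are sequential and mutually exclusive, so they can be summed.
total_excluded = sum(exclusions.values())
implied_enrolled = final_cohort + total_excluded

print(f"Total excluded: {total_excluded:,}")
print(f"Implied number enrolled: {implied_enrolled:,}")
```

Under that assumption the implied enrolment is in line with the "approximately 155,000" participants described for the PLCO trial.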

Data collection and dietary assessment
All participants completed a baseline questionnaire in which they self-reported information on demographics and medical history, including sex, race, trial arm, body mass index (BMI), educational level, marital status, aspirin use, cigarette smoking, family history of CRC, history of colon comorbidities, history of colorectal polyps, and diabetes history. Dietary data were collected using a self-administered DHQ. The DHQ included the serving size and response frequency of 124 food items and supplement use over the past year, such as red meat, processed meat, fruit, vegetables, whole grain, dairy, added sugars, dietary fiber, protein, total fat, carbohydrate, glycemic load, glycemic index, calcium, folate, magnesium, iron, vitamin D, and olive oil. The 1994-96 Continuing Survey of Food Intakes by Individuals, available from the USDA Food Surveys Research Group, and the Nutrition Data Systems for Research from the University of Minnesota were used to calculate the daily intake of all nutrients in the database (24). The DHQ has been validated and has shown good or better performance in estimating dietary intake compared to other commonly used food frequency questionnaires (25). Five independent exposures were included in the current analysis: tomato juice, raw tomato, tomato salsa, tomato catsup, and lycopene. Due to a lack of data on total tomato consumption, the overall relationship between total tomato and CRC risk could not be investigated.

Outcome ascertainment
The primary endpoint of the study was the incidence of CRC, which was determined via annual medical record reviews that updated participants' cancer diagnosis status, including the date of detection and the site of the cancer. The secondary endpoint was mortality related to CRC. Information regarding deaths was obtained through various sources, including Annual Study Update questionnaires, reports from relatives, friends, or physicians, and National Death Index Plus searches. Upon notification, PLCO Screening Centers made efforts to obtain a death certificate for each death that occurred on or before 31 December 2018. The trial database recorded and coded information from the death certificate, and the underlying cause of death was determined using rules established by the National Center for Health Statistics. To ensure a more accurate assessment of trial endpoints a death review process (DRP) was conducted, and medical records were reviewed for all deaths that may have been related to prostate, lung, colorectal, and ovarian cancers. The DRP cause of death was considered authoritative and was used in statistical analyses of the primary endpoints. The follow-up duration was calculated from the date of completion of the DHQ to the first occurrence of CRC diagnosis, participant dropout, CRC-related death, or the end of follow-up (31 December 2009 for incidence and 31 December 2018 for mortality).
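The follow-up rule described above (time from DHQ completion to whichever endpoint comes first) can be sketched as follows for the incidence analysis. This is an illustrative sketch only: the function name, its arguments, and the conversion of days to years using 365.25 are assumptions introduced here, not details taken from the study.

```python
from datetime import date

# End of follow-up for incidence, as stated in the text.
INCIDENCE_CUTOFF = date(2009, 12, 31)


def incidence_follow_up(dhq_date, crc_diagnosis=None, dropout=None, crc_death=None):
    """Return (years of follow-up, event flag) for one hypothetical participant.

    Follow-up ends at the earliest of CRC diagnosis, dropout, CRC-related death,
    or the administrative cutoff. The event flag is True only when the CRC
    diagnosis is the earliest of those dates.
    """
    candidates = [d for d in (crc_diagnosis, dropout, crc_death, INCIDENCE_CUTOFF)
                  if d is not None]
    end = min(candidates)
    event = crc_diagnosis is not None and crc_diagnosis == end
    years = (end - dhq_date).days / 365.25  # assumed day-to-year conversion
    return years, event


# Hypothetical participants: one diagnosed during follow-up, one censored at the cutoff.
years_case, event_case = incidence_follow_up(date(2000, 1, 1), crc_diagnosis=date(2005, 1, 1))
years_censored, event_censored = incidence_follow_up(date(2000, 1, 1))
print(round(years_case, 2), event_case)
print(round(years_censored, 2), event_censored)
```

The same pattern would apply to the mortality analysis with the later cutoff of 31 December 2018 and CRC-related death as the event.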

Statistical analyses
Dietary exposures were adjusted for total energy from the diet using the residual method (26). Energy-adjusted dietary tomato, tomato products, and lycopene intakes were then divided equally into quintiles, with the lowest quintile serving as the referent group. Continuous variables are expressed as medians and interquartile ranges (IQRs), and categorical variables are presented as numbers and percentages. Kruskal-Wallis H tests and chi-squared tests were used for between-group comparisons as appropriate. Multivariable Cox regression analyses were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). Schoenfeld residuals were used to verify the proportional hazard assumption of baseline covariates (all p > 0.05) (27). Due to the non-normal distribution of the five exposures, a log2 transformation was performed. The linear trend across quintiles of energy-adjusted dietary tomato, tomato product, and lycopene intakes was also analyzed by entering the median value of each quintile as a continuous variable in the models. Model 2 was fully adjusted for age, sex, race, trial arm, BMI, educational level, marital status, aspirin use, cigarette smoking, alcohol consumption, family history of CRC, history of colon comorbidities, history of colorectal polyps, diabetes history, and dietary energy intake. In addition, the five exposures were mutually adjusted to assess individual contributions to the risk of CRC.

FIGURE 1
The flow chart of study participants from the PLCO screening trial.
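The energy adjustment, quintile assignment, and trend variable described in this subsection can be illustrated with a small pure-Python sketch. It is not the authors' code: the function names and the simulated data are hypothetical, and the residual method is implemented in its common form (residual from a linear regression of intake on total energy, plus the intake predicted at mean energy).

```python
import random
import statistics


def energy_adjust(intake, energy):
    """Residual method: regress intake on energy, then add back the intake
    predicted at the mean energy so values stay on the original scale."""
    mean_e = statistics.fmean(energy)
    mean_i = statistics.fmean(intake)
    sxx = sum((e - mean_e) ** 2 for e in energy)
    sxy = sum((e - mean_e) * (i - mean_i) for e, i in zip(energy, intake))
    slope = sxy / sxx
    intercept = mean_i - slope * mean_e
    predicted_at_mean = intercept + slope * mean_e
    return [i - (intercept + slope * e) + predicted_at_mean
            for e, i in zip(energy, intake)]


def quintile_labels(values):
    """Assign labels 1..5 by rank so that the five groups are equal in size."""
    order = sorted(range(len(values)), key=values.__getitem__)
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // len(values) + 1
    return labels


def trend_variable(values, labels):
    """Replace each value with the median of its quintile (used for a trend test)."""
    medians = {q: statistics.median(v for v, lab in zip(values, labels) if lab == q)
               for q in set(labels)}
    return [medians[lab] for lab in labels]


# Simulated (hypothetical) data: intake partly driven by total energy.
random.seed(0)
energy = [random.gauss(2000, 400) for _ in range(1000)]
intake = [0.005 * e + random.gauss(0, 3) for e in energy]

adjusted = energy_adjust(intake, energy)
quintiles = quintile_labels(adjusted)
trend = trend_variable(adjusted, quintiles)
print(sorted(set(trend)))
```

After adjustment the values are uncorrelated with total energy by construction, and the trend variable takes only five distinct values, one per quintile, which is what would be entered as a continuous term in the regression model.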
Subgroup analyses were conducted in several prespecified subgroups, including age group, sex, trial arm, BMI group, aspirin use, cigarette smoking, alcohol consumption, family history of CRC, history of colon comorbidities, colorectal polyps, and diabetes. Interaction effects across strata were tested using likelihood-ratio tests. Restricted cubic spline functions with four knots (5th, 35th, 65th, and 95th percentiles) were used to investigate non-linear associations between dietary tomato, tomato product, and lycopene intakes and CRC incidence and mortality. Notably, subjects with energy-adjusted dietary tomato/lycopene intakes < 1st or > 90th percentile were excluded to reduce potential bias from extreme values in the dose-response analyses.
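To make the spline specification concrete, the sketch below builds a restricted cubic spline basis with knots at the 5th, 35th, 65th, and 95th percentiles. It uses a widely used truncated-power parameterization (often attributed to Harrell) and a simple linear-interpolation percentile; both are assumptions chosen for illustration, since the text does not state which software or parameterization was used, and the input data are hypothetical.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of pre-sorted data."""
    pos = (len(sorted_values) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns k-1 terms for k knots: x itself plus k-2 nonlinear terms that are
    zero below the first knot and linear in x beyond the last knot.
    """
    def cube(u):
        return max(u, 0.0) ** 3

    t_last, t_prev = knots[-1], knots[-2]
    scale = (t_last - knots[0]) ** 2  # normalization keeps terms on the scale of x
    terms = [x]
    for t_j in knots[:-2]:
        term = (cube(x - t_j)
                - cube(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
                + cube(x - t_last) * (t_prev - t_j) / (t_last - t_prev))
        terms.append(term / scale)
    return terms


# Hypothetical intake values; knots at the percentiles named in the text.
data = [float(v) for v in range(1, 101)]
knots = [percentile(data, p) for p in (5, 35, 65, 95)]
print(knots)
print(rcs_basis(50.0, knots))
```

With four knots the basis has three columns (one linear, two nonlinear). A test of nonlinearity then amounts to testing whether the coefficients of the nonlinear columns are jointly zero.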

Participant characteristics
The cohort included 101,680 participants with a median follow-up of 9.54 years, corresponding to 908,801 person-years. During this period 1100 cases of CRC were reported, corresponding to an incidence rate of 12.10 per 10,000 person-years. Within a median follow-up of 14.5 years (corresponding to 1,353,326 person-years) 443 CRC-specific deaths were recorded. The average age of participants at baseline was 65.0 years. The median intakes of the five dietary items of primary interest were raw tomato 13.58 g/day, tomato salsa 0.90 g/day, tomato juice 12.76 g/day, tomato catsup 1.41 g/day, and lycopene 5.26 mg/day. The baseline characteristics of the study population according to the quintiles of the five exposure variables are summarized in Table 1 and Supplementary Tables S1-S4. Compared to the lowest quintile of energy-adjusted tomato salsa consumption, participants in the highest quintile were more likely to be younger (median age 64 years), Caucasian, more highly educated, have a history of diabetes, have a lower glycemic load, and have lower total dietary energy intake. On average they consumed less red meat, processed meat, and added sugars, and they were less likely to be current smokers. In the Q1 category (representing the lowest consumption of tomato salsa), 69.1% of participants were male. In the cohort overall, however, the sex distribution was balanced, with 48.6% being male and 51.4% being female. There were also similar trends across quintiles of the other exposures, including raw tomato, tomato juice, tomato catsup, and lycopene (Supplementary Tables S1-S4).
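The incidence rate quoted above follows directly from the reported case count and person-years. A minimal check, with a helper function name introduced here for illustration:

```python
def rate_per_10k(events, person_years):
    """Crude event rate per 10,000 person-years."""
    return events / person_years * 10_000


# Counts and person-years as reported in the text.
crc_incidence_rate = rate_per_10k(1100, 908_801)
crc_mortality_rate = rate_per_10k(443, 1_353_326)

print(f"CRC incidence: {crc_incidence_rate:.2f} per 10,000 person-years")
print(f"CRC mortality: {crc_mortality_rate:.2f} per 10,000 person-years")
```

The incidence figure reproduces the 12.10 per 10,000 person-years given in the text. The mortality rate is not reported in the text and is computed here only by applying the same formula to the reported deaths and person-years.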

Associations between CRC incidence and tomato, tomato product, and lycopene intakes
In the crude model, CRC incidence was inversely associated with both moderate (Q4) and the highest (Q5) dietary intake of tomato salsa (Q4 vs. Q1: HR 0.81, 95% CI 0.67-0.97; Q5 vs. Q1: HR 0.64, 95% CI 0.53-0.79) (Table 2). Similar results for the highest intake of tomato salsa were obtained in adjusted models (model 1, Q5 vs. Q1: HR 0.77, 95% CI 0.63-0.94, p trend = 0.016; model 2, Q5 vs. Q1: HR 0.80, 95% CI 0.65-0.99, p trend = 0.028). There were no significant associations between raw tomato, tomato juice, tomato catsup, or lycopene intake and CRC incidence. With respect to individual contributions to CRC incidence assessed after mutual adjustments for each of the five exposure variables, comparing tomato salsa Q5 and Q1 the HR for CRC incidence was 0.79 (95% CI 0.64-0.99, p = 0.037, p for trend = 0.030); thus tomato salsa intake remained a significant predictor of CRC risk even after adjustment for the other tomato variables and covariates.
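For readers who want to relate the reported HRs to their confidence intervals, the sketch below back-calculates an approximate standard error, Wald z statistic, and two-sided p-value from an HR and its 95% CI. It assumes a normal approximation on the log-HR scale and works from rounded published values, so the result only approximates the p-values in the text. The function name is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate SE of log(HR), Wald z, and two-sided p from a 95% CI.

    Assumes the CI was computed as exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a standard normal
    return se, z, p


# Model 2, tomato salsa Q5 vs. Q1, as reported in the text.
se, z, p = wald_from_ci(0.80, 0.65, 0.99)
percent_lower = (1 - 0.80) * 100  # relative reduction implied by HR 0.80

print(f"SE(log HR) ~ {se:.3f}, z ~ {z:.2f}, p ~ {p:.3f}")
print(f"Relative risk reduction ~ {percent_lower:.0f}%")
```

The approximate p-value falls below 0.05, in line with a CI whose upper bound is just under 1, and the HR of 0.80 corresponds to the 20% lower risk of CRC incidence described in the Discussion.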
There were no significant interactions between tomato salsa intake and CRC incidence in any subgroups including age, sex, trial arm, BMI group, aspirin use, cigarette smoking, alcohol drinking, family history of CRC, history of colon comorbidities, colorectal polyps, and diabetes (Supplementary Table S5, p for interaction > 0.05). Given the distinct distribution of tomato salsa intake between males and females within the Q1 category, we also performed subgroup analyses to assess the association between tomato salsa intake (as quintiles) and CRC risk by sex. There was a negative association between tomato salsa intake and CRC incidence in women (Q5 vs. Q1: HR 0.60, 95% CI 0.41-0.87, p = 0.007). In men there was only a tendency towards a negative association. There was no significant interaction effect between sex and salsa intake on CRC incidence (p for interaction = 0.703). Sensitivity analyses were performed to examine the robustness of the correlation between tomato salsa intake and CRC incidence. The analyses included the exclusion of events ascertained within 2 years, the exclusion of subjects with extreme energy intakes, and the use of additional models. In those analyses the correlation between tomato salsa intake and CRC incidence remained robust (Supplementary Table S6). Smooth curve-fitting plots did not provide any evidence of nonlinear dose-response associations between energy-adjusted tomato salsa consumption and CRC incidence after full adjustment (Supplementary Figure S1; p for nonlinearity > 0.05).

Associations between CRC-specific mortality and tomato, tomato product, and lycopene intakes
Consumption of tomato salsa was significantly associated with lower CRC-specific mortality in the crude model (Table 3, p trend = 0.024). After adjustment for confounding variables in models 1 and 2, however, there were no significant associations (Table 3, p trend > 0.05). There were no significant associations between CRC-specific mortality and the intake of raw tomato, tomato juice, tomato catsup, or lycopene. When the five exposures were mutually adjusted, comparing tomato salsa Q5 and Q1 yielded an HR for CRC mortality of 0.96 (95% CI 0.70-1.32, p = 0.807, p for trend = 0.942). Thus, there was no significant association between tomato salsa intake and CRC mortality.
In subgroup analyses there were no significant effect modifiers in the prespecified groups when the exposures were treated as categorical variables (quintiles) (Supplementary Table S7; p for interaction > 0.05). In sensitivity analyses there was also no association between dietary tomato salsa intake and CRC mortality (Supplementary Table S8). In dose-response analyses there was no non-linear relationship between tomato salsa intake and CRC mortality (Supplementary Figure S2; p for non-linearity > 0.05).

Discussion
In this prospective cohort study of 101,680 US adults, higher consumption of tomato salsa was associated with a 20% lower risk of CRC incidence after adjustment for potential confounders. There were no significant associations between the consumption of raw tomato, tomato juice, tomato catsup, or lycopene and the risk of CRC incidence or mortality. These results were robust in a series of analyses. No effect modifiers or non-linear relationships were observed. To our knowledge this is the first study to report a protective effect of tomato salsa against CRC risk. In contrast, a previous study did not observe a significant association between bladder cancer risk and tomato salsa consumption after adjustment for confounders in the PLCO cohort (29). These results suggest that the protective effect of tomato salsa against cancer risk is heterogeneous among different cancers. To further assess the individual contribution of tomato-related dietary intake to CRC risk, we conducted a series of analyses including mutual adjustment for the five primary dietary factors of interest, and additional adjustment for other foods and nutrients, and the association between tomato salsa and CRC incidence remained robust.

Table 1 (final row and notes): Olive oil (g/day) 0.0 (0.0, 0.5) 0.0 (0.0, 0.3) 0.0 (0.0, 0.4) 0.0 (0.0, 0.3) 0.0 (0.0, 0.6) 0.0 (0.0, 0.8) < 0.001. Data are presented as median (IQR) or number (percentage). "Others" refers to Asian, Pacific Islander, or American Indian. DHQ, dietary history questionnaire; BMI, body mass index. Energy from the diet was adjusted using the residual method.

Although intake of tomato and/or lycopene has been associated with reduced risk of several cancers such as hepatocellular carcinoma (13), prostate cancer (14), pancreatic cancer (15), gastric cancer (16), ovarian cancer (17), and CRC (18,19), in this large PLCO study CRC risk was not significantly associated with consumption of raw tomato, tomato juice, or tomato catsup. That was consistent with a previous study investigating bladder cancer (29), but inconsistent with a previous case-control study conducted in a CRC population in Italy, in which there was a protective association between a higher intake of tomato and the incidence of CRC (18), and sub-sites of CRC stratified by cancer site (19). These differences may be due to the retrospective nature of the previous studies on this topic, which only examined associations with total tomato intake. The current study investigated the effects of specific types of tomato products (e.g., raw tomato and tomato catsup) on the incidence of CRC, which may have differential effects on health outcomes (29). Similarly, selection and recall bias due to the retrospective designs and residual confounders in previous studies may also have led to the inconsistent results.

Although we did not specifically investigate the mechanisms underlying associations identified in the study, several potential explanations could be explored. It has been proposed that the cancer-preventing effects of high tomato consumption may be attributed to lycopene. This powerful antioxidant not only neutralizes harmful free radicals but also potentially mitigates oxidative stress, a condition associated with cellular damage and implicated in various types of cancer (30-32). Processed and concentrated tomato products such as salsa (9.28 mg/100 g) and tomato juice (7.83 mg/100 g) contain higher levels of lycopene than raw tomatoes (3.1-7.74 mg/100 g) (33), which may contribute to their cancer-protective effects (34). However, we did not observe a significant association between CRC risk and dietary intake of lycopene after adjusting for confounders, which was similar to previously reported results (21,35,36). Notably, our study only investigated the link between dietary lycopene intake and CRC risk, and did not directly measure serum lycopene levels. Because the estimated dietary lycopene absorption rate in humans ranges from 10% to 30% (37), dietary intake may not fully reflect serum lycopene levels. Therefore, the observed correlation coefficient of 0.46 between dietary and serum lycopene levels could be influenced by various factors (14). In addition, the method of cooking and chopping can affect the bioavailability of lycopene, and certain food preparation techniques may enhance absorption (38,39).

Tomato salsa (sofrito) is a traditional Mediterranean diet preparation composed of a mix of foods characteristic of the Mediterranean diet such as tomato, onion, garlic, and extra virgin olive oil, and it contains many bioactive phenolic compounds and carotenoids (40,41). The inverse association between salsa and CRC in our study may be attributable to the presence of unique additives such as olive oil. Olive oil is known to contain a variety of substances, including monounsaturated free fatty acids (such as oleic acid), the hydrocarbon squalene, tocopherols, aroma components, and phenolic compounds. Although olive oil quality can affect biological/nutritional actions (42), these components have been associated with anticancer properties (43,44). Therefore, the addition of olive oil to tomato sauces may have positive health outcomes. Furthermore, the potential health benefits may be related to the method of tomato salsa processing. Evidence from a prospective randomized, cross-over intervention study suggested that the plasma concentration and urinary excretion of naringenin glucuronide were both significantly higher after the consumption of tomato sauce than after the consumption of raw tomatoes. It was suggested that mechanical and thermal treatments during tomato sauce manufacture may help to deliver these potentially bioactive phenolics from the food matrix more effectively than the addition of an oil component, thus increasing their bioavailability (45). Moreover, because tomato salsa is a complex mixture of ingredients, potential synergistic interactions among its bioactive compounds should be considered. It is not only a rich source of the antioxidant lycopene, but also contains other bioactive components, including phenolic acids, flavonoids, and ascorbic acid (46). Interactions among these ingredients may yield synergistic effects that amplify the health benefits of tomato salsa. For instance, the bioavailability of lycopene can be significantly boosted by the presence of fats, such as those found in avocados or in olive oil, a common ingredient in salsa recipes (47). The assortment of antioxidants in salsa may offer more robust protection against oxidative stress and inflammation than any single compound could (42). Lastly, participants with a higher intake of salsa often reported other healthy dietary habits at baseline, such as a higher intake of vegetables and fruits and a lower intake of red or processed meat, which may provide additional protective effects against CRC. To better understand the potential association between tomato and CRC risk, future studies should investigate the sources, bioavailability, and serum concentration of lycopene in tomato salsa, ideally with a longer follow-up period.
The strengths of this study included its prospective design based on a large and well-established cohort (the PLCO trial), which ensured reliable data, a large sample size, and a comprehensive assessment of dietary intake of various tomato products and lycopene. The data enabled investigation of dose-response relationships, as well as long-term follow-up with a high follow-up rate to minimize reverse causality and selection bias. The study also had several limitations. Firstly, due to the observational nature of the study there may have been residual confounders that we could not control for. Secondly, the data were derived from a dietary questionnaire, and may thus have been subject to recall bias and misclassification errors. Thirdly, we only had baseline dietary information, which limited our ability to examine how changes in intake over time relate to cancer risk. Fourthly, serum assessment of nutrients was lacking, which prevented a more detailed evaluation. Fifthly, the study population was limited to the US, which may limit the generalizability of the results to other countries with different dietary patterns. Further studies with larger sample sizes and longer follow-up periods, as well as more detailed assessments of dietary intake and serum nutrient levels, are warranted to confirm our findings and better understand potential associations between dietary factors and cancer risk.

Conclusions
The current study indicates that a high intake of tomato salsa may be a beneficial addition to a healthy diet, and may contribute to CRC prevention in the adult population in the US. However, more prospective studies that involve more detailed assessments of tomato salsa intake are necessary to assess its potential effects in other populations.

TABLE 1
Baseline characteristics of the study population according to quintiles of energy-adjusted tomato salsa consumption in 101,680 participants.

TABLE 2
Association between energy-adjusted tomato-related products/lycopene intakes and colorectal cancer incidence in the PLCO cancer screening trial.

TABLE 3
Association between energy-adjusted tomato-related products/lycopene intakes and colorectal cancer mortality in the PLCO cancer screening trial.

PLCO, prostate, lung, colorectal and ovarian; HR, hazard ratio; CI, confidence interval. The crude model was unadjusted. Model 1 adjusted for age (continuous), sex (male vs. female), trial arm (intervention vs. control), and race (white, non-Hispanic vs. black, non-Hispanic vs. Hispanic vs. others). Model 2 adjusted for model 1 plus marital status (married vs. unmarried), education level (≤ high school vs. ≥ some college), aspirin use (no vs. yes), diabetes (no vs. yes), cigarette smoking (never vs. current vs. former), BMI (< 25 kg/m² vs. ≥ 25 kg/m²), family history of colorectal cancer (yes vs. no vs. possibly), alcohol drinking (never vs. former vs. current), history of colorectal polyps (no vs. yes), history of colon comorbidities (no vs. yes), and energy from diet (continuous). Model 3 adjusted for model 2, and mutually adjusted for the five exposure variables. Missing values for covariates were treated as dummy variables in the models.