Global and Historical Distribution of Clostridioides difficile in the Human Diet (1981–2019): Systematic Review and Meta-Analysis of 21886 Samples Reveal Sources of Heterogeneity, High-Risk Foods, and Unexpected Higher Prevalence Toward the Tropic

Clostridioides difficile (CD) is a spore-forming bacterium that causes life-threatening intestinal infections in humans. Although CD infection was formerly regarded as exclusively nosocomial, there is increasing genomic evidence that person-to-person transmission accounts for only <25% of cases, supporting the culture-based hypothesis that foods may be routine sources of CD-spore ingestion in humans. To synthesize the evidence on the risk of CD exposure via foods, we conducted a systematic review and meta-analysis of studies reporting the culture prevalence of CD in foods between January 1981 and November 2019. Meta-analyses, risk-ratio estimates, and meta-regression were used to estimate weighted prevalence across studies and food types and to identify laboratory and geographical sources of heterogeneity. In total, 21886 food samples were tested for CD between 1981 and 2019 (96.4%, n = 21084, 2007–2019; 232 food-sample-sets; 79 studies; 25 countries). Culture methodology, sample size and type, region, and latitude were sources of heterogeneity (p < 0.05). Although non-strictly-anaerobic methods were reported in some studies, and we confirmed experimentally that improper anaerobiosis of media/sample-handling affects CD recovery in agar (Fisher, p < 0.01), most studies (>72%) employed the same (one-of-six) culture strategy. Because the prevalence was also meta-analytically similar across the six culture strategies reported, all studies were integrated using three meta-analytical methods. At the study level (n = 79), the four-decade global cumulative prevalence of CD in the human diet was 4.1% (95%CI = −3.71, 11.91). At the food-set level (n = 232, mean 12.9 g/sample, similar across regions p > 0.2; 95%CI = 9.7–16.2), the weighted prevalence ranged between 4.5% (95%CI = 3–6%; all studies) and 8% (95%CI = 7–8%; only CD-positive studies).
Risk-ratio ranking and meta-regression showed that milk was the least likely source of CD, while seafood, leafy green vegetables, pork, and poultry carried higher risks (p < 0.05). Across regions, the risk of CD in foods for foodborne exposure reproducibly decreased with Earth latitude (p < 0.001). In conclusion, CD in the human diet is a global, non-random source of foodborne exposure that occurs independently of laboratory culture methods, across regions, and at a variable level depending on food type and latitude. The latitudinal trend (high CD-food-prevalence toward the tropic) is unexpectedly inverse to the epidemiological observations of CD infections in humans (frequent in temperate regions). Findings suggest the plausible hypothesis that ecologically richer microbiomes in the tropic might protect against intestinal CD colonization/infections despite CD ingestion.


INTRODUCTION
Clostridioides (Clostridium) difficile (CD) is a spore-forming anaerobic bacterium that causes severe enteritis, colitis, and mortality in susceptible humans, especially if affected with inflammatory bowel diseases, cancer, immunosuppression, or if taking antibiotics (1)(2)(3)(4)(5). To date, it is well-known that CD infections (CDI) in humans are more frequent in temperate regions. Latitudinal trends, however, have not been reported for CDI at continental scales. Since the first report linking CD to pseudomembranous colitis in 1975, several reports now indicate that CD could reach humans via foods (6). If the presence of C. difficile in foods were indeed linearly associated with infections, one would expect the prevalence of food contamination to be higher in temperate regions, as is the case for the incidence of CDI in humans.
CDI have increased in severity and incidence since the emergence of hypervirulent strains that caused CDI epidemics in both Canada and the UK in the mid-2000s. After the astounding isolation of such strains from young cattle and retail beef in Canada in 2005 (7,8), numerous food studies support the hypothesis of potential foodborne exposure (9,10). With the availability of genomics, elegant studies have shown that only ∼25-30% of CDI in hospitals are nosocomial, redirecting the attention to foods as viable sources of CD (11,12). As further evidence for connectivity between foods and CDI, last year a de novo genome sequencing study showed that the first CD strain derived from foods (PCR ribotype 078) in Canada in 2005 was identical to the historical strain M120 that contributed to epidemics in the UK in 2007 (13).
Unless we understand the distribution pattern of CD across foods, regions, and laboratory variability, little can be done to minimize the exposure of susceptible persons to CD in their diet. Distinguishing methodological variability from natural variability is important to assign a proper risk value to the presence of CD in the food supply (6). To formally quantify the prevalence of CD in foods and map the distributional trends over global scales, we conducted a systematic review and meta-regression of studies reporting the presence of CD in foods. The main quantitative objectives were (i) to appraise peer-reviewed studies on quality and the prevalence of CD in foods, (ii) to determine laboratory factors associated with CD-positivity, and (iii) to perform meta-analysis across regions and food items to examine reporting differences and outline latitudinal trends.
Herein, we report that the majority of studies used the same laboratory culture method for the isolation of CD, allowing us to conduct meta-analysis and rank food items based on the weighted risk of contamination across regions. Although beef and pork were food categories often containing CD, leafy green vegetables and seafoods had higher rates of contamination. Of remarkable novelty, the contamination of foods followed a latitudinal trend that is inverse to the Earth's latitude (higher toward the Equator). Although there are no global reports describing latitudinal trends for CDI, results indicate that the latitudinal trend observed in foods is inverse to what is reported and expected for infections (i.e., high incidence in temperate regions). We hypothesize that tropical microbiomes may protect against CDI.

Systematic Review, Team, and Definitions
This study follows and complies with principles of systematic review research methodology for "food safety" and food item definitions (14,15). All procedures used in this study were reported in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines in structuring our literature search analysis.
We conducted a systematic search of available literature reporting the prevalence of C. difficile in foods. Electronic databases (MEDLINE/PubMed, Scopus, Web of Science, and Google Scholar) were searched to identify all studies reporting the prevalence of CD in foods. The detailed search algorithm, questionnaire, data extraction criteria and verification are available as Supplementary Materials. Five iterative rounds of verification of extraction strategies and tools were validated to ensure reproducibility of data extraction.
In brief, a list of search terms was developed by consensus by the research team to retrieve citations pertaining to CD prevalence in foods. Search terms (n = 64 terms) relating to population (e.g., food, meat, beef, etc.) and outcome (e.g., Clostridium Clostridioides difficile) were combined to search numerous food types (or items), without restrictions. To minimize bias and errors, identified terms were pre-tested in PubMed and used to develop the final algorithm, using as basis a similar validated strategy we implemented for vegetables (16). The complete search terms consisted of the following: Citations retrieved from electronic databases were imported and de-duplicated in the reference management software EndNote Web TM (Clarivate Analytics). Search verification included manual searching of references citing the first five manuscripts reporting CD in foods or its potential for foodborne transmission (9,10,17-19) using Google Scholar in consultation with research team members, and the references of all identified studies. Experts in the field were consulted to identify unpublished data, including theses and research poster/conference presentations. Google Search Engine, limited to the first 600 hits, was searched to identify any "gray literature." An alert in Google Scholar was set up to identify any newly published studies. All potentially relevant citations discovered through the manual searching method, which were not previously identified through electronic search, were added into the review process and processed in the same manner as electronic citations. All peer-reviewed studies, dissertations and reports containing original prevalence data were eligible. Studies lacking the report of both number of samples tested (N) and number of positive samples (n) were excluded (20). Prevalence contamination data were extracted only for culture assays, and not for prevalence data based on molecular assays (21).
No restrictions were imposed in terms of the study time period, design, language, or study origin.
Relevant citations after reviewer screen 1 (RS1) were procured as full articles, and screened by two reviewers (BS and SI) using pre-tested RS2 checklists (Supplementary Tables 1, 2). Conflicts were resolved by a consensus between respective reviewers and when not possible, by senior authors of this study. During initial manual screening of selected abstracts, carcass trims or carcass washings/rinsates at the processing plants were selected for secondary analysis. Data describing environment, wastewater, animal or human fecal samples were excluded. Non-primary research studies (e.g., narrative reviews) and studies investigating other aspects (e.g., outbreak reports, test performance studies) were excluded. Case reports or case series of hospital-associated C. difficile infections, and case-control studies that did not provide prevalence estimates, and duplicate publications were also excluded. Relevant articles were assessed and categorized by food type (e.g., beef, poultry, vegetables) and descriptive characteristics (e.g., food processing level, where in the production chain was the product sampled). Through initial title and abstract-based relevance screening one (RS1), potentially relevant primary research articles were identified.

Extraction Tool and Risk of Bias
Prior to reading the manuscripts, two meeting sessions (phone, and in person) took place (ARP, SI, BS, and AD) to discuss and create a draft Data Extraction Tool (DET; list of questions and response categories, see Supplementary Table 1) to standardize the extraction of data required for statistical analysis and testing of study objectives. Following five iterative rounds of verification for accuracy and clarity, the pre-final extraction tool was pretested by ARP and BS at CWRU, and SI and JM at OSU, using 10 studies (the first five in the 1980s, and 5 in 2015) (9,10,17-19). Phone conferences occurred biweekly during this phase to estimate test agreement, to address concerns, to edit/improve, and thus finalize the DET. The pretesting and definitive data extraction were conducted after the participating reviewers (KM, BS) were trained on laboratory methodologies available for CD by senior scientists from two institutions (ARP and SI). Four reviewers extracted data independently. Data extraction was verified by two senior authors for data interpretation, extraction and accuracy. The final DET was used to extract all relevant research articles, which were assessed for methodological soundness and bias as part of the data extraction strategy, by at least two reviewers, using the prevalence study-based criteria.
All studies were assessed by rating each of the 6 quality assessment items listed in the DET into dichotomous ratings: low risk (1) and high risk (0). An overall Risk of Bias score was calculated by adding the numeric value of all six items. High scores indicate low risk of bias and stronger method quality. Measures of data SD or variability were estimated using the number of food samples tested and the percentage of positive samples. Because available statistical methods for bias have previously been shown to be inaccurate and misleading with effects that are close to the extremes, for instance close to 0 or to 100% (22), publication bias was tested using funnel plots and Egger's statistics using study size at the food set level instead of the standard error of the effect, as recommended for proportions with high data/effects polarity (22). As we recently mentioned (23), however, it is uncertain how many studies start but do not get published due to the lack of a prepublication registry of prevalence-based studies in foods.
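The scoring rule described above can be sketched in a few lines. This is a minimal illustration, assuming six dichotomous items; the item names below are hypothetical placeholders and not the wording of the DET in Supplementary Table 1.

```python
from typing import Dict

# Hypothetical item names for illustration only; the actual six items are in the DET.
EXAMPLE_ITEMS: Dict[str, int] = {
    "sampling_described": 1,
    "sample_size_reported": 1,
    "culture_method_described": 1,
    "positive_control_used": 0,
    "anaerobic_conditions_reported": 0,
    "isolate_confirmation_reported": 1,
}


def risk_of_bias_score(items: Dict[str, int]) -> int:
    """Sum six dichotomous ratings (1 = low risk, 0 = high risk)."""
    if len(items) != 6:
        raise ValueError("expected exactly six quality items")
    if any(v not in (0, 1) for v in items.values()):
        raise ValueError("each rating must be 0 or 1")
    # Higher totals indicate lower risk of bias.
    return sum(items.values())


score = risk_of_bias_score(EXAMPLE_ITEMS)
print(f"Risk of Bias score: {score} of 6")
```

Under this convention a score of 6 would correspond to a study rated low risk on every item.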

Pooled Ratios, Meta-Analysis, and Meta-Regression
Extracted data were used to estimate risk ratios and perform a prevalence meta-analysis. Three main categories of data were extracted: sample characteristics, methods, and prevalence data. All food items were grouped for analytical purposes into food item categories (e.g., pork, leafy green vegetables). Pooled risk ratios (RR, 95% CI) for each food group were calculated to quantify the differences and rank the foods according to the risk of being contaminated using a random-effects model (16, 24). In brief, heterogeneity tests with Higgins' I² statistic were performed to determine the extent of variation between the studies, based on the deviation of each within-study variance from a central estimate of the collective between-study variance distribution (24). Meta-analysis was used to estimate the overall prevalence of CD in foods globally and per region by pooling variances of proportions in a random-effects model using the DerSimonian and Laird method (25,26). Analyses were performed using R software and Metaphor (27), and Stata's Metaregression and Metacum functions. To illustrate the cumulative meta-analytical prevalence of CD globally and regionally at the "study-level" (n = 79) over the past 4 decades, we analyzed and plotted the data as a forest plot as previously reported (23). Because each study tested multiple "food item categories," we then decomposed the study variance across each food item, within each study, and constructed the remaining forest plots at the item level presented in this study. Exact binomial weighted and pooled estimates at "item level" (n = 232) are presented in forest plots both without adjusting for "zero-studies" (which excludes 0% prevalence studies), and with adjustments using either a balanced addition of 1 to n and N, or the Freeman-Tukey double arcsine transformation, which include 0% prevalence studies (27). For meta-regression and latitudinal analysis, coordinate data were obtained from NASA.
To determine if the reported prevalence was influenced by the amount of food tested, data were extracted as absolute values in grams. Modeling and latitudinal simulations were conducted in R and STATA (27) (Supplementary Materials).
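As an illustration of the random-effects pooling and I² heterogeneity statistic named above, the following is a minimal sketch of the DerSimonian and Laird estimator applied to untransformed proportions. It is not the authors' R or Stata code; the function name, the returned keys, and the example counts are assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def dersimonian_laird(events: Sequence[int], totals: Sequence[int]) -> Dict[str, float]:
    """Pool untransformed proportions with the DerSimonian-Laird estimator."""
    p = [e / n for e, n in zip(events, totals)]
    if any(pi <= 0.0 or pi >= 1.0 for pi in p):
        # The binomial variance is zero at 0% or 100%, so inverse-variance
        # weights are undefined; such sets need a transformation or adjustment.
        raise ValueError("untransformed pooling needs 0 < prevalence < 1 in every set")
    v = [pi * (1.0 - pi) / n for pi, n in zip(p, totals)]  # within-set variances
    w = [1.0 / vi for vi in v]                             # fixed-effect weights
    sw = sum(w)
    p_fixed = sum(wi * pi for wi, pi in zip(w, p)) / sw
    q = sum(wi * (pi - p_fixed) ** 2 for wi, pi in zip(w, p))  # Cochran's Q
    df = len(p) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0        # between-set variance
    w_re = [1.0 / (vi + tau2) for vi in v]                 # random-effects weights
    pooled = sum(wi * pi for wi, pi in zip(w_re, p)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # Higgins' I2, percent
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical food sets: positives and samples tested in each set.
result = dersimonian_laird(events=[1, 30], totals=[100, 100])
print({k: round(val, 4) for k, val in result.items()})
```

The explicit error for 0% prevalence sets mirrors why untransformed pooling cannot include "zero-studies" without one of the adjustments described in the text.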

Experiments With C. difficile on Non-anaerobic Media
The exposure of CD spores to conditions suitable for growth (high moisture, nutrients, and warmth) triggers spore germination even in room air. However, the subsequent step, i.e., bacterial growth from germinated spore to cell division, does not occur in the presence of air/oxygen. Because (i) most studies did not report whether the reagents or the handling of foods in growth media were fully anaerobic, and because (ii) the germination of CD spores and the subsequent viability of vegetative daughter cells are influenced by the lack of strict anaerobiosis, we determined if a source of low CD recovery and study variability could be partly due to negative selection when non-reduced reagents are used by scientists. To test this hypothesis we plated (superdormant) spores aged for 1 year in PBS as described (28) on TSA agar enriched with 5% defibrinated sheep blood. Two pre-reduced agar conditions were compared, which differed only in the length of time the agar had been incubated (pre-reduced) anaerobically before being used for bacterial inoculation using our novel rapid enumeration Parallel Lanes Plating method (29).

Global Distribution of Studies Reporting C. difficile in Foods and Publication Bias
The standard PRISMA diagram depicted in Figure 1 summarizes the flow process of study selection for this systematic review. Studies published between 2007 and 2019 accounted for 96.4% of all food samples tested (n = 21084; the last 12 years), making this analysis current and relevant to modern concerns about the transmission of CD via the food supply. Eleven studies (13.9%) reported the absence of CD in the food samples tested (CD-negative). Only 30% of studies (n = 24) were dedicated to testing only one type of food. Most studies tested between 2 and 4 food types. Funnel plot analysis indicates there has been absent-to-moderate publication bias, depending on the statistical method used in the funnel analysis to handle reports close to 0% prevalence, as illustrated in Figure 2.

Historical Study Referents of C. difficile Isolation From Foods
This meta-analysis illustrates the geographical distributions of the numerous laboratories around the world that have been examining the potential of foodborne transmissibility of CD spores to humans, via the food supply. Figure 3 depicts in a map the arithmetic average of the CD prevalence reported for local food items across countries, and other descriptive features of the studies. Of note, since the first report, there have been periods of oscillations possibly reflecting trends in research interest or funding availability.
Historically, the first study attempting to quantify the prevalence of CD in ready-to-eat foods was published by Fekety et al. (30) in a hospital setting. Using a direct culture approach (effective for isolation of CD from environmental surfaces) on hospital meals, this study yielded no CD. The following year, two reports highlighted the foodborne and zoonotic potential of CD transmission to humans (19,111), but a period of quiescence lasted until 1996, when Broda et al. (18) made a food science report of incidental isolation of CD from spoiled "blown-packed" meats in New Zealand. Google citation statistics of Broda's publication indicate that her findings were only relevant to food spoilage studies, and not cited in "public health" or "food safety" reports due to human health concerns until our reports in 2006 discovered the presence of hypervirulent epidemic CD strains in food-producing animals and retail beef in Canada (9,10). No citations of Broda et al. occurred on the basis of foodborne/health concerns between 1996 and 2006 (0 vs. 30 citations on meat spoilage), but these steadily increased to 27 foodborne citations after the 2006 reports (9, 10) (51 citations on meat spoilage, mainly due to Clostridium estertheticum; Fisher's p < 0.001). Citation analyses support the reproducibility and historic context of our systematic review, with minimal publication interest in the "foodborne potential of C. difficile" before 2006. See Figure 4 for a graphical representation of the historical context and order in which numerous laboratories around the world tested food items intended for human consumption since 1981.
Although the cumulative prevalence of CD in the foods tested has been 4.1% globally at the study-level (two-tailed 95%CI = −3.71, 11.91, Table 1), we demonstrate that the cumulative prevalence has distinct patterns of heterogeneity (variance) depending on the region, being comparably lower at the study-level in Europe (1.9%; 95%CI = −7.49, 11.29; see Figure 5 for cumulative estimates in other regions).

Overall Food Contamination: Food-Type Level Analysis
Because most studies (>75%) tested more than one "food-item type/category" (e.g., "beef," "vegetables"; 2.95 ± 1.8 categories/study), and because pooling data from distinct food categories as a single CD prevalence for each study was deemed biologically inappropriate, and non-informative to generate food-based risk ranks, we extracted data separately for each food item tested in all studies. Thus, together, this meta-analysis represents 21886 samples of retail foods tested across 232 "food item sample sets." On average, each food set comprised 92 ± 127 samples; maximum = 956. For the pooled analysis, the 232 food sets were grouped into 20 food categories (e.g., "pork," "seafood," "mixed meats"), with "beef" being the most studied commodity (see cumulative statistics in Supplementary Table 3). Reported CD prevalence at the "food-category level" ranged from 0 to 100%. On average, studies tested 12.9 grams of food per sample (SD, 13.9, 95%CI = 9.7, 16.2; min = 0.7, max = 50) with no differences between regions (adjusted p > 0.2) controlling for year, although year was associated with an increased amount of food tested over time (0.4 g/sample per year, adjusted p = 0.083).
As a single unweighted statistic, the arithmetic mean for the CD prevalence in foods at the food-type level was 10.6 ± 16.6% (Supplementary Table 4). Because differences exist across regions and food categories tested, and because estimations depend on the inclusion of data from zero prevalence studies, we then computed the overall adjusted weighted meta-analysis cumulative prevalence considering the sample sets and regions, and three statistical methods to account for the 0% prevalence in CD-negative studies. Notice that Figure 6 illustrates the heterogeneity (I² statistics) across regions and the 232 food sets; at the same time it illustrates that the overall prevalence of C. difficile in foods ranges between 4.5% (95%CI = 3-6%, for all CD-positive and CD-negative studies combined) and 8% (95%CI = 7-8%, for the CD-positive studies only).
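To make the handling of 0% prevalence sets concrete, the sketch below shows two of the adjustments named in the Methods: the balanced addition of 1 to n and N, and the Freeman-Tukey double arcsine transformation. The function names are assumptions, the variance formula is the commonly used approximation 1/(N + 0.5) for the unhalved form, and the back-transformation shown is the simple sine-squared inversion rather than any refined method.

```python
import math
from typing import Tuple


def add_one_proportion(x: int, n: int) -> float:
    """Balanced continuity adjustment: add 1 to positives and to samples tested."""
    return (x + 1) / (n + 1)


def freeman_tukey(x: int, n: int) -> Tuple[float, float]:
    """Freeman-Tukey double arcsine transform and its approximate variance."""
    t = math.asin(math.sqrt(x / (n + 1))) + math.asin(math.sqrt((x + 1) / (n + 1)))
    var = 1.0 / (n + 0.5)  # finite even when x == 0
    return t, var


def back_transform_simple(t: float) -> float:
    """Simple inversion of the transform back to a proportion (an approximation)."""
    return math.sin(t / 2.0) ** 2


# Hypothetical CD-negative food set: 0 positives among 50 samples.
t0, v0 = freeman_tukey(0, 50)
print(round(add_one_proportion(0, 50), 4), round(t0, 4), round(v0, 4))
```

Because the variance depends only on the number of samples tested, a CD-negative set receives a usable weight instead of being dropped from the pooled estimate.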

Heterogeneity and Overall Prevalence of C. difficile in Foods Is Independent of Culture Method
To date, one of the most cited factors to explain differences in CD across food studies is the existence of variability across methods and reagents (Supplementary Table 5). Although we have not seen recovery differences for antibiotics used as selective reagents in food studies (cycloserine-cefoxitin vs. cysteine hydrochloride-norfloxacin-moxalactam, CDMN) (17,113), we examined the role that culture methods play in this meta-analysis.
Although studies clearly report the use of anaerobic jars (21), culture media (e.g., CDMN, BHI), and homogenization methods for sample disruption (e.g., stomachers, blenders) which mix samples with room air, unfortunately, most studies did not specify clearly if reagents were pre-reduced (incubated anaerobically prior to utilization) or if protocols were anaerobic (77,102,105). Because 73.4% of studies did not use positive controls (58/79; Supplementary Table 6), it is impossible to infer if protocols were fully anaerobic. To test if the incubation of CD spores in non-reduced media (e.g., agar freshly removed from a refrigerator) inhibits CD recovery, we conducted experiments in vitro. Using 1-year-aged spores from human PCR-ribotypes 078, 027, 077, strains 630 and ATCC 1869 (13,28,53), we observed that the use of non-reduced agars results in no CD recovery compared to using agars pre-reduced in an anaerobic chamber 4 h prior to inoculation (0/10 vs. 10/10, Fisher exact p < 0.001). Because 26.9% of studies also reported short periods of incubation (e.g., overnight), we determined if short incubation influenced CD recovery. Of relevance, aged CD spores grew slowly, requiring ∼72 h to produce the same surface biomass (per colony on agar) as that produced by vegetative cells in 24 h. Although results indicate that non-reduced media and short incubations could yield false-negative study results, we deemed these to be common error factors randomly distributed across methods.
Thus, we next examined the role of overall culture strategies, by cataloging and grouping all reported methods into six different categories based on sequence of isolation steps and three culture strategies: (i) direct plating on agar, (ii) enrichment of the foods using liquid media prior to culture on agar, and (iii) the use of ethanol or heat to eliminate non-spore forming microbes in foods prior to culture in liquid media or agar to favor the growth of CD spores. Frequency analysis showed that almost three-quarters of all food samples tested (70.8%) used the methodological strategy reported in the first index report of CD in foods in 2006 (9,10). Confirming that the five remaining methods had comparable CD recovery, univariate and weighted predictive meta-analysis showed that all six methods were statistically similar (see Figure 7).
Publication bias, journal impact factor, and the amount of food tested were also ruled out as sources of variability. However, we discovered that the number of samples tested per food set correlated inversely with CD prevalence (linear regression p = 0.007; meta-regression p = 0.067 controlling for region/method, Supplementary Figures 1, 2 and Supplementary Table 7). Although seasonality has yielded heterogeneity in food animals (i.e., low prevalence in summer; high in winter in temperate regions), seasonal variability could not be tested since 85.9% of studies did not include referents or surrogates for season. Together, Figure 7 and the analysis described illustrate that different culture strategies cannot explain the prevalence heterogeneity reported in the literature, and confirm that all studies can be integrated in this meta-analysis.
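The recovery comparison reported above (0/10 on non-reduced agar vs. 10/10 on pre-reduced agar) can be checked with a two-sided Fisher exact test. The sketch below computes it directly from the hypergeometric distribution using only the standard library; the function name and the table layout are assumptions for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x successes in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is as extreme as, or more extreme than, the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: non-reduced agar, pre-reduced agar. Columns: CD recovered, not recovered.
p_value = fisher_exact_two_sided(0, 10, 10, 0)
print(f"Fisher exact p = {p_value:.2e}")
```

With these margins only the two most extreme tables contribute, so the p-value equals 2/C(20,10), which is consistent with the reported p < 0.001.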

Contamination Risk Analysis Ranks Vegetables and Seafoods as High-Risk Food items
Of relevance to risk statistics, over one-quarter of food sets were CD negative (64/232; 27.8%, 95%CI = 22.2, 32.2). However, from a clinical perspective, doctors and patients could benefit from knowing which foods are more likely to be contaminated, to determine how diets can be adjusted during periods of increased susceptibility (e.g., cancer, IBD), for instance, by cooking or avoiding high-risk foods.
Since different food items could be contaminated with different probability risks, we calculated risk ratios (RR) to rank each food group with respect to the food yielding the lowest combined prevalence of CD, and also using meta-analysis weighted estimates (Figure 8). Using milk as a reference (which had the lowest prevalence, but clinically important CD strains) (20), vegetables, seafoods and pork had the highest RR. Compared to milk, vegetables were 21.9 times more likely to yield CD, while seafoods and pork were 14.3 and 12.9 times more likely, respectively. Figures 9-12 comparatively illustrate the weighted prevalences for the following food categories: beef and vegetables, poultry, pork and seafood. Comparing retail beef, leafy green vegetables and root vegetables, Figure 9 illustrates that leafy green vegetables are twice as likely to carry CD compared to root vegetables. Supplementary Figures 3-5 display detailed forest plots for both statistical methods for beef and vegetables, and for all the foods tested including mixed meals, and others based on the biological origin (animal/plant) of the food.
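The idea of ranking a food group against a reference food can be illustrated with a simple unpooled risk ratio and a log-based 95% confidence interval. This is a sketch only: the study used pooled, random-effects estimates, and the counts below are hypothetical rather than taken from the dataset.

```python
import math
from typing import Tuple


def risk_ratio(pos_food: int, n_food: int, pos_ref: int, n_ref: int,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Risk ratio of a food group vs. a reference group, with a log-scale CI."""
    if min(pos_food, pos_ref) <= 0:
        raise ValueError("both groups need at least one positive sample")
    rr = (pos_food / n_food) / (pos_ref / n_ref)
    # Standard error of log(RR) for two independent binomial samples.
    se_log = math.sqrt(1 / pos_food - 1 / n_food + 1 / pos_ref - 1 / n_ref)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high


# Hypothetical counts: 20/100 positive in a food group vs. 2/100 in the reference.
rr, low, high = risk_ratio(20, 100, 2, 100)
print(f"RR = {rr:.1f} (95% CI {low:.1f}, {high:.1f})")
```

A reference group with very few positives, as with milk here, yields wide intervals, which is one reason weighted estimates are reported alongside the ranking.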

The Probability of Recovery C. difficile From Foods Increased Latitudinally Toward the Tropic
To determine whether the prevalence of CD in foods was influenced by Earth's latitude, we added the positional coordinates to the dataset. Both unadjusted and arcsine-adjusted meta-regression revealed that latitude determines the magnitude by which CD has been isolated from foods worldwide. While longitude was non-significant, latitude had a negative linear correlation with CD prevalence (in a $y = \beta_0 + \beta_1 \chi_1$ model; Figure 13A). Since several studies reporting high prevalence were from mid-range latitudes, collectively the data displayed a concave pattern (in a $y = \beta_0 + \beta_1 \chi_1 + \beta_2 \chi_1^2$ model; Supplementary Figures 6, 7).
FIGURE 4 | Graphical and historical overview of food item categories (types) for human consumption tested for C. difficile (1981-2019). Contextualization of food items tested across continental regions. For a historic narrative see Results section. * First studies relevant to risk of ingestion of C. difficile and food microbial safety epidemiology. a First study in human hospital menus; negative results. Others (36,44,52) yielded positive results (112). b First isolation from food produced by invertebrate insects (honey). c First study in drinking water. d First isolation of C. difficile from retail raw root vegetables. e First isolation of C. difficile from animal-derived meat product, incidental finding while studying clostridia in spoiled and blown vacuum-packed sausages. No recognition of relevance to human health. f First study on retail food derived from farm animals destined for mass scale production of food for humans with genotyping evidence of C. difficile hyper-virulent strains present in retail foods. Isolates obtained from retail ground beef purchased in Guelph, Ontario, Canada, 2004-2005. PCR ribotypes had assigned international nomenclature by Dr. Jon Brazier, U. of Wales, UK. g First national systematic sampling study reporting seasonality of C. difficile in foods, Canada, 2006.
Notice heterogeneity across regions. The detailed decomposed heterogeneity at the "food-item-category" level (food types), for 232 food sets is presented below.
Frontiers in Medicine | www.frontiersin.org
FIGURE 6 | Forest plot of weighted prevalence of C. difficile at "Food-category" level ("food item sets," n = 232). A version in PDF that can be magnified to high resolution is available in FigShare. Each plot represents different analytical strategies that differed on the method used for data transformation to deal with "zero" prevalence reports. Data ranked by author and year. See estimates (ES) and weights (W) for each region in green and shaded ovals. Note that the confidence intervals (CI) overlap irrespective of analytical adjustments. (A) Meta-analysis conducted with untransformed proportions. This mathematically excludes food sets with 0% prevalence (red font, ∼25%). (B) Meta-analysis conducted after adding 1 to the denominator and numerator (n/N), and after using the Arcsine Transformation of the raw data which forces the inclusion of adjusted data derived from "zero" prevalence. Note that the studies excluded in panel A (red font) are re-ranked in adjusted analyses. The smaller size of circles with the arcsine transformation illustrates better adjustment of heterogeneity. Confidence intervals are exact binomial (Clopper-Pearson). P < 0.05 indicates pooled prevalence is different from zero. I², heterogeneity test, p < 0.05 indicates the "true effect" across studies is not the same. Random-effects, DerSimonian/Laird statistics.
However, after dividing the 232 food sets into 22 food-per-continent subsets to control for longitude (e.g., beef in Africa vs. Asia), regression slope analysis (in $y = \beta_0 + \beta_1 \chi_1$) showed that latitude correlated negatively with prevalence in 94.5% of the 22 data subsets (Sign p < 0.0001). Such reproducible correlation was not due to chance, since random allocation of latitude values in 25 simulations showed non-reproducible slopes (Sign p = 0.35). Findings were also validated using predictive spatial density map simulations on a 2D-plot representing the Earth's surface (Figures 13B-E, adjusted p < 0.001). Contour-density plots illustrate that the patterns of CD have a spatial latitudinal structure that is different from simulations of randomly spaced studies. For the first time, the prevalence of CD in the human diet is shown to have a latitudinal pattern. Occurring reproducibly across continental longitudes, with comparatively higher CD prevalence in regions closer to the tropic, this CD-in-foods trend is opposite to what is expected for CDI in humans, where cases seem to be more frequent in temperate regions, away from the tropic.
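The slope-sign reasoning in this section can be sketched as follows: fit a simple linear slope of prevalence on latitude within each subset, count how many slopes are negative, and apply a two-sided sign test. The function names and all numbers below are hypothetical; in particular, 21 negative slopes out of 22 is an assumed count chosen only to approximate the reported proportion.

```python
from math import comb
from typing import Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope (beta_1) of y on x in a y = beta_0 + beta_1 * x model."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def sign_test_two_sided(k: int, n: int) -> float:
    """Two-sided sign test p-value for k negative slopes out of n, under p = 0.5."""
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n
    return min(1.0, 2 * min(upper, lower))


# Hypothetical subset: prevalence (%) falling as absolute latitude (degrees) rises.
slope = ols_slope([0.0, 10.0, 20.0], [6.0, 4.0, 2.0])
p_sign = sign_test_two_sided(21, 22)
print(f"slope = {slope:.2f} % per degree, sign test p = {p_sign:.2e}")
```

If slopes were negative or positive with equal probability, such a lopsided count would be very unlikely, which is the logic behind comparing the observed subsets with random reallocations of latitude.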

DISCUSSION
The present meta-analysis, for the first time, summarizes the distribution of CD in the human diet, which we derived from data from 79 studies conducted between 1981 and 2019. Although this study encompasses almost 40 years of available reports, which could be perceived as a representation of a wide array of non-comparable culture methodologies, our analysis illustrates that the majority of studies were published over the past 12 years as shown in  10)] and reagents, which did not yield statistical differences in integrated meta-analysis despite being strategically different, as illustrated in Figure 7, regardless of the region. Lastly, we provide cumulative meta-analysis per region, weighted meta-analysis per region and food type, and meta-regression statistics, which are conservative because they control for and minimize the bias produced by individual studies as time progresses, or by the weight of influential studies. The conclusions herein derived are therefore deemed to be unbiased and representative of what has been reported, primarily in the recent literature.
and 95% CI). Note that the RRR ranks beef, poultry, pork, and vegetables at different levels although they cluster together in panel A, which clusters these products together because those were more commonly tested across regions. Horizontal bars connect products with statistically similar RRR. Distinct superscripts denote statistical differences, Chi-square p < 0.05. (C) Meta-analytic display of weighted prevalence estimates at the "food set" level. Top panel displays data from all studies, including "zero" prevalence reports (Arcsine transformation, homogeneous adjustment for variability, see comparably-sized small circles), while the bottom panel displays data of only positive studies (larger variably-sized circles, see area within rectangular polygon). Vertical ovals in top panel highlight representative clusters of reports describing high prevalence of C. difficile in certain foods, supporting the ranking statistics in (B). Leafy green vegetables are ranked high since estimates are from studies with larger sample sizes/more weighted influence (small circles, narrower CI in bottom panel; larger circles in arcsine-adjusted top panel).
The estimated prevalence of CD in foods ranged between 3 and 8% globally, and between 0 and 22% regionally. We also identified, for the first time, a latitudinal trend in foods, with increased rates of CD recovery in food toward the tropic. The analysis of almost twenty-two thousand samples across the globe, as a robust representation of the human diet, indicates that prevalence heterogeneity exists independently of culture methods. While it had been assumed that the variability in findings was due to differences in culture methods, our analysis (verified using I² statistics) demonstrates that there were no significant differences in CD prevalence across methods, and that most studies used the same methodology. Prevalence estimates also varied within studies conducted by the same author, which cannot be explained by variations in culture methods. Often, the same method applied to different food items yielded different rates of CD contamination within the same report. Such differences reflect real variance of CD in the food supply.
FIGURE 9 | Comparative global and regional prevalence of C. difficile in beef and vegetables. Forest plots using the Arcsine Transformation of the raw data force the inclusion of adjusted data derived from "zero" prevalence studies. Confidence intervals (CI) are exact binomial. Rectangular ovals denote overall estimates. Shaded ovals, region estimates. Black ovals denote overall estimates from unadjusted meta-analysis (detailed plots with higher prevalence estimates from unadjusted data to include only CD positive studies are in Supplementary Figures 3, 4).
Note the larger variability among studies conducted with vegetables (wide overall CIs) compared with the variability for beef products (narrow overall CI). Leafy green vegetables are twice as commonly found to contain CD as root vegetables. Analysis of these three food type categories, based on weighted mean prevalences, ranks leafy green vegetables as the most likely to carry CD, and beef products as the least likely.
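To illustrate why the arcsine transformation keeps "zero" prevalence food sets in the pooled estimate, the following minimal Python sketch combines an arcsine transformation with DerSimonian/Laird random-effects pooling. It is an illustration under stated assumptions, not the software used for the figures: the single-arcsine form t = asin(√p) with variance 1/(4N) is assumed, the counts are hypothetical, and at least two food sets are required.

```python
import math


def arcsine_transform(events: int, total: int):
    """Return (t, variance) with t = asin(sqrt(p)) and variance = 1 / (4 * total).
    The variance stays finite even when events == 0."""
    p = events / total
    return math.asin(math.sqrt(p)), 1.0 / (4.0 * total)


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird moment estimator.
    Returns (pooled, standard_error, tau2, i2_percent)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * ti for wi, ti in zip(w, effects)) / sw
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    pooled = sum(wi * ti for wi, ti in zip(w_star, effects)) / sw_star
    se = math.sqrt(1.0 / sw_star)
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, tau2, i2


def back_transform(t: float) -> float:
    """Map a transformed value back to a proportion, clamped to [0, 1]."""
    t = min(max(t, 0.0), math.pi / 2)
    return math.sin(t) ** 2


# HYPOTHETICAL food sets as (CD-positive samples, samples tested).
food_sets = [(0, 40), (3, 50), (10, 120), (1, 30)]
transformed = [arcsine_transform(e, n) for e, n in food_sets]
effects = [t for t, _ in transformed]
variances = [v for _, v in transformed]

pooled_t, se, tau2, i2 = dersimonian_laird(effects, variances)
low, high = pooled_t - 1.96 * se, pooled_t + 1.96 * se
print(f"pooled prevalence: {back_transform(pooled_t):.3f} "
      f"(95% CI {back_transform(low):.3f}-{back_transform(high):.3f}), "
      f"tau2 = {tau2:.4f}, I2 = {i2:.1f}%")
```

With untransformed proportions, a food set with 0 positives has an estimated variance of zero and cannot be given an inverse-variance weight, which is why such sets drop out of the unadjusted analysis.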
Although earlier articles speculated that the identification of CD in foods could have been due to poor techniques and cross-contamination, high-quality studies have shown that contamination is an obsolete argument for discounting the value of identifying toxigenic and even emerging virulent strains of CD in the food supply, which have been shown to be genetically similar to strains of clinical relevance in distant regions (13). Because a number of studies reported 0% prevalence of CD, it is possible that there are natural sources of contamination heterogeneity in foods, as is the case for other known foodborne pathogens. There is evidence to support that the risk changes as a function of climate and latitude. It has been established that the tropic has ecologically greater microbial diversity (114), but how such diversity could determine the presence of C. difficile in the food supply across regions is uncertain. If CD contamination is higher toward lower latitudes while CDIs are more often reported in temperate latitudes, one possible explanation is that more diverse microbiomes in the gut, environment (114), and foods toward the tropic could prevent CD colonization and CDI.
Our study only examined the reported prevalence of CD in food items, regardless of the toxigenic potential of the identified isolates, as assessed in cultured cells or in susceptible hosts. Virtually every study recovering CD has determined that the isolates carried at least one of the three toxins or genes needed to fulfill the criteria for CD toxigenicity (tcdA, tcdB, cdtA/B). Similarly, recent studies have used molecular methods to determine the epidemiological distribution of the isolates in human hospitals. However, because the performance, acceptability, and generalizability of molecular typing methods vary across regions, and because there is no single unified system for CD strain typing or nomenclature worldwide to make meaningful comparisons at global scales, we refer readers to the original publications to examine the strength of the genomic evidence reported in each epidemiological study. As historically highlighted, there is molecular evidence that the presence of CD in the human diet is genuine and not due to laboratory cross-contamination with CD from human specimens. Major examples include the complete genome sequence of the first food-derived PCR-ribotype 078 isolates in Canada, which matched contemporary strains affecting humans in the UK in the mid-2000s, when there was no physical connection between the laboratories that reported both studies (13). Supporting the remarkable risk for CD exposure via seafoods, we also highlight the latest report of CD in foods, conducted in the Adriatic Sea, where mussels and clams contaminated at a mean prevalence of 16.9% (CI: 14.1-19.8%) carried a large proportion of CD representing diverse genotypes commonly isolated in European hospitals (113 CD isolates represented 53 genotypes, with 40.7% of them belonging to CD seen in CDI in hospitals) (79).
Although the present meta-analysis showed significant latitudinal heterogeneity, one limitation of the reported studies, and therefore of the coordinate data used for the analysis, is that the latitudinal position of the samples collected and processed by each study's authors was inferred from the location of each research center; the actual coordinates of origin of each sample were not reported in any study. However, since the analysis is conducted at the global scale, and the research-center location is considered a proxy for the exposure risk relevant to the local communities in the districts sampled by the researchers, the analysis and findings are deemed pertinent and good indicators of the effect of latitudinal positioning and of the CD prevalence trend observed at the global scale. Lastly, most studies have been conducted in Northern regions; however, to increase the study power, we normalized all latitudes by squaring the latitudinal coordinates to express them as the distance from the equator toward both hemispheres, which is a mathematically and geo-positionally standard method to gain statistical symmetry around latitude 0 (equator).
In summary, with respect to latitudinal positioning on earth, this study does not intend to make inferences or comparisons between the northern and southern hemispheres. It only addresses the effect of absolute coordinates which, by virtue of being positive (by squaring negative coordinates), may be more representative of, or inflate, the ecology of the northern latitudes. Because local climates vary in opposing terms as latitude increases toward the poles (winter in the north, summer in the south, and vice versa; not controlled in this study because precise temporal referents were not reported in the reviewed studies), it is advisable that future studies provide databases containing the coordinates, day/month of the year, and air temperature for each sample, together with the CD test results, to validate and further test the latitudinal trend and hypothesis generated in this systematic review.
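As a minimal sketch of the per-sample record recommended above, and of how a signed latitude folds into a positive distance from the equator, consider the following Python example. The class name, field names, and coordinate values are hypothetical and are not taken from any reviewed study; the folding is written as the square root of the squared latitude, which equals the absolute latitude.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FoodSampleRecord:
    """HYPOTHETICAL per-sample record with the descriptors suggested in the text."""
    latitude_deg: float    # signed: north positive, south negative
    longitude_deg: float
    collected_on: date     # day/month (and year) of collection
    air_temp_c: float      # air temperature at collection
    cd_positive: bool      # culture result for C. difficile

    @property
    def abs_latitude_deg(self) -> float:
        """Distance from the equator in degrees: sqrt(lat^2) == |lat|."""
        return math.sqrt(self.latitude_deg ** 2)


# Two hypothetical samples at mirror latitudes in opposite hemispheres.
north = FoodSampleRecord(33.9, -84.4, date(2019, 1, 15), 6.0, False)
south = FoodSampleRecord(-33.9, 18.4, date(2019, 1, 15), 24.0, True)

print(north.abs_latitude_deg, south.abs_latitude_deg)
```

Both records map to the same absolute latitude even though the same calendar date falls in opposite seasons, which is why recording the date and air temperature alongside the coordinates would allow season to be controlled for.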
Concerning the recognition of foodborne-associated CDIs, it is crucial to emphasize that the confirmation of cases is challenging due to the difficulty in predicting which individuals will get CDI. However, it is well known that several "traditional" risk factors are required to allow the colonization of CD in the intestinal tract (1-6), and the subsequent production of toxins needed to induce disease. Several studies in animal models with wild-type (healthy, non-dysbiotic) gut microbiomes indicate that the sole ingestion of CD may lead to intestinal colonization, but not necessarily to toxin production in sufficient quantities to be detected in the feces, and is therefore unlikely to induce disease (115). Therefore, it is essential to emphasize that education programs promoting preventive food safety (cooking) measures to minimize the exposure to food-dwelling CD could be beneficial if combined with educational information regarding the "traditional" risk factors known to alter the gut microbiome (antibiotics; immunosuppression), an alteration which is a prerequisite for CDI symptoms to occur. Both "traditional" and "emerging" factors that increase the susceptibility to CDI therefore need to be addressed simultaneously with the risk of ingestion of CD with foods. More recently, elegant studies have illustrated "emerging" risk factors that promote CDI virulence. For instance, food additives, e.g., trehalose, could serve as triggering determinants to facilitate the colonization and virulence of ingested CD spores (116-118). This applies especially to hypervirulent strains (including the first PCR ribotypes 027 and 078 we described in food animals and foods in 2007), which might have acquired metabolic trehalose-related genes (116-118), and which are more heat-resistant than clinical isolates from the 1970s (118); each of these traits on its own could be a critical factor helping to explain the higher prevalence of CDI.
FIGURE 13 | The probability of recovering C. difficile from foods increases toward the tropic. Linear correlation estimates, various meta-regression analyses controlling for confounders, contour plot simulation and Monte Carlo permutation test (n = 224, 1000 perm, joint P = 0.01) statistics revealed that latitude has been one of the most influential variables determining the magnitude and frequency by which C. difficile has been found in the human diet. (A) Moment-based estimate of between-food-set study variance and display of weighted correlation between prevalence and the absolute latitude. Without Knapp & Hartung modification to standard errors. P-values unadjusted and adjusted for multiple testing. Note that longitude is not a significant variable. (B) Plot of linear trends derived from fitting linear models to actual data segregated by type of food and continents/regions aligned over distinct longitude ranges. Notice that, except for one slope, published reports have documented an inverse latitudinal trend. (C) Contour line plot simulation of the weighted CD prevalence for all food items over the absolute latitude and real longitude plane (semi-transparent circles of different sizes; the larger the circle, the greater the influence on the overall simulation). (D) Contour density and line plot simulation to help visualize the low prevalence estimates (near zero = blue) and latitudinal trends. Circles represent the location of the research centers where the studies were conducted or the centroid for the region that was sampled. (E) Contour density simulation to illustrate that latitudinal trends (arrows) can cover different latitudinal ranges, depending on the region (e.g., short high arrow corresponds to Europe). In iterative simulations, note that such density latitudinal trends tend to cluster between two extreme arrangement patterns, but that the significance is independent of the region (see Supplementary Figures 6, 7 for further details and statistics).
In conclusion, it is reasonable to infer from our analysis that there is no single number that summarizes the complexity of CD in the human diet worldwide. Until the dynamics of CD over space and time are better defined, doctors could advise patients and communities at risk to cook their meals more thoroughly and give other simple suggestions, such as avoiding high-risk foods that are commonly consumed raw (e.g., fresh produce), until the patient's susceptibility to CDI decreases. From a clinical and prevention perspective, patients could benefit from knowing which foods are more likely to be contaminated with CD to determine how to adjust their diets during periods of increased susceptibility. Considering that ∼10% of the samples in this study were contaminated with CD, and that each sample (∼20 grams, over 1 overfilled tablespoon) represents only a fraction of an average meal size per person, it is possible that consumers are exposed to CD very frequently. If a person consumes 500 g of food per day, estimates could suggest that on average one tablespoonful of meal in every 13 (260 grams of meal) could be contaminated with CD, if not cooked properly. Basic recommendations emphasizing food safety practices updated to CD (using >85 °C for 10 min, or even better, boiling temperatures) (28,53,56,119,120) could prevent inadvertent exposure, especially if patients are affected by debilitating conditions that increase the risk for CD intestinal colonization and infection. Future publications should include in their design and reporting descriptors for climate, ambient temperature, season, and latitude.
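The tablespoon estimate above can be reproduced with a short arithmetic sketch in Python. The three input figures (about 20 g per tablespoon-sized sample, one contaminated tablespoon in every 13, and 500 g of food per day) are taken from the text; treating contamination as evenly and independently spread across uncooked tablespoon-sized portions is a simplifying assumption made here for illustration only.

```python
# Figures quoted in the text.
GRAMS_PER_TABLESPOON = 20          # approximate sample size
TABLESPOONS_PER_POSITIVE = 13      # "one tablespoon in every 13"
DAILY_INTAKE_G = 500               # food consumed per person per day

# Grams of food eaten, on average, per contaminated tablespoon.
grams_per_positive = GRAMS_PER_TABLESPOON * TABLESPOONS_PER_POSITIVE

# Prevalence implied by "one in every 13".
implied_prevalence = 1 / TABLESPOONS_PER_POSITIVE

# ASSUMPTION: contamination is uniform and independent across portions,
# and none of the food is cooked well enough to inactivate spores.
tablespoons_per_day = DAILY_INTAKE_G / GRAMS_PER_TABLESPOON
expected_positive_per_day = tablespoons_per_day * implied_prevalence

print(f"{grams_per_positive} g of food per contaminated tablespoon")
print(f"implied prevalence: {implied_prevalence:.1%}")
print(f"expected contaminated tablespoons per day: {expected_positive_per_day:.2f}")
```

Under these assumptions, a 500 g daily intake corresponds to 25 tablespoon-sized portions, or roughly two contaminated portions per day if none of the food were adequately cooked.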
Since CDIs are very common in individuals receiving antibiotics, which are widely used in health care centers and the community, a starting point for a potentially life-saving intervention could be the provision of necessary information on, for instance, (i) the risk of CDI associated with antibiotic consumption and prescriptions, (ii) the prevalence of CD in the food supply, and (iii) the need to objectively improve food safety and cooking practices to minimize the ingestion of CD spores, which are necessary to induce disease. Such combined information, delivered as a simple infographic, could be attached to prescribed drugs at the time of purchase and/or sale by registered pharmacists. Several other alternatives are also possible and could be assessed using ecological studies to quantify their impact on the epidemiology of CDI.

DATA AVAILABILITY STATEMENT
All data, which are easily inferable from the figures, tables, and Supplementary Materials, and easily extractable from the articles listed in the Supplementary References, will be made fully available by the corresponding authors upon reasonable request.

AUTHOR CONTRIBUTIONS
SI and AR-P designed the study. SI and AD developed the search algorithm. SI, JM, AR-P, and BS developed and pretested data extraction strategy and tool. JM, BS, KM, AR-P, and JM performed the literature search. KM and BS under the supervision of SI and AR extracted the data. AR and SI planned and performed all data analysis with support from NB. SI and AR-P wrote the manuscript. All authors approved the final manuscript.

ACKNOWLEDGMENTS
This study was conducted with discretionary funds allocated to the corresponding authors. SI is currently an Assistant Professor and Human Nutrition and Food Safety State Specialist at the Ohio State University. Ohio Agricultural Research and Development Center Hatch funds through Multistate Project S1077 supported this project. NB received support from the Fulbright as a senior scholar and visiting professor in mathematics at the University of North Carolina. AR-P is currently an Assistant Professor of Medicine at Case Western Reserve University, receives support from the National Institutes of Health via grants R21 and R01 subaward mechanism, and is the Director of the Germ Free and Gut Microbiome Core at the Digestive Diseases Research Institute. He is also the Technical Director of the Mouse Models Core at the NIH Silvio O. Conte Cleveland Digestive Diseases Research Core Center. Special thanks to Anna Rodriguez-Ilic and Daniel Rodriguez-Ilic for their artwork and encouraging suggestions. All art/images used in this manuscript are the result of the authors' work, or are in the public domain. This article is based on previously conducted studies, including publications authored by AR-P and SI, but does not contain unpublished studies performed by any of the authors.