- 1Riddet Institute, Massey University, Palmerston North, New Zealand
- 2High-Value Nutrition National Science Challenge, Auckland, New Zealand
- 3AgResearch, Palmerston North, New Zealand
- 4Department of Human Nutrition, University of Otago, Dunedin, New Zealand
- 5Department of Nutrition and Dietetics, The University of Auckland, Auckland, New Zealand
Traditional approaches for studying diet-colonic microbiota interactions are time-consuming, resource-intensive, and often hindered by technical and ethical concerns. Metagenome-scale community metabolic models show promise as complementary tools to overcome these limitations. However, their experimental validation is challenging, and their accuracy in predicting colonic microbial function under realistic dietary conditions remains unclear. This study assessed the accuracy of the Microbial Community model (MICOM) in predicting major short-chain fatty acid (SCFA) production by the colonic microbiota of weaning infants, using fecal samples as a proxy. Model predictions were compared with experimental SCFA production using in vitro fecal fermentation data at the genus level. The model exhibited overall poor accuracy, with only a weak, significant correlation between measured and predicted acetate production (r = 0.17, p = 0.03). However, agreement between predicted and measured SCFA production improved for samples primarily composed of plant-based foods: acetate exhibited a moderate positive correlation (r = 0.31, p = 0.005), and butyrate a trend toward a weak positive correlation (r = 0.21, p = 0.06). These findings suggest that the model is better suited for predicting the influence of complex carbohydrates on the colonic microbiota than for other dietary compounds. Our study demonstrates that, given current limitations, modeling approaches for diet-colonic microbiota interactions should complement rather than replace traditional experimental methods. Further refinement of computational models for microbial communities is essential to advance research on dietary compound-colonic microbiota interactions in weaning infants.
1 Introduction
The relationship between dietary compounds and colonic microbiota has garnered scientific interest due to its impact on host health (1, 2). From a health perspective, changes in colonic microbial function are more relevant than alterations in composition. Imbalances in microbial metabolite production may distinguish individuals with disease from healthy controls, despite individual variations in microbial taxonomy (3–6). Numerous metabolites play crucial roles in the bidirectional communication between the colonic microbiota and the host, such as neurotransmitters, polyamines, vitamins, bile acids, and organic acids (7). Among them, the short-chain fatty acids (SCFAs), acetate, propionate, and butyrate, offer numerous benefits to the host, including maintaining colonic barrier integrity, serving as an energy source for colonocytes, and exerting neuroprotective effects (8–11).
Most of our current understanding about diet-colonic microbiota interactions centers on the effects of complex carbohydrates on the adult microbiota. In contrast, the impact of other dietary compounds, such as fatty acids and polyphenols, remains underexplored, particularly in underrepresented populations such as infants and older adults (12). A better understanding of how dietary compounds affect colonic microbial function across diverse human populations is crucial. Clinical trials are standard approaches for evaluating this impact and assessing potential health outcomes from diet-microbiota interactions. However, they are time- and resource-consuming, and often challenged by technical and ethical concerns (13). Given these limitations, mathematical models show promise as complementary tools to reduce the cost and time of microbiota investigations (14, 15).
Various models have been proposed to investigate diet-colonic microbiota interactions, including kinetic-based (16–19), agent-based (20–22), and genome-scale metabolic models (GEMs) (23–26). GEMs use metabolic reconstructions, a mathematical representation of a microorganism’s metabolism, and flux balance analysis (FBA) to predict microbial metabolite production as fluxes (units of concentration per time) (27). Metagenome-scale community metabolic models (MGCMs) extend this concept to microbial communities (14, 15, 28). Among MGCMs, the Microbial Community model (MICOM) stands out for its user-friendly approach, extensive documentation, and pre-made workflows that range from data preparation to visualization (14, 29).
However, experimentally validating the predictions of MGCMs is challenging (30). A recent study compared measured SCFA fluxes from ex vivo fecal incubations with predicted fluxes obtained from MICOM under the influence of dietary fibers (31). An agreement between predictions and experimental measurements was reported for propionate and butyrate. Nevertheless, while this study partially validated in silico predictions for the influence of isolated dietary fibers, the model’s accuracy in predicting how whole foods shape colonic microbial function remains unexplored. Importantly, foods contain multiple dietary compounds that can interact during digestion and modulate their collective impact on colonic microbes (32, 33).
This study aimed to evaluate MICOM’s accuracy in predicting microbial SCFA production in real-life feeding scenarios for weaning infants. Predicted acetate, propionate, and butyrate fluxes were compared with fluxes measured experimentally from a published in vitro fecal fermentation study (34). The in silico simulations were designed to match the experimental setup as closely as possible, which examined how complementary foods, alone or combined with infant formula and/or other foods, affected major SCFA production by the colonic microbiota of weaning infants.
2 Methods
2.1 Experimentally measured fluxes of short-chain fatty acids
Measured acetate, propionate, and butyrate fluxes were estimated using data from a fecal fermentation study (34). The study evaluated the effects of complementary foods on the composition and SCFA production of the colonic microbiota in weaning infants, using fecal samples as a proxy. Food ingredients were tested individually and in combination with other foods, infant formula, or both, resulting in 53 samples. Samples were digested in vitro using a static model adapted from the INFOGEST protocol (35) to mimic the gastrointestinal conditions of 6-month-old infants, then fermented for 24 h at 37 °C using a pooled fecal inoculum from six healthy weaning infants (aged 5–11 months).
Organic acids were acidified with hydrochloric acid, extracted with diethyl ether, derivatized with N-tert-butyldimethylsilyl-N-methyltrifluoroacetamide, and detected using gas chromatography with a flame ionization detector [see methods description in (34)]. A solution of 2-ethyl butyric acid was used as an internal standard to account for the batch variations. Organic acids were quantified using standard solutions of acetate, propionate, and butyrate. The production of SCFAs was determined for each sample as the difference between measured SCFAs before and after fermentation, normalized by the dry weight of the fermented sample. Fluxes of acetate, propionate, and butyrate were calculated by dividing the concentration of each organic acid by the fermentation time (24 h).
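The flux estimate described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name, the argument names, the millimole units, and the numerical values are all assumptions introduced here.

```python
def scfa_flux(before_mmol, after_mmol, dry_weight_g, hours=24.0):
    """Net SCFA production per gram of dry sample per hour.

    Units (mmol, g, h) are assumed for illustration only.
    """
    if dry_weight_g <= 0 or hours <= 0:
        raise ValueError("dry weight and fermentation time must be positive")
    # Net production: SCFA after fermentation minus SCFA before fermentation,
    # normalized by the dry weight of the fermented sample.
    production = (after_mmol - before_mmol) / dry_weight_g
    # Dividing by the fermentation time (24 h in the text) gives a flux.
    return production / hours


# Hypothetical values, not taken from the study.
flux = scfa_flux(before_mmol=0.2, after_mmol=1.4, dry_weight_g=1.5)
```

Because the value is an average over the whole 24 h incubation, it does not distinguish early from late production.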
2.2 Software
Simulations were conducted in Python (version 3.9.10) within the Spyder integrated development environment (version 5.4.2) using MICOM (14) (version 0.37.0) and the CPLEX Optimization Studio solver (IBM ILOG, version 22.1), which was accessed under an academic license. Additionally, the Assembly of Gut Organisms through Reconstruction and Analysis version 2 (AGORA2) (36) metabolic reconstructions were employed to infer the metabolism of the infant fecal microbiota. Data and code are available at https://github.com/vgenisel/Assessing-the-accuracy-of-a-gut-microbiota-modelling-tool.
2.3 In silico media design
Media for the simulations were designed following the workflow of a previous study that used MICOM to assess the impact of complementary foods on the fecal microbiota of weaning infants (37). In short, the “Design a diet” function of the Virtual Metabolic Human database was used to select the foods composing the media (38). Foods were chosen to match the experimental samples as closely as possible (Supplementary Table 1). Additionally, the in silico media were designed to have the same dry mass of foods (150 g) to replicate the experimental conditions, which used 1.5 g of freeze-dried foods (a 100-fold scalar increase was necessary to mitigate numerical instability in the simulations). The same approach was used to design media composed of food combinations, including foods with other foods, infant formula, or both, keeping the same ratios used in vitro: 50% food1 with 50% food2; 20% food with 80% infant formula; and 10% food1 with 10% food2 and 80% infant formula.
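The combination ratios above translate into component dry masses as in the following sketch. The dictionary keys and the helper function are hypothetical names introduced for illustration; only the ratios and the 150 g total come from the text.

```python
TOTAL_DRY_MASS_G = 150.0  # total dry mass of each in silico medium (from the text)

# Mixing ratios stated in the text; the labels are illustrative placeholders.
RATIOS = {
    "food_food": {"food1": 0.5, "food2": 0.5},
    "food_formula": {"food": 0.2, "formula": 0.8},
    "food_food_formula": {"food1": 0.1, "food2": 0.1, "formula": 0.8},
}


def component_masses(ratio, total_g=TOTAL_DRY_MASS_G):
    """Split a total dry mass across components according to their shares."""
    if abs(sum(ratio.values()) - 1.0) > 1e-9:
        raise ValueError("component shares must sum to 1")
    return {name: share * total_g for name, share in ratio.items()}


masses = {label: component_masses(ratio) for label, ratio in RATIOS.items()}
```

Under these assumptions, a food-formula medium contains 30 g of food and 120 g of infant formula on a dry-mass basis.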
Imported data were processed through the MICOM workflow for media design (31, 37), which added host-secreted compounds (mucin cores and bile acids), diluted compounds that are absorbed in the small intestine, and supplemented the media with minimal missing nutrients to ensure a community growth rate of 0.3/h. Finally, media compounds were diluted by a factor of 10 to match experimental conditions, where approximately 10% of the volume of post-dialysis digested food samples was fermented with fecal inoculum (34). As MICOM’s workflow does not directly account for digestion, the composition of the in silico media was assumed to reflect the chemical profile of the experimentally digested food samples.
2.4 In silico simulations
The relative abundance of the microbial community used in the simulations reflects baseline values from the in vitro fecal fermentation study (34) (Supplementary Table 2). Simulations were performed at the genus level due to the use of 16S rRNA gene sequencing in the experimental work. Briefly, the V3-V4 regions of the 16S rRNA gene were amplified, and raw paired-end sequencing data were generated on an Illumina MiSeq platform. Primers were removed using Cutadapt (version 2.3) (39) and Trimmomatic (40). The DADA2 (version 1.32) (41) pipeline was followed for denoising, read truncation, chimera removal, and inferring amplicon sequence variants. The SILVA database (version 138.1) (42) was used for taxonomy assignment, and amplicon sequence variants were collapsed at the genus level using the microbiome (version 1.26) (43) package. Only genera with at least 0.001 relative abundance were included in the simulations to reduce numerical instability and processing time. A total of 31 genera (out of 54) were included, representing 99.3% of the relative abundance of the microbial community. Pan models of the AGORA2 metabolic reconstructions (36) for these genera were built by pooling strain-level metabolic reconstructions into higher taxonomic ranks.
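The abundance cut-off can be illustrated with a short sketch. The genus names and abundances below are hypothetical and serve only to show the filtering step; the function name is also an assumption. The sketch does not renormalize the retained abundances, since the text does not state whether that was done.

```python
def filter_genera(abundance, threshold=0.001):
    """Keep genera at or above the relative-abundance threshold.

    Returns the retained genera and the fraction of the community they cover.
    """
    kept = {genus: a for genus, a in abundance.items() if a >= threshold}
    return kept, sum(kept.values())


# Hypothetical genus-level profile (relative abundances summing to 1).
profile = {
    "Bifidobacterium": 0.55,
    "Bacteroides": 0.30,
    "Escherichia-Shigella": 0.1495,
    "Veillonella": 0.0004,
    "Collinsella": 0.0001,
}

kept, coverage = filter_genera(profile)
```

In the study, the same kind of cut-off retained 31 of 54 genera, covering 99.3% of the community.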
Simulations followed published protocols (31, 37). MICOM is based on FBA under a mass steady-state assumption (14), representing the exponential phase of microbial growth, during which growth rates remain constant. Fluxes of microbial metabolites are calculated as the solution to a constrained linear programming problem, integrating the biochemical reactions performed by the microbial community, assuming no accumulation of substrates in the system, to maximize microbial community biomass. Notably, MICOM incorporates a trade-off between maximal community growth and maximal individual microbial growth (14). This strategy prevents the most abundant microbes from growing at the expense of low-abundance ones. The optimal cooperative trade-off was determined for each in silico medium (values ranged from 0.4 to 0.7) using MICOM’s “tradeoff” function. Additionally, MICOM employs a linearization strategy that relates the growth rates of individual taxa to their relative abundance (14). Consequently, the microbial community was expected to exhibit consistent growth patterns across different in silico media, with high-abundance genera predicted to have higher growth rates than those of lower abundance. Importantly, microbial relative abundance was used as a proxy for microbial biomass in the simulations, and fluxes of microbial metabolites were normalized by the dry weight of microbial biomass (expressed in millimoles per gram dry weight per hour [mmol/(gDW·h)]).
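For orientation, the optimization problems referred to above can be written in generic notation. The symbols are introduced here for illustration and are not taken from the study: S is the stoichiometric matrix, v the flux vector with lower and upper bounds, c the objective (biomass) vector, a_i and μ_i the relative abundance and growth rate of taxon i, μ_c the community growth rate with maximum μ_c^*, and α the trade-off fraction (0.4 to 0.7 here). The second problem is a sketch of the cooperative trade-off as we understand it from the MICOM publication (14).

```latex
% Flux balance analysis under the steady-state assumption
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S\,v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}

% Community growth rate as an abundance-weighted sum of taxon growth rates
\mu_c = \sum_i a_i\,\mu_i

% Cooperative trade-off: spread growth across taxa while keeping
% at least a fraction \alpha of the maximal community growth rate
\begin{aligned}
\min_{v}\quad & \sum_i \mu_i^{2} \\
\text{s.t.}\quad & \mu_c \ge \alpha\,\mu_c^{*}, \\
& S\,v = 0, \quad v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}
```

The constraint S v = 0 encodes the assumption that no metabolite accumulates in the system, which is the point of tension with static batch fermentation raised in the Discussion.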
2.5 Statistical analyses
To account for dissimilarities between the design of in silico and in vitro studies, standard scores (z-scores) were calculated for measured and predicted fluxes (Supplementary Table 3). The z-score describes the number of standard deviations a value differs from the mean. This strategy enables a comparison of results across studies with different designs (31). Pearson correlation coefficients (r) and two-tailed p-values (p) between measured and predicted z-scores for acetate, propionate, and butyrate production were calculated in Python (version 3.10.9) using pandas (version 2.2.3) (44) and SciPy (version 1.10.0) (45). Plotnine (version 0.14.5) was used to plot the correlations (46). To further assess the agreement between predicted and measured z-scores, the Bland–Altman analysis (47) was performed, and the 95% limits of agreement (mean difference ± 1.96 standard deviations) were calculated using the packages NumPy (version 1.23.5) (48) and matplotlib (version 3.10.0) (49). The normal distribution of the data was verified through the Shapiro–Wilk test using SciPy (version 1.10.0) (45). Heatmaps and radar charts were generated using matplotlib (version 3.10.0) (49), seaborn (version 0.13.2) (50), and SciPy (version 1.10.0) (45).
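The comparison described above can be sketched with the Python standard library alone. This is an illustration under stated assumptions rather than the study’s code, which used pandas, SciPy, and NumPy: the input values are hypothetical, the sample standard deviation (n − 1 denominator) is assumed throughout, and p-values are omitted because they would require SciPy.

```python
import math
import statistics


def z_scores(values):
    """Standardize values to mean 0 and sample SD 1 (n - 1 denominator assumed)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def limits_of_agreement(x, y):
    """Bland-Altman mean difference and 95% limits (mean +/- 1.96 SD)."""
    diffs = [a - b for a, b in zip(x, y)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


# Hypothetical fluxes in different units and magnitudes, for illustration only.
predicted = [0.8, 1.1, 0.9, 1.6, 1.3]
measured = [12.0, 15.5, 11.0, 14.0, 18.0]

zp, zm = z_scores(predicted), z_scores(measured)
r = pearson_r(zp, zm)
mean_diff, lower, upper = limits_of_agreement(zp, zm)
```

Standardizing does not change the Pearson coefficient, which is invariant to shifting and scaling; its purpose is to put both methods on a common scale for the Bland–Altman analysis.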
3 Results
3.1 Pearson correlations between predicted and measured SCFA production
Correlation analyses demonstrated a weak agreement between predicted and measured acetate production (r = 0.17, p = 0.03). However, this was the only significant correlation (p < 0.05) observed when considering the entire dataset (Supplementary Figure 1). To investigate whether combining food ingredients with other dietary compounds would impact the accuracy of the model, subsequent analyses clustered samples into the following categories: food ingredients alone, foods combined with infant formula (food-formula combinations), foods combined with other foods (food-food combinations), and foods combined with both (food-food-formula combinations). A trend toward a weak correlation was observed between predicted and measured butyrate production for food ingredients alone (r = 0.28, p = 0.07). Additionally, a moderate correlation was observed between model predictions and experimental productions of acetate and total SCFAs for food-food combinations (r = 0.43 and 0.41, with p = 0.006 and 0.01, respectively), while propionate exhibited a trend toward a weak negative correlation (r = −0.30, p = 0.06; Supplementary Figure 2). In contrast, no significant correlations were observed between predicted and measured z-scores for food-formula and food-food-formula combinations (Supplementary Figure 2).
Given that combining infant formula with foods reduced the model’s accuracy, an analysis was performed using only food ingredients and food-food combinations, corresponding to samples predominantly composed of plant-based foods. For this subset of samples, acetate exhibited a moderate agreement between predicted and measured z-scores (r = 0.31, p = 0.005), butyrate demonstrated a trend toward a weak positive correlation (r = 0.21, p = 0.06), and propionate showed a trend toward a weak negative correlation (r = −0.19, p = 0.08; Figure 1). Similar results were observed when animal-based food samples were completely excluded from the dataset (Supplementary Figure 3).

Figure 1. Pearson correlations between measured and predicted z-scores of major SCFAs for food ingredients and food-food combinations. SCFAs are displayed from left to right: acetate, butyrate, and propionate. Total SCFAs correspond to the sum of acetate, propionate, and butyrate. Pearson correlation coefficients (r) and two-tailed p-values are calculated for each plot individually. A regression line is shown in black, with the corresponding 95% confidence interval in gray.
3.2 Bland–Altman analysis
The normality of the dataset, comprising food ingredients and food-food combinations, was verified by the Shapiro–Wilk test. While the differences between predicted and measured z-scores for individual SCFAs followed a normal distribution (p-value > 0.05), their sum did not (p = 0.01; Supplementary Table 4). Consequently, Bland–Altman plots were generated only for acetate, propionate, and butyrate individually (Figure 2). Since fluxes were standardized into z-scores, the mean difference between z-scores was zero in all plots. Among the SCFAs, acetate exhibited the narrowest limits of agreement, indicating better concordance between predicted and measured z-scores. Overall, most samples fell within the 95% limits of agreement for the major SCFAs. However, some samples exceeded these limits: 5 out of 81 for acetate, 2 out of 81 for propionate, and 7 out of 81 for butyrate. This suggests that the model had limitations in accurately predicting experimental outcomes across the entire dataset.

Figure 2. Bland–Altman plots comparing predicted and measured z-scores of major SCFAs for food ingredients and food-food combinations. SCFAs are displayed from left to right: acetate, propionate, and butyrate. The red line represents the mean difference between z-scores, while the blue lines indicate the upper and lower 95% limits of agreement (mean difference ± 1.96 standard deviation). Each sample is depicted as a green dot.
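The zero mean difference reported in this section follows directly from standardization, and the width of the limits of agreement is tied to the Pearson coefficient. The short derivation below is a sketch assuming that both sets of z-scores are standardized over the same n samples shown in the plot and that the same standard deviation convention is used throughout; the notation is introduced here for illustration.

```latex
% Mean difference between predicted (p) and measured (m) z-scores
\bar{d} = \frac{1}{n}\sum_{i=1}^{n}\left(z^{p}_{i} - z^{m}_{i}\right)
        = \bar{z}^{p} - \bar{z}^{m} = 0 - 0 = 0

% Variance of the differences, using Var(z^p) = Var(z^m) = 1 and Cov(z^p, z^m) = r
s_{d}^{2} = \mathrm{Var}(z^{p}) + \mathrm{Var}(z^{m}) - 2\,\mathrm{Cov}(z^{p}, z^{m})
          = 2\,(1 - r)

% 95% limits of agreement
\bar{d} \pm 1.96\, s_{d} = \pm\, 1.96\,\sqrt{2\,(1 - r)}
```

Under these assumptions, a higher correlation necessarily yields narrower limits, so the Bland–Altman limits and the Pearson coefficients convey closely related information for standardized data.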
3.3 Comparison between predicted and measured z-scores
To better visualize the comparison between in silico and in vitro outcomes, a heatmap (Figure 3) and a radar chart (Supplementary Figure 4) were plotted. Only food ingredients and food-food combinations were included, as they demonstrated better agreement between predicted and measured values. Results showed several disagreements between in silico predictions and experimentally measured z-scores for these samples. For instance, black beans combined with blackcurrants had the greatest experimental production of butyrate among food-food combinations. In contrast, the model predicted the greatest butyrate production for pork combined with blackcurrants or couscous. For food ingredients, notable discrepancies were observed in the z-scores for chickpeas, couscous, soybeans, and kūmara samples, with only the butyrate prediction matching the experimental outcomes. Furthermore, the model did not accurately predict the ability of strawberries to increase propionate and total SCFA production in vitro. On the other hand, the model satisfactorily predicted the greatest production of butyrate for kūmara with skin among the food ingredients and the greatest production of acetate and total SCFAs for blackcurrants combined with strawberries among the food-food combinations. Additionally, the model captured the impact of blackcurrants and raspberries in increasing the total SCFA production compared to other foods.

Figure 3. Heatmap of measured and predicted z-scores of major SCFAs for food ingredients and food-food combinations. Predicted z-scores obtained via in silico modeling are displayed on the left and measured z-scores obtained via in vitro fermentation in the middle. The absolute difference between predicted and measured z-scores is displayed on the right (distance between z-scores). Cells are colored by intensity, with the lowest values in red and the highest values in blue.
3.4 Sensitivity analyses
To assess whether prediction quality depends on input microbial composition, additional simulations were conducted using post-fermentation relative abundances for each sample. Consistent with our initial results, the model exhibited poor overall agreement with experimental outcomes, with no significant correlations observed between predicted and measured z-scores for all samples. However, acetate production showed a moderate positive correlation (r = 0.42, p = 0.008) and total SCFA production trended positively (r = 0.28, p = 0.08) when analyses were restricted to food-formula combinations (Supplementary Figure 5). Similar to the initial results, food-formula combinations involving protein-rich foods, such as pork-formula and prawn-formula, showed the greatest discrepancies between predicted and measured outcomes (Supplementary Figures 6, 7).
4 Discussion
This study compared the predicted production of acetate, propionate, and butyrate in silico using MICOM (14) with experimental values obtained from a fecal fermentation study (34). Our focus on assessing the model’s accuracy in predicting the production of health-relevant metabolites, rather than microbial growth, was driven by the lack of available data on in vitro individual microbial growth rates. Additionally, from a health perspective, microbial function is more informative than composition, as colonic microbes are functionally redundant, and the colonic microbiota of healthy individuals maintains similar functionality despite taxonomic differences (3, 6). The workflows for in silico simulations and statistical comparison between in silico and in vitro outcomes followed published protocols (31, 37). Another strength was the assessment of the model’s accuracy in predicting SCFA production by the colonic microbiota of weaning infants under realistic infant feeding patterns, including whole foods, foods combined with other foods, infant formula, or both.
When evaluating all samples, the model demonstrated poor accuracy, with only acetate showing a weak correlation between predicted and measured outcomes. Interestingly, we observed increased accuracy when analyzing samples predominantly composed of plant-based foods. Acetate exhibited a moderate positive correlation, while butyrate and propionate showed trends toward weak positive and weak negative correlations, respectively. These findings suggest that MICOM’s accuracy in predicting the function of the infant microbiota increases for media enriched in complex carbohydrates. This result is expected, considering that the influence of complex carbohydrates on colonic microbes is far better understood than that of other dietary compounds. Consequently, metabolic reconstructions of colonic microbes are often not curated for biochemical reactions involving amino acid and lipid utilization due to limited available data (51). Additionally, this result is likely driven by the functional capacity of the colonic microbiota of weaning infants. Observational studies have demonstrated that the weaning infant microbiota transitions from primarily degrading human milk oligosaccharides to metabolizing complex carbohydrates, while amino acid fermentation increases but remains less prominent compared to that of the adult colonic microbiota (52, 53).
Given that correlation analyses assess the strength of the relationship between two variables but do not indicate how these variables differ from each other, a Bland–Altman analysis was conducted to determine the limits of agreement (95% confidence interval) between predicted and measured SCFA production. Standard scores were used to account for differences in magnitude and unit between methods (54). Acetate had narrower limits of agreement than propionate and butyrate, suggesting stronger concordance between predicted and measured z-scores, which aligns with the correlation results. Overall, most food and food-food combination samples fell within the 95% limits of agreement. However, some exceptions highlighted the model’s limited ability to accurately predict experimental SCFA production across the entire dataset.
Among food ingredients and food-food combinations, disagreements were observed between in vitro and in silico outcomes. For example, strawberries and the combination black beans-blackcurrants had positive z-scores for total SCFA production in vitro but negative z-scores in silico. On the other hand, some samples that drove the greatest total SCFA production in vitro, such as blackcurrants, raspberries, and the blackcurrants-strawberries combination, also resulted in high positive z-scores in silico. Sensitivity analyses assessing the impact of microbial composition on prediction accuracy indicated that community composition strongly influences simulation outcomes. However, the model consistently showed limited predictive performance across SCFAs and samples, with acetate showing better agreement in carbohydrate-rich samples. As a take-home message, our study demonstrates that emerging modeling approaches for diet-colonic microbiota interactions are imperfect and should not replace experimental methods. Instead, given their cost- and time-efficiency, and ability to leverage existing data, these models offer a valuable starting point for generating insights that can guide the design of in vitro and in vivo studies. Their further refinement and use as complementary tools represent a promising opportunity to advance diet-colonic microbiota research.
Future directions for the development of MGCMs include expanding the number of high-quality microbial metabolic reconstructions, which should be built using sequencing data that meet quality standards and subsequently curated with experimental data (55–57). Shotgun metagenomic sequencing data is preferable to 16S rRNA sequencing data, as it more accurately captures the metabolic potential of colonic microbes (58). Importantly, experimental data remain crucial for model curation and validation (59), with a notable need for more research into the behavior of colonic microorganisms in response to various dietary compounds. The accuracy of MGCMs could also benefit from incorporating omics data, such as metatranscriptomics, to better personalize the model’s conditions (60). Additionally, integrating dynamic FBA could improve the representation of changes in the colonic environment over time (24, 61). Finally, databases used for designing media in in silico simulations require further refinement to better account for the heterogeneity of dietary patterns across individuals.
Our results partially contrast with a previous study that reported agreement between MICOM’s predictions for propionate and butyrate and experimental outcomes from ex vivo fecal incubations with isolated dietary fibers using adult inoculum (31). The modest performance observed here is likely due to the different study designs. The fecal microbiota of weaning infants has distinct functionality compared to the adult microbiota, including a higher proportional production of acetate and a lower proportional production of propionate and butyrate (62, 63). Furthermore, whole foods are complex matrices containing not only fiber but also other compounds, such as protein, fat, and phytochemicals, all of which impact the function of colonic microbes (64–66). Finally, while the referenced study used metagenomic data to build models at the species level and normalized SCFA predictions based on microbial biomass (31), our simulations were limited to the genus rank and did not incorporate biomass normalization due to using 16S rRNA sequencing data, thereby reducing metabolic specificity. This contrasting result highlights the need for further investigations of the model’s predictive accuracy across different study conditions and host populations, particularly concerning the production of health-relevant metabolites like butyrate (8, 9).
A major limitation of our study is the difference in outcomes obtained from distinct methods, as fluxes predicted in silico are not equivalent to the concentration of metabolites measured in vitro. To address this limitation, we followed a published protocol that used standard scores to compare results from different methodologies and validate MICOM’s predictions of propionate and butyrate production (31). However, quantifying SCFA fluxes using static in vitro fermentation is a limited representation, as metabolites accumulate and substrates are depleted over time (67). Over long fermentation periods, these conditions diverge from the dynamic conditions and the steady-state assumption inherent in FBA-based models, which assumes no accumulation of substrates in the system (27).
Additionally, the accuracy of models based on FBA strongly depends on the quality of the metabolic reconstructions used in the simulations. A recent systematic evaluation of FBA-based tools, including MICOM, reported low prediction accuracy of microbial growth rates compared to experimental data when using the AGORA metabolic reconstructions (68). The authors highlighted that semi-curated metabolic reconstructions were not sufficiently accurate for predicting the behavior of microorganisms (68). Our simulations were performed using the second version of AGORA, which was generated via a semi-automated pipeline and manually refined (36). However, it is important to recognize that these reconstructions may contain gaps or inaccurately assigned biochemical reactions (51).
Another limitation was the use of pooled fecal data, justified by the absence of individual-level data in the in vitro fermentation study. Although pooling fecal samples is a common practice in such studies, it inevitably alters the original microbial community structure, reducing inter-individual variability and functional resolution. These changes are likely to influence microbial functional dynamics in ways that are not captured by the taxonomic profile of the pooled sample (69, 70). Using amplicon 16S sequencing data limited model construction to the genus level, potentially masking metabolic differences between microbial species (31, 71). Additionally, this approach reduced the number of taxa included in the simulations and hindered the normalization of SCFA predictions by bacterial biomass. Notably, bacterial species that are present in low abundance yet biologically meaningful (keystone species) may be underrepresented when using 16S rRNA sequencing data (72, 73). Moreover, relying on microbial relative abundances as a proxy for absolute quantities, which can be biased by interdependencies between taxa and may not accurately reflect the true composition of the microbial community (74), can influence the model predictions. To overcome this limitation, simulations should be based on absolute microbial biomass, estimated via metagenomics or sequencing combined with qPCR (75, 76).
Furthermore, the in silico media were designed using the Virtual Metabolic Human database (38), which lacks information on diverse cooking methods and food ingredients. Notably, indigenous foods available in New Zealand and used in the experimental work (34), such as kūmara (sweet potato variety), were not included in the database and had to be substituted with the most similar available food. Finally, while the in vitro study used a static protocol to mimic the digestion and absorption of dietary compounds (34), MICOM, like other diet-microbiota models, currently does not account for digestion. Instead, it represents absorption by diluting the dietary compounds identified to be absorbed by the human intestine using a scalar (14). Not accounting for digestion is a potential major source of variation between in silico and in vitro outcomes, as digestion influences food structure and, consequently, the nutrient profile accessible for microbial fermentation (77). However, this limitation is intrinsic to the modeling framework, as computationally simulating food digestion is challenging. Although promising tools are emerging (78), they have not yet been integrated into MGCMs.
5 Conclusion
This study evaluated the accuracy of the metagenome-scale community metabolic model MICOM in predicting SCFA production by the fecal microbiota of weaning infants under realistic complementary infant feeding patterns, using static in vitro fecal fermentation data as a comparator. A weak positive correlation was observed between predicted and measured acetate production. The agreement between predicted and measured SCFA fluxes improved when analyzing samples predominantly composed of plant-based foods. These findings suggest that the model more accurately replicates experimental results when simulating media rich in complex carbohydrates. Despite disagreements between experimental and simulated outcomes for specific SCFAs, the model identified samples with the highest total SCFA production in vitro. This exemplifies the model’s limitations as a replacement for traditional experimental methods but supports its potential as a complementary tool. Further model development is essential to improve its accuracy, particularly for media rich in fat and protein. Refined versions of the model would contribute to advancing research on the relationship between diet and microbiota.
Data availability statement
The original contributions presented in the study are publicly available. These data can be found here: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1327581 [SRA database BioProject Accession number PRJNA1327581].
Ethics statement
The studies involving humans were approved by Massey University Human Ethics Committee Southern A (Application 22/48). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants’ legal guardians/next of kin.
Author contributions
VS: Conceptualization, Methodology, Data curation, Writing – original draft, Investigation, Visualization, Formal analysis. NS: Conceptualization, Writing – review & editing, Supervision. JM: Writing – review & editing, Supervision, Formal analysis. NR: Supervision, Writing – review & editing, Funding acquisition. CW: Supervision, Writing – review & editing, Funding acquisition. WM: Project administration, Writing – review & editing, Supervision.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This study was funded by the New Zealand Ministry for Business, Innovation and Employment (MBIE, grant no. 3710040, contract number UOAX1902) through the High-Value Nutrition National Science Challenge. VS was supported by a Riddet Institute PhD Scholarship. The PhD stipend and project were funded by the High-Value Nutrition National Science Challenge (contract number UOAX1902).
Acknowledgments
Thanks to the MICOM community for providing insights into the comparison between in silico predicted and experimentally measured results.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnut.2025.1623418/full#supplementary-material
References
1. Rinninella, E, Tohumcu, E, Raoul, P, Fiorani, M, Cintoni, M, Mele, MC, et al. The role of diet in shaping human gut microbiota. Best Pract Res Clin Gastroenterol. (2023) 62-63:101828. doi: 10.1016/j.bpg.2023.101828
2. Duncanson, K, Williams, G, Hoedt, EC, Collins, CE, Keely, S, and Talley, NJ. Diet-microbiota associations in gastrointestinal research: a systematic review. Gut Microbes. (2024) 16:2350785. doi: 10.1080/19490976.2024.2350785
3. Huttenhower, C, Gevers, D, Knight, R, Abubucker, S, Badger, JH, Chinwalla, AT, et al. Structure, function and diversity of the healthy human microbiome. Nature. (2012) 486:207–14. doi: 10.1038/nature11234
4. Bellocchi, C, Fernández-Ochoa, Á, Montanelli, G, Vigone, B, Santaniello, A, Milani, C, et al. Microbial and metabolic multi-omic correlations in systemic sclerosis patients. Ann N Y Acad Sci. (2018) 1421:97–109. doi: 10.1111/nyas.13736
5. Chen, S-J, Chen, C-C, Liao, H-Y, Lin, Y-T, Wu, Y-W, Liou, J-M, et al. Association of Fecal and Plasma Levels of short-chain fatty acids with gut microbiota and clinical severity in patients with Parkinson disease. Neurology. (2022) 98:e848–58. doi: 10.1212/WNL.0000000000013225
6. Franzosa, EA, Sirota-Madi, A, Avila-Pacheco, J, Fornelos, N, Haiser, HJ, Reinker, S, et al. Gut microbiome structure and metabolic activity in inflammatory bowel disease. Nat Microbiol. (2019) 4:293–305. doi: 10.1038/s41564-018-0306-4
7. Zhang, Y, Chen, R, Zhang, D, Qi, S, and Liu, Y. Metabolite interactions between host and microbiota during health and disease: which feeds the other? Biomed Pharmacother. (2023) 160:114295. doi: 10.1016/j.biopha.2023.114295
8. Sanna, S, van Zuydam, NR, Mahajan, A, Kurilshikov, A, Vich Vila, A, Võsa, U, et al. Causal relationships among the gut microbiome, short-chain fatty acids and metabolic diseases. Nat Genet. (2019) 51:600–5. doi: 10.1038/s41588-019-0350-x
9. Wang, H-B, Wang, P-Y, Wang, X, Wan, Y-L, and Liu, Y-C. Butyrate enhances intestinal epithelial barrier function via up-regulation of tight junction protein Claudin-1 transcription. Dig Dis Sci. (2012) 57:3126–35. doi: 10.1007/s10620-012-2259-4
10. Soret, R, Chevalier, J, Coppet, PD, Poupeau, G, Derkinderen, P, Segain, JP, et al. Short-chain fatty acids regulate the enteric neurons and control gastrointestinal motility in rats. Gastroenterology. (2010) 138:1772–82. doi: 10.1053/j.gastro.2010.01.053
11. Grüter, T, Mohamad, N, Rilke, N, Blusch, A, Sgodzai, M, Demir, S, et al. Propionate exerts neuroprotective and neuroregenerative effects in the peripheral nervous system. Proc Natl Acad Sci USA. (2023) 120:e2216941120. doi: 10.1073/pnas.2216941120
12. Geniselli da Silva, V, Roy, NC, Smith, NW, Wall, C, Mullaney, JA, and McNabb, WC. Dietary patterns influencing the human colonic microbiota from infancy to centenarian age: a narrative review. Front Nutr. (2025) 12:1591341. doi: 10.3389/fnut.2025.1591341
13. Kostis, JB, and Dobrzynski, JM. Limitations of randomized clinical trials. Am J Cardiol. (2020) 129:109–15. doi: 10.1016/j.amjcard.2020.05.011
14. Diener, C, Gibbons, SM, and Resendis-Antonio, O. MICOM: metagenome-scale modeling to infer metabolic interactions in the gut microbiota. mSystems. (2020) 5:e00606-19. doi: 10.1128/mSystems.00606-19
15. Baldini, F, Heinken, A, Heirendt, L, Magnusdottir, S, Fleming, RMT, and Thiele, I. The microbiome modeling toolbox: from microbial interactions to personalized microbial communities. Bioinformatics. (2019) 35:2332–4. doi: 10.1093/bioinformatics/bty941
16. Kettle, H, Holtrop, G, Louis, P, and Flint, HJ. microPop: modelling microbial populations and communities in R. Methods Ecol Evol. (2018) 9:399–409. doi: 10.1111/2041-210X.12873
17. Kettle, H, Louis, P, and Flint, HJ. Process-based modelling of microbial community dynamics in the human colon. J R Soc Interface. (2022) 19:20220489. doi: 10.1098/rsif.2022.0489
18. Muñoz-Tamayo, R, Laroche, B, Walter, É, Doré, J, and Leclerc, M. Mathematical modelling of carbohydrate degradation by human colonic microbiota. J Theor Biol. (2010) 266:189–201. doi: 10.1016/j.jtbi.2010.05.040
19. Muñoz-Tamayo, R, Laroche, B, Walter, É, Doré, J, Duncan, SH, Flint, HJ, et al. Kinetic modelling of lactate utilization and butyrate production by key human colonic bacterial species. FEMS Microbiol Ecol. (2011) 76:615–24. doi: 10.1111/j.1574-6941.2011.01085.x
20. Bauer, E, Zimmermann, J, Baldini, F, Thiele, I, and Kaleta, C. BacArena: individual-based metabolic modeling of heterogeneous microbes in complex communities. PLoS Comput Biol. (2017) 13:e1005544. doi: 10.1371/journal.pcbi.1005544
21. Valiei, A, Dickson, A, Aminian-Dehkordi, J, and Mofrad, MRK. Metabolic interactions shape emergent biofilm structures in a conceptual model of gut mucosal bacterial communities. Npj Biofilms Microbiomes. (2024) 10:99–13. doi: 10.1038/s41522-024-00572-y
22. Aminian-Dehkordi, J, Dickson, A, Valiei, A, and Mofrad, MRK. Metabiome: a multiscale model integrating agent-based and metabolic networks to reveal spatial regulation in gut mucosal microbial communities. mSystems. (2025) 10:e01652-24. doi: 10.1128/msystems.01652-24
23. Shoaie, S, Ghaffari, P, Kovatcheva-Datchary, P, Mardinoglu, A, Sen, P, Pujos-Guillot, E, et al. Quantifying diet-induced metabolic changes of the human gut microbiome. Cell Metab. (2015) 22:320–31. doi: 10.1016/j.cmet.2015.07.001
24. Dukovski, I, Bajić, D, Chacón, JM, Quintin, M, Vila, JCC, Sulheim, S, et al. A metabolic modeling platform for the computation of microbial ecosystems in time and space (COMETS). Nat Protoc. (2021) 16:5030–82. doi: 10.1038/s41596-021-00593-3
25. Heirendt, L, Arreckx, S, Pfau, T, Mendoza, SN, Richelle, A, Heinken, A, et al. Creation and analysis of biochemical constraint-based models: the COBRA toolbox v3.0. Nat Protoc. (2019) 14:639–702. doi: 10.1038/s41596-018-0098-2
26. Chan, SHJ, Simons, MN, and Maranas, CD. SteadyCom: predicting microbial abundances while ensuring community stability. PLoS Comput Biol. (2017) 13:e1005539. doi: 10.1371/journal.pcbi.1005539
27. Orth, JD, Thiele, I, and Palsson, BØ. What is flux balance analysis? Nat Biotechnol. (2010) 28:245–8. doi: 10.1038/nbt.1614
28. Quinn-Bohmann, N, Carr, AV, Diener, C, and Gibbons, SM. Moving from genome-scale to community-scale metabolic models for the human gut microbiome. Nat Microbiol. (2025) 10:1055–66. doi: 10.1038/s41564-025-01972-2
29. Scott, WT Jr, Benito-Vaquerizo, S, Zimmermann, J, Bajić, D, Heinken, A, Suarez-Diez, M, et al. A structured evaluation of genome-scale constraint-based modeling tools for microbial consortia. PLoS Comput Biol. (2023) 19:e1011363. doi: 10.1371/journal.pcbi.1011363
30. Diener, C, and Gibbons, SM. More is different: metabolic modeling of diverse microbial communities. mSystems. (2023) 8:e01270-22. doi: 10.1128/msystems.01270-22
31. Quinn-Bohmann, N, Wilmanski, T, Sarmiento, KR, Levy, L, Lampe, JW, Gurry, T, et al. Microbial community-scale metabolic modelling predicts personalized short-chain fatty acid production profiles in the human gut. Nat Microbiol. (2024) 9:1700–12. doi: 10.1038/s41564-024-01728-4
32. Zheng, Y, Qin, C, Wen, M, Zhang, L, and Wang, W. The effects of food nutrients and bioactive compounds on the gut microbiota: a comprehensive review. Foods. (2024) 13:1345. doi: 10.3390/foods13091345
33. Capuano, E, and Janssen, AEM. Food matrix and macronutrient digestion. Annu Rev Food Sci Technol. (2021) 12:193–212. doi: 10.1146/annurev-food-032519-051646
34. da Silva, VG, Mullaney, JA, Roy, NC, and Smith, NW. Complementary foods in infants: an in vitro study of the faecal microbial composition and organic acid production. Food Funct. (2025) 16:3465–6481. doi: 10.1039/D5FO00414D
35. Brodkorb, A, Egger, L, Alminger, M, Alvito, P, Assunção, R, Ballance, S, et al. INFOGEST static in vitro simulation of gastrointestinal food digestion. Nat Protoc. (2019) 14:991–1014. doi: 10.1038/s41596-018-0119-1
36. Heinken, A, Hertel, J, Acharya, G, Ravcheev, DA, Nyga, M, Okpala, OE, et al. Genome-scale metabolic reconstruction of 7,302 human microorganisms for personalized medicine. Nat Biotechnol. (2023) 41:1320–31. doi: 10.1038/s41587-022-01628-0
37. da Silva, VG, Smith, NW, Mullaney, JA, Wall, C, Roy, NC, and McNabb, WC. Food-breastmilk combinations alter the colonic microbiome of weaning infants: an in silico study. mSystems. (2024) 9:e00577-24. doi: 10.1128/msystems.00577-24
38. Noronha, A, Modamio, J, Jarosz, Y, Guerard, E, Sompairac, N, Preciat, G, et al. The virtual metabolic human database: integrating human and gut microbiome metabolism with nutrition and disease. Nucleic Acids Res. (2019) 47:D614–24. doi: 10.1093/nar/gky992
39. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. (2011) 17:10. doi: 10.14806/ej.17.1.200
40. Bolger, AM, Lohse, M, and Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. (2014) 30:2114–20. doi: 10.1093/bioinformatics/btu170
41. Callahan, BJ, McMurdie, PJ, Rosen, MJ, Han, AW, Johnson, AJA, and Holmes, SP. DADA2: high-resolution sample inference from Illumina amplicon data. Nat Methods. (2016) 13:581–3. doi: 10.1038/nmeth.3869
42. Quast, C, Pruesse, E, Yilmaz, P, Gerken, J, Schweer, T, Yarza, P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. (2013) 41:D590–6. doi: 10.1093/nar/gks1219
43. Lahti, L, and Shetty, S. Tools for microbiome analysis in R. (2017). Available online at: http://microbiome.github.com/microbiome (Accessed September 13, 2024).
44. McKinney, W. Data structures for statistical computing in Python. Scipy. (2010). doi: 10.25080/Majora-92bf1922-00a
45. Virtanen, P, Gommers, R, Oliphant, TE, Haberland, M, Reddy, T, Cournapeau, D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. (2020) 17:261–72. doi: 10.1038/s41592-019-0686-2
46. Wickham, H. A layered grammar of graphics. J Comput Graph Stat. (2010) 19:3–28. doi: 10.1198/jcgs.2009.07098
47. Bland, JM, and Altman, DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. (1986) 327:307–10.
48. Harris, CR, Millman, KJ, van der Walt, SJ, Gommers, R, Virtanen, P, Cournapeau, D, et al. Array programming with NumPy. Nature. (2020) 585:357–62. doi: 10.1038/s41586-020-2649-2
49. Hunter, JD. Matplotlib: a 2D graphics environment. Comput Sci Eng. (2007) 9:90–5. doi: 10.1109/MCSE.2007.55
50. Waskom, ML. Seaborn: statistical data visualization. J Open Source Softw. (2021) 6:3021. doi: 10.21105/joss.03021
51. Magnúsdóttir, S, Heinken, A, Kutt, L, Ravcheev, DA, Bauer, E, Noronha, A, et al. Generation of genome-scale metabolic reconstructions for 773 members of the human gut microbiota. Nat Biotechnol. (2017) 35:81–9. doi: 10.1038/nbt.3703
52. Bäckhed, F, Roswall, J, Peng, Y, Feng, Q, Jia, H, Kovatcheva-Datchary, P, et al. Dynamics and stabilization of the human gut microbiome during the first year of life. Cell Host Microbe. (2015) 17:690–703. doi: 10.1016/j.chom.2015.04.004
53. Stewart, CJ, Ajami, NJ, O’Brien, JL, Hutchinson, DS, Smith, DP, Wong, MC, et al. Temporal development of the gut microbiome in early childhood from the TEDDY study. Nature. (2018) 562:583–8. doi: 10.1038/s41586-018-0617-x
54. Yeung, SSY, Reijnierse, EM, Trappenburg, MC, Hogrel, J-Y, McPhee, JS, Piasecki, M, et al. Handgrip strength cannot be assumed a proxy for overall muscle strength. J Am Med Dir Assoc. (2018) 19:703–9. doi: 10.1016/j.jamda.2018.04.019
55. Bowers, RM, Kyrpides, NC, Stepanauskas, R, Harmon-Smith, M, Doud, D, Reddy, TBK, et al. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nat Biotechnol. (2017) 35:725–31. doi: 10.1038/nbt.3893
56. Thiele, I, and Palsson, BØ. A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat Protoc. (2010) 5:93–121. doi: 10.1038/nprot.2009.203
57. Lieven, C, Beber, ME, Olivier, BG, Bergmann, FT, Ataman, M, Babaei, P, et al. MEMOTE for standardized genome-scale metabolic model testing. Nat Biotechnol. (2020) 38:272–6. doi: 10.1038/s41587-020-0446-y
58. Jovel, J, Patterson, J, Wang, W, Hotte, N, O’Keefe, S, Mitchel, T, et al. Characterization of the gut microbiome using 16S or shotgun metagenomics. Front Microbiol. (2016) 7:459. doi: 10.3389/fmicb.2016.00459
59. Karp, PD, Weaver, D, and Latendresse, M. How accurate is automated gap filling of metabolic models? BMC Syst Biol. (2018) 12:73. doi: 10.1186/s12918-018-0593-7
60. Zampieri, G, Campanaro, S, Angione, C, and Treu, L. Metatranscriptomics-guided genome-scale metabolic modeling of microbial communities. Cell Rep Methods. (2023) 3:100383. doi: 10.1016/j.crmeth.2022.100383
61. Popp, D, and Centler, F. μBialSim: constraint-based dynamic simulation of complex microbiomes. Front Bioeng Biotechnol. (2020) 8:574. doi: 10.3389/fbioe.2020.00574
62. Tsukuda, N, Yahagi, K, Hara, T, Watanabe, Y, Matsumoto, H, Mori, H, et al. Key bacterial taxa and metabolic pathways affecting gut short-chain fatty acid profiles in early life. ISME J. (2021) 15:2574–90. doi: 10.1038/s41396-021-00937-7
63. Łoniewska, B, Fraszczyk-Tousty, M, Tousty, P, Skonieczna-Żydecka, K, Maciejewska-Markiewicz, D, and Łoniewski, I. Analysis of fecal short-chain fatty acids (SCFAs) in healthy children during the first two years of life: an observational prospective cohort study. Nutrients. (2023) 15:367. doi: 10.3390/nu15020367
64. Ma, G, and Chen, Y. Polyphenol supplementation benefits human health via gut microbiota: a systematic review via meta-analysis. J Funct Foods. (2020) 66:103829. doi: 10.1016/j.jff.2020.103829
65. Wolters, M, Ahrens, J, Romaní-Pérez, M, Watkins, C, Sanz, Y, Benítez-Páez, A, et al. Dietary fat, the gut microbiota, and metabolic health – a systematic review conducted within the MyNewGut project. Clin Nutr. (2019) 38:2504–20. doi: 10.1016/j.clnu.2018.12.024
66. Wu, S, Bhat, ZF, Gounder, RS, Mohamed Ahmed, IA, Al-Juhaimi, FY, Ding, Y, et al. Effect of dietary protein and processing on gut microbiota—a systematic review. Nutrients. (2022) 14:453. doi: 10.3390/nu14030453
67. Ni, J, Wang, Y, Sun, H, Chang, Z, Wang, R, Jiang, Y, et al. Comparative study on static and dynamic digest characteristics of oat β-Glucan and β-Gluco-oligosaccharides. Food Res Int. (2024) 197:115153. doi: 10.1016/j.foodres.2024.115153
68. Joseph, C, Zafeiropoulos, H, Bernaerts, K, and Faust, K. Predicting microbial interactions with approaches based on flux balance analysis: an evaluation. BMC Bioinformatics. (2024) 25:36. doi: 10.1186/s12859-024-05651-7
69. Aguirre, M, Ramiro-Garcia, J, and Koenen, M. To pool or not to pool? Impact of the use of individual and pooled fecal samples for in vitro fermentation studies. J Microbiol Methods. (2014) 107:1–7. doi: 10.1016/j.mimet.2014.08.022
70. Reygner, J, Delannoy, J, Barba-Goudiaby, M-T, Gasc, C, Levast, B, Gaschet, E, et al. Reduction of product composition variability using pooled microbiome ecosystem therapy and consequence in two infectious murine models. Appl Environ Microbiol. (2024) 90:e0001624–4. doi: 10.1128/aem.00016-24
71. Bauer, E, Laczny, CC, Magnusdottir, S, Wilmes, P, and Thiele, I. Phenotypic differentiation of gastrointestinal microbes is reflected in their encoded metabolic repertoires. Microbiome. (2015) 3:55. doi: 10.1186/s40168-015-0121-6
72. Durazzi, F, Sala, C, Castellani, G, Manfreda, G, Remondini, D, and De Cesare, A. Comparison between 16S rRNA and shotgun sequencing data for the taxonomic characterization of the gut microbiota. Sci Rep. (2021) 11:3030. doi: 10.1038/s41598-021-82726-y
73. Banerjee, S, Schlaeppi, K, and van der Heijden, MGA. Keystone taxa as drivers of microbiome structure and functioning. Nat Rev Microbiol. (2018) 16:567–76. doi: 10.1038/s41579-018-0024-1
74. Bruijning, M, Ayroles, JF, Henry, LP, Koskella, B, Meyer, KM, and Metcalf, CJE. Relative abundance data can misrepresent heritability of the microbiome. Microbiome. (2023) 11:222. doi: 10.1186/s40168-023-01669-w
75. Barlow, JT, Bogatyrev, SR, and Ismagilov, RF. A quantitative sequencing framework for absolute abundance measurements of mucosal and lumenal microbial communities. Nat Commun. (2020) 11:2590. doi: 10.1038/s41467-020-16224-6
76. Tang, G, Carr, AV, Perez, C, Sarmiento, KR, Levy, L, Lampe, JW, et al. Metagenomic estimation of absolute bacterial biomass in the mammalian gut through host-derived read normalization. bioRxiv. (2025) 10: e00984–25. doi: 10.1101/2025.01.07.631807
77. Liu, Y, Duan, X, Duan, S, Li, C, Hu, B, Liu, A, et al. Effects of in vitro digestion and fecal fermentation on the stability and metabolic behavior of polysaccharides from Craterellus cornucopioides. Food Funct. (2020) 11:6899–910. doi: 10.1039/D0FO01430C
Keywords: gut microbiota, modeling, in silico, correlation, short-chain fatty acid
Citation: Geniselli da Silva V, Smith NW, Mullaney JA, Roy NC, Wall C and McNabb WC (2025) Mathematical models of the colonic microbiota: an evaluation of accuracy using in vitro fecal fermentation data. Front. Nutr. 12:1623418. doi: 10.3389/fnut.2025.1623418
Edited by:
Hao Huang, Lishui University, China
Reviewed by:
Daniel Garza, INRAE Centre Jouy-en-Josas, France
Mateus Kawata Salgaço, São Paulo State University, Brazil
Copyright © 2025 Geniselli da Silva, Smith, Mullaney, Roy, Wall and McNabb. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Warren Charles McNabb, W.McNabb@massey.ac.nz