- 1Department of Gastroenterology, Qilu Hospital, Shandong University, Jinan, Shandong, China
- 2Department of Gastroenterology, Qilu Hospital of Shandong University, Jinan, China
- 3Clinical Epidemiology Unit, Qilu Hospital of Shandong University, Jinan, China
- 4National Key Laboratory for Innovation and Transformation of Luobing Theory, Jinan, China
- 5The Key Laboratory of Cardiovascular Remodeling and Function Research, Chinese Ministry of Education, Chinese National Health Commission and Chinese Academy of Medical Sciences, Jinan, China
- 6Department of Cardiology, Qilu Hospital of Shandong University, Jinan, China
- 7Department of Pancreatic Surgery, General Surgery, Qilu Hospital of Shandong University, Jinan, China
- 8Department of Gastroentero-Pancreatic Surgery, Qilu Hospital (Qingdao), Cheeloo College of Medicine, Shandong University, Qingdao, Shandong, China
Introduction: Inflammatory bowel disease (IBD), comprising Crohn’s disease (CD) and ulcerative colitis (UC), is a chronic and relapsing inflammatory disorder of the gastrointestinal tract. Current diagnostic approaches are invasive, costly, and time-consuming, underscoring the need for non-invasive, accurate diagnostic methods.
Methods: We conducted a targeted metabolomic analysis of 49 metabolites related to central carbon metabolism in urine samples from individuals with IBD and from a control group. Diagnostic models were constructed using six machine learning algorithms, and their performance was evaluated by the cross-validated area under the receiver operating characteristic curve (AUC). The SHAP (SHapley Additive exPlanations) method was used to interpret the models and identify key discriminatory features.
Results: Six metabolites (xylose, isocitric acid, fructose, L-fucose, N-acetyl-D-glucosamine (GlcNAc), and glycolic acid) differentiated UC from the control group, while three metabolites (xylose, L-fucose, and citric acid) distinguished CD from the control group. The optimal diagnostic model achieved a mean AUC of 0.84 for UC and 0.93 for CD. These models retained high diagnostic accuracy when patients were stratified by disease activity. SHAP analysis identified L-fucose, xylose, and GlcNAc as important features for UC, and citric acid and xylose for CD.
Discussion: Our findings highlight distinct metabolic signatures in central carbon metabolism associated with IBD subtypes. The identified metabolite panels, combined with machine learning models, offer promising non-invasive tools for differentiating UC and CD from healthy individuals.
Introduction
Inflammatory bowel disease (IBD) is a chronic and relapsing inflammatory disorder of the gastrointestinal tract, with ulcerative colitis (UC) and Crohn’s disease (CD) being the primary classifications. The incidence and prevalence of IBD are rapidly increasing, particularly in newly industrialized countries (Kaplan, 2015). IBD is typically diagnosed using standard clinical, endoscopic, radiological, and histological criteria (Rubin et al., 2019; Lichtenstein et al., 2018). Although UC and CD can present with similar clinical manifestations, their pathogenesis and treatments differ considerably (Bernstein et al., 2010; Blonski et al., 2012). Evidence has indicated that long-lasting subclinical disease activity usually reduces the quality of life and increases the risk of surgical intervention (Lichtenstein et al., 2004). Therefore, diagnosing and monitoring IBD disease activity are particularly crucial.
Currently, endoscopic examination is the gold standard for IBD diagnosis, but it is time-consuming, invasive, and expensive (Hamilton, 2012; Annese et al., 2013). Serum biomarkers such as C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), anti-Saccharomyces cerevisiae antibodies (ASCA), and perinuclear antineutrophil cytoplasmic antibodies (p-ANCA) are limited by their low sensitivity and specificity (Sands, 2015; Sakurai and Saruta, 2023).
Metabolomics offers a comprehensive analysis of metabolites, facilitating disease diagnosis and biomarker identification (Nicholson and Lindon, 2008). Urine metabolomics has emerged as a promising approach for identifying non-invasive biomarkers for disease diagnosis (Khamis et al., 2017). Central carbon metabolism, which includes glycolysis/gluconeogenesis, the pentose phosphate pathway, and the tricarboxylic acid (TCA) cycle, plays a crucial role in cellular function by providing energy and precursors for biosynthetic pathways (Wu et al., 2023; Xia et al., 2022). Several studies have shown that the levels of some metabolites involved in central carbon metabolism, such as succinic acid and citric acid, differ significantly between patients with IBD and healthy individuals (Schicho et al., 2012; Stephens et al., 2013; Dawiskiba et al., 2014; Alonso et al., 2016; Aldars-García et al., 2024; Martin et al., 2016; Martin et al., 2017). However, previous studies have often focused on single metabolites and lacked a quantitative analysis of central carbon metabolism as a whole.
In this study, we used ultra-high-pressure liquid chromatography coupled with tandem mass spectrometry (UHPLC-MS/MS) to perform a quantitative analysis of central carbon metabolism in urine samples collected from 95 subjects, comprising patients with UC, patients with CD, and a control group (CG). We aimed to establish diagnostic models for discriminating IBD from non-IBD individuals using machine learning methods and to evaluate the importance of individual metabolites. The overall study design is shown in Figure 1.

Figure 1. Overview of the study design. The schematic illustrates the overall workflow of the study, including participant recruitment (UC, CD, and control groups), urine sample collection and preparation, targeted metabolomic profiling using UHPLC-MS/MS, machine learning model construction and evaluation, and SHAP-based feature interpretation. (Drawn by Figdraw platform, ID: TTTIW05057).
Materials and methods
Study participants
This study was approved by the Ethics Committee of Qilu Hospital of Shandong University (Approval number: KYLL-202212-010), and written informed consent was obtained from all subjects. A total of 49 UC patients, 20 CD patients, and 26 control subjects were recruited from hospitalized patients in the Qilu Hospital of Shandong University between April 2023 and June 2024. Eligible patients were between 18 and 75 years old, with a diagnosis of ulcerative colitis (UC) or Crohn’s disease (CD) confirmed according to the European Crohn’s and Colitis Organization criteria (Magro et al., 2017; Gomollón et al., 2017). Clinical activity was scored using the Mayo score for UC (Paine, 2014) and the Crohn’s disease activity index (CDAI) for CD (Best et al., 1976). The inclusion criterion for the control group was healthy adults between 18 and 75 years of age.
To minimize inter-individual variation due to hydration status and recent dietary intake, all urine samples were collected as first-morning voids following an overnight fast of at least 8 h. Participants were instructed to avoid food and fluid intake after midnight and to collect their first urination immediately upon waking. This standardized collection helps ensure consistency in urinary metabolite concentrations by reducing the influence of short-term fluctuations in fluid balance and nutritional status.
To minimize possible confounding effects, the exclusion criteria were as follows: indeterminate colitis; structural abnormalities of the gastrointestinal tract; urinary dysfunction or infection; and comorbid diabetes mellitus, inborn errors of metabolism, renal or hepatic disease, severe infection, or evidence of malignancy that could affect the results of this study. The same exclusion criteria were applied to all participants.
Sample processing and preparation of standards
Fresh midstream urine samples were collected from subjects in the morning. Immediately after collection, the urine was centrifuged at 1,000 × g for 10 min to collect the supernatant. After centrifugation at 12,000 × g for 10 min at 4°C, the middle layer of the supernatant was aspirated into a 1.5 mL centrifuge tube and stored at −80°C until use.
A total of 49 central carbon metabolism-related standards were accurately weighed and individually dissolved in 50% methanol-water to prepare single-compound stock solutions. Appropriate volumes of each stock solution were combined and diluted with 50% methanol-water to obtain a mixed working standard solution at suitable concentrations.
Isotopically labeled internal standards—including succinic acid-D4, L-carnitine-D3, cholic acid-D4, and salicylic acid-D4—were also weighed and dissolved individually in 50% methanol-water to prepare their respective stock solutions. Equal volumes of these solutions were then combined and diluted to prepare a mixed internal standard solution with final concentrations of 5 μg/mL (succinic acid-D4), 5 μg/mL (L-carnitine-D3), 20 μg/mL (cholic acid-D4), and 30 μg/mL (salicylic acid-D4).
To construct the calibration curves, 50 μL of the working standard solution was mixed with 10 μL of the isotope internal standard mixture and 140 μL of acetonitrile. The mixture was vortexed for 1 min and centrifuged at 14,000 rcf for 20 min at 4°C. Then, 100 μL of the resulting supernatant was transferred to a 1.5 mL centrifuge tube, followed by the addition of 25 μL of 200 mM 3-nitrophenylhydrazine hydrochloride (3NPH·HCl) and 25 μL of 120 mM EDC·HCl solution containing 6% pyridine. The mixture was vortexed for 30 s, briefly centrifuged for 5 s, and incubated at 60°C for 40 min using a thermostatic shaker. This derivatization reaction forms hydrazone derivatives with carbonyl-containing metabolites. After the reaction, the sample was vortexed again for 30 s, centrifuged at 14,000 rcf for 20 min at 4°C, and the supernatant was transferred into an autosampler vial for LC-MS/MS analysis. Calibration curves for all 49 central carbon metabolites were constructed using 12 concentration points, each of which was injected in triplicate. The resulting peak area ratios (analyte/internal standard) were used to generate calibration curves through weighted linear regression. Detailed information is listed in Supplementary Table S5.
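The weighted linear regression used for the calibration curves can be illustrated with a minimal sketch. The weighting scheme is not stated in the text; 1/x weighting is assumed here because it is a common choice in LC-MS/MS calibration, and the function name and calibration values are hypothetical.

```python
def weighted_linear_fit(conc, ratio, weights=None):
    """Fit ratio = slope * conc + intercept by weighted least squares."""
    if weights is None:
        # Assumption: 1/x weighting; the source only says "weighted".
        weights = [1.0 / c for c in conc]
    sw = sum(weights)
    # Weighted means of concentration and peak area ratio
    mx = sum(w * x for w, x in zip(weights, conc)) / sw
    my = sum(w * y for w, y in zip(weights, ratio)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, conc))
    sxy = sum(w * (x - mx) * (y - my)
              for w, x, y in zip(weights, conc, ratio))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical calibration levels: concentration and analyte/IS area ratio
conc = [1.0, 2.0, 5.0, 10.0, 50.0, 100.0]
ratio = [0.02 * c + 0.01 for c in conc]
slope, intercept = weighted_linear_fit(conc, ratio)
```

Weighting by 1/x gives low-concentration points more influence than ordinary least squares would, which usually improves accuracy near the lower limit of quantification.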
A 20 μL aliquot of urine was mixed with 10 μL of the isotope internal standard mixture, 30 μL of 50% methanol, and 140 μL of acetonitrile, and vortexed for 1 min. After centrifugation at 14,000 rcf for 20 min at 4°C, 100 μL of the supernatant was transferred to a 1.5 mL centrifuge tube containing 25 μL of 200 mM 3NPH·HCl and 25 μL of 120 mM EDC·HCl solution (containing 6% pyridine). The mixture was vortexed for 30 s, briefly centrifuged for 5 s, and incubated at 60°C for 40 min in a thermostatic shaker. After the reaction, the mixture was vortexed for 30 s and centrifuged at 14,000 rcf for 20 min at 4°C. Finally, the supernatant was transferred to an autosampler vial for subsequent analysis.
UHPLC-MS/MS analysis
Targeted metabolomic analysis of the urine samples was performed using liquid chromatography–tandem mass spectrometry (LC-MS/MS) on an ExionLC AD system coupled with a QTRAP® 6500+ mass spectrometer (Sciex, United States) at Majorbio Bio-Pharm Technology Co. Ltd. (Shanghai, China).
Chromatographic separation was performed on an ExionLC™ AD system equipped with a Waters HSS T3 chromatography column (2.1 × 150 mm, 1.8 μm). Mobile phase A was 0.03% formic acid in water, and mobile phase B was 0.03% formic acid in methanol. The gradient was as follows: 0.0–2.0 min, hold at 1% B; 2.0–8.0 min, from 1% to 22% B; 8.0–12.0 min, hold at 22% B; 12.0–13.0 min, from 22% to 40% B; 13.0–17.0 min, from 40% to 65% B; 17.0–19.0 min, hold at 65% B; 19.0–20.0 min, from 65% to 100% B; 20.0–21.0 min, hold at 100% B; 21.0–21.01 min, from 100% to 1% B; 21.01–22.0 min, hold at 1% B. The injection volume was 2 μL, and the column temperature was 40°C.
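For readers who want to reproduce the elution program, the gradient above can be encoded as a table of breakpoints. The sketch below assumes that each ramp is linear between the stated time points, which the text implies but does not state; the names GRADIENT and percent_b are illustrative.

```python
# (time in min, % mobile phase B) breakpoints taken from the gradient above
GRADIENT = [
    (0.0, 1.0), (2.0, 1.0), (8.0, 22.0), (12.0, 22.0), (13.0, 40.0),
    (17.0, 65.0), (19.0, 65.0), (20.0, 100.0), (21.0, 100.0),
    (21.01, 1.0), (22.0, 1.0),
]


def percent_b(t):
    """Return % B at time t (min), assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-22 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


mid_ramp = percent_b(5.0)   # halfway through the 1% to 22% ramp
late_ramp = percent_b(15.0)  # halfway through the 40% to 65% ramp
```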
Mass spectrometric analyses were performed on a QTRAP® 6500+ mass spectrometer (Sciex, United States) equipped with an electrospray ionization (ESI) source operating in negative and positive modes. The parameters were set as follows: source temperature (TEM) at 550°C; curtain gas (CUR) at 35 psi; collision gas (CAD) at medium; both Ion Source Gas1 and Gas2 at 55 psi; IonSpray Voltage (IS) at +4500/−4500 V.
Data acquisition was conducted in multiple reaction monitoring (MRM) mode, a highly sensitive and specific mass spectrometric technique that enables the selective quantification of predefined metabolites. In MRM mode, the mass spectrometer first selects precursor ions (parent ions) of interest in the first quadrupole (Q1), induces fragmentation through collision-induced dissociation (CID) in the second quadrupole (Q2), and then monitors specific product ions in the third quadrupole (Q3). This approach ensures high analytical specificity, sensitivity, and reproducibility for the targeted detection and quantification of known metabolites. Metabolite identification was carried out in a targeted manner using MRM based on authentic reference standards. Both precursor and corresponding product ions were monitored for each metabolite, enabling high-confidence identification. The complete list of ion pairs and retention time data is provided in Supplementary Table S3.
The quality control (QC) sample was a mixed standard solution at a moderate concentration level, primarily used to evaluate the stability of the analytical system. In this study, the QC sample was prepared using the C10 concentration level (i.e., the 10th calibration level, see Supplementary Table S5). During the LC-MS/MS analysis, one QC sample was injected after every 5 to 10 sample injections to monitor instrument stability and repeatability. The stability of QC signals across the analytical batch further validated the reproducibility of the instrument and analytical conditions. The relative standard deviations (RSDs) of all target analytes were below 15%, indicating that the method and analytical system were stable and reliable.
To evaluate the robustness of the LC-MS/MS platform and exclude the potential influence of instrumental variation, we conducted method validation for 49 central carbon metabolites. QC samples at low, medium, and high concentrations were analyzed in six replicates within a single day and across 3 days. Intra-day and inter-day precision were calculated as RSD%, and recovery rates were calculated based on spiked and measured concentrations. Acceptable thresholds were defined as RSD% ≤15% and recovery rates within 80%–120%, following common metabolomics validation guidelines. All metabolites met these criteria, confirming the stability of the measurement system.
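The acceptance check described above can be written out as a short sketch. It assumes that RSD% is the sample standard deviation divided by the mean and that recovery is the mean measured concentration divided by the spiked concentration; the exact recovery formula is not given in the text, and the replicate values are hypothetical.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) of replicate measurements."""
    return 100.0 * stdev(values) / mean(values)


def recovery_percent(measured, spiked):
    """Recovery (%) as measured over spiked concentration (assumed formula)."""
    return 100.0 * measured / spiked


def passes_validation(replicates, spiked, max_rsd=15.0,
                      recovery_range=(80.0, 120.0)):
    """Apply the RSD <= 15% and 80-120% recovery thresholds to one QC level."""
    rsd = rsd_percent(replicates)
    rec = recovery_percent(mean(replicates), spiked)
    low, high = recovery_range
    return rsd <= max_rsd and low <= rec <= high


# Hypothetical six replicates of a QC sample spiked at 10.0 concentration units
replicates = [9.8, 10.1, 10.3, 9.9, 10.0, 10.2]
ok = passes_validation(replicates, spiked=10.0)
```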
The raw data were processed with SCIEX OS software using the default parameters, supplemented by manual inspection. A linear calibration curve was constructed with the ratio of the analyte peak area to the internal standard peak area on the vertical axis and the analyte concentration on the horizontal axis. For each sample, the measured peak area ratio was substituted into the linear equation to calculate the analyte concentration (Supplementary Tables S1, S2).
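The back-calculation of a sample concentration from its peak area ratio amounts to inverting the calibration line. The following sketch uses hypothetical slope, intercept, and peak areas, and ignores any dilution correction, which the text does not describe.

```python
def back_calculate(area_analyte, area_internal_std, slope, intercept):
    """Invert ratio = slope * conc + intercept to obtain the concentration."""
    ratio = area_analyte / area_internal_std
    return (ratio - intercept) / slope


# Hypothetical calibration parameters and peak areas
concentration = back_calculate(
    area_analyte=4200.0,
    area_internal_std=20000.0,
    slope=0.02,
    intercept=0.01,
)
```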
Statistical analysis
SPSS version 27.0.1 and R software (version 4.4.2) were used to process and analyze clinical and metabolomic data. Normally distributed data were expressed as mean ± standard deviation and compared using a t-test. Non-normally distributed data were expressed as median and interquartile ranges and analyzed using the Mann-Whitney U-test. Qualitative data were presented using frequencies and percentages and compared with the Chi-square test. Multivariate statistical analysis was performed using principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and orthogonal partial least squares discriminant analysis (OPLS-DA) to identify differential metabolites and visualized by the ggplot2 package. Seven-fold cross-validation and permutation tests were used to evaluate the quality of the model. Enrichment of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways was also performed. Multivariable logistic regression, adjusting for age, sex, smoking, weight, and height, was used to evaluate the association of each metabolite with disease status. Metabolites with a VIP >1 in OPLS-DA analysis and Padj values <0.05 were defined as significantly differential metabolites.
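The final selection rule (VIP >1 together with an adjusted P value <0.05) can be sketched as below. The adjustment method is not named in the text; the Benjamini-Hochberg procedure is assumed here purely for illustration, and the metabolite names and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differential(names, vip, pvalues, vip_cut=1.0, alpha=0.05):
    """Keep metabolites with VIP > vip_cut and adjusted P < alpha."""
    padj = benjamini_hochberg(pvalues)
    return [n for n, v, q in zip(names, vip, padj)
            if v > vip_cut and q < alpha]


# Hypothetical VIP scores and raw P values for four metabolites
names = ["met_1", "met_2", "met_3", "met_4"]
vip = [1.5, 0.8, 1.2, 2.0]
pvalues = [0.01, 0.04, 0.03, 0.005]
selected = select_differential(names, vip, pvalues)
```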
Machine learning models
Machine learning operations were performed using the “tidymodels” package. The SMOTE algorithm was utilized to address unbalanced class distribution issues in the models. Several optimized machine learning models were used to discriminate between IBD and control group: decision tree (DT), random forest (RF), ridge regression (Ridge), support vector machine (SVM), light gradient boosting machine (LightGBM), and neural networks (NN). The hyperparameters were obtained using grid search with 5-fold cross-validation. The performance of these machine learning methods was evaluated by the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis (DCA). We also utilized Shapley additive explanation (SHAP) values to improve the interpretability of the final model.
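To make the oversampling step concrete, the core idea of SMOTE is sketched below: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified stand-alone illustration, not the implementation used in the study; the function name, the choice of k, and the toy data are assumptions. To avoid optimistic performance estimates, oversampling is generally applied only to the training portion of each cross-validation fold.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from minority-class feature vectors."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (squared Euclidean)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic


# Toy minority class with two features per sample
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=10, k=2)
```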
Results
Participant characteristics
A total of 95 participants were enrolled in this study, including 49 patients with ulcerative colitis (UC), 20 patients with Crohn’s disease (CD), and 26 control individuals. The clinical characteristics of the three groups are summarized in Table 1. Statistically significant differences were observed in age and weight between the IBD groups and the control group (p < 0.01), as well as in hematological parameters such as hemoglobin, hematocrit, and platelet count. These differences highlight the clinical and physiological alterations associated with IBD, which may be reflected in the metabolomic profiles.
Metabolomic profiling and multivariate analysis
Targeted metabolomic profiling identified 49 urinary metabolites associated with central carbon metabolism across all subjects. Principal component analysis (PCA) showed limited separation among the three groups, indicating the necessity for supervised methods. Orthogonal partial least squares discriminant analysis (OPLS-DA) revealed clear group separation between UC and controls (R2X = 0.906, R2Y = 0.598, Q2 = 0.439), and between CD and controls (R2X = 0.543, R2Y = 0.733, Q2 = 0.594), suggesting that metabolite patterns differ significantly between disease and non-disease states (Figure 2, Supplementary Figure S1).

Figure 2. Multivariate statistical analysis of urinary metabolites in IBD patients. (A) OPLS-DA score plot of UC (green dots) vs. CG (pink dots); (B) OPLS-DA score plot of CD (blue dots) vs. CG (pink dots). (C, D) The 200-time permutation plots of two OPLS-DA models, respectively.
Hierarchical clustering of metabolite intensities showed distinct metabolic signatures among UC, CD, and control groups (Figure 3A). Metabolites such as glucosamine-6-phosphate were elevated in UC, whereas cis-aconitic acid, trehalose-6-phosphate, nicotinic acid, and glucaric acid were more abundant in CD compared to both UC and control groups. KEGG pathway enrichment analysis further supported that metabolic pathways including the TCA cycle, glucagon signaling, PI3K-Akt, and mTOR signaling were perturbed in IBD, suggesting potential disease-specific metabolic reprogramming (Figures 3B,C).

Figure 3. Comparison of urinary metabolomic profiles and KEGG pathway enrichment analysis. (A) Heatmap of cluster analysis of each metabolite among the three groups (UC, CD and CG), illustrating differential metabolite abundance. (B) KEGG enrichment analysis between UC vs CG. (C) KEGG enrichment analysis between CD vs CG.
Identification of differential metabolites and diagnostic panels
Logistic regression models adjusting for age, sex, smoking status, weight, and height identified several significantly differential metabolites. Six metabolites—xylose, isocitric acid, fructose, L-fucose, GlcNAc, and glycolic acid—were significantly altered in UC patients compared to controls. In CD, three metabolites—xylose, L-fucose, and citric acid—were significantly changed. Notably, L-fucose and xylose were elevated in both UC and CD, highlighting their potential as shared biomarkers. The differences in metabolite levels were more pronounced in patients with higher disease activity, as shown in Figure 4.

Figure 4. Candidate biomarkers based on disease severity (A) Metabolite levels in UC vs CG stratified by disease severity (remission/mild vs moderate/severe). (B) Metabolite levels in CD vs CG stratified similarly.
To assess the potential impact of instrumental variation, we performed method validation based on intra- and inter-day precision as well as recovery rate analyses. A total of 49 central carbon metabolites were evaluated using quality control (QC) samples prepared at three concentration levels (low, medium, and high). Each level was analyzed in six replicates within a day (intra-day precision) and across 3 days (inter-day precision). The RSD of intra-day precision ranged from 1.20% to 11.92%, and that of inter-day precision ranged from 2.33% to 12.66%. Recovery rates ranged from 85.33% to 113.71%. These results indicate that the analytical platform is stable and reproducible under the current workflow. Therefore, the observed metabolite differences are unlikely to be attributable to instrumental drift. Detailed results are listed in Supplementary Table S4.
Machine learning algorithms
Six machine learning algorithms were trained to distinguish IBD subtypes from controls. Among them, the random forest (RF) model achieved the best performance for UC, with a mean cross-validated AUC of 0.84 (SE = 0.036), and the support vector machine (SVM) model performed best for CD, with a mean AUC of 0.93 (SE = 0.035). Calibration curves showed good model fit, and decision curve analysis (DCA) demonstrated that both models offered higher net clinical benefit compared to baseline strategies (Figures 5A–F).
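The reported summary statistics (mean cross-validated AUC and its standard error) can be reproduced in principle as follows. The sketch computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, and assumes that the standard error is the standard deviation of the fold-wise AUCs divided by the square root of the number of folds. The scores and fold values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def auc(scores_pos, scores_neg):
    """AUC as the fraction of case/control pairs ranked correctly (ties 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def mean_and_se(fold_aucs):
    """Mean AUC across folds and its standard error (sd / sqrt(k), assumed)."""
    return mean(fold_aucs), stdev(fold_aucs) / sqrt(len(fold_aucs))


# Hypothetical predicted probabilities for one validation fold
fold_auc = auc(scores_pos=[0.9, 0.8, 0.4], scores_neg=[0.5, 0.3])

# Hypothetical AUCs from five folds
mean_auc, se_auc = mean_and_se([0.70, 0.80, 0.90, 0.80, 0.80])
```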

Figure 5. Evaluation of diagnostic model performance using machine learning algorithms. (A) Cross-validated AUC values for each model in UC vs CG. (B) Cross-validated AUC values in CD vs CG. (C) Calibration curve of the RF model for UC. (D) Calibration curve of the SVM model for CD. (E) DCA of different models in distinguishing UC from CG. (F) DCA curves for CD vs CG.
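The net benefit plotted in a decision curve is conventionally defined, at a threshold probability pt, as the true-positive rate per subject minus the false-positive rate per subject weighted by pt/(1 − pt). The sketch below applies this standard definition to hypothetical labels and predicted probabilities and compares the model with a "treat all" baseline; whether the study used exactly these baselines is an assumption.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying as positive when y_prob >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of labelling every subject as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical labels (1 = disease) and model probabilities
y_true = [1, 1, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.1]
nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
```

The "treat none" strategy has a net benefit of zero at every threshold, so a model is useful over the range of thresholds where its curve lies above both baselines.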
Stratified analysis by disease activity showed that the diagnostic accuracy remained high in both mild/remission and moderate/severe stages. The AUC for distinguishing UC from controls was 0.864 in remission/mild patients and 0.904 in moderate/severe cases. Similarly, CD patients showed AUCs of 0.963 in remission and 0.988 in active disease, confirming the robustness of the models across disease stages (Figures 6A–D).

Figure 6. Diagnostic performance of biomarker panels across disease activity levels. (A) ROC curve for identifying UC patients in remission/mild activity vs CG. (B) ROC curve for moderate/severe UC vs CG. (C) ROC curve for CD patients in remission vs CG. (D) ROC curve for active CD vs CG.
Longitudinal data from a subset of patients with samples collected at ≥6-month intervals showed consistent metabolite profiles during periods of equivalent disease activity, indicating good temporal stability of the biomarkers (Figure 7).

Figure 7. Stability of candidate biomarkers over time. (A) Longitudinal comparison of urinary metabolite levels in UC patients between two time points ≥6 months apart with similar disease activity. (B) Similar analysis in CD patients.
Model interpretation using SHAP
To interpret the contributions of individual metabolites to model predictions, SHAP (Shapley Additive exPlanations) analysis was performed (Lundberg and Lee, 2017). In the RF model for UC, L-fucose, xylose, and GlcNAc were identified as the top predictors, with positive SHAP values indicating higher disease risk (Figure 8A). For CD, citric acid and xylose were most influential in the SVM model, with decreased citric acid and elevated xylose contributing to increased CD probability (Figure 8B). These findings confirm the biological relevance and diagnostic utility of the selected metabolites.
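SHAP values are based on Shapley values from cooperative game theory. For a small number of features they can be computed exactly by enumerating all feature subsets, as in the sketch below. This is an illustration of the underlying definition, not the approximation algorithm used by SHAP software for the RF and SVM models; the toy model, the sample, and the convention of replacing absent features with a baseline value are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features outside the subset are set to their baseline value
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical risk score with one interaction term."""
    return 2 * z[0] - z[1] + z[0] * z[2]


phi = shapley_values(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that makes beeswarm plots interpretable feature by feature.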

Figure 8. Model interpretation using SHAP analysis. (A) SHAP beeswarm plot from the RF model for UC vs CG. (B) SHAP beeswarm plot from the SVM model for CD vs CG. In both plots, purple indicates higher feature values, yellow indicates lower values. Positive SHAP values on the X-axis correspond to increased predicted disease risk; negative values indicate lower risk. The vertical axis ranks features by overall contribution to model output.
Discussion
To our knowledge, this is the first targeted urinary metabolomics study to explore the association between IBD and central carbon metabolism. We demonstrated that certain urinary metabolites related to central carbon metabolism differ significantly between IBD patients and the control group. Using machine learning algorithms, we also identified two potential biomarker panels, one to distinguish UC from the control group and one to distinguish CD from the control group. Furthermore, these panels retained their diagnostic value in patients at different stages of disease activity. Overall, our research showed that the combined application of urinary metabolomics and machine learning has advantages for diagnosing IBD and monitoring disease activity.
The levels of xylose and L-fucose were significantly increased in both UC and CD compared with the control group, consistent with previous studies (Schicho et al., 2012; Murdoch et al., 2008). The mechanisms underlying these changes in monosaccharides remain unclear but might be related to dysbiosis in IBD. Xylose and L-fucose are metabolites related to the intestinal flora, and their production depends on the involvement of intestinal bacteria. In UC, the enrichment of some bacterial taxa [such as Actinobacteria (Manichanh et al., 2012) and Streptococcus (Scanu et al., 2024)] may cause increased production and urinary excretion of xylose (Beg et al., 2001) and L-fucose (Moya-Gonzálvez et al., 2022). In CD, increased production and urinary excretion of xylose (Robert and Bernalier-Donadille, 2003) and L-fucose (Moya-Gonzálvez et al., 2022) may be due to the enrichment of some bacterial genera [such as Enterococcus (Kang et al., 2010) and Ruminococcus (Manichanh et al., 2012)]. Meanwhile, fucose itself is a critical component of mucin in the intestinal epithelial barrier and maintains intestinal flora homeostasis (Bets et al., 2022). Fujii et al. further explored the underlying mechanism of fucose in IBD (Fujii et al., 2016). They discovered that the core-fucosylated T-cell receptor was essential for T-cell signaling and the production of inflammatory cytokines. Interestingly, recent research found that these monosaccharides and some polysaccharides containing them could reduce colon inflammation (Li et al., 2021; Hui et al., 2024; Lean et al., 2015; Qin et al., 2022). Exogenous L-fucose improved intestinal epithelial barrier function, and the mechanism was related to the upregulation of fucosyltransferase 2-mediated fucosylation of intestinal epithelial cells (Li et al., 2021). L-fucose also improved the epithelial barrier by promoting the proliferation of intestinal stem cells (Tan et al., 2022).
A polysaccharide containing xylose and fucose could significantly decrease Akkermansia to upregulate thiamine metabolism, thereby inhibiting macrophage activation and reducing oxidative stress and inflammation (Hui et al., 2024).
N-acetyl-D-glucosamine (GlcNAc), an amide derivative of the monosaccharide glucose, is present in glycosaminoglycans, glycoproteins, and glycolipids (Chen et al., 2010; Das et al., 2024). Notably, GlcNAc is also a component of mucin and helps maintain the mucus barrier. The relationship between GlcNAc and UC has not yet been fully explored. Previous research indicated that GlcNAc could reduce the colonization of pathogenic bacteria, enhance the growth of beneficial bacteria, and elevate the expression of the tight junction protein occludin to improve intestinal barrier function (Choi et al., 2023; Jenior et al., 2017). Zhao et al. discovered that reduced O-linked-N-acetylglucosaminylation (O-GlcNAcylation) levels lead to increased intestinal permeability and microbial imbalance in the gut in patients with UC (Zhao et al., 2018). UC is usually accompanied by intestinal barrier disruption and infiltration of immune cells. Matos et al. confirmed that N-acetylglucosaminidase, a biomarker of macrophage infiltration, was increased in the colon of mice with DSS-induced colitis (Matos et al., 2013). We thus speculate that the elevated levels of GlcNAc in the urine of UC patients may be associated with intestinal barrier disruption and increased N-acetylglucosaminidase. Supplementation with GlcNAc has been used to treat children with treatment-resistant IBD, with a significant improvement in symptoms (Salvatore et al., 2000).
Citric acid is an essential metabolite in the TCA cycle. The urinary citric acid levels were decreased in CD patients, especially in those patients with high levels of disease activity. This result was consistent with previous results (Schicho et al., 2012; Stephens et al., 2013; Dawiskiba et al., 2014; Alonso et al., 2016; Aldars-García et al., 2024). Chronic diarrhea may contribute to hypocitraturia in CD patients because of metabolic acidosis from a loss of bicarbonate in the feces. This may explain the increased risk of urinary stones in CD patients (Rudman et al., 1980; Siener et al., 2024). Citrate and hydroxycinnamate derivatives from Mume Fructus could relieve LPS-induced intestinal epithelial cell injury by regulating the FAK/PI3K/AKT signaling pathway and thus should be considered as potential therapeutic targets for CD (Liu et al., 2023).
As mentioned above, changes in the urine metabolic profile of IBD patients in our study were consistent with the findings of previous studies. Our research further employed machine learning techniques to evaluate the importance of the metabolites associated with central carbon metabolism. Finally, we built two potential biomarker panels to diagnose IBD, which showed good diagnostic performance even in different disease stages. ESR and CRP are the most widely used blood markers for IBD in clinical practice (Sands, 2015). The AUC values of CRP and ESR in UC patients were 0.607 and 0.552, respectively. In CD patients, the AUC values for CRP and ESR were 0.698 and 0.746, respectively (Huang et al., 2023). The AUC values of the potential biomarker panels were significantly greater than those of CRP and ESR, suggesting that the potential biomarker panels had higher accuracy in diagnosing IBD than these traditional indicators.
Several previous studies have explored urinary metabolomic changes in patients with IBD, predominantly using untargeted approaches such as NMR or global LC-MS. For instance, Schicho et al. and Stephens et al. demonstrated that urinary metabolites such as hippurate and citrate could distinguish IBD patients from healthy individuals, although pathway specificity was limited (Schicho et al., 2012; Stephens et al., 2013). Alonso et al. and Aldars-García et al. examined the urinary metabolome in immune-mediated or treatment-naïve IBD populations; however, these analyses primarily relied on non-targeted or global profiling approaches rather than quantitative methods (Alonso et al., 2016; Aldars-García et al., 2024).
In contrast, our study employed a targeted metabolomic strategy focusing specifically on central carbon metabolism, enabling precise quantification of 49 metabolites involved in glycolysis, the TCA cycle, and related pathways. Moreover, we integrated multiple machine learning algorithms and SHAP interpretability analysis, which were not typically applied in previous urinary metabolomics studies for IBD. While some of our findings, such as reduced urinary citrate in CD and elevated fucose in active IBD, are consistent with previous observations (Schicho et al., 2012; Dawiskiba et al., 2014), the persistent elevation of xylose and GlcNAc in both UC and CD and their stability over time may offer novel diagnostic insights.
Compared to earlier untargeted studies, our quantitatively validated metabolite panel shows higher diagnostic performance and better clinical translation potential. These findings both confirm and expand upon prior research by identifying robust, pathway-relevant biomarkers for the non-invasive diagnosis and stratification of IBD.
This study has several limitations. First, we enrolled a relatively small sample of participants from a single institution; large-scale, multicenter studies are therefore needed to verify the robustness of the models. Second, we did not evaluate the effects of diet on urinary metabolites, although morning urine samples were collected to minimize the impact of diet and physical activity. Third, we adjusted for several potential confounders, but residual confounding cannot be completely ruled out. Moreover, sample randomization was not achieved during the targeted metabolomics analysis, which may have introduced bias. Fourth, we could not completely establish the stability of urinary metabolites in this study. The quantitative determination of metabolites was performed in the same batch to reduce variability. The relative standard deviation of the stability of these targets was less than 15%, indicating that the results obtained by the method should be reliable. Precision and recovery were determined by analyzing high, medium, and low standard concentrations: inter-day and intra-day precision was <15%, and recoveries were greater than 80% at all concentrations. However, although we evaluated the stability of urinary metabolites, the sample size for this evaluation was too small to be conclusive. The longitudinal stability of urinary metabolites has been assessed in other studies, with credible results (Khamis et al., 2017).
Future work should include more participants, updated models, and additional key metabolic pathways.
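The precision and recovery criteria mentioned above can be illustrated with a short calculation. This is a sketch under stated assumptions: the replicate values and nominal concentration are hypothetical, and recovery is simplified to the mean measured concentration divided by the nominal concentration of a standard, which may differ from the exact procedure used in this study.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


def recovery_percent(measured, nominal):
    """Mean measured concentration as a percentage of the nominal value."""
    return 100.0 * mean(measured) / nominal


# Hypothetical replicate measurements of one standard, NOT study data.
replicates = [9.8, 10.1, 10.3, 9.9, 10.4]
nominal = 10.0

rsd = rsd_percent(replicates)
recovery = recovery_percent(replicates, nominal)

print(f"RSD = {rsd:.2f}% (criterion: < 15%)")
print(f"Recovery = {recovery:.1f}% (criterion: > 80%)")
```

In practice the same check would be repeated at the low, medium, and high standard concentrations, both within a day and across days.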
Conclusion
In conclusion, this study identified a distinct panel of urinary metabolites related to central carbon metabolism that can accurately differentiate ulcerative colitis (UC) and Crohn’s disease (CD) from control individuals. By integrating targeted metabolomics with multiple machine learning algorithms, we developed robust diagnostic models with high predictive performance (AUC >0.90 for CD, and >0.80 for UC), even across different stages of disease activity. Key metabolites such as xylose, L-fucose, GlcNAc, and citric acid were found to be strongly associated with IBD status and may reflect underlying metabolic dysregulation in disease pathogenesis.
These findings not only validate previous metabolic signatures reported in IBD but also expand upon them by providing pathway-specific, quantitatively validated biomarker panels. Our results support the potential utility of urinary metabolite-based models as non-invasive diagnostic tools for IBD, offering a promising supplement or alternative to current invasive methods. Future large-scale and multicenter studies are warranted to further validate the generalizability and clinical applicability of these metabolite-based diagnostic strategies.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.
Ethics statement
The studies involving humans were approved by the Ethics Committee of Qilu Hospital of Shandong University (Approval number: KYLL-202212-010). The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
M-LL: Data curation, Methodology, Conceptualization, Writing – original draft, Investigation, Writing – review and editing, Formal Analysis. GB: Formal Analysis, Writing – review and editing, Methodology, Data curation. X-LY: Software, Writing – original draft. YW: Writing – review and editing, Software. Z-RS: Writing – original draft, Software. XG: Data curation, Software, Writing – original draft. HZ: Software, Writing – original draft. XZ: Software, Writing – original draft. FL: Methodology, Conceptualization, Writing – review and editing, Data curation. Y-BY: Resources, Visualization, Data curation, Project administration, Formal Analysis, Validation, Methodology, Investigation, Software, Writing – original draft, Conceptualization, Writing – review and editing, Funding acquisition, Supervision.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This research was funded by the National Natural Science Foundation of China (NSFC 82070540), the Taishan Scholars Program of Shandong Province (tsqn202211309), the National Key Research and Development Program (2022YFC2504001), the Natural Science Foundation of Shandong Province (ZR2024LSW013), and the Scientific Research Project of Shandong Medical Association (YXH2024YS027).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmolb.2025.1615047/full#supplementary-material
References
Aldars-García, L., Gil-Redondo, R., Embade, N., Riestra, S., Rivero, M., Gutiérrez, A., et al. (2024). Serum and urine metabolomic profiling of newly diagnosed treatment-naïve inflammatory bowel disease patients. Inflamm. Bowel Dis. 30 (2), 167–182. doi:10.1093/ibd/izad154
Alonso, A., Julià, A., Vinaixa, M., Domènech, E., Fernández-Nebro, A., Cañete, J. D., et al. (2016). Urine metabolome profiling of immune-mediated inflammatory diseases. BMC Med. 14 (1), 133. doi:10.1186/s12916-016-0681-8
Annese, V., Daperno, M., Rutter, M. D., Amiot, A., Bossuyt, P., East, J., et al. (2013). European evidence based consensus for endoscopy in inflammatory bowel disease. J. Crohns Colitis 7 (12), 982–1018. doi:10.1016/j.crohns.2013.09.016
Beg, Q. K., Kapoor, M., Mahajan, L., and Hoondal, G. S. (2001). Microbial xylanases and their industrial applications: a review. Appl. Microbiol. Biotechnol. 56 (3-4), 326–338. doi:10.1007/s002530100704
Bernstein, C. N., Fried, M., Krabshuis, J. H., Cohen, H., Eliakim, R., Fedail, S., et al. (2010). World Gastroenterology Organization Practice Guidelines for the diagnosis and management of IBD in 2010. Inflamm. Bowel Dis. 16 (1), 112–124. doi:10.1002/ibd.21048
Best, W. R., Becktel, J. M., Singleton, J. W., and Kern, F. (1976). Development of a Crohn’s disease activity index. Gastroenterology 70 (3), 439–444. doi:10.1016/s0016-5085(76)80163-1
Bets, V. D., Achasova, K. M., Borisova, M. A., Kozhevnikova, E. N., and Litvinova, E. A. (2022). Role of Mucin 2 glycoprotein and L-fucose in interaction of immunity and microbiome within the experimental model of inflammatory bowel disease. Biochem. (Mosc) 87 (4), 301–318. doi:10.1134/S0006297922040010
Blonski, W., Buchner, A. M., and Lichtenstein, G. R. (2012). Clinical predictors of aggressive/disabling disease: ulcerative colitis and crohn disease. Gastroenterol. Clin. North Am. 41 (2), 443–462. doi:10.1016/j.gtc.2012.01.008
Chen, J. K., Shen, C. R., and Liu, C. L. (2010). N-acetylglucosamine: production and applications. Mar. Drugs 8 (9), 2493–2516. doi:10.3390/md8092493
Choi, S. I., Shin, Y. C., Lee, J. S., Yoon, Y. C., Kim, J. M., and Sung, M. K. (2023). N-Acetylglucosamine and its dimer ameliorate inflammation in murine colitis by strengthening the gut barrier function. Food Funct. 14 (18), 8533–8544. doi:10.1039/d3fo00282a
Das, S., Chowdhury, C., Kumar, S. P., Roy, D., Gosavi, S. W., and Sen, R. (2024). Microbial production of N-acetyl-D-glucosamine (GlcNAc) for versatile applications: biotechnological strategies for green process development. Carbohydr. Res. 536, 109039. doi:10.1016/j.carres.2024.109039
Dawiskiba, T., Deja, S., Mulak, A., Ząbek, A., Jawień, E., Pawełka, D., et al. (2014). Serum and urine metabolomic fingerprinting in diagnostics of inflammatory bowel diseases. World J. Gastroenterol. 20 (1), 163–174. doi:10.3748/wjg.v20.i1.163
Fujii, H., Shinzaki, S., Iijima, H., Wakamatsu, K., Iwamoto, C., Sobajima, T., et al. (2016). Core fucosylation on T cells, required for activation of T-cell receptor signaling and induction of colitis in mice, is increased in patients with inflammatory bowel disease. Gastroenterology 150 (7), 1620–1632. doi:10.1053/j.gastro.2016.03.002
Gomollón, F., Dignass, A., Annese, V., Tilg, H., Van Assche, G., Lindsay, J. O., et al. (2017). 3rd European evidence-based consensus on the diagnosis and management of Crohn’s disease 2016: part 1: diagnosis and medical management. J. Crohns Colitis 11 (1), 3–25. doi:10.1093/ecco-jcc/jjw168
Hamilton, M. J. (2012). The valuable role of endoscopy in inflammatory bowel disease. Diagn Ther. Endosc. 2012, 467979. doi:10.1155/2012/467979
Huang, J., Lu, J., Jiang, F., and Song, T. (2023). Platelet/albumin ratio and plateletcrit levels are potential new biomarkers for assessing endoscopic inflammatory bowel disease severity. BMC Gastroenterol. 23 (1), 393. doi:10.1186/s12876-023-03043-4
Hui, H., Wang, Z., Zhao, X., Xu, L., Yin, L., Wang, F., et al. (2024). Gut microbiome-based thiamine metabolism contributes to the protective effect of one acidic polysaccharide from Selaginella uncinata (Desv.) Spring against inflammatory bowel disease. J. Pharm. Anal. 14 (2), 177–195. doi:10.1016/j.jpha.2023.08.003
Jenior, M. L., Leslie, J. L., Young, V. B., and Schloss, P. D. (2017). Clostridium difficile colonizes alternative nutrient niches during infection across distinct murine gut microbiomes. mSystems 2 (4), e00063-17. doi:10.1128/mSystems.00063-17
Kang, S., Denman, S. E., Morrison, M., Yu, Z., Dore, J., Leclerc, M., et al. (2010). Dysbiosis of fecal microbiota in Crohn’s disease patients as revealed by a custom phylogenetic microarray. Inflamm. Bowel Dis. 16 (12), 2034–2042. doi:10.1002/ibd.21319
Kaplan, G. G. (2015). The global burden of IBD: from 2015 to 2025. Nat. Rev. Gastroenterol. Hepatol. 12 (12), 720–727. doi:10.1038/nrgastro.2015.150
Khamis, M. M., Adamko, D. J., and El-Aneed, A. (2017). Mass spectrometric based approaches in urine metabolomics and biomarker discovery. Mass Spectrom. Rev. 36 (2), 115–134. doi:10.1002/mas.21455
Lean, Q. Y., Eri, R. D., Fitton, J. H., Patel, R. P., and Gueven, N. (2015). Fucoidan extracts ameliorate acute colitis. PLoS One 10 (6), e0128453. doi:10.1371/journal.pone.0128453
Li, Y., Jiang, Y., Zhang, L., Qian, W., Hou, X., and Lin, R. (2021). Exogenous l-fucose protects the intestinal mucosal barrier depending on upregulation of FUT2-mediated fucosylation of intestinal epithelial cells. FASEB J. 35 (7), e21699. doi:10.1096/fj.202002446RRRR
Lichtenstein, G. R., Loftus, E. V., Isaacs, K. L., Regueiro, M. D., Gerson, L. B., and Sands, B. E. (2018). ACG clinical guideline: management of Crohn’s disease in adults. Am. J. Gastroenterol. 113 (4), 481–517. doi:10.1038/ajg.2018.27
Lichtenstein, G. R., Yan, S., Bala, M., and Hanauer, S. (2004). Remission in patients with Crohn’s disease is associated with improvement in employment and quality of life and a decrease in hospitalizations and surgeries. Am. J. Gastroenterol. 99 (1), 91–96. doi:10.1046/j.1572-0241.2003.04010.x
Liu, Z., Zhang, Z., Chen, X., Ma, P., Peng, Y., and Li, X. (2023). Citrate and hydroxycinnamate derivatives from Mume Fructus protect LPS-injured intestinal epithelial cells by regulating the FAK/PI3K/AKT signaling pathway. J. Ethnopharmacol. 301, 115834. doi:10.1016/j.jep.2022.115834
Lundberg, S., and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In proceedings of the 31st international conference on neural information processing systems (NIPS'17). Red Hook, NY: Curran Associates Inc., 4768–4777. doi:10.5555/3295222.3295230
Magro, F., Gionchetti, P., Eliakim, R., Ardizzone, S., Armuzzi, A., Barreiro-de Acosta, M., et al. (2017). Third European evidence-based consensus on diagnosis and management of ulcerative colitis. Part 1: definitions, diagnosis, extra-intestinal manifestations, pregnancy, cancer surveillance, surgery, and ileo-anal pouch disorders. J. Crohns Colitis 11 (6), 649–670. doi:10.1093/ecco-jcc/jjx008
Manichanh, C., Borruel, N., Casellas, F., and Guarner, F. (2012). The gut microbiota in IBD. Nat. Rev. Gastroenterol. Hepatol. 9 (10), 599–608. doi:10.1038/nrgastro.2012.152
Martin, F. P., Ezri, J., Cominetti, O., Da Silva, L., Kussmann, M., Godin, J. P., et al. (2016). Urinary metabolic phenotyping reveals differences in the metabolic status of healthy and inflammatory bowel disease (IBD) children in relation to growth and disease activity. Int. J. Mol. Sci. 17 (8), 1310. doi:10.3390/ijms17081310
Martin, F. P., Su, M. M., Xie, G. X., Guiraud, S. P., Kussmann, M., Godin, J. P., et al. (2017). Urinary metabolic insights into host-gut microbial interactions in healthy and IBD children. World J. Gastroenterol. 23 (20), 3643–3654. doi:10.3748/wjg.v23.i20.3643
Matos, I., Bento, A. F., Marcon, R., Claudino, R. F., and Calixto, J. B. (2013). Preventive and therapeutic oral administration of the pentacyclic triterpene α,β-amyrin ameliorates dextran sulfate sodium-induced colitis in mice: the relevance of cannabinoid system. Mol. Immunol. 54 (3-4), 482–492. doi:10.1016/j.molimm.2013.01.018
Moya-Gonzálvez, E. M., Peña-Gil, N., Rubio-Del-Campo, A., Coll-Marqués, J. M., Gozalbo-Rovira, R., Monedero, V., et al. (2022). Infant gut microbial metagenome mining of α-l-fucosidases with activity on fucosylated human milk oligosaccharides and glycoconjugates. Microbiol. Spectr. 10 (4), e0177522. doi:10.1128/spectrum.01775-22
Murdoch, T. B., Fu, H., MacFarlane, S., Sydora, B. C., Fedorak, R. N., and Slupsky, C. M. (2008). Urinary metabolic profiles of inflammatory bowel disease in interleukin-10 gene-deficient mice. Anal. Chem. 80 (14), 5524–5531. doi:10.1021/ac8005236
Nicholson, J. K., and Lindon, J. C. (2008). Systems biology: metabonomics. Nature 455 (7216), 1054–1056. doi:10.1038/4551054a
Paine, E. R. (2014). Colonoscopic evaluation in ulcerative colitis. Gastroenterol. Rep. (Oxf) 2 (3), 161–168. doi:10.1093/gastro/gou028
Qin, Z., Yuan, X., Liu, J., Shi, Z., Cao, L., Yang, L., et al. (2022). Albuca bracteata polysaccharides attenuate AOM/DSS induced colon tumorigenesis via regulating oxidative stress, inflammation and gut microbiota in mice. Front. Pharmacol. 13, 833077. doi:10.3389/fphar.2022.833077
Robert, C., and Bernalier-Donadille, A. (2003). The cellulolytic microflora of the human colon: evidence of microcrystalline cellulose-degrading bacteria in methane-excreting subjects. FEMS Microbiol. Ecol. 46 (1), 81–89. doi:10.1016/S0168-6496(03)00207-1
Rubin, D. T., Ananthakrishnan, A. N., Siegel, C. A., Sauer, B. G., and Long, M. D. (2019). ACG clinical guideline: ulcerative colitis in adults. Am. J. Gastroenterol. 114 (3), 384–413. doi:10.14309/ajg.0000000000000152
Rudman, D., Dedonis, J. L., Fountain, M. T., Chandler, J. B., Gerron, G. G., Fleming, G. A., et al. (1980). Hypocitraturia in patients with gastrointestinal malabsorption. N. Engl. J. Med. 303 (12), 657–661. doi:10.1056/NEJM198009183031201
Sakurai, T., and Saruta, M. (2023). Positioning and usefulness of biomarkers in inflammatory bowel disease. Digestion 104 (1), 30–41. doi:10.1159/000527846
Salvatore, S., Heuschkel, R., Tomlin, S., Davies, S. E., Edwards, S., Walker-Smith, J. A., et al. (2000). A pilot study of N-acetyl glucosamine, a nutritional substrate for glycosaminoglycan synthesis, in paediatric chronic inflammatory bowel disease. Aliment. Pharmacol. Ther. 14 (12), 1567–1579. doi:10.1046/j.1365-2036.2000.00883.x
Sands, B. E. (2015). Biomarkers of inflammation in inflammatory bowel disease. Gastroenterology 149 (5), 1275–1285. doi:10.1053/j.gastro.2015.07.003
Scanu, M., Toto, F., Petito, V., Masi, L., Fidaleo, M., Puca, P., et al. (2024). An integrative multi-omic analysis defines gut microbiota, mycobiota, and metabolic fingerprints in ulcerative colitis patients. Front. Cell Infect. Microbiol. 14, 1366192. doi:10.3389/fcimb.2024.1366192
Schicho, R., Shaykhutdinov, R., Ngo, J., Nazyrova, A., Schneider, C., Panaccione, R., et al. (2012). Quantitative metabolomic profiling of serum, plasma, and urine by (1)H NMR spectroscopy discriminates between patients with inflammatory bowel disease and healthy individuals. J. Proteome Res. 11 (6), 3344–3357. doi:10.1021/pr300139q
Siener, R., Ernsten, C., Speller, J., Scheurlen, C., Sauerbruch, T., and Hesse, A. (2024). Intestinal oxalate absorption, enteric hyperoxaluria, and risk of urinary stone formation in patients with Crohn’s disease. Nutrients 16 (2), 264. doi:10.3390/nu16020264
Stephens, N. S., Siffledeen, J., Su, X., Murdoch, T. B., Fedorak, R. N., and Slupsky, C. M. (2013). Urinary NMR metabolomic profiles discriminate inflammatory bowel disease from healthy. J. Crohns Colitis 7 (2), e42–e48. doi:10.1016/j.crohns.2012.04.019
Tan, C., Hong, G., Wang, Z., Duan, C., Hou, L., Wu, J., et al. (2022). Promoting effect of L-fucose on the regeneration of intestinal stem cells through AHR/IL-22 pathway of intestinal lamina propria monocytes. Nutrients 14 (22), 4789. doi:10.3390/nu14224789
Wu, Z., Liang, X., Li, M., Ma, M., Zheng, Q., Li, D., et al. (2023). Advances in the optimization of central carbon metabolism in metabolic engineering. Microb. Cell Fact. 22 (1), 76. doi:10.1186/s12934-023-02090-6
Xia, H., Huang, Z., Xu, Y., Yam, J. W. P., and Cui, Y. (2022). Reprogramming of central carbon metabolism in hepatocellular carcinoma. Biomed. Pharmacother. 153, 113485. doi:10.1016/j.biopha.2022.113485
Keywords: inflammatory bowel disease, ulcerative colitis, Crohn’s disease, urinary metabolomics, machine learning, central carbon metabolism
Citation: Lei M-L, Bi G-W, Yin X-L, Wang Y, Sun Z-R, Guo X-r, Zhang H-p, Zhao X-h, Li F and Yu Y-B (2025) Targeted urinary metabolomics combined with machine learning to identify biomarkers related to central carbon metabolism for IBD. Front. Mol. Biosci. 12:1615047. doi: 10.3389/fmolb.2025.1615047
Received: 20 April 2025; Accepted: 18 July 2025; Published: 11 August 2025.
Edited by:
Francois-Pierre Martin, H&H Group, Switzerland
Reviewed by:
Boxun Zhang, China Academy of Chinese Medical Sciences, China
David Gaul, Georgia Institute of Technology, United States
Copyright © 2025 Lei, Bi, Yin, Wang, Sun, Guo, Zhang, Zhao, Li and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Feng Li, MjAxMDYyMDAwMTM3QHNkdS5lZHUuY24=; Yan-bo Yu, eXV5YW5ibzIwMDBAMTI2LmNvbQ==