ORIGINAL RESEARCH article

Front. Microbiol., 03 June 2020

Sec. Microbial Immunology

Volume 11 - 2020 | https://doi.org/10.3389/fmicb.2020.01162

Development of a Novel Metagenomic Biomarker for Prediction of Upper Gastrointestinal Tract Involvement in Patients With Crohn’s Disease

  • Department of Internal Medicine, Kyung Hee University Hospital at Gangdong, College of Medicine, Kyung Hee University, Seoul, South Korea

Abstract

The human gut microbiota is an important component in the pathogenesis of Crohn’s disease (CD), promoting host–microbe imbalances and disturbing intestinal and immune homeostasis. We aimed to assess the potential clinical usefulness of the colonic tissue microbiome for obtaining biomarkers for upper gastrointestinal (UGI) tract involvement in CD. We analyzed colonic tissue samples from 26 CD patients (13 with and 13 without UGI involvement at diagnosis) from the Inflammatory Bowel Disease Multi-Omics Database. QIIME1, DiTaxa, linear discriminant analysis effect size (LEfSe), and PICRUSt2 methods were used to examine microbial dysbiosis. Linear support vector machine (SVM) and random forest classifier (RF) algorithms were used to identify the UGI tract involvement-associated biomarkers. There were no statistically significant differences in community richness, phylogenetic diversity, and phylogenetic distance between the two groups of CD patients. DiTaxa analysis predicted significant association of the species Ruminococcus torques with UGI involvement, which was confirmed by the LEfSe analysis (P = 0.025). For the feature ranking method in both linear SVM and RF models, the species R. torques and age at diagnosis contributed to the combined models. The L-methionine biosynthesis III (P = 0.038) and palmitate biosynthesis II (P = 0.050) were under-represented in CD with UGI involvement. These findings suggest that R. torques might serve as a novel potential biomarker for UGI involvement in CD and its correlations, in addition to a range of bacterial species. The mechanisms of interaction between hosts and R. torques should be further investigated.

Introduction

Crohn’s disease (CD) is a heterogeneous disorder with a multifactorial etiology, including genetic factors, host immune system, environmental factors, and gut microbiota, and is characterized by chronic relapsing transmural inflammation which can affect the gastrointestinal tract (Strober et al., 2007). It may affect any part of the gastrointestinal tract, from the mouth to the perianal area, although the terminal ileum and the right colon are the most commonly affected sites (Strober et al., 2007; Gajendran et al., 2018). Previous studies estimated the prevalence of CD patients affected with an upper gastrointestinal (UGI) tract involvement at 16–34% for adults (Lenaerts et al., 1989; Cameron, 1991; Kefalas, 2003; Castellaneta et al., 2004; Van Limbergen et al., 2008) and 26–54% for children (Annunziata et al., 2012; Diaz et al., 2015). The UGI tract involvement in CD represents a risk of complications, such as stricturing and fistulizing phenotypes (Bernell et al., 2000), recurrence (Wolters et al., 2006), further hospitalization (Chow et al., 2009), and surgery (Bernell et al., 2000; Wolters et al., 2006; Henriksen et al., 2007; Lazarev et al., 2013; Davis, 2015; Gomollon et al., 2017).

Accordingly, the European Crohn’s and Colitis Organization consensus guideline recommends that UGI endoscopy and radiology, such as magnetic resonance imaging, computed tomography, and small bowel capsule endoscopy, should be performed in all CD patients where UGI tract involvement is suspected (Annese et al., 2013). However, UGI tract involvement is a diagnostically challenging presentation in CD, due to a lack of specific clinical symptoms, and thus, there is a heavier reliance on imaging modalities in practice.

Chronic inflammation in CD patients is related to altered interactions between the host and the microbiota, and to microbial imbalance (Frank et al., 2007; Xavier and Podolsky, 2007; Sartor, 2008; Halfvarson et al., 2017). Currently, the human microbiome, comprising the entire microbial complement associated with human hosts, is a critical and emerging area for biomarker discovery (Pascal et al., 2017; Douglas et al., 2018; Mills et al., 2019). Identifying microbial biomarkers and using them to predict disease could provide valuable information across a wide range of clinical applications.

Hence, the aims of this study were to compare the metagenomic profile in CD patients with and without UGI involvement at diagnosis, and to identify the metagenomic biomarkers predicting its development.

Materials and Methods

Data Sources and Processing

We used data from the Inflammatory Bowel Disease Multi’omics Database1, which offers the most comprehensive description to date of host and microbial activities in inflammatory bowel diseases. Tissue samples gathered during the initial screening colonoscopy at diagnosis were collected according to a standardized protocol, and the V4 region of the 16S rRNA gene was PCR-amplified and sequenced on the MiSeq platform (Illumina) (for detailed protocols, see http://ibdmdb.org/protocols). We divided the subjects into two groups, “nonL4” and “L4”, where nonL4 denotes CD patients without UGI tract involvement and L4 denotes those with UGI tract involvement in disease extent.

Community Analysis

The raw data were analyzed using Quantitative Insights Into Microbial Ecology (QIIME) version 1.9.0, a software package that performs microbial community analysis and taxonomic classification of microbial genomes (Navas-Molina et al., 2013). Sequences were assigned to operational taxonomic units (OTUs) with a 97% similarity threshold and subsequently picked by UCLUST against a closed reference table, the latest version of the Greengenes OTU database (Edgar, 2010). For diversity analysis, samples were normalized so that all samples could be compared. Alpha diversity of OTU libraries was described using the Chao1, phylogenetic diversity (PD) whole tree, and observed species metrics, which were compared between groups using Student’s t-test. Distance matrices were constructed using the unweighted and weighted UniFrac algorithms in QIIME from the whole community phylogenetic tree. Significant differences between the predefined groups were analyzed using one-way analysis of similarities (ANOSIM) with 999 permutations and the corresponding Global-R statistics.
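As an illustration of two of the alpha diversity metrics named above, the following minimal Python sketch computes observed species and a Chao1 estimate from a vector of OTU read counts. It assumes the bias-corrected form of Chao1; the function names and the example counts are hypothetical and are not taken from the study data or from the QIIME source code.

```python
from collections import Counter


def observed_otus(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed variant).

    S_chao1 = S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs.
    """
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    return observed_otus(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical OTU read counts for one normalized sample (not study data).
sample = [10, 5, 1, 1, 2, 0]
print(observed_otus(sample), chao1(sample))
```

Chao1 adds to the observed richness a correction driven by rare OTUs, so a sample with no singletons returns its observed richness unchanged.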

Biomarker Detection and Functional Analysis

To determine the potential biomarker OTUs, linear discriminant analysis effect size (LEfSe) analysis was performed with a linear discriminant analysis (LDA) score threshold of > 1.0 to detect features significantly different in abundance between the groups (Fisher, 1936; Segata et al., 2011).
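LEfSe first screens each feature with a Kruskal–Wallis test between classes before estimating effect sizes with LDA. The sketch below covers only that screening step for two groups, in plain Python; it is an illustration under stated assumptions rather than the LEfSe implementation, and the abundance values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def kruskal_two_groups(a, b):
    """Tie-corrected Kruskal-Wallis H and chi-square p-value (df = 1)."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    ra, rb = sum(ranks[:len(a)]), sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (ra ** 2 / len(a) + rb ** 2 / len(b)) - 3 * (n + 1)
    ties = {}
    for v in pooled:
        ties[v] = ties.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    if correction == 0:  # all values identical: no evidence of a difference
        return 0.0, 1.0
    h = max(h / correction, 0.0)
    # For one degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return h, math.erfc(math.sqrt(h / 2))


# Hypothetical relative abundances of one taxon in each group (not study data).
l4 = [0.04, 0.05, 0.06]
non_l4 = [0.01, 0.02, 0.03]
h_stat, p_value = kruskal_two_groups(l4, non_l4)
print(round(h_stat, 3), round(p_value, 4))
```

In LEfSe, features passing this screen are then scored by LDA, and only those above the chosen LDA score threshold are reported.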

In addition, we conducted subsequence-based 16S rRNA data processing using the DiTaxa software, which replaces the standard OTU-clustering step by segmenting 16S rRNA reads into the most frequent variable-length subsequences, for sequence phenotype classification and biomarker detection (Asgari et al., 2018). The linear support vector machine (SVM) and random forest classifier (RF) algorithms were used to build a predictive model and to calculate the importance of all variables and rank them accordingly. For the linear SVM, we set the cost parameter to 1, and we used the RF classifier with default settings.
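The idea of segmenting reads into frequent variable-length subsequences can be illustrated with a byte-pair-encoding style merge loop. The sketch below is a simplified toy version, assuming greedy merging of the most frequent adjacent symbol pair with a lexicographic tie-break; it is not the DiTaxa implementation, and the function names and reads are hypothetical.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace each left-to-right occurrence of `pair` with the joined symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_merges(reads, num_merges):
    """Learn up to `num_merges` merges, starting from single nucleotides."""
    corpus = [list(read) for read in reads]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols in corpus:
            pairs.update(zip(symbols, symbols[1:]))
        if not pairs:
            break
        # Most frequent pair; ties go to the lexicographically smallest pair.
        best = max(sorted(pairs), key=pairs.get)
        merges.append(best)
        corpus = [apply_merge(symbols, best) for symbols in corpus]
    return merges, corpus


# Hypothetical short reads (not study data).
reads = ["acgtacgt", "acgtttac"]
merges, segmented = learn_merges(reads, num_merges=2)
print(merges)
print(segmented)
```

Each read is still recoverable by concatenating its segments; the segments themselves become the features on which a classifier such as a linear SVM or RF can be trained.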

PICRUSt2 was used to predict microbial content from each sample’s data and functionally annotate the data (Langille et al., 2013). The results were further analyzed using the statistical analysis of taxonomic and functional profiles (STAMP v2.1.1) software (Parks et al., 2014). To investigate the metabolic network of the predicted organism, we used the MetaCyc database2, which contains data regarding chemical compounds, reactions, enzymes, and metabolic pathways that have been experimentally validated and reported in the scientific literature (Caspi et al., 2016). The statistical analyses were performed using R version 3.5.1 (R Core Team, 2017; Venables and Smith, 2020). All significance thresholds were set at a two-sided P-value of 0.05.

Results

Baseline Characteristics

Among the 37 potential CD patients, four patients with insufficient data on disease extent and seven patients from whom tissue samples were not obtained at the time of diagnosis were excluded, leaving 26 patients for analysis. Patients of the L4 group were diagnosed at a younger median age of 13.0 years (IQR 10.5–15.5 years) compared to 19.0 years (IQR 14.5–28.0 years) for the patients in the nonL4 group (P = 0.005), and the male to female ratio was 2.3 and 1.2 in the L4 and nonL4 groups, respectively (P = 0.650) (Table 1). The baseline C-reactive protein (CRP) level and simple endoscopic score for Crohn’s disease (SES-CD) did not differ significantly between the groups (P = 0.711 and P = 0.056 for CRP and SES-CD, respectively) (Table 1). However, the CD patients with UGI tract involvement had a higher erythrocyte sedimentation rate (ESR) than those without UGI tract involvement (P = 0.033) (Table 1). All tissue samples were obtained from the rectum or ileum (Table 1). None of the patients were on any active medication, such as corticosteroids, immunomodulators, or biological agents, at the time of sample collection. The detailed demographic and clinical characteristics are summarized in Table 1.

TABLE 1

| Variable | L4 (n = 13) | nonL4 (n = 13) | P-value |
|---|---|---|---|
| Age at diagnosis, year (median, IQR) | 13.0 (5.0) | 19.0 (13.5) | 0.005 |
| Race (n, %) | | | 1.000 |
| White | 12 (92.3) | 12 (92.3) | |
| American Indian or Alaska Native | 1 (7.7) | 0 (0.0) | |
| Other | 0 (0.0) | 1 (7.7) | |
| Sex (n, %) | | | 0.650 |
| M | 9 (69.2) | 7 (53.8) | |
| F | 4 (30.8) | 6 (46.2) | |
| Biopsy location (n, %) | | | 0.216 |
| Ileum | 6 (46.2) | 3 (23.1) | |
| Rectum | 7 (53.8) | 10 (76.9) | |
| ESR (mm/h) | 40.0 (27.0) | 11.0 (28.0) | 0.033 |
| CRP (mg/L) | 4.8 (3.3) | 4.9 (12.6) | 0.711 |
| SES-CD score | 8 (9.0) | 2.5 (6.0) | 0.056 |

Baseline characteristics of the patients.

IQR, interquartile range; ESR, erythrocyte sedimentation rate; CRP, C-reactive protein; SES-CD, simple endoscopic score for Crohn’s disease. The P-values were obtained using the Mann–Whitney test and the chi-square test.

Taxonomic Characterization

We analyzed the intestinal microbiota diversity of the two groups and tested whether intestinal microbiota diversity could be related to disease extent. The alpha diversity indices of Chao1, PD whole tree, and observed species diversity are shown in Supplementary Figure S1. All three diversity indices were higher in nonL4 compared to L4, but there were no significant differences between the two groups (P = 0.522, P = 0.503, and P = 0.275 for Chao1, PD whole tree, and observed species diversity, respectively; Supplementary Figure S1). Beta diversity was further evaluated using weighted-UniFrac analysis, which showed similar bacterial communities in patients of both groups (Supplementary Figure S1). Furthermore, an unweighted UniFrac-based principal coordinate analysis (PCoA) showed that samples were clustered by subject (ANOSIM: R = −0.010; P = 0.477) (Supplementary Figure S2). We also performed a weighted-UniFrac PCoA analysis with ANOSIM (R = −0.043; P = 0.898) (Supplementary Figure S2).
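For readers less familiar with ANOSIM, the R statistic contrasts the mean rank of between-group distances with the mean rank of within-group distances; values near 0, as reported above, indicate no separation between groups, whereas values near 1 indicate strong separation. The following plain-Python sketch illustrates the calculation with a permutation test. It is a simplified illustration, not the QIIME implementation; the distance matrix, labels, and seed are hypothetical.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def anosim_r(dist, labels):
    """R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    pairs = list(combinations(range(len(labels)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim(dist, labels, permutations=999, seed=0):
    """Observed R and a permutation p-value from shuffled group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical 4-sample distance matrix with two well-separated groups.
dist = [
    [0.0, 0.1, 0.5, 0.6],
    [0.1, 0.0, 0.7, 0.8],
    [0.5, 0.7, 0.0, 0.2],
    [0.6, 0.8, 0.2, 0.0],
]
labels = ["L4", "L4", "nonL4", "nonL4"]
r_stat, p_value = anosim(dist, labels)
print(r_stat, p_value)
```

With only four samples the permutation p-value cannot become small even when R equals 1, which is one reason the statistic and its p-value are reported together.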

Bacterial Abundance and Distribution

Subsequently, we analyzed the intestinal microbiota abundance and distribution in the two groups and tested whether they could be related to UGI tract involvement in CD. At the genus level, bacteria from Akkermansia (0.3% vs. 1.8%), Haemophilus (0.2% vs. 1.7%), Oscillospira (1.0% vs. 1.3%), Parabacteroides (0.8% vs. 0.9%), Clostridium (0.1% vs. 0.9%), Dialister (0.7% vs. 0.8%), Lachnospira (0.1% vs. 0.7%), Streptococcus (0.3% vs. 0.5%), Coprococcus (0.4% vs. 0.5%), and Ruminococcus [f__Ruminococcaceae] (0.3% vs. 0.4%) were less abundant, whereas those from Bacteroides (34.9% vs. 33.0%), Faecalibacterium (13.7% vs. 11.1%), Ruminococcus [f__Lachnospiraceae] (7.4% vs. 5.4%), Prevotella (3.5% vs. 0.3%), Fusobacterium (3.1% vs. 2.6%), Sutterella (2.4% vs. 2.2%), Blautia (1.3% vs. 0.8%), Veillonella (1.2% vs. 1.1%), Dorea (0.7% vs. 0.5%), Bilophila (0.6% vs. 0.3%), Phascolarctobacterium (0.3% vs. 0.1%), and Odoribacter (0.2% vs. 0.1%) were more abundant in L4 compared to nonL4 (Figure 1).

FIGURE 1

Metagenomic Biomarker Discovery

We found significant differences in the community compositions between the two groups by LEfSe analysis. As shown in Figure 2, the microbial composition was also significantly different at the order level between the groups. The Pasteurellales (P = 0.042), Sphingomonadales (P = 0.045), Campylobacterales (P = 0.024), and Clostridiales (P = 0.043) exhibited a relatively higher abundance in the nonL4 group (Figure 2). Members of the class Epsilonproteobacteria (P = 0.024) and the family Campylobacteraceae (P = 0.024) were significantly more dominant in the nonL4 group than in the L4 group (Figure 2). Furthermore, there were seven significantly different genera: Campylobacter (P = 0.024), Prevotella (P = 0.034), Clostridium (P = 0.043), Coprobacillus (P = 0.015), Slackia (P = 0.034), and Lachnospira (P = 0.015) were enriched in the nonL4 group, while Limnohabitans (P = 0.034) was enriched in the L4 group (Figure 2). At the species level, significantly more Haemophilus parainfluenzae was detected in the patients in the nonL4 group (P = 0.028), while Ruminococcus torques was enriched in L4 group patients (P = 0.015) (Figure 2).

FIGURE 2

A comparative taxonomic visualization of the differentially abundant markers detected by DiTaxa and by a common workflow is shown in Supplementary Figure S3 for samples from CD patients with UGI tract involvement versus those without UGI tract involvement. In the comparison of samples from L4 group patients with those from the nonL4 group, DiTaxa analysis identified the following taxa as significantly associated: Ruminococcus faecis, Coprococcus comes, Dorea formicigenerans, Ruminococcus torques, CCMM_s, Eubacterium hallii, Bilophila wadsworthia, Blautia faecis, Ruminococcus gnavus, Alistipes putredinis, Bacteroides finegoldii, Roseburia faecis, Faecalibacterium prausnitzii, Roseburia inulinivorans, Lachnospira pectinoschiza, Intestinibacter bartlettii, Clostridium symbiosum, Agathobacter rectalis, Roseburia intestinalis, Clostridium bolteae, Fusicatenibacter saccharivorans, Anaerostipes hadrus, Bacteroides caccae, Bacteroides uniformis, and Flavonifractor plautii (Table 2).

TABLE 2

| Direction | Taxonomy | Marker | P-value | Number of markers |
|---|---|---|---|---|
| + | Ruminococcus faecis | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgtagacggagtggcaagtctggtgtgaaaacccggggctcaaccccgggactgcattggaaactgtcaatctagagtaccggagaggtaagcggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggcttactggacggtaactgacgttgaggctcgaaagcgtggggagcaaacagg | 0.014 | 2 |
| + | Coprococcus comes | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgtagacggctgtgtaagtctgaagtgaaagcccggggctcaaccccgggactgctttggaaactatgcagctagagtgtcggagaggtaagtggaattcccagtgtagcggtgaaatgcgtagatattgggaggaacaccagtggcgaaggcggcttactggacggtaactgacgttgaggctcgaaagcgtggggagcaaacagg | 0.025 | 1 |
| + | Dorea formicigenerans* | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgtagacggctgtgcaagtctgaagtgaaaggcatgggctcaacctgtggactgctttggaaactgtgcagctagagtgtcggagaggtaagtggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggcttactggacgatgactgacgttgaggctcgaaagcgtggggagcaaacagg | 0.025 | 1 |
| + | Ruminococcus torques* | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgtagacggagtggcaagtctgatgtgaaaacccggggctcaaccccgggactgcattggaaactgttcatctagagtgctggagaggtaagtggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggcttactggacagtaactgacgttgaggctcgaaagcgtggggagcaaacagg | 0.025 | 1 |
| + | CCMM_s | tacgtaggtggcgagcgttatccggaattattgggcgtaaagagggagcaggcggcactaagggtctgtggtgaaagatcgaagcttaacttcggtaagccatggaaac | 0.025 | 1 |
| + | Eubacterium hallii | tgctcggctagagtacaggagaggcaggcggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaagcgggcctgctggactgttactgacgctgaggcacgaaagcgtggggagcaaacagg | 0.046 | 1 |
| + | Bilophila wadsworthia | tccgtagatatctggaggaacaccggtggcgaaggcggccacctggacggtaactgacgctgaggtgcgaaagcgtgggtagcaaacagg | 0.046 | 1 |
| + | Blautia faecis | tacgtagggggcaagcgttatccggatttactgggtgtaaagggagcgtagacggcgcagcaagtctgatgtgaaaggcaggggcttaacccctggactgcattggaaactgctgtac | 0.046 | 1 |
| + | Ruminococcus gnavus* | tacgtagggggcaagcgttatccggatttactgggtgtaaagggagcgtagacggcatggcaagccagatgtgaaagcccggggctcaaccccgggactgcatttggaactgtcaggctagagtgtcggagaggaaagcggaattcctggtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggctttctggacgatgactgacgttgaggctcgaaagcgtggggagcaaacagg | 0.046 | 1 |
| + | Alistipes putredinis | tacggaggattcaagcgttatccggatttattgggtttaaagggtgcgtaggcggtttgataagttagaggtgaaatttcggggctcaaccctgaacgtgcctctaatactgttgagctagagagtagttgcggtaggcggaatgtatggtgtagcggtgaaatgcttagagatcatacagaacaccgattgcgaaggcagcttaccaaactatacctgacgttgaggcacgaaagcgtggggagcaaacagg | 0.046 | 1 |
| + | Bacteroides finegoldii | tacggaggatccgagcgttatccggatttattgggtttaaagggagcgtaggtggattgttaagtcagttgtgaaagtttgcggctcaaccgtaaaattgcagttgatactggctgtcttgagtacagtagaggtgggcgg | 0.046 | 1 |
| − | Roseburia faecis* | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgcaggcggtgcggcaagtctgatgtgaaagcccggggctcaaccccggtactgcattggaaactgtcgtactagagtgtcggaggggtaagtggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggcttactggacgataactgacgctgaggctcgaaagcgtggggagcaaacagg | 0.005 | 1 |
| − | Faecalibacterium prausnitzii* | aaggcaagttggaagtgaaatccatgggctcaacccatgaactgctttcaaaactgtttttcttgagtagtgcagaggtaggcggaattcccggtgtagcggtggaatgcgtagatatcgggaggaacaccagtggcgaaggcggcctactgggcaccaactgacgctgaggctcgaaagtgtgggtagcaaacagg | 0.014 | 1 |
| − | Roseburia inulinivorans | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgcaggcggaaggctaagtctgatgtgaaagcccggggctcaaccccggtac | 0.025 | 1 |
| − | Lachnospira pectinoschiza | agaggcaagtggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggcttgctggactgtaactgacactgaggctcgaaagcgtggggagcaaacagg | 0.025 | 1 |
| − | Intestinibacter bartlettii | tacgtagggggctagcgttatccggatttactgggcgtaaagggtgcgtaggcggtcttttaagtcaggagtgaaaggctacggctcaaccgtagtaagctcttgaaactggaggacttgagtgcaggagaggagagtggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtagcgaaggcggctctctggactgtaactgacgctgaggcacgaaagcgtggggagcaaacagg | 0.035 | 1 |
| − | Roseburia inulinivorans | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgcaggcggagggctaagtctgatgtgaaagcccggggctcaaccccggtactgcattggaaactggtcatctagagtgtcggaggggtaagtggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggcttactggacgataactgacgctgaggctcgaaagcgtggggagcaaacagg | 0.046 | 5 |
| − | Clostridium symbiosum | tgtttaactggagtgtcggagaggtaagtggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcgacttactggacgataactgacgttgaggctcgaaagcgtggggagcaaacagg | 0.046 | 1 |
| − | Agathobacter rectalis | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgcaggcggtgcggcaagtctgatgtgaaagcccggggctcaaccccggtactgcattggaaactgtcgtactagagtgtcggaggggtaagcggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggcttactggacgataactgacactgaggctcgaaagcgtggggagcaaacagg | 0.046 | 1 |
| − | Roseburia intestinalis | tacgtatggtgcaagcgttatccggatttactgggtgtaaagggagcgcaggcggtacggcaagtctgatgtgaaagcccggggctcaaccccggtactgcattggaaactgtcggac | 0.046 | 1 |
| − | Clostridium bolteae | tacgtaggtggcaagcgttatccggatttactgggtgtaaagggagcgtagacggcgaagcaagtctgaagtgaaaacccagggctcaaccctgggactgctttggaaactgttttgctagagtgtcggagaggtaagtggaattcctagtgtagcggtgaaatgcgtagatattaggaggaacaccagtggcgaaggcggcttactggacgataactgacgttgaggctcgaaagcgtggggagcaaacagg | 0.046 | 1 |
| − | Fusicatenibacter saccharivorans | tacgtagggggcaagcgttatccggatttactgggtgtaaagggagcgtagacggcaaggcaagtctgatgtgaaaacccagggcttaaccctgggactgcattggaaactgtctggctcgagtgccggagaggtaagcggaattcctagtgtagcggtgaaatgcgtagatattaggaagaacaccagtggcga | 0.046 | 1 |
| − | Anaerostipes hadrus | tacgtagggggcaagcgttatccggaattactgggtgtaaagggtgcgtaggtggtatggcaagtcagaagtgaaaacccagggcttaactctgggactgcttttgaaactgtcagactggagtgcaggagaggtaagcggaattcctagtgtagcggtgaaatgcgtagatattaggagg | 0.046 | 1 |
| − | Bacteroides caccae* | tacggcggatccgagcgttatccggatttattgggtttaaagggagcgtaggcggattgttaagtcagttgtgaaagtttgcggctcaaccgtaaaattgcagttgatactggcagtcttgagtgcagtagaggtgggcggaattcgtggtgtagcggtgaaatgcttagatatcacgaagaactccgattgcgaaggcagctcactggagtgtaactgacgctgatgctcgaaagtgtgggtatcaaacagg | 0.046 | 1 |
| − | Bacteroides caccae | tacggaggatccgagcgttatccggatttattgggtttaaagggagcgtaggcggattgttaagtcagttgtgaaagtttgcggctcaaccgtaaaattgcagttgatactggcagtcttgagtgcagtagaggtgggcggaattcgtggtgtagcggtgaaatgcttagatatcacgaagaactccgattgcggaggcagctcactggagtgtaactgacgctgatgctcgaaagtgtgggtatcaaacagg | 0.046 | 1 |
| − | Bacteroides uniformis* | tacggaggatccgagcgttatccggatttattgggtttaaagggagcgtaggcggacgcttaagtcagttgtgaaagtttgcggctcaaccgtaaaattgcagttgatactgggtgtcttgagtacagtagaggcaggcggaattcgtggtgtagcggtgaaatgcttagatatcacgaagaactccgattgcgaaggcagcctgctggactgtaactgacgctgatgctcgaaagtgtgggtatcaaaaagg | 0.046 | 1 |
| − | Flavonifractor plautii | taaagggcgtgtaggcgggattgcaagtcagatgtgaaaactgggggctcaacctccagcctgcatttgaaactgtagttc | 0.046 | 1 |

Taxa predicted by DiTaxa analysis for UGI tract involvement in CD.

UGI, upper gastrointestinal; CD, Crohn’s disease; +, prediction for L4 group; −, prediction for nonL4 group; *taxa identified with UCLUST-based methods in QIIME.

Of these, the following seven taxa were actually identified with UCLUST-based methods in QIIME (Table 2): Dorea formicigenerans, Ruminococcus torques, Ruminococcus gnavus, Roseburia faecis, Faecalibacterium prausnitzii, Bacteroides caccae, and Bacteroides uniformis.

Metagenomic Biomarker Evaluation

To further characterize the predictive value of the eight taxa identified by the LEfSe or DiTaxa methods, we performed ROC analysis with clinical variables (age at diagnosis and sex) using the machine learning models (Figure 3). A comparison of average predictive performance suggests the superiority of SVM: the average performance of SVM was >0.799 AUC and 68.2–75.2% accuracy, while that of RF was <0.740 AUC and 57.8–66.5% accuracy (Figure 3). For the top-performing model architecture, the addition of microbial features improved the predictive performance of the linear SVM model; however, the performance of the RF model tended to decrease. Notably, in the feature ranking for both linear SVM and RF models, the top two factors, the species Ruminococcus torques and age at diagnosis, contributed to the combined models (Figure 3). Figure 3 also shows that adding the signature of the species Haemophilus parainfluenzae to the models yielded the highest accuracy and increased the diagnostic performance for UGI tract involvement in CD patients.
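The two performance measures quoted above can be computed from any classifier's output scores. The sketch below shows, in plain Python, the rank-based definition of AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case) and accuracy at a fixed threshold. The labels, scores, and the 0.5 threshold are hypothetical assumptions for illustration and do not reproduce the models or data of this study.

```python
def roc_auc(labels, scores):
    """AUC as P(positive score > negative score), counting ties as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at the given score threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


# Hypothetical labels (1 = L4, 0 = nonL4) and classifier scores (not study data).
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(y_true, y_score)
acc = accuracy(y_true, y_score)
print(round(auc, 3), round(acc, 3))
```

AUC is threshold-free, whereas accuracy depends on the chosen cut-off, which is why the two measures can rank models differently.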

FIGURE 3

Metagenomic Functional Analysis

In addition, the functional diversity of the different putative metagenomes was assessed using the PICRUSt2 software. Pathways displaying a significant difference in mean proportions between the L4 and nonL4 groups are shown in Figure 4. The pathways thiazole biosynthesis II (P < 0.001), superpathway of thiamine diphosphate biosynthesis II (P = 0.010), and octane oxidation (P = 0.035) were over-represented, whereas L-methionine biosynthesis III (P = 0.038) and palmitate biosynthesis II (P = 0.050) were under-represented in L4 (Figure 4). For selected pathways, we also examined the extent to which they are linked with the species Ruminococcus torques. As shown in Figure 5, two MetaCyc pathways, L-methionine biosynthesis III and palmitate biosynthesis II, which may play important roles in intestinal integrity and barrier function, showed an association with Ruminococcus torques.

FIGURE 4

FIGURE 5

Discussion

To our knowledge, this is the first study to identify a reliable metagenomic biomarker for UGI tract involvement in CD. The reported frequency of UGI tract involvement in CD varies widely. The main cause of the discrepancies in the reported prevalence of UGI tract lesions is probably the inconsistent use of different diagnostic modalities at CD diagnosis, presumably because mapping of disease extent is unreliable in clinical practice. However, to date, no data have been analyzed regarding a simple biomarker for CD patients with UGI tract involvement.

Our main hypothesis is that the possible differences in taxonomic composition might potentially be used as proxy biomarkers for UGI tract involvement in CD patients, since altered microbial communities have been demonstrated to be an essential factor in driving intestinal inflammation in CD (Tamboli et al., 2004; Sartor, 2008).

We analyzed the differences in the tissue microbial community of CD patients at the species level. In this study, the species Dorea formicigenerans, Ruminococcus torques, Ruminococcus gnavus, Roseburia faecis, Faecalibacterium prausnitzii, Bacteroides caccae, Bacteroides uniformis, and Haemophilus parainfluenzae were identified as predictive biomarkers by the LEfSe or DiTaxa programs. Interestingly, Ruminococcus torques, a butyrate-producing bacterial species, was the only one identified by both algorithms.

We also examined whether the composition of the microbiota, together with clinical predictors, could predict whether a patient would have UGI tract involvement, using two different machine learning algorithms (linear SVM and RF). Modest predictive performance was achieved with a few features (eight taxa, age at diagnosis, and sex), especially with the linear SVM. Notably, the most influential features for predicting disease extent were the level of the species Ruminococcus torques (positive correlation) and age at diagnosis (negative correlation). These consistent, reproducible results for the species Ruminococcus torques raise the possibility of using microbiota analysis as a screening tool to identify CD patients at high risk of UGI tract involvement.

Contrary to the inconsistent results regarding the signature of Ruminococcus torques in fecal samples of CD patients (Joossens et al., 2011; Gevers et al., 2014; Takahashi et al., 2016), most of the studies from tissue samples showed consistently high levels of the species in the mucosa of patients with CD compared to that in healthy subjects (Martinez-Medina et al., 2006; Png et al., 2010). Furthermore, our results show that its abundance remains significantly high in the CD patients with UGI tract involvement.

No role in CD pathophysiology has been suggested so far for the species Ruminococcus torques, belonging to the Clostridium coccoides group/cluster XIVa. They utilize MUC2, the main secreted mucin in the human intestine, as the sole carbon source and have a strong gastrointestinal mucin-degrading ability, providing further evidence of their adaptability in the human gut mucosal environment (Colina et al., 1996; Dethlefsen et al., 2006; Png et al., 2010). Therefore, it has been proposed that excessive mucin degradation by these bacteria may contribute to intestinal disorders, as access of luminal antigens to the intestinal immune system is facilitated (Ganesh et al., 2013).

In addition, we found an inverse relationship between age at diagnosis and UGI involvement in CD. The present study showed that the median age of the patients with UGI involvement was significantly lower than that of those without UGI involvement. This is in accordance with a previous study by Greuter and colleagues, who demonstrated a higher rate of younger patients (≤16 years) suffering from UGI tract involvement compared to those without (9.4% versus 17.8%, P = 0.005) (Greuter et al., 2018). Another study by Lopez-Siles et al. observed that CD patients below 16 years of age had a striking reduction in the population of Akkermansia sp., which possess a lower mucolytic activity, compared to those with disease onset at a later age, which is in agreement with our findings (Lopez-Siles et al., 2018).

Taken together, mucus barrier dysfunction due to the replacement of less mucolytic bacteria, such as Akkermansia species, by more mucolytic ones, such as Ruminococcus torques, at a young age may influence the microbial community on the intestinal mucosa and be instrumental in the development of UGI tract involvement in CD.

To characterize the functional role of the microbiome in phenotype, we annotated the predicted functions using the MetaCyc database. This analysis suggested that the L-methionine biosynthesis III and palmitate biosynthesis II pathways, which are linked with Ruminococcus torques, were decreased, while the three pathways predicted to be increased in the L4 group were not associated with the species. Currently, the roles of methionine metabolism and its metabolites and of the palmitate metabolic pathway in the pathogenesis of CD are poorly understood. In previous animal studies, methionine improved the integrity and barrier function of the small intestinal mucosa as well as villus morphology and development (Chen et al., 2014; Shen et al., 2014). A previous in vivo study by Wei et al. showed that palmitate plays a key role in the preservation of the gut barrier function by regulating the secretion and function of MUC2 (Wei et al., 2012). Another recent study also demonstrated that palmitate enhances MUC2 production in intestinal goblet cells, leading to the establishment of a thick mucus gel and thereby maintaining the integrity of the gut barrier (Benoit et al., 2015). These findings imply that the two protective pathways might be decreased through the communication between intestinal cells and the microbial community; in particular, the species Ruminococcus torques may induce excessive mucus degradation in the small intestine of CD patients with UGI tract involvement.

The main strength of this study is that it evaluated potential metagenomic biomarkers for prediction of UGI tract involvement in CD patients through various analyses. Further, we analyzed CD patients with new-onset disease, before the commencement of treatment. Changes in microbiota community structure important for disease pathogenesis are likely to be more evident in new-onset and treatment-naive patients than in those undergoing treatment. Lastly, the study focused on mucosa-associated microbiota samples, which may be more relevant to disease pathogenesis and diagnosis than fecal samples. However, our study was limited by the small sample size. Another limitation was that the cohort comprised predominantly White patients, and thus the findings may not be generalizable to other racial populations. Finally, although our study focused only on the microbial community, microbial metabolites also have great potential for improving the diagnosis of CD and reflect abnormalities of the host intestinal microbiota. Therefore, new biomarkers for CD patients with UGI involvement could be developed by integrated analysis of metabolomics and metagenomics from a multinational and multicenter cohort.

Conclusion

In conclusion, the species Ruminococcus torques in the tissue microbial community of CD patients might serve as a novel potential biomarker for UGI tract involvement. UGI tract involvement in CD is more frequent in younger patients and should therefore be carefully monitored in this group. The mechanisms of interaction between the host and Ruminococcus torques should be further investigated.

Statements

Data availability statement

All datasets presented in this study are included in the article/Supplementary Material.

Ethics statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the participants’ legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Author contributions

MK designed the study. JC and HS analyzed and interpreted the data, and wrote the manuscript. JJ and JY supervised the project and revised the manuscript. All authors vouch for the data and analysis, have approved the final version, and agreed to publish the manuscript.

Funding

This research was supported by the Basic Science Research Program of the National Research Foundation of Korea (NRF), which is funded by the Korean Ministry of Science, ICT and Future Planning (grant number NRF-2019R1C1C1003524).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2020.01162/full#supplementary-material

FIGURE S1

Alpha diversity estimated by the Chao 1 estimator, PD whole tree, and observed species (A), and beta diversity measured by weighted UniFrac distances in the L4 versus nonL4 groups (B).
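
As an illustration of the Chao 1 estimator named in this caption, the following is a minimal sketch that computes a richness estimate from one sample's OTU counts. It assumes the bias-corrected form of the estimator, S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs; the source does not state which variant was used, and the function name `chao1` and the example counts are hypothetical.

```python
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 richness estimate for one sample.

    counts: iterable of non-negative integer OTU counts.
    Assumption: bias-corrected variant; the article does not specify one.
    """
    observed = [c for c in counts if c > 0]   # OTUs seen at least once
    freq = Counter(observed)                  # how many OTUs have each count
    singletons = freq[1]                      # F1: OTUs seen exactly once
    doubletons = freq[2]                      # F2: OTUs seen exactly twice
    return len(observed) + singletons * (singletons - 1) / (2 * (doubletons + 1))


if __name__ == "__main__":
    # Hypothetical counts: 6 observed OTUs, 2 singletons, 2 doubletons.
    print(chao1([10, 1, 1, 2, 2, 0, 5]))
```

The estimate exceeds the observed species count only when singletons are present, which is why the caption reports Chao 1 alongside observed species rather than in place of it.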

FIGURE S2

Principal coordinates analysis (PCoA) based on (A) unweighted and (B) weighted UniFrac distance; blue for the nonL4 and red for the L4 (ANOSIM: R = −0.010, P = 0.477; R = −0.043, P = 0.898).
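
The ANOSIM statistic quoted in this caption can be sketched as follows. This is a minimal pure-Python illustration, not the QIIME1 implementation used in the study: it ranks all pairwise distances, computes R = (mean between-group rank − mean within-group rank) / (M/2) with M the number of sample pairs, and obtains a P value by permuting group labels. The function names, the toy distance matrix, and the defaults (999 permutations, fixed seed) are assumptions made for illustration. An R value near 0, as reported here, indicates that between-group distances are no larger than within-group distances.

```python
import random
from itertools import combinations


def _average_ranks(values):
    """Return 1-based ranks of values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, labels):
    """ANOSIM R for a symmetric distance matrix and one label per sample.

    Assumes at least one within-group pair and one between-group pair.
    """
    pairs = list(combinations(range(len(labels)), 2))
    ranks = _average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim(dist, labels, permutations=999, seed=0):
    """Return (R, P), with P from a one-sided label-permutation test."""
    rng = random.Random(seed)
    observed = anosim_r(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Hypothetical 4-sample matrix with two clearly separated groups.
    toy = [
        [0.00, 0.10, 0.80, 0.90],
        [0.10, 0.00, 0.85, 0.95],
        [0.80, 0.85, 0.00, 0.20],
        [0.90, 0.95, 0.20, 0.00],
    ]
    print(anosim(toy, ["L4", "L4", "nonL4", "nonL4"]))
```

In the toy example the groups are perfectly separated, so R equals 1; with only four samples the permutation P value cannot become small, which shows why a permutation test needs adequate group sizes.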

FIGURE S3

Heat map of the occurrence of markers identified by DiTaxa analysis in the L4 and nonL4 groups. The rows are sorted by taxonomic marker assignment; the columns represent the samples in each group and are sorted first by phenotype and then by pattern similarity.

Summary

Keywords

microbiome, 16S rRNA, Crohn’s disease, upper gastrointestinal tract, biomarker

Citation

Kwak MS, Cha JM, Shin HP, Jeon JW and Yoon JY (2020) Development of a Novel Metagenomic Biomarker for Prediction of Upper Gastrointestinal Tract Involvement in Patients With Crohn’s Disease. Front. Microbiol. 11:1162. doi: 10.3389/fmicb.2020.01162

Received

19 March 2020

Accepted

06 May 2020

Published

03 June 2020

Volume

11 - 2020

Edited by

Hyundoo Hwang, BBB Inc., South Korea

Reviewed by

Luiz Gustavo Gardinassi, Universidade Federal de Goiás (IPTSP – UFG), Brazil; Guillaume Sarrabayrouse, Vall d’Hebron Research Institute (VHIR), Spain

Copyright

*Correspondence: Min Seob Kwak,

This article was submitted to Microbial Immunology, a section of the journal Frontiers in Microbiology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
