Comparison of Microbiota in Patients Treated by Surgery or Chemotherapy by 16S rRNA Sequencing Reveals Potential Biomarkers for Colorectal Cancer Therapy

Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide and is characterized by difficult early diagnosis, a high mortality rate and short survival. Recent publications have demonstrated the involvement of the commensal gut microbiota in the initiation, progression and chemoresistance of CRC. However, this microbial community has not been explored in CRC patients after anti-cancer treatments. To this end, we performed next-generation sequencing-based metagenomic analysis to determine the composition of the microbiota in CRC patients after anti-cancer treatments. The microbial 16S rRNA genes were analyzed from a total of 69 fecal samples from four clinical groups: healthy individuals, CRC patients before treatment, and CRC patients treated with surgery or chemotherapy. The findings suggested that surgery greatly reduced the bacterial diversity of the microbiota in CRC patients. Moreover, Fusobacterium nucleatum has been shown to confer chemoresistance during CRC therapy, and certain bacterial species or genera, such as the genus Sutterella and the species Veillonella dispar, were specifically associated with CRC patients who were treated with chemotherapeutic cocktails, suggesting their potential relationships with chemoresistance. These candidate bacterial genera or species may have the ability to enhance the dose response to conventional chemotherapeutic cocktails or reduce the side effects of these cocktails. A combination of common CRC risk factors, such as age, gender and BMI, identified in this study improved our understanding of the microbial community and its compositional variation during anti-cancer treatments. However, the underlying mechanisms of these microbial candidates remain to be investigated in animal models. Taken together, the findings of this study indicate that fecal microbiome-based approaches may provide additional methods for monitoring and optimizing anti-cancer treatments.


INTRODUCTION
Colorectal cancer (CRC) is considered the third most frequently diagnosed cancer in the world and leads to high mortality (Jemal, 2011; Ferlay et al., 2013). Generally, over 1 million people are affected due to various intrinsic and environmental factors. For instance, lifestyle and dietary risks, such as heavy alcohol consumption, excessive consumption of red meat and long-term intestinal inflammation, greatly increase the possibility of tumor initiation in the gastrointestinal tract (Larsson et al., 2005; Nugent et al., 2005; Huxley et al., 2009).
Genetic studies indicate that the loss of tumor suppressor genes and activating or inactivating mutations may contribute to the development of CRC (Vogelstein and Kinzler, 2004; Brenner et al., 2014). Recent advancements in CRC research suggest that variations in the composition of microorganisms (i.e., the gut microbiome) in the gastrointestinal tract may play crucial roles in CRC initiation and development (Cheesman et al., 2011). In general, approximately 10^3 commensal bacterial species can be found in the healthy human intestinal tract, where they are essential for human health by promoting food digestion and intestinal epithelial homeostasis under normal conditions (Hooper and Gordon, 2001). The physiological and molecular mechanisms of the gut microbiome and its association with CRC have been extensively studied over the past decade (Dolara et al., 2002; Stappenbeck et al., 2002; Rakoff-Nahoum and Medzhitov, 2007). Initial evidence linking CRC to microorganisms demonstrated that larger tumor sizes are observed in mice kept under specific pathogen-free conditions (Dove et al., 1997). In addition, Streptococcus bovis and Clostridium septicum have been associated with clinical CRC samples via culture-based analysis (Boleij et al., 2009; Seder et al., 2009). However, evidence from case reports is not strong enough to draw conclusions. Hypotheses and research models have been established to indicate that the signature of the gut microbiome may affect tumor growth in the colon via direct or indirect mechanisms (Chen et al., 2012; Kostic et al., 2012). For example, diet has been considered an important environmental factor that promotes adenoma in CRC patients by affecting the GI microbial composition. Different bacterial community structures have been demonstrated to be associated with cancerous and non-cancerous tissue (Chen et al., 2012).
With the development of next-generation sequencing approaches, a number of studies have characterized the compositional changes of the microbiome between healthy individuals and CRC patients at different stages (Larsson et al., 2005; Rakoff-Nahoum and Medzhitov, 2007; Huxley et al., 2009). This microbial dysbiosis has been observed in several sample types, including the colorectal mucosa, tumor tissue and human feces (Sobhani et al., 2011; Geng et al., 2013). Bacterial species have been commonly identified across CRC samples (Wang et al., 2011; Zackular et al., 2014), but their roles in CRC development remain to be elucidated. For example, Bacteroides fragilis and some strains of Escherichia coli are positively correlated with colorectal tumor size by producing toxins that act on human intestinal cells (Costello et al., 2009). By contrast, bacterial strains with anti-tumor potential have been proposed as well. As metabolic products of anti-cancer microbes, short-chain fatty acids (SCFA), such as butyrate, inhibit tumor growth in the colon (Boon et al., 2002). Recent work has shown that the symbiotic microorganisms in the human intestinal tract are able to affect the local tissue metabolic activity and development of host organisms (Lee and Mazmanian, 2010). Furthermore, commensal bacteria have the ability to influence the inflammation process and immune responses of the host organism systemically (Ichinohe et al., 2011; Clemente et al., 2012; McAleer and Kolls, 2012). For instance, commensal microbiota affect virus-specific antibody production during influenza virus infection (Ichinohe et al., 2011). In addition, commensal bacteria are able to promote the recognition of viruses by antigen-presenting cells (McAleer and Kolls, 2012). Thus, CRC initiation and development are regarded as a co-occurrence of the establishment of tumor-promoting bacteria and the elimination of anti-tumor bacteria.
The development of CRC is a stepwise process by which locally attached adenoma tissue initiates in the colon and progresses to invasive and metastatic carcinoma tissues over time (Fearon, 2011). Importantly, the chance of initiating carcinoma development can be largely reduced if early adenomas (stage I and II) are detected and surgically removed in time (Bernard Levin et al., 2008). Over 90% of CRC patients survive if the diagnosis occurs while the disease is still localized. However, the survival percentage decreases dramatically among CRC patients with later stage carcinomas. In addition to surgery, chemotherapy is usually used to inhibit tumor growth and cancer development in advanced CRC patients when metastasis to secondary organs occurs (Yu T. et al., 2017). Generally, agents with broad-spectrum cytotoxicity, such as capecitabine, 5-fluorouracil (5-FU) and oxaliplatin, are commonly used in combination for chemotherapy (Cartwright, 2012; Yu T. et al., 2017). As a first-line treatment of CRC, oxaliplatin is usually given in combination with 5-FU, which is known as the FOLFOX regimen (Wang and Li, 2012). These drugs can either effectively inhibit DNA replication of cancer cells (capecitabine and 5-FU) or arrest the cell cycle (oxaliplatin) (Walko and Lindley, 2005; Kelland, 2007). However, the survival rate of advanced CRC patients receiving the FOLFOX regimen is generally lower than 10% because of the chemoresistance of tumor tissues (Dahan et al., 2009). Studies indicate that novel immune checkpoint therapy has barely any effect on patients with stage III or IV CRC (Zou et al., 2016). Therefore, developing a better understanding of chemoresistance and its underlying mechanisms in CRC is a current priority. Emerging evidence has shown that there are links between the gut microbiota and chemotherapy (Iida et al., 2013).
A previously identified CRC biomarker, Fusobacterium nucleatum, has been characterized by multiple studies (Castellarin et al., 2012; Kostic et al., 2012). This species has been observed at high abundance in adenoma tissues and progressively increases in richness during colorectal carcinogenesis (Castellarin et al., 2012; Kostic et al., 2012). Previous observations suggest that this species has the ability to promote carcinogenesis (Rubinstein et al., 2013), and it has been detected to be co-localized with CRC epithelial cells (Abed et al., 2016). Recently, the mechanism of Fusobacterium nucleatum in CRC chemoresistance has been elucidated (Yu T. et al., 2017), providing a valuable target for the development of new chemotherapeutic methods for CRC treatment.
Although various studies have been performed to profile the microbial composition of the microbiota of CRC patients at different tumor developmental stages to assess its diagnostic potential, no metagenomic studies of the microbiota of CRC patients after surgery and chemotherapy have been reported. In this study, we hypothesized that using alternative biomarkers from the CRC-related microbiota could improve the probability of identifying candidate microbes that are specifically associated with CRC patients treated by surgical debulking or chemotherapy. We performed standard 16S rRNA sequencing and bioinformatic analysis of the stool microbiome of healthy individuals, CRC patients, and CRC patients after either surgical debulking or chemotherapy. High-resolution profiles of the human CRC microbiome, especially those obtained after patients underwent surgery or chemotherapy, were compared for potential biomarker identification. The results indicate that metagenomic analysis of the microbial composition is a valuable tool for biomarker identification and candidate selection for further investigation.

Characteristics of the Sequencing Results
The experimental design and analytical pipeline are shown in Supplementary Figure S1. A total of 69 samples were subjected to 16S rRNA sequencing. These samples were divided into four groups (Table 1) that included 33 healthy individuals (denoted as group C), 17 CRC patients (denoted as group I), 14 chemotherapy-treated CRC patients (denoted as group D) and 5 surgically treated CRC patients (denoted as group IO). An average of 1259 operational taxonomic units (OTUs) were identified across these 69 samples, with a minimum of 80 OTUs and a maximum of 2324 OTUs. Parameters such as the rarefaction (dilution) curve, Shannon curve, and rank abundance of OTUs were evaluated to confirm the reliability of the sequencing data (Supplementary Figure S2).

Differential Microbial Structure Among 69 Tested Samples
Stool samples were analyzed from the above-mentioned four sampling groups (Supplementary Figure S1). The microbial composition of each sample showed no apparent variation according to principal component analysis (PCA) (Figure 1A). The Chao value indicated that three groups (C, D, and I) had higher community richness than group IO (Figure 1B and Supplementary Table S1). Similar patterns were observed regarding the Shannon diversity indicator (Figure 1C), suggesting that surgery greatly affected the structure of the microbiota in CRC patients. Subsequent statistical analysis using PERMANOVA revealed considerable variations in microbial community structures among the four clinical groups (P = 0.001). In addition, large variations in both the number of OTUs and the counts detected among all 69 samples indicated considerable variation among individuals within each group and showed that statistical analysis was required for subsequent group comparisons (Figure 1D). Abundance summaries showed that some OTUs were much higher in abundance than other OTUs among all samples (Figure 1D). Subsequently, both the heatmap and PCoA calculated from Bray-Curtis similarity suggested that considerable variations exist among individual samples, as reflected by beta diversity (Figure 2). The high abundance OTUs provided interesting candidates for further analysis. Moreover, the microbiota structure of each sample was analyzed and presented at the phylum, genus and species levels (Figure 3). PCoA suggested that most of the IO group members were significantly different from those of the other three groups at the phylum level (Figure 3A), implying a unique microbiota composition in this group. Bacteroidetes and Firmicutes were the most abundant phyla in the C, I, and D groups, whereas the phylum Proteobacteria was found at high richness in the IO group (Figure 3D), indicating its specific association with CRC patients after surgery.
By contrast, at both the genus and species levels, no further variations were observed among the four groups via PCoA (Figures 3B,C), but variations of the microbial composition in an individual genus or species were detected among all samples (Figure 3D).

Detection of Microbial Groups Is Linked to Colorectal Cancer
Common microbial species detected in cancer studies (Chen et al., 2012; Feng et al., 2015) were repeatedly found in this batch analysis of CRC patients. For example, the phylum Fusobacteria was enriched in CRC patients (group I) in comparison to that in healthy individuals (group C). Furthermore, the genera Fusobacterium, Oscillospira, and Prevotella were all detected in CRC patients (group I) and CRC patients after chemotherapy (group D). The variations of the microbial composition after surgery and chemotherapy were further compared in subsequent analyses.

Surgery Leads to a Unique Microbial Composition in CRC Patients
To further understand the differential microbiota structures among the four groups, microorganisms with a high relative abundance (>1%) were selected for comparison at the phylum, genus and species levels (Figure 4). Consistently, the majority of high abundance microbes present in groups C, D, and I were not detected in the IO group. In comparison to the other three groups, the phyla Bacteroidetes and Firmicutes were much less abundant in the IO group. Meanwhile, a higher abundance of Proteobacteria was detected in the IO group than in the other groups (Figure 4A). Correspondingly, in the IO group, the genus Bacteroides was found at a lower relative abundance in comparison with that in the other three groups. This phenomenon indicates that surgical operation may greatly reduce microbial diversity and richness in CRC patients. However, the underlying mechanism remains to be elucidated.
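The relative abundance cutoff described above can be illustrated with a minimal Python sketch. It assumes that a taxon is retained when it exceeds 1% relative abundance in at least one group; the source does not state whether the cutoff was applied per group, per sample, or overall. The function names and the toy counts are hypothetical and are not data from this study.

```python
def relative_abundance(counts):
    """Convert raw tag counts per taxon into fractions of the group total."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: n / total for taxon, n in counts.items()}


def high_abundance_taxa(group_counts, threshold=0.01):
    """Return taxa whose relative abundance exceeds the threshold in any group.

    Assumption: "any group" is one possible reading of the >1% criterion.
    """
    keep = set()
    for counts in group_counts.values():
        for taxon, fraction in relative_abundance(counts).items():
            if fraction > threshold:
                keep.add(taxon)
    return sorted(keep)


# Toy counts for two groups (illustrative only).
toy = {
    "C": {"Bacteroidetes": 600, "Firmicutes": 390, "Proteobacteria": 5, "Cyanobacteria": 5},
    "IO": {"Bacteroidetes": 50, "Firmicutes": 45, "Proteobacteria": 900, "Cyanobacteria": 5},
}
selected = high_abundance_taxa(toy)
```

In this toy table, Cyanobacteria stays at 0.5% in both groups and is therefore dropped, while Proteobacteria is kept because it exceeds 1% in one group.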

Specific Groups of Microbes Are Associated With Chemotherapy
Chemotherapy with drug cocktails is a routine method used in patients with advanced CRC. Here, we demonstrated that some microbial groups were tightly associated with CRC patients undergoing chemotherapy that included a combination of oxaliplatin and tegafur (a precursor of 5-FU). At the genus level, Veillonella was uniquely present in group D patients (Figure 4B). In particular, V. dispar was observed only in chemotherapy-treated patients and not in patients in the other three groups (Figure 4C and Supplementary Figure S3). Additionally, although not detected at the genus level, two other species, Prevotella copri and Bacteroides plebeius, were enriched only in patients treated by chemotherapy (Figure 4C and Supplementary Figure S3).

Potential Biomarkers for CRC Patients Treated by Surgery or Chemotherapy
Biomarkers are valuable targets for disease diagnosis and treatment improvement. Potential biomarkers (from phylum to species) for each group were identified using the LDA score (Figures 5, 6). As previously described, Proteobacteria was regarded as a biomarker in CRC patients treated by surgery.
Moreover, Fusobacteria was considered a potential biomarker for CRC patients in this study, whereas Cyanobacteria was associated with CRC patients after chemotherapy (Figure 5).
At the species level, species such as V. dispar, F. prausnitzii, B. plebeius, B. ovatus, and B. uniformis were repeatedly observed to have high LDA scores (Figure 6), suggesting their potential value as novel biomarkers for monitoring CRC patients after surgery or chemotherapy. In addition, network associations of the bacterial structure identified among all the clinical groups were drafted (Figure 7). Several abundant bacterial genera (Figure 4B), such as Prevotella, Veillonella, Megamonas, and Ruminococcus, exhibited multiple positive links to other bacterial genera (P < 0.05, Spearman R > 0.8), suggesting their potential synergistic roles related to CRC.

Comparison of Microbiome Changes Among the Four Tested Groups
Colorectal cancer (CRC) is considered to be a global public health problem since it has a high incidence and mortality and a low diagnostic rate at early stages. Although a number of oncogenes related to CRC have been reported, the influence of additional environmental factors in CRC, such as the microbiome, is unknown. Previous metagenomic and transcriptomic studies have led to the development of novel specific and sensitive biomarkers for the diagnosis of CRC at the earliest stages (Mármol et al., 2017). In this study, considerable microbiome differences among healthy people (group C), CRC patients (group I), and CRC patients after surgery (group IO) or chemotherapy (group D) were observed by using PERMANOVA (P = 0.001), implying their potential roles in response to anticancer treatments. Previous studies have listed a number of microbial species that have been found in CRC patients (Chen et al., 2012; Zackular et al., 2013). Some of these species were present in our dataset as well. For example, Fusobacterium spp. were previously reported to co-exist with tumors and may positively regulate tumor cell propagation (Kostic et al., 2012; Rubinstein et al., 2013). In this study, the genus Fusobacterium was found to be enriched in both CRC patients (group I) and CRC patients after chemotherapy (group D), but depleted in healthy individuals and CRC patients after surgery (group IO, Figure 4B); therefore, the genus Fusobacterium was identified as a putative biomarker for CRC patients (Figure 6). This result suggested that Fusobacterium spp. were associated with CRC but were not affected by conventional chemotherapy. The abundance of Fusobacterium was largely reduced after surgical debulking, suggesting that there were potential differences between these two anti-cancer therapeutic approaches. In addition, Bacteroides fragilis has been considered to be a pathogenic species that is associated with colorectal tumor development (Sears et al., 2008).
Previous reports indicated that carcinoma tissues had elevated levels of B. fragilis in comparison to early stage adenomas (Zackular et al., 2014), suggesting that this species may be associated with tumor progression. However, no differences in B. fragilis were detected among the four groups in our dataset (Figure 4C), and this inconsistency remains to be further investigated. Furthermore, the genus Bacteroides and its representative species B. ovatus have been identified to have reduced abundance with colorectal tumor development (Zackular et al., 2014; Feng et al., 2015). Bacteroides species and their metabolic products, such as bile acid, have been demonstrated to promote liver carcinoma through the mechanism of DNA damage (Hagio, 2011). According to our data, this genus was most enriched in group C. The abundance of Bacteroides was reduced in both group I and group D, and Bacteroides were not found in group IO, indicating that both anticancer methods reduced the abundance of this microbial genus and that surgery is more effective for eliminating the growth of B. ovatus. Although some strains of E. coli have been shown to participate in tumor growth in a previous study (Arthur and Jobin, 2012) and the genus was in low relative abundance in our data set, Enterobacteriaceae are commensal bacteria in humans. Thus, the identification of this family as a potential biomarker for the IO group (Figure 6) may imply that surgery does not eradicate this family of microorganisms. Similar to previous descriptions (Atarashi et al., 2011; Kostic et al., 2012), we identified the genus Clostridium, family Lachnospiraceae and order Clostridiales as potential biomarkers for healthy individuals (Figure 6), indicating their important roles in the healthy human intestine.
Taken together, these results raise the question of whether the findings of this metagenomic study can be validated, and indicate that the roles of these previously identified microorganisms in CRC patients treated by surgery or chemotherapy require further investigation.

Additional Bacterial Genera Associated With Surgery and Chemotherapy May Serve as Candidates to Improve Anti-cancer Treatments
A recent discovery has demonstrated that F. nucleatum can induce chemoresistance of colorectal tumors by modulating autophagy pathways (Yu T. et al., 2017). In general, F. nucleatum promotes CRC chemoresistance by activating autophagosome formation and subsequently inducing autophagy-related proteins, such as pULK1, ULK1, and ATG7, in colorectal tumor cells (Yu T. et al., 2017). This study also recommended that post-detection of F. nucleatum abundance may be an effective approach to predict the occurrence of chemoresistance. In our study, F. nucleatum was not detected at high abundance across the four sampling groups, suggesting that this species may not be the only indicator of chemoresistance-mediated CRC recurrence. Although the microbial diversity was largely reduced after surgery, additional biomarkers associated with the post-surgery state were identified (Figure 6), providing new candidates for mechanistic study and clinical diagnosis. However, the potential biomarkers identified were restricted to taxa above bacterial families and orders; thus, a follow-up shotgun metagenomic approach is required to confirm bacterial identity at the species level. Moreover, severe side effects are associated with current chemotherapy protocols (Vanesa et al., 2015). Increasing evidence implies that the gut microbiota provides valuable targets for the alleviation of chemotherapeutic side effects (Mármol et al., 2017). In this study, an in-depth 16S rRNA analysis of CRC patients before or after chemotherapy was conducted. The putative biomarkers identified in this study, such as V. dispar and the genus Sutterella (Figure 6), may provide additional targets to enhance the dose response to conventional chemotherapeutic cocktails and simultaneously reduce side effects. In particular, the genus Sutterella was found to be enriched in pancreatic cancer patients in comparison to a control group (Half et al., 2015). Although V. dispar has been recorded as an infectious bacterial species in humans (Bhatti and Frank, 2000), no direct link between this species and cancer has been reported. Thus, further investigations are necessary to characterize the correlations between these two potential biomarkers and CRC patients.

Perspectives for the Improvement of Anti-CRC Treatments by Studying Commensal Microorganisms
Large variations of microbial structures between different populations and areas have been mentioned in previous studies. Thus, a larger scale analysis is required to obtain universal indicators for anti-cancer treatments, such as chemotherapy and surgery. In addition, patients can be grouped according to various physiological parameters, such as sex, age or BMI, to study the relationships between these factors and commensal microbiota. Moreover, microbiota differentiation can be observed in each individual CRC patient. Therefore, personalized diagnosis and suitable therapeutic methods will improve the quality of life and survival rate of CRC patients in the future. Notably, comprehensive analyses have become possible with the development of modern omics technologies. For example, metagenomic sequencing and microarray-based approaches may enhance the understanding of the functional relationships among commensal bacteria and CRC treatments (Hua et al., 2015). Meta-transcriptomic and proteomic studies that correlate the steady-state transcript and protein levels of the gut microbiota are considered to be powerful -omics tools for molecular profiling. Additionally, expertise in bacterial strain isolation, culturing and molecular analysis is crucial for determining the anti-cancer properties of certain bacterial strains.

CONCLUSION
The results of this study suggest that a general effect of the presence or absence of certain bacterial groups likely plays a pivotal role in anti-cancer treatments in CRC patients. Observations of species-specific variations in the microbiota from CRC patients may lead to the development and optimization of existing surgical or chemotherapeutic protocols. Therefore, unraveling the underlying mechanisms of this phenomenon is a priority for addressing how certain microbial strains can be affected after anti-cancer therapeutic treatments to cure CRC in the near future.

Group Information of CRC Patients
In total, stool samples from 69 individuals divided into four groups were collected from the Peking University Shenzhen Hospital, China. In detail, 33 samples were from healthy individuals, denoted as group C; 17 samples were from CRC patients before treatment, denoted as group I; 5 samples were from CRC patients who were surgically treated, denoted as group IO; and 14 samples were from CRC patients who were chemotherapeutically treated, denoted as group D. The basic information of these subjects is listed in Table 1. The average age and BMI of patients of each clinical group were approximately 50 and 22, respectively. This study was approved by the Ethics Committee of the Peking University Shenzhen Hospital, China. Informed consent was obtained from all of the subjects in this study. For group D patients, a chemotherapeutic cocktail containing tegafur (precursor of 5-fluorouracil, 40 mg/m² bid po d1-d14) and oxaliplatin (130 mg/m² ivgtt 2h d1) was applied for 21 days/cycle, with 6-8 cycles per treatment.

Sampling Condition and Genomic DNA Extraction
Stool samples were taken and kept on dry ice before DNA extraction. DNA was extracted using the GenElute™ Stool DNA Isolation Kit (Sigma-Aldrich, St. Louis, MO, United States) according to the manufacturer's protocol. In general, approximately 200 mg of each stool sample was extracted by adding lysis buffer A and additive A in bead tubes. After vortexing for approximately 3-5 min, the lysate was transferred to a new 1.5 ml tube supplemented with binding buffer I and kept on ice for 10 min. After a short spin, the resulting lysate was transferred into a new 1.5 ml tube with the addition of absolute ethanol for precipitation. Precipitated DNA was then loaded onto a binding column, and the flow-through was discarded. Wash solution A was used to wash bound DNA twice. Distilled water was used to elute DNA. Subsequently, the DNA integrity, concentration and purity were monitored on 1% agarose gels (Supplementary Figure S3).
The length and concentration of the PCR products were detected by 1% agarose gel electrophoresis. Samples with a bright main band (e.g., 16S V4/V5: 400-450 bp) were used for further experiments. PCR products were mixed at equi-density ratios according to the GeneTool Analysis Software (Version 4.03.05.0, SynGene). Then, the mixed PCR products were purified with the EZNA Gel Extraction Kit (Omega, United States).

Library Preparation and NGS Sequencing
Sequencing libraries were generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina® sequencing (New England Biolabs, United States) following the manufacturer's recommendations, and index codes were added. The library quality was assessed on a Qubit 2.0 Fluorometer (Thermo Scientific) and an Agilent Bioanalyzer 2100 system. Finally, the library was sequenced on an Illumina HiSeq 2500 platform, and 250 bp paired-end reads were generated.

Quality Control Steps of the Raw NGS Sequencing Data
Quality filtering of the paired-end raw reads was performed under specific filtering conditions to obtain high-quality clean reads according to the Trimmomatic (V0.33; http://www.usadellab.org/cms/?page=trimmomatic) quality-control process. At the same time, sequences were assigned to each sample based on their unique barcode and primer, after which the barcodes and primers were removed and paired-end clean reads were recovered.
Paired-end clean reads were merged using FLASH (V1.2.11) (Magoč and Salzberg, 2011) based on the overlap between the paired-end reads: reads were merged when at least 10 bases of one read overlapped the read generated from the opposite end of the same DNA fragment, with a maximum allowable error ratio of 0.2 in the overlap region. The spliced sequences were called raw tags.
Quality filtering on the spliced sequences was performed using Trimmomatic software (Bolger et al., 2014) to obtain effective clean tags.
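The overlap criterion used for merging paired-end reads can be sketched in a few lines of Python. This is only an illustration of the two stated parameters (a minimum overlap of 10 and a maximum error ratio of 0.2); it is not FLASH's actual algorithm, which also uses base quality scores. The function names are hypothetical, and choosing the overlap with the lowest mismatch ratio (ties broken by the longer overlap) is an assumption made here for simplicity.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_reads(forward, reverse, min_overlap=10, max_mismatch_ratio=0.2):
    """Merge a read pair into one tag, or return None if no overlap qualifies.

    The reverse read is reverse-complemented first, then every overlap length
    from min_overlap up to the shorter read length is scored by mismatch ratio.
    """
    rev_rc = reverse_complement(reverse)
    best = None  # (mismatch_ratio, -overlap_length); smaller is better
    for overlap in range(min_overlap, min(len(forward), len(rev_rc)) + 1):
        tail, head = forward[-overlap:], rev_rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        ratio = mismatches / overlap
        if ratio <= max_mismatch_ratio:
            key = (ratio, -overlap)
            if best is None or key < best:
                best = key
    if best is None:
        return None
    overlap = -best[1]
    # Bases in the overlap region are taken from the forward read.
    return forward + rev_rc[overlap:]


# Toy fragment sequenced from both ends with 25-base reads (12-base overlap).
fragment = "ACGTTGCAAGGCTTAACGGATCCGATTACAGGCATCGA"
fwd_read = fragment[:25]
rev_read = reverse_complement(fragment[-25:])
merged = merge_reads(fwd_read, rev_read)
```

For this toy pair the only overlap without mismatches is the true 12-base overlap, so the merged tag reproduces the original fragment.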

Identification and Quantification of Operational Taxonomic Units
Operational taxonomic unit (OTU) clustering and species annotation were performed using USEARCH software (V8.0.1517) (Edgar, 2010). Sequences with ≥97% similarity were assigned to the same OTU. Representative sequences for each OTU were screened for further annotation. Singleton OTUs are generally believed to arise from sequencing errors or chimeras generated during PCR. Therefore, singleton OTUs were removed using USEARCH after OTU clustering, and chimeric sequences were then detected and removed using the UCHIME de novo algorithm (Edgar et al., 2011).
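As a rough illustration of the 97% similarity threshold and singleton removal described above, the following Python sketch performs a greedy, abundance-ordered centroid clustering. It is not the USEARCH implementation: identity is computed position by position on equal-length sequences, which is a strong simplification of real pairwise alignment, and all function names and toy sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions; assumes equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otu_clustering(seq_counts, threshold=0.97):
    """Cluster unique sequences into OTUs, most abundant sequences first.

    Each sequence joins the first existing centroid it matches at or above the
    threshold; otherwise it becomes the centroid of a new OTU.
    Returns a list of [centroid_sequence, total_count].
    """
    otus = []
    ordered = sorted(seq_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for seq, count in ordered:
        for otu in otus:
            if identity(seq, otu[0]) >= threshold:
                otu[1] += count
                break
        else:
            otus.append([seq, count])
    return otus


def drop_singletons(otus):
    """Remove OTUs supported by a single sequence."""
    return [otu for otu in otus if otu[1] > 1]


# Toy data: 100-base sequences with 0, 2 and 4 mismatches to the first one.
base = "ACGT" * 25
near = "TT" + base[2:]      # 98% identical to base
far = "GGGGG" + base[5:]    # 96% identical to base
otus = greedy_otu_clustering({base: 10, near: 3, far: 1})
kept = drop_singletons(otus)
```

Here the 98% variant is absorbed into the abundant OTU, while the 96% variant forms its own OTU with a single sequence and is then discarded as a singleton.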
For each representative sequence, the Greengenes database (version gg_13_8) was used based on the RDP classifier algorithm (ucluster approach with default settings) and the assign_taxonomy.py script in QIIME (version 1.9.0) (Caporaso et al., 2010) to annotate the taxonomic information (confidence threshold: default, 0.5 or above). Contaminant OTUs and their tags, namely those annotated as chloroplasts or mitochondria (16S amplicons) or those that could not be annotated at the kingdom level, were removed. Then, the number of effective tags (No. of seqs) and the OTU taxonomic synthesis information were entered into a table (otu_table) for the final analysis.
To study the phylogenetic relationship of different OTUs, KRONA software (Phillippy et al., 2011) was used to visualize the results of individual sample annotations. To quickly and intuitively study the species composition and abundance information in a sample, the GraPhlAn software (Asnicar et al., 2015) was used to obtain a single sample OTU annotation circle graph.
To study the difference in dominant species in different samples (groups), the representative sequences of the 50 most abundant OTUs that were annotated to the genus level were selected, multiple sequence alignments were conducted using FastTree software, and the relative abundance of each OTU and the species annotation information of the representative sequence were combined with the ggtree software package for visual display.
Operational taxonomic unit abundance information was normalized using a standard of the sequence number corresponding to the sample with the fewest sequences. Subsequent analyses of the alpha diversity and beta diversity were performed based on this normalized output data.
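One common way to normalize every sample to the depth of the smallest sample is random subsampling without replacement (rarefaction). The Python sketch below assumes that this is what the normalization above refers to; the source does not specify the exact procedure, and the function names, seed and toy table are hypothetical.

```python
import random


def rarefy(counts, depth, rng):
    """Randomly subsample `depth` tags without replacement from one sample."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    picked = rng.sample(pool, depth)
    rarefied = {otu: 0 for otu in counts}
    for otu in picked:
        rarefied[otu] += 1
    return rarefied


def normalize_to_min_depth(table, seed=0):
    """Rarefy every sample to the tag count of the shallowest sample.

    `table` maps sample name -> {OTU id -> tag count}.
    """
    rng = random.Random(seed)
    depth = min(sum(counts.values()) for counts in table.values())
    return {sample: rarefy(counts, depth, rng) for sample, counts in table.items()}


# Toy OTU table (illustrative only): s1 has 100 tags, s2 has 40.
toy_table = {
    "s1": {"otu1": 50, "otu2": 30, "otu3": 20},
    "s2": {"otu1": 10, "otu2": 0, "otu3": 30},
}
normalized = normalize_to_min_depth(toy_table, seed=1)
```

After normalization both toy samples contain 40 tags; the shallowest sample is returned unchanged because all of its tags are drawn.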

Analysis of Alpha and Beta Diversity
Alpha diversity was applied to analyze the complexity of species diversity within a sample using several indices, including Chao1, Shannon and Simpson. All of the indices in our samples were calculated with QIIME (V1.9.1) and displayed using R software (V2.15.3). Two indices were selected to identify community richness: observed species and Chao1. Three indices were used to identify community diversity: Shannon, Simpson, and Dominance.
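For readers unfamiliar with these indices, a minimal Python sketch of three of them follows. It assumes the bias-corrected form of Chao1, a base-2 logarithm for Shannon, and Simpson defined as one minus the dominance; the estimators and logarithm base actually used by the QIIME version in this study may differ. The inputs are assumed to be non-empty lists of OTU counts for one sample.

```python
import math


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts, base=2):
    """Shannon diversity: -sum(p * log(p)) over OTUs with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def simpson(counts):
    """Simpson diversity: 1 - sum(p^2); the sum itself is the dominance."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Toy samples (illustrative only).
even_sample = [10, 10, 10, 10]
rare_sample = [5, 3, 1, 1, 2, 2]
```

For the perfectly even toy sample, Shannon equals log2(4) = 2 and Simpson equals 0.75. For the second sample, two singletons and two doubletons raise the Chao1 estimate from 6 observed OTUs to about 6.33.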
Beta diversity analysis was used to evaluate differences in species complexity among the samples, and both weighted and unweighted UniFrac distances were calculated using QIIME software. Principal coordinate analysis (PCoA) was performed to obtain the principal coordinates and visualize complex, multidimensional data. The weighted or unweighted UniFrac distance matrix among samples obtained previously was transformed to a new set of orthogonal axes, in which the first principal coordinate captures the maximum variation, the second principal coordinate captures the second largest variation, and so on. PCoA was performed by QIIME2 and the ggplot2 package in R software. Sample cluster analysis was performed using the UPGMA (Unweighted Pair-group Method with Arithmetic Means) method to interpret the distance matrix using average linkage and was conducted with the upgma_cluster.py script (http://qiime.org/scripts/upgma_cluster.html) in QIIME software. NMDS analysis was performed with the vegan package of R software based on the normalized OTU abundance table. Network analysis was performed using bacterial genera with a relative abundance higher than 0.5%. The network map was drawn by selecting Spearman correlations between each pair of bacterial groups (P < 0.05, R > 0.8).
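The edge selection behind the network map can be illustrated with a small Python sketch that computes Spearman correlations between genera and keeps pairs with R > 0.8. It covers only the correlation threshold: the P < 0.05 filter and the 0.5% abundance cutoff are omitted here, and the function names and toy abundance vectors are hypothetical rather than taken from the study.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def network_edges(abundance, r_min=0.8):
    """Return (genus_a, genus_b, R) for every pair with Spearman R > r_min."""
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        r = spearman(abundance[a], abundance[b])
        if r > r_min:
            edges.append((a, b, r))
    return edges


# Toy relative abundances across five samples (illustrative only).
toy_abundance = {
    "Prevotella": [1, 2, 3, 4, 5],
    "Veillonella": [2, 4, 5, 9, 20],
    "Bacteroides": [5, 4, 3, 2, 1],
}
edges = network_edges(toy_abundance)
```

In this toy example only the Prevotella and Veillonella vectors rise together across samples, so a single positive edge is retained.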

Statistical Analysis
PERMANOVA was performed among the four groups to determine variations between the microbial community structures. The four groups showed significant differences in microbial structures (P = 0.001).
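The logic of a PERMANOVA test can be sketched in plain Python as follows. The sketch assumes Bray-Curtis dissimilarity and 999 label permutations; the source does not state which distance or how many permutations were used, and the function names and toy profiles are hypothetical. With 999 permutations, the smallest P value attainable under the formula below is 0.001.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def pseudo_f(dist, labels):
    """Pseudo-F statistic from a distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for group in groups:
        members = [i for i, label in enumerate(labels) if label == group]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(samples, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation P value)."""
    rng = random.Random(seed)
    n = len(samples)
    dist = [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    f_obs = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Toy profiles for two clearly separated groups (illustrative only).
toy_samples = [
    [50, 30, 5], [48, 33, 4], [52, 28, 6], [47, 31, 7], [51, 29, 3],
    [5, 20, 60], [6, 22, 58], [4, 18, 63], [7, 21, 57], [5, 19, 61],
]
toy_labels = ["A"] * 5 + ["B"] * 5
f_value, p_value = permanova(toy_samples, toy_labels)
```

Because the two toy groups are well separated, almost no random relabeling reaches the observed pseudo-F, and the resulting P value is small.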

DATA SUBMISSION
The raw data used in this study were uploaded to Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) under Bioproject PRJNA464414.

AUTHOR CONTRIBUTIONS
GLY designed the experiments. ZL and GLI performed the experiments. BL and XJ analyzed the data. XD wrote the manuscript. GLY critically commented and revised it.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.01607/full#supplementary-material
FIGURE S1 | Schematic view of the experimental design and analytical pipeline of this study. CRC, colorectal cancer; OTUs, operational taxonomic units.