Molecular Characterization and Meta-Analysis of Gut Microbial Communities Illustrate Enrichment of Prevotella and Megasphaera in Indian Subjects

The gut microbiome has varied impact on the wellbeing of humans. It is influenced by different factors such as age, dietary habits, socio-economic status, geographic location, and genetic makeup of individuals. For devising microbiome-based therapies, it is crucial to identify population specific features of the gut microbiome. The Indian population is one of the most ethnically, culturally, and geographically diverse, but its gut microbiome features remain largely unknown. The present study describes the gut microbial communities of healthy Indian subjects and compares them with the microbiota of other populations. Based on large differences in alpha diversity indices, abundance of 11 bacterial phyla, and individual specific OTUs, we report inter-individual variations in gut microbial communities of these subjects. While the gut microbiome of Indians is different from that of Americans, it shares high similarity with that of individuals from the Indian subcontinent, i.e., Bangladeshis. A distinctive feature of Indian gut microbiota is the predominance of the genera Prevotella and Megasphaera. Further, when compared with non-human primates, it appears that Indians share more OTUs with omnivorous mammals. Our metagenomic imputation indicates higher potential for glycan biosynthesis and xenobiotic metabolism in these subjects. Our study indicates an urgent need for identification of population specific microbiome biomarkers of Indian subpopulations to obtain a more holistic view of the Indian gut microbiome and its health implications.


INTRODUCTION
The gut microbial ecosystem is known to be governed by ecological and evolutionary forces and is often controlled by secretions from the host at the intestinal epithelium-microbiota interface such that beneficial microbes are maintained (Schluter and Foster, 2012). The physiological diversity of gut microbiota and its role in human health have inspired large-scale projects such as the Human Microbiome Project (HMP; Turnbaugh et al., 2007) and the Metagenomics of the Human Intestinal Tract (MetaHIT) project (Qin et al., 2010). These projects and other related studies have generated a wealth of information suggesting a link between gut microbiota and their genomic capabilities in maintenance of general wellbeing (Cho and Blaser, 2012) and also in highly specialized functions such as development of the immune system (Chung et al., 2012), neurodevelopmental disorders (Hsiao et al., 2013), and xenobiotic metabolism (Maurice et al., 2013).
Studies in the past few years have highlighted discernible patterns of gut microbiota and microbiome in geographically separated populations (Mueller et al., 2006; De Filippo et al., 2010; Nam et al., 2011; Yatsunenko et al., 2012). Such studies are important in light of the possible role of gut microbiota in modulating the efficacy of oral vaccines (Valdez et al., 2014). In addition, the action of pre- and probiotics varies with the type of prebiotic, the probiotic strain used, and possibly the host gut environment (Boyle et al., 2006). Population specific microbiota studies such as American Gut, Canadian Microbiome, Brazilian Microbiome project and others are likely to yield valuable information about the gut microbiota as a target for medical interventions, possibly in the form of fecal microbial transplantation to restore the healthy state (Borody et al., 2014).
The Indian population is a unique conglomeration of genetically diverse groups having varied dietary habits and residing in vast geographic locations (Basu et al., 2003; Xing et al., 2010). In addition to these genetic differences, Indians have distinctive metabolic (Shukla et al., 2002) and anthropometric features (Yajnik et al., 2003; Prasad et al., 2011). Moreover, Indians are also confronted with the double burden of under- and over-nutrition, primarily due to income inequalities (Subramanian et al., 2007). In this study, we provide a detailed account of prominent attributes of the Indian gut microbial composition and its functions from 34 healthy Indian subjects. We carried out 16S rRNA gene amplicon sequencing using two sequencing platforms, viz. Ion Torrent PGM and Illumina HiSeq. We then combined the 16S rRNA amplicon data of Indian subjects with data from American (Muegge et al., 2011), Korean (Nam et al., 2011), Spanish (Peris-Bondia et al., 2011), and Bangladeshi (Lin et al., 2013) subjects to compare the Indian gut microbiota with that of these populations. In addition, considering the response of gut microbiota to different types of diets, we compared Indian gut microbiota with non-human primates including hind-gut fermenters, fore-gut fermenters, herbivorous, and carnivorous organisms (Ley et al., 2008).

Study Population, Sample Collection, and DNA Extraction
We included 34 healthy Indian subjects from two urban cities, Delhi and Pune (one from the Northern and one from the Western part of India), and from nearby rural regions of these cities. These cities are characterized by diverse groups of individuals from different parts of the country. The Institutional Ethical Committee of the National Centre for Cell Science approved the study, and informed consent was obtained from all the participants. Although this was not a clinical trial, we followed all good clinical practices as per Indian Council of Medical Research guidelines while recruiting the subjects and throughout the study. Fecal samples were collected from all of the subjects and stored at −80 °C until DNA extraction. Total community DNA was extracted from each fecal sample using the QIAamp DNA Stool Mini Kit (Qiagen, Madison USA) as per the manufacturer's instructions.

16S rRNA Gene Amplicon Sequencing
16S rRNA amplicon sequencing of samples from the Western region was performed using Ion Torrent PGM and that of samples from the Northern region using Illumina HiSeq2000 sequencing technology. For Ion Torrent PGM sequencing, samples were processed as follows: PCR was set up in a 50 µl reaction using AmpliTaq Gold PCR Master Mix (Life Technologies, USA) with 16S rRNA V3 region specific bacterial universal primers: forward primer 341F (5′-CCTACGGGAGGCAGCAG-3′) and reverse primer 518R (5′-ATTACCGCGGCTGCTGG-3′; Bartram et al., 2011). The following conditions were used for PCR: initial denaturation at 95 °C for 4 min, followed by 20 cycles of 95 °C for 1 min, 56 °C for 30 s, and 72 °C for 30 s, with a final extension at 72 °C for 10 min. PCR products were purified using Agencourt AMPure XP DNA purification beads (Beckman Coulter, USA), end repaired, and ligated with specific barcode adaptors as explained in the Ion Xpress™ Plus gDNA Fragment Library Preparation user guide. Fragment size distribution and molar concentrations of amplicons were assessed on a Bioanalyzer 2100 (Agilent Technologies, USA) using the High Sensitivity DNA Analysis Kit as per the manufacturer's instructions. Emulsion PCR was carried out on diluted and pooled amplicons (10 samples in each pool) using the Ion OneTouch™ 200 Template Kit v2 DL (Life Technologies). Sequencing of the amplicon libraries was carried out on 316 chips using the Ion Torrent PGM system and the Ion Sequencing 200 kit (Life Technologies). For Illumina sequencing, samples were processed as follows: a PCR reaction of 50 µl was set up using AmpliTaq Gold high fidelity polymerase (Life Technologies, USA), and PCR conditions were as follows: initial denaturation at 95 °C for 10 min, followed by 30 cycles of 95 °C for 30 s, 56 °C for 30 s, and 72 °C for 30 s. The final extension was set at 72 °C for 7 min. The PCR products were purified using gel elution and the eluted products were used for library preparation.
The libraries were quantified on Bioanalyzer using the DNA high sensitivity LabChip kit (Agilent Technologies, USA) and sequenced using Illumina HiSeq2000 (2x150 PE).

Sequence Processing and Bioinformatics Analysis
All PGM and Illumina HiSeq reads were pre-processed using the Mothur pipeline (Schloss et al., 2009) with the following conditions: read length of minimum 150 bp to maximum 200 bp, maximum homopolymer length of 5, no ambiguous bases, and a minimum average quality score of 20. In this way we derived a total of ∼17 million high quality amplicon reads from 34 samples, which we pooled into a single FASTA file for further analysis in QIIME: Quantitative Insights Into Microbial Ecology (Caporaso et al., 2010). A closed reference based OTU picking approach was used to cluster reads into Operational Taxonomic Units (OTUs) at 97% sequence similarity using the UCLUST algorithm (Edgar, 2010), and a representative sequence from each OTU was selected for downstream analysis. All OTUs were assigned to the lowest possible taxonomic rank using RDP Classifier 2.2 (Wang et al., 2007) and Greengenes database 13.8 with a confidence score of at least 80%. Core OTUs were estimated as described previously (Huse et al., 2012). Various estimates of alpha diversity such as Chao1, PD whole tree, Simpson, and Shannon were applied on rarefied sequence counts (1181 sequences per sample), and UniFrac was used as the beta diversity measure to understand the microbial communities in Indian individuals. UniFrac analysis is known to be affected by sequencing depth and evenness; therefore, we performed jackknifing, in which samples are subjected to even subsampling for n replicates and a UniFrac distance matrix is calculated for each replicate (Lozupone and Knight, 2005). In this way we generated 1000 replicates of PCoA coordinates, and Procrustes analysis was applied to each PCoA replicate to plot the average position of individuals on the PCoA plot. The interquartile range of the distribution of points among the replicates was represented as an ellipse around the point (Lozupone et al., 2011).
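The rarefaction and non-phylogenetic alpha diversity estimates described above can be illustrated with a minimal Python sketch. It is not the QIIME implementation used in this study; the function names and the toy OTU counts are hypothetical, the Shannon index is assumed to use log base 2, and Chao1 is assumed to be the bias-corrected form. PD whole tree is omitted because it requires a phylogenetic tree.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` reads, without replacement, from one sample."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    return Counter(random.Random(seed).sample(pool, depth))


def shannon(counts):
    """Shannon index H = -sum(p_i * log2(p_i)); base 2 is an assumption."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in counts.values() if n > 0)


def simpson(counts):
    """Simpson index expressed as 1 - sum(p_i^2)."""
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for n in counts.values() if n > 0)
    singletons = sum(1 for n in counts.values() if n == 1)
    doubletons = sum(1 for n in counts.values() if n == 2)
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))


# Hypothetical toy sample; the study rarefied real samples to 1181 reads.
toy_sample = {"otu_a": 4, "otu_b": 2, "otu_c": 1, "otu_d": 1}
subsample = rarefy(toy_sample, depth=6)
print(shannon(toy_sample), simpson(toy_sample), chao1(toy_sample))
```

Rarefying every sample to the same depth before computing these indices keeps richness estimates comparable between samples sequenced to different depths.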

qPCR Based Quantification of Dominant OTUs
The abundance of intestinal bacterial groups belonging to the genera Prevotella, Faecalibacterium, and Megasphaera was measured by absolute quantification of 16S rRNA gene copy number using the primers listed in Supplementary Table 1. Template concentration for each sample was initially adjusted to 50 ng/µl. qPCR amplification and detection were performed in 10 µl reactions (consisting of 5 µl Power SYBR Green PCR Master Mix, 0.1 µM of each specific primer, and 1 µl template) in triplicate using a 7300 Real time PCR system (Applied Biosystems Inc., USA). The following conditions were used for qPCR assays: one cycle of 95 °C for 10 min followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min. Group specific standard curves were generated from 10-fold serial dilutions of a known concentration of PCR products for each group. Average values of the triplicates were used for enumeration of 16S rRNA gene copy numbers for each group using the standard curves generated (Marathe et al., 2012). Percent abundance of each genus was obtained by calculating the ratio of the copy number of that genus to that of total bacteria. Throughout the qPCR experiments, efficiency was maintained above 90% with a correlation coefficient >0.99.
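The standard-curve arithmetic behind this kind of absolute quantification can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the function names and all Ct values are hypothetical, the curve is assumed to be an ordinary least-squares fit of Ct against log10 copy number, and amplification efficiency is assumed to follow the common relation E = 10^(-1/slope) - 1.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept.

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy * sxy) / (sxx * syy)


def amplification_efficiency(slope):
    """Efficiency as a fraction (1.0 corresponds to perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number for a mean Ct."""
    return 10 ** ((ct - intercept) / slope)


def percent_abundance(genus_copies, total_bacteria_copies):
    """Copy number of one genus as a percentage of total bacterial copies."""
    return 100.0 * genus_copies / total_bacteria_copies


# Hypothetical 10-fold dilution series and made-up mean Ct values.
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.0, 26.6, 23.3, 19.9, 16.6]
slope, intercept, r_squared = fit_standard_curve(standard_copies, standard_ct)
genus = copies_from_ct(24.0, slope, intercept)   # mean of a triplicate
total = copies_from_ct(19.5, slope, intercept)   # total-bacteria assay
print(amplification_efficiency(slope), r_squared, percent_abundance(genus, total))
```

In practice each primer set has its own standard curve, so the genus and total-bacteria copy numbers would be read from separate fits; a single curve is reused here only to keep the example short.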

Imputation of Metagenome Using PICRUSt
The metagenome imputation was done using method as described earlier (Langille et al., 2013). Briefly, closed reference based OTU picking approach was utilized to bin the amplicon sequences using latest Greengenes database 13.5 at 97% sequence similarity cut-off. The normalization for 16S rRNA gene copy number was carried out before prediction of the metagenome. This OTU table was used for predicting metagenome at three different KEGG levels (L1 to L3). Metagenomic differences between Indians-Americans as well as Indian-non-human primates were analyzed using linear discriminant analysis (LDA) effect size (LEfSe; Segata et al., 2011). PICRUSt and LEfSe analysis were performed with available parameters at http://huttenhower.sph.harvard.edu/galaxy/.
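Conceptually, the two steps named above (normalization by 16S rRNA gene copy number, then metagenome prediction) amount to a division followed by a weighted sum. The sketch below only illustrates that arithmetic; it is not PICRUSt code, and the function names, OTU identifiers, copy numbers, and gene-family labels are all hypothetical toy values.

```python
def normalize_by_copy_number(otu_counts, copy_numbers):
    """Divide each OTU count by that OTU's predicted 16S rRNA gene copy number."""
    return {otu: n / copy_numbers[otu] for otu, n in otu_counts.items()}


def predict_gene_family_abundance(normalized_counts, gene_content):
    """Sum (normalized OTU abundance x predicted gene copies) per gene family."""
    predicted = {}
    for otu, abundance in normalized_counts.items():
        for gene_family, copies in gene_content[otu].items():
            predicted[gene_family] = predicted.get(gene_family, 0.0) + abundance * copies
    return predicted


# Hypothetical toy inputs for a single sample.
otu_counts = {"otu_1": 10, "otu_2": 6}
copy_numbers = {"otu_1": 5, "otu_2": 2}
gene_content = {
    "otu_1": {"gene_family_A": 1, "gene_family_B": 2},
    "otu_2": {"gene_family_B": 1},
}

normalized = normalize_by_copy_number(otu_counts, copy_numbers)
print(predict_gene_family_abundance(normalized, gene_content))
```

Without the normalization step, taxa carrying many 16S rRNA gene copies would be over-weighted in the predicted gene-family totals.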

Publicly Available Data Used
We did a PubMed search restricted to publicly available 16S rRNA amplicon data. Upon further narrowing down our search, we obtained raw sequence data of Korean subjects (DDBJ project ID 60507; Nam et al., 2011), Bangladeshi subjects (SRA-SRA057705; Lin et al., 2013), data of 18 American individuals and 33 non-human primates (MG-RAST qiime625 and qiime626; Muegge et al., 2011), and data of Spanish individuals (SRA-SRP005393; Peris-Bondia et al., 2011). The primers, variable region of the 16S rRNA gene, and sequencing technology for each study are listed in Supplementary Table 2. No previously reported sequence data for the Indian population were available. To avoid biases introduced by the respective studies describing the microbiota of these populations and by inter-individual variations, sequence data of all individuals from a study were merged and considered as a representative microbiota of that country. The raw data from all these samples were processed along with the Indian sequence data (Ion Torrent and Illumina amplicons) in the same way as explained earlier.

Additional Statistical Tests
We applied Good's coverage to assess whether the sequencing performed was sufficient to cover the microbial diversity in the samples studied (Good, 1953). We also applied Welch's t-test with Benjamini-Hochberg FDR correction to examine the significantly differing bacterial families between Indians and Americans, and the Kruskal-Wallis test (a non-parametric analogue of one-way analysis of variance) to examine the population specific OTUs. Similar comparisons were made to evaluate the differential OTUs among non-human primates and Indians. Random Forest, a supervised machine-learning approach, was applied to our data sets to identify taxa that were indicators of community differences in Indians-Americans as well as Indian-non-human primates (Yatsunenko et al., 2012). Each OTU was given an importance score by estimating the amount of error introduced if that OTU is removed from the set of indicator taxa.
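For readers unfamiliar with the Benjamini-Hochberg procedure mentioned above, the following minimal Python sketch shows how FDR-adjusted p-values can be derived from a list of raw p-values. The function name and the example p-values are hypothetical and are not taken from the study's results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and
    monotonicity is enforced by taking a running minimum from the largest
    rank downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values, e.g., one per bacterial family tested.
raw_p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw_p))
```

A family would then be called significantly different if its adjusted p-value falls below the chosen FDR threshold.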

Key Features of Indian Gut Microbiota
We obtained over 17 million good quality reads, which were clustered into 3782 OTUs from the 34 healthy Indian individuals; for further analysis, the sequences were normalized to 1181 per sample (Supplementary Table 3). We first employed Good's coverage to estimate whether enough sequencing had been performed to address the gut microbial diversity; with a mean Good's coverage of 94% ±0.03, we were confident that the dominant OTUs were captured in all study subjects and that their gut microbial features could be described.
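Good's coverage is commonly computed as one minus the fraction of reads that belong to singleton OTUs. The short sketch below illustrates that formula on a hypothetical sample; the function name and the counts are illustrative only.

```python
def goods_coverage(otu_counts):
    """Good's coverage: 1 - (number of singleton OTUs / total number of reads)."""
    total_reads = sum(otu_counts)
    if total_reads == 0:
        raise ValueError("sample contains no reads")
    singletons = sum(1 for n in otu_counts if n == 1)
    return 1.0 - singletons / total_reads


# Hypothetical per-OTU read counts for one sample.
toy_counts = [5, 3, 1, 1]
print(goods_coverage(toy_counts))
```

A value close to 1 indicates that few reads come from OTUs seen only once, so additional sequencing would be expected to reveal mostly rare taxa.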
We used alpha diversity indices to understand the community composition of gut microbiota, some of which are based on species richness and species abundance and some on the phylogenetic distance between them. Alpha diversity indices such as Chao1, Shannon, Simpson, and PD_Whole tree revealed large differences in community composition among the study subjects (Figure 1A). Upon comparison of alpha diversity indices between rural and urban populations, they were observed to be higher in urban subjects; however, no significant differences were noted for alpha diversity indices with respect to the sequencing technology used. Overall, we could detect 201 bacterial genera belonging to 11 bacterial phyla in Indian subjects (Figure 1B). Upon closer examination of the OTU table, we were able to detect 50 OTUs that were present across the samples; such OTUs are commonly termed core OTUs (Table 1). The presence of just 50 core OTUs suggests that the gut microbiome of Indians is very diverse. This was further confirmed by performing beta diversity analysis using unweighted (sensitive to the presence of unique OTUs) and weighted (sensitive to abundance) UniFrac distance matrices. In each case, jackknifed PCoA biplots were produced to illustrate the compositional variation in gut microbiota between the samples; the position of each sample is the average of the jackknifed replicates, shown with ellipses representing the IQR in each axis. The presence of large ellipses around each sample sphere in the unweighted PCoA plot (Figure 1C) is indicative of variations in beta diversity measures due to random subsampling and thus of the presence of unique OTUs particular to each individual. Interestingly, we also noted that the samples that happened to be collected from rural areas (eight samples on the right side of Figure 1C) clustered separately from the urban samples on the unweighted PCoA plot, indicating the contribution of lifestyle associated factors to sample segregation.
However, on the weighted PCoA plot (Figure 1D), all samples were found scattered, indicating that the abundance of taxa influencing the segregation of samples on the weighted PCoA plot was not different among the samples. Further, from the taxa contributing to sample segregation on PCoA plots and from core OTUs, it was noticed that the gut microbiota of Indians is highly enriched with OTUs belonging to the bacterial genera Prevotella and Megasphaera and bacterial families such as Lachnospiraceae, Ruminococcaceae, and Veillonellaceae. To confirm that Indian gut microbiota is enriched with Prevotella and Megasphaera OTUs, we carried out qPCR assays for absolute quantification of 16S rRNA gene copy number of these genera in the study subjects. The mean count of Prevotella and Megasphaera was found to be 4.45% and 8.45%, respectively, of the total bacterial count. In contrast, the Faecalibacterium mean count was as low as 0.63% of the total bacterial count (Figure 2). Interestingly, based on the absolute count of Prevotella and Megasphaera, Indian subjects were demarcated into two groups, one with moderate and the other with high copy number of these genera. These results confirmed the 16S rRNA gene amplicon analysis and signify the dominance of Prevotella and Megasphaera in Indians.

Quantitative Differences between Gut Microbiota of Indians and Americans
The mean abundance of bacterial phyla and families between Indians and Americans was compared using t-test. Significant differences were observed in four dominant phyla in these populations: Actinobacteria (P = 0.0003), Bacteroidetes (P = 0.029), and Proteobacteria (P = 0.0015) being significantly more abundant in Indians and Firmicutes (P = 0.0004) in Americans (Figure 3A). At family level, 11 families were observed to be significantly different in the two populations ( Figure 3B and Supplementary  (Figures 4A,B) as against the Indians who were found dispersed along the coordinate 2. For Random Forest analysis, we considered an OTU to be highly predictive if its importance score was at least 0.001, this revealed 76 highly predictive OTUs between the two populations (Supplementary Table 6). Among these 76 highly predictive OTUs, 6 were overrepresented in Indians while rest were overrepresented in Americans. The OTUs overrepresented in Indians belonged to genus Prevotella, Lactobacillus, Lachnospira and Roseburia. Our results highlight profound differences at various taxonomic levels in gut microbial community structure of the two populations.
We further analyzed the differences in gut microbiota of Indian, Bangladeshi, American, Korean, and Spanish populations in terms of unique and shared bacterial families and OTUs among these populations. For this, we normalized the sequence data to 4389 sequences per sample, which yielded 1807 OTUs. At the bacterial family level, Indians shared more families with Bangladeshis and fewer with Americans, Koreans, and Spanish (Figure 5A). With 460 unique OTUs (Supplementary Table 7), Indians shared a maximum of 25 OTUs with Bangladeshis, 15 with Americans and Spanish, and 7 with Koreans (Figure 5B). Most of the shared OTUs between Indians and Bangladeshis belonged to the families Lachnospiraceae, Ruminococcaceae, and Enterobacteriaceae, and the genus Prevotella (Supplementary Table 8). Interestingly, only 3 OTUs were common to all populations; these belonged to the Streptococcaceae and Enterobacteriaceae families.
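The shared, unique, and common OTU counts reported here are set operations on the per-population OTU lists. A minimal sketch of that bookkeeping is given below, assuming each population is represented by the set of OTU identifiers detected in its merged data; the function names and OTU identifiers are hypothetical.

```python
def shared_with(populations, target):
    """OTUs the target population shares with each other population."""
    return {name: populations[target] & otus
            for name, otus in populations.items() if name != target}


def unique_to(populations, target):
    """OTUs found in the target population and in no other population."""
    others = [otus for name, otus in populations.items() if name != target]
    return populations[target] - set().union(*others)


def common_to_all(populations):
    """OTUs present in every population."""
    return set.intersection(*populations.values())


# Hypothetical toy OTU sets, one per population.
population_otus = {
    "Indian": {"otu_1", "otu_2", "otu_3", "otu_4"},
    "Bangladeshi": {"otu_2", "otu_3", "otu_5"},
    "American": {"otu_3", "otu_6"},
}

print(shared_with(population_otus, "Indian"))
print(unique_to(population_otus, "Indian"))
print(common_to_all(population_otus))
```

Because the comparison is presence/absence based, normalizing all populations to the same sequencing depth beforehand matters: deeper sampling alone would otherwise inflate the number of OTUs a population appears to share.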

Indians Share Microbiota with Omnivorous Mammals
For the comparison of gut microbiota of Indians with non-human primates, we normalized the sequences to 1181 sequences per sample, which constituted 6189 OTUs. We observed that Indians share a maximum of 68 of 236 bacterial families (Figure 6A) and 112 OTUs with omnivorous mammals (Supplementary Table 9), and a minimum of 32 OTUs with carnivorous mammals. Interestingly, only 2 OTUs were common to all non-human primates and Indians (Figure 6B). Further, principal coordinate analysis (PCoA) based on unweighted and weighted UniFrac distance matrices showed a scattered distribution of omnivorous samples (Figures 4C,D). In contrast, herbivorous (hind gut and fore gut fermenters) and carnivorous samples clustered separately (Figure 4C). Random Forest analysis of Indians and non-human primates revealed 652 highly discriminant OTUs. Of the 341 OTUs, 122 and 174 OTUs were overrepresented in Indians and omnivorous mammals, respectively (Supplementary Table 10).

Imputed Metagenome
For comparing functional potential of the microbial communities in Indians and Americans, we used PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States). PICRUSt uses an extended ancestral-state reconstruction algorithm to estimate which gene families are present and then combines gene families to give the complete metagenome of the samples. From the data on functional capabilities, we focused primarily on those associated with microbial metabolism. We noticed significant differences in all major metabolic functions between the gut microbiomes of Indians and Americans. Broadly, gene families associated with xenobiotic biodegradation, nucleotide metabolism, enzyme families, metabolism of terpenoids and polyketides, and glycan biosynthesis and metabolism were overrepresented in Indians, whereas metabolic functions associated with energy metabolism, carbohydrate metabolism, amino acid metabolism, and biosynthesis of other secondary metabolites were overrepresented in Americans (Figure 7A). Further, the metagenomic comparison between Indians and non-human primates revealed that gene families linked to energy harvesting potential, such as carbohydrate metabolism, glycolysis-gluconeogenesis, and fatty acid biosynthesis, were enriched in omnivorous mammals (Figure 7B).

DISCUSSION
Studies concerning population specific microbiota have revealed peculiar patterns in distribution of specific microbial communities in their gut. Surprisingly, till date, no efforts have been made to understand specific features of the microbiota of healthy Indian subjects. Based on 16S rRNA data from 34 individuals and 3782 OTUs, in this work, we first systematically describe gut microbiota features in Indian subjects. We suggest vast inter-individual variation in gut microbial communities in these subjects, characterized by dominance of Prevotella and Megasphaera. We further demonstrate the graded difference in microbial communities of these subjects from neighboring country (Bangladeshi) to distant population (Americans) as well as show that they indeed share most of the microbiota with omnivorous animals.
Our observation of compositional and phylogenetic variation within Indian gut microbiota, as revealed by alpha diversity indices, could be a result of different variables such as biogeographic separation of individuals (e.g., rural-urban setting) and associated life-style factors. Further, we noted large variation in alpha diversity indices in urban individuals. Thus, to check whether there are unique taxa responsible for this, we performed UniFrac based beta diversity analysis. Indeed, the separation between rural and urban subjects observed on the unweighted UniFrac PCoA, which is influenced by less abundant unique OTUs, was lost on the weighted UniFrac PCoA because of the abundance of dominant OTUs. The distinct separation was also not evident in phylum level abundance. On PCoA bi-plots, we further showed the contribution of dominant taxonomic groups influencing the segregation of samples. Thus, our results are robust and prove the presence of individual specific OTUs; at the same time, they confirm that Indian subjects could not be separated into two or more groups based on the presence and abundance of dominant taxonomic groups.
Knowing that gut microbiota is influenced by diet and geography, we extended our analysis and compared the gut microbiota of Indian subjects with American gut microbiota. Based on composition of microbiota, Americans were closely clustered while Indians were found dispersed on PCoA-biplots. This distinctive clustering could be partly because of genetic make-up and largely due to the calorie restricted diet that the American subjects were following. Interestingly, in another study, Americans from metropolitan areas who were not on any specific diet segregated distantly from Malawians and Amerindians and clustered closely (Yatsunenko et al., 2012). This suggests that, although the cohort was calorie restricted, the gut microbiota of Americans is indeed different from the gut microbiota of other communities. Thus, diet can be one of many factors that influence the gut microbial communities, and other factors such as genetic make-up and other current practices could also have a major influence on gut microbial composition. On a broader scale, the Indian population, which originated from the first wave of modern humans Out-of-Africa following the coastal route, and the American population, which is effectively descended from post-Columbian European migrants (Lazaridis et al., 2014), are genetically different hosts with varied dispersal histories (Macaulay et al., 2005; Mellars et al., 2013). The lack of a cohesive Indian population cluster may be due to the heterogeneous representation of Indian samples from different endogamous groups experiencing diverse dietary patterns, prescriptions-proscriptions for food, and food taboos that vary culturally.
Upon analysing the differentiating bacterial lineages and contributors in PCoA-biplots, we discovered that the OTUs belonging to the genera Prevotella, Lactobacillus, Bifidobacterium, and Megasphaera were discriminately abundant in Indians. Members of the genus Prevotella are known for their ability to degrade complex plant polysaccharides (De Filippo et al., 2010); thus, its high abundance in Indian gut microbiota could be a result of the nature of the Indian diet, which is primarily rich in plant derived preparations (Vecchio et al., 2014). The predominance of members of Lactobacillus and Bifidobacterium could be explained by the fact that fermented foods are another major component of the Indian diet; these fermented foods are a good source of lactic acid bacteria (Satish Kumar et al., 2013). Members of the genus Megasphaera, a normal inhabitant of the ruminant gut, have been isolated by us from the gut microbiota of Indians (Shetty et al., 2013). The genome analysis and physiological characterization of these Megasphaera isolates highlighted their ability to produce short chain fatty acids, viz. propionate, acetate, and butyrate, and vitamins such as cyanocobalamin. One of the interesting observations of our study is the demarcation of Indian individuals into two groups (moderate and high copy number of Prevotella and Megasphaera). Recently, bimodal bacteria (with low and high abundance groups) in more than 1000 western individuals were reported and were predicted to be key bacterial groups associated with host health (Lahti et al., 2014). Considering the metabolic features of Prevotella and Megasphaera explained earlier and the effect of different environmental factors on microbiota, they can be regarded as tipping elements in Indian gut microbiota and are possibly linked with the general well-being of these subjects, as all the participants were healthy. However, further analysis would be needed to confirm the bimodal nature of these groups.
Further, we obtained evidence for variations in the gut microbiota of Indians by comparing it with the gut microbiota of Spanish, Korean, Bangladeshi, and American populations, which are unique with respect to their dietary patterns and biogeographic locations. Indians shared the maximum number of taxonomic groups with their next-door neighbors, Bangladeshis, and progressively fewer with Americans, Spanish, and Koreans. The high similarity between the gut microbiota of the Indian and Bangladeshi populations is a reflection of shared ethnicity and other life-style factors between these populations. Interestingly, Indians shared the fewest OTUs with Koreans, who in turn shared the maximum number of OTUs with Americans, in accordance with the observations of a previous study (Nam et al., 2011). The most intriguing finding of this analysis, however, was the presence of only three common OTUs amongst all the populations, strengthening the view that the gut microbiome of geographically separated populations is indeed unique and that very few OTUs may contribute to the core microbiome of the global population (Huse et al., 2012).
In meta-analyses of microbial studies, comparisons are often made between data generated using different experimental protocols; hence, a critical question is whether the principal conclusions derived are due to technical differences or are indeed biologically meaningful. Taking into account the effect of different experimental protocols, including the method of DNA extraction, the use of specific primers, and sequencing technologies, it cannot be denied that these factors could introduce some bias in the observed results (Lozupone et al., 2013). However, by the use of a more stringent approach during bioinformatics analysis of amplicons (as presented in the current manuscript), it is possible to reduce such biases. The results presented in Figure 1C indicate that the segregation of samples is not due to the sequencing technologies used, but is indeed due to the large compositional differences in microbiota. Thus, such comparisons are required to identify the influence of these factors on the observed results and will bring to light ways of optimizing the analysis protocol in order to minimize the effect of such confounding factors.
One of the major life-style factors that characterize a population is its dietary habits. There is abundant evidence in the literature suggesting an effect of diet on microbiota (David et al., 2014; Xu and Knight, 2014). We therefore hypothesized that the gut microbiota of Indians, who typically display mixed vegetarian and non-vegetarian dietary habits, may be similar to that of omnivorous mammals. The observations of the present study regarding similarities between the gut microbiome of Indians and omnivorous mammals are congruent with previous study findings (Ley et al., 2008; Muegge et al., 2011). Ley et al. showed that indigenous gut microbial communities co-diversify with their hosts and that microbial diversity increases from carnivory to omnivory to herbivory. Moreover, the presence of only two common OTUs amongst all the types of dietary patterns hints toward subtle differences and rapid trade-offs in gut microbial communities shaped by evolutionary forces in response to animal and plant diets.
Metagenomic studies of the gut microbiome suggest that microbes residing in the gut have enormous genetic potential to code for functions essential for them to thrive in the gut environment and maintain homeostasis of the gut ecosystem (Qin et al., 2010). To the best of our knowledge, no report on experimentally derived human gut metagenomic data from adult Indian individuals is available. In this context, our metagenomic imputations provide a first glimpse of the functional capabilities of the gut microbiota of Indian individuals. Our metagenomic imputations using PICRUSt followed by LEfSe analysis reveal vast diversity in metabolic functions in these subjects. Although the findings of differences in metabolic capabilities among Indians-Americans and Indians-non-human primates are based on the imputed metagenome and have some limitations as explained earlier (Langille et al., 2013), we were able to capture broader functional features of the gut microbiota and correlate them with the taxonomic features. A higher abundance of Bacteroidetes is generally associated with the ability to degrade xenobiotics such as antibiotics (Maurice et al., 2013) and to metabolize complex glycans (Martens et al., 2009), whereas the Firmicutes are related to increased energy harvest through excessive carbohydrate metabolism and production of SCFAs (Turnbaugh et al., 2006). The high Bacteroidetes and low Firmicutes found in Indian subjects, and their correlation with metabolic abilities, suggest that their gut microbiota differs from that of Americans not only at the taxonomic level but also at the functional level.

CONCLUSION
Our study raises the exciting possibility that the difference in microbiota may contribute to differences in health and disease characteristics of Indian population that could be different compared to the observations in the western population. Findings of the present study will serve as a basis for large-cohort studies in near future on Indian Gut Microbiome to address the questions such as if there are specific bacterial taxa or microbial functions which can be treated as a potential target for medical intervention studies.

AUTHOR CONTRIBUTIONS
YS and SJ conceptualized and designed the study, and YS also coordinated it. SB, SM, and RS performed Ion Torrent PGM sequencing. SS performed Illumina HiSeq sequencing. SK, AG, and HM downloaded all relevant 16S rRNA sequence data. SB and PP carried out the detailed bioinformatics analysis. MN wrote the specific PERL script for bioinformatics analysis. DD coordinated the bioinformatics analysis. SO provided the anthropological insight on the Indian context. SB, SG, HL, CY, DA, RP, SJ, and GM were involved in sample collection. SB and PP wrote the manuscript, and all authors edited and approved the manuscript.

AVAILABILITY OF SEQUENCE DATA
Ion Torrent PGM runs were deposited to NCBI SRA under the accession numbers SRP041693, SRP055407 and Illumina raw reads to DDBJ under the accession number DRA002238.