Comparative Gut Microbiota of 59 Neotropical Bird Species

The gut microbiota of vertebrates are essential to host health. Most non-model vertebrates, however, lack even a basic description of natural gut microbiota biodiversity. Here, we sampled 116 intestines from 59 Neotropical bird species and used the V6 region of the 16S rRNA molecule as a microbial fingerprint (average coverage per bird ~80,000 reads). A core microbiota of Proteobacteria, Firmicutes, Bacteroidetes, and Actinobacteria was identified, as well as several gut-associated genera. We tested 18 categorical variables associated with each bird for significant correlation to the gut microbiota; host taxonomic categories were most frequently significant and explained the most variation. Ecological variables (e.g., diet, foraging stratum) were also frequently significant but explained less variation. Little evidence was found for a significant influence of geographic space. Finally, we suggest that microbial sampling during field collection of organisms would propel biological understanding of evolutionary history and ecological significance of host-associated microbiota.


INTRODUCTION
Gut microbiota are an essential component of vertebrate health. Microbes provide many necessary functions for the host organism, including aiding digestion, vitamin synthesis, protection against pathogens, training the immune system and organ development (Qin et al., 2010;Diaz Heijtz et al., 2011;Al-Asmakh et al., 2014). The gut microbiota is one of the most densely populated natural environments known (Whitman et al., 1998), possibly composed of thousands of species (Xu and Gordon, 2003). Generally speaking, gut microbiota communities tend to be more similar between more similar hosts, although the specific members of the microbiota can vary significantly between hosts of the same species (Eckburg et al., 2005;Hird et al., 2014) and even between identical twins (Turnbaugh et al., 2010). The microbiota may have an influence above the level of the individual, as they can affect mate choice (Sharon et al., 2010) and cause hybrid inviability (Brucker and Bordenstein, 2013). Understanding the role of the microbiota in evolution is a major outstanding question and the subject of much ongoing research.
Birds are a globally distributed class of vertebrates. The bird lineage is thought to be approximately 150 million years old and has likely been in symbiotic relationships with microorganisms the entire time. Birds live on every continent and exhibit extreme morphological and ecological diversity, much of which is centered in the Neotropics (Jenkins et al., 2013). Pertinent to gut microbiota, bird diets range from robust and opportunistic to strictly carrion feeding (i.e., vultures) to nectar feeding (i.e., hummingbirds) to folivorous (i.e., hoatzin) with corresponding variation in their intestinal morphology. Most avian gut microbiota studies have looked at one or a few host species; from these studies we know that bird gut microbiota are influenced by host genetics and evolutionary history (Banks et al., 2009;Dewar et al., 2013) as well as by ecological factors, such as dietary specialization (Roggenbuck et al., 2014). The gut communities can also change as a result of seasonal dietary fluctuations (Wienemann et al., 2011).
Modern molecular techniques provide an insight into the biodiversity contained within the guts of birds. Firmicutes, Proteobacteria, and Actinobacteria tend to dominate avian gut samples (summarized in Waite and Taylor, 2014), although captivity status can have a significant effect on microbial composition (Xenoulis et al., 2010;Wienemann et al., 2011;Kohl, 2012). Most gut microbiota studies have not included physical space as a variable that may be contributing to microbial variation in host gut communities. Geographic distance is associated with gut microbiota similarity (Yatsunenko et al., 2012) and, generally speaking, bacterial communities that are geographically closer are more similar than communities more distant (Green and Bohannan, 2006). Avian gut microbiota have shown conflicting results regarding the importance of geographic distance; whereas some studies have found it to be associated with the gut microbiota (Hird et al., 2014), others have looked but found no such evidence (Banks et al., 2009).
Large comparative studies inform us about the relationship between higher taxonomic classes and broader ecological groups.
In mammals, such studies have found that host taxonomy is strongly associated with gut microbiota communities (Ley et al., 2008a) as is dietary specialization (Ley et al., 2008b;Muegge et al., 2011). Diet seems to be the most important factor across insects (Anderson et al., 2012;Colman et al., 2012) and environmental variables have the most influence on gut microbiota in fishes , although taxonomy may also have a role. A recent meta-analysis of all previously published bird gut microbiota studies found that taxonomy of the bird had a major influence on the composition of the gut microbes (Waite and Taylor, 2014). Understanding how various host and environmental factors interact to shape the gut microbiota is a major goal of modern microbial ecology and evolution.
A molecular survey of the gut microbiota is a basic biodiversity measure that is lacking for the vast majority of wild bird species. Our primary aim for the current study was to create a microbial catalog for the gut microbiota diversity found in 59 Neotropical bird species sampled in the wild. Second, we test for taxonomic and ecological associations between bird host and gut microbiota. Finally, we tested for a geographic signal in our dataset, which was gathered from 12 localities across Costa Rica and Peru.

Sampling
The large intestine was extracted from 108 birds in Costa Rica and seven in Peru (Table 1, Figure 1, Supplemental Table S1) during fieldwork conducted between May and August 2010 [LSU IACUC protocol 09-001; Sistema Nacional de Areas de Conservación (SINAC) permit number: 109-2010-SINAC; access to genetic material by Comisión Nacional para la Gestión de la Biodiversidad (CONAGEBio) under permit: R010-2010-OT-CONAGEBIO]. Immediately after euthanization, birds were dissected and the largest section of undisturbed large intestine tied off, cut out and deposited into CryoVials. Samples were frozen in liquid nitrogen, following the protocol of Godoy-Vitorino et al. (2010) and kept frozen at the LSUMNS Collection of Genetic Resources until microbial DNA extraction (using Qiagen Power Soil extraction kits). We focused on bird species found in lowland forests across Costa Rica whose ranges extend south into the Peruvian Amazon.
Bird DNA Sequencing and Phylogenetic Tree Estimation DNA was extracted from the liver, heart or pectoral muscle of birds using the standard tissue protocol from Qiagen DNeasy kits. The mitochondrial locus ND2 was sequenced using the L5215 (5 ′ TAT CGG GCC CAT ACC CCG AAA AT 3 ′ ) and H6313 (5 ′ CTC TTA TTT AAG GCT TTG AAG GC 3 ′ ) primers and a thermocycler program of 94 • C for 2 min followed by 35 cycles of 94 • C for 60 s, 45 • C for 30 s, 72 • C for 60 s and a final extension of 7 min at 72 • C (Brumfield et al., 2007). Amplicons were cleaned using a BigDye Terminator kit and sequenced on ABI PRISM 3730xl at Beckman Coulter Genomics (Danvers, Massachusetts). The sequences were aligned with MUSCLE (Edgar, 2004) and we used BEAST (Drummond and Rambaut, 2007) to generate a sample from the posterior distribution of phylogenetic trees given the data (partitioned GTR + Ŵ model with estimated base frequencies). Because many of the internal branches of a multi-order bird phylogeny are short and several genomic-scale datasets have recently resolved some of these common conflicts, we also constrained the tree's internal nodes to reflect current genomic understanding of bird taxonomy (Klicka et al., 2007;Tello et al., 2009;Barker et al., 2013).

Microbial Fingerprint
Following Gloor et al. (2010) we used combinatoric primers and massive multiplexing of PCR amplicons for sequencing on an Illumina Hi-Seq. This method uses paired-end sequencing technology to generate pairs of sequences with 100% overlap across variable region 6 (V6) of the 16S component of rRNA; primer sequences align to positions 967-985 and 1078-1061 on Escherichia coli 16S rRNA (Gloor et al., 2010). We chose the V6 region of 16S because it is short enough that the sequencing technology was able to cover the entire region in both directions but variable enough to differentiate bacterial species (Chakravorty et al., 2007). Several samples were extracted or amplified more than once to quantify differences along the digestive tract and/or determine how sensitive the methods are to differences. One sample was amplified and sequenced a second time from a single extraction; these replicates are Cyanocompsa.cyanoides.1.1 and Cyanocompsa.cyanoides.1.2. Three birds had two extractions completed and were sequenced independently: Attila.spadiceus.1 had one extraction from the posterior large intestine and the other from the anterior large intestine; Trogon.rufus.2 had    two extractions in tandem from the posterior large intestine; Nyctidromus.albicollis.1 had one extraction from the posterior large intestine and a second from one of the ceca. We used several measures of sequence quality control. First, both reads of a given pair had to match across 100% of the bases. The pairs also must have no errors in the individual tag or priming sequence. We used the BELLEROPHON (Huber et al., 2004) function within the mothur program (Schloss et al., 2009) to identify and discard potentially chimeric sequences. Finally, we used mothur to discard sequences that did not blast to the domain Bacteria. The reads passing these filters were included in the final dataset.

Subsampled Datasets
To assess patterns across different spatial, taxonomic and ecological scales, we subdivided the dataset eight ways.
1. Full dataset: all samples (N = 116). 2. >2 individuals: all individuals belonging to species sampled more than once (N = 80, removes singletons). 3. CFM: all individuals from the three species that were sampled from both Costa Rica and Peru: Cyanocompsa cyanoides, Florisuga mellivora and Mionectes oleagineus (N = 17, this dataset allows investigation of large-scale geographic differences between birds of the same species).
4. Cyanoides: all individuals belonging to the species Cyanocompsa cyanoides (N = 8, removes taxonomic variation). 5. Manakins: all individuals from two species that were sampled multiple times and belong to the same family: Manacus candei and Ceratopipra mentalis (N = 13, this dataset allowed us to look at the differentiation between closely related species). 6. Non-passerines: all the birds belonging to orders other than Passeriformes (N = 27, removes largest order). 7. Passerines: all the birds belonging to the order Passeriformes (N = 88, constrains taxonomic variation to a single order). 8. Tirimbina: all individuals from Tirimbina Biological Reserve, Sarapiquí, Costa Rica, the most densely sampled locality (N = 45, removes geographic variation).

Taxonomic Assignment and Clustering Analyses
The microbial ecology package QIIME (Caporaso et al., 2010b) was used for the following analyses. First, the de novo OTU picking protocol was used to assign the reads to phylotypes at 97% sequence similarity because 3% is frequently cited as the "species" level of microbial taxonomy (Schloss and Handelsman, 2005), hereinafter "phylotypes." Next, we assigned taxonomies to phylotypes using the QIIME implementation of the RDP Classifier 2.2 Program (Wang et al., 2007), with the default confidence threshold of 80%. A "core microbiota" was calculated and included all phylotypes that were found in 100% of the samples. A pairwise UniFrac (Lozupone and Knight, 2005) distance matrix (UDM) was constructed between each gut microbial community (i.e., each bird specimen). UniFrac distances are calculated based on the amount of branch length in a phylogenetic tree that is unique to either of two environments (vs. how much of the tree is shared by the environments). These distances can be based on presence-absence of OTUs ("unweighted") or weighted by abundance and our analyses use both, as neither method is agreed to be more appropriate for multi-species microbiota studies. Our data were aligned using the QIIME implementation of PyNAST (Caporaso et al., 2010a) and a microbial phylogenetic tree was constructed with FASTTREE (Price et al., 2009). All individuals were randomly reduced to 3652 reads, equal to the lowest number of reads for any bird in the dataset. Despite this type of rarefaction being inappropriate for some questions (McMurdie and Holmes, 2014), we chose to normalize our data this way because we are comparing "whole microbiome" data across many individuals and highly variable sequencing depths can affect diversity estimates (Goodrich et al., 2014). We constructed UPGMA dendrograms based on both the unweighted UDM and weighted UDM to visually represent the relatedness of the gut microbiota for all datasets. Principal coordinates analysis (PCoA) was also performed on both the weighted and unweighted UniFrac distance matrices.
As a complement to the phylogenetic-based methods, we visualized the data with non-metric multidimensional scaling (NMDS). We square root-transformed the percentage of each sample that belonged to each bacterial phylum, then created a pairwise distance matrix using Bray-Curtis dissimilarity, applied through the VEGDIST function of the VEGAN package (Oksanen et al., 2011) in R (R Development Core Team, 2010). The NMDS function of the ECODIST package (Goslee and Urban, 2007) was then used to calculate the two-dimensional positions of the samples (such that closer samples are more similar), the stress and R 2 value of the plot. Stress values >0.3 should not be considered valid whereas values <0.2 can be considered a good representation of the data (Quinn and Keough, 2002).

Categorical Variable Significance
To look for a relationship between categorical variables associated with each bird and the microbial communities, we used the statistical tools Adonis (McArdle and Anderson, 2001) and Anosim (Clarke, 1993) implemented in QIIME. The categorical variables included the American Ornithologists' Union South American Classification Committee's taxonomy, i.e., order, family, genus and species (Remsen et al., 2012), ecological variables, including dietary specialization and habitat (Bennett and Owens, 2002), spatial variables and individual properties, like age (based on percent of skull ossification), stomach contents (e.g., "insects" or "plant material") and bacterial richness (bacterial taxa identified per bird). Table 2 gives a detailed list of the variables and their sources. We calculated significance of all variables for both the weighted and unweighted UDMs with 999 permutations.
After testing the significance of each variable independently, we ran an additional Adonis test on the most frequently significant variables to quantify the amount of variation each variable was responsible for in the context of other variables. We used the full dataset's unweighted and weighted UDMs as input, calculated 999 permutations, and permuted the order of the variables, which can affect the results of the test. Finally, we constrained the analyses to only permute the data within bird orders, as a measure of controlling for taxonomy. We then reran the weighted and unweighted UDMs.

RESULTS
After initial quality control steps, 9,897,718 pairs of reads remained with no errors in priming sequence, region of overlap or individual tags. Potentially chimeric sequences (1167, 0.01% of reads) were then discarded. A further 3,58,725 sequences that did not align to any sequence within the domain Bacteria (3.6% of reads) were removed; 75% of these discarded reads belonged to 11 individuals. The reads passing these filters were included in the final dataset, totaling 9,537,817 sequences and averaging 82,222 sequences per individual; reads/sample varied by over two orders of magnitude (range: 3652-853,078).

Clustering Analyses
The PCoA for the unweighted UDM displayed clustering by taxonomic order (Figure 3), but little obvious clustering by foraging stratum, diet or sampling locality. Host order displayed the most clustering in the weighted UDM PCoA and NMDS plot as well (Figure S1), although little correspondence is seen between the actual bird phylogeny and a dendrogram of the weighted UDM (Figure 4; see Figure S2 for unweighted UDM and Bray-Curtis dendrograms).

Categorical Variable Significance
To look for correlations between categorical variables and the gut microbiota, we conducted two statistical tests for significance on both the weighted and unweighted UDMs (Table 4). For the full dataset, taxonomic categories explained the most amount of variation and were all significant at p < 0.05 for the Adonis statistical tests ( Figure 5A, Table 4). All other variables explained much less variation although some were significant. The taxonomic categories were also significant in the Anosim test of the full dataset, whereas none of the ecological or spatial variables were (Table 4). Averaging across all 8 datasets, taxonomic variables were the most frequently significant; all four categories (order, family, genus, species) were significant in over 50% of the tests. The taxonomic categories also accounted for the most variation ( Figure 5B). The ecological variables were significant in most datasets but generally explained much less variation than the taxonomic variables. The locality and individual variables had generally low R 2 values and were all significant in less than 33% of the tests. Using the "Tirimbina" dataset, which included all individuals from a single locality, host order was still significant, with other taxonomic levels less so (Table 4). Conversely, in our most taxonomically restricted dataset including a single species, "Cyanoides, " geographic variables explained a large amount of variation but the effect was not significant (unweighted UDM R 2 = 58%, p = 0.23; weighted UDM R 2 = 74.9%, p = 0.156).
The multifactor Adonis tests yielded different results for the weighted and unweighted UniFrac distance matrices. Foraging stratum, host order and bacterial richness (at the phylum level) were all significant (p < 0.05) for the unweighted UDM and accounted for around 10% of the variation each (Table 5A). No variables were significant for the weighted UDM (Table 5B). When data were permuted within the taxonomic orders (i.e., controlling for high-level host taxonomy), the significance of the variables did not change (Tables 5C,D), although the amount of variation that the host taxonomy explained increased.

DISCUSSION
Microorganisms are an essential component of biodiversity, and vertebrate evolutionary history is incomplete without an adequate understanding of our microbiota. This study greatly increases our basic understanding of what organisms live in the guts of wild Neotropical birds, spanning eight orders within Aves. We found taxonomic profiles similar to other avian gut studies with some exceptions. Compared to the meta-analysis performed by Waite and Taylor (2014), our samples appear to be enriched for Proteobacteria and have a deficit of Actinobacteria. If this result is not a biological signal, the discrepancy may be due to the difference in marker choice or methods between our study and those in the meta-analysis or to an underlying methodological bias.
Several gut-associated taxa were defined as our "core microbiome." The phyla Firmicutes, Proteobacteria, Actinobacteria, and Bacteroidetes were found in all our samples and are often found in gut habitats across vertebrates (e.g., Turnbaugh et al., 2008;Roeselers et al., 2011); the same is true for the genera Bacteroides, Clostridium, and Lactobacillus. Additionally, some genera frequently found in avian guts specifically were identified in all our samples: Streptococcus and Campylobacter (Zhu et al., 2002;Xenoulis et al., 2010;Videnska et al., 2014). Although several known contaminants of microbiota studies (Salter et al., 2014) were found in our dataset (e.g., Corynebacterium, Propionibacterium, Streptococcus), the  Gray boxes indicate the phylotype did not align to any named taxa at that taxonomic rank. Numbers in boxes are the total number of identified phylotypes in that group. Full length gray bar at bottom indicates a phylotype that did not align to any named phylum. high biomass of our initial samples (intestinal contents) and the corroboration of many of these genera being found in other gut studies leads us to believe our results are not due to laboratory contamination. Overall, these results suggest that there may be a conserved set of microorganisms found in the guts of birds and perhaps all vertebrates.
The Importance of Host Taxonomy, Ecology, and Geographic Space The statistical tests applied above indicate varying degrees of importance for taxonomy, ecology, geographic space, and individual traits for structuring the gut microbiota of birds. Generally speaking, it appears that who a bird is most important, how a bird lives is possibly important and where a bird lives may be of little importance.
Instead of phylogenetic distance, we used taxonomic categories as our evolutionary units, so that we could apply the same statistical methods to all metadata associated with the bird samples. Focusing on hierarchical taxonomic categories also allowed us to overcome the analytical problems associated with high individual variation in gut microbiota (e.g., Turnbaugh et al., 2008; Figure 4) while still learning about the effect of evolutionary history on the microbiota. Taxonomic categories were most frequently significant in our analyses and explained the most variation. There is clear evidence for associations between gut microbiota and host taxonomy, which has been widely noted in other systems and at varying scales (e.g., Ochman et al., 2010). It is important to note, though, that taxonomy is not the same as phylogeny; our results show that individuals of the same species/genus/family/order are more similar to one another than they are to individuals in other groups, not that more closely related species/genera/families/orders have more similar microbiota. In fact, little correlation was seen between the individual phylogeny and the gut microbiota dendrogram (Figure 4). Does this mean that phylogeny is not an underlying factor shaping the gut microbiota? We believe that it would be difficult to get significant signal at all taxonomic levels without phylogeny playing some role but this requires further investigation that may require parsing the microbiome into vertically vs. horizontally transmitted bacteria. It would not be surprising to find some vertical transmission of microbiota from parent to offspring occurring in birds, as all the birds in this study feed their young by regurgitating food into the mouth of nestlings. The exact role of evolutionary history on gut microbiota is an exciting next step.
We also found small but significant associations between the gut microbiota and host ecology. Broad dietary classifications (e.g., mostly plant, mostly animal, plant/animal) were more significant than specific dietary specializations and the literal stomach contents were only significant in two of the 32 tests. Perhaps long term habits or nutritional content have greater influence on gut microbiota than day to day food intake, which is consistent with other studies showing the stability of the avian gut community once established (Benskin et al., 2010) and that diet can drive the convergence of gut microbiota in nonavian vertebrates (Ley et al., 2008a,b;Muegge et al., 2011). Of methodological note: the "stomach contents" variable contains a lot of variance particularly with respect to specificity and accuracy, as it is recorded in the field and only general data are taken, (i.e., "plant material" or "insects").
Strata and habitat are important ecological aspects of avian biology. Foraging strata is associated with genetic divergence in Neotropical birds (Burney and Brumfield, 2009;Smith et al., 2014) because ecology affects dispersal ability. Our results reinforce the importance of foraging strata on avian biology and support an important role for ecology in differentiation of both host and microbiota. Microbes from the same ecological niche on the human body are able to share genes on a global scale (Smillie et al., 2011)-perhaps the microbiota of birds that share ecological niches are able to transfer genetic material as well. We could not decipher which aspect of host ecology is directly responsible for the signal we discovered in our current data-is   it that birds in the same foraging stratum are exposed to the same environment microbes, that their behavior exposes them to particular microbes within the environment or that similar genes recruit specific microbes? Investigating how these aspects of host ecology affect gut microbiota is an exciting avenue of future research. Despite ecological factors possibly being important for avian gut microbiota, (external) geographic space showed little effect on the microbiota. All the locality variables had poor correlations with the gut microbiota and our spatial tests revealed no statistical significance between space and microbiota. Thus, bird gut microbiota are likely not just a random assortment of the microbes available in a given environment. Can we further conclude that physical space has little effect on the microbiota? Possibly. None of our analyses found that gut communities that are spatially closer are more closely related, both across Aves and within a single species (Cyanocompsa cyanoides). Alternatively, the apparent lack of effect of locality may be an issue of sampling. This dataset contains few species (or even genera) with multiple individuals. If locality is important, we might expect it to be working within species instead of across higher taxonomic levels. Hird et al. (2014) found that locality was significantly correlated with gut microbiota within 34 brown-headed cowbirds from two localities. Scale of analysis may be critical for detecting what factors are contributing to divergence and much more work is needed to define appropriate geographic, taxonomic and ecological sample sizes. Regardless, it appears that geographic space is much less important than taxonomy and ecology for structuring the gut microbiota in birds.

Incorporating Microbiome Studies into Collection Based Research
Wild organisms have different microbiota than captive conspecifics (Wienemann et al., 2011) and microbial communities change once the host organism has died (Hauther et al., 2015). Thus, field preservation of the microbiota generates unique and irretrievable data. Studying the microbiota informs us about the host but in a comparative framework, and we can learn about evolutionary processes above the individual as well (e.g., Ley et al., 2008b). The data for this paper were obtained without additional sampling effort; every result represents information that would have been lost had we not collected the intestinal contents for this purpose. The methodologies herein can relatively easily be incorporated into field protocols, when vertebrate specimens are being prepared for preservation in research collections. The addition of microbial fingerprints to studies of natural history and evolution provides a non-traditional glimpse into some very important biodiversity. Our protocol for field sampling added an average of 5-10 min to specimen preparation time and consisted of isolating the intact intestinal tract, tying an undisturbed portion off at both ends, clipping the intestines on the outside of the ties and storing in liquid nitrogen. Other field researchers may want to include microbiome components in "macrodiversity" studies, even if they cannot process the samples immediately because doing so adds and preserves a novel and frequently entirely unknown dimension of data to vouchered museum specimens.

DATA ACCESSIBILITY
Sequence data, metadata file and an OTU table have been uploaded to FigShare available at figshare.com/s/fe2026007 37e11e58a2906ec4bbcf141.

ACKNOWLEDGMENTS
The authors would like to thank NSF (DEB-0956069 and DEB-0841729), the AMNH Frank Chapman Grant (to CS) and Sigma Xi Grants-in-Aid-of-Research (to SH) for funding this project. The UC Davis Open Access Fund and the UC Davis Chancellor's Postdoctoral Fellowship helped fund publishing of the manuscript. We thank Sistema Nacional de Areas de Conservación (SINAC) and Consejo Nacional para la Gestión de la Biodiversidad (CONAGEBIO) for permits to do field work and obtain the samples in Costa Rica. We thank Gilbert Barrantes for allowing us access to work at Museo de Zoología, Universidad de Costa Rica, Bernal Rodrígues for granting access to Reserva Biológica Tirimbina, and Reinaldo Aguilar at Los Charcos de Osa. We thank Michael Harvey, Gustavo Bravo, Diego Ocampo, and Karla Conejo for field assistance. We also thank the editors and reviewers at Frontiers for improving the manuscript with their suggestions.