Shared and Distinct Features of Human Milk and Infant Stool Viromes

Infants acquire many of their microbes from their mothers during the birth process. The acquisition of these microbes is believed to be critical in the development of the infant immune system. Bacteria also are transmitted to the infant through breastfeeding, and help to form the microbiome of the infant gastrointestinal (GI) tract; it is unknown whether viruses in human milk serve to establish an infant GI virome. We examined the virome contents of milk and infant stool in a cohort of mother-infant pairs to discern whether milk viruses colonize the infant GI tract. We observed greater viral alpha diversity in milk than in infant stool, similar to the trend we found for bacterial communities from both sites. When comparing beta diversity, viral communities were mostly distinguishable between milk and infant stool, but each was quite distinct from adult stool, urine, and salivary viromes. There were significant differences in viral families in the infant stool (abundant bacteriophages from the family Siphoviridae) compared to milk (abundant bacteriophages from the family Myoviridae), which may reflect significant differences in the bacterial families identified from both sites. Despite the differences in viral taxonomy, we identified a significant number of shared viruses in the milk and stool from all mother-infant pairs. Because of the significant proportion of bacteriophages transmitted in these mother-infant pairs, we believe the transmission of milk phages to the infant GI tract may help to shape the infant GI microbiome.


INTRODUCTION
The human body carries communities of microbes that inhabit virtually every available niche on both its inner and outer surfaces (Ding and Schloss, 2014). Most studies to date have focused on bacteria that inhabit body surfaces; however, a growing number of studies suggest that viruses also play an important role in health and disease (Abeles et al., 2015;Reyes et al., 2015;Columpsi et al., 2016;Ly et al., 2016;Duranti et al., 2017;Wylie, 2017). Studies of viruses inhabiting human body surfaces have generally suffered from a strong focus on DNA rather than RNA viruses, biases toward characterization of non-enveloped viruses, specimen preservation techniques that may reduce representation of certain viruses, and a relative lack of quantitative techniques. Despite these limitations, human virome studies have clearly demonstrated that viruses inhabit all the same niches as those inhabited by bacteria. For example, viruses inhabit the mouth Ly et al., 2014), respiratory tract (Wylie, 2017), gastrointestinal (GI) tract (Columpsi et al., 2016), bladder (Santiago- Rodriguez et al., 2015a), skin (Hannigan et al., 2015), and even the blood (Sauvage et al., 2016) of healthy individuals. The significant representation of viruses on the human body surfaces that have been studied to date suggests that there are no parts of the human body free of viruses.
The human microbiome plays a significant role in human health and disease, including its potential role in immune maturation (Gensollen et al., 2016;Laforest-Lapointe and Arrieta, 2017), direct pathogenicity (Eloe-Fadrosh and Rasko, 2013;Sanz et al., 2015), and in protection against pathogen invasion (Edelblum et al., 2017). Studies of the virome in early life are limited (Breitbart et al., 2008;Lim et al., 2015;Duranti et al., 2017). Increasing attention has recently been focused on the extent to which the microbiome may be shared amongst close contacts, so that the microbiome of an individual may have characteristics that are relatively individual-specific, but that individuals have the capacity to share their microbiota with their environment (Lax et al., 2017) and with their close contacts in those environments (Song et al., 2013;Lax et al., 2014). We recently studied a cohort of genetically unrelated individuals who lived together and identified significant numbers of shared bacteria within each household . These individuals also shared viruses in both their oral and GI microbiomes , suggesting that passive contact amongst individuals who share the same environments may result in the sharing of viruses. Because many of the viruses inhabiting these microbiomes are bacteriophages, their ability to kill bacteria or provide them with potentially beneficial gene functions may not only shape their own microbiomes, but may also shape the microbiomes of their close contacts.
There may be no closer relationship than the bond that is formed between mother and infant. With significant focus on shared microbes amongst close contacts, the motherinfant relationship warrants further investigation because of the potential implications that shared microbes could have for infant microbiota development (Pannaraj et al., 2017), immune development (Duerkop et al., 2009;Bengmark, 2013), and for the gut-brain axis, which may affect infant neurological development (Cong et al., 2016;Sharon et al., 2016;Yang et al., 2016). Human milk harbors its own microbiome (Hunt et al., 2011), which could play a role in development of the infant GI tract microbiome via breastfeeding (Guaraldi and Salvatori, 2012;Funkhouser and Bordenstein, 2013). The existence of a microbiota in human milk began with cultivation studies demonstrating the presence of bacteria such as Streptococcus and Staphylococcus in human milk (Heikkila and Saris, 2003). These types of analyses were extended through 16S rRNA sequencing to include additional microbes such as Bifidobacterium and Lactobacillus (Martin et al., 2007;Jost et al., 2015). Finally, next-generation sequencing has extended the analysis of 16S rRNA to demonstrate the presence of substantial bacterial diversity beyond that which could be identified through prior conventional means (Hunt et al., 2011) (Backhed et al., 2015. Infants acquire nutrients such as vitamins, minerals, and fatty acids through ingestion of breast milk (Jost et al., 2015;Nongonierma and FitzGerald, 2015;Eriksen et al., 2018), and have been hypothesized to aid in the establishment of the infant GI tract microbiome (Backhed et al., 2015). Our recent studies have shown that infant GI microbiota more closely reflect the milk of their mothers than they do to unrelated mothers, indicating that microbes from human milk likely are early infant GI colonizers and may form part of the foundation of the infant GI microbiome (Pannaraj et al., 2017). This relationship is especially important because it develops through the infant ingesting human milk, which has the capacity not only to transmit bacteria, but also to potentially transmit viruses. There is some evidence that Bifidobacteria bacteriophages may be transmitted to infants through human milk (Duranti et al., 2017) but no specific studies characterize the human milk virome nor which of the milk virome members may be transmitted to infants. We characterized the viromes from human milk and infant stool of a cohort of mother-infant pairs, with the goal to: (1) characterize the virome of milk, (2) identify similarities and differences in the diversity of viruses in milk and infant stool viromes, (3) decipher whether milk and infant stool viromes are similar taxonomically, and (4) establish the extent to which viruses may be transmitted to the infant GI tract via milk.

Human Subject Characteristics
We recruited 10 healthy mother-infant pairs from the community around Children's Hospital Los Angeles and collected milk from the mothers and stool from each infant. Characteristics of mothers and infants are described in Supplementary Table S1. All mothers self-identified as Hispanic. Mothers' mean age was 27 years ± standard deviation (SD) 6.8 years. Milk and stool were collected at a single time point between 4 and 10 days (mean 7.4 ± SD 1.7 days) of the infants' birth. Infants were born between 37 and 41 weeks' gestation. Seven of the ten infants were born via vaginal delivery, with three born via Cesarean section. Four of the mothers were exposed to peripartum antibiotics; however, none of the infants were treated with antibiotics prior to the collection of stool. Each of the mothers was actively breastfeeding at the time of the milk and stool collections, but only three of the infants had never received formula.

Processing of Viromes and Microbiomes
We processed the viromes from milk and infant stool according to our previously described protocols through sequential filtration followed by cesium chloride density gradient centrifugation to enrich the viral fraction present in each sample type Ly et al., 2016). We produced 9,072,824 ± 749,832 (mean ± standard deviation) reads per subject and sample type and assembled those reads into larger contigs for downstream analysis. Additional details of the read numbers and the assemblies are shown in Supplementary  Table S2. There was minimal GC content variation present amongst contigs in milk viromes, while there was substantial variability from sample to sample in median GC content of the infant stool virome contigs (Supplementary Figure S1). Most of the contigs had homology to viruses (85.7 ± 3.3%), some had homology to bacteria (14.3 ± 3.3%), and few had no known homologies (0.1 ± 0.1%) (Supplementary Figure S2). These findings together indicate that the infant stool and milk samples were substantially enriched for viruses and the GC content differences in sample types suggests greater variability in viromes in infant stool compared to milk. We also characterized the bacterial communities using the V1-V2 segment of 16S rRNA gene for comparison  and produced 116,079 ± 21,005 sequence reads per subject and sample type.

Analysis of Viral and Bacterial Alpha Diversity in Human Milk and Infant Stool
We characterized the alpha diversity present in the viromes and the bacterial biota of both milk and infant stool to discern whether there may be differences in the numbers of viral genotypes and bacterial operational taxonomic units (OTUs) present based on body site. Viral diversity using the homologous virus diversity index (HVDI) (Santiago- Rodriguez et al., 2015b) did not significantly differ between milk and stool (p = 0.26) ( Figure 1A). We developed the HVDI based on the Shannon Index (Gotelli and Colwell, 2001) because it allows us to measure viral diversity in a similar manner by which we analyze alpha diversity in bacterial communities. In our analysis of bacterial diversity, we found that there was significantly greater diversity in maternal milk compared to infant stool (p = 0.03) ( Figure 1B). While most of the milk and infant stool specimens had similar bacterial diversity amongst the subjects studied, there were a few infant stool specimens that had relatively low virome alpha diversity compared to the others studied (Supplementary Figure S3). There also were a slightly greater number of estimated viral genotypes present in milk than were estimated in infant stool (Supplementary Figure S4). These results indicate a greater level of viral diversity present in milk than in infant stool, with both below the levels of viral diversity generally observed in adult urine (Santiago-Rodriguez et al., 2015a), stool and saliva (Abeles et al., 2015).

Comparisons of Beta Diversity in Human Milk and Infant Stool
We characterized the beta diversity amongst the viromes and the bacterial biota for all mothers and infants to discern whether there were observable patterns specific to sample type and whether the samples were distinct from other sample types for which virome sequences were available. When trends in beta diversity were observed by Principal Coordinates Analysis (PCoA), the majority of the milk and infant stool viromes clustered distinctly (Figure 2A). Most of the milk specimens formed a homogenous cluster, but specimens from mother/infant pairs 8, 9, and 10 were somewhat distinct. The separate clustering of the specimens could not be explained by delivery method, breastfeeding exclusivity, or antibiotic exposure. These findings were consistent even when a larger group of specimens were characterized ( Figure 2B), which demonstrated that the milk FIGURE 1 | Bar graphs representing Shannon diversity and Homologous viral diversity in infant stool and human milk. (A) Homologous viral diversity index (± standard error) based on virome contigs. (B) Shannon diversity (± standard error) based on 16S rRNA, p-values are shown above each bar. The x-axis represents milk or infant stool, and the y-axis represents Shannon diversity or homologous viral diversity. and infant stool specimens had observable differences from adult urine, saliva, and stool obtained from unrelated adults enrolled in previous studies (Robles-Sikisaka et al., 2013;Abeles et al., 2015;Santiago-Rodriguez et al., 2015a). The infant stool specimens appeared more similar to many of the milk specimens than they did to adult stool ( Figure 2B). These results suggest that there are shared features between infant stool and milk viromes, likely as a result of milk ingestion by the infants. When examining the bacterial biota, the milk formed a rather homogenous cluster and the infant stool formed a more heterogeneous group (Figure 3).

Taxonomic Analysis of Viral and Bacterial Communities
We next characterized the taxonomic compositions of the viromes and bacterial biota of both the infant stool and FIGURE 2 | Principal coordinates analysis of beta diversity in viral communities including specimens from human milk and stool are shown in A, and viral communities including specimens from human milk (blue triangles), infant stool (red triangles), adult saliva (green circles), adult stool (yellow triangles), and adult urine (cyan squares) are shown in B. milk specimens to determine whether we may observe similar taxonomic trends amongst the mother-infant pairs. To discern trends in the taxonomic compositions of viromes, we used TBLASTX hits against the viral database present at the NCBI to identify putative viruses present in the milk and infant stool. We categorized each virus according to the family designation of their best hit (Figure 4). A majority of the viruses in both infant stool (95.5 ± 3.2%) and milk (95.2 ± 2.8%) were predicted to be bacteriophages, with a relatively small minority from infants (4.5 ± 3.2%) and mothers (4.8 ± 2.8%) predicted to be eukaryotic viruses (Supplementary Figures S5). In the infant stool, the most abundant virus family was predicted to be Siphoviridae (contractile-tailed bacteriophages that often have primarily lysogenic lifestyles) ( Figure 5). In the milk specimens, the most abundant viruses were predicted to be from the Myoviridae family (tailed phages that often have primarily lytic lifestyles), but a few subjects also were predicted to have abundant Podoviridae (short tail stubs that often have primarily lytic lifestyles). The difference in the relative abundance of Siphoviridae between milk and infant stool was statistically significant (p = 0.004); however, the observed trends in differences in Myoviridae and Podoviridae between milk and stool did not meet statistical significance (p = 0.091 for Myoviridae and p = 0.130 for Podoviridae). These results highlight that there are substantial differences in the representation of bacteriophages in human milk and infant stool.
Given the high abundance of bacteriophages among the viruses, we evaluated to see if there were similar composition changes in the bacteria biota. We observed that the infants were generally dominant in anaerobes in their stool, with most having a large abundance of Bacteroides and some having a large abundance of the Firmicute Veillonella (Figure 6 and Supplementary Figure S6). In contrast, most of the milk specimens were abundant in the Firmicute Streptococcus. While Streptococcus could be observed in the stool, and both Veillonella and Bacteroides could be observed in the milk, they generally were not of high relative abundances. Mother/infant pairs 8, 9, and 10 had observable differences in virome taxonomy (Figure 4) and beta diversity (Figure 2A) compared to the other pairs, the same from mother/infant pairs also had distinctive bacterial taxonomies compared to the other pairs ( Figure 6). These results suggest that while some bacteria may have been shared between milk and the infant GI tract, their relative abundances generally were not shared.

Shared Viral Contigs Between Human Milk and Stool
While our analysis identified distinct features of viromes in their taxonomy, alpha and beta diversity between milk and infant stool, the patterns visualized by PCoA suggested that there also were some shared features in the two specimen types. To determine whether there were shared features of these viromes between mother and infant, we quantified the numbers of viral contigs shared in each mother-infant pair. We measured this by creating global assemblies from all of the viral sequence reads obtained from both the mother and infant, and then quantified FIGURE 3 | Principal coordinates analysis of beta diversity in bacterial communities (16S rRNA) of human milk (blue circles) and infant stool (green squares).
FIGURE 4 | Bar graphs of putative viral family assignments. The y-axis represents the percentage of viral contigs assigned to each viral family, and the x-axis represents stool (left) from each infant studied and milk (right) from each mother studied. the contribution to each viral contig assembled from the mother, infant, or both the mother and infant. There were a number of contigs that were created from reads from the mother alone or the infant alone, but approximately 30% were created using reads from both the mother and infant, suggesting that these viruses were present in both (Figure 7). While the relative number of these contigs varied in each mother-infant pair, we identified viral contigs constructed from both milk viromes and infant stool viromes in all pairs studied (Supplementary Figure S7). To confirm these results, we utilized a separate method based on BLASTN homology amongst the viral contigs in both mother and infant, and performed a permutation test (Robles-Sikisaka et al., 2013) to decipher whether the proportion of highly homologous viral contigs within a mother-infant pair was significantly greater than would be observed between mother-infant pairs. In all 10 pairs, there was a significant proportion (p < 0.0001) of homologous viral contigs within a mother-infant pair than would be expected by comparisons between mother-infant pairs, indicating a significant conservation of viral contigs in each pair ( Table 1). As was observed through assembly analysis (Supplementary Figure S7), there was substantial variation in the proportions of homologous viruses observed within each mother-infant pair ( Table 1), suggesting that the same number of viruses were not conserved between mother and infants across the different pairs. While the proportion of homologous viruses was relatively small in some pairs (range from 3.1 to 18.4%), the results were statistically significant in each pair. We characterized some of the viral assemblies in these subjects and found that many represented bacteriophages that could be assembled with high average coverage from both the milk and the infant stool (Supplementary Figure S8). For example, Pair #1 shares a 15,330 nt bacteriophage in both milk and stool with significant sequence homology to Burkholderia ambifaria phage BcepF1 (Supplementary Figure S8A). Interestingly, Pair #2 shares a different 16,602 nt bacteriophage that also has significant homology to B. ambifaria phage BcepF1 (Supplementary Figure S8B). While these subjects share this likely Myovirus in both milk and infant stool, Pair #6 shares a 14,064 nt Enterobacteria phage that has high homology to Siphoviruses (Supplementary Figure S8C). Each of these viruses likely are of high abundance in both the milk and infant stool, as their average coverage ranges from 744X to 25,625X in each subject. We previously have demonstrated that close contacts can share portions of their viromes ; therefore, the sharing of bacteriophages between mother and infant, is consistent with our previous findings and suggests that the sharing of viruses between close contacts is a common phenomenon.

DISCUSSION
One of the most distinct features of the human microbiome is the extent to which it may vary by body site. Indeed, it has been known for quite some time that the bacterial microbiome of the mouth is distinct from the distal colon, is distinct from the skin, and so on. This phenomenon has been investigated to a much lesser extent for the human virome. It is clear FIGURE 5 | Bar graphs (± standard error) demonstrating the proportion of viral contigs assigned to specified virus families in infant stool (black bars) and human milk (gray bars). The y-axis represents the percentage of contigs assigned to each Family, and the x-axis represent the different virus families with bacteriophage families located on the left and eukaryotic virus families located on the right. The " * " represents values that are statistically significant with p-values ≤0.01.
that there are distinctions between oral viromes based on their particular oral biogeographic niche , and that there are considerable differences in oral and stool viromes (Abeles et al., 2015), but less data exists for the viromes of other body sites. Because many of these viromes consist of considerable numbers of bacteriophages, it is reasonable to postulate that their phages may differ by body site, which is what we observed in our analysis of milk and infant stool (Figures 4, 5). While predicting the putative hosts of phages can be problematic, here we use their homologies to phage families to garner more information about the constituents of these phage a Based on the mean ± standard deviation of 10,000 iterations. 1,000 random contigs were sampled per iteration. b Empirical p-value based on the fraction of times the estimated percent homologous reads within each pair exceeds that for different pairs.
FIGURE 6 | Bar graphs of bacterial communities based on 16S rRNA at the Class level. The y-axis represents the percentage of OTUs assigned to each bacteria Class, and the x-axis represents stool (left) from each infant studied and milk (right) from each mother studied. The Firmicutes from the Bacilli Class belong to the Genus Streptococcus, and the Firmicutes from the Clostridia Class belong to the Genus Veillonella.
FIGURE 7 | Bar graphs (± standard error) representing the relative proportion of viral contigs in each mother-infant pair assembled that include contributions from the human milk, contributions from the infant stool, or contributions from both milk and stool.
communities. The significant contrast in the constituents of milk and infant stool strongly suggested considerable differences in these viromes, yet the proportion of shared viruses in the milk and the infant stool was higher than expected based on taxonomy. We believe that a great deal of the explanation for this discrepancy is that presence/absence of viruses in the milk and stool does not necessarily reflect relative abundances. Therefore, while the taxonomic compositions of the milk and stool remain considerably different when examining the abundances of the viruses, the presence of certain phages in both sample types demonstrates a relatedness that would be overlooked if only considering relative abundances. We speculate that this phenomenon was apparent when examining the beta diversity of the viromes, where there were substantial similarities in the milk and infant stool that was quite distinct from adult stool, salivary, and urine specimens. We have observed significant differences in the bacterial biota on different body surfaces, but the milk and infant stool offer some of the most significant differences in virome constituents that we have observed. The relative predominance of Siphoviridae in infant stool compared to the predominance of Myoviridae in the milk was significant. While taxonomic features could represent lifestyle differences amongst the viral populations, where the viruses in milk have more predominantly lytic lifestyles compared to more lysogenic lifestyles in the infant stool, these differences also could reflect the differences in bacterial taxonomy, where Bacteroides and Veillonella were predominant in the infant stool and may have been more likely to have viruses from the Siphoviridae family. A more thorough examination of the structural features of the individual viruses identified in milk and infant stool may help to elucidate the lifestyles of these viruses by identifying whether many contain lysogenic modules. While a minority of the viruses we identified were predicted to be eukaryotic, there are methodological reasons that could lead to their underrepresentation. For example, larger viruses such as herpesviruses may be filtered out in virome preparations that use 0.2 µ filters to reduce potential bacterial contamination (Conceicao-Neto et al., 2015). Surprisingly, despite filtering viruses homologous to larger phycodnaviruses were readily identified (Figure 4). These viruses have previously been identified in the human oropharyngeal virome (Yolken et al., 2014), and could potentially represent contributions from infant saliva to maternal milk.
We previously examined the sharing of viruses in a cohort of unrelated cohabitating individuals and found that there was considerable sharing of viruses in relatively short time periods . We found evidence of sharing of both stool viruses and salivary viruses even among individuals who were not romantic couples. These data suggested that either close contact or shared environmental reservoirs resulted in the sharing of substantial numbers of viruses between individuals. That study differs in that we were measuring sharing amongst the same specimen types. For example, we measured shared stool viruses amongst the individuals in the cohort, assuming that viruses in the lower GI tract of 1 individual would have a significant propensity to colonize the lower GI tract of their housemate. In this study, we measured the sharing of viruses amongst different sample types, with the understanding that viruses that inhabit milk may transfer to the infant GI tract but not have the ability to colonize the infant distal colon. It is possible that viruses also were shared from the infant oral cavity into milk (West et al., 1979); however, because the milk volume ingested by the infant may have exceeded the spread of infant saliva into milk, transmission from milk to the infant appears more plausible. While we were able to show evidence of the same viruses colonizing the distal GI tract and the milk of the mothers, the degree of sharing was less than what we observed in the stool of unrelated housemates in that study (roughly 30% were shared at 6 months in that study . Because prior studies have demonstrated that bacteriophages may bind and persist on mucosal layers (Barr et al., 2013), such as those found in the distal colon, we cannot be absolutely certain that these viruses are truly replicating in the infant GI tract, but rather they may be binding and persisting on mucosal layers of the infant until those layers slough off. A longitudinal study to characterize the persistence of the phage in the infant distal colon would provide greater evidence that they may be persisting due to ongoing replication.
In our prior studies, we have sequenced viromes using Semiconductor Sequencing techniques that generally provide fewer sequence reads and less overall coverage of the viromes. In this study, we sequenced these viromes using the Illumina MiniSeq, which provided about 16-20x the coverage that we have achieved with these viromes in the past. We have a few observations from the switch in sequencing modalities, including: (1) there were not greater numbers of contigs created with the additional coverage, which supports our prior findings that the lower sequencing depth was likely adequate to characterize the contents of these viromes, (2) diversity estimates did not appear to be affected, as roughly the same proportions of reads were assembled into contigs as were found in the viromes with the lower sequencing depth, (3) the average length of the contigs was not longer with the additional sequencing depth, suggesting that sequencing depth is not the limiting factor in our ability to assemble high quality and potentially complete viral genomes, and (4) contaminants even at the relatively low levels present in these viromes become apparent with greater sequencing depth. We limited our analysis to assembled contigs >=2000 nucleotides to reduce the effects of contamination, and in doing so, identified that 85.7% of the contigs assembled had BLASTX homologs to known viral structural or replication genes.
There were a few limitations that affected the conclusions we could draw from the data in this study. Those limitations included the relatively small sample size, the cross-sectional nature of the sampling, and the characteristics of the cohort. While the intent of the study was primarily to provide the first characterization of the virome of human milk and to decipher whether viruses found in milk also could be found in the infant gut, a larger cohort may have provided greater insights into whether specific demographics and traits of the mothers and infants may have affected virome contents and virus sharing. All the mothers in this study self-identified as Hispanic, which limits whether the results of this study may be generalized across diverse populations. A longitudinal study may have definitively resolved whether viruses shared between mother and infant were the result of persistent colonization or were more likely a result of transient passage through the GI tract. Because only 4/10 infants were exclusively breastfed (Supplementary Table S1), while all mother/infant pairs shared viruses, it is less likely that viruses were shared solely as the result of transient GI track passage. It is possible that formula may also have a virome, which also could have contributed to the results. Despite the relatively small cohort size and variations in cohort characteristics, this study still provides the first insights into the role of the human milk virome in the establishment of the infant GI tract virome.
As scientists continue to characterize the microbiome of various body surfaces, it is becoming increasingly clear that few if any human body surfaces are not inhabited by microbes. In particular, it has been known for quite some time that human milk has both bacteria (Jeurink et al., 2013) and some viruses (Michie and Gilmour, 2001;Stiehm and Keller, 2001;Duranti et al., 2017). The presence of bacteria in milk can be traced to the third trimester of pregnancy and continues through lactation (Rodriguez, 2014). These bacteria have been hypothesized to be derived from the mother's skin microbiota (Ramsay et al., 2004(Ramsay et al., , 2005, the mother's gut via an entero-mammary pathway (Perez et al., 2007;Jimenez et al., 2008;Abrahamsson et al., 2009;Jost et al., 2013Jost et al., , 2014, commensal microbiota that inhabit human breast tissue (Urbaniak et al., 2014;Xuan et al., 2014), and the infant's oral cavity via suckling behavior (West et al., 1979). The origin of viruses in the infant GI tract has been hypothesized to occur through vertical transmission from mother to infant, with breast milk as the likely means of the transmission (Duranti et al., 2017). The data presented in this study supports that breast milk may be responsible for a portion of the viruses identified in the infant GI tract.
The relationship between mother and infant has gained greater attention as researchers focus on the role of the microbiome in gut-brain axis development and the role of the microbiome in the development of the infant immune system (Duerkop et al., 2009;Bengmark, 2013;Cong et al., 2016;Yang et al., 2016). Prior studies show that infants born vaginally are colonized by a number of microbes that ultimately may play a role in infant development (Bokulich et al., 2016). The data presented here add an additional layer of complexity to the potential role of the microbiome in infant development, as it has identified a virome unique to milk that may then be transmitted to the infant. While most of the viruses identified in this study were bacteriophages, there were some Eukaryotic viruses also identified, and these have previously been shown to play some role in the development of the immune system in germ-free mice (murine norovirus). Even if most of the viruses observed in this study are not directly involved in immune development or communication along the gut-brain axis, their role as natural perturbations could be important in determining the bacterial constituents of the infant GI tract in its earliest stage of development.

Human Subjects and Culture Conditions
Human subject involvement in this study was approved by the Institutional Review Board at Children's Hospital Los Angeles. Written informed consent was obtained from all mothers and financial compensation was provided for their participation in this study. We recruited healthy mothers and their full-term infants from the Los Angeles community and recorded mother and infant ages, comorbidities, current and recent antibiotic use, and infant feeding characteristics. We collected milk from mothers and stool from the infants' soiled diapers during study visits. Visits occurred during all times of the day depending on the mother's availability. Mothers were instructed to clean the breast per their normal routine prior to feeding. Milk (1-2 ounces) was obtained either through manual expression or electric breast pumps using sterilized connectors and sterile bottles for collection. We collected from a total of 10 mother/infant pairs, and both milk and stool was stored at −80 • C until processed in this study.

Analysis of 16S rRNA
Genomic DNA was prepared from the maternal milk using the Qiagen QIAamp DNA MINI kit (Qiagen, Valencia, CA, United States) and from the stool using the QIAamp DNA Stool MINI kit. Each sample was subjected to a bead beating step prior to nucleic acid extraction using Lysing Matrix-B (MP Bio, Santa Ana, CA, United States). We amplified the bacterial 16S rRNA gene V1-V2 hypervariable region using the forward primer 8F (AGAGTTTGATCCTGGCTCAG) fused with the Ion Torrent Adaptor A sequence and one of 70 unique 10 base pair barcodes, and reverse primer 357R (CTGCTGCCTYCCGTA) fused with the Ion Torrent Adaptor P1 from each subject and sample type (Whiteley et al., 2012). PCR reactions were performed using Platinum High Fidelity PCR SuperMix (Invitrogen) with the following cycling parameters: 94 • C for 10 min, followed by 30 cycles of 94 • C for 30 s, 53 • C for 30 s, 72 • C for 30 s, and a final elongation step of 72 • C for 10 min. Resulting amplicons were purified on a 2% agarose gel stained with SYBR Safe (Invitrogen) using the Qiagen MinElute PCR Purification Kit. Amplicons were further purified with Ampure XP beads (Beckman-Coulter), and molar equivalents were determined for each sample by quantifying the amplicons using PicoGreen (Invitrogen) using a plate reader. Samples were pooled into equal molar proportions and sequenced on 316 chips using an Ion Torrent PGM according to manufacturer's instructions (Life Technologies) (Rothberg et al., 2011). Resulting sequence reads were removed from the analysis if they were <180 nucleotides or >400 nucleotides, had any barcode or primer errors, contained any ambiguous characters, or contained any stretch of >8 consecutive homopolymers. Sequences then were trimmed according to any site that had a Phred Score of less than 15. Sequences then were assigned to their respective samples based on a 10 nucleotide barcode sequence using Ion Assist 1 .
We sequenced an average of 116,079 reads from each sample, and analyzed the sequence data using Quantitative Insights Into Microbial Ecology (QIIME 1.5) (Caporaso et al., 2010). Representative OTUs from each set were chosen at a minimum sequence identity of 97% using the QIIME script pick_otus_through_otu_table, which uses the Greengenes database (DeSantis et al., 2006). PCoA was performed based on beta diversity using weighted UniFrac distances (Lozupone et al., 2006) using the QIIME script beta_diversity_through_plots. The results of the beta diversity distance matrices were used to determine the weighted UniFrac distances between different samples and sample groups. Alpha diversity based on the Shannon Index (Gotelli and Colwell, 2001) also was performed using the QIIME pipeline. Differences in the relative abundances of taxa between subject groups were determined using the Mann Whitney U test using MaxStat Pro 2 .

Virome Preparation and Sequencing
Stool viromes were prepared by diluting 0.4 g of stool in 4 mL of SM buffer (100 mM NaCl, 8 mM MgSO4, and 50 mM Tris pH 7.5), vortexing for 40 min to separate viral particles, spun at 4,000 × g for 10 min at 4 • C to pellet the remaining solid material, and the supernatant saved from each specimen. The milk specimens were centrifuged at 4,000 × g for 10 min at 4 • C, and the supernatant was then treated in an identical manner as we have developed for saliva specimens . Supernatants from stool and milk were filtered sequentially using 0.45 and 0.2 µm cellulose acetate filters (GE Healthcare Life Sciences) to remove cellular and other debris, and then purified on a cesium chloride gradient according to previously described protocols (Pride et al., 2012). Only the fraction with a density corresponding to most known bacteriophages (Murphy et al., 1995) was retained, further purified on Amicon YM-100 protein purification columns (Millipore, Inc.), treated with 2 units of DNase I, and subjected to lysis and DNA purification using the Qiagen UltraSens Virus kit (Qiagen). Recovered DNA was screened for the presence of contaminating bacterial nucleic acids by quantitative 16S rRNA gene PCR using primers 8F (AGAGTTTGATCCTGGCTCAG) and 357R (CTGCTGCCTYCCGTA) in Power SYBR Green PCR Mastermix (Thermo Fisher Scientific) (Abeles et al., 2015). No products were detected in any of the viromes after 30 cycles, which does not exclude the presence of contaminating bacterial nucleic acids, but indicates that they were not present at dominant levels. Viral DNA then was amplified using GenomiPhi Hy MDA amplification (GE Healthcare Life Sciences), and specimens prepared for sequencing using the Nextera DNA Library Prep XT kit (Illumina, Inc) according to manufacturer instructions. The size of products were determined using a High Sensitivity DNA Kit on a Bioanalyzer (Agilent Technologies), and quantified using a High Sensitivity Double Stranded DNA kit on a Qubit Fluorometer (Thermo Fisher Scientific). DNA from each specimen was pooled into equimolar proportions and sequenced on the Illumina MiniSeq Instrument (Illumina, Inc), producing an average of 9,072,824 paired-end reads per specimen. For quality control on the sequence reads, we performed the following steps: (1) we trimmed sequence reads with Phred scores <30, (2) we removed any reads with ≥8 consecutive homopolymers, (3) we removed reads <50 nucleotides or >300 nucleotides in length, and (4) we removed reads that contained any ambiguous nucleotides. Each virome was screened for contaminating nucleic acids using BLASTN analysis (E-score <10 −5 ) against the human reference database available at ftp://ftp.ncbi.nlm.nih. gov/genomes/H_sapiens/. Any reads with significant sequence similarities to human sequences were removed prior to further analysis using Ion Assist 3 .

Virome Analysis
All reads were assembled using CLC Genomics Workbench 9.5.3 (Qiagen) based on 98% identity with a minimum of 50% read overlap, which were more stringent than criteria developed to discriminate between highly related viruses (Breitbart et al., 2002). Because the average and median read lengths were 150 nucleotides, the minimum tolerable overlap was approximately 75 nucleotides. Assemblies created containing reads from milk and infant stool were assembled using the same parameters. The consensus sequence for each contig was constructed according to majority rule, and any contigs <2000 nucleotides were removed prior to further analysis. Contigs were annotated using BLASTX against the NCBI NR database with an E-value cutoff value of 10 −5 . Specific viral sequences were identified using Ion Assist 3 by parsing BLASTX results for known viral genes including replication, structural, transposition, restriction/modification, hypothetical, and other genes previously found in viruses for which the E-value was at least 10 −5 . Each individual virome contig was annotated using this technique; however, if the best hit for any portion of the contig was to a gene with no known function, lower level hits were used as long as they had known function and still met the E-value cutoff. ORF prediction was performed using FGenesV (Softberry Inc, Mount Kisco, NY, United States), and putative functions assigned by BLASTP homology against the NR database (E-score <10 −5 ). If the best hit was to a gene with no known function, lower level hits were used for the annotation as long as they had known putative 3 www.thepridelaboratory.org function and still met the E-score cutoff (10 −5 ). Virus types were determined by parsing the virus families from the TBLASTX best hits of each viral contigs with an E-value <10 −20 . Statistically significant differences in the representation of viruses between groups were determined by two-tailed t-tests.
Analysis of shared sequence similarities present in each virome was performed by creating custom BLAST databases for each virome, comparing each database with all other viromes using BLASTN analysis (E-value <10 −10 ), and these compiled data used to calculate beta diversity with Bray Curtis distances using QIIME (Caporaso et al., 2010). These distances were used as input for PCoA. We determined putative sharing of viruses by constructing assemblies from both the milk and infant stool reads from each subject, and then deciphered the contribution to each resulting contig from milk reads and the infant stool reads, similar to techniques we have previously described Ly et al., 2016). We utilized this technique to decipher those contigs that were unique to milk or infant stool, and those shared between milk and infant stool in each mother-infant pair. Statistical significance was determined by comparisons between groups by the Mann Whitney U test using MaxStat Pro 4 . We also performed a permutation test (10,000 iterations) to assess whether infant stool and milk viromes had significant overlap in each mother-infant pair using Ion Assist 4 (Robles- Sikisaka et al., 2013Abeles et al., 2014;Ly et al., 2014;Naidu et al., 2014). We simulated the distribution of the fraction of shared virome homologs between milk and infant stool within each mother-infant pair. For each pair, we computed the summed fraction of shared homologs using 1000 random contigs between randomly chosen between milk and infant stool of different mother-infant pairs, and from these computed an empirical null distribution of our statistic of interest (the fraction of shared homologs). The simulated statistics within each pair across all time points were referred to the null distribution of inter-pair comparisons, and the p-value was computed as the fraction of times the simulated statistic for the each exceeded the observed statistic.

Viral Diversity
To measure alpha diversity in the viral communities, we utilized a technique termed the HVDI. The technique is based on finding high levels of homology amongst contigs within viromes that likely belong to the same virus but were placed into separate contigs due to the limitations of the assembly process . Virome reads were assembled using 98% identify over a minimum of 50% of the read length using CLC Genomics Workbench 4.65 (CLC bio USA, Cambridge, MA, United States), and the resulting contig spectra utilized as the primary input for the index. We created custom nucleotide BLAST databases for each subject that contained all their contigs. We then used BLASTN analysis to find high levels of homology (e-score <10 −20 ) between different contigs within the same subject. We accepted only high levels of homology that spanned at least 50% of the length of the shorter contig being compared. All contigs in each subject were treated as nodes and those contigs that had high homology to other contigs in the same subject were added to a network by directing edges between the nodes. After evaluating homologies among all intra-subject contigs, networks formed from directed edges/nodes were assigned to individual viruses and nodes with no associations were considered singular viruses. For each resulting network, we added the number of reads assigned to each node on the network and the combined number of reads was used to represent the relative abundance of the virus represented by that network. The relative abundances of all viruses were calculated using this technique, and a new contig spectra representing the viral population in each subject was formed. The contig spectra from each subject then were used as surrogates for population structures and input directly into the Shannon Index (Gotelli and Colwell, 2001) to estimate diversity.

DECLARATIONS Ethics Approval and Consent to Participate
Subject recruitment and enrollment were approved by the Human Research Protection Programs at the Children's Hospital Los Angeles under project # CCI-12-00044. All subjects signed written informed consent indicating their willingness to participate in this study and willingness to allow us to publish our findings. Each subject was compensated for their participation in this study.

Consent to Publish
The consent forms signed by each participant included their consent to allow us to publish our findings.

AVAILABILITY OF DATA AND MATERIALS
All sequences are available for download in the Sequence Read Archive under accession number SRP127558 (https://www.ncbi. nlm.nih.gov/sra/?term=SRP127558). Ion Assist can be found at www.thepridelaboratory.org and runs on Windows XP or higher.

AUTHOR CONTRIBUTIONS
PP and DP conceived and designed the experiments. PP and CC recruited subjects. ML, MS, KJ, and AS performed the experiments. PP and DP analyzed the data. PP contributed reagents, performed the examinations. PP, GA, and DP wrote and critically reviewed the manuscript.

FUNDING
Supported by the K23 HD072774-02 to PP and the Doris Duke Charitable Foundation to DP.

ACKNOWLEDGMENTS
We thank Katrine Whiteson for her contributions to this manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb. 2018.01162/full#supplementary-material FIGURE S1 | Box and whiskers plots demonstrating the percent G + C content amongst the virome reads recovered from infant stool (left) and human milk (right). The y-axis represents the percent G + C content.
FIGURE S2 | Bar graphs representing the proportion of contigs from infant stool and human milk with BLASTX hits to known viruses, bacteria, or without BLASTX hits. The y-axis represents the percentage of total contigs and the x-axis represents each infant stool (left) and milk (right) specimen.
FIGURE S3 | Bar graphs representing Shannon diversity and Homologous viral diversity in infant stool and human milk. (A) Homologous viral diversity index based on virome contigs for all infants and mothers, (B) Shannon diversity based on 16S rRNA for all infants and mothers. The x-axis represents milk (right) or infant stool (left), and the y-axis represents Shannon diversity or homologous viral diversity.
FIGURE S4 | Bar graphs (± standard error) representing the estimated number of virus genotypes identified in infant stool and human milk. The x-axis represents milk or infant stool, and the y-axis represents estimated virus genotypes determined by the homologous virus diversity index. The p-value is shown above.
FIGURE S5 | Bar graphs (± standard error) representing the proportion of contigs from infant stool and human milk with TBLASTX homology to bacteriophage and eukaryotic virus families. The percentage of contigs is represented on the y-axis, and the infant stool and milk specimens are shown on the x-axis. p-values ≤0.001 are represented by " * * ". FIGURE S6 | Bar graphs (± standard error) demonstrating the proportion of OTUs assigned to specific bacteria Classes in infant stool (black bars) and human milk (gray bars). The y-axis represents the percentage of OTUs assigned to each Class, and the x-axis represent the different bacteria Classes. The " * " represents values that are statistically significant with p-values ≤0.01, the " * * " represents p-values ≤0.001, and the " * * * " represents p-values ≤0.0001. The Firmicutes from the Bacilli Class belong to the Genus Streptococcus, and the Firmicutes from the Clostridia Class belong to the Genus Veillonella.
FIGURE S7 | Bar graphs representing the relative proportion of viral contigs in each mother-infant pair assembled that include contributions from the human milk, contributions from the infant stool, or contributions from both milk and stool. The y-axis represents the percentage of viral contigs assembled, and the x-axis represents the contribution to viral contigs from the mother (black), infant (white), or both (gray).
FIGURE S8 | Assemblies of viral contigs from select mother-infant pairs. (A) represents a 15,330 nt phage from mother-infant Pair #1, (B) represents a 16,602 phage from mother-infant Pair #2, and (C) represents a 14,064 phage from mother-infant Pair #6. The average coverage from the infant stool and the human milk is shown on the right of each panel. The portions of the contig identified in the milk and stool are represented by the colored boxes. Putative ORFs and their directions are represented by the arrows and their annotations are represented in each panel. The length of the contig is denoted at the top.
TABLE S1 | Characteristics of study participants at time of specimen collection.