From Binary Model Systems to the Human Microbiome: Factors That Drive Strain Specificity in Host-Symbiont Associations

Microbial symbionts are ubiquitous and can have significant impact on hosts. These impacts can vary in the sign (positive or negative) and degree depending on the identity of the interacting partners. Studies on host-symbiont associations indicate that subspecies (strain) genetic variation can influence interaction outcomes, making it necessary to go beyond species-level distinction to understand host-symbiont dynamics. In this review, we discuss examples of strain specificity found in host-symbiont associations, from binary model systems to the human microbiome. Although host and bacterial factors identified as mediators for specificity could be distinct at the molecular level, they generally fall into two broad functional categories: (1) those that contribute a required activity in support of the association and (2) those involved in antagonistic interactions with organisms outside of the association. We argue here based on current literature that factors from these two categories can work in concert to drive strain specificity and that this strain specificity must be considered to fully understand the molecular and ecological dynamics of host-symbiont associations, including the human microbiome.


INTRODUCTION
Microbes form associations with diverse eukaryotic hosts from protists to humans (McFall-Ngai et al., 2013). These microbial associations can impact host fitness in positive (mutualistic) or negative (antagonistic) directions that in turn influence host diversity and evolution. For example, the notable spread in North America of Drosophila neotestacea flies is associated with the bacterial symbiont Spiroplasma that protects the fly host from parasitic competitors (Jaenike et al., 2012; Cockburn et al., 2013). Organisms that engage in associations can be generalists capable of associating with many potential partners, or specialists that are restricted to associations with one or a few partners (Chomicki et al., 2020). In this review we limit our discussion to associations in which a host partner exhibits specialization for interaction with one or a few specific subspecies of a microbial symbiont. Host specialization for specific subspecies of symbionts with particular traits can have consequences for the symbiotic outcome. Host associations with non-native partners from the same microbial species as the native partner can result in lower net fitness benefits, and even a switch from net fitness benefit (mutualistic) to net fitness cost to the host (antagonistic). To explore the molecular traits and processes that promote the presence of a particular symbiont subspecies within an individual host, we discuss here examples of systems in which these processes have been experimentally tested. We do not delve deeply into the evolutionary or ecological theory of specificity, but refer readers to recent publications on these topics (Heath and Stinchcombe, 2014; Shapiro and Polz, 2014; Foster et al., 2017; Batstone et al., 2018; Chomicki et al., 2020).
One of the earliest examples of host-symbiont strain specificity comes from studies of binary associations between one host and one microbial partner. There is a rich history of symbiosis research on the nodules of leguminous plants that harbor nitrogen-fixing rhizobia bacterial symbionts (Oldroyd, 2013). Rhizobia strain-level differences discernible through numerical taxonomy, nucleic acid hybridization, and 16S rRNA analyses can dictate nodule formation with specific legume hosts (as reviewed in van Rhijn and Vanderleyden, 1995). Strain specificity also occurs in the well-studied symbiosis between the squid animal host Euprymna scolopes and Vibrio fischeri bacteria. In this association, the squid host provides nutrients for bacterial growth while V. fischeri bioluminesce in a specialized host structure called the light organ (Nyholm and McFall-Ngai, 2004). Bioluminescence is thought to provide counterillumination that helps camouflage and protect the host from predation by matching environmental downwelling light (Jones and Nishiguchi, 2004; Haddock et al., 2010). Strains of V. fischeri, categorized by having >95% single-gene (gapA) sequence identity, varied in competitive colonization proficiency, with the native strain outcompeting non-native strains (Lee and Ruby, 1994; Nishiguchi et al., 1998). Recently, the association between soil-dwelling insect-parasitic Steinernema nematode species and Xenorhabdus bacterial species has proven to be a powerful system to reveal the impact of bacterial strain identity on overall fitness of a symbiotic pairing. Steinernema nematode spp. from two distinct phylogenetic clades (I and III) naturally associate with different X. bovienii strains, classified based on >96% average nucleotide identity of 1,893 sets of orthologous genes (Murfin et al., 2015). Cross-pairing studies using six Steinernema nematode hosts and nine X. bovienii bacterial strains showed that the fitness of the association differed among X. bovienii strains, despite their genome-wide sequence similarity (Murfin et al., 2015; McMullen et al., 2017). More strikingly, the fitness of a non-native pairing negatively correlated with the phylogenetic distance of the non-native X. bovienii strain to the native X. bovienii strain (Murfin et al., 2015; McMullen et al., 2017). Together, the findings from these three well-studied model systems provide ample evidence that bacterial strain identity influences overall fitness of host-symbiont associations.
The varying impacts of bacterial strains on hosts are not limited to binary associations, but also occur in complex systems that include more than one microbial partner, such as the well-studied Apis mellifera (honeybee) gut microbiome that provides immunity against pathogens and nutrients through metabolism of complex carbohydrates (Kwong and Moran, 2016). It consists of up to nine bacterial species including the well-characterized symbionts Snodgrassella alvi and Bifidobacterium asteroides (Kwong and Moran, 2016). 16S rRNA and metagenomic sequencing and bioinformatics revealed strain variation among honeybee gut microbiome members (Engel et al., 2012; Moran et al., 2012; Ellegaard and Engel, 2019) with functional consequences to the host. For example, an S. alvi strain isolated from honeybees has a competitive advantage over S. alvi strains isolated from Bombus spp. (bumblebees) for colonization of honeybee hosts (Kwong et al., 2014). Similarly, metabolomics comparisons of bee guts colonized by distinct strains of B. asteroides revealed differences in the abundance of metabolites such as arabinose, galactose, and xylose, suggesting that bacterial strain-level variation can alter available nutrients (Zheng et al., 2019).
Strain differences also impact human interactions with microbial partners, which in the gut alone consist of up to 1,000 bacterial species (Sekirov et al., 2010). More recent studies that couple metagenomics with techniques to differentiate strains (as reviewed in Brito and Alm, 2016; Niu et al., 2018) revealed strain variation in the human-associated microbiome. Although multiple strains of a single species can be found in the gut microbiota of an individual human, one strain is typically stably dominant in abundance (Schloissnig et al., 2013; Costea et al., 2017; Truong et al., 2017; Garud and Pollard, 2019). In one estimate, a single strain can account for >80% of the strain composition in an individual's gut microbiota. However, the identity of the dominant strain can vary among individuals (Schloissnig et al., 2013; Costea et al., 2017; Lloyd-Price et al., 2017; Truong et al., 2017). Compared to the gut, inter-individual microbial strain variation is even greater in other body sites such as the oral cavity, nose, and vagina (Lloyd-Price et al., 2017). The broad impacts of the microbiota on human health such as in development, immunity, and nutrient acquisition (Sekirov et al., 2010; Mohajeri et al., 2018) can correlate with bacterial strain variation within the microbiota. For example, the strain-variable copy number of a toxin secretion component encoded by the gut bacterium Bacteroides uniformis is positively correlated with inflammatory bowel disease (Greenblum et al., 2015). Similarly, the presence of Staphylococcus epidermidis strains that encode virulence factors such as the secretory antigen SsaA correlates with the skin disease psoriasis.
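The notion of a "dominant" strain used above can be made concrete with a minimal sketch. The function name, the input format (a mapping from strain label to abundance), and the use of 0.8 as a cutoff are illustrative assumptions based only on the >80% estimate quoted in the text; this is not the method of any of the cited studies.

```python
def dominant_strain(abundances, threshold=0.8):
    """Identify the most abundant strain of one species in one sample.

    abundances: dict mapping a (hypothetical) strain label to its abundance,
                e.g. read counts or relative abundance for a single species.
    threshold:  assumed cutoff; 0.8 mirrors the >80% estimate in the text.

    Returns (strain, fraction, exceeds_threshold).
    """
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    # Pick the strain with the highest abundance.
    strain = max(abundances, key=abundances.get)
    fraction = abundances[strain] / total
    return strain, fraction, fraction > threshold


# Hypothetical sample: three strains of one species in one individual.
sample = {"strain_A": 85, "strain_B": 10, "strain_C": 5}
name, frac, is_dominant = dominant_strain(sample)
print(name, round(frac, 2), is_dominant)
```

Applying the same function to samples from different individuals would show whether the dominant strain differs among them, which is the pattern described in the text.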
Overall, strain-level differences in bacterial symbionts can dictate the degree and the sign of interaction outcomes between hosts and microbes. In the next section, we discuss molecular factors that mediate strain specificity in host-symbiont associations. To showcase general trends, we included molecular factors described in the literature as having some level of strain specificity, though we note that the definition of "strain" varies among researchers. Identified factors generally fall into two broad categories: (1) those that fulfill direct functional roles in activities specialized for the symbiosis and (2) those that mediate antagonistic interactions that indirectly shape the symbiosis.

STRAIN SPECIFICITY FACTORS THAT FULFILL FUNCTIONAL ROLES IN HOST-SYMBIONT ASSOCIATIONS
Signals mediating the initiation of rhizobia-legume host associations have been well-characterized with respect to their role in strain specificity. When leguminous plants need nitrogen, they elicit rhizobial expression and secretion of nodulation factors: lipochitooligosaccharides (LCOs) (Oldroyd, 2013). When plants perceive LCOs they activate the symbiosis signaling pathway, triggering plant restructuring and bacterial invasion through infection threads to form nodules (Oldroyd, 2013). Rhizobium leguminosarum LCOs mediate the host range specificity of individual strains (Figure 1A). Specifically, R. leguminosarum bacteria are classified into biovars (bv): viciae, trifolii, and phaseoli, and each biovar forms nodules with different legume hosts (as reviewed in van Rhijn and Vanderleyden, 1995; Dénarié et al., 1996). A seminal study showed that bv. trifolii and bv. viciae differ in the gene sequence encoding nodulation factor E (NodE). When nodE from bv. viciae was introduced to a bv. trifolii nodE mutant, it was sufficient to alter the host range of bv. trifolii to that of bv. viciae (Spaink et al., 1989). Sequence differences between bv. viciae and bv. trifolii nodE result in distinctive LCO structure, composition, and hydrophobicity, which in turn dictate host responsiveness (Spaink et al., 1991, 1995; Bloemberg et al., 1995).
Nodulation between Sinorhizobium meliloti bacteria and Medicago truncatula legumes also can vary depending on the bacterial and host identity (Snyman and Strijdom, 1980). When certain S. meliloti strains are paired with specific M. truncatula cultivars, incompatible interactions can arise that result in pseudonodules that, unlike nodules, are small, do not fix nitrogen, and display senescence (Tirichine et al., 2000; Simsek et al., 2007). Further, bacteria in pseudonodules are eventually lysed (Yang et al., 2017). In this legume-bacteria association, exopolysaccharide succinoglycans mediate strain specificity (Figure 1A). The production and succinylation of these molecules are necessary for infection thread formation (Leigh and Walker, 1994; Jones et al., 2007; Simsek et al., 2007; Mendis et al., 2016). S. meliloti strains with compatible or incompatible interactions with M. truncatula vary in their succinoglycan trimeric oligosaccharide succinylation patterns (Simsek et al., 2013). Introduction of succinoglycan biosynthetic genes from a compatible to an incompatible S. meliloti bacterial strain is sufficient to confer compatibility and alters the succinylation pattern to resemble that of the compatible strain (Simsek et al., 2007). In sum, investigations of diverse legume symbioses have established that host-symbiont specificity is dictated by the ability of a bacterial strain to produce specific signaling factors (LCOs or succinoglycans) that are recognized by the host (Figure 1A).
Similarly, specificity of the V. fischeri bacteria-E. scolopes squid host association is defined by a strain-variable molecular factor. A mutant screen for genes necessary for squid colonization identified the regulator of symbiotic colonization sensor kinase, RscS (Visick and Skoufos, 2001). RscS regulates the syp locus encoding regulatory proteins and structural proteins involved in polysaccharide synthesis and export (Yip et al., 2005, 2006; Shibata et al., 2012). Initial analyses indicated that a specific allelic form of RscS is necessary and sufficient for squid host colonization among V. fischeri strains in the paraphyletic group "B" (Mandel et al., 2009), indicating that RscS is a factor for strain specificity in host-microbe association (Figure 1B). Intriguingly, follow-up studies revealed that in other groups of V. fischeri strains (groups "A" and "C"), although the syp locus is still necessary for host colonization, RscS is not (Rotman et al., 2019). The current hypothesis explaining this difference is that these other V. fischeri strains have distinct mechanisms for regulating the syp locus (Rotman et al., 2019), suggesting that V. fischeri-squid strain specificity is based on functional differences in regulatory pathways of the conserved syp locus, rather than on variation of the syp-encoded host-interaction molecules themselves.
In human infants, the bacterial starch utilization system (Sus) is implicated in the strain specificity of the gut bacterium B. uniformis (Figure 1C) (Yassour et al., 2018). Sus is composed of multiple proteins that bind, take up, and degrade glycans, with different modules specific to different glycans (Martens et al., 2009). Using metagenomics and single-nucleotide variants in species-specific markers, Yassour and others identified strains that were transmitted between mother-infant pairs within 3 months of birth (Yassour et al., 2018). In their analysis, two patterns of transmission emerged: primary and secondary strain transmission. In primary strain transmission, the bacterial strain dominant in the mother is also the dominant strain in the infant, but in secondary strain transmission, a non-dominant bacterial strain in the mother is the dominant strain in the infant (Yassour et al., 2018). In mother-infant pairs displaying primary B. uniformis strain transmission, the dominant maternal strains encode a Sus module, while in mother-infant pairs that displayed secondary strain transmission, the non-dominant maternal strains, and not the dominant maternal strains, encode a Sus module (Yassour et al., 2018). These results provide evidence that the Sus module is selected in the infant gastrointestinal tract and may be a driver of strain-specific association of human infants with B. uniformis (Figure 1C). The authors hypothesized that this strain specificity is driven by unique glycans in the mother's breast milk that select for B. uniformis strains capable of utilizing these glycans for colonization (Yassour et al., 2018). The role of the Sus module in host colonization is supported by findings that Bacteroides Sus-like proteins are necessary for colonization in the murine host (Lee et al., 2013).
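The distinction between primary and secondary strain transmission described above can be illustrated with a short sketch. The function name, the dictionary inputs, and the strain labels are hypothetical and serve only to restate the definitions in the text; the cited study inferred strains from single-nucleotide variants in metagenomic data, which this sketch does not attempt to reproduce.

```python
def classify_transmission(mother, infant):
    """Classify mother-to-infant strain transmission for one species.

    mother, infant: dicts mapping a (hypothetical) strain label to its
                    abundance in the mother's or the infant's sample.

    Returns "primary" if the infant's dominant strain is also the mother's
    dominant strain, "secondary" if it is a non-dominant maternal strain,
    and "not shared" if the mother does not carry it at all.
    """
    if not mother or not infant:
        raise ValueError("both samples must contain at least one strain")
    infant_dominant = max(infant, key=infant.get)
    if mother.get(infant_dominant, 0) <= 0:
        return "not shared"
    mother_dominant = max(mother, key=mother.get)
    return "primary" if infant_dominant == mother_dominant else "secondary"


# Hypothetical example: the mother's minor strain dominates in the infant.
mother = {"strain_1": 0.9, "strain_2": 0.1}
infant = {"strain_2": 0.95, "strain_1": 0.05}
print(classify_transmission(mother, infant))
```

In the pattern reported in the text, pairing each classification with the presence or absence of a Sus module in the transmitted strain is what suggests selection in the infant gut.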
Overall, these examples showcase that strain-specific associations can arise due to selection for regulation or sequence of molecular factors that fulfill specific positive roles in host-symbiont associations (Figure 1). However, strain-specific associations can also be due to molecular factors that are involved in antagonistic interactions between hosts and microbes or among microbes within host-associated communities. In the next section, we showcase host factors that antagonize non-native microbes and microbial factors that antagonize non-native hosts or other microbes, both of which mediate strain-specific associations.

STRAIN SPECIFICITY FACTORS THAT ARE INVOLVED IN ANTAGONISTIC INTERACTIONS
M. truncatula legumes exhibit specificity for certain S. meliloti bacterial strains and antagonize other strains. The M. truncatula Mtsym6 allele contributes to this strain incompatibility (Tirichine et al., 2000). Mtsym6 encodes at least two nodule-specific cysteine-rich (NCR) peptides (Wang et al., 2018; Yang et al., 2017). NCRs have bactericidal activity against select S. meliloti strains (Wang et al., 2018; Yang et al., 2017) (Figure 2A). Since S. meliloti succinoglycans mediate strain specificity (Figure 1A), it is intriguing to hypothesize that M. truncatula NCRs may select for specific S. meliloti strains based on the structure of their exopolysaccharide succinoglycans (Figure 2A). Though this hypothesis has not been directly tested, there is strong support for it in the literature. When exposed to sublethal NCR concentrations, S. meliloti upregulates succinoglycan synthesis genes (Penterman et al., 2014), and mutants defective in succinoglycan biosynthesis are sensitive to NCRs and display increased membrane permeability, branching, and bloating compared to wild-type S. meliloti (Arnold et al., 2017; Montiel et al., 2017). A recent study showed that (1) succinoglycans protect S. meliloti against NCR activity, (2) such protection depends on succinylation patterns, (3) there is a direct interaction between NCRs and succinoglycan, and (4) succinoglycan protection is specific to NCRs (and not other cationic peptides) (Arnold et al., 2018) (Figure 2A).
Strain-variable antagonism also occurs in the association of X. bovienii with Steinernema nematode spp. from two distinct phylogenetic clades (I and III). Cross-pairing studies revealed that while X. bovienii from clade I nematode hosts (clade I X. bovienii strains) are compatible with clade I nematode hosts, they are incompatible with clade III Steinernema nematode hosts (Murfin et al., 2015; McMullen et al., 2017). This incompatibility is not due to the inability of clade I X. bovienii strains to support clade III Steinernema nematode host growth. Rather, it is caused by the toxicity of those particular strains of X. bovienii to these non-native nematode hosts (Murfin et al., 2019). Recent work indicates that a Shiga toxin subunit 1 A homolog (StxA) is necessary for a clade I X. bovienii bacterial symbiont to kill non-native host clade III S. feltiae nematodes (Figure 2B) (Ginete, 2020). StxA-encoding clade I X. bovienii bacterial symbionts exhibit ribosome-inactivation activity against clade III S. feltiae nematodes (Ginete, 2020). When the stxA gene encoding the toxin is deleted through mutation, a clade I bacterial symbiont can support clade III S. feltiae development and reproduction (Ginete, 2020). These results suggest that clade I bacterial symbionts are capable of forming mutualistic associations with clade III S. feltiae nematodes, but their toxicity hinders or prevents these non-native associations (Ginete, 2020) (Figure 2B).
Frontiers in Ecology and Evolution | www.frontiersin.org
Microbial factors that target other microbes can drive strain-specific host associations (Figure 2C). Bacteriocins are microbial-encoded toxins that kill closely related organisms (García-Bayona and Comstock, 2018), and this target specificity impacts colonization between bacterial strains in host environments (Figure 2C). In murine host guts, introduction of an Enterococcus faecalis strain encoding the bacteriocin Bac-21 eliminates an endogenous vancomycin-resistant E. faecalis strain (Kommineni et al., 2015). Another bacteriocin, microcin, is necessary for the probiotic E. coli strain Nissle 1917 to reduce colonization levels of another mouse-commensal E. coli strain, but only when the intestine is inflamed (Sassone-Corsi et al., 2016), suggesting that environmental conditions such as the presence of inflammation may influence the impact of bacteriocins on microbe-microbe competition. Finally, Bacteroides fragilis strains expressing the BSAP-1 bacteriocin can eliminate another B. fragilis strain in the mouse intestine 1 week after co-inoculation (Roelofs et al., 2016). BSAP-1-mediated elimination depends on the version of a specific outer membrane protein encoded by the target strain, with resistant strain variants expressing OmpR and sensitive variants expressing OmpS (Roelofs et al., 2016). An assessment of a human gut metagenomic data set indicates that while ompR sequences are present in 98% of metagenomes that were positive for BSAP-1, no ompS sequences were found in the BSAP-1-positive metagenomes (Roelofs et al., 2016). These results indicate that variation in both bacteriocins and their receptors can shape the microbiome membership within host environments, which in turn will result in apparent host-symbiont strain specificity.
Present in ∼25% of gram-negative bacterial species, the Type VI secretion systems (T6SS) are composed of a membrane complex with a base plate, a contractile sheath surrounding an Hcp tube, and a tip complex (Coulthurst, 2019; Allsopp et al., 2020). Bacteria utilize T6SS to secrete effectors ranging in activity from nutrient scavenging to causing cellular death (Allsopp et al., 2020). Recent studies show that T6SS can mediate strain specificity in host-symbiont associations (Figure 2C). In the V. fischeri-squid model system, the squid can harbor multiple V. fischeri strains (Nishiguchi et al., 1998; Wollenberg and Ruby, 2009) with different strains occupying distinct light organ sites (Sun et al., 2016). When T6SS is deleted in V. fischeri, two different strains can occupy the same site, indicating that T6SS may mediate exclusion of other strains from each light organ niche (Speare et al., 2018; Guckes et al., 2019). In mice, the presence of T6SS can impact levels of host colonization by B. fragilis strains (Figure 2C) (Hecht et al., 2016; Wexler et al., 2016). For example, a commensal B. fragilis strain reduces colonization of an enterotoxigenic B. fragilis strain in a T6SS-dependent manner (Hecht et al., 2016). This inhibition decreases cecal injury, inflammation, and ulcerations in the murine host (Hecht et al., 2016), indicating that T6SS-mediated microbial competition can have consequences to host health.
A recent metagenomics analysis revealed that there is an enrichment for T6SS-encoding strains in the human infant intestinal microbiota, suggesting that T6SS are important for colonization at this developmental stage (Verster et al., 2017). B. fragilis bacterial strains that encode immunity proteins against T6SS effectors displayed resistance to T6SS-dependent inhibition in murine hosts (Hecht et al., 2016; Wexler et al., 2016). This indicates that the presence of T6SS, variation in the T6SS effector repertoire, and variation in T6SS effector immunity combine to shape overall B. fragilis strain composition in the host by ultimately selecting for B. fragilis strains that encode specific immunity proteins. The influence of B. fragilis T6SS-effector-mediated selection may extend beyond Bacteroides, since B. fragilis T6SS presence or absence results in distinct microbial community composition and anti-B. fragilis-T6SS-effectors have been found in other gut microbiome members (Verster et al., 2017; Ross et al., 2019). These findings indicate that T6SS-mediated antagonism selects for certain strains within a microbiome, such that the T6SS-encoding or resistant strains dominate the genotypes available for host interactions. Overall, these findings from diverse symbioses showcase that host and microbial antagonistic factors mediate strain-specific associations (Figure 2).

CONCLUSION
Diverse symbiotic microbiomes have been analyzed using 16S rRNA amplicon sequencing, which identifies individual community members to the species level. This approach is useful for defining broad characteristics of a microbiome and for predicting individual symbiont function based on knowledge of its taxonomic placement. However, studies that employ strategies such as molecular genetic manipulation of laboratory models of symbiosis, and metagenomic sequencing efforts, are revealing that symbiotic associations can be dramatically influenced by strain-level variation that is not revealed by standard sequencing practices. In particular, binary host-symbiont associations enable cross-pairing experiments that led to early evidence for the impact of bacterial strain variation on hosts. While assessment of consequences of strain variation on host fitness remains difficult in complex systems, simplification by focusing on individual microbial symbionts or on naturally "simple" systems has revealed that strain variation can have consequences for host associations, physiology, and fitness.
Strain-level specificity in host-symbiont associations occurs in a broad range of symbioses, including those in the plant, invertebrate, and mammalian systems. While much remains to be learned regarding the molecular basis of strain specificity of symbioses, two general categories emerge from the examples presented here: those with variable symbiont-encoded factors necessary for association with the host and those with variability in antagonistic behaviors that modulate associations between particular hosts and microbes. The former type of strain specificity commonly involves the presence, recognition, and utilization of polysaccharides: S. meliloti and V. fischeri bacterial exopolysaccharides necessary for symbiosis with legumes and squid, respectively, and bacterial utilization of host-derived polysaccharides in the mammalian gut. The latter involves either host-microbe antagonism or microbe-microbe antagonism.
FIGURE 3 | Strain specificity factors in host-symbiont associations can drive selection for specific host-symbiont associations (boxed) through positive selection (direct functional factors) and negative selection (antagonistic factors). Strain-specific association is indicated by matching color in microbes and hosts.
While these classifications may appear simplistic, they provide a generalized framework to understand factors that contribute to strain specificity in host-symbiont associations beyond those mentioned in this review. For example, field studies indicate that symbiont strain variation also occurs widely in nature (Valette et al., 2013; Parkinson et al., 2015; Russell et al., 2017; Guyomar et al., 2018; Perez and Juniper, 2018; Ellegaard and Engel, 2019; Porter et al., 2019; Ravenscraft et al., 2020). One notable example is the strain-specific associations of wild-sampled Bathymodiolus mussels and their gill-localized endosymbionts (Ansorge et al., 2019). There is a positive correlation between the geochemical characteristics of the host environment (e.g., hydrogen levels) and the predicted function of strain-specific genes (e.g., hydrogenases) (Ansorge et al., 2019). The authors of this study hypothesize that strain specificity in this association occurs through selection for metabolic efficiency conferred by variable hydrogenase enzymes (Ansorge et al., 2019), similar to Sus-mediated selection among Bacteroides strains (Lee et al., 2013; Yassour et al., 2018). Within the aforementioned framework, these hydrogenase enzymes can be viewed as potential functional factors involved in host adaptation to environmental nutrient availability that facilitate strain-specific association.
Functional and antagonistic factors likely work together to drive the persistence of certain host-symbiont associations (Figure 3). Functional factors can act as positive selection for specific host-symbiont associations, initiating and/or maintaining such associations over host generations. On the other hand, antagonistic factors can act as negative selection against other host-microbe associations, hindering their formation and/or contributing to their breakdown. We posit that, through these selection mechanisms, functional and antagonistic factors may facilitate the stability and persistence of certain host-symbiont associations over evolutionary periods. In support of this hypothesis, a previous study suggests that antagonism between actinomycetous bacterial symbionts from different ant isolates may facilitate strain specificity of ant-symbiont associations (Poulsen et al., 2007). For future studies, it will be important to contextualize these molecular factors for strain specificity in the ecology and evolution of host-symbiont associations.
As strain variation can dictate the degree and the sign of interaction outcomes between hosts and microbes, it is important to continue investigating ways in which symbiont strains vary genotypically and phenotypically, particularly by identifying strain-variable factors driving associations with hosts. In addition, it is essential to elucidate selective pressures that drive evolution of microbial strain variation as well as host-microbe strain specificity. This knowledge will allow a more comprehensive understanding of the molecular and ecological dynamics of host-symbiont associations and can be used to our benefit, such as for minimizing host susceptibility to pathogens or maximizing host benefits from agriculturally or medically relevant mutualisms.

AUTHOR CONTRIBUTIONS
DRG: conceived of topic and scope, wrote text, and designed figures. HG-B: contributed to topic refinement, literature review, and editing of text and figures. All authors contributed to the article and approved the submitted version.