Biogeography and Independent Diversification in the Protist Symbiont Community of Heterotermes tenuis

The eukaryotic microbiome of “lower” termites is highly stable and host-specific. This is due to the mutually obligate nature of the symbiosis and the direct inheritance of protists by proctodeal trophallaxis. However, vertical transmission is occasionally imperfect, resulting in daughter colonies that lack one or more of the expected protist species. This phenomenon could conceivably lead to regional differences in protist community composition within a host species. Here, we have characterized the protist symbiont community of Heterotermes tenuis (Hagen) (Blattodea: Rhinotermitidae) from samples spanning South and Central America. Using light microscopy, single cell isolation, and amplicon sequencing, we report eight species-level protist phylotypes belonging to four genera in the phylum Parabasalia. The diversity and distribution of each phylotype’s 18S rRNA amplicon sequence variants (ASVs) mostly did not correlate with geographical or host genetic distances according to Mantel tests, consistent with the lack of correlation we observed between host genetic and geographical distances. However, the ASV distances of Holomastigotoides Ht3 were significantly correlated with geography while those of Holomastigotoides Ht1 were significantly correlated with host phylogeny. These results suggest mechanisms by which termite-associated protist species may diversify independently of each other and of their hosts, shedding light on the coevolutionary dynamics of this important symbiosis.


INTRODUCTION
The degree of intraspecific variation in the insect gut microbiota varies greatly among lineages. At one extreme, many insects, including butterflies and walking sticks, have such highly variable gut microbial communities that it has been argued they have no specific microbiome (Shelomi et al., 2013;Hammer et al., 2019;Ravenscraft et al., 2019). This high variability likely stems from molting, which disrupts the gut microbiota, and from the lack of a mechanism for vertical transmission of microbes. At the other extreme, in social insects, intimate interactions among nestmates create a route by which the microbiome can be transmitted and stably maintained (Engel and Moran, 2013;Sanders et al., 2014). As a result, many social insects host a stable and consistent core microbiome at both the intra-colony and intra-species levels (e.g., corbiculate bees, Kwong et al., 2017).
In termites, the social behavior of proctodeal trophallaxis (anal feeding) provides a mechanism for both vertical inheritance of gut microbes and for re-establishment of the microbiota after each molt (Brune and Dietrich, 2015). This behavior emerged in the ancestor of termites and their sister lineage Cryptocercus and likely enabled the transition to wood feeding by trapping cellulolytic protists and their nitrogenfixing endosymbionts in an increasingly specialized digestive symbiosis (Nalepa, 2015). As a result, termites exhibit relatively stable and largely co-diversifying gut microbial communities (Bourguignon et al., 2018).
The eukaryotic component of "lower" termite microbiomes (all families except Termitidae) consists of highly specialized wood-feeding flagellates (protists) that cannot live outside their host. This means that unlike prokaryotic microbiota, the protist communities do not include a transient, environmentally acquired component. Because of this strict vertical inheritance, protist communities are strongly influenced by termite phylogeny and are not expected to have a biogeographical influence independent of their hosts (Tai et al., 2015;Taerum et al., 2018). However, there are two documented evolutionary mechanisms consistent with direct termiteto-termite transmission that could conceivably introduce a biogeographical component to protist distribution within a single host species. These are (1) transmission of protist species from one host species to another, and (2) lineage-specific losses of symbiont species.
Horizontal symbiont transfer (HST) from one termite lineage to another is theoretically possible, according to experiments on laboratory colonies (Light and Sanford, 1928;Dropkin, 1946). It may have taken place between Hodotermopsis (Archotermopsidae) and the ancestor of Reticulitermes (Rhinotermitidae) (Kitade, 2004;Tai et al., 2015). This would explain the distinct protist community found in Reticulitermes as compared to its rhinotermitid relatives (Kitade, 2004). Another ancient HST may have occurred in the ancestor of Serritermes and Glossotermes (Serritermitidae), again explaining why the symbionts of these host genera differ from those of their rhinotermitid relatives (Radek et al., 2018). To date these are the only two documented inferences of HST in termites. However, if HST is more common than currently appreciated, it could lead to regional differences in protist communities across a host species' range, especially when different parts of that range overlap with different termite species.
Lineage-specific loss of protist symbionts is much more common than HST. The absence of one or two of the expected protist species from a termite colony has been documented by both morphological and molecular methods (Kitade and Matsumoto, 1993;Kitade et al., 2012Kitade et al., , 2013Taerum et al., 2018;Michaud et al., 2020). Such losses are more common for some host species than others, and for some protist species than others (Kitade and Matsumoto, 1993;Kitade et al., 2013;Michaud et al., 2020). Protists are inherited biparentally: both king and queen contribute their symbionts to the nascent colony's microbial community (Shimada et al., 2013;Brossette et al., 2019). Host outcrossing can therefore allow complementation of these deficient communities, helping to maintain a consistent protist community across the host species' range (Michaud et al., 2020). However, if gene flow between host populations becomes restricted, the opportunity to re-acquire a protist species by host inter-population mating would cease, and regional differences in protist communities could emerge.
If these mechanisms can, in fact, lead to a geographical influence on protist distribution, such an influence should be more readily detected in a broadly distributed host. Heterotermes tenuis (Hagen) (Blattodea: Rhinotermitidae) occupies a vast geographical range from as far north as southern Mexico and the West Indies down to northern Argentina, and from Ecuador to eastern Brazil (Constantino, 1998;Krishna et al., 2013;Carrijo et al., 2020). It is one of the most common termite species in both dry and forested areas of South America (Mathews, 1977;Davies et al., 2003;Ackerman et al., 2009). Like other Heterotermes species, it forages underground and consumes wood in contact with soil (Mathews, 1977). It has been reported to cause damage to buildings and crops, including sugarcane and eucalyptus plantations (Constantino, 2002;Batista-Pereira et al., 2004;Haifig et al., 2008). Heterotermes tenuis exhibits both morphological and molecular diversity, suggesting that it may be considered a species complex rather than a singular species (Constantino, 2000;Carrijo, 2013).
In this study, we first revisited the hindgut protist community of H. tenuis in order to clarify its composition. Using a combination of light microscopy, 18S rRNA sequences from single isolated cells, and high-throughput amplicon sequencing of H. tenuis colonies collected across its range, we identified eight distinct species-level clades, belonging to four genera, which we refer to as phylotypes. We next examined the diversity and distribution of 18S rRNA amplicon sequence variants for each of the protist phylotypes in order to determine whether symbiotic protists might exhibit biogeographical patterns independent from those of their host.

Termite Collections and Barcoding
Live and RNA later-preserved termites were collected in Ecuador in 2016 under permit No 015 -2016 -IC -FAU -FLO -DPAO -PNY, in French Guiana in 2019 under permit TREL1820249A/108, and in Panama in 2018 under permit SC/A-24-17. Ethanol-preserved specimens were provided by collections at the University of Florida and the Museu de Zoologia da USP (MZUSP), Brazil. Full accession and collection information for each specimen is provided in Supplementary Table 1.
Molecular barcodes for termite specimens were generated by amplifying and sequencing the mitochondrial large subunit 16S ribosomal RNA gene (mt16S), using the primers LRN 5 -CGC CTG TTT ATC AAA AAC AT-3 and LRJ 5 -TTA CGC TGT TAT CCC TAA-3 (Simon et al., 1994;Kambhampati and Smith, 1995). Template DNA was extracted from the excised hindgut of live workers or from the macerated abdomen of ethanolor RNAlater-preserved workers using the GeneJET Genomic DNA Purification Kit (Thermo Fisher Scientific) according to the manufacturer's protocol. The PCR cycle included a 3 min denaturation at 95 • C followed by 30 cycles of 95 • C for 30 s, 46 • C for 30 s, and 72 • C for 60 s, and a final 10 min extension at 72 • C. PCR products were purified using the GeneJET PCR Purification Kit (Thermo Fisher Scientific, Waltham, MA, United States) according to the manufacturer's protocol and sequenced directly on both strands on an Applied Biosystems 3730 capillary sequencer.

Live Protist Isolation and Characterization
Live termite hindguts were removed and their contents suspended in Ringer's solution (8.5 g NaCl, 0.2 g KCl, 0.2 g CaCl 2 , 0.1 g NaHCO 3 per L, HiMedia Laboratories). Individual Pseudotrichonympha, Holomastigotoides, Cononympha, and Cthulhu cells were isolated by drawn-glass micropipette on an AxioVert inverted microscope and photographed with an Axiocam 105 color camera (Zeiss). Additional micrographs were taken on an AxioImager upright compound light microscope with an AxioCam 503 monochrome camera (Zeiss).
Each isolated cell was washed twice in fresh Ringer's solution and transferred into a 0.5 ml tube for DNA extraction using the MasterPure DNA purification kit (Epicentre Biotechnologies) following the manufacturer's protocol, except that purified DNA was resuspended in 5 µl of deionized, ELGA-purified water. The 18S rRNA gene was then amplified using a nested PCR approach. Primer pairs SpiroF1 5 -ATA CTT GGT CGA TCC TGC CAA GG-3 and SpiroR1 5 -TGA TCC AAC GGC AGG TTC MCC TAC-3 (Taerum et al., 2019) were used for the for the initial (outer) reaction and GGF 5 -CTT CGG TCA TAG ATT AAG CCA TGC-3 and GGR 5 -CCT TGT TAC GAC TTC TCC TTC CTC-3  for the second (inner) reaction. PCR cycles for both primer pairs included a 3 min denaturation at 95 • C, followed by 30 cycles of 95 • C for 30 s, 55 • C for 30 s, and 72 • C for 90 s, and a final 10 min extension at 72 • C. Resulting PCR products were separated by gel electrophoresis and visualized with GelGreen stain (Biotium). Bands of the expected size (∼1500 base pairs) were excised from a 1% agarose gel, purified using the GeneJET Gel Extraction Kit (Thermo Fisher Scientific), cloned using the TOPO TA cloning kit for sequencing (Invitrogen), and sequenced on both strands using the ABI 3730XL capillary sequencer.

Phylogenetic Analyses
We successfully amplified and sequenced the 18S rRNA gene from six individual Pseudotrichonympha cells, 10 Holomastigotoides cells, three Cononympha cells, and one pool of eight individually isolated Cthulhu cells. Between two and eight clones were sequenced from each cell, trimmed of vector, and assembled using Geneious R9 (Kearse et al., 2012). For preliminary analyses, all 116 new sequences were aligned with previously published Parabasalia 18S rRNA gene sequences using MAFFT v. 7.222 (Katoh and Standley, 2013). Ambiguously aligned sites were identified by eye and removed in AliView 1.17.1 (Larsson, 2014). A maximum likelihood (ML) phylogeny was computed using RAxML v. 8.0 (Stamatakis, 2014).
For the final parabasalian 18S rRNA phylogenetic analyses, we selected a subset of representative sequences from each protist phylotype in order to reduce redundancy. These sequences were submitted to GenBank under accessions MW353606-MW353628. New and previously published sequences from Honigbergiellida (Cthulhu and outgroup), Spirotrichonymphida (Holomastigotoides and Cononympha), and Trichonymphida (Pseudotrichonympha and outgroup) were aligned separately in order to maximize the number of alignable sites in each matrix. The final, manually trimmed, matrix sizes were: Honigbergiellida 18 taxa, 1389 sites; Spirotrichonymphea 31 taxa, 1471 sites; Trichonymphida (family Teranymphidae only) 31 taxa, 1404 sites. Bayesian and ML phylogenetic analyses were performed for each matrix using MrBayes v. 3.2.6 (Ronquist et al., 2012) and RAxML v. 8.0 (Stamatakis, 2014), respectively, under the GTR-model, with support for each node assessed by 1,000 bootstrap replicates (ML) and posterior probability (Bayes). For the Bayesian analyses, two independent chains, sampled once every 100 generations, were run until they converged (the average standard deviation of partition frequency values between the chains dropped below 0.01). Convergence was reached after 20,000 generations for Honigbergiellida, 240,000 generations for Spirotrichonymphida, and 110,000 generations for Teranymphidae. The first 25% of trees were discarded as burn-in and majority rule consensus trees were computed from the remaining 300, 4600, and 1650 trees, respectively. We also computed ML trees from each of these data sets with all amplicon sequence variants (ASVs) from each taxon in order to visualize the phylogenetic diversity of the ASVs (see below for details).
For the host mt16S analyses, new and previously published sequences from Heterotermes species were aligned with MAFFT v. 7.222 (Katoh and Standley, 2013), which yielded a clean alignment of 67 taxa by 404 sites. This matrix was subjected to ML and Bayesian phylogenetic analyses under the GTR-model of sequence evolution as above. New termite mt16S barcode sequences were submitted to GenBank under accession numbers MW345686-MW345724 (Supplementary Table 1).

Hindgut Community 18S rRNA Amplicon Sequencing
DNA was extracted from the whole gut contents of live termites or the macerated abdomens of termites preserved in ethanol or RNAlater (Thermo Fisher Scientific, Waltham, MA, United States) using the GeneJET Genomic DNA Purification Kit (Thermo Fisher Scientific, Waltham, MA, United States). Each DNA template was acquired from a single worker individual. Amplicon libraries were generated using a two-step PCR protocol with dual index sequencing (Kozich et al., 2013). The first PCR used gene-specific primers with Illumina adaptor sequences at their 5 ends: nexF-ParaV45F 5 -TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG-3 and nexF-ParaV45R 5 -GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA G-3 (Jasso-Selles et al., 2017). The second PCR attached indexing barcodes to both ends of each amplicon (Hamady et al., 2008). PCR programs included an initial 3 min denaturation at 95 • C, then 35 cycles of 95 • C for 30 s, 48 • C (1st step) or 50 • C (2nd step) for 30 s, and 72 • C for 50 s, followed by a 10 min final extension at 72 • C. All PCR reactions used EconoTaq 2x Master Mix (Lucigen) with 2.5 µl of 2 µM stock forward and reverse primers and 2.5 µl of undiluted template DNA in a 30 µl reaction. PCR products were purified using Ampure (Beckman Coulter) magnetic beads in combination with the Beckman Biomek NXp robot and quantified using Qubit broad range dsDNA fluorescent dye (Invitrogen) on a Biotek HT1 Plate Reader. After quantification, samples were pooled at approximately equivalent concentrations and sequenced on the Illumina MiSeq platform. This study includes samples run in 2017 using 2 × 300 paired-end chemistry and samples run in 2019 using 2 × 250 paired-end chemistry.
Amplicon sequence analyses were performed using QIIME 2 2020.2 (Bolyen et al., 2019). Casava 1.8 paired-end demultiplexed fastq files were imported via QIIME tools import. The demultiplexed reads were trimmed, merged, and denoised separately for each sequencing run using DADA2 (Callahan et al., 2016). The 2 × 250 samples did not require trimming. Both sequencing runs were merged into a single dataset after denoising. Samples consisting of fewer than 500 reads for the Parabasalia taxa of interest were discarded. Taxonomy was assigned using a naïve Bayes classifier implemented in QIIME 2 (Bokulich et al., 2018) trained on a manually curated Parabasalia 18S rRNA reference file of 1075 published and unpublished sequences, including all 116 18S rRNA clones from single cells sequenced in this study. Raw demultiplexed fastq files were submitted to NCBI SRA under bioproject PRJNA683896.

ASV Diversity and Distribution Analyses
Mean and maximum pairwise distances between amplicon sequence variants (ASVs) assigned to each protist phylotype and linear regressions of ASV diversity and uniqueness vs. protist prevalence were computed in R version 4.0.1 (R Core Team, 2020). Distributions of ASVs across sample locations were determined by sorting ASVs into individual datasets for each protist phylotype and for each geographical coordinate using QIIME 2 2020.2 (Bolyen et al., 2019). For Mantel tests, all genetic distance matrices (for protist 18S rRNA ASVs and host mt16S rRNA sequences) used uncorrected pairwise ("p") distances with no model of evolution. In cases where multiple unique ASVs were present at the same coordinate, the ASV with the greatest read depth was selected to represent that coordinate in uncorrected pairwise distance matrices for each protist phylotype. Euclidean distance was used as the distance measure for the geographic distance matrices. All full and partial Mantel tests were conducted in R version 4.0.1, using packages vegan (Oksanen et al., 2019) and ape (Paradis and Schliep, 2019), with significance level determined from 999 permutations in each test.

Morphology of H. tenuis Hindgut Protist Symbionts
Using light microscopy, we observed the protists inhabiting the guts of live termites collected in Ecuador, Panama, and French Guiana. We identified four genera of protists, all belonging to the phylum Parabasalia: Pseudotrichonympha, Cthulhu, Cononympha, and Holomastigotoides (Figure 1). Each of these genera had been previously reported to inhabit the gut of H. tenuis, though Cononympha was previously known as Microspironympha or Spirotrichonympha (Mackinnon, 1926(Mackinnon, , 1927de Mello, 1954;Saldarriaga et al., 2011;James et al., 2013). We did not observe any additional genera beyond those previously reported to occur in H. tenuis.
Pseudotrichonympha (Trichonymphida) is a large, hypermastigote, wood-feeding protist. It has an anterior cell portion called a rostrum and a posterior portion called the cell body or post-rostral area. The rostrum has a smooth apical cap and an internal tube-like structure. Except for the apical cap, the entire cell is covered with flagella of three different lengths; short flagella on the rostrum, long flagella forming a fringe around the cell at the base of the rostrum, and medium-length flagella covering the rest of the cell (Grassi and Foà, 1911;Grassi, 1917;Saldarriaga et al., 2011). We observed two Pseudotrichonympha morphotypes in H. tenuis, one with a raised rim around the margin of its apical cap and one without (Figures 1A,B). Whether this characteristic represents a stable, heritable trait or a variable or artifactual one remains unknown. Previously published electron micrographs of Pseudotrichonympha from H. tenuis also showed a raised rim around the apical cap while light micrographs from the same study lacked this feature (Saldarriaga et al., 2011).
Cthulhu (Honigbergiellida) is a very small protist defined by its bundle of roughly 20 anterior flagella. This feature differentiates it from its relatives Cthylla and Hexamastix, which each have five anterior flagella (James et al., 2013). The number of flagella borne by the H. tenuis symbiont is difficult to capture in still micrographs of live specimens ( Figure 1C), but video microscopy demonstrates that the number is more than 10 (Supplementary Video 1).
Cononympha (Spirotrichonymphida) is similar to Holomastigotoides in having spiral bands of flagella, but these come together at the cell apex to form a spiral staircase-like structure called a pseudorostrum with an internal axial tube-like structure called a columella (Koidzumi, 1921;Jasso-Selles et al., 2017). We observed two distinct Cononympha morphotypes inhabiting H. tenuis (Figures 1D,E). One type had rounded cells measuring less than 30 µm in length with a nucleus positioned near the anterior of the cell (Figure 1D), while the other type had larger, more elongated cells that measured 35-65 µm in length and a more centrally positioned nucleus ( Figure 1E).
Holomastigotoides (Spirotrichonymphida) is also a woodfeeding, highly flagellated protist, but bears helical bands of flagella that originate at the cell apex and encircle the cell nearly to its posterior pole. Its nucleus is positioned anteriorly (Grassi and Foà, 1911;Grassi, 1917;Brugerolle and Lee, 2000). Within this morphological category, we observed a range of cell sizes and shapes, suggesting multiple Holomastigotoides species in H. tenuis (Figures 1F-J). One morphotype was characterized by very large cells, measuring 130-180 µm in length, with a flexible, irregular outline (Figures 1I,J). Other cells were smaller with an acutely pointed apex ( Figure 1G) or an apex that tapered gradually to a point ( Figure 1H) or a more rounded apex (not shown).

Molecular Phylogenetic Characterization of Individually Isolated Protist Cells
For molecular characterization of these symbionts, we isolated single protist cells and amplified, cloned, and sequenced their 18S rRNA genes. Cells were selected to represent the full range of morphological variability within each genus as much as possible. From 20 single cells, we successfully sequenced 116 clones, which clustered into eight groups of closely related sequences (specieslevel phylotypes) (Supplementary Table 2). We recovered two phylotypes from our isolated Pseudotrichonympha cells, one phylotype of Cthulhu, two phylotypes of Cononympha, and three phylotypes of Holomastigotoides, all of which branched with previously characterized congeners, consistent with our morphological identifications. Six of these phylotypes branched closely with previously published sequences from H. tenuis symbionts (Noda et al., 2007;Saldarriaga et al., 2011;James et al., 2013;Gile et al., 2018). We labeled our new sequences concordantly, so Pseudotrichonympha paulistana follows Saldarriaga et al., 2011, Pseudotrichonympha sp. follows Noda et al., 2007, and the three Holomastigotoides phylotypes follow previously published clones Ht1, Ht2, and Ht3 (Noda et al., 2007;Saldarriaga et al., 2011;Gile et al., 2018). The single Cthulhu phylotype did not require a distinguishing phylotype label. Cononympha phylotypes were numbered 1 and 2 arbitrarily but consistently.
In order to maximize the number of alignable sites, we carried out phylogenetic analyses separately for Pseudotrichonympha (Figure 2), Cthulhu (Figure 3), and the Spirotrichonymphida genera Holomastigotoides and Cononympha (Figure 4). The position of the two Pseudotrichonympha phylotypes was not resolved, though Pseudotrichonympha from Heterotermes hosts formed a supported clade (Figure 2). Our Cthulhu sequences were nearly identical to a previously published symbiont clone from H. tenuis collected in northern Colombia (James et al., 2013; Figure 3). This sequence (JX975351) was amplified from whole gut DNA, not an isolated cell, so it was not attributed to Cthulhu, though Cthulhu morphotypes were observed in H. tenuis in the same study (James et al., 2013). Our new sequence was amplified from a pool of eight Cthulhu cells isolated from H. tenuis collected in Ecuador (Figure 3). For Cononympha, the two phylotypes formed a clade with Cononympha aurea from Heterotermes aureus, the only Cononympha sequences so far available from any Heterotermes host (Jasso-Selles et al., 2017; Figure 4). Finally, two of the three Holomastigotoides phylotypes branched together with weak support while the third phylotype branched sister to the Holomastigotoides species from Heterotermes aureus. In contrast to Pseudotrichonympha, the Holomastigotoides species from Heterotermes hosts did not form a clade (Figure 4).

Hindgut 18S rRNA Community Profiling
We also carried out 18S rRNA V4-V5 amplicon sequencing (metabarcoding) to characterize the hindgut parabasalian symbionts of H. tenuis sampled across much of its range. We included H. tenuis from four countries, French Guiana, Panama, Ecuador, and Brazil, with Brazilian samples coming from 10 states and both forest and savannah habitats (Figure 5). In total, we profiled the protist community of 95 individual worker termites from 39 distinct colonies, of which 16 were represented by a single individual (all ethanol-preserved museum specimens) and 21 had 2-4 individuals, while live collections from Ecuador and French Guiana had six and 20 individuals (biological replicates), respectively. In order to link each of these profiles to host genotype, we sequenced the mt16S of at least one individual from each colony. Our maximum likelihood phylogeny of these molecular barcodes recovered three distinct clades with moderate support (Figure 5), in agreement with a previous study (Carrijo et al., 2020).
After trimming, merging, and denoising, we assigned taxonomy to our amplicon sequence variants (ASVs) using a custom reference database of full-length Parabasalia 18S rRNA sequences, including all 116 Sanger sequences generated in this study. All Parabasalia ASVs from the 95 samples were assignable to the eight phylotypes we characterized from single cells. Note that ASVs are equivalent to zero-radius OTUs; each has a unique sequence but no longer includes read abundance information. Together the ASVs represent the total sequence diversity present in the 18S rRNA amplicons. The number of ASVs per protist phylotype in our data set ranged from 17 (Cthulhu) to 136 (Holomastigotoides Ht2), for a total of 557 protist ASVs (Supplementary Table 3).
For a closer look at ASV phylogenetic diversity, we aligned all ASVs with the new and previously published 18S rRNA sequences used in the isolated cell analyses above (Figures 2-4) and computed phylogenetic trees (Supplementary Figures 1-4). All ASVs branched with the phylotypes to which they had been assigned, though they revealed considerably more sequence variability than the fewer single cell clones ( Supplementary  Figures 1-4). Particularly high levels of sequence divergence were exhibited by the two Cononympha phylotypes (Supplementary  Figure 1, Supplementary Table 3), but because they lack clearly demarcated subclades, we maintain our interpretation of two phylotypes. In contrast, the ASVs clustering with Holomastigotoides Ht1 did form two supported subclades and therefore could alternatively be considered as two phylotypes, possibly representing two distinct species (Supplementary  Figure 2). In keeping with this possibility, we noted that ASVs from these two subclades are nearly mutually exclusive in distribution, co-occurring in just one out of the nine colonies in which Ht1 was found, and suggesting they might belong to distinct cellular lineages. Because these subclades are very closely related and we did not isolate a representative cell from the additional subclade, we will consider Ht1 as one phylotype in this work. In sum, the ASV diversity across our data set supports eight protist species-level phylotypes as a reasonable accounting of the H. tenuis symbiont community. The 18S rRNA sequence variability within each of these protist phylotypes, including Ht1 (Supplementary Table 3), is consistent with that previously observed for named termite-associated protist species (Tai et al., 2014;Taerum et al., 2019Taerum et al., , 2020Jasso-Selles et al., 2020).

Distribution of Protist Phylotypes Across H. tenuis Colonies
Our sampled colonies contained different combinations of these eight phylotypes, with only two colonies harboring all eight phylotypes (Supplementary Table 1). None of the protist phylotypes was restricted to a single host haplotype group or a specific geographical region. In several colonies, we detected only two or three protist phylotypes. However, due to the destructive nature of our sampling, we were limited to just one termite individual from several of our museum collection colonies. This modest level of sampling increases the likelihood that protist species present in these samples' colonies might not be detected by 18S rRNA profiling, whether due to biological variability among nestmates (e.g., in recently molted individuals) or technical limitations (e.g., biased sample degradation or allelic dropout during PCR). We therefore examined the prevalence of each protist phylotype across a reduced set of colonies for which at least two individual termites were sampled.
Holomastigotoides Ht2 was the most prevalent phylotype, present in 22 out of 23 colonies (96%). Next most common were Pseudotrichonympha paulistana and Pseudotrichonympha sp., each present in 21 of the 31 colonies (91%). Only one colony harbored neither Pseudotrichonympha species: our live colony from Panama that was in visible decline in the lab. While we did isolate a single Pseudotrichonympha cell (Gam11) from this colony shortly after collection, by the time we collected gut samples for 18S rRNA profiling no Pseudotrichonympha species were apparent by light microscopy. We have also observed Pseudotrichonympha aurea disappear from declining lab colonies of Heterotermes aureus (Jasso-Selles et al., 2017). The next most prevalent phylotypes were Cononympha sp. 1 and Cononympha sp. 2, occurring in 17 and 19 out of 23 colonies, respectively (74% and 83%). The phylotypes with the patchiest distributions were Cthulhu and Holomastigotoides Ht1, present in six and four colonies, respectively (26% and 17%, Supplementary Table 4).
Because differences in preservation mode among samples (i.e., ethanol, RNAlater, live) might lead to biased detection of protist phylotypes (Hammer et al., 2015), we also examined protist phylotype prevalence across each sample preservation mode separately (Supplementary Table 4). The ethanol-only set showed very similar values to those observed in the 2 + termite subset, with the same ranked order of protist phylotypes by proportion of colonies in which they were detected (i.e., Ht2 > Pseudotrichonympha spp. > Cononympha sp. 2 > Cononympha sp. 1 > Ht3 > Cthulhu > Ht1). However, the RNAlater (n = 7) and live (n = 3) subsets deviated from this trend in some respects. Most notably, both Cononympha species were detected in all of the non-ethanol-preserved samples, suggesting that ethanol samples might be biased toward faster degradation of Cononympha sequences. Holomastigotoides Ht3 was also more prevalent in these sample types than in ethanol, present in six out of seven RNAlater colonies and all three live colonies, while Cthulhu was completely absent from RNAlater colonies and present in one out of three live colonies. Without further sampling it is impossible to determine whether these differences reflect technical bias or true biological differences in the sampled colonies. Note that six out of seven RNAlater colonies and one out of three live colonies come from Panama, as opposed to just one out of 29 ethanol-preserved colonies.
The most common number of protist phylotypes recovered across our 2 + termite colonies is six (seen in 9 out of 23 colonies), and these communities generally consist of both Pseudotrichonympha species, Holomastigotoides Ht2 and Ht3, and both Cononympha species. In other words, they most commonly lack Cthulhu and Holomastigotoides Ht1 (Supplementary Table 1). The next most common number of protist phylotypes is five (5 out of 23), and these colonies are most often additionally missing one of the Cononympha species or Holomastigotoides Ht3. The seven-and eight-protist phylotype colonies (1 and 2 out of 23) add one or both of the rarer Cthulhu and Holomastigotoides Ht1. From these observations, we qualitatively conclude that the "typical" hindgut protist community of H. tenuis consists of Pseudotrichonympha paulistana and/or Pseudotrichonympha sp., Holomastigotoides Ht2, another of the H. tenuis-associated Holomastigotoides species, and at least one of the H. tenuis-associated Cononympha species.

Diversity and Phylogeography of Protist 18S rRNA Sequence Variants
There is considerable diversity among 18S rRNA amplicon sequence variants (ASVs) within each protist phylotype, and the level of diversity itself varies among phylotypes ( Supplementary   Figures 1-4, Supplementary Table 3). This is consistent with the intragenomic sequence variability of 18S rRNA genes observed in single cell PCR (e.g., Saldarriaga et al., 2011;Taerum et al., 2018). We recovered between 17 (Cthulhu) and 136 (Holomastigotoides Ht2) unique ASVs for each protist phylotype across our data set. However, the number of unique ASVs per phylotype correlated strongly with the prevalence of each phylotype across our data set (R 2 = 0.84), suggesting that the observed among-phylotype variability in number of unique ASVs is driven by sampling depth rather than interspecific biological differences (Supplementary  Table 3). Similarly, the proportion of ASVs that were unique to a single location varied across phylotypes, and these values were also correlated with phylotype prevalence across samples, though weakly (R 2 = 0.49, Supplementary Table 3).
We next asked whether the genetic distances among protist ASVs correlate with geographical distances using Mantel tests ( Table 1). Mantel tests compare two distance matrices (i.e., uncorrected pairwise distances for all ASVs and geographical distances between sample locations) and compute a test statistic from cross products of the two matrices. Significance can Green background indicates statistical significance (p < 0.05). For full Mantel tests, each protist species' ASV uncorrected pairwise distances were compared to geographical distances and to host mt16S uncorrected pairwise distances (host phylogeny), and H. tenuis mt16S distances were compared to geographical distances. Partial Mantel tests were carried out between protist ASV distances and host mt16S distances, with geographical distances as a control, and between protist ASV distances and geography, with host mt16S distances as a control.
be assessed by comparison to a distribution of test statistics from permutated data (Diniz-Filho et al., 2013;Buttigieg and Ramette, 2014). Note that while biased sample degradation would be expected to impact the inferred presence or absence of a phylotype, it would not be expected to impact the sequence of nucleotides in an amplified fragment. Our Mantel tests therefore include ASVs from all 39 colonies sampled in this study. For nearly all protist species, there was no correlation between geographical distance and genetic distance of ASVs. Only one species, Holomastigotoides Ht3, showed a weakly significant correlation between the genetic distances among its ASVs and the geographical distances between their sample locations (r = 0.211, p = 0.024; Table 1). These results indicate an overall lack of geographical pattern in the protist ASV sequence diversity in our data set.
An important caveat to this test is that protist biogeography depends on host biogeography, and host biogeography might be correlated with host phylogeny. We therefore carried out a series of Mantel tests to examine this relationship further. First, we compared host mt16S distances to host geographical distances and found no significant correlation, suggesting that host phylogeny should not be a confounding factor in this case. Next, we computed partial Mantel tests between protist ASV genetic and geographical distances while controlling for host phylogeny, and again found only a weakly significant correlation between Holomastigotoides Ht3 genetic and geographical distances (r = 0.181, p = 0.029; Table 1).
We also checked for correlation between protist ASV distances and host mt16S distances. Although host phylogeny is known to impact protist phylogeny and community composition in termites (Tai et al., 2015), it is unknown whether this influence manifests or is detectable at the host population level. In both full and partial Mantel tests, there was no detectable correlation between host and protist genetic distances, except for one species: Holomastigotoides Ht1, whose genetic distance showed a significant correlation with that of the host genetic distance (full test r = 0.500, p = 0.005; partial test controlling for geography r = 0.503, p = 0.014; Table 1).

Drivers of Protist Diversification
In this study, we asked whether the considerable intraspecific variation in protist ASVs exhibits any patterns with respect to geography. Because protist communities are reported to be quite consistent across their termite host species, it is perhaps unsurprising that there was no significant correlation between most of the protists' 18S rRNA ASV sequence diversification and their geographical or host phylogenetic distances. This suggests that there is as much ASV diversity within samples and/or between close (geographically or host phylogenetically) samples as there is across more distant samples. However, we did detect a weakly significant correlation between geographical and ASV pairwise distances of Holomastigotoides Ht3 (Table 1). This phylotype exhibits considerable shared sequence divergence (i.e., long branches) among subsets of ASVs, though their phylogenetic relationships are not resolved (Supplementary Figure 2). The most divergent Holomastigotoides Ht3 ASVs are closely related to our single-cell 18S rRNA sequences from Panama while the rest of the ASVs mainly come from Brazilian samples. This appears to be driving the correlation between ASV and geographical distances, which are greatest between Panama and southern and eastern Brazil.
The other significant correlation in our data set was between host mt16S pairwise distances and the pairwise distances of ASVs from Holomastigotoides Ht1. This is the protist phylotype for which the ASVs formed two supported, sister subclades in phylogenetic analyses (Supplementary Figure 2). As mentioned previously, these ASV subclades had a nearly mutually exclusive distribution across host colonies, making it unlikely that they co-exist in a single genome. While both subclades were found in samples from both the red and blue host haplotype groups ( Figure 5, Supplementary Table 1), they were differentially distributed among the haplotype subclades. In particular, host colonies HFE, Ht061, and Ht069, which branch relatively deeply in the red haplotype group, harbor one of the Holomastigotoides Ht1 subclades, while host colonies Ht409, Ht431, and Ht406 in the crown clade of the red haplotype group harbor the other subclade of Holomastigotoides Ht1. This differential distribution of ASV subclades within the red host haplotype group appears to be driving the correlation between ASV and host mt16S distances.
Whether or not these correlations reveal incipient speciation in Holomastigotoides Ht1 or Ht3, they certainly highlight plausible scenarios by which termite-associated protists might speciate independently of their hosts or of each other (Figure 6). Such examples are welcome because the coevolutionary history of termites and their protists is too complex to be explained by co-speciation alone. In the case of Holomastigotoides Ht3, whose ASV genetic distances correlate with geographical distances, a divergent subset of ASVs is found almost exclusively in samples from Panama, on the margin of its mainland range. This pattern is reminiscent of peripatric speciation, in which a small marginal population becomes isolated from the parent population and diverges by genetic drift (Mayr, 1954). Although the host genetic distances do not show this pattern, Holomastigotoides Ht3 in Panama would be isolated from its conspecifics if they were missing from contiguous host populations toward the mainland. As our data suggest, the lack of Holomastigotoides Ht3 in certain host colonies is not only plausible but common.
Holomastigotoides Ht1 is the only H. tenuis symbiont whose ASV genetic distances are significantly correlated with the mt16S genetic distances of its host. While co-phylogeny at the genus level has been demonstrated between Pseudotrichonympha and its rhinotermitid hosts (Noda et al., 2007), incomplete protist 18S rRNA lineage sorting would be expected to obscure codiversification in recently diverged host lineages (Maddison, 1997). Indeed, there is no detectable protist diversification in symbionts of recently diverged Zootermopsis species (Taerum et al., 2018), and evidence of incomplete lineage sorting has been reported from Teranympha symbionts of Reticulitermes (Noda FIGURE 6 | Scenarios of evolutionary diversification in termite-specific protists. Open black ovals represent termite colonies while filled circles indicate one of their symbiotic protist species. Note that termite-specific protists are biparentally inherited and do not persist in the environment outside their hosts. (A) Co-diversification of host and symbiont species. If host populations become separated by some barrier to gene flow (dashed vertical line), symbiont populations will also be separated. Over time protists and hosts can co-speciate. (B) If certain host colonies lose the symbiont species (empty ovals), they can form a barrier isolating two populations of symbionts. Symbiont populations can then diverge from one another in the absence of host diversification. This might occur more readily on the margins of a host species' range, in a process reminiscent of peripatric speciation. Note that other symbiont species occupying the same host species could remain unaffected. (C) If a symbiont species becomes rare due to loss from many host colonies, its inheritance becomes nearly uniparental: any host colony harboring that symbiont species would have inherited it from just one parent (i.e., king or queen), unless both parents came from the same colony. In this pseudo-uniparental inheritance scenario, symbiont genetic diversification is more likely to be congruent with that of the uniparentally inherited mitochondrial genome. et al., 2018). So why is host/symbiont co-diversification detectable in Holomastigotoides Ht1? We suspect that the cause is the patchy distribution of this protist phylotype. Because Holomastigotoides Ht1 is more commonly absent than present among our sampled H. tenuis colonies, a host lineage lacking Holomastigotoides Ht1 is unlikely to gain it by mating with a different host lineage, and daughter colonies are unlikely to inherit Holomastigotoides Ht1 from both parents, unless both parents belong to the same host lineage. In this way, protist populations could become isolated even while their hosts maintain gene flow. Protist rarity would therefore strengthen the congruence between maternally inherited mt16S distances and biparentally inherited protist ASVs. Another consequence of protist rarity could be the pruning of ASV diversity whenever a protist species fails to be transmitted to a daughter colony. This would further accelerate the ASV lineage sorting process.

Variability of Protist Community Composition
From early morphology-based observations it became a paradigm that symbiont communities are highly consistent across populations of a termite species with only occasional exceptions where a colony might be lacking one of the expected protist species (Kirby, 1937(Kirby, , 1949. This assumption was first directly tested in Japanese Reticulitermes species, in which colonies missing protist species were found to be common (Kitade and Matsumoto, 1993). For example, 10 out of 41 Reticulitermes speratus nests were found to be missing one or two of the 15 identified protist morphospecies (Kitade and Matsumoto, 1993). A similar prevalence of missing protists was seen in Hodotermopsis sjostedti, in which 13 out of 34 nests were missing one or two protist morphospecies (Kitade et al., 2012). These relatively high-variability termite species stand in contrast with Coptotermes formosanus, in which all 11 nests investigated contained all three of the recognized morphospecies, though an unidentified small trichomonad flagellate was also present in three of the colonies examined . In our study, 21 out of 23 H. tenuis colonies were missing at least one of the eight protist species, 20 were missing two or more, and 11 were missing at least three species. By these metrics, the H. tenuis protist community has far higher inter-colony variability than any other reported to date. In H. tenuis, colonies missing protist species are the norm rather than the exception.
An important caveat to this comparison is the difference in methods. First, our combined morphological and molecular approach likely detects more species than a purely morphological approach, providing more opportunity to detect missing species. For example, the protist community of C. formosanus was long considered to consist of three species, Pseudotrichonympha grassii, Holomastigotoides hartmanni, and Cononympha (Spirotrichonympha) leidyi (Koidzumi, 1921), but molecular studies have detected an additional Holomastigotoides and an additional Cononympha species Nishimura et al., 2020). Thus, although the morphology-based study of C. formosanus found all three of the traditional morphospecies in all 11 colonies examined, the absence of a Holomastigotoides and/or Cononympha species could have gone unnoticed . Next, molecular approaches have different biases. Amplicon sequencing may fail to detect some protist species, particularly those present in low abundance (Michaud et al., 2020). On the other hand, index hopping may result in protist species being erroneously detected in samples where they are absent (van der Valk et al., 2020). By considering only the colonies for which we sampled at least two termite individuals, and by employing dual indexing in our experimental design, we aimed to mitigate both of these potential biases, but we still likely detected more inter-colony variability than a morphological study alone would have.

CONCLUSION
Here, we have characterized the H. tenuis symbiont community as consisting of eight protist phylotypes. By combining 18S rRNA data from single isolated cells with high-throughput amplicon sequencing, we were able to confidently delimit these phylotypes and investigate their distribution across much of the geographical range of H. tenuis. None of the protist phylotypes was restricted to a single host haplotype group or a specific geographical region. Instead, we found that the protist symbiont community of H. tenuis is highly variable, with only two colonies harboring all eight phylotypes, and the vast majority of colonies missing at least two phylotypes. This result should be interpreted with caution due to limitations in our level of sampling, but it is so extreme that even a heavy bias, once corrected, would leave H. tenuis with the title of most variable protist community. We also investigated each phylotype's genetic diversity across host colonies and found an overall lack of correlation with geographical or host phylogenetic distances, unsurprisingly for samples from a single host species. However, the two exceptions highlighted theoretical mechanisms of protist speciation that could work within the context of a diverse and widespread host species such as H. tenuis. These mechanisms bring a new perspective to the complex patterns of coevolution between termites and their protist symbionts.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the manuscript/Supplementary Material.

AUTHOR CONTRIBUTIONS
DJ-S and GG performed microscopy. DJ-S performed single cell isolation. FM, NC, DJ-S, JS, and AR performed molecular work. FM, NC, and GG performed data analyses. PS, JŠ, DS-D, RS, and TC performed termite collection, maintenance, and preservation. GG performed study conception and manuscript writing.

ACKNOWLEDGMENTS
We would like to thank Mikaela Garcia, Xyonane Segovia, and Shaurya Aggarwal for technical assistance.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2021. 640625/full#supplementary-material Supplementary Figure 1 | Unrooted maximum likelihood phylogenetic tree of Cononympha 18S rRNA gene sequences from isolated cells and from amplicon profiling of termite gut contents. Nodes are intentionally compressed to show backbone structure. Isolated cell sequences determined in this study are nearly full length (∼1500 bp) and are indicated by red font. ASV sequences from amplicon profiling comprise only the V4-V5 regions of the 18S rRNA gene (∼450 bp) and are indicated by qiime2-generated sequence IDs in black text.
Supplementary Figure 2 | Unrooted maximum likelihood phylogenetic tree of Holomastigotoides 18S rRNA gene sequences from isolated cells and from amplicon profiling of termite gut contents. Nodes are intentionally compressed to show backbone structure. Isolated cell sequences determined in this study are nearly full length (∼1500 bp) and are indicated by red font. ASV sequences from amplicon profiling comprise only the V4-V5 regions of the 18S rRNA gene (∼450 bp) and are indicated by qiime2-generated sequence IDs in black text.
Supplementary Figure 3 | Unrooted maximum likelihood phylogenetic tree of Cthulhu 18S rRNA gene sequences from isolated cells and from amplicon profiling of termite gut contents. Isolated cell sequences determined in this study are nearly full length (∼1500 bp) and are indicated by red font. ASV sequences from amplicon profiling comprise only the V4-V5 regions of the 18S rRNA gene (∼450 bp) and are indicated by qiime2-generated sequence IDs in black text.
Supplementary Figure 4 | Unrooted maximum likelihood phylogenetic tree of Pseudotrichonympha 18S rRNA gene sequences from isolated cells and from amplicon profiling of termite gut contents. Nodes are intentionally compressed to show backbone structure. Isolated cell sequences determined in this study are nearly full length (∼1500 bp) and are indicated by red font. ASV sequences from amplicon profiling comprise only the V4-V5 regions of the 18S rRNA gene (∼450 bp) and are indicated by qiime2-generated sequence IDs in black text.
Supplementary Table 1 | Termite collection data, number of individuals assayed per collection, and protist taxonomy assignment of 18S rRNA ASV reads. Supplementary Video 1 | Live Cthulhu cell in motion displaying > 10 flagella in an anterior bundle. Scale bar 10 µm.