Comparative Plastid Genomics of Non-Photosynthetic Chrysophytes: Genome Reduction and Compaction

Spumella-like heterotrophic chrysophytes are important eukaryotic microorganisms that feed on bacteria in aquatic and soil environments. They are characterized by their lack of pigmentation, naked cell surface, and extremely small size. Although Spumella-like chrysophytes have lost their photosynthetic ability, they still possess a leucoplast and retain a plastid genome. We have sequenced the plastid genomes of three non-photosynthetic chrysophytes, Spumella sp. Baeckdong012018B8, Pedospumella sp. Jangsampo120217C5 and Poteriospumella lacustris Yongseonkyo072317C3, and compared them to the previously sequenced plastid genome of “Spumella” sp. NIES-1846 and photosynthetic chrysophytes. We found the plastid genomes of Spumella-like flagellates to be generally conserved with respect to genome structure and housekeeping gene content. We nevertheless also observed lineage-specific gene rearrangements and duplication of partial gene fragments at the boundary of the inverted repeat and single copy regions. Most gene losses correspond to genes for proteins involved in photosynthesis and carbon fixation, except in the case of petF. The newly sequenced plastid genomes range from ~55.7 kbp to ~62.9 kbp in size and share a core set of 45 protein-coding genes, 3 rRNAs, and 32 to 34 tRNAs. Our results provide insight into the evolutionary history of organelle genomes via genome reduction and gene loss related to loss of photosynthesis in chrysophyte evolution.


INTRODUCTION
Chrysophytes are a large algal group with diverse morphologies and various nutritional modes, including phototrophy, mixotrophy, and heterotrophy. Among them, mixotrophs and heterotrophs are ecologically important eukaryotes that feed on bacteria and other eukaryotes inhabiting freshwater, brackish, and marine environments. The phototrophic chrysophytes have golden-brown plastids with chlorophylls a and c. The chrysophyte plastid is derived from a red alga through secondary (i.e., eukaryote-eukaryote) endosymbiosis (Kim and Archibald, 2009). As stramenopiles, chrysophytes are closely related to the Synchromophyceae and Eustigmatophyceae. They are also phylogenetically grouped together with 16 classes of plastid-containing stramenopiles, including brown algae and diatoms (Han et al., 2019; Kim et al., 2019).
Many comparative genomic and phylogenomic studies have used plastid genome data to shed light on the evolution of photosynthetic stramenopiles (Ruck et al., 2014; Ševčíková et al., 2015; Han et al., 2019; Kim et al., 2019). In non-photosynthetic species of chrysophytes, the ultrastructures of remnant plastids have been reported (Grossmann et al., 2016), and a recent genome and transcriptome-based phylogenomic study suggested parallel reductive evolution of non-photosynthetic chrysophytes, including the genus Paraphysomonas, which appears to have undergone complete loss of the plastid genome (Dorrell et al., 2019). However, many questions about the loss of photosynthesis in the chrysophytes remain unanswered; these can be addressed by investigating the structure and coding capacity of plastid genomes in diverse chrysophyte species.
To that end, we sequenced three plastid genomes from the following representative non-photosynthetic chrysophyte algal genera: Spumella, Pedospumella, and Poteriospumella. We carried out a detailed comparative analysis of their plastid genome structures and gene contents relative to each other and to published plastid genome sequences for the photosynthetic chrysophytes Ochromonas sp. CCMP1393 and synuralean algae, and for the non-photosynthetic "Spumella" sp. NIES-1846 (Ševčíková et al., 2015; Dorrell et al., 2019; Kim et al., 2019). Our results contribute to the growing body of knowledge relating to how gene content and genome structure change in response to the loss of photosynthesis in chrysophytes and other algae.
All cultures were derived from single-cell isolates for unialgal cultivation. Total genomic DNA was extracted from exponentially growing cell cultures using the QIAGEN DNeasy Blood Mini Kit (QIAGEN, Valencia, CA, USA) following the manufacturer's instructions. A paired-end library was prepared using the NexteraXT protocol (Illumina) according to the manufacturer's protocol. Whole genome sequencing was performed using the Illumina MiSeq platform to generate paired-end 2 × 300 bp sequencing reads. More than 2 Gb of raw data were generated for each species.
genome origin as follows: (1) BLAST searches against the entire assembly using commonly known plastid genes as queries resulted in significant hits for these contigs (Jung et al., 2017) and (2) the predicted gene contents were similar to the previously published 160 kbp plastid genome of the chrysophytes Ochromonas sp. (KJ877675) and Synura petersenii (MH795128).
To aid in gene annotation, we created a database of protein-coding, rRNA, and tRNA genes using data from previously sequenced chrysophyte plastid genomes. Preliminary annotation of protein-coding genes was performed using AGORA (Jung et al., 2018) and GeneMarkS (Besemer et al., 2001). The final annotation file was checked in Geneious Pro 10.2.2 (Kearse et al., 2012) using the ORF Finder (https://www.ncbi.nlm.nih.gov/orffinder/) with genetic code 11 (Bacterial, Archaeal, and Plant plastid code). The predicted open reading frames (ORFs) were checked manually, and the corresponding ORFs (and predicted functional domains) in the genome sequence were annotated.
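To illustrate what an ORF scan under genetic code 11 involves, the following is a minimal Python sketch of a forward-strand scan. It is not the algorithm implemented in ORF Finder or Geneious; the function name `find_orfs`, the restriction to ATG/GTG/TTG as start codons, and the minimum-length threshold are assumptions made for illustration only.

```python
# Illustrative sketch only: a naive forward-strand ORF scan.
# Translation table 11 uses TAA, TAG and TGA as stop codons; here we
# assume ATG, GTG and TTG as start codons (a simplification of the table).
STARTS = {"ATG", "GTG", "TTG"}
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive, and include the stop codon.
    min_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first start codon of the open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in STARTS:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

A complete scan would also process the reverse complement and would still require the manual checking described above.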
The tRNA genes were identified using tRNAscan-SE version 1.21 (Lowe and Chan, 2016) with the default settings using the "Mito/Chloroplast" model. To help identify rRNA gene sequences, a set of known plastid rRNA sequences from the public database was used as a query sequence to search our new genome data using BLASTn. Physical maps were visualized with OrganellarGenomeDRAW 1.3.1 (Greiner et al., 2019). For structural and synteny comparisons, the genomes were aligned using GeneCo with default settings.
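A BLASTn search of this kind is typically followed by selecting the best hit for each query. The sketch below assumes the hits were exported in the standard 12-column tabular format of BLAST+ (`-outfmt 6`), which the text does not specify; the function name `best_hits`, the e-value cutoff, and the example rows are hypothetical and are not data from this study.

```python
# Illustrative sketch: keep the best-scoring BLASTn hit per query from
# 12-column tabular output (BLAST+ "-outfmt 6"). The e-value cutoff and
# the example rows are hypothetical, not values from the study.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(lines, max_evalue=1e-10):
    """Return {query id: hit dict}, keeping the highest bit score per query."""
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, line.split("\t")))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] > max_evalue:
            continue  # discard weak hits
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Hypothetical rows: two rRNA reference queries against assembled contigs.
example = [
    "rrs_ref\tcontig_1\t95.2\t1400\t60\t7\t1\t1400\t5000\t6399\t0.0\t2100",
    "rrs_ref\tcontig_7\t80.1\t300\t55\t5\t10\t309\t100\t399\t1e-5\t150",
    "rrl_ref\tcontig_1\t93.0\t2800\t180\t16\t1\t2800\t7000\t9799\t0.0\t4000",
]
hits = best_hits(example)
```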

Phylogenomic and Phylogenetic Analyses
Phylogenomic analysis was carried out on amino acid sequence datasets created by combining 40 protein-coding genes (8,267 amino acids) from 98 plastid genomes of stramenopiles. The plastid genome of Hydrurus foetidus was analyzed and annotated based on publicly available data on scaffold UYFQ01001012 (Bråte et al., 2019). The sequences of six haptophyte species were used as outgroup taxa for rooting purposes. For the phylogeny of the nuclear-encoded SSU rDNA gene (1,591 nucleotides) from 177 chrysophyte taxa, Leukarachnion sp., Nannochloropsis limnetica, and Synchroma grande were used as outgroup taxa. The dataset was aligned using MUSCLE 8.0 in the program MacGDE2.6 (Smith et al., 1994; Edgar, 2004).
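Combining per-gene alignments into a single dataset can be sketched as follows. This is a generic concatenation routine, not the authors' pipeline: the function name `concatenate` is hypothetical, and filling genes that are missing from a taxon with gap characters is an assumption, since the text does not say how missing genes were handled.

```python
# Illustrative sketch: concatenate per-gene protein alignments into one
# supermatrix. Taxa lacking a gene receive a run of gaps of that gene's
# alignment width (an assumed convention, not stated in the text).
def concatenate(alignments):
    """alignments: list of {taxon: aligned sequence} dicts, one per gene."""
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("sequences within one alignment differ in length")
        width = widths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy example with two short "genes"; taxon b lacks the second gene.
matrix = concatenate([{"a": "MK", "b": "ML"}, {"a": "GG"}])
```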
Bayesian analyses were run using MrBayes 3.2.7 (Ronquist et al., 2012) with a random starting tree, two simultaneous runs (nruns = 2) and four Metropolis-coupled Markov chain Monte Carlo (MC3) chains for 2 × 10^7 generations, with one tree retained every 1,000 generations. The burn-in point was identified graphically by tracking the likelihood values using TRACER v. 1.6 (http://tree.bio.ed.ac.uk/software/tracer/). ML phylogenetic analyses of individual protein alignments and concatenated alignments were conducted using IQ-TREE v.1.5.2 (Nguyen et al., 2015) with 1,000 bootstrap replicates. ML phylogenetic analysis of the nuclear-encoded SSU rDNA gene was performed using RAxML 8.1.20 (Stamatakis, 2014) with the general time reversible plus Gamma (GTR + GAMMA) model. We used 1,000 independent tree inferences using the -# option of the program to identify the best tree. The best evolutionary model for each tree was selected using the posterior mean site frequency (PMSF) model (the LG+F+G tree as the guide tree) incorporated in IQ-TREE. Trees were visualized using FigTree v.1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).
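The sampling scheme implies a fixed number of retained trees per run, which the short calculation below makes explicit. The burn-in fraction used in the example is purely hypothetical, because the burn-in point in this study was chosen graphically; the count also ignores the tree at generation 0.

```python
# Illustrative arithmetic for the MCMC settings described above.
generations = 2 * 10**7   # generations per run
sample_every = 1_000      # one tree retained every 1,000 generations
n_runs = 2                # two simultaneous runs

# Trees sampled per run, ignoring the starting tree at generation 0.
samples_per_run = generations // sample_every


def retained(samples, burnin_fraction):
    """Samples left after discarding an initial fraction as burn-in."""
    return samples - int(samples * burnin_fraction)


# Hypothetical 25% burn-in, for illustration only.
kept_total = n_runs * retained(samples_per_run, 0.25)
```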

General Features of Non-Photosynthetic Chrysophyte Plastid Genomes
Three new plastid genomes were sequenced from the non-photosynthetic chrysophyte genera Spumella, Pedospumella and Poteriospumella (Table 1). The structures and coding capacities of these genomes were compared to those of the published genomes of the related photosynthetic chrysophyte Ochromonas sp. CCMP1393 and synuralean species (Ševčíková et al., 2015; Kim et al., 2019) and the non-photosynthetic "Spumella" sp. NIES-1846 (Dorrell et al., 2019). Most obvious is the fact that the plastid genomes of non-photosynthetic chrysophyte Spumella-like flagellates have lost almost all of the photosynthesis-related genes found in their photosynthetic relatives (Figure 1, see below). The plastid genome sizes of the four non-photosynthetic chrysophyte taxa range from ~53.2 kbp ("Spumella" sp. NIES-1846) to ~62.9 kbp (Poteriospumella lacustris), and the overall GC content ranges from 21.5% ("Spumella" sp. NIES-1846) to 38.9% (Spumella sp. Baeckdong012018B8) (Table 1). These taxa share a core set of 45 protein-coding genes, two rRNA operons, and 32-34 tRNAs. All of the plastid genomes of Spumella-like flagellates analyzed herein have typical IR regions, short single-copy (SSC) regions and long single-copy (LSC) regions. Putative lateral gene transfers from other organisms were not detected in any of the newly sequenced genomes.
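Summary statistics such as genome size and GC content can be computed directly from an assembled sequence. The sketch below shows one common definition of GC content; the function names are hypothetical, and excluding ambiguous bases from the denominator is an assumption rather than a documented choice of this study.

```python
# Illustrative sketch: genome size and GC content from a sequence string.
def gc_content(seq):
    """Percent G+C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


def size_kbp(seq):
    """Sequence length in kilobase pairs."""
    return len(seq) / 1000.0
```

For example, `gc_content("AATTGC")` evaluates two G/C bases out of six, i.e. roughly 33.3%.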

Photosynthesis-Related Genes
Not surprisingly, the bulk of the gene loss in the plastid genomes of Spumella-like flagellates occurred in relation to photosynthesis (Table 1, Figures 1 and 2). This includes loss of genes for photosystem I, photosystem II, the cytochrome b6/f complex and ATP synthase subunits (Figure 1). Neither the genes for carbon fixation (RuBisCO) and chlorophyll biosynthesis (chlI) nor those encoding cytochrome c biogenesis proteins (ccs1 and ccsA) were found to be present in any of the non-photosynthetic chrysophyte plastid genomes.
The psa and psb gene families encode protein subunits of photosystem I and photosystem II, respectively. Although 10 psa genes and two photosystem I assembly protein genes (ycf3 and ycf4) occur in photosynthetic Ochromonas sp. CCMP1393 and synuralean species (Ševčíková et al., 2015; Kim et al., 2019), none are present in the plastid genomes of any Spumella-like flagellates. Similarly, the colorless cryptophyte plastid genome has lost all 11 psa genes found in the genomes of photosynthetic cryptophyte species (Donaher et al., 2009; Kim et al., 2017; Tanifuji et al., 2020), and the plastid genomes of non-photosynthetic diatoms (colorless Nitzschia spp.) have also lost all ten psa genes found in photosynthetic diatoms (Kamikawa et al., 2018). The photosynthetic chrysophyte plastid genomes possess a total of 16 psb genes, but all of these genes were absent in the four Spumella-like flagellates. The entire set of 18 psb genes has also been completely lost in non-photosynthetic cryptophytes and diatoms (Figure 1B).
The pet (photosynthetic electron transport) genes are almost completely absent from the plastid genomes of all non-photosynthetic Spumella-like flagellates, with the curious exception of petF (Figures 1A, B). In photosynthetic organisms, pet proteins form a complex required for oxygenic photosynthesis, in particular the non-cyclic electron flow mediated by the cytochrome b6f complex at the thylakoid membrane. The petF (ferredoxin) coding region of non-photosynthetic chrysophyte Spumella-like flagellates, as well as non-photosynthetic cryptophytes, appears to be intact and the predicted protein is surprisingly well conserved (Figure 1B). The gene may be involved in various redox reactions, given that transcriptome data from Spumella-like flagellates (Dorrell et al., 2019) show that genes for the proteins glutamine synthetase and glutamate synthase are expressed, although they do not contain obvious plastid-targeting sequences. In other secondarily non-photosynthetic organisms such as the colorless euglenoid Euglena longa, the pet genes are all missing (Gockel and Hachtel, 2000). In addition to petF, the Spumella-like flagellates retain sufB/C (involved in Fe-S cluster biogenesis) in their plastid genomes.
FIGURE 1 | (A) Chrysophyte plastid genome content. Genes with two copies are shown in gray boxes. (B) Presence or absence of genes for photosystem I (PSI, psa-), photosystem II (PSII, psb-), the cytochrome b6/f complex (pet-), carbon fixation (rbc-), chlorophyll biosynthesis (chl-), cytochrome c biogenesis proteins (ccs-), the ATP synthase subunits (atp-), and the TAT system (tat-) is shown for three distantly related lineages, i.e., photosynthetic and non-photosynthetic chrysophytes, diatoms, and cryptophytes. Genes present/absent in plastid genomes are shown in brown or white, respectively. The rbc-gene present only in certain species is colored light brown. The data were derived from previously published studies (Donaher et al., 2009; Kamikawa et al., 2015; Kim et al., 2017; Dorrell et al., 2019; Kim et al., 2019; Tanifuji et al., 2020; this study).
Plastid genes involved in carbon fixation (RuBisCO subunits rbcL and rbcS) and its regulation (cbbX), chlorophyll biosynthesis (chlI), and cytochrome c biogenesis proteins (ccs1 and ccsA) were absent in the Spumella-like flagellates but present in the plastids of photosynthetic chrysophytes. The cryptophytes represent an interesting point of comparison in that the cbbX, rbcL and rbcS genes are found in the genome of some colorless Cryptomonas species (e.g., C. paramecium) but not others (Cryptomonas spp. SAG9772f and CCAP1634B), while colorless diatoms appear to uniformly lack such genes (Figure 1B).
In contrast to photosynthetic chrysophytes, non-photosynthetic cryptophytes and diatoms, Spumella-like flagellates lack a full complement of plastid ATP synthase subunit genes, which are typically associated with the electron transport chain of photosynthesis (Figure 1B). The plastid genomes of non-photosynthetic cryptophytes contain eight atp genes (atpA, B, D, E, F, G, H, and I), with the exception of an atpF pseudogene in Cryptomonas paramecium (Donaher et al., 2009; Tanifuji et al., 2020). Non-photosynthetic diatoms retain a near-complete set of ATP synthase genes in their plastid genomes, with atpE and atpF being present or absent in a species-specific fashion (Kamikawa et al., 2018). The loss of ATP synthase subunit genes in response to the loss of photosynthesis is a recurring theme in land plants and diatoms (Barrett et al., 2014; Kamikawa et al., 2018).
The ATP synthase complex is involved in generating proton gradients for thylakoid protein transporters, mediated by the twin arginine translocator (TAT) system, which depends on proton gradients to perform protein translocation across the thylakoid membrane. The tatC gene, the core protein of the TAT system, is absent from the genomes of the four Spumella-like flagellates compared here, whereas it is present in photosynthetic chrysophytes and non-photosynthetic cryptophytes and diatoms. With respect to plastid ultrastructure and ATP synthase complex genes, the Spumella-like flagellates we analyzed appear to have lost both thylakoid membranes and all of the ATP synthase complex genes that were present in their photosynthetic ancestors (Figure 1B). Interestingly, another Spumella-like flagellate, Cornospumella fuschlensis AR4D6, retains visible thylakoid structures (Grossmann et al., 2016). The colorless diatom Nitzschia sp. retains a few reduced thylakoid membranes in the plastid stroma and ATP synthase complex and TAT genes (Kamikawa et al., 2015), whereas the colorless cryptophyte Cryptomonas paramecium does not appear to possess any thylakoid membranes (Sepsenwol, 1973) but has retained many of its ATP synthase complex genes (Donaher et al., 2009; Tanifuji et al., 2020).
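Presence/absence comparisons of this kind can be represented as a simple matrix built from per-lineage gene sets. The sketch below encodes only a handful of genes named in this section for two groups; the function name `presence_matrix` is hypothetical, and the presence of petF in photosynthetic chrysophytes is implied by the text rather than stated outright.

```python
# Illustrative sketch: a presence/absence matrix from per-lineage gene sets.
# Only genes named in this section are included; petF in the photosynthetic
# group is implied by the text rather than stated explicitly.
GENES = ["rbcL", "rbcS", "cbbX", "chlI", "ccs1", "ccsA", "tatC", "petF"]

content = {
    "photosynthetic chrysophytes": set(GENES),
    "Spumella-like flagellates": {"petF"},
}


def presence_matrix(content, genes):
    """Return {taxon: [1 or 0 per gene]} in the order given by genes."""
    return {
        taxon: [int(gene in present) for gene in genes]
        for taxon, present in content.items()
    }


matrix = presence_matrix(content, GENES)

# Genes lost in the non-photosynthetic group relative to the photosynthetic one.
lost = content["photosynthetic chrysophytes"] - content["Spumella-like flagellates"]
```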

Housekeeping Genes in Non-Photosynthetic Chrysophytes
The ribosomal protein operons in the four analyzed Spumella-like flagellate plastid genomes were found to be almost identical in terms of gene content and structure relative to each other and to those of the photosynthetic chrysophytes. Twenty-five rpl genes for 50S ribosomal subunit proteins are found in photosynthetic chrysophytes, almost all of which are also present in the Spumella-like flagellate plastid genomes. The single exception is rpl24, which is absent in other Spumella-like flagellates but is present in Poteriospumella lacustris (Figure 2). The rpl24 gene has also been described as missing in the plastid genomes of Eustigmatophyceae, a stramenopile lineage that is closely related to chrysophytes based on plastid genome data (Ševčíková et al., 2019).
In terms of 30S ribosomal protein genes, seventeen are present in the plastid genomes of Spumella-like flagellates and photosynthetic chrysophyte species, with the exception of rps8 and rps20, which are missing in "Spumella" sp. NIES-1846. The rps14 gene was found to be duplicated and translocated in the genome of Spumella sp. Baeckdong012018B8 sequenced herein, whereas the previously published genome of "Spumella" sp. NIES-1846 is quite different from the three genomes we sequenced in terms of its rpl gene repertoire. No fewer than ten ribosomal protein genes were lost in the NIES-1846 strain: rpl18, rpl23, rpl24, rpl29, rpl31, rpl34, rpl35, rpl36, rps18, and rps20 (Figures 2 and 3). Six of these rpl gene losses (rpl18, rpl23, rpl24, rpl29, rpl31, and rpl36) map to the conserved ribosomal protein operon.
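The gene counts in this paragraph can be checked with basic set operations. The sketch below uses the gene names exactly as listed in the text for "Spumella" sp. NIES-1846; the variable names are hypothetical.

```python
# Illustrative tally of the ribosomal protein gene losses listed above
# for "Spumella" sp. NIES-1846 (gene names as given in the text).
lost_in_nies1846 = {
    "rpl18", "rpl23", "rpl24", "rpl29", "rpl31",
    "rpl34", "rpl35", "rpl36", "rps18", "rps20",
}
# Losses that the text maps to the conserved ribosomal protein operon.
lost_in_operon = {"rpl18", "rpl23", "rpl24", "rpl29", "rpl31", "rpl36"}

lost_rpl = sorted(g for g in lost_in_nies1846 if g.startswith("rpl"))
lost_rps = sorted(g for g in lost_in_nies1846 if g.startswith("rps"))
lost_outside_operon = lost_in_nies1846 - lost_in_operon
```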
The set of ribosomal protein genes is generally conserved amongst the plastid genomes of diverse species, speaking to their importance in the assembly of functional organellar ribosomes. The smallest known number of plastid ribosomal protein genes occurs in apicomplexan apicoplasts: between 15 and 17 in total (Wilson and Williamson, 1997). In other non-photosynthetic organisms, the number of plastid-encoded ribosomal protein genes is only slightly reduced relative to phototrophs, such as in Epifagus virginiana (Wolfe et al., 1992) and the parasitic green alga species Helicosporidium sp. (de Koning and Keeling, 2004), and in non-photosynthetic diatoms, chrysophytes, cryptophytes, and parasitic red algae (Figure 3 and Supplementary Table S1). For reasons that are unclear, the loss of ribosomal protein genes goes hand in hand with plastid genome reduction in non-photosynthetic species.
The tufA gene, which encodes the plastid protein synthesis elongation factor Tu (EF-Tu), is present in almost all sequenced plastid genomes. In the green line, tufA is present in the plastid genomes of most green algae, including the non-photosynthetic Helicosporidium and Polytoma species, but has been transferred to the nucleus in embryophytes and the charophycean Zygnema circumcarinatum (Martin et al., 1998; de Vries et al., 2013). In red alga-derived secondary plastids, tufA is present in all sequenced genomes, even in apicomplexans and the non-photosynthetic diatom Nitzschia species, the only apparent exception being peridinin-pigmented dinoflagellates, which have extremely reduced plastid genomes (Howe et al., 2008). With the exception of Spumella sp. Baeckdong012018B8, all sequenced chrysophyte plastid genomes harbor the tufA gene (Figure 2). While the tufA gene was found to be missing in the plastid genome of Spumella sp. Baeckdong012018B8, gene transcripts for a putatively plastid-targeted EF-Tu protein were detected in transcriptome data from members of the core Spumella clade (Dorrell et al., 2019).
The molecular chaperone protein-coding genes dnaK (a member of the hsp70 family) and groEL (a chaperonin gene) are found in almost all red alga-derived plastids (i.e., those of cryptophytes, haptophytes, stramenopiles), as well as rhodophytes and glaucophytes (Green, 2011; Kim et al., 2017). In non-photosynthetic species, the colorless cryptophytes retain both dnaK and groEL, whereas colorless diatoms have only the dnaK gene in their plastid genome (Dorrell et al., 2019; Tanifuji et al., 2020). The non-photosynthetic chrysophyte Spumella-like flagellate plastid genomes have uniformly lost the groEL gene as well (Table 1), while the dnaK gene shows a complex pattern of presence and absence in the four genomes we analyzed. The dnaK gene is located in the IR region and has been partially copied in both IR regions in Pedospumella sp. Jangsampo120217C5. The partially copied dnaK gene has also been detected in photosynthetic synuralean algae in a species-specific fashion. Here we found that the dnaK gene is present in Poteriospumella lacustris Yongseonkyo072317C3 but is absent in Spumella sp. Baeckdong012018B8 and "Spumella" sp. NIES-1846 (Figure 2).
FIGURE 3 | Presence/absence of plastid-encoded ribosomal protein genes in diverse algae and the cyanobacterium Anabaena variabilis. Taxa highlighted in bold correspond to those specifically analyzed in this study. Filled boxes indicate the presence of ribosomal protein genes (green=green alga-derived secondary plastid lineage; red=red alga-derived plastid lineage). The missing ribosomal protein genes (i.e., rpl7, rpl8, rpl15, rpl17, rpl25, rpl26, rpl30, and rps21) were not detected in the plastid genomes of any of the lineages examined herein. Accession numbers and fully surveyed datasets are provided in Supplementary Table S1, online Supplementary Material. 2, multi-copy genes; Y, pseudogenes.
The secA gene encodes a protein translocase subunit involved in the hydrolysis of ATP to transfer proteins into the thylakoid lumen. The secA gene is known to be absent in the glaucophytes, land plants and green algae, but is present in rhodophytes and the red alga-derived plastid genomes of cryptophytes, haptophytes, and stramenopiles. Colorless cryptophytes and diatoms (Nitzschia spp.) possess the gene (Dorrell et al., 2019; Tanifuji et al., 2020), but "Spumella" sp. NIES-1846 has lost it. In phototrophic chrysophytes, the secA gene is present in the SSC region of the plastid genomes of Ochromonas sp. CCMP1393 and synuralean species. In the non-photosynthetic chrysophytes, we found the secA gene to be present in Poteriospumella lacustris but absent in the plastid genomes of other Spumella-like flagellates; this gene absence correlates with extensive gene reduction in the SSC region (Figure 2). Interestingly, a secY-like gene was found between rpl36 and rpl18 in Poteriospumella lacustris (Figure 2, gradient red box). The Sec translocon subunits SecA and SecY may function together for protein translocation across the thylakoid membranes. The existence of a divergent secY gene in Poteriospumella lacustris is consistent with the idea of ongoing plastid genome reduction in these organisms.
Previous studies have shown that the tsf gene, which encodes elongation factor Ts, is present in primary red algal plastids (Grzebyk et al., 2003). While photosynthetic chrysophytes also have a tsf gene in their plastid genomes, among the heterotrophic Spumella-like flagellates the gene was found to be present only in Pedospumella sp. Jangsampo120217C5; it does not reside in the plastid genomes of the other Spumella-like species examined in our study (Figure 2). The tsf gene was also not detected in transcriptome data from Poteriospumella lacustris (strains JBC07 and JBM10), Pedospumella encystans and Pedospumella sinomuralis (Dorrell et al., 2019).
Finally, the acpP gene, which encodes an acyl carrier protein, thus far appears to be present only in the plastid genome of Poteriospumella lacustris and photosynthetic chrysophytes (Ševčíková et al., 2015; Kim et al., 2019) (Figure 2). The acyl carrier protein is involved in the fatty acid biosynthesis pathway and is variably present and absent in the plastid genomes of other algae across the eukaryotic tree of life (Grzebyk et al., 2003).

Plastid Genome Reduction and Rearrangement in Non-Photosynthetic Chrysophytes
A genome-wide comparison of synteny shows that plastid gene order in the Spumella-like flagellates is almost identical to that of photosynthetic chrysophytes (Figure 2), with the complete absence of the photosynthesis-related genes accounting for the bulk of the differences in genome size. We also carried out comparative genomic studies of plastid housekeeping genes in non-photosynthetic species among the red alga-derived plastids, including the "apicoplast" genome of the malaria parasite Plasmodium falciparum (Wilson et al., 1996; Waller and McFadden, 2005), colorless cryptophyte Cryptomonas species (Donaher et al., 2009; Tanifuji et al., 2020), and the colorless diatom Nitzschia sp. (Kamikawa et al., 2018). Overall, the presence of shared plastid genes in chrysophyte lineages reveals non-random retention of genes in the plastid genome despite the independent loss of photosynthesis in the four non-photosynthetic chrysophytes examined herein. Our data support the idea that independently evolved non-photosynthetic plastids show similar genome structure and gene loss patterns to those in the non-photosynthetic chrysophyte plastid genome.
The non-photosynthetic chrysophyte plastid genomes exhibit slightly different gene contents and structures in their inverted repeat (IR) regions (Figure 2). The IRs, which ranged in length from 8.08 kbp (Poteriospumella lacustris) to 8.53 kbp (Pedospumella sp. Jangsampo120217C5), contained three ribosomal protein genes (rpl21, rpl27, and rpl34), ycf60, three rRNAs and eight tRNAs. Twelve protein-coding genes in the repeat region were lost in the plastid genome of Spumella-like flagellates relative to the photosynthetic chrysophyte Ochromonas sp. CCMP1393 and synuralean plastid genomes (Figure 2).
The plastid genomes of the Spumella-like flagellates exhibit different gene orders and gene loss patterns among the three ribosomal proteins (rpl21, rpl27, and rpl34) and ycf60 in the IR/SSC junctions (Figure 2, marked in blue). The gene order in this region of the Pedospumella sp. Jangsampo120217C5 genome was found to be most similar to that of the photosynthetic chrysophyte Ochromonas sp. CCMP1393 and the synuralean species (i.e., loss of the secA gene and the following gene order: trnN-rpl34-ycf60-rpl27-rpl21-trnL; Figure 2). The genes in this region of the Poteriospumella lacustris genome were inverted (trnL-rpl21-rpl27-rpl34-trnS-trnM-ycf60-partial secA) and also contained a secA gene in the SSC region. The Spumella sp. Baeckdong012018B8 genome was significantly reduced in this area, with the rpl34-trnL-ycf60-rpl27-rpl21 genes located in the SSC region. Finally, in the plastid genome of "Spumella" sp. NIES-1846, only the trnN-trnL genes were encoded in the IR region, with the rpl27-rpl21-orf230 genes being located in the SSC portion of the genome (and, as noted above, the rpl34 gene was lost).
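One simple way to quantify how much of a local gene order is conserved is to count shared gene adjacencies, which are unaffected by the direction in which a region is read. The sketch below applies this to the gene orders quoted above; it is a generic comparison and not the method used by GeneCo, the function names are hypothetical, and the partial secA copy is represented simply as "secA".

```python
# Illustrative sketch: shared gene adjacencies between IR/SSC junction
# gene orders quoted in the text. Adjacencies are unordered pairs, so an
# inverted block still shares its internal adjacencies.
def adjacencies(order):
    """Set of unordered neighbouring gene pairs in a gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def shared(order_a, order_b):
    """Adjacencies present in both gene orders."""
    return adjacencies(order_a) & adjacencies(order_b)


pedospumella = ["trnN", "rpl34", "ycf60", "rpl27", "rpl21", "trnL"]
# The partial secA copy is written as "secA" here for simplicity.
poteriospumella = ["trnL", "rpl21", "rpl27", "rpl34",
                   "trnS", "trnM", "ycf60", "secA"]
spumella_b8 = ["rpl34", "trnL", "ycf60", "rpl27", "rpl21"]

conserved_poterio = shared(pedospumella, poteriospumella)
conserved_b8 = shared(pedospumella, spumella_b8)
```

In both comparisons the rpl27-rpl21 adjacency is retained, while the position of rpl34 and ycf60 relative to that pair differs.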
Previous studies have shown that expansions and contractions are common in the IR boundaries of diatom and green algal genomes (Goulding et al., 1996; Wang et al., 2008; Jansen and Ruhlman, 2012; Sabir et al., 2014; Turmel et al., 2015). Here we have observed that instances of SSC/IR expansion and contraction have also occurred during the evolutionary history of non-photosynthetic Spumella-like flagellates, leading to changes in genome structure and coding capacity. Such events can alter gene order through inversion, transposition, gene loss, and/or contraction of the IR region. Such dynamics have also been observed in the LSC/IR junction of the plastid genomes of photosynthetic synuraleans, leading to gene rearrangements, expansion/contraction of the IR region, and gene loss events. The contraction of the IR region caused by gene loss is also likely one of the factors contributing to plastid genome reduction in non-photosynthetic chrysophytes.

Gene Rearrangements
The Spumella-like flagellate plastid genomes are generally highly syntenic, with the exception of variations in the IR/SSC boundary regions and relative positions of 13 genes located in three clusters designated A, B and C (Figure 2). These clusters are as follows: (A) sufC-sufB, (B) ycf19-rps16-trnR-trnQ, and (C) clpC-trnF-trnW-rpl11-rpl1-rpl12-trnP. Three different cluster patterns were inferred. The first pattern, observed in Pedospumella sp. Jangsampo120217C5 and "Spumella" sp. NIES-1846, involves the clusters in the order A-B-C, and is shared with photosynthetic Ochromonas sp. CCMP1393 and synuralean species. The second pattern, A-B-C′, is observed in Poteriospumella lacustris, and the third pattern is seen in Spumella sp. Baeckdong012018B8; in this case the cluster order is C′-B′-A′ and is inverted relative to Pedospumella sp. Jangsampo120217C5 and "Spumella" sp. NIES-1846. The genes in these three clusters may be translocated and rearranged as a result of the loss of genes flanking the clusters in each species.
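The three cluster patterns can be written as signed orders, which makes the inversion relationship explicit. The sketch below assumes that a prime (′) denotes a cluster in inverted orientation, which the text implies but does not define; the names `CLUSTERS` and `invert` are hypothetical.

```python
# Illustrative sketch: cluster patterns as signed orders.
# Assumption: a primed cluster (e.g. C′) is in inverted orientation,
# encoded here as strand -1.
CLUSTERS = {
    "A": ["sufC", "sufB"],
    "B": ["ycf19", "rps16", "trnR", "trnQ"],
    "C": ["clpC", "trnF", "trnW", "rpl11", "rpl1", "rpl12", "trnP"],
}
n_genes = sum(len(genes) for genes in CLUSTERS.values())


def invert(order):
    """Invert a segment: reverse the cluster order and flip every strand."""
    return [(name, -strand) for name, strand in reversed(order)]


pattern_1 = [("A", 1), ("B", 1), ("C", 1)]     # A-B-C
pattern_2 = [("A", 1), ("B", 1), ("C", -1)]    # A-B-C′
pattern_3 = [("C", -1), ("B", -1), ("A", -1)]  # C′-B′-A′

# Under the stated assumption, pattern 3 is a single inversion of pattern 1.
is_single_inversion = invert(pattern_1) == pattern_3
```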

Phylogenetic Relationships Among Chrysophytes
Phylogenomic analysis using 40 plastid-encoded proteins showed a monophyletic assemblage of chrysophytes within stramenopiles with maximal support (posterior probability (PP)=1.00, maximum likelihood (ML)=100%, Supplementary Table S2). The chrysophyte assemblage formed a sister relationship with Eustigmatophyceae (Figure 4A and Supplementary Figure S1), consistent with recently published studies suggesting that eustigmatophyte plastids are closely related to those of chrysophytes (Ševčíková et al., 2015; Han et al., 2019; Kim et al., 2019; Ševčíková et al., 2019). Our plastid phylogenomic investigations showed that the mixotrophic species Ochromonas sp. CCMP1393 and the putatively facultative photoautotrophic species Hydrurus foetidus formed a strongly supported sister relationship with the exclusively photosynthetic synuralean assemblage (Figure 4A), which is consistent with previous multigene phylogenetic studies based on plastid-encoded proteins.
We also carried out a phylogenetic analysis of nuclear SSU rDNA sequences from a wide range of chrysophyte species (Figure 4B) to serve as a reference point for interpreting the plastid genome tree and, more generally, to better understand changes in nutritional modes during the evolution of the group. The photosynthetic synuralean lineage was found to form a monophyletic assemblage. We also found that sequences from non-photosynthetic chrysophytes were intermixed with those of mixotrophic lineages (Grossmann et al., 2016; Kristiansen and Škaloud, 2017; Dorrell et al., 2019; this study). Collectively, the combined results of plastid and nuclear gene phylogenies demonstrate that photosynthesis has been lost multiple times during the evolutionary history of chrysophytes.

The Biology and Evolution of Chrysophytes and Their Plastid Genomes
Photosynthetic chrysophytes are generally mixotrophs that rely on the ingestion of organic nutrients, with the exception of the phototrophic synuralean lineage (Graupner et al., 2018; Lie et al., 2018). The chrysophyte Hydrurus foetidus forms filamentous branches within a soft polysaccharide coat and is a putatively facultative phototrophic species (Klaveness and Lindstrøm, 2011; Lavoie et al., 2018). Genome size and cell size differ significantly among species exhibiting different nutritional modes (Olefeld et al., 2018). Where known, heterotrophic chrysophytes have the smallest genomes and cell volumes, phototrophs have the largest, and mixotrophs are generally in between the two. The extreme reduction in the plastid genome of the non-photosynthetic Spumella-like flagellates examined herein surely results from the evolution of a heterotrophic lifestyle concomitant with the loss of genes for photosynthesis that are no longer needed. According to Olefeld et al. (2018), the loss of photosynthesis in chrysophytes may be due to energetic factors. Chrysophytes generally do not have a carbon-concentrating mechanism (CCM), indicating that they do not use bicarbonate (HCO3−) as a carbon source in diverse water environments (Maberly et al., 2009) and that they are subject to carbon limitation as a selection pressure. Therefore, many chrysophyte lineages may have independently changed their nutritional mode to that of obligate heterotrophs.
The phylogenetic analyses carried out in our study and elsewhere (e.g., Dorrell et al., 2019) provide clear evidence for parallel evolution of plastid genomes in response to a shift to heterotrophy. How many times this occurred is still unclear. Nevertheless, from the data in hand we propose a modified model of chrysophyte plastid evolution, as follows (Figure 5). The red alga-type plastid of photosynthetic chrysophytes stems from a secondary (or possibly tertiary) endosymbiotic event in an ancestor shared with other plastid-bearing stramenopiles (see Sibbald and Archibald, 2020 and references therein; Figure 5, ①). Much later, after the diversification of chrysophytes, Spumella-like flagellates lost their photosynthesis-related genes on multiple occasions (Figure 5, ②, ③). Plastid genome reduction occurred as a result of gene loss from one side of the IR/SSC boundary region (Figure 5, ④), and the expansion of the SSC region led to the presence of remnant genes on the other side of the IR/SSC region (Figure 5, ⑤). The loss of much larger numbers of genes brought about complete loss of plastid genomes (Figure 5, ⑥), as observed in the paraphysomonad clade (Dorrell et al., 2019). The loss of the secA, secY, tufA, rpl, and rps genes in a species-specific fashion and contraction of the IR/SSC boundary region observed in our study provide examples of genome reduction "in action," providing evidence consistent with this model.

FIGURE 4 | Phylogenies of phototrophic, mixotrophic, and heterotrophic chrysophytes. (A) Plastid genome-based Bayesian tree of chrysophytes and other stramenopile taxa. The topology of this tree (inferred from an alignment of 40 proteins and 8,267 amino acids) is consistent with a single acquisition of photosynthetic ability from a red alga-derived secondary plastid in a chrysophyte ancestor. The numbers on each node represent posterior probabilities (left) and ultrafast bootstrap approximation (UFBoot) values calculated using IQ-Tree (right). Thick branches indicate fully supported nodes (PP = 1.00/ML = 100). This tree shows the phylogenetic relationships of chrysophytes and other stramenopiles based on a subset of taxa; for phylogenies inferred using a fully expanded dataset, refer to Supplementary Figures S1 and S3. (B) Nuclear SSU rDNA tree of chrysophytes showing the putative relationships of the phototrophic chrysophyte lineage in the context of non-photosynthetic lineages. Sequences from the strains whose plastid genomes were sequenced in this study are highlighted. Together, the plastid and nuclear gene tree topologies suggest parallel evolution of chrysophyte plastid genomes in response to shifts to heterotrophy. The numbers on each node represent posterior probabilities (left) and maximum-likelihood (ML) bootstrap support values calculated using RAxML (right). Support values (PP < 0.70/ML < 70) are shown on each node. Bold branches indicate fully supported nodes (PP = 1.00/ML = 100). The numbers in parentheses indicate the number of taxa in the species, genus, or order. Species names in quotation marks indicate uncertain taxonomic status. An expanded phylogeny based on a much larger dataset is provided in Supplementary Figure S2. The scale bars indicate the number of substitutions/site.

CONCLUSIONS
Analysis of three newly sequenced plastid genomes from Spumella-like flagellates has provided insight into the fine-scale dynamics of genome reduction in chrysophytes. Heterotrophic lineages appear to have evolved from photosynthetic and mixotrophic ones multiple times independently during chrysophyte evolution. Our results are consistent with previous suggestions that heterotrophic chrysophytes are polyphyletic and that their plastid genomes have undergone extreme reduction due to the loss of photosynthesis-related genes. The almost identical gene content and structure of the genomes of the Spumella-like flagellates analyzed herein suggest that non-photosynthetic chrysophytes have experienced parallel gene losses as they independently transitioned from phototrophy to heterotrophy.

AUTHOR CONTRIBUTIONS
JIK and WS conceived and designed the experiments. JIK and MJ performed the experiments and analyzed the data. JIK, JMA, and WS interpreted the data and wrote the manuscript. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
The authors thank Drs. Gangman Yi and Jaehee Jung for bioinformatic assistance and Dr. Richard Dorrell for sharing the SSU rDNA sequence of "Spumella" sp. NIES-1846. The two reviewers are also thanked for their helpful comments on an earlier version of the manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.572703/full#supplementary-material

ADDITIONAL FILE 1: SUPPLEMENTARY FIGURE S1 | Phylogenetic tree of chrysophyte plastid-encoded proteins and those of other photosynthetic stramenopiles. This tree was constructed using a dataset of 40 concatenated protein-coding genes (8,297 amino acids) selected with a main focus on the leucoplasts of non-photosynthetic chrysophyte Spumella-like flagellates. The numbers on each node represent posterior probabilities (left) and ultrafast bootstrap approximation (UFBoot) values calculated using IQ-Tree (right). The bold branch indicates strongly supported values (PP = 1.00/ML = 100). The scale bar indicates the number of substitutions/site.

ADDITIONAL FILE 2: SUPPLEMENTARY FIGURE S2 | The nuclear-encoded SSU rDNA tree of 177 chrysophytes and 3 outgroup taxa showing the putative relationships of the non-photosynthetic chrysophyte lineages relative to the photosynthetic ones. The NCBI accession numbers are provided with taxon names. The numbers on each node represent posterior probabilities using Bayesian analysis. The bold branch indicates strongly supported values (PP = 1.00). The scale bar indicates the number of substitutions/site.

ADDITIONAL FILE 3: SUPPLEMENTARY FIGURE S3 | Phylogenetic tree of chrysophyte plastids and those of other photosynthetic stramenopiles. This tree was constructed using a dataset of 40 concatenated protein-coding genes selected with a main focus on the leucoplasts of non-photosynthetic chrysophyte Spumella-like flagellates (8,297 amino acids). The tree was generated using the PMSF model (the LG+F+G tree as guide tree) and ultrafast bootstrap approximation (UFBoot) values calculated using IQ-Tree.