Phylogeny of Algal Sequences Encoding Carbohydrate Sulfotransferases, Formylglycine-Dependent Sulfatases, and Putative Sulfatase Modifying Factors

Many algae are rich sources of sulfated polysaccharides with biological activities. The physicochemical/rheological properties and biological activities of sulfated polysaccharides are affected by the pattern and number of sulfate moieties. Sulfation of carbohydrates is catalyzed by carbohydrate sulfotransferases (CHSTs), while modification of sulfate moieties on sulfated polysaccharides is presumably catalyzed by sulfatases, including formylglycine-dependent sulfatases (FGly-SULFs). Post-translational modification of Cys to FGly in FGly-SULFs by sulfatase modifying factors (SUMFs) is necessary for the activity of these enzymes. The aims of this study are to mine for sequences encoding algal CHSTs, FGly-SULFs, and putative SUMFs from the fully sequenced algal genomes and to infer their phylogenetic relationships to their well characterized counterparts from other organisms. Algal sequences encoding CHSTs, FGly-SULFs, SUMFs, and SUMF-like proteins were successfully identified from green and brown algae. However, red algal FGly-SULFs and SUMFs were not identified. In addition, a group of SUMF-like sequences with different gene structure and possibly different functions were identified for green, brown, and red algae. The phylogeny of these putative genes contributes to the corpus of knowledge of an unexplored area. The analyses of these putative genes contribute toward future production of existing and new sulfated carbohydrate polymers through enzymatic synthesis and metabolic engineering.


INTRODUCTION
Sulfates are found in algal proteins, carbohydrates, sulfolipids, and low molecular weight sulfated compounds (DeBoer, 1981). Many algae have been reported to be rich sources of sulfated polysaccharides with biological activities (Hernandez-Sebastia et al., 2008). Sulfated fucans from brown algae and sulfated galactans from green and red algae have been reported to be potent anticoagulant agents (Pomin and Mourão, 2008). Some of these algal sulfated polysaccharides, such as agar, agarose, and carrageenan, constitute the major component of the algal extracellular matrix or cell wall and have wide applications in the food, cosmetics, and pharmaceutical industries (McHugh, 2003).
Sulfur (normally in sulfate form) constitutes one of the nine essential macronutrients required by plants including algae (Yildiz et al., 1994). Sulfur assimilation in plants and algae begins with the activation of sulfate by ATP sulfurylase, which catalyzes the adenylation of sulfate to 5′-adenylylsulfate (APS). APS can either be phosphorylated by APS kinase or reduced by glutathione-dependent APS reductase. These two enzymes and their pathways are important for the cellular synthesis of sulfated and reduced sulfur compounds in algae, respectively (Gao et al., 2000). Sulfation is catalyzed by sulfotransferases (STs), which transfer a sulfuryl group (SO3) from 3′-phosphoadenosine 5′-phosphosulfate (PAPS) to a hydroxyl group of a substrate (Hernandez-Sebastia et al., 2008). In addition, activities of sulfatases, which are assumed to be involved in the modification of sulfate moieties on sulfated polysaccharides, have also been reported in various algae. The pattern and number of these substitutions affect not only the physicochemical/rheological properties of sulfated polysaccharides but also their biological activities (Opoku et al., 2006; Tuvikene et al., 2008).
Carbohydrate sulfotransferases (CHSTs) are of particular interest in algae because several genera of marine macroalgae synthesize sulfated polysaccharides that constitute the major component of their cell walls, where they chelate metallic ions and provide hydration to the cells. Mammalian CHSTs are among the best characterized CHSTs. Most of them are Golgi-localized and membrane-bound, and are involved in the biosynthesis of sulfated oligosaccharides and glycosaminoglycans (Fukuda et al., 2001). In addition, CHSTs such as NodH and NoeE, which are involved in the biosynthesis of nodulation factors, have also been characterized from the symbiotic rhizobacteria Sinorhizobium meliloti and Rhizobium sp. NGR234, respectively (Ehrhardt et al., 1995; Hanin et al., 1997). Characterization of algal candidate genes for CHSTs has not been reported.
The conserved catalytic site of FGly-SULFs consists of a divalent metal ion located within a pocket in which substrates are bound, a highly conserved motif at the N-terminus (the "sulfatase signature") that spans a 12-mer linear sequence with the core motif C/S-X-P-X-R, and a unique active aldehyde residue, α-formylglycine (FGly) (Hanson et al., 2004). FGly is formed post-translationally by the oxidation of a cysteine (Cys) residue that is conserved in all eukaryotic and most prokaryotic sulfatases (Schmidt et al., 1995; Dierks et al., 1998b). Some bacterial species possess a serine (Ser) residue instead of a Cys residue at the same position of the catalytic site, which gives rise to the distinction between "Cys-type" and "Ser-type" prokaryotic sulfatases. The structural similarity among FGly-SULFs suggests that they share a common ancestral gene (Meroni et al., 1996; Parenti et al., 1997).
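As an illustration of how the core motif C/S-X-P-X-R can be located in a protein sequence, the following minimal sketch scans a sequence with a regular expression and labels each hit as Cys-type or Ser-type according to its first residue. The function name and the toy peptide are hypothetical and are not taken from the datasets of this study; the sketch covers only the five-residue core, not the full 12-mer signature.

```python
import re

# Core sulfatase motif as given in the text: C/S-X-P-X-R (X = any residue).
# The lookahead allows overlapping hits to be reported.
CORE_MOTIF = re.compile(r"(?=([CS][A-Z]P[A-Z]R))")

def find_core_motifs(sequence):
    """Return (1-based position, pentamer, type) for every core-motif hit."""
    hits = []
    for match in CORE_MOTIF.finditer(sequence.upper()):
        pentamer = match.group(1)
        kind = "Cys-type" if pentamer[0] == "C" else "Ser-type"
        hits.append((match.start() + 1, pentamer, kind))
    return hits

# Hypothetical toy peptide (an assumption for illustration, not a real sulfatase).
toy_peptide = "MKLAVCTPSRAALLTGRSAPARG"
for position, pentamer, kind in find_core_motifs(toy_peptide):
    print(position, pentamer, kind)
```

A hit from such a scan is only a candidate; whether a sequence is a genuine FGly-SULF still depends on the surrounding signature and on the presence of the sulfatase domain.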
Post-translational modification of Cys to FGly occurs in the endoplasmic reticulum at a stage when the polypeptide is not yet folded into its native structure (Schirmer and Kolter, 1998). The enzyme that is involved in the post-translational modification of Cys to FGly in FGly-SULFs is known as sulfatase modifying factor (SUMF), while the enzyme AtsB is responsible for the post-translational modification of bacterial Ser-type sulfatases (Dierks et al., 1998b; Schirmer and Kolter, 1998). SUMF1 was found to be responsible for multiple sulfatase deficiency in humans. SUMFs belong to a gene family that is highly conserved during evolution from bacteria to human (Dierks et al., 1997, 1998a; Landgrebe et al., 2003).
Despite the importance of these sulfated polysaccharides, the roles of CHSTs in their formation, and the roles of sulfatases in their modification, little is known about the sequences and structures of algal CHSTs, FGly-SULFs, and their SUMFs. In recent years, a few algal genomes have been fully sequenced (Armbrust et al., 2004; Merchant et al., 2007; Bowler et al., 2008; Cock et al., 2010; Bhattacharya et al., 2013; Collén et al., 2013) and can be used to survey for algal candidate genes encoding FGly-SULFs and SUMFs. The aims of this study are to mine for sequences from the fully sequenced algal genomes and to infer their phylogenetic relationships to known CHSTs, FGly-SULFs, and SUMFs from other organisms.

Generation of Sulfatase Signature Logos
Logo analyses for FGly-SULF sequences were performed at the Berkeley Structural Genomics Center (http://weblogo.berkeley.edu/) to visualize the information content associated with each position of a given motif shared by related sequences. In the graphical representation, the conservation at each position (expressed in bits) is represented by the overall height of each position, whereas the relative frequencies of the symbols within a position are indicated by the relative sizes of the symbols.
The reported values were computed as the ratio between the information content of the given position and the information content of varying positions within the motif.
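The quantities shown in a sequence logo can be sketched as follows, assuming the standard Shannon-entropy definition for a 20-letter amino acid alphabet: the information content of a column is log2(20) minus the column entropy, and each symbol is drawn with a height equal to its frequency times that information content. This is an illustrative calculation rather than the exact procedure of the logo server, which may additionally apply a small-sample correction that is omitted here; the function names and the four example motifs are hypothetical.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, without small-sample correction."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy

def logo_heights(motifs):
    """For each aligned position, return {residue: symbol height in bits}."""
    heights = []
    for column in zip(*motifs):
        info = column_information(column)
        counts = Counter(column)
        heights.append({res: info * n / len(column) for res, n in counts.items()})
    return heights

# Hypothetical aligned pentamers, used only to exercise the functions.
example_motifs = ["CTPSR", "CTPAR", "CSPSR", "CCPSR"]
for position, symbols in enumerate(logo_heights(example_motifs), start=1):
    print(position, {res: round(h, 2) for res, h in symbols.items()})
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, while a column with mixed residues is shorter and is split among its symbols in proportion to their frequencies.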

RESULTS AND DISCUSSION
In total, 83, 41, and 14 algal sequences encoding CHSTs, FGly-SULFs, and SUMFs were retrieved, respectively (Tables 1-3). Human CHSTs, FGly-SULFs, and SUMFs were used for the mining and for the phylogenetic analyses in this study, mainly because the human sequences are the best characterized in terms of sequence and function compared to those from other organisms.

Algal CHST Sequences
Human CHSTs can be divided into two groups based on the presence of two conserved domains for Superfamily Sulfotransferase 1 and 2, respectively (Figures 1, 2). All human CHSTs classified in the Superfamily Sulfotransferase 1 (CHSTs 1-7, mainly for Gal/N-acetylglucosamine/N-acetylglucosamine 6-O-STs; glucosamine N-deacetylase/N-ST or heparin sulfate STs, NDSTs; (heparan sulfate)-glucosamine 3-O STs, HS3S1, 2, 5, 6, A and B) were found to contain pfam 00685 for the Sulfotransfer_1 domain, while most of those in the Superfamily Sulfotransferase 2 have pfam 03567 for the Sulfotransfer_2 domain (CHSTs 8-15, for N-acetylgalactosamine 4-O STs and N-acetylgalactosamine 4-sulfate 6-O STs; heparan sulfate 6-O-STs, HS6ST 1-2; heparan sulfate 2-O ST, HS2ST and uronyl-2-O ST, UST), except for a few CHSTs such as the galactose-3-O STs (G3ST1-4) that have pfam 06990 for the Gal-3-O-Sulfotr domain. The findings on human CHSTs concur with the information published in the Interpro abstract for IPR005331 (www.ebi.ac.uk/interpro/) that the Sulfotransfer_2 domain (pfam 03567) is present in a number of CHSTs that transfer sulfate to positions 3 (CHST 10), 4 (CHSTs 8, 9, 11 and 13; dermatan-4 ST, D4ST) and 6 (HS2ST, HS6ST, chondroitin-6 ST) of carbohydrate groups in glycoproteins and glycolipids. According to the Interpro abstract for IPR000863, the Sulfotransfer_1 domain is found in flavonyl-3-STs, aryl STs, alcohol STs, and phenol STs. However, we found that many human CHSTs also contain this domain. The algal CHSTs (Table 1) were found to have either one of the pfams mentioned above or no putative domain. All the green algal CHSTs were found to have pfam 00685 only, while either pfam 00685 or pfam 03567 was found in the brown and red algal CHSTs. Only one red algal CHST from C. crispus (CHOCR_R7Q8D2) was found to have pfam 06990 (Figure 2, Table 1). The algal CHST sequences are generally very diverse.
The use of phylogeny in assigning functions based on substrate specificity or pattern of sulfation requires further verification. Most of these algal CHSTs were clustered according to green, brown, or red algae, or even genera, except for a few clusters containing green, red, and brown algal CHSTs (Figure 1). For example, CHLRE 455231 was clustered among a group of brown algal CHSTs; THAPS B8C3T6 was in a group of green algal CHSTs; CHLRE A8IZD0 and CHOCR R7QLM0 were clustered with brown algal CHSTs; and CHOCR R7Q533 and PHATR B7FXZ9 were clustered with green algal CHSTs. These CHSTs could share similar functions in algae of different genera/species.
It is likely that the genes encoding FGly-SULF are absent from the genomes of red algae or at least in the red algal species examined. Since sulfate is not a limiting factor for marine algae that grow in seawater which has a high sulfate concentration (25-28 mM) compared to freshwater or land (10-50 µM) (Friedlander, 2001;Bochenek et al., 2013), it is possible that recycling of sulfate through FGly-SULFs may not be required. Furthermore, the biosynthesis of sulfated polysaccharides was proposed to be a possible result of physiological adaptation of macroalgae, marine angiosperms, and seagrasses (but not terrestrial plants) to marine environments (Aquino et al., 2005).
It is also possible that the red algal sulfatases belong to sulfatases other than the FGly-SULF type. Currently, three groups of sulfatases have been described: Group 1, which consists of the FGly-SULFs (Boltes et al., 2001); Group 2, with the Fe(II) α-ketoglutarate-dependent sulfatases (Müller et al., 2004); and Group 3, which consists of the zinc-dependent metallo β-lactamase superfamily or alkylsulfatases (Hagelueken et al., 2006). In addition, sulfatases (arylsulfatases), together with alkaline phosphatases and phosphoglycerate mutases, were shown by homology searches and alignment-assisted mutagenesis to belong to a superfamily of phospho-/sulfo-coordinating metalloenzymes that share the catalytic core of nucleotide pyrophosphatases/phosphodiesterases (Gijsbers et al., 2001). The sulfatase genes may also have diverged to an extent that they cannot be readily identified using bioinformatic search tools. The red algal sulfatases could have novel sequences, as reported for 12 sequences encoding putative D-galactose-2,6-sulfurylases I and II revealed by the genome analyses of C. crispus (Collén et al., 2013). The galactose-2,6-sulfurylases I from C. crispus, which share some identity with L-amino acid oxidase from C. reinhardtii (U78797), have no similarity to any reported sulfatases. Evidence on the enzyme activity of their recombinant proteins is crucial to show that they are indeed novel red algal sulfatases.
FIGURE 1 | The evolutionary history was inferred using the Neighbor-Joining method. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. Sequences from green, brown, and red algae are shown by respective colors. The identifier of the sequence starts with the species abbreviation followed by the UNIPROT/Genbank accession number and annotation wherever possible (Table 1). HUMAN, Homo sapiens; CHSTs 1-7, for Gal/N-acetylglucosamine/N-acetylglucosamine 6-O-STs; NDST, glucosamine N-deacetylase/N-ST or heparin sulfate STs; and HS3S1, 2, 5, 6, A and B, (heparan sulfate)-glucosamine 3-O STs.
Sulfatase-like activities have also been reported previously in higher plants (Baum and Dodgson, 1957; Poux, 1966), although sequences encoding these enzymes have not been reported. Searching the complete plant genomes at Phytozome revealed only one FGly-SULF-like sequence, from Ricinus communis, which contains a CSATR motif that resembles the sulfatase signature. However, this sequence was incomplete, short, and without introns. Further analyses revealed that similar sequences (orthologs) were absent in other plant species; the sequence was therefore believed to be a contaminant from associated bacterial species.
Figure 3 shows the phylogeny of FGly-SULFs from human, yeasts, worm, fruitfly, and algae, which has two main branches. The well characterized human FGly-SULFs were divided between the two main branches, with sulfatases SULF 1 and SULF 2 and glucosamine N-acetyl-6-sulfatase (GNS) in one branch, while the remaining human FGly-SULFs (arylsulfatases, ARS A, B, C, D, E, F, G, H, I, J, K; N-galactosamine-6-sulfatase, GALNS; iduronate 2-sulfatase, IDS; and N-sulfoglucosamine sulfohydrolase, SGSH) are distributed in the other branch. The clustering of human FGly-SULFs may reflect their functions or substrate preference in general. The only two FGly-SULFs from worm were distributed one in each branch, with SUL 1 in the same cluster as the human SULF 1 and SULF 2. The D. melanogaster SULF1, GNS, IDS, and SGSH were grouped with their orthologs from human, while another four uncharacterized FGly-SULFs formed a separate cluster that is unique to D. melanogaster. The FGly-SULF sequences from yeasts were clustered in the same branch, except for that of the ascomycete Neurospora crassa, which was found to be in a separate branch. The FGly-SULFs from Saccharomycetes and Schizosaccharomycetes may have evolved after the divergence from Ascomycetes.
FIGURE 2 | Phylogenetic relationship of algal CHSTs in Superfamily Sulfotransfer_2 (with pfam domains 03567 and 06990). The evolutionary history was inferred using the Neighbor-Joining method. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. Sequences from brown and red algae are shown by respective colors. The identifier of the sequence starts with the species abbreviation followed by the UNIPROT/Genbank accession number and annotation wherever possible (Table 1).
All the green algal FGly-SULFs (Ch. reinhardtii and V. carteri) were distributed in the same branch as human SULF 1, SULF 2, and GNS, while all the brown algal FGly-SULFs were divided into subclusters in the other branch (Figure 3), implying that FGly-SULFs from these two groups of algae could have evolved from different origins, or from the same origin that diversified before the speciation of brown and green algae. The green algal subcluster (Subcluster 5) consists of sequences from both Ch. reinhardtii and V. carteri, implying that these sequences may have originated from the same ancestral FGly-SULF, which could have existed before speciation. The brown algal sequences were divided into four subclusters: Subcluster 1, which consists of three sequences from Ph. tricornutum, each with an additional C-terminal region; Subcluster 2, with nine sequences from E. siliculosus; Subcluster 3, which consists of three sequences from T. pseudonana with a gap in pfam 00884 and is closely related to yeast FGly-SULFs (except for the sequence from N. crassa); and Subcluster 4, which consists of three sequences from T. pseudonana and a sequence from Ph. tricornutum. Subcluster 5 contains mainly green algal FGly-SULFs. The existence of highly identical sequences within each species suggests duplication of FGly-SULFs after speciation (Figure 3). All algal sequences analyzed contain the sulfatase domain (pfam 00884), with a few of them bearing a gap within this domain, mainly those from diatoms (three from T. pseudonana in Subcluster 3 and one from Ph. tricornutum in Subcluster 4). However, the presence of a gap within these sequences has little effect on their phylogeny compared to their similarity within the same species. In addition, sequences from Ph. tricornutum in Subcluster 1 contain an additional C-terminal region.
FIGURE 3 | Phylogenetic relationship of algal FGly-SULFs. The evolutionary history was inferred using the Neighbor-Joining method. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) are shown next to the branches. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. Sequences from green and brown algae are shown by respective colors. The identifier of the sequence starts with the species abbreviation followed by the UNIPROT/Genbank accession number and annotation wherever possible.
Comparison of the amino acids at the active sites of algal sulfatases showed that the green and brown algae share only two conserved residues (C-X-X-X-R) at the same positions (Figure 4A), which is less conserved than the core motif for human FGly-SULFs (C-X-P-S-R). Both Ch. reinhardtii and V. carteri share the same core motif, C-C-P-(S/A)-R (Figures 4B,C), while the brown algae have a more diverse core motif (C-X-X-X-R; Figures 4D-F) in which only the first C residue and the last R residue are conserved. Within the FGly-SULFs from each brown algal species, the core motif C-T-P-(A/S)-R is conserved among those from E. siliculosus (Figure 4F), while C-(S/W)-(P/I)-(T/S)-R and C-(C/W)-(P/V/I)-S-R were shared by those in Ph. tricornutum and T. pseudonana, respectively (Figures 4D,E). The A residue immediately after the core motif is highly conserved in brown algae (Figures 4D-F).
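The way the shared motif C-X-X-X-R follows from the species-specific motifs can be checked with a short sketch. It encodes the four core motifs exactly as reported above, one string of allowed residues per position, and collapses them position by position, writing X wherever more than one residue occurs. The dictionary keys and the function name are illustrative choices, not identifiers from the study.

```python
# Core motifs as reported in the text; each position lists the residues observed there.
CORE_MOTIFS = {
    "green algae (Ch. reinhardtii, V. carteri)": ["C", "C", "P", "SA", "R"],
    "E. siliculosus": ["C", "T", "P", "AS", "R"],
    "Ph. tricornutum": ["C", "SW", "PI", "TS", "R"],
    "T. pseudonana": ["C", "CW", "PVI", "S", "R"],
}

def consensus(motifs):
    """Collapse motifs position by position; 'X' marks any position that varies."""
    result = []
    for options in zip(*motifs):
        residues = set("".join(options))
        result.append(residues.pop() if len(residues) == 1 else "X")
    return "-".join(result)

# Motif shared by all four groups, and by the three brown algal species only.
shared = consensus(CORE_MOTIFS.values())
brown = consensus(
    [motif for name, motif in CORE_MOTIFS.items() if not name.startswith("green")]
)
print(shared, brown)
```

Both calls reduce to C-X-X-X-R, in agreement with the observation that only the first C and the last R are conserved across the green and brown algal sequences.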

Algal SUMF and SUMF-like Sequences
Since the sulfatase signature was identified in all algal FGly-SULF sequences (Figure 4), sequences that encode SUMFs, which modify the C residue to FGly in the active site of FGly-SULFs, were searched among the algal genomes. Table 3 shows that SUMF sequences that were highly similar to those of eukaryotic SUMFs were only retrieved from brown algae (Ph. tricornutum, T. pseudonana, and E. siliculosus) and a green microalga (plankton), Ostreococcus tauri. In addition to the SUMF sequences, SUMF-like sequences that are highly similar to the coding sequence of Meiotically Upregulated Gene (MUG) 158 (also known as Egt1) from a yeast, Sch. pombe, were retrieved from green (Auxenochlorella protothecoides), brown (T. pseudonana and E. siliculosus), and red algae (C. crispus, Po. cruentum, Cy. merolae, and Ga. sulphuraria), as well as a moss (Bryophyte), Physcomitrella patens (Pp1s94_113V6, abbreviated as PHYPA_113V6), which represents the missing link between green algae and higher land plants (Figure 5).
FIGURE 5 | The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) are shown next to the branches. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The identifier of the sequence starts with the species abbreviation followed by the UNIPROT/Genbank accession number and annotation wherever possible (Table 3). HUMAN, Homo sapiens; BOVINE, Bos taurus; MOUSE, Mus musculus; PHYPA, Physcomitrella patens; SCHPO, Schizosaccharomyces pombe. Sequences from green, brown, and red algae are shown by respective colors. The structure and pfam domains of each group are shown on the right panel.
The SUMF-like sequences were longer than the SUMF sequences. These two groups of sequences are similar only at the formylglycine-generating enzyme (FGE)-sulfatase domain (pfam 03781). MUG 158 from Sch. pombe has an S-adenosyl-L-methionine (SAM)-dependent methyltransferase domain (pfam 10017; including DUF 2260, a domain with unknown function) and an uncharacterized DinB_2 domain (pfam 12867; including an iron-binding motif, H-X(3)-H-X-E), in addition to the FGE-sulfatase domain (Pluskal et al., 2014). This protein was reported to be involved in cell division, and its expression was up-regulated upon the entry of the cell into meiosis (Mata et al., 2002). Highly similar to the sequences of NcEgt1 from N. crassa and MsEgtD from Mycobacterium smegmatis, MUG 158 was also reported to be involved in the first step of ergothioneine biosynthesis (Pluskal et al., 2014). Ergothioneine, an amino acid that is a thiourea derivative of histidine, was reported to accumulate in oxidative-stress susceptible areas of the human body (Cheah and Halliwell, 2012) and is therefore believed to be able to scavenge oxidizing species that are not free radicals (Chaudière and Ferrari-Iliou, 1999). However, ergothioneine is only synthesized by a few filamentous fungi, actinobacteria, and cyanobacteria, but not by higher plants and animals. The red alga Po. purpureum SAG1380-1C was reported to produce a small amount of ergothioneine (Saha et al., 2015). It is unknown whether the algal SUMF-like sequences share the same function as SUMF sequences, or have other functions in ergothioneine biosynthesis or meiosis as in MUG 158. Alternatively, these sequences may possess both functions. At least three brown algae were found to have both SUMF and SUMF-like sequences, indicating that both types of sequences could have different functions.
FIGURE 6 | Multiple sequence alignment of algal SUMF-like sequences. The amino acid sequences were aligned by ClustalW. Identical and similar residues were highlighted in black and gray, respectively. The pfam domains 10017 (Histidine-specific SAM-dependent methyltransferase), 12867 (DinB domain), and 03781 (FGE-sulfatase) are underlined with green (dotted line), blue (broken line), and red, respectively. The DinB_2 iron-binding motif is indicated by a blue box, while the red box shows the EgtB subfamily C-terminal sequences. The identifier of the sequence starts with the species abbreviation followed by the UNIPROT/Genbank accession number and annotation wherever possible (Table 2). PHYPA, Physcomitrella patens; SCHPO, Schizosaccharomyces pombe.
The phylogeny of SUMF and SUMF-like sequences (Figure 5) shows two main clusters consisting of SUMF sequences and SUMF-like sequences, respectively, which may share the same ancestor. The SUMF cluster consists of human SUMF1-3, bovine SUMF 1-2, and mouse SUMF 1-2, together with four SUMFs from three brown algae (one each from Ph. tricornutum and T. pseudonana; two from E. siliculosus) and one from the green microalga O. tauri. Each of the SUMF sequences in this cluster contains an FGE-sulfatase domain, except for one of the SUMF sequences from E. siliculosus (ECTSI_D8LGF4), which has an incomplete domain, while the SUMF sequence from O. tauri has an additional but incomplete glycosyltransferase domain (pfam 00534). The SUMF-like cluster consists of MUG 158 from Sch. pombe and SUMF-like sequences from A. protothecoides, T. pseudonana, E. siliculosus, C. crispus, Po. cruentum, Cy. merolae, Ga. sulphuraria, and Phy. patens. The domains found in the SUMF-like sequences are more variable. The red algal SUMF-like sequences were found to contain the DinB_2 domain (pfam 12867) at their N-termini, in addition to the FGE-sulfatase domain (Figures 5, 6). The SUMF-like sequences from the green lineage (moss and green alga), similar to MUG 158, were found to have two additional domains, i.e., pfam 10017 (S-adenosyl-L-methionine (SAM)-dependent methyltransferase domain) and pfam 12867, at the N-terminus of the FGE-sulfatase domain, while the brown algal SUMF-like sequences have pfam 12867 and pfam 10017 at the N- and C-termini of the FGE-sulfatase domain, respectively (Figure 6). One of the sequences from E. siliculosus (ECTSI_D8LGF5), which has an incomplete FGE-sulfatase domain, could not be assigned to either group of sequences.
It is intriguing that SUMF sequences were not found in the genomes of the green algae Ch. reinhardtii and V. carteri, both of which have FGly-SULF sequences, and equally intriguing that a SUMF sequence was found in O. tauri, in which no FGly-SULF sequence was detected. Similarly, SUMF or SUMF-like sequences were not reported in Saccharomyces cerevisiae and a few other yeasts that were shown to have FGly-SULFs. Could there be other sequences that are able to modify FGly-SULFs in Ch. reinhardtii and V. carteri? Alternatively, modification of Cys to FGly may not be necessary for these green algal FGly-SULFs. It is clear that a group of SUMF-like sequences is present in green, brown, and red algae as well as in moss, yeasts (at least Saccharomycetes and Ascomycetes), and the bacterium Mycobacterium; however, their functions are uncharacterized.
In general, the phylogeny of algal CHSTs, FGly-SULFs, and SUMFs or SUMF-like sequences revealed that many protein sequences were clustered according to their groups i.e., green (for CHSTs with pfam Sulfotransfer_1 domain), brown (for CHSTs, FGly-SULFs, and SUMFs or SUMF-like sequences), and red (for CHSTs with pfam Sulfotransfer_2 domain, FGly-SULFs and SUMFs or SUMF-like sequences) algae. Duplication/multiplication and functional divergence of these sequences could have happened after the divergence of these three groups of algae during evolution. Since only two green algal SUMFs or SUMF-like sequences were retrieved, the same trend was not observed. The clustering of a few CHSTs with pfam Sulfotransfer_1 domain from different groups of algae implied the existence of an ancestral sequence before the separation of these algal groups. The phylogenetic analyses of these putative genes contribute to the corpus of knowledge of an unexplored area. Algal CHSTs, FGly-SULFs, and SUMFs constitute a highly attractive target for future research to produce existing and new sulfated carbohydrate polymers through enzymatic synthesis and metabolic engineering.