Crossroads between Bacterial and Mammalian Glycosyltransferases

Bacterial glycosyltransferases (GT) often synthesize the same glycan linkages as mammalian GT; yet, they usually have very little sequence identity. Nevertheless, enzymatic properties, folding, substrate specificities, and catalytic mechanisms of these enzyme proteins may have significant similarity. Thus, bacterial GT can be utilized for the enzymatic synthesis of both bacterial and mammalian types of complex glycan structures. A comparison is made here between mammalian and bacterial enzymes that synthesize epitopes found in mammalian glycoproteins, and those found in the O antigens of Gram-negative bacteria. These epitopes include Thomsen–Friedenreich (TF or T) antigen, blood group O, A, and B, type 1 and 2 chains, Lewis antigens, sialylated and fucosylated structures, and polysialic acids. Many different approaches can be taken to investigate the substrate binding and catalytic mechanisms of GT, including crystal structure analyses, mutations, comparison of amino acid sequences, NMR, and mass spectrometry. Knowledge of the protein structures and functions helps to design GT for specific glycan synthesis and to develop inhibitors. The goals are to develop new strategies to reduce bacterial virulence and to synthesize vaccines and other biologically active glycan structures.


INTRODUCTION
Glycans play important roles in most biological processes in health and disease. Bacteria and human beings have a close relationship in the intestine, which can be symbiotic or pathogenic. Bacteria often produce human-like glycan structures with bacteria-specific glycosyltransferases (GTs) that have given them a selective advantage for adhesion, colonization, and survival. Knowledge of these enzymes can help us understand human counterparts of GTs, and provide a convenient technology to synthesize both bacterial and human glycans. Bacterial GTs can be easily expressed and stored; they are more soluble and often remarkably active and stable.
Currently, GTs from many different organisms have been classified into 96 GT families in the Carbohydrate-active Enzymes (CAZy) classification system (http://www.cazy.org), based on their sequence similarity derived from GenBank (ftp://ftp.ncbi.nih.gov/genbank/ or EMBL or DDBJ) (1,2). Very few of the bacterial GTs have been biochemically and functionally characterized; thus, proposed enzymes are assigned based on similarity searches. The CAZy database also contains genetic, structural, mechanistic, and functional information on known GTs. The former Enzyme Commission (EC) nomenclature for GTs as well as the currently accepted nomenclature and alternative names for GTs are included. A number of databases provide sequence analyses of GTs (e.g., NCBI BLAST, PFAM, INTERPRO, DBCAN, Swiss-Prot-ExPASy).
For searches of glycan structures, a number of databases are useful (3). For example, GLYCOSuiteDB contains information on N- and O-linked glycans and glycoproteins, and Glycobase on N- and O-glycan structures. For glycomics analyses by mass spectrometry (MS), GlycoMaster DB at http://www-novo.cs.uwaterloo.ca:8080/GlycoMasterDB is helpful (4). The current E. coli O-antigen database (ECODAB) contains known O antigen structures of E. coli and the analytic data available, and has links to genes involved in O antigen synthesis from the O antigen gene cluster (5). Many of the E. coli antigens can be found in other bacterial strains. Finally, the Consortium for Functional Glycomics (http://www.functionalglycomics.org/) provides a large database for glycan functions.
Because of the widespread development of antibiotic resistance, we need new anti-bacterial strategies, and bacterial GTs are virulence factors that could be targeted. The understanding of GTs can help in the production of vaccines to protect against bacterial infections and cancer, and for application in inflammation and autoimmune disease. In this review, we will compare mammalian and bacterial GTs that show remarkable similarity of action, protein folding, or mechanisms, in spite of surprisingly large differences in amino acid sequences.

MAMMALIAN GLYCOPROTEINS AND BACTERIAL GLYCANS
Mammalian glycoproteins are involved in virtually all cellular activities; they serve as ligands for antibodies or lectins, or as receptors involved in signaling, cellular interactions, cell growth, differentiation, and cell death (6-11). Glycans are important in the inflammatory response, the innate and adaptive immune system, and cancer metastasis, as well as microbial colonization and infections. Glycoproteins have many functional epitopes attached to either N-glycans or O-glycans, and the amounts of many of these epitopes can be altered in disease, for example, in cancer. Although there is remarkable diversity in glycan structures in mammals, and hundreds of different chains can be found in glycoproteins, only six sugar residues (Man, GlcNAc, GalNAc, Gal, sialic acid, Fuc) form the extended and branched varieties of glycans, with few modifications such as O-acetylation and sulfation. N- and O-glycans can affect the chemical and physical properties and the conformations of proteins and the accessibility of peptide epitopes.
Bacteria display an astounding variety of unusual sugars and sugar linkages as well as modifications of sugars that are foreign to human beings and, therefore, can trigger immune responses. However, a number of specific bacterial glycans are mimics of mammalian glycoprotein epitopes (Table 1). Partial structures of O-antigenic polysaccharides of Gram-negative bacteria (ECODAB) often mimic human glycans and may help bacteria to evade the immune system and promote colonization. The mimicry may prevent the production of effective vaccines to protect against bacterial infections, which requires new considerations of antibacterial strategies. About half of the E. coli strains have some form of mammalian epitope within their O antigens. This includes Galβ1-3GlcNAcβ- and Galβ1-4GlcNAcβ-linkages, which are part of the glycan backbone structures (type 1 and type 2, respectively) in mammalian glycoproteins. In bacteria, those are internal structures within the O antigen repeating unit. The cancer-associated Thomsen-Friedenreich antigen (TF or T antigen, O-glycan core 1) is common in glycoproteins and also found in several O antigens of E. coli. Blood group O, A, and B, sialylated glycans, and polysialic acid are mimics found in a number of bacterial strains. The fact that bacteria are able to synthesize these human-like structures suggests that they have the appropriate biosynthetic enzymes (Table 2), although this would be difficult to anticipate from the inspection of the amino acid sequences of their GTs. Biochemical characterization of bacterial enzymes and structure/function studies are important prerequisites to utilize these enzymes in chemoenzymatic synthesis of mammalian glycoprotein epitope structures.

ROLE OF O ANTIGENS
The lipopolysaccharides (LPS) of Gram-negative bacteria are essential structures of the outer membrane. LPS binds to the LPS-binding protein, requiring the CD14/TLR4/MD2 receptor complex, which elicits a strong response during infections through TLR4 signaling. LPS consist of a lipid A base (endotoxin), which carries a relatively invariable inner oligosaccharide core, strain-specific outer core oligosaccharides, and the serotype-specific outer O antigen polysaccharide. O antigens are polysaccharides composed of up to 50 repeating units of oligosaccharides with one (homopolymeric) to 10 sugars (heteropolymeric) that play a role in bacterial adhesion and colonization, affect pathogenicity and survival, and can be bacteriophage receptors. The enormous structural variability of O antigens is mediated by many specific GTs and other enzymes that modify O antigens, thus increasing structural diversity, e.g., by adding phosphate, acetyl groups, or branching sugar residues. The LPS molecules are necessary for stabilization of the outer membrane and form a barrier against penetration of toxins. In particular, the O antigens serve to evade complement; they protect against phagocytosis and give the bacteria a strain-specific and diversity-selective advantage. The molecular mimicry found in a large proportion of bacteria adds to their ability to prevent recognition by the host immune system and thus promotes virulence. A number of bacteria do not have an extended O-antigenic polysaccharide but instead have a short lipooligosaccharide that may have structural identity with human glycoproteins or glycolipids and may lead to pathological conditions. The close relationship between bacteria and human beings is also apparent in the abundance of bacterial lectins that bind to mammalian glycoproteins and thus promote adhesion to mammalian tissues.

N-GLYCOSYLATION OF MAMMALIAN AND BACTERIAL GLYCOPROTEINS
In eukaryotic cells, N-glycans are assembled first on a dolichol-phosphate (P-Dol) intermediate on the cytoplasmic side of the endoplasmic reticulum (ER) membrane (7,12) (Figure 1). GlcNAc-phosphate is transferred to P-Dol by GlcNAc-phosphotransferase in a reversible reaction that is inhibited by tunicamycin, followed by the transfer of another GlcNAc residue to form chitobiose. This is followed by five Man residues, all transferred from nucleotide sugar donor substrates, to form the common N-glycan core structure Man5-chitobiose linked to PP-Dol. This heptasaccharide is flipped across the membrane, and further addition of sugars occurs in the ER lumen through transfer from Man-P-Dol and Glc-P-Dol donor substrates. After completion of the lipid-linked N-glycan, it is transferred en bloc to the Asn residue(s) of Asn-X-Ser/Thr sequons in a glycoprotein by the oligosaccharyltransferase complex (OST), and Glc and Man residues are selectively cleaved by glycosidases. After transfer to the Golgi, further removal of Man residues occurs, and GlcNAc-transferase I (GnT I, MGAT1) adds the first of the N-glycan antennae in β1-2 linkage to the Manα1-3 arm of the core. This can then be followed by several steps that depend on the presence of this first GlcNAc antenna and the expression of processing enzymes, which remove two Man residues from the Manα1-6 arm, add Fucα1-6 to the inner chitobiose GlcNAc, and add further antennae to form complex type N-glycans. N-glycans can be extended by repeating Gal-GlcNAc residues to form type 1 or type 2 chains (Table 1); they may be branched by GlcNAcβ1-6 linkages and may be decorated with specific functional epitopes and blood group determinants (7). This results in hundreds of different N-glycan structures, depending on the glycosylation potential of the cell.
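The assembly and trimming steps of the lipid-linked N-glycan can be followed as simple bookkeeping of residue counts. The Python sketch below is only an illustration: the step list is a simplification built from the compositions named in the text and in the legend of Figure 1 (Man5GlcNAc2, Glc3Man9GlcNAc2, Man8GlcNAc2), the number of Man and Glc residues added in the ER lumen is inferred from those compositions, and the names STEPS, formula, and trace are hypothetical.

```python
from collections import Counter

# (description, sugar, change in residue count). The counts are inferred from
# the compositions named in the text; this is not a full pathway model.
STEPS = [
    ("GlcNAc-phosphate transferred to P-Dol", "GlcNAc", +1),
    ("second GlcNAc added (chitobiose)", "GlcNAc", +1),
    ("five Man added from nucleotide sugars", "Man", +5),
    ("four Man added from Man-P-Dol in the ER lumen", "Man", +4),
    ("three Glc added from Glc-P-Dol in the ER lumen", "Glc", +3),
    ("Glc residues trimmed after transfer by OST", "Glc", -3),
    ("one Man residue trimmed in the ER", "Man", -1),
]


def formula(composition):
    """Write a composition such as Glc3Man9GlcNAc2, skipping absent sugars."""
    order = ["Glc", "Man", "GlcNAc"]
    return "".join(f"{s}{composition[s]}" for s in order if composition[s] > 0)


def trace(steps):
    """Apply each step in order and record the resulting composition."""
    composition = Counter()
    history = []
    for label, sugar, delta in steps:
        composition[sugar] += delta
        if composition[sugar] < 0:
            raise ValueError(f"step '{label}' removes more {sugar} than present")
        history.append((label, formula(composition)))
    return history


if __name__ == "__main__":
    for label, glycan in trace(STEPS):
        print(f"{glycan:18s} after: {label}")
```

The trace passes through Man5GlcNAc2 before flipping, Glc3Man9GlcNAc2 before transfer to protein, and Man8GlcNAc2 before export to the Golgi, matching the intermediates named in the figure legend.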

FIGURE 1 | Biosynthesis of N-glycosylated glycoproteins in eukaryotes.
N-glycosylation is initiated at the endoplasmic reticulum (ER) membrane using nucleotide sugar donor substrates and a membrane-bound acceptor phospholipid with multiple isoprenyl units (dolichol-phosphate, P-Dol). The first sugar (GlcNAc) is transferred as GlcNAc-phosphate from UDP-GlcNAc by GlcNAc-P-transferase, resulting in GlcNAc-diphosphate-dolichol (GlcNAc-PP-Dol). This step can be inhibited by the UDP-GlcNAc analog tunicamycin. On the outside face of the ER membrane, another GlcNAc is added to form chitobiose, followed by five Man residues to form a heptasaccharide (Man5GlcNAc2)-PP-Dol. This heptasaccharide is flipped to the inside of the ER where the chain grows by transfer of sugars from membrane-bound Man-P-Dol and Glc-P-Dol. The completed saccharide Glc3Man9GlcNAc2 is then transferred by an oligosaccharyltransferase complex (OST) to the Asn residue in an Asn-x-Ser/Thr sequon of nascent proteins. After trimming of sugar residues in the ER by removal of Glc and Man residues to the Man8GlcNAc2 structure, glycoproteins are exported to the Golgi where further trimming occurs by mannosidases. Many N-glycan chains are processed to the complex type by the addition of GlcNAc residues by GlcNAc-transferases I to V (MGAT1 to 5). Chains grow further by the addition of Gal-GlcNAc sequences and termination by sialyl-, Fuc-, Gal-, GlcNAc-, and GalNAc-transferases, which are all highly specific for both the donor and the acceptor substrates and with few exceptions form only one type of linkage between sugars. This creates hundreds of different structures and epitopes with many possible functions, depending on the final destination of the glycoprotein, e.g., in the cell membrane or in secretions.
Glycoprotein biosynthesis is regulated at many different levels, e.g., by the synthesis and delivery of nucleotide sugar substrates, the expression, activities and localization of glycosyltransferases and trimming hydrolases, the competition of enzymes for common substrates, levels of metal ion activating factors, localization of enzymes involved, and rate of transport of glycoproteins.
Not all N-glycosylation sites carry N-glycans, and there are differences in chain processing between different glycosylation sites of the same protein. The peptide has been shown to interact with the glycan chains, and this controls the conformations of the glycan and the peptide and leads to site-specific glycosylation. Many sequentially acting and competing GTs assemble glycoproteins in a cell type-specific pattern. Most of the GTs involved exist as families of enzymes (Table 3). Several of these have been shown to be localized in specific Golgi compartments according to their action within the complex pathways.
In the mammalian biosynthetic pathways, the sequence of sugar additions is controlled by gene expression, the relative activities of competing enzymes, the enzyme localizations, levels of substrates and cofactors, and the distinct substrate specificities of GTs. These types of controls still need to be investigated for glycosylation reactions in bacteria.
Bacteria such as Campylobacter jejuni (Cj) also have N-glycosylated proteins (13). An oligosaccharide is first assembled on undecaprenol-phosphate (P-Und), an analog of P-Dol, in the cytoplasmic compartment. The sugar-PP-Und is then flipped to the periplasmic space where the glycan chain is transferred en bloc to protein by oligosaccharyltransferases. These GTs have a broad specificity toward their donor substrates but also require a sequon in the protein acceptor, Asp/Glu-x-Asn-y-Ser/Thr, where x and y cannot be Pro, that bears close resemblance to the mammalian N-glycosylation sequon (Figure 1).
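The two sequons can be compared directly by scanning a protein sequence for both patterns. The following Python sketch is a minimal illustration, not a glycosylation predictor: the peptide is made up, the function name find_sequons is hypothetical, and excluding Pro at the x position of the eukaryotic sequon is a common convention assumed here (the text above states only Asn-X-Ser/Thr).

```python
import re

# Lookahead patterns, so that overlapping sequons are all reported.
# Eukaryotic sequon Asn-x-Ser/Thr; excluding Pro at x is an assumption.
EUKARYOTIC = re.compile(r"(?=N[^P][ST])")
# C. jejuni sequon Asp/Glu-x-Asn-y-Ser/Thr, where x and y are not Pro.
BACTERIAL = re.compile(r"(?=[DE][^P]N[^P][ST])")


def find_sequons(sequence, pattern, asn_offset):
    """Return 1-based positions of the acceptor Asn for every sequon match.

    asn_offset is the index of Asn within the pattern (0 for the eukaryotic
    sequon, 2 for the bacterial one).
    """
    sequence = sequence.upper()
    return [m.start() + asn_offset + 1 for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    peptide = "MKDQNATGANPSNGT"  # made-up sequence for illustration only
    print("eukaryotic:", find_sequons(peptide, EUKARYOTIC, 0))
    print("bacterial: ", find_sequons(peptide, BACTERIAL, 2))
```

In this made-up peptide, Asn5 and Asn13 satisfy the eukaryotic pattern, but only Asn5 is preceded by Asp at position -2 and therefore satisfies the extended bacterial pattern; Asn10 is rejected by both because it is followed by Pro.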

PROTEIN O-GLYCOSYLATION
O-glycans of glycoproteins and mucins are assembled in mammals without a lipid intermediate and without removal of sugar residues by glycosidases (14,15). The first sugar is always GalNAc α-linked to Ser or Thr (the cancer-associated Tn antigen). All sugars are transferred from nucleotide sugars in the Golgi, resulting in extended and branched O-glycans with hundreds of different structures. The most common structure is Galβ1-3GalNAc, core 1, the T antigen, which is normally masked by the addition of other residues but exposed in many cancer cells. In a number of cells, core 1 is branched by core 2 β6-GlcNAc-transferase (C2GnT) or extended in a fashion that is similar to the synthesis of complex N-glycans. GalNAc is transferred from UDP-GalNAc by up to 20 polypeptide GalNAc-transferases (GALNTs) (14-16). All GALNTs are classified in the GT27 family with a GT-A fold (Table 3). They have a catalytic domain linked to a lectin (ricin-like) domain at the C terminus. This lectin domain has three subdomains and may play an important role in binding products or substrates containing GalNAc residues. A crystal structure of mouse GALNT1 with Mn2+ supported the importance of a DxH motif and the role of Asp209, His211, and His344 (17) (Table 4). The conformations of human GALNT2 (18) crystallized with UDP and with or without an acceptor peptide showed a loop formed over UDP. It appeared that the acceptor peptide connected the otherwise separate catalytic and lectin domains. Kinetic analyses showed that the presence of GalNAc in the acceptor was beneficial for activity. Human GALNT10 was crystallized in complex with UDP, GalNAc, and Mn2+ (19). GalNAc-peptides appear to bind to the second beta-subdomain of the lectin domain. Binding of the donor induces a conformational change that opens the acceptor-binding site. These three crystallized GALNTs are similar in overall structure and mechanism.
An equivalent GALNT that transfers GalNAc to protein has not been identified in bacteria, although bacteria are known to O-glycosylate Ser/Thr residues of proteins with various sugar residues. In contrast to mammalian O-glycosylation, bacteria transfer a pre-assembled oligosaccharide to Ser/Thr. Bacterial protein OGTs have no sequence homology to GALNT and their action is reminiscent of that of OST in the N-glycosylation pathway. In several bacteria, for example in Campylobacter and Neisseria, an oligosaccharide or monosaccharide is first pre-assembled on PP-lipid in the cytoplasmic compartment, flipped to the periplasm, and then transferred en bloc to Ser/Thr residues of proteins. These enzymes have a relaxed oligosaccharide donor specificity (46). Oligosaccharyltransferase PglL (which has not yet been assigned to a GT family) from Neisseria meningitidis (Nm) can transfer many different glycans from sugar-PP-Und or sugar-PP-lipid (including sugar-PP-Dol) donor substrates to protein in the periplasmic space. UDP-N,N′-diacetylbacillosamine was also a donor substrate in vitro, showing that even nucleotide sugars can be donors and a single sugar could be transferred to protein. Mutagenesis experiments showed that PglL from Nm requires His349 for activity and for interaction with the lipid-linked oligosaccharide (47).

BIOSYNTHESIS OF BACTERIAL O ANTIGENS
There are many similarities in the pathways and mechanisms by which bacterial O antigens and mammalian glycoproteins are synthesized. In Gram-negative bacteria, O antigens are synthesized by specific GTs at the cytosolic face of the inner membrane where the nucleotide sugar donor substrates are present, as well as the membrane-bound P-Und, an analog of the mammalian P-Dol, as the acceptor substrate for the transfer of the first sugar (48) (Figure 2). The first GT to act is always a sugar-phosphate transferase that produces the sugar-PP-Und substrate for subsequent transfer of monosaccharides by GTs. Most E. coli have GlcNAc or GalNAc at the reducing end of the repeating unit, thus sugar-phosphate transferase WecA and its orthologs are responsible for the first reaction, maintaining the α-anomeric configuration of GlcNAc. 4-Epimerases may also be involved in interconverting GlcNAc and GalNAc in the activated form (UDP-GlcNAc/UDP-GalNAc) or after the sugar transfer (49). The common heteropolymeric O antigens are synthesized by sequential transfer of sugar units by donor- and acceptor-specific, membrane-associated GTs. The specificities of these bacterial GTs are distinct and comparable to eukaryotic GTs. A completed repeating unit is then translocated across the inner membrane to the periplasmic side by the multiple membrane-spanning flippase Wzx, a process resembling the transfer of the Man5GlcNAc2-PP-Dol intermediate across the ER membrane. Polymerization involves the addition of repeating units to the reducing end of the growing chain by Wzy polymerase. This enzyme has 12 predicted transmembrane domains with the catalytic domain in the periplasm that has some specificity for the structure of the repeating unit. Wzy may invert the anomeric linkage of the first sugar in the polysaccharide since many repeating units have the GlcNAcβ-linkage in the O antigen. Many genes specifically involved in the synthesis of the O antigen are found in the O antigen gene cluster.
The presence of the wzy gene suggests that the O antigen is synthesized by the Wzy-dependent pathway (Figure 2). A much less specific chain terminator Wzz then helps to restrict the number of repeating units assembled in the O antigen. This is followed by a ligase (polysaccharide transferase)-catalyzed transfer of the O antigen to a specific sugar of the outer core structure, synthesizing the complete LPS. This releases PP-Und, which is recycled to P-Und. LPS is then extruded to the outer membrane by the Lpt complex (50).
The less common homopolymeric O antigens, such as the D-Rha polymers of Pseudomonas aeruginosa (Pa) and the D-Man polymers of E. coli O9, are synthesized by the transfer of monosaccharides from nucleotide sugars to R-GlcNAc-PP-Und in a processive fashion in the ABC transporter-dependent pathway (Figure 3) (51). Some of the processive GTs can have multiple catalytic domains, e.g., Man-transferase WbdA. The entire O antigen is assembled on the cytosolic side and completed by termination reactions, e.g., methylation. This is followed by translocation of the large O-antigen-PP-Und by the Wzm exporter and the ATP-binding Wzt to the periplasm, where it is further processed. The presence of wzm and wzt genes in the O antigen gene cluster would suggest that this pathway is operative.

FIGURE 2 | Biosynthesis of lipopolysaccharides in Gram-negative bacteria by the polymerase-dependent pathway.
Many steps of the complex sequences and controls in the biosynthesis of LPS in Gram-negative bacteria are similar to those in mammalian glycoprotein biosynthesis. The inner membrane serves as the site of glycan biosynthesis, and the membrane-bound acceptor is undecaprenol-phosphate (P-Und) having 11 isoprenyl units, which is fewer than are found in eukaryotic Dol. Nucleotide sugars are synthesized in the cytosol and used for most glycosylation reactions. As in N-glycan biosynthesis, the first sugar is transferred as sugar-phosphate by membrane-bound WecA to synthesize GalNAc/GlcNAc-PP-Und. This step can also be blocked by tunicamycin. It is possible that a 4-epimerase is involved. Subsequently, sugars are added individually to form the repeating unit of the O antigen. The glycosyltransferases that transfer sugars from nucleotide sugars usually have a high specificity for their donor and acceptor substrates and are associated with the membrane. After Wzx transports the repeating units to the periplasm, they are polymerized by Wzy by addition of repeating units to the reducing end of the growing polysaccharide linked to PP-Und. The O antigen can be further processed and modified to form completed O antigens and the biosynthesis is usually terminated with Wzz. The O polysaccharide is then transferred to a sugar of the core oligosaccharide linked to lipid A by a ligase, forming the LPS, which is exported to the outer membrane by the Lpt complex. The O antigenic polysaccharide is then exposed to the environment on the outer membrane. Although many bacterial enzymes involved in LPS synthesis have been cloned, the individual steps of LPS synthesis are not well understood, mainly because of the major challenge of finding the appropriate enzyme substrates and conditions to assay enzymes. The example shows the biosynthesis of the E. coli O104 antigen. The repeating unit tetrasaccharide contains the cancer-associated T antigen (Galβ1-3GalNAc), as well as the sialyl-T antigen (sialylα2-3Galβ1-3GalNAc). The WbwA sialyltransferase and the WbwB Gal-transferase remain to be characterized. Labels in the diagram: growing O antigen; lipid A-inner core-outer core; outer membrane; repeating unit.
The events utilizing membrane-bound acceptor substrates in bacteria are similar to those of the early N-glycan synthesis in eukaryotes at the ER inner membrane (Figure 1). In both mammals and bacteria, isoenzymes are known that can synthesize the same linkage, often with slightly different substrate specificity. These isoenzymes are interesting models to study the catalytic sites and requirement for specific amino acids critical for catalysis and specificity.
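The gene-based inference described in this section (wzy suggests the Wzy-dependent pathway; wzm together with wzt suggests the ABC transporter-dependent pathway) can be written as a short rule. This Python sketch is a hedged illustration only: the function name predict_pathways and the example gene lists are hypothetical, and a real assignment would need annotated gene clusters and biochemical confirmation.

```python
def predict_pathways(cluster_genes):
    """Suggest O antigen assembly pathways from gene names in a gene cluster.

    Encodes only the two rules stated in the text; an empty list means that
    neither marker set was found.
    """
    genes = {g.strip().lower() for g in cluster_genes}
    pathways = []
    if "wzy" in genes:
        pathways.append("Wzy-dependent")
    if {"wzm", "wzt"} <= genes:  # both transporter genes must be present
        pathways.append("ABC transporter-dependent")
    return pathways


if __name__ == "__main__":
    # Illustrative gene lists, not taken from a database.
    print(predict_pathways(["wzx", "wzy", "wbwA", "wbwB"]))
    print(predict_pathways(["wzm", "wzt", "wbdA"]))
```

Returning a list rather than a single label keeps the rule honest about ambiguous clusters: a cluster carrying both marker sets, or neither, is reported as such instead of being forced into one pathway.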

CHARACTERIZATION OF GLYCOSYLTRANSFERASES
Chemical synthesis has been used to produce natural-like or unnatural glycans but the stereochemistry and regio-selectivity are difficult to achieve. Nature has developed GTs, excellent tools to synthesize an amazing diversity of glycan structures with defined anomeric configurations and linkages. GT reactions do not require harsh conditions or protection of reactive groups. GTs have distinct specificities for their donor and acceptor substrates. More than 100,000 genes from various species are thought to encode GTs, and organisms have 1-2% of their genes dedicated to GTs (52).
In order to assess the requirements and characteristics of GT activities, specific and accurate enzyme assays have to be developed. Nucleotide sugar donor substrates for mammalian glycoprotein biosynthesis are usually commercially available but for bacterial enzymes may have to be chemically or enzymatically synthesized. It would be difficult to extract the natural donor and acceptor substrates from bacteria in the pure form. Therefore, syntheses for bacteria-specific donor substrate analogs have been developed, e.g., for UDP-QuiNAc (UDP-6-deoxy-GlcNAc) found in E. coli and Pseudomonas aeruginosa (Pa) (53) or for GDP-D-Rha found in Pa (54). Oligosaccharides linked to a synthetic aglycone group may be suitable acceptors for both mammalian GTs and bacterial GTs. However, bacterial GTs that act early in the O antigen synthesis pathway seem to require sugar-diphosphate-lipids as acceptors, which are difficult to synthesize. We developed the natural acceptor analog GlcNAcα-diphosphate-lipid to mimic the product of the first sugar-phosphate addition (55), which was very active as an acceptor. In order to isolate the enzyme product from the assay mixture for quantification, a number of different chromatographic methods have been employed, including hydrophobic or anion exchange methods, HPLC, TLC, and capillary electrophoresis. Enzyme-coupled assays or lectin and antibody binding have also been used to determine activities. Methods to assay specific GTs are essential prerequisites to study their properties and optimal conditions, substrate specificities, and to develop inhibitors. GTs can be classified based on similarities of their amino acid sequences, according to the sugar they transfer, and the stereochemistry of the reaction in the CAZy database.
If at least 100 amino acids in two different stretches of the protein have significant similarity to other members of the same family but not to other families, GTs are assigned to a specific family with the same predicted fold and the same stereochemical outcome, either inverting or retaining (Tables 3 and 4). However, not all known sequences fit into a GT family, some are reclassified when the specific function of the GT has been established, and the number of families is growing. Sequence similarity of unknown proteins can be used to predict function and protein folding. However, the final proof of function has to be obtained by biochemical analysis of enzymes. Most GTs in bacteria have not been functionally characterized, and this area is both challenging and tedious, often because the appropriate donor and acceptor substrates have to be specially prepared.
Crystal structures for GTs from eukaryotic and prokaryotic sources have been helpful in delineating the catalytic actions of GTs. It is interesting that this large and important class of thousands of enzymes that bind to many different nucleotide sugars as well as to a very large variety of monosaccharides, oligosaccharides, glycopeptides, and glycolipids occurs in only two major fold types, GT-A and GT-B. GTs, thus, have a relatively conserved three-dimensional architecture within their catalytic sites and share mechanisms, resulting in an extremely large number of product structures with linear or branched glycans of mostly unknown functions.
The binding of substrates to GTs and the transfer reactions have been shown to involve conformational changes in the enzyme proteins. GT-A folded enzymes have two tightly associated α/β/α Rossmann nucleotide-binding-like domains with two α-helices surrounding an open twisted, central β-sheet. The donor and acceptor substrates bind in different domains. The GT-B folded enzymes have two β/α/β Rossmann-like domains, which are less tightly associated with each other and have the active site in the cleft between domains (56). Usually, the sugar donor substrate binds first. This induces a conformational change in the enzyme forming a lid over the nucleotide sugar, facilitating the binding of the acceptor substrate and catalysis in an ordered sequential, regio- and stereo-specific mechanism (57,58). Internal disordered loops seem to be a common feature in mammalian and bacterial enzymes (40). A short, disordered protein loop becomes ordered when the donor substrate is bound. A change in orientation and conformation of the resulting ordered loop appears to facilitate binding of the second substrate and catalysis. Thus, the function of an ordered loop could be to allow catalysis, possibly by excluding water that would hydrolyze the donor substrate, to form a lid over the nucleotide binding site allowing the acceptor to bind, or to allow movement that facilitates the reaction.
Generally, GTs have a distinct acceptor substrate specificity and with few exceptions, utilize only one type of nucleotide sugar donor substrate. Although few of the bacterial GTs have been biochemically characterized, it appears that both bacterial and mammalian GTs generally have similar properties with respect to their optimal pH, metal ion requirement, and donor specificity, although some bacterial GTs have a more promiscuous acceptor specificity (59). They can, thus, synthesize unnatural linkages that may find application as inhibitors or for biological studies. For example, β4-Gal-transferase LgtB from Helicobacter pylori (Hp) has been used to synthesize thio-glycosides.
A comparison of mammalian and corresponding bacterial GTs (Table 2) shows that there is a low percentage of amino acid identity (often <12%), although the activities are comparable and the sugar transfer reactions follow similar mechanisms. Exceptions are the ABO transferases, GTA and GTB, that synthesize blood group A and B, respectively, similarly in human beings and in certain bacteria, and show about 20% identity. Some of the α2 and α3/α4-Fuc-transferases also have similar activities when comparing human and bacterial GTs and show 14.5-17.5% identity. This suggests an exchange of genes between mammals and bacteria or a common evolutionary origin. The similarity and identity between GTs with similar function in bacteria or within the eukaryotic GT families can be much higher. The arrangements of amino acids in the catalytic site may therefore be similar, leading to the binding of the same nucleotide sugars and acceptors with the transfer of the sugar in a specific linkage. The requirement of a metal ion to stabilize the negative charge of the nucleotide sugar may also be the same. An evolutionarily conserved feature of GTs is that the catalytic mechanism usually involves a catalytic base.
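Percent identity values such as those quoted above depend on how aligned positions are counted. The following Python sketch shows one common convention, identical residues divided by the number of aligned columns in which neither sequence has a gap; this choice of denominator is an assumption, since the method behind the quoted values is not stated here, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy alignment for illustration only, not real GT sequences.
    print(round(percent_identity("MKT-LDVD", "MRTALD-E"), 1))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives lower values for gapped alignments, so identity figures from different sources are only comparable when the convention is known.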
Inverting GTs (52,57,58) invert the anomeric configuration of the sugar in the donor substrate. This inversion is expected to follow a single displacement mechanism in which the catalytic base deprotonates the hydroxyl group of the acceptor to be glycosylated, which then becomes an active nucleophile attacking carbon-1 of the sugar of the donor substrate. This mechanism involves an SN2 reaction and an oxocarbenium ion transition state. Crystal structures show that the catalytic base (Asp, Glu, or His) is properly positioned near the hydroxyl to be glycosylated. In many cases, this catalytic residue is within a conserved DxD motif (Table 5). Both the DxD motif and the negatively charged phosphate group of the nucleotide leaving group may be stabilized by a divalent metal ion, but positively charged amino acids could also serve this function (20). Inverting GTs have been shown to act with a sequential ordered mechanism. GTs that retain the anomeric linkage of the nucleotide sugar may function in a double displacement mechanism (58). Thus, for retaining GTs, a short-lived glycosyl-enzyme intermediate may form. This is followed by a shift in protein conformation that allows a nucleophilic attack on the anomeric center of the sugar by the deprotonated hydroxyl of the acceptor substrate to be glycosylated, maintaining the original anomeric linkage. A double displacement mechanism has been proposed for GalNAc-transferase GTA and Gal-transferase GTB, and a covalent glycosyl-enzyme intermediate through Cys303 was found (68). Other mechanisms may be possible and need to be investigated for retaining enzymes (58). GTs can also transfer sugar to water and thus have a nucleotide sugar hydrolase activity.

TABLE 5 (partial)
Enzymes | Motif or domain
C2GnT | SPDE (20,21)
Sialyltransferases, mammalian | Sialyl-motif L, Sialyl-motif S, Sialyl-motif VS, Sialyl-motif III
Sialyltransferases, bacterial | HP
Polysialyltransferases, mammalian | PSTD motif (67)
Polypeptide GalNAc-transferases | Ricin-like lectin domain
ABO transferases | DxD (60)

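The stereochemical outcome of the single and double displacement mechanisms described above reduces to parity: each SN2-type displacement at the anomeric carbon inverts the configuration. The Python sketch below illustrates only that bookkeeping. The function name is hypothetical, the α configuration of UDP-sugar donors is background knowledge added here rather than a statement from the text, and alternative mechanisms for retaining enzymes are not modeled.

```python
def product_anomer(donor_anomer, displacements):
    """Anomeric configuration after a number of SN2-type displacements.

    One displacement models an inverting GT; two model the double
    displacement proposed for retaining GTs.
    """
    flip = {"alpha": "beta", "beta": "alpha"}
    if donor_anomer not in flip:
        raise ValueError("donor_anomer must be 'alpha' or 'beta'")
    if displacements < 1:
        raise ValueError("at least one displacement is required")
    anomer = donor_anomer
    for _ in range(displacements):
        anomer = flip[anomer]  # each displacement inverts the configuration
    return anomer


if __name__ == "__main__":
    # Inverting enzyme with an alpha-linked UDP-sugar donor: beta-linked product.
    print(product_anomer("alpha", 1))
    # Retaining enzyme by double displacement: alpha-linked product.
    print(product_anomer("alpha", 2))
```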
Mammalian GTs are single or multiple membrane-spanning proteins in the ER or single transmembrane-spanning type II membrane proteins in the Golgi, with a short cytosolic domain, a transmembrane anchor domain, and a stem region that helps the globular catalytic domain to protrude into the Golgi lumen. In the bacterial inner membrane, the first enzyme that adds the sugar-phosphate to P-Und, such as WecA, as well as related sugar-phosphate transferases, have multiple membrane-spanning domains. The remaining bacterial GTs that assemble O antigen repeating units do not have a transmembrane domain but have short hydrophobic stretches that may contribute to an association with membrane components. It is possible that both mammalian and bacterial GTs exist in protein/membrane complexes that activate enzymes and make the assembly of glycan chains highly efficient.

A LARGE FAMILY OF Gal-TRANSFERASES
Families of at least 5 β3-Gal-transferases (B3GALTs) and at least 7 β4-Gal-transferases (B4GALTs) participate in forming the extensions of glycoproteins (69) that are the basis for the attachments of epitopes including the Lewis x antigen, the selectin ligand involved in the inflammatory response (8). These inverting, metal ion-dependent GTs have a DxD motif and bind UDP-Gal and a number of GlcNAc-terminating acceptor substrates.

THE B4GALT FAMILY
The crystal structures of both human and bovine β4-Gal-transferase 1 (B4GALT1) in complexes with donor and acceptor substrates, and of several mutants, have been thoroughly studied (22). UDP-Gal binds in a deep catalytic pocket of the bovine B4GALT1 together with Mn2+, in the vicinity of residues Asp252, Asp318, and Glu317. The conformational change induced by binding UDP-Gal creates the binding site for GlcNAc-terminating oligosaccharides. The GlcNAc moiety, which needs to be in the β-anomeric configuration, is bound by Phe280, Phe360, Tyr286, Arg259, and Ile363. The enzyme has three DxD sequences.
In the bovine B4GALT1 enzyme, the first Asp254 residue in the DVD motif has contact with UDP and Mn2+, but mutations of Asp318 or Asp320 within the DDD sequence show that these residues are essential for activity. His344 normally interacts with Mn2+. A His344Met mutant is active in the presence of Mg2+ instead of Mn2+ and maintains a closed conformation bound to Mg2+ and UDP-hexanolamine, allowing an acceptor to bind. The mutant is thus useful for studying the role of conformational changes and the binding of various acceptors (70,71).
The catalytic domain of B4GALT1 has a short and a larger flexible loop containing the metal ion binding site. The binding of the donor and metal ion induces conformational changes in the long flexible loop, which changes from the open to the closed conformation, creating a lid over the bound nucleotide sugar. This opens an acceptor-binding site at the C terminus of the flexible loop. After the transfer reaction, the loop changes back to the open conformation, releasing the nucleotide (72). β4-Gal-transferase 7 (B4GALT7) is another member of the same family, involved in priming glycosaminoglycan synthesis by adding Gal to xylose (24). B4GALT7 also works by an SN2-type mechanism and changes conformation from closed to open conformation upon binding UDP and Mn2+. The mammalian β4-Gal-transferases have a common B4GALT motif, GWGxED, which is not found in β3-Gal-transferases or in the bacterial counterparts of B4GALT (61) (Table 5).
β4-Gal-transferases that synthesize Galβ1-4GlcNAc sequences are also found in bacteria. For example, β4-Gal-transferase LgtB from Helicobacter pylori (Hp) can synthesize Galβ4-S-GlcNAc and Galβ4-Man linkages (59). The repeating unit of Shigella boydii (Sb) also contains the Galβ1-4GlcNAc sequence, which is synthesized by β4-Gal-transferase WfeD (73). The sequences of human β4-Gal-transferase and WfeD have about 9% identity; yet, the reaction catalyzed is similar. Both enzymes are inverting GTs, bind UDP-Gal and GlcNAc-R acceptor substrates, are activated by Mn2+, and have a DxD motif. Interestingly, we found that both enzymes are also activated by Pb2+, although the activation of the bacterial enzyme is much higher and is similar to Mn2+ activation. While human β4-Gal-transferase is in the GT7 family with a GT-A fold, the structure and predicted fold of WfeD in GT family 26 are uncertain (Table 3). The human enzyme does not accept the negatively charged bacterial acceptor substrate, GlcNAc-PP-lipid, and, vice versa, the bacterial enzyme cannot act on GlcNAcβ-Bn, which is the standard acceptor for assays of the human enzyme. Mutagenesis of WfeD showed that the central Glu101 residue of the DxExE sequence is essential for activity. Lys211 was also found to be important, possibly by binding one or two phosphate group(s) of the acceptor substrate (73). Lys residues are apparently not involved in catalysis by the human enzyme. WfeD is not inhibited by GlcNAcβ-naphthyl, which is a potent inhibitor of the mammalian β4-Gal-transferase (74).

THE FAMILY OF β3Gal-TRANSFERASES (B3GALT)
Human glycoproteins can be extended with Galβ1-3GlcNAc (type 1) sequences that are also found in O antigens, e.g., in the repeating unit structure of the E. coli O7 antigen. There are five enzymes that synthesize the Galβ1-3GlcNAc linkage on a variety of acceptors in mammals. They are inverting GTs having a DxD motif and a requirement for divalent metal ions such as Mn2+ (15,69). B3GALT5 has a distinct specificity for O-glycan core 3 (GlcNAcβ1-3GalNAc-) acceptors. However, crystal structures are not available for β3-Gal-transferases. Members of the β3-Gal-transferase family have two common peptide motifs, in addition to the DxD motif (Table 5).
A β3-Gal-transferase, WbbD from E. coli O7, was detected that can act on GlcNAcα-PP-lipids; the lipid structure apparently makes only a minor contribution to the activity (75). The enzyme belongs to the GT2 family with a predicted GT-A fold and synthesizes the disaccharide Galβ3GlcNAc α-linked to PP-lipid as the second step in repeating unit synthesis. Deletion of the enzyme eliminates the synthesis of O antigen on LPS. This supports the idea that inhibition of this second step could create bacteria that are more susceptible to the mammalian immune system.

BIOSYNTHESIS OF THE THOMSEN-FRIEDENREICH (TF) ANTIGEN
The cancer-associated T antigen, Galβ1-3GalNAc-, core 1, is the precursor for most O-glycans. In cancer, core 1 is often found in the unsubstituted form, while in normal glycoproteins, it is substituted by other sugars and is thus not recognized by anti-T antibodies. Sialylation of core 1 is also common in glycoproteins and is often overexpressed in cancer, where the product is recognized as the sialyl-T antigen (15). Several bacteria carry the T antigen as an internal structure within their O antigen repeating unit. The Shiga toxin-producing O104 serogroup of E. coli is unusual in that it contains the T antigen in its O antigen repeating unit, as well as the sialyl-T antigen, sialylα2-3Galβ1-3GalNAc- (ECODAB).
The core 1 structure in human beings is synthesized by core 1 β3-Gal-transferase (T synthase, C1GALT1), and deficiencies of the enzyme are associated with pathological conditions including cancer. T synthase is the only known GT that requires the co-expression of a chaperone protein, Cosmc, C1GALT1C1 (76). C1GALT is a GT31 family member with a predicted GT-A fold, requires Mn2+ for activity, and prefers GalNAcα-glycopeptides as substrates but can also transfer Gal from UDP-Gal to GalNAcα-benzyl and related substrates (77).

Frontiers in Immunology | Immunotherapies and Vaccines
The GTs responsible for the synthesis of the T antigen in bacteria have a similar function (Table 2). The T synthase WbwC in the E. coli O104 strain is within the GT2 family (Table 3) and has only 10.5% identity compared to human C1GALT. No chaperone is necessary for the expression and activity of the bacterial enzyme (78). Both human C1GALT and WbwC have a GT-A fold and DxD motifs, utilize UDP-Gal as a donor, and require Mn2+ as a cofactor. However, in contrast to C1GALT, WbwC has a specificity for GalNAcα-diphosphate-lipid acceptor, while GalNAcα-peptides are not substrates. At this time, no crystal structure is available for T synthases, but it is conceivable that the three-dimensional amino acid arrangements in the catalytic sites are similar. WbwC and human C1GALT could be distinguished using bis-imidazolium salt inhibitors, which showed that only WbwC, but not human C1GALT, was strongly inhibited with IC50 values of 8 µM (78). These inhibitors could selectively attack GTs in pathogenic bacteria. However, a potent inhibitor for T synthase has yet to be discovered (77).

P BLOOD GROUP SYNTHESIS
Human blood group P (Table 1) and related, complex structures containing the Galα1-4 linkage are synthesized by α4-Gal-transferases (A4GALT), mainly using glycolipids with Gal residues as acceptors, e.g., lactosylceramide (79). However, a different α4-Gal-transferase from pigeon, related to β4-Gal-transferase from the same species, but not to β4-Gal-transferases from human beings, has been described that preferably acts on the N-glycans of glycoproteins (80).
A number of bacteria, including Cj (81), also express an α4-Gal-transferase with about 11% identity to the human enzyme (Table 2). The LgtC α4-Gal-transferase from Nm synthesizes the bacterial mimic of the human P blood group (45). The enzyme is a member of the GT8 family with a GT-A fold and follows a bi-bi kinetic mechanism where UDP-Gal binds first. The crystal structure of LgtC with analogs of UDP-Gal and lactose substrates suggests that Asp103 and Asp105 of one of the four DxD motifs, as well as His244, are in the vicinity of the donor substrate, while a Mn2+ ion coordinates the phosphates of UDP. The mainly helical C terminus is expected to form hydrophobic and electrostatic interactions with the bacterial membrane. Multiple conformational states of LgtC with and without bound substrate analogs were found by methyl-TROSY NMR (82), information that cannot be obtained by static crystal structure analysis.

A NEW DxDD MOTIF IN GT2 TRANSFERASES
A new DxDD motif (Table 5), essential for activity, was discovered in WbwC (78). This motif is also present in WbdN, WfaP, WfgD, WbgO, WbiP, and CgtB (83-87). All of these GTs in the GT2 family having a DxDD motif are specific for the transfer of either Gal or Glc in β1-3 linkage to GalNAc or GlcNAc. Mutagenesis showed that in WbiP from E. coli O127 (83), the first Asp of the DxDD sequence was critical for activity while the second contributed but was not essential. In WbwC from E. coli O104 and O5, all three Asp residues were mutated and found to be important for activity. The first Asp (D91) is probably the catalytic base. The other Asp residues may support the nucleophilic property of the catalytic base (78).
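Short sequence motifs of the kind discussed in this review (DxD, DxDD, HxRRxD, GWGxED) can be located in a protein sequence with a simple pattern search. The following Python sketch is an illustration only: the function name, the dictionary of patterns, and the toy sequence are assumptions and do not reproduce the methods of the cited studies. A pattern hit shows only that the residues are present; whether a motif is functional has to be established experimentally, e.g., by mutagenesis.

```python
import re

# Motif names as used in the text; "x" (any residue) becomes "." in the regex.
MOTIF_PATTERNS = {
    "DxD": "D.D",
    "DxDD": "D.DD",
    "HxRRxD": "H.RR.D",
    "GWGxED": "GWG.ED",
}


def find_motifs(sequence, patterns=MOTIF_PATTERNS):
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in patterns.items():
        # A lookahead with a capture group reports overlapping occurrences too.
        regex = re.compile("(?=(" + pattern + "))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(seq)]
    return hits


# Toy, made-up sequence (not a real GT) containing one copy of each motif.
toy_sequence = "MKLDADDVTGWGAEDLHRRRKD"
toy_hits = find_motifs(toy_sequence)
```

Because DxD is very short, it occurs by chance in many proteins, so hits are best interpreted together with family membership and structural or mutational data.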
While WbwC synthesizes the Galβ1-3 linkage attached to the first GalNAc residue at the reducing end of the O antigen repeating unit, several other GTs having a DxDD motif in the GT2 family were shown to synthesize the T antigen at a more internal position of the repeating unit. These GTs have a different specificity from that of WbwC and do not require the diphosphate in the acceptor. The T synthase activities of variants of CgtB from Cj mainly act on β-linked GalNAc acceptors. Variants of CgtB have distinct acceptor specificities (86) and synthesize lipooligosaccharides, which mimic mammalian glycolipids and glycoproteins.

GlcNAc-TRANSFERASES FORM BACKBONE STRUCTURES
Gal-transferases cooperate with five or more β3-GlcNAc-transferases (B3GNT) within the GT31 family to form the type 1 and 2 backbone structures of mammalian glycan chains (15,88,89) (Table 3). B3GNTs have significant sequence similarity with Gal-transferases. It is not known if these enzymes are physically associated, although their combined action would suggest this. A family of three β6-GlcNAc-transferases (IGnT, GCNT2) can then add 1-6 branches to the linear chains. The β3-GlcNAc-transferases, but not the β6-GlcNAc-transferases, require divalent metal ions for activity. No crystal structures are yet available for B3GNTs.
In the N-glycosylation pathways, GnT I to V (MGAT1 to 5) (12) are responsible for forming GlcNAc-based antennae that can be further extended through repeating linear or branched GlcNAcβ1-3Gal-disaccharides. MGAT1 is an inverting GT with a GT-A fold within the GT13 family. The crystal structure of rabbit GnT I with UDP-GlcNAc and Mn2+ supports an ordered sequential mechanism. The DxD motif is present as EDD, with Glu211 being the likely catalytic base (19). MGAT2, 3, 4, and 5 are all inverting GTs and have been classified in the GT16, GT17, GT54, and GT18 families, respectively. Although the GT17 family also contains uncharacterized bacterial proteins, no bacterial equivalents of MGAT have been found.
In the O-glycosylation pathways, the basis of most extended chains is core 2. Core 2 β6-GlcNAc-transferase C2GnT1 (GCNT1) adds a branch to O-glycan core 1 to form the core 2 structure GlcNAcβ1-6(Galβ1-3)GalNAc-R (15). The enzyme has a GT-A fold and is classified in the GT14 family. The crystal structure of the catalytic domain of mouse C2GnT1 shows that the protein has four conserved intramolecular disulfide bonds (20,21). Cys217, however, has to be reduced to support the activity, although it is not an essential residue (90). The human enzyme expressed in insect cells has two flexible N-glycans that protect the protein from degradation (91). C2GnT1 is an inverting GT that is active in the presence of EDTA and does not require Mn2+. The crystal structure suggests that the conserved, basic amino acids Arg378 and Lys401 stabilize the diphosphate group of UDP-GlcNAc and thus serve the function of Mn2+. The structure supports specificity studies of C2GnT1, showing an absolute requirement for the 4- and 6-hydroxyl groups of the Gal and GalNAc residues and the 2-acetamido group of GalNAc (77). Glu320 of the conserved SPDE sequence may be the catalytic base; it binds to the 4- and 6-oxygen of GalNAc and could thus deprotonate and activate the 6-hydroxyl to induce a nucleophilic attack on the C-1 of the GlcNAc moiety of UDP-GlcNAc (20,21).

www.frontiersin.org
Bacteria do not appear to have C2GnT or GnT I equivalents, but they express type 1 or type 2 chains and β3-GlcNAc-transferases comparable to the mammalian enzymes in their activities. For example, a β3-GlcNAc-transferase from Hp is involved in the synthesis of lipooligosaccharides and GlcNAcβ1-3Gal-extensions that resemble mammalian epitopes (92). The β3-GlcNAc-transferase LgtA from Nm acts on lactose and has a relaxed donor specificity. It is most active with UDP-GlcNAc but can also utilize UDP-GalNAc (93). Both the mammalian and bacterial β3-GlcNAc-transferases accept a wide variety of acceptor substrates but have low sequence identity (Table 2) (15).

FUCOSYLTRANSFERASES THAT SYNTHESIZE THE H ANTIGEN
The blood group O (H antigen, Fucα1-2Gal-R) is found in virtually all human beings and in certain bacteria and is the precursor substrate structure to form blood groups A and B. The enzymes that synthesize the H antigen in human beings are inverting α2-Fuc-transferases 1 and 2 (FUT1 and FUT2) that are closely related in sequence to the GT6 family ABO transferases GTA and GTB, although FUT1 and 2 have been classified into a different (GT11) family. FUT1 has a broad acceptor specificity for Galβ-R while FUT2 prefers O-glycan core 1 (T antigen) as a substrate (15).
Similar enzymes (Table 2) have been identified in Hp as FutC (94), in E. coli O86 as WbwK (95), in E. coli O128 as WbsJ (64), and in E. coli O127 as WbiQ (96). WbwK and WbiQ have a distinct specificity for the T antigen (95,96) and do not act on Galβ1-4 glycans. These FUTs, therefore, have an activity resembling that of human FUT2 and have 12-17.5% sequence identity. HpFucT2 (FutC) adds Fuc preferably to Lewis x acceptors but also uses Lewis a and type 1 chains (94). In contrast, WbsJ prefers acceptors with terminal Galβ1-4Glc structures (64). WbsJ functions in the absence of divalent metal ion and does not have a DxD motif. The first Arg residue of the HxRRxD motif, conserved in α2- and α6-Fuc-transferases, is especially critical for activity due to its positive charge. Domain swapping between WbwK and WbsJ showed that the C-terminal motifs function in determining acceptor specificity (95). All of the identified α2-Fuc-transferases have significant homology in GT family 11 (Table 3) with a predicted GT-B fold, but none have been crystallized.

FUCOSYLTRANSFERASES INVOLVED IN THE SYNTHESIS OF LEWIS ANTIGENS
Lewis type antigens play essential roles in cell adhesion in the immune system and during inflammation, and aberrant amounts are often found in cancer. A family of mammalian, inverting α3-Fuc-transferases (FUT3-7, 9-11) is involved in Lewis antigen synthesis by linking Fuc to GlcNAc (9,15,97,98). The enzymes vary in their acceptor substrate specificities and cell type expression and are in the GT10 family with a GT-B fold. FUT3 is an exceptional enzyme that has a dual specificity and adds Fucα1-3 to type 2 chains to synthesize Lewis x and y, as well as Fucα1-4 to type 1 chains to synthesize Lewis a and b (Table 1). FUT5 also shows some α4-Fuc-transferase activity. Human FUT3 and 5 have Trp111, which is responsible for type 1 acceptor recognition and 1-4 linkage synthesis. FUTs that do not have this Trp synthesize the 1-3 linkage (99).
The bacterial α3FUTs show weak homology to mammalian FUT in two small segments of the catalytic domains (α3FUT motifs). They have about 10% sequence identity and a common GT-B fold but no transmembrane domain (25,62). Two amphipathic α-helices serve to anchor the enzymes in the membrane. The gastric pathogen Hp is a prime example of expressing human-like type 1 and type 2 chains that are fucosylated and include Lewis antigens, which may play a role in adhesion to gastric epithelial cells or in internalization. Hp have short O antigens (lipooligosaccharides) and the human glycan mimics help to mask the immunogenic determinants of Hp, thus evading immune surveillance and supporting persistent Hp infections. The different pH environments in the various regions of the stomach influence the expression of Lewis antigens, and likely the activities of GTs, leading to phase variations.
A number of bacteria have Fucα1-4 linkages, but Hp is especially rich in Lewis a, b, x, and y structures and in α3/4-Fuc-transferase activities (100). All of the eukaryotic and most bacterial α3-Fuc-transferases are in the GT10 family. Hp has futA and futB genes encoding α3FUT, in addition to 1-3/4 FUT (FucTa). FucTa has the CNDAHYSALH sequence near the C terminus that controls type 1 chain recognition. It seems that in this α3/4 FUT, it is Tyr instead of Trp that determines the acceptor preference. Thus, the Y350A mutant synthesizes Lewis x since it has dramatically reduced α4 FUT activity (100).
The crystal structure of α3-Fuc-transferase from Hp shows that a Glu95 residue is positioned closely to the anomeric carbon of Fuc of the donor GDP-βFuc and could be a catalytic base (25) while Glu249 could stabilize the intermediate oxonium ion. Mutants in these Glu residues are virtually inactive. Interestingly, tandem repeats of 7 amino acids (DDLRINY) are found in this α3FUT. The 2-10 heptad repeats appear to connect the N terminus to 2 amphipathic helices at the C terminus and are thought to be involved in maintaining secondary structure and activity (101). The C terminal sequence appears to determine the stability and overall structure of the protein.
A different α3-Fuc-transferase HhFT2 from Helicobacter hepaticus (Hh) synthesizes the Lewis x as well as the sialyl-Lewis x antigen (102). This enzyme is a member of the GT11 family and has more homology to α2-Fuc-transferases such as WbsJ of GT11, but less to α3/4 FUT in GT family 10. It has 10.4% sequence identity with the human enzyme FUT4. HhFT2 has three conserved motifs, one at the N terminus, one central, and one near the C terminus (Table 5).

SYNTHESIS OF THE Fucα1-6GlcNAc LINKAGE
The α6-Fuc-transferases add Fuc in α1-6 linkage to the reducing end GlcNAc of the N-glycan core. The human enzyme (FUT8) requires the prior action of GnT I and cannot act when the chitobiose of the N-glycan core carries an α3Fuc residue, or if the internal Manβ residue carries the bisecting GlcNAc. FUT8 is classified in family GT23 with a GT-B fold. The crystal structure of human FUT8 shows three domains: an N-terminal coiled-coil domain, a catalytic domain that resembles GT-B folded GTs, and a C-terminal SH3 domain whose significance is unknown. The C-terminal part of the catalytic domain contains a Rossmann-like fold with three regions, conserved in α2-, α6-, and other Fuc-transferases. Both Arg365 and Arg366 are critical for binding to GDP-Fuc while Asp453 may be a critical catalytic base (26).
A bacterial α6-Fuc-transferase with similar activity in the GT23 family with a GT-B fold and only 8% sequence identity is NodZ from Rhizobium sp. (Rsp) (Tables 2 and 4) (103). The crystal structure of NodZ shows two domains of nearly equal size but with different shape, separated by a central cleft (27). There are three conserved sequence motifs near the C terminus that play a role in GDP-Fuc binding or catalysis.

GLYCOSYLTRANSFERASES THAT SYNTHESIZE BLOOD GROUPS A AND B
The two human ABO transferases that synthesize the antigenic blood group A and B determinants from the H antigen (α3-GalNAc-transferase GTA and α3-Gal-transferase GTB, respectively) are homologous retaining enzymes within the GT6 family with a GT-A fold (Table 3). It is astounding that the critical difference in donor specificities determining blood group A or B lies in a difference of only four amino acids. While GTB that transfers Gal has Gly176, Ser235, Met266, and Ala268, the GTA protein that transfers the slightly larger GalNAc has mostly smaller amino acids Arg176, Gly235, Leu266, and Gly268.
In GTA and GTB, two domains are separated by a catalytic cleft containing the DxD motif (Asp211-Asp213) (39). However, a highly conserved Glu303 is likely to be the active nucleophile. UDP binds in the nucleotide sugar-binding domain at the N terminus and the Mn2+ ion coordinates the β-phosphate of UDP. The H antigen acceptor binds to the C terminus. A disordered and flexible internal loop adjacent to the active site (40) becomes ordered when the nucleotide (sugar) is bound. This leads to a conformational change in the protein (43). Two amino acids are in contact with donor or acceptor in GTA and GTB (39), but only one of them determines the binding of the nucleotide-bound sugar moiety, i.e., either Gal or GalNAc. Leu266 in GTA has contact with the acetamido group, which allows binding of UDP-GalNAc. Due to the larger Met in this position in GTB (Met266), GalNAc cannot be accommodated and, therefore, Gal binds. Ala/Gly268 has contact with the 3- and 4-hydroxyl groups of Gal and thus does not contribute to the difference in donor specificity.
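The four-residue difference between GTA and GTB described above can be written down as a small lookup table. The following Python sketch is a toy illustration: the dictionary and function names are hypothetical, the residues are those listed in the text, and the rule in `predicted_donor` encodes only the statement that the residue at position 266 (Leu in GTA, Met in GTB) determines which sugar is accommodated. It is not a validated predictor of enzyme specificity.

```python
# Residues (one-letter codes) at the four positions that differ between
# human GTA and GTB, as listed in the text.
GTA_RESIDUES = {176: "R", 235: "G", 266: "L", 268: "G"}
GTB_RESIDUES = {176: "G", 235: "S", 266: "M", 268: "A"}


def differing_positions(residues_a, residues_b):
    """Positions present in both tables at which the residues differ."""
    shared = set(residues_a) & set(residues_b)
    return sorted(pos for pos in shared if residues_a[pos] != residues_b[pos])


def predicted_donor(residues):
    """Toy rule based only on position 266 (an illustrative assumption)."""
    residue_266 = residues.get(266)
    if residue_266 == "L":
        return "UDP-GalNAc"  # Leu266 leaves room for the acetamido group
    if residue_266 == "M":
        return "UDP-Gal"  # the larger Met266 excludes GalNAc
    return "undetermined"
```

Calling `differing_positions(GTA_RESIDUES, GTB_RESIDUES)` lists all four positions, while `predicted_donor` deliberately looks at position 266 alone, in line with the observation that Ala/Gly268 does not contribute to the difference in donor specificity.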
Human beings have antibodies against the absent blood group (A or B), and it is possible that these are induced by bacteria displaying this blood group. A number of bacterial GTA-like enzymes are also in the GT6 family and resemble the human counterpart with relatively high sequence identity of about 20%. The similarities between human and bacterial enzymes suggest a horizontal gene transfer between species and between bacteria. The bacterial enzymes have an NxN sequence instead of the eukaryotic DxD motif, and most of these enzymes do not have a metal ion requirement. Thus, bacterial enzymes may have altered catalytic mechanisms, although there is a strong conservation of mammalian-type residues in the active sites (104).
Helicobacter mustelae (Hm) synthesizes the blood group A determinant, which reacts with anti-human blood group A antibodies (105). The enzyme responsible, GTA-like α3-GalNAc-transferase (BgtA), has 20% sequence identity to its human counterpart GTA and can act on Fucα1-2Galβ1-3-R or Fucα1-2Galβ1-4-R substrates (Table 2). Thus, bacteria may have acquired the GTA gene from a mammalian host, enhancing their molecular mimicry, although it is not clear how the human blood group gives them a selective advantage.
The GTA-like enzyme BoGT6a from Bacteroides ovatus (Bo) (44) and GTB-like α3-Gal-transferase WbnI from E. coli O86 (95) that synthesize blood group B are related to the human enzymes with significant sequence homology in the GT6 family. Both donor and acceptor substrates are the same as those for GTA and GTB from human beings. The crystal structure of BoGT6a revealed a disordered region, which becomes ordered when acceptor Fuc-lactose is bound. This is accompanied by a large conformational change from the open to a closed state. Isothermal titration calorimetry (ITC) experiments showed that BoGT6a binds UDP-GalNAc with high affinity.
In non-primate mammals and new world monkeys, the linear blood group B occurs (Galα1-3Gal-), without the α2-linked Fuc residue. This structure is foreign to human beings, who have anti-linear B antibodies, thus hindering xenotransplantation. The α3-Gal-transferase A3GALT that synthesizes the linear B determinant is a homolog of GTA and GTB and has been crystallized with UDP and Mn2+ (41). The invariable Glu317 was identified as the catalytic base. The crystal structure in a complex with Galβ-pnp suggests that Trp residues are critical for binding the natural substrate Galβ1-4GlcNAc (106). The disordered C-terminal region is critical for allowing the substrate to bind (42). Bacterial analogs of this α3-Gal-transferase remain to be characterized.

SIALYLTRANSFERASES IN MAMMALS AND BACTERIA
Sialyltransferases are ubiquitous in eukaryotes and are also expressed in certain bacteria (107). These enzymes synthesize sialic acid linkages commonly found on the non-reducing termini of N- and O-glycans, and gangliosides as sialylα2-3Galβ1- or sialylα2-6Gal(NAc)-linkages. In addition, sialylα2-8 linkages are found, especially in polysialic acids (PSA), which are extremely large, linear polymers, expressed in a cell type specific, restricted fashion in embryonic, neuronal, and other selected cell types (108). Sialic acids contribute to the acidity and hydration of a glycoprotein, the metal ion binding, and epitope exposure. While sialic acid can mask the underlying epitope, certain lectins of the immune system (e.g., siglecs) directly recognize sialic acid in specific linkages. Metastatic cancer cells and leukemia cells are often hypersialylated, which reduces further processing of glycans and causes glycan chains to be shorter (15). Sialylation significantly affects the adhesive properties of cells and has also been implicated in the functions of cell surface receptors (109).
Sialyltransferases are inverting GTs that usually lack a DxD motif and do not require divalent metal ions. Thus, general acids and bases identified in the crystal structures of α3- and α6-sialyltransferases that may interact closely with the substrates include His residues (28,33). All known eukaryotic sialyltransferases have been classified as inverting GT29 with a GT-A fold, having at least four sialylmotifs (Table 5), a large (L), small (S), very small (VS), and motif III (65). The L motif contains the donor binding site while the S motif also binds the acceptor.
Bacterial sialyltransferases are inverting enzymes that bind CMP-sialic acid donor substrate and acceptors terminating in Gal or sialic acid but do not have these sialylmotifs and do not belong to the GT29 family. Instead, they are classified as GT42 (with a GT-A fold), GT52 or GT80 (with a GT-B fold), and GT38 (polysialyltransferases, PSTs). The 6-sialyltransferases (ST6GalNAc) acting on O-glycans do not appear to have a bacterial counterpart. Two highly conserved short motifs have been identified in bacterial PST and other bacterial sialyltransferases (GT52 and GT80), a C-terminally located HP sequence and a more N-terminally located D/E-D/E-G sequence (66). Certain bacterial sialyltransferases have multiple activities, including CMP-sialic acid hydrolase, trans-sialidase, and neuraminidase activities, and are usually from the GT80 family. Thus, sialyltransferases can be promiscuous with respect to the linkages they form (or cleave) and the acceptor substrates they recognize. Bacterial sialyltransferases probably evolved separately from the eukaryotic enzymes, although their functions and mechanisms can be similar.

ALPHA3-SIALYLTRANSFERASES
In human beings, 6 α3-sialyltransferases (ST3GAL) form the sialylα2-3 linkage. The expression and activity of α3-sialyltransferase ST3GAL1 that synthesizes sialyl-T antigen are increased in breast cancer (110) and appear to promote survival of cancer cells in the blood (111). In keeping with its activity in adding a terminal structure, ST3GAL1 is localized to the medial and late Golgi compartments in human mammary cells (112). ST3GAL1 acts on glycopeptides with core 1 structure and also on Galβ1-3GalNAcα-R acceptors that have hydrophobic aglycone groups. In contrast, a bacterial equivalent, WbwA from E. coli O104, responsible for the rare occurrence of the sialyl-T antigen in E. coli, is in the GT52 family with a GT-B fold. WbwA, but not mammalian ST3GAL1, also has HP and D/E-D/E-G motifs (Table 5). The crystal structure of porcine ST3GAL1 with CMP and Galβ1-3GalNAc- acceptor substrate suggests that the essential His302 interacts with the phosphate of CMP-sialic acid. His319 is the catalytic base in motif III that is proposed to be positioned near carbon-2 of the sialic acid moiety (28). The conserved Tyr269 residue interacts with the 4-hydroxyl of GalNAc and thus determines the enzyme specificity for Galβ1-3GalNAc- over Galβ1-3GlcNAc- acceptors.
An α3-sialyltransferase of the GT80 family from Photobacterium phosphoreum (Pp) has been crystallized with CMP (29). The acceptor-binding site has a wide access, which explains why a range of disaccharides with Galα and Galβ linkages can serve as substrates. CMP binds in a cleft between the two domains of the GT-B fold. The main chain nitrogen of His317 in the HP motif is close to the nitrogen-4 of cytidine and the side chain of His317 is near the oxygen of the phosphate. This suggests a critical role of these His residues in catalysis. Another α3-sialyltransferase with a GT-B fold in family GT52 from Nm was crystallized (31) with the donor analog CMP-3F-Neu5Ac. Asp258 could be a general base and His280 (within the HP motif) a general acid.

ALPHA6-SIALYLTRANSFERASES
Human α6-sialyltransferases add sialic acid to the Gal termini of N-acetyllactosamine chains of N-glycans. ST6GAL1 is highly expressed in colon cancer and metastatic cells (113) and also resides in the trans-Golgi (114). A homolog with 48% sequence identity (ST6GAL2) is mainly expressed in the brain and has the additional ability to synthesize sialylα2-6GalNAcβ1-4GlcNAc- structures (115). Human ST6GAL1 is a glycoprotein stabilized by three disulfide bonds (33). The catalytic residue, His370, deprotonates the 6-hydroxyl of Gal, generating an active nucleophile that attacks the carbon-2 of sialic acid. The reaction follows a random-order mechanism of substrate binding. Rat ST6GAL has three disulfide bonds and two N-glycans (32). Like many GTs, the enzyme has a disordered loop, and His367 is the catalytic base within the sialyl motif VS.
The bifunctional bacterial α6-sialyltransferase PM0188 from Pasteurella multocida (Pm) of GT family 80 has 14.6% sequence identity to human ST6GAL1. The crystal structure showed the GT-B fold and that Asp141, His311 (within the HP motif), Glu338, Ser355, and Ser356 were important for catalysis (37). The Photobacterium sp. (Psp) α6-sialyltransferase was also crystallized with CMP and lactose (35). The enzyme is in the GT80 family with a GT-B fold and has three domains, with the donor and acceptor bound between domains 2 and 3. Asp232 (within the D/E-D/E-G motif) is near the 6-hydroxyl of Gal while the nitrogen of His405 (within the HP motif) is close to the phosphate-oxygen. Thus, Asp232 could be a catalytic base that deprotonates the 6-hydroxyl of Gal, and His405 could be a catalytic acid that protonates the donor substrate.

MULTIFUNCTIONAL SIALYLTRANSFERASES
In bacteria, mimics of human sialylα2-3/6/8Galβ1-structures occur, e.g., in the lipooligosaccharides of Gram-negative bacteria such as Cj (116,117). Cells of the human nervous system are rich in gangliosides as well as glycoproteins containing similar sialyl-linkages. Thus, after bacterial infections, cross reactivity of antibodies could cause the rare development of neurological disorders. Guillain-Barré syndrome is an example (118). Cj expresses a bifunctional α3/8-sialyltransferase CstII and an α3-sialyltransferase CstI, which are responsible for the molecular mimicry of Cj in its lipooligosaccharide structures. Both enzymes have a predicted GT-A fold within the GT42 family. The structure of CstI shows (30) that His202 is the catalytic base. Similarly, His188 is likely the catalytic base in CstII that deprotonates the 3-hydroxyl of Gal, which then attacks carbon-2 of sialic acid of the donor (119). The flexible lid in the CstII protein becomes ordered in a closed form when CMP binds (36). The acceptors lactose (for α3-sialyltransferase activity) or sialyl-lactose (for α8-sialyltransferase activity) bind in a cleft, and Arg129, Asn51, and Tyr81 contribute to the binding of the sialylated acceptor. The role of His188 as a catalytic base in CstII has also been confirmed by NMR studies (120). The intrinsic pKa values of His188 were measured in monomeric mutants by determining the pH-dependent chemical shifts of [13C]-labeled His188.
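The way a pKa is extracted from pH-dependent chemical shifts can be illustrated with a minimal sketch. It assumes fast exchange between the protonated and deprotonated forms and a single titrating group, so that the observed shift follows the Henderson-Hasselbalch relation. This is a generic illustration, not the fitting procedure of the cited study (120); the function names and all numerical values are hypothetical.

```python
def observed_shift(ph, pka, delta_acid, delta_base):
    """Population-weighted chemical shift (ppm) of one titrating group in fast exchange."""
    ratio = 10.0 ** (ph - pka)  # [deprotonated] / [protonated]
    return (delta_acid + delta_base * ratio) / (1.0 + ratio)


def fit_pka(ph_values, shifts, pka_min=4.0, pka_max=10.0, step=0.01):
    """Grid search over pKa; for each trial pKa the two plateau shifts
    are obtained by linear least squares (2x2 normal equations)."""
    best = None
    n_steps = int(round((pka_max - pka_min) / step))
    for i in range(n_steps + 1):
        pka = pka_min + i * step
        # Fraction deprotonated at each pH for this trial pKa
        f = [1.0 / (1.0 + 10.0 ** (pka - ph)) for ph in ph_values]
        a11 = sum((1.0 - x) ** 2 for x in f)
        a12 = sum((1.0 - x) * x for x in f)
        a22 = sum(x * x for x in f)
        b1 = sum((1.0 - x) * y for x, y in zip(f, shifts))
        b2 = sum(x * y for x, y in zip(f, shifts))
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            continue  # plateau shifts not determinable for this trial pKa
        d_acid = (b1 * a22 - b2 * a12) / det
        d_base = (a11 * b2 - a12 * b1) / det
        sse = sum((d_acid * (1.0 - x) + d_base * x - y) ** 2
                  for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, pka, d_acid, d_base)
    if best is None:
        raise ValueError("no trial pKa gave a solvable fit")
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Synthetic, noise-free titration with hypothetical shift values (ppm)
    ph_points = [5.0 + 0.5 * i for i in range(8)]
    data = [observed_shift(p, 6.5, 136.0, 138.5) for p in ph_points]
    print(fit_pka(ph_points, data))
```

A grid search is used here only to keep the sketch free of external dependencies; with real, noisy data a nonlinear least-squares routine would normally be preferred.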
The monofunctional sialyltransferases function with mechanisms similar to those of the mammalian enzymes. Multifunctional enzymes, however, are primarily found in bacteria and include the α3-sialyltransferase PmST1 from Pm, which binds CMP-sialic acid as donor and lactose, Gal, GalNAc as well as sialic acid as acceptor. The crystal structure shows that binding of the CMP-sialic acid donor substrate causes a change in conformation and opens the acceptor-binding site. The activities of PmST1 function optimally at different pH values. It has a GT-B fold within the GT80 family (Table 3). The crystal structure shows that Asp141 is the catalytic base (34), with His112 also being important for enzyme activity. Another multifunctional α3-sialyltransferase, PdST from Pasteurella dagmatis (Pd), in the GT80 family with a GT-B fold, is also a CMP-sialic acid hydrolase. At low pH, it can act as a trans-sialidase and a sialidase (121).

POLYSIALYLATION
Important sialic acid structures are the PSA, found in human neuronal and other selected cell types (107). Only a selected number of proteins carry the PSA modification (122). For example, polysialylated neural cell adhesion molecule N-CAM is prominent in the developing nervous system but also occurs in leukocytes with roles in the regulation of cell adhesion. N-CAM becomes anti-adhesive when long polymers of α2-8-linked sialic acids are covalently attached to its N-glycans (108). The sialylα2-8 linkages of PSA are synthesized based on sialylα2-3/6Gal residues of N-glycans by developmentally regulated PSTs, which are highly expressed in the developing and embryonic brain (123). Neuropilin-2 (NRP2) is a glycoprotein containing multiple N-glycosylation sites, as well as O-glycans with sialylated core 1 and 2 structures. In cells lacking core 2, human PST (ST8SiaIV) was shown to assemble PSA on sialylated core 1 chains of neuropilin (124). The presence of these PSA polymers extends the half-life of proteins.
E. coli and Nm are examples of bacteria that carry sialylα2-8 polymeric PSA capsules, which help bacteria to resist phagocytosis. These PSA capsules mimic the eukaryotic chains, although they are linked to the membrane via a lipid anchor, and may have bacteria-specific modifications such as O-acetylation (125). The large, charged and hydrated polymeric enzyme product is assembled in the cytoplasmic compartment and then extruded through the membranes by ABC transporter and export systems (Figure 3). PSA confers a selective advantage to bacteria in the human nervous system and is associated with meningitis or other neurological conditions. Bacteria may also have PSA with α2-9 linkages or alternating α2-8 and α2-9 linkages.
In mammals, PSTs synthesize PSA by the addition of individual sialic acid residues in a processive fashion. Like the other sialyltransferases, mammalian PSTs are inverting enzymes of the GT29 family (Table 3). In addition to four sialyl motifs, ST8SIAII and ST8SIAIV (formerly STX and PST, respectively) have a unique, conserved, polybasic PST domain (PSTD motif) (Table 5) (67), which is absent from the other types of sialyltransferases. Basic residues in the PSTD motif are responsible for acceptor substrate recognition (126,127).
In bacteria, the PSA capsule is synthesized by GT38 family PSTs, which are inverting enzymes. In E. coli, PST has only 5.4% sequence identity with the human enzyme (128). The human PST equivalent from Nm has <10% sequence identity with human PST and has a requirement for Mg2+ (129). Kinetic experiments with His and Pro mutants of the PST from Nm suggested that the HP motif contributes to CMP-sialic acid binding but not to acceptor binding. The acceptor can be a glycolipid containing two sialic acid residues as a primer. Gal-terminating glycans of glycopeptides, including the T antigen linked to Ser, also served as acceptor substrates for the PST from Nm. Different PSTs synthesize either the α2-8 or α2-9 linked polymers of bacterial PSA capsules.

METHODS TO STUDY GLYCOSYLTRANSFERASE PROTEIN STRUCTURES AND FUNCTIONS
It is often difficult to produce sufficient pure enzyme to analyze protein structure by X-ray crystallography. In addition, the protein may not show exactly the same properties in a crystal as in solution or body fluids. To approximate the protein structure present in the natural environment, protein NMR studies have been helpful (130). Enzyme substrate or inhibitor interactions have been determined by biochemical kinetics studies but can also be studied by MS and Saturation Transfer Difference (STD) NMR (131,132). The conformational dynamics of proteins that underlie molecular recognition can be explored by molecular dynamics simulation and docking programs, which require knowledge of protein structure. Theoretical modeling has been undertaken to predict protein structure, substrate binding, and dynamic properties of GTs. Thus, the three-dimensional interactions between substrates and enzyme protein, cofactor binding sites, ligand flexibility, and movements can be estimated by computational methods (133,134). Multivariate data analysis of the amino acid property patterns also helps to predict a protein fold (135).
New enzymes can be designed based on knowledge of protein structure and substrate binding. For example, the blood group B Gal-transferase GTB has been re-designed, using a model termed the Epimer Propensity Index (EPI), to transfer Glc instead of Gal (136). The orientation of the sugar donor in the folded enzyme is highly conserved. The R228K mutant of β4GalT1 has higher Glc-transferase activity due to the inability to effectively bind the axial 4-hydroxyl of Gal (23). Similarly, GTB modeling correctly predicted a higher Glc-transferase activity of GTB in the presence of the unnatural UDP-Glc donor when Ser185 was replaced by the larger residues Asn or Cys (136).
One approach to developing good GT inhibitors is to obtain qualitative and quantitative information on the substrate binding sites from NMR spectroscopy. STD NMR measures the signals of the unbound substrate, which are then compared to those of the bound substrate. Saturation transferred from the enzyme to specific sites of the bound substrate is seen as an attenuation of resonance signals. The difference spectra at different ligand concentrations make it possible to identify the bound substrate and to determine the binding affinity. In spin-lock filtering experiments, transverse relaxation of substrate signals is recorded, which is enhanced when the ligand is bound. Thus, signals are attenuated upon ligand binding. This process can be enhanced by using spin labels. The conformations and relative placements of bound GnT V substrates have been determined using transferred NOE and STD measurements (137).
In addition to these NMR experiments, surface plasmon resonance (Biacore) experiments can be used to determine the binding affinities (132). The ligand binding ability of GTs with and without donor or a number of potential inhibitors can be assessed with biotinylated substrates bound to a streptavidin-coated chip. The binding of donor and acceptor analogs to the blood group B enzyme GTB has also been analyzed by isothermal titration calorimetry (ITC) combined with STD NMR titration (138). The results show the binding stoichiometry and binding affinity of one donor and acceptor molecule per protein, the thermodynamics, enthalpy, and entropy changes upon binding as well as the dissociation constants. The study also emphasizes that there can be differences in the binding of substrate analogs that should be considered.
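The relationship between the dissociation constant, enthalpy, and entropy reported by such titration experiments follows from standard thermodynamics: ΔG = RT ln(K_D), with K_D expressed relative to a 1 M standard state, and ΔG = ΔH − TΔS. The sketch below shows the arithmetic only; the function name and the input values are hypothetical and are not taken from the cited GTB study (138).

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def binding_thermodynamics(kd_molar, delta_h_kj_per_mol, temperature_k=298.15):
    """Return (delta_G, minus_T_delta_S) in kJ/mol.

    kd_molar: dissociation constant in mol/L (1 M standard state assumed)
    delta_h_kj_per_mol: binding enthalpy, e.g. as measured by ITC
    """
    if kd_molar <= 0:
        raise ValueError("K_D must be positive")
    # Free energy of binding: delta_G = RT ln(K_D)
    delta_g = R_GAS * temperature_k * math.log(kd_molar) / 1000.0
    # Entropic term from delta_G = delta_H - T*delta_S
    minus_t_delta_s = delta_g - delta_h_kj_per_mol
    return delta_g, minus_t_delta_s


if __name__ == "__main__":
    # Hypothetical ligand: K_D = 100 µM, delta_H = -30 kJ/mol
    dg, mts = binding_thermodynamics(1e-4, -30.0)
    print(f"delta_G = {dg:.2f} kJ/mol, -T*delta_S = {mts:.2f} kJ/mol")
```

A positive −TΔS term, as in this hypothetical case, would indicate binding that is driven by enthalpy and opposed by entropy.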
Electrospray-mass spectrometry (ES-MS) has been used to determine the thermodynamics and affinities of substrates (139). Association constants were measured from the relative abundance of ions in the ES-MS spectra for GTA and GTB in aqueous solution with native donor and acceptor substrates as well as substrate analogs, products, and metal ion cofactor. To confirm the retaining mechanism of the enzymes, a mutant of blood group A (GTA) GalNAc-transferase in solution containing UDP-GalNAc and Mn2+ was studied, as well as the similar mutant of GTB. The catalytic Glu303 was replaced with Cys. After trypsin digestion, a covalent intermediate could be trapped. Thus, tandem MS using collision-induced dissociation confirmed that Cys303 in both GTA and GTB enzymes was responsible for forming the glycosyl-enzyme intermediate. The formation of trisaccharide product can also be demonstrated by MS (68). Thus, the double displacement mechanism of these retaining GTs was supported by MS.
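How an association constant can be obtained from relative ion abundances is sketched below for the simplest case. The sketch assumes 1:1 binding and equal ionization and detection efficiencies for free and ligand-bound protein, so that the abundance ratio equals the solution concentration ratio [PL]/[P]; the remaining terms follow from mass balance. The function name and the numbers are hypothetical, and the cited work (139) may apply additional corrections.

```python
def association_constant(ab_complex, ab_free, protein_total, ligand_total):
    """Estimate K_a (M^-1) for P + L <-> PL from ES-MS ion abundances.

    ab_complex, ab_free: summed ion abundances of PL and P (arbitrary units)
    protein_total, ligand_total: initial concentrations in mol/L
    """
    if ab_free <= 0:
        raise ValueError("free protein abundance must be positive")
    ratio = ab_complex / ab_free                    # R = [PL]/[P] (equal response assumed)
    bound = protein_total * ratio / (1.0 + ratio)   # [PL] from protein mass balance
    free_ligand = ligand_total - bound              # [L] from ligand mass balance
    if free_ligand <= 0:
        raise ValueError("abundances inconsistent with the stated concentrations")
    return ratio / free_ligand                      # K_a = [PL] / ([P][L]) = R / [L]


if __name__ == "__main__":
    # Hypothetical experiment: 10 µM protein, 50 µM ligand, equal peak abundances
    ka = association_constant(1.0, 1.0, 10e-6, 50e-6)
    print(f"K_a = {ka:.3e} M^-1")
```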

GLYCOSYLTRANSFERASE INHIBITORS
Detailed knowledge of GT structure and function is the basis for the development of effective GT inhibitors that may re-direct glycan biosynthesis. GT substrates usually bind through a small number of essential hydrogen bonds or hydrophobic interactions. Thus, not all of the substituents of the donor sugar, the base of the nucleotide, or the sugars of the acceptor are critically involved in binding (140). Therefore, modifications of these residues can result in competitive inhibitors that still bind in the catalytic site but do not support catalysis. Inhibitors can be ligands that bind well to the enzyme but cannot be released easily, or interfere with catalysis either as donor substrate analogs, acceptor substrate analogs, transition state analogs, compounds that prevent conformational changes necessary for catalysis, or compounds that distort protein conformation (74,77,103). Small structural modifications of compounds can have a dramatic effect on their inhibitory activity. Inhibitors have been designed that interfere with conformational changes and flexible loop movements that are essential events for substrate binding and catalysis (141). Sugar donor analogs for ABO transferases (GTA and GTB), carrying a substituent at the uracil moiety, block the stacking of amino acids required for the proper folding of the internal loop. A heterocyclic compound inhibited GTB by interfering with its ability to bind metal ion, as well as donor and acceptor substrates. The compound does not appear to be structurally related to the acceptor but partly binds in the acceptor-binding site (142). A combination of crystal structure, Biacore, STD NMR, and docking experiments suggested that the inhibitor competes with binding of Fuc of the acceptor and the Mn2+ ion. Non-competitive inhibitors have also been described that potentially alter the structure of the enzyme leading to inactive proteins (77,78).
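The kinetic signature of a competitive inhibitor of the kind described above can be sketched with the standard Michaelis-Menten expression, in which the inhibitor raises the apparent Km by the factor (1 + [I]/Ki) and leaves Vmax unchanged. This is a textbook single-substrate simplification; GTs use two substrates, so it applies only when the second substrate is held at a fixed concentration. All names and values are hypothetical.

```python
def rate_competitive(substrate, inhibitor, vmax, km, ki):
    """Initial rate with a competitive inhibitor (all concentrations in the same units)."""
    km_apparent = km * (1.0 + inhibitor / ki)  # inhibitor raises apparent Km only
    return vmax * substrate / (km_apparent + substrate)


def ic50_competitive(substrate, km, ki):
    """Inhibitor concentration that halves the rate at a given substrate level
    (Cheng-Prusoff relation for competitive inhibition)."""
    return ki * (1.0 + substrate / km)


if __name__ == "__main__":
    # Hypothetical enzyme: Vmax = 1.0, Km = 10 µM, Ki = 5 µM, substrate at 10 µM
    for inhibitor in (0.0, 5.0, 10.0, 50.0):
        v = rate_competitive(10.0, inhibitor, vmax=1.0, km=10.0, ki=5.0)
        print(f"[I] = {inhibitor:5.1f} µM  v = {v:.3f}")
    print("IC50 =", ic50_competitive(10.0, km=10.0, ki=5.0), "µM")
```

Because only the apparent Km changes, the inhibition is overcome at high substrate concentration, which is one way competitive inhibition is distinguished experimentally from the non-competitive type.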
Modified nucleotide sugars are often recognized by GTs leading to transfer of unnatural sugars (143). Fluorescent groups modifying the base of the sugar-nucleotide can be useful as indicators of binding (144).

CHEMOENZYMATIC SYNTHESIS OF SHARED EPITOPES
The preparation of bacterial GTs that lack a transmembrane domain is relatively inexpensive, and they can be used in chemoenzymatic synthesis not only of bacterial glycoconjugates but also of mammalian oligosaccharides and glycoproteins with specific epitopes (Table 1). Due to the variety of bacterial enzymes with different specificities, a diverse range of glycan structures can be synthesized and processed for use as vaccines, to prepare antibodies for passive immunity, and for further studies of glycan functions. Examples include the synthesis of the complete blood group Forssman antigen GalNAcα3GalNAcβ3Galα4Galβ4Glc-p-nitrophenyl by β3-GalNAc-transferase and α4-Gal-transferase from Cj, followed by α3-GalNAc-transferase from Pm (81,140). The assembly of the entire blood group B determinant was achieved using GTs from E. coli O86 (145). Bacterial enzymes α4-Gal-transferase LgtC, β3-Gal(NAc)-transferase LgtD, and α2-Fuc-transferase WbsJ (146) efficiently synthesized the tumor-associated epitope Globo-H-hexasaccharide (Fucα2-Galβ1-3GalNAcβ1-3Galα1-4Galβ1-4Glcβ-benzyl).
Knowing the amino acids and mechanisms involved in substrate binding and catalysis, bacterial enzymes or new mutant enzymes can be engineered for use in the production of new natural or unnatural glycan structures, or for more efficient synthesis of known structures (147). For example, new donor specificity can be engineered by mutating only one or two critical amino acids that convert the function of the enzyme (140,148).
Phosphorylases can also reversibly form glycosidic linkages (149). They can have similarity to either inverting or retaining glycohydrolases or to GT-B-folded retaining GT (CAZy). Sugar-1-P can be used as a substrate for phosphorylases to produce a wealth of different glycans with regio-selectivity. An interesting combination of chemical and enzymatic synthesis of the T antigen and the Galβ1-3GlcNAc linkage has been achieved using a combination of galactokinase (GalK) from E. coli that synthesizes Gal-1-P, and a Galβ1-3 HexNAc phosphorylase from Bifidobacterium infantis that has promiscuous acceptor specificity (150). These enzymes could add Gal in the presence of ATP to synthetic GalNAc- and GlcNAc-substrates with various aglycone groups. The phosphorylase has multiple DxD motifs and an Asp-rich domain at the C terminus. The T antigen was also synthesized from sucrose and GlcNAc using phosphorylase from Bifidobacterium longum (151) together with sucrose phosphorylase, UDP-Glc-hexose-1-P uridyltransferase and UDP-Glc 4-epimerase.
Glycosidases catalyze reversible reactions and have also been used to form sugar linkages using high concentrations of reactants. Glycosidases act with an inverting or retaining mechanism, utilizing a catalytically active nucleophile in the active site such as Asp or Glu. Mutant glycohydrolases that lack the catalytic base as well as hydrolase activity can be used to efficiently transfer a sugar to an acceptor substrate and synthesize specific linkages (glycosynthases). For example, large N-glycan-type oligosaccharides can be transferred to the GlcNAc residue linked to Asn of glycoproteins by a mutant endo-glucosaminidase that normally cleaves the chitobiose and releases the N-glycan. Thus, engineered glycosidases can be stereo-selective and very useful in achieving high yields of complex glycans (152).

CONCLUDING REMARKS
It is astounding that proteins can be so different in amino acid sequence and yet become similarly specific and effective catalysts for the transfer of sugars to proteins, lipids, and sugars, and have only two major protein folds. There are many possibilities for the binding of donor and acceptor substrates, but the transfer only involves inversion or retention of the anomeric configuration of the sugar. Mechanisms common to eukaryotes and bacteria include a change in protein conformation upon nucleotide sugar binding that facilitates acceptor binding, and the action of a base (Glu, Asp, His) that deprotonates the hydroxyl to be glycosylated, which then becomes a nucleophile that results in cleavage of the sugar from the donor substrate. Bacterial and mammalian enzymes are often comparable in their action so that mammalian epitopes can easily be synthesized with bacterial enzymes, for example, to produce vaccines for cancer. However, the bacterial world is much more complex, variable, and challenging. Knowledge of bacterial GTs can lead to the synthesis of glycans, enzyme substrates, and antigens to study their biological functions and role in disease and to synthesize vaccines against specific pathogenic strains of bacteria. Bacteria may have evolved to express the GTs that make human-like structures, giving them a selective advantage. Most of the time, bacteria and human beings are symbiotic or compatible but once in a while, the mimicry of bacteria can lead to infection and serious consequences. We speculate that bacterial and mammalian enzymes with similar functions may have evolved in parallel, or may be derived from an ancient common ancestor. There may have been exchange of genes between these species (horizontal gene transfer), or GTs may be derived by convergent evolution. The many similar genes of a particular family may have been derived by gene duplication from an ancestral gene.
Further detailed understanding of GT structures and mechanisms helps to visualize how amino acids cooperate in forming a catalytic site, to predict their functions, and to gain valuable insight into the syntheses of complex glycans in mammals and in our close neighbors, bacteria. Both the bacterial world and human beings can benefit from this relationship. In addition, inhibitors of bacterial GTs may help to eliminate virulence factors, and this is an urgently needed goal in light of growing antibiotic resistance.