Genomic Analysis of the Human Gut Microbiome Suggests Novel Enzymes Involved in Quinone Biosynthesis

Ubiquinone and menaquinone are membrane lipid-soluble carriers of electrons that are essential for cellular respiration. Eukaryotic cells can synthesize ubiquinone but not menaquinone, whereas prokaryotes can synthesize both quinones. So far, most of the human gut microbiome (HGM) studies have been based on metagenomic analysis. Here, we applied an analysis of individual HGM genomes to the identification of ubiquinone and menaquinone biosynthetic pathways. In our opinion, the shift from metagenomics to analysis of individual genomes is a pivotal milestone in investigation of bacterial communities, including the HGM. The key results of this study are as follows. (i) The distribution of the canonical pathways in the HGM genomes was consistent with previous reports and with the distribution of the quinone-dependent reductases for electron acceptors. (ii) The comparative genomics analysis identified four alternative forms of the previously known enzymes for quinone biosynthesis. (iii) Genes for the previously unknown part of the futalosine pathway were identified, and the corresponding biochemical reactions were proposed. We discuss the remaining gaps in the menaquinone and ubiquinone pathways in some of the microbes, which indicate the existence of further alternate genes or routes. Together, these findings provide further insight into the biosynthesis of quinones in bacteria and the physiology of the HGM.


INTRODUCTION
Quinones are membrane lipid-soluble carriers of electrons that are essential for cellular respiration (Collins and Jones, 1981). Of the numerous types of quinones used for respiration, the three that are the most studied and most widespread among microorganisms are ubiquinone (UQ, coenzyme Q), menaquinone (MK, vitamin K), and 2-demethylmenaquinone (DMK) (Collins and Jones, 1981;Nowicka and Kruk, 2010). In bacteria, only one pathway has been described for UQ biosynthesis (Meganathan, 2001b), whereas two different pathways are known for MK synthesis (Figure 1). The first "traditional" pathway includes DMK as an immediate precursor of MK, and both MK and DMK are synthesized via this pathway (Bentley and Meganathan, 1982;Meganathan, 2001a). In the second pathway, which was more recently discovered, MK is synthesized through futalosine (Hiratsuka et al., 2008). The final steps of this pathway remain unclear, and no information is available about the synthesis of DMK by this pathway (Arakawa et al., 2011;Barta et al., 2014; FIGURE 1 | Pathways for UQ, MK, and DMK biosynthesis in the analyzed genomes. The names of relevant enzymes are shown for each reaction. Circled numbers show the numbers of genomes in which genes for the corresponding enzyme were found. Solid arrows correspond to previously known enzymes, and dashed arrows together with italicized enzyme names correspond to enzymes predicted in the current work. Because the products of the reactions catalyzed by MenL and MenP are unknown, two possible variants of the pathway are shown. Abbreviations: Met, L-methionine; SAH, S-Adenosyl-L-homocysteine; SAM, S-adenosyl-L-methionine. Zhi et al., 2014). UQ can be synthesized by Alpha-, Beta-, and Gammaproteobacteria and by Eukaryotes, whereas MK/DMK can be synthesized only by various groups of Bacteria and Archaea (Collins and Jones, 1981;Bentley and Meganathan, 1982;Meganathan, 2001a;Nowicka and Kruk, 2010;Zhi et al., 2014). All three pathways begin with chorismate and have no shared enzymes except for the UbiE/MenG methyltransferase, which is involved in both the UQ and the "traditional" MK/DMK pathways (Lee et al., 1997; Figure 1).
Because of the inability of humans to synthesize MK, this compound should be consumed in food. MK is found in meat, dairy, and fermented food products (Walther et al., 2013). MK can also be obtained by the interconversion of a dietary derived phylloquinone (PK). PK is synthetized by plants and Cyanobacteria and differs from MK only by its side chain. While MK has a polyprenyl side chain, PK has a phytyl side chain (Nowicka and Kruk, 2010). The extent of PK-derived MK has been estimated to range from 5 to 25% of the digested PK (Shearer and Newman, 2008;Shearer et al., 2012). The dietary requirements of vitamin K (i.e., PK and MK) range from 0.86 to 3.15 µg per day for infants of 0-6 months (Canfield et al., 1990(Canfield et al., , 1991Greer et al., 1991;Shearer and Newman, 2008) and from 0.75 to 1.0 µg per kg of body mass per day for adults (Frick et al., 1967;Booth and Al Rajabi, 2008;Shearer et al., 2012).
The classic role of MK in humans is as an enzyme cofactor for the γ-carboxylation of peptide-bound glutamate residues, and evidence has recently been mounting that there is a correlation between human health and dietary MK Newman, 2008, 2014;Booth, 2009;Van Winckel et al., 2009;Walther et al., 2013).
In addition to dietary sources, a portion of the total available MK is synthesized by the human gut microbiome (HGM); however, the extent of MK derived from the HGM has not been determined (Ramotar et al., 1984;Conly and Stein, 1992;Suttie, 1995;LeBlanc et al., 2013;Ramakrishna, 2013). Various HGM communities have been intensively studied in recent years (Eckburg et al., 2005;Gill et al., 2006;Sonnenburg et al., 2006;Kinross et al., 2011;Flint et al., 2012;Lozupone et al., 2012;Leimena et al., 2013;Maurice et al., 2013), but most of these studies have concentrated on the analysis of metagenomic data. Metagenomic analysis is a powerful tool for the determination of HGM composition in healthy and diseased states (Cowan et al., 2005;Kinross et al., 2011;Simon and Daniel, 2011;Cho and Blaser, 2012;Gosalbes et al., 2012;Kelly and Mulder, 2012;Walker et al., 2014) as well as HGM variability related to age (Clemente et al., 2012;Yatsunenko et al., 2012), diet (Kurokawa et al., 2007;Hehemann et al., 2010;Wu et al., 2011), geography (Yatsunenko et al., 2012;Tyakht et al., 2013;Suzuki and Worobey, 2014), host genetics, and lifestyle (Yatsunenko et al., 2012). However, analysis of the completely or partially assembled genomes of representative HGM samples can also yield additional knowledge about the cellular physiology of individual bacterial strains and information about interactions between different organisms. Thus, a shift from metagenomic analysis to the analysis of individual genome sequences may become important in the investigation of the HGM and its interactions with the human organism. To date, there are 382 HGM genomes available through the National Institutes of Health (NIH) Human Microbiome Project (http://www.hmpdacc.org/HMRGD/), providing an excellent opportunity for gaining a further understanding of HGM biology.

MATERIALS AND METHODS
From the list of human gut microbes found in at least 50% of the analyzed HGMs (Nelson et al., 2010;Qin et al., 2010), we selected 250 genomes of human intestinal inhabitants that were available in the PubSEED (Overbeek et al., 2005;Disz et al., 2010) and Integrated Microbial Genomes (IMG) databases (Markowitz et al., 2014). The following four model genomes were added to the reference set: an inhabitant of the lower gut, Escherichia coli K-12 MG1655 (Blattner et al., 1997); an intestinal inflammationcausing agent, Salmonella enterica Typhimurium LT2 (Winter et al., 2010); a model organism for the analysis of carbohydrate metabolism in the intestine, Bacteroides thetaiotaomicron VPI-5482 (Hooper et al., 2002;Xu et al., 2003); and a model organism related to multiple gut strains, Bacillus subtilis 168 (Hong et al., 2009). All 254 of the selected genomes are presented in Table S1 in Supplementary Materials. The added model organisms belong to the three main phyla represented in HGM (Nelson et al., 2010;Qin et al., 2010), Bacteroidetes (B. thetaiotaomicron), Firmicutes (B. subtilis), and Proteobacteria (E. coli and S. enterica). We included of the model organisms to expand known metabolic functions in these organisms to the HGM genomes as well as to use their predictions as mean to quality control the predictions for the HGM genomes. For example, usage of genomes for the model organisms can be used to avoid wrong predictions for the functions of novel genes by verifying their absence in the model genomes. Note that three of the four model organisms (B. thetaiotaomicron, E. coli, and S. enterica) have been also detected in HGM, but in less than 50% of the studied samples (Qin et al., 2010).
The PubSEED platform was used to annotate the genes for quinone biosynthesis proteins. To avoid misannotations, all of the proteins with the same function were checked for orthology. Orthologs were defined as the best bidirectional hits that have a similar genomic context. To search for the best bidirectional hits, a BLAST algorithm (Altschul et al., 1997) implemented in PubSEED and the IMG platform was used (cutoff = e −20 ). Additionally, in the search for orthologs, the GenomeExplorer program package (Mironov et al., 2000) was used, and orthologs were determined as the best bidirectional hits with protein identity no less than 20%. To analyze genomic context and gene occurrence, we used PubSEED and STRING v9.1 (Franceschini et al., 2013) along with phylogenetic trees for protein domains in MicrobesOnline (Dehal et al., 2010). To analyze protein domain structure, we used searches in the Pfam (Finn et al., 2014) and CDD (Marchler- Bauer et al., 2013) databases, and the "Domains and Families" service implemented in the MicrobesOnline platform. Additionally, functional annotations of the analyzed genes were performed using the UniProt (Magrane and Consortium, 2011) and KEGG (Kanehisa et al., 2012) databases.
Multiple protein alignments were performed using the ClustalX v 2.0 tool (protein weight matrix: BLOSUM series; gap opening: 15; gap extension: 0.5) (Larkin et al., 2007). Phylogenetic trees were constructed using the maximumlikelihood method with the default parameters implemented in PhyML-3.0 (Guindon et al., 2010). The obtained trees were visualized and midpoint-rooted using the interactive viewer Dendroscope, version 3.2.10, build 19 (Huson et al., 2007). To clarify the taxonomic affiliations of the analyzed genomes, the NCBI Taxonomy database (http://www.ncbi.nlm.nih.gov/ taxonomy) was used. To predict substrate specificities according to the specificity-determining positions (SDP), the SDPfox web tool (Mazin et al., 2010) was used with the maximum percent of gaps allowed in a group in each column being 50%. As an input for the SDP analysis, we used multiple alignments for all UbiA/MqnP, UbiD/MqnL, and UbiX/MqnM proteins found in the analyzed genomes. No preliminary division to into specific groups was done. The specificity groups were determined based on the SDP analysis as follows. An uploaded aligned sequence set was randomly divided into groups (the minimal number of the groups = 2, the number of the sequences = 10), then "SPDgroup" procedure and the "Move sequences according to the best weight" procedure were applied.
All of the annotated genes are represented as a subsystem in PubSEED (http://pubseed.theseed.org/SubsysEditor.cgi? page=ShowSubsystem&subsystem=Quinones_biosynthesis_ HGM) and all of the protein sequences for the annotated genes in FASTA format are represented in file Sequences S1 in the Supplementary Materials.

RESULTS
Here, we present a systematic analysis of the biosynthetic pathways for UQ, MK, and DMK in 254 genomes, including 250 genomes for commonly found in the human gut (Qin et al., 2010) and four genomes for model organisms. Application of a comparative genomic analysis to individual HGM genomes identified the distribution of various pathways for quinone biosynthesis in HGM microorganisms and revealed four alternative forms of known enzymes in these pathways. Our analysis resulted in a prediction of the genes responsible for the last three steps of the futalosine biosynthesis pathway, which were previously unknown. Additionally, we compared the distribution of quinone biosynthetic pathways with the distribution of quinone-dependent reductases for electron acceptors within the HGM genomes.

Known Genes for Quinone Biosynthesis
To reconstruct the biosynthetic pathways of UQ, MK, and DMK in the 254 studied genomes, we analyzed the distribution of previously known genes involved in these pathways (Figure 1). On the basis of the presence of known genes, a pathway can be classified as complete or incomplete. A pathway was assigned as complete when known genes found in the genome could form an uninterrupted biosynthetic pathway from chorismate to UQ or MK and as incomplete when the pathway was interrupted because of the absence of certain genes. The presence of only one gene was classified as an incomplete pathway. To avoid misannotations, all genes possibly involved in quinone biosynthesis were checked for orthology with the known genes, as described in "Experimental procedures." The distribution of the genes for UQ biosynthesis (Ubi pathway) among the analyzed genomes was restricted to Gamma-and Betaproteobacteria. Complete Ubi pathways were found in 19 genomes (Table S1 in Supplementary Materials), whereas incomplete pathways were found in 4 genomes. The pathway for MK/DMK biosynthesis through O-succinylbenzoate (Men pathway) was more broadly distributed among the analyzed genomes. Complete Men pathways were found in 24 genomes of Firmicutes and Proteobacteria, whereas incomplete pathways were found in 91 genomes belonging to the phyla Actinobacteria, Bacteroidetes, Firmicutes, Fusobacteria, Proteobacteria, and Verrucomicrobia. Because the enzymes for the last steps of MK synthesis through the futalosine biosynthesis pathway (Mqn pathway) were not known (Arakawa et al., 2011;Barta et al., 2014;Zhi et al., 2014), this pathway was considered complete when all of the enzymes catalyzing the reactions for the synthesis of 1,4-dihydroxy-6-naphthoate from chorismate (Figure 1) were found in the genome. The complete pathway was found in 10 genomes from the phyla Bacteroidetes, Firmicutes, and Proteobacteria, whereas an incomplete pathway was found in 2 Firmicutes genomes.

Novel Enzyme in the Ubi Pathway
Using genomic analysis of the Ubi pathway, we predicted an alternative form of 3-polyprenyl-4-hydroxybenzoate carboxylyase (EC 4.1.1.-) (Figure 2). Previously, two alternative forms of this enzyme, UbiD and UbiX, were described (Cox et al., 1969;Alexander and Young, 1978;Gulmezian et al., 2007). Analysis of the genome of Acinetobacter junii SH205 revealed the presence of all the genes for the Ubi pathway except for ubiD and ubiX. To find a non-orthologous replacement (Koonin et al., 1996) for this enzyme, we searched for ubiD and ubiX orthologs in the genomes of organisms related to A. junii (Figure 2). These genomes belonged to two families of the order Pseudomonadales: Moraxellaceae (which includes A. junii) and Pseudomonadaceae. Orthologs of Escherichia coli K-12 MG1655 ubiD (e ≤ 9e −54 , identity ≥ 86%) and ubiX (e ≤ 3e −60 , identity ≥ 54%) were found in Pseudomonadaceae but not in the Moraxellaceae genomes. Thus, the Moraxellaceae genomes should encode an alternative 3-polyprenyl-4-hydroxybenzoate carboxy-lyase that is non-orthologous to ubiD or ubiX, and this alternative gene should be present in Moraxellaceae but not in Pseudomonadaceae. Analysis of gene occurrence by IMG revealed 29 candidates. A hypothetical protein (the locus tag in A. junii is HMPREF0026_02430), was considered to be the most probable candidate because of its co-localization on the chromosome with the ubiE and ubiB genes in the genomes of Acinetobacter spp. and Psychrobacter sp. PRwf-1. We named this gene ubiZ. The number of genomes with a complete Ubi pathway thus increased from 19 to 20.

Novel Enzymes in the Men Pathway
We predicted alternative genes for 1,4-dihydroxy-2-naphthoyl-CoA thioesterase (EC 3.1.2.28) and for (1R,6R)-2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase (EC 4.2.99.20) (Figure 2). The first enzyme is typically encoded by the menI gene (Widhalm et al., 2009;Chen et al., 2013), whereas the second enzyme is typically encoded by the menH gene (Jiang et al., 2007(Jiang et al., , 2008. No menI orthologs were found in the genomes of Eggerthella sp. 1_3_56FAA or Gordonibacter pamelaeae 7-10-1-b. A sequence similarity search using the tblastn algorithm (Altschul et al., 1997) revealed in these genomes a distantly related homolog of menI (e = 1.4, identity = 30%, the locus tag in G. pamelaeae is GPA_07290) that was annotated as a hypothetical protein. Because such an e-value and identity is not significant, we verified the presence of this protein in the analyzed genomes. The orthologs of this new gene were found only in genomes having the Men pathway but lacking menI. Consequently, we propose that this gene, called menJ, could replace the function of menI. There are no additional corroborations for this prediction, such as a genomic location with other MK synthesis genes or a sequence similarity to functionally relevant proteins. Hence, this prediction remains quite speculative.
A non-orthologous replacement for the menH gene was predicted by a detailed analysis of the menI genes in the studied genomes. This analysis revealed two types of MenI proteins. The first type has the same domain structure as a protein from E. coli and includes a single thioesterase domain (Pfam ID: PF03061). The second type of MenI has the same PF03061 domain at its C-terminus and an additional hydrolase domain (Pfam ID: PF08282) at its N-terminus ( Figure S1A in Supplementary Materials). The PF08292 domain belongs to the haloacid dehydrogenase superfamily, the members of which catalyze reactions similar to the one catalyzed by MenH (Koonin and Tatusov, 1994). In the analyzed genomes, we observed two types of orthologs containing the PF08282 domain: the PF08282-MenI fusions and proteins having a single PF08282 domain ( Figure S1B in Supplementary Materials). All the proteins that contained a PF08282 domain were present only in genomes lacking the MenH orthologs. Thus, the PF08282domain protein, called MenY, was proposed to be a nonorthologous displacement for MenH. The prediction of the menJ and menY genes increased the number of genomes with a complete Men pathway from 24 to 72.

Novel Enzyme in the Mqn Pathway
Using a genomic analysis of the Mqn pathway, we predicted an alternative form of 1,4-dihydroxy-6-naphthoate synthase (EC 1.14.-.-) and proposed an enzyme for the previously unknown steps of this pathway. In the genomes of Mitsuokella multacida DSM 20544 and Megamonas hypermegale ART12/1, no mqnD genes were found, but we detected a gene that co-localized together with the mqnC and mqnE genes (the locus tag in M. hypermegale is MHY_27640). In addition to M. multacida and M. hypermegale, this protein was found to cluster with the mqnC and mqnE genes in six genomes, including Selenomonas artemidis F0399, Selenomonas flueggei ATCC 43531, Selenomonas noxia ATCC 43541, Selenomonas sp. 67H29BP, Selenomonas sp. F0430, and Selenomonas sputigena ATCC 35185. In further four genomes, Pelotomaculum thermopropionicum SI, Syntrophomonas wolfei str. Goettingen, Syntrophothermus lipocalidus DSM 12680, and Thermosinus carboxydivorans, this protein was found to co-localize with the mqnEAC operon. Because this gene was found only in conjunction with the other genes in this pathway but not with mqnD, we proposed that this gene (mqnZ) can replace the gene function of mqnD. Unfortunately, sequence analysis of this protein did not provide additional support for its prediction as the only sequence similarity was found with the phosphorylase superfamily (Pfam ID: PF01048) (e = 1.14e −32 ).

Prediction of the Last Steps of the Mqn Pathway
Previously, it has been proposed that the final biosynthetic stages of the Mqn pathway should be catalyzed by a polyprenyltransferase, a carboxy-lyase, and a methyltransferase (Arakawa et al., 2011;Barta et al., 2014;Zhi et al., 2014), but the particular enzymes are unknown. Analysis of genomes containing genes for the Mqn pathway revealed genes that are homologous to ubiE/menG, ubiA, ubiD, and ubiX. The UbiE/MenG proteins encode S-adenosyl-L-methioninedependent methyltransferases that transfer a methyl group to the carbon atom of the aromatic ring, which is part of both UQ and MQ. These methyltransferases are involved in both the Ubi and Men pathways (Lee et al., 1997). Thus, we propose that the Mqn co-occurring homologs of UbiE/MenG could be involved in menaquinone synthesis. To test this hypothesis, we constructed a phylogenetic tree for the UbiE/MenG-like proteins co-occurring with the Ubi, Men, and Mqn pathways ( Figure S2 in Supplementary Materials). The Mqn co-occurring proteins did not form a separate monophyletic branch but were mixed with the Men co-occurring proteins. Based on this co-occurrence, we propose that the Mqn co-occurring UbiE/MenG proteins play the same role as UbiE/MenG, i.e., catalyzing the methylation of DMK to produce MK.
The UbiA protein catalyzes the attachment of a polyprenyl group to the quinone aromatic ring (Melzer and Heide, 1994). We propose that the Mqn co-occurring homologs of UbiA are involved in the Mqn pathway. On the other hand, in the Mqn pathway a polyprenyl group could be attached to 1,4-dihydroxy-6-naphthoate or its derivatives, whereas UbiA catalyzes the attachment of a polyprenyl group to 4-hydroxybenzoate. Thus, different substrate specificities of UbiA and its Mqn co-occurring homologs could correlate with distinguishable differences on the amino acid sequence level. Correlations between phylogeny and substrate specificity have been reported for various enzymes (Lee et al., 2007;Olivares-Hernández et al., 2010;Reddy et al., 2013;Ratnikov et al., 2014). Consequently, we proposed that UbiA and its Mqn co-occurring homolog form a clearly distinguished branches in their phylogenetical tree. To test this hypothesis, we constructed a phylogenetic tree ( Figure S3 in Supplementary Materials) for the three homolog groups: (1) UbiA proteins, (2) the Mqn co-occurring homologs of UbiA, and (3) MenA proteins. Each group in the tree formed a monophyletic branch. The UbiA proteins and the Mqn co-occurring homologs of UbiA (named MqnP) branches were neighboring but were clearly separated from each other. Thus, we propose that UbiA and MqnP have similar functions but have different substrate specificities. The MqnP protein is proposed to be a polyprenyltransferase in the Mqn pathway.
The UbiD and UbiX proteins are carboxy-lyases in the Ubi pathway (Cox et al., 1969;Alexander and Young, 1978;Gulmezian et al., 2007). Homologs for both of these proteins were found in the genomes encoding enzymes for the Mqn pathway. As for the UbiA homologs, we suspected that the Ubi pathway proteins and their Mqn co-occurring homologs would have different substrate specificities and would be clearly separated based on the corresponding phylogenetic trees. Construction of such trees ( Figure S4A in Supplementary Materials) confirmed this proposition. The UbiD proteins and their Mqn co-occurring homologs formed monophyletic branches that were clearly separated from each other. The same was true for the UbiX proteins and their Mqn co-occurring homologs ( Figure S4B in Supplementary Materials). Based on these results, we propose that the Mqn co-occurring homologs of UbiD and UbiX are carboxy-lyases in the Mqn pathway; we named them MqnL and MqnM, respectively.
Additionally, the presence of SDPs (Kalinina et al., 2004;Mazin et al., 2010) was analyzed for the UbiA/MqnP, UbiD/MqnL, and UbiX/MqnM proteins were found in the HGM genomes. For each of these groups of homologous, 33, 27, and 7 SDPs were determined, respectively ( Figure S5 in Supplementary Materials). After grouping by determined SDPs (see "Materials and Methods" for details), each set of homologs was divided to two groups. In all three groups of homologs, SDPs-based groups turned out to be in agreement with the division of phylogenetic trees and our predictions of functions.
Thus, the enzymes for all three steps (polyprenylation, decarboxylation, and methylation) in the transformation of 1,4-dihydroxy-6-naphthoate into MK were predicted. However, the order of these steps is not completely clear. Because the Mqn pathway associated with UbiE/MenG was indistinguishable from the Men pathway, we propose that methylation may be the final step of the Mqn pathway, as is the case for the Men pathway. For the ordering of polyprenylation and decarboxylation, we propose two alternative scenarios (Figure 1). In the first scenario, 1,4-dihydroxy-6-naphthoate is initially polyprenylated to form 3-polyprenyl-1,4-dihydroxy-6naphthoate, which then is decarboxylated to DMK. In the second scenario, 1,4-dihydroxy-6-naphthoate is initially decarboxylated to form naphthoquinone, which could then be polyprenylated to form DMK. The mqnP, mqnL, and mqnM genes were all found in each genome containing the Mqn pathway, but these genes were absent in the other genomes, supporting our predictions of their involvement in the Mqn pathway.

Co-distribution of Quinone Biosynthetic Pathways and Quinone-Dependent Reductases for Electron Acceptors
Based on the presence of the Ubi, Men, and Mqn pathways in each analyzed genome, we predicted the patterns of the synthesized quinones. The ability to synthesize UQ is directly determined by the presence of the Ubi pathway, whereas the situation for DMK and MK synthesis is more complicated. Because DMK is an intermediate in the Men and Mqn pathways, organisms with genomes having one of these pathways should be able to synthesize DMK. The ability to synthesize MK depends on the presence of the ubiE/menG genes. In the presence of the Men or Mqn pathway, having ubiE/menG in the genome determines the ability to synthesize both MK and DMK. Meanwhile, the presence of the Men or Mqn pathway without ubiE/menG in the genome results in the synthesis of DMK only. Overall, five different patterns of quinone synthesis were found (Table S2 in Supplementary Materials). A total of 19 genomes were able to synthesize UQ, MK, and DMK, 4 genomes were able to synthesize UQ only, 99 genomes were able to synthesize MK and DMK, 8 genomes were able to synthesize DMK only, and, finally, 124 genomes were not able to synthesize any of the quinones.
The distribution of quinone synthesis patterns in the HGM genomes was also compared to the distribution patterns of quinone-dependent reductases for electron acceptors (also referred to as reductases) obtained from a previous analysis of HGM genomes (Ravcheev and Thiele, 2014). Only quinoneinteracting reductases were included in our analysis. A reductase was considered as quinone-interacting if orthologs of known quinone-interacting subunits were found in chromosomal gene cluster encoding for this reductase. In total, we analyzed 17 types of reductases with the following electron acceptors: oxygen (Cyo, Qox, and Cyd), nitrate (Nar and Nap), nitrite (Nrf), tetrathionate (Ttr), thiosulfate (Phs and Tsr), sulfite (Dsr), polysulfite (Psr), trimethylamine N-oxide (Tor), dimethyl sulfoxide (Dms), selenate (Ynf), fumarate (Frd), and arsenate (Arx), as well as the Ynf reductase with unknown specificity. Overall, orthologs of 23 quinone-interacting membrane subunits were found in the analyzed genomes. For 14 of these membrane subunits, interactions with quinones have been confirmed experimentally, whereas for the 6 proteins interactions with quinones have been previously predicted based on similarity with known quinone-interacting subunits. For the remaining three proteins, TsrF, DmsH, and YdhD, interactions with quinones were predicted in this study based on sequence similarity with the experimentally validated quinone-interacting NrfD protein from E. coli (Table S3 in Supplementary Materials).
Of the 254 genomes, 225 (88.6%) genomes demonstrated good agreement between the distributions of reductases and quinones (Figure 3 and Table S2 in Supplementary Materials): 121 genomes (47.6%) encoded both reductases and quinonesynthesis pathways, whereas the remaining 104 genomes (40.9%) encoded neither reductases nor quinone-synthesis pathways. The remaining 29 genomes (11.4%) disagreed in their distributions of quinone-synthesis pathways and reductases. In 20 genomes (7.9%), reductases were identified but no quinone biosynthetic FIGURE 3 | Co-distribution of quinone biosynthetic pathways and quinone-dependent reductases for electron acceptors. The number of genomes is shown. Inconsistencies between the reductase and quinone patterns are indicated by ellipses. TMAO, trimethylamine N-oxide; DMSO, dimethyl sulfoxide. pathways were found. Finally, 9 genomes (3.5%) encoded pathways able to synthesize at least one quinone but contained no identified reductases.

DISCUSSION
In this study, we analyzed the distribution of three pathways for the biosynthesis of respiratory quinones in 254 genomes, including 250 genomes for microbes commonly found in the healthy human gut and four genomes for model organisms.
Our key results are as follows. (i) The HGM distribution of canonical pathways was consistent with previous reports and with the distribution of reductases for electron acceptors. (ii) A comparative genomics analysis identified four alternative forms of the previously known enzymes for quinone biosynthesis. (iii) Genes for a previously unknown part of the futalosine pathway were identified, and the corresponding biochemical reactions, enzymes, and genes were proposed. Furthermore, we discuss the remaining gaps in some of the genomes.
The distributions of the three studied quinone biosynthesis pathways in the HGM correspond to previous data on the distribution of these pathways (Collins and Jones, 1981;Nowicka and Kruk, 2010). For instance, the Men pathway has been previously shown to be more frequent among Prokaryotes (Zhi et al., 2014). This observation corresponds to the presence of this pathway in almost half of the analyzed genomes. In comparison, the Ubi and Mqn pathways are present in 9 and 5% of the studied genomes, respectively. Additionally, only Alpha-, Beta-, and Gammaproteobacteria synthesize UQ (Meganathan, 2001b;Cluis et al., 2012). Indeed, among the studied genomes, the UQ biosynthetic genes were not found to be absent in any of analyzed Beta-or Gammaproteobacteria (no Alphaproteobacteria genomes were analyzed in this work). Two-thirds of the studied genomes belong to genera for which experimental data on the production of respiratory quinones is available (Table S4 in Supplementary Materials). The predictions from the genomic analysis were consistent with the experimental data for all HGM genomes. A similar concordance was observed when the distributions of quinone biosynthetic pathways and of quinonedependent reductases for electron acceptors were compared (Figure 3). For instance, the genes for the UQ-interacting aerobic reductase complex CyoABCD (Abramson et al., 2000) were found only in genomes encoding the UQ biosynthetic pathway. Additionally, anaerobic reductases for tetrathionate, thiosulfate, polysulfide, sulfite, fumarate, trimethylamine N-oxide, dimethyl sulfoxide, selenate, and arsenate were found exclusively in the genomes encoding the MK/DMK biosynthetic pathways, but not in genomes having only the UQ or DMK biosynthetic pathways. A comparative genomics approach can thus be used to accurately annotate quinone biosynthetic pathways.
Comparative genomic analysis of the quinone biosynthetic pathways revealed four non-orthologous replacements for previously known enzymes. A total of four alternative enzymes were predicted: one for the Ubi pathway, two for the Men Pathway, and one for the Mqn pathway. These predictions were made possible by analysis of multiple genome sequences and by the use of multiple comparative genomics methods. For example, the prediction of MenY, a non-orthologous replacement for the previously known protein MenH, is based on the following assumptions. (1) The MenY protein belongs to the superfamily of HAD hydrolases, i.e., its general function is relevant. (2) The MenY protein forms protein fusions (multi-domain proteins) with the MenI protein. (3) The menY gene is co-located on the chromosome with other men genes in the 31 studied genomes. (4) The menY gene is present only in genomes that lack the menH gene. Each of these assumptions alone is not enough to confirm the prediction that MenY is a MenH-replacing enzyme, but collectively, they provide a sufficient basis for such a prediction. Thus, the combinatorial use of multiple methods can increase the impact of genome-based predictions. Additionally, the use of comparative genomics for the prediction of alternative enzyme forms provides a good basis for further experimental validation.
The most notable result of this work is the prediction of previously unknown stages of the futalosine pathway and of genes encoding the corresponding enzymes: mqnP, mqnL, and mqnM (Figure 1). This prediction was permitted by the application of multiple genomic techniques to large numbers of analyzed genomes. Further experimental verification is required to confirm the validity of this prediction. This re-annotation also resolves the incomplete Ubi pathway in a number of genomes outside of Alpha-, Beta-, and Gammaproteobacteria, making the annotation consistent with data on the taxonomic distribution of this pathway (Meganathan, 2001b;Cluis et al., 2012).
The detection of enzymes for previously unknown stages of the Mqn pathway increases our knowledge of the evolutionary history of the quinone biosynthetic pathways. The Mqn pathway was shown to be the primordial pathway for the MK biosynthesis, whereas the Men pathway appeared later in evolution (Zhi et al., 2014). The narrow taxonomic distribution of the Ubi pathway clearly indicates that it is the most recently evolved pathway. The results of the current work specify the details of pathway evolution. Quinone biosynthetic pathways that newly emerge during evolution may use parts of preexisting pathways. Thus, the Men pathway adopted a methyltransferase and, because the MenA is a homolog of the MqnP, possibly a polyprenyltransferase from the more ancient Mqn pathway. The Ubi pathway, the youngest of the three studied pathways, adopted three proteins from the Mqn pathway: polyprenyltransferase and two non-homologous carboxy-lyases. Additionally, the Ubi pathway adopted a methyltransferase from the Mqn or Men pathway. Thus, younger pathways contain more enzymes adopted from older pathways. This assumption is based on only the three analyzed pathways and is therefore quite speculative. Nonetheless, this assumption can lead to an interesting conclusion. For example, the alternative isoprenoid quinones, such as sulfolobusquinone, caldariellaquinone, and benzodithiophenoquinone, have a very narrow taxonomic distribution (Nowicka and Kruk, 2010;Zhi et al., 2014), and their biosynthetic pathways should be younger than the Men and Mqn pathways. The pathway for the biosynthesis of the alternative quinones would thus be expected to contain homologs of enzymes from more ancient pathways. This expectation could be used to predict such pathways. Of course, such predictions will also be quite speculative and will require experimental validation. However, if this assumption is true, it can be used for the prediction of various novel pathways beyond quinone biosynthesis.

FURTHER DIRECTIONS
Our study resulted in the functional predictions for a number of genes involved in quinone biosynthesis. Such predictions illustrate the power of comparative analysis of individual genomes. Nonetheless, some problems related to microbial quinone biosynthesis still remain unresolved. The first problem is the presence of incomplete biosynthesis pathways. For instance, incomplete Men pathways were found in 43 genomes. Three nonexclusive hypotheses might explain this pathway incompleteness: (1) the incompleteness of genome sequences, (2) inter-microbe exchange of metabolites, and (3) non-orthologous replacements. Two-thirds of the analyzed genomes have a draft status, and some genes for the quinone biosynthetic pathways may thus be absent from the current version of the genome. Completion of the sequences of current draft genomes may complete the Men pathway. For example, the draft genome sequence of Escherichia sp. 1_1_43 lacks genes for the Men pathway and lacks some genes in the Ubi pathway (Table S1 in Supplementary Materials). Nevertheless, in all the complete genomes of Escherichia spp., all the genes for these two pathways were found. Thus, we can be confident that the quinone biosynthesis genes missing from the current genome of Escherichia sp. 1_1_43 will be detected in the finished version of this genome. Thus, the first further direction is an update of the study results using a novel, complete version of previously incomplete genomes.
The availability of finished genomes for all the studied organisms will only partially resolve the problem of pathway incompleteness, as incomplete pathways were also found in a number of finished genomes (Figure 4). For example, the multiple finished genomes of Lactobacillus spp. showed incomplete Men pathways. The common feature of these incomplete pathways is that early steps of the pathway are missing whereas late steps are present, at least the steps required for the polyprenylation and methylation catalyzed by MenA and MenG, respectively (Figure 4). Because the addition of a hydrophobic polyprenyl group takes place at the penultimate step of the Men pathway, all MK precursors from isochorismate to 1,4-dihydroxy-2-naphthoate are soluble; thus, exchange of these metabolites between different microorganisms could be possible. For example, 1,4-dihydroxy-2-naphthoate could be used by microbes having only the menAG genes, whereas O-succinylbenzoate could be used by microbes having the menEBIAG genes. We hypothesize that such organisms could utilize soluble precursors of MK. Hence, the corresponding transport genes should be present in their genomes, but remain to be annotated. Two types of transporters would be required, (1) transporters for the export of soluble quinone precursors in the producing organisms and (2) transporters for the import of these precursors in the consuming organisms. Whereas nothing is known about the exchange of quinone precursors in microbial communities, inter-species exchange of metabolites has been demonstrated for the HGM. For instance, HGM organisms can exchange acetate, extracellular polysaccharides, formate, fucose, molecular hydrogen, secondary bile acids, short-chain fatty acids, sialic acid, and succinate (Stams and Plugge, 2009;De Vuyst and Leroy, 2011;Ng et al., 2013;Kovács, 2014;Vogt et al., 2015). Thus, the proposed exchange of soluble quinone precursors is very likely.
The hypothesis that MK precursors may be exchanged between microbes is tempting but remains speculative. To confirm this hypothesis, the transporters of the corresponding MK precursors must be computationally predicted and then experimentally validated. For better support of the exchange hypothesis, co-presence of the producing and consuming organisms in various HGM samples would need to be analyzed. These inter-microbe interactions should also be validated experimentally and/or tested computationally using the mathematical models (see below).
Even if the exchange hypothesis were true, the problem of incomplete pathways could not be sufficiently resolved. In some finished genomes, such as Akkermansia muciniphila or Megamonas hypermegale, the pathways lack internal steps (Figure 4). For these genomes, an important next step would be the search for non-orthologous replacements in incomplete pathways. Non-orthologous replacements are well known for quinone biosynthetic pathways. For example, in E. coli, the decarboxylation of 3-polyprenyl-4-hydroxybenzoate can be catalyzed by two non-homologous proteins, UbiD and UbiX (Cox et al., 1969;Alexander and Young, 1978;Gulmezian et al., 2007). Another example is the existence of multiple nonorthologous replacements for enzymes catalyzing the early steps of the Mqn pathway (Arakawa et al., 2011). Additionally, four non-orthologous replacements were predicted in this study. The non-orthologous replacement hypothesis is very promising because these replacements can be successfully determined even using computational methods alone. If such replacements are found, the problem of pathway incompleteness could be resolved.
The other unresolved problem is the inconsistency between the distributions of quinone biosynthetic pathways and quinonedependent reductases for electron acceptors (Figure 3). Two Frontiers in Microbiology | www.frontiersin.org main types of inconsistencies were observed: (1) absence of reductases in the presence of quinone biosynthetic pathways and (2) absence of quinone biosynthetic pathways in the presence of reductases. Of course, both of these problems may be partially resolved by the availability of finished genomes. On the other hand, such inconsistencies were also found in finished genomes (Table S1 in Supplementary Materials). The absence of reductases could be explained by the use of unknown reductases by an organism. In a previous study, we have predicted two novel membrane reductases among HGM genomes (Ravcheev and Thiele, 2014), and at least one of them, thiosulfate reductase Tsr, was predicted as quinone-dependent (Table S3 in Supplementary Materials). Further systematic analysis of respiratory enzymes in the HGM genomes could resolve this type of inconsistencies.
The absence of quinone biosynthetic pathways could be explained by the existence of alternative quinone biosynthetic pathways. For instance, the alternative Mqn pathway for MK biosynthesis has been discovered in 2008 (Hiratsuka et al., 2008). Additionally, alternative types of quinones may be used for respiration in organisms having reductases but lacking the Ubi, Men, and Mqn pathways. In this study, we limited our analysis to the biosynthesis of UQ, MK, and DMK, but the diversity of microbial respiratory quinones is much wider (Collins and Jones, 1981;Nowicka and Kruk, 2010). Nonetheless, the biosynthetic pathways of alternative quinones are poorly understood. We anticipate that increasing wealth of experimental and genomic data will substantially improve our understanding of these pathways.
In this study, the non-orthologous displacements and genes for previously unknown reactions were only computationally predicted; thus requiring experimental confirmation. For example, the predicted 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate dehydrogenase MenY could be verified using model organisms, such as Corynebacterium spp. or Bacteroides spp. Similarly, the novel enzymes in the Mqn pathway could be experimentally analyzed using Helicobacter spp.
Furthermore, the respiratory systems in the HGM organisms are not limited to reductases and quinones. For broader coverage of the respiratory systems, our study will need to be extended to the analysis of ATP synthases and dehydrogenases for electron donors. Other pathways of relevance to human health, such as the B-vitamin synthesis capability of the HGM (Magnúsdóttir et al., 2015), should be also considered. The identification of novel enzymes and pathways will lead to better understanding of the HGM biochemistry and physiology. This is a pre-requisite to understand the HGM's contribution to human health and disease as well as to rationally alter HGM. Current approaches aiming to modify the HGM, such as fecal transplants (Kelly et al., 2015) and probiotics, lack mechanistic bases, which is partially due to insufficient biochemical knowledge.
The quinone biosynthesis needs also to be analyzed in the context human-microbe interactions. An inability of human cells to synthetize MK, together with the importance of this compound for the human health Newman, 2008, 2014;Booth, 2009;Van Winckel et al., 2009;Walther et al., 2013), raises the question about a role of gut microbiota in the MK supply of the host. So far, MK biosynthesis has been studied in monocultures of model organisms (Bentley and Meganathan, 1982;Ramotar et al., 1984;Fernandez and Collins, 1987;Walther et al., 2013) or in animal-microbe models (Kindberg et al., 1987;Davidson et al., 1998) but the information about the human-microbe exchange of MK is still very scarce. Nonetheless, rat models demonstrated that luminal concentrations of MK produced by Escherichia coli and Bacteroides vulgatus could reach 6-7 and 8 µg per g of dry feces, respectively. Assuming a similar ratio for humans and a daily fecal output of 25-50 g solid matter in healthy individuals (Wyman et al., 1978), microbial produced and excreted MK could range from 150 to 400 µg feces. The recommended dietary intake of vitamin K, of which MK is a minor part, is for sucking infants 0.86-3.15 µg per day for sucking infants (Canfield et al., 1990(Canfield et al., , 1991Greer et al., 1991;Shearer and Newman, 2008) and for adults 75-90 µg per day (Frick et al., 1967;Booth and Al Rajabi, 2008;Shearer et al., 2012). In fact, the microbial contribution to MK requirements has been suggested to be approximately 50% (Wyman et al., 1978), but evidence is still missing.
Computational modeling (Palsson, 2006;Orth et al., 2010) of HGM metabolism could be used to systematically elucidate the MK biosynthesis potential of different HGM representatives. In fact, genome-scale metabolic models for numerous HGM microorganisms have been published (Thiele et al., 2005(Thiele et al., , 2011Orth et al., 2011;Branco Dos Santos et al., 2013;Heinken et al., 2013Heinken et al., , 2014Bauer et al., 2015), but still require better coverage of the respiratory chain and quinone biosynthesis pathways. At the same time, computational models for human metabolism (Duarte et al., 2007;Sahoo et al., 2012Sahoo et al., , 2015Heinken et al., 2013;Thiele et al., 2013a,b) are available, thereby, enabling the in silico study of HGM and their interactions with each other (Freilich et al., 2011;Zomorrodi and Maranas, 2012;Khandelwal et al., 2013;Heinken and Thiele, 2015a) as well as with the human host (Thiele et al., 2013a;Bauer et al., 2015;Heinken and Thiele, 2015b,c). Particularly, such a modeling approach could help us to estimate the microbial contribution to the MK requirements in humans.
Quinone biosynthesis has been studied in many taxonomically diverse microbial species (Collins and Jones, 1981;Meganathan, 2001b;Shearer and Newman, 2008;Nowicka and Kruk, 2010;Cluis et al., 2012), but no systematic analysis of quinone biosynthesis has yet been done for microbes found in a particular ecosystem. Considering that many microbes remain unculturable, as their culturing conditions remain unidentified, it is crucial to search their genomes for potential exchanges of metabolites with other community members. Particularly, quinones may play an important role in the co-metabolism of microbial communities, as they are main electron transfer molecules in microbial respiratory chains. Quinones influence the energy production and further, through cellular redox status and central metabolism, they affect the utilization of carbon and nitrogen sources and biosynthesis of indispensable compounds, such as amino and fatty acids. Thus, the ability of a microbe to de novo synthetize, salvage, or utilize quinones determines vital cellular properties, such as growth and replication. The presented results of the distribution of quinone biosynthetic pathways could be further expanded to include strains with high relevance in the biotechnological industry, in ecology, and in human health. Such large-scale comparative genomics effort could provide further insight into evolutionary mechanisms, including co-evolution of microbiomes with the host.

AUTHOR CONTRIBUTIONS
DR and IT conceived and designed the research project and wrote the manuscript. DR performed genomic analysis of the quinone biosynthetic pathway. All authors read and approved the final manuscript.