Leaving the Dark Side? Insights Into the Evolution of Luciferases

Bioluminescence—i.e., the emission of visible light by living organisms—is defined as a biochemical reaction involving, at least, a luciferin substrate, an oxygen derivative, and a specialised luciferase enzyme. In some cases, the enzyme and the substrate are durably associated and form a photoprotein. While this terminology is educatively useful to explain bioluminescence, it gives a false idea that all luminous organisms are using identical or homologous molecular tools to achieve light emission. As usually observed in biology, reality is more complex. To date, at least 11 different luciferins have indeed been discovered, and several non-homologous luciferases lato sensu have been identified which, all together, confirms that bioluminescence emerged independently multiple times during the evolution of living organisms. While some phylogenetically related organisms may use non-homologous luciferases (e.g., at least four convergent luciferases are found in Pancrustacea), it has also been observed that phylogenetically distant organisms may use homologous luciferases (e.g., parallel evolution observed in some cnidarians, tunicates and echinoderms that are sharing a homologous luciferase-based system). The evolution of luciferases then appears puzzling. The present review takes stock of the diversity of known “bioluminescent proteins,” their evolution and potential evolutionary origins. A total of 134 luciferase and photoprotein sequences have been investigated (from 75 species and 11 phyla), and our analyses identified 12 distinct types—defined as a group of homologous bioluminescent proteins. The literature review indicated that genes coding for luciferases and photoproteins have potentially emerged as new genes or have been co-opted from ancestral non-luciferase/photoprotein genes. In this latter case, the homologous gene’s co-options may occur independently in phylogenetically distant organisms.

observed in the water column have luminescence capability (Martini and Haddock, 2017).
From a molecular perspective, bioluminescence is the product of the oxidation of a luciferin substrate catalysed by a luciferase enzyme. The electronically excited oxyluciferin emits light as it relaxes to the ground state. In some cases, the luciferase and the luciferin are associated in a single unit, the so-called photoprotein. In photoprotein systems, the substrate/enzyme complex may require additional cofactors to be functionally active (Shimomura, 2012). The general luminescence reaction is ubiquitous in all known luminescent organisms; however, this ability to produce light emerged multiple times independently in the tree of life: more than 94 times according to the recent literature (Hastings, 1983; Wilson and Hastings, 1998; Haddock et al., 2010; Davis et al., 2016; Lau and Oakley, 2020). Several luciferins (i.e., at least 11 different molecules identified so far) and luciferases have indeed been described in a large variety of taxa (Herring, 1987; Haddock et al., 2010; Lau and Oakley, 2020, for review). Around 100 different species have been substantially described using biochemical approaches (Supplementary Table 1). While the majority of investigated luminous species use a luciferase/luciferin system (at least 75 species, Supplementary Table 1), photoproteins have been described in around 25 species (i.e., in cnidarians, ctenophores, annelids, molluscs, crustaceans, echinoderms, and fishes, Supplementary Table 1) (Shimomura, 1986, 2008, 2012). Luciferases are generally considered as "taxon-specific" (Haddock et al., 2010; Shimomura, 2012). Besides, phylogenetically related organisms may sometimes rely on non-homologous enzymes for photogenesis, supporting the convergent evolution of bioluminescence. Therefore, there is no common luminous ancestor for all bioluminescent species.
The luminescent systems have different origins, resulting in highly diverse systems involving different molecular actors, different associated morphological and anatomical structures, and different types of control mechanisms (Haddock et al., 2010). It is strongly suggested that the multi-convergent evolution of bioluminescence demonstrates the existence of intense selective pressures in support of the emergence of bioluminescence mechanisms during organism evolution (Haddock et al., 2010). In that view, the acquisition of the ability to emit light could be seen as an "evolutionary easy process" during evolution (Haddock et al., 2010).
Based on the unpredictable emergence of luminescent species throughout evolution, and the necessity for oxygen in luminescence reactions, it has been speculated that bioluminescence might have evolved to eliminate oxygen or reactive oxygen species from the organism (Timmins et al., 2001; Wilson and Hastings, 2013). Bioluminescence would then be derived from defence mechanisms against free radicals, i.e., co-opted from an oxygen-detoxifying mechanism to a light-related communication type (Seliger, 1975). Wilson and Hastings (2013) argue that bioluminescence evolved in response to low oxygen levels during the time between the evolutionary emergence of photosynthesis on Earth (the so-called "great oxidation event" that occurred around 2 billion years ago) and the Cambrian explosion (around 500-550 million years ago).
According to Wilson and Hastings (2013), all bioluminescence systems "consume" oxygen and could therefore be considered primary oxygen detoxification strategies, with light simply considered a secondary by-product. Bioluminescence would have then acquired a different functional role when antioxidant pathways, such as those involving superoxide dismutases and catalases, became widespread with increasing oxygen levels. As stated by Valiadi and Iglesias-Rodriguez (2013), the hypothesis of Wilson and Hastings is mainly based on the bioluminescence systems of bacteria and fireflies, but it is largely plausible for other bioluminescent organisms. In cell cultures, coelenterazine (i.e., the most common luciferin in the marine environment) has been shown to reduce the death of fibroblasts exposed to oxidative stress (Rees et al., 1998). Coelenterazine is detected not only in luminescent organs but also in the digestive tract and hepatopancreas of several luminous and non-luminous decapods, cephalopods and fishes (Shimomura, 1987; Mallefet and Shimomura, 1995; Thomson et al., 1997; Rees et al., 1998; Duchatelet et al., 2019). These observations support an anti-oxidative function of this kind of compound, and luciferins might then be antioxidant molecules emitting light as a by-product of their "reactive oxygen scavenging chemical activity" (Rees et al., 1998; Haddock et al., 2010). The presence of common light-emitting luciferins in luminous but also in non-luminous organisms (i.e., the ecological notion of a luciferin reservoir) led to the hypothesis of the "luciferin dietary acquisition" (Haddock et al., 2010; Shimomura, 2012): luminous organisms acquire their luciferin through their food, and those molecules can transit via the food chain (i.e., a predator can retrieve the luciferin produced by its prey) (demonstrated in some species: Barnes et al., 1973; Warner and Case, 1980; Frank et al., 1984; Thomson et al., 1997; Haddock et al., 2001; Mallefet et al., 2020).
The same luciferin can then be found in phylogenetically distant organisms (e.g., coelenterazine is found in at least nine phyla). This "oxygen defence" hypothesis has also been adapted for luciferases that might initially have been antioxidative enzymes secondarily co-opted as luciferases (Haddock et al., 2001; Wilson and Hastings, 2013). The hypothesis has the advantage of explaining the widespread occurrence of bioluminescence in organisms. However, the general idea that luminescence might have evolved to eliminate oxygen stress is over-simplistic and has been disproved by the discovery of several luciferases which are not homologous to antioxidative enzymes but rather derived, i.e., co-opted, from unrelated enzymes (Viviani, 2002; Loening et al., 2006; Müller et al., 2009; Delroisse et al., 2017b).
Understanding the evolution of bioluminescence is challenging because various biological processes were shown to be synergistically involved in the emergence of bioluminescence (e.g., substrate dietary acquisition, gene co-option, potential horizontal gene transfers, etc.) (e.g., Viviani, 2002; Loening et al., 2006; Delroisse et al., 2017b; Bessho-Uehara et al., 2020b). Recently, a remarkable example of dietary enzyme acquisition has been described in the predatory fish Parapriacanthus, for which not only the luciferin but also the functionally active luciferase was shown to be recovered from the ostracod prey (Bessho-Uehara et al., 2020b).

METHODOLOGICAL APPROACH FOR THE META-ANALYSES
Luciferase and photoprotein protein sequences were retrieved from NCBI (Supplementary Table 1). The global dataset was analysed using a sequence-similarity-based clustering approach based on BLASTp e-values and using the CLANS software (Frickey and Lupas, 2004). Based on the CLANS clustering, pairwise sequence identity and similarity were calculated from trimmed multiple sequence alignments of each luciferase/photoprotein subset (defined as a group of potentially homologous luciferases/photoproteins) using the SIAS web tool (http://imed.med.ucm.es/Tools/sias.html). Molecular domain prediction was performed using the Hidden Markov Model of the Simple Modular Architecture Research Tool (SMART) (Figure 1). Molecular weight values (determined by biochemical approaches, not by in silico analyses) were collected from the literature.
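As an illustration of the pairwise identity calculation mentioned above, the following minimal Python sketch computes percent identity between sequences taken from a multiple sequence alignment. It is not the SIAS implementation: the function names, the toy sequences, and the choice of denominator (columns where neither sequence has a gap) are assumptions made for the example, and the normalisation actually used in the review is not stated.

```python
from itertools import combinations


def pairwise_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences.

    Assumption: only columns where neither sequence has a gap are compared,
    and the number of such columns is used as the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def identity_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Percent identity for every unordered pair of sequences in an alignment."""
    return {
        (x, y): pairwise_identity(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment (not real luciferase sequences).
toy = {"lucA": "MKT-LLA", "lucB": "MKTALLG", "lucC": "MRT-LIA"}
for pair, pct in identity_matrix(toy).items():
    print(pair, round(pct, 1))
# ('lucA', 'lucB') 83.3
# ('lucA', 'lucC') 66.7
# ('lucB', 'lucC') 50.0
```

A similarity score would follow the same pattern, with the equality test replaced by membership in the same residue group (for instance, hydrophobic or charged residues); the grouping scheme would again be an assumption.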
In total, 134 sequences of luciferases and photoproteins (from 75 species and 11 phyla) were collected in the context of the present review. In parallel, we generated the list of known bioluminescent proteins, including those for which no sequences are available yet (Supplementary Table 1). To our knowledge, it is the most complete repertoire of known luciferases and photoproteins (however, the idea of generating an exhaustive set is certainly utopian).
Our analyses highlighted the presence of 12 distinct types of bioluminescent proteins, defined as clusters of homologous bioluminescent proteins based on an E-value threshold of 1e-10 (Figure 2). In the following sections, we will discuss each of these photoprotein/luciferase clusters from an evolutionary perspective.
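The idea of grouping sequences by an E-value threshold can be sketched as follows. This is a simplified stand-in, not the CLANS algorithm (CLANS uses a force-directed layout of all-against-all BLAST results): here, sequences are linked whenever a hit has an E-value at or below the threshold, and clusters are the connected components of the resulting graph (single linkage). The function name, sequence names, and hit values are hypothetical.

```python
def cluster_by_evalue(names, hits, threshold=1e-10):
    """Group sequences into single-linkage clusters.

    names: iterable of sequence identifiers.
    hits: iterable of (query, subject, evalue) tuples, e.g. parsed from
          tabular BLASTp output; all identifiers must appear in `names`.
    Two sequences are linked when a hit between them has evalue <= threshold.
    """
    parent = {n: n for n in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        if query != subject and evalue <= threshold:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q  # merge the two clusters

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical toy data: two strong hits, two weak ones.
names = ["bactA", "bactB", "dinoA", "dinoB", "oploA"]
hits = [
    ("bactA", "bactB", 1e-80),
    ("dinoA", "dinoB", 3e-45),
    ("bactA", "dinoA", 0.5),    # above threshold, no link
    ("oploA", "dinoB", 1e-3),   # above threshold, no link
]
print(cluster_by_evalue(names, hits))
# [['bactA', 'bactB'], ['dinoA', 'dinoB'], ['oploA']]
```

With single linkage, one spurious low E-value hit is enough to merge two clusters, which is one reason a stringent threshold such as 1e-10 is useful in this kind of analysis.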

Bacteria Luciferases (Group I), the Most Ancient Type of Bioluminescent Proteins Shared by all Luminous Bacteria
Bioluminescent bacteria (i.e., terrestrial or aquatic) are all distributed into seven genera belonging to three families of Gammaproteobacteria: Vibrionaceae, Shewanellaceae and Enterobacteriaceae (Brodl et al., 2018), and at least 28 luminous marine bacteria have been described (e.g., Vibrio harveyi, Photobacterium sp.) (Tanet et al., 2020). Symbiotic associations with luminescent bacteria have also been described in teleost fish and squids (Dunlap and Kita-Tsukamoto, 2006) and suggested, as hypotheses, in other organisms (e.g., Mackie and Bone, 1978; Taylor et al., 1983). These last hypotheses (i.e., luminescence produced by symbiotic bacteria) were recently challenged in sharks and urochordates (Tessler et al., 2020). Multiple bacterial species have been identified in bioluminescent symbiotic associations: Aliivibrio fischeri, Aliivibrio logei, Photobacterium leiognathi, Photobacterium phosphoreum, Photobacterium kishitanii, Photobacterium mandapamensis, Candidatus Enterovibrio luxaltus, Candidatus Enterovibrio escacola, Candidatus Photodesmus katoptron, Candidatus Photodesmus blepharon, etc. (Boettcher and Ruby, 1990; Dunlap and Kita-Tsukamoto, 2006; Ast et al., 2007; Dunlap et al., 2007; Kaeding et al., 2007; Hendry et al., 2014, 2018; Freed et al., 2019), and some studies suggested that unidentified species could also be involved (Haygood and Distel, 1993). The bacterial luciferase is described as a flavin-dependent monooxygenase and is composed of two different but homologous subunits: the α subunit is the catalytic core, and the β subunit is crucially required for maintaining the catalytic function of the α subunit. In all described cases of bacterial luminescence, the luminous reaction involves the oxidation of both a long-chain aldehyde and a reduced flavin mononucleotide (FMNH2) into the corresponding carboxylic acid and the oxidised flavin mononucleotide (FMN). The reaction is catalysed by the heterodimeric enzyme and involves oxygen.
Tetradecanal is possibly the natural aldehyde substrate of bacterial luciferases but other aldehydes may also be used (Ulitzur and Hastings, 1979). The product of the oxidation of the reduced riboflavin phosphate (FMNH2), FMN-4a-hydroxide in an excited state, reverts to its ground state as it emits light. FMN-4a-hydroxide is then generally considered to be the luciferin, as it is the light emitter (Kurfürst et al., 1984; Lei et al., 2004).
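The overall stoichiometry of the bacterial reaction described above is commonly written as follows. This is a summary sketch only: R denotes the aliphatic chain of the aldehyde, hν the emitted photon, and the excited FMN-4a-hydroxide intermediate is not shown. The arrow notation assumes the amsmath package.

```latex
\[
\mathrm{FMNH_2} + \mathrm{RCHO} + \mathrm{O_2}
\;\xrightarrow{\text{bacterial luciferase}}\;
\mathrm{FMN} + \mathrm{RCOOH} + \mathrm{H_2O} + h\nu
\]
```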
Up to now, all investigated bacteria share a common type of luciferase (observed as the "Bacteria luciferases (Group I)" cluster in our meta-analyses, see Figure 2). It is interesting to note that similar bacterial luciferase-like encoding genes have been detected in Archaea (identity greater than 45%) and Fungi (identity greater than 35%), although functional information is lacking for these groups. Also, no homologous sequences have been found in metazoans (J.D. personal observations).
The dinoflagellate luciferin is thought to be derived from chlorophyll and has a very similar structure (Dunlap et al., 1981; Topalov and Kishi, 2001). A modified form of this luciferin is also found in herbivorous euphausiid shrimps, indicating a probable dietary link for the luciferin acquisition (Shimomura, 1980, 1995). The dinoflagellate luciferase contains three homologous domains, and each domain, of approximately 46 kDa, is known to be enzymatically active and to participate in the bioluminescence reaction (Li et al., 1997; Fajardo et al., 2020). The crystal structure of one of the domains (D3) in its inactive form was solved by Schultz et al. (2005). All luminous dinoflagellates investigated until now share a homologous luciferase type [Dinoflagellata luciferase (Group II) cluster in Figure 2]. Noctiluca scintillans, considered as a "primitive dinoflagellate species" (non-photosynthetic dinoflagellate), has only one enzymatically active luciferase domain potentially corresponding to the gene structure of the ancestral luciferase of dinoflagellates (Valiadi and Iglesias-Rodriguez, 2013).
In dinoflagellates, the luminous organelles (the scintillons) also contain a luciferin-binding protein that protects the luciferin from oxidation by the luciferase at physiological pH. While the luciferases are highly conserved in all investigated dinoflagellate species, this is not the case for the luciferin-binding proteins, which are all homologous but also appear to be highly variable in sequence and structure (Fajardo et al., 2020, for review). The evolutionary origin of dinoflagellate luciferase remains elusive, and no exact homologous sequences have been detected in non-dinoflagellate organisms. However, a structural similarity has been found with fatty-acid-binding proteins (FABPs) (lipocalin family) present in metazoans (Schultz et al., 2005).

Fungi Luciferases (Group III), a Common Type of Luciferase for all Fungi
Approximately 100 fungal species from the order Agaricales emit light using a standard luciferase-luciferin system (Oliveira et al., 2012; Kotlobay et al., 2018). Although the ecological role of fungal bioluminescence is not fully understood, there is evidence that it might be used to attract spore-dispersing insects (Oliveira et al., 2015).
Our analyses confirm that all investigated luminous Fungi share a common luciferase type (Group III, Figure 2). Unlike all other described bioluminescent proteins, Fungi luciferases are characterised by the presence of a transmembrane domain (Figure 1). No homologs of these enzymes have yet been found and they most likely represent a novel family of proteins.
All investigated ctenophores and cnidarians use coelenterazine as their light-emitting substrate. All investigated ctenophores and most investigated medusozoans (Cnidaria) species use calcium-activated photoproteins. The specific case of Periphylla sp., however, will be discussed in the section "Other Groups of Luciferases or Photoproteins" as the species has been shown to use a luciferase system. In addition, the case of the anthozoans appears to be different, as well, and will be tackled in the section "Renilla-Type Luciferases (Group VI), a Luciferase Type also Found in the Brittle Star Amphiura filiformis and, Possibly, in the Tunicate Pyrosoma atlanticum" as the octocorals Renilla sp., and most probably other sea pansies and sea pens (Bessho-Uehara et al., 2020a), are using a different and non-homologous luciferase system.
Ctenophoran and cnidarian photoproteins are functionally related to coelenterazine-binding proteins from Renilla, sarcoplasmic calcium-binding protein from the marine worm Nereis diversicolor and calmodulin proteins (Schnitzler et al., 2012). Schnitzler et al. (2012) proposed a metazoan-wide phylogeny for the "Aequora-type photoprotein" (i.e., Ctenophora and Medusozoa Photoproteins) gene family. They identified photoprotein-like genes in non-luminescent taxa (i.e., the poriferan Amphimedon and the cnidarian Nematostella), and demonstrated that the gene family likely arose at the base of the Metazoa (Schnitzler et al., 2012). Light-emitting calcium-binding photoproteins (i.e., functional photoproteins) may have evolved independently from a homologous gene found in ctenophores, cnidarians, and non-luminous sponges (Prasher et al., 1985; Tsuji et al., 1995). The emergence of "light-emitting photoproteins" in cnidarians and ctenophores could then appear as an example of parallel evolution of conserved and homologous genes.

Insecta-Type Luciferases (Group V), a Luciferase Type Also Possibly Found in the Cephalopod Watasenia scintillans
All luminous insects [i.e., Coleoptera with around 2,300 luminous species (Li et al., 2021); Diptera with a lower specific diversity but not yet exhaustively evaluated to the best of our knowledge; Collembola and Hemiptera (Harvey, 1952; McElroy et al., 1974)] share a single homologous luciferase type, and the insect proto-luciferase is known to derive from Acyl-CoA ligase enzymes that have a primary metabolic function (Viviani, 2002). These enzymes are members of the ANL superfamily of adenylating enzymes. This superfamily consists of various enzymes, in addition to the Insecta-type luciferase, such as long-chain fatty acid CoA ligases and acetyl-CoA synthetases as well as other closely related synthetases and a plant auxin-responsive promoter family. The name ANL derives from three subfamilies: the Acyl-CoA synthetases, the NRPS adenylation domains, and the Luciferase enzymes. Members of this superfamily catalyse the initial adenylation of a carboxylate to form an acyl-AMP intermediate, followed by a second partial reaction, most commonly forming a thioester (Gulick, 2009). While the homology between all insect luciferase genes is apparent (Figure 2), the evolutionary origin of bioluminescence in this group still appears very complicated, and recent studies supported independent emergences of bioluminescence and parallel evolution of luciferases in fireflies, click beetles and Diptera (Fallon et al., 2018; Watkins et al., 2018). Several authors suggested that the high abundance of ancestral gene duplications in this gene family, and as a result the associated closely related enzymatic activities, served as "raw materials for the selection of new adaptive catalytic functions" (Weng, 2014; Fallon et al., 2018).
Generally, the insect luciferin-luciferase system uses the firefly-type luciferin and requires ATP as a cofactor. Interestingly, the use of a distinct bioluminescent system was demonstrated for Arachnocampa flava and Orfelia fultoni, two phylogenetically related dipteran species from Australia and North America, respectively. While both species produce light via an insect-type luciferase, A. flava uses a different type of luciferin compared to the other insects (Watkins et al., 2018).
Surprisingly, luciferases homologous to the Insecta-type luciferase were also proposed in the marine squid Watasenia scintillans (Cephalopoda, Mollusca) and the sponge Suberites domuncula (Porifera) (Figure 2). The firefly squid W. scintillans emits intense blue bioluminescence from photophores located at the tip of two of its arms. Within the photophore, luciferases are specifically organised in microcrystals, and these proteins catalyse the bioluminescent reaction using the coelenterazine disulfate luciferin and ATP (Tsuji, 1985; Hamanaka et al., 2011). The involvement of ATP in the luminescence reaction was, however, questioned (Teranishi and Shimomura, 2008; Shimomura, 2012). Gimenez et al. (2016) identified potential Watasenia luciferases (i.e., several related proteins were pinpointed: wsluc1-3) that share around 20% sequence identity with firefly luciferases. While the authors demonstrated that the expression profile of the predicted luciferases matches the luminous patterns, additional functional studies would be necessary to confirm the bioactivity of the predicted luciferase. Müller et al. (2009) suggested that an Insecta-type luciferase (acetyl-CoA synthetase) was involved in the bioluminescence of the sponge Suberites domuncula (Wiens et al., 2010) but this hypothesis remains very speculative. The authors showed that tissue extracts produce light that was detected using sensitive films in the dark. Then, the luciferase-like protein was immunodetected within the tissue. Finally, the recombinant sponge luciferase-like protein produced in Escherichia coli has been shown to emit light, but no information was given as to the actual amount of light produced. The authors suggested that the marine sponge Suberites may be using a firefly luciferase homolog and a luciferin similar to the firefly luciferin, which raises many unanswered questions.
Additional data would be required to test the presumed involvement of the insect-type luciferase in ecologically relevant light emission in the poriferan Suberites domuncula. The records of luminescence in Porifera are extremely limited and the clear status of intrinsic bioluminescence in these organisms still needs to be confirmed. Martini et al. (2020) recently published the most documented observation of bioluminescence in a deep-sea sponge, paving the way to a better understanding of bioluminescence in these organisms. These authors suggested that an undescribed carnivorous sponge species (Cladorhizidae) is using a coelenterazine-dependent bioluminescence.
From the above examples, it appears probable that enzymes of the ANL superfamily have independently evolved in distant species to produce light using unrelated substrates (Gimenez et al., 2016). This may then represent a striking example of parallel evolution.

Renilla-Type Luciferases (Group VI), a Luciferase Type Also Found in the Brittle Star Amphiura filiformis and, Possibly, in the Tunicate Pyrosoma atlanticum
The luminescence of the sea pansy Renilla reniformis, a shallow-water soft coral (octocoral) that displays blue-green bioluminescence upon mechanical stimulation, has been intensively studied since the luciferase was cloned and sequenced in 1991 (Lorenz et al., 1991). The Renilla luciferase enzyme catalyses coelenterazine oxidation leading to bioluminescence. The Renilla luciferase shows a characteristic alpha/beta-hydrolase fold (Marchler-Bauer et al., 2003). It is found to have a high level of tertiary structure similarity and to be homologous to bacterial haloalkane dehalogenases, which are primarily hydrolase enzymes cleaving a carbon-halogen bond in halogenated compounds (Hynková et al., 1999; Loening et al., 2006). Horizontal gene transfers, which are known to play critical roles in the evolutionary acquisition of novel traits in eukaryotes (Boto, 2014), have been suspected to explain the high similarity of the Renilla luciferase to bacterial haloalkane dehalogenases (Loening et al., 2006; Delroisse et al., 2017b).
A Renilla-type luciferase was recently identified in the brittle star Amphiura filiformis (Echinodermata, Ophiuroidea). In this species, which emits a blue luminescence at the level of the arm spines (Delroisse et al., 2017a), coelenterazine is the luciferin and is acquired via a dietary pathway. The predicted A. filiformis luciferase, highly similar to the Renilla luciferase (up to 47% identity, up to 69% similarity) (Figure 2), would constitute the only example of luciferase described so far in echinoderms. Amphiura luciferase has been detected specifically in the animal's spine photocytes, which constitutes a strong indication of its involvement in photogenesis (Delroisse et al., 2014, 2017a). However, given the expression of Renilla luciferase-like proteins in non-luminous echinoderms, this hypothesis must be confirmed by the recombinant expression of the A. filiformis protein sequence to verify its luciferase activity. It has been suggested that the haloalkane-dehalogenase function constitutes the metazoan ancestral state, which shifted to luciferase in cnidarians (lineage of Renilla) and brittle stars (lineage of A. filiformis) (Loening et al., 2006; Delroisse et al., 2017b). Haloalkane dehalogenases were presumably co-opted as luciferases in these two specific lineages. In A. filiformis, the apparent late duplications of luciferase-like genes could suggest the co-occurrence of both functions (Delroisse et al., 2017b).
Renilla sp. and A. filiformis would then possess similar and homologous luciferases to catalyse the photogenous reaction. Delroisse et al. (2017b) hypothesised that a co-emergence happened between these two luminous systems using the same compounds under similar environmental pressure. The ecological similarities between Renilla (Renilla mulleri, Renilla reniformis, and potentially luminous sea pens in general) and A. filiformis, such as the benthic position on loose sediment and the suspension-feeding strategy, would presumably allow these organisms to acquire coelenterazine from planktonic organisms in a "dietary way." The predation pressure would positively select the emergence of the bioluminescence function, endowing these slow-moving organisms with an efficient anti-predation strategy.
Similar to what was observed in the brittle star A. filiformis, a Renilla-like luciferase has also been found in the luminous tunicate Pyrosoma atlanticum (Figure 2). Immunodetections of the luciferase have been performed within the Pyrosoma tissues. In parallel, in vitro expression and functional testing of the protein qualitatively confirmed the enzyme bioactivity (Tessler et al., 2020). However, the hypothesis of the Renilla-like luciferase involved in the bioluminescence of P. atlanticum has recently been questioned (Berger et al., 2021).

Sthenoteuthis-Type Photoproteins or Symplectin (Group VII)
The flying squid Sthenoteuthis oualaniensis is characterised by a light organ on its mantle. The light organ contains thousands of small granules, in which a photoprotein exists as the active form (Kuse, 2014; for review). The species emits light using the oxidation of the dehydro-coelenterazine luciferin substrate by the so-called symplectin photoprotein enzyme (Fujii et al., 2002). The 60-kDa symplectin photoprotein was extracted and characterised (Fujii et al., 2002). Sequence analyses revealed no sequence similarity to known bioluminescent proteins (Figure 2) but a significant similarity to the carbon-nitrogen hydrolase domain found in mammalian biotinidase and vanin (pantetheinase) (Fujii et al., 2002). Francis et al. (2017) recently explored the phylogenetic distribution of these enzymes, grouped in the symplectin/pantetheinase protein family, in metazoans. These authors suggested that symplectins may have multiple functions including hydrolase activity (Francis et al., 2017). Both dehydrocoelenterazine and symplectin-like photoprotein were recently demonstrated to be responsible for the light emission of the Humboldt squid, Dosidicus gigas, one of the largest cephalopods on Earth (Francis et al., 2017; Galeazzo et al., 2019).

Pholas dactylus, the common glowing piddock, is a famous luminescent organism because Dubois, who discovered the general luciferin-luciferase reaction back in the 19th century, was specifically working on this species (Dubois, 1889). The biochemistry of the Pholas bioluminescence was studied in depth in the 1970s (Henry and Michelson, 1973; Michelson, 1978). The term Pholasin was initially dedicated to the luciferin before Pholasin was confirmed to be a photoprotein (Henry and Michelson, 1973; Knight and Campbell, 1987; Kuse, 2020). Dunstan et al. (2000) cloned the gene coding for the Pholasin apoprotein.
These authors compared the amino acid sequence with known proteins present in the public databases and more specifically with the sequences of other cloned bioluminescent proteins (available at that time). A small region of similarity was found between the recombinant protein and the putative luciferin-binding sites of Vargula luciferase and Renilla luciferin-binding protein. However, these sites are very small and do not inform on the potential homology status of these proteins. Our analyses indicated that the Pholas photoprotein has no clear homology with other known bioluminescent proteins [Figure 2; three protein sequences, predicted from different clones sequenced by Dunstan et al. (2000), are presented, but these identical protein sequences most probably correspond to a unique gene].

Oplophorus Luciferase (Group IX), a System Only Confirmed in Oplophorus gracilirostris (Decapoda, Pancrustacea)
This cluster only contains the luciferase of the deep-sea shrimp Oplophorus gracilirostris. O. gracilirostris secretes a luminous blue cloud from the basal part of its antennae when disturbed (Shimomura et al., 1978). Similar behaviours are observed in various luminescent decapod shrimps including the genera Heterocarpus, Systellaspis, and Acanthephyra (Harvey, 1952). However, the involvement of an Oplophorus-type luciferase in the light emission has only been confirmed in O. gracilirostris.
The Oplophorus luciferase catalyses the oxidation of coelenterazine. The enzyme consists of two subunits (19 and 35 kDa), but the smaller subunit is the only one to have a catalytic activity while the 35 kDa protein is thought to have a role in the stabilisation of the catalytic unit. The 19 kDa protein of Oplophorus luciferase is the smallest known catalytic component having a luciferase function (Inouye et al., 2000).
Oplophorus luciferase presents no homology with other known bioluminescent proteins (Figure 2).

Cypridinidae Luciferases (Group X), a Common Type of Luciferase for Ostracods of the Cypridinidae Family
Around 150 ostracod species from the family Cypridinidae (out of about 300 species) can produce light (Brandão et al., 2015; Morin, 2019). These organisms use bioluminescence for defence or to create courtship displays. All investigated luminous cypridinid ostracods use the same luciferin and homologous enzymes to produce light (Harvey, 1924). It has been proposed that luminous cypridinid ostracods synthesise their luciferin from the amino acids tryptophan, isoleucine, and arginine (Oba et al., 2002). This luciferin, called Vargulin or Cypridina-type luciferin as it was initially found in the ostracods Vargula and Cypridina, is also the one used by the midshipman fish Porichthys sp. A clear dietary link has been established, and the fish lose their ability to luminesce if they are not fed with luciferin-containing food (Mensinger and Case, 1991). The luminescent fish Parapriacanthus ransonneti obtains not only its luciferin but also its luciferase enzyme from bioluminescent ostracod prey (Bessho-Uehara et al., 2020b).
In parallel, the ostracod family Halocyprididae also contains a dozen luminous species and five of them-belonging to the same genus-were shown to use a coelenterazine-based system (e.g., Conchoecia pseudodiscophora) (Angels, 1968;Campbell and Herring, 1990;Oba et al., 2004). Their corresponding luciferases are still unknown.
Cypridinid luciferases have no homology with other known bioluminescent proteins (Figure 2).

Copepoda Luciferases (Group XI), a Common Type of Luciferase for all Copepods
Some marine copepods emit a bright blue light using a classical luciferase-luciferin system based on the coelenterazine substrate. It is a case of secreted bioluminescence, and the simple oxidation reaction does not require any additional cofactors. Copepod luciferases are small secreted proteins of around 18-24 kDa. Several copepod luciferase genes have been cloned (e.g., from Gaussia, Metridia, Pleuromamma, Lucicutia, Heterorhabdus, and Heterostylites) (Takenaka et al., 2008, 2012; Thouand and Marks, 2014). The luciferases from the copepods Gaussia princeps and Metridia longa have been used as bioluminescent reporters in various applications (Thouand and Marks, 2014).

Odontosyllis Luciferase (Group XII)
The Group XII cluster only contains the luciferases of the luminous annelid worms from the genus Odontosyllis. The bioluminescent systems of Odontosyllis enopla and Odontosyllis octodentata have been partly characterised, and the luciferin of O. enopla has been partially purified, showing that light emission requires the presence of magnesium, molecular oxygen, and crude luciferase (Shimomura et al., 1963; Trainor, 1979; Shimomura, 2012). More recently, the luciferase and luciferin of Odontosyllis undecimdonta have been characterised and the enzyme has been cloned; in this species, magnesium ions were shown not to be required for the luciferin-luciferase reaction (Mitani et al., 2019; Schultz et al., 2018; Kotlobay et al., 2019). Odontosyllis luciferases have no homology with other known bioluminescent proteins (Figure 2).
Interestingly, Deheyn and Latz (2009) proposed that a photoprotein could be involved in the bioluminescence of the species Odontosyllis phosphorea. If this assumption is confirmed, it would imply the presence of two convergent bioluminescent protein types (a luciferase and a photoprotein) in the genus Odontosyllis.

Other Groups of Luciferases or Photoproteins
There are many luminous organisms for which, although no luciferase sequence is available, crucial biochemical information indicates that the bioluminescent proteins involved do not correspond to any of the 12 luciferase types described above. This suggests that additional groups might be described in the future.
In contrast to the other medusozoans (see section "Ctenophora and Medusozoa Photoproteins (Group IV), the First Described Photoprotein-Type"), Periphylla periphylla has been described as using a luciferase rather than a photoprotein for its light emission. Shimomura and Flood (1998) described two types of luciferases catalysing the luminous reaction, i.e., luciferase-L (32 kDa) and luciferase-O (75 kDa), both using coelenterazine as substrate and occurring in the photocytes of the marginal exumbrella and in the eggs of the jellyfish, respectively (Shimomura and Flood, 1998; Shimomura, 2012). The bioluminescent system of Periphylla suggests the emergence of convergent bioluminescent proteins in medusozoans, with at least two different systems: the luciferases observed in Periphylla and the photoproteins observed in several Medusozoa species.
In echinoderms, despite the relatively common occurrence of luminous species, only two ophiuroid species, A. filiformis and Ophiopsila californica, have been investigated biochemically. The former luminesces with a luciferin-luciferase system stricto sensu (see above), whereas the latter emits light with a photoprotein system (Mallefet et al., 2012; Shimomura, 1986, 2012). A high diversity of physiological luminescence control mechanisms has been described in these organisms. Species from the same genus (e.g., O. californica and Ophiopsila aranea) can sometimes exhibit different control mechanisms of the photogenous reaction (see Mallefet, 2009, for review).
For a considerable number of studied bioluminescent organisms, the luminous system is partially or totally unknown. For the majority of decapods or bony fishes, for example, only one part of the system (i.e., the luciferin) is known and the luciferases or photoproteins remain unknown (Shimomura, 2012; see Supplementary Table 1). This is also the case for the terrestrial potworms Fridericia heliota (Petushkov et al., 2014; Tsarkova et al., 2016) and Henlea sp. (Petushkov and Rodionova, 2005; Oba et al., 2017), the earthworm Diplocardia longa (Ohtsuka et al., 1976; Rudie et al., 1981), and the freshwater limpet Latia neritoides (Shimomura and Johnson, 1968; Ohmiya et al., 2005), for which the luciferins are identified and the luciferases only partially characterised. Within the terrestrial environment, the biochemistry of bioluminescence in luminous centipedes (Myriapoda, Chilopoda) and millipedes (Myriapoda, Diplopoda), such as the millipede Motyxia sp., has not been investigated to date (Rosenberg and Meyer-Rochow, 2009; Oba et al., 2017). Some studies rely only on cross-reactivity between extracts of closely related, or even phylogenetically distant, species to determine the type of luciferin used (Shimomura, 2012). Most often, no data on the luciferase or photoprotein sequence, activity or specificity are available for species that are fragile and difficult to collect, such as those found in deep oceanic strata. Therefore, efforts are still needed to discover hitherto unknown bioluminescent systems. For instance, lanternfish, dragonfish and viperfish luminous systems were demonstrated to use coelenterazine as luciferin, but no luciferase/photoprotein has been identified to date (Tsuji and Haneda, 1971; Mallefet and Shimomura, 1995; Duchatelet et al., 2019).
Similarly, shark bioluminescence systems remain totally enigmatic, although these organisms have been extensively studied in recent years (e.g., Claes et al., 2020; Mallefet et al., 2021) and attempts were made to decipher the bioluminescent compound in the lanternshark Etmopterus spinax (Renwart and Mallefet, 2013). Cross-reactivity with known luciferins failed to trigger light production, and a preliminary search for luciferase homologues within the available transcriptomic data did not yield any results, suggesting the involvement of an unknown bioluminescent system in luminous sharks (Renwart and Mallefet, 2013; Delroisse et al., 2018, 2021).

DISCUSSION
The evolution of bioluminescence is often used as a striking illustration of convergent evolution. While it appears clear that many extant bioluminescent systems have evolved independently, the number of fully characterised bioluminescent proteins is still small, and the evolution of these light-emitting luciferases and photoproteins remains mostly enigmatic. While the diversity of luciferins involved in bioluminescent systems is rather well evaluated (see Lau and Oakley, 2020, for review), the diversity of bioluminescent proteins is, without a doubt, largely underestimated.
It is essential to clarify that the present review does not illustrate the evolutionary history of bioluminescence because it only focuses on visualising the bioluminescent protein homology across the tree of life. Knowing the evolutionary history of luciferases is not enough to explain the evolutionary history of bioluminescence as a whole. The reality is indeed more complex, and bioluminescence substrates likely possess different evolutionary histories from those of luciferases (as presented by Fallon et al., 2018 in the case of luminous insects). The same remark could also be made for other associated actors such as luciferin-binding proteins. As illustrated in Lau and Oakley (2020), understanding how bioluminescence emerged in living organisms requires the investigation of all potential actors of bioluminescence, including luciferin biosynthetic or dietary acquisition pathways, luciferases, and bioluminescence control. While bioluminescence can be convergent at one biological level, the convergence may not be found at other levels (Lau and Oakley, 2020).

Twelve Distinct Bioluminescent Protein Types Are Currently Described
Multiple types of luciferases emerged convergently in the tree of life (Figure 3). Based on the currently available sequence data, our meta-analyses suggest that at least 12 nonhomologous bioluminescent protein types (i.e., three types of photoproteins and nine types of luciferases stricto sensu) appeared independently during evolution. Our analyses confirmed that luciferases/photoproteins appear relatively lineage-specific (e.g., all described luminous bacteria share a common and homologous luciferase type). To cite Lau and Oakley (2020), "most known bioluminescent proteins exhibit wide molecular diversity and are not homologous across distantly related taxa, which suggest that most origins of bioluminescent proteins are the result of convergent, but not parallel, evolution." However, our analyses also highlighted that, in some cases, a similar system (i.e., homologous enzymes) could be used by phylogenetically distant organisms: Group IV photoproteins are shared by ctenophores and medusozoans; Group V luciferases are shared by insects, the cephalopod Watasenia scintillans and, putatively, the sponge Suberites domuncula; Group VI luciferases are shared by the sea pansy Renilla sp., the tunicate Pyrosoma atlanticum and the brittle star Amphiura filiformis. While the enzymes appear to be homologous within each of these groups (IV, V, VI), it also appears that they have been independently co-opted into luciferases in these distant lineages. In short, in these examples of parallel molecular evolution, the proteins themselves are homologous, but their luciferase function is not. However, as exemplified by Tyler (1988), homology should apply most appropriately to the structural features, not their functions.

[Figure caption fragment: ... Lau and Oakley (2020) and Haddock et al. (2010). Phylogenetic tree based on Giribet and Edgecombe (2020).]

Bioluminescent Proteins Could Be Bifunctional Enzymes
Bioluminescence in insects is thought to have emerged from the activity of ancestral fatty acyl-CoA synthetase (ACS) enzymes present in all insects. Beetle luciferases share high sequence identity with these enzymes and often retain the ACS activity (Adams and Miller, 2020). In addition, some ACS enzymes from non-luminous insects can catalyse bioluminescence from synthetic D-luciferin analogues (Mofford et al., 2017; Adams and Miller, 2020).
The annelid polynoidin is also present in non-luminescent scale worms, suggesting that bioluminescence might have originated from an unrelated mechanism (in this case, the quenching of superoxide radicals) (Plyuscheva and Martin, 2009).
The case of the Renilla-type luciferase of the brittle star A. filiformis was investigated in detail based on genomic and transcriptomic data. While the Renilla-type luciferase appears to be specifically expressed in the photocytes of the luminous brittle star, similar Renilla-type luciferases are also present in non-luminous echinoderms, raising questions about the evolution of bioluminescence in echinoderms. In the sea urchin Strongylocentrotus purpuratus, a Renilla-type luciferase protein (DspA) was identified as the first biochemically characterised haloalkane dehalogenase of non-microbial origin (Fortova et al., 2013). Nagata et al. (2015) noted homology between the luciferase from R. reniformis and some microbial hydrolases catalysing the removal of halogens from aliphatic hydrocarbons, the so-called haloalkane dehalogenases. Both enzymes share the conserved catalytic triad of residues. A microbial haloalkane dehalogenase (Sphingomonas sp.) shares high sequence identity (42%) and similarity (62%) with Renilla luciferase. This similarity is somewhat surprising considering that haloalkane dehalogenases are hydrolases and Renilla luciferase is an oxygenase. It seems therefore that the "luciferase-like" proteins could have kept the original microbial function of haloalkane dehalogenases, at least in sea urchins. Chaloupkova et al. (2019) have recently reconstructed an ancestor of evolutionarily related (but catalytically distinct) haloalkane dehalogenases and Renilla luciferase that has both hydrolase and monooxygenase activities. Fortova et al. (2013) and Delroisse et al. (2017b) reported the absence of light emission after coelenterazine addition to luciferase-like protein extracts from the sea urchin S. purpuratus and the sea star Asterias rubens, which strongly suggests the absence of a luciferase function for the enzymes of these non-luminous species.
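To make the distinction between sequence identity and sequence similarity concrete, the following minimal sketch computes both percentages over the ungapped columns of a pairwise alignment. It is an illustration only and not the method used in the cited studies: the two aligned fragments are invented toy strings, and the residue groups used to decide "similarity" are an assumed simplification (alignment programs usually derive similar positions from a substitution matrix, so their values may differ).

```python
# Illustrative only: the aligned fragments below are made up; they are NOT
# real Renilla luciferase or haloalkane dehalogenase sequences.

# Assumed groups of physicochemically similar residues (a simplification;
# alignment tools usually score "similar" positions with a substitution matrix).
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("STNQ"), set("AG")]


def is_similar(a: str, b: str) -> bool:
    """Return True if residues a and b are identical or share an assumed group."""
    if a == b:
        return True
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Percent identity and percent similarity over columns without gaps ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in columns)
    similar = sum(is_similar(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


# Hypothetical aligned fragments ('-' marks an alignment gap).
seq_a = "MKV-LDEF"
seq_b = "MRVALEEP"
identity, similarity = identity_and_similarity(seq_a, seq_b)
print(f"identity = {identity:.1f}%, similarity = {similarity:.1f}%")
```

Because identical residues also count as similar, similarity is always at least as high as identity, which is consistent with the pair of values (42% and 62%) reported above.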
Symplectins from the cephalopods Sthenoteuthis and Dosidicus, which are derived from pantetheinase enzymes, also contain active site residues involved in pantetheinase catalysis, suggesting that these photoproteins may have multiple functions including a hydrolase activity (Francis et al., 2017).
These examples of functional shifts indicate that luciferases did not necessarily derive from ancestral oxygenases, and that luciferases may retain the ancestral function and be bifunctional in some cases. Outside the bioluminescence field, this observation has already been reported for oxygenases (Chen et al., 2005). Co-options of genes unrelated to monooxygenases into luciferases are predicted for the Group IV (Ctenophora/Cnidaria photoproteins), Group V (Insecta/Watasenia/Suberites luciferases) and Group VI (Renilla/Pyrosoma/Amphiura luciferases) bioluminescent enzymes, which emerged from calmodulin, acyl-CoA ligase and bacterial haloalkane dehalogenase enzymes, respectively. However, determining the ancestral function of a protein is difficult, and these enzymes might have been functioning as oxygenases long before the emergence of their non-oxygenase activity.

CONCLUSION
Evolution (and natural selection, in particular) often promotes evolutionary innovation by co-opting preexisting genes for new functions, and gene duplication is known to facilitate this process (Hoffmann et al., 2010). As Hoffmann et al. (2010) put it, "some examples of convergent evolution of protein function provides an impressive demonstration of the ability of natural selection to cobble together complex design solutions by tinkering with different variations of the same basic protein scaffold." Here we emphasise that multiple bioluminescent proteins potentially appeared during evolution through the independent emergence of new genes or through the co-option of existing genes with an ancestral function unrelated to bioluminescence (i.e., convergent evolution). In this latter case, co-option might have occurred independently across the tree of life (i.e., parallel evolution), leading to homologous light-emitting systems in unrelated luminous organisms. As already suggested for other types of proteins (e.g., Casewell, 2017), our findings suggest that co-option may be an underappreciated process underpinning protein "neofunctionalisation."

AUTHOR CONTRIBUTIONS
JD performed analyses and wrote the first draft of the manuscript. LD participated in the data collection from the literature. PF and JM supervised the work. All authors participated in discussions and revised the final manuscript.

ACKNOWLEDGMENTS
This study is a contribution from the "Centre Interuniversitaire de Biologie Marine" (CIBIM). This study is the contribution BRC #378 of the Biodiversity Research Center (UCLouvain) from the Earth and Life Institute Biodiversity (ELIB). JD, JM, and PF are, respectively, postdoctoral fellow, Research Associate, and Research Director of the Fund for Scientific Research of Belgium (F.R.S-FNRS). LD is a postdoctoral researcher at the University of Louvain. We thank the reviewers as well as Warren Francis and Oba Yuichi for their constructive comments.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmars.2021.673620/full#supplementary-material
Supplementary Table 1 | List of known bioluminescent systems based on biochemical and molecular data (including the sequence references used for the CLANS analysis presented in Figure 2).
Supplementary File 1 | References used to generate the list of known bioluminescent systems based on biochemical and molecular data (Supplementary Table 1).