Discovery and Characterization of a Bilirubin Inducible Green Fluorescent Protein From the Moray Eel Gymnothorax zonipectis

Since the initial discovery of Aequorea victoria’s green fluorescence off the coast of Washington’s Puget Sound, biofluorescent marine organisms have been found across the globe. Both the range of fluorescence colors and the diversity of organisms that exhibit them are striking. The mechanisms of biofluorescence in marine organisms are also variable: some organisms use fluorescent proteins, while others use small molecules. In eels, green biofluorescence was first identified in Anguilla japonica, where it is caused by a fatty acid binding protein (UnaG) whose fluorescence is induced by the addition of bilirubin. Members of this class of proteins were later discovered in Kaupichthys eels (Chlopsid FP I and Chlopsid FP II). Here, we report the discovery and characterization of the first member of this class of green fluorescent fatty acid binding proteins from the moray eel Gymnothorax zonipectis. This protein, GymFP, is 15.6 kDa and, upon addition of bilirubin, has an excitation maximum at 496 nm and an emission maximum at 532 nm. GymFP is 61% identical to UnaG and 47% identical to Chlopsid FP I. We describe de novo transcriptome assembly, protein expression, and fluorescence spectroscopic characterization of GymFP. These findings extend the fluorescent fatty acid binding proteins into a third family of true eels (Anguilliformes).


INTRODUCTION
Biofluorescence is widespread in the marine environment, being found in eels (Hayashi and Toda, 2009;Kumagai et al., 2013;Gruber et al., 2015;Krivoshik et al., 2020), sharks, and many other fish species (Sparks et al., 2014). Marine biofluorescence was first discovered in the 1960s by Osamu Shimomura in the jellyfish Aequorea victoria (Shimomura et al., 1962). The structure of green fluorescent protein (GFP) is a beta barrel with the fluorophore forming from a spontaneous cyclization reaction of three amino acids: Serine-Tyrosine-Glycine (Barondeau et al., 2005). Derivatives of GFP were later identified in corals (Matz et al., 1999;Salih et al., 2000;Field et al., 2006;Leutenegger et al., 2007).
In 2009, a green fluorescent protein (UnaG) was discovered in an eel, Anguilla japonica (Hayashi and Toda, 2009;Kumagai et al., 2013). UnaG was found to be a member of the fatty acid binding protein family (Kumagai et al., 2013). Structural characterization of UnaG revealed a beta barrel structure (Kumagai et al., 2013). Unlike GFP, UnaG did not fluoresce spontaneously; instead, it required the addition of bilirubin, a breakdown product of heme metabolism (Kumagai et al., 2013). Other Anguilla species were later discovered to have bilirubin inducible fluorescent fatty acid binding proteins in their transcriptome sequences (Funahashi et al., 2017). Our work led to the discovery of members of this class of proteins in two chlopsid eels: Kaupichthys hyoproroides and Kaupichthys n. sp. These proteins are called Chlopsid FP I and Chlopsid FP II, respectively (Gruber et al., 2015).
All fluorescent fatty acid binding proteins discovered to date have a Glycine-Proline-Proline (GPP) motif that resides on a loop that may act to protect bilirubin from solvent. This may be a reason for fluorescence in the eel fatty acid binding proteins (Gruber et al., 2015;Krivoshik et al., 2020). Work with Chlopsid FP I showed that mutation of the GPP motif to GGG significantly reduces quantum yield (Gruber et al., 2015), while deletion fully quenches fluorescence (Krivoshik et al., 2020).
Here, we report a new fluorescent fatty acid binding protein (GymFP) isolated from the widespread Indo-Pacific moray eel, Gymnothorax zonipectis. This discovery represents the third family of Anguilliformes (true eels), Muraenidae, from which FPs have been isolated and characterized. This is the first report of a fluorescent fatty acid binding protein from Muraenidae. Herein we report de novo transcriptome assembly of G. zonipectis, fluorescence characterization, and recombinant expression of GymFP.

Collection and Identification of Gymnothorax zonipectis
Specimens of G. zonipectis were collected during daylight SCUBA dives via the application of rotenone to a targeted shallow reef habitat (8-12 m) in Peava Lagoon, Western Province of the Solomon Islands (8.784222° S, 158.231345° E). Immediately after collection, the G. zonipectis specimen (AMNH 277097) was placed in a narrow photographic tank and held flat against a thin plate glass (Figures 1A,B). Fluorescent macro images were produced in a dark room by covering the flash with interference bandpass excitation filters (Omega Optical, Inc., Brattleboro, VT; Semrock, Inc., Rochester, NY) to elicit fluorescence. Longpass (LP) and bandpass (BP) emission filters (Semrock) were attached to the front of the camera lens to block the excitation light and record emitted fluorescence. The eel was stored in a liquid nitrogen dry shipper and transported back to the American Museum of Natural History, New York, where it was immediately stored at −80 °C. The specimen was then delivered to Baruch College.
Research and collection permits were obtained from the Ministry of Fisheries and Marine Resources (MFMR), and the Ministry of Environment, Climate Change, Disaster Management and Meteorology (MECDM), Honiara, Solomon Islands. Fishes were collected and handled in accordance with AMNH Institutional Animal Care and Use Committee (IACUC) and American Fisheries Society (AFS) guidelines, as established for the safe and humane care and handling of vertebrate animals. Fieldwork was carried out in collaboration with and permitted by the Solomon Islands MFMR and MECDM, and facilitated in the Solomon Islands by the Wildlife Conservation Society (WCS, New York; Munda, Western Province, Solomon Islands).
Gymnothorax zonipectis was identified using the following criteria: Overall body coloration brown. The posterior 2-3 laterosensory pores on the upper and lower jaws are enclosed in vertically oriented white bars that are continuous across the lower jaw. Distinct irregular dark brown marking with pale borders posterior to orbit. Body with oblique pattern of darkly pigmented broken/discontinuous vertical bars (in four longitudinal series) that become more pronounced caudally. Bars become more darkly pigmented/larger with bright white borders on posterior fins (Supplementary Figure 1).

Determination of Fluorescent Properties From Tissue Samples
G. zonipectis was dissected under a Lightools Research LT-9900 Illumatool Tunable Lighting System (Lightools Research) to ensure that the samples taken contained fluorescent tissue. Cross-sectional images of specimens were generated using a Zeiss-Axio Zoom V16 stereo fluorescent microscope affixed with a Nikon D4 camera. These samples were homogenized in 1X PBS using a BeadBug homogenizer (Benchmark Scientific) and centrifuged at 15,000 rcf for 10 min.

RNA Extraction and de novo Transcriptome Assembly
RNA extraction of the fluorescent tissue from G. zonipectis was performed using the RNeasy Fibrous Tissue Mini Kit from Qiagen. Fluorescent tissue in G. zonipectis was found below the skin of the eel. The location of the fluorescence is shown in a photo of the cross section of the eel in Figure 1E. A 30 mg piece of fluorescent tissue was cut using a scalpel for RNA extraction. The extracted RNA sample (28 µL of 75 ng/µL) was then sent to GENEWIZ, LLC (South Plainfield, NJ) for de novo transcriptome assembly and bioinformatic analysis, including a BLAST search to identify protein candidates.

Library Preparation With PolyA Selection and HiSeq Sequencing
RNA library preparation was performed by Genewiz, LLC. RNA was quantified using a Qubit 2.0 Fluorometer (Life Technologies, Carlsbad, CA, United States). RNA integrity measurement was completed using a TapeStation (Agilent Technologies, Palo Alto, CA, United States). The NEBNext Ultra RNA Library Prep Kit for Illumina was used for library preparation following the manufacturer's instructions (NEB, Ipswich, MA, United States). mRNAs were enriched with oligo d(T) beads. Enriched mRNAs were fragmented for 15 min at 94 °C. First and second strand cDNA were synthesized. cDNA fragments were end repaired and adenylated at 3′ ends, and universal adapters were ligated to cDNA fragments, followed by index addition and library enrichment by PCR with limited cycles.
Frontiers in Marine Science | www.frontiersin.org
Sequencing libraries were validated on the Agilent TapeStation (Agilent Technologies, Palo Alto, CA, United States) and quantified using a Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA) as well as by quantitative PCR (KAPA Biosystems, Wilmington, MA, United States).
After clustering the library on a flow cell, the flow cell was loaded on the Illumina HiSeq instrument (4000 or equivalent) according to the manufacturer's instructions. The sample was sequenced using a 2 × 150 bp Paired End (PE) configuration. Image analysis and base calling were conducted by the HiSeq Control Software (HCS). Raw sequence data (.bcl files) generated from the Illumina HiSeq were converted into fastq files and de-multiplexed using Illumina's bcl2fastq 2.17 software. One mismatch was allowed for index sequence identification.

Data Analysis
Data analysis was performed by GENEWIZ, United States. Sequence reads were trimmed to remove possible adapter sequences and nucleotides with poor quality (error rate < 0.01) using Trimmomatic v.0.36 (Bolger et al., 2014). The de novo assembler Trinity v2.5 (Grabherr et al., 2011) was then used on combined samples per species. One de novo assembled transcriptome was created with a minimum contig length of 200 bp. Transrate v1.0.3 (Smith-Unna et al., 2016) was used to generate statistics for the de novo assembled transcriptome. The EMBOSS tool getorf (Rice et al., 2000) was then used to find the open reading frames within the de novo assembled transcriptome. The de novo transcriptome assembly was then annotated using Diamond BLASTx against the NCBI NR database.
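As a rough illustration of the open reading frame step, the following Python sketch scans the three forward frames of a contig for ATG-to-stop ORFs and keeps those above a minimum length. It is not the getorf implementation, which has its own defaults and also handles the reverse strand; the function name find_orfs, the min_codons parameter, and the toy contig are assumptions made only for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, orf) tuples for ATG..stop ORFs in the forward frames.

    start and end are 0-based, end-exclusive, and the ORF includes its stop codon.
    min_codons counts every codon in the ORF, including the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i + 3 - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical toy contig, not taken from the G. zonipectis assembly.
toy_contig = "CCATGGGTCCACCTTAAGG"
print(find_orfs(toy_contig))
```

In a real workflow the ORFs would be translated and passed to the annotation step; here the point is only to show what "finding open reading frames" means operationally.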

Bacterial Expression of GymFP
Plasmids were ordered from Genscript United States in a pET−24b(+) vector with kanamycin resistance for expression in BL21(DE3) E. coli with an N-terminal 6x histidine tag. Following a chemical transformation, BL21(DE3) cells were plated on agar containing a 1:1,000 dilution of kanamycin (50 mg/mL). Single colonies were selected and grown overnight in 5 mL LB media containing 5 µL kanamycin at 37 °C in a shaking incubator. The resulting culture was transferred to 100 mL LB with 100 µL kanamycin and left to grow at 37 °C until reaching an OD600 of 0.4. A 1:1,000 dilution of IPTG (0.1 M) was then added, and the culture was left to grow for another 3 h, after which it was centrifuged at 3,000 rcf for 30 min.
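The working concentrations implied by these dilutions can be checked with a few lines of Python. This is a minimal sketch that assumes "1:1,000" means one part stock per 1,000 parts final volume and ignores the small volume contributed by the added stock; the helper names are hypothetical.

```python
def diluted_concentration(stock, dilution_factor):
    """Concentration after diluting a stock by the given factor (same units as stock)."""
    return stock / dilution_factor


def dilution_factor_from_volumes(added_ul, final_ml):
    """Approximate dilution factor when added_ul of stock goes into final_ml of medium."""
    return (final_ml * 1000.0) / added_ul


# Kanamycin stock: 50 mg/mL = 50,000 µg/mL, diluted 1:1,000.
kanamycin_ug_per_ml = diluted_concentration(50_000, 1000)

# IPTG stock: 0.1 M = 100 mM, diluted 1:1,000.
iptg_mM = diluted_concentration(100, 1000)

# 5 µL into 5 mL and 100 µL into 100 mL are both 1:1,000 dilutions.
overnight_factor = dilution_factor_from_volumes(5, 5)
expression_factor = dilution_factor_from_volumes(100, 100)

print(kanamycin_ug_per_ml, iptg_mM, overnight_factor, expression_factor)
```

Under these assumptions the cultures contain about 50 µg/mL kanamycin and are induced with about 0.1 mM IPTG.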
The pellets were then combined and resuspended into 5 mL of Tris-HCl (50 mM), NaCl (150 mM). Lysozyme (100 µL of 10 mg/mL) was added, and the sample was left at room temperature for 1 h. The sample was then centrifuged at 8,000 rcf for 30 min. The supernatant was purified using Nickel affinity chromatography. The elution buffer contained Tris-HCl (50 mM), NaCl (150 mM), and Imidazole (300 mM), pH 7.4. Protein A280 was measured using a Cary60 UV-Vis spectrophotometer. The extinction coefficient was calculated using the ExPASy ProtParam tool (Artimo et al., 2012) to be 21,430 M⁻¹ cm⁻¹. Bilirubin was dissolved in NaOH and diluted to 10 µM in 1X PBS (1.19 mM phosphates, 13.7 mM sodium chloride, 270 µM potassium chloride), pH 7.3, for further use in fluorescence assays.
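Converting an A280 reading into a protein concentration follows the Beer-Lambert law, c = A / (ε · l). The sketch below applies it with the extinction coefficient given above; the 1 cm path length and the example absorbance of 0.045 are assumptions chosen for illustration, not measured values from this study.

```python
EPSILON_280 = 21_430  # M^-1 cm^-1, the ProtParam value reported in the text


def molar_concentration(a280, epsilon=EPSILON_280, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = absorbance / (epsilon * path length)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


# Hypothetical absorbance reading, used only to show the arithmetic.
example_a280 = 0.045
conc_uM = molar_concentration(example_a280) * 1e6
print(f"{conc_uM:.2f} µM")
```

With these assumed inputs the result is close to 2.1 µM, the same order as the protein concentration used in the fluorescence assays.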

Fluorescence Analysis of GymFP
A Hitachi F-7000 Fluorimeter was used to collect fluorescence spectra. Spectra were recorded for a 1:1 complex of bilirubin and purified GymFP (2.1 µM each).

Properties of Endogenous Gymnothorax zonipectis Fluorescence
Fluorescence is visible in the tissue of Gymnothorax zonipectis along the entire organism and is localized below the skin (Figure 1). Solubilization of fluorescent tissue from G. zonipectis was completed in 1X PBS (1.19 mM phosphates, 13.7 mM sodium chloride, 270 µM potassium chloride, pH 7.3). Following centrifugation, the fluorescent protein remained soluble in the supernatant. Boiling the homogenized samples fully quenched fluorescence.

Alignments and Phylogenetic Tree
Sequences of GymFP, Chlopsid FP I, and UnaG were aligned to identify homologous residues (Figure 2A). Residues in purple are identical between the respective sequences. The GPP sequence motif is shown in green. GymFP is 61% identical to UnaG and 47% identical to Chlopsid FP I, according to the SIM Alignment tool on ExPASy (Huang and Miller, 1991). Alignments and phylogenetic analysis of fluorescent and non-fluorescent FABPs are included in Supplementary Figure 2. The tree illustrates not only the relation of the FPs to brain FABPs, but also that GymFP is phylogenetically distinct from the Anguilla and Kaupichthys FPs.
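For readers unfamiliar with how a percent identity figure is derived from a pairwise alignment, the Python sketch below counts identical columns in two already-aligned sequences. It assumes identity is reported over all aligned columns, including gapped ones; the exact convention used by the SIM tool is not restated here and may differ, and the two short sequences are hypothetical rather than fragments of GymFP or UnaG.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same residue.

    Inputs are two aligned strings of equal length using '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments, for illustration only.
toy_a = "MVEKF-GPPLK"
toy_b = "MVDKFAGPPIK"
print(round(percent_identity(toy_a, toy_b), 1))
```

Different tools choose different denominators (alignment length, shorter sequence length, or ungapped columns only), so identity values are best compared when they come from the same tool and settings.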

Properties of GymFP
GymFP is 139 amino acids in length and has a calculated molecular weight of 15.6 kDa, similar to UnaG (Hayashi and Toda, 2009;Kumagai et al., 2013) and Chlopsid FP (Gruber et al., 2015;Krivoshik et al., 2020). Bilirubin-bound GymFP has an excitation maximum at 496 nm and a peak emission at 532 nm (Figure 3). The apo protein does not fluoresce. The EX/EM maxima of GymFP (496/532 nm) are similar to those of UnaG (497/532 nm), whereas Chlopsid FP I has EX/EM maxima of 488/525 nm (Figure 4).

DISCUSSION
Fluorescent proteins have now been reported and characterized in three anguilliform (true eels) families and genera, Anguillidae (Anguilla; Hayashi and Toda, 2009;Kumagai et al., 2013), Chlopsidae (false morays: Kaupichthys; Gruber et al., 2015;Krivoshik et al., 2020), and in this report, Muraenidae (morays: Gymnothorax). Interestingly, we observed that G. zonipectis has a different distribution of fluorescence in the tissue. In Anguilla and Kaupichthys fluorescence was observed throughout the musculature of the organism (Hayashi and Toda, 2009;Kumagai et al., 2013;Gruber et al., 2015). However, in G. zonipectis it was only found in a thin layer below the outer layer of the skin. The implications of this difference are not yet known.
Members of the genus Anguilla are catadromous and are known to undergo vast migrations (Tsukamoto, 2006) between habitats in freshwater and tropical and subtropical open ocean water. In Anguillidae, it has been hypothesized that fluorescence may have antioxidant properties in the muscle tissue, resulting from bilirubin binding (Kumagai et al., 2013). All members of Chlopsidae are marine, as are nearly all species of Muraenidae. In Chlopsidae, fluorescence has been proposed to potentially serve a visual function during full moon spawning. Previous studies have shown that these eels synchronize reproduction and spawning with the lunar cycle (Lee et al., 2008). In the two species of chlopsid eels studied to date, Kaupichthys hyoproroides and Kaupichthys n. sp., two FPs were discovered (Chlopsid FP I and II) that are 94% homologous to each other (Gruber et al., 2015).
Previous work on muraenids has demonstrated that the undulated moray G. undulatus possesses rod visual pigments that allow them to detect wavelengths of 495 nm (Munz and McFarland, 1973), which suggests that Gymnothorax eels are capable of green-light detection. Additional studies of other moray species reveal that Gymnothorax favagineus has a rod at 487 ± 5.4 nm and single cone 501 ± 7.7 nm and Gymnothorax reticularis has a rod at 486 ± 4.0 nm and a single cone at 494 ± 5.8 nm (Wang et al., 2011). There is relatively sparse behavioral data associated with G. zonipectis and hence, at the moment, there is no clear indication if fluorescence plays a role in the eels' visual ecology. Nocturnal feeding habits have been observed for individuals of this species (Böhlke and Randall, 2000), and while a Bronsonian knot has not been observed in G. zonipectis, it has been observed in other members of Gymnothorax (Kondo, 1955;Barley et al., 2016). Additional species of Gymnothorax have been shown to hunt collaboratively with groupers, including Plectropomus pessuliferus masrubri and the coral trout Plectropomus leopardus (Vail et al., 2013).
A transcriptome of G. zonipectis was completed and is now available under BioProject PRJNA718586. A total of 273,073,027 reads were acquired and run through a BLAST search by GENEWIZ, LLC (South Plainfield, NJ) to generate a list of sequences. To search the hits from the transcriptome assembly, we looked for fatty acid binding proteins that contained a GPP sequence insert. We found one transcript matching this criterion and synthesized the corresponding gene for protein expression. A number of brain fatty acid-binding proteins were also found; none of them contained GPP at residues 58-60, and they were therefore not analyzed any further. The absence of the insert from these proteins further supports its importance in eel fluorescence. In previous work with Chlopsid FP, we had synthesized and expressed homologous fatty acid binding protein analogs that did not contain GPP, and they were found to be non-fluorescent.
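The screening criterion described above, a GPP motif at residues 58-60 of a candidate fatty acid binding protein, can be expressed as a short filter. The Python sketch below is an illustration under stated assumptions: residues are numbered from 1, the motif position is fixed rather than derived from an alignment, and the candidate names and sequences are hypothetical placeholders, not hits from the G. zonipectis transcriptome.

```python
def has_motif_at(seq, start=58, motif="GPP"):
    """True if motif occupies residues start..start+len(motif)-1 (1-based numbering)."""
    begin = start - 1  # convert 1-based residue number to 0-based index
    return seq[begin:begin + len(motif)] == motif


def screen_candidates(candidates, start=58, motif="GPP"):
    """Return the names of candidate proteins carrying the motif at the given position."""
    return [
        name for name, seq in candidates.items() if has_motif_at(seq, start, motif)
    ]


# Hypothetical placeholder sequences (poly-alanine backbones), for illustration only.
toy_candidates = {
    "candidate_with_insert": "A" * 57 + "GPP" + "A" * 20,
    "candidate_without_insert": "A" * 80,
    "candidate_shifted_motif": "A" * 10 + "GPP" + "A" * 67,
}
print(screen_candidates(toy_candidates))
```

A fixed-position check is deliberately strict; in practice the motif position would more safely be read from an alignment against UnaG or Chlopsid FP I, since insertions or deletions upstream would shift the numbering.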
The protein sequence discovered in G. zonipectis, which we refer to as GymFP, shares 61.0% identity with UnaG, isolated from Anguilla japonica, and 46.7% identity to Chlopsid FP I, isolated from Kaupichthys hyoproroides. Further characterization of this FP has uncovered many similarities with both UnaG and Chlopsid FP I, including the emission of green light and the use of bilirubin as a ligand. It is significant that this third eel FP also possesses a GPP motif. The consistency of this motif in all fluorescent fatty acid binding proteins is also observed in the grouping in the phylogenetic tree. We generated a phylogenetic tree based on 14 fatty acid binding proteins from fluorescent eels, non-fluorescent fish, and humans. The fluorescent sequences formed their own clade (Supplementary Figure 2A). This is consistent with other trees that have been previously published (Gruber et al., 2015;Funahashi et al., 2017). We also note that the Anguilla and Gymnothorax fluorescent fatty acid binding protein groups are sister, although this is not highly supported (36%).
We used Swiss Model (Guex et al., 2009;Bienert et al., 2017;Bertoni et al., 2017;Waterhouse et al., 2018;Studer et al., 2020) to generate a model of GymFP (Figure 2B). The template for the model was pdb code 4I3B, which is the WT structure of UnaG (Kumagai et al., 2013). The model shows the expected beta barrel structure. The GPP motif is highlighted in green and is in the same position found in the other fluorescent eel fatty acid binding proteins. In purple, we highlight those areas that are the same in UnaG, Chlopsid FP I, and GymFP. It should be noted that, to date, we have not been able to determine sequence motifs responsible for shifting fluorescence excitation or emission (Krivoshik et al., 2020).
The lipocalin family of proteins, which includes fatty acid binding proteins and apolipoprotein D, contains other proteins that are able to use heme breakdown products to produce fluorescence. Sandercyanin, found in Sander vitreus (previously Stizostedion vitreum), is homologous to apolipoprotein D and is able to use biliverdin to produce red fluorescence (Ghosh et al., 2016).
In closing, this report provides the characterization of another member of the eel fluorescent fatty acid binding protein family. This is the first report of fluorescent protein characterization from an eel from the family Muraenidae.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/, PRJNA718586.

ETHICS STATEMENT
This study was carried out in strict accordance with the recommendations in the Guidelines for the Use of Fishes in Research of the American Fisheries Society and the American Museum of Natural History's Institutional Animal Care and Use Committee (IACUC).

AUTHOR CONTRIBUTIONS
AG, DG, and JG conceptualized the project and wrote the manuscript. DG and JS were involved in the collection of specimens. JG obtained funding for this research. AG, SK, and JG performed the experiments and calculations. All authors reviewed and revised the manuscript.

ACKNOWLEDGMENTS
We thank Alec Hughes (WCS) for logistical support and help obtaining permits, and the crew of the A.R. Ford (Honiara) for considerable support in the field. We thank Corey Howell and The Wilderness Lodge for logistical and SCUBA diving assistance around Gatokae Island, and Robert Schelly and Tate Sparks for assistance with field collections and fluorescent photography. We would also like to thank Prof. Mercer Brugler for his help with phylogenetic tree preparation. We thank Genewiz, LLC for the RNA library preparation, transcriptome assembly, and bioinformatics work.