The Metabolome of a Cyanobacterial Bloom Visualized by MS/MS-Based Molecular Networking Reveals New Neurotoxic Smenamide Analogs (C, D, and E)

Members of the cyanobacterial genus Trichodesmium are well known for their substantial impact on nitrogen influx in ocean ecosystems and the enormous surface blooms they form in tropical and subtropical locations. However, the secondary metabolite composition of these complex environmental bloom events is not well known, nor is the possibility that they produce potent toxins of the kind observed in other bloom-forming marine and freshwater cyanobacteria species. In the present work, we aimed to characterize the metabolome of a Trichodesmium bloom utilizing MS/MS-based molecular networking. Furthermore, we integrated cytotoxicity assays in order to identify and ultimately isolate potential cyanotoxins from the bloom. These efforts led to the isolation and identification of several members of the smenamide family, including three new smenamide analogs (1-3) as well as the previously reported smenothiazole A, all of which are hybrid polyketide-peptide compounds. Two of these new smenamides (1 and 3) were cytotoxic to neuro-2A cells, and their presence raises further questions as to their potential ecological roles. HPLC profiling and molecular networking of chromatography fractions from the bloom revealed an elaborate secondary metabolome, generating hypotheses with respect to the environmental role of these metabolites and the consistency of this chemical composition across genera, space and time.


INTRODUCTION
Blooms of toxin-producing cyanobacteria (harmful algal blooms, HABs) continue to be a threat to water resources in the U.S. and across the globe (Carmichael and Boyer, 2016). Research surrounding these bloom events with respect to cyanobacteria has generally focused on freshwater planktonic species and a suite of well-characterized toxins, including the anatoxins, saxitoxins and microcystins (Bláha et al., 2009). However, species of cyanobacteria in the marine realm have been a prolific source of exquisitely potent cytotoxic metabolites (Luesch et al., 2001; Taori et al., 2008; Pereira et al., 2012). Members of the bloom-forming genus Trichodesmium are an understudied group of marine cyanobacteria with respect to toxin production and environmental impact. With respect to new natural products, the cyclic peptide trichamide was characterized from a cultured strain of Trichodesmium erythraeum, although no significant cytotoxicity was observed against HCT-116 cells and CEM-TART cells when tested at 10 and 50 µg/mL, respectively (Sudek et al., 2006). The lipoamides credneramides A and B were isolated and characterized from a field-collected benthic cyanobacterium identified as a new species of Trichodesmium (Malloy et al., 2012). These metabolites inhibited spontaneous calcium oscillations in murine cerebrocortical neurons (Malloy et al., 2012). Several known cyanotoxins, such as anatoxin, saxitoxin, microcystins and aplysiatoxins, have been reported from Trichodesmium blooms collected from distinct geographic areas (Ramos et al., 2005; Detoni et al., 2016; Shunmugam et al., 2017).
In the current report, we detail the comprehensive metabolic profiling of a Trichodesmium bloom collected from the western Gulf of Mexico utilizing MS/MS-based molecular networking (Watrous et al., 2012; Yang et al., 2013; Wang et al., 2016) and cytotoxicity assays. In our previous work on this Trichodesmium bloom, we have utilized cytotoxicity assays, NMR-guided isolation and MS-guided isolation independently to characterize chlorinated polyketides and hybrid polyketide peptides (Bertin et al., 2016, 2017a; Belisle et al., 2017). The current report attempts to describe the Trichodesmium bloom metabolome more completely, focusing on a networking tool to cluster molecules based on similarities in the MS/MS fragmentation patterns (Watrous et al., 2012). Our efforts ultimately led to the isolation and characterization of three new members of the smenamide family of molecules (1-3) and the previously reported smenothiazole A (Figure 1). Smenamides C and E (1 and 3) demonstrated potent neurotoxicity.

General Experimental Procedures
Optical rotations were measured using a Jasco P-2000 polarimeter. UV spectra were measured using a Beckman Coulter DU-800 spectrophotometer. CD spectra were recorded using a Jasco J-1100 CD spectrometer. NMR spectra were collected using a Bruker 800 MHz NMR instrument. Additional NMR spectra were recorded on a Varian 500 MHz NMR instrument. HRESIMS analysis was performed using an AB SCIEX TripleTOF 4600 mass spectrometer with Analyst TF software. LC-MS/MS analysis was carried out using a ThermoFinnigan LCQ AdvantageMax mass spectrometer with an electrospray ionization (ESI) source. Semi-preparative HPLC was carried out using a Dionex UltiMate 3000 HPLC system and Agilent 1100 series system each equipped with a micro vacuum degasser, an autosampler and a diode-array detector.

Collection, Extraction and Fractionation of Bloom Material
Samples from a localized bloom of Trichodesmium were collected from Padre Island, Corpus Christi, TX during 9-11 May 2014 as described previously (Bertin et al., 2016, 2017a; Belisle et al., 2017). Briefly, bloom material was collected in 5-gallon buckets from ca. 0.5-meter water depth and concentrated by gentle filtration through an 18 µm mesh screen. A subsample of the cell mass was examined microscopically and identified using Komárek and Anagnostidis (2005) as being dominated by cyanobacteria of the genus Trichodesmium. The material was frozen and shipped for further chemical analysis. The biomass (ca. 14 g dry weight) was repeatedly extracted with 2:1 CH2Cl2:CH3OH and the extracts were combined and evaporated under reduced pressure (3.95 g). The extract was reconstituted in hexanes and applied to silica gel (300 mL) in a wide fritted column with a vacuum attachment. The extract was fractionated using a stepped gradient from 100% hexanes to 100% CH3OH, resulting in nine fractions. Seven of the nine fractions (C-I) were further analyzed by means of cytotoxicity assays and MS/MS-based molecular networking. The first two fractions, 100% hexanes (A) and 90% hexanes in EtOAc (B), were intended to remove hydrocarbons and exceedingly lipophilic substances from the sample and were not analyzed further.

Molecular Networking
Fractions C-I were subjected to LC-MS/MS analysis with data collection in data-dependent acquisition mode on a ThermoFinnigan LCQ AdvantageMax mass spectrometer with an electrospray ionization (ESI) source. A Kinetex 5 µm C18 column (100 × 4.6 mm) was used for separation of analytes. The LC method consisted of a linear gradient from 30 to 99% CH3CN in water + 0.1% formic acid over 17 minutes, followed by an isocratic period at 99% CH3CN of 3 minutes. The flow rate was held at 0.6 mL/min. The MS spray voltage was 5 kV with a capillary temperature of 400 °C. For the MS/MS component, the CID isolation width was 2.0 and the collision energy was 35.0 eV. The raw data files were converted to mzXML format using MSConvert from the ProteoWizard suite (http://proteowizard.sourceforge.net/tools.shtml). The molecular network was generated using the online platform at the Global Natural Products Social Molecular Networking (GNPS) website (gnps.ucsd.edu) using parameters detailed in Table S5. The network was visualized using the Browser Network Visualizer tool available on the GNPS website.
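To illustrate the principle behind clustering spectra by fragmentation similarity, the following Python sketch computes a simple cosine score between two centroided MS/MS spectra. It is a simplified illustration and not the GNPS implementation; the fragment tolerance `tol`, the square-root intensity scaling, the greedy peak matching, and the function name `cosine_score` are all assumptions made here, and the actual networking parameters are those listed in Table S5.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.5):
    """Greedy cosine similarity between two centroided MS/MS spectra.

    Each spectrum is a list of (m/z, intensity) pairs. Two peaks can be
    matched when their m/z values differ by at most `tol` (assumed value).
    """
    # Square-root scaling damps very intense peaks (an assumed choice).
    a = [(mz, math.sqrt(inten)) for mz, inten in spec_a]
    b = [(mz, math.sqrt(inten)) for mz, inten in spec_b]
    norm_a = math.sqrt(sum(i * i for _, i in a))
    norm_b = math.sqrt(sum(i * i for _, i in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    # All candidate peak pairs within tolerance, best product first.
    pairs = sorted(
        (
            (ia * ib, x, y)
            for x, (mza, ia) in enumerate(a)
            for y, (mzb, ib) in enumerate(b)
            if abs(mza - mzb) <= tol
        ),
        reverse=True,
    )
    used_a, used_b, dot = set(), set(), 0.0
    for product, x, y in pairs:
        # Each peak may contribute to at most one matched pair.
        if x not in used_a and y not in used_b:
            used_a.add(x)
            used_b.add(y)
            dot += product
    return dot / (norm_a * norm_b)


# Hypothetical spectra for illustration only (not data from this study).
spectrum_1 = [(100.0, 50.0), (200.0, 100.0), (300.0, 25.0)]
spectrum_2 = [(100.1, 40.0), (200.0, 90.0), (350.0, 30.0)]
score = cosine_score(spectrum_1, spectrum_2)
```

In a network, pairs of spectra whose score exceeds a chosen threshold are connected by an edge, and connected groups of nodes form the clusters discussed in this report.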

Isolation of 1-3 and Smenothiazole A
Fractions G (100% EtOAc, 104.0 mg) and H (75% EtOAc in CH3OH, 286.9 mg) were chosen for further purification based on the quantity of network ions in these fractions, the molecular features of these ions (ratio of the M+ and M+2 isotope peaks), and cytotoxicity results of the mixed chromatography fractions. Fractions G and H were combined based on similarities in LC-MS/MS profiles and 1H NMR resonances. The combined sample was further fractionated over a 2 g C18 SPE column eluting with 50% water in CH3CN (13.7 mg), 100% CH3CN (143.2 mg), 100% CH3OH (74.5 mg) and 100% EtOAc (54.5 mg). The fraction eluting with 100% CH3CN was subjected to reversed phase HPLC using a YMC 5 µm ODS column (250 × 10 mm); mobile phase: 65% CH3CN/35% water with 0.05% formic acid added to each solvent, flow 3 mL/min. Fractions were collected based on UV characteristics and HPLC fractions were analyzed by HRESIMS for ions of interest from the molecular network. Further purification using the YMC column mentioned above (mobile phase: 80% CH3CN in water with 0.05% formic acid added to each solvent, flow 3 mL/min) resulted in the isolation of 7.0 mg of 1 (tR, 11.5 min). A mobile phase of 65% CH3CN in water with 0.05% formic acid added to each solvent, flow 3 mL/min, was used to isolate 0.6 mg of 2 (tR, 26.0 min) and 0.3 mg of 3 (tR, 21.2 min). A final purification was carried out using a YMC 5 µm ODS column (250 × 10 mm); mobile phase: 80% CH3CN in water with 0.1% formic acid added to each solvent, flow 3 mL/min; 2.0 mg of smenothiazole A was isolated (tR, 5.0 min).
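The M+ to M+2 isotope ratio used above as a selection criterion can be illustrated with a short calculation. The following Python sketch is not part of the original workflow. It assumes natural chlorine abundances of approximately 75.76% for 35Cl and 24.24% for 37Cl, and it ignores M+2 contributions from other elements (for example 13C2 or 18O), so it is only a first approximation. The function names are hypothetical.

```python
from math import comb

# Approximate natural isotopic abundances of chlorine (assumed values).
P35 = 0.7576
P37 = 0.2424


def chlorine_pattern(n_cl):
    """Relative intensities of M, M+2, M+4, ... from chlorine alone (M = 1)."""
    raw = [comb(n_cl, k) * P37**k * P35 ** (n_cl - k) for k in range(n_cl + 1)]
    return [x / raw[0] for x in raw]


def expected_m2_ratio(n_cl):
    """Expected (M+2)/M intensity ratio for a molecule with n_cl chlorines."""
    pattern = chlorine_pattern(n_cl)
    return pattern[1] if len(pattern) > 1 else 0.0


def estimate_chlorines(observed_ratio, max_cl=4):
    """Chlorine count whose expected (M+2)/M ratio is closest to the observed one."""
    return min(
        range(max_cl + 1),
        key=lambda n: abs(expected_m2_ratio(n) - observed_ratio),
    )
```

Under these assumptions a single chlorine atom gives an M+2 peak at roughly one third the intensity of the M+ peak, which is the pattern used to flag the ions in Cluster 2.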

Cytotoxicity Assay
Neuro-2A cells and HCT-116 cells were added to assay plates in 100 µL of Eagle's Minimum Essential Media (EMEM) or 100 µL of McCoy's 5A media, respectively, each supplemented with 10% FBS, at a density of 5,000 cells/well. Cells were incubated overnight (37 °C, 5% CO2) and examined microscopically to confirm confluence and adherence. Mixed chromatography fractions (C-I) were dissolved in DMSO (1% v/v) and tested at concentrations of 40 and 4 µg/mL with 10 µM doxorubicin used as a positive control. Compounds 1-3 were dissolved in DMSO (1% v/v) and added to the cells in the range of 100 to 0.1 µM in order to generate EC50 curves. Four technical replicates were prepared for each concentration and each assay was performed in triplicate. Doxorubicin was used as a positive control (EC50: 0.043 ± 0.032 µM for neuro-2A cells; EC50: 0.071 ± 0.004 µM for HCT-116 cells) and DMSO (1% v/v) was used as a negative control. Assays were resolved as previously reported (Bertin et al., 2017b) and EC50 curves were generated using GraphPad Prism software.
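For readers unfamiliar with EC50 curves, the following Python sketch shows a four-parameter logistic dose-response model of the kind commonly fitted to such data. The text above states only that curves were generated in GraphPad Prism, so the choice of this particular model, the number of concentrations in the series, and the function names are assumptions made for illustration.

```python
def four_pl(conc, bottom, top, ec50, hill):
    """Four-parameter logistic response for a decreasing dose-response curve.

    Returns `top` at very low concentration, `bottom` at very high
    concentration, and the midpoint between them when conc == ec50.
    """
    return bottom + (top - bottom) / (1.0 + (conc / ec50) ** hill)


def log_spaced_series(high, low, n_points):
    """Concentrations evenly spaced on a log scale from `high` down to `low`."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    step = (low / high) ** (1.0 / (n_points - 1))
    return [high * step**i for i in range(n_points)]


# Hypothetical example: four concentrations spanning 100 to 0.1 µM and
# made-up curve parameters (not values measured in this study).
concentrations = log_spaced_series(100.0, 0.1, 4)
responses = [four_pl(c, bottom=0.0, top=100.0, ec50=5.0, hill=1.0)
             for c in concentrations]
```

In practice the four parameters are estimated by nonlinear regression against the replicate viability measurements, and the fitted `ec50` is reported with its uncertainty.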

Trichodesmium Bloom: Cytotoxicity of Chromatography Fractions and Molecular Network
Several of the chromatography fractions (D-H) derived from the bloom material showed strong cytotoxicity against neuro-2A cells at 40 µg/mL (Figure S1). Fraction D showed the greatest potency at 4 µg/mL. Examination of the molecular network showed that compounds from Cluster 3 were major ions in fraction D (Figure 2). However, these metabolites were not isolable following further purification procedures. The majority of the metabolites in the molecular network were found in fractions F-I. We identified Cluster 2 as a molecular cluster of interest due to the number of ions in the cluster and the M+ and M+2 ratio indicating a single chlorine atom in each of these metabolites (cf. Figures 2, 3). Additionally, fractions G and H showed potent cytotoxicity against neuro-2A cells; thus, our subsequent purification efforts centered on these two fractions. HPLC analysis indicated abundant metabolites in the combined G+H HPLC pre-fraction (Figure S2) and repeated chromatography resulted in the isolation of 1-3 as optically active colorless oils.
[...] cyanobacteria metabolites with methylated tertiary amides such as smenamides A and B and kalkitoxin (Wu et al., 2000; Teta et al., 2013). The split signals were determined to be the result of two conformers in the E and Z configuration at the tertiary amide functionality in 1. This phenomenon was observed for all three new metabolites (1-3); in the structure characterization for each of these compounds, the data for the Z conformer are discussed. NMR data tables in the Supporting Information provide information on the E conformer. While the multiple conformers presented difficulties in NMR interpretation, three partial structures of 1 (a-c) were characterized initially based on 1H-1H COSY spin systems followed by HMBC correlation analysis (Figure 4).
In the first partial structure (a), a moderately deshielded diastereotopic methylene group (H-20a, δH 2.12; H-20b, δH 2.07) was correlated by COSY to a second methylene group (H2-21, δH 1.56) which itself was correlated by COSY to a third methylene group (H2-22, δH 3.26). This latter deshielded methylene was correlated by HMBC to C-23 (δC 35.9) and the C-24 carbonyl (δC 169.9). The singlet methyl (H3-25, δH 1.97) showed an HMBC correlation to C-24 and characterized the western half of 1 with an N-methyl acetamide functionality. In the second partial structure of 1 [...] showed HMBC correlations to a quaternary carbon (C-11, δC 131.7) and the C-10 carbonyl (δC 170.6), extending the polyketide chain of 1. The C-11-C-13 olefin was assigned E geometry based on the 13C chemical shift of C-12 (δC 13.7) compared to δC 20.1 for the Z geometry (see below). The two sets of moderately deshielded methylenes (H2-20 and H2-17) showed HMBC correlations to the quaternary carbon at C-18 (δC 142.8). A deshielded methine singlet (H-19, δH 6.04) also showed an HMBC correlation to C-18, supporting an exomethylene vinyl chloride bridge connecting partial structures a and b. The configuration of the vinyl chloride was assigned as Z based on NOE correlations from H-19 to H2-17 and H2-16. The chemical shift of C-10 was consistent with that of an amide functionality and COSY correlations from H-4 to H-8 supported the assignment of a leucine residue in the third partial structure. However, the chemical shift at C-3 was somewhat deshielded for that of a standard amide or ester carbonyl (δC 180.3). An O-methyl singlet (H3-9, δH 3.87) was correlated to C-3 by HMBC, supporting the presence of a methoxy functionality. Additionally, H-2 (δH 5.28) was correlated to C-3 and the C-1 carbonyl by HMBC.
HMBC correlations from H-2 and H-4 to C-10 connected the third partial structure to the remainder of the molecule, establishing an isobutyl-methoxypyrrolinone moiety and satisfying the final three degrees of unsaturation required by the molecular formula. The structure of 1 was established as a highly functionalized linear polyketide-peptide of the smenamide family (Teta et al., 2013). While the correlations and chemical shifts described above relate to the Z conformer of 1, NMR data for the E conformer were also analyzed, and are listed in Table S1. The absolute configuration of 1 (4S, 14R) was determined to be identical to that of smenamide A by comparison of the CD spectrum of 1 to that of naturally occurring smenamide A (Caso et al., 2017). The spectra were nearly identical in sign and magnitude.
HRESIMS analysis of 2 gave an [M+H]+ of m/z 467.2693, suggesting a molecular formula of C25H39N2O4Cl, identical to that of 1. Examination of 1H NMR, multiplicity-edited HSQC, and HMBC spectra of 1 and 2 showed that the two molecules were nearly identical (cf. Tables S2 and S3). 13C NMR differences were most pronounced at C-12 (δC 13.7 in 1; δC 20.1 in 2) and C-13 (δC 142.7 in 1; δC 135.9 in 2). These chemical shifts and the NOE correlations between H3-12 and H-13 in 2 supported the Z configuration of the C-11-C-13 olefin in 2 and established 2 as a geometric isomer of 1. The absolute stereochemistry of 2 is proposed to be identical to that of 1 based on similarity in optical rotation values.
HRESIMS analysis of 3 gave an [M+H]+ of m/z 499.2935, suggesting a molecular formula of C26H43N2O5Cl and a requirement of 6 degrees of unsaturation. The examination of 1H and 13C NMR data and the placement of m/z 499 in the same molecular network cluster as smenamides C and D (1 and 2) suggested that 3 was a close structural analog. The reduction in degrees of unsaturation in 3 compared to 1 was due to the presence of a secondary alcohol at C-13 (H-13, δH 3.73; C-13, δC 74.3) in 3 and a methine at C-11 (H-11, δH 3.95; C-11, δC 42.7). The H-11 and H-13 methine protons were correlated by COSY and H-13 also showed a COSY correlation to H-14 (δH 1.48). Additionally, the C-25 methyl resonance of the acetyl group in 1 and 2 was not present in 3. COSY correlations between a methylene at H-25 (δH 2.27) and a methyl triplet at H-26 (δH 0.97) supported an N-methylpropanamide functionality in 3 and completed the planar structure of smenamide E (3). The secondary alcohol of 3 was resistant to acylation with Mosher's acid chloride and the configuration of this position could not be determined by chemical derivative formation. Therefore, we report only the planar structure for this new metabolite.
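The relationship between the proposed molecular formulas, the expected [M+H]+ values, and the degrees of unsaturation can be checked with a short calculation. The following Python sketch is an illustration rather than part of the reported analysis; the monoisotopic masses are standard values rounded to eight decimals, only the five elements needed here are tabulated, and the function names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; limited to the
# elements that occur in the formulas discussed here.
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Cl": 34.96885268,
}
PROTON = 1.00727647  # mass of H minus one electron


def parse_formula(formula):
    """Return element counts for a simple formula such as 'C25H39N2O4Cl'."""
    counts = {}
    for elem, num in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[elem] = counts.get(elem, 0) + int(num or 1)
    return counts


def mz_protonated(formula):
    """Calculated monoisotopic m/z of the singly charged [M+H]+ ion."""
    counts = parse_formula(formula)
    return sum(MONO[e] * n for e, n in counts.items()) + PROTON


def degrees_of_unsaturation(formula):
    """Rings plus pi bonds: C - (H + halogens)/2 + N/2 + 1."""
    c = parse_formula(formula)
    return c.get("C", 0) - (c.get("H", 0) + c.get("Cl", 0)) / 2 + c.get("N", 0) / 2 + 1


def ppm_error(observed, calculated):
    """Mass error of an observed m/z relative to the calculated value, in ppm."""
    return (observed - calculated) / calculated * 1e6


calc_2 = mz_protonated("C25H39N2O4Cl")
calc_3 = mz_protonated("C26H43N2O5Cl")
error_2 = ppm_error(467.2693, calc_2)
error_3 = ppm_error(499.2935, calc_3)
```

With these masses the calculated [M+H]+ values are approximately m/z 467.267 for C25H39N2O4Cl and m/z 499.293 for C26H43N2O5Cl, and the formulas correspond to 7 and 6 degrees of unsaturation, respectively, in line with the loss of one degree of unsaturation in 3 described above.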
During the attempt to isolate the compound with an m/z 530 from Cluster 2 in the network (Figure 3), we isolated a peptidic compound with spectrometric and spectroscopic characteristics consistent with that of the previously reported cytotoxin smenothiazole A (Esposito et al., 2015). Analysis of NMR data, optical rotation value, and CD spectra (negative Cotton effect at 234 nm, Figure S35) confirmed its identity.

DISCUSSION
In our previous work on Trichodesmium blooms and their natural products, we have utilized cytotoxicity assays, NMR-guided isolation, and MS-guided isolation (Bertin et al., 2016, 2017a; Belisle et al., 2017). We have previously characterized the cytotoxic polyketide trichophycin A, the polyketides trichotoxins A and B, and the moderately cytotoxic polyketide-peptide trichothiazole (Bertin et al., 2016, 2017b; Belisle et al., 2017). In the current network, we did observe an m/z value consistent with trichothiazole (Figure 2). The node was in Cluster 1 (m/z 342) and showed an identical MS/MS fragmentation pattern to that of trichothiazole. However, we did not observe nodes for trichophycin A or trichotoxins A and B. It should be noted that the macrocyclic polyketide-peptides tricholides A and B and unnarmicin D (Bertin et al., 2017a) also clustered in the network (Figure 2, Cluster 4). It may be that some metabolites in the bloom metabolome do not ionize well by ESI+ or do not give informative fragments during MS/MS acquisition; this represents a limitation in the implementation of MS/MS-based networking to describe bloom metabolomes. Thus, the metabolite information gained in the network may be somewhat biased toward peptides and hybrid polyketide-peptides. Nevertheless, taking into account the limitations of this approach, molecular networking was a remarkable tool for visualizing a complex metabolome rich in metabolites with intriguing structural elements and cytotoxicity to neuro-2A cells. Analyzing fractions C-I using the networking procedure identified 93 nodes that were members of 13 clusters. This approach allowed us to isolate and characterize three new members of the smenamide family (1-3). Furthermore, within the smenamide cluster, we tentatively identified the known compound smenamide A or B (double bond isomers of each other at m/z 501).
This latter node in Cluster 2 (Figure 3) showed an identical MS/MS fragmentation pattern to that of smenamide A/B from published data and the HRESIMS analysis supported this identification (Figure S36) (Teta et al., 2013). Both of these known metabolites are very potent cytotoxins with IC50 values around 50 nM against Calu-1 cells (Teta et al., 2013). Smenamides C and E (1 and 3) were less potent cytotoxins than smenamides A and B, possibly due to the replacement of the phenylalanine amino acid unit with leucine. Intriguingly, smenamide D (2) was not cytotoxic to either the neuro-2A or the HCT-116 cell line, and we speculate that the cis configuration in the middle of the polyketide chain may affect binding of 2 to its molecular target. Smenothiazole A showed cytotoxicity at nanomolar levels against multiple cell lines and was previously isolated from a marine sponge; however, the authors indicate a likely cyanobacterial origin (Esposito et al., 2015). This is the first report of smenothiazole A from a bloom of Trichodesmium.
Overall, the networking procedure has identified new target molecules for isolation such as those in Cluster 1 (Figure 2). A comprehensive characterization of the chemical space within the bloom material is challenging because the sub-fractions we have generated are all nearly as complex in metabolic composition as the sub-fraction from which 1-3 were isolated (Figure S2).
The networking tool significantly improves the efficiency of our isolation and characterization workflow.
We did not identify anatoxins, saxitoxins or microcystins during the course of this analysis. This may be due to our focus on lipophilic metabolites, low abundance of these compounds in our samples, or a lack of informative MS/MS fragments. Trichodesmium blooms are complex events harboring a diverse array of microorganisms (Capone et al., 1997). Thus, the unequivocal identification of the producing organisms of the toxic metabolites described in this current work and other studies from environmental collections will ultimately require pure cultivation of producing organisms, the identification of biosynthetic gene clusters of toxic molecules, or the localization of metabolites to particular cell types (Simmons et al., 2008). In the original isolation and characterization of smenamides A and B, the authors suggest that a cyanobacterial symbiont is the true producer of the sponge-derived compounds (Teta et al., 2013). The present report supports this observation, as certain structural features such as the exomethylene vinyl chloride moiety are characteristic of cyanobacterial metabolism (Kan et al., 2000; Edwards et al., 2004; Nunnery et al., 2012). To the best of our knowledge, the N-methyl propanamide functionality in 3 has not previously been reported in a polyketide-peptide from cyanobacteria, and represents a biosynthetically intriguing unit because these organisms are not known to produce propionate. Conceivably, it may derive from an S-adenosylmethionine (SAM)-mediated methylation of an acetate precursor, the proposed first building block in the production of these smenamide-type natural products. This would be a similar biosynthetic transformation to that involved in producing the t-butyl group in apratoxin A, which employs a combination of two SAM methyltransferases to incorporate these methyl groups (Grindberg et al., 2011; Skiba et al., 2017).
The identification of these neurotoxic metabolites (1 and 3) and the other more potent smenamides and smenothiazole A from a Trichodesmium bloom raises important questions as to their ecological role during these events. It will be important to characterize these bloom-associated metabolites in a longitudinal sense to evaluate their ongoing contribution to HABs.

AUTHOR CONTRIBUTIONS
MB was responsible for the study design. PZ and PM were involved in sample collection, organism identification and sample storage. CV and MB did the structure characterization of secondary metabolites. CV and SC carried out cytotoxicity studies. EG, WG, and MB did the molecular networking procedure. MB wrote the manuscript with editing help from all co-authors.

FUNDING
Acquisition of certain spectroscopic and spectrometric data in this publication was made possible by the use of equipment and services available through the RI-INBRE Centralized Research Core Facility at the University of Rhode Island, which is supported by the Institutional Development Award (IDeA) Network for Biomedical Research Excellence from the National Institute of General Medical Sciences of the National Institutes of Health under grant number P20GM103430. Certain NMR experiments were conducted at a research facility at the University of Rhode Island supported in part by the National Science Foundation EPSCoR Cooperative Agreement #EPS-1004057. We gratefully acknowledge the American Society of Pharmacognosy Starter Grant (MB).