Modulation of Medium-Chain Fatty Acid Synthesis in Synechococcus sp. PCC 7002 by Replacing FabH with a Chaetoceros Ketoacyl-ACP Synthase

The isolation or engineering of algal cells synthesizing high levels of medium-chain fatty acids (MCFAs) is attractive to mitigate the high cloud point of longer chain fatty acids in algal-based biodiesel. To develop a more informed understanding of MCFA synthesis in photosynthetic microorganisms, we isolated several algae from Great Salt Lake and screened this collection for MCFA accumulation to identify strains naturally accumulating high levels of MCFA. A diatom, Chaetoceros sp. GSL56, accumulated particularly high levels of C14 (up to 40%), with the majority of C14 fatty acids allocated to triacylglycerols. Using whole-cell transcriptome sequencing and de novo assembly, putative genes encoding fatty acid synthesis enzymes were identified. Enzymes from this Chaetoceros sp. were expressed in the cyanobacterium Synechococcus sp. PCC 7002 to validate gene function and to determine whether eukaryotic enzymes putatively lacking bacterial evolutionary control mechanisms could be used to improve MCFA production in this promising production strain. Replacement of the Synechococcus 7002 native FabH with a Chaetoceros ketoacyl-ACP synthase III increased MCFA synthesis up to fivefold. The level of increase depends on promoter strength and culturing conditions.


INTRODUCTION
Biologically derived diesel from water-oxidizing, photosynthetic microorganisms (PSMs) is considered an efficient and promising next-generation technology for the production of renewable fuels (Radakovits et al., 2010;Work et al., 2013). These photosynthetic organisms are capable of high photon conversion efficiencies and could be deployed so that their cultivation does not directly compete with the contemporary global food supply (Schenk et al., 2008;Mata et al., 2010). To improve commercial feasibility, research efforts have focused on areas ranging from optimizing photosynthetic yields to performing scalability assessments (Larkum et al., 2012;Ho et al., 2014;Quinn and Davis, 2015). These efforts can be combined with advances in next-generation DNA sequencing and metagenomic analysis to characterize the diversity of phototrophic life at the enzyme/molecular level, which facilitates the discovery of genetic "parts" that can be transformed into "chassis" organisms to genetically improve biotechnological phenotypes in production organisms (Kodzius and Gojobori, 2015).
While most contemporary research efforts focus on improving biomass accumulation, relatively few target improved oil quality. However, oil quality is a significant issue in diesel fuel utilization, as biological products typically have longer chain lengths and higher degrees of unsaturation than conventional petrodiesel (Durrett et al., 2008). These features lead to poor cold-flow properties (longer chain lengths) and oxidative instability (higher unsaturation) (Knothe, 2005, 2008). The genetic manipulation of fatty acid chain length in plants has succeeded in changing oil crops to synthesize more saturated, medium-chain fatty acids (MCFAs), which are preferable for biodiesel (Voelker et al., 1992; Thelen and Ohlrogge, 2002). Expressing acyl-ACP thioesterases (acyl-ACP TEs) alone, or with ketoacyl-ACP synthase (KAS) from plants that produce MCFA, leads to MCFA accumulation in transgenic hosts (Jones et al., 1995; Leonard et al., 1998; Slabaugh et al., 1998). However, there are relatively few successful reports of MCFA accumulation in PSMs (Radakovits et al., 2011), and only limited information is available on mechanisms to increase MCFA synthesis in PSMs to produce a higher quality biofuel. Studies of TEs, the enzymes most commonly used to control fatty acid chain length, indicate differences between algal and plant TEs in terms of substrate recognition and phylogeny (Jing et al., 2011; Blatti et al., 2012; Beld et al., 2014).
In this study, we used bioprospecting, cell sorting, and fatty acid methyl ester (FAME) profiling to identify a diatom from Great Salt Lake (GSL), Utah (Chaetoceros GSL56 hereafter) that naturally accumulated high levels of C14:0 fatty acid (>16%), and initiated studies probing fatty acid synthesis (FAS) in this organism. We chose GSL as a sampling site for two reasons: the growth of halophilic algae in high-salt production systems has the potential to yield strains that are productive in seawater raceways, where evaporation leads to increased salt concentrations; and unique enzymology of potential biotechnological relevance can be found in extremophiles isolated from hypersaline ecosystems (Meuser et al., 2013). Chaetoceros GSL56 accumulates among the highest levels of C14:0 in a PSM that we have observed to date. As an initial characterization of the MCFA accumulation phenotype, we probed the physiological parameters influencing C14:0 accumulation in this alga, and obtained and assembled a whole-cell transcriptome to begin probing FAS enzymes. To validate gene annotations and to search for superior FAS enzymes, a targeted set of genes encoding eukaryotic FAS enzymes from this diatom was transformed into a cyanobacterium [Synechococcus sp. PCC 7002 (hereafter, Synechococcus 7002)] engineered to secrete fatty acids. We establish that a eukaryotic type III ketoacyl-ACP synthase (KASIII) enzyme can functionally replace the endogenous FAS enzyme FabH in Synechococcus 7002, and that expression of this non-native, eukaryotic enzyme improves MCFA yields under the culturing conditions used.

Strain Identification and Culturing
Chaetoceros GSL56 was isolated from Farmington Bay in Great Salt Lake (GSL), Utah, USA in 2008. Cultures were maintained and grown in f/2 medium at 29°C in a Percival incubator (Percival, Perry, IA, USA) illuminated with white fluorescent light [∼40-120 µmol photons m⁻² s⁻¹ of photosynthetically active radiation (PAR)] using a 16/8 light/dark cycle. Cell numbers were measured using a Z2 Coulter Counter (Beckman-Coulter, Brea, CA, USA). The partial 18S rRNA gene (1162 bp) was amplified from genomic DNA and sequenced for taxonomic identification using the universal primers 360FE and 1492RE (Dawson and Pace, 2002). The 18S rRNA gene sequence was deposited in GenBank (accession no. HQ710801).

FAME Quantification
Lipids were extracted and converted into FAMEs as described previously (Radakovits et al., 2012; Work et al., 2015) with some modifications. Briefly, 1.0 ml of methanol saturated with 5% KOH (0.8 g/ml) was added to 0.5 ml fresh culture samples in 4.0 ml sample vials, which were sealed and incubated at 100°C for 90 min, resulting in cell lysis and lipid saponification. Acid-catalyzed methylation was then carried out by adding 1.5 ml 1:16 12N HCl/MeOH to the same vial and incubating at 80°C for 6 h. FAMEs were then extracted into 1.25 ml hexane via gentle inversion. Extracts were analyzed directly by gas chromatography-flame ionization detection (GC-FID) using an Agilent 7890A gas chromatograph equipped with a DB5-ms column (Agilent Technologies, Santa Clara, CA, USA). Different concentrations of the Supelco 37 component FAME standards mix (Supelco Inc., Bellefonte, PA, USA) were analyzed by GC-FID and used for peak identification and sample quantification. Fatty acids were also extracted from representative samples spiked with C13:0 internal standards (∼80% recovery) to assess cellular fatty acid detection.
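Quantification against a dilution series of FAME standards amounts to fitting a calibration line and inverting it for each sample peak. The sketch below illustrates that idea only; the standard concentrations, peak areas, and the optional recovery correction are hypothetical placeholders, not values or steps reported in this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(peak_area, slope, intercept, recovery=1.0):
    """Convert a GC-FID peak area to a concentration using the calibration line.

    `recovery` is an optional (assumed) correction factor, e.g. 0.8 for 80%
    recovery of a spiked internal standard; 1.0 applies no correction.
    """
    return (peak_area - intercept) / slope / recovery


# Hypothetical calibration data for one FAME standard (µg/ml vs. peak area).
standard_conc = [5.0, 10.0, 20.0, 40.0]
standard_area = [50.0, 100.0, 200.0, 400.0]

slope, intercept = fit_line(standard_conc, standard_area)
sample_conc = quantify(150.0, slope, intercept)
sample_conc_corrected = quantify(150.0, slope, intercept, recovery=0.8)
```

In practice one calibration line would be fitted per fatty acid species in the standards mix, since detector response can differ between compounds.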

Lipid Profile Analysis by Thin Layer Chromatography and Subsequent Methyl Esterification (TLC-FAME Analysis)
Thin-layer chromatography (TLC) analysis was conducted as described previously (Vieler et al., 2007; Radakovits et al., 2011). Briefly, 10 ml of pelleted culture was resuspended in 400 µl MeOH and then sonicated for 10 min for cell lysis. Then 400 µl chloroform was added to solubilize lipids, followed by the addition of 400 µl water for phase separation. The organic layer containing lipids was transferred to new microcentrifuge tubes and dried under an N2 stream. Concentrated lipids were resuspended in 30 µl chloroform, then 4.0 µl of resuspended lipids were spotted onto HPTLC-HL normal phase, 150 mm silica gel plates (10 cm × 20 cm; Analtech, Newark, DE, USA) and developed in a TLC chamber. A series of lipid standards, which included monogalactosyldiacylglycerol (MGDG), digalactosyldiacylglycerol (DGDG) and sulfoquinovosyldiacylglycerol (SQDG; Lipid Products, Nutfield, UK); phosphatidylcholine (PC), phosphatidylethanolamine (PE) and phosphatidylglycerol (PG; Avanti Polar Lipids, Inc., Alabaster, AL, USA); cholesterol (Chol), palmitic acid (free fatty acid; FFA) and glyceryl trioleate (triacylglycerol; TAG; Sigma-Aldrich, St. Louis, MO, USA), were run concurrently to identify different lipid classes. The first eluent [methyl acetate:isopropanol:chloroform:methanol:KCl (0.25%) in a ratio of 25:25:25:10:4 (v/v/v/v/v)] was run to a height of ∼5 cm from the origin. After drying, the plates were developed with a second eluent [hexane:diethyl ether:acetic acid in a ratio of 70:30:2 (v/v/v)] to a height of ∼8 cm from the origin. TLC plates were then sprayed with a 0.05% solution of primuline (TCI America, Portland, OR, USA) in acetone. Individual lipid bands were visualized under UV light at 365 nm. Following TLC separation, the individual lipid bands were marked and scraped from the TLC plates. The fatty acid esters in each lipid class were transesterified into FAMEs and analyzed by GC-FID, as described previously (Radakovits et al., 2011).

Total RNA Extraction and Transcriptome Sequencing/Assembly
Total RNA was extracted from 20 ml of Chaetoceros GSL56 cells at stationary phase that were grown in f/2 medium at 3.5% salinity, supplemented with 1.06 × 10⁻⁴ M Na₂SiO₃. The plant RNA reagent (Invitrogen, Grand Island, NY, USA) was used to purify RNA from cells according to the manufacturer's instructions. Genomic DNA was removed by DNase I (RNase free, Ambion, Grand Island, NY, USA) treatment, and subsequent RNA purification was carried out using the RNeasy MinElute Cleanup Kit (Qiagen, Germantown, MD, USA). Purified RNA was first quantified using a NanoDrop ND-1000 (Thermo Scientific, Grand Island, NY, USA), and then more accurately measured using the Quant-iT RiboGreen RNA assay kit (Invitrogen, Grand Island, NY, USA) with fluorescence detection.
Total RNA was sequenced by the National Center for Genomic Resources (NCGR, Santa Fe, NM, USA) as part of the Marine Microbial Eukaryote Transcriptome Project (Gordon and Betty Moore Foundation; Keeling et al., 2014). The RNA library was made with an insert size of ∼200 bp, using the TruSeq RNA Library preparation kit with poly-A+ selection. RNA was sequenced from both ends (paired-end reads 2 × 50 nt) using an Illumina HiSeq 2000 (San Diego, CA, USA). The transcriptome dataset for Chaetoceros GSL56 is currently available through NCBI (http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA231566).
Transcriptome assembly was conducted by NCGR using NCGR's internal pipelines. Reads shorter than 25 bp after quality trimming were discarded, and the remaining reads were assembled into contigs using ABySS (Simpson et al., 2009). All assembled contigs were subjected to gap closing using GapCloser v 1.10 (Li et al., 2008). To identify overlaps between contigs, the OLC (overlap layout consensus) assembler miraEST was used (Chevreux et al., 2004). BWA was used to align sequence reads back to assembled contigs (Li and Durbin, 2009). A final subset of contigs was created by filtering the contig dataset by a minimal length of 150 bp. ESTScan with a Bacillariophyta scoring matrix was used to predict coding sequences (CDS) from the final subset of contigs (Iseli et al., 1999; Lottaz et al., 2003).
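The minimal-length filter applied to the assembled contigs can be illustrated with a short, dependency-free sketch. This is not NCGR's internal pipeline, which is not described in detail here; the function names and the example records are hypothetical.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_by_length(records, min_len=150):
    """Keep only contigs whose sequence is at least `min_len` bp long."""
    return [(name, seq) for name, seq in records if len(seq) >= min_len]


# Hypothetical two-contig example: one passes the 150 bp cutoff, one does not.
example = [">contig_a", "A" * 120, "A" * 80, ">contig_b", "C" * 100]
kept = filter_by_length(read_fasta(example))
```

The same pattern, with a 25 bp threshold applied to trimmed reads rather than contigs, would correspond to the read-discarding step.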

Transcriptome Annotation and Molecular Phylogeny
All CDS sequences were aligned against the non-redundant (nr) protein databases at the National Center for Biotechnology Information (NCBI) using the BLASTx algorithm with an E-value cutoff of 10⁻⁶. The resulting top 10 BLAST hits were exported into Blast2GO software v 3.0.10 for functional annotation and statistical analysis (Conesa and Götz, 2008).
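The hit-selection step (E-value cutoff, then the ten best hits per query) can be sketched as follows. The sketch assumes BLAST results in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score); the output format actually used in this study is not stated, and the example rows are hypothetical.

```python
from collections import defaultdict


def top_hits(lines, max_evalue=1e-6, max_hits=10):
    """Return up to `max_hits` best hits per query that pass the E-value cutoff.

    Hits are ranked by ascending E-value, with ties broken by descending bit score.
    """
    hits = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue <= max_evalue:
            hits[query].append((subject, evalue, bitscore))
    return {
        query: sorted(found, key=lambda h: (h[1], -h[2]))[:max_hits]
        for query, found in hits.items()
    }


# Hypothetical tabular rows: the second hit for t1 fails the 1e-6 cutoff.
sample_lines = [
    "\t".join(["t1", "hitA", "90.0", "100", "5", "0", "1", "100", "1", "100", "1e-30", "200"]),
    "\t".join(["t1", "hitB", "40.0", "60", "30", "2", "1", "60", "5", "64", "1e-3", "40"]),
    "\t".join(["t2", "hitC", "75.0", "80", "20", "0", "1", "80", "1", "80", "1e-10", "80"]),
]
selected = top_hits(sample_lines)
```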
Transcripts putatively involved in fatty acid metabolism, including FAS and TAG synthesis, were reexamined. Representative sequences of each gene were downloaded from NCBI and used to search contig datasets to avoid assembly error. All identified transcripts were aligned against transcriptome reads in the NCBI Sequence Read Archive (SRA) database to check assembly integrity and coverage. To identify sub-classes of FAS genes, such as β-ketoacyl-ACP synthase I, II, and III (KASI, II, and III), alignment of translated amino acids sequences was conducted using MUSCLE 3.8.31 and phylogenetic trees were generated using the Phylogeny.fr program (Dereeper et al., 2008).

KAS Complementation in Synechococcus 7002
Only one full-length KASIII was annotated in the Chaetoceros GSL56 transcriptome (KASIII hereafter), which was amplified from Chaetoceros GSL56 cDNA using the primers listed in Supplementary Table S2. The amplified DNA fragment, containing NdeI/HindIII restriction sites from the primers, was inserted into a modified pNSI-cpcBA-YFP plasmid (Davies et al., 2014) containing a gentamicin resistance cassette (aacC1; Kovach et al., 1995), so that KASIII was positioned immediately after the cpcBA promoter (Xu et al., 2011) to form the new plasmid pNSI-cpcBA-gslKASIII-GentR. For homologous recombination, the Synechococcus 7002 fabH gene and flanking sequence (0.9 kb) were amplified from Synechococcus 7002 genomic DNA, and the resulting 2.0 kb DNA fragment was inserted into plasmid pNSI-cpcBA-YFP-GentR by cloning between the M13 forward and the M13 reverse sequences, generating plasmid pfabH. A 2.6 kb DNA fragment that included the cpcBA promoter, the KASIII gene and the aacC1 marker was then amplified from pNSI-cpcBA-gslKASIII-GentR and inserted into pfabH to replace part (the first 500 bp) of the coding region of fabH, which is the target of gene disruption. The resulting plasmid, pfabH-cpcBA-gslKASIII-GentR, was used to transform Synechococcus 7002 for KASIII substitution assays. To minimize an imbalance between KASIII and other FAS enzymes in transgenic strains, another plasmid without the cpcBA promoter was created (plasmid pfabH-gslKASIII-GentR), in which KASIII expression was driven by the endogenous Synechococcus 7002 fabH promoter. This plasmid was constructed using designed primers (Supplementary Table S2), NEBuilder HiFi DNA assembly master mix (New England Biolabs Inc., MA, USA) and Gibson cloning (Gibson et al., 2009) to avoid the use of restriction sites. A DNA fragment that contained the KASIII gene and the aacC1 marker gene was inserted into plasmid pfabH, replacing the first 500 bp of the fabH coding region beginning at the start codon.
DNA cloning was conducted in Escherichia coli DH5α cells. The genetic replacement of fabH with KASIII was conducted in a Synechococcus 7002 wild type strain and a lauric acid secreting (SA01) strain (Xu et al., 2013;Work et al., 2015).
To assess in vivo KASIII function, fully segregated transgenic strains were tested. Wild type and transgenic strains were first cultivated in 30 ml A+ medium in 125 ml flasks with the required antibiotic(s) (50 µg/ml gentamicin for SK01 and SK02, 50 µg/ml spectinomycin for SA01, 50 µg/ml gentamicin + 50 µg/ml spectinomycin for SAK01 and SAK02) at 37°C in a Percival incubator providing continuous illumination of ∼200 µmol PAR m⁻² s⁻¹ and aerated with 1% CO2/air. After 3-4 days of cultivation, all cultures were diluted in 50 ml fresh media (with antibiotics) in 250 ml flasks, and normalized to the same optical density (OD730 = 0.5). For fatty acid production, experimental cultures were cultivated under two different conditions: (A) room temperature, atmospheric CO2, and constant illumination at ∼80 µmol photons m⁻² s⁻¹ of PAR; and (B) 30°C, 1% CO2 and constant illumination at ∼80 µmol photons m⁻² s⁻¹ of PAR inside a growth chamber (Multitron, AJ125BC).
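Normalizing cultures to a common optical density is a simple dilution calculation (C1·V1 = C2·V2). The sketch below shows the arithmetic for the 50 ml, OD730 = 0.5 target described above; the starting optical density of 2.0 is a hypothetical value, and the calculation assumes OD scales linearly with cell density over this range.

```python
def dilution_volumes(stock_od, target_od=0.5, final_volume_ml=50.0):
    """Return (culture_ml, fresh_medium_ml) needed to reach `target_od`.

    Assumes optical density is proportional to cell density (C1*V1 = C2*V2).
    """
    if stock_od < target_od:
        raise ValueError("starter culture is less dense than the target OD")
    culture_ml = target_od * final_volume_ml / stock_od
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical starter culture at OD730 = 2.0.
culture_ml, medium_ml = dilution_volumes(2.0)
```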

Chaetoceros GSL56 Fatty Acid Profiles and Physiology
To identify organisms and enzymes capable of facilitating MCFA production in algae and cyanobacteria, we surveyed an algal collection isolated from GSL for organisms with natively high levels of MCFA (Figure 1A). Relative to the 42 halophilic algae that we isolated from this site, the ratio of C14:0 to other fatty acids was the highest in Chaetoceros GSL56 as determined by FAME analysis (Figure 1; Table 1). C14:0 fatty acid reached ∼20-35% of the total fatty acid content in this alga, depending on growth conditions and phases. Marine diatoms (e.g., Chaetoceros, Thalassiosira, and Phaeodactylum) tend to accumulate more C14:0 fatty acid (by percentage) relative to representative green algae [e.g., Chlamydomonas and Chlorella (Table 1)]. Chaetoceros GSL56 grew faster and reached higher cell numbers in medium containing 3.5% salinity, compared to lower and higher NaCl concentrations, 1.75 and 6% salinity, respectively (Supplementary Figure S1). Growth under different light intensities indicated that cells grew to a slightly higher density at lower light intensities (20 µmol PAR m⁻² s⁻¹ and 60 µmol PAR m⁻² s⁻¹) relative to a higher light intensity (120 µmol PAR m⁻² s⁻¹), where the final cell concentrations were decreased relative to cultures at lower light. Therefore, cells were cultivated in medium containing 3.5% salinity at 60 µmol PAR m⁻² s⁻¹, unless otherwise noted.
To further understand the biosynthesis and physiology of C14:0 fatty acid in Chaetoceros GSL56, specifically whether C14:0 accumulates predominantly in membranes or storage products, lipid classes were extracted from cultures at stationary phase and subjected to TLC-FAME analysis (Figure 2). TLC analysis showed that the dominant lipid classes in Chaetoceros GSL56 include two phospholipids (PC and PG), SQDG, DGDG, MGDG, free fatty acids (FFAs), and triacylglycerol (TAG) (Figure 2A). The FFAs may represent artifacts of sample preparation. Subsequent FAME analysis revealed that ∼66% of C14:0 fatty acid was stored in TAG and the second largest portion was in SQDG (∼10%; Figure 2B). By comparison, about 50% of total fatty acids were found in TAG and 8% in SQDG, demonstrating enrichment of C14:0 in these two lipid classes (Figure 2C).
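The enrichment claim can be made explicit as a ratio: the share of C14:0 found in a lipid class divided by the share of total fatty acids found in that class, with values above 1 indicating enrichment. The sketch below uses the approximate percentages quoted above; the ratio itself is an illustrative metric, not one reported by the authors.

```python
# Approximate percentages taken from the text (Figures 2B and 2C).
share_of_c14 = {"TAG": 66.0, "SQDG": 10.0}      # % of all C14:0 found in each class
share_of_total_fa = {"TAG": 50.0, "SQDG": 8.0}  # % of all fatty acids found in each class


def enrichment(target_share, total_share):
    """Ratio > 1 means the fatty acid is over-represented in that lipid class."""
    return {cls: target_share[cls] / total_share[cls] for cls in target_share}


ratios = enrichment(share_of_c14, share_of_total_fa)
# TAG: 66 / 50 = 1.32; SQDG: 10 / 8 = 1.25
```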

Transcriptome Analysis and Functional Annotation
To investigate the fatty acid biosynthetic pathway underpinning high C14:0 fatty acid content in Chaetoceros GSL56, we obtained a whole-cell transcriptome and assembled the data into expressed genes. Whole transcriptome sequencing was carried out using a cDNA library obtained from cells at stationary phase. A total of 36,745,292 raw reads were generated from sequencing the cDNA library using Illumina HiSeq 2000. The most represented annotations belonging to each Gene Ontology (GO) category are listed in Supplementary Figure S2. Transcripts predicted to encode proteins in the molecular function category are more abundant than in the other two categories, cellular component and biological process. In the molecular function category, the GO term "protein binding" is the most represented, and in cellular component and biological process, the GO terms "integral component of membrane" and "oxidation-reduction process" are the most represented, respectively.

Analysis of Fatty Acid Biosynthesis Pathway
We expected that the predominant control of MCFA chain lengths occurs in the fatty acid biosynthetic pathway since C14:0 is present in all lipid classes. Genes putatively involved in the de novo FAS pathway were first identified (Figure 3; Supplementary Table S3). Examination of gene isozymes with putative functions in fatty acid initiation and elongation indicates an underrepresentation of the genes in the upstream portion of this pathway, when compared to Chlamydomonas reinhardtii and two diatoms, T. pseudonana and P. tricornutum (Miller et al., 2010; Boyle et al., 2012; Dyhrman et al., 2012). In the transcript assembly, we could only identify a single gene encoding an acyl-carrier protein (ACP), which acts as a tether shuttling growing fatty acids to multiple components of the fatty acid synthase complex. The predicted polypeptide is 72% identical to an ACP from the diatom P. tricornutum (GenBank: EEC50984.1; Figure 4B). In both the C. reinhardtii and T. pseudonana transcriptomes, two ACPs with relatively high expression levels are found. Two putative monomeric acetyl-CoA carboxylase (ACCase) enzymes were identified in the Chaetoceros GSL56 transcriptome (transcript IDs 5820 and 12896) that have 82-85 and 72-76% similarities with representatives in P. tricornutum and T. pseudonana, respectively. Heteromeric ACCase enzymes that have dissociated enzymatic subunits were not identified, whereas they were identified in C. reinhardtii. The abundance of these two ACCase transcripts in Chaetoceros GSL56 was low, as indicated by calculated reads per kilobase per million mapped reads (RPKM) values (13 and 0.4), in contrast to ACCase transcript abundance in C. reinhardtii (RPKM 246.5 and 250; Miller et al., 2010). It should be noted that a direct comparison of RPKM values between species may be compromised due to different genome sizes and different isozymes.
Thus, we compared the RPKM of ACCase to other genes involved in the fatty acid biosynthetic pathway within the same species. Whereas the ACCase ranks high in the fatty acid biosynthesis pathway in  C. reinhardtii (Miller et al., 2010;Lv et al., 2013), it is among the least abundant transcripts in Chaetoceros GSL56. In contrast to ACP and ACCase, the genes involved in fatty acid elongation reactions are more abundant in Chaetoceros GSL56 relative to C. reinhardtii. For example, 11 unique isozymes were identified for the 3-ketoacyl reductase (KAR) enzymes, with the highest RPKM value equal to 245, whereas in C. reinhardtii only one copy of this gene was found. Transcripts encoding hydroxyacyl-ACP dehydratase (HAR) and enoyl-ACP reductase (EAR) were also identified at high abundance (RPKM: 267 and 131, respectively).
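For readers unfamiliar with the metric, RPKM normalizes a transcript's read count by its length (in kilobases) and by sequencing depth (in millions of mapped reads). The sketch below gives the standard formula and then ranks the within-species RPKM values quoted above; the read count, transcript length, and library size in the first example are hypothetical, and the labels "ACCase_a"/"ACCase_b" are placeholders because the text does not say which transcript ID carries which value.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical transcript: 500 reads, 2,000 bp, 30 million mapped reads.
example_rpkm = rpkm(500, 2000, 30_000_000)  # about 8.33

# RPKM values quoted in the text for Chaetoceros GSL56 FAS transcripts
# (highest-expressed KAR isozyme, HAR, EAR, and the two ACCase transcripts).
pathway_rpkm = {"KAR": 245.0, "HAR": 267.0, "EAR": 131.0, "ACCase_a": 13.0, "ACCase_b": 0.4}
ranked = sorted(pathway_rpkm, key=pathway_rpkm.get, reverse=True)
```

Ranking within one species sidesteps the cross-species comparability problem noted in the text, since all values share the same library and normalization.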
Substrate specificity of some key enzymes in Chaetoceros GSL56 is likely to lead to C14:0 fatty acid accumulation. In plants, two enzyme classes, ketoacyl-ACP synthases (KASI/II, KASIII) and acyl-ACP TEs (FatA/B), typically show different enzymatic activities toward different fatty acid chain lengths, which contributes to fatty acid chain length determination (Jones et al., 1995; Leonard et al., 1998; Abbadi et al., 2000; Dehesh et al., 2001). Therefore, we initially attempted to identify genes encoding these two enzyme classes in Chaetoceros GSL56. To maximize gene identification, searches were conducted with protein queries that consisted of homologous genes from diatoms and several other algae. Putative genes were then aligned with well-characterized genes for further annotation (Figure 4). There are four distinct transcripts annotated as KAS. Based on sequence alignments with KAS genes from plants and bacteria (Figure 4A), one transcript (14815) likely encodes a plant-type KASIII enzyme, and it is not related to the other KAS enzymes. Transcript 9683 is predicted to be a KASII enzyme that is similar to KASII from P. tricornutum. Transcript 11978 has homology with plant-type KASI/II/IV, while the fourth transcript (16566) has only tenuous KAS similarity. Unambiguous TE identification is complicated since diatoms often lack clear homologs of the canonical plant-type enzymes. As expected, a homolog of this type of acyl-ACP TE was not identified in Chaetoceros GSL56 (data not shown); however, genes encoding putative thioester hydrolyzing enzymes were found (transcripts: 2647, 3750, 4370, 9448, and 9539).

KASIII in Chaetoceros GSL56 Substitutes for FabH in Synechococcus sp. PCC7002
As shown above, Chaetoceros GSL56 effectively accumulates high levels of MCFA. However, the data also demonstrate that this strain has poor growth and biomass productivity metrics (Supplementary Figure S1). We therefore examined whether enzymatic "parts" from Chaetoceros GSL56 could be used in a biofuel "chassis" organism (Synechococcus 7002) to improve a biofuel phenotype, in this case MCFA synthesis and secretion. First, each of the genes potentially encoding thioester hydrolyzing enzymes was cloned and transformed into Synechococcus 7002 concomitantly with fadD disruption to enable fatty acid secretion, an approach that was successfully used previously for genuine thioesterase enzymes (Work et al., 2015); however, none of the encoded enzymes showed acyl-ACP TE activity in Synechococcus 7002 (data not shown). We then explored whether expression of KASIII (functional homolog of endogenous FabH) could influence FAS in this cyanobacterium. A recent report concluded that FabH is the rate-limiting step in FAS in Synechococcus 7002 (Kuo and Khosla, 2014). Plasmids were designed for concurrently knocking out the native Synechococcus 7002 fabH gene while inserting KASIII from Chaetoceros GSL56 in its place. Enzyme expression was driven either at native levels by the endogenous fabH promoter (SK01) or at very high levels by the cpcBA promoter (SK02; Xu et al., 2011). Fully segregated transgenic strains were obtained through repeated streaking of single colonies on A+ plates containing the required antibiotics (Figure 5). Controls using a plasmid that had yellow fluorescent protein instead of KASIII from Chaetoceros GSL56 were also generated, but these mutants were not able to reach homoplasmy, indicating that FabH (or a functional replacement) is essential for Synechococcus 7002 viability.
Relative to the wildtype Synechococcus 7002, the two transgenic strains expressing KASIII (SK01 and SK02) showed no major differences in either fatty acid profiles or total fatty acid production when transformed into the wildtype background (data not shown). We then transformed a Synechococcus 7002 strain SA01 (Work et al., 2015) that is able to secrete lauric acid (C12:0) due to the expression of the medium chain acyl-ACP TE from Umbellularia californica (UcFatB; Table 3). As the SA01 strain is already able to secrete FFAs, we probed whether enhanced MCFA synthesis would occur when the rate-limiting endogenous FabH was replaced with the Chaetoceros GSL56 KASIII. KASIII substitution/overexpression in other systems has been shown to influence fatty acid profiles/yields, putatively by changing the rate of fatty acid initiation relative to downstream enzyme activities, and/or by catalyzing elongation reactions for short chain (but not long chain) acyl-ACP substrates (Abbadi et al., 2000; Dehesh et al., 2001; González-Mellado et al., 2010).
Table footnote: Samples were cultivated at room temperature, 0.04% CO2 (condition 1); data are the average of three independent biological replicates, and standard deviations are shown in parentheses. a: C18:N includes C18:1, C18:2, and C18:3.
FIGURE 7 | Representative GC-FID chromatograms of FAMEs from the Synechococcus 7002 mutants SAK03 (A) and SA01 (B). FAMEs were extracted from batch cultures grown at room temperature and harvested on day 20.

Enhanced Lauric Acid Production in Transgenic Strains
When FabH was replaced by the Chaetoceros GSL56 KASIII enzyme in the Synechococcus 7002 SA01 background, both new transgenic strains (SAK01 and SAK03) showed enhanced levels of C12:0 fatty acid production/secretion relative to SA01 (Figure 6; Table 4). An increase was observed under both culturing conditions tested. For cells grown at room temperature without CO2 augmentation (growth condition #1), SA01, SAK01, and SAK03 all grew similarly (Figure 6A). Despite no noticeable growth differences between SA01 and the KASIII expressing mutants, C12:0 fatty acid accumulated to much higher levels in the mutants, both per ml of culture and as a percentage of all fatty acids. The enhancement ranged from 1.1- to 5-fold depending on growth phase and the promoter used. In SA01, the highest levels of C12:0 accumulation occurred at day 11 (11.5 mg/L, 20% of total fatty acids) and started to decrease slightly thereafter (Figures 6B,C). However, in the KASIII expressing strains, the C12:0 fatty acid levels continued to increase even at day 20. The relative percentages of C12:0 fatty acid also continued to increase during the experimental period. We also tested SA01, SAK01, and SAK03 at higher temperature (30°C) and supplemented with 1% CO2 (growth condition #2). All strains grew faster than in growth condition #1. Under these conditions, the transgenic strains (SAK01/03) showed defective growth relative to SA01 (Figure 6D); and unexpectedly, all strains showed decreased C12:0 fatty acid productivity and a lower percentage of C12:0 than in growth condition #1 (Figures 6E,F).
The highest amount of C12:0 (54 mg/L, 30% of all fatty acids) was produced from SAK03 with KASIII expression driven by the cpcBA promoter. In sum, expression of KASIII in Synechococcus 7002 together with co-expression of a medium chain specific thioesterase enhances MCFA synthesis (C12:0; Figure 7) in Synechococcus 7002 under the growth conditions tested.
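As a rough check on the reported fold enhancement, the peak C12:0 titers quoted above can be compared directly. Note the assumption: the two peak values may come from different days (day 11 for SA01; the day for the SAK03 maximum is not stated here), so this is an illustrative calculation rather than a matched time-point comparison.

```python
def fold_change(treated, control):
    """Ratio of a treated value to its control value."""
    return treated / control


# Peak values quoted in the text.
sa01_peak_mg_per_l = 11.5   # SA01, day 11, growth condition #1
sak03_peak_mg_per_l = 54.0  # SAK03, highest reported titer

titer_fold = fold_change(sak03_peak_mg_per_l, sa01_peak_mg_per_l)  # about 4.7
percent_fold = fold_change(30.0, 20.0)  # C12:0 share of all fatty acids: 30% vs. 20%
```

The titer ratio of roughly 4.7 is in line with the "up to 5-fold" enhancement stated in the Results.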

DISCUSSION
Medium chain fatty acids are desirable in both the fuel and chemical industries. However, large quantities of MCFAs are only available in some specific oilseeds, such as coconut and Cuphea. Expression of MCFA specific enzymes can lead to the accumulation of shorter chain fatty acids in crop plants, providing a way to increase MCFA productivity (Voelker et al., 1992). In this study, we explored FAS in a halophilic alga (Chaetoceros GSL56) that is enriched for MCFA, and used whole-cell transcriptome sequencing to identify genes encoding putative FAS enzymes. Significantly, we identified a gene encoding a KASIII enzyme that when expressed in Synechococcus 7002 is able to functionally replace the native cyanobacterial FabH enzyme, and when co-expressed in a C12:0 MCFA secreting strain results in increased levels of MCFA synthesis. Chaetoceros GSL56 primarily synthesizes C14:0, C16:0, C18:0, C18:N, and C20:5 fatty acids that are differentially incorporated into all lipid classes (Table 1; Figures 1 and 2). Depending on growth conditions, saturated C14:0 fatty acids range from ∼15-40% of all fatty acids in Chaetoceros GSL56, which is amongst the highest native C14:0 levels that we have observed in an alga (Figure 1A). Additional marine algae have also been documented to contain MCFA, primarily within the Bacillariophyta and Haptophyta phyla (Chen et al., 2007; Guihéneuf et al., 2010).
Analysis of individual lipid classes in Chaetoceros GSL56 revealed that MCFAs are preferentially incorporated into TAG (Figure 2; Supplementary Table S1), which is consistent with the MCFA profiles in plant seeds and other diatom species (Alonso et al., 2000; Chen et al., 2007). In contrast to neutral oil droplets, the phospholipids and galactolipids, which are used as membrane constituents, are enriched in long chain fatty acids (C16-20).
Although Chaetoceros GSL56 has a promising MCFA phenotype, this alga has poor growth metrics relative to biotechnologically relevant strains. We therefore identified enzymes in the MCFA biosynthetic pathway that could be used as "parts" in more biotechnologically promising strains. Synechococcus 7002 is a cyanobacterium with among the fastest growth rates of any PSM, and has become the recent focus of several biotechnology efforts (McNeely et al., 2010; Mendez-Perez et al., 2011; Davies et al., 2014; Work et al., 2015). We therefore transferred selected components of the Chaetoceros GSL56 FAS machinery into Synechococcus 7002 to (i) verify functional annotations and (ii) determine whether enzymes from Chaetoceros GSL56 could be expressed in a cyanobacterium to increase MCFA synthesis. Based on precedent in plant studies (Voelker et al., 1992; Dehesh, 2001), we targeted substrate specific enzymes, such as acyl-ACP TEs and β-ketoacyl ACP synthase (KAS), to influence MCFA synthesis (Voelker et al., 1992; Leonard et al., 1998).
Initially, we probed whether the products of five genes encoding putative thioester hydrolysis enzymes had acyl-ACP TE activity that could be used to produce MCFAs in Synechococcus 7002. Expression of each of these genes individually, while concurrently knocking out the gene encoding the fatty acid recycling enzyme FadD (Kaczmarzyk and Fulda, 2010; Liu et al., 2011; Ruffing, 2014; Work et al., 2015), did not result in the production of any MCFA in Synechococcus 7002. This may be because the enzymes tested were not genuine acyl-ACP TEs, or because they did not fold into functional enzymes in Synechococcus 7002. We then targeted the KAS enzyme family. The β-ketoacyl ACP synthase III (KASIII) functionally replaced the FabH enzyme in Synechococcus 7002, confirming the annotation of KASIII as a genuine β-ketoacyl ACP synthase III. Furthermore, expression of KASIII in a Synechococcus 7002 strain that coexpresses a plant-sourced medium-chain-specific thioesterase (UcfatB) produced up to 40% of total fatty acid as lauric acid (SAK03), which is approximately four times more than corresponding strains expressing the native FabH enzyme. Enhanced lauric acid production in transgenic strains may be due to (i) an increased acyl-ACP pool size, since FabH is the rate-limiting enzyme of fatty acid synthase in Synechococcus 7002 (Kuo and Khosla, 2014); (ii) a modified overall FAS rate that better matches thioesterase activity; and/or (iii) concentrated short- to medium-chain acyl-ACP pools caused by KASIII (Abbadi et al., 2000). The exact mechanism of KASIII enhancement, and whether KASIII participates in MCFA synthesis in Chaetoceros GSL56, are subjects of further investigation.
The euryhaline cyanobacterium Synechococcus 7002 is a particularly promising host for FFA production because it shows a uniquely high tolerance to FFAs (Ruffing, 2014). However, the products of FAS represent only a small portion (∼5-10%) of cell dry weight, indicating evolutionary limitations to fatty acid productivities in this host (Work et al., 2015). Replacing FabH, which is the kinetically rate-limiting enzyme in FAS, with KASIII did not improve fatty acid production when transformed into the wild-type host (data not shown), suggesting that downstream elements that traffic fatty acids to membrane lipids may then become rate limiting. However, the expression of acyl-ACP thioesterase allows cleavage of MCFA from acyl-ACP, and has been reported to increase malonyl-ACP turnover rates and reduce inhibition by acyl-ACP in other systems (Magnuson et al., 1993). Transgenic strains expressing KASIII and thioesterase (SAK01 and SAK03) accumulated higher amounts of C12:0 and C14:0 fatty acids relative to their parental strain SA01, and these increases are consistent with UcfatB hydrolytic activity (Voelker and Davies, 1994). Levels of MCFA are influenced by promoter strength, indicating a correlation between KASIII expression levels and MCFA production.
It is still undetermined how Chaetoceros GSL56 regulates C14 fatty acid production, as we were unable to identify an acyl-ACP TE with C14:0 specificity in this alga, which is consistent with other studies in red algae and diatoms (Beld et al., 2014). Future studies of the role of KASIII in medium-chain FAS, as well as the substrate specificities of other enzymes, such as acyltransferases, will be important in understanding fatty acid chain length regulation in this and other diatoms.
In future research, we also intend to explore whether the expression of other eukaryotic FAS enzymes in Synechococcus 7002, which typically has less than 10% of its biomass in fatty acyl lipids, results in further yield improvements. It is possible that these eukaryotic enzymes, which did not evolve under the regulatory mechanisms used by cyanobacteria, remain active outside of the metabolic context that evolved to limit FAS in cyanobacteria such as Synechococcus 7002.

AUTHOR CONTRIBUTIONS
HG designed and transformed Synechococcus 7002 strains, performed batch experiments and the majority of biochemical assays, and compiled this manuscript. RJ participated in GSL strain screening, generated the data in Figure 1, and revised this manuscript. FD participated in designing transformation vectors and revised this manuscript. LS and PS developed methods for extracting and quantitating internal standards. MP conceived this work and oversaw and edited this manuscript. All authors revised the manuscript for intellectual content.