Differential Protein Expression During Growth on Medium Versus Long-Chain Alkanes in the Obligate Marine Hydrocarbon-Degrading Bacterium Thalassolituus oleivorans MIL-1

The marine obligate hydrocarbonoclastic bacterium Thalassolituus oleivorans MIL-1 metabolizes a broad range of aliphatic hydrocarbons almost exclusively as carbon and energy sources. We used LC-MS/MS shotgun proteomics to identify proteins involved in aerobic alkane degradation during growth on medium- (n-C14) or long-chain (n-C28) alkanes. During growth on n-C14, T. oleivorans expresses an alkane monooxygenase system involved in terminal oxidation including two alkane 1-monooxygenases, a ferredoxin, a ferredoxin reductase and an aldehyde dehydrogenase. In contrast, during growth on long-chain alkanes (n-C28), T. oleivorans may switch to a subterminal alkane oxidation pathway evidenced by significant upregulation of Baeyer-Villiger monooxygenase and an esterase, proteins catalyzing ketone and ester metabolism, respectively. The metabolite (primary alcohol) generated from terminal oxidation of an alkane was detected during growth on n-C14 but not on n-C28 also suggesting alternative metabolic pathways. Expression of both active and passive transport systems involved in uptake of long-chain alkanes was higher when compared to the non-hydrocarbon control, including a TonB-dependent receptor, a FadL homolog and a specialized porin. Also, an inner membrane transport protein involved in the export of an outer membrane protein was expressed. This study has demonstrated the substrate range of T. oleivorans is larger than previously reported with growth from n-C10 up to n-C32. It has also greatly enhanced our understanding of the fundamental physiology of T. oleivorans, a key bacterium that plays a significant role in natural attenuation of marine oil pollution, by identifying key enzymes expressed during the catabolism of n-alkanes.


INTRODUCTION
Thalassolituus oleivorans MIL-1 is a motile aerobic bacterium belonging to the class Gammaproteobacteria. It was first isolated from seawater and sediment samples collected in Milazzo Harbor, Sicily, Italy, supplemented with the n-alkane tetradecane (n-C14) (Yakimov et al., 2004). The strain is a typical example of the marine obligate hydrocarbonoclastic bacteria (OHCB), which are highly specialized toward hydrocarbon substrates. Like many other OHCB, such as Alcanivorax borkumensis and Oleispira antarctica, T. oleivorans grows almost exclusively on aliphatic hydrocarbons (Yakimov et al., 1998, 2002). OHCB are present in non-polluted marine environments at low numbers, but following oil spills they typically bloom and become dominant members of the microbial community (Kasai et al., 2002a,b; Cappello et al., 2007; McKew et al., 2007a; Teramoto et al., 2009; Vila et al., 2010).
The biogeography of T. oleivorans shows that it is distributed worldwide. The RDP (Ribosomal Database Project) and GenBank databases contain 16S rRNA gene sequences of 54 Thalassolituus-like bacteria from microbial communities inhabiting both marine (Baltic, Barents, Mediterranean, North, Okhotsk and South China seas, and the Atlantic, Pacific, and Polar Oceans) and terrestrial environments (caves and ground waters) (Yakimov et al., 2010; Mou et al., 2008). Previous reports have shown that Thalassolituus-related species were among the most dominant members of hydrocarbon/petroleum-enriched consortia and can outcompete other OHCB such as Alcanivorax, which is often the dominant alkane degrader in marine oil spills (Harayama et al., 2004; Kostka et al., 2011). For example, Thalassolituus outcompeted Alcanivorax in n-C14-enriched microcosms even though tetradecane was shown to be the preferred substrate of Alcanivorax (Yakimov et al., 2005). T. oleivorans became abundant in crude oil-amended North Sea microcosms at both 4°C and 20°C and was the most dominant bacterium at 20°C, with total extractable hydrocarbons at 15% of their original value, confirming the important role this species plays in crude oil degradation in seawater (Coulon et al., 2007). T. oleivorans has also been shown to be the most dominant alkane degrader in both single and mixed alkane seawater microcosms over a range of n-alkanes from n-C12 up to n-C32 (McKew et al., 2007b). Also, in contrast to previous studies, Thalassolituus, rather than Alcanivorax, was shown to be dominant in crude oil-amended microcosms, suggesting that it is less affected by potentially stressful compounds within crude oil (Hara et al., 2003; Brakstad and Lødeng, 2005; Yakimov et al., 2005). T. oleivorans was also one of the dominant species within bacterial communities in deep water samples taken near the oil plume from the Deepwater Horizon oil spill in the Gulf of Mexico (Camilli et al., 2010; Hazen et al., 2010). Thalassolituus spp. also dominated the microbial communities present in water samples collected from oil production wells in Canada (Kryachko et al., 2012).
Given the importance of Thalassolituus in the attenuation of marine oil pollution, the type strain (T. oleivorans MIL-1, DSM 14913T) was genome sequenced (Golyshin et al., 2013).
T. oleivorans has a strong biotechnological potential to be used for bioremediation of marine oil spills due to its high affinity for alkanes, autochthonous marine origin, and critical role in natural cleansing of marine systems. Compared to some physicochemical methods currently used, bioremediation is considered a safer, more efficient and cost-effective alternative for the removal of oil contaminants in the environment (Liu et al., 2011).
Studies investigating the genetics and biochemistry of bacterial alkane degradation have mainly focused on the enzymes involved in the initial step of oxidizing n-alkanes, particularly medium-length chains (van Beilen and Funhoff, 2007; Rojo, 2009). Terminal aerobic alkane degradation occurs through sequential oxidation of a terminal carbon, initiated by alkane monooxygenases, which produce primary alcohols, and followed by alcohol and aldehyde dehydrogenases, which produce the corresponding aldehydes and fatty acids, respectively (Ji et al., 2013). Bacteria capable of degrading medium-chain alkanes frequently contain CYP153-type cytochrome P450s and/or homologs of well-characterized integral membrane non-heme iron alkane monooxygenases, such as AlkB (Rojo, 2009; Wang and Shao, 2013). The mechanisms of long-chain alkane oxidation are less well understood. Bacteria degrading long-chain alkanes may contain a long-chain alkane monooxygenase such as AlmA, and/or a thermophilic soluble long-chain alkane monooxygenase, such as LadA from Geobacillus (Feng et al., 2007; Throne-Holst et al., 2007). However, the genes that code for these proteins are not present in the T. oleivorans genome (Golyshin et al., 2013), and this study provides evidence that T. oleivorans uses an alternative enzyme system for long-chain alkane oxidation.
In this study, we used liquid chromatography-tandem mass spectrometry (LC-MS/MS) shotgun proteomics, a powerful tool to simultaneously identify and quantify the differential expression of large sets of proteins. We examined the reorganization of the proteome to identify proteins significantly upregulated during growth on either medium- or long-chain alkanes. This has given a unique insight into the proteins involved in the transport and degradation of hydrocarbons in this globally important marine OHCB.

Growth of T. oleivorans MIL-1
T. oleivorans grew on a wide range of n-alkanes of different lengths and displayed highly similar growth on n-alkanes from n-C10 to n-C32 (Supplementary Figure 1), with no significant differences in growth rates. After a short lag phase, measurable growth was observed after 3 days on all n-alkanes, with cultures displaying a linear growth pattern that entered the stationary phase after 14 days. No growth was observed on the branched alkane pristane. We compared the proteomes during the early stages of growth (day 4) on the medium-chain alkane tetradecane (n-C14) and the long-chain alkane octacosane (n-C28) with the proteome during growth on the non-hydrocarbon control Tween 80 (one of the very few non-hydrocarbon substrates on which the strain can grow).

Overview of LC-MS/MS Shotgun Proteomic Analysis
Twelve LC-MS/MS runs were performed, consisting of four independent biological replicates of three treatments [medium-chain alkane (n-C14), long-chain alkane (n-C28), non-hydrocarbon control (Tween 80)], resulting in 191,647 spectral counts that were assigned to 1792 proteins, representing 50% of the total protein-coding genes on the T. oleivorans MIL-1 genome (Supplementary Table 1). Almost half (49%) of the spectral counts were assigned to the 100 most abundantly detected proteins and 83% were assigned to the 500 most abundant proteins. The remaining 1292 proteins (representing 17% of spectral counts) were detected with low spectral counts, and many were not detected across all biological replicates.
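The concentration of spectral counts in the most abundant proteins, as summarized above, can be checked with a short calculation. The following Python sketch is illustrative only: the function name top_n_share and the toy counts are hypothetical and are not taken from Supplementary Table 1.

```python
def top_n_share(counts, n):
    """Fraction of all spectral counts held by the n most abundant proteins.

    counts: per-protein spectral counts summed over all runs (hypothetical input).
    """
    if not counts:
        raise ValueError("counts must not be empty")
    ranked = sorted(counts, reverse=True)  # most abundant first
    total = sum(ranked)
    if total == 0:
        raise ValueError("total spectral count is zero")
    return sum(ranked[:n]) / total


# Toy data (hypothetical): six proteins, 1000 spectral counts in total.
toy_counts = [500, 300, 100, 50, 30, 20]
share_top2 = top_n_share(toy_counts, 2)  # share held by the two most abundant proteins
```

Applied to the real per-protein totals with n = 100 and n = 500, the same function would be expected to return values close to the 49% and 83% reported above, leaving 17% for the remaining 1292 of the 1792 proteins.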
A total of 139 proteins were significantly differentially expressed during growth on n-C14 compared to the non-hydrocarbon control Tween 80 (Figure 1A), with 63% upregulated on n-C14. A total of 311 proteins were significantly differentially expressed during growth on n-C28 compared to Tween 80 (Figure 1B), with 60% upregulated on n-C28. A further 234 proteins were significantly differentially expressed during growth on n-C14 compared to n-C28 (Figure 1C), with 51% upregulated on n-C14. Overall, the total proteome differed markedly between the three growth substrates, while proteomes of replicates were highly similar (Figure 1D).
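The thresholds drawn on the volcano plots (P < 0.05 and a twofold change) can be expressed as a simple classification rule. The Python sketch below is a minimal illustration under stated assumptions: the function name classify, the pseudocount used to avoid division by zero for proteins detected on only one substrate, and the example values are all hypothetical, and the statistical test that produced the P-values is not specified in this section. Whether both thresholds were required to call a protein differentially expressed is also an assumption.

```python
import math


def classify(mean_a, mean_b, p_value, alpha=0.05, min_fold=2.0, pseudocount=0.5):
    """Label a protein by comparing mean normalized spectral counts on substrates A and B.

    Returns 'up_in_a', 'up_in_b' or 'unchanged'.
    The pseudocount is an assumption so that proteins with zero counts
    on one substrate still yield a finite fold change.
    """
    log2_fc = math.log2((mean_a + pseudocount) / (mean_b + pseudocount))
    significant = p_value < alpha
    large_change = abs(log2_fc) >= math.log2(min_fold)
    if significant and large_change:
        return "up_in_a" if log2_fc > 0 else "up_in_b"
    return "unchanged"


# Hypothetical example: mean counts of 22 on substrate A and 2 on substrate B, P = 0.002.
label = classify(22.0, 2.0, 0.002)
```

Counting the labels over all proteins in a pairwise comparison would give the kind of totals and percentages reported above for each pair of substrates.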

Terminal Oxidation During Growth on Medium-Chain Alkanes
FIGURE 1 | (A-C) Volcano plots of normalized LC-MS/MS spectral counts comparing T. oleivorans protein biosynthesis during growth on a medium-chain alkane (n-C14), a long-chain alkane (n-C28) and a non-hydrocarbon control (Tween 80). Data points above the horizontal dashed line represent P-values below 0.05. Vertical dashed lines represent a twofold change. Specific proteins involved in the oxidation of alkanes that were significantly differentially expressed during growth on n-C14 and n-C28 are highlighted. AM, alkane monooxygenase; F, ferredoxin; FR, ferredoxin reductase; AD, alcohol dehydrogenase; ALD, aldehyde dehydrogenase; BVMO, Baeyer-Villiger monooxygenase; E, esterase. (D) Principal component analysis of replicate T. oleivorans proteomes on growth substrates n-C14, n-C28, and Tween 80.
FIGURE 2 | (A) Spectral counts (means ± SE; n = 4) of differentially expressed terminal alkane oxidation proteins during growth on a medium-chain alkane (n-C14), a long-chain alkane (n-C28) and a non-hydrocarbon control (Tween 80) in Thalassolituus oleivorans MIL-1. * denotes spectral counts significantly greater (P < 0.05) during growth on n-C14 relative to growth on n-C28 and Tween 80. (B) Oxidation of an n-alkane by the identified proteins in the non-heme iron alkane monooxygenase system. The ferredoxin reductase (FdxR) oxidizes NAD(P)H to NAD(P)+. The electrons (e-) generated are shuttled to the alkane monooxygenase (AM) by the ferredoxin (Fdx). The monooxygenase introduces oxygen into the alkane at the terminal carbon, converting it into a primary alcohol. This alcohol is further oxidized to an aldehyde and then to a fatty acid by the alcohol dehydrogenase (AD) and aldehyde dehydrogenase (ALD), respectively.
Five proteins identified as being involved in the terminal oxidation of medium-chain alkanes were significantly differentially expressed during growth on n-C14 compared to both n-C28 and Tween 80 (Figure 2A). An alkane 1-monooxygenase (TOL_1175; P < 0.001) was exclusively expressed on n-C14. This protein has 99% identity to an alkane 1-monooxygenase (R615_11545) from T. oleivorans R6-15 and 65% identity to an alkane 1-monooxygenase (OLEAN_C23040) from Oleispira antarctica RB-8 encoded by the alkB2 gene. The expression of a second alkane 1-monooxygenase (TOL_2658; P = 0.002) was 11-fold greater during growth on n-C14 compared to n-C28, with no detection in the non-hydrocarbon control, Tween 80. This monooxygenase has 77% identity to an alkane 1-monooxygenase (WP_076515071) from the marine aliphatic alkane-degrader Oleibacter marinus. An oxidoreductase (TOL_2659; P = 0.001) coded by a gene located adjacent to that of the alkane 1-monooxygenase was also upregulated in a similar ratio (12-fold higher on n-C14 compared to both n-C28 and Tween 80). Domain analysis revealed that this oxidoreductase is a ferredoxin that contains a 2Fe-2S iron-sulfur cluster binding domain and is a member of the Fer2 (PF00111) family. Such ferredoxins are known to shuttle electrons to the monooxygenase. Expression of a ferredoxin reductase (TOL_2371; P = 0.035) was also 14-fold higher during growth on n-C14. This protein has 100% and 86% identity to a ferredoxin reductase in T. oleivorans R6-15 (W8G014) and Oleispira marina (A0A1N7MYW0), respectively. Ferredoxin reductase oxidizes NAD(P)H to NAD(P)+, capturing electrons to transfer to the ferredoxin. The Normalized Spectral Abundance Factor (Supplementary Table 1) was higher for the ferredoxin (0.0802%) and ferredoxin reductase (0.1250%) than for either alkane monooxygenase (TOL_1175, 0.0792%; TOL_2658, 0.0647%), indicating that T. oleivorans requires more electron transporters than monooxygenases. The combination of an alkane monooxygenase and a ferredoxin/ferredoxin reductase indicates that a non-heme iron alkane monooxygenase system (Figure 2B) was upregulated during medium-chain alkane degradation. Finally, expression of an aldehyde dehydrogenase (TOL_0223; P = 0.001) was at least sevenfold greater during growth on n-C14 compared to n-C28 and Tween 80.
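The Normalized Spectral Abundance Factor (NSAF) values quoted above allow proteins of different sizes to be compared. A minimal Python sketch of the calculation follows, assuming the standard definition in which each protein's spectral count is divided by its length and then normalized to the sum over all proteins; whether the authors computed their values in exactly this way is not stated in this section. The function name nsaf_percent and the toy counts and lengths are hypothetical.

```python
def nsaf_percent(spectral_counts, lengths):
    """Compute NSAF (as a percentage) for each protein.

    spectral_counts: dict mapping protein ID to spectral count.
    lengths: dict mapping protein ID to length in amino acids.
    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j), reported here times 100.
    """
    # Spectral abundance factor: counts corrected for protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectral counts to normalize")
    return {p: 100.0 * value / total for p, value in saf.items()}


# Toy data (hypothetical): protein B has more counts but is four times longer.
toy = nsaf_percent({"A": 10, "B": 20}, {"A": 100, "B": 400})
```

In this toy case the shorter protein A has the higher NSAF despite its lower raw count, which illustrates why a small protein such as a ferredoxin can rank above a larger monooxygenase on this measure.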
This combination of proteins makes up a nearly complete pathway for the terminal oxidation of n-alkanes; however, we did not observe the upregulation of a specific alcohol dehydrogenase that would convert the primary alcohol produced by the monooxygenase into the corresponding aldehyde. An alcohol dehydrogenase (TOL_2772) was expressed on all three growth substrates, although it was significantly more abundant during growth on n-C28. In addition, three further alcohol dehydrogenases (TOL_1420, TOL_2068, TOL_2458) were expressed equally across the three growth substrates; these could be constitutively expressed and catalyze the conversion of the primary alcohol to an aldehyde. The metabolite from terminal oxidation of the alkane, the primary alcohol 1-tetradecanol, was detected by GC-MS after 4 days of growth on n-C14, confirming terminal oxidation of the alkane, and no secondary alcohols were detected.

Subterminal Oxidation During Growth on Long-Chain Alkanes
During growth on n-C28, a specific subterminal alkane monooxygenase that would oxidize the initial alkane into a secondary alcohol was not identified. However, one of the two alkane monooxygenases (TOL_2658) that was heavily upregulated during growth via terminal oxidation on n-C14 was also expressed during growth on n-C28 (and not expressed during growth on Tween 80). Three additional proteins with a role in the subterminal oxidation of long-chain alkanes were, however, significantly and specifically upregulated during growth on n-C28 (Figure 3). The expression of an alcohol dehydrogenase (TOL_2772; P = 0.029) was at least 2.5-fold greater during growth on n-C28 compared to n-C14 and Tween 80, and this was significantly upregulated together with a flavin-binding family monooxygenase (TOL_0709; P = 0.002) that increased 18-fold. TOL_0709 was identified as a Baeyer-Villiger monooxygenase (BVMO) class enzyme due to the presence of two Rossmann fold motifs flanking two BVMO fingerprint sequence motifs (Supplementary Figure 2) (Fraaije et al., 2002; Riebel et al., 2013) and amino-acid sequence homology to known BVMOs (Supplementary Figures 3, 4). BVMOs are known to catalyze the conversion of a ketone into an ester by inserting an oxygen atom next to the carbonyl carbon. In addition to expression of the BVMO, the use of a subterminal oxidation pathway is supported by the expression of an esterase (TOL_0906; P < 0.0001) that was 17-fold upregulated during growth on n-C28 compared to n-C14, with no expression detected during growth on Tween 80 (Figure 3). An esterase is required to hydrolyse the ester produced by BVMO subterminal oxidation to generate an alcohol and a fatty acid. Metabolite analysis also confirmed that no primary alcohol (1-octacosanol) was detected on any of the 7 days of growth tested.

Long-Chain Alkane Transport
FIGURE 3 | (A) Spectral counts (means ± SE; n = 4) of differentially expressed subterminal alkane oxidation proteins during growth on a medium-chain alkane (n-C14), a long-chain alkane (n-C28) and a non-hydrocarbon control (Tween 80) in Thalassolituus oleivorans MIL-1. * denotes spectral counts significantly greater (P < 0.05) during growth on n-C28 relative to growth on n-C14 and Tween 80. (B) Subterminal oxidation of an n-alkane by the identified proteins. An alkane monooxygenase (AM) introduces oxygen into the alkane at a subterminal carbon, converting it into a secondary alcohol. The secondary alcohol is converted to the corresponding ketone by an alcohol dehydrogenase (AD). This ketone is then oxidized by a Baeyer-Villiger monooxygenase (BVMO) to render an ester. The ester is then hydrolysed by an esterase (E), generating an alcohol and a fatty acid that then enters the tricarboxylic acid (TCA) cycle.
Due to the size and hydrophobicity of long-chain alkanes (e.g., n-C28), multiple passive and active transport systems may be required for uptake. Four proteins possibly involved in long-chain alkane transport were significantly differentially expressed during growth on n-C28 compared to n-C14 and Tween 80 (Figure 4). Expression of an outer membrane TonB-dependent receptor (TOL_0244; P = 0.01) increased sevenfold during growth on n-C28 compared to Tween 80 (Figure 4A). This receptor has 40.3% similarity to OmpS (B5T_01485), a TonB-dependent receptor in Alcanivorax dieselolei, which was found to be essential for alkane detection and utilization (Wang and Shao, 2014). This suggests the receptor may carry out detection, high-affinity binding and energy-dependent uptake of n-C28 into the periplasm (Figure 4B).
FIGURE 4 | (A) Spectral counts (means ± SE; n = 4) of differentially expressed long-chain alkane transport proteins during growth on a medium-chain alkane (n-C14), a long-chain alkane (n-C28) and a non-hydrocarbon control (Tween 80) in Thalassolituus oleivorans MIL-1. * denotes spectral counts significantly greater (P < 0.05) during growth on n-C28 relative to growth on n-C14 and Tween 80. (B) Proposed uptake of a long-chain n-alkane by the identified proteins. TonB interacts with TOL_0244, which carries out high-affinity binding and energy-dependent uptake of the substrate into the periplasm. In the absence of TonB, the receptor will bind the substrate but not carry out active transport. TOL_1625 is a FadL homolog which allows hydrophobic compounds to cross the cell membrane through diffusion. TOL_3188 is a porin which may facilitate passive uptake of long-chain alkanes and may be transported to the outer membrane by the inner-membrane transport protein TOL_3187.
The expression of a long-chain fatty acid transport protein (TOL_1625; P < 0.0001) increased ninefold in the presence of n-C28 when compared to n-C14, with no expression detected on Tween 80 (Figure 4A). This protein is a member of the Toluene_X family, a family of passive outer membrane transport proteins that includes TodX from Pseudomonas putida (implicated in toluene catabolism) and FadL, a group of long-chain fatty acid transporters. Given that FadL family proteins have also recently been implicated in alkane transport across the outer membrane (Wang and Shao, 2014), TOL_1625 likely allows passive uptake of n-C28 into the periplasm (Figure 4B).
The expression of an uncharacterized protein (TOL_3187; P < 0.0001) increased 42-fold in the presence of n-C28 when compared to n-C14 and Tween 80 (Figure 4A). Domain analysis showed two mycobacterial membrane protein large (MMPL) domains and a LolA-like domain. MMPL proteins are important in substrate transport across the inner membrane to the periplasm. SCOOP analysis in Pfam, which uses sequence information to detect protein families related to a given family, revealed a relationship between the MMPL family and the protein export SecD/SecF family. LolA is required for localization of lipoproteins to the outer membrane. Consensus prediction of GO terms in I-TASSER showed that the predicted molecular function of the protein is transporter activity (GO:0005215), the biological process is transport (GO:0006810) and the protein is an integral component of a membrane (GO:0016021). Another uncharacterized protein (TOL_3188; P < 0.0001) is coded by a gene immediately downstream of TOL_3187 and was expressed exclusively during growth on n-C28 (Figure 4A). Domain analysis showed that no conserved domains were present. Consensus prediction of GO terms in I-TASSER showed that the predicted molecular function is porin activity (GO:0015288) and that the protein is part of a macromolecular complex (GO:0032991) of two or more proteins. This porin may be transported to the outer membrane via the MMPL protein (TOL_3187) and then facilitate the passage of long-chain alkanes (Figure 4B).

Chemotaxis Toward Long-Chain Alkanes
T. oleivorans typically has a monopolar, monotrichous flagellum, although a monopolar tuft of four flagella has also been observed (Yakimov et al., 2004). We identified an array of genes spanning the region TOL_2479-TOL_2522 that code for proteins involved in chemotaxis and for components of the flagellum and flagellar motor. Three proteins coded by genes in this region were significantly upregulated during growth on n-C28 relative to growth on n-C14 and Tween 80 (Figure 5). These include a purine-binding chemotaxis protein, CheW (TOL_2481; P = 0.001), a methyl-accepting chemotaxis protein, MCP (TOL_2508; P < 0.0001) and a histidine kinase, CheA (TOL_2510; P < 0.0001). Expression increased 2.5-fold, 2-fold, and 4-fold, respectively, compared to growth on n-C14 and Tween 80, suggesting a putative role in chemotaxis toward long-chain alkanes. An uncharacterized protein (TOL_0708; P = 0.001) was also significantly upregulated, increasing twofold during growth on n-C28. The gene that codes for this protein is located adjacent to the gene for the Baeyer-Villiger monooxygenase (TOL_0709) that was expressed during growth on n-C28. Domain analysis revealed a HAMP domain and an MCP signal domain, indicating the protein is another MCP. Considering that genes for other MCPs have been detected next to those for alkane-oxidizing monooxygenases [e.g., tlpS in Pseudomonas aeruginosa PAO1, alkN in Pseudomonas putida GPo1 (van Beilen et al., 2001)], this suggests that the protein is involved in chemotaxis toward alkanes with a specific role in long-chain alkane metabolism.

DISCUSSION
FIGURE 5 | (A) Spectral counts (means ± SE; n = 4) of differentially expressed chemotaxis proteins during growth on a medium-chain alkane (n-C14), a long-chain alkane (n-C28) and a non-hydrocarbon control (Tween 80) in Thalassolituus oleivorans MIL-1. * denotes spectral counts significantly greater (P < 0.05) during growth on n-C28 relative to growth on n-C14 and Tween 80. MCP, methyl-accepting chemotaxis protein; CheW, purine-binding chemotaxis protein; CheA, histidine kinase. (B) Proposed chemotactic response of Thalassolituus oleivorans MIL-1 to long-chain alkanes. The cytoplasmic side of the MCP dimers interacts with two proteins, CheW and CheA. When the MCP is not bound to an attractant, it stimulates CheA to phosphorylate itself using ATP. CheA auto-phosphorylation is inhibited when the attractant is bound to its MCP. CheW physically bridges CheA to the MCPs to allow regulated phosphotransfer to CheY. Phosphorylated CheY interacts with FliM in the basal body, which is connected to the flagellum.
T. oleivorans is a key member of the marine OHCB, which are characterized by their ability to metabolize hydrocarbons almost exclusively as substrates. Beyond hydrocarbons, OHCB can use only a small number of substrates, such as the organic acids and central metabolism intermediates acetate, lactate and pyruvate, or Tween 40/80 (Yakimov et al., 1998; Golyshin et al., 2003). T. oleivorans was capable of growth on every n-alkane substrate tested, with lengths ranging from n-C10 to n-C32. This range of utilizable alkane substrates is far greater than the n-C7 to n-C20 range previously documented for T. oleivorans (Yakimov et al., 2004; Wentzel et al., 2007) and similar to that of Alcanivorax borkumensis SK2, which is also capable of growth on alkanes up to n-C32 (Schneiker et al., 2006), although T. oleivorans cannot grow on the branched alkane pristane. T. oleivorans has a wider substrate range than some other OHCB as it is capable of growth on longer-chain alkanes. For example, Marinobacter hydrocarbonoclasticus can only metabolize alkanes up to n-C28 (Klein et al., 2008) and Oleispira antarctica can only metabolize alkanes up to n-C24 (Gregson, unpublished).

Terminal Oxidation During Growth on Medium-Chain Alkanes
TOL_1175, which was expressed only during growth on n-C14, encodes an alkane monooxygenase that is homologous to AlkB2 from the psychrophilic obligate hydrocarbon degrader Oleispira antarctica (OLEAN_C23040, 65% identity). A transcriptional regulator belonging to the GntR family (TOL_1176) was detected in the data set but was not significantly differentially expressed. The gene coding for the regulator is located adjacent to the alkB2 gene on the genome. This arrangement of a GntR transcriptional regulator gene adjacent to the alkB2 gene can be seen in other alkane-degrading bacteria, e.g., P. putida GPo1, A. borkumensis and A. hongdengensis (van Beilen et al., 2004; Wang et al., 2010; Sabirova et al., 2011). The electron-shuttling proteins rubredoxin and rubredoxin reductase are required for electron transfer to AlkB, which uses the electrons for alkane hydroxylation (van Beilen et al., 2006). In general, bacterial cytochrome P450s, rather than AlkB, are the proteins that require ferredoxin and ferredoxin reductase for electron transfer. However, TOL_2658, which was exclusively expressed on alkanes and had 11-fold greater expression on n-C14 compared to n-C28, encodes an alkane monooxygenase, and TOL_2659, which was expressed in a similar pattern (12-fold higher on n-C14), encodes a ferredoxin. Ferredoxin and ferredoxin reductase have been shown to functionally replace rubredoxin and rubredoxin reductase in vitro (Peterson et al., 1966; Benson et al., 1977). Additionally, analysis of sequenced microbial genomes and metagenomes from terrestrial, freshwater and marine environments found that four of 23 genes encoding multi-domain AlkB systems comprised an N-terminal ferredoxin domain, a ferredoxin reductase domain and a C-terminal alkane monooxygenase domain (Nie et al., 2014).
The organization of the genes involved in alkane degradation differs significantly among OHCB. For example, the genetic organization of A. borkumensis is similar to that of P. putida GPo1, where all the proteins required to oxidize an alkane up to the corresponding acyl-CoA derivative are encoded by the alkBFGHJKL operon, which includes the alkane monooxygenase and rubredoxins, alcohol and aldehyde dehydrogenases, and an acyl-CoA synthetase (van Beilen et al., 1994, 2001). This pathway appears to have been horizontally transferred across many bacteria (van Beilen et al., 2001). In contrast, T. oleivorans does not share this level of organization, as many of the upregulated proteins involved in alkane degradation are encoded by genes that are distributed throughout the genome rather than on a single operon. In general, alkane-degrading bacteria have multiple alkane monooxygenases which expand the n-alkane range of the host strain (van Beilen et al., 2006). For example, Alcanivorax borkumensis contains two alkB and three P450-type monooxygenase genes (Schneiker et al., 2006), and multiple monooxygenases exist in the Actinobacteria Amycolicicoccus subflavus DQS3-9A1 and Rhodococcus Q15/NRRL B-16531 (Whyte et al., 2002; Nie et al., 2013). When several alkane monooxygenases coexist in a bacterium, they are normally located at different sites in the chromosome, and the alkane-responsive regulators that control expression of the degradation genes (e.g., LuxR/MalR or AraC/XylS) may or may not be adjacent to them.

Subterminal Oxidation During Growth on Long-Chain Alkanes
Bacteria degrading long-chain alkanes frequently contain proteins homologous to AlkB, a membrane-bound non-heme di-iron monooxygenase found in Pseudomonas putida GPo1, which utilizes a terminal oxidation pathway for alkane degradation. For example, Acinetobacter sp. M1 contains genes that code for membrane-bound oxygenase complexes involved in long-chain alkane degradation (Tani et al., 2001). These include alkMa, which was induced by long-chain n-alkanes greater than n-C22, and alkMb, which was preferentially induced by n-alkanes with chain lengths of n-C16 to n-C22. Acinetobacter sp. DSM 1784 was the first strain reported to have an enzyme involved in the degradation of n-alkanes longer than n-C30, which was named AlmA. AlmA homologs have been detected in other OHCB such as Alcanivorax borkumensis SK2, Alcanivorax hongdengensis A-11-3 and Alcanivorax dieselolei B-5 (Rojo, 2009; Wang et al., 2010; Liu et al., 2011). In addition to AlmA, other alkane hydroxylases involved in long-chain n-alkane degradation have also been reported, such as a thermophilic soluble long-chain alkane monooxygenase (LadA), which carries out terminal oxidation of alkanes ranging from n-C15 to n-C36 in Geobacillus thermodenitrificans NG80-2 (Feng et al., 2007; Ji et al., 2013). However, almA and ladA homologs are not present in the T. oleivorans genome, suggesting an alternative mechanism of long-chain alkane degradation. Several proteins upregulated during growth on n-C28 (Figure 3) provided strong evidence of a potential subterminal long-chain alkane oxidation pathway. This involves the initial hydroxylation of one of the subterminal carbon atoms in the alkane chain by an alkane monooxygenase (Kotani et al., 2007). Whilst expression of a specific subterminal alkane monooxygenase was not observed during growth on n-C28, the same monooxygenase that was highly upregulated during terminal oxidation of n-C14 was also expressed. This suggests the possibility that this enzyme is not strictly regioselective and can activate alkanes at either the terminal or subterminal position. Such enzymatic non-specificity can yield both primary and secondary alcohols from initial alkane oxidation at terminal and subterminal carbons (Fredricks, 1967; Markovetz and Kallio, 1971; Singer and Finnerty, 1984).
Secondary alcohols generated can be converted to the corresponding ketone via an alcohol dehydrogenase (TOl_2772 was 2.5-fold upregulated), which would then be subsequently oxidized by the Baeyer-Villiger monooxygenase (BVMO) that was 18-fold upregulated (TOL_0709) to render an ester. The esterase (TOL_0906) would then hydrolyse the ester to generate an alcohol and a fatty acid that enters the tricarboxylic acid (TCA) cycle. Similar esterases are also present in T. oleivorans R6-15 (W8G2U9; 100%), Oleibacter marinus (WP_076513796.1; 56%) and Oleispira antarctica RB-8 (Q6A2S8; 52.5%). Our sequence analysis of the BVMO (TOL_0709) that was upregulated, shows that the amino acid sequence clusters phylogenetically with the BMVO subclass of the group B flavoprotein monooxygenases (Supplementary Figure 3), including the prodrug activator ethionamide monooxygenase (EthA), a bona fide BVMO capable of oxidizing several ketones from Mycobacterium tuberculosis (Fraaije et al., 2004;Throne-Holst et al., 2007). BLASTp analysis revealed that an identical copy of this BVMO is also present in the other two strains of Thalassolituus that recently had their genomes sequenced (T. oleivorans K188/CP017810.1; T. oleivorans R6-15/CP006829.1). An identical protein sequence (CUS41955.1) has also been detected in a metagenome from a marine hydrothermal vent, and the sequence also shares 71% and 64% identity to putative EthA type BVMOs from the OHCBs Oleibacter sp. HI0075 (KZZ04053.1) and Oleispira antarctica RB-8 (OLEAN_C10660), respectively, suggesting sub-terminal BMVO type alkane degradation may be more widespread in the marine environment than is currently known. TOL_0709 groups phylogenetically with numerous EthA type BVMO monooxygenases (Supplementary Figure 4) but also has homology with AlmA genes from many species. 
AlmA has been implicated in both terminal (Wang and Shao, 2014) and subterminal (Minerdi et al., 2012) oxidation of alkanes, but the exact role of AlmA in alkane degradation is still poorly understood (Wang and Shao, 2012). All AlmA sequences within the phylogenetic tree (Supplementary Figure 4) also contain both BVMO fingerprint motifs (Supplementary Figure 2) (Fraaije et al., 2002;Riebel et al., 2013), suggesting that many AlmA proteins may catalyze subterminal BVMO reactions rather than terminal monooxygenase reactions.
More distantly related BVMOs have been isolated from Acinetobacter species such as Acinetobacter radioresistens S13 (Minerdi et al., 2012), whose BVMO was induced during growth in the presence of long-chain alkanes (n-C24, n-C36) and shown to subterminally oxidize alkanes. Both T. oleivorans MIL-1 and Acinetobacter radioresistens S13 are capable of growth on short- and/or long-chain alkanes as the sole carbon source. They both possess genes coding for terminal alkane hydroxylases homologous to alkB and BVMOs homologous to EthA (van Beilen et al., 1992;Minerdi et al., 2012). As seen in Acinetobacter radioresistens S13 (Minerdi et al., 2012), T. oleivorans MIL-1 appears to differentially express either AlkB or BVMO according to the presence of medium- or long-chain alkanes. Both terminal and subterminal oxidation can coexist in some microorganisms (Rojo, 2009). Subterminal oxidation has been described for both short- and long-chain alkanes as well as for fatty acids (Ashraf et al., 1994;Wentzel et al., 2007). In fatty acids, chain length determines the rate and position of hydroxylation (Van Bogaert et al., 2007). The longer the chain, the lower the ratio of terminal to subterminal oxidation. For example, palmitic acid (C16H32O2) and heptadecanoic acid (C17H34O2) are predominantly hydroxylated at the terminal position, whereas no terminal oxidation is observed for oleic acid (C18H34O2). This may also be the case for n-alkanes and may explain why the subterminal oxidation pathway potentially only occurs during growth on n-C28. The production of ketones and esters is impossible in the terminal oxidation pathway, so the upregulation of a BVMO and an esterase, which catalyze ketone and ester metabolism respectively, and the lack of a primary alcohol metabolite during growth on n-C28 strongly suggest that T. oleivorans uses a subterminal oxidation pathway during growth on long-chain alkanes.

Long-Chain Alkane Transport
Although the genes and proteins that enable the uptake and passage of aromatic hydrocarbons across the bacterial outer membrane have been characterized (for a review, see Wang and Shao, 2013), the transport mechanisms involved in long-chain alkane uptake remain unclear. Direct uptake of alkanes from the water phase is only possible for low molecular weight compounds, which are sufficiently soluble to facilitate efficient transport into cells (Rojo, 2009). As the molecular weight of the alkane increases, its solubility in water decreases (Eastcott et al., 1988). Our data suggest that, to overcome this problem, T. oleivorans MIL-1 utilizes a combination of both active and passive transport systems for uptake of alkanes, with higher expression on n-C28 compared to the non-hydrocarbon control, including a TonB-dependent receptor (TOL_0244, active, sevenfold increase in n-C28), a FadL homolog (TOL_1625, passive, ninefold increase in n-C28) and a specialized porin (TOL_3188, passive, exclusively expressed in n-C28). An inner membrane transport protein (TOL_3187, active, 42-fold increase in n-C28) involved in the export of an outer membrane protein was also expressed.
TOL_0244 encodes a TonB-dependent receptor. These receptors are specialized and ligand-specific, and use active transport to move compounds against a concentration gradient, using energy from the proton-motive force transduced through physical interaction with TonB-ExbB-ExbD, an inner membrane complex (Buchanan et al., 1999;Postle and Kadner, 2003). TonB-dependent receptor homologs have been detected and implicated in the uptake of alkanes in other OHCB, such as OmpS in Alcanivorax dieselolei B-5 (Wang and Shao, 2014). The outer membrane protein OmpS (B5T_01485) shares 40.3% similarity with TOL_0244 and shares conserved domains with a TonB-dependent receptor protein responsible for ferric citrate transport in Escherichia coli (EGD69624). OmpS was found not only to detect alkanes and enable their utilization but also to trigger the expression of an alkane chemotaxis response. Active transport proteins involved in the uptake of alkanes are not fully understood, and only the TonB active transport system has previously been described as being involved.
TOL_1625 encodes a long-chain fatty acid transporter that is a member of the FadL family of proteins. Members of this family are responsible for the passive transport of hydrophobic compounds across the bacterial outer membrane (van den Berg et al., 2004). The first member of the FadL family was isolated from Escherichia coli and characterized as a long-chain fatty acid transporter (Black et al., 1987). FadL shares some structural similarities with the first known bacterial alkane importer, AlkL, which is part of the alkBFGHJKL operon in Pseudomonas putida GPo1 (van Beilen et al., 2001;Grant et al., 2014). Both possess a lateral transfer mechanism in which high-affinity substrate binding causes conformational changes in the N terminus that open a channel for passive substrate diffusion, allowing hydrophobic molecules to enter the outer membrane through an opening in the barrel wall (van den Berg, 2010; Grant et al., 2014). FadL homologs are present in many Gram-negative bacteria that are involved in the biodegradation of xenobiotics (van den Berg et al., 2004). These compounds face the same bacterial cell entry problems as long-chain fatty acids, which suggests that hydrophobic compounds enter cells by a mechanism like that employed by FadL for long-chain fatty acids (van den Berg et al., 2004). This theory has been supported by biochemical data on other members of the FadL family including XylN, the xylene transporter from Pseudomonas putida, and TbuX, the toluene transporter from Ralstonia pickettii PK01 (Kahng et al., 2000;Kasai et al., 2001). FadL homologs have also been detected in other OHCB, such as A. dieselolei B-5, which has three outer-membrane proteins belonging to the FadL family shown to be involved in the selective transport of n-alkanes (C8-C36) (Lai et al., 2012;Wang and Shao, 2014).
Proteins that must reach the outer membrane need to be transported from the ribosome, where they are synthesized, to the inner membrane, where they have to be differentiated from inner-membrane proteins, then transported across the inner membrane into the periplasm, transported through the periplasm and finally assembled into the outer membrane (Reusch, 2012). TOL_3187 is a member of the MMPL family. Members of this family are inner membrane multisubstrate efflux pumps, which belong to the RND (resistance-nodulation-cell-division) permease superfamily of transmembrane transporters (Székely and Cole, 2016). They transport a range of different substrates including trehalose monomycolate, siderophores, phthiocerol dimycocerosate, sulfolipid-1, acylated trehaloses and mycolate ester wax (Jain and Cox, 2005;Seeliger et al., 2012;Varela et al., 2012;Pacheco et al., 2013;Wells et al., 2013;Belardinelli et al., 2014). Substrate transport is driven by the proton motive force (PMF) of the transmembrane electrochemical proton gradient (Chim et al., 2015). MMPL proteins are not restricted to mycobacteria and have been found in the related genera Streptomyces and Rhodococcus (Deshayes et al., 2010;Letek et al., 2010;Cano-Prieto et al., 2015). SCOOP analysis showed that there is a relationship between the MMPL family and the SecDF family. The functional importance of SecDF in protein translocation across the inner membrane was previously shown in vivo, with SecD- and SecF-deficient Escherichia coli strains being severely defective in protein export (Pogliano and Beckwith, 1994;Nouwen et al., 2005;Hand et al., 2006). This demonstrated that bacterial SecD and SecF are required for efficient protein translocation across the inner membrane. The relationship with the SecDF family suggests that MMPL proteins may also be involved in the export of outer membrane proteins across the inner membrane.
TOL_3187 is also a member of the LolA-like superfamily, which comprises two proteins of the Lol system: the periplasmic molecular chaperone LolA and the outer membrane lipoprotein receptor LolB. The Lol system comprises five Lol proteins (A-E), which catalyze the sorting and outer membrane localization of lipoproteins (Narita et al., 2004;Tokuda and Matsuyama, 2004). Lipoproteins released into the periplasm form a water-soluble complex with the periplasmic chaperone, LolA (Remans et al., 2010). This LolA-lipoprotein complex crosses the periplasm and then interacts with the outer membrane receptor LolB, which is essential for the anchoring of lipoproteins to the outer membrane (Tsukahara et al., 2009). The Lol system has recently been implicated as an alternative to the β-barrel assembly machinery (BAM) for the assembly and insertion of integral outer membrane proteins (Collin et al., 2011;Dunstan et al., 2015;Huysmans et al., 2015;Jeeves and Knowles, 2015). For example, the Pul secretin, a homododecamer of the outer membrane protein PulD, requires the outer membrane lipoprotein PulS for outer membrane localization (Hardie et al., 1996;Nouwen et al., 1999). It has been suggested that PulD might "piggy back" on PulS to co-opt LolA for outer membrane localization. The porin (TOL_3188) was exclusively expressed on n-C28. Most small substrates, with molecular masses below 650 Da, cross the outer membrane via passive diffusion through non-specific porins, such as OmpF or OmpC (Nikaido, 2003). When the substrate is too large for the generalized porins, an alternative specific or specialized porin is used (Davidson et al., 2008). We propose that this specialized porin may be actively transported by TOL_3187 across the inner membrane into the periplasm, then across the periplasm to the outer membrane, where it is assembled. It would then facilitate passive diffusion of long-chain alkanes.

Chemotaxis Toward Long Chain Alkanes
Chemotaxis allows the movement of flagellated bacteria toward or away from chemical gradients in the environment, and this process plays a role in hydrocarbon degradation by bringing cells into contact with alkanes (Parales and Harwood, 2002). A methyl-accepting chemotaxis protein (MCP) detects the presence of an attractant or repellent via a periplasmic sensing domain. Binding of the chemoeffector (alkane) causes the MCP to undergo a conformational change that is transmitted across the inner membrane to the cytoplasmic chemotaxis proteins.
The signal is transmitted to the flagellar motor via CheW and the sensor histidine kinase CheA, which can phosphorylate the response regulator CheY. Phosphorylated CheY controls swimming behavior by binding to the flagellar motor to reverse the default direction of flagellar rotation from counter-clockwise to clockwise. This signaling cascade results in movement toward or away from the chemical attractant or repellent. During growth on n-C28, two MCPs (TOL_0708, TOL_2508), CheW (TOL_2481), and CheA (TOL_2510) were detected, suggesting their involvement in chemotaxis toward long-chain alkanes.
Several alkane-specific chemotaxis genes have been found in alkane-degrading bacteria. For example, tlpS, which is located downstream of the alkane monooxygenase gene alkB1 on the P. aeruginosa PAO1 genome, has been predicted to encode an MCP that plays a role in alkane chemotaxis. Similarly, alkN appears to encode an MCP in P. putida GPo1 (van Beilen et al., 2001). Raman microspectroscopy showed that chemotaxis in Acinetobacter baylyi toward alkanes is highly specific, and the chemotaxis proteins therefore follow the same behavior (Li et al., 2017). For example, the coupling protein CheW2 from A. dieselolei was found to be induced only by long-chain alkanes (Wang and Shao, 2014). However, both MCP and CheA were upregulated in response to all types of alkanes, including n-C8 to n-C32 and pristane. These data demonstrate that an entire chemotaxis complex is differentially expressed during growth on long-chain alkanes.

CONCLUSION
In conclusion, T. oleivorans MIL-1, an important OHCB that can dominate microbial communities following marine oil spills, possesses an ability to degrade a wide range of n-alkanes including both medium and long chain lengths. This study has expanded the known substrate range of T. oleivorans to include longer n-alkanes up to n-C32. This study has also significantly enhanced our understanding of the fundamental physiology of T. oleivorans MIL-1 by identifying key enzymes involved in both the terminal oxidation of medium-chain alkanes and the subterminal oxidation of long-chain alkanes, as well as proteins involved in chemotaxis toward long-chain alkanes and their transport across the cell membrane. In particular, a potentially novel BVMO system is upregulated for the subterminal oxidation of long-chain alkanes, highlighting an alternative pathway to the currently known AlmA and LadA terminal monooxygenase pathways. Given the homology of the BVMO enzyme TOL_0709 to proteins present in other ubiquitous marine OHCB (e.g., Oleibacter and Oleispira), this pathway may be widespread in the marine environment.

Metabolomics
T. oleivorans MIL-1 (DSM 14913) was grown in sterile 160 ml Nunc Cell Culture Treated Flasks containing 100 ml of ONR7a media (Dyksterhouse et al., 1995). Cultures were enriched separately with the medium-chain alkane n-tetradecane (n-C14) or the long-chain alkane n-octacosane (n-C28). Cultures were incubated in an orbital shaker (16 °C, 60 rpm) over a period of 7 days. Each day, triplicate cultures of n-C14 and n-C28 were agitated through water sonication for 30 min, preconcentrated through solid phase extraction (SPE) with Supel-Select HLB cartridges (Supelco, 200 mg, 6 ml) and eluted with 12 ml of chloroform. Samples were further concentrated by evaporating the chloroform to 200 µl with nitrogen. The concentrated solvent was analyzed to detect the alcohols produced from the oxidation of the alkanes used as growth substrates. GC-MS analysis was carried out as previously described (McKew et al., 2007b). The mass spectrometer scanned m/z values from 10 to 550, with identification of target analytes based on retention times of analytical standards: 1-tetradecanol, 2-tetradecanol and 1-octacosanol (Sigma-Aldrich).

LC-MS/MS Proteomics
Cells were harvested from 50 ml of each culture by centrifugation (4,600 × g, 15 min) and washed in 2 ml of phosphate buffered saline. Total protein was extracted by resuspending the cell pellet in 75 µl of extraction buffer [62.5 mM TRIS, 10% glycerol w/v, 12 mM dithiothreitol (DTT), 2% sodium dodecyl sulfate (SDS) v/v and one Pierce Protease Inhibitor Tablet per 50 ml], heating in a water bath (95 °C, 12 min) to fully lyse the cells, and then centrifuging (10,500 × g, 5 min) to remove cell debris. Protein extracts were visualized by SDS-PAGE, and trypsin digestion and LC-MS/MS with a Thermo Fisher hybrid high-resolution LTQ Orbitrap instrument were performed as previously described (McKew et al., 2013). The LTQ Orbitrap raw data files were first converted to MSM files with the MaxQuant "Quant" module (Cox and Mann, 2008). The open-source search engine Andromeda, which is integrated into MaxQuant, was used to identify peptides in sequence databases by their fragmentation spectra (Cox et al., 2011). Peptides and proteins were filtered at a false discovery rate (FDR) of 0.01 to obtain the final datasets. Proteins were quantified by counting the number of MS/MS spectra matched to corresponding proteins. Uniprot protein sequences from the T. oleivorans MIL-1 genome (Golyshin et al., 2013) were used to perform protein identification. Proteins were validated using the default settings in MaxQuant and Andromeda, requiring a minimum of one peptide, provided that any such protein was unambiguously identified by peptides unique to that protein (see full parameter settings in Supplementary Table 1). Spectral counts were normalized to total spectral counts to account for small observed differences between runs (total spectral counts varied between 14707 and 16669).
The Normalized Spectral Abundance Factor (number of spectral counts divided by the length of the polypeptide, expressed as percentage for each protein compared to the sum of this ratio for all the detected proteins) was also calculated as longer proteins are expected to produce more peptides. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (Deutsch et al., 2017) via the PRIDE (Vizcaíno et al., 2016) partner repository with the dataset identifier PXD011824.
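As an illustration only, the two normalization steps described above can be sketched in Python. This is not the authors' pipeline: the protein identifiers, counts and lengths are hypothetical, and scaling each run to the mean total spectral count is an assumption, since the text states only that counts were normalized to total spectral counts.

```python
def normalize_to_total(runs):
    """Scale each run so its total spectral count equals the mean total.

    runs maps a run name to {protein_id: raw spectral count}.
    Scaling to the mean total is an assumption made for this sketch.
    """
    totals = {run: sum(counts.values()) for run, counts in runs.items()}
    mean_total = sum(totals.values()) / len(totals)
    return {
        run: {prot: n * mean_total / totals[run] for prot, n in counts.items()}
        for run, counts in runs.items()
    }


def nsaf_percent(counts, lengths):
    """Return NSAF (%) per protein: (SpC / L) / sum(SpC / L) * 100."""
    # Spectral abundance factor: spectral count divided by polypeptide length.
    saf = {prot: counts[prot] / lengths[prot] for prot in counts}
    saf_sum = sum(saf.values())
    return {prot: 100.0 * value / saf_sum for prot, value in saf.items()}


# Hypothetical proteins, counts and lengths (amino acids), for illustration only.
raw_runs = {
    "run_1": {"prot_A": 10, "prot_B": 20},
    "run_2": {"prot_A": 30, "prot_B": 30},
}
lengths = {"prot_A": 100, "prot_B": 400}

normalized = normalize_to_total(raw_runs)
nsaf_run_1 = nsaf_percent(normalized["run_1"], lengths)
```

Because NSAF is a ratio within a run, it is unaffected by the run-level scaling; the scaling matters only when normalized spectral counts themselves are compared between runs.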

Statistical and Bioinformatic Analysis
Differential expression analysis was performed by analysis of variance (ANOVA) and Tukey's HSD test with Benjamini-Hochberg post hoc corrections (Benjamini and Hochberg, 1995) within the XLSTAT-Premium Version 2016.1 (Addinsoft) 'OMICs' package. All proteins significantly (P < 0.05) upregulated during growth on one substrate compared with another were subjected to a BLAST (Basic Local Alignment Search Tool) search (Altschul et al., 1990) against the NCBI nr database. Protein family and domain analysis was carried out in Pfam v30.0 (Finn et al., 2016). SCOOP (Simple Comparison of Outputs Program) (Bateman and Finn, 2007) was used to detect relationships between families in the Pfam database. Proteins were assigned to functional families by hierarchical classification of protein domains based on their folding patterns in CATH v4.1 (Class, Architecture, Topology, Homology) (Sillitoe et al., 2015). Full-length secondary and tertiary structure predictions, functional annotations on ligand-binding sites, Enzyme Commission numbers and gene ontology terms were generated using the I-TASSER SERVER (Zhang, 2008).
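For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal Python sketch of the adjustment is given here. It illustrates the published step-up method (Benjamini and Hochberg, 1995) and is not the XLSTAT implementation used in this study; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values
    # never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four proteins, for illustration only.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

A protein would then be reported as differentially expressed when its adjusted P value falls below the chosen threshold (0.05 in this study).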