The Surfactin-Like Lipopeptides From Bacillus spp.: Natural Biodiversity and Synthetic Biology for a Broader Application Range

Surfactin is a lipoheptapeptide produced by several Bacillus species and identified for the first time in 1969. This review first describes the biosynthesis of this remarkable biosurfactant. The peptide moiety of surfactin is synthesized by huge multienzymatic proteins called NonRibosomal Peptide Synthetases. This mechanism is responsible for the peptide biodiversity of the members of the surfactin family. In addition, on the fatty acid side, fifteen different isoforms (from C12 to C17) can be incorporated, further increasing the number of surfactin-like biomolecules. The review also highlights the latest developments in metabolic modeling and engineering and in synthetic biology, both to direct surfactin biosynthesis and to generate novel derivatives. This large set of different biomolecules leads to a broad spectrum of physico-chemical properties and biological activities. The last parts of the review summarize the numerous studies on the optimization of production processes, as well as the approaches developed to increase the surfactin productivity of Bacillus cells, taking into account the different steps of its biosynthesis, from gene transcription to surfactin degradation in the culture medium.


INTRODUCTION
Surfactin was first isolated in 1968 by Arima et al. as a new biologically active compound produced by Bacillus with surfactant activities, hence its name. Its structure was elucidated first through its amino acid sequence (Kakinuma et al., 1969a) and then its fatty acid chain (Kakinuma et al., 1969b). Surfactin was thus characterized as a lipopeptide composed of a heptapeptide with the following sequence: L-Glu1-L-Leu2-D-Leu3-L-Val4-L-Asp5-D-Leu6-L-Leu7, forming a lactone ring structure with a β-hydroxy fatty acid chain. Bearing both a hydrophilic peptide portion and a lipophilic fatty acid chain, surfactin is amphiphilic, which leads to exceptional biosurfactant activities and diverse biological activities.
Surfactins are now considered a family of lipopeptides that share common structural traits but show great structural diversity, owing to the type of amino acids in the peptide chain and the length and isomerism of the lipid chain (Ongena and Jacques, 2008). More than one thousand variants can potentially be naturally synthesized. This remarkable biodiversity mainly results from their biosynthetic mechanism.
This review is composed of four main sections. First, a detailed description of the biosynthesis mechanisms explains the origin of this biodiversity. Second, the diversity of variants is presented, together with the possibilities for enhancing it. Third, the link between surfactin's varying structure and its properties and activities is described. Lastly, the production process and its optimisation are discussed, either for the whole surfactin family or for specific variants.

Peptide Moiety
Surfactins, like most cyclic lipopeptides (CLPs), are not synthesized ribosomally but by specialized systems termed non-ribosomal peptide synthetases (NRPSs). NRPSs are multimodular mega-enzymes consisting of repeated modules. A module is defined as a portion of the NRPS that incorporates one specific amino acid into a peptide backbone. The order of the modules is usually co-linear with the product peptide sequence. Each module can in turn be dissected into the following three domains: the adenylation (A) domain, the thiolation (T) domain, also called peptidyl-carrier protein (PCP), and the condensation (C) domain (Marahiel et al., 1997; Roongsawang et al., 2011). The A-domain recognizes, selects, and activates the specific amino acid of interest (Dieckmann et al., 1995). Based on the 3D structures of several adenylation domains and their active sites, several tools have been set up to correlate the amino acid residues present in the active site with the substrate specificity. An NRPS code based on 8 amino acid residues from the active site was thus defined (Stachelhaus et al., 1996; Rausch et al., 2005). The activated amino acid is then covalently bound as a thioester to the flexible 4′-phosphopantetheinyl (4′-Ppant) arm of the T-domain. The 4′-Ppant prosthetic group is 20 Å in length and can swing from one catalytic center to an adjacent one. It is precisely this flexibility that enables the transfer of the activated amino acid substrate to the C-domain, which in turn catalyzes (i) the formation of a peptide bond between the nascent peptide and the amino acid carried by the adjacent module and then allows (ii) the translocation of the growing chain to the following module. Various functional subtypes of the C domain have been described.
For example, an LCL domain catalyzes the formation of a peptide bond between two L-amino acids, while a DCL domain forms one between an L-amino acid and a growing peptide ending with a D-amino acid (Rausch et al., 2007). The first module (A-T module) is considered the initiation module, while the subsequent (C-A-T) modules are defined as elongation modules. After several module-mediated cycles of peptide extension, the complete linear intermediate peptide is released by the terminal thioesterase (TE) domain, which often catalyzes an internal cyclization (Marahiel et al., 1997; Trauger et al., 2000). Besides the above-mentioned domains, the NRPS assembly line can comprise additional optional domains that catalyze modifications of the amino acid building blocks, e.g. their epimerization (E-domains) (Süssmuth and Mainz, 2017). The lipid moiety of surfactins and of most microbial lipopeptides is introduced directly at the start of the biosynthesis. In this case, the initiation module features a C-A-T instead of a classic A-T structure (Sieber and Marahiel, 2005; Bloudoff and Schmeing, 2017). It contains a special N-terminal C-domain, termed the C-starter (CS) domain, which is in charge of linking a CoA-activated β-hydroxy fatty acid to the first amino acid. The activated fatty acid stems foremost from primary metabolism (Figure 1).
Three decades ago, the biosynthetic gene cluster (BGC) of the CLP surfactin was described in parallel by different research groups (Nakano et al., 1988; Cosmina et al., 1993; Fuma et al., 1993; Sinderen et al., 1993). The structural genes were identified in B. subtilis and consist of the four biosynthetic core NRPS genes srfAA, srfAB, srfAC, and srfAD (Figure 1), which together code for a heptamodular NRPS assembly line. The three-modular enzyme SrfAA contains at its N-terminus the typical CS-domain of CLP-BGCs and acylates the first amino acid Glu1 with various 3-OH fatty acids stemming from primary metabolism. The peptide is subsequently extended in a co-linear fashion by the elongation modules of SrfAA, SrfAB and SrfAC to yield a linear heptapeptide (FA-L-Glu1-L-Leu2-D-Leu3-L-Val4-L-Asp5-D-Leu6-L-Leu7). The inverted stereochemistry can be readily attributed to the presence of E-domains in modules M3 and M6 and DCL domains in modules M4 and M7 (Figure 1). Finally, the TE domain of SrfAC releases the lipopeptide and performs the macrocyclization between Leu7 and the hydroxy group of the 3-OH fatty acid. Notably, SrfAD consists solely of a second TE-domain, which acts rather as a supportive repair enzyme and is able to regenerate misprimed T-domains during NRPS assembly (Schneider et al., 1998; Schwarzer et al., 2002; Yeh et al., 2004).
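The co-linearity rule described above can be illustrated with a minimal Python sketch. The class and function names are hypothetical and chosen only for illustration. The sketch assumes the commonly reported split of three modules on SrfAA, three on SrfAB and one on SrfAC (the text states this explicitly only for SrfAA), and it labels C-domains whose subtype is not given in the text simply as "C".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    number: int      # position in the assembly line (M1..M7)
    protein: str     # NRPS subunit carrying the module (assumed 3/3/1 split)
    domains: tuple   # domain order from N- to C-terminus
    substrate: str   # amino acid activated by the A-domain

    @property
    def residue(self) -> str:
        # An E-domain converts the loaded L-amino acid into its D-form.
        config = "D" if "E" in self.domains else "L"
        return f"{config}-{self.substrate}{self.number}"


SURFACTIN_LINE = (
    Module(1, "SrfAA", ("CS", "A", "T"), "Glu"),        # lipoinitiation module
    Module(2, "SrfAA", ("C", "A", "T"), "Leu"),
    Module(3, "SrfAA", ("C", "A", "T", "E"), "Leu"),
    Module(4, "SrfAB", ("DCL", "A", "T"), "Val"),
    Module(5, "SrfAB", ("C", "A", "T"), "Asp"),
    Module(6, "SrfAB", ("C", "A", "T", "E"), "Leu"),
    Module(7, "SrfAC", ("DCL", "A", "T", "TE"), "Leu"),  # TE releases and cyclizes
)


def linear_product(modules) -> str:
    """Read the linear lipopeptide off the module order (co-linearity rule)."""
    return "FA-" + "-".join(m.residue for m in modules)


def e_domains_followed_by_dcl(modules) -> bool:
    """Check that every module carrying an E-domain is followed by a DCL domain."""
    return all(
        nxt.domains[0] == "DCL"
        for cur, nxt in zip(modules, modules[1:])
        if "E" in cur.domains
    )


print(linear_product(SURFACTIN_LINE))
print(e_domains_followed_by_dcl(SURFACTIN_LINE))
```

Reading the residues off the module order reproduces the linear heptapeptide given in the text, and the second check reflects the pairing of E-domains (M3, M6) with downstream DCL domains (M4, M7).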
Besides the structural NRPS genes, the surfactin BGC comprises one built-in and several adjacent accessory genes encoding, e.g., transporters and regulatory proteins (MIBiG Accession No: BGC0000433). Among these, we would like to highlight the genes sfp, ycxA, krsE, yerP and comS, which are particularly relevant to the production yield of surfactin.
Sfp is a phosphopantetheinyl transferase (PPTase) whose gene is located ∼4 kb downstream of the srf BGC. Upon its expression, the T-domain of an NRPS is not directly active but exists in its non-functional apo-form. For full functionality, the flexible 4′-Ppant arm needs to be attached to the T-domain. This process is mediated by the PPTase Sfp, which thereby converts all T-domains encoded by the surfactin BGC into their active holo-form (Quadri et al., 1998; Mootz et al., 2001). This makes Sfp indispensable for the production of surfactin (Tsuge et al., 1999). For example, in the reference strain Bacillus subtilis 168, the sfp locus is truncated and therefore non-functional, which in turn abolishes surfactin production. However, production can be restored by the transfer of a complete sfp locus (Nakano et al., 1988, 1992). Further important genes in the context of surfactin production are those encoding transporters that act as efflux pumps. From a physiological point of view, the pumps prevent intracellular surfactin accumulation and constitute an essential self-resistance mechanism (Tsuge et al., 2001), in particular because surfactin inserts into biomembranes and, at higher concentrations, causes membrane disruption. An ecological rationale for the transporters could be that they place surfactin extracellularly, at the site where it can exert its beneficial activity. So far, three transporters involved in surfactin efflux have been identified in Bacilli, i.e. YcxA, KrsE, and YerP. It has been demonstrated that the separate overexpression of the corresponding genes enhanced the release rates of surfactin by 89, 52, and 145%, respectively.
Finally, the surfactin BGC exhibits a peculiarity at the genetic level: it bears a co-encoded regulatory gene, termed comS, within itself (D'Souza et al., 1994). comS is located in the open reading frame of the NRPS gene srfAB (Hamoen et al., 1995), more precisely within the A-domain of module 4 (Figure 1). ComS is, on the one hand, involved in the positive regulation of the genetic competence of the cell (Liu and Zuber, 1998) and, on the other hand, part of the quorum sensing system comQXPA (Ansaldi et al., 2002; Schneider et al., 2002; Auchtung et al., 2006), which in turn regulates surfactin production. Beyond this brief explanation, the reader is referred to the excellent overview of the role of ComS by Stiegelmeyer and Giddings (2013). Since the production yield is coupled to the presence and functionality of ComS in the coding region of srfAB, genetic engineering of the surfactin synthetase in this region requires special attention.

Fatty Acid Chain Synthesis
Since fatty acid biosynthesis plays a critical role in surfactin production, and strongly determines its activity and properties, in this section we briefly summarize this central metabolic pathway and the subsequent steps leading to the modification and activation of the fatty acyl-CoA precursor.
All organisms employ a conserved set of chemical reactions to achieve de novo fatty acid (FA) biosynthesis, which proceeds by the sequential extension of the growing carbon chain, two carbons at a time, through a series of decarboxylative condensation reactions (Wakil et al., 1983) (Figure 2). This biosynthetic route proceeds in two stages: initiation and iterative cyclic elongation. The acetyl-CoA carboxylase enzyme complex (ACC) performs the first committed step in bacterial FA synthesis, generating malonyl-CoA through the carboxylation of acetyl-CoA (Marini et al., 1995; Tong, 2013). The malonate group from malonyl-CoA is transferred to the acyl carrier protein (ACP) by a malonyl-CoA:ACP transacylase (FabD) (Serre et al., 1994, 1995; Morbidoni et al., 1996). The first reaction in the synthesis of the nascent carbon chain is the condensation of malonyl-ACP with a short-chain acyl-CoA (C2-C5), catalyzed by a 3-ketoacyl-ACP synthase III (FabH). Acetyl-CoA is used as a substrate for the synthesis of straight-chain FA, while branched-chain fatty acids (BCFA) arise from the isobutyryl-CoA, isovaleryl-CoA and methylbutyryl-CoA priming substrates. These precursors derive from the catabolism of the branched-chain amino acids valine, leucine and isoleucine, respectively. The crucial branched-chain α-keto acid decarboxylase (BKD) complex catalyzes the decarboxylation of α-keto acids to generate the corresponding branched-chain acyl-CoA primers (Willecke and Pardee, 1971; Kaneda, 1991; Lu et al., 2004). The substrate specificity of FabH plays a determining role in the branched/straight and even/odd characteristics of the fatty acid produced. B. subtilis possesses two FabH isoenzymes, FabHA and FabHB, both of which preferentially utilize branched-chain acyl-CoA primers (Choi et al., 2000).
Therefore, BCFA are the main components of phospholipids, where iso-C15:0, anteiso-C15:0, iso-C16:0, iso-C17:0, and anteiso-C17:0 represent the major FA found in Bacillus species (Kaneda, 1969;Kämpfer, 1994). The pattern of the BCFA can be modified by environmental conditions such as temperature (Graumann and Marahiel, 1999).
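How the primer determines the branching and the odd or even length of the final fatty acid can be sketched in a few lines of Python. The carbon counts of the primer acyl groups (2, 4, 5 and 5) and the iso/anteiso assignment are standard biochemistry rather than values stated in the text, and the function name is hypothetical. The sketch assumes that every condensation with malonyl-ACP adds exactly two carbons.

```python
# primer: (acyl carbons, resulting chain type, amino acid the primer derives from)
PRIMERS = {
    "acetyl-CoA": (2, "n", None),
    "isobutyryl-CoA": (4, "iso", "valine"),
    "isovaleryl-CoA": (5, "iso", "leucine"),
    "methylbutyryl-CoA": (5, "anteiso", "isoleucine"),
}


def primer_for(chain_type, carbons):
    """Return (primer, number of two-carbon extensions), or None if impossible."""
    for primer, (start, kind, _source) in PRIMERS.items():
        extra = carbons - start
        # The chain type must match and the remainder must be built from C2 units.
        if kind == chain_type and extra > 0 and extra % 2 == 0:
            return primer, extra // 2
    return None


# Major fatty acids reported for Bacillus species in the text.
for fatty_acid in (("iso", 15), ("anteiso", 15), ("iso", 16), ("iso", 17), ("anteiso", 17)):
    print(fatty_acid, primer_for(*fatty_acid))
```

Under these assumptions, iso-C15:0 and iso-C17:0 trace back to the leucine-derived primer, iso-C16:0 to the valine-derived primer, and the anteiso species to the isoleucine-derived primer. An even-numbered anteiso chain has no matching primer in this table.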
Next, the keto-acyl-ACP product of FabH condensation enters the elongation/reducing cycle of the fatty acid synthase II (FAS-II). There, the keto group is reduced by the NADPH-dependent β-ketoacyl-ACP reductase (FabG) to give β-hydroxy-acyl-ACP. The β-hydroxyacyl-ACP intermediate is then dehydrated to trans-2-enoyl-ACP by a 3-hydroxyacyl-ACP dehydratase (FabZ). Then, the cycle is completed by an enoyl-ACP reductase, which reduces the double bond in trans-2-enoyl-ACP to form acyl-ACP (Fujita et al., 2007). B. subtilis possesses two enoyl-ACP reductases (FabI and FabL) with opposite preferences for the NADPH or NADH cofactor.
FIGURE 2 | Biochemical steps for the formation of fatty acids and their channeling to surfactin biosynthesis. The first step of fatty acid synthesis involves the production of malonyl-CoA by the acetyl-CoA carboxylase complex (ACC). The malonyl-CoA:ACP transacylase, FabD, transfers the malonyl groups to the acyl carrier protein (ACP) to produce malonyl-ACP. FabH condenses malonyl-ACP with a priming acyl-CoA substrate to produce the first new C-C bond. The keto group of the β-ketoacyl-ACP is completely reduced by the reducing enzymes of the cycle, FabG, FabZ and FabI, and then the condensing enzyme FabF initiates a new round of elongation of the growing carbon chain using malonyl-ACP. The acyl-ACP product is primarily channeled to phospholipid (PL) biosynthesis or, alternatively, to surfactin biosynthesis. For the latter, at least two additional biochemical steps are required: hydroxylation of a free FA by YbdT and its activation by an ACS.
In all the successive steps of FA elongation, the acyl-ACP intermediate and malonyl-ACP are the substrates of the FabF condensing enzyme (3-oxoacyl-ACP synthase II), which elongates the growing acyl chain and initiates each new round of the cycle (Schujman et al., 2001). Finally, the acyl-ACPs of the proper chain length are substrates of acyltransferases involved in cell membrane phospholipid synthesis. Alternatively, some structurally specific FA are not integrated into the cell membrane phospholipids. Under specific environmental or growth conditions, these modified FA can be channeled into secondary metabolic pathways, where they serve as building blocks of specialized molecules, as is the case for lipopeptides.
Once the long-chain FA is synthesized, the next steps needed for surfactin biosynthesis involve the production of the 3-hydroxy-acyl-coenzyme A (CoA) substrates. Based on in vitro assays, Youssef et al. suggested that acyl 3-hydroxylation occurs prior to CoA ligation (Youssef et al., 2011). These authors reported that YbdT, a cytochrome P450 enzyme, catalyzes the hydroxylation of the FA precursors to be incorporated into the lipopeptide biosynthetic pathway (Youssef et al., 2011). Cytochromes P450 are monooxygenases capable of introducing an oxygen atom into FA and into other lipidic and non-lipidic molecules. The B. subtilis genome contains eight genes coding for cytochrome P450 enzymes (Hlavica and Lehnerer, 2010). In vitro, high-performance liquid chromatography (HPLC) and gas chromatography-mass spectrometry analyses demonstrated that the recombinant ybdT gene product hydroxylates myristic acid in the presence of H2O2 to produce β-hydroxymyristic acid and α-hydroxymyristic acid (Matsunaga et al., 1999). Furthermore, a ybdT mutant strain of B. subtilis OKB105 produces biosurfactants of which only 2.2% contain 3-hydroxylated C14, while the remaining 97.8% contain non-hydroxylated FA with chain lengths of C12 and C14-C18 and are thus linear (Youssef et al., 2011).
Finally, the surfactin synthetase assembly line can be initiated in the presence of a CoA-activated FA (Steller et al., 2004). Fatty acids are converted into their corresponding acyl-CoA derivatives by fatty acyl-CoA ligases (FACS). Of the four putative FACS identified by homology searches in the genome of B. subtilis, two, LcfA and YhfL, were characterized in vitro as being involved in surfactin production. HPLC-MS-based FACS activity assays indicated that LcfA and YhfL catalyze thioester formation between CoA and various FA substrates (3-OH C8, 3-OH C10, C12, and C14). Each of the four single mutants in the FACS homolog genes lcfA, yhfL, yhfT and yngI showed surfactin production decreased by 38-55% compared with wild-type levels. Interestingly, a quadruple FACS mutant did not completely abolish surfactin biosynthesis: this strain still reached 16% of the surfactin production of the wild-type strain. This observation suggests that other, non-canonical FACS are present in B. subtilis or that other pathways, such as transthiolation from ACPs to CoA, could be involved in providing the fatty acyl moiety.
The hydroxylated and CoA-activated FA derivative is finally transferred onto the surfactin synthetase assembly line in a reaction performed by the N-terminal condensation (CS) domain which, as mentioned above, is responsible for the lipoinitiation mechanism. In vitro, the recombinant dissected C domain catalyzed the acylation reaction using a glutamate-loaded PCP domain and 3-OH-C14-CoA as substrates (Kraas et al., 2010).
Frontiers in Bioengineering and Biotechnology | www.frontiersin.org

VARIANTS OF SURFACTIN
The surfactin biosynthesis mechanism described above is responsible for the high biodiversity of surfactin-like molecules. In addition, the assembly line machinery of surfactin synthetases can be readily modified by synthetic biology in order to increase this biodiversity. Both aspects are developed in the following section.

Natural Variants
Three main peptide backbones, and the NRPSs responsible for their biosynthesis, produced by different Bacillus species have so far been described in the literature: surfactin, as previously described, from B. subtilis, B. amyloliquefaciens, B. velezensis, and B. spizizenii, amongst others; pumilacidin from B. pumilus (Naruse et al., 1990); and lichenysin from B. licheniformis (Horowitz et al., 1990). Compared to surfactin, pumilacidin has a leucine in position 4 instead of a valine, as well as an isoleucine or a valine in position 7 instead of a leucine. Lichenysin differs from surfactin in the first amino acid residue: a glutamine (Gln) instead of a glutamic acid (Figure 3).
This first level of biosynthetic diversity in surfactin is increased by the promiscuous specificity of the adenylation domains of modules 2, 4, and 7 of the surfactin synthetases, which are able to accept L-Leu, L-Val or L-Ile residues, as well as L-Ala for module 4. Similarly low levels of specificity have been observed for lichenysin (Peypoux et al., 1991; Bonmatin et al., 2003).
Based on all these results, it appears that the aspartic acid in position 5, as well as the D-leucine in positions 3 and 6, are present in all members of the surfactin family. The only report of an asparagine (Asn) for lichenysin (Yakimov et al., 1995) was quickly refuted by the same author after the use of fast atom bombardment mass spectrometry (Yakimov et al., 1999). The specificity of M3 and M6 could result (i) from an enzyme of the assembly line machinery, such as the epimerization domain, which could accept only leucine as a substrate, (ii) from the specificity of the adenylation domain or (iii) from the specificity of the condensation domains involved.
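As a rough illustration of how quickly these substitutions multiply, the following Python sketch enumerates the peptide backbones allowed by the rules above. It is a back-of-the-envelope count, not a catalogue of observed molecules: it assumes that the substitutions at positions 2, 4 and 7 combine freely, and it treats Glu1/Gln1 as a free choice even though the two residues belong to different synthetases (surfactin and lichenysin). The names `ALLOWED` and `backbones` are hypothetical.

```python
from itertools import product

# Residues accepted at each position according to the text;
# positions 3, 5 and 6 are invariant across the family.
ALLOWED = {
    1: ("Glu", "Gln"),
    2: ("Leu", "Val", "Ile"),
    3: ("Leu",),
    4: ("Val", "Leu", "Ile", "Ala"),
    5: ("Asp",),
    6: ("Leu",),
    7: ("Leu", "Val", "Ile"),
}


def backbones(allowed=ALLOWED):
    """Enumerate every combination of allowed residues, one per position."""
    positions = sorted(allowed)
    return [
        "-".join(f"{aa}{pos}" for pos, aa in zip(positions, combo))
        for combo in product(*(allowed[p] for p in positions))
    ]


ALL_BACKBONES = backbones()
LIPID_ISOFORMS = 15  # number of fatty acid isoforms quoted in the abstract

print(len(ALL_BACKBONES))                   # 2 * 3 * 4 * 3 = 72
print(len(ALL_BACKBONES) * LIPID_ISOFORMS)  # 72 * 15 = 1080
```

Under these assumptions, 72 backbones combined with fifteen lipid isoforms give 1080 combinations, which is one possible way to arrive at the order of magnitude of "more than one thousand variants" mentioned in the introduction.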
Changes in the peptide chain are not the only source of diversity in the surfactin family. As mentioned before, surfactin is a heptapeptide linked to a fatty acid chain. The length of this chain can vary from 12 to 17 carbon atoms, with C14 and C15 being the most common.
Another variation in this lipid chain is its isomerism: it can have a linear (n) configuration, but it can also be branched (iso and anteiso). Anteiso branching occurs only in chains with an odd number of carbons, while iso branching can be found in all chain lengths (odd and even-numbered). This variety can mainly be explained by the relaxed substrate specificity of the CS-domain present in module M1.
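The two rules above (chain lengths from C12 to C17, anteiso branching only for odd-numbered chains) can be turned into a short enumeration. The sketch below is only an illustration with hypothetical names, and it assumes that every combination permitted by these rules actually occurs.

```python
CHAIN_LENGTHS = range(12, 18)  # C12 to C17 inclusive


def lipid_isoforms():
    """List the fatty acid isoforms permitted by the length and branching rules."""
    isoforms = []
    for n in CHAIN_LENGTHS:
        isoforms.append(f"n-C{n}")    # linear chain, any length
        isoforms.append(f"iso-C{n}")  # iso branching, any length
        if n % 2 == 1:                # anteiso branching, odd lengths only
            isoforms.append(f"anteiso-C{n}")
    return isoforms


ISOFORMS = lipid_isoforms()
print(len(ISOFORMS))
print(ISOFORMS)
```

The enumeration yields six linear, six iso and three anteiso chains, which matches the fifteen isoforms quoted in the abstract.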
Finally, natural linear surfactins (Figure 3) have also been identified in the culture supernatant of Bacillus strains (Gao et al., 2017). The molecular mechanism responsible for this linearization is not yet known. It could result from incomplete efficiency of the TE domain, which could release some surfactin without cyclization, or from enzymatic or chemical degradation of cyclic surfactin.
In addition, heterologous enzymes are also able to catalyze linearization. An in vitro study showed the linearizing effect of a purified V8 endoprotease from Staphylococcus aureus (Grangemard et al., 1999). Furthermore, an in vivo study demonstrated that Streptomyces sp. Mg1 produces, as a mechanism of resistance, an enzyme that hydrolyses surfactin into its linear form (Hoefler et al., 2012).
Surfactin methyl ester was observed in the supernatant of Bacillus subtilis HSO121 (Liu et al., 2009), and a methylated surfactin with a valine in position 7 was discovered in the supernatant of a mangrove Bacillus strain (Tang et al., 2007). This modification was also found in the supernatant of Bacillus licheniformis HSN221, as surfactin and lichenysin methyl esters (Li et al., 2010), and in the culture medium of Bacillus pumilus, as surfactin methyl ester (Zhuravleva et al., 2010).

Synthetic and Biosynthetic Variants
In addition to the natural surfactins described above, synthetic variants can be obtained through chemical modifications or genetic engineering of the NRPS. This leads to new forms or to the controlled production of a specific form. There are many reasons for making structural changes: foremost to reduce the toxicity of surfactin, but also to optimize its biological activities or to increase its water solubility.
For example, the reaction of surfactin with n-hexyl alcohol leads to mono- and di-hexyl-surfactin, and with 2-methoxyethanol to mono- and di-2-methoxy-ethyl-surfactin (Shao et al., 2015). Amidation through a reaction with alcohol and then NH4Cl was also observed (Morikawa et al., 2000). Esterification and amidation of the aspartic and glutamic acid residues eliminate their negative charge. This change in charge modifies the biological and surfactant properties of surfactin and thus creates even greater diversity in the surfactin family.
Linearization of the cyclic surfactin previously mentioned as a natural process can also be obtained by chemical alkaline treatment (Figure 3) (Eeman et al., 2006).
In addition to these chemical modifications of naturally produced surfactin, synthetic forms can be produced chemically (Figure 3). Liquid-phase techniques were used at first (Nagai et al., 1996) but, because of the many steps and the purification of intermediates required, they were replaced with a quicker solid-phase peptide synthesis (SPPS) technique. Different forms of surfactin have been produced, such as standard surfactin, but also analogs with a change in the amino acid sequence, such as an epimerization (D-Leu2), a change in charge (Asn5) and the switch of two residues (Asp4-Leu5) (Pagadoy et al., 2005). Linear surfactin was also produced, as well as a linear form with an amidated carboxy-terminal function (Dufour et al., 2005). Finally, the fatty acid chain length was likewise changed, to C10 and C18 (Francius et al., 2008). However, owing to the complexity of their production, these lipopeptides are intended only for research use. As mentioned before, in addition to chemical changes, genetic engineering can also be applied to the genes coding for the NRPS in order to modify the structure of surfactin. The generation of novel derivatives by rational design can be achieved by site-directed mutagenesis and by module insertion, deletion, and substitution (Alanjary et al., 2019). By site-directed mutagenesis, the A-domain specificity was shifted from L-Glu to L-Gln in module 1 and from L-Asp to L-Asn in module 5.
Concerning module substitutions, the Marahiel group in particular demonstrated, in groundbreaking work from the mid-1990s onwards, the feasibility of module swaps, which allowed single or multiple variations concerning all seven amino acids (Stachelhaus et al., 1995, 1996; Schneider et al., 1998; Eppelmann et al., 2002). From a practical standpoint, besides the gain in basic research knowledge, a decreased hemolytic activity was observed for several modified surfactins, such as Cys7-surfactin. Furthermore, ring-contracted surfactin derivatives were obtained by deletion of complete NRPS modules. The corresponding knockouts yielded hexapeptidic surfactin congeners, individually lacking Leu2, Leu3, Asp5 or Leu6. Notably, the variants lacking Leu2, Leu3 or Leu6 showed reduced toxicity toward erythrocytes and enhanced antibacterial activities, while the variant lacking Asp5 exhibited an even higher inhibitory ability against Gram-positive bacteria but kept the hemolytic capabilities of native surfactin (Jiang et al., 2016). However, each genetic manipulation mentioned above resulted in a significant decrease in the production yield. Nevertheless, these studies showed the feasibility of the approach and, encouragingly, demonstrated that the surfactin scaffold can be fine-tuned with respect to its intended activity and its undesired side effects.
Very recently, the Bode group revolutionized the concept of module swapping. Their work includes the finding that C-domains have to be subdivided into a C Donor (CD) and a C Acceptor (CA) portion and that both are amino-acid specific (Bozhüyük et al., 2019). This redefines the borders of an exchange unit. Instead of a classic A, A-T or C-A-T domain swap, it is preferable to exchange a CD-A-T-CA unit (Figure 4). The huge advantage of these findings is that peptide variants can be generated by genetic engineering at a much higher success rate and without any production loss. The technique will be an incentive to modify highly bioactive structures, such as surfactin. The exchange units can be derived from other Bacilli or codon-optimized from other bacterial genera. Particularly in combination with synthetic biology, numerous genetically engineered modifications can be envisioned in the future: besides the exchange of amino acids, ring contractions by module deletion and ring expansions by addition of an exchange unit can be generated (Figure 4). Since peptides containing D-configured amino acids are less prone to degradation, changing the absolute configuration by insertion of epimerization domains could lead to derivatives that are more resistant to enzymatic degradation. Furthermore, since the biotechnological production of surfactin always results in complex mixtures, e.g., varying in the fatty acid portion, it would be desirable to produce surfactin with a more defined lipid moiety. For this purpose, the biobrick-like exchange of the C Donor portion of the CS-domain could lead to the incorporation of the desired 3-OH fatty acid. Finally, it can also be envisioned to modify the surfactin NRPS assembly line even further, e.g. by introducing catalytic domains that drive intramolecular cyclization, N-methylation, hydroxylation, and redox reactions.

STRUCTURE AND PROPERTIES RELATIONSHIP
Surfactins and surfactin-like molecules are amphiphilic molecules with a polar part mainly constituted by the two negatively charged amino acid residues Glu and Asp (in native surfactin) and an apolar domain formed by the lateral groups of aliphatic amino acid residues (mainly Leu) and the fatty acid chain. This amphiphilic structure is responsible for their attractive physico-chemical properties as well as their various biological activities.

Surfactin Structure and Its Influence on Physico-Chemical Properties and Biological Activities
The amphiphilic structure of surfactins gives them strong surface activity, i.e., the capacity to reduce surface/interfacial tension and to self-assemble into nanostructures, and they carry one or more negative charges. Their physico-chemical properties thus include foaming (Razafindralambo et al., 1998; Fei et al., 2020), emulsifying (Deleu et al., 1999; Liu et al., 2015; Long et al., 2017; Fei et al., 2020) and dispersing properties, solid surface wetting and surface hydrophobicity modification (Ahimou et al., 2000; Shakerifard et al., 2009; Marcelino et al., 2019; Fei et al., 2020), and chelating ability (Mulligan et al., 1999; Grangemard et al., 2001; Eivazihollagh et al., 2019). This strong surface activity leads to detergent applications (Zezzi do Valle Gomes and Nitschke, 2012), but surfactins also show promise for applications in the environmental sector: to enhance oil recovery in oil-producing wells (Joshi et al., 2016; Long et al., 2017; de Araujo et al., 2019; Alvarez et al., 2020; Miyazaki et al., 2020), to increase the biodegradation rate of linear and aromatic hydrocarbons, and to remove metals from soil or aqueous solutions (Zouboulis et al., 2003; Eivazihollagh et al., 2019). Very recently, it was also suggested that surfactin can effectively demulsify waste crude oil. Their emulsifying property also gives them potential applications in product formulation in the food and cosmetics areas (Mnif et al., 2013; Varvaresou and Iakovou, 2015; Zouari et al., 2016), as well as in the pharmaceutical area for the formulation of stable microemulsion drug delivery systems (Ohadi et al., 2020).
Variations in the molecular structure of the peptidic part and/or of the hydrocarbon chain greatly affect the physico-chemical properties of surfactins. In terms of self-aggregation behavior, the critical micellar concentration (CMC) decreases with a longer fatty acid chain (CMC surfactin C15 = 20 µM; CMC surfactin C14 = 65 µM; CMC surfactin C13 = 84 µM in Tris-HCl pH 8) (Deleu et al., 2003; Liu et al., 2015). It also decreases with the presence of a methyl ester on the Glu residue (Grangemard et al., 2001) or the replacement of the Glu residue by a Gln, as in lichenysin (Grangemard et al., 2001; Bonmatin et al., 2003). On the contrary, linearization of the peptide cycle (CMC linear surfactin C14 = 374 µM in Tris pH 8.5) (Dufour et al., 2005) and the presence of Leu4 instead of Val4, as in pumilacidin (de Araujo et al., 2019), increase it. Different self-assembled nanostructures such as sphere-like micelles, wormlike micelles and unilamellar bilayers coexist with larger aggregates in aqueous solution, depending on the surfactin concentration, pH, temperature, ionic strength and metal ions (Zou et al., 2010; Taira et al., 2017; Jahan et al., 2020). These parameters can induce conformational changes in the secondary structure of the cyclic peptide moiety and thereby affect the shape and the packing parameter of surfactin (Jahan et al., 2020).
The capacity to reduce surface tension is also influenced by the molecular structure of surfactin. Depending on the environmental conditions, lichenysin may or may not be more efficient than surfactin at reducing the surface tension (in Tris pH 9.4, γcmc = 35 and 37 mN/m for lichenysin and surfactin, respectively, and in NaHCO3 pH 9.4, γcmc = 30 and 29 mN/m for lichenysin and surfactin, respectively) (Grangemard et al., 2001), while pumilacidin is less efficient (de Araujo et al., 2019). Linearization of the peptide cycle lessens this capacity (34 mN/m in Tris pH 8.5). The replacement of the carboxyl group by a sulfo methylene amido group leads to a complete loss of activity (Bonmatin et al., 2003). The chain length and the branching type also affect the surface tension. A longer chain is more efficient, and the normal configuration is more active than the iso one, which is in turn more powerful than the anteiso one (Yakimov et al., 1996).
The effect of the chain length on the foaming properties does not follow this trend as it was shown that a lipidic chain with 14 carbon atoms provides surfactin with best foaming properties compared to that with 13 or 15 carbon atoms (Razafindralambo et al., 1998).
Lichenysin was also demonstrated to be a better divalent cation chelating agent than surfactin (Grangemard et al., 2001). This effect is assigned to an increase accessibility of the carboxyl group to the cation in the case of lichenysin (Habe et al., 2018). The complexation of divalent cations with the lipopeptide in a molar ratio of 2:1 for lichenysin leads to the formation of an intermolecular salt bridge, stronger than the intramolecular complexation in a 1:1 ratio with surfactin (Grangemard et al., 2001;Habe et al., 2018).
Globally speaking, the few studies focused on the structureproperties relationships of surfactin family emphasize three main facts. The first is that the unique feature of the peptide loop provides surfactin with a fascinating molecular behavior at interfaces . Furthermore, the peptide cycle linearization leads to a structural distortion of the molecule reducing or annihilating its surface active power. The second fact is that the surface activity of surfactin is dictated by the interplay of hydrocarbon chain and peptide sequence . The more distant and distinct the polar and apolar domains are, the stronger the surface active power is. The last fact is that the charges of the polar part also play a primordial role in the physico-chemical properties. A monoanionic surfactin is more efficient than a dianionic one, due to a reduced repulsive effect between the molecules at the interface.
The remarkable physico-chemical properties of surfactin are also responsible for their biological activities which, in most of the cases, involve perturbation or disruption of membrane integrity. It was demonstrated for haemolytic (Kracht et al., 1999;Dufour et al., 2005), antibacterial (Bernheimer and Avigad, 1970), antiviral (Yuan et al., 2018;Johnson et al., 2019), and antimycoplasma (Vollenbroich et al., 1997) activities of surfactin as well as its ability to inducing systemic resistance in plant (Ongena et al., 2007;Ongena and Jacques, 2008). Some of those activities leading to promising results in the agricultural field (Chandler et al., 2015;Loiseau et al., 2015). But surfactin was also characterized for anti-inflammation (Takahashi et al., 2006;Zhao et al., 2017), anti-sepsis (Hwang et al., 2007), anti-tumor  and immunomodulatory (Park and Kim, 2009) activities for which another target than membranes is involved. A synergistic effect has been observed between surfactin and other lipopeptides. The addition of surfactin at an inactive concentration to iturin increase its haemolytic activity (Maget-Dana et al., 1992). The combination of surfactin and fengycin lead to a decrease in disease in tomato and bean plants (Ongena et al., 2007). Furthermore, while surfactin has no effect against fungi, it has been shown to enhance the biological activities of other lipopeptides against fungi and oomycetes (Deravel et al., 2014;Tanaka et al., 2015;Desmyttere et al., 2019).

Use of Molecular Modeling for Mechanism of Action Investigation
Molecular modeling methods are powerful theoretical tools to investigate the structure-function relationships of surfactin and its mode of action. Docking and Molecular Dynamics (MD) simulations have been used in various studies involving surfactin for the characterization of diverse properties, in order to predict activities and domains of application.
Further investigations have shown that surfactin binds protofibrils by forming a stable hydrogen bond with residues involved in salt bridges responsible for amyloid aggregation and plaque stability (Verma et al., 2016). Another docking investigation, employing SwissDock (Grosdidier et al., 2011), has shown that surfactin binds favorably via hydrogen bonds to porcine pancreatic lipase and inhibits its activity, which could lead to a novel and potent body weight reducer for obesity control (Meena et al., 2018).
Besides these investigations on monomeric surfactin interacting with potential targets, MD simulations proved to be an efficient tool to study molecular assemblies. A surfactin monolayer at the air-water interface was studied under various interfacial concentrations. It was shown that packed structures are formed via intra- and inter-molecular hydrogen bonds, stabilizing the β-turn structure of the peptide ring and favoring the β-sheet domain organization and hydrophobic contacts between molecules. Another simulation was applied to study the self-assembly of surfactin in water and more particularly the structural organization of the micelles (Lebecque et al., 2017). Micelles were pre-formed with PackMol (Martinez et al., 2009) and were simulated to analyse their behavior. The optimal aggregation number predicted by this approach, i.e., 20, is in good agreement with the experimental values. Two parameters were analyzed, the hydrophilic (phi)/hydrophobic (pho) surface ratio and the hydration of the hydrophobic tail (Lebecque et al., 2017). A higher phi/pho surface ratio means a more thermodynamically favorable organization of the hydrophilic and hydrophobic domains, but steric and/or electrical repulsions between polar heads also have to be considered. For surfactin, it was shown that the phi/pho surface ratio decreases for the largest micelles because they have to rearrange themselves to reach a more favorable organization. The low hydration of the apolar moieties observed for surfactin micelles is due to the very large peptidic head, which efficiently shields the hydrophobic tails from contact with water. The coarse-grained (CG) representation MARTINI (Marrink et al., 2007), which groups atoms into beads to speed up the simulation process, was similarly applied to analyse the structural properties and kinetics of surfactin self-assembly in aqueous solution and at the octane/water interface.
With complementary MD of a pre-formed micelle and a monolayer, the authors showed that their CG model is in agreement with atomistic MD and experimental data, for micelle self-assembly and stability, as well as for the monolayer. Furthermore, this study allows the development of a set of optimized parameters in a MARTINI CG model that could open further investigations for surfactin interaction with various biofilms, proteins or other targets of interest with a better sampling than atomistic MD.
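The idea of grouping atoms into beads can be sketched in a few lines. The example below is a deliberately simplified illustration, not the mapping used in the cited study: real MARTINI mappings are chosen from the chemistry of each fragment (roughly four heavy atoms per bead), whereas this sketch simply groups consecutive atoms and places each bead at the mass-weighted center of its group. The coordinates and the function name are hypothetical.

```python
# Each atom is (mass in g/mol, x, y, z). Eight carbon atoms placed on a
# line, one arbitrary length unit apart; the coordinates are made up.
ATOMS = [(12.011, float(i), 0.0, 0.0) for i in range(8)]


def coarse_grain(atoms, atoms_per_bead=4):
    """Group consecutive atoms into beads located at their center of mass."""
    beads = []
    for start in range(0, len(atoms), atoms_per_bead):
        group = atoms[start:start + atoms_per_bead]
        total_mass = sum(atom[0] for atom in group)
        center = tuple(
            sum(atom[0] * atom[axis] for atom in group) / total_mass
            for axis in (1, 2, 3)
        )
        beads.append((total_mass, center))
    return beads


beads = coarse_grain(ATOMS)
for mass, center in beads:
    print(f"bead mass = {mass:.3f}, center = {center}")
```

Eight atoms become two interaction sites, which is where the speed-up comes from: fewer particles and fewer pairwise interactions have to be evaluated at each step.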

PRODUCTION
The last part of this review is dedicated to the improvement of the production of surfactin-like compounds. It first considers the techniques for the identification and the quantification of these lipopeptides and then focuses on strain, culture conditions, and bioprocess optimization. Finally, it addresses the purification process, which allows for a greater recovery of the surfactin produced and lowers the losses.

Identification and Quantification of Surfactin and Its Variants
Identification is an important step in discovering new natural variants or verifying the production of synthetic ones. The first surfactin structure elucidation was made through hydrolysis of the peptide and fatty acid chain into fragments, followed by their identification and alignment (Kakinuma et al., 1969b). However, with the continuous innovations of analytical-chemical techniques such as mass spectrometry MS/MS (Yang et al., 2015a), nuclear magnetic resonance (NMR) (Kowall et al., 1998) and Fourier transform IR spectroscopy (FT-IR) (Fenibo et al., 2019), new variants can be analyzed more quickly and without hydrolysis. While FT-IR provides the functional groups, NMR leads to a complete structural characterization of the compounds but requires completely purified products in mg quantities. Mass spectrometry does not enable the differentiation of compounds having the same mass (such as leucine and isoleucine), nor of the type of fatty acid chain (linear, iso or anteiso), but provides the global mass and the primary sequence of the peptide moiety.
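The limitation of mass spectrometry for isobaric residues can be checked with a short calculation of monoisotopic residue masses. The sketch below uses standard monoisotopic atomic masses and the elemental compositions of the Leu, Ile and Val residues; the helper name is hypothetical. It also illustrates a point that is an inference rather than a statement from the cited works: a Leu/Val exchange shifts the mass by one CH2 unit, the same shift as one methylene more or less in the fatty acid chain, so the global mass alone can be ambiguous and the fragment-based sequence is needed.

```python
# Monoisotopic atomic masses in unified atomic mass units (u).
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of amino acid residues (amino acid minus H2O).
RESIDUES = {
    "Leu": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Ile": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Val": {"C": 5, "H": 9, "N": 1, "O": 1},
}
CH2 = {"C": 1, "H": 2}


def mono_mass(formula):
    """Monoisotopic mass of an elemental composition (hypothetical helper)."""
    return sum(MASS[element] * count for element, count in formula.items())


leu, ile, val = (mono_mass(RESIDUES[name]) for name in ("Leu", "Ile", "Val"))
print(f"Leu - Ile = {leu - ile:.5f} u")   # identical composition
print(f"Leu - Val = {leu - val:.5f} u")   # one CH2 unit
print(f"CH2       = {mono_mass(CH2):.5f} u")
```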
An overview of surfactin quantification techniques can be found in Table 1. The first ones rely on surfactin's amphiphilic nature, so that its production can be detected through its surfactant activity.
Indirect methods, such as emulsification measurement, haemolytic activity (blood agar plate) or cell surface hydrophobicity can be used. However, the correlation between those activities and surfactant activity has been refuted, and Youssef et al. (2004) do not recommend the use of blood agar lysis as a screening method. Therefore, direct methods to measure the surface activity, such as interfacial tension measurement, drop shape analysis, drop collapse assay or oil spreading should be used (Youssef et al., 2004). Newer techniques based on color shifts or fluorescence have been developed in the last few years for rapid detection and quantification. The first color shift approach is based on the higher affinity of a mediator, initially forming a complex with a color indicator, for surfactin, and thus on the release of the color indicator into the solution (Yang et al., 2015b). The fluorescence technique is based on the same principle, but with fluorescein instead of a color indicator (Heuson et al., 2018). This leads to a more sensitive and stable procedure. Another color shift approach has been developed based only on the interaction between bromothymol blue solution and lipopeptides (Ong and Wu, 2018). However, since these assays are not specific for surfactin, the best and most sensitive quantification method is still reversed phase HPLC with UV or MS detection (Geissler et al., 2017). This method also allows the discrimination between the various homologs of the surfactin family. Indeed, the molecules are separated based on their hydrophobic properties, giving a shorter retention time for lipopeptides with a leucine in position 7 and a longer retention time for lipopeptides with a valine in position 7. The separation is also based on the fatty acid chain: the shorter the fatty acid chain, the shorter the elution time (Dhali, 2016).
Furthermore, the production capacity of a micro-organism can be discovered through PCR, with primers specific to the surfactin biosynthesis genes (sfp and srf) (Mohammadipour et al., 2009), or through genome sequencing. However, these methods do not reflect the real lipopeptide production, since only the presence of the genes is observed. RT-PCR allows the detection of the transcribed genes, but does not reflect post-transcriptional modifications.

Optimisation of Surfactin Production
In order to enhance surfactin production, the genetic engineering of the producing strains is of great significance, in addition to fermentation optimization. This topic was already covered in the past by other teams (Hu et al., 2019) and is developed further here.
A first strategy would be to allocate more resources of the cell to surfactin biosynthesis by suppressing different cellular processes. This was successful with the disruption of the plipastatin operon (Coutte et al., 2010a) or of biofilm formation-related genes (Wu et al., 2019). However, a strain with a 10% genome deletion, comprising genes for plipastatin, bacilysin, toxins, prophages and sporulation, had a lower surfactin production (Geissler et al., 2019). Concerning surfactin production itself, the strategy can take place at different stages of the process in the cell: at the transcription level by promoter substitution or modification of the transcriptional regulatory genes of the srfA operon, at the level of surfactin synthesis by increasing the precursor availability, during the molecule's excretion and finally during its degradation (Figure 5).

Transcription
As seen before, surfactin NRPS is coded by four genes, srfA-A, srfA-B, srfA-C, and srfA-D, that are controlled by the autoinducible promoter Psrf, triggered by signal molecules from a quorum sensing pathway. Studies were performed to exchange this promoter with inducer-specific or constitutive ones. It emerged that a replacement with a constitutive promoter in a weak surfactin producer strain leads to an increase in the production, but that the opposite effect is observed for strong surfactin producers (Willenbacher et al., 2016). However, the use of novel artificial inducible promoters leads to a more than 17-fold increase in surfactin production (Jiao et al., 2017).
In addition to the promoter, transcriptional regulatory genes also control the expression of the NRPS genes. The cell density-dependent quorum sensing system plays a regulatory role in many pathways in Bacillus, among others in the regulation of the srfA operon. Ohsawa et al. (2006) showed that the inhibition of the ComQXP quorum sensing locus leads to a decrease in the expression of srfA genes, and Jung et al. (2012) showed that the overexpression of ComX and PhrC increases the production of surfactin.
In addition to the quorum sensing system itself, regulators also impact the srfA operon, the quorum sensing system or even other mechanisms that indirectly impact surfactin. There are positive regulators such as PerR (Hayashi et al., 2005) and negative regulators such as CodY, Rap (Hayashi et al., 2006), SinI (López et al., 2009) and Spx (Zhang et al., 2006).

Increasing Precursor Supply of NRPS by Feeding or Metabolic Engineering
Modifying media and fermentation conditions is a strategy to overproduce the lipopeptide precursors as well as to favor the production of certain isoforms. For example, feeding leucine as 50% of the nitrogen source leads to a threefold increase in specific surfactin production. Another strategy is the application of rational metabolic engineering approaches such as: (i) blocking competitive pathways for building blocks, as well as those pathways that consume products; (ii) pulling flux through biosynthetic pathways by removing regulatory signals; and (iii) overexpressing rate-limiting enzymes.

Amino Acids Precursors
One way to develop this metabolic engineering approach is to knock out genes that negatively influence the intracellular pool of amino acid precursors. To select such knockouts, the metabolic pathways of these precursors have to be modeled as a reaction network taking into account the regulation processes.
Firstly, the various pathways involved in the metabolites needed for the amino acid production should be addressed. In the search for compounds from glycolysis that influence the amino acid production, pyruvate is interesting from multiple points of view. It is the entry point of the Krebs cycle through its conversion into acetyl-CoA, but it is also used as a substrate for the production of amino acids that compose the surfactin. Indeed, pyruvate is converted into valine and leucine. Furthermore, isoleucine is produced from threonine and pyruvate. The Krebs cycle also contributes to the amino acid production, since oxoglutarate and oxaloacetate belong to the metabolism of glutamic and aspartic acid, respectively. Secondly, the various enzymes that regulate metabolite production should be addressed. The search can also go a level above, with the regulators and promoters of those enzymes, such as the pleiotropic regulators CodY or TnrA (Dhali, 2016). Lastly, the transporters of the amino acid precursors can be addressed. Indeed, the amino acids can be transported into the cell from the environment. Wang et al. (2019) showed that the knockout of murC, yrpC and racE, negative regulators involved in the metabolism of glutamate, leads to an increase in surfactin production. The choice of those knockouts can also be directed by methods from computational biology, to narrow them down and reduce the laboratory time needed.
Some prediction methods rely on formal reasoning techniques based on abstract interpretation (Niehren et al., 2016). This is a general framework for abstracting formal models that is widely used in the static analysis of programming languages. Here, the formal models are reaction networks with partial kinetic information, whose steady-state semantics define systems of linear equations with kinetic constraints, which are then abstracted. The methods had to be developed further so that they could be applied to reaction networks rather than to other kinds of programs. This approach has been used for the branched chain amino acids (leucine, valine, and isoleucine) that mainly compose the surfactin peptide chain.
The quite complex metabolic pathway of leucine production from threonine and pyruvate was modeled by rewriting the informal model from SubtiWiki into this formal modeling language, while adding and adapting some reactions. The approach selected gene knockouts that may lead to leucine overproduction; for some of them, an increase in surfactin production in Bacillus subtilis 168 was observed after experimental verification (Dhali et al., 2017).
Since single gene deletion is successful, multiple gene deletion must be the next aim. To be able to perform various deletions and/or insertions in the same strain, a markerless strategy is required. Various strategies can be used, such as temperature-sensitive plasmids, pORI vectors and auxotrophy-based methods, but also the cre/lox system (Yan et al., 2008), the pop-in pop-out technique (Tanaka et al., 2013) and the CRISPRi technology (Wang et al., 2019).
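The steady-state constraint behind such models can be illustrated with a toy network. The sketch below is not the abstract-interpretation method of the cited work and not the SubtiWiki-derived model: it is a minimal, hypothetical three-reaction network in which a precursor pool P is supplied by one reaction and drained by two, with the strong simplifying assumption that a knockout leaves the supply unchanged. All reaction names, numbers and functions are made up for illustration.

```python
from fractions import Fraction

# Toy network (hypothetical):
#   r_in:   -> P      supply of a precursor pool P
#   r_leu:  P -> Leu  flux towards leucine
#   r_comp: P -> X    competing pathway draining P
# One row per internal metabolite, one coefficient per reaction.
STOICHIOMETRY = {"P": {"r_in": 1, "r_leu": -1, "r_comp": -1}}


def is_steady_state(fluxes):
    """True if production equals consumption for every internal metabolite."""
    return all(
        sum(coef * fluxes[rxn] for rxn, coef in row.items()) == 0
        for row in STOICHIOMETRY.values()
    )


def solve_fluxes(supply, comp_share, knockouts=()):
    """Steady-state fluxes when r_comp takes a fixed share of the supply.

    A knocked-out reaction carries zero flux; the remaining flux follows
    from the linear steady-state equation r_in = r_leu + r_comp.
    """
    r_comp = Fraction(0) if "r_comp" in knockouts else supply * comp_share
    fluxes = {"r_in": supply, "r_comp": r_comp, "r_leu": supply - r_comp}
    assert is_steady_state(fluxes)
    return fluxes


wild_type = solve_fluxes(Fraction(10), Fraction(2, 5))
mutant = solve_fluxes(Fraction(10), Fraction(2, 5), knockouts=("r_comp",))
print("wild type r_leu:", wild_type["r_leu"])  # 6
print("knockout  r_leu:", mutant["r_leu"])     # 10
```

In a realistic network the kinetic parameters are only partially known, which is why qualitative reasoning about the direction of change (increase or decrease) is used instead of solving for exact flux values as done here.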

Fatty Acid Precursors
As mentioned, fatty acids are one of the crucial components of surfactin, and modifications of this part of the molecule, such as length and isomerism, have been shown to impact the physicochemical properties and the biological activity of lipopeptides (Dufour et al., 2005; De Faria et al., 2011; Henry et al., 2011; Liu et al., 2015; Dhali et al., 2017). Different metabolic engineering strategies applied to improve surfactin production in terms of branched-chain fatty acid supply included: (i) enhancing the branched-chain α-ketoacyl-CoA supply (Dhali et al., 2017; Wang et al., 2019; Wu et al., 2019); (ii) enhancing malonyl-ACP synthesis (Wu et al., 2019); (iii) overexpressing the whole fatty acid synthase complex (Wu et al., 2019); and (iv) pulling substrate flux toward surfactin biosynthesis by enhancing srfA transcription (Jiao et al., 2017; Wu et al., 2019).
Another study showed that the overexpression of the bkd operon results in lower surfactin production and is detrimental to cell growth (Wu et al., 2019). As the BKD complex requires lipoylation for its dehydrogenase activity, this enzyme competes with other lipoic acid-dependent complexes (pyruvate dehydrogenase complex (PDH), 2-oxoacid dehydrogenase, acetoin dehydrogenase and the glycine cleavage system), generating a suppression of cell growth and, eventually, of surfactin production. By overexpressing the enzymes responsible for lipoic acid synthesis (lipA, lipL, and lipM) (Martin et al., 2011), this suppressive effect is reversed. The competitive lipoylation process between BKD and other lipoic acid-dependent complexes is eliminated (Wu et al., 2019), which results in a higher production of surfactin with respect to the parental strain.
A further pathway targeted for modification is malonyl-ACP synthesis. Acetyl-CoA is converted into malonyl-CoA through the activity of ACC (accDABC). Thus, overexpression of these genes in combination with that of fabD, the malonyl-CoA:ACP transacylase, has been reported to increase the levels of surfactin production (Wu et al., 2019). Furthermore, these authors applied systematic metabolic engineering in B. subtilis 168 to construct surfactin hyperproducer strains. Other successful interventions related to FA biosynthesis have also been described. These include the simultaneous overexpression of most FAS II coding genes, fabH and fabGZIF (Runguphan and Keasling, 2014), and the expression of the E. coli tesA thioesterase (Steen et al., 2010) to "pull" flux through the pathway. The combination of the mentioned interventions, in an already modified B. subtilis 168 chassis, further improved surfactin production by 220% (Wu et al., 2019).
Acetyl-CoA is a key intermediate metabolite, which is used not only for surfactin biosynthesis but fundamentally for cell growth and proliferation. Acetyl-CoA is generated from pyruvate by PDH; overexpression of enzymes of the glycolytic pathway and the knockout of genes coding for enzymes associated with acetyl-CoA consumption are common strategies to increase the supply of this key intermediate. Wu et al. (2019) showed that the simultaneous overexpression of the PDH genes and of the glycolysis enzymes produces an increase in biomass but not a significant increase in the levels of surfactin. However, when these interventions were combined with the overexpression/deregulation of the srf gene cluster, the surfactin production could be further improved to 12.8 g/l, achieving 42% (mmol surfactin/mol sucrose) of the theoretical yield.

Directed Biosynthesis of Surfactin
Due to the non-specificity of some adenylation domains, the proportion of natural variants of surfactin can be modified through the feeding of certain amino acids as the nitrogen source in the culture medium. In the peptide moiety, this only affects L amino acid residues located in positions 2, 4, and 7, with a greater variation in position 4. Indeed, the feeding of valine leads to an increase of valine in position 7 (Menkhaus et al., 1993), the feeding of isoleucine (Ile) leads to the appearance of isoleucine in position 2 and/or 4 (Grangemard et al., 1997) and the feeding of alanine (Ala) leads to a surfactin with alanine in position 4 (Peypoux et al., 1994). The culture medium can also influence the proportion of surfactin variants with different acyl moieties. For example, Liu et al. found that the strain B. subtilis BS-37 has lower surfactin titers with higher proportions of C15-surfactin when grown in LB compared with glucose medium. Another team analyzed the influence of amino acid residues on the pattern of surfactin variants produced by B. subtilis TD7 (Liu et al., 2012). The order of abundance of the β-hydroxy fatty acids in surfactin variants was C15 > C14 > C13 > C16 when no amino acid was added to the culture medium. On the other hand, when Arg, Gln, or Val was added to the culture medium, the proportion of surfactins with even β-hydroxy fatty acid chains significantly increased, whereas the addition of Cys, His, Ile, Leu, Met, Ser, or Thr significantly enhanced the proportion of surfactins with odd β-hydroxy fatty acid chains. Some of these results can be explained by the mode of biosynthesis of branched fatty acids, the precursors of which are branched chain amino acids (Kaneda, 1991). Thus, valine feeding enhances the proportion of iso variants with even fatty acid chains, while leucine and isoleucine feeding enhances the proportion of odd iso or anteiso fatty acid chains, respectively (Liu et al., 2012).
Modification of the variant pattern can also be obtained by genetic engineering of precursor pathways. As previously mentioned, increasing the branched chain 2-ketoacyl-CoA intermediates is one of the strategies used for enhancing the synthesis of surfactin. The deletion of the gene codY, which encodes a global transcriptional regulator that negatively regulates the bkd operon, led to a 5.8-fold increase in surfactin production in B. subtilis BBG258, with an increase by a factor of 1.4 of the amino acid valine in position 7 instead of leucine (Dhali et al., 2017). On the other hand, Wang et al. (2019), using CRISPR interference (CRISPRi) technology, were able to repress the bkdAA and bkdAB genes of the bkd operon, provoking a modest improvement in surfactin concentration but a significant change in the proportion of the nC14 component. Similar results were observed in B. subtilis BBG261, a derivative lpdV mutant strain, where the interruption of this 2-oxoisovalerate dehydrogenase of the BKD complex led to a higher percentage of the nC14 isoform (52.7% in the lpdV mutant in comparison with 21.2% in the control strain) (Dhali et al., 2017).

Excretion
The excretion of surfactin is another important step for its overproduction. Even if, as mentioned before, surfactin can insert itself in the membrane of the cell, the transmembrane efflux is mediated by protein transporters.
Thanks to its amphiphilic structure, surfactin can interact with the membrane of the cell. At or below the CMC, surfactin can insert itself into the membrane, and above the CMC it can even solubilize it (Deleu et al., 2003, 2013). However, it was hypothesized by Tsuge et al. that the gene yerP, a homolog of the RND family efflux pumps, is involved in surfactin efflux (Tsuge et al., 2001). Later, it was shown that the overexpression of three lipopeptide transporters dependent on proton motive force, YcxA, KrsE and YerP, leads to an increase in surfactin export of 89, 52, and 145%, respectively.
Those studies are promising and the efflux proteins need to be further investigated to fully understand the excretion of surfactin.

Degradation
Lastly, the importance of surfactin degradation should not be underestimated. Indeed, a decrease in surfactin concentration of 59 and 73% has been observed during the fermentation process (Nitschke and Pastore, 2004; Maass et al., 2016), suggesting the presence of degradation mechanisms by the cells themselves.
Three hypotheses are considered by the different teams observing this phenomenon. Since, for different media with the same carbon content, the surfactin decrease happened at the same time, surfactin could be used as a carbon source after glucose depletion. Alternatively, since the decrease happened at the same surfactin concentration, surfactin could be degraded because of its possible inhibitory effect at higher concentrations (Maass et al., 2016). It was also shown that the surfactin decrease is linked to the increase in protease activity in the culture medium, and thus the produced enzymes could be involved in this degradation (Nitschke and Pastore, 2004).
As for excretion, this degradation process has seldom been studied but could greatly influence surfactin production.

Culture Medium and Conditions
Landy culture medium, based on glucose and glutamic acid, is one of the main culture media used for surfactin production. Furthermore, some studies have been performed to improve it (Akpa et al., 2001; Wei et al., 2007; Ghribi and Ellouze-Chaabouni, 2011; Huang et al., 2015; Willenbacher et al., 2015).
However, another type of approach for the culture medium is emerging. Indeed, the use of cheap substrates such as waste or by-products from the agro-industrial field is increasingly investigated (De Faria et al., 2011; Gudiña et al., 2015; Moya Ramírez et al., 2015; Paraszkiewicz et al., 2018), since this approach enables a sustainable production of surfactins. The recent review of Zanotto et al. specifically develops this approach (Zanotto et al., 2019).
Concerning the fundamental parameters of culture conditions, a pH of 7 and a temperature of 37 °C lead to a higher production rate (Ohno et al., 1995a). However, when up-scaling from a flask culture to a larger scale, the main challenge in surfactin production appears. Indeed, the agitation rate and oxygenation of the culture medium play an important role in the production (Hbid et al., 1996; Guez et al., 2008; Ghribi and Ellouze-Chaabouni, 2011). As surfactin is a surfactant and thus increases the stability of a gas-liquid dispersion, this agitation leads to the abundant production of foam. Nonetheless, even if this foam production is often considered a drawback, it can be used, with the appropriate reactors, as an advantage to easily recover surfactin.

Production Processes
The addition of a solid carrier to an agitated liquid culture can enhance surfactin production by stimulating cell growth and by promoting biofilm formation. Yeh et al. (2005) added activated carbon, agar and expanded clay, observing a 36-fold increase with activated carbon.
Nonetheless, as mentioned before, due to the high foam generation in surfactin production, classical stirred reactors are not optimal for this bioprocess. Indeed, adding antifoam to the culture medium has many drawbacks. Antifoams may have a negative effect on cell growth and are costly, but even more, they have to be eliminated during purification. Thus, multiple strategies can be applied: (i) to use this foam production to its advantage or (ii) to reduce or avoid foam production.
For the first strategy, the foam fractionation method consists in a continuous removal of the foam from an agitated liquid culture to a sterile vessel. This removal thus constitutes a first purification step and, through continuous extraction, avoids any possible feedback inhibition from the products (Cooper et al., 1981; Davis et al., 2001). However, the foam can carry part of the culture medium and cells out of the reactor and thus decrease the production. For the second strategy, a rotating disk bioreactor was used by Chtioui et al. (2012), where biofilm formation occurs on a rotating disk in a liquid medium. The process is simple and can easily be upscaled, but the oxygen transfer is quite low and thus not optimal for surfactin production.
The biofilm formation capacity of Bacillus can also be used in other types of biofilm reactors such as packed bed reactors, where the liquid medium recirculates over a packing in the reactor (Zune et al., 2016). The purification is easily performed, but the biofilm growth is difficult to control because it depends on the liquid distribution in the packing. Recent studies have considered the genetic engineering of the bacterial cells to modify their biofilm formation ability or their filamentous growth in order to enhance their adhesion to the packing (Brück et al., 2019, 2020). A membrane reactor allows for a bubbleless oxygen transfer through a membrane between the air and the culture medium. Furthermore, a first surfactin purification can be made through ultrafiltration coupled to the fermentation (Coutte et al., 2010b). However, surfactin adsorbs on the membrane, and membranes can be costly when upscaled.
Lastly, a solid medium can be used with solid state fermentation that avoids the mechanical stirring of liquid cultures and thus the foam production. It represents a simple process but with parameters more difficult to control than in a liquid culture. However, many waste and by-products used as novel substrate are in a solid state and could thus be used without pretreatment (Ohno et al., 1995b).
Most studies are performed on the enhancement of one of the steps of the production process, but some studies are performed to decrease the costs in a large scale production (Czinkóczky and Németh, 2020).

Purification
The purification process is a major step in the surfactin production and depends on the fermentation process used. Linked to the techniques mentioned before, foam can be recovered during the fermentation and lead to 70% of recovery (Davis et al., 2001;Willenbacher et al., 2014). For a fermentation process with the surfactin in the liquid medium, acid precipitation, linked to the negative charge of surfactin, is the oldest and more common used technique. It can lead to a high recovery rate, but has a low purity (55%) and is the only technique that cannot be continuously coupled to the production. Solvent extraction can also be used alone but it is mostly coupled with acid precipitation to enhance the purity (Kim et al., 1997;Geissler et al., 2017). One of the most common type of purification, membrane filtration, can especially be used for surfactin through its micelle forming ability above its critical micelle concentration. The aggregated molecule is larger an thus can be retained by membranes with a MWCO of 10-100 kDa (Jauregi et al., 2013) with recovery rates and a purity above 90% depending on the applied membrane. Furthermore, hybrid methods have been successfully employed, i.e., precipitation before filtration (Chen et al., 2007), which facilitated the process or increased the final purity.
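The two figures used to compare these techniques, recovery and purity, can be made explicit with a short calculation. The sketch below assumes the usual definitions (recovery as the fraction of the initial surfactin found in the product, purity as the fraction of surfactin in the total recovered solids); the cited studies may define them slightly differently. The batch values and function names are hypothetical and serve only to illustrate that a high recovery does not imply a high purity.

```python
def recovery_percent(surfactin_in_g, surfactin_out_g):
    """Share of the initial surfactin that ends up in the product."""
    return 100.0 * surfactin_out_g / surfactin_in_g


def purity_percent(surfactin_out_g, total_solids_out_g):
    """Share of surfactin in the total recovered solids."""
    return 100.0 * surfactin_out_g / total_solids_out_g


# Hypothetical batch (made-up numbers, not taken from the cited studies).
broth_surfactin_g = 10.0    # surfactin present in the culture broth
product_surfactin_g = 9.0   # surfactin found in the recovered product
product_solids_g = 16.0     # total mass of the recovered product

recovery = recovery_percent(broth_surfactin_g, product_surfactin_g)
purity = purity_percent(product_surfactin_g, product_solids_g)
print(f"recovery = {recovery:.1f}%, purity = {purity:.2f}%")
```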
The techniques mentioned above are mostly used for the extraction of surfactin from the culture medium. Some uses of surfactin require a higher purity, which can be obtained with the following methods. The physico-chemical properties of surfactin can be exploited through its adsorption on resin or activated charcoal (Liu et al., 2007), leading to variable recovery rates and purity. Chromatography-derived methods can also be used to obtain a better purity and to separate individual variants or isoforms of the lipopeptide (Smyth et al., 2010). Reversed-phase chromatography, based on hydrophobic interactions, is the most commonly employed technique.

CONCLUSIONS
With the improved genetic toolbox that is now available, a larger and more diverse chemical space of the surfactin scaffold can be generated and explored. This endeavor will create novel surfactin derivatives with improved, specialized, or expanded biological activities. Even though the range of potential applications of this molecule is already broad and reaches different industrial sectors, it may be extended further with those novel compounds. However, despite the advancements in surfactin production, its production cost is still holding it back from widespread commercial use in low added-value applications.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
The literature review and manuscript writing were performed by AT, CC-P, MB, YL, MD, JN, TF, SG, AL, LL, AA, HGra, HGro, and PJ. Insights were provided by MA and MM. In addition, AT and PJ have co-ordinated and synthesized the different contributions. All authors have read and agreed to the published version of the review.

FUNDING
This work was funded by the ERACoBioTech program (BestBioSurf project), the European INTERREG Va SmartBioControl/Bioscreen project, and the national funding agencies: the Walloon Region (Belgium), the Dutch Research Council (NWO) (the Netherlands), the Agency for Renewable Resources (FNR) (Germany), the Federal Ministry of Food and Agriculture (Germany), the Ministry of Science, Technology and Innovation (Argentina), and Innovate UK (the United Kingdom).

ACKNOWLEDGMENTS
We thank Edwin Foekema and Tinka Murk (Marine Animal Ecology group of Wageningen University) for their insights on this manuscript. We thank Andrew Zicler for his help in the figure design.