The Key Roles of Mycobacterium tuberculosis FadD23 C-terminal Domain in Catalytic Mechanisms

Sulfolipid-1 (SL-1) is located in the Mycobacterium tuberculosis (M. tb) cell wall, and is essential for pathogen virulence and intracellular growth. Multiple proteins (e.g., Pks2, FadD23, PapA1, and MmpL8) in the SL-1 synthesis pathway can be treated as drug targets, but, to date, their structures have not been solved. The crystal structures of FadD23 bound to ATP or hexadecanoyl adenylate was determined in this study. We have also investigated long-chain saturated fatty acids as biological substrates of FadD23 through structural, biological, and chemical analyses. The mutation at the active site of FadD23 greatly influences enzymatic activity. Meanwhile, the FadD23 N-terminal domain alone cannot bind palmitic acid without C-terminal domain facilitation since it is almost inactive after removing the C-terminal domain. FadD23 is the first protein in the SL-1 synthesis pathway whose structure has been solved. These results reveal the importance of the C-terminal domain in the catalytic mechanism.


Introduction
Tuberculosis (TB) is a communicable disease that is a major cause of ill health and death worldwide (World Health Organization, 2021). Mycobacterium tuberculosis (M. tb), the causative agent of TB, has a complex cell wall structure that protects the pathogen from harsh environments, antibiotics, and host factors. Sulfolipid-1 (SL-1), one of the important unique components of the pathogenic M. tuberculosis cell wall (Goren et al., 1971;Schelle and Bertozzi, 2006), is a key inhibitor of mitochondrial oxidative phosphorylation, phagosome-lysosome fusion, and induces changes in phagocytic cells (Kato and Goren, 1974;Goren et al., 1987;Zhang et al., 1988). In addition, recent studies have demonstrated that SL-1 is necessary and sufficient to trigger neuronal activation and coughing in humans (Ruhl et al., 2020). Collectively, this shows that SL-1 plays a key role in the pathogenesis of tuberculosis. Hence, many efforts have been directed toward investigating pivotal proteins in the SL-1 biosynthetic pathway and understanding how M. tuberculosis regulates this lipid or its precursors as a potential mechanism for host immune regulation to identify new anti-TB drug targets.
FadD23-Pks2 plays a central role in SL-1 synthesis (Lynett and Stokes, 2007). Pks2 is involved in synthesizing such long-chain multiple methyl-branched fatty acids. However, disrupting the gene encoding the enzyme responsible for synthesizing these acids may produce sulfolipiddeficient mutants (Sirakova et al., 2001). FadD23 is one of the 34 fatty acid adenylating enzymes (FadDs) in M. tuberculosis (Cole et al., 1998). The first class of FadDs comprises fatty acyl-coenzyme A (CoA) ligases (FACLs) that may be involved in lipid degradation and remodeling (Kuhn et al., 2016). FACLs use CoA as a common substrate to form acyl-CoA. Another class of FadDs comprises FAALs involved in the biosynthesis of toxic lipids in mycobacteria, as well as in the biosynthesis of lipopeptides of pharmaceutical importance in other microorganisms.
FAALs activate and transfer fatty acids onto related Pks proteins to further synthesize the diverse M. tuberculosis lipids (Mohanty et al., 2011). FAALs use a carboxylate substrate and ATP to form acyl-AMP. It is speculated that FAALs transfer the acyl chain to the Pks modified with the phosphopantetheinyl (P-pant) arm through a special proteinprotein interaction (Buchholz et al., 2009;Gavalda et al., 2009). Biochemical test corroborates that acyl-AMP ligases (FadD26, FadD30, and FadD32) can specifically transfer activated fatty acids to the cognate Pkss (Trivedi et al., 2004).
Typically, FadDs adopt two main conformations: adenylation and thiolation. The C-terminal domains undergo a domain alternation of approximately 140° to complete the catalysis steps. The active site is located at the interface of the N-and C-terminal domains (Reger et al., 2007(Reger et al., , 2008Kochan et al., 2009;Goyal et al., 2012). One research group identified 5′-O-[N-(11-phenoxyundecanoyl) sulfamoyl]-adenosine (PhU-AMS) as the optimal ligand for M. tb FadD32, and PhU-AMS was shown to inhibit the M. tb FadD32-catalyzed loading of Pks13 (Kuhn et al., 2016). Another study showed that 11-phenoxyundecanoyl-AMS has an inhibitory effect on M. tb FadD28 (Baran et al., 2020). However, whether non-hydrolyzable analogs of acyl-AMS have an inhibitory effect on other FAALs is still unknown. Mycobacterium tuberculosis FadD23 and M. tb FadD28 have 66% sequence similarity and share the same evolutionary branch; however, only the structure of the N-terminal domain of M. tb FadD28 is available. Here, we solved the full-length M. tb FadD23 structure and attempted to elucidate its molecular mechanism. To gain insight into substrate recognition, as well as to provide a foundation for designing drugs against M. tuberculosis, our research identified the 2.68 Å crystal structure of M. tb FadD23 complexed with ATP and the 2.25 Å crystal structure with AMP-PNP. We observed palmitic acid in the crystal structure indicating that the protein moiety most likely utilized palmitic acid more than other Escherichia coli fatty acids. Meanwhile, FadD23 included a hydrophobic tunnel and has substrate recognition specificity, as verified by biochemical experiments. We also elucidated the 1.5 Å crystal structure of the M. tb FadD23 N-terminal domain and verified that the C-terminal domain is essential in lipid synthesis through biochemical experiments. Based on our research and previous reports, we propose a structural basis for the substrate binding and generation of acylation reactions. Furthermore, we propose that FAALs adopt the C-terminal "lid" domain model and exert their enzymatic activity through the conformational change in the C-terminal "lid" domain to generate the product.

Cloning, expression, and purification of protein
The fadD23 gene, encoding the fatty-acid-AMP ligase, FadD23, was amplified from genomic DNA of M. tuberculosis H37Rv by PCR. The amplified DNA was then inserted between restriction sites Nde I and Xho I of a pET15b expression vector with a 6 × Histidine (His) tag at the N-terminus. The resulting plasmid was transformed into E. coli BL21(DE3) plySs. Cells were grown in Lysogeny broth (LB) medium with 100 μg/ml ampicillin at 37°C. When the optical density at 600 nm (OD 600 ) reached 0.7, the culture was treated with 0.2 mM isopropyl-D-thiogalactopyranoside (IPTG) at 37°C to induce protein expression. To induce FadD23 expression in Mycobacterium smegmatis (M. smeg) mc 2 155 cells, fadD23 was sub-cloned into a pMV261 vector with an 8 × His tag at the C-terminus, under the control of an acetamide promoter. The resultant plasmid was transformed into M. smegmatis mc 2 155 cells by electroporation. The cells were cultivated in LB liquid media supplemented with 50 μg/ml kanamycin and 20 μg/ml carbenicillin. When the cells were grown to an optical density (OD 600 ) of 1.0-1.2 at 37°C, overexpression of the recombinant protein was induced by 0.2% (w/v) acetamide at 16°C. Supplementary Table S1 lists the primers used in this study.
Escherichia coli and M. smegmatis mc 2 155 cells were then harvested within 5 and 96 h of induction, respectively, and resuspended in lysis buffer (50 mM Tris·HCl, pH 8.0; 300 mM NaCl; and 5% glycol). Thereafter, cell debris was removed by high-speed centrifugation at 18,000 rpm for 40 min after crushing using a homogenizer (AH-Basic-II, ATS Engineering Limited, China). Next, the supernatant was loaded onto an Ni-NTA affinity column (Cytiva, Marlborough, MA, United States) and washed in the resuspension buffer (50 mM Tris·HCl, pH 8.0; 300 mM NaCl; 5% glycol; and 20 mM imidazole). Subsequently, the protein was eluted in elution buffer (50 mM Tris·HCl, pH 8.0; 300 mM NaCl; 5% glycol; and 300 mM imidazole). FadD23 was further purified by ion-exchange and size-exclusion chromatography (Supplementary Figure S1A). Finally, the protein was stored in 20 mM Hepes, pH 7.5, and 150 mM NaCl. The construction and purification methods of the FadD23 N-terminal domain (1-465 AA) were the same as those of the full length.
2.3. Data collection, processing, and structure determination X-ray data were collected from beamline BL17U1, BL18U1, and BL19U1 at the Shanghai Synchrotron Radiation Facility (SSRF) at 100 K. Data integration and scaling were performed using XDSgui (Kursula, 2004;Kabsch, 2010). The structure was determined by molecular replacement using the Phaser module in Phenix (McCoy et al., 2007;Adams et al., 2010) using the M. tb FadD28 [RCSB Protein Data Bank code (PDB: 3E53)] N-terminal as a search template. The output model of the molecular replacement subsequently built the C-terminal using electron density and was subjected to iterative cycles of manual model adjustment using Coot (Emsley and Cowtan, 2004). Refinement was conducted using Phenix. The processing methods of different crystal structure were similarly conducted.

AlphaFold2 predictions
The full AlphaFold v2.0 (AlphaFold2) pipeline was obtained from DeepMind and installed onto a local workstation (Jumper et al., 2021). Structure predictions of full-length FadD23 were performed using AlphaFold2 software. The Local Distance Difference Test (pLDDT) plot was constructed using AlphaPickle (Arnold, 2021).

Mass spectrometer analysis
The mass spectrometer (MS) analysis samples were: 200 μl of purified FadD23 (1 mg/ml) or buffer extract fatty acid using 600 μl of 100% trichloromethane. Then, the solution was dried by Termovap Sample Concentrator (N-EVAP-24, Organomation, United States), and dissolve in 100% acetonitrile. The MS analysis of the FadD23 sample and unknown fatty acids was performed using exact mass TOF-MS (Xevo G2-XS Tof, Waters, United States) via an electrospray ionization source mass spectra were recorded in the negative ionization mode over a 50-1,200 m/z mass range. The scan parameters included 2.5 kV capillary voltages and 1 s scan times. Other source conditions were as Frontiers in Microbiology 04 frontiersin.org follows: 100°C source temperature, 50 and 600 l/h cone, and desolvation gas flow, respectively, and 40 V cone voltage. Data processing and analysis were performed using MassLynx V4.1 (Waters, United States).

Differential scanning fluorimetry
The Differential scanning fluorimetry DSF method is a widely used for determining protein stability. When a ligand is bound to the protein, the stability of the protein increases, such that the exposed hydrophobic groups at the same temperature decrease; a higher temperature is required to fully expose the hydrophobic groups of the protein. In a 96-well Real-time Quantitative PCR (RT-PCR) plate, a solution was prepared with a final concentration of 1 mg/ml FadD23 protein, 9× SYPRO Orange Protein Gel Stain (Thermo Fisher Scientific, Waltham, MA, United States), 2 mM ATP, 2 mM MgCl 2 , and 2 mM fatty acid (octoic, decanoic, lauric, myristic, palmitic, and stearic acids) dissolved in 100% Dimethyl sulfoxide (DMSO) with a final well volume of 30 μl. The solution was allowed to equilibrate at 25°C for 20 min before a thermal melt protocol was initiated using a Bio-Rad CFX96 RT-PCR instrument (Bio-Rad Laboratories, Hercules, CA, United States). The "FRET" channel was selected to satisfy the excitation and emission wavelengths of SYPRO Orange (λex 470 nm/λem 570 nm). The melt sequence involved first maintaining the sample at 20°C for 5 min before linearly increasing the temperature by 1°C/min with fluorescence measurements every 30 s. Data were plotted as the derivative of the melting curve as a function of temperature.

Surface plasmon resonance
FadD23 was immobilized on a Series S Sensor Chip CM5 (Cytiva, Marlborough, MA, United States) to 3,000 response units (RUs) using a Biacore X100 (Cytiva, Marlborough, MA, United States) and running buffer composed of 10 mM PBS pH 7.4, 0.05% Tween 20, and 5% DMSO. Serial dilutions of fatty acid were injected at concentrations ranging from 100 to 0.19 mM. Response curves were fit to a 1:1 binding model using Biacore X100 Evaluation Software (Cytiva, Marlborough, MA, United States).

Hydroxamate-MesG assay
The 100-μl reaction system contained 20 mM Tris pH 8.0, 2 mM MgCl 2 , 1 mM DTT, 150 mM hydroxamate (pH 7.0), 0.1 U nucleoside phosphorylase (PNP), 0.04 U pyrophosphatase, 0.2 mM MesG, 1 mM ATP, 7.5 μM FadD23, and varying concentrations of fatty acids. Reactions were run in a 96-well plate (Corning 3,635; Corning Life Sciences, Corning, NY, United States) and monitored at A 360 using a Varioskan™ LUX multimode microplate reader (Thermo Fisher Scientific, Waltham, MA, United States). The kinetic parameter of the substrate was determined for FadD23 using the hydroxamate formation assay; the standard assay conditions, described above, were used. The initial rate was measured as a function of the substrate concentration to provide a saturation curve, which was fit by nonlinear regression analysis to the Michaelis-Menten equation using GraphPad Prism 8.4. Then, k cat was determined k cat = V max /[E t ]. All mutations or truncations determine initial rate under sufficient and consistent substrate conditions (100 μM).

Overall structure of Mycobacterium tuberculosis FadD23
The oligomeric state of FadD23 is a monomer determined by size exclusion chromatography and further confirmed by analysis ultracentrifugation (AUC; Supplementary Figures S1A,B). Although many efforts have been directed toward the crystal screening test, apo-form M. tb FadD23 proteins failed to grow suitable crystals for data collection. However, well-shaped crystals from M. tb FadD23 with AMP-PNP or ATP could be diffracted to resolutions of 2.25 and 2.68 Å for data collection, respectively ( Figure 1B). The full-length M. tb FadD23, containing a large N-terminal domain (1-460) and a small C-terminal domain (473-584), connect by flexible loops and an antiparallel β-sheet (465-472), was obtained. Mycobacterium tuberculosis FadD23 has 14 helices, 23 β-sheets, and some irregular loops ( Figure 1C). The N-terminal can be divided into three regions: regions A and B containing more than three helices and three sheets, and region C containing more flexible loops ( Figures 1C,D). Regions A and B form two continuous sheets; sheet 1 comprising β1 and β2 form region A and β9, β10, β11, β12, and  Figure S2B). In addition, region B surrounds a cavity for substrate binding. The C-terminal domain comprises three peripheral helices and three inner sheets ( Figure 1C; Supplementary Figure S2C).
It was observed that two M. tb FadD23 complexes have extra electron densities in difference maps (F o -F c ) near ATP or AMP-PNP ( Figure 2A). After corroboration with previous studies and homologous structures (Zhang et al., 2011), it was concluded that this position is a substrate binding pocket, and its density indicates that it may be a fatty acid. The molecular weight of the small molecule obtained after protein denaturation and extraction was 256.4 Da, consistent with the size of palmitic acid (Supplementary Figure S1C). We surmised that it is derived from the E. coli expression system and binds to M. tb FadD23. Therefore, we interpreted these two densities as palmitic acid and hexadecanoyl adenylate (AMP-C16); the statistics and final refinement results are shown in Table 1. AMP-PNP is a non-hydrolysable ATP analog. FadD23 catalyzed the hydrolysis of ATP and connected AMP with palmitic acid to generate a molecule of AMP-C16. The structure of the ATP-FadD23 complex is similar to that of the AMP-PNP complex, with an RMSD of 0.2869 Å. In the structures of both complexes, no obvious electron density was observed for several loop regions, including residues 97-100, 125-137, 154-157, 172-176, and 511-513. However, a weak electron density was observed for parts of a loop (residue 553-557). The substrate is mainly located in the N-terminal domain, while ATP or AMP-PNP is bound to the contact surface between the N-(regions B and C) and C-terminal domains, and the fatty acid part points to the β-sheet of the N-terminal domain.

Binding affinity of Mycobacterium tuberculosis FadD23 with substrates
We suggested that palmitic acid in the structure was bound in the M. tb FadD23 hydrophobic pocket. Since M. tb FadD23 can obtain  Frontiers in Microbiology 06 frontiersin.org saturated fatty acids from the expression system, it is speculated that the protein can bind fatty acids. Therefore, the affinity of fatty acids of different lengths was determined to infer the range of carbon chain lengths with better binding capacity. First, we use DSF to determine protein stability after adding different substrates. To verify our conjecture, we conducted experiments on FadD23 by adding different fatty acids, ATP, and Mg 2+ . After data processing, we observed that adding palmitic acid to FadD23 slightly increased the melting temperatures (T m ; by approximately 0.5°C), indicating enhanced protein stability ( Figure 2B, Supplementary Figure S3B). In addition, the T m increased by 10°C after adding fatty acid, ATP, and Mg 2+ , further demonstrating enhanced stability (Supplementary Figures S3A,B). Furthermore, the protein had increased rigidity after substrate addition, which was conducive to acquiring protein crystals. Figure 2B shows the results as two valleys when ATP is absent, which can be regarded as two melting states ( Figure 2B, Supplementary Figure S3A). When considering the addition of ATP, an apparent difference was observed regarding the importance of adenylation for stabilizing protein conformation.
To further determine the fatty acid specificity of FadD23, we measured the affinity of different fatty acids using a surface plasmon resonance system (Biacore T200; GE Healthcare, Chicago, IL, United States). The affinity of ATP to FadD23 reached 1 μM, and palmitic acid to FadD23 reached 3.450 × 10 −5 M (Supplementary Table S2). At the same time, the binding ability between the protein and fatty acids increased with carbon chain length up to palmitic acid (the binding affinity increased); thereafter, increased carbon chain length weakened affinity. This provided robust evidence to speculate that palmitic acid is the optimal substrate for FadD23.
After various enzyme activity testing methods were conducted, the hydroxamate formation assay of Wilson and Aldrich (2010) was finally selected. In the absence of an acceptor, the tightly bound acyl-adenylate can be release of pyrophosphate (PPi) by the enzyme. The pyrophosphate produced is hydrolyzed to inorganic phosphate by pyrophosphatase that will react with MesG allowing the reaction to be monitored in a continuous fashion at 360 nm. We calculated the enzymatic activity of M. tb FadD23 under different substrate conditions. The K m , k cat , and k cat /K m value for lauric acid are 33.89 μM, 2.40 × 10 −3 s −1 , and 71.8 s −1 mol −1 L, respectively. The K m , k cat , and k cat /K m value for palmitic acid is 14.26 μM, 6.72 × 10 −4 s −1 , and 49.4 s −1 mol −1 L, respectively. However, the K m value could not be fitted well with octanoic or stearic acid as substrates. It worth noting that the lower K m value was that palmitic acid as substrate, a higher k cat /K m value was obtained with lauric acid as substrate (Table 2).
Next, we compared the fatty acid binding tunnels with full-length structures with other FAALs in Mycobacteria, including M. marinum (M. m) FadD32, M. smeg FadD32, and M. tb FadD10 (Liu et al., 2013;Guillet et al., 2016). The linear distances between the distal end and the portal of their fatty acyl binding tunnels were 7 Å, 14 Å or longer, and 10.8 Å, respectively. Moreover, FadD23 had a fatty acid binding pocket 15.5 Å in length, indicating the specificity of different FAALs with fatty acids (Supplementary Figures S4A-D).

Substrate binding site of Mycobacterium tuberculosis FadD23
It is evident in the crystal structure we obtained that the adenosine binding pocket is located in the cleft between the N-terminal and C-terminal domains and the residues involved in binding are all from the large N-terminal domain. The fatty acids or AMP-C16 were bound in the hydrophobic pocket of the N-terminal domain. Hydrogen bonds between FadD23 and adenosine groups were observed in the structure. The backbone oxygen of Ala328 and the -OH groups of Tyr376 formed two hydrogen bonds with N37 of adenosine. The NE2 group of His221 and the Nα of Ala332 were differentially bound to the monophosphate of AMP. The OD1 atom of Asp446 and O38 of the ribose group in adenosine formed a hydrogen bond ( Figure 2C). All the mentioned interactions were also observed in the AMP-PNP-FadD23 complex ( Figure 2D). At the same time, Asp222, Glu301, Arg302, Tyr329, Gly330, and Leu331 residues also interacted with AMP, contributing to the stability of the proteinsubstrate complex. In the AMP-PNP-FadD23 complex, the O27 of AMP-PNP interacted with NE of Arg460 through hydrogen bond, and Nα of Ser300 formed a hydrogen bond with N09 of AMP-PNP ( Figure 2D). Based on these observed interactions, mutations were introduced. When His221 is replaced with Ala221 in FadD23, the activity of the enzyme was significantly reduced. The initial rate of the reaction was reduced to 75% of the original initial rate under sufficient and consistent substrate conditions ( Figure 3A). We also compared AMP binding pockets of these structures. His221(M. tb FadD23) was conserved in FAALs (Supplementary Figure S5A). The backbone oxygen of Ala328 (M. tb FadD23) interacted with N37 of adenosine, Nα of Ala332 interacted with AMP, and the hydrogen bond between the OD1 atom of Asp446 (M. tb FadD23) and O38 of the ribose group were also observed in other FadDs (Supplementary Figure S4E).
Palmitic acid or hexadecanoyl adenylate was located in a hydrophobic pocket of the N-terminal domain. Met223, Ala328, and Val336 were located near the base of adenosine's hydrophobic pocket, the amino acid residues in the middle of pocket were Met195, Tyr199, Val226, Leu227, and Leu297, while the amino acid residues Phe192, Ile196, Phe200, Leu211, Ala231, Val235, and Phe265 at the end of the pocket formed a semi-enclosed barrel ( Figure 3B; Supplementary Figure S2D). The enveloping hydrophobic amino acid residues listed above constituted the binding pocket for the fatty acid substrate of FadD23. The length of hydrophobic pocket may correlate with substrate-specific selection.  Figure S4F).
Based on the above results, we speculated that the C-terminal domain plays a key role in stabilizing the overall conformation. Next, we solved the 1.5 Å crystal structure of M. tb FadD23 N-terminal domain. The N-terminal structure was compared with that of the complex with ATP or AMP-PNP, and the RMSDs were observed to be 0.4265 and 0.3836 Å, respectively. No obvious changes in substrate binding sites were observed. At the same time, after analyzing the structure, no extra differential density was observed, indicating that the N-terminus of FadD23 could not capture fatty acids from the same expression system (Supplementary Figure S5B).
We inferred that the C-terminal domain serves as a cover for fatty acid binding to FadD23 and is important for the function of FadD23. The protein of the M. tb FadD23 N-terminal domain has no fatty AMP-ligase activity ( Figure 3A).
Next, we used AlphaFold2 software to predict the structure of M. tb FadD23 and obtained five models with high confidence to compare structures by structurally aligning with the FadD23 (Figure 3C). The calculated RMSD values of structures, Ranked_0, Ranked_1, Ranked_2, Ranked_3, and Ranked_4, were 1.34, 1.02, 1.41, 1.43, and 1.13 Å, respectively. At the same time, the pLDDT of each residue was calculated (Supplementary Figure S6). It can be seen that among the five predicted structures, the confidence of the residues in the N-terminal domain was higher than that of the residues in the C-terminal domain. There are some low-confidence residues in both domains. In the structures we got, these residues are disordered, e.g., residues 96-100, or be stabilized when form complex with substrates, e.g., residues 552-556. Therefore, we speculated that these amino acid residues promote the occurrence of enzymatic activities in capturing substrates and generating products. Further, we speculated that the C-terminus resembles a lid that changes from a swinging (open) state to a stable (closed) state interacting with the N-terminal domain and the substrate after the substrate is bound to protein. Additionally, the anti-β-sheet (465-472) stabilizes the overall structure in the closed conformation.
Evaluation using PDBePISA 1 revealed that the size of the interaction area between the two domains was 891.8 Å 2 . At the interface of the two domains, in addition to the interaction of the β-sheet, Arg310 and Glu483, Arg302, and Asp476 exhibited two hydrogen bonding interactions ( Figure 3D). The Frontiers in Microbiology 08 frontiersin.org

Cross-talk between Mycobacterium tuberculosis FadD23 and Pks2
When M. tb FadD23 was expressed by the M. smegmatis, we observed the lanes of the peak one fraction eluted from gel chromatography in SDS-PAGE gel, in addition to that of FadD23, which had an obvious band at the position of approximately 220 kDa (Supplementary Figure S7). After verification using mass spectrometry, we confirmed that this is MSMEG_4727, which is homologous to Pks2 in M. smegmatis (data not shown).

Discussion
Previous research proved that M. tb FadD23 can transfer hexadecanoyl adenylate or AMP-C18 to Pks2 in vivo (Sirakova et al., 2001). We speculate the FadD23 binds to medium-or long-chain fatty acids (C8-C18). In this study, we verified the biological properties of FadD23 using different methods. We successfully detected the m/z corresponding to palmitic acid in the MS analysis. Despite the ample data generated by the enzymatic assay and MS analysis, we did not find sufficient evidence that could strongly support the binding selectivity of FadD23 to a specific long-chain fatty acid. Our research did, however, reveal that FadD23 directly captures palmitic acid from the E. coli expression system and that the palmitic acid-FadD23 binding slightly increases the T m .
We speculate that the reason for the two T m values in DSF is that the first T m peak is due to C-terminal domain instability, and the second one is caused by overall protein denaturation, which is also related to the conformational instability of the N-and C-terminals. The enzyme activity assay methods we initially used yielded no results. We either used a malachite green reagent to form a stable dark green color with the free phosphate released by the enzyme or a method determining the remaining amount of ATP that can be used by luciferase. We speculate that after the product was generated, since there was no downstream related protein to receive the product in time, the product could not be released. The hydroxamate formation assay, which depends on the ability of hydroxylamine to release tightly bound acyl-adenylate, was finally utilized. To explore the effect of changing the size of the hydrophobic pocket on the length of the bound fatty acid chain, we attempted to construct F192S, M195S, and F265S mutants, and we also introduced the S300A and D446A mutations according to structure analyzing or structure-based sequence alignments (Robert and Gouet, 2014;Supplementary Figures S4F, S5A) in FAALs. Unfortunately, no obvious protein expression was detected. Therefore, only FadD23 with H221A mutation could be used for subsequent enzymatic experiments. His221 is one residue that attached to the oxygen atom of α-phosphate both before and after AMP-C16 formation, gets closed (3.9 Å) to carboxyl group of the fatty acid. The histidine is conserved in FAALs according structure-based sequence alignments (Supplementary Figures S4F, S5A) in FAALs. That indicate that His221 should play a central role in hydrolysing ATP and producing the products. The reduction in enzymatic activity by 25% correlates with to hints provided by analyzing the interactions in structures and structure-based sequence alignments.
The crystal structures of FadD23 with either substrate, substrate analog, and N-terminal domain were obtained, but we did not obtain the structure of the apo state. Through the aforementioned structural analysis and biological properties studies, we infer that the mode of action of FadD23 includes the N-terminal domain binding to the substrate and the C-terminal domain functioning similarly to a lid that stably binds the substrate in the pocket to generate the product, which is then transported by downstream proteins. It is worth noting that the hydroxamate formation assay revealed that the binding capacity is nearly abolished when the C-terminal domain is absent. This is consistent with our hypothesis that the C-terminal domain plays a significant role in stabilizing substrate binding in the enzyme pocket.
Previous study took small angle X-ray scattering (SAXS) to elucidate the conformation changes of M. smeg FadD32 during catalysis process (Guillet et al., 2016). Because we did not obtain the structure of the apo state of FadD23, we took AlphaFold2 to predict the apo state of FadD23. The crystal structure of FadD23 with a substrate compared to the sequence-based predicted structure by AlphaFold2 can provide some hints for the conformational change in the catalytic process. We also speculate that the four-step catalytic cycle of FadD23 depends on analyzing all three structures and predictions from AlphaFold2 ( Figure 4). FadD23 can capture fatty acids from the surrounding environment. Firstly, the C-terminal domain rotates after binding ATP and induces the conformation change in the full structure. Secondly, ATP de-pyrophosphate combines with palmitic acid to form a product (hexadecanoyl adenylate) that dissociates from FadD23 with the participation of downstream proteins. Finally, FadD23 returns to the apo state and waits for substrate capture. That speculation also consistent with research of M. smeg FadD32 (Guillet et al., 2016), and is a complementary understanding for M. tb FadD23. Moreover, we intended to verify the unloading of FadD23 products in vitro and clarify the relationship with downstream proteins, but we did not obtain any. This might be because more components are involved in vivo, which needs further exploration. It seems that FAALs transfer the acyl chain to the Pks modified with the P-pant arm through a special protein-protein interaction (Gavalda et al., 2009). We confirmed that FadD23 and Pks2 could be co-purified when over-expression FadD23 in M. smegmatis. However, it is unclear whether the transfer of the substrate requires the participation of other proteins. Therefore, the structural basis of the interaction between FadD23 and Pks2 or other proteins and the mode of action between these proteins need to be elucidated in the future. Previous studies have mentioned that substrate analogs, such as tetradecyl-AMS for M. tb FadD28 or PhU-AMS for M. tb FadD32, could be treated as general inhibitors for different FAALs. In vitro anti-tubercular activity experiments proved that PhU-AMS inhibits M. tuberculosis with a minimum inhibitory concentration 3.13 μM (Kuhn et al., 2016;Baran et al., 2020). We also attempted to demonstrate the inhibition of the enzymatic activity of M. tb FadD23 using PhU-AMS, but PhU-AMS has no inhibitory activity against FadD23 (data not shown). As M. tb FadD23 and M. tb FadD28 have 66% sequence similarity and share the same evolutionary branch, it is very interesting that PhU-AMS have no effect on enzymatic activity of M. tb FadD23. Therefore, we speculate that PhU-AMS has no inhibitory effect on FadD23 because it considerably differs from its optimal substrate as the FAAL class of FadDs appears to functional nonredundant in production defer fatty acids (Sourjik and Berg, 2004). In a word, identifying a specific FadD23 inhibitor will be very complicated.
In summary, we elucidated the structure of full-length M. tb FadD23, the first protein in the SL-1 synthesis pathway. Through structural, biological, and chemical analyses, we postulate that palmitic acid is the preferred biological substrate of FadD23. Based on our results, we believe that the FadD23 N-terminal domain alone cannot efficiently bind fatty acids without C-terminal domain and the C-terminal domain plays a crucial role in the catalytic mechanism. However, more studies are needed to clearly elucidate the mechanism of SL-1 synthesis pathway and there provides a theoretical background for designing FadD23 inhibitors. Whether a specific inhibitor could reduce SL-1 production with the possible effect of attenuating virulence could do reducing the bacterial load in the host is still a question, more efforts are needed to develop specific inhibitors against FAALs or selective inhibitors of FadDs.

Data availability statement
The data presented in the study are deposited in the RCSB Protein Data Bank (www.rcsb.org), accession numbers 8HD4, 8HDF, 8HCZ.