Electrostatic interactions mediate the nucleation and growth of a bacterial functional amyloid

Bacterial biofilm formation can have severe impacts on human and environmental health. Enteric bacteria produce functional amyloid fibers called curli that aid in biofilm formation and host colonization. CsgA is the major proteinaceous component of curli amyloid fibers and is conserved in many gram-negative enteric bacteria. The CsgA amyloid core consists of five imperfect repeats (R1-R5). R2, R3, and R4 have aspartic acid (D) and glycine (G) residues that serve as “gatekeeper” residues by modulating the intrinsic aggregation propensity of CsgA. Here, using mutagenesis, salt-mediated charge screening, and varying pH conditions, we show that the ability of CsgA variants to nucleate and form amyloid fibers is dictated by the charge state of the gatekeeper residues. We report that in Citrobacter youngae CsgA, certain arginine (R) and lysine (K) residues also act as gatekeeper residues. We propose a mechanism of gatekeeping wherein R and K residues electrostatically interact with negatively charged D residues, tempering CsgA fiber formation.


Introduction
Bacteria can be found in biofilms, complex communities enclosed within an extracellular matrix (ECM) (Costerton et al., 1995;Costerton, 1999). The ECM confers resistance to the underlying cells from physical and chemical stressors such as dehydration, predation, and antibiotics among others (Elasri Mohamed and Miller Robert, 1999;Daniel and O'Toole George, 2005;Roberts and Stewart, 2005;Matz et al., 2008;Mulcahy et al., 2008;Truelstrup Hansen and Vogel, 2011;Haaber et al., 2012;Kostakioti et al., 2013). Biofilms have been extensively studied due to their impact on human health and disease (Costerton, 1999;Donlan and William Costerton 2002;Parsek and Singh, 2003;Motta et al., 2021). The physiological and biochemical basis of biofilm formation by enteric bacteria is of particular interest due to connections with host digestion, immunity, and pathologies (Banwell et al., 1985;Macfarlane et al., 1997;Palestrant et al., 2004;de Vos, 2015;Donaldson et al., 2016). The Enterobacteriaceae family includes many species which colonize the human gut and are implicated in disease, such as Salmonella enterica, Serratia marcescens, Klebsiella, Yersinia pestis, and Escherichia coli (E. coli) (Hufnagel David et al., 2015). The biofilms produced by enteric bacteria are composed of functional amyloids, polysaccharides, and extracellular DNA (eDNA) (Steinberger and Holden, 2005;Jonas et al., 2007;Qin et al., 2007;Izano Era et al., 2008;Guiton Pascale et al., 2009;Kostakioti et al., 2013). Of interest is the functional amyloid curli, produced by certain enteric bacteria and first identified in E. coli (Chapman Matthew et al., 2002). Curli amyloid fibrils have been shown to be important in biofilm formation by mediating initial surface attachment and are an important structural component of the overall biofilm architecture (Kikuchi et al., 2005;Hufnagel et al., 2013;Hung et al., 2013).
Two divergently transcribed operons containing seven genes control the expression, secretion, and formation of the curli fibril (Chapman Matthew et al., 2002). The major curli subunit is an aggregation-prone protein called CsgA. CsgA is predicted to be mostly intrinsically disordered until its polymerization is initiated in the extracellular space upon encountering the surface-anchored nucleator protein, CsgB (Sujeet et al., 2019). Once polymerized, the resulting curli amyloid fibrils provide structural integrity to the biofilm. CsgA has been predominantly studied in E. coli cells. E. coli lacking CsgA are deficient in biofilm formation and surface attachment (Chapman Matthew et al., 2002;Tursi and Tükel, 2018). Interestingly, CsgA homologs from diverse bacterial species have been shown to complement E. coli CsgA deletion in vivo and fragments of CsgA homolog fibers can seed the amyloidogenic aggregation of E. coli CsgA in vitro . Since most bacterial communities are composed of multiple species, it is plausible that CsgA homologs from different species can be shared to build biofilms containing a heterogeneous matrix and population. This process is of particular interest for the diverse range of amyloid-producing enteric bacteria in the human gut and introduces the importance of studying CsgA homologs in various enteric bacterial species. Given the importance of CsgA in biofilm formation, various in vivo and in vitro methods have been developed to study how CsgA aggregates into amyloid fibers .
In vitro Thioflavin-T (ThT) fluorescence studies have shown that the polymerization of E. coli CsgA follows a nucleation-dependent polymerization (NDP) model (Jain et al., 2017). The sigmoidal aggregation pattern includes an initial nucleation lag phase followed by rapid fiber polymerization and a final stationary phase (Chiti and Dobson, 2017;Jain et al., 2017). CsgA comprises five conserved imperfect repeat units designated R1-R5 (Wang et al., 2007). Each repeat unit is predicted to form a β-helix-like structure with a characteristic strand-loop-strand motif (DeBenedictis et al., 2017). The Q-X4-N-X5-Q consensus sequence of the repeat units is important for initiating and propagating CsgA polymerization via side chain interactions of the glutamine (Q) and asparagine (N) residues (Wang and Chapman, 2008). Interestingly, units R2, R3, and R4 contain certain aspartic acid (D) and glycine (G) residues, termed "gatekeepers," which impede the intrinsic aggregation propensity of E. coli CsgA, a phenomenon termed "gatekeeping" (Wang et al., 2010). Substitution of the gatekeeper residues with the corresponding residues in the R1 and R5 repeat units leads to increased aggregation propensity of CsgA and a significant decrease in the lag phase, indicating that gatekeepers may play an important role in modulating CsgA nucleation (Wang et al., 2010). Charged amino acid residues like lysine (K), arginine (R), glutamic acid (E), and aspartic acid (D) have been shown to serve as gatekeeper residues by interfering with β-sheet formation due to large and flexible sidechains (e.g., R, K) or charge-charge repulsion (Reumers et al., 2009;Beerten et al., 2012). It has also been shown that electrostatic interactions between positively and negatively charged amino acid residues can modulate amyloidogenesis in proteins (Yun et al., 2007;Meisl et al., 2017;Lin et al., 2020).
We previously found that wild-type Citrobacter youngae CsgA (CY CsgA) does not contain the same gatekeeper D residues as E. coli CsgA and that CY CsgA polymerizes very quickly compared to other CsgA homologs (Bhoite et al., 2022). In addition, a mutated CY CsgA variant in which the E. coli gatekeeper (GK) D residues were added, named CY GK CsgA (CY CsgA V78D/S89D/N125D), displayed slower polymerization rates than CY CsgA (Bhoite et al., 2022). Here, we investigated the role of gatekeeper residues in modulating amyloidogenesis of the CY GK CsgA mutant. We hypothesized that the introduction of gatekeeper D residues in CY CsgA could control polymerization through either a) repulsion between the negatively charged D residues, preventing compact amyloid formation, or b) intramolecular interactions involving the D residues that stabilize CsgA monomers and delay the formation of a polymerization-competent species.
Sequence alignment of CsgA homologs from diverse species revealed that a) not all CsgA homologs have the conserved D gatekeeper residues and b) in CsgA homologs with conserved D residues, two positively charged residues, namely arginine (R) and lysine (K), are highly conserved (Supplementary Figure S1). We thus hypothesized that the gatekeeping ability is in part conferred by the negatively charged D residues and the positively charged R and K residues electrostatically interacting with each other, modulating the formation of the aggregation prone β-helix conformation of CsgA. Wild-type CY CsgA contains the positively charged R62 and K107 residues but lacks most of the gatekeeper D residues (Supplementary Figure S1). The introduction of gatekeeper D residues could allow for electrostatic interactions with the native R and K residues to form, leading to the increase in lag phase observed in the CY GK CsgA mutant.
In this study, CY GK CsgA was used to investigate the mechanism behind gatekeeping activity. We report that pH-induced charge manipulation and salt-mediated charge screening significantly impacted gatekeeping activity in CY GK CsgA. We also identify new positively-charged gatekeeper residues, R62 and K107, and show that deletion or substitution of these residues negatively impacted gatekeeping activity. We demonstrate that charge neutralization and mutation of the positively charged residues as well as partial loss of negative charge on the D78, D89, and D125 residues abolished gatekeeping activity. Based on our data, we propose a mechanism wherein an electrostatic interaction, namely R or K residues interacting with D residues, represents one mechanism of gatekeeping CY GK CsgA nucleation.

Results
Substitution of positively charged R and K residues leads to increased nucleation rates
Sequence alignments revealed the presence of conserved positively charged R and K residues in diverse CsgA homologs (Supplementary Figure S1). We hypothesized that the positively charged R62 and K107 residues play an important role in gatekeeping CY GK CsgA polymerization in addition to the earlier discovered gatekeeper D78, D89, and D125 residues. To test this hypothesis, the positively charged R62 and K107 residues in CY GK CsgA (CY CsgA V78D/S89D/N125D) were mutated either to a neutral alanine (CY GK CsgA R62A/K107A) or a negatively charged aspartic acid (CY GK CsgA R62D/K107D) (Figure 1A). We purified these two mutants along with CY GK CsgA and studied their aggregation kinetics in vitro using Thioflavin-T (ThT) fluorescence assays (Naiki et al., 1989;Xue et al., 2017). As previously reported (Bhoite et al., 2022), CY GK CsgA displayed a lag phase of ~6 to 7 h at pH 7.3 (Figure 1B). Interestingly, at pH 7.3, CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D displayed reduced lag phases of ~1 h and ~3 h, respectively, compared to CY GK CsgA (Figure 1B). The aggregation kinetic curves were fitted by the Finke-Watzky two-step model of nucleation and growth (Morris et al., 2008). This model allowed estimation of the nucleation rate constant k1 and the autocatalytic growth constant k2 for CY GK CsgA polymerization (Supplementary Figure S2). The nucleation rate constants k1 of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D showed a four log increase compared to that of CY GK CsgA (Figure 1C). This significant increase in the nucleation rate indicated that the R62 and K107 residues are important for nucleus formation and may function as gatekeeper residues.
The growth rate constant k 2 did not change between CY GK CsgA and CY GK CsgA R62A/K107A , but CY GK CsgA R62D/K107D showed a two times slower growth rate suggesting there may be negative charge repulsion impacting CsgA polymerization at the growth phase ( Figure 1D).
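To make the roles of k1 and k2 concrete, the Finke-Watzky two-step mechanism (A → B with rate constant k1; A + B → 2B with rate constant k2) has a closed-form solution that can be evaluated directly. The sketch below is only an illustration and not the fitting procedure used for Figure 1: the function names, the normalized starting monomer concentration a0 = 1, and the rate constants are assumed values, chosen to show that a four log increase in k1 at fixed k2 shortens the time to half-maximal aggregation.

```python
import math


def fw_fraction_aggregated(t, k1, k2, a0=1.0):
    """Fraction of monomer converted to fiber at time t under the
    Finke-Watzky two-step model (A -> B, k1; A + B -> 2B, k2)."""
    rate = k1 + k2 * a0
    # Remaining monomer concentration [A]_t from the closed-form solution.
    a_t = (k1 / k2 + a0) / (1.0 + (k1 / (k2 * a0)) * math.exp(rate * t))
    return 1.0 - a_t / a0


def fw_half_time(k1, k2, a0=1.0):
    """Time at which half of the starting monomer has aggregated."""
    return math.log(2.0 + k2 * a0 / k1) / (k1 + k2 * a0)


if __name__ == "__main__":
    k2 = 1.0  # hypothetical growth constant (per hour, normalized concentration)
    for k1 in (1e-6, 1e-2):  # hypothetical nucleation constants, four logs apart
        t_half = fw_half_time(k1, k2)
        print(f"k1 = {k1:.0e} per h -> half-time = {t_half:.2f} h")
```

In this model the half-time depends on k1 only logarithmically, which is why changes in nucleation are reported here on a log scale while growth rates are compared as linear fold changes.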
Salt-mediated charge-masking negatively affects the gatekeeping function of R, K, and D residues
Electrostatic interactions between the negatively charged D78, D89, and D125 residues and the positively charged R62 and K107 gatekeeper residues in CsgA might play a role in slowing down the nucleation rate of CY GK CsgA. Aggregation reactions were carried out at varying concentrations of NaCl. Increasing NaCl concentrations were predicted to increase nucleation rates, as the charge screening would disrupt electrostatically mediated gatekeeping. In CY GK CsgA, we observed that with increasing salt concentrations the lag phase decreased with a concomitant increase in the nucleation rates, reaching a four log increase in the nucleation rate at the highest salt concentration of 600 mM (Figures 2A, B). Conversely, the salt-mediated charge screening had less of an effect on the lag phase and nucleation rates of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D, both of which would lack the electrostatic interactions between a positively charged gatekeeper residue and a D gatekeeper residue. The nucleation rates of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D between 0 mM and 600 mM salt increased only by .5 and 1.
Negative charge on D residues is necessary for gatekeeping function
The degree of ionization, and hence the charge state, of an amino acid can change depending on the pH of the buffer solution. The ProtParam tool from Expasy calculated the theoretical pI of CY GK CsgA to be 5.6. We studied the aggregation kinetics of the three proteins at pH 4 and pH 5 to test the effect of the charge states of R, K, and D residues on gatekeeping activity. The theoretical pKa of the ionizable carboxylic acid group in D residues is 3.9. However, in the context of the entire protein, the pKa values can differ significantly from the theoretical values, especially in intrinsically disordered proteins (Thurlkill et al., 2006;Quijada et al., 2007;Grimsley et al., 2009;Pahari et al., 2019). Notably, experimentally determined pKa values of D residues can range from 0.5 to 9.9 (Pahari et al., 2019). We thus reasoned that the pKa of the D78, D89, and D125 residues in CsgA could be different from the theoretical value but that the balance of ionization states of the aspartic acids will trend towards protonation at lower pH conditions and towards deprotonation as the pH is increased. We hypothesized that in CY GK CsgA at pH 4 the D78, D89, and D125 gatekeeper residues would trend towards protonation, leading to a loss of electrostatic interactions with the positively charged R62 and K107 residues. As the pH increases to 5, the equilibrium of the charge state of the D78, D89, and D125 gatekeeper residues would shift more towards deprotonation, leading to increased interaction and, therefore, increased gatekeeping function. CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D at pH 4 displayed similar aggregation kinetics with significantly increased nucleation and growth rates compared to those at pH 7.3 (compare Figures 1B-D, 3A-C). Moreover, there was less than 1 log difference in the nucleation rates and less than 1.4 times difference in the growth rates between CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D at pH 4 (Figures 3B, C). As the pH was increased to pH 5, we observed an increase in the lag phase of CY GK CsgA compared to pH 4 (Figures 3D-F). The nucleation rate of CY GK CsgA at pH 5 was 1.2 log lower than that at pH 4, while the nucleation rates of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D at pH 5 did not significantly change from pH 4 (Figure 3E). As the pH increased from pH 4 to pH 5, it would be predicted that D residues might deprotonate and become negatively charged. Thus, at pH 5 in CY GK CsgA we observed an increased lag phase of aggregation and significantly decreased nucleation rates compared to pH 4, indicating that gatekeeping function at the nucleation level depends in part on the charge state of the D78, D89, and D125 gatekeeper residues.
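The expected trend in aspartate ionization across the tested pH values can be sketched with the Henderson-Hasselbalch relation. This is an illustration under stated assumptions: it uses the theoretical side-chain pKa of 3.9 quoted above, plus a purely hypothetical upshifted pKa of 5.0 to show how strongly the prediction depends on the residue's actual pKa, which was not measured for D78, D89, or D125. The function name is ours.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch fraction of an acid group in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    theoretical_pka = 3.9   # theoretical Asp side-chain value quoted in the text
    hypothetical_pka = 5.0  # assumed upshifted value, for illustration only
    for ph in (4.0, 5.0, 6.0, 7.3, 8.0):
        f_theory = fraction_deprotonated(ph, theoretical_pka)
        f_shifted = fraction_deprotonated(ph, hypothetical_pka)
        print(f"pH {ph}: {f_theory:.2f} (pKa 3.9), {f_shifted:.2f} (pKa 5.0)")
```

With either pKa the deprotonated fraction rises monotonically from pH 4 to pH 6, consistent with the direction of the hypothesis; only the steepness of the change within that window differs.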
We next monitored aggregation kinetics at pH 4 and pH 5 in the presence of varying salt concentrations. At pH 4, with increasing salt concentrations, we observed no significant change in the nucleation and growth rates (Supplementary Figures S3A-I). The absence of negative charge on the D78, D89, and D125 gatekeeper residues at pH 4 effectively abolished electrostatic interaction-based gatekeeping activity, leading to a reduced effect of salt-mediated charge screening on nucleation and growth rates. Salt did not significantly affect the nucleation rates of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D at pH 5; however, the nucleation rate of CY GK CsgA at pH 5 increased by more than 1 log between the lowest and highest salt concentrations, indicating that some degree of electrostatically mediated gatekeeping is occurring at the nucleation level.
We further tested the electrostatic interactions between the R62 and K107 residues and the D78, D89, and D125 gatekeeper residues by studying the aggregation kinetics of CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D at pH 6, which is above the theoretical pI of CY GK CsgA. At pH 6, the equilibrium of the protonation state of D residues is predicted to shift more towards deprotonation compared to that at pH 5. We thus predicted that with increased negative charge on the D78, D89, and D125 residues, the gatekeeping function in CY GK CsgA would be enhanced compared to that at pH 5, resulting in a longer lag phase and a larger effect from salt-mediated charge screening. At pH 6, in CY GK CsgA we observed a three log decrease in the nucleation rates compared to pH 5, while the nucleation rates of CY GK CsgA R62A/K107A decreased by only 0.6 log (Figures 3G, H). Interestingly, despite the disruption of gatekeeping function in CY GK CsgA R62D/K107D, we observed a 1.9 log lower nucleation rate at pH 6 compared to pH 5 (Figure 3H).
The growth rates for the three proteins also displayed significant differences at pH 6 (Figure 3I; Table 1). In the presence of salt, gatekeeping activity in CY GK CsgA at pH 6 was negatively affected, suggesting that disruption of electrostatic interactions led to increased nucleation rates (Figures 4A, B). At the highest salt concentration of 600 mM, CY GK CsgA nucleation rates were comparable to CY GK CsgA nucleation rates at pH 4 and 5. Salt-mediated charge screening had no significant impact on the nucleation rates of CY GK CsgA R62A/K107A (Figures 4D, E), while in the case of CY GK CsgA R62D/K107D nucleation rates increased by two log with increasing salt concentration (Figures 4G, H). The growth rates showed a small increase for all three proteins, with an approximately two times increase in the growth rates at 600 mM NaCl compared to 0 mM NaCl (Figures 4C, F, I). The increasing presence of negative charges on D residues led to stronger gatekeeping function specifically in CY GK CsgA; however, this effect was reversed by increasing salt-mediated charge screening.
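As a rough guide to why added NaCl weakens charge-charge interactions, the Debye screening length of a 1:1 electrolyte can be estimated from its ionic strength. This calculation is not part of the original analysis. It is a sketch that assumes a relative permittivity of about 74 for water at 37°C, counts only the added NaCl (ignoring buffer ions, so the 0 mM condition cannot be represented), and uses intermediate concentrations picked for illustration, since only the 0 and 600 mM endpoints are stated in the text.

```python
import math

EPSILON_0 = 8.8541878128e-12    # vacuum permittivity, F/m
K_BOLTZMANN = 1.380649e-23      # J/K
E_CHARGE = 1.602176634e-19      # C
N_AVOGADRO = 6.02214076e23      # 1/mol


def debye_length_nm(ionic_strength_molar, temp_k=310.15, eps_r=74.0):
    """Debye screening length (nm) for a given ionic strength in mol/L.

    temp_k and eps_r default to assumed values for water at 37 degrees C.
    """
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    ions_per_m3 = ionic_strength_molar * 1000.0 * N_AVOGADRO
    kappa_sq = (2.0 * E_CHARGE ** 2 * ions_per_m3) / (
        EPSILON_0 * eps_r * K_BOLTZMANN * temp_k
    )
    return 1e9 / math.sqrt(kappa_sq)


if __name__ == "__main__":
    for nacl_molar in (0.05, 0.15, 0.3, 0.6):  # illustrative concentrations
        print(f"{nacl_molar * 1000:.0f} mM NaCl -> {debye_length_nm(nacl_molar):.2f} nm")
```

The screening length falls with the square root of ionic strength, so at the highest salt concentration the range over which an R or K side chain could sense a D side chain shrinks to a few ångströms under these assumptions.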
It is important to note that CY GK CsgA contains other charged residues, including residues with pKa values that fall within this pH range. Notably, all three proteins contain a 6-His tag as well as four native histidine residues. These residues would also trend towards deprotonation as the pH reaches and surpasses pH 6. It is possible that changes in the ionization state of other residues within CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D would also impact polymerization rates; however, these changes do not appear to account for the differences in the nucleation rate between CY GK CsgA and the mutants CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D.

Diminished gatekeeping activity at pH 8 suggests deprotonation of positively charged residues
We next explored the aggregation kinetics of CY GK CsgA, CY GK CsgA R62A/K107A , and CY GK CsgA R62D/K107D at pH 8. The experimentally determined pK a of K residues ranges from 6.5 to 12.12 while R residues have the highest pK a among the ionizable groups and are thus rarely deprotonated at pH ≤ 10 (Isom et al., 2008;Harms et al., 2009;Harms et al., 2011;Pahari et al., 2019). At pH 8, buried K residues shift towards deprotonation, while other K and R residues likely remain protonated (Isom et al., 2011).
Interestingly, in CY GK CsgA we observed a 3.6 log increase in the nucleation rates compared to those at pH 7.3, with no significant difference in the nucleation rates of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D between pH 8 and pH 7.3 (Figures 1C, 5B). Overall, CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D had comparable nucleation rates at pH 8, with less than 0.4 log difference between the three proteins (Figures 5A, B). Based on our observations, we suggest that the K residues in CY GK CsgA are likely to be deprotonated at pH 8 and hence gatekeeping function would be negatively impacted. While K residues buried in the interiors of globular proteins have been shown to have significantly altered pKa values (Isom et al., 2011), in the context of CsgA, which is an intrinsically disordered protein, our observations suggest that the impact of pH on the charged state of K residues might extend beyond the conformational state of the protein under study. Our observations were further supported by salt-mediated charge screening, which showed less effect on the nucleation rates of CY GK CsgA and CY GK CsgA R62A/K107A, with less than 0.6 log difference between 0 mM and 600 mM NaCl (Supplementary Figures S5A, D). The nucleation rates of CY GK CsgA R62D/K107D showed a one log increase at 600 mM NaCl compared to 0 mM NaCl (Supplementary Figure S5G). Interestingly, unlike the nucleation rates, the growth rates of CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D at pH 8 increased significantly with increasing salt concentrations (Supplementary Figures S5C, F, I), suggesting the involvement of other amino acid residues in amyloid fiber elongation. Trends in nucleation rate across the tested pH range for the CsgA variants are summarized in Figure 6.
CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D mutants form curli fibers in vivo
During curli biogenesis, E. coli CsgA monomers are secreted to the extracellular space and the in vivo polymerization of CsgA is directed by a membrane-associated CsgB nucleator protein via the Type VIII secretion system (Sujeet et al., 2019). CsgA homologs belonging to diverse species have been shown to be nucleated by E. coli CsgB both in vitro and in vivo. In addition, wild-type CY CsgA can complement an E. coli ΔcsgA strain (Bhoite et al., 2022). To ensure that the addition of gatekeeper residues in CY CsgA did not compromise curli fiber formation, we tested polymerization of (Robinson et al., 2006;Hammer et al., 2012). The assembly of extracellular amyloid fibers was assessed by growing the cells on Congo red indicator plates. Strains that assemble extracellular cell surface-associated fibers stain red on Congo red plates, while strains that cannot make extracellular cell surface-associated fibers appear white or light pink. Wild-type E. coli MC 4100 and the ΔcsgA mutant strains with a plasmid expressing E. coli CsgA or wild-type CY CsgA formed red colonies after 48 h incubation, indicating proper surface-anchored curli amyloid formation (Figure 7A). Light pink colonies were observed for the ΔcsgA mutant strain and the ΔcsgA mutant that contained the empty vector (Figure 7A). Interestingly, the ΔcsgA mutant strains that contained plasmids expressing either CY GK CsgA, CY GK CsgA R62A/K107A, or CY GK CsgA R62D/K107D formed red colonies, indicating that these mutants successfully secrete and assemble curli in vivo (Figure 7A). Whole-cell transmission electron microscopy (TEM) revealed the presence of cell surface-associated curli amyloid fibers in wild-type MC 4100, the ΔcsgA mutant strain harboring an E. coli CsgA plasmid, and the ΔcsgA mutant strain with a CY wild-type CsgA plasmid (Figure 7B). No cell surface-associated curli fibers were seen in the ΔcsgA mutant strain and the ΔcsgA mutant strain harboring the empty vector (Figure 7B).
Cell surface-associated curli fibers were present in cells from the ΔcsgA mutant strains harboring CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D encoding plasmids, and appeared to have similar morphology to fibers produced by ΔcsgA with a CY wild-type CsgA plasmid (Figure 7B). Curli fibers were found on fewer cells in the CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D mutants than on wild-type MC 4100 and the ΔcsgA mutant strain with an E. coli CsgA expressing plasmid. Congo red staining was not apparent in the underlying agar beneath the biofilms grown in Figure 7A, indicating that fibers were not polymerizing without anchoring to the cell surface. However, non-surface-anchored curli fibers were observed when imaging the CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D mutants.

R62 and K107 residues are implicated in gatekeeping function
Sequence alignment of CsgA homologs revealed the presence of conserved R and K residues in addition to the gatekeeper D residues previously found in E. coli CsgA (Supplementary Figure S1) (Wang et al., 2010). Charged residues like R, K, and D have been shown to function as gatekeeper residues in many proteins (Wang and Chapman, 2008;Reumers et al., 2009;Beerten et al., 2012). We thus hypothesized that the R and K residues in CsgA are also important for gatekeeping function. CY GK CsgA is a variant of wild-type C. youngae CsgA and includes added D78, D89, and D125 gatekeeper residues. CY GK CsgA natively contains the conserved R and K residues. These residues were replaced with alanine (CY GK CsgA R62A/K107A) or aspartic acid (CY GK CsgA R62D/K107D) to test whether these residues function as gatekeepers. This substitution resulted in a significant decrease in the lag phase and an increase in the nucleation rates compared to CY GK CsgA at pH 7.3 (Figures 1B, C, 6). This increase in the nucleation rates of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D suggested that, in addition to the D78, D89, and D125 gatekeeper residues, the R62 and K107 residues are implicated in gatekeeping function. The growth rates were not significantly affected in CY GK CsgA or CY GK CsgA R62A/K107A, but CY GK CsgA R62D/K107D showed a 2 times reduction in growth rates (Figure 1D). This decrease in the growth rate could be attributed to the increased negative charge on the protein with the addition of gatekeeper D residues in the CY GK CsgA background as well as the replacement of the native positively charged lysine and arginine with aspartic acid. Charge repulsion has been shown to negatively impact amyloidogenesis (Guo et al., 2005;Sahoo et al., 2009;Shammas et al., 2011). The repulsion between negatively charged aspartic acid (D) residues likely acts as a gatekeeper at the growth phase due to the accumulation of negative charge in the growing CY GK CsgA R62D/K107D fiber.

FIGURE 5
Loss of positive charge on K residues negatively impacts gatekeeping function. Analysis of aggregation kinetics of CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D at 37°C, pH 8. (A) ThT fluorescence assay. (B) Nucleation rates k1 of amyloidosis (log scale) and (C) Growth rates k2 of amyloid fiber propagation (linear scale). (Error bars represent SEM for ThT assay and SD for k1 and k2 of three replicates).

FIGURE 6
Comparison of the nucleation rates at different pH conditions. Nucleation rates k1 (log scale) of CY GK CsgA (blue), CY GK CsgA R62A/K107A (red), and CY GK CsgA R62D/K107D (green) at different pH conditions at 37°C (Error bars represent SD for k1 of three replicates).

Disrupting electrostatic interaction between gatekeeper residues reduces gatekeeping activity
At pH 7.3, R and K residues are predominantly positively charged while D residues are predominantly negatively charged. As increased nucleation rates were observed when R62 and K107 residues were substituted to charge insensitive A or negatively charged D residues (at pH 7.3) ( Figure 1C), we hypothesized that the electrostatic interactions between R62, K107 and D78, D89, and D125 gatekeeper residues were responsible for gatekeeping function. We used salt-mediated charge screening to neutralize the charge on R62, K107, D78, D89, and D125 residues. In CY GK CsgA, the charge screening had the highest impact on aggregation kinetics and nucleation rate compared to CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D (Figures 2A, B, D, E, G, H) which would not have R-D or K-D electrostatic interactions acting as gatekeepers. The significantly higher nucleation rate in CY GK CsgA in the presence of salt suggested that the salt-mediated charge screening of the R, K, and D residues prevented electrostatic interactions between them resulting in disruption of gatekeeping activity that is acting at the nucleation level. In the case of CY GK CsgA R62A/K107A , and CY GK CsgA R62D/K107D , the absence of R62 and K107 residues made their nucleation rates less sensitive to salt-mediated charge screening. To further substantiate our claim that electrostatic interactions are gatekeeping CsgA polymerization, we measured the aggregation kinetics at various pH conditions. At pH 4, the equilibrium of protonation state of D residues shifted more towards protonation. In the absence of negative charge, D78, D89, and D125 residues no longer functioned as gatekeepers which was reflected in a decreased lag phase and increased nucleation rate of CY GK CsgA (Figures 3A,  B). 
In CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D , the absence of positively charged R62 and K107 residues in addition to the lack of negative charge on D78, D89, and D125 gatekeeper residues abolished the gatekeeping function ( Figures 3A, B). At pH 5, CY GK CsgA displayed a longer lag phase of aggregation and 1.2 log lower nucleation rate than at pH 4 while the nucleation rates of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D were not affected ( Figures 3D, E). At pH 5, as the solution pH surpassed the pK a of carboxylic group of D residues, the equilibrium of protonation state of D residues shifted more towards deprotonation and hence the net negative charge on D residues increased (Blaber, 2019). This increase in the negative charge on D residues resulted in increased gatekeeping activity in CY GK CsgA ( Figures 3D, E). Despite the increased negative charge on D78, D89, and D125 residues, due to the lack of positively charged R62 and K107 residues in CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D , the nucleation rates at pH 5 were not affected ( Figure 3E). These observations suggested that a net negative charge on D residues was necessary for gatekeeping activity via electrostatic interactions between R62, K107 and D78, D89, and D125. Comparison of CY GK CsgA and CY GK CsgA R62A/K107A nucleation rate constant k 1 yielded a p-value indicating a non-significant change (Table 1). This value is due to the standard deviation between replicates. As can be seen in Figure 6, nucleation rate k 1 had the most drastic change between pH 5 and 6, especially when comparing CY GK CsgA and CY GK CsgA R62A/K107A .
As the solution pH surpassed the pI of CY GK CsgA, the degree of deprotonation of D residues increased. With increased negative charge on D residues in CY GK CsgA we observed the lowest nucleation rates at pH 6 compared to that at any other pH ( Figure 6). Interestingly, at pH 6 due to the absence of R62 and K107 residues in CY GK CsgA R62A/K107A , the nucleation rates did not show any significant decrease while CY GK CsgA R62D/K107D showed 1.9 log reduction in nucleation rates compared to pH 5 ( Figure 3H). The substitution of R62 and K107 residues with D residues in CY GK CsgA R62D/K107D along with the increased negative charge on D residues at pH 6 resulted in charge repulsion. This could explain the delayed nucleation and slower growth as has been reported in other amyloid proteins (Wang et al., 2010;Beerten et al., 2012). Moreover, increasing salt concentration and charge screening led to increased nucleation and growth rates further supporting the idea of charge repulsion in absence of salt leading to delayed nucleation and growth rates in CY GK CsgA R62D/K107D at pH 6 ( Figures 4G-I). Similarly, increasing salt concentrations at pH 6 led to increased nucleation and growth rates in CY GK CsgA due to screening of the negative charge on D residues (Figures 4A-C). The nucleation and growth rates of CY GK CsgA R62A/K107A were not affected due to the absence of R62 and K107 mediated electrostatic interaction with D residues (Figures 4D-F).
The increased presence of negative charge on D residues led to greater gatekeeping activity. We then explored the effect of charge neutralization of the positively charged R62 and K107 residues on gatekeeping function. We have shown that, at pH 8, K residues are likely partially deprotonated in CY GK CsgA. While the D residues were negatively charged, the partial deprotonation of K residues at pH 8 negatively impacted the gatekeeping activity (Figures 5A, B). The loss of electrostatic interactions between the D and K residues resulted in increased nucleation rates in CY GK CsgA compared to those at pH 7.3 (Figures 5A-C). The lack of electrostatic interaction due to the absence of R62 and K107 residues was further demonstrated by the similar nucleation rates of CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D at pH 7.3 (Figure 6; compare Figures 1C and 5B). As the electrostatic interactions were disrupted by the deprotonation (CY GK CsgA) or absence (CY GK CsgA R62A/K107A) of the K107 residue, salt-mediated charge screening did not produce any significant increase in the nucleation rates (Supplementary Figures S5B, E). For CY GK CsgA R62D/K107D, the 1-log increase in the nucleation rates at the highest salt concentration could be attributed to the screening of negative charges on D residues, as charge repulsion has been shown to slow down amyloid fiber growth (Supplementary Figure S5H) (Abdolvahabi et al., 2015; Vettore and Buell, 2019).

CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D can complement ΔcsgA E. coli
To ensure that mutations to the R and K residues did not significantly affect curli fiber formation, we expressed CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D in E. coli cells lacking CsgA. Our earlier studies have shown that CsgA homologs from diverse species could complement E. coli CsgA deletion in vivo. CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D successfully complemented E. coli CsgA deletion in vivo, as seen by the red-colored colonies on Congo red indicator plates (Figure 7A). Whole-cell analysis confirmed the presence of surface-associated curli fibers in E. coli ΔcsgA strains harboring plasmids expressing CY GK CsgA, CY GK CsgA R62A/K107A, or CY GK CsgA R62D/K107D (Figure 7B). No obvious morphological differences were observed in the assembled curli fibers of cells expressing wild-type CY CsgA, CY GK CsgA, CY GK CsgA R62A/K107A, or CY GK CsgA R62D/K107D, indicating that the addition of gatekeeper residues did not significantly affect fiber assembly. Few CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D cells presented with surface-anchored curli fibers, while most E. coli ΔcsgA cells with plasmids expressing EC wild-type CsgA or CY wild-type CsgA had surface-anchored curli fibers. Congo red staining induced by the formation of non-surface-associated curli fibers was not observed in the agar underlying the CY GK CsgA R62A/K107A and CY GK CsgA R62D/K107D biofilms, indicating that the non-surface-attached fibers observed by TEM likely detached during grid preparation or staining. As these variants carry introduced D residues, as well as mutations to the positively charged lysine and arginine residues, it is possible that fiber stability is compromised by an accumulation of negative charge, which may be exacerbated by the addition of the negatively charged staining agent uranyl acetate.
In this report, in addition to the earlier characterized D78, D89, and D125 residues, we identify R62 and K107 as new gatekeeper residues in the bacterial functional amyloid CY GK CsgA. We provide preliminary evidence for the mechanism by which the R62, K107, D78, D89, and D125 residues function as gatekeepers in CY GK CsgA. The electrostatic interactions between these oppositely charged residues allow for the formation of a stable contact that slows the formation of an amyloid-competent pre-fibrillar structure, thereby modulating amyloidogenesis. As this interaction occurs before CsgA monomers form the characteristic β-sheet structure required for nucleation, we cannot visualize these gatekeeper interactions in predicted structures illustrating fully folded CY CsgA. Our in vitro study would benefit from additional studies using techniques such as HDX-mass spectrometry and NMR to shed light on the role of electrostatic interactions in the formation and structure of bacterial functional amyloids. Our study thus lays the foundation for understanding an electrostatic interaction-based biochemical mechanism that controls CY GK CsgA nucleation.

Materials and methods
Protein purification
CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D were purified with certain modifications. Briefly, cell pellets of CY GK CsgA, CY GK CsgA R62A/K107A, and CY GK CsgA R62D/K107D were first treated with 2 mL of 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP) and incubated at room temperature (RT) for 10 min with intermittent mixing. Following this, 25 mL of 8 M guanidine hydrochloride in 50 mM potassium phosphate buffer (KPi) pH 7.3 was added to the cell lysate and incubated on a rocker for 1 h at RT. The solution was then centrifuged at 10,000 × g for 20 min at 4°C. The supernatant was collected and sonicated three times for 20 s each. Then, 800 µL of Sigma HIS-Select® HF Nickel Affinity Gel beads was added to the sonicated supernatant and incubated on a rocker for 1 h at RT. The protein was eluted with 125 mM imidazole in 50 mM KPi pH 7.3. Following elution, the proteins were buffer exchanged into the buffer pH of choice using Thermo Scientific Zeba™ Spin Desalting Columns (7k MWCO). The protein concentration after buffer exchange was assayed using the Thermo Scientific Pierce™ Rapid Gold BCA Protein Assay Kit. Primers used to make mutant strains are listed in Table 2 and strains are listed in Table 3.

Description: 105_C127G_G128C
Sequence: 5′-TGC TCT GCA AAG CGA TGC GGC TAA ATC AGA TGT CAC TAT C-3′
Purpose: Site-directed mutagenesis primers to mutate the positive K and R in CY GK CsgA to alanine

CsgA was diluted to a final concentration of 20 µM in the presence or absence of varying concentrations of NaCl. The samples were incubated at 37°C under quiescent conditions in the presence of 20 µM ThT. The ThT fluorescence intensity was recorded every 20 min, with orbital shaking for 5 s before each reading (excitation: 438 nm; emission: 495 nm). All experiments were performed in triplicate with at least three biological replicates, and nucleation and growth rates were calculated using the following equation (Morris et al., 2008).
Where a is the final value of Y at the end of the reaction, k1 is the nucleation rate, and k2 is the growth rate. ThT assays comparing different CsgA mutants were conducted separately.
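The rate equation itself is not reproduced in this text. As an illustration, the sketch below assumes the standard two-step (Finke-Watzky) nucleation and autocatalytic growth form commonly associated with Morris et al. (2008), Y(t) = a − (k1/k2 + a) / (1 + (k1/(k2·a))·exp((k1 + k2·a)·t)), which uses the same symbols a, k1, and k2 defined above. Whether this is the exact form used in the study is an assumption, and the function names and example parameter values are hypothetical.

```python
import math


def fw_two_step(t, a, k1, k2):
    """Assumed two-step nucleation/growth curve Y(t).

    a: plateau value of Y, k1: nucleation rate, k2: growth rate.
    """
    exponent = (k1 + k2 * a) * t
    if exponent > 700.0:
        # math.exp would overflow; the curve is at its plateau here.
        return a
    return a - (k1 / k2 + a) / (1.0 + (k1 / (k2 * a)) * math.exp(exponent))


def half_time(a, k1, k2):
    """Time at which Y reaches a/2, obtained by solving fw_two_step(t) = a/2."""
    return math.log(2.0 + k2 * a / k1) / (k1 + k2 * a)


# Hypothetical parameters: a tenfold drop in k1 lengthens the lag before growth.
for k1 in (1e-2, 1e-3):
    t_half = half_time(a=1.0, k1=k1, k2=0.5)
    print(f"k1 = {k1}: half-time = {t_half:.1f} (time units of 1/k1)")
```

In this model, a lower k1 shifts the sigmoidal curve to later times without changing the plateau, which is how a reduced nucleation rate appears as a longer lag phase in ThT traces.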

Complementation assay
Overnight LB broth cultures (37°C) of wild-type E. coli MC4100 or E. coli MC4100 ΔcsgA cells expressing either empty vector (EV), E. coli CsgA, CY CsgA, CY GK CsgA, CY GK CsgA R62A/K107A, or CY GK CsgA R62D/K107D were pelleted and diluted to an OD600 of 1.0 in YESCA (yeast extract, casamino acids). Then, 4 µL was spotted on YESCA agar plates supplemented with 50 µg/mL Congo red and incubated at 26°C for 48 h to induce CsgA expression. Images were recorded using a Canon EOS Rebel XSi camera and the background Congo red color was edited out in Adobe Photoshop.

Transmission electron microscopy (TEM)
For whole-cell imaging, wild-type E. coli MC4100 or E. coli MC4100 ΔcsgA cells expressing either empty vector (EV), E. coli CsgA, CY CsgA, CY GK CsgA, CY GK CsgA R62A/K107A, or CY GK CsgA R62D/K107D were grown on YESCA agar plates supplemented with Congo red for 48 h at 26°C. After incubation, the cells were scraped from the plates and resuspended to an OD600 of 1.0 in 50 mM potassium phosphate buffer pH 7.3; 5 µL of the cell suspension was then applied to formvar-coated grids and stained with 1% uranyl acetate solution. Samples were imaged on a JEOL electron microscope (JEOL 1400plus).

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author contributions
SB and MC designed research. SB, DK, and MG performed research. SB, DK, MG, and MC analyzed data. SB, DK, and MC wrote the manuscript.