Friends in need: How chaperonins recognize and remodel proteins that require folding assistance

Chaperonins are biological nanomachines that help newly translated proteins to fold by rescuing them from kinetically trapped misfolded states. Protein folding assistance by the chaperonin machinery is obligatory in vivo for a subset of proteins in the bacterial proteome. Chaperonins are large oligomeric complexes, with unusual seven fold symmetry (group I) or eight/nine fold symmetry (group II), that form double-ring constructs, enclosing a central cavity that serves as the folding chamber. Dramatic large-scale conformational changes, that take place during ATP-driven cycles, allow chaperonins to bind misfolded proteins, encapsulate them into the expanded cavity and release them back into the cellular environment, regardless of whether they are folded or not. The theory associated with the iterative annealing mechanism, which incorporated the conformational free energy landscape description of protein folding, quantitatively explains most, if not all, the available data. Misfolded conformations are associated with low energy minima in a rugged energy landscape. Random disruptions of these low energy conformations result in higher free energy, less folded, conformations that can stochastically partition into the native state. Two distinct mechanisms of annealing action have been described. Group I chaperonins (GroEL homologues in eubacteria and endosymbiotic organelles), recognize a large number of misfolded proteins non-specifically and operate through highly coordinated cooperative motions. By contrast, the less well understood group II chaperonins (CCT in Eukarya and thermosome/TF55 in Archaea), assist a selected set of substrate proteins. Sequential conformational changes within a CCT ring are observed, perhaps promoting domain-by-domain substrate folding. Chaperonins are implicated in bacterial infection, autoimmune disease, as well as protein aggregation and degradation diseases. 
Understanding the chaperonin mechanism and the specific proteins they rescue during the cell cycle is important not only for the fundamental aspect of protein folding in the cellular environment, but also for effective therapeutic strategies.


Introduction
Protein folding in the cell is not always a spontaneous process due to unproductive pathways of misfolding and aggregation. Chaperonin molecules in bacteria prevent such off-pathway reactions and promote protein folding through spectacular ATP-driven cycles of binding and releasing substrate proteins (SPs). Chaperonins are distinguished among the molecular chaperone family by the presence of a cavity that offers a productive environment for protein folding, thus preventing unwarranted interprotein interactions, which could occur in the crowded cellular environment. Many chaperones are known as heat-shock proteins (Hsp), alluding to their overexpression under stress conditions, although their action is also required under normal cell growth. Availability of chaperone assistance at critical junctures, for example through thermotolerance, ensures cell viability even when cellular functions would otherwise be overwhelmed. More broadly, comprehensive protein quality control relies on a range of chaperone subfamilies, classified according to their molecular weight, to deliver assistance with essential processes in the protein lifecycle: folding/refolding, with an important role for Hsp60/Hsp10 (GroEL/S), Hsp90 and Hsp70/Hsp40 classes (DnaK/DnaJ); protection against oxidative stress, Hsp33; disaggregation, Hsp100 (Hsp104/ClpB), Hsp70/Hsp40, and small Hsps (sHsp); and degradation, Hsp100 (Clp family, p97, the proteasome Rpt1-6 ring) (Parsell and Lindquist, 1993; Wickner et al., 1999; Frydman, 2001; Kim et al., 2013).
Two distinct chaperonin classes have been identified. GroEL and its co-chaperonin GroES in Escherichia coli (Figure 1A) are the prototype for chaperonin systems found in eubacteria and endosymbiotic organelles, or Group I chaperonins. The thermosome (Figure 1B) and TCP-1 ring complex (TRiC, or CCT for chaperonin-containing TCP1) are representative for archaeal and eukaryotic cells, respectively, and are known as Group II chaperonins. Structural characterization (Braig et al., 1994; Xu et al., 1997; Ditzel et al., 1998; Fei et al., 2013) reveals that chaperonins have an oligomeric, double-ring structure, composed of identical subunits (GroEL) or of two (thermosome) or more (CCT) distinct subunits within the same ring. Within each subunit there are three distinct domains: the ATP-binding equatorial domain, the flexible apical domain and the intermediate hinge region. The co-chaperonin GroES is a single-ring oligomer with identical subunits, capping one of the GroEL rings (Figure 1A). This elaborate annealing machinery is present in nearly all organisms and it is essential for cell survival (Fayet et al., 1989).
Why does folding of some proteins in the crowded cellular milieu require chaperonin assistance? This requirement does not exist in vitro, as favorable conditions for folding can be identified for known chaperonin substrates. Cellular conditions, however, are unfavorable (non-permissive) for a subset of these proteins, leading to formation of misfolded conformations. To reach the native state from the misfolded conformations, proteins must overcome large free energy barriers, a feat which could prove difficult to accomplish within the biological time scale. Moreover, misfolded proteins expose patches of hydrophobic amino acids, making them potential targets for aggregation or leaving them vulnerable to degradation. Chaperonins rescue proteins trapped in misfolded conformations and allow them to reach the native state within a protected folding chamber.
It should be noted that the chaperone annealing action does not alter the three-dimensional conformation of the native protein, in accord with Anfinsen's (1973) hypothesis that the information needed for the native state is encoded solely in the amino acid sequence. Instead, chaperones induce pathways that ensure the correct folding of newly translated or newly translocated proteins (Naqvi et al., 2022).
Here, we provide our perspective on the substrate recognition mechanisms for the two chaperonin types. A number of reviews describe in detail other fundamental features of the chaperonin machinery, including structure, allosteric motion and ATPase activity (Thirumalai and Lorimer, 2001; Hartl and Hayer-Hartl, 2002; Saibil and Ranson, 2002; Fenton and Horwich, 2003; Spiess et al., 2004; Horovitz and Willison, 2005; Horwich et al., 2006; Gruber and Horovitz, 2016; Thirumalai et al., 2020; Horovitz et al., 2022). We refer the interested reader to these accounts for a broader picture of the chaperonin mechanisms.
We also briefly examine the role of chaperonin in disease and point to extensive research in the area (Ranford and Henderson, 2002). Prevention of aggregation through chaperonin assisted folding of non-native polypeptides naturally suggests that defects in the chaperonin machinery may result in disease. The extreme situation, the absence of chaperonin, is fatal, as a consequence of the essential nature of this machinery for the cell. Besides these immediate implications, chaperonins are also found to be major immunogens that play an important role in infection, autoimmune disease, and idiopathic diseases such as arthritis and atherosclerosis. Considering the potential therapeutic use, the study of chaperonin assisted protein folding is likely to suggest valuable practical approaches.

Chaperonin hemicycle
Chaperonins operate as continuous annealing machines by alternating encapsulation of substrate proteins within the cavity of each ring. These encapsulation events are enabled by large-scale, coordinated, conformational transitions that take place in conjunction with ATP and GroES binding in the active ring of GroEL. In this section, we focus on the series of events that occur during the GroEL hemicycle.
At the initiation of the chaperonin cycle, termed the T state (Figure 2), GroEL presents a nearly continuous hydrophobic ring formed at the mouth of the cavity by the seven apical domain binding sites (Braig et al., 1994). This state has high affinity for non-native polypeptides, which also present exposed hydrophobic surfaces. Binding of misfolded proteins to GroEL prevents the formation of irreversible protein aggregates. Upon ATP and GroES binding to the same ring, large-scale, entirely concerted, domain motions in that ring result in doubling the size of the cavity. During these transformations, GroES, which occupies the same apical binding sites as the SP (Fenton et al., 1994; Buckle et al., 1997; Xu et al., 1997; Chen and Sigler, 1999), displaces the SP into the largely expanded cavity. As a result of these spectacular allosteric transitions, the SP is presented with a completely different, mostly hydrophilic, environment that promotes SP folding. The chaperonin cycle is completed by ATP hydrolysis and the binding of ATP in the opposite ring, which initiates the cycle in that ring. These events trigger the release of GroES, ADP and SP from the folding chamber. Stringent GroEL substrates require several cycles of binding and release in order to reach their native state. In each cycle, productive folding, if it occurs at all, takes place within the cavity (Thirumalai et al., 2020).

FIGURE 2
Reaction hemicycle of GroEL illustrating the substrate protein (SP) folding assistance. EL and ES stand for GroEL and GroES, respectively. The GroEL T state has a high affinity for SP binding. Upon ATP and GroES binding, the SP is displaced into the expanded GroEL cavity, where productive folding can take place. Dissociation of the complex occurs upon the initiation of a folding reaction in the opposite GroEL ring. The structures of the T, R, and R″ states are known. Reproduced from Stan et al. (2007). © 2007 National Academy of Sciences.

Frontiers in Molecular Biosciences
frontiersin.org

Iterative annealing mechanism
The function of the GroEL machinery can be quantitatively understood within the Iterative Annealing Mechanism (IAM) framework (Todd et al., 1996). This mechanism is described in the framework of the energy landscape, which associates a free energy to each conformational state of the protein (Figure 3). During each cycle, the SP is rescued from one of the low energy minima, each of which corresponds to a misfolded state. From the ensuing higher free energy state, the protein chain undergoes kinetic partitioning (Guo and Thirumalai, 1995) to either the native state or to the same or a different low energy minimum. Protein folding in a model cavity has been investigated using implicit solvent and coarse-grained models for the SP (Betancourt and Thirumalai, 1999; Klimov et al., 2002; Baumketner et al., 2003; Takagi et al., 2003; Jewett et al., 2004; van der Vaart et al., 2004; Stan et al., 2007). These studies have provided several important clues about how protein folding occurs in confinement. In particular, an optimum range of interactions between the cavity wall and the SP results in enhanced stability and folding rates.
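The iterative character of the mechanism can be made concrete with a short calculation. The sketch below is a minimal illustration, assuming that every cycle is an independent trial in which a fraction Φ (the partition factor) of the not-yet-native molecules reaches the native state, and that native molecules are not rebound. Under these assumptions the native yield after n cycles is 1 − (1 − Φ)^n. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def native_yield(partition_factor: float, n_cycles: int) -> float:
    """Fraction of SP molecules that are native after n_cycles iterations.

    Assumes each cycle is an independent trial with the same partition factor.
    """
    if not 0.0 <= partition_factor <= 1.0:
        raise ValueError("partition_factor must lie in [0, 1]")
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    # Each cycle leaves a fraction (1 - partition_factor) of molecules misfolded.
    return 1.0 - (1.0 - partition_factor) ** n_cycles


def cycles_needed(partition_factor: float, target_yield: float) -> int:
    """Smallest number of cycles for which the native yield reaches target_yield."""
    if not 0.0 < partition_factor <= 1.0:
        raise ValueError("partition_factor must lie in (0, 1]")
    if not 0.0 <= target_yield < 1.0:
        raise ValueError("target_yield must lie in [0, 1)")
    if target_yield == 0.0:
        return 0
    if partition_factor == 1.0:
        return 1
    # Solve 1 - (1 - phi)^n >= target for the smallest integer n.
    return math.ceil(math.log(1.0 - target_yield) / math.log(1.0 - partition_factor))


if __name__ == "__main__":
    phi = 0.05  # hypothetical partition factor for a stringent substrate
    for n in (1, 10, 50):
        print(n, round(native_yield(phi, n), 3))
    print("cycles for 90% yield:", cycles_needed(phi, 0.90))
```

In this simplified picture, a substrate with a small partition factor needs many rounds of binding and release, which is consistent with the statement that stringent substrates require several cycles to reach the native state.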

GroEL substrate protein binding mechanism
GroEL manifests a promiscuous behavior towards binding non-native polypeptides. Misfolded proteins that expose hydrophobic residues are recognized by GroEL without preference for a specific secondary or tertiary structure (Viitanen et al., 1992; Aoki et al., 2000). Despite the large number of proteins that can form complexes with GroEL (Viitanen et al., 1992), in vivo only about 5-10% of E. coli proteins can afford to use the chaperonin machinery under normal conditions (Lorimer, 1996; Ewalt et al., 1997). Even upon heat stress, only about 30% of E. coli proteins require folding assistance (Horwich et al., 1993). The relatively limited participation of GroEL in protein folding in the cell raises the question of why only a subset of the proteins of an organism uses chaperonin assistance. Given the GroEL promiscuity, how does GroEL discriminate between substrates and non-substrates within a proteome? Addressing these questions is challenging, from an experimental point of view, because of the inherent difficulty in arresting structures of complexes formed between GroEL and non-native polypeptides. Somewhat surprisingly, even after 25 years, the only available crystal structures of GroEL-bound ligands correspond to the GroEL-GroES complex (Xu et al., 1997) and to peptides bound to GroEL (Wang and Chen, 2003) or to the GroEL apical domain fragment (Buckle et al., 1997; Chen and Sigler, 1999), while a number of lower resolution cryo-EM structures (Ranson et al., 2001; Roseman et al., 2001; Falke et al., 2005; Chen et al., 2006) are available. Nevertheless, these structures, as well as a number of biochemical studies, identified the GroEL binding sites and the multivalent binding of stringent substrate proteins. Bioinformatic analyses complementing these studies pinpointed chaperonin signaling pathways and chemical character conservation at functionally relevant sites.

FIGURE 3
Energy landscape perspective of the chaperonin annealing action.
Characterization of the GroEL binding sites using mutational (Fenton et al., 1994) and crystallographic (Xu et al., 1997) studies pointed towards a mostly hydrophobic groove between two amphipathic helices (Figure 4A), as well as a nearby loop, located in the apical domain of each GroEL subunit. Specific residues implicated in GroES and SP binding are Tyr 199, Tyr 203, Phe 204, Leu 234, Leu 237, Leu 259, Val 263, and Val 264 (Fenton et al., 1994). In addition, these studies led to the key observation that the same GroEL region responsible for recognizing misfolded substrates is ultimately destined to form the interface with GroES in the course of the chaperonin cycle. Strikingly, the structures of peptides bound to GroEL overlap significantly (Chen and Sigler, 1999), suggesting that strong restrictions are imposed on the bound conformation.
Bioinformatic analysis of a large number of chaperonin sequences further revealed that the various chaperonin functions (peptide binding, nucleotide binding, GroES and SP release) require that the chemical character, and not the identities of specific amino acids, be preserved (Stan et al., 2003). Moreover, this study lent support to the sequence analysis by Kass and Horovitz (2002), which suggested that correlated mutations couple residue doublets or triplets along signaling pathways within GroEL or between GroEL and GroES.
Multivalent binding of stringent SPs was suggested to be implicated in the GroEL unfoldase action. This action is brought about by the large-scale conformational transitions that take place in coordinated fashion in all GroEL subunits during the chaperonin cycle, resulting in an increased separation of the apical binding sites. At the initiation of the chaperonin cycle, the seven GroEL binding sites form a nearly continuous ring at the cavity opening. Stringent GroEL substrates, such as malate dehydrogenase and Rubisco, appear to interact with at least three consecutive binding sites (Farr et al., 2000; Elad et al., 2007). (These proteins are not natural substrates for GroEL; however, Rubisco is a substrate of the Rubisco binding protein, the GroEL homolog in chloroplasts.) By contrast, rhodanese, which is a less-stringent substrate, requires two non-contiguous binding sites (Farr et al., 2000). The effect of the increased separation of the binding sites, combined with multivalent SP binding, is to impart a stretching force to the SP (Thirumalai and Lorimer, 2001). Taken together, these important results suggest that substrate recognition involves peptides that occupy the GroEL binding site in a similar conformation as the GroES mobile loop. For stringent GroEL SPs, multiple interfaces are formed involving these peptides and several contiguous GroEL subunits. The peptide complementarity to the GroEL binding sites is defined, as in the GroES case, by amino acids whose chemical character is strongly conserved.

FIGURE 4
(Stan et al., 2005).

Identification of GroEL substrates at the proteome level
The promiscuous behavior of GroEL towards binding non-native polypeptides appears to be at variance with the relatively small fraction of protein chains in an organism that actually use the GroEL machinery. However, common features of GroEL substrates and similar conformations of bound peptides, as discussed above, suggest a set of requirements for GroEL recognition. Several computational approaches (Chaudhuri and Gupta, 2005; Stan et al., 2005; Noivirt-Brik et al., 2007; Raineri et al., 2010; Tartaglia et al., 2010) and proteomic studies (Houry et al., 1999; Kerner et al., 2005) were successful in identifying and characterizing GroEL substrates within whole proteomes.
Proteomic and biochemical studies (Houry et al., 1999; Kerner et al., 2005) provided the first experimental identification of GroEL SPs, on a proteome-wide scale, in E. coli. These studies found that 252 of the ~2400 cytosolic proteins in E. coli interact with GroEL. Among this set of proteins, 85 are stringent substrates under normal growth conditions and they occupy 75-80% of the GroEL capacity. Additional in vivo studies (Chapman et al., 2006) involving a temperature-sensitive lethal E. coli mutant suggested a wider set of ~300 GroEL interacting proteins, including some that had not been revealed by previous in vitro experiments. The latter study raises the possibility that even transient GroEL interaction, in the cellular environment, suffices to prevent aggregation of misfolded proteins. The set of obligate in vivo substrates was subsequently narrowed to ≃ 60 proteins identified in experiments using GroE-depleted conditions (Fujiwara et al., 2010), and completed by the addition of 20 novel substrates identified using cell-free proteomics (Niwa et al., 2016). GroEL substrates were also identified in other bacteria, such as Thermus thermophilus (Shimamura et al., 2004) and Bacillus subtilis (Endo and Kurusu, 2007).
One line of computational research focuses on identifying polypeptide regions within proteins that render them natural substrates for GroEL. The underlying hypothesis is that natural SPs have the same sequence complementarity to the GroEL binding site as GroES (Chaudhuri and Gupta, 2005; Stan et al., 2005). Therefore, SPs possess sequence patterns similar to the GroES mobile loop segment 23-31, GGIVLTGAA, which binds to GroEL. In one approach (Chaudhuri and Gupta, 2005), SP binding motifs are defined as strong hydrophobic patches (i.e., containing amino acids L, V, I, F, M) having 40-50% sequence similarity to the GroES segment GGIVLTG. The sequence similarity is evaluated using a pairwise alignment between the protein sequence and the peptide GGIVLTG and allowing amino acid substitutions that preserve the chemical character (hydrophobic-hydrophobic or same charge). In a different approach (Stan et al., 2005), the binding motif is required to match the pattern G_IVL_G_A that includes N_C = 6 GroES amino acids in contact with GroEL (Figure 4) and three arbitrary amino acids ("_"). Pattern matching takes into account possible amino acid substitutions that preserve the chemical character, as well as less strongly bound peptides, having four (G_IVL) and five (G_IVL_G) contacts. Natural SPs must possess multiple copies of the binding motif, N_B, to satisfy the required multivalent binding to GroEL. About a third of the sequences in the E. coli proteome are expected to be natural SPs (Horwich et al., 1993). This method retrieves the expected fraction of natural SPs in E. coli for sequences that satisfy 4 < N_C < 6 and 2 < N_B < 4. No preferred secondary structure emerges in this set of proteins. This method is able to identify 80% of experimentally determined natural substrate proteins for GroEL from E. coli (Houry et al., 1999; Kerner et al., 2005) and predicted SPs in several other proteomes.
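The pattern-based search described above can be sketched in a few lines. This is a minimal illustration and not the implementation of the cited studies: the grouping of amino acids into chemical classes (CHEMICAL_CLASS), the function names, and the decision to count every matching window (including overlapping ones) as a motif copy are all assumptions made here for clarity. Only the patterns G_IVL, G_IVL_G and G_IVL_G_A and the GroES segment GGIVLTGAA are taken from the text.

```python
# Illustrative chemical classes; an assumption, not the grouping used in the cited work.
CHEMICAL_CLASS = {}
for _cls, _residues in {
    "hydrophobic": "AVLIMFW",
    "polar": "GSTCPNQYH",
    "positive": "KR",
    "negative": "DE",
}.items():
    for _aa in _residues:
        CHEMICAL_CLASS[_aa] = _cls

# Patterns from the text, indexed by the number of contact residues (N_C).
PATTERNS = {4: "G_IVL", 5: "G_IVL_G", 6: "G_IVL_G_A"}


def window_matches(window: str, pattern: str) -> bool:
    """True if each non-wildcard position has the same chemical class as the pattern."""
    if len(window) != len(pattern):
        return False
    return all(
        p == "_" or CHEMICAL_CLASS.get(w) == CHEMICAL_CLASS[p]
        for w, p in zip(window, pattern)
    )


def find_motifs(sequence: str, n_contacts: int = 6) -> list:
    """Start positions (0-based) of windows matching the pattern with n_contacts contacts."""
    pattern = PATTERNS[n_contacts]
    k = len(pattern)
    return [
        i for i in range(len(sequence) - k + 1)
        if window_matches(sequence[i:i + k], pattern)
    ]


def count_binding_motifs(sequence: str, n_contacts: int = 6) -> int:
    """Number of motif copies (N_B in the text), counting every matching window."""
    return len(find_motifs(sequence, n_contacts))


if __name__ == "__main__":
    groes_loop = "GGIVLTGAA"  # GroES mobile loop segment 23-31, from the text
    print(find_motifs(groes_loop, n_contacts=6))
```

A candidate sequence would then be screened by comparing count_binding_motifs against whatever thresholds on N_C and N_B are adopted; those thresholds are left as parameters here rather than fixed.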
GroEL must not only recognize proteins that require folding assistance, but also the protein conformations that must be remodeled. How does GroEL discriminate between native conformations, which it should not recruit, and the misfolded conformations of proteins it must selectively assist? A structural and bioinformatic analysis (Stan et al., 2006) found that GroES-type binding motifs are not significantly exposed to solvent in the native conformation of GroEL SPs. This result suggests that GroEL recognition of misfolded conformations of SPs requires that multiple GroES-type binding motifs be solvent-exposed. In accord with this hypothesis, molecular dynamics simulations that probe extensively the conformational space of an obligate GroEL substrate, DapA (Nagpal et al., 2015), reveal that its GroES-type binding motifs are solvent-exposed in unfolding intermediates, but are inaccessible in the native conformation. These studies find that, for the seven motifs identified within the DapA sequence, the average solvent-exposed area per residue increases from ≃ 74 Å² in the native conformation to ≃ 182 Å² in the intermediate structures. Experimental studies using hydrogen-exchange coupled with mass spectrometry (Georgescauld et al., 2014) support the increased exposure of the hydrophobic segments and loss of hydrogen bonds that accompany the destabilization of the TIM-barrel core of this substrate.
A different line of computational research (Noivirt-Brik et al., 2007; Raineri et al., 2010; Tartaglia et al., 2010; Azia et al., 2012) uses machine learning approaches to examine physicochemical characteristics of E. coli proteins that indicate a requirement for GroE-dependent folding. Among two sets of in vivo substrates (Kerner et al., 2005; Chapman et al., 2006), stringent dependence on GroEL correlates with low folding propensity and high translation efficiency (Noivirt-Brik et al., 2007). Secondary structure content, as well as contact order, which quantifies the average distance along the polypeptide chain between amino acids that form native contacts, were not found to distinguish GroEL SPs from other proteins. Consistently, this study found that homologues of these SPs in Ureaplasma urealyticum, an organism that lacks a chaperonin system, do not possess sequence characteristics that would require them to recruit the GroE system. Additional features were found by two other studies to separate GroEL SPs from GroE-independent proteins. One found that lower rate of evolution, hydrophobicity, and aggregation propensity are characteristics of GroEL SPs (Raineri et al., 2010); however, it was later argued that the estimation of aggregation propensity may reflect the algorithm bias towards amyloid structure (Azia et al., 2012). Solubilities of E. coli proteins are found to display a bimodal distribution within a cell-free system in the absence of chaperones, with stringent GroEL SPs belonging to the more aggregation-prone set (Niwa et al., 2009). In agreement with these results, the second computational approach was successful in distinguishing the GroEL requirement for the previously identified substrate classes (Kerner et al., 2005) on the basis of decreasing folding propensity and increasing likelihood of aggregation (Tartaglia et al., 2010).
To probe the substrate requirements in controlled fashion, recent experiments used computationally designed substrates based on the enhanced green fluorescent protein (eGFP) (Bandyopadhyay et al., 2017; Bandyopadhyay et al., 2019). These in vitro and in vivo studies showed that GroEL dependence of eGFP variants increases with increasing frustration (Ferreiro et al., 2018), effected through point mutations (Bandyopadhyay et al., 2017), or contact order (Plaxco et al., 1998), engineered through circular permutations (Bandyopadhyay et al., 2019). Intriguingly, as noted above, in vivo GroEL SPs are not distinguishable from non-substrates through the contact order parameter. This suggests that other features play a larger role in determining GroE-dependence.
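Contact order, as used in the comparisons above, is the average sequence separation between residues that form native contacts; dividing by the chain length gives the relative contact order. The sketch below is a minimal illustration, assuming that the native contacts are already available as a list of residue-index pairs. How contacts are derived from a structure (distance cutoff, atom selection) is outside its scope, and the function name and example values are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def contact_order(
    contacts: Iterable[Tuple[int, int]],
    chain_length: Optional[int] = None,
) -> float:
    """Mean sequence separation |i - j| over native contacts.

    If chain_length is given, the result is divided by it (relative contact order).
    """
    pairs = list(contacts)
    if not pairs:
        raise ValueError("at least one native contact is required")
    mean_separation = sum(abs(i - j) for i, j in pairs) / len(pairs)
    if chain_length is None:
        return mean_separation
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    return mean_separation / chain_length


if __name__ == "__main__":
    # Hypothetical native contacts for a 10-residue chain.
    example = [(1, 5), (2, 10), (3, 4)]
    print(contact_order(example))
    print(contact_order(example, chain_length=10))
```

A circular permutation changes the positions of the chain termini and therefore the sequence separations of the same set of spatial contacts, which is why it can be used to alter contact order without changing the fold.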

Specific recognition of substrate proteins by group II chaperonins
In contrast to the extensive knowledge of the set of proteins that require assistance from the GroEL-GroES system, relatively little is currently known about the substrates of group II chaperonins. The presence of distinct subunit types within group II chaperonins suggests that specialized binding mechanisms were developed to target different substrates. However, the extent of subunit heterogeneity varies among members of this class. In archaeal chaperonins, one (Knapp et al., 1994), two (Waldmann et al., 1995), or three (Archibald and Roger, 2002) distinct subunit types are identified, whereas in eukaryotic chaperonins eight different subunits are described (Liou and Willison, 1997). Correspondingly, it is plausible that different substrate recognition mechanisms are used by archaeal and eukaryotic chaperonins. Group II chaperonins have been suggested to employ a sequential, rather than cooperative, mechanism for conformational transitions, consistent with their suggested domain-by-domain folding of SPs and specific SP interaction.
Archaeal chaperonins are abundant in the cell (approximately 1-2% of cellular proteins) and have low subunit heterogeneity as a result of gene interconversion (Archibald and Roger, 2002). These facts prompted the suggestion that, like GroEL, they assist folding of a large set of proteins, perhaps through a promiscuous mechanism. In support of this hypothesis, it is noted that the thermosome, which has two subunit types, assists folding of the GroEL substrates green fluorescent protein (Yoshida et al., 2002) and citrate synthase (Iizuka et al., 2004). The coexistence of group I and group II chaperonins within the archaebacterium Methanosarcina mazei (Klunker et al., 2003) provides a unique opportunity to compare and contrast the annealing action of the two chaperonin classes. Both chaperonins contribute to the folding of 13% of the proteins in the archaeal cytosol, although the two sets of substrates are non-overlapping (Hirtreiter et al., 2009).
The less abundant eukaryotic chaperonin CCT (0.1% of cellular proteins) uses a significantly different mechanism of substrate recognition than GroEL. CCT was initially suggested to interact only with actins and tubulins (Kubota et al., 1994). Recently, numerous other substrates have been identified, including some that contain tryptophan-aspartic acid repeats (Spiess et al., 2004). Substrates include the myosin heavy chain, the Von Hippel-Lindau (VHL) tumor suppressor, cyclin E, and the cell division control protein. Charged residues on the surface of CCT SPs appear to be required for recognition by the eukaryotic machinery. Intriguingly, CCT substrates cannot be folded by other prokaryotic or eukaryotic chaperones (Tian et al., 1995).
A challenging aspect of the CCT substrate recognition mechanism is the lack of knowledge of the CCT binding site. Several proposals exist regarding the localization of CCT binding sites. One assumes structural homology to the GroEL binding sites, formed by two apical domain helices (Xu et al., 1997). In contrast to the GroEL binding site, the two CCT helices have a mostly hydrophilic character, which would be consistent with the notion that CCT recognizes surface charged residues (Jayasinghe et al., 2010). A second proposed CCT binding site involves a flexible helical protrusion (Heller et al., 2004) that acts as a built-in lid for the chaperonin cavity. Finally, the inner side of the closed cavity was also suggested as a CCT binding site (Pappenberger et al., 2002). This region has a mostly charged and polar character, a feature similar to the lining of the GroEL cavity wall. At this time, few experimental data are available to unambiguously define the CCT binding site. A study that used photocrosslinking and fluorescence spectroscopy to probe VHL binding (Spiess et al., 2006) provides strong indication that the CCT binding sites are located within helix 11, which is structurally homologous to the GroEL binding site. It is possible that more than one of these proposed locations correspond to in vivo CCT binding sites. This would not be completely surprising given the diversity among CCT substrates and the CCT inhomogeneous oligomeric structure (Spiess et al., 2004). Distinct CCT subunits may serve the purpose of providing the versatility to recognize different substrates.

Implication of chaperonins in disease
An intriguing connection was made between the Hsp60 chaperonin class and prion disease (DebBurman et al., 1997). Prion proteins are suggested to form fibrillar aggregates upon conversion of the normal cellular form PrPC, having primarily α-helical structure, into a β-sheet rich misfolded conformation, PrPSc. Experiments performed in vitro found that GroEL promotes conversion to the disease-related PrPSc (DebBurman et al., 1997). These authors proposed that in vivo validation of the chaperonin-assisted conversion would provide a natural target for clinical approaches.
Mutations in human chaperonins result in diseases such as hereditary spastic paraplegia (Hansen et al., 2002) or the McKusick-Kaufman Syndrome (Stone et al., 2000). Chaperonins have been implicated, through autoimmune response, as putative causes of diseases such as rheumatoid arthritis, atherosclerosis, and inflammation (Ranford et al., 2000). Immunosuppressive action of chaperonins has been described in animal models of juvenile arthritis (van Eden et al., 1989) and diabetes (Elias et al., 1990), as well as in human pregnancy (Cavanagh and Morton, 1994). Immunization with a mycobacterial chaperonin was suggested to protect against arthritis (van Eden, 1991). To date, there is no clear understanding of the subset of chaperonin SPs affected by these mutations and the precise effect of these mutations on chaperonin annealing action (Barral et al., 2004). Mastering the intricacies of the chaperonin action will provide answers to these questions and suggest effective therapies.

Conclusion
Protein folding assistance mediated by chaperonins is a critical quality control mechanism to maintain protein homeostasis. Selective recruitment of substrate proteins by chaperonins represents a fundamental regulatory step in the remodeling action, given the limited availability of chaperonins within the cytosol and the stringent dependence of a subset of proteins on this assistance. Remarkably, GroEL substrate selectivity is achieved even as the chaperonin promiscuously binds misfolded proteins. As highlighted here, research efforts to elucidate the substrate recognition mechanism have primarily focused on two complementary questions. One question is focused on how GroEL binds substrate proteins that require its assistance. As the GroEL binding site is well established and the GroES cochaperone competes with substrates during the chaperonin cycle, this suggests that natural SPs include polypeptide regions similar to the GroES loops that participate in the interface with GroEL. The additional observation that substrates interact with multiple GroEL subunits (2-3) further defines the requirement that several GroES-type motifs be present within the polypeptide chain, at least for stringent substrates. The other question refers to which proteins are likely to require folding assistance in vivo.
Here, low partition factor (fraction of molecules that fold spontaneously) and high aggregation propensity emerge as important factors that underlie the GroEL requirement. In addition, such factors can help to explain the extent of GroEL dependence among known substrates.

Author contributions
All authors conceived and executed the project, and wrote the report.

Funding
DT is grateful to the National Science Foundation (CHE 19-00033) for support. Additional support from the Welch Foundation (F-0019) through the Collie-Welch Regents Chair is gratefully acknowledged. GS would like to acknowledge the National Science Foundation (MCB-2136816) for support.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.