# THE ROLE OF AAA+ PROTEINS IN PROTEIN REPAIR AND DEGRADATION

EDITED BY: James Shorter and Walid A. Houry

PUBLISHED IN: Frontiers in Molecular Biosciences

#### Frontiers Copyright Statement

© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714

ISBN 978-2-88945-656-7

DOI 10.3389/978-2-88945-656-7



Topic Editors:

James Shorter, Perelman School of Medicine at the University of Pennsylvania, United States

Walid A. Houry, University of Toronto, Canada

ATPases Associated with diverse cellular Activities (AAA+) comprise a superfamily of proteins that are defined by the presence of the AAA+ domain containing canonical Walker A and B motifs required for ATP binding and hydrolysis. Members of this superfamily act on other proteins, DNA, RNA, or multicomponent complexes to affect their conformation or their assembly. There have been substantial advances in understanding the structure and mechanism of function of a large number of AAA+ proteins. In this Research Topic, review articles and original research papers discuss new aspects as well as provide a detailed overview of several AAA+ proteins, namely: ClpXP, Lon, ClpB, Hsp104, p97, AAA+ proteins of the proteasome, Rubisco activases, Torsin, Pontin, and Reptin.

Citation: Shorter, J., Houry, W. A., eds (2018). The Role of AAA+ Proteins in Protein Repair and Degradation. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-656-7

# Table of Contents

*Editorial: The Role of AAA+ Proteins in Protein Repair and Degradation* James Shorter and Walid A. Houry

## SECTION I

#### CLPXP, LON AND RELATED ATP-DEPENDENT PROTEASES

*The Essential Role of ClpXP in Caulobacter crescentus Requires Species Constrained Substrate Specificity* Robert H. Vass, Jacob Nascembeni and Peter Chien

*The Copper Efflux Regulator CueR Is Subject to ATP-Dependent Proteolysis in Escherichia coli* Lisa-Marie Bittner, Alexander Kraus, Sina Schäkermann and Franz Narberhaus

## SECTION II

#### CLPB AND HSP104

## SECTION III

#### P97

*A Mighty "Protein Extractor" of the Cell: Structure and Function of the p97/CDC48 ATPase* Yihong Ye, Wai Kwan Tang, Ting Zhang and Di Xia

*The Interplay of Cofactor Interactions and Post-translational Modifications in the Regulation of the AAA+ ATPase p97* Petra Hänzelmann and Hermann Schindelin

## SECTION IV

#### AAA+ PROTEINS OF THE PROTEASOME

*The Proteasomal ATPases Use a Slow but Highly Processive Strategy to Unfold Proteins* Aaron Snoberger, Raymond T. Anderson and David M. Smith

## SECTION V

#### RUBISCO ACTIVASES

## SECTION VI

#### TORSIN

*Torsin ATPases: Harnessing Dynamic Instability for Function* Anna R. Chase, Ethan Laudermilch and Christian Schlieker

## SECTION VII

#### PONTIN AND REPTIN

*The Role of Pontin and Reptin in Cellular Physiology and Cancer Etiology* Yu-Qian Mao and Walid A. Houry

# Editorial: The Role of AAA+ Proteins in Protein Repair and Degradation

James Shorter<sup>1</sup>\* and Walid A. Houry<sup>2,3</sup>\*

<sup>1</sup> Department of Biochemistry and Biophysics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, United States, <sup>2</sup> Department of Biochemistry, University of Toronto, Toronto, ON, Canada, <sup>3</sup> Department of Chemistry, University of Toronto, Toronto, ON, Canada

Keywords: AAA+ ATPase, chaperone, protease, Hsp104, p97

**Editorial on the Research Topic**

#### **The Role of AAA+ Proteins in Protein Repair and Degradation**

ATPases Associated with diverse cellular Activities (AAA+) comprise a superfamily of proteins that perform a large variety of functions essential to cell physiology, including control of protein homeostasis, DNA replication, recombination, chromatin remodeling, ribosomal RNA processing, molecular targeting, organelle biogenesis, and membrane fusion (Hanson and Whiteheart, 2005; Erzberger and Berger, 2006; Snider et al., 2008). Members of this superfamily are defined by the presence of what is termed the AAA+ domain containing the canonical Walker A and B motifs required for ATP binding and hydrolysis (Hanson and Whiteheart, 2005). Typically, genomes encode approximately ten to several hundred AAA+ family members (**Table 1**; Finn et al., 2017), each of which is thought to be adapted to specific functional niches that necessitate precise mechanisms of substrate recognition and processing (Hanson and Whiteheart, 2005). The striking adaptive radiation of AAA+ proteins to operate in diverse settings illustrates the versatile utility of the AAA+ domain (Erzberger and Berger, 2006). AAA+ proteins typically form hexameric complexes and act as motors to remodel other proteins, DNA/RNA, or multicomponent complexes (**Figure 1**). Indeed, many chaperones and ATP-dependent proteases are or have subunits that belong to this superfamily (**Figure 1**; Olivares et al., 2016).

Over recent years, there has been substantial progress in identifying the structure and functional mechanism of a large number of AAA+ proteins (Gates et al., 2017; Puchades et al., 2017; Ripstein et al., 2017; Zehr et al., 2017). In this Research Topic, several elements of this exciting progress are conveyed in 21 articles, which encompass a detailed structural and mechanistic view of several AAA+ chaperones and proteases, including: ClpX (Alhuwaider and Dougan; Bittner et al.; Elsholz et al.; LaBreck et al.; Vass et al.), ClpA (Bittner et al.; Duran et al.), ClpB and Hsp104 (Chang et al.; Duran et al.; Franke et al.; Johnston et al.), Hsp78 (Abrahão et al.), ClpC (Alhuwaider and Dougan; Elsholz et al.), ClpE (Elsholz et al.), Pontin (Mao and Houry), Reptin (Mao and Houry), FtsH (Alhuwaider and Dougan), 19S proteasome (Snoberger et al.; Yedidi et al.), Lon (Alhuwaider and Dougan; Bittner et al.; Fishovitz et al.), p97 (Hänzelmann and Schindelin; Saffert et al.; Ye et al.), Pex1/6 (Saffert et al.), CbbQ (Mueller-Cajar), rubisco activase (Bhat et al.), torsins (Chase et al.), and mitochondrial AAA+ proteases (Glynn). Here, we introduce these fascinating works.

#### Edited and reviewed by:

Pierre Goloubinoff, Université de Lausanne, Switzerland

#### \*Correspondence:

James Shorter jshorter@pennmedicine.upenn.edu

Walid A. Houry walid.houry@utoronto.ca

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

Received: 22 August 2018; Accepted: 07 September 2018; Published: 02 October 2018

#### Citation:

Shorter J and Houry WA (2018) Editorial: The Role of AAA+ Proteins in Protein Repair and Degradation. Front. Mol. Biosci. 5:85. doi: 10.3389/fmolb.2018.00085

TABLE 1 | Number of AAA+ proteins in model organisms<sup>a</sup>.

<sup>a</sup>The table was obtained from the InterPro database (Finn et al., 2017).

<sup>b</sup>Also known as Caulobacter vibrioides.

#### STUDIES ON CLPXP, LON, AND RELATED ATP-DEPENDENT PROTEASES

In their research article, "The Protein Chaperone ClpX Targets Native and Non-native Aggregated Substrates for Remodeling, Disassembly, and Degradation with ClpP," LaBreck et al. perform a series of elegant experiments to establish that ClpX possesses disaggregase activity against polypeptides that contain specific ClpX-recognition signals (LaBreck et al.). In the presence of ClpP, ClpX couples disaggregation of these substrates to their degradation. Importantly, they also establish that ClpXP prevents the accumulation of aggregates formed by proteins bearing ClpX recognition signals in vivo (LaBreck et al.). These studies illuminate ClpX as a protein disaggregase, a role that was previously underappreciated.

In their research article, "The Essential Role of ClpXP in Caulobacter crescentus Requires Species Constrained Substrate Specificity," Vass et al. explore species-specific functions of ClpX (Vass et al.). Curiously, ClpX is essential in some species such as C. crescentus, but not essential in other bacteria such as E. coli (Vass et al.). Importantly, E. coli ClpX was unable to complement C. crescentus ClpX in vivo (Vass et al.). This lack of activity was due to species-specific differences in the N-terminal domain of ClpX, which are critical for processing the replication clamp loader subunit DnaX in C. crescentus. Thus, small differences in ClpX specificity may be particularly critical for specific bacterial species.

In their review on "Functional Diversity of AAA+ Protease Complexes in Bacillus subtilis," Elsholz et al. discuss the functions of several AAA+ proteases in B. subtilis, namely: ClpCP, ClpEP, ClpXP, ClpYQ, LonA/B, and FtsH (Elsholz et al.). They discuss how different stress responses control the expression of these proteases and the phenotypes observed upon their deletion. The ability of some of these proteases to control competence, sporulation, motility, and biofilm formation is described. Finally, the authors discuss targeting these proteases for the development of novel antibiotics.

In their review entitled "AAA+ Machines of Protein Destruction in Mycobacteria," Alhuwaider and Dougan discuss recent advances in determining the structure and function of AAA+ proteases of mycobacteria (Alhuwaider and Dougan). These proteases are: ClpXP1P2, ClpC1P1P2, Lon, FtsH, and Mpa. The authors also discuss the Pup-proteasome system (PPS) present in mycobacteria, which is equivalent to the ubiquitin-proteasome system in eukaryotes. They then conclude with a discussion of novel compounds that dysregulate or inhibit the activity of ClpP1P2 and others that dysregulate ClpC1. These compounds have promising activities against mycobacteria.

In a research article entitled "The Copper Efflux Regulator CueR Is Subject to ATP-Dependent Proteolysis in Escherichia coli," Bittner et al. demonstrate that the AAA+ proteases Lon, ClpXP, and ClpAP are responsible for the degradation of E. coli CueR, which is a transcription factor that controls the induction of the copper efflux Cue system (Bittner et al.). The authors found that the recognition of CueR by the AAA+ proteases requires an accessible C-terminus of CueR. They conclude that ATP-dependent proteases are required for copper homeostasis in E. coli.

Fishovitz et al. carry out a detailed comparison between human and E. coli Lon in their research article entitled "Utilization of Mechanistic Enzymology to Evaluate the Significance of ADP Binding to Human Lon Protease" (Fishovitz et al.). Through a detailed mechanistic study, they found that, unlike E. coli Lon, human Lon has low affinity for ADP despite showing comparable kcat and KM values for ATPase activity. They propose that human Lon is not regulated by a substrate-promoted ADP/ATP exchange mechanism. These differences between human and E. coli Lon might allow the future development of species-specific Lon inhibitors.

In his review on "Multifunctional Mitochondrial AAA Proteases," Dr. Glynn discusses the two mitochondrial AAA proteases, i-AAA and m-AAA (Glynn). Both are mitochondrial inner membrane proteins. However, i-AAA projects the ATPase and protease domains into the mitochondrial intermembrane space, while the m-AAA protease projects the catalytic domains into the matrix. The structures of these proteases are discussed as well as their mechanism of function. The proteases can carry out complete substrate degradation, but they can also merely cleave certain substrates, such as MrpL32 and Atg32, without degrading them completely.

#### ClpB and Hsp104

In their mini-review, "Structural Elements Regulating AAA+ Protein Quality Control Machines," Chang et al. discuss how pore loop-1, the Inter-Subunit Signaling motif, and the Pre-Sensor I insert motif might contribute to the activity of two Hsp100 disaggregases, bacterial ClpB and yeast Hsp104 (Chang et al.). They propose a model for how these structural elements might enable the AAA+ ATPase cycle to be coupled to substrate translocation across the central channel of ClpB and Hsp104. This process of polypeptide translocation is thought to underpin how ClpB and Hsp104 extract polypeptides from aggregated structures (Chang et al.).

Duran et al. provide a "Comparative Analysis of the Structure and Function of AAA+ Motors ClpA, ClpB, and Hsp104: Common Threads and Disparate Functions" in their review (Duran et al.). They discuss the ability of these three AAA+ proteins (ClpA, ClpB, and Hsp104) to translocate polypeptides through their hexameric complexes. All these proteins have two AAA+ domains and are known to unfold proteins. Importantly, ClpB and Hsp104 are also known to function as disaggregases, while ClpA can form a complex with the ClpP protease. The authors highlight the need to use transient state kinetic methods to examine the kinetic mechanisms of these motor proteins. They describe how the use of such methods allowed them to show that, for example, ClpA translocates polypeptides at about 20 aa s<sup>−1</sup>, while in complex with the ClpP protease, the ClpA translocation rate is even higher at about 35 aa s<sup>−1</sup>. The authors also discuss the importance of the Hsp70 chaperone in the function of ClpB/Hsp104, and the observation of species specificity in the interaction between Hsp70 and ClpB/Hsp104.

In their research article entitled "Mutant Analysis Reveals Allosteric Regulation of ClpB Disaggregase," Franke et al. carry out mutational analysis on the E. coli ClpB disaggregase to characterize its allosteric regulation (Franke et al.). ClpB can be divided into an N-terminal domain and two AAA+ domains separated by a helical region termed the M-domain. The authors identify a highly conserved residue in the first AAA+ domain, A328. The ClpB-A328V mutant was found to have very high ATPase activity and exhibited cellular toxicity. Unexpectedly, the high ATPase activity of ClpB-A328V was mainly due to the second AAA+ ring, as assessed by amide hydrogen exchange mass spectrometry. The authors conclude that A328 is a crucial residue in controlling ATP hydrolysis in both AAA+ rings of ClpB.

In their research article entitled "Substrate Discrimination by ClpB and Hsp104," Johnston et al. describe the innate substrate preferences of ClpB and Hsp104 in the absence of the DnaK and Hsp70 chaperone systems (Johnston et al.). They show that substrate specificity is determined by the first AAA+ domain in each protein. They reached this conclusion by testing the two chaperones for their ability to act on several model substrates. They also tested different chimeras of the two chaperones.

In "Hsp78 (78 kDa Heat Shock Protein), a Representative AAA Family Member Found in the Mitochondrial Matrix of Saccharomyces cerevisiae," Abrahão et al. discuss the structure and function of Hsp78 (Abrahão et al.). Hsp78 is the mitochondrial paralogue of Hsp104, which functions in protein disaggregation and reactivation (Abrahão et al.). Curiously, Hsp104 and Hsp78 were lost upon the transition from protozoa to metazoa (Abrahão et al.). However, Abrahão et al. discuss the existence of ANKCLP, which appears alongside Hsp78 and Hsp104 in protozoa and survives the evolutionary transition to metazoa. ANKCLP possesses an AAA+ domain similar to nucleotide-binding domain 2 (NBD2) of Hsp104 and Hsp78, but is otherwise highly divergent. Intriguingly, mutations in ANKCLP cause 3-methylglutaconic aciduria, progressive brain atrophy, intellectual disability, congenital neutropenia, cataracts, and movement disorder in humans (Abrahão et al.).

#### p97

In their review, "Structure and Function of p97 and Pex1/6 Type II AAA+ Complexes," Saffert et al. discuss two different AAA+ complexes that remodel ubiquitinated substrate proteins (Saffert et al.). One function of p97 is to dislocate ubiquitinated substrates from the ER membrane to the proteasome during ER-associated degradation (Saffert et al.). By contrast, Pex1/Pex6 is a heterohexameric motor composed of alternating Pex1 and Pex6 subunits, which is essential for peroxisome biogenesis and function. Recent cryo-electron microscopy (cryo-EM) structures of p97 and Pex1/6 are discussed and key structural differences are highlighted.

In their review entitled "A Mighty 'Protein Extractor' of the Cell: Structure and Function of the p97/CDC48 ATPase," Ye et al. summarize the current knowledge of the structure and function of p97 and its role in several diseases (Ye et al.). p97 has two AAA+ domains connected by a short linker. It also has an N-terminal domain, which mediates its interactions with different adaptor proteins. The authors provide a detailed discussion of the structure of p97 and the effect of nucleotides on its different conformations. These studies are based on techniques such as EM, X-ray crystallography, and high-speed atomic force microscopy. The authors then discuss the multiple cellular functions of this highly conserved protein, including its roles in ER-associated protein degradation (ERAD), mitochondria-associated degradation (MAD), in which it extracts polypeptides from the mitochondrial outer membrane, and ribosome-associated degradation (RAD). Finally, Ye et al. provide a summary of p97 mutations leading to several human diseases such as IBMPFD (Inclusion Body Myopathy associated with Paget's disease of the bone and Frontotemporal Dementia), FALS (familial amyotrophic lateral sclerosis), CMT2Y (Charcot-Marie-Tooth disease, type 2Y), hereditary spastic paraplegias (HSP), Parkinson's disease (PD), and Alzheimer's disease (AD).

In their review on p97 entitled "The Interplay of Cofactor Interactions and Post-translational Modifications in the Regulation of the AAA+ ATPase p97," Hänzelmann and Schindelin discuss how different cofactors modulate the activity of the p97 ATPase (Hänzelmann and Schindelin). They highlight that p97 can participate in a large number of cellular processes because of the large number of cofactors that interact with it. They describe three different classes of p97 cofactors, namely: (i) substrate-recruiting cofactors like UBA-UBX proteins and UFD1-NPL4, (ii) substrate-processing cofactors like ubiquitin (E3) ligases and deubiquitinases (DUBs), and (iii) regulatory cofactors like the UBX proteins, which may sequester or recycle p97 hexamers. The authors also discuss the role of post-translational modifications on p97 activity, and on its interactions with its cofactors and substrates.

#### AAA+ Proteins of the Proteasome

In "AAA-ATPases in Protein Degradation," Yedidi et al. review the activities of Rpt1, Rpt2, Rpt3, Rpt4, Rpt5, and Rpt6, which are the AAA+ ATPases of the eukaryotic proteasome, as well as some of their bacterial relatives such as PAN, Mpa, and VAT (Yedidi et al.). They focus on new technologies to understand how these AAA+ ATPases function by translocating unfolded polypeptides into the proteolytic chamber of the protease (Yedidi et al.). Conformational changes within the AAA+ ring and adjacent chambered protease appear to generate a peristaltic pumping mechanism to deliver substrates for degradation (Yedidi et al.).

In their research article, "The Proteasomal ATPases Use a Slow but Highly Processive Strategy to Unfold Proteins," Snoberger et al. establish that proteasomal AAA+ proteins employ a low velocity but highly processive motor mechanism to deliver substrates to the proteolytic cavity of the proteasome (Snoberger et al.). This mechanism contrasts with that of ClpX, which utilizes a high velocity but less processive motor mechanism to deliver substrates to the ClpP protease for degradation. These differences in motor mechanism may have evolved in response to differing demands of their specific clientele.

#### Rubisco Activases

In their review on "Rubisco Activases: AAA+ Chaperones Adapted to Enzyme Repair," Bhat et al. discuss the unique function of the Rubisco activase (Rca) in remodeling Rubisco (Bhat et al.). Rca is a AAA+ chaperone that is highly conserved in photosynthetic organisms from bacteria to higher plants. Rubisco is the enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase, which is involved in fixing atmospheric CO<sub>2</sub> during photosynthesis. It is the most abundant protein on earth and is the key enzyme in the synthesis of all organic matter on the planet. However, Rubisco is a poor enzyme and is easily inhibited by side products of its catalytic reactions or by compounds synthesized by some plants under low light conditions. Rca functions to alleviate or "cure" Rubisco from such problematic inhibitions. The authors discuss the structure of Rca from different species and the potential mechanisms of its function.

Dr. Mueller-Cajar provides a review on "The Diverse AAA+ Machines that Repair Inhibited Rubisco Active Sites" (Mueller-Cajar). He discusses the presence of three evolutionarily distinct classes of Rubisco activases (Rcas): (1) green-type and (2) red-type Rcas that are mostly found in photosynthetic eukaryotes of the green and red plastid lineage, respectively, and (3) CbbQO present in chemoautotrophic bacteria. He discusses the evolution of these activases and their potential use in synthetic biology to enhance Rubisco activity in plants.

#### Torsin

In their perspective article, "Torsin ATPases: Harnessing Dynamic Instability for Function," Chase et al. discuss the Torsins, which are also phylogenetically related to NBD2 of yeast Hsp104 (Chase et al.). Torsins are the only AAA+ ATPases localized inside the ER and connected nuclear envelope (Chase et al.). Intriguingly, mutations in TorsinA cause DYT1 dystonia, a neurological disorder in humans (Chase et al.). Torsins exhibit weak ATPase activity that is augmented via active-site complementation due to co-assembly with specific accessory cofactors LAP1 and LULL1 (Chase et al.). Chase et al. suggest that dynamic assembly and disassembly of Torsin/cofactor complexes play important roles in their function in nuclear trafficking and nuclear-pore complex assembly (Chase et al.).

#### Pontin and Reptin

In their extensive review on "The Role of Pontin and Reptin in Cellular Physiology and Cancer Etiology," Mao and Houry discuss the multiple functions of the highly conserved Pontin and Reptin AAA+ ATPases (Mao and Houry). These two proteins typically function together as a complex but can also function independently. The authors highlight the roles of Pontin and Reptin in chromatin remodeling. They also discuss how Pontin and Reptin modulate the transcriptional activities of several proto-oncogenes such as MYC and β-catenin. Mao and Houry describe how Pontin and Reptin are required for the assembly of PIKK signaling complexes as well as telomerase, mitotic spindle, RNA polymerase II, and snoRNPs. The authors conclude with an overview of current efforts aimed at identifying inhibitors of Pontin and Reptin to be developed as novel anti-cancer agents.

# CONCLUDING REMARKS

In conclusion, this collection of 21 articles highlights a number of important structural and mechanistic aspects of AAA+ proteins involved in protein repair and degradation. We are excited to see how the field will continue to develop during the ongoing cryo-EM revolution (Egelman, 2016). We anticipate that cryo-EM will enable deeper understanding of how these fascinating molecular machines operate in diverse situations (Gates et al., 2017; Puchades et al., 2017; Ripstein et al., 2017; Zehr et al., 2017).

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

Work in WH's laboratory in this area is funded by a Canadian Institutes of Health Research Project Grant (PJT-148564). JS is supported by NIH grant R01GM099836.

# REFERENCES


protease YME1 gives insight into substrate processing. Science 358: eaao0464. doi: 10.1126/science.aao0464


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Shorter and Houry. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Protein Chaperone ClpX Targets Native and Non-native Aggregated Substrates for Remodeling, Disassembly, and Degradation with ClpP

Christopher J. LaBreck<sup>†</sup>, Shannon May<sup>†</sup>, Marissa G. Viola<sup>†</sup>, Joseph Conti and Jodi L. Camberg\*

*Department of Cell and Molecular Biology, University of Rhode Island, Kingston, RI, USA*

#### *Edited by:*

*James Shorter, University of Pennsylvania, USA*

#### *Reviewed by:*

*Axel Mogk, University of Heidelberg, Germany Peter Chien, University of Massachusetts Amherst, USA Aaron L. Lucius, University of Alabama at Birmingham, USA*

*\*Correspondence:*

*Jodi L. Camberg cambergj@uri.edu*

*† These authors have contributed equally to this work.*

#### *Specialty section:*

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

*Received: 24 December 2016 Accepted: 07 April 2017 Published: 04 May 2017*

#### *Citation:*

*LaBreck CJ, May S, Viola MG, Conti J and Camberg JL (2017) The Protein Chaperone ClpX Targets Native and Non-native Aggregated Substrates for Remodeling, Disassembly, and Degradation with ClpP. Front. Mol. Biosci. 4:26. doi: 10.3389/fmolb.2017.00026*

ClpX is a member of the Clp/Hsp100 family of ATP-dependent chaperones and partners with ClpP, a compartmentalized protease, to degrade protein substrates bearing specific recognition signals. ClpX targets specific proteins for degradation directly or with substrate-specific adaptor proteins. Native substrates of ClpXP include proteins that form large oligomeric assemblies, such as MuA, FtsZ, and Dps in *Escherichia coli*. To remodel large oligomeric substrates, ClpX utilizes multivalent targeting strategies and discriminates between assembled and unassembled substrate conformations. Although ClpX and ClpP are known to associate with protein aggregates in *E. coli*, a potential role for ClpXP in disaggregation remains poorly characterized. Here, we discuss strategies utilized by ClpX to recognize native and non-native protein aggregates and the mechanisms by which ClpX alone, and with ClpP, remodels the conformations of various aggregates. We show that ClpX promotes the disassembly and reactivation of aggregated Gfp-ssrA through specific substrate remodeling. In the presence of ClpP, ClpX promotes disassembly and degradation of aggregated substrates bearing specific ClpX recognition signals, including heat-aggregated Gfp-ssrA, as well as polymeric and heat-aggregated FtsZ, which is a native ClpXP substrate in *E. coli*. Finally, we show that ClpX is present in insoluble aggregates and prevents the accumulation of thermal FtsZ aggregates *in vivo*, suggesting that ClpXP participates in the management of aggregates bearing ClpX recognition signals.

Keywords: disaggregation, proteolysis, unfoldase, ATPase, AAA+

#### INTRODUCTION

Maintaining cellular proteostasis relies on chaperone pathways that promote native protein folding. Typical strategies include targeting misfolded, unfolded, and aggregated polypeptides for reactivation or degradation (Bukau and Horwich, 1998; Wickner et al., 1999; Stoecklin and Bukau, 2013). Misfolded proteins are generated during polypeptide elongation and as a complication of environmental stress (Powers and Balch, 2013). The challenges imposed on chaperone systems by proteotoxic stress are especially relevant in pathogenic organisms like E. coli, which experience extreme fluctuations in environmental conditions leading to accumulation of protein aggregates and subsequent proteotoxicity (Mogk et al., 2011). Protein quality control systems reactivate, degrade and remove damaged and aggregated proteins. Under thermal stress in E. coli, the heat shock response provides a cellular defense mechanism and upregulates heat shock protein and chaperone levels to restore proteostasis (Mogk et al., 2011).

In addition to preventing protein aggregation, chaperone proteins mediate aggregate clearance through proteolysis of non-native proteins and aggregation reversal (Hartl et al., 2011; Mogk et al., 2011). Clearance of misfolded proteins in E. coli is carried out by AAA+ (ATPases Associated with diverse cellular Activities) proteins, which initiate substrate recognition, unfolding, and translocation into a proteolytic chamber (ClpP, HslV; Snider and Houry, 2008; Sauer and Baker, 2011). Several AAA+ proteins, such as Lon and FtsH, contain both AAA+ chaperone and proteolytic domains within a single protomer (Sauer and Baker, 2011). The chaperone-protease Lon recognizes exposed aromatic and hydrophobic residues, which may contribute to less stringent substrate selectivity and favor degradation of unfolded or misfolded proteins (Gur and Sauer, 2008).

The Clp ATPases of the AAA+ superfamily can be separated into two functional categories: degradation machines and disaggregation machines. Degradation machines, including ClpX, ClpA, and HslU, form complexes with the peptidases ClpP or HslV to remove misfolded proteins or specific substrates (Zolkiewski, 2006). Disaggregation machines, including Hsp104 and its bacterial homolog ClpB, disaggregate and reactivate aggregated proteins by an ATP-dependent mechanism and can function in cooperation with the Hsp70/DnaK system independent of protein degradation (Zolkiewski, 1999; Dougan et al., 2002; Doyle et al., 2007; Sweeny and Shorter, 2016). Through a collaborative mechanism, Hsp70, with Hsp40, binds first to a polypeptide segment of an aggregated protein and then the substrate is remodeled by Hsp104/ClpB (Zietkiewicz et al., 2004, 2006; Acebrón et al., 2009).

E. coli substrates that are degraded by ClpXP include a variety of cellular proteins, metabolic enzymes and several proteins capable of forming large conformational assemblies, including FtsZ, Dps, and MinD (Flynn et al., 2003; Stephani et al., 2003; Neher et al., 2006; Camberg et al., 2009, 2014; Conti et al., 2015). ClpXP can associate with cellular aggregates in E. coli and can promote removal of cellular inclusions, but direct protein disaggregation in vitro is not well characterized for ClpX (Vera et al., 2005; Winkler et al., 2010). An early study suggested that ClpX, in the absence of ClpP, could protect the lambda O phage protein from aggregation and resolubilize lambda O aggregates (Wawrzynow et al., 1995). In Bacillus subtilis, ClpX also localizes to protein aggregates, suggesting that it may be involved in protein disaggregation (Kruger et al., 2000; Kain et al., 2008; Kirstein et al., 2008; Simmons et al., 2008). ClpX and ClpX substrates are present in polar protein aggregates in E. coli under stress in vivo, suggesting that ClpX associates with aggregated proteins and participates in their removal (Kain et al., 2008; Maisonneuve et al., 2008; Simmons et al., 2008).

ClpXP comprises an asymmetric, hexameric ring of ClpX docked to two stacked heptameric rings of the ClpP serine protease (Wang et al., 1997; Glynn et al., 2009). Although ClpX has been shown to independently remodel substrates, such as MuA, in the presence of ClpP, hydrophobic "IGF" loops on the bottom surface of the ClpX hexamer contact hydrophobic pockets on the ClpP tetradecamer, allowing unfolded substrates to access the ClpP proteolytic chamber (Kim et al., 2001; Abdelhakim et al., 2010; Baker and Sauer, 2012). Nucleotide binding by ClpX protomers, in the cleft between the large and small AAA+ subdomains, regulates the position of the subdomains relative to each other; these conformational changes enable ClpX to couple substrate translocation to ATP hydrolysis (Glynn et al., 2009; Baker and Sauer, 2012). Substrates are then translocated into the ClpP chamber for degradation (Baker and Sauer, 2012).

Substrates bind to the ClpX N-domain and to residues in the ClpX central channel (pore-loops; Bolon et al., 2004; Park et al., 2007; Martin et al., 2008; Baker and Sauer, 2012). The N-domain of ClpX is separated from the AAA+ domain by a flexible linker and can dimerize independently. The N-domain is important for direct recognition of some substrates, including FtsZ and MuA, as well as adaptor proteins, but is not required for direct recognition of the ssrA-tag (Abdelhakim et al., 2008; Martin et al., 2008; Camberg et al., 2009; Baker and Sauer, 2012). Adaptor proteins, such as RssB or SspB, promote the interaction and engagement of specific substrates, such as RpoS or ssrA-tagged substrates, respectively (Sauer and Baker, 2011). The ssrA tag is an 11-residue degron appended to a nascent polypeptide when the ribosome stalls during protein synthesis, targeting the misfolded protein for subsequent degradation (Gottesman et al., 1998; Levchenko et al., 2000).

ClpXP is implicated in the degradation of diverse cellular substrates and more than 100 substrates have been reported (Flynn et al., 2003; Neher et al., 2006). Native substrates of ClpX contain recognition motifs at the N- or C-termini (Flynn et al., 2003). Notably, the essential cell division protein FtsZ in E. coli has two distinct ClpX motifs: one in the flexible linker region and one near the C-terminus (Camberg et al., 2014). FtsZ is a tubulin homolog that assembles into linear polymers in vitro and forms the septal ring critical for division in vivo, called the Z-ring (Erickson et al., 2010). ClpXP degrades ∼15% of FtsZ proteins during the cell cycle in E. coli and is capable of degrading both monomers and polymers in vitro (Camberg et al., 2009). ClpXP degrades polymers more efficiently, which is consistent with a common strategy of multivalent recognition of substrates by AAA+ ATPases (Davis et al., 2009; Camberg et al., 2014; Ling et al., 2015). In addition to FtsZ, several other ClpXP substrates form large oligomeric structures, including the tetrameric phage protein MuA, the dodecameric bacterial protein Dps, and the bacterial cell division ATPase MinD (Stephani et al., 2003; Neher et al., 2006; Abdelhakim et al., 2010; Conti et al., 2015). Like FtsZ, alternate monomeric and oligomeric conformations of MuA are also differentially recognized by ClpX (Abdelhakim et al., 2008, 2010; Ling et al., 2015).

In this study, we use engineered and native substrates to investigate the role of ClpX and ClpXP in the disassembly and degradation of protein aggregates that bear specific ClpX recognition signals. We observed that ClpX, with and without ClpP, destabilizes Gfp-ssrA aggregates in vitro. The native ClpXP substrate FtsZ forms several discrete conformations, including linear ordered polymers and also heat-induced aggregates. Our results show that ClpXP disassembles both heat-induced and linear polymers containing FtsZ. Finally, we also demonstrate that thermal stress promotes aggregation of FtsZ, which is exacerbated in cells deleted for clpX or clpP. Together, these results show bona fide chaperone activity for ClpX in vitro and suggest that ClpX, with or without ClpP, may play a broader role in rescue and disassembly of protein aggregates.

#### MATERIALS AND METHODS

#### Bacterial Strains and Plasmids

E. coli strains and plasmids used in this study are described in **Table 1**. An expression plasmid encoding FtsZ(ΔC67) was constructed by introducing a TAA stop codon (at residue 317 of FtsZ) into pET-FtsZ by site-directed mutagenesis (Camberg et al., 2009).

#### Expression and Purification of Proteins

Gfp-ssrA was purified as previously described (Yakhnin et al., 1998). ClpX, ClpP, FtsZ, and FtsZ(ΔC67) were each overexpressed in E. coli BL21 (λDE3) and purified as described (Maurizi et al., 1994; Grimaud et al., 1998; Camberg et al., 2009, 2014). ClpX(E185Q) was purified as described for wild type ClpX, except the expression strain, E. coli MG1655 ΔclpX carrying plasmid pClpX(E185Q), was induced with 1% arabinose (**Table 1**; Camberg et al., 2011). Gfp(uv) containing an N-terminal histidine tag was overexpressed in E. coli BL21 (λDE3); cultures were grown to an OD<sub>600</sub> of 1.0 and induced for 3 h at 30°C. Cells were lysed by French press in purification lysis buffer (20 mM HEPES, pH 7.5, 5 mM MgCl<sub>2</sub>, 50 mM KCl, and 10% glycerol). Soluble extracts were bound to TALON metal affinity resin (GE Healthcare), eluted with an imidazole gradient, and imidazole was removed by buffer exchange. Protein concentrations are reported as FtsZ monomers, ClpX hexamers, ClpP tetradecamers, and Gfp or Gfp-tagged monomers. For polymerization assays, FtsZ was labeled with Alexa Fluor 488 and active protein (FL-FtsZ) was collected after cycles of polymerization and depolymerization as described (González et al., 2003; Camberg et al., 2014).

#### Dynamic Light Scattering

Dynamic light scattering (DLS) measurements were made using a Zetasizer Nano ZS (Malvern Instruments). To determine size distribution, FtsZ (5 µM), aggFtsZ (5 µM), Gfp-ssrA (1.5 µM), and aggGfp-ssrA (1.5 µM) in reaction buffer (50 mM HEPES, pH 7.5, 100 mM KCl and 10 mM MgCl<sub>2</sub>) were added to polystyrene cuvettes and scanned at 23°C with a detector angle of 173° and a 4 mW, 633 nm He–Ne laser. The reported intensity-weighted hydrodynamic diameters are based on 15 scans.
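For readers less familiar with DLS, the hydrodynamic diameter is conventionally related to the measured translational diffusion coefficient through the Stokes-Einstein relation, D = k<sub>B</sub>T / (3πηd). The sketch below illustrates only that conversion, together with the scattering vector for the stated laser wavelength and detector angle. It is not the instrument software's algorithm, and the solvent viscosity (about 0.93 mPa·s for water at 23°C), the refractive index (1.33), and all function names are assumptions made for illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)

def scattering_vector(wavelength_nm, angle_deg, refractive_index=1.33):
    """Magnitude of the scattering vector q (1/m) for a given laser and angle.

    refractive_index=1.33 is an assumed value for a dilute aqueous buffer.
    """
    wavelength_m = wavelength_nm * 1e-9
    theta = math.radians(angle_deg)
    return 4.0 * math.pi * refractive_index * math.sin(theta / 2.0) / wavelength_m

def diffusion_coefficient(diameter_nm, temp_c=23.0, viscosity_pa_s=0.93e-3):
    """Stokes-Einstein diffusion coefficient (m^2/s) for a sphere.

    viscosity_pa_s is an assumed value for water near 23 degrees C.
    """
    temp_k = temp_c + 273.15
    diameter_m = diameter_nm * 1e-9
    return BOLTZMANN_J_PER_K * temp_k / (3.0 * math.pi * viscosity_pa_s * diameter_m)

def hydrodynamic_diameter_nm(diff_coeff, temp_c=23.0, viscosity_pa_s=0.93e-3):
    """Invert Stokes-Einstein: diameter (nm) from a diffusion coefficient (m^2/s)."""
    temp_k = temp_c + 273.15
    diameter_m = BOLTZMANN_J_PER_K * temp_k / (3.0 * math.pi * viscosity_pa_s * diff_coeff)
    return diameter_m * 1e9

if __name__ == "__main__":
    q = scattering_vector(633.0, 173.0)
    print(f"q = {q:.3e} 1/m")
    for d_nm in (10.0, 500.0):  # roughly the native and aggregated size ranges
        print(f"d = {d_nm:5.0f} nm -> D = {diffusion_coefficient(d_nm):.3e} m^2/s")
```

Because D scales inversely with diameter, a particle of about 500 nm diffuses roughly 50 times more slowly than one of about 10 nm, which is why the two populations are readily separated by this method.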

#### Heat Denaturation, Aggregation, Disassembly, and Reactivation of Aggregated Substrates

To heat-inactivate Gfp substrates, Gfp-ssrA (1.5 µM) or Gfp(uv) (1.5 µM) was added, where indicated, to buffer containing HEPES (50 mM, pH 7.5), KCl (100 mM), MgCl<sub>2</sub> (10 mM), glycerol (10%) and dithiothreitol (DTT) (2 mM) in a volume of 800 µl and incubated at 85°C for 15 min. Immediately following heat treatment, the denatured substrate was placed on ice for 2 min and added to a reaction (50 µl) containing ClpX (0.3 µM), ClpX(E185Q) (0.3 µM), ClpP (0.3 µM), ATP (4 mM), ATPγS (1 mM), or ADP (2 mM), where indicated. Samples containing ATP were supplemented with an ATP-regenerating system containing phosphocreatine (5 mg ml<sup>−1</sup>) and creatine kinase (CK) (60 µg ml<sup>−1</sup>). Fluorescence recovery was monitored by measuring fluorescence in a Cary Eclipse fluorometer with excitation and emission wavelengths set at 395 nm and 510 nm, respectively. Readings were corrected for background signal by subtracting the fluorescence of buffer. Rates were calculated by fitting to a one-phase association model in GraphPad Prism (version 6.0b). Disaggregation was monitored by 90°-angle light scatter with excitation and emission wavelengths set to 550 nm. Readings were corrected for background signal by subtracting the scatter of the buffer and then plotted as percent of the initial turbidity. Heat-induced aggregation of Gfp-ssrA with time was monitored by 90°-angle light scatter with the temperature of the cuvette holder set to 80°C using a circulating water bath.
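The one-phase association fit described above was performed in GraphPad Prism. Purely as an illustration, the standard form of that model, Y = Y0 + (Plateau − Y0)(1 − e<sup>−Kt</sup>), can also be fit with a few lines of Python. The sketch below is not the authors' analysis: the function names, the search bounds for K, and the synthetic trace are all assumptions. It exploits the fact that, for a fixed K, the model is linear in Y0 and the span (Plateau − Y0), so only K needs to be searched.

```python
import math

def _linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope, sse)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sse

def fit_one_phase_association(times, signal, k_min=1e-3, k_max=10.0, steps=200, rounds=3):
    """Fit Y = Y0 + (Plateau - Y0) * (1 - exp(-K * t)).

    K is found by a log-spaced grid search that is refined `rounds` times;
    Y0 and the span follow from linear least squares at each candidate K.
    The bounds k_min and k_max (per time unit) are illustrative assumptions.
    """
    lo, hi = k_min, k_max
    best = None
    for _ in range(rounds):
        ratio = (hi / lo) ** (1.0 / (steps - 1))
        for i in range(steps):
            k = lo * ratio ** i
            x = [1.0 - math.exp(-k * t) for t in times]
            y0, span, sse = _linear_fit(x, signal)
            if best is None or sse < best[3]:
                best = (y0, span, k, sse)
        # Narrow the search window to the neighbours of the current best K.
        lo, hi = best[2] / ratio, best[2] * ratio
    y0, span, k, _ = best
    return {"y0": y0, "plateau": y0 + span, "k": k}

if __name__ == "__main__":
    # Synthetic, noise-free trace (hypothetical values, not data from this study).
    times = [float(t) for t in range(0, 31)]  # minutes
    signal = [20.0 + 85.0 * (1.0 - math.exp(-0.3 * t)) for t in times]
    print(fit_one_phase_association(times, signal))
```

On real traces, a nonlinear least-squares routine with confidence intervals would normally be preferred; the grid search is used here only to keep the example dependency-free.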

To inactivate native FtsZ substrates, FtsZ and FtsZ(ΔC67) (5 µM) were heated for 15 min in reaction buffer (20 mM HEPES, pH 7.5, 100 mM KCl, 10 mM MgCl<sub>2</sub>) in a volume of 120 µl at 65°C, then cooled on ice for 40 s, and held at 23°C until addition to reactions (60 µl volume) containing ClpX (0.5 µM or 1 µM), ClpX(E185Q) (0.5 µM), ClpP (1 µM), ATP (4 mM) and an ATP-regenerating system (phosphocreatine at 5 mg ml<sup>−1</sup> and creatine kinase at 60 µg ml<sup>−1</sup>), where indicated. Disaggregation was monitored by 90°-angle light scatter with excitation and emission wavelengths set to 450 nm. Readings were corrected for background signal by subtracting the scatter of the buffer and then plotted as percent of the initial turbidity. Heat-induced aggregation of FtsZ with time was monitored by 90°-angle light scatter with the temperature of the cuvette holder set to 65°C using a circulating water bath.
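The turbidity normalization used for both substrates (buffer subtraction followed by expression as percent of the initial value) amounts to the following calculation. This is a minimal sketch; the function name and the example readings are hypothetical and are not taken from the study.

```python
def percent_initial_turbidity(raw_scatter, buffer_scatter):
    """Subtract the buffer-only scatter, then express each reading as a
    percentage of the first (time zero) background-corrected reading."""
    corrected = [reading - buffer_scatter for reading in raw_scatter]
    initial = corrected[0]
    if initial <= 0:
        raise ValueError("initial background-corrected scatter must be positive")
    return [100.0 * value / initial for value in corrected]

if __name__ == "__main__":
    # Hypothetical light-scatter readings (arbitrary units) at 0, 60 and 120 min.
    readings = [110.0, 90.0, 75.0]
    print(percent_initial_turbidity(readings, buffer_scatter=10.0))
```

With these illustrative numbers, the final reading corresponds to 65% of the initial turbidity, that is, a 35% loss over the time course.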

#### Polymerization and GTP Hydrolysis Assays

FL-FtsZ was incubated with the GTP analog GMPCPP (0.5 mM) in the presence of increasing concentrations of ClpX and ClpP (0, 0.25, 0.5, or 1 µM) as indicated and in the presence of phosphocreatine at 5 mg ml<sup>−1</sup> and creatine kinase at 60 µg ml<sup>−1</sup>. Samples were incubated for 3 min in buffer containing MES (50 mM, pH 6.5), KCl (100 mM) and MgCl<sub>2</sub> (10 mM) at 23°C, then centrifuged at 129,000 × g in a Beckman TLA 120.1 rotor for 30 min. Pellets were resuspended in 0.2 M NaCl with 0.01% Triton X-100 (100 µl) and the fluorescence associated with FL-FtsZ for supernatants and pellets was measured using a Cary Eclipse spectrophotometer. GTP hydrolysis rates for FtsZ and FtsZ(ΔC67) were measured before and after aggregation using the Biomol Green (Enzo Life Sciences) detection reagent as described (Camberg et al., 2014).

#### TABLE 1 | *E. coli* strains and plasmids used in this study.

#### Heat Shock of Wild Type and Deletion Strains

E. coli wild type and deletion strains were grown overnight, diluted 1:100 in fresh Lennox broth the next day and grown at 30°C to an OD of 0.4. All strains were incubated in a water bath at 50°C for 1 h, followed by recovery at 30°C for 35 min. Cells were harvested by centrifugation and lysed with Bacterial Protein Extraction Reagent (B-PER) (ThermoFisher Scientific) (2 ml) and lysozyme (25 µg ml<sup>−1</sup>). Insoluble fractions were collected by centrifugation at 15,000 × g for 5 min at 4°C, resuspended in lithium dodecyl sulfate sample buffer and analyzed by reducing SDS-PAGE. Total proteins were transferred to a nitrocellulose membrane and visualized by Ponceau (Fisher Scientific) staining and membranes were immunoblotted using antibodies to ClpX and FtsZ (Camberg et al., 2009, 2011). Band intensities were analyzed by densitometry (NIH ImageJ), normalized to the intensity of the average of the "no heat" sample, and evaluated for significance by the Mann-Whitney test. Where indicated, to test a mild heat shock condition, cells were incubated in a water bath at 42°C for 30 min, followed by recovery at 30°C for 35 min, and analyzed as described.
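The densitometry workflow above (normalization to the mean of the "no heat" control, followed by a Mann-Whitney test) can be sketched as follows. This is an illustrative implementation rather than the authors' script: the function names and band intensities are hypothetical, and the p-value is computed by exact enumeration of all group assignments, which is practical only for small sample sizes.

```python
from itertools import combinations

def normalize_to_control(intensities, control_intensities):
    """Divide each band intensity by the mean intensity of the control bands."""
    mean_control = sum(control_intensities) / len(control_intensities)
    return [value / mean_control for value in intensities]

def rank_with_ties(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney_exact(group_a, group_b):
    """Return (U for group_a, exact two-sided p-value) by full enumeration."""
    pooled = list(group_a) + list(group_b)
    ranks = rank_with_ties(pooled)
    n_a = len(group_a)

    def u_statistic(indices):
        return sum(ranks[i] for i in indices) - n_a * (n_a + 1) / 2.0

    u_observed = u_statistic(range(n_a))
    mean_u = n_a * len(group_b) / 2.0
    deviation = abs(u_observed - mean_u)
    as_extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_a):
        total += 1
        if abs(u_statistic(indices) - mean_u) >= deviation - 1e-12:
            as_extreme += 1
    return u_observed, as_extreme / total

if __name__ == "__main__":
    # Hypothetical densitometry values (arbitrary units), three blots per condition.
    no_heat = [1000.0, 1100.0, 900.0]
    heat_shock = [2200.0, 2500.0, 2400.0]
    control_norm = normalize_to_control(no_heat, no_heat)
    heat_norm = normalize_to_control(heat_shock, no_heat)
    print(mann_whitney_exact(control_norm, heat_norm))
```

For larger data sets, an established statistics package would be the more appropriate choice than exhaustive enumeration.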

#### RESULTS

#### ClpXP Degrades Aggregates *In vitro*

To determine if ClpX can remodel protein substrates from the aggregated state, we used the fusion protein, Gfp-ssrA, which forms aggregates upon heat treatment (Zietkiewicz et al., 2004, 2006). Gfp-ssrA is rapidly degraded by ClpXP and has been extensively studied to understand substrate targeting by ClpXP. The Gfp moiety is widely used in protein disaggregation assays because it forms non-fluorescent aggregates when heated, but is disaggregated and reactivated by several chaperone systems (Zietkiewicz et al., 2004, 2006). Therefore, we heated Gfp-ssrA at 85°C for 15 min to induce aggregation (aggGfp-ssrA), resulting in an 86% loss of emitted fluorescence (**Figure 1A**). Next, to measure the distribution of aggregates by size after heating, we performed dynamic light scattering (DLS) of untreated and heat-denatured Gfp-ssrA. We observed that without heating, the particle sizes are uniform with an average hydrodynamic diameter of 8–10 nm (**Figure 1B**). After heating, aggregates are ∼500–600 nm, and there is a narrow distribution of particle sizes and no small particles (i.e., <100 nm; **Figure 1C**). Upon heat treatment, aggregation of Gfp-ssrA (1.5 µM) occurs rapidly and plateaus by 10 min, as monitored by 90°-angle light scattering (**Figure 1D**). The heat inactivation is irreversible since incubation of aggregated Gfp-ssrA (aggGfp-ssrA) alone does not lead to appreciable fluorescence reactivation, which is consistent with previous reports using Gfp (**Figure S1**; Zietkiewicz et al., 2004). To determine if ClpXP can bind to aggregates and degrade them, we incubated aggGfp-ssrA with ClpXP and monitored turbidity by 90°-angle light scattering. Incubation of aggGfp-ssrA with ClpXP led to a 35% loss of turbidity in 2 h (**Figure 1E**). However, when ClpXP was omitted from the reaction, there was very little change in turbidity over time (5% loss in 2 h; **Figure 1E**). This suggests that ClpXP targets aggregated substrates for degradation.
To determine if degradation is required to reduce turbidity, we omitted ClpP and observed that ClpX is capable of reducing sample turbidity by 15% in 2 h (**Figure 1E**). Finally, when ATP was omitted from the reaction containing ClpXP, we observed a <10% reduction in the turbidity of the reaction (**Figure 1E**). To confirm that ClpXP degrades aggGfp-ssrA, we incubated aggGfp-ssrA with combinations of ClpX, ClpP, and ATP, and sampled degradation reactions after 2 h. We observed that in the presence of ClpXP, aggGfp-ssrA is degraded, but not when ClpP or ATP was omitted (**Figure 1F**). Together, these results demonstrate that ClpXP targets aggregates for ATP-dependent degradation and that ClpX is also capable of promoting disassembly in the absence of ClpP.

FIGURE 1 | Disaggregation and degradation of aggregated Gfp-ssrA by ClpXP. (A) The fluorescence emission spectra (450–600 nm) of Gfp-ssrA (1.0 µM) (green) and heat-treated Gfp-ssrA (1.0 µM) (black) (85°C for 15 min) were measured using an excitation wavelength of 395 nm. Plotted curves are representative of three traces. (B) DLS was performed for Gfp-ssrA (1.0 µM) (green) as described to determine particle size (nm) distribution. (C) DLS was performed for heat-treated Gfp-ssrA (aggGfp-ssrA) (1.0 µM) (black) as described to determine particle size (nm) distribution. (D) Aggregation by 90°-angle light scatter was measured for Gfp-ssrA (1.5 µM) (open circles) in a cuvette attached to a circulating water bath held at 80°C. Light scattering was monitored for 15 min. (E) Disaggregation of aggGfp-ssrA (1 µM) was monitored by 90°-angle light scatter as described in Materials and Methods. Disaggregation reactions contained aggGfp-ssrA (1 µM) (black circles); ClpX (0.5 µM) and ATP (blue circles); ClpX (0.5 µM) and ClpP (0.6 µM) (gold circles); ClpX (0.5 µM), ClpP (0.6 µM), and ATP (4 mM) (red circles); and a regenerating system, where indicated. Light scattering was monitored for 120 min. Curves shown are representative of at least three replicates. (F) Degradation of Gfp-ssrA and aggGfp-ssrA was monitored as described in Materials and Methods in reactions containing Gfp-ssrA (1 µM) or aggGfp-ssrA (1 µM), where indicated, and ClpX (0.5 µM), ClpP (0.6 µM), ATP (4 mM), and a regenerating system, where indicated. Reactions were incubated at 23°C for 120 min and samples were analyzed by SDS-PAGE and Coomassie stain.

FtsZ is a well-characterized ClpXP substrate that is essential for cell division and forms linear polymers in vitro in the presence of GTP (Erickson et al., 2010). We previously showed that ClpXP binds to GTP-stimulated FtsZ polymers and promotes FtsZ degradation (Camberg et al., 2009). ClpXP also recognizes and degrades non-polymerized FtsZ, but less efficiently than polymerized FtsZ (Camberg et al., 2009). In vitro, FtsZ rapidly aggregates when heated at 65°C and this aggregation is associated with an increase in overall light scatter and a 97% loss of GTPase activity (**Figures 2A,B**). FtsZ, which purifies as a mixture of monomers (40.4 kDa) and dimers (80.8 kDa), has an average hydrodynamic diameter of 10–15 nm by DLS (**Figure 2C**). Heat treatment of FtsZ (5 µM) at 65°C produces several particle sizes, including small (30–40 nm) and large aggregates (>300 nm; **Figure 2D**). To determine if ClpXP reduces the turbidity associated with aggregated FtsZ (aggFtsZ), we incubated aggFtsZ with ClpXP and ATP and observed a 40% loss of turbidity after incubation with ClpXP for 2 h (**Figure 2E**). However, in the absence of ClpXP, the light scatter signal remained stable for aggFtsZ (**Figure 2E**). Incubation of ClpX with aggFtsZ also resulted in a 25% loss in light scatter, suggesting that ClpX also promotes disassembly of aggregates similar to what we observed for aggGfp-ssrA (**Figures 1E**, **2E**).

Next, to confirm that aggFtsZ is degraded by ClpXP, we assembled reactions containing combinations of aggFtsZ, ClpX, ClpP, and ATP and sampled these reactions at 0 and 120 min for analysis by SDS-PAGE. We observed that in the presence of ClpXP and ATP, 50% of the total aggFtsZ in the reaction is lost to degradation after 120 min (**Figure 2F**). Omission of either ClpP or ATP from the reaction prevents loss of aggFtsZ (**Figure 2F**). These results indicate that ClpXP degrades aggFtsZ. Furthermore, the amount of aggFtsZ after incubation with ClpX is unchanged despite the decrease in light scatter detected, suggesting that ClpX can disaggregate aggFtsZ (**Figures 2E,F**).

FIGURE 2 | Continued. … aggFtsZ (5 µM) and ClpX (1 µM) (blue circles), or aggFtsZ (5 µM), ClpX (1 µM), and ClpP (1 µM) (red circles), with ATP (4 mM) and a regenerating system. Light scattering was monitored for 120 min. The curves shown are representative of at least three replicates. (F) Degradation was monitored for FtsZ and aggFtsZ as described in Materials and Methods in reactions containing FtsZ (6 µM), aggFtsZ (6 µM), ClpX (0.5 µM), ClpP (0.5 µM), ATP (4 mM) and a regenerating system, where indicated. For degradation of FtsZ, GMPCPP (0.5 mM) was included to promote the assembly of stable polymers. Degradation reactions were incubated at 23°C for 120 min. To detect protein loss due to degradation, samples from 0 and 120 min were analyzed by SDS-PAGE to solubilize any remaining aggregates. (G) Degradation was monitored for FL-FtsZ (125 pmol) incubated in the presence of GMPCPP (0.5 mM) for 3 min, then ATP (4 mM), a regenerating system, and increasing concentrations of ClpXP (0, 0.25, 0.5, and 1 µM as shown) were added and reactions were incubated for an additional 30 min at 23°C. Reactions were centrifuged at 129,000 × g for 30 min at 23°C. Pellet-associated FtsZ was quantified by fluorescence, and each data point is an average of at least three replicates.

In addition to forming aggregates upon heating, FtsZ also assembles into a linear head-to-tail polymer, which is a native, ordered aggregate distinct from the disordered aggregates induced by heating (aggFtsZ). We compared the loss of aggFtsZ by ClpXP to a similar reaction monitoring loss of native polymerized FtsZ, which is a known substrate of ClpXP. As with aggFtsZ, we observed a ∼50% loss of polymeric FtsZ, stabilized by the GTP analog GMPCPP, after 120 min in reactions containing ClpXP and ATP (**Figure 2F**). GMPCPP promotes the assembly of stable polymers that are far less dynamic than polymers assembled with GTP (Lu et al., 2000). To test if ClpXP disassembles GMPCPP-stabilized FtsZ polymers, we incubated pre-assembled polymers with ClpXP and ATP. Then, we collected polymers by high-speed centrifugation. In these assays, we used active fluorescent FtsZ, labeled with Alexa Fluor 488 (FL-FtsZ), to quantify the amount of polymerized FtsZ in the pellet fraction and soluble FtsZ in the supernatant. We observed that after incubation of GMPCPP-stabilized FtsZ polymers with increasing concentrations of ClpXP (0–1 µM), few FtsZ polymers were recovered in the pellet fractions containing ClpXP (26% of the total FtsZ was recovered in the reaction containing 1 µM ClpXP), indicating that ClpXP is highly effective at promoting the disassembly of GMPCPP-stabilized FtsZ polymers (**Figure 2G**).
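The pellet-associated fraction in the sedimentation assay above is simply the pellet fluorescence divided by the total fluorescence recovered in pellet and supernatant. A minimal sketch follows, assuming both fractions are read under comparable conditions; the function name and readings are hypothetical.

```python
def pellet_fraction(pellet_fluorescence, supernatant_fluorescence):
    """Fraction of total FL-FtsZ fluorescence recovered in the pellet."""
    total = pellet_fluorescence + supernatant_fluorescence
    if total <= 0:
        raise ValueError("total fluorescence must be positive")
    return pellet_fluorescence / total

if __name__ == "__main__":
    # Hypothetical readings (arbitrary units) chosen to give a 26% pellet fraction.
    print(f"{100.0 * pellet_fraction(26.0, 74.0):.0f}% of FL-FtsZ in pellet")
```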

#### ClpX Reactivates Heat-Aggregated Gfp-ssrA

Incubation of ClpX with aggGfp-ssrA resulted in loss of turbidity, suggesting that ClpX may function independently of ClpP to reactivate substrates (**Figure 1E**). Reactivation of misfolded proteins may occur through binding and stabilization of intermediates enabling proteins to adopt the native folded conformation, or through ATP-dependent chaperone-assisted unfolding. To determine if ClpX, which recognizes the ssrA amino acid sequence, is able to reactivate aggGfp-ssrA, we monitored fluorescence of aggGfp-ssrA in the presence and absence of ClpX and ATP. AggGfp-ssrA regains very little fluorescence alone, ∼20 units, which is 8% of the initial fluorescence lost upon heating; however, in the presence of ClpX, fluorescence recovers rapidly in the first 10 min of the reaction and then plateaus, regaining ∼85 units, which is 27% of the initial fluorescence lost upon heating (**Figure 3A**).

ClpX catalyzes ATP-dependent unfolding of substrates (Kim et al., 2000; Singh et al., 2000). To determine if ATP is essential for reactivation, we incubated aggGfp-ssrA with ClpX under various nucleotide conditions including with ATP, the ATP analog ATPγS, ADP and omission of nucleotide. We observed an 82% slower rate of fluorescence reactivation when ClpX and aggGfp-ssrA were incubated with ATPγS than with ATP (0.02 and 0.11 AU min<sup>−1</sup>, respectively), and no recovery over background with ADP or without nucleotide (**Figure 3B**). Reactivation by ClpX and ATP is prevented in the presence of ClpP, and the residual fluorescence after heat treatment is lost upon degradation (**Figure S2**). Together, these results indicate that ClpX requires ATP to reactivate Gfp-ssrA and, surprisingly, that ATPγS is also capable of promoting reactivation, although at a much slower rate than ATP (**Figure 3B**).
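The 82% figure follows directly from the two reported rates, as the short calculation below shows. The function name is an illustrative choice; the rates are those quoted in the text.

```python
def percent_slower(reference_rate, test_rate):
    """Percent reduction of test_rate relative to reference_rate."""
    if reference_rate <= 0:
        raise ValueError("reference rate must be positive")
    return 100.0 * (reference_rate - test_rate) / reference_rate

if __name__ == "__main__":
    atp_rate = 0.11      # AU per min with ATP, as reported
    atp_gs_rate = 0.02   # AU per min with ATPγS, as reported
    print(f"{percent_slower(atp_rate, atp_gs_rate):.0f}% slower with ATPγS")
```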

#### Reactivation and Disaggregation by ClpX Requires a Specific Recognition Sequence

Next, we determined if a ClpX recognition motif is important for efficient recognition of aggregated substrates by ClpX. We compared reactivation of aggGfp-ssrA with heat-aggregated Gfp (aggGfp) without an ssrA tag. We observed that after incubation with ClpX and ATP for 60 min, ∼30 units of fluorescence were recovered, which is 8% of the initial pre-heat fluorescence, indicating that aggGfp is a poor substrate for reactivation by ClpX (**Figure 4A**). In contrast, aggGfp-ssrA recovered 33% (>100 units) of the initial pre-heat fluorescence after incubation with ClpX (**Figure 4A**).

Two regions of FtsZ are important for promoting degradation of E. coli FtsZ by ClpXP, one in the unstructured linker region (amino acids 352–358) and one near the C-terminus (residues 379 through 383; Camberg et al., 2014). Using a truncated FtsZ mutant protein, FtsZ(ΔC67), which is deleted for 67 C-terminal amino acid residues, including both regions involved in ClpX recognition, we tested if ClpXP reduces the light scatter in reactions containing heat-aggregated FtsZ(ΔC67) [aggFtsZ(ΔC67)]. We heated FtsZ(ΔC67) at 65°C for 15 min, the condition that promotes aggregation of full length FtsZ, and confirmed that heat treatment resulted in an 84% loss of GTP hydrolysis activity and an increase in light scatter, which is stable over time (**Figures 4B,C**). In the presence of ClpXP, we observed no decrease in light scatter for aggFtsZ(ΔC67) after incubation for 120 min (**Figure 4C**), which is expected since FtsZ(ΔC67) is a poor substrate for ClpXP degradation (**Figure S3**). Together, these results demonstrate that for ClpX to recognize aggregates and promote disaggregation, disassembly and/or reactivation, a ClpX recognition motif is required.

#### Impaired Reactivation by ClpX(E185Q)

ATP is required for reactivation of aggGfp-ssrA; however, it is unknown if this event requires ATP hydrolysis and substrate unfolding. Therefore, we used the ClpX mutant protein ClpX(E185Q), which has a mutation in the Walker B motif and is defective for ATP hydrolysis, but interacts with substrates (Hersch et al., 2005; Camberg et al., 2014). We observed that ClpX(E185Q) is defective for disaggregation of aggGfp-ssrA by monitoring turbidity by 90°-angle light scatter of reactions containing aggGfp-ssrA, ClpX(E185Q) and ATP (**Figure 5A**). We also tested if aggFtsZ is disassembled by ClpX(E185Q), and observed no reduction in light scatter in reactions containing aggFtsZ, ClpX(E185Q) and ATP after 120 min compared to ClpX (**Figure 5B**). Finally, we tested if reactivation of aggGfp-ssrA requires ATP hydrolysis using ClpX(E185Q) instead of ClpX. We observed that ClpX(E185Q) promotes a small amount of reactivation of aggGfp-ssrA and restores fluorescence, but to a much lesser extent than the level observed for wild type ClpX (**Figure 5C**). These results suggest that ATP hydrolysis by ClpX is required to promote efficient reactivation of aggGfp-ssrA and disassembly of large complexes containing aggFtsZ or aggGfp-ssrA (**Figures 5A–C**).

#### ClpXP Prevents Accumulation of FtsZ Aggregates *In vivo* under Extreme Thermal Stress

ClpX and ClpP were previously reported to localize to protein aggregates in E. coli, suggesting that ClpXP may target aggregates in vivo for direct degradation (Winkler et al., 2010). We used the native ClpXP substrate FtsZ, which aggregates upon heat treatment, to determine if ClpX and/or ClpXP modulates FtsZ aggregate accumulation after thermal stress by comparing the levels of FtsZ present in insoluble cell fractions (**Figures 2A**, **6A**). Wild type cells and cells deleted for clpX, clpP, clpB, clpA, dnaK, lon, hslU, and hslV were exposed to heat shock and insoluble protein fractions were collected and analyzed by immunoblot. We observed that FtsZ was present in the insoluble fraction of wild type cells (BW25113), and this amount was 42% higher in cells exposed to heat shock at 50°C (**Figure 6A** and **Figure S4A**). However, FtsZ levels were even higher in the insoluble fractions of ΔclpX and ΔclpP strains compared to the parental strain (2.4-fold and 2.3-fold, respectively), although the amount of total protein was similar to the wild type strain exposed to heat shock (**Figure S4B**). We detected less protein overall in the ΔdnaK strain after recovery, but this strain also had poor viability after heat shock and recovery (**Figure S4C**). We also detected ClpX in the insoluble fraction in all strains except the clpX deletion strain (**Figure S4A**). Next, we conducted a mild heat shock, 42°C for 30 min, followed by recovery, and observed that deletion of clpB had a larger effect on the accumulation of insoluble FtsZ than deletion of clpX (**Figure S4D**). To determine the relative contributions of either clpB or clpX during a 40 min recovery period after incubation at 50°C, we analyzed insoluble FtsZ levels at 20 min time intervals during recovery (**Figure 6B**). Notably, we observed that in cells deleted for clpX, insoluble FtsZ was present immediately after heat treatment and continued to accumulate throughout the recovery period to a greater extent than in wild type or clpB deletion cells. These results suggest that ClpXP prevents accumulation of FtsZ aggregates in cells exposed to extreme thermal stress.
Since we observed that insoluble FtsZ levels were elevated in ΔclpB strains exposed to mild heat shock (**Figure S4D**), we repeated the recovery time course in clpX and clpB deletion strains after mild heat shock, 42°C for 30 min, to monitor insoluble FtsZ levels (**Figure S4E**). We observed that insoluble FtsZ accumulates during the recovery period in clpB deletion strains after mild heat shock (**Figure S4E**).

Finally, if ClpXP is active in cells after severe heat shock, then it should not be a thermolabile protein. To determine if ClpXP remains active after exposure to 50°C in vitro, we incubated ClpXP in buffer at 50°C for 1 h, and then measured activity after addition of Gfp-ssrA by monitoring the loss of Gfp-ssrA fluorescence. We observed that ClpXP remained active for unfolding and degradation of Gfp-ssrA after incubation at 50°C for 1 h (**Figure S4F**). As a control, ClpXP was also incubated in buffer at 30°C for 1 h and then assayed for activity. We observed that ClpXP incubated at 30°C was more active than ClpXP incubated at 50°C, suggesting that a partial loss of activity had occurred at high temperature (**Figure S4F**). However, this assay was performed in the complete absence of other cellular chaperones or substrates and suggests that some ClpXP likely continues to retain activity after exposure to heat stress, while some may become inactivated.

#### DISCUSSION

Here, using both a native and an engineered aggregated substrate, we demonstrate that ClpXP has the operational capacity to disassemble and degrade large aggregates that have ClpX degrons. In this study, FtsZ, a native substrate of ClpXP in E. coli, was aggregated in vitro by thermal stress, and we further show that FtsZ also aggregates in vivo when cells are exposed to high temperature (**Figures 2A**, **6A**). The observation that FtsZ is aggregation prone is in agreement with a prior study reporting the presence of FtsZ in intracellular aggregates of ∆rpoH cells incubated at 42°C by mass spectrometry (Tomoyasu et al., 2001). FtsZ aggregates are cleared in vitro and in vivo by ClpXP, and ClpXP does not require the assistance of additional chaperones (**Figures 2E,F**, **6A**). Moreover, in the absence of ClpP, ClpX also promotes disassembly of FtsZ and Gfp-ssrA aggregates indicating that disassembly can also occur by a proteolysis-independent mechanism, although disaggregation is more efficient in the presence of ClpP. ClpXP-mediated disassembly of Gfp-ssrA aggregates requires ATP in experiments monitoring turbidity (**Figure 1E**). In addition, the Walker B mutation in ClpX, E185Q, which impairs ATP hydrolysis, also impairs disaggregation of aggGfp-ssrA and, to a lesser extent, aggFtsZ. Aggregate disassembly and resolubilization by ClpX was previously described using the substrate lambda O protein, and here we show disassembly of aggregates and kinetic monitoring using two additional substrates, as well as reactivation of Gfp-ssrA fluorescence (Wawrzynow et al., 1995). Reactivation of Gfp-ssrA is largely dependent on ATP hydrolysis (**Figure 3B**), since ClpX(E185Q) only weakly promotes reactivation of aggregated Gfp-ssrA (**Figure 5C**), yet ClpX(E185Q) is capable of stable interactions with substrates in the presence of ATP, although they are not unfolded (Hersch et al., 2005; Camberg et al., 2014). It is unlikely that there are soluble, unfolded Gfp-ssrA monomers in solution after heating, since we did not detect them by DLS and it has been demonstrated that soluble, unfolded Gfp rapidly refolds, in 20–30 s, by a spontaneous reaction that does not require chaperones (**Figure 1C**; Makino et al., 1997; Tsien, 1998; Zietkiewicz et al., 2004). Therefore, it is likely that large aggregates contain loosely associated unfolded proteins, which can be removed and reactivated by ClpX and, in the case of Gfp-ssrA, allowed to spontaneously refold. As expected, recognition by ClpX is highly specific, as Gfp without an ssrA-tag is not reactivated (**Figure 4A**).

We also detected partial disaggregation of aggFtsZ by ClpX, but not by ClpX(E185Q) (**Figure 5B**). Aggregation of FtsZ is induced at 65°C, but the aggregates formed by FtsZ are smaller than those formed by Gfp-ssrA (30 and 600 nm, respectively; **Figures 1C**, **2D**). FtsZ aggregates likely contain 8–10 monomers, based on the average size of a folded FtsZ monomer, which is ∼40 Å in diameter (**Figure 2D**; Oliva et al., 2004). In contrast, Gfp aggregates in this study likely contain more than 120 subunits, based on an average size of a folded Gfp monomer, which is ∼50 Å across the long axis (van Thor et al., 2005). The small size of the FtsZ aggregate may allow it to be more susceptible to disassembly by ClpX than a larger aggregate.
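The subunit estimates in this paragraph are of the same order as a simple ratio of aggregate diameter to monomer dimension. The sketch below assumes that linear ratio; the text does not state the exact calculation used, so this is one plausible reading, and the function name is illustrative.

```python
ANGSTROM_PER_NM = 10.0


def monomers_across(aggregate_diameter_nm, monomer_size_angstrom):
    """Number of monomer widths spanning an aggregate diameter.

    Assumption: a linear diameter ratio, not a volume-based packing estimate.
    """
    monomer_size_nm = monomer_size_angstrom / ANGSTROM_PER_NM
    return aggregate_diameter_nm / monomer_size_nm


# Sizes quoted in the text: 30 nm FtsZ aggregates with ~40 A monomers,
# 600 nm Gfp-ssrA aggregates with ~50 A monomers.
ftsz_ratio = monomers_across(30.0, 40.0)
gfp_ratio = monomers_across(600.0, 50.0)
print(f"FtsZ: about {ftsz_ratio:.1f} monomer widths across")
print(f"Gfp-ssrA: about {gfp_ratio:.0f} monomer widths across")
```

A linear ratio behaves like a lower bound: a densely packed three-dimensional aggregate of the same diameter could hold more subunits, which fits the wording "more than 120 subunits" for Gfp-ssrA.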

In the model for disassembly of aggregates by ClpXP, ClpX binds to exposed recognition tags on the surface of the aggregate and promotes removal, unfolding and degradation of protomers from within the aggregate (**Figure 7A**). Removal of protomers eventually leads to destabilization and fragmentation of the aggregate as well as degradation (**Figures 1F**, **2F**). Although this process does not require ClpP, it occurs more robustly when ClpP is present than when ClpP is omitted (**Figures 1E**, **2E**). For aggregated substrate reactivation, ClpX likely engages unfolded protomers from the aggregate, which may be internal or loosely bound to the exterior of the aggregate, and unfolds and releases them. For small aggregates, this activity may be sufficient to lead to fragmentation and capable of promoting reactivation of substrates such as Gfp-ssrA (**Figure 7B**).

Finally, we observed large increases in insoluble FtsZ when cells were exposed to two different temperatures, 50°C, which represents extreme heat shock, and 42°C, which represents a mild heat shock (**Figures 6A,B**, **Figure S4D**). At 42°C, deletion of clpB was associated with a large accumulation of insoluble FtsZ, suggesting that under mild heat stress, ClpB is the major factor that ensures FtsZ solubility (**Figures S4D,E**). However, we observed a remarkably different result after heat shock at 50°C and throughout the recovery period. Specifically, in a clpX deletion strain, large amounts of insoluble FtsZ accumulate during the recovery period to a greater extent than in a clpB deletion strain (**Figures 6A,B**). It is unknown if ClpXP and ClpB are processing FtsZ aggregates directly in vivo, because we did not observe a reduction of aggregated FtsZ during the recovery period for any strain. FtsZ is typically present at very high levels (5,000–20,000 copies per cell) and is essential for cell division in E. coli (Bramhill, 1997). Interestingly, FtsZ also forms linear polymers as part of its normal biological function to promote cell division, and polymers are efficiently recognized, disassembled, and degraded by ClpXP (**Figures 2F,G**; Camberg et al., 2009, 2014; Viola et al., 2017). Given the diverse conformational plasticity of FtsZ, its use as a model disaggregation and remodeling substrate will be informative for studies of targeting and processing of multisubunit substrates by AAA+ proteins. As with FtsZ, many other ClpXP substrates are detectable in protein aggregates in cells (Flynn et al., 2003; Maisonneuve et al., 2008). Moreover, a previous study showed that ClpXP is important for cell viability under thermal stress conditions in cells depleted of DnaK (Tomoyasu et al., 2001). Given that it is estimated that 2–3% of E. coli proteins are ClpXP substrates, ClpXP likely serves as an additional mechanism to manage accumulation of aggregation-prone proteins in vivo, particularly under extreme stress conditions (Flynn et al., 2003; Maisonneuve et al., 2008).

#### AUTHOR CONTRIBUTIONS

CL, SM, MV, and JLC conceived and designed the experiments and wrote the paper. CL, SM, MV, JC, and JLC performed the experiments and analyzed the data.

#### FUNDING

This work was funded by an Institutional Development Award (IDeA) from the National Institute of General Medical Sciences of the National Institutes of Health (#P20GM103430 to JLC). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

#### ACKNOWLEDGMENTS

We thank Sue Wickner, Joel Hoskins, Shannon Doyle, Eric DiBiasio, David Vierra, and Katherine Kellenberger for helpful discussions, and Paul Johnson and Janet Atoyan for sequencing assistance. Sequencing was performed at the Rhode Island Genomics and Sequencing Center, which is supported in part by the National Science Foundation under EPSCoR Grants Nos. 0554548 & EPS-1004057.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmolb.2017.00026/full#supplementary-material

Figure S1 | Heat-aggregation of Gfp-ssrA. The fluorescence emission of aggGfp-ssrA (1.0 µM) (black circles) was monitored as described in Materials and Methods for 90 min.

Figure S2 | Unfolding and degradation of aggregated Gfp-ssrA by ClpXP. Unfolding and degradation were monitored for aggGfp-ssrA (1.0 µM) alone (black circles) or in the presence of ClpP (0.3 µM) (gold circles), ClpX (0.3 µM) and ClpP (0. µM) (red circles) with ATP (4 mM), where indicated. Fluorescence emission (AU) was monitored as described in Materials and Methods.

Figure S3 | Degradation of FtsZ and FtsZ(∆C67) by ClpXP. Degradation was monitored for FtsZ (6 µM) and FtsZ(∆C67), ClpXP (0.5 µM), ATP (4 mM), GMPCPP (0.5 mM), and a regenerating system where indicated at 23°C for 120 min as described in Materials and Methods, and samples were analyzed by SDS-PAGE and Coomassie stain.

Figure S4 | Insoluble FtsZ in deletion strains after heat-treatment. (A) Single gene deletion strains (Table 1) were incubated at 50°C for 1 h and recovered as described in Materials and Methods. Cells from deletion strains were collected and insoluble protein extracts were collected as described and analyzed by reducing SDS-PAGE. Immunoblots were performed with antibodies to FtsZ or ClpX as described. (B) Total protein present in insoluble cell extracts shown in (A) after heat shock at 50°C and recovery was detected by transferring proteins to a nitrocellulose membrane and staining with Ponceau. (C) Cell viability for all strains in (A) was determined by measuring colony forming units (CFU ml<sup>−1</sup>) of cultures before heating ("pre-HS"), after heat treatment at 50°C for 1 h ("post-HS"), and after 35 min of recovery at 30°C ("post-rec"). (D) FtsZ levels were compared in single gene deletion strains after heat shock at 42°C for 30 min and recovery (30°C) as described in Materials and Methods. Cells were collected and insoluble protein extracts were analyzed by immunoblotting with antibodies to FtsZ as described. (E) Insoluble FtsZ levels were monitored in wild type, ∆*clpX* and ∆*clpB* deletion strains before heat shock (50°C for 1 h or 42°C for 30 min, where indicated) and during the 30°C recovery period (0, 20, and 40 min). At the indicated times, cells were collected from cultures and insoluble protein extracts were analyzed by immunoblotting with antibodies to FtsZ as described. (F) Thermal stability of ClpXP was assayed by incubation of ClpX (0.5 µM) and ClpP (0.7 µM) in phosphate buffered saline supplemented with ATP (4 mM), MgCl2 (10 mM), glycerol (15%), Triton X-100 (0.005%), and TCEP (1 mM). Reactions containing ClpXP were added to a preheated quartz cuvette attached to a circulating water bath set to 50 or 30°C, where indicated, and incubated for 1 h. The circulating water bath was rapidly cooled to 30°C, the reactions were supplemented with ATP and regenerating system, Gfp-ssrA (0.2 µM) was added, and fluorescence was monitored with time in the absence (black) or presence of ClpXP, treated at 50°C (red) or 30°C (aqua).

# REFERENCES

Proc. Natl. Acad. Sci. U.S.A. 106, 10614–10619. doi: 10.1073/pnas.0904886106


reveals five classes of ClpX-recognition signals. Mol. Cell 11, 671–683. doi: 10.1016/S1097-2765(03)00060-1


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 LaBreck, May, Viola, Conti and Camberg. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Essential Role of ClpXP in Caulobacter crescentus Requires Species Constrained Substrate Specificity

#### Robert H. Vass <sup>1</sup>, Jacob Nascembeni <sup>2</sup> and Peter Chien <sup>1, 2</sup>\*

*<sup>1</sup> Molecular and Cellular Biology Graduate Program, University of Massachusetts, Amherst, MA, USA, <sup>2</sup> Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, MA, USA*

The ClpXP protease is a highly conserved AAA+ degradation machine that is present throughout bacteria and in eukaryotic organelles. ClpXP is essential in some bacteria, such as *Caulobacter crescentus*, but dispensable in others, such as *Escherichia coli*. In *Caulobacter*, ClpXP normally degrades the SocB toxin and increased levels of SocB result in cell death. ClpX can be deleted in cells lacking this toxin, but these ∆*clpX* strains are still profoundly deficient in morphology and growth, supporting the existence of additional important functions for ClpXP. In this work, we characterize aspects of ClpX crucial for its cellular function. Specifically, we show that although the *E. coli* ClpX functions with the *Caulobacter* ClpP *in vitro*, this variant cannot complement wildtype activity *in vivo*. Chimeric studies suggest that the N-terminal domain of ClpX plays a crucial, species-specific role in maintaining normal growth. We find that one defect of *Caulobacter* lacking the proper species of ClpX is the failure to properly proteolytically process the replication clamp loader subunit DnaX. Consistent with this, growth of ∆*clpX* cells is improved upon expression of a shortened form of DnaX *in trans*. This work reveals that a broadly conserved protease can acquire highly specific functions in different species and further reinforces the critical nature of the N-domain of ClpX in substrate choice.

#### Edited by:

*James Shorter, University of Pennsylvania, USA*

#### Reviewed by:

*Walid A. Houry, University of Toronto, Canada Jodi L. Camberg, University of Rhode Island, USA Sue Wickner, National Institutes of Health, USA*

> \*Correspondence: *Peter Chien pchien@biochem.umass.edu*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

> Received: *09 January 2017* Accepted: *19 April 2017* Published: *09 May 2017*

#### Citation:

*Vass RH, Nascembeni J and Chien P (2017) The Essential Role of ClpXP in Caulobacter crescentus Requires Species Constrained Substrate Specificity. Front. Mol. Biosci. 4:28. doi: 10.3389/fmolb.2017.00028*

Keywords: CLPX, CLPP, ClpXP, Caulobacter crescentus, ATP-Dependent Proteases

# INTRODUCTION

Energy-dependent proteolysis is a cellular process that maintains protein homeostasis and quality control, and allows for temporal changes in protein concentration required for cell signaling (Sauer and Baker, 2011). ClpXP is a conserved protease complex that performs highly targeted degradation. ClpXP is a two-part protease system consisting of a regulatory element (ClpX) and peptidase (ClpP) and is present throughout biological systems, ranging from bacteria to eukaryotic organelles. ClpX requires the use of ATP to self-oligomerize, recognize, and unfold target proteins. The unfoldase has two main functions: (1) recognize substrates and (2) translocate them into the ClpP pore for degradation. The AAA+ domain of ClpX contains the Walker motifs that bind/hydrolyze ATP and the central pore loops required for substrate engagement (Baker and Sauer, 2012). An additional unique feature of ClpX is its N-domain, which is needed for recognition of some protease substrates. Regardless of how they are recognized, all substrates must be translocated to ClpP. Therefore, ClpX must interact effectively with ClpP to realize the full potential of this protease (Singh et al., 2001; Joshi et al., 2004).

The ClpX unfoldase must regulate which substrates are targeted for destruction by the ClpP chamber (Baker and Sauer, 2012). For example, in the bacterium Caulobacter crescentus, ClpX activity responds to cell cycle cues and stresses to meet the proteolytic demands as needed (Jenal and Fuchs, 1998; Smith et al., 2014; Williams et al., 2014; Joshi et al., 2015; Lau et al., 2015; Vass et al., 2016). To accomplish these different proteolytic tasks, ClpXP recognizes substrates using both simple degradation tags (degrons) and with the assistance of adaptor proteins that promote degradation of new substrate pools in a ClpX N-domain dependent manner (Baker and Sauer, 2012; Vass et al., 2016). One instance of this complex regulation is during trans-translation, where the rescue of stalled ribosomes is accompanied by the appending of the SsrA peptide, which is recognized by the ClpXP protease, to improperly translated polypeptides leading to their destruction (Tu et al., 1995; Keiler et al., 1996; Gottesman et al., 1998). Although this base recognition is independent of the ClpX N-domain, the SspB adaptor can further improve degradation of SsrA-tagged substrates by binding the N-domain of ClpX (Levchenko et al., 2000).

The ClpXP complex is not essential in all organisms. For example, ClpXP is dispensable in Escherichia coli (Gottesman et al., 1993), but is required in C. crescentus (Jenal and Fuchs, 1998; Osteras et al., 1999). Recent work points to a critical role of ClpXP in Caulobacter through the essential processing of the replication clamp loader subunit DnaX, driving cell cycle progression, and destruction of the toxin SocB, processes that are absent in E. coli (Aakre et al., 2013; Vass and Chien, 2013; Joshi and Chien, 2016). Interestingly, despite high homology, the E. coli ClpX cannot complement the essential ClpX function in Caulobacter cells (Jenal and Fuchs, 1998; Osteras et al., 1999). Here, we use chimeric variants of ClpX to determine which features of this protease are important for either species-specific or species-nonspecific activity. We find that the N-domain of ClpX plays an especially important role in regulating essentiality in Caulobacter, but that expression of a non-complementing ClpX provides benefit during cell growth. Together, our work demonstrates how ClpXP specificity regulates species-specific responses in a bacterium where this protease is essential.

# RESULTS

#### Escherichia coli ClpX Forms an Active Protease with Caulobacter ClpP in vitro

Prior work suggests that the E. coli ClpX cannot substitute for ClpX in Caulobacter (Osteras et al., 1999). What are the differences between E. coli ClpX (ECX) and Caulobacter ClpX (CCX) that restrict essentiality in Caulobacter? An alignment of ECX to CCX protein sequences reveals high identity (68%) and a total homology of ∼90% (Supplemental Figure 1). We sought to understand why these enzymes do not substitute for each other despite their high similarity. A simple explanation for the inability of ECX to complement in Caulobacter may be an inability for ECX to bind with the Caulobacter ClpP and form an active protease. We tested this hypothesis by monitoring ClpXP dependent degradation of GFP-ssrA where loss of fluorescence occurs when ClpX successfully delivers substrate to ClpP (**Figure 1**).

Both ECX and CCX are able to deliver substrate to Caulobacter ClpP (CCP), while only ECX can recognize and degrade GFP-ssrA together with E. coli ClpP (ECP, **Figure 1**). By titrating ClpP, we can derive an effective binding of ClpX to ClpP as a measure of protease activation (Kactivation) and find similar strengths of interactions between ClpX and ClpP in those combinations that result in an active protease (**Table 1**). This suggests that both ECX and CCX associate similarly with CCP. Note that the CCX + ECP combination fails to degrade GFP-ssrA (**Figure 1**), but because this combination is not germane to the current work, we did not explore this observation further. Our major conclusion from this characterization is that ECX forms a productive protease with CCP; therefore, the failure of ECX to replace CCX in vivo (Osteras et al., 1999) likely stems from a failure to maintain a particular substrate degradation profile rather than a failure of protease assembly. We decided to capitalize on this difference in activity to explore how species-specific elements of ClpX are required in different bacteria.

### The N-domain of Caulobacter ClpX Harbors an Essential Species-Specific Function

Although the ClpX pore is critical for substrate recognition, the ClpX N-domain provides additional specificity, often driven upon the binding of the N-domain by adaptor proteins that aid in degradation of substrates. We speculated that the ClpX N-domain contains species-specific motifs that provide for the essential activity in Caulobacter. Because ECX could form an active protease with CCP in vitro, we inferred that the AAA+ domain of ECX was sufficient to interact with CCP, as the N-domain is dispensable for the ClpX-ClpP interaction (Singh et al., 2001). Therefore, we used this system to determine how different variants and chimeras of ECX or CCX could support viability in Caulobacter.

We expressed different ClpX variants in a strain background where the endogenous ClpX could be depleted (Osteras et al., 1999). Similar to what had been reported previously (Osteras et al., 1999), expression of ECX from a plasmid failed to complement, while similar expression of CCX restored growth (**Figure 2A**). Expression of a CCX lacking the N-domain (∆N-CCX) was also unable to support viability (**Figure 2A**; Bhat et al., 2013). Interestingly, a chimeric construct consisting of the N-domain of CCX fused to the AAA+ domain of ECX (CC-ECX) was able to restore viability in this background (**Figures 2A,B**). Western analysis confirms the expression of the appropriate constructs and the depletion of the endogenous ClpX (**Figure 2C**). The presence of ECX also affects normal Caulobacter growth even in the presence of CCX (**Figure 2A**; +xyl), which we speculate may be due to ECX binding to CCP and disrupting the formation of productive CCX+CCP complexes. Taken together with our in vitro work (**Figure 1**), our data suggest that the CCX N-domain is required for identification of substrates and proper degradation, which is ultimately needed for Caulobacter survival.

TABLE 1 | Apparent binding constants between ClpX and ClpP using Kactivation as a proxy.

*Values are derived from fitting degradation data for active proteases from* Figure 1 *to a first-order binding equation (degradation rate = maximum rate × [ClpP]/(Kactivation + [ClpP])). Because CCX + ECP does not degrade GFP-ssrA, we did not fit these data (N/D).*
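The Kactivation values come from fitting degradation rates measured across a ClpP titration to a saturating binding curve. The sketch below shows one way such a fit could be done. It assumes the standard hyperbolic form rate = Vmax × [ClpP]/(K + [ClpP]) and uses a Hanes-Woolf linearization with ordinary least squares; the fitting procedure actually used for Table 1 is not described in the text, and the concentrations and rates here are synthetic placeholders.

```python
def fit_hyperbola(clpp_conc, rates):
    """Estimate (vmax, k_activation) for rate = vmax * c / (k_activation + c).

    Uses the Hanes-Woolf form c/rate = c/vmax + k_activation/vmax,
    which is linear in c, and fits it by ordinary least squares.
    """
    xs = list(clpp_conc)
    ys = [c / v for c, v in zip(clpp_conc, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / vmax
    intercept = mean_y - slope * mean_x    # equals k_activation / vmax
    return 1.0 / slope, intercept / slope


# Synthetic, noise-free titration (hypothetical units; not data from Figure 1).
true_vmax, true_k = 2.0, 0.5
conc = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
rates = [true_vmax * c / (true_k + c) for c in conc]

vmax_fit, k_fit = fit_hyperbola(conc, rates)
print(f"Vmax = {vmax_fit:.3f}, Kactivation = {k_fit:.3f}")
```

With noisy experimental data, a direct nonlinear least-squares fit is usually preferred over a linearization, because the transform reweights the measurement error.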

### Bypassing the Essential Requirement for ClpX Reveals Nonessential Proteolysis Important for Growth

Recent work suggests that the regulated destruction of the SocB toxin by the ClpXP protease via the adaptor SocA justifies the essential need for ClpX in Caulobacter (Aakre et al., 2013). In this model, depletion of ClpXP results in accumulation of the SocB toxin and cell death. It is possible that the CCX N-domain contains unique regions needed for interacting with the SocA adaptor to promote SocB degradation. If so, these regions are either absent in ECX or they are masked, which would explain the finding that ECX fails to complement viability (**Figure 2A**). An alternative model is that the ECX engages inappropriately with other target proteins, which results in cell death due to prolific degradation. We sought to distinguish between these models by taking advantage of strains where socB is deleted.

In cells lacking SocB, clpX could be deleted, but these cells are abnormal and show poor viability upon plating (**Figure 3A**). As expected, expression of CCX restored viability in a dilution-plating assay (**Figure 3A**). However, in contrast to prior observations (Osteras et al., 1999, **Figure 2A**), expression of ECX complements growth (**Figures 3A,D**). The ∆N-CCX construct also improves viability, though less effectively than variants of ClpX with an N-domain (**Figures 3A,D**). Microscopy studies reveal that expression of CCX in ∆clpX∆socB cells restores normal morphology and cell length (**Figures 3B,C**). Interestingly, although expression of ECX restores viability, cell morphology and cell length are still dramatically perturbed (**Figures 3B,C**). This perturbation is also seen with expression of the chimeric CC-ECX construct (**Figures 3B,C**), suggesting that species-specific differences in the ClpX AAA+ domain are responsible for these changes in cell morphology. Consistent with this interpretation, expression of the ∆N-CCX restores cell length more fully than either of the constructs containing the ECX AAA+ domain (**Figures 3B,C**). Thus, it seems that there are species-specific N-domain-dependent and AAA+ domain-dependent substrate recognition profiles that both contribute to the role of ClpX in Caulobacter.

#### Species-Specific Processing of DnaX Is Needed for Robust Growth

Given the species-specific nature of the phenotypic complementation, we next explored the molecular consequences of ClpX variant expression.

DnaX is a subunit of the replication clamp loader complex that is responsible for sliding clamp dynamics during replication and DNA damage responses (Kelch, 2016). In Caulobacter, full length DnaX (also called τ) is processed by the ClpXP protease to generate shorter fragments (γ1 and γ2) that are critical for survival and a robust DNA damage response (**Figure 4A**; Vass and Chien, 2013). Because ∆socB cells can tolerate the loss of ClpX, we examined the levels of DnaX in this background. In line with our expectations, DnaX was not processed in cells lacking ClpX (**Figure 4B**). Previous in vitro work suggested that the N-terminal domain of ClpX plays an essential role for proteolytic recognition of DnaX (Vass and Chien, 2013) and, consistent with this model, cells expressing ∆N-CCX fail to process DnaX. However, this N-domain dependence is species-specific, as cells expressing ECX also do not correctly process DnaX, resulting in loss of the shortest (γ2) DnaX and accumulation of full length DnaX (**Figure 4B**). The ECX AAA+ domain is able to process DnaX correctly as expression of the CC-ECX chimeric ClpX, which contains the ECX AAA+ domain, is sufficient to restore the production of both normal DnaX fragments. Therefore, species-specific combinations of the N-domain and the ClpX AAA+ domain are needed for normal processing and degradation of DnaX.

Previously, we showed that DnaX processing is essential for wildtype growth (Vass and Chien, 2013); however, ∆socB∆clpX strains are viable even though DnaX is not processed in this background (**Figure 4B**). Given the sickness of these cells, we asked if expression of the γ-fragments of DnaX could improve growth in these strains. Consistent with a critical need for DnaX isoforms, we found that expression of either γ1 or γ2 DnaX increased growth rate in liquid cultures, compared to the empty plasmid control (**Figure 4C**). Curiously, expression of full length DnaX (which only generates τ in this ClpX-free strain) inhibits growth and reduces density at saturation, suggesting that an excess of τ is toxic. Despite the clear improvement in growth, the doubling time of γ1 or γ2 expressing strains is still ∼9–10 h (**Figure 4C**), substantially longer than the ∼90 min doubling time of wildtype Caulobacter in these conditions. Therefore, there must be additional non-essential aspects of ClpXP degradation that promote normal robust growth.
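Doubling times such as those compared above are typically derived from optical density readings taken during exponential growth. As a minimal sketch, assuming two OD600 readings within exponential phase (the readings below are hypothetical, chosen to reproduce doubling times of the magnitudes discussed, and are not measurements from Figure 4C):

```python
import math


def doubling_time(od_start, od_end, elapsed_hours):
    """Doubling time in hours from two OD readings in exponential phase."""
    if od_start <= 0 or od_end <= od_start:
        raise ValueError("OD must be positive and increasing")
    return elapsed_hours * math.log(2) / math.log(od_end / od_start)


# Hypothetical readings: three doublings in 4.5 h, and one doubling in 9.5 h.
fast = doubling_time(0.05, 0.4, 4.5)
slow = doubling_time(0.05, 0.1, 9.5)
print(f"fast culture: {fast * 60:.0f} min per doubling")
print(f"slow culture: {slow:.1f} h per doubling")
```

In practice, fitting a line to ln(OD600) across many time points is more robust than using two readings, since a single noisy measurement can shift the estimate considerably.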

#### Cell Cycle Adaptors Do Not Rely on Species Restricted Interactions with ClpXP

Caulobacter growth and development relies on adaptors that interact with the ClpX N-domain (Aakre et al., 2013; Smith et al., 2014; Lau et al., 2015). The ECX AAA+ domain is active (**Figure 1**) but the ECX variant results in a DnaX distribution different from CCX (**Figure 4B**). Therefore, we next asked if adaptor mediated degradation was altered in strains expressing ECX.

CtrA is a master regulator and replication inhibitor in Caulobacter that must be degraded during the transition from the swarmer to stalked cell to promote replication and developmental changes (Jenal and Fuchs, 1998; Wortinger et al., 2000).

Degradation of the CtrA protein is an excellent model for N-domain dependent delivery as this process requires a multiadaptor hierarchy consisting of CpdR, PopA, and RcdA (Taylor et al., 2009; Smith et al., 2014; Joshi et al., 2015; Lau et al., 2015). By monitoring the adaptor-dependent delivery of CtrA we could explicitly test if the ECX N-domain was capable of supporting these adaptor interactions. As a read out of CtrA degradation, we used Western blotting to monitor levels of CtrA following inhibition of protein synthesis upon addition of chloramphenicol. As anticipated, cells containing the CCX N-domain (CCX, CC-ECX) can degrade CtrA while cells without ClpX or expressing ∆N-CCX are unable to degrade CtrA robustly (**Figure 5A**). Cells expressing ECX as the only ClpX variant exhibit CtrA degradation similar to wildtype (**Figure 5B**). Thus, the N-domain of ECX is able to support degradation through the adaptor hierarchy found in Caulobacter.
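A translation shut-off time course like the one described here can be summarized as an apparent half-life if the band intensities decay roughly exponentially. The sketch below assumes first-order decay and quantified band intensities; no half-lives are reported in the text, and the intensities are synthetic values generated for a hypothetical 40 min half-life, sampled every 30 min for 2 h as in the CtrA degradation protocol.

```python
import math


def half_life_minutes(times_min, intensities):
    """Fit ln(intensity) = ln(I0) - k * t by least squares; return ln(2) / k."""
    logs = [math.log(i) for i in intensities]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("signal does not decay; half-life is undefined")
    return math.log(2) / -slope


# Synthetic band intensities (arbitrary units) for a hypothetical 40 min half-life.
times = [0, 30, 60, 90, 120]
assumed_half_life = 40.0
intensities = [100.0 * 0.5 ** (t / assumed_half_life) for t in times]

estimate = half_life_minutes(times, intensities)
print(f"apparent half-life: {estimate:.1f} min")
```

A strain that cannot degrade the protein would give a flat or rising signal, which the function reports as an error instead of returning a meaningless number.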

Our working model is that ECX fails to degrade the SocB toxin because the N-domain of ECX fails to bind the SocA adaptor (**Figure 2**). However, the N-domain of ECX appears fully competent to interact with the cell cycle adaptor hierarchy (**Figure 5**). Because adaptor-dependent delivery requires unique interactions supplied by the N-domain and contacts with the ClpX AAA+ domain, our work reveals a complexity in this regulation that results in both species-specific and species-nonspecific recognition of protease substrates.

# DISCUSSION

The presence of the ClpX unfoldase in all bacteria is likely due to a need for its protease activity. Given the similarity between orthologs, it is perhaps not surprising that many species of ClpXP can universally recognize some substrates based on conserved sequence or structural degrons, such as SsrA-tagged proteins. Increasing the versatility of ClpX activity therefore requires additional elaboration of ClpX-substrate interactions. Adaptors can fill this role, but are not the only method of diversifying substrate recognition.

Our comparison of E. coli and Caulobacter ClpX reinforces the working model that the most conserved regions of the ClpX AAA+ domain support functions required for all protease activity, such as ATP hydrolysis, oligomerization and ClpP binding (**Figure 6**). More diverse regions appear to be the origin of species-specific activity. For example, both ECX and CCX contain the "IGF" motifs required for ClpP binding, but the area surrounding this region varies (Supplemental Figures 1, 2). This difference may explain the inability of CCX to interact with ECP in an in vitro setting. By contrast, the Caulobacter ClpX N-domain appears to support essential contacts required for Caulobacter viability that the E. coli N-domain does not provide. These contacts may include stringent recognition of substrates or interactions with critical adaptors needed for viability. We speculate that the differences in sequences between these species of N-domains (Supplemental Figures 1, 2) may underlie these different binding profiles.

The N-domain alters substrate targeting to ClpX by directly recognizing substrates or cooperating with a diverse set of adaptors for target degradation. In our study, we find that fusing the Caulobacter ClpX N-domain onto the AAA+ domain of E. coli ClpX restores the essential nature of ClpX in Caulobacter. We interpret this as evidence for the N-domain of the Caulobacter ClpX playing a unique role, such as facilitating degradation of the SocB toxin. However, differences between these N-domains do not result in purely exclusive behavior, as the E. coli ClpX can support adaptor-dependent CtrA degradation and is able to restore growth defects in cells lacking SocB. In addition, an altered ability to process DnaX among the ClpX constructs suggests inherent differences in direct substrate recognition and may also reflect altered cooperation between the ClpX N-domain and AAA+ domain.

In conclusion, although the ClpX sequence is highly conserved between E. coli and C. crescentus, there are species-specific differences in activity that restrict the complementation between orthologs. These differences seem principally reflected by N-domain interactions, which account for both direct recognition and coordinated adaptor activity. However, it also seems that differences in substrate recognition by the ClpX AAA+ domain may affect how different ClpX orthologs support normal growth in Caulobacter. The work presented here argues that many aspects of ClpX function are conserved throughout bacterial evolution, but small differences may result in an altered ClpX specificity that is only critical in a particular species.

# MATERIALS AND METHODS

All Caulobacter strains, liquid or plated, were grown in PYE at 30°C, in the presence of the appropriate antibiotics or sugars.

#### In vitro ClpX Analysis

ClpX and ClpP from C. crescentus and E. coli were purified as before (Chien et al., 2007). Degradation of GFP-ssrA was performed as before (Rood et al., 2012).

#### Caulobacter Strains

Constructs expressing ClpX variants from the Caulobacter clpX promoter were generated by cloning 500 bp upstream of the Caulobacter clpX gene and fusing this to ClpX alleles using a pMR10-based vector. Plasmids were electroporated into ∆socB, clpX:: cells or parental strain UJ220 (Osteras et al., 1999). The following ClpX constructs were used: Caulobacter ClpX (CCX), E. coli ClpX (ECX), the Caulobacter ClpX AAA+ domain (CCX lacking N-domain residues 2-53, ∆N-CCX), and a chimeric fusion in which the Caulobacter N-domain replaces the N-domain on the E. coli ClpX body, a direct substitution of N-terminal residues 2-53 (CC-ECX).

#### Caulobacter Length Analysis

Phase contrast images of Caulobacter cells (Zeiss AXIO ScopeA1) were subjected to axial length analysis, measuring pole-to-pole distance with the MicrobeJ software suite (ImageJ). Length is reported in microns.

#### ClpX Depletion

ClpX depletion was performed as described previously (Bhat et al., 2013), except that cells were back-diluted twice during the ∼20 h ClpX depletion. Samples for ClpX replete conditions were taken prior to depletion. Samples for both ClpX replete and depletion conditions were pelleted, snap frozen, and then resuspended in SDS loading buffer to a normalized OD600 of 0.1. Samples were then heated at 95°C for 5 min. Equal volumes of sample were subjected to SDS-PAGE followed by Western transfer. Resulting blots were probed with anti-ClpX or anti-DnaX antibodies and visualized with appropriate secondary antibodies conjugated to HRP and chemifluorescent substrate.
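The OD normalization used for these samples reduces to simple arithmetic. The following is a minimal sketch, assuming that OD600 scales linearly with cell density and that the pellet from a known culture volume is resuspended to the target OD600; the function name and the example numbers are hypothetical and are not taken from the protocol above.

```python
def resuspension_volume_ml(culture_od600, culture_volume_ml, target_od600=0.1):
    """Buffer volume (mL) that brings a pelleted sample to target_od600.

    Assumes OD600 is proportional to cell density, so that
    OD_culture * V_culture = OD_target * V_buffer.
    """
    if culture_od600 <= 0 or culture_volume_ml <= 0 or target_od600 <= 0:
        raise ValueError("OD600 values and volumes must be positive")
    return culture_od600 * culture_volume_ml / target_od600


# Hypothetical example: pellet from 1 mL of culture at OD600 = 0.4
volume = resuspension_volume_ml(0.4, 1.0)
print(f"Resuspend pellet in {volume:.2f} mL of SDS loading buffer")
```

Loading equal volumes of samples normalized this way corresponds to loading roughly equal amounts of cell material per lane.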

#### CtrA Degradation

∆socAB and ∆socB, clpX:: cells were diluted from overnight culture and allowed to reach mid-log phase (0.3–0.5 OD600). The translational inhibitor chloramphenicol was added to a final concentration of 30 µg/ml. Following the addition of chloramphenicol, aliquots were removed every 30 min for 2 h. Cells were pelleted, snap frozen, and then resuspended, normalized to an OD600 of 0.3. Samples were heated at 95°C for 5 min. Equal volumes of sample were subjected to SDS-PAGE followed by Western transfer. Resulting blots were probed using an anti-CtrA antibody and visualized as above.
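Time courses of this kind (translation shut-off followed by sampling at fixed intervals) are often summarized as a protein half-life. The text does not state that half-lives were calculated here, so the following is only an illustrative sketch: it assumes first-order decay and fits a line to the logarithm of band intensity. The function name and the intensity values are invented for the example.

```python
import math


def half_life_min(times_min, signal):
    """Estimate a half-life by least-squares fit of ln(signal) vs. time.

    Assumes first-order decay, signal(t) = s0 * exp(-k * t).
    """
    if len(times_min) != len(signal) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(s) for s in signal]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # equals -k for a decaying signal
    if slope >= 0:
        raise ValueError("signal does not decay; half-life is undefined")
    return math.log(2) / -slope


# Sampling every 30 min for 2 h; the intensities below are invented.
times = [0, 30, 60, 90, 120]
band_intensity = [1.0, 0.5, 0.25, 0.125, 0.0625]
print(f"Estimated half-life: {half_life_min(times, band_intensity):.1f} min")
```

Comparing such estimates between strains is meaningful only if band intensities fall within the linear range of detection.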

#### Liquid Growth Assay

∆socAB and ∆socB, clpX:: cells carrying the corresponding plasmids were grown from single colonies. For the time courses, samples were back-diluted to a starting OD600 of ∼0.1, and changes in optical density were measured over time. Resulting growth curves are the average of three biological replicates (n = 3). Error bars represent the standard deviation of these replicates (**Figure 3D**).
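The averaging of replicate growth curves can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the text does not say whether the sample or population standard deviation was used, so the sample standard deviation (n − 1 denominator) is assumed, and the OD600 readings are invented.

```python
from statistics import mean, stdev


def summarize_growth(replicates):
    """Return a (mean, SD) pair per time point across replicate OD600 curves.

    `replicates` is a list of equal-length lists, one per biological
    replicate, all sampled at the same time points.
    """
    if len(replicates) < 2:
        raise ValueError("at least two replicates are needed for an SD")
    if len({len(r) for r in replicates}) != 1:
        raise ValueError("all replicates must share the same time points")
    # zip(*replicates) groups the readings taken at the same time point
    return [(mean(point), stdev(point)) for point in zip(*replicates)]


# Invented OD600 readings for n = 3 replicates at three time points
curves = [
    [0.10, 0.20, 0.41],
    [0.10, 0.22, 0.45],
    [0.11, 0.21, 0.43],
]
for avg, sd in summarize_growth(curves):
    print(f"OD600 = {avg:.3f} ± {sd:.3f}")
```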

#### Plated Growth Assays

∆socAB and ∆socB, clpX:: cells carrying the appropriate plasmids were grown from single colonies into log phase. All plating samples started at a density of ∼0.1 OD600, followed by a ten-fold dilution for each subsequent spot. Four microliters of each dilution were spotted onto solid media and grown for ∼3 days.
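The spot densities in such a series follow directly from the starting density and the dilution factor. A minimal sketch is given below; the number of spots is not stated in the text, so `n_spots` is a hypothetical parameter chosen only for illustration.

```python
def spot_dilution_series(start_od600=0.1, n_spots=6, fold=10):
    """OD600-equivalent density of each spot in a serial dilution.

    The first spot is the undiluted starting culture; each subsequent
    spot is diluted `fold` times relative to the previous one.
    """
    if start_od600 <= 0 or n_spots < 1 or fold <= 1:
        raise ValueError("need a positive density, >= 1 spot, and fold > 1")
    return [start_od600 / fold ** i for i in range(n_spots)]


# n_spots = 6 is an assumed value, not taken from the protocol
for i, od in enumerate(spot_dilution_series()):
    print(f"spot {i + 1}: OD600 equivalent = {od:.1e}")
```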

#### AUTHOR CONTRIBUTIONS

RV and JN performed experiments. RV and PC designed experiments and wrote the manuscript.

# ACKNOWLEDGMENTS

The authors thank Meg Stratton and members of the Chien lab, Vierling lab, Hebert lab and Peyton lab for valuable discussions. We also thank the Protein Homeostasis theme of the Institute for Applied Life Sciences for discussions. This work was sponsored by NIH R01GM111706 to PC and in part by funding from a Chemistry Biology Interface Program Training Grant (NIH T32GM08515) to RV. Portions of this work were initiated while PC was in the laboratory of Tania Baker (MIT).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmolb.2017.00028/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Vass, Nascembeni and Chien. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Functional Diversity of AAA+ Protease Complexes in *Bacillus subtilis*

#### Alexander K. W. Elsholz <sup>1</sup> , Marlene S. Birk <sup>1</sup> , Emmanuelle Charpentier <sup>1, 2, 3</sup> and Kürşad Turgay <sup>4</sup> \*

<sup>1</sup> Department of Regulation in Infection Biology, Max Planck Institute for Infection Biology, Berlin, Germany, <sup>2</sup> The Laboratory for Molecular Infection Sweden, Department of Molecular Biology, Umeå Centre for Microbial Research, Umeå University, Umeå, Sweden, <sup>3</sup> Humboldt University, Berlin, Germany, <sup>4</sup> Faculty of Natural Sciences, Institute of Microbiology, Leibniz Universität, Hannover, Germany

Here, we review the diverse roles and functions of AAA+ protease complexes in protein homeostasis, control of stress response and cellular development pathways by regulatory and general proteolysis in the Gram-positive model organism Bacillus subtilis. We discuss in detail the intricate involvement of AAA+ protein complexes in controlling sporulation, the heat shock response and the role of adaptor proteins in these processes. The investigation of these protein complexes and their adaptor proteins has revealed their relevance for Gram-positive pathogens and their potential as targets for new antibiotics.

#### *Edited by:*

Walid A. Houry, University of Toronto, Canada

#### *Reviewed by:*

Axel Mogk, Zentrum für Molekulare Biologie, University of Heidelberg, Germany Pierre Genevaux, Centre National de la Recherche Scientifique (CNRS), France Teru Ogura, Kumamoto University, Japan

*\*Correspondence:*

Kürşad Turgay turgay@ifmb.uni-hannover.de

#### *Specialty section:*

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

> *Received:* 14 April 2017 *Accepted:* 15 June 2017 *Published:* 12 July 2017

#### *Citation:*

Elsholz AKW, Birk MS, Charpentier E and Turgay K (2017) Functional Diversity of AAA+ Protease Complexes in Bacillus subtilis. Front. Mol. Biosci. 4:44. doi: 10.3389/fmolb.2017.00044

Keywords: AAA+ protease complexes, Hsp100/Clp proteins, *Bacillus subtilis*, protein quality control, chaperones, regulatory proteolysis, McsB, adaptor proteins

# INTRODUCTION

Bacteria, like all living organisms, must rapidly sense and adapt to drastic changes in their environment (Roux, 1914). These environmental changes can directly or indirectly affect protein structure, activity and homeostasis. Protein quality control systems are an important part of the cellular adjustment processes that allow a response to such changes. The conserved cellular protein quality control systems comprise chaperones and members of the AAA+ family, which can prevent or reverse the potentially toxic aggregation of misfolded proteins. Damaged, misfolded, or aggregated proteins that cannot be successfully refolded or repaired can subsequently be degraded by the AAA+ protease complexes (Wickner et al., 1999; Hartl et al., 2011; Mogk et al., 2011).

These AAA+ proteins are members of a conserved family of ATP-hydrolyzing proteins with diverse activities in many cellular pathways, including replication, DNA and protein transport, transcriptional regulation, ribosome biogenesis, membrane fusion, and protein disaggregation or degradation. The AAA+ family proteins often form hexamers and can convert the energy of ATP hydrolysis into mechanical force in order to remodel or unfold proteins or nucleoprotein complexes, to move DNA or proteins, or to facilitate membrane fusion (Ogura and Wilkinson, 2001; Erzberger and Berger, 2006; Sauer and Baker, 2011).

The unifying activity of the AAA+ family proteins participating in protein quality control systems is the unfolding of proteins, driven by ATP hydrolysis-dependent translocation through specific loops in the pore formed by the hexameric AAA+ ring structure. This unfoldase activity is central to the function of AAA+ proteins in protein disaggregation and degradation (Horwich et al., 1999; Sauer and Baker, 2011). In conjunction with Hsp70 chaperones, AAA+ proteins of the Hsp104/ClpB protein family can disaggregate and subsequently refold protein aggregates (Glover and Lindquist, 1998; Mogk et al., 2015). However, in AAA+ protease complexes, AAA+ unfoldases such as ClpC or ClpX associate with a specific barrel-shaped, compartmentalized protease complex, such as ClpP, which receives the unfolded proteins for degradation from the translocating AAA+ proteins (Weber-Ban et al., 1999; Wickner et al., 1999). Related AAA+ proteases such as Lon or FtsH form hexameric complexes, but encompass both a AAA+ domain and a subsequent metallo-protease domain (**Figure 1**).

Interestingly, in the proteasome, the eukaryotic AAA+ protease complex, the base of the 19S regulatory subunit consists of AAA+ proteins forming a hetero-oligomeric hexamer, which is associated with the proteolytic 20S particle. Here, the heterologous AAA+ proteins play a role similar to that of the homo-oligomeric hexameric AAA+ proteins in the bacterial AAA+ protease complexes of the Hsp100/Clp protein family (Kirstein et al., 2009b; Sauer and Baker, 2011; Matyskiela and Martin, 2013).

Specific sequence tags and/or adaptor proteins are necessary for the recognition, selection and preparation of substrate proteins for degradation by the AAA+ protease complexes. Diverse adaptor proteins for many AAA+ proteins have been characterized and identified in various bacteria, including model systems such as Escherichia coli, B. subtilis, or Caulobacter crescentus. The synthesis and activity of these adaptor proteins can be regulated by a variety of mechanisms and input signals. For example, adaptor protein activity can be controlled by sequestration, proteolysis, post-translational modification, or anti-adaptor proteins (Kirstein et al., 2009b; Sauer and Baker, 2011; Battesti and Gottesman, 2013; Joshi and Chien, 2016; Kuhlmann and Chien, 2017; Yeom et al., 2017). It was recently demonstrated in E. coli that DnaK selects and targets substrates for disaggregation and refolding by ClpB, and therefore can be considered an adaptor for ClpB (Weibezahn et al., 2004; Oguchi et al., 2012; Seyffer et al., 2012; Winkler et al., 2012b).

In B. subtilis, the ClpC adaptor proteins MecA, YpbH, and McsB, the ClpX adaptor proteins YjbH and CmpA, and the LonA adaptor protein SmiA were identified and characterized (Kirstein et al., 2009b; Mukherjee et al., 2015; Tan et al., 2015; **Figure 1**). Interestingly, the adaptor proteins of ClpC not only recognize substrate proteins, but also facilitate the activation of the ClpC hexamer, which allows for subsequent formation of the functional protease complex. In the absence of substrates, these adaptor proteins are themselves degraded, which leads to inactivation of ClpCP. This regulatory mechanism curbs the activity of the ClpCP protease when substrates are not present (Kirstein et al., 2006). In summary, adaptor proteins play an important role in controlling and facilitating the various and different regulatory and general functions of their cognate AAA+ proteins (Kirstein et al., 2009b; Sauer and Baker, 2011; Battesti and Gottesman, 2013; Joshi and Chien, 2016; Kuhlmann and Chien, 2017).

# PROTEIN QUALITY CONTROL AND STRESS RESPONSE SYSTEMS IN *BACILLUS SUBTILIS*

B. subtilis is considered the model organism for Gram-positive bacteria. B. subtilis cells are amenable to genetic manipulation, and many tools and methods exist for the study of its physiology and fundamental cellular processes (Sonenshein et al., 2002; Graumann, 2017). It is a soil-dwelling organism that can adjust to rapidly changing environmental conditions, including the availability of nutrients, water and oxygen, and changes in light, temperature, and salinity. This ability to sense and respond to various environmental stimuli is a prerequisite for the survival of B. subtilis in its ever-changing environment (Hecker and Völker, 2001). In addition to a number of general and specific stress response systems controlled by dedicated transcription factors (e.g., SigmaB, CtsR, HrcA, Spx, PerR, or OhrR; Hecker et al., 1996, 2007; Mogk et al., 1997; Zuber, 2009; Elsholz et al., 2010b; Runde et al., 2014), B. subtilis cells can also respond to environmental changes by triggering sophisticated and complex developmental programs that result in sporulation, biofilm formation, motility, or competence (Rudner and Losick, 2001; Errington, 2003; Chen et al., 2005; Lopez et al., 2009; Vlamakis et al., 2013; Mukherjee and Kearns, 2014; Hobley et al., 2015). The AAA+ protease systems and their adaptor proteins are intricately involved in stress response and developmental programs of B. subtilis cells. Consequently, pleiotropic effects were observed in clpX, clpC, and clpP deletion strains, and these phenotypes are not only linked to protein quality control, but also imply a regulatory role for these genes in various stress response and developmental pathways (Dubnau and Roggiani, 1990; Msadek et al., 1994; Gerth et al., 1998; Kock et al., 2004; Zuber, 2004; Kirstein et al., 2009b; Runde et al., 2014).

The different distinguishing AAA+ and accessory domains are depicted.

# Role of AAA+ Proteins and Chaperone Networks in *B. subtilis* Protein Homeostasis

The B. subtilis protein quality control system includes chaperones like the Hsp70 (DnaKJE) and Hsp60 (GroE) system, as well as other conserved chaperone systems such as ribosome-associated chaperones (Trigger factor), Hsp90 (HtpG), small heat shock proteins and redox chaperones (Schumann et al., 2002; Moliere and Turgay, 2009), together with AAA+ protease complexes.

The AAA+ unfoldase ClpB, which together with DnaK is necessary for protein refolding and disaggregation (Glover and Lindquist, 1998; Weibezahn et al., 2004; Haslberger et al., 2007; Winkler et al., 2010, 2012a; Oguchi et al., 2012; Seyffer et al., 2012), is widely conserved in most bacterial species, but is notably absent from B. subtilis. However, it was demonstrated that B. subtilis ClpC, which is closely related to ClpB, can together with the adaptor protein MecA or its paralog YpbH disaggregate and refold protein aggregates in vitro when not associated with ClpP (Schlothauer et al., 2003; Haslberger et al., 2008).

In B. subtilis, the AAA+ protease complexes ClpCP, ClpEP and ClpXP are part of the protein quality control system. ClpC was identified as a stress-induced protein; the ∆clpC strain is thermosensitive and, similar to ∆clpP or ∆clpX strains, displays impaired degradation of misfolded proteins (Krüger et al., 1994, 2000; Msadek et al., 1994; Gerth et al., 1998, 2004; Kock et al., 2004). ClpE expression is tightly controlled and is only induced after severe heat shock, implying that ClpEP might function as an additional protease system under other severe stress conditions (Derre et al., 1999a; Gerth et al., 2004; Miethke et al., 2006). Consistent with their function in protein homeostasis, ClpC, ClpX, ClpE, and ClpP were all observed to associate with subcellular protein aggregates, especially upon heat shock or heterologous protein synthesis (Krüger et al., 2000; Jürgen et al., 2001; Miethke et al., 2006; Kain et al., 2008; Kirstein et al., 2008; Simmons et al., 2008).

As previously demonstrated for other bacteria (Sauer and Baker, 2011), ClpXP of B. subtilis is necessary for the degradation of proteins whose translation is stalled. These unfinished polypeptides are prone to aggregation and must be eliminated. In a process called trans-translation, stalled ribosomes are rescued by the activities of the SmpB protein in conjunction with the transfer and messenger RNA (tmRNA). The tmRNA is a specialized small RNA, which aided by SmpB first acts as a tRNA and subsequently like an mRNA. This not only liberates the ribosome, but also results in the addition of a short sequence, termed an SsrA tag to the C-terminus of the unfinished protein (Keiler et al., 1996; Muto et al., 2000; Abe et al., 2008; Keiler, 2008; Ujiie et al., 2009). ClpXP recognizes the C-terminal SsrA tag, and degrades these unfinished proteins, thereby preventing their aggregation (Keiler et al., 1996; Wiegert and Schumann, 2001; Sauer and Baker, 2011).

The membrane-associated FtsH AAA+ protease is most likely also directly involved in protein quality control, since a deletion of ftsH causes pleiotropic effects, including salt and heat sensitivity (Deuerling et al., 1995, 1997). The two B. subtilis AAA+ protease Lon paralogs, LonA and LonB, do not have a significant role in the degradation of misfolded proteins (Riethdorf et al., 1994; Schmidt et al., 1994; Krüger et al., 2000; Serrano et al., 2001; Simmons et al., 2008). Very little is known about the possible in vivo role of the B. subtilis ClpYQ (CodWX) AAA+ protease complex (Slack et al., 1995; Kang et al., 2003; Simmons et al., 2008; **Figure 1**).

# Role of Chaperones and AAA+ Protease Complexes in Controlling Stress Response Pathways

An interesting feedback mechanism was observed for the regulation of chaperone synthesis in B. subtilis. The transcription of the dnaK and groE operon is controlled by the repressor HrcA, which is also encoded as the first gene of the dnaK operon. The GroEL chaperone is necessary for maintaining the repressor activity of HrcA. However, when GroEL interacts with unfolded proteins, HrcA repressor activity cannot be maintained and the synthesis of GroEL and DnaK is induced. The elevated levels of chaperones help to protect and repair the proteome. This subsequently restores the repressor activity of HrcA, thereby terminating the transcriptional induction of chaperones (Mogk et al., 1997; Schumann et al., 2002).

The same AAA+ protease complexes can be involved in general proteolysis for protein quality control and in regulatory proteolysis to control the activity of transcription factors and other key regulatory proteins. In B. subtilis, chaperones like GroEL are not the only sensors of protein folding stress: the AAA+ protease complexes ClpCP and ClpXP, with their adaptor proteins McsB and YjbH, also sense various stresses and regulate their own synthesis by controlling, e.g., CtsR or Spx stability (Zuber, 2004; Kirstein et al., 2009b; Rochat et al., 2012; Runde et al., 2014; Engman and von Wachenfeldt, 2015; Mijakovic et al., 2016).

#### Stress Response and the Control of the Spx Regulon by ClpXP and Its Adaptor Protein YjbH

The unusual transcription factor Spx was first identified by analyzing genetic suppressor mutations selected in a clpP or clpX deletion strain, which were mapped to the yjbD gene encoding Spx (Nakano et al., 2001). Spx is normally degraded by ClpXP, and the growth defect in B. subtilis strains lacking clpX or clpP is due to an accumulation of this transcription factor (Nakano et al., 2002, 2003a,b; **Figure 2**).

The same suppressor mutant analysis suggested, and subsequent structural analysis demonstrated, that Spx modulates transcription by interacting with the alpha subunit of the RNA polymerase (Nakano et al., 2000; Newberry et al., 2005). In doing so, it inhibits the interaction of activators with the RNA polymerase (Nakano et al., 2003b). In addition, it was observed that Spx can also operate at specific promoters as a redox-controlled activator of transcription (Nakano et al., 2005, 2010; Newberry et al., 2005; Lin and Zuber, 2012; Lin et al., 2013). Spx controls a broad regulon that includes genes important for the redox stress response, such as the redox chaperone TrxA and genes that maintain cellular thiol homeostasis (Antelmann et al., 2000; Nakano et al., 2003a,b; Zuber, 2009; Rochat et al., 2012). It was recently observed that not only oxidative stress but also heat stress can induce Spx activity and that Spx is essential for thermotolerance development in B. subtilis. These results suggested that Spx is important to orchestrate the heat and oxidative stress responses (Runde et al., 2014).

The stress sensing for the regulatory proteolysis and activity control of Spx is mediated via the adaptor protein YjbH and the N-terminal domain (NTD) of ClpX. Under normal conditions, ClpXP and the adaptor protein YjbH suppress Spx activity by mediating its degradation (Larsson et al., 2007; Rogstam et al., 2007; Garg et al., 2009; Kommineni et al., 2011; Chan et al., 2014). The adaptor protein YjbH induces the exposure of a ClpXP recognition element of Spx, thereby promoting Spx degradation under normal conditions (Chan et al., 2014). It was demonstrated that the zinc ion-containing NTD of ClpX is sensitive to oxidative stress, which would inhibit ClpXP mediated degradation. Spx activity can be directly modulated by disulfide bond formation upon oxidation of two specific cysteines (Nakano et al., 2005; Zhang and Zuber, 2007). Oxidative inactivation (Garg et al., 2009) or stress-mediated sequestration of YjbH to protein aggregates (Engman and von Wachenfeldt, 2015) results in the stabilization and accumulation of Spx also under heat stress conditions (Zuber, 2009; Runde et al., 2014). Therefore, multiple stress signals are sensed and integrated by the adaptor protein YjbH, the AAA+ protein ClpX and Spx itself in order to control the activity and stability of this transcription factor (Zuber, 2004, 2009; **Figure 2**).

Interestingly, a study combining global transcriptomics and identification of Spx chromosomal binding sites revealed that Spx activates not only transcription of the genes for the ClpC adaptor proteins MecA and YpbH (Nakano et al., 2003a), but also the genes for the AAA+ protein ClpX and its adaptor protein YjbH (Rochat et al., 2012). The same study provided evidence that Spx positively influences the expression of CtsR dependent genes. The observation of additional identified Spx binding sites might even suggest that HrcA-dependent gene expression could also be affected by Spx (Rochat et al., 2012). These results support a central and intricate role of Spx in B. subtilis heat shock response and protein quality control (Runde et al., 2014; **Figure 2**).


#### Heat and Oxidative Stress Responses and the Control of the CtsR Regulon

CtsR (Class three stress repressor) is a global repressor of protein quality control genes in B. subtilis and all Gram-positive bacteria (Elsholz et al., 2010a) and recognizes a conserved direct heptanucleotide repeat sequence in its dimeric form (Krüger and Hecker, 1998; Derre et al., 1999b). However, CtsR repressor activity is influenced by several different stress signals, and many of the signal transduction mechanisms that converge on CtsR are regulated by the protein quality control machinery (Elsholz et al., 2010a). Thus, CtsR represents a central regulator for the adaptation of the cell to environmental changes that influence cellular protein quality control.

CtsR controls the expression of its own operon containing ctsR, mcsA, mcsB, and clpC. clpP and clpE are also regulated by CtsR as single genes. CtsR therefore controls its own synthesis. The mcsA and mcsB genes were identified as encoding modulators of CtsR activity (Krüger et al., 2001). Proteins like ClpC or ClpP, whose expression is inhibited by CtsR, play a crucial role in the adaptation to high temperatures and must be induced during heat stress in order to ensure survival of the cell (Krüger and Hecker, 1998; Derre et al., 1999b, 2000; Gerth et al., 2004). The level of control by CtsR is reflected by the number of CtsR binding sites in the respective promoters. The tighter the CtsR-mediated repression, the more strongly transcription of these genes is repressed under optimal growth conditions and the more strongly it can be induced during stress conditions (Helmann et al., 2001; Petersohn et al., 2001). In contrast to what is known about the regulation of other heat stress response systems, the inactivation of CtsR during heat stress depends solely on an intrinsic thermosensing function, independent of other components such as chaperones influencing CtsR activity (Elsholz et al., 2010b; **Figure 3**). CtsR uses a highly conserved tetraglycine loop within the winged helix-turn-helix domain (HTH) to sense changes in temperature (Fuhrmann et al., 2009). This region possesses a high conformational entropy that confers decreased thermostability, and is conserved among all Gram-positive CtsR homologs (Elsholz et al., 2010b). Under non-stress conditions, CtsR binds to and represses its DNA operator. However, upon temperature upshift, the labile glycine-rich loop within the HTH changes conformation such that CtsR binding to DNA is impaired, and the expression of genes under the control of CtsR is induced.

Interestingly, this ability of CtsR to sense changes in temperature is conserved among low-GC Gram-positive bacteria and adapted to the species-specific temperature of the ecological niche. This could suggest that the highly conserved tetraglycine loop is involved in the ability to sense temperature upshifts but that distinct, variable regions of CtsR are responsible for adaptation to species-specific temperatures (Elsholz et al., 2010a,b). Interestingly, CtsR-dependent gene expression is repressed again within 15 min of heat exposure (Elsholz et al., 2010b), showing that it is not the high temperature itself, but rather the temperature upshift, that leads to CtsR inactivation. Newly synthesized CtsR molecules are able to bind to their DNA operators even under heat stress conditions, whereas inactivated CtsR molecules are targeted for ClpCP-dependent proteolysis.

#### **ClpE-dependent control of CtsR activity**

The mechanism described above allows expression of the CtsR regulon within minutes of exposure to heat (Krüger and Hecker, 1998). However, this CtsR-mediated response is strictly limited in time, because newly synthesized active or reactivated CtsR can repress the transcription of its regulon again after about 15 min. Interestingly, the apparent reactivation of CtsR also depends on the activity of the AAA+ protein ClpE. In a clpE mutant strain, CtsR is fully functional under normal growth temperatures and becomes inactivated upon heat exposure. However, the repression of CtsR-dependent gene expression is dramatically delayed in the absence of ClpE (Miethke et al., 2006). This observation indicates that ClpE, together with other AAA+ proteins such as ClpC, might be involved in maintaining the repressor activity of CtsR. This mechanism would ensure that expression of CtsR-regulated genes is only inhibited when appropriate levels of active AAA+ proteins are present to maintain CtsR activity (Miethke et al., 2006). How exactly the two diverging functions, CtsR degradation and CtsR reactivation, are controlled and separated by the two AAA+ proteins remains unclear, but an involvement through the control of McsB activity, for example, seems plausible. In a clpE or clpC mutant, the removal of protein stress conditions is delayed, which would keep the McsB kinase active for a longer time (Elsholz et al., 2011a), resulting in CtsR inactivation and thus delayed reactivation.

#### **Regulation of CtsR by McsB**

The most important regulator of CtsR is McsB, which is a protein arginine kinase and an adaptor protein for the ClpCP protease complex, targeting specific substrates, such as CtsR, for ClpCP-dependent degradation. McsB is considered a versatile protein that integrates different stress signals and fulfills a diverse set of functions (Kirstein et al., 2005, 2007; Fuhrmann et al., 2009, 2013; Elsholz et al., 2011b, 2012; Schmidt et al., 2014; Mijakovic et al., 2016).

#### **McsB as a protein kinase and its control by ClpC, McsA and YwlE**

Protein arginine phosphorylation by McsB can drastically change protein activity by switching the charge of the protein at the phosphorylation site and/or by targeting the protein for degradation (Kirstein et al., 2005, 2007; Fuhrmann et al., 2009; Elsholz et al., 2012; Trentini et al., 2016). Therefore, McsB kinase activity must be stringently controlled. Consistent with this, cells expressing hyperactive McsB display a severe growth defect (Elsholz et al., 2011a).

The activity of the McsB kinase is tightly controlled by a complex regulatory network that involves its activator McsA, the AAA+ proteins ClpC and ClpE, as well as the recently identified protein arginine phosphatase YwlE (Kirstein et al., 2005; Elsholz et al., 2011a; Mijakovic et al., 2016). Autophosphorylation of McsB is thought to promote its activation (Kirstein et al., 2005; Elsholz et al., 2011a, 2012; Fuhrmann et al., 2013). YwlE is the cognate phosphatase for McsB-dependent arginine phosphorylation events (Kirstein et al., 2005; Elsholz et al., 2011a, 2012; Fuhrmann et al., 2013) and YwlE counteracts McsB function not only by dephosphorylating its substrates, but also by dephosphorylating McsB itself (**Figure 3**).

ClpC and ClpE both act as inhibitors of McsB activity (Kirstein et al., 2005, 2007; Elsholz et al., 2011a). It has been shown that McsB kinase activity is strongly inhibited by ClpC in vitro (Kirstein et al., 2005) and that McsB strongly interacts with ClpC in vivo due to translational coupling of McsB with ClpC, but that this interaction is abolished upon stress induction. Moreover, in the absence of ClpC, McsB kinase activity is observed even in the absence of any stress conditions (Elsholz et al., 2011a). These observations suggest that under non-stress conditions, McsB interacts with ClpC and that this interaction inhibits McsB activation. Upon stress induction, McsB is released from ClpC inhibition and is free to phosphorylate its target proteins. Interestingly, the release of McsB from ClpC activates McsB both as a protein arginine kinase and as an adaptor protein (Kirstein et al., 2005, 2007; Elsholz et al., 2011a; **Figure 3**).

Interestingly, McsB not only promotes protein degradation, but also inhibits the repressor activity of CtsR, possibly by phosphorylating CtsR within the DNA-binding domain (Kirstein et al., 2005, 2007; Fuhrmann et al., 2009; Elsholz et al., 2011a). Although McsB is not involved in the inactivation of CtsR upon heat stress, it has been shown that McsB kinase activity results in CtsR inactivation in vivo (Elsholz et al., 2010b, 2011a). This regulatory mechanism might explain the inactivation of CtsR under other stress conditions that have been shown to strongly activate CtsR-dependent gene expression, including salt and protein folding stress. A common cellular event that is induced by all these different stress conditions is protein misfolding and aggregation, which could directly or indirectly affect this inhibitory interaction between ClpC and McsB (Kirstein et al., 2007; Elsholz et al., 2011a; **Figure 3**). The activation of McsB might represent a regulatory mechanism that monitors the level of protein stress in the cell and ties the protein homeostatic state of the cell to the expression and activity of protein quality control systems. In addition, McsB has been shown to phosphorylate hundreds of proteins including many regulatory proteins (Elsholz et al., 2012; Schmidt et al., 2014; Trentini et al., 2016). Thus, it is conceivable that McsB might influence a wide range of cellular processes.

#### **Sensing of oxidative stress via McsA and YwlE**

As mentioned above, McsB kinase activity is inhibited not only by the association with the AAA+ protein ClpC, but also by the protein arginine phosphatase YwlE (Elsholz et al., 2012; Schmidt et al., 2014; **Figure 3**). Although YwlE shows strong homology to low-molecular-weight protein tyrosine phosphatases (LMWPTPs), it de-phosphorylates arginine rather than tyrosine residues (Fuhrmann et al., 2013). This selectivity for phospho-arginine residues depends on a single amino acid change (Fuhrmann et al., 2013). Interestingly, the active center of LMWPTPs and YwlE contains a cysteine residue that is susceptible to oxidative damage (Chiarugi and Cirri, 2003; Fuhrmann et al., 2016). Recently, Fuhrmann and colleagues showed that YwlE is indeed subject to regulation through oxidation of this critical cysteine residue under certain oxidative stress conditions, such as exposure to H<sub>2</sub>O<sub>2</sub> (Fuhrmann et al., 2016). Once this cysteine residue in the active center is oxidized, YwlE becomes inactive, resulting in the partial activation of the McsB kinase (Fuhrmann et al., 2016; **Figure 3**). This specific regulatory circuit involving YwlE illustrates another way by which oxidative stress promotes McsB-dependent regulation of diverse cellular processes.

Interestingly, these two molecular mechanisms are not the only regulatory circuits that influence the activity of CtsR and its associated protein quality control networks. It has been shown that CtsR is inactivated during thiol-reactive stress conditions. Under these stress conditions, CtsR inactivation depends on a redox-dependent partner switching mechanism involving McsA and McsB. Under normal growth conditions, McsA strongly interacts with McsB. This not only activates the McsB kinase, but also inhibits McsB binding to and inactivation of DNA-bound CtsR (**Figure 3**).

McsA is a redox-sensing protein whose activity depends on the redox state of its thiols. Oxidation of these thiols prevents interaction of McsA with McsB. Liberated McsB is no longer inhibited by McsA and is thus able to remove CtsR from the DNA (Elsholz et al., 2011b; **Figure 3**).

This molecular redox switch not only controls the expression of CtsR-dependent protein quality control systems, but also influences their activity directly. Interaction of McsB with McsA is required for its kinase activity, which is in turn necessary for the role of McsB as an adaptor that promotes protein degradation by ClpCP (Kirstein et al., 2007). During thiol-reactive stress, McsA oxidation not only promotes McsB-dependent removal of DNA-bound CtsR, but also prevents McsB kinase activity (Elsholz et al., 2011b), thus also influencing the activity of ClpC (**Figure 3**). Interestingly, in low GC Gram-positive bacteria that lack McsA and McsB, ClpE might be able to sense and respond to oxidative stress. The NTD of ClpE is homologous to the NTD of ClpX, which contains a Zn-binding site known to render ClpX sensitive to oxidation (Zhang and Zuber, 2007; Garg et al., 2009). This suggests that the NTD of ClpE, like the NTD of ClpX, could act as a sensor for oxidative stress. ClpE could thereby sense stress and induce the CtsR operon in these organisms, since the inactivated ClpE might no longer be able to activate CtsR (Elsholz et al., 2011b).
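The redox-dependent partner switch between McsA and McsB described in this section can be summarized as a small truth table. The following is only a minimal sketch: the names `SwitchState` and `thiol_switch` are hypothetical, the model is purely Boolean, and it deliberately ignores the additional inputs from ClpC and YwlE discussed earlier.

```python
from typing import NamedTuple


class SwitchState(NamedTuple):
    """Qualitative outputs of the McsA/McsB partner switch (illustrative only)."""
    mcsb_bound_to_mcsa: bool
    mcsb_kinase_active: bool
    ctsr_on_dna: bool
    ctsr_regulon_expressed: bool


def thiol_switch(mcsa_thiols_oxidized: bool) -> SwitchState:
    # Oxidation of the McsA thiols prevents the McsA-McsB interaction.
    bound = not mcsa_thiols_oxidized
    # In this sketch, McsB kinase activity requires the interaction with McsA.
    kinase_active = bound
    # Free McsB removes CtsR from the DNA; McsA-bound McsB cannot.
    ctsr_on_dna = bound
    # CtsR is a repressor, so its regulon is expressed once CtsR leaves the DNA.
    return SwitchState(bound, kinase_active, ctsr_on_dna, not ctsr_on_dna)


for oxidized in (False, True):
    print(oxidized, thiol_switch(oxidized))
```

Under these assumptions, thiol-reactive stress switches the system from "kinase active, regulon repressed" to "kinase inactive, regulon expressed", which is the qualitative behavior described in the text.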

# The General Role of ClpC and McsB in Cellular Protein Quality Control

As mentioned above, McsB can act as an adaptor for the AAA+ protein ClpC. It has been shown that this activity depends on the ability of McsB to function as a protein kinase (Kirstein et al., 2005, 2007). Only when active as a kinase can McsB stimulate ClpC activity, and this specific activation depends on site-specific phosphorylation of ClpC by McsB (Elsholz et al., 2010b). The kinase activity of McsB has also been shown to be required for the degradation of specific substrates by the ClpCP protease. However, McsB might be involved in regulatory proteolysis of not only transcription factors such as CtsR, but also other proteins. There are strong indications that the ClpC adaptor protein McsB, like MecA or YpbH, plays an important role together with ClpCP not only in regulatory proteolysis of CtsR, but also in general proteolysis and protein quality control (Kirstein et al., 2008).

#### McsB and Protein Quality Control

Heat stress promotes the kinase activity of McsB and the association of McsB with subcellular protein aggregates at the poles. ClpC and ClpX are also recruited to these aggregates, but in an McsB-independent manner (Kirstein et al., 2008). Interestingly, in an mcsB deletion strain the misfolded protein GudB∗ accumulates at the cell pole (Stannek et al., 2014), where it probably associates with protein aggregates. This observation suggests a possible scenario in which McsB, together with ClpC or ClpE, is important for disassembling small protein aggregates prior to degradation or reactivation facilitated by the chaperone system. Moreover, McsB and ClpC have been implicated in the disassembly of the competence apparatus, which is also located at the poles. Here, the accumulation of a ComGA-GFP fusion, a component of the competence apparatus, gave the first indication of such a mechanism (Hahn et al., 2009). This suggests the possibility that McsB, like the other proteins encoded in the CtsR regulon, is also a central player in the protein quality control system.

#### **Direct recognition of unfolded arginine-phosphorylated proteins by ClpCP**

The arginine kinase activity of McsB is required for its ability to stimulate ClpC activity and to promote degradation of its substrates by the ClpCP protease. This makes it difficult to dissect the kinase and adaptor activities of phosphorylated McsB (Kirstein et al., 2007). Nevertheless, it was recently demonstrated that the NTD of ClpC can directly recognize phosphorylated arginines at two binding sites. An in vitro arginine-phosphorylated artificial protein substrate, the naturally unfolded beta-casein, could alone activate ClpC and was degraded by ClpCP without the presence of McsB and McsA (Trentini et al., 2016). These experiments demonstrate that ClpCP alone can recognize and degrade an arginine-phosphorylated protein, suggesting a new possible recognition tag for ClpCP-mediated protein degradation and expanding the known repertoire of degradation tags for controlled protein degradation mechanisms in bacteria (Trentini et al., 2016).

However, it should be noted that another ClpCP substrate, the arginine-phosphorylated CtsR, is not recognized and degraded by ClpCP in the absence of McsB and that CtsR phosphorylation on arginine residues is not sufficient for its targeting for degradation by ClpCP (Kirstein et al., 2007). It is possible that beta-casein, which is an unfolded protein, might itself be recognized directly by the NTD of ClpC (Erbse et al., 2008) in addition to the recognition of its phosphorylated arginines. Arginine-phosphorylated unfolded beta-casein might participate in activating ClpC and become targeted for degradation because of these two distinct interactions with ClpC. Nevertheless, these results suggest that during heat stress, McsB might phosphorylate unfolded or aggregated proteins to mark them for subsequent ClpCP degradation, although this might not apply to other proteins targeted by McsB for ClpCP degradation. A ClpC variant with mutations in both Arg-P binding sites (ClpCEA) did not complement a clpC deletion strain for survival during heat stress (Trentini et al., 2016), suggesting the possibility of a more general protein quality control role of protein arginine phosphorylation. However, it is not yet understood how McsB activates ClpC. Therefore, the complex interactions between McsB as adaptor and kinase, its substrates and the NTD of ClpC have to be sorted out before a more definitive understanding of the role of McsB as adaptor protein and arginine protein kinase during heat stress in B. subtilis cells can be reached. To fully understand the role of arginine phosphorylation, McsB, and ClpC in general protein quality control, further in vivo and in vitro studies are needed.

### AAA+ PROTEASE COMPLEXES AND THE CONTROL OF REGULATORY AND CELL DEVELOPMENTAL PATHWAYS OF *B. SUBTILIS*

Regulatory proteolysis represents a very fast and efficient cellular control mechanism (Jenal and Hengge-Aronis, 2003). Therefore, it comes as no surprise that the B. subtilis AAA+ protease complexes are not only intricately involved in protein quality control and in sensing and responding to stress, but are also engaged in the initiation and control of distinct cellular developmental processes of B. subtilis.

In the ever-changing environment encountered by bacteria, the ability to differentiate into specialized cell types is a crucial survival strategy. Complex developmental processes are a hallmark of B. subtilis, and AAA+ proteases play crucial roles in the regulation of these cellular processes.

#### Competence

When grown into stationary phase, a subpopulation of B. subtilis cells develops the ability to actively take up extracellular DNA. ComK is the transcription factor necessary and sufficient to induce the transcription of the competence state (K-state) regulon. ComK induces the transcription of competence genes, which encode the proteins necessary to form the DNA receptors that recognize and transport extracellular DNA into the cell. Concurrently, DNA repair and recombination systems are upregulated, whereas general transcription, translation, cell division and growth are impaired (van Sinderen et al., 1995; Haijema et al., 2001; Berka et al., 2002; Hamoen et al., 2003; Chen et al., 2005; Hahn et al., 2005, 2015). Thus, the K-state cells are not only able to take up DNA, but also exhibit properties such as growth inhibition that are characteristic of persister-like cellular states (Hahn et al., 2015), and which can confer a survival advantage in the face of antibiotics or other stressors (Yüksel et al., 2016).

In exponentially growing B. subtilis cells, ComK is constantly antagonized by the adaptor protein MecA. MecA not only targets ComK for degradation by ClpCP, but also directly inhibits ComK activity (Dubnau and Roggiani, 1990; Kong and Dubnau, 1994; Turgay et al., 1997, 1998; Persuh et al., 1999). At higher cell density in post-exponential cells, signaling via a quorum sensing system causes the stable phosphorylation of the response regulator ComA, which results in the synthesis of the small protein ComS (D'Souza et al., 1994; Hamoen et al., 1995). ComS competes with ComK for binding to MecA (Prepiak and Dubnau, 2007), which results in the release of ComK from MecA-mediated inhibition and degradation (Turgay et al., 1997, 1998). Since ComK is a positive autoregulatory transcription factor, this release results in the exponential synthesis of ComK in the subpopulation of competence-developing B. subtilis cells. The MecA-dependent retargeting of the abundant ComK protein for ClpCP degradation is essential for the escape from competence (Turgay et al., 1998).

This post-translational regulatory mechanism, in which the activity of an adaptor protein is controlled by the signal-induced synthesis of a small protein that acts like an anti-adaptor protein, was also observed in E. coli for the proteolytic control of the general stress sigma factor σ<sup>S</sup> by the adaptor protein RssB-P (Becker et al., 1999; Bougdour et al., 2006; Hengge, 2009; Battesti and Gottesman, 2013; Battesti et al., 2013; Micevski et al., 2015).
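The anti-adaptor logic of the ComK/MecA/ComS circuit can be illustrated with a deliberately simplified numerical sketch. It assumes a single variable (the ComK level), autoregulated ComK synthesis described by a Hill term, and MecA/ClpCP-dependent degradation in which ComS acts as a competitive inhibitor of ComK binding to MecA. The function name `simulate_comk` and all rate constants are arbitrary illustrative choices; none of them are measured values or taken from the cited studies.

```python
def simulate_comk(coms, k0=0.0, steps=20000, dt=0.01):
    """Integrate a toy ComK model with forward Euler; return the final ComK level."""
    basal = 0.05     # basal ComK synthesis (arbitrary units)
    v_auto = 1.0     # maximal autoregulated ComK synthesis
    k_half = 0.5     # ComK level giving half-maximal autoactivation
    d_max = 2.0      # maximal MecA/ClpCP-dependent degradation rate
    k_m = 0.1        # apparent affinity of MecA for ComK
    k_i = 0.1        # apparent affinity of MecA for the competitor ComS
    dilution = 0.1   # unspecific loss, e.g. dilution by growth

    k = k0
    for _ in range(steps):
        synthesis = basal + v_auto * k**2 / (k_half**2 + k**2)
        # ComS competes with ComK for MecA, raising the apparent K_m for ComK.
        degradation = d_max * k / (k_m * (1.0 + coms / k_i) + k)
        k += dt * (synthesis - degradation - dilution * k)
    return k


low = simulate_comk(coms=0.0)               # exponential growth: no ComS
high = simulate_comk(coms=5.0)              # quorum signal: ComS is made
escaped = simulate_comk(coms=0.0, k0=high)  # ComS gone again: MecA retargets ComK
print(low, high, escaped)
```

With these assumed parameters, ComK stays low without ComS, rises to a high level when ComS sequesters MecA, and falls back once ComS is removed. This mirrors the qualitative entry into and escape from competence described above, but it does not reproduce the subpopulation (bistable) behavior of real cultures.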

#### Sporulation

Endospore formation is a terminal cellular developmental process leading to two different types of cells in a structure termed the sporangium. This event begins with asymmetric cell division, after which the larger mother cell encloses the smaller forespore cell and supports its development into an endospore. This concerted cellular developmental process culminates in the release of the endospore from the lysing mother cell (Rudner and Losick, 2001; Higgins and Dworkin, 2012). The endospore is metabolically inactive and highly resistant to most stressors and environmental extremes (Piggot and Hilbert, 2004). Once the cell has committed to this developmental process, it is irreversible (Dworkin and Losick, 2005). Consequently, half of the progeny will transform into an endospore, whereas the other half will die. It is therefore critical that this process is tightly regulated. Indeed, the decision whether or not to commit to this complex developmental process is controlled by multiple regulatory circuits that integrate several distinct signals (Higgins and Dworkin, 2012). Interestingly, AAA+ protease complexes have several important roles at various stages of this complex decision-making process. The roles of ClpCP, ClpXP and FtsH in sporulation have been elucidated in detail (Pan et al., 2001; Bradshaw and Losick, 2015; Tan et al., 2015).

One of the interesting aspects of sporulation is an asymmetric cell division that results in two unequally sized daughter cells: a smaller forespore and a larger mother cell. Upon asymmetric division, both cells engage specific and distinct gene expression programs that ultimately determine their markedly different fates (Piggot and Hilbert, 2004). The first cell type-specific genetic program is the activation of the alternative sigma factor F in the forespore, which depends on both a partner-switching mechanism involving the anti-sigma factor SpoIIAB and the anti-anti-sigma factor SpoIIAA, and also on the activity of the PP2C phosphatase SpoIIE (Stragier and Losick, 1996).

Sigma F and all factors required for its activation are produced at the onset of sporulation and thus are present in both cell compartments (Gholamhoseinian and Piggot, 1989). For over two decades it was not understood how Sigma F is activated exclusively in the forespore. SpoIIE is the critical controller of the activation of Sigma F: it de-phosphorylates SpoIIAA, which can then activate Sigma F (Stragier and Losick, 1996). Intriguingly, SpoIIE is expressed in both compartments, but the protein is found only in the forespore (Gholamhoseinian and Piggot, 1989). Bradshaw and Losick recently implicated the AAA+ protease FtsH in the compartment-specific regulation of SpoIIE stability during the early stages of sporulation (Bradshaw and Losick, 2015).

They showed that SpoIIE is subject to FtsH-dependent degradation in the mother cell, but is protected from proteolysis in the forespore. This specific stabilization results in the accumulation of active SpoIIE proteins in the forespore that lead to the forespore-specific activation of Sigma F (**Figure 4A**). The stabilization of SpoIIE in the forespore is not linked to differences in FtsH expression or activity in the different compartments.

Normally, SpoIIE is degraded by FtsH upon recognition of an N-terminal degradation tag. However, relocation of SpoIIE from the polar divisome to the cell pole results in stabilization of SpoIIE by a mechanism that is not yet fully understood but seems to involve SpoIIE oligomerization (Bradshaw and Losick, 2015; **Figure 4A**). Nonetheless, the local control of SpoIIE degradation is a great example of how proteolysis can be a crucial regulatory mechanism in the control of cell polarity.
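The compartment-specific activation of Sigma F described above can be written down as a short chain of Boolean implications. This is a minimal sketch under stated assumptions: the class `Compartment`, its field `spoIIE_multimerized`, and both functions are hypothetical names, and the protection of SpoIIE by multimerization is treated as a given although the text notes that this mechanism is not yet fully understood.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compartment:
    name: str
    # Assumption: multimerized SpoIIE hides its degradation tag from FtsH.
    spoIIE_multimerized: bool


def spoIIE_survives_ftsh(c: Compartment) -> bool:
    # FtsH is equally active in both compartments; only multimers escape it.
    return c.spoIIE_multimerized


def sigma_f_active(c: Compartment) -> bool:
    # Stable SpoIIE de-phosphorylates SpoIIAA.
    spoIIAA_dephosphorylated = spoIIE_survives_ftsh(c)
    # Unphosphorylated SpoIIAA binds SpoIIAB, which releases Sigma F.
    sigma_f_held_by_spoIIAB = not spoIIAA_dephosphorylated
    return not sigma_f_held_by_spoIIAB


mother_cell = Compartment("mother cell", spoIIE_multimerized=False)
forespore = Compartment("forespore", spoIIE_multimerized=True)
for cell in (mother_cell, forespore):
    print(cell.name, sigma_f_active(cell))
```

The sketch makes explicit that the only compartment-specific input in this model is the oligomeric state of SpoIIE; every downstream step is identical in both cells.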

Interestingly, FtsH is not the only AAA+ protease that is involved in the control of Sigma F activity. It has been shown that the ClpCP protease is responsible for the degradation of the anti-sigma factor SpoIIAB (Pan et al., 2001). Under normal growth conditions, SpoIIAB interacts with and thus inactivates Sigma F (Duncan and Losick, 1993). This interaction is also thought to stabilize SpoIIAB. Upon de-phosphorylation of the anti-anti-sigma factor SpoIIAA by SpoIIE, Sigma F is liberated (Stragier and Losick, 1996) and SpoIIAB is subject to ClpCP-dependent degradation (Pan et al., 2001). Although this proteolytic mechanism is not directly involved in the activation of Sigma F, it is required to maintain the stability of free Sigma F (Pan et al., 2001). Targeting of SpoIIAB for ClpCP-dependent degradation is enabled by the presence of the C-terminal amino acid sequence LCN (Pan et al., 2001; Pan and Losick, 2003). Interestingly, none of the described ClpC adaptors are involved in the proteolysis of SpoIIAB, which implicates a hitherto unidentified adaptor or molecular mechanism in this process (Kirstein et al., 2009b). Since artificially LCN-tagged proteins are also subject to degradation during exponential growth (Pan and Losick, 2003), it is unlikely that this process depends on a sporulation-specific adaptor protein (**Figure 4B**).

Regulated proteolysis is also involved in the control mechanisms ensuring proper spore formation. The ClpXP protease, together with the adaptor protein CmpA, is involved in the quality control of the spore envelope. In cells that produce spores with a proper spore envelope, CmpA is degraded through ClpXP-dependent proteolysis and sporulation continues. However, in cells that display defects in spore envelope maturation, CmpA is stabilized and mediates ClpXP-dependent degradation of the coat morphogenetic protein SpoIVA. This proteolytic event causes instability and subsequent lysis of the spore, thereby ensuring that only properly assembled spores are produced within the population. The presence of ClpXP and CmpA is required but not sufficient for degradation of SpoIVA and also of CmpA itself. The proteolytic activity of this regulatory circuit depends on the presence of a specific signal or component that is under the control of the cell type-specific Sigma K. However, the nature of this signal or component is unclear and requires further investigation (Tan et al., 2015; **Figure 4C**).

The three mechanisms described above are examples of how regulated protein degradation is involved in the control of sporulation. In addition, evidence exists that AAA+ proteases and their associated proteolytic events play even more roles in the control of sporulation. A recent global high-throughput genetic screen highlighted the pleiotropic function of ClpC in the control of sporulation. Meeske and colleagues showed that cells lacking clpC had a dramatic defect in sporulation efficiency and displayed different phenotypes, such as delayed entry, asymmetric engulfment, reduced or no Sigma G activity and a concomitant small forespore phenotype (Meeske et al., 2016). This observation suggests that ClpC is specifically involved in the control of distinct but as yet unknown regulatory events during sporulation.

#### Motility and Biofilm Formation

A first analysis of B. subtilis strains with clpC, clpX, or clpP mutations suggested that these genes are important for swimming motility (Rashid et al., 1996; Liu and Zuber, 1998; Msadek et al., 1998). It was demonstrated that ClpCP and ClpXP enable motility via regulatory proteolysis of the transcription factors ComK, DegU and Spx, which directly or indirectly influence the transcription of flagellar genes (Liu and Zuber, 1998; Ogura and Tsukahara, 2010; Molière et al., 2016).

Interestingly, B. subtilis cells can switch from swimming to swarming motility on surfaces, which is accompanied by a hyperflagellation of the swarming cells (Kearns, 2010). The transcriptional activator SwrA determines the number of flagella in B. subtilis cells (Mukherjee and Kearns, 2014). This transition is controlled by regulated proteolysis of SwrA, which in swimming cells is targeted by the adaptor protein SmiA for LonA-dependent degradation (Mukherjee et al., 2015).

The transformation of B. subtilis cells from the motile to the sessile state depends on the presence of the SlrR regulatory protein. In the SlrR low state, motility and autolysis genes are expressed. In contrast, in the SlrR high state, SlrR together with SinR represses motility and autolysis genes, resulting in long chains of sessile cells and biofilm formation. The induction of SlrR expression is well understood and depends on a complex three-protein regulatory circuit (Chai et al., 2010b; Norman et al., 2013). Interestingly, the switch from the SlrR high state to the motile SlrR low state depends on the controlled degradation of SlrR. It is not clear how SlrR is degraded, but it is known that a LexA-like auto-cleavage of SlrR is involved in SlrR stability. It was also shown that the AAA+ protease ClpCP influences the stability of SlrR, but the precise molecular mechanisms have not yet been described (Chai et al., 2010a).

FIGURE 4 | Regulation by proteolysis during sporulation. (A) Model for the controlled degradation of SpoIIE by FtsH. In normal cells and in the mother cell after asymmetric division, monomeric SpoIIE accumulates at the divisome and is rapidly degraded by FtsH, which recognizes SpoIIE through an N-terminal tag (red). This leads to the stabilization of phosphorylated SpoIIAA (AA-P) and in turn to the inactivation of Sigma F (σ<sup>F</sup>) by SpoIIAB (AB). In the forespore, SpoIIE is enriched due to the close proximity to the division sites, which favors transfer of SpoIIE to the smaller forespore. The high concentration of SpoIIE promotes multimerization, in which the tag sequence is buried within the multimeric complex. This protects SpoIIE from FtsH-dependent proteolysis and leads to SpoIIE-dependent de-phosphorylation of SpoIIAA (AA), which in its unphosphorylated form can interact with SpoIIAB, thereby freeing Sigma F and resulting in the cell type-specific activation of Sigma F. (B) Model for the control of Sigma F. The kinase SpoIIAB (AB) is able to phosphorylate SpoIIAA (AA-P), which allows SpoIIAB to bind and inactivate Sigma F. Once the SpoIIE phosphatase (IIE) is activated, SpoIIAA becomes de-phosphorylated, leading to the binding of SpoIIAB and the activation of Sigma F. To prevent further phosphorylation of SpoIIAA, SpoIIAB is targeted by ClpCP for degradation, which shifts the equilibrium toward unphosphorylated SpoIIAA. (C) Model for the CmpA-dependent control of spore integrity. In spores with proper coat formation, CmpA is targeted by ClpXP and SpoIVA is stabilized, resulting in functional spore formation. In contrast, in cells with spores that display a defective coat, CmpA mediates degradation of SpoIVA, which also depends on so far unknown factors controlled by Sigma K. This regulatory process results in cell lysis, preventing spore development from proceeding.

# RELEVANCE OF *B. SUBTILIS* AAA+ PROTEASE COMPLEXES AS A NEW TARGET FOR ANTIBIOTICS AND FOR TARGETING VIRULENCE IN GRAM-POSITIVE PATHOGENS

Understanding the processes that determine stability and degradation of regulatory proteins under different environmental conditions in a model organism such as B. subtilis can provide important information that holds true for other bacterial species. AAA+ protease complexes mediate numerous essential aspects of bacterial physiology and are widely conserved among bacteria (Kirstein et al., 2009b; Sauer and Baker, 2011). They therefore represent promising targets for the development of novel antimicrobial therapies that are urgently needed to combat the rise in antibiotic resistance in pathogenic bacterial species (Raju et al., 2012; Culp and Wright, 2016). While it is estimated that up to 10% of pursued targets for drug development are proteases, therapeutics targeting bacterial proteolytic complexes are comparatively underrepresented (Drag and Salvesen, 2010).

AAA+ protease complexes are especially attractive as potential targets for novel antimicrobial therapies as they are essential for virulence in several pathogenic bacteria (Butler et al., 2006; Culp and Wright, 2016; Malik and Brötz-Oesterhelt, 2017). Because virulence is not generally essential for basic growth, the inhibition of virulence is believed to impose a lower evolutionary pressure on the pathogen. Therefore, AAA+ protease complex-targeted therapeutics might be less likely to induce resistance and might represent a more durable anti-infective strategy (Rasko and Sperandio, 2010). Furthermore, adverse effects arising from modulation of the activity of human AAA+ protease complex homologs are unlikely because of their low resemblance to the bacterial proteins (Raju et al., 2012). Another favorable feature of the large, multimeric AAA+ protease complexes as potential targets for antimicrobials is the multitude of different activities and active sites that could be targeted by small molecules. Therefore, it is not surprising that AAA+ protease complex modulators, in contrast to well-established antibiotics, have substantially different mechanisms of action.

One class of AAA+ protease complex modulators, the acyldepsipeptides (ADEPs), was shown to exhibit an inhibitory effect on growth of several Gram-positive organisms, including Staphylococci and Streptococci by interacting with and dysregulating ClpP (Brötz-Oesterhelt et al., 2005). The molecular mechanism of ADEP activity was later investigated in more detail in a B. subtilis model, where it was shown that ADEPs influence ClpP activity in two ways. Firstly, they prevent ClpP from associating with its corresponding ATPase. This inhibits formation of the complete protease complex responsible for regulated proteolysis. Secondly, ADEPs enable ClpP to degrade unfolded proteins, making it independent from its ATPase and thereby deregulating substrate specificity (Kirstein et al., 2009a; Lee et al., 2010). It was later shown that ADEP4 kills Staphylococcus aureus persister cells by triggering indiscriminate, ClpP-mediated degradation of over 400 proteins (Conlon et al., 2013), including for example the cell division protein FtsZ (Sass et al., 2011). ClpP is not essential in S. aureus, but mutants lacking clpP were shown to be more susceptible to a range of other antibiotics. This suggests that ClpP reprogramming by ADEP4 in combination with other antibiotics may represent a possible strategy to eliminate persister cells (Conlon et al., 2013).

The working mechanism of ADEPs relies on both dysregulation of ClpP and disruption of the protease complex. Other natural compounds such as cyclomarin, ecumicin, and lassomycin, all of which bind to the N-terminal domain of the Mycobacterium tuberculosis chaperone ClpC1, were recently discovered. While the exact mode of action is still to be discovered, it was suggested that binding of the N-terminal domain of ClpC1 by ecumicin or lassomycin leads to inhibition of degradation of natural substrates, which would eventually lead to accumulation of proteins and toxicity (Gavrish et al., 2014; Gao et al., 2015; Culp and Wright, 2016). For cyclomarin, alteration of substrate specificity or structural changes that result in a more accessible axial pore of the protease complex were discussed. These hypotheses were based on the observation that the cyclomarin binding region at the N-terminal domain of ClpC1 overlaps with the site corresponding to the MecA interaction site on the NTD of B. subtilis ClpC (Schmitt et al., 2011; Vasudevan et al., 2013; Culp and Wright, 2016; Malik and Brötz-Oesterhelt, 2017).

Various questions regarding the mechanism behind antibacterial activity of these newly identified compounds targeting the NTD of AAA+ proteins remain unanswered (Culp and Wright, 2016; Malik and Brötz-Oesterhelt, 2017). Advancing the knowledge of AAA+ proteases in the B. subtilis model will not only help to understand how these promising targets for novel antimicrobial therapies against pathogenic bacteria work, but will also help to unravel the molecular mechanism of these antibiotics. In addition, understanding the molecular mechanism of the AAA+ protease complexes in B. subtilis will help us to understand the mechanism of these molecular machines during virulence. AAA+ proteases contribute to virulence in two distinct ways. Firstly, they play a crucial role in removal of misfolded proteins that are formed under unfavorable environmental conditions. Secondly, proteases have been shown to contribute to virulence by controlling the abundance of regulatory proteins and transcription factors in response to diverse stimuli encountered during infection (Ingmer and Brøndsted, 2009). In Gram-negative organisms, several proteases of the AAA+ family contribute to virulence, while in Gram-positive bacteria, the involvement of AAA+ protease complexes exceeds the involvement of any other protease family (Ingmer and Brøndsted, 2009). In Listeria monocytogenes for example, ClpP was shown to regulate the expression of an essential virulence factor (Listeriolysin), the multiplication of the pathogen within macrophages, and the transcription of an actin-polymerizing protein (ActA) that is required for cell-to-cell spread (Gaillot et al., 2000). Additionally, the ClpCP-MecA complex was implicated in the downregulation of the surface virulence-associated protein, SvpA (Borezée et al., 2001). MecA was first described in B. subtilis as an adaptor protein for specific substrate recognition by ClpCP (Turgay et al., 1998). These examples support the notion that B. subtilis is a useful model organism for the study of the role of AAA+ protease complexes.

#### CONCLUSION

The various AAA+ protease complexes of the Gram-positive model organism B. subtilis are involved in many cellular processes, ranging from protein homeostasis and protein quality control to stress response pathways and the control of cellular developmental processes. Adaptor proteins play an important role in substrate recognition during both general and regulatory proteolysis (Jenal and Hengge-Aronis, 2003; Kirstein et al., 2009b; Battesti and Gottesman, 2013; Joshi and Chien, 2016; Kuhlmann and Chien, 2017). More recently, a new protein modification mediated by the ClpC adaptor protein and protein arginine kinase McsB was discovered in B. subtilis (Fuhrmann et al., 2009). The possible role and function of this unusual protein modification (Mijakovic et al., 2016) is an area of active investigation (Elsholz et al., 2012; Fuhrmann et al., 2016; Trentini et al., 2016).

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

Work in the Laboratory of KT was supported by the Deutsche Forschungsgemeinschaft. Work in the Laboratory of EC was supported by the Göran Gustafsson Foundation (Göran Gustafsson Prize), Umeå University, the Max Planck Foundation and the Max Planck Society. The authors want to thank Christina Gross for critical reading and many helpful comments on the manuscript.

#### REFERENCES

factor that interacts with the turnover element in RpoS. Proc. Natl. Acad. Sci. U.S.A. 96, 6439–6444. doi: 10.1073/pnas.96.11.6439


subtilis. Proc. Natl. Acad. Sci. U.S.A. 90, 2325–2329. doi: 10.1073/pnas.90.6.2325


control, and Spx-DNA contact at a conserved cis-acting element. J. Bacteriol. 195, 3967–3978. doi: 10.1128/JB.00645-13


C-terminal domain of the RNA polymerase alpha subunit. PLoS ONE 5:e8664. doi: 10.1371/journal.pone.0008664



pleiotropic transcription factor Spx in Bacillus subtilis. Nucleic Acids Res. 40, 9571–9583. doi: 10.1093/nar/gks755


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Elsholz, Birk, Charpentier and Turgay. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# AAA+ Machines of Protein Destruction in Mycobacteria

#### Adnan Ali H. Alhuwaider and David A. Dougan\*

*Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia*

The bacterial cytosol is a complex mixture of macromolecules (proteins, DNA, and RNA), which collectively are responsible for an enormous array of cellular tasks. Proteins are central to most, if not all, of these tasks and as such their maintenance (commonly referred to as protein homeostasis or proteostasis) is vital for cell survival during normal and stressful conditions. The two key aspects of protein homeostasis are (i) the correct folding and assembly of proteins (coupled with their delivery to the correct cellular location) and (ii) the timely removal of unwanted or damaged proteins from the cell, which are performed by molecular chaperones and proteases, respectively. A major class of proteins that contributes to both of these tasks is the AAA+ (ATPases associated with a variety of cellular activities) protein superfamily. Although much is known about the structure of these machines and how they function in the model Gram-negative bacterium *Escherichia coli*, we are only just beginning to discover the molecular details of these machines and how they function in mycobacteria. Here we review the different AAA+ machines that contribute to proteostasis in mycobacteria. We focus primarily on recent advances in the structure and function of AAA+ proteases, the substrates they recognize, and the cellular pathways they control. Finally, we discuss recent developments related to these machines as novel drug targets.

Edited by:

*Walid A. Houry, University of Toronto, Canada*

#### Reviewed by:

*Franz Narberhaus, Ruhr University Bochum, Germany*

*Michal Zolkiewski, Kansas State University, United States*

*Cordula Enenkel, University of Toronto, Canada*

> \*Correspondence: *David A. Dougan d.dougan@latrobe.edu.au*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

> Received: *28 May 2017* Accepted: *27 June 2017* Published: *19 July 2017*

#### Citation:

*Alhuwaider AAH and Dougan DA (2017) AAA+ Machines of Protein Destruction in Mycobacteria. Front. Mol. Biosci. 4:49. doi: 10.3389/fmolb.2017.00049*

Keywords: AAA+ protease complexes, protein degradation, Mycobacterium, novel drug targets, proteasome

#### TUBERCULOSIS

Tuberculosis (TB) is a devastating disease that currently affects approximately one third of the world's population. Each year TB is responsible for over 1 million deaths, with almost 10 million new cases being diagnosed. The disease is caused by a single pathogen, Mycobacterium tuberculosis (Mtb), and although it is eminently curable, the inappropriate administration of drugs has led to the emergence of several drug-resistant strains, which are increasingly difficult to eradicate. Most recently, a totally drug-resistant (TDR) strain of Mtb has emerged which, as the name suggests, is resistant to all available drugs for the treatment of TB. Hence, there is an urgent need to develop new drugs that target novel pathways within these resistant strains. An emerging approach is the targeting of proteases.

#### AAA+ PROTEASES IN MYCOBACTERIA

Protein degradation is a fundamental cellular process that controls the irreversible removal of proteins from the cell. Given the definitive nature of this process, the machines that control protein turnover in the cell must be tightly regulated to prevent the unwanted turnover of normal cellular proteins. At the same time, these proteases need to permit not only the broad recognition of damaged proteins, but also the precise recognition of specific regulatory proteins in a timely fashion. In bacteria, this is achieved by a collection of proteolytic machines (together with their cofactors), which mediate the explicit recognition of a diverse set of protein substrates. Not surprisingly, proteases have been identified as important drug targets and the dysregulation of these machines has been demonstrated to kill both dormant and actively dividing cells (Brotz-Oesterhelt et al., 2005; Conlon et al., 2013). Mycobacteria such as Mtb [and Mycobacterium smegmatis (Msm), a close non-pathogenic relative of Mtb] are rod-shaped, acid-fast staining bacteria that retain characteristics of both Gram-positive and Gram-negative bacteria and as such they contain a somewhat unique composition of proteins. In mycobacteria, protein turnover in the cytosol is mediated by at least four different ATP-dependent machines (**Figure 1**), several of which are essential (Sassetti et al., 2003; Raju et al., 2014). Broadly speaking, these machines can be arranged into two groups: (i) the bacterial-like proteases [which include FtsH and Lon as well as the Casein lytic protein (Clp) proteases ClpC1P and ClpXP] and (ii) the eukaryotic-like proteasome. They are typically composed of two components: a barrel-shaped peptidase that is capped, at one or both ends, by a ring-shaped unfoldase (**Figure 2**). Invariably the unfoldase component belongs to the AAA+ (ATPases associated with a variety of cellular activities) superfamily and as such they are commonly referred to as AAA+ proteases (Sauer and Baker, 2011; Gur et al., 2013). Although a few of these machines (e.g., FtsH and Lon) contain both components on a single polypeptide, most machines (e.g., ClpC1P, ClpXP, and Mpa-20S) contain each component on separate polypeptides. The steps in the degradation pathway of these machines are generally conserved (**Figure 2**). In the first step, the substrate is either directly engaged by the unfoldase, or indirectly engaged by an adaptor protein before it is delivered to the unfoldase. Regardless of the initial mode of contact, substrate engagement by the unfoldase is generally mediated by specialized accessory domains and/or specific loops, located at the distal end of the machine (**Figure 2**). Following this step, the substrate is translocated through the central pore of the unfoldase (in an ATP-dependent manner) into the proteolytic chamber of the associated peptidase, where the substrate is cleaved into small peptide fragments. Interestingly, in some cases these peptidases are also activated for the energy-independent turnover of specific protein substrates, through the interaction with non-AAA+ components (Bai et al., 2016; Bolten et al., 2016). These nucleotide-independent components facilitate substrate entry into the proteolytic chamber by opening the gate into the peptidase; as such, we refer to them as gated dock-and-activate (GDA) proteases. Although this group of proteases is not the focus of this review, we will discuss them briefly (see later).

FIGURE 1 | Linear cartoon of the different AAA+ proteins in mycobacteria, illustrating the position of various domains and motifs. The AAA+ domains belong to either the classic (light blue) or HCLR (dark blue) clade. Each AAA+ domain contains a consensus sequence for ATP binding (GX4GKT/S, where X is any amino acid) and hydrolysis (hDD/E, where h is any hydrophobic amino acid) known as the Walker A (A) and Walker B (B) motifs, respectively. Most AAA+ proteins contain a unique accessory domain, such as the zinc-binding domain (ZBD, in pink) in ClpX, the Clp N-terminal domain (orange) in ClpC1 and ClpB, the Lon SB (substrate binding) domain (green) in Lon, the α-helical (yellow) and OB/ID (pink) domains in Mpa, the p97 N-terminal domain (black) in *Msm*0858 and the Tetratricopeptide (TPR)-like domain (gray) in VCP-1. ClpC1 and ClpB also contain a middle (M) domain (yellow) located between the first and second AAA+ domain. The membrane-bound AAA+ protein, FtsH, contains two transmembrane domains (black bars) separated by an extracellular domain (ECD, in white) and a C-terminal metallopeptidase (M41 peptidase) domain (red) containing the consensus sequence (HEXGH). Lon contains an N-terminal substrate binding (Lon SB) domain, a central AAA+ domain and a C-terminal serine (S16) peptidase domain (red) with the catalytic dyad (S, K). All cartoons are derived from the sequences for the following *M. smegmatis* proteins: ClpX (A0R196), ClpC1 (A0R574), FtsH (A0R588), Lon (O31147), Mpa (A0QZ54), ClpB (A0QQF0), p97/*Msm*0858 (A0QQS4), VCP-1/*Msm*1854 (A0QTI2). Domains (and domain boundaries) were defined by InterPro (EMBL-EBI) as follows: AAA+ (IPR003593); C4-type Zinc finger (IPR010603); Clp N-terminal (IPR004176); UVR or M (IPR001943); Lon SB (substrate binding) (IPR003111); p97 N-terminal (IPR003338); p97 OB/ID (IPR032501); Tetratricopeptide (TPR)-like (IPR011990); S16 protease (IPR008269); M41 protease (IPR000642).
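The Walker A and Walker B consensus sequences quoted in the Figure 1 legend (GX4GKT/S and hDD/E) can be expressed as simple pattern searches. The following is a minimal sketch, not the method used to define the domains in the figure (those came from InterPro). The set of residues treated as hydrophobic, the function name, and the toy sequence are all assumptions made for illustration.

```python
import re

# Walker A consensus from the Figure 1 legend: G-X4-G-K-[T/S].
WALKER_A = re.compile(r"G.{4}GK[TS]")

# Walker B consensus from the legend: h-D-[D/E]. The set of residues
# counted as hydrophobic ("h") is an assumption made for this sketch.
HYDROPHOBIC = "AVLIMFWY"
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]D[DE]")


def find_motifs(sequence):
    """Return 0-based start positions and matched text for each motif."""
    seq = sequence.upper()
    return {
        "walker_a": [(m.start(), m.group()) for m in WALKER_A.finditer(seq)],
        "walker_b": [(m.start(), m.group()) for m in WALKER_B.finditer(seq)],
    }


# Invented toy sequence (not a real protein), used only to exercise the scan.
TOY_SEQUENCE = "MKLLGPPGTGKTALAKIVLDEAD"
hits = find_motifs(TOY_SEQUENCE)
```

Because the three-residue Walker B pattern is very permissive, a scan like this will likely report many spurious hits in a real sequence; in practice a match is only meaningful when it falls inside an annotated AAA+ domain, downstream of a Walker A motif.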

#### THE Clp PROTEASE(S)

The Clp protease is a large multi-subunit complex composed of a barrel-shaped peptidase (ClpP) flanked on either or both ends by a hexameric AAA+ unfoldase (ClpX or ClpC1). Interestingly, in contrast to most bacteria, the Clp protease is essential in Mtb, not only for virulence but also for cell viability (Sassetti et al., 2003; Carroll et al., 2011; Raju et al., 2012). It is also essential for viability in Msm, indicating that beyond its role in virulence, the Clp protease plays a crucial role in "general" proteostasis. Consistently, the Clp protease is responsible for regulation of various stress responses in both Mtb (Barik et al., 2010; Raju et al., 2014) and Msm (Kim et al., 2009), as well as the turnover of incomplete translation products that have been co-translationally tagged with the SsrA sequence (Raju et al., 2012; Personne et al., 2013).

#### Processing and Activation of the Peptidase (ClpP)

The peptidase component of the Clp protease—ClpP, is composed of 14 subunits, arranged into two heptameric rings stacked back-to-back. The active site residues of ClpP are sequestered inside the barrel-shaped oligomer away from the cytosolic proteins. Entry into the catalytic chamber is restricted to a narrow entry portal at either end of the barrel. Although the overall architecture of these machines is broadly conserved (across most bacterial species), the composition and assembly of the ClpP complex from mycobacteria is atypical. In contrast to most bacteria, mycobacteria contain two ClpP homologs (ClpP1 and ClpP2), both of which form homo-heptameric ring-shaped oligomers. Although these homo-oligomers can assemble into both homo- and hetero-tetradecamers, only the hetero-oligomeric complexes (composed of a single ring of each subunit) exhibit catalytic activity in vitro (Akopian et al., 2012; Schmitz et al., 2014) (**Figure 3**). Unexpectedly, the in vitro activity of this complex was also dependent on the presence of a novel dipeptide activator—benzyloxycarbonyl-leucyl-leucine [z-LL] and each ring of the active complex displays unique specificity (Akopian et al., 2012; Personne et al., 2013; Li et al., 2016).
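The ring combinations described above can be enumerated explicitly. The sketch below is illustrative bookkeeping only: the function and field names are hypothetical, and the activity flag simply restates the in vitro observation reported in the text (which also depended on the presence of the dipeptide activator).

```python
from itertools import combinations_with_replacement

SUBUNITS_PER_RING = 7  # each ClpP ring is a heptamer


def describe_tetradecamers(homologs=("ClpP1", "ClpP2")):
    """Enumerate two-ring assemblies and flag those reported active in vitro."""
    assemblies = []
    for ring_a, ring_b in combinations_with_replacement(homologs, 2):
        assemblies.append({
            "rings": (ring_a, ring_b),
            "subunits": 2 * SUBUNITS_PER_RING,
            # Per the text, only the mixed complex (one ring of each
            # homolog) shows catalytic activity in vitro.
            "active_in_vitro": ring_a != ring_b,
        })
    return assemblies


assemblies = describe_tetradecamers()
```

Of the three possible two-ring assemblies, only the ClpP1/ClpP2 pairing is flagged as active, which mirrors the statement that homo-tetradecamers can form but are catalytically inactive.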

Similar to E. coli ClpP (EcClpP), both Mtb ClpPs (ClpP1 and ClpP2) are expressed as proproteins. However, in contrast to EcClpP (in which the propeptide is autocatalytically processed), the processing of both Mtb ClpPs appears to occur in a sequential fashion, possibly via an in trans mechanism. Specifically, the propeptide of MtbClpP2 is initially processed by the active sites of MtbClpP1, before propeptide cleavage of MtbClpP1 can occur (Leodolter et al., 2015). Currently, however, it remains unclear if cleavage of the MtbClpP1 propeptide also occurs in trans (via the active site residues of MtbClpP2) or simply requires interaction with "active" processed MtbClpP2 for autocatalytic processing. Consistent with the in trans processing observed for the MtbClpP1P2 complex, MsmClpP2 also appears to be processed by the catalytic residues of MsmClpP1; however, the precise location of this processing event remains uncertain (Akopian et al., 2012). Likewise, it remains unclear if MsmClpP1 contains a propeptide, as the in vitro processing of MsmClpP1 has yet to be observed (Benaroudj et al., 2011; Akopian et al., 2012; Leodolter et al., 2015). Additional experiments are still required to fully understand the mechanism of processing and activation of this complex.

Recently the crystal structure of MtbClpP1P2, in complex with an alternative activator (z-IL) and the ClpP-specific dysregulator (acyldepsipeptide, ADEP, see later), was solved to 3.2 Å (Schmitz et al., 2014). This structure (in comparison to the inactive MtbClpP1P1 complex) provided a detailed understanding of how the hetero-oligomeric complex is assembled and activated (Ingvarsson et al., 2007; Schmitz et al., 2014). Notably, the MtbClpP1P2 structure is formed by a single homo-oligomeric ring of each subunit, the shape (and dimensions) of which is significantly different to that of the inactive ClpP1 homo-oligomer (Ingvarsson et al., 2007; Schmitz et al., 2014). The active complex forms an "extended" conformation (∼93 Å high × 96 Å wide), which is stabilized by the complementary docking of an aromatic side-chain (Phe147) on the ClpP1 handle into a pocket on the handle of ClpP2 (Schmitz et al., 2014). This docking switches the catalytic residues of both components into the active conformation. By contrast, the ClpP1 tetradecamer, which lacks this complementary handle recognition, is compressed (∼10 Å flatter and wider) and as a result the catalytic residues are distorted from their active conformation (**Figure 3**). This structure also revealed that the peptide "activator" was bound in the substrate binding pocket (of all 14 subunits), albeit in the reverse orientation of a bona fide substrate (Schmitz et al., 2014). This provided a structural explanation for why high concentrations of the activator inhibit protease activity (Akopian et al., 2012; Famulla et al., 2016). Significantly, the MtbClpP1P2 structure also established that the ClpP-dysregulator (ADEP) only interacts with a single ring of the complex (namely MtbClpP2). Interestingly, despite docking to a single ring, ADEP triggered pore opening of both rings of the complex (the cis ring to 25 Å and the trans ring to 30 Å). This simultaneous opening of both pores is thought not only to facilitate translocation of substrates into the chamber, but also to promote the efficient egress of the cleaved peptides (**Figure 3**). Consistent with the asymmetric docking of ADEP to the MtbClpP1P2 complex, Weber-Ban and colleagues recently demonstrated that both unfoldase components (MtbClpC1 and MtbClpX) also only dock to MtbClpP2, generating a truly asymmetric Clp-ATPase complex (Leodolter et al., 2015). This asymmetric docking of both unfoldase components appears to be driven by the presence of an additional Tyr residue within the hydrophobic pocket of ClpP1, which prevents unfoldase-docking to this component. The reason for this asymmetry is currently unclear, although one possibility is that an alternative component docks to the "shallow" hydrophobic pocket of ClpP1, thereby expanding the substrate repertoire of the peptidase. Consistent with this idea, an ATP-independent activator of the ClpP protease has recently been identified in Arabidopsis thaliana (Kim et al., 2015).

Although the Clp protease is essential in mycobacteria, only a handful of substrates have been identified. The currently known Clp protease substrates include aborted translation products tagged with the SsrA sequence, the anti-sigma factor RseA, and several transcription factors, WhiB1, CarD, and ClgR (Barik et al., 2010; Raju et al., 2012, 2014; Yamada and Dick, 2017). Of the known substrates, only RseA has been extensively characterized. In this case, phosphorylation of RseA (on Thr39) triggers its specific recognition by the unfoldase, MtbClpC1 (Barik et al., 2010). This phosphorylation-dependent recognition of RseA is reminiscent of substrate recognition by ClpC from Bacillus subtilis (BsClpC), which is also responsible for the recognition of phosphoproteins, albeit in this case proteins that are phosphorylated on Arg residues (Kirstein et al., 2005; Fuhrmann et al., 2009; Trentini et al., 2016). Interestingly, both BsClpC and MtbClpC1 also recognize the phosphoprotein casein, which is often used as a model unfolded protein. However, it currently remains to be seen if MtbClpC1 specifically recognizes phosphorylated Thr residues (i.e., pThr) or whether phosphorylation simply triggers a conformation change in the substrate. Likewise, it remains to be determined if misfolded proteins are generally targeted for degradation by ClpC1 in vivo or whether this role falls to alternative AAA+ proteases in mycobacteria. In contrast to RseA (which contains an internal phosphorylation-induced motif), the remaining Clp protease substrates contain a C-terminal degradation motif (degron). Based on the similarity of the C-terminal sequence of each substrate to known EcClpX substrates (Flynn et al., 2003), we speculate that these substrates (with the exception of WhiB1) are likely to be recognized by the unfoldase ClpX. Significantly, the turnover of both transcription factors (WhiB1 and ClgR) is essential for Mtb viability.

#### Potential Adaptor Proteins of ClpC1 and ClpX

As illustrated in **Figure 2**, substrate recognition by AAA+ proteases is generally mediated by the AAA+ unfoldase component; however, in some cases this may be facilitated by an adaptor protein (Kirstein et al., 2009b; Kuhlmann and Chien, 2017). Adaptor proteins are generally unrelated in sequence or structure. Invariably they recognize a specific substrate (or class of substrates), which is delivered to their cognate unfoldase by docking to an accessory domain of the unfoldase. In some cases, adaptor docking not only delivers the substrate to the unfoldase, but also activates the unfoldase for substrate recognition (Kirstein et al., 2005; Rivera-Rivera et al., 2014). In the case of ClpX, most known adaptor proteins dock onto the N-terminal Zinc binding domain (ZBD). Despite the conserved nature of this accessory domain in ClpX, across a broad range of bacterial species, a ClpX adaptor protein has yet to be identified (either biochemically or bioinformatically) in mycobacteria. Nevertheless, given that most of the ClpX adaptor proteins that have been identified in bacteria are associated with specialized functions of that species, we speculate that mycobacteria have evolved a unique ClpX adaptor (or set of adaptors) that is unrelated to the currently known ClpX adaptors. In contrast to ClpX, mycobacteria are predicted to contain at least one ClpC1-specific adaptor protein, ClpS. In E. coli, ClpS is essential for the recognition of a specialized class of protein substrates that contain a destabilizing residue (i.e., Leu, Phe, Tyr, or Trp) at their N-terminus (Dougan et al., 2002; Erbse et al., 2006; Schuenemann et al., 2009). These proteins are degraded either by ClpAP (in Gram-negative bacteria) or ClpCP (in cyanobacteria) via a conserved degradation pathway known as the N-end rule pathway (Varshavsky, 2011). Although most of the substrate binding residues in mycobacterial ClpS are conserved with E. coli ClpS (EcClpS), some residues within the substrate binding pocket have been replaced and hence it will be interesting to determine the physiological role of mycobacterial ClpS and whether this putative adaptor protein exhibits an altered specificity in comparison to EcClpS.
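The N-end rule check described for EcClpS amounts to a test on the first residue of the mature protein. The sketch below encodes only the four destabilizing residues listed in the text; the function name is hypothetical, and the residue set is passed as a parameter precisely because the specificity of mycobacterial ClpS has not been established.

```python
# Destabilizing N-terminal residues recognized by E. coli ClpS, as listed
# in the text: Leu, Phe, Tyr, Trp (one-letter codes).
ECOLI_CLPS_DESTABILIZING = frozenset("LFYW")


def has_n_degron(mature_sequence, destabilizing=ECOLI_CLPS_DESTABILIZING):
    """True if the first residue of the mature protein is destabilizing."""
    if not mature_sequence:
        raise ValueError("sequence must not be empty")
    return mature_sequence[0].upper() in destabilizing
```

For example, `has_n_degron("LAKT")` is true while `has_n_degron("MAKT")` is false. The input is assumed to be the processed N-terminus as it exists in the cell, not the primary translation product.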

#### FtsH

FtsH is an 85 kDa, membrane-bound Zn metalloprotease. Its N-terminal region comprises an extracytoplasmic domain (ECD) flanked on either side by a transmembrane (TM) region (**Figure 1**). The TM regions tether the protein to the inner membrane, placing the ECD in the "pseudoperiplasmic" space (Hett and Rubin, 2008). The remaining domains (the AAA+ domain and M41 peptidase domain) are located within the cytosol. To date the function of FtsH is poorly understood in mycobacteria, and currently it is unclear if ftsH is indeed an essential gene (Lamichhane et al., 2003; Sassetti et al., 2003). Nevertheless, based on complementation experiments in an E. coli ftsH mutant strain, it appears that MtbFtsH shares an overlapping substrate specificity with EcFtsH, as it can recognize both cytosolic proteins (such as transcription factors and SsrA-tagged proteins) and membrane-bound proteins (such as SecY). Hence MtbFtsH is proposed to play a role in general protein quality control, stress response pathways, and protein secretion (Srinivasan et al., 2006). It is also proposed to play a crucial role in cell survival, as it is reported to be transcriptionally upregulated in response to agents that produce reactive oxygen intermediates and reactive nitrogen intermediates (RNIs) in macrophages (Kiran et al., 2009).

#### Lon

Lon is a broadly conserved AAA+ protease which, although absent from Mtb, is present in several mycobacterial species, including Msm (Knipfer et al., 1999). In Msm, Lon is an 84 kDa protein composed of three domains: an N-terminal domain, which is generally required for substrate engagement, a central AAA+ domain and a C-terminal S16 peptidase domain (**Figure 1**). The physiological role of mycobacterial Lon is currently unknown and to date no physiological substrates have been identified. Despite the lack of known physiological substrates, MsmLon, like many Lon homologs, can recognize and degrade the model unfolded protein, casein (Rudyak and Shrader, 2000; Bezawork-Geleta et al., 2015). Based largely on the identification of casein as a model substrate, MsmLon is predicted to be linked to the removal of unwanted misfolded proteins from the cell. Interestingly, in E. coli, Lon also plays a crucial role in the regulation of persistence, through the activation of several Toxin-Antitoxin (TA) systems (Maisonneuve et al., 2013). Although Msm only contains a few TA systems, MsmLon is expected to play a similar role to its E. coli counterpart. Surprisingly, Mtb lacks Lon but contains almost 100 TA systems (Sala et al., 2014). Hence it will be intriguing to determine how these different TA systems are activated in Mtb and which, if any, of the known AAA+ proteases contribute to this process. Nevertheless, the activity of MsmLon appears to be highly regulated, as MsmLon, in addition to its catalytic peptidase site, also contains two allosteric polypeptide binding sites (Rudyak and Shrader, 2000). Based on a series of in vitro experiments, it appears that the activity of MsmLon is linked to its oligomerization; however, in contrast to most AAA+ proteins, the oligomerization of MsmLon is proposed to be mediated not by ATP levels, but rather by the concentration of Mg<sup>2+</sup> and the level of "unfolded" protein. These findings suggest that the in vivo activity of Lon is tightly controlled by the presence of available substrate (Rudyak et al., 2001).

#### THE PUP-PROTEASOME SYSTEM (PPS)

In addition to the bacterial-like proteases, mycobacteria also contain an additional protease that shares similarity with the eukaryotic 26S proteasome. Similar to its eukaryotic counterpart [which is responsible for the degradation of proteins that have been marked for destruction with ubiquitin (Ub)], the mycobacterial proteasome is responsible for the recognition and removal of proteins that have been tagged by a protein called Pup (Prokaryotic Ub-like Protein). The conjugation of Pup to a substrate protein is referred to as Pupylation (see below) and collectively the proteolytic system is referred to as the Pup Proteasome System (PPS). Remarkably, despite the obvious functional similarities between Pup and Ub, the proteins are not conserved nor are the steps involved in their conjugation to substrates. Significantly, the PPS plays a crucial role in Mtb persistence and virulence by protecting cells from Nitric oxide and other RNIs that are produced by host macrophages during infection (Darwin et al., 2003).

#### Prokaryotic Ubiquitin (Ub)-Like Protein (Pup) and Pupylation

Pup is a small (64 residue) unstructured protein (Chen et al., 2009) that, although unrelated to Ub in sequence and structure, shares a common function with Ub. It is expressed in an inactive form [sometimes referred to as Pup(Q)] that contains a C-terminal Gln. The activation of Pup(Q) is mediated by an enzyme called Dop (Deamidase Of Pup), which deamidates the C-terminal Gln (to Glu) to generate Pup(E) (Striebel et al., 2009; Burns et al., 2010a). Once activated, the C-terminus of Pup(E) is first phosphorylated by PafA (Proteasome Accessory Factor A) through the hydrolysis of ATP, then attached to a substrate Lys residue by PafA, via the formation of an isopeptide bond between the C-terminal γ-carboxylate of Pup(E) and the ε-amino group of a Lys residue on the substrate, in a process known as pupylation (Pearce et al., 2008; Forer et al., 2013).
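The order of events in pupylation (deamidation by Dop, then PafA-mediated ligation to a Lys) can be summarized as two dependent steps. The following is a schematic sketch only: the class, function names, and toy substrate string are hypothetical, and the PafA phosphorylation and ligation steps are folded into a single function for brevity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Pup:
    c_terminal_residue: str  # "Q" for Pup(Q), "E" for Pup(E)


def dop_deamidate(pup):
    """Dop step: deamidate the C-terminal Gln to Glu, giving Pup(E)."""
    if pup.c_terminal_residue != "Q":
        raise ValueError("Dop acts on Pup(Q)")
    return replace(pup, c_terminal_residue="E")


def pafa_ligate(pup, substrate, position):
    """PafA step: attach Pup(E) to a substrate Lys (1-based position).

    The ATP-dependent phosphorylation of the Pup(E) C-terminus that
    precedes ligation is not modeled separately in this sketch.
    """
    if pup.c_terminal_residue != "E":
        raise ValueError("PafA requires activated Pup(E)")
    if substrate[position - 1] != "K":
        raise ValueError("pupylation targets a Lys residue")
    return {"substrate": substrate, "pupylated_lys": position}
```

With an invented substrate such as `"MAKTL"`, calling `pafa_ligate(dop_deamidate(Pup("Q")), "MAKTL", 3)` succeeds, whereas passing unactivated `Pup("Q")` or a non-Lys position raises an error, reflecting the dependency between the two steps.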

Pupylation is involved in a variety of different physiological roles. In pathogenic bacteria such as Mtb, it plays an important role not only in virulence, protecting the cell from nitrosative stress (Darwin et al., 2003), but also in copper homeostasis (Shi et al., 2014), while in Msm it has been implicated in amino acid recycling under nutrient starvation conditions (Elharar et al., 2014). Given the diverse range of physiological roles, it is not surprising that the molecular targets of pupylation also vary from species to species. Although the target of pupylation responsible for regulating copper homeostasis in Mtb has yet to be identified, Darwin and colleagues recently identified Log (Lonely guy) as the molecular target of pupylation that is responsible for protection of Mtb against nitrosative stress (Samanovic et al., 2015). Log is responsible for synthesis of the hormone, cytokinin. In Mtb, Log accumulates in cells lacking a component of the PPS, triggering the overproduction of cytokinin, which results in the toxic accumulation of aldehydes (breakdown products of cytokinin). In contrast to the regulation of nitrosative stress in Mtb, which involves the pupylation of a single target, Msm cells pupylate many targets in their response to nutrient starvation (Elharar et al., 2014). Indeed, Gur and colleagues demonstrated that high molecular weight proteins were preferentially targeted for pupylation under nutrient starvation conditions, and proposed that the turnover of these proteins was more efficient for amino acid recycling than that of low molecular weight proteins. Consistently, the same group has recently demonstrated that during starvation, the opposing size preference of Dop and PafA supports the preferential pupylation of high molecular weight proteins (Elharar et al., 2016). Pupylation has also recently been proposed to regulate iron homeostasis in Corynebacterium glutamicum. Interestingly, this bacterial species lacks both subunits of the 20S core particle (CP), and hence it is proposed that the pupylation-mediated regulation of iron homeostasis is independent of protein turnover. In this case, the target of pupylation is a single protein, ferritin, which is pupylated at Lys78. Ferritin is an iron storage protein that forms a cage composed of 24 identical subunits, which encapsulates ∼4,500 iron atoms (Andrews, 2010). Under iron limitation conditions, normal cells access this stored iron through disassembly of the ferritin cage, which is mediated by ARC (a homolog of Mpa, see below). In contrast, in cells lacking components of the pupylation machinery, ARC is unable to disassemble the ferritin complex and as a result these cells are unable to access the stored iron and hence exhibit strong growth defects under iron limitation conditions (Kuberl et al., 2016). In addition to these reports, several proteomic studies have shown that over 100 different proteins are pupylated (Festa et al., 2010; Poulsen et al., 2010; Watrous et al., 2010). However, whether each pupylated protein regulates a specific response or whether the complete set of pupylated proteins serves a collective purpose is yet to be defined. Nevertheless, these proteomic studies demonstrated that pupylation is a selective process, as only specific exposed Lys residues were modified. This suggests that PafA likely displays some degree of substrate specificity beyond the target Lys residue and hence residues surrounding the target Lys may modulate interaction with PafA. Alternatively, it may suggest that mycobacteria contain an additional factor that modulates substrate recognition by PafA.

#### The Mycobacterial Proteasome

The mycobacterial proteasome is a multi-subunit machine composed of two components: a central peptidase component called the 20S CP, which is flanked at either or both ends by a ring-shaped activator (**Figure 4**). The 20S CP is composed of four stacked heptameric rings: two outer rings composed of seven identical α-subunits (PrcA) and two inner rings composed of seven identical β-subunits (PrcB) (Hu et al., 2006; Lin et al., 2006). The β-subunits are catalytically active and hence form the central proteolytic chamber, while the α-subunits are catalytically inactive and form a cap for the protease that interacts with different regulatory components. Assembly and maturation of the 20S CP is a multistep process. First the α<sub>7</sub> ring is formed, which creates a template for the folding and assembly of the β<sub>7</sub> ring (Lin et al., 2006). This complex (α<sub>7</sub>β<sub>7</sub>), termed the half-proteasome, assembles (via the β<sub>7</sub> interface) to generate a full proteasome. In contrast to the eukaryotic proteasome, it appears that the mycobacterial 20S CP does not require additional factors for assembly (Bai et al., 2017). Following assembly of the full proteasome, the β-subunit propeptide is autocatalytically processed, exposing a new N-terminal residue (Thr56), which forms the catalytic nucleophile of the mature complex (Zuhl et al., 1997; Witt et al., 2006) (**Figure 4**). Like ClpP, the catalytic residues of the 20S CP are sequestered inside the proteolytic chamber of the mature complex, and access to this chamber is restricted by a narrow entry portal (∼10 Å in diameter) at either end of the barrel. This entry portal is formed by the N-terminal residues of the α-subunits, and opening of the portal (to gain access to the proteolytic chamber) is controlled by activator binding, which regulates movement of the N-terminal residues of the α-subunits (Lin et al., 2006). To date two proteasomal activators have been identified in mycobacteria: an ATP-dependent activator called Mpa (Mycobacterial proteasome ATPase) (Darwin et al., 2005) and a nucleotide-independent activator known as PafE (Proteasome accessory factor E) or Bpa (Bacterial proteasome activator) (Delley et al., 2014; Jastrab et al., 2015). Although both activators use a conserved mechanism to regulate gate-opening, they each recognize specific types of substrates and as such control distinct degradation pathways in mycobacteria.
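The ring architecture and assembly order of the 20S CP described above can be captured in a few lines. This is a bookkeeping sketch with hypothetical names; it tracks only the order in which rings are formed and stacked, and ignores propeptide processing and activator binding.

```python
SUBUNITS_PER_RING = 7


def assemble_20s_core():
    """Follow the assembly order in the text and return the ring stack."""
    alpha_ring = ("PrcA",) * SUBUNITS_PER_RING   # alpha7 ring forms first
    beta_ring = ("PrcB",) * SUBUNITS_PER_RING    # templated on the alpha7 ring
    half_proteasome = [alpha_ring, beta_ring]    # alpha7-beta7
    # Two half-proteasomes join through their beta7 faces.
    return half_proteasome + half_proteasome[::-1]


core = assemble_20s_core()
ring_order = [ring[0] for ring in core]
total_subunits = sum(len(ring) for ring in core)
catalytic_subunits = sum(len(ring) for ring in core if ring[0] == "PrcB")
```

The resulting stack has the α-β-β-α ring order, giving 28 subunits in total, of which the 14 β-subunits line the proteolytic chamber.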

#### ATP-Dependent Proteasome Activator: Mpa

Mpa (the ATP-dependent activator of the proteasome) is responsible for the specific recognition of protein substrates that have been tagged with Pup. It is a 68 kDa protein composed of four distinct regions (**Figure 5**): an N-terminal α-helical domain (for interaction with Pup) and a C-terminal tail bearing the tripeptide motif, QYL (for docking to, and activation of, the 20S CP) (Pearce et al., 2006), which are separated by an AAA+ domain and an interdomain region composed of two oligosaccharide/oligonucleotide-binding (OB) subdomains (OB1 and OB2). Although the AAA+ domain is directly responsible for ATP-binding (and hence enzyme activity) and the oligomerisation of Mpa, the interdomain region is also believed to promote assembly and stability of the Mpa oligomer, as this region alone can form a hexamer in the absence of nucleotide (Wang et al., 2009, 2010). Once assembled into a hexamer, each pair of N-terminal α-helices (from adjacent subunits) associates to form a coiled-coil (CC). These CC structures protrude from the hexameric ring like tentacles (**Figure 5**) and are directly responsible for the recognition of Pup (Striebel et al., 2010). Although each tentacle contains two Pup binding sites (one on each face), it appears that Pup only binds to the inner face of a single tentacle within the hexamer (Sutter et al., 2010; Wang et al., 2010). The interaction (between Pup and Mpa) is mediated by the central region of Pup (residues 21–51), and docking to the tentacle occurs in an anti-parallel manner. This orientation of Pup ensures that the unstructured N-terminus of Pup is directed toward the pore of Mpa, where it engages with the pore to initiate translocation of the substrate in an ATP-dependent fashion (Wang et al., 2009). Consistent with this idea, deletion of the N-terminal residues of Pup specifically prevented the in vitro turnover of pupylated substrates (Burns et al., 2010b; Striebel et al., 2010). Currently, however, the fate of conjugated Pup is unclear. Some evidence suggests that Pup, in contrast to Ub, is degraded together with the substrate (Striebel et al., 2010), while other evidence supports the idea that Pup is removed from the substrate, by Dop, before the pupylated substrate is degraded (Burns et al., 2010a; Cerda-Maira et al., 2010; Imkamp et al., 2010). The interaction with the 20S CP is mediated by the C-terminal tripeptide motif (QYL), which docks into a hydrophobic pocket on the α-ring. However, this motif is normally occluded by a β-grasp domain located within the C-terminal region of Mpa, which prevents efficient docking of the ATPase component to the 20S CP (Wu et al., 2017). As such, it has been proposed that additional factors may facilitate robust interaction between the ATPase and the protease. Interestingly, a single Lys residue near the C-terminus of Mpa is targeted by pupylation, which inhibits its ability not only to assemble, but also to dock to the 20S CP (Delley et al., 2012). Therefore, the pupylation of Mpa appears to serve as a mechanism to reversibly regulate the proteasome-mediated degradation of pupylated substrates.

[Figure legend fragment; beginning missing in source] ... hydrophobic channel, which is proposed to interact with hydrophobic (Hy) residues that are exposed in proteins such as HspR (heat-shock protein R) and model unfolded proteins.

# ATP-Independent Proteasome Activator—Bpa/PafE

The first evidence for an additional proteasomal activator in mycobacteria came from comparison of the growth phenotypes of strains lacking different components of the proteasome, either mpa or prcBA (Darwin et al., 2003). The dramatic difference observed in the phenotypes displayed by these strains suggested that the 20S CP might be involved in the turnover of a separate class of substrate, likely through an additional activator. Recently, two groups independently identified a single novel activator of the proteasome, PafE/Bpa, which facilitates the ATP-independent turnover of the model unfolded substrate, β-casein (Delley et al., 2014; Jastrab et al., 2015). Like Mpa, PafE/Bpa contains the C-terminal motif (QYL), which is essential for its interaction with the hydrophobic pocket of the α-ring and activation of the proteasome (**Figure 5**). It also forms a ring-shaped complex; however, in contrast to Mpa, this complex is composed of 12 subunits, which form a very large channel (∼40 Å in diameter) that is lined with hydrophobic residues (Bai et al., 2016; Bolten et al., 2016). Although the mechanism of substrate recognition and release is not fully understood, it is proposed that the hydrophobic channel of PafE/Bpa interacts with exposed hydrophobic residues in unfolded proteins. To date, the only physiological substrate to be identified is the heat shock protein repressor (HspR) (Jastrab et al., 2015).

# OTHER AAA+ PROTEINS INVOLVED IN MYCOBACTERIAL PROTEOSTASIS

In addition to the known AAA+ proteases in mycobacteria, three other AAA+ proteins are either known or predicted (based on annotated function/sequence homology) to play a role in proteostasis (**Figure 1**). They are ClpB, Msm0858/Rv0435c and valosin-containing protein-1 (VCP-1, also incorrectly annotated as Cdc48). VCP-1 (Msm1854) is a 43 kDa protein of unknown function. It contains a C-terminal AAA+ domain and an N-terminal tetratricopeptide repeat (TPR)-like helical domain. Although the VCP-1 gene is only distributed in a limited number of Actinobacterial species (including Msm), it is invariably located in a putative operon, together with another gene of unknown function (MSMEG\_1855). MSMEG\_1855 encodes a membrane-bound TPR-containing protein, which shares homology with B. subtilis BofA, a regulator of the sporulation transcription factor Sigma K (Zhou and Kroos, 2004). Therefore, we propose that VCP-1 (together with MSMEG\_1855) is tethered to the inner membrane, and speculate that this complex regulates activation of a signal transduction pathway in mycobacteria.

Msm0858/Rv0435c (known as p97 in mammals or Cdc48 in yeast and plants) is a widely conserved 78 kDa protein, which is found in all kingdoms of life. In mammals, p97 plays a central role in the Ub proteasome system (UPS), where it not only interacts directly with ubiquitylated proteins to regulate their turnover, but also serves as a hub for the docking of numerous cofactors which help to mediate p97's many activities in the cell (for a detailed review of p97 function see Meyer and Weihl, 2014). Like mammalian p97, Msm0858 is composed of an N-terminal domain and two AAA+ domains. Interestingly, although the second AAA+ domain (D2) of Msm0858 exhibits a consensus sequence for both the Walker A and B motifs, critical residues in both motifs of the first AAA+ domain (D1) have been replaced (notably Thr in the Walker A motif is replaced with Val, while the first Asp in the Walker B motif is replaced with Ala). Despite these changes, both domains of Msm0858 displayed ATPase activity indicating that each domain can both bind and hydrolyze ATP (Unciuleac et al., 2016). Consistently, the recent crystal structure of Msm0858 revealed that the structures of the D1 and D2 domains of Msm0858 are highly similar to the equivalent domains in mammalian p97, with a root mean square deviation of 1.5 and 2.4 Å, respectively (Unciuleac et al., 2016). The structural similarity extends beyond the AAA+ domains of Msm0858, into its N-terminal domain, and despite this domain sharing only modest sequence similarity with mammalian p97 it shares significant structural similarity with its mammalian counterpart. In mammals, the N-terminal domain of p97 is an important docking platform for cofactor binding and hence the diverse activities of p97. This suggests that Msm0858 could serve a similar range of functions in mycobacteria, albeit using a distinct set of cofactors. 
Surprisingly, and in contrast to mammalian p97, Msm0858 was only observed to form a dimer in solution. It remains to be seen whether the lack of hexamer formation is due to the experimental conditions used, or whether it indicates that a specific adaptor protein or cofactor is required for assembly or stabilization of the Msm0858 hexamer. Hence, it will be interesting to determine the oligomeric state of Msm0858 in vivo, and to identify any factors that may modulate the activity of this highly conserved protein.

ClpB is a broadly conserved protein of ∼92 kDa that, like ClpC1, is composed of two AAA+ domains separated by a middle domain (**Figure 1**). However, in contrast to ClpC1 (in which the M-domain is composed of two helices), the M-domain of ClpB is composed of four helices, which form two coiled-coil motifs. In EcClpB, the M-domain serves as an important regulatory domain of the machine, as it represses the ATPase activity of the machine. It also serves as an important docking site for its co-chaperone DnaK. Collectively, ClpB and DnaK (together with its co-chaperones, DnaJ and GrpE) form a bi-chaperone network that is responsible for the reactivation of aggregated proteins. A similar role for mycobacterial ClpB was recently confirmed (Lupoli et al., 2016). Indeed, MtbClpB plays a crucial role in controlling the asymmetric distribution of irreversibly oxidized proteins (Vaubourgeix et al., 2015) and, as such, ClpB-deficient Mtb cells exhibit defects in recovery from stationary phase or exposure to antibiotics. Hence, ClpB might be a useful antibiotic target in the future, as its inhibition would force cells to retain their damaged proteome.

# AAA+ PROTEASES AS NOVEL DRUG TARGETS

Since the golden age of antibiotic discovery, very few new antibiotics have been brought to market and, as a result, we are now seeing the rise of numerous antibiotic-resistant bacteria. This includes, but is not limited to, the bacterial pathogen that is responsible for TB, Mtb. Indeed, there are currently three different categories of drug-resistant Mtb strains, each of which exhibits increasing resistance to available antibiotics. They are: multi drug resistant (MDR) Mtb, which is resistant to the first-line defense drugs isoniazid and rifampicin; extensively drug resistant (XDR) Mtb, which is resistant to both first-line defense drugs as well as to fluoroquinolones and at least one of the three injectable second-line defense drugs; and totally drug resistant (TDR) Mtb, which is resistant to all currently available drugs. As a consequence, there is an urgent need to develop new drugs that target novel pathways in these drug resistant strains of Mtb. Recently, several different components of the proteostasis network have been identified as promising novel drug targets in Mtb.

# Dysregulators of ClpP1P2 Function: Activators and Inhibitors

In the Clp field, interest in antibiotics was sparked by the identification of a novel class of antibiotics termed acyldepsipeptides (ADEPs) (Brotz-Oesterhelt et al., 2005). This class of antibiotic was initially demonstrated to be effective against the Gram-positive bacterium B. subtilis, where it was shown to dysregulate the peptidase ClpP. Specifically, ADEPs interact with the hydrophobic pocket of ClpP, triggering cell death via one of two suggested modes of action. The first mode of action is to activate the ClpP peptidase by opening the gate into the catalytic chamber from ∼10 Å to >20 Å in diameter (Lee et al., 2010; Li et al., 2010). This gives newly synthesized or unfolded proteins unregulated access to the proteolytic chamber, resulting in their indiscriminate degradation (**Figure 6A**). This mode of action appears to be crucial for ADEP-mediated killing of bacteria in which ClpP is not essential, such as B. subtilis. The second mode of action is to prevent docking of the partner ATPase (e.g., ClpC, ClpA, or ClpX), which inhibits the regulated turnover of specific substrates (Kirstein et al., 2009a). This mode of action appears to be critical in the ADEP-mediated killing of bacteria in which the unfoldase components are essential, such as Mtb (Famulla et al., 2016). Consistent with this idea, ADEPs only bind to one face of the ClpP1P2 complex, namely ClpP2, the face that is responsible for interaction with the ATPase component (Ollinger et al., 2012; Schmitz et al., 2014). Although these compounds are promising drug candidates, they currently exhibit poor drug-like qualities and are efficiently removed from the cell (Ollinger et al., 2012); hence, additional development is required to improve their effectiveness in vivo.

Last year, the first non-peptide-based activator of ClpP was identified from a screen of fungal and bacterial secondary metabolites (Lavey et al., 2016). In this case, the identified compound (Sclerotiamide) dysregulated EcClpP by activating the ATPase-independent turnover of casein. Intriguingly, Sclerotiamide appears to be quite specific for EcClpP, as it was unable to dysregulate BsClpP. Hence, it will be interesting to see how and where this compound binds, and whether it will be able to activate other ClpP complexes, such as the MtbClpP1P2 complex, in the future.

In addition to the ClpP activators, several ClpP-specific inhibitors have also been developed. The first group are the β-lactones (**Figure 6B**). These are suicide inhibitors that inactivate ClpP through the formation of an acyl-ester intermediate between the β-lactone ring (of the inhibitor) and the catalytic Ser of the peptidase, which is much more stable than the intermediate formed between the substrate and the catalytic Ser during peptide bond catalysis (Bottcher and Sieber, 2008). In 2013, Sello and colleagues developed two β-lactone derivatives which killed Mtb cells (Compton et al., 2013). Interestingly, both β-lactones specifically target the ClpP2 component of the ClpP1P2 complex in Mtb; hence, there is still potential for the development of ClpP1 inhibitors. Despite their effectiveness in vivo, most β-lactones exhibit poor stability in plasma, which will likely limit their future development (Weinandy et al., 2014).

The final inhibitor of ClpP1P2 was recently identified by Dick and colleagues from a whole-cell high-throughput screen (Moreira et al., 2015). Interestingly, the compound they identified (bortezomib) is a known inhibitor of the human proteasome, which is currently being used in the treatment of multiple myeloma (under the commercial name Velcade). Perhaps unsurprisingly, bortezomib has also been used in biochemical assays with the Mtb proteasome (Hu et al., 2006). Clearly, the cross-reactivity of bortezomib with the human proteasome represents a challenge for the future, although there are already promising signs that more specific ClpP1P2 inhibitors can be developed (Moreira et al., 2017).

#### Dysregulators of ClpC1 Function

Given that the ATPase component(s) of the Clp protease are essential for viability, it is not surprising that dysregulators of these components also have antibacterial properties. Cyclomarin A (CymA) was the first identified dysregulator of the ClpC1 component of the Clp protease (**Figure 6C**). It is a cyclic non-ribosomal peptide that is produced by a marine bacterium (Renner et al., 1999). In 2011, CymA was identified as a potent anti-tubercular compound, which not only inhibited Mtb growth in vitro, but also demonstrated bactericidal activity in human-derived macrophages. Significantly, CymA also exhibited bactericidal activity against a panel of MDR strains of Mtb (Schmitt et al., 2011). Using a simple affinity chromatography approach, Schmitt and colleagues were able to show that CymA specifically bound to a single protein, ClpC1 (Schmitt et al., 2011). This binding appears to increase the ClpC1-mediated turnover of proteins in the cell and, as such, CymA was proposed to dysregulate ClpC1 function. Based on current structural data, CymA binds directly to the N-terminal domain of ClpC1, where it is proposed to alter the flexibility of this domain, thereby improving access of substrates to the pore of ClpC1 (Vasudevan et al., 2013). However, this mechanism of action has yet to be verified biochemically and hence the mode of CymA dysregulation remains uncertain. Intriguingly, the binding of CymA occurs near the docking site of adaptor proteins (MecA and ClpS) in equivalent systems (Kirstein et al., 2009b) and hence it is possible that CymA also modulates the docking of putative adaptor proteins in Mycobacteria.

Interestingly, the N-terminal domain of ClpC1 appears to be a common target of ClpC1 dysregulators, as two additional compounds, ecumicin and lassomycin, were recently identified to bind to this region (Gavrish et al., 2014; Gao et al., 2015). Both compounds were identified from high-throughput screens: lassomycin from a screen using extracts of uncharacterized soil bacteria (Gavrish et al., 2014), and ecumicin from a screen of actinomycetes extracts (Gao et al., 2015). Significantly, lassomycin not only inhibited the growth of wild type Mtb cells, but also exhibited potent antibacterial activity against MDR strains of Mtb, while ecumicin exhibited potent antibacterial activity against both actively dividing and dormant Mtb cells, as well as MDR and XDR strains of Mtb. Lassomycin is a ribosomally synthesized lasso-peptide that contains several Arg residues and hence is predicted to dock into an acidic patch on the N-domain of ClpC1. In contrast, ecumicin is a macrocyclic tridecapeptide composed of several non-canonical amino acids, which, similar to CymA, is predicted to bind in close proximity to a putative adaptor docking site (Gao et al., 2015; Jung et al., 2017). Interestingly, despite docking to different sites within the N-terminal domain, both compounds (lassomycin and ecumicin) stimulate the ATPase activity of ClpC1 but, in contrast to CymA, they appear to uncouple the interaction between the ATPase and the peptidase, as they both inhibit the ClpC1-mediated turnover of the model unfolded protein, casein (**Figure 6C**). Currently, however, it remains unclear whether cell death results from the increased unfolding activity of ClpC1 or from the loss of ClpP1P2-mediated substrate turnover. Future efforts to determine the molecular mechanism of each compound are still required. This will likely be aided by structural studies of these compounds in complex with their target.
Importantly, although further development of these compounds is still required to improve their pharmacokinetic properties, these compounds hold new hope in the battle against antibiotic resistant pathogens. It will also be interesting to see what else nature has provided in our ongoing battle against pathogenic microorganisms.

# AUTHOR CONTRIBUTIONS

AAHA and DAD wrote and critically revised this work.

#### FUNDING

This work was supported by an ARC Australian Research Fellowship to DAD from the ARC (DP110103936) and a La Trobe University postgraduate research scholarship to AAHA.

# REFERENCES


the virulence of Mycobacterium tuberculosis. Proc. Natl. Acad. Sci. U.S.A. 112, E1763–E1772. doi: 10.1073/pnas.1423319112


essential cofactor interactions with chaperone DnaK. Proc. Natl. Acad. Sci. U.S.A. 113, E7947–E7956. doi: 10.1073/pnas.1617644113


kills Mycobacterium tuberculosis by targeting the ClpC1 subunit of the caseinolytic protease. Angew. Chem. Int. Ed Engl. 50, 5889–5891. doi: 10.1002/anie.201101740


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer CE and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2017 Alhuwaider and Dougan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Copper Efflux Regulator CueR Is Subject to ATP-Dependent Proteolysis in *Escherichia coli*

Lisa-Marie Bittner, Alexander Kraus, Sina Schäkermann and Franz Narberhaus \*

Microbial Biology, Ruhr University Bochum, Bochum, Germany

The trace element copper serves as a cofactor for many enzymes but is toxic at elevated concentrations. In bacteria, the intracellular copper level is maintained by copper efflux systems, including the Cue system controlled by the transcription factor CueR. CueR, a member of the MerR family, forms homodimers and binds monovalent copper ions with high affinity. It activates transcription of the copper tolerance genes copA and cueO via a conserved DNA-distortion mechanism. How CueR-induced transcription is turned off is not fully understood. Here, we report that Escherichia coli CueR is prone to proteolysis by the AAA<sup>+</sup> proteases Lon, ClpXP, and ClpAP. Using a set of CueR variants, we show that CueR degradation is not altered by mutations affecting copper binding, dimerization or DNA binding of CueR, but requires an accessible C terminus. Except for a twofold stabilization shortly after a copper pulse, proteolysis of CueR is largely copper-independent. Our results suggest that ATP-dependent proteolysis contributes to copper homeostasis in E. coli by turnover of CueR, probably to allow steady monitoring of changes of the intracellular copper level and shut-off of CueR-dependent transcription.

Keywords: AAA<sup>+</sup> proteases, proteolysis, Lon, ClpXP, ClpAP, CueR, copper homoeostasis, MerR family

#### INTRODUCTION

Copper is a trace element required as a cofactor for full functionality of several enzymes, such as cytochrome c oxidase of the respiratory chain (van der Oost et al., 1994). The intracellular copper concentration must be strictly maintained since elevated copper levels are toxic for the cell, e.g., by generation of reactive oxygen species (Rensing and Grass, 2003; Grass et al., 2011). In Escherichia coli, two copper efflux systems, the Cue and the Cus system, adjust the intracellular copper level to the cellular demand (Rensing and Grass, 2003; Rademacher and Masepohl, 2012). While the Cus system operates under anaerobic conditions, the Cue system is predominantly active under aerobic conditions (Outten et al., 2001). CueR, the key regulator of the Cue system, activates transcription of the copper tolerance genes copA and cueO (Outten et al., 2000; Stoyanov et al., 2001). CopA is a P-type ATPase located in the cytoplasmic membrane and pumps monovalent copper ions (Cu<sup>+</sup>) into the periplasm (Petersen and Møller, 2000; Rensing et al., 2000). The multi-copper oxidase CueO is located in the periplasm and oxidizes Cu<sup>+</sup> to the divalent form, Cu<sup>2+</sup>, which is not able to pass the inner membrane by simple diffusion (Grass and Rensing, 2001; Rensing and Grass, 2003).

The transcription factor CueR is a member of the MerR family, named after the mercury resistance regulator MerR (Brown et al., 2003). Proteins of this family typically form homodimers and are comprised of three characteristic domains: the N-terminal DNA-binding domain, the central dimerization helix, and the C-terminal metal-binding domain (Brown et al., 2003; Changela et al., 2003). CueR contains two copper-binding cysteines in its metal-binding domain (C112, C120), which are essential for covalent binding of monovalent copper ions. An active CueR homodimer, binding two Cu<sup>+</sup> ions (holo-CueR), induces the expression of copA and cueO by binding to their promoter regions, which induces torsional transformations in the DNA conformation (Changela et al., 2003; Chen et al., 2003; Stoyanov and Brown, 2003; Philips et al., 2015). By kinks and undertwisting, the DNA switches from a B-form into an A-form-like conformation that allows access of the RNA polymerase. The metal-free CueR dimer (apo-CueR) is also able to bind to the promoter region, resulting in a tight DNA conformation, which represses copA and cueO expression (Philips et al., 2015).

#### *Edited by:*

Walid A. Houry, University of Toronto, Canada

#### *Reviewed by:*

Kürşad Turgay, Leibniz University of Hanover, Germany; Mihaela Pruteanu, Humboldt University of Berlin, Germany

> *\*Correspondence:* Franz Narberhaus franz.narberhaus@rub.de

#### *Specialty section:*

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

*Received:* 22 December 2016 *Accepted:* 13 February 2017 *Published:* 28 February 2017

#### *Citation:*

Bittner L-M, Kraus A, Schäkermann S and Narberhaus F (2017) The Copper Efflux Regulator CueR Is Subject to ATP-Dependent Proteolysis in Escherichia coli. Front. Mol. Biosci. 4:9. doi: 10.3389/fmolb.2017.00009

CueR binds copper with high affinity (Changela et al., 2003). An open question is how CueR-mediated expression of copper detoxification systems is turned off when necessary, or how the cellular CueR pool is maintained to allow continuous sensing of the actual intracellular copper level. Several studies have implicated proteolysis in the regulation of metal homeostasis (Lu and Solioz, 2001; Solioz, 2002; Lu et al., 2003; Solioz and Stoyanov, 2003; Liu et al., 2007; Pruteanu et al., 2007; Pruteanu and Baker, 2009). Regulated proteolysis is a universal post-translational strategy that adapts the existing protein pool to the cellular demand. In E. coli, five different ATP-dependent proteases (AAA<sup>+</sup> proteases, ATPases associated with a variety of cellular activities), namely ClpXP, ClpAP, HslUV, Lon, and FtsH, are responsible for quality control of proteins as well as for the regulated turnover of intact proteins (Baker and Sauer, 2006; Sauer and Baker, 2011; Bittner et al., 2016). AAA<sup>+</sup> proteases are comprised of two functional domains, the ATPase and the protease domain. While the proteases ClpP and HslV associate with separate ATPases to form ClpXP, ClpAP, or HslUV complexes, the two domains of Lon and FtsH are encoded by a single gene. The ATPase domain is needed for ATP-dependent unfolding and translocation of a substrate into the proteolytic chamber of the protease domain, in which the substrate is degraded (Sauer and Baker, 2011; Bittner et al., 2016). AAA<sup>+</sup> proteases recognize their substrates via exposed recognition motifs, so-called degrons; adaptor proteins can also be involved in recognition (Sauer et al., 2004; Baker and Sauer, 2006; Gur et al., 2011, 2013; Sauer and Baker, 2011). An example for proteolysis of proteins involved in metal homeostasis is the MerR-like regulator ZntR, which binds zinc (Changela et al., 2003) and activates expression of the zinc exporter ZntA (Brocklehurst et al., 1999; Outten et al., 1999).
ZntR is a substrate of the Lon and ClpXP proteases in E. coli (Chivers, 2007; Pruteanu et al., 2007; Pruteanu and Baker, 2009). Moreover, the metallochaperone CopZ from Enterococcus hirae and the Saccharomyces cerevisiae proteins Ctr1p (plasma membrane transporter for high-affinity copper uptake) and Mac1 (copper-sensing transcriptional activator) are degraded upon increased copper levels (Ooi et al., 1996; Zhu et al., 1998; Lu and Solioz, 2001; Solioz, 2002; Lu et al., 2003; Solioz and Stoyanov, 2003; Liu et al., 2007). Here, we report proteolysis of the metalloregulator CueR by Lon and the ClpP machineries in E. coli.

# MATERIALS AND METHODS

## Bacterial Strains and Growth Conditions

E. coli strains used in this study are listed in **Table 1**. Cells were grown in liquid LB, 2YT, or M9 minimal medium in a water bath shaker (180 rpm) or on LB agar plates at 30 or 37°C. When required, antibiotics were used as follows: ampicillin (Amp) 100 µg/ml, chloramphenicol (Cm) 25 µg/ml, kanamycin (Kan) 50 µg/ml, or tetracycline (Tet) 10 µg/ml.

# Construction of Plasmids

Plasmids and oligonucleotides used in this study are listed in **Tables 2, 3**, respectively. Recombinant DNA techniques were performed using standard protocols (Sambrook and Russell, 2001). E. coli DH5α cells served as cloning host. For construction of inducible CueR expression plasmids, genomic E. coli K12 DNA was used as template for PCR amplification of the cueR gene for full-length or C-terminally truncated CueR variants. The PCR product was cloned into pASK-IBA5(+) or pASK-IBA3 via primer-derived restriction sites to create pBO2584, pBO2585, pBO2860, or pBO2862, respectively. CueR variants with amino acid substitutions were generated by QuikChange® PCR using pBO2584 as template and mutagenized primers to create pBO2591, pBO2595, or pBO4800, respectively. For construction of pBO3687, a plasmid encoding constitutively expressed CueR, the cueR gene was amplified from genomic E. coli K12 DNA and cloned into pACYC184 via primer-derived restriction sites. This plasmid was used for QuikChange® PCR to create the constitutively expressed CueR\_C112S variant (pBO4801). All cloning results were confirmed by sequencing.

#### *In vivo* Degradation Experiments

To analyze the stability of different CueR variants, cells containing inducible expression plasmids encoding the corresponding CueR proteins were grown overnight at 30°C in M9 minimal medium containing the appropriate antibiotics for selection. Fifteen milliliters of M9 minimal medium supplemented with the appropriate antibiotics were inoculated with the overnight culture to an optical density (A<sub>580</sub>) of 0.05. Cells were grown to an A<sub>580</sub> of 0.5 and protein expression was induced by adding 15 ng/ml anhydrotetracycline (AHT) for 20 min. Translation was blocked by addition of 200 µg/ml Cm. As an exception, translation in the strain lacking all three proteases (ΔclpXP, Δlon, ΔhslUV) and its parental strain E. coli Wt MG1655 was blocked by addition of 300 µg/ml spectinomycin (Sp), since the triple knockout strain is resistant to Cm. Samples were taken at different time points, frozen in liquid nitrogen and subjected to SDS-PAGE, Western transfer, and immunodetection as described below.
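As a worked example of the inoculation step, the volume of overnight culture needed to start a 15 ml main culture at an A580 of 0.05 can be estimated from C1·V1 = C2·V2. The following minimal Python sketch assumes that absorbance is proportional to cell density over the measured range; the function name `inoculum_volume_ml` and the overnight reading of 2.5 are hypothetical and not taken from this study.

```python
def inoculum_volume_ml(overnight_a580, target_a580=0.05, final_volume_ml=15.0):
    """Volume of overnight culture (ml) needed to reach target_a580.

    Uses C1 * V1 = C2 * V2, i.e. assumes that absorbance scales
    linearly with cell density (an assumption, not stated in the text).
    """
    if overnight_a580 <= target_a580:
        raise ValueError("overnight culture must be denser than the target")
    return target_a580 * final_volume_ml / overnight_a580


# Hypothetical overnight culture measured at A580 = 2.5
volume = inoculum_volume_ml(2.5)
print(f"{volume:.2f} ml culture + {15.0 - volume:.2f} ml fresh medium")
```

In practice a dense overnight culture would be diluted before measurement so that the reading stays within the linear range of the photometer.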

To analyze the stability of Strep\_CueR under defined copper concentrations, the same in vivo degradation experiments were performed as described above with minor modifications. To avoid copper contamination, all steps were performed in plastic ware, and all M9 minimal medium components except trace elements were incubated overnight beforehand with 50 g/l Chelex 100 resin (Bio-Rad) to remove trace metals. Before use, trace metals (without the copper component) were added to the medium, which was then mixed and sterile-filtered. Cells were grown to an A<sub>580</sub> of 0.5, defined copper concentrations (CuSO<sub>4</sub>) were supplemented for 1 h and the in vivo degradation experiments were performed as described above.

#### TABLE 1 | *E. coli* strains used in this study.

#### TABLE 2 | Plasmids used in this study.

For analyses of Strep\_CueR stability over the entire growth curve, cells were grown in LB medium + Amp at 37°C to different growth phases and in vivo degradation experiments were performed in every growth phase as described above. To analyze Strep\_CueR stability over the whole growth curve under different copper concentrations, defined copper concentrations (CuSO<sub>4</sub>) were added to the main cultures at the time of inoculation, or a copper pulse was given to the main culture after the second in vivo degradation experiment had been started (∼2.5 h after inoculation and 60 min before the third degradation experiment was started).

#### Preparation of Protein Extracts and Immunodetection

#### TABLE 3 | Oligonucleotides used in this study.

Cell pellets were resuspended in TE buffer according to their optical density (10 mM Tris/HCl, pH 8; 1 mM EDTA; 50 µl TE buffer per A<sub>580</sub> of 1.0) and mixed with protein sample buffer (final concentrations of 2% (w/v) SDS, 0.1% (w/v) bromophenol blue, 10% (v/v) glycerol, 1% (v/v) β-mercaptoethanol, 50 mM Tris/HCl, pH 6.8). Samples were incubated for 5 min at 95°C, centrifuged (1 min, 16,000 × g) and subjected to SDS-PAGE and Western transfer using standard protocols (Sambrook and Russell, 2001). Strep-tagged fusion proteins were detected using a Strep-tag-HRP conjugate (IBA GmbH). Endogenous CueR and untagged CueR were detected using a polyclonal anti-CueR antibody (Yamamoto and Ishihama, 2005) and a goat anti-rabbit IgG (H+L) HRP conjugate (BioRad) as secondary antibody. Protein signals were visualized using Luminata Forte Western HRP substrate (Millipore) and the Chemi Imager Ready (Alpha Innotec). Half-lives of proteins were calculated by pixel counting with AlphaEaseFC software (version 4.0.0, Alpha Innotec).
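To illustrate how a half-life can be derived from quantified band intensities, the following minimal Python sketch fits ln(intensity) against time by least squares. It assumes first-order decay of the signal after the translation block; this fitting model, the function name `half_life_min`, and the example intensities are illustrative assumptions and do not represent the procedure implemented in AlphaEaseFC or data from this study.

```python
import math


def half_life_min(times_min, intensities):
    """Estimate a half-life (min) from band intensities.

    Assumes first-order decay, I(t) = I0 * exp(-k * t), so that
    ln(I) is linear in t with slope -k and t1/2 = ln(2) / k.
    """
    if len(times_min) != len(intensities) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(value) for value in intensities]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # least-squares slope, equal to -k
    if slope >= 0:
        raise ValueError("signal does not decay; half-life undefined")
    return math.log(2) / -slope


# Hypothetical band intensities (arbitrary units), not data from this study
times = [0, 5, 10, 20, 40]
signal = [100.0, 70.7, 50.0, 25.0, 6.25]
print(round(half_life_min(times, signal), 1))
```

A log-linear fit weights late, weak bands as heavily as early, strong ones, so time points close to the detection limit are best excluded before fitting.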

#### Protein Purification

Strep\_CueR (pBO2584), His6\_CspD (pBO1115), or Lon\_His6 (pET21b-Lon) were transformed into E. coli Δlon, BL21 or CH1019, respectively. Cells were grown to an A<sub>580</sub> of 0.5 at 37°C in LB (Strep\_CueR) or 2YT (His6\_CspD and Lon\_His6) medium and gene expression was induced by addition of 150 ng/ml AHT (Strep\_CueR) or 1 mM IPTG (isopropyl-β-D-thiogalactopyranoside) (His6\_CspD and Lon\_His6). Cells were harvested after 3 h of overexpression at 30°C, resuspended in lysis buffer containing 20 mM Tris/HCl, pH 7.5, 200 mM NaCl, 1 mM DTT, 0.35 mg/ml lysozyme, 0.2 mg/ml DNase, and 0.2 mg/ml RNase, and disrupted via French Press. Strep- or His-tagged proteins were purified using streptactin sepharose (IBA GmbH) or Ni-NTA agarose (Qiagen), respectively. Purification of His-tagged proteins was performed as described previously (Langklotz and Narberhaus, 2011). Strep\_CueR purification was performed using standard protocols of the purification kit (IBA GmbH). Protein concentrations were determined via Bradford assay (Bradford, 1976).

#### *In vitro* Degradation Experiments

Fifteen micromolar Strep\_CueR or His6\_CspD and 600 nM Lon\_His6 were incubated for 2 min at 37°C in the degradation buffer described in Bissonnette et al. (2010). In vitro degradation was initiated by addition of 20 mM ATP. Degradation experiments without addition of ATP were performed as controls. Results were visualized by SDS-PAGE and Coomassie staining or Western transfer following standard protocols (Sambrook and Russell, 2001).

#### *In vivo* CueR Activity Assays

Cultures with inducible expression plasmids encoding different CueR variants were grown in plastic ware in copper-free M9 minimal medium treated with 50 g/l Chelex 100 resin (Bio-Rad) to remove trace metals. Before use, trace metals (without the copper component) and 30 ng/ml AHT were added to the medium, which was then mixed and sterile-filtered. Cells were grown to an A<sub>580</sub> of 0.5, and defined copper concentrations (CuSO<sub>4</sub>) were adjusted in the cultures. After 1 h, 1 ml of the culture was harvested for the β-galactosidase activity assay. The assay was performed as described previously (Miller, 1972).
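For orientation, the standard Miller unit calculation can be sketched as follows. The formula is the textbook one from Miller (1972); the function name and all numerical readings are hypothetical. Miller's protocol normalizes to the culture density at 600 nm, whereas cultures here were monitored at 580 nm, so how the authors normalized is an assumption left open in this sketch.

```python
def miller_units(a420, a550, culture_od, reaction_min, culture_volume_ml):
    """Beta-galactosidase activity in Miller Units (MU).

    MU = 1000 * (A420 - 1.75 * A550) / (t * v * OD)
    a420: absorbance of o-nitrophenol in the reaction
    a550: light scattering by cell debris (correction term)
    culture_od: optical density of the culture before the assay
    reaction_min: reaction time t in minutes
    culture_volume_ml: volume v of culture used in the assay, in ml
    """
    if reaction_min <= 0 or culture_volume_ml <= 0 or culture_od <= 0:
        raise ValueError("time, volume and culture density must be positive")
    corrected = a420 - 1.75 * a550
    return 1000.0 * corrected / (reaction_min * culture_volume_ml * culture_od)


# Hypothetical readings, for illustration only
print(round(miller_units(a420=0.45, a550=0.02, culture_od=0.5,
                         reaction_min=10, culture_volume_ml=0.1), 1))
```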

# RESULTS AND DISCUSSION

### CueR Is a Target of ATP-Dependent Proteolysis in *E. coli*

Transcriptional regulators differentially control genes in order to adapt the proteome to the ambient conditions. Both the level and the activity of transcription regulators can be tuned to the cellular need. For instance, the basal level of the copper efflux regulator CueR, which is always present in the cell, is elevated at increasing copper concentrations (Yamamoto and Ishihama, 2005). The activity of transcriptional regulators is often controlled by modification or oligomerization. In the case of CueR, only the Cu<sup>+</sup>-bound dimer (holo-CueR) is capable of activating expression of the copper tolerance genes copA and cueO (Outten et al., 2000). Just as important as activation of transcriptional regulators is their inactivation, since the cell would otherwise waste valuable resources on expression of pathways not needed under the given condition. Moreover, uncontrolled overexpression of membrane proteins like CopA might compromise membrane integrity. Since CueR covalently binds Cu<sup>+</sup> with high affinity in the zeptomolar range, it is unlikely that the transcription factor is inactivated by simple dissociation of copper from its metal-binding pocket (Changela et al., 2003). We postulate that E. coli might shut down the copper-stress response by proteolysis of the metal-loaded transcription factor.

To address whether CueR is a protease substrate in E. coli, we expressed it as an N-terminally Strep-tagged variant (Strep\_CueR) that facilitates immunodetection of the protein. First, we used an activity assay previously described by Outten et al. to ascertain that the tagged protein is functionally active as a transcription factor. The original assay is based on a ΔcueR strain encoding the CueR-dependent copA promoter fused to lacZ on the chromosome, and a plasmid encoding constitutively expressed cueR (Outten et al., 2000). To establish the assay, we constitutively expressed untagged CueR and an inactive CueR variant (CueR\_C112S) not able to bind Cu<sup>+</sup> ions (Chen et al., 2003; Stoyanov and Brown, 2003). As expected, β-galactosidase activity increased with increasing copper concentration in the presence of CueR (Figure S1). The CueR\_C112S variant was unable to activate copA expression and produced copper-independent background activity like the empty vector control strain (Figure S1). The assay worked equally well with Strep\_CueR produced from an AHT-inducible plasmid (**Figure 1A**). Copper-controlled copA expression showed that the N-terminal Strep-tag did not interfere with transcriptional activation (**Figure 1B**). The stability of Strep\_CueR was analyzed in an E. coli wild-type strain (MC4100) during exponential growth in M9 minimal medium after translation was blocked by addition of chloramphenicol. The protein was rapidly degraded with a half-life of about 8 min (**Figure 1C**), indicating that Strep\_CueR is a target of proteolysis in E. coli. As a control, we performed in vivo degradation experiments with Strep\_CueR in a strain lacking cueR, which had no effect on stability (Figure S2). Furthermore, both plasmid-encoded untagged and endogenous CueR were prone to proteolysis, yet with longer half-lives compared to the Strep-tagged version (**Figures 1D,E**). A similar effect on the half-life of tagged proteins was observed for the related transcription factor ZntR (Pruteanu et al., 2007).

### Strep\_CueR Is Degraded by Lon, ClpXP and ClpAP

To identify the protease responsible for Strep\_CueR degradation, we monitored the stability of the protein in various protease-deficient E. coli strains and their corresponding parental strains. In all parental strains and in strains lacking the membrane-anchored FtsH (ΔftsH) or the cytosolic HslUV (ΔhslUV) protease, the half-life of Strep\_CueR was not altered. Hence, FtsH and HslUV are not involved in proteolysis of the transcription factor (**Figure 2**). In contrast, Strep\_CueR was stabilized about sixfold in the Δlon strain. Endogenous CueR was likewise stabilized, with a half-life of around 2 h in the lon mutant (Figure S3). In a strain lacking the proteolytic ClpP subunit of the ClpXP and ClpAP complexes, Strep\_CueR was stabilized about two- to threefold. In contrast, in strains lacking only one of the ATPases of the ClpP-containing proteases (either ClpX or ClpA), Strep\_CueR was degraded as in the wild type, suggesting that both ATPase subunits contribute to CueR proteolysis. As expected, Strep\_CueR was completely stable in a strain devoid of all cytosolic AAA<sup>+</sup> proteases (**Figure 2**). Substrate sharing by different AAA<sup>+</sup> proteases has been described previously and contributes to robust post-translational regulation. For instance, the MerR family member ZntR is degraded by Lon and ClpXP but not by ClpAP (Pruteanu et al., 2007). It seems that regulated proteolysis of MerR-like regulators is a commonly used mechanism to control metal homeostasis in E. coli.

### Activity of CueR Does Not Influence Its Stability

Next, we analyzed whether already known recognition strategies of Lon or ClpP-containing AAA<sup>+</sup> proteases apply to CueR. The mechanisms by which AAA<sup>+</sup> proteases recognize their substrates are highly diverse (Hoskins et al., 2002; Sauer et al., 2004; Baker and Sauer, 2006; Sauer and Baker, 2011). Lon predominantly recognizes proteins with exposed aromatic and hydrophobic residues, as is often the case in unfolded or unassembled proteins (Chung and Goldberg, 1981; Gur and Sauer, 2008). Terminal degrons recognized by Lon have also been identified (Ishii et al., 2000; Ishii and Amano, 2001; Shah and Wolf, 2006). Among them is the SsrA-tag, which is C-terminally added via the tmRNA system to polypeptides stalled during translation. However, SsrA-tagged proteins are predominantly recognized by ClpXP (Keiler et al., 1996; Flynn et al., 2001), a protease that is also known to utilize N-terminal degrons (Flynn et al., 2003). ClpAP recognizes several substrates via the so-called N-end rule pathway, in which the first N-terminal amino acid is critical for degradation (Erbse et al., 2006; Mogk et al., 2007; Dougan et al., 2010; Román-Hernández et al., 2011). Comparison of residues in the N or C terminus of CueR with known degrons of Lon, ClpXP and ClpAP did not reveal similarities to other protease substrates. The same was reported for the zinc-dependent transcriptional regulator ZntR, degraded by Lon and ClpXP in E. coli (Pruteanu et al., 2007), suggesting that yet unknown mechanisms of recognition may apply to these MerR-like proteins. For ZntR it was shown that mutation of the conserved arginine in the helix-turn-helix motif of the DNA-binding region results in faster degradation of the protein (Pruteanu et al., 2007). Therefore, we constructed a corresponding Strep\_CueRR18A variant (**Figure 3A**), which as expected (Philips et al., 2015) failed to induce copA-lacZ transcription since DNA binding is impaired (**Figure 3B**). Yet, degradation of the inactive CueR variant was not affected (**Figure 3G** and Figure S4).

Two additional variants of N-terminally Strep-tagged CueR with amino acid substitutions in functionally relevant regions of the protein were analyzed (**Figure 3A**). Strep\_CueRA78C is a variant with a substitution at the very beginning of the dimerization helix that differs from ZntR and MerR, which have a conserved cysteine at this position. Strep\_CueRC112S carries a substitution of a copper-binding cysteine in the metal-binding domain. Strep\_CueRA78C is able to activate copA expression in a copper-responsive manner (**Figure 3C**), while substitution of one of the two copper-binding cysteines (Strep\_CueRC112S) inactivated CueR (**Figure 3D**). Regardless of whether they were active as transcription factor or not, both point-mutated variants were degraded like Strep\_CueR (**Figure 3G** and Figure S4). Therefore, as for ZntR (Pruteanu et al., 2007), mutations in the dimerization and metal-binding regions do not influence proteolysis.

transformed with the empty vector pASK-IBA5(+) or the inducible plasmid encoding Strep\_CueR and grown to exponential growth phase (M9 minimal medium; with the addition of 30 ng/ml AHT; 30°C). Cells were stressed with increasing CuSO<sub>4</sub> concentrations for 1 h and β-galactosidase activity was measured in Miller Units (MU). Standard deviations were calculated from at least two independent experiments (B). Plasmid-encoded Strep\_CueR was expressed for 20 min in exponential growth phase (M9 minimal medium; 30°C) in E. coli (MC4100). Translation was blocked by addition of Cm. Samples were taken at indicated time points, subjected to SDS-PAGE, Western transfer, and immunodetection. Half-lives (T1/2) and standard deviations were calculated from 10 independent experiments (C). In vivo degradation experiments with plasmid-encoded untagged CueR were performed as described above. Half-lives (T1/2) and standard deviations were calculated from five independent experiments (D). Stability of endogenous CueR was determined in E. coli MC4100 as described above. Half-lives (T1/2) and standard deviations were calculated from two independent experiments (E).

FIGURE 2 | Strep\_CueR is degraded by Lon, ClpXP, and ClpAP protease. Plasmid-encoded Strep\_CueR was expressed for 20 min in exponential growth phase (M9 minimal medium; 30°C) in different protease-deficient E. coli strains and their corresponding wild-type (Wt) strains. Translation was blocked by addition of Cm or with spectinomycin for the strain lacking all three proteases (ΔclpXP, Δlon, ΔhslUV) and its parental strain (MG1655) since the triple knockout strain is resistant to Cm. Samples were taken at indicated time points, subjected to SDS-PAGE, Western transfer, and immunodetection. Half-lives (T1/2) and standard deviations were calculated from at least two or three independent experiments.

Since some degrons are exposed at the termini of a substrate, we placed a Strep-tag at the C terminus to see whether it affects protein stability. Terminal tags have previously been shown to block proteolysis, for example of the Lon substrate SoxS (Griffith et al., 2004). Although activity of CueR\_Strep was unaffected (**Figure 3E**), the protein was stabilized about six-fold (**Figure 3G** and Figure S4), suggesting a contribution of the C-terminal end to protease targeting. We also constructed a C-terminally truncated, active version of Strep\_CueR lacking the last five C-terminal residues (**Figure 3F**). Strep\_CueRΔC5 was degraded like Strep\_CueR (**Figure 3G** and Figure S4), excluding that the last five amino acids of the C terminus are critical for recognition.

Sometimes the recognition process is aided by adaptor proteins (Battesti and Gottesman, 2013). Given that Strep\_CueR is primarily degraded by the Lon protease in vivo (**Figure 2**), we analyzed whether it is degraded by Lon in a reconstituted in vitro system. For this purpose, Strep\_CueR and Lon\_His6 were purified and subjected to in vitro degradation experiments. The replication inhibitor His6\_CspD served as a control protein as it is known to be a direct substrate of Lon in vitro (Langklotz and Narberhaus, 2011) (**Figure 4A**). In contrast to His6\_CspD, Strep\_CueR remained stable when incubated without (Figure S5) or with Lon\_His6, both in the absence and presence of ATP (**Figure 4B**). This is in contrast to ZntR, which is degraded by Lon but not by ClpXP in vitro (Pruteanu et al., 2007). On the one hand, it is possible that Lon needs to be allosterically activated for CueR degradation, as was shown for the replication initiator DnaA from Caulobacter crescentus. DnaA is degraded in vivo but remains stable in in vitro degradation experiments. When Lon is allosterically activated by the addition of unfolded proteins, DnaA is degraded in vitro (Jonas et al., 2013; Joshi and Chien, 2016). On the other hand, a factor mediating Strep\_CueR degradation might be missing in the purified system. Putative non-proteinaceous regulatory molecules might be guanosine pentaphosphate/tetraphosphate ((p)ppGpp) and inorganic polyphosphate (polyP), which are known to influence proteolysis of several AAA<sup>+</sup> protease substrates (Kuroda et al., 2001, 2006; Kuroda, 2006; Schäkermann et al., 2013; Bittner et al., 2015). However, we can exclude an involvement of (p)ppGpp and polyP in CueR proteolysis since the protein was degraded as in the wild type in strains lacking these regulatory molecules (data not shown).

A putative adaptor protein lacking in the in vitro system might sense the cellular copper status. This is reminiscent of the adaptor protein YjbH that is able to coordinate zinc ions and is involved in ClpXP-dependent degradation of the transcriptional regulator Spx in Bacillus subtilis and Staphylococcus aureus (Garg et al., 2009; Engman et al., 2012). To date, little is known about adaptors involved in Lon-dependent degradation. Recently, degradation of the master regulator of flagellar biosynthesis SwrA in B. subtilis was reported to be assisted by the swarming motility inhibitor A (SmiA), in vivo and in vitro. Hence, SmiA is the first described adaptor protein for Lon-dependent proteolysis (Mukherjee et al., 2015). Further studies targeted at identifying the CueR degron and potential adaptor proteins might reveal similarities and differences in the recognition logic of ZntR and CueR.

### Is Proteolysis of CueR Regulated?

As proteolysis of some metalloregulators, like ZntR, Ctr1p, or Mac1, is metal-dependent (Ooi et al., 1996; Zhu et al., 1998; Liu et al., 2007; Pruteanu et al., 2007), we analyzed the effect of defined copper concentrations on the stability of Strep\_CueR. E. coli cells harboring an AHT-inducible plasmid encoding Strep\_CueR were grown to exponential growth phase under copper-limited conditions. Cells were then supplemented with various copper concentrations for 1 h followed by in vivo degradation experiments. The half-life of Strep\_CueR remained similar at CuSO<sub>4</sub> concentrations between 0 and 200 µM (**Figure 5**), indicating that the cellular copper level has little effect on CueR stability.

FIGURE 3 | Activity and stability of various CueR variants in *E. coli*. Comparison of the amino acid sequence of CueR, ZntR, and MerR. DNA-binding domain and dimerization domain of CueR are marked in dark gray and light gray, respectively. Amino acids, which were substituted in different variants used in this study, are highlighted with arrows and the two copper-binding cysteines of CueR are indicated with gray circles. (\* = identical amino acid; : = conserved substitution; . = semi conserved substitution; − = lacking amino acid). Alignment was performed by using the align tool of the uniprot database (http://www.uniprot.org/) (A). E. coli ΔcueR, Φ(copA-lacZ) cells harboring inducible plasmids encoding Strep\_CueRR18A (B), Strep\_CueRA78C (C), Strep\_CueRC112S (D), CueR\_Strep (E), or Strep\_CueRΔC5 (F) were grown in M9 minimal medium with 30 ng/ml AHT at 30°C to log phase. Cells were then treated with increasing CuSO<sub>4</sub> concentrations for 1 h. β-galactosidase activity and standard deviations were calculated from at least two independent experiments (B–F). Plasmid-encoded CueR variants were expressed for 20 min in exponential growth phase (M9 minimal medium; 30°C). Translation was blocked by addition of Cm. Samples were taken at indicated time points, subjected to SDS-PAGE, Western transfer, and immunodetection. Half-lives (T1/2) and standard deviations were calculated from at least three independent experiments. For comparison half-life of Strep\_CueR is presented (G).

Degradation of several protease substrates was recently shown to be growth phase-dependent (Langklotz and Narberhaus, 2011; Westphal et al., 2012; Bittner et al., 2015). Therefore, we examined whether CueR stability depends on the growth status of E. coli and performed in vivo degradation experiments with Strep\_CueR in different growth phases. All experiments described above were performed in M9 minimal medium at 30°C. To allow optimal growth, LB medium and a temperature of 37°C were chosen for this experiment (**Figure 6A**). When no additional copper was added to the culture, degradation was accelerated about twofold from early exponential to exponential growth phase but remained the same in late exponential growth phase (**Figure 6B**). Strep\_CueR was not detectable in later growth phases. Addition of external copper to the growth medium right from the beginning of the experiment led to constant half-lives in the range between 8 and 10 min (**Figures 6C–E**).

To address whether sudden copper stress affects CueR degradation, we carried out in vivo degradation experiments over the entire growth curve with a copper pulse ∼2.5 h after inoculation (**Figure 7A**). As shown above (**Figure 6B**), Strep\_CueR showed a slightly accelerated degradation upon entry into exponential growth prior to copper treatment (**Figures 7B–D**; time points I and II). Immediately after a copper pulse of 10, 100, or 200 µM CuSO<sub>4</sub>, the stability of Strep\_CueR increased about twofold before it returned to preshock values (**Figures 7B–D**, time points III and IV), indicating that the cells sensed and slightly reacted to altered copper concentrations. Again, the transcription factor was not detectable in late growth phases. Accelerated degradation of CueR in copper-starved fast-growing cells and transient stabilization of the protein after copper shock are consistent with the physiological demand for this copper export regulator. This is in good agreement with (i) ZntR, which is also only stabilized about two-fold after the addition of zinc (Pruteanu et al., 2007) and (ii) the estimation that newly synthesized CopA proteins reach sufficient efflux power about 2 min after addition of copper to clear excess copper from the cytosol (Tottey et al., 2007).

Overall, it seems that E. coli continuously degrades CueR with minor adjustments to the external copper status. We propose that this is due to the fast clearance of excess copper after CueR activation of the Cue system, which might not require a long-term stabilization of CueR. Secondly, proteolysis might preferentially erase the overrepresented copper-loaded form of CueR, thereby contributing to the maintenance of a copper-free CueR pool derived from new synthesis to allow measuring the acute copper level in the cell. Apo-CueR may exchange holo-CueR dimers bound to the corresponding promoters via the recently postulated mechanisms of direct substitution or assisted dissociation (Joshi et al., 2012; Chen et al., 2013). Both pathways are based on the formation of a very short-lived transition state (determined as Protein2-DNA ternary complex), in which two CueR dimers (e.g., apo- and holo-CueR) bind to the extended spacer sequence of the -35 and -10 regions of copA or cueO with one of their DNA-binding domains. Given the instability of this state, one CueR protein, e.g., holo-CueR, loses its grip on the dyad giving the other CueR dimer, apo-CueR, the chance to fully bind to the dyad with both of its DNA-binding domains (direct substitution) or both dimers fall off the DNA (assisted dissociation) (Joshi et al., 2012; Chen et al., 2013, 2015). The constitutive proteolysis of CueR described in this study might contribute to an accurate adjustment of the CueR pool always prepared to react to the current cellular copper level to efficiently maintain copper homeostasis.

#### AUTHOR CONTRIBUTIONS

LB, SS, AK, and FN designed the study. LB and AK performed the experiments. LB and FN wrote the manuscript. All authors reviewed the results and approved the final version of the manuscript.

#### ACKNOWLEDGMENTS

Julia Horstmann is acknowledged for plasmid construction and initial steps in this project. The authors gratefully thank Regine Hengge, Eliora Ron, Axel Mogk, Eberhard Klauck, and Thomas V. O'Halloran for the kind gift of E. coli strains, Akira Ishihama for the CueR antibody, and Robert T. Sauer for providing the Lon expression system. Jan Arends, Johanna Roßmanith, and Linna Danne are acknowledged for critical reading of the manuscript. We gratefully acknowledge financial support by a grant from the German Research Foundation (DFG; SFB642, GTP-, and ATP-dependent membrane processes) to FN.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmolb.2017.00009/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Bittner, Kraus, Schäkermann and Narberhaus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Utilization of Mechanistic Enzymology to Evaluate the Significance of ADP Binding to Human Lon Protease

Jennifer Fishovitz <sup>1</sup> , Zhou Sha<sup>2</sup> , Sujatha Chilakala<sup>3</sup> , Iteen Cheng<sup>2</sup> , Yan Xu<sup>3</sup> and Irene Lee<sup>2</sup> \*

*<sup>1</sup> Department of Chemistry and Physics, Saint Mary's College, Notre Dame, IN, United States, <sup>2</sup> Department of Chemistry, Case Western Reserve University, Cleveland, OH, United States, <sup>3</sup> Department of Chemistry, Cleveland State University, Cleveland, OH, United States*

Lon, also known as Protease La, is one of the simplest ATP-dependent proteases. It is a homooligomeric enzyme comprised of an ATPase domain and a proteolytic domain in each enzyme subunit. Despite sharing only about 40% sequence identity, human and *Escherichia coli* Lon proteases utilize a highly conserved ATPase domain found in the AAA+ family to catalyze ATP hydrolysis, which is needed to activate protein degradation. In this study, we utilized mechanistic enzymology techniques to show that despite comparable k<sub>cat</sub> and K<sub>m</sub> parameters found in the ATPase activity, human and *E. coli* Lon exhibit significantly different susceptibility to ADP inhibition. Due to the low affinity of human Lon for ADP, the conformational changes in human Lon generated from the ATPase cycle are also different. The relatively low affinity of human Lon for ADP cannot be accounted for by reversibility in ATP hydrolysis, as a positional isotope exchange experiment demonstrated that both *E. coli* Lon and human Lon catalyze ATP hydrolysis irreversibly. A limited tryptic digestion study, however, indicated that human and *E. coli* Lon bind to ADP differently. Taken together, the findings reported in this research article suggest that human Lon is not regulated by a substrate-promoted ADP/ATP exchange mechanism as found in the bacterial enzyme homolog. The drastic difference in structural changes associated with ADP interaction with the two protease homologs offers potential for selective inhibitor design and development through targeting the ATPase sites. In addition to revealing unique mechanistic differences that distinguish human vs. bacterial Lon, this article underscores the benefit of mechanistic enzymology in deciphering the physiological mechanism of action of Lon proteases and perhaps other closely related ATP-dependent proteases in the future.

Keywords: ADP affinity, Lon protease, ADP-ATP exchange mechanism, steady-state kinetic, nucleotide induced conformational changes

# INTRODUCTION

Lon (protease La) is an ATP-dependent serine protease that is found ubiquitously in nature. In eukaryotes, Lon is localized in the mitochondria and helps maintain proper cellular function, while in prokaryotes it is found in the cytosol (Charette et al., 1981; Chung and Goldberg, 1981; Amerik et al., 1991; Wang et al., 1993, 1994; Goldberg et al., 1994; Suzuki et al., 1995). Lon, like other ATP-dependent proteases such as FtsH, ClpAP, ClpXP, and HslUV, belongs to the AAA+ (ATPase Associated with various cellular Activities) family of proteins. These proteins contain an ATPase domain, which is highly conserved and contains Walker A and B motifs where ATP binding and hydrolysis take place (Neuwald et al., 1999; Ogura and Wilkinson, 2001). Lon is considered to be one of the simplest proteases because it contains both the ATPase and protease domain in a single subunit (Gottesman and Maurizi, 1992; Maurizi, 1992; Rep and Grivell, 1996).

#### Edited by:

*Walid A. Houry, University of Toronto, Canada*

#### Reviewed by:

*Cynthia Dupureur, University of Missouri–St. Louis, United States Jason C. Young, McGill University, Canada*

> \*Correspondence: *Irene Lee Irene.lee@case.edu*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

Received: *31 December 2016* Accepted: *21 June 2017* Published: *11 July 2017*

#### Citation:

*Fishovitz J, Sha Z, Chilakala S, Cheng I, Xu Y and Lee I (2017) Utilization of Mechanistic Enzymology to Evaluate the Significance of ADP Binding to Human Lon Protease. Front. Mol. Biosci. 4:47. doi: 10.3389/fmolb.2017.00047*

Lon protease has three activities: intrinsic ATPase, substrate-stimulated ATPase, and ATP-dependent proteolysis. In bacteria such as Escherichia coli, the main function of Lon (ELon) is to degrade damaged, irregular and short-lived regulatory proteins in cells in order to maintain proper cellular function (Gottesman and Zipser, 1978; Gottesman et al., 1981; Goldberg and Waxman, 1985; Gottesman and Maurizi, 1992; Maurizi, 1992; Goldberg et al., 1994; Gottesman, 1996). In humans, Lon is critical for maintaining the structure and integrity of mitochondria (Bota et al., 2005). Human Lon (hLon) has been found to selectively degrade accumulating proteins damaged by oxidative stress over their native counterparts (Bota and Davies, 2001, 2002).

Lon preferentially degrades damaged or misfolded proteins at its proteolytic site while ATP is bound and hydrolyzed into ADP and inorganic phosphate (Pi) at its ATPase site. In ELon, ADP was found to act as an inhibitor that binds to the enzyme with much higher affinity than ATP (Thomas-Wohlever and Lee, 2002). Kinetic studies indicated that ADP release is the rate-limiting step along the reaction pathway of ELon (Menon and Goldberg, 1987a,b; Vineyard et al., 2005). These kinetic studies support the model of ADP/ATP exchange, in which the enzyme becomes proteolytically "inactive" when ADP is bound (Waxman and Goldberg, 1986; Goldberg et al., 1994). When the protein substrate interacts with Lon at the proteolytic active site, it promotes the release of ADP at the ATPase site, which is considered the rate-limiting step. Lon is only proteolytically "active" when bound ADP is exchanged with ATP (Menon and Goldberg, 1987b). In bacterial Lon, in vitro nucleotide binding and ADP inhibition kinetic studies suggest that the proteolytic activity could be regulated by the cellular ATP/ADP level.

Sequence alignment of hLon, ELon, and Salmonella Typhimurium Lon revealed that bacterial Lon proteases such as ELon and S. Typhimurium Lon share greater than 99% sequence identity. However, they share only 42% identity with hLon, although a much higher sequence homology is found within the ATPase domain (Goldberg et al., 1994; Johnson et al., 2008). Since bacterial and human Lon exhibit high sequence homology in their ATPase sites and comparable steady-state kinetic parameters in ATPase activity (Frase et al., 2006), it is plausible that the substrate-promoted ADP/ATP exchange mechanism found in ELon is also used to regulate the proteolytic activity of human Lon in the mitochondria. As mitochondrial Lon functions to degrade oxidized proteins, it is suggested that the protein substrate will bind Lon allosterically in order to reverse ADP inhibition in mitochondria by promoting ADP release. If this is the case, then the levels of oxidized protein vs. ADP serve to regulate Lon's activity (Bulteau et al., 2005). As such, the ratio of ADP to oxidized proteins in the mitochondria is kept constant by Lon degradation in order to maintain balance.

To evaluate the effect of ADP on human Lon peptidase activity, the fluorogenic peptidase assay previously (Lee and Berdis, 2001; Thomas-Wohlever and Lee, 2002) used to perform mechanistic characterization of bacterial Lon was used in this study to determine the inhibition profile of ADP for human Lon. Using a limited tryptic digestion assay (Patterson et al., 2004), the effect of ADP on the structural changes in human Lon was assessed. A positional isotope exchange experiment that was used to determine the reversibility of ATP hydrolysis in ELon was also used to study human Lon.

# MATERIALS AND METHODS

#### Materials

Fmoc-protected amino acids, Boc-2-Abz-OH, Fmoc-Lys(Aloc)-Wang resin, Fmoc-Abu-Wang resin, and HBTU were purchased from Advanced ChemTech and NovaBioChem. Tris, IPTG, chromatography media, DTT, Mg(OAc)<sub>2</sub>, trypsin, kanamycin, chloramphenicol, ATP, DMSO, Tween 20, and all other materials were purchased from Fisher, Sigma, and Amresco.

#### General Methods

All reaction conditions are listed as final concentrations. Enzyme concentrations are reported as monomer concentrations as quantified by Bradford assay (Bradford, 1976) or absorbance at 280 nm using the molar extinction coefficient (Gill and von Hippel, 1989). Synthesis of FRETN 89–98 (fluorescent and nonfluorescent analogs) was performed as previously described (Thomas-Wohlever and Lee, 2002; Frase and Lee, 2007). Peptides were quantified by absorbance at 280 nm using the extinction coefficient. All reactions were run at 37°C unless otherwise stated.

#### Expression and Purification of Human Lon Protease

Human Lon was expressed and purified as previously described (Frase et al., 2006) with the following modifications. Rosetta (DE3) cells expressing human Lon were grown at 37°C in superbroth (SB) containing 30 µg/mL kanamycin and 34 µg/mL chloramphenicol until they reached an OD<sub>600</sub> of 1.0, at which point they were induced with 1 mM IPTG for 1 h at 37°C. After induction, cells were harvested at 3000 × g at 4°C. Pelleted cells were combined and resuspended in 50 mM KP<sub>i</sub> lysis buffer (all buffers contain 5 mM BME, 20% glycerol, and 0.01% Tween 20 unless otherwise stated) and lysed in a Dounce homogenizer on ice three times. For complete lysis, cells were sonicated for 5 min in 15 s pulses at 100 V. Cell lysate was cleared by centrifugation at 20,000 × g for 2 h at 4°C. Cleared lysate was immediately loaded onto a P11 cation exchange column (Whatman) equilibrated in lysis buffer and the flow-through was collected. The column was then washed with 0.1 M KP<sub>i</sub> wash buffer until protein was no longer coming off the column. Finally, Lon was eluted with a linear gradient of 0.1 M KP<sub>i</sub> to 0.5 M KP<sub>i</sub> buffer, collected in 20 mL fractions. Fractions were tested for protein content with Bradford dye and positive fractions were analyzed by SDS-PAGE. Fractions containing Lon were combined and diluted to a final KP<sub>i</sub> concentration of 110 mM, then loaded onto a DE52 anion exchange column (Whatman) equilibrated in 110 mM KP<sub>i</sub> buffer. Flow-through of the load was collected and the protein was eluted with 120 mM KP<sub>i</sub> buffer. Load and elution fractions were analyzed by SDS-PAGE and Lon-positive fractions were combined and concentrated to ∼6 mL using an Amicon YM-30 MWCO membrane. Protein was loaded onto a Sepharose S-300 gel filtration column equilibrated in hLon storage buffer (50 mM HEPES, 75 mM KP<sub>i</sub> pH 7, 5 mM DTT, 1 mM Mg(OAc)<sub>2</sub>, 150 mM NaCl, 20% glycerol, 0.01% Tween 20) and eluted with the same buffer. Fractions were analyzed by SDS-PAGE and Lon-positive fractions were combined, concentrated, quantified, aliquoted, and stored at −80°C.

**Abbreviations:** Abz, aminobenzoic acid; ADP, adenosine 5′-diphosphate; ATP, adenosine 5′-triphosphate; DLU, density light units; DTT, dithiothreitol; Fmoc, 9-fluorenylmethoxycarbonyl; IPTG, isopropyl β-D-1-thiogalactopyranoside; Mg(OAc)<sub>2</sub>, magnesium acetate; Ni-NTA, nickel nitrilotriacetic acid; ROS, reactive oxygen species; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; THF, tetrahydrofuran; Tris, tris(hydroxymethyl)aminomethane; TMSDEA, trimethylsilyl diethylamine; GC/MS, gas chromatography/mass spectrometry.

#### ADP Inhibition of Human Lon Peptidase Activity

Reactions containing 50 mM HEPES (pH 8.0), 5 mM Mg(OAc)2, 2 mM DTT, 300 nM hLon and varying amounts of FRETN 89–98 and ADP were initiated by the addition of 50µM ATP. Peptide cleavage was monitored at 420 nm (λex = 320 nm) on a FluoroMax-3 or FluoroMax-4 fluorometer (Horiba Group) at 37°C. The rate of peptide cleavage was determined from the slope of a line tangent to the linear phase of the time course and normalized to the fluorescence change for complete peptide cleavage by trypsin. Observed rate constants (kobs) were obtained by dividing this rate by the enzyme concentration. Kinetic parameters were determined by global fitting of the data to a non-competitive inhibition model (Equation 1; Cleland, 1979) using GraphPad Prism 6.

$$k\_{obs} = \frac{k\_{cat} \times S^n}{K' \left[1 + \frac{I}{K\_{is}}\right] + S^n \left[1 + \frac{I}{K\_{ii}}\right]} \tag{1}$$

where kobs is the observed rate constant for peptide cleavage, kcat is the maximum rate constant, S is the peptide substrate concentration, n is the Hill coefficient, K′ is the observed Michaelis constant for the peptide substrate, I is the inhibitor concentration, and Kis and Kii are the inhibition constants at low and high concentrations of peptide substrate, respectively. K′ is converted to the true Michaelis constant (Km) using Equation 2 (Cleland, 1979).

$$\log K\_m = \frac{\log K'}{n} \tag{2}$$
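Equations 1 and 2 can be evaluated numerically, as in the following minimal sketch. This is an illustration only and not the fitting procedure used in the study, which was carried out in GraphPad Prism. The function and argument names are hypothetical. The values of kcat, Kii, Kis, and n are those reported for the fit, whereas the value of K′ used here is a placeholder assumption because it is not recoverable from the text.

```python
import math


def k_obs(s, i, k_cat, k_prime, n, k_is, k_ii):
    """Equation 1: observed rate constant for peptide cleavage.

    s and i are the peptide and inhibitor (ADP) concentrations in µM.
    k_prime has units of µM**n; k_is and k_ii are in µM.
    """
    s_n = s ** n
    return (k_cat * s_n) / (k_prime * (1 + i / k_is) + s_n * (1 + i / k_ii))


def km_from_k_prime(k_prime, n):
    """Equation 2: log Km = (log K') / n, equivalent to Km = K' ** (1 / n)."""
    return 10 ** (math.log10(k_prime) / n)


if __name__ == "__main__":
    # Reported fit values: kcat (1/s), Kis and Kii (µM), Hill coefficient n.
    K_CAT, K_IS, K_II, N = 6.63, 2077.0, 1499.0, 1.25
    # ASSUMPTION: placeholder K' (µM**n); the fitted value is not given here.
    K_PRIME = 1000.0

    km = km_from_k_prime(K_PRIME, N)
    print(f"Km from placeholder K' = {km:.1f} µM")
    for adp in (0.0, 500.0, 2000.0):
        rate = k_obs(km, adp, K_CAT, K_PRIME, N, K_IS, K_II)
        print(f"[ADP] = {adp:6.0f} µM -> kobs = {rate:.2f} 1/s")
```

In the absence of inhibitor, the model returns kcat/2 when S equals Km, which is a quick consistency check between the two equations.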

#### Effect of Phosphate on Steady-State ATPase Activity

Reactions containing 50 mM HEPES (pH 8), 5 mM Mg(OAc)2, 2 mM DTT, and 150 nM hLon in the absence and presence of 1 mM sodium phosphate (NaP<sub>i</sub>, pH 7.2) were initiated with 1 mM [α-<sup>32</sup>P]ATP and incubated at 37°C. Aliquots were quenched at various time points (0–15 min) in 0.5 N formic acid, and 3µL was spotted on a PEI-cellulose TLC plate and developed in 0.3 M KP<sub>i</sub> (pH 3.4). The amount of ADP produced was determined using Equation (3).

$$[ADP] = \frac{ADP\_{DLU}}{(ATP\_{DLU} + ADP\_{DLU})} \ast [ATP]\_i \tag{3}$$

where [ADP] is the amount of ADP produced, DLU is the density light units quantified, and [ATP]<sub>i</sub> is the initial concentration of ATP.
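Equation 3 converts the TLC spot intensities into an ADP concentration, and a steady-state rate then follows from the slope of [ADP] against time. The sketch below shows one way this could be computed. The function names and the example densitometry values are hypothetical and are not data from this study.

```python
def adp_produced(adp_dlu, atp_dlu, atp_initial):
    """Equation 3: [ADP] = ADP_DLU / (ATP_DLU + ADP_DLU) * [ATP]_i."""
    total = adp_dlu + atp_dlu
    if total <= 0:
        raise ValueError("total density light units must be positive")
    return adp_dlu / total * atp_initial


def least_squares_slope(times, values):
    """Slope of an ordinary least-squares line through (time, value) points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


if __name__ == "__main__":
    atp_initial_um = 1000.0  # 1 mM ATP, as in the reaction conditions
    # ASSUMPTION: invented (ADP_DLU, ATP_DLU) pairs for illustration only.
    spots = {0: (0.0, 1000.0), 5: (50.0, 950.0), 10: (100.0, 900.0), 15: (150.0, 850.0)}
    times = sorted(spots)
    adp = [adp_produced(*spots[t], atp_initial_um) for t in times]
    rate = least_squares_slope(times, adp)  # µM ADP per min
    print(f"steady-state rate = {rate:.1f} µM/min")
    print(f"per 150 nM enzyme = {rate / 0.150 / 60:.2f} 1/s")
```

Dividing the slope by the enzyme concentration gives an observed rate constant that can be compared between reactions run with and without added phosphate.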

#### Positional Isotope Exchange

Isotopically enriched H<sub>2</sub><sup>18</sup>O was acquired from Sigma. The ATPase reaction was carried out in 150µL total volume with 50 mM Tris pH 7.5, 2 mM DTT, 2 mM Mg(OAc)2, 25% H<sub>2</sub><sup>18</sup>O, 2 mM ATP, 1µM WT hLon, with and without 8µM λN, a protein substrate that stimulates the ATPase activity of Lon. A control experiment was conducted in the absence of enzyme. The reaction mixture was incubated at 37°C and quenched with 2µL 0.5 M EDTA after 120 min. The aqueous layer containing the phosphate was retained by extraction first with phenol-chloroform, then with chloroform alone. Inorganic phosphate (Pi) was purified from the aqueous layer using a 2 cm AG1-X1 ion exchange column in a Pasteur pipet (Hackney et al., 1980). The ion exchange resin was activated by washing first with 4.5 mL of 1 M HCl, and then with H2O until the pH was above 4. The sample was then added and the column was washed with an additional 4.5 mL of H2O, then with 0.5 mL aliquots of 10 mM HCl until the pH was less than 2.5. The column was eluted with 2.5 mL of 30 mM HCl in 0.5 mL aliquots, which were combined and lyophilized to dryness. Trimethylsilyl phosphate (TMSP) was generated by derivatizing the inorganic phosphate with 10µL trimethylsilyl diethylamine (TMSDEA) and 100µL methylene chloride. The isotopic distribution was determined with a Varian gas chromatograph interfaced with a Varian Saturn 2100T ion trap mass spectrometer. A 30 m VF5-MS column was used for separation. The temperature profile began at 60°C, then increased by 20°C/min to 110°C, then 40°C/min to 240°C and held at 240°C for 5 min. The most abundant ion, M-CH<sub>3</sub> (MW = 300) was monitored. The ion detected after electron impact was (M-CH<sub>3</sub>)<sup>+</sup> (MW = 299). The experimental relative abundance is calculated using Equation (4).

$$\text{relative } \% \text{ isotope} = \frac{\text{signal}\_{\text{isotope}}}{\text{signal}\_{\text{primary}}} \tag{4}$$

The derivative TMSP contains silicon, whose heavy isotopes <sup>29</sup>Si and <sup>30</sup>Si have a high natural abundance, which can obscure the interpretation of the <sup>18</sup>O incorporation results (Hackney et al., 1980). This is known as isotopic spillover and can be calculated according to **Table 1**. To correctly account for the enrichment due to <sup>18</sup>O, the expected isotopic abundance was subtracted from the experimental abundance. There should be no enhancement at M+1. Any enhancement at M+2 is a result of <sup>18</sup>O incorporated into the phosphate. The spillover from the enriched species must be subtracted from the higher molecular weight isotopes, in addition to the expected natural abundance.

TABLE 1 | Calculations of the percent isotopic enrichment using the natural isotopic abundance of tris(trimethylsilyl)phosphate minus one methyl group.

\**Indicates enrichment value generated from calculation in same row.*
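The natural-abundance correction described above can be sketched as follows. This is an illustrative calculation, not a reproduction of Table 1. It assumes the monitored ion has the elemental composition C8H24O4PSi3 (tris(trimethylsilyl)phosphate, C9H27O4PSi3, minus one methyl group) and uses standard representative isotopic abundances supplied here rather than taken from the source. Only the first-order step (observed minus expected natural abundance) is shown, and the observed values in the usage example are hypothetical.

```python
# Representative natural isotopic abundances, indexed by nominal mass shift
# (+0, +1, +2). ASSUMPTION: standard reference values, not from the source.
ABUNDANCE = {
    "H": [0.999885, 0.000115],
    "C": [0.9893, 0.0107],
    "O": [0.99757, 0.00038, 0.00205],
    "P": [1.0],
    "Si": [0.92223, 0.04685, 0.03092],
}


def convolve(a, b):
    """Discrete convolution of two mass-shift distributions."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def isotope_pattern(formula, max_shift=4):
    """Fraction of molecules at each nominal mass shift M+0 .. M+max_shift."""
    dist = [1.0]
    for element, count in formula.items():
        for _ in range(count):
            # Truncation is safe: higher shifts never feed back into lower ones.
            dist = convolve(dist, ABUNDANCE[element])[: max_shift + 1]
    return dist


def relative_percent(dist):
    """Equation 4 expressed in percent: signal of each isotope over M+0."""
    return [100.0 * p / dist[0] for p in dist]


def enrichment(observed_rel, expected_rel):
    """Observed minus expected relative abundance at each mass shift."""
    return [o - e for o, e in zip(observed_rel, expected_rel)]


if __name__ == "__main__":
    # ASSUMPTION: composition of the (M-CH3)+ ion inferred from the Table 1 caption.
    ion = {"C": 8, "H": 24, "O": 4, "P": 1, "Si": 3}
    expected = relative_percent(isotope_pattern(ion))
    print("expected M+0..M+4 (%):", [round(x, 2) for x in expected])
    # Hypothetical observed pattern with extra signal at M+2 only.
    observed = [expected[0], expected[1], expected[2] + 19.0, expected[3], expected[4]]
    print("enrichment (%):", [round(x, 2) for x in enrichment(observed, expected)])
```

A full treatment would also subtract the spillover of the enriched M+2 species into M+3 and M+4, as the text notes; that second step is omitted here for brevity.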

# Tryptic Digests

Trypsin digestion reactions in a mixture containing 6µM WT hLon or 1.5µM WT ELon, 50 mM HEPES (pH 8.0), 5 mM Mg(OAc)2, 2 mM DTT, and 1 mM ADP were initiated by the addition of 1/50 (w/w) or 1/275 (w/w) TPCK (N-p-tosyl-L-phenylalanyl chloromethyl ketone)-treated trypsin with respect to Lon. At 0, 15, and 30 min, a 5µL reaction aliquot was quenched with 5µg of soybean trypsin inhibitor (SBTI) followed by boiling at 100°C for 5 min. The quenched reactions were then resolved by 12.5% SDS-PAGE and visualized with Coomassie Brilliant Blue.

# RESULTS AND DISCUSSION

## ADP Inhibition of Peptide Cleavage by Human Lon as a Function of Peptide Concentration

A fluorescent peptide substrate denoted as FRETN 89–98 was used to monitor the inhibition of hLon activity in a continuous peptidase assay. This 11-mer peptide was derived from the sequence of the λN protein (Maurizi, 1987); it contains an anthranilamide donor at one terminus and a 3-nitrotyrosine quencher at the other terminus, with a single cleavage site for Lon (Lee and Berdis, 2001). Upon cleavage by Lon protease in the presence of ATP hydrolysis, the peptide separates into two pieces and shows an increase in fluorescence as the quencher is separated from the fluorophore. Protease activity is measured by monitoring fluorescence emission over time. The fluorescence trace contains a short lag phase, followed by a linear phase, then a leveling out of fluorescence indicating substrate depletion. The slope of the linear phase corresponds to the rate of peptide degradation, which can be converted to observed rate constants for comparative studies.

Steady-state peptidase time courses were run in the presence of K<sub>m</sub>-level ATP (Frase et al., 2006), varying amounts of peptide substrate, and varying amounts of ADP. The rate of each time course was quantified from the slope of a line tangent to the linear phase of the time course. The resulting rate constant data were analyzed using the global fitting program mentioned in Materials and Methods (**Figure 1**) to yield the kinetic parameters shown in **Table 2**. Fitting of the data to


TABLE 2 | Kinetic Parameters for ADP inhibition of peptide cleavage by human Lon, determined by curve fitting with the indicated software.

µM, *k*<sub>cat</sub> = 6.63 sec<sup>−1</sup>, Kii = 1499µM, Kis = 2077µM, and *n* = 1.25.


2002) it can be discerned that ADP binds ∼300–1500-fold less tightly to human Lon than it does to bacterial Lon, making it a weaker inhibitor of peptidase activity. This result suggests that while ADP release may be rate-limiting in the mechanism of bacterial Lon (Thomas-Wohlever and Lee, 2002; Vineyard et al., 2005), human Lon more likely has a different rate-limiting step, a distinction between the mechanisms that must be explored further in the future. The fact that hLon binds ADP with much lower affinity than ATP in the presence of a protein substrate such as λN indicates that the substrate-promoted ADP/ATP exchange mechanism found in ELon does not exist in hLon. Additional mechanistic studies directed toward identifying physiological changes in mitochondria that regulate hLon activity are currently underway.

# Effect of Phosphate on Steady-State ATPase Activity

As Lon catalyzes the hydrolysis of ATP to yield ADP and inorganic phosphate, phosphate rather than ADP release may limit hLon turnover. The rates of ATPase activity of hLon were measured in the absence and presence of 1 mM sodium phosphate (NaPi) as described in Materials and Methods. As shown in **Figure 2**, the rate of ATP hydrolysis in the presence of 1 mM NaPi is not significantly inhibited, suggesting that phosphate release is not the rate-limiting step in the mechanism. Combined with the fact that ADP binds very weakly to human Lon, these results indicate that the rate-limiting step is not associated with the release of either product.

#### Positional Isotope Exchange (Scheme 1)

The difference in ADP binding between ELon and hLon may be attributed to differences in the reversibility of ATP hydrolysis catalyzed by the two enzymes. Previously, it was demonstrated that ATP hydrolysis was irreversible in ELon (Thomas et al., 2010). In order to determine if ATP hydrolysis is reversible in hLon, we determined the number of isotopically labeled oxygen atoms (<sup>18</sup>O) incorporated into the phosphate from an enriched reaction mixture by comparison to natural isotopic abundance. The rationale for the experimental design is summarized in **Scheme 1**. The Lon-catalyzed ATPase reaction is conducted in the presence of <sup>18</sup>O-enriched aqueous buffer such that <sup>18</sup>O will be incorporated into the inorganic phosphate generated from ATP hydrolysis. If ATP hydrolysis is irreversible, only Pi carrying a single <sup>18</sup>O (M+2) will be detected. If the <sup>18</sup>O-enriched Pi reforms ATP at the enzyme binding site and is then hydrolyzed by additional H<sub>2</sub><sup>18</sup>O, the molecular weight of inorganic Pi will be higher than M+2, as illustrated in **Scheme 1**. Therefore, the reversibility of ATP hydrolysis catalyzed by Lon can be deduced by determining the extent of <sup>18</sup>O incorporation into the Pi product under steady-state enzyme catalysis conditions. To facilitate the quantitative analysis of <sup>18</sup>O incorporation into the Pi product, inorganic phosphate is derivatized by TMSDEA to yield a compound with a boiling point of 228–229°C that can be analyzed by GC/MS.
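The logic of this design can be illustrated with a simple probability sketch. It assumes, as a simplification not stated in the source, that each solvent-derived oxygen in Pi is independently <sup>18</sup>O with a probability equal to the fractional enrichment of the water (25% in the Methods), that natural-abundance <sup>18</sup>O is ignored, and that each reversal of hydrolysis adds one more solvent-derived oxygen. The function name is hypothetical.

```python
from math import comb


def o18_distribution(n_solvent_oxygens, enrichment):
    """Probability that Pi carries j atoms of 18O, for j = 0..n_solvent_oxygens.

    Each 18O adds 2 mass units, so index j corresponds to the M+2j species.
    Simplified binomial model: every solvent-derived oxygen is labeled
    independently with probability `enrichment`.
    """
    n, f = n_solvent_oxygens, enrichment
    return [comb(n, j) * f**j * (1 - f) ** (n - j) for j in range(n + 1)]


if __name__ == "__main__":
    f = 0.25  # fractional H2(18)O enrichment of the reaction buffer
    for n in (1, 2, 3):
        labels = ", ".join(
            f"M+{2 * j}: {p:.3f}" for j, p in enumerate(o18_distribution(n, f))
        )
        print(f"{n} solvent-derived oxygen(s) -> {labels}")
```

Under this model a single solvent-derived oxygen gives no M+4 species at all, whereas any reversal followed by rehydrolysis produces a detectable M+4 fraction, which is why the absence of M+4 enrichment is diagnostic of irreversible hydrolysis.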

In this experiment, a control was included in which the natural <sup>18</sup>O abundance of H<sub>3</sub>PO<sub>4</sub> was determined. **Table 3** shows that the GC/MS approach accurately detected the expected natural abundance of <sup>18</sup>O in H<sub>3</sub>PO<sub>4</sub>, thereby validating this detection method. Like bacterial Lon, human Lon possesses intrinsic ATPase activity that is stimulated by protein and certain peptide substrates. To evaluate the effect of protein substrate on the reversibility of the ATPase activity of human Lon, the ATPase reactions were conducted in the absence and presence of the lambda N protein (λN), which is degraded by human Lon and stimulates the ATPase activity (Maurizi, 1987). The results of <sup>18</sup>O incorporation into inorganic Pi generated by hLon-catalyzed ATP hydrolysis in the absence and presence of λN are shown in **Table 4**. Since the isotopic distribution of trimethylsilylphosphate was enriched at M+2, one <sup>18</sup>O was incorporated into the inorganic phosphate (Pi) generated from the hydrolysis of ATP. The <sup>18</sup>O distribution in the Pi product is consistent with the incorporation of one <sup>18</sup>O, as no Pi species carrying additional <sup>18</sup>O beyond the natural abundance were detected. As shown in **Table 4**, in the absence of λN, an enrichment of 3 ± 1% in M+2 was detected (**Table 4A**, average of the two trials shown in calculated enrichment). In the presence of the λN protein substrate, an enrichment in M+2 of 19 ± 1% over the expected natural abundance was detected (**Table 4B**, average of the two trials shown in calculated enrichment). No significant enrichment was detected in the M+4 of Pi, which excludes the reformation of ATP from <sup>18</sup>O-labeled Pi generated during the first round of ATP hydrolysis. The detection of only one <sup>18</sup>O incorporated into the Pi product generated from hLon-catalyzed ATP hydrolysis in this study supports an irreversible ATPase mechanism. The observed difference in the calculated enrichment (3 vs. 19%) between the intrinsic and the λN-stimulated ATPase reactions is likely attributable to the relatively lower rate of ATP hydrolysis in the intrinsic ATPase reaction. A similar difference was also observed between the peptide-stimulated and intrinsic ATPase reactions catalyzed by E. coli Lon.

TABLE 3 | Calculated isotopic enrichment for control phosphate and for the potential incorporation of <sup>18</sup>O into Pi from non-enzymatic hydrolysis of ATP.

TABLE 4 | Experimental and calculated isotopic enrichment for the incorporation of <sup>18</sup>O into Pi from hydrolysis of ATP by hLon in the presence of isotopically enriched H<sub>2</sub>O and in the absence (A) and presence (B) of λN.

#### Effect of ADP on Tryptic Digest of Lon

Previously, limited tryptic digestion was used to probe the functional role of nucleotide binding to Lon (Patterson et al., 2004). Upon binding to ADP, ELon became more resistant to tryptic digestion and yielded a 67 kDa Lon fragment consisting of the ATPase and protease domains but lacking the first 240 residues (26 kDa) of the amino terminus. Since our inhibition data showed that hLon bound ADP with much lower affinity than ELon, we decided to use the same tryptic digestion assay to probe the interaction of hLon with ADP. **Figure 3** shows the limited tryptic digestion profiles of hLon (1µM) vs. ELon (1µM) incubated in the absence and presence of 1 mM ADP and digested by 1/50 (w/w) (**Figure 3A**) and 1/275 (w/w) (**Figure 3B**) trypsin under identical conditions (see Materials and Methods). The first time point was obtained 0.25 min after initiating the reaction with trypsin, before quenching an aliquot in SBTI and SDS loading buffer. The results indicated that hLon started to be degraded even before the aliquot was quenched, whereas ELon was intact, both in the absence and presence of ADP. In the ELon profile, a 67 kDa fragment persisted at the 15 and 30 min time points only in the presence of ADP. By comparison, hLon was rapidly digested by trypsin at a 1/50 (w/w) ratio to Lon in the presence and absence of ADP, and no specific ADP-protected fragment was detected in lanes 10–15 of **Figure 3A**, suggesting that ADP does not protect hLon from tryptic digestion to produce two defined fragments as in the case of ELon. When the ratio of trypsin to Lon was reduced to 1/275 (**Figure 3B**), intact ELon as well as the 67 kDa and 26 kDa ELon fragments were detected in the ADP-treated reactions (lanes 4 and 5). In the absence of ADP, the intensity of the 67 kDa ELon fragment was reduced and fragments corresponding to 42 and 37 kDa were detected.
By comparison, very faint hLon fragments were detected in tryptic digestion time points of hLon treated with and without ADP. A very faint band corresponding to an apparent molecular weight of 72 kDa (labeled with ∗ between lanes 11 and 12) was detected only in the time points containing ADP, suggesting this is an ADP-protected hLon fragment. To follow up on this observation, three times the amount of trypsin-digested hLon sample treated with and without ADP was resolved by 7.5% SDS-PAGE. As shown in **Figure 4A**, one hLon fragment, labeled II, was detected only in the ADP-treated reaction. The intensity of the fragment labeled I was stronger and persisted at the 60 min time point in the ADP-treated reaction. Fragments I and II of hLon were sequenced by Edman degradation to identify the tryptic sites, which are summarized in **Figure 4B**. Tryptic digestion site II matches the tryptic digestion site of ELon that was previously shown to be responsible for generating the 67 kDa ADP-protected ELon fragment shown in **Figure 3A**. The 72 kDa ADP-protected hLon fragment is consistent with the calculated molecular weight of the portion of mature human mitochondrial Lon containing the ATPase and proteolytic domains. Therefore, despite the longer hLon sequence and a 42% sequence homology, hLon and ELon bind ADP and undergo at least one structural change that exposes the same tryptic digestion site, suggesting the presence of at least one conserved structural change in the two enzyme homologs upon binding to ADP. However, the overall difference in the tryptic digestion patterns detected for ELon vs. hLon shown in **Figures 3A,B** could be attributed to differences in structural dynamics, local conformational flexibility, and/or accessibility of tryptic sites in the respective proteins, and will require a higher resolution method for further clarification. An identical experiment using 10 mM ADP in order to saturate hLon was also carried out with similar results (data not shown). The presence of a high amount of ADP, far more than would ever be present in vivo, did not protect hLon from digestion.

Certain bacteria, such as Salmonella enterica subspecies enterica serovar Typhimurium (S. Typhimurium), are responsible for causing a range of human diseases, such as gastroenteritis and typhoid fever. S. Typhimurium Lon protease is required for systemic infection in mice, which are a common model for studying S. Typhimurium infection in humans (Takaya et al., 2003). When Lon-deficient S. Typhimurium is administered as an oral vaccine in mice, it has been shown to confer protection against subsequent infection by S. Typhimurium (Matsui et al., 2003). ELon and S. Typhimurium Lon share >99% sequence identity (Johnson et al., 2008). In this study, we demonstrated that ADP binding by hLon and ELon differs significantly, suggesting that despite high sequence homology in the ATPase sites, there are mechanistic differences between the homologs. With the recent advances in high-throughput screening of inhibitors as well as activity probes for kinases, the variations in ADP binding by bacterial vs. human Lon could potentially be exploited to develop selective inhibitors against the bacterial enzyme homologs.

#### SUMMARY

Lon has drawn significant biomedical interest since its discovery. In bacteria, Lon contributes to the pathogenicity of certain species, whereas in humans, Lon contributes to the maintenance of mitochondrial integrity. Therefore, the ability to identify unique features in bacterial Lon will benefit the development of antibiotic agents. In eukaryotes, Lon is located in mitochondria, where ATP is synthesized. Since the proteolytic activity of Lon is coupled with its ATPase activity, which yields ADP, knowing the effect of ADP on the proteolytic activity of eukaryotic Lon will help decipher the mechanism by which the activity of Lon is regulated in mitochondria. Driven by these goals, this study undertook a mechanistic approach, using experiments comparable to those performed on ELon, to evaluate the effect of ADP on the structure and function of the human homolog. Results generated from this study were directly compared with those obtained for ELon to identify differences between the two proteases. By monitoring the extent of <sup>18</sup>O incorporation into the hydrolyzed inorganic phosphate product, we observed that hLon catalyzes ATP hydrolysis in an irreversible manner, as does ELon. Despite the comparable kcat and K<sub>m</sub> values for ATPase activity, the K<sub>i</sub> values of ADP toward the ATP-dependent peptidase activity of ELon were 300–1,500 times lower than those determined for hLon. Judging by the significant difference in protection from limited tryptic digestion in hLon incubated with ADP, we conclude that the mechanisms of ELon and hLon binding to ADP and/or ATP are different. In ELon, the binding interaction with ADP is strengthened by the removal of the gamma phosphate moiety, whereas in hLon, such binding interaction is significantly weakened. Based on this observation, we propose that exploring the difference in the binding mechanisms of ADP in ELon vs. hLon could serve as a viable strategy for developing selective inhibitors against Lon in pathogenic bacteria. Another significant finding of this work is that the protein substrate-promoted ADP/ATP exchange mechanism existing in ELon is absent in hLon, as the K<sub>i</sub> of ADP for hLon is >30-fold higher than the K<sub>m</sub> of ATP. In mitochondria, the anticipated level of ATP is at least at the millimolar level. Therefore, it is not likely that the proteolytic activity of hLon could be significantly affected by the concentration of ADP in the mitochondria. Given these considerations, the rate-limiting step governing the proteolytic activity of mitochondrial Lon, as well as the mechanism that regulates its activity, remains unknown. Since specific mutations of human Lon have been shown to cause diseases such as CODAS (Strauss et al., 2015), we propose that a more thorough mechanistic study of wild-type vs. mutant hLon will be needed to advance our understanding of the role played by hLon in mitochondrial biology.

#### AUTHOR CONTRIBUTIONS

IL designed the project, directed all experiments, analyzed and interpreted data, and wrote the manuscript. JF designed inhibition experiments, purified enzymes, synthesized peptides, analyzed and interpreted data, and wrote the manuscript. ZS performed limited tryptic digestion experiments. IC performed the ADP inhibition experiments, limited tryptic digestion experiments, <sup>18</sup>O exchange experiments, and data analysis. YX designed and directed the <sup>18</sup>O exchange experiment. SC performed the <sup>18</sup>O mass spec data acquisition.

#### FUNDING

All work performed in this study was supported by a grant awarded by NSF-Chemistry of Life Division: CHE-1507792.

#### REFERENCES

Amerik, A. Y., Antonov, V. K., Gorbalenya, A. E., Kotova, S. A., Rotanova, T. V., and Shimbarevich, E. V. (1991). Site-directed mutagenesis of La protease: a catalytically active serine protease. FEBS Lett. 287, 211–214. doi: 10.1016/0014-5793(91)80053-6

Bota, D. A., and Davies, K. J. (2001). Protein degradation in mitochondria: implications for oxidative stress, aging and disease: a novel etiological classification of mitochondrial proteolytic disorders. Mitochondrion 1, 33–49. doi: 10.1016/S1567-7249(01)00005-8

Bota, D. A., and Davies, K. J. (2002). Lon protease preferentially degrades oxidized mitochondrial aconitase by an ATP-stimulated mechanism. Nat. Cell Biol. 4, 674–680. doi: 10.1038/ncb836


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Fishovitz, Sha, Chilakala, Cheng, Xu and Lee. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multifunctional Mitochondrial AAA Proteases

Steven E. Glynn\*

*Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, NY, United States*

Mitochondria perform numerous functions necessary for the survival of eukaryotic cells. These activities are coordinated by a diverse complement of proteins encoded in both the nuclear and mitochondrial genomes that must be properly organized and maintained. Misregulation of mitochondrial proteostasis impairs organellar function and can result in the development of severe human diseases. ATP-driven AAA+ proteins play crucial roles in preserving mitochondrial activity by removing and remodeling protein molecules in accordance with the needs of the cell. Two mitochondrial AAA proteases, i-AAA and m-AAA, are anchored to either face of the mitochondrial inner membrane, where they engage and process an array of substrates to impact protein biogenesis, quality control, and the regulation of key metabolic pathways. The functionality of these proteases is extended through multiple substrate-dependent modes of action, including complete degradation, partial processing, or dislocation from the membrane without proteolysis. This review discusses recent advances made toward elucidating the mechanisms of substrate recognition, handling, and degradation that allow these versatile proteases to control diverse activities in this multifunctional organelle.

#### Edited by:

*Walid A. Houry, University of Toronto, Canada*

#### Reviewed by:

*Eyal Gur, Ben-Gurion University of the Negev, Israel Nico P. Dantuma, Karolinska Institutet, Sweden Johannes Herrmann, Kaiserslautern University of Technology, Germany*

> \*Correspondence: *Steven E. Glynn steven.glynn@stonybrook.edu*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

> Received: *30 March 2017* Accepted: *08 May 2017* Published: *22 May 2017*

#### Citation:

*Glynn SE (2017) Multifunctional Mitochondrial AAA Proteases. Front. Mol. Biosci. 4:34. doi: 10.3389/fmolb.2017.00034*

Keywords: mitochondria, proteolysis, i-AAA, m-AAA, AAA+

# INTRODUCTION

Mitochondria provide eukaryotic cells with a stage for performing essential activities, including mass ATP production, calcium ion storage, and fatty acid oxidation (Chan, 2006; McBride et al., 2006). These activities are coordinated by a diverse composite proteome encoded by genomes in both the nucleus and mitochondrial matrix (Anderson et al., 1981; Sickmann et al., 2003; Rhee et al., 2013; Calvo et al., 2016). Proteins synthesized in the cytosol must be imported into the organelle via a complex network of translocases, chaperones, and processing peptidases (Neupert and Herrmann, 2007). Once inside, mitochondrial proteins are exposed to damaging reactive oxygen species (ROS), by-products of oxidative phosphorylation (Beckman and Ames, 1998; Ugarte et al., 2010). Preserving mitochondrial function thus requires precise systems of proteostasis to balance the entry and exit of proteins into the organelle, remove damaged components to maintain uninterrupted activity, and respond to the changing energetic needs of the cell (Diaz and Moraes, 2008; Ugarte et al., 2010). One route for the removal of mitochondrial proteins is degradation by a network of proteolytic enzymes (Koppen and Langer, 2007). Together, these proteases select and destroy proteins to achieve a constant recycling of the mitochondrial proteome (Augustin et al., 2005). Absence of proper mitochondrial proteostasis is linked to the development of severe human diseases, including cancer and a host of neurodegenerative disorders (Bulteau and Bayot, 2011; Rugarli and Langer, 2012; Konig et al., 2016; Levytskyy et al., 2016). A recent report has suggested that the proteolytic capacity of mitochondria is used to clear cytosolic protein aggregates that are associated with aging (Ruan et al., 2017).

Mitochondria are enveloped by outer (MOM) and inner membranes (MIM), which enclose the aqueous intermembrane space (IMS) and matrix, respectively. Consequently, both energy-dependent and independent proteases are located across the organelle operating in both polar and non-polar environments (Koppen and Langer, 2007). Two AAA+ family members, collectively named the mitochondrial AAA proteases, are anchored to the MIM and engage substrates on either side of the membrane (Leonhard et al., 1996). A number of recent studies have provided insight into the diverse roles played by the mitochondrial AAA proteases in maintaining function of the organelle. This review will focus on our current understanding of the structural and mechanistic principles that allow these enzymes to recognize, engage, and process protein substrates.

#### AAA+ Proteins in Mitochondria

Mitochondria contain a number of AAA+ ATPases that can be traced to ancestral bacterial enzymes present during symbiogenesis (for review see Truscott et al., 2010). These proteins contain the family-specific sequence motifs responsible for ATP binding and hydrolysis, and presumably assemble into canonical ring-shaped oligomers (Hanson and Whiteheart, 2005). A feature of the AAA+ family is the coupling of the energy of ATP hydrolysis to power highly diverse functions. In mitochondria, these activities include nonproteolytic chaperones, such as Hsp78, a functional homolog of Hsp104/ClpB that promotes disaggregation of matrix proteins (Leonhardt et al., 1993). Mitochondria also contain a number of AAA+ proteases, including homologs of the well-studied soluble proteases, Lon (Pim1) and ClpXP, which remove oxidatively damaged proteins from the matrix (Wang et al., 1993; Suzuki et al., 1994; van Dyck et al., 1994; Corydon et al., 2000). Interestingly, yeast do not contain the ClpP proteolytic subunit and instead, the ClpX ATPase (Mcx1p) performs important nonproteolytic functions (van Dyck et al., 1998; Kardon et al., 2015). In bacteria, FtsH is a AAA+ zinc-metalloprotease that degrades substrates at the face of the plasma membrane. Two ATP-dependent proteases, which are evolutionarily related to bacterial FtsH, are found anchored to the MIM (Leonhard et al., 1996). Named i-AAA and m-AAA, these mitochondrial AAA proteases are positioned to interact with substrates in the IMS, matrix, or MIM (Leonhard et al., 1996, 2000; Koppen and Langer, 2007; Tatsuta and Langer, 2009; Gerdes et al., 2012; **Figure 1**).

#### ORGANIZATION OF THE MITOCHONDRIAL AAA PROTEASES

Both i-AAA and m-AAA proteases encode multiple domains on a single polypeptide: small distal domains located across the MIM from the main body of the protease; an insoluble transmembrane (TM) domain; and a catalytic core comprising a AAA+ ATPase domain and a zinc metalloproteinase domain (Leonhard et al., 1996). The major architectural difference between them lies in the organization of the TM domains. The i-AAA contains a single transmembrane helix that, when inserted into the MIM, projects the ATPase and protease domains into the IMS. In contrast, the m-AAA protease contains two transmembrane spans that project the catalytic domains into the matrix. These opposing orientations allow both faces of the MIM and both aqueous compartments of the mitochondrion to be scrutinized for the appearance of substrates (Leonhard et al., 2000). In all eukaryotes, six identical i-AAA subunits assemble into an active proteolytic complex (YME1L in mammals; Yme1 in yeast). In contrast, multiple isoforms of m-AAA exist with distinct subunit compositions. In yeast, m-AAA is an obligate heterohexamer of alternating Yta10 and Yta12 subunits (Yta10/12; Arlt et al., 1996). In mammals, the protease can either form AFG3L2 homohexamers or heterohexamers of alternating AFG3L2 and Paraplegin subunits. The distribution of these two isoforms is tissue specific, with a greater proportion of heterohexamers present in mitochondria of neuronal cells (Koppen et al., 2007).

The broad structural resemblance to the ancestral FtsH-like protease was confirmed by a moderate resolution cryo-EM structure of Yta10/12 revealing an arrangement of stacked hexameric AAA+ and protease rings surrounding an axial pore (Suno et al., 2006; Bieniossek et al., 2009; Cha et al., 2010; Lee et al., 2011; Su et al., 2016; **Figure 2A**). As with other family members, the six ATP binding sites are predominantly formed within individual AAA+ domains with important additional interactions provided by neighboring subunits (Hanson and Whiteheart, 2005; Karlberg et al., 2009). The interfaces between AAA+ domains provide a surface for communication and coordination between protomers. An elegant in vivo study using S. cerevisiae Yta10/12 demonstrated that ATP binding to Yta12 inhibits nucleotide hydrolysis in the neighboring Yta10 subunit (Augustin et al., 2009). Suppressor mutations and homology modeling revealed that the presence of a nucleotide γ-phosphate bound to Yta12 is sensed by a patch of conserved inter-subunit signaling residues on Yta10 and transmitted via the pore-2 loop to the Walker-B motif of Yta10. This allosteric coordination is proposed to create an alternating power stroke that maximizes the unfolding force while maintaining grip of the translocating substrate. The observation of similar coordination in Yta12 variants capable of forming homooligomers suggested that this phenomenon could exist in related homohexameric proteases.
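The inter-subunit rule described above can be pictured with a toy ring model. The sketch below is illustrative only: the function name, the choice of which adjacent Yta12 gates a given Yta10 (here the subunit at index i + 1), and the absence of any restriction on Yta12 are assumptions made for the example, not findings of the cited study.

```python
# Toy model of an alternating Yta10/Yta12 heterohexamer.
RING = ["Yta10", "Yta12"] * 3


def can_hydrolyze(ring, atp_bound, i):
    """Return True if subunit i is permitted to hydrolyze ATP in this toy model.

    Assumption: a Yta10 subunit is blocked while the Yta12 at index i + 1
    (wrapping around the ring) has ATP bound. Which neighbor does the gating,
    and the lack of any rule for Yta12, are hypothetical simplifications.
    """
    if ring[i] != "Yta10":
        return True
    neighbor = (i + 1) % len(ring)
    return not atp_bound[neighbor]


# Hypothetical nucleotide occupancy around the ring (True = ATP bound).
atp_bound = [True, True, True, False, True, True]
allowed = [can_hydrolyze(RING, atp_bound, i) for i in range(len(RING))]
print(list(zip(RING, allowed)))
```

In this occupancy pattern only the Yta10 subunit whose gating Yta12 neighbor is nucleotide-free is released for hydrolysis, which is one simple way to picture how staggered firing around the ring could arise.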

The lower ring sequesters the proteolytic active sites inside a compartment that can be accessed upon translocation through the axial pore. The active sites are formed by a canonical HEXXH motif that coordinates the water-activating zinc ion (Rawlings and Barrett, 1995; Leonhard et al., 1996). While peptide cleavage by many proteases is strongly influenced by the pattern of residues surrounding the scissile peptide bond, it remains to be seen if such cleavage site preferences exist for the mitochondrial AAA proteases. The protease domain of human AFG3L2 has been identified as a hotspot for mutations linked to the development of human neurodegenerative diseases (Cagnoli et al., 2010; Di Bella et al., 2010; Pierson et al., 2011). For example, at least 17 single amino acid substitutions in AFG3L2 have been linked to the development of spinocerebellar ataxia type 28 (SCA28), a disorder characterized by imbalance, slurred speech and lack of limb coordination (Mariotti et al., 2008; Di Bella et al., 2010; Lobbe et al., 2014; Qu et al., 2015; Zuhlke et al., 2015; Svenstrup et al., 2017). Homology modeling using crystal structures of FtsH reveals these mutations largely cluster to positions surrounding the metalloprotease active site and subunit interfaces (**Figure 2B**) and thus are likely to cause defects in polypeptide cleavage and hexamer assembly rather than substrate binding or ATP hydrolysis.

Assembly of many AAA+ oligomers is driven by interactions between ATPase domains. However, truncations of both human and yeast i-AAA lacking the TM and N-terminal domain (ND) fail to form active hexamers, highlighting the importance of interactions within these domains to oligomerization (Leonhard et al., 1999; Shi et al., 2016). Furthermore, replacement of these domains with a synthetic hexamerization sequence was sufficient to drive assembly of active i-AAA proteases in vitro (Shi et al., 2016; Rampello and Glynn, 2017). FtsH also requires the TM domain to promote oligomerization (Akiyama and Ito, 2000). In contrast, assembly of m-AAA hexamers appears to involve additional interactions in the metalloprotease domain (Lee et al., 2011). Truncations of Yta10/12 lacking the distal IMS domain (IMSD) and TM could complement respiratory defects in Δyta10/Δyta12 cells but displayed impaired degradation of integral membrane substrates, indicating the presence of unanchored but assembled hexamers in the matrix (Korbel et al., 2004). The interactions that specify the formation of defined hetero-oligomeric arrangements of different m-AAA proteases also appear to be located in the metalloprotease domain as substitution of only two residues was sufficient to drive assembly of homo-oligomeric Yta12 proteases (Lee et al., 2011).

The distal domains of both proteases contain ∼70–80 folded residues but are positioned differently in their respective primary structures. The i-AAA ND immediately follows the mitochondrial targeting sequence and arranges in the matrix, whereas the m-AAA protease IMSD is encoded between the two transmembrane spans. Despite low sequence homology, a solution structure of the human AFG3L2 IMSD displays a strikingly similar α+β fold to the periplasmic domain (PD) of FtsH (Ramelot et al., 2013; **Figure 2A**). Highly conserved residues between these regions map to the interfaces of the FtsH PDs, implying the AFG3L2 IMSDs form a similar hexameric structure in the assembled protease (Scharfenberg et al., 2015). However, in detergent-solubilized full-length Yta10/12, the IMSDs do not interact directly but instead fan out from the TM domain (Lee et al., 2011; **Figure 2A**). NDs of i-AAA display no homology with domains of other metalloproteases and cluster into two distinct and evolutionarily unrelated families (Frickey and Lupas, 2004; Scharfenberg et al., 2015). Plant and fungal NDs belong to the tetratricopeptide repeat (TPR) fold, whereas NDs from animal sources have no known homologs and no structures have been determined (D'Andrea and Regan, 2003; Scharfenberg et al., 2015; **Figure 2A**).

In both cases, the functions of these distal domains remain unclear. The apparent diversity in the sequence and structure of these domains may imply they simply act as anchors to stabilize the protease in the membrane during substrate extraction. Indeed, active reconstituted i-AAA proteases lacking the ND and TM domain demonstrated that these domains are dispensable for ATP-dependent proteolysis (Shi et al., 2016). One possible function for these domains is the recognition of substrates on the opposite face of the membrane. Substrates presenting domains on both sides of the MIM appear to be fully degraded, implying translocation of polypeptides across the membrane leaflet but not necessarily transmembrane substrate recognition (Leonhard et al., 2000). The distal domains may also act as interaction surfaces for large protein assemblies that modulate protease function. The analogous FtsH PDs interact with the HflKC complex to promote degradation of uncomplexed subunits of the SecY protein translocase (Kihara et al., 1996, 1997; Akiyama et al., 1998). In all eukaryotes, two related prohibitin subunits, PHB1 and PHB2, form MIM-anchored heterodimeric ring structures with diameters of 20–25 nm (Tatsuta et al., 2005; Merkwirth and Langer, 2009). Both prohibitin subunits bear large C-terminal domains that project into the IMS where they are capable of interacting with the m-AAA IMSDs. Although the precise interactions between the prohibitin ring and the protease are unclear, deletion of either subunit in yeast accelerates the degradation of non-assembled Cox3 by Yta10/12 (Steglich et al., 1999). In mammals, deletion of PHB2 increases the proteolytic processing of the mitochondrial fission regulator, OPA1 (Merkwirth et al., 2008). Thus, in both cases, the prohibitin ring appears to restrict the activity of the m-AAA protease. A recent study identified a multi-subunit proteolytic hub formed between mammalian YME1L and the MIM rhomboid protease PARL, mediated by the membrane scaffold protein SLP2 (Wai et al., 2016). Presence of this supramolecular SPY complex increased cleavage of the PINK1 kinase by PARL and processing of OPA1 by the nearby OMA1 protease. The location of SLP2 in the matrix invites suggestions of an analogous arrangement to the prohibitin ring, positioned on the opposite face of the MIM and interacting with the NDs of YME1L.

#### MODES OF SUBSTRATE PROCESSING

A commonly highlighted feature of the mitochondrial AAA proteases is the contrasting fates of different substrates. Proteins may be completely degraded to small peptide fragments, undergo partial processing to a fixed point in the structure, or be dislocated from the membrane without proteolysis. These outcomes are dependent on the identity of the substrate and allow just two proteases to control a wide variety of mitochondrial operations.

#### Complete Substrate Degradation

It has long been established that both mitochondrial AAA proteases can provide house-keeping functions by fully degrading damaged, misassembled, or unnecessary proteins in their respective compartments. Most of these substrates undergo processive proteolysis to generate small peptides that can be exported from the organelle or further processed by oligopeptidases (Alikhani et al., 2011; Quiros et al., 2015). This class of substrates includes misassembled components of the respiratory chain and F<sub>1</sub>F<sub>O</sub>-ATP synthase complexes that must be precisely balanced to coordinate expression of both mitochondrial and nuclear encoded subunits (Nakai et al., 1995; Weber et al., 1995; Arlt et al., 1996; Kaser et al., 2003). Rapid turnover of these proteins is essential to prevent the buildup of potentially aggregating proteins within the organelle. Accordingly, genetic loss of either protease results in severe phenotypes, including respiratory defects, loss of mitochondrial structure, and increased sensitivity to oxidative stress (Campbell et al., 1994; Tzagoloff et al., 1994; Stiburek et al., 2012). Recently, several more examples of this activity have been identified in a human embryonic cell line, including Ndufb6, ND1, and Cox4, important components of the oxidative phosphorylation machinery (Stiburek et al., 2012).

An increasingly clear role for these proteases is in the protection against mitochondrial stress arising from the accumulation of misfolded proteins (Rainbolt et al., 2014; Bohovych et al., 2015). Both i-AAA and m-AAA in mammals, and i-AAA from Arabidopsis are reported to degrade carbonylated proteins resulting from damage by ROS (Maltecca et al., 2009; Kicia et al., 2010; Stiburek et al., 2012; Smakowska et al., 2014). Additionally, stress-sensitive degradation of YME1L is used to reorganize the proteolytic capacity of the IMS (Rainbolt et al., 2015, 2016). Mitochondrial stress has significant consequences for the import of nuclear-encoded polypeptides from the cytosol. Mammalian YME1L actively attenuates protein import into the matrix in response to stress by degrading Tim17A, a subunit of the TIM23 MIM translocase complex (Rainbolt et al., 2013). In yeast, Yme1 provides surveillance for at least two soluble import components, Tim9 and Tim10. These homologous IMS proteins form a heterohexameric chaperone complex that shuttles imported hydrophobic proteins across the aqueous compartment (Koehler et al., 1998; Bolender et al., 2008). Both subunits contain two internal disulfide bonds encoded by Cx3C motifs, which form in the oxidative IMS environment. Improper formation of these disulfide bonds due to oxidative stress induces degradation of both subunits by Yme1, likely to prevent the accumulation of covalently-linked aggregates (Baker et al., 2012; Spiller et al., 2015). In vitro degradation of purified Tim9 and Tim10 by a solubilized Yme1 protease (hexYme1) confirmed an increased degradation rate upon disulfide bond disruption but also indicated that Tim10 is highly preferred as a substrate to Tim9 (Rampello and Glynn, 2017).

In addition to clearing destabilized proteins to prevent the formation of toxic aggregates, the mitochondrial AAA proteases can also target and remove specific proteins as a means of controlling important metabolic pathways. Ups1 and Ups2 are yeast IMS lipid carrier proteins related to the MSF1'/PRELI family conserved across eukaryotes (Dee and Moffat, 2005; Potting et al., 2010). Both proteins form a complex with the small Cx9C protein, Mdm35, to catalyze the transfer of lipid precursors from the MOM to the MIM to promote synthesis of cardiolipin (CL) and phosphatidylethanolamine (PE) (Sesaki et al., 2006; Osman et al., 2009; Tamura et al., 2009; Potting et al., 2010; Connerth et al., 2012). Lack of CL accumulation in the MIM impairs the function of numerous complexes involved in respiration, mitochondrial fusion, protein translocation, and apoptosis (Choi et al., 2007; DeVay et al., 2009; Gebert et al., 2009; Wenz et al., 2009). When complexed to Mdm35, both Ups1 and Ups2 are resistant to proteolysis but are rapidly degraded by Yme1 in the absence of the binding partner (Potting et al., 2010). Crystal structures of Ups1-Mdm35 and the homologous mammalian complex, PRELID-TRIAP1, revealed the tertiary structure of Ups1/PRELID is stabilized by complex formation (Miliara et al., 2015; Yu et al., 2015). The degradation of uncomplexed Ups1 and Ups2 allows mitochondria to control the flux of phospholipid precursors across the compartment while the presence of conserved disulfide bonds in Mdm35 suggests that degradation may occur in response to oxidative stress.

#### Limited Proteolysis and Chaperone Activities

In recent years, an increasing number of substrates that encounter an alternative proteolytic fate have been identified. Rather than undergoing complete degradation into small peptides, these substrates are partially processed to yield intact fragments that perform further functions. An example of this mode of action that is conserved across yeast and mammals is the maturation of MrpL32, a nuclear-encoded subunit of the mitochondrial ribosome. MrpL32 is imported into the matrix bearing an extensive unstructured N-terminal region that must be removed by m-AAA prior to ribosome assembly (Nolden et al., 2005; Bonn et al., 2011; Woellhaf et al., 2014). More recently identified examples include Atg32, the MOM-anchored regulator of mitophagy in yeast (Kanki et al., 2009; Okamoto et al., 2009). The C-terminal domain of Atg32 projects into the IMS where it is removed by Yme1 to yield a fragment that remains fixed in the membrane. Blocking the proteolytic processing of the Atg32 by Yme1 results in defects in mitophagy (Wang et al., 2013). An example of partial processing observed in mammals is the cleavage of OPA1, a dynamin-related GTPase that regulates mitochondrial dynamics in mammalian cells (Delettre et al., 2000; Praefcke and McMahon, 2004; Lee and Yoon, 2016). Initiation of mitochondrial fission occurs after successive cleavage of OPA1 by the OMA1 and YME1L proteases to generate distinct short isoforms. The balance of mitochondrial fusion and fission is controlled by the relative abundance of the unprocessed long form (L-OPA1) and processed short forms (S-OPA1) (Anand et al., 2014). An analogous regulator found in yeast, Mgm1, does not appear to be cleaved by Yme1 but rather by the MIM rhomboid protease Pcp1 (Herlan et al., 2003; McQuibban et al., 2003).

What is the mechanism that prevents these partially processed substrates from being degraded completely? Maturation of MrpL32 in yeast requires the removal of 71 N-terminal residues and is dependent on the integrity of a cysteine-rich zinc-binding motif located in a tightly folded C-terminal domain (Bonn et al., 2011). In this case, m-AAA appears to processively degrade MrpL32 from the N-terminus until it encounters the highly stable zinc-binding motif, resulting in stalling of the protease and release of the mature ribosomal subunit. Insertion of spacer sequences prior to the folded domain repositioned the N-terminus of the mature protein, implying that cleavage occurs at a site determined by structural rather than sequence constraints. In crystal structures of the assembled mitochondrial ribosome, the distance between the MrpL32 N-terminus and the C-terminal domain is ∼35 residues (∼50 Å), likely reflecting the distance between the contact site on the outer surface of the protease and the internal proteolytic active sites (Greber et al., 2015). It is an attractive possibility that partial processing of other substrates occurs through a similar mechanism to MrpL32. Atg32 does not contain metal coordination sites but extraction of its transmembrane domain from the MOM could act as a similar barrier to complete degradation, resulting in removal of only the exposed IMS domain. Whereas extraction from the MIM by Yme1 has been demonstrated conclusively, this model would require the protease to dislocate polypeptides from the MOM with lower efficiency. The presence of the Yme1 transmembrane domains or accessory proteins, such as the prohibitins, in the MIM but not the MOM may provide an explanation for this difference.

The final mode of action displayed by these proteases involves the remodeling of substrates in the absence of proteolysis. Cytochrome c peroxidase (Ccp1) is dislocated from the MIM by m-AAA followed by degradation by a secondary protease, Pcp1 (Tatsuta et al., 2007). In yeast, Yme1 was shown to aid the import of a mammalian polynucleotide phosphorylase into the IMS (Rainey et al., 2006). Evidence also exists that i-AAA is capable of chaperone-like activity to prevent formation of aggregates by protein refolding rather than degradation (Leonhard et al., 1999; Schreiner et al., 2012).

#### SUBSTRATE RECOGNITION

The studies described above clearly demonstrate that the mitochondrial AAA proteases can act as both general housekeeping enzymes and targeted proteases, processing and degrading specific substrates. Resolving this apparent dichotomy requires understanding the precise mechanisms used to identify and engage substrates. The in vivo degradation by yeast Yme1 of a thermolabile variant of mouse dihydrofolate reductase (mDHFR) fused to the terminus of the integral MIM protein Yme2p generated a number of potential models for substrate recognition (Leonhard et al., 2000). Here, increasing temperature destabilized the solvent accessible mDHFR domain and initiated degradation of the entire fusion protein. The protease could be failing to unfold the folded mDHFR domain at low temperature, sensing the appearance of unstructured polypeptides in proximity to the membrane face, or recognizing specific patterns of residues that only become accessible after domain unfolding at high temperature. Domain swap experiments between i-AAA proteases from Saccharomyces cerevisiae and Neurospora crassa revealed that specificity for certain substrates could be transplanted, suggesting a mechanism other than sensing folding state (Graef et al., 2007). Moreover, a solubilized human YME1L protease (hexYME1L) was used to demonstrate that simple protein unfolding is not sufficient to initiate degradation and that the protease is capable of unfolding circularly-permuted GFP variants with varying thermodynamic stabilities in vitro, indicating that the enzyme possesses moderate unfolding power (Shi et al., 2016).

Maximal degradation by hexYME1L required substrates to display unstructured terminal tags of 10–20 residues, consistent with in vivo experiments defining a minimal length of 20 residues needed to project from the membrane face to initiate degradation (Leonhard et al., 2000; Shi et al., 2016). Many AAA+ proteases select substrates by recognition of defined sequences, known as degrons (Baker and Sauer, 2006). A survey of model degron sequences identified a phenylalanine-rich motif that was preferentially recognized by hexYME1L (Shi et al., 2016). Furthermore, a solubilized yeast hexYme1 protease was used to identify a phenylalanine-rich degradation signal present at the N-terminus of mitochondrial Tim10 (Rampello and Glynn, 2017). This sequence was necessary and sufficient to promote degradation by hexYme1 and the presence of similar N-terminal motifs in additional small Tim family members predicted their degradation by the protease. Together, these studies demonstrated unambiguously that i-AAA can recognize specific sequences located at accessible termini and opened the possibility that conserved recognition motifs may be found across diverse mitochondrial substrates. Intriguingly, a similar motif was found to target substrates to the bacterial Lon protease (Gur and Sauer, 2008). As with the mitochondrial AAA proteases, Lon has a hybrid function of general surveillance and specific protein degradation. A preference for hydrophobic residues such as phenylalanine, which become exposed after domain unfolding, would allow these proteases to select damaged proteins from among the crowded mitochondrial proteome. The presence of these residues at accessible termini in certain constitutively degraded proteins would then allow both the mitochondrial AAA proteases and Lon to bridge the gap between quality control and targeted proteolysis (**Figure 3**).
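The idea of a phenylalanine-rich signal at an accessible terminus can be expressed as a toy sequence screen. The sketch below is illustrative only: the function name, the 20-residue window, and the threshold of three phenylalanines are assumptions chosen for the example rather than parameters reported in the cited studies, and the sequences are invented.

```python
def has_phe_rich_terminus(seq, window=20, min_phe=3):
    """Return True if either terminal window holds at least min_phe Phe (F) residues.

    window and min_phe are hypothetical illustration values, not measured
    recognition rules for i-AAA proteases.
    """
    seq = seq.upper()
    n_term = seq[:window]    # first `window` residues (or fewer for short sequences)
    c_term = seq[-window:]   # last `window` residues
    return n_term.count("F") >= min_phe or c_term.count("F") >= min_phe


# Invented sequences, not real proteins.
example_tagged = "MSFLGFNFGG" + "A" * 40
example_plain = "MSALGANAGG" + "A" * 40

print(has_phe_rich_terminus(example_tagged))  # True
print(has_phe_rich_terminus(example_plain))   # False
```

A screen of this kind could at most nominate candidates; whether a terminus is accessible and actually recognized by the protease would still have to be tested experimentally.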

Substrates of AAA+ proteases are classically recognized by N-terminal domains found at the apical face of the AAA+ ATPase module or by elements within the central translocating pore (Baker and Sauer, 2006). Substrate binding sites on yeast Yme1 have been mapped to conserved helical regions located at distinct positions on the AAA+ (NH) and protease rings (CH) (Graef et al., 2007). Involvement of each binding site is substrate dependent with a more stringent requirement for the CH sites in the degradation of peripheral membrane proteins. The preference for phenylalanine-rich sequences identified in vitro could imply the presence of similarly hydrophobic substrate binding sites on the enzyme. However, the NH sites of Yme1 contain multiple negatively charged residues, inconsistent with interaction with aromatic side chains. Again, this is reminiscent of bacterial Lon that uses distinct binding sites to recognize highly divergent degron sequences (Gur and Sauer, 2009). Further experiments are required to elucidate the precise mechanisms used by the mitochondrial proteases to capture specific degron sequences.

The identification of multiple substrate binding sites on Yme1 may also provide an explanation for how the mitochondrial AAA proteases overcome a geometric handicap when degrading soluble and peripheral membrane proteins. The entrance to the central pore of each protease directly faces the bilayer, limiting the opportunity for interaction between the pore and extramembrane substrates. Whereas integral membrane proteins can be easily engaged by NH sites and fed directly into the proteolytic chamber, substrates located far from the membrane face may be held in place by CH sites to increase their effective concentration close to the translocating pore. To further facilitate substrate engagement, both proteases contain unstructured linkers of typically 20–25 residues that traverse from the TM domain to the exterior of the AAA+ ring, creating a maximal space between the membrane face and the central pore of ∼30–45 Å.

Many AAA+ proteases use adaptor proteins to enhance both substrate selectivity and degradation (Levchenko et al., 2000; Dougan et al., 2002a,b). In addition to the prohibitin rings and SPY complex discussed previously, Mgr1 and Mgr3 have been identified as possible adaptors for Yme1 in yeast (Dunn et al., 2006, 2008). These MIM-anchored proteins form a subcomplex that interacts with Yme1 and is required for efficient binding of unfolded polypeptides that project from the MIM (Dunn et al., 2008). Few substrates that require the action of Mgr1/Mgr3 have been directly detected but Yme1-dependent degradation of Cox2 is severely attenuated by deletion of the putative adaptors (Elliott et al., 2012).

#### MECHANISMS OF EXTRACTION FROM THE MEMBRANE

The degradation of integral membrane proteins requires the extraction of transmembrane domains from a favorable phospholipid environment into an unfavorable aqueous compartment. The mechanisms used by the mitochondrial AAA proteases to overcome this barrier remain elusive. Two possible approaches that can be envisioned are: (1) forced dislocation of the transmembrane regions powered by ATP hydrolysis and (2) destabilization of the interactions between substrate transmembrane domains and the bilayer. Many AAA+ proteins translocate proteins across membranes and it is reasonable to assume that similarities exist in their mechanisms of extraction. For example, the degradation of multiple integral membrane proteins by FtsH has been demonstrated in bacteria (Bittner et al., 2015; Hari and Sauer, 2016). In eukaryotes, Msp1 is a membrane-anchored AAA+ protein that lacks proteolytic activity and extracts improperly localized tail-anchored proteins from the cytosolic face of the MOM (Chen et al., 2014; Okreglak and Walter, 2014). Endoplasmic-reticulum associated degradation (ERAD) requires the translocation of ubiquitinated polypeptides across the ER membrane by a group of proteins involving the p97/Cdc48 motor protein (Wolf and Stolz, 2012; Ruggiano et al., 2014). Recently, the extraction of mitochondrial proteins from the MOM has been demonstrated by cytosolic p97/Cdc48 (Heo et al., 2010; Tanaka et al., 2010). The mechanism of translocation in ERAD is debated but may involve passage through a hydrophobic protein channel (Stein et al., 2014). Similarly, the possibility remains that the transmembrane domains of the mitochondrial proteases form a hydrophobic channel through which polypeptides can pass en route to the central pore.

The force required to mechanically extract transmembrane helices from lipid bilayers of varying composition has been measured between 90 and 200 pN (Oesterhelt et al., 2000; Ganchev et al., 2004). It has been noted that the hydrophobicity of integral MIM proteins is generally lower than those in the bacterial inner membrane or eukaryotic plasma membrane, suggesting a lower force is required for extraction (von Heijne, 1986). A study examining the retention of simple transmembrane sequences in the MIM demonstrated that sequences required >3:1 leucine:alanine residues to escape dislocation from the membrane by m-AAA. Under this scheme, the protease could extract most MIM proteins (Botelho et al., 2013). The central pores of both proteases contain loops bearing the canonical aromatic-hydrophobic (Ar-φ) motif that are proposed to deliver the translocating force (Graef and Langer, 2006; Martin et al., 2008). Mutation of the Ar-φ motif impairs the translocation and degradation but not binding of membrane proteins by Yme1, indicating defects in the power stroke (Graef and Langer, 2006). A rigorous in vitro analysis demonstrated that E. coli FtsH lacks significant unfolding power and suggested the protease targets already destabilized proteins as a means of selecting damaged substrates (Herman et al., 2003). While hexYME1L is capable of unfolding stable proteins, a comparison with other AAA+ proteases placed the unfolding power between FtsH and robust unfoldases such as ClpXP and Lon (Shi et al., 2016). Possession of an intermediate power stroke may provide the mitochondrial AAA proteases with a pulling force too weak to unfold the C-terminal domain of MrpL32 or fully remove Atg32 from the MOM but sufficient to extract substrates from the MIM.
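The leucine:alanine retention threshold mentioned above lends itself to a small worked example. The sketch below is a toy classifier for idealized Leu/Ala segments only: the function names and test sequences are invented for illustration, and the simple ratio cutoff is a simplification of the cited experiments that ignores all other residues and sequence context.

```python
def leu_ala_ratio(tm_segment):
    """Return the Leu:Ala ratio of a segment (inf if it has Leu but no Ala)."""
    seg = tm_segment.upper()
    leu = seg.count("L")
    ala = seg.count("A")
    if ala == 0:
        return float("inf") if leu > 0 else 0.0
    return leu / ala


def predicted_retained(tm_segment, threshold=3.0):
    """Toy rule: a segment is predicted to resist dislocation if Leu:Ala > threshold.

    The default threshold mirrors the >3:1 ratio quoted in the text; applying
    it as a hard cutoff is an assumption made for illustration.
    """
    return leu_ala_ratio(tm_segment) > threshold


# Invented model segments, each 19 residues long.
leu_rich = "LLLLALLLLALLLLALLLL"   # 16 Leu, 3 Ala
mixed = "LALALALALALALALALAL"      # 10 Leu, 9 Ala

print(round(leu_ala_ratio(leu_rich), 2), predicted_retained(leu_rich))  # 5.33 True
print(round(leu_ala_ratio(mixed), 2), predicted_retained(mixed))        # 1.11 False
```

Under this simplified rule, only strongly hydrophobic model segments would stay in the membrane, which is consistent with the statement that the protease could extract most MIM proteins.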

#### CONCLUDING REMARKS

Significant progress has been made in recent years in expanding the repertoire of functions performed by the mitochondrial AAA proteases and understanding how these enzymes select and process substrates. However, the answers to many important questions remain elusive. What are the precise interactions used by these enzymes to recognize and engage protein substrates and do they differ for substrates that undergo different fates? What mechanisms exist in mitochondria to modify protease activity to provide further regulation to the mitochondrial proteome, either in the form of environmental changes, allosteric modulators, or cofactors such as adaptor proteins? The recent emergence of degron sequences that target substrates for degradation greatly expands the constellation of potential experiments that can be used to elucidate substrate recognition both in vivo and in vitro. Furthermore, the involvement of both proteases in supramolecular complexes mediated by scaffolding proteins presents a clear avenue to understand how protease activity may be altered by association with other mitochondrial proteins. To date, a lack of structural information has hampered our understanding of the precise mechanisms of the mitochondrial AAA proteases but recent advances in cryo-electron microscopy offer the opportunity to visualize these ATP-fueled proteolytic machines at high resolution and gain insight into the molecular details of the degradation process used to preserve the essential functions of mitochondria.

#### AUTHOR CONTRIBUTIONS

SG conceived of the topic and wrote the manuscript.

#### FUNDING

Work in the author's lab is funded by NIH grant R01 GM115898.

#### ACKNOWLEDGMENTS

I thank Bojian Ding and Anthony Rampello for helpful discussions.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Glynn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Structural Elements Regulating AAA+ Protein Quality Control Machines

#### Chiung-Wen Chang<sup>1</sup>, Sukyeong Lee<sup>1</sup> and Francis T. F. Tsai<sup>1,2</sup>\*

<sup>1</sup> Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA, <sup>2</sup> Departments of Molecular and Cellular Biology, and Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA

Members of the ATPases Associated with various cellular Activities (AAA+) superfamily participate in essential and diverse cellular pathways in all kingdoms of life by harnessing the energy of ATP binding and hydrolysis to drive their biological functions. Although most AAA+ proteins share a ring-shaped architecture, AAA+ proteins have evolved distinct structural elements that are fine-tuned to their specific functions. A central question in the field is how ATP binding and hydrolysis are coupled to substrate translocation through the central channel of ring-forming AAA+ proteins. In this mini-review, we will discuss structural elements present in AAA+ proteins involved in protein quality control, drawing similarities to their known role in substrate interaction by AAA+ proteins involved in DNA translocation. Elements to be discussed include the pore loop-1, the Inter-Subunit Signaling (ISS) motif, and the Pre-Sensor I insert (PS-I) motif. Lastly, we will summarize our current understanding on the inter-relationship of those structural elements and propose a model how ATP binding and hydrolysis might be coupled to polypeptide translocation in protein quality control machines.

#### Edited by:

James Shorter, University of Pennsylvania, USA

#### Reviewed by:

Tim Clausen, Research Institute of Molecular Pathology, Austria; Daniel Southworth, University of Michigan, USA; Petra Wendler, Ludwig-Maximilians-Universität München, Germany

> \*Correspondence: Francis T. F. Tsai ftsai@bcm.edu

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

Received: 24 December 2016 Accepted: 13 April 2017 Published: 04 May 2017

#### Citation:

Chang C-W, Lee S and Tsai FTF (2017) Structural Elements Regulating AAA+ Protein Quality Control Machines. Front. Mol. Biosci. 4:27. doi: 10.3389/fmolb.2017.00027

Keywords: AAA+ proteins, protein quality control, Pre-Sensor I insert, inter-subunit signaling motif, pore loop

# THE AAA+ PROTEIN SUPERFAMILY

AAA+ proteins harness metabolic energy in the form of ATP to facilitate diverse cellular processes, including organelle biogenesis, membrane fusion, transcriptional regulation, and protein quality control (PQC). Members of the AAA+ superfamily can be classified into one of four distinct clades or superclades: (1) the clamp loader clade, (2) the initiator clade, (3) the classic clade, and (4) the Pre-Sensor I insert (PS-I) superclade (Iyer et al., 2004; Erzberger and Berger, 2006). The PS-I superclade is further sub-divided into the superfamily 3 (SF3) helicase clade, the HCLR clade (HslU, ClpAB-D2, Lon, and RuvB family), the helix 2 (H2)-insert clade, and the Pre-Sensor II insert (PS-II) clade (Iyer et al., 2004; Erzberger and Berger, 2006). A hallmark of all AAA+ proteins is the AAA+ ATP-binding domain, which is composed of ∼220 amino acids and typically forms a hexameric ring structure in solution. The AAA+ domain features several conserved elements required for ATP binding and hydrolysis, including the Walker A and B motifs, the arginine (Arg)-finger motif, and the sensor-1 and -2 motifs (**Figure 1A**). In addition, each AAA+ clade features a specific insertion of a secondary structure element within the core AAA+ fold. For instance, the defining feature of the PS-I superclade is a β-hairpin insertion before the sensor-1 motif (**Figures 1A,B**). Despite the wealth of structural information, the functional importance of clade-specific insertions remains largely unclear.
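The clade hierarchy described above can be written down as a small lookup structure. The following is only an illustrative sketch: the clade names come from the text, whereas the nested-dictionary layout and the helper name `parent_of` are assumptions introduced here.

```python
from typing import Optional

# Clade names are taken from the text; the dictionary layout is an
# illustrative choice, not an established data format.
AAA_PLUS_CLADES = {
    "clamp loader clade": [],
    "initiator clade": [],
    "classic clade": [],
    "PS-I superclade": [
        "SF3 helicase clade",
        "HCLR clade",
        "H2-insert clade",
        "PS-II insert clade",
    ],
}


def parent_of(subclade: str) -> Optional[str]:
    """Return the superclade that contains `subclade`, or None if it has no parent."""
    for parent, children in AAA_PLUS_CLADES.items():
        if subclade in children:
            return parent
    return None


print(parent_of("HCLR clade"))
```

Top-level clades have no parent in this sketch, so `parent_of("classic clade")` returns `None`.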

AAA+ proteins involved in PQC include members of the Clp/Hsp100 family (Bukau et al., 2006; Olivares et al., 2016), Lon (Venkatesh et al., 2012), and FtsH-like proteases in prokaryotes and organelles (Gerdes et al., 2012; Okuno and Ogura, 2013). Clp/Hsp100 members function as protein unfoldases to facilitate either the disaggregation of previously aggregated proteins (Doyle et al., 2013; Jeng et al., 2015; Mogk et al., 2015; Sweeny and Shorter, 2016) or the degradation of ssrA-tagged proteins (Olivares et al., 2016). Members of the Clp/Hsp100 family are found in diverse microorganisms and belong to one of two classes that are distinguished by the number of AAA+ domains present in one polypeptide. Class I proteins, which include ClpA, ClpB/Hsp104 and ClpC, possess two AAA+ domains, termed the D1 and D2 domains, whereas class II proteins such as ClpX and HslU contain only a single AAA+ domain that is homologous to the second AAA+ (D2) domain of class I members (Schirmer et al., 1996). AAA+ domains assemble into a homo-hexamer composed of a D1 ring (class I) and a D2 ring (class I and II), which represents the physiologically active form of Clp/Hsp100 proteins. In order to facilitate protein degradation, Clp/Hsp100 proteins must associate with an oligomeric peptidase such as ClpP (Olivares et al., 2016), and assemble into a proteolytic machine with an architecture similar to that of the 26S proteasome in Eukarya (Lee and Tsai, 2005). In contrast, PQC machines such as Lon (Venkatesh et al., 2012) and FtsH-like proteases (Gerdes et al., 2012; Okuno and Ogura, 2013) feature an integral peptidase domain that is covalently linked to the AAA+ domain.

#### THE PORE LOOP-1

A hallmark of the AAA+ domain is the presence of conserved loops that line the axial channel of the oligomeric ring assembly. These pore loops have been implicated in substrate interaction. One of these pore loops, known as pore loop-1, features a Tyr/Phe-Φ-Gly motif, where Φ is a hydrophobic residue (Wang et al., 2001). The conserved aromatic amino acid is sensitive to mutation, and its substitution was shown to impair the function of several AAA+ ATPases (Yamada-Inagawa et al., 2003; Lum et al., 2004; Weibezahn et al., 2004). For instance, substituting the pore loop-1 tyrosine with alanine impaired substrate binding and translocation by Clp/Hsp100 proteins (Lum et al., 2004; Weibezahn et al., 2004; Hinnerwisch et al., 2005; Wang et al., 2011; Iosefson et al., 2015). The single-particle cryo-EM structure of a ClpB hexamer in the ATP-activated state showed that the D1 pore loop-1 of all six subunits is arrested at the central pore, providing a platform for substrates to bind with high affinity (Lee et al., 2007). This model is consistent with the proposed role of the D1 pore loop-1 Tyr in substrate interaction (Schlieker et al., 2004). Subsequent crystal structures of a ClpB-D2 monomer showed that pore loop-1 is stabilized by nucleotide and is mobile (i.e., disordered) in the absence of nucleotide (Biter et al., 2012; Zeymer et al., 2014), linking nucleotide binding to the regulation of pore loop conformation. Although the structure of a pore loop-bound substrate complex remains elusive, collectively these findings support a mechanism by which ATP-dependent changes are linked to pore loop conformations that could facilitate substrate translocation through the hexameric ring assembly.
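To make the pore loop-1 motif concrete, the sketch below scans a one-letter amino acid sequence for an aromatic residue (Tyr or Phe) followed by a hydrophobic residue and a glycine. The set of residues counted as hydrophobic, the function name, and the toy sequence are all assumptions made for illustration; a sequence match alone does not show that a tripeptide lies in a pore loop.

```python
import re
from typing import List, Tuple

# Assumption: "hydrophobic" is taken here to mean A, V, I, L, M, F or W.
# The text only specifies "a hydrophobic residue", so this set is illustrative.
HYDROPHOBIC = "AVILMFW"

# Tyr or Phe, then one hydrophobic residue, then Gly. The lookahead lets
# overlapping candidates be reported as well.
PORE_LOOP1_PATTERN = re.compile(rf"(?=([YF][{HYDROPHOBIC}]G))")


def find_pore_loop1_candidates(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position, tripeptide) for every motif candidate."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PORE_LOOP1_PATTERN.finditer(seq)]


# Hypothetical toy sequence, not a real protein.
toy_sequence = "MKTAGYVGSLEEFIGQW"
print(find_pore_loop1_candidates(toy_sequence))  # [(6, 'YVG'), (13, 'FIG')]
```

In practice such candidates would need to be checked against a structure or a curated alignment.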

A more recent high-resolution cryo-EM structure of yeast Hsp104 bound to AMP-PNP revealed a left-handed spiral architecture exhibiting a "staircase" arrangement of pore loops along the central channel of the Hsp104 hexamer (Yokom et al., 2016). Notably, in the cryo-EM structure the D2 domain of the 1st subunit contacts the D1 domain of the 6th subunit to give rise to a closed "lock-washer" arrangement. Although the spiral architecture is surprising, it is similar to the left-handed helical assembly observed in crystal structures of bacterial ClpB (Lee et al., 2003; Carroni et al., 2014) and a fungal Hsp104 (Heuck et al., 2016). Examining the atomic structure of a substrate-translocating Clp/Hsp100 complex will be necessary to provide direct support for the functional role of pore loops in substrate threading through the hexamer assembly.

#### THE ISS MOTIF IN AAA+ MACHINES

The ISS motif consists of a network of functionally conserved residues crucial for transmitting the nucleotide status of one subunit to the adjacent subunit, thereby providing the molecular basis for how ATP binding and hydrolysis are coordinated between neighboring subunits in the ring assembly. The existence of an ISS motif was first reported for the m-AAA protease (Augustin et al., 2009), a member of the classic clade, and is defined as the α-helix immediately preceding the sensor-1 motif featuring a characteristic aspartic or glutamic acid at its C-terminus, which interacts with a nearby arginine of the same subunit. This arginine in turn interacts with the Arg-finger that senses the nucleotide status in the adjacent subunit (Augustin et al., 2009; Hanzelmann and Schindelin, 2016). The ISS motif is also found in other members of the classic clade, including FtsH (Bieniossek et al., 2006) and p97 (Hanzelmann and Schindelin, 2016). A sequence alignment indicates that an acidic amino acid is conserved amongst members of the HCLR clade, including the D2 domain of Clp/Hsp100 proteins (**Figure 1B**). However, unlike members of the classic clade, the crystal structure of the ClpB-D2 domain showed a direct interaction between Asp685 and the Arg-finger (Arg747) from the same subunit (Biter et al., 2012; Zeymer et al., 2014), providing a means to directly signal the nucleotide status between neighboring subunits (**Figure 2A**). Consistent with a role in inter-subunit signaling, a mutation of Asp685 to alanine significantly impaired ClpB's ATPase activity (Biter et al., 2012), confirming the existence of an ISS motif in the broader AAA+ superfamily.

# THE PS-I INSERT MOTIF

The PS-I motif is the defining feature of members of the PS-I insert superclade (Iyer et al., 2004; Erzberger and Berger, 2006) and consists of a β-hairpin that buttresses the pore loop-1 of the same subunit (**Figures 1A**, **2A**). Although the location of the PS-I motif is not conserved in the primary amino acid sequence of AAA+ proteins (**Figure 1B**), a pairwise structural comparison of different HCLR clade members shows that the location of the PS-I motif is invariant in the 3D structure. The function of the PS-I β-hairpin is perhaps best understood for AAA+ proteins involved in nucleic acid translocation, such as the simian virus 40 large tumor antigen (LTag) (Shen et al., 2005) and the papillomavirus replication initiation protein E1 (Enemark and Joshua-Tor, 2006). Structural studies of the SV40 LTag helicase bound to DNA showed that the β-hairpin is directly involved in binding to DNA (Chang et al., 2013; Gai et al., 2016). In the hexamer structure of SV40 LTag, the helicase forms a near-planar ring with the β-hairpin lining the inner surface of the central channel encircling the double-stranded DNA helix (Gai et al., 2016) (**Figure 2B**). Substrate contacts are mediated by a combination of hydrogen bonding, electrostatic and hydrophobic interactions between residues at the tip of the β-hairpin (Lys512 and His513) and the phosphate backbone, the sugar moieties and the edges of bases of the DNA (Chang et al., 2013; Gai et al., 2016). It has been suggested that ATP-driven domain motions are transmitted to the β-hairpin, resulting in DNA translocation along the central channel (Gai et al., 2004; Chang et al., 2013). The importance of the PS-I β-hairpin in substrate binding is also supported by the crystal structure of a hexameric E1 helicase bound to a single strand of DNA (Enemark and Joshua-Tor, 2006). Consistent with a potential role of the PS-I hairpin in substrate binding, deletion of the β-hairpin loop in ClpB (ClpBΔ691–695) impaired protein disaggregation to a level similar to that observed with a ClpB variant featuring a D2 pore loop tyrosine to alanine mutation (ClpBY643A) (Biter et al., 2012). Although the ATPase activity is also reduced, it is similar for both mutants (Biter et al., 2012).

More recently, the crystal structure of a fungal Hsp104 in the ADP-bound state was determined (Heuck et al., 2016), revealing a different β-hairpin conformation that contacts the D1 domain and is distinct from the β-hairpin conformation seen in crystal structures of bacterial ClpB (Lee et al., 2003; Biter et al., 2012; Carroni et al., 2014; Zeymer et al., 2014) and in the aforementioned helicases (Enemark and Joshua-Tor, 2006; Gai et al., 2016). Although deletion of the PS-I insert motif significantly impaired the Hsp104 protein disaggregating activity (Heuck et al., 2016), the interpretation of the observed defect is different. In the case of Hsp104, it was proposed that the PS-I insert motif is involved in signaling the nucleotide status between the two AAA+ rings and is responsible for allosteric regulation that controls Hsp104 function (Franzmann et al., 2011; Heuck et al., 2016). Although these interpretations are not mutually exclusive, determining the functional importance of the PS-I motif in ClpB/Hsp104 chaperones requires further structural and biochemical confirmation.

### COUPLING THE ATPASE CYCLE TO SUBSTRATE TRANSLOCATION IN PQC MACHINES

The available 3D structures of AAA+ machines involved in PQC have provided snapshots of distinct functional states and have contributed toward our molecular understanding of how the ATPase cycle is coupled to the conformational changes needed for substrate translocation. Structural evidence suggests that the pore loop-1 conformation optimized for substrate binding is determined by the nucleotide-bound status of the cis-subunit, which in turn is controlled by the nucleotide state of the trans-subunit (Biter et al., 2012) (**Figure 2A**). In this model, the Arg-finger of the cis-subunit senses the ATP-bound state in the neighboring subunit and transmits this signal in cis via a conserved acidic amino acid residue (either Asp or Glu) of the ISS motif, triggering ATP hydrolysis in the cis-subunit concomitant with substrate translocation. We propose that the PS-I motif communicates with pore loop-1 and controls substrate interaction by either contacting the substrate directly or regulating the ATPase cycle in the D2 ring through communication with the D1 ring. Although the available structural and biochemical evidence supports such a mechanism, determining the structure of a substrate-bound complex will be necessary to provide a more accurate mechanistic understanding of how the ATPase cycle is coupled to substrate translocation in PQC machines.

# AUTHOR CONTRIBUTIONS

CC, SL, and FT contributed to writing the draft and final version of this mini-review.

#### FUNDING

Research in the FT and SL laboratories is supported by grants from the National Institutes of Health (GM104980 and GM111084) and the Welch Foundation (Q-1530). CC is the recipient of an American Heart Association-Southwest Affiliate Postdoctoral Fellowship.

# ACKNOWLEDGMENTS

We sincerely apologize to our colleagues whose important work was not cited in this mini-review.

#### REFERENCES

Augustin, S., Gerdes, F., Lee, S., Tsai, F. T., Langer, T., and Tatsuta, T. (2009). An intersubunit signaling network coordinates ATP hydrolysis by m-AAA proteases. Mol. Cell 35, 574–585. doi: 10.1016/j.molcel.2009.07.018

Bieniossek, C., Schalch, T., Bumann, M., Meister, M., Meier, R., and Baumann, U. (2006). The molecular architecture of the metalloprotease FtsH. Proc. Natl. Acad. Sci. U.S.A. 103, 3066–3071. doi: 10.1073/pnas.0600031103

Biter, A. B., Lee, S., Sung, N., and Tsai, F. T. (2012). Structural basis for intersubunit signaling in a protein disaggregating machine. Proc. Natl. Acad. Sci. U.S.A. 109, 12515–12520. doi: 10.1073/pnas.1207040109


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Chang, Lee and Tsai. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Comparative Analysis of the Structure and Function of AAA+ Motors ClpA, ClpB, and Hsp104: Common Threads and Disparate Functions

#### Elizabeth C. Duran, Clarissa L. Weaver and Aaron L. Lucius \*

*Department of Chemistry, University of Alabama at Birmingham, Birmingham, AL, United States*

#### Edited by:

*Walid A. Houry, University of Toronto, Canada*

#### Reviewed by:

*Peter Chien, University of Massachusetts Amherst, United States; Jodi L. Camberg, University of Rhode Island, United States*

> \*Correspondence: *Aaron L. Lucius allucius@uab.edu*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

> Received: *28 April 2017* Accepted: *13 July 2017* Published: *03 August 2017*

#### Citation:

*Duran EC, Weaver CL and Lucius AL (2017) Comparative Analysis of the Structure and Function of AAA+ Motors ClpA, ClpB, and Hsp104: Common Threads and Disparate Functions. Front. Mol. Biosci. 4:54. doi: 10.3389/fmolb.2017.00054*

Cellular proteostasis involves not only the expression of proteins in response to environmental needs, but also the timely repair or removal of damaged or unneeded proteins. AAA+ motor proteins are critically involved in these pathways. Here, we review the structure and function of AAA+ proteins ClpA, ClpB, and Hsp104. ClpB and Hsp104 rescue damaged proteins from toxic aggregates and do not partner with any protease. ClpA functions as the regulatory component of the ATP dependent protease complex ClpAP, and also remodels inactive RepA dimers into active monomers in the absence of the protease. Because ClpA functions both with and without a proteolytic component, it is an ideal system for developing strategies that address one of the major challenges in the study of protein remodeling machines: how do we observe a reaction in which the substrate protein does not undergo covalent modification? Here, we review experimental designs developed for the examination of polypeptide translocation catalyzed by the AAA+ motors in the absence of proteolytic degradation. We propose that transient state kinetic methods are essential for the examination of elementary kinetic mechanisms of these motor proteins. Furthermore, rigorous kinetic analysis must also account for the thermodynamic properties of these complicated systems that reside in a dynamic equilibrium of oligomeric states, including the biologically active hexamer.

#### Keywords: ClpA, ClpB, Hsp104, translocation mechanism, kinetics, thermodynamics

# INTRODUCTION

The central dogma of molecular biology tells us that proteins are constantly being produced by the cell upon exposure to environmental stresses, nutrients, and metabolites. For example, if we expose cells to a source of lactose we know that synthesis of all of the proteins responsible for lactose metabolism will be upregulated in response. However, the central dogma does not address what happens to those gene products when the lactose is gone. Indeed, the cytosol is a protein rich environment. However, every protein that was produced to respond to stimuli cannot persist in the cytosol when the stimuli are removed and the protein is no longer needed. Rebinding of repressors and removal of the mRNA are two aspects of this. Yet stemming the flow of nascent protein does not address the manner in which they are removed when new and different proteins are needed.

Generally, longer-lived proteins are sequestered into lysosomes for degradation. Shorter-lived proteins are degraded in the cytosol. The presence of a PEST region (region rich in proline, glutamate, serine, and threonine) has been associated with shorter protein half-lives (Rogers et al., 1986). The N-end rule, proposed in the 1980s and expanded upon since then, holds that certain amino terminal residues promote ubiquitination in eukaryotes and proteolysis, two ATP-dependent processes occurring within the cytosol (Bachmair et al., 1986). Overall, cytosolic proteins can have half-lives ranging from minutes, to hours, to days (for reviews, see Dice, 1987; Varshavsky, 1996).
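As a reminder of what a half-life quantifies, the relation below assumes that a protein is lost by simple first-order kinetics. That assumption, and the symbols $P_0$ (initial amount) and $k_{\mathrm{deg}}$ (apparent degradation rate constant), are introduced here for illustration and are not taken from the cited studies.

```latex
\[
  P(t) = P_0 \, e^{-k_{\mathrm{deg}} t},
  \qquad
  t_{1/2} = \frac{\ln 2}{k_{\mathrm{deg}}}
\]
```

Under this assumption, a half-life of 10 min corresponds to $k_{\mathrm{deg}} \approx 0.069\ \mathrm{min^{-1}}$, and a half-life of 1 day to $k_{\mathrm{deg}} \approx 4.8 \times 10^{-4}\ \mathrm{min^{-1}}$.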

Proteolysis in the cytosol is a potentially dangerous activity for the cell, so removal of proteins that are no longer required presents a challenge. The cell cannot have unregulated proteolysis running rampant in the cytosol. Unregulated proteolysis in the cytosol would deplete necessary, active proteins. In fact, because dysregulation of cytosolic proteases is deadly to cells, it has been explored as an antibacterial strategy (Brotz-Oesterhelt et al., 2005; Hinzen et al., 2006).

The challenge of regulating proteolysis in the cytosol is met by ATP dependent proteases (for review, see Sauer et al., 2004). However, what is the requirement for ATP in ATP dependent proteolysis? Peptide bond cleavage is exergonic. Proteases do not require an energy source to catalyze proteolysis. For example, serine proteases, cysteine proteases, aspartic proteases, etc. simply bind to a polypeptide chain and cleave the peptide bond. AAA+ (ATPases associated with a variety of cellular activities) motors and ATP serve as the regulators of proteolytic activity in the protein rich environment of the cytosol.

Across species, ATP dependent proteases are composed of a barrel shaped protease with proteolytic active sites lining the interior cavity (for review see Sauer and Baker, 2011; Olivares et al., 2016). These active sites are accessible by a pore on each end of the barrel that is too small for folded proteins to enter without first being unfolded. Certain AAA+ hexameric ring motors associate with each end of the barrel and couple the energy from ATP binding and hydrolysis to processive translocation of a polypeptide chain through the axial channel of the hexameric ring and into the proteolytic cavity of the protease. Thus, the energy source in an ATP dependent proteolytic reaction serves to both unfold the protein and processively translocate the unfolded polypeptide chain into the proteolytic chamber.

The 26S proteasome in humans and bacterial ClpAP are examples of ATP dependent proteases. ClpA is a AAA+ motor protein that contains two ATP binding sites per monomer and assembles into hexameric rings. These hexameric rings bind to one or both ends of the tetradecameric serine protease ClpP to form ClpAP. ClpA catalyzes protein unfolding and translocation of the polypeptide chain into the proteolytic cavity of ClpP.
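The subunit bookkeeping implied by this description can be sketched as follows. The stoichiometries (six ClpA protomers per ring, two ATP binding sites per protomer, fourteen ClpP subunits, one or two rings per ClpP) come from the text; the function and key names are hypothetical.

```python
SUBUNITS_PER_CLPA_HEXAMER = 6    # hexameric ring
ATP_SITES_PER_CLPA_MONOMER = 2   # two ATP binding sites per monomer
SUBUNITS_PER_CLPP = 14           # tetradecameric serine protease


def clpap_composition(n_clpa_hexamers: int) -> dict:
    """Count subunits and ATP binding sites for ClpP capped by 1 or 2 ClpA rings."""
    if n_clpa_hexamers not in (1, 2):
        raise ValueError("ClpP has two ends, so it can carry 1 or 2 ClpA hexamers")
    clpa_subunits = n_clpa_hexamers * SUBUNITS_PER_CLPA_HEXAMER
    return {
        "clpa_subunits": clpa_subunits,
        "clpp_subunits": SUBUNITS_PER_CLPP,
        "atp_binding_sites": clpa_subunits * ATP_SITES_PER_CLPA_MONOMER,
    }


for n in (1, 2):
    print(n, clpap_composition(n))
```

A doubly capped complex therefore carries 24 ATP binding sites. This says nothing about how many are occupied or hydrolyzing at any moment.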

Like proteases in the cytosol, enzyme catalyzed protein unfolding in the cytosol is potentially dangerous for the cell. However, this function emerges, or putatively emerges, in many biological contexts. For example, both ClpA and ClpX, another AAA+ motor that associates with ClpP, catalyze "protein remodeling" reactions in the absence of the proteolytic component, ClpP. ClpA remodels an inactive dimer of RepA into two active monomers (Wickner et al., 1994) and ClpX remodels the highly salt-stable MuA transposase (Levchenko et al., 1995; Kruklitis et al., 1996) to induce dissociation from DNA. More recently, mitochondrial ClpX was reported to partially unfold ALA synthase in a translocation-dependent mechanism to facilitate pyridoxal phosphate cofactor binding during heme biosynthesis (Kardon et al., 2015). Although it is well established that both ClpA and ClpX processively translocate a substrate into ClpP for the purposes of proteolytic degradation, it is not clear if the motors fully translocate a substrate during protein remodeling reactions. Thus, the question remains: do the motors need to fully translocate a substrate to catalyze such protein remodeling reactions? Furthermore, do they use the same elementary mechanisms to translocate substrates for proteolytic degradation as they do for protein remodeling reactions?

The AAA+ motors Katanin (McNally and Vale, 1993) and Spastin (Hazan et al., 1999) catalyze microtubule severing. Microtubule severing could also be classified as a protein remodeling reaction. It is thought that Katanin and Spastin catalyze this reaction by binding to unstructured tails on α- and β-tubulin (Roll-Mecak and McNally, 2010). Then, using the energy from ATP, they either fully or partially translocate the tubulin molecule through their axial channel. Once a monomer of tubulin is removed from the microtubule, a severing event occurs.

The N-ethylmaleimide-sensitive fusion protein (NSF) is a AAA+ motor involved in vesicle fusion (Block et al., 1988; Fleming et al., 1998; Dalal et al., 2004; Zhao et al., 2012). Specifically, the protein is responsible for disassembly of tightly associated SNARE proteins. NSF may also catalyze partial or complete unfolding/translocation in the process of disassembling the SNARE complex.

The AAA+ motors bacterial ClpB and yeast Hsp104 have the unique ability to recognize and disrupt protein aggregates in vivo. It has been hypothesized that these enzymes processively translocate a polypeptide chain out of a protein aggregate and through their hexameric ring structure (Weibezahn et al., 2004; Tessarz et al., 2008). However, more recent results suggest that complete translocation may not be the case (Li et al., 2015b).

One common thread among Katanin, Spastin, NSF, ClpB, and Hsp104 is that they do not interact with a protease and they are not, themselves, proteases. Thus, they do not covalently modify the substrate on which they operate. This lack of proteolytic activity leads to a technical barrier in addressing the question of whether these enzymes pass a polypeptide chain through their axial channels fully or partially. This is, in part, because unfolding alone is not evidence for complete passage. A number of studies have used GFP and its variants to examine the unfolding reaction (Weber-Ban et al., 1999; Kim Y. I. et al., 2000). However, it remains unclear how much of the GFP tertiary structure needs to be unfolded before the fluorescence is extinguished. Thus, loss of fluorescence does not allow one to conclude that complete translocation has occurred.

Complete proteolytic degradation catalyzed by ClpP is the evidence for complete translocation catalyzed by ClpA and ClpX. Much of what has been learned about translocation catalyzed by ClpA and ClpX has been determined from observing proteolytic degradation catalyzed by the protease, ClpP, in ClpAP and ClpXP, respectively. However, this leads to the question: do the motors catalyze processive translocation the same way in the absence of the proteolytic component as they do in its presence? Determining the mechanism of complete translocation catalyzed by ClpA or ClpX without covalent modification of the substrate presents the same technical difficulty as that articulated for the other AAA+ motors mentioned so far, i.e., the substrate on which they operate is not covalently modified.

This review is focused on efforts to examine polypeptide translocation catalyzed by AAA+ motors in the absence of proteolytic degradation. We have sought to develop a set of tools that would allow us to use transient state kinetics to examine the elementary kinetic mechanism of enzyme catalyzed protein unfolding and translocation. Specifically, we sought to determine the elementary rate constants as well as the step-size (distance per step) that define the elementary mechanism of translocation. To this end, the work began with developing strategies to examine ClpA since it was known to be a processive translocase. The work has continued by applying these approaches to the protein disaggregating machines ClpB/Hsp104. However, the work quickly revealed that in order to fully interpret the kinetic mechanistic observations a number of questions regarding the energetics of assembly and ligand binding required attention. These issues are discussed below, building on an overview of the structure of these proteins.
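To illustrate how a step-size and an elementary rate constant together describe translocation, the sketch below uses the simplest sequential scheme: a polypeptide of length L is translocated in n = L/m identical, irreversible steps of m amino acids, each with rate constant k_t. This scheme, the function name, and the example numbers are assumptions chosen for illustration. They are not the fitted mechanism or parameters of any of the motors discussed here.

```python
import math


def n_step_summary(length_aa: float, step_size_aa: float, k_t: float) -> dict:
    """Summarize an idealized n-step sequential translocation model.

    length_aa    : polypeptide length to translocate (amino acids)
    step_size_aa : distance moved per step, m (amino acids per step)
    k_t          : rate constant of each step (per second)
    """
    if min(length_aa, step_size_aa, k_t) <= 0:
        raise ValueError("all inputs must be positive")
    n_steps = length_aa / step_size_aa
    return {
        "n_steps": n_steps,
        # The sum of n exponential dwell times has mean n/k and std sqrt(n)/k.
        "mean_time_s": n_steps / k_t,
        "std_time_s": math.sqrt(n_steps) / k_t,
        # Macroscopic rate is the product of step-size and rate constant.
        "overall_rate_aa_per_s": step_size_aa * k_t,
    }


# Hypothetical numbers: 50 amino acids, 5 amino acids per step, 2 steps per second.
print(n_step_summary(50, 5, 2))
```

The same overall rate (m times k_t) can arise from many small fast steps or a few large slow ones. This is why measuring the macroscopic rate alone does not determine the step-size.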

# STRUCTURAL FEATURES OF ClpA, ClpB, AND Hsp104

#### Primary through Tertiary Structure

ClpA, ClpB, and Hsp104 share similarities that have formed the basis for their classification. They are members of the AAA+ superfamily that are further classified as Hsp100 proteins for their roles in coupling ATPase activity to changes in the folding and/or assembly of substrate clients (Schirmer et al., 1996; Neuwald et al., 1999). Hsp100 members are partitioned into two classes based on the number of nucleotide binding domains (NBDs) contained per monomeric unit. Class I proteins, such as ClpA, ClpB, and Hsp104, contain two NBDs while Class II proteins, such as ClpX, contain a single NBD per monomer. In the presence of ATP, these proteins assemble into homohexameric ring-like structures that perform their chaperone activity. ATP binding and hydrolysis occur at canonical Walker A and B motifs contained within each nucleotide binding domain (Walker et al., 1982).

The protomer structures of ClpA, ClpB, and Hsp104 have been reported from various organisms in various nucleotide-bound states. In the case of ClpA, the monomer structure has been reported from Escherichia coli ClpA in the ADP-bound state (Guo et al., 2002b). For the disaggregases, atomic resolution crystal structures have been reported for Thermus thermophilus ClpB in the AMPPNP-bound state (Lee et al., 2003) and for Chaetomium thermophilum Hsp104 in complex with ADP (Heuck et al., 2016). Comparison of the available protomer structures (**Figure 1A**) as well as the primary sequences of the three motors, highlights their shared structural features. In general, each monomer is made up of an N-domain, nucleotide binding domains 1 (NBD1) and 2 (NBD2) joined by a linker region, and a C-terminal domain. The residues that separate the Walker A and Walker B motifs in each NBD have been modeled to form a loop that extends into the axial channel of the hexameric ring structures. Evidence from multiple studies has implicated conformational changes of these residues with ATP hydrolysis at each NBD in the mechanism of polypeptide substrate translocation by ClpA, ClpB/Hsp104, as discussed below in Sections Mechanisms of Polypeptide Translocation by ClpA and ClpAP and Mechanism of Translocation by ClpB/Hsp104, as well as other AAA+ motors (Yamada-Inagawa et al., 2003; Schlieker et al., 2004; Weibezahn et al., 2004; Hinnerwisch et al., 2005; Martin et al., 2008; Biter et al., 2012; Zeymer et al., 2014). As shown in **Figure 1B**, single particle reconstructions of the hexamer structures for the three motors predict similar arrangements of each domain within the quaternary structure. This structural similarity, in part, formed the basis for the hypothesis that ClpA, ClpB, and Hsp104 operate on substrate proteins through a shared mechanism.

ClpB and Hsp104 share an important feature that ClpA lacks. There is a middle domain (MD) located within the C-terminal end of NBD1 (**Figure 1A**). In the tertiary structure, this region adopts a coiled-coil fold, made up of four α-helices, that extends ∼85 Å from NBD1 (**Figure 1B**). This domain is flexible and restriction of this flexibility has been shown to decrease disaggregation activity (Lee et al., 2003). MD flexibility has made its position and orientation within the hexamer difficult to assign in the multiple ClpB/Hsp104 structures available. The variable MD orientations in hexameric models have led to the hypothesis that nucleotide driven conformational switching of the MD may be an important part of the ClpB/Hsp104 disaggregation mechanism (Oguchi et al., 2012; Seyffer et al., 2012; Rosenzweig et al., 2013). Various studies have also shown the MD to be the binding target of ClpB/Hsp104 co-chaperones, DnaK/Hsp70 (Sielaff and Tsai, 2010; Miot et al., 2011; DeSantis et al., 2012; Seyffer et al., 2012; Lee et al., 2013; Rosenzweig et al., 2013; DeSantis et al., 2014; Doyle et al., 2015).

One other structural element distinguishes the protein translocase, ClpA, from the protein disaggregases ClpB/Hsp104: the presence or absence of a tripeptide motif requisite for the assembly with ClpP. ClpA hexamers interact with the protease ClpP through a conserved IGL/F motif nestled in a helix-loop-helix region near the C-terminal end of NBD2 (Kim et al., 2001). ClpB and Hsp104 lack that IGL/F motif, and accordingly, do not naturally associate with ClpP or any known protease.

#### Quaternary Structure and Nucleotide-Linked Self-Assembly

In the presence of nucleotide, ClpA, ClpB, and Hsp104 oligomerize to form homo-hexamers that interact with client substrates and partner proteins. Structural models of the hexameric state have been reported for all three motors, in various nucleotide-bound states (Guo et al., 2002b; Lee et al., 2003, 2010; Wendler et al., 2007, 2009; Effantin et al., 2010; Carroni et al., 2014; Heuck et al., 2016; Yokom et al., 2016).

In most cases, the hexameric state is reported to be a planar, ring-like structure with a central axial channel as shown in **Figure 1C**. In these models and single-particle reconstructions, the NBDs from each protomer align side-by-side around the hexamer, forming a NBD1 tier and a NBD2 tier. Hexamer models that capture the orientation of the flexible N-domain, have a third N-domain tier above NBD1, as seen for the hexameric single particle reconstructions in **Figure 1C**. ClpB and Hsp104 hexamers additionally have the MD protruding from the NBD1 tier.

Recently, an alternative asymmetric spiral structure has been reported for the Hsp104 hexamer in the AMPPNP-bound state, and in the ATPγS-bound state with casein bound as a substrate (Yokom et al., 2016; Gates et al., 2017), spurring interest and speculation about its structural implications for the disaggregation mechanism. Similarly, Ripstein et al. recently reported images of another AAA+ protein, VAT, which threads protein substrates through its axial channel into the proteasome for degradation, in transient, asymmetric conformations (Ripstein et al., 2017). These asymmetric hexameric structures observed by cryo-EM are similar to the extended spirals reported previously in crystallographic studies (Guo et al., 2002b; Lee et al., 2003; Heuck et al., 2016). These provocative asymmetric structures invite further investigation. Biochemical assays will be key in determining how the asymmetric Hsp104 spiral structure fits into the disaggregation mechanism. This and other efforts to discern the mechanistic details of substrate processing by ClpA, ClpB, and Hsp104, will require the ability to precisely quantify the concentration of hexamers competent for polypeptide substrate binding.

*Figure caption (fragment):* … down through the axial channel. These images were prepared using UCSF Chimera (Computer Graphics Laboratory, University of California, San Francisco).

Many studies have established that ClpA and ClpB reside in a distribution of oligomers in the absence of nucleotide (Maurizi et al., 1998; Zolkiewski et al., 1999; Akoev et al., 2004; Veronese et al., 2009; del Castillo et al., 2011). Hydrodynamic studies from Maurizi and co-workers concluded that ClpA resides in a distribution of monomers and dimers in the absence of nucleotide and that ATP is required for assembly into hexamers (Maurizi et al., 1998). In later work, Kress et al. reported that ClpA hexamerization occurs through a transient tetramer intermediate (Kress et al., 2007). Using hydrodynamic and thermodynamic techniques, it was later shown that ClpA resides in a distribution of monomers, dimers, and tetramers in the absence of nucleotide (Veronese and Lucius, 2010; Veronese et al., 2011), thereby showing that the tetramer is not a transient intermediate on the pathway to assembly but is significantly populated at thermodynamic equilibrium, independent of path. Notably, in the presence of excess nucleotide, ClpA hexamers as well as lower order oligomers remain in solution (Veronese et al., 2011; Li and Lucius, 2013). However, a complete quantification of the nucleotide-linked assembly reaction is still needed.

On the other hand, the energetics of ClpB self-assembly in the absence and presence of nucleotide have been quantified (Lin and Lucius, 2015b, 2016). ClpB, like ClpA, resides in a distribution of monomers, dimers, tetramers, and hexamers. An important distinction between the two motors is the observation that ClpB, unlike ClpA, forms hexamers in the absence of nucleotide. A rigorous, in-depth investigation of the self-assembly of Hsp104 is currently lacking in the field. However, recent results suggest that, similar to ClpB, Hsp104 populates hexamers and lower order oligomers in both the absence and presence of nucleotide (Weaver et al., 2017). Taken together, these quantitative investigations of ClpA and ClpB self-assembly reveal that macromolecular assembly is thermodynamically linked to nucleotide binding. This has fundamental implications for the driving forces that tune the population of each oligomer in solution.

Specifically, two thermodynamic driving forces govern the self-assembly of these enzymes into hexamers: the free monomer concentration and the free nucleotide concentration. As a result, assays in which the concentration of protein or nucleotide changes throughout the experiment must account for the changing distribution of oligomers. Failure to do so can lead to conclusions about nucleotide processing at each NBD and NBD1-NBD2 interdependence that could otherwise be explained by changes in the macromolecular state.

In much of the published work on ClpA, ClpB, and Hsp104 it has been generally assumed that in the presence of 1–2 mM nucleotide, all of the protein is in the hexameric state. This assumption is generally supported with size exclusion chromatography (SEC). However, SEC is a nonequilibrium technique, meaning that the equilibrium is perturbed by running the sample through the column. That is to say, the chemical potentials of both the protein and the nucleotide are changing throughout the experiment, and therefore the distribution of oligomeric states is changing throughout the experiment. Moreover, the observation of hexamers in SEC does not rule out the presence of smaller oligomers. Further, it does not rule out the possibility that the self-association equilibrium has been perturbed upon introduction of a mutation in the protein.

It is clear from the self-association and polypeptide binding properties of ClpA and ClpB that smaller order oligomers do persist at saturating concentrations of nucleotide (Veronese et al., 2011; Li and Lucius, 2013; Li et al., 2015a; Lin and Lucius, 2016). For example, **Figure 2** shows the fraction of ClpB oligomers populated in the presence of 100 µM and 2 mM nucleotide as a function of total [ClpB], simulated from the reported energetic parameters for ClpB assembly (Lin and Lucius, 2015b, 2016). In the presence of 100 µM nucleotide (**Figure 2A**), a 1 µM ClpB sample would be made up of ∼6% hexamers, while 94% of the population would reside in a mixture of monomers, dimers, and tetramers. In the presence of 2 mM nucleotide (**Figure 2B**), the same sample would reside in a distribution made up of 74% hexamers and 26% lower order oligomers. In fact, under these conditions, even at 10 µM ClpB, the hexameric state is not fully populated, with hexamers making up ∼89% of the total [ClpB]. This fact severely limits the ability to draw conclusions about the ATPase activity of the hexamer from steady state kinetic measurements where a different distribution of oligomers is present at each substrate concentration.

The simplest explanation for why it is problematic to assume that all motor protein is in the hexameric state is that the Michaelis-Menten equation is scaled linearly by the total enzyme concentration, i.e., V<sub>max</sub> = k<sub>cat</sub> × E<sub>0</sub>. Recall that E<sub>0</sub> is the total amount of enzyme in the experiment, which is controlled by the experimentalist, whereas E is the free (unbound) enzyme concentration at any given time, which is unknown to the experimentalist. Thus, the maximum velocity is measured at saturating substrate concentration and divided by the known total enzyme concentration, E<sub>0</sub>, and the k<sub>cat</sub> is reported.

total [ClpB] in µM monomer.

It is important to recall that V<sub>max</sub> = k<sub>cat</sub> × E<sub>0</sub> emerges from two assumptions in the derivation of the Michaelis-Menten equation. The first is that the substrate is in large excess over the enzyme. The total substrate concentration relative to the total enzyme concentration is controlled by the experimentalist but, mathematically, it results in being able to assume [S]k<sub>1</sub> is a constant in the first differential equation, given by Equation (1) for Scheme 1.

$$\frac{d\left[ES\right]}{dt} = \left[E\right]\left[S\right]k\_1 - \left(k\_2 + k\_3\right)\left[ES\right] \tag{1}$$

$$E + S \overset{k\_1}{\underset{k\_2}{\rightleftharpoons}} ES \overset{k\_3}{\rightarrow} E + P$$

#### Scheme 1

The second assumption, which is based on the first, is that because the substrate is in large excess over the enzyme, the concentration of ES is considered constant, or "in the steady state." If ES is constant, then the differential equation above is set to zero and solved algebraically for ES. However, to do this the free enzyme term must be replaced with E<sub>0</sub> − ES. This replacement is valid if and only if [ES] is constant, which is our underpinning assumption. Thus, under constant ES conditions the conservation of mass equation, E<sub>0</sub> = E + ES, can be rearranged to E = E<sub>0</sub> − ES.

The assumption that the total enzyme, E<sub>0</sub>, is equal to the free enzyme, E, plus the bound enzyme, ES, only holds for a non-dissociating macromolecule. Understandably, this was not pointed out by Michaelis and Menten. However, we have not seen it expressly stated since.

The assumptions hold for self-associating systems that do not reside in dynamic equilibria, for example, if E forms only dimers and does not dissociate into monomers. Alternatively, if the experimentalist can maintain the concentration of the enzyme in large excess over the dimerization dissociation equilibrium constant then it may be possible to assume only dimers reside in solution. However, one has to be certain that by doing this they do not simultaneously violate the assumption that the substrate is maintained in large excess over the enzyme concentration.

If the enzyme exists in a dynamic equilibrium between monomers and dimers, and concentrations of enzyme below the dimerization dissociation equilibrium constant are used, then the assumption is violated. The issue is made much more complicated for ClpA and ClpB, where we, and others, have shown that both enzymes reside in a mixture of monomers, dimers, tetramers, and hexamers (del Castillo et al., 2011; Veronese et al., 2011; Lin and Lucius, 2015b, 2016). Moreover, the populations of these species are governed by the free concentration of the substrate (nucleotide). Consequently, the Michaelis-Menten equation will not be scaled by a simple relationship like k<sub>cat</sub> × E<sub>0</sub>. This is because the simplest relationship that one can write down that relates the known total monomer concentration to the species that reside in solution for a system such as ClpA or ClpB is given by Equation (2):

$$E\_0 = E + 2E\_2 + 4E\_4 + 6E\_6 + \sum\_{i=1}^{2} ES\_i + 2\sum\_{i=1}^{4} E\_2 S\_i + 4\sum\_{i=1}^{8} E\_4 S\_i + 6\sum\_{i=1}^{12} E\_6 S\_i \tag{2}$$

where the subscript on E represents the oligomeric state and the subscript on S represents the number of nucleotides bound to that oligomer, represented with the counting index, i. There is no simple algebraic way to rearrange Equation (2) so that it can replace E in the differential equation given by Equation (1). Only if hexamers are the sole oligomers in solution does Equation (2) simplify to Equation (3):

$$E\_0 = 6E\_6 + 6\sum\_{i=1}^{12} E\_6 S\_i \tag{3}$$

Equation (4) is typically applied to the analysis of steady-state ATPase experiments on ClpA and ClpB. The total monomer concentration is divided by six, the V<sub>max</sub> is measured and divided by E<sub>0</sub>/6, and a k<sub>cat</sub> is reported.

$$\frac{E\_0}{6} = E\_6 + \sum\_{i=1}^{12} E\_6 S\_i \tag{4}$$
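The normalization implied by Equation (4) can be illustrated with a minimal Python sketch. The V<sub>max</sub> value and the function name `apparent_kcat` are hypothetical. The 74% figure is the hexamer population quoted earlier for 1 µM ClpB at 2 mM nucleotide, treated here as the fraction of protomers in hexamers, and the sketch assumes that only hexamers hydrolyze ATP.

```python
def apparent_kcat(vmax, total_monomer, hexamer_fraction=1.0):
    """Divide Vmax by the hexamer concentration.

    hexamer_fraction is the fraction of protomers residing in hexamers;
    the default of 1.0 reproduces the E0/6 normalization of Equation (4).
    """
    hexamer_conc = hexamer_fraction * total_monomer / 6.0
    return vmax / hexamer_conc


# Hypothetical measurement: Vmax in µM ATP per second at 1 µM total monomer.
vmax = 0.5
e0 = 1.0

kcat_conventional = apparent_kcat(vmax, e0)        # assumes 100% hexamers
kcat_corrected = apparent_kcat(vmax, e0, 0.74)     # 74% of protomers in hexamers

print(f"kcat assuming all hexamers: {kcat_conventional:.2f} per s")
print(f"kcat per populated hexamer: {kcat_corrected:.2f} per s")
```

Under these assumptions the conventional normalization underestimates the turnover per hexamer, and the size of the discrepancy changes whenever the protein or nucleotide concentration changes.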

But what does this parameter mean when we know that the system resides in a dynamic equilibrium and the total enzyme concentration is actually given by Equation (2)? The answer may be that the k<sub>cat</sub> is not that meaningful because it has been acquired by dividing V<sub>max</sub> by a concentration that does not reflect the true hexamer concentration. However, the V<sub>max</sub> itself contains meaningful information. Contained within the V<sub>max</sub> is information about the self-association equilibrium constants and the nucleotide binding constants. This is because the concentration terms in Equation (2) can be replaced with the appropriate self-association equilibrium constants and the nucleotide binding constants given by Equation (5):

$$E\_0 = [E] \, P\_1 + 2L\_{2,0}[E]^2 P\_2 + 4L\_{4,0}[E]^4 P\_4 + 6L\_{6,0}[E]^6 P\_6 \tag{5}$$

where L<sub>2,0</sub>, L<sub>4,0</sub>, and L<sub>6,0</sub> represent the self-association equilibrium constants for the formation of dimers, tetramers, and hexamers in the absence of nucleotide, respectively. The first subscript represents the oligomeric state and the second subscript represents the number of nucleotides bound; E and E<sub>0</sub> are as above; and P<sub>1</sub>, P<sub>2</sub>, P<sub>4</sub>, and P<sub>6</sub> are the partition functions for nucleotide binding to the monomer, dimer, tetramer, and hexamer, respectively. Each of the partition functions is a function of the nucleotide binding equilibrium constants and the free nucleotide concentration. Although there are many forms that the partition functions could take, one example for binding to the monomer could be given by Equation (6):

$$P\_1 = \left(1 + K\_1 \left[ATP\right] + K\_1 K\_2 \left[ATP\right]^2\right) \tag{6}$$

where K<sub>1</sub> and K<sub>2</sub> would represent the equilibrium constants for binding to NBD1 and NBD2, respectively. This leads to the conclusion that if one observes differences in the V<sub>max</sub> for various point mutations in the enzyme, especially mutations in the ATPase active site, then there are three potential explanations. The first is that the activity has been affected, which is the typical interpretation. However, the second and third explanations are that the nucleotide binding affinity or the self-association equilibrium has been affected by the mutation. If the mutation has perturbed the self-association equilibrium and/or the nucleotide binding affinity, then comparisons of ATPase activity between variants and wild type enzymes at the same fixed protein concentration are not reporting on the same concentrations of hexamers catalyzing ATP turnover. Again, showing that hexamers still form upon introduction of a mutation does not show that the self-association reaction has not been perturbed.
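To make Equation (5) concrete, the following Python sketch solves it numerically for the free monomer concentration and reports the fraction of protomers in each oligomeric state. All numerical constants are hypothetical placeholders, not the published ClpA or ClpB parameters. P<sub>1</sub> follows Equation (6), whereas the identical-and-independent-sites form used for P<sub>2</sub>, P<sub>4</sub>, and P<sub>6</sub> is an assumption made only for illustration.

```python
def p_monomer(atp, k1, k2):
    """Equation (6): binding polynomial for the two sites of a monomer."""
    return 1.0 + k1 * atp + k1 * k2 * atp**2


def p_identical_sites(atp, k, n_sites):
    """Assumed form for the oligomers: n identical, independent sites."""
    return (1.0 + k * atp) ** n_sites


def species_terms(e, L, P):
    """Monomer-unit concentration of each oligomer: the terms of Equation (5)."""
    return {
        1: e * P[1],
        2: 2 * L[2] * e**2 * P[2],
        4: 4 * L[4] * e**4 * P[4],
        6: 6 * L[6] * e**6 * P[6],
    }


def oligomer_fractions(e0, L, P, iterations=200):
    """Solve Equation (5) for free monomer e by bisection, then normalize by e0."""
    lo, hi = 0.0, e0  # the root lies in [0, e0] because every P >= 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(species_terms(mid, L, P).values()) > e0:
            hi = mid
        else:
            lo = mid
    e = 0.5 * (lo + hi)
    return {n: term / e0 for n, term in species_terms(e, L, P).items()}


# Hypothetical constants (concentrations in µM); NOT fitted values.
L = {2: 0.05, 4: 1e-3, 6: 1e-3}


def partition_functions(atp):
    return {
        1: p_monomer(atp, k1=1e-3, k2=1e-3),
        2: p_identical_sites(atp, k=1e-3, n_sites=4),
        4: p_identical_sites(atp, k=1e-3, n_sites=8),
        6: p_identical_sites(atp, k=2e-3, n_sites=12),
    }


for atp in (0.0, 1000.0):
    fractions = oligomer_fractions(1.0, L, partition_functions(atp))
    print(atp, {n: round(f, 3) for n, f in fractions.items()})
```

With these placeholder values the hexamer population rises when nucleotide is added, but a variant with altered L or K values would give a different hexamer concentration at the same total protein, which is the point made in the text.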

The resolution to this problem is to employ a thermodynamically rigorous technique that would allow one to measure the equilibrium constants and accurately predict the concentration of the active species in solution (Lin and Lucius, 2015a,b; Lin and Lucius, 2016). In other words, define the thermodynamic parameters in Equation (5) and use them to interpret the kinetic/mechanistic data. In general, the apparent self-association constant for the ligand linked assembly of ClpB would be given by Equation (7):

$$L\_{6,app} = \frac{\left[\text{ClpB}\_6\right] + \sum\_{i=1}^{12} \left[\text{ClpB}\_6 ATP\gamma S\_i\right]}{\left(\left[\text{ClpB}\_1\right] + \sum\_{i=1}^{2} \left[\text{ClpB}\_1 ATP\gamma S\_i\right]\right)^6} = \frac{\left\{\text{ClpB}\_6\right\}}{\left\{\text{ClpB}\_1\right\}^6} \tag{7}$$

where the numerator represents the summation of all of the nucleotide ligation states of hexameric ClpB in solution and the denominator represents all of the nucleotide ligation states in the monomeric state. The curly braces on the right hand side of Equation (7) are used as a shorthand notation for the summation on the left. Equation (7) can be simplified to Equation (8):

$$L\_{6,app} = L\_{6,0} \cdot \frac{P\_6}{(P\_1)^6} \tag{8}$$

where L<sub>6,0</sub> is as above, the hexamerization equilibrium constant in the absence of nucleotide, and P<sub>6</sub> and P<sub>1</sub> are the partition functions for nucleotide binding to the hexamer and the monomer, respectively. We showed, for ClpB (Lin and Lucius, 2016), that the apparent hexamerization equilibrium constant is given by Equation (9):

$$L\_{6,app} = L\_{6,0} \cdot \frac{\left(1 + \kappa\_6 \cdot [ATP\gamma S]\_f\right)^{m\_6}}{\left(\left(1 + \kappa\_1 \cdot [ATP\gamma S]\_f\right)^{m\_1}\right)^6} \tag{9}$$

where the partition functions for nucleotide binding to the hexamer and the monomer in Equation (8) are given by the partition functions for the n-independent and identical sites model, a model that is commonly used to analyze ITC data and was applied to ITC data for ClpB binding ADP (Carroni et al., 2014). In this model κ<sub>1</sub> and κ<sub>6</sub> are the average step-wise equilibrium constants for nucleotide binding, and m<sub>1</sub> and m<sub>6</sub> are the stoichiometries of binding to monomers and hexamers, respectively. In a thermodynamically rigorous and model-independent analysis of our data we showed that 12 ATPγS molecules were bound to hexameric ClpB and one ATPγS was bound to the monomer. L<sub>6,0</sub> was determined in an analysis of assembly in the absence of nucleotide (Lin and Lucius, 2015b), and from an analysis of the dependence of L<sub>6,app</sub> on ATPγS we determined κ<sub>6</sub> and κ<sub>1</sub> (Lin and Lucius, 2016).
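Equation (9) is simple enough to evaluate directly. The Python sketch below does so with the stoichiometries stated above (m<sub>6</sub> = 12, m<sub>1</sub> = 1) as defaults. The values of L<sub>6,0</sub>, κ<sub>6</sub>, and κ<sub>1</sub> are hypothetical round numbers chosen for easy arithmetic, not the published constants, and the function name `l6_app` is introduced only for this illustration.

```python
def l6_app(l6_0, ligand_free, kappa6, kappa1, m6=12, m1=1):
    """Apparent hexamerization constant according to Equation (9)."""
    p6 = (1.0 + kappa6 * ligand_free) ** m6   # partition function, hexamer
    p1 = (1.0 + kappa1 * ligand_free) ** m1   # partition function, monomer
    return l6_0 * p6 / p1**6


L6_0 = 1.0        # hypothetical, arbitrary units
KAPPA = 0.01      # hypothetical affinity, per µM
ligand = 100.0    # µM free ATPγS

reference = l6_app(L6_0, ligand, KAPPA, KAPPA)
weaker_hexamer_binding = l6_app(L6_0, ligand, KAPPA / 2, KAPPA)
half_stoichiometry = l6_app(L6_0, ligand, KAPPA, KAPPA, m6=6)

print("reference:             ", reference)
print("kappa6 halved:         ", round(weaker_hexamer_binding, 2))
print("m6 reduced from 12 to 6:", half_stoichiometry)
```

Because the hexamer partition function is raised to the power m<sub>6</sub>, modest changes in κ<sub>6</sub> or in the binding stoichiometry shift L<sub>6,app</sub> by orders of magnitude in this sketch.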

What is most striking, telling, and predictive about Equations (8) and (9) is that they are the simple product of two terms: the hexamerization equilibrium constant in the absence of nucleotide multiplied by the ratio of partition functions for nucleotide binding. If one seeks to introduce a mutation into a protein like ClpB, then these are the parameters to interrogate. The mutation would have the ability to influence L<sub>6,0</sub>, which represents the intrinsic propensity of the protein to assemble into hexamers. More likely, however, a mutation, especially one in the ATPase active site, will influence the affinity for nucleotide. It seems highly unlikely that the affinity for nucleotide binding to the hexamer, κ<sub>6</sub>, would not change upon introduction of a mutation in the ATP binding site. Whether the intrinsic propensity of the enzyme to assemble or the nucleotide binding affinity is perturbed, Equation (9) predicts that the concentration of hexamers in solution will be affected.

The unanswered question we now seek to address is how do partner proteins influence this equilibrium? A hallmark of AAA+ protein unfoldases is that they interact with partner proteins. ClpA interacts with the protease, ClpP and various adaptor proteins. ClpB interacts and collaborates with the KJE system and Hsp104 collaborates with Hsp70 and Hsp40. Equation (9) predicts that if these protein-protein interactions perturb the nucleotide binding by either modulating the stoichiometry or affinity then this will perturb the hexamerization equilibrium constant and thereby the concentration of hexamers present in solution. It is tempting to assert that partner proteins like ClpP and the KJE system would stabilize the hexamers. However, for a ligand linked assembling system, Equation (9) informs us that the interaction could stabilize or destabilize. In fact, since the nucleotide concentration in the cell is well above the affinity constant, here we hypothesize that the ability of partner proteins to modulate the nucleotide binding affinity allows for fine control over the concentration of hexamers present and available to do work. With a detailed analysis of ClpB assembly, we now stand poised to determine how the KJE system influences self-association.

Similarly, several groups have reported that the steady-state ATP hydrolysis rate for ClpA is reduced in the presence of ClpP (Kress et al., 2009; Baytshtok et al., 2015). In addition to ClpP exerting allosteric control over the rate of ATP hydrolysis, again, Equation (9) predicts that this phenomenological observation could be due to many factors. Our transient state kinetics experiments have suggested that ClpA uses only the NBD2 ATPase sites to catalyze processive translocation when associated with ClpP (Miller et al., 2013; Miller and Lucius, 2014). This observation does not rule out the possibility that NBD1 is still binding to ATP. However, when combined with the predictions from Equation (9) it does suggest that if the system goes from a stoichiometry of binding of 12 to 6 then this would perturb the hexamer concentration. Thus, the reduction in the steady-state ATPase rate could be due to a two-fold reduction in the binding stoichiometry and thereby a reduction in the concentration of free hexamers. Alternatively, if ClpP does stabilize the hexameric form then one would have to conclude that the elevated rate of ATP hydrolysis observed in the absence of ClpP must be due to a significant population of monomers, dimers, and tetramers rapidly hydrolyzing ATP.

The coordination of NBD1 and NBD2 has been, and continues to be, an area of great interest in the field. Variants that abolish ATP binding (Walker A) or hydrolysis (Walker B or Sensor 1) have been used by many groups to investigate the coordination of the 12 ATP binding and hydrolysis sites within the ClpA hexamer, as well as within the ClpB and Hsp104 hexamers. One common strategy is "mutant doping," in which a variant is added to wild type protein in known ratios (Werbeck et al., 2008; Hoskins et al., 2009; del Castillo et al., 2010; DeSantis et al., 2012; Yamasaki et al., 2015). Many conclusions have been drawn regarding sequential, probabilistic, or concerted ATP hydrolysis mechanisms. Although the statistical treatment of the number of mutant protomers contained within a hexamer is valid in principle, it may not hold if the mutation perturbs the assembly equilibrium. Many of these studies suffer from the assumption that the entire population of protein resides in the hexameric state. The most convincing among them are experiments where the signal is only sensitive to the hexameric form. For example, it seems clear that a stochastic model applies to ClpX, since the studies used a linked hexamer (Martin et al., 2005; Cordova et al., 2014). Consequently, the issues surrounding assembly have been removed.
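The statistical distribution invoked in mutant doping experiments is usually a binomial one. A minimal Python sketch is given below, assuming that mutant and wild type protomers mix randomly, that every protomer resides in a hexamer, and that the mutation does not alter assembly; these are exactly the assumptions questioned above. The function name and the mixing ratios are hypothetical.

```python
from math import comb


def mutant_count_distribution(mutant_fraction, ring_size=6):
    """Binomial probability of k mutant protomers in a ring, for k = 0..ring_size."""
    wild_type_fraction = 1.0 - mutant_fraction
    return [
        comb(ring_size, k) * mutant_fraction**k * wild_type_fraction ** (ring_size - k)
        for k in range(ring_size + 1)
    ]


# Hypothetical doping ratios: 10% and 50% mutant protomers.
for fraction in (0.1, 0.5):
    probabilities = mutant_count_distribution(fraction)
    print(fraction, [round(p, 4) for p in probabilities])
```

If the variant assembles more weakly or more tightly than wild type, the composition of the hexamers no longer follows the mixing ratio, and the binomial weights above would not describe the population being assayed.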

# MECHANISMS OF POLYPEPTIDE TRANSLOCATION BY ClpA AND ClpAP

#### ClpA Mechanism in the Absence of ClpP

Horwich and coworkers showed that ClpAP could catalyze global unfolding of an SsrA tagged GFP construct (Weber-Ban et al., 1999). This was done by incorporating the 11 amino acid SsrA tag, which is a known binding sequence for ClpA and ClpX, at the carboxy-terminus of GFP (Levchenko et al., 1997). When the GFP-SsrA construct was presented to ClpA in the presence of ATP, a slight decrease in fluorescence was observed. However, when the construct was presented to ClpAP in the presence of ATP, a near complete loss of fluorescence was observed. This was interpreted to mean that when ClpA unfolded the GFP in the absence of a protease, GFP was allowed to spontaneously refold. However, in the presence of the proteolytic component, GFP was degraded and thus complete loss of fluorescence was observed.

To examine directional translocation catalyzed by ClpA, Horwich and coworkers developed a FRET-based assay (Reid et al., 2001). In this design, a donor fluorophore was placed in the central cavity of ClpP and an acceptor at various positions on model substrates, all containing the SsrA sequence at the carboxy-terminus. If ClpA translocates the polypeptide chain into the ClpP cavity directionally, from the SsrA sequence at the carboxy-terminus to the amino-terminus, then FRET time courses would reveal this. FRET time courses were consistent with processive translocation from the carboxy-terminus to the amino-terminus. The results clearly showed that ClpA drives translocation of a polypeptide chain into the proteolytic chamber of ClpP.

Until recently, the elementary kinetic parameters governing this translocation reaction had not been reported. Moreover, most of the mechanistic investigations available were performed in the presence of ClpP. Thus, the critically important elementary kinetic mechanism for polypeptide translocation catalyzed by ClpA was missing from the field. Determining this mechanism required the development of techniques that would be sensitive to the elementary steps in polypeptide translocation in the absence of proteolytic degradation. Such approaches could then be broadly applied to a variety of enzymes that do not associate with proteases (see examples in the Introduction). This kinetic mechanism would include the elementary rate constants governing the reaction, the kinetic step-size (amino acids translocated between two rate-limiting steps), the processivity (probability that the enzyme will translocate vs. dissociate), and the directionality (C to N vs. N to C).

A single-turnover fluorescence stopped-flow assay was developed to elucidate these kinetic parameters (Rajendar and Lucius, 2010; Lucius et al., 2011). **Figure 3** shows a generalized schematic representation of this rapid mixing assay. Synthetic polypeptide substrates containing the 11 amino acid SsrA binding sequence at the carboxy-terminus and a single cysteine at the amino-terminus were constructed. The sequence of the polypeptide was based on the Titin I27 domain because the long-term goal was to move to full length tandem repeats of I27, as had been done for ClpX (Kenniston et al., 2003, 2005). The cysteine was labeled with fluorescein-5-maleimide. ClpA was bound to the SsrA sequence in the presence of the slowly hydrolysable ATP analog, ATPγS. Upon ClpA binding, fluorescence quenching was observed. Fluorescence quenching has since been observed for binding by both ClpB and Hsp104 to their respective substrates (Li et al., 2015b; Weaver et al., 2017). This sample was then loaded into one syringe of the stopped-flow fluorometer (see **Figure 3**). The other syringe was loaded with a large excess of ATP and unlabeled SsrA peptide to serve as a trap for ClpA, i.e., any free ClpA would rapidly bind to SsrA and not the fluorescently modified polypeptide (Rajendar and Lucius, 2010). The large excess of trap ensures single-turnover conditions with respect to the complex of ClpA bound to fluorescently labeled peptide.

In the single-turnover fluorescence assay, the two solutions are rapidly mixed within 2 ms in a stopped-flow fluorometer and fluorescence is observed as a function of time. Fluorescence was observed to increase with time, indicating that ClpA dissociated from the polypeptide chain. The question is: do the kinetic time courses yield information on translocation before ClpA dissociates? In principle, if ClpA is taking multiple steps before dissociating, then the observed kinetic time courses should reflect the number of steps the enzyme takes before dissociation. Thus, if the length of the peptide is increased, the number of steps the enzyme takes before reaching the end should also increase. That is to say, if the time courses are sensitive to processive translocation, then the time courses should depend upon substrate length.

To test the substrate length dependence of the kinetic time courses, time courses were collected as a function of polypeptide substrate length ranging from 30 to 50 amino acids (Rajendar and Lucius, 2010). Observed was a lag (constant fluorescence) followed by an increase in fluorescence. This lag was observed to increase in duration with increasing substrate length indicating that ClpA remained on the polypeptide for an increasing amount of time with increasing substrate length. This observation is interpreted to indicate that ClpA is taking more steps with each increase in substrate length. Therefore, the single-turnover fluorescence stopped-flow assay is sensitive to processive translocation.

To elucidate the elementary rate constants using transient state kinetics one needs to perturb the system. Variables like temperature, salt concentration and type, pH, etc. can be used for this perturbation. For a molecular motor that couples ATP binding and hydrolysis to repeated rounds of translocation, the simplest perturbation is to vary the ATP concentration. The initial experiments are usually carried out at excess ATP so that it can be assumed that ATP binding is not rate-limiting. As the [ATP] is reduced, the observed rate constant will reflect ATP binding, or a step coupled to ATP binding. Importantly, because the motor-peptide complex is preassembled prior to rapidly mixing with ATP (**Figure 3**), the signal is insensitive to the changing population of ClpA hexamers throughout the ATP range assayed. The kinetic time courses were collected as a function of ATP from ∼125 µM to 5 mM. As the [ATP] was reduced, the observed kinetic rate constant decreased. This is further evidence that the time courses are reporting on translocation, since simple dissociation would not be predicted to be ATP concentration dependent.

The kinetic time courses were subjected to global non-linear least-squares (NLLS) analysis (Lucius et al., 2003, 2011). For ClpA, the enzyme translocated with a repeating rate constant, k<sub>t</sub> = (1.39 ± 0.06) s<sup>−1</sup>, and an overall rate of (19 ± 1) AA s<sup>−1</sup> at saturating ATP with a kinetic step-size of (14 ± 2) AA step<sup>−1</sup>. It is important to note that the kinetic step-size represents the average number of amino acids translocated between two rate-limiting steps and may or may not represent physical stepping. While similar strategies have been successfully used to examine helicase catalyzed DNA unwinding and single strand DNA translocation (Fischer and Lohman, 2004; Fischer et al., 2004; Lucius et al., 2004; Lucius and Lohman, 2004), this was the first step-size reported for a polypeptide translocase (Rajendar and Lucius, 2010; Lucius et al., 2011).
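The origin of the length-dependent lag can be illustrated with a deliberately simplified sequential n-step scheme. The Python sketch below assumes n identical, irreversible steps with rate constant k<sub>t</sub> and no dissociation before the end of the substrate; it is not the full model used in the published NLLS analysis, which includes additional steps and a dissociation rate constant. The function name `fraction_completed` and the chosen time points are hypothetical.

```python
from math import exp, factorial


def fraction_completed(t, n_steps, kt):
    """Fraction of enzymes that have finished n_steps sequential steps by time t.

    This is the cumulative distribution of the waiting time for n identical
    first-order steps, each with rate constant kt.
    """
    x = kt * t
    return 1.0 - exp(-x) * sum(x**j / factorial(j) for j in range(n_steps))


KT = 1.39  # s^-1, repeating rate constant reported for ClpA at saturating ATP

for n in (1, 2, 3, 4):
    row = [fraction_completed(t, n, KT) for t in (0.5, 1.0, 2.0, 4.0)]
    print(n, [round(value, 3) for value in row])
```

In this sketch each additional step delays the rise of the signal, which is the qualitative behavior expected when a longer substrate requires more translocation steps.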

The processivity is quantitatively defined as the rate constant for translocation divided by the sum of the rate constants for translocation and dissociation. For example, consider a translocating enzyme following the mechanism shown in **Figure 4**, where E·P represents the enzyme pre-bound to a peptide of length L. The enzyme can proceed forward with rate constant k<sub>t</sub> or dissociate with rate constant k<sub>d</sub>. I<sub>(L−m)</sub> represents the first intermediate, which has been translocated by some distance m (step-size).

The processivity is the probability given by Equation (10) (Lucius et al., 2003, 2011).

$$P = \frac{k\_t}{k\_t + k\_d} \tag{10}$$

When k<sub>d</sub> = 0, then P = 1 and every enzyme that binds will translocate to the end without dissociation. On the other hand, as k<sub>d</sub> increases, P approaches zero, which would describe an enzyme with low processivity (an enzyme that has a higher propensity to dissociate than to reach the end of the polypeptide chain). The processivity described as a probability, P, can be related to processivity expressed in terms of the average number of amino acids translocated per binding event, N, given by Equation (11) (for a complete derivation of Equation (11) see Appendix B of Lucius et al., 2003).

$$P = e^{-(m/N)}\tag{11}$$

It is tempting to assume that a hexameric ring motor that encircles the linear lattice on which it translocates would be highly processive. However, this is not always true. For example, the hexameric ring helicase, DnaB exhibits a processivity of P ∼0.89 (Galletto et al., 2004). The proposed model is that the ring opens and substrate can "escape" thereby resulting in a dissociation event. However, this primary replicative helicase likely exhibits much higher processivity in the context of the fast moving replication fork, likely due to interactions with other proteins. With respect to ClpB and Hsp104, both enzymes have been proposed to be in "rapid subunit exchange" (Werbeck et al., 2008; DeSantis et al., 2012). Thus, loss of a subunit in a hexameric ring could also result in a dissociation event. Moreover, like DnaB, partner proteins are likely to influence the processivity. Regardless of the mechanism, there is a dearth of quantitative measurements of processivity for polypeptide translocases.

(*m*) until the peptide is fully translocated.

In the initial examination of ClpA catalyzed polypeptide translocation with synthetic peptides, a measurable dissociation rate constant, k<sub>d</sub>, was not detected above 500 µM ATP. However, at 300 µM ATP and below, a measurable dissociation rate constant was observed, allowing for the calculation of processivity. The processivity was determined to be P = (0.876 ± 0.006) at low [ATP]. Using Equation (11), a processivity of ∼100 amino acids per binding event is predicted, which is 2-fold larger than the longest polypeptide used in this study. Thus, this is a preliminary estimate of the processivity at limiting [ATP], and methods allowing the examination of longer polypeptides are needed to rigorously test the processivity for this and related enzymes. Qualitatively, the findings support the idea that ClpA is highly processive, consistent with the report by Maurizi and coworkers (Thompson et al., 1994).
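Equations (10) and (11) can be checked numerically with a short Python sketch. The processivity P = 0.876 is the value quoted above. Using the kinetic step-size of 14 AA step<sup>−1</sup>, which was reported at saturating ATP, for the low [ATP] condition is an assumption made here for illustration, as are the function names.

```python
from math import exp, log


def processivity(kt, kd):
    """Equation (10): probability of stepping forward rather than dissociating."""
    return kt / (kt + kd)


def residues_per_binding_event(p, step_size):
    """Equation (11) solved for N: N = -m / ln(P)."""
    return -step_size / log(p)


P_LOW_ATP = 0.876   # processivity reported at low [ATP]
STEP_SIZE = 14.0    # AA per step; assumed to apply at low [ATP]

n_residues = residues_per_binding_event(P_LOW_ATP, STEP_SIZE)
print(f"N = {n_residues:.0f} amino acids per binding event")

# Round trip through Equation (11) recovers the input probability.
print(f"P recovered = {exp(-STEP_SIZE / n_residues):.3f}")
```

The result, on the order of 100 amino acids, agrees with the estimate stated in the text. Because N depends on ln(P), small uncertainties in P near 1 translate into large uncertainties in N.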

#### Effect of ClpP on the Translocation Mechanism Catalyzed by ClpA

With a method in hand that is sensitive to polypeptide translocation in the absence of proteolytic degradation the question that could be addressed is, does ClpAP translocate using the same mechanism as ClpA alone? A qualitative assessment of stopped-flow time courses had been reported previously that concluded ClpAP translocated faster than ClpA alone but rate constants were not reported (Kolygo et al., 2009).

The single-turnover stopped-flow method described above was employed to examine polypeptide translocation catalyzed by ClpAP. However, upon building a complex of polypeptide bound by ClpAP, a number of questions emerge. Hexameric ClpA can bind to either apical surface of ClpP forming a 1:1 complex, or to both apical surfaces of ClpP forming a 2:1 complex (see **Figure 5**). Should the experimental design conditions examine 1:1 or 2:1 hexameric ClpA to tetradecameric ClpP? Similarly, if the 2:1 complex is examined, should both sides of the enzyme be bound with peptide?

Based on activity measurements, Maurizi and coworkers reported an affinity for ClpA hexamer binding to ClpP tetradecamer of ∼4 nM (Maurizi et al., 1998). However, the fact that ClpA resides in a distribution of oligomers was not taken into account. ClpA resides in a distribution of monomers, dimers, and tetramers in the absence of nucleotide (Veronese et al., 2009; Veronese and Lucius, 2010). Moreover, even at nucleotide concentrations above 1 mM there remains a distribution of oligomeric states (Veronese et al., 2011; Li and Lucius, 2013).

Thus, it cannot be assumed that all of the ClpA present in solution is in the hexameric state.

For a macromolecule with two binding sites, one can be certain to ligate only one of the binding sites if the two-site macromolecule is maintained in large excess over the ligand. Thus, whether 1:1, 2:1 or a mixture of the ClpAP complexes are present in solution, by maintaining the complex in excess over the polypeptide only one peptide can be bound to any given ClpAP complex in the ensemble.

To build a peptide pre-bound complex, 86 nM tetradecameric ClpP and 1 µM monomer of ClpA were used in the presence of 150 µM ATPγS. Note that, unlike ClpA, ClpP forms stable tetradecamers (Maurizi et al., 1998; E. Duran, unpublished data). However, the question is: how much hexameric ClpA is present at 1 µM monomer? To address this question, sedimentation velocity experiments measured the concentration of hexameric ClpA in the presence of 150 µM ATPγS at 1 µM total ClpA monomer concentration. Under these conditions, the hexameric concentration was determined to be 130 nM. It is important to note that if the 1 µM total monomer concentration is divided by six, i.e., if one assumes only hexamers are in solution, then one would predict 170 nM hexamers, an overestimate by 30% of the hexameric ClpA population. Under these conditions, a mixture of 1:1 and 2:1 complexes is predicted. With that in mind, binding the complex to 20 nM peptide maintains ClpAP (whether 1:1 or 2:1 complex) in large excess over the peptide. Keeping the ClpAP complex in excess over the peptide concentration ensures that peptide is only bound to one ClpA hexamer in a given ClpAP molecule. That is to say, it would be thermodynamically unfavorable to have a doubly peptide ligated 2:1 ClpAP complex.
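The arithmetic behind these estimates can be checked in a few lines. The following Python sketch uses only the concentrations quoted above. The variable names are illustrative, and the hexamer-to-ClpP ratio assumes, purely for the sake of the estimate, that essentially all ClpA hexamers are bound to ClpP.

```python
# Concentrations quoted in the text, all in nM.
clpa_monomer_total = 1000.0     # 1 µM total ClpA monomer
clpa_hexamer_measured = 130.0   # hexamer concentration from sedimentation velocity
clpp_tetradecamer = 86.0        # ClpP tetradecamer
peptide = 20.0                  # polypeptide substrate

# Naive estimate: assume every ClpA monomer resides in a hexamer.
clpa_hexamer_naive = clpa_monomer_total / 6.0

# Fractional overestimate of the naive value relative to the measurement.
overestimate = (clpa_hexamer_naive - clpa_hexamer_measured) / clpa_hexamer_measured

# Hexamers available per ClpP tetradecamer (assumption: all hexamers bind ClpP).
# A value between 1 and 2 is consistent with a mixture of 1:1 and 2:1 complexes.
hexamers_per_clpp = clpa_hexamer_measured / clpp_tetradecamer

# Excess of ClpP-containing complexes over peptide.
complex_excess = clpp_tetradecamer / peptide

print(f"naive hexamer estimate: {clpa_hexamer_naive:.0f} nM")
print(f"overestimate vs. measured: {overestimate:.0%}")
print(f"ClpA hexamers per ClpP tetradecamer: {hexamers_per_clpp:.2f}")
print(f"complex excess over peptide: {complex_excess:.1f}-fold")
```

The naive value rounds to the 170 nM quoted above, and the overestimate comes out just under 30%.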

Subjecting ClpAP to the same analysis as performed on ClpA alone revealed that, indeed, ClpAP does translocate with a faster overall rate of ∼35 AA s<sup>−1</sup> (Miller et al., 2013). This is ∼1.5 times faster than the ∼20 AA s<sup>−1</sup> observed for ClpA alone. The overall rate is the product of the step size and the elementary rate constant governing that step. One of the strengths of the transient state kinetic approaches used is that it is sensitive to these two additional parameters. Interestingly, the kinetic step size for ClpAP was observed to be ∼5 AA step<sup>−1</sup> in stark contrast to the ∼14 AA step<sup>−1</sup> measured for ClpA alone (Rajendar and Lucius, 2010). Further, the rate constant governing translocation was found to be ∼7 s<sup>−1</sup>, which is ∼5-fold faster than the ∼1.4 s<sup>−1</sup> measured for ClpA (Miller et al., 2013).
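As a quick consistency check on the statement that the overall rate is the product of the kinetic step size and the elementary rate constant, the rounded values quoted above can be multiplied directly. This is a minimal sketch: the function name is illustrative, and the inputs are the approximate values from the text rather than fitted parameters with uncertainties.

```python
def overall_rate(step_size_aa, rate_constant_per_s):
    """Overall translocation rate (AA/s) = kinetic step size (AA/step) x rate constant (steps/s)."""
    return step_size_aa * rate_constant_per_s

# Approximate values quoted in the text.
clpa_rate = overall_rate(14, 1.4)   # ClpA alone
clpap_rate = overall_rate(5, 7)     # ClpA in complex with ClpP

# Ratio of the elementary rate constants (ClpAP relative to ClpA).
rate_constant_ratio = 7 / 1.4

print(f"ClpA:  {clpa_rate:.1f} AA/s")
print(f"ClpAP: {clpap_rate:.1f} AA/s")
print(f"rate constant ratio: {rate_constant_ratio:.1f}-fold")
```

The products reproduce the ∼20 and ∼35 AA s<sup>−1</sup> overall rates. ClpAP takes smaller steps, but its rate constant is about five times larger, which gives the faster overall rate.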

As stated above, the kinetic step-size does not necessarily represent physical movement. However, a recent single-molecule examination of ClpAP translocation reports steps of ∼1 nm (Olivares et al., 2014), which was reported to be consistent with the 5 AA step<sup>−1</sup> reported from the single turnover experiments described above (Miller et al., 2013). A single molecule experiment that would be sensitive to mechanical movement has not been performed on ClpA alone. Such an experiment would either confirm or refute the measured ∼14 AA step<sup>−1</sup>. Additional testing is necessary to determine whether or not this kinetic step-size represents mechanical movement.

All in all, it is clear that ClpP exerts an allosteric influence on ClpA catalyzed polypeptide translocation. Thus, ClpA and ClpAP should be considered to be two different enzymes that translocate with two different mechanisms. Moreover, questions remain regarding the activities of the 2:1 and 1:1 complexes.

The Walker A and Walker B motifs that form the ATP binding pocket are separated by a loop that extends into the axial channel of ClpA (Guo et al., 2002b). It has been proposed that the loop cycles up and down as the ATP binding site cycles through bound ATP to bound ADP + P<sub>i</sub> and then release of ADP and P<sub>i</sub>. This up and down motion is thought to drive translocation. Hinnerwisch and coworkers showed through crosslinking studies that polypeptide substrate crosslinked with the NBD2 loop in the central channel of ClpA (Hinnerwisch et al., 2005). From these observations, Hinnerwisch and coworkers proposed that the NBD2 loop was responsible for mechanical pulling on the substrate polypeptide being translocated. They proposed a cycle of translocation to consist of ATP binding at NBD2 with the NBD2 loop in the up conformation, followed by ATP hydrolysis that drives movement of the NBD2 loop to the down conformation and concurrent movement of the polypeptide substrate that is bound to the NBD2 loop. Consistently, synchrotron footprinting data showed that the NBD2 loop proceeds through a nucleotide-dependent conformational change (Bohon et al., 2008).

From examination of the ATP concentration dependence of the kinetic step-size and rate constant for ClpAP, the observed step immediately follows ATP binding (Miller et al., 2013). Coupling this observation with the Hinnerwisch model, the step detected in the single-turnover experiments could be either ATP hydrolysis or a conformational change; a conformational change that may represent movement of the NBD2 loop. Since a single repeating step was detected in each cycle of translocation, loop movement may represent movement by ∼5 amino acids.

If the measured kinetic step-size for ClpAP truly represents mechanical movement by ∼5 amino acids, then why does ClpA alone exhibit a different kinetic step-size of ∼14 AA step<sup>−1</sup>? A potential answer to this question lies in the dependence of the overall translocation rate on [ATP] for ClpA and ClpAP. The translocation rate constant for ClpA alone exhibited a sigmoidal dependence on ATP. The isotherm could not be described by a simple rectangular hyperbola. Rather, it required analysis using a Hill model with a Hill coefficient of ∼2.5. In contrast, the translocation rate constant for ClpAP did not exhibit a sigmoidal dependence. Since ClpA contains two ATP binding sites per monomer and the single-turnover kinetic time courses are sensitive only to bound hexamer, the observation of a sigmoidal dependence suggests that there is cooperativity between multiple ATP binding sites that are involved in polypeptide translocation. On the other hand, since ClpAP did not exhibit any cooperativity, this indicates that the presence of ClpP relieves the cooperative interactions.
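The difference between a rectangular hyperbola and a Hill isotherm can be made concrete with a short numerical comparison. In the Python sketch below, only the Hill coefficient of 2.5 comes from the text. The plateau rate constant and the ATP midpoint are hypothetical placeholder values chosen for illustration, not fitted parameters from the cited studies.

```python
def hyperbola(atp, k_max, k_half):
    """Rectangular hyperbola: non-cooperative dependence on [ATP]."""
    return k_max * atp / (k_half + atp)

def hill(atp, k_max, k_half, n):
    """Hill model: n > 1 produces a sigmoidal (cooperative) dependence on [ATP]."""
    return k_max * atp**n / (k_half**n + atp**n)

K_MAX = 1.4      # s^-1, hypothetical plateau value for illustration
K_HALF = 300.0   # µM, hypothetical midpoint for illustration
N_HILL = 2.5     # Hill coefficient quoted in the text for ClpA alone

# Compare the two models below, at, and above the midpoint.
for atp in (30.0, 100.0, 300.0, 1000.0, 3000.0):
    k_hyp = hyperbola(atp, K_MAX, K_HALF)
    k_hill = hill(atp, K_MAX, K_HALF, N_HILL)
    print(f"[ATP] = {atp:7.1f} µM  hyperbola = {k_hyp:.3f}  Hill = {k_hill:.3f}")
```

Both curves pass through half of the plateau at the midpoint. The Hill curve lies below the hyperbola at low [ATP] and above it at high [ATP], which is the sigmoidal signature described above for ClpA.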

With these observations in mind, **Figure 6** illustrates a working model for both ClpA and ClpAP polypeptide translocation, incorporating known structural information and various biochemical/biophysical studies. **Figure 6A** illustrates ClpA, in the absence of ClpP, with both the NBD1 and NBD2 loops in the up conformation and ATP bound to both domains. The polypeptide substrate is shown in black and is making contact with both the NBD1 and NBD2 loops. Crosslinking studies have shown that contacts between polypeptide substrate and ClpA were only observed with the NBD2 loop, but various single site mutations throughout the NBD1 loop abolished translocation activity (Hinnerwisch et al., 2005). Moreover, recent work indicates that both ATPase sites are involved in translocation catalyzed by ClpA in the absence of ClpP (Rajendar and Lucius, 2010). These two observations implicate the NBD1 loop in translocation. The next step would be for NBD1 to hydrolyze ATP and cause the NBD1 loop to move down and translocate (push) the substrate by up to 14 amino acids creating a polypeptide loop inside the axial channel of ClpA. The loop in the substrate can be accommodated in ClpA since it has been shown that ClpA forms a cavity between the NBD1 and NBD2 loops (Beuron et al., 1998; Guo et al., 2002a). NBD1 would contain ADP and P<sub>i</sub> in the ATP binding site and therefore the NBD1 loop would have a reduced affinity for the polypeptide, which would allow for rebinding by another NBD1 loop loaded with ATP in a neighboring subunit in the hexamer (Farbman et al., 2007; Veronese et al., 2011). The NBD2 loop would cycle through multiple rounds of ATP hydrolysis coupled to translocation of the substrate by 2–5 amino acids per cycle with a rate constant of ∼4 s<sup>−1</sup>. This will occur several times thereby shortening the loop inside the cavity of ClpA before NBD1 translocates another ∼14 amino acids of the polypeptide into the cavity with a rate constant of 1.4 s<sup>−1</sup>.

**Figure 6B** illustrates the working model for how ClpA translocates when associated with ClpP. Since the ATP concentration dependence of the rate of ClpAP catalyzed polypeptide translocation suggests reduced cooperativity between ATP binding sites, it is hypothesized that NBD2 drives translocation in the ClpAP complex. Repeating cycles of ATP binding and hydrolysis could occur at NBD1, but they do not limit the observed translocation. Therefore, this model predicts repeating cycles of ATP binding and hydrolysis at NBD2 would lead to translocation of the substrate by distances of 2–5 AA step<sup>−1</sup>.

The working model predicts that in the absence of ClpP, NBD1 should hydrolyze ATP with a rate constant of (1.39 ± 0.06) s<sup>−1</sup> and NBD2 should hydrolyze ATP with a rate constant of (7.9 ± 0.2) s<sup>−1</sup> in the presence of polypeptide substrate. Kress et al. examined the steady state rate of ATP hydrolysis catalyzed by ClpA both in the presence and absence of ClpP (Kress et al., 2009). Further, they made two variants of ClpA that are deficient in ATP hydrolysis at either NBD1 or NBD2, which allow for the examination of ATP hydrolysis at each domain in the absence of hydrolysis at the other domain, and in the presence or absence of ClpP and SsrA substrate. Interestingly, in the absence of ClpP and the presence of GFP-SsrA, NBD1 hydrolyzes ATP with a rate constant of (0.8 ± 0.2) s<sup>−1</sup>, which is comparable to the translocation rate constant of (1.39 ± 0.06) s<sup>−1</sup> determined using the single-turnover stopped-flow experiments. Similarly, in the presence of ClpP and GFP-SsrA, NBD2 hydrolyzes ATP with a rate constant of (6.3 ± 0.5) s<sup>−1</sup>, which is similar to the estimate of (7.9 ± 0.2) s<sup>−1</sup> (Miller et al., 2013).

# MECHANISM OF TRANSLOCATION BY ClpB/Hsp104

As stated above, ClpB/Hsp104 shares many structural characteristics with ClpA (see **Figure 1**) and therefore has been hypothesized to share a similar translocation mechanism. One important difference is the absence of an IGF/L loop in ClpB/Hsp104, necessary in ClpA for binding the protease ClpP. This structural difference intimates an important functional difference; ClpB/Hsp104 does not partner with any known protease (Woo et al., 1992).

A disaggregase such as ClpB/Hsp104 does not covalently modify its protein substrate. Disaggregation has been measured by monitoring changes in turbidity, solubility, and various staining techniques in vitro, thermotolerance development studies in vivo, and enzyme reactivation in vivo or in vitro (Parsell et al., 1991, 1994b; Glover and Lindquist, 1998; Goloubinoff et al., 1999; Zolkiewski, 1999; Mogk et al., 2003; Weibezahn et al., 2003; Schlee et al., 2004; Shorter and Lindquist, 2004; Schaupp et al., 2007; del Castillo et al., 2010; Sielaff and Tsai, 2010). These macroscopic observations, while informative, do not report on the molecular level events involved in the mechanism. How can the molecular events in the translocation or disaggregation mechanism be studied in the absence of a covalent modification to the protein substrate? Early investigations of the ClpB/Hsp104 disaggregation mechanism addressed this challenge by building upon the structural similarities between ClpB/Hsp104 and E. coli ClpA. As discussed above, ClpA processively translocates protein substrates through its axial channel and into the protease, ClpP.

The similarities in sequence, tertiary structure, and quaternary structure led the Bukau group to engineer the IGF/L loop onto the C-terminal surface of ClpB and Hsp104. This loop allows a non-native interaction with ClpP, resulting in degradation of the substrate, a measurable covalent modification (Weibezahn et al., 2004; Tessarz et al., 2008). The rationale was that if they could "force" ClpB (Hsp104) to interact with ClpP and they observed proteolytic degradation, then this must mean that ClpB, like ClpA, was translocating a substrate through the axial channel and into ClpP for proteolytic degradation.

In these studies, the Bukau group showed that the non-native BAP (ClpB-ClpA-P loop)-ClpP or HAP (Hsp104-ClpA-P loop)-ClpP complex was indeed able to degrade substrate proteins. This observation was interpreted as evidence that BAP and HAP, and therefore ClpB and Hsp104, processively translocate entire proteins through the axial channel and into ClpP, just as is done by the processive translocase ClpA (Weibezahn et al., 2004; Tessarz et al., 2008). Notably, additional studies of BAP-ClpP in which only portions of a substrate were unfolded led the Bukau group to conclude, "partial threading of the unfolded substrate moiety through the central channel of ClpB is sufficient for efficient protein disaggregation in a physiologically relevant context" and that "partially threaded polypeptide chains are released from ClpB to be refolded" (Haslberger et al., 2008). Since these publications, however, many researchers in the field have often interpreted or summarized the Bukau results with less nuance, carrying forward only the "complete threading" model of polypeptide translocation.

The current prevailing hypothesis in the field is that the BAP-ClpP and HAP-ClpP findings, together with the structural similarities to ClpA, are evidence of complete threading or processive translocation by ClpB and Hsp104. The dominant mechanistic model is the translocation of an entire full-length protein pulled out of an aggregate through the axial channel of the disaggregating motor. The exclusive portrayal of this complete threading/processive translocation mechanism for these disaggregases has been schematized throughout the literature (Miot et al., 2011; Doyle et al., 2013). Other primary research has also been interpreted as consistent with the complete threading model based largely on the BAP/HAP–ClpP results (Schaupp et al., 2007; Nakazaki and Watanabe, 2014). It should be noted, however, that some researchers in the field do point out the possibility of both complete and partial threading mechanisms (Aguado et al., 2015).

Another important challenge to the findings using BAP and HAP with ClpP is that recent work has shown that BAP-ClpP degrades α-casein in both the absence and the presence of ATP (Li et al., 2015b). Thus, the degradation observed in this experimental design does not report strictly on the ATP-dependent translocation mechanism. Nakazaki and Watanabe's findings from their study of various mutations of TBAP-ClpP were interpreted as passive threading, independent of ATP hydrolysis (Nakazaki and Watanabe, 2014). However, these results could alternatively be understood to show that the TBAP-ClpP construct does not report exclusively on ATP-dependent translocation (threading) since they found "no correlation between ATPase activities and degradation rates" (Nakazaki and Watanabe, 2014).

A complementary approach to the BAP-ClpP degradation experiments, one in which there is no forced interaction with a protease, is needed. The stopped-flow fluorometer experimental design, described above (**Figure 3**), developed for the study of ClpA in the absence of ClpP is one such complementary approach (Rajendar and Lucius, 2010). Using this design, Li et al. demonstrated that ClpB is a non-processive translocase, taking only one or two kinetic steps before releasing the polypeptide substrate (Li et al., 2015b). This finding is at odds with the prevailing model of complete threading, by which one polypeptide chain is extracted from an aggregate. However, the Li et al. conclusion is in good agreement with previous results of observed partial threading (Haslberger et al., 2008). Additional studies are needed to expand this work into Hsp104.

Though Hsp104 and ClpB are both structurally and functionally similar, important differences have been observed. For example, both Hsp104 and ClpB can resolve disordered aggregates; however, only Hsp104, not ClpB, can also resolve more structured amyloid aggregates (DeSantis et al., 2012). Hsp104 also has an additional function in prion curing not observed for ClpB (Shorter and Lindquist, 2004). What mechanistic differences give rise to these observations?

One possible contribution to the differences between the disaggregases is the differing roles of the two NBDs. The interplay between the NBDs within a hexamer is complex and cooperative. Still, some distinctions between NBD1 and NBD2 have been drawn. Notably, nucleotide binding at NBD1 is necessary for stabilization of ClpB hexamers (Kim K. I. et al., 2000; Watanabe et al., 2002; Mogk et al., 2003; del Castillo et al., 2010). This role of NBD1 in oligomerization is conserved between ClpB and ClpA. Surprisingly, in Hsp104, nucleotide binding in NBD2 is required for stabilization of hexamers (Parsell et al., 1994a; Schirmer et al., 1998).

In both ClpB and Hsp104, like in ClpA, the tyrosines in the pore loops of both NBDs are important for substrate processing (Schlieker et al., 2004; Weibezahn et al., 2004; Lee et al., 2007; Tessarz et al., 2008; Yokom et al., 2016; Gates et al., 2017). As the ATP hydrolysis cycle is carried out in either NBD, the pore loop is thought to move through space due to conformational changes induced by the nucleotide ligation state. The relatively large, planar surface of the tyrosine residue is thought to interact with the polypeptide substrate, pushing or pulling the polypeptide through the central channel. It's possible that differences in nucleotide binding/hydrolysis induced pore loop conformational changes account for the functional differences that exist between ClpB and Hsp104 catalyzed protein disaggregation. Experimental designs that report on the molecular level events involved in ClpB/Hsp104 polypeptide substrate processing, in particular those sensitive to the coordination between pore loop movement and nucleotide ligation state during disaggregation, will be key in testing this hypothesis.

#### Effect of DnaK/Hsp70 on ClpB/Hsp104 Mechanism

ClpB and Hsp104 were initially observed to disaggregate clients only in the presence of co-chaperones. These disaggregating motors are far more potent in collaboration with co-chaperones, although conditions have since been found in which ClpB and Hsp104 have innate disaggregation abilities. The co-chaperone system for E. coli ClpB is made up of DnaK, DnaJ, and the nucleotide exchange factor GrpE (termed the KJE system). Yeast Hsp104 collaborates with the co-chaperones Hsp70 (analogous to DnaK) and Hsp40 (analogous to DnaJ). Like ClpB/Hsp104, DnaK/Hsp70 is an ATPase and a disaggregase that can function independently of co-chaperones. The full systems, ClpB/KJE and Hsp104/70/40, have ATPase and disaggregase activity greater than the sum of the components' activities. There are three proposed possibilities that could explain this enhanced activity: (1) DnaK modifies the aggregate making a better binding site for ClpB, (2) DnaK accepts substrate from ClpB after the substrate has been completely translocated, or (3) the ClpB-DnaK complex has greatly amplified disaggregation activity relative to ClpB alone, possibly through a fundamentally different mechanism.

Early attempts to identify which component of the system acted upon an aggregate or client first resulted in divergent findings. The Liberek group identified DnaK as the first actor. They found that DnaK, with DnaJ and ATP, remodeled aggregates to facilitate ClpB-catalyzed disaggregation. Neither a transient ternary complex with ClpB nor additional roles for DnaK downstream of ClpB's action were ruled out (Zietkiewicz et al., 2004, 2006). On the other hand, early work from the Bukau group concluded that ClpB acted first. Specifically, ClpB was observed to expose a substrate's hydrophobic regions, which could then be recognized by the KJE system (Goloubinoff et al., 1999). The development of a ClpB trap mutant (double Walker B variant, able to bind but not hydrolyze ATP) also revealed that ClpBtrap outcompeted DnaK for binding to a model substrate and inhibited DnaK activity (Weibezahn et al., 2003).

Over time, the idea of a ClpB-DnaK (Hsp104-Hsp70) complex has come into favor. One compelling observation in support of this idea is that the activity of the co-chaperones is species specific. ClpB works with DnaK but not Hsp70. Hsp104 works with Hsp70 but not DnaK. This suggests a direct interaction between the chaperones. Furthermore, both the Wickner and Tsai groups engineered sets of chimeras in which domains from ClpB were replaced by the analogous domain from Hsp104 and vice versa. Both groups found that the M domain dictates which species formed a productive co-chaperone partnership. For example, Hsp104 with the M domain from ClpB partnered effectively with the KJE system, not the Hsp70 system. This finding is consistent with the identification of the M domain as the binding site for DnaK (Miot et al., 2011; Rosenzweig et al., 2013; Doyle et al., 2015).

Binding affinities of 17 and 25 µM have been reported for T. thermophilus ClpB and DnaK (Schlee et al., 2004; Rosenzweig et al., 2013). For E. coli ClpB and DnaK, the K<sub>d</sub> has been estimated in the range of 7–30 µM (Kedzierska et al., 2005). Notably, while the ClpB-DnaK complex has been observed by co-elution assays (Schlee et al., 2004; Barnett et al., 2005) and NMR (Rosenzweig et al., 2013), the ternary complex of ClpB-DnaK-client has not been observed. Furthermore, despite the K<sub>d</sub> measurements and estimates in the range of ∼20 µM, biochemical assays are often carried out with nanomolar to low micromolar concentrations of DnaK, conditions in which a significant population of the ClpB-DnaK complex is not expected (Weibezahn et al., 2004; Haslberger et al., 2008; DeSantis et al., 2012; Seyffer et al., 2012; Rosenzweig et al., 2013; Aguado et al., 2015; Doyle et al., 2015). Nevertheless, in these cases observations are attributed to the interplay between ClpB and DnaK. Though the existence of a DnaK-ClpB or Hsp70-Hsp104 complex has become widely accepted, the role of co-chaperones upstream and/or downstream of that complex remains under investigation. The convergence of evidence suggests that DnaK acts on the aggregate first, possibly targeting the client to ClpB, and then DnaK binds ClpB unleashing the disaggregating power of ClpB (Weibezahn et al., 2004; Sielaff and Tsai, 2010; Miot et al., 2011; Seyffer et al., 2012). DnaK may also have additional roles in the proper refolding of the client after release from ClpB.
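To illustrate why little ClpB-DnaK complex is expected at nanomolar to low micromolar DnaK, the fraction of ClpB bound can be estimated from a simple binding isotherm. The sketch below is an illustration only. It assumes 1:1 binding at a single class of sites and DnaK in excess over ClpB, so that free DnaK is approximately equal to total DnaK, and it takes K<sub>d</sub> ≈ 20 µM from the range quoted above. The function name and the chosen DnaK concentrations are hypothetical.

```python
def fraction_bound(dnak_uM, kd_uM):
    """Fraction of ClpB sites occupied by DnaK.

    Assumes 1:1 binding and free DnaK approximately equal to total DnaK.
    """
    return dnak_uM / (kd_uM + dnak_uM)

KD_UM = 20.0  # µM, approximate K_d from the range quoted in the text

# Illustrative DnaK concentrations spanning nanomolar to K_d-level.
for dnak in (0.1, 1.0, 5.0, 20.0):
    f = fraction_bound(dnak, KD_UM)
    print(f"[DnaK] = {dnak:5.1f} µM  fraction of ClpB bound = {f:.3f}")
```

Under these assumptions, less than 5% of ClpB is bound at 1 µM DnaK, and half-saturation is reached only when DnaK is at the K<sub>d</sub>.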

# CONCLUSIONS

The reviewed studies reveal important considerations for design and implementation of the experiments needed to address outstanding questions about ClpA and ClpB/Hsp104 catalyzed protein translocation, degradation, and disaggregation mechanisms, respectively. One major aspect of assay design is the ability to predict the population of degradation/disaggregation active complex present under the chosen experimental conditions. As work on ClpB has revealed, these proteins persist in a distribution of oligomers even at high nucleotide concentrations (**Figure 2**). Therefore, dividing the monomeric protein concentration by six will yield overestimates for the hexameric population present and available to interact with partner proteins and substrates in solution. Instead, quantification of the active hexamer population in a given assay will require a thermodynamically rigorous characterization of the energetics governing nucleotide-linked self-assembly. Although this work has been done for ClpB, the mechanisms of ClpA and Hsp104 ligand linked self-assembly remain to be examined.

A related consideration in assay design is the effect of mutations on AAA+ motor self-assembly. Because the propensity of a protein to oligomerize is in part driven by its primary sequence, mutations of the sequence will have effects on its self-assembly. If unaccounted for, changes in assay readout that result from mutation-induced up- or downregulation of the hexamer population could be misinterpreted as up- or downregulation of "activity" in ATPase, reactivation, or other assays. Thus, when designing experiments for AAA+ motors and their corresponding variants, it is important to know whether the signal being monitored reports on events that could be controlled by changes in the assembly state. Interpretations of those results should be tempered by possible contributions from variability in the assembly state.

Single turnover translocation experiments have been designed to yield information about the molecular level events governing AAA+ motor activity without rigorous quantification of the self-assembly mechanism (Rajendar and Lucius, 2010; Li et al., 2015b). However, this was possible, in part, because only hexamers are bound to the polypeptide substrate. If smaller oligomers contributed to the translocation signal then measures would have to be taken to account for this. For example, as soon as ClpP is introduced to ClpA then one has to start asking how the distribution of 1:1 and 2:1 hexameric ClpA to tetradecameric ClpP influences the signal. Similar techniques are being adopted to investigate the molecular level events governing the mechanism of ClpB/Hsp104 catalyzed disaggregation in the absence and presence of partner co-chaperones. As work on ClpA and ClpAP revealed, ClpP induces a major change in the mechanism of ClpA catalyzed polypeptide translocation. It's reasonable, then, to expect cochaperones like DnaK/Hsp70 to similarly affect the disaggregation mechanism of ClpB/Hsp104. Implementation of these transient state kinetic techniques will prove powerful in the deconvolution of cochaperone contributions to the disaggregation activities of ClpB/Hsp104 and functional differences between ClpB and Hsp104.

By definition, motor proteins use an energy source to perform mechanical work. ClpA and ClpB/Hsp104 use the energy from ATP binding/hydrolysis to perform this mechanical work. For any translocase there is interest in how far the translocase moves on its lattice, how much energy is required to make this movement, and how much force is exerted. For ClpA we reported the first kinetic step-size for any AAA+ protein translocase to be ∼14 amino acids per step (Rajendar and Lucius, 2010). Similarly, we showed that ClpAP translocated with a reduced kinetic step-size of ∼5 amino acids per step (Miller et al., 2013). Consistently, a single-molecule optical tweezer measurement reported a step-size of ∼5 amino acids per step for ClpAP (Olivares et al., 2014). Similarly, single-molecule optical tweezer experiments showed that ClpXP translocated in 5–8 amino acid steps (Aubin-Tam et al., 2011; Maillard et al., 2011). In many cases, single-molecule and single turnover kinetics experiments can get around the limitations on the interpretation imposed by macromolecular assembly. Thus, going forward, the combination of single-molecule and transient state kinetic experiments is going to be essential for addressing detailed mechanistic questions on AAA+ motors.

# AUTHOR CONTRIBUTIONS

ED, CW, and AL contributed equally to this work. ED and CW are co-first authors.

# FUNDING

This work was supported by National Science Foundation Grant MCB-1412624 to AL.

# REFERENCES


molecular motor. Acta Crystallogr. D Biol. Crystallogr. 70(Pt 2), 582–595. doi: 10.1107/S1399004713030629


Zolkiewski, M., Kessel, M., Ginsburg, A., and Maurizi, M. R. (1999). Nucleotide-dependent oligomerization of ClpB from Escherichia coli. Protein Sci. 8, 1899–1903. doi: 10.1110/ps.8.9.1899

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Duran, Weaver and Lucius. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mutant Analysis Reveals Allosteric Regulation of ClpB Disaggregase

#### Kamila B. Franke, Bernd Bukau\* and Axel Mogk\*

*Center for Molecular Biology of the Heidelberg University, German Cancer Research Center, Heidelberg, Germany*

The hexameric AAA+ disaggregases of *E. coli* and *S. cerevisiae*, ClpB and Hsp104, cooperate with the Hsp70 chaperone system in the solubilization of aggregated proteins. Aggregate solubilization relies on a substrate threading activity of ClpB/Hsp104 fueled by ATP hydrolysis in both ATPase rings (AAA-1, AAA-2). ClpB/Hsp104 ATPase activity is controlled by the M-domains, which associate with the AAA-1 ring to downregulate ATP hydrolysis. Keeping M-domains displaced from the AAA-1 ring by association with Hsp70 increases ATPase activity due to enhanced communication between protomers. This communication involves conserved arginine fingers. The control of ClpB/Hsp104 activity is crucial, as hyperactive mutants with permanently dissociated M-domains exhibit cellular toxicity. Here, we analyzed AAA-1 inter-ring communication in relation to the M-domain mediated ATPase regulation, by subjecting a conserved residue of the AAA-1 domain subunit interface of ClpB (A328) to mutational analysis. While all A328X mutants have reduced disaggregation activities, their ATPase activities differed strongly. ClpB-A328I/L mutants have reduced ATPase activity and, when combined with the hyperactive ClpB-K476C M-domain mutation, suppress cellular toxicity. This underlines that ClpB ATPase activation by M-domain dissociation relies on increased subunit communication. The ClpB-A328V mutant, in contrast, has very high ATPase activity and exhibits cellular toxicity on its own, qualifying it as a novel hyperactive ClpB mutant. ClpB-A328V hyperactivity is, however, different from that of M-domain mutants, as M-domains stay associated with the AAA-1 ring. The high ATPase activity of ClpB-A328V primarily relies on the AAA-2 ring and correlates with distinct conformational changes in the AAA-2 catalytic site. These findings characterize the subunit interface residue A328 as a crucial regulatory element to control ATP hydrolysis in both AAA rings.

Keywords: AAA+ protein, ClpB, Hsp104, protein disaggregation, arginine finger

# INTRODUCTION

AAA+ proteins constitute a protein superfamily sharing the ability to convert the chemical energy derived from ATP hydrolysis into mechanical work. AAA+ proteins share the AAA domain, which is defined by a region of ∼230 amino acids in length, comprising conserved Walker A and Walker B motifs for nucleotide binding and hydrolysis. The AAA domain also drives protein oligomerization, frequently into hexameric ring-like structures with a central pore. The catalytic active site is located at the subunit interface of AAA domains involving conserved elements from both subunits (Miller and Enemark, 2016). AAA+ proteins differ in the number of AAA domains (one or two) per protomer and the presence of extra domains, which provide functional specificity by controlling substrate interactions.

#### Edited by:

*Walid A. Houry, University of Toronto, Canada*

#### Reviewed by:

*Rina Rosenzweig, Weizmann Institute of Science, Israel Michal Zolkiewski, Kansas State University, USA Carolyn K. Suzuki, Rutgers University, USA*

#### \*Correspondence:

*Bernd Bukau bukau@zmbh.uni-heidelberg.de Axel Mogk a.mogk@zmbh.uni-heidelberg.de*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

Received: *22 December 2016* Accepted: *07 February 2017* Published: *22 February 2017*

#### Citation:

*Franke KB, Bukau B and Mogk A (2017) Mutant Analysis Reveals Allosteric Regulation of ClpB Disaggregase. Front. Mol. Biosci. 4:6. doi: 10.3389/fmolb.2017.00006*

Many AAA+ proteins function as ATP-fueled unfolding machineries, causing disassembly of substrate complexes or coupling substrate unfolding to degradation via associated peptidases (Sauer and Baker, 2011). Substrate unfolding by AAA+ proteins is typically mediated by pulling on a bound substrate stretch, leading to substrate threading through the central pore. This threading activity is executed by aromatic residues located on mobile pore loops, which move down the central channel in a nucleotide-controlled manner (Yamada-Inagawa et al., 2003; Schlieker et al., 2004; Zolkiewski, 2006).

How ATP hydrolysis is orchestrated and linked to the formation of a mechanical force is key to understanding AAA+ protein function. The regulation of ATPase activity is complex. AAA+ proteins form asymmetric assemblies, as not all AAA domains bind nucleotide at the same time. For various family members including ClpX, PAN, and ClpB it was shown that only four out of six nucleotide binding sites are occupied (Hersch et al., 2005; Glynn et al., 2009; Smith et al., 2011; Carroni et al., 2014). The individual AAA domains can in principle work independently and ATP hydrolysis can proceed in a probabilistic manner (Martin et al., 2005). However, coordination of ATP hydrolysis leads to stronger power strokes that are linked to more efficient substrate threading (Sen et al., 2013).

The position of ATPase active sites at the interface of two neighboring AAA subunits offers a pathway for allosteric signal transmission and subunit coupling. Conserved arginine fingers located at the subunit interface contact the γ-phosphate of ATP bound in the neighboring subunit. Arginine fingers function as essential trans-acting elements in ATP hydrolysis and provide a structural framework to sense nucleotide states and to transmit this information across the AAA ring in an allosteric fashion (Karata et al., 1999; Wang et al., 2005; Zeymer et al., 2014b).

Hsp100 protein disaggregases (Escherichia coli ClpB, Saccharomyces cerevisiae Hsp104) harbor two AAA domains (AAA-1, AAA-2) and solubilize aggregated proteins in concert with a cognate Hsp70 chaperone system (Aguado et al., 2015b; Mogk et al., 2015). Hsp70-mediated recruitment of ClpB/Hsp104 to protein aggregates is coupled to ATPase activation (Seyffer et al., 2012; Lee et al., 2013; Rosenzweig et al., 2013). Hsp70 interaction and ATPase control are directly linked via the specific ClpB/Hsp104 M-domain. The M-domain forms a coiled-coil structure, which is composed of four helices forming two wings termed motif1 and motif2 (Lee et al., 2003). M-domain motif2 exists in two structural states: it is either in close contact with AAA-1 or dissociated from the AAA-1 ring, enabling its binding to Hsp70 (Oguchi et al., 2012; Carroni et al., 2014). The interaction between M-domain motif2 and AAA-1 downregulates ClpB/Hsp104 ATPase activity (Oguchi et al., 2012; Lee et al., 2013; Lipinska et al., 2013). This qualifies the AAA-1 ring of ClpB/Hsp104 as a main regulatory site, while the AAA-2 ring is suggested to represent the major ATPase motor for substrate threading (Mogk et al., 2015). M-domain mutants disrupting AAA-1/M-domain interaction exhibit high ATPase activities in the presence of substrate, leading to increased unfolding power and disaggregation activities (Oguchi et al., 2012; Lipinska et al., 2013; Jackrel et al., 2014). Hyperactive M-domain mutants, however, exhibit temperature-dependent cellular toxicity, rationalizing tight control of ClpB ATPase activity (Schirmer et al., 2004; Oguchi et al., 2012; Lipinska et al., 2013). The cellular targets of hyperactive M-domain mutants are largely unknown. Hyperactive ClpB/Hsp104 might act on endogenous proteins exposing a specific recognition tag for ClpB/Hsp104 interaction, leading to unfolding of the native protein. Hyperactive ClpB/Hsp104 could also interfere with the de novo folding of nascent polypeptides and the secretion of secretory proteins.

How the M-domain docking state signals to the ATPase center and which step in the ATPase cycle is modulated is currently unknown. Mixing experiments of ClpB/Hsp104 wild type and ATPase-deficient subunits suggest that M-domain dissociation increases AAA subunit cooperation, leading to high ATP turnover rates upon additional substrate binding (Seyffer et al., 2012; Lee et al., 2013; Aguado et al., 2015a; Kummer et al., 2016). Such allosteric control might involve the conserved arginine fingers of both ClpB/Hsp104 AAA domains (E. coli ClpB R331/R332 (AAA-1) and R756 (AAA-2)). Arginine fingers are essential for ClpB/Hsp104 disaggregation activity (Mogk et al., 2003; Yamasaki et al., 2011; Biter et al., 2012). The arginine fingers are crucial for ATP hydrolysis in the respective AAA ring but also act as trans-acting elements, as they affect ATP hydrolysis in the second AAA ring as well (Mogk et al., 2003; Werbeck et al., 2011; Yamasaki et al., 2011; Biter et al., 2012). Arginine fingers thereby control ATPase regulatory circuits both in cis and in trans.

Here, we analyzed the interplay between ClpB intersubunit communication within the first AAA domain and M-domain mediated ATPase control. We analyzed the effects of mutational alterations of a conserved subunit interface residue located close to the conserved arginine fingers of the first AAA domain. We show that small structural alterations at this position have profound and distinct effects on ATPase control, causing either a strong reduction or an increase of total ATPase activity. Affecting AAA-1 intersubunit signaling can overrule ATPase deregulation by ClpB M-domain mutants, suppressing hyperstimulation of ATPase activity and cellular toxicity. Together, our findings confirm and extend our molecular understanding of ClpB inter-ring communication in controlling ATPase and disaggregation activities.

# MATERIALS AND METHODS

# Strains, Plasmids, and Proteins

E. coli strains used were derivatives of MC4100. ClpB was amplified by PCR and inserted into pDS56 and verified by sequencing. Mutant derivatives of clpB were generated by PCR mutagenesis and standard cloning techniques in pDS56 and were verified by sequencing. ClpB was purified after overproduction from E. coli ΔclpB::kan cells. ClpB wild type and mutant variants were purified using Ni-IDA (Macherey-Nagel) and size exclusion chromatography (Superdex S200, Amersham) following standard protocols. Purifications of DnaK, DnaJ, GrpE, Luciferase, and Casein-YFP were performed as described previously (Haslberger et al., 2008; Oguchi et al., 2012; Seyffer et al., 2012). Pyruvate kinase of rabbit muscle and Malate Dehydrogenase of pig heart muscle were purchased from Sigma. Protein concentrations were determined with the Bio-Rad Bradford assay.

# Biochemical Assays

#### Disaggregation Assays

ClpB disaggregation activities were determined by following the disaggregation of heat-aggregated Malate Dehydrogenase (0.5 µM, 30 min at 47°C) and 0.05 µM urea-denatured firefly Luciferase at 25°C as described (Oguchi et al., 2012; Kummer et al., 2016). Chaperones were used at the following concentrations: 1 µM ClpB (wild type or derivatives) and the E. coli Hsp70 system (1 µM DnaK, 0.2 µM DnaJ, 0.1 µM GrpE). Disaggregation reactions were performed in Reaction Buffer (50 mM Tris pH 7.5, 150 mM KCl, 20 mM MgCl2, 2 mM DTT) containing an ATP Regenerating System (2 mM ATP, 3 mM phosphoenolpyruvate, 20 ng/µl Pyruvate Kinase). Luciferase activities were determined with a Lumat LB 9507 (Berthold Technologies). MDH disaggregation was monitored by turbidity measurement at an excitation and emission wavelength of 600 nm (PerkinElmer LS50B spectrofluorimeter).

Luciferase refolding rates and MDH disaggregation rates were calculated from the linear increase in Luciferase activities and the linear decrease in MDH aggregate turbidity, respectively.
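The rate calculation described above can be illustrated with a minimal Python sketch that fits a least-squares line to the linear phase of a time course and expresses a variant's rate relative to wild type. The function names and all readings below are hypothetical and serve only as an example; this is not the analysis script used in this study, and the choice of which time points belong to the linear phase is left to the user.

```python
from statistics import mean

def linear_rate(times_min, signal):
    """Least-squares slope of signal versus time (signal units per minute)."""
    if len(times_min) != len(signal) or len(times_min) < 2:
        raise ValueError("need at least two paired time/signal points")
    t_bar, s_bar = mean(times_min), mean(signal)
    sxx = sum((t - t_bar) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_bar) * (s - s_bar) for t, s in zip(times_min, signal))
    return sxy / sxx

def relative_rate(rate_variant, rate_wt):
    """Rate of a ClpB variant as a percentage of the wild-type rate."""
    return 100.0 * rate_variant / rate_wt

# Hypothetical turbidity readings (arbitrary units), linear phase only.
times = [0, 5, 10, 15, 20]
wt_turbidity = [1.00, 0.90, 0.80, 0.70, 0.60]
mutant_turbidity = [1.00, 0.99, 0.98, 0.97, 0.96]

# Turbidity falls during disaggregation, so the sign of the slope is flipped.
wt_rate = -linear_rate(times, wt_turbidity)
mutant_rate = -linear_rate(times, mutant_turbidity)
mutant_percent = relative_rate(mutant_rate, wt_rate)
print(f"mutant rate: {mutant_percent:.1f}% of wild type")
```

For Luciferase refolding the same slope function applies without the sign flip, because the activity signal increases over time.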

#### ATPase Assay

ATPase activities of ClpB (0.5 µM) were determined in Reaction Buffer in the absence or presence of substrate (10 µM casein) using a NADH-coupled colorimetric assay (Sigma) by measuring the decrease of NADH absorption at 340 nm in a BMG Labtech FLUOstar Omega plate reader. Minor differences in ATPase activities determined for ClpB wt and mutants (**Figures 3**, **6**) are caused by analysis of different protein purification batches.
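In a NADH-coupled assay, each ATP hydrolyzed leads to the oxidation of one NADH, so the slope of the absorbance trace at 340 nm can be converted into an ATP turnover rate with the Beer-Lambert law. The Python sketch below shows one way to do this conversion. It assumes the commonly used molar absorption coefficient of NADH at 340 nm (6220 M⁻¹ cm⁻¹); the optical path length in a plate well, the blank slope, and the example readings are hypothetical and would have to be determined for the actual setup.

```python
NADH_EXTINCTION_PER_M_CM = 6220.0  # NADH molar absorption coefficient at 340 nm

def atp_turnover_per_min(slope_a340_per_min, clpb_um, path_cm,
                         blank_slope_a340_per_min=0.0):
    """ATP hydrolyzed per ClpB monomer per minute from the A340 slope.

    slope_a340_per_min is negative while NADH is consumed; the optional
    blank slope (no ClpB) is subtracted to correct for background.
    """
    if clpb_um <= 0 or path_cm <= 0:
        raise ValueError("ClpB concentration and path length must be positive")
    corrected = slope_a340_per_min - blank_slope_a340_per_min
    # Beer-Lambert: concentration change = absorbance change / (epsilon * path)
    nadh_um_per_min = -corrected / (NADH_EXTINCTION_PER_M_CM * path_cm) * 1e6
    return nadh_um_per_min / clpb_um

# Hypothetical reading: -0.0311 A340/min with 0.5 µM ClpB and a 0.5 cm path.
turnover = atp_turnover_per_min(-0.0311, clpb_um=0.5, path_cm=0.5)
print(f"{turnover:.1f} ATP per ClpB monomer per minute")
```

Dividing the stimulated rate (with casein) by the basal rate obtained the same way gives the fold stimulation by substrate.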

#### Nucleotide Binding

To determine the affinity of ClpB (wt and derivatives) for the fluorescent nucleotide analog mantADP, equilibrium titrations of 1.25 µM mantADP with different ClpB concentrations were performed at 30°C using a FP 6500 JASCO Spectrometer (excitation: 360 nm; emission: 400–500 nm). The affinity for mantADP can be determined using the following equation:

$$F = F_0 + (F_{\max} - F_0)\,\frac{\dfrac{[E]_0 + [L]_0 + K_d}{2} - \sqrt{\dfrac{\left([E]_0 + [L]_0 + K_d\right)^2}{4} - [E]_0 [L]_0}}{[L]_0}$$

with F: observed fluorescence; F0: fluorescence of fluorophore (mantADP); Fmax: maximum fluorescence observed; [E]0: total concentration of ClpB (µM); [L]0: total concentration of fluorophore (µM); Kd: dissociation constant of the complex (µM).
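As an illustration of how a Kd could be extracted from such a titration, the following minimal Python sketch implements the standard quadratic binding isotherm with the symbols defined above (F0, Fmax, [E]0, [L]0, Kd) and estimates Kd by a simple log-spaced grid search. It assumes 1:1 binding, normalization of the bound complex to [L]0, and known values for F0 and Fmax. The function names, the search range, and the synthetic data are hypothetical; this is not the fitting procedure used in this study.

```python
import math

def bound_complex(e_total, l_total, kd):
    """Concentration of the ClpB-mantADP complex (same units as the inputs)."""
    s = e_total + l_total + kd
    return s / 2 - math.sqrt(s * s / 4 - e_total * l_total)

def fluorescence(e_total, l_total, kd, f0, fmax):
    """Predicted fluorescence F for total ClpB [E]0 and total mantADP [L]0."""
    return f0 + (fmax - f0) * bound_complex(e_total, l_total, kd) / l_total

def fit_kd(e_totals, observed, l_total, f0, fmax,
           kd_min=1e-3, kd_max=1e2, points=2001):
    """Return the Kd on a log-spaced grid that minimizes the squared error."""
    best_kd, best_sse = None, float("inf")
    for n in range(points):
        kd = kd_min * (kd_max / kd_min) ** (n / (points - 1))
        sse = sum((fluorescence(e, l_total, kd, f0, fmax) - f) ** 2
                  for e, f in zip(e_totals, observed))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd

# Synthetic, noise-free titration generated with a hypothetical Kd of 0.25 µM.
clpb_um = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0]
data = [fluorescence(e, 1.25, 0.25, 1.0, 3.0) for e in clpb_um]
kd_estimate = fit_kd(clpb_um, data, l_total=1.25, f0=1.0, fmax=3.0)
print(f"estimated Kd = {kd_estimate:.3f} µM")
```

A grid search is used here only to keep the sketch dependency-free; with real, noisy data a nonlinear least-squares routine that also fits F0 and Fmax would usually be preferable.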

To determine the affinity of ClpB wt and derivatives for ADP and ATPγS, competition titrations were performed. mantADP (1.25 µM) was initially mixed with 1.25 µM ClpB and pre-incubated for 5 min at 30°C. Subsequently, this complex was titrated with solutions of ADP or ATPγS, and mantADP fluorescence was determined as described above. The affinities for the unlabeled nucleotides were determined using the following equations:

$$F = F_0\,\frac{K_{i,\mathrm{app}}}{[\mathrm{ATP}] + K_{i,\mathrm{app}}} + F_1\,\frac{[\mathrm{ATP}]}{[\mathrm{ATP}] + K_{i,\mathrm{app}}}$$

with F: observed fluorescence; F0: starting fluorescence without nucleotide; F1: maximum fluorescence observed; [ATP]: concentration of ADP/ATPγS (µM); Ki,app: apparent dissociation constant.

$$K_{i,\mathrm{app}} = K_i \left( 1 + \frac{[\mathrm{mantADP}]}{K_d(\mathrm{mantADP})} \right)$$

with Ki,app: apparent dissociation constant; Ki: dissociation constant; [mantADP]: mantADP concentration (µM); Kd (mantADP): dissociation constant of mantADP.
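The two competition equations can be sketched in Python as follows: the first function evaluates the titration curve for a given Ki,app, and the second rearranges the correction for the competing mantADP to obtain Ki. The function names and the example Ki,app of 10 µM are hypothetical; the mantADP concentration (1.25 µM) is taken from the protocol above, and the Kd of 0.25 µM is used purely as an illustrative value.

```python
def competition_fluorescence(competitor_um, f0, f1, ki_app_um):
    """Fluorescence during titration with unlabeled ADP or ATPγS."""
    denom = competitor_um + ki_app_um
    return f0 * ki_app_um / denom + f1 * competitor_um / denom

def true_ki(ki_app_um, mant_adp_um, kd_mant_adp_um):
    """Ki from Ki,app, corrected for the mantADP already bound."""
    if kd_mant_adp_um <= 0:
        raise ValueError("Kd of mantADP must be positive")
    return ki_app_um / (1 + mant_adp_um / kd_mant_adp_um)

# Hypothetical apparent constant of 10 µM with 1.25 µM mantADP (Kd 0.25 µM).
ki = true_ki(10.0, mant_adp_um=1.25, kd_mant_adp_um=0.25)
print(f"Ki = {ki:.2f} µM")

# At [competitor] = Ki,app the signal lies halfway between F0 and F1.
midpoint = competition_fluorescence(10.0, f0=3.0, f1=1.0, ki_app_um=10.0)
print(f"midpoint fluorescence = {midpoint:.2f}")
```

Because the correction factor grows with the ratio [mantADP]/Kd, a tightly bound reporter makes Ki,app considerably larger than the true Ki.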

#### Unfolding Assays

Unfolding and degradation of Casein-YFP (0.25 µM) were determined in Reaction Buffer in the presence of 4 µM BAP (wt or derivatives), 6 µM ClpP, and an ATP regenerating system. YFP fluorescence was followed on a PerkinElmer LS50B spectrofluorimeter (excitation wavelength 488 nm, emission wavelength 527 nm).

#### Glutaraldehyde Crosslinking

All tested ClpB variants were dialyzed against Reaction Buffer B (50 mM HEPES pH 7.5, 25 mM KCl, 5 mM MgCl2, 2 mM DTT). One micromolar ClpB was incubated at 25°C in presence of ATPγS for 5 min. Crosslinking reactions were started by addition of 0.1% glutaraldehyde. Reactions were stopped after 2 and 10 min by addition of 1 M Tris pH 7.5 and crosslinking products were analyzed by SDS-PAGE (4–15%) followed by Sypro Ruby Staining (ThermoFisher).

#### Hydrogen/Deuterium Exchange Coupled to Mass Spectrometry (HX-MS)

HX-MS experiments were performed as described earlier (Rist et al., 2003). Fifty picomolar of ClpB (wt and derivatives) was incubated for 3 min at 30°C in low salt MDH buffer (50 mM Tris pH 7.5, 20 mM KCl, 20 mM MgCl2, 2 mM DTT) in presence of 2 mM ATPγS. Next, ClpB was diluted 20-fold into the respective D2O-based low salt MDH buffer to initiate amide hydrogen exchange. The exchange reaction was quenched by the addition of 1 volume of ice-cold quench buffer (0.4 M potassium phosphate pH 2.2) and injected into a UPLC (Waters) setup, followed by online peptic digestion. ClpB peptides were analyzed on an electrospray ionization quadrupole time-of-flight mass spectrometer (MaXis UHR qTOF classic, Bruker Daltonics) as described (Rist et al., 2003). Calculation of centroids was conducted manually in an Excel sheet based on the following equation after extraction of I<sub>i</sub> (peak intensity) and m<sub>i</sub> (m/z) using the Bruker Compass software (Bruker Daltonics):

$$\langle m \rangle = \frac{\sum_i I_i m_i}{\sum_i I_i}$$

For initial data analysis, automated data analysis software was also used (HDExaminer, Sierra Analytics). A fully deuterated ClpB sample was generated by incubating ClpB in D2O in the presence of 8 M GdnHCl and analyzed under the same conditions to correct for back-exchange. The relative amount of deuterium atoms incorporated by each peptic fragment was calculated as:

$$\%D = \frac{\mathrm{mass}_t - \mathrm{mass}_{0\%}}{\mathrm{mass}_{100\%} - \mathrm{mass}_{0\%}} \times 100$$

where mass<sub>t</sub> is the observed average mass of the peptide at time point t, mass<sub>0%</sub> is the observed average mass of the undeuterated peptide and mass<sub>100%</sub> is the observed average mass of the fully deuterated peptide.
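The centroid and relative deuteration formulas above translate directly into a few lines of Python. The sketch below is only an illustration of the arithmetic (the original calculation was done in a spreadsheet); the function names and the peak list are hypothetical, and it assumes that centroid m/z values have already been converted to average peptide masses before the deuteration formula is applied.

```python
def centroid(intensities, mz_values):
    """Intensity-weighted mean m/z of an isotope envelope."""
    if len(intensities) != len(mz_values) or not intensities:
        raise ValueError("need matching, non-empty intensity and m/z lists")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("summed intensity must be positive")
    return sum(i * m for i, m in zip(intensities, mz_values)) / total

def percent_deuteration(mass_t, mass_0, mass_100):
    """Relative deuterium incorporation (%), corrected for back-exchange."""
    if mass_100 == mass_0:
        raise ValueError("fully deuterated and undeuterated masses are equal")
    return (mass_t - mass_0) / (mass_100 - mass_0) * 100

# Hypothetical isotope envelope of one peptide.
envelope_centroid = centroid([1.0, 2.0, 1.0], [500.0, 500.5, 501.0])
print(envelope_centroid)

# Hypothetical average masses: undeuterated, time point t, fully deuterated.
incorporation = percent_deuteration(502.0, 500.0, 508.0)
print(f"{incorporation:.0f}% deuterated")
```

Normalizing to the fully deuterated control rather than to the theoretical maximum is what corrects for back-exchange during digestion and chromatography.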

#### Spot Tests

E. coli cells harboring plasmid-encoded clpB alleles were grown in the absence of IPTG overnight at 30°C. Serial dilutions were prepared, spotted on LB-plates containing different IPTG concentrations and incubated for 24 h at the indicated temperatures.

#### Fluorescence Microscopy

E. coli ΔclpB cells harboring IPTG-inducible YFP-tagged clpB alleles were grown to mid-exponential growth phase in the presence of 100 µM IPTG at 30°C. Cells were subjected to heat stress (20 min at 43°C) followed by a recovery period (120 min) at 30°C. One-milliliter samples of the cell cultures were taken before and after heat stress and during recovery and centrifuged. For snapshot imaging, cell pellets were resuspended in 100 µl ice-cold PBS buffer and immobilized on 1% (w/v) agarose pads (in 1x PBS). Agarose pads were sealed with Apiezon grease and covered with cover slips. Imaging was performed using the xcellence IX81 wide field system (Olympus) with a Plan Apochromat x100/1.45 numerical aperture oil objective, a Hamamatsu OrcaR2 camera and the corresponding filter settings (YFP). ImageJ was used for image analysis; for statistical analysis, at least 100 cells were counted to determine the percentage of cells with and without foci before heat shock, after heat shock, and during the recovery phase.

# RESULTS

# Mutating the Interface Residue A328 Affects Cellular Toxicity of ClpB Wild Type and Hyperactive K476C

We set out to study intersubunit communication within the ClpB AAA-1 ring and its connection to subunit coupling of hyperactive ClpB M-domain mutants, which exhibit high ATPase activities. Mutating the classical arginine finger of the ClpB AAA-1 domain leads to drastic phenotypes, including oligomerization defects and entire loss of disaggregation activity, thereby hampering further analysis of the coupling mechanism. We therefore aimed at analyzing nearby, conserved residues located at the subunit interface. We made use of a previous genetic study in Arabidopsis thaliana, analyzing the plant homolog of ClpB, Hsp101. The authors isolated the Hsp101-A499T mutant, which harbors a point mutation in M-domain helix 3 and exhibits cellular toxicity at increased temperatures (38°C), a phenotype not observed for hsp101 null mutants (Lee et al., 2005). The position of the mutation (M-domain helix 3) and the determined gain-of-function phenotype (toxicity) suggest that Hsp101-A499T represents a hyperactive M-domain mutant. Notably, Hsp101-A499T toxicity could be suppressed by the additional mutation A329V, located at the subunit interface of the AAA-1 ring (Lee et al., 2005). The molecular basis of this suppression activity remained unclear, as a biochemical analysis of Hsp101 wild type and mutants was not performed. The suppressor mutation A329V is located far away from the M-domain mutated site (37 Å based on ClpB structure; **Figure 1A**). This strongly suggests that the suppressor does not act in an allele-specific manner and does not directly affect M-domain conformation but buffers against a general deregulation of Hsp101 activity caused by M-domain mutation. Here, we used this original genetic information as a basis to explore the interplay of ClpB ATPase regulation by M-domains and intersubunit communication. We used ClpB-K476C as hyperactive M-domain variant as it still allows for Hsp70 cooperation (Oguchi et al., 2012). To affect intersubunit communication we mutated the highly conserved A328 residue, which corresponds to Hsp101 A329 isolated as suppressor site of the toxic Hsp101-A499T M-domain mutant (**Figure 1A**). The residue A328 is located at the subunit interface in close proximity to the arginine fingers R331/R332 of AAA-1, implying a potential role in ClpB ATPase control (**Figure 1A**).

To test for a role of A328 in regulatory ATPase circuits of E. coli ClpB, we changed the size of the residue at position 328, creating A328G, A328V, A328L, and A328I variants. These variants were additionally linked to K476C to analyze for suppressing effects toward the hyperactive M-domain mutation. We started our analysis by testing all ClpB variants for temperature-dependent toxicity (**Figure 1B**). This screen was performed in E. coli ΔclpB cells expressing the respective plasmid-encoded clpB alleles from an IPTG-regulatable promoter. Overexpression of ClpB-K476C caused cell death in the presence of 250 µM IPTG at 30°C and 50 µM IPTG at 37/42°C. The additional presence of the A328L and A328I mutations either entirely suppressed K476C toxicity (A328L) or strongly reduced toxicity (A328I). Toxicity upon expression of A328I/K476C was only observed at 42°C in the presence of 100 µM IPTG (**Figure 1B**). In contrast, combining A328G or A328V with K476C did not suppress but rather increased toxicity, as cell death was already noticed at 30°C in the presence of 50 µM IPTG. As a control we determined toxicity of single A328 alterations upon expression in E. coli ΔclpB cells (**Figure 1B**). Production of ClpB-A328G, A328L, and A328I did not affect cell growth, as expected. Surprisingly, expression of clpB-A328V was lethal and cellular toxicity was even higher than upon expression of clpB-K476C. The observation that ClpB-A328V is toxic on its own can explain the increased toxicity of ClpB-A328V/K476C. In summary, the interface residue A328 is highly sensitive to mutational alteration, which causes either cellular toxicity or suppression of the toxicity of the otherwise lethal ClpB-K476C M-domain mutant. These findings indicate that A328 plays a crucial role in controlling ClpB activity.

FIGURE 1 | The conserved intersubunit residue A328 controls ClpB activity. (A) Hexameric model of AAA-1 ring of *E. coli* ClpB. Structure of the hexameric AAA-1 ring of *E. coli* ClpB and M-domains (red). The hexameric model is based on the crystal structure of *Thermus thermophilus* ClpB (pdb number 1qvr) and was generated as described in Diemand and Lupas (2006). The enlarged section shows the catalytic site and bound AMPPNP. AAA-1 subunits are in beige and gray. The positions of Walker A and B motifs, the trans-acting arginine fingers and the analyzed mutational sites (A328, K476) are indicated. AMPPNP is shown in black. A sequence alignment of the analyzed subunit interface of ClpB homologs (Hsp101, Hsp104, Hsp78) and ClpA is provided. A328 is highlighted in red, arginine fingers in purple (TT, *Thermus thermophilus*; EC, *Escherichia coli*; AT, *Arabidopsis thaliana*; SC, *Saccharomyces cerevisiae*). (B) *E. coli* Δ*clpB* cells expressing the indicated plasmid-encoded *clpB* alleles under control of an IPTG-regulatable promoter were grown overnight at 30°C. Various dilutions (10<sup>0</sup>–10<sup>−7</sup>) were spotted on LB plates containing the indicated IPTG concentrations and incubated at 30, 37, or 42°C for 24 h.

# A328 is Crucial for ClpB Disaggregation Activity

We determined the consequences of A328 mutations on ClpB disaggregation activities by using aggregates of heat-denatured Malate Dehydrogenase (MDH) and urea-denatured Luciferase as model substrates. We focused our analysis on A328L, A328I, and A328V variants as those mutants either suppressed K476C toxicity (A328L, A328I) or exhibited toxicity on their own (A328V) (**Figure 1B**). Disaggregation was performed in the presence of the cooperating DnaK (Hsp70) chaperone system (DnaK/DnaJ/GrpE: KJE) as neither ClpB wild type (wt) nor the mutant proteins showed disaggregation activity in the absence of the Hsp70 partner (data not shown). Solubilization of MDH aggregates was monitored by determining the decrease in sample turbidity. MDH disaggregation by KJE and ClpB wild type was completed after 60 min. Solubilization of aggregated MDH by hyperactive ClpB-K476C and KJE was completed already after 30 min and the MDH disaggregation rate increased 2.3-fold as compared to ClpB wt (**Figures 2A,B**). All A328X variants, either alone or in combination with K476C, showed strongly reduced disaggregation activity (**Figures 2A,B**). ClpB-A328V had 12% disaggregation activity as compared to ClpB wt and this low activity was hardly increased for ClpB-A328V/K476C, indicating a dominant effect of the A328V mutation. A328L mutants (wt or K476C-linked) only showed background activity as determined in the presence of KJE only. Low disaggregation activity (11%) was observed for A328I, but only if combined with hyperactive K476C, indicating that the activating K476C mutation can partially restore disaggregation activity of ClpB-A328I (**Figures 2A,B**).

A similar trend was observed when using aggregates of urea-denatured Luciferase as alternative substrate (**Figures 2C,D**). The disaggregation and refolding of urea-denatured Luciferase is slightly less sensitive toward alterations of ClpB activity. We assume that this is caused by differences in the nature of MDH and Luciferase aggregates including size and structure. Extraction of Luciferase molecules from Luciferase aggregates likely requires lower force application as can be also seen by partial activity of KJE in absence of ClpB (10% Luciferase refolding rate by KJE only as compared to KJE/ClpB wt). Luciferase refolding was fastest by KJE/ClpB-K476C, confirming its hyperactive state. All A328X variants showed reduced Luciferase refolding activity to variable degrees (**Figures 2C,D**). While ClpB-A328L activity was only slightly above the KJE control (13% activity), disaggregation activities of A328I (28%) and A328V (50%) were higher. Linking A328X mutations to hyperactive K476C generally increased disaggregation activities. Increase was largest for A328L/K476C and A328I/K476C and resulted in 28% (A328L/K476C) and 50% (A328I/K476C) Luciferase reactivation activities as compared to ClpB wt (**Figures 2C,D**). Only a minor increase was noticed for A328V/K476C, exhibiting 57% activity as compared to 50% disaggregation activity determined for ClpB-A328V, resembling results from the MDH disaggregation assays.

Taken together, the disaggregation activities provide a rationale for suppression of ClpB-K476C toxicity by A328L and A328I mutations, as they strongly reduce disaggregation activities and abrogate the high disaggregation activity of K476C. While ClpB-A328V represented the most potent A328X mutant in protein disaggregation, its activity was reduced and clearly different from hyperactive ClpB-K476C. The molecular basis for cellular toxicity noticed upon clpB-A328V expression in E. coli might therefore be different from ClpB-K476C.

# A328 Controls ClpB ATPase Activity

The hyperactive state of ClpB and Hsp104 M-domain mutants is linked to very high ATP hydrolysis rates in presence of substrate (Oguchi et al., 2012; Lipinska et al., 2013; Kummer et al., 2016). In order to link the determined cellular toxicities and disaggregation activities of ClpB A328X variants to potential changes in ATPase activities, we determined ATP turnover rates in absence (basal rate) and presence (stimulated rate) of the model substrate casein and compared those to ClpB wild type and ClpB-K476C (**Figure 3**). A328L and A328I mutants (alone or linked to K476C) showed strongly reduced basal ATPase activities. Addition of casein increased ATP turnover by A328I and A328I/K476C, resulting in ATPase activities that were either 3-fold lower (A328I) or 1.65-fold higher (A328I/K476C) as compared to ClpB wild type. In case of the A328L mutation, a significant ATPase activity was only determined for A328L/K476C upon addition of casein; however, ATP turnover was still 5.6-fold lower as compared to ClpB wt (+ casein). The determined reductions in ATPase activities of ClpB-A328L and -A328I mutants are overall in agreement with their lowered disaggregation activities and also explain why A328L and A328I suppress K476C toxicity, as A328L/K476C and A328I/K476C do not reach the high ATPase activity of K476C. To exclude that the low ATPase activities determined for ClpB-A328I and ClpB-A328L are caused by oligomerization defects, we performed glutaraldehyde crosslinking experiments (Supplementary Figure 1), demonstrating that all investigated ClpB mutants can oligomerize like ClpB wt.

Notably, a very high ATPase rate was determined for ClpB-A328V in presence of casein and ATP turnover was 6.1-fold increased as compared to ClpB wt (**Figure 3**). This high ATPase activity is reminiscent of hyperactive ClpB-K476C. Combining both ATPase activating mutations in ClpB-A328V/K476C did not result in further ATPase activity increase, presumably because the ATPase motor is already running at maximal speed.

ClpB-A328V exhibits opposing consequences on ATPase activity as compared to the A328L and A328I mutants. These differences in ATPase activities correlate well with the noticed effects of respective mutants on cellular toxicity. The high ATPase activity of ClpB-A328V is linked to cellular toxicity, which is not observed for slowly hydrolyzing ClpB-A328L and ClpB-A328I and respective K476C-linked mutants. The low ATPase activities of ClpB-A328I/L mutants are similar to those determined for arginine finger mutants (Mogk et al., 2003; Yamasaki et al., 2011; Biter et al., 2012). In contrast, the high ATPase activity of A328V is entirely unexpected and therefore we focused further analysis on this particular variant.

# ClpB-A328V is Hyperactive and Unfolds Stable Protein Domains

ClpB-A328V exhibits key characteristics of hyperactive ClpB mutants: (i) very high ATPase activity in presence of substrate (**Figure 3**) and (ii) cellular toxicity upon expression in E. coli cells (**Figure 1B**). High ATPase rates of hyperactive ClpB M-domain mutants also enable them to unfold stable protein domains, an activity not observed for ClpB wild type (Haslberger et al., 2008; Oguchi et al., 2012). To test for high unfolding activity we made use of casein-YFP, which is recognized by ClpB as substrate via its casein moiety. High unfolding activity of ClpB will allow for YFP unfolding and can be monitored by loss of YFP fluorescence. However, YFP can rapidly refold upon initial unfolding making it difficult to robustly study ClpB unfolding activity. To overcome this obstacle we made use of the ClpB variant BAP, which binds to the E. coli peptidase ClpP, thereby directly linking successful substrate unfolding and threading to degradation via associated ClpP (Weibezahn et al., 2004). BAP/ClpP allows determining unfolding activities toward casein-YFP, as unfolding of the YFP moiety results in its degradation and thereby an irreversible loss of YFP fluorescence. Fluorescence of casein-YFP remained stable upon incubation with ClpP or BAP-wt/ClpP, confirming that YFP resists threading by BAP-wt (**Figure 4**). In contrast, BAP-K476C/ClpP caused rapid loss of YFP fluorescence, decreasing fluorescence intensity to 50% within 20 min. Degradation of Casein-YFP by BAP-K476C/ClpP was not complete. We assume this is caused by heterogeneity of the substrate pool, which includes a fraction that is not accessible for BAP-K476C processing. BAP-A328V/ClpP also degraded Casein-YFP to a similar degree, yet the degradation rates were 6-fold lower as compared to BAP-K476C/ClpP (**Figure 4**). Still, loss of YFP fluorescence shows that BAP-A328V unfolds the stable YFP moiety in contrast to BAP wt, demonstrating hyperactivity. The A328V mutation therefore fulfills all key characteristics of hyperactive ClpB mutants: (i) cellular toxicity, (ii) high ATPase activity, and (iii) high unfolding activity.

# Low Disaggregation Activity of A328V is not Caused by Alterations in Protein Aggregate Targeting

ClpB-A328V and ClpB-K476C share key characteristics of a hyperactive activity status but differ substantially in disaggregation activities in vitro. While ClpB-K476C exhibits superior disaggregation activity, ClpB-A328V activity is low. We speculated that this difference might stem from reduced binding of ClpB-A328V to the DnaK partner chaperone, resulting in less efficient targeting to protein aggregates. We employed fluorescence microscopy using C-terminal YFP fusions to ClpB-wt or -A328V and -K476C mutants to monitor their DnaK-dependent binding to protein aggregates. To avoid cellular toxicity upon overexpression of hyperactive clpB-K476C and clpB-A328V mutants, all yfp-fused clpB constructs were expressed in E. coli ΔclpB cells from a low copy vector to produce approximately ClpB wt levels (data not shown). These low expression levels do not affect growth of E. coli cells. Protein aggregates forming in E. coli cells during heat stress are deposited at the cell poles (Winkler et al., 2010). ClpB-YFP is recruited to polar protein aggregates in a DnaK-dependent manner (Winkler et al., 2012), leading to the appearance of polar ClpB-YFP foci after heat stress (45°C) at the expense of diffuse cytosolic ClpB-YFP fluorescence (**Figure 5A**). Polar ClpB-YFP fluorescence vanished during a recovery period at 30°C within 120 min in >80% of cells. Loss of polar ClpB-YFP fluorescence was accompanied by the reappearance of diffuse cytosolic ClpB-YFP fluorescence, reflecting successful protein disaggregation (**Figures 5A,D**). We next monitored the cellular distribution of ClpB-K476C-YFP and ClpB-A328V-YFP during stress application. Both constructs exhibited diffuse fluorescence before heat shock but formed polar foci upon heat shock in a manner indistinguishable from ClpB-wt-YFP (**Figures 5B,C**). These findings exclude that defects in DnaK interaction are causative for reduced ClpB-A328V disaggregation activity.
Notably, loss of ClpB-A328V-YFP foci during the recovery period was delayed and half of the cell population still contained polar foci after 120 min (**Figure 5D**). This is indicative of a reduced disaggregation activity of ClpB-A328V-YFP in vivo, in agreement with results obtained for aggregated MDH and Luciferase model substrates in vitro. We conclude that ClpB-A328V is affected in a step of the disaggregation cycle downstream of DnaK-mediated targeting to protein aggregates.

# ClpB-A328V and ClpB-K476C Differ in the Mechanistic Basis of ATPase Hyperactivity

The absence of an obvious defect of ClpB-A328V in DnaK interaction led us to speculate that the molecular basis of ClpB-A328V and ClpB-K476C hyperactivity, reflected by high ATP turnover rates, differs between the two mutants. To analyze for differing effects of ClpB-A328V and ClpB-K476C on the ATPase cycle, we linked the mutations to single ClpB Walker B mutations in AAA-1 (E279A) or AAA-2 (E678A), allowing for ATP binding but abolishing ATP turnover in the respective AAA+ ring. Additionally, we combined the A328V and K476C mutations with the Walker A mutation K611Q, causing deficiency in ATP binding at the AAA-2 ring. We did not include a respective Walker A mutant of AAA-1 as it shows oligomerization defects (Watanabe et al., 2002; Mogk et al., 2003). We determined ATPase activities in absence and presence of substrate casein and included single ClpB Walker B and A mutants as reference (**Figures 6A,B**).

The effects of the tested ClpB mutants on ATPase activities were complex. Linking K476C to E279A or K611Q reduced but did not abolish high ATPase activity in presence of casein (**Figures 6A,B**). ClpB-K476C/E678A did not exhibit increased ATPase activity as compared to ClpB-E678A and ATP turnover was no longer stimulated by casein. We conclude that K476C leads to increased ATP turnover in both AAA rings, however, freezing the AAA-2 ring in the ATP state (E678A) almost entirely prevents casein-dependent ATPase stimulation by the K476C M-domain mutation. The latter effect is also observed for ClpB-E678A, indicating that substrate binding predominantly stimulates ATP turnover at AAA-2. Furthermore, reductions in ATPase activities were most pronounced when linking K476C to Walker A/B mutants of the AAA-2 ring, suggesting that the increased ATP turnover in the hyperactive M-domain mutant K476C is mostly due to increased ATPase activity in the AAA-2 ring.

The results obtained for A328V linked to Walker A and B mutations were different from K476C. The ClpB-E279A/A328V double mutant exhibited reduced ATPase activity compared to respective single mutants, suggesting that each mutation affects the ClpB ATPase cycle differently, leading to a distinct effect upon combining both mutations. Linking A328V to Walker A or B mutants of AAA-2 entirely abrogated basal and casein-stimulated ATPase activity, in contrast to ClpB-K476C (**Figures 6A,B**). This indicates that ATP turnover at AAA-2 is obligatory for ClpB-A328V ATPase activity and suggests that ATP binding or turnover at AAA-1 might be abrogated in ClpB-A328V. The ClpB-A328V-K476C mutant also did not exhibit ATPase activity when linked to K611Q or E678A (**Figures 6A,B**), demonstrating that the effect of A328V on ATPase control is dominating and cannot be compensated by M-domain mediated ATPase regulation.

The absence of ClpB-A328V ATPase activity if linked to Walker A and B mutants of AAA-2 could be explained by A328V abrogating nucleotide binding at AAA-1. To test for potential nucleotide binding defects we used the fluorescent nucleotide analog mantADP, which shows increased and blue-shifted fluorescence upon binding to ClpB (Schlee et al., 2001). Binding of mantADP was largely unaltered for ClpB-A328V as compared to ClpB wt or ClpB-K476C, and only a 2.2-fold decrease in affinity was determined (KD: 0.51 µM for ClpB-A328V vs. 0.23 and 0.25 µM for ClpB wt and ClpB-K476C; Supplementary Figure 2A). Similarly, competition titration experiments with ADP or ATPγS did not reveal strong differences in nucleotide binding, as the respective KD-values of A328V were again only 2-fold increased as compared to ClpB wt (Supplementary Figure 2B). To specifically test for mantADP binding at AAA-1 only, we analyzed ClpB-A328V/K611Q, which is deficient in nucleotide binding at AAA-2. mantADP binding curves of ClpB wt and ClpB-A328V were indistinguishable (**Figure 6C**). We were not able to determine KD-values as we did not reach binding saturation in the presence of 50 µM ClpB protein. This is explained by a low nucleotide binding affinity of AAA-1 if AAA-2 stays nucleotide-free (Fernandez-Higuero et al., 2011). Together these findings exclude nucleotide-binding defects of ClpB-A328V. The lack of any ATPase activity determined for ClpB-A328V/E678A and ClpB-A328V/K611Q therefore implies that (i) ClpB-A328V is deficient in ATP turnover at AAA-1 and (ii) the strongly increased ATPase activity of ClpB-A328V in the presence of casein is caused by exclusively stimulating ATP turnover at AAA-2. Alternatively, preventing nucleotide binding or hydrolysis at AAA-2 might affect the ATPase cycle of ClpB-A328V in a more complex manner, abrogating ATP hydrolysis in the entire ClpB hexamer once AAA-2 activity is blocked.
The noticed differences in ATPase activities of A328V and K476C mutants when linked to Walker A or B mutants also demonstrate that the molecular basis for their hyperactive activity states must be different. ClpB-A328V and ClpB-K476C therefore represent different classes of hyperactive ClpB mutants.
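The fold changes quoted for mantADP binding follow directly from the ratio of the dissociation constants. The short sketch below only reproduces that arithmetic from the KD values given in the text; the dictionary and function names are illustrative and not part of the original analysis.

```python
# Dissociation constants (µM) for mantADP as quoted in the text.
kd_um = {"ClpB wt": 0.23, "ClpB-K476C": 0.25, "ClpB-A328V": 0.51}


def fold_change(kd_variant, kd_reference):
    """Ratio of KD values; >1 means weaker binding than the reference."""
    return kd_variant / kd_reference


# Compare every variant with wild type.
for name, kd in kd_um.items():
    ratio = fold_change(kd, kd_um["ClpB wt"])
    print(f"{name}: {ratio:.1f}-fold relative to ClpB wt")
```

A ratio close to 2 corresponds to a small change in binding free energy, which is consistent with the conclusion that nucleotide binding is largely intact in ClpB-A328V.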

# ClpB-A328V Affects the Conformation of the Walker A Motif of AAA-2

The determined effects of hyperactive ClpB-A328V on ATPase and unfolding activities must stem from specific conformational changes within the ClpB ring. To study potential effects of the A328V mutation on AAA-1, AAA-2, and M-domain conformations we determined the structural flexibility of ClpB-A328V by amide hydrogen exchange (HX) mass spectrometry (MS). HX-MS determines the solvent accessibility and structural flexibility of the peptide backbone. Amide hydrogens are protected from HX if engaged through hydrogen bonds in secondary and tertiary structures. We compared the HX-MS patterns of peptic peptides from ClpB wt, ClpB-A328V, and ClpB-K476C in the presence of ATPγS (**Figure 7A**). The HX-MS pattern of ClpB-A328V was overall similar to ClpB wt and differences in HX were lower than 5% for most peptic peptides. This confirms nucleotide binding to both AAA domains of ClpB-A328V, as a defect in nucleotide binding to either AAA domain would result in strong deprotection of multiple peptic peptides (Oguchi et al., 2012). ClpB-A328V did not exhibit strong deprotection of M-domain motif2 peptic peptides as observed for ClpB-K476C (**Figure 7A**). ClpB-A328V hyperactivity therefore does not rely on dissociation of M-domain motif2, confirming that the molecular basis of ClpB-K476C and ClpB-A328V hyperactivities is different. Further analysis revealed that only two out of the multiple AAA-1 and AAA-2 peptic peptides of ClpB-A328V showed a deviation of 10% or more in HX as compared to ClpB wt. The first peptide, E330-F337, includes the arginine fingers R331 and R332 of AAA-1 and exhibits a 10% increase in HX compared to ClpB wt. This suggests an altered positioning of the arginine fingers in ClpB-A328V. The second peptide, L602-L614 (13% increase in HX), encompasses the Walker A motif of AAA-2 (G605-T612). Notably, changes in HX were also observed for other ClpB-A328V peptic peptides of AAA-2 located close to the Walker A peptide (**Figures 7A,B**). This implies structural differences in the catalytic ATPase center of the AAA-2 domains of ClpB wt and ClpB-A328V. An increased deprotection of the AAA-2 Walker A peptic peptide L602-T612, though not as pronounced (8% increased HX), was also noticed for ClpB-K476C.
ClpB-K476C also showed strong deprotection of I205-L219 (16% change in HX), encompassing the Walker A motif of AAA-1 (G206-T212). Here, ClpB-A328V also showed increased deprotection, yet not to the same degree. Therefore, both hyperactive mutants, ClpB-A328V and ClpB-K476C, exhibit specific structural changes in the catalytic centers that were most pronounced either in AAA-1 (K476C) or in AAA-2 (A328V), providing a structural correlative to ATPase hyperactivity.
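The selection of peptides described above amounts to applying a cutoff to the difference in HX between mutant and wild type. The sketch below illustrates that filtering step using only the four values quoted in the text; the 10% cutoff is the one stated above, whereas the record layout and the function name are assumptions made for illustration and do not reflect the authors' analysis software.

```python
# Differences in HX (percentage points, mutant vs. ClpB wt) quoted in the text.
# Each record: (mutant, peptic peptide, increase in HX).
hx_delta = [
    ("ClpB-A328V", "E330-F337", 10),
    ("ClpB-A328V", "L602-L614", 13),
    ("ClpB-K476C", "L602-T612", 8),
    ("ClpB-K476C", "I205-L219", 16),
]


def flag_peptides(records, threshold=10):
    """Return the records whose HX difference reaches the cutoff."""
    return [rec for rec in records if abs(rec[2]) >= threshold]


for mutant, peptide, delta in flag_peptides(hx_delta):
    print(f"{mutant} {peptide}: {delta}% change in HX")
```

With the quoted values, the Walker A peptide of AAA-2 in ClpB-K476C (8%) falls below the cutoff, while the other three peptides are retained.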

#### DISCUSSION

In the presented work we analyzed the role of intersubunit communication in controlling ClpB ATPase and disaggregation activity. We selected the conserved A328 residue for analysis as it is located in a strategic position at the AAA-1 subunit interfaces close to the essential arginine fingers R331 and R332. Furthermore, the identical residue was identified as intragenic suppressor of a toxic, gain-of-function Hsp101 M-domain mutant (Lee et al., 2005), indicating a role of this residue in controlling ClpB/Hsp101 activity. This observation also provided rationale for analysis of a potential interconnection of ClpB intersubunit communication via A328 and M-domain mediated ATPase control.

Our analysis confirms and extends previous findings on intersubunit communication based on arginine finger mutants (Mogk et al., 2003; Werbeck et al., 2011; Yamasaki et al., 2011, 2015; Biter et al., 2012; Zeymer et al., 2014b). We show that A328, like the arginine fingers, acts in both intra-ring and inter-ring signaling. The alanine residue is conserved in the AAA-1 domain of Class I Hsp100 proteins (e.g., ClpA, ClpB, ClpC, ClpE, ClpV) harboring two AAA modules, but can also be found in other AAA+ family members including CDC48 and NSF (Supplementary Figure 4). This suggests that the regulatory function of A328 uncovered here for ClpB is widespread and operative in various AAA+ proteins. A328X mutants affect not only the ATPase activity of the AAA-1 cis ring but also that of the AAA-2 trans ring. The residue A328 exhibits remarkable sensitivity toward the introduced mutations, justifying its high degree of conservation. While all characterized A328X mutants exhibited reduced disaggregation activities to varying degrees, they showed diverging consequences on ATP hydrolysis rates. A328I and A328L had reduced or almost no ATPase activity, whereas A328V hydrolyzed ATP 6 times faster than ClpB wt in the presence of the substrate casein. This shows that small structural alterations, caused by the additional presence of a single methyl group in A328L/I compared to A328V, have diverse effects on ATPase activity control in the ClpB hexamer. We suggest that the A328X mutations affect the orientation of the nearby arginine fingers. This is supported for ClpB-A328V by HX-MS analysis, revealing a specific conformational change of the AAA-1 peptic peptide E330-F337 encompassing the arginine fingers R331/R332 (**Figure 7A**). We assume that subtle differences in arginine finger conformations between A328I/L and A328V variants are the basis for their dramatically different ATPase activities.

While the low ATPase activities of A328L/I were expected, the high ATPase rate of ClpB-A328V is surprising. Further analysis revealed that A328V represents a hyperactive ClpB mutant, which shares key characteristics with hyperactive ClpB M-domain mutants (e.g., K476C): high ATPase and unfolding activities and temperature-dependent cellular toxicity. ClpB-A328V, however, differs from M-domain mutants and constitutes a novel class of hyperactive ClpB mutants. HX-MS analysis revealed that M-domain motif2 of ClpB-A328V does not show increased deprotection as compared to ClpB wt, indicating that M-domain motif2 is not displaced from the AAA-1 ring, in contrast to ClpB-K476C. This shows that the initial molecular event resulting in high ATPase rates of ClpB-A328V and ClpB-K476C is different. A328V might, however, activate an allosteric network, which is also involved in ATPase activation upon M-domain dissociation. The latter event must include additional activation steps as both classes of hyperactive ClpB mutants (A328V and K476C) differ in disaggregation activities and ATPase regulation. Accordingly, coupling A328V or K476C to Walker B mutants that prevent ATP hydrolysis at the AAA-1 or AAA-2 ring had distinct consequences on ATPase activities. Preventing nucleotide binding or hydrolysis at AAA-2 entirely abrogated ATPase activity of ClpB-A328V, in contrast to ClpB-K476C (**Figure 6**). These findings imply that ATP hydrolysis in the AAA-1 ring is blocked in ClpB-A328V. This defect likely affects cooperation of ClpB-A328V with its Hsp70 partner DnaK. While aggregate targeting of ClpB-A328V by DnaK is unaffected (**Figures 5C,D**), it exhibits low disaggregation activity, indicating that ClpB-A328V/DnaK cooperation is affected post recruitment. We suggest that efficient substrate transfer from DnaK to ClpB requires ATP turnover at the AAA-1 ring, an activity not performed by ClpB-A328V. This explains why ClpB-A328V has reduced disaggregation activity, whereas ClpB-K476C is highly potent in vitro.

However, ClpB-A328V is different from the ClpB Walker B mutant E279A, which also blocks ATP hydrolysis in the AAA-1 ring. In contrast to ClpB-A328V (which has increased ATPase activity through AAA-2), ClpB-E279A has lower ATPase activity (−/+ casein; **Figure 6**) and hardly exhibits cellular toxicity (Supplementary Figure 3). Also, combining A328V and E279A strongly reduced ATPase activity and cellular toxicity (Supplementary Figure 3). The deregulation of ATPase control in AAA-2 caused by either A328V or E279A mutations is therefore different. This difference might be explained by defects of A328V in sensing the nucleotide state (ATP) and transmitting this signal within the cis AAA-1 and to the trans AAA-2 ring. Loss of ATPase regulation involving A328V causes strongly increased ATP turnover at AAA-2 (**Figure 6**). The AAA-2 ring provides the main threading power, and an increase in its ATPase activity explains the high unfolding power of ClpB-A328V and likely its cellular toxicity. HX-MS analysis of ClpB-A328V revealed specific conformational changes in the catalytic center of AAA-2. The peptic peptide L602-T612, including the Walker A motif of AAA-2, showed increased HX, indicating increased structural flexibility at the catalytic site. Notably, determination and analysis of ClpB AAA-2 crystal structures revealed that the catalytic site of AAA-2 is inactive as the essential Walker A lysine residue (K611) exists in a stretched conformation and does not contact bound nucleotide (Zeymer et al., 2014a). It is tempting to speculate that the increased flexibility determined for the AAA-2 Walker A peptide of ClpB-A328V reflects a repositioning of K611 and therefore activation of the AAA-2 ATPase motor.

Linking A328X mutations to hyperactive ClpB-K476C also offers an explanation for the original identification of the intersubunit residue as suppressor of a toxic Hsp101 M-domain mutant (Lee et al., 2005). ATPase and disaggregation activities of ClpB-A328I/L-K476C double mutants are low. As cellular toxicities of ClpB M-domain mutants correlate with high ATPase activities, the determined reduction in ATP turnover explains the suppressor function of the A328I/L mutation. Linking the K476C mutation to A328I/L still increases ATPase and disaggregation activity. This indicates that the primary signaling routes controlled by the M-domain and A328 are partly distinct and that, in the ClpB-A328I/L-K476C double mutant, up-regulation by one pathway (K476C) is counteracted by down-regulation of the other pathway (A328I/L).

# AUTHOR CONTRIBUTIONS

Conceived and designed experiments: KBF, BB, AM. Performed experiments: KBF. Analyzed the data: KBF, BB, AM. Wrote the manuscript: BB, AM.

# FUNDING

This work was funded by grants of the Deutsche Forschungsgemeinschaft (BB617/17-2 and MO 970/4-2) to BB and AM.

# ACKNOWLEDGMENTS

KBF was supported by the Hartmut Hoffmann-Berling International Graduate School of Molecular and Cellular Biology (HBIGS).

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmolb.2017.00006/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Franke, Bukau and Mogk. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Substrate Discrimination by ClpB and Hsp104

Danielle M. Johnston †‡, Marika Miot †‡, Joel R. Hoskins ‡ , Sue Wickner\* and Shannon M. Doyle\*

Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, United States

#### Edited by:

Walid A. Houry, University of Toronto, Canada

#### Reviewed by:

Peter Chien, University of Massachusetts Amherst, United States Michal Zolkiewski, Kansas State University, United States

#### \*Correspondence:

Sue Wickner wickners@mail.nih.gov Shannon M. Doyle doyles@mail.nih.gov

#### † Present Address:

Danielle M. Johnston, Bristol-Myers Squibb, Hopewell, NJ, United States Marika Miot, UMR7622, Biologie du Développement, Centre National de la Recherche Scientifique, Institut de Biologie Paris-Seine, UPMC University Paris 06, Paris, France

> ‡ These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

> Received: 30 March 2017 Accepted: 12 May 2017 Published: 29 May 2017

#### Citation:

Johnston DM, Miot M, Hoskins JR, Wickner S and Doyle SM (2017) Substrate Discrimination by ClpB and Hsp104. Front. Mol. Biosci. 4:36. doi: 10.3389/fmolb.2017.00036

ClpB of E. coli and yeast Hsp104 are homologous molecular chaperones and members of the AAA+ (ATPases Associated with various cellular Activities) superfamily of ATPases. They are required for thermotolerance and function in disaggregation and reactivation of aggregated proteins that form during severe stress conditions. ClpB and Hsp104 collaborate with the DnaK or Hsp70 chaperone system, respectively, to dissolve protein aggregates both in vivo and in vitro. In yeast, the propagation of prions depends upon Hsp104. Since protein aggregation and amyloid formation are associated with many diseases, including neurodegenerative diseases and cancer, understanding how disaggregases function is important. In this study, we have explored the innate substrate preferences of ClpB and Hsp104 in the absence of the DnaK and Hsp70 chaperone system. The results suggest that substrate specificity is determined by nucleotide binding domain-1.

Keywords: ClpB, Hsp104, molecular chaperone, disaggregase, DnaK, Hsp70, amyloid, aggregate

# INTRODUCTION

All cells have a protein network involved in maintaining the proteome following periods of stress. Maintenance of the proteome utilizes energy-dependent molecular machines that facilitate protein remodeling, reactivation, disaggregation and degradation of misfolded, aggregated or inactive proteins. Members of the Clp/Hsp100 family of ATP-dependent AAA+ proteins are molecular chaperones found in bacteria, archaea, and the organelles of metazoans. Hsp104 and ClpB are two members of the Clp/Hsp100 family and are found in yeast and bacteria, respectively, where they are essential for growth following extreme stress, such as high temperature (Hodson et al., 2012; Doyle et al., 2013; Mogk et al., 2015). They aid in cell survival by disaggregating and reactivating proteins inactivated and aggregated following stress conditions (Hodson et al., 2012; Doyle et al., 2013; Mogk et al., 2015). Under normal growth conditions, Hsp104 and ClpB are not essential; however, Hsp104 is required for the propagation of specific amyloidogenic proteins, prions, in yeast (Romanova and Chernoff, 2009; Tuite et al., 2011; Wickner et al., 2011; Winkler et al., 2012). Protein disaggregation and reactivation by Hsp104/ClpB require the collaboration of another molecular chaperone, Hsp70/DnaK, and its cochaperones (Glover and Lindquist, 1998; Goloubinoff et al., 1999; Motohashi et al., 1999; Zolkiewski, 1999).

Hsp104 and ClpB, like other Clp/Hsp100 chaperones, are hexameric ring-like structures (Diemand and Lupas, 2006; Erzberger and Berger, 2006; Doyle et al., 2013; Mogk et al., 2015). Recent studies have indicated that the Hsp104 hexamer may take on a spiral conformation at some point during the protein disaggregation process (Heuck et al., 2016; Yokom et al., 2016). Spirals have been observed previously when ClpA and ClpB were crystallized (Guo et al., 2002; Lee et al., 2003); however, the importance of a spiral vs. closed ring architecture is not yet understood. Each protomer of the Hsp104/ClpB hexamer contains two highly conserved AAA+ modules, nucleotide binding domain-1 and -2 (NBD-1 and NBD-2), with each NBD possessing a Walker A and Walker B motif, an arginine finger motif and sensor-1 and -2 motifs (Hanson and Whiteheart, 2005; Erzberger and Berger, 2006; Wendler et al., 2012; Doyle et al., 2013; **Figures 1A,B**). The Hsp104/ClpB protomer also contains an N-terminal domain (N-domain, NTD), which is less conserved between species than the nucleotide binding domains. The NTD is connected to NBD-1 via a flexible linker and is important for interaction with some substrates (Lee et al., 2003; Nagy et al., 2010; Doyle et al., 2012; Zhang et al., 2012; Rosenzweig et al., 2015; Sweeny et al., 2015). Finally, a coiled-coil middle domain (M-domain), which is required for disaggregation activity, is inserted within NBD-1 of Hsp104/ClpB (Lee et al., 2003, 2007, 2010; Doyle et al., 2013; Mogk et al., 2015). The M-domain of Hsp104 and ClpB has been shown to directly interact with the Hsp70 chaperone, Ssa1 in yeast and DnaK in bacteria, in a species-specific manner (Sielaff and Tsai, 2010; Miot et al., 2011; Seyffer et al., 2012; Rosenzweig et al., 2013; Kummer et al., 2016). This direct interaction and collaboration is required for the synergy observed in ATP hydrolysis and substrate disaggregation (Doyle et al., 2007a; Miot et al., 2011; Seyffer et al., 2012; Rosenzweig et al., 2013; Kummer et al., 2016). Additionally, Hsp104 has a C-terminal domain that is involved in hexamerization and may play a role in thermotolerance (Mackay et al., 2008).

Although ClpB and Hsp104 require the DnaK/Hsp70 chaperone system for protein disaggregation in vivo and in vitro, alone they possess intrinsic protein remodeling activities, including protein unfolding, activation and disaggregation of small aggregates (Doyle et al., 2007b). The intrinsic chaperone activity is evoked by using mixtures of ATP and ATPγS or by using ATP hydrolysis defective ClpB/Hsp104 mutant proteins (Doyle et al., 2007b; Hoskins et al., 2009). One interpretation of the observations is that these conditions slow ATP hydrolysis by the chaperone allowing both substrate binding, a condition that requires ATP binding but not hydrolysis, and substrate translocation, a process that requires ATP hydrolysis, to occur simultaneously. By studying ClpB and Hsp104 using these conditions, ClpB and Hsp104 have been shown to function similarly to other Clp/Hsp100 chaperones. Briefly, Clp/Hsp100 chaperones recognize polypeptide substrates that contain an unstructured region of a minimum length, generally at an end. This unstructured region is engaged by residues in pore loops, which extend into the central channel of the Clp/Hsp100 hexamer (Baker and Sauer, 2012; Doyle et al., 2013; Mogk et al., 2015). These pore loops are in a nucleotide binding domain and use ATP driven conformational cycles to power mechanical unfolding of the polypeptide and translocation of the unfolded polypeptide through the channel (Weber-Ban et al., 1999; Lum et al., 2004, 2008; Schlieker et al., 2004; Siddiqui et al., 2004; Weibezahn et al., 2004; Hinnerwisch et al., 2005; Martin et al., 2008; Tessarz et al., 2008; Doyle et al., 2013). Unfolded substrate is then released and can refold spontaneously or with the help of additional chaperones (Hodson et al., 2012; Zolkiewski et al., 2012; Doyle et al., 2013; Mogk et al., 2015).

FIGURE 1 | Hsp104 and ClpB have multiple chaperone activities in vitro. (A) Structure of the ClpB monomer from Thermus thermophilus bound to AMP-PNP (PDB code: 1QVR; chain C) is shown (Lee et al., 2003). Each monomer is comprised of an amino-terminal domain (N-domain; NTD; red), a coiled-coil middle domain (M-domain; blue) and two nucleotide-binding domains (NBD-1 and NBD-2; cyan and purple, respectively). The nucleotide is shown as a CPK model in black. (B) T. thermophilus ClpB hexamer model with bound ATP is shown (Lee et al., 2003; Diemand and Lupas, 2006). In (B), one monomer of the hexamer is shown in color as described in (A). (C) Hsp104 and ClpB can collaborate with the Hsp70 or DnaK system, respectively, in GFP-38 disaggregation, as observed by monitoring the increase in GFP fluorescence over time as described in Section Materials and Methods. (D) Hsp104, but not ClpB, can prevent the assembly of NM-His into amyloid fibers, as observed by monitoring Thioflavin T (ThT) fluorescence as described in Section Materials and Methods. Data are means ± SEM (n = 3). (E) ClpB, but not Hsp104, can prevent the aggregation of heat-denatured luciferase as observed by measuring turbidity by 90° light scattering as described in Section Materials and Methods. In (C,E) a representative experiment of three or more replicates is shown.

Substrate recognition and binding by Clp/Hsp100 chaperones has been well-studied for many Clp proteins, including ClpA and ClpX, two bacterial chaperones associated with proteases (Weber-Ban et al., 1999; Zolkiewski, 2006; Baker and Sauer, 2012). Specific substrates have been identified by proteomic studies and specific recognition sequences have been determined (Flynn et al., 2003; Zolkiewski, 2006; Baker and Sauer, 2012). For ClpB and Hsp104, however, few specific substrates have been identified, and a mechanism for substrate discrimination by ClpB and Hsp104 has not been described. In the present study, we have further explored the question of substrate recognition by ClpB and Hsp104.

#### MATERIALS AND METHODS

## Plasmids

pNM-His was constructed by amplifying the NM region of sup35 by PCR using 5′ and 3′ oligos containing NdeI and BamHI sites, respectively, and pJC25NMstop (Addgene, plasmid #1228, Shorter and Lindquist, 2004) as template. The NM region PCR product does not contain a stop codon. This DNA was digested with NdeI and BamHI and ligated into similarly digested pET24b. The resulting plasmid was digested with EcoRI and EagI and a linker coding for six histidines followed by a stop codon was ligated between the sites. The plasmid was confirmed by DNA sequencing.

#### Purification of Proteins

GroELTrap (Weber-Ban et al., 1999), Hsp104-ClpB chimeras (Miot et al., 2011), GFP-15 (Hoskins et al., 2002), GFP-38 and GFP-XX-H<sub>6</sub> proteins (Hoskins and Wickner, 2006), and GFP (Hoskins et al., 2000) were purified as previously described. Luciferase was from Promega. Protein concentrations given are for monomeric GFP fusion proteins, NM-His and luciferase, hexameric ClpB, Hsp104 and chimeras, and tetradecameric GroELTrap.

#### ClpB Purification

ClpB wild-type and ClpBE279A,E678A (Weibezahn et al., 2003; Doyle et al., 2007b) were constructed and purified as previously described (Zolkiewski, 1999), but with modifications. Cultures of E. coli BL21(DE3) containing pClpBwt (pET24b vector) or pClpBE279A,E678A (pET24b vector) were grown at 30°C to OD<sub>600</sub> of ∼0.6 and then induced overnight with 0.1 mM IPTG. All purification steps were carried out at 4°C. Clarified cellular extracts were purified over a Q-Sepharose column (GE Healthcare) in 20 mM Tris-HCl, pH 7.5, 80 mM NaCl, 5 mM MgCl<sub>2</sub>, 0.5 mM EDTA, 20% glycerol (vol/vol) and 1 mM DTT. Proteins were eluted from the column with a linear gradient of 80–1,000 mM NaCl in the same buffer. Fractions containing ClpB were further purified using Sephacryl S-200 chromatography in 50 mM Tris-HCl, pH 7.5, 200 mM KCl, 10% glycerol, 20 mM MgCl<sub>2</sub>, 1 mM EDTA, and 1 mM DTT.

#### Hsp104 Purification

Hsp104 wild-type and Hsp104E285A,E687A (Bosl et al., 2005) were constructed and purified as previously described, but with minor modifications (Miot et al., 2011). This is a detailed description of our Hsp104 purification protocol. The plasmid pHsp104wt was used for the expression of tag-less, wild-type Hsp104 (Miot et al., 2011). pHsp104wt was transformed into E. coli strain Rosetta(DE3) by electroporation. Transformed cells were plated on LB plates supplemented with 50 µg/mL ampicillin and 10 µg/mL chloramphenicol and grown overnight at 32°C. Transformations were optimized to yield several hundred colony-forming units on each plate. The fresh transformants were used to inoculate Hsp104 expression cultures as follows: 5 mL of LB broth was added to each plate and the cells were resuspended using a sterile glass or plastic rod; cells from a single plate were used to inoculate 1 L of LB broth supplemented with 100 µg/mL carbenicillin (chloramphenicol was not added) in a 2 L baffled flask. Typically, 2–4 L of culture were grown at the same time for one preparation. The cultures were incubated with shaking at 25°C and 250 rpm to an OD<sub>600</sub> = 0.25; the incubator temperature was reduced to 18°C and Hsp104 expression induced with the addition of IPTG to a final concentration of 0.1 mM; growth was continued overnight (14–16 h) at 18°C with shaking at 250 rpm. Cells were harvested by centrifugation in a pre-chilled rotor at 5,000 × g (∼5,000 rpm in a Sorvall SLA-3000 or equivalent) for 10 min at 4°C. The cell pellet from each 1 L culture was resuspended in 25 mL ice cold Q104 buffer [40 mM Hepes pH 7.5, 80 mM NaCl, 0.5 mM EDTA, 20 mM MgCl<sub>2</sub>, 1 mM DTT, 20% glycerol (vol/vol), 5 mM ATP] containing EDTA-free complete protease inhibitor cocktail (Roche), which was prepared by mixing 1 tablet/50 mL of Q104 buffer. The resuspended cells were lysed by two or three passages through an ice-cold French Pressure cell (10,000 psi). The cell lysate was collected at the sample outlet tube with a vessel submerged in an ice bath. Cell debris was removed by centrifugation at 12,000 × g (10,000 rpm in a Sorvall SS-34 or equivalent) for 15 min at 4°C.

The resulting supernatant was then centrifuged at 130,000 × g (35,000 rpm in a Sorvall F50L-8x39 or equivalent) for 30 min at 4°C. As an alternative, a single centrifugation step at 34,500 × g (17,000 rpm in a Sorvall SS-34 or equivalent) for 90 min at 4°C will produce similar results. The supernatant was used for subsequent purification. The supernatant must be subjected to the first column purification step without interruption or overnight storage or Hsp104 activity will be significantly reduced or lost. All purification steps were performed at 4°C using prechilled buffers. The clarified lysate was applied to a 20 mL Q-sepharose Fast Flow (GE Healthcare) column equilibrated with Q104 buffer at 1 mL/min using a peristaltic pump. The column was washed with two column volumes of Q104 buffer and protein was eluted with a 100 mL, 80–500 mM NaCl linear gradient in Q104 buffer. Column fractions of 3 mL each were collected. At this point, fractions containing Hsp104 can be stored at −80°C. Next, a 3 mL Q-sepharose Hsp104 peak fraction was applied onto a 40 mL Sephacryl S-200 High Resolution (GE Healthcare) column (1.5 cm I.D. × 30 cm length) equilibrated with SE104 buffer [20 mM Hepes pH 7.5, 200 mM KCl, 0.5 mM EDTA, 20 mM MgCl<sub>2</sub>, 1 mM DTT, 20% glycerol (vol/vol), 5 mM ATP] at 0.5 mL/min using a peristaltic pump. Fractions (1 mL) were collected and those containing purified Hsp104 were stored at −80°C. When thawed for use, individual fractions are divided into 100–200 µL aliquots and stored at −80°C to minimize the number of freeze-thaw cycles. Using this procedure, the Hsp104 activity is stable for at least 1 year.
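When an equivalent rotor is substituted, the centrifugation steps above have to be converted between relative centrifugal force (× g) and rotor speed (rpm). The sketch below uses the standard relation RCF = 1.118 × 10⁻⁵ · r · rpm², with r in cm. The 10.7 cm radius used in the example is an assumed maximum radius for an SS-34-type rotor and should be checked against the rotor manual; the function names are illustrative.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at the given radius and speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach the given x g at the given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Assumed maximum radius (cm) of an SS-34-type rotor; verify for your rotor.
ASSUMED_RADIUS_CM = 10.7

for rpm in (10_000, 17_000):
    print(f"{rpm} rpm -> {rcf_from_rpm(rpm, ASSUMED_RADIUS_CM):,.0f} x g")
```

Under this radius assumption, 10,000 and 17,000 rpm correspond to roughly 12,000 × g and 34,500 × g, in line with the values quoted in the protocol.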

#### NM-His Purification

NM-His was purified as previously described (Glover et al., 1997) with modifications. Cultures (50–100 mL) of BL21(DE3) clpP- transformed with pNM-His were grown in LB (30 µg/mL Kan and 10 µg/mL Cam) at 37°C to an OD<sub>600</sub> of ∼0.6–0.8 and induced with 1 mM IPTG for 2 h. Cells were harvested by centrifugation and resuspended in 40 mM Hepes, pH 7.4, and lysed using a French Press. Urea was added to a final concentration of 8 M and the lysate kept at room temperature (∼23°C) for the remaining preparation. Insoluble material was removed by centrifugation. NM-His was precipitated with the addition of MeOH to 70% (vol/vol) and the precipitate collected by centrifugation. The protein pellet was resuspended in NM Buffer (40 mM HEPES, pH 7.4, 8 M Urea) and then incubated with TALON resin for 30 min. The slurry was poured into an empty chromatography column and washed with 10 bed volumes of NM Buffer. NM-His was eluted with NM Buffer containing 50 mM Imidazole. NM-His containing fractions were precipitated with MeOH as above, the pellet was resuspended in 70% MeOH and the sample was stored at −80°C in small aliquots. NM-His was stable for ∼6 months.

#### GFP Unfolding Assay

Reaction mixtures (100 µL) contained buffer A [20 mM Tris-HCl, pH 7.5, 100 mM KCl, 5 mM DTT, 0.1 mM EDTA, and 10% glycerol (vol/vol)], 0.005% Triton X-100 (vol/vol), 0.2 mg/mL BSA, 10 mM MgCl<sub>2</sub>, 2 mM ATP, and 2 mM ATPγS (Roche), an ATP regenerating system (20 mM creatine phosphate and 6 µg creatine kinase), 0.4 µM GFP or GFP fusion protein, 3.0 µM GroELTrap and 1 µM ClpB or Hsp104. GroELTrap is a mutant form of GroEL that binds but does not release unfolded proteins and was included in the reactions to prevent the GFP fusion proteins from refolding (Weber-Ban et al., 1999). Unfolding was initiated with the addition of ATP, ATPγS, and MgCl<sub>2</sub> and the change in fluorescence signal was monitored over time at 25°C using a Tecan Infinite M200Pro plate reader. Excitation and emission wavelengths were 395 and 510 nm, respectively. For K<sub>M</sub> and V<sub>max</sub> determinations, substrate concentrations were varied between 0.1 and 10 µM while keeping ClpB and Hsp104 concentrations constant at 1 µM. GroELTrap was varied between 1 and 5 µM depending on the substrate concentration. Unfolding rates were determined from the initial linear decrease in fluorescence intensities of the GFP fusion proteins. Michaelis-Menten analysis was performed using the non-linear regression analysis in Prism 7.0a for Mac OS X, GraphPad Software, La Jolla California USA (http://www.graphpad.com).
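The two analysis steps described above (an initial rate taken from the linear part of the fluorescence decay, followed by a Michaelis-Menten fit across substrate concentrations) can be sketched in a few lines of Python. This is an illustration only, not the workflow used in this study: the original analysis used non-linear regression in Prism, whereas the sketch substitutes a Hanes-Woolf linearization, [S]/v = [S]/V<sub>max</sub> + K<sub>M</sub>/V<sub>max</sub>, so that it needs nothing beyond the standard library. All function names and the synthetic, noise-free data are assumptions made for the example.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_unfolding_rate(times, fluorescence):
    """Rate of fluorescence loss (a positive number) over the supplied
    time window, which the caller restricts to the initial linear phase."""
    slope, _ = linear_fit(times, fluorescence)
    return -slope


def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, KM) from the straight line
    [S]/v = [S]/Vmax + KM/Vmax (hypothetical helper, not Prism's method)."""
    s_over_v = [s / v for s, v in zip(substrate, rates)]
    slope, intercept = linear_fit(substrate, s_over_v)
    vmax = 1.0 / slope
    return vmax, intercept * vmax


# Synthetic rates generated from assumed parameters (Vmax in min^-1, KM in µM)
# over the 0.1-10 µM substrate range used in the assay.
substrate_um = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
rates = [michaelis_menten(s, vmax=0.054, km=1.8) for s in substrate_um]
vmax_fit, km_fit = fit_hanes_woolf(substrate_um, rates)
```

With real, noisy data a linearization weights the points differently from a direct non-linear fit, so the two approaches need not return identical parameters.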

#### Protein Complexes

Reaction mixtures (100 µL) containing GFP-15, GFP-X30-H<sub>6</sub>, or GFP-X7-H<sub>6</sub> (0.4 µM) with or without ClpBE279A,E678A or Hsp104E285A,E687A (2 µM) were incubated in buffer A, 0.005% Triton X-100, 5 mM ATP, and 10 mM MgCl<sub>2</sub> for 45 min at room temperature. Reaction mixtures were fractionated on a Sephacryl S200 column (GE Healthcare) equilibrated with 20 mM Tris-HCl, pH 7.5, 100 mM NaCl, 20 mM KCl, 0.1 mM EDTA, 10% glycerol, 5 mM ATP, and 10 mM MgCl<sub>2</sub> at room temperature. Fractions (100 µL) were collected and GFP fluorescence was measured in a Tecan Infinite M200Pro plate reader at 25°C as described above. The percentage of the GFP fusion protein signal that was shifted upon chaperone binding was determined by calculating the area under the shifted peak compared to the total area under all peaks. The elution profile of ClpBE279A,E678A or Hsp104E285A,E687A (2 µM) was determined in the absence of GFP fusion protein by measuring protein in each fraction using the Bradford assay.
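The peak-area comparison described above can be illustrated with a short calculation. The sketch below is a minimal example under stated assumptions: fractions are of equal volume, areas are approximated with the trapezoid rule, and the boundary between the shifted (chaperone-bound) peak and the free-substrate peak is chosen by the user at a baseline valley. The function names and the fluorescence profile are hypothetical and are not data from this study.

```python
def trapezoid_area(values, width=1.0):
    """Area under equally spaced points using the trapezoid rule."""
    return sum((a + b) / 2.0 * width for a, b in zip(values, values[1:]))


def percent_shifted(fluorescence, last_shifted_index):
    """Percentage of total peak area that elutes in the shifted peak.

    fluorescence: per-fraction GFP signal in elution order.
    last_shifted_index: index of the last fraction assigned to the
        shifted peak (an assumption of this sketch; pick a valley
        where the signal returns to baseline).
    """
    shifted = trapezoid_area(fluorescence[: last_shifted_index + 1])
    total = trapezoid_area(fluorescence)
    return 100.0 * shifted / total


# Hypothetical elution profile: a small early peak followed by a larger
# free-substrate peak, separated by baseline fractions.
profile = [0, 1, 3, 1, 0, 0, 2, 6, 2, 0]
shifted_pct = percent_shifted(profile, last_shifted_index=4)
```

If the two peaks overlap rather than returning to baseline between them, a simple split like this one misassigns signal, and peak deconvolution would be more appropriate.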

#### Prevention of Heat-Denatured Luciferase Aggregation

Luciferase (0.2 µM) was denatured at 42◦C in Buffer B (50 mM Tris-HCl, pH 7.5, 150 mM KCl, 20 mM MgCl2, 2 mM DTT) with 5 mM ATPγS in the presence or absence of 0.5 µM ClpB or Hsp104 as previously described (Weibezahn et al., 2003). Aggregation of luciferase was monitored as an increase in sample turbidity by measuring 90◦ static light scattering on a PerkinElmer LS55 luminescence spectrometer with excitation and emission wavelengths each set to 550 nm.

#### Prevention of NM-His Fiber Assembly

NM-His fiber assembly reactions (100 µL) were initiated by diluting denatured NM-His in 8 M urea (20 mM Tris-HCl, pH 7.4) 100-fold to a final concentration of 0.2 µM with assembly buffer (40 mM Hepes-KOH, pH 7.4, 150 mM KCl, 1 mM DTT) in the presence or absence of 0.5 µM ClpB or Hsp104 (Shorter and Lindquist, 2004). Assembly reactions were agitated at 1,000 rpm and assembly of NM-His fibers was assessed by Thioflavin T (ThT) binding (100 µM final concentration). ThT fluorescence was read in a Tecan Infinite M200Pro plate reader at 25°C using excitation and emission wavelengths of 440 and 481 nm, respectively.

#### GFP-38 Disaggregation Assay

GFP-38 disaggregation was performed as previously described (Miot et al., 2011). Reaction mixtures (100 µL) contained 25 mM Hepes, pH 7.5, 50 mM KCl, 0.1 mM EDTA, 5 mM DTT, 0.005% Triton X-100 (vol/vol), 4 mM ATP, an ATP regenerating system (10 mM creatine phosphate and 3 µg creatine kinase), 10 mM MgCl2, 5 µL heat-aggregated GFP-38 (prepared by heating 75–100 µL of 14 µM GFP-38 for 15 min at 80°C in 0.2 mL PCR tubes; the heated GFP-38 was rapidly frozen on dry ice, thawed and used immediately), 0.5 µM ClpB, 1.3 µM DnaK, 0.2 µM DnaJ and 0.1 µM GrpE or 0.5 µM Hsp104, 1.3 µM human Hsp70 (HSPA1A) and 0.2 µM Ydj1. GFP fluorescence was monitored over time at 23°C using a Varian Cary Eclipse fluorescence spectrophotometer with a plate reader. Excitation and emission wavelengths were 395 and 510 nm, respectively. Reactivation was determined relative to a non-denatured GFP-38 control.

# RESULTS

# Hsp104 and ClpB Exhibit Substrate Preferences

In this work, we wanted to know if ClpB and Hsp104 differ in their innate substrate preferences. The experiments addressing this question were performed in the absence of the DnaK or Hsp70 chaperone so it would be possible to study the basic properties of the ClpB/Hsp104 machine and avoid the complication of substrate recognition by DnaK and Hsp70. However, it is known that in vivo and in vitro in the presence of ATP, both ClpB and Hsp104 require DnaK/Hsp70 to carry out protein disaggregation and reactivation (Glover and Lindquist, 1998; Goloubinoff et al., 1999; Motohashi et al., 1999; Zolkiewski, 1999; Doyle et al., 2013).

For these experiments, we used ClpB that was purified as described previously (Zolkiewski, 1999; Doyle et al., 2015; see Section Materials and Methods) and Hsp104 that was purified using standard biochemical protocols described in detail in Section Materials and Methods. The chaperones were isolated from E. coli cells overexpressing untagged ClpB or Hsp104 and consequently Hsp104 might not contain post translational modifications that would be present when the protein is expressed in yeast. Biochemical properties of Hsp104 were determined because controversy exists in the literature regarding several of the reported activities of Hsp104. Hsp104 isolated as described here reactivated aggregates in the presence of ATP in combination with Hsp70 and Hsp40 (**Figure 1C**; either yeast Ssa1 or human Hsp70 functioned in combination with Ydj1 or Sis1 from yeast; Miot et al., 2011; Reidy et al., 2014; Doyle et al., 2015). Additionally, it prevented amyloid assembly in the absence of ATP and Hsp70 (**Figure 1D**; Inoue et al., 2004; Shorter and Lindquist, 2004, 2006), and as previously observed it was unable to prevent aggregation of heat-denatured luciferase (**Figure 1E**; Glover and Lindquist, 1998). It also hydrolyzed ATP at a rate similar to published rates (Lum et al., 2004; Doyle et al., 2007b; Miot et al., 2011) and unfolded substrates using a condition that elicits the innate chaperone activity of Hsp104, a mixture of ATP and ATPγS (**Figure 2**; Doyle et al., 2007b). However, using Hsp104 prepared as described here, we were unable to repeat the observations, including one from our group, that Hsp104 accelerates assembly of the NM fragment of Sup35 in an ATP-dependent reaction (Shorter and Lindquist, 2004, 2006; Doyle et al., 2007b) and promotes disassembly of NM fibers in an ATP-dependent reaction in the absence of Hsp70 (Shorter and Lindquist, 2004, 2006; Doyle et al., 2007b; DeSantis et al., 2012). 
Other groups have previously reported that their Hsp104 preparations were unable to perform these two reported activities (Inoue et al., 2004; Krzewska and Melki, 2006; Savistchenko et al., 2008; Glover and Lum, 2009; Kummer et al., 2016).

To explore substrate discrimination by ClpB and Hsp104 in the absence of DnaK/Hsp70, we tested the two chaperones for the ability to act on several model substrates in vitro. The innate protein unfolding activity of ClpB and Hsp104 in the absence of the Hsp70/DnaK chaperone system was measured in the presence of a mixture of ATP and ATPγS to elicit the intrinsic chaperone activity (Doyle et al., 2007b, 2012; Hoskins et al., 2009). GFP-15, a GFP fusion protein containing a C-terminal 15-amino acid peptide, was used as a model substrate. We had previously demonstrated that GFP-15 is a substrate for ClpA, but not ClpX (Hoskins et al., 2002), and we had also shown that ClpB catalyzes its unfolding in the presence of mixtures of ATP and ATPγS (Hoskins et al., 2009; Doyle et al., 2012; **Table 1**; **Figure 2A**). Unfolding of GFP-15 was determined by monitoring the decrease in GFP fluorescence over time in the presence of GroELTrap, a mutant form of GroEL that binds and does not release unfolded proteins (**Figure 2A**; Weber-Ban et al., 1999). In contrast to the rapid rate of GFP-15 unfolding seen with ClpB, the rate of unfolding by Hsp104 was ∼10-fold slower (**Figure 2B**). We next tested GFP-X30-H<sub>6</sub>, another GFP fusion protein that was previously shown to be a substrate for unfolding by ClpA, but not ClpX; it contains a C-terminal 30 amino acid peptide followed by a six-histidine tag (Hoskins and Wickner, 2006; **Table 1**; **Figures 2A,B**). ClpB unfolded GFP-X30-H<sub>6</sub> at a much slower rate than it did GFP-15 (**Figure 2A**); however, Hsp104 catalyzed unfolding of this substrate at a rate ∼5-fold faster than ClpB (**Figure 2B**), showing that ClpB and Hsp104 differ in their ability to act on these substrates.

We then wanted to know if ClpB and Hsp104 also differed in their ability to recognize and unfold GFP proteins with other polypeptide tags fused at either the N- or C-terminus. When two N-terminally tagged GFP fusion proteins, 15-GFP with the same 15 amino acid tag as on GFP-15 and 1-24βGal-GFP with a tag comprised of the first 24 amino acids of β-galactosidase, were tested, both substrates were unfolded by ClpB, as previously observed (Doyle et al., 2012; **Table 1**; **Figure 2C**). In contrast, neither of the N-terminally tagged substrates tested was detectably unfolded by Hsp104 (**Figure 2D**), supporting the above suggestion that ClpB and Hsp104 differ in their ability to unfold specific substrates. We next tested two additional C-terminally tagged GFP fusion proteins of different length but similar sequence, GFP-X42-H<sub>5</sub> and GFP-X7-H<sub>6</sub>, which are related to GFP-X30-H<sub>6</sub> (**Table 1**). Similar to the results observed for GFP-X30-H<sub>6</sub>, Hsp104 unfolded GFP-X42-H<sub>5</sub> at a faster rate than ClpB (**Figures 2C,D**). However, GFP-X7-H<sub>6</sub> was unfolded faster by ClpB than Hsp104, suggesting that Hsp104 may require a longer tag than ClpB, although the difference in unfolding rates may be due to sequence preferences or potential differences in the secondary structure of the tags (**Figures 2C,D**). We also tested GFP-SsrA, a GFP fusion protein C-terminally tagged with the well-studied SsrA 11-aa peptide, which can be recognized and unfolded by both ClpA and ClpX (Keiler et al., 1996; Singh et al., 2000; **Table 1**). Both ClpB and Hsp104 unfolded GFP-SsrA at a slow rate, indicating that the SsrA tag is poorly recognized by the two disaggregases (**Figures 2C,D**). This result is consistent with ClpB having weak binding affinity for the SsrA tag (Li et al., 2015) and observations previously reported, but not shown, indicating that ClpB does not unfold GFP-SsrA (Hinnerwisch et al., 2005). Taken together, ClpB and Hsp104 appear to have substrate preferences for protein unfolding.

<sup>a</sup>The 15 amino acid tag on GFP-15 and 15-GFP comprises the first 15 N-terminal residues of the P1 plasmid replication initiator protein, RepA (Hoskins et al., 2002).

GFP fusion proteins were constructed as described in Section Materials and Methods.

We next tested if the rate of protein unfolding of a substrate correlated with the ability of the chaperone to interact stably with the specific substrate. Mutants of ClpB and Hsp104 with substitutions in the NBD-1 and NBD-2 Walker B sites (ClpBE279A,E678A and Hsp104E285A,E687A) were used for these experiments because they bind but do not hydrolyze ATP and therefore limit the protein remodeling pathway to substrate interaction (Weibezahn et al., 2003; Bosl et al., 2005). ClpBE279A,E678A was first incubated with GFP-15 in the presence of ATP to allow complex formation. Following incubation, the mixture was subjected to gel filtration chromatography and GFP fluorescence was measured in the eluted fractions. We observed a peak of fluorescence eluting near the position of ClpBE279A,E678A and separated from the position where GFP-15 eluted when chromatographed alone (**Figures 3A,B**). About 27 ± 6% of the GFP-15 eluted in a complex with ClpB. However, when Hsp104E285A,E687A was incubated with GFP-15 and the mixtures analyzed by gel filtration, there was no detectable peak of GFP-15 fluorescence eluting at the position of Hsp104E285A,E687A (**Figure 3C**).

<sup>b</sup>The 24 amino acid tag on 1-24βGal-GFP comprises the first 24 N-terminal residues of β-galactosidase (Hoskins et al., 2002).

<sup>c</sup>Each (X) sequence of varying length, from 42 to 7 amino acids, comprises residues resulting from the translation of varying portions of the pET24b multicloning site (Hoskins and Wickner, 2006).

In parallel experiments, when ClpBE279A,E678A was incubated with GFP-X30-H<sub>6</sub> and ATP and analyzed by gel filtration, a single peak of GFP fluorescence was observed that eluted at the position of free GFP-X30-H<sub>6</sub> (**Figures 3D,E**). In contrast, when Hsp104E285A,E687A was incubated with GFP-X30-H<sub>6</sub> and subjected to gel filtration, a peak of fluorescence, which contained 22 ± 2% of the total fluorescence, was detected eluting at the position of Hsp104 (**Figure 3F**). A third substrate, GFP-X7-H<sub>6</sub>, was also tested for its ability to interact with ClpBE279A,E678A and Hsp104E285A,E687A via gel filtration analysis (**Figures 3G–I**). The results were similar to those observed for GFP-15, with about 22 ± 1% of the GFP-X7-H<sub>6</sub> eluting in a complex with ClpB (**Figure 3H**), while there was no detectable complex of GFP-X7-H<sub>6</sub> and Hsp104 (**Figure 3I**). Thus, with these three substrates, the results indicate a direct correlation between the rate of substrate unfolding by ClpB and Hsp104 and the stability of substrate interaction by the chaperone.

To further investigate the relationship between the substrate binding affinity and the rate of substrate unfolding by ClpB and Hsp104, we monitored the initial rates of unfolding of GFP-X30-H<sub>6</sub>, GFP-X7-H<sub>6</sub> and GFP-15, while keeping the chaperone concentration constant and varying the substrate concentration. For GFP-X30-H<sub>6</sub>, Michaelis-Menten analysis indicated that Hsp104 and ClpB similarly interact with this substrate (**Figure 4A**). Hsp104 only has an ∼2-fold lower K<sub>M</sub> and less than 2-fold higher V<sub>max</sub> compared to ClpB. GFP-X7-H<sub>6</sub> was bound similarly by Hsp104 and ClpB, with ClpB having less than a 2-fold lower K<sub>M</sub> for binding than Hsp104 (**Figure 4B**). However, the maximum unfolding rate (V<sub>max</sub>) was ∼4-fold higher for ClpB than for Hsp104 with this substrate (**Figure 4B**). When GFP-15 unfolding was analyzed in the same way, the K<sub>M</sub> for both ClpB and Hsp104 was the same; however, the maximum unfolding rates were again different, with ClpB having ∼3-fold higher V<sub>max</sub> than Hsp104 with this substrate (**Figure 4C**). These results indicate that for the substrates tested, binding affinity and the maximum substrate unfolding rate both affect the ability of ClpB and Hsp104 to efficiently process substrates.
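The fold differences quoted in this paragraph can be recomputed directly from the fitted parameters. The short sketch below takes the K<sub>M</sub> and V<sub>max</sub> means reported in the Figure 4 legend and expresses each as a ClpB-to-Hsp104 ratio; the dictionary layout and function name are illustrative assumptions, and the reported standard errors are deliberately ignored.

```python
# (KM in µM, Vmax in min^-1): mean values as reported in the Figure 4 legend.
KINETICS = {
    "GFP-X30-H6": {"Hsp104": (1.8, 0.054), "ClpB": (4.1, 0.04)},
    "GFP-X7-H6": {"Hsp104": (5.2, 0.03), "ClpB": (3.0, 0.12)},
    "GFP-15": {"Hsp104": (1.6, 0.03), "ClpB": (1.2, 0.091)},
}


def clpb_over_hsp104(substrate):
    """Return (KM ratio, Vmax ratio), each computed as ClpB / Hsp104."""
    km_hsp104, vmax_hsp104 = KINETICS[substrate]["Hsp104"]
    km_clpb, vmax_clpb = KINETICS[substrate]["ClpB"]
    return km_clpb / km_hsp104, vmax_clpb / vmax_hsp104


for substrate in KINETICS:
    km_ratio, vmax_ratio = clpb_over_hsp104(substrate)
    print(f"{substrate}: KM ratio {km_ratio:.2f}, Vmax ratio {vmax_ratio:.2f}")
```

A K<sub>M</sub> ratio above 1 means Hsp104 has the lower K<sub>M</sub>, and a V<sub>max</sub> ratio above 1 means ClpB unfolds faster at saturation. The ratios come out at roughly 2.3 and 0.74 for GFP-X30-H<sub>6</sub>, 0.58 and 4.0 for GFP-X7-H<sub>6</sub>, and 0.75 and 3.0 for GFP-15.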

# Nucleotide-Binding Domain-1 Is Important for Determining Substrate Binding Specificity

We wanted to explore substrate discrimination by ClpB and Hsp104 further by asking what domain or domains of ClpB and Hsp104 were involved in the substrate discrimination we observed with GFP-15 and GFP-X30-H<sub>6</sub> (**Figure 2**). For these experiments, we utilized previously characterized chimeras of ClpB and Hsp104 (Miot et al., 2011). The chimeras are designated by a series of four characters that represent the four ClpB/Hsp104 domains from the N- to C-terminus: the N-domain, NBD-1, M-domain, and NBD-2 (**Figure 5A**). "B" represents a domain from ClpB and "4" represents a domain from Hsp104. For example, 444B represents the chimera with the N-domain, NBD-1 and M-domain from Hsp104 and NBD-2 from ClpB.
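The four-character naming scheme is easy to decode mechanically, which helps when comparing many chimeras. The following is a minimal sketch of that convention; the function and constant names are hypothetical and exist only for this illustration.

```python
# Domain order from N- to C-terminus, as in the chimera names.
DOMAIN_ORDER = ("N-domain", "NBD-1", "M-domain", "NBD-2")
# "B" marks a ClpB domain and "4" marks an Hsp104 domain.
DOMAIN_SOURCE = {"B": "ClpB", "4": "Hsp104"}


def decode_chimera(name):
    """Map a four-character chimera name to the origin of each domain."""
    if len(name) != len(DOMAIN_ORDER):
        raise ValueError(f"expected {len(DOMAIN_ORDER)} characters, got {name!r}")
    for code in name:
        if code not in DOMAIN_SOURCE:
            raise ValueError(f"unknown domain code {code!r} in {name!r}")
    return {domain: DOMAIN_SOURCE[code] for domain, code in zip(DOMAIN_ORDER, name)}


def has_hsp104_nbd1(name):
    """True when the chimera carries NBD-1 from Hsp104."""
    return decode_chimera(name)["NBD-1"] == "Hsp104"


example = decode_chimera("444B")
```

Here `example` maps the N-domain, NBD-1 and M-domain to Hsp104 and NBD-2 to ClpB, matching the 444B description in the text.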

We tested the ClpB/Hsp104 chimeras for the ability to discriminate between GFP-X30-H<sub>6</sub> and GFP-15, the two substrates most efficiently unfolded by Hsp104 and ClpB, respectively (**Figures 2A,B**). We observed that B4BB, a chimera with NBD-1 from Hsp104 and the other domains from ClpB, unfolded GFP-X30-H<sub>6</sub> at a significantly faster rate than ClpB, although more slowly than Hsp104 wild-type (**Figure 5B**). This result suggests that the Hsp104 NBD-1 is important for substrate specificity. In support of this suggestion, three other chimeras containing the NBD-1 from Hsp104, B44B, 44B4, and 444B, also unfolded GFP-X30-H<sub>6</sub> at rates similar to or slightly faster than Hsp104 wild-type (**Figure 5B**). Additionally, the observation that B44B unfolded GFP-X30-H<sub>6</sub> like Hsp104 wild-type indicates that the N-terminal domain does not affect recognition of GFP-X30-H<sub>6</sub> by Hsp104 (**Figure 5B**). 4BBB unfolded GFP-X30-H<sub>6</sub> at a rate similar to ClpB wild-type, substantiating the conclusion that NBD-1 plays a role in substrate discrimination with this substrate, but the N-domain does not (**Figure 5B**).

We next monitored the ability of the chimeras to unfold GFP-15, the preferred substrate of ClpB (**Figure 2A**). As observed for GFP-X30-H<sub>6</sub>, chimeras with NBD-1 from Hsp104, including B4BB, B44B, 44B4, and 444B, functioned comparably to Hsp104 wild-type and unfolded GFP-15 at a slow rate (**Figure 5C**). The observation that B44B functioned like Hsp104 wild-type again emphasized that the N-domain is not important for specificity toward this substrate (**Figure 5C**). Additionally, 4BBB, with the N-domain from Hsp104 and NBD-1 from ClpB, unfolded GFP-15 at a rate similar to ClpB wild-type (**Figure 5C**). Collectively, these results suggest that with the two substrates tested, NBD-1 is important for the substrate unfolding preference of Hsp104 and likely ClpB. Moreover, the N-domain does not appear to be involved in recognition of these substrates by ClpB and Hsp104.

ClpB and Hsp104. The concentration of (A) GFP-X30-H<sub>6</sub>, (B) GFP-X7-H<sub>6</sub>, and (C) GFP-15 was varied in ClpB- or Hsp104-mediated unfolding reactions and the initial rate of unfolding was plotted vs. the substrate concentration as described in Section Materials and Methods. Curves shown are the fit of the data to the Michaelis-Menten equation and kinetic parameters (K<sub>M</sub> and V<sub>max</sub>) were determined as described in Section Materials and Methods. For GFP-X30-H<sub>6</sub> (A), the Hsp104 K<sub>M</sub> and V<sub>max</sub> are 1.8 (0.2) µM and 0.054 (0.003) min<sup>−1</sup>, respectively, while the ClpB K<sub>M</sub> and V<sub>max</sub> are 4.1 (0.4) µM and 0.04 (0.002) min<sup>−1</sup>. For GFP-X7-H<sub>6</sub> (B), the Hsp104 K<sub>M</sub> and V<sub>max</sub> are 5.2 (0.6) µM and 0.03 (0.001) min<sup>−1</sup>, respectively, while the ClpB K<sub>M</sub> and V<sub>max</sub> are 3.0 (0.4) µM and 0.12 (0.009) min<sup>−1</sup>. For GFP-15 (C), the Hsp104 K<sub>M</sub> and V<sub>max</sub> are 1.6 (0.2) µM and 0.03 (0.001) min<sup>−1</sup>, respectively, while the ClpB K<sub>M</sub> and V<sub>max</sub> are 1.2 (0.1) µM and 0.091 (0.003) min<sup>−1</sup>. In (A–C), data are the means ± SEM (n = 3).

GFP-15 (C) mediated by Hsp104 (4444; dashed black line), ClpB (BBBB; solid black line) or chimeras (colored lines) in the presence of ATP and ATPγS as described in Section Materials and Methods. The initial fluorescence was set equal to 1 and a data set representative of three or more replicates is shown.

#### DISCUSSION

In this study, we showed that Hsp104 and ClpB, in the absence of Hsp70 or DnaK, exhibit differing substrate preferences. By using chimeras of Hsp104 and ClpB domains we found that Hsp104 NBD-1 largely imparted the substrate specificity of Hsp104. The importance of NBD-1 in substrate binding and translocation has been demonstrated for many Clp/Hsp100 chaperones, including ClpX, ClpA, ClpB, and Hsp104, where it has been found that conserved tyrosines in the channel facing pore loops directly interact with substrates (Lum et al., 2004, 2008; Schlieker et al., 2004; Weibezahn et al., 2004; Hinnerwisch et al., 2005; Martin et al., 2008; Tessarz et al., 2008; Doyle et al., 2012). However, it is not clear what is uniquely different between NBD-1 of Hsp104 and NBD-1 of ClpB that is responsible for the substrate specificity that we observed. The NBD-1 pore loops of Hsp104 and ClpB are highly conserved suggesting additional residues in NBD-1 are potentially involved in substrate specificity. These additional substrate interactions may be with other residues in the central channel of NBD-1 or with residues in NBD-1 that are transiently exposed due to ATP-dependent conformational changes. Our results are consistent with a previous study by Tipton et al. that used chimeras of Hsp104 and ClpB to show that prion propagation in yeast requires NBD-1 from Hsp104 and that chimeras with ClpB NBD-1 were unable to support prion propagation (Tipton et al., 2008). Together, these results suggest that NBD-1 is important for substrate specificity of ClpB and Hsp104 in the absence of DnaK/Hsp70.

In our unfolding studies, we observed that ClpB and Hsp104 discriminate between GFP fusion proteins with different polypeptide tags fused at an end. Three of the substrates tested share almost the same 13 C-terminal residues, however, ClpB unfolded one (GFP-X7-H6) at a faster rate than Hsp104 while Hsp104 unfolded two (GFP-X30-H<sup>6</sup> and GFP-X42-H5) faster than ClpB (**Figure 2**). These results suggest that either the length or the secondary structure of the recognition tag may affect the rate of substrate unfolding. In gel filtration studies monitoring substrate binding to ClpB and Hsp104, we observed a direct correlation between the rate of substrate unfolding by ClpB and Hsp104 and the stability of substrate interaction with the chaperone. However, Michaelis-Menten analysis of unfolding assays using three different substrates indicated there was only a 2-fold difference or less in binding affinities between Hsp104 and substrate or ClpB and substrate. The process of substrate unfolding is comprised of multiple steps including substrate recognition and binding, translocation and release, and the differences observed between Hsp104 and ClpB in substrate unfolding are likely due to more than just variances in sequence recognition. Additionally, the stability of the substrate and of the ClpB or Hsp104 hexamer are likely important for the overall substrate unfolding process.

The studies presented here using chimeras of Hsp104 and ClpB indicate that the N-domain of Hsp104 and ClpB does not affect the substrate discrimination observed with the two substrates tested. Previous studies addressed the role of the ClpB N-domain in substrate binding and unfolding and showed that the N-domain of ClpB is important for stabilizing ClpB and interaction with substrate (Nagy et al., 2010; Doyle et al., 2012; Rosenzweig et al., 2015). It was also shown that the N-domain directly interacts with substrates via a substrate-binding groove, and this interaction was nucleotide independent (Rosenzweig et al., 2015). Therefore, substrate interaction with the N-domain is different than the nucleotide-dependent binding observed between substrate and the NBD-1 pore loops (Schlieker et al., 2004; Weibezahn et al., 2004; Zolkiewski, 2006; Lum et al., 2008; Tessarz et al., 2008; Doyle et al., 2012; Rosenzweig et al., 2015). Additionally, previous work indicated that the N-domains may sterically obstruct access to the central channel and impede substrate binding to the pore loops of NBD-1 (Doyle et al., 2012; Nagy et al., 2010; Rosenzweig et al., 2015). In studies examining the role of the Hsp104 N-domain in protein unfolding and remodeling, it was observed that ΔN-Hsp104 was defective in substrate unfolding compared to Hsp104 wild-type, showing a role for the Hsp104 N-domain (Sweeny et al., 2015; Kummer et al., 2016). Therefore, for some substrates it is likely that the N-domain of ClpB/Hsp104 is required for stabilizing the initial interaction between chaperone and substrate and thus is required for the subsequent chaperone activity.

Understanding the mechanism of the intrinsic chaperone activity of ClpB/Hsp104 is providing the groundwork for understanding the more complex and biologically important reaction carried out by ClpB/Hsp104 in physical and functional collaboration with DnaK/Hsp70.

#### AUTHOR CONTRIBUTIONS

DJ, MM, JH, SW, and SD designed the experiments. DJ, MM, JH, and SD performed the experiments. All authors were involved in data interpretation and discussion. SW and SD wrote the manuscript with contributions from JH.

#### FUNDING

This research was supported by the Intramural Research Program of the NIH, NCI, Center for Cancer Research.

#### REFERENCES

protein disaggregation. J. Mol. Biol. 427, 312–327. doi: 10.1016/j.jmb.2014.10.013

reactivation efficiency. Proteins 80, 2758–2768. doi: 10.1002/prot.24159


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Johnston, Miot, Hoskins, Wickner and Doyle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Hsp78 (78 kDa Heat Shock Protein), a Representative AAA Family Member Found in the Mitochondrial Matrix of Saccharomyces cerevisiae

Josielle Abrahão, David Z. Mokry and Carlos H. I. Ramos \*

Chemistry Institute, University of Campinas, Campinas, Brazil

ATPases associated with diverse cellular activities (AAA+) form a superfamily of proteins involved in a variety of functions and are characterized by the presence of an ATPase module containing two conserved motifs known as Walker A and Walker B. ClpB and Hsp104, chaperones that have disaggregase activities, are members of a subset of this superfamily, known as the AAA family, and are characterized by the presence of a second highly conserved motif, known as the second region of homology (SRH). Hsp104 and its homolog Hsp78 (78 kDa heat shock protein) are representatives of the Clp family in yeast. The structure and function of Hsp78 are reviewed and the possible existence of other homologs in metazoans is discussed.

#### Edited by:

James Shorter, University of Pennsylvania, United States

#### Reviewed by:

Umesh K. Jinwal, University of South Florida, United States Rina Rosenzweig, Weizmann Institute of Science, Israel

> \*Correspondence: Carlos H. I. Ramos

cramos@iqm.unicamp.br

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

> Received: 28 March 2017 Accepted: 08 August 2017 Published: 23 August 2017

#### Citation:

Abrahão J, Mokry DZ and Ramos CHI (2017) Hsp78 (78 kDa Heat Shock Protein), a Representative AAA Family Member Found in the Mitochondrial Matrix of Saccharomyces cerevisiae. Front. Mol. Biosci. 4:60. doi: 10.3389/fmolb.2017.00060

Keywords: ATPases associated with diverse cellular activities, disaggregase, heat shock protein, molecular chaperones, protein folding and misfolding, heat shock protein 78 (HSP78), ClpB

# INTRODUCTION

ATPases associated with diverse cellular activities (AAA+) form a superfamily of proteins involved in a variety of functions, from DNA replication to protein degradation (for reviews see Patel and Latterich, 1998; Sauer et al., 2004; Snider and Houry, 2008; Zolkiewski et al., 2012). Proteins belonging to the AAA+ superfamily are characterized by the presence of an ATPase module, 200–250 residues long, that contains two highly conserved motifs known as Walker A and Walker B, both of which interact with the bound nucleotide (**Figure 1A**). The Walker A motif (also known as the P-loop) is primarily responsible for binding ATP and has the consensus sequence GXXXXGK(T/S) (Walker et al., 1982), in which X is any residue and the final position is either a threonine or a serine. The Walker B motif is involved in hydrolyzing the bound nucleotide and has the consensus sequence hhhhDE, in which h is a hydrophobic residue (Hanson and Whiteheart, 2005). Additional motifs are sensor 1, a polar residue (usually asparagine), and sensor 2 (usually an arginine residue); both are important for ATPase activity (Takahashi et al., 2002; Hanson and Whiteheart, 2005).
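The two consensus sequences translate naturally into regular expressions, which is a common first-pass way to locate candidate Walker motifs in a protein sequence. The sketch below is a minimal illustration under stated assumptions: the set of residues treated as "hydrophobic" (A, V, I, L, M, F, W, C) is a choice made for this example rather than a definition from the text, the helper name is hypothetical, and the test sequence is a made-up toy peptide, not a fragment of any real protein.

```python
import re

# Walker A (P-loop): G, any four residues, G, K, then T or S.
WALKER_A = re.compile(r"G.{4}GK[TS]")
# Walker B: four hydrophobic residues followed by D and E.
# The hydrophobic residue set is an assumption of this sketch.
WALKER_B = re.compile(r"[AVILMFWC]{4}DE")


def find_motif(pattern, sequence):
    """Return (1-based start position, matched residues) for each
    non-overlapping match of a compiled pattern in a one-letter sequence."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Hypothetical toy sequence containing one instance of each motif.
toy_sequence = "MKTAGPTGVGKTELAKLIVVDEIEKA"
walker_a_hits = find_motif(WALKER_A, toy_sequence)
walker_b_hits = find_motif(WALKER_B, toy_sequence)
```

A pattern match only flags a candidate. Short motifs like these also occur by chance, so hits are normally checked against alignments or structural context before being interpreted.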

Moreover, a subset of the AAA+ superfamily, known as the AAA family, possesses a highly conserved motif called the second region of homology (SRH), which is ∼15 residues long and has an arginine (arginine finger) involved in interunit interaction (**Figure 1A**) (Lupas and Martin, 2002). The AAA family is very large, including several Clp members that are involved in remodeling target proteins (Hanson and Whiteheart, 2005). Among the Clp proteins within the AAA family are ClpB and Hsp104, well-known chaperones with disaggregase activities that can solubilize aggregates (for reviews see Shorter, 2008; Zolkiewski et al., 2012; Mokry et al., 2015). These aggregates are soluble or insoluble non-physiological associations of misfolded proteins, formed via exposed hydrophobic regions, and are strongly correlated with diseases (for reviews see Ramos and Ferreira, 2005; Chiti and Dobson, 2006; Doyle et al., 2013; Knowles et al., 2014).

Additionally, even though members of the ClpB/Hsp104 subfamily are not essential under non-stress conditions, they confer protective qualities against diverse forms of stress. ClpB from the bacterium Escherichia coli (EcClpB) and Hsp104 from the yeast Saccharomyces cerevisiae (ScHsp104) are about 43% identical (Sanchez and Lindquist, 1990; Squires et al., 1991; Krzewska et al., 2001a; Figure S1). Despite these proteins both having two nucleotide binding domains (NBDs), called NBD1 and NBD2, they have limited homology between them (Schirmer et al., 1996). Indeed, ScHsp104, one of the best-characterized Hsp100s, was identified more than 20 years ago as a stress-induced chaperone vital for tolerance to heat and ethanol stresses, and some heavy metals (Sanchez and Lindquist, 1990; Parsell et al., 1991; Lindquist and Kim, 1996). This protein is localized in the cytoplasm and plays a major role in the modification and dissolution of heat-denatured protein aggregates (Parsell et al., 1994; Glover and Lindquist, 1998; Bösl et al., 2006).

It is important to note that the ClpB/Hsp104 subfamily is not able to recover and refold the majority of protein substrates without the cooperation of the Hsp70 refolding system (Glover and Lindquist, 1998). This interaction is stringently specific since disaggregation is contingent on the presence of Hsp70 and Hsp100 from the same species (Glover and Lindquist, 1998). Notably, another AAA representative found in S. cerevisiae is the mitochondrial 78 kDa heat shock protein, known as Hsp78. In this review, we explore some features of Hsp78 and speculate on the possible existence of other homologs in animals/metazoans.

# Hsp78

#### Sequence and Structure

As a representative AAA member localized to the mitochondrial matrix of S. cerevisiae, Hsp78 has a characteristic signal peptide at the N-terminus required for its proper subcellular localization (Leonhardt et al., 1993). This protein is 811 residues long (predicted mass of 91,336 Daltons) and shares more identity with EcClpB (about 49%) than with ScHsp104 (about 42%), likely due to its mitochondrial origin. Also, Hsp78 is shorter than both proteins because it is truncated at the N-terminus, which is involved in substrate binding in other homologs. The two NBDs span residues 98 to 344 (NBD1) and 467 to 658 (NBD2; Figure S1). Within these domains, the two ATP binding sites are located at residues 143 to 150 and 541 to 548. The region responsible for substrate binding, located in the first AAA domain, is well conserved among these chaperones, notably Tyr251 (ClpB numbering; Figure S1), which is required for binding as shown by site-directed mutagenesis and crosslinking assays (Schlieker et al., 2004).
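The residue ranges quoted above can be sanity-checked with simple interval arithmetic. The sketch below computes the length of each stated region and confirms that each ATP binding site falls inside its NBD. The coordinates are those given in the text; the helper names are assumptions for this illustration.

```python
# Inclusive residue ranges for Hsp78, as stated in the text.
NBD1 = (98, 344)
NBD2 = (467, 658)
ATP_SITES = ((143, 150), (541, 548))


def span_length(span):
    """Number of residues in an inclusive (first, last) range."""
    first, last = span
    return last - first + 1


def contains(outer, inner):
    """True when the inclusive range `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


nbd_lengths = (span_length(NBD1), span_length(NBD2))
site_lengths = tuple(span_length(site) for site in ATP_SITES)
sites_inside = (contains(NBD1, ATP_SITES[0]), contains(NBD2, ATP_SITES[1]))
```

Each ATP binding site spans eight residues, which matches the length of the Walker A consensus GXXXXGK(T/S) described in the Introduction.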

The NBD1 is primarily responsible for the ATPase activity, since specific mutations in these sites and others can interfere with ATPase activity (Table S1). On the other hand, the NBD2 is required for proper oligomerization. Although the preferred substrate of Hsp78 is ATP, it can also hydrolyze GTP, CTP and UTP, but with a decrease in efficiency ranging from one tenth to one fiftieth that of ATP hydrolysis (Krzewska et al., 2001a). However, it is worth mentioning that these experiments were performed at high nucleotide concentrations.

Despite no high-resolution structure for Hsp78 being available (see Leidhold et al., 2006 for a model structure), its high sequence identity with EcClpB likely implies the two proteins share a strong degree of structural resemblance. **Figure 1B** shows one of the available structures for EcClpB (PDB number 4CIU) from X-ray diffraction, which has a 3.5 Å resolution and covers 727 residues, from 159–247, 253–285, 294–323, 333–430, 441–649, 659–729, and 732–858 (Figure S2). To better understand why the proteins likely share structural resemblance, a model for the 4CIU structure was produced (**Figure 1B**). In this model, all residues that are identical between EcClpB and Hsp78, according to the alignment shown in Figure S1, are colored in black in **Figure 1B**. Clearly, the residues that are identical to Hsp78 occupy several positions and are almost evenly spaced throughout the protein, a strong indication that the proteins may have a similar conformation. It is also important to point out that similar residues were not included in this model, although they may also adopt a similar conformation. Since the structure for the monomer may be similar, it is just as intuitive that the quaternary structure may also be analogous. As a matter of fact, Leidhold et al. (2006) created a model structure for a hexameric Hsp78 and showed that it is very similar to the hexameric ClpB.

In this sense, it is important to note that in the presence of nucleotides, ClpB changes its conformation mainly in the NBD1 domain (**Figure 1C;** for a review see Doyle and Wickner, 2009). Also, Hsp78 oligomerizes to form a hexamer, which influences its ATPase and chaperone activities, although in purified mitochondria smaller oligomers have been identified (Leidhold et al., 2006). Notably, the oligomerization of Hsp78 is dependent upon both protein and ATP concentrations and stoichiometries. In the presence of ATP, Hsp78 elutes with the molecular mass of a hexamer, while it elutes with a much lower apparent molecular mass in the absence of this nucleotide (Krzewska et al., 2001a). Thus, the oligomerization process depends on the concentration of Hsp78, and consequently the protein is more active at higher than at lower concentrations (Krzewska et al., 2001a).

#### Function

Hsp78 is expressed in the mitochondrial matrix of yeast, and its expression increases upon heat shock. Leonhardt et al. (1993) demonstrated that the number of Hsp78 transcripts increased approximately 10-fold when cells were heated for 1 h at 42°C. However, as described for Hsp104, Hsp78 is not essential for cell growth, as Hsp78-deleted yeast are viable (Leonhardt et al., 1993). Moreover, while Hsp104 is important for thermotolerance, Hsp78 appears not to play an important role in tolerance to heat (Sanchez and Lindquist, 1990; Leonhardt et al., 1993). Surprisingly, Hsp78 is capable of partially complementing induced thermotolerance of Hsp104 in an Hsp104 knockout strain when expressed in the cytosol (Schmitt et al., 1996; Table S2).

FIGURE 1 | AAA+ superfamily features. (A) Proteins belonging to the AAA+ superfamily are characterized by the presence of an ATPase module which is 200–250 residues long with two motifs known as Walker A and Walker B. Additional motifs present are sensor 1 and sensor 2. A highly conserved motif, the second region of homology (SRH), defines the AAA family, a subset of the AAA+ family. (B) Structure of EcClpB (PDB number 4CIU; Carroni et al., 2014) from X-ray diffraction, which has a 3.5 Å resolution. Model for the 4CIU structure, in which all residues that are identical between EcClpB and Hsp78, according to the alignment shown in Figure S1, are colored in black. (C) Silhouette shape of hexameric ClpB from cryo-EM. Shapes were drawn using cryo-EM structures from Wendler and Saibil (2010). ClpB forms a ring-shaped hexameric structure in which the NBD1s from each monomer are on the top and the NBD2s are on the bottom. The structures are in the absence (APO; gray) and in the presence of ADP, ATP, or non-hydrolyzable AMPPNP (all in black). Superimposition with the apo form (right) indicates that the main conformational change upon either ATP or non-hydrolyzable AMPPNP binding occurs in the NBD1.

While Hsp78 may have little or no role in conferring cellular thermotolerance, the chaperone plays important functions in the mitochondria. Deletion of Hsp78 is lethal in cells in which the mitochondrial Hsp70 is deleted (Schmitt et al., 1995) or carries specific point mutations (Moczko et al., 1995), which may suggest functional overlap between these two chaperones. In these studies, deficiency in protein import and aggregation in the matrix were detected and eliminated by the expression of Hsp78 (Moczko et al., 1995; Schmitt et al., 1995). The interaction between Hsp70 and Hsp78 has been demonstrated in several studies, and Hsp78 can substitute for some chaperone functions of mitochondrial Hsp70 (Schmitt et al., 1995). The two chaperones combined are more efficient when refolding several substrates, either model or specific mitochondrial proteins (Krzewska et al., 2001b; Germaniuk et al., 2002).

Hsp78 is essential for mitochondrial thermotolerance (maintenance of respiratory competence and genome integrity under severe temperature stress) and the recovery of mitochondrial misfolded proteins after heat shock (Schmitt et al., 1996). Other experiments have demonstrated that survival under conditions in which cell growth depends on mitochondrial respiration is severely affected by the deletion of the hsp78 gene (Schmitt et al., 1996). In this case, deletion caused respiratory incompetence and lesions in mitochondrial DNA (Schmitt et al., 1996). Additionally, the presence of Hsp78 is essential in aggregation and disaggregation assays, which were performed on intact mitochondria in order to resolubilize heat-stress-induced protein aggregates under in vivo conditions (von Janowsky et al., 2006). Hsp78 is also important for the recovery of the normal morphology of yeast mitochondria after severe heat stress, as deletion of Hsp78 delays the recovery (Lewandowska et al., 2006). Under stress conditions, Hsp78 cooperates with other mitochondrial heat shock proteins (Schmitt et al., 1996). This protein cooperates with the mitochondrial Hsp70 system (Hsp70/DnaJ/GrpE) to refold luciferase in in vitro experiments (Krzewska et al., 2001b). It also cooperates with proteolytic systems, such as the Pim1/LON complex (for proteolysis in mitochondria) (Röttgers et al., 2002). In summary, Hsp78 is a member of the Protein Quality Control (PQC) system in the matrix of yeast mitochondria that, together with the proteostatic system, is part of a network of utmost importance that protects cells against misfolding and aggregation (Douglas et al., 2009; Tiroli-Cepeda and Ramos, 2011).

#### Is There a Mammalian Homolog?

Hsp78 is not present in metazoans but there is a gene sometimes referred to as the "ClpB homolog" that has a single nucleotide binding domain containing canonical Walker A and B motifs (**Figure 2A**). This protein is also referred to as Q9H078 (in humans) and ANKCLP in general (Erives and Fassler, 2015). Previous results showed that Q9H078 is capable of hydrolyzing ATP when recombinantly expressed (Wortmann et al., 2015). When only the single nucleotide binding domain is considered, sequence identity with known Hsp104/ClpB members is about 40% (65% similar). However, this sequence identity decreases to about 20% when the entire sequence is considered (**Figure 2B**).

Similar to Hsp78, Q9H078 also has the characteristic signal peptide at the N-terminus for localization to mitochondria (**Figure 2A**), and its expression has been detected in cell lines derived from human and murine tissues (Périer et al., 1995; Kanabus et al., 2015; Saita et al., 2017). Heterologous expression studies in yeast demonstrate that a histidine-tagged 114 N-terminal truncation of Q9H078 is expressed, while expression of the complete or untagged version of the protein is not detected by western blot (**Figure 2C**). This could imply that, in yeast, the protein is localized to the mitochondria, where it is cleaved and subsequently degraded. In support of this, yeast transformed with a C-terminal GFP fusion of Q9H078 show GFP fluorescence precisely where mitochondria are detected, whereas the same fusion lacking the localization signal does not show this pattern (**Figure 2C**). Importantly, the GFP fluorescence in these experiments serves only as a proxy for the presence of Q9H078, since the protein was not directly detected by immunoblot analysis. It is worth noting that Hsp78 is more related to ClpB than to Hsp104 due to its origin in the mitochondria. Therefore, when only considering this particular aspect, Q9H078 should be considered functionally related to Hsp78, not to Hsp104.

Of notable significance, certain mutations in the human Q9H078 gene are associated with a number of pathologies implicated in mitochondrial disorders, one of which can be mimicked in zebrafish but rescued when the native human gene is introduced (Kanabus et al., 2015; Wortmann et al., 2015). The protein also has four ankyrin (two antiparallel α-helices followed by a β-hairpin) domains near the N-terminus, which likely mediate protein-protein interactions (**Figure 2A**). Indeed, it has been shown to associate with ATP2A2 and to be cleaved by the rhomboid protease PARL, which are both involved in apoptosis (Wortmann et al., 2015; Saita et al., 2017). The cleavage site for PARL lies between the cysteine residue at position 126 and the tyrosine residue at position 127, which excises the localization signal while preserving the ankyrin repeats and nucleotide binding domain (Saita et al., 2017).

In any event, the loss of both hsp104 and hsp78 genes in metazoans (Erives and Fassler, 2015) is striking, and opens the debate as to whether one or more genes can have the functions of these chaperones in animals (Mokry et al., 2015; Wortmann et al., 2015). Erives and Fassler (2015) showed that the ANKCLP gene occurs alongside the Hsp104 and Hsp78 genes in choanoflagellates, indicating that it may be a fusion of an N-terminal ankyrin domain and the C-terminal domain of a clp gene. There is a large variety of clp groups, and it is fairly difficult to point out which gene contributed to the C-terminal domain.

Despite its medical importance, ANKCLP appears not to be a bona fide replacement for Hsp78 or Hsp104, as indicated by several factors. One is the fact that the ANKCLP gene lacks the NBD1, which is important for function and ATP-induced conformational changes (**Figure 2B**). Additionally, since the homology between NBD1 and NBD2 is limited (Schirmer et al., 1996), it would be unexpected that the presence of only one could suffice for the entire function. Another point was raised by the work of Erives and Fassler (2015), in which the region upstream of the ANKCLP gene lacks the archetypical heat shock element consensus sequences that allow multimeric binding of the transcription factor Heat Shock Factor 1 (Hsf1). This extragenic region is strictly conserved in organisms with genuine Hsp78 and Hsp104 genes (Erives and Fassler, 2015).

Nonetheless, the investigations discussed here imply that much work has yet to be done to determine whether or not one or more genes can have the functions of Hsp78 and Hsp104 in animals.

# CONCLUSION

Hsp78 is a representative member of the AAA family in the yeast mitochondria. This chaperone has ATPase activity and can oligomerize into a hexamer or smaller oligomers in isolated mitochondria in a concentration-dependent manner. Also, Hsp78 is essential for proper recovery following mitochondrial stress, as the chaperone associates with other Hsps as part of the mitochondrial PQC system. The lack of an Hsp78 homolog in metazoans is enigmatic due to its important role in degradation of fungal mitochondrial proteins. Thus, whether metazoans have completely lost Hsp78/Hsp104 activities remains an open question.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

We thank Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP; 2012/50161-8), Conselho Nacional de Pesquisa e Desenvolvimento (CNPq; 305018/2015-9), and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES; DFATD 88887.125517/2016-00 - 99999.004913/2015-09) for financial support and fellowships.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmolb.2017.00060/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Abrahão, Mokry and Ramos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Structure and Function of p97 and Pex1/6 Type II AAA+ Complexes

Paul Saffert <sup>1</sup> , Cordula Enenkel <sup>2</sup> and Petra Wendler <sup>1</sup> \*

<sup>1</sup> Department of Biochemistry, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany, <sup>2</sup> Department of Biochemistry, University of Toronto, Toronto, ON, Canada

Protein complexes of the Type II AAA+ (ATPases associated with diverse cellular activities) family are typically hexamers of 80–150 kDa protomers that harbor two AAA+ ATPase domains. They form double ring assemblies flanked by associated domains, which can be N-terminal, intercalated or C-terminal to the ATPase domains. The most prominent members of this family include NSF (N-ethyl-maleimide sensitive factor), p97/VCP (valosin-containing protein), the Pex1/Pex6 complex and Hsp104 in eukaryotes and ClpB in bacteria. Tremendous efforts have been undertaken to understand the conformational dynamics of protein remodeling type II AAA+ complexes. A uniform mode of action has not been derived from these works. This review focuses on p97/VCP and the Pex1/6 complex, which both structurally remodel ubiquitinated substrate proteins. P97/VCP plays a role in many processes, including ER-associated protein degradation, and the Pex1/Pex6 complex dislocates and recycles the transport receptor Pex5 from the peroxisomal membrane during peroxisomal protein import. We give an introduction to existing knowledge about the biochemical and cellular activities of the complexes before discussing structural information. We particularly emphasize recent electron microscopy structures of the two AAA+ complexes and summarize their structural differences.

#### *Edited by:*

James Shorter, University of Pennsylvania, United States

#### *Reviewed by:*

Kürşad Turgay, Leibniz University of Hanover, Germany; Yihong Ye, National Institutes of Health, United States

*\*Correspondence:* Petra Wendler petra.wendler@uni-potsdam.de

#### *Specialty section:*

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

> *Received:* 24 February 2017 *Accepted:* 05 May 2017 *Published:* 29 May 2017

#### *Citation:*

Saffert P, Enenkel C and Wendler P (2017) Structure and Function of p97 and Pex1/6 Type II AAA+ Complexes. Front. Mol. Biosci. 4:33. doi: 10.3389/fmolb.2017.00033

Keywords: type II AAA+ ATPases, Pex1, Pex6, p97, cryo electron microscopy

# INTRODUCTION

The conversion of chemical energy in the form of nucleoside triphosphates into mechanical energy is a process utilized by all living cells and associated with a large variety of cellular functions. Proteins of the AAA+ superfamily are often essential parts of such molecular machines. They catalyze the hydrolysis of ATP to ADP resulting in mechanical work on a substrate molecule. To date, at least 80,000 AAA+ domains (Pfam ID: PF00004; Finn et al., 2016) have been identified throughout more than 5400 species covering all kingdoms of life. The protein data bank stores structures of 722 proteins with AAA+ domains. All AAA+ proteins have a structurally conserved nucleotide-binding domain (NBD) in common, usually comprising 200–250 amino acids (AA), which is essentially responsible for ATP binding and subsequent hydrolysis (Wendler et al., 2012). All AAA+ NBDs share a conserved Rossmann fold domain with a 51432 topology of the central β-sheet and a C-terminal alpha helical domain. They contain multiple conserved features including the Walker A motif, Walker B motif (Walker et al., 1982) and the second region of homology (SRH) (Swaffield et al., 1992). The SRH, differentiating the classic AAA proteins from other Walker A/B ATP binding proteins, typically contains Arg-residues, which work in trans in the active hexamer to facilitate ATP-hydrolysis by interacting with the γ-phosphate of ATP bound to the neighboring subunit (Karata et al., 1999; Neuwald et al., 1999). The remaining sequence, N-terminal and C-terminal to the NBD, often shows very little sequence homology among the AAA+ protein family. The AAA+ family can be further classified by the presence of one NBD (type I) or two consecutive NBDs (type II; NBD1 and NBD2). This review will focus on comparing recent EM structures of two prominent candidates of the type II AAA+ class, p97 (Cdc48, yeast homolog) and the Pex1/Pex6 complex, both belonging to the classic clade of AAA+ proteins (Iyer et al., 2004). 
The two complexes are the only known members of the class to be involved in remodeling of ubiquitinated substrates, although none of the ATPases harbors a high affinity ubiquitin interaction domain. Intriguingly, the mode of action of the AAA+ complexes is also highly debated and possibly differs between the two functional and structural homologs. The scope of this review is not to give an exhaustive summary of the structural data or the cellular role of the complexes. There are many excellent reviews on the cellular role, on co-factor binding, and on the structure of p97 (Meyer et al., 2012; Buchberger, 2013; Olzmann et al., 2013; Dantuma et al., 2014; Tang and Xia, 2016; Xia et al., 2016) or Pex1/Pex6 (Fujiki et al., 2012; Waterham and Ebberink, 2012; Grimm et al., 2016). We would like to refer the interested reader to these publications for an in-depth coverage of these topics.

## CELLULAR FUNCTIONS OF THE P97 PROTEIN COMPLEX

P97 is an essential protein with a broad cellular distribution. It makes up 1% of the total cellular protein pool and is one of the most conserved proteins in eukaryotes, emphasizing its importance in cell homeostasis (Wang et al., 2004). Moir et al. first described the yeast variant of p97 in 1982; it was initially associated with cell cycle arrest and thus called Cdc48 (Cell Division Cycle) (Moir et al., 1982). Since then, p97 has been intensively investigated and implicated in a myriad of functions. The best documented role of p97 has been established in the Ubiquitin-Proteasome-System (UPS) for mobilizing target proteins for degradation by the 26S proteasome (Ghislain et al., 1996; Hitchcock et al., 2001; Rape et al., 2001; Ye, 2006; Jentsch and Rumpf, 2007; Stolz et al., 2011), in particular during endoplasmic-reticulum-associated degradation (ERAD) (Ye et al., 2001; Jarosch et al., 2002; Rabinovich et al., 2002; Meusser et al., 2005). Over the last decade p97's critical role has furthermore been associated with cell-cycle regulation (Meyer and Popp, 2008; Meyer et al., 2012) and DNA repair (Meerang et al., 2011; Ramadan and Meerang, 2011). Due to the above mentioned diverse functions, p97 has been termed "the Swiss army knife of cell biology" (Baek et al., 2013) and "a molecular gearbox" in the ubiquitin pathway (Jentsch and Rumpf, 2007). These different tasks are enabled and regulated by many adaptors/clients, which recruit and connect p97 to different cell organelles (Dreveny et al., 2004). Furthermore, the capability of p97 to associate with ubiquitinated substrates as well as ubiquitin-binding adaptors/clients adds to its versatility (Rape et al., 2001; Richly et al., 2005). The N-terminal domain predominantly mediates adaptor/client binding; however, numerous binding partners for the C-terminus have been identified (Buchberger et al., 2015).

So far, at least 40 proteins have been found to interact with p97 in mammalian systems. Intriguingly, most of these factors share common binding motifs and conserved binding modules (Buchberger et al., 2015). Binding and recruitment of adaptors/clients to the N-terminus is facilitated via the ubiquitin related UBX-domain (ubiquitin regulatory X) or UBXL-domain (UBX-like) and the three linear motifs named VIM (VCP interacting motif), VBM (VCP binding motif) or SHP box (Yeung et al., 2008). The number and variety of different interaction motifs suggest a high temporal and spatial regulation of interacting partners. To date, 13 human proteins have been identified possessing an UBX-domain, all of which have been implicated in p97 binding (Yeung et al., 2008; Buchberger et al., 2015). All these proteins compete for the same binding domain in the N-terminus of p97. It has been shown that the nucleotide binding state of p97 can be a discriminating factor for binding different adaptors/clients. The Ufd1/Npl4 heteromeric complex possesses a SHP box (Ufd1) and an UBXL-domain (Npl4), respectively, and recruits substrates of proteasomal degradation or processing pathways to p97. In contrast, p47 possesses an SHP box as well as an UBX-domain and recruits nonproteasomal substrates to p97. Binding of ATP rather than ADP in the first nucleotide-binding-domain increases the association of the Ufd1/Npl4/p97 complex, allowing for competition with p47, thereby regulating the engagement of p97 in either directed proteasomal proteolysis or non-proteasomal proteolysis pathways (Chia et al., 2012). In addition to the interaction with the N-terminal domain, multiple reports highlight the finding that the C-terminal tail of NBD2 is also capable of binding to specific substrates. This interaction has so far been shown for proteins containing a PUB (PNGase/UBA or UBX containing proteins) or PUL (PLAP, Ufd3p, and Lub1p) domain (Allen et al., 2006; Qiu et al., 2010; Chia et al., 2012). 
PUB-domain proteins can bind to the C-terminal PIM-motif (PUB-interacting motif) of p97 whereas the interaction with the PUL-domain is more controversial (Zhao et al., 2009; Qiu et al., 2010). Biochemical and mutational analysis determined the binding site of Cdc48 on the yeast Plaa homolog Doa1 (Zhao et al., 2009), but a crystal structure of the human Plaa with the 10 AA C-terminal peptide of p97 suggests a different binding pocket (Qiu et al., 2010). Intriguingly, the protein UBXD1 can interact with both termini by two independent binding sites, making it a unique cofactor of p97 (Kern et al., 2009). Binding studies are complicated by the high oligomeric organization of p97 with a total of 6 N-termini in the complex, thereby theoretically allowing 6 individual binding partners. The matter is further complicated by the fact that many co-factors possess more than one binding motif and that some of the above described binding sites are overlapping, i.e., VIM/VBM and UBX/UBXL. To summarize, further investigation is needed to understand the entire network of regulatory interactions between different adaptors/clients and p97.

#### ATPASE ACTIVITY OF THE P97 COMPLEX

A unique feature of p97 distinguishing it from the other type II AAA+ proteins is the high level of conservation of the two individual NBDs. Although the two domains share over 40% identity at sequence level, multiple studies have shown that the two domains of p97 contribute differently to the bulk hydrolysis activity. Nucleotide binding by NBD1 accelerates but is not required for p97 oligomerization (Wang et al., 2003). Once the hexamer is formed, full length p97 appears unable to exchange nucleotide in NBD1 at physiological temperatures (Davies et al., 2005). In contrast, NBD2 is the major site of ATP hydrolysis in the p97 complex (Song et al., 2003). Yet, NBD1 possesses an intrinsic hydrolysis activity and has been linked to the heat-shock-induced ATPase activity observed with p97/Cdc48 (Song et al., 2003). Binding studies using ITC performed with purified murine p97 have revealed that NBD1 of p97 has a 90-fold higher binding affinity toward ADP compared to NBD2 (K<sub>d</sub>(NBD1) = 1 µM; K<sub>d</sub>(NBD2) = 90 µM; **Table 1**). Interestingly, the binding affinities of the two domains toward the ATP analog ATPγS are very similar, with K<sub>d</sub> = 2 µM and K<sub>d</sub> = 3 µM for NBD1 and NBD2, respectively (Briggs et al., 2008). ATPγS has proven to be a valuable ATP analog for studying binding kinetics while excluding residual hydrolysis activity. In contrast to other analogs, it shows binding kinetics similar to those of ATP. Intriguingly, even with saturating amounts of nucleotide only 9–10 of the feasible NBDs are occupied (Briggs et al., 2008). ATPγS binding to the NBD2 ring of p97 shows a Hill coefficient of 3–4, implying that 3–4 protomers of the complex positively cooperate upon ATP binding. It has to be highlighted that different groups have determined diverging K<sub>m</sub> values for ATP hydrolysis ranging from 3 to 620 µM (Meyer et al., 1998; DeLaBarre et al., 2006; Briggs et al., 2008; Nishikori et al., 2011; Niwa et al., 2012). Most of these studies have investigated steady-state kinetics. The variations can be partly accounted for by the different homologs and conditions utilized. 
Several explanations for coordinated ATPase activity in the AAA+ ring have been proposed. For the NBD2 ring of p97, concerted ATP hydrolysis seems unlikely since most reports mention unequal nucleotide occupancy in the ring. The concerted model states that all six NBD2 domains bind, hydrolyze and release nucleotide with identical parameters. It is more plausible to assume that ATP hydrolysis in the NBD2 ring follows a "binding change" mechanism, i.e., binding of ATP to one NBD2 positively influences binding of ATP, ADP release or hydrolysis in the adjacent NBD2. Thus, ATP hydrolysis would proceed in a rotary manner. The positive cooperativity of ATP binding observed in several p97 species favors this mechanism over possible random ATP hydrolysis (DeLaBarre et al., 2006; Briggs et al., 2008; Nishikori et al., 2011).
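To make the binding numbers in this section more tangible, the sketch below evaluates the Hill equation, θ = [L]^n / (K^n + [L]^n), for the dissociation constants and Hill coefficients quoted above. It is only an illustration under stated assumptions: each site is treated as independent of the other domain, the reported K<sub>d</sub> of 3 µM is reused as the half-saturation constant for the cooperative case, and the ligand concentrations are arbitrary. The function name is hypothetical and is not taken from any of the cited studies.

```python
def hill_occupancy(ligand_um: float, k_half_um: float, n: float = 1.0) -> float:
    """Fractional site occupancy from the Hill equation.

    theta = [L]^n / (K^n + [L]^n); n = 1 reduces to simple one-site binding.
    Concentrations are in micromolar.
    """
    if ligand_um < 0 or k_half_um <= 0 or n <= 0:
        raise ValueError("ligand must be >= 0; k_half and n must be > 0")
    ratio = (ligand_um / k_half_um) ** n
    return ratio / (1.0 + ratio)


# ADP binding with the Kd values quoted in the text (non-cooperative, n = 1).
adp_um = 10.0  # arbitrary illustrative concentration
print("NBD1 occupancy:", round(hill_occupancy(adp_um, 1.0), 3))
print("NBD2 occupancy:", round(hill_occupancy(adp_um, 90.0), 3))

# ATPgammaS binding to the NBD2 ring: compare no cooperativity (n = 1)
# with a Hill coefficient in the reported 3-4 range (n = 3.5 is an assumption).
for ligand in (1.5, 3.0, 6.0):
    plain = hill_occupancy(ligand, 3.0, n=1.0)
    coop = hill_occupancy(ligand, 3.0, n=3.5)
    print(f"[L] = {ligand} uM: n=1 -> {plain:.2f}, n=3.5 -> {coop:.2f}")
```

With a Hill coefficient above 1 the occupancy switches from low to high over a much narrower concentration range around the half-saturation point, which is the behavior the positive cooperativity described in the text implies.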

It appears that the type of nucleotide bound to NBD1 has a vital role in the overall ATPase activity. As mentioned above, the NBD1 is mainly responsible for efficient hexamerization in a nucleotide-dependent manner. However, a Walker B mutation in NBD1 (NBD1<sup>E305Q</sup>), trapping the domain in a permanent ATP-bound state, results in a 2-fold decrease in the ATPase activity while having the same apparent K<sub>m</sub> value for ATP (DeLaBarre et al., 2006; Nishikori et al., 2011). Interestingly, this mutation in the yeast homolog Cdc48 causes a lethal phenotype (**Table 1**). The ATP-trapped Walker B mutant in Caenorhabditis elegans Cdc48 affects the overall ATPase activity and negatively influences the cooperative ATPase activity in NBD2 (Nishikori et al., 2011). It has to be mentioned that other mutations inhibiting Cdc48 NBD1 ATPase activity, e.g., Arg-fingers, do not cause a severe phenotype and do not disturb the ATPase activity in the NBD2 ring (Esaki and Ogura, 2010; Nishikori et al., 2011). The effect of NBD1 mutations is controversial and still highly debated, as there are contradictory findings possibly due to the differences between the investigated homologs. Although challenging, measurements of ATPase activity in different mutants have provided valuable evidence for exchange of information between NBDs in one protomer and between neighboring domains in the NBD rings (Briggs et al., 2008; Nishikori et al., 2011; Chou et al., 2014). However, the in vivo function is most likely regulated by substrate interaction in combination with specific adaptors/clients and there are only few studies that investigate complex activity in the presence of both. DeLaBarre and co-workers have demonstrated that adding a specific substrate of p97, i.e., the cytoplasmic fragment of Synaptotagmin (Syt1), can increase the basal ATPase activity approximately 4-fold (DeLaBarre et al., 2006). 
It will thus be interesting and necessary to correlate structural studies with biochemical results, both in the presence and absence of substrate.
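The statement that the NBD1 Walker B mutant shows a 2-fold lower ATPase activity with an unchanged apparent K<sub>m</sub> can be read through the Michaelis–Menten rate law, v = V<sub>max</sub>[S] / (K<sub>m</sub> + [S]). The sketch below is a simplification and not the analysis used in the cited studies: it ignores the cooperativity discussed above, and the V<sub>max</sub> values (relative units) and the K<sub>m</sub> of 100 µM are hypothetical placeholders chosen inside the reported 3–620 µM range.

```python
def michaelis_menten(substrate_um: float, vmax: float, km_um: float) -> float:
    """Steady-state rate v = Vmax * [S] / (Km + [S]); concentrations in micromolar."""
    if substrate_um < 0 or vmax < 0 or km_um <= 0:
        raise ValueError("substrate and vmax must be >= 0; km must be > 0")
    return vmax * substrate_um / (km_um + substrate_um)


KM_UM = 100.0        # hypothetical apparent Km, identical for both variants
VMAX_WT = 1.0        # wild type, relative units (assumption)
VMAX_MUTANT = 0.5    # 2-fold lower Vmax for the Walker B mutant (assumption)

for atp_um in (10.0, 100.0, 1000.0):
    v_wt = michaelis_menten(atp_um, VMAX_WT, KM_UM)
    v_mut = michaelis_menten(atp_um, VMAX_MUTANT, KM_UM)
    # With equal Km, the mutant/wild-type ratio is the same at every [ATP].
    print(f"[ATP] = {atp_um} uM: WT {v_wt:.3f}, mutant {v_mut:.3f}, "
          f"ratio {v_mut / v_wt:.2f}")
```

Under these assumptions the mutant rate is half the wild-type rate at every ATP concentration, which is what distinguishes a pure V<sub>max</sub> effect from a change in apparent substrate affinity.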

# CELLULAR FUNCTION OF THE PEX1/PEX6 PROTEIN COMPLEX

Research on peroxisomes (formerly called microbodies) started in the mid-1950s, but the term "peroxisome" was only introduced in 1966 when microbodies were discovered to be important sites of hydrogen peroxide metabolism (De Duve and Baudhuin, 1966). It was not until 1978 that Paul Lazarow described the β-oxidation of fatty acids occurring in peroxisomes (Lazarow, 1978). Further research has established that peroxisomes are also responsible for bile acid biosynthesis, plasmalogen biosynthesis, and compartmentalized catalase and glutathione S-transferase activity (reviewed in Morel et al., 2004; Schrader and Fahimi, 2004; Wanders and Waterham, 2006). So far, over 70 distinct proteins have been found in or associated with mammalian peroxisomes. In humans, at least 14 of those proteins are involved in peroxisome biogenesis (Braverman et al., 2016). Mutations in any of these so-called peroxins and in particular in Pex1 and Pex6 have been reported to cause peroxisome biogenesis disorders (PBD), a spectrum of fatal rare diseases (Geisbrecht et al., 1998; Waterham and Ebberink, 2012).

Pex1 and Pex6 were first described about a decade later than p97, in 1991 and 1993, respectively (Erdmann et al., 1991; Spong and Subramani, 1993). The Pex1 and Pex6 ATPase domains were instantly identified to be homologous to previously described domains in p97 and N-ethyl-maleimide sensitive factor (NSF), leading to the affiliation of Pex1 and Pex6 with the growing group of ATPases associated with diverse biological activities (Erdmann et al., 1991; Spong and Subramani, 1993). It is generally accepted that the import of peroxisomal proteins, which in contrast to proteins of other organelles are exclusively nuclear encoded, is an ATP-driven process. Thus, being so far the only peroxins with a characterized ATPase activity elevates the importance of Pex1 and Pex6 in the overall homeostasis of peroxisomes.


TABLE 1 | Summary of the conserved motifs as well as the effects of reported mutations in these motifs of the type II AAA+ proteins *Homo sapiens* (hs) p97, *Saccharomyces cerevisiae* (sc) Cdc48, (Continued)



Their relevance is further highlighted by the fact that 60% and 16% of all cases of PBDs are caused by mutations in Pex1 and Pex6, respectively (Waterham and Ebberink, 2012). Initially, Pex1 and Pex6 were believed to perform complementary functions as two independent type II AAA+ proteins. This hypothesis was consolidated by findings reporting partial rescue of certain Pex1 and Pex6 phenotypes when Pex6 and Pex1 were overexpressed, respectively. However, it was established that Pex1 and Pex6 form a heteromeric complex in an ATP- and Mg<sup>2+</sup>-dependent manner (Faber et al., 1998; Geisbrecht et al., 1998; Tamura et al., 1998). Although a hexameric structure has been proposed ever since the discovery of the direct interaction, it was not until 2012 that Saffian and co-workers presented evidence for a 700 kDa hexamer with a 1:1 stoichiometry (Saffian et al., 2012). The first structures, confirming the formation of a hexamer with alternating Pex1/Pex6 dimers, were published in 2015 (**Figure 1**; Blok et al., 2015; Ciniawsky et al., 2015; Gardner et al., 2015).

## ATPASE ACTIVITY OF THE PEX1/6 COMPLEX

In contrast to p97, closer examination of the amino acid sequence reveals only 28–30% identity and substantial sequence variations between the two NBDs in Pex1 and Pex6. Both proteins contain a weakly conserved Walker A and Walker B motif in the first NBD (**Figure 1A**). In the Pex1 Walker B motif, the conserved glutamate residue involved in ATP hydrolysis is replaced by asparagine in yeast and by aspartate in humans (Beyer, 1997; Kiel et al., 1999). The acidic residues of the Pex6 NBD1 Walker B motif and the two arginine fingers of Pex1 NBD1 are absent in most eukaryotic model organisms (Ciniawsky et al., 2015). The arginine fingers of Pex6 NBD1 are only partially conserved in the yeast protein and not conserved in the human protein. In summary, NBD1 of Pex1 and Pex6 are expected to bind nucleotides, but most probably have no ATPase activity. NBD2 of both Pex1 and Pex6, on the other hand, shows all the characteristic conserved features, including the Walker A motif, the Walker B motif, as well as the SRH, suggesting that the NBD2 of Pex1/Pex6 is responsible for the ATPase activity of the entire complex. Interestingly, neither of the NBDs of yeast Pex1 or Pex6 can oligomerize on its own (Birschmann et al., 2005). Multiple analyses have underlined the importance of the NBD2 for the overall ATPase activity of the yeast complex (Blok et al., 2015; Ciniawsky et al., 2015; Gardner et al., 2015). However, the ATPase activity is not equally split between Pex1 and Pex6 NBD2. When a Walker B mutation is introduced into Pex1 NBD2, the overall ATPase activity is reduced to 50–80% of the WT activity in yeast. Despite the mutation, cells are still able to grow on oleate as a sole carbon source, indicating functional peroxisomes (Ciniawsky et al., 2015). Thus, ATP hydrolysis in Pex1 NBD2 is not essential for Pex1/6 function. 
When the same mutation is introduced in NBD2 of Pex6 the ATPase activity of the complex is completely abrogated (Ciniawsky et al., 2015; Gardner et al., 2015), emphasizing the cooperativity between the peroxins in the hexamer. In contrast to p97, very little in vitro data is available describing Pex1 and Pex6 ATPase activity.

The only data published so far show an apparent Km for ATP binding to yeast Pex1/Pex6 ranging from 0.17 to 0.69 mM (Saffian et al., 2012; Gardner et al., 2015), orders of magnitude different from the binding affinity of p97 toward ATP.
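To get a feel for what an apparent Km in this range implies, the following minimal sketch evaluates the Michaelis-Menten saturation term at a few ATP concentrations. It assumes simple hyperbolic kinetics without cooperativity, which has not been established for Pex1/Pex6; the function name `fractional_velocity` and the ATP concentrations are illustrative choices, not values from the cited studies.

```python
def fractional_velocity(atp_mM: float, km_mM: float) -> float:
    """v/Vmax for simple Michaelis-Menten kinetics: [S] / (Km + [S])."""
    if atp_mM < 0 or km_mM <= 0:
        raise ValueError("ATP concentration must be >= 0 and Km must be > 0")
    return atp_mM / (km_mM + atp_mM)


# Apparent Km range for yeast Pex1/Pex6 quoted in the text (mM).
KM_LOW_MM, KM_HIGH_MM = 0.17, 0.69

if __name__ == "__main__":
    # Hypothetical ATP concentrations, chosen only to span the Km range.
    for atp_mM in (0.1, 1.0, 5.0):
        at_low_km = fractional_velocity(atp_mM, KM_LOW_MM)
        at_high_km = fractional_velocity(atp_mM, KM_HIGH_MM)
        print(f"{atp_mM} mM ATP: v/Vmax between {at_high_km:.2f} and {at_low_km:.2f}")
```

By construction, the enzyme runs at half of its maximal rate when the ATP concentration equals Km, so the fourfold spread in reported Km values translates into a noticeable spread in predicted activity at sub-millimolar ATP.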

#### INTERACTION BETWEEN THE PEX1/6 COMPLEX AND PEX5 OR PEX26

Two potential interacting partners have been identified for Pex1/6: Pex5 and Pex26 (Pex15 in yeast; Birschmann et al., 2003; Platta et al., 2005, 2008; Tamura et al., 2006). Pex5 recognizes proteins that carry a peroxisomal targeting sequence and delivers them to the peroxisomal membrane. It is found not only in the cytosol but is also incorporated into the membrane of peroxisomes, where it is proposed to form a temporary protein-conducting channel (Erdmann and Schliebs, 2005). The Pex1/6 complex is responsible for the ATP-dependent recovery of ubiquitinated Pex5 from the peroxisomal membrane (Platta et al., 2005), although direct binding between Pex1/6 and Pex5 has not been reconstituted so far. It is plausible that the Pex1/Pex6 complex either recognizes ubiquitinated Pex5 directly or, like p97, needs adaptor proteins to interact with Pex5. Similar to p97, ubiquitin or ubiquitin-like domains are suspected to bind to one of the four N-terminal domains of the complex. Intriguingly, Pex1 and Pex6 each harbor two N-terminal domains, N1 and N2, each similar to the N-terminus of p97 and NSF (**Figure 1B**) (Shiozawa et al., 2004; Blok et al., 2015). Pex26 is permanently anchored to the peroxisomal membrane and has been shown to bind to the N-terminus of Pex6 when both Pex6 NBDs are ATP bound (Matsumoto et al., 2003). Dissociation from the membrane anchor is mediated by ATP hydrolysis in Pex6 NBD2 in yeast and human cells. Intriguingly, it has been shown that the interaction with the cytosolic part of Pex15 (the yeast homolog of Pex26) greatly reduces the ATPase activity of the NBD2 domains (Gardner et al., 2015). This suggests that the ATPase activity is spatially and temporally regulated to ensure energy-efficient retraction of Pex5 from the membrane only upon substrate binding to Pex1/6. The exact mechanism remains elusive and requires a more detailed mechanistic and structural understanding of the complex. 
Current models interpreting the collective data suggest that the ATPase activity of the Pex1/Pex6 complex results in either partial or complete unfolding of the membrane-anchored Pex5, thereby releasing the protein to be refolded or degraded by the 26S proteasome (Platta et al., 2005, 2008). This process is similar to p97-assisted ERAD, where p97 is required for extraction of luminal as well as membrane proteins from the ER upon their labeling with ubiquitin. The extracted protein is subsequently degraded by the 26S proteasome. Similar to the Pex6-Pex26 interaction, p97 requires membrane-embedded ERAD components to be localized to the ER. While p97 can extract a variety of ubiquitinated proteins from the ER, Pex1/Pex6 has been shown to be involved in the extraction of membrane-bound Pex5 only, although Pex5 and p97 substrates share ubiquitin as a common recognition motif.

# EM STRUCTURES OF YEAST PEX1/6

Until 2015, the only available structural information on the Pex1/6 complex was a crystal structure of the N1 fragment (amino acids 13–179) of mouse Pex1 (Shiozawa et al., 2004). Despite a low sequence identity of 22% between the mouse p97 and Pex1 N-terminal fragments, both structures share the same double ψ-β barrel fold (**Figure 1B**). In 2015, three groups reported the first electron microscopy (EM) structures of the yeast Pex1/6 complex (**Figure 1C**; Blok et al., 2015; Ciniawsky et al., 2015; Gardner et al., 2015). The overall layout of the complex is identical in all three studies. Pex1 and Pex6 form a heterohexamer composed of a trimer of Pex1/6 dimers. Due to an irregular arrangement of the Pex1/6 N-terminal domains, the complex has a triangular appearance. Yet, the NBD1 and NBD2 domains form hexameric rings, which are stacked on top of each other and both of which contain a central pore of varying diameter. Two of the three structure analyses used negative stain EM and obtained 3D reconstructions of 17–23 Å resolution (Ciniawsky et al., 2015; Gardner et al., 2015). The third analysis used cryo EM to solve Pex1/6 structures in the presence of ATPγS and ADP, yielding 7.2 and 8.8 Å resolution, respectively (**Figure 1C**). None of the structures is of sufficient resolution to allow for unambiguous assignment of the nucleotide bound in the binding pockets, and therefore heterogeneous binding cannot be ruled out. The studies that used negative stain EM investigated the Pex1/6 structure in the presence of different nucleotides (ATPγS, ADP or ATP, Gardner et al., 2015; ATPγS, ATP, ADP, ADP-AlFx, Ciniawsky et al., 2015) or mutations (Pex1/6DWB, Pex1/6WB, Pex1WB/6, Ciniawsky et al., 2015). **Figure 1C** shows the EM structures of all groups in the presence of ATPγS or ADP. In one case, complex formation in the presence of ADP was poor, leading to a small EM dataset and a poorly defined map (Ciniawsky et al., 2015). 
We therefore include the EM map obtained in the presence of the post-hydrolysis transition state analog ADP-AlFx instead of ADP in the comparison (**Figure 1C**). The reconstructions in the presence of ATPγS have a very similar overall architecture. Despite the difference in resolution between maps obtained by cryo EM and negative stain, overlay of the structures demonstrates that the domain orientation is almost identical (**Figure 1D**). When homology models of the NBD1 and NBD2 domains are fitted as rigid bodies into the low resolution maps, the secondary structure elements overlay well with visible alpha helices and beta sheets of the higher resolution maps.

All three studies have in common that very little to no density for Pex1-N1 can be detected in the EM maps, suggesting that this domain is flexibly attached to Pex1-N2. Only one study shows significant Pex1-N movements between pre-hydrolysis and post-hydrolysis states at low resolution (Ciniawsky et al., 2015), which hint at a directed movement of the Pex1 N-terminus. The cryo EM reconstructions show few structural differences between the two examined nucleotide states, making it impossible to deduce the functional dynamics of nucleotide hydrolysis. The negative stain reconstructions, on the other hand, show significant structural changes between the different examined nucleotide states. Between the ADP and ATPγS bound structures, Gardner and colleagues report (1) a rotation of the NBD2 ring relative to the NBD1 ring, (2) a rearrangement of the nucleotide binding domains in NBD1 and NBD2, and (3) a narrowing of the NBD2 pore. Ciniawsky and colleagues also show distinct NBD movements depending on the nucleotide or mutation present. In particular, a downward rotation of NBD2 is observed when ADP-AlFx or ADP is present or when the NBD2 of either Pex1 or Pex6 carries a Walker B mutation, presumably inducing a post-hydrolysis state in some NBD2 binding sites. Furthermore, the structures suggest a nucleotide-dependent contact between NBD2 of Pex6 and NBD1 of Pex1. Finally, the study shows that Pex1 NBD2 is locked in a post-hydrolysis state when Pex6 NBD2 is permanently bound to ATP, indicating that Pex1 NBD2 can undergo one round of ATP hydrolysis under these circumstances. Since all studies demonstrate that the Pex1/6 complex has no ATP hydrolysis activity when NBD2 of Pex6 carries a Walker B mutation (**Table 1**), these structural results suggest that Pex1 NBD2 is unable to release ADP or bind ATP in this Pex6 NBD2 Walker B mutant.

In summary, the recently published EM structures of the yeast Pex1/6 complex agree on the overall architecture of the complex, but a common mode of action cannot be deduced from these works. While some studies observe domain movements in the whole complex (Ciniawsky et al., 2015; Gardner et al., 2015), others detect no movements whatsoever upon ATP hydrolysis (Blok et al., 2015). In particular, domain movements in the N-terminal domains and NBD2 are reminiscent of nucleotide-dependent movements observed for p97.

### CRYO EM STRUCTURES OF EUKARYOTIC AND ARCHAEAL P97

p97 is by far the best studied and best characterized ATPase of the AAA+ protein family. Hence, the visualization of the p97 complex at high resolution under the most native conditions possible has been a pressing issue over the past years. The organization of the protein complex has been intensively studied using many different structural techniques, including X-ray crystallography, small angle X-ray scattering (SAXS) and cryo EM. The Protein Data Bank alone contains numerous structures of p97. Despite this, the nucleotide-dependent dynamics of the complex remain elusive, because full length X-ray crystallographic structures reveal very few structural changes between AAA+ assemblies in different nucleotide bound states. In brief, three different models for p97 segregation activity are currently considered (Buchberger, 2013): (i) threading of substrate molecules through a central channel of the complex formed by aromatic residues in the pore loops, (ii) substrate processing by aromatic residues in the interior of the NBD2 ring without substrate passage through the NBD1 pore, and (iii) large scale movements of the N-terminal, substrate binding domains. Cryo EM structures of p97 deposited prior to 2016 are all resolved to ∼15 Å resolution and, although they indicated nucleotide-dependent structural re-arrangements, they were of too low resolution to elucidate p97 molecular interactions. The recent advances in electron detection and image processing in cryo EM led to new efforts in structure elucidation of p97. In 2016, three groups independently published considerably improved cryo EM structures of p97/VAT (Banerjee et al., 2016; Huang et al., 2016; Schuller et al., 2016). **Figure 2** shows cryo EM reconstructions of mouse p97, human p97 and an archaeal p97, called VAT, obtained in the presence of ADP or ATPγS. 
The resolution of the mouse and archaeal p97 structures was determined to be 6–9 Å, whereas near-atomic resolution structures of 2.4–3.3 Å resolution were obtained for human p97. A distinct density for the nucleotide in the binding pocket can only be seen in the cryo EM maps of human p97. It is important to mention, however, that the structures were obtained by applying six-fold symmetry and thus do not provide information about asymmetric nucleotide occupancy in

FIGURE 2 | Cryo EM reconstructions of p97/VAT obtained in the presence of ATPγS or ADP. Top view (upper row), side view (middle row) and cut open side view (lower row) surface representations of EM maps (white) fitted with the respective p97 model are shown. The color code is as follows: p97 NBD1, red; p97 NBD2, blue. The table lists the electron microscopy database (EMD) accession codes, the nucleotide present during data collection, the resolution obtained, the symmetry applied during refinement, the source organism, the number of nucleotides bound to the complex, and the reference for each EM reconstruction.

the ring, which has been observed in other studies (Briggs et al., 2008; Schuller et al., 2016). The eukaryotic p97 structures both show a pronounced movement of the N-terminal domains and a rotation of the NBD2 domains upon ATPγS binding (Barthelme and Sauer, 2016; Schuller et al., 2016). Mouse p97 binds 10 ATPγS in the complex and the asymmetric reconstruction indicates that not all N-terminal domains are resolved due to flexibility. N-terminal domains that are visible are rotated by ∼90° and shifted by ∼12.5 Å in comparison to the ADP bound state. The dataset of ATPγS bound human p97 was subjected to 3D classification and gave three distinct classes (Banerjee et al., 2016). The resulting structures suggest that the conformational change associated with ATPγS binding can be broken down into two steps. First, the NBD2 domains bind ATPγS, leading to a pivot-like movement of the NBD2 domains that narrows the NBD2 pore. In a second binding event, the NBD1 domains are also occupied by ATPγS, lifting the N-termini from a position coplanar to the NBD1 domains to a position significantly above the NBD1 ring, as seen for mouse p97. Furthermore, binding of ATPγS leads to a stabilization of the C-terminal peptide from residue 763–768 in human p97. Intriguingly, this observation matches recent X-ray crystallographic data of ATPγS bound human p97, showing that R766 directly contacts the gamma phosphate of the nucleotide bound to the neighboring subunit in the ring (Hanzelmann and Schindelin, 2016). Mutational analysis of R766 indicates that this residue is involved in nucleotide binding and regulation of the complex's catalytic function. Altogether, the structural differences observed between eukaryotic ADP and ATPγS bound p97 EM structures agree with models (ii) and (iii), but substrate threading is most likely impossible due to a very narrow pore in the NBD1 ring. 
Large scale movements of the p97 N-terminal domains were previously reported in a crystallographic study of the N-NBD1 fragment carrying disease associated point mutations (Tang et al., 2010). The cryo EM structures of full length mouse and human p97 confirm that these conformational changes occur in solution upon ATPγS binding (Banerjee et al., 2016; Schuller et al., 2016).

The archaeal p97/VAT reconstructions of Huang and colleagues are the best resolved structures of this complex so far, as no atomic resolution structures of VAT are available to date. Despite having almost 50% identity at sequence level, human p97 and VAT differ in their biochemical and structural properties. VAT has been shown to associate directly with the archaeal proteasome (Barthelme et al., 2014) and to utilize pore loop residues in both NBDs for substrate remodeling (Gerega et al., 2005), suggesting substrate threading through the central pore. Structurally, the ATPγS bound EM map of VAT resembles the ADP bound structures of p97 with regard to positioning of the N-terminal domains as well as the rotation of the NBD2 domain (**Figure 2**; Huang et al., 2016). However, the ADP bound map of VAT differs considerably from all other p97 structures. It shows a split-washer-like, spiral arrangement of the subunits in the double ring that connects NBD1 of one protomer along the seam with NBD2 of the other protomer forming the seam. Similar arrangements have been observed for other remodeling AAA+ complexes, such as ClpA, Hsp104, and rubisco activase (Guo et al., 2002; Stotz et al., 2011; Yokom et al., 2016). In contrast to the eukaryotic structures, the VAT cryo EM structures agree with the mechanistic model (i), although the pore residues are arranged in a helical opening. In this case, ATP hydrolysis relocates the NBD1 domains from a co-planar arrangement in the presence of ATPγS to the spiral arrangement in the presence of ADP, exerting differential pulling forces on various parts of the substrate (Huang et al., 2016). A nucleotide-dependent movement of the N-terminal domains is not observed for VAT. A recent follow-up study describes the substrate-engaged ΔN VAT hexamer in different nucleotide bound states, confirming the translocation of the substrate through the central pore of ΔN VAT (Ripstein et al., 2017). 
Mechanistically, unfolding is proposed to be mediated by processive hand-over-hand substrate binding within the ring.

#### STRUCTURAL DIFFERENCES BETWEEN P97 AND PEX1/6 EM RECONSTRUCTIONS

Side-by-side comparison of the ATPγS bound p97 and Pex1/6 EM structures of similar resolution shows that the outer diameter of the NBD rings is almost identical (**Figure 3A**). Nonetheless, the inner pore in the NBD1 ring is closed in all p97 structures, while Pex1/6 NBD1 rings form around an inner pore of ∼20 Å. The pore diameter in NBD2 is ∼10 and ∼30 Å for p97 and Pex1/6, respectively. It should be mentioned that the density in the NBD2 domains of the cryo EM structures of Pex1/6 is fragmented and that the negative stain maps indicate pore diameters of only ∼10 Å for the NBD2 ring. The close arrangement of the NBD1 domains in the recent cryo EM reconstructions of eukaryotic p97 agrees with the suggestion that this complex does not thread substrates through the central axis across the length of the barrel and that substrate can only access the NBD2 pore loops by entering and exiting through the NBD2 pore end (**Figure 3B**). In contrast, the Pex1/6 complex provides a sizable central channel through the entire structure and would thus be consistent with a substrate threading mechanism. The Pex1/6 complex also distinguishes itself from p97 in the overall domain arrangement. While p97 NBD1 and NBD2 are located almost on top of each other in the ATPγS bound state of the complex, the NBDs of Pex1 and Pex6 show a staggered arrangement (**Figures 3A,C**). Accordingly, the orientation and relative location of the N2 domain of Pex1 in the ring differs from that of the p97 N-terminal domain. When NBD1/2 protomers of p97 and Pex1/6 bound to ATPγS are superimposed on their NBD2 domains, the relative orientation of the AAA+ domains to each other becomes apparent (**Figure 3D**), revealing that Pex1 NBD1 is rotated and shifted outwards with regard to p97 NBD1. This difference in domain arrangement possibly leads to the formation of a bigger central pore in the NBD1 ring of Pex1/6. 
It should be noted that Gardner and colleagues observe some nucleotide-dependent rotation of the Pex1/6 AAA+ rings against each other (Gardner et al., 2015), although the staggered arrangement persists in all nucleotide states. Whether the two AAA+ domains adopt a stacked or staggered arrangement in the hexamer might be influenced by the peptide linker that connects the AAA+ domains. Typically, the linker region is not very well conserved and the sequence can differ greatly between different type II AAA+ complexes. So far, no prevalent signal transduction pathway has been found that involves the linker between the AAA+ domains or the linker that connects the N-terminal domains to NBD1. However, the linker region between the N-terminal domain and NBD1 of p97 has been shown to be prone to disease-associated mutations (Watts et al., 2004), which seem to trap p97 in a state comparable to the ATPγS bound state (Tang et al., 2010; Tang and Xia, 2013), presumably by preventing movement of the N-terminal domains. Furthermore, an atomic resolution cryo EM structure of human p97 bound to ADP as well as the allosteric inhibitor UPCDC30245 (Banerjee et al., 2016) indicates that some degree of flexibility at the interface of the AAA+ domains is needed for the rotational movement of NBD2. This nucleotide-dependent, rotational movement in NBD2 has been observed for human and mouse p97 as well as for yeast Pex1/6 (**Figure 4**). In all cases, nucleotide hydrolysis triggers a rotation of the NBD2 domain that moves the substrate binding loops from a central position to a position closer to the C-terminal opening of NBD2. Interestingly, the most common missense allele in human Pex1, Pex1G843D, causes an amino acid exchange at the interface between NBD1 and NBD2 in Pex1, possibly disturbing complex dynamics. In summary, comparison between the cryo EM maps of p97 and Pex1/6 reveals variations in stacking of the tandem AAA+ domains in the complexes, possibly resulting in a larger central channel formed by Pex1/6. The EM structures of eukaryotic p97 exclude substrate threading through the entire length of the AAA+ double layer, due to a very narrow NBD1 pore. The yeast Pex1/6 structures, on the other hand, support such a mechanistic model, with some of them showing a nucleotide-dependent downward rotation of the NBD2 domains (Ciniawsky et al., 2015). 
The structural studies on Pex1/6 have also demonstrated that the Pex1 N-terminal domains and in particular the Pex1 N1 domain are flexible. Thus, a mechanistic model, whereby Pex1/6 segregates or dislocates substrate protein by movement of the N-terminal domains is also plausible.

# OUTLOOK

Despite a variety of new structural information on type II AAA+ complexes, such as p97 and Pex1/6, we still have a rather static view of these multiprotein assemblies. In order to elucidate their nucleotide-dependent dynamics, structural analysis of the complexes in different nucleotide bound states, biochemical findings and cell biological data need to be interconnected. Structure determination of these heterogeneous complexes to below ∼4 Å resolution, to allow for identification of the nucleotide bound to the NBDs, is still a challenging task. The derivation of a common mode of action is further complicated by divergent structural and biochemical features of homologous proteins. Thus, a comprehensive characterization of each AAA+ complex is needed before we can distinguish between specialization and similarities in force generation of different AAA+ proteins. Some of the many questions that remain to be answered for p97 and Pex1/6 complexes are (1) how variations in the structure of the AAA+ domain translate into functional differences, (2) what the molecular basis of substrate remodeling is, (3) how substrate binding to the complex influences AAA+ activity, and (4) how the interplay between orientation of the N-terminal domains and AAA+ binding status is regulated.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

We thank the Deutsche Forschungsgemeinschaft (DFG grant WE4628/1 to PW) for financial support.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Saffert, Enenkel and Wendler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Mighty "Protein Extractor" of the Cell: Structure and Function of the p97/CDC48 ATPase

Yihong Ye<sup>1</sup> \*, Wai Kwan Tang<sup>2</sup> , Ting Zhang<sup>1</sup> and Di Xia<sup>2</sup> \*

<sup>1</sup> Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, United States, <sup>2</sup> Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, United States

p97/VCP (known as Cdc48 in S. cerevisiae or TER94 in Drosophila) is one of the most abundant cytosolic ATPases. It is highly conserved from archaebacteria to eukaryotes. In conjunction with a large number of cofactors and adaptors, it couples ATP hydrolysis to segregation of polypeptides from immobile cellular structures such as protein assemblies, membranes, ribosome, and chromatin. This often results in proteasomal degradation of extracted polypeptides. Given the diversity of p97 substrates, this "segregase" activity has profound influence on cellular physiology ranging from protein homeostasis to DNA lesion sensing, and mutations in p97 have been linked to several human diseases. Here we summarize our current understanding of the structure and function of this important cellular machinery and discuss the relevant clinical implications.

#### Edited by:

Walid A. Houry, University of Toronto, Canada

#### Reviewed by:

Alexander Buchberger, University of Würzburg, Germany Thorsten Hoppe, University of Cologne, Germany

#### \*Correspondence:

Yihong Ye yihongy@mail.nih.gov Di Xia xiad@mail.nih.gov

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

> Received: 10 January 2017 Accepted: 22 May 2017 Published: 13 June 2017

#### Citation:

Ye Y, Tang WK, Zhang T and Xia D (2017) A Mighty "Protein Extractor" of the Cell: Structure and Function of the p97/CDC48 ATPase. Front. Mol. Biosci. 4:39. doi: 10.3389/fmolb.2017.00039

Keywords: AAA ATPase, p97/VCP, Cdc48, chaperones, protein denaturation, protein quality control, neurodegenerative diseases

p97/Cdc48 belongs to the AAA+ (extended family of ATPases associated with various cellular activities) ATPase family, which functions generally as essential chaperones to promote protein folding or unfolding. Cdc48 was initially identified in S. cerevisiae as a cell cycle regulator, which upon inactivation, leads to a cell cycle arrest at the G2-M transition stage (Moir et al., 1982). A mammalian homolog of 97 kDa was later discovered and dubbed as p97 or valosin-containing protein precursor (VCP) (Koller and Brownstein, 1987). In Drosophila, the name TER ATPase (transitional endoplasmic reticulum ATPase) has been used given the partial localization of this enzyme to the endoplasmic reticulum (ER) surface (Zhang et al., 1994, see below). In this review, we use p97 and Cdc48 to refer to the mammalian and yeast homologs, respectively.

As a type II AAA+ ATPase, p97/Cdc48 has two AAA ATPase domains designated as D1 and D2 (**Figure 1A**). These two domains are connected by a short polypeptide linker (D1–D2 linker). Although the ATPase domains are highly similar in sequence and structure, they have distinct functions: while the D1 domain is required for hexameric assembly of p97, the D2 domain is a major contributor to the overall ATPase activity (see below, Song et al., 2003; Wang et al., 2003). In addition, p97/Cdc48 has a sizable N-terminal domain (N-domain) that is linked to the D1 domain by a flexible polypeptide segment (N-D1 linker). At the C-terminus, a short tail is appended to the D2 domain. The interaction of p97/Cdc48 with its partners is mostly mediated by the N-domain, but a few proteins bind p97/Cdc48 using its C-terminal tail (Ogura and Wilkinson, 2001; Buchberger et al., 2015).

As a soluble protein, p97 is primarily localized in the cytosol, but a fraction is present on organelle membranes including the endoplasmic reticulum (ER), Golgi, mitochondria, and endosomes (Acharya et al., 1995; Latterich et al., 1995; Rabouille et al., 1995; Xu et al., 2011; Ramanathan and Ye, 2012). How p97/Cdc48 is recruited to different membranes is largely unclear, but this process is probably mediated by adaptors on different organelles, as demonstrated for the ER (Christianson and Ye, 2014). A fraction of p97/Cdc48 is also localized in the nucleus (Madeo et al., 1998), where it assists various chromatin-associated processes or nuclear protein quality control (PQC) (see below).

In multicellular organisms, the expression of p97 is ubiquitous. In humans, the transcription of p97 was moderately upregulated in some cancers, and the level of p97 mRNA appears to correlate with cell sensitivity to cell death induced by a potent p97 inhibitor, a potential anti-cancer drug (Anderson et al., 2015). More recently, genetic studies revealed that mutations in p97 may be causal to several human diseases including IBMPFD (Inclusion Body Myopathy associated with Paget's disease of the bone and Frontotemporal Dementia) and amyotrophic lateral sclerosis (ALS) (Xia et al., 2016). These findings stimulated a flurry of investigations on p97 substrates whose "mis-handling" by p97 mutants may have caused abnormality in human physiology.

Most p97/Cdc48 substrates identified to date are conjugated with ubiquitin and targeted for degradation by the 26S proteasome, but a few exceptions exist (Ramadan et al., 2007; Wilcox and Laney, 2009; Ndoja et al., 2014). A key feature of the p97/Cdc48-assisted degradation system is that many cofactors or adaptors are capable of recognizing ubiquitin conjugates (Ye, 2006). Some p97 cofactors are enzymes that can add or remove ubiquitin conjugates, but most of them, regardless of whether or not possessing a ubiquitin binding motif, seem to serve an adaptor function that links this ATPase to a specific subcellular compartment or substrate.

# STRUCTURE OF P97

p97 forms a stable hexameric structure with two concentric rings (**Figures 1B,C**): the N-D1 ring has the N-domains laterally attached and therefore has a larger radius (Peters et al., 1990; Zhang et al., 2000; DeLaBarre and Brunger, 2003, 2005; Huyton et al., 2003; Davies et al., 2008; Banerjee et al., 2016; Schuller et al., 2016). A similar ring-shaped structure was observed for various IBMPFD mutants (Tang et al., 2010; Tang and Xia, 2012, 2013) and for wild-type p97 that is in complex with cofactors or adaptors (Dreveny et al., 2004; Ewens et al., 2014; Hanzelmann and Schindelin, 2016a). The hexameric assembly of p97 is dependent on the D1 domain, but is stable in the absence of nucleotide (Wang et al., 2003).

As in all AAA+ ATPases, the AAA module of p97/Cdc48 consists of a characteristic helical domain and a highly conserved RecA-like domain (**Figure 1A**). The RecA-like domain features a nucleotide-binding site at the interface between two adjacent subunits. In this configuration, arginine-finger residues (R359 and R635 for the D1 and D2 ring, respectively) can promote nucleotide hydrolysis by engaging the γ-phosphate of ATP that is bound to an adjacent subunit. In addition, the active site contains a Walker A motif [P-loop, G(x)4GKT, where x is any residue] for nucleotide binding and a Walker B motif (hhhhDE, where h represents a hydrophobic residue) for nucleotide hydrolysis (Ogura and Wilkinson, 2001).
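The motif shorthand used here maps directly onto regular expressions. The sketch below is a minimal illustration, assuming the consensus exactly as written in the text (Walker A `G(x)4GKT`, Walker B `hhhhDE`). The set of residues counted as hydrophobic and the toy sequence are assumptions made for this example, not data taken from a real protein entry, and practical motif searches usually tolerate more variation than these strict patterns.

```python
import re

# Walker A (P-loop): G, any four residues, then GKT.
WALKER_A = re.compile(r"G.{4}GKT")

# Walker B: four hydrophobic residues followed by DE.
# Which residues count as "hydrophobic" is an assumption of this sketch.
HYDROPHOBIC = "AILMFVWY"
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}DE")


def find_motifs(sequence: str) -> dict[str, list[tuple[int, str]]]:
    """Return 0-based start positions and matched text for each motif."""
    seq = sequence.upper()
    return {
        "walker_a": [(m.start(), m.group()) for m in WALKER_A.finditer(seq)],
        "walker_b": [(m.start(), m.group()) for m in WALKER_B.finditer(seq)],
    }


# Hypothetical toy sequence, invented for illustration only.
TOY_SEQUENCE = "MASGPPGTGKTLLAKAVAIIFIDEIDAL"

if __name__ == "__main__":
    for name, hits in find_motifs(TOY_SEQUENCE).items():
        print(name, hits)
```

Strict patterns like these would miss degenerate motifs, such as a Walker B motif in which the conserved glutamate is replaced, so a failed match should be read as "deviates from the consensus" rather than "cannot bind nucleotide".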

# NUCLEOTIDE BINDING AND HYDROLYSIS

Purified p97 hydrolyzes 1–5 ATP molecules per hexamer per second in vitro (Meyer et al., 1998; Song et al., 2003; Ye et al., 2003; Tang and Xia, 2013). The ATPase activity of p97 can be influenced by factors such as temperature, the position of the N-domain, and adaptor binding (Meyer et al., 1998; Song et al., 2003; DeLaBarre et al., 2006; Niwa et al., 2012; Zhang X. et al., 2015; Bulfer et al., 2016). Importantly, two recent reports showed that the ATPase activity of p97 and CDC48 can be activated moderately by a ubiquitinated model substrate (Blythe et al., 2017; Bodnar and Rapoport, 2017), consistent with genetic studies demonstrating that ATP hydrolysis is indispensable for all documented p97 functions (Kobayashi et al., 2002; Ye et al., 2003; Dalal et al., 2004; Raman et al., 2011; Xu et al., 2011, 2016).

Nucleotide binding to p97 has been measured by isothermal titration calorimetry (ITC) (Briggs et al., 2008; Tang et al., 2010) or by surface plasmon resonance (SPR) (Chou et al., 2014). Although there is a 10-fold difference in measured affinities, the relative affinity of D1 and D2 to nucleotide is comparable between these methods. For isolated wild-type p97, the D1 and D2 domains bind ADP with K<sub>d</sub> of ∼1 µM and ∼80 µM, respectively, but the affinity for ATP and ATPγS is about the same (∼2 µM) for these domains (Briggs et al., 2008). A remarkable observation, though not yet fully appreciated, is the existence of pre-bound or occluded ADP in the D1 domains, which may regulate the asymmetric movement of the N-domain (Tang et al., 2010; Tang and Xia, 2016a). Davies and colleagues first reported using chemical denaturation experiments that about half of the D1 sites in wild-type p97 hexamers are pre-occupied by ADP (Davies et al., 2005). It was subsequently shown that the D1-bound ADP molecules are difficult to remove in vitro, raising concerns about interpreting results from various in vitro ATP binding and hydrolysis experiments (Briggs et al., 2008; Tang et al., 2010).
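To give a feel for what the ∼80-fold difference in ADP affinity means, the sketch below evaluates a simple one-site binding isotherm, occupancy = [L] / ([L] + K<sub>d</sub>), using the dissociation constants quoted above. This is an idealized illustration only: it assumes independent, identical sites and a free ligand concentration equal to the stated value, and it ignores the pre-bound, hard-to-remove ADP in D1 that the text flags as a complication. The function name and the ligand concentrations are assumptions chosen for the example.

```python
def fractional_occupancy(ligand_uM, kd_uM):
    """One-site binding isotherm: fraction of sites occupied at equilibrium."""
    if ligand_uM < 0 or kd_uM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_uM / (ligand_uM + kd_uM)


# Approximate ADP dissociation constants quoted in the text (Briggs et al., 2008).
KD_ADP_UM = {"D1": 1.0, "D2": 80.0}

# ASSUMPTION: free ADP concentrations chosen arbitrarily for illustration.
for adp_uM in (1.0, 10.0, 100.0):
    d1 = fractional_occupancy(adp_uM, KD_ADP_UM["D1"])
    d2 = fractional_occupancy(adp_uM, KD_ADP_UM["D2"])
    print(f"[ADP] = {adp_uM:6.1f} uM  D1 occupancy = {d1:.2f}  D2 occupancy = {d2:.2f}")
```

Under these simplifying assumptions, D1 is half-saturated at roughly 1 µM ADP whereas D2 requires roughly 80 µM to reach the same occupancy.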

In vitro studies showed that the two ATPase domains of p97 are not functionally equivalent, as the D2 domain reportedly displays a higher ATPase activity than D1 (Song et al., 2003). Whether the D1 and D2 rings work independently or communicate with each other during the ATP hydrolysis cycle has been studied extensively, though the results reported are not always consistent. By measuring the activity of each ring while inhibiting the other, an early report suggested that the two ATPase rings operate independently (Song et al., 2003), but others showed evidence of inter-ring communications (Beuron et al., 2003; Ye et al., 2003; Chou et al., 2014). Moreover, intricate allosteric communication between ATPase domains within the same ring has been suggested (Nishikori et al., 2011; Hanzelmann and Schindelin, 2016b). These interactions are thought to coordinate domain movement during the ATP hydrolysis cycle.

# NUCLEOTIDE-DEPENDENT CONFORMATIONAL CHANGES

The conformational dynamics of p97 have been elusive, in part owing to difficulties in studying its structure under physiologically relevant in vitro conditions. The issue is further complicated by the occluded D1 nucleotide, which excludes other nucleotides from the same site. Furthermore, structural studies by crystallography often require proteins in different asymmetric units to take a similar conformation, but the six ATPase domains are not synchronized in nucleotide binding and hydrolysis. Despite these challenges, conformational changes of p97 have been intensively pursued by both cryo-EM and X-ray crystallography. Early cryo-EM studies revealed moderate rotational movement between the two ATPase rings upon ATP hydrolysis as well as closure and opening of the D1 or D2 central channel (Rouiller et al., 2002). Other domain movements were also noted (Beuron et al., 2003). However, due to limited resolution, these studies failed to generate a consistent model. The issue was revisited more recently with the application of newer technologies. One study using high-speed atomic force microscopy showed a conformational change in CDC48.1, a C. elegans p97 homolog, which involves rotation of the N-D1 ring back and forth relative to the D2 ring following D2 ATP hydrolysis (Noi et al., 2013). Likewise, another study by single-particle cryo-EM reported two nucleotide-dependent conformations, differentiated by inter-ring rotation of approximately 22° (Yeung et al., 2014).

Crystallographic studies initially suggested that nucleotide-dependent conformational changes might take place only during the D2 ATP hydrolysis cycle because D1 appeared to be constantly occupied by ADP (Zhang et al., 2000; DeLaBarre and Brunger, 2003, 2005; Huyton et al., 2003; Davies et al., 2008). To date, the most significant structural change associated with the D2 ATPase cycle is the opening of the D2 pore and the inter-ring rotation mentioned above, but whether the D2 pore opening is triggered by nucleotide binding or hydrolysis is unclear (Rouiller et al., 2002; Davies et al., 2005, 2008; Pye et al., 2006; Banerjee et al., 2016; Hanzelmann and Schindelin, 2016b; Schuller et al., 2016). Additionally, part of the D2 domain also undergoes an order-to-disorder transition (DeLaBarre and Brunger, 2005).

It has only become clear recently that the D1 domain in p97 can also hydrolyze ATP under physiological conditions. Studies using a D2-specific p97 ATPase inhibitor demonstrated that the D1 domain contributes significantly (∼30%) to the overall ATPase activity (Chou et al., 2014; Anderson et al., 2015). Because genetic evidence showed that certain Cdc48 D1 mutants cannot rescue the growth defect of Cdc48 temperature-sensitive alleles despite carrying an intact D2 domain, the D1 domain clearly has an important function (Ye et al., 2003; Nishikori et al., 2011).
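As a rough back-of-the-envelope illustration, the ∼30% D1 contribution can be combined with the in vitro turnover of 1–5 ATP per hexamer per second reported earlier in this article. The sketch below is arithmetic only and rests on a loud assumption: that the ∼30% share, measured with a D2-specific inhibitor, can be applied directly to turnover values obtained in other studies under other conditions. The function name `split_hexamer_rate` is hypothetical.

```python
def split_hexamer_rate(total_rate, d1_fraction=0.30):
    """Split a per-hexamer ATP turnover (ATP/s) into D1-ring and D2-ring shares."""
    if total_rate < 0:
        raise ValueError("total_rate must be non-negative")
    if not 0.0 <= d1_fraction <= 1.0:
        raise ValueError("d1_fraction must lie between 0 and 1")
    d1_rate = total_rate * d1_fraction
    return d1_rate, total_rate - d1_rate


# Lower and upper bounds of the turnover range quoted in the text.
for total in (1.0, 5.0):
    d1_rate, d2_rate = split_hexamer_rate(total)
    # Each ring has six sites, so divide by 6 for an average per-site rate.
    print(
        f"total {total:.1f} ATP/s per hexamer: "
        f"D1 ring {d1_rate:.2f} ({d1_rate / 6:.3f} per site), "
        f"D2 ring {d2_rate:.2f} ({d2_rate / 6:.3f} per site)"
    )
```

The per-site averages are shown only to emphasize how slow the basal cycle is; they do not imply that the six sites in a ring fire at equal rates, since the text describes unsynchronized hydrolysis.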

Whether ATP hydrolysis by D1 is essential for p97 function has been a controversial issue. Nevertheless, D1-dependent conformational changes have been extensively sought by various biophysical approaches and were recently reported by several groups. In retrospect, a major obstacle in studying D1-dependent conformational changes was the presence of substoichiometric amounts of tightly bound ADP in the D1 nucleotide-binding site (Davies et al., 2005; Tang and Xia, 2013). One strategy to circumvent this problem in crystallographic studies is to use p97 mutant proteins bearing amino acid substitutions found in IBMPFD (Inclusion Body Myopathy associated with Paget's disease of the bone and Frontotemporal Dementia syndrome) patients (Kimonis et al., 2000). When purified, the D1 domain in these mutants can efficiently bind to exogenously added nucleotides, allowing crystallographic studies of conformational changes that occur during the D1 ATPase cycle. Strikingly, compared to structures in which D1 is in the ADP-bound state (Down-conformation, **Figure 2A**), in the presence of the ATP analog ATPγS in D1, the N-domain undergoes a hinged upswing (Up-conformation, **Figure 2B**) (Tang et al., 2010; Xia et al., 2016). A similar conformational change was seen with wild-type p97 in solution by small-angle X-ray scattering (SAXS) (Tang et al., 2010). As it turns out, the difference between wild-type and mutant p97 is that in the mutants all six N-domains undergo a uniform conformational change, allowing X-ray crystallographic studies, whereas in wild-type p97 only a fraction of the six subunits have the N-domains in the Up-conformation (Tang and Xia, 2016a). Thus, unsynchronized nucleotide binding and hydrolysis seems to be a common feature for both D1 and D2, which might be functionally relevant to the observed asymmetric adaptor binding to the p97 N-domain (Buchberger et al., 2015).

The above-mentioned conformational changes in the N-domain were lately confirmed by cryo-EM studies. One study found p97 in three different, co-existing states in the presence of ATPγS in solution: one has ADP bound to all 12 sites and the N-domains in the Down conformation; the second, also in the Down conformation, has the six sites in the D1-ring and the six sites of the D2-ring occupied by ADP and ATPγS, respectively; in the third case, all 12 sites contain ATPγS and now the N-domains are held in the Up-conformation (Banerjee et al., 2016). It should be noted that while the EM densities for the D1 and D2 domains are well defined, those for the N domains are not, particularly for the one with full occupancy of ATPγS. The poor density for the N-domains suggests disorder or multiple conformations. Indeed, in another study, carefully sorted images of wild-type p97 prepared in the presence of AMP-PNP showed that even different protomers within a single hexameric p97 molecule display significant asymmetric domain movement, resulting in a random distribution between the Up- and Down-conformations in solution (Schuller et al., 2016). The nucleotide-dependent Up and Down conformational switch of the N domain in the context of the N-D1 fragment was also confirmed recently by NMR (Schuetz and Kay, 2016).

#### MECHANISM OF FORCE GENERATION

A major unresolved issue in the field is how conformational changes in p97 generate the proposed "segregase" activity. To date, the most consistent conformational changes observed are the D2 rotation-accompanied pore opening/closing and the up-and-down swing motion of the N-domain. While the former appears to be linked to the D2 ATPase cycle, the latter is driven entirely by nucleotide hydrolysis in the D1 domain (**Figure 2C**). Force generation presumably requires cooperation between the D1 and D2 rings, which would explain the observed interdomain communications (Beuron et al., 2003; Ye et al., 2003; Chou et al., 2014; Schuetz and Kay, 2016).

[Figure legend fragment] ... likely to be mobile (brown balls). ATP binding to the empty sites of the D1 domains will lead the N-domains to the Up-conformation. Occupation of ATP to the D1 domain renders the cognate D2 domain capable of hydrolyzing ATP, which is labeled with a red \*. The D1 domain probably hydrolyzes ATP once a few D2 domains have been converted to the ADP-bound state.

The force applied onto a substrate may result in partial unfolding of a client protein, and thus disrupt its interaction with protein assemblies, membranes, or chromatin. Although many AAA+ proteins are protein unfoldases (e.g., ClpA and ClpX) that thread polypeptides through a central tunnel (Singh et al., 2000), p97 cannot unfold GFP-ssrA, a model aberrant substrate (Rothballer et al., 2007). By contrast, VAT, a Thermoplasma acidophilum p97 homolog, is capable of unfolding GFP-ssrA with a low efficiency (Gerega et al., 2005). Intriguingly, this unfolding activity can be dramatically enhanced when the N-domain of VAT is deleted (Gerega et al., 2005; Barthelme and Sauer, 2012). N-deleted VAT can also collaborate with the 20S proteasome to degrade GFP-ssrA in vitro (Barthelme and Sauer, 2016). Protein sequence analyses identified a KYYG motif in a D1 loop of VAT, which is replaced by KLAG in p97. When these tyrosine residues are introduced to replace leucine and alanine in a p97 variant lacking the N domains, the variant can unfold GFP-ssrA and target it to the 20S proteasome for degradation (Rothballer et al., 2007; Barthelme and Sauer, 2013). Collectively, these findings indicate that the widely observed cooperation between AAA+ ATPases and the 20S proteasome is an ancient scheme of protein degradation. However, with evolved changes in the N-domain and the D1 ring, p97 appears to have acquired a more sophisticated mechanism to process its substrates. It has been speculated that p97/CDC48 might function as a special "unfoldase," perhaps only with the assistance of ubiquitin molecules conjugated to its substrate. Consistent with this view, the requirement for p97/Cdc48 in protein degradation in vivo can be bypassed if a flexible peptide is fused to the C-terminus of a proteasome substrate (Beskow et al., 2009), suggesting that p97/Cdc48 may initiate protein unfolding to expose a loosely folded segment for subsequent engagement of the proteasome.
More direct proof of the ubiquitin-dependent unfoldase hypothesis came from two recent studies (Blythe et al., 2017; Bodnar and Rapoport, 2017), which used in vitro reconstitution systems to show that both p97 and its yeast homolog CDC48 can unfold GFP, but only when it carries ubiquitin conjugates. As expected, this activity is dependent on the D2 ATPase activity, the cofactors Ufd1 and Npl4, and on the length of the ubiquitin chains on GFP. Intriguingly, D1 ATP hydrolysis does not seem to contribute significantly to GFP unfolding in a single-round GFP turnover assay (Barthelme and Sauer, 2013). However, it appears to be required for substrate release from CDC48 to ensure processivity. Importantly, the study by Bodnar and Rapoport demonstrates, using two polyubiquitinated model substrates, that once ubiquitin chains are partially trimmed, substrates can be completely threaded through the central pore of p97 together with the remaining ubiquitin molecules in a D1 to D2 direction, which results in unfolding of these proteins. The ubiquitin trimming reaction is dependent on an intricate interplay between p97 and its associated deubiquitinase Otu1 (Bodnar and Rapoport, 2017).

#### p97-INTERACTING PROTEINS

Proteomic studies have identified many factors that interact with p97/Cdc48 (Alexandru et al., 2008; Buchberger et al., 2015; Raman et al., 2015). These factors can be categorized either as adaptors, which link p97/Cdc48 to a specific substrate in a subcellular compartment, or as cofactors that facilitate substrate processing. Cofactors usually have enzymatic activities [e.g., N-glycanase, ubiquitin ligase, or deubiquitinase (DUB)] that can alter protein modifiers present on substrates (**Figure 3**).

Some p97/Cdc48-interacting proteins including PLAA/Ufd3, PNGase, HOIP, and Ufd2 bind to the C-terminal appendage of p97/Cdc48 (Rumpf and Jentsch, 2006; Zhao et al., 2007; Qiu et al., 2010; Bohm et al., 2011; Schaeffer et al., 2014; Murayama et al., 2015), but the vast majority bind p97/Cdc48 through its N-domain (**Table 1**) (Buchberger et al., 2015). Sequence analyses have revealed several p97-interacting patterns including VIM (VCP-interacting motif) (Stapf et al., 2011), UBX (ubiquitin regulatory X) (Buchberger et al., 2001; Schuberth and Buchberger, 2008), VBM (VCP-binding motif) (Boeddrich et al., 2006), and SHP box (also known as binding site 1, bs1) (Bruderer et al., 2004). The VCP-interacting motif (VIM) is a linear sequence motif (RX5AAX2R) present in gp78 (Ballar et al., 2006), SVIP (small VCP-inhibiting protein) (Ballar et al., 2007), VIMP (VCP-interacting membrane protein) (Ye Y. et al., 2004), VMS1 (Heo et al., 2010), UBXN6 (Hanzelmann and Schindelin, 2011; Stapf et al., 2011), and ZFAND2B (Stanhill et al., 2006). By contrast, the VBM domain found in proteins such as ataxin-3, Ufd2 and Hrd1 features a polarized sequence motif (RRRRXXYY) (Boeddrich et al., 2006). The SHP box in p47 (Kondo et al., 1997), Ufd1 (Meyer et al., 2000), and Derlin-1 (Lilley and Ploegh, 2004; Ye Y. et al., 2004; Greenblatt et al., 2011), on the other hand, is a short polypeptide segment enriched in hydrophobic residues. Notably, the UBX domain, an 80-residue module structurally related to ubiquitin, is present in a p97/CDC48 adaptor family known as UBX-containing proteins, consisting of 13 members in humans (**Table 1**).
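The VIM consensus quoted above (RX5AAX2R) is written in a compact shorthand in which X stands for any residue and a trailing number gives a repeat count. The sketch below shows one way to expand that shorthand into a regular expression and scan a sequence with it. The helper names and the toy sequence are hypothetical, and the expansion rule is an assumption about the notation; the VBM pattern is deliberately not handled because the text does not define its symbols.

```python
import re


def shorthand_to_regex(shorthand):
    """Expand motif shorthand such as 'RX5AAX2R' into a regex pattern string.

    ASSUMPTION: 'X' means any residue and digits after 'X' are a repeat count.
    """
    def expand(match):
        count = match.group(1)
        return "." if not count else ".{%s}" % count

    return re.sub(r"X(\d*)", expand, shorthand.upper())


def find_motif(sequence, shorthand):
    """Return (1-based start, matched text) for each non-overlapping match."""
    pattern = re.compile(shorthand_to_regex(shorthand))
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


vim_pattern = shorthand_to_regex("RX5AAX2R")
print(vim_pattern)

# Hypothetical toy sequence built to contain a single VIM-like stretch.
toy_sequence = "GSRELLEQAAKLRGS"
vim_hits = find_motif(toy_sequence, "RX5AAX2R")
print(vim_hits)
```

A sequence match alone does not establish p97 binding; it only nominates candidates for experimental or structural follow-up.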

Intriguingly, despite the drastic difference in sequence and structure, many p97-interacting motifs, particularly those interacting with the N-domain, bind p97 in a similar mode. Consequently, the binding of many cofactors/adaptors to p97 is mutually exclusive (Meyer et al., 2000; Rumpf and Jentsch, 2006). These observations suggested the existence of distinct populations of p97 complexes in cells, each bearing a different set of partners. Conceptually, the composition of a p97 complex may not be static in cells. Co-factor exchange could occur, which would allow p97 to efficiently switch substrate to meet cellular demands. A similar "adaptor swapping" model has been proposed for the multi-subunit SCF (Skp1, cullin, and F box) ubiquitin ligase, which like p97, uses a collection of adaptors to engage distinct substrates. In this case, adaptor switch is catalyzed by Cand1, a protein exchange factor that stimulates the equilibrium of Cul1-Rbx1 with multiple F box protein-Skp1 modules (Pierce et al., 2013). Whether a similar regulatory strategy exists for p97/Cdc48 remains to be seen. Furthermore, given that the substrate processing cycle is comprised of two mechanistically distinct reactions, namely substrate binding and release, it is conceivable that a regulated hierarchical cofactor binding system may be coupled to ATP hydrolysis to coordinate these processes (Hanzelmann et al., 2011; Meyer et al., 2012).

Structural studies have revealed the general principles of p97 complex assembly. To date, one of the best characterized p97 complexes is the p47-N-D1 assembly (Dreveny et al., 2004). One crystallographic study showed that the p97 N-domain could be divided into two sub-domains: an N-terminal double ψ-barrel and a C-terminal β-barrel (**Figure 4A**). Between the two subdomains lies a hydrophobic groove surrounded by patches of charged residues, which is the site bound by the UBX domain found in adaptors such as p47 and FAF1. The interaction usually exploits both hydrophobic and electrostatic forces (**Figure 4B**). More recently, a collection of structural studies showed that this cleft could be used to engage other p97-binding motifs. For instance, although VIM is unrelated to the UBX domain in both sequence and structure, they both bind to the p97 N-domain at this location (**Figure 4C**, Hanzelmann and Schindelin, 2011). However, certainly not every N-domain binding protein interacts with p97 in such a manner. An additional surface on the N-domain that binds the SHP box was recently reported (**Figure 4D**, Hanzelmann and Schindelin, 2016a). Given that some p97 adaptors or adaptor complexes contain both UBX and SHP domains (e.g., p47 and the heterodimeric Ufd1-Npl4 complex, **Table 1**), these adaptors may use a bipartite mechanism to form a complex with p97 (Bruderer et al., 2004; Isaacson et al., 2007; Yeung et al., 2008; Le et al., 2016).

Adaptor/cofactor binding to the C-terminus of p97 has also been studied by crystallography. One such structure is the PUB (PNGase/UBA) domain of the peptide-N-glycanase (PNGase) bound by a 10-residue peptide from the p97 C-terminus dubbed the PUB-interacting motif (PIM) (**Figure 4E**, Zhao et al., 2007). PNGase is a sugar-processing enzyme responsible for the removal of N-glycan from misfolded glycoproteins retrotranslocated from the ER (Blom et al., 2004). The PUB domain binds PIM in a 1:1 stoichiometry. In this complex, the PIM peptide binds to a conserved surface of the PUB domain (Allen et al., 2006; Zhao et al., 2007). Intriguingly, the conserved residue Y805 in the PIM motif essential for the interaction can be phosphorylated in cells. This post-translational modification may serve a regulatory function in controlling the p97-PNGase interaction (Zhao et al., 2007). Another example is demonstrated by the structure of a complex containing PLAA (phospholipase A2-activating protein) and the C-terminal peptide of p97 (Qiu et al., 2010). PLAA (also named Ufd3 or Doa1) has been implicated in a variety of cellular processes including processing of misfolded mitochondrial outer-membrane proteins (Wu et al., 2016), ribophagy (Ossareh-Nazari et al., 2010), endosomal trafficking (Ren et al., 2008; Han et al., 2014), and in regulating the cellular ubiquitin level by an unknown mechanism (Johnson et al., 1995). In the structure, Y805 of p97 is once again located at the binding interface, suggesting that phosphorylation-dependent regulation might be a common theme for p97 cofactor interactions (**Figure 4F**).

#### TABLE 1 | p97-interacting proteins.

[Figure legend fragment] ... electrostatic potential surface. The positive potential is in blue, negative in red and neutral in white. (B) Structure of the p97 N-domain in complex with the UBX domain of FAF1 (PDB:3QC8). The N-domain, depicted as a molecular surface overlaid on a ribbon representation, has the N-terminal double ψ-barrel domain colored green and C-terminal β-barrel domain colored red. The UBX domain of FAF1 is depicted as a ribbon diagram in yellow. Critical residues for interaction are shown as ball-and-stick models and labeled. (C) Structure of the p97 N-domain in complex with the VIM motif of gp78 (PDB:3TIW). Here the VIM motif is shown as a helix in yellow and its binding to the N-domain is mostly mediated by charged residues. (D) Structure of the p97 N-domain in complex with the Ufd1-derived SHP peptide (PDB:5C1B). Here the SHP peptide is shown as a stick model in yellow and it binds exclusively to the C-terminal β-barrel domain. (E) Structure of the N-terminal domain of PNGase in complex with a C-terminal peptide of p97 (PDB:2HPL). The PNGase N-terminal domain is shown in cartoon representation in yellow. The bound peptide is shown as a stick model with five residues (labeled) seen in the structure. The carbon atoms are colored in black, nitrogen in blue and oxygen in red. (F) Structure of the PUL domain of PLAA/Ufd3 in complex with a C-terminal peptide of p97 (PDB:3EBB). The PLAA PUL domain is shown in cartoon representation in yellow. The bound peptide is shown as a stick model with four residues visible in the structure. The carbon atoms are colored in black, nitrogen in blue and oxygen in red.

Several p97-adaptor assemblies have also been examined by cryo-EM (Rouiller et al., 2000; Beuron et al., 2006; Pye et al., 2007; Bebeacua et al., 2012). EM studies showed that in the complex of p97 and Ufd1-Npl4 (Pye et al., 2007; Bebeacua et al., 2012), the adaptors bind to both the N- and D1-domains simultaneously. A similar mode of interaction was observed for Fas-associated factor-1 (FAF1) (Ewens et al., 2014).

Whether cofactor binding can cause a conformational change in p97/Cdc48 has not been thoroughly investigated. Structural studies of the adaptor-free p97 N-D1 domain (PDB:1E32) or that bound by p47 (PDB:1S3S) or other adaptors showed no obvious change in the structure of p97 upon adaptor binding (Dreveny et al., 2004). However, adaptor-induced conformational changes may only take place in full-length p97 during a normal ATPase cycle, and thus might have escaped detection so far (Isaacson et al., 2007; Zhao et al., 2007; Qiu et al., 2010; Hanzelmann and Schindelin, 2011; Hanzelmann et al., 2011; Kim et al., 2011; Schaeffer et al., 2014). On the other hand, since ATP-dependent conformational changes, particularly those triggered by ATP binding to the D1 domain, affect the position of the N-domain, the interaction of p97 adaptors with the N-domain can probably be regulated by the nucleotide state of the D1 ring, as suggested by a recent study (Bulfer et al., 2016).

# CELLULAR FUNCTION OF p97/CDC48

Given its substrate diversity, p97 serves a broad range of functions, which have been reviewed extensively (Bug and Meyer, 2012; Dantuma and Hoppe, 2012; Meyer et al., 2012; Yamanaka et al., 2012; Dantuma et al., 2014; Meyer and Weihl, 2014). Due to space constraints, we discuss here only a few of the better characterized molecular processes, with the aim of illustrating the general role of this ATPase in cells.

# ROLES IN PROTEIN HOMEOSTASIS CONTROL

p97/Cdc48 has been implicated in several PQC pathways, and thus is an essential component of the proteostasis regulatory network in eukaryotic cells (Meyer et al., 2012). In general, p97 facilitates the degradation of aberrant proteins by releasing them from cellular structures or large protein complexes. The first identified PQC function for p97 is in ER-associated protein degradation (ERAD), a pathway that eliminates misfolded proteins of the secretory pathway (Smith et al., 2011; Christianson and Ye, 2014; Ruggiano et al., 2014). During ERAD, misfolded proteins are retrotranslocated into the cytosol where they are degraded by the ubiquitin proteasome system. For misfolded luminal proteins, the retrotranslocation process consists of two essential steps. First, a portion of a substrate needs to be moved across the lipid bilayer to enter the cytosol. This reaction is believed to be mediated by a protein retrotranslocation complex containing the multispanning membrane ubiquitin ligase Hrd1 (Bordallo et al., 1998; Bays et al., 2001a; Gauss et al., 2006; Carvalho et al., 2010; Stein et al., 2014; Baldridge and Rapoport, 2016). In the second step, p97/Cdc48 is recruited to the site of retrotranslocation via association with proteins present in the retrotranslocation complex. These include Derlins, Hrd1, and VIMP in mammals or Ubxd2 in S. cerevisiae (Lilley and Ploegh, 2004; Ye Y. et al., 2004; Neuber et al., 2005; Schuberth and Buchberger, 2005). These proteins each bear a p97-interacting motif, and the interactions with p97 allow it to effectively capture substrates emerging from the retrotranslocation channel (Carvalho et al., 2010). Misfolded proteins then undergo ubiquitination and are dislocated from the membranes by p97 (Bays et al., 2001b; Ye et al., 2001, 2003; Braun et al., 2002; Jarosch et al., 2002; Rabinovich et al., 2002; Flierman et al., 2003; Zhong et al., 2004; Garza et al., 2009).
Dislocated ERAD substrates are eventually targeted for degradation by the proteasome (Zhang and Ye, 2014). In addition to ERAD substrates, p97/Cdc48 can also release a few membrane-bound transcription factors without targeting them for degradation (Hitchcock et al., 2001; Rape et al., 2001; Shcherbik and Haines, 2007; Radhakrishnan et al., 2014); instead, these transcription factors are transported into the nucleus to affect gene expression in response to specific stimulating cues.

It has also been demonstrated that p97 can facilitate mitochondria-associated degradation (MAD) by extracting polypeptides from the mitochondrial outer membrane (Heo et al., 2010; Xu et al., 2011; Hemion et al., 2014). This process eliminates aberrant polypeptides from the mitochondrial outer membrane to maintain mitochondrial protein homeostasis. In addition, regulators of the mitophagy pathway (e.g., mitofusin), which turns over damaged mitochondria, can also be subject to degradation by MAD (Tanaka et al., 2010). Upon mitochondrial damage, p97, Ufd1, and Npl4 are recruited to the surface of mitochondria, which is required for clearance of damaged mitochondria by mitophagy (Kimura et al., 2013). The mechanism that recruits p97 to mitochondria in MAD or mitophagy is unclear. One recent study identified a protein named Vms1 (VCP/Cdc48-associated mitochondrial stress-responsive 1) as a potential linker (Heo et al., 2010, 2013), but the role of Vms1 in mitochondrial PQC remains controversial (Esaki and Ogura, 2012). In addition, in S. cerevisiae, a protein named Doa1 (also named Ufd3) can act in conjunction with Ufd1 and Npl4 to recruit substrates to Cdc48 in MAD (Wu et al., 2016).

Another essential PQC function involving p97 is the degradation of aberrant nascent polypeptides stalled on ribosomes in a process dubbed ribosome-associated degradation (RAD) (Brandman et al., 2012; Defenouillere et al., 2013; Verma et al., 2013). Ribosome stalling occurs when an mRNA in translation is defective (e.g., lacking a stop codon, truncated, or damaged in other ways). Such defective mRNAs are rapidly decomposed, but only after they have been "put to the test" for fidelity by translation (Brandman and Hegde, 2016). Thus, the execution of this cellular mRNA surveillance program is inevitably associated with the production of aberrant polypeptides, which need to be effectively removed. Using diverse model substrates, it has been demonstrated that a series of factors act in concert to split a stalled ribosome (Pisarev et al., 2010; Shoemaker et al., 2010; Shoemaker and Green, 2011), allowing another ribosome-associated ubiquitin ligase to ubiquitinate the aberrant nascent polypeptide (Bengtson and Joazeiro, 2010). Subsequently, a ribosome-associated factor named Rqc1 together with the ubiquitinated substrate recruits p97/Cdc48, which in turn extracts defective polypeptides from the ribosome to promote their degradation by the proteasome (Brandman et al., 2012). Accordingly, inactivation of p97/Cdc48 or its cofactors Ufd1 and Npl4 leads to accumulation of ubiquitinated proteins in complex with the 60S ribosome (Verma et al., 2013).

Several recent studies also implicate p97 and Cdc48 in autophagy, which targets unwanted cellular proteins (including misfolded ones) for lysosomal degradation via autophagosomes. However, the precise function of p97 in this process is controversial, mainly because the substrate(s) regulated by p97 is unclear. Several studies suggest p97 as a positive autophagy regulator because its inhibition causes a phenotype reminiscent of what appears to be an autophagosome maturation defect (Ju et al., 2009; Ju and Weihl, 2010; Bug and Meyer, 2012). In S. cerevisiae, a Cdc48 adaptor named Shp1p can bind the autophagy regulator Atg8 to promote macroautophagy (Krick et al., 2010). A more recent study showed that in mammalian cells p97 might be involved in a specialized form of autophagy, which clears ruptured late endosomes/lysosomes (Papadopoulos et al., 2017). However, another study using a p97-specific inhibitor demonstrated that inhibition of p97 accelerates rather than inhibits autophagosome clearance, increasing the turnover of the autophagy cargo receptor protein p62 (Anderson et al., 2015). This suggests an inhibitory role for p97 in autophagy. Additional studies are required to clarify the precise role of p97 in autophagy.

Other than the proposed "segregase" activity, p97 may also act as a chaperone to transport misfolded polypeptides to the proteasome for degradation, or to simply prevent protein aggregation (Yamanaka et al., 2004; Nishikori et al., 2008; Gallagher et al., 2014; Neal et al., 2017). This activity might be critical for degradation of aggregation-prone nuclear proteins in budding yeast (Gallagher et al., 2014). Additionally, p97 was also shown to facilitate the clearance of non-translating messenger ribonucleoprotein complexes from stress granules via an unknown mechanism (Buchan et al., 2013). Other potential p97 substrates include misfolded unassembled cytosolic and nuclear proteins (Xu et al., 2016). Lastly, in addition to acting directly on misfolded proteins, p97 can also control the stability of certain stress regulators. For example, the complex of p97 and UbxD7 was shown to work with an SCF ubiquitin ligase to target hypoxia-inducible factor 1 alpha (HIF1α) for degradation (Alexandru et al., 2008). More recently, it was shown that p97 could also control the glutamine-regulated turnover of glutamine synthetase as well as the half-life of several cullin-RING ubiquitin ligase substrates (Nguyen et al., 2017; Tao et al., 2017).

# OTHER FUNCTIONS

By releasing polypeptides from chromatin in a manner analogous to that in ERAD, p97 and Cdc48 can function in an array of nuclear processes known as chromatin-associated degradation (Dantuma et al., 2014). Many nuclear p97 substrates have been identified. These include RNA polymerase (Pol) II complex (Verma et al., 2011), transcriptional repressor α2 (Wilcox and Laney, 2009), and CMG DNA helicase (Maric et al., 2014) in budding yeast, and the DNA replication licensing factor CDT1 (Franz et al., 2011; Raman et al., 2011), replisome component Mcm7 (Moreno et al., 2014), DNA repair proteins DDB2, XPC, and Rad52 (Bergink et al., 2013; Puumalainen et al., 2014), mitosis regulator Aurora B kinase (Ramadan et al., 2007; Sasagawa et al., 2012), certain DNA polymerases (Davis et al., 2012; Mosbech et al., 2012), the DNA double-strand break (DSB) repair protein Ku70/80 (van den Boom et al., 2016), the RNA binding protein HuR (Zhou et al., 2013), and the polycomb protein L3MBTL1 (Acs et al., 2011) in metazoa. These substrates link p97 to various nuclear pathways ranging from gene expression control to DNA damage response. Intriguingly, although most of these proteins have been shown to undergo ubiquitination in cells, not all of them are subject to proteasome-mediated degradation.

In mitotic cells, p97/Cdc48 can regulate vesicle fusion at the exit of mitosis, when the Golgi apparatus and the ER network need to be reshaped (Kondo et al., 1997; Rabouille et al., 1998; Kano et al., 2005a,b; Uchiyama and Kondo, 2005). This process involves two adaptors, p47 (Kondo et al., 1997; Meyer et al., 2002) and p37 (Uchiyama et al., 2006). In addition, a p97-associated deubiquitinase named VCIP135 is required (Uchiyama et al., 2002). It has been proposed that p97 may act on Syntaxin 5 to regulate vesicle fusion (Rabouille et al., 1998; Roy et al., 2000). In post-mitotic cells such as neurons, the p97-p47 complex has been implicated in maintaining the tubular ER structure in order to control protein synthesis (Shih and Hsueh, 2016).

Several lines of evidence suggest that mammalian p97 might also regulate receptor-mediated endocytosis (Bug and Meyer, 2012; Kirchner et al., 2013). Proteomic studies uncovered the early endosome-associated antigen 1 (EEA1) and Clathrin as p97-interacting proteins (Pleasure et al., 1993; Ramanathan and Ye, 2012). Functionally, inhibition of p97 delays lysosomal targeting of an endocytosis cargo. p97 inhibition also causes clustered and enlarged early endosomes, which might result from increased EEA1 oligomerization and thus uncontrolled endosome tethering and fusion (Ramanathan and Ye, 2012). In another study, the plasma membrane protein caveolin was found to interact with p97 and UbxD1. In p97-deficient cells, enlargement of endosomes was similarly observed, and the trafficking of caveolin to late endosomes was affected (Ritz et al., 2011). The precise function of p97 in endocytosis remains to be elucidated, but it might be mechanistically related to the proposed function of p97 in autophagy.

In addition to vesicular trafficking, p97 may also control protein transport in a non-vesicular manner, as it was recently demonstrated that the complex of p97 and UBXN10 mediates protein transport into cilia to control ciliogenesis (Raman et al., 2015). Mammalian p97 has also been shown to regulate NFκB signaling by controlling the stability of the small inhibitory protein IκB in the canonical NFκB pathway (Dai et al., 1998; Li et al., 2014) or by facilitating the processing of the p100 subunit in the alternative NFκB activation pathway (Zhang Z. et al., 2015). p97 was also shown to regulate the stability of RIG-I, a viral RNA sensor in innate immunity (Hao et al., 2015), as well as the activity of adipose triglyceride lipase (ATGL), an enzyme that controls lipid droplet biogenesis (Olzmann et al., 2013).

# RELEVANCE TO HUMAN DISEASE

Genetic studies in the past decade have linked a collection of p97 mutations to human diseases including MSP1 (multisystem proteinopathy 1) [also named IBMPFD (Inclusion Body Myopathy associated with Paget's disease of the bone and Frontotemporal Dementia)], FALS (familial amyotrophic lateral sclerosis), CMT2Y (Charcot-Marie-Tooth disease, type 2Y) (Dyck and Lambert, 1968; Watts et al., 2004; Johnson et al., 2010; Abramzon et al., 2012; Bucelli et al., 2015), hereditary spastic paraplegias (HSP), Parkinson's disease (PD), and Alzheimer's disease (AD). Mechanistic studies suggest that a major dysfunction of p97 in association with these disease conditions is deregulation of the proteostasis network.

### MULTISYSTEM PROTEINOPATHY 1 (MSP1)

MSP1/IBMPFD is a severe autosomal dominant disorder. Patients experience progressive tissue damage in the muscles (myopathy), the bones (Paget's disease of the bone, PDB), and/or the brain (frontotemporal dementia, FTD). To date, more than 40 mutations covering 29 different positions in p97 have been reported in MSP1/IBMPFD patients (Nalbandian et al., 2011; Mehta et al., 2013). However, as patients bearing the same mutation from a single family can show drastically different symptoms with differing ages of onset, other genetic or environmental factors may also contribute significantly to the disease etiology.

At the cellular level, muscle fibers from MSP1/IBMPFD patients often contain vacuoles that are stained by antibodies against ubiquitin and p97 (Watts et al., 2004). In brain tissues, nuclear inclusions containing ubiquitin and p97 were also frequently detected in neurons (Kimonis and Watts, 2005). More recent studies also found TAR DNA-binding protein 43 (TDP-43) accumulating in patient tissues (Weihl et al., 2008). Genetic interactions between TDP-43 and p97 have also been revealed, which may regulate the subcellular distribution of TDP-43 (Ritson et al., 2010). These findings suggest a role for p97 in controlling the neurotoxicity of aggregation-prone misfolded polypeptides, possibly by regulating their stability, solubility, or subcellular localization.

Structural studies revealed that MSP1/IBMPFD mutations mostly map to or near the interface between the N and D1 domains of p97 (**Figure 5**). Because patients carrying a single allele of any MSP1 mutation develop normally, these mutations apparently cause only suboptimal performance of the p97 ATP hydrolysis cycle, with damage to p97-dependent cellular processes accumulating over time and culminating in neuronal cell death in adulthood (Kimonis et al., 2000). These mutations could affect the function of p97 in multiple ways. For example, many mutations appear to weaken the affinity of the D1 domain for ADP (Tang et al., 2010), resulting in increased (2–4-fold) D2 ATPase activity and a loss of coordinated N-domain movement (Weihl et al., 2006; Halawani et al., 2009; Tang et al., 2010; Tang and Xia, 2013; Schuetz and Kay, 2016). Moreover, while some cofactors can elevate or inhibit the ATPase activity of wild-type p97, this regulation does not seem to occur with certain disease-associated mutants (Zhang X. et al., 2015). These observations collectively suggest that mutation-induced structural instability might cause a loss of the fine-tuned ATPase cycle, leading to cell damage. In addition, biochemical studies demonstrated an effect of certain mutations on cofactor association (Fernandez-Saiz and Buchberger, 2010; Tang and Xia, 2016b), whereas in the case of p37 and p47, nucleotide-dependent regulation of cofactor binding appears to be abolished in disease-associated mutants (Bulfer et al., 2016). In vivo, subtle deregulation of p97 ATPase activity might result in a gain-of-function phenotype in sensitive tissues, as demonstrated recently by a study using a Drosophila IBMPFD model (Zhang et al., 2017). Consistent with this view, Blythe and colleagues showed that an IBMPFD mutant has a moderately increased ATPase activity and can unfold ubiquitinated GFP more efficiently than wild-type p97 (Blythe et al., 2017).

# FAMILIAL AMYOTROPHIC LATERAL SCLEROSIS (FALS)

Autosomal dominantly inherited amyotrophic lateral sclerosis (ALS) (also known as Lou Gehrig's disease) is a progressive neurodegenerative disease. It mainly affects the motor neurons in the brain and spinal cord, resulting in death from respiratory failure. While most ALS cases are sporadic, about 10% are considered "familial" because more than one individual in a family often develops the disease. Mutations in at least 18 genes have been identified in familial ALS. Among them, p97 mutations account for less than 2% (Johnson et al., 2010; Koppers et al., 2012; Kwok et al., 2015). There are 18 reported mutations at 12 different positions. Although there is a significant overlap between MSP1 and familial ALS mutations, mutations linked to familial ALS can be found in the D2 domain, and many of them are not located at the interface between the N and D1 domains (e.g., I114V in the N domain, and R487H and R662C in the D2 domain) (**Figure 5**). How these mutations alter the function of p97 remains unclear. However, as the pathological hallmark of the disease, loss of motor neurons, is often linked to the appearance of ubiquitin-positive inclusions and/or deposition of TDP-43-positive aggregates (Johnson et al., 2010), ALS pathology may be at least in part attributed to defects in cellular protein homeostasis.

# CHARCOT-MARIE-TOOTH DISEASE, TYPE 2Y (CMT2Y)

Charcot-Marie-Tooth disease (CMT) is an autosomal dominant axonal peripheral neuropathy characterized by distal muscle weakness and atrophy associated with length-dependent sensory loss. Like ALS, CMT is a clinically and genetically heterogeneous disorder and is divided into subtypes based on the genetics, pathology, and electrophysiology of the disease (Dyck and Lambert, 1968). Missense mutations in p97 were recently identified in patients with the CMT2 Y-subtype (Gonzalez et al., 2014; Jerath et al., 2015). As most patients with CMT2Y do not obtain a genetic diagnosis, the number of cases bearing mutations in p97 may be higher than currently appreciated. Intriguingly, in addition to p97, other CMT2-associated genes identified include chaperones such as Hsp27 and Hsp22 (Houlden et al., 2008; Nakhro et al., 2013). These observations once again link the etiology of this disease to deregulation of the proteostasis network.

# p97 AS A POTENTIAL ANTI-CANCER TARGET

Given the important roles played by p97 in diverse cellular processes, specific inhibitors of p97 can be useful tools for dissecting the mechanism of p97 action. Early chemical screens focusing on compounds that inhibit ERAD identified two structurally related chemicals (Fiebiger et al., 2004). Characterization of these compounds led to the discovery of the first p97 inhibitor, Eeyarestatin I (EerI) (**Figure 6**) (Wang et al., 2008, 2010). Intriguingly, although EerI binding causes a conformational change in p97, it does not seem to affect nucleotide hydrolysis by the D2 domain. Whether it affects ATP hydrolysis by D1 is unclear, as is the inhibitory mechanism of EerI (Wang et al., 2010). Nevertheless, in tissue culture cells, EerI induces several key phenotypes attributed to p97 inhibition, such as the accumulation of polyubiquitinated proteins, ERAD inhibition, ER stress induction, and apoptosis (Wang et al., 2009). Importantly, EerI has significant cancer-killing activity in vitro, as it preferentially kills cancer cells isolated from patients, and it can synergize with the proteasome inhibitor Bortezomib to induce apoptosis in cancer cells (Wang et al., 2009). These observations provide a rationale for targeting p97 as a new anti-cancer therapy.

More recently, chemical screens in search of compounds directly targeting p97 have been conducted. Chou and colleagues reported the first reversible p97 D2 inhibitor, DBeQ (Chou et al., 2011). Subsequent work has optimized this chemical, leading to a collection of more potent and specific p97 inhibitors (Chou et al., 2013, 2014; Chapman et al., 2015; Zhou et al., 2015). An independent effort from Magnaghi and colleagues identified several competitive and non-competitive inhibitors that also target the D2 domain (Magnaghi et al., 2013). These p97 D2 inhibitors are highly specific and potent (Magnaghi et al., 2013; Anderson et al., 2015). Structural modeling and cryo-EM studies have revealed the potential inhibitory mechanism of one p97 inhibitor: the small allosteric inhibitor UPCDC30245, observed at the interface between the D1 and D2 domains, seems to prevent the propagation of conformational changes necessary for p97 function (Banerjee et al., 2016). Treatment of human cancer cell lines with these allosteric inhibitors confirmed that inhibition of p97 indeed induces cell death in different cancer cell lines (Chou et al., 2011, 2013; Magnaghi et al., 2013; Anderson et al., 2015). Along this line, it is noteworthy that a reversible p97 inhibitor named CB-5083 has produced promising anti-cancer effects in mouse xenograft tumor models and is now being evaluated in clinical trials (Anderson et al., 2015; Zhou et al., 2015). Lastly, the use of these inhibitors in basic research has started to reveal novel p97 functions in DNA repair, turnover of ruptured lysosomes, etc. (van den Boom et al., 2016; Papadopoulos et al., 2017).

In addition to the above-mentioned inhibitors, efforts from several groups have resulted in a large collection of p97 inhibitors (**Figure 6**) (Yi et al., 2012; Polucci et al., 2013; Cervi et al., 2014; Kang et al., 2014; Alverez et al., 2015; Chapman et al., 2015; Tao et al., 2015; Ding et al., 2016; Gui et al., 2016). Among them, it is particularly worth mentioning that several are natural products. Although these chemicals are not thoroughly characterized and their potency is often limited, research along this direction may lead to a safer p97 inhibitor better suited for cancer therapy.

### CONCLUDING REMARKS AND PERSPECTIVE

Through years of study, we have accumulated a large body of knowledge on the structure and function of p97/Cdc48. Specifically, the identification of new p97 cofactors and substrates has revealed a whole new set of biological functions for this essential chaperone system, and it is anticipated that future studies will further expand the p97 functional repertoire. By contrast, mechanistic dissection of the molecular nature of the "segregase" activity has lagged behind, and many fundamental questions remain unresolved. Among them, the most intriguing one is how conformational changes in p97 generate the proposed "segregase" activity. The recently developed in vitro GFP unfolding assay represents a major step toward fully elucidating the mechanism of this important enzyme. Another key question is how cofactor binding is hierarchically organized in the context of the ATPase cycle and the substrate binding cycle. Moreover, animal models bearing disease-associated mutations are needed in order to better appreciate the connections between p97 dysfunction and human diseases. Recent advances in CRISPR technology should dramatically ease the development of these animal models. Finally, given the promising anti-cancer effect of p97 inhibitors, it is anticipated that more p97 inhibitors will be sought, and studies in this direction may one day produce a new class of anti-cancer agents.

# AUTHOR CONTRIBUTIONS

TZ and WT prepared the figures and tables and wrote part of the manuscript. DX and YY wrote the manuscript.

# ACKNOWLEDGMENTS

We thank L. Chen (University of Minnesota, MN) for preparing the p97 inhibitor figure. The research in the laboratories of DX and YY is supported by the Intramural Research Program of the National Cancer Institute and of the National Institute of Diabetes, Digestive & Kidney Diseases at the National Institutes of Health.

may serve a regulatory function in controlling the p97-PNGase interaction (Zhao et al., 2007). Another example is provided by the structure of a complex containing PLAA (phospholipase A2-activating protein) and the C-terminal peptide of p97 (Qiu et al., 2010). PLAA (also named Ufd3 or Doa1) has been implicated in a variety of cellular processes including processing of misfolded mitochondrial outer-membrane proteins (Wu et al., 2016), ribophagy (Ossareh-Nazari et al., 2010), endosomal trafficking (Ren et al., 2008; Han et al., 2014), and regulation of the cellular ubiquitin level by an unknown mechanism (Johnson et al., 1995). In the structure, Y805 of p97 is once again located at the binding interface, suggesting that phosphorylation-dependent regulation might be a common theme for p97 cofactor interactions (**Figure 4F**).

Several p97-adaptor assemblies have also been examined by Cryo-EM (Rouiller et al., 2000; Beuron et al., 2006; Pye et al., 2007; Bebeacua et al., 2012). EM studies showed that in the complex of p97 and Ufd1-Npl4 (Pye et al., 2007; Bebeacua et al., 2012), the adaptors bind to both the N- and D1-domain simultaneously. A similar mode of interaction was observed for Fas-associated factor-1 (FAF1) (Ewens et al., 2014).

Whether cofactor binding can cause a conformational change in p97/Cdc48 has not been thoroughly investigated. Structural studies of the adaptor-free p97 N-D1 domain (PDB:1E32) or that bound by p47 (PDB:1S3S) or other adaptors showed no obvious change in the structure of p97 upon adaptor binding (Dreveny et al., 2004). However, adaptor-induced conformational changes may only take place in full-length p97 during a normal ATPase cycle, and thus might have escaped detection so far (Isaacson et al., 2007; Zhao et al., 2007; Qiu et al., 2010; Hanzelmann and Schindelin, 2011; Hanzelmann et al., 2011; Kim et al., 2011; Schaeffer et al., 2014). On the other hand, since ATP-dependent conformational changes, particularly those triggered by ATP binding to the D1 domain, affect the position of the N-domain, the interaction of p97 adaptors with the N-domain can probably be regulated by the nucleotide state of the D1 ring, as suggested by a recent study (Bulfer et al., 2016).

# CELLULAR FUNCTION OF p97/CDC48

Given its substrate diversity, p97 has a broad range of functions, which have been reviewed extensively (Bug and Meyer, 2012; Dantuma and Hoppe, 2012; Meyer et al., 2012; Yamanaka et al., 2012; Dantuma et al., 2014; Meyer and Weihl, 2014). Due to space constraints, we discuss here only a few relatively well-characterized molecular processes, with the aim of illustrating the general role of this ATPase in cells.

# ROLES IN PROTEIN HOMEOSTASIS CONTROL

p97/Cdc48 has been implicated in several PQC pathways, and thus is an essential component of the proteostasis regulatory network in eukaryotic cells (Meyer et al., 2012). In general, p97 facilitates the degradation of aberrant proteins by releasing them from cellular structures or large protein complexes. The first identified PQC function for p97 is in ER-associated protein degradation (ERAD), a pathway that eliminates misfolded proteins of the secretory pathway (Smith et al., 2011; Christianson and Ye, 2014; Ruggiano et al., 2014). During ERAD, misfolded proteins are retrotranslocated into the cytosol, where they are degraded by the ubiquitin proteasome system. For misfolded luminal proteins, the retrotranslocation process consists of two essential steps. First, a portion of a substrate needs to be moved across the lipid bilayer to enter the cytosol. This reaction is believed to be mediated by a protein retrotranslocation complex containing the multispanning membrane ubiquitin ligase Hrd1 (Bordallo et al., 1998; Bays et al., 2001a; Gauss et al., 2006; Carvalho et al., 2010; Stein et al., 2014; Baldridge and Rapoport, 2016). In the second step, p97/Cdc48 is recruited to the site of retrotranslocation via association with proteins present in the retrotranslocation complex. These include Derlins, Hrd1, and VIMP in mammals or Ubx2 in S. cerevisiae (Lilley and Ploegh, 2004; Ye Y. et al., 2004; Neuber et al., 2005; Schuberth and Buchberger, 2005). These proteins each bear a p97-interacting motif, and the interactions with p97 allow it to effectively capture substrates emerging from the retrotranslocation channel (Carvalho et al., 2010). Misfolded proteins then undergo ubiquitination and are dislocated from the membranes by p97 (Bays et al., 2001b; Ye et al., 2001, 2003; Braun et al., 2002; Jarosch et al., 2002; Rabinovich et al., 2002; Flierman et al., 2003; Zhong et al., 2004; Garza et al., 2009). Dislocated ERAD substrates are eventually targeted for degradation by the proteasome (Zhang and Ye, 2014). In addition to ERAD substrates, p97/Cdc48 can also release a few membrane-bound transcription factors without targeting them for degradation (Hitchcock et al., 2001; Rape et al., 2001; Shcherbik and Haines, 2007; Radhakrishnan et al., 2014); instead, these transcription factors are transported into the nucleus to affect gene expression in response to specific stimulating cues.

It has also been demonstrated that p97 can facilitate mitochondria-associated degradation (MAD) by extracting polypeptides from the mitochondrial outer membrane (Heo et al., 2010; Xu et al., 2011; Hemion et al., 2014). This process eliminates aberrant polypeptides from the mitochondrial outer membrane to maintain mitochondrial protein homeostasis. In addition, regulators of the mitophagy pathway (e.g., mitofusin), which turns over damaged mitochondria, can also be subject to degradation by MAD (Tanaka et al., 2010). Upon mitochondrial damage, p97, Ufd1, and Npl4 are recruited to the surface of mitochondria, which is required for clearance of damaged mitochondria by mitophagy (Kimura et al., 2013). The mechanism that recruits p97 to mitochondria in MAD or mitophagy is unclear. One recent study identified a protein named Vms1 (VCP/Cdc48-associated mitochondrial stress-responsive 1) as a potential linker (Heo et al., 2010, 2013), but the role of Vms1 in mitochondrial PQC remains controversial (Esaki and Ogura, 2012). In addition, in S. cerevisiae, a protein named Doa1 (also named Ufd3) can act in conjunction with Ufd1 and Npl4 to recruit substrates to Cdc48 in MAD (Wu et al., 2016).

Another essential PQC function involving p97 is the degradation of aberrant nascent polypeptides stalled on ribosomes in a process dubbed ribosome-associated degradation

electrostatic potential surface. The positive potential is in blue, negative in red, and neutral in white. (B) Structure of the p97 N-domain in complex with the UBX domain of FAF1 (PDB:3QC8). The N-domain, depicted as a molecular surface overlaid on a ribbon representation, has the N-terminal double ψ-barrel domain colored green and the C-terminal β-barrel domain colored red. The UBX domain of FAF1 is depicted as a ribbon diagram in yellow. Critical residues for interaction are shown as ball-and-stick models and labeled. (C) Structure of the p97 N-domain in complex with the VIM motif of gp78 (PDB:3TIW). Here the VIM motif is shown as a helix in yellow, and its binding to the N-domain is mostly mediated by charged residues. (D) Structure of the p97 N-domain in complex with the Ufd1-derived SHP peptide (PDB:5C1B). Here the SHP peptide is shown as a stick model in yellow, and it binds exclusively to the C-terminal β-barrel domain. (E) Structure of the N-terminal domain of PNGase in complex with a C-terminal peptide of p97 (PDB:2HPL). The PNGase N-terminal domain is shown in cartoon representation in yellow. The bound peptide is shown as a stick model with five residues (labeled) seen in the structure. The carbon atoms are colored in black, nitrogen in blue, and oxygen in red. (F) Structure of the PUL domain of PLAA/Ufd3 in complex with a C-terminal peptide of p97 (PDB:3EBB). The PLAA PUL domain is shown in cartoon representation in yellow. The bound peptide is shown as a stick model with four residues visible in the structure. The carbon atoms are colored in black, nitrogen in blue, and oxygen in red.

the ER (Blom et al., 2004). The PUB domain binds PIM in a 1:1 stoichiometry. In this complex, the PIM peptide binds to a conserved surface of the PUB domain (Allen et al., 2006; Zhao et al., 2007). Intriguingly, the conserved residue Y805 in the PIM motif essential for the interaction can be phosphorylated in cells. This post-translational modification

# A Mighty "Protein Extractor" of the Cell: Structure and Function of the p97/CDC48 ATPase

Yihong Ye<sup>1</sup> \*, Wai Kwan Tang<sup>2</sup> , Ting Zhang<sup>1</sup> and Di Xia<sup>2</sup> \*

<sup>1</sup> Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, United States, <sup>2</sup> Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, United States

p97/VCP (known as Cdc48 in S. cerevisiae or TER94 in Drosophila) is one of the most abundant cytosolic ATPases. It is highly conserved from archaebacteria to eukaryotes. In conjunction with a large number of cofactors and adaptors, it couples ATP hydrolysis to segregation of polypeptides from immobile cellular structures such as protein assemblies, membranes, ribosome, and chromatin. This often results in proteasomal degradation of extracted polypeptides. Given the diversity of p97 substrates, this "segregase" activity has profound influence on cellular physiology ranging from protein homeostasis to DNA lesion sensing, and mutations in p97 have been linked to several human diseases. Here we summarize our current understanding of the structure and function of this important cellular machinery and discuss the relevant clinical implications.

#### Edited by:

Walid A. Houry, University of Toronto, Canada

#### Reviewed by:

Alexander Buchberger, University of Würzburg, Germany Thorsten Hoppe, University of Cologne, Germany

#### \*Correspondence:

Yihong Ye yihongy@mail.nih.gov Di Xia xiad@mail.nih.gov

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

> Received: 10 January 2017 Accepted: 22 May 2017 Published: 13 June 2017

#### Citation:

Ye Y, Tang WK, Zhang T and Xia D (2017) A Mighty "Protein Extractor" of the Cell: Structure and Function of the p97/CDC48 ATPase. Front. Mol. Biosci. 4:39. doi: 10.3389/fmolb.2017.00039

Keywords: AAA ATPase, p97/VCP, Cdc48, chaperones, protein denaturation, protein quality control, neurodegenerative diseases

p97/Cdc48 belongs to the AAA+ (extended family of ATPases associated with various cellular activities) ATPase family, which functions generally as essential chaperones to promote protein folding or unfolding. Cdc48 was initially identified in S. cerevisiae as a cell cycle regulator, which upon inactivation, leads to a cell cycle arrest at the G2-M transition stage (Moir et al., 1982). A mammalian homolog of 97 kDa was later discovered and dubbed as p97 or valosin-containing protein precursor (VCP) (Koller and Brownstein, 1987). In Drosophila, the name TER ATPase (transitional endoplasmic reticulum ATPase) has been used given the partial localization of this enzyme to the endoplasmic reticulum (ER) surface (Zhang et al., 1994, see below). In this review, we use p97 and Cdc48 to refer to the mammalian and yeast homologs, respectively.

As a type II AAA+ ATPase, p97/Cdc48 has two AAA ATPase domains designated as D1 and D2 (**Figure 1A**). These two domains are connected by a short polypeptide linker (D1–D2 linker). Although the ATPase domains are highly similar in sequence and structure, they have distinct functions: while the D1 domain is required for hexameric assembly of p97, the D2 domain is a major contributor to the overall ATPase activity (see below, Song et al., 2003; Wang et al., 2003). In addition, p97/Cdc48 has a sizable N-terminal domain (N-domain) that is linked to the D1 domain by a flexible polypeptide segment (N-D1 linker). At the C-terminus, a short tail is appended to the D2 domain. The interaction of p97/Cdc48 with its partners is mostly mediated by the N-domain, but a few proteins bind p97/Cdc48 using its C-terminal tail (Ogura and Wilkinson, 2001; Buchberger et al., 2015).

As a soluble protein, p97 is primarily localized in the cytosol, but a fraction is present on organelle membranes including the endoplasmic reticulum (ER), Golgi, mitochondria, and

groups. Retrospectively, a major obstacle in studying D1-dependent conformational changes was the presence of a substoichiometric amount of tightly bound ADP in the D1 nucleotide-binding site (Davies et al., 2005; Tang and Xia, 2013). One strategy to circumvent this problem in crystallographic studies is to use p97 mutant proteins bearing amino acid substitutions found in IBMPFD (Inclusion Body Myopathy associated with Paget's disease of the bone and Frontotemporal Dementia syndrome) patients (Kimonis et al., 2000). When purified, the D1 domain in these mutants can efficiently bind exogenously added nucleotides, allowing crystallographic studies of conformational changes that occur during the D1 ATPase cycle. Strikingly, compared to structures in which D1 is in the ADP-bound state (Down-conformation, **Figure 2A**), in the presence of the ATP analog ATPγS in D1, the N-domain undergoes a hinged upswing (Up-conformation, **Figure 2B**) (Tang et al., 2010; Xia et al., 2016). A similar conformational change was seen with wild-type p97 in solution by small-angle X-ray scattering (SAXS) (Tang et al., 2010). As it turns out, the difference between wild-type and mutant p97 is that in the mutant all six N-domains undergo a uniform conformational change, allowing X-ray crystallographic studies, whereas in wild-type p97 only a fraction of the six subunits have their N-domains in the Up-conformation (Tang and Xia, 2016a). Thus, unsynchronized nucleotide binding and hydrolysis seems to be a common feature of both D1 and D2, which might be functionally relevant to the observed asymmetric adaptor binding to the p97 N-domain (Buchberger et al., 2015).

The above-mentioned conformational changes in the N-domain were recently confirmed by cryo-EM studies. One study found p97 in three different, co-existing states in the presence of ATPγS in solution: one has ADP bound to all 12 sites and the N-domains in the Down-conformation; the second, also in the Down-conformation, has the six sites in the D1-ring and the six sites of the D2-ring occupied by ADP and ATPγS, respectively; in the third case, all 12 sites contain ATPγS and now the N-domains are held in the Up-conformation (Banerjee et al., 2016). It should be noted that while the EM densities for the D1 and D2 domains are well defined, those for the N-domains are not, particularly for the one with full occupancy of ATPγS. The poor density for the N-domains suggests disorder or multiple conformations. Indeed, in another study, carefully sorted images of wild-type p97 prepared in the presence of AMP-PNP showed that even different protomers within a single hexameric p97 molecule display significant asymmetric domain movement, resulting in a random distribution between the Up- and Down-conformations in solution (Schuller et al., 2016). The nucleotide-dependent Up and Down conformational switch of the N-domain in the context of the N-D1 fragment was also confirmed recently by NMR (Schuetz and Kay, 2016).

#### MECHANISM OF FORCE GENERATION

A major unresolved issue in the field is how conformational changes in p97 generate the proposed "segregase" activity. To date, the most consistent conformational changes observed are the D2 rotation-accompanied pore opening/closing and the up-and-down swing motion of the N-domain. While the former appears to be linked to the D2 ATPase cycle, the latter is driven entirely by nucleotide hydrolysis in the D1 domain (**Figure 2C**). Force generation presumably requires cooperation between the D1 and D2 rings, which would explain the observed interdomain communications (Beuron et al., 2003; Ye et al., 2003; Chou et al., 2014; Schuetz and Kay, 2016).

likely to be mobile (brown balls). ATP binding to the empty sites of the D1 domains will lead the N-domains to the Up-conformation. Occupation of ATP to the D1 domain renders the cognate D2 domain capable of hydrolyzing ATP, which is labeled with a red \*. The D1 domain probably hydrolyzes ATP once a few D2 domains have been converted to the ADP bound state.

The force applied to a substrate may result in partial unfolding of a client protein, and thus disrupt its interaction with protein assemblies, membranes, or chromatin. Although many AAA+ proteins are protein unfoldases (e.g., ClpA and ClpX) that thread polypeptides through a central tunnel (Singh et al., 2000), p97 cannot unfold GFP-ssrA, a model aberrant substrate (Rothballer et al., 2007). By contrast, VAT, a Thermoplasma acidophilum p97 homolog, is capable of unfolding GFP-ssrA with low efficiency (Gerega et al., 2005). Intriguingly, this unfolding activity can be dramatically enhanced when the N-domain of VAT is deleted (Gerega et al., 2005; Barthelme and Sauer, 2012). N-deleted VAT can also collaborate with the 20S proteasome to

nucleotide binding and Walker B motif (hhhhDE, h represents hydrophobic residues) for nucleotide hydrolysis (Ogura and Wilkinson, 2001).
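The Walker B consensus quoted above (hhhhDE, where h is a hydrophobic residue) can be expressed as a simple pattern search. The following is a minimal sketch only: the set of residues treated as hydrophobic, the function name `find_walker_b`, and the toy sequence are all assumptions made for illustration, and a match to this loose pattern does not by itself identify a functional Walker B motif.

```python
import re

# Assumed hydrophobic residue set (one-letter codes); definitions vary between sources.
HYDROPHOBIC = "AVLIMFWC"

# Four hydrophobic residues followed by Asp-Glu. The lookahead allows
# overlapping candidate matches to be reported.
WALKER_B = re.compile(rf"(?=([{HYDROPHOBIC}]{{4}}DE))")

def find_walker_b(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched hexapeptide) for every hhhhDE candidate."""
    return [(m.start(), m.group(1)) for m in WALKER_B.finditer(sequence.upper())]

# Hypothetical toy sequence, not taken from p97 or any real protein.
toy_sequence = "MKTGAILLVDEPSR"
print(find_walker_b(toy_sequence))
```

In practice, candidate sites found this way would need to be checked against structural or alignment data, since short motifs of this kind occur by chance in many proteins.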

# NUCLEOTIDE BINDING AND HYDROLYSIS

Purified p97 hydrolyzes 1–5 ATP molecules per hexamer per second in vitro (Meyer et al., 1998; Song et al., 2003; Ye et al., 2003; Tang and Xia, 2013). The ATPase activity of p97 can be influenced by parameters such as temperature, the position of the N-domain, and adaptor binding (Meyer et al., 1998; Song et al., 2003; DeLaBarre et al., 2006; Niwa et al., 2012; Zhang X. et al., 2015; Bulfer et al., 2016). Importantly, two recent reports showed that the ATPase activity of p97 and CDC48 can be activated moderately by a ubiquitinated model substrate (Blythe et al., 2017; Bodnar and Rapoport, 2017), consistent with genetic studies demonstrating that ATP hydrolysis is indispensable for all documented p97 functions (Kobayashi et al., 2002; Ye et al., 2003; Dalal et al., 2004; Raman et al., 2011; Xu et al., 2011, 2016).

Nucleotide binding to p97 has been measured by isothermal titration calorimetry (ITC) (Briggs et al., 2008; Tang et al., 2010) or by surface plasmon resonance (SPR) (Chou et al., 2014). Although there is a 10-fold difference in measured affinities, the relative affinity of D1 and D2 for nucleotide is comparable between these methods. For isolated wild-type p97, the D1 and D2 domains bind ADP with K<sub>d</sub> of ∼1 µM and ∼80 µM, respectively, but the affinity for ATP and ATPγS is about the same (∼2 µM) for these domains (Briggs et al., 2008). A remarkable observation, though not yet fully appreciated, is the existence of pre-bound or occluded ADP in the D1 domains, which may regulate the asymmetric movement of the N-domain (Tang et al., 2010; Tang and Xia, 2016a). Davies and colleagues first reported, using chemical denaturation experiments, that about half of the D1 sites in wild-type p97 hexamers are pre-occupied by ADP (Davies et al., 2005). It was subsequently shown that the D1-bound ADP molecules are difficult to remove in vitro, raising concerns about interpreting results from various in vitro ATP binding and hydrolysis experiments (Briggs et al., 2008; Tang et al., 2010).
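To put the quoted affinities in perspective, the following minimal sketch computes the expected fractional occupancy of a site from its K<sub>d</sub>. It assumes simple, independent single-site binding with free ligand in excess, which is an idealization: the occluded ADP and the inter-domain communication discussed in this section are ignored. The function name and the 100 µM ADP concentration are illustrative assumptions, not values taken from the cited studies.

```python
def fractional_occupancy(ligand_um: float, kd_um: float) -> float:
    """Fraction of sites occupied for independent single-site binding.

    theta = [L] / (Kd + [L]), with both quantities in micromolar.
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


# Approximate ADP Kd values quoted in the text (Briggs et al., 2008).
KD_ADP_UM = {"D1": 1.0, "D2": 80.0}

# Hypothetical free ADP concentration, chosen only for illustration.
adp_um = 100.0

occupancy = {
    domain: fractional_occupancy(adp_um, kd) for domain, kd in KD_ADP_UM.items()
}
for domain, theta in occupancy.items():
    print(f"{domain}: {theta:.2f}")
```

Under these assumptions, the D1 sites would be almost fully occupied by ADP while the D2 sites would be only a little more than half occupied, which illustrates why the roughly 80-fold difference in K<sub>d</sub> matters when interpreting in vitro nucleotide-binding experiments.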

In vitro studies showed that the two ATPase domains of p97 are not functionally equivalent, as the D2 domain reportedly displays a higher ATPase activity than D1 (Song et al., 2003). Whether the D1 and D2 rings work independently or communicate with each other during the ATP hydrolysis cycle has been studied extensively, though the results reported are not always consistent. By measuring the activity of each ring while inhibiting the other, an early report suggested that the two ATPase rings operate independently (Song et al., 2003), but others showed evidence of inter-ring communications (Beuron et al., 2003; Ye et al., 2003; Chou et al., 2014). Moreover, intricate allosteric communication between ATPase domains within the same ring has been suggested (Nishikori et al., 2011; Hanzelmann and Schindelin, 2016b). These interactions are thought to coordinate domain movement during the ATP hydrolysis cycle.

# NUCLEOTIDE-DEPENDENT CONFORMATIONAL CHANGES

The conformational dynamics of p97 have been elusive, in part owing to difficulties in studying its structure under physiologically relevant in vitro conditions. The issue is further complicated by the occluded D1 nucleotide, which excludes other nucleotides from the same site. Furthermore, structural studies by crystallography often require proteins in different asymmetric units to take a similar conformation, but the six ATPase domains are not synchronized in nucleotide binding and hydrolysis. Despite these challenges, conformational changes of p97 have been intensively pursued by both cryo-EM and X-ray crystallography. Early cryo-EM studies revealed moderate rotational movement between the two ATPase rings upon ATP hydrolysis as well as closure and opening of the D1 or D2 central channel (Rouiller et al., 2002). Other domain movements were also noted (Beuron et al., 2003). However, due to limited resolution, these studies failed to generate a consistent model. The issue was revisited more recently with the application of newer technologies. One study using high-speed atomic force microscopy showed a conformational change in CDC48.1, a C. elegans p97 homolog, which involves rotation of the ND1 ring back and forth relative to the D2 ring following D2 ATP hydrolysis (Noi et al., 2013). Likewise, another study by single-particle cryo-EM reported two nucleotide-dependent conformations, differentiated by inter-ring rotation of approximately 22° (Yeung et al., 2014).

Crystallographic studies initially suggested that nucleotide-dependent conformational changes might take place only during the D2 ATP hydrolysis cycle because D1 appeared to be constantly occupied by ADP (Zhang et al., 2000; DeLaBarre and Brunger, 2003, 2005; Huyton et al., 2003; Davies et al., 2008). To date, the most significant structural changes associated with the D2 ATPase cycle are the opening of the D2 pore and the inter-ring rotation mentioned above, but whether the D2 pore opening is triggered by nucleotide binding or hydrolysis is unclear (Rouiller et al., 2002; Davies et al., 2005, 2008; Pye et al., 2006; Banerjee et al., 2016; Hanzelmann and Schindelin, 2016b; Schuller et al., 2016). Additionally, part of the D2 domain also undergoes an order-to-disorder transition (DeLaBarre and Brunger, 2005).

It has only become clear recently that the D1 domain in p97 can also hydrolyze ATP under physiological conditions. Studies using a D2-specific p97 ATPase inhibitor demonstrated that the D1 domain contributes significantly (∼30%) to the overall ATPase activity (Chou et al., 2014; Anderson et al., 2015). Because genetic evidence showed that certain Cdc48 D1 mutants cannot rescue the growth defect of Cdc48 temperature-sensitive alleles despite carrying an intact D2 domain, the D1 domain clearly has an important function (Ye et al., 2003; Nishikori et al., 2011).
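As a rough back-of-the-envelope illustration, the sketch below combines the ∼30% D1 contribution with the in vitro turnover range of 1–5 ATP per hexamer per second quoted earlier. Combining figures from different studies is itself an assumption, as is treating the six protomers of each ring as equivalent (the asymmetry described in this section suggests they are not). The function name and dictionary keys are hypothetical.

```python
def split_atpase_rate(total_per_hexamer: float, d1_fraction: float = 0.30) -> dict:
    """Split a hexamer turnover rate (ATP/s) into D1 and D2 ring shares.

    Also reports a nominal per-site rate, assuming six equivalent
    sites per ring (an idealization).
    """
    if total_per_hexamer < 0:
        raise ValueError("rate must be non-negative")
    if not 0.0 <= d1_fraction <= 1.0:
        raise ValueError("d1_fraction must lie between 0 and 1")
    d1_ring = total_per_hexamer * d1_fraction
    d2_ring = total_per_hexamer - d1_ring
    return {
        "D1_ring": d1_ring,
        "D2_ring": d2_ring,
        "D1_site": d1_ring / 6,
        "D2_site": d2_ring / 6,
    }


# Lower and upper ends of the in vitro range quoted in the text.
for total in (1.0, 5.0):
    rates = split_atpase_rate(total)
    print(total, {name: round(value, 3) for name, value in rates.items()})
```

Even at the upper end of the range, this naive split gives well under one ATP hydrolyzed per site per second in either ring, which underlines how slow the basal cycle is in the absence of substrate.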

Whether ATP hydrolysis by D1 is essential for p97 function has been a controversial issue. Nevertheless, D1-dependent conformational changes have been extensively sought by various biophysical approaches and were recently reported by several

endosomes (Acharya et al., 1995; Latterich et al., 1995; Rabouille et al., 1995; Xu et al., 2011; Ramanathan and Ye, 2012). How p97/Cdc48 is recruited to different membranes is largely unclear, but this process is probably mediated by adaptors on different organelles, as demonstrated for the ER (Christianson and Ye, 2014). A fraction of p97/Cdc48 is also localized in the nucleus (Madeo et al., 1998), where it assists various chromatin-associated processes or nuclear protein quality control (PQC) (see below).

In multicellular organisms, the expression of p97 is ubiquitous. In humans, the transcription of p97 was moderately upregulated in some cancers, and the level of p97 mRNA appears to correlate with cell sensitivity to cell death induced by a potent p97 inhibitor, a potential anti-cancer drug (Anderson et al., 2015). More recently, genetic studies revealed that mutations in p97 may be causal to several human diseases including IBMPFD (Inclusion Body Myopathy associated with Paget's disease of the bone and Frontotemporal Dementia) and amyotrophic lateral sclerosis (ALS) (Xia et al., 2016). These findings stimulated a flurry of investigations on p97 substrates whose "mis-handling" by p97 mutants may have caused abnormality in human physiology.

Most p97/Cdc48 substrates identified to date are conjugated with ubiquitin and targeted for degradation by the 26S proteasome, but a few exceptions exist (Ramadan et al., 2007; Wilcox and Laney, 2009; Ndoja et al., 2014). A key feature of the p97/Cdc48-assisted degradation system is that many cofactors or adaptors are capable of recognizing ubiquitin conjugates (Ye, 2006). Some p97 cofactors are enzymes that can add or remove ubiquitin conjugates, but most of them, regardless of whether they possess a ubiquitin-binding motif, seem to serve an adaptor function that links this ATPase to a specific subcellular compartment or substrate.

# STRUCTURE OF P97

p97 forms a stable hexameric structure with two concentric rings (**Figures 1B,C**): the N-D1 ring has the N-domains laterally attached and therefore has a larger radius (Peters et al., 1990; Zhang et al., 2000; DeLaBarre and Brunger, 2003, 2005; Huyton et al., 2003; Davies et al., 2008; Banerjee et al., 2016; Schuller et al., 2016). A similar ring-shaped structure was observed for various IBMPFD mutants (Tang et al., 2010; Tang and Xia, 2012, 2013) and for wild-type p97 that is in complex with cofactors or adaptors (Dreveny et al., 2004; Ewens et al., 2014; Hanzelmann and Schindelin, 2016a). The hexameric assembly of p97 is dependent on the D1 domain, but is stable in the absence of nucleotide (Wang et al., 2003).

As in all AAA+ ATPases, the AAA module of p97/Cdc48 consists of a characteristic helical domain and a highly conserved RecA-like domain (**Figure 1A**). The RecA-like domain features a nucleotide-binding site at the interface between two adjacent subunits. In this configuration, arginine-finger residues (R359 and R635 for the D1 and D2 ring, respectively) can promote nucleotide hydrolysis by engaging the γ-phosphate of ATP that is bound to an adjacent subunit. In addition, the active site contains a Walker A [P-loop, G(x)4GKT, x is any residue] motif for

degrade GFP-ssrA in vitro (Barthelme and Sauer, 2016). Protein sequence analyses identified a KYYG motif in a D1 loop of VAT, which is replaced by KLAG in p97. When these tyrosine residues are introduced to replace leucine or alanine in a p97 variant lacking the N-domains, it can now unfold and target GFP-ssrA to the 20S proteasome for degradation (Rothballer et al., 2007; Barthelme and Sauer, 2013). Collectively, these findings indicate that the widely observed cooperation between AAA+ ATPases and the 20S proteasome is an ancient scheme of protein degradation. However, with evolved changes in the N-domain and the D1 ring, p97 appears to have acquired a more sophisticated mechanism to process its substrate. It has been speculated that p97/CDC48 might function as a special "unfoldase," perhaps only with the assistance of ubiquitin molecules conjugated to its substrate. Consistent with this view, the requirement of p97/Cdc48 in protein degradation in vivo can be bypassed if a flexible peptide is fused to the C-terminus of a proteasome substrate (Beskow et al., 2009), suggesting that p97/Cdc48 may initiate protein unfolding to expose a loosely folded segment for subsequent engagement of the proteasome.
More direct evidence for the ubiquitin-dependent unfoldase hypothesis came from two recent studies (Blythe et al., 2017; Bodnar and Rapoport, 2017), which used in vitro reconstitution systems to show that both p97 and its yeast homolog CDC48 can unfold GFP, but only when it carries ubiquitin conjugates. As expected, this activity is dependent on the D2 ATPase activity, the cofactors Ufd1 and Npl4, and on the length of the ubiquitin chains on GFP. Intriguingly, the D1 ATP hydrolysis does not seem to contribute significantly to GFP unfolding in a single-round GFP turnover assay (Barthelme and Sauer, 2013). However, it appears to be required for substrate release from CDC48 to ensure processivity. Importantly, the study by Bodnar and Rapoport demonstrates, using two polyubiquitinated model substrates, that once ubiquitin chains are partially trimmed, substrates can be completely threaded through the central pore of p97 together with the remaining ubiquitin molecules in a D1 to D2 direction, which results in unfolding of these proteins. The ubiquitin trimming reaction is dependent on an intricate interplay between p97 and its associated deubiquitinase Otu1 (Bodnar and Rapoport, 2017).

# p97-INTERACTING PROTEINS

Proteomic studies have identified many factors that interact with p97/Cdc48 (Alexandru et al., 2008; Buchberger et al., 2015; Raman et al., 2015). These factors can be categorized either as adaptors, which link p97/Cdc48 to a specific substrate in a subcellular compartment, or as cofactors that facilitate substrate processing. Cofactors usually have enzymatic activities [e.g., N-glycanase, ubiquitin ligase, or deubiquitinase (DUB)] that can alter protein modifiers present on substrates (**Figure 3**).

Some p97/Cdc48-interacting proteins including PLAA/Ufd3, PNGase, HOIP, and Ufd2 bind to the C-terminal appendage of p97/Cdc48 (Rumpf and Jentsch, 2006; Zhao et al., 2007; Qiu et al., 2010; Bohm et al., 2011; Schaeffer et al., 2014; Murayama et al., 2015), but the vast majority bind p97/Cdc48 through its N-domain (**Table 1**) (Buchberger et al., 2015). Sequence analyses have revealed several p97-interacting patterns including VIM (VCP-interacting motif) (Stapf et al., 2011), UBX (ubiquitin regulatory X) (Buchberger et al., 2001; Schuberth and Buchberger, 2008), VBM (VCP-binding motif) (Boeddrich et al., 2006), and SHP box (also known as binding site 1, bs1) (Bruderer et al., 2004). The VCP-interacting motif (VIM) is a linear sequence motif (RX5AAX2R) present in gp78 (Ballar et al., 2006), SVIP (small VCP-inhibiting protein) (Ballar et al., 2007), VIMP (VCP-interacting membrane protein) (Ye Y. et al., 2004), VMS1 (Heo et al., 2010), UBXN6 (Hanzelmann and Schindelin, 2011; Stapf et al., 2011), and ZFAND2B (Stanhill et al., 2006). By contrast, the VBM domain found in proteins such as ataxin-3, Ufd2 and Hrd1 features a polarized sequence motif (RRRRXXYY) (Boeddrich et al., 2006). The SHP box in p47 (Kondo et al., 1997), Ufd1 (Meyer et al., 2000), and Derlin-1 (Lilley and Ploegh, 2004; Ye Y. et al., 2004; Greenblatt et al., 2011) on the other hand is a short polypeptide segment enriched in hydrophobic residues. Noticeably, the UBX domain, an 80 residue module structurally related to ubiquitin, is present in a p97/CDC48 adaptor family known as UBX-containing proteins, consisting of 13 members in humans (**Table 1**).
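Linear motifs such as the VIM lend themselves to simple pattern searches. The sketch below, offered only as an illustration, translates the RX5AAX2R consensus quoted above into a regular expression and scans a sequence for candidate matches. The function name and the toy sequence are hypothetical, and a pattern match is at best a starting point: it says nothing about whether a stretch is helical, accessible, or actually binds the N-domain.

```python
import re

# VIM consensus as written in the text: R-x5-A-A-x2-R (x = any residue).
# The lookahead lets overlapping candidates be reported as well.
VIM_PATTERN = re.compile(r"(?=(R.{5}AA.{2}R))")


def find_vim_candidates(sequence: str) -> list:
    """Return (1-based start, matched segment) for every VIM-like stretch."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in VIM_PATTERN.finditer(seq)]


# Hypothetical toy sequence, NOT a real protein.
toy_sequence = "MSTRKLEDQAAELRGSP"
print(find_vim_candidates(toy_sequence))
```

For the toy sequence, the single candidate starts at position 4 and spans 11 residues, as expected for a motif of the form R, five arbitrary residues, two alanines, two arbitrary residues, and a final R.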

Intriguingly, despite the drastic difference in sequence and structure, many p97-interacting motifs, particularly those interacting with the N-domain, bind p97 in a similar mode. Consequently, the binding of many cofactors/adaptors to p97 is mutually exclusive (Meyer et al., 2000; Rumpf and Jentsch, 2006). These observations suggested the existence of distinct populations of p97 complexes in cells, each bearing a different set of partners. Conceptually, the composition of a p97 complex may not be static in cells. Co-factor exchange could occur, which would allow p97 to efficiently switch substrate to meet cellular demands. A similar "adaptor swapping" model has been proposed for the multi-subunit SCF (Skp1, cullin, and F box) ubiquitin ligase, which like p97, uses a collection of adaptors to engage distinct substrates. In this case, adaptor switch is catalyzed by Cand1, a protein exchange factor that stimulates the equilibrium of Cul1-Rbx1 with multiple F box protein-Skp1 modules (Pierce et al., 2013). Whether a similar regulatory strategy exists for p97/Cdc48 remains to be seen. Furthermore, given that the substrate processing cycle is comprised of two mechanistically distinct reactions, namely substrate binding and release, it is conceivable that a regulated hierarchical cofactor binding system may be coupled to ATP hydrolysis to coordinate these processes (Hanzelmann et al., 2011; Meyer et al., 2012).

Structural studies have revealed the general principles of p97 complex assembly. To date, one of the best characterized p97 complexes is the p47-N-D1 assembly (Dreveny et al., 2004). One crystallographic study showed that the p97 N-domain could be divided into two sub-domains: an N-terminal double ψ-barrel and a C-terminal β-barrel (**Figure 4A**). Between the two subdomains lies a hydrophobic groove surrounded by patches of charged residues, which is the site bound by the UBX domain found in adaptors such as p47 and FAF1. The interaction usually exploits both hydrophobic and electrostatic forces (**Figure 4B**). More recently, a collection of structural studies showed that this cleft could be used to engage other

# The Interplay of Cofactor Interactions and Post-translational Modifications in the Regulation of the AAA+ ATPase p97

#### Petra Hänzelmann\* and Hermann Schindelin

Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Würzburg, Germany

#### Edited by:

Walid A. Houry, University of Toronto, Canada

#### Reviewed by:

Leonid Breydo, University of South Florida, USA Stefan G. D. Rüdiger, Utrecht University, Netherlands

#### \*Correspondence:

Petra Hänzelmann petra.haenzelmann@virchow.uni-wuerzburg.de

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

> Received: 01 February 2017 Accepted: 24 March 2017 Published: 13 April 2017

#### Citation:

Hänzelmann P and Schindelin H (2017) The Interplay of Cofactor Interactions and Post-translational Modifications in the Regulation of the AAA+ ATPase p97. Front. Mol. Biosci. 4:21. doi: 10.3389/fmolb.2017.00021

The hexameric type II AAA ATPase (ATPase associated with various activities) p97 (also referred to as VCP, Cdc48, and Ter94) is critically involved in a variety of cellular activities including pathways such as DNA replication and repair which both involve chromatin remodeling, and is a key player in various protein quality control pathways mediated by the ubiquitin proteasome system as well as autophagy. Correspondingly, p97 has been linked to various pathophysiological states including cancer, neurodegeneration, and premature aging. p97 encompasses an N-terminal domain, two highly conserved ATPase domains and an unstructured C-terminal tail. This enzyme hydrolyzes ATP and utilizes the resulting energy to extract or disassemble protein targets modified with ubiquitin from stable protein assemblies, chromatin and membranes. p97 participates in highly diverse cellular processes and hence its activity is tightly controlled. This is achieved by multiple regulatory cofactors, which either associate with the N-terminal domain or interact with the extreme C-terminus via distinct binding elements and target p97 to specific cellular pathways, sometimes requiring the simultaneous association with more than one cofactor. Most cofactors are recruited to p97 through conserved binding motifs/domains and assist in substrate recognition or processing by providing additional molecular properties. A tight control of p97 cofactor specificity and diversity as well as the assembly of higher-order p97-cofactor complexes is accomplished by various regulatory mechanisms, which include bipartite binding, binding site competition, changes in oligomeric assemblies, and nucleotide-induced conformational changes.
Furthermore, post-translational modifications (PTMs) like acetylation, palmitoylation, phosphorylation, SUMOylation, and ubiquitylation of p97 have been reported which further modulate its diverse molecular activities. In this review, we will describe the molecular basis of p97-cofactor specificity/diversity and will discuss how PTMs can modulate p97-cofactor interactions and affect the physiological and patho-physiological functions of p97.

Keywords: p97, AAA+ ATPase, conformational changes, protein quality control, protein disassembly, cofactor diversity, post-translational modification


# INTRODUCTION

p97 (also known as VCP, Cdc48, and Ter94) belongs to the functionally highly diverse AAA+ (ATPase associated with various cellular activities) superfamily of proteins, which is characterized by conserved ATPase core domains. Through specific structural elements, like for example additional domains and insertions as well as different oligomeric arrangements, they act as molecular motors by using conformational changes induced by ATP hydrolysis to perform mechanical work on many different substrates (reviewed in Erzberger and Berger, 2006; Wendler et al., 2012). p97 is a type II AAA+ protein composed of two hexameric ATPase rings (formed by its D1 and D2 domains) that stack on top of each other and an additional N-terminal domain important for cofactor and substrate binding (DeLaBarre and Brunger, 2003, 2005; Davies et al., 2008) (**Figure 1A**). Like all hexameric AAA+ ATPases p97 features a central cavity or pore lined by putative substrate interacting loops. Further members belonging to this group, referred to as NSF/Cdc48/Pex family, are the N-ethylmaleimide-sensitive fusion protein (NSF) involved in vesicular transport processes (reviewed in Zhao and Brunger, 2016), SPATA5 (spermatogenesis associated 5) and NVL (nuclear VCP-like) (Drg1 and Rix7 in yeast) involved in ribosome biogenesis (reviewed in Kressler et al., 2012) as well as PEX1 and PEX6 involved in peroxisome biogenesis (reviewed in Grimm et al., 2016). A feature unique to p97 is its 76 amino acid long, unstructured C-terminal extension, which is highly flexible and is involved in the regulation of the ATPase activity and cofactor assembly, the latter being modulated by phosphorylation (Li et al., 2008; Ewens et al., 2010; Niwa et al., 2012).

p97 participates in many different cellular pathways involved in the regulation of protein homeostasis, membrane fusion and vesicular trafficking as well as chromatin-associated functions (reviewed in Meyer et al., 2012; Meyer and Weihl, 2014). In all these processes p97 extracts or disassembles ubiquitylated substrates from membranes, chromatin or, in general, from large protein complexes often, but not always, resulting in downstream degradation by the proteasome (**Figure 1B**): (i) p97 has been shown to extract different ubiquitylated proteins from chromatin in processes such as cell cycle regulation, transcriptional and replication stress responses, several DNA repair processes (nucleotide excision repair, double strand break repair), or replication. Subsequently, these proteins are either degraded by the proteasome or recycled to modulate the dynamics of chromatin regulators (reviewed in Franz et al., 2016); (ii) p97 is also involved in various membrane trafficking processes, including Golgi reassembly at the end of mitosis and in endocytosis (reviewed in Meyer, 2005; Bug and Meyer, 2012); (iii) p97 is a key player in multiple protein quality control pathways mediated by the ubiquitin proteasome system and autophagy. It is involved in the extraction of misfolded proteins from the ER (ER-associated degradation, ERAD; reviewed in Stolz et al., 2011; Wolf and Stolz, 2012) and similarly translocates damaged mitochondrial proteins into the cytosol in a process called outer mitochondrial membrane associated degradation (OMMAD; Heo et al., 2010; Xu et al., 2011; Hemion et al., 2014); p97 is also part of the ribosome-quality control complex (RQC), which is involved in the degradation of stalled nascent peptides (ribosome-associated degradation; Brandman et al., 2012). Recently, p97 was shown to be involved in the removal of damaged lysosomes by autophagy (Papadopoulos et al., 2017).
Due to its participation in essential cellular processes, p97 has been linked to pathophysiological states including cancer, neurodegenerative disorders and premature aging (reviewed in Chapman et al., 2011; Fessart et al., 2013; Franz et al., 2014; Tang and Xia, 2016). Mutations of p97 are causative of three protein aggregation diseases (proteinopathies; reviewed in Tang and Xia, 2016): Multisystem Proteinopathy (MSP), Familial Amyotrophic Lateral Sclerosis (FALS) and Charcot-Marie-Tooth Disease, Type 2Y (CMT2Y).

The functional diversity of p97 is regulated, amongst other mechanisms, by a large number of regulatory cofactors, which either associate with the N-terminal domain or interact with the extreme C-terminus via distinct binding motifs/domains and target p97 to specific cellular pathways, sometimes requiring the simultaneous association with more than one cofactor (reviewed in Buchberger et al., 2015) (**Figure 1C**). Furthermore, post-translational modifications (PTMs) like SUMOylation, ubiquitylation, palmitoylation, acetylation, and phosphorylation of p97 have been identified by site-specific techniques and/or high throughput proteomics (Fang et al., 2016; PhosphoSitePlus, http://www.phosphosite.org, Hornbeck et al., 2015). More importantly, these modifications were proposed to modulate the diverse molecular activities of p97.

p97 contains 12 ATP binding sites, 6 in each ATPase ring, which are located at the interface of adjacent monomers. Upon ATP-binding and hydrolysis significant conformational changes occur, which are transmitted via long flexible linkers from the D2 domain to the D1 domain and further to the N domain (**Figure 1D**) (Banerjee et al., 2016; Na and Song, 2016; Schuller et al., 2016; reviewed in Xia et al., 2016). These conformational changes, which are regulated by intradomain (within the same protomer; Ye et al., 2003; Chou et al., 2014) and interdomain (between adjacent protomers; Huang et al., 2012; Li et al., 2012; Hänzelmann and Schindelin, 2016b) signaling mechanisms include: (i) Opening and closing of the D2 pore; (ii) Rotational movement of the ATPase rings; (iii) Up and down movements of the N domain. In addition, through these conformational changes the D1 and D2 domains move slightly apart from each other forming an additional channel leading into the D2 pore (Na and Song, 2016). The upper part of the D2 pore has a constriction near the center, which is formed by the six side chains of the D1 residue His317, a structural element referred to as the His-gate (DeLaBarre and Brunger, 2003, 2005; Hänzelmann and Schindelin, 2016b). There are two layers of pore loops lining the D2 pore, a smaller one composed of aromatic amino acids and a longer one featuring both negatively and positively charged residues. An intersubunit signaling network (ISS) has been identified that couples the conformation of the putative substrate-translocating pore to the nucleotide state of the cis-subunit, which is then transmitted to the trans-subunit and coordinates in this way ATP hydrolysis in adjacent monomers (Hänzelmann and Schindelin, 2016b). In addition, the ISS is involved in signal transmission from the D2 domain via the D1D2 linker to the D1 domain of the adjacent monomer, a mechanism called interprotomer motion transmission (Huang et al., 2012; Li et al., 2012).

In this review, we will focus on the molecular basis of p97 cofactor specificity/diversity and will discuss how PTMs can modulate p97-cofactor interactions and affect the physiological and patho-physiological functions of p97.

# MOLECULAR INSIGHTS INTO p97 COFACTOR DIVERSITY

The participation of p97 in highly diverse cellular processes is regulated by the association with a large number of cofactors (reviewed in Yeung et al., 2008; Stolz et al., 2011; Meyer et al., 2012; Buchberger et al., 2015). Known cofactors are typically multidomain proteins composed of specific p97 binding modules and additional domains which, for example, function in the recognition of ubiquitylated target proteins, possess catalytic domains for substrate processing or transmembrane domains amongst others. So far about 30 cofactors have been identified with the latest entries to the list being published in 2016 (Arumughan et al., 2016), hence the number is expected to further increase. Based on their function, cofactors can be divided into three major classes: (i) Substrate-recruiting cofactors like UBA-UBX proteins and UFD1-NPL4: these cofactors link substrates to p97 and contain, beside a p97 binding motif/domain, additional ubiquitin binding domains/motifs, which target ubiquitylated substrates; (ii) Substrate processing cofactors like ubiquitin (E3) ligases, deubiquitinases (DUBs) and peptide N-glycanase (PNGase), which process ubiquitylated and N-glycosylated substrates; (iii) Regulatory cofactors like the UBX proteins UBXD4 and ASPL (also known as TUG and UBXD9) as well as SVIP, which may sequester or recycle p97 hexamers. Despite the large number of cofactors, they interact via a small number of conserved binding modules (reviewed in Buchberger et al., 2015). Although a few cofactors bind via their PUB (PNGase/UBA or UBX containing proteins) or PUL (PLAP, Ufd3p, and Lub1p) domain to the unstructured C-terminal tail of p97, the majority of cofactors interact with the N-terminal domain either via a UBX (ubiquitin regulatory X)/UBXL (UBX-like) domain or three linear binding motifs, called VCP-interacting motif (VIM), VBM (VCP-binding motif), and SHP (BS1, binding segment).
Molecular insights have been obtained for all interacting domains/motifs from corresponding p97-ligand complex structures as reviewed below.

#### UBX and UBXL Domains

UBX and UBXL domains both consist of a ubiquitin-like fold. UBX proteins can be sub-divided into two families (**Figure 2A**): (i) UBA-UBX proteins, which also contain a UBA (ubiquitin-associated) domain that can bind to ubiquitylated substrates; (ii) UBX-only proteins. Molecular insights into the p97-UBX domain interaction have been revealed by crystal structures of the N domain in complex with the FAF1-UBX (Hänzelmann et al., 2011; Kim et al., 2011a; Lee et al., 2013) and the UBXD7-UBX (Li et al., 2017), p97-ND1 in complex with the p47-UBX (Dreveny et al., 2004) and full-length p97 in complex with the ASPL-UBX domain (Arumughan et al., 2016) (**Figures 2B,C**). A common feature is that the UBX domain interacts with the N domain via a conserved R...FPR signature motif (**Figures 2A,B**) located in a loop connecting two β-strands, which inserts into a hydrophobic binding pocket located in between the two subdomains of the N domain. The FPR motif adopts a cis-proline configuration, a rarely observed cis-Pro touch-turn structure, also called a FcisP touch-turn motif (Kang and Yang, 2011). In contrast to the UBX domains of FAF1, UBXD7 and p47, the N- and C-terminal regions of the UBX domain in ASPL contain unique structural extensions with the C-terminal extension in extensive contact with the β-grasp fold of the UBX domain (**Figure 2D**). Biochemical and structural data have shown that, as in other UBX proteins, the conserved cis-Pro touch-turn motif is important for the initial association with p97 hexamers. Subsequently an α-helical lariat structure formed by the N-terminal extension dissociates the p97 hexamer into monomers, resulting in the formation of a metastable p97-ASPL heterodimer (Arumughan et al., 2016). The α-helical lariat in ASPL is a flexible structure that directly targets the D1:D1 interprotomer interface in p97 hexamers, a region crucial for oligomer stability (**Figure 2D**).
The p97-ASPL heterodimers subsequently oligomerize into (p97-ASPL)<sub>2</sub> heterotetramers, accompanied by a reorientation of the D2 ATPase domain, leading to the inhibition of its ATPase activity.

In analogy to the FPR loop of UBX proteins, the crystal structure of the UBXL domain of the DUB OTU1 (yeast homolog of mammalian YOD1) in complex with the N domain also features a loop (YPP motif) inserting into the hydrophobic pocket (Kim et al., 2011a), whereas the UBXL domain of NPL4 does not feature an extended loop and binds differently (Isaacson et al., 2007; Hao et al., 2015) (**Figure 2C**). Despite displaying only a low degree of sequence identity, UBXL and UBX domains adopt a similar structure and bind in a similar position with respect to the N domain, yet the interaction modes and relative positions are specific for each protein.

#### VIM and VBM Binding Motifs

The VIM and VBM, which have been identified in several unrelated proteins (**Figure 3A**), are both linear polypeptide stretches enriched in positively charged amino acids that adopt an α-helical conformation. The crystal structure of the N domain in complex with a peptide covering the VIM binding motif of gp78 revealed that the α-helical motif interacts with the hydrophobic binding pocket located in between the two subdomains on the N domain (Hänzelmann and Schindelin, 2011) (**Figure 3B**). The VIM of gp78 contains, beside the two conserved arginines of the signature motif, a third, non-conserved arginine immediately in front of the first conserved arginine (the leading R in RRx5AAx2Rh). All three arginines are important, with the highly conserved last arginine (Arg636 in gp78) being pivotal, and they engage in several electrostatic interactions as well as hydrophobic interactions via the aliphatic part of their side chains with the N domain. The crystal structure of the N domain in complex with the VBM of RHBDL4, a protein crucial for the retro-translocation of polyubiquitylated substrates in the ERAD pathway (Fleig et al., 2012), revealed a highly analogous overall spatial arrangement of the VBM and VIM with respect to the N domain (Lim et al., 2016a) (**Figure 3B**). Interestingly, the directionality of the α-helices is opposite in the two structures. Highly conserved basic residues in VBMs (consensus EhRRRRLxhh; h, hydrophobic residue; x, any amino acid; hh = RF in RHBDL4) and VIMs (consensus Rx2h3AAx2Rh; h, hydrophobic residue; x, any amino acid) are important to maintain the N domain interaction by contributing the majority of the ionic and hydrogen bonded interactions in the interface. However, the RHBDL4 VBM-N domain structure revealed a novel binding mode, which unexpectedly combined the two types of p97-cofactor specificities observed in the UBX and VIM interactions. Specifically, the RF motif in RHBDL4 VBM corresponds to the FPR motif in UBX (**Figure 3C**), and the RRR motif in VBM (RRhRLxRF) corresponds to the RRR motif in the gp78 VIM (RRx2h3AAx2Rh) (**Figure 3B**).

#### FIGURE 2 | Continued

hydrophobicity). The R...FPR motif is shown in stick representation. (C) Left, superposition of the UBX domains from FAF1 (pdb entry 3QQ8, colored in gold; Hänzelmann et al., 2011), p47 (pdb entry 1S3S, colored in brown; Dreveny et al., 2004) and ASPL (pdb entry 5IFW, colored in gray; Arumughan et al., 2016) bound to p97 N colored as in (B). Right, superposition of the UBXL domains of NPL4 (pdb entry 4RV0, colored in olive; Hao et al., 2015) and OTU1 (pdb entry 4KDI, colored in orange; Kim et al., 2014) bound to p97 N colored as in (B). (D) Disassembly of p97 hexamers through the interaction with the ASPL-UBX domain and formation of the stable p97-ASPL heterotetramer via metastable p97-ASPL heterodimers. For clarity, only two monomers of p97 are shown. One heterodimer is shown in cartoon representation and the other in surface representation. The ASPL-UBX domain is colored in yellow with the N- and C-terminal extensions in red and p97 in light gray (N domain), dark gray (D1 domain) and gray (D2 domain) (pdb entry 5IFW; Arumughan et al., 2016). The curved arrow indicates the reorientation of the D2 domain.

The binding pocket formed in the Nn and Nc lobes provides a sterically unopposed interface for the interaction of the various p97 cofactor proteins. Proteomic studies identified phosphorylation, ubiquitylation, and mono-methylation sites in the VIM/VBM binding motifs of different proteins (Hornbeck et al., 2015), thus indicating that PTMs control the interaction with p97.
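The consensus notation quoted above for the VIM (Rx2h3AAx2Rh) and VBM (EhRRRRLxhh) can be translated mechanically into a search pattern. The following is a minimal sketch of that idea, not a validated motif predictor: the set of residues counted as hydrophobic, the helper names, and the test peptide are all assumptions introduced for illustration and do not come from the cited studies.

```python
import re

# Assumption: the text does not define which residues count as
# "hydrophobic" (h); this set is an illustrative choice.
HYDROPHOBIC = "AVLIMFWYC"


def consensus_to_regex(consensus: str) -> str:
    """Translate notation such as 'Rx2h3AAx2Rh' into a regular expression.

    Upper-case letters are literal residues, 'x' is any residue, 'h' is a
    hydrophobic residue, and a trailing integer repeats the preceding symbol.
    """
    parts = []
    for symbol, count in re.findall(r"([A-Zxh])(\d*)", consensus):
        if symbol == "x":
            token = "."
        elif symbol == "h":
            token = f"[{HYDROPHOBIC}]"
        else:
            token = symbol
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


# Consensus strings exactly as written in the text.
MOTIFS = {
    "VIM": consensus_to_regex("Rx2h3AAx2Rh"),
    "VBM": consensus_to_regex("EhRRRRLxhh"),
}


def find_motifs(sequence: str) -> list[tuple[str, int, str]]:
    """Return (motif name, 1-based start, matched segment) for every hit."""
    hits = []
    for name, pattern in MOTIFS.items():
        for match in re.finditer(pattern, sequence):
            hits.append((name, match.start() + 1, match.group()))
    return hits


# Hypothetical peptide constructed only to satisfy the VIM consensus.
print(find_motifs("GSREQLLVAAESRLKG"))
```

A pattern match of this kind only flags candidate stretches; whether a candidate adopts the α-helical conformation and binds the N domain cannot be inferred from sequence alone.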

#### SHP-Binding Motif

The SHP binding motif has been identified as an additional binding element in several UBX proteins and in UFD1, the latter typically forming a stable heterodimer with the UBXL protein NPL4 (**Figure 4A**). In contrast to the UBX/UBXL domains and the VIM/VBM binding motifs that bind into the hydrophobic cleft located in the N domain, the SHP binding motif targets an alternative binding site on the N domain. The SHP binding motif features two invariant glycine residues and a highly conserved aromatic residue with the consensus sequence h(x)1−2F/W(x)0−1GxGx2L (h, hydrophobic residue; x, any amino acid). The initial crystal structure of full-length p97 in complex with the SHP motif of UFD1 revealed that the motif adopts a mostly extended, yet slightly bent conformation, and binds at the periphery of the C-terminal α + β subdomain (Nc, aa 112–186) of the N domain, in direct vicinity of the ND1 linker (Hänzelmann and Schindelin, 2016a). Subsequently, a high-resolution structure of the N domain in complex with the UFD1-SHP using an N domain-SHP fusion protein (Le et al., 2016) (**Figures 4B,C**) as well as the N domain structure with the SHP of the DERLIN1 (DER1) rhomboid pseudoprotease (Lim et al., 2016b) (**Figure 4C**) were determined. The SHP motif forms a random coil interrupted by a small two amino acid long β-sheet, which associates with the central four-stranded β-sheet, thereby extending it to a five-stranded antiparallel β-sheet. In addition, an adjacent α-helix stabilizes the complex. The motif thus binds in a hydrophobic binding pocket and is stabilized between one of the β-strands and this α-helix, which together are arranged into a β-β-α super-secondary structure, a well-known binding mode mediating protein-protein interactions (Lim et al., 2016b). The two strictly conserved glycine residues (GxG) generate a sharp kink in the middle of the SHP motif, thereby enabling the bending of the motif upon binding to the N domain. 
The interaction mainly involves hydrophobic contacts with only a few electrostatic contacts being observed. Upon binding of the SHP motif, flexible loop regions and secondary structural elements in the binding region are stabilized, including a loop region (141EAYRP145) found to be involved in UBX/UBXL and VIM/VBM interaction, thus suggesting that the rigidification of this region upon SHP binding may affect binding of other cofactors.

The interaction of the UFD1 SHP binding motif could be regulated by phosphorylation. Proteomic studies indicate that both serine residues (Ser229 and Ser231) located in the SGSG motif are phosphorylated, whereas no possible PTMs for the DER1 motif have been identified so far (Hornbeck et al., 2015).
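Unlike the VIM and VBM, the SHP consensus h(x)1−2F/W(x)0−1GxGx2L contains variable-length gaps and an alternative (F or W). A short sketch of how such a pattern might be searched, with named groups marking the aromatic residue and the GxG element, is given here. The hydrophobic residue set, the function name, and the example peptide are hypothetical; the peptide is not the UFD1 or DER1 sequence.

```python
import re

# Assumption: illustrative definition of "hydrophobic" (h).
HYDROPHOBIC = "AVLIMFWYC"

# h (x)1-2 F/W (x)0-1 G x G x2 L, with named groups for the conserved
# aromatic residue and the GxG element that produces the kink.
SHP_PATTERN = re.compile(
    rf"[{HYDROPHOBIC}].{{1,2}}(?P<aromatic>[FW]).?(?P<gxg>G.G).{{2}}L"
)


def find_shp_motifs(sequence: str) -> list[dict]:
    """Return 1-based coordinates of each candidate SHP motif."""
    hits = []
    for match in SHP_PATTERN.finditer(sequence):
        hits.append(
            {
                "start": match.start() + 1,
                "segment": match.group(),
                "aromatic_position": match.start("aromatic") + 1,
                "gxg_position": match.start("gxg") + 1,
            }
        )
    return hits


# Hypothetical peptide built to satisfy the consensus.
print(find_shp_motifs("QLSAFSGSGNTLK"))
```

Reporting the position of the GxG element separately makes it easy to check whether nearby residues, such as the serines discussed above, fall inside the kinked region.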

#### PUB and PUL Domains

PUB and PUL domains, which are structurally unrelated, are currently the only known domains that interact with the extreme C-terminus of p97 (**Figure 5A**). Molecular insights into the p97-PUB/PUL domain interaction have been obtained by crystal structures of the PNGase (Zhao et al., 2007) and HOIP (HOIL-1-interacting protein; Schaeffer et al., 2014) PUB domains (**Figure 5B**) as well as of the PUL domain of PLAA (the ortholog of yeast Doa1/Ufd3, which is also known as phospholipase A2 activating protein or PLAP) (Qiu et al., 2010) (**Figure 5C**) bound to a peptide derived from the final residues of the p97 C-terminus, which is referred to as the PUB interacting motif (PIM). The formation of the p97-PUB/PUL complex is mediated by hydrophobic and electrostatic interactions, with key interactions being contributed by the hydrophobic Leu804 as well as the aromatic side chain of the penultimate tyrosine residue (Tyr805) of p97, which inserts into a hydrophobic pocket on the PUB/PUL domain. Based on biochemical data (Zhao et al., 2009) it was proposed that the p97 C-terminus stretches into a neighboring positively charged ridge formed on the PUL surface (**Figure 5C**).

The PUB/PUL-PIM interaction is regulated by PTMs. Phosphorylation of the strictly conserved penultimate tyrosine residue of p97 abolishes binding (Zhao et al., 2007; Li et al., 2008; Schaeffer et al., 2014), thus suggesting a conserved mechanism to control the interaction of the PIM with its binding partners. In addition, PTMs of the respective PUB/PUL domain proteins could regulate the individual interactions. For example, proteomic studies (Hornbeck et al., 2015; Hendriks et al., 2017) revealed that the PUB domain of PNGase is ubiquitylated at Lys50, whereas in HOIP Lys99 is ubiquitylated. Furthermore, the PUB domain-containing protein UBXD1 is phosphorylated at Tyr195 (corresponding to Tyr51 in PNGase), ubiquitylated at Lys180 and Lys202, which are replaced by threonines in PNGase (Thr37 and Thr59), and SUMOylated at Lys180 and Lys193 (the latter residue corresponding to Lys50 in PNGase). Finally, in the PUL domain of PLAA, Lys554, which is located at a similar position to Arg55 of PNGase and Lys99 of HOIP, can be acetylated, ubiquitylated, or SUMOylated.

# REGULATION OF p97—COFACTOR ASSEMBLY

Since p97 participates in multiple cellular processes in different subcellular compartments, it needs to be specifically targeted to the respective pathway and its activity must be tightly controlled. This is achieved in part through the diversity of its cofactors. In addition, cofactor assembly is regulated by multiple mechanisms including binding site competition, bipartite binding, different binding stoichiometries, hierarchical binding, and conformational changes (**Figure 6**) (reviewed in Buchberger et al., 2015). Finally, in cellulo PTMs of p97 and its associated cofactors as well as crosstalk between PTMs introduce an additional level of complexity (**Figure 7**).

#### Binding Site Competition

Overall, only three different interaction sites, the hydrophobic inter-subdomain cleft and the SHP binding site in the N domain together with the C-terminus of p97, have been identified so far. These must accommodate the 30 different cofactors; hence competition of diverse cofactors for the same site inevitably occurs. Specifically, although there is no sequence similarity between the VIM as well as VBM binding motifs and the UBX/UBXL domains, these cofactors all target the same general area, the hydrophobic inter-subdomain cleft of the N domain, which explains the competition between these cofactors in vitro (**Figure 6A**). Likewise, the PUB and PUL domain cofactors, which display no structural similarity to each other, bind in a highly conserved manner to the C-terminal tail. In cells, however, the situation is expected to be more complex. Depending on the presence of specific substrates and the subcellular localization, certain cofactors may be more abundant or even exclusively present, thus alleviating the problem of cofactors competing for the same binding site.

#### Conformational Changes upon ATP Binding and Hydrolysis

ATP binding/hydrolysis is coupled to profound conformational changes in the location of the N domains, which interconvert between a position in plane with the D1 ring (locked conformation or down conformation; ADP-bound state) and an orientation in which they are located above the D1 ring (up conformation; ATP-bound state) (Banerjee et al., 2016; Schuller et al., 2016). Low-resolution cryo-EM studies of the p97−p47 (trimer), p97−FAF1 (trimer) and p97−UFD1-NPL4 complexes (Beuron et al., 2006; Bebeacua et al., 2012; Ewens et al., 2014) revealed that these interactions involve several N domains being present in the up conformation, hence suggesting that the up and down movement of the N domain during ATP binding and hydrolysis could exert the necessary force to disassemble the macromolecular assemblies being targeted by p97 (**Figure 6B**). Furthermore, the association of cofactors with the N domain affects the ATPase activity of p97 in different ways including inhibitory as well as stimulatory effects (Trusch et al., 2015; Zhang et al., 2015), and it could be demonstrated that cofactors are recruited to the D1 domain in response to ATP binding (Chia et al., 2012).

#### Bipartite p97-Cofactor Interactions

The two primary cofactors of p97, p47 and the heterodimeric UFD1-NPL4 complex, as well as other cofactors such as several UBX domain-containing proteins, harbor more than one p97-binding module, which enables bipartite binding of these cofactors to the p97 N domain (Bruderer et al., 2004). In the case of p47, its UBX domain and SHP motif are involved, while the UFD1-NPL4 heterodimer employs the UBXL domain of NPL4 and the SHP motif of UFD1 to target p97. Low-resolution cryo-EM studies revealed that p47, where both binding domains/motifs reside on the same polypeptide, binds as a trimer where each subunit apparently interacts with two adjacent p97 monomers (Beuron et al., 2006). In the case of the UFD1-NPL4 heterodimer, it is still unclear how bipartite binding is accomplished. A model in which the NPL4-UBXL domain and the UFD1-SHP motif target adjacent N domains (Pye et al., 2007) lacks experimental support despite the existence of low-resolution EM structures depicting the p97−UFD1-NPL4 complex (Bebeacua et al., 2012). Recently it has been proposed that bipartite binding could involve either a single or two adjacent N domains (Hänzelmann and Schindelin, 2016a) (**Figure 6C**).

UBXD1 also employs a bipartite binding mode in which its PUB domain interacts with the C-terminus of p97 and its VIM targets the N domain (**Figure 6C**) (Kern et al., 2009). This interaction mode, in which a single cofactor targets the two major binding sites of p97, namely the N domain and the C-terminus, is unique among p97 cofactors and is expected to restrict the conformational flexibility of p97. Isothermal titration calorimetry (ITC) studies demonstrated that UBXD1 binds as a trimer with both binding motifs contributing to this interaction (Hänzelmann and Schindelin, 2011).

However, it is currently unknown whether cofactors with two interaction sites remain stably associated via both binding modules at all times. Possibly, ATP hydrolysis and/or substrate binding may induce conformational changes leading to dissociation at one site with a concomitant increase in conformational flexibility of the complex during subsequent catalysis (Buchberger et al., 2015; Hänzelmann and Schindelin, 2016a). Hence a bipartite binding mode not only enhances the affinity of these cofactors but also imposes conformational restrictions in p97, which may modulate its catalytic properties.

#### FIGURE 7 | Continued

p97 post-translational modifications (PTMs). (A) Domain architecture of p97 together with identities of PTMs derived from the public database PhosphoSitePlus (Hornbeck et al., 2015) and published sources. (B) Nucleotide-induced conformational changes of the p97 N-terminal extension (residues 1–24, colored black). Molecular surface representation of p97 in the ADP and ATP states (pdb entries 5FTK and 5FTN; Banerjee et al., 2016). In one of the monomers of the ATP-bound structure the N-terminal extension is modeled according to Schuller et al. (2016). The opening at the D1D2 interface is shown in red. Identified PTMs on the extension are indicated. (C) Nucleotide-induced conformational changes of the p97 C-terminus. Molecular surface representation together with a cartoon representation of the C-terminus in the apo- (top) and ATP-bound state (bottom) (pdb entries 5C19 and 5C18; Hänzelmann and Schindelin, 2016b). The disordered C-terminal tail is indicated with dashed lines. Identified PTMs in the C-terminal helix α9 are shown and listed for the disordered region. (D) Phosphorylation sites identified in the ND1 part of p97 (N-terminal extension in black, N domain in dark gray, D1 in light gray). Identified phosphorylation sites (colored in magenta) are mapped onto the molecular surface of p97 in the ADP- and ATP-bound states (pdb entries 5FTK and 5FTN; Banerjee et al., 2016) and are shown in a side view and top view. Residues in the D1D2 interface are shown in red. In addition, palmitoylation of Cys105 (colored in yellow) and monomethylation of Arg155 (colored in blue) are indicated. (E) Ubiquitylation and SUMOylation sites identified on the ND1 part of p97. Identified ubiquitylation sites (colored in green) and SUMOylation sites (colored in orange) as well as sites, which carry both modifications (colored in cyan), are mapped onto the molecular surface of p97 as in (D). p97 is colored as in (D).

#### Oligomeric Assembly

Interestingly, while p97 harbors six N domains most cofactors bind substoichiometrically (**Figure 6D**), probably due to steric hindrance of the large, multidomain cofactors. Currently, the only known cofactor that results in a 6:6 assembly is the small protein SVIP (Hänzelmann and Schindelin, 2011), which is an efficient competitor for N domain cofactors and has been suggested to be a negative regulator during ERAD (Ballar et al., 2007). Other cofactors like p47 and UBXD1 trimerize and, as discussed above, the UBX protein ASPL even disrupts the hexameric assembly and forms a heterotetrameric complex with p97. In contrast, only one UFD1-NPL4 heterodimer associates with two adjacent N domains within the p97 hexamer and additional cofactors could possibly interact with non-occupied N domains. Accordingly, it was demonstrated that UBXD7, UBXD8, FAF1 and SAKS1, which are all UBA-UBX domain containing proteins, coimmunoprecipitate with p97 bound to the UFD1-NPL4 heterodimer and endogenous ubiquitin conjugates (Alexandru et al., 2008), a finding which supports the existence of higher-order p97-cofactor1-cofactor2 complexes. Subsequently a hierarchical binding of the two UBA-UBX proteins FAF1 and UBXD7 was demonstrated (Hänzelmann et al., 2011; Lee et al., 2013). Specifically, for UFD1-NPL4 and FAF1, the resulting p97−UFD1-NPL4−FAF1 complex exhibited a stoichiometry of 6:1:1. Since no direct interaction between FAF1/UBXD7 and UFD1-NPL4 could be observed, it is conceivable that conformational changes are induced in p97 upon binding of UFD1-NPL4 which generate an asymmetry in p97 and as a consequence only one of the vacant p97 subunits has the ability to interact tightly with FAF1 or UBXD7 (Hänzelmann et al., 2011). In addition, a macromolecular complex composed of p97, UBXD1 and the two substrate-processing cofactors YOD1 and PLAA has been shown to be involved in the removal of ruptured lysosomes by autophagy, indicating that under certain conditions a distinct set of proteins interact with each other at least transiently in a substrate-dependent manner (Papadopoulos et al., 2017).
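The stoichiometries described above can be turned into simple bookkeeping of how many of the six N domains remain unoccupied in a given assembly. The sketch below is only an accounting aid under stated assumptions: the value of two N domains per UFD1-NPL4 heterodimer follows the text, whereas counting one N domain per SVIP or FAF1 copy is a simplifying assumption, and the function and dictionary names are hypothetical.

```python
N_DOMAINS_PER_HEXAMER = 6

# N domains engaged per bound copy. The UFD1-NPL4 value follows the text;
# one N domain per copy for SVIP and FAF1 is a simplifying assumption.
N_DOMAINS_PER_COPY = {
    "SVIP": 1,
    "UFD1-NPL4": 2,
    "FAF1": 1,
}


def vacant_n_domains(copies: dict[str, int]) -> int:
    """Return the number of unoccupied N domains in one p97 hexamer."""
    occupied = sum(N_DOMAINS_PER_COPY[name] * n for name, n in copies.items())
    if occupied > N_DOMAINS_PER_HEXAMER:
        raise ValueError("assembly would occupy more than six N domains")
    return N_DOMAINS_PER_HEXAMER - occupied


print(vacant_n_domains({"SVIP": 6}))                  # 6:6 assembly
print(vacant_n_domains({"UFD1-NPL4": 1}))             # one heterodimer bound
print(vacant_n_domains({"UFD1-NPL4": 1, "FAF1": 1}))  # 6:1:1 complex
```

In this simple count, four N domains are still free after UFD1-NPL4 binds, yet only one FAF1 is incorporated, which is the observation that motivates the proposed asymmetry in p97.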

#### Post-translational Modifications (PTMs)

Recently it could be shown that most cofactors of p97, such as p47, form extremely dynamic complexes with p97 that undergo rapid dissociation and exchange in cell lysates, thus raising the question of how p97-cofactor complexes can adequately perform their tasks in vivo (Xue et al., 2016). Mechanisms that modulate the lifespan and stabilization of p97-cofactor complexes, like PTMs of p97 (Ewens et al., 2010) and its cofactors (Uchiyama et al., 2003; Almeida et al., 2015) or substrate recruitment by cofactors, have been proposed.

PTMs like phosphorylation, lysine acetylation, and ubiquitylation function as molecular switches and may trigger or abolish the association of proteins with cofactors, lipids, DNA, and other proteins. PTMs usually lead to structural changes such as exposing or masking active sites as well as interfaces for protein–protein interaction, thus regulating for example subcellular localization, stability, and activity in response to internal and external stimuli (reviewed in Beltrao et al., 2013; Ryšlavá et al., 2013; Venne et al., 2014). The abundance of PTMs is often controlled by so-called "writers" and "erasers," which are enzymes capable of adding or removing the modifications, respectively. In the case of phosphorylation events these correspond to kinases and phosphatases, while they involve E3 ligases and DUBs in the case of ubiquitylation. The functional consequences can also be exploited by proteins that specifically bind to these modifications via so-called "reader" domains, like Src-homology-2 (SH2) or WD40 domains, amongst many others, in the case of phosphorylation or specific ubiquitin-binding domains (UBD) in the case of ubiquitylation (reviewed in Seet et al., 2006; Beltrao et al., 2013).

Numerous high-throughput proteomic studies revealed that p97 is extensively targeted by PTMs like phosphorylation (66 identified sites), ubiquitylation (38 identified sites) and acetylation (24 identified sites) (PhosphoSitePlus, http://www.phosphosite.org, Hornbeck et al., 2015). However, the relevant enzymes and functional consequences of these modifications are poorly understood. Similar findings have been reported for other chaperones, which suggests the existence of a combinatorial code regulating the localization, activity, and substrate specificity for these biologically important proteins (Cloutier and Coulombe, 2013). Interestingly, in addition to modification sites that are only accessible in the functional hexameric form of p97, other sites were identified which are only partially accessible, completely buried or even located in the central channel. These latter sites would not be accessible for modifying enzymes unless p97 were to dissociate into a monomer, while partially buried sites could be accessible after conformational changes. Whether p97 can exist in a monomeric form, and especially under which conditions, is not known. Currently the only known example is the described disruption of the hexamer through the interaction with the UBX protein ASPL resulting in the formation of a p97-ASPL heterotetramer (Arumughan et al., 2016). Besides numerous phosphorylation, ubiquitylation and acetylation events, SUMOylation, palmitoylation, methylation, succinylation, and S-glutathionylation were also identified (**Figure 7A**). In the case of p97, PTMs may promote or prevent protein–protein interactions by obscuring existing binding sites, generating new interfaces, or triggering conformational changes, with the exact functional consequence dictated by the substrate and cellular context (Almeida et al., 2015).

Interestingly, in the small 24 amino acid long N-terminal extension (**Figure 7B**) as well as in the C-terminal tail with its preceding α-helix (α9) (**Figure 7C**) many PTMs have been identified, thus indicating an important function of these extensions in regulating the properties of p97. Recently it could be shown that the small N-terminal extension, which is highly flexible and disordered in the ADP-bound state, undergoes a large conformational change upon ATP binding, when it becomes ordered, relocates itself beyond the cleft between the D1 and D2 domain and inserts into the D2 domain (Schuller et al., 2016) (**Figure 7B**). Furthermore, deletion of this extension reduces the ATPase activity to a similar degree as found in the absence of the N domain (unpublished data), suggesting an important function in the regulation of the ATPase activity. In addition, it could be demonstrated that the C-terminal α-helix undergoes a significant conformational change upon ATP binding (**Figure 7C**). This helix is kinked and inserts between two adjacent monomers into the ATP-binding pocket of the trans-monomer, and an arginine (Arg766) directly coordinates the γ-phosphate of the ATP, thereby closing the D2 nucleotide-binding pocket (Hänzelmann and Schindelin, 2016b). In the absence of nucleotide, the C-terminus is present in a different conformation and no longer inserts into the ATP-binding pocket. In addition to the N- and C-terminal extensions, the cofactor binding N domain and its associated D1 domain are also extensively targeted by PTMs (**Figures 7D,E**). In the following sections we will focus on PTMs that may have functional implications.

#### Phosphorylation

Protein phosphorylation, which typically targets serine, threonine, or tyrosine residues, is a reversible PTM controlled by kinases and phosphatases. Among the 66 currently known phosphorylation sites in p97, some have been studied in more detail. One example is the aforementioned phosphorylation of the C-terminal Tyr805 by c-Src kinase, which abolishes the interaction with PUB/PUL domain-containing cofactors (Zhao et al., 2007; Li et al., 2008; Schaeffer et al., 2014). The negatively charged phosphate group would be expected to introduce electrostatic and steric hindrance in the context of the tight binding pocket in the PUB/PUL domain.

p97 phosphorylated at the C-terminal tail residue Ser784 by DNA-PK (DNA-dependent protein kinase) was demonstrated to accumulate at sites of DNA double-strand breaks (DSBs) (Livingstone et al., 2005), where p97 interacts more tightly with chromatin. Furthermore, p97 can be phosphorylated on Ser457, Ser459, and Ser326 (buried) by ATM (ataxia telangiectasia-mutated) and ATR (ATM- and Rad3-related), the two proximal checkpoint kinases, which regulate the DNA damage response (DDR) (Mu et al., 2007). Interestingly, Ser457 and Ser459 are both located next to the channel entrance at the D1D2 interface of the adjacent monomer (**Figure 7D**) and conformational changes can be expected upon phosphorylation.

Phosphorylation of Ser770 by SIK2 [salt inducible kinase 2 of the AMP–activated protein kinase (AMPK) family] stimulates the ATPase activity of p97 (Yang et al., 2013). Ser770 is located in the C-terminal tail, which undergoes significant conformational changes during ATP binding/hydrolysis (**Figure 7C**) (Hänzelmann and Schindelin, 2016b). Although SIK2 phosphorylates p97 through its N-terminal kinase domain, it interacts with p97 via its extremely glutamine-rich C-terminal region and plays a critical role in ERAD and ER homeostasis (Yang et al., 2013).

p97 is also a target of the serine/threonine kinase Akt (protein kinase B, PKB), which plays important roles in cell survival and phosphorylates p97 at Ser352 (buried), Ser746 (buried), and Ser748 (N-terminus of the C-terminal helix α9) (Klein et al., 2005; Vandermoere et al., 2006).
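The kinase-site assignments listed in the preceding paragraphs can be collected in a small lookup structure, which makes it easy to ask which sites a given kinase targets or which sites are described as buried. The sketch below encodes only the assignments stated in the text; the class and function names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhosphoSite:
    residue: str
    kinases: tuple[str, ...]
    buried: bool = False


# Assignments as stated in the text; "buried" mirrors the annotations there.
SITES = [
    PhosphoSite("Tyr805", ("c-Src",)),
    PhosphoSite("Ser784", ("DNA-PK",)),
    PhosphoSite("Ser457", ("ATM", "ATR")),
    PhosphoSite("Ser459", ("ATM", "ATR")),
    PhosphoSite("Ser326", ("ATM", "ATR"), buried=True),
    PhosphoSite("Ser770", ("SIK2",)),
    PhosphoSite("Ser352", ("Akt",), buried=True),
    PhosphoSite("Ser746", ("Akt",), buried=True),
    PhosphoSite("Ser748", ("Akt",)),
]


def sites_for(kinase: str) -> list[str]:
    """Residues reported as targets of the given kinase."""
    return [site.residue for site in SITES if kinase in site.kinases]


def buried_sites() -> list[str]:
    """Residues annotated as buried in the hexamer."""
    return [site.residue for site in SITES if site.buried]


print(sites_for("Akt"))
print(buried_sites())
```

Listing the buried sites together highlights the open question raised earlier, namely how kinases gain access to residues that are not exposed in the assembled hexamer.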

In the p97 N domain, several phosphorylation sites identified in proteomic studies directly interfere with cofactor association (**Figure 7D**): (i) Thr37 and Ser56, which are both located in the hydrophobic binding cleft; (ii) Tyr110 and Tyr143, which are both key residues for UBX/UBXL interaction; and (iii) Thr168, which is located close to the SHP binding groove and could influence binding of SHP-containing cofactors upon structural rearrangements. Since cofactors typically assemble on top of the D1 ring, phosphorylation at sites on top of the D1 ring could influence this interaction by modulating cofactor affinity. Finally, phosphorylation sites in close proximity to the channel formed between the D1 and D2 domains were identified (see above).

#### Ubiquitylation

Covalent ubiquitin-protein conjugates are introduced in a series of three consecutive enzymatic steps (reviewed in Kerscher et al., 2006; Cappadocia and Lima, 2017). Initially a ubiquitin-activating enzyme (E1) activates ubiquitin in an ATP-dependent reaction and binds to it covalently. Subsequently, ubiquitin is transferred to a ubiquitin-conjugating enzyme (E2) in a transthioesterification reaction. Finally, ubiquitin is attached to one or more lysine residues in the target protein, in a reaction catalyzed by a ubiquitin ligase (E3). Modifications with ubiquitin are highly variable in length and linkage type (reviewed in Akutsu et al., 2016; Yau and Rape, 2016). Proteins can be modified at one or multiple lysine residues with either a single ubiquitin molecule (mono- and multi-monoubiquitylation, respectively) or ubiquitin polymers (polyubiquitylation). Modification of proteins with a single ubiquitin subunit typically alters intra- or inter-molecular interactions, which in turn affect the localization or the activity of the modified protein or its ability to interact with partner proteins (Husnjak and Dikic, 2012). Ubiquitin contains seven lysine residues (Lys6, Lys11, Lys27, Lys29, Lys33, Lys48, and Lys63) among its 76 residues, which, together with its amino terminus, provide eight sites for attaching further ubiquitin moieties, resulting in homo- and heterotypic polymeric ubiquitin chains. Different linkage types lead to different conformations of the corresponding ubiquitin chains and hence to unique binding epitopes, which trigger specific downstream signaling events (reviewed in Liu and Walters, 2010; Akutsu et al., 2016; Yau and Rape, 2016). Ubiquitylation is a dynamic and reversible process. The action of the E1-E2-E3 cascade is counteracted by deubiquitylating enzymes (DUBs), which specifically remove ubiquitin from target proteins (reviewed in Husnjak and Dikic, 2012; Sahtoe and Sixma, 2015).
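The count of eight attachment sites can be checked directly against the ubiquitin sequence. The short sketch below does this. The 76-residue sequence is not given in the text; it is the commonly cited mature human ubiquitin sequence and should be verified against a curated database entry before reuse. Variable names are chosen for illustration.

```python
# Assumption: commonly cited mature human ubiquitin sequence (76 residues),
# supplied here for illustration and not taken from the text.
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

# 1-based positions of all lysine residues.
lysine_positions = [
    position for position, residue in enumerate(UBIQUITIN, start=1)
    if residue == "K"
]

# The amino terminus (Met1) adds one further attachment site.
linkage_sites = ["Met1"] + [f"Lys{position}" for position in lysine_positions]

print(len(UBIQUITIN))      # 76
print(lysine_positions)    # [6, 11, 27, 29, 33, 48, 63]
print(len(linkage_sites))  # 8
```

Each entry in `linkage_sites` corresponds to one possible homotypic chain type; heterotypic and branched chains combine several of these sites within one polymer.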

Several high-throughput proteomic analyses, which focused on ubiquitylation sites, identified a large number of p97 lysine residues that are ubiquitylated (Danielsen et al., 2011; Kim et al., 2011b; Wagner et al., 2011, 2012; Mertins et al., 2013; Elia et al., 2015; Wu et al., 2015). So far none of the sites have been analyzed in more detail, hence it is not known whether p97 is mono- or poly-ubiquitylated, and, should the latter be the case, the linkage types remain undefined. Also, it is not known which E3 ligases are involved in this process. However, p97 is known to associate with several E3 ligases in either a direct or indirect fashion. Direct interactions involve for example the HRD1 and gp78 E3 ligases (Ballar et al., 2006; Morreale et al., 2009), which are both involved in ERAD, or HOIP (Schaeffer et al., 2014), the E3 ligase of the LUBAC complex (Linear Ubiquitin Chain Assembly Complex), which is involved in the formation of linear ubiquitin chains. Indirectly, p97 interacts with SCF E3 ligases containing cullins that bind for example to UBXD7, which in turn binds via its UBX domain to p97 (Alexandru et al., 2008). In addition, the DUBs ATAXIN-3, VCIP135 and YOD1 are known p97 interaction partners (Uchiyama et al., 2002; Boeddrich et al., 2006; Ernst et al., 2009). The physiological function of p97 ubiquitylation remains undefined, but it has been speculated to modulate the affinity of p97 for cofactors or substrates. Alternatively, the modifications could shift the equilibrium between the different conformational states of p97 or, in the extreme case, even induce additional conformational states. Monoubiquitylation of p97 could also be a signal to recruit the protein to specific cellular compartments.

Ubiquitylation sites in p97 have been identified in the N, D1, and D2 domain, but not in the flexible C-terminal extension (**Figure 7E**). Taking into account the size of ubiquitin (8.5 kDa) relative to the p97 N domain (21 kDa), one can conclude that, largely irrespective of the site where the modification is introduced, ubiquitylation of the N domain would interfere with cofactor interactions. Ubiquitylation of the N-terminal extension (Lys8/18/20) could prevent insertion of the N-terminus into the D2 domain and could block the entrance to the putative substrate-binding channel (**Figure 7B**). Furthermore, ubiquitylation on top of the D1 ring would directly interfere with the binding of cofactors, which typically assemble above the D1 ring (**Figure 7E**). Ubiquitylated p97 could then be recognized by proteins that contain at least one UBD, such as the helical binding domains UBA, UIM, and CUE, zinc fingers like NZF and UBZ, as well as other proteins harboring Jab1/MPN, PFU, and WD40 domains (Husnjak and Dikic, 2012). All these domains were identified in several p97 cofactors.

#### SUMOylation

SUMOylation is a ubiquitin-related reversible conjugation pathway in which members of the SUMO family (SUMO1 or the highly related SUMO2/3) are attached to lysine residues of target proteins via an isopeptide bond through the sequential action of E1, E2, and E3 enzymes (reviewed in Cappadocia and Lima, 2017). The participating enzymes can discriminate between the SUMO paralogs at both the conjugation and deconjugation levels (Citro and Chiocca, 2013). Although the two ubiquitin-like modifiers SUMO and ubiquitin are structurally related and feature a β-grasp fold, they have different molecular properties. Besides the presence of three different SUMO isoforms, all of them feature an additional N-terminal extension and a different surface charge. These properties are responsible for different activating, conjugating and deconjugating enzymes and distinct cellular functions (reviewed in Praefcke et al., 2012; van der Veen and Ploegh, 2012). Furthermore, in contrast to ubiquitin chains, which can be linked through all seven lysine residues, Lys11 in the N-terminal extension is the major SUMO acceptor site in SUMO2/3, whereas SUMO1, which is lacking a SUMOylation consensus site, is mainly involved in monoSUMOylation (Matic et al., 2008). In contrast to ubiquitylation, SUMOylation preferentially targets disordered and flexible protein regions (Hendriks et al., 2017). SUMOylation typically controls the dynamics of protein assemblies through binding of SUMO conjugates to SUMO recognition modules termed SUMO interaction motifs (SIMs), whereas ubiquitin interacting proteins typically bind via their UBD to the hydrophobic patch around Ile44 of ubiquitin. A number of SUMOylated protein targets feature the consensus motif "Ψ-K-x-E/D" (Ψ, hydrophobic residue; x, any amino acid). 
Recent proteomic studies demonstrated that a considerable fraction of total SUMOylation events involve non-consensus sites (Blomster et al., 2010), especially under stress conditions when SUMOylation loses stringency and can act like ubiquitylation. Whereas ubiquitylation, acetylation and phosphorylation events occur throughout the cell, SUMOylation takes place predominantly in the nucleus, more specifically in chromatin and nuclear bodies (Hendriks and Vertegaal, 2016). The degree of SUMOylation is dynamically regulated by various forms of stress, thereby linking SUMOylation to the regulation of cellular homeostasis (Liebelt and Vertegaal, 2016), where it plays ubiquitin-dependent and independent roles. A tightly regulated balance in both time and space exists between ubiquitin and SUMO, in which the same lysine within a protein is targeted and this determines the function, localization, or stability of the modified protein.
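The distinction between consensus and non-consensus acceptor lysines can be illustrated by checking each lysine in a sequence against the Ψ-K-x-E/D motif. The following is a minimal sketch under stated assumptions: the residues accepted as Ψ, the function name, and the test sequence are hypothetical and serve only to show the classification step.

```python
import re

# Assumption: residues accepted at the hydrophobic (Psi) position are an
# illustrative choice; the text only specifies "hydrophobic residue".
PSI = "VILMF"

# A lysine preceded by a Psi residue and followed by any residue and D or E.
CONSENSUS_LYSINE = re.compile(rf"(?<=[{PSI}])K(?=.[DE])")


def classify_lysines(sequence: str) -> dict[int, str]:
    """Label every lysine (1-based position) as consensus or non-consensus."""
    consensus = {m.start() + 1 for m in CONSENSUS_LYSINE.finditer(sequence)}
    return {
        position: "consensus" if position in consensus else "non-consensus"
        for position, residue in enumerate(sequence, start=1)
        if residue == "K"
    }


# Hypothetical sequence with two consensus lysines and one non-consensus lysine.
print(classify_lysines("MALKTEGGKPSAVKQD"))
```

A lysine labeled non-consensus here is not excluded as an acceptor site; as noted above, a considerable fraction of SUMOylation events occurs at such positions.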

By direct comparison of the endogenous SUMO1- and SUMO2/3-modified proteome in mammalian cells p97 was found to be preferentially conjugated to SUMO1 (81%; Becker et al., 2013). Accordingly, it could be shown that the p97 N domain is modified with SUMO1 in a dynamic process involving several non-consensus sites, suggesting that different lysine residues could compensate for each other (Wang et al., 2016). Site directed mutagenesis studies indicate that Lys60/62/63, Lys136 as well as Lys164 (**Figure 7E**) are the most important sites involved in SUMOylation under stress conditions. SUMOylation of p97 under conditions of oxidative and ER stress leads to the distribution of p97 to stress granules and into the nucleus, and promotes assembly of the p97 hexamer. In contrast, pathogenic N domain mutations identified in MSP and FALS, which feature an uncoordinated conformational change of the N domain due to a disturbed communication between the N and D1 domains (reviewed in Tang and Xia, 2016), lead to reduced SUMOylation and weakened p97 hexamer formation upon stress (Wang et al., 2016). Cryo-EM studies of a pathogenic mutant revealed highly flexible N domains, which are situated more on top of the D1 ring, reflecting the ATP-bound state observed in the wild type (Niwa et al., 2012). Since in the ATP-bound state Lys60/62/63 and Lys164 are protected, this would explain why in pathogenic states p97 SUMOylation is reduced. However, SUMOylation of wild type p97 at these positions would also affect N domain movement, ATPase activity and the interaction with cofactors. Defects in the SUMOylation of p97 also trigger altered cofactor binding and attenuated ER-associated protein degradation (Wang et al., 2016). A recent comprehensive SUMO2-specific proteomic study of mammalian cells under standard growth conditions and stress conditions identified 17 additional SUMOylation sites distributed over all domains of p97 (Hendriks et al., 2017). 
Most of the SUMO-targeted lysine residues are also found to be ubiquitylated, indicating a crosstalk between SUMOylation and ubiquitylation (see below). However, the SUMO1-targeted N domain residues Lys62, Lys63, and Lys136 as well as the SUMO2-targeted residue Lys190, which is located in the ND1 linker, are exclusively modified by SUMO.

In addition, the p97 cofactor UFD1, which together with NPL4 is frequently found to be involved in chromatin associated processes, harbors seven SUMOylation sites in its predicted disordered C-terminal region, which is located downstream of the p97 SHP binding motif (Hendriks et al., 2017).

There are several indications that p97 operates at the intersection of the ubiquitylation and SUMOylation pathways, two major signaling events which target chromatin. Consequently, it was proposed that p97, through its ability to associate with cofactors displaying affinities for ubiquitin and SUMO, links these two pathways to either trigger protein degradation or elicit other regulatory events (Bergink et al., 2013; Franz et al., 2016; Nie and Boddy, 2016). The SUMO-targeted ubiquitin ligase (STUbL) family of proteins (Sriramachandran and Dohmen, 2014; Nie and Boddy, 2016) integrates ubiquitin and SUMO modifications into a hybrid signal. The resulting mixed SUMO-ubiquitin chains can be recognized by the UFD1-NPL4 complex, which contains both ubiquitin (NZF domain of NPL4, UT3 domain of UFD1) and SUMO (UFD1 C-terminus) interacting motifs.

SUMO conjugation and de-conjugation processes might play a role in p97 functions during DNA repair to regulate the recruitment and release of the participating proteins. A similar function has been shown for the AAA ATPase MDN1, which acts as a SUMO-targeted regulator in mammalian pre-ribosome remodeling (Raman et al., 2016).

#### Palmitoylation (S-Acylation)

Palmitoylation, myristoylation, and prenylation are the most frequently identified covalent lipid modifications (reviewed in Hentschel et al., 2016). Of these three lipid modifications, only palmitoylation is reversible, thus allowing for a more dynamic regulation of protein function with respect to trafficking, localization, stability, aggregation, and interaction with effectors (reviewed in Cho and Park, 2016). Lipidation increases the hydrophobicity of proteins, which promotes the association of the modified proteins with the plasma membrane and other membranes such as those of the ER, mitochondria, Golgi, and endosomes. Palmitoylation is the covalent attachment of the 16-carbon fatty acid palmitate to the side chain of specific cysteine residues of target proteins via a thioester bond. It is catalyzed by palmitoylacyltransferases (PATs), also known as DHHC enzymes, and is reversed by palmitoyl protein thioesterases. Depending on the target protein, palmitoylation functions in a large variety of cellular processes including subcellular trafficking and signal transduction, and aberrant palmitoylation has been associated with Alzheimer's disease, Huntington's disease and other neurodegenerative disorders (Cho and Park, 2016).

Palmitoylation of p97 at Cys105 located in the N domain has been reported (Fang et al., 2016) (**Figure 7D**). Palmitoylation of p97 could be important for participation of p97 in a variety of cellular processes involved in the regulation of membrane fusion and vesicular trafficking; however, the significance of p97 palmitoylation has not been addressed so far.

#### ε-Acetylation

Besides phosphorylation and ubiquitylation, protein acetylation is probably the most frequent and important PTM involved in cell signaling, gene expression, stress responses, apoptosis, membrane trafficking as well as cellular metabolism and plays a major role in the regulation of nuclear proteins, in particular histones (reviewed in Drazic et al., 2016). Acetylation is catalyzed by lysine (K) acetyltransferases (KATs), which transfer the acetyl group from acetyl-coenzyme A (Ac-CoA) to the ε-amino group of lysine residues. This process is reversible and tightly regulated. Malfunctions of the acetylation machinery have been implicated in cardiovascular and neurodegenerative diseases as well as cancer (Drazic et al., 2016).

The importance of p97 in a large variety of chromatin-associated processes like transcription, replication, and DNA repair suggests that p97 is regulated through acetylation. For example, intracellular accumulation of abnormal proteins such as expanded polyglutamines in neuronal cells induces p97 phosphorylation at Ser612 and Thr613 as well as acetylation of Lys614, which allows p97 to translocate into the nucleus (Koike et al., 2010). Following translocation, general transcription is suppressed via deacetylation of core histones, resulting in cell atrophy and inhibition of de novo protein synthesis, which decreases the accumulation of misfolded proteins, thus allowing the cell to remove them by chaperone-mediated refolding, proteasomal degradation, and autophagy (Koike et al., 2010). The location of the three sequential residues Ser612, Thr613, and Lys614 in the D1D2 interface close to the entry points to the central channel suggests that conformational changes occur in this region.

High-throughput proteomics (Hornbeck et al., 2015) identified a total of 24 putative p97 acetylation sites, which, interestingly, all overlap with ubiquitylation and SUMOylation sites (Hornbeck et al., 2015; Hendriks et al., 2017) indicating competition between the enzymes catalyzing the different PTMs (see below). This includes acetylation of the N-terminal extension residues Lys8 and Lys18 (**Figure 7B**), several lysines of the N domain and the aforementioned Lys614 as well as Lys754 in helix α9 preceding the C-terminal tail (**Figure 7C**).

#### Lysine and Arginine N-Methylation

Methylation is a PTM, which influences protein-protein interactions, activity, and turnover of proteins as well as cellular localization (reviewed in Biggar and Li, 2015). The ε-amino group of lysine may be modified with up to three methyl groups by lysine-specific methyltransferases (KMTs) and the side chain of arginine may be mono- or di-methylated by arginine methyltransferases (PRMTs; reviewed in Biggar and Li, 2015). S-Adenosylmethionine (SAM; also known as AdoMet) serves as the methyl donor in both reactions. The addition of methyl groups to lysine and arginine residues may negatively alter hydrogen bond-mediated interactions or, alternatively, facilitate stacking with aromatic residues as the methylated residues become more hydrophobic, thus increasing the structural diversity of proteins and modulating their cellular functions. Similar to protein phosphorylation, protein methylation plays important roles in signaling pathways involved in cell growth and differentiation and has been associated with several diseases including cancer (Biggar and Li, 2015).

It could be demonstrated that METTL21D (VCP lysine methyltransferase, VCP-KMT) can tri-methylate p97 on Lys315 (Kernstock et al., 2012). Interestingly, Lys315 is buried inside the p97 channel close to the constriction in the D1 domain formed by the His-gate and is thus not accessible to a methyltransferase in the hexameric state. However, it could be shown that methylation was stimulated by ASPL (Cloutier et al., 2013), indicating that this site becomes available after disruption of the p97 hexamer. An additional lysine mono-methylation site has been identified on Lys231 (Hornbeck et al., 2015), which is located on top of the D1 ring and has also been found to be ubiquitylated/SUMOylated (Hornbeck et al., 2015; Hendriks et al., 2017). In addition, a recent proteome-wide analysis of arginine mono-methylation sites (Larsen et al., 2016) identified five arginines in p97, which are all functionally important: (i) Arg155, which is the most frequently mutated residue found in MSP (**Figure 7D**); (ii) Arg586 and Arg599, which are both located in the D2 pore-loop 2; (iii) Arg708 located in a regulatory loop region on the outside of the D2 domain (Hänzelmann and Schindelin, 2016b); (iv) Arg753 located in the C-terminal helix (**Figure 7C**).

#### S-Glutathionylation

Reactive oxygen/nitrogen species (ROS/RNS) have been found to act as important physiological modulators of intracellular signaling pathways, but are also causative of aging, cancer, neurodegenerative disorders, and cardiovascular diseases (reviewed in Finkel, 2011; Chung et al., 2013). Covalent modifications of selected cysteine residues present in redox-sensitive proteins mediate, at least in part, the specific effects of ROS/RNS. Oxidative PTMs (Ox-PTM) of cysteine residues represent an important mechanism that regulates protein structure and, ultimately, function. Ox-PTMs including S-nitrosylation (also called S-nitrosation, SNO), sulfhydration (SSH), S-glutathionylation (SSG), disulfide bond formation (RS-SR), and sulfenylation (SOH) are stimulated by diffusible small molecules and constitute reversible modifications. In addition, the irreversible formation of sulfinic (SO2H) and sulfonic acids (SO3H) on cysteine residues is induced.

It could be shown that under conditions of oxidative stress p97 is S-glutathionylated at Cys522 (Noguchi et al., 2005). Cys522 is present in the ATP-binding pocket of the D2 domain and its modification negatively regulates the ATPase activity of p97, thus leading to ER stress. Addition of glutathione to Cys522 would induce steric hindrance interfering with ATP binding on the D2 domain. Cys522 modification leads to an accumulation of ubiquitylated proteins and ER stress, followed by apoptosis, which are phenotypes found in several neurodegenerative disorders (Noguchi et al., 2005).

#### Crosstalk between Various Post-Translational Modifications

Proteins are often regulated via a combination of different PTMs, possibly acting as a molecular barcode or PTM code (Beltrao et al., 2013; Venne et al., 2014). These modifications may trigger specific effectors to either initiate or inhibit downstream events, which either induce or retain a signal only when the complementary incoming signal occurs simultaneously both in time and space. The interplay between different PTMs, referred to as crosstalk (reviewed in Beltrao et al., 2013; Venne et al., 2014), can be either positive or negative (Hunter, 2007). Phosphorylation-dependent ubiquitylation (Koepp et al., 2001) and SUMOylation (Hietakangas et al., 2006) represent examples of positive crosstalk where the initial PTM serves as active trigger for the subsequent addition or removal of a second PTM, or as a recognition site for other proteins. Short crosstalk motifs like phosphodegrons involved in ubiquitin-mediated protein degradation and, in general, motifs in which a phosphorylation site is simultaneously present with another PTM, a second phosphorylation site, or SUMOylation/acetylation sites in the context of a five amino acid stretch are known (Ye et al., 2004; Yao et al., 2011). Negative crosstalk may result from the direct competition of two PTMs for the same amino acid or from indirect effects due to one specific PTM masking the recognition site of a second PTM (Hunter, 2007). For example, direct competition exists between SUMOylation, ubiquitylation, phosphorylation, and acetylation with ubiquitylation/SUMOylation and SUMOylation/acetylation being mutually exclusive, while SUMOylation/phosphorylation can be agonistic or antagonistic depending on the substrate in question (Escobar-Ramirez et al., 2015). 
Furthermore, the combination of different PTMs on a protein generates a highly regulated interface which may be recognized by specific effector proteins resulting in the controlled initiation of downstream signaling events and facilitating the interactions with diverse binding partners (Sims and Reinberg, 2008), thus explaining, for example, how p97 can participate in such a large variety of different cellular functions.

In the case of p97, one would envision a negative crosstalk between acetylation, lysine methylation, ubiquitylation and SUMOylation since all these PTMs compete with each other for the same lysine residues. Also, the high number of PTMs identified in p97 indicates that a combinatorial code is at play that regulates its activity, function, substrate specificity, and localization (Cloutier and Coulombe, 2013). The identification of functionally relevant sites and their dynamic regulation requires quantitative mass-spectrometry approaches that can measure changes in the abundance of PTMs under different conditions (Beltrao et al., 2013). For example, a recent global profiling study of ubiquitylation, phosphorylation and acetylation in the DNA damage response identified several ubiquitylation and acetylation sites for nuclear p97, although no phosphorylation sites were found under the same experimental conditions (Elia et al., 2015).
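The idea of direct competition between PTMs for the same lysine can be made concrete by grouping reported modifications by residue and flagging residues that carry more than one modification type. The sketch below uses a small illustrative input built from a few p97 sites named in this section; the record layout and function name are assumptions, and listing both ubiquitylation and SUMOylation for Lys231 is one reading of the "ubiquitylated/SUMOylated" statement above rather than a curated annotation.

```python
from collections import defaultdict

# Illustrative records only: (residue, modification type).
# A real analysis would load a curated PTM table instead.
REPORTED_PTMS = [
    ("Lys62", "SUMOylation"),
    ("Lys190", "SUMOylation"),
    ("Lys231", "methylation"),
    ("Lys231", "ubiquitylation"),
    ("Lys231", "SUMOylation"),
    ("Lys315", "methylation"),
]


def competing_sites(records):
    """Map each residue with more than one reported PTM type to those types."""
    by_site = defaultdict(set)
    for site, ptm in records:
        by_site[site].add(ptm)
    # Only residues with at least two distinct PTM types are candidates
    # for direct (negative) crosstalk.
    return {site: sorted(ptms) for site, ptms in by_site.items() if len(ptms) > 1}


if __name__ == "__main__":
    for site, ptms in competing_sites(REPORTED_PTMS).items():
        print(site, "->", ", ".join(ptms))
```

Residues flagged in this way are only candidates for negative crosstalk; whether the modifications actually exclude each other in cells has to be established with quantitative approaches such as those mentioned above.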

## MODELS FOR SUBSTRATE UNFOLDING AND DISASSEMBLY ACTIVITY

Conformational changes triggered by ATP binding and hydrolysis generate mechanical forces which are responsible for the activity of p97 in the unfolding and disassembly of macromolecular complexes. In the case of p97 the underlying mechanism(s) is (are) still unknown and different models have been proposed (**Figure 8**): (i) The threading model in which substrates are threaded through the central pore of p97; (ii) The D2 in-out model where substrates insert and leave the D2 pore from the D1-distal direction; (iii) The side access model according to which substrates enter the protein chamber through the opening between the D1 and D2 interface; (iv) The translocation-independent or disassembly model which implicates movements of the N domain rather than a direct participation of the D2 pore in the mechanism. These hypotheses will be discussed in more detail in the following paragraphs.


while being processed in the D2 pore (DeLaBarre et al., 2006). The D2 pore contains the typical substrate binding loops found in related enzymes, which, depending on the nucleotide status, are either in a fixed or dynamically released conformation (Davies et al., 2008; Hänzelmann and Schindelin, 2016b). The smaller pore loops contain a conserved Φ-X-Gly (aromatic-hydrophobic-Gly) tripeptide motif which has been suggested to play a conserved role during substrate translocation by AAA+ unfoldases (reviewed in Sauer and Baker, 2011; Olivares et al., 2016). However, since all major substrate-recruiting cofactors bind to the N domain residing at the opposite end of p97 relative to the D2 domain, it is hard to imagine how a substrate can enter the D2 pore from the bottom.


hydrolysis in the D1 domain, which causes the N domain to return to the down position, recruits the substrate to the D1D2 interface as in the side access model. Nucleotide-dependent conformational changes in p97 would exert a force on the bound substrate, which would remove it from a macromolecular assembly.

Most likely, the mechanism by which p97 unfolds/disassembles target proteins depends on the fate of the substrate (Barthelme and Sauer, 2016), specifically whether it is recycled or partially/fully unfolded for degradation. Furthermore, in the presence of a ubiquitylated substrate and/or PTMs, larger conformational changes in the D1 pore and the D1D2 interface region may occur. Nevertheless, an unfoldase activity of p97 as well as an involvement of the p97 central pore in substrate translocation is still hypothetical at this point since the critical residues located in the D2 pore loops play important roles in regulating the ATPase activity of p97 (DeLaBarre et al., 2006; Hänzelmann and Schindelin, 2016b). Hence one cannot infer from mutagenesis data whether these side chains contribute to ATP hydrolysis or translocation. As mentioned above, an ISS couples the conformation of the pore to the nucleotide state of the same subunit, which is then transmitted to the adjacent subunit and, in this way, coordinates ATP hydrolysis in trans (Hänzelmann and Schindelin, 2016b). A hybrid model takes into account the importance of the p97 pore loops in substrate remodeling, yet it does not require substrates to completely translocate through the axial channel (Barthelme and Sauer, 2016). With the aid of its pore loops p97 could exert a force on a peptide segment of the substrate, leading to its deformation and dissociation from the complex without unfolding the substrate or translocating it through its axial channel (**Figure 8B**).

# CONCLUDING REMARKS

A common function of p97 is its ATP-dependent extraction or disassembly of ubiquitylated substrates from chromatin, membranes, and protein complexes in many diverse cellular functions that maintain cellular homeostasis, contribute to genomic stability and govern important signaling pathways (**Figure 1B**). The key questions regarding the biological functions of p97 are how it participates in so many dissimilar cellular processes in different cellular compartments and, in particular, how p97 is targeted to specific cellular pathways, recognizes its substrates, and decides whether they are destined for proteasomal degradation or recycled. Therefore, independent regulatory mechanisms are necessary to control the physiological functions of p97. It is well-established that a large variety of substrate-recruiting and substrate-processing cofactors provide specificity toward the cellular processes p97 is involved in. Although cofactor assembly is regulated by binding site competition, bipartite binding, conformational changes upon ATP binding/hydrolysis and the formation of specialized subcomplexes composed of several cofactors, the situation in cells is expected to be more complex, and PTMs of p97 and its associated cofactors as well as crosstalk between PTMs introduce an additional level of complexity. A total of about 170 PTMs such as phosphorylation, ubiquitylation, acetylation, SUMOylation, palmitoylation, and methylation have been identified in p97 to date, indicating that a combination of different PTMs affects the activity, localization, and substrate specificity of p97 in different cellular pathways. In the high-throughput proteomics era the list of p97 cofactors, associated substrates and PTMs is expected to grow and additional cellular functions may emerge.
Major challenges for the future will be (i) to establish the correlations between biological functions and the many PTMs reported to exist, (ii) to identify proteins/domains that can specifically recognize them and (iii) to investigate the interplay of p97-cofactor interactions with PTMs. Understanding these aspects will be crucial for elucidating the physiological and patho-physiological functions of p97.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

PH primarily wrote this review with input from HS. Both authors approved it for publication.

#### ACKNOWLEDGMENTS

This work was supported by the Deutsche Forschungsgemeinschaft (Grant HA 3405/3-1) and by the Rudolf Virchow Center for Experimental Biomedicine (Grant FZ 82) to PH and HS.


modifier polymerization sites by high accuracy mass spectrometry and an in vitro to in vivo strategy. Mol. Cell. Proteomics 7, 132–144. doi: 10.1074/mcp.M700173-MCP200


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Hänzelmann and Schindelin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# AAA-ATPases in Protein Degradation

Ravikiran S. Yedidi <sup>1</sup> , Petra Wendler <sup>2</sup> and Cordula Enenkel <sup>1</sup> \*

*<sup>1</sup> Department of Biochemistry, University of Toronto, Toronto, ON, Canada, <sup>2</sup> Department of Biochemistry, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany*

Proteolytic machineries containing multisubunit protease complexes and AAA-ATPases play a key role in protein quality control and the regulation of protein homeostasis. In these protein degradation machineries, the proteolytically active sites are formed by either threonines or serines which are buried inside interior cavities of cylinder-shaped complexes. In eukaryotic cells, the proteasome is the most prominent protease complex harboring AAA-ATPases. To degrade protein substrates, the gates of the axial entry ports of the protease need to be open. Gate opening is accomplished by AAA-ATPases, which form a hexameric ring flanking the entry ports of the protease. Protein substrates with unstructured domains can loop into the entry ports without the assistance of AAA-ATPases. However, folded proteins require the action of AAA-ATPases to unveil an unstructured terminus or domain. Cycles of ATP binding/hydrolysis fuel the unfolding of protein substrates which are gripped by loops lining the central pore of the AAA-ATPase ring. The AAA-ATPases pull on the unfolded polypeptide chain for translocation into the proteolytic cavity of the protease. Conformational changes within the AAA-ATPase ring and the adjacent protease chamber create a peristaltic movement for substrate degradation. The review focuses on new technologies toward the understanding of the function and structure of AAA-ATPases to achieve substrate recognition, unfolding and translocation into proteasomes in yeast and mammalian cells and into proteasome-equivalent proteases in bacteria and archaea.

Keywords: AAA, ATPase, proteasome, protein folding, proteolysis

# OVERVIEW

Proteins are synthesized during translation through ribosomes and eliminated by degradation through proteases. Since protein synthesis and degradation are expensive ATP-consuming processes, highly selective mechanisms ascertain that only proteins allotted to degradation are eliminated. If the regulation of protein homeostasis fails, futile cycles of protein synthesis and turnover will ruin the economic budget of our cells. Functional proteins would be depleted and non-functional proteins would accumulate in cytotoxic aggregates (Kopito, 2000; Ciechanover and Brundin, 2003; Goldberg, 2003; Schmidt and Finley, 2014).

Thus, functional proteins must be sorted from non-functional proteins to meet the actual cellular situation with rapid adjustments to metabolic changes or environmental stress. How protein textures shift in response to cellular changes is an interesting question in the field of regulated protein homeostasis but out of the scope of this review. Here, we will focus on ATPases associated with diverse cellular Activities (AAA) that collaborate with proteasomes, the most complex proteases with unique opportunities for regulation of cellular proteolysis. AAA-ATPases typically convert the energy of ATP hydrolysis into mechanical force through conformational changes in their subunits, cope with the unfolding of protein substrates and synergistically act with proteasomes and proteasome-like proteases for degradation (Schmidt et al., 1999; Sauer and Baker, 2011; Matyskiela and Martin, 2013). However, they can also aid protein refolding allowing partial proteolysis or the escape of specific proteins from degradation.

#### Edited by:

*James Shorter, University of Pennsylvania, United States*

#### Reviewed by:

*Peter Tsvetkov, Whitehead Institute of Biomedical Research, United States Nico P. Dantuma, Karolinska Institutet, Sweden Christian Dirk Schlieker, Yale University, United States*

> \*Correspondence: *Cordula Enenkel cordula.enenkel@utoronto.ca*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

> Received: *26 February 2017* Accepted: *06 June 2017* Published: *20 June 2017*

#### Citation:

*Yedidi RS, Wendler P and Enenkel C (2017) AAA-ATPases in Protein Degradation. Front. Mol. Biosci. 4:42. doi: 10.3389/fmolb.2017.00042*

A myriad of proteins is subject to AAA-ATPase coupled protein degradation by proteasomes. Proteasomal substrates are short-lived, have crucial functions within a short time frame and are eliminated within a few minutes by proteasomal proteolysis. Proteasomal substrates are usually post-translationally modified by poly-ubiquitin chains, a series of ubiquitin molecules linked by isopeptide bonds to each other and to the substrate. The first proteins conjugated to ubiquitin, which was initially named heat-stable ATP-dependent proteolysis factor, were detected by Ciechanover and Hershko at a time when scientists were perplexed by the paradox that proteins are turned over in an ATP-consuming manner after being synthesized by ATP consumption (Ciechanover et al., 1980; Hershko et al., 1980). Varshavsky and co-workers revealed that ubiquitin N-terminally fused to β-galactosidase drastically reduced its half-life depending on the N-terminal amino acid of β-galactosidase, a relationship known as the N-end rule (Bachmair et al., 1986). The first poly-ubiquitylated substrates identified in cells were cyclins, cyclin-dependent kinase activators, and inhibitors regulating cell cycle progression (Kirschner, 1999). Nascent polypeptides arising during protein translation are also sources of proteasomal substrates, though their abundance might be less than originally assumed (Vabulas and Hartl, 2005). Poorly folded or misfolded nascent polypeptides may expose hydrophobic domains on the surface. If not instantaneously eliminated by proteasomal degradation, they are prone to nucleate toxic protein aggregates and to contribute to the early onset of neurodegenerative diseases (Turner and Varshavsky, 2000; Navon and Goldberg, 2001; Medicherla and Goldberg, 2008).

# THE 26S PROTEASOME—THE AAA-ATPASE ASSOCIATED PROTEASE OF EUKARYOTES

The 26S proteasomes exist in eukaryotic cells throughout the kingdom. They are composed of ∼40 different protein subunits. Fourteen of these subunits are assembled in the proteolytic core particle (CP) which is composed of a stack of four seven-membered rings. Both outer rings contain seven alpha-subunits, both inner rings seven beta-subunits. The proteasome belongs to the class of self-compartmentalized threonine-proteases (Baumeister et al., 1998). The catalytic threonines conferred by three different beta-subunits are sequestered within the two inner beta-rings. Their substrate binding pockets have preferences for hydrophobic, basic, and acidic amino acids and are related to chymotrypsin-, trypsin-, and caspase-like peptide cleavage activities, respectively.
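The subunit arithmetic of the core particle described above can be checked with a small model. The sketch below builds the alpha-beta-beta-alpha stack and counts polypeptides, distinct subunits, and active sites. The text states only that three different beta-subunits carry the catalytic threonines; the assignment of the three activities to β1, β2, and β5 follows standard proteasome nomenclature and is an assumption added here, as are all names in the code.

```python
# Ring order from one end of the core particle (CP) to the other.
RING_ORDER = ("alpha", "beta", "beta", "alpha")
SUBUNITS_PER_RING = 7

# Assumption (standard nomenclature, not stated in the text):
# the three catalytic beta-subunits and their cleavage preferences.
CATALYTIC_SUBUNITS = {
    "beta1": "caspase-like (acidic residues)",
    "beta2": "trypsin-like (basic residues)",
    "beta5": "chymotrypsin-like (hydrophobic residues)",
}


def build_core_particle():
    """Return the CP as a list of four rings, each a list of subunit names."""
    return [
        [f"{kind}{index}" for index in range(1, SUBUNITS_PER_RING + 1)]
        for kind in RING_ORDER
    ]


core_particle = build_core_particle()
all_subunits = [name for ring in core_particle for name in ring]

total_polypeptides = len(all_subunits)     # 4 rings x 7 subunits
distinct_subunits = len(set(all_subunits)) # 7 alpha + 7 beta types
active_sites = sum(name in CATALYTIC_SUBUNITS for name in all_subunits)

if __name__ == "__main__":
    print(total_polypeptides, distinct_subunits, active_sites)
```

Under these assumptions the model gives 28 polypeptide chains built from 14 distinct subunits, with six proteolytic sites (three per beta-ring) enclosed in the central chamber.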

The outer alpha-rings form ante-chambers of the catalytic chambers enclosed between the two inner beta-rings (Tanaka, 2009). The central pores of the outer alpha rings are normally closed. N-terminal extensions of the alpha subunits occlude the central pores and restrict the diffusion of small chromogenic peptides used to assay proteolytic activities. Thus, free CP exhibits latent peptide cleavage activity (Groll et al., 1997; Orlowski and Wilk, 2000), at least under physiological potassium ion concentrations (Kisselev et al., 2002). Depending on the ions in the solution, dynamic fluctuations between open and closed states of the CP also exist, as suggested by atomic force microscopy and NMR studies (Osmulski et al., 2009; Ruschak and Kay, 2012).

The conformational fluctuations of the central pores of the CP depend on sodium and potassium concentrations (Kohler et al., 2001; Osmulski et al., 2009). Detergents such as 0.02% SDS trigger the opening of the alpha-ring gates and allow free diffusion of chromogenic peptides into the CP. Not only detergents but also fatty acids, cardiolipin and polylysine open the alpha-ring gates and significantly stimulate peptide cleavage activity (Ichihara and Tanaka, 1989).

Thus, folded cellular proteins have restricted access to the proteolytic chamber to minimize nonspecific degradation. In vitro, natively disordered substrates can access the internal catalytic sites by threading their loose termini through the gates of the CP. Loops lacking strong secondary structures can also traverse the channel into the proteolytic cavity of the CP, suggesting that domains of intrinsically disordered proteins (IDPs) trigger gate opening of the CP (Liu et al., 2003; Ben-Nissan and Sharon, 2014). To what extent IDPs are committed to proteasomal degradation remains to be examined, since IDPs might be shielded by chaperones belonging to the AAA-ATPase family and "nanny" proteins which ensure their maturation into important regulatory and signaling proteins (Tsvetkov et al., 2009). Without protection, IDPs might represent favored proteasomal substrates as long as they are not aggregated. Along these lines, disordered regions within regulatory and signaling proteins affect their half-life (Tsvetkov et al., 2012; van der Lee et al., 2014).

The gate opening of the CP is regulated by proteasome activators (PA), which relieve the autoinhibition of the CP by the N-terminal extensions of the alpha subunits. PA700, the regulatory complex (RP) of the eukaryotic proteasome, is the best-characterized PA and contains ∼25 subunits. The RP binds to either one or both ends of the CP. The 240 kDa protein PA200/Blm10 is an alternative PA that is highly conserved from yeast to human. It stimulates the cleavage of small chromogenic peptides but does not contain AAA-ATPase activities required for polypeptide unfolding (Rechsteiner and Hill, 2005).

In contrast to these single protein PAs the RP is composed of ∼25 different subunits which are assigned to two subcomplexes, the RP lid and base. Specifically, the RP base contains a hexameric ring of six subunits named Rpts in yeast or PSMCs in mammals that are members of the AAA-ATPase family (Glickman et al., 1998). The ATPase ring is adjacent to the CP alpha ring upon RP-CP binding (Baumeister et al., 1998). Newly advanced technologies using single particle cryo-EM provided detailed insight into the mechanism of how the ATPase ring is properly positioned for alpha ring opening to channel the translocation of unfolded substrates (Matyskiela and Martin, 2013; Unverdorben et al., 2014; Chen et al., 2016; Rodriguez-Aliaga et al., 2016).

### SUBSTRATE RECOGNITION BY POLY-UBIQUITYLATION

In eukaryotic cells, the poly-ubiquitin chain is the post-translational modification by which a protein is recognized as a potential substrate by the RP; the chain is recycled prior to degradation of the substrate. Degrons encoded in the amino acid sequence of the substrate facilitate substrate processing. In the canonical sense, a chain of at least four isopeptide-conjugated ubiquitin molecules, in combination with unstructured termini or loops within the substrate, is required to be recognized as a degradation signal by the RP. Although all AAA-ATPases act on the protein substrate concurrently with the removal of the poly-ubiquitin chain, Rpt5, one of the Rpt ATPase subunits, was found to bind ubiquitin (Lam et al., 2002).

To accommodate poly-ubiquitylated substrates, the proteasome shows a high degree of plasticity and versatility (Glickman and Raveh, 2005). Beyond shuttling ubiquitin receptors, which transiently bind to ubiquitin-like domains on RP subunits, three RP subunits, namely Rpn1, Rpn10, and Rpn13 in yeast or PSMD2, PSMD4, and ADRM1 in mammals, serve as intrinsic docking sites for ubiquitin molecules (Finley, 2009; Rosenzweig et al., 2012; Shi et al., 2016). One major delivery site for poly-ubiquitin chains involves Rpn10 and Rpn13, the latter bound to Rpn2. The poly-ubiquitin chain is held between Rpn13 and Rpn10. The ubiquitin hydrolase Rpn11, a subunit of the RP lid closely positioned to Rpn2, is responsible for the isopeptide hydrolysis of the poly-ubiquitin chain. The polypeptide stripped of ubiquitin is taken up in an unfolded state by the adjacent AAA-ATPase ring. During the dynamic process of (i) substrate acceptance, (ii) commitment, and (iii) translocation, three hypothetical conformational states of the yeast proteasome were distinguished by single-particle cryo-EM analysis (**Figure 1**) (Lander et al., 2013; Unverdorben et al., 2014). The translocation state might be dissected into more intermediates, since human proteasomes exist in at least four states during substrate processing (Wehmer and Sakata, 2016).

The second delivery site for a poly-ubiquitin chain involves Rpn1 and Ubp6, named PSMD2 and USP14 in mammals. While the ubiquitin moieties are bound to Rpn1, the adjacent Ubp6 hydrolase trims supernumerary poly-ubiquitin chains. Again, the polypeptide cleaved off from the poly-ubiquitin chain is proposed to be passed on to the AAA-ATPase ring for unfolding and translocation into the CP (Shi et al., 2016), though Ubp6 is distant from the entrance pore of the AAA-ATPase ring. Through the trimming of lengthy poly-ubiquitin chains, the substrate can even escape final degradation, consistent with the finding that inhibition of Ubp6 stimulates protein degradation (Crosas et al., 2006).

A couple of additional ubiquitin receptors are known to dock onto Rpn1 and transiently deliver poly-ubiquitylated proteins to the RP (Rosenzweig et al., 2012). The remote binding of poly-ubiquitin chains most likely transmits allosteric conformational changes toward the coaxial CP alpha ring and the AAA-ATPase central pore to prepare the holo-enzyme for its commitment to protein degradation (Bech-Otschir et al., 2009; Peth et al., 2010).

Notably, poly-ubiquitin modifications are not compulsory for substrate degradation by proteasome holo-enzymes. One of the most prominent substrates that is degraded in an ATP-dependent manner without poly-ubiquitylation is ornithine decarboxylase (ODC), as elaborated by Coffino and co-workers (Erales and Coffino, 2014).

# ANCESTORS OF AAA-ATPASES IN PROTEIN DEGRADATION

Hexameric AAA-ATPase rings involved in ATP-dependent protein degradation exist in 26S proteasomes of eukaryotic cells and in prokaryotic ancestors such as the HslU AAA-ATPase, which is associated with the HslV protease composed of two homohexameric rings (**Figure 2**). In archaea, proteasome-like proteases composed of four heptameric rings are associated with VAT (Valosin-containing protein-like), the homolog of the ubiquitous AAA-ATPase Cdc48/p97, and PAN (Rockel et al., 2002; Benaroudj et al., 2003). In actinobacteria, Mpa associates with the mycobacterial 20S proteasome, another evolutionary ancestor of eukaryotic proteasomes (Striebel et al., 2010). Hexameric AAA-ATPase rings also associate with prokaryotic AAA proteases such as ClpP, a serine protease composed of two heptameric rings. In these bacterial systems, the AAA-ATPases are known as Clp ATPases (X, single ATPase ring; A, double ATPase ring) (Grimaud et al., 1998; Baker and Sauer, 2012).

Interfaces between the hexameric AAA-ATPase ring and the heptameric proteasome suggest a symmetry mismatch, which precludes a close complementary fit and allows room for the dynamic changes underlying the mechanisms of protein translocation through the coaxial pores of the AAA-ATPase and the adjacent protease.
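The geometric consequence of the six-versus-seven mismatch can be illustrated with a small calculation. The sketch below is not taken from any of the cited structures: it assumes two idealized, coaxial, perfectly planar rings with equally spaced sites, with one ATPase site aligned to one pocket at 0°. The function names `ring_angles` and `nearest_offsets` are hypothetical helpers introduced only for this illustration.

```python
def ring_angles(n):
    """Angular positions (degrees) of n equally spaced sites on an idealized ring."""
    return [360.0 * i / n for i in range(n)]


def nearest_offsets(n_tails, n_pockets):
    """For each site on the first ring, the angular distance to the closest
    site on the second ring (both rings idealized and coaxial)."""
    pockets = ring_angles(n_pockets)
    offsets = []
    for tail in ring_angles(n_tails):
        # Distance on a circle: take the shorter way around.
        nearest = min(min(abs(tail - p), 360.0 - abs(tail - p)) for p in pockets)
        offsets.append(round(nearest, 2))
    return offsets


# Hexameric ATPase ring (6 sites) docked on a heptameric alpha ring (7 pockets).
offsets = nearest_offsets(6, 7)
print(offsets)
```

Under these assumptions only one of the six sites can sit exactly over a pocket; the others are offset by multiples of roughly 8.6°, up to about 25.7°. Real rings are neither planar nor rigid, so the numbers only illustrate why a hexamer cannot make six equivalent contacts with a heptamer.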

The architecture of prokaryotic AAA proteases is simple. The AAA-ATPase is a homohexamer. No RP equivalent is associated with the protease core. Bacterial proteases require no ubiquitin receptors, as ubiquitin signaling does not exist in prokaryotes (Jastrab and Darwin, 2015). Only in Mycobacterium tuberculosis, one of the world's deadliest pathogens, does Pup, the prokaryotic ubiquitin-like protein, target proteins for degradation by monopupylation. To recognize the Pup degradation tag, the N-terminal coiled-coil regions of the AAA-ATPase Mpa homohexamer serve as a template for the C-terminal half of Pup1 to fold into a helix (Wang et al., 2010). Besides the rare modification of pupylation, a variety of degrons exist which are encoded in the primary sequence and turn a protein into a substrate. Due to their propensity for intrinsic disorder, these degrons are prototype patterns in protein degradation and are recognized not only by prokaryotic but also by eukaryotic AAA proteases (Ravid and Hochstrasser, 2008; Varshavsky, 2011).

channel through the CP and ATPase ring is indicated with yellow and white dashed lines, respectively. In the RP, the AAA-ATPase ring along with the N-terminal coiled-coils is colored in cyan. The non-ATPase RP subunits are colored in white except for Rpn1 (brown), Rpn2 (green), Rpn10 (orange), Rpn11 (yellow), and Rpn13 (magenta). In S1, a poly-ubiquitylated substrate (in red labeled with "S" attached to the tetra-ubiquitin chain in blue labeled with "Ub") is recognized by the ubiquitin receptor Rpn13. Subsequently, the poly-ubiquitin chain is anchored to the ubiquitin receptor Rpn10 leading to substrate placement near the N-ring. In S2, the isopeptide bond between the substrate and poly-ubiquitin chain is cleaved by Rpn11 and the unfolding of the substrate is initiated. In S3, the unfolded substrate is translocated through the central pore of the AAA-ATPase ring into the central channel of the CP for degradation. The central pores of the AAA-ATPase O-ring and the CP are not aligned in S1 and S2 but are in S3. A 25° rotation of S1 to S2 facilitates the substrate placement into the N-ring and activates Rpn11. The figure was prepared using the PDB IDs: 4CR2, 4CR3, 4CR4, 1UBQ, 2ZNV, 2Z59, and 1UZX through PyMOL (Ver. 1.8.0.2) molecular graphics software (Schrödinger, LLC, New York).

Most bacterial proteasomes are dodecamers of beta subunit ancestors. All other ancestor proteasomes, with the exception of the bacterial species from Rhodococcus, are composed of identical alpha and beta subunits. Their overall organization is similar. The alpha subunits are arranged in seven-membered outer rings, and the beta subunits in seven-membered inner rings, yielding a barrel-shaped particle with alpha7-beta7-beta7-alpha7 configuration as evidenced by the archaeal species Thermoplasma acidophilum (Jastrab and Darwin, 2015). The opening by 23 Å of the alpha ring in bacterial proteasomes is wider than the opening by 13 Å in archaea, leading to a funnel through the center of the entire complex (Lowe et al., 1995).

In contrast to eukaryotic proteasomes, where each of the seven distinct alpha subunits occupies a specific position to guarantee the closed-gate state, the N-terminal extensions of the identical alpha subunits of the Thermoplasma acidophilum proteasome are disordered and unable to lock the central pore (Lowe et al., 1995). Interestingly, the Mycobacterium proteasome has a closed gate, because the alpha-type subunits assume three different conformations. Three subunits form a rectangular shape ("L"), three form an extended linear shape ("E"), and one projects away to avoid a steric clash ("V") (Li et al., 2010).

Thus, in mycobacterial and eukaryotic proteasomes the binding of AAA-ATPase rings facilitates the repositioning of the N-terminal extensions of the alpha subunits to open the central gates. The alpha ring gate of mycobacterial proteasomes can also be widened by Bpa, a recently identified non-ATPase ring, suggesting that the AAA-ATPase activity is not required for alpha ring gating (Bolten et al., 2016).

Unlike the AAA-ATPase heterohexamer of the eukaryotic proteasome, the bacterial AAA-ATPase ring is a homohexamer (Striebel et al., 2009). Structurally, the prokaryotic AAA-ATPases resemble their eukaryotic counterparts. They all contain an alpha-helical domain close to the variable N-terminus, followed by the oligonucleotide- and oligosaccharide-binding (OB) domain and an AAA-ATPase domain consisting of a RecA-like subdomain and the α-helical C-terminal subdomain (Wendler et al., 2012). ATP binds to the Walker A motif between the two subdomains. Conserved loop residues line the central pore of the AAA-ATPase ring and grip the unfolded protein substrate for translocation (**Figure 3**).

In PAN, which is an archaeal proteasomal AAA-ATPase ring, the six OB subdomains form the N-ring, while the N-terminal sequences adopt alpha-helical conformations and pair into three coiled coils. A conserved proline residue

FIGURE 2 | Domain organization of AAA-ATPases. (A) Magnified view of the monomer (left) and overall view of the oligomer (right) of Mpa containing two OB rings, OB1 and OB2, along with the N-terminal coiled-coils (blue). Magnified views of monomers bound to nucleotides highlighted by spheres of (B) ClpA with small and large domains SD1, SD2, LD1, and LD2 bound to ADP at the SD1/LD1 and SD2/LD2 interfaces; of (C) Valosin-containing protein-like ATPase (VAT) with nucleotide binding domains NBD1 and NBD2 bound to ATP; of (D) HslU with N-terminal (N), large (LD), and small (SD) domains and ATP; of (E) ClpX with N-terminal (N), large (LD), and small (SD) domains with ADP; of (F) p97/VCP/Cdc48 with N-terminal (N) and domain-1 (D1) and -2 (D2) bound to ATPγS; of (G) proteasome-activating nucleotidase (PAN) with N-domains 1 (from Gcn4) and 2 and large (LD) and small (SD) domains. Again, ATP is bound at the SD/LD interface. This figure was prepared based on the availability of structures in the protein data bank using the PDB IDs: 3M9D, 1KSF, 5VC7, 1DO0, 3HWS, 5C18, 2WG5, and 2WFW through PyMOL (Ver. 1.8.0.2) molecular graphics software (Schrodinger, LLC, New York).

FIGURE 3 | Active site organization of AAA-ATPase rings. (A) Bottom view of the proteasomal AAA-ATPase rings from yeast (upper panel) and human (lower panel). The Walker domain A is highlighted by red spheres and B by blue spheres. Magnified views of the Walker domains are shown for human AAA-ATPase bound to either ATP (B) or ADP (C) in two orientations. (D) Dynamics of Valosin-containing protein-like ATPase of *Thermoplasma acidophilum* (VAT) are visualized by conformational switches between the stacked and spiral (split-) ring versions. Side and top views of the AAA-ATPase subunit colored in red show movements out of the plane upon ATP hydrolysis aiding substrate translocation into the proteasome through its central pore. The split ring form (bottom left) undergoes a conformational change back into the stacked ring (top left), when ADP dissociates from the subunit and ATP binds back to allow the next round of hydrolysis. This figure was prepared using the PDB IDs: 4CR2, 5L4G, 5G4G, and 5G4F through PyMOL (Ver. 1.8.0.2) molecular graphics software (Schrodinger, LLC, New York).

at the base of the N-terminal helix adopts a cis-conformation, introducing a kink in the helix that allows coiled-coil formation with its neighbor subunit. To unfold and inject a protein, the internal pore loops in the RecA-like subdomain move the target protein toward its C-terminal end (Yu et al., 2010). PAN only transiently associates with 20S proteasomes from archaea (Barthelme and Sauer, 2012), unless a genetically engineered cystine bridge stabilizes the docking of the C-terminal HbYX motif in the alpha ring binding pocket of the 20S proteasome (Barthelme et al., 2014).

In general, Clp AAA-ATPases follow an ATP hydrolysis pattern different from that of eukaryotic AAA-ATPases. The ATP hydrolysis pattern is best studied in the homohexameric ClpX AAA-ATPase, while the velocity and processivity of most proteasomal AAA-ATPases remain elusive (Lupas and Martin, 2002). The bacterial AAA-ATPase ClpX hydrolyzes ATP in a semi-stochastic way with a hydrolysis rate of ∼100–500 ATP molecules per minute in the absence of substrate. In association with a protease, substrates are degraded with high velocity but low processivity, with the substrate slipping back and forth once the AAA-ATPase encounters a folded domain (Aubin-Tam et al., 2011; Maillard et al., 2011; Nager et al., 2011; Baytshtok et al., 2015; Iosefson et al., 2015; Rodriguez-Aliaga et al., 2016).

#### THE AAA-ATPASES OF THE EUKARYOTIC PROTEASOME

In contrast to the prokaryotic systems, the AAA-ATPase of the eukaryotic proteasome is a heterohexamer, suggesting specialization among the six different ATPase subunits (Rubin et al., 1998). The six ATPase subunits arrange in a particular order: Rpt1-Rpt2-Rpt6-Rpt3-Rpt4-Rpt5 (Tomko and Hochstrasser, 2011) (**Figure 3**). The N-terminal domains form coiled-coils, as Rpt2, Rpt3, and Rpt5 contain a conserved proline residue at the base of the helix; the resulting coiled-coils differ significantly in length and break the symmetry.

The binding of the proteasomal AAA-ATPase ring to the alpha ring of the proteasome requires the highly conserved penultimate tyrosine residue within the C-terminal HbYX (hydrophobic-tyrosine-any amino acid) motif (Smith et al., 2007; Rabl et al., 2008). Upon ATP binding the subunits with HbYX motifs bind to inter-pockets between two alpha subunits of the CP like a "key in a lock" (**Figure 4**). With bacterial AAA-ATPases consisting of homohexamers six identical HbYX motifs

can interact with seven alpha subunit pockets, stabilizing the interactions within the ATPase-protease complex (Jastrab and Darwin, 2015).

Though four out of six proteasomal AAA-ATPases, namely Rpt1, 2, 3, and 5, have HbYX motifs, gate opening could be induced by C-terminal peptides of Rpt2 and Rpt5 suggesting that the hexameric ATPase ring is mainly anchored by two contact sites to the heptameric ring of the alpha subunits (Smith et al., 2007). In the proteasome purified from yeast, the C-terminal HbYX motifs of Rpt2, Rpt3, and Rpt5 turned out to bind to the inter-pockets between alpha 3–4, 1–2, and 5–6, respectively. A rotation in the alpha subunits and displacement of a reverse-turn loop occluding the central pore are induced, so that the open gate conformation is stabilized within the holo-enzyme and substrate entry is facilitated (Rabl et al., 2008; Park et al., 2013) (**Figure 5**).

The loops of the ATPase subunits lining the central pore of the hexamer are suspected to contact the substrate to be unfolded. ATP hydrolysis triggers conformational changes of individual ATPase subunits that exert a pulling force to unfold and translocate the substrate through the narrow central pore of the CP alpha ring, which is consecutively widened enough to accommodate an unfolded polypeptide chain. The hydrolysis rate of proteasomal AAA-ATPases is ∼30–50 molecules of ATP per minute, which is considerably slower than the rate of ClpX AAA-ATPases (Hoffman and Rechsteiner, 1996; Kraut et al., 2012; Kim et al., 2015). The slow velocity allows more processivity during substrate degradation, so that the machinery does not stall but rather drives through without slipping when it approaches a folded domain (Smith et al., 2011; Kim et al., 2015).
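To put the two quoted hydrolysis rates side by side, the following minimal sketch converts them into a fold difference and a mean time per hydrolysis event. It uses only the ranges stated in this review (∼100–500 ATP per minute for ClpX, quoted for the absence of substrate, and ∼30–50 ATP per minute for proteasomal ATPases). The measurement conditions behind the two ranges may differ, so the comparison is approximate; the constant and function names are illustrative.

```python
# Rate ranges quoted in the text (ATP molecules hydrolyzed per minute).
CLPX_ATP_PER_MIN = (100, 500)        # ClpX, quoted for the absence of substrate
PROTEASOME_ATP_PER_MIN = (30, 50)    # proteasomal AAA-ATPases


def seconds_per_atp(rate_per_min):
    """Mean time between hydrolysis events at a given rate."""
    return 60.0 / rate_per_min


def fold_difference(fast, slow):
    """Smallest and largest fast/slow ratio across the two ranges."""
    return fast[0] / slow[1], fast[1] / slow[0]


min_fold, max_fold = fold_difference(CLPX_ATP_PER_MIN, PROTEASOME_ATP_PER_MIN)
print(f"ClpX vs proteasomal ATPases: {min_fold:.1f}- to {max_fold:.1f}-fold")

for name, (low, high) in [("ClpX", CLPX_ATP_PER_MIN),
                          ("proteasomal ATPases", PROTEASOME_ATP_PER_MIN)]:
    print(f"{name}: {seconds_per_atp(high):.2f} to {seconds_per_atp(low):.2f} s per ATP")
```

Taken at face value, the quoted ranges put ClpX between roughly 2-fold and 17-fold faster, with a proteasomal ATPase spending on the order of 1 to 2 s per ATP.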

In eukaryotic proteasomes, the substrate polypeptide is engaged with the unfolding activity of the AAA-ATPase ring concurrently with the removal of the tetra-ubiquitin chain, but the recognition of the poly-ubiquitin chain is not sufficient for degradation. The proteolytic engagement requires an unstructured initiation site, which reaches through the OB-domain-containing N-ring to the AAA-pore (Prakash et al., 2004).

The site of the poly-ubiquitin chain to be cleaved off must be approximately thirty amino acids away where the ubiquitin isopeptidase activity of Rpn11 is located. The length of

FIGURE 5 | The AAA-ATPase ring of the human proteasome. (A) The AAA-ATPase is located on the alpha ring of the CP. The catalytic beta-subunits are colored in red, the alpha-subunits in blue. A magnified view of the AAA-ATPase ring is shown on the right. Coiled-coils of N-terminal regions reach out to other RP subunits. (B) The AAA-ATPase subunit colored in cyan is bound to ADP (red ellipse), while the other five AAA-ATPase subunits are bound to ATP (red box). (C) Rpn3 acts as sensor to induce conformational changes in the RP upon substrate docking into the ATPase ring (shown as a surface diagram). The C-terminus of Rpn3 colored in red is close to the pore of the N-ring (white line) and the O-ring (yellow line). This figure was prepared using the PDB ID: 5L4G through PyMOL (Ver. 1.8.0.2) molecular graphics software (Schrodinger, LLC, New York).

approximately thirty amino acids is also required for proteasomal model substrates with accessible termini that are degraded in vitro by proteasomes independently of ubiquitination (Kraut et al., 2007; Takeuchi et al., 2007). In some instances, one or two ubiquitin molecules are already sufficient for signaling degradation, suggesting that the tetra-ubiquitin chain is not necessarily a switch-on for degradation. The question is what could be the molecular ruler besides the poly-ubiquitin chain by which a protein is recognized as a proteasomal substrate. The susceptibility of the unstructured regions of the substrate to unfolding determines the efficacy of degradation rather than the anchoring of ubiquitin to the proteasome (Prakash et al., 2004). The size of the protein also seems to determine the pathway of degradation in favor of mono- over poly-ubiquitylation (Shabek et al., 2012). The accessibility of lengthy poly-ubiquitin chains to VAT/Cdc48, an abundant ubiquitous AAA-ATPase transiently interacting with archaeal proteasomes, also influences the fate of a proteasomal substrate, as Cdc48 facilitates the extraction of protein substrates stuck in membranes and protein aggregates (Godderz et al., 2015).

### NEWEST INSIGHTS INTO THE DETAILED MECHANISM OF AAA-ATPASES

According to current models of AAA-ATPases, individual subunits are in different stages of the ATPase cycle. Prokaryotic AAA-ATPases such as ClpX hydrolyze ATP in a semi-stochastic manner, whereas eukaryotic AAA-ATPases of the proteasome are suggested to hydrolyze ATP in an ordered and sequential cycle by binding ATP molecules to the ortho position (direct neighboring subunit) of the hydrolyzed ATP molecule. Allostery between eukaryotic AAA-ATPase subunits is mediated by trans-arginine fingers, which are lacking in ClpX, reflecting structural differences with regard to ATP hydrolysis and potentially resulting in distinct strategies for protein unfolding (Kim et al., 2015). ATP binding and hydrolysis induce coordinated conformational changes (Smith et al., 2011; Stinson et al., 2013). At saturating ATP concentration, all six Rpts adopt a staircase arrangement, with Rpt3 at the highest step and Rpt2 at the lowest step relative to the CP, whereas the C-terminal domains are positioned in a plane above the CP (Lander et al., 2012). Once the ring is engaged with a substrate, the staircase arrangement is no longer present (Matyskiela and Martin, 2013).

Subunit staggering and staircase arrangements are not due to the asymmetry of the heterohexameric ATPase ring of RP. It has been observed for prokaryotic homohexameric ATPases as well (Thomsen and Berger, 2009).

Could the staircase configuration be static and represent the optimal acceptor state for incoming polypeptides that have to be accommodated from different sites above the central entry pore? Ubiquitin-hydrolyzing activities by Ubp6 and Rpn11 and their corresponding ubiquitin receptor sites are asymmetrically positioned in the RP and hover above the substrate entry port of the Rpt ATPase ring. Substrate or ATP binding may swing the active site of Rpn11 toward the central pore of the AAA-ATPase from a discontinuous conformation to a position in which the AAA-ATPase pore is properly aligned with the alpha ring gate of the CP (Matyskiela and Martin, 2013).

The archaeal VAT ATPase, the archaeal counterpart of Cdc48/p97, showed a staircase arrangement of the homohexameric ring when at least one subunit was bound to ADP (Huang et al., 2016). Mutations in critical tyrosines of the VAT pore loops cause defects in protein unfolding and translocation (Gerega et al., 2005). Snapshots obtained by cryo-EM and NMR studies revealed movement between stacked and split-ring structures of VAT, suggesting repeated cycles of ATP binding and hydrolysis that set the central pore at different heights to generate the pulling force on the substrate. They reflect substrate-AAA pore loop contacts with the translocation channel into the proteasome (**Figure 3D**). Transient intermediates of substrate translocation through VAT ATPase were captured by cryo-EM. Substrate binding breaks the six-fold symmetry, allowing five of the six VAT subunits to constrict into a tight helix that grips an ∼80 Å stretch of unfolded protein. The structure suggests a processive hand-over-hand unfolding mechanism, where each VAT subunit releases the substrate in turn before re-engaging further along the target protein, thereby unfolding it (Ripstein et al., 2017).
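The hand-over-hand idea described above can be pictured as simple bookkeeping: five of six subunits hold the substrate at any time, and in each step the free subunit re-engages further along the chain while the rearmost subunit lets go. The sketch below is only a toy illustration of that bookkeeping, not a model derived from the cited structure. Grip positions are in arbitrary units, the rule for which subunit releases is an assumption, and the function `hand_over_hand` is a hypothetical name.

```python
def hand_over_hand(n_subunits=6, n_steps=6, spacing=1):
    """Toy bookkeeping of a hand-over-hand cycle.

    Returns a list of (grips, free) snapshots, where `grips` maps each
    engaged subunit to its grip position along the substrate (arbitrary
    units) and `free` is the subunit that is currently disengaged.
    """
    # Start with subunits 0..n-2 gripping consecutive positions; the last is free.
    grips = {s: s * spacing for s in range(n_subunits - 1)}
    free = n_subunits - 1
    history = []
    for _ in range(n_steps):
        # The free subunit re-engages one spacing beyond the farthest grip.
        grips[free] = max(grips.values()) + spacing
        # The subunit holding the rearmost position releases (assumed rule).
        rear = min(grips, key=grips.get)
        del grips[rear]
        free = rear
        history.append((dict(grips), free))
    return history


history = hand_over_hand()
for grips, free in history:
    print(f"free subunit: {free}, grips: {sorted(grips.items())}")
```

In this toy version, one full round of six steps lets each subunit release once in turn, and the set of grip positions advances by six spacings while five subunits stay engaged throughout.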

All previous mechanistic studies on AAA-ATPases were performed on idle hexamers with no unfolded peptide in the process of translocation (Ripstein et al., 2017). How many of the six subunits of the hexamer are actually loaded with nucleotides is not definitively determined, unless we assume that the subunits were either oversaturated with non-hydrolyzable analogs of ATP or completely bound to ADP. Negative allostery might be possible when ATP binding to one site prevents nucleotide binding to another site. Furthermore, it is unclear whether the six subunits of the ATPase hydrolyze ATP in a random, sequential, or concerted manner.

Electron cryo-tomography in cells also revealed asymmetrically twisted Rpt ATPase rings in 26S proteasomes which were assigned to enzymes engaged in degradation compared with idle enzymes in the ground state after ATP hydrolysis. The AAA pore loops are aligned in a spiral plane in the ground state and in a nearly planar configuration in the engaged state (Matyskiela and Martin, 2013; Unverdorben et al., 2014).

When active site mutants in Rpt subunits were compared, the most severe effects on protein degradation were observed for mutations in Rpt subunits within pore loops closest to the substrate entry point in the OB-containing N-ring pointing to the hot spot, the "commitment step" for final degradation (Erales et al., 2012; Beckwith et al., 2013).

Recent advances in dual-laser optical trapping technologies on single molecules allowed testing of the existing models of protein unfolding and degradation. Sophisticated reporter substrates such as ssrA-degron-(unfolded Titin)<sup>4</sup> were engineered to measure the mechanical forces that act on these substrates during translocation (Maillard et al., 2011; Sen et al., 2013; Cordova et al., 2014). Bacterial ClpP protease bound to either the double-ring AAA-ATPase ClpA or the single-ring ATPase ClpX was compared for the translocation capacity of the reporter substrate. The substrate was degraded substantially faster but translocated more slowly by the protease with the ClpA double ring than by the protease with the ClpX single ring (Olivares et al., 2016). The fundamental translocation step is independent of double or single ring architecture, supporting the conclusion that constraints imposed by the nucleotide state determine the size of a single power stroke (Glynn et al., 2009; Stinson et al., 2013). Similar settings in a dual optical tweezer assay using a GFP-labeled variant of ssrA-degron-(unfolded Titin)<sup>4</sup> allowed further characterization of the mechanochemical cycle of ClpXP. The AAA-ATPase motor cycles through two phases. In the dwell phase, ClpXP does not move its substrate. In the burst phase, ClpXP pushes the substrate in increments of a few nanometers, resulting from near-simultaneous ATP-driven conformational changes of single ATPase subunits that propel the substrate via individual power strokes (Aubin-Tam et al., 2011). ADP release and ATP binding occurred in the dwell phase, whereas ATP hydrolysis and phosphate release happened in the burst phase. Conformational re-settings of the pore loops appear to determine the time for ADP release from individual ATPase subunits (Rodriguez-Aliaga et al., 2016).
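The alternation of dwell and burst phases can be sketched as a minimal stochastic trace. The code below is a toy illustration only: the mean dwell time, the step size per power stroke, and the maximum number of strokes per burst are hypothetical placeholder values chosen for readability, not measurements from the cited studies, and bursts are treated as instantaneous. The function `simulate_dwell_burst` is an illustrative name.

```python
import random


def simulate_dwell_burst(n_cycles, mean_dwell_s=0.3, step_nm=1.0,
                         max_strokes=4, seed=0):
    """Toy dwell/burst trace: returns a list of (time_s, position_nm) points.

    All parameter defaults are hypothetical placeholders.
    """
    rng = random.Random(seed)
    t, x = 0.0, 0.0
    trace = [(t, x)]
    for _ in range(n_cycles):
        # Dwell phase: time passes, the substrate does not move.
        t += rng.expovariate(1.0 / mean_dwell_s)
        trace.append((t, x))
        # Burst phase: one or more near-simultaneous power strokes,
        # modeled here as an instantaneous jump in position.
        strokes = rng.randint(1, max_strokes)
        x += strokes * step_nm
        trace.append((t, x))
    return trace


trace = simulate_dwell_burst(n_cycles=10)
for time_s, position_nm in trace[:7]:
    print(f"t = {time_s:6.3f} s, x = {position_nm:4.1f} nm")
```

Plotting position against time from such a trace gives the staircase shape typical of stepping motors: flat segments for dwells and vertical jumps for bursts. Fitting real optical-trap data would require measured dwell-time distributions and burst sizes rather than the placeholders used here.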

Recent single-particle cryo-EM analysis of human 26S proteasomes to near-atomic resolution provided complementary information about the substrate-unfolding AAA-ATPase channel in its nucleotide-bound state (Chen et al., 2016) (**Figure 5**). The AAA pore is shaped by inward-facing pore loops, which are arranged in two parallel helices, one populated with hydrophobic and the other with charged amino acid residues. The interior of the AAA channel is negatively charged, the interior of the OB channel positively charged. Both parts of the channel are enriched in crucial tyrosine residues, which feature the conserved hydrophobic Tyr/Phe-Val/Leu/Ile-Gly pattern. The resolution of this critical region of the AAA-ATPase allowed the differentiation of four proteasome configurations. Six ATP molecules were tentatively modeled into the binding pockets, because the nucleotide state could not be determined for each Rpt subunit due to the averaging of single particle images. Surprisingly, the alpha rings of the CP were closed in three out of four conformations. The Rpt subunits seem to be in direct contact with the alpha ring of the CP by anchoring the HbYX motif of Rpt5 but not of Rpt2 into the respective inter-pocket of the CP alpha ring. Movements of the Rpt subunits on the alpha ring eventually facilitate the reach out of the HbYX motifs to Rpt1, 2, and 6 to their nearest inter-pockets, until remaining gate-blocking C-terminal tails align along the center axis of the pore. Taken together, the opening is primed through a series of coordinated, stepwise remodeling events, including the RP lid swinging into the appropriate position above the axial channel (Chen et al., 2016). The configuration of the ground state with the closed CP gate was consistent with recent high-resolution cryo-EM structures (Huang et al., 2016; Schweitzer et al., 2016). Rpt6 is structurally distinct from the other five Rpt subunits, most notably in its pore loop region.
Moreover, the C terminus of Rpn3 was found to protrude into the ATPase ring and proposed to trigger conformational changes to the AAA-ATPase ring (**Figure 5**). Rpn1 and Rpn2, the largest proteasome subunits, are linked by an extended alpha helix suggesting coordinated co-operations between the RP ATPases and non-ATPases to orchestrate substrate recognition, unfolding and translocation (Schweitzer et al., 2016).

# ESCAPE MECHANISMS OF AAA PROTEASES

The proteasome is committed to operate processively on a substrate and determines the substrate's fate (Lee et al., 2001). However, successful initiation of substrate translocation, presumably by the synergistic interaction between the AAA pore loops and the translocation channel into the CP, does not guarantee the execution of proteolysis when pore loop interactions with the gripped substrates are lost, especially when slippery elements of low complexity or intrinsically disordered sequences are positioned adjacent to folded domains. In particular, repetitive sequences of glycine-alanine residues resulted in the blockage of degradation, because the AAA-ATPase seems to slip on the repetitive sequences without being able to grasp the polypeptide (Levitskaya et al., 1995; Zhang and Coffino, 2004). The preferences of the AAA-ATPases for specific sequences seem to provide an additional component to the degradation code and may fine-tune the half-lives of cellular proteins. Clusters of glutamate repeats inhibited degradation of the protein (Fishbain et al., 2015), possibly by being repelled by negatively charged amino acid residues in the AAA-pore (Chen et al., 2016). Ubiquitin-associated (UBA) domains protect against proteasomal degradation, which would be detrimental for shuttling ubiquitin receptors such as Rad23 and Dsk2 that deliver poly-ubiquitylated substrates to the proteasome without being sacrificed. Insertion of a UBA domain near an intrinsically disordered region stabilizes the protein (Heessen et al., 2005; Heinen et al., 2011).

Tetra-ubiquitin can also be covalently linked to a subunit of a protein complex that is targeted to the proteasome without being degraded, because the subunit is sufficiently folded and not extracted by the Rpt ATPases. Instead, the neighboring subunit having an intrinsically disordered domain is degraded (Prakash et al., 2004). The reverse is also known: a ubiquitinated subunit of a complex is degraded, while the neighboring subunit remains intact (Hochstrasser and Varshavsky, 1990; Johnson et al., 1990; Verma et al., 2001). Thus, the Rpt AAA-ATPases seem to favor the substrate with the most easily accessible termini and the most likely initiation site, an unstructured region penetrating to the ATPase pore loops. Unstructured regions, such as the 37 amino acid long C-terminal tail of ODC, bind so tightly to the AAA-ATPase that polyubiquitination is not required for degradation, as known for other degrons in the bacterial and archaeal systems (Erales and Coffino, 2014). In vitro, proteins with largely unstructured regions such as NQO1 are even degraded by the CP without the aid of Rpt AAA-ATPases, but this mechanism is yet to be verified in vivo (Moscovitz et al., 2012).

The RP base complex harboring the Rpt AAA-ATPases was also shown to exhibit foldase activity of AAA-ATPase chaperones. Denatured citrate synthase without ubiquitin modification was refolded and reactivated by Rpt ATPase without being degraded by proteasomes (Braun et al., 1999).

Finally, proteasomal AAA-ATPases have also been proposed to be involved in non-proteolytic re-folding processes such as nucleotide excision repair (Gillette et al., 2001; Gonzalez et al., 2002). DNA microarrays revealed RP subunits but no CP subunits to be associated with chromosomal DNA. However, the experimental conditions under which chromatin immunoprecipitation assays are performed may weaken the interaction between RP and CP, resulting in the dissociation of the CP from the RP.

#### OUTLOOK

Different, and sometimes incompatible, models based on NMR, X-ray, and cryo-EM structure analysis are available to visualize important steps in protein substrate unfolding and translocation through AAA-ATPases which are associated with proteasomes and proteasome-like proteases. The usage of optical tweezers and fluorescence microscopy on single molecules allowed the first comprehensive mechanochemical characterization of a bacterial AAA-ATPase. Its motor power reconciles the product of generated force and translocation velocity. This novel approach is expected to add detailed pictures of how the chemical transitions in the ATPase cycle of an AAA-ATPase are coupled to the dwell and burst phases of the motor between its grip on the substrate and its pulling frequency. Future studies based on this technology will reveal whether related AAA-ATPases, including the eukaryotic 26S proteasome, may use similar mechanisms for ATP-dependent substrate unfolding and translocation.

#### AUTHOR CONTRIBUTIONS

CE is the corresponding author, wrote the first draft of the manuscript and approved the final version for publication. RY prepared the Figures and Figure legends. PW has substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

This work was supported by grants from NSERC (4422666), CIHR (325477) to CE, and the Deutsche Forschungsgemeinschaft to PW (WE4628/1).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Yedidi, Wendler and Enenkel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Proteasomal ATPases Use a Slow but Highly Processive Strategy to Unfold Proteins

Aaron Snoberger, Raymond T. Anderson and David M. Smith\*

*Department of Biochemistry, West Virginia University School of Medicine, Morgantown, WV, USA*

All domains of life have ATP-dependent compartmentalized proteases that sequester their peptidase sites on their interior. ATPase complexes will often associate with these compartmentalized proteases in order to unfold and inject substrates into the protease for degradation. Significant effort has been put into understanding how ATP hydrolysis is used to apply force to proteins and cause them to unfold. The unfolding kinetics of the bacterial ATPase, ClpX, have been shown to resemble a fast motor that traps unfolded intermediates as a strategy to unfold proteins. In the present study, we sought to determine if the proteasomal ATPases from eukaryotes and archaea exhibit similar unfolding kinetics. We found that the proteasomal ATPases appear to use a different kinetic strategy for protein unfolding, behaving as a slower but more processive and efficient translocation motor, particularly when encountering a folded domain. We expect that these dissimilarities are due to differences in the ATP binding/exchange cycle, the presence of a trans-arginine finger, or the presence of a threading ring (i.e., the OB domain), which may be used as a rigid platform to pull folded domains against. We speculate that these differences may have evolved due to the differing client pools these machines are expected to encounter.

#### Edited by:

*James Shorter, University of Pennsylvania, USA*

#### Reviewed by:

*Alfred L. Goldberg, Harvard Medical School, USA Steven E. Glynn, Stony Brook University, USA Peter Tsvetkov, Whitehead Institute of Biomedical Research, USA*

> \*Correspondence: *David M. Smith*

*dmsmith@hsc.wvu.edu*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

Received: *23 December 2016* Accepted: *16 March 2017* Published: *04 April 2017*

#### Citation:

*Snoberger A, Anderson RT and Smith DM (2017) The Proteasomal ATPases Use a Slow but Highly Processive Strategy to Unfold Proteins. Front. Mol. Biosci. 4:18. doi: 10.3389/fmolb.2017.00018*

Keywords: ATPase, proteasome, PAN, 26S, proteasomal ATPase, Rpt, AAA, AAA+

# INTRODUCTION

Virtually every cellular process relies on properly regulated protein degradation. Bacteria, archaea, and eukaryotes all have systems for targeted protein degradation (e.g., the ClpP protease in bacteria and the 20S proteasome in archaea and eukaryotes). Both ClpP and the 20S proteasome are capable of degrading unfolded proteins, but since their peptidase sites are sequestered on their hollow interior with only small pores through which substrates can enter, these proteases are not able to degrade folded proteins by themselves because they are too bulky to enter these narrow translocation pores. In order to stimulate degradation of folded proteins, regulatory ATPase complexes associate with the proteolytic complex and use the chemical energy from ATP hydrolysis to unfold and inject the folded proteins into the proteases' central chamber for degradation. While much is understood about this process, we do not have a detailed molecular understanding of how these different ATP-dependent machines engage with and forcibly translocate substrates for selective protein degradation (Smith et al., 2006; Finley, 2009; Alexopoulos et al., 2012; Bar-Nun and Glickman, 2012; Tomko and Hochstrasser, 2013; Mack and Shorter, 2016).

To date some of the better characterized regulatory complexes for the 20S proteasome are the heterohexameric 19S regulatory particle in eukaryotes (which forms the 19S–20S, or "26S" complex) and the homohexameric 19S homolog in archaea, PAN (Proteasome Activating Nucleotidase). One of the most extensively studied ClpP regulators is ClpX. In general, the 19S, PAN, and ClpX utilize ATP to: (1) bind and open the gate of their respective protease (Grimaud et al., 1998; Smith et al., 2005; Liu et al., 2006; Alexopoulos et al., 2013), (2) recognize proper substrates (Thibault et al., 2006; Peth et al., 2010; Smith et al., 2011; Kim et al., 2015), and (3) unfold and inject them into their protease's degradation chamber (Ortega et al., 2000; Singh et al., 2000; Prakash et al., 2004; Zhang et al., 2009; Erales et al., 2012). All three of these regulators are members of the AAA+ superfamily (ATPases associated with diverse cellular activities), but only PAN and the 19S ATPases belong to the same AAA sub-clade, which also contains the SRH region (Lupas and Martin, 2002). Due to the complexities of generating ubiquitinated globular substrates that could be degraded by the purified 26S proteasome, far more functional studies have been done on PAN and ClpX, which only require the presence of a small unfolded region (i.e., ssrA) to trigger substrate degradation (Hoskins et al., 2002; Benaroudj et al., 2003). Although they serve similar functions, ClpX and the proteasomal ATPases may not exhibit similar mechanochemical translocation mechanisms, which would not be unexpected since they each belong to different sub-clades of the AAA+ family. Recent functional studies suggest that they may also have different ATP hydrolysis characteristics.
For example, evidence suggests that ClpX hydrolyzes ATP in a semi-stochastic fashion (Sauer and Baker, 2011), whereas the proteasomal ATPases appear to use an ordered, sequential cycle with a specific "ortho" binding pattern (binding to neighboring subunits) which is subject to expected equilibrium binding considerations (Smith et al., 2011; Kim et al., 2015). Additionally, function-critical allostery between subunits is mediated by the proteasomal ATPase's trans-arginine fingers (Kim et al., 2015), which is lacking in ClpX (Kim and Kim, 2003). These differences in the structure and hydrolysis patterns of ClpX and the proteasomal ATPases suggest they may use distinct mechanical strategies to unfold proteins.

Prior studies have shown that when ClpX is translocating on a protein and encounters a stably folded domain (e.g., GFP) it will often stop and even slip backward before taking another run at the folded domain. It's thought that this can occur over and over until spontaneous unfolding occurs after which ClpX quickly translocates onto the unfolded domain, trapping it, and preventing its refolding (Aubin-Tam et al., 2011; Maillard et al., 2011; Nager et al., 2011; Iosefson et al., 2015b; Rodriguez-Aliaga et al., 2016). ClpX may also perturb the folded domain prior to trapping. This likely continues until the whole domain is unfolded (**Figure 1A**). In this proposed model ClpX seems to function at high velocity, whereby quick trapping of unfolded intermediates (rather than brute force unfolding) is the primary strategy used to unfold the domain. Alternatively, one can think of this as a motor with high velocity, but with low processivity when it encounters an obstacle to translocation that causes slipping. Interestingly, the ATP hydrolysis rate of ClpX is ∼100–500 ATPs per minute in the absence of substrate (Martin et al., 2005; Aubin-Tam et al., 2011; Maillard et al., 2011; Nager et al., 2011; Baytshtok et al., 2015; Iosefson et al., 2015a; Rodriguez-Aliaga et al., 2016),
which is considerably faster than the ∼30–60 ATPs per minute of the proteasomal ATPases (Hoffman and Rechsteiner, 1996; Kraut et al., 2012; Kim et al., 2015). Consistent with this high velocity, low processivity mechanism, ClpX has been shown to exhibit a non-linear relationship with regard to its ATPase rate and substrate unfolding rate, especially in more tightly folded substrates (Nager et al., 2011). This is expected since at saturating ATP concentrations ClpX is able to translocate at maximal rates and trap unfolded intermediates, but when the ATPase rate is slowed (by using lower ATP concentrations or by competing with non-hydrolyzable ATPγS) the net translocation rate is also slowed when the unfolded intermediates refold before ClpX can trap them. Thus, at lower ATP hydrolysis rates ATP hydrolysis becomes non-productive and ClpX continually slips on the substrate without productive translocation (**Figure 1A**). This model for ClpX translocation kinetics has also been supported with single-molecule force experiments (Aubin-Tam et al., 2011; Maillard et al., 2011; Iosefson et al., 2015b; Rodriguez-Aliaga et al., 2016).

In the present study, we ask if the proteasomal ATPases have translocation and unfolding kinetics that are consistent with this model of ClpX, or if its structural and mechanochemical differences allow it to take a different strategy for substrate unfolding. We show that, unlike ClpX, the 19S and PAN proteasomal ATPases resemble a lower velocity, but highly processive motor that is slower than ClpX but does not appear to stall when it approaches the stably folded domain of GFP, but rather it drives through it without slipping. These kinetics are consistent with the hand over hand sequential mechanism of ATP hydrolysis that has been proposed for the proteasomal ATPases (Smith et al., 2011; Kim et al., 2015). These data therefore suggest that proteasomal ATPases, while slower, are more processive and efficient than ClpX and use a different kinetic strategy for unfolding substrates.

# RESULTS

In order to test the unfolding ability of PAN, we used the model substrate of GFP with an unstructured ssrA tag fused to its N-terminus (GFPssrA). GFP's fluorescence is dependent on its tertiary structure; therefore, the rate of unfolding can be monitored by following its decrease in fluorescence in real time. As expected, PAN unfolded GFPssrA in an ATP-dependent manner (**Figure 1B**). The slow loss of GFP fluorescence in the "no ATP" control is attributed to the slow bleaching of GFP with time, which is expected. To determine the catalytic affinity (Km) for GFP we performed a GFPssrA dose response at saturating [ATP] (2 mM). The unfolding rate was determined by calculating the maximum linear rate of the change in GFP fluorescence with time. The Vmax of GFPssrA unfolding was 0.44 ± 0.01 GFPs·PAN<sup>−1</sup>·min<sup>−1</sup>, which indicates that PAN takes ∼2 min to unfold a single GFP. This unfolding rate for the proteasomal ATPases is consistent with prior observations (Benaroudj et al., 2003). In addition, the Km was found to be 0.187 µM (**Figure 1C**). Next, we determined the ATP hydrolysis rate in PAN using a real-time NADH-coupled assay and found the rate of ATP hydrolysis to be 58.5 ± 3.5 ATPs·PAN<sup>−1</sup>·min<sup>−1</sup> in the absence of substrate; it was activated ∼1.7-fold to 97.0 ± 2.9 ATPs·PAN<sup>−1</sup>·min<sup>−1</sup> upon addition of saturating GFPssrA (2 µM), which is also consistent with previous reports (Kim et al., 2015; **Figure 1D**). The ATP hydrolysis rate we found for PAN is fairly similar to previous reports in the mammalian 26S proteasome, which place the ATPase rates between ∼30 and 50 ATPs per minute in the absence of substrate (Hoffman and Rechsteiner, 1996; Kraut et al., 2012), with a ∼1.5–2-fold activation upon addition of substrate (Peth et al., 2013). We compared this ATP hydrolysis rate to previously reported ATP hydrolysis rates for the pseudohexameric ClpX.
Reported ATPase rates for the ClpX pseudohexamer tend to vary quite a bit (∼100–500 ATPs per minute; Martin et al., 2005; Aubin-Tam et al., 2011; Maillard et al., 2011; Nager et al., 2011; Baytshtok et al., 2015; Iosefson et al., 2015a; Rodriguez-Aliaga et al., 2016), but all of these rates are considerably faster than the reported basal rates for the proteasomal ATPases. Addition of substrate to ClpX typically increases its ATP hydrolysis rate, although the degree to which ClpX is activated depends upon the substrate analyzed (Kenniston et al., 2003; Baytshtok et al., 2015; Iosefson et al., 2015a).
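The rates quoted above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: the constant and function names are illustrative rather than taken from the study, a simple hyperbolic (Michaelis–Menten) rate law is assumed, and the ATP-per-GFP ratio is an implied quantity that the text does not state.

```python
# Reported values for PAN at saturating ATP (taken from the text above).
VMAX_UNFOLD = 0.44         # GFPs per PAN per min
KM_GFP_UM = 0.187          # µM GFPssrA
ATPASE_BASAL = 58.5        # ATPs per PAN per min, no substrate
ATPASE_STIMULATED = 97.0   # ATPs per PAN per min, 2 µM GFPssrA


def michaelis_menten(s, vmax, km):
    """Hyperbolic rate law (assumed form); s and km share the same units."""
    return vmax * s / (km + s)


# Time needed to unfold one GFP at Vmax.
minutes_per_gfp = 1.0 / VMAX_UNFOLD
# Substrate stimulation of the ATPase rate.
fold_activation = ATPASE_STIMULATED / ATPASE_BASAL
# At [GFPssrA] = Km the unfolding rate should be half of Vmax.
rate_at_km = michaelis_menten(KM_GFP_UM, VMAX_UNFOLD, KM_GFP_UM)
# Implied ratio only (an assumption of this sketch, not a reported value).
atp_per_gfp = ATPASE_STIMULATED / VMAX_UNFOLD

print(f"{minutes_per_gfp:.1f} min per GFP")
print(f"{fold_activation:.2f}-fold ATPase activation")
print(f"{rate_at_km:.2f} GFPs per PAN per min at Km")
print(f"{atp_per_gfp:.0f} ATPs hydrolyzed per GFP unfolded (implied)")
```

The first two results correspond to the "∼2 min" and "∼1.7-fold" figures given in the text.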

A longstanding question in the proteasomal ATPase field is how chemical energy from ATP is converted into mechanical work on substrates, and the efficiency of such mechanochemical coupling is informative to mechanism. In ClpX, it was found that at higher ATPase rates, ClpX has quite efficient mechanochemical coupling; however, at lower ATPase rates coupling is less efficient (i.e., at lower ATPase rates, ATP hydrolysis often does not lead to unfolding). This less efficient mechanochemical coupling can be observed by decreasing the rate of ATP hydrolysis by either reducing total [ATP] or competing with non-hydrolyzable nucleotide. In order to test the mechanochemical coupling efficiency of PAN, we simultaneously measured, in real time, the unfolding rate of GFPssrA and PAN's ATPase activity (via absorbance of NADH in a coupled ATPase assay—see Materials and Methods Section). 0.2µM GFPssrA (∼Km) was incubated with PAN at various concentrations of ATP to determine the ATPase (**Figure 2A**) and unfoldase rates (**Figure 2B**). To our surprise, Km-values of PAN's ATPase and GFPssrA unfolding matched quite well with one another, with the Km of ATPase activity being 0.397 ± 0.017µM and the Km for GFPssrA unfolding being 0.429 ± 0.025µM. This suggested a tight coupling between unfolding and ATPase rates at least around ½ Vmax. We then plotted the GFP unfolding and ATP hydrolysis rates against each other on a single 2-dimensional plot (**Figure 2C**). Surprisingly, the data was very linear and fit a linear curve with an R <sup>2</sup> of 0.9918. Therefore, PAN exhibits a 1:1 mechanochemical coupling of ATPase and unfoldase activities. In contrast, prior experiments with ATPases that stall (e.g., ClpX) have shown that its ATPase to GFPssrA unfoldase plot is highly non-linear (e.g., when the ATPase rate is ∼50%, the unfolding rate drops to <5%). In **Figures 2C,F**, we show a dotted gray line as an example of what the ATPase vs. 
unfoldase plot would look like in a stalling ATPase (e.g., ClpX). This non-linear ATPase to GFPssrA unfoldase relationship has been attributed to increased substrate "stalling" and "slipping" upon reaching a globular domain (i.e., GFP's beta-barrel), which results in non-productive ATP hydrolysis (Aubin-Tam et al., 2011; Maillard et al., 2011; Nager et al., 2011; Iosefson et al., 2015b; Rodriguez-Aliaga et al., 2016). Since we found that PAN's ATPase activity is directly proportional (1:1) to GFPssrA unfolding, this data indicates that PAN essentially does not slip when it reaches the folded domain of the GFP beta-barrel. We repeated the experiment using saturating levels of GFPssrA (2 µM) and found that the Km for ATPase activity and GFP unfolding were nearly identical to one another (**Figures 2D,E**). Consistent with **Figure 1C**, the Vmax for unfolding was 2-fold higher at saturating [GFPssrA] (0.43 ± 0.03 GFPs·PAN<sup>−1</sup>·min<sup>−1</sup>; **Figure 2E**) than the Vmax at ∼Km concentrations of GFPssrA (0.19 ± 0.01 GFPs·PAN<sup>−1</sup>·min<sup>−1</sup>; **Figure 2B**). This is expected since the unfolding rate at Km concentrations of GFPssrA should be ½ of the Vmax. Consistent with prior observations, we observed here that saturating levels of GFPssrA stimulated the Vmax for ATPase activity by ∼1.7-fold when compared to the no substrate ATPase experiments (**Figure 1D**), and by ∼1.2-fold when compared to the 200 nM GFPssrA experiments (**Figures 2A,D**). Interestingly, we found that in addition to increasing the Vmax, saturating levels of GFPssrA also lowered the Km for ATP hydrolysis and substrate unfolding ∼2–3-fold (compare Km-values in **Figures 2A,B** to Km-values in **Figures 2D,E**). This may suggest an underlying mechanism for substrate stimulated ATPase activity, which is well-established in the literature. In
addition, the similar Km between ATPase and unfoldase activities at saturating substrate levels is consistent with the linear fit (R <sup>2</sup> = 0.9455) that we observe when plotting ATP hydrolysis against GFP unfolding (**Figure 2F**), similar to **Figure 2C**. Thus, even when all PAN complexes are bound to a GFPssrA the rate of ATP hydrolysis is tightly coupled to GFP unfolding. In other words, hydrolysis of ATP by PAN almost always results in a successful translocation event, even when it meets a globular domain.
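To illustrate why near-identical Km values for ATP hydrolysis and unfolding imply a linear ATPase vs. unfoldase plot, the sketch below generates two hyperbolic ATP dose-response curves and regresses one against the other. It is a hedged illustration, not the authors' analysis: the ATP grid and the ATPase Vmax of 80 are hypothetical, the Km values and the unfolding Vmax are the ones quoted for 0.2 µM GFPssrA, and the fifth-power "stalling" curve is an invented stand-in for a motor whose unfolding rate falls below 5% at 50% ATPase activity.

```python
def hyperbola(s, vmax, km):
    """Hyperbolic dose response; s and km share the same units."""
    return vmax * s / (km + s)


def r_squared(xs, ys):
    """R^2 of a simple least-squares line (with intercept) of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Hypothetical ATP concentrations, in the same units as the Km values.
atp = [0.05, 0.1, 0.2, 0.4, 0.8, 1.5, 3.0]

ATPASE_VMAX = 80.0  # hypothetical value, chosen only for illustration
atpase = [hyperbola(s, ATPASE_VMAX, 0.397) for s in atp]  # Km quoted in text
unfold = [hyperbola(s, 0.19, 0.429) for s in atp]         # Km, Vmax from text

r2_coupled = r_squared(atpase, unfold)

# Hypothetical stalling motor: unfolding scales with the fifth power of the
# fractional ATPase rate, so 50% ATPase gives about 3% unfolding.
stalling = [0.19 * (a / ATPASE_VMAX) ** 5 for a in atpase]
r2_stalling = r_squared(atpase, stalling)

print(f"tightly coupled motor: R^2 = {r2_coupled:.4f}")
print(f"stalling motor (toy):  R^2 = {r2_stalling:.4f}")
```

Under these assumptions the coupled case stays very close to a straight line, while the toy stalling curve does not, which mirrors the contrast drawn between PAN and the dotted gray line in **Figures 2C,F**.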

The eukaryotic 19S ATPases are homologous to PAN; however, the 19S forms a heterohexameric ring and has many additional associated non-ATPase subunits while PAN forms a homohexameric ring and has no known non-ATPase subunits. Therefore, it was unclear whether the 1:1 mechanochemical coupling of ATPase rate to substrate unfolding that we observed in PAN would be a general property of proteasomal ATPases, or whether it would only apply to the archaeal proteasomal ATPases. Therefore, we sought to determine whether the eukaryotic 26S (i.e., 19S–20S complex) also had a similar linear relationship between its ATPase and unfoldase activity. The Matouschek group generously provided us with a novel 26S substrate, Ub<sup>4</sup> (lin)-GFP35-His<sup>6</sup> , suitable for use with in vitro 26S unfolding assays. Such a substrate is very useful for mechanistic studies since it allows for the analysis of ubiquitin- and ATP-dependent degradation using the purified 26S proteasome. For the 26S
proteasome to remain functional it requires the persistent presence of ATP, so we could not assess coupling of ATPase and substrate unfolding using the ATP dose response as was done in **Figure 2** for PAN because low ATP concentrations would induce disassembly of the 26S proteasome (Thompson et al., 2009). Instead, we slowed ATPase rate by competing ATP with the largely non-hydrolyzable ATP analog, ATPγS (which by itself stabilizes the 26S complex as does ATP). We first performed this ATPγS competition experiment in PAN and found that as the ATPγS:ATP ratio increased, GFPssrA unfolding rate decreased in a 1:1 linear relationship with the ATPγS:ATP ratio (R<sup>2</sup> = 0.989; **Figure 3A**). This is consistent with and further supports our observations with the ATP dose response method in **Figures 2C,F**, and it demonstrates that the ATPγS:ATP ratio method mimics a linear decrease in ATP
hydrolysis activity in PAN similar to the ATP dose response. We next performed a similar ATPγS competition experiment using the Ub<sup>4</sup> (lin)-GFP<sup>35</sup> substrate and the eukaryotic 26S proteasome and were surprised to find that the 26S had similar 1:1 unfolding kinetics to that observed in PAN (**Figure 3B**) with a strong linear fit (R <sup>2</sup> = 0.982). These ATPγS competition experiments demonstrate that ATP hydrolysis and unfolding are also tightly coupled in ubiquitin-dependent protein degradation by the eukaryotic 26S proteasome. In addition, this indicates that the tight mechanochemical coupling between ATP hydrolysis and unfolding ability is shared between PAN and the 26S and thus it is expected to be a general property of the proteasomal ATPases despite their structural differences.

#### DISCUSSION

Previous studies reveal that the bacterial ClpX pseudohexamer resembles a higher velocity motor. It also has a correspondingly quick steady-state translocation rate: for example ∼7 amino acids per second on the "non-stalling" substrate, cp6SFGFPssrA (Nager et al., 2011). However, when ClpX reaches a tightly folded domain "stalling" and "slipping" can occur, whereby it loses its grip on the substrate and the substrate is often released, resulting in unproductive ATP hydrolysis (Aubin-Tam et al., 2011; Maillard et al., 2011; Nager et al., 2011; Iosefson et al., 2015b; Rodriguez-Aliaga et al., 2016; **Figures 4A,B**). In contrast, the proteasomal ATPases hydrolyze ATP considerably more slowly than does ClpX and we estimate that proteasomal ATPases translocate on non-stalling substrates at an average rate of ∼1.0–1.9 amino acids per second, or about ∼3–7 times more slowly than ClpX. Interestingly, despite these differences in translocation velocity both PAN and ClpX show a similar cost for non-stalling translocation at a mean of ∼1.1–1.2 amino acids translocated per ATP that is hydrolyzed (**Figure 4A**). Despite this similarity, here we find for the proteasomal ATPases that even at low ATPase rates ATP hydrolysis is tightly coupled with translocation, which is the force that drives unfolding. This is consistent with a lack of substrate "slipping," and indicates that proteasomal ATPases are more efficient and processive than ClpX particularly when they reach a folded domain. Therefore, the proteasomal ATPases operate at a lower velocity, but also have higher processivity since they do not slip or lose grip on the substrate (**Figures 4A,B**). 
This suggests that ClpX and PAN utilize different kinetic strategies to unfold proteins: ClpX uses a fast translocation strategy to trap unfolded intermediates, while the proteasomal ATPases use a slower but more processive and efficient kinetic strategy to drive through unfolded domains with a tight mechanochemical coupling between ATP hydrolysis and translocation events.
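The velocity comparison and the translocation cost quoted above can be related by simple arithmetic. The sketch below is illustrative only: it assumes that the substrate-stimulated PAN ATPase rate reported in the Results (97 ATPs per PAN per minute) is the relevant rate for the cost calculation, and the variable names are not from the study.

```python
# Values quoted in the text.
CLPX_AA_PER_S = 7.0                   # ClpX on a non-stalling substrate
PROTEASOMAL_AA_PER_S = (1.0, 1.9)     # estimated range, proteasomal ATPases
COST_AA_PER_ATP = (1.1, 1.2)          # amino acids translocated per ATP
PAN_ATP_PER_MIN = 97.0                # assumption: GFPssrA-stimulated rate

# How many times slower the proteasomal ATPases translocate than ClpX.
fold_slower_min = CLPX_AA_PER_S / PROTEASOMAL_AA_PER_S[1]
fold_slower_max = CLPX_AA_PER_S / PROTEASOMAL_AA_PER_S[0]

# Velocity implied by cost x ATPase rate (rearranging footnote c of Figure 4:
# cost = translocation rate / ATPase rate).
pan_atp_per_s = PAN_ATP_PER_MIN / 60.0
implied_low = COST_AA_PER_ATP[0] * pan_atp_per_s
implied_high = COST_AA_PER_ATP[1] * pan_atp_per_s

print(f"ClpX is {fold_slower_min:.1f}-{fold_slower_max:.1f}x faster")
print(f"implied PAN velocity: {implied_low:.2f}-{implied_high:.2f} aa/s")
```

Under this assumption the implied PAN velocity falls near the upper end of the ∼1.0–1.9 amino acids per second range given in the text.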

FIGURE 4 | Comparison of the unfolding kinetics for the proteasomal ATPases vs. ClpX. (A) Summary of ClpX and the proteasomal ATPases' unfolding kinetics taken from experiments performed in this manuscript as well as by other groups (cited in main text). Footnotes: <sup>a</sup>ATP hydrolysis rate in the absence of substrate. <sup>b</sup>Steady state translocation rates are taken from mean unfolding rates with non-stalling substrates. <sup>c</sup>Translocation cost is calculated as the rate of steady state translocation on a non-stalling substrate, divided by the ATPase rate of the enzyme on that same substrate. <sup>d</sup>Stalling is defined as <5% of max unfolding rate at 50% max ATPase activity (Nager et al., 2011). (B) Working model: ClpX ATPases resemble a higher velocity, less processive motor that is prone to slipping. ClpX translocates rather quickly along a loosely folded protein domain. However, at low ATP concentrations, ClpX is unable to drive through tightly folded protein domains, and thus undergoes multiple slips and stalls, and can even dissociate from the protein completely. Proteasomal ATPases resemble a lower velocity, more processive motor. The proteasomal ATPases translocate more slowly along a loosely folded protein domain, but even at these lower speeds the proteasomal ATPase is able to drive through more tightly folded domains (i.e., GFP) without significant slipping or stalling.

What functional characteristics in these ATPases could cause these different kinetic strategies for unfolding proteins? One possibility is the sequential vs. semi-stochastic mechanisms that have been proposed for the proteasomal ATPases vs. ClpX (**Figure 4A**). It could be expected that a semi-stochastic ATP hydrolysis mechanism could lead to states of the ring where all ATPs are hydrolyzed, leaving ClpX in an ADP-bound state
only. Since ATP binding drives substrate association, this could lead to loss of substrate affinity and slipping, especially when ATP is limiting. In contrast, it has been proposed that the proteasomal ATPases use a sequential single subunit progression mechanism for ATP hydrolysis (Kim et al., 2015). In this model, at least one ATPase subunit is always bound to an ATP, supporting constant affinity for the substrate, which would be expected to prevent slipping. In this model it would thus be expected that most hydrolysis events are coupled to translocation events, which is supported by our data presented here. This tight mechanochemical coupling can be explained by two different models for the proteasomal ATPases: (1) ATP hydrolysis has sufficient power to forcibly unfold GFP with each power stroke, allowing the ATPase to drive through unfolded domains or (2) ATP hydrolysis does not occur in any one subunit until translocation can take place. These two models could represent differences in the "power stroke" vs. "Brownian ratchet" mechanisms, and many ATPase motors exhibit a blending of both of these mechanisms, but neither of these have been determined for the proteasomal ATPases. However, both models are consistent with the data we have shown here. It's also possible that other structural differences between ClpX and the proteasomal ATPases could play a role in the unfolding kinetics. For example, the proteasomal ATPases have trans-arginine fingers (vs. cis-arginine fingers in ClpX), which constitutes an arginine that allows one subunit to contact the gamma phosphate of the ATP bound to the Walker A/B sites in its neighboring subunit. This arginine is critical for the effects of ATP-binding in the proteasomal ATPases, which include promoting substrate binding, and the association of PAN/19S with the 20S core particle and gate-opening. The placement and allosteric role of this trans-arginine is a fundamental difference between the proteasomal ATPases and ClpX. 
In addition, the role of the trans-arginine finger combined with the single subunit progression model produces a hand-over-hand translocation model that would be expected to exhibit a high "grip" strength mechanism that allows for high substrate binding affinity even at low ATP (Kim et al., 2015). The proteasomal ATPases also contain a rigid ring of OB domains that substrates are threaded through during translocation. This threading ring generates a rigid platform that folded domains can be pulled against during translocation to cause unfolding. The lack of such a domain in ClpX means that globular domains are pulled into and against the ATPase domains themselves during translocation (especially for the ΔN-ClpX which is used in most of the in vitro experiments that study translocation), which could sterically alter their activity during forceful pulling, and could perhaps cause slipping as well (**Figure 4A**).

So why might these two distinct mechanisms have evolved for unfolding proteins? In bacteria, ssrA tags are added to the C-terminus of translationally stalled proteins on ribosomes. In fact, ∼1 in 200 translated proteins are tagged by ssrA, and of these, >90% are degraded by ClpX(P) (Lies and Maurizi, 2008). The vast majority of these translationally stalled proteins will produce truncated proteins, which will typically prevent proper folding, thus destabilizing these proteins. These truncated proteins must also be rapidly degraded in order to prevent aggregation and/or toxicity to the cell. Therefore, a high-velocity unfoldase like ClpX is well-suited to quickly handle such proteins, and perhaps ClpX would only rarely be expected to encounter a more tightly folded protein, which could be handled by other ATPases in bacteria such as ClpA. On the other hand, here we have observed that the proteasomal ATPases resemble a lower velocity motor with a more processive and efficient translocation mechanism. Why might this be? The proteasome degrades most proteins in the cell, both unfolded as well as fully folded, functional proteins. Thus, in order for the proteasome to function optimally for this job it must be able to routinely handle more tightly folded domains than ClpX typically encounters. The high processivity, low velocity characteristics that we have observed here for the proteasomal ATPases seem to be optimized for their specific client pool of proteins, which demands reliable degradation of folded and functional proteins. Therefore, we propose that the need to unfold and degrade most folded proteins in the cell is the reason that the proteasomal ATPases use a slower but more processive strategy for protein unfolding and degradation.

# MATERIALS AND METHODS

## Materials, Plasmids, and Protein Purification

PAN, GFPssrA, and T20S were prepared as described (Smith et al., 2005, 2007). The purest available forms of ATP and ATPγS were purchased from Sigma and stored at −80°C until use. Rabbit muscle 26S proteasome was purified by the previously described UBL-UIM method (Besche et al., 2009) and was exchanged into reaction buffer by rapid spin column or by dialysis (4 h) immediately prior to use.

Ub<sup>4</sup> (lin)-GFP35-His<sup>6</sup> plasmid was a generous gift from Andreas Matouschek and his lab. Plasmids were transfected into DH5α cells, and 1 L cultures were grown at 37°C with shaking at 300 RPM and induced with IPTG at OD<sub>600</sub> = 0.8 for 4 h. Cell pellets were resuspended in Buffer A (50 mM Tris pH 7.5, 5% glycerol, 300 mM NaCl, 20 mM imidazole) with 1X protease inhibitor cocktail. Cells were lysed via sonication and spun at 20,000 × g for 30 min. Supernatant was loaded onto Nickel-NTA, washed with 10 CV Buffer A, and eluted with Buffer B (Buffer A with 300 mM imidazole). Fractions containing Ub<sup>4</sup> (lin)-GFP35-His<sup>6</sup> were pooled based on fluorescence (ex/em: 485/510) and SDS-PAGE. Pooled fractions were concentrated and further purified using size-exclusion chromatography (GE Superose 12 column). The purest fractions were exchanged into 50 mM Tris pH 7.5 + 5% glycerol.

## ATPase and GFPssrA Unfolding Assays

ATP hydrolysis was measured by reading the loss of NADH absorbance at 340 nm in an NADH-coupled ATP regenerating system (50 mM Tris pH 7.5, 5% glycerol, 20 mM MgCl<sub>2</sub>, 2 U/µl pyruvate kinase, 2 U/µl lactate dehydrogenase, 3 mM phosphoenolpyruvate, 0.2 mg/ml NADH, and indicated [ATP]). GFPssrA unfolding was assessed by loss of fluorescence at ex/em: 485/510. For the unfolding experiments, reaction buffer (50 mM Tris pH 7.5, 5% glycerol, 20 mM MgCl<sub>2</sub>) was incubated with 50 nM PAN, 400 nM T20S, and 0.2 nM GFPssrA (or 25 nM 26S and 100 nM Ub<sup>4</sup> (lin)-GFP35-His<sup>6</sup> for experiments with 26S) and 2 mM ATP (or with indicated ATPγS:ATP ratios with 2 mM total nucleotide). GFP fluorescence loss (ex/em: 485/510) was measured every 20 s in a Biotek 96-well plate reader to obtain unfolding rates. Error bars represent standard deviations from at least three independent experiments (n ≥ 3).
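In an NADH-coupled assay of this kind, one NADH is oxidized for each ATP hydrolyzed, so the absorbance slope at 340 nm can be converted into a per-enzyme ATPase rate with the Beer–Lambert law. The sketch below shows one way to do this conversion; it is not the authors' analysis script. The function names are hypothetical, the 1 cm path length is an assumption (the effective path length in a plate reader depends on well volume), and the standard NADH extinction coefficient of 6220 M⁻¹·cm⁻¹ at 340 nm is used.

```python
NADH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1, standard value for NADH at 340 nm


def atp_per_enzyme_per_min(slope_a340_per_min, enzyme_molar, path_cm=1.0):
    """Convert an A340 slope (per min) into ATPs per enzyme per min.

    Assumes 1 NADH oxidized per ATP hydrolyzed; path_cm is an assumed value.
    """
    nadh_molar_per_min = abs(slope_a340_per_min) / (NADH_EXTINCTION_340 * path_cm)
    return nadh_molar_per_min / enzyme_molar


def expected_slope(rate_per_min, enzyme_molar, path_cm=1.0):
    """Inverse calculation: the (negative) A340 slope for a given ATPase rate."""
    return -rate_per_min * enzyme_molar * NADH_EXTINCTION_340 * path_cm


PAN_MOLAR = 50e-9  # 50 nM PAN, as in the assay described above

# Slope expected for the basal PAN rate of 58.5 ATPs per PAN per min.
slope = expected_slope(58.5, PAN_MOLAR)
# Round trip back to a per-enzyme rate.
recovered_rate = atp_per_enzyme_per_min(slope, PAN_MOLAR)

print(f"expected A340 slope: {slope:.4f} per min")
print(f"recovered rate: {recovered_rate:.1f} ATPs per PAN per min")
```

With these assumptions, the basal PAN rate corresponds to an absorbance loss of roughly 0.018 per minute, which gives a sense of the signal size the assay must resolve.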

ATP hydrolysis and GFPssrA unfolding were assessed concurrently in a Biotek 96-well plate reader by measuring NADH absorbance loss alongside GFPssrA fluorescence loss. The ATP regenerating system buffer (above) was incubated with indicated [ATP] (0–3 mM), 50 nM PAN, 400 nM T20S, and 0.2 µM or 2 µM GFPssrA. Rates of ATP hydrolysis and GFPssrA unfolding were extrapolated, and Vmax and Km-values were obtained by non-linear regression analysis in SigmaPlot using the Hill equation. Error bars are standard deviations from at least three independent experiments (n ≥ 3).
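For readers who wish to reproduce this kind of fit without SigmaPlot, the sketch below fits the Hill equation by a coarse grid search over Km and the Hill coefficient, solving for Vmax in closed form at each grid point. This is a deliberately simple substitute for SigmaPlot's non-linear regression, not the procedure used in the study. The data are synthetic (generated from hypothetical parameters), and the function names and grid ranges are assumptions of this example.

```python
def hill(s, vmax, km, n):
    """Hill equation: rate at substrate concentration s (s > 0)."""
    return vmax * s ** n / (km ** n + s ** n)


def fit_hill(conc, rates):
    """Grid search over km and n; vmax by linear least squares at each point.

    Returns (vmax, km, n) minimizing the sum of squared errors on the grid.
    """
    best = None
    for i in range(5, 101):          # km from 0.05 to 1.00 in steps of 0.01
        km = i / 100
        for j in range(5, 31):       # n from 0.5 to 3.0 in steps of 0.1
            n = j / 10
            basis = [hill(s, 1.0, km, n) for s in conc]
            vmax = (sum(r * b for r, b in zip(rates, basis))
                    / sum(b * b for b in basis))
            sse = sum((r - vmax * b) ** 2 for r, b in zip(rates, basis))
            if best is None or sse < best[0]:
                best = (sse, vmax, km, n)
    return best[1], best[2], best[3]


# Synthetic, noise-free dose response from hypothetical parameters.
conc = [0.05, 0.1, 0.2, 0.4, 0.8, 1.5, 3.0]
rates = [hill(s, 0.19, 0.40, 1.0) for s in conc]

vmax_fit, km_fit, n_fit = fit_hill(conc, rates)
print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.2f}, n = {n_fit:.1f}")
```

A grid search is slow and limited to the chosen resolution, but it has no starting-value sensitivity, which makes it a convenient check on the output of a gradient-based fitting routine.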

#### AUTHOR CONTRIBUTIONS

AS purified most proteins used in the manuscript, RA purified the Ub-GFP substrate. AS designed, performed, and analyzed the various experiments in this manuscript with input from RA and DS. Manuscript preparation was done by AS and DS. All authors reviewed the results and approved the final version of this manuscript.


#### FUNDING

This work was supported by NIH-R01GM107129 to DS and by F31GM115171 to AS.

#### ACKNOWLEDGMENTS

We thank the members of the Smith lab for helpful and valuable discussions, and the protein core at WVU for their services. We thank Andreas Matouschek and his lab for generously providing us with the Ub<sub>4</sub>(lin)-GFP35-His<sub>6</sub> plasmid.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Snoberger, Anderson and Smith. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Rubisco Activases: AAA+ Chaperones Adapted to Enzyme Repair

Javaid Y. Bhat † , Gabriel Thieulin-Pardo † , F. Ulrich Hartl and Manajit Hayer-Hartl\*

*Department of Cellular Biochemistry, Max-Planck-Institute of Biochemistry, Martinsried, Germany*

Ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco), the key enzyme of the Calvin-Benson-Bassham cycle of photosynthesis, requires conformational repair by Rubisco activase for efficient function. Rubisco mediates the fixation of atmospheric CO<sub>2</sub> by catalyzing the carboxylation of the five-carbon sugar ribulose-1,5-bisphosphate (RuBP). It is a remarkably inefficient enzyme, and efforts to increase crop yields by bioengineering Rubisco remain unsuccessful. This is due in part to the complex cellular machinery required for Rubisco biogenesis and metabolic maintenance. To function, Rubisco must undergo an activation process that involves carboxylation of an active site lysine by a non-substrate CO<sub>2</sub> molecule and binding of a Mg<sup>2+</sup> ion. Premature binding of the substrate RuBP results in an inactive enzyme. Moreover, Rubisco can also be inhibited by a range of sugar phosphates, some of which are "misfire" products of its multistep catalytic reaction. The release of the inhibitory sugar molecule is mediated by the AAA+ protein Rubisco activase (Rca), which couples hydrolysis of ATP to the structural remodeling of Rubisco. Rca enzymes are found in the vast majority of photosynthetic organisms, from bacteria to higher plants. They share a canonical AAA+ domain architecture and form six-membered ring complexes but are diverse in sequence and mechanism, suggesting their convergent evolution. In this review, we discuss recent advances in understanding the structure and function of this important group of client-specific AAA+ proteins.

Keywords: Rubisco, Rubisco activase, AAA+ protein, CO<sub>2</sub> fixation, photosynthesis

#### Edited by:

*Walid A. Houry, University of Toronto, Canada*

#### Reviewed by:

*Rebekka Wachter, Arizona State University, USA Veena Prahlad, University of Iowa, USA*

> \*Correspondence: *Manajit Hayer-Hartl mhartl@biochem.mpg.de*

*† These authors have contributed equally to this work.*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

> Received: *10 February 2017* Accepted: *23 March 2017* Published: *10 April 2017*

#### Citation:

*Bhat JY, Thieulin-Pardo G, Hartl FU and Hayer-Hartl M (2017) Rubisco Activases: AAA*+ *Chaperones Adapted to Enzyme Repair. Front. Mol. Biosci. 4:20. doi: 10.3389/fmolb.2017.00020*

# INTRODUCTION

Ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) is the central enzyme of the Calvin-Benson-Bassham (CBB) cycle of photosynthesis (**Figure 1A**). Rubisco catalyzes the carboxylation of one molecule of ribulose-1,5-bisphosphate (RuBP) and produces two molecules of 3-phosphoglycerate (3PG), which are then used for the synthesis of sugars, starch, amino acids, and fatty acids (Miziorko and Lorimer, 1983). As such, Rubisco is responsible for the overwhelming majority of carbon fixation by photoautotrophic organisms in the oceans and on land (Field et al., 1998). However, the specificity of Rubisco for CO<sub>2</sub> is limited and the enzyme can also use oxygen as a substrate (Whitney et al., 2011). In this reaction, referred to as photorespiration, Rubisco catalyzes the oxygenation of RuBP, producing only one molecule of 3PG and one molecule of the toxic by-product 2-phosphoglycolate (2P-glycolate) (**Figure 1A**). 2P-glycolate must then be recycled into 3PG through an ATP-dependent mitochondrial-peroxisomal pathway with the loss of CO<sub>2</sub>. Photorespiration has long been regarded as a wasteful process, but recent advances suggest that it might play a crucial role in other aspects of plant life, including nitrate assimilation (Bloom, 2015; Hagemann and Bauwe, 2016; Walker et al., 2016). Moreover, Rubisco is a notoriously inefficient enzyme, with a very slow turnover, fixing at best only 10 CO<sub>2</sub> molecules per second (Feller et al., 2008). As a consequence of its shortcomings, Rubisco amounts to ∼50% of protein in plant leaves and is considered one of the most abundant proteins in nature (Ellis, 1979).

The most common form of Rubisco, form I, found in plants, algae, cyanobacteria, and proteobacteria, is a ∼550 kDa complex composed of eight large (RbcL, ∼50–55 kDa) and eight small subunits (RbcS, ∼15–20 kDa). The RbcL subunits are arranged as a toroid of antiparallel dimers that is capped at both ends by four RbcS subunits (Andersson and Backlund, 2008) (**Figure 1B**). To reach catalytic competence, one active site lysine of Rubisco (Lys201 using the Nicotiana tabacum nomenclature) must first be carboxylated by a non-substrate CO<sub>2</sub> molecule, followed by the binding of a Mg<sup>2+</sup> ion (Cleland et al., 1998). This process is called carbamylation and serves to position the substrate RuBP for efficient electrophilic attack by the second CO<sub>2</sub> molecule that will be fixed in the CBB cycle (Andersson, 2008). Upon RuBP binding, the active site is closed via two sequential conformational changes in RbcL: Loop 6 in the C-terminal domain of RbcL extends over the bound RuBP, trapping it below; the C-terminal tail of RbcL then stretches across the subunit and pins down loop 6, closing the active site (Bracher et al., 2017) (**Figure 1C**). Carbamylation of the apo form of the enzyme ("E") to active Rubisco ("ECM") is spontaneous (**Figure 2A**), but can only occur when the active site is in the open conformation.
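As a quick consistency check on the subunit stoichiometry given above, the mass range implied by eight RbcL and eight RbcS subunits can be computed directly. The sketch below uses only the approximate subunit masses quoted in the text; the function and variable names are illustrative.

```python
# Sketch: mass range of the form I Rubisco L8S8 complex from the
# approximate subunit masses quoted in the text (in kDa).


def mass_range_kda(subunits):
    """Sum (count, min_kda, max_kda) tuples into a (low, high) mass range."""
    low = sum(count * min_kda for count, min_kda, _ in subunits)
    high = sum(count * max_kda for count, _, max_kda in subunits)
    return low, high


FORM_I_SUBUNITS = [
    (8, 50, 55),  # RbcL: eight large subunits, ~50-55 kDa each
    (8, 15, 20),  # RbcS: eight small subunits, ~15-20 kDa each
]

if __name__ == "__main__":
    print(mass_range_kda(FORM_I_SUBUNITS))
```

The resulting range of 520–600 kDa brackets the ∼550 kDa quoted for the assembled complex.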

Premature binding of RuBP to the apo form leads to the formation of a closed, inhibited enzyme ("EI"), in which the bound RuBP is unable to react with either CO<sub>2</sub> or O<sub>2</sub>. Spontaneous decarbamylation followed by RuBP binding may occur during ongoing photosynthesis, also leading to loss of enzyme activity ("fallover") (Zhu and Jensen, 1991). Moreover, Rubisco is inhibited by so-called misfire by-products, such as xylulose-1,5-bisphosphate (XuBP) and 2,3-pentodiulose-1,5-bisphosphate (PDBP), which are generated at a low frequency during the multistep catalytic reaction (Parry et al., 2008) (**Figure 2A**). Likewise, the inhibitor 2-carboxy-D-arabinitol-1-phosphate (CA1P), which is synthesized by some plants under low light conditions (also referred to as "night-time" inhibitor), inactivates the active form of Rubisco (Parry et al., 2008; Andralojc et al., 2012) (**Figure 2A**). In all these cases the closed, inhibited Rubisco ("EI" or "ECMI") reactivates only slowly, limited by the spontaneous rate of opening of the active site (**Figure 1C**).

FIGURE 2 | Rubisco regulation by Rca. (A) Regulation of Rubisco activity and inhibition by sugar phosphates. E, the non-carbamylated enzyme; ECM, the carbamylated and Mg<sup>2+</sup> ion-bound enzyme; EI, the sugar phosphate inhibited E form; ECMI, the inhibited ECM form; Rca, Rubisco activase. Figure reproduced from reference Bracher et al. (2017). (B) Phylogenetic tree of selected Rubisco RbcL sequences. The green-type enzymes encompass form IA and IB, and the red-type enzymes form IC and ID. The RbcL C-terminal sequences and their associated Rca's are indicated. X represents variable residues. Rca's from species indicated in bold have been characterized biochemically and/or structurally and are described in this review. The phylogenetic tree was calculated by multiple sequence alignment using T-Coffee (Notredame et al., 2000) and the diagram was generated by the software Dendroscope (Huson and Scornavacca, 2012). Form IA (prokaryote): *M. purpuratum, Marichromatium purpuratum; H. marinus, Hydrogenovibrio marinus; T. crunogena, Thiomicrospira crunogena; H. neapolitanus, Halothiobacillus neapolitanus; N. winogradskyi, Nitrobacter winogradskyi; N. europaea, Nitrosomonas europaea; T. denitrificans, Thiobacillus denitrificans; A. ferrooxidans, Acidithiobacillus ferrooxidans; A. vinosum, Allochromatium vinosum; T. marina, Thiocapsa marina; T. mobilis, Thioflavicoccus mobilis; T. intermedia, Thiomonas intermedia.* Form IB (eukaryote): *Z. mays, Zea mays; T. aestivum, Triticum aestivum; O. sativa, Oryza sativa; S. oleracea, Spinacia oleracea; P. vulgaris, Phaseolus vulgaris; G. hirsutum, Gossypium hirsutum; N. tabacum, Nicotiana tabacum; B. oleracea, Brassica oleracea; A. thaliana, Arabidopsis thaliana.* Form IB (prokaryote): *C. reinhardtii, Chlamydomonas reinhardtii; Syn. PCC7502, Synechococcus sp. PCC 7502; F. contorta, Fortiea contorta; N. punctiforme, Nostoc punctiforme; C. stagnale, Cylindrospermum stagnale; Syn. PCC6803, Synechocystis PCC6803; Syn. PCC7002, Synechococcus PCC7002; Syn. PCC6301, Synechococcus PCC6301.* Form ID (eukaryote): *D. baltica, Durinskia baltica; O. sinensis, Odontella sinensis; T. oceanica, Thalassiosira oceanica; T. pseudonana, Thalassiosira pseudonana; G. partita, Galdieria partita; G. sulphuraria, Galdieria sulphuraria; P. purpurea, Porphyra purpurea; G. monilis, Griffithsia monilis; C. merolae, Cyanidioschyzon merolae.* Form IC (prokaryote): *X. flavus, Xanthobacter flavus; R. pickettii, Ralstonia pickettii; R. eutropha, Ralstonia eutropha; A. methanolica, Acidomonas methanolica; R. sphaeroides, Rhodobacter sphaeroides.*
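The four Rubisco states named in the text and in the Figure 2A legend (E, ECM, EI, ECMI) can be summarized as a small state-transition table. The sketch below is one reading of the scheme, not a kinetic model: the event labels are hypothetical names introduced here, and the assumption that Rca returns EI to E and ECMI to ECM is inferred from the description of Rca-mediated inhibitor release.

```python
# Sketch: Rubisco activation/inhibition states as a transition table.
# Event names are illustrative labels, not terms from the source.

TRANSITIONS = {
    ("E", "carbamylation"): "ECM",        # non-substrate CO2 plus Mg2+ bind
    ("E", "RuBP_binding"): "EI",          # premature RuBP binding closes the site
    ("ECM", "decarbamylation"): "E",      # spontaneous loss of the carbamate
    ("ECM", "inhibitor_binding"): "ECMI", # e.g. XuBP, PDBP or CA1P
    ("EI", "Rca"): "E",                   # assumed: Rca releases bound RuBP
    ("ECMI", "Rca"): "ECM",               # assumed: Rca releases the inhibitor
}


def apply_events(state, events):
    """Follow a sequence of events from a starting state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    # "Fallover" followed by repair: ECM loses its carbamate, binds RuBP,
    # is reopened by Rca, and is carbamylated again.
    print(apply_events("ECM", ["decarbamylation", "RuBP_binding", "Rca", "carbamylation"]))
```

Carbamylation is deliberately absent for EI, reflecting the statement that it can only occur when the active site is open.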

Release of inhibitor from inactive Rubisco at a biologically relevant timescale is made possible through intervention by Rubisco activase (Rca) (**Figure 2A**). Rca enzymes belong to the AAA+ protein superfamily (Neuwald et al., 1999) and use ATP-driven conformational changes to remodel Rubisco, thereby facilitating the release of the inhibitory sugar phosphates (Portis, 2003; Portis et al., 2008). Since the discovery, in the early 1980s, of the first Rca in a photosynthesis mutant of Arabidopsis thaliana (Portis and Salvucci, 2002), Rca enzymes have been identified in many photosynthetic organisms containing either green-type or red-type Rubiscos, from chemoautotrophic bacteria to higher plants (Mueller-Cajar et al., 2011; Sutter et al., 2015; Tsai et al., 2015; Loganathan et al., 2016) (**Figure 2B**). Although displaying considerable sequence variability, all Rca's share the core subunit architecture of AAA+ proteins, consisting of an N-terminal nucleotide binding domain with an α/β Rossmann fold and a C-terminal α-helical domain (Hanson and Whiteheart, 2005; Snider et al., 2008; Wendler et al., 2012). Like most AAA+ proteins, the Rca enzymes function as hexameric donut-shaped rings, with their central pore implicated in threading specific peptides of Rubisco (Hauser et al., 2015; Bracher et al., 2017).

In this review, we will discuss recent advances in understanding the structure and mechanism of Rca's from the red and green lineages of photosynthetic organisms. The diversity of these enzymes provides a fascinating example of convergent evolution, and reflects the constraints under which Rca's and their cognate Rubisco substrates may have co-evolved.

### RUBISCO ACTIVASE OF RED-TYPE RUBISCO FORM IC AND ID

Rca has been known since the 1980s (Portis and Salvucci, 2002) but was assumed to be restricted to plants. The first prokaryotic Rca was only recently discovered in the proteobacterium Rhodobacter sphaeroides, which contains the red-type Rubisco form IC (Mueller-Cajar et al., 2011) (**Figure 2B**). RsRca is encoded by the cbbX gene located immediately downstream of the rbcL and rbcS genes (Gibson and Tabita, 1997). Inactivation of cbbX in R. sphaeroides resulted in impaired photoautotrophic growth at low CO<sub>2</sub> levels. The structural and functional analysis of RsRca provided critical insights into the mechanism of Rubisco remodeling. The RsRca subunit (∼35 kDa) is composed of the AAA+ core module with a compact α-helical extension at the N-terminus (Mueller-Cajar et al., 2011) (**Figures 3A,B**). The two subdomains of the core module are separated by a short flexible linker. The α/β subdomain harbors the characteristic Walker A and B nucleotide binding motifs (Mueller-Cajar et al., 2011; Bracher et al., 2017).

The active hexameric complex of RsRca forms only in the presence of ATP and RuBP, the substrate of its target enzyme Rubisco. The RuBP binding site is located in the α-helical subdomain at the bottom of the hexamer (**Figures 3B,C**). The hexamer exhibits a ∼25 Å wide central channel lined by "canonical" pore loop residues (Tyr/Ile/Gly) (Mueller-Cajar et al., 2011). In the absence of RuBP, RsRca forms spiral-shaped high molecular weight assemblies that are largely ATPase inactive and may represent a storage form when the organism is not photosynthetically active (Mueller-Cajar et al., 2011) (**Figure 3D**). Thus, the generation of RuBP during photosynthesis would induce the conversion of this storage form into functional hexamers (**Figure 3D**). Biochemical and mutational analysis showed that remodeling of Rubisco depends on the canonical pore loops and the conserved top surface of the hexamer (Mueller-Cajar et al., 2011). Moreover, reactivation of R. sphaeroides Rubisco required the intact C-terminal sequence of RbcL, which is extended in red-type Rubiscos by ∼5–10 residues relative to green-type RbcL. Binding to inhibited Rubisco stimulates the ATPase activity of RsRca ∼4-fold (Mueller-Cajar et al., 2011), in a manner dependent on both the RbcL C-terminus and the top surface of the RsRca hexamer. These findings suggest that RsRca docks onto Rubisco with its top surface and that the pore loops transiently pull the C-terminal tail of RbcL into the central pore, facilitating opening of the active site pocket and release of the inhibitory sugar phosphate (**Figure 3E**). This mechanism resembles the threading of ssrA-tagged proteins through the central pore of the bacterial ClpX for degradation by the ClpP protease (Olivares et al., 2016).

Interestingly, the red alga Cyanidioschyzon merolae, containing Rubisco form ID (**Figure 2B**), has two cbbX genes, one nuclear-encoded and one plastid-encoded (Loganathan et al., 2016). It was recently shown that the functional CmRca is a 1:1 hetero-hexamer between nuclear- and plastid-encoded subunits (Loganathan et al., 2016). Both of these Rca subunits share 60–70% identity with RsRca. In the case of CmRca, RuBP acts as an allosteric regulator for modulation of the ATPase activity but is not required for hexamer formation (Loganathan et al., 2016). In both the red-type prokaryotic and eukaryotic Rca enzymes, RuBP regulation of the ATPase activity provides a link between the functional state of the CBB cycle and Rubisco activity.

# PROKARYOTIC RUBISCO ACTIVASE OF GREEN-TYPE RUBISCO FORM IA

The most recent additions to the family of activases are the products of the cbbQ/cbbO genes from the chemoautotrophic bacteria Acidithiobacillus ferrooxidans and Halothiobacillus neapolitanus, which contain the green-type Rubisco form IA (Sutter et al., 2015; Tsai et al., 2015) (**Figure 2B**). These genes are generally associated with the Rubisco operon, with the cbbQ gene encoding the ∼30 kDa AAA+ subunits and the cbbO gene a Rubisco adaptor protein of ∼82–88 kDa. Structural and biochemical characterization showed that these proteins function as bipartite complexes consisting of the hexameric CbbQ activase (AfRcaI; HnRca) with CbbO as a co-factor (Sutter et al., 2015; Tsai et al., 2015) (**Figure 4A**). The α/β subdomains of AfRcaI and HnRca belong to the MoxR group of prokaryotic AAA+ proteins (**Figures 4B,C**), which often cooperate with proteins that contain the von Willebrand factor A (VWA) domain (Wong and Houry, 2012). Indeed, CbbO has a VWA domain with a typical metal-ion-dependent adhesion site (MIDAS), a motif usually involved in protein-protein interactions via a cation (generally Mg<sup>2+</sup>) (Whittaker and Hynes, 2002) (**Figure 4A**). Mutagenesis showed that the MIDAS motif interacts with aspartate 82 of the RbcL subunit of A. ferrooxidans (Tsai et al., 2015) (**Figure 4D**). Similar to the synergistic ATPase activation of RsRca and CmRca by RuBP and the inhibited Rubisco (Mueller-Cajar et al., 2011; Loganathan et al., 2016), the ATPase activity of AfRcaI is stimulated by the binding of both CbbO and the inhibited Rubisco (Tsai et al., 2015). This suggests that a two-step conformational change in the activase hexamer leads to optimal ATPase activity for Rubisco reactivation.

Furthermore, deletion or alanine substitution of the last two residues of the C-terminal tail of form IA RbcL resulted in loss of AfRcaI/CbbOI-mediated reactivation of inhibited Rubisco (Tsai et al., 2015). This suggests that the interaction of AfRcaI with the RbcL C-terminus is functionally critical, similar to the mechanism of red-type Rca described above. However, AfRcaI and HnRca do not have the canonical pore loop residues known to be involved in threading of flexible sequences into the central pore (Hanson and Whiteheart, 2005; Olivares et al., 2016). Accordingly, mutating these residues did not result in loss of function (Tsai et al., 2015). In the current model, CbbO acts as an adapter between the activase and Rubisco. Whether and how a pulling force is involved in remodeling remains to be investigated.

Interestingly, A. ferrooxidans also contains a form II Rubisco operon associated with a second pair of cbbQ2/cbbO2 genes (Tsai et al., 2015). The well-characterized form II Rubisco of the α-proteobacterium Rhodospirillum rubrum is a dimer of only RbcL subunits and is Rca-independent (Jordan and Chollet, 1983; Pearce, 2006). The form II Rubisco of A. ferrooxidans is a trimer of RbcL<sub>2</sub> units that can undergo inhibition by tightly binding sugar phosphates (Tsai et al., 2015). Reactivation requires the interaction with AfRcaII/CbbOII (Tsai et al., 2015), providing the first evidence for an Rca-dependent form II Rubisco.

# EUKARYOTIC RUBISCO ACTIVASE OF GREEN-TYPE RUBISCO FORM IB

Almost three decades after the discovery of Rca in A. thaliana (Portis and Salvucci, 2002; Portis, 2003), the first crystal structures of Rca for eukaryotic green-type Rubisco form IB from N. tabacum (Stotz et al., 2011), Larrea tridentata (Henderson et al., 2011), and A. thaliana (Hasse et al., 2015) were solved. The sequences of these activases are longer than those of the Rca enzymes described above. In addition to the AAA+ core module, they feature a small domain at the N-terminus (N-domain) and a C-terminal extension, not resolved in the crystal structures (**Figures 5A,B**). The N-domain is required for targeting Rca to Rubisco (Esau et al., 1996; van de Loo and Salvucci, 1996; Stotz et al., 2011). It cooperates with a short helix (H9) in the α-helical subdomain of the AAA+ module, referred to as the specificity helix (Li et al., 2005; Stotz et al., 2011) (**Figures 5B,D**). In N. tabacum, helix H9 interacts with residues arginine 89 and lysine 94 of RbcL (N. tabacum numbering) located in the equatorial region of the Rubisco complex and allows Rca to distinguish between solanaceous and non-solanaceous Rubisco (Portis et al., 2008; Wachter et al., 2013) (**Figure 5D**). The C-terminal extension is critical for the constitutive ATPase activity, and mutation of tyrosine 361 results in loss of the ATPase and activase function (Stotz et al., 2011). Higher plants, including A. thaliana, rice, barley, maize and cotton, express two quasi-identical Rca isoforms, α and β, with the α-isoform possessing a slightly longer C-terminal extension (Portis et al., 2008). The isoforms are either expressed from separate genes or result from alternative splicing. The long C-terminal extension of the α-isoform contains two cysteine residues that can undergo F-type thioredoxin-dependent reversible oxidation (Zhang and Portis, 1999). Under oxidizing conditions, generally at night in the absence of photosynthesis, disulphide bond formation in the C-terminal extension inhibits ATP binding and thus Rubisco activation (Shen and Ogren, 1992; Zhang and Portis, 1999; Zhang et al., 2001, 2002; Portis, 2003; Wang and Portis, 2006; Portis et al., 2008; Carmo-Silva and Salvucci, 2013; Gontero and Salvucci, 2014).

Plant Rca enzymes have been reported to populate a range of dynamic oligomeric states in vitro, but are active as hexamers, as shown for the Rca enzymes of N. tabacum and S. oleracea (Blayney et al., 2011; Stotz et al., 2011; Keown and Pearce, 2014) (**Figure 5C**). Analysis of the NtRca by electron microscopy revealed the position of the N-domains at the top of the hexamer (Stotz et al., 2011). In the crystal structure of AtRca the N-domain was disordered (Hasse et al., 2015). Stable hexamers of NtRca were generated by mutation of arginine 294 to valine at the interface between adjacent subunits. Hexamers formed with ATP but not ADP and were functionally active (Stotz et al., 2011). In the case of cotton Rca, hexamer formation was also observed with ADP, but was less efficient than with ATP (Kuriata et al., 2014). Indeed, plant activases have been described to be sensitive to the ATP:ADP ratio (Portis et al., 2008; Carmo-Silva and Salvucci, 2013; Thieulin-Pardo et al., 2015). Such a regulation would ensure that Rca functions in a light- and redox-dependent (for the α-isoform) manner (Portis et al., 2008). Rca may also be functionally regulated by fluctuating Mg<sup>2+</sup> concentrations in response to changes in available light, based on the finding that high Mg<sup>2+</sup> caused an ∼8-fold increase in catalytic activity of NtRca (Hazra et al., 2015).

The central pore of NtRca has a diameter of ∼36 Å, wider than that of the Rca's described above (Mueller-Cajar et al., 2011; Stotz et al., 2011; Hasse et al., 2015; Sutter et al., 2015; Tsai et al., 2015) (**Figures 3**–**5**). NtRca and AtRca do not contain the canonical pore loop motif (aromatic-hydrophobic-glycine). Instead, three conserved loop segments face the central solvent channel and mutational analysis of NtRca implicates all of them in Rubisco remodeling (Stotz et al., 2011). This is similar to findings with the microtubule-severing AAA+ protein spastin (Roll-Mecak and Vale, 2008). Based on the currently available structural and biochemical data, NtRca recognizes the inhibited Rubisco via the N-domain, with species specificity being imparted by helix H9. Notably, the RbcL of the green-type Rubisco form IB lacks the extended C-terminus that is required for the remodeling of red-type Rubisco. Thus, the exact mechanism of remodeling of plant Rubisco remains to be established.

#### CONVERGENT EVOLUTION OF RUBISCO ACTIVASE ENZYMES

It is believed that Rubisco-mediated CO<sub>2</sub> fixation evolved ∼3.5 billion years ago under non-oxygenic conditions (Nisbet et al., 2007). The evolution of cyanobacteria ∼2.5 billion years ago triggered the shift to an oxygenic atmosphere (Whitney et al., 2011). During this process Rubisco also evolved into multiple enzymatic forms with a range of kinetic properties (Tcherkez et al., 2006; Badger and Bek, 2008; Sharwood et al., 2016; Young et al., 2016). Some Rubiscos apparently acquired mutations that led to tighter binding of RuBP and inhibitory sugar phosphates in the active site, necessitating the repair function by Rca. Notably, no sugar phosphate inhibition has been shown for cyanobacterial Rubiscos, although cyanobacteria contain genes encoding Rca-like proteins (Li et al., 1993), which are required for normal cell growth and Rubisco activity (Li et al., 1999). Interestingly, these proteins contain a C-terminal RbcS-like domain, which may mediate binding to Rubisco.

Recent studies have shown Rca's to exist also in prokaryotic and other eukaryotic organisms containing Rubiscos of form IA, IC, and ID (**Figure 2B**). The divergence in primary sequence of these proteins from different organisms strongly suggests that a process of convergent evolution underlies the use of the common AAA+ module in the Rubisco repair mechanism. Clearly, Rubiscos have co-evolved with their cognate activases, as exemplified by the C-terminal extension in red-type RbcL or the specific surface residues of solanaceous and non-solanaceous RbcL proteins that are recognized by their cognate activases (Wachter et al., 2013) (**Figure 2B**).

# CONCLUDING REMARKS

Based on recent insights into the structural and functional diversity of Rubisco activases, these proteins represent an important paradigm for understanding how the AAA+ module can be adapted to the repair of a specific enzyme. Despite major progress, the exact mechanisms of remodeling are not yet understood. Which conformational changes does Rubisco undergo during reactivation? Are these effects limited to the active site pocket or are they more global? How does Rca distinguish between inhibited and active Rubisco? How is Rubisco remodeling reflected in the allostery of ATP binding and hydrolysis of the Rca subunits? Increasingly sophisticated biophysical techniques, such as hydrogen/deuterium exchange analysis and high resolution cryo-electron microscopy, should be brought to bear on these questions. Elucidating the mechanism of the plant Rca will be of special importance in the context of efforts to improve Rubisco carboxylation efficiency in crop plants (Whitney et al., 2011; Bracher et al., 2017). Engineering Rca itself may be a possible strategy, given its inherent thermal instability (Sage et al., 2008; Parry et al., 2013; Carmo-Silva et al., 2015). More likely, Rubisco and Rca may have to be co-engineered, mimicking the process that occurred during natural evolution.

1EJ7, Duff et al., 2000) is shown in surface representation with the RbcL and RbcS subunits in different shades of green. The RbcL C-termini are shown as green lines.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.


#### ACKNOWLEDGMENTS

We thank A. Bracher for his help in preparing figures and for critically reading the manuscript. MHH acknowledges funding by the Minerva Foundation of the Max Planck Gesellschaft and a grant of the Deutsche Forschungsgemeinschaft (SFB1035) to MHH and FUH.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Bhat, Thieulin-Pardo, Hartl and Hayer-Hartl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Diverse AAA+ Machines that Repair Inhibited Rubisco Active Sites

Oliver Mueller-Cajar\*

*School of Biological Sciences, Nanyang Technological University, Singapore, Singapore*

Gaseous carbon dioxide enters the biosphere almost exclusively via the active site of the enzyme ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco). This highly conserved catalyst has an almost universal propensity to non-productively interact with its substrate ribulose 1,5-bisphosphate, leading to the formation of dead-end inhibited complexes. In diverse autotrophic organisms this tendency has been counteracted by the recruitment of dedicated AAA+ (ATPases associated with various cellular activities) proteins that all use the energy of ATP hydrolysis to remodel inhibited Rubisco active sites, leading to release of the inhibitor. Three evolutionarily distinct classes of these Rubisco activases (Rcas) have been discovered so far. Green- and red-type Rca are mostly found in photosynthetic eukaryotes of the green and red plastid lineage respectively, whereas CbbQO is associated with chemoautotrophic bacteria. Ongoing mechanistic studies are elucidating how the various motors are utilizing both similar and contrasting strategies to ultimately perform their common function of cracking the inhibited Rubisco active site. The best studied mechanism utilized by red-type Rca appears to involve transient threading of the Rubisco large subunit C-terminal peptide, reminiscent of the action performed by Clp proteases. As well as providing a fascinating example of convergent molecular evolution, Rca proteins can be considered promising crop-improvement targets. Approaches aiming to replace Rubisco in plants with improved enzymes will need to ensure the presence of a compatible Rca protein. The thermolability of the Rca protein found in crop plants provides an opportunity to fortify photosynthesis against high temperature stress. Photosynthesis also appears to be limited by Rca when light conditions are fluctuating. Synthetic biology strategies aiming to enhance the autotrophic CO<sub>2</sub> fixation machinery will need to take into consideration the requirement for Rubisco activases as well as their properties.

#### Edited by:

*Walid A. Houry, University of Toronto, Canada*

#### Reviewed by:

*Robert Edward Sharwood, Australian National University, Australia Carlos H. Ramos, Universidade Estadual de Campinas, Brazil*

\*Correspondence:

*Oliver Mueller-Cajar cajar@ntu.edu.sg*

#### Specialty section:

*This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences*

> Received: *17 March 2017* Accepted: *29 April 2017* Published: *19 May 2017*

#### Citation:

*Mueller-Cajar O (2017) The Diverse AAA*+ *Machines that Repair Inhibited Rubisco Active Sites. Front. Mol. Biosci. 4:31. doi: 10.3389/fmolb.2017.00031*

Keywords: Rubisco, activase, photosynthesis, AAA+ proteins, molecular chaperones, carbon fixation

# THE CURIOUS CASE OF RUBISCO

The vast majority of carbon dioxide entering the living world does so via the slow and non-specific enzyme ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco) (Spreitzer and Salvucci, 2002). The realization that this enzyme often represents the rate-limiting step of photosynthesis has made it a long-standing target for crop improvement strategies (Parry et al., 2007; Whitney et al., 2011a; Ort et al., 2015; Sharwood et al., 2016b). The peculiar properties of Rubisco can be understood as an accident of natural history. A highly complex reaction mechanism for ribulose 1,5-bisphosphate (RuBP) carboxylation evolved once in a high CO<sub>2</sub> atmosphere lacking O<sub>2</sub> (Andrews and Lorimer, 1987, **Figure 1A**). The unprecedented increase in atmospheric oxygen following the evolution of oxygenic photosynthesis increased the propensity of RuBP oxygenation, making it physiologically relevant (Andrews et al., 1973; Tcherkez, 2016). This resulted in massive metabolite damage (Linster et al., 2013) in the form of a build-up of 2-phosphoglycolate, which in contemporary plants is repaired by photorespiration (Bauwe et al., 2010). In C3 plants exposed to the current atmospheric environment, photorespiration operates at ∼20% of photosynthesis (Cegelski and Schaefer, 2006), making it the second highest flux pathway. Operation of the photorespiratory pathway is energetically wasteful, resulting in a high selection pressure to reduce its flux. However, Rubisco's extensive adaptive walks through sequence space were not rewarded by the discovery of catalytic solutions that eliminated oxygenation (Maynard Smith, 1970; Mueller-Cajar and Whitney, 2008a). Instead it appeared easier to evolve a myriad of diverse syndromes that concentrate CO<sub>2</sub> at the active site of the carboxylase (Badger et al., 1998; Rae et al., 2013; Sage, 2013). However, all of these mechanisms involve active transport, and thus increase the metabolic cost per CO<sub>2</sub> fixed. Therefore, there was a concomitant pressure to enhance the catalytic fidelity of the enzyme by increasing its CO<sub>2</sub>/O<sub>2</sub> specificity, as manifested most strongly in C3 plants and red algae (Tcherkez et al., 2006).
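For readers less familiar with the term, the CO<sub>2</sub>/O<sub>2</sub> specificity is conventionally expressed as a specificity factor. The relation below is the standard textbook definition rather than anything derived in this article; the symbols (maximal velocities V<sub>c</sub> and V<sub>o</sub>, Michaelis constants K<sub>c</sub> and K<sub>o</sub>, and reaction rates v<sub>c</sub> and v<sub>o</sub> for carboxylation and oxygenation) are introduced here for illustration only.

```latex
S_{c/o} \;=\; \frac{V_c / K_c}{V_o / K_o},
\qquad
\frac{v_c}{v_o} \;=\; S_{c/o}\,\frac{[\mathrm{CO_2}]}{[\mathrm{O_2}]}
```

Read this way, the two evolutionary responses described above act on different terms: CO<sub>2</sub>-concentrating mechanisms raise the [CO<sub>2</sub>]/[O<sub>2</sub>] ratio at the active site, whereas enhanced catalytic fidelity raises the specificity factor itself.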

# EVOLUTION OF HIGHER CATALYTIC FIDELITY BY RIGIDIFICATION OF THE ACTIVE SITE

Catalysis by all Rubiscos requires two cofactors to bind at the active site permitting the functional holoenzyme to form (**Figure 1B**). A non-substrate CO<sub>2</sub> reacts with the amine group of the conserved Lys-201 residue (spinach RbcL numbering) to form a carbamate. A Mg<sup>2+</sup> ion is then bound to complete the activation process, forming the holoenzyme termed ECM (Lorimer et al., 1976; Cleland et al., 1998). The activated enzyme then binds the substrate RuBP, which is processed via a series of five partial reactions to eventually yield two molecules of 3-phosphoglycerate (3-PG) if carboxylated (Tcherkez, 2013, **Figure 1A**). The similarity in size and electrostatic potential of the gases CO<sub>2</sub> and O<sub>2</sub> (Kannappan and Gready, 2008) has culminated in a situation where the enzyme is unable to perfectly discriminate between the carboxylation substrate CO<sub>2</sub> and the competing O<sub>2</sub>. The critical step at which the enzyme can influence the partitioning between carboxylation and oxygenation is during attack of the gaseous substrate by the enolized RuBP (Chen and Spreitzer, 1992). An analysis of decades of kinetic and isotope-fractionation data suggested that this task is achieved by a relative stabilization of the transition state for CO<sub>2</sub>, compared to O<sub>2</sub> addition (Tcherkez et al., 2006). This stabilization manifests itself in both reduced flexibility of the active site and tighter binding of the carboxylated product (Pearce and Andrews, 2003). A well-documented outcome of this strategy is the trade-off where faster enzymes tend to exhibit higher Michaelis constants for CO<sub>2</sub> and are less able to discriminate between CO<sub>2</sub> and O<sub>2</sub> (Bainbridge et al., 1995; Tcherkez et al., 2006; Savir et al., 2010). However, it is important to note that new Rubisco kinetic data is highlighting exceptions to these rules, at least regarding some algal enzymes exhibiting relatively low carboxylase efficiencies (Young et al., 2016).

### THE EMERGING REQUIREMENT FOR CATALYTIC CHAPERONES

A consequence of the described strategy, which tends to be less widely appreciated, relates to the tendency of the enzyme to become irreversibly inhibited by sugar phosphates. Since the unactivated apo-enzyme (E) already possesses all of the features required to bind the substrate RuBP, the active site will close when it encounters the substrate (Jordan et al., 1983; Duff et al., 2000). In the absence of the co-factors required to catalyze carboxylation or oxygenation, RuBP cannot be processed and is now bound unproductively, or "caught in the Rubisco mousetrap" (Andrews, 1996), to form Enzyme-RuBP (ER) (**Figure 1C**). At the same time, losing a valuable active site has reduced the capacity for carbon fixation of the host organism. RuBP is not the only inhibitory substrate; a palette of other sugar phosphates, including some generated by misfire-reactions of Rubisco itself, also bind tightly to the active site (Parry et al., 2008; Andralojc et al., 2012; Bracher et al., 2015). The affinity of the inhibitors is correlated with the enzyme's catalytic parameters and, based on the data available, "superior" high specificity Rubiscos bind RuBP and other sugar phosphates more tightly than the low specificity enzymes with more flexible active sites (Pearce and Andrews, 2003; Pearce, 2006).

Over time, as Rubisco active sites became more and more adept at tightly binding the carboxylation-intermediate, the propensity for the apo-enzyme to bind the substrate nonproductively also increased (Pearce and Andrews, 2003). This led to a temporary removal of significant proportions of active sites from the pool of the enzyme. This problem could be alleviated by the action of molecular chaperones that would selectively engage inhibited Rubisco and, by performing a "chiropractic" maneuver (Carmo-Silva and Salvucci, 2011), conformationally reset the active site.

Earlier articles have comprehensively reviewed our knowledge on biochemical and physiological aspects of both the green-type (Portis, 1995, 2003; Portis et al., 2008; Carmo-Silva et al., 2015) and the red-type activase (Mueller-Cajar et al., 2014; Hauser et al., 2015b). Here we aim to direct attention toward the recent realization that in different autotrophic lineages multiple activase classes have converged on the same biochemical function. We attempt to integrate our understanding regarding mechanistic similarities and differences toward a framework regarding the chaperone-mediated rearrangement of the highly conserved inhibited Rubisco active site.

# THE EVOLUTION OF RUBISCO AND THE THREE RCA CLASSES

In spite of the single phylogenetic origin and highly conserved reaction chemistry of Rubisco, a number of highly distinct clades of Rubisco can be observed today (Tabita et al., 2008). All Rubiscos are comprised of ∼55 kDa large subunits that assemble as anti-parallel dimers. Each dimer harbors two active sites formed by the β-barrel C-terminal domain of one subunit and the N-terminal domain (containing a 5-stranded mixed beta sheet) of the other (Knight et al., 1990). This basic functional unit is then often found to be assembled into higher oligomeric states.

#### FIGURE 1 | Continued

with a non-substrate CO<sub>2</sub> to form a carbamate (EC), followed by the binding of a Mg<sup>2+</sup> ion to form the catalytically competent holoenzyme ECM. (C) Both the inactive apo (E) and the active holoenzyme (ECM) are prone to dead-end inhibition by sugar phosphates such as RuBP, which binds to E and CA1P (2-carboxy-D-arabinitol 1-phosphate), which binds to ECM. Rubisco activases (Rca) recognize inhibited active sites and use the energy of ATP hydrolysis to cause a conformational change that releases the inhibitor.

**Figure 2** shows a phylogenetic tree of selected RbcL sequences relevant to the present discussion about Rca. The last common ancestor of all extant Rubiscos was probably the aforementioned dimer of large subunits, and this arrangement is still found in a subset of the so-called Form II enzymes, such as the well-studied enzyme from Rhodospirillum rubrum (Anderson and Fuller, 1969). Contemporary Form II enzymes are often found to occupy higher order oligomeric states with a hexameric arrangement recently found to be common (Satagopan et al., 2014; Tsai et al., 2015; Varaljay et al., 2016). A key early innovation in Rubisco evolution concerned the recruitment of the small subunit, a ∼15 kDa scaffolding protein that stabilized tetramers of dimers resulting in an L<sub>8</sub>S<sub>8</sub> stoichiometry. These enzymes constitute the Form I clade of Rubiscos (Spreitzer, 2003). This clade branched early into a red (Form IC and D) and green-type branch (Form IA and B), the large subunits of which today maintain about 50% sequence identity to each other. Form IA Rubiscos can be subdivided into Form IA<sup>Q</sup> and Form IA<sup>C</sup> sequences, the latter always being associated with carboxysomal gene clusters (Badger and Bek, 2008). It is interesting to note that the photosynthesizers dominating our planet's landmass, the higher plants, possess only a small slice of Rubisco's molecular diversity, all encoding a highly conserved Form IB enzyme derived from the ancestral cyanobacterial endosymbiont.
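The subunit arithmetic behind these oligomeric states can be made concrete with a short calculation. The sketch below is illustrative only: it uses the approximate subunit masses quoted in the text (∼55 kDa and ∼15 kDa) and the statement that each large-subunit dimer harbors two active sites. The function name and the returned dictionary keys are hypothetical, and real assemblies will deviate from these rounded masses.

```python
# Approximate subunit masses quoted in the text (kDa); rounded values only.
LARGE_SUBUNIT_KDA = 55.0  # RbcL, "~55 kDa"
SMALL_SUBUNIT_KDA = 15.0  # RbcS, "~15 kDa"


def oligomer_summary(n_large: int, n_small: int = 0) -> dict:
    """Return dimer count, active-site count and approximate mass of an assembly.

    Large subunits pair into anti-parallel dimers and each dimer carries
    two active sites, so n_large must be a positive even number.
    """
    if n_large <= 0 or n_large % 2 != 0:
        raise ValueError("n_large must be a positive even number (dimers)")
    if n_small < 0:
        raise ValueError("n_small cannot be negative")
    dimers = n_large // 2
    return {
        "dimers": dimers,
        "active_sites": 2 * dimers,
        "approx_mass_kda": n_large * LARGE_SUBUNIT_KDA
        + n_small * SMALL_SUBUNIT_KDA,
    }


if __name__ == "__main__":
    assemblies = {
        "Form II dimer (L2)": (2, 0),
        "Form II hexamer ((L2)3)": (6, 0),
        "Form I hexadecamer (L8S8)": (8, 8),
    }
    for name, (n_l, n_s) in assemblies.items():
        print(name, oligomer_summary(n_l, n_s))
```

Under these assumptions the Form I hexadecamer comes out at roughly 560 kDa with eight active sites, compared with roughly 110 kDa and two active sites for the ancestral-type dimer.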

Three distinct classes of Rubisco activase (green-, red-, and CbbQO-type) have now been identified (Salvucci et al., 1985; Mueller-Cajar et al., 2011; Tsai et al., 2015), permitting us to start dissecting the molecular underpinnings of how different organisms dealt with the outlined problem of blocked Rubisco active sites. The activases were recruited from highly distinct volumes of sequence space in the AAA+ protein universe (Ammelburg et al., 2006), and their AAA modules display less than 25% sequence identity between the groups. This vast and diverse group of molecular motors was clearly well suited for the task of active site rearrangement, as their unifying functional characteristic relates to conformationally remodeling macromolecular substrates using the energy of ATP hydrolysis (Hanson and Whiteheart, 2005; Sysoeva, 2016). The identified activases are not closely related to other well characterized extant molecular chaperones, which currently precludes the formulation of detailed hypotheses regarding their historical evolutionary trajectory.

Green-type Rcas represent the first discovered (Salvucci et al., 1985) and due to their presence in all higher plants, most extensively studied activase system (Portis, 2003; Carmo-Silva et al., 2015). They are evolutionarily derived from cyanobacteria, where homologs are found associated with carboxysomal green-type Form IB Rubisco (Li et al., 1993). Importantly, an experimental verification of the cyanobacterial activase's biochemical function is still elusive (Bracher et al., 2017). The distribution is not universal, but is associated with strains belonging to clade A and B1 according to the classification by Kerfeld and colleagues (Shih et al., 2013; Zarzycki et al., 2013). These are thought to form the sister group to the primary endosymbiont (Ochoa de Alda et al., 2014), which would indicate that Rca was transferred together with Form IB Rubisco during the primary endosymbiotic event about 1.5 billion years ago (Yoon et al., 2004).

On a structural level green-type Rcas show similarity to p97/CDC48 (Hasse et al., 2015) and classification of the C-terminal subdomain revealed a relationship to the D2 AAA+ module of N-ethylmaleimide-sensitive factor (NSF) (Ammelburg et al., 2006). Both of these belong to the classical clade of AAA proteins (Iyer et al., 2004). It is thus reasonable to conclude that specialization toward activase activity occurred using a general molecular chaperone in this clade in an ancient cyanobacterium as a starting point.

The gene encoding red-type Rca (also known as CbbX) is always found in an operon with the red-type (Form IC) Rubisco encoding genes in mixotrophic proteobacteria (Gibson and Tabita, 1997; Badger and Bek, 2008). It is also encountered in the chloroplast genomes of the red lineage (Oudot-Le Secq et al., 2007). A proposed explanation for this distribution involved horizontal gene transfer of the rbcL-rbcS-cbbX gene cluster from a proteobacterium to an ancestor of the primary endosymbiont (Delwiche and Palmer, 1996; Nisbet et al., 2004). Alternatively, horizontal gene transfer occurred subsequent to the endosymbiotic event in the ancestor of the red algae, which subsequently lost the green Form IB Rubisco genes (Maier et al., 2000; Rice and Palmer, 2006). Where sequence data exist, it appears eukaryotes possessing red-type Rubisco always encode an additional CbbX isoform in the nuclear or nucleomorph genome (Hovde et al., 2015), and this is thought to be a consequence of gene duplication and migration of one copy to the nuclear genome in an early rhodophyte (Fujita et al., 2008). In the red alga Cyanidioschyzon merolae, the functional red-type Rca has been shown to be a 1:1 hetero-oligomer of the plastid and the nuclear encoded isoform (Loganathan et al., 2016), and we expect this scenario to hold true for red lineage phytoplankton in general.

The closest structural neighbors of red-type Rca, as determined by a DALI search are the helicase RuvB and protease-associated motors such as HslU and ClpX (Hasse et al., 2015). HslU and ClpX are powerful unfoldases that generally thread substrate proteins marked for degradation through their axial pore of the hexamer into a proteolytic chamber (Sauer and Baker, 2011). However, recently more gentle conformational rearrangements have been documented for the mitochondrial ClpX. In this case ClpX acts on an enzyme involved in heme biosynthesis and catalyzes the insertion of a cofactor (Kardon et al., 2015). Hence it is conceivable that subtle "pulling" on enzymes to bring about conformational transitions that favor inhibitor release or co-factor insertion is not an unusual scenario (Olivares et al., 2016). It is therefore a reasonable hypothesis that red-type Rca evolved in proteobacteria from a general molecular chaperone using the axial pore threading mechanism that was either involved in correcting protein conformations or protein complex maturation (including co-factor insertion).

The genes encoding the CbbQO-type activase system (Hayashi et al., 1997, 1999; Sutter et al., 2015; Tsai et al., 2015) are broadly distributed among proteobacteria, but associate strongly with chemolithoautotrophs that use sulfur oxidation as energy source (Badger and Bek, 2008). CbbQ belongs to the large, but relatively poorly characterized MoxR group of AAA+ proteins, which is often found encoded in operons together with a second protein containing a von Willebrand Factor A (VWA) domain (Snider and Houry, 2006; Wong and Houry, 2012).

Different isoforms of the AAA+ protein CbbQ and the VWA-domain containing CbbO assemble as hetero-oligomeric complexes in a Q<sub>6</sub>O<sub>1</sub> stoichiometry (Sutter et al., 2015; Tsai et al., 2015). Two complexes encoded by Acidithiobacillus ferrooxidans activate phylogenetically remote Rubiscos (Q1O1 activates Form IA<sup>Q</sup> and Q2O2 activates Form II) that are encoded by the same genome (Tsai et al., 2015). In addition there is a third cbbQ-cbbO gene pair (termed Q3O3 in **Figure 2**) associated with a carboxysomal gene cluster, which contains genes encoding a Form IA<sup>C</sup> Rubisco (Heinhorst et al., 2002). The activase function of Q3O3, which is homologous to a complex recently purified and characterized for ATPase activity, has not yet been confirmed (Sutter et al., 2015). This work also pointed out that the presence of multiple Rubisco operons encoding different CbbQ and CbbO isoforms in the same organism is common. It is thus possible that the ancestor of the CbbQO complex became specialized for one Rubisco form, and then switched substrate following a gene duplication. Alternatively the ancestral CbbQO was a generalist Rca and already functional at remodeling both types of Rubisco. The feasibility to reconstruct ancestral proteins offers a tantalizing opportunity to illuminate these details experimentally (Shih et al., 2016).

Gene pairs highly homologous to CbbQ and CbbO that are not associated with Rubisco genes also exist in proteobacteria (Snider and Houry, 2006; Sutter et al., 2015). The genes encoding the AAA+ protein NirQ and VWA domain protein NorD are associated with denitrification gene clusters. In the absence of either NirQ or NorD, nitric oxide reductase is produced in non-functional form, implicating NirQ-NorD in enzyme maturation or assembly (Jungst and Zumft, 1992; de Boer et al., 1996). The best biochemically characterized MoxR AAA+ ATPase chaperone system is RavA-ViaA, where RavA is the AAA+ motor, and ViaA is an interacting VWA-domain containing protein (Snider et al., 2006; Wong et al., 2017). Intriguingly, one of a number of described functions of RavA involves a reduction of the affinity of the allosteric inhibitor ppGpp to the enzyme lysine decarboxylase (albeit in a ViaA independent manner) (El Bakkouri et al., 2010; Kanjee et al., 2011). Therefore, it is likely that in this family many chaperones with functions related to the modulation of enzyme activity remain to be discovered. The CbbQO Rubisco activation system was likely derived from such an origin.

# THE ARCHITECTURE OF INHIBITED RUBISCO ACTIVE SITES

It is established that contemporary Rubisco enzymes all share a common ancestor (Tabita et al., 2007), and although there is significant diversity in quaternary structure, tertiary structure is essentially conserved (Andersson, 2008; Andersson and Backlund, 2008). The implication is thus that the different Rca motors will encounter a highly similar substrate, irrespective of its origin. It is therefore reasonable to expect that Rca mechanisms will display similarities. Consequently motor-substrate specificity should be exchangeable by targeted mutagenesis once the mechanisms are understood in sufficient detail.

Representative examples of Form I and Form II inhibited Rubisco complexes that function as Rca substrates are shown in **Figure 3**. The active site is located at the C-terminal face of the beta strands forming the αβ barrel. Residues contributing to the active site are mostly found in the loops connecting the beta strands of the barrel to the downstream helices, but a few are donated by the N-terminal domain of the opposing subunit. Once the substrate RuBP has bound, loop 6 of the beta barrel folds over the active site to form the closed state (Karkehabadi et al., 2007). Loop 6 contributes a critical lysine residue (Form I: K334; Form II: K330), which is thought to position the CO<sub>2</sub> molecule for carboxylation. In Form I enzymes, closure of the active site is accompanied by the C-terminal strand of the large subunit folding over loop 6, with Asp-473 believed to act as a latch residue (Duff et al., 2000; Satagopan and Spreitzer, 2004). The thus secured C-terminus is envisaged to be under tension to push down on Loop 6 via Lys-128 (Bainbridge et al., 1998), providing rigidity to the carboxylation ready active site (Duff et al., 2000). In stark contrast to the C-terminal locking mechanism in Form I Rubisco, inspection of the closed form of the carboxy-arabinitol 1,5-bisphosphate (CABP) bound Form II hexamer from Rhodopseudomonas palustris reveals that the C-terminus does not fold over and lock down Loop 6, but is instead positioned at the apex of the complex (Satagopan et al., 2014) (**Figure 3B**). As a consequence Loop 6 is surface exposed in these structures (Satagopan et al., 2014; Varaljay et al., 2016). Instead of being held in place by the C-terminus, the structure reveals a salt-bridge between Glu-332 (R. palustris RbcL labeling) and Lys-33 on the opposite subunit. These residues are conserved in many Form II enzymes, and the interaction may thus be part of an alternative Loop 6 locking mechanism. Another important feature of active site closure concerns a 2° rotation of the N-terminal domain, resulting in a reduced distance between the phosphate binding sites of the active site (Taylor and Andersson, 1996; Duff et al., 2000).

Based on these observations, the conformational changes to bring about an opening of the active site catalyzed by the Rca motors could either involve manipulation of the C-terminal domain, for instance by disruption of the latched C-terminus in Form I enzymes, or Rca-induced movement of the N-terminal domain. In fact both strategies appear to be utilized.

# OLIGOMERIC STATE AND REGULATION OF THE ACTIVASES

The three classes of Rca identified so far all belong to distantly related branches of the AAA+ protein superfamily and possess a single AAA+ domain. Experimentally determined atomic models of the AAA+ module of all activase classes are now available, and all exhibit the expected architecture of this protein family (Henderson et al., 2011; Mueller-Cajar et al., 2011; Stotz et al., 2011; Hasse et al., 2015; Sutter et al., 2015). A Rossmann fold forms the nucleotide binding domain, which is followed by a small α-helical subdomain (Erzberger and Berger, 2006, **Figure 4A**). AAA+ proteins commonly form hexameric rings, and this is certainly the functional form of both the red-type (Mueller-Cajar et al., 2011; Loganathan et al., 2016) and the CbbQO-type Rcas (Sutter et al., 2015; Tsai et al., 2015) as verified by negative-stain electron microscopy.

It is interesting to note that the proteobacterial red-type Rca forms an ATPase inactive fibril in the presence of Mg-ATP. Binding of Rubisco's substrate RuBP to a pocket located in the α-helical subdomain triggers an oligomeric transition to the ATPase and activase functional hexamer (Mueller-Cajar et al., 2011). In contrast, the enzyme from the red alga Cyanidioschyzon merolae presents as a constitutive hexamer composed of alternately arranged nuclear and plastid-encoded isoforms (Loganathan et al., 2016). However, the RuBP-binding pocket is conserved in both isoforms and ATPase activity is stimulated by the addition of RuBP. Thus, in both prokaryotes and eukaryotes enzymatic activity of red-type Rca is allosterically regulated by the substrate of the remodeller's target. Nevertheless, mutational studies indicated that the two red-type Rca isoforms in red algae are functionally non-equivalent. For instance, eliminating ATPase function of the plastid-encoded isoform by mutating the conserved Walker B glutamate to glutamine counterintuitively enhanced ATP hydrolysis of the hetero-oligomeric complex and resulted only in slight impairment of activase function. In contrast, the equivalent substitution in the nuclear encoded isoform eliminated both Rca and ATPase function (**Table 1**). It remains to be seen whether these specializations have resulted in genuine enhancements in activase function or whether they are manifestations of molecular ratchet-type evolutionary trajectories (Gray et al., 2010; Finnigan et al., 2012).

The in vitro oligomeric state of the green-type Rcas is highly polydisperse, possibly ranging from monomeric (Keown et al., 2013) to very large assemblies (Barta et al., 2010; Chakraborty et al., 2012; Kuriata et al., 2014). However, the existence of functional, stable hexamers (Blayney et al., 2011; Stotz et al., 2011; Keown and Pearce, 2014) suggests that this is also the functional species. It is possible that the oligomeric forms may be transitional to permit efficient movement of the activases through the extremely crowded chloroplast stroma (Harris and Koniger, 1997), permitting this less abundant helper protein to shuttle between inactive Rubisco active sites as required. Hexameric assemblies would then occur transiently to form the functional assembly at the inhibited substrate Rubisco. Consistent with this notion, green-type activases rapidly exchange subunits in vitro (Salvucci and Klein, 1994; van de Loo and Salvucci, 1998; Stotz et al., 2011). Regulation of the green-type Rca in higher plants is complex (Carmo-Silva and Salvucci, 2013; Hazra et al., 2015), with a number of mostly energy-related signals being integrated. These include redox modulation by thioredoxin and inhibition by ADP (reviewed by Carmo-Silva et al., 2015 and Portis, 2003) and, most recently, reversible phosphorylation (Boex-Fontvieille et al., 2014; Kim et al., 2016).

CbbQO is unique among activases, in that the AAA+ hexamer CbbQ associates with a single adaptor protein CbbO, which is essential for activase function. The CbbQ<sub>6</sub>O<sub>1</sub> complexes are monodisperse and do not disassemble as assessed by gel filtration chromatography (Tsai et al., 2015). Finally, both CbbQO and red-type Rubisco activases exhibit a strong stimulation of their ATPase activity when assayed in the presence of inhibited Rubisco complexes (Mueller-Cajar et al., 2011; Tsai et al., 2015; Loganathan et al., 2016). This type of regulation is not observed in the green-type Rcas (Robinson and Portis, 1989; Hazra et al., 2015).

# MECHANISTIC INSIGHTS INTO RUBISCO REMODELING

AAA+ proteins generally function by translating conformational changes brought about by ATP hydrolysis to a macromolecular substrate, and this principle applies to Rcas and Rubisco. The best described mechanisms so far involve the translocation of the substrate through the axial pore of the hexameric AAA+ ring. This involves a conserved pore loop 1 tyrosine in many well-studied systems, including ClpX (Siddiqui et al., 2004), ClpB/Hsp104 (Weibezahn et al., 2004) and the AAA+ unfoldase of the proteasome (Beckwith et al., 2013). In **Table 1** I summarize biochemical evidence for the mechanistic models described in this section. The outlined threading mechanism appears to be utilized by red-type Rca in both photosynthetic bacteria and red algae. In this model, the activase transiently threads the C-terminus of the Rubisco large subunit into the pore (**Figure 4C**). Red-type Rubiscos all appear to possess a C-terminal extension of 11–12 residues following the critical latch residue Asp-473, which locks the C-terminus to its large subunit. Thus, by pulling on this peptide, the interaction of Asp-473 with its own subunit can be disrupted, releasing the lock and allowing loop 6 to retract, followed by release of the bound inhibitor. Substitutions with alanine of the conserved pore loop 1 tyrosine in both the bacterial and algal red-type Rca, as well as two and four amino acid deletions of the RbcL C-terminus, abolish activase function (Mueller-Cajar et al., 2011; Loganathan et al., 2016).

Interestingly this model, at least relating to transient threading of the Rubisco large subunit C-tail, is unlikely to apply to either of the other two Rca classes. In contrast to the red-type Rubiscos, the C-termini of green-type Rubiscos are of variable length, but often only have 2–4 residues following the latch residue (Satagopan and Spreitzer, 2004). Green-type Rca is thus unlikely to engage this short and variable motif. It was also found that an extension of the tobacco large subunit by six histidine residues did not affect Rca function (Scales et al., 2014). In addition the central pore of the green-type Rca hexamer has a larger diameter than that of red-type Rca, which led to the hypothesis that a larger secondary structural element, such as a loop, could be threaded instead (Stotz et al., 2011). Consistent with the general theme of a pore-loop threading mechanism, mutational analysis of pore loop 1 and 2 resulted in the discovery of variants that maintained ATPase function but no longer activated Rubisco (Stotz et al., 2011). Notably, the AAA+ chaperone ClpB has been demonstrated to be capable of threading a looped segment (Haslberger et al., 2008), and the threading mechanism is therefore not limited to free N- or C-termini.

TABLE 1 | Overview of selected key Rca and Rubisco mutants providing insights into the activation mechanism and listed in the order referred to in the text.

The surface exposed βC-βD loop of the large subunit N-terminal domain has long been implicated in the interaction with green-type Rca (**Figure 3A**). Residues 89 and 94 (spinach numbering) in this loop are known to interact with residues 316 and 319 (tobacco Rca numbering) of the activase (Larson et al., 1997; Ott et al., 2000; Li et al., 2005), which are located on a helical insertion in the small subdomain of the AAA+ module (Stotz et al., 2011; Hasse et al., 2015). This interaction involves the same (top) face of the disc-shaped hexamer that is involved in red-type Rca function (Wachter et al., 2013, **Figure 4B**). In addition an N-terminal domain of ∼70 amino acids is also involved in the Rubisco-Rca interaction (Esau et al., 1996; van de Loo and Salvucci, 1996; Stotz et al., 2011); however, it is not resolved in current crystal structures. It is conceivable that following initial engagement by activase involving the mentioned structural elements (**Figure 4C**), a pulling force to the βC-βD loop could be brought about by Rca pore loop threading. Rigid body movement of the attached beta sheet would then result in the rotation of the N-terminal domain seen when comparing the closed and open form of the enzyme (Duff et al., 2000).

Mutational analysis of both CbbQO and the two different classes of substrate Rubisco revealed the basis of a common mechanism for CbbQO-type Rcas. More fascinatingly, the results revealed commonalities to both red- and green-type Rca function. It was noted that in spite of low (∼30%) primary sequence identity of the Form I and Form II Rubisco large subunits, the C-termini of those enzymes encoded in cbbQ-cbbO containing gene clusters displayed a common C-terminal sequence motif (H/KR). Mutagenesis of this motif strongly impaired the ability of the target Rubiscos to be activated by their activases, drawing a strong mechanistic parallel to the pore-loop threading red-type Rcas (Tsai et al., 2015). However, experiments attempting to perturb the poorly conserved pore-loop region of CbbQ did not result in non-functional Rca, and I currently favor a model where the large subunit C-terminus is bound (and consequently immobilized) by the activase, rather than threaded. Here I am also considering the fact that in the Form II substrate the C-terminus does not occupy the same locked latch position as in the Form I complex (**Figure 3**, Satagopan et al., 2014), and thus exerting a pulling force on this motif would not have the same effect.

As is commonly observed for the MoxR class of AAA+ proteins, the CbbO adaptor encoded downstream of the cbbQ gene possesses a von Willebrand factor A (VWA) domain at its C-terminus (Whittaker and Hynes, 2002). This well-described protein-protein interaction module generally uses four residues that are part of a motif known as metal ion dependent adhesion site (MIDAS) to bind a divalent cation. Mutating conserved MIDAS residues mostly abolished CbbQO activase function (Tsai et al., 2015). A fifth ligand to the divalent cation is generally donated by an acidic residue of the interacting protein (Xiong et al., 2002; Santelli et al., 2004). It was discovered that mutating a conserved acidic residue in the previously mentioned surface exposed βC-βD loop of the Rubisco large subunit N-terminal domain to alanine abolished (Form I Rubisco) or greatly reduced (Form II) the ability of Rubisco to become activated by CbbQO (Tsai et al., 2015). Fascinatingly, this residue is at the same position as the green-type Rca interacting residue 89 in higher plant Rubisco. We therefore predict that the ATP-hydrolysis powered conformational changes brought about by CbbQO and green-type Rcas will turn out to be similar in nature (**Figure 4C**). The precise interaction between a CbbQ hexamer and the CbbO adaptor has not been resolved so far, but involves residues 1–444 of CbbO (Tsai et al., 2015). It is possible that the conformational changes of the hexamer generated by ATP hydrolysis are transmitted to the VWA domain via the CbbO N-terminal region (**Figure 4C**).

Disruption of the closed conformation of the Rubisco holoenzyme by Rca of all three classes will lead to release of the inhibitory sugar phosphate. The active site is thus reset either for cofactor binding, or acceptance of the substrate RuBP (if the inhibitor removed was already bound to ECM holoenzyme, **Figure 1C**).
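The activation and inhibition states discussed in this review (E, EC, ECM, ER and CA1P-bound ECM, as in **Figure 1**) can be summarized as a small state model. The following is a minimal sketch for orientation, not a kinetic model: the event names (`bind_CO2`, `bind_Mg`, `bind_RuBP`, `bind_CA1P`, `rca_remodel`) are hypothetical labels chosen for illustration, catalytic turnover of RuBP by ECM is deliberately left out, and inhibited sites are treated as strictly dead-end until remodeled, which is a simplifying assumption.

```python
from enum import Enum


class SiteState(Enum):
    """Active-site states named as in Figure 1 of the text."""
    E = "apo enzyme"
    EC = "carbamylated enzyme"
    ECM = "carbamylated enzyme with Mg2+ (active holoenzyme)"
    ER = "apo enzyme with RuBP bound (inhibited)"
    ECM_CA1P = "holoenzyme with CA1P bound (inhibited)"


INHIBITED = {SiteState.ER, SiteState.ECM_CA1P}

# (current state, event) -> next state; event names are illustrative labels.
TRANSITIONS = {
    (SiteState.E, "bind_CO2"): SiteState.EC,         # carbamate formation
    (SiteState.EC, "bind_Mg"): SiteState.ECM,        # activation completed
    (SiteState.E, "bind_RuBP"): SiteState.ER,        # unproductive binding
    (SiteState.ECM, "bind_CA1P"): SiteState.ECM_CA1P,
    (SiteState.ER, "rca_remodel"): SiteState.E,      # reset for cofactors
    (SiteState.ECM_CA1P, "rca_remodel"): SiteState.ECM,  # reset for RuBP
}


def is_inhibited(state: SiteState) -> bool:
    """True if the site holds a tightly bound sugar phosphate inhibitor."""
    return state in INHIBITED


def step(state: SiteState, event: str) -> SiteState:
    """Apply one event; events with no listed transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(state: SiteState, events: list) -> SiteState:
    """Apply a sequence of events in order and return the final state."""
    for event in events:
        state = step(state, event)
    return state


if __name__ == "__main__":
    path = ["bind_RuBP", "rca_remodel", "bind_CO2", "bind_Mg"]
    print(run(SiteState.E, path).name)
```

In this toy model an apo site that binds RuBP prematurely stays trapped until an `rca_remodel` event returns it to E, after which the cofactors can bind in order; a CA1P-inhibited holoenzyme is instead returned directly to ECM.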

# THE ROLE OF THE ACTIVASES IN A SYNTHETIC BIOLOGY OF CO<sub>2</sub> FIXATION

A strong impetus regarding research into the detailed mechanisms underlying Rubisco repair in autotrophic organisms is provided by the realization that relatively poor Rubisco performance contributes to the low photosynthetic efficiency of plants, and enhancing its activity is predicted to significantly improve the yield of crops (Long et al., 2015). Given the tight coupling of carboxylase function to maintenance of its activation state by the described highly diverse Rca proteins, any modifications of Rubisco will need to keep in mind compatibilities and other properties of Rca.

#### RUBISCO AND RCA TRANSPLANTATION

A number of strategies regarding the enhancement of C3 photosynthesis rely on the concept of transplanting a Rubisco enzyme of choice into a target crop (Andrews and Whitney, 2003; Zhu et al., 2004). Such experiments need to ensure the presence of a suitable Rca, and technically this is not a difficult problem. Rca in higher plants is encoded by the nuclear genome, and thus Agrobacterium tumefaciens based transformation methods can successfully deliver a target Rca gene (Kurek et al., 2007; Kumar et al., 2009; Fukayama et al., 2012). Deletion or silencing of the endogenous Rca genes may be advantageous if heterooligomerization is likely to occur (for instance if a green-type Rca is to be transplanted). In particular the rapid development of CRISPR-Cas9 technology will facilitate this process further (Belhaj et al., 2015). However, the relative ease of Rca engineering does not extend to Rubisco. Since in higher plants the rubisco large subunit is encoded by the chloroplast (as opposed to the nuclear) genome, this achievement requires the replacement of the endogenous rbcL genes in multiple plastid genome copies. Following significant technical progress in the past decades it is now possible to routinely perform this experiment in tobacco plants using biolistic transformation. Here a particular boon has been the development of a marker-free tobacco-rubrum "master" line (Whitney and Sharwood, 2008), which has its endogenous hexadecameric Form IB Rubisco replaced by a bacterial dimeric Form II enzyme. Due to this Rubisco's low CO2/O<sup>2</sup> specificity, it only permits plant growth at elevated levels of CO<sup>2</sup> (Whitney and Andrews, 2001) and thus facilitates the isolation of transformants expressing more catalytically adept heterologous Form I enzymes. 
Key examples of successful Rubisco transplantation experiments include various higher plant enzymes (Sharwood et al., 2008; Whitney et al., 2011b, 2015), a cyanobacterial Form I enzyme (Lin et al., 2014b) and an archaeal Form III enzyme from Methanococcoides burtonii (Wilson et al., 2016). It is therefore technically feasible to produce functional heterologous Rubisco in tobacco plants, although expansion of the technology to other species has so far met with modest success and most crops cannot currently be modified in this manner (Bock, 2015). Current efforts in this area of research aim to identify better suited higher-plant Rubiscos (Orr et al., 2016; Sharwood et al., 2016a), or to introduce single residue changes into the large subunit that result in desired catalytic switches (Whitney et al., 2011b). Here activase requirements should be easy to satisfy due to the wide level of compatibility between plant Rubiscos and green-type Rcas (Wang et al., 1992). Still, a relative paucity of Rubisco-Rca compatibility data may require careful biochemical characterization on a case-by-case basis.

Although production of heterologous Rubisco in higher plants is currently feasible, a key limitation concerns our incomplete understanding of the enzyme's folding and assembly machinery, which results in either low Rubisco content, or a complete failure in functional Rubisco expression. Regarding the production of heterologous plant Rubisco, rapid progress is being made: for instance, co-expression of the Rubisco assembly chaperone Raf1 (Feiz et al., 2012; Hauser et al., 2015a) permitted a doubling of correctly assembled Arabidopsis Rubisco large subunits in tobacco chloroplasts (Whitney et al., 2015).

Among the most tempting targets for transplantation are the red-type Form ID Rubiscos from red algae, some of which have evolved CO<sub>2</sub>/O<sub>2</sub> specificity values that are twice as high as those found in the land plant Form IB enzymes (Read and Tabita, 1994; Uemura et al., 1997). For instance, functional production of the Rubisco from the red alga Griffithsia monilis (Whitney et al., 2001) in higher plant chloroplasts is predicted to result in a 27% increase in daily canopy carbon gain (Zhu et al., 2004). However, early experiments to produce these proteins in tobacco led to complete insolubility of the gene products (Whitney et al., 2001), consistent with an incompatibility of the folding and/or assembly chaperone machinery. Interestingly, this apparent dependency on sophisticated chaperone machinery does not extend to the related bacterial Form IC red-type Rubiscos. The enzyme from Rhodobacter sphaeroides has no requirements for assembly chaperones, merely requiring the GroEL-ES chaperonin for productive folding of the large subunit in a reconstituted system (Joshi et al., 2015). Meeting the biogenesis requirements of Form ID Rubisco may thus be less complicated than that of the higher plant Form IB enzymes, which appear to require a plethora of assembly factors including Raf1, Raf2 and possibly RbcX (Liu et al., 2010; Feiz et al., 2014; Bracher et al., 2017). Once Form ID Rubisco transplantation has been achieved, it will need to be supplemented with a red-type Rubisco activase. Based on the work with purified C. merolae proteins, it is likely that the cognate algal Rca, a hetero-oligomer of nuclear and plastid encoded subunits, will be optimal for this purpose. However, the simpler homo-oligomeric bacterial red-type Rcas also present some activity toward the algal enzyme and thus may be sufficient (Loganathan et al., 2016).

A challenging goal that is currently being pursued by a number of groups involves the transplantation of the prokaryotic carboxysomal CO<sub>2</sub>-concentrating mechanism into the higher plant chloroplast (Price et al., 2008; Lin et al., 2014a,b). A combination of a high velocity Rubisco operating at very high CO<sub>2</sub> concentrations achieved by carboxysomal Rubisco compartmentalization and active inorganic carbon transport should permit high carbon dioxide assimilation in the absence of photorespiration (Zarzycki et al., 2013). When considering this strategy it is important to realize that a subset of carboxysomal gene clusters include homologs of all three classes of Rca (Zarzycki et al., 2013; Sutter et al., 2015). Activase activity has not yet been demonstrated biochemically for any of the carboxysomally associated Rcas, and an inability to detect this function was reported in two cases (Li et al., 1999; Sutter et al., 2015). However, in my opinion the association of these Rca homologs with carboxysomal gene clusters indicates that the associated Rubiscos have not escaped from the activase dependency. Progress here will likely require the use of Rubisco inhibitors other than RuBP, which binds only weakly to carboxysomal Rubiscos (Andrews and Abel, 1981; Pearce, 2006), as well as assay conditions that mimic the crowded carboxysomal interior. In order for Rca-associated carboxysomes to function optimally, the relevant activase will likely also need to be supplied (Long et al., 2016).

It is intriguing that significant numbers of carboxysome-containing organisms do not appear to encode Rca proteins (Zarzycki et al., 2013), suggesting either a true activase independence or the existence of unidentified activase classes. Another enticing possibility would involve members of the general chaperone machinery functioning as activases, in a scenario resembling the situation prior to the evolutionary recruitment of specialized Rcas.

# OVERCOMING THE THERMOLABILITY OF RCA

For a long time it has been realized that plant photosynthesis is highly sensitive to temperature stress (Berry and Bjorkman, 1980), and that the reduction of this process was correlated with a loss in Rubisco activation state (Weis, 1981; Kobza and Edwards, 1987). The discovery that Rca is highly thermolabile, and undergoes heat denaturation at physiologically relevant temperatures provided a mechanistic basis to this observation (Feller et al., 1998; Crafts-Brandner and Salvucci, 2000; Salvucci and Crafts-Brandner, 2004a). This realization was followed by the critical demonstration that expression of more thermostable Rca proteins in Arabidopsis led to enhanced growth and biomass accumulation at moderately elevated growth temperatures (Kurek et al., 2007; Kumar et al., 2009). It is therefore imperative that these promising studies are followed by rigorous analyses of crop plants expressing more thermostable Rca proteins and such experiments have been reported to be taking place (Carmo-Silva et al., 2015). It will be most important to carefully analyse such plants for deleterious phenotypes at high temperatures, since Rca thermolability has been proposed to be regulatory (Sharkey, 2005). It may thus act as a thermal fuse to bring about Rubisco deactivation under stressful high temperature conditions.

In addressing these issues, clear opportunities exist in taking advantage of more thermostable Rca proteins found among natural variants (Salvucci and Crafts-Brandner, 2004b; Lawson et al., 2012; Scafaro et al., 2016). It is also worth pointing out that it may not be necessary to restrict oneself to green-type Rca. The characterized red-type Rca from the thermophilic rhodophyte C. merolae was a functional activase at 25 °C, and able to hydrolyze ATP after incubation at 60 °C (Loganathan et al., 2016). Protein engineering approaches that combine our mechanistic insights with artificial evolution experiments utilizing an expanding suite of Rubisco dependent Escherichia coli (RDE) systems (Mueller-Cajar and Whitney, 2008b; Durao et al., 2015; Antonovsky et al., 2016; Wilson et al., 2016) will enable incompatibilities between specific Rubiscos and activases to be overcome.

### ACCELERATING RUBISCO ACTIVATION IN PLANTS

An additional opportunity to enhance Rubisco function and photosynthesis by activase engineering relates to the naturally slow activation response of Rubisco under fluctuating light conditions (Mott and Woodrow, 2000; Lawson et al., 2012). Accordingly it was shown that Arabidopsis plants expressing less regulated Rubisco activase isoforms were able to activate Rubisco more rapidly than wild-type plants following a dark to light transition. This property translated to increased biomass accumulation when the plants were grown under a fluctuating light regimen (Carmo-Silva and Salvucci, 2013). Rice plants overexpressing an activase from maize also displayed faster induction of photosynthesis under fluctuating light conditions (Yamori et al., 2012). These results indicate that activases that are highly functional, and thus able to rapidly convert inhibited Rubisco complexes to the ECM holoenzyme, may be able to confer enhanced photosynthetic properties to plants exposed to fluctuating light conditions that may commonly be encountered in natural environments.

While considering the possibility of qualitatively superior activases it is also worth mentioning that the thus far described members of the red-type and CbbQO type Rca clades were all able to remove the extremely tight-binding inhibitor CABP from their cognate Rubiscos (Tsai et al., 2015; Loganathan et al., 2016), whereas the green-type Rca from higher plants is unable to do so (Robinson and Portis, 1988). Although more work is required regarding the relative affinity of CABP to various enzymes, these results indicate that different clades of Rca have evolved different levels of remodeling power that can potentially be utilized to advantage in heterologous contexts.

# OUTLOOK

It appears likely that the crops of the future will possess a photosynthetic machinery consisting of carefully selected modules that will ensure maximum yield performance in their particular environment (Zhu et al., 2010; Kromdijk et al., 2016). The properties of Rubisco and its support cast will continue to play a critical role in this endeavor (Sharwood, 2017). In order to intelligently and effectively apply modifications to the photosynthesizers of our choice, a much denser network of Rubisco and activase related data is required (Hanson, 2016). This is critical because our dependence on Rubisco as key carbon fixation catalyst will be ongoing, at least until alternative and more efficient synthetic CO<sub>2</sub> fixation pathways have been successfully and fully integrated into the metabolism of photoautotrophs (Bar-Even et al., 2010; Schwander et al., 2016).

### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

# FUNDING

My laboratory's research on Rubisco activases was funded by Nanyang Technological University (startup grant) and the Ministry of Education of Singapore (MOE2013-T2-2-089).




**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Mueller-Cajar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Torsin ATPases: Harnessing Dynamic Instability for Function

Anna R. Chase<sup>1</sup>, Ethan Laudermilch<sup>1</sup> and Christian Schlieker<sup>1,2</sup>\*

<sup>1</sup> Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA, <sup>2</sup> Department of Cell Biology, Yale School of Medicine, New Haven, CT, USA

Torsins are essential, disease-relevant AAA+ (ATPases associated with various cellular activities) proteins residing in the endoplasmic reticulum and perinuclear space, where they are implicated in a variety of cellular functions. Recently, new structural and functional details about Torsins have emerged that will have a profound influence on unraveling the precise mechanistic details of their yet-unknown mode of action in the cell. While Torsins are phylogenetically related to Clp/HSP100 proteins, they exhibit comparatively weak ATPase activities, which are tightly controlled by virtue of an active site complementation through accessory cofactors. This control mechanism is offset by a TorsinA mutation implicated in the severe movement disorder DYT1 dystonia, suggesting a critical role for the functional Torsin-cofactor interplay in vivo. Notably, TorsinA lacks aromatic pore loops that are both conserved and critical for the processive unfolding activity of Clp/HSP100 proteins. Based on these distinctive yet defining features, we discuss how the apparent dynamic nature of the Torsin-cofactor system can inform emerging models and hypotheses for Torsin complex formation and function. Specifically, we propose that the dynamic assembly and disassembly of the Torsin/cofactor system is a critical property that is required for Torsins' functional roles in nuclear trafficking and nuclear pore complex assembly or homeostasis that merit further exploration. Insights obtained from these future studies will be a valuable addition to our understanding of disease etiology of DYT1 dystonia.

Edited by:

James Shorter, University of Pennsylvania, USA

#### Reviewed by:

Andre Hoelz, California Institute of Technology, USA

Kürşad Turgay, Leibniz University of Hanover, Germany

#### \*Correspondence:

Christian Schlieker christian.schlieker@yale.edu

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

Received: 25 March 2017 Accepted: 25 April 2017 Published: 11 May 2017

#### Citation:

Chase AR, Laudermilch E and Schlieker C (2017) Torsin ATPases: Harnessing Dynamic Instability for Function. Front. Mol. Biosci. 4:29. doi: 10.3389/fmolb.2017.00029

Keywords: AAA+ proteins, TorsinA, dystonic disorders, nuclear membrane, nuclear pore complex, DYT1 dystonia, protein quality control, ubiquitin

# INTRODUCTION

Torsin ATPases are essential and broadly conserved AAA+ proteins whose discovery was tied to the characterization of the TorsinA DYT1 mutation found in patients with early-onset torsion dystonia, a highly debilitating hereditary movement disorder (Ozelius et al., 1997). Torsins have recently garnered increasing interest in conjunction with pivotal discoveries about their structure and molecular mechanism of activation, as well as compelling insights into their cellular functions. As the sole AAA+ ATPases found in the endoplasmic reticulum (ER) and nuclear envelope (NE), Torsins have been implicated in equally broad and critical functions including lipid synthesis (Grillet et al., 2016), regulation of membrane morphology (Rose et al., 2014), and protein quality control (Chen et al., 2010; Nery et al., 2011), as well as ER redox sensing (Zhu et al., 2008, 2010; Nery et al., 2011; Zhao et al., 2016).

In addition to these roles in the ER, Torsins fulfill distinct functions at the NE. TorsinA and its cofactor LAP1 are essential for proper assembly of fibroblast nuclear envelope-anchored transmembrane actin-associated nuclear (TAN) lines (Luxton et al., 2011), which are comprised of arrays of linker of nucleoskeleton and cytoskeleton (LINC) complexes associated with retrograde flowing actin. TorsinA modulates the rearward motion of nuclei during centrosome positioning and is implicated in maintaining cell polarity in migrating cells (Saunders et al., 2017). A second intriguing role for Torsins at the nuclear periphery is their involvement in modulating nuclear envelope architecture. Deletions of Torsins in human, mouse, worm, and fly cells lead to the formation of omega-shaped "bleb" compartments within the nuclear envelope (Goodchild et al., 2005; Jokhi et al., 2013; Liang et al., 2014; VanGompel et al., 2015; Laudermilch et al., 2016; Tanabe et al., 2016). These perinuclear blebs have been shown to harbor ubiquitinated proteins (Liang et al., 2014; Laudermilch et al., 2016) as well as nuclear pore complex components (Laudermilch et al., 2016). Thus, a picture is emerging in which Torsins accomplish a variety of tasks both at the NE and the ER, and that at least some of these functions are most critical during early developmental stages in neurons (Tanabe et al., 2016). In addition to these functional insights in the cellular context, the recently solved crystal structures of wild-type and DYT1 dystonia mutant Torsin in complex with its cofactor LULL1 confirmed functionally significant structural features that were previously unappreciated (Demircioglu et al., 2016). 
Several reviews have summarized the current state of the Torsin field (Rose et al., 2015; Laudermilch and Schlieker, 2016; Cascalho et al., 2017); thus, the purpose of the following is to spotlight current hypotheses surrounding the Torsins' roles at the inner nuclear membrane and their dynamic assembly into an active, functional complex.

#### STRUCTURAL INSIGHTS INTO TORSIN COMPLEXES

Though homology to other AAA+ proteins suggested that Torsins were capable of ATP hydrolysis-driven mechanical work from the very beginning, the question of whether they were active ATPases or degenerate AAA+ scaffolds was unresolved until Torsins were functionally reconstituted in vitro (Zhao et al., 2013). TorsinA, -B, and -3A have ATPase activity in the presence of ATP and the luminal domain of the ER-resident protein LULL1 while TorsinA and -B alone are activated by the luminal domain of LAP1, which resides in the NE (Foisner and Gerace, 1993; Goodchild and Dauer, 2005; Zhao et al., 2013). The DYT1 dystonia mutant of TorsinA is refractory to the activation by these cofactors, thus presenting one line of evidence for a loss-of-function mechanism in early-onset torsion dystonia (Zhao et al., 2013). These cofactors have degenerate AAA+ scaffolds lacking the motifs needed for ATP binding, and they activate Torsin ATPase activity by complementing the Torsin active site with an arginine finger residue that is absent in Torsins (Brown et al., 2014; Sosa et al., 2014) (**Figures 1A–C**).

The structure of the TorsinA-LULL1 heterodimer unambiguously confirmed the critical role of a catalytic arginine (Demircioglu et al., 2016). This arginine is positioned to stabilize the negative charge of the transition state, thus lowering the free energy of the nucleotide hydrolysis reaction (Scheffzek et al., 1998). As suggested by biochemical studies (Brown et al., 2014; Rose et al., 2014) the TorsinA-LULL1 crystal structure confirmed the critical role of Torsin's C-terminal helix region for forming interactions with LULL1 (Demircioglu et al., 2016) (**Figure 1C**). It is now apparent that the deletion of E303 in the DYT1 dystonia mutant TorsinA perturbs a critical helix at the cofactor interface (Demircioglu et al., 2016), providing an atomic-level rationale for the observation of reduced cross-linking of the conserved C-terminal TorsinA aromatic residues with the cofactor in the TorsinA disease variant (Brown et al., 2014), and the resulting failure to trigger ATP hydrolysis (Zhao et al., 2013) (for additional details on disease implications, see Rose et al., 2015; Cascalho et al., 2017).

The complementation mechanism for ATPase activation and the presence of a degenerated AAA+ fold are unusual but not unprecedented. The bacterial clamp loader has an inactive δ′ subunit that activates the adjacent γ ATP-binding AAA+ subunit (Hedglin et al., 2013; Kelch, 2016). Torsins and their cofactors stand out for the fact that they have different modes of staying anchored in their cellular environment: TorsinA and -B have an N-terminal signal sequence followed by a hydrophobic domain while Torsin2A and -3A do not have a hydrophobic domain, and LULL1 and LAP1 are type-II transmembrane proteins. LULL1 is localized throughout the ER (Goodchild and Dauer, 2005), while the nuclear domain of LAP1 binds to the nuclear lamina and therefore resides in the inner nuclear membrane (Foisner and Gerace, 1993). From an evolutionary standpoint, the added complexity of such a distinctive multicomponent ATPase system likely evolved out of the need to create more diverse roles at precise cellular loci, especially in higher organisms. Dependence on the cofactors for at least some of their functions likely allows cells to leverage the common Torsin scaffold to perform more varied functions in targeted locations and potentially relay signals from or to the nucleus and cytoplasm as well.

Though the stoichiometry of the Torsin/cofactor complex under equilibrium conditions remains to be established, recent data point to a dynamic assembly. Three distinct models exist: (a) an alternating, symmetric Torsin/cofactor ring assembly; (b) homo-oligomeric Torsin rings; and (c) a Torsin/cofactor dimer (**Figure 1D**). Though low-resolution structural data (Sosa et al., 2014) and crosslinking data (Brown et al., 2014) are consistent with the formation of an alternating assembly into a closed ring structure, the major limitation of several approaches aimed at a determination of the (hetero)oligomeric state is that they were mostly carried out with hydrolysis-deficient "trap" variants of TorsinA. These variants are refractory to cofactor-induced hydrolysis (Zhao et al., 2013) and bind the cofactor tightly (Naismith et al., 2009; Zhu et al., 2010; Zhao et al., 2013), a situation that is certainly not representative of the dynamic equilibrium in a cell. The rationale for the second model with Torsin-Torsin homo-oligomers is based on data showing that Torsin assembles into hexameric structures on its own in blue native PAGE experiments, and that ATP is often required to allow oligomerization in AAA+ ATPases (Hanson and Whiteheart, 2005; Vander Heyden et al., 2009; Jungwirth et al., 2010).

[Figure 1 caption fragment: … rings are eventually dismantled because the cofactors lack the necessary four-helix bundle and conserved residues to form stable closed ring structures. The Torsin-cofactor complex is also transient and dynamic: ATP hydrolysis generates ADP-bound Torsin, destabilizing both the Torsin-Torsin and the Torsin-cofactor interaction. Note that the transmembrane domain of LAP1 was omitted for clarity.]

Given that previous studies of Torsins were conducted primarily with "trap" variants that resulted in more static models, we propose a more dynamic model. This model is most strongly supported by the following evidence: only Torsins, but not LAP1 and LULL1, possess the C-terminal helix bundle that is essential for inter-protomer ring-forming contacts (**Figures 1A–C**) among the Clp/Hsp100 AAA+ proteins (Mogk et al., 2003); the high level of conservation observed in Torsin residues on the "back" interface opposite the cofactor binding face (**Figures 1A,B**) (Demircioglu et al., 2016) suggests that these residues participate in homotypic Torsin inter-protomer contacts; and higher-order Torsin oligomers (cf. **Figure 1D**) have been observed via blue native PAGE (Vander Heyden et al., 2009; Jungwirth et al., 2010; Goodchild et al., 2015). Given the cofactors' lack of a four-helix bundle and the low level of "back" interface conservation on either cofactor (Demircioglu et al., 2016), and the fact that Torsin oligomerization itself is ATP-dependent, it is conceivable that activation of ATP hydrolysis by the bound cofactors would effectively disrupt homotypic intra-ring contacts, as proposed previously (Rose et al., 2015; Demircioglu et al., 2016).

One important point of discussion in the context of this model is how the cofactor luminal domains, which would effectively compete with other Torsin subunits in the ring for a nearly identical interface, would manage to initially enter the ring and gain access to an ATP-bound Torsin subunit. One possibility (**Figure 1E–II**) is that Torsin oligomers adopt a split lock washer or spiral conformation, similar to NSF (Zhao et al., 2015), in which parts of the nucleotide binding face of Torsin would be rendered accessible to the cofactor. The flexibility of the unstructured region after the hydrophobic domain but before the AAA+ domain (residues 44-57) could impart additional degrees of translational freedom (a ∼49 Å radius of flexibility, based on Cα-Cα distance) to Torsin subunits, thus also allowing the membrane-anchored cofactors to access the nucleotide binding site, which is about 30 Å from the membrane-anchored N-terminus. Considering that ATP binding is broadly required for oligomerization in AAA+ ATPases, hydrolysis and transition to the ADP-bound state would shift the equilibrium to free Torsin and cofactor subunits (**Figure 1E I-II**). Adding to the complexity of the system is the fact that LULL1 has been shown to form higher-order structures (Goodchild et al., 2015), thus creating an equilibrium reaction between Torsin-engaged, free, and homo-oligomeric or otherwise engaged cofactors. Furthermore, it is possible that the cofactors are themselves regulated by an additional layer of control: for example via posttranslational modifications, through dynamic interactions with other proteins on either side of the membrane, or even within the lipid bilayer. In any case, the known properties of the Torsin-cofactor complex are not consistent with a static assembly.
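The ∼49 Å radius of flexibility quoted above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming a rise of about 3.5 Å per residue for a fully extended chain; this per-residue value is an assumption made here (the text states only the result), and the function name is purely illustrative.

```python
# Back-of-envelope estimate of how far a flexible linker can reach.
# ASSUMPTION (not stated in the source): ~3.5 Å rise per residue for a
# fully extended chain, which reproduces the ~49 Å value quoted in the text.
RISE_PER_RESIDUE_A = 3.5


def linker_reach(first_residue: int, last_residue: int,
                 rise: float = RISE_PER_RESIDUE_A) -> float:
    """Maximum end-to-end span (in Å) of an inclusive residue range."""
    n_residues = last_residue - first_residue + 1
    return n_residues * rise


# Residues 44-57 (inclusive) form the unstructured region: 14 residues.
reach = linker_reach(44, 57)
# Distance from the membrane-anchored N-terminus to the nucleotide
# binding site, as quoted in the text.
site_distance = 30.0

print(f"linker reach ~{reach:.0f} Å, binding site at ~{site_distance:.0f} Å")
print("site within reach:", reach >= site_distance)
```

Under this assumption the linker can span more than the roughly 30 Å separating the membrane anchor from the nucleotide binding site, which is the point of the argument; a real chain would rarely be fully extended, so the number is an upper bound rather than a typical distance.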

Unlike the Clp/Hsp100 proteins, to which Torsins are phylogenetically most similar, the Torsin structure (Demircioglu et al., 2016) further established that Torsins lack the central hydrophobic pore loops that are used to drive substrate translocation through the central channel of other related hexameric AAA+ proteins (Olivares et al., 2016). Combined with the extremely slow ATPase activity (0.006 nucleotides/s), relative to its AAA+ counterparts which can hydrolyze >1.3 nucleotides/s (Martin et al., 2008), these observations render it improbable that Torsin acts in a processive manner to translocate substrates through the inner cavity of the Torsin ring (Zhao et al., 2013; Rose et al., 2015). Instead, Torsins likely interact with substrates with a more transient mechanism such as that of a holder chaperone that quickly binds and releases its substrates, either by lateral diffusion into the axial pore or by binding substrates at the periphery of its assembly. Determining the three-dimensional structure of higher-order Torsin assemblies using, e.g., cryo-electron microscopy might provide important insights in the future. Though characterizing the precise mechanisms of how ATP hydrolysis translates to work exerted on substrates remains challenging even for well-characterized AAA+ proteins, recent studies on NSF, the yeast chaperone Hsp104, and peroxisomal Pex1/Pex6 by cryo-EM have revealed that progression through multiple asymmetric states in stacked spirals, open lock-washers, or more planar assemblies are key drivers for performing work during successive ATP hydrolysis events (Blok et al., 2015; Zhao et al., 2015; Yokom et al., 2016). Given the Torsin assembly's dynamic nature, predicted non-processive action, and similarity to clamp loaders, it is probable that the presence of asymmetric states will also play a role in its activation mechanism and should be accounted for in data analysis and interpretation.
Asymmetric hydrolysis events could, for example, couple various asymmetric states to the insertion of the Torsins' own hydrophobic domains or interaction with the transmembrane cofactors, which could in turn modulate membrane curvature and remodeling or substrate interactions. It will be important to examine these states both in the presence and absence of cofactors and, once they have been identified, the Torsin substrates that have eluded the field thus far.
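To put the two hydrolysis rates quoted above on a common scale, the sketch below converts each rate into a mean time per hydrolysis event and computes their ratio. It uses only the two numbers given in the text; the variable and function names are illustrative, and the ratio is a lower bound because the comparison rate is itself quoted as a lower bound.

```python
# Compare the ATP turnover rates quoted in the text.
TORSIN_RATE = 0.006      # nucleotides hydrolysed per second (Torsin)
REFERENCE_RATE = 1.3     # nucleotides per second (lower bound, other AAA+)


def seconds_per_event(rate_per_second: float) -> float:
    """Mean time between hydrolysis events for a given turnover rate."""
    return 1.0 / rate_per_second


fold_difference = REFERENCE_RATE / TORSIN_RATE

print(f"Torsin: one ATP every ~{seconds_per_event(TORSIN_RATE):.0f} s")
print(f"Reference: one ATP every ~{seconds_per_event(REFERENCE_RATE):.2f} s")
print(f"Reference enzymes are at least ~{fold_difference:.0f}-fold faster")
```

A turnover on the order of minutes per ATP is difficult to reconcile with processive threading, which relies on many rapid successive hydrolysis events, and this is the quantitative basis for the non-processive model discussed above.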

How can we begin to form a mechanistic explanation for the Torsins' exquisite spatiotemporal control during phases of neuronal development while also accounting for their redundancy (Laudermilch et al., 2016; Tanabe et al., 2016)? One likely scenario is the formation of an anti-parallel gradient by the cofactors LULL1 in the ER and LAP1 at the nuclear envelope that dictates when and where Torsins are activated by cofactors to perform their function (Rose et al., 2015). LULL1 could activate Torsin's chaperone function in the ER, perhaps in response to a flux in redox potential or cofactor density in this compartment. The membrane association of TorsinA is controlled by cleavage of a scissile bond that removes the N-terminal hydrophobic domain during B cell differentiation (Zhao et al., 2016), suggesting an additional layer of control that could modulate substrate specificity, for example from membrane-associated to soluble ER-luminal species, during ER expansion. TorsinA species with a mass identical to this cleavage product have been observed in organ homogenates (Goodchild et al., 2005; Jungwirth et al., 2010).

# A NOVEL ROLE FOR TORSINS IN NUCLEAR PORE BIOGENESIS OR HOMEOSTASIS

The hallmark phenotype seen upon Torsin manipulation or deletion is the "blebbing" or herniation of the inner nuclear membrane into the perinuclear space (**Figure 2A**; Goodchild et al., 2005; Jokhi et al., 2013; Liang et al., 2014; Pappas et al., 2015; VanGompel et al., 2015; Laudermilch et al., 2016; Tanabe et al., 2016). This phenotype has been observed in neural tissues of knockout mouse models of TorsinA (Goodchild and Dauer, 2005) and in HeLa cells with combined knockouts of multiple Torsins (Laudermilch and Schlieker, 2016). Similar herniations have also been observed after manipulation of the respective Torsin variants in Drosophila melanogaster and Caenorhabditis elegans (Jokhi et al., 2013; VanGompel et al., 2015), suggesting that Torsin function at the nuclear envelope is conserved.

One formidable challenge to deciphering Torsin function has been the remarkable redundancy between the four Torsin proteins encoded in mammalian genomes. In TorsinA knockout mice, blebbing is observed strictly in neural tissue (Goodchild et al., 2005), where TorsinA is highly expressed (Jungwirth et al., 2010). However, in fibroblasts from TorsinA knockout mice, additionally depleting TorsinB is sufficient to induce blebbing (Kim et al., 2010). In TorsinA knockout mice, blebbing is restricted to a specific developmental window, and the resolution of the blebs in later stages is dependent on increasing expression levels of TorsinB (Tanabe et al., 2016). Finally, deletion of TorsinA or TorsinB individually in HeLa cells shows little perturbation to normal nuclear envelope architecture, but deleting all four Torsins results in robust blebbing (Laudermilch et al., 2016).

[Figure 2 caption fragment: … Upon arrival at the INM, the high local concentration of LAP1 would trigger ATP hydrolysis in Torsins, leading to the disassembly of the Torsin ring and substrate release. Released substrates can then engage in protein-protein complex formation at the INM.]

While the precise composition of the blebs and Torsins' role in their formation is still being determined, several recent findings linked Torsins to nucleoporins (nups) (VanGompel et al., 2015; Laudermilch et al., 2016). In C. elegans, Torsin manipulation resulted in nup mislocalization and altered nuclear import kinetics (VanGompel et al., 2015). In Torsin-deficient HeLa cells, a subset of nups localize specifically to the base or "neck" of the blebs at the inner nuclear membrane (Laudermilch et al., 2016) (**Figure 2A**). Collectively, these observations suggest that Torsin plays a role in nuclear pore complex (NPC) biogenesis or homeostasis. The NPC is a massive structure found in the nuclear envelope through which nucleocytoplasmic transport occurs (Field et al., 2014; Knockenhauer and Schwartz, 2016; Kosinski et al., 2016; Lin et al., 2016). While the precise mechanism of NPC assembly is still actively investigated, there are two distinct assembly pathways: one occurs post-mitotically while the nuclear envelope reforms and the other occurs during interphase (Doucet et al., 2010). Interphase assembly begins from the INM and proceeds outward toward the ONM. After several subcomplexes have assembled, the inner and outer nuclear membranes fuse together, and at least some components of the cytoplasmic region are added to the NPC after this fusion event (Otsuka et al., 2016).

Here we propose two models for a functional link between Torsins and nups. Importantly, the shape and dimensions of the blebs are highly similar to normal interphase NPC assembly intermediates (Laudermilch et al., 2016; Otsuka et al., 2016). Thus, the blebs could represent frozen NPC assembly intermediates that require the action of Torsins for their completion. These intermediates would be frozen at a step prior to the fusion of the inner and outer nuclear membranes (**Figure 2B**). Thus, cytoplasmic nups would be expected to be absent from the base of the blebs in this model, while other subcomplexes would be present. Therefore, it will be critical to perform a detailed compositional analysis of the blebs. A diagnostic absence of cytoplasmic nups would support the idea of a frozen assembly intermediate. That Torsin-deficient cells remain viable albeit exhibiting slower growth (Laudermilch et al., 2016) could be attributed to the contribution of unperturbed NPC assembly proceeding through the Torsin-independent postmitotic insertion pathway.

Alternatively, the blebs could result from sealing of nascent NPCs by endosomal sorting complexes required for transport (ESCRT) components, analogous to a process that has recently been described in yeast in which ESCRT proteins and the AAA+ ATPase Vps4 participate in a pathway that surveils NPCs (Webster et al., 2014; Webster and Lusk, 2016).

We envision two general mechanistic models to explain why blebs form in the absence of Torsin. In the first model, Torsin would act directly in NPC biogenesis. For example, Torsin might participate in the fusion of the inner and outer nuclear membranes during NPC assembly, probably in complex with other proteins. In the second model, Torsin would act upstream of NPC biogenesis or surveillance. Specifically, Torsins could act as trafficking chaperones by binding newly synthesized proteins in the endoplasmic reticulum and delivering them to sites of NPC assembly in the nuclear envelope (**Figure 2C**). Torsin could traffic transmembrane nups, or it could deliver proteins that are essential for NPC assembly or surveillance. One reason for invoking such a function is the presence of a 60 kDa transport limit for the nuclear domains of transmembrane proteins residing in the INM (Ungricht et al., 2015). NE proteins assembling into higher-order oligomeric structures must be held competent for trafficking through the pore membrane in a monomeric state to bypass the 60 kDa size limitation imposed by the NPC. For example, trimeric Sun proteins (Sosa et al., 2012) at the INM harbor sizable nuclear domains (∼34 kDa for Sun1). Trafficking through the pore membrane in a trimeric state would be difficult to reconcile with this 60 kDa size limit. Our specific proposal here is that Torsins could stabilize the monomeric form by association with the luminal domains of NE proteins, while the nuclear domains of NE proteins will ensure INM targeting. Upon arrival at the INM, substrates will be released from Torsins due to the high local concentration of the Torsin activator LAP1 at the INM, resulting in disassembly of the Torsin ring and allowing the released substrate to engage in complex formation (**Figure 2C**). While hypothetical, this model would be consistent with the observation that a hydrolysis-deficient trap variant of TorsinA accumulates in the NE (Goodchild and Dauer, 2004; Naismith et al., 2004), which can be attributed to a failure of LAP1 to catalyze the release of Torsin from its NE-targeted clients.
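
As a rough illustration of the size argument for Sun1, and assuming (as a simplification introduced here, not a measurement from the cited studies) that the nuclear-domain masses of the three protomers simply add, the figures quoted in the text give:

$$
M_{\text{trimer}} \approx 3 \times 34\ \text{kDa} = 102\ \text{kDa} > 60\ \text{kDa}, \qquad M_{\text{monomer}} \approx 34\ \text{kDa} < 60\ \text{kDa}.
$$

Under this assumption, only the monomeric nuclear domain falls below the reported 60 kDa limit, which is why a factor that holds Sun1 monomeric during transit is an attractive proposal.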

Our model could also explain the accumulation of K48-ubiquitylated proteins at the nuclear periphery in Torsin-deficient cells (Laudermilch et al., 2016). Given that the INM of mammalian cells was recently shown to be competent for the degradation of membrane proteins (Tsai et al., 2016), it will be critical to determine if the half-life of otherwise stable NPC/INM proteins (Doucet et al., 2010; Toyama et al., 2013) is compromised in Torsin-deficient cells due to the absence of normally stabilizing interactions that are perturbed due to trafficking defects, and to discern a (mis)localization of INM proteins to the ONM vs. INM upon Torsin manipulation.

In conclusion, we have now reached a stage in our understanding of Torsin biology that is sufficient to begin formulating more precise hypotheses about their mechanism and functions that can be tested by definitive experiments. The likelihood that further genetic experiments within a cellular context will yield the holy grail of the Torsin field, namely the elusive substrates that trigger the changes effected by Torsins in the ER and at the nuclear envelope, is greater than ever. Merging these functional details with a structural understanding of the Torsins' action will provide the necessary basis for developing targeted DYT1 dystonia therapies.

#### AUTHOR CONTRIBUTIONS

All authors listed have made substantial, direct, and intellectual contributions to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

This work was supported by the National Institutes of Health (1R01GM114401 to CS and T32GM007223 to EL) and an NSF GROW award to ARC.

#### REFERENCES

Knockenhauer, K. E., and Schwartz, T. U. (2016). The nuclear pore complex as a flexible and dynamic gate. Cell 164, 1162–1171. doi: 10.1016/j.cell.2016.01.034

Trends Biochem. Sci. 23, 257–262. doi: 10.1016/S0968-0004(98)01224-9


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Chase, Laudermilch and Schlieker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Role of Pontin and Reptin in Cellular Physiology and Cancer Etiology

#### Yu-Qian Mao<sup>1</sup> and Walid A. Houry<sup>1,2</sup>\*

<sup>1</sup> Department of Biochemistry, University of Toronto, Toronto, ON, Canada, <sup>2</sup> Department of Chemistry, University of Toronto, Toronto, ON, Canada

Pontin (RUVBL1, TIP49, TIP49a, Rvb1) and Reptin (RUVBL2, TIP48, TIP49b, Rvb2) are highly conserved ATPases of the AAA+ (ATPases Associated with various cellular Activities) superfamily and are involved in various cellular processes that are important for oncogenesis. First identified as being upregulated in hepatocellular carcinoma and colorectal cancer, their overexpression has since been shown in multiple cancer types such as breast, lung, gastric, esophageal, pancreatic, kidney, and bladder cancers, as well as lymphatic and leukemic cancers. However, their exact functions remain poorly understood, as they interact with many molecular complexes with vastly different downstream effectors. Within the nucleus, Pontin and Reptin participate in the TIP60 and INO80 complexes important for chromatin remodeling. Although not transcription factors themselves, Pontin and Reptin modulate the transcriptional activities of bona fide proto-oncogenes such as MYC and β-catenin. They associate with proteins involved in DNA damage repair such as PIKK complexes as well as with the core complex of the Fanconi anemia pathway. They have also been shown to be important for cell cycle progression, being involved in the assembly of telomerase, mitotic spindle, RNA polymerase II, and snoRNPs. When localized to the cytoplasm, the two ATPases were reported to promote cancer cell invasion and metastasis. Due to their various roles in carcinogenesis, it is not surprising that Pontin and Reptin are proving to be important biomarkers for diagnosis and prognosis of various cancers. They are also current targets for the development of new therapeutic anticancer drugs.

#### Keywords: Pontin, Reptin, AAA+, cancer, cellular pathways

#### Edited by:

Vladimir N. Uversky, University of South Florida, United States

#### Reviewed by:

Edouard Bertrand, UMR5535 Institut de Génétique Moléculaire de Montpellier (IGMM), France

Vibhor Mishra, Howard Hughes Medical Institute, United States

> \*Correspondence: Walid A. Houry walid.houry@utoronto.ca

#### Specialty section:

This article was submitted to Protein Folding, Misfolding and Degradation, a section of the journal Frontiers in Molecular Biosciences

> Received: 02 May 2017 Accepted: 03 August 2017 Published: 24 August 2017

#### Citation:

Mao Y-Q and Houry WA (2017) The Role of Pontin and Reptin in Cellular Physiology and Cancer Etiology. Front. Mol. Biosci. 4:58. doi: 10.3389/fmolb.2017.00058

# INTRODUCTION

Pontin (RUVBL1, TIP49, TIP49a, Rvb1) and Reptin (RUVBL2, TIP48, TIP49b, Rvb2) belong to the AAA+ (ATPases Associated with various cellular Activities) superfamily, whose members are characterized by the conserved Walker A and Walker B motifs, which are involved in ATP binding and hydrolysis (Grigoletto et al., 2011; Matias et al., 2015). Pontin and Reptin were discovered in the late 1990s in a variety of species by multiple groups, resulting in their different naming conventions. The proteins are also putative DNA helicases, sharing homology with the bacterial RuvB helicase (Otsuji et al., 1974; Makino et al., 1998; Kurokawa et al., 1999). However, their function as helicases is not yet established and remains controversial. Their oligomeric state is also debated, as they have been observed to form homo-hexamers, hetero-hexamers, and even hetero-dodecamers (Matias et al., 2006; Cheung et al., 2010a; Niewiarowski et al., 2010; Gorynia et al., 2011). It is also likely that Pontin and Reptin assume different oligomeric states under different functional contexts based on their cellular activities (Grigoletto et al., 2011; Nano and Houry, 2013). For example, Queval et al. (2014) proposed that the oligomerization of Pontin and Reptin can be controlled by interaction of the proteins with the nucleosome.

The cellular activities of Pontin/Reptin include transcriptional regulation, chromatin remodeling, DNA damage signaling and repair, assembly of macromolecular complexes, regulation of cell cycle/mitotic progression, and cellular motility, all of which contribute to their central roles in promoting cell proliferation and survival (Gallant, 2007; Jha and Dutta, 2009; Boulon et al., 2012; Nano and Houry, 2013; Rosenbaum et al., 2013; Kakihara and Saeki, 2014). This also makes them ideal candidates for promoting tumorigenesis and cancer development, especially when activating mutations occur upstream or downstream in their functional pathways (Grigoletto et al., 2011; Matias et al., 2015; Zhao et al., 2015). Not surprisingly, Pontin and Reptin were shown to be essential for tumor cell growth in many cancers and were found to be overexpressed in a large number of cancer types. Here, we summarize the cancer types in which Pontin and Reptin are involved and explore the molecular pathways through which they contribute to oncogenesis.

#### ROLES OF PONTIN/REPTIN IN CANCER

The role of Pontin and Reptin in the development of hepatocellular carcinoma (HCC) is well-established (Haurie et al., 2009; Berasain, 2010; Menard et al., 2010; Raymond et al., 2015; Breig et al., 2016). Not only are they both overexpressed in HCC tissues, where their overexpression was associated with poor prognosis, they both also showed stronger cytoplasmic staining in tumor cells compared to normal hepatocytes (Rousseau et al., 2007; Haurie et al., 2009).

Since their discovery in HCC and colorectal cancer, many other groups have reported the involvement of these two ATPases in several cancer types that affect various organs of the body (Grigoletto et al., 2011) (**Table 1**). This suggests that Pontin and Reptin may play a fundamental role in cancer development. Further investigation is required to consolidate their functions and to determine whether their contribution to, or regulation of, tumor progression is specific to each type of cancer or can be generalized to most.

Within the digestive system (**Table 1**), Pontin and/or Reptin were implicated in cancers of the esophagus, stomach, colon, and pancreas (Li et al., 2010; Lauscher et al., 2012; Tung et al., 2013; Taniuchi et al., 2014; Cui et al., 2016). Specifically, Pontin was implicated in the survival and proliferation of gastric cancer cells and in promoting the invasiveness and migration of pancreatic ductal adenocarcinoma (PDAC) cells (Taniuchi et al., 2014; Cui et al., 2016). Pontin overexpression was correlated with adverse response to adjuvant therapy in colorectal cancer and with poor prognosis for advanced tumor stages. It was found that Pontin levels can be used as a biomarker to discriminate esophageal squamous-cell carcinoma (ESCC) from normal tissue (Lauscher et al., 2007, 2012; Tung et al., 2013). On the other hand, Reptin was shown to be overexpressed in primary tissue of gastric and colon cancers. Reptin overexpression was correlated with aggressive colorectal cancer in a cell model (Li et al., 2010; Flavin et al., 2011; Milone et al., 2016).

In the excretory system (**Table 1**), overexpression of the two ATPases was found in renal cell carcinoma (RCC) (Ren et al., 2013; Zhang et al., 2015). As in HCC patients, cytoplasmic localization of Pontin and Reptin in RCC was found to be correlated with metastasis and unfavorable outcome (Rousseau et al., 2007; Haurie et al., 2009; Ren et al., 2013; Zhang et al., 2015). Whether this correlation with protein localization applies to other cancer types in which cytoplasmic expression was also shown remains to be investigated. In the same vein, Pontin was found to be overexpressed in the more aggressive and metastatic form of bladder cancer, micropapillary carcinoma (Guo et al., 2016).

Several studies have reported Pontin and/or Reptin expression in both non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC) and suggested their potential use as biomarkers for diagnosis and prognosis of lung cancer (Dehan et al., 2007; Ocak et al., 2014; Uribarri et al., 2014; Yuan et al., 2016; Velmurugan et al., 2017) (**Table 1**).

Pontin was also identified in screens of biomarker/autoantigen panels for ductal carcinoma in situ (DCIS) as well as node negative early stage breast cancers (**Table 1**) (Lacombe et al., 2013, 2014). This could prove to be important for early diagnosis of DCIS and could be a complement to mammography. Functionally, Pontin and Reptin were found to be important in breast cancer cell models in the context of elevated snoRNA and hypertrophy of the nucleolus (Su et al., 2014).

Lastly, these two proteins were shown to be important in cancers of white blood cells, resulting in lymphomas and leukemia (**Table 1**). Specifically, BCL6, a transcriptional repressor essential for B and T cell development and differentiation, repressed Pontin expression in lymphoma cells (Baron et al., 2016). In addition, Pontin and Reptin were critical regulators of AML1-ETO (in acute myeloid leukemia) and MLL-AF9 (in mixed lineage leukemia), respectively, where their ATPase activities were required for clonogenesis and survival of the cancer cells (Osaki et al., 2013; Breig et al., 2014).

# ROLE OF PONTIN/REPTIN IN SPECIFIC CELLULAR PATHWAYS

Recent work on Pontin/Reptin attempted to uncover their roles in cellular pathways and processes leading to tumor development. Here, we will discuss the role of these proteins in seven main processes: (1) assembly of replication machinery, (2) aggresome formation, (3) regulation of cell cycle checkpoint, (4) proper mitotic progression, (5) transcriptional regulation, (6) DNA damage response, and (7) cell invasion/migration.

# Assembly of Replication Machineries by the R2TP Complex

Pontin and Reptin are established critical regulators of cell growth and proliferation. One of the ways they achieve this is through the assembly of multiple molecular complexes belonging to the replication machinery, largely mediated by the HSP90-interacting chaperone-like complex R2TP (Boulon et al., 2012; Von Morgen et al., 2015), which was discovered by our group (Zhao et al., 2005). R2TP consists of four proteins and is conserved from yeast to humans (Nano and Houry, 2013). Pontin and Reptin are two of the components of the complex, and they interact with PIH1D1 and RPAP3 to form R2TP. Whereas RPAP3 can bind HSP90 through its TPR domain, PIH1D1 has been proposed to act as an adaptor for the complex, targeting R2TP to its clients such as NOP58 of box C/D snoRNP in yeast, dyskerin core factor of box H/ACA snoRNP in mammalian cells, RPB1 subunit of RNA polymerase II in yeast and mammalian cells, and Tel2 of the TTT complex that interacts with mTOR in yeast and mammalian cells (Boulon et al., 2010; Machado-Pinilla et al., 2012; Kim et al., 2013; Kakihara et al., 2014) (**Figure 1**).

#### TABLE 1 | Overexpression of Pontin/Reptin in various cancer types.

#### Role of Pontin/Reptin in RNP Biogenesis

First found to be important for the biogenesis of box C/D small nucleolar RNP (snoRNP), the role of R2TP has now expanded to the assembly of RNPs of the L7Ae family members (Boulon et al., 2008; McKeegan et al., 2009). In addition to box C/D snoRNPs, this family also consists of box H/ACA snoRNPs (including telomerase), U4 small nuclear RNPs (snRNPs), and selenoprotein mRNAs (Boulon et al., 2008; Machado-Pinilla et al., 2012; Bizarro et al., 2014, 2015). Generally, snoRNPs consist of a small RNA bound by a conserved set of four proteins (Watkins and Bohnsack, 2012). They catalyze specific post-transcriptional modifications on precursor rRNAs (pre-rRNAs) that are essential for the biogenesis/function of the ribosome: box C/D snoRNPs act in 2′-O-methylation, while box H/ACA snoRNPs guide pseudouridylation of pre-rRNAs (Lui and Lowe, 2013).

#### **Assembly of box C/D snoRNPs**

Recently, overexpression of snoRNAs has been implicated in the tumorigenesis of several cancers, such as small-cell lung cancer, prostate cancer, breast cancer, and neuronal tumors (Mei et al., 2012; Williams and Farzaneh, 2012; Su et al., 2014; Herter et al., 2015). Elevated snoRNAs support ribosome biogenesis, nucleolar hypertrophy (a common feature in cancer), and protein synthesis for the proliferation of cancer cells (Ruggero and Pandolfi, 2003; Montanaro et al., 2008). In addition, snoRNPs are established oncogene MYC targets, and elevated snoRNP component Fibrillarin was recently found to inactivate tumor suppressor p53 in a cap-independent mechanism (Su et al., 2014; Herter et al., 2015). Thus, regulation of the assembly and biogenesis of snoRNPs (reviewed in Massenet et al., 2016) would also be critical for tumorigenicity.

Though many models have been proposed for the nuclear biogenesis of the box C/D snoRNPs in both yeast and humans, the specific mechanisms and sequence of assembly steps of its core proteins (Fibrillarin, NOP56, NOP58, and 15.5K) by the array of biogenesis factors are still under debate. One hypothesis is that PIH1D1 and RPAP3 of R2TP act as loading factors for Pontin and Reptin onto core snoRNP proteins NOP58 and 15.5K. Subsequently, PIH1D1 and RPAP3 dissociate from this complex (Bizarro et al., 2014) (**Figure 1A**). Pontin/Reptin along with other assembly factors, NUFIP, ZNHIT3, and ZNHIT6, form a pre-snoRNP complex that can be stable independent of RNA (Bizarro et al., 2014; Verheggen et al., 2015). As additional core snoRNP proteins and snoRNA are brought in, the assembly factors are replaced. Pontin/Reptin as well as NUFIP are the last to dissociate from the mature box C/D snoRNP (Bizarro et al., 2014) (**Figure 1A**). This is supported by evidence that Pontin and Reptin bound differentially to snoRNP proteins and PIH1D1 in an ATP-dependent manner (McKeegan et al., 2009; Cheung et al., 2010b). Whereas snoRNP protein 15.5K interacted with Pontin/Reptin when loaded with ATP, the addition of ATP in vitro has been shown to dissociate PIH1D1 and RPAP3 from R2TP (McKeegan et al., 2007, 2009). In addition, pulldown assays using snoRNP core proteins as bait were unable to detect either PIH1D1 or RPAP3 as interactors (Bizarro et al., 2014).

FIGURE 1 | Assembly pathways of RNP complexes regulated by R2TP. (A) Assembly of box C/D snoRNP. R2TP facilitates the pre-assembly of box C/D snoRNP components (shown in purple) along with other assembly factors (shown in yellow). PIH1D1 and RPAP3 may dissociate from this pre-snoRNP complex earlier than Pontin/Reptin as other snoRNP proteins and the snoRNA are brought to interact. Pontin/Reptin along with ZNHIT6 and NUFIP dissociate last and mature box C/D snoRNP is translocated into the nucleolus where it functions. (B) Assembly of box H/ACA snoRNP and the telomerase holoenzyme. R2TP facilitates the dissociation of SHQ1 assembly factor from box H/ACA snoRNP protein dyskerin. Other snoRNP core proteins (shown in blue), assembly factors (shown in yellow), as well as the snoRNA are then assembled with the free dyskerin. TERT, the catalytic subunit of the telomerase, may also be bound by the R2TP complex for its assembly with the rest of the snoRNP. (C) Assembly of U4 and U5 snRNPs. For U4 snRNP, R2TP along with co-factors (shown in yellow) pre-assembles with PRP31 (shown in purple). Recruitment of 15.5K then promotes binding of the U4 snRNA. For U5 snRNP, an intermediate complex is first assembled in the cytoplasm by R2TP and HSP90. After nuclear import, the snRNA and other snRNP proteins are incorporated. U4 can then form a tri-snRNP with U5 and U6.

Another hypothesis is that R2TP as a complex, along with other assembly factors, interacts with and stabilizes NOP58 to allow its assembly on the snoRNA with other core snoRNP proteins (Kakihara and Saeki, 2014; Kakihara et al., 2014). This hypothesis is supported by the observations that PIH1D1 interacts in vitro with multiple snoRNP proteins such as NOP58, NOP56, and Fibrillarin, and that PIH1D1 is able to immunoprecipitate endogenous or transfected snoRNA (Watkins et al., 2004; McKeegan et al., 2007, 2009; Boulon et al., 2008; Prieto et al., 2015). R2TP proteins were also seen to interact with snoRNP proteins along with other assembly factors such as NUFIP and ZNHIT6, further supporting this hypothesis (McKeegan et al., 2007; Boulon et al., 2008). Further research is needed to elucidate the step-wise assembly of the box C/D snoRNP. Regardless, Pontin and Reptin are essential assembly factors of snoRNP biogenesis, shown to bridge interactions between multiple core proteins.

#### **Assembly of box H/ACA snoRNPs**

The other major class of snoRNPs is the box H/ACA class, consisting of a snoRNA with a box H/ACA sequence motif that guides the complex to its rRNA target and four conserved proteins: dyskerin, GAR1, NOP10, and NHP2 (Mannoor et al., 2012). R2TP was found to be essential for the assembly of this snoRNP as well (**Figure 1B**). Core protein dyskerin is normally bound by assembly factor SHQ1 and prevented from forming the mature snoRNP (Machado-Pinilla et al., 2012). All components of the R2TP complex were required for the dissociation of SHQ1 from dyskerin, though only Pontin/Reptin and PIH1D1 interacted directly with dyskerin (Machado-Pinilla et al., 2012). Pontin/Reptin also directly interacted with SHQ1. This suggested a model where PIH1D1 targeted Pontin and Reptin to the dyskerin-SHQ1 complex, thereby allowing Pontin and Reptin to remove SHQ1 from dyskerin (Machado-Pinilla et al., 2012) (**Figure 1B**). Whether this process occurs through competitive binding or depends on the ATPase activity of Pontin and Reptin to induce conformational changes in the dyskerin-SHQ1 complex is uncertain. The role of RPAP3 in this process is also not clear.

#### **Assembly of the telomerase complex**

The human telomerase complex is composed of the telomerase reverse transcriptase enzyme TERT and the TERC RNP consisting of the telomerase RNA component TERC (which contains a box H/ACA motif) along with all four proteins of the box H/ACA snoRNP family (Maciejowski and de Lange, 2017). Thus, it can also be considered as being a member of the box H/ACA class. Pontin and Reptin were found to play a critical role in the assembly and activity of telomerase through interacting with both TERT and the TERC RNP (Venteicher et al., 2008). Thus, their role in TERC RNP assembly may follow that of the canonical box H/ACA snoRNP, where R2TP dissociates dyskerin from SHQ1, allowing the free dyskerin to interact and associate with other snoRNP proteins (**Figure 1B**).

Telomerase is responsible for adding telomere repeats to chromosome ends, protecting them from DNA damage or erosion (Maciejowski and de Lange, 2017). In differentiated human somatic cells, TERT is silenced and telomeres undergo programmed shortening, eventually leading to cell growth arrest as well as senescence or apoptosis. However, telomerase is upregulated in cancer, enabling indefinite proliferation of the cells and the development of tumors (Li and Tergaonkar, 2014; Maciejowski and de Lange, 2017). Pontin/Reptin can regulate TERT both on the gene and protein levels (Venteicher et al., 2008; Li et al., 2010; Flavin et al., 2011). Though both Pontin and Reptin were needed for the accumulation of TERT mRNA, only Reptin depletion inhibited TERT promoter activity; this is likely through the regulation of MYC (c-myc), the transcription factor for TERT (Li et al., 2010). Reptin was found to bind MYC at the promoter region of TERT, and when Reptin was depleted, MYC was unable to bind to the E-box motif (the MYC-binding motif) on the TERT promoter (Venteicher et al., 2008; Li et al., 2010). Thus, it is intriguing to hypothesize that a silencing factor/repressor may usually bind this region, and that Reptin assists MYC in displacing the repressor thus allowing transcription of TERT.

Venteicher et al. (2008) found that Pontin directly interacts with the TERT protein in complex with Reptin, forming a TERT-Pontin/Reptin complex. However, the enzymatic activity of TERT in this complex is significantly lower than that when TERT is associated with the TERC RNP member dyskerin. During the cell cycle, the interaction between TERT and the ATPases peaks in S phase and diminishes in G2, M, and G1. This suggests that Pontin and Reptin may be binding to a pretelomerase TERT that needs remodeling or association with other factors for its activity (Venteicher et al., 2008). One hypothesis is that Pontin and Reptin may act again as assembly factors as part of the R2TP complex and dissociate after the mature telomerase complex is formed (Venteicher et al., 2008; Machado-Pinilla et al., 2012). This is supported by observations that HSP90 functions in the nuclear import of TERT (Lee and Chung, 2010; Jeong et al., 2016). It may also be possible that Pontin and Reptin hold TERT in an inactive form until TERT activity is needed.

#### **Assembly of spliceosomal snRNP U4 and U5**

The spliceosome is comprised of five snRNPs (U1, U2, U4, U5, and U6) that cooperatively mediate the splicing of pre-mRNAs for proper gene expression (Matera and Wang, 2014). U4, U5, and U6 are recruited to the splicing site as a tri-snRNP complex and then rearranged into a catalytically active complex (Nguyen et al., 2015). In addition to the snRNA, each snRNP contains a heptameric ring of either Sm or Like-Sm proteins, as well as a variable number of snRNP-specific proteins (Matera and Wang, 2014). The assembly of snRNP-specific proteins has recently been proposed to be regulated by the R2TP complex along with HSP90 (Bizarro et al., 2015; Cloutier et al., 2017; Malinova et al., 2017).

The assembly of snRNPs generally begins with the export of snRNAs out of the nucleus (Matera and Wang, 2014). In the cytoplasm, the Sm ring is loaded onto the snRNA by the SMN complex and reimported (Battle et al., 2006). Assembly of U4 specific proteins PRP31 and 15.5K into the snRNP by R2TP, NUFIP and ZNHIT3 is thought to occur after reimport into the nucleus (**Figure 1C**) (Bizarro et al., 2015). PRP31 first forms a complex with R2TP and assembly factors, then binding of 15.5K promotes the stable incorporation of PRP31 into the snRNP (Bizarro et al., 2015).

On the other hand, assembly of the U5-specific proteins occurs first in the cytoplasm (**Figure 1C**) (Malinova et al., 2017). An intermediate complex is formed with the recruitment of PRPF8 and EFTUD2 to the R2TP/HSP90 complex along with AAR2, ZNHIT2, and other assembly factors (Cloutier et al., 2017). HSP90 is thought to stabilize PRPF8 and EFTUD2 through the interaction of the PIH1D1 N-terminal domain with the phosphorylated DSDED motif on EFTUD2 (Malinova et al., 2017). After the nuclear import of this complex, binding of SNRNP200 and other cofactors occurs, followed by binding of U5 snRNA and release of assembly factors for the maturation of U5 snRNP (Malinova et al., 2017) (**Figure 1C**). Finally, U4, U5, and U6 snRNPs assemble together to form the U4/U6.U5 tri-snRNP (**Figure 1C**).

#### Assembly of RNA Polymerase II

RNA polymerase II (POL II) is a fundamental cellular complex that synthesizes all the mRNAs and capped non-coding RNAs. Its 12 subunits are assembled in the cytoplasm, in part by the R2TP complex, and only fully assembled POL II is imported into the nucleus (Boulon et al., 2010). The subunits are formed in two subcomplexes, the RPB1-associated and RPB3-associated complexes, each interacting with a specific set of assembly factors (Boulon et al., 2010, 2012) (**Figure 2A**). R2TP, together with a set of six proteins that form a prefoldin-like (PFDN-like) complex (PFD2, PFD6, PDRG1, UXT, URI, and WDR92), makes up the R2TP/PFDN complex (Boulon et al., 2010; Millan-Zambrano and Chavez, 2014), which was found to interact with the POL II RPB1 subcomplex (**Figure 2A**). Free RPB1 subunits in the cytoplasm were mainly stabilized by HSP90 via interactions with RPAP3, facilitating the association and assembly of RPB1 with other subunits (Boulon et al., 2010). In addition, URI of the R2TP/PFDN complex interacted with RPB5, another subunit of the RPB1 subcomplex, further implicating R2TP/PFDN in the assembly of POL II (Mita et al., 2013). Fully assembled POL II is then transported into the nucleus via the Iwr1 import adaptor, as well as assembly factor RPAP2 and GTPase GPN1/RPAP4 (Boulon et al., 2010; Forget et al., 2013). While RPAP2 mediates the nuclear import of POL II, GPN1/RPAP4 is required for the recycling of RPAP2 by exporting it back into the cytoplasm in a CRM1-dependent manner (Forget et al., 2013).

Intriguingly, RPAP3 was found to interact with subunits of both RNA polymerase I and III (Jeronimo et al., 2007; Boulon et al., 2010). In addition, the URI interactor RPB5 is a subunit common to all three RNA polymerases, suggesting that the R2TP/PFDN complex may function in the assembly of RNA polymerases in general (Mita et al., 2013). If so, this may partly explain the overexpression of Pontin and Reptin in many cancers, as their supporting role in protein synthesis and gene expression will help meet the high demand in proliferating tumor cells.

#### Assembly of mTORC1 and Other PIKK Family Members

The PIKK (phosphatidylinositol 3-kinase-related protein kinase) signaling family, important for DNA repair and cellular metabolism (Bakkenist and Kastan, 2004), comprises six members: mTOR (mechanistic target of rapamycin), SMG-1 (suppressor with morphogenetic effect on genitalia-1), ATM (ataxia telangiectasia mutated), ATR (ataxia telangiectasia and Rad3-related), DNA-PKcs (DNA-dependent protein kinase catalytic subunit), and TRRAP (transformation/transcription domain-associated protein) (Baretic and Williams, 2014). Pontin and Reptin regulate mTOR as well as other members of the PIKK signaling family at the transcriptional, protein, and functional levels (Dugan et al., 2002; Horejsi et al., 2010; Izumi et al., 2010; Kim et al., 2013). This was shown by the decrease in both mRNA and protein levels of PIKK members upon depletion of either Pontin or Reptin (Izumi et al., 2010), and, consequently, downstream signaling was also affected. Evidence suggested that Pontin and Reptin regulate transcription factors such as E2F1, whose target genes include members of the PIKK family (Dugan et al., 2002; Taubert et al., 2004; Tarangelo et al., 2015).

FIGURE 2 | Assembly pathways of macromolecular complexes regulated by R2TP. (A) Assembly of RNA polymerase II. Two subcomplexes of the polymerase, RPB1-associated (shown in blue) and RPB3-associated (shown in purple), are formed with the help of R2TP/PFDN and other assembly factors (shown in yellow). Assembly factors dissociate as the mature RNA polymerase II is formed and Iwr1 importin is brought in. Fully assembled RNA polymerase II is then translocated into the nucleus also mediated by assembly factor RPAP2. The point at which R2TP/PFDN dissociate is not known. RPAP2 dissociates in the nucleus and is recycled by being co-exported with GPN1. (B) Dimerization of mTOR complex. Each mTOR subunit is bound by either the TTT complex (dark green) or the R2TP complex (light green). The WAC adaptor facilitates the interaction between these two complexes for the dimerization of mTOR. The assembly factors then dissociate from the dimerized and activated mTOR complex. R2TP is also involved in the assembly/stability of other PIKK family members (shown in purple); however, the molecular basis of its functions is poorly understood.

Interactors of R2TP, such as TEL2 of the TTT complex (Tel2, Tti1, and Tti2; **Figure 2B**), were shown to be essential for the protein stability of all members of the PIKK family (Takai et al., 2007; Horejsi et al., 2010, 2014; Pal et al., 2014). The TEL2-R2TP interaction is mediated by the phosphoserine-containing motif DpSDD/E on TEL2, which binds the N-terminal domain of PIH1D1 (Horejsi et al., 2010; Pal et al., 2014). In addition, the HSP90 chaperone was found to be required for the accumulation of PIKK proteins, likely through its cofactor RPAP3 (Izumi et al., 2010; Pal et al., 2014). Pontin and Reptin, perhaps through the R2TP complex, were shown to be directly involved in the remodeling and assembly of complexes formed by PIKK members. For instance, the ATPases promoted the remodeling of mRNA surveillance complexes, of which SMG-1 is a subunit, during nonsense-mediated mRNA decay (Izumi et al., 2010).

In addition, Pontin and Reptin were shown to be important for the localization and dimerization/activation of the mTORC1 complex under metabolic stress (Kim et al., 2013; David-Morrison et al., 2016). mTOR is a serine/threonine kinase that senses cellular nutrients and energy levels to regulate metabolism and physiology in mammalian cells. It is the catalytic subunit of two distinct complexes: mTORC1, which controls cell growth and protein synthesis, and mTORC2, which is responsible for cell survival signaling. PIH1D1 was also shown to be important for the assembly of mTORC1 complex components (Kamano et al., 2013). A recent model suggested that Pontin/Reptin associate with TTT to form a Pontin/Reptin-TTT complex under energy-rich conditions, helped by the adaptor WAC (David-Morrison et al., 2016). This complex then facilitates the dimerization and proper localization of mTORC1 to the lysosome in an energy-dependent manner (Kim et al., 2013; David-Morrison et al., 2016) (**Figure 2B**).

Functionally, R2TP has been shown to promote mTORC1-dependent transcription of rRNA, and thus ribosome biogenesis (Kamano et al., 2013). The R2TP-mTORC1 interaction is thought to be mediated by PIH1D1, which interacted with mTORC1 complex components but not with those of mTORC2 (Kamano et al., 2013; Horejsi et al., 2014) (**Figure 2B**).

Taken together, Pontin and Reptin can regulate the function of many macromolecular complexes within the cell, sometimes on multiple different levels throughout a pathway. It is therefore expected that defects in any of these pathways can have considerable impact on cell growth.

# Role of Pontin/Reptin in Aggresome Formation

Aggresome formation is a highly regulated process that protects the cell from aggregating polypeptides when its protein degradation and chaperone systems are overwhelmed. Aggresomes are formed from aggregated and misfolded polypeptides that are transported to a centralized location near or around the centrosomes (Johnston et al., 1998; Markossian and Kurganov, 2004).

Pontin and Reptin were identified in an siRNA screen for proteins involved in aggresome formation (Zaarur et al., 2015). Depletion of the two ATPases led to the build-up of scattered cytoplasmic aggregates and reduced the formation of centralized aggresomes. It was found that Pontin and Reptin interacted and co-localized with synphilin-1, which was previously shown to accumulate in and form cytoprotective aggresomes (Tanaka et al., 2004; Zaarur et al., 2015). Additionally, Pontin/Reptin were found to promote disassembly of protein aggregates in vivo. Thus, Pontin and Reptin may function as disaggregating chaperones, and/or be indirectly involved in aggresome formation.

# Role of Pontin/Reptin in Cell Cycle Regulation

Studies in a variety of cancer cell lines have consistently demonstrated that downregulation of either Pontin or Reptin may lead to cell cycle arrest at the G1/S phase checkpoint, resulting in the accumulation of cells in G1 and a reduction of cells in all other phases of the cell cycle (S, G2/M) (Rousseau et al., 2007; Haurie et al., 2009; Menard et al., 2010; Osaki et al., 2013; Ren et al., 2013; Breig et al., 2014; Zhang et al., 2015; Yuan et al., 2016). G1/S transition is regulated by many proteins and pathways that are normally inactivated until entry into S phase is signaled (Otto and Sicinski, 2017). For example, E2F1 transcription factor, responsible for the expression of a collection of S-phase promoting genes, is normally held in the inactive state by retinoblastoma proteins (RB) (Johnson et al., 2016) (**Figure 3**). This E2F1-RB complex is phosphorylated by cyclin D1 and, consequently, E2F1 dissociates from RB and becomes able to act on its target genes (Malumbres and Barbacid, 2001). However, cyclin D1 and other cell cycle genes are only upregulated when mitogenic signals activate downstream pathways such as the PI3K/AKT signaling pathway (Hustedt and Durocher, 2016). This in turn activates transcription factors such as MYC and β-catenin for the expression of cell cycle proteins, including cyclin D1 (Shtutman et al., 1999; Liao et al., 2007). GSK-3β is an important inhibitor of MYC and β-catenin as well as of cyclin D1 when cells are not ready for entry into S phase (Domoto et al., 2016). However, active AKT phosphorylates and inhibits GSK-3β, releasing its repression (McCubrey et al., 2016) (**Figure 3**).

Research has shown the importance of Pontin and Reptin in various steps of the G1/S cell cycle checkpoint pathway (**Figure 3**). Silencing of Pontin in lung adenocarcinoma led to the phosphorylation and degradation of cyclin D1, thus resulting in cell cycle arrest at G1/S (Yuan et al., 2016). Evidence suggested that Pontin acts upstream of GSK-3β through the AKT/GSK-3β/cyclin D1 pathway (**Figure 3A**), though how Pontin functions in the activation of this signaling pathway remains to be elucidated.

Additionally, Pontin and Reptin were shown to be interactors of MYC and β-catenin (see section Role of Pontin/Reptin in Mitosis for details), and thus can potentially regulate their transcriptional activity for production of cyclins (Bauer et al., 2000; Wood et al., 2000). In RCC cells, Pontin knockdown led to a decreased mRNA expression of both MYC and cyclin D1 (Zhang et al., 2015). Pontin and Reptin can also regulate the ability of MYC to enhance cell-cycle progression by stimulating its inhibition of the transcription factor MIZ1 (Etard et al., 2005). Consequently, its target p21, which inhibits cyclin protein activity, is transcriptionally repressed (**Figure 3B**) (Etard et al., 2005; Hustedt and Durocher, 2016).

Further downstream in the signaling pathway, Pontin may be needed for the dissociation of RB from E2F1 through the interaction with ecdysoneless (ECD) (**Figure 3C**). ECD is an evolutionarily conserved protein essential for embryogenesis and cell cycle progression into S phase (Kim et al., 2009). It competes with E2F1 for binding to RB, thus allowing E2F1 to freely activate its target genes. Pontin may facilitate efficient binding of ECD to RB and dissociation of RB from E2F1, as interaction with Pontin is required for ECD's ability to regulate progression of cell cycle (Mir et al., 2015). Since ECD also contains a DSDD motif and is shown to interact with PIH1D1, Pontin may function as part of the R2TP complex in this process (Horejsi et al., 2014). However, the interaction between PIH1D1 and ECD was shown not to be important for its cell cycle functions (Mir et al., 2015). It is also possible that Pontin and Reptin can promote E2F1 transcription in this context as part of the TIP60 histone acetyltransferase complex, since this complex was seen to be recruited by E2F1 in late G1 phase (**Figure 3D**) (Taubert et al., 2004).

#### Role of Pontin/Reptin in Mitosis

Pontin and Reptin may also play specific and perhaps essential roles in mitosis independent of each other. Whereas Pontin was largely implicated in the assembly of mitotic spindles, the function of Reptin remains to be uncovered (Gartner et al., 2003; Sigala et al., 2005; Ducat et al., 2008; Fielding et al., 2008; Gentili et al., 2015). However, both Pontin and Reptin undergo dramatic subcellular relocalization during mitosis and even display distinct localization signals within the intercellular bridge (**Figure 4**) (Sigala et al., 2005; Gentili et al., 2015).

During interphase, Pontin/Reptin are mostly nuclear (**Figure 4A**) (Gartner et al., 2003; Sigala et al., 2005). Upon entry into mitosis, Pontin/Reptin are increasingly redistributed to the cytoplasm, culminating in metaphase, where they are almost completely excluded from the condensed chromosomes (**Figure 4B**). Here, and until early anaphase, Pontin/Reptin are observed at mitotic spindles and centrosomes, co-localizing with both α- and γ-tubulin (**Figure 4C**) (Gartner et al., 2003; Sigala et al., 2005; Ducat et al., 2008). During anaphase-to-telophase transition, both relocate to the central spindle, first forming a compact band, then accumulating into distinct foci. At telophase, Pontin was found to form two foci that co-localized with β-tubulin at the sides of the cytokinetic furrow (**Figure 4D**). On the other hand, Reptin was found to form only one focus that was concentrated at the center of the midbody, separated from Pontin (Gentili et al., 2015).

Both Pontin and Reptin have been found to be part of the microtubule interactome, and were identified as candidate mitotic regulators in an RNAi-based phenotypic screen in Drosophila S2 cells (Bjorklund et al., 2006; Ducat et al., 2008). Depletion of Pontin led to multiple mitotic defects in a variety of mammalian cells, including increased mitotic death, delayed anaphase onset, and defective spindles, leading to misaligned and lagging chromosomes (Gartner et al., 2003; Ducat et al., 2008; Magalska et al., 2014; Gentili et al., 2015). Depletion of Reptin, on the other hand, had little effect by itself, and only enhanced the defects observed with Pontin depletion (Ducat et al., 2008). This suggested that Pontin is the main protein involved in promoting mitotic spindle assembly, likely through regulating the localization of the γ-tubulin ring complex (γ-TuRC) and integrin-linked kinase (ILK) to the mitotic spindle and centrosome (Gartner et al., 2003; Fielding et al., 2008).

γ-TuRC serves as the cap and initiation site for microtubule polymerization (Prosser and Pelletier, 2017). Both Pontin and Reptin were shown to interact with γ-TuRC and were required for the nucleation and organization of robust microtubule arrays in Xenopus egg extracts (Ducat et al., 2008). Perhaps Pontin and Reptin act as chaperones for the stability and localization of γ-TuRC to the spindle poles and along the microtubule array. ILK was recently found to be important in the centrosome for mitotic spindle organization, likely by maintaining the interaction between the spindle organization proteins Aurora A and TACC3/ch-TOG in a manner that is dependent on its kinase activity (Fielding et al., 2008). ch-TOG is required for spindle organization and microtubule polymerization, and Aurora A kinase recruits ch-TOG through phosphorylating TACC3. Consequently, depleting ILK led to spindle defects. Pontin and ILK co-localized in the centrosome and were dependent on each other for their localization (Fielding et al., 2008). Thus, they may form a co-complex during mitosis to function in the centrosome (Dobreva et al., 2008).

As mentioned above, later in mitosis, Pontin and Reptin relocalized to the central spindle and even seemed to separate from each other at the midbody (Sigala et al., 2005; Ducat et al., 2008; Gentili et al., 2015). This dissociation is likely regulated by Polo-like kinase 1 (PLK1), a mitotic kinase that was found to interact and co-localize with Pontin during cytokinesis (Gentili et al., 2015). PLK1 has many critical functions in mitosis, including proper mitotic entry, spindle assembly, centrosome maturation, and chromosome segregation (Petronczki et al., 2008; Otto and Sicinski, 2017). During cytokinesis, PLK1 is required for midbody formation and function, of which Pontin might be a mediator, as Pontin was also found to be a PLK1 substrate in vitro (Gentili et al., 2015). However, the specific functions and molecular mechanisms of Pontin and Reptin at the midbody remain to be characterized.

Pontin and Reptin have also been implicated at the end of mitosis in chromatin decondensation (Magalska et al., 2014). Here, they re-associate and were shown to exist largely as a heterocomplex again, although they are functionally redundant in this context and can act independent of one another (Magalska et al., 2014). ATPase-deficient mutants of either protein showed a dominant-negative effect on chromatin decondensation.

#### Role of Pontin/Reptin in the Regulation of Transcriptional Oncogenic Factors

Pontin and Reptin have long been recognized to regulate transcription through interaction with different transcription factors, many of which are highly involved in tumorigenesis, including MYC, β-catenin-LEF/TCF, and E2F, to name a few (Gallant, 2007; Huber et al., 2008; Grigoletto et al., 2011; Rosenbaum et al., 2013; Matias et al., 2015). The role of Pontin/Reptin in these contexts generally promotes cell proliferation and survival, which is crucial for cancer development (**Table 2**).

#### The Role of Pontin/Reptin in TIP60 Histone Acetyl Transferase Activity

Histone acetylation is an important strategy for the regulation of gene expression as it typically relaxes chromatin structure, allowing the binding of the transcriptional machinery to proper promoter regions (Desjarlais and Tummino, 2016). As a histone acetyltransferase, the TIP60 complex acts in a similar fashion and mostly functions as a co-activator of many transcriptional pathways. The TIP60 complex consists of proteins with chromatin remodeling activity such as Pontin/Reptin and p400, adaptor/scaffolding subunits such as TRRAP and DMAP1, histone binding proteins BRD8 and ING3, as well as the histone acetyltransferase TIP60, among others (Desjarlais and Tummino, 2016). The complex is involved in regulating chromatin remodeling, transcription, and DNA repair (Kusch et al., 2004; Zhao et al., 2016).

In the context of transcription, the TIP60 complex, or at least components of it, are recruited by several oncogenic transcription factors that are regulated by Pontin and Reptin. For example, the E1A 243R adenoviral oncoprotein was recently found to interact with subunits of the TIP60 complex including Pontin and Reptin as well as MYC (Zhao et al., 2016). E1A 243R promoted the interaction between TIP60 and MYC to form a supercomplex consisting of all three components, which was important for the cellular transformation activities of MYC and E1A (**Figure 5A**) (Dugan et al., 2002; Zhao et al., 2016). Other transcription factors regulated by Pontin/Reptin also recruit TIP60 including HIF1α and NF-κB, which are further discussed below.

#### Role in MYC Regulation

MYC is an oncogenic transcription factor that promotes cell proliferation by transcriptionally activating genes involved in cell cycle progression, protein synthesis, and ribosome biogenesis, including Pontin and Reptin (Dang, 2012). Recently, ChIP-seq analysis showed that MYC binds the promoter regions of both Pontin and Reptin (Walz et al., 2014). MYC also binds to the promoter of genes coding for cell-cycle inhibitors, such as p21 (Etard et al., 2005). MYC is a repressor of p21 through inhibition of the MIZ1 transcription factor (Etard et al., 2005). The direct binding of the two ATPases to MYC oncogenesis domain was shown to be important for MYC/MIZ1 interaction. Here Pontin and Reptin act as co-repressors in an additive manner, thus enhancing the repression of p21 by MYC (**Figure 5A**) (Wood et al., 2000; Etard et al., 2005).

Pontin and Reptin were also found to be essential for MYC-mediated oncogenic transformation and modulated MYC-induced apoptosis in an ATPase-dependent manner (**Table 2**): an ATPase-deficient mutant of Pontin enhanced apoptosis when MYC was overexpressed (Wood et al., 2000; Dugan et al., 2002). Apoptosis is a common strategy for cells to prevent transformation and uninhibited proliferation. Thus, inhibiting the ATPase activity of Pontin may prove to be therapeutically beneficial. Pontin and Reptin were also shown to be important for the repression of the tumor suppressor C/EBPδ and the Drosophila cell adhesion gene mfas, both of which are also target genes of MYC, further supporting their roles in MYC-mediated oncogenesis (Bellosta et al., 2005; Si et al., 2010).

On the other hand, Pontin and Reptin can act as activators of MYC-mediated transcription. Within the nucleolus, an interaction of Pontin and MYC at the rRNA promoter was observed (**Figure 5A**), though the function and the mechanistic aspects of this interaction are still unclear (Cvackova et al., 2008). Recent findings suggested that rRNA transcription might also be regulated by the R2TP complex indirectly through its interactions with mTORC1 (Kamano et al., 2013). Reptin was shown to activate MYC-dependent transcription of TERT (**Figure 5A**) in cooperation with ETS2, a transcription factor acting downstream of growth factor signaling (Li et al., 2010; Flavin et al., 2011).

It was recently revealed that MTBP (Mdm2-binding protein) may be involved in the interactions between Pontin/Reptin and MYC (Grieb et al., 2014). MTBP associated with MYC at its target promoters through direct binding with Pontin and Reptin (Grieb et al., 2014). Co-overexpression of MYC and MTBP resulted in a dramatic increase in proliferation and transformation experimentally, and correlated with a 10-year reduction in patient survival (Grieb et al., 2014). It would be interesting to investigate whether MTBP is also co-overexpressed with Pontin and Reptin in patient samples and whether MTBP is involved in regulating the interaction of Pontin and Reptin with other transcription factors and/or protein complexes.

#### Role in E2F1 Regulation

A role similar to that of Pontin/Reptin in MYC-mediated transformation and oncogenesis was observed for the transcription factor E2F1, an important regulator of the cell cycle, to which Pontin also directly binds (Dugan et al., 2002). Using a pre-clinical mouse model of HCC, Tarangelo et al. (2015) reported that Pontin and Reptin were recruited by E2F1 to open the chromatin at E2F1 target genes, which in turn enhanced the transcriptional response of metabolic genes during cancer progression (**Figure 5B**). Here, Pontin/Reptin act as co-activators for E2F1 (**Table 2**). However, whereas Reptin ATPase activity was required for chromatin remodeling, the role of Pontin seemed limited to stabilizing Reptin expression (Tarangelo et al., 2015). The authors suggested that the recruitment of Pontin and Reptin may be a common mechanism used by E2F1 to promote cancer progression. Through ChIP-seq studies, the authors also showed that the chromatin remodeling effects of Pontin and Reptin were not exerted through the TIP60 histone acetyltransferase (HAT) complex, of which Pontin and Reptin are subunits (as described above), since TIP60 was not observed at the promoters of E2F1 target genes in this model of HCC. However, TIP60 recruitment along with Pontin and Reptin by E2F1 was seen in the context of cell cycle gene transcription (**Figure 5B**) (Taubert et al., 2004).

Their cooperative action as co-activators was also observed in the context of regulating the nuclear receptors estrogen receptor (ER) and androgen receptor (AR), as well as the transcription factor complex interferon-stimulated gene factor 3 (ISGF3) (**Figure 5C** and **Table 2**) (Kim et al., 2007; Dalvai et al., 2013; Gnatovskiy et al., 2013).

#### TABLE 2 | Regulation of transcription factors by Pontin and Reptin.

#### Role in HIF1α Regulation

The TIP60 complex, including both Pontin and Reptin subunits, has recently been found to regulate the hypoxia pathway through co-activating the transcription factor HIF1α (hypoxia-inducible factor 1 alpha) (Perez-Perri et al., 2016). Transcriptome analysis showed that more than 60% of HIF1α target genes utilized either TIP60, CDK8-Mediator, or both as co-activators (Perez-Perri et al., 2016). In cancer, due to uncontrolled proliferation of cells, the tumor and its microenvironment are often deprived of oxygen (Wilson and Hay, 2011). This triggers the hypoxic response, which alters cellular metabolism for better adaptation (Perez-Perri et al., 2016). However, this often leads to angiogenesis, epithelial-to-mesenchymal transition (EMT), metastasis, apoptosis, and resistance to treatments (Wilson and Hay, 2011). Thus, understanding and modulating hypoxia activation is important for therapeutic targeting. TIP60 is recruited by HIF1α to its target genes for chromatin modification and RNA polymerase II activation (Perez-Perri et al., 2016). Both Pontin and Reptin were required for proper function of the TIP60 complex and consequently HIF1α transcription activity in this context (**Figure 5D**).

However, opposing roles of Pontin and Reptin have also been found for HIF1α activity (**Table 2**), perhaps independent of TIP60 (Lee et al., 2010, 2011): Pontin acted as an activator and Reptin as a repressor (**Figure 5D**). Whereas Pontin methylation by hypoxia-induced G9a and GLP recruited p300 (a co-activator with HAT activity), Reptin methylation by G9a seemed to recruit the histone deacetylase HDAC1 (Lee et al., 2010, 2011). Of interest, Pontin and Reptin were each found to regulate only a subset of hypoxia target genes, and these subsets largely did not overlap with one another (Lee et al., 2010, 2011; Matias et al., 2015). This suggested that HIF1α may interact with defined partner transcription factors that require different co-activators/repressors for its transcriptional regulation, providing flexibility under different cellular/environmental contexts. Taken together, understanding these interactions could provide better and more specific targeting strategies for cancer therapy.

#### Role in p53 Regulation

p53 is a transcription factor that has been studied extensively for its tumor suppression capabilities (Brown et al., 2009). Mutations in p53 that lead to tumorigenesis are a common feature in cancer (Muller and Vousden, 2013). These can result from a single substitution in its amino acid sequence, which enables p53 to attain new properties that promote proliferation, metastasis, and cell transformation in addition to the loss of its tumor suppressing functions (Muller and Vousden, 2013; Zhao et al., 2015). Such tumor-promoting p53 is termed gain-of-function mutant p53 (mutp53 GOF) (Muller and Vousden, 2013). Pontin was recently found to interact with mutp53 GOF and regulate its transcriptional activity for a subset of genes (**Figure 5E**; **Table 2**) (Zhao et al., 2015). This interaction promoted mutp53 GOF-mediated cell migration, invasion, and clonogenic potential in an ATPase-dependent manner (Zhao et al., 2015).

Reptin was found to interact with wild-type p53 and suppress its anti-tumor activity (**Table 2**) through an interaction with anterior gradient-2 (AGR2) protein, a potent inhibitor of p53-mediated transcription that promotes cancer cell proliferation, survival, and metastasis (**Figure 5E**) (Maslon et al., 2010; Gray et al., 2013; Ocak et al., 2014; Clarke et al., 2016). Reptin can also inhibit p53 through repressing transcription of p14ARF (alternate reading frame of CDKN2A) (**Figure 5E**) (Xie et al., 2012). p14ARF is a tumor suppressor that acts in both p53-dependent and -independent manners (Ozenne et al., 2010). In the context of p53, p14ARF binds to and inactivates MDM2, which in turn promotes the stabilization and activation of p53 (Sherr and Weber, 2000; Xie et al., 2012). Thus, as an inhibitor of p14ARF, Reptin promotes the proliferation of cancer cells.

#### Role in NF-κB Regulation

Nuclear factor-κB (NF-κB) is a family of dimeric transcription factors (p50, p52, RelA/p65, c-Rel, and RelB) activated by cellular stimuli such as oxidative stress, viral/bacterial antigen, and cytokines including TNFα and IL-1β (Moynagh, 2005). Their target genes control processes such as inflammation, cell proliferation, and cell survival (Tergaonkar, 2006). Thus, if constitutively active, unhealthy/genomically unstable cells that should normally die of apoptosis would remain in the population and lead to tumor development.

In the canonical pathway, NF-κB heterodimers are bound by IκB proteins, which sequester them in the cytoplasm and keep these transcription factors inactivated (**Figure 5F**) (Gilmore, 2006). A stimulus activates the IKK (IκB kinase) complex, which consists of IKKα, IKKβ, and NEMO (NF-κB essential modulator, also known as IKKγ) (Scheidereit, 2006). IKK is activated by monoubiquitination and then phosphorylates the IκB inhibitor, causing its eventual degradation. This allows NF-κB to translocate into the nucleus for its function (Scheidereit, 2006).

Within the cytoplasm, RPAP3 of the R2TP complex was found to bind and regulate NEMO of the IKK complex (Shimada et al., 2011). RPAP3 binding inhibited the monoubiquitination of NEMO, which prevented the activation of the IKK complex (**Figure 5F**) (Shimada et al., 2011). This leads to the repression of NF-κB transcription. Whether Pontin or Reptin functions together with RPAP3 as the R2TP complex in this context is uncertain.

Pontin and Reptin were seen to regulate NF-κB p65 transcription antagonistically (**Table 2**), where Pontin rescued Reptin repression of transcription of p65 in reporter assays (Qiu et al., 2015). The authors suggested that Reptin repression was mediated in part through interaction with p65 in the cytoplasm and perhaps prevented degradation of the regulatory element, IκB-α (**Figure 5F**) (Qiu et al., 2015). IκB-α binds to and masks the nuclear localization signal of NF-κB, sequestering p65 in the cytoplasm, and thus downregulating its transcriptional activity (Tergaonkar, 2006).

Pontin bound to TIP60 was thought to co-activate a subset of NF-κB targets in response to IL-1β, including metastasis suppressor KAI1 (Kim et al., 2005, 2006; Rowe et al., 2008). In normal cells, IL-1β induces the displacement of the NCoR/TAB2 co-repressor complex (consisting of NCoR, TAB2, MEKK1, and HDAC3) that normally binds p50 (**Figure 5F**) (Rowe et al., 2008). This allows the recruitment and binding of co-activators Bcl3 and the Pontin-TIP60 complex, consequent acetylation at histones H3 and H4, thus leading to transcriptional activation (Kim et al., 2005).

However, Reptin in complex with β-catenin was found as a co-repressor of the same set of genes. β-catenin is a gene transcription regulator involved in the Wnt signaling pathway (described in the following section) (Kim et al., 2005) (**Figure 5F**). In metastatic cells, increased β-catenin expression decreases TIP60 expression and prevents binding of the co-activator complex. β-catenin with Reptin form a co-repressor complex that binds p50 (Kim et al., 2005). Repression of KAI1 expression by the Reptin-β-catenin complex was thought to occur in part through recruitment of histone deacetylase HDAC1, which required Reptin sumoylation at K456 (**Figure 5F**) (Kim et al., 2006). Desumoylation of Reptin by SENP-1 prevented the association with HDAC1 and decreased nuclear localization of Reptin, allowing Pontin/TIP60 to bind and activate transcription (Kim et al., 2006).

In general, the two ATPases through their respective complexes were shown to bind NF-κB transcription factor p50 at the promoter region of KAI1 in a mutually exclusive manner (Kim et al., 2005, 2006). Thus, this represents another instance where Pontin and Reptin seem to act independently of each other, and even antagonistically.

#### Role in β-Catenin Regulation

β-catenin is another transcriptional regulator highly involved in oncogenesis. In the canonical Wnt-signaling pathway, it interacts with the LEF/TCF (lymphoid enhancing factor/T-cell factor) family of transcription factors to activate numerous genes involved in proliferation and survival (Macdonald et al., 2009). The pathway is often found activated in a variety of cancers, either through activating mutations in β-catenin itself or in proteins involved in Wnt-signaling (Morin, 1999; Macdonald et al., 2009). Pontin and Reptin possess opposing regulatory functions in this pathway (Bauer et al., 1998, 2000; Yakulov et al., 2013) (**Table 2**). The ATPases were shown in vitro to directly interact with β-catenin in the same region, and thus might exhibit competitive binding (Bauer et al., 1998, 2000). Similar to previous cases, such as for HIF1α- and NF-κB-dependent transcription, repression by Reptin is mediated by recruitment of HDAC. Here specifically, Reptin sumoylation was shown to be important for recruiting HDAC and consequently repressing the transcriptional activity of canonical β-catenin targets such as cyclin D1 (**Figure 5G**) (Bauer et al., 2000). Whether Pontin recruits TIP60 or other HATs for its co-activating activities on β-catenin remains to be investigated. Recently, an anti-apoptotic protein c-FLIP<sup>L</sup> (cellular FLICE-like inhibitory protein) was found to promote activation of β-catenin-dependent transcription by Pontin (Zhang et al., 2017). c-FLIP<sup>L</sup> increased binding of Pontin at target gene promoters by binding to Pontin using its DED (death-effector domain) (Zhang et al., 2017).

The role of Pontin and Reptin in β-catenin-LEF/TCF-mediated transcription may also be inhibited by other proteins. For instance, Hint1 (histidine triad nucleotide-binding protein 1) was found to suppress Pontin activation of β-catenin transcription, and APPL1/2 (adaptor proteins containing pleckstrin homology domain, phosphotyrosine binding domain, and leucine zipper domain) were shown to relieve the repression of transcription by Reptin (**Figure 5G**) (Weiske and Huber, 2005; Rashid et al., 2009). Hint1 is implicated in transcription regulation and growth control, and the HIT family of proteins, to which Hint1 belongs, is often found inactive in many carcinomas (Weiske and Huber, 2006). APPL1/2 are effectors of the small GTPase Rab5 and function in early steps of endocytosis (Rashid et al., 2009). Whereas Hint1 prevented Pontin-Pontin interactions, APPLs reduced the association between Reptin, HDAC, and β-catenin (Weiske and Huber, 2005; Rashid et al., 2009).

#### Role in the Regulation of Other Transcription Factors

Oct4, one of the main ESC (embryonic stem cell)-specific transcription factors, is essential for regulating embryonic development and the self-renewing property of ESCs (Shi and Jin, 2010). Pontin acts as a transcriptional co-activator of Oct4 for the expression of both genes required for ESC maintenance and lincRNAs (long intergenic non-coding RNAs) that repress the lineage differentiation program in ESCs through methyltransferases such as Ezh2 (**Figure 5H**) (Boo et al., 2015) (**Table 2**). Pontin activation of Oct4 targets is thought to be mediated by recruitment of p300 acetyltransferase (Boo et al., 2015). Reptin was also found to maintain pluripotency of ESCs, perhaps acting in complex with Pontin (Do et al., 2014) (**Table 2**).

Using proteomics, the transcription factors EVI1 (Ecotropic viral integration site-1) and C/EBP (CCAAT/enhancer-binding protein) alpha and beta were found to interact with Pontin and Reptin (**Figure 5H**) (Bard-Chapeau et al., 2013; Cirilli et al., 2016). EVI1 is an oncogenic transcription factor that is often overexpressed in cancers such as myeloid leukemia and epithelial cancers, while C/EBPα and β regulate processes such as cell proliferation, apoptosis and transformation (Bard-Chapeau et al., 2013; Cirilli et al., 2016). Although these interactions have been identified, their functional role and molecular mechanism are not known.

# Role of Pontin/Reptin in DNA Damage Response and Repair

Genomic instability is a hallmark of cancer. DNA damage response (DDR) and repair pathways are typically induced under these conditions (Jeggo et al., 2016). Failure to properly repair DNA damage allows accumulation of damage and results in genomic instability, promoting development of cancer (Ciccia and Elledge, 2010; O'Connor, 2015). On the other hand, the cytotoxicity of the DNA damage has been largely exploited for chemotherapy, but not without significant collateral damage and side effects (Deans and West, 2011; O'Connor, 2015). Thus recently, DDR has been explored for more targeted chemotherapy.

It is well known that Pontin and Reptin are involved in the DNA damage response due to their participation in protein complexes that are major players in this process (Grigoletto et al., 2011; Matias et al., 2015). Such complexes include the TIP60 complex mentioned previously and the chromatin remodeling complex INO80. Pontin and Reptin have recently been shown to interact with transcription factors RUNX2 (Runt-related transcription factor 2) and YY1 (Yin Yang 1) for processes involved in the DNA damage response (Wu et al., 2007; Lopez-Perrote et al., 2014; Yang et al., 2015). More recently, Pontin and Reptin were also found to be important for the stability of the Fanconi anemia (FA) core complex that functions in interstrand crosslink (ICL) repair (Rajendra et al., 2014). The two ATPases participate in these processes together as a heterohexamer and/or independently to provide a broad spectrum of responses to the various circumstances and stresses that a cell encounters.

#### Role in TIP60 Complex—H2AX Regulation

Phosphorylation of histone variant H2AX on Ser139 is one of the earliest events following DNA damage (O'Connor, 2015). Its abundant signal allows it to act as a sensitive marker for DNA damage and the repair that follows (Ciccia and Elledge, 2010). As subunits of the TIP60 complex, Pontin and Reptin are highly involved in the regulation of this signal.

After DNA damage, a histone H3 methylation site is exposed and the MRN complex (consisting of Mre11, Rad50, and Nbs1 proteins) binds to the damaged site. MRN then recruits TIP60 in complex with checkpoint kinase ATM and facilitates the interaction between TIP60 and methylated H3 (**Figure 6A**) (Sun et al., 2009). This interaction upregulates the acetyltransferase activity of TIP60. ATM is activated through acetylation by TIP60 and autophosphorylates for further activation (Sun et al., 2005). ATM is then able to phosphorylate histone H2AX and a host of DNA damage proteins, regulating downstream signaling (Ciccia and Elledge, 2010) (**Figure 6A**).

Depletion of Pontin after DNA damage increased the amount and lifetime of phosphorylated H2AX, which could be mimicked by TIP60 depletion (Jha et al., 2008). Since Pontin is required for the histone acetyltransferase activity of TIP60, this suggested that Pontin in complex with TIP60 was also important for the removal of phospho-H2AX (Kusch et al., 2004; Jha et al., 2013) (**Figure 6A**). In agreement, Ikura et al. (2007) found that TIP60 acetylation of H2AX mediates its release from chromatin (**Figure 6A**). On the other hand, TIP60 acetylation of H4 is also required for curbing the phospho-H2AX signal (Jha et al., 2008).

Conflicting results have been reported regarding Reptin depletion. Whereas Ni et al. (2009) found that Reptin depletion increased H2AX phosphorylation following UV irradiation in HeLa cells, Raymond et al. (2015) showed that etoposide or γ irradiation of HuH7 and Hep3B cells, which produced double-strand breaks (DSBs), led to reduced phosphorylation of H2AX upon Reptin depletion, and thus resulted in defective repair. Further studies are needed to explain whether these differences are due to the nature of the DNA lesion, source of damage or cell type specificity. Raymond et al. (2015) also found that DSB repair was regulated by Reptin in part through stabilizing DNA-PKcs (a member of the PIKK family). Thus, overexpression of Reptin in chemoresistant ovarian and breast cancers could confer higher DNA damage repair abilities and partly explain their resistance to therapy (Yang et al., 2012).

#### Role in TIP60 Complex—Homologous Recombination

The DSB is the most toxic and dangerous type of DNA damage, as it can lead to loss of genetic material if left unresolved (Ciccia and Elledge, 2010). The two main repair strategies for DSBs are homologous recombination (HR) and non-homologous end joining (NHEJ) (Clarke et al., 2017). As their names suggest, HR uses a template DNA (the sister chromatid) for repair, and thus is less error prone, while NHEJ is an inaccurate process known to lead to genomic instability and thus cancer susceptibility (Ciccia and Elledge, 2010; Jeggo et al., 2016). However, HR is restricted to late S/G2 phase due to its requirement for a template sequence, whereas NHEJ can occur any time during the cell cycle (Clarke et al., 2017). The main factor determining which repair pathway is used is the extent of DNA end processing, which is controlled by the 53BP1 (p53 binding protein 1)-containing complex that protects the ends from overprocessing (Tang et al., 2013). The HR pathway requires the dissociation of 53BP1 for extensive end resection by specialized machinery, which is facilitated by the recruitment of BRCA1 (breast cancer early onset 1) (Kusch et al., 2004; Tang et al., 2013). Competitive binding of BRCA1 and 53BP1 at DSB sites on the chromatin determines the pathway choice between HR and NHEJ: BRCA1 binding promotes HR and 53BP1 binding promotes NHEJ (Clarke et al., 2017).

Pontin and Reptin were found to be involved in HR through both TIP60 and INO80 complexes. TIP60 acetylates histone H4 at K16, disrupting the interaction between the H4K16 residue and 53BP1 (Sun et al., 2009; Tang et al., 2013). This, in combination with recruitment of TIP60 complex subunit MBTD1 to the methylation site on H4 at K20 displaces 53BP1 from the histone tail (**Figure 6A**) (Jacquet et al., 2016). It was recently found that PRMT5 (protein arginine methyltransferase 5) methylated Pontin at R205 and that this was required for the acetyltransferase activity of TIP60 and, consequently, for the mobilization of 53BP1 (**Figure 6A**) (Clarke et al., 2017). Thus, it was not surprising that TIP60-deficiency led to impaired HR and conferred sensitivity to DNA-damaging anticancer therapy based on poly ADP ribose polymerase (PARP) inhibition, a phenotype mimicked by Pontin depletion (Tang et al., 2013).

#### Role in INO80 Complex

It was known that INO80 also facilitates HR, but the molecular mechanism had been unclear (Wu et al., 2007; Tsukuda et al., 2009; Gospodinov et al., 2011). Gospodinov et al. (2011) found that INO80 mediates resection of double-strand break ends and is required for the formation of replication protein A (RPA) foci. RPA functions to prevent single-stranded DNA created during resection from forming secondary structures or winding back onto itself (Ciccia and Elledge, 2010). Alatwi and Downs (2015) reported that depletion of INO80 after DNA damage led to defective RAD51 foci formation, a phenotype also seen with TIP60 depletion. RAD51 is recruited to resected ends of the damaged DNA and is the primary mediator of strand invasion and recombination for the HR pathway (Ciccia and Elledge, 2010). The role of INO80 in both processes might be to remove histone variant H2A.Z for the resolution of repair in complex with YY1 (**Figure 6B**) (Alatwi and Downs, 2015). As subunits of the INO80 complex, Pontin and Reptin were also seen to accumulate at DSBs (Alatwi and Downs, 2015). Their ATPase activity was required for the formation of RAD51 foci, through direct interaction and in cooperation with the YY1 transcription factor (Wu et al., 2007; Lopez-Perrote et al., 2014).

#### Role in Fanconi Anemia DNA Repair Pathway

Pontin and Reptin were shown to be involved in yet another DNA repair pathway, the Fanconi anemia (FA) pathway, responsible for the repair of interstrand crosslinks (Deans and West, 2011). It was recently demonstrated that Pontin and Reptin interacted directly with the FA core complex, and regulated the abundance of the FA subunits at both the protein and mRNA levels (Rajendra et al., 2014). Depletion of these two ATPases resulted in sensitivity to DNA crosslinking agents, chromosome aberrations and defective FA pathway activation (Rajendra et al., 2014).

Pontin and Reptin can regulate the FA core complex either directly or through maintaining the stability of its upstream activator the serine/threonine-protein kinase ATR (**Figure 6C**) (Rajendra et al., 2014), which is a member of the PIKK family. ATR also activates the FANCI and FANCD2 dimer through phosphorylation (Deans and West, 2011). This activation is completed by the monoubiquitination of the dimer by the FA core complex (Ceccaldi et al., 2016). The dimer participates in subsequent recruitment of nucleases and other proteins important for interstrand crosslink repair (Rajendra et al., 2014; Ceccaldi et al., 2016).

# Role of Pontin/Reptin in Epithelial-Mesenchymal Transition (EMT)

The role of Pontin and Reptin in cell migration and invasion has recently been investigated, although it is not yet well elucidated (Ren et al., 2013; Taniuchi et al., 2014; Zhang et al., 2015; Breig et al., 2016). As mentioned above, Pontin and Reptin were originally thought of as nuclear proteins; however, accumulating research demonstrated their cytoplasmic localization as well, ranging from partial to predominantly cytoplasmic (Grigoletto et al., 2011). This was also recently found to have clinical significance, perhaps acting through the epithelial to mesenchymal transition (EMT) pathway.

#### Cytoplasmic Localization of Pontin/Reptin

Localization of Pontin and Reptin in the cytoplasm seems to be a common marker for cancer metastasis and involvement in cell migration. High cytoplasmic expression of the proteins was correlated with poor prognosis and metastatic progression in patients with HCC and RCC (Ren et al., 2013; Zhang et al., 2015; Breig et al., 2016). Cytoplasmic localization of Pontin was also reported in human colorectal cancer (CRC) and lymphoma tissue sections, as well as in PDAC cells, embryonic stem cells (ESCs), and HeLa cells; while Reptin cytoplasmic localization was found in HEK293 cells, HeLa cells and adipocytes (Makino et al., 1998; Mizuno et al., 2006; Lauscher et al., 2007; Ni et al., 2009; Xie et al., 2009; Taniuchi et al., 2014; Baron et al., 2016). This suggested the presence of functions of Pontin and Reptin specific to the cytoplasm, outside of their roles in chromatin remodeling, DNA damage response, and transcriptional regulation within the nucleus.

Depletion of Pontin and Reptin in RCC cells (A498, 786-O), where their expression was predominantly cytoplasmic, significantly inhibited cell migration and invasion ability (Ren et al., 2013; Zhang et al., 2015). A similar phenotype was observed upon Pontin/Reptin silencing in many other cancer models such as prostate cancer (LNCaP), HCC (HuH7, Hep3B), PDAC (S2-013), and hypoxia-treated breast cancer cells (MCF7) (Kim et al., 2006; Rousseau et al., 2007; Lee et al., 2010; Taniuchi et al., 2014). However, the molecular mechanism by which this occurs is currently unclear.

#### Role in Regulating EMT-Associated Cellular Events

Analysis of human RCC tissue samples showed an overexpression of Pontin and decreased expression of E-cadherin (an epithelial marker) compared to normal renal tissue (Zhang et al., 2015). Loss of E-cadherin expression is a hallmark of EMT, followed by disassembly of epithelial cell-cell junctions, loss of apical-basal polarity, reorganization of cortical cytoskeleton and increased cell mobility (**Figure 7**) (Lamouille et al., 2014). This allows tumor cells that have undergone EMT to disseminate to distant sites, become resistant to apoptosis and senescence, and act as cancer stem cells (CSCs) (Marcucci et al., 2016). Mounting evidence showed that EMT is a crucial mechanism for malignant transformation and metastatic progression (Marcucci et al., 2016). Intriguingly, Pontin depletion dramatically increased E-cadherin expression, which suggested that Pontin/Reptin may also promote cell migration and invasion through the EMT pathway (Zhang et al., 2015). Furthermore, decreased vimentin expression (a mesenchymal marker) was observed after silencing either Pontin or Reptin, further supporting this hypothesis (Zhang et al., 2015).

FIGURE 7 | Role of Pontin/Reptin in the EMT pathway. Summary of the EMT pathway and the stages at which Pontin and Reptin potentially function to promote cell invasion and migration. (A) Pontin promotes F-actin polymerization and G-actin local concentration. (B) Reptin promotes EGFR signaling through regulating the mRNA and protein expression of meprin α. (C) Pontin and Reptin activate PI3K-AKT-mTOR intracellular signaling at multiple stages. (D) Pontin and Reptin interact with cell survival/proliferation transcription factors (shown in orange) to promote transcription of many genes involved in metastasis, which may include several master EMT transcription factors (shown in blue).

In addition to the changes in expression profiles of various cell adhesion and cell junction genes, EMT is also helped by changes in the cell matrix and cytoskeleton through reorganization of actin and intermediate filaments. PDAC is the most common type of pancreatic cancer, and one of the most difficult to treat due to its aggressive and highly metastatic nature. Taniuchi et al. (2014) found that Pontin promotes invasiveness and migration of PDAC cells through a direct interaction with actin filaments at cell protrusions (**Figure 7A**). Pontin mediated actin polymerization by binding filamentous-actin (F-actin), which enhanced elongation of existing actin filaments. Globular-actin (G-actin) is the monomeric building block for F-actin and sufficient concentration is needed for efficient assembly of filaments. Even though Pontin did not interact with G-actin, it increased the localization of G-actin at cell protrusions, allowing increased F-actin structures and actin-based motility. Knockdown of Pontin decreased peripheral actin rearrangements, thus inhibiting formation of cell protrusions (Taniuchi et al., 2014). This in turn repressed the motility and invasiveness of PDAC cells, which can prove to be valuable for therapeutic targeting.

#### Role in EMT Pathway Signaling

The cellular changes in EMT are controlled by a complex underlying molecular mechanism and crosstalk between many signaling pathways such as TGFβ, WNT, EGF, Notch, and IL-6 (**Figure 7**) (Lamouille et al., 2014). Reptin may enhance activation of these receptors through its interaction with meprin α (MEP1A), a secreted metalloproteinase with pro-angiogenic and pro-migratory activity (Lottaz et al., 2011; Minder et al., 2012). Many of its targets are highly relevant for cancer and the EMT pathway (Broder and Becker-Pauly, 2013; Breig et al., 2016). Meprin α mediated the transactivation of EGFR signaling in colorectal cancer and has a possible role in invasion and metastatic dissemination (Minder et al., 2012). In the HCC context, Breig et al. (2016) found meprin α to be a downstream mediator for Reptin-dependent migration and cell invasion (**Figure 7B**). Exogenous meprin α was able to restore migration and invasion capabilities but not proliferation in Reptin-silenced cells. Reptin also regulated both mRNA and protein expression of meprin α, though the mechanism has yet to be elucidated. Furthermore, the expression levels of Reptin and meprin α were correlated in patient samples. Expression of either protein was also independently found to be correlated with poor differentiation and low post-operative survival, supporting their potential involvement in EMT (Breig et al., 2016).

In a cancer context, stimuli such as hypoxia and mechanical stress from the tumor microenvironment and/or overexpression of pathway components can work cooperatively to induce EMT (Marcucci et al., 2016). These will activate a number of intracellular signaling pathways including the PI3K-AKT-mTOR pathway and the RAS-RAF-MAPK pathway that are also highly involved in sustaining cancer cell growth and proliferation (**Figure 7C**) (Lamouille et al., 2014; Marcucci et al., 2016). Pontin and Reptin can potentially promote EMT through the PI3K-AKT-mTOR pathway since they have been found to be important for mTORC1 stabilization and activation (Kim et al., 2013). Pontin silencing also reduced the levels of AKT protein in lung adenocarcinoma, which suggests that it may also function in the upstream activation of the pathway or can stabilize AKT as well (**Figure 7C**) (Yuan et al., 2016).

#### Role in EMT Transcriptional Regulation

Transcription factors such as NF-κB, STAT3, HIF1α, and β-catenin are upregulated downstream of the intracellular signaling pathways activated in EMT, and in turn induce the expression and activation of a pool of EMT-promoting master transcription factors such as TWIST, SNAILs, and ZEBs (**Figure 7D**) (Lamouille et al., 2014). These directly control the expression of genes associated with epithelial and mesenchymal phenotypes including E-cadherin, fibronectin, and vimentin (Lamouille et al., 2014).

Pontin and Reptin have been found to regulate β-catenin transcription with opposing effects: Pontin enhancing transcription and Reptin repressing it (**Figure 7D**) (Bauer et al., 2000). Nuclear β-catenin expression significantly decreased following Pontin depletion in RCC, supporting a hypothesis that during oncogenesis, Pontin may upregulate β-catenin transcription targets by relocating β-catenin from the adherens junctions to the nucleus, thus increasing transcription of oncogenes such as MYC (Ren et al., 2013). This is consistent with Lauscher et al.'s (2007) observation where nuclear co-localization of Pontin and β-catenin was correlated with progression of colorectal cancer. In addition, Kim et al. (2005) showed the importance of Reptin in repressing the anti-metastatic gene KAI1. Pontin and Reptin were also shown to regulate transcription in the hypoxia pathway, as well as in promoting expression of interferon-stimulated genes (**Figure 7D**) (Lee et al., 2010, 2011; Gnatovskiy et al., 2013; Perez-Perri et al., 2016).

#### DRUG TARGETING

As described above, Pontin and Reptin are clearly important proteins for the proliferation and survival of cells. Through their diverse cellular functions, they are highly relevant for the progression of cancer, and this makes them novel therapeutic anticancer drug targets. Conditional silencing of Reptin in xenografts of HCC in mice led to arrest of tumor development and even regression of tumors in several mice, likely through tumor cell senescence (Menard et al., 2010). Conditional hemizygous knockout of Pontin in mice resulted in significantly smaller tumor formation 6 months after induction of cancer (Bereshchenko et al., 2012). However, the same mice when examined at 9–12 months showed accelerated progression of cancer and tumor formation, suggesting that long-term Pontin inhibition may pose unforeseen risks in vivo (Bereshchenko et al., 2012). This is to be expected given the many fundamental roles of Pontin in normal cells.

Preliminary in silico drug screening for Pontin ATPase inhibitors was able to identify several novel compounds using both molecular docking and in vitro ATPase assays (Elkaim et al., 2012). Three of these compounds showed anti-proliferative activities on tumor cells. Using the same method on a pool of pre-existing molecules, two specific compounds were identified that competitively bound to the ATP-binding pocket (Elkaim et al., 2014). One compound induced apoptosis as well as necrosis in cellular assays and thus has the potential to be developed further as a therapeutic. In addition, high-throughput screening by Daiichi-Sankyo Co. LTD. in Japan also identified compounds that were selective Pontin/Reptin ATPase inhibitors (Patent WO2015125786A1). These compounds inhibited tumor cell proliferation in vitro as well as demonstrated anti-tumor activities in vivo with human xenografts in mice.

There is accumulating evidence that Pontin and Reptin functions are tightly regulated on multiple levels, including transcription, oligomeric state, subcellular localization, and interacting partners. Recent studies suggested that different post-translational modifications such as methylation, sumoylation, and phosphorylation may partially modulate the specificity of their roles. Thus, developing drugs to target these post-translational modifications and/or their specific interactions may be more beneficial and less toxic than inhibiting the overall activities of these proteins.

### CONCLUDING REMARKS

Consistent with the overexpression of Pontin and Reptin in many cancer types, the two AAA+ proteins are found to regulate many fundamental cellular pathways involved in cell proliferation and survival. These include the assembly of macromolecular complexes as part of the R2TP complex, regulation of cell cycle checkpoint and mitosis, regulation of oncogenic transcription factors, regulation of DNA damage response as well as repair, and promotion of epithelial to mesenchymal transition, among others. Much is still unknown about the molecular mechanisms that Pontin and Reptin use to facilitate these processes or the repertoire of interactors involved. It is, therefore, crucial to shed further light on the roles that Pontin and Reptin perform in the cell to accelerate the discovery of novel therapeutics for various types of cancer.

#### AUTHOR CONTRIBUTIONS

Y-QM and WAH wrote and edited the manuscript.

# FUNDING

Y-QM was supported by a fellowship from the Centre for Pharmaceutical Oncology at the University of Toronto. This work was funded by Canadian Institutes of Health Research grants (MOP-93778 and MOP-130374) to WAH.

#### ACKNOWLEDGMENTS

We thank members of the WAH group for their critical comments on this manuscript.


human RuvB-like RuvBL1 and RuvBL2 complexes. Biochem. J. 429, 113–125. doi: 10.1042/BJ20100489


using feature selection and decision tree methods. Sci. World J. 2013:782031. doi: 10.1155/2013/782031


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Mao and Houry. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.