Nuclear-encoded factors associated with the chloroplast transcription machinery of higher plants

Plastid transcription is crucial for plant growth and development. Plastids contain two types of RNA polymerases: a nuclear-encoded RNA polymerase (NEP) and a plastid-encoded RNA polymerase (PEP). PEP is the major RNA polymerase activity in chloroplasts. Its core subunits are encoded by the plastid genome, and these are embedded in a larger complex of nuclear-encoded subunits. Biochemical and genetic analyses have identified at least 12 proteins that are tightly associated with the core subunits, while about 34 further proteins are associated more loosely, generating larger complexes such as the transcriptionally active chromosome (TAC) or forming part of the nucleoid. Domain analyses and functional investigations suggest that these nuclear-encoded factors may form several functional modules that mediate regulation of plastid gene expression by light, redox, phosphorylation, and heat stress. Genetic analyses have also identified some nuclear-encoded chloroplast proteins that are important for plastid gene expression, although a physical association with the transcriptional machinery has not been observed. These include several pentatricopeptide repeat (PPR) proteins, namely CLB19, PDM1/SEL1, OTP70, and YS1, which are involved in the processing of transcripts for PEP core subunits, as well as AtECB2, Prin2, SVR4-like, and NARA5, which are also important for plastid gene expression, although their functions are unclear.


INTRODUCTION
Plastids are specific organelles in plant and algal cells that are responsible for photosynthesis and several important metabolic pathways. They possess their own genetic material and are generally considered to be of endosymbiotic origin (McFadden and van Dooren, 2004). As in bacteria, the DNA is organized into dense particles, the nucleoids (Pfalz and Pfannschmidt, 2013). The plastid genome of vascular plants ranges in size from 120 to 180 kbp, and the encoded gene set is highly conserved (Sugiura, 1992). The genes can be categorized into three groups according to the molecular function of the encoded components: (1) components of the plastid gene expression machinery (RNA polymerase, ribosomal proteins, tRNAs, and rRNAs); (2) subunits of photosynthesis-related complexes (Rubisco, PSII, the cytochrome b6f complex, PSI, NADH dehydrogenase, and ATP synthase); and (3) a few proteins involved in other processes (e.g., ClpP1 and YCF3) (Sugiura, 1992). The chloroplast proteome is estimated to comprise between 2100 and 3600 proteins (Leister, 2003). Because of the limited coding capacity of the chloroplast genome, most chloroplast proteins are encoded by the nuclear genome and imported from the cytosol (Li and Chiu, 2010). Nevertheless, chloroplast gene expression is essential for the development of chloroplasts and the maintenance of chloroplast functions. It involves the action of numerous nuclear-encoded factors in addition to proteins encoded by the plastome. Recently, proteomic data (Pfannschmidt et al., 2000; Ogrzewalla et al., 2002; Suzuki et al., 2004; Pfalz et al., 2006; Steiner et al., 2011; Melonek et al., 2012) and genetic analyses (Chi et al., 2008; Ogawa et al., 2009; Wu and Zhang, 2010; Qiao et al., 2011, 2013; Kindgren et al., 2012; Pyo et al., 2013; Yu et al., 2013) have shown that numerous nuclear-encoded proteins with various functions are associated with the transcriptional machinery and are involved in chloroplast gene expression.
In this review, we focus on these nuclear-encoded factors involved in chloroplast transcription.

TWO TYPES OF PLASTID RNA POLYMERASES IN HIGHER PLANTS
Plastid genes are transcribed by two RNA polymerases, the nuclear-encoded RNA polymerase (NEP) and the plastid-encoded RNA polymerase (PEP). NEP is a phage-type RNA polymerase with a single subunit (Chang et al., 1999; Lerbs-Mache, 2011). In Arabidopsis, the nuclear genome encodes three NEPs: RpoTp is targeted to chloroplasts, RpoTm is targeted to mitochondria, and RpoTmp is dually targeted to both organelles (Hess and Borner, 1999). NEP is important for plant development. Inactivation of RpoTp results in defects in plastid gene expression and leaf development (Hricová et al., 2006; Swiatecka-Hagenbruch et al., 2008), while plants with inactivated RpoTmp exhibit several defects, including a plastid gene expression defect, delayed greening, and growth retardation of leaves and roots (Courtois et al., 2007). Dysfunction of both plastid-targeted NEPs results in seedling lethality at a very early developmental stage (Hricová et al., 2006). Although NEP is generally considered to be a single-subunit RNA polymerase, biochemical analysis has revealed that RpoTmp interacts with a thylakoid RING-H2 protein. This protein might mediate the fixation of RpoTmp to thylakoid membranes in order to regulate the transcription of the plastid rrn genes (Azevedo et al., 2008).
PEP is composed of four core subunits encoded by the genes rpoA, rpoB, rpoC1, and rpoC2, which are located on the plastid genome. PEP is sensitive to inhibitors of bacterial transcription, such as tagetitoxin and the group of rifampicin-related drugs, indicating a distinct degree of conservation of this eubacterial-type RNA polymerase during evolution (Liere et al., 2011). As with bacterial RNA polymerases, the activity and specificity of the PEP core enzyme are regulated by sigma-like transcription factors, which in higher plants are encoded by the nuclear genome. Arabidopsis has six chloroplast sigma factors (SIG1-SIG6). These sigma factors might have overlapping as well as specific functions in recognizing particular sets of promoters during chloroplast development (Schweer, 2010; Liere et al., 2011). Besides the sigma factors, however, the core subunits of PEP are also associated with additional proteins (see below) that confer a number of additional functions on the PEP complex.
NEP and PEP play different roles in plastid gene transcription during plastid development and plant growth (Liere et al., 2011). Based on their transcription by the different RNA polymerases, plastid genes can be grouped into three classes (Hajdukiewicz et al., 1997; Ishizaki et al., 2005). Transcription of photosynthesis-related genes (such as psbA, psbD, and rbcL) depends largely on PEP (class I), whereas a few housekeeping genes (mostly encoding components of the transcription/translation apparatus, such as rpoB) are exclusively transcribed by NEPs (class III). Most plastid genes, however, are transcribed by both PEP and NEPs (class II). Generally, NEP is more active in young, non-green tissues early in leaf development. It transcribes housekeeping genes, including those for the four core subunits of PEP, whose products primarily constitute the plastid gene expression machinery. Once PEP is formed in later developmental stages, it transcribes the photosynthesis-related genes (Hajdukiewicz et al., 1997; Lopez-Juez and Pyke, 2005; Schweer et al., 2010b) and plastid tRNAs (Williams-Carrier et al., 2014). In the mature chloroplast, NEP activity is barely detectable, while PEP activity remains high to support chloroplast development and plant growth. Nevertheless, both NEP and PEP have been shown to be present in seeds, and PEP is also important for seed germination. This indicates that PEP also exists in non-photosynthetically active seed plastids (Demarsy et al., 2006).
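The three-class scheme described above can be sketched as a simple decision rule. The following Python snippet is only an illustration under simplifying assumptions: the names `GeneClass`, `classify_gene`, and `EXAMPLES` are hypothetical, polymerase dependence is reduced to two yes/no flags (although class I genes are described as depending "largely", not exclusively, on PEP), and the class II entry is a placeholder because no class II gene is named in the text.

```python
from enum import Enum


class GeneClass(Enum):
    """Plastid gene classes defined by the transcribing polymerase(s)."""
    CLASS_I = "class I: largely PEP-dependent"
    CLASS_II = "class II: transcribed by both PEP and NEP"
    CLASS_III = "class III: exclusively NEP-dependent"


def classify_gene(transcribed_by_pep: bool, transcribed_by_nep: bool) -> GeneClass:
    """Assign a gene to class I, II, or III from two simplified boolean flags."""
    if transcribed_by_pep and transcribed_by_nep:
        return GeneClass.CLASS_II
    if transcribed_by_pep:
        return GeneClass.CLASS_I
    if transcribed_by_nep:
        return GeneClass.CLASS_III
    raise ValueError("gene is not transcribed by either polymerase")


# (PEP, NEP) flags. psbA, psbD, rbcL, and rpoB follow the examples in the text;
# "hypothetical_class_II_gene" is an invented placeholder, not a real gene name.
EXAMPLES = {
    "psbA": (True, False),
    "psbD": (True, False),
    "rbcL": (True, False),
    "rpoB": (False, True),
    "hypothetical_class_II_gene": (True, True),
}

if __name__ == "__main__":
    for gene, (pep, nep) in EXAMPLES.items():
        print(f"{gene}: {classify_gene(pep, nep).value}")
```

In practice, polymerase dependence is quantitative and varies with developmental stage, so a boolean flag is a deliberate simplification of the classification.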

PEP IS ASSOCIATED WITH NUMEROUS NUCLEAR-ENCODED PROTEINS
Early biochemical analysis demonstrated that two different forms of the PEP complex exist in higher plants, namely PEP-A and PEP-B (Pfannschmidt and Link, 1994). PEP-B is composed only of the rpo core subunits and is present in both etioplasts and greening chloroplasts. During light-dependent chloroplast development, this PEP-B enzyme is reconfigured into a eukaryote-like enzyme complex, PEP-A, by association with numerous proteins (Pfannschmidt and Link, 1997; Steiner et al., 2011; Pfalz and Pfannschmidt, 2013). PEP-A is the major RNA polymerase in mature chloroplasts of higher plants. For many years, efforts have focused on isolating the plastid RNA polymerase complex and its associated proteins (Pfalz and Pfannschmidt, 2013). Biochemical analyses revealed that the core rpo subunits of PEP are present in both the insoluble RNA polymerase preparation called the transcriptionally active chromosome (TAC) and the soluble RNA polymerase preparation (sRNAP) (Krause and Krupinska, 2000; Pfalz et al., 2006; Melonek et al., 2012). The TAC fraction is isolated from lysed plastids through one or two gel filtration chromatography steps and subsequent ultracentrifugation, while sRNAP is prepared from isolated and lysed plastids via several chromatographic purification steps without precipitation by centrifugation (Pfalz and Pfannschmidt, 2013). Based on gel filtration and mass spectrometry analyses from different organisms, including Nicotiana tabacum (Suzuki et al., 2004), spinach (Melonek et al., 2012), mustard (Sinapis alba) (Pfannschmidt et al., 2000; Pfalz et al., 2006; Steiner et al., 2011), and Arabidopsis (Pfalz et al., 2006), it is estimated that the TAC complex contains 43 nuclear-encoded proteins (Table 1). Ten proteins were reproducibly found to be tightly associated with PEP core subunits in mustard seedlings and were therefore named polymerase-associated proteins (PAPs) (Steiner et al., 2011).
The other proteins were found in the previously reported TAC complex and might represent more loosely attached components of the transcription machinery (Pfalz et al., 2006). Two TAC components, pTAC7 and MurE-like (Garcia et al., 2008), were not identified as PAPs in mustard (Steiner et al., 2011); however, based on the phenotypes of the corresponding T-DNA inactivation mutants of Arabidopsis, these two proteins were proposed to be PAPs (Pfalz and Pfannschmidt, 2013). One common feature of all PAPs is that they are essential for PEP activity. The Arabidopsis knock-out lines for the corresponding genes all show an albino/ivory or pale-green phenotype with severe defects in chloroplast development and PEP-dependent transcription (Table 1) (Pfalz et al., 2006; Garcia et al., 2008; Myouga et al., 2008; Arsova et al., 2010; Schröter et al., 2010; Gao et al., 2011; Steiner et al., 2011; Gilkerson et al., 2012; Yagi et al., 2012; Yu et al., 2013). The phenotype of these PAP mutants is identical to that of rpo-gene knock-out mutants in tobacco (Allison et al., 1996; Hajdukiewicz et al., 1997; De Santis-MacIossek et al., 1999). In the knock-out mutants of AtECB1/SVR4/MRL7 (Qiao et al., 2011; Yu et al., 2014), PEP-Related Development Arrested 1 (PRDA1) (Qiao et al., 2013), and Delayed Greening 1 (DG1) (Chi et al., 2008), the expression of PEP-dependent chloroplast genes is also severely reduced. These proteins were not identified in the PEP complex by previous proteomic analyses (Krause and Krupinska, 2000; Suzuki et al., 2004; Pfalz et al., 2006; Steiner et al., 2011). Nevertheless, they interact with some members of the PEP/TAC complex (Chi et al., 2010; Qiao et al., 2011, 2013; Kindgren et al., 2012; Yu et al., 2014) and are probably either loosely or temporarily attached.
Based on proteomic analysis and protein interaction investigation, the TAC complex contains at least 50 proteins, of which 46 are nuclear-encoded (Tables 1, 2). These nuclear-encoded proteins can be classified into several groups, including DNA/RNA binding proteins, thioredoxin proteins, kinases, ribosomal proteins, and proteins with unknown function (Table 1).

Table 1 note: Localization information is from individual GFP-fusion experiments or immunological analyses.
Yeast two-hybrid and other biochemical assays revealed interactions among some proteins in the PEP complex (Figure 1). The interactions between these PAPs are consistent with the biochemical experiments that identified these proteins in the PEP complex under stringent conditions (Steiner et al., 2011). To date, no protein in the PEP complex has been shown to interact directly with the PEP core subunits. Immunoprecipitation analysis demonstrated that pTAC3 is associated with the rpo subunits (Yagi et al., 2012); however, a direct interaction between pTAC3 and the PEP core subunits has not been verified.

PROTEINS IN THE PEP COMPLEX WITH DNA/RNA BINDING DOMAIN
The eukaryotic transcriptional machinery consists of RNA polymerases and various DNA-binding proteins, such as transcription factors. These DNA-binding proteins recognize the promoter to regulate downstream gene transcription. In the TAC complex, there are at least 14 proteins with DNA-binding domains (Table 1) (Pfalz et al., 2006; Steiner et al., 2011; Pfalz and Pfannschmidt, 2013). pTAC3 belongs to the SAP protein family. The ptac3 mutant exhibits an albino phenotype with reduced PEP-dependent plastid transcription. It is not yet clear whether pTAC3 can bind to a specific DNA region in order to regulate plastid gene transcription (Yagi et al., 2012). pTAC6 is essential for chloroplast transcription (Pfalz et al., 2006): expression of the psbA gene was barely detectable in the ptac6 mutant, in contrast to the ptac2 and ptac12 mutants (Pfalz et al., 2006). It is likely that pTAC6 is a specific regulator of psbA (Pfalz et al., 2006); however, its function remains enigmatic to date. In bacteria, there are two transcription termination mechanisms: Rho-independent termination and Rho-dependent termination. The mitochondrial transcription termination factor (mTERF) family was identified as regulating mitochondrial gene expression, including transcription termination (Kleine, 2012). pTAC15 is a member of the mTERF protein family (Pfalz et al., 2006). Whether it can terminate the transcription of PEP-dependent plastid genes remains to be verified. The TAC complex contains at least six RNA-binding proteins, including ZmWhy1, pTAC10, the elongation factor EF-Tu, and three ribosomal proteins, S3, L12-A, and L26 (Table 1). Whirly proteins belong to a small nuclear transcription factor family commonly found in plants. In Arabidopsis, pTAC1/AtWhy1 and pTAC11/AtWhy3 can bind DNA (Xiong et al., 2009). They are required to maintain the stability of the plastid genome (Maréchal et al., 2009).
The Whirly 1 ortholog in maize (ZmWHY1/pTAC1) can bind both RNA and DNA and co-immunoprecipitates with chloroplast RNA splicing 1 (CRS1) (Prikryl et al., 2008). pTAC10 contains an S1 domain and has RNA-binding activity in tobacco (Jeon et al., 2012), and it may be a substrate of chloroplast-targeted casein kinase 2 (cpCK2) (Reiland et al., 2009). Phosphorylation of pTAC10 may affect its RNA binding. The detailed functions of the elongation factor EF-Tu and the ribosomal proteins S3, L12-A, and L26 in chloroplasts have not been reported. The presence of these RNA-binding proteins, however, suggests that a translation subdomain exists in the TAC/nucleoid.

www.frontiersin.org
July 2014 | Volume 5 | Article 316

Table notes: These factors regulate plastid transcription by an unknown mechanism. (c) These factors indirectly affect PEP activity by regulating the processing of chloroplast transcripts encoding the core subunits.

CONNECTIONS OF REGULATORY MODULES WITH THE RNA POLYMERASE
Light plays a central role in the regulation of plastid gene transcription. The majority of PAPs (Pfalz and Pfannschmidt, 2013) and most sigma factor genes of higher plants are light-induced (Lerbs-Mache, 2011). Plastome-wide PEP-DNA association is also a light-dependent process (Finster et al., 2013). Light influences almost every facet of plant growth and development through the action of photoreceptors. Interestingly, pTAC12 is an intrinsic subunit of the PEP complex (Pfalz et al., 2006; Steiner et al., 2011), but it was also identified as HEMERA and localized in both the nucleus and the chloroplast (Chen et al., 2010). pTAC12/HEMERA was considered to be a proteolysis-related protein involved in phytochrome signaling in the nucleus (Chen et al., 2010). Its function in the PEP complex is so far unknown, but pTAC12 was found to interact with pTAC14 in the yeast two-hybrid system (Gao et al., 2011), suggesting that these two proteins might also be interaction partners in the native complex.
Chloroplasts are the site of photosynthesis, which also produces reactive oxygen species (ROS). During photosynthesis, unbalanced excitation of the two photosystems affects the redox state of the electron transport chain, which in turn serves as a signal for plant acclimation responses. The PEP complex is a major target of such photosynthetic redox signals (Dietz and Pfannschmidt, 2011). Thioredoxin z (Trx Z) is a novel thioredoxin protein with disulfide reductase activity in vitro. It interacts with two fructokinase-like proteins, FLN1 and FLN2, in the yeast two-hybrid system and is also a component of the PEP complex (Pfalz et al., 2006; Steiner et al., 2011) (Figure 1). Trx Z mediates redox changes of FLN2 during light-dark transitions (Arsova et al., 2010). Recent studies identified AtECB1/MRL7 as a thioredoxin-fold-like protein with thioredoxin activity (Yu et al., 2014) that interacts with Trx Z in the PEP complex (Powikrowska et al., 2014; Yu et al., 2014). These two proteins may thus form a functional module that mediates redox signaling from thylakoids toward the RNA polymerase, but the functional details of these interactions are completely unknown. Further redox mediators might be Fe Superoxide Dismutase 2 (FSD2) and FSD3, two iron superoxide dismutases, and PRDA1, a chloroplast protein without any known domain. The prda1 and fsd2 fsd3 knock-out mutants are highly sensitive to oxidative stress (Qiao et al., 2013). These proteins, therefore, may act as ROS scavengers that protect the PEP complex. The interactions of AtECB1 with PRDA1, FSD2, and FSD3 suggest that the redox signaling pathway and the ROS scavengers are linked.
Protein phosphorylation is a very important post-translational modification in eukaryotic cells that regulates many cellular processes. In chloroplasts, protein phosphorylation affects photosynthesis, metabolic functions, and chloroplast transcription (Baginsky and Gruissem, 2009). The PEP complex appears to interact with a so-called plastid transcription kinase (PTK), named cpCK2 (Ogrzewalla et al., 2002). The Arabidopsis sigma factor 6 was reported to be phosphorylated by cpCK2 (Schweer et al., 2010a). Furthermore, pTAC5, pTAC10, and pTAC16 were also predicted to be phosphorylated by cpCK2 (Reiland et al., 2009). The enzyme activity of cpCK2 is inhibited by GSH, which suggests that cpCK2 is generally under SH-group redox regulation (Baginsky et al., 1999; Turkeri et al., 2012). Biochemical analyses of mustard seedlings during photosynthetic acclimation suggested that redox signals in chloroplasts are linked to chloroplast transcription via the combined action of phosphorylation and thiol-mediated regulation events (Steiner et al., 2009). Proteins related to phosphorylation and redox signaling are located close together in the PEP complex, which is in agreement with the results of physiological studies of plastid gene expression.
Heat stress is a major abiotic stress factor for plants that leads to severe retardation of growth and development. To maintain chloroplast transcription under heat stress and to support the survival of the plant, the chloroplast transcriptional machinery must tolerate heat stress to a certain extent. The protein pTAC5 is a C4-type zinc finger DnaJ protein with disulfide isomerase activity. Its expression is induced by heat stress (Zhong et al., 2013), and pTAC5 and Heat Shock Protein 21 (HSP21) subsequently form a heterocomplex, although they are not PAP members of the PEP complex (Zhong et al., 2013). pTAC5 as well as HSP21 may protect chloroplast transcription under heat stress.

OTHER NUCLEAR ENCODED FACTORS THAT REGULATE PEP ACTIVITY
In addition to the intrinsic components of the PEP complex, analyses of individual mutants identified multiple additional factors that regulate the processing of PEP core subunit transcripts and PEP activity. Both the Chloroplast Biogenesis 19 (CLB19) (Chateigner-Boutin et al., 2008) and Pigment-Deficient Mutant 1 (PDM1) (Wu and Zhang, 2010; Yin et al., 2012) genes encode pentatricopeptide repeat proteins. CLB19 is involved in the editing of the rpoA transcript (Chateigner-Boutin et al., 2008), while PDM1 is associated with the rpoA polycistronic transcript and is involved in its cleavage (Wu and Zhang, 2010; Yin et al., 2012). PDM1/Seedling Lethal 1 (SEL1) was also shown to be involved in accD RNA editing (Pyo et al., 2013). The PPR protein OTP70 was reported to affect the splicing of the rpoC1 transcript (Chateigner-Boutin et al., 2011). The gene Yellow Seedling 1 (YS1), encoding a PPR-DYW protein, is required for editing of rpoB transcripts (Zhou et al., 2009). The common feature of the Arabidopsis knock-out lines for all these proteins is that their plastid gene expression pattern is similar to that of rpo-gene knock-out mutants in tobacco (Chateigner-Boutin et al., 2008, 2011; Zhou et al., 2009; Wu and Zhang, 2010; Pyo et al., 2013).
NARA5 encodes a chloroplast-localized phosphofructokinase B-type carbohydrate kinase family protein, which might be involved in the high-level expression of plastid-encoded photosynthetic genes in Arabidopsis (Ogawa et al., 2009). Prin2 is a small protein possibly involved in redox-mediated retrograde signaling in chloroplasts (Kindgren et al., 2012), and SVR4-like is a homolog of AtECB1/SVR4/MRL7 that encodes a chloroplast protein essential for proper chloroplast function in Arabidopsis (Powikrowska et al., 2014). All these proteins may reversibly associate with the PEP complex, but detailed studies are necessary to understand their functional roles and connections with the RNA polymerase. Alternatively, these proteins may act as signaling factors that link environmental stimuli to plastid gene expression.

CONCLUDING REMARKS
Plants grow under very different environmental conditions, and photosynthesis, the major function of chloroplasts, is important for plant growth and development. Plastid gene expression is essential for chloroplast development and normal chloroplast functions, including photosynthesis. The PEP complex is the major RNA polymerase activity in mature chloroplasts. Proteomic and genetic analyses have identified at least 50 nuclear-encoded proteins in higher plants that are important for PEP-dependent plastid gene expression. These proteins may form several functional modules within the nucleoid or TAC in order to mediate plastid gene expression in response to light, redox changes, phosphorylation, and heat stress, or to protect the PEP complex from ROS damage. The large number of nuclear-encoded proteins reveals the complexity of plastid gene expression and regulation, which differs greatly from gene expression in the nucleus or in prokaryotes. However, current knowledge about plastid transcription is quite limited, and investigating the relationship between transcription, post-transcriptional processing, and translation in the nucleoid could provide novel insights into chloroplast gene expression.