A purification strategy for analysis of the DNA/RNA-associated sub-proteome from chloroplasts of mustard cotyledons.

Plant cotyledons are a tissue that is particularly active in plastid gene expression in order to develop functional chloroplasts from pro-plastids, the plastid precursor stage in plant embryos. Cotyledons, therefore, represent an ideal material for studying the composition, function and regulation of protein complexes involved in plastid gene expression. Here, we present a pilot study that uses heparin-Sepharose and phospho-cellulose chromatography in combination with isoelectric focussing and denaturing SDS gel electrophoresis (two-dimensional gel electrophoresis) to investigate the nucleic acids binding sub-proteome of mustard chloroplasts purified from cotyledons. We describe the technical requirements for a highly resolved biochemical purification of several hundred protein spots obtained from such samples. Subsequent mass spectrometry of peptides obtained from excised, trypsin-digested spots identified 58 different proteins within 180 distinct spots. Our analyses indicate a high enrichment of proteins involved in transcription and translation and, in addition, the presence of massive post-translational modification of this plastid protein sub-fraction. The study provides an extended catalog of mustard plastid proteins involved in gene expression and its regulation and describes a suitable purification strategy for further analysis of low-abundance gene expression related proteins.


INTRODUCTION
Plant chloroplasts are semiautonomous cell organelles of endosymbiotic origin that emerged from a cyanobacteria-like ancestor (Lopez-Juez and Pyke, 2005). One evolutionary remnant of this origin is their own genome (called the plastome), which comprises 100-120 genes, together with a predominantly bacteria-like gene expression machinery that is essential for its proper expression. The plastome gene set in vascular plants is highly conserved and encodes mainly proteins with a function in photosynthesis and the gene expression machinery (Sugiura, 1992). However, for full functionality plastids require the import of many proteins that are encoded in the nucleus, since during evolution the endosymbiotic ancestor lost most of its genes to the nucleus of the host cell via horizontal gene transfer (Martin et al., 2002; Stoebe and Maier, 2002). These nuclear-encoded plastid proteins are translated in the cytoplasm as precursor molecules that are subsequently imported into plastids with the help of N-terminal transit peptides directing them to their correct sub-compartment (Soll and Schleiff, 2004). After removal of the transit peptide, the mature proteins are assembled into their final configuration together with the plastid-expressed proteins. Therefore, all major multi-subunit complexes (such as photosystems, ribosomes, or metabolic enzyme complexes) represent a patchwork of nuclear as well as plastid expressed proteins (Allen et al., 2011).
Based on the prediction of transit peptides and genome-scale proteomics, it was estimated that plastids may contain around 1500-4000 different proteins (Abdallah et al., 2000; Baerenfaller et al., 2008; Ferro et al., 2010; van Wijk and Baginsky, 2011). Reference proteomes generated for maize and Arabidopsis cover 1564 and 1559 proteins, respectively, so far (Huang et al., 2013), indicating that a large part of the predicted plastid proteome has not yet been detected. This might be caused by the fact that plastids from different tissues (for instance roots, cotyledons, leaves, flowers, and fruits) likely contain different protein compositions, but also by the fact that regulatory proteins in particular are present in only trace amounts that are difficult to detect in a matrix of highly abundant proteins, e.g., from the photosynthetic apparatus (Huang et al., 2013). Further complexity in the plastid protein complement may derive from the occurrence of multiple post-translational modifications that are essential for regulatory events.
Cotyledons display a high activity in plastid transcription and translation, which is essential for the light-induced development of chloroplasts from the embryonic pro-plastids (Baumgartner et al., 1989, 1993). Thus, the proteome of cotyledon plastids comprises a high amount of proteins implicated in gene expression, providing a useful source material for the characterization of the nucleic acids binding proteome. The chloroplast proteome of the dicotyledonous model organism Arabidopsis thaliana is well studied in adult leaves. However, an analysis of the cotyledon chloroplast proteome is lacking, mainly because the small size of the cotyledons is not very suitable for the isolation of chloroplasts and subsequent analyses of their proteins via chromatography. In recent investigations, the fast growing cruciferous plant mustard (Sinapis alba) demonstrated a high suitability for performing biochemical and physiological analyses of plastid gene expression in cotyledons since the seedlings and their cotyledons are much larger than those of Arabidopsis (Oelmuller et al., 1986; Tiller and Link, 1993; Pfannschmidt and Link, 1994; Link, 1996; Baginsky et al., 1997). Isolation of cotyledons in the order of kilograms is easily achieved after just 5 days of growth and provides enough material even for the biochemical analysis of low-abundant proteins by chromatography followed by mass spectrometry. Since Sinapis is a close relative of Arabidopsis, peptide data evaluation for the identification of mustard plastid proteins was found to be applicable for well conserved proteins by using the A. thaliana or Brassicales protein databases (Schröter et al., 2010; Steiner et al., 2011). Thus, the use of mustard as a source for cotyledons combines the advantages of mustard chloroplast preparation with the availability of protein data of well studied organisms like A. thaliana or some Brassica species.
In recent studies, proteins implicated in plastid gene expression in mustard have been isolated by a number of different purification schemes. These include the isolation of the membrane-bound, insoluble transcriptionally active chromosome (TAC) by ultracentrifugation and gel filtration (Hallick et al., 1976; Bülow et al., 1987; Pfalz et al., 2006) and the isolation of soluble proteins such as RNA polymerases, kinases, RNA binding proteins and sigma factors by various chromatographic steps (Tiller et al., 1991; Nickelsen and Link, 1993; Tiller and Link, 1993; Pfannschmidt and Link, 1994; Liere and Link, 1995; Baginsky et al., 1999). Recently, we applied the purification scheme of plastid isolation followed by protein enrichment via heparin-Sepharose (HS) chromatography and visualization by two-dimensional (2D) blue native (BN)-PAGE to isolate protein complexes such as the RNA polymerase complex as well as a number of gene expression related proteins (Schröter et al., 2010; Steiner et al., 2011). However, these HS purified fractions still included a number of metabolic enzymes which complicate the analysis of the nucleic acids binding sub-proteome, as they tend to cover low abundant proteins or even hinder their visualization and identification. Here, we present a pilot characterization of the nucleic acids binding sub-proteome of chloroplasts from mustard cotyledons. To this end we used HS chromatography followed by a second chromatographic step with phosphocellulose (PC), which was shown to be very effective for isolating nucleic acids binding enzymes like RNA polymerases (Bottomley et al., 1970; Tiller and Link, 1993). This was followed by isoelectric focussing (IF) and 2D gel electrophoresis, which allowed us to estimate the size of the nucleic acids binding sub-proteome and the ideal IF range for its visualization and protein determination using mass spectrometry. The use of 2D gel electrophoresis also revealed massive post-translational modifications of the sub-proteome.

ENRICHMENT OF NUCLEIC ACIDS BINDING PROTEINS FROM MUSTARD CHLOROPLASTS
In previous studies we analyzed gene expression related protein complexes from isolated mustard chloroplasts using a combination of HS chromatography followed by two-dimensional BN/SDS polyacrylamide gel electrophoresis (2D BN-PAGE) and electro-spray ionization-tandem mass spectrometry (ESI-MS/MS). Besides the plastid-encoded RNA polymerase, various CSP41 complexes and translation related proteins, we identified several metabolic enzyme complexes such as GAP-dehydrogenase, ATPases, or RubisCO that co-purify in this affinity chromatography. These abundant proteins complicated the identification of further low-abundant proteins (Schröter et al., 2010; Steiner et al., 2011). In addition, these studies were focussed on the analysis of large native protein complexes using a BN-PAGE approach. This limited the characterization of gene expression related proteins that may occur in small complexes or as individual proteins. In this study, we aimed at a deeper investigation of the size, composition and complexity of the nucleic acids binding sub-proteome of mustard chloroplasts. To this end, we performed chloroplast isolation and HS chromatography from mustard cotyledons precisely as described before (Schröter et al., 2010). Bound proteins were eluted with a high-salt step, concentrated by dialysis and, for further enrichment of gene expression related proteins, applied to a cation exchange column with PC as matrix as described earlier (see above). Proteins were eluted by a second high-salt step and dialyzed against a low-salt storage buffer for analysis and further use (see Materials and Methods). A first comparison of peak fractions with equal protein amounts of both purification steps was done by SDS-PAGE and silver staining (Figure 1A). The PC fraction exhibited a selective enrichment of many protein bands between 5 and 75 kDa and a strong exclusion of proteins larger than 75-80 kDa.
For a more detailed resolution of this protein fraction, we performed 2D gel electrophoresis with IF as first dimension followed by SDS-PAGE as second dimension (Figures 1B,C). Using IPG strips with a non-linear (NL) pH range from 3 to 11 for the IF and a gradient polyacrylamide gel, we could obtain an overview of the total protein content, leading to the identification of around 600 individual spots. We observed two major areas where multiple proteins accumulated on the gel, located between approximately pH 4.5-7 and pH 9-11. Because of the non-linearity of the IF gradient, proteins at the outer ranges of the IF strip were poorly resolved, which became mainly evident at the basic pH values. Therefore, linear IPG gels covering pH 3-10 and pH 6-11, which overlap with the first gradient, were used in addition. The latter gradient resolved the problem of spot accumulation observed especially at the cathode. The higher resolution led to the identification of further proteins, giving a total count of 1079 individual protein spots within the PC fraction that could be distinguished between the different gels. We regard this as the nucleic acids binding sub-proteome of mustard plastids. Our data indicate a significantly higher complexity of this specific sub-proteome than was estimated earlier from the HS fractions (Schröter et al., 2010).

IDENTIFICATION OF PROTEINS FROM THE PC FRACTION BY LC-ESI-MS/MS
All 1079 spots were cut out and the proteins were subjected to an in-gel tryptic digest. In 153 cases, selected spots were pooled from duplicate gels in order to increase the protein amount for the subsequent measurements. Since a database for S. alba is currently not available, protein identification was performed by comparing the mass spectrometry data to the Brassicales and A. thaliana databases (see Materials and Methods). By this means, 225 proteins were reliably identified with at least two different peptides in 180 spots, indicating that several spots contained more than one protein. In addition, 36 proteins were identified in more than one spot (up to 40 different ones), suggesting post-translational modification of these proteins (Table 1).
In total, 58 different proteins were identified. In further analyses, the identified gene models were checked for the presence of a plastid transit peptide using TargetP (Emanuelsson et al., 2000) (Figure 2). Plastid-directing transit peptides could be predicted for 36 of these proteins; ten of them exhibit an additional luminal transit peptide, and four plastid-encoded proteins were identified. Considering a detection probability of 73% for a transit peptide, we estimated the percentage of true plastid proteins within the PC fraction to be around 94%. Some of the identified proteins were found before in mustard (Pfannschmidt et al., 2000; Pfalz et al., 2006; Schröter et al., 2010), but 36 were identified here for the first time (Table 1). Based on functional similarities and structural homologies, the proteins were categorized into protein families or subgroups (Figure 3). A practical classification mode is given by the modified MapMan bin system (Thimm et al., 2004) of the Plant Proteomics Data Base (PPDB) (Sun et al., 2009). In Table 1, proteins are listed following the PPDB bin grouping as given in column 2. For further comparison, we summarized the identified proteins into five major groups. The first group comprises transcription and transcript related proteins, namely subunits of the plastid encoded RNA polymerase (PEPs) and PEP associated proteins (PAPs) as defined in Steiner et al. (2011), other pTACs (pTAC proteins not belonging to the PAPs) and RNA and DNA related proteins (bin 27 and 28, not belonging to PAPs and pTACs). A second large group comprises translation related proteins (bin 29.2 and 29.5). Three further groups cover proteins involved in protein homeostasis (bin 29 and 21 not belonging to PEPs and PAPs), photosynthesis (bin 1) and a miscellaneous group called "others" including various enzymes catalyzing metabolic reactions or protein modifications.
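The text does not spell out how the figure of around 94% was derived. The following minimal sketch shows one reading that comes close to it, under the assumption that the observed share of plastid proteins (predicted transit peptide or plastid encoded) is divided by the stated 73% detection probability. The function name and the formula are illustrative assumptions, not the authors' documented procedure.

```python
# Counts quoted in the text above.
identified_proteins = 58         # different proteins identified
predicted_transit_peptide = 36   # proteins with a predicted plastid transit peptide
plastid_encoded = 4              # plastid-encoded proteins
detection_probability = 0.73     # stated detection probability for a transit peptide


def estimate_true_plastid_fraction(n_total, n_predicted, n_plastid_encoded, sensitivity):
    """Assumed formula: observed plastid share divided by the detection probability."""
    observed_share = (n_predicted + n_plastid_encoded) / n_total
    return observed_share / sensitivity


plastid_fraction = estimate_true_plastid_fraction(
    identified_proteins,
    predicted_transit_peptide,
    plastid_encoded,
    detection_probability,
)
print(f"Estimated share of true plastid proteins: {plastid_fraction:.3f}")
```

Under this assumption the result lies between 0.94 and 0.95, in line with the reported estimate. Other correction schemes (for example, correcting only the nuclear-encoded proteins) would give a somewhat different value.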

PEPs, PAPs, AND OTHER pTACs
We detected most subunits of the soluble PEP complex including PAP3, PAP4, PAP5, PAP6, PAP8, PAP10, PAP11, PAP12 as well as the PEP core subunit RpoA (Pfalz and Pfannschmidt, 2013). Other PEP core subunits (RpoB, RpoC1, RpoC2) and PAP1, PAP2, PAP7, and PAP9 were not identifiable in spots of these gels. Most of the identified proteins of this group became visible as single isolated spots in the acidic range (pH 3-6) on the gel (Figure 4) and at their expected molecular weight. An exception was PAP6, which represents the protein fructokinase-like 1 (FLN1) and contains a protein domain of the pfkB-carbohydrate kinase family (Arsova et al., 2010; Steiner et al., 2011). This protein appeared in a chain of five spots of the same apparent molecular weight but with slightly varying isoelectric points, of which the two strongest spots were identified as PAP6 here. This observation suggests post-translational modification of this kinase. In addition, for PAP6 but also for PAP3 and PAP11, one or two spots of lower molecular weight, respectively, were detected, suggesting a targeted degradation or proteolytic modification of these proteins (Figure 4). For PAP4 and PAP12 only a degradation product was detectable, while a spot of the full length protein was not identified.
Besides PEP and PAP proteins, we identified two proteins described as components of the TAC in mustard, PTAC4 and PTAC18 (Pfalz et al., 2006). PTAC4 is the vesicle-inducing protein in plastids 1 (VIPP1), which plays a crucial role in membrane stability (Zhang et al., 2012). The PTAC18 protein belongs to the cupin superfamily, which groups proteins with a conserved β-barrel fold that gives this type of protein a strong thermal stability. It represents a family of very diverse members including enzymes and seed storage proteins, but also transcription factors (Dunwell et al., 2001). However, the exact function of PTAC18 is largely unknown. PTAC18 was identified in spot 255, which is smaller and more acidic than expected from the predicted protein and therefore likely represents a fragment. PTAC4 was identified in spots 243, 278, 280, and 295. Spots 278 and 280 are of the same size but have slightly different isoelectric points, suggesting post-translational modification of the protein.
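The bookkeeping behind statements such as "identified in more than one spot" can be sketched as an inversion of the spot-to-protein assignments. The example below uses only the spot numbers quoted in the paragraph above; the data structure and function name are illustrative assumptions and do not reflect the software actually used for the analysis.

```python
from collections import defaultdict

# Spot-to-protein assignments quoted in the paragraph above
# (spot number -> proteins identified in that spot).
spot_to_proteins = {
    243: ["PTAC4"],
    255: ["PTAC18"],
    278: ["PTAC4"],
    280: ["PTAC4"],
    295: ["PTAC4"],
}


def spots_per_protein(mapping):
    """Invert a spot -> proteins mapping into protein -> sorted list of spots."""
    inverted = defaultdict(list)
    for spot, proteins in sorted(mapping.items()):
        for protein in proteins:
            inverted[protein].append(spot)
    return dict(inverted)


protein_to_spots = spots_per_protein(spot_to_proteins)

# Proteins found in more than one spot are candidates for
# post-translational modification or fragmentation.
multi_spot_proteins = sorted(
    protein for protein, spots in protein_to_spots.items() if len(spots) > 1
)

print(protein_to_spots)
print(multi_spot_proteins)
```

A list value per spot is used because several spots in this study contained more than one protein, so the same inversion also works for mixed spots.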
An exceptional constituent of the PC protein fraction is the protein CSP41, which appears in two forms, CSP41a and CSP41b. Originally described as the chloroplast stem-loop binding protein of 41 kDa (Yang et al., 1996), it has been discussed to be involved in RNA processing and stabilization as well as in RNA protection (Qi et al., 2012). As described for the HS fractions, it represents a dominant protein of the nucleic acids binding proteome of plastids, being present in multiple multimeric complexes of highly variable sizes (Schröter et al., 2010; Qi et al., 2012). In the PC fractions, the two forms of CSP41 appear to be especially enriched as they can be detected in 10 spots of the same apparent molecular weight of around 34 kDa but with different isoelectric points (three for CSP41a and seven for CSP41b). The main accumulation is visible in the middle of the gels between pH 5.5 and 7. The CSP41a spots are by far the strongest spots observed in the whole gel, followed by the spots for CSP41b. Roughly estimated, they account for 30-40% of the total protein content in this fraction; their dominance makes a precise estimate difficult. In addition, the proteins are detectable in 34 less intensely stained and smaller spots of different sizes, suggesting massive post-translational modifications as well as multiple degradation or targeted proteolytic events acting on both protein forms. These smaller protein spots of CSP41a/b do not appear to contain merely random fragments of the proteins, since they were observed as a reproducible spot pattern in all replicates of nucleic acids binding sub-proteome preparations from mustard.

TRANSLATION ASSOCIATED PROTEINS
Numerous proteins identified in this work are directly or indirectly related to translation. In total 12 ribosomal proteins of the large 50S subunit of plastid ribosomes (PRPL) were identified,

Frontiers in Plant Science | Plant Physiology
October 2014 | Volume 5 | Article 557 | 4
FIGURE 2 | Numbers of total, analyzed, and identified spots of the PC peak fractions. Proteins separated in the 2D gels shown in Figure 1C are given in yellow boxes at the top. Protein groups corresponding to Table 1 are displayed below in colored boxes. At the left side of each box the number of putative plastid proteins per bin being either plastid encoded or for which a plastid transit peptide was predicted is given. At the right side proteins without these properties are given.
Besides the ribosomal subunits, a number of translation initiation factors (IF) were present in the fractions and were detected here for the first time in S. alba. Except for eIF1A (a subunit of the cytosolic translation initiation complex), all of them contain a predicted plastid transit peptide. This also applies to eIF3, which is known as a subunit of a eukaryotic IF (eIF). IF2 and IF3 represent plastid translation IFs, while the elongation factors (EF) EF-Tu and the eukaryotic EF1-alpha4 are involved in translation elongation. eIF1A, EF-Tu, and EF1-alpha4 appear as single spots while the others were found in several spots, suggesting post-translational modifications here, too.
Furthermore, we identified a SpoU methylase that belongs to the class of SPOUT enzymes and introduces a methylation of 2′-OH groups of tRNA or rRNA riboses (Cavaillé et al., 1999; Tkaczuk et al., 2007), and two proteins that are subunits of the nascent polypeptide associated complex (NAC). This dimeric complex is composed of an alpha- and a beta-chain and may reversibly bind to ribosomes (Wiedmann et al., 1994). The alpha-NAC-like proteins identified during this work are encoded by different genes in Arabidopsis but exhibit a strong similarity within their amino acid sequence. The alpha-NAC-like proteins 1 and 3 were detected in the same two spots on the gels, representing double spots.
[Figure caption, beginning not recovered] ... Table 1. (C) Distribution of solely the plastid proteins of the recent PC fractions to functional groups according to Table 1 but with an aggregation of "PEPs and PAPs" with "Other pTACs" and a part of "DNA and RNA" to one bin "Transcription."

PROTEINS INVOLVED IN PROTEIN HOMEOSTASIS, PHOTOSYNTHESIS, AND METABOLISM
We identified the chloroplast heat shock cognate protein 70-2 (cpHsc70-2), which is the analog of one of only two stromal Hsp70s in A. thaliana plastids (Su and Li, 2008). In addition, we found a TCP-1/cpn60 family chaperonin and a protein disulfide isomerase like 2-1 (PDIL 2-1) belonging to the thioredoxin superfamily and acting as a folding catalyst. All of these proteins are identified in mustard fractions here for the first time and likely function in protein stability or formation. The correct folding of proteins is the last but essential step of gene expression. The group of photosynthesis related proteins contains four proteins. The alpha and beta subunits of the plastid ATP synthase were formerly identified in S. alba (Schröter et al., 2010). Another ATPase, the RubisCO activase and the Rieske cluster of the cytochrome b6/f complex were detected here for the first time by mass spectrometry in the mustard plastid proteome. These proteins are most likely not involved in gene expression but co-purify in the column chromatography because of their substrate affinities. This is also true for the group of miscellaneous proteins, including the malate dehydrogenases (MDH) and the malate synthase (MLS), both identified in several spots.

FIGURE 4 | Essential polymerase-associated proteins (PAPs) of the soluble PEP complex. Positions of PAPs in the 2D gel after isoelectric focussing of the PC fraction on a pH 3-11 NL and pH 6-11 gradient. Spot identity is given at the right margin. Fragments are additionally indicated by an asterisk. Marker sizes and pH range are given at right margin and above or below the gel, respectively. The gel is silver-stained.

Proteins involved in fatty acid metabolism were identified as well. These include acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha (CAC3) and FabZ, a beta-hydroxyacyl-acyl carrier protein (ACP) dehydratase. An earlier study on the purification of the acetyl-CoA carboxylase multienzyme complex also resulted in the enrichment of nucleoid-associated proteins (Phinney and Thelen, 2005), suggesting a potential physical link between these two larger protein associations.
A third protein found (MFP2) is involved in lipid degradation. It was already identified in the HS fractions in former experiments (Schröter et al., 2010). We also found a cysteine synthase and a phosphoserine aminotransferase (PSAT) as well as a pyrroline-5-carboxylate reductase (P5CR) known to be essential for amino acid metabolism, and a serine hydroxymethyltransferase (SHMT) that is essential for photorespiration. The mustard protein in the PC fractions matches mitochondrial SHMT1 and SHMT2 peptides of several Brassicales. The exact affiliation to one of these SHMTs remains unclear since the matching peptides fit both proteins (Table 1). The PC fractions also contain the myrosinase MB3 (involved in glucosinolate degradation) and a cruciferin fitting best to A. thaliana CRU3 (Table 1). Finally, actin was also detected in one spot, although the mustard peptides of the PC fractions match different actin types of different Brassicales.

DISCUSSION
THE PLASTID NUCLEIC ACIDS BINDING PROTEOME OF MUSTARD
The goal of our study was the establishment of a purification scheme allowing the estimation of size and composition of the plastid nucleic acids binding sub-proteome from mustard. By using HS and PC chromatography coupled to IF and SDS-PAGE, we could reproducibly isolate 1079 protein spots, of which we could identify 180 by mass spectrometry. However, to our surprise, these 180 protein spots were found to represent just 58 individual proteins, indicating a high degree of post-translational modification of this specific sub-proteome, which in part might be caused by differential phosphorylation (Reiland et al., 2009, 2011). Since we used NaF as phosphatase inhibitor in all preparation steps, the differential phosphorylation states of the analyzed proteins should be well conserved. In contrast, different redox states of thiol groups were not maintained during our purification procedure since reducing agents were included in all steps. Detection of a differential redox state in these fractions will require more specific methods such as redox difference gel electrophoresis (redox-DIGE) (Hurd et al., 2007, 2009). We also observed numerous smaller fragments from several proteins, indicating degradation events. These, however, were not random, as the spot pattern was reproducible between different preparations, suggesting that it is not caused by the action of proteases during purification but by targeted events in the chloroplast. Whether these products represent intermediate steps of protein degradation or whether these fragments perform distinct functions remains to be determined. In summary, this high degree of post-translational modification indicates that the size of the sub-proteome is certainly smaller than the 1079 spots detected. If we assume a similar percentage of individual proteins as within the identified spots (32.2%) for the complete fraction, then we estimate 347 proteins for the total nucleic acids binding sub-proteome.
Since we identified a number of co-purifying proteins involved in metabolic processes (29.3%), we had to reduce this number to 236 proteins. However, our mass spectrometry determination has a certain bias since we could detect only the fraction of sufficiently abundant proteins, which is likely enriched in metabolic enzymes. In addition, a significant part of the post-translational modification detected in our fractions is focussed on only two proteins, CSP41a and b, which partly compromises our estimate. Without these two proteins, we estimate 314 proteins for the chloroplast nucleic acids binding sub-proteome. This appears to be a reasonable number taking into account the proteins that are already known to be involved in the regulation of plastid gene expression such as NEP, PEP, PAPs, pTACs, PPRs, ribosomal proteins and so on. It still leaves, however, some space for the discovery of as yet unidentified regulators that might appear only in trace amounts, such as eukaryotic transcription factors (Wagner and Pfannschmidt, 2006).
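The first step of the extrapolation described above can be retraced with a few lines of arithmetic. This is a minimal sketch: the function name is hypothetical, and rounding down to a whole protein is an assumption chosen because it matches the reported value. The subsequent corrections (236 and 314 proteins) are quoted from the text and are not recomputed here, because the exact procedure behind them is not given.

```python
import math

# Counts quoted in the text above.
total_spots = 1079       # spots distinguished in the PC fraction
identified_spots = 180   # spots with a mass spectrometry identification
distinct_proteins = 58   # individual proteins represented by those spots


def extrapolate_protein_count(n_spots, n_identified_spots, n_distinct_proteins):
    """Scale the protein-per-spot ratio of the identified spots to all spots.

    Returns the ratio and the extrapolated count, rounded down (assumption).
    """
    ratio = n_distinct_proteins / n_identified_spots
    return ratio, math.floor(n_spots * ratio)


proteins_per_spot, estimated_proteins = extrapolate_protein_count(
    total_spots, identified_spots, distinct_proteins
)
print(f"Individual proteins per identified spot: {proteins_per_spot:.1%}")
print(f"Extrapolated sub-proteome size: {estimated_proteins}")
```

The sketch makes the central assumption of the estimate explicit: unidentified spots are taken to show the same redundancy as identified ones, which is the bias toward abundant proteins that the text itself points out.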

SPECIFIC FEATURES OF THE PROTEIN FRACTION AFTER PC CHROMATOGRAPHY
PC chromatography is a well established purification step for nucleic acids binding proteins from chloroplasts (Bottomley et al., 1970; Tiller and Link, 1993). Crucial for the quality of these fractions, however, are a thorough chloroplast preparation via sucrose gradient centrifugation and a pre-purification step of the chloroplast lysate using HS chromatography. In comparison to results from earlier work using just HS fractions (Schröter et al., 2010), we observed a high enrichment of translation associated proteins and especially of CSP41 proteins. Co-purification of metabolic enzymes as well as components from other cell compartments was clearly reduced. We obtained a good coverage of the subunits of the plastid RNA polymerase complex PEP; surprisingly, however, the larger subunits of this complex were not detectable. We observed a significant reduction of proteins above 80 kDa in size within the PC fractions (Figure 1). However, this might not be the reason for the failure of detection, since all other components of the complex were identified in the fractions and especially RpoC2 and RpoB are known to bind DNA/RNA. Since these large subunits are highly conserved and have been successfully detected earlier in HS fractions (Steiner et al., 2011), it is likely that they are not well separated during isoelectric focussing (IEF). Further analyses using additional enrichment methodologies before the IEF step, such as size-exclusion chromatography, might help to address this problem in the future.
The largest share of all identified proteins in the PC fractions is dedicated to translational processes, with 43% of all proteins (Figure 3C). The 50S subunit of plastid ribosomes contains 33 subunits with 31 orthologs to Escherichia coli and the two plastid specific subunits PRPL5 and PRPL6. The 30S subunit is composed of 21 E. coli orthologs and four plastid specific proteins with no homologs in other ribosomes. Most ribosomal proteins have contact with RNA in various ways, either as structural components or through direct involvement in the translational process. Thus, ribosomal proteins contain nucleic acids binding structures which adhere to the column materials used and represent one main component of the nucleic acids binding sub-proteome of plastids. On the 2D-gels most of them accumulate at the higher pH ranges, and the use of the basic IPG gels of pH 6-11 led to a good resolution of this group of proteins. The identification of 80S ribosomal proteins in plastid fractions is likely caused by the co-purification of particles attached to the outer chloroplast membrane, as known for tonoplast membrane fragments (Schröter et al., 2010). The main regulation of translation occurs at the level of initiation, which is performed by initiation factors (IF). In eukaryotes this process is ensured by 12 eIFs comprising 23 polypeptides, whereas in prokaryotes three IFs are sufficient (Kapp and Lorsch, 2004). In plastids, orthologs for all bacteria-type translation factors can be found, but the translational complex contains additional proteins not present in bacteria (Beligni et al., 2004). Three of the four IFs identified in this study contain a chloroplast transit peptide (cTP), although only IF2 and IF3 are plastid IFs with a prokaryotic origin. The third one, eIF3f, is a subunit of eIF3; it is important for basic cell growth and development and influences the expression of about 3000 genes in A. thaliana, also in interaction with two other eIF3 subunits (Xia et al., 2010).
ChloroP predicts a plastid transit peptide of 40 amino acids for eIF3f of A. thaliana, and the protein was previously also identified in fractions enriched in plastid nucleoids (Huang et al., 2013). Thus, it seems to be a true plastid protein and not a co-purification of the cytosolic translational apparatus. However, it might also be possible that this protein possesses a dual localization in both nucleus and plastids, contributing to the coordination of gene expression between the two genetic compartments as proposed for other plant cell proteins (Krause and Krupinska, 2009). The elucidation of the precise role of eIF3f in plastids and whether it is involved in the regulation of plastid gene expression will be an interesting field of future research.
The dominant proteins in the PC fractions are the two proteins named CSP41a and CSP41b (Yang et al., 1996; Yang and Stern, 1997). CSP41a and b were also detected in isolates of the PEP complex as some of the most abundant components (Pfannschmidt et al., 2000; Suzuki et al., 2004; Schröter et al., 2010), but they appear not to belong to the PAPs and instead co-purify with these fractions because of the enormous size of their largest conglomerates (Peltier et al., 2006; Schröter et al., 2010; Qi et al., 2012). Here, we identified CSP41a in 8 and CSP41b in 40 spots of diverse sizes and isoelectric points. Both form a defined spot pattern which was congruent in most replicates of the 2D-gels prepared for this work. This suggests that not only a multimerization of CSP41a/b occurs but possibly also an integration of defined fragment species of the proteins that might be important for specific functions. In addition to targeted fragmentation, the spot pattern after 2D SDS-PAGE also suggests a strong post-translational modification of the two proteins. Indeed, phosphorylation and lysine acetylation have been reported for the corresponding Arabidopsis proteins (Reiland et al., 2009, 2011; Finkemeier et al., 2011). The spot pattern as well as the positions of the two proteins in the 2D-gels are highly reminiscent of those recently reported for Arabidopsis (Qi et al., 2012). The only difference lies in the number of identified spots, which were 6 for CSP41a and 5 for CSP41b in Arabidopsis, while in mustard we observed 3 CSP41a and 7 CSP41b variants (besides the fragmented versions) (Figure 5). This suggests the action of at least some species-specific modifications of the proteins.

CONCLUSION
Here, we describe the technical requirements for a highly resolved biochemical purification of several hundreds of protein spots representing the nucleic acids binding sub-proteome of plastids.
Our analyses indicate a high enrichment of proteins involved in transcription and translation and, in addition, the presence of massive post-translational modification of this plastid protein sub-fraction. Furthermore, our study provides an extended catalog of plastid proteins from mustard that are involved in gene expression and its regulation and describes a suitable purification strategy for further analysis of low abundant gene expression related proteins.

FIGURE 5 | Distribution of CSP41a and b spots in the pH 3-11 NL 2D-gel. CSP41a is drawn in yellow and CSP41b in orange. Marker sizes and pH range are given right beside and above the silver stained gel, respectively.

PLANT GROWTH AND ISOLATION OF PLASTIDS
Mustard seedlings (Sinapis alba L., var. Albatros) were cultivated under permanent white light illumination at 20 °C and 60% humidity. Cotyledons were harvested under the respective light and stored on ice before homogenization in ice-cold isolation buffer in a Waring Blender and filtering through muslin and nylon. Chloroplast isolation by differential centrifugation and sucrose gradient centrifugation in a gradient between 30 and 55% sucrose was conducted as described earlier (Schröter et al., 2010).

ISOLATION OF NUCLEIC ACIDS BINDING PROTEINS BY HS- AND PC-CHROMATOGRAPHY
Lysis of plastids and chromatography on HS CL-6B were performed according to Tiller and Link (1993) and Steiner et al. (2009). Proteins were washed, eluted with 1.2 M (NH4)2SO4, and the peak fractions were detected via protein quantification assays (RC DC™, Bio-Rad Laboratories, Inc., Hercules, CA, USA) (Schröter et al., 2010).

TRYPTIC DIGEST, LC/ESI-MS/MS AND DATA ANALYSIS
The spot patterns of the different gels were compared. Matching low abundant spots were pooled (as indicated in Supplemental Table 1) to increase the detectable protein amount. Tryptic digest of protein spots was conducted after destaining as described previously (Mørtz et al., 1994; Stauber et al., 2003).
Mass spectrometry was carried out on an LCQ™-DecaXP ion trap mass spectrometer (Thermo Finnigan, San Jose, CA, USA) using a data-dependent scan procedure with four cyclic scan events as described in Schröter et al. (2010). The first event, a full MS scan of the mass range m/z 450-1200, was followed by three dependent MS/MS scans of the three most abundant ions. Sample runs and data acquisition were performed using the Xcalibur™ software (Version 1.3, © Thermo Finnigan 1998-2001). Seventy-six of the low abundant spots were measured on a Finnigan LTQ linear ion trap mass spectrometer (Thermo Finnigan, Thermo Fisher Scientific Inc., Waltham, MA, USA) coupled online to a nano HPLC Ultimate 3000 (Dionex, Thermo Fisher Scientific Inc., Waltham, MA, USA) (Schmidt et al., 2006). After one full MS scan, the instrument was set to measure the collision-induced dissociation pattern of the four most abundant ions and to exclude the measured ions from re-measurement for 10 s. The resulting spectra were analyzed using Proteome Discoverer version 1.0 (Thermo Fisher Scientific Inc., Waltham, MA, USA) with the implemented Sequest algorithm. For this purpose, a database of all RefSeq (reference sequence) sequences of A. thaliana and Arabidopsis lyrata as well as the complete Brassica napus and Capsella rubella and the remaining Brassicales proteins of NCBI was created [NCBI 2012.03.19, 109146 sequences: Arabidopsis RefSeq 67924 sequences (35375 A. thaliana, 32549 A. lyrata) + B. napus 10622 sequences + C. rubella 4246 sequences + other Brassicales 26354 sequences]. The Proteome Discoverer software was set to adjust the Xcorr to reach a false discovery rate of ≤ 1% (Veith et al., 2009). All proteins with at least two unique peptides were taken for further analysis.
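The data-dependent acquisition logic described for the LTQ measurements (one full MS scan, fragmentation of the four most abundant ions, and exclusion of already measured ions for 10 s) can be illustrated with a minimal Python sketch. It is not the instrument control software. The function name `select_precursors`, the representation of peaks as (m/z, intensity) pairs, and the matching tolerance `mz_tol` are assumptions introduced here for illustration only.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity) from one full MS scan


def select_precursors(
    scan_time_s: float,
    peaks: List[Peak],
    excluded_until: Dict[float, float],
    top_n: int = 4,             # four most abundant ions (LTQ setting in the text)
    exclusion_s: float = 10.0,  # exclusion time of 10 s (from the text)
    mz_tol: float = 0.5,        # ASSUMPTION: tolerance for matching excluded m/z values
) -> List[float]:
    """Return the m/z values chosen for MS/MS after one full MS scan.

    `excluded_until` maps a previously fragmented m/z to the time (in s)
    at which its exclusion ends; it is updated in place.
    """
    # Forget exclusions that have already expired.
    for mz in [m for m, t_end in excluded_until.items() if t_end <= scan_time_s]:
        del excluded_until[mz]

    chosen: List[float] = []
    # Walk through the peaks from most to least intense.
    for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == top_n:
            break
        # Skip ions that were fragmented within the exclusion window.
        if any(abs(mz - ex) <= mz_tol for ex in excluded_until):
            continue
        chosen.append(mz)
        excluded_until[mz] = scan_time_s + exclusion_s
    return chosen
```

Calling the function repeatedly with increasing scan times and the same exclusion dictionary shows the intended effect: ions fragmented in one cycle are skipped in the following cycles until their 10 s window has passed, so that less abundant ions also get selected.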

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpls.2014.00557/abstract

Supplemental Figure S1 | Silver stained 2D-gels of the PC fractions with isoelectric focusing for the first dimension in pH gradients of 6-11, 3-11 NL, and 3-10, as indicated in the upper left corner of each gel. The second dimension was performed in a 7.5-20% SDS polyacrylamide gel. Spots are marked and numbered in yellow. Marker sizes and pH range are given right beside and below the gel, respectively.

Table 1 | Identified peptides from the PC fraction. Spots are listed in numerical order. Accession numbers of proteins belonging to the same spot are listed in an order starting with the highest peptide coverage. Spot nr., identification number of the protein-containing spot on the 2D-gels (see Supplemental Figure S1). Descriptions of depicted proteins are given as stated in the databases (see Materials and Methods). Coverage, coverage of the depicted proteins by the identified peptides; calc. pI, calculated pI of the depicted proteins based on the protein sequences in the database; MW, calculated molecular weight based on the protein sequences in the databases; z, peptide ion charge; lower case "m" in the peptide sequence, oxidized form of methionine; lower case "w", oxidized form of tryptophan; lower case "c", cysteine with carbamidomethylation; lower case "k", acetylation of lysine.

Table 2