Drought Stress Causes Specific Changes to the Spliceosome and Stress Granule Components

The spliceosome processes RNAs from a pre-RNA state to a mature mRNA, thereby influencing RNA availability for translation, localization, and turnover. It consists of complex structures containing RNA-binding proteins (RBPs) essential for post-transcriptional gene expression control. Here we investigate the dynamic modifications of spliceosomal RBPs under stress, in particular drought stress. We do so by mRNA interactome capture in Arabidopsis thaliana using label-free quantitation. This approach identified 44 proteins associated with the spliceosome and a further 32 proteins associated with stress granules. We noted a high enrichment in the motifs RDRR and RSRSRS that are characteristic of RNA-interacting proteins. Identification of splicing factors reflects direct and/or indirect stress-induced splicing events that have a direct effect on transcriptome and proteome changes under stress. Furthermore, detection of stress granule components is consistent with transcriptional arrest. Identification of drought-induced stress granule components is critical in determining common abiotic stress-induced foci that can have biotechnological applications. This study may therefore open ways to modify plant stress responses at a systems level through the modification of key spliceosome components.


INTRODUCTION
Transcription and post-transcriptional gene regulation (PTGR) are the first and main targets in gene expression control. The PTGR regulatory networks involve RNA processing, modification, stabilization, storage, localization, translation, and degradation (Cole, 2001;Lorkovic and Barta, 2002). These processes are tightly controlled, in part by RNA-binding proteins (RBPs). RNA processing and modifications are chiefly carried out by splicing factors, a group of RBPs, constituting the spliceosome complexes. The spliceosome is a large ribonucleoprotein (RNP) complex whose assembly at each intron involves small nuclear RNAs and hundreds of proteins. In eukaryotes, the spliceosome performs the essential processing of RNA; removing introns from pre-mRNA to form a mature mRNA. The highly dynamic composition of the spliceosome and RNPs, in general, orchestrates changes of the transcriptome during growth, development and in response to exogenous cues (Köster et al., 2017). Such consequences point to critical functions of RBPs in controlling the cellular mRNA population at any given time.
It has also been reported that transcriptional arrest leading to the induction of stress granule (SG) formation in response to low oxygen, oxidative, and heat stresses is mediated by RBPs (Anderson and Kedersha, 2006; Ivanov and Nadezhdina, 2006; Weber et al., 2008). SGs are cytoplasmic foci formed from aggregates of non-translated messenger RNPs. It has been proposed that SGs serve as sorting sites, where mRNAs are directed for storage, re-initiation or degradation through transfer to the processing bodies (Kedersha et al., 2005). In animal systems, SGs have been noted to contain poly-adenylated mRNPs in the form of stalled 48S pre-initiation complexes that also contain small ribosomal proteins, translation initiation factors, and RBPs such as poly(A) binding protein (PAB) 1 (Kedersha and Anderson, 2002; Kimball et al., 2003). In plants, only a few proteins have been identified as constituents of the SGs, including the initiation factors eIF4E, RBP47, and UBP1 (Weber et al., 2008) and LaRP1a (Merret et al., 2013). However, a detailed composition of SGs is yet to be fully characterized in plant and other systems under various stresses. Genome-wide and systems-level identification of proteins binding to mRNA in vivo, in a time- and stimulus-specific manner, has been made possible through the use of interactome capture technology. This method has been applied to obtain the first genome-wide mRNA interactomes in various organisms including human cell lines (Baltz et al., 2012; Castello et al., 2013; Kwon et al., 2013), yeast (Saccharomyces cerevisiae) (Beckmann et al., 2015), drosophila (Drosophila melanogaster) (Sysoev et al., 2016), macrophages (Liepelt et al., 2016), and higher plants (Marondedze et al., 2016b).
Recently, we have shown that an environmental cue such as drought stress, induced by polyethylene glycol (PEG), causes modification of the mRNA interactome affecting potential dual function proteins such as proteins involved in intermediary metabolism (Marondedze et al., 2019). Thus far, interrogating the mRNA-binding proteome (RB-proteome) has given new insights into the mechanisms underpinning developmental and physiological states of cellular systems. However, the role of the spliceosome in stress-dependent transcriptional changes has remained unclear. Here, we therefore carried out a study to determine whether the components of the spliceosome are altered during stress by monitoring changes in the RBPs during drought stress using Arabidopsis thaliana as a model system. Additionally, we interrogated the composition of drought-induced SGs.

Cell Culture and Treatment
Cells derived from roots of Arabidopsis thaliana (ecotype Columbia-0) were grown in liquid medium, as previously described (Marondedze et al., 2013; Ordonez et al., 2014). The cell cultures used in this study were obtained from Mrs Xiaolan Yu in the Department of Biochemistry at the University of Cambridge. Cells were treated with 40% (v/v) polyethylene glycol (PEG) 6000, a dehydration-inducing agent to mimic drought stress or with equal volumes of media as a negative control. Three biological replicates of cells treated with PEG or mock-treated cells were collected at 1 and 4 h post-treatment.
Each time-point treatment had a corresponding mock treatment per replicate. The medium was drained using Stericup® filter unit (Millipore, Billerica, MA), and cells were rinsed with 1X phosphate buffered saline immediately before UV-crosslinking (Marondedze et al., 2016b).

Abscisic Acid (ABA) Assay
Three biological replicates of cell suspension cultures for each time-point (controls at 0, 1, and 4 h, and 40% PEG treated samples at 1 and 4 h) were subjected to Phytodetek® ABA Immunoassay (Agdia Inc., Elkhart, Indiana, USA) following the manufacturer's instructions. ABA levels were measured and statistically evaluated between each control and treatment time-point. The data for this assay have been published (Marondedze et al., 2019).

UV-Crosslinking and Interactome Capture
In vivo UV-crosslinking and isolation of Arabidopsis RBPs was performed, as previously described (Marondedze et al., 2016b), using a protocol that utilizes a modified method originally optimized for HeLa cells (Castello et al., 2013). Samples from each time-point were split into two, one set for UV-crosslinking and the second set for non UV-crosslinking. Samples for UV-crosslinking were irradiated in vivo with UV (254 nm) using a Stratalinker® UV crosslinker (Stratagene, La Jolla, CA) and the mRNA-protein complexes were pulled down using oligo(dT) beads. Purified proteins were analyzed by label-free tandem mass spectrometry. As in Marondedze et al. (2016b), the quality of the mRNA-protein crosslinked complex pull-down was assessed by performing an additional control whereby the sample was treated with RNase T1/A mix (Thermo-Fisher Scientific) and the reaction was performed according to the manufacturer's recommendations. To isolate RBPs, mRNA-protein samples were treated with RNase A/T1 mix to release them from the captured RNA molecules. Crosslinking and isolation of RBPs were evaluated by western blotting using antibodies against polypyrimidine tract-binding protein 1, β-actin (Sigma Aldrich, St Louis, MO, USA) and Histone 3 (Abcam, Cambridge, UK) following the manufacturer's recommendations (see Marondedze et al., 2016b).

Protein Digestion and Mass Spectrometry
Protein samples were reduced, alkylated, buffer exchanged and digested, as described elsewhere (Marondedze et al., 2016b). Dried peptides were resuspended in 20 µL of 5% (v/v) acetonitrile and 0.1% (v/v) formic acid and analyzed with Q-Exactive™ Hybrid Quadrupole-Orbitrap™ using nano-electrospray ionization (Thermo-Fisher Scientific, San Jose, CA) coupled with a nano-Liquid Chromatography (LC) Dionex Ultimate 3000 Ultra High Performance Liquid Chromatography (UHPLC) (Thermo-Fisher Scientific). Mass spectrometry parameters and run analysis were performed following the protocol described in Marondedze et al. (2016a).

Mass Spectrometry Data Analysis
Raw files were processed using the Proteome Discoverer v2.1 (Thermo-Fisher Scientific) interlinked with the local MASCOT server (Matrix Science, London, UK). MASCOT searches were carried out against Arabidopsis thaliana database [built using the Arabidopsis information resource (TAIR; release 10)] using a precursor mass tolerance of 20 ppm, a fragment ion mass tolerance of ±0.5 Da and strict trypsin specificity allowing up to two missed cleavages, peptide charges of +2, +3, and +4. Carbamidomethyl modification on cysteine residues was used as a fixed modification, oxidation on methionine residues as variable modifications and the decoy database was selected. Further stringency was applied on the peptide spectrum matches (PSMs) by allowing "forward" and "decoy" searches by MASCOT to be re-scored using the Percolator algorithm in Proteome Discoverer v2.1 thus yielding a robust false discovery rate (FDR) of <1%. A minimum of two high confidence peptides per protein was prerequisite for identification using Proteome Discoverer.
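As an illustration of the search-space settings above (trypsin specificity with up to two missed cleavages), the following minimal Python sketch enumerates the peptides of an in silico tryptic digest. It is not part of the MASCOT or Proteome Discoverer workflow; the function name `tryptic_peptides` is hypothetical, and the rule "cleave after K or R unless followed by P" is assumed here as the meaning of strict trypsin specificity.

```python
import re

def tryptic_peptides(sequence, max_missed=2):
    """List peptides from cleaving after K/R (not before P).

    Assumption: 'strict trypsin' means no cleavage when the next
    residue is proline. max_missed is the number of allowed
    missed cleavage sites per peptide.
    """
    # Zero-width split points: preceded by K or R, not followed by P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to max_missed additional neighbouring fragments.
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides

if __name__ == "__main__":
    for pep in tryptic_peptides("MKAARPGKLLR", max_missed=1):
        print(pep)
```

With `max_missed=0` only the fully cleaved fragments are returned; raising the limit adds the longer peptides that a search engine would also have to consider.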

UV-Crosslink Enrichment
Protein enrichment upon UV-crosslinking was performed as previously described (Marondedze et al., 2016b). Proteins that were detected in both the UV-crosslinked samples and the control (non-UV crosslinked samples) were quantitatively analyzed to assess UV-crosslinking enrichment. Normalized intensities of UV-crosslinked samples were quantitatively compared with the normalized intensities of the control (non-UV crosslinked samples). A log2-fold change of ≥2 and a p-value of ≤0.05 (Student's t-test, corrected for multiple testing using the method of Benjamini and Hochberg, 1995) were required for proteins to be categorized as enriched RBPs and to be considered for further data analysis.
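The enrichment criterion can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes the per-protein log2-fold changes and raw t-test p-values have already been computed, and the function names `benjamini_hochberg` and `enriched_rbps` are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone (each is the minimum over equal or larger ranks).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def enriched_rbps(proteins, log2fc, pvalues, fc_cutoff=2.0, alpha=0.05):
    """Keep proteins with log2FC >= cutoff and adjusted p <= alpha."""
    adjusted = benjamini_hochberg(pvalues)
    return [name for name, fc, q in zip(proteins, log2fc, adjusted)
            if fc >= fc_cutoff and q <= alpha]

if __name__ == "__main__":
    # Toy values for illustration only, not data from this study.
    names = ["P1", "P2", "P3", "P4"]
    print(enriched_rbps(names, [2.5, 3.0, 1.0, 2.0],
                        [0.01, 0.04, 0.03, 0.005]))
```

The same helper applies to the drought-responsive comparison by changing `fc_cutoff` to 1.5.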

Drought Stress Responsive RB-Proteome Analysis
After normalization of the data and UV-crosslink enrichment analysis, proteins from the UV-crosslink enrichment and those that were only identified in the UV-crosslinked samples were used for quantitative analysis. Only proteins detected in at least two biological replicates were included. In this analysis, samples collected at the 1 h time point, that is, 1 h PEG-treated samples and mock-treated controls, were compared against each other, and similarly for the samples collected at the 4 h time point. Proteins with a log2-fold change ≥1.5 and a p-value ≤0.05, corrected for multiple hypothesis testing using the method of Benjamini and Hochberg (Benjamini and Hochberg, 1995), were classified as significantly differentially regulated proteins. Furthermore, to determine the interaction capacity relative to the total soluble protein, protein abundances were z-score normalized within the total and the RNA-binding protein UV-crosslinked samples separately and represented as a heatmap. Hierarchical clustering was performed with the heatmapper (www2.heatmapper.ca) using Spearman Rank Correlation as the distance metric and complete linkage.
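The two numerical steps behind the heatmap, z-score normalization and a Spearman-based distance, can be sketched as follows. This is a hedged illustration rather than the heatmapper implementation: the use of the population standard deviation and the definition of distance as 1 minus Spearman's rho are assumptions, and all function names are hypothetical.

```python
from statistics import mean, pstdev

def zscores(values):
    """Z-score a profile (assumption: population standard deviation)."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation; inputs must not be constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman_distance(x, y):
    """Assumed distance: 1 - Spearman rho (Pearson on ranks)."""
    return 1.0 - pearson(ranks(x), ranks(y))

if __name__ == "__main__":
    a = zscores([10.0, 12.0, 15.0, 11.0])
    b = zscores([3.0, 4.0, 9.0, 3.5])
    print(spearman_distance(a, b))
```

Complete-linkage clustering would then merge, at each step, the two clusters whose most distant members are closest under this distance.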

Gene Ontology Analyses and Classification
To identify proteins that are components of the spliceosome complex, data mining was done using the gene ontology (GO) enrichment tool AGRIGO (http://www.heatmapper.ca) and pathway analysis using the KEGG mapper (http://www.kegg.jp/kegg/tool/annotate_sequence.html; February 2017), which annotates sequences by BlastKOALA. BlastKOALA is an internal annotation tool in KEGG that assigns KEGG Orthology numbers by BLAST searches against a non-redundant set of KEGG GENES using SSEARCH computation (Kanehisa et al., 2016). Stress granule components were determined from previously identified and characterized data in the literature (Buchan and Parker, 2009; Chantarachot and Bailey-Serres, 2018; Kosmacz et al., 2019). Classical and non-classical RNA-binding domains (RBDs) were detected from the drought stress RB-proteome identified in this study using pfam (http://pfam.xfam.org). RBPs and candidate RBPs were classified, as described previously (Beckmann et al., 2015). Co-expression for functional and data correlation analysis of selected up- and down-regulated proteins was performed using the ATTED database (http://atted.jp).

Biophysical Characteristics and Sequence Topographies Analyses
Analyses of biophysical properties including length of proteins (number of amino acids), isoelectric points (pI) and hydrophobicity were performed using R (version 3.3.1). Amino acid composition enrichment between the drought stress responsive RBPome and the input total proteome, used as the background reference set, was determined using the web-based composition profiler program (http://www.cprofiler.org/) using default settings and ordering amino acids by hydrophobicity (Kyte-Doolittle) (Vacic et al., 2007). Significance level was assessed using Bonferroni correction. Length and sequences of amino acids were retrieved from TAIR (https://www.arabidopsis.org/tools/bulk/sequences/index.jsp), the pI values were obtained from TAIR (https://www.arabidopsis.org/tools/bulk/protein/index.jsp) and hydrophobicity values were calculated using the GRAVY calculator (http://www.gravy-calculator.de). The biophysical characteristics and sequence topographies distribution biases were assessed using R packages, as outlined previously (Reichel et al., 2016). Amino acid motif enrichment from the spliceosome and SG RBPs relative to the published RBP repertoire (Köster et al., 2017) and the input proteome reference as background was analyzed using the Discriminative Regular Expression Motif Elicitation (DREME, http://meme-suite.org/tools/dreme) (Bailey, 2011), an interface in MEME suite version 4.12.0 (http://meme-suite.org), using an E-value threshold of 0.05.
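For reference, the GRAVY (grand average of hydropathy) score is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The following minimal Python sketch shows the calculation; it is an illustration and not the code of the GRAVY calculator used here, and the function name `gravy` is hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(sequence):
    """Mean hydropathy; raises KeyError on non-standard residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

if __name__ == "__main__":
    # Arginine/serine-rich stretches score strongly negative (hydrophilic).
    print(round(gravy("RSRSRS"), 2))
    print(round(gravy("IFVGGL"), 2))
```

Negative scores indicate hydrophilic proteins and positive scores hydrophobic ones, which is the axis compared across datasets in the hydrophobicity analysis.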

Evolutionary Conservation of Drought Responsive Spliceosome and SG Proteins
To understand the conservation and, potentially, the role of drought responsive RBPs associated with the spliceosome and SGs, InParanoid version 8 (http://inparanoid.sbc.su.se/cgi-bin/index.cgi, Sonnhammer and Ostlund, 2015) was used to identify their predicted orthologs among selected dicots (Glycine max, Solanum lycopersicum, Vitis vinifera), monocots (Brachypodium distachyon, Hordeum vulgare, Oryza sativa, Sorghum bicolor), Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus, and Homo sapiens. Here, a two-way prediction was possible. The InParanoid program generates ortholog groups that include all inparalogs with scoring below 0.05, which is achieved by using clustering rules based on genome-wide pairwise sequence similarity matches between two species (Sonnhammer and Ostlund, 2015). Phylogenetic tree construction was performed using the phylogeny web service (Phylogeny.fr), which utilizes multiple sequence alignment of amino acid sequences from each organism and BLAST-EXPLORER to build datasets for phylogenetic analysis (Dereeper et al., 2008, 2010). ScanProsite (http://prosite.expasy.org/) was used for motif and copy number assignment.

Spliceosome Components and Transcriptional Regulation During Drought Stress
A distinct drought-responsive RNA-binding proteome has already been established (Marondedze et al., 2019). To gain further insights into the drought-responsive RNA-binding proteome, we interrogated changes in the spliceosome components, an important step in the genesis of RNA processing and modifications, to better understand the regulation of post-transcriptional stress responses at the systems level.
Among the 1408 proteins detected in at least two biological replicates, 74 are associated with the spliceosome pathway. Of the 74 proteins, 23 are significantly (p-value ≤ 0.05) responsive to drought stress as compared to the controls and 21 proteins were detected to associate with mRNA either at 1 or 4 h post-treatment (Table 1, Supplementary File 1).
Of the 23 enriched spliceosome components, ten proteins belong to the common components (CC) category (Figure 1). The second dominant group comprises other splicing-related proteins, with eight proteins. Overall, 17 proteins are decreasing in abundance at 1 h and/or 4 h post-treatment. In the serine/arginine-rich (SR) category, serine/arginine-rich splicing 30 or SC35-like splicing factor 30 (SCL30) is increasing in abundance at 4 h while RS-rich zinc knuckle-containing protein 33 (AtRSZ33, AT2G37340) and SR-rich protein 34A (SRp34A, AT3G49430) are decreasing in abundance at 1 h after drought stress treatment (Figure 1B). The SR proteins play a role in the constitutive splicing and selection of alternative splice sites in plants and animals, a central mechanism to generate proteome diversity and regulate gene expression (Zhou et al., 2002). In plants, alternative splicing profiles are modulated by environmental stress, hormones and various organ developmental stages (Iida et al., 2004; Palusa et al., 2007). For example, under high light irradiation and salinity, transcript levels of SCL30 and AtSR45A increase and their splicing patterns are altered (Tanabe et al., 2007). In the current study, SCL30 is increasing in abundance 4 h post-treatment in the mRNA interactome, suggesting that the role of SCL30 is enhanced under drought stress and conceivably indicating post-transcriptional gene regulatory activity. Overall, SR proteins have been reported to contribute to alternative splicing by affecting splice site selection in a concentration- and phosphorylation-dependent manner and seem to control this process in a developmentally regulated, tissue-specific and stress-responsive manner (Duque, 2011).
Notably, established ABA-responsive proteins were detected as part of the spliceosome pathway and these include ABA hypersensitive 1 (also known as cap binding protein 80 (CBP80); AT2G13540) that increased in mRNA association at both 1 and 4 h post-treatment, hyaluronan (AtRGGA, AT4G16830) and RNA-binding K homology (KH) domain-containing protein (AT1G14170) whose abundances decreased over time (Figure 1B). In addition, cap binding protein 20 (CBP20) was detected only after stress treatment. The CBP80/20 complex of the CC class is a heterodimeric complex that is essential for RNA metabolism through binding to the mRNA cap structure (Hugouvieux et al., 2001; Kierzkowski et al., 2009). Arabidopsis CBP20 has a role in ABA regulation and drought stress (Papp et al., 2004) while its partner CBP80 is known to modulate early ABA signal transduction (Hugouvieux et al., 2001) and to act in the flowering pathway (Kuhn et al., 2008). It is noteworthy that in the abh1 (cbp80) mutant, plants show ABA-hypersensitive stomatal closure and reduced wilting under drought stress, and transcripts implicated in ABA signaling were down-regulated. Cytosolic calcium increased in the abh1 guard cells, suggesting amplification of the early ABA signaling response (Hugouvieux et al., 2001). It has also been reported that CBP20 and CBP80 are stabilized by ABA via a post-translational mechanism and that these proteins participate in ABA induction of miR159 during seed germination, resulting in increased abiotic stress tolerance (Kim et al., 2008). Taken together, the increase in CBP80/20 complex proteins, similar to reported ABA responses, implicates them in the post-transcriptional regulation of drought stress responsive genes.
AtRGGA, an RBP localized in the perinuclear space and cytoplasm, has been shown to increase in expression in Arabidopsis seedlings in response to ABA and PEG treatment (Ambrosone et al., 2015). Transgenic plants overexpressing AtRGGA showed more tolerance to ABA, drought and salt stresses, as indicated by a lower rate of water loss in detached leaf assays when compared to control untransformed plants.
The rgga mutant showed high sensitivity to ABA and osmotic stress, suggesting that AtRGGA participates in ABA-dependent mechanisms of salt and drought stress responses (Ambrosone et al., 2015). Expression profile analysis using genecat (https://www.arabidopsis.org) shows that AtRGGA is highly expressed in seeds, particularly under dehydration. In the current study, we observe that at the RBP level, the abundance of AtRGGA decreases in response to drought stress although at the RNA level it has been observed to increase. These findings suggest a reduction in the modulation of AtRGGA RNA targets, possibly to facilitate its regulatory function in enhancing tolerance to drought stress.
Furthermore, six of the 23 significantly changing proteins are increasing in abundance during drought stress. Co-expression analysis of the six proteins reveals that 36 proteins of the spliceosome are co-expressed with at least one of the six proteins. Among the 36 proteins, 10 proteins are co-expressed with at least two of the six up-regulated proteins. Three of the proteins, modifier of suppressor of non-expresser of pathogenesis-related genes 1, 4 (MOS4)-associated complex 3B, SCL30 and SCL33, are increasing in abundance after drought stress while the remaining seven are decreasing. Notably, 12 of the co-expressed proteins are known components of the stress granules, suggesting a potential crosstalk between spliceosome function and translational arrest, which are two distant biological processes in stress response. A global gene ontology analysis of the 300 co-expressed proteins for each of the six up-regulated proteins shows an enrichment bias on biological processes such as "RNA metabolic process," "primary metabolic process," "methylation," "chromosome organization," and "flower development," indicating that various key pre- and post-transcriptional processes are associated with drought stress responsive spliceosome components. We therefore postulate that the spliceosome is a key regulatory component that contributes to stress-dependent transcriptional regulation.

Stress Granule Components and Their Role in Drought Stress
In addition, we set out to detect stress granule components and changes in their abundance. We observed that a third (12 proteins) of the proteins co-expressed with spliceosome proteins, which are increasing in abundance during drought stress are classical components of SGs. Seven of the 12 proteins are also either increasing in abundance or detected to interact with mRNA after drought stress treatment (Supplementary File 2). Interestingly, of the seven, three proteins namely, TCP-1/chaperonin 60 (AT1G24510 and AT5G20890), and Tudor (AT4G31880), are co-expressed with CBP80, while eukaryotic translation initiation factor 2δ (AT1G04170), eukaryotic translation initiation factor 2 subunit 1 (AT2G40290) and TCP-1/chaperonin 60 (AT5G20890) are co-expressed with the protein Binding to TOMV RNA 1L (AT5G04430). Binding to TOMV RNA 1L is also co-expressed with additional four SG proteins namely, PAB4 (AT2G23350), PAB8 (AT1G49760), RBP 47A (AT1G49600) and Ras-GAP SH3 domain-binding protein [also called nuclear factor 2 (AT5G43960)] whose abundances decrease upon drought stress. However, PAB4 and PAB8 are shared between spliceosome and SG potentially suggesting a common post-transcriptional regulation function. The remaining proteins glutathione S-transferase PHI 2 (AT2G42520) and nuclear transport factor 2A (AT1G27310) are co-expressed with RNA-binding glycine-rich protein B1 (AT1G60650) and the splice factor SCL30, respectively. This relationship signifies an important potential crosstalk that may exist between the spliceosome and SG assembly during stress responses.
In order to gain a global picture of SG composition, we examined the entire drought stress responsive mRNA-binding proteome data to determine how many SG proteins were present and how they are modified under drought stress. This led to the identification of 32 SG associated proteins including the 12 described above (Table 1, Supplementary File 1). Seventeen of the SG components significantly changed in abundance upon stress compared to the control set. SGs are an essential part of the response to environmental stresses through reversible translational suppression and act as regulators of mRNA storage and stability, e.g., during oxidative stress (Keene, 2007), heat stress (Weber et al., 2008), and hypoxia (Sorenson and Bailey-Serres, 2014). SG components regulate the translational initiation phase of recovery after environmental stresses.
Co-expression analysis was performed on a set of SG proteins including rhamnose biosynthesis 1 (AT1G78570), RBP45B (At1g11650), and Ras-GAP SH3 domain-binding protein (Supplementary File 3). Notably, five proteins among the rhamnose biosynthesis 1 co-expressed genes are part of the stress granule components. Just like rhamnose biosynthesis 1, four proteins namely, 2,3-biphosphoglycerate mutase 1, Heat shock protein 70, S-adenosylmethionine synthetase and caffeoyl coenzyme A show drought stress induced mRNA interactions. Rhamnose biosynthesis 1 is a protein involved in the biosynthesis of rhamnose, a major monosaccharide component of pectin. Together with 2,3-biphosphoglycerate mutase 1, Heat shock protein 70 and caffeoyl coenzyme A, they have been recently identified as part of the SG but have no known RBDs (Kosmacz et al., 2019). In turn, RBP45B, a SG protein, is co-expressed with proteins that are part of the spliceosome. Similarly to RBP45B, nearly 90% of its co-expressed counterparts (from Table 1) are also decreasing in abundance upon drought stress treatment (Supplementary File 3). Ras-GAP SH3 domain-binding protein, another notable component of the SG, is co-expressed with 17 SG and spliceosome components and 11 of these proteins are decreasing in abundance just like itself (Supplementary File 3). Overall, co-expression analysis demonstrates the strong connectivity of networks and events associated with drought stress and in particular among proteins that are involved in spliceosome and stress granule formation and/or maintenance. In addition, identification of the spliceosome and SG associated RBPs adds confidence to our experimental system and it sheds new light on the interrelation of biological processes in mRNA association during drought stress at the systems level.
FIGURE 2 | Dynamic characterization of the spliceosome and stress granule RBPs after drought stress. (A) Protein abundance from total soluble proteome and UV-crosslinked samples for the spliceosome components. (B) Protein abundance for stress granule associated proteins from total soluble proteome and UV-crosslinked samples. Abundance z-score was normalized within each sample extraction type. Hierarchical clustering across all samples is shown on the left, and on the right are the protein accession numbers. Individual RBPs with a significant increase upon UV crosslinked, stress exposure and compared to their respective UV crosslinked samples are highlighted in blue.
FIGURE 3 | [...] (Köster et al., 2017), (A-G) Seven enriched amino acid motifs against the input proteome reference and (H-K) enriched motifs from the spliceosome specific data set as analyzed by the DREME software, which is part of the MEME suite.
FIGURE 4 | Biophysical features of drought stress responsive spliceosome and stress granule associated RBPs. Density of (A) protein length (number of amino acids), (B) isoelectric point (pI) and (C) hydrophobicity (gravy) were analyzed for spliceosome and stress granule associated proteins responsive to drought stress (green), RBP repertoire (Köster et al., 2017) (red) and input proteome from controls that are used as input or background (blue). (D) Log2 enrichment of amino acid residues in the spliceosome and stress granule associated proteins responsive to drought stress, determined using the composition profiler (http://www.cprofiler.org/). The significance of enrichment or depletion was tested by a two-sample T-test and amino acids that are significantly enriched or depleted (p ≤ 0.001) compared to the control background (N = 5630) are marked with an asterisk.

Dynamic Abundance of Spliceosome and Stress Granule RBPs
Quantifying protein abundance in UV-crosslinked samples and total soluble proteins in the cell lysates from the same sample allowed us to determine changes in RNA-binding that were discrete from coexisting changes in the total protein abundance. In general, RBPs show distinct abundances compared to the total proteins (Figure 2). In the spliceosome, most RBPs decreased in abundance relative to the total protein, suggesting that although the RNA association is modified, the overall abundance of the proteins is not (Figure 2A). It is important to note that RBPs such as CBP80, SCL30 and Binding to TOMV RNA 1L, which show a significant increase at the RBP level after stress, are decreasing in abundance relative to their total protein level, indicating an increase in their RNA association rather than at the translational level. Notably, the stress granule components PAB5 (AT1G71770), RBP47A (AT1G49600), RBP45B, and RBP45A (AT5G54900) increase in abundance at the RBP level relative to the total protein level. Essentially, 11 RBPs increase in their abundance for RNA interaction compared to the control upon RBP enrichment, potentially increasing the mRNA occupancy (Figure 2B). Remarkably, the changes in UV-enriched protein abundance often do not correlate with variations in total protein abundance, signifying that specific RBPs bind RNA differentially during stress.

Domain Organization of Spliceosome and SG Associated Drought Stress Responsive RBPs
Analysis of domain diversity was performed using Prosite (http://prosite.expasy.org). The majority of the proteins contain at least one of the classical RBDs (Supplementary File 4), including the RNA recognition motif (RRM) domain harbored by 27 proteins. The RRM domain is the most dominant classical RBD and its dominance among drought stress responsive proteins may suggest a preferential or specific stress response control mechanism. Conceivably, this expands the protein function to the transcriptional regulation of responses to abiotic stress. At 1 h post-treatment only, abundances of all the RRM domain-containing proteins including glycine-rich RBP5, RNA binding family proteins (AT5G02530, AT4G00830, AT1G03457), PAB4, PAB8, RBP45A, RBP45B, RBP45C, and RBP47B decreased (Table 1). It remains unresolved whether domain architecture and/or domain organization of an RBP affects RNA targets in response to drought stress, but we hypothesize that environmental stress responses may operate via RBPs with a particular domain architecture and in turn influence their RNA targets.

Analyses of Amino Acid Motifs, Biophysical Features and Sequence Topology
In order to gain further insights into what could be driving this targeted change in RBPs, particularly their differential accumulation and RNA interaction, we characterized their amino acid sequence complexity features. Enriched amino acid sequence motifs were computed against the published RBP repertoire (Köster et al., 2017) and then the input proteome reference using the DREME software, which is part of the MEME suite (Bailey, 2011). Two amino acid motifs were significantly (E-value ≤ 0.05) enriched against the background of the RBP repertoire (Figures 3A,B) and six motifs were detected as significantly enriched against the input proteome (Figures 3A-G). Enrichment is biased toward glycine-rich (GR) motifs that have been reported to play a role in several RNA-associated processes (Figures 3A-D) (Thandapani et al., 2013). The GR motifs detected include GGGY, GYGFV, IFVGG[LI], and GGYGG. The GGGY motif belongs to the GR repeat class I category and is detected in GR containing proteins that act as signal peptides and play a role in pre-mRNA processing, for example the ATP-dependent RNA helicases. The motifs GYGFV and IFVGG[LI] are classical conserved consensus sequences of the RRM domains of RNP1 and 2, respectively (Lorkovic and Barta, 2002). Motifs GGGY and GYGFV have been detected in dehydrins, a group of evolutionarily conserved GR hydrophilic "late embryogenesis proteins," which are responsive to ABA treatment, dehydration, salinity or cold stress (Allagulova Ch et al., 2003). Other motifs enriched among the spliceosome and SG drought stress responsive proteins are shown in Figures 3F,G. Interestingly, when we consider only spliceosome proteins, we identified distinct motifs including two GR (RGGR and GRRG), an RS (RSRSRS) and RDRR motifs (Figures 3H-K), of which RGGR is consistently detected.
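The idea behind discriminative motif enrichment can be illustrated with a short Python sketch: count how many sequences in a foreground set and in a background set contain a motif written as a regular expression, then apply a one-sided Fisher's exact test. This is a simplified stand-in and not DREME itself, which additionally searches the space of candidate motifs; the function names and the toy sequences are hypothetical.

```python
import re
from math import comb

def count_hits(pattern, sequences):
    """Number of sequences containing at least one motif match."""
    rx = re.compile(pattern)
    return sum(1 for seq in sequences if rx.search(seq))

def fisher_right_tail(a, b, c, d):
    """One-sided Fisher p-value for the table [[a, b], [c, d]].

    a/b: foreground with/without motif; c/d: background with/without.
    Returns P(X >= a) under the hypergeometric null.
    """
    n_fg, n_bg, hits = a + b, c + d, a + c
    total = comb(n_fg + n_bg, hits)
    upper = min(n_fg, hits)
    tail = sum(comb(n_fg, x) * comb(n_bg, hits - x)
               for x in range(a, upper + 1))
    return tail / total

def motif_enrichment(pattern, foreground, background):
    """Return (foreground hits, background hits, p-value)."""
    a = count_hits(pattern, foreground)
    c = count_hits(pattern, background)
    p = fisher_right_tail(a, len(foreground) - a, c, len(background) - c)
    return a, c, p

if __name__ == "__main__":
    # Toy sequences for illustration only.
    fg = ["MRSRSRSA", "AARGGRAA", "MKLV"]
    bg = ["MKLV", "AAAA", "GGGG", "RSRSRSK"]
    print(motif_enrichment(r"RSRSRS", fg, bg))
    print(motif_enrichment(r"IFVGG[LI]", ["AIFVGGLA"], bg))
```

Bracketed classes such as `[LI]` in IFVGG[LI] map directly onto regular-expression character classes, which is why motifs of this form can be tested with `re`.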
Examining the biophysical and amino acid characteristics of the spliceosome and SG associated drought stress response proteins, we observed that the two reference datasets span the whole spectrum of protein sizes (represented as number of amino acids in Figure 4A), while the spliceosome and SG associated proteins were generally <1200 amino acid residues long. The majority of proteins in all datasets contain <1000 aa, and some inclination toward longer proteins was detected among the RBP repertoire (Figure 4A). However, this difference was not statistically significant, probably due to the low number of proteins in our dataset compared to the background samples. Compared to the input reference, the RBP repertoire exhibited an isoelectric point distribution that is skewed toward higher alkaline isoelectric points, while spliceosome and SG proteins displayed two distinct peaks, one in the acidic and one in the alkaline range (Figure 4B). The RBP repertoire, spliceosome and SG-associated proteins exhibited a bias toward lower hydrophobicity than the input proteome (Figure 4C). For these parameters, a similar trend was observed with regard to the drought stress responsive RBPs, in particular for classical RBPs and proteins whose association with RNA has been established, as previously described (Marondedze et al., 2019).
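As an illustration of how two of these per-protein features can be derived from sequence alone, the sketch below computes protein length and a mean hydropathy (GRAVY) score. The Kyte-Doolittle scale is an assumption made for this example, since the text does not state which hydrophobicity scale was used; the protein names and sequences are hypothetical, and isoelectric point calculation is left out because it depends on a chosen set of pKa values.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy of a protein; lower values mean less hydrophobic."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def summarize(proteins):
    """Map each protein name to a (length, GRAVY) tuple."""
    return {name: (len(seq), gravy(seq)) for name, seq in proteins.items()}


# Hypothetical toy sequences, not taken from the study.
proteins = {"rs_rich": "RSRSRSGGGY", "hydrophobic": "IVLLIVAF"}
for name, (length, score) in summarize(proteins).items():
    print(name, length, round(score, 2))
```

Applying the same summary to the foreground proteins and to each reference set gives the distributions that are compared in plots such as Figures 4A and 4C.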
Amino acid distribution and enrichment between the reference input, spliceosome and SG associated proteins were determined using the composition profiler program (http://www.cprofiler.org/). Charged and polar amino acids that have high affinity for RNA were enriched among the spliceosome and SG associated proteins, and these include arginine (R), glutamine (Q), and aspartate (D) (Figure 4D). Glycine (G), an aa that is known to interact strongly with guanine (Lejeune et al., 2005), exhibited the highest enrichment. In contrast, the hydrophobic aa with aliphatic side chains, isoleucine (I), leucine (L), and valine (V), were underrepresented (Figure 4D). Also underrepresented are amino acids with aromatic side chains [phenylalanine (F) and tryptophan (W)]. Proline (P) was significantly enriched and has been reported as strongly enhanced in highly disordered protein regions, a central concept for RBPs (Sysoev et al., 2016). We also noted that a similar set of amino acids was enriched in the total RBPs responsive to drought stress, with the exception of R and P, which are significantly enriched in the drought stress responsive spliceosome and stress granule proteins (Marondedze et al., 2019). Overall, these molecular and biophysical findings reflect previously reported characteristics of mRNA interactomes and confirm the properties of mRNA recognition. However, the drought stress response stimulus seems to favor a specific set of spliceosome complex proteins with specific characteristics, e.g., strong enrichment of aa like glycine, highly alkaline isoelectric points and a bias toward relatively short proteins.
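The kind of comparison underlying such a composition analysis can be sketched as follows. This is a minimal illustration, assuming that enrichment is expressed as the relative difference (foreground - background) / background for each residue; it omits any significance testing, and the sequences and function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences):
    """Fraction of each standard amino acid over all given sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def relative_difference(foreground, background):
    """(fg - bg) / bg per residue; residues absent from the background are skipped."""
    return {aa: (foreground[aa] - background[aa]) / background[aa]
            for aa in AMINO_ACIDS if background[aa] > 0}


# Hypothetical toy sequences, not taken from the study.
fg = composition(["GGGR"])
bg = composition(["GRAL"])
for aa, diff in sorted(relative_difference(fg, bg).items()):
    print(aa, diff)
```

Positive values indicate residues that are overrepresented in the foreground set, and negative values indicate depletion, which matches how enrichment and underrepresentation are described for Figure 4D.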
FIGURE 5 | Evolutionary conservation of pre-mRNA processing factor 19 and polyadenylate-binding protein RBP45A. Phylogenetic trees and motif or domain copy numbers of pre-mRNA processing factor 19 (PRP19) (A,B) and polyadenylate-binding protein RBP45A (C,D). Phylogenetic tree construction was performed using the phylogeny web service (Phylogeny.fr), and the motifs and copy number assignments were performed using ScanProsite (http://prosite.expasy.org/).

Conservation of Spliceosome and SG Associated drRBPs Across Different Species
Using InParanoid8 with the spliceosome and SG proteins that are significantly regulated during drought stress as queries, 24 proteins were predicted to have orthologs in dicotyledons and 26 in monocotyledons. Most of the orthologs were detected in barley (Hordeum vulgare) and rice (Oryza sativa) (Table 2, Supplementary File 5). Only 13 Arabidopsis proteins had orthologs detected in all the plant species examined in this study, and three proteins had no orthologs in any species examined. The general trend of high sequence identity among predicted orthologs across the plant kingdom could serve as a positive criterion in engineering increased drought tolerance in crop plants. In addition, 25 proteins were predicted to have orthologs in at least one of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, or Saccharomyces cerevisiae. Notably, the eight proteins with orthologs in yeast might indicate an ancient origin of RBP dependent processes.
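Summaries such as "orthologs in all species examined" or "no orthologs in any species" reduce to set operations on a presence table. The sketch below assumes a simple mapping from each query protein to the set of species in which an ortholog was predicted; the protein labels and the table contents are hypothetical placeholders and do not reproduce Supplementary File 5.

```python
def conserved_everywhere(orthologs, species):
    """Proteins with a predicted ortholog in every listed species."""
    required = set(species)
    return sorted(p for p, found in orthologs.items() if required <= found)


def without_orthologs(orthologs):
    """Proteins with no predicted ortholog in any species."""
    return sorted(p for p, found in orthologs.items() if not found)


# Hypothetical presence table: protein -> species with a predicted ortholog.
species = ["Hordeum vulgare", "Oryza sativa", "Sorghum bicolor"]
orthologs = {
    "protein_A": {"Hordeum vulgare", "Oryza sativa", "Sorghum bicolor"},
    "protein_B": {"Hordeum vulgare"},
    "protein_C": set(),
}

print(conserved_everywhere(orthologs, species))
print(without_orthologs(orthologs))
```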
Further analysis of evolutionary trends of the Arabidopsis drought responsive spliceosome and SG associated protein orthologs across selected species was performed using a combination of phylogenetics, protein sequence motifs, and copy number. We observed that the pre-mRNA processing factor 19 (PRP19, AT2G33340), also termed MOS4-associated complex 3B, has orthologs in all of the species examined (Supplementary File 5), ranging from yeast to higher plants and animals. The Arabidopsis protein is closely related to its orthologs from Vitis vinifera and Glycine max (Figure 5A). This PRP19 protein is a ubiquitin-protein ligase containing WD-40 repeats that is involved in pre-mRNA splicing. Its role is conserved from yeast to higher eukaryotes, including plants. It acts as a central component of the PRP19-associated complex that stabilizes the spliceosome structure after dissociation of the U4 snRNA, a role that is mediated by the WD-40 repeats (Smith et al., 1999;Ohi et al., 2003). In plants, PRP19 was postulated to regulate defense responses via transcriptional control (Monaghan et al., 2009). The PRP19 domains are present and highly conserved in all species; however, the WD-40 repeats vary in copy number from one in barley to seven in Arabidopsis, rice, human and mouse (Figure 5B). Another protein worth noting is RBP45A, which has orthologs in rice, barley, yeast, and in primitive photosynthetic organisms, including the green alga Chlamydomonas reinhardtii (Figure 5C), indicating an ancient origin. The RRM domains are conserved, thus retaining the function of the protein, although in algae only two RRM domains are detected while in yeast and plants three RRM domains are present (Figure 5D). Other proteins, such as CBP80 and the hyaluronan (AtRGGA), that have previously been shown to be involved in drought stress have no orthologs in yeast (Supplementary File 5).
An ortholog of the Arabidopsis CBP80 is detected in barley and sorghum (Sorghum bicolor) with a varying domain architecture (Figure 6A). In the latter, the protein lacks the MIF4G motif that is implicated in mRNA splicing (Figure 6B). The observed differences in domain architecture and composition could in part reflect the fact that barley and sorghum are in general drought stress tolerant crops that may differ from Arabidopsis in their ABA regulation, for example.
FIGURE 6 | Evolutionary conservation of ABA hypersensitive 1 (CBP80) and AtRGGA (hyaluronan). Phylogenetic trees and motif or domain copy numbers of CBP80 (A,B) and AtRGGA (C,D). Phylogenetic tree construction was performed using the phylogeny web service (Phylogeny.fr), and the motifs and copy number assignments were performed using ScanProsite (http://prosite.expasy.org/).
The orthologs of hyaluronan were found in plants and Drosophila but not in yeast (Figure 6C). Its absence in yeast is not surprising, as yeast does not synthesize hyaluronan. The plant orthologs have two main domains, the Stm1_N and the HABP4_PAI-RBP1 (Figure 6D). The Stm1_N domain is present at the N-terminus of some RGG repeats of nuclear RBPs and associates with ribosomes and nuclear telomere cap complexes (Van Dyke et al., 2004). The HABP4 domain has been observed to bind RNA but with lower affinity than that for hyaluronic acid (Huang et al., 2000), while the PAI-RBP1 binds mainly to the mRNA of plasminogen activator inhibitor 1 and regulates mRNA stability (Heaton et al., 2001).

CONCLUSION
Drought induces significant changes in the spliceosome. In turn, these changes reflect modifications of the transcriptional program and hence responses to stress. In addition, the identification of stress granule components points to a translational arrest induced by drought stress. The overlap of co-expressed proteins between the two distinct molecular processes, spliceosome function and stress granule formation, suggests the presence of a systems level response and possibly crosstalk. Furthermore, the presence of conserved stress responsive RBPs indicates an ancient origin of these proteins and possibly an evolutionarily conserved post-transcriptional regulation that operates during stress responses and adaptation. This study sets the foundation for future detailed and mechanistic approaches to elucidate the dynamic changes and functional roles of RBPs and their bound RNAs under various environmental cues.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the PRIDE repository under accession PXD016883.

AUTHOR CONTRIBUTIONS
CM, CG, KL, and LT designed the research. CM performed all the experiments and data analysis. KL and LT provided technical assistance. CM and CG drafted the manuscript. All authors participated in revising the manuscript and approved the final version.