Molecular Characterization and Expression Profiling of Brachypodium distachyon L. Cystatin Genes Reveal High Evolutionary Conservation and Functional Divergence in Response to Abiotic Stress

Cystatin is a class of proteins mainly involved in cysteine protease inhibition and plant growth and development, as well as tolerance under various abiotic stresses. In this study, we performed the first comprehensive analysis of the molecular characterization and expression profiling in response to various abiotic stresses of the cystatin gene family in Brachypodium distachyon, a novel model plant for Triticum species with huge genomes. Comprehensive searches of the Brachypodium genome database identified 25 B. distachyon cystatin (BdC) genes that are distributed unevenly on chromosomes; of these, nine and two were involved in tandem and segmental duplication events, respectively. All BdC genes had similar exon/intron structural organization, with three conserved motifs similar to those from other plant species, indicating their high evolutionary conservation. Expression profiling of 10 typical BdC genes revealed ubiquitous expression in different organs at varying expression levels. BdC gene expression in seedling leaves was particularly highly induced by various abiotic stresses, including the plant hormone abscisic acid and various environmental cues (cold, H2O2, CdCl2, salt, and drought). Interestingly, most BdC genes were significantly upregulated under multiple abiotic stresses, including BdC15 under all stresses, BdC7-2 and BdC10 under five stresses, and BdC7-1, BdC2-1, BdC14, and BdC12 under four stresses. The putative metabolic pathways of cytastin genes in response to various abiotic stresses mainly involve the aberrant protein degradation pathway and reactive oxygen species (ROS)-triggered programmed cell death signaling pathways. These observations provide a better understanding of the structural and functional characteristics of the plant cystatin gene family.


INTRODUCTION
Cystatins, which constitute a multigene family, form a class of proteins that inhibits cysteins proteases (Turk and Bode, 1991). Cystatins are sub-divided into stefins without disulfide bonds (family 1), cystatins with two disulfide bonds (family 2), and kininogens with nine disulfide bonds (family 3) based on primary sequence homology (Abrahamson et al., 2003). Most cystatins inhibit the activities of cathepsin L-like proteases, a cysteine protease in the peptidase C1A family (Martinez et al., 2009). Cystatins are widely distributed in both animal and plant systems (Margis et al., 1998;Kotsyfakis et al., 2006). Plant cystatins, referred to as phytocystatins (phy-cys), are small in size, about 12-16 kDa, and have the LARFAV consensus sequence motif in the region corresponding to a predicted N-terminal αhelix (Misaka et al., 1996). Additionally, phy-cys are believed to contain either N or C-terminal extensions that apparently raise their molecular weights up to 25 kDa (Misaka et al., 1996;Martinez et al., 2005). It has been suggested that phy-cys with short N-terminal and longer C-terminal extensions inhibit the activities of cysteine proteases in the peptidase C13 family (Martinez et al., 2007). There are three important signature motifs necessary for the protease inhibition reactions present in all cystatins: a QxVxG reactive site, one or two glycine (G) residues in the N-terminal part of the protein, and a tryptophan residue (W) located downstream of the reactive site (Margis et al., 1998).
Brachypodium distachyon L., a temperate wild annual grass in the Pooideae subfamily has emerged as a novel model plant in the study of temperate cereals, such as wheat and related species (Draper et al., 2001). Although cystatin proteins have been investigated in some plant species, information on this gene family in B. distachyon is limited. Genomewide identification and characterization of cystatin genes in B. distachyon are necessary to determine their functional roles in plant developmental processes and in defense against abiotic stress, which will help to improve cereal crop resistance to various stresses. In the present study, we provide the first molecular characterization and expression profiling of the B. distachyon cystatin genes in various tissues and examine their reactions under different abiotic stresses. Our findings provide novel insights into the structure, evolution, and function of the plant cystatin gene family.

Retrieval and Identification of Cystatin Gene Sequences
To obtain the B. distachyon cystatin genes, previously published orthologous cystatin gene sequences from Hordeum vulgare (Martinez et al., 2009), Oryza sativa (Wang et al., 2015), Triticum aestivum (Kuroda et al., 2001), Zea mays (Massonneau et al., 2005), and Sorghum bicolor (Li et al., 1996) are listed in Table S1, which were used to BLAST against the Brachypodium distachyon genome database, Phytozome v9.0 (http://www.phytozome.net) by the BLAST program. Sequences were selected as candidate genes if they were described as cysteine protease inhibitor activity along with their E-value <10e-10. For each query sequence, information of the location on chromosomes, genomic sequences, full coding sequences (CDS), and protein sequences were collected from Phytozome. Unique cystatin genes were obtained by manually excluding the redundant sequences. Eventually, the identified candidate genes were named as Brachypodium distachyon cystatin (BdC). The putative cystatin protein sequences of B. distachyon are further analyzed with the InterPro program using the PFAM database (http://pfam.sanger.ac.uk; Bateman et al., 2002) and their cystatin domains deduced. Following the PFAM search, BdC genes without typical domain (Aspartic acid proteinase inhibitor) and reactive site motif (QxVxG; Margis et al., 1998) of cystatin protein were deleted from further analysis.

Chromosomal Locations, Exons/Introns Organization, Conserved Motif Analyses, and Characteristics of Cystatin Genes
The gene locations were based on the Phytozome v9.1 database and mapped by MapInspect software. Identification and cataloging of B. distachyon cystatin genes in terms of intragenome or cross-genome syntenic relationships were conducted using the Plant Genome Duplication Database (PGDD) (http://chibba.agtec.uga.edu/duplication/).
The genomic organization such as exons and introns were analyzed by Gene Structure Display Server (GSDS; Guo et al., 2007). Analysis of conserved motifs was performed by MEME (Multiple Em for Motif Elicitation) software version 3.5.4 (Bailey et al., 2006) (http://meme-suite.org) using minimum and maximum motif width of 8 and 15 residues respectively, and a maximum number of 15 motifs, keeping the rest of the parameters at default. The protein sequence characteristics such as pI/Mw and signal peptide was predicted, respectively by using Compute pI/Mw tool (Gasteiger et al., 2005) and Signal P4.1 (Petersen et al., 2011). The subcellular distribution of the proteins was predicted by using TargetP 1.1 (www.cbs.dtu.dk/services/TargetP/) server.

Multiple Sequence Alignment, Hierarchical Cluster Analysis, Tertiary Structure Prediction, and Promoter Analysis of Cystatin Genes
Analysis of DNA and comparisons of deduced protein sequences alignments were carried out by BioEdit software (Hall, 1999). Hierarchical clustering of cystatins was performed by MultiAlin tool using alignment parameters of identity, gap penalty at 8 and 2, respectively for opening and extension (Corpet, 1988). The three-dimensional structures of the B. distachyon cystatins were modeled by Phyre2 Server (http://www.sbg.bio.ic.ac.uk/ phyre2/html/) (Kelley and Sternberg, 2009). Promoter sequences of cystatins were examined using plantCARE database (http:// bioinformatics.psb.ugent.be/webtools/plantcare/html/) (Lescot et al., 2002). A stretch of 2,000 bases upstream of the start site was considered for analysis.

Phylogenetic Analysis
A total of 71 cystatin sequences those from 7 plant species including Brachypodium in this study were used to construct a phylogenetic tree. The sequences and respective protein ID or transcript names are displayed in Table S1, and the corresponding nomenclatures were composed of two letters for genus and species, followed by BdC and an Arabic number. The cystatin amino acid sequences of the whole coding regions were aligned by ClustalW parameters using the Gonnet series as the protein weight matrix. Phylogenetic analysis of the sequences was done by MEGA (Molecular Evolutionary Genetic Analysis) software 5.10 (Tamura et al., 2011) using the neighbor-joining (NJ) method with complete deletion, JTT matrix-based method (Jones et al., 1992) and 1,000 bootstrap replicates with the bootstrap method.

Plant Growth, Stress Treatments and Sample Collection
Brachypodium distachyon 21 (Bd21) seeds were sterilized with 75% alcohol and 15% sodium hypochlorite, rinsed 4-5 times and placed on moistened filter paper in Petri dishes and germinated at 26 • C for 1 week. Then the seedlings were transferred to plastic pots (ten seedlings per pot) filled with Hoagland solution in a growth room at 22 • C and a 16 h day/8 h night photoperiod and supplemented with an average cool white fluorescent light photon flux of 180 µmol s −1 m −2 . The nutrient solution in pots were routinely changed every 3 days. When the seedlings reached the two-leaf stage, various stresses such as cold, H 2 O 2 , CdCl 2 , drought, salinity, and ABA treatments were initiated according to the procedures described in previous reports Zhu et al., 2015). For each treatment, three pots were used. Cold stress was provided to seedlings by placing them in a growth chamber with 4 • C for 12 and 24 h. CdCl 2 stress were investigated with 50 µM for 6 and 12 h. Drought treatment was carried out with 200 mM polyethylene glycol 6,000 for 12 and 24 h; salt treatment accomplished with 160 mM sodium chloride treatment for 12 and 24 h; 0.1 mM ABA treatment done for 6 h; and 20 mM H 2 O 2 treatment was for 2, 4, and 6 h. For each experimental conditions either for stress treatment and control plants, triplicates (three biological replicates) were used. After the stress treatment, control and treated leaves were harvested for assays. Developing caryopses were sampled from 4 to 30 days post anthesis (DPA) at 2-5 day intervals (12 sampling times) according to a previous study . All samples were immediately frozen in liquid nitrogen and kept at −80 • C prior to RNA isolation.
mRNA Isolation, cDNA Synthesis and Quantitative Real-Time Polymerase Chain Reaction (qRT-PCR) Total mRNA was extracted by using the Trizol method. Transcript levels of cystatins genes were analyzed at relative levels by using the SYBR Green-based qRT-PCR method (CFX96, Bio-Rad Thermal Cycler system C1000 series). A 20 µl of reaction volume containing cDNA template, 2 × SYBR premix Ex Taq, 0.5 µM oligonucleotide primer (Takara, SYBR Ex Taq, China) was used for qRT-PCR. The PCR reaction included one cycle at 95 • C for 3 min, followed by 39 cycles of 95 • C for 15 s, 60 • C for 20 s and 72 • C for 15 s, and a final cycle at 65 • C for 5 s and 95 • C for 2 s to check the specificity of the oligonucleotides annealing and dissociation kinetics. The Ubiquitin gene from Brachypodium was used as reference genes according to previous reports (Hong et al., 2008). Primer pairs for qRT-PCR analysis ( Table S2) were designed by the Primer3Plus program (http://www.bioinformatics.nl). Genespecific amplification of both target and reference genes were standardized by the presence of a single, dominant peak in the qRT-PCR dissociation curve analyses. The relative expression level of cystatin mRNA transcripts was calculated relative to Ubiquitin by using the comparative threshold cycle method (Pfaffl, 2001). To determine the number of cDNA copies of duplicated cystatin genes, absolute mRNA expression levels of five genes were measured by construction of standard curve with serial dilutions of known amount of linearised plasmid DNA carrying target (Cystatin) and reference (Ubiquitin) gene, in which transcript or cDNA copies of each target genes were estimated according to previous report (d 'Aloisio et al., 2010;Subburaj et al., 2014). The amplification efficiencies (E) and R2-values (coefficient of determination) of both target and reference genes were generated using the slopes of the standard curves obtained by serial dilutions. The efficiency range of the qRT-PCR amplifications for all of the genes tested was between 90 and 110%. All qRT-PCRs were carried out for three technical and three biological replicates and were normalized according to previous reports (Pfaffl, 2001;Hong et al., 2008).

In silico Identification and Genomic Distribution of Cystatin Genes in B. distachyon
To obtain B. distachyon cystatin (BdC) genes, previously characterized cystatin sequences from wheat, rice, barley, and maize were used as queries to search the public Brachypodium genome database in Phytozome v9.0 (http://www.phytozome. org/). A total of 25 non-redundant BdC genes and their protein encoding sequences were identified (Table S3) and serially named BdC1-BdC19 based on their location and chromosomal order ( Table 1). Chromosomal distribution showed that BdC genes are dispersed over several chromosomes (Figure 1), and the number of BdC genes distributed per chromosome (Nos. 1 to 5) was 15, 6, 0, 0, and 4, respectively. The highest BdC density was found on chromosome 1, with lower densities on chromosomes 2 and 5, whereas chromosomes 3 and 4 had no cystatin genes.
All three motifs were observed in all the cystatin protein sequences those from different plant species except some of the cystatins for rice (OS-12) and wheat (WC-2), where motif-3 and motif-1 were missing, respectively. ( Figure S1). Focusing on Brachypodium, motif 3 appeared twice in the BdC8 protein, once near the N-terminus and again near the C-terminus. Motif 3 was not present in BdC3-1, BdC3-2, BdC3-3, BdC11, and BdC13. Similarly, motif 2 and motif 3 were not present in BdC12 ( Figure 3A). However, the rest of the predicted BdC proteins (exception) had conserved motifs 1, 2, and 3, whose locations were highly homologous in various plant species ( Figure S1). The missing of motifs in several BdC proteins (BdC3-1, BdC3-2, BdC3-3, BdC11, BdC12, and BdC13) indicating a different gene structural characteristics in regard to intron-exon relationships as shown in Figure 2B. From these analyses, differences in motif distribution in cystatin proteins of plants indicated that the functions of these genes might have diverged during evolution.

Amino Acid Structural Analysis of B. Distachyon Cystatins
To search for amino acid variants that could lead to differences in the inhibitory capability of B. distachyon cystatins, alignment of all BdC sequences was performed in the CLUSTAL W program ( Figure S2); this analysis included OC10 from rice, WC1 from wheat, CC1 from sorghum, and HvCPI-3 from barley. SignalP predicted the presence of signal peptides in all BdC proteins except BdC14 and BdC18-1. Cleavage site prediction was run in parallel in TargetP, thus confirming the results ( Figure S2). N-terminal (BdC3-1) and C-terminal (BdC15, OC10, and HvCPI-3) extensions of varying lengths were present in several cystatins, which were not shown completely in the comparison ( Figure S2). In addition to these extensions, differences in the extent of the amino acid sequences corresponding to the loops connecting the β-sheets and α-helices were also found in some cystatins, such as BdC1-1, BdC2-1, BdC8, and so on ( Figure S2).
With the exception of some sequences whose signature motifs were minimally disrupted, the significant protein signatures responsible for cysteine proteinase inhibitory properties were conserved among the 25 BdC sequences. The N-terminal motif LARFAV was fairly conserved among BdC, whereas most had no perfect match with the LARFAV motif from other species ( Figure S2). A hierarchical cluster analysis of BdC, along with cystatins from other species (rice, barley, wheat, and sorghum), indicated that the N-terminal G/GG was also highly conserved in BdC proteins, whereas it was absent in BdC3-1, BdC3-3, BdC5, BdC7-2, and BdC11 ( Figure S3). A conserved G immediately preceded the main body in the N terminus. The region preceding the conserved G is referred to as the N-terminal trunk (NTT) and appears in some BdC proteins ( Figure S3). The functionally indispensable reactive site pentapeptide sequence QXVXG was also observed in all BdC proteins based on ClustalW alignment ( Figure S2) and hierarchical cluster analysis ( Figure S3; Table 1). Although this site in some BdC proteins (BdC3-1, BdC3-3, BdC4, BdC5, BdC9, BdC10, BdC11, BdC12, and BdC13) was partially disrupted by various amino acid residues, this consensus sequence was almost replaced by either DPVVK (hierarchical cluster analysis) or EVVED (ClustalW alignment). Another conserved motif (P/AW) was present only near the C-terminal end in eight BdC proteins (BdC2-1, BdC2-2, BdC14, BdC15, BdC16, BdC17, BdC18-1, and BdC18-2).
To examine structural characteristics, the tertiary structure of all BdC proteins, along with WC1 (wheat), CC1 (maize), and HvCPI-3 (barley), were predicted using the Phyre2 server ( Figure S4). These structures were predicted with similar degrees of accuracy, and almost all BdC proteins conserved key protein motifs with other species in the ClustalW alignment. Therefore, their tertiary structures were similar, conserving an α-helix spanning the LARFAV motif and four main βsheets (β2, β3, β4, and β5; Figure S4). All cystatins consisted of an N-terminal α-helix along with another in the central loop region. However, two Brachypodium cystatins (BdC12 and Bd18-1) showed significant variations in their predicted threedimensional structures, consisting of two α-helices in their central loop regions, probably due to two different reactive sites (QxVxG) modeled by the program (Figure S4, Table 1). The presence of amino acid residue insertions in some cystatins (BdC8, BdC9, BdC10, and BdC11) suggests that these cystatins could have a more extensive loop between each β-sheet than other cystatins. The overall predicted tertiary structures of all BdC proteins were similar to those from wheat (WC1), maize (CC1), and barley (HvCPI-3) ( Figure S4).

BdC Promoter Analysis
Generally, stress-responsive cis-acting elements are present in the promoter regions of stress-inducible genes. A motif search was performed using PlantCARE (Lescot et al., 2002; http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) to identify putative cis-elements in the 1,500 bp promoter sequence upstream of the initiation codon of all BdC genes ( Table S3). The occurrence of cis-elements in BdC genes is shown in Table 2.
Several potential regulatory elements associated with stressrelated transcription factor-binding sites were found, including ABA-response elements (ABREs), CCAAT boxes, heat shock elements (HSEs), stress response elements (STREs), and woundcum-pathogen responsive elements (W-boxes) ( Table 2). The CCAAT enhancer sequences represent binding sites for CCAAT enhancer binding proteins (C/EBP) and act cooperatively with HSEs to increase promoter activation under abiotic stress conditions (Rieping and Schöffl, 1992). The STRE elements are important for transcriptional activation in response to a variety of abiotic stress conditions (Siderius and Mager, 1997). The W-box (consensus sequence TTGAC) binds WRKY factors and responds to heat and wounding (Levée et al., 2004;Ülker and Somssich, 2004) and was found in almost all BdC genes. Similarly, ABRE is a major cis-acting regulatory element that plays important roles in adapting vegetative tissues to abiotic stresses, such as drought and high salinity, as well as in seed maturation and dormancy (Shinozaki et al., 2004). Three important seed-specific cis-motifs (Skn-1_motif, GCN4_motif, and RY-element) were conserved in the promoter regions of some BdC genes, suggesting that these genes are involved in regulating the gene expression of cereal grain storage proteins (Thomas and Flavell, 1990;Ueda et al., 1994). Additionally, other stress and defense responsive elements, such as TC-rich repeats, the G-Box, and the 5 ′ UTR Py-rich stretch, were also identified among BdC genes. More interestingly, a light responsive cis-element such as G-Box was appeared two times in BdC promoters compare to other ciselements, suggesting that BdC genes may highly inducible by light stress.

Differential mRNA Expression of BdC Genes in Different Organs and Determining cDNA Copy Numbers of Duplicated BdC Genes
The transcriptional expression levels of 10 typical BdC genes in different organs, including roots, stems, and leaves at the twoleaf and heading stages, paleas, lemmas, seeds, and developing caryopses, were investigated using qRT-PCR (Figure 5). Specific primer sets were designed for each BdC gene (Table S2). BdC1-1, BdC4, BdC7-1, BdC7-2, and BdC10 had lower expression levels in roots, stems, leaves from seedlings at the two-leaf stage, and 11 DPA paleas compared with BdC2-1, BdC12, BdC14, BdC15, and BdC16, whereas the expression of BdC12 was abundant in roots, leaves, and 11 DPA lemmas. BdC2-1, BdC14, and BdC15 displayed higher expression levels in lemmas (23 DPA) and flag-leaves (23 DPA) compared with other organs. Two pairs of BdC genes (BdC1-1 and BdC4, BdC7-1, and BdC10) shared similar patterns of expression. In contrast, BdC7-1 and BdC7-2, originating from a duplication event, had significantly different expressions in 23 DPA paleas ( Figure 5A). Ten BdC genes displayed distinct expression profiles in different organs.
The dynamic transcription expression levels of 10 BdC genes in 8 different grain developmental stages were investigated ( Figure 5B). The sampled caryopses covered the main stages of endosperm development from cellularization to desiccation. These 10 BdC genes displayed three main expression patterns. The first pattern was displayed by three BdC genes (BdC12, BdC14, and BdC16) and showed gradually increasing expression levels from 6 to 15 DPA, reaching the highest level between 18 and 25 DPA, and decreasing until 28 DPA. The second pattern, displayed by four BdC genes (BdC1-1 BdC10, BdC2-1, and BdC15), had high expression levels in early grain developmental stages (6-12 DPA) and then decreased and were maintained at relatively low abundances. The remaining genes (BdC4, BdC7-1, and BdC7-2) displayed a third expression pattern, with much higher abundances in the early grain developmental stages (6-12 DPA); these levels then decreased slightly or remained relatively low until 18 DPA, after which expression levels increased dramatically, reaching a peak at 22 DPA. It should be noted that both BdC7-1 and BdC7-2 had a uniform expression pattern, with higher expression in developing seeds than in vegetative organs.
Real-time quantitative PCR has been proved to be an efficient method for the quantification of cDNA copy numbers of gene transcripts in which the presence of alleles of large gene families with highly homologs members could be detected in a cDNA population by discriminate the expression level between genes or individual members (d' Aloisio et al., 2010;Kaczmarczyk et al., 2012). In the present study, we found several tandem or segmentally duplicated genes which were shown high allelic similarities to their corresponding duplicated gene pair during sequence comparisons (Figure 2A). In order to confirm the existence of duplicated BdC genes in cDNA population through discriminate the expression levels between the individual BdC members, we carried out an absolute mRNA expression analysis and determined the number of cDNA copies of several tandem (BdC3-1, BdC3-2, BdC3-3) and segmentally duplicated (BdC1-1 and BdC1-2) genes. Allelic specific primers were designed to  discriminate between the duplicated BdC members to prove the primer specificity on the cDNA template of the corresponding allele (Table S2). Among three different tissues (leaf, root and seed) were screened by absolute qRT-PCR analysis, BdC1-1 showed the highest amount of cDNA copies in leaf (6.18 × 10 2 ) where BdC1-2 had about 4.27 × 10 2 . Similarly, a maximum number of BdC1-2 cDNA copies was estimated as 7.50 × 10 0 at seed where BdC1-1 were only 2.45 × 10 1 , as shown in Table S4. While comparing the amount of cDNA copies among BdC3 members in three different tissues, BdC3-1 had a maximum number of cDNA copies at seed (9.20 × 10 2 ) which was higher than cDNA copies of either BdC3-2 (2.73 × 10 1 ) or BdC3-3 (1.18 × 10 4 ). However, BdC3-2 in root exhibited a higher level of cDNA copies (8.17 × 10 0 ) where BdC3-1 and BdC3-3 possessed only 2.68 × 10 1 and 3.43 × 10 5 , respectively. Similar to root and seed, leaf tissue also showed a considerable variation in amount of cDNA copies between BdC3-1 (1.58 × 10 1 ), BdC3-2 (1.07 × 10 2 ) and BdC3-3 (2.20 × 10 6 ) as shown in Table S4. The observed substantial fluctuation of transcription rates between these genes in each tissues might adequate to discriminate their expression levels and thus confirming that there are duplicated BdC genes in cDNA population; furthermore the presence of duplicated BdC genes in cDNA population may provided a suggestive evidence for the existence of duplicated copies of BdC genes in Brachypodium genome.

Evolutionary Conservation and Divergence of the Cystatin Gene Family in B. distachyon
Our results revealed that Brachypodium distachyon BdC genes were unevenly distributed on chromosomes 1, 2, and 5, and chromosome 1 contained the highest BdC density, followed by chromosomes 2 and 5 (Figure 1). More than half were distributed on chromosome 1, suggesting that cystatin genes in Brachypodium distachyon may have a chromosomal preference. Phylogenetic analysis showed that BdC genes, as well as those from eight other Poaceae species, were separated into three well-conserved groups. Most BdC genes shared similar exon/intron structure and motif organization, suggesting that BdC genes maintained high structural conservation over a long evolutionary process. Additionally, the phylogenetic tree also showed that the genetic relationships of Brachypodium cystatins were much closer to barley, wheat, and maize than to other species, such as rice, Job's tears, sugarcane, and Aegilops, as reported previously (International Brachypodium Initiative, 2010;Brenchley et al., 2012).
Open reading frames of different sizes and partial motif deletions and mutations of some important amino acids showed that the Brachypodium BdC gene family probably underwent a complex evolutionary history, involving unequal recombination, duplication, and deletion of gene fragments. These changes would have a significant influence on their respective functions (Kondo et al., 1990;Abraham et al., 2006). We found that some BdC proteins and the QxVxG active site motif in the central loop region were partially (BdC1, BdC3-1, BdC3-3, BdC5, BdC9, BdC10, BdC11, and BdC13) or completely (BdC4 and BdC12) modified by the insertion of or variation in important residues ( Table 1). Similar variations in the QxVxG site and its altered inhibitory action against cysteine proteinase were reported previously (Melo et al., 2003). Additionally, the presence of NTT and W residues (near the C-terminal region) may interact with cysteine proteases (Neuteboom et al., 2009) and actively participate in the inhibition of papain, cathepsin B, or cathepsin H, antifungal activities reported in a previous analysis (Abraham et al., 2006). We speculated that some hypervariable sites may be located at strategic positions on the protein: on each side of the conserved glycine residues in the NTT, within the first and second inhibitory loops entering the active site of target enzymes, and surrounding the LARFAV motif; these were assumed to be positively selected and thus implicated in functional diversity (Kiggundu et al., 2006). However, whatever variations are present at the structural level of BdC genes, the basic 3D structural fold comprising an α-helix and at least four antiparallel β-sheets (β1, β2, β3, and β4) that clearly distinguishes them as cystatins were conserved in the BdC gene structure, as reported in rice and barely (Abraham et al., 2006; Figure S4).

BdC Gene Expression and Plant Growth and Grain Development
As revealed by qRT-PCR, the transcript expression levels of BdC family members in six different tissues confirmed that they were not organ-specific, consistent with previous reports (Kuroda et al., 2001;Abraham et al., 2006;Valdes-Rodriguez et al., 2007). In the present study, several BdC genes, such as BdC2-1, BdC14, and BdC15, had higher expression levels in 23 DPA flag-leaves (Figure 5A), whereas others (BdC12 and BdC16) had significant mRNA accumulation in lemmas (11 DPA and 23 DPA). Additionally, BdC2-1 and BdC16 were considerably more highly expressed than other BdC genes in stem vegetative tissues ( Figure 5A). These results indicate that BdC genes are differentially expressed, implying they play specific roles in different organs or stages of plant growth and development.
Dynamic transcript expression profiling of BdC genes during seed development ( Figure 5B) demonstrated that almost all BdC genes (except BdC12, BdC14, and BdC16) had much higher expression levels during early seed development (6-12 DPA), which is similar to results from rice caryopse formation, in which two distinct oryzacystatin (OCI and OCII) mRNAs could be detected as early as 2 weeks after pollination (Abe et al., 1987). Similarly, some wheat cystatin (WC1, WC2, and WC4) mRNAs were detected in seeds only during the first 2 weeks after pollination (Kuroda et al., 2001;Corre-Menguy et al., 2002). In addition to the early grain developmental stages, the expression of some BdC members (BdC4, BdC7-1, BdC7-2, BdC12, and BdC14) were detected between 18 and 25 DPA (Figure 5B), corresponding to the mRNA transcript accumulation of corn cystatin (CC) in late caryopsis development, between 15 and 30 DPA (Arai et al., 2002). In general, most BdC genes are specifically expressed in developing seeds and contain the cis-acting regulatory elements required for endosperm expression ( Table 2). According to previous reports, phy-cys in seeds can play different roles, including regulation of protein turn-over during seed maturation (Kiyosaki et al., 2007), control of proteolysis during development and/or germination (Gaddour et al., 2001), and protection of seeds against pests (Martinez et al., 2009).
In wheat, WC5 mRNA accumulation was observed only in grain tissues, and its inhibitory action against thiol peptidase from seed protein extracts suggests that seed-specific cystatins play important roles as regulators of peptidase enzymes during seed development (Corre-Menguy et al., 2002). Barley cystatins (Icy1, Icy2, Icy3, and Icy4), primarily cathepsin L-like cys-proteases, were shown to be preferentially expressed in dry and germinating seeds and were efficient inhibitors. This suggests that their main roles are as specialized endogenous regulators of enzymes involved in the mobilization of stored proteins upon germination, which is crucial for seedling growth until photosynthesis is fully established (Martinez et al., 2009). Similarly, some rice and wheat cystatins were also shown to inhibit cys-proteases, such as oryzains and gliadains, which are involved in turnover functions in rice and wheat aleurone layers, respectively (Arai et al., 2002;Kiyosaki et al., 2007). Therefore, further experimental studies are necessary to elucidate the functional role of BdC members in the mobilization of storage reserves and their inhibitory action against proteases in Brachypodium seeds.

Expression Profiling and Potential Functions of BdC Genes in Response to Different Abiotic Stresses
Cystatin genes are implicated in various abiotic stress responses in different plant species, including Arabidopsis thaliana (Zhang et al., 2008), chestnut (Pernas et al., 2000), barley (Gaddour et al., 2001), cowpea (Diop et al., 2004), maize (Massonneau et al., 2005), and rice (Huang et al., 2007). In the present study, all stress treatments (cold, H 2 O 2 , CdCl 2 , salt, drought, and ABA) induced strong BdC accumulation in leaves, suggesting that BdC genes are involved in the stress-responsive mechanism of Brachypodium plants. Among upregulated BdC genes, six (BdC7-1, BdC7-2, BdC10, BdC12, BdC14, and BdC15) were upregulated in response to more than three stress treatments (Figure 6). Furthermore, promoter analysis showed that three stress-related cis-elements (ABRE, MBS, and TC-rich repeats) that frequently occur in the promoter regions of abiotic stress defense pathway genes were also present in the promoter region of these cystatin genes ( Table 2). These results imply that BdC genes could be involved in multiple stress defense mechanisms, as reported previously (Zhang et al., 2008). Different BdC genes exhibited differential expression patterns in response to six different stress treatments (Figure 6), indicating that functional differentiation among these BdC genes occurred during the evolutionary process.
In general, adverse conditions, such as salt, drought, abscisic acid, and H 2 O 2 , often result in the accumulation of reactive oxygen species (ROS) in plant cells, which can change the structural properties of proteins (Berlett and Stadtman, 1997). This can lead to the accumulation of un-folded or misfolded aberrant proteins, which are degraded mostly by cysteine protease (Demirevska et al., 2010). To maintain optimum protein degradation by cysteine protease, plants synthesize protease inhibitors, such as cystatins, to regulate cysteine protease activities under stress conditions (Zhang et al., 2008). Therefore, BdC members participate in the regulation of protease enzymes, enhancing the tolerance of Brachypodium to abiotic stress.
Upregulated expression of BdC2-1, BdC7-2, BdC10, and BdC15 under H 2 O 2 indicated that these genes are involved in antioxidant defense. A signal pathway that leads to PCD is initiated and spread via increasing accumulation of ROS induced by salt and H 2 O 2 (Desikan et al., 1998;Belenghi et al., 2003). ROS-triggered PCD is regulated by cysteine proteases, which play an instrumental role in this physiological process (Solomon et al., 1999). Plants can control PCD by inhibiting the activity of cysteine proteases by regulating the expression of specific protease inhibitor genes. In Brachypodium, the upregulation of BdC genes induced by H 2 O 2 treatment acts as an inhibitor to regulate the activity of cysteine proteases, suppress PCD, and enhance plant antioxidant defense.
Differential transcriptional induction of genes is often influenced by the presence or absence of cis-regulatory elements in the promoter region. In addition to ABRE, MBS, and TCrich repeats, we also observed the presence of other abiotic stress responsive cis-elements, such as the G-box (light responsive motif), W-box (wound and pathogen response), and HSE (heat stress). As shown in Table 2, G-box is the most prevalent cisregulatory element motif presented in BdC genes, and their functional implications in light and other abiotic stresses already have been described well in a previous report (Qin et al., 2011). G-box was considered as the cognate cis-element for the basic zipper (bZIP) (de Pater et al., 1993) or basic helix-loophelix (bHLH) transcription factors (TF; Kawagoe and Mura, 1996). In rice, G-box elements were reported to be significantly enriched in promoter regions of upregulated senescenceinducible genes in response to various hormonal stress (abscisic acid, brassinosteroid, JA and GA) where TFs (bHLH and bZIP) were shown to highly express, further suggesting that G-box elements were effectively involves in inducibility of senescence under in vivo and in vitro conditions (Liu et al., 2016a). In Arabidopsis, PSEUDO-RESPONSE REGULATORs (PRRs) act as transcriptional repressors and play important roles in regulating flowering time and abiotic stress responses. Here, G-box like motifs were observed to be overrepresented in PRRs and showed to be an important cis-regulatory element for mediating the transcriptional regulation of CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) by PRRs (Liu et al., 2016b). These observations provide an important foundation for further functional studies of BdC genes. Some studies also reported the expression of cystatins in roots, shoots (Christova et al., 2006), and stems (Valdes-Rodriguez et al., 2007), and the expression profiling of BdC genes in other tissues under various biotic and abiotic stress treatments awaits further research. The abiotic-stress-induced expression of BdC genes may contribute to the regulation of PCD triggered in Brachypodium in response to unfavorable growth conditions, as reported in other model plant systems (Solomon et al., 1999;Belenghi et al., 2003).

A Putative Cystatin Gene Pathway in Response to Abiotic Stress
Based on our results, and in combination with previous reports, we propose a putative metabolic pathway for cytastin genes in response to various abiotic stresses; this pathway mainly involves the aberrant protein degradative pathway and the ROS-triggered PCD signaling pathway (Figure 7). Abiotic stresses, such as salt, drought, abscisic acid, and H 2 O 2 , induce accumulation of ROS in plant cells, which affects the structural properties of proteins (Berlett and Stadtman, 1997). This may result in the generation of un-folded and mis-folded aberrant proteins that are degraded by cysteine proteases (Demirevska et al., 2010). To maintain optimum protein degradation, cytastins are synthesized to regulate cysteine protease activity in response to various abiotic stresses (Zhang et al., 2008). Meanwhile, the ROStriggered PCD signaling pathway is activated by the increasing accumulation of endogenous ROS (Desikan et al., 1998;Belenghi et al., 2003). The ROS-triggered PCD is regulated by cysteine proteases, which play an instrumental role in this physiological process. To prevent unwanted cell death, plants upregulate the expression of cytastin genes to inhibit the activity of cysteine proteases, indirectly controlling the PCD process (Zhang et al., 2008). Under various abiotic stresses, BdC genes, regulated by stress-related cis-acting elements present in the promoter region, participate in these signaling pathways by regulating the activities of cysteine proteases.

CONCLUSION
In the current study, we identified 25 BdC genes in the B. distachyon genome through in silico analysis. All BdC genes shared similar exon/intron organization with three conserved motifs, which is similar to those from other plant species. Phylogenetic analysis revealed that BdC genes were highly orthologous to those from barley, wheat, and maize. Variations in genomic organization, deletions in motifs, and mutations in critical active site amino acids suggest that these genes underwent a complex evolutionary process and structural and functional divergence. The differential expression patterns in developing caryopses and under various abiotic stress conditions revealed that BdC genes involved in the regulation of cysteine protease activity could mobilize storage reserves and play crucial roles in the response to multiple abiotic stresses through the degradation of aberrant proteins and the ROS-triggered PCD signaling pathway. These results provide a better understanding of the structure and function of the BdC gene family.

AUTHOR CONTRIBUTIONS
SS and DZ carried out the experiments and drafted the manuscript. XL and YH participated in the study and helped to draft the manuscript. YY conceived the study, planned experiments, and helped draft the manuscript. All authors have read and approved the final manuscript.

ACKNOWLEDGMENTS
This research was financially supported by grants from the Ministry of Science and Technology of China (2016YFD0100500, 2016ZX08009003-004). The English in this document has been checked by at least two professional editors, both native speakers of English. For a certificate, please see: http://www.textcheck. com/certificate/XOgwli.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017. 00743/full#supplementary-material  Pink, P/AW). Signal peptides are underlined. Few N-terminal (BdC3-1, BdC18-1) and C-terminal (BdC15, OC10, and HvCPI-3) amino acid residues are not shown here. Figure S3 | Hierarchical clustering of cystatins from Brachypodium along with related species. The conserved signature motifs are indicated by colored rectangles (Blue, LARFAV; Red, QXVXG; Pink, P/AW). The yellow shaded residues are claimed to be the putative N-terminal "G" residues. The predicted consensus sequences are represented by an orange color box. Figure S4 | Three dimensional structures of B. distachyon BdC proteins. The tertiary structure was predicted by the Phyre2 server and structure composition indicated similarity with the structure of barely (HvCPI-3), wheat (WC1), and sorghum (CC1) as indicated. The secondary structure is shown with α helixes in green, β sheets in yellow, and loops in blue.
Table S1 | List of cystatin amino acid sequences from various plant species used in this study.  Table S4 | Absolute quantification of the mRNA expression level of duplicated BdC genes. Quantification was according to the reference gene (Ubiquitin) in the quantitative real-time polymerase chain reaction (qRT-PCR) analysis. The values listed are indicated as cDNA copies/µg of reverse-transcribed total RNA, and the data are shown as the mean ± standard deviation.