Molecular characterization revealed the role of thaumatin-like proteins in stress response in bread wheat

Thaumatin-like proteins (TLPs) are related to the pathogenesis-related-5 (PR-5) family and are involved in stress response. Herein, a total of 93 TLP genes were identified in the genome of Triticum aestivum. Further, for comparative characterization, we identified 26, 27, 39 and 37 TLP genes in the Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays genomes, respectively. They could be grouped into small and long TLPs with a conserved thaumatin signature motif. Tightly clustered genes exhibited conserved gene and protein structures. The physicochemical analyses suggested significant differences between small and long TLPs. Evolutionary analyses suggested the role of duplication events and purifying selection in the expansion of the TLP gene family. Expression analyses revealed the possible roles of TLPs in plant development and in abiotic and fungal stress response. Recombinant expression of TaTLP2-B in Saccharomyces cerevisiae provided significant tolerance against cold, heat, drought and salt stresses. The results highlight the importance of TLPs in cereal crops, which would be highly useful in future crop improvement programs.


Evolutionary analyses
To decipher the evolutionary relationship, a phylogenetic tree was built using the full-length TLP protein sequences of the five cereals and A. thaliana. These were clustered into 11 clades based on their phylogenetic relatedness, named groups I to XI (Fig. 3). The highest number of TLPs was found in group XI, followed by groups II and X, while group IV was the smallest with only five genes. All the sTLPs were tightly clustered into group XI, which could be due to their smaller size. Further, the majority of groups consisted of TLPs from all five cereals, except groups III, V and VI, which lacked members from one or more plant species.

Group IV comprised only three TaTLP and two AtTLP proteins. In addition, the homeologous TaTLPs of T. aestivum were tightly clustered in proximity. [...] TaTLP24-B, etc.) were segmental and one was a TDE (TaTLP16-A-TaTLP17-A). In Z. mays, 13 [...] ZmTLP7) were found. All the duplicated gene pairs of each cereal crop were tightly clustered in proximity in the phylogenetic tree.

Over the course of evolution, various evolutionary forces and natural pressures affected the duplicated genes [53]. To understand the evolutionary divergence between the paralogous gene pairs, Ka/Ks analysis was carried out. A Ka/Ks ratio greater than one suggests positive (non-purifying) selection, whereas a ratio less than one indicates negative (purifying) selection pressure. All the paralogous gene pairs showed Ka/Ks values less than one, which suggested negative or purifying selection on duplicated TLP genes (Table 1) [...] and ZmTLP7 was not performed due to their 100% similarity. Additionally, the divergence time of DEs was calculated using the Ks value and previously described methods [54,55].

The number of exons varied from one to three in O. sativa, and from one to four in B. distachyon, S. bicolor and T. aestivum. In Z. mays, most of the TLPs consisted of one to three exons, while ZmTLP16, ZmTLP20, ZmTLP25 and ZmTLP33 exhibited eleven, seven, eight and ten exons, respectively. Moreover, 44% (98/222) of the identified TLPs were intronless. Intriguingly, all the sTLPs were intronless, except TaTLP6-A and TaTLP20-A1. The intron phase analysis revealed that the maximum number of introns occurred in phase 1, followed by phase 2, while the fewest introns were in phase 0 (supplementary fig. S2).
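As an illustration of the Ka/Ks reasoning and the Ks-based divergence-time estimate described in this section, the following minimal sketch classifies a paralogous pair by its Ka/Ks ratio and converts Ks into an approximate divergence time using the standard relation T = Ks / (2r). The function names, the example Ka and Ks values, and the substitution rate r = 6.5e-9 synonymous substitutions per site per year (a value often used for grass genomes) are assumptions made for illustration. The text cites only "previously described methods [54,55]" and does not state which rate was used.

```python
# Illustrative sketch only: names, example values and the rate are assumptions,
# not taken from the paper.

# Assumed synonymous substitution rate (substitutions per site per year).
ASSUMED_RATE = 6.5e-9


def selection_type(ka: float, ks: float) -> str:
    """Classify selection pressure on a gene pair from its Ka/Ks ratio."""
    if ks <= 0:
        # Pairs with identical sequences (Ks = 0) cannot be analyzed,
        # mirroring the pairs skipped in the text due to 100% similarity.
        raise ValueError("Ka/Ks is undefined when Ks is zero")
    ratio = ka / ks
    if ratio < 1:
        return "purifying"   # negative selection
    if ratio > 1:
        return "positive"    # non-purifying selection
    return "neutral"


def divergence_time_mya(ks: float, rate: float = ASSUMED_RATE) -> float:
    """Estimate divergence time in million years ago as T = Ks / (2 * rate)."""
    return ks / (2 * rate) / 1e6


if __name__ == "__main__":
    # Hypothetical Ka and Ks values for a single duplicated gene pair.
    ka, ks = 0.05, 0.13
    print(selection_type(ka, ks))
    print(round(divergence_time_mya(ks), 2))
```

The factor of two in the denominator reflects that substitutions accumulate independently along both duplicated copies after the duplication event.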

The functional nature of a protein depends upon its domain composition. All of the identified cereal TLPs contained a thaumatin domain (PF00314), which confirmed that [...] ZmTLP16, ZmTLP20, ZmTLP25 and ZmTLP33 also contained a nuclear protein 96 (NUP96) domain of ~211 AAs at the C-terminus of these proteins (supplementary file S3).

Motif investigation revealed the occurrence of 10 highly conserved motifs in the TLP proteins. Motifs 1-9 were parts of the thaumatin domain, while motif 10 was unknown. Motifs 5, 6 and 8 were the most conserved motifs in TLP proteins, in which the thaumatin signature sequence was [...]

[...] TLPs were studied. The TLPs were analyzed for molecular weight (MW), peptide length, [...] ranged from ~29-35 kDa to ~17-19 kDa, respectively. However, the average pI ranged from 5.89 to 6.95 and from 4.90 to 6.82 for the long and small TLPs, respectively ( [...] The majority of TLP proteins lacked TM helices, which suggested their soluble/cytoplasmic [...] TaTLP5-D comprised two TM helices, which suggested their membrane-bound nature [...] proteins were predicted to be localized in the extracellular region, while three TLPs from O. [...] protein confirmed their extracellular localization (Fig. 4). Moreover, we could also see some [...] functions (such as anti-fungal etc.). The cytoplasmic localization could be due to their translation in the cytoplasm, or they might also function inside the cytoplasm.

Promoter elements are necessary for the regulation of gene expression under different conditions. Therefore, we performed the cis-regulatory analysis of TLPs to foresee their regulatory [...] box, GBOXLERBCS, BOXII, IBOXCORE were some common cis-regulatory elements.

Expression profiling of the TLP genes under different tissues and developmental stages
Expression analysis of genes is a significant way to understand their involvement in various [...] BdTLP12, and BdTLP20 and BdTLP23 were upregulated in endosperm and embryo, respectively. Group 2 genes were highly expressed in the leaf, pistil, embryo and endosperm tissues. However, group 3 genes were highly expressed at 10 days after pollination (DAP) of seeds; moreover, BdTLP8 and BdTLP18 were also upregulated in the pistil (Fig. 5A). In O. [...] they get normalized at later stages. The results suggested that the group 2 and group 3 genes are early and late responsive TaTLP genes, respectively.

To validate the RNA-seq expression data, qRT-PCR analysis of eight TaTLP genes was [...] confirmed that a few TaTLP genes are early while others are late responsive. In our study, since TaTLP2-B exhibited significant differential expression under abiotic stress [...] containing vector was used as control (Fig. 9A-F). In the spot assay, we observed similar growth [...]

[...] However, in the case of T. aestivum, paralogous TLP genes probably evolved earlier than [...] Since the TLPs are paraphyletic in origin, they are supposed to be derived from multiple [...]

TLPs have been reported to contain one to ten exons [6,7,9,53]. Similarly, our results showed one to four exons in most of the TLPs. However, most monocot TLPs are intronless, which was also found in our analysis [7]. Additionally, our results with respect to the protein lengths, molecular weights and isoelectric points of long and small TLP proteins were in accordance with previously studied TLPs [6,7,9]. In our analysis, the TLP2-B was found to be localized in [...] responses. Moreover, similar to our findings, differential expression and upregulation of TLP [...] provided increased tolerance against various abiotic stresses in tobacco, cotton, Arabidopsis etc.

[...] lines and blue lines denote the tandem and segmental duplication events, respectively. [...] upregulated and downregulated expression with red and green colours, respectively. [...] Significance between the control and treated conditions was tested using a two-tailed Student's t-test. The ns, *, ** and *** markings represent the significance at p- [...] conditions, respectively.

File S5. List of qRT-PCR primers.