Genome-Wide Mining of Wheat DUF966 Gene Family Provides New Insights Into Salt Stress Responses

Domain of unknown function (DUF) proteins constitute a great deal of families of functionally uncharacterized proteins in eukaryotes. The DUF966 gene family is found in monocotyledons, dicotyledons, mosses, and other species. However, little is known about the functions of DUF966 genes in wheat (Triticum aestivum L.). In this study, we identified and characterized the TaDUF966 gene family members in wheat by in silico analysis. A total of 28 TaDUF966 proteins were identified in wheat. Phylogenetic analysis divided these proteins into two groups (Groups I and II). Proteins in each group showed a highly conserved DUF966 domain and conserved motif distribution, implying their functional conservation. Analysis of gene expression profiling data showed that some TaDUF966 genes were induced by salt stress. We further confirmed the role of TaDUF966-9B in salt stress using virus induced gene silencing (VIGS) assay. Compared with the empty vector control, the TaDUF966-9B knockdown plants exhibited severe leaf curling at 10 days post-inoculation with BSMV under salt stress, suggesting that TaDUF966 genes play a vital role in salt stress tolerance in wheat. Taken together, these results expand our knowledge of the evolution of the DUF966 gene family in wheat and promote the potential application of these genes in wheat genetic improvement.


INTRODUCTION
Wheat (Triticum aestivum L.) is one of the most important grain crops in the world and the main source of food for the majority of the human population . Abiotic stresses, such as drought, salinity, and extreme temperatures, negatively impact the growth, development, quality, and yield of wheat and other crops (Zhu et al., 2019a). Domain of unknown function (DUF) is a general term used for many domains in the Pfam database that have not been confirmed. These domains have two distinct characteristics: a relatively conserved amino acid sequence and the function of this domain is unknown (Bateman et al., 2010). DUF proteins are reportedly involved in abiotic stress tolerance in plants. For example, the wheat TaWTF1 gene, which encodes the Domain of Unknown Function 860 (DUF860) protein, contains stress-and hormone-responsive elements in its promoter region and is induced by heat stress at the seedling and flowering stages (Qin et al., 2013). In rice (Oryza sativa L.), the expression of OsDUF936.3, OsDUF936.5, and OsDUF936.6 is significantly increased under salt stress conditions, and overexpression of OsDUF936.6 in Escherichia coli significantly improves tolerance to salt stress (Li et al., 2017). Additionally, OsSIDP366, a DUF1644 gene, is a positive regulator of drought and salt stress tolerance in rice (Guo et al., 2016). In Ammopiptanthus mongolicus, the AmDUF1517 gene is potentially involved in the regulation of cold tolerance (Gu and Cheng, 2014).
Increasing studies show that genes containing DUF domains are involved in plant growth and developmental processes. Rice REL2 (ROLLED and ERECT LEAF 2 (OsREL2)) is the first DUF domain-containing protein shown to be involved in the regulation of leaf morphology (Yang et al., 2016). Additionally, rice BEAK-SHAPED GRAIN 1 (BSG1), a DUF640 gene, determines grain shape and size, probably by controlling cell division and expansion in the grain hull (Yan et al., 2013). In Arabidopsis thaliana, silencing of the At2g23470 gene, which encodes a member of the DUF647 protein family, results in severe infertility (Li and Zhao, 2012). Li et al. (2016) determined the role of DUF1313 gene family members in photoperiod sensitivity in maize (Zea mays L.). In the woody perennial plant Populus deltoides, PdDUF231A, a DUF231 family protein, is involved in xylan acetylation and cellulose biosynthesis (Yang et al., 2017). In addition to their role in plant growth, development, and abiotic stress response, DUF proteins are also involved in the biotic stress response. For example, silencing of the OsDUF500 gene in rice increases resistance to bacterial blight strain PXO99 . The plant DUF538 protein is predicted as a partial structural homolog of the bactericidal/permeability-increasing (BPI) protein of the mammalian innate immune system, which provides the first line of defense against different pathogens including bacteria, fungi, viruses, and parasites. Experimental evidence shows that exogenous application of the purified fused product of Celosia DUF538 affects bacterial growth, possibly by binding to the bacterial membrane, similar to the BPI protein (Gholizadeh and Kohnehrouz, 2013).
The DUF966 protein superfamily is found in monocotyledons, dicotyledons, mosses, and other species (Luo and Tian, 2017). Proteins in this superfamily contain one or two highly conserved DUF966 domains. In tomato (Solanum lycopersicum), JWS19 (a DUF966 gene) was transiently and rapidly induced by salt stress and was isolated from salt-treated roots (Tirajo, 2005). Although the molecular function of DUF966 genes is not fully understood, the available evidence strongly suggests that DUF966 family genes are involved in abiotic stress tolerance. For example, in rice, overexpression of the new stress suppressor gene, DUF966-stress repressive gene 2 (OsDSR2), dramatically increased the sensitivity to salt and drought stresses and reduced sensitivity to abscisic acid (ABA) (Luo et al., 2014). In Arabidopsis, the AtST39 gene (At3g46110) plays a positive regulatory role in salt and drought stress resistance pathways (Wang, 2014).
Although the DUF966 genes play vital roles in plants, to the best of our knowledge, their functions remain unexplored in wheat. In this study, we conducted a comprehensive analysis of TaDUF966 family genes in abiotic stress tolerance in wheat, based on their genomic sequence and transcriptome data. A total of 28 TaDUF966 family genes were identified, and their conserved motifs, genomic structure, evolutionary relationship, and functional classification were investigated. Moreover, expression analysis of TaDUF966 genes was conducted, and cis-acting elements in their promoter regions were examined. Our results provide valuable information for understanding the classification and putative functions of DUF966 genes.

Plant Material and Stress Treatment
Common wheat (Triticum aestivum L.) cultivar Zhengmai was used in this study. Plants were grown at 26°C under a 16 h day/ 8 h dark photoperiod. To conduct salt stress treatments, plants were treated with 150 mM sodium chloride (NaCl) at the stages of seedling with one and half leaves. Leaf and root samples were collected at 2, 4, 8, 12, 24, and 96 h after the salt treatment (Jiang et al., 2019). The experiment was performed with three independent biological replicates.

Phylogenetic Analysis, Chromosomal Locations, and Duplication Patterns
Phylogenetic analysis of TaDUF966 proteins was conducted using the LG model (Le and Gascuel, 2008) based maximum likelihood (ML) method in MEGA 7.0 (Kumar et al., 2016). The test guidance was repeated 1,000 times (Felsenstein, 1985) to calculate the support for each node. A midpoint-rooted base tree was drawn using the Interactive Tree of Life (IToL, version 3.2.317, http://itol.embl.de) (Zhu et al., 2019b). The GFF3 gene annotation file was obtained from the wheat database IWGSC v.1.1, and TaDUF966 gene annotations were extracted. Chromosomal start and stop sites and the chromosomal location of each TaDUF966 gene were used to draw the physical map with the MapInspect software. All possible homolog of the DUF966 genes in each subgenome of common wheat were determined using "all-against-all" BLAST, with an E-value cut-off of 1 × 10 −10 and identity > 75% (Letunic et al., 2004). The R package "circlize" was used to draw a graph showing its position and homology relationship (Fang et al., 2020). The non-synonymous (Ka) and synonymous (Ks) substitution rates were calculated using DnaSP 6.0 (Rozas et al., 2017) to diagnose the form of sequence evolution.
The MEME (v5.0.3) (Bailey and Elkan, 1994) suite analysis tool and MAST (http://meme-suite.org/tools) motif search tool were used to identify conserved motifs in TaDUF966 proteins. The known DUF966 protein sequences, including AtDUF966 and OsDSR, were used for training the parameter settings. Then, the trained parameters were used to identify conserved TaDUF966 motifs as follows: each sequence may contain any number of non-overlapping repeats of each motif; number of different motifs (1-10); motif width = 6-50 amino acids (aa). The functions of predicted motifs were analyzed using InterPro (http://www.ebi.ac.uk/interpro) and SMART motif search programs (http://coot.embl-heidelberg.de/SMART). In addition, multiple sequence alignment analysis of TaDUF966 proteins was conducted using ClustalX2 (Kim and Joo, 2010).

Prediction of Cis-Acting Elements
To analyze putative cis-elements in TaDUF966 gene promoters, the 1,500 bp sequence upstream of the start codon of each gene was searched in the Plant-CARE database (Bailey and Elkan, 1994). The identified motifs were analyzed using the Fisher test, and those with an adjusted p-value < 0.05 were considered significantly enriched.

Analysis of Wheat Transcriptome Under Salt Stress
Transcriptome responses of two wheat cultivars to salt stress RNAseq datasets generated previously under an abiotic stress treatments were downloaded from NCBI and mapped to the wheat reference genome using HISAT2. Then, genes were assembled using cufflinks to inspect the expression levels of TaDUF966 genes (normalized by TPM, transcripts per kilobase of exon model per million mapped reads). The R package "pheatmap" was used to draw the pheatmap of TaDUF966 genes (Zhu et al., 2019c).

RNA Extraction and Quantitative Real-Time PCR (qRT-PCR) Analysis
Total RNA was isolated from root samples from plants treated with or without NaCl using the TRIzol Reagent (Invitrogen, Carlsbad, CA, U.S.A). The isolated total RNA samples were quantified using the NanoDrop spectrophotometer (Thermo Fisher Scientific) and then treated with DNase I (Thermo Fisher Scientific) to remove any residual DNA contamination. The DNA-free RNA samples were converted to cDNA using a cDNA synthesis kit (Thermo Fisher Scientific). Gene-specific primers were designed using Primer Premier 5.0 (Table S1). Then, quantitative PCR (qPCR) was carried out using Maxima SYBR Green/ROX qPCR Master Mix Chen et al., 2020). The EF-1a gene was used as an internal reference. Three independent biological replicates, each containing three technical replicates, were conducted for each sample in both control and NaCl treatments. Gene expression was quantified using the 2 −△△Ct method (Yin J. L. et al., 2018).

Virus-Induced Gene Silencing (VIGS) Experiment in Wheat
Two cDNA fragments from different sites silenced TaDUF966-9B. Using BLAST analysis in NCBI, these fragments are specific (Zhu et al., 2018a). Primers for constructs in plant transformation were designed using Primer Premier 5.0 and are listed in Table S2. Barley stripe mosaic virus (BSMV) consists of a, b, and g three RNA chains. According to the manufacturer's instructions of RiboMAXTM, using the RiboMAXTM largescale RNA production system-T7 (Petty et al., 1990) and Ribom7G Cap analogues (Promega, Madison, Wisconsin, USA) in vitro linearized plasmids containing the tripartite BSMV genome were sealed at the transcripts. The second leaf of the two-leaf wheat seedling was mechanically wiped with BSMV transcript and incubated at 25°C. Use BSMV: TaPDS (TaPDS, wheat octahydrolysin desaturase) as a positive control.
Control plants were treated with 1× Fes buffer (0.1M glycine, 0.06M K2HPO4, 1% w/v tetrasodium pyrophosphate, 1% w/v bentonite, and 1% w/v celite, pH 8.5) devoid of BSMV transcripts (Zhu et al., 2018b). At 10 dpi, 150 mM NaCl was poured and samples were taken at 1 d and 6 d for RNA isolation, followed by qRT-PCR analysis and phenotypic analysis. qRT-PCR primers were designed using Primer Premier 5.0 and are listed in Table  S3. The experiment was repeated at least three times.

Identification and Classification of TaDUF966 Genes
A total of 28 TaDUF966 proteins were identified from wheat reference genome using seven rice and five Arabidopsis protein sequences as query (Table S4). These wheat proteins were named according to the following rules: (1) "TaDUF966" stands for the hexaploid wheat DUF966 gene family; (2) Arabic numerals represent the gene number; (3) "-A/B/D" represents the wheat subgenome to which the gene belongs [ Table 1; (Song et al., 2019)]. 1 | Information of the DUF966 gene family in common wheat as well as its wild relatives.

Homologous Gene Pairs and Synteny Analysis
The homologous pairs of DUF966 were identified in the common wheat ancestral species, Triticum urartu (AA), Triticum dicoccoides (AABB), and Aegilops tauschii (DD); 10, 15, and 9 homologous pairs of DUF966 were identified, respectively, named as TuDUF966, TdDUF966, and AeDUF966 based on the naming rules. The location and other details of these genes are summarized in Table 1. Comparisons between TuDUF966 vs. TdDUF966, TaDUF966 vs. TdDUF966, TaDUF966 vs. TuDUF966, TaDUF966 vs. AeDUF966, and TaDUF966 vs. TaDUF966 revealed 2, 21, 12, 6, and 21 pairs of putative paralogs, respectively ( Figure 1C). To better understand the evolutionary factors affecting the DUF966 gene family, we calculated the Ka/Ks ratios among homologous pairs of TaDUF966, TuDUF966, TdDUF966, and AeDUGF966 genes ( Table S6). Most of the DUF966 gene pairs showed Ka/Ks ratio < 1, indicating that these genes contained more synonymous than non-synonymous changes. Moreover, these genes were under negative selection pressure to protect the state of the ancestor. Three homologous pairs, including TaDUF966-11B/TuDUF966-9A, TaDUF966-11B/TuDUF966-8A, and TaDUF966-11B/TaDUF966-13B, showed Ka/Ks ratio > 1, indicating positive selection, accelerated evolution, and neofunctionalization. Analysis of duplication events of TaDUF966 genes revealed that gene duplication occurred mainly among Group II genes ( Figure 1B, Table S6). Of the 28 TaDUF966 genes, 17 (60.7%) genes showed segmental duplication events. Furthermore, TuDUF966 genes showed segmental and tandem duplications, while TdDUF966 genes showed segmental duplications (Table S6). These data suggest that segmental duplication events played a key role in the expansion of the DUF966 gene family in wheat.

Analysis of Gene Structure and Motif Composition in TaDUF966 Genes
To investigate the structure of TaDUF966 genes, we extracted their detailed information from the GFF3 file of the reference genome (Jiang et al., 2020b). The results showed that all TaDUF966 genes contain introns. Comparative exon-intron structure analysis revealed the presence of 2-6 exons in Group I members and 4-8 exons in Group II members ( Figure 2B).
Motif usually refers to the basic structure that constitutes any kind of characteristic sequence. It is a subset of the structural domain, and its function is to reflect the various biological functions of the structural domain. The prediction of protein motifs is a useful protein analysis method. Analysis of 28 TaDUF966 proteins using MEME revealed 10 motifs ( Figures  2C, D). Details of these 10 motifs are shown in Table S7. Each TaDUF966 protein contains 3-10 motifs, and most TaDUF966 proteins contain motifs 1, 2, and 3. In the phylogenetic tree, we found that members with relatively similar genetic relationships have more similar motifs, which indicates that DUF966 members gathered in the same subgroup may have more similar biological effects. In addition, all TaDUF966 proteins contain the DUF966 conserved Motif 1. Motif analysis using NCBI BLAST showed that seven of these 10 motifs (motifs 1-5, 9, and 10) belong to the DUF966 conserved domain, while the remaining three motifs were uncharacterized. Multiple sequence alignment analysis of TaDUF966 proteins revealed that the amino acid sequence of the DUF966 domain is highly conserved ( Figure S1), suggesting that this domain plays a key role in protein function.

Analysis of TaDUF966 Protein Features
Protein feature analysis showed that TaDUF966 proteins contain an average of 422 aa (191 aa-644 aa) (

Prediction of Cis-Acting Regulatory Elements in TaDUF966 Gene Promoters
Cis-acting elements play important roles in the initiation of gene transcription. The TaDUF966 gene promoters were predicted to possess cis-acting regulatory elements, which are involved in three different physiological and biochemical processes. (Figure 3, Tables S9, S10). Among the biotic/abiotic stress-responsive cis-elements, ARE (antioxidant response element) and G-box were highly enriched in TaDUF966 gene promoters. Secondly, the CAATbox and TATA-box (cis-elements related to growth and development) were also enriched in TaDUF966 gene promoters. Among the phytohormone-related cis-acting elements, the ABAresponsive element (ABRE) was the most frequently identified in TaDUF966 gene promoters. In addition, Cis-acting regulatory elements involved in the response to methyl jasmonate (MeJA),  CGTCA-box and TGACG-box, were also identified in most of the TaDUF966 gene promoters. In general, TaDUF966 gene promoter may respond to various endogenous and exogenous stimuli.

Transcriptome Expression Profiling of TaDUF966 Genes Under Salt Stress
Gene expression acts as a link between the transmission and realization of genetic information in organisms, and genomewide analysis of gene expression patterns helps to identify the gene expression pathway and its regulation. In this study, RNAseq data downloaded from NCBI were mapped to the wheat reference genome using HISAT2, and expression patterns of TaDUF966 genes under salt stress were analyzed. The pheatmap, based on standardized TPM values, showed that salt stress strongly induced TaDUF966-5, -9, and -14 (A, B, and D) but did not affect the expression of other genes ( Figure 4A, Table  S11) (Appels et al., 2018). The majority of TaDUF966 genes showed low expression, whereas 9 genes showed high expression in two different wheat varieties. The expression of TaDUF966 genes increased with the duration of salt treatment, reaching a peak at 6 h, followed by a gradual decline.

Expression Analysis of TaDUF966 Genes
To further reveal the potential functions of TaDUF966 genes in association with abiotic stress (salt stress), we analyzed the expression patterns of several TaDUF966 genes by qRT-PCR. According to transcriptome analysis, expression of TaDUF966 TaDUF966-1A  2  1  401  1,206  6  TaDUF966-2B  1  2  292  879  5  TaDUF966-3A  1  2  379  1,140  5  TaDUF966-3B  1  1  roots compared with control roots. The results of qRT-PCR analysis of TaDUF966-5D, TaDUF966-9B, and TaDUF966-14D were consistent with RNA-seq data. These three genes showed higher expression levels in the root and leaf tissues of wheat plants treated with NaCl compared with the control. Expression levels of these three genes increased in leaf tissues at 4 or 8 h after NaCl treatment but decreased in root tissues at 8 h ( Figure 4B). In particular, the expression level of TaDUF966-9B increased in root tissues at 2 h after NaCl treatment, decreased in root tissues at 8 h, and was induced by salt stress 24 h later. Therefore, TaDUF966-9B may be better induced by salt stress. In subsequent trials we focused on this gene.

Silencing of TaDUF966-9B Reduces Salt Tolerance in Wheat
To further confirm the function of TaDUF966-9B, we performed a virus-induced gene silencing (VIGS) assay. Silencing of the TaDUF966-9B gene was achieved using two constructs (BSMV: TaDUF966-9B-1 and BSMV : TaDUF966-9B-2), each carrying an approximately 250 bp fragment ( Figure S2) of the corresponding TaDUF966-9B gene, respectively, applied as independent treatments. All plants inoculated with BSMV showed mild chlorotic mosaic symptoms at 10 days postinoculation (dpi), while leaves inoculated with BSMV : TaPDS showed photobleaching ( Figure 5A), indicating that the BSMVinduced gene silencing system was functional. The expression of each homolog was also analyzed by qRT-PCR. At 10 days after friction-implanted virus, the leaf curling phenotype was significantly stronger in plants expressing either construct than in plants carrying the empty vector control ( Figures 5B, C), and the level of BSMV : TaDUF966-9B-1/2 mRNA in NaCl-treated plants was significantly lower than that in the empty vector control ( Figure 5D). These results indicate that the TaDUF966-9B gene is involved in salt stress tolerance in wheat. The qRT-PCR results. 10-day-old seedling leaves and roots were sampled after 0, 3, 6, 9, 12, and 24 h under stress conditions comprising 150 mM NaCl. t-tests were used to determine significant differences in expression patterns for CK and NaCl treatments (***p < 0.001, *P < 0.05).

DISCUSSION
Advances in high-throughput sequencing over the past two decades have increased the gap between genome sequence data and gene function information (Yin et al., 2020). Additionally, the number of DUF gene families has increased in the Pfam database (Bateman et al., 2010). Although the origin, diversity, and preliminary biological functions of these DUF gene families have been investigated in many studies, elucidating the biological functions of these genes, particularly in wheat, remains challenging. The current Pfam database 32.0 contains 17,939 gene families, of which 3,961 gene families (22%) represent DUFs (Chalupska et al., 2008). Bioinformatics analysis shows that the DUF966 gene family is widely distributed in rice, tomato, and Arabidopsis. In this study, a total of 28 TaDUF966 genes were identified in common wheat. These genes were distributed on 13 chromosomes, particularly on chromosomes 3, 4, and 7; there are six TaDUF966 genes, which are identified on A, B, and D subgenomes, respectively ( Figure 1B). The other 10 TaDUF966 gene family members did not appear simultaneously on three homeologous chromosomes, indicating that TaDUF966 genes might have been amplified by replication during evolution, although a small number of genes could have been selectively deleted during evolution. This is consistent with previous research (Luo and Tian, 2017). Moreover, TaDUF966 proteins contained different numbers of motifs (3 to 10), but only one or two highly conserved DUF966 domains ( Table 2), but functions of each domain remain to be investigated.
In addition to wheat, DUF966 genes have also been identified in other plant species such as Arabidopsis, rice, maize, soybean, sorghum, tomato, grape, poplar, pawpaw, and alfalfa (Luo et al., 2014). A previous study showed that the rice DUF966 gene, OsDSR2, and tomato DUF966 gene, JWS19, play a vital role in response to salt stress and simulated drought stress (Tirajo, 2005;Luo et al., 2014). Nevertheless, the function of these DUF966 proteins has not been characterized further. In this study, our qRT-PCR data showed that three TaDUF966 genes (TaDUF966-5D, TaDUF966-9B, and TaDUF966-14D) were up-regulated in the root and leaf tissues of wheat plants treated with NaCl, which is consistent with previous RNA-seq data ( Figure 5). These results, together with phylogenetic analysis, strongly suggest that the DUF966 family members in wheat are also involved in salt stress tolerance.
To further verify the role of TaDUF966 family genes in salt stress tolerance, we silenced one wheat gene (TaDUF966-9B) using VIGS and then treated the plants with NaCl. The expression of TaDUF966-9B was not induced upon NaCl treatment, indicating that TaDUF966-9B responds to salt stress in wheat. Although gene-specific primers were designed for TaDUF966-9B, and sequences of cDNA fragments cloned into VIGS vectors were confirmed by Sanger sequencing, the possibility of knocking down the expression of TaDUF966-9B homologs could not be completely ruled out. Therefore, further studies are needed to A B D C FIGURE 5 | Silencing of TaDUF966-9B decreases salt tolerance in wheat. (A) BSMV: TaPDS showed photobleaching at 10 dpi; Mock: wheat leaves treated with 1× Fes buffer. (B) At 10 dpi, 150 mM NaCl was poured and sampled, these leaves were sampled at 10 dpi for phenotypic analysis. Bars, 1 cm. (C) Statistics of leaf curl rate in TaDUF966-9B-knockdown plants treated with 150 mM NaCl at 1 and 6 dpi. Values represent mean + standard errors of three independent assays. Differences were assessed using Student's t-tests. (D) Silencing efficiency assessment of TaDUF966 in the 1 and 6 dpi of TaDUF966-9B-knockdown plants treated with 150 mM NaCl (***p < 0.001).
generate stable transgenic lines using the CRISPR/Cas technology to better understand the function of TaDUF966 genes. In conclusion, our results provide a comprehensive analysis of the biological function of DUF966 family genes in wheat.

DATA AVAILABILITY STATEMENT
All datasets presented in this study are included in the article/ supplementary material.

AUTHOR CONTRIBUTIONS
XGZ, DM, and YQ guided the design of the experiment. XYZ, WS, YH, WJ, JS, and JY directed the data analysis. XYZ conducted data analysis and manuscript writing. XGZ, DM and YQ supervised the experiment and confirmed the manuscript. XGZ is the guarantor of this work, so she can have full access to all the data in the research and is responsible for the integrity of the data and the accuracy of the data analysis. All authors contributed to the article and approved the submitted version. Thank all the above staff for the help in this study. The authors thank the reviewers for their valuable suggestions during the revision of the early manuscripts.

FUNDING
This research was funded by the National Key R&D Program of China, grant number 2018YFD0200506.