Transcriptomic Identification of Drought-Related Genes and SSR Markers in Sudan Grass Based on RNA-Seq

Sudan grass (Sorghum sudanense) is an annual warm-season gramineous forage grass that is widely used as pasture, hay, and silage. However, drought stress severely impacts its yield, and there is limited information about the mechanisms of drought tolerance in Sudan grass. In this study, we used next-generation sequencing to identify differentially expressed genes (DEGs) in the Sudan grass variety Wulate No.1, and we developed simple sequence repeat (SSR) markers associated with drought stress. From 852,543,826 raw reads, nearly 816,854,366 clean reads were identified and used for analysis. A total of 80,686 unigenes were obtained via de novo assembly of the clean reads including 45,065 unigenes (55.9%) that were identified as coding sequences (CDSs). According to Gene Ontology analysis, 31,444 unigenes were annotated, 11,778 unigenes were identified to 25 categories in the clusters of orthologous groups of proteins (KOG) classification, and 11,223 unigenes were assigned to 280 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Additionally, there were 2,329 DEGs under a short-term of 25% polyethylene glycol (PEG) treatment, while 5,101 DEGs were identified under the long-term of 25% PEG treatment. DEGs were enriched in pathways of carbon fixation in photosynthetic organisms and plant hormone signal transduction which played a leading role in short-term of drought stress. However, DEGs were mainly enriched in pathway of plant hormone signal transduction that played an important role under long-term of drought stress. To increase accuracy, we excluded all the DEGs of all controls, specifically, five DEGs that were associated with high PEG concentrations were found through RNA-Seq. All five genes were up-regulated under drought stress, but the functions of the genes remain unclear. In addition, we identified 17,548 SSRs obtained from 80,686 unigenes. The newly identified drought tolerance DEGs will contribute to transgenic breeding efforts, while SSRs developed from high-throughput transcriptome data will facilitate marker-assisted selection for all traits in Sudan grass.


INTRODUCTION
Global climate change has increased the incidence of drought worldwide across successive years (Shanker et al., 2014), and it has had many negative effects on plant growth such as leaf rolling, growth inhibition, and death (Mutisya et al., 2010), thus severely affecting agriculture. Sudan grass (Sorghum sudanense) is an annual warm-season gramineous forage grass (Smith et al., 1982;Zhan et al., 2008). Because of its favorable yield, superior value, fast regrowth, superior disease, and pest resistance, and tolerance to abiotic stress, Sudan grass is widely grown in pastures and used as hay and silage throughout the world (Rangaswami Ayyangar and Ponnaiya, 1939;Summer et al., 1965;Zamfir et al., 2001;Al-Suhaibani, 2006). It is distributed throughout China as a typical grass for livestock, aquaculture foods, and protecting fishing ponds (Wei et al., 2008;Chen et al., 2009), especially in arid and semiarid regions (Awad et al., 2013). However, exposure to drought stress for long time periods can affect forage yield and quality. Bibi et al. (2010) showed that drought stress could decrease Sudan grass yields. Therefore, it is important to study drought tolerance mechanisms in Sudan grass.
Drought stress causes osmotic and oxidative stress, which decreases membrane stability, leading to cell death . In order to cope with drought stress, plants have developed comprehensive mechanisms such as metabolic alteration, signal transduction, and differential gene expression (Shanker et al., 2014). Although various breeding approaches have been used to alleviate damage caused by drought stress in plants, genetic engineering has been more effective than other approaches. However, genetic engineering requires identifying genes with expression patterns regulated by drought stress.
The next-generation sequencing (NGS) technique known as RNA sequencing (RNA-Seq) is a cost-efficient tool for sequencing the full transcriptomes of both model and nonmodel species. RNA-Seq has been successfully used in many plants such as Brachypodium sylvaticum (Fox et al., 2013), Sorghum sudanense (Li et al., 2016), sugarcane (Cardoso-Silva et al., 2014), pepper (Ashrafi et al., 2012), orchardgrass (Huang et al., 2015), Hemarthria , and annual ryegrass (Pan et al., 2016). Transcriptome data has been used in biological studies worldwide in order to better understand biological processes (Surget-Groba and Montoya-Burgos, 2010), and it has especially been applied to studying the responses of gene expression to various stresses (Kreps et al., 2002). Shinozaki and Yamaguchi-Shinozaki (2007) reported that plants transformed with drought-inducible genes exhibited improved stress tolerance. Similarly, Ashraf (2010) noted that some genes are overexpressed, thereby inducing damage caused by drought stress, which are thus well utilized to improve the tolerance of plants to drought stress. However, effectively no published reports have used RNA-Seq to analyze the regulation of gene expression by the drought stress in Sudan grass.
In this study, we used RNA-Seq, a powerful NGS-based technique, to study transcription profiles of Sudan grass. The main goals of this study were (1) to develop SSR markers associated with drought-tolerance genes in the Sudan grass variety Wulate No. 1 and (2) to identify differently expressed genes (DEGs) under drought stress. This study provides more information about the molecular mechanisms of drought stress in Sudan grass, thereby contributing to future transgenic breeding efforts in addition to providing markers for MAS.

Plant Material and RNA Isolation
Seeds of the Sudan grass variety Wulate No.1 (Barenbrug Co., Chengdu, China) were sown in sand-culture pots that were placed in controlled growth champers set to 25 • C for 12-h days and 22 • C for 12-h nights. After seeds had germinated in water, 1/2 strength Hoagland's solution was used to cultivate seedlings. After 20 days, seedlings were subjected to polyethylene glycol (PEG) stress as a means of inducing drought stress. The plants were divided into two treatments: (1) plants in three pots (three replicates) were subjected to 25% PEG dissolved in 1/2 strength Hoagland's solution (drought stress); (2) the other three pots were subjected to just 1/2 strength Hoagland's solution (control). The leaves were harvested at 0, 3, and 6 days and stored in a −80 • C freezer prior to RNA extraction. L_0_1, L_3_1, and L_6_1 were non-stressed controls that were collected at 0, 3, and 6 day, respectively. L_3_2 and L_6_2 were drought-stressed treatments that were collected at 3 and 6 day, respectively. Therefore, 5 treatments were set and each treatment had 3 replicates. A total of 15 samples were collected for RNA sequencing.
Total RNA was extract from leaves using the RNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA) according to the manufacturer's instructions and then loaded onto and electrophoresed through a 1% agarose gel to check the degradation and contamination of the RNA (Figure 1). The quantity, concentration, and quality of the total RNA were examined using an RNA Nano 6000 kit for the Agilent 2100 Bioanalyzer 2100 System (Agilent Technologies, CA, USA). The integrity of the RNAs is shown in Figure 2.

Preparation and Sequencing of the cDNA Library
We used 3 µg of RNA per sample as genetic material for preparation, and mRNAs were purified from total RNA using poly-T oligo-attached magnetic beads. These fragments were used as templates in a reaction carried out by divalent cations FIGURE 1 | RNA validation of Sudan grass samples by 1% agarose gel electrophoresis. M represents the marker, lanes 1-3 correspond to the three replicates of Wulate No.1 at 0 day in 0% PEG, lanes 4-6 correspond to the three replicates of Wulate No.1 at 3 day in 0% PEG, lanes 7-9 correspond to the three replicates of Wulate No.1 at 3 day in 25% PEG, lanes 10-12 correspond to the three replicates of Wulate No.1 at 6 day in 0% PEG, and lanes 13-15 correspond to the three replicates of Wulate No.1 at 6 day in 25% PEG.
in a NEBNext First Strand Synthesis Reaction Buffer (5×; New England BioLabs, Ipswitch, MA, USA). The first cDNA strand was synthesized using random hexamer primers and transcribed with RNase H − . The second strand cDNA was synthesized using DNA polymerase I and RNase H. NEBNext Adaptor (New England BioLabs) were prepared for hybridization after blunt 3 ′ ends were cut via exonuclease/polymerase activities. Then, we purified cDNA with the AMPure XP system (Beckman Coulter, Brea, CA USA) to select 150-200 bp fragments. The products were generated using 3 µL of USER Enzyme (New England BioLabs) size-selected for adaptor-ligated cDNA at 37 • C for 15 min followed by 5 min at 95 • C and amplified via PCR using Universal PCR primers and index(X) primers. The PCR was performed using Phusion High-Fidelity DNA polymerase (New England BioLabs). Before PCR product sequencing, samples were index-coded for clustering under the TruSeq PE Cluster Kit v3-cBot-HS (Illumia, San Diego, CA, USA). Finally, the amplified products were purified using the AMPure XP system, and library quality was assessed on the Agilent Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA) to create the final cDNA libraries. Sequencing libraries were generated using NEBNext R Ultra TM RNA Library Prep Kit for Illumina R (New England BioLabs) following the manufacturer's recommendations. In total, there were 15 cDNA libraries were constructed of Sudan grass.

Data Analysis
The sequencing-derived image data were transformed into raw reads. To ensure high quality transcriptomic data, the raw reads were cleaned using in-house Perl scripts that remove adapter sequences, poly-N repeats, and low quality reads, whilst calculating the Q20, Q30, GC-content, and sequence duplication level of the clean data. The cleaned reads were then assembled by Trinity (Grabherr et al., 2011) with min_kmer_cov set to 2 as a default, and all other parameters used default settings.

Analysis and Mapping of DEG Reads and Differential Gene Expression
Before mapping clean reads onto the reference genome with RSEM (Li and Dewey, 2011), the assembled Trinity transcript data was regarded as the reference sequence. The bowtie 2 mismatch parameter in RSEM was set to 0. The quantity of read counts for all genes was obtained from the mapping results and then normalized by FPKM (expected number of fragments per kilobase of transcript sequence per millions base pairs sequenced; Trapnell et al., 2010). Differential expression analysis of each sample was performed using the DESeq R package (1.10.1). DESeq provides statistical routines for detecting genes exhibiting differential expression with a negative binominal distribution model (Anders and Huber, 2010). To ensure the accuracy of the P-values, Benjamini and Hochberg's approach was used to control the false discovery rate (FDR), after which a P < 0.05 threshold was used to classify genes as differentially expressed. All DEGs were subjected to GO enrichment analysis. To adjust for gene length, a Wallenius non-central hypergeometric distribution (Young et al., 2010) was applied in the GO enrichment analysis using the GOseq R package (Young et al., 2010). Next, KOBAS 2.0 (Mao et al., 2005) was utilized to test the statistical enrichment of DEGs in KEGG pathways, a database resource for understanding high-level functions within biological systems (Kanehisa et al., 2008). In KOBAS 2.0, a FDR ≤0.05 threshold was used for remarkable enrichment pathways, FIGURE 2 | Sudan grass RNA sample quality. The labeling scheme for each of the Wulate No.1 RNA samples is as follows: L-0d-1 through L-0d-3, the three replicates at day 0 in 0% PEG; L-3d(ck)-1 through L-3d(ck)-3, the three replicates at day 3 in 0% PEG; L-3d(T)-1 through L-3d(T)-3, the three replicates at day 3 in 25% PEG; L-6d(ck)-1 through L-6d(ck)-3, the three replicates at day 6 in 0% PEG; and L-6d(T)-1 through L-6d(T)-3, the three replicates at day 6 in 25% PEG. and the FDR parameter was set as BH in KOBAS 2.0. Finally, blastx was used to search the genome of a related species in the string database with the DEG sequences in order to obtain the protein-protein interactions. The results were then visualized in Cytoscape (Shannon et al., 2003).

Survey of SSR Polymorphism
A total of 20 Sudan grass and sorghum accessions were used to identify SSR markers. We randomly chose 30 SSR primers to evaluate their polymorphism. DNA extractions were performed using the DNeasy Plant Mini Kit (Qiagen) from leaves of the 20 Sudan grass and sorghum accessions. PCR amplifications were conducted in a total reaction volume of 15 µL containing 1.5 µL of DNA, 0.5 µL of Golden DNA Polymerase (TIANGEN Biotech, Beijing, China), 7.5 µL of 2× Reaction Mix (TIANGEN), 0.6 µL of each primer, and 4.4 µL of ddH 2 O. The PCR reaction cycling profile was 94 • C for 4 min followed by 35 cycles at 94 • C for 40 s, 55-60 • C for 40 s, 72 • C for 1 min, and a final extension at 72 • C for 10 min. PCR products were separated via gel electrophoresis on a 6% denaturing polyacrylamide gels at 350 V for 1 h alongside a 50-bp DNA marker used to assess product lengths. Then, we used 0.1% AgNO 3 to stain the gel. Well amplified loci were identified using Gel Doc TM XR (Bio-Rad Laboratories Inc., Hercules, CA, USA).

Sequence Analysis and Transcript Assembly
Replicate cDNA libraries from Sudan grass leaf samples of plants grown under drought and control treatments (Supplementary Image 1) were constructed and sequenced on the Illumina HiSeq 2000 platform. A total of 852,543,826 raw reads were generated, and they have been deposited in the NCBI (National Center for Biotechnology Information) Short Read Archive (SRA: SRP095822). A total of 816,854,366 clean reads were identified after trimming adapters and filtering out low quality reads (Table 1). Using Trinity to further assemble the sequences, we obtained 149,395 contigs with a mean length of 1,386 bp, N50 of 2,366, and lengths of 201-24,422 bp, amounting to a total of 207,118,332 bp (Figure 3). The contigs were further assembled into 80,686 unigenes with a mean length of 938 bp and N50 of 1847 across a total of 75,697,479 bp (Figure 3). There are few Sudan grass ESTs in NCBI databases; however, Li et al. (2010) and Lu et al. (2009) used ESTs from Sorghum bicolor to explore SSRs in Sudan grass. Thus, the present results will serve as valuable genomic resources that will help identify more valuable SSRs. When the sequences were compared to all databases, only 5,892 unigenes (7.3%) were annotated in all, while all 52,315 unigenes (64.83%) were annotated in at least one database according to Blastx search ( Table 2).

Gene Function Annotation
To analyze the conservation of sequences, we compared Sudan grass sequences to those from other species. The top match was Sorghum bicolor (65.7% sequence identity), followed by Zea mays (14.9%), Setaria italica (4.6%), and Oryza sativa (2.9%; Figure 4). As expected, more than 88% of sequences among all unigenes had a top match with sequences from plants in the family Poaceae.
Among 80,686 unigenes, 45,065 (55.9%) were predicted to match CDSs with lengths ranging from 33 to 14,379 bp with an average length of 665 bp ( Figure 5). ESTScan indicated that 34,128 of unigenes (42.3%) were predicted to match CDSs ranging in length from 51 to 10,476 bp with an average length of 336 bp (Figure 5). GO is an international standardized gene functional classification system with three ontologies: biological process (BP), molecular function (MF), and cellular component (CC; Wang et al., 2010). Each ontology offers a comprehensive description of the properties of genes. A total of 31,444 unigenes were annotated according to an analysis performed with Blast 2 GO. BP was the most abundantly represented according to GO, followed by CC and MF (Figure 6). Within BP, cellular process (18,850 unigenes, 21.0%) and metabolic process (18,499 unigenes, 20.6%) comprised the majority. Within MF, binding (19,070 unigenes, 46.5%) and catalytic activity (15,383 unigenes, 37.5%) contained the largest numbers of genes, while within CC, cell (13,495 unigenes, 21.4%) was almost as common as cell part (13,492, 21.4%), in agreement with Li et al. (2016).
To better understand the biological functions of and interactions among genes, 11,223 unigenes were classified into 280 KEGG pathways. The pathways included Metabolism, Genetic Information Processing, Environmental Information Processing, Cellular Processes, Organismal Systems, and

DEGs under Drought Stress
There were 4,903 and 3,466 DEGs identified by comparing expression on the 3rd day to the control expression levels of days 0 and 3, respectively (i.e., L_0_1 vs. L_3_2 and L_3_1 vs. L_3_2 comparisons). Comparisons between Sudan grass exposed to the PEG treatment for 6 days with the day 0 and 6 controls, revealed 9,512 and 6,194 DEGs, respectively (i.e., L_0_1 vs. L_6_2 and L_6_1 vs. L_6_2 comparisons), which indicated that expressions changes that had occurred by the 3rd day were smaller than those occurring by the 6th day. In contrast, the comparison between expression levels on the 3rd and 6th day of the PEG treatment (i.e., L_3_2 vs. L_6_2), revealed only 130 DEGs, with 5 and 125 genes down-and up-regulated, respectively. Few of the same DEGs changed in the 3rd and 6th days, but there were changes among all control samples over time (i.e., L_0_1 vs. L_3_1, L_0_1 vs. L_6_1, and L_3_1 vs. L_6_1; Table 3).
To improve the accuracy of the results, we excluded DEGs that changed in the controls, finding 2,329 DEGs between the controls (0 and 3rd day) and the treatment of 3rd (i.e., L_0_1 vs. L_3_2 vs. L_3_1 vs. L_3_2), which including 1,531 and 785 DEGs down-and up-regulated, respectively. For the 6th days of the PEG treatment compared to the controls (0 and 6th day) (i.e., L_0_1 vs. L_6_2 vs. L_6_1 vs. L_6_2), there were 3,031 and 2,084 DEGs downand up-regulated, respectively, in total of 5,101 DEGs been founded ( Table 4). The DEGs enriched within KEGG pathways indicated Carbon fixation in photosynthetic organisms and Plant hormone signal transduction pathways play important roles in handling short-term drought stress. The next most enriched pathways were Phenylpropanoid biosynthesis, Glycolysis/Gluconeogenesis, and Photosynthesis (Figure 9). Lenka et al. (2011) also demonstrated that the Carbon fixation pathways play an important role in the drought responses of rice. However, under long term drought stress, the Plant hormone signal transduction pathways was most important, followed by Starch and sucrose metabolism and Phenylpropanoid biosynthesis (Figure 9). Huang et al. (2014) declared that the up-regulated genes stimulated by drought stress were  enriched in the signal transduction pathway, agreeing with our results.
To eliminate expression changes among the all controls (0, 3rd, and 6th day), Venn diagrams were examined (Figure 10),  which revealed no genes were down-regulated by PEG stress, while five genes were up-regulated. The corresponding genes ID were c18079_g1, c21893_g1, c33865_g2, c34306_g1, and c35500_g3.
To better understand the functions of these genes, we BLASTed these sequences against NR, NT, PFAM, KOG, Swiss-Prot, and GO databases. The c18079_g1 sequence is 612 bp (Appendix S1), and its NR and NT descriptions indicate it is a hypothetical protein that has been annotated in Sorghum bicolor. According to Swiss-Prot, it is similar to Arabidopsis thaliana's defensin-like protein. According to PFAM, it appears to belong to the Gamma-thionin family or Scorpion toxin-like domain. Lastly, GO analysis indicated it has a role in transport or defense response in BP, plasma membrane or ion channel inhibition or activity in MF, and plant-type in CC ( Table 5). The first response to the drought stress is reducing water deficit by closure of stomata, which is also called hydropassive closure. When the stomata are open, water evaporates from guard cells. However, stomata are opened by the reversal of ion fluxes (Jouyban, 2012); c18079_g1 may be responsible for mediating this response.
The second gene, c21893_g1, is 1,583 bp (Appendix S1), and it also strongly resembles the Sorghum bicolor hypothetical proteins in the NR and NT databases. However, we were unable to find any description of it in the PFAM, KOG, and Swiss-Prot databases, but in the GO database, the functions of the gene were protein folding and unfolded protein binding (BP) or heat shock protein binding (MF; Table 5). Buchanan et al. (2005) indicated that heat shock protein 17.2, heat shock protein 16.9, and HSP 70 were up-regulated by increased PEG stress in sorghum. Several studies reported that drought stress is often combined with heat shock (Wang and Huang, 2004;Mienie and De Ronde, 2008;Grigorova et al., 2011), perhaps explaining the up-regulation of this gene by drought stress in Sudan grass.
The third gene, c33865_g2, is the longest of all five (3,697 bp, Appendix S1). It is similar to the previous genes, which are similar to hypothetical proteins in Sorghum bicolor, but in contrast, it matches genes in all of the examined databases. It matches an uncharacterized AAA domain-containing protein from Schizosaccharomyces pombe in the Swiss-Prot database, and there were many BP, MF, and CC roles in the GO database ( Table 6). The KOG database indicated the gene is an AAA +type ATPase ( Table 5). As an ABC transporter-like protein in sorghum, the gene has been shown to be up-regulated under drought stress (Buchanan et al., 2005). Deeba et al. (2012) also showed that activation of the ATP generation in groundnuts can improve drought tolerance. However, the actually function of this gene is unclear.
The fourth gene, c34306_g, is 2,181 bp in length (Appendix S1). In the NR and NT databases, it matches the phosphosulfolactate synthase-related protein in Zea mays. The PFAM database indicated the gene is a (2R)phospho-3-sulfolactate synthase (ComA) or Small cytokine (intecrine/chemokine), interleukin-8 like gene, whereas GO indicates it has a role in signal transduction, chemotaxis, heat acclimation, coenzyme M biosynthetic process, or immune response (BP); chemokine activity and catalytic activity (MF), and extracellular region (CC; Table 5). Graham et al. (2002) introduced the details of the biosynthetic pathway for coenzyme M. Phosphosulfolactate synthase and ComA catalyze the first step of the coenzyme M biosynthetic process (Liu et al., 2006), and c34306_g may be involved in this process.
The last gene, c35500_g3, is 1,110 bp (Appendix S1), and it is a hypothetical protein in the NR and NT databases. It matched a cucumber L-ascorbate oxidase in Swiss-Prot and a multicopper oxidase or copper ion binding protein in PFAM. In GO, c35500_g3 is characterized as having a role in the oxidation-reduction process (BP), oxidoreductase activity (MF), and extracellular region or plant-type cell wall (CC). Drought stress induces the accumulation of reactive oxygen species (ROS), and therefore ROS-scavenging is stimulated by drought. Osakabe et al. (2014) indicated that ascorbate oxidase plays an important role in scavenging cytosolic H 2 O 2 ( Table 5). The function of ascorbate oxidase was to protect integrated chloroplast proteins, and massive reports have shown that the overexpression of APX2 improved drought tolerance. Increased levels of thylakoid-bound ascorbate peroxidase have been identified in sorghum under drought stress (Buchanan et al., 2005). Dawson et al. (1975) reported that every molecule of ascorbate oxidase contains 8-10 atoms of copper, and this has been subsequently studied as an enzyme model for L-ascorbate oxidase (Ueda and Hanaki, 1984). Thus, this gene maybe involved in the pathway for ascorbate oxidase.
Finally, to ensure the accuracy of the Venn diagrams, we used more comparisons to analyze DEGs (5 combinations). These alternative results coincide with the main results. We again found the same five genes up-regulated and no genes down-regulated (Figure 11).

Development and Validation of SSR Markers
We identified 17,548 SSRs obtained from 80,686 sequences totaling 75,697,479 bp. A total of 13,574 sequences contained SSRs. Of these, 2,965 sequences contained more than a single SSR, and 866 exhibited compound SSR formation. Trinucleotide repeat motifs were the most abundant among the six types  Table 6).
The repeat counts ranged from 5 to 23. SSRs with 5-8 repeats were being most abundant, followed by those with 9-12 repeats (Figure 12). In sugarcane, trinucleotide motifs are perhaps most common among SSRs, which is consist with our results. Cardoso-Silva et al. (2014) reported that trinucleotide SSR motifs strongly impact the rate of frameshift mutations.
We randomly selected 30 primer pairs for amplification. Of these, 23 primer pairs (76.7%) successfully amplified products from genomic DNA from Sudan grass and Sorghum (though the remaining 7 primer pairs failed to generate PCR products). The failure of the primer pairs to amplify may have been caused by the location of primers across splice sites, introns, mutations, or indels. In total, 124 amplified bands were detected from the PCR products of 23 primer pairs. The products of the primer pairs ranged from 134 to 280 bp. Among the 23 primers, the unigene c24737_g1 amplified by primer pair 23 showed the highest average polymorphism information content (PIC = 0.8304), followed by c25684_g1, which was amplified by primer pair 1 (PIC = 0.7608). However, according to Shannon's information index (I), c25493_g1 (amplified by primer pair 9) was the most diverse, (I = 0.6851), followed by c24737_g1 (amplified by primer pair 23). According to Nei's gene diversity (H), c10226_g1 (amplified by primer pair 7) exhibited the most variation (H = 0.5281), followed by c33403_g4 (amplified by primer pair 14; H = 0.5169). Among these 23 unigenes, the average PIC was 0.5327, I was 0.552, H was 0.439, and PPB was 78.18%. Therefore, these 23 unigenes showed high levels of diversity ( Table 7).
SSR makers are extremely useful molecular makers that can be used for genetic linkage mapping, comparative mapping, and many other genotyping applications (Tang et al., 2009). These makers have many advantages including reliable reproducibility, co-dominance, and a high degree of operational transferability to other related species Kaur et al., 2011). SSRs have been widely used in sorghum (Sanchez et al., 2002;Mace and Jordan, 2011;Upadhyaya et al., 2012;Hussein et al., 2014), but Li et al. (2016) reported important differences between sorghum and Sudan grass in indel markers. Few studies have tried to use ESTs from sorghum to design SSR markers in Sudan grass. Further genetic analyses of Sudan grass require the development of more SSR makers for the species. Our study has identified 17,548 SSRs that may be used to enhance the molecular breeding of Sudan grass.

CONCLUSION
In this study, we used NGS data to analyze the Sudan grass transcriptome under drought stress. We found 2,329 DEGs and 5,101 DEGs under a short-and long-term of 25% PEG treatment, respectively. And we also found five genes related to the drought stress strictly. Notably, those genes were up-regulated in Sudan grass under drought stress. This discovery may facilitate the development of transgenic breeding for drought resistance. We also developed a substantial number of new genetic markers, including SSRs, which can be broadly applied to understanding the genetics of Sudan grass.

AUTHOR CONTRIBUTIONS
XZ, YZ, and XiaW conceived the project and designed the experiments; YZ, XiaW, and LH performed the experiments; YZ, XiaW, CL, WX, JP, ZL, and HY analyzed the data; YZ, XiaW, FL, XieW, LY, and DP finalized the manuscript; all authors discussed the results and reviewed the manuscript.