Genome-Wide Dissection of the Heat Shock Transcription Factor Family Genes in Arachis

Heat shock transcription factors (Hsfs) are important transcription factors (TFs) that protect plants from damage caused by various stresses. The released whole-genome sequences of wild peanuts make genome-wide analysis of Hsfs in peanut possible. In this study, a total of 16 and 17 Hsf genes were identified from Arachis duranensis and A. ipaensis, respectively. We identified 16 orthologous Hsf gene pairs between the two peanut species; however, HsfX was identified only in A. ipaensis. Orthologous pairs between the two wild peanut species were highly syntenic. Based on their phylogenetic relationships, peanut Hsfs were divided into groups A, B, and C. Selection pressure analysis showed that group B Hsf genes mainly underwent positive selection, whereas group A Hsfs were affected by purifying selection. Small-scale segmental and tandem duplication may play important roles in the evolution of these genes. Cis-elements such as ABRE, DRE, and HSE were found in the promoters of most Arachis Hsf genes. Five AdHsfs and two AiHsfs contained fungal elicitor responsive elements, suggesting their involvement in the response to fungal infection. These genes were differentially expressed in cultivated peanut under abiotic stress and Aspergillus flavus infection. AhHsf2 and AhHsf14 were significantly up-regulated after inoculation with A. flavus, suggesting a possible role in fungal resistance.

Only the active Hsfs are capable of recognizing and binding to the promoters of target genes. The inactive monomer can be converted into an active oligomer under a variety of stress conditions (Hartl and Hayer-Hartl, 2002; Wang et al., 2012; Li et al., 2014). There are only a few Hsf genes in yeast and animals, whereas 20-50 Hsf genes have been found in plants (Scharf et al., 2012; Lin et al., 2014; Qiao et al., 2015). Hsf genes have been identified in many plants and are expressed in various tissues at different developmental stages and under different stress conditions (Giorno et al., 2012; Chung et al., 2013; Xue et al., 2014).
Peanut (Arachis hypogaea L.) is an important oil crop worldwide. In developing countries, peanuts are rain-fed, so it is important to study the drought stress tolerance of peanut (Ramu et al., 2015). Aspergillus flavus produces potent mycotoxins known as aflatoxins that can cause serious health problems. The role of Hsf genes in the peanut response to abiotic stresses and A. flavus infection is unknown.
Cultivated peanut is an allotetraploid (AABB, 2n = 4x = 40) that originated from a single hybridization and genome duplication event between two wild diploid peanuts (AA and BB genomes) (Kochert et al., 1996; Freitas et al., 2007; Moretzsohn et al., 2013; Wang et al., 2016). Recently, whole-genome sequencing of the two ancestral species (A. duranensis and A. ipaensis) has been completed (Bertioli et al., 2016; http://peanutbase.org/). Here, we identified and analyzed the Hsf genes of the two wild peanut species, A. duranensis (AA genome) and A. ipaensis (BB genome), at the genome-wide level. We analyzed the gene duplication events in the wild peanut species, the differences in selection pressure among groups A, B, and C of Arachis Hsfs, and the structures of these proteins. Our results provide basic information for further understanding the functional divergence and evolution of Arachis Hsfs. We also applied the knowledge gained from the wild species to cultivated peanut to understand the possible functions of Hsfs in the peanut response to abiotic and biotic stress.

Data Collection and Identification of Hsf Genes
The genome sequence data of the two wild peanut species (AA and BB genomes) were obtained from the peanut genome database (http://peanutbase.org/). The conserved domain of Hsfs is the Hsf-type DNA-binding domain (DBD). The HMM ID of this domain is PF00447 in the Pfam database (http://pfam.xfam.org/). The amino acid sequences of the HMM were used as queries to identify all possible Hsf protein sequences in the AA and BB genome databases using BLASTP (E < 0.001). SMART software (http://smart.embl-heidelberg.de/) was used to identify a complete DBD domain and HR-A/B domain in the putative peanut Hsfs. Candidate proteins without a complete DBD domain and HR-A/B domain were removed. NLS domains in peanut Hsfs were predicted using cNLS Mapper (http://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper_form.cgi). NES domains were predicted using the NetNES 1.1 server (http://www.cbs.dtu.dk/services/NetNES/). AHA domains were predicted based on the conserved-type AHA motif sequence FWxxF/L, F/I/L. Protein isoelectric point (pI) and molecular weight (Mw) were analyzed using ExPASy (http://web.expasy.org/compute_pi/).
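The AHA motif scan described above can be illustrated with a short sketch. It is not the authors' script: it assumes the motif "FWxxF/L, F/I/L" is read as F, W, any two residues, then F or L, then F, I, or L, and the function name and toy sequence are hypothetical.

```python
import re

# Assumed reading of the AHA motif "FWxxF/L, F/I/L":
# F, W, any two residues, then F or L, then F, I, or L.
AHA_PATTERN = re.compile(r"FW..[FL][FIL]")


def find_aha_motifs(protein_seq):
    """Return (1-based start position, matched residues) for each AHA-like motif.

    Matches are non-overlapping, as reported by re.finditer.
    """
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group()) for m in AHA_PATTERN.finditer(seq)]


# Hypothetical toy sequence, not a real peanut Hsf.
toy_seq = "MDEGSFWEELLNDS"
print(find_aha_motifs(toy_seq))
```

A pattern this short will also match by chance, so hits would normally be checked against their position in the C-terminal activation region.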

Analysis of Synteny
Intraspecies synteny analysis within the AA or BB genome and interspecies synteny analysis between the AA and BB genomes were based on comparison of 100 kb chromosome blocks containing Hsf genes, following previous reports (Sato et al., 2008; Zhang et al., 2011; Lin et al., 2014). Hsf genes were set as anchor points according to their chromosome locations. Homologous genes between blocks were identified by local all-vs-all BLASTN (E < 10^-20). In the intraspecies analysis, when four or more homologous genes were detected, the two blocks were considered to have originated from a large-scale duplication event (Zhang et al., 2011; Lin et al., 2014). In the interspecies analysis, when three or more conserved homologous genes were detected, the two blocks were considered syntenic blocks (Sato et al., 2008; Lin et al., 2014; Wang et al., 2016).
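The block-level decision rule above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the input format (gene in block 1, gene in block 2, E-value) and the one-to-one counting of genes are assumptions, and all names are hypothetical.

```python
E_VALUE_CUTOFF = 1e-20
# Minimum number of homologous genes required, as stated in the text.
MIN_HOMOLOGS = {"intraspecies": 4, "interspecies": 3}


def count_homologous_genes(hits):
    """Count homologous gene pairs between two blocks.

    hits: iterable of (gene_in_block_1, gene_in_block_2, e_value).
    Assumption: each gene is counted at most once, best E-value first.
    """
    used_1, used_2 = set(), set()
    count = 0
    for gene_1, gene_2, e_value in sorted(hits, key=lambda h: h[2]):
        if e_value < E_VALUE_CUTOFF and gene_1 not in used_1 and gene_2 not in used_2:
            used_1.add(gene_1)
            used_2.add(gene_2)
            count += 1
    return count


def blocks_are_related(hits, mode):
    """Apply the threshold for 'intraspecies' or 'interspecies' comparison."""
    return count_homologous_genes(hits) >= MIN_HOMOLOGS[mode]


# Hypothetical BLASTN hits between two 100 kb blocks.
example_hits = [
    ("a1", "b1", 1e-50),
    ("a2", "b2", 1e-30),
    ("a3", "b3", 1e-25),
    ("a4", "b4", 1e-5),  # fails the E-value cutoff
]
print(blocks_are_related(example_hits, "interspecies"))
print(blocks_are_related(example_hits, "intraspecies"))
```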

Gene Duplication Analysis
Two standards were used to identify duplicated genes. High-stringency standard: encoded protein pairs with ≥50% identity covering ≥90% of the protein length. Low-stringency standard: protein pairs with ≥30% identity covering ≥70% of the protein length (Rizzon et al., 2006). Tandem duplicated genes were marked according to a previously described method (Yuan et al., 2015). Chromosome segmental or large-scale duplicated genes were identified based on intraspecies synteny (Zhang et al., 2011; Lin et al., 2014; Qiao et al., 2015).
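The two stringency standards can be expressed as a small classifier. This is a minimal sketch: the function name and return labels are hypothetical, and identity and coverage are assumed to be percentages taken from a pairwise protein alignment.

```python
def classify_duplicate(identity_pct, coverage_pct):
    """Classify a protein pair by the two duplication standards in the text.

    Returns "high" for >=50% identity over >=90% of the protein length,
    "low" for >=30% identity over >=70% of the protein length,
    and None when neither standard is met.
    """
    if identity_pct >= 50 and coverage_pct >= 90:
        return "high"
    if identity_pct >= 30 and coverage_pct >= 70:
        return "low"
    return None


# Hypothetical alignment statistics for three protein pairs.
for identity, coverage in [(62.0, 95.0), (45.0, 95.0), (25.0, 99.0)]:
    print(identity, coverage, classify_duplicate(identity, coverage))
```

Checking the high-stringency standard first means every pair receives the strictest label it qualifies for.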

Protein Structure Analysis and Homology Modeling
SWISS-MODEL (http://www.swissmodel.expasy.org/interactive) was used to calculate secondary structure and build three-dimensional (3D) structures of proteins. The templates for building protein 3D models were selected from the PDB database based on the best identity. Protein 3D models were selected based on the best global model quality estimation (GMQE). Homology modeling templates included 5d5v.1 (monomer of DBD domain), 5d5v.1 (homo-dimer of DBD domain interacting with SatIII), 5d5u.1 (homo-dimer of DBD domain interacting with HSE), 4r0r.1.A (monomer of HR-A/B domain), and 4r0r.1 (homo-trimer of HR-A/B).

Plant Materials, Stress Treatments, and RNA Isolation
Cultivated peanut cv. Luhua-14 was used in this study. Eleven-day-old peanut seedlings were subjected to drought (removed from wet medium and kept in air on filter paper), cold (4 °C), and high temperature (42 °C) treatments. Leaf samples were collected at 0, 1, and 6 h after treatment and immediately frozen in liquid nitrogen. Leaf samples without treatment were used as controls. Peanut seeds inoculated with A. flavus for 3 days were collected, and seeds without A. flavus inoculation were used as controls, according to a previous report. RNA was isolated by the CTAB method as described previously (Wang et al., 2016). For reverse transcription, first-strand cDNA was synthesized with an oligo (dT) primer using a PrimeScript first-strand cDNA synthesis kit (TaKaRa). Three technical replicates were carried out in this study.

Gene Expression Analysis
Quantitative real-time PCR (qRT-PCR) was performed using the FastStart Universal SYBR Green Master (ROX) on an ABI 7500 system. The qRT-PCR program was as follows: 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 30 s. Relative gene expression levels were calculated using the CT method. The primers for qRT-PCR are provided in Table S2. A t-test was used to assess significance.
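As an illustration of how relative expression can be derived from CT values, the sketch below uses the widely used comparative CT calculation (2^-ΔΔCT). That this is the exact variant used here is an assumption, as is the use of a single reference gene; the function name and CT values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative CT calculation.

    Each delta CT normalizes the target gene to a reference gene
    (assumed here; the text does not name one). The fold change is
    2 ** -(delta CT treated - delta CT control).
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target crosses threshold 3 cycles earlier
# after treatment, while the reference gene is unchanged.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold_change)  # 8.0
```

The calculation assumes roughly 100% amplification efficiency for both target and reference primers.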

Identification of Hsf Genes in Wild Peanut Species
The amino acid sequences of Hsfs were extracted from the AA and BB wild peanut genome databases using the BLASTP program. The amino acid sequences of Hsf DBD domains (Pfam: PF00447) were used as queries. From the AA and BB genomes, we identified 16 and 17 Hsf genes, respectively. The polypeptide lengths of Hsfs varied from 209 to 656 aa in A. duranensis and from 282 to 514 aa in A. ipaensis. The A. thaliana Hsf family is often employed as a reference to classify Hsf families in other plant species (Scharf et al., 2012; Li et al., 2014; Wang et al., 2014; Qiao et al., 2015). We used Hsfs from A. thaliana and other species to construct a phylogenetic tree together with the Hsfs of the two wild peanut species. In this study, 21 A. thaliana Hsfs, 11 M. truncatula Hsfs, 10 L. japonicus Hsfs, 16 C. cajan Hsfs, 11 C. arietinum Hsfs, and 40 G. max Hsfs were used for phylogenetic tree construction (Figure 1). These Hsfs were divided into groups A, B, and C, which was consistent with previous studies (Scharf et al., 2012; Li et al., 2014; Lin et al., 2014; Wang et al., 2014; Qiao et al., 2015). Group A was divided into 10 clusters, group B was divided into five clusters, and group C contained only one cluster. Clusters in group A were named A1-A5, A6a, A6b, and A7-A9. Clusters in group B were named B1-B5. The B5 cluster was not present in Arabidopsis; however, it was identified in many leguminous species, including the wild peanut species. In the wild peanut species, the A3, A6a, A7, B3, and B4 clusters were absent (Figure 1). Orthologs of all 16 AA genome Hsfs were found in the BB genome with >90% identity (Table S3).
Interspecies synteny analysis showed that a high level of synteny was maintained between the AA and BB genomes (Figure 2). This synteny analysis supported the identification of orthologous pairs of Hsfs between the AA and BB genomes. AA genome Hsfs were named AdHsf1-16 according to their order of chromosome location. BB genome Hsfs were named AiHsf1-16 according to their orthologous genes in the AA genome, plus AiHsfX. The orthologous gene of AiHsfX (Araip. A5C77) was not found in the AA genome. The gene IDs and physical locations of the wild peanut Hsf genes are shown in Table 1 and Figure 3.

Duplication of Hsf Genes in Peanut
Duplicated gene pairs were found in both the AA and BB genomes. High-stringency duplicated gene pairs were AdHsf5-AdHsf14 and AdHsf6-AdHsf16 in the AA genome and AiHsf5-AiHsf14, AiHsf6-AiHsf16, and AiHsf7-AiHsf8 in the BB genome. Low-stringency duplicated gene pairs were AdHsf7-AdHsf8 in the AA genome and AiHsf15-AiHsfX in the BB genome. Intraspecies synteny analysis showed that the duplicated gene-pair blocks were not collinear. No chromosome segmental or large-scale duplicated gene pairs were identified. AiHsf7-AiHsf8 and AdHsf7-AdHsf8 were identified as tandem duplicated gene pairs.

Features of Hsfs in Wild Peanut Species
Most members of the Hsf gene families in both the AA and BB genomes contained one intron and two exons. However, in the AA genome, AdHsf7 contained three exons and AdHsf14 contained four exons; in the BB genome, AiHsf15 contained three exons, and AiHsf14 and AiHsfX contained four exons. AdHsf14 contained four exons, while its duplicated gene AdHsf5 contained only two exons. Intronless Hsfs were also found in both the AA and BB genomes (Figure S1).
The HR-A/B domain is critical for the interaction of one Hsf with other Hsfs to form a trimer through a helical coiled-coil structure (Scharf et al., 2012; Jaeger et al., 2016; Neudegger et al., 2016). Similar to other plant Hsfs, group A Hsfs in peanut have an insertion between the HR-A and HR-B regions. However, this insertion was not found in the group B Hsfs. In Arachis, the sequence of the group B Hsf HR-A/B domain was not conserved compared with that of group A (Figure 4). The DBD domains were conserved in the two wild peanut species. The most conserved motif of the DBD domains in peanut was "FSSFI/VRQLNT/I" (Figure S2).

The 3D Structure of Hsfs in Wild Peanut Species
The predicted 3D structures of the DBD domain of all AA and BB Hsfs were similar to that of the human Hsf DBD (Figure 5A). The predicted 3D structures of the HR-A/B domain of AA and BB Hsfs were also similar to those of human Hsfs (Figure 5C). The 3D structures of the DBD domain of peanut orthologs were highly conserved.
When adjacent DBD molecules bind to an HSE element, the two DBD molecules form a symmetrical protein-protein interaction involving helix α2. In chordate Hsfs, the closest intermolecular contact occurs between the Gly50 residues located at the N-terminal end of the α2 helices. Gly50 is conserved and is flanked by Gln49 and Gln51 in chordate Hsfs (Neudegger et al., 2016). In peanut, we predicted the closest intermolecular contact residues by homology comparison and 3D model comparison. The results showed that the closest contact residues were not conserved between chordate Hsfs and peanut Hsfs. For example, in AdHsf1 and AdHsf5, the predicted closest intermolecular contact occurred between the His143 residues (Figure 5A). We also built models of the DBD domains of AA and BB wild peanut Hsfs bound to the SatIII element. The results showed that the predicted dimer structures of the DBD-DBD interaction for binding to the SatIII element and to the HSE element were distinct (Figure 5B).

Selective Pressure Analysis of Hsfs in Wild Peanut Species
Site models were used to test whether different groups of Hsfs were under different selective pressure in peanuts. Group C contained only one Hsf gene and therefore could not be analyzed. Model M0 showed that both AdHsfs and AiHsfs in group A underwent strong purifying selection (ω = 0.31723 in the AA genome and ω = 0.40488 in the BB genome; Table S4). Interestingly, in group B, both AdHsfs and AiHsfs underwent positive selection (ω = 1.69713 in the AA genome and ω = 1.95226 in the BB genome). The M0 vs. M3, M1a vs. M2a, and M7 vs. M8 comparisons detected 399 positive selection sites in group B AdHsfs (P < 0.05) and 382 positive selection sites in group B AiHsfs (P < 0.001; Table S4). The identification of these positive selection sites in group B Hsfs indicated extensive functional diversity and structural variation (Wang et al., 2016).
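Nested site-model comparisons of this kind are usually evaluated with a likelihood ratio test (LRT). The sketch below shows that calculation under stated assumptions: the log-likelihood values are hypothetical (the text reports ω values, not log-likelihoods), the degrees of freedom follow common practice (2 for M1a vs. M2a and M7 vs. M8, 4 for M0 vs. M3 with three site classes), and the function names are illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function for even degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which avoids any dependency outside the standard library.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def lrt_p_value(lnl_null, lnl_alt, df):
    """P-value of the LRT statistic 2 * (lnL_alt - lnL_null)."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return chi2_sf_even_df(max(statistic, 0.0), df)


# Hypothetical log-likelihoods for a null (e.g., M7) and alternative (e.g., M8) model.
p = lrt_p_value(lnl_null=-1000.0, lnl_alt=-997.0, df=2)
print(round(p, 4))  # 0.0498
```

A significant LRT supports the model that allows a class of sites with ω > 1; the individual sites are then identified in a separate step.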

Expression of Hsfs in Various Tissues in Cultivated Peanut
We used the Hsfs of the wild peanut species as queries to identify Hsfs in cultivated peanut from transcriptome and genomic sequences (unpublished data). In total, 17 Hsfs were identified in cultivated peanut and named AhHsf1-AhHsf16 and AhHsfX. The sequences of these genes were similar to those of their orthologous genes in the wild peanut species (Table S6). To predict the possible functions of these genes in cultivated peanut, their expression was investigated by qRT-PCR. The results showed that AhHsf1, 3, 7, 8, 11, 12, 14, 15, 16, and X were expressed predominantly in seeds, while the expression of AhHsf9 and 10 was not detected in seeds. AhHsf2, 4, 5, 6, 9, and 10 were highly expressed in flower. The expression of AhHsf1, 7, 12, 15, and 16 was higher in flower than in root, shoot, or leaf. The expression of AhHsf11, 13, and X was higher in leaf than in root, shoot, or flower. The expression of AhHsf4 and AhHsf6 was higher in root than in shoot or seed. The expression level of AhHsf9 was higher in shoot than in leaf or seed (Figure 6).

Hsf Expression in Response to Various Stresses in Cultivated Peanut
The expression of AhHsfs was analyzed under high temperature, drought, and low temperature by qRT-PCR. The expression levels of most Hsfs (AhHsf1, 3, 4, 5, 6, 7, 9, 10, 11, 13, 14, 15, and X) were up-regulated under high temperature. The expression of AhHsf1, 3, 9, 15, and X was up-regulated up to ∼9-fold after 6 h of treatment at 42 °C. AhHsf4, 5, 6, 10, 11, and 13 responded rapidly to high temperature and were up-regulated after 1 h of treatment. The expression of AhHsf4, 5, 6, 10, and 11 increased continuously during 1-6 h of 42 °C treatment. The expression of AhHsf13 decreased at 6 h after 42 °C treatment (Figure 7). The expression of most AhHsfs was up-regulated under drought stress. The expression levels of AhHsf2, 4, 5, 7, 12, 14, 15, and 16 increased after 1 h of drought treatment. The expression of AhHsf2, 5, 12, 14, 15, and 16 increased continuously during the first 6 h of drought treatment. The expression of AhHsf1, 3, 9, 10, and 11 was up-regulated after 6 h of drought stress (∼15-fold). AhHsfX did not respond much to drought stress (Figure 8). The expression of most AhHsfs was up-regulated after 1 h of 4 °C treatment and then down-regulated at 6 h after treatment. The expression of AhHsf12 was continuously up-regulated during 6 h of cold treatment. The expression of AhHsf14 decreased at 1 h and then increased at 6 h after 4 °C treatment (Figure S3).
A previous study showed that Hsfs may be involved in disease resistance (Pick et al., 2012). In this study, we analyzed the expression of AhHsfs in peanut seeds after A. flavus infection. The expression of most AhHsfs was down-regulated in seeds after A. flavus inoculation, while the expression of AhHsf2 and 14 was up-regulated (∼1.5-fold; Figure 9).

Legumes Contained Different Hsf Clusters
The B5 cluster was not present among Arabidopsis Hsfs, whereas it was identified in most leguminous species, such as C. cajan, L. japonicus, wild peanuts, and G. max. The B5 Hsf cluster was not detected in Medicago truncatula. The phylogenetic tree showed that the leguminous plants contained different Hsf group members. In both the AA and BB wild peanut species, A3, A6a, A7, B3, and B4 cluster members were not found. Only soybean and M. truncatula contained B3 members. The A6a and A7 Hsf clusters were not found in legumes. The A3 cluster was not found in wild peanuts or M. truncatula. Group C Hsfs were not found in L. japonicus or M. truncatula. Soybean contained most clusters, but not A6a and A7. The number of Hsfs in the wild peanut species was relatively small compared with cotton, soybean, and Rosaceae species (Li et al., 2014; Wang et al., 2014; Qiao et al., 2015). The phylogenetic tree showed that A. duranensis is more closely related to A. ipaensis than to the other legumes.

WGD May Not Be the Major Driving Force of Large-Scale Hsf Expansion in Arachis
Our results showed that Hsf gene duplication occurred in both the AA and BB peanut genomes. The majority of Hsf duplication events were similar between the AA and BB genomes. For example, the duplicated gene pair AdHsf7-AdHsf8 was located on chromosome 5, and the distance between AdHsf7 and AdHsf8 was about 1 kb. The duplicated gene pair AiHsf7-AiHsf8 was also located on chromosome 5, and the distance between these two genes was about 2 kb. However, AiHsfX was located on chromosome 6 of the BB genome, and its duplicated gene AiHsf15 was on chromosome 9 of the BB genome. It is possible that AdHsf15 did not undergo duplication, or that the AA genome ortholog of AiHsfX was lost during evolution (Figure 3). We found only one tandem duplicated gene pair in each of A. duranensis and A. ipaensis. Both the AA and BB genomes, or their common ancestor, underwent the early papilionoid whole-genome duplication (WGD) about 58 million years ago (Ks = 0.65) (Bertioli et al., 2016). Intraspecies synteny analysis showed that Hsf duplication in the wild peanut species did not originate from a large-scale duplication event, because no intraspecies syntenic blocks containing Hsfs were found. In contrast, a recent WGD could be a driving force for the expansion of the Hsf gene family in Chinese white pear and apple (Qiao et al., 2015). This may be the reason why peanut has fewer Hsfs than cotton, soybean, and Rosaceae species (Li et al., 2014; Wang et al., 2014; Qiao et al., 2015).

Group B Hsfs Differ from Group A Hsfs
Group B Hsfs underwent positive selection (Table S4). Positive selection could contribute to adaptive evolution, functional diversity, and neofunctionalization (Beisswanger and Stephan, 2008). A study on barley showed that many gene families involved in adaptation to the environment were under positive selection. Positive selection may lead to the expansion of these gene families (Zeng et al., 2015). In contrast, group A Hsfs underwent purifying selection. Purifying selection may generate genes with conserved functions or lead to pseudogenization (Zhang, 2003). These results indicated that the function of Arachis group A Hsfs may be more conserved and the function of group B Hsfs may be more diverged. The sequences of the group B HR-A/B domain were not conserved compared with the group A HR-A/B domain, which was in agreement with the differential selection they experienced (Figure 4). The 3D structure of peanut group B Hsfs was different from that of group A and C Hsfs. The 3D structure of the group A and C HR-A/B domain was a continuous helix, while the group B HR-A/B 3D structure contained helices linked by a linear part (Figure 5).

The Possible Roles of Arachis Hsfs in Abiotic and Biotic Stresses
Hsfs play a central role in protecting plants from high temperature and other stresses (Nishizawa-Yokoi et al., 2009; Scharf et al., 2012). Many Hsfs could regulate a set of heat-shock protein genes to enhance thermo-tolerance in plants. Some Hsfs could be regulated by DREB genes as part of the drought stress response.
In our study, the majority of Hsf promoters contained HSE elements. Therefore, Hsfs could play important roles in gene regulation in response to different stresses in peanuts. Some Arachis Hsf promoters contained salicylic acid responsive, MeJA-responsive, or fungal elicitor responsive elements, suggesting roles in the response to pathogen infection. In cultivated peanut, the expression level of AhHsf13 was approximately 500-fold that of the control after 1 h of heat treatment and then decreased after 6 h of treatment. Expression levels of AhHsf1, 3, 9, and AhHsfX were up-regulated by about 10-fold after 6 h of heat treatment compared with the control. The expression of these Hsfs remained at a high level under continuous heat stress (Figure 7). Group A1a Hsfs were master regulators of acquired thermo-tolerance in tomato and Arabidopsis (Scharf et al., 2012). However, we found that the expression of AhHsf2 (group A1) did not respond to heat or cold, but did respond to drought stress. In cultivated peanut, expression levels of AhHsf1, 2, 3, 9, 10, 11, and 15 were about 10-fold those of the control after 6 h of drought stress (Figure 8). The expression of some Hsfs was altered after Podosphaera aphanis inoculation in woodland strawberry (Hu et al., 2015). Aspergillus flavus produces a potent mycotoxin known as aflatoxin, which is a key food safety issue in peanut. We examined whether peanut Hsf genes were involved in the response to A. flavus infection. The results showed that the expression of AhHsf2 and AhHsf14 was significantly up-regulated after A. flavus inoculation. The expression of some AhHsfs was down-regulated by A. flavus infection.
FIGURE 6 | Relative expression levels of Hsfs in different tissues in cultivated peanut. T-test was used to perform analysis of significance. * represents a significant difference (P < 0.05) compared with control (0 h).
FIGURE 7 | Relative expression levels of Hsfs under heat stress in cultivated peanut. T-test was used to perform analysis of significance. * represents a significant difference (P < 0.05) compared with control (0 h).

Hsf Genes Were Highly Expressed in Peanut Seed
Some Hsfs play key roles in plant seed development. In sunflower and Arabidopsis, HsfA9 was expressed specifically in seeds and the expression of Hsps changed during seed development (Almoguera et al., 2002; Kotak et al., 2007b). In rice, HsfA7 was expressed specifically in seed under normal conditions (Chauhan et al., 2011). In peanut, the expression levels of more than half of the AhHsfs were higher in seeds than in other tissues. These expression patterns may suggest roles in peanut seed development.
FIGURE 9 | Relative expression variation of Hsfs in seeds inoculated with Aspergillus flavus in cultivated peanut. T-test was used to perform analysis of significance. * represents a significant difference (P < 0.05) compared with control (0 h).

CONCLUSIONS
Genome-wide identification and comparison of peanut Hsfs with those of other plant species revealed that peanut contained a small number of Hsfs. The phylogenetic tree showed that B5 cluster Hsfs might be present only in legumes. Small-scale segmental and tandem duplication, but not WGD, played important roles in Hsf expansion in Arachis. The sequences of the group B Hsf HR-A/B domain were not conserved compared with the group A HR-A/B domain, which was in agreement with the different selection pressure they experienced. We built the 3D structures of peanut Hsfs with newly submitted templates and found differences between group A and B members. Peanut Hsfs may play important roles in abiotic and biotic stress tolerance, based on their expression responses to these stresses.

AUTHOR CONTRIBUTIONS
XW designed the study, wrote the manuscript and finalized the figures and tables. PW and LH carried out the majority of experiments, data analysis, and wrote the method section of the manuscript. HS, CL, PL, AL, and HG performed experiments.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017.00106/full#supplementary-material
Figure S3 | Relative expression levels of Hsfs under cold stress in cultivated peanut. T-test was used to perform analysis of significance. "*" represents a significant difference (P < 0.05) compared with control (0 h).