Pumpkin (Cucurbita moschata) HSP20 Gene Family Identification and Expression Under Heat Stress

Pumpkin (Cucurbita moschata) is an important cucurbit vegetable crop that has strong resistance to abiotic stress. While heat shock protein 20 (HSP20) has been implicated in vegetable response to heat stress, little is known regarding activity of HSP20 family proteins in C. moschata. Here, we performed a comprehensive genome-wide analysis to identify and characterize the functional dynamics of the Cucurbita moschata HSP20 (CmoHSP20) gene family. A total of 33 HSP20 genes distributed across 13 chromosomes were identified from the pumpkin genome. Our phylogenetic analysis determined that the CmoHSP20 proteins fell into nine distinct subfamilies, a division supported by the conserved motif composition and gene structure analyses. Segmental duplication events were shown to play a key role in expansion of the CmoHSP20 gene family. Synteny analysis revealed that 19 and 18 CmoHSP20 genes were collinear with those in the cucumber and melon genomes, respectively. Furthermore, the expression levels of pumpkin HSP20 genes were differentially induced by heat stress. The transcript level of CmoHSP20-16, 24 and 25 were down-regulated by heat stress, while CmoHSP20-7, 13, 18, 22, 26 and 32 were up-regulated by heat stress, which could be used as heat tolerance candidate genes. Overall, these findings contribute to our understanding of vegetable HSP20 family genes and provide valuable information that can be used to breed heat stress resistance in cucurbit vegetable crops.


INTRODUCTION
Plants are typically afflicted by a wide variety of abiotic and biotic stresses during growth, including as extreme heat, cold, drought, elevated salinity levels, and pest and pathogen infestations. Over evolutionary time, plants have evolved a set of unique defenses including different morphological, molecular and physiological mechanisms or adaptations to avoid exposure to adverse effects (Wang et al., 2004). Heat shock protein (HSP) are often associated with plant responses to temperature stress, heavy metals and reactive oxygen species (ROS) (Sun et al., 2002), as well as infection by pathogens (Maimbo et al., 2007;Sarkar et al., 2009;Kandoth et al., 2011).
Recently, extreme global temperatures have become increasingly frequent due to climate change, increasing the frequency and severity of heat damage to crop production. Extreme temperatures can reduce seed vigor, inhibit germination, and restrict plant growth (Yamori et al., 2014). It is important, therefore, to understand the plant heat tolerance. HSP produced under high temperature stress is highly conservative proteins in organisms (Gupta et al., 2010). The abundance of heat shock proteins is typically low under normal conditions, but under heat stress, rapidly grows to account for up to 15% of the total protein content in organisms (Swindell et al., 2007).
Plant HSPs respond to the external environment changes to confer plant thermotolerance by acting as molecular chaperones, maintaining homeostasis of protein folding, and preventing or repairing the misfolding and degradation of proteins (Charng et al., 2006). Plant HSPs can be grouped into five categories according to their molecular weight and sequence homology, including HSP100, HSP90, HSP70, HSP60 and HSP20 (Waters, 2013). HSP20s-also called as small heat shock proteins (sHSPs)-are the most prevalent and abundant proteins in plant HSPs and have a molecular weight ranging from 15 to 42 kDa (Ji et al., 2019). Many HSP20s can form oligomers with high molecular weight and are involved in maintaining the stability of proteins, thus playing a vital role in the formation of plant acquired thermotolerance (Maimbo et al., 2007;Waters, 2013;Haslbeck and Vierling, 2015). For instance, a majority of HSP20 genes are induced by heat stress in pepper and apple species (Guo et al., 2015;Yao et al., 2020). Additional studies have verified the heat tolerance of HSPs in transgenic plants. The Populus trichocarpa HSP20 gene PtHSP17.8 is involved in heat tolerance . The rice HSP20 protein sHsp17.7 confers both heat tolerance and UV-B resistance of transgenic rice plants (Murakami et al., 2004). The transgenic tobacco overexpressing Zea mays HSP20 gene ZmHSP16.9 showed increased heat tolerance (Sun et al., 2012). Overall, these studies suggest that HSP20 genes are key to mediating heat tolerance in plants.
HSP20 bind target proteins through conformational changes to prevent misfolding and irreversible protein aggregation (Hilton et al., 2013). The most prominent feature of HSP20 protein is a highly conserved alpha-crystallin domain (ACD) containing approximately 90 amino acid residues (Waters et al., 1996). This domain is flanked by a variable N-terminal region and a C-terminal extension (Hilton et al., 2013;Waters, 2013). These three regions possess different functions: the ACD participates in substrate interactions, the N-terminus is involved in substrate binding, and the C-terminal extension is responsible for homooligomerization (Kirschner et al., 2000;Giese and Vierling, 2004;Basha et al., 2006;Jaya et al., 2009). The ACD domain contained 4 anti-parallel sheets and 3 β-strands, i.e., conserved region I (CRI, in the N-terminus and conserved region II (CRII, β7-β8-β9) in the C-terminus, respectively, separated by a hydrophobic region loop (β6-loop) (Bondino et al., 2012;Haslbeck and Vierling, 2015).
Pumpkin (Cucurbita moschata) is a globally important cucurbit vegetable crop. The top producer of pumpkin is China with 7.7 million tons, followed by India (5 million tons) and Russia (1.2 million tons) (Worldmapper, 2021). Pumpkin exhibits strong resistance to abiotic stress and is often used as rootstock to improve stress tolerance of other cucurbits (Cao et al., 2017). However, the molecular regulatory mechanism underlying pumpkin responses to abiotic stresses are not yet fully understood. In recent years, increasingly frequent temperature extremes have seriously limited the quality and yield of cucurbits. Therefore, it is very important to study the heat resistance mechanism of pumpkin and select strong heat resistant pumpkin rootstock varieties to improve the heat resistance of cucurbits. Study of HSP20s is important for understanding the mechanism of heat tolerance in pumpkin.
In this study, we used bioinformatics methods to identify HSP20 genes based on the full genomic sequence of Cucurbita moschata  and investigated their physicochemical properties, phylogenetic relationships, conserved domains, gene structures, cis-elements and expression patterns in response to heat stress. This study provides foundational information for new research into the functions of CmoHSP20 gene family and future screening of candidate heat tolerance genes for the improvement of cucurbit vegetable crops.

Sequence Analysis, Structural Characterization and Chromosomal Localization
The coding sequences, genomic sequences, and amino acid sequences of the CmoHSP20 genes were obtained from the CuGenDB. The theoretical isoelectric point (pI) and molecular weight (MW) of each HSP20 protein was estimated using the pI/ MW tool from ExPASy (v.3.0; http://web.expasy.org/protparam/ ), and the number of transmembrane regions per protein was calculated using TMHMM software (v.2.0; http://www.cbs.dtu. dk/services/TMHMM/). The structures of the HSP20 genes were visualized using the Gene Structure Display Server (GSDS 2.0; http://gsds.cbi.pku.edu.cn). The Multiple Em for Motif Elicitation (MEME) program (MEME Suite 5.3.3; http://meme-suite.org/ tools/meme) was used to identify conserved motifs of HSP20 proteins, specifically recording the number of repetitions (any, maximum number of motifs-10, and the optimum motif widths set from 6 to 200 amino acid residues). The results were visualized using TBtools (Chen et al., 2020). The chromosomal position of each CmoHSP20 gene was acquired from the C. moschata genome browser at the CuGenDB and mapped using MapChart software (Voorrips, 2002).

Phylogenetic Analysis and Classification of Cucurbita moschata HSP20 Genes
The full amino acid sequences of 97 total HSP20 members from C. moschata (N 33), Arabidopsis (N 31) and Oryza sativa (N 33) were aligned using CLUSTALW (Thompson et al., 1994) program (Supplementary Table S1). MEGA-X (Kumar et al., 2018) was used to construct a phylogenetic tree using the Maximum Likelihood (ML) method with bootstrap test of 1,000 times. The HSP20 proteins were classified into different groups according to the classifications and in silico subcellular localization of HSP20 proteins in O. sativa and Arabidopsis (Waters et al., 1996). The phylogenetic tree was visualized and enhanced using the EvolView online tool (Evolview v3; https:// evolgenius.info/evolview-v2).

Gene Duplication, Collinearity and Ka/Ks of HSP20 Analysis
To investigate the degree of synteny, the homologous regions of the CmoHSP20 genes were first identified using Multiple Collinearity Scan toolkit (MCScanX) software (Wang et al., 2012). CmoHSP20 gene duplication events were determined based on whether the length of the shorter gene covered was equal or greater than 70% of the longer gene and if the similarity of the two aligned genes was equal or greater than 70% (Gu et al., 2002;Yang et al., 2008). Tandem and segmental duplications are reported to be the two main mechanisms underlying gene family expansion (Cannon et al., 2004). Genes located on the same chromosome fragment of less than 100 kb and separated by five or fewer genes were considered to be tandem duplicated genes (Wang et al., 2010). Genes found to be coparalogs located on duplicated chromosomal blocks were considered to be segmental duplicated genes (Wei et al., 2007). Tandem and segmental duplication events were visualized using Circos software (v0.69; Krzywinski et al., 2009). Ka/Ks values can be used to predict selection pressure for replicating genes. A Bio-linux system was used to calculate the nonsynonymous (Ka) and synonymous (Ks) nucleotide substitution parameters. If the ratio of Ka/Ks was greater than, equal to, or less than one, this indicated positive, neutral, and purifying selection, respectively (Hurst et al., 2002). Syntenic maps were generated using Circos software to display synteny relationships in the orthologous HSP20 genes obtained from pumpkin and the other focal species.

Protein 3D Structure and Promoter Analysis of CmoHSP20 Genes
The tertiary structures of CmoHSP20 proteins were predicted using the online prediction tool SWISS-MODEL (https:// swissmodel.expasy.org/) according to the default parameters. The cis-acting elements in the 1,500 bp upstream sequences of coding region of C. moschata HSP20 genes were retrieved from the C. moschata genome, and the types, numbers and functions of these elements were analyzed using PlantCARE software (http:// bioinformatics.psb.ugent.be/webtools/plantcare/html/).

Plant Materials and Heat Stress Treatment
Pumpkin cultivar Miben 2 (Cucurbita moschata cv. Miben 2) seeds were sprouting in an incubator under 37°C for 2 days under dark condition and then grown in a growth chamber under standard greenhouse conditions (light/dark cycle: 12 h/12 h at 25°C, 70% relative humidity). When seedlings sprouted three true leaves, uniform seedlings were transferred to a high temperature growth chamber (42°C). Root and leaf samples were collected after 0, 3, 6, 12, and 24 h under normal condition (25°C) and heat stress treatment (42°C) and frozen with liquid nitrogen for total RNA extraction and expression analysis. Each treatment was conducted with three independent replicates, and samples from five plants were collected for each replicate.

Total RNA Extraction and Expression Analyses of CmoHSP20 Genes
Total RNA was extracted from each sample using an RNAprep Pure Plant kit (Tiangen Biotech (Beijing) Co., Ltd., China, DP441) according to the manufacturer's instructions. First strand cDNA was synthesized from 1 μg of RNA using the PrimeScript RT reagent kit (TaKaRa, Japan, RR047A). Quantitative RT-PCR (qRT-PCR) was performed to study the expression profiles of CmoHSP20s using gene-specific primers from TransStart Tip Green qPCR SuperMix (TransGen Biotech, AQ141-04) with an Applied Biosystems QuantStudio 3 & 5 system (ThermoFisher Scientific, United States). The housekeeping gene Actin was used as an internal control. All primers were designed to avoid the conserved region for specificity (Supplementary Table S2).

Statistical Analysis
The relative expression levels of CmoHSP20 genes in roots and leaves were calculated by the 2 −△△CT method (Livak and Schmittgen, 2001). All data were calculated using the expression level under heat stress divided by that under normal condition at the same time points and presented as the means ± standard error (SE) of three replicates and differences were detected using the Student's t-test. Asterisk (p or pp) indicate a significant difference at p < 0.05 or 0.01, respectively.

Identification of the HSP20 Proteins in Cucurbita moschata
Fourty-three HSP20 proteins were initially obtained by HMM search from pumpkin genome database. After removing the repetitive and incomplete sequences and the sequences with a molecular weight outside of the 15-42 kDa range, 33 sequences were confirmed as HSP20 genes and named based on their chromosomal locations. Sequences including genomic sequence, transcript sequence, CDS sequence and protein sequence of the gene family were shown in Supplementary  Tables S3-S6. For information including gene name, gene identity, chromosomal location, open reading frame (ORF) length, amino acid (AA) number, molecular weight (MW) and isoelectric point (pI) of each CmoHSP20 gene, see Table 1. CmoHSP20 genes were distributed on 13 pumpkin chromosomes. The AA number of the CmoHSP20 proteins ranged from 121 (CmoHSP20-31) to 353 (CmoHSP20-21), with an average of 189 AAs. The MW of CmoHSP20s was between 14.29 kDa (CmoHSP20-31) and 39.57 kDa (CmoHSP20-21), with an average of 21.27 kDa, while the pI values of CmoHSP20 ranged from 4.42 (CmoHSP20-1) to 10.08 (CmoHSP20-20), with an average of 7.69. All but six proteins lacked transmembrane domains, suggesting that most CmoHSP20s were non-membrane protein.

Phylogenetic Analysis of CmoHSP20 Proteins
To evaluate the evolutionary relationships of pumpkin HSP20 proteins, a ML phylogenetic tree was constructed. The pumpkin HSP20 proteins were divided into nine subfamilies: cytosol I (CI), CIII, CIV, CV, mitochondria I (MI), MII, endoplasmic reticulum (ER), plastid (P), and peroxisome (PX/Po). Cytoplasmic subfamilies (CI to CV) contained 20 members and constituted the largest clade, while no CmoHSP20s were found in subfamily CII. The M (MI and MII) contained two gene members while the ER, P, and PX/Po subfamilies each contained three HSP20s. Two CmoHSP20s (CmoHSP20-16 and CmoHSP20-25) could not be clustered into any subfamily and remained unclassified (Figure 1; Table 1).

Conserved Motifs and Gene Structures of CmoHSP20s
To investigate the structural features of the HSP20 proteins, the conserved motifs were analyzed using MEME. A total of 10 distinct motifs, named motif 1 to motif 10, were detected (Supplementary Figure S1). The lengths of these conserved motifs varied from 8 (motif 8) to 49 (motif 7) AAs. The number of the conserved motifs for each HSP20 protein FIGURE 1 | Phylogenetic analyses of HSP20s proteins from Cucurbita moschata, Arabidopsis and Oryza sativa. A phylogenetic tree of HSP2 proteins was constructed using MEGA-X software. The 10 subgroups are indicated with different colors. The red triangles represent Cucurbita moschata HSP20s (CmoHSP20s), the green stars represent Arabidopsis thaliana HSP20s (AtHSP20s), and the black circles represent Oryza sativa HSP20s (OsHSP20s).

Homology Modelling of the CmoHSP20 Proteins
In order to obtain a reasonable theoretical structure of the CmoHSP20s, protein homology modelling was performed using a SWISS-MODEL server. The predicted 3D structures are shown in Figure 4. Each CmoHSP20 protein was searched for template automatically in the software and then built model using the template (Supplementary Figure S2). The proteins can be divided into eight groups (group A to H) according to their structure similarity. Group A contained 22 proteins, with the most members. Group B, D, E and G all had only one protein. All CmoHSP20 proteins had β-turn, and the protein structures of the same template were basically similar, suggesting that the predicted results were credible.

Chromosomal Location, Gene Duplication and Synteny Analysis of CmoHSP20s
A total of 33 CmoHSP20 genes were mapped on 13 chromosomes (Chr) and exhibited a non-uniform distribution ( Figure 5). The maximum number of CmoHSP20 genes per Chr was six (Chr01 and Chr04) with only a single gene on Chrs 03, 09, 11 and 16. According to the defined criteria, the analysis of gene duplication events showed that there were two pairs of tandem duplication genes (CmoHSP20-10/-11 and CmoHSP20-22/-23) and two pairs of segmentally duplicated genes (CmoHSP20-10/-28 and CmoHSP20-16/-25) in the pumpkin HSP20 gene family ( Figure 5; Supplementary Table S7).
To understand the phylogenetic relationships of the HSP20 genes with other species, comparative synteny maps of four related genomes (C. moschata VS A. thaliana, C. moschata VS O. sativa, C. moschata VS Cucumis sativus, and C. moschata VS Cucumis melo) were created. Twelve CmoHSP20 genes displayed a syntenic relationship with genes in Arabidopsis, 19 with those in C. sativus, 18 with those in C. melo and only 1 syntenic relationship with O. sativa genes (Figure 6; Supplementary Table S8). The number of collinear gene pairs between pumpkin and other members of Cucurbitaceae (cucumber or melon) was greater than that between more distantly-related Arabidopsis The ratio of Ka/Ks for each orthologous CmoHSP20 gene pair was used to evaluative the type and strength of selective pressure. Both the segmental duplicated CmoHSP20 gene pairs possessed a Ka/Ks ratio <1 (purifying selection), with the higher in the CmoHSP20-16/CmoHSP20-25 pair (Ka/Ks value 0.26).

Promoter Analysis of CmoHSP20 Genes
To understand the role of cis-regulatory elements in CmoHSP20, cis-elements were identified in the 1.5 kb upstream sequence from the translation start site (ATG) of each CmoHSP20 (Figure 7; Supplementary Table S9). Four types of cis-elements, including hormone responsive, stressresponsive, plant development-related, and light responsive elements were identified. The largest number of cis-elements observed across the 33 CmoHSP20 genes was associated with light-related responsiveness, such as G-box, Box 4, AE-box and GT1-motif. The hormone-related cis-acting elements, including abscisic acid responsiveness (ABRE), auxin responsiveness (AuxRR-core and TGA-element), gibberellin responsive elements (GARE-motif, P-box, and TATC-box), MeJA-responsive (CGTCA-motif and TGACG-motif), and salicylic acid-responsive (TCAelement and SARE), were widely present in the promoter region. Among these elements, ABRE, CGTCA-motif and TGACG-motif accounted for the largest part of the hormone responsive category, while the SARE element was only found in the promoter region of CmoHSP20-21. The stress-related category cis-elements containing abiotic stress-related elements (LTR, TC-rich repeats and MBS) and biotic stress-related elements (WUN-motif), as well as plant development-related elements, including circadian control (circadian), differentiation of the palisade mesophyll cells (HD-Zip 1), endosperm expression (GCN_4 motif), meristem expression (CAT-box), flavonoid biosynthetic genes regulation (MBSI), seed-specific regulation (RYelement), and zein metabolism regulation (O2-site) were also identified. Only CmoHSP20-25 promoter region contained HD-Zip 1 element, and only CmoHSP20-26 promoter region contained RY-element element.

Expression Patterns of CmoHSP20 Genes Under Heat Stress
To investigate the expression changes of pumpkin HSP20 genes in response to heat stress, qRT-PCR was used to examine the transcript levels of the 33 CmoHSP20 genes. Overall, the relative expression level of CmoHSP20 genes fluctuated during the 24 h treatments under heat stress conditions (Figure 8). Compared to roots, the expression levels of almost all the genes in leaves were extremely up-regulated. For example,  and -32 were highly induced in leaves after short-term heat stress (42°C for 3 h), with the expression levels were more than 300-fold than under normal condition. The CmoHSP20-16, -24 and -25 genes were down-regulated under heat stress in both roots and leaves, while the expression level of CmoHSP20-31 did not change within neither roots nor leaves in response to heat stress.

DISCUSSION
Environmental stress is one of the most important environmental and economic challenges in the world today, with the potential to dramatically decrease crop yield and quality across agricultural systems. Small heat shock proteins are widely prevalent across plants species and are rapidly synthesized in response to environmental stress (Wang et al., 2004). Many studies have shown that members of the HSP20 gene family produce heat shock proteins and are widely involved in abiotic stress in plants (Yu et al., 2016;Zhao et al., 2018). These family members appear to be highly conserved across plant species (Aghdam et al., 2013).
With the wide availability of genomic information, the functions of the HSP20 family genes in many plants have now been Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 753953 10 characterized, including the model plants Arabidopsis (Scharf et al., 2001) and rice (Sarkar et al., 2009), as well as soybean (Lopes-Caitar et al., 2013), watermelon (He et al., 2019), potato (Zhao et al., 2018), pepper (Guo et al., 2015), grape (Ji et al., 2019) and apple (Yao et al., 2020). A comprehensive identification and analysis of HSP20 gene in pumpkin will add important information to this collective knowledge while revealing their functions in pumpkin stress resistance.
With the successful genome sequencing of Cucurbitaceae plants such as pumpkin, it is possible to identify members of the HSP20 gene family at the whole genome level . In this study, 33 HSP20 gene family members were identified from the pumpkin genome database using bioinformatics methods. We found that the number of pumpkin HSP20 genes was similar to those in Arabidopsis (N 31) and O. sativa (N 33) (this study), as well as Capsicum annuum (N 35) (Guo et al., 2015), but lower than that of Gossypium hirsutum (N 94) (Ma et al., 2016). These findings suggest the possibility of a gene gain event during the evolutionary process from diploid (Arabidopsis, O. sativa, C. annuum and C. moschata) to tetraploid (G. hirsutum).
To determine the evolutionary relationships of HSP20 genes, we constructed a phylogenetic tree based on the AA sequences of C. moschata, A. thaliana and O. sativa HSP20s. Our phylogenetic analysis indicated that the pumpkin HSP20 family could be divided into nine subfamilies, including 20 HSP20 proteins located in cytoplasm belonged to CI, CIII, CIV and CV, two proteins located in mitochondria belonged to MI and MII, three located in endoplasmic reticulum, three in the plastids, and three in the peroxisome. Two proteins could not be clustered into any subfamily (Guo et al., 2015). These patterns are similar to the distribution characteristics of HSP20 family members in Arabidopsis and O. sativa (Scharf et al., 2001;Sarkar et al., 2009), indicating a close relationship among HSP20s from pumpkin, Arabidopsis and O. sativa. This suggests that the biological function of pumpkin HSP20s may be predicted based on the activity of similar genes in these other species. In addition, over 50% of CmoHSP20s were classified into CI-CV subfamilies, providing evidence that the cytoplasm could be the primary functional area of the HSP20 family in pumpkin. However, no HSP20 members belonged to CII family, a finding inconsistent with that from the other focal plants (Lopes-Caitar et al., 2013;Guo et al., 2015;Yao et al., 2020). Possibly, the CII subfamily appeared before pumpkin speciation through multiple gene duplication. Unexpectedly, pumpkin CmoHSP20 members were more closely related to those in the FIGURE 7 | Cis-elements analysis of the CmoHSP20 genes promoter regions. (A) The different colors and numbers indicated the numbers of different promoter elements in the CmoHSP20 genes. (B) Colored blocks represent the different types of cis-elements and their locations in each CmoHSP20 gene. The types, numbers, and locations of potential elements in the promoter regions 1.5-kb upstream of the CmoHSP20 genes were determined using PlantCARE software.
Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 753953 same subfamily from different species than to the other HSP20s from the same species, implying a relatively high synteny between the same HSP20 subfamily across plant species. The exon/intron structure plays an important role in the evolution of multiple gene families (Xu et al., 2012). Approximately 94% of the CmoHSP20 genes (predominantly belonging to the CI and CV subfamilies) have lack or possess only a single intron (Figure 2). This result is not unexpected, as plants overall tend to retain genes with no intron or less intron (Mattick and Gagen, 2001). HSP20 gene families are one of the rapidly expressed genes under environmental stresses (Sarkar et al., 2009;Yu et al., 2016). Here, we tested the expression patterns of pumpkin HSP20 genes under heat stress and found that most HSP20 genes were up-regulated after heat treatment.
Presence of conserved motif was also investigated to further study the evolution of pumpkin HSP20 proteins. We found that 67% of the CmoHSP20 proteins had three to seven conserved motifs and almost all the proteins contained motif 2 and motif 1, consisting of the ACD. Furthermore, we found that most HSP20s in the same subfamilies showed conserved motifs and similar gene structures, a finding also reported in tomato and apple (Yu et al., 2016;Yao et al., 2020). This phenomenon supported their close evolutionary relationship and classification of the subfamilies. The features of conserved motifs and gene structures of CmoHSP20 family may facilitate the identification of additional functions of CmoHSP20 genes, such as in responses to different types of stressors.
Gene duplication events play a major role in genomic rearrangements and expansions (Vision et al., 2000), allowing an increase in functional divergences that enables plants to adapt to changing environmental conditions (Conant and Wolfe, 2008;Grassi et al., 2008;Flagel and Wendel, 2009). The 33 CmoHSP20 FIGURE 8 | Expression analyses of the Cucurbita moschata CmoHSP20 genes in response to heat stress using qRT-PCR. The mean expression value was calculated from three replicates. Vertical bars indicate the standard deviation. Values of 0, 3, 6, 12, and 24 indicate hours after treatment. Mean values and standard deviations are calculated according the data. Asterisk (p or pp) indicate a significant difference at p < 0.05 or 0.01, respectively.
Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 753953 genes were unevenly dispersed on 13 chromosomes of pumpkin, and five clusters with at least two CmoHSP20 genes each were identified ( Figure 5). A number of family members gathered into clusters in certain segments, especially in Chrs 04 and 13. We discovered two pairs of predicted tandem duplicated genes and two pairs of predicted segmental duplicated genes, suggesting that the two duplicated events contributed to the expansion of CmoHSP20 family. Furthermore, both the segmental duplicated genes were found to have undergone strong purifying selective pressure, confirming that the evolutionary pattern of CmoHSP20 genes was highly conservative. In addition, numerous hormone responsive, stress-responsive, plant development-related and light responsive elements were found in the promoter regions of CmoHSP20 genes (Figure 7; Supplementary Table S9), indicating that pumpkin HSP20 genes performed multiple or specific functions. All CmoHSP20 genes contained light responsive cis-elements, suggesting that the pumpkin HSP20s were essential in plant growth and development. It has been demonstrated that HSP20s played key roles in the control of plants' response to environmental stress (Neta-Sharir et al., 2005;Guo et al., 2015;Zhao et al., 2018;He et al., 2019;Ji et al., 2019). We found that most of these genes were up-regulated under heat stress. We also tested the expression levels of other HSP and heat shock transcription factors, and found that the CmoHsfA2, CmoHSP70 and CmoHSP83 were all induced by heat stress, and showed similar expression patterns with CmoHSP20 genes (Supplementary Figure S3). It is worth noting that the relative expression levels of 6 HSP20 genes (CmoHSP20-7, 13, 18, 22, 26 and 32) were extremely up-regulated after 3 h of heat stress. These genes might be mainly involved in the heat stress biological pathway and could be used as candidate genes for heat resistant breeding of Cucurbitaceae vegetable crops.

CONCLUSION
In this study, a genome-wide identification of HSP20 genes in C. moschata was performed and a total of 33 CmoHSP20 genes were identified. These genes are unequally distributed on 13 chromosomes and were classified into nine subfamilies based on their phylogenetic relationships. The basic features, genome distribution, gene structures, conserved motifs, gene duplication events, and cis-elements of these genes were analyzed, providing a foundational understanding of the evolutionary relationships within the HSP20 gene family. CmoHSP20 genes expression was studied using qRT-PCR, results from which revealed that 6 pumpkin HSP20 genes were highly induced by heat stress. Our results provide a basis for identifying important candidate HSP20 genes involved in pumpkin responses to heat stress.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding authors.