Transcriptome Characterization and Expression Analysis of Chemosensory Genes in Chilo sacchariphagus (Lepidoptera: Crambidae), a Key Pest of Sugarcane

Insect chemoreception involves many families of genes, including odourant/pheromone binding proteins (OBP/PBPs), chemosensory proteins (CSPs), odourant receptors (ORs), ionotropic receptors (IRs), and sensory neuron membrane proteins (SNMPs), which play irreplaceable roles in mediating insect behaviors such as host location, foraging, mating, oviposition, and avoidance of danger. However, little is known about the molecular mechanism of olfactory reception in Chilo sacchariphagus, which is a major pest of sugarcane. A set of 72 candidate chemosensory genes, including 31 OBPs/PBPs, 15 CSPs, 11 ORs, 13 IRs, and two SNMPs, was identified in four transcriptomes from different tissues and sexes of C. sacchariphagus. Phylogenetic analysis was conducted on gene families and paralogs from other model insect species. Quantitative real-time PCR (qRT-PCR) showed that most of these chemosensory genes exhibited antennae-biased expression, but some had high expression in bodies. Most of the identified chemosensory genes are likely involved in chemoreception. This study provides a molecular foundation for studying the function of chemosensory proteins, and an opportunity for understanding how C. sacchariphagus behaviors are mediated via chemical cues. This research might facilitate the discovery of novel strategies for pest management in agricultural ecosystems.


INTRODUCTION
Insects, the most diverse and successful group of animals on earth, have existed for more than 350 million years (Stork, 1993; Chen et al., 2018); they not only affect the natural environment but also influence human life and productivity in many ways. A sophisticated chemosensory system, essential for survival and reproduction, is one reason insects are so prominent among animals (Leal, 2013).
Insect ORs are members of a novel family of seven-transmembrane proteins located in the dendritic membrane of olfactory sensory neurons (OSNs). Their membrane topology (intracellular N-terminus and extracellular C-terminus) is reversed compared with that of G-protein-coupled vertebrate ORs (Clyne et al., 1999; Benton et al., 2006), and they were first identified in Drosophila melanogaster (Clyne et al., 1999; Vosshall et al., 1999). During insect olfactory signal transduction, an OR and the co-receptor ORCO form an odourant-gated ion channel complex that plays a fundamental role in converting chemical signals into electrical signals (Larsson et al., 2004; Jones et al., 2005; Sato et al., 2008; Smart et al., 2008; Wicher et al., 2008; Butterwick et al., 2018; Fandino et al., 2019).
Sensory neuron membrane proteins (SNMPs), located on dendritic cilia in insects, belong to the CD36 family of two-transmembrane-domain membrane proteins (Rogers et al., 2001; Hu et al., 2016). Insect SNMPs are usually divided into two subfamilies, SNMP1 and SNMP2, although a recent study reported a third, SNMP3, in Lepidoptera. SNMP1, which is specifically expressed in pheromone-specific OSNs in the antennae, is thought to function in pheromone detection (Vogt et al., 2009); the function of SNMP2 has not yet been clarified; and SNMP3 shows biased expression in the larval midgut and may be involved in the immune response to viral and bacterial infections in the silkworm (Zhang et al., 2020).
Chilo sacchariphagus Bojeris, a lepidopteran of the Pyralidae family, is one of the most dangerous pests of sugarcane. Its larvae cause damage by mining the seedlings and stems of sugarcane; this species also harms sorghum, corn and other crops. C. sacchariphagus causes great economic losses to the sugar industry every year in China, as well as in South Africa, India, Swaziland, and other countries and regions (Bezuidenhout et al., 2008; Geetha et al., 2010). At present, research on the sugarcane borer is mainly focused on identifying resistant varieties, determining the resistance mechanisms of sugarcane and developing biological control techniques (including the utilization of Trichogramma chilonis Ishii, pheromones, and pathogenic nematodes) (Nibouche and Tibère, 2010; Nibouche et al., 2012; Sallam et al., 2016). Chemoreception plays an irreplaceable role in the foraging, mating, oviposition and other behaviors of C. sacchariphagus, which are vital for its survival in the natural environment. However, few reports have been published on this topic, including on the characterization and function of chemosensory genes and the mechanisms of chemosensory recognition.
In this study, we sequenced and analyzed the C. sacchariphagus adult antennal transcriptomes using the Illumina HiSeq™ 4000 platform. Seventy-two chemoreception-related genes were identified in total by analyzing the transcriptome data, including 31 OBP/PBPs, 15 CSPs, 11 ORs, 13 IRs, and two SNMPs. Our aim was to identify chemoreception-related genes in this pest species, which is destructive to sugarcane production and the sugar industry in China, across Asia and in the Pacific and India. We intend to provide a theoretical basis for an improved understanding of how C. sacchariphagus recognizes and locates its hosts, forages, and mates.

Insects
The eggs of C. sacchariphagus, obtained from a wild field, were reared at 27 ± 1 °C with 75 ± 5% relative humidity and a 14 L:10 D photoperiod at Guangdong Engineering Research Center for Pesticide and Fertilizer, Institute of Bioengineering, Guangdong Academy of Sciences, Guangzhou, China. Larvae were reared on an artificial diet under the same conditions. After at least three generations, newly emerged male and female adult C. sacchariphagus were chosen as experimental subjects. After pupation, male and female pupae were separated and fed with 10% sugar solution. Antennae of unmated male and female individuals were collected 2 days after eclosion, immediately frozen in liquid nitrogen, and stored at −80 °C. Antennae with intact structure were removed using tweezers.

cDNA Library Construction, Transcriptome Sequencing, Assembly and Functional Annotation
Twenty pairs of antennae and 20 body tissues (without antennae) from males and females of C. sacchariphagus were used for RNA extraction. For each sample, total RNA was extracted using TRIzol reagents (Invitrogen, United States) according to the manufacturer's instructions. RNase-free DNase I (Takara Biotechnology Co., Ltd., Dalian, China) was used to remove contaminating genomic DNA. The quantity and quality of RNA were assessed by agarose gel electrophoresis and on a Bioanalyzer 2100 system (Agilent Technologies, United States). RNA with high purity, concentration and integrity was chosen for cDNA library construction and final Illumina sequencing at Gene Denovo Biotechnology Company (Guangzhou, China). The cDNA was then tested for quality and sequenced on an Illumina HiSeq™ 4000 platform as 150 bp paired-end reads.
The obtained raw reads were processed to remove adapters, primers, low-quality sequences, and ambiguous "N" nucleotides. Then, quality assessment of the clean data was carried out by Q30, and the GC content and sequence duplication level were calculated. Clean data were assembled into contigs using Trinity software and subsequently assembled into transcripts using the De Bruijn graph method. The assembled transcripts were further clustered to form unigenes by using the TGI Clustering Tool (Quackenbush et al., 2001;Pertea et al., 2003).
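Two of the summary statistics commonly reported for an assembly of this kind, the N50 length and the GC content, can be illustrated with a minimal Python sketch. This is not the authors' pipeline; the function names and the toy lengths are hypothetical and serve only to show how the quantities are defined.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def gc_content(seq):
    """Fraction of G/C among unambiguous bases (N and other symbols ignored)."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return 0.0
    return sum(base in "GC" for base in called) / len(called)


# Hypothetical unigene lengths (bp), for illustration only.
toy_lengths = [200, 300, 500, 800, 1200, 2000]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
toy_gc = gc_content("ATGCNN")
```

Because N50 is weighted by sequence length, it is usually larger than the mean length when an assembly contains many short sequences.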
The annotation of all unigenes was performed by BLASTx against a pooled database containing protein entries from the National Center for Biotechnology Information non-redundant protein (NCBI-NR), Swiss-Prot, Gene Ontology (GO), Clusters of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases with an E-value < 10⁻⁵. After amino acid sequence prediction, annotation of unigenes was obtained using HMMER software (Eddy, 1998), and GO annotations were determined by Blast2GO. In addition, WEGO (Ye et al., 2006) was utilized to perform GO functional classification and evaluate the distribution of gene functions at the macro level. Unigene functions were also predicted by aligning their sequences with the COG database.
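The E-value threshold described above can be sketched as a filter over BLAST results. The Python below assumes the hits were exported in the standard 12-column tabular layout of BLAST+ (`-outfmt 6`), which the text does not state; the unigene and subject identifiers are hypothetical toy values, and keeping the highest-bitscore hit per query is an illustrative choice rather than the authors' documented procedure.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # threshold given in the text


def best_hits(handle, cutoff=EVALUE_CUTOFF):
    """Keep, for each query, the highest-bitscore hit with E-value < cutoff.

    Expects BLAST+ tabular columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        query, subject = row[0], row[1]
        identity = float(row[2])
        evalue = float(row[10])
        bitscore = float(row[11])
        if evalue >= cutoff:
            continue  # hit not significant at the chosen threshold
        current = best.get(query)
        if current is None or bitscore > current["bitscore"]:
            best[query] = {
                "subject": subject,
                "identity": identity,
                "evalue": evalue,
                "bitscore": bitscore,
            }
    return best


# Hypothetical toy table: two hits for one unigene, one weak hit for another.
TOY_TABLE = (
    "Unigene0001\tsubjA\t96.0\t470\t19\t0\t1\t1410\t1\t470\t0.0\t900\n"
    "Unigene0001\tsubjB\t60.0\t400\t160\t2\t1\t1200\t5\t404\t1e-80\t300\n"
    "Unigene0002\tsubjC\t35.0\t90\t58\t1\t10\t279\t3\t92\t0.002\t38\n"
)
hits = best_hits(io.StringIO(TOY_TABLE))
```

In this toy table the second unigene is left unannotated because its only hit has an E-value above the cutoff.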

Phylogenetic Analysis
The amino acid sequences of the candidate chemosensory-related genes of C. sacchariphagus were aligned using CLUSTALX 2.0 (Larkin et al., 2007). The candidate OBPs, PBPs, CSPs, ORs, IRs, and SNMPs of C. sacchariphagus were chosen for phylogenetic analysis along with genes from model species of Lepidoptera (Manduca sexta and Bombyx mori), Diptera (D. melanogaster), and Hymenoptera (Apis mellifera). Phylogenetic trees were constructed by the neighbor-joining method, as implemented in MEGA6.0 software. Node support was assessed using a bootstrap procedure with 1000 replicates (Tamura et al., 2013). Phylogenetic trees were colored and arranged using FigTree (Version: 1.4.2).
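The bootstrap procedure mentioned above rests on resampling alignment columns with replacement and recomputing pairwise distances for each replicate. The Python sketch below illustrates only that resampling step together with a simple p-distance (the proportion of differing sites); the choice of p-distance is an assumption, since the text does not state the distance model, and the tree building itself (done in MEGA6.0) is not reproduced. Sequence names and residues are hypothetical.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing residues, ignoring columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the alignment length."""
    n_cols = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment, for illustration only.
TOY_ALIGNMENT = {
    "seqA": "MKT-LLV",
    "seqB": "MKTALLI",
    "seqC": "MRSALFI",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(TOY_ALIGNMENT, rng) for _ in range(1000)]
observed_ab = p_distance(TOY_ALIGNMENT["seqA"], TOY_ALIGNMENT["seqB"])
```

In a full analysis, a tree would be built from each replicate, and the support for a node would be the fraction of replicate trees that recover it.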

Expression Analysis by Real-Time Quantitative PCR (qRT-PCR)
Real-time quantitative PCR (qRT-PCR) was performed to verify the expression of candidate chemosensory genes. Tissue samples were collected from C. sacchariphagus adults 2 days after eclosion in three biological replicates, and total RNA was extracted as described above. One microgram of total RNA from the transcriptome samples was subjected to reverse transcription in a total reaction volume of 20 µL according to the manufacturer's instructions (PrimeScript™ RT Reagent Kit, Takara, Japan) to obtain the first-strand cDNAs. Following the manual for the SYBR Green I Master (Roche Diagnostics Ltd., Lewes, United Kingdom), qRT-PCR was run in 10 µL reaction volumes [1 µL cDNA (2 ng/µL), 5 µL SYBR Green I Master, 0.5 µL of each primer, and 3 µL ddH2O] on a LightCycler® 480 real-time PCR system (Roche Diagnostics Ltd.) with the following program: denaturation at 95 °C for 5 min followed by 40 cycles of 5 s at 95 °C, 20 s at 60 °C, and 20 s at 72 °C. β-actin was used as the internal reference gene, and each gene was tested in triplicate. The relative expression levels of the candidate chemosensory genes, normalized to the internal control gene, were calculated using the 2^−ΔΔCt method (Livak and Schmittgen, 2001).
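The relative quantification method of Livak and Schmittgen (2001), commonly written 2^−ΔΔCt, can be sketched in a few lines of Python. The Ct values below are hypothetical, and the choice of the body sample as calibrator is an assumption made for illustration; the text does not state which sample served as calibrator.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change by the 2^-ddCt method.

    dCt  = mean Ct(target) - mean Ct(reference gene), per sample
    ddCt = dCt(sample of interest) - dCt(calibrator sample)
    """
    delta_sample = mean(ct_target) - mean(ct_reference)
    delta_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    return 2 ** -(delta_sample - delta_calibrator)


# Hypothetical triplicate Ct values, for illustration only.
antenna_target = [22.0, 22.5, 21.5]   # candidate gene, antennae
antenna_actin = [18.0, 18.25, 17.75]  # reference gene, antennae
body_target = [27.0, 27.5, 26.5]      # candidate gene, body (assumed calibrator)
body_actin = [18.0, 18.0, 18.0]       # reference gene, body

fold = relative_expression(antenna_target, antenna_actin, body_target, body_actin)
# dCt(antennae) = 4.0, dCt(body) = 9.0, ddCt = -5.0, so fold = 2**5 = 32.0
```

A fold change above 1 indicates higher normalized expression in the sample of interest than in the calibrator. The method assumes roughly equal amplification efficiencies for the target and reference genes.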
A total of 4662 unigenes were annotated with functional groups classified into 52 subcategories under three main GO categories ("biological process," "cellular component," and "molecular function") via Blast2GO and WEGO software (Figure 2). Among 24 subcategories in the "biological process" category, "metabolic process" and "cellular process" were the predominant terms. In the "cellular component" category, "cell part" and "cell" were the most abundant GO terms. Of the 11 subcategories under the "molecular function" category, two contained the largest groups, namely, "catalytic activity" and "binding."
Among the 11 candidate ORs, four were short (no more than 100 amino acids), and the remaining seven had a deduced protein longer than 200 amino acids (Table 3). Three sequences (CsacORCO, CsacOR1, and CsacOR4) were predicted to be full-length OR genes with intact open reading frames of about 1500 bp and 5-7 transmembrane domains, which are characteristic of typical insect ORs. BLASTx results showed that the identity of these candidate ORs with known insect ORs was relatively low compared with that of the OBPs. Only one candidate OR, CsacORCO, exceeded 80% identity with its closest match (96%), while the identities of the remaining ORs ranged from 38 to 71%. Two ORs, CsacOR1 and CsacOR5, formed a small branch that was closely related to BmorOR1 and BmorOR9 from B. mori and MsexOR60 from M. sexta, and these ORs formed a distinct subgroup (Figure 4). Most of the splits in the tree were supported by bootstrap values, and only a few splits were unreliable.
Bioinformatic analysis led to the identification of 15 different sequences encoding candidate CsacCSPs. Due to their complete N-termini, all the sequences had signal peptides. The identity of the 15 CsacCSPs ranged from 48 to 89% (Table 3). Neighbor-joining tree analysis showed that CsacCSP13 and CsacCSP15 formed a specific branch that was close to BmorCSP1 and the BmorCSP1 variant from B. mori. Additionally, a specific branch consisting of two CSPs from C. sacchariphagus (CsacCSP4 and CsacCSP10) was divergent from the CSPs of other insects, and the two CsacCSPs have a close relationship to the CSP7 precursor from B. mori (Figure 5).
The putative IR genes in the C. sacchariphagus transcriptome were identified according to their similarity to known insect IRs. Bioinformatic analysis led to the identification of 13 candidate IRs, of which eight had higher than 70% identity with known insect IRs, and only two had identities lower than 60%. Like typical insect IRs, which have three transmembrane domains, three IR candidates in C. sacchariphagus (CsacIR4, CsacIR5, and CsacIR7) were predicted to have three transmembrane domains by TMHMM2.0 (Table 3). In the phylogenetic analysis, CsacIR2, CsacIR7, and IRs from M. sexta (MsexIR1) and D. melanogaster (DmelIR75a, DmelIR75b, and DmelIR75c) formed a distinct subgroup, while CsacIR6, CsacIR10, and CsacIR11 formed a branch closely related to IR75d from D. melanogaster and IR75a, IR75p.1, and IR75p.3 from M. sexta; additionally, CsacIR1, CsacIR3, and CsacIR12 clustered in a specific branch with DmelIR8a, AmelIR25a, MsexIR8a, MsexIR25a, and BmorIR25a, with strong bootstrap support (Figure 6).
Sensory neuron membrane proteins were identified in pheromone-sensitive neurons in lepidopteran insects and are thought to function in the process of pheromone recognition (Rogers et al., 2001). Two SNMPs (CsacSNMP1 and CsacSNMP2) were identified in our transcriptome. Both have an identity of greater than 80% with SNMPs of Chilo suppressalis (Table 3). According to the phylogenetic analysis, both C. sacchariphagus candidate SNMPs clustered with their SNMP orthologs into separate subclades (Figure 7), among which CsacSNMP1, BmorSNMP1, and MsexSNMP1 formed a specific branch and CsacSNMP2 and SNMP2 from B. mori and M. sexta shared a close relationship.

DISCUSSION
In this study, the transcriptome of the pest C. sacchariphagus was analyzed using Illumina HiSeq™ 4000 technology. We obtained 16.60 GB of clean data that was assembled into 60084 unigenes with a mean length of 829 bp and an N50 length of 1694 bp. After assembly, 60.67% of the unigenes were shorter than 500 bp, possibly due to the short read length of Illumina sequencing. Among the 60084 unigenes, 28330 (47.15%) were annotated, and the remaining 52.85% had no significant match in any of the databases searched. This may be caused by the lack of genomic and transcriptomic information for this moth in the databases. This antennal and body transcriptome sequencing provides a dataset of chemosensory genes, including 28 OBPs, three PBPs, 15 CSPs, 11 ORs, 13 IRs, and two SNMPs.
Odourant/pheromone binding proteins interact with semiochemicals, hormones or other biologically active chemicals that enter the body through pores and then transport them to ORs located on the membranes of olfactory receptor neurons (Pelosi and Maida, 1995; Vogt, 1995; Kaissling, 1998). Fewer OBPs/PBPs were identified in this transcriptome of C. sacchariphagus (31) than in B. mori (44) or D. melanogaster (51) (Hekmat-Scafe et al., 2002; Gong et al., 2007). The difference in the number of OBPs might be related to the sequencing method and depth, the process of sample preparation, or evolutionary differences among species. These results are comparable to those reported for the transcriptomes of Spodoptera littoralis (33), Spodoptera exigua (34), and Helicoverpa armigera (26) (Liu N. Y. et al., 2012; Poivet et al., 2013; Liu et al., 2015; Walker et al., 2019).
This suggests that C. sacchariphagus OBPs show conservation in gene numbers. Some OBPs are conserved and have orthologous relationships with counterparts from other insects. Insect OBPs/PBPs, mainly expressed in the antennae, are considered to have an olfactory function. Analysis of OBP/PBP expression profiles in different organs and tissues could reveal their likely functions. qRT-PCR results showed that 22 CsacOBPs/PBPs displayed antenna-enriched expression, indicating that these genes may play critical roles in the process of olfactory reception. Among these genes, 13 (CsacOBP2/5/6/9/12/15/17/24/25/26/27 and CsacPBP1/2) were mainly expressed in male antennae, suggesting that these genes may encode proteins involved in sex-specific behaviors, including selectively sensing and transporting sex pheromones released by females in the process of molecular recognition and searching for suitable mates (Gu et al., 2013; Jin et al., 2014; Chang et al., 2015; Zhu et al., 2016, 2019). Ten genes (CsacOBP7/8/10/13/14/16/18/20/21/28) without significant differences in expression levels between males and females may function as general odourant detectors rather than in pheromone recognition (Li et al., 2008; Pelletier and Leal, 2009; He et al., 2019a).

FIGURE 6 | Phylogenetic analysis of putative ionotropic receptors (IRs) of C. sacchariphagus. The tree was constructed in MEGA6.0 using the neighbor-joining method. Genes from C. sacchariphagus are labeled in red. IRs from D. melanogaster (Diptera) are labeled in dark blue, IRs from B. mori (Lepidoptera) are labeled in purple, IRs from M. sexta (Lepidoptera) are labeled in green, and IRs from A. mellifera (Hymenoptera) are labeled in light blue.

Some genes (CsacOBP1/3/4/11/19/22/23/27) showed female antenna-biased expression, indicating that those OBPs may help to locate oviposition sites by recognizing chemicals from hosts, a model that is supported by previous studies of Pieris rapae (Renwick et al., 1992;Sato et al., 1999;Li et al., 2020).
Fifteen CSPs were identified by transcriptome sequencing. This number is comparable to the number of CSPs in H. armigera (18), Heliothis assulta (17), S. littoralis (21), B. mori (20), and S. exigua (20) but much higher than that of D. melanogaster (4) (Wanner et al., 2004; Gong et al., 2007; Zhou et al., 2010; Poivet et al., 2013; Leitch et al., 2015; Liu et al., 2015; Zhang et al., 2015; Walker et al., 2019), indicating that the numbers of CSP genes differ among species. CSPs exist in insect chemosensory and non-chemosensory organs and tissues, including antennae, legs, pheromone glands, and wings (Picimbon et al., 2001; Ban et al., 2003; Dani et al., 2011; Liu N. Y. et al., 2012; Wei et al., 2017). In our study, 10 CsacCSPs were significantly expressed in the antennae, and these CSPs may participate in general odourant recognition and perception (Pelosi et al., 2014; Jia et al., 2018). Four CSPs showed high expression in legs and might be associated with gustatory behaviors, such as detecting non-volatile chemicals (Jia et al., 2020). In the qRT-PCR analysis, some identified CsacOBPs and CsacCSPs displayed high expression in male bodies, and we speculate that these genes are likely to be involved in different functions in non-sensory organs and tissues of the insect body. Some OBPs and CSPs in male insect seminal fluid might be related to binding and releasing pheromones. In D. melanogaster, OBPs were found to be components of the seminal fluid (Takemori and Yamamoto, 2009); LmigCSP91 was found to be highly expressed in the reproductive organs of male Locusta migratoria and showed good affinity for a pheromone produced in the same reproductive organs (Zhou et al., 2013). Some OBPs are male specific and could be transferred into female bodies during mating, indicating that these OBPs might function in sperm-egg communication (Findlay et al., 2008; Takemori and Yamamoto, 2009; Prokupek et al., 2010).
In addition, CSPs are involved in releasing some molecules in male glands; for example, a CSP was found in large quantities in the ejaculatory apparatus, which secretes the male pheromone vaccenyl acetate (Dyanov and Dzitoeva, 1995).
Laodelphax striatellus (133) (He et al., 2020), Sogatella furcifera (135) (He et al., 2018), and A. mellifera (170) (Robertson and Wanner, 2006), suggesting that different sequencing methods and depths may affect the outcome of studies; the lack of genomic and transcriptomic information in the databases may influence the annotation results for C. sacchariphagus, and some ORs expressed at low levels may be difficult to detect (Wang et al., 2017). In the neighbor-joining tree of ORs, CsacOR1 and CsacOR5 are orthologs of BmorOR1; CsacOR4 is the ortholog of BmorOR19; and CsacOR10 clustered close to BmorOR56. In B. mori, OR1 is the receptor of the pheromone bombykol; OR19 can sense linalool, which is related to selection of the oviposition environment; and OR56, specific and highly sensitive to cis-jasmone, is involved in the sensing of odor molecules released by plants and signal transduction (Wanner et al., 2007; Anderson et al., 2009; Tanaka et al., 2009). The qRT-PCR results showed that CsacOR1/5/10 were highly expressed in the male antennae, suggesting that they are specifically involved in the detection of sex pheromones, while CsacOR4 has a higher expression in the female body than in the male body, indicating that it is likely involved in the regulation of female-specific behaviors, such as the localization of oviposition sites and oviposition. The expression of CsacORCO, which was highly conserved in the OR tree, was significantly antenna-specific. The different expression levels of the ORs in different organs, tissues, and sexes suggest that they might perform different functions, which should be studied further.
In insects, SNMP1 is usually expressed in pheromone-sensitive OSNs and is important for pheromone perception (Jin et al., 2008;Nichols and Vogt, 2008;Vogt et al., 2009;Gomez-Diaz et al., 2016). However, SNMP2 functions remain unclear. In the present study, two SNMPs were identified in C. sacchariphagus. Both were conserved with respect to other holometabolous insect species. They exhibited a clear antenna-predominant expression, suggesting that CsacSNMP1 may be associated with pheromone reception.
In conclusion, 72 candidate chemosensory protein genes (31 OBP/PBPs, 15 CSPs, 11 ORs, 13 IRs, and two SNMPs) were identified for the first time via transcriptome sequencing analysis in C. sacchariphagus, which is an important agricultural pest. This study will not only serve as a valuable resource for future research on the chemosensory system of C. sacchariphagus and other lepidopteran species but also contribute to the development of creative and sustainable pest management strategies involving interference with olfaction.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
JL, HL, JY, YA, and HW conceived, coordinated, and designed the research. YM, JL, and DS assembled and analyzed the transcriptome datasets. JL and HL performed experiments. JL, JY, YA, and HW drafted the manuscript. All authors read and approved the final manuscript.

FUNDING
This work was supported by the GDAS' Project of Science and Technology Development (Grant No. 2019GDASYL-0103040) and GDAS' Project of Science and Technology Development (Grant No. 2020GDASYL-20200103056). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

ACKNOWLEDGMENTS
We thank M.D. student Anwen Liang (State Key Laboratory of Biocontrol, Sun Yat-sen University) for technical assistance. We also thank Prof. Qiang Zhou (State Key Laboratory of Biocontrol, Sun Yat-sen University) for editorial assistance and comments on the manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.