Transcriptome Analysis and Characterization of Chemosensory Genes in the Forest Pest, Dioryctria abietella (Lepidoptera: Pyralidae)

In Lepidoptera, RNA sequencing has become a useful tool in identifying chemosensory genes from antennal transcriptomes, but little attention is paid to non-antennal tissues. Though the antennae are primarily responsible for olfaction, studies have found that a certain number of chemosensory genes are exclusively or highly expressed in the non-antennal tissues, such as proboscises, legs and abdomens. In this study, we report a global transcriptome of 16 tissues from Dioryctria abietella, including chemosensory and non-chemosensory tissues. Through Illumina sequencing, totally 952,658,466 clean reads were generated, summing to 142.90 gigabases of data. Based on the transcriptome, 235 chemosensory-related genes were identified, comprising 42 odorant binding proteins (OBPs), 23 chemosensory proteins (CSPs), 75 odorant receptors (ORs), 62 gustatory receptors (GRs), 30 ionotropic receptors (IRs), and 3 sensory neuron membrane proteins (SNMPs). Compared to a previous study in this species, 140 novel genes were found. A transcriptome-wide analysis combined with PCR results revealed that except for GRs, the majority of other five chemosensory gene families in Lepidoptera were expressed in the antennae, including 160 chemosensory genes in D. abietella. Using phylogenetic and expression profiling analyses, members of the six chemosensory gene repertoires were characterized, in which 11 DabiORs were candidates for detecting female sex pheromones in D. abietella, and DabiOR23 may be involved in the sensing of plant-derived phenylacetaldehyde. Intriguingly, more than half of the genes were detected in the proboscises, and one fourth of the genes were found to have the expression in the legs. Our study not only greatly extends and improves the description of chemosensory genes in D. abietella, but also identifies potential molecular targets involved in olfaction, gustation and non-chemosensory functions for control of this pest.


INTRODUCTION
Over the last decade, Illumina sequencing has been applied to a diverse range of insects regarding tissue transcriptomes (Kawahara et al., 2019;McKenna et al., 2019). Taking the Lepidoptera as an example, we refer to the studies on the identification of chemosensory-related genes derived from antennal transcriptomes, and find that at least 55 moth species, including Dioryctria abietella in this study, have described chemosensory gene families involved in olfaction, gustation, development, insecticide resistance and even the sensation of humidity and temperature, i.e., odorant binding proteins (OBPs), chemosensory proteins (CSPs), odorant receptors (ORs), gustatory receptors (GRs), ionotropic receptors (IRs) and sensory neuron membrane proteins (SNMPs) (Agnihotri et al., 2016;Pelosi et al., 2018;Robertson, 2019;Xu, 2020;Montagné et al., 2021). Such great attention to chemosensory genes in moth antennae is attributable to their importance during the processes of smell and taste reception, as they mediate chemosensory-related behaviors of moths, such as the seeking of feeding or ovipositing host plants, the escape of dangerous signals and the recognition of conspecific partners (Cury et al., 2019;Zhang et al., 2019;Liu X. L. et al., 2020;Fleischer and Krieger, 2021). Through transcriptomic sequencing and analyses, we are able to explore the adaptation and specialization of moths to host, non-host plants or the constantly changing habitats, and identify candidate molecular targets associated with chemosensation, such as OBPs, CSPs, ORs, GRs, IRs, and SNMPs facilitating the screening of behaviorally active compounds by a reverse chemical ecology strategy.
During the process of olfactory reception in insects, two classes of binding proteins of OBPs and CSPs are responsible for filtering a variety of odorant molecules in the surrounding environment, as the first barriers for receiving and sensing chemical signals (Leal, 2013;Pelosi et al., 2018). Outside the olfactory roles, some members of OBPs or CSPs that are exclusively or highly expressed in gustatory or non-sensory organs have been found to be involved in additional physiological roles, such as taste, development, flight and insecticide resistance (Pelosi et al., 2018;Wang et al., 2020).
Aside from the ORs, GRs, and IRs described above, a relatively small number of transmembrane proteins, SNMPs, are described in insects. In most species, two SNMPs are ubiquitous, representing SNMP1 and SNMP2 (Vogt et al., 2009). In Lepidoptera, the third SNMP gene subfamily was identified, bringing the numbers of SNMPs to three in some species (Liu et al., 2015;Zhang et al., 2020). More recently, a study reported 4 SNMP-encoding genes in a zygaenid species, Achelura yunnanensis, with two gene variants in the SNMP2 subfamily . Although SNMP genes have been reported for over two decades, their functional studies are limited to members of the SNMP1 group which are involved in the sensing of sex pheromones (Pregitzer et al., 2014;Xu et al., 2021).
The coneworm, D. abietella (Lepidoptera: Pyralidae), is an oligophagous pest where its larvae feed on cones of the Pinaceae family, including the genus Pinus, Abies fabri (Mast.) Craib., Picea abies (L.) Karst., and P. likiangensis var. linzhiensis (Rosenberg and Weslien, 2005;Song et al., 2020;Tang et al., 2020). Early studies have reported sex pheromone components [(3Z,6Z,9Z,12Z,15Z)-pentacosapentaene and (9Z,11E)-tetradecadienyl acetate] of this species and their application in fields (Löfstedt et al., 1983(Löfstedt et al., , 2012. Apart from that, terpene compounds produced by host plants have been implicated to be attractive to this pest (Tang et al., 2020). However, little is known about molecular mechanisms underlying intraspecific communication between female and male moths, as well as interspecific interaction between this pest and host plants. More recently, Xing et al. (2021) described six chemosensory gene repertoires from D. abietella by antennal transcriptome analysis (15 OBPs, 18 CSPs, 65 ORs, 5 GRs, 24 IRs and 5 SNMPs), but the majority of which were partial sequences. Moreover, after our transcriptome-wide sequencing and comparative analyses of 16 tissues including the antennae in this species, the numbers of ORs, GRs, IRs, and SNMPs reported in the previous study were summed down to 42, 4, 14, and 3, respectively. Thus, our study has expanded upon those results here with the identification of 42 OBPs, 23 CSPs, 75 ORs, 62 GRs, 30 IRs and 3 SNMPs in D. abietella, and provides an extensive resource for studying putative functional roles of chemosensory genes in this pest, coupled with expression profiling analyses.

Insects and Tissue Collection
The last instar larvae and pupae of D. abietella were collected from pine cones of Pinus armandii in Zixi Mountain, Chuxiong city, Yunnan Province, China. Next, the larvae were reared in the laboratory until pupae emergence, and pupae were sexed and individually kept in cages. Emerged adults were supplied with 10% honey solution. Rearing conditions were as follows: 25 ± 1 • C, 60 ± 5% relative humidity and a 12-h light/dark cycle.
For RNA sequencing (RNA-Seq) and expression profiling analyses, 16 tissues including 50 antennae, 100 proboscises, 20 heads without antennae and proboscises, 5 thoraxes, 3 abdomens, 30 legs, 50 wings of female or male moths, as well as 30 female pheromone glands with ovipositors and 30 male hairpencils were collected from 3-day-old moths. These collected tissues were immediately frozen in liquid nitrogen and ground by glass homogenizers. After completely homogenized, 1 mL of TRIzol reagent (TaKaRa, Dalian, Liaoning, China) was added to each tissue. The prepared samples were stored at −70 • C until RNA extraction.

RNA Extraction and cDNA Preparation
Total RNAs of various tissues were extracted by using TRIzol reagent (TaKaRa, Dalian, Liaoning, China), according to the manufacturer's protocols. Next, genomic DNA was digested with gDNA Eraser at 42 • C for 2 min. The purified RNA samples were used for construction of cDNA libraries and expression profiling analyses. In brief, cDNA templates of different tissues were synthesized with the PrimeScript TM RT reagent Kit (TaKaRa, Dalian, Liaoning, China).

Library Construction and Sequencing
First, we assessed the quality and quantity of RNA samples by using agarose gels, the NanoPhotometer R Spectrophotometer (IMPLEN, CA, United States), the Qubit R 2.0 Fluorometer (Life Technologies, CA, United States), and the Bioanalyzer 2100 System (Agilent Technologies, CA, United States), respectively. Next, 1 µg of total RNA for each tissue evenly mixed by three biological pools was used to constructed a cDNA library using the NEBNext R Ultra TM RNA Library Prep Kit for Illumina R (NEB Inc., United States). Lastly, the prepared libraries were sequenced with 150-bp paired-end reads on an Illumina HiSeq TM 2000 instrument.
After cDNA libraries were sequenced, redundant sequences were discarded from the resulting raw reads, including adapter, low quality and containing poly-N sequences. The qualityfiltered reads (clean reads) were assembled into transcripts by Trinity v2.5.1 (Grabherr et al., 2011) as the transcript transcriptome. Further, the assembled transcripts were clustered by Corset v1.05 (Davidson and Oshlack, 2014), and the longest transcript for each cluster was selected as one unigene. All the unigenes were pooled into the unigene transcriptome.

Functional Annotation and Gene Expression Estimation
To predict putative functional roles of unigenes, we annotated the unigenes into different databases, including NR [National Center for Biotechnology Information (NCBI) non-redundant protein sequences], NT (NCBI non-redundant nucleotide sequences), PFAM (Protein family), KOG/COG (EuKaryotic Orthologous Groups/Clusters of Orthologous Groups), SwissProt, KO (KEGG Ortholog) and GO (Gene Ontology).
Expression levels of unigenes in different tissues were calculated by mapping clean reads to the reference transcriptome of unigenes using RSEM v1.2.15 with default parameters (Li and Dewey, 2011). The numbers of read counts for each gene were used to determine the expression levels by FPKM (fragments per kilobase of transcript sequence per millions base pairs sequenced) method (Trapnell et al., 2010). If one gene was identified only from the transcript transcriptome, its expression values in tissues were unavailable because the FPKM values were computed based on the reads mapped to the unigene transcriptome.

Gene Identification and Comparison
For the identification of chemosensory genes in D. abietella, two de novo transcriptomes were used, i.e., the transcript transcriptome and the unigene transcriptome. First, a homologybased search with TBLASTN, implemented in Geneious v10.1.3 1 , was employed to identify genes from the transcriptomes, with queries of chemosensory genes from D. abietella, Galleria mellonella, Ostrinia furnacalis, and Bombyx mori Guo et al., 2017;Liu et al., 2018;Zhao et al., 2019;Jiang et al., 2021;Xing et al., 2021;Yin et al., 2021). In search of each query, we set e-value and maximum blast hits as 1e −5 and 20, respectively. In addition, the identified genes were iteratively blasted against the two stand-alone transcriptomes of D. abietella until no new genes were found.
To compare the numbers of chemosensory genes expressed in antennae of adult Lepidoptera, we summarized the studies on chemosensory genes identified from antennal transcriptomes. Specially, if one gene was identified only from antennal transcriptomes, it was thought to have the expression in the antennae. Alternatively, when one gene was found from the non-antennal tissues, but its transcription in the antennae was evidenced by reverse transcription (RT)-PCR or quantitative real-time PCR (qPCR), this gene was also counted as antennae-expressed genes. Detailed information on sequencing tissues, gene numbers and references were seen in Supplementary Table 1.

Phylogenetic Analysis
In the trees of OBPs and CSPs, the proteins from B. mori, D. abietella and S. litura were selected (Gong et al., 2009;Xiao et al., 2021). The OR tree was inferred with the ORs from D. abietella, Helicoverpa armigera, and O. furnacalis Guo et al., 2020). Of these, 35 HarmORs previously characterized were included (Liu et al., 2013;Jiang et al., 2014;Guo et al., 2020;Yang and Wang, 2021). The IRs from B. mori, D. abietella, H. armigera and O. furnacalis were used to infer the tree Liu et al., 2018;Xing et al., 2021). In the GR tree, we selected the GRs from B. mori, D. abietella and H. armigera, in which 15 representatives in H. armigera (CO 2 , sugar, GR43a-like and some bitter GRs known ligands) were included Guo et al., 2017). In the SNMP tree, considering a relatively small gene repertoire in each species, we used the SNMPs from 14 lepidopteran species and D. melanogaster . For the selection of SNMPs used in Lepidoptera, if possible, all SNMPs from the pyralid and crambid species were included. Multiple alignments of the protein sequences were conducted under the L-INS-i algorithm using MAFFT v7.450 with default parameters (Katoh and Standley, 2013). The aligned sequences were used to construct the maximum-likelihood trees under the Whelan and Goldman (WAG) model, implemented in FastTree v2.1.11 (Price et al., 2010). In the phylogenetic analysis, chemosensory receptors and SNMPs with less than 200 amino acids were discarded, and OBPs and CSPs showing below 100 amino acids were removed. Tree edition and visualization were conducted using FigTree v1.4.4 2 .

Expression Profiling Analysis With PCR Approaches
In the expression profiles of genes, full-length transcripts encoding DabiORs and GRs were selected, including DabiOR1-OR54, Orco, and GR1-GR16. In the CSP, IR, and SNMP gene families, except for a pseudogenized DabiIR2, all the genes were selected. In the OBP gene family, we selected 32 DabiOBPs with specific or high expression in antennae. RT-PCR was employed to determine tissue-and sex-specific expression of genes in sequenced tissues. The reactions were performed on a TAdanced 96 G instrument (Analytik Jena AG, Jena, Germany) with an annealing temperature of 58 • C and 35 cycles. For quality control of cDNA templates, a reference gene, ribosomal protein L10 (DabiRPL10), was used. In the RT-PCR analyses, one of three biological templates used for transcriptome sequencing was first used. If the expression of one gene in tissues was inconsistent with its corresponding FPKM results, at least two biological replicates were conducted. Gene-specific primers were designed using Primer Premier 5 (Supplementary Table 2). RT-PCR raw data of genes including uncropped gel pictures were seen in Supplementary Figure 1.
Based on the phylogeny of ORs, we selected 11 candidate DabiPRs in D. abietella, and further detected their relative expression levels in antennae and other tissues using qPCR. Each reaction contained a total volume of 20 µL mixture, consisting of 2 µL of cDNA, each 0.5 µL of forward and reverse primers (10 µM), and 10 µL of Bestar R SybrGreen qPCR mastermix (DBI R Bioscience, Germany). The reactions were run with an annealing temperature of 58 • C and 40 cycles on a LightCycler R 96 System (Roche Diagnostics). For each gene, three biological pools were conducted with three technical replicates for each pool. Two reference genes, ribosomal protein S4 and L8 (DabiRPS4 and RPL8), were used to compute the relative expression of target genes using the Q-GENE method (Muller et al., 2002;Simon, 2003). The amplification efficiencies of primers for target and reference genes were determined using fivefold dilutions of antennal cDNA. The specificity of the products was evaluated by a melting curve analysis. The primers were designed using Beacon Designer 8.14 (Supplementary Table 2). qPCR raw data of target and reference genes were seen in Supplementary Figure 2 and Supplementary Table 3, including cycle threshold (CT) values, melting curves and primer amplification efficiencies. Data represented mean ± SEM. Statistical significance of gene expression levels among tissues was compared using a oneway analysis of variance (ANOVA), followed by the Fisher's least significant difference (LSD) test, implemented in IBM SPSS Statistics 21.0 (SPSS Inc., Chicago, IL, United States). P < 0.05 was considered as significant difference.

RESULTS AND DISCUSSION
The Transcriptome of Various Tissues in Dioryctria abietella Through the Illumina HiSeq R 2000 sequencing technique, we constructed and sequenced 16 RNA-Seq libraries of female and male moths. This sequencing resulted in the generation of variable raw data, ranging from 52,058,094 reads in male heads to 75,443,066 reads in male proboscises. After cleaning, approximately 99.85% of raw reads (952,658,466) were obtained and set as clean reads, representing 463,121,204 in female tissues and 489,537,262 in males. The size of all the clean reads was summed to 142.90 gigabases (G). Such sequencing and analysis yielded a low error rate (0.03%), high Phred values (Q20 = 98.26% and Q30 = 94.69%) and a moderate GC content (45.24%) ( Table 1). With an emphasis on the antennae responsible for olfaction, here our transcriptome sequencing obtained more raw data from D. abietella antennae (females: 58,464,276 sequences and males: 71,979,886 sequences) compared to previous studies in D. abietella (47,005,976 and 50,818,590 reads for females and males, respectively) (Xing et al., 2021), and other two pyralid moths, G. mellonella (41,665,706 and 37,238,352 reads for females and males, respectively) (Zhao et al., 2019) and Plodia interpunctella (21,937,207 and 22,682,896 reads for females and males, respectively) (Jia et al., 2018). This may explain, to some extent, why D. abietella here possessed a larger number of ORs as members of this gene family were primarily enriched in the antennae (70 ORs in this study, 42 ORs in a previous study from D. abietella, 46 ORs in G. mellonella and 47 ORs in P. interpunctella) (Jia et al., 2018;Zhao et al., 2019;Xing et al., 2021).
All clean reads were subject to sequence assembly using Trinity v2.1.5. As a result, 435,990 transcripts were generated with the majority of below 301 bp. This Corset clustering further led to the yields of 172,189 unigenes, 57.71% of which were categorized into the sizes of 301-1000 bp (Supplementary Figure 3A). Functional annotation of all the unigenes showed that the NCBI NR database had the largest number of genes (41,456) among seven databases, followed by GO (33,051) and PFAM (31,976). As many as 60,504 unigenes were presented in at least one database, whereas a relatively small number of genes (4546) were common to all seven databases (Supplementary Figure 3B). In the analyses of GO functions, 33,501 unigenes were annotated into different categories, representing 25, 20, and 10 functional terms in biological process, cell component and molecular function, respectively. In biological process, cellular process and metabolic process were the most abundant terms. Both cell and cell part accounted for the largest proportion in cell component. GO categories of molecular function were mainly composed of binding and catalytic activities (Supplementary Figure 3C). In search of unigenes against the NCBI NR protein sequence database, D. abietella genes shared the highest homology to those from Amyelois transitella (52.65%) (Supplementary Figure 3D).

Identification of Chemosensory Genes in Dioryctria abietella
We identified Table 4).
In comparison with a previous study on chemosensory genes in this species identified from the antennal transcriptome (15 OBPs, 18 CSPs, 42 ORs, 4 GRs, and 14 IRs) (Xing et al., 2021), our current transcriptome did not retrieve one CSP gene, namely DabiCSP2, but newly identified 140 genes encoding 27 OBPs, 6 CSPs, 33 ORs, 58 GRs, and 16 IRs. Of the identified OBPs and CSPs, we followed the previous nomenclature systems, except for DabiPBP3, GOBP1, and GOBP2 that were designated according to their orthology to OBPs in other lepidopterans. For the chemosensory membrane protein gene families, due to their fragmented sequences and some incorrect genes in the previous study, we merged a number of gene fragments, corrected some genes and re-named them. This re-annotation led to the yields of 42 ORs, 4 GRs, 14 IRs, and 3 SNMPs in the previous study (Xing et al., 2021).
Among these, DabiOR5 and OR25 did not belong to the OR gene family and were identified as two parts of the ortholog of GR1 in the CO 2 subclade, DabiIR19 was a member (DabiiGluR10) of ionotropic glutamate receptors, and DabiSNMP5 was discarded because it did not belong to SNMPs (Supplementary Table 4). Together, except for DabiSNMPs, our current study identified more chemosensory genes in other five gene families, largely attributed to transcriptomewide sequencing of multi-tissues. Meanwhile, this possibly reflected that some genes were expressed in non-antennal tissues, especially the GR gene repertoire.

Antennae-Expressed Chemosensory Genes in Adult Moths
With the emergence of RNA-Seq techniques, a large number of tissues in moths have been sequenced. Currently, we focused on a principle olfactory organ of adults, antenna, and compared the numbers of chemosensory genes expressed in the antennae. By using "antennae" and/or "transcriptome" and/or "chemosensory genes" as keywords to search web of science, we found that antennal tissues of 55 moth species including D. abietella have been sequenced, emphasizing the identification and characterization of chemosensory gene families. Subsequently, we combined antennal transcriptome, RT-PCR and qPCR data to determine the numbers of chemosensory genes in the antennae (Figure 1 and Supplementary Table 1).
Further, we concentrated on gene numbers of three gene families primarily involved in olfaction, i.e., OBPs, ORs, and IRs. Difference in the repertoire sizes of antennae-expressed OBPs, ORs, or IRs were possibly attributed to the following reasons: (1) sequencing depth as discussed above; (2) sequencing sexes such as S. nonagrioides, and (3) sequencing platforms or techniques (Figure 1 and Supplementary Table 1). In addition, it was also possible that a wide diversity of host plants drove the evolution of olfactory genes (Pelosi et al., 2018;Robertson, 2019;Auer et al., 2020;Yin et al., 2021). Although lepidopteran GRs, except for CO 2 -sensing receptors, were mainly expressed in gustatory-related tissues like proboscises and legs, they were indeed present in the antennae from D. abietella and other lepidopterans, including sugar, GR43a-like and some bitter receptors Guo et al., 2017;Li G. C. et al., 2021). This may be correlated with contact chemosensation of antennae, as lepidopteran insects were able to touch or tap host plants using their antennae before feeding nectar or laying eggs (Ozaki et al., 2011;Briscoe et al., 2013;Agnihotri et al., 2016;Xu et al., 2016;Guo et al., 2017).

Phylogenetic and Expression Profiling Analysis of Dioryctria abietella Odorant Binding Proteins
Previously, lepidopteran OBPs were classified into six subfamilies based on sequence homology and conserved cysteines, i.e., pheromone binding proteins/general odorant binding proteins (PBP/GOBPs), antennal binding protein I (ABP I), ABP II, chemical sense-related lipophilic ligand binding proteins (CRLBPs), Minus-C OBPs and Plus-C OBPs (Gong et al., 2009). In the present study, 42 DabiOBPs were distributed in each subfamily, representing each five in PBP/GOBPs and Minus-C OBPs, each seven in ABP IIs and Plus-C OBPs, and each nine in ABP Is and CRLBPs. In most orthologous clades, there were 1:1:1 orthologs among three species. However, two subfamily expansions were observed in some species, including 9 BmorOBPs in Minus-C OBPs (each five in D. abietella and S. litura) and 17 SlitOBPs in ABP Is (8 in B. mori and 9 in D. abietella). In addition, the ortholog of PBP4, which was previously defined as a novel subclass of the Lepidopteraspecific PBP/GOBPs , was not found in D. abietella (Figure 2A).
Expression profiles with RNA-Seq data revealed that the majority of DabiOBPs exhibited tissue-specific expression. Nearly  Table 4), suggesting their olfactory roles involved in the reception of sex pheromones and plant odorants. Some genes (DabiOBP11, OBP15, OBP19, and OBP31) were primarily expressed in gustatory organs such as proboscises and legs (Supplementary Table 4), possibly associated with taste reception as implied in previous studies (Guo et al., 2018;Koutroumpa et al., 2021). Intriguingly, 5 DabiOBPs (DabiOBP9, OBP18, OBP25, OBP36, and OBP37) had obviously high expression in male abdomens (FPKM > 13) but very low levels in females (FPKM < 1) (Supplementary Table 4). As indicated in S. litura, a number of SlitOBPs were abundantly presented in male reproductive tissues . Thus, the 5 DabiOBPs may be candidate reproductive-related genes for modulating female or male reproductive-associated behaviors. On the other hand, some DabiOBPs appeared to have sex-biased expression in the antennae, for example, DabiPBP1 had 10.25-fold higher transcription in males compared to females, DabiOBP19 and OBP31 transcription was female-biased with 17.69-fold and 21.53-fold higher than males, respectively (Supplementary Table 4).
To validate the results of RNA-Seq, we employed RT-PCR to construct the expression profiles of 32 DabiOBPs in 16 tissues of both sexes. Among them, the expression of all the genes appeared to be detectable in the antennae, but 5 genes (DabiOBP18, OBP23, OBP25, OBP36, and OBP37) exhibited extremely weak bands. Interestingly, most of the genes were also transcribed in the proboscises. In addition, a number of DabiOBPs were expressed in non-chemosensory tissues ( Figure 2B). Combining RNA-Seq and RT-PCR results, it was found that DabiOBPs were mainly transcribed in the antennae (female: 36 and male: 34) and proboscises (female: 34 and male: 35). In 14 other tissues, a comparable number of DabiOBP genes were obtained, ranging from 15 in female thoraxes and abdomens to 28 in male legs ( Table 2). These results well evidenced the presence of DabiOBPs in tissues, suggestive of their functional diversification beyond olfaction. The full names of tissues are seen in Table 1.

Phylogenetic and Expression Profiling Analysis of Dioryctria abietella Chemosensory Proteins
With 67 CSP protein sequences from B. mori, D. abietella, and S. litura, we inferred the maximum-likelihood phylogeny of CSPs. A high degree of conservation of CSPs among three species was obtained, with 14 orthologous clades (1:1:1) sharing over 50% average amino acid identities among orthologs ( Figure 3A). Compared with the numbers of CSPs in other lepidopterans annotated in the genomes (Gong et al., 2007;Gouin et al., 2017;Pearce et al., 2017), our current transcriptome may identify most, if not all, of the CSP gene family in D. abietella, coupled with phylogenetic analyses of CSPs in the three moths ( Figure 3A).
Combining FPKM and RT-PCR approaches, the expression profiles of 23 DabiCSPs were determined. In line with those from other moths (Gong et al., 2007;Xiao et al., 2021), DabiCSPs were broadly expressed in various tissues, and none of the tissue-specific genes were found ( Figure 3B). In particular, some of them displayed a very high transcription in each tissue (FPKM > 180), such as DabiCSP5, CSP9, and CSP13. However, DabiCSP11 and CSP21 appeared to be expressed in some tissues at a low level (FPKM < 2.2) (Supplementary Table 4). Together, female and male legs had the largest number of DabiCSPs among tissues with 20 relatives, whereas each 14 genes were presented in heads of both sexes ( Table 2).

Phylogenetic and Expression Profiling Analysis of Dioryctria abietella Odorant Receptors
Using 160 protein sequences from D. abietella, H. armigera, and O. furnacalis, we built the maximum-likelihood tree of ORs, rooted with a highly conserved Orco clade across insects (Vosshall and Hansson, 2011). Seventy-two DabiORs excluding 3 short DabiORs (<200 amino acids, DabiOR68, OR69, and OR74) were phylogenetically split into various small clades. In the tree, 10 DabiORs clustered into a conserved pheromonesensing PR clade as previously defined (Krieger et al., 2004;Zhang and Löfstedt, 2015), and thus were identified as candidate DabiPRs in D. abietella (Figure 4A). In addition, Bastin-Héline et al. (2019) recently described a novel PR lineage, which was not close to this classical PR clade in phylogeny. To check whether DabiORs clustered into this novel clade, we reconstructed the tree of DabiORs, together with H. armigera and S. littoralis PRs, and novel PRs identified in a previous study (Bastin-Héline et al., 2019). As a result, DabiOR4 were presented in this novel clade and shared 39.29% amino acid identity to BmorOR30 in B. mori (Supplementary Figure 4). Therefore, this DabiOR4 gene might be also a PR candidate for the detection of sex pheromones in D. abietella. In our phylogenetic analysis, it was noticed that DabiOR23 was homologous to HarmOR42 in H. armigera, with 71.72% amino acid identities ( Figure 4A). In this conserved monophylogenetic group including HarmOR42, a recent study indicated that most members of this group, at least 12 orthologous ORs functionally studied, could detect phenylacetaldehyde, a floral volatile common to flowering plants  Table 1. (Guo et al., 2020). Therefore, we postulated that DabiOR23 may also recognize this compound facilitating the searching of nectar sources.
In the analysis of expression profiles with FPKM values, we found that except for DabiOR70 with unavailable expression data, the majority of DabiORs (67/74) were detected in female or male  Table 1. antennae (FPKM>1), in agreement with their olfactory roles. Of these, 26 DabiORs had a relatively high transcription in the antennae (FPKM > 10), with DabiOrco showing the highest expression in males (FPKM=250.00) or females (FPKM=171.31). In addition, some genes appeared to have sex-biased expression in the antennae of both sexes. For instance, a candidate PR, DabiOR4, was expressed exclusively in female antennae, while DabiOR11, OR53, OR60 and OR72 expression was specific to male antennae (Supplementary Table 4). Interestingly, a certain number of DabiORs were transcribed in the proboscises (FPKM > 1) (Supplementary Table 4), a principle feeding and taste organ of adult Lepidoptera. As implied in previous studies, the ORs from other lepidopteran species also had proboscisesdetectable expression (Briscoe et al., 2013;Guo et al., 2018;Koutroumpa et al., 2021). Considering the presence of olfactory sensilla on the proboscises such as sensilla styloconica, sensilla basiconica, and sensilla coeloconica (Faucheux, 2013;Guo et al., 2018;, it proposed a possibility that the proboscis may partly bear olfactory roles. We further employed RT-PCR to examine the expression profiles of 55 full-length DabiORs in different tissues of both sexes. The results showed that the expression of almost all the genes was consistent with RNA-Seq data. Except for DabiOR44 presenting extremely low expression only in male legs, the remaining 54 DabiORs were detected in female and/or male antennae. In addition, at least half of the 54 genes showed the expression in adult proboscises, although most of them were detected at a low level (Figure 4B). In comparison with OBPs and CSPs, most members of the OR gene family had more specific expression in the antennae (up to 65 in females and 68 in males). A considerable number of DabiORs were detected in female and male proboscises, representing 29 and 31 genes, respectively. None of DabiORs were found in female wings ( Table 2).

Candidate Pheromone Receptors in Dioryctria abietella
In the phylogenetic analyses, 11 candidate DabiPR genes were found in D. abietella. To determine tissue-and sex-specific expression of these genes, we investigated their expression ratios in antennae and other tissues. qPCR analysis revealed that 9 of them were significantly expressed in the antennae relative to other tissues, in which 4 DabiORs (DabiOR4, OR11, OR53, and OR62) transcription was restricted to the antennae and other 5 genes (DabiOR43, OR60, OR61, OR72, and OR73) were detected in the antennal and non-antennal tissues. Of the 11 DabiORs, 4 genes (DabiOR31, OR61, OR62, and OR73) had almost equal expression between female and male antennae, while the remaining 7 genes showed a significantly sex-biased expression in the antennae. Among these, DabiOR11, OR43, OR53, and OR60 were male-biased genes in the antennae, in which DabiOR11 and OR53 previously described as DabiOR41 and OR19, respectively (Xing et al., 2021), had a similar malebiased expression. In contrast to that, DabiOR4, OR15, and OR72 displayed a female-biased expression (Figure 5). It appeared that there were some consistencies between RNA-Seq and qPCR data, that is, similar sex-biased observations for some candidate PRs FIGURE 5 | Expression profile of candidate DabiPRs in different tissues of D. abietella. Significant difference of OR expression levels among tissues is indicated with different lowercase letters (ANOVA, LSD, P < 0.05). Relative expression of DabiORs was normalized relative to two reference genes, DabiRPS4 and DabiRPL8. The full names of tissues are seen in Table 1. with both FPKM and qPCR results, but some differences were also observed for some genes.
The diverse expression results of DabiPRs in the antennae were different from the PRs in noctuid species, in which the expression of most PR genes was significantly male-biased (Zhang and Löfstedt, 2015;Bastin-Héline et al., 2019). In comparison with other species in the Pyralidae family, 2 candidate GmelPRs from G. mellonella, GmelOR13 and OR50, were evenly expressed in female and male antennae (Zhao et al., 2019;Jiang et al., 2021). In P. interpunctella, 4 candidate PintPRs exhibited different sex-biased expression patterns in the antennae, that is, PintOR5 and OR22 were significantly expressed in male antennae, whereas female-biased presence for PintOR7 and OR30 (Jia et al., 2018). In a closely related family Crambidae, a candidate OscaPR gene from Ostrinia scapulalis, OscaOR7, had roughly equal amounts of mRNA in female and male antennae, but failed to respond to female sex pheromones and pheromone orthologs (Miura et al., 2010). In O. furnacalis, OfurOR1 identified as one of the PR members had no or extremely low expression in the antennae (Yang et al., 2015). Similar results about female-biased or equal expression of PRs in the antennae were also observed in Loxostege sticticalis (Wei et al., 2017), Conogethes punctiferalis (Jia et al., 2016), and Chilo suppressalis (Cao et al., 2014). All together, it was noticed that not all candidate PRs of the conserved PR clade in the Pyralidae and Crambidae families had a significantly high expression in male antennae.

Phylogenetic and Expression Profiling Analysis of Dioryctria abietella Gustatory Receptors
In the phylogenetic inference of GRs, we selected 29 DabiGRs showing more than 200 amino acids to construct the tree, together with 67 B. mori GRs and 15 H. armigera GRs. Following the classification system of GRs in Lepidoptera (Xu, 2020), DabiGRs partitioned into four subfamilies, i.e., CO 2 , GR43a-like, sugar and bitter receptors comprising 2, 5, 7, and 15 relatives, respectively ( Figure 6A). In the two subfamilies of GR43a-like and sugar GRs, the exact numbers of DabiGRs may be larger than we presented here in the tree, as some fragmented proteins (<200 amino acids) in D. abietella were not included. Similar events also occurred in the bitter GRs subfamily. In the CO 2 GR1 subclade, our transcriptome did not retrieve the ortholog of GR1 from D. abietella, but its partial sequences were presented in the previous antennal transcriptome (Xing et al., 2021). Given the high sequence identities of GR1 orthologs in this subclade (Xu and Anderson, 2015;Xu, 2020), we attempted to use all 62 DabiGRs to infer the phylogenetic relationships of GRs, coupled with the 67 and 15 GRs separately in B. mori  and H. armigera (Xu and Anderson, 2015). In agreement with the results of the originated tree, the orthologous gene of GR1 was not found from D. abietella (Supplementary Figure 5). This may be due to its extremely low expression in tissues of D. abietella or incorrect assembly.
Expression profiles revealed that all DabiGRs exhibited relatively low or no expression in the sequenced tissues (0≤FPKM<6). Of notice, 10 DabiGRs were not identified from the unigene transcriptome and their FPKM values in tissues were unavailable. Fourteen out of the remaining 52 genes were considerably transcribed in the proboscises of both sexes, including 2 CO 2 , 1 GR43a-like, 2 sugar, 5 bitter and 4 unclassified GRs (1<FPKM<6) (Supplementary Table 4). The proboscises-expressed DabiGRs were well consistent with their gustatory roles, as implied by the GRs in Spodoptera littoralis (Koutroumpa et al., 2021), H. armigera (Guo et al., 2018), and H. melpomene (Briscoe et al., 2013). Notably, FPKMs of only 3 DabiGRs (DabiGR3, GR12, and GR42) in the antennae were above 1 (Supplementary Table 4), largely supported by the observation that few or no GRs were identified in antennal transcriptomes (Supplementary Table 1). As expected, RT-PCR results further supported low expression of DabiGRs in tissues, as well as their extensive presence in the proboscises (Figure 6B). Using the data of FPKM and RT-PCR, we identified 19 and 15 DabiGRs in female and male proboscises, respectively, but no or a few genes were found in other tissues (0-6 relatives) ( Table 2).

Phylogenetic and Expression Profiling Analysis of Dioryctria abietella Ionotropic Receptors
We constructed the maximum-likelihood tree of IRs from B. mori, D. abietella, H. armigera, and O. furnacalis, using DabiiGluRs as the outgroup. Phylogenetic analysis allowed us to distinguish the IRs into 29 IR orthologous clades, including one pseudogene group (IR2). Except for the IR31a clade, the remaining clades comprised at least one member of DabiIRs. Among the four moth species, most of the IR clades shared a one-to-one orthologous relationship. In particular, each of two clades possessed two gene copies, i.e., IR75q.1 and IR75p/p.2 ( Figure 7A). Previously, IR31a genes were lost in several other species, including the Pyralidae family of A. transitella and P. interpunctella, the Cosmopterigidae family of Hyposmocoma kahamanoa and the Plutellidae family of Plutella xylostella (Yin et al., 2021). Therefore, it was possible that D. abietella had lost the ortholog of IR31a during the process of evolution.
In comparison, this divergent IRs (D-IRs) subfamily, especially the IR100 clade, had variable gene numbers ranging from 1 in D. abietella to 9 in O. furnacalis ( Figure 7A). As implied in the previous studies, D-IRs members mainly contributed to the difference of IR numbers across the Lepidoptera, and moreover most of them were found to be lowly expressed genes in tissues. In contrast to this subfamily, antennal IRs (A-IRs) and Lepidoptera-specific IRs (LS-IRs) generally had high expression in at least one tissue, and high conservation with single-copy genes among different species Zhu et al., 2018;Yin et al., 2021). Accordingly, our study may identify almost all the members of A-IRs and LS-IRs, but missed some D-IRs in D. abietella.
Based on RNA-Seq results, we found that almost all the A-IRs were expressed in the antennae at a considerable level (FPKM > 1), with the exception of DabiIR40a that appeared to be predominant in the antennae but remarkably low levels (females: 0.79 and males: 0.77). Three IR coreceptors, DabiIR8a, IR25a and IR76b, displayed higher expression in the antennae relative to other tissues. Apart from that, their expression was comparable in the proboscises (FPKM > 1) with that in non-antennal tissues, suggesting that they may be associated with taste function. The common presence of IR coreceptors in the antennae and proboscises was also observed in other A-IRs genes, such as DabiIR21a, IR41a, IR64a, IR75p.1, etc. (Supplementary Table 4), suggesting their dual functional roles in response to odorants and tastants. This expression Fifteen HarmGRs were included, representing 3 CO 2 , 1 GR43a-like and 8 sugar GRs, as well as 3 bitter GRs functionally characterized. (B) Expression profiles of candidate DabiGRs in different tissues of both sexes from D. abietella. A reference gene, DabiPRL10, was used to check the quality and quantity of cDNA templates. NC, negative control using sterile water as the template. The full names of tissues are seen in Table 1. feature of A-IRs presented in olfactory and gustatory tissues was observed in S. litura , S. littoralis (Koutroumpa et al., 2021), Papilio xuthus (Yin et al., 2021), and H. armigera . In addition, FPKM values of four LS-IRs in the antennae including a pseudogene of DabiIR2 were above 1, but all of the D-IRs subfamily exhibited extremely low  (Ofur, green). According to sequence homology of IRs, members of the IR gene family were grouped into various orthologous clades, with three conserved IR coreceptors (IR8a, IR25a, and IR76b). The tree was rooted with D. abietella iGluRs. (B) Expression profiles of candidate DabiIRs in different tissues of both sexes from D. abietella. The quality and quantity of cDNA templates were measured by a reference gene, DabiRPL10. NC, negative control using sterile water as the template. The full names of tissues are listed in Table 1. expression levels (FPKM < 1) in the antennae and other tissues (Supplementary Table 4).
Using the RT-PCR method, we further detected the expression profiling map of 31 DabiIRs in various tissues, except for DabiIR2. As a result, 29 out of 31 genes were expressed in at least one tissue, with the exception of DabiIR100a and IR143 whose expression was not detected in all tested tissues. We tried to change the primers and annealing temperatures, but the products of DabiIR100a and IR143 failed to be detected. This may reflect extremely low expression of the two genes in tissues. In line with RNA-Seq results, most of DabiIRs were transcribed in the antennae and proboscises. In particular, 2 D-IRs genes also showed a weak expression in the antennae, i.e., DabiIR7d.1 and IR7d.2 (Figure 7B), consistent with that of their orthologs from other moths Yin et al., 2021). Some genes exhibited tissue-specific expression, for example, DabiIR1.1, IR1.2, IR40a, IR60a, IR75q.1, IR75q.1.1, and IR87a in the antennae, and DabiIR85a and IR100d in the proboscises ( Figure 7B). All together, the antennae harbored the largest numbers of DabiIRs showing 24 in females and 18 in males, followed by the proboscises with 12 in females and 9 in males ( Table 2).

Phylogenetic and Expression Profiling Analysis of Dioryctria abietella Sensory Neuron Membrane Proteins
Prior to this study, virtually all lepidopteran species with available genomic sequences possess 3 SNMPs, phylogenetically partitioned into three groups with single-copy genes in each  Table 1. group . More recently, the SNMP gene repertoire in A. yunnanensis was composed of 4 members, representing 1 AyunSNMP1, 2 AyunSNMP2s (AyunSNMP2.1 and SNMP2.2) and 1 AyunSNMP3 . In agreement with most lepidopteran species, here our transcriptome contained 3 SNMP orthologs that were split into three different clades (Figure 8A).
In the expression profiles of DabiSNMPs, DabiSNMP1 was enriched in the antennae of both sexes, while DabiSNMP2 was broadly presented in all tissues with particularly high expression levels in the antennae, proboscises and wings of female and male moths (FPKM > 500). In contrast to that of 2 DabiSNMPs, DabiSNMP3 expression was not detected in chemosensory-related tissues like antennae, proboscises and legs, but presented primarily in adult abdomens and slightly in adult thoraxes (Supplementary Table 4). RT-PCR results further validated RNA-Seq data of 3 DabiSNMPs. Moreover, DabiSNMP1 appeared to have a higher transcription in male antennae compared to females (Figure 8B), suggesting its putative olfactory roles in the sensing of female sex pheromones in D. abietella, coupled with SNMP1 functions in previous studies (Pregitzer et al., 2014;Liu S. et al., 2020). For the expression of DabiSNMP3 in female and male abdomens (Figure 8B), we suggested that its transcripts may be restricted to midguts of both sexes, as its orthologous genes have been demonstrated to be midgut-specific relatives in B. mori and H. armigera Xu et al., 2021). Taken together, each tissue here studied included at least one SNMP transcript. In particular, 2 SNMP genes were found to have the expression in the antennae and proboscises (DabiSNMP1 and SNMP2), as well as thoraxes and abdomens (DabiSNMP2 and SNMP3) ( Table 2).

CONCLUSION
In this study, we have characterized six chemosensory-related gene families from a destructive forest pest, D. abietella. First, a mixed transcriptome of 16 tissues, including key chemosensory organs of antennae, proboscises and legs, is sequenced and assembled, resulting in the generation of 952,658,466 clean reads and the accumulation of 142.90 G of data. Further, our transcriptome combined with RT-PCR and qPCR analyses identifies 235 genes involved in chemosensation (42 OBPs, 23 CSPs, 75 ORs, 62 GRs, 30 IRs, and 3 SNMPs, with an increase of 140 genes), 160 of which are expressed in the antennae (37 OBPs, 20 CSPs, 70 ORs, 6 GRs, 24 IRs, and 3 SNMPs), 137 in the proboscises (36 OBPs, 19 CSPs, 46 ORs, 21 GRs, 13 IRs, and 2 SNMPs), and 67 in the legs (32 OBPs, 21 CSPs, 4 ORs, 3 GRs, 6 IRs, and 1 SNMP). Phylogenetic analysis combined with sequence homology allows us to identify 11 candidate DabiPR genes, in which 4 relatives are significantly enriched in male antennae compared to females, and thus are regarded as candidates for the sensation of female sex pheromones in D. abietella. Together, our current work has identified potential molecular targets for sensing sex pheromones and plant-derived compounds that are applied to develop environmentally friendly alternatives for the control of this forest pest.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI SRA BioProject, accession no: PRJNA751144.

AUTHOR CONTRIBUTIONS
N-YL conceived this study and drafted the first manuscript. Z-QW, CW, G-CL, and S-MN performed the laboratory experiments. N-YL, Z-QW, and N-NY analyzed the data. N-YL and Z-QW revised the manuscript. All the authors contributed to the article and approved the submitted version.