The Chemosensory Transcriptome of a Diving Beetle

Insects astoundingly dominate Earth’s land ecosystems and have a huge impact on human life. Almost every aspect of their life relies upon their highly efficient and adaptable chemosensory system. In the air, most chemical signals that are detected at long range are hydrophobic molecules, which insects detect using proteins encoded by multigenic families that emerged following land colonization by insect ancestors, namely the odorant-binding proteins (OBPs) and the odorant receptors (ORs). However, land-to-freshwater transitions occurred in many lineages within the insect tree of life. Whether chemosensory gene repertoires of aquatic insects remained essentially unchanged or underwent more or less drastic modifications to cope with physico-chemical constraints associated with life underwater remains virtually unknown. To address this issue, we sequenced and analyzed the transcriptome of chemosensory organs of the diving beetle Rhantus suturalis (Coleoptera, Dytiscidae). A reference transcriptome was assembled de novo using reads from five RNA-seq libraries (male and female antennae, male and female palps, and wing muscle). It contained 47,570 non-redundant unigenes encoding proteins of more than 50 amino acids. Within this reference transcriptome, we annotated sequences coding 53 OBPs, 48 ORs, 73 gustatory receptors (GRs), and 53 ionotropic receptors (IRs). Phylogenetic analyses notably revealed a large OBP gene expansion (35 paralogs in R. suturalis) as well as a more modest OR gene expansion (9 paralogs in R. suturalis) that may be specific to diving beetles. Interestingly, these duplicated genes tend to be expressed in palps rather than in antennae, suggesting a possible adaptation with respect to the land-to-water transition. This work provides a strong basis for further evolutionary and functional studies that will elucidate how insect chemosensory systems adapted to life underwater.


INTRODUCTION
Chemical senses are at the crossroad between an animal and its environment, and thus play a key role in species adaptation (Yohe and Brand, 2018). Over hundreds of millions of years, insects have remained tremendously diverse and ecologically successful in every kind of continental ecosystem on Earth. This owes a great deal to the characteristics and evolvability of their chemosensory system, finely tuned to detect chemical cues in an aerial environment. Although the vast majority of insect species are terrestrial, there are also many insects that live in freshwater habitats, where they are abundant and diversified worldwide (Dijkstra et al., 2014). All are members of lineages derived from terrestrial ancestors, which have secondarily adapted to life in water. This raises the question of how the fundamentally aerial chemosensory equipment of insects has been remodeled in these lineages, to cope with the drastically different physico-chemical constraints associated with an aquatic environment. Indeed, the compounds most readily displaced over long distances in water are predominantly hydrophilic whereas those displaced in the air are mainly hydrophobic (Mollo et al., 2014). Until recently, it was thought that aquatic animals were only able to detect hydrophilic molecules. However, hydrophobic compounds are also prevalent in aquatic ecosystems, and it has recently been shown that marine shrimps and freshwater fish can perceive hydrophobic molecules, either by contact or at short-range (Giordano et al., 2017).
In aerial insects, olfactory sensory neurons-involved in longrange chemodetection-are mainly localized on antennae but can also be found on other body parts, including maxillary and labial palps, depending on the taxa (Hansson and Stensmyr, 2011). Gustatory sensory neurons-involved in short-range or contact chemodetection-are found on palps and legs, as well as antennae, wings and ovipositors (Montell, 2009). Detection of odorants and tastants by these peripheral neurons is mediated by different families of chemoreceptor proteins localized in their dendritic membrane. Among these, an insect-specific family of chemoreceptors called odorant receptors (ORs) arose after waterto-land transition (ca. 410 My ago), in a common ancestor of Ectognatha . ORs are expressed in olfactory neurons and bind airborne hydrophobic molecules (e.g., terpenoids, benzenoids, fatty acid derivatives, . . .). They are transmembrane proteins associated in heteromers with a unique co-receptor named Orco, forming a non-selective cation channel that opens upon ligand binding (Wicher and Miazzi, 2021). In addition to ORs, two major chemoreceptor families have been characterized through genomic and functional studies in aerial insects. The gustatory receptors (GRs) belong to the same superfamily as ORs, though they are not specific to insects (Eyun et al., 2017;Robertson, 2019). They are expressed in gustatory sensory neurons and some have been shown to bind CO 2 , sugars or bitter molecules (Isono and Morita, 2010;Robertson, 2019;Xu, 2020). The third family of chemosensory receptors in insects are the ionotropic receptors (IRs), also found in other protostomes (Croset et al., 2010). To date, the function of IRs has been studied mainly in Drosophila, where they are expressed in olfactory, gustatory and other sensory neurons and form complexes with several co-receptors (Silbering et al., 2011;Sanchez-Alcaniz et al., 2018). Antennal IRs are primarily responsible for the detection of volatile hydrophilic molecules such as acids and amines, but some are also involved in temperature or humidity sensing (van Giesen and Garrity, 2017;Rimal and Lee, 2018). Outside of Drosophila, involvement of antennal IRs in odorant detection has also been demonstrated in Hymenoptera and Lepidoptera (Shan et al., 2019;Zhang et al., 2019Zhang et al., , 2021. Whatever the family considered, the size of chemoreceptor repertoires varies widely among insects, from ten to several hundreds of genes per species. This variability reflects their rapid evolution following a birthand-death process, characterized by numerous gene duplications and losses (Robertson, 2019).
In addition to receptors, other chemosensory gene families are involved in the detection of chemical cues in insects. This is notably the case of odorant-binding proteins (OBPs), secreted in high concentration in the lymph of olfactory sensilla, where they solubilize hydrophobic molecules to facilitate their transport to the chemoreceptors. These proteins thus contribute to the sensitivity of the olfactory system in aerial insects (Brito et al., 2016;Rihani et al., 2021). In addition to OBPs, which are insect-specific, other soluble proteins such as chemosensory proteins (CSPs) and Niemann-Pick C2 (NPC2) proteins are suspected to play a role in chemical senses (Pelosi et al., 2014). The lymph of olfactory sensilla also contains large amounts of odorant-degrading enzymes, belonging to various enzymatic families, important for rapid signal termination following receptor activation (Leal, 2013;Chertemps and Maïbèche, 2021).
The evolution of chemosensory genes has likely played a major role in adaptation of insects to freshwater habitats. However, we know virtually nothing about chemosensory genes in aquatic insects besides pioneering works in mosquito larvae (Xia et al., 2008;Liu et al., 2010;Ruel et al., 2018). In this study, we aimed to gain insights into evolutionary changes that affected chemosensory gene repertoires after land-to-water transition, through transcriptome analysis of chemosensory organs of a diving beetle (Coleoptera, Dytiscidae). Among the many Coleoptera lineages in which such transitions occurred (Short, 2018), Dytiscidae exhibit the highest degree of adaptation to an aquatic life. Adult diving beetles are capable of flying from one water body to another, but they spend most of their life in water and notably feed and reproduce only underwater. They are predaceous animals, like most other members of the Coleoptera suborder Adephaga, and chemical senses may play a prominent role in prey detection (Culler et al., 2014). Morphological studies have indicated that porous sensilla at the surface of antennae and palps of Dytiscidae differ from those of their closest terrestrial cousins, the Carabidae (Baker, 2001). In diving beetles, there is experimental evidence suggesting that the antennae play a role in chemoreception both underwater and in the air, whereas the palps detect chemical stimuli in the liquid medium only (Hodgson, 1953). Furthermore, active movements of the maxillary palps occur at rest in response to exposure to food odors, during subsequent swimming when attempting to locate preys, and upon feeding (M. Manuel, personal observations). However, very little is known concerning the physiology of these chemosensory structures in diving beetles.
Here we have sequenced, assembled and analyzed the transcriptomes of antennae and palps from adult males and females of the diving beetle species Rhantus suturalis Macleay, 1825. This species is of moderate size (10.5-12.5 mm), is common and widespread from western Europe and North Africa through Asia to northern Australia, and lives in a wide variety of freshwater lentic habitats, with a preference for more or less temporary ponds in open environments. Species of the genus Rhantus are prominent predators of mosquito larvae (Culler et al., 2014). Furthermore, it was recently demonstrated that R. suturalis males are attracted underwater by a sex pheromone of unknown composition emitted by females (Herbst et al., 2011). The R. suturalis transcriptome was used to annotate genes belonging to major families of soluble proteins and transmembrane receptors responsible for semiochemical detection in insects, and to estimate their expression levels in antennae and palps. In parallel to this candidate gene approach, we also searched genes specifically expressed in these chemosensory tissues relative to wing muscle, and genes specifically expressed in one sex vs. the other. By doing so, our main goal was to identify features of the diving beetle chemosensory gene toolkit that may reflect specificities associated with an aquatic life, notably expansions or contractions of gene repertoires and unusual expression patterns in antennae and palps when compared to terrestrial insects.

Sample Preparation and RNA Extraction
Adult males and females of R. suturalis were collected in several ponds in the Fontainebleau forest (Bois-le-Roi, ca. 48 • 28 N 2 • 39 E, France) and kept alive in the laboratory in small water tanks. Ablations of antennae and palps were performed under a Leica MZ16 stereomicroscope, with the specimen placed upside down and immobilized in a custom device. Appendages were removed by grasping with forceps at the most basal article (antennomere or palpomere I). Samples and numbers of individuals used for each sample (in brackets) were as follows: female antennae (50), male antennae (45), female palps (maxillary and labial palps mixed in the same sample, 79), male palps (69) and wing muscle (one male individual), leading to a total of five samples. Each specimen was killed in liquid nitrogen immediately after appendage ablations. Each appendage was rinsed in RNAse free water, then immediately transferred into a 2 ml tube on ice containing 500 µl TRIzol TM Reagent (Thermo Fisher Scientific, Waltham, MA) and a mixture of micro-ceramic beads. Tubes were frozen during at least one night at -20 • C.
For RNA extraction, 300 µl of TRIzol TM Reagent was added to the tubes prior to grinding, lysis and homogenization (3 × 30 s, then 5 min on ice, then 3 × 30 s all cycles at 5,000 rpm) in a Minilys homogenizer (Bertin technologies SAS, Montigny Le Bretonneux, France). The liquid phase was then separated from the beads and total RNA was extracted using the phenol/chloroform method as described in the TRIzol TM Reagent user guide. Total RNA was treated with TURBO DNase (Thermo Fischer Scientific) according to manufacturer's instructions, then purified and concentrated using the RNeasy R MinElute TM Cleanup Kit (Qiagen, Hilden, Germany). RNA quality and quantity were measured using a ND-1,000 NanoDrop spectrophotometer (Thermo Fisher Scientific). All RNA samples were also analyzed on a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA) to determine their RNA Integrity Numbers (RIN). All measured RIN were > 7.

RNA Sequencing and de novo
Transcriptome Assembly cDNA library construction and sequencing were carried out at the high-throughput sequencing facility of the Institute for Integrative Biology of the Cell (Gif-sur-Yvette, France). Libraries have been prepared using the Illumina TruSeq mRNA Stranded kit, with minor modifications allowing to obtain long cDNA inserts. Sequencing (150-bp paired-end reads) was performed using a NextSeq500 instrument. Data processing and analysis (summarized in Supplementary Figure 1) was performed on the Galaxy server hosted by the BioInformatics Platform for Agro-ecosystems Arthropods (Rennes, France). Quality check was done with FastQC 0.69 and reads were trimmed with Trimmomatic 0.32 (Bolger et al., 2014). Parameters were as follow: Sliding window = 4 bases; average quality = 20; minlen = 30 bases; headcrop = 10 bases; trailing min. quality = 20. Clean reads from the five samples were assembled using Trinity 2.4 (Grabherr et al., 2011), with the following parameters: min contig length = 200, min count for k-mers to be assembled = 1. Coding sequences were extracted from the reference transcriptome using Transdecoder 3.0 (Haas et al., 2013) with a minimum protein length of 50 amino acids. Redundant sequences were then clustered using CD-HIT-EST 1.3 (Fu et al., 2012) with a similarity threshold of 0.9 and a word size of 8.

Transcriptome Analysis
Transcriptome quality was estimated using BUSCO 3.0 (Simao et al., 2015) with the insecta_odb9 dataset. For transcript annotation, coding sequences were first translated with Transeq 5.0 (Rice et al., 2000). Then, an alignment search was performed using DIAMOND (Buchfink et al., 2015) on the NCBI nonredundant (nr) protein sequence database (access March 2020) with the "more sensitive" mode, a BLOSUM62 scoring matrix and a maximum e-value of 1e-05. In parallel with the alignment search strategy, a protein domain analysis was performed with hmmscan 3.2 (Finn et al., 2011) using the Pfam-A hidden Markov model database (access May 2020; El-Gebali et al., 2019) and default parameters.
To measure expression levels, reads generated for each of the five libraries were aligned on the reference transcriptome with HISAT 2.1.0 (Kim et al., 2015). Transcript abundance was then measured in each sample as FPKM (fragments per kilobase of exon per million fragments mapped) with the Cufflinks 2.2.1 suite (Trapnell et al., 2012), with default parameters. Heatmaps were built using log 2 (FPKM + 1) values. Transcripts specifically expressed in chemosensory organs (antennae or palps) vs. wing muscle and transcripts specifically expressed in chemosensory organs of one sex vs. the other were identified using the Cuffdiff tool, with a False Discovery Rate of 0.05. Amino acid sequences translated from these transcripts were used as queries to search the UniProt database (access May 2020) using BLASTp (Johnson et al., 2008), with a e-value cutoff of 1e-03.
For the comparison of chemosensory gene expression bias in antennae and palps of R. suturalis and Tribolium castaneum, data used for T. castaneum were RPKM values provided in previously published transcriptomic analyses (Dippel et al., 2014(Dippel et al., , 2016. For each family (OBP, OR, GR, IR), orthologous relationships between R. suturalis and T. castaneum genes were determined using neighbor-joining phylogenies built with Seaview 4.7 (Gouy et al., 2010). Genes with FPKM or RPKM values < 1 in both tissues and genes with no orthologous relationship identified were discarded.

Annotation of Chemosensory Genes
For each gene family to be annotated, datasets containing amino acid sequences manually annotated in other coleopteran species were first created. The OR dataset consisted in sequences previously annotated (Mitchell et al., 2020) in the genomes of Calosoma scrutator (Carabidae), Nicrophorus vespilloides (Silphidae), Agrilus planipennis (Buprestidae), Anoplophora glabripennis (Cerambycidae), and Dendroctonus ponderosae (Curculionidae). The GR, IR, OBP, and CSP datasets contained sequences from A. glabripennis (McKenna et al., 2016), A. planipennis, and D. ponderosae (Andersson et al., 2019). The NPC2 dataset contained sequences from T. castaneum (Pelosi et al., 2014). These amino acid sequences were used as queries to search the R. suturalis reference transcriptome using tBLASTn 2.5 (Cock et al., 2015) with an e-value cutoff set at 1e-10. In parallel, results of the hmmscan analysis were mined for the following domains: pfam03392, OS-D Insect pheromone-binding family (CSPs); pfam01395, PBP/GOBP family (OBPs); pfam02949, 7tm_6 Odorant receptor (ORs); pfam08395, 7tm_7 Chemosensory receptor (GRs); pfam00060, Ligand-gated ion channel (IRs). To verify annotations and eliminate false positive hits, amino acid translations of the unigenes were searched against the NCBI nr database using BLASTp (Johnson et al., 2008). In some cases, redundant unigenes encoding the same protein but not clustered by CD-HIT-EST were manually clustered to rebuild a longer sequence. The presence of signal peptides within sequences of soluble protein precursors was predicted with SignalP 4.1 (Nielsen, 2017) and the presence of transmembrane domains within sequences of candidate chemoreceptors was predicted with TMHMM 2.0 (Krogh et al., 2001).

Phylogenetic Analyses
To rebuild the phylogenies of the CSP, OBP, OR, GR, and IR families, amino acid sequences from R. suturalis were aligned with the coleopteran sequences described above. Datasets were purged of predicted pseudogenes and isoforms resulting from alternative splicing. CSP, OBP, and IR alignments were performed with MAFFT 7 (Katoh et al., 2019) and OR and GR alignments were performed with Muscle (Edgar, 2004) as implemented in Seaview 4.7 (Gouy et al., 2010), then manually curated. The best model of amino acid substitution was determined by SMS (Lefort et al., 2017) and trees were calculated with the maximumlikelihood method using PhyML 3.0 , with a mix of SPR and NNI algorithms. Node support was assessed with the SH-like approximate likelihood-ratio test (aLRT) as implemented in PhyML (Anisimova and Gascuel, 2006).

Quantitative Real-Time PCR
Relative gene expression levels were estimated by quantitative real-time PCR (qPCR) for 12 selected genes (3 genes for each of the OBP, OR, IR, and GR families) showing contrasted expression profiles in RNAseq results, in order to validate gene expression data derived from RNAseq. qPCR estimates were performed on RNA preparations independent from those used for RNAseq, and with maxillary and labial palps treated as two distinct samples. Twenty-five R. suturalis adults (11 males and 14 females) were collected in a pond in Rue (Somme, France) in October 2020 and kept alive in the lab for a few weeks before sample preparation. Dissections and RNA extractions were performed as described above, except that maxillary and labial palps were separated and that male and female tissues were mixed. First-strand cDNAs were synthesized using 500 ng of total RNA, oligo-dT primer and SuperScript TM II Reverse Transcriptase (Thermo Fischer Scientific). The pooled cDNA used to build standard curves for calculating PCR efficiencies was synthesized from a mix of 200 ng of RNA from antennae, 100 ng of RNA from each palp and 100 ng of RNA from wing muscle. Specific primer pairs were designed with OligoAnalyzer TM (Integrated DNA Technologies, Coralville, IA) to amplify fragments between 150 and 230 bp (primer sequences are given in Supplementary Table 3). These genes were selected based on the RNAseq data in order to have-for each family-one gene expressed in both antennae and palps, one antenna-biased gene and one palp-biased gene. The gene encoding the ribosomal protein RsutRPL13 was used as the reference gene.
qPCR assays were performed in technical duplicates on the pooled cDNA dilutions (undiluted, 1/5, 1/25, 1/50, 1/100, 1/200, 1/400), and in technical triplicates on the four cDNA samples (antennae, maxillary palps, labial palps, wing muscle). The qPCR mix contained 2 µl of cDNA (or water for the negative controls), 8 µl of SsoAdvanced TM universal SYBR R Green Supermix (Bio-Rad, Hercules, CA), and 500 nM of both gene-specific primers, in a final volume of 10 µl. PCR reactions were run in 96-well plates, in a CFX96 Touch TM Real-Time PCR detection system (Bio-Rad) with the following thermal cycling conditions: 98 • C for 4 min; 40 amplification cycles at 98 • C for 15 s, 60 • C for 1 min. After amplification, melt curve analyses were performed by gradual heating from 65 to 95 • C at 0.5 • C.s −1 . Only one peak was detected for each sample. The slopes of the standard curves were calculated and the amplification efficiency was estimated as E = (10 −1/slope ). Mean normalized expression of the target genes were calculated with Q-Gene (Simon, 2003).

The Rhantus suturalis Reference Transcriptome
Illumina sequencing of the five libraries generated a total of 526 million pairs of raw reads. After trimming, we finally obtained 72,793,754 pairs of clean reads for female antennae, 104,087,147 FIGURE 1 | Maximum-likelihood phylogeny of coleopteran OBPs. The tree was built from an alignment of amino acid sequences of R. suturalis OBPs (RsutOBP, available in Supplementary Table 1) with sequences annotated from the genomes of three coleopteran species belonging to distinct superfamilies. The Plus-C, Minus-C and Antennal Binding Protein II (ABPII) clades were defined as in Andersson et al. (2019). Black dots indicate deep nodes highly supported by the likelihood-ratio test (aLRT > 0.9), with the corresponding value. The tree was rooted using the Plus-C clade as an outgroup. The scale bar indicates the expected number of amino acid substitutions per site.
Frontiers in Ecology and Evolution | www.frontiersin.org pairs for female maxillary and labial palps, 85,494,452 pairs for male antennae, 94,355,486 pairs for male palps and 86,825,522 pairs for the male wing muscle sample. These clean reads were pooled and assembled into ∼300,000 contigs, which led to a reference transcriptome of 47,570 unigenes encoding proteins of more than 50 amino acids (Supplementary Figure 1). The BUSCO analysis revealed a good completeness of this reference transcriptome (only 4.1% of missing genes) as well as a very low level of redundancy (2.7% of duplicated sequences). We found corresponding hits in the NCBI nr protein sequence database for 26,403 of the 47,750 unigenes, and identified conserved protein domains (from the Pfam database) for 27,553 unigenes.

Soluble Proteins Potentially Involved in Chemical Sensing
We annotated 53 transcripts encoding candidate OBPs (including 17 full-length coding sequences) in the reference transcriptome (Supplementary Table 1). Among these 53 RsutOBPs, we identified two members of the Plus-C clade (RsutOBP1-2), an OBP sub-family characterized by a number of cysteine residues above six (the number usually observed in OBPs), eight members of the Antennal Binding Protein II clade (RsutOBP45-52) and a single member of the Minus-C clade (RsutOBP53), characterized by the presence of only four cysteine residues instead of six (Figure 1). The other RsutOBPs clustered in clades generally referred to as "classical OBPs" (Dippel et al., 2014;Andersson et al., 2019). Among them, we identified a remarkable lineagespecific expansion, with RsutOBP5-39 (i.e., 35 out of the 53 OBPs) belonging to a single clade, which contained only two OBPs from D. ponderosae and one OBP from both A. planipennis and A. glabripennis.
Although the lack of biological replicates did not allow to unambiguously demonstrate differential gene expressions between tissues, we used FPKM values to estimate transcript abundance in each tissue. Moreover, quantitative real-time PCR on selected genes exhibiting contrasted expression patterns further confirmed the results obtained by RNAseq (Supplementary Figure 2). Reliable expression in chemosensory organs (i.e., antennae and palps) was observed for 48 of the 53 RsutOBP unigenes annotated, with no visible sexual dimorphism (Figure 2 and Supplementary Table 1). Among these, we found 17 RsutOBPs exhibiting similar FPKM values in antennae and palps. This was notably the case for members of the Plus-C (RsutOBP1 and 2) and Minus-C (RsutOBP53) clades. Eight RsutOBPs appeared more expressed in antennae than palps (ratio of FPKM values above 4), all but one (RsutOBP45-52) belonging to the ABPII clade. Finally, 23 RsutOBPs seemed more expressed in palps than in antennae and most of these OBPs belong to the large RsutOBP expansion (RsutOBP5-39).
In addition to OBPs, we annotated four R. suturalis transcripts encoding CSPs and three transcripts encoding NPC2 proteins (Supplementary Table 1). RsutCSPs belonged to conserved lineages in the coleopteran CSP phylogeny (Supplementary  Figure 3). Among the four candidate CSPs, two were highly expressed in chemosensory organs and therefore likely to be involved in chemical sensing: RsutCSP3 was expressed in  Supplementary Table 1. Transcripts were classified in three categories, from top to bottom: equally expressed in antennae and palps; more expressed in antennae; more expressed in palps. A transcript was classified as more expressed in a specific tissue when the ratio of FPKM values was at least four-fold. Transcripts with FPKM values lower than 1 in all tissues are not shown.
antennae of both sexes whereas RsutCSP4 exhibited extremely high expression levels in both antennae and palps, as well as faint expression in muscle. None of the NPC2 proteins identified was specific to chemosensory organs, with RsutNCP2_2 and 3 being moderately expressed in all tissues investigated (Supplementary Table 1).

Candidate Chemosensory Receptors
Altogether, we annotated transcripts encoding 48 ORs (30 fulllength), 73 GRs (47 full-length) and 53 IRs (24 full-length) in the reference transcriptome (Supplementary Table 1). The R. suturalis ORs were distributed among six of the eight clades of the coleopteran OR phylogeny (Figure 3). Within each clade, RsutORs clustered with ORs annotated in the ground beetle C. scrutator (Carabidae), also belonging to the suborder Adephaga. We found no RsutOR belonging to clades 5 and 6, but RsutOR9 and four C. scrutator ORs formed a clade that could be specific to Adephaga, with no Polyphaga representative. As ORs are the only chemosensory genes that have been annotated in C. scrutator, the OR tree is the only one in which duplications more recent than the divergence of the dytiscid lineage with respect to that of the terrestrial carabids can be identified. In addition to a few instances of such duplications that involve only two or three R. suturalis genes (e.g., RsutOR1-3 and RsutOR6 and 7, within clade 1), a significant OR expansion was apparent within clade 3, which gave rise to 9 R. suturalis paralogs (RsutOR26-34). As expected, the OR family member with the highest expression level was the obligate co-receptor RsutOrco, expressed at high levels in both antennae and palps (Figure 4 and Supplementary Table 1). Most RsutORs (31 out of 48) were found reliably expressed only in male and female antennae, some with high expression levels. However, eight RsutORs appeared expressed in palps and not in antennae. All but one (RsutOR1) belong to the RsutOR expansion within clade 3 (RsutOR26-34). Interestingly, the two palpal RsutOR genes studied by qRT-PCR were found expressed in maxillary palps but not labial palps (Supplementary Figure 2).
In the GR phylogeny, RsutGR1-3 belong to the candidate CO 2 receptor clade, RsutGR4 and 5 to the candidate sugar receptor clade, and RsutGR10 to the candidate fructose receptor clade (Figure 5). The vast majority of GRs identified in R. suturalis (RsutGR15-70) belong to a single clade of the GR phylogeny without reported putative function. This clade contained only a few representatives from Polyphaga species and several occurrences of massive expansions of R. suturalis GRs. The vast majority of RsutGRs were either found expressed only in palps or more expressed in palps than in antennae (Figure 4 and Supplementary Table 1). They generally exhibited low FPKM values, with the exception of the candidate CO 2 receptors RsutGR1-3 and a few others (RsutGR9,36,37,72,and 73). The GR gene with the highest FPKM value-RsutGR9-was found specifically expressed in labial palps by qRT-PCR (Supplementary Figure 2).
Based on the IR phylogeny (Figure 6), we identified unigenes encoding each of the eight following "antennal" IRs, whose putative functions have been assigned based on previous work on the fruit fly (van Giesen and Garrity, 2017;Rimal and Lee, 2018): the four IR co-receptors IR25a, 8a, 93a, and 76b, the two humidity-sensing IR40a and 68a, the temperature-sensing IR21a and the olfactory IR41a. We also identified six RsutIRs within the IR75 clade (RsutIR75a-f), supposedly involved in the detection of water-soluble semiochemicals carrying a carboxylic acid or an amine function (Silbering et al., 2011;Prieto-Godino et al., 2017). The other 39 IRs found in the R. suturalis transcriptome belong to the divergent IR clade, involved in taste in Drosophila (Koh et al., 2014;Sanchez-Alcaniz et al., 2018). Several independent RsutIR expansions were found within this clade, including two large groups of 12 and 13 R. suturalis paralogs, respectively. RsutIR25a (candidate universal IR coreceptor) was the most highly expressed coreceptor, in both antennae and palps (Figure 4 and Supplementary Table 1). RsutIR76b (candidate gustatory IR coreceptor) exhibited a high expression in palps, whereas RsutIR8a (candidate IR olfactory coreceptor) and RsutIR93a were highly expressed in antennae, like the four conserved antennal IRs (IR21a, 40a, 41a, 68a). Two RsutIRs belonging to the IR75 clade were also highly expressed in antennae, and three others exhibited similar FPKM values for both antennae and palps. Among these, RsutIR75f exhibited an exceptionally high expression level in both tissues (Figure 4 and Supplementary Figure 2). The 39 divergent RsutIRs were globally expressed at low levels in palps, although a few of them were also expressed in antennae, sometimes specifically (e.g., RsutIR130). As observed for the soluble proteins, none of the candidate chemosensory receptors annotated here exhibited a sexually dimorphic expression.

Representatives of Other Protein Families Expressed in Chemosensory Organs
Complementary to the detailed analysis of gene families described above, we searched for unigenes from other families that would be specifically expressed in the chemosensory organs relative to the wing muscle. By doing so, we found more than 700 unigenes with hits in UniProt or Pfam databases (Supplementary Table 2). We notably identified more than 40 sequences encoding enzymes that may play a role in metabolism of semiochemicals, such as cytochrome P450 enzymes, carboxylesterases, lipases, short-chain dehydrogenases and UDP-glucuronosyl transferases (Chertemps and Maïbèche, 2021). Some of these unigenes exhibited high FPKM values (Figure 7). We also found six unigenes encoding CD36 proteins, a family to which Sensory Neuron Membrane Proteins (SNMP) belong , and three unigenes encoding candidate pickpocket channels (amiloride-sensitive sodium channels), some of which are involved in olfaction and taste in Drosophila and mosquitoes (Chen et al., 2010;Matthews et al., 2019;Ng et al., 2019;Liu et al., 2020). We also searched for unigenes that would be expressed in chemosensory organs of a single sex (either male or female) and found only a dozen of such unigenes, i.e., five expressed in antennae and 10 in palps. None of these corresponded to gene families with a known direct or indirect role in chemical senses (Supplementary Table 2). This confirmed our initial observation that no chemosensory gene seemed differentially FIGURE 3 | Maximum-likelihood phylogeny of coleopteran ORs. The tree was built from an alignment of amino acid sequences of R. suturalis ORs (RsutOR, available in Supplementary Table 1) with sequences annotated from the genomes of five coleopteran species belonging to distinct superfamilies. The OR clades 1-7 were defined as in Mitchell et al. (2020). Black dots indicate deep nodes highly supported by the likelihood-ratio test (aLRT > 0.9), with the corresponding value. The tree was rooted using the Orco clade as an outgroup. The scale bar indicates the expected number of amino acid substitutions per site.
Frontiers in Ecology and Evolution | www.frontiersin.org  Supplementary Table 1. Transcripts were classified in three categories, from top to bottom: equally expressed in antennae and palps; more expressed in antennae; more expressed in palps. A transcript was classified as more expressed in a specific tissue when the ratio of FPKM values was at least four-fold. Transcripts with FPKM values lower than 1 in all tissues are not shown. expressed between male and female chemosensory appendages in R. suturalis.

DISCUSSION
In this study, we have identified the first chemosensory genes of an aquatic member of the megadiverse order Coleoptera, the diving beetle Rhantus suturalis. Members of six chemosensory gene families (OBPs, CSPs, NPC2 proteins, and receptors of the OR, GR, and IR families) were annotated, incorporated into phylogenetic analyses, and their relative expression levels were estimated, based on RNA-seq data from antennae and palps of adult males and females as well as wing muscle (i.e., in total, 5 samples). The gene repertoires recovered from these transcriptomic data are necessarily partial in the absence of a reference genome for R. suturalis, and because only the main cephalic sensory organs at a single life stage were processed. This  Supplementary Table 1) with sequences annotated from the genomes of three coleopteran species belonging to distinct superfamilies. Putative functions were assigned to several clades as in Andersson et al. (2019). Black dots indicate deep nodes highly supported by the likelihood-ratio test (aLRT > 0.9), with the corresponding value. The tree was rooted using the CO 2 receptor clade as an outgroup. The scale bar indicates the expected number of amino acid substitutions per site.
limitation has to be kept in mind, especially for comparisons of gene family sizes with respect to other Coleoptera or insect species for which genes have been annotated based on complete genomes. However, given our relatively deep sequencing strategy, and results from the BUSCO analysis (only 4.1% of genes missing), we can surmise that the contents obtained here FIGURE 6 | Maximum-likelihood phylogeny of coleopteran IRs. The tree was built from an alignment of amino acid sequences of R. suturalis IRs (RsutIR, available in Supplementary Table 1) with sequences annotated from the genomes of three coleopteran species belonging to distinct superfamilies. Putative functions were assigned to the different IR clades based on previous works on Drosophila. Black dots indicate deep nodes highly supported by the likelihood-ratio test (aLRT > 0.9), with the corresponding value. The tree was rooted using the IR25a/IR8a clade as an outgroup. The scale bar indicates the expected number of amino acid substitutions per site.
Frontiers in Ecology and Evolution | www.frontiersin.org 11 December 2021 | Volume 9 | Article 773915  Dippel et al., 2014Dippel et al., , 2016, and color coding shows log 2 (fold change). For each family, genes are separated by groups of orthology.
Frontiers in Ecology and Evolution | www.frontiersin.org 13 December 2021 | Volume 9 | Article 773915 for the different gene families should not be considerably far from exhaustiveness. Our R. suturalis transcriptome yielded chemosensory gene repertoires whose sizes are generally within the range of what has been previously obtained for terrestrial Coleoptera species, but in the lower, middle or upper range depending of the gene families. R. suturalis has more OBPs (53 retrieved in this study) than the Curculionidae D. ponderosae and the Buprestidae A. planipennis but this number is of the same order of magnitude as in genomes of the Tenebrionidae T. castaneum, the Chrysomelidae Leptinotarsa decemlineata and the Cerambycidae A. glabripennis (Andersson et al., 2019). Among other roles, OBPs are thought to mediate transport of hydrophobic odorants through the sensillar lymph to the olfactory neuron dendrites (Brito et al., 2016;Rihani et al., 2021). Numerous functional studies carried out on Coleoptera (mainly in Scarabaeidae and Chrysomelidae) have identified OBPs capable of binding hydrophobic molecules, whether they are aggregation pheromones or plant volatiles (see Mitchell and Andersson, 2021 for a review). Interestingly, this holds true for the Dytiscidae Cybister japonicus, where a classical OBP and a Minus-C OBP have shown a good affinity for a monoterpene (citral) and for a phenypropanoid (coniferyl aldehayde), respectively (Song et al., 2016). The fact that R. suturalis has a rich complement of OBPs suggests that detection of hydrophobic odorants is important in the chemical ecology of this aquatic beetle. Moreover, the OBP phylogeny revealed an impressive gene expansion in the lineage leading to the Dytiscidae, with a clade of classical OBPs exclusively containing 35 R. suturalis paralogs (Figure 1). In its breadth, this expansion is unparalleled in any other beetle species investigated. This may reflect peculiar functional requirements on OBPs in aquatic Adephaga, perhaps due to the low concentration of hydrophobic molecules in water. However, no OBP has been described yet in terrestrial Adephaga and it remains to be determined whether this large OBP expansion is linked with the transition to an aquatic lifestyle or not.
Contrary to OBPs, the R. suturalis transcriptome contained a low number of NPC2 proteins and CSPs, which are other soluble proteins potentially involved in chemical senses in insects. Whereas the low number of NPC2 proteins is on a par with previous observations in Coleoptera (Pelosi et al., 2014), the number of four CSPs identified in R. suturalis is far below what has been found in beetle genomes (11-20;Andersson et al., 2019). Together with the fact that their expression was generally not restricted to antennae and palps, this suggests that these two protein families may not play a major role in chemoreception in R. suturalis and aquatic Adephaga. Potential roles of CSPs in detoxification, immunity or secretion of defensive compounds have already been proposed in T. castaneum (Contreras et al., 2013;Li et al., 2013).
For chemoreceptors, the number of 48 ORs identified in R. suturalis stands in the lower range for Coleoptera (32-258;Mitchell et al., 2020). Like OBPs, ORs are supposed to be functionally involved in the detection of hydrophobic odorants. It is interesting to observe that this diving beetle species does not present an impoverished complement of ORs with respect to the terrestrial Adephaga beetle C. scrutator (51 ORs). Of the Coleoptera species whose OR genes have been annotated from complete genomes, only the Hydroscaphidae Hydroscapha redfordi has less ORs than R. suturalis (32). Hydroscaphidae are aquatic members of suborder Myxophaga, feeding on algae. The OR phylogeny revealed a possible Dytiscidae-specific OR gene expansion within clade 3, harboring 9 R. suturalis paralogs. Lineage-specific expansions have previously been found in this OR clade of unknown function for species with different lifestyles and feeding habits, namely the carnivorous burying beetle N. vespilloides and the phytophagous long-horned beetle A. glabripennis. Nevertheless, R. suturalis together with its terrestrial Adephaga cousin C. scrutator underwent much less massive gene expansions in the whole OR phylogeny than observed in Polyphaga species for which data are available (Mitchell et al., 2020).
We identified 73 GRs from the R. suturalis transcriptome, mostly expressed in palps. This number is higher than the GR gene repertoire of specialized phytophagous Polyphaga but much lower than in polyphagous species of Polyphaga (e.g., T. castaneum: 219; A. glabripennis: 190; Andersson et al., 2019). Our phylogenetic analysis showed that GR expansions identified in R. suturalis did not occur in the same lineages as those observed in A. glabripennis and D. ponderosae. However, the current lack of GR functional characterization in Coleoptera hampers any discussion on the potential role of these expansions. As insect GRs are mostly known for their ability to bind watersoluble non-volatile semiochemicals, it is tempting to speculate that a larger GR complement would increase underwater chemodetection capabilities but this has yet to be demonstrated. Moreover, the lack of GR gene identification in any other Adephaga makes it impossible to determine which of these GR expansions observed in R. suturalis are possibly specific to water beetles. The same is true for lineage-specific expansions involving R. suturalis divergent IR genes mostly expressed in palps.
In the absence of biological replicates, we could not carry out a differential expression analysis able to capture subtle differences of expression between samples, notably between males and females. Anyway, no R. suturalis chemosensory gene showed any hint of sex-biased expression, a quite puzzling result in light of the recent experimental demonstration that in this species, males are attracted underwater by a sex pheromone (of unknown composition) emitted by females (Herbst et al., 2011). With the exception of D. melanogaster, all pheromone receptors identified so far in insects belong to the OR family (see Fleischer and Krieger, 2021 for a comprehensive review). Even though receptors to aggregation pheromone components (such as those identified in Coleoptera) can be expressed at similar levels in both sexes, sex pheromone receptors generally present a strong sex-biased expression, as commonly observed in male moths (Bastin-Heline et al., 2019). It is of course conceivable that in diving beetles, receptors allowing males to detect the female pheromone belong to a receptor family other than those examined in this study. Other possible explanations could be that a pheromone receptor is indeed present among the candidate receptors that we characterized but is not differentially expressed between the sexes, or that expression of the pheromone receptor is seasonal and was not different between sexes at the period of the year in which we sampled the specimens used in this study (between April and October 2017).
One of the most interesting features of the R. suturalis chemosensory gene repertoires in the face of the terrestrialaquatic ecological transition is that possible dytiscid-specific gene expansions were systematically associated with expression in palps. The comparison of chemosensory gene expression levels in antennae vs. palps (or entire mouthparts) in R. suturalis and T. castaneum (the only terrestrial beetle for which comparable data are available; Dippel et al., 2014Dippel et al., , 2016 reveals that preferential expression sites clearly differ for several orthologous genes or gene groups, with a striking recurrent tendency across chemosensory gene families: expression tends to be shifted to the palps in R. suturalis (Figure 8). Amongst ORs, R. suturalis has several palp-specific ORs whose T. castaneum orthologs are antennae-specific: RsutOR1 (member of clade 1) and all but one members of the R. suturalis-specific expansion in clade 3 (RsutOR26-34). Concerning GRs, most T. castaneum genes are expressed similarly in antennae and mouthparts and some are antennae-biased, whereas the vast majority of RsutGRs seem to be more expressed in palps, notably the highly expressed RsutGR36, 37 and 72, as well as the candidate CO 2 receptors RsutGR1-3. Contrary to the other two chemoreceptor families, expression of IRs is highly similar in T. castaneum and R. suturalis, yet this comparison does not include divergent IRs, which have not been analyzed in T. castaneum (Dippel et al., 2016). Finally, several T. castaneum orthologs of the large R. suturalis OBP expansion (RsutOBP5-39) are also expressed at a higher level in palps, but RsutOBP50 (belonging to the ABP II clade) is highly expressed in palps whereas its T. castaneum orthologs are all antenna-specific.

CONCLUSION
In conclusion, our transcriptome analysis of chemosensory organs in a diving beetle has revealed chemosensory gene repertoires that are rather similar to those of terrestrial Coleoptera. This seems at odds with prevailing views on the impact of physical constraints on chemical communication underwater. However, we did identify several occurrences of genes highly diversified in R. suturalis, notably in the OBP and OR families, which are associated to the detection of hydrophobic odorants in insects. Moreover, these diversified genes were generally highly expressed in palps. A shift of expression of OBPs and ORs from the antennae to the palps may have significant implication with respect to the ecological transition from terrestrial to aquatic life. Indeed, hydrophobic molecules are much less diffusive in water than in the air, leading to the idea that the same proteins that mediate longrange chemodetection (i.e., olfaction in the classic sense) in aerial animals may be involved in short-range or contact chemodetection (gustation, or taste, in the classic sense) in aquatic animals (Mollo et al., 2014). Further expression analyses in Adephaga with various lifestyles associated with functional studies of their chemosensory genes will be necessary to verify whether part of the molecular toolkit functioning in the antennae of terrestrial beetles has indeed diversified and has been reallocated to the palps in aquatic beetles, to allow short-range perception of hydrophobic molecules.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI (accession: PRJNA762278).

AUTHOR CONTRIBUTIONS
NM, MJ, TC, and MM conceived the study. MJ and MM performed insect collection and sample preparation. YJ prepared RNA-seq libraries and performed sequencing. NM, TC, EP, CM, and EJ-J performed transcriptome assembly, data analysis, and phylogenetic reconstructions. NM and MM drafted the manuscript, with input from all authors. All authors contributed to the article and approved the submitted version.