Screening Candidate Effectors of the Bean Bug Riptortus pedestris by Proteomic and Transcriptomic Analyses

Riptortus pedestris causes an unusual form of damage: infested soybean plants stay green in late autumn. Identifying its salivary proteins is essential to understanding how the pest-plant interaction occurs. Here, we sought to identify them by combining proteomic and transcriptomic analyses. The transcriptomes of salivary glands from R. pedestris males, females and nymphs contained about 28,000 unigenes, of which about 40% had open reading frames (ORFs). From these, the predicted proteins with secretion signals were obtained. Many of the top 1,000 expressed transcripts were involved in protein biosynthesis and transport, suggesting that the salivary glands produce a rich repertoire of proteins. In addition, saliva of R. pedestris males, females and nymphs was collected and the proteins in it were identified. In total, 155, 20, and 11 proteins were found in the saliva of males, females and nymphs, respectively. We tested the tissue-specific expression of 68 genes that are likely to be effectors, either because they are homologs of reported effectors of other sap-feeding arthropods, or because they are within the top 1,000 expressed genes or were found in the salivary proteomes. Their potential functions in regulating plant defenses are discussed. The datasets reported here represent a first step in identifying effectors of R. pedestris.


INTRODUCTION
Many hemipterans are important pests that insert their needle-like mouthparts (stylets) into crop plants and feed on sap. During feeding, they eject gelling saliva that solidifies quickly and forms a continuous sheath in host plants. The sheath is a feeding channel and protects stylets against plant toxins. Meanwhile, watery saliva is used to digest food, regulate plant defenses and facilitate pathogen transmission (Miles, 1999;Will et al., 2012;Huang et al., 2019c). To study the molecular mechanisms of interactions between pests and crops, we need to identify the salivary proteins and analyze their functions. Transcriptome analysis of salivary glands and proteome analysis of secreted proteins are two efficient ways to identify salivary proteins. These analyses have been performed on some agriculturally important hemipterans, such as aphids (Carolan et al., 2011;Boulain et al., 2018), planthoppers (Ji et al., 2013;Huang et al., 2018), whiteflies (Su et al., 2012) and leafhoppers (Coudron et al., 2007;DeLay et al., 2012). Though many stink bugs are also important pests, identification of their salivary effectors has been largely neglected, and previous studies have mainly focused on the activities of digestive enzymes. For example, salivary glands of some pod-sucking coreid bugs produce a large amount of proteinases that are probably used to digest proteins in beans (Soyelu et al., 2007). The coreid bug Mictis profana (Fabr.) uses a sucrase to hydrolyze sucrose into monosaccharides during feeding, thereby increasing local osmotic pressure and unloading the solutes of neighboring plant cells (Taylor and Miles, 1994). The mirid bug Apolygus lucorum (Meyer-Dür) produces a series of digestive enzymes in its salivary glands, such as pectinases, polygalacturonases, amylases, cellulases and proteinases (Tan et al., 2016;Li et al., 2017;Zhang et al., 2017).
Transcripts of salivary glands have been sequenced in some true bug species (Francischetti et al., 2007;Zhu et al., 2016). Still, these studies again focused mainly on digestive enzymes, and very few discussed the effector functions of the salivary proteins. However, a recent study found that a glutathione peroxidase was highly expressed in the salivary glands of A. lucorum, which probably uses it to eliminate the reactive oxygen species (ROS) that accumulate in plants (Dong et al., 2020).
In other hemipterans, a variety of salivary effectors that affect plant immunity have been identified (Hogenhout et al., 2009;Sharma et al., 2014). For example, physical puncturing of phloem sieve elements normally leads to their rapid occlusion, because insoluble protein complexes (e.g., forisomes) form inside and act as valves of the sieve tubes. Phloem-feeding hemipterans, such as aphids and planthoppers, prevent phloem occlusion and the related defense responses by using salivary proteins, including calcium-binding proteins. The proteins bind calcium, thereby weakening the signaling of defenses and avoiding the occlusion of sieve elements (Sharma et al., 2014;Ye et al., 2017;Huang et al., 2019c). In addition, hemipteran herbivores commonly use catalases and peroxidases, which are ubiquitous heme enzymes, to remove hydrogen peroxide at feeding sites (Sharma et al., 2014). Some salivary enzymes, such as phenol oxidases, dehydrogenases and cytochrome P450s, are often used to detoxify toxic plant compounds (Nicholson et al., 2012;Sharma et al., 2014). Furthermore, non-enzymatic proteins have been increasingly identified in hemipteran saliva, and they often affect plant defenses via different mechanisms (Elzinga et al., 2014;Matsumoto and Hattori, 2018;Xu et al., 2019).
The bean bug Riptortus pedestris (Fab.) (Hemiptera: Heteroptera: Alydidae) is an important pest of soybeans in East Asia. Very recently, the genome of R. pedestris was assembled (Huang et al., 2021b), which provides an important dataset for analyzing the functions of its genes. The pest invades soybean fields during the flowering period and causes severe damage to soybeans by sucking pods (Endo et al., 2011;Xu et al., 2021). Severely damaged plants stay green in the stem and leaf in late autumn (Li et al., 2019), indicating that the salivary proteins of R. pedestris may have changed plant development. Identification of the salivary proteins is the first step in understanding the plant's response. Salivary proteins of R. pedestris have been identified by a combination of proteomic and transcriptomic analyses of salivary glands (Huang et al., 2021a). However, whether these proteins can be secreted into food is still unknown, and comparisons among developmental stages and between sexes are missing. Here, we studied the transcripts of the salivary glands of males, females and nymphs, with special attention to identifying candidate effectors. In addition, the proteomes of male, female and nymph saliva were analyzed separately. As a result, about 170 salivary proteins in total were found, and their potential functions as effectors are discussed.

RNA Extraction, cDNA Library Construction and Illumina Sequencing
Thirty adults (male or female, 10-d old) or fourth-instar nymphs were anesthetized on ice and subsequently dissected to obtain salivary glands (Figure 1). The RNA was extracted with the TRIzol Total RNA Isolation Kit (Takara, Dalian, China), following the manufacturer's instructions. The quality of the extracted RNA was verified with the Agilent 2100 Bioanalyzer (Agilent Technologies, CA, United States). Polyadenylated RNA (mRNA) was purified from the total RNA using oligo(dT) magnetic beads, and the mRNA was then fragmented into short sequences in the presence of divalent cations at 94 °C for 5 min. The cleaved RNA was reverse transcribed, and the second-strand cDNA was obtained. After end-repair and adaptor ligation, the products were PCR-amplified and purified using Ampure XP Beads (Agencourt Bioscience, MA, United States) to create the cDNA library.
The library was sequenced on the Illumina sequencing platform and the raw data were generated using Solexa GA pipeline 1.6. Low-quality reads were removed, and the remaining sequences were assembled using Short Oligonucleotide Analysis Package (SOAP) de novo software (Li et al., 2008), and then clustered by TGICL v2.0.6 to obtain unigenes (Pertea et al., 2003). The clean reads of the transcriptomes have been deposited in the SRA database under accession number PRJNA690963.

Annotations of Unigenes and Predicted Peptides
The sequences of unigenes were searched in one of four databases to obtain their annotations, including the NR database (NCBI 1 );
FIGURE 1 | The procedure of the experiment and sequence analyses: salivary glands were dissected from the bean bugs (taking a male as an example here), and then the transcriptome was sequenced and analyzed. We tested the tissue-specific expression of 18 genes with secretion signals in the top 1,000 expressed genes, as well as 19 homologs of reported effectors of other sap-feeding arthropods. In addition, saliva of the bean bugs was collected and the proteomes were analyzed by LC-MS/MS. The tissue-specific expression of 31 secreted salivary proteins was tested. The genes that are expressed more in salivary glands have a high potential as effectors, and their functions are discussed in the text.
TransDecoder.LongOrfs was used to extract the long open reading frames (ORFs). The ORFs were blasted in the SwissProt 5 and Pfam databases 6 by Diamond Blastp and Hmmscan, respectively. The coding sequences (CDSs) were extracted from the transcripts by TransDecoder 3.0.1 (Kim et al., 2015), and then the predicted proteins were obtained. Then, the SignalP 5.0 7 was used to test whether sequences have secretion signal peptides or not (Armenteros et al., 2019), while the TMHMM 2.0 8 was used to check the transmembrane areas of sequences (Krogh et al., 2001).
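One way to combine the two predictions is to keep only proteins that have a predicted signal peptide and no predicted transmembrane region. The following Python sketch illustrates that rule only; it is not the authors' pipeline. The `Prediction` record and its field names are hypothetical, and the sketch assumes that the SignalP and TMHMM results have already been parsed into one record per protein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical per-protein summary of the two predictions."""
    protein_id: str
    has_signal_peptide: bool  # assumed to come from a parsed SignalP result
    n_tm_helices: int         # assumed to come from a parsed TMHMM result


def likely_secreted(p: Prediction) -> bool:
    # Keep proteins with a signal peptide and no transmembrane helix.
    return p.has_signal_peptide and p.n_tm_helices == 0


def filter_secreted(predictions):
    """Return the IDs of proteins that pass the secretion filter."""
    return [p.protein_id for p in predictions if likely_secreted(p)]


if __name__ == "__main__":
    demo = [
        Prediction("unigene_a", True, 0),   # passes
        Prediction("unigene_b", True, 2),   # membrane-anchored
        Prediction("unigene_c", False, 0),  # no signal peptide
    ]
    print(filter_secreted(demo))
```

The identifiers in the demo are made up; real input would be the unigene IDs of the predicted proteins.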
The predicted proteins with secretion signal peptides and without transmembrane areas are likely to be secreted by salivary glands into saliva (Nielsen, 2017), and therefore have a relatively high potential to modulate plant defenses. We focused on the genes with secretion signals (about 192 genes, Supplementary Table 1) in the top 1,000 expressed genes of the transcriptomes, and 18 genes were selected for testing their expression levels in different tissues (see below). In addition, the amino acid sequences of most

Saliva Collection and In-Solution Digestion
Riptortus pedestris saliva was collected in a Petri dish (2 cm × 11 cm) whose opening was covered by two layers of Parafilm with 2 ml of sterile sucrose solution (2.5% in water) as food in between (Figure 1). The Parafilm was previously sterilized with 75% ethanol solution. The sucrose solution was prepared with aseptic water and filtered through a 0.22 µm syringe filter (Millipore, MA, United States) to remove microorganisms. Ten individuals (males, females or fourth-instar nymphs) were put in each Petri dish and the collection lasted 24 h. The collection was repeated 30 times. In total, 300 individuals were used. After collection, the sucrose solutions from all Petri dishes were combined (about 60 ml) and concentrated by ultrafiltration (3-kDa, Amicon Ultra-4 Centrifugal Filter Tube, Millipore; 5,000 g, 4 °C, 30 min). The proteins were dissolved in 200 µl of SDT buffer (4% sodium dodecyl sulfate; 1 mM DTT and 100 mM Tris-HCl) and then incubated in warm water for 15 min.
Subsequently, DTT was added into protein samples to a concentration of 100 mM, and then the samples were boiled for 5 min. After ultrafiltration (3-kDa; 14,000 g, 25 °C, 10 min), 100 µl iodoacetamide (IAA) buffer (100 mM IAA in UA buffer) was used to dissolve the proteins, and then the samples were incubated at room temperature for 30 min in darkness. After ultrafiltration (3-kDa) again, the samples were washed with 100 µl UA buffer (8 M urea, 150 mM Tris-HCl, pH 8.0) twice, and then washed with 100 µl NH4HCO3 buffer (25 mM, Sigma) twice. Finally, the proteins were digested overnight with 4 µg of trypsin (Sigma) in 40 µl NH4HCO3 buffer (25 mM) at 37 °C. The digested peptides were collected by ultrafiltration (3-kDa) and were dissolved in 40 µl NH4HCO3 buffer (25 mM).

Liquid Chromatography With Tandem Mass Spectrometry
The digested peptides were separated on a Thermo Scientific Easy nanoLC 1000 equipped with a C18 column (Thermo Scientific Acclaim PepMap100, 100 µm × 2 cm). Buffer A (0.1% formic acid in water) including 5% buffer B (84% acetonitrile and 0.1% formic acid in water) was used as the mobile phase for gradient separation. The sample was loaded onto the column at a flow rate of 0.3 µl/min. Subsequently, the column was eluted with a linear gradient of buffer B at a flow rate of 0.25 µl/min (0-50 min, concentration increasing from 0 to 35%; 50-55 min, 35 to 100%; and finally pure buffer B maintained for 5 min).
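The elution program can be written as a piecewise-linear function of time. The Python sketch below is an illustration only: the function names are ours, it takes the stated start of 0% buffer B at face value, and it assumes the final hold runs from 55 to 60 min.

```python
def percent_b(t_min: float) -> float:
    """Percentage of buffer B at time t (min) for the described gradient."""
    if t_min < 0 or t_min > 60:
        raise ValueError("time must lie within the 0-60 min program")
    if t_min <= 50:
        # 0-50 min: linear increase from 0 to 35% B
        return 35.0 * t_min / 50.0
    if t_min <= 55:
        # 50-55 min: linear increase from 35 to 100% B
        return 35.0 + 65.0 * (t_min - 50.0) / 5.0
    # 55-60 min: pure buffer B
    return 100.0


def percent_acetonitrile(t_min: float) -> float:
    """Acetonitrile share of the mobile phase, given that buffer B is 84% acetonitrile."""
    return 0.84 * percent_b(t_min)


if __name__ == "__main__":
    for t in (0, 25, 50, 52.5, 60):
        print(t, percent_b(t), round(percent_acetonitrile(t), 2))
```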
The eluted peptides were analyzed by the Q-Exactive mass spectrometer (Thermo Fisher Scientific, United States). Full MS scans were acquired in the Orbitrap mass analyzer over the range m/z 300-1,800 with a mass resolution of 70,000 (at m/z 200). The twenty most intense peaks with charge state ≥2 were fragmented by higher-energy collisional dissociation (HCD) with a normalized collision energy of 30% (the isolation window was 2 m/z), and tandem mass spectra were acquired in the Orbitrap mass analyzer with a mass resolution of 17,500 (at m/z 200). For all detections, the dynamic exclusion time was set to 60 s.
Proteins were identified and annotated by using Mascot 2.2 to search UniProt (see footnote 5) with the restriction to R. pedestris data. The following parameters were used: trypsin was selected as the enzyme; two missed cleavage sites were allowed; 20 ppm mass tolerances for MS and 0.6 Da for MS/MS fragment ions; oxidation was a variable modification; carbamidomethyl was a static modification.
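Two of these search parameters can be illustrated with a short Python sketch: in silico tryptic digestion with up to two missed cleavages, and a 20 ppm precursor tolerance check. This is not Mascot's implementation. The cleavage rule used here (cut after K or R unless the next residue is P) is the common convention for trypsin and is an assumption about how the enzyme was defined in the search; the function names and the example sequence are hypothetical.

```python
def cleavage_fragments(seq: str):
    """Split a protein after K or R, except when the next residue is P."""
    fragments, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(seq: str, max_missed: int = 2):
    """All peptides obtained by joining up to max_missed + 1 adjacent fragments."""
    fragments = cleavage_fragments(seq)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def within_ppm(observed: float, theoretical: float, tol_ppm: float = 20.0) -> bool:
    """True if the relative mass error is within the tolerance (in ppm)."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


if __name__ == "__main__":
    print(tryptic_peptides("AAKCCRPDDKEE", max_missed=2))
    print(within_ppm(500.005, 500.000))
```

Allowing missed cleavages enlarges the candidate peptide list, which is why the number allowed is reported together with the mass tolerances.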

Testing Tissue-Specific Expression by Real Time Quantitative PCR
The relative expression of selected genes (68 genes) in different tissues of R. pedestris males, including salivary glands, mid-guts, fat bodies and testes, was compared. These comprise 31 genes encoding proteins found in male saliva, 18 genes in the top 1,000 transcripts and 19 genes (shown in Table 1) that are homologs of reported effectors. First, the total RNA of each tissue (30 individuals) was extracted with the TRIzol Total RNA Isolation Kit (Takara, Dalian, China). The first-strand cDNA was synthesized from RNA using the HiScript III RT SuperMix qPCR kit (Vazyme, Nanjing, China). Then, real time quantitative PCR (RT-qPCR) was performed on a QuantStudio 5 Real-Time System (Thermo Fisher Scientific, United States) using the Top Green qPCR SuperMix kit (TransGen Biotech, Beijing, China). The reaction program started with an initial denaturation step at 95 °C for 30 s, followed by 40 cycles of two steps each, 95 °C for 5 s and 60 °C for 34 s. The gene-specific primers were designed using the Primer Premier 5.0 software. To evaluate the primers, the cDNA was either left undiluted or diluted 4, 16, or 64 times. Primers were selected when amplification efficiencies ranged from 90 to 110% and the R² values were over 0.99 in the regression analysis. Three biological replicates and three technical replicates were applied. The individual efficiency-corrected calculation method was used to compare the fold changes in expression levels of genes in mid-guts, testes and fat bodies, relative to those in salivary glands (Rieu and Powers, 2009;Rao et al., 2013). Two housekeeping genes, RpEF-1 and actin, were used as reference genes (Lee et al., 2019). The primers and the result of the regression analysis of each gene are listed in Supplementary File 1.
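The primer check and the efficiency correction can be sketched in Python as follows. This is a generic illustration, not the exact procedure of the cited method papers: it assumes that efficiency is estimated from the slope of Ct against log10 of relative template concentration, E = 10^(-1/slope) - 1, and that fold change is an efficiency-corrected ratio normalized to the geometric mean of the reference genes. All function names and Ct values are hypothetical.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope, intercept and R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy ** 2 / (sxx * syy)


def primer_efficiency(dilution_factors, ct_values):
    """Efficiency (1.0 = 100%) and R^2 from a cDNA dilution series."""
    log_conc = [-math.log10(d) for d in dilution_factors]
    slope, _, r2 = linear_fit(log_conc, ct_values)
    return 10 ** (-1.0 / slope) - 1.0, r2


def primers_pass(efficiency, r2):
    # Acceptance window stated in the text: 90-110% efficiency, R^2 > 0.99.
    return 0.90 <= efficiency <= 1.10 and r2 > 0.99


def fold_change(target, references):
    """Efficiency-corrected expression relative to the calibrator tissue.

    Each gene is given as (efficiency, delta_ct), where
    delta_ct = Ct(calibrator tissue) - Ct(tested tissue).
    """
    eff, dct = target
    ref_amounts = [(1.0 + e) ** d for e, d in references]
    ref_geomean = math.exp(sum(math.log(r) for r in ref_amounts) / len(ref_amounts))
    return (1.0 + eff) ** dct / ref_geomean


if __name__ == "__main__":
    eff, r2 = primer_efficiency([1, 4, 16, 64], [20.0, 22.0, 24.0, 26.0])
    print(round(eff, 3), round(r2, 3), primers_pass(eff, r2))
    print(round(fold_change((1.0, 3.0), [(1.0, 1.0), (1.0, 1.0)]), 3))
```

In this setting the calibrator would be the salivary gland, so a fold change below 1 means lower expression in the tested tissue than in the salivary glands.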

Statistical Analyses
The statistical analyses of RT-qPCR data were carried out using SigmaPlot 14 with one-way ANOVA tests. A Holm-Sidak post hoc analysis was used for pairwise comparisons. When the expression levels in the four tissues (salivary glands, mid-guts, testes and fat bodies) fitted a normal distribution, the comparisons were performed in one run. Otherwise, comparisons were conducted pair by pair, and normality was always satisfied. Different lowercase letters above the bars in Figure 2 indicate significant differences (P ≤ 0.05).
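For readers unfamiliar with the Holm-Sidak procedure, the step-down adjustment of a set of pairwise P-values can be sketched in Python as below. This is a generic textbook formulation and makes no claim about SigmaPlot's internal implementation; the function name and the example P-values are hypothetical.

```python
def holm_sidak(p_values):
    """Holm-Sidak step-down adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P-value is corrected for the m - k + 1 remaining tests.
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03]
    for p, q in zip(raw, holm_sidak(raw)):
        print(p, round(q, 6), "significant" if q <= 0.05 else "not significant")
```

With four tissues there are six pairwise comparisons, so the smallest raw P-value is corrected for six tests, the next for five, and so on.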

RESULTS
We obtained about 28,000 unigenes in the transcriptomes of salivary glands, and about 40% of the unigenes have complete ORFs (Supplementary Table 1). The average length of the unigenes ranged from 828 to 1,001 bp with some differences between treatments. In the top 1,000 expressed genes of the male transcriptome, there were about 192 genes with secretion signals (i.e., with secretion signal peptides and without transmembrane domains), and they are likely to be secreted from the gland cells without being anchored to the membranes (Cherqui and Tjallingii, 2000;Nielsen, 2017).
The top 1,000 expressed genes were mainly involved in ribosomal functions, amino acid metabolisms and posttranslational modifications etc. (Supplementary Figure 1), indicating that the salivary glands are specialized to produce many proteins. The GO annotations showed that many proteins in the salivary glands fulfilled binding and catalytic activities (Supplementary Figure 2).
Effector proteins are normally cysteine-rich and evolve quickly (Hogenhout and Bos, 2011;Dou and Zhou, 2012). A higher proportion of the secreted proteins in the transcriptome were cysteine-rich compared with housekeeping genes (Supplementary Figure 3). Though a high percentage of the secreted proteins matched analogous sequences (E-value < 1 × 10⁻⁵) in the NCBI NR database, with or without the restriction to the R. pedestris data, the relevant ratios of housekeeping genes in the transcriptome were always higher (Supplementary Figure 3). The data together suggest that many secreted proteins in the salivary glands are still unknown and have potential as effectors.
Table note: In addition, we also presented a few genes that had secretion signals and were within the top 1,000 expressed genes. The tissue-specific expression of the genes was tested (Figure 2B). The sequences were uploaded to the NCBI and the accession numbers are given; bold indicates that the genes were highly expressed in salivary glands. Whether the genes had secretion signals is indicated: Y, Yes; N, None.
Dozens of effectors have been reported in hemipterans and other sap-sucking arthropods to date, and 19 homologous proteins (≥30% similarity in amino acids) were also found in the transcriptomes (Table 1). The expression levels of those genes were compared in different tissues (salivary glands, mid-guts, testes, and fat bodies) of R. pedestris males. In addition, among the 192 proteins with secretion signals in the top 1,000 expressed genes of the male transcriptome, 18 genes that are likely to be effectors based on their annotations (Sharma et al., 2014) were selected and their tissue-specific expression was examined. Furthermore, a total of 155 proteins were identified from watery saliva of R. pedestris males by LC-MS/MS analysis (Table 2). Considerably fewer proteins were found in female (only 20) and nymph (11) saliva (Supplementary Table 3). About 60% of female and nymph saliva proteins were also found in male saliva (Supplementary Table 3). The functions of many proteins in the proteomes remained unannotated (Table 2). The tissue-specific expression of 31 proteins (normally with a secretion signal) found in the male saliva was compared among different tissues.
FIGURE 2 | RT-qPCR testing the tissue-specific expression of genes in salivary glands (Sg), mid-guts (Mg), testes (Te), and fat bodies (Fb) of R. pedestris males. (A) Expression levels of genes that encoded proteins found in male saliva; (B) genes that were selected from the transcriptome of salivary glands. The graphs are ordered as follows: first the genes that are highly expressed in salivary glands, then those that are abundantly produced by mid-guts, testes, or fat bodies. In addition, the expression levels of another 12 genes (6 from the transcriptome and 6 found in saliva) were not biased to a tested tissue and their data are shown in Supplementary Table 4. Different lowercase letters above the bars indicate significant differences (P ≤ 0.05).
The expression levels that were significantly biased to a tested tissue (56 genes) are shown in Figure 2. Otherwise, the data are given in Supplementary Table 4 (12 genes). In a previous paper, the salivary proteins of R. pedestris adults were identified by proteomic analysis of salivary glands (Huang et al., 2021a). By comparison with those data, we found that 127 proteins identified here were novel (Supplementary Table 5), indicating that analysis of secreted proteins in saliva is an important way to identify salivary proteins of insects.

DISCUSSION
Riptortus pedestris has been one of the main pests of soybeans for decades in Korea and Japan (Endo et al., 2011). Recently, its outbreaks have also been found in China (Li et al., 2019). The severely damaged soybeans stay green in late autumn (Li et al., 2019). However, the mechanism is not yet understood. In the seed-filling period in soybean, leaves continuously transport photosynthates to seeds until leaf senescence. However, damage to pods or sink removal may delay leaf abscission (Crafts-Brandner and Egli, 1987;Zhang et al., 2016). Damage by R. pedestris to pods possibly leads to the stay-green of soybeans through a similar mechanism. For example, many digestive enzymes were identified in the transcriptomes and proteomes, and they seemed to be specialized to digest beans (see below). In addition, the bugs often feed on veins of soybean leaves, where they possibly inject effectors that might regulate soybean development. However, the key effectors remain to be identified.

Different Number of Salivary Proteins Found in Males, Females and Nymphs
Male R. pedestris migrate to soybean fields earlier than females during the flowering period, and then release pheromone and possibly induce plants to release volatiles that attract females and nymphs (Endo et al., 2011;Xu et al., 2021). The release of male pheromone is stimulated by feeding (Morishima et al., 2005). Males may therefore secrete more salivary proteins when feeding on newly located plants to overcome a relatively intact immunity. In addition, adults express some genes specifically in salivary glands, as opposed to nymphs (Huang et al., 2021a), which may also contribute to more proteins being found in adult saliva than in nymph saliva. However, strong variance sometimes occurs among replicates when salivary proteomes of hemipterans are analyzed, as reported in other papers (Carolan et al., 2009, 2011;Huang et al., 2018).
Table note: In addition, we presented 30 proteins in Supplementary Table 2, because they are less likely to be effectors, such as reference genes in qRT-PCR (Lü et al., 2018) and ribosomal constituent proteins. The relative expression of 31 proteins normally with secretion signals was compared among different tissues (Figure 2A and Supplementary Table 4). The proteins with bold UniProt ID were highly expressed in salivary glands or mid-guts. *There are some proteins whose functions are not yet known.

Salivary Digestive Enzymes
Among the 814 proteins with secretion signals in the male transcriptome, many are probably used for digesting proteins and lipids, as also suggested by Huang et al. (2021a), including 112 proteases (peptidases) and 41 lipases (esterases). Since R. pedestris prefers to feed on bean pods in nature, the enzymes are possibly used to digest proteins and oils in beans. Similar results were obtained from studies on other seed-feeding bugs, as well as predator bugs (Soyelu et al., 2007;Bigham and Hosseininaveh, 2010;Zibaee et al., 2012). Extra-oral digestion seems to be important for many stink bug species. In the laboratory, R. pedestris is normally reared on dry soybean seeds with a water supply (Takeshita and Kikuchi, 2017), indicating that extra-oral digestion is a primary process of feeding. In addition, enzymes for sugar digestion were also found, including 6 α-amylases and other glucosidases. These enzymes appeared to be less abundant than proteinases in the salivary glands, as also found in other pod-feeding bugs (Soyelu et al., 2007).
In the male proteome, we found an α-glucosidase (UniProt ID: R4WDP5) and a proteinase (R4WQ74), and both enzymes are highly expressed in mid-guts (Figure 2A). We also found several cathepsin L enzymes in the saliva of males and females (Table 2 and Supplementary Table 3), which are normally expected to occur in lysosomes and never leave the cells. However, the enzymes are often secreted by digestive systems in insects and act as cysteine proteinases (Terra and Ferreira, 2005). The cathepsins L in R. pedestris saliva normally have secretion signals and are probably used for extra-oral digestion.

Salivary Effector Candidates: Oxidoreductases
Catalases, glutathione peroxidases and peroxiredoxins are oxidoreductases that are well recognized for degrading ROS and maintaining redox homeostasis in the damaged plant cells (Petrova and Smith, 2014;Sharma et al., 2014;Chaudhary et al., 2015;Dong et al., 2020). A catalase (R4WNB5) existed in male saliva, and the enzyme was abundantly expressed in fat bodies (Figure 2). A peroxiredoxin (MW625814) and a glutathione peroxidase (MW625813) were produced by salivary glands in a relatively high amount ( Figure 2B). These enzymes may also play an important role in suppressing the first-line defense of plants (Sharma et al., 2014;Dong et al., 2020).
Dehydrogenases may regulate plant defense signaling and detoxify toxic plant compounds (Sharma et al., 2014). For example, glucose dehydrogenases were found in the saliva of some aphid species and the activities of the enzymes corresponded to their virulence (Carolan et al., 2011;Nicholson et al., 2012;Sharma et al., 2014). Several dehydrogenases (R4WD44, R4WCW6, R4WEC6, R4WIH8, and R4WQZ0) were found in the saliva of males and females. In addition, R. pedestris produced two glucose dehydrogenases (MW561670 and MW625827) in higher amounts in salivary glands (Figure 2B). The functions of these enzymes remain to be confirmed in R. pedestris.

Hydrolases
Like the brown planthopper, Nilaparvata lugens (Stål) (Huang et al., 2016), R. pedestris secretes leucyl aminopeptidases (R4WCJ9) in saliva (Table 2). The enzymes cleave defense peptides (e.g., hormones and neuropeptides) at the N-terminus, especially at leucine residues. In addition, an aminopeptidase (MW625818) in the transcriptome was also found to be specific to salivary glands and testes of R. pedestris. The enzymes were considered to be essential in defending aphids against plant lectins (Nicholson et al., 2012). Metalloproteases, in contrast, possibly cleave peptides at the C-terminal end (Carolan et al., 2011). In aphids and thrips, they are able to counteract host defenses by degrading plant defense proteins (Carolan et al., 2009, 2011;Stafford-Banks et al., 2014;Wang et al., 2015b). The metalloprotease (MW625833) of R. pedestris appeared to be a Zn-metallocarboxypeptidase, and it was abundantly expressed in salivary glands. These enzymes have great potential for degrading defense proteins of host plants.
The chitooligosaccharidolytic beta-N-acetylglucosaminidase (NAGase, R4WE69) is a chitinase. The enzyme was highly expressed in salivary glands of R. pedestris and found in male saliva (Tables 1, 2). Plant NAGases act as antifungal agents by hydrolyzing N-glycans of polysaccharides and glycoproteins (Altmann et al., 1999). Therefore, insects may also use NAGases to inhibit fungal infection during feeding on plants (Nicholson et al., 2012;Sharma et al., 2014). In addition, NAGases in the saliva of sap-sucking herbivores possibly affect plant immunity by interacting with NAGases of host plants (Nicholson et al., 2012;Sharma et al., 2014).

Calcium Binding Proteins
Phloem sieve elements respond to feeding by piercing-sucking insects by quickly inducing a calcium flux, which possibly triggers the occlusion of sieve elements and increases the related plant defenses. However, calcium-binding proteins in saliva possibly reduce this reaction, which guarantees continuous feeding (Ye et al., 2017;Tian et al., 2021). Indeed, several types of calcium-binding proteins were found in the proteomes and transcriptomes (Tables 1, 2), and one of them (MW625835) was found to be highly expressed in salivary glands of R. pedestris.

Others
Trialysins have been found in saliva of the hematophagous bug Triatoma infestans (Klug) (Amino et al., 2002). The protein may lyse cells of both animals and microorganisms, indicating that it plays an important role in interaction with hosts (Amino et al., 2002). Similarly, two trialysins (A0A1B4X9A5 and A0A1B4X9A9) were found in the saliva of the bean bug (Table 2), and A0A1B4X9A5 was abundantly produced by salivary glands (Figure 2A).
We also found several mucin-like proteins in the top 1,000 expressed genes of the transcriptomes that commonly had secretion signals, and most of them were expressed in relatively higher amounts in salivary glands (Table 1 and Figure 2). A similar result was found in N. lugens, which secreted mucin-like proteins into both watery and gelling saliva (Huang et al., 2016). One of their functions was to form a developed salivary sheath and increase the adaptation of the brown planthoppers to rice plants (Huang et al., 2017). In the laboratory, we observed under an optical microscope many salivary sheaths on soybean seeds that had been fed on by R. pedestris. The sheaths are normally white tubes with helical curves and variable lengths. Whether mucin-like proteins also contribute to the formation of salivary sheaths in R. pedestris needs further study.
Two serine protease inhibitors (R4WL96 and R4WCP6) were found in the saliva of males and females. Such proteins have been found to be essential in the regulation of host defenses by various hematophagous arthropods (Amino et al., 2001;Chmelař et al., 2017;Soares et al., 2018). Both inhibitors of R. pedestris were highly expressed in salivary glands (Figure 2), and they are expected to be important in interaction with plant defenses.
The effector proteins of fungal pathogens are often small secreted cysteine-rich proteins (SSCPs) with fewer than 200 amino acid residues and a high cysteine content (>2%) (Stergiopoulos and de Wit, 2009). The strategy might also occur in insects. For example, Nl28 is a species-specific SSCP in N. lugens, which induced cell-death symptoms after transient expression in Nicotiana benthamiana (Rao et al., 2019). The cysteines in the effectors often contribute to the formation of disulfide bonds, thereby giving effectors a specific structure (Saunders et al., 2012). Here, we also found three SSCPs (R4WDU9, R4WQ74, and MW625844) that were highly expressed in salivary glands or mid-guts of R. pedestris (Figure 2). Therefore, R. pedestris might also use SSCPs to modulate host plant immunity, like fungi and the brown planthoppers.
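The SSCP criteria quoted above (fewer than 200 residues, cysteine content above 2%, and secreted) translate directly into a simple screen. The Python sketch below is illustrative only: the function names and test sequences are hypothetical, the secretion flag is assumed to come from a separate signal-peptide prediction, and whether length should be measured on the precursor or the mature protein is left as an assumption (the full sequence is used here).

```python
def cysteine_fraction(seq: str) -> float:
    """Fraction of residues that are cysteine (0.0 for an empty sequence)."""
    seq = seq.upper()
    return seq.count("C") / len(seq) if seq else 0.0


def is_sscp(seq: str, secreted: bool, max_len: int = 200, min_cys: float = 0.02) -> bool:
    """Small secreted cysteine-rich protein: < max_len residues, > min_cys cysteine."""
    return secreted and 0 < len(seq) < max_len and cysteine_fraction(seq) > min_cys


if __name__ == "__main__":
    candidate = "MKCC" + "A" * 46      # 50 residues, 4% cysteine
    print(is_sscp(candidate, secreted=True))
    print(is_sscp(candidate, secreted=False))
```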
Insect chemosensory proteins (CSPs) are well known for their functions in olfaction and gustation (Pelosi et al., 2005). However, some papers have found that the proteins are sometimes specifically expressed in salivary glands, and they trigger chlorosis and dwarf phenotypes of N. benthamiana after the transient expressions (Copenhaver et al., 2010;Rao et al., 2019). The MP10, a CSP of the green peach aphid Myzus persicae (Sulz.), activated the jasmonic acid and salicylic acid signaling pathways of N. benthamiana during feeding (Rodriguez et al., 2014;Mugford et al., 2016). Here, two CSPs (A0A2Z4HQ00 and MW625831) were found in saliva or transcriptomes of R. pedestris. However, their expression levels were not biased to a tested tissue (Supplementary Table 4).
Similar to CSPs, odorant binding proteins (OBPs) are also well recognized for their function in sensing odors (Zhu et al., 2019). Here, an OBP (A0A2Z4HQ32) was found in the male saliva (Table 2), and another OBP (MW625837) was largely produced in R. pedestris salivary glands (Figure 2B). How OBPs could act as effectors in herbivores is not yet understood. However, some OBPs are used by mosquitoes to scavenge host amines during feeding, which contributes to an anti-inflammatory effect (Calvo et al., 2006, 2009). Since OBPs possibly have ligand-binding hydrophobic channels (Calvo et al., 2009), they may be used by herbivores to bind defense-related molecules of plants.
Carbonic anhydrases are zinc metalloenzymes that catalyze the reversible hydration of carbon dioxide to bicarbonate. A carbonic anhydrase (MW625839) was specifically expressed in salivary glands of R. pedestris (Figure 2B). Carbonic anhydrases were also found in the watery saliva of a leafhopper and a planthopper species (Hattori et al., 2015;Huang et al., 2016). Silencing the gene resulted in lethality of N. lugens (Huang et al., 2016). How the enzymes help hemipterans feed on plants is not clear. They may play a protective role under the elevated CO2 concentration during feeding (Huang et al., 2016).

Summary and Perspectives
Transcriptome analysis indicates that salivary glands of R. pedestris possibly produce a rich repertoire of proteins, many of which are possibly used to digest proteins and oils in beans. In addition, many proteins were found in their saliva, and a high proportion of them are not yet annotated, indicating that knowledge of the salivary proteins of the pest is very limited. Therefore, the datasets reported here represent an important first step in identifying effectors in R. pedestris. In addition, a few elicitors of moth species are relatively small molecules that are not complete proteins, such as volicitin and inceptin peptides (Alborn et al., 1997;Steinbrenner et al., 2020). Such elicitors might also exist in heteropteran species, in which they have been overlooked so far. The different kinds of elicitors and effector proteins are likely to work together in facilitating the feeding success of R. pedestris on soybeans.

DATA AVAILABILITY STATEMENT
The raw datasets presented in this study can be found in online repositories. The names of the repositories and accession numbers can be found below: NCBI SRA database (accession: PRJNA690963) and ProteomeXchange (accession: PXD027846).