A Multiomics Approach Unravels New Toxins With Possible In Silico Antimicrobial, Antiviral, and Antitumoral Activities in the Venom of Acanthoscurria rondoniae

The Araneae order is considered one of the most successful groups among venomous animals in the world. An important factor for this success is the production of venom, a refined biological fluid rich in proteins, short peptides and cysteine-rich peptides (CRPs). These toxins may present pharmacologically relevant biological actions, such as antimicrobial, antiviral and anticancer activities. Therefore, there is an increasing interest in the exploration of venom toxins for therapeutic purposes, such as drug development. However, peptide sequencing and, above all, the evaluation of the potential biological activities of these peptides are laborious, considering the low yield of venom extraction and the high variability of toxins present in spider venoms. Here we show a robust methodology for identification, sequencing, and initial screening of potential bioactive peptides found in the venom of Acanthoscurria rondoniae. This methodology consists of a multiomics approach involving proteomics, peptidomics and transcriptomics analyses allied to in silico predictions of antibacterial, antifungal, antiviral, and anticancer activities. Through the application of this strategy, a total of 92,889 venom gland transcripts were assembled and 84 novel toxins were identified at the protein level, including seven short peptides and 10 fully sequenced CRPs (belonging to seven toxin families). In silico analysis suggests that seven CRP families may have potential antimicrobial or antiviral activities, while two CRPs and four short peptides are potentially anticancer. Taken together, our results demonstrate an effective multiomics strategy for the discovery of new toxins and in silico screening of potential bioactivities. This strategy may be useful in toxin discovery, as well as in the screening of possible activities for the vast diversity of molecules produced by venomous animals.


INTRODUCTION
Spider venoms are composed of a complex mixture of salts, nucleotides and other small molecules, as well as bioactive molecules such as proteins and peptides, usually referred to as toxins (Escoubas and Rash, 2004; Kuhn-Nentwig et al., 2011; Langenegger et al., 2019). In spiders, toxins are produced and stored in venom glands. These toxins are synthesized in an inactive form and undergo several maturation processes (i.e., signal-peptide cleavage, posttranslational modifications (PTMs) and disulfide-bond formation) before being secreted in their mature form (Mebs, 2001; Kaas and Craik, 2015).
The family of cysteine-rich peptides (CRPs) is the main class of toxins present in spider venoms, typically presenting molecular masses between 3 and 9 kDa. These toxins contain ≥6 cysteine residues that form disulfide bonds (S-S), which confer high stability to the peptides (Fry et al., 2009; Undheim et al., 2015). CRPs act on different voltage-gated ion channels (Kuhn-Nentwig et al., 2011; Langenegger et al., 2019), such as calcium (Kubista et al., 2007; Deng et al., 2014), potassium (Lee and MacKinnon, 2004; Liao et al., 2006; Lau et al., 2016) and sodium channels (Corzo et al., 2008; Rates et al., 2013; Zhou et al., 2020), making them valuable tools to investigate physiological processes (Ruta et al., 2003; Osteen et al., 2016). Moreover, through the modulation of these channels, spider toxins can induce paralysis in insects while having minor effects on other taxa, making them potential lead molecules for the development of biopesticides (Windley et al., 2012; King and Hardy, 2013). Another class of spider toxins is the antimicrobial peptides (AMPs), commonly found in spider hemolymph (Silva et al., 2000; Riciluca et al., 2012) as a component of innate immunity, but also found in spider venoms (Jung et al., 2006; Abreu et al., 2017). Usually, AMPs are small molecules rich in cationic and hydrophobic residues that fold into a cationic amphipathic secondary structure (Edwards et al., 2016). AMPs interact with the negatively charged outer membrane of microorganisms (Seo et al., 2012) through nonspecific interactions with anionic lipids (Arouri et al., 2009), causing membrane disruption through different pore-forming mechanisms (Fuertes et al., 2011; Paredes-Gamero et al., 2012). Interestingly, anticancer peptides (ACPs) share the main characteristics of AMPs, such as folding into a cationic amphipathic structure and interacting with the negatively charged outer membrane (Gaspar et al., 2013). Therefore, there is an increasing interest in studying the application of AMPs for cancer treatment (Felício et al., 2017; Zhang et al., 2019; Pérez-Peinado et al., 2020).
The biochemical arsenal of spider venoms is essential not only in predation and self-defense, but also in feeding, mating and antimicrobial protection, among other possible roles (Schendel et al., 2019). Tarantula spiders are usually harmless to humans (Lucas et al., 1994; Vetter and Isbister, 2008), but their venoms are valuable natural sources of molecules with potential for biotechnological applications and pharmacological research (Escoubas and King, 2009; Mobli et al., 2017). According to the World Spider Catalog, there are more than 48,000 spider species described (http://wsc.nmbe.ch, accessed on May 1st, 2020) and it is estimated that they can produce more than 10 million bioactive toxins (Saez et al., 2010). However, according to the ArachnoServer 3.0 database, about 1,500 spider toxins are cataloged and curated to date, representing a small fraction of the estimated universe of spider venom toxins. Thus, the discovery of biologically active peptides derived from spider venoms is still a promising field in toxinology research.
The advances in sensitivity and resolution of mass spectrometers, as well as advances in DNA and RNA sequencing techniques, have led to a remarkable increase in the number of toxins reported in recent years (Duan et al., 2013; Sanggaard et al., 2014; Zelanis and Keiji Tashima, 2014; Abreu et al., 2017; Zobel-Thropp et al., 2018; Langenegger et al., 2019). However, the increase in toxin identification has not been matched by the functional description of new toxins, since functional characterization involves a significant number of experimental processes. Advances in bioinformatics and computational capacity have allowed the development of machine-learning algorithms that serve as useful allies in drug discovery (Kaas and Craik, 2015). These machine-learning based tools may be used to predict potential biological activities, such as antimicrobial (Meher et al., 2017) and antitumoral (Manavalan et al., 2017) activities, based on the primary structure of toxins. Thus, they may serve as valuable guides in toxin selection for further investigation.
Our group developed a workflow based on transcriptomic analysis, multiple enzyme digestion of venoms, mass spectrometry and bioinformatic analysis focused on the full sequencing of mature toxins (Abreu et al., 2017; Lomazi et al., 2018). In previous work, we completely sequenced and determined the number of S-S bonds of new mature CRPs from the venom of Acanthoscurria gomesiana (Abreu et al., 2017). In the present study, we used our methodology allied to in silico predictions of AMPs and ACPs to investigate the A. rondoniae venom, which, to the best of our knowledge, has remained largely unexplored to date.

MATERIALS AND METHODS
A scheme of the complete methodology used in this work is illustrated as a flow chart (Figure 1). Details of each block are given in the following items of this section.

Animals
Three adult Acanthoscurria rondoniae specimens were kept in captivity in the biotherium of the Laboratório Especial de Toxinologia Aplicada (LETA), Instituto Butantan (SP, Brazil). These animals were collected and maintained under SISBIO/ICMBio permanent license number 11024-3-IBAMA (Brazilian Institute of Environment and Renewable Natural Resources) and under the SisGen license number A82F014. The spiders were fed at 15-day intervals with cockroaches or crickets and had water ad libitum. All procedures were approved by the Research Ethical Committee of the Federal University of Sao Paulo (protocol number 7649061014).

Venom Gland Transcriptome sequencing
One female specimen of A. rondoniae was anesthetized with carbon dioxide for about 10 min and euthanized to extract the venom glands, which were immediately stored at −80°C. mRNA from the venom glands was extracted and further processed for cDNA library construction following the stranded TruSeq RNA Sample Prep Kit protocol (Illumina, San Diego, CA, USA) (Freitas-de-sousa et al., 2015). Briefly, selected poly-A RNA was fragmented and primed with random hexamers. Fragmented RNA was reverse transcribed and the generated first-strand cDNA was ligated to indexing adapters for hybridization in the flow cell of a HiSeq 1500 System (Illumina, Inc) for sequencing. The size distribution of the cDNA libraries was measured with a 2100 Bioanalyzer using the DNA1000 assay (Agilent Technologies, CA, USA). An ABI Step One Plus Real-Time PCR System was used to quantify the sample library before sequencing. The cDNA libraries were sequenced on the Illumina HiSeq 1500 System, in a Rapid paired-end flow cell, in a strategy of 300 cycles of 2 × 150 bp paired-end reads. The RNA-seq raw sequencing reads were pre-processed through an "in house" pipeline for the detection of PhiX contaminant, using the software bowtie2 version 2.2.5 (Langmead and Salzberg, 2012), followed by quality filtering to trim and remove reads with low-complexity and homopolymer-enriched regions, poly-A/T/N tails, adapter sequences and low-quality bases with the software fastq-mcf 1.04.662 (Aronesty, 2013). Reads were trimmed when the mean quality score was lower than 25 in a window of 15 bases, reads shorter than 40 bp were discarded, and reads composed of 90% homopolymers or low-complexity regions were filtered out. The raw data generated in this project were deposited in the NCBI BioProject section under the accession code PRJNA633430 and BioSample SAMN14943686. The Transcriptome Shotgun Assembly was deposited at NCBI TSA under the accession GIOJ00000000.
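The read-filtering criteria above can be illustrated with a minimal Python sketch. It is not the fastq-mcf implementation: the function names, the left-to-right window scan and the order of the checks are assumptions made for illustration, and only the three thresholds (40 bp, mean quality 25 in a window of 15, 90% homopolymer) come from the text.

```python
from collections import Counter

MIN_LENGTH = 40          # minimum read length in bp (from the text)
MIN_MEAN_QUALITY = 25    # mean Phred score threshold (from the text)
WINDOW = 15              # window size in bases (from the text)
MAX_HOMOPOLYMER = 0.90   # fraction of one base that flags a read (from the text)


def trim_by_window(seq, quals):
    """Cut the read at the first window whose mean quality is below the threshold.

    Assumption: windows are scanned from the 5' end, one base at a time.
    """
    for start in range(0, max(len(seq) - WINDOW + 1, 0)):
        window = quals[start:start + WINDOW]
        if sum(window) / WINDOW < MIN_MEAN_QUALITY:
            return seq[:start], quals[:start]
    return seq, quals


def is_homopolymer(seq):
    """True when a single base makes up at least 90% of the read."""
    if not seq:
        return True
    most_common_count = Counter(seq).most_common(1)[0][1]
    return most_common_count / len(seq) >= MAX_HOMOPOLYMER


def filter_read(seq, quals):
    """Return the trimmed read, or None when the read is discarded."""
    seq, quals = trim_by_window(seq, quals)
    if len(seq) < MIN_LENGTH:
        return None
    if is_homopolymer(seq):
        return None
    return seq
```

In this sketch a read survives only if, after trimming, it is at least 40 bp long and is not dominated by a single base.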

Transcriptome Assembly and Database Generation
To generate a nonredundant set of transcripts, we performed a de novo assembly with the Trinity software (Grabherr et al., 2011) version r20131110, using 44,559,666 good-quality RNA-seq paired reads, with the parameter CuffFly to reduce the number of false-positive isoforms, and a minimum transcript length of 300 bp. The prediction of translated amino acid sequences for the reconstructed transcripts was based on the TransDecoder software, version 2.0.1 (http://transdecoder.sourceforge.net/), considering only predicted proteins with a length ≥60 aa. Each transcript containing coding sequences was aligned by BLASTp (Altschul, 1997) against the UniProt/Swissprot protein database and the NCBI Transcriptome Shotgun Assembly (TSA) database to assess the protein annotation, with a cutoff e-value <1e-5. The analysis of PFAM domains for the predicted proteins was based on the hmmsearch tool in the software package hmmer (Johnson et al., 2010) against a PFAM domains database (Bateman, 2004), using a cutoff e-value <1e-3. TransDecoder usually predicts more than one coding sequence per transcript, so only one candidate was selected for each transcript, following the priority order of matches to UniProtKB/Swissprot, PFAM and TSA-NCBI for annotating and selecting the best candidate.
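The candidate-selection rule described above (one coding sequence per transcript, chosen by the priority UniProtKB/Swissprot, then PFAM, then TSA-NCBI) can be sketched as follows. The data layout, the function name and the tie-break by protein length are hypothetical choices made for this illustration; only the priority order and the ≥60 aa cutoff come from the text.

```python
# Priority order of annotation sources, as stated in the text.
PRIORITY = ("swissprot", "pfam", "tsa")
MIN_PROTEIN_LENGTH = 60  # aa, as stated in the text


def select_best_candidate(candidates):
    """Pick one predicted coding sequence for a transcript.

    `candidates` is a list of dicts with keys 'id', 'length' (aa) and
    'hits' (set of source names with a significant match). This layout
    is an assumption made for the sketch.
    """
    def rank(candidate):
        for position, source in enumerate(PRIORITY):
            if source in candidate["hits"]:
                return position
        return len(PRIORITY)  # no annotation at all ranks last

    eligible = [c for c in candidates if c["length"] >= MIN_PROTEIN_LENGTH]
    if not eligible:
        return None
    # Assumption: ties within the same priority go to the longest protein.
    return min(eligible, key=lambda c: (rank(c), -c["length"]))
```

A candidate with a Swissprot match is therefore preferred over a longer candidate supported only by PFAM or TSA.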

Venom Extraction and Enrichment
The venom extraction procedure was carried out as previously described (Rocha-e-Silva et al., 2009). Briefly, after one week of fasting, three A. rondoniae specimens were anesthetized with carbon dioxide (CO2) and had their venom glands electrically stimulated at a frequency of 10 Hz and voltage of 10-25 V. After extraction, the venom was pooled due to the low yield of each individual and then the pool was quantified using the Bradford reagent and bovine serum albumin (BSA) as a standard. After quantification, 500 µg of the pooled venom was submitted to solid-phase extraction using C18 StageTips (Rappsilber et al., 2007), with minor adjustments, in order to enrich the peptide fraction. Briefly, the StageTips were conditioned with 500 µl of methanol, then with 500 µl of methanol:water (1:1) and centrifuged at 2,100 rpm for 2 min in both steps. Then, 500 µl of 80% ACN in 0.1% trifluoroacetic acid (TFA) were added to the StageTips and centrifuged at 1,800 rpm for 3 min. In the following step, 500 µl of 5% ACN in 0.1% TFA were added and the tip was centrifuged at 1,800 rpm for 3 min. After conditioning and stabilization, 500 µg of pooled venom (diluted in 5% ACN in 0.1% TFA, to a total volume of 500 µl) were loaded onto the StageTip and then centrifuged for 3 min at 1,400 rpm. Three washing steps were performed using 500 µl of 5% ACN in 0.1% TFA and centrifugation at 1,700 rpm for 2 min. Lastly, the peptide fraction was eluted in 500 µl of 40% ACN in 0.1% TFA by centrifugation at 1,700 rpm for 3 min. After enrichment, the sample was divided into six aliquots of equal volumes (83 µl), vacuum-dried using a Concentrator Plus (Eppendorf) and stored at 4°C until the digestion step.

Proteolytic Digestion
For proteomics analyses, a crude venom pool aliquot of 50 µg of proteins was digested with trypsin. For peptidomics analyses, five out of the six aliquots were submitted to proteolytic digestion using a different enzyme for each aliquot. The vacuum-dried aliquots were dissolved in 50 µl of digestion buffer according to the enzyme: NH4HCO3 50 mM for trypsin and Asp-N; phosphate buffer 50 mM for Glu-C; Tris/HCl 100 mM, CaCl2 10 mM for chymotrypsin; and Tris/HCl 100 mM, CaCl2 0.5 mM for thermolysin. The sixth aliquot was directly dissolved in 0.1% formic acid for LC-MS/MS analysis to characterize the toxins in their native (mature) forms.
To digest the toxins, volumes of 25 µl of 0.2% Rapigest surfactant (Waters, MA, USA) were added to each sample, which were incubated at 80°C for 15 min. Samples were reduced with 2.5 µl of DTT 100 mM for 30 min at 60°C and then alkylated with 2.5 µl of IAA 300 mM for 30 min at room temperature in the dark. After reduction and alkylation, the enzymes were added in an enzyme:protein ratio of 1:100 and incubated at 37°C for 30 min, except for thermolysin, for which a ratio of 1:250 was used and the incubation was performed at 75°C for 15 min. TFA was added to a final concentration of 0.5% to stop the digestions. Samples were filtered using Ultrafree-MC PVDF 0.22 µm filters (Millipore), vacuum-dried and stored at −20°C until MS analysis.

Mass Spectrometry: Peptidomics
For peptidomics analysis, digested and native peptide fractions were dissolved in 0.1% formic acid (solution A). Aliquots of 1 µl were automatically injected by a nano chromatography EASY-nLC 1200 system (Thermo Scientific) into a 15 cm × 50 µm Acclaim PepMap™ C18 column (Thermo Scientific) coupled to a Q Exactive Plus mass spectrometer (Thermo Scientific). Peptides were eluted with a linear gradient of 7%-45% of solution B (80% acetonitrile in 0.1% formic acid) at 300 nl/min for 60 min. Spray voltage was set at 2.5 kV and the mass spectrometer was operated in the data-dependent mode, in which one full MS scan was acquired in the m/z range of 300-1,500 followed by MS/MS acquisition using higher energy collision dissociation (HCD) of the five most intense ions from the MS scan. MS and MS/MS spectra were acquired in the Orbitrap analyzer at 70,000 and 17,500 resolution (at 200 m/z), respectively. Unassigned and +1 charge states were not subjected to fragmentation. The maximum injection times and AGC targets were set to 25 ms and 3E6 for full MS, and 40 ms and 1E5 for MS/MS. The minimum signal threshold to trigger a fragmentation event, the isolation window and the stepped normalized collision energy (NCE) were set to, respectively, 2.5E4 cps, 1.4 m/z and 26, 28, and 30. A dynamic peak exclusion was applied to avoid the same m/z selection for the next 5 seconds. All samples were analyzed in duplicate.

Mass Spectrometry: Proteomics
The proteomics analysis of the digested crude venom pool was performed on a Synapt G2 mass spectrometer coupled to a nanoAcquity UPLC system (Waters). Five µl of peptide samples were loaded online in a Symmetry C18 trapping column (5 µm particles, 180 µm × 20 mm length; Waters) for 5 min at a flow rate of 8 µl/min of phase A (0.1% formic acid). The mixtures of trapped peptides were subsequently separated by elution with a gradient of 7%-35% of phase B (0.1% formic acid in acetonitrile) through a BEH 130 C18 column (1.7 µm particles, 75 µm × 150 mm; Waters) over 90 min at 275 nl/min. Data were acquired in the data-independent acquisition mode HDMSE with ion mobility separation in the m/z range of 50-2,000 and in the resolution mode. Peptide ions were fragmented by collision induced dissociation (CID) and energies were alternated between 4 eV and a ramp of 15-65 eV for precursor ions and fragment ions, respectively, using scan times of 1.25 s (Abreu et al., 2017; Pedroso et al., 2017). The ESI source was operated in positive mode with a capillary voltage of 3.0 kV, block temperature of 100°C, and cone voltage of 40 V. For lock mass correction, Glu-Fibrinopeptide B (500 fmol/µl in 50% acetonitrile, 0.1% formic acid; Peptide 2.0) was infused through the reference sprayer at 500 nl/min and sampled for 0.5 s every 60 s. The venom pool was analyzed in triplicate. All mass spectrometry data (DIA and DDA) were deposited to the ProteomeXchange Consortium via the PRIDE (Perez-Riverol et al., 2019) partner repository with the dataset identifier PXD019343.

Quantitative Peptidomics and Proteomics
For quantitative peptidomics, mass spectrometry raw data of the native venom peptides were loaded in Progenesis QI for proteomics (Nonlinear Dynamics, Newcastle, UK). Briefly, a reference run for the duplicates was automatically selected. The retention times of precursor ions were processed for alignment, peak picking and normalization to the reference run using default parameters. The normalized data were exported from Progenesis QI for proteomics in .csv format and further analyses were performed in Microsoft Excel (Microsoft), where precursor ions with an intensity below 5.0 × 10^5 or with redundant m/z values were excluded from subsequent analysis.
Quantitative proteomics was also performed in Progenesis QI for proteomics with the same processing parameters. After processing, a .mgf file of all MS/MS spectra was exported to PEAKS Studio 7.5 (Bioinformatics Solutions Inc.) for protein identification (as described in "Toxin Sequencing"). The identification results were exported back to Progenesis as a .xml file. Venom proteins were quantified by the average signal intensity of the three most intense tryptic peptides of each protein (Silva et al., 2006). Only proteins identified with a minimum of three peptides and in at least two of the three replicates were considered for further analysis.
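The quantification rule described above (average intensity of the three most intense tryptic peptides, with a minimum of three peptides in at least two of three replicates) can be sketched in Python. This is an illustration only, not the Progenesis QI implementation; the input layout, the function names and the choice to average across the qualifying replicates are assumptions.

```python
def top3_intensity(peptide_intensities):
    """Average intensity of the three most intense peptides of one protein."""
    top = sorted(peptide_intensities, reverse=True)[:3]
    return sum(top) / len(top)


def quantify(proteins, min_peptides=3, min_replicates=2):
    """Apply the acceptance criteria and return one abundance per protein.

    `proteins` maps a protein name to a list of replicates, each replicate
    being a list of peptide intensities (hypothetical layout). A replicate
    counts only if it holds at least `min_peptides` peptides.
    """
    abundances = {}
    for name, replicates in proteins.items():
        valid = [r for r in replicates if len(r) >= min_peptides]
        if len(valid) >= min_replicates:
            # Assumption: the final value is the mean over valid replicates.
            abundances[name] = sum(top3_intensity(r) for r in valid) / len(valid)
    return abundances


def relative_percent(abundances):
    """Express each abundance as a percentage of the summed abundances."""
    total = sum(abundances.values())
    return {name: 100 * value / total for name, value in abundances.items()}
```

Percentages obtained this way are relative to the quantified proteins only, so proteins excluded by the acceptance criteria do not contribute to the total.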

Toxin Sequencing
Mass spectrometry raw data of digested venom fractions were loaded and processed in PEAKS Studio 7.5 (Bioinformatics Solutions Inc.). De novo analysis was performed according to the following parameters: precursor ion mass tolerance of 10 ppm, fragment ion mass tolerance of 0.025 Da, maximum of one nonspecific cleavage, maximum of two missed cleavages and enzyme set according to the sample. Cys carbamidomethylation was set as a fixed modification and Asn/Gln deamidation, Met oxidation and N-terminal acetylation as variable modifications. Database searches were performed with the same parameters as the de novo analysis against the previously built A. rondoniae venom gland transcriptome (92,939 sequences and 251 common contaminants) utilizing de novo sequenced peptides with average local confidence (ALC) scores ≥50%. Posttranslational modifications and homology searches were performed through the PEAKS PTM and SPIDER modules, respectively. The false discovery rate (FDR) was estimated by the decoy fusion method (Zhang et al., 2012) and set to a maximum of 1%.
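The two mass tolerances quoted above work differently: the precursor tolerance is relative (ppm), while the fragment tolerance is absolute (Da). A minimal Python sketch, with hypothetical function names and no claim to reproduce the PEAKS matching logic, makes the distinction explicit.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def precursor_matches(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True when the precursor is within the relative tolerance (10 ppm in the text)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def fragment_matches(observed_mz, theoretical_mz, tol_da=0.025):
    """True when the fragment is within the absolute tolerance (0.025 Da in the text)."""
    return abs(observed_mz - theoretical_mz) <= tol_da
```

At m/z 1,000, a 10 ppm window corresponds to ±0.01, so the acceptable absolute deviation of a precursor scales with its m/z, whereas the fragment window stays fixed.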

Mature Toxin Validation
Primary structures of toxins identified by de novo and database search were submitted to analysis on the Spider|ProHMM module of ArachnoServer 3.0 in order to predict the cleavage sites of the signal peptide and propeptide, ultimately resulting in the prediction of their mature sequences. The predicted mature sequences were then compared with the sequences obtained experimentally, and if a correspondence was observed, the toxin was considered fully sequenced by LC-MS/MS. Mature sequences were also submitted to analysis in the MS-Product module of ProteinProspector v5.22 (http://prospector.ucsf.edu/prospector/mshome.htm), which provides theoretical fragmentations and precursor ion m/z values. The theoretical m/z values of precursor ions with charges ranging from +2 to +9 were compared to those assigned in Progenesis QI for proteomics and also manually validated in the raw data of the native toxins through Xcalibur (Thermo Scientific). If a peak corresponding to an m/z value of a mature toxin was found in the raw data and the consensus sequence was supported by MS/MS data, the presence of the toxin in the venom was validated.
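The theoretical precursor m/z values mentioned above follow from the monoisotopic mass M of the mature toxin and the charge z as m/z = (M + z × 1.007276) / z, where 1.007276 Da is the proton mass. The short Python sketch below illustrates this relation for charges +2 to +9; it is not the MS-Product implementation, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def precursor_mz(monoisotopic_mass, charge):
    """m/z of the [M + zH]z+ ion for a given neutral monoisotopic mass."""
    return (monoisotopic_mass + charge * PROTON_MASS) / charge


def mz_series(monoisotopic_mass, charges=range(2, 10)):
    """Theoretical m/z for each charge state from +2 to +9 (as in the text)."""
    return {z: precursor_mz(monoisotopic_mass, z) for z in charges}


def neutral_mass(mz, charge):
    """Inverse relation: neutral monoisotopic mass from an observed m/z."""
    return mz * charge - charge * PROTON_MASS
```

For a toxin in the 5 to 7 kDa range typical of CRPs, the +4 to +6 charge states fall inside the 300-1,500 m/z acquisition window used in the peptidomics runs.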

In Silico Anticancer and Antimicrobial Assays
For the prediction of possible biological activities, in silico analyses were performed using two tools. For antimicrobial activity, we utilized iAMPpred (Meher et al., 2017), a sequence-based computational tool that provides a score (ranging from 0 to 1) reflecting potential antibacterial, antiviral and antifungal properties. For this analysis, only scores >0.8 were considered significant. Gomesin from A. gomesiana hemolymph (Silva et al., 2000) was utilized as a positive control for antibacterial activity. Mouse β-defensin-4 (mBD4) and P9 (Zhao et al., 2016), a peptide derived from mBD4, were utilized as positive controls for antiviral activity. Rondonin from A. rondoniae hemolymph (Riciluca et al., 2012) and gomesin were utilized as positive controls for antifungal activity. For the prediction of antitumoral activities, we utilized MLACP (Manavalan et al., 2017), a sequence-based computational tool that utilizes two distinct machine-learning based algorithms, RFACP and SVMACP. Only peptides for which the two algorithms agreed, with both presenting scores >0.5, were considered potential ACPs, as recommended by the authors (Manavalan et al., 2017). For this analysis, we utilized gomesin (Ikonomopoulou et al., 2018), Aurein 1.2 from the frog Litoria aurea (Rozek et al., 2000) and human neutrophil peptide-1 (HNP-1) (Gaspar et al., 2015) as positive controls for anticancer activity.
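The decision rules applied to the predictor outputs can be summarized in a few lines of Python. The sketch below only encodes the thresholds stated above (iAMPpred score >0.8; both MLACP algorithms >0.5) and does not call either tool; the function names and the dictionary layout are hypothetical.

```python
AMP_THRESHOLD = 0.8  # iAMPpred significance cutoff used in this work
ACP_THRESHOLD = 0.5  # MLACP cutoff applied to both RFACP and SVMACP


def significant_activities(scores):
    """Return the activities whose iAMPpred score exceeds the cutoff.

    `scores` maps an activity name to its score, e.g.
    {"antibacterial": 0.985, "antifungal": 0.973}.
    """
    return sorted(name for name, score in scores.items() if score > AMP_THRESHOLD)


def is_potential_acp(rfacp_score, svmacp_score):
    """Consensus rule: both algorithms must score above the cutoff."""
    return rfacp_score > ACP_THRESHOLD and svmacp_score > ACP_THRESHOLD
```

Requiring agreement between the two MLACP algorithms trades sensitivity for specificity: a peptide supported by only one algorithm is not reported as a potential ACP.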

Net Charge and Amino Acids Composition Analysis
Net charge and amino acid composition were calculated using the R package Peptides (Osorio et al., 2015). For net charge analysis, only CRPs were selected and Cys residues were not considered due to disulfide bonds. For amino acid composition, only predicted ACPs were selected.
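Net charge at a given pH is commonly estimated with the Henderson-Hasselbalch equation, summing the fractional charges of the termini and the ionizable side chains. The Python sketch below illustrates the idea while ignoring Cys, as described above. It is not the Peptides package code: the pKa values are an assumed scale (EMBOSS-style values), and the scale actually used in this work is not stated, so absolute results may differ from the reported ones.

```python
# Assumed pKa values (EMBOSS-style scale); the scale used in the study is not stated.
PKA_POSITIVE = {"nterm": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"cterm": 3.6, "D": 3.9, "E": 4.1, "Y": 10.1}


def net_charge(sequence, ph=7.4):
    """Henderson-Hasselbalch net charge, with Cys excluded.

    Cys is left out because all cysteines are assumed to be engaged in
    disulfide bonds, following the criterion given in the text.
    """
    positive = ["nterm"] + [aa for aa in sequence if aa in ("K", "R", "H")]
    negative = ["cterm"] + [aa for aa in sequence if aa in ("D", "E", "Y")]
    charge = 0.0
    for group in positive:
        charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
    for group in negative:
        charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge
```

Because Cys is skipped, adding or removing cysteines does not change the result, and a peptide with a single Lys and free termini comes out close to +1 at pH 7.4.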

RESULTS

A. rondoniae Venom Gland Transcriptomics
Sequencing of the venom gland transcriptome of A. rondoniae resulted in a total of 46,511,000 raw paired reads. After the quality processing of raw reads, a total of 44,559,666 high-quality reads remained (95.8% of the total). De novo assembly using Trinity was performed on the high-quality reads, generating 150,409 transcripts with an N50 value of 892 bp, a median length of 473 bp and an average transcript length of 735 bp. These transcripts generated 92,889 unique predicted proteins from the transcriptome, which were used as the reference database for the peptidomics and proteomics analyses.
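For readers unfamiliar with the assembly statistics quoted above, N50 is the transcript length at which the cumulative length of transcripts, sorted from longest to shortest, first reaches half of the total assembly length. The Python sketch below computes it for a toy list of lengths and reproduces the retained-read percentage from the two read counts given in the text; the function names are hypothetical.

```python
def n50(lengths):
    """Length of the transcript at which half of the total assembly is covered."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # only reached for an empty input


def percent_retained(raw_reads, filtered_reads):
    """Percentage of reads kept after quality processing."""
    return 100 * filtered_reads / raw_reads


# Read counts reported in the text.
retained = percent_retained(46_511_000, 44_559_666)
```

An N50 (892 bp) above both the mean (735 bp) and the median (473 bp) is the usual pattern for de novo transcriptome assemblies, in which many short transcripts coexist with fewer long ones.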

A. rondoniae Venom Peptidomics
Raw mass spectral files (.raw) of native venom peptides were initially processed in Progenesis QI for proteomics, resulting in a total of 17,329 mature precursor ions recorded. After manual filtering for threshold establishment and clustering of redundant peptide ions, a total of 2,800 precursor ions were kept for further analysis (Figure 2A and Supplementary Table 1). The main clusters of precursor ions were observed around 5, 5.5, 6, and 7 kDa, mass values typically observed for CRPs from tarantula venoms (King and Hardy, 2013; Abreu et al., 2017). There is also a cluster in the mass range below 1.5 kDa, which represents short peptides also commonly found in tarantulas (King and Hardy, 2013). De novo sequencing, database searches and homology analysis of MS/MS spectra of digested venom peptides resulted in the identification of 12,032 peptide spectrum matches corresponding to 2,770 cleaved peptides (Supplementary Table 2) belonging to 74 different proteins (Supplementary Table 3). Among these 74 proteins, 62 were identified with more than two unique peptides and 12 with two unique peptides. The N-terminus of the mature toxins was determined by our multiple digestion approach, given that the same N-terminal amino acid was identified by consensus MS/MS spectra from different enzymes. Although the solid-phase extraction step of our method was designed to enrich venom peptides, we identified 55 proteins with mass above 15 kDa (Supplementary Table 3). However, the protein masses of entries in the transcriptomic database are calculated from the complete sequences, so they include the masses of the signal peptide and prodomains (Supplementary Table 3).
The theoretical m/z values of these 74 proteins were calculated in ProteinProspector and then compared to those present in the Progenesis QI for proteomics data (Supplementary Table 1). A total of 57 toxins were fully sequenced from the overlapping peptide fragments generated by multiple enzyme digestions, including possibly mutated CRPs and posttranslationally modified forms. To validate the identification of mature toxins, precursor ion spectra were manually analyzed in Xcalibur (Thermo Scientific) to confirm monoisotopic peak and charge state assignments. This resulted in the validation of 17 new toxins and also of U1-TRTX-Agm3a, a toxin described by our group in the Acanthoscurria gomesiana venom (Abreu et al., 2017). In total, 18 mature toxins were validated (Figure 2B and Table 1). The 17 new toxins comprise 10 CRPs, distributed in seven families, and seven short peptides (Table 1). The CRP families were named according to the nomenclature proposed by King et al. (2008). In order to add another level of validation to the mature CRPs identified, the whole translated sequences predicted by our transcriptome were processed in the Spider|ProHMM module of ArachnoServer 3.0. The mature sequences of six CRP families were confirmed by the propeptide and signal peptide cleavage site predictions, here represented by the predicted transcript of each family; the exception was U6-TRTX-Ar1a, which is two amino acid residues (NR) longer than predicted at the N-terminus (Supplementary Table 4).
We could not validate the presence of the mature forms of the other 40 possibly mutated or modified CRPs, and thousands of native precursor ions still remain to be identified. We consider that many of these forms are derived from the main CRP families identified, as clusters of masses are observed around the main seven classes reported here (Figure 2A and Table 1). Posttranslational modifications, mutations and proteolytic processing at alternative sites may result in a complex population of toxin proteoforms present in spider venoms. Incorrect assignment of monoisotopic peaks and charge states on acquisition may also limit identification and redundant ion clustering. In addition, native toxins of ~11 kDa (Figure 2A) could not have their mature forms identified and other CRP families may have been missed in our native peptidomic analysis. For in silico analysis, only the 18 validated toxins without any posttranslational modifications were utilized.

A. rondoniae Venom Proteomics
In the proteomics analysis of the A. rondoniae venom, 33 proteins were quantified. We only considered proteins identified with at least three peptides and present in two out of three replicates (Table 2). The most abundant venom protein was the cysteine-rich secretory protein (Ar-CRISP), composing 28% of the venom, followed by the CRP U3-TRTX-Ar1a (26%) and then U5-TRTX-Ar1a (15%). These first three toxins represent 69% of the A. rondoniae venom toxins. The CRP proteoforms identified in the peptidomics analysis were not included in the quantitative proteomics due to the difficulty of precisely quantifying sequences with high homology. However, U3-TRTX-Ar1b, which differs by only one amino acid from U3-TRTX-Ar1a (L44M), is the most intense peak among the native CRPs, corroborating the proteomics results. The most abundant toxin family is the CRPs, composing 58% of the venom (Table 2). The venom also contains significant amounts of the metalloprotease neprilysin-1 (Ar-Neprilysin-1, 8.4%) and hyaluronidase (Ar-Hyaluronidase, 1.5%). The proteins are homologous to those of the A. geniculata venom (Sanggaard et al., 2014). Ar-CRISP is 76% homologous to the putative cysteine-rich protease (L1941_T1/1_Tarantula_S_fr3), while Ar-Neprilysin-1 is 74% and Ar-Hyaluronidase is 80% homologous to the respective Membrane venom metalloendopeptidase-a (L67_T1/2_Tarantula_V_fr5) and Venom hyaluronidase (L1941_T1/1_Tarantula_S_fr3) from A. geniculata (Sanggaard et al., 2014).

In Silico Analysis Suggests Possible Antimicrobial Activities of CRPs and Antitumoral Activities of Short Peptides
For screening of the possible biological activities of A. rondoniae toxins, we performed in silico predictions using iAMPpred (Meher et al., 2017) and MLACP (Manavalan et al., 2017), two machine-learning tools that evaluate potential antimicrobial and antitumoral activities, respectively. Both tools output a score from 0 to 1. Higher scores suggest a higher probability of presenting the respective activity.
The results indicate that all new CRPs may have antimicrobial activities (Table 3). In general, higher scores were obtained for antibacterial and antifungal activities (>0.9 for both in all seven families), but all CRPs showed antiviral scores >0.5, with the lowest score being 0.685 for U2-TRTX-Ar1a, while all other toxins had scores higher than 0.8. The U3-TRTX-Ar1x family demonstrated the highest scores for antibacterial (>0.99) and antifungal activities (>0.97). As for short peptides, our results suggest a lower probability of presenting antimicrobial activity, except for the peptide VLPPLKF, which had scores above 0.79 for all three activities.
Our positive control for antibacterial activity, gomesin, obtained a score of 0.985, close to those observed for the seven CRP families. This was also observed for the positive controls of antifungal activity, gomesin and rondonin, with scores of 0.973 and 0.903, respectively (Table 3). Lastly, our positive controls for antiviral activity, mBD4 and P9, a peptide derived from mBD4, obtained scores of 0.773 and 0.940, respectively.
On the other hand, the results obtained for antitumoral activities indicate that short peptides of A. rondoniae are more prone to present antitumoral properties than the CRPs in this in silico analysis. Of the seven short peptides, four showed a consensus between the two algorithms, indicating potential antitumoral activities (Table 3). These short peptides are PLPVFV, VPPILKY, VVVPFVV and VLPPLKF. The two CRPs indicating potential antitumoral activity are U1-TRTX-Agm3a and U1-TRTX-Ar1b. Gomesin, aurein 1.2 and human neutrophil peptide-1 (HNP-1), used as positive controls for anticancer activity, showed consensus between the two algorithms (Table 3).

Net Charge Analysis and Amino Acid Composition
Of the 11 CRPs analyzed, seven presented a positive net charge at physiological pH (7.4), two are negatively charged and two are neutral (Figure 4). U6-TRTX-Ar1a presented the highest net charge at physiological pH (+6.0), followed by U3-TRTX-Ar1a/b with a net charge of +5.9. The negatively charged CRPs U2-TRTX-Ar1a and U5-TRTX-Ar1a presented net charge values of −0.96 and −0.95, respectively. As for amino acid composition, only the predicted ACPs were selected, totaling two CRPs and four short peptides. Our main goal was to evaluate the percentage of nonpolar (hydrophobic) residues in those toxins. We observed high percentages of hydrophobic residues (>60%) in CRPs and even higher percentages of hydrophobic residues (>85%) in short peptides (Table 4).
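The hydrophobic fraction of the four short peptides predicted as ACPs can be checked directly from their sequences. The Python sketch below assumes that the nonpolar class comprises A, C, F, G, I, L, M, P, V, W and Y; this residue set is an assumption about how the composition was computed and is not stated in the text. The four sequences are those reported in the Results.

```python
# Assumed nonpolar residue class; the exact class used in the study is not stated.
NONPOLAR = set("ACFGILMPVWY")

# Short peptides predicted as potential ACPs (from the text).
SHORT_ACPS = ["PLPVFV", "VPPILKY", "VVVPFVV", "VLPPLKF"]


def nonpolar_percent(sequence):
    """Percentage of residues that belong to the nonpolar class."""
    nonpolar_count = sum(1 for residue in sequence if residue in NONPOLAR)
    return 100 * nonpolar_count / len(sequence)


percentages = {peptide: nonpolar_percent(peptide) for peptide in SHORT_ACPS}
```

Under this assumption, the two peptides without Lys are entirely nonpolar and the two Lys-containing heptapeptides have six nonpolar residues out of seven, which is consistent with the >85% reported for short peptides.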

DISCUSSION
In this study, we applied a multiomics strategy to explore the venom composition of A. rondoniae, combined with in silico analysis, in order to prospect new toxins with possible therapeutic applications. Previously, one experimental study was conducted to characterize AMPs from A. rondoniae spiders, which led to the identification of rondonin, an antifungal peptide present in the spider hemolymph (Riciluca et al., 2012). To our knowledge, this work is the first analysis of A. rondoniae venom composition. Through this strategy, we sequenced the venom gland transcriptome, identified and quantified proteins, and determined the sequences of mature toxins of 17 new CRPs and short peptides present in the native venom, as well as one previously identified by our group, U1-TRTX-Agm3a (Abreu et al., 2017).
Homology searches and alignments demonstrated similarities of all seven CRP families to toxins reported in other spider venoms and, as expected, higher similarity to toxins of other Acanthoscurria spiders. The results indicate that these toxins may be biologically essential for the spider's survival and also highlight a close phylogenetic relationship within the genus. The U3-TRTX-Ar1x family demonstrated high similarity with the toxins U1-TRTX-Ap1a, U1-TRTX-Agm1a, and µ-TRTX-An1a, all from other Acanthoscurria spiders. It is important to note that this family also corresponds to the most abundant CRP in A. rondoniae venom, as observed by the quantitative proteomics and peptidomics approaches. These data highlight the relevance of the U3-TRTX-Ar1x family for A. rondoniae spiders. As in other spider venoms, enzymes such as neprilysin, hyaluronidase, and carboxypeptidases, among others, were found in the venom of A. rondoniae (Sanggaard et al., 2014; Borges et al., 2016; Kuhn-Nentwig et al., 2019). These enzymes may act in synergy with the neurotoxic CRPs to increase the spread and efficiency of the venom in the prey, as hypothesized elsewhere (Kuhn-Nentwig et al., 2019).
The antimicrobial activity predictions indicated that all new CRPs identified in A. rondoniae venom are predicted to be antimicrobial, while only one short peptide (sequence: VLPPLKF) demonstrated possible antibacterial and antifungal properties. Gomesin was used as a positive control for the antibacterial score since it has shown experimental activity against several Gram-positive and Gram-negative bacteria, such as Escherichia coli, Klebsiella pneumoniae, Bacillus spp. and Staphylococcus spp. (Silva et al., 2000). Gomesin scored 0.985 in iAMPpred for antibacterial activity, which is slightly lower than most of the CRPs analyzed (Table 3). As for antifungal activities, gomesin and rondonin were selected as positive controls. Gomesin demonstrated activity against the filamentous fungus Trichoderma viride as well as the yeast Candida albicans (Silva et al., 2000), while rondonin demonstrated activity against Candida albicans (Riciluca et al., 2012). The scores for antifungal activity obtained for gomesin and rondonin were 0.973 and 0.903, respectively, while all seven families of CRPs presented antifungal scores >0.92. Taken together, these results suggest that the new CRPs identified in this work may present antibacterial and antifungal activities, which should be further explored by in vitro and in vivo assays. Of the seven families of new CRPs, the U3-TRTX-Ar1x family presented the highest scores for antibacterial (>0.99) and antifungal (>0.98) activities, probably due to the high proportion of basic residues (Table 1), which directly affects net charge and isoelectric point. These toxins have a net charge of 5.94 at pH 7.4 (Figure 4). High positive net charges are relevant and possibly increase antimicrobial activity, as evidenced by other studies with cationic peptides (Jiang et al., 2008; Paredes-Gamero et al., 2012).
It is also important to note that these toxins present high isoelectric points of 10.60 and 10.52 for U3-TRTX-Ar1 and U6-TRTX-Ar1a, respectively (Figure 4). The net positive charges at neutral pH probably increase the efficiency of interaction with the negatively charged membranes of microorganisms (Jiang et al., 2008). Future experiments may confirm the antimicrobial activity of the A. rondoniae peptides.
Many of the CRPs presented in silico antiviral scores, although on average at lower levels than the antifungal and antibacterial scores (Table 3). For antiviral activity prediction, we used as controls mBD4 and P9, a peptide derived from mBD4 that has shown broad antiviral effects on respiratory viruses such as H1N1, H3N2, H5N1, H7N7, H7N9, SARS-CoV and MERS-CoV in both in vivo and in vitro assays (Zhao et al., 2016). The scores for antiviral activity of mBD4 and P9 were 0.773 and 0.940, respectively. The toxins from the U1-TRTX-Ar1x family demonstrated higher scores than P9 for antiviral activity, while all other families except U2-TRTX-Ar1x and U4-TRTX-Ar1x demonstrated scores between those obtained for mBD4 and P9, which also suggests a potential antiviral activity that should be further evaluated by in vitro and in vivo assays. Antiviral peptides may be promising therapeutic drugs (Vilas Boas et al., 2019). Some arthropod peptides were found to suppress viral gene expression, such as cecropin A from the moth Hyalophora cecropia, which inhibited HIV activity (Wachinger et al., 1998), and mucroporin-M1, a peptide derived from the venom of the scorpion Lychas mucronatus, which inhibited the activities of measles, SARS-CoV and influenza H5N1 viruses (Li et al., 2011). The authors proposed that mucroporin-M1 could act by interacting with the virus envelope, binding to it through surface charge interactions and drastically decreasing the infectivity of the three viruses (Li et al., 2011). Several of the A. rondoniae CRPs found in this work present positive net charges at physiological pH and could be promising antiviral peptides, although this is not the only property to be considered.
With regard to in silico antitumoral activities, the predictions suggest that only two CRPs, U1-TRTX-Ar1b and U1-TRTX-Agm3a, may present antitumoral properties, as well as four short peptides: VLPPLKF, PLPVFV, VVVPFVV and VPPILKY. For this analysis, we used HNP-1 and aurein 1.2 as positive controls. HNP-1, a human α-defensin AMP, showed cytotoxic activity against prostate tumor cells in vitro (Gaspar et al., 2015). Aurein 1.2, derived from the frog Litoria aurea, is another example of an AMP with anticancer activity, as demonstrated by in vitro assays (Rozek et al., 2000). Both controls scored >0.84 in both algorithms of MLACP and, consequently, were predicted as ACPs. Of the 11 CRPs, U1-TRTX-Agm3a and U1-TRTX-Ar1b have the shortest amino acid sequences, with 31 and 35 amino acids, respectively. It is also important to highlight that hydrophobicity plays a pivotal role in ACP activity (Huang et al., 2011) and, as shown in Table 4, these potential ACPs are composed of more than 50% hydrophobic residues. It is interesting to note that short peptides indicated potential anticancer activity. In a possible peptide therapy, short peptides may be advantageous because they are easier to synthesize and modify, penetrate tumors more readily and show good biocompatibility (Thundimadathil, 2012). Therefore, the toxins present in the venom of A. rondoniae may be promising candidates for the investigation of therapeutic compounds.
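The hydrophobic content discussed above can be estimated directly from a peptide sequence. The following is a minimal sketch rather than the procedure used to build Table 4: the function name is hypothetical, and the set of residues treated as nonpolar is an assumption, since classifications vary between scales (glycine and tyrosine in particular are assigned differently by different authors).

```python
# Assumed nonpolar residue set; G and Y are classified differently across scales.
DEFAULT_HYDROPHOBIC = frozenset("AVLIPFMWGY")


def hydrophobic_percent(sequence, hydrophobic=DEFAULT_HYDROPHOBIC):
    """Percentage of residues in the sequence that belong to the hydrophobic set."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    count = sum(1 for residue in seq if residue in hydrophobic)
    return 100.0 * count / len(seq)


# The four short peptides predicted as ACPs in this work.
short_peptides = ["PLPVFV", "VPPILKY", "VVVPFVV", "VLPPLKF"]
percentages = {pep: hydrophobic_percent(pep) for pep in short_peptides}
```

With tyrosine counted as nonpolar, all four short peptides come out above 85%, in line with the range reported for short peptides; a stricter residue set would lower the value for VPPILKY.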
The in silico anticancer and antimicrobial predictions proved to be important steps in our methodology, since they enabled a simple and fast screening of potential biological activities. It is important to highlight that the in silico predictions are indicative of potential biological activities. However, high scores in these predictions do not necessarily imply real antimicrobial or anticancer activities. To confirm these hypotheses, experimental work should be performed to evaluate the biological activities of these peptides in vitro and in vivo. Although not definitive, the predictions point to promising peptides and may serve as a guide in target selection for further steps of investigation, which is often a time-consuming task. These results demonstrate the effectiveness of a multiomics approach for toxin discovery, characterization and prospection of biological activities. The next steps would be the synthesis or expression of promising toxins to experimentally validate the activities predicted in silico.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the NCBI BioProject section under the accession code PRJNA633430 and BioSample SAMN14943686. The Transcriptome Shotgun Assembly was deposited at NCBI TSA under the accession GIOJ00000000. Mass spectrometry data were deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD019343.