DIVERSITY AND EVOLUTION OF ANIMAL VENOMS: NEGLECTED TARGETS, ECOLOGICAL INTERACTIONS, FUTURE PERSPECTIVES

EDITED BY: Sebastien Dutertre, Maria Vittoria Modica, Mande Holford and Kartik Sunagar

PUBLISHED IN: Frontiers in Ecology and Evolution

### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-712-6 DOI 10.3389/978-2-88963-712-6

# DIVERSITY AND EVOLUTION OF ANIMAL VENOMS: NEGLECTED TARGETS, ECOLOGICAL INTERACTIONS, FUTURE PERSPECTIVES

Topic Editors:

Sebastien Dutertre, Centre National de la Recherche Scientifique (CNRS), France; Maria Vittoria Modica, Stazione Zoologica Anton Dohrn, Italy; Mande Holford, Hunter College (CUNY), United States; Kartik Sunagar, Indian Institute of Science (IISc), India

Citation: Dutertre, S., Modica, M. V., Holford, M., Sunagar, K., eds. (2020). Diversity and Evolution of Animal Venoms: Neglected Targets, Ecological Interactions, Future Perspectives. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-712-6


# Editorial: Diversity and Evolution of Animal Venoms: Neglected Targets, Ecological Interactions, Future Perspectives

Maria Vittoria Modica<sup>1,2</sup>\*, Kartik Sunagar<sup>3</sup>, Mandë Holford<sup>4,5,6,7</sup> and Sébastien Dutertre<sup>2</sup>

<sup>1</sup> Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Naples, Italy, <sup>2</sup> Institut des Biomolécules Max Mousseron (IBMM), UMR 5247, CNRS, Université de Montpellier, Montpellier, France, <sup>3</sup> Evolutionary Venomics Lab, Centre for Ecological Sciences, Indian Institute of Science, Bangalore, India, <sup>4</sup> Department of Chemistry and Biochemistry, Hunter College, New York, NY, United States, <sup>5</sup> PhD Programs in Biochemistry, Chemistry, and Biology, The Graduate Center of the City University of New York (CUNY), New York, NY, United States, <sup>6</sup> Department of Invertebrate Zoology, The American Museum of Natural History, New York, NY, United States, <sup>7</sup> Department of Biochemistry, Weill Cornell Medicine, New York, NY, United States

Keywords: venom, evolutionary ecology, biodiversity, comparative genomics, toxins, biotechnology

**Editorial on the Research Topic**

### **Diversity and Evolution of Animal Venoms: Neglected Targets, Ecological Interactions, Future Perspectives**

Edited and reviewed by: Li Chen, Institute of Zoology (CAS), China

> \*Correspondence: Maria Vittoria Modica mariavittoria.modica@szn.it

### Specialty section:

This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 17 January 2020 Accepted: 02 March 2020 Published: 24 March 2020

### Citation:

Modica MV, Sunagar K, Holford M and Dutertre S (2020) Editorial: Diversity and Evolution of Animal Venoms: Neglected Targets, Ecological Interactions, Future Perspectives. Front. Ecol. Evol. 8:65. doi: 10.3389/fevo.2020.00065

Many animals rely on the production of complex venoms for predation and defense. Their venom arsenal is often a mix of bioactive compounds, produced in specialized tissues and delivered through a broad range of fascinating anatomical structures. Venoms have evolved through millions of years of natural selection, in intricate co-evolutionary "arms races," where the venomous animal plays the prey, the predator, or often both. As a result, venoms are extremely potent, being usually effective at a very low concentration via highly specific interactions with key physiological targets. Many venom toxins target the neuromuscular system, while others possess extremely potent anticoagulant, anesthetic, and hypotensive activities, making them important biotechnological tools. This Research Topic will discuss the diversity and evolution of venom with a focus on expanding beyond model systems to examine neglected taxa, and characterizing ecological interactions in venomous organisms.

The biotechnological relevance of venom compounds has been relatively well exploited in snakes, from which captopril®, one of the first FDA-approved drugs derived from animal venoms, was obtained in the 1970s. Captopril is a potent inhibitor of the angiotensin-converting enzyme (ACE) used to treat hypertension and heart failure. However, as reviewed in this Research Topic, potential exploitation of snake venom as a bioresource goes well beyond biomedicine, including applications in diagnostics and in cosmetics (Ferraz et al.).

In addition to having an indisputable potential in drug discovery research, venomous taxa are also highly suitable for investigating ecological interactions between organisms, along with their evolutionary implications. Venom composition is highly variable, both among and within species, as it is shaped by various biotic and abiotic factors, including prey and predator pressures, environmental conditions, and phylogenetic histories. Despite the technological advancements in the last two decades, especially in the areas of high-throughput sequencing technologies, mass spectrometry and genomic manipulations, the precise role of the aforementioned factors in shaping the animal venom arsenal is yet to be fully elucidated.

In this Research Topic, the importance of broadening investigation to neglected taxa and concepts has been emphasized by Jackson et al. who point out that overlooked lineages can be of great interest for toxinologists, especially in the context of venom ecology and evolution. Their review highlights that, among snakes, only the species that are capable of inflicting clinically severe envenomations in humans have been the focus of research, while most "non-front fanged" species have been largely neglected despite their high phylogenetic and ecological diversity (Jackson et al.).

The link between trophic diversification and venom evolution makes the investigation of neglected lineages with specific prey particularly promising in terms of novel activities, as reviewed here by Modahl and Mackessy. Recent studies have not only allowed the characterization of novel protein families in rear-fanged snakes (e.g., veficolins, matrix metalloproteinases, and acid lipases), but have also led to the identification of new functions in well-characterized toxin families, including three-finger toxins (Modahl and Mackessy).

As with snakes, despite a long history of venom studies in spiders, a large number of taxa remain overlooked. In this Research Topic, Zobel-Thropp et al. describe, for the first time, the venom composition of pholcid spiders, popularly known as daddy long-legs, a diverse and ancient lineage of spiders that are generalist foragers of arthropods, including other spider species. Their work reveals the complexity of Physocyclus mexicanus venom, suggesting that the neprilysin family plays a primary role in its activity, and highlighting its potent toxicity in arthropods with negligible effects on humans (Zobel-Thropp et al.).

Among cnidarians, corals represent yet another group of venomous organisms that are worthy of further investigation. However, to date, coral venom characterization has been hampered by inherent technical difficulties. As pointed out in this Research Topic, even the best-characterized coral toxins, the SCRiPs, have not been subjected to rigorous biochemical and functional testing, despite exhibiting potent neurotoxic and antimicrobial activities (Schmidt et al.).

Besides the focus on untapped taxa, several contributions to this Research Topic tackle the origin of venom complexity at different levels. At the intraspecific scale, population proteomics combined with multivariate statistical analyses allowed the detection of signatures of natural selection in two species of parasitoid wasps, and the identification of population-specific proteins, some of which are responsible for virulence (Mathé-Hubert et al.). This work reveals the eco-evolutionary feedback between the organisms and the ecosystems, with important implications for biological control.

In scorpions, long-scale evolutionary adaptations to different environments have been demonstrated to affect both venom composition and stinger morphology. Here, Evans et al. highlight the role of individual-level adaptations in optimizing the cost-benefit balance of venom use. These adaptations include both behavioral plasticity in response to predator/prey identity, and modification of venom composition according to different predatory pressures (Evans et al.).

The molecular mechanisms underlying venom complexity are still not fully understood, but a dominant role was attributed to gene duplication events. To comprehend the origin and evolution of toxin genes and morphological adaptation, a comparative genomic approach may be crucial. Comparative genomics revealed the importance of mechanisms that do not fit the classical model of gene function evolution, such as the cis-regulated changes in the venom gland expression in the parasitoid wasp, Nasonia vitripennis, and the horizontal transfer of toxin genes from bacteria to the cnidarian Nematostella vectensis (Drukewitz and von Reumont).

Similarly, by integrating transcriptomic, proteomic and genomic data from the common house spider, Parasteatoda tepidariorum, Haney et al. shed light on the role of alternative splicing in generating venom complexity. This process may be influenced by environmental cues, representing a possible mechanistic link between environmental pressures and adaptive changes in venom composition. Thus, in the absence of preceding gene duplication events, alternative mechanisms may operate to modify venom composition (Haney et al.; Drukewitz and von Reumont).

Altogether, this Research Topic covers many of the leading-edge trends in venom research. These trends show great promise for advancing our knowledge about the enormous biodiversity on the planet, while identifying new toxins with diverse biotechnological potential.

# AUTHOR CONTRIBUTIONS

MM wrote the original draft. SD, KS, and MH reviewed and edited the manuscript. All the authors read and approved the final manuscript.

# ACKNOWLEDGMENTS

MM acknowledges funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 748902. MH acknowledges funding from Hunter College Presidential Faculty Scholar Fellowship.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Modica, Sunagar, Holford and Dutertre. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Alternative Transcription at Venom Genes and Its Role as a Complementary Mechanism for the Generation of Venom Complexity in the Common House Spider

Robert A. Haney <sup>1</sup> \*, Taylor Matte<sup>2</sup> , FitzAnthony S. Forsyth<sup>1</sup> and Jessica E. Garb<sup>1</sup>

*<sup>1</sup> Department of Biological Sciences, University of Massachusetts Lowell, Lowell, MA, United States, <sup>2</sup> Center for Regenerative Medicine, Boston University, Medical, Boston, MA, United States*

### Edited by:

*Sebastien Dutertre, Centre National de la Recherche Scientifique (CNRS), France*

### Reviewed by:

*Jianqiang Wu, Kunming Institute of Botany (CAS), China; Marjorie A. Lienard, Broad Institute, United States*

> \*Correspondence: *Robert A. Haney robert.a.haney@gmail.com*

### Specialty section:

*This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution*

Received: *04 November 2018* Accepted: *06 March 2019* Published: *24 April 2019*

### Citation:

*Haney RA, Matte T, Forsyth FS and Garb JE (2019) Alternative Transcription at Venom Genes and Its Role as a Complementary Mechanism for the Generation of Venom Complexity in the Common House Spider. Front. Ecol. Evol. 7:85. doi: 10.3389/fevo.2019.00085*

The complex composition of venom, a proteinaceous secretion used by diverse animal groups for predation or defense, is typically viewed as being driven by gene duplication in conjunction with positive selection, leading to large families of diversified toxins with selective venom gland expression. Yet, the production of alternative transcripts at venom genes is often overlooked as another potentially important process that could contribute proteins to venom, and requires comprehensive datasets integrating genome and transcriptome sequences together with proteomic characterization of venom to be fully documented. In the common house spider, *Parasteatoda tepidariorum*, we used RNA sequencing of four tissue types in conjunction with the sequenced genome to provide a comprehensive transcriptome annotation. We also used mass spectrometry to identify a minimum of 99 distinct proteins in *P. tepidariorum* venom, including at least 33 latrotoxins, pore-forming neurotoxins shared with the confamilial black widow. We found that venom proteins are much more likely to come from multiple transcript genes, whose transcripts produced distinct protein sequences. The presence of multiple distinct proteins in venom from transcripts at individual genes was confirmed for eight loci by mass spectrometry, and is possible at 21 others. Alternative transcripts from the same gene, whether encoding or not encoding a protein found in venom, showed a range of expression patterns, but were not necessarily restricted to the venom gland. However, approximately half of venom protein encoding transcripts were found among the 1,318 transcripts with strongly venom gland biased expression. Our findings revealed an important role for alternative transcription in generating venom protein complexity and expanded the traditional model of venom evolution.

Keywords: venom, transcriptome, toxins, Parasteatoda, alternative transcript

# INTRODUCTION

Venoms exhibit a high degree of biochemical complexity, with the venom of an individual organism being composed of up to hundreds of distinct proteins or peptides (Escoubas et al., 2006; King and Hardy, 2013). Such complexity may enable the rapid immobilization of a phylogenetically divergent array of prey and simultaneous disabling of many cellular functions (Olivera, 1997; Ushkaryov et al., 2004; Casewell et al., 2014). This complexity also makes venoms important resources for therapeutic drug discovery (Lewis and Garcia, 2003; Escoubas and King, 2009; Netirojjanakul and Miranda, 2017), targeted insecticide development (Schwartz et al., 2012; Windley et al., 2012; Smith et al., 2013), and neurophysiological research (Ushkaryov et al., 1992, 2004; Ashton, 2001; Südhof, 2001; Deak et al., 2009; Silva et al., 2009).

The typical explanation for the generation of the diverse venom protein complement proposes a dominant role for gene duplication. After an initial gene duplication event, a shift to venom gland limited expression of the duplicate gene occurs. This is followed by further gene duplication events that generate diversity through the production of venom gland restricted toxin families with functionally distinct paralogs molded by the action of positive selection (Kordis and Gubensek, 2000; Fry et al., 2009; Wong and Belov, 2012; Schwager et al., 2017). Yet, other mechanisms such as the production of alternative transcripts expressed in the venom gland and producing venom toxins via alternative transcriptional start or polyadenylation sites, or through alternative splicing, could also generate protein diversity (Nilsen and Graveley, 2010; Pal et al., 2011; de Klerk and 't Hoen, 2015). Alternative transcription could thus act to enhance venom diversity, yet has been reported in only a small number of individual cases. For example, alternative splicing has been suggested to account for novel venom protein variants in snakes, lizards, and wasps (Siigur et al., 2001; Fry et al., 2010; Viala et al., 2015; Yan et al., 2017). However, the putative alternatively spliced variants in these studies were derived from de novo assemblies of short-read data or EST sequencing, and no genomic information was available to confirm whether these variants came from the same locus. Furthermore, while sequenced genomes, together with transcriptomic or proteomic datasets, are available for a limited set of venomous organisms (e.g., de Plater et al., 1995; Torres et al., 2000; Warren et al., 2008; Whittington et al., 2008, 2010; Kita et al., 2009; Sanggaard et al., 2014), no comprehensive exploration of the role of alternative transcription at venom genes or in venom composition has been undertaken in genome sequenced species.

In addition, the presence of alternative transcripts at venom genes may require a modified understanding of how an evolutionary shift in expression of a toxin transcript to the venom gland might be accomplished. If a gene that is duplicated has preexisting multiple transcripts, potentially only one of several alternative transcripts might shift expression to the venom gland, or such a shift to venom gland restricted expression could occur with no duplication from a gene that has multiple transcripts. In humans and other organisms, the use of alternative start or polyadenylation sites, or alternative splicing, plays an important role in generating tissue-specific transcripts from existing genes (Farajzadeh et al., 2013; Hestand et al., 2015; Sanfilippo et al., 2017; Reyes and Huber, 2018). This process could provide a mechanism for altering venom composition through the generation of novel toxins expressed in the venom gland that circumvents the need for preceding gene duplication events.

Spiders are now amongst the venomous species with sequenced genomes (Sanggaard et al., 2014; Babb et al., 2017; Schwager et al., 2017), including the common house spider Parasteatoda tepidariorum, an emerging model for the study of arthropod development and evolution (Hilbrant et al., 2012). Although P. tepidariorum is a member of the spider Family Theridiidae along with the notorious black widows of the genus Latrodectus, its bites are generally not harmful to humans (Isbister and Gray, 2003; Isbister and White, 2004). Thus, this species is of importance for the study of venom evolution, as a contrast to black widow venom with its extreme toxicity to vertebrates. Additionally, a recent investigation of an earlier version of the P. tepidariorum genome assembly generated by the i5k (5,000 arthropod genomes) consortium focused on the diversity and evolution of two gene families whose proteins are abundant in black widow venom: latrotoxins and latrodectins (Gendreau et al., 2017), including a preliminary analysis of expression in a single venom gland RNA-Seq library. However, a full evaluation of the contribution of alternative transcriptional events at venom genes to venom diversity requires high-quality genomes together with replicated transcriptomes, as well as data on the protein composition of venom.

Here we integrated newly obtained data from multi-tissue RNA-Seq and tandem mass spectrometry (MS/MS) with an improved assembly of the recently sequenced P. tepidariorum genome (Schwager et al., 2017) to reconstruct transcripts at genomic loci, and characterize the complexity of this species' venom by identifying the protein products of transcripts with MS/MS data from the secreted venom itself. In order to investigate the importance of alternative transcription in generating venom diversity, we tested whether alternative transcripts are characteristic of genes encoding venom proteins, and also whether alternative transcripts from the same locus may generate multiple venom toxins with distinct protein sequences or architectures. We also explored the expression patterns across tissues of alternative transcripts found at venom genes. In particular, we tested whether the expression of venom protein encoding transcripts is restricted to, or highest in, the venom gland, as predicted in a traditional model of venom evolution (Fry et al., 2009), and as expected given that their products are presumed to be secreted toxins. We also tested whether venom genes have all transcripts, or only a subset, that are venom gland restricted, or whether some transcripts at venom genes are primarily expressed outside of the venom gland, suggesting that they may have alternative functions.

# MATERIALS AND METHODS

# Laboratory Protocols for RNA-Sequencing

Adult female P. tepidariorum were maintained in the laboratory on a diet of crickets, and were fed 3 days prior to being placed under 5 min of CO<sub>2</sub> anesthesia for extraction of venom glands, silk glands, and ovaries and isolation of the cephalothorax after venom gland removal. Extraction of RNA was performed with a modified Trizol protocol, followed by column purification using an RNeasy kit. As it was necessary to pool venom glands from 11 to 12 individuals to obtain sufficient material for RNA extraction, and to average out inter-individual variability, we also combined silk glands, ovaries and cephalothoraxes from 3 individuals in each replicate. Two replicates per tissue type were generated, for a total of eight biologically replicated RNA-Seq libraries, with each replicate from a different pool of individuals.

# Transcript Reconstruction and Expression Analysis

Quality control and library preparation for Illumina sequencing were performed at the Deep Sequencing Core Laboratory of the University of Massachusetts Medical Center. All cDNA libraries were sequenced using a 100 bp paired-end protocol on an Illumina HiSeq 4000. Low-quality sequence was trimmed from reads at a Phred score threshold of 20 and Illumina adapters were removed with TrimGalore 0.3.7 (https://github.com/FelixKrueger/TrimGalore) incorporating FastQC 0.11.3 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) to verify read quality: median quality scores were at least 30 across all bases in all reads. Trimmed reads were aligned to the P. tepidariorum genome assembly (NCBI Accession: AOMJ00000000.2; Schwager et al., 2017) using Hisat2 2.0.5 (Kim et al., 2015) with settings for downstream transcriptome assembly, a maximum sequence mismatch penalty of 4 and a minimum mismatch penalty of 1. To generate a more comprehensive reference annotation incorporating expression information from isolated tissue libraries, alignments were performed for each of the eight libraries from this study, two libraries (venom gland and silk gland) from a previous study (Gendreau et al., 2017) and a single library from ovary tissue produced by the i5k project (i5KConsortium, 2013). Stringtie 1.2.4 (Pertea et al., 2015) was then used to assemble transcripts using an existing annotation for the Schwager et al. assembly as a reference in each run. Individual assemblies from each library were merged with the reference genome annotation for P. tepidariorum to produce a final annotation, which was used for estimation of transcript abundance. All transcripts in the final merged annotation were identified by comparison to the nr database using the blastx command in Diamond 0.8.23 (Buchfink et al., 2014) using the best local alignment at an e-value cutoff of 1e-5. The final merged annotation was compared to the reference annotation to identify novel genes expressed from previously intergenic regions (class code "u") using Cuffcompare in the Cufflinks suite (Roberts et al., 2011).
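The best-hit selection at an e-value cutoff of 1e-5 can be illustrated with a minimal Python sketch. It assumes the alignments are available as 12-column tab-separated text (the layout DIAMOND and BLAST use for tabular output, with the e-value in the eleventh column) and that "best" means the lowest e-value per query. The function name `best_hits` and the example rows are hypothetical and are not taken from the study.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # cutoff stated in the text


def best_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return {query: (subject, evalue)} keeping the lowest e-value hit per query.

    Assumes 12 tab-separated columns per row: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > cutoff:
            continue  # hit is weaker than the cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical example rows (identifiers are illustrative only).
example = (
    "t1\tsubjA\t90.0\t100\t5\t0\t1\t300\t1\t100\t1e-30\t200\n"
    "t1\tsubjB\t95.0\t100\t3\t0\t1\t300\t1\t100\t1e-50\t250\n"
    "t2\tsubjC\t40.0\t50\t20\t2\t1\t150\t1\t50\t1e-3\t40\n"
)
hits = best_hits(example)
```

In this example transcript `t1` keeps its stronger hit and `t2` is left unannotated because its only hit falls short of the cutoff.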

Analysis of differential expression was performed using the R package EBSeq (Leng et al., 2013). Read counts for use in EBSeq were generated using the prepDE script provided with Stringtie and five counts were added to each value to prevent underflow errors during calculations in R when counts were zero in both replicates for a tissue. We performed pairwise comparisons between venom gland and the three other tissues and identified transcripts with expression significantly upregulated in the venom gland at a false discovery rate of 5% in each comparison. The intersection of the three sets constituted the final set of venom gland upregulated transcripts. To confirm novel splice junctions and to calculate junction support as reads mapped to junctions present in the merged final annotation, we used STAR 2.5.0a (Dobin et al., 2013) using default parameters.
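The intersection step described above can be sketched in Python as follows. This is an illustration under stated assumptions and not the authors' R code: each pairwise comparison is represented as a dictionary of hypothetical transcript IDs mapped to a posterior probability of differential expression and a log2 fold change, and a transcript is called at a 5% false discovery rate when that probability is at least 0.95. The helper name and all values are invented for the example.

```python
FDR = 0.05  # false discovery rate applied in each pairwise comparison


def venom_gland_upregulated(comparisons, fdr=FDR):
    """Return transcripts upregulated in venom gland in every comparison.

    `comparisons` maps a tissue name to {transcript_id: (ppde, log2_fc)},
    where ppde is an assumed posterior probability of differential
    expression and a positive log2_fc means higher venom gland expression.
    """
    per_tissue = []
    for results in comparisons.values():
        up = {
            transcript
            for transcript, (ppde, log2_fc) in results.items()
            if ppde >= 1 - fdr and log2_fc > 0
        }
        per_tissue.append(up)
    # Keep only transcripts that pass in all pairwise comparisons.
    return set.intersection(*per_tissue) if per_tissue else set()


# Hypothetical results for venom gland versus the three other tissues.
comparisons = {
    "silk_gland": {"t1": (0.99, 3.2), "t2": (0.97, 1.1), "t3": (0.50, 2.0)},
    "ovary": {"t1": (0.98, 2.5), "t2": (0.96, -0.4), "t3": (0.99, 4.0)},
    "cephalothorax": {"t1": (0.995, 5.0), "t2": (0.99, 2.0), "t3": (0.99, 1.0)},
}
upregulated = venom_gland_upregulated(comparisons)
```

Requiring a transcript to pass all three comparisons is conservative: a transcript that is significantly higher in venom gland than in only one or two of the other tissues is excluded from the final set.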

# Protein Identification and Sequence Analysis

Milked venom was obtained from SpiderPharm (Yarnell, AZ) where it was produced by pooling venom extracted by electrostimulation from ∼300 anesthetized adult female P. tepidariorum, in order to generate sufficient material and to capture the broadest sample of venom components. Lyophilized venom was separated on polyacrylamide gels and divided into 8 fractions for subsequent analysis, including a small peptide fraction. Each fraction was trypsin digested, and separated on a Thermo/Proxeon nano-HPLC apparatus. Eluted proteins were subjected to mass spectrometry using a Linear Trap Quadrupole (LTQ) Velos Orbitrap tandem mass spectrometer at the University of Arizona Cancer Center Proteomics Shared Resource. Three technical replicates of this analysis were performed. A database for searching with mass spectra was produced from predicted proteins from all transcripts in the final merged annotation of the P. tepidariorum genome assembly. While polymorphism is present in the RNA-Seq reads, database transcripts were each represented by a single consensus sequence derived from the genome assembly that summarizes mappings from reads across all libraries. Proteins were predicted first by translating all open reading frames >90 nucleotides for each transcript and then using an in-house Perl script to choose the longest protein in the frame of the best blastx hit, or the longest protein in the absence of a BLAST hit. Sets of transcripts with 100% identical predicted proteins were identified using CD-hit (Li and Godzik, 2006) after the trimming of protein predictions to the first methionine residue. Protein domain predictions were performed with InterProScan (Quevillon et al., 2005). Predictions of protein toxicity were performed with Clantox (Naamati et al., 2009), and of inhibitory cystine knot structural folds with Knoter1d (Gracy et al., 2007).
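The protein prediction rule (translate open reading frames longer than 90 nucleotides, prefer the frame of the best blastx hit, otherwise take the longest, then trim to the first methionine) can be sketched in Python. This is not the in-house Perl script, which is not shown in the text. The sketch makes several simplifying assumptions: only the three forward frames are considered, an open reading frame is taken to be any stretch between stop codons, the standard genetic code applies, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate a forward-strand frame (0, 1 or 2); unknown codons become X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(frame, len(seq) - 2, 3)
    )


def candidate_orfs(seq, min_nt=90):
    """Yield (frame, protein) for stop-free stretches longer than min_nt."""
    for frame in range(3):
        for segment in translate(seq, frame).split("*"):
            if len(segment) * 3 > min_nt:
                yield frame, segment


def pick_protein(seq, blast_frame=None, min_nt=90):
    """Longest candidate in the BLAST hit frame, or the longest overall."""
    orfs = list(candidate_orfs(seq, min_nt))
    if blast_frame is not None:
        orfs = [orf for orf in orfs if orf[0] == blast_frame]
    if not orfs:
        return None
    protein = max(orfs, key=lambda orf: len(orf[1]))[1]
    start = protein.find("M")  # trim to the first methionine when present
    return protein[start:] if start != -1 else protein


# Hypothetical transcript: two leading codons, then ATG, 35 alanines, a stop.
transcript = "GCT" * 2 + "ATG" + "GCT" * 35 + "TAA"
protein = pick_protein(transcript, blast_frame=0)
```

Trimming to the first methionine before comparing sequences matters because two transcripts that differ only in their 5' untranslated region would otherwise yield predictions that look distinct.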

Spectra were matched to proteins in the reference database using Sequest version 1.3.0.339 and the results were viewed in Scaffold 4.4.7. To consider a protein as present in the venom, peptide probabilities were set at 95% and protein probabilities at 99%, with a minimum of 2 peptide matches per protein required. The decoy false discovery rate for these settings was 0%. Proteins with shared peptides that could not be discriminated were placed into groups, while proteins that were not supported by independent evidence were deemed not present in venom. The identified protein list was purged of common contaminants (keratin, trypsin) and of a likely arthropod contaminant (hemocyanin).
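The acceptance criteria above amount to a simple filter, sketched here in Python. Scaffold applies these thresholds internally, so the data structure, field names and example values below are hypothetical and serve only to make the rule explicit: protein probability of at least 99%, at least 2 peptides each with probability of at least 95%, and no match to the listed contaminants.

```python
from dataclasses import dataclass

CONTAMINANTS = ("keratin", "trypsin", "hemocyanin")  # purged in the text


@dataclass
class Identification:
    name: str
    protein_prob: float   # protein-level probability, 0 to 1
    peptide_probs: tuple  # probabilities of the matched peptides


def accepted(ident, min_peptide_prob=0.95, min_protein_prob=0.99, min_peptides=2):
    """Apply the thresholds stated in the text to one identification."""
    confident_peptides = [p for p in ident.peptide_probs if p >= min_peptide_prob]
    is_contaminant = any(c in ident.name.lower() for c in CONTAMINANTS)
    return (
        ident.protein_prob >= min_protein_prob
        and len(confident_peptides) >= min_peptides
        and not is_contaminant
    )


# Hypothetical identifications.
candidates = [
    Identification("latrotoxin-like", 0.995, (0.97, 0.99)),
    Identification("Keratin, type II", 1.0, (0.99, 0.99)),
    Identification("ICK-like", 0.995, (0.97, 0.90)),
    Identification("CRISP-like", 0.98, (0.99, 0.99)),
]
venom_proteins = [c.name for c in candidates if accepted(c)]
```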

# RESULTS

# RNA-Seq Based Transcript Prediction Helps Reveal Venom Composition

Novel genes and transcripts were predicted from the P. tepidariorum genome assembly using eight new RNA-Seq libraries, as well as three previously published libraries (Gendreau et al., 2017), and merged with an existing annotation for this genome (Schwager et al., 2017). This resulted in a final tally of 58,158 genes and 90,093 transcripts (**Supplementary File 1**), of which 23,989 novel genes and 25,214 novel transcripts were defined as expressed regions of the genome that did not overlap with any existing annotated gene. However, only 4,718 (19.7%) of these putative novel genes had a BLAST hit to the nr database at an e-value of 1e-5 or better, a far lower proportion than that for the total number of genes with a BLAST hit in the entire genome (73.6%), and the function of the remaining expressed regions remains unknown.

The SDS-PAGE gels of P. tepidariorum venom showed multiple protein bands, including high molecular weight bands in the size range of latrotoxins (110–130 kDa), as well as a number of smaller bands, with the smallest in the 10–15 kDa range (**Figure S1**, **Figure S2**). After discarding contaminants, we identified via tandem mass spectrometry a minimum of 99 proteins across all gel slices in the venom of the common house spider (**Table S1**). Fifty-three proteins were unique identifications: a single protein that could be discriminated from other proteins in the database by the peptides identified in the experiment. However, 46 identifications were of a group of proteins that were not distinguishable by the peptide data. Further, 26 of these 46 groups contained 2–5 proteins that were distinct in sequence when compared from the putative start codon, yielding a potential maximum of 139 distinct proteins in the venom. The remaining 20 groups contained only identical proteins, but these proteins had been included in the database as they were produced by distinct transcripts.
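The minimum and maximum counts follow from a simple rule: each identification contributes one protein to the minimum and all of its distinct sequences to the maximum. The Python sketch below shows that rule on toy input and checks the arithmetic implied by the reported totals. The function name `venom_protein_bounds` and the toy sequences are illustrative assumptions; the per-group membership itself is in Table S1 and is not reproduced here.

```python
def venom_protein_bounds(identifications):
    """Return (minimum, maximum) numbers of distinct venom proteins.

    identifications: one inner list of protein sequences per MS
    identification (a single sequence for a unique identification,
    several for an indistinguishable group).
    """
    minimum = len(identifications)                      # one protein per identification
    maximum = sum(len(set(group)) for group in identifications)
    return minimum, maximum


# Arithmetic implied by the totals reported in the text.
unique_ids = 53
groups = 46
identical_only_groups = 20

minimum_reported = unique_ids + groups                  # 99 identifications
ambiguous_groups = groups - identical_only_groups       # 26 groups with distinct members
# Distinct proteins the 26 ambiguous groups must hold to reach a maximum of 139.
distinct_in_ambiguous = 139 - unique_ids - identical_only_groups
```

Under these totals the 26 ambiguous groups hold 66 distinct proteins, about 2.5 per group, which is consistent with the stated range of 2–5 proteins per group.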

For enumeration of types of venom components, we provide a conservative minimum estimate assuming that only a single predicted protein from a group contributed an individual identification (99 total proteins), and a maximum estimate that includes all proteins from each group (139 total) with distinct sequence, but that could not be discriminated. Proteins were assigned to toxin or other categories of interest using the BLASTx hit with lowest e-value for the transcript from which they were predicted. Latrotoxins were the most numerous type of known toxin in P. tepidariorum venom, with 33–38 different latrotoxin proteins present. However, latrodectins (small accessory proteins to latrotoxins), inhibitory cysteine knot toxins (ICKs) and cysteine-rich secretory proteins (CRISP) were not diverse in venom, with only 1–2 representatives each (**Table 1**; **Figure 1**; **Table S1**). There were 3–7 proteases, and 3–7 proteins corresponding to other enzymes (lipases, amylases, and hyaluronidases) in the venom. We identified 6–9 proteins with homology to leucine-rich repeat proteins (**Table 1**; **Figure 1**; **Table S1**), which were also identified in confamilial black widow venom (Haney et al., 2014). There were 24–34 distinct proteins in venom with a best BLAST hit to uncharacterized proteins (**Table 1**), including 8–9 proteins derived from genes in a novel family identified by Gendreau et al. (2017) with transcripts having very high expression in P. tepidariorum venom glands, and 2–3 proteins lacking a BLAST hit. Of these 26–37 distinct uncharacterized or unknown proteins, 14–17 were <200 amino acids in length, and rich in cysteines (6–14 Cys residues). However, zero were predicted to have an inhibitory cystine knot structural motif, although 1 was predicted as a probable toxin and 3 were labeled as possible toxins by the Clantox server (**Table S1**).

TABLE 1 | Numbers of proteins corresponding to different known toxin or other category of interest in venom, or from a transcript upregulated in the venom gland, or both, as determined by BLAST homology.

\**lipase, amylase, hyaluronidase. Numbers separated by dashes indicate minimum and maximum estimates of the number of proteins of a given type, given ambiguity in the MS identifications for similar proteins. To calculate the number that are both in venom and are VGTup, we used the maximum estimate of distinct proteins in venom. CRISP, cysteine-rich secretory protein; ICK, inhibitory cysteine knot toxin; LRR, leucine-rich repeat protein; Novel Family, putative toxin family identified in Gendreau et al. (2017).*

# Venom Genes Typically Have Multiple Transcripts and Encode Distinct Proteins

In sum, 86 genes encoded at least one protein identified in venom, 28 of which produced only a single transcript (of which 15 were latrotoxins), while 58 were multiple transcript genes. The proportion of multiple transcript genes (58/86: 67.4%) contributing to venom was significantly higher by chi-square goodness-of-fit test (χ <sup>2</sup> = 115.82, df = 1, p < 0.0001) than the proportion of genes across the genome that encoded multiple transcripts (11922/58158: 20.5%). Identified events producing alternative transcripts at venom multiple transcript genes included alternative 5′ splice sites (7, 6.6%), alternative 3′ splice sites (3, 2.8%), exon skipping (13, 12.3%), mutually exclusive exons (26, 24.5%), alternative first exons (27, 25.5%), and alternative last exons (30, 28.3%). However, no venom gene alternative transcripts were produced through intron retention. Accounting for the presence of grouped and indistinguishable proteins, including identical proteins produced by distinct transcripts, there were 169 transcripts from these genes encoding the maximum 139 distinct venom proteins. Most of these transcripts (141, 83.4%) were encoded by the 58 multiple transcript genes (**Table S2**), and a slight majority (72) of these 141 transcripts were novel to this study. Together with the 141 transcripts encoding a venom protein, the 58 multiple transcript venom genes had an additional 135 predicted transcripts, for a total of 276.

These 276 transcripts produced 210 different protein sequences, and individual genes encoded 1–22 distinct variants.
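As a worked illustration of the goodness-of-fit comparison, the Python sketch below recomputes a chi-square statistic from the counts given in the text (58 multiple transcript and 28 single transcript venom genes, against a genomic proportion of 11922/58158). It is a sketch under stated assumptions, not the authors' analysis code: it assumes a two-category test with one degree of freedom, and the recomputed value may differ slightly from the reported statistic depending on rounding or on the software originally used.

```python
import math


def chi_square_gof(observed, expected_proportions):
    """Pearson chi-square goodness-of-fit statistic."""
    total = sum(observed)
    expected = [total * p for p in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Genome-wide proportion of genes with multiple transcripts (from the text).
genome_multi = 11922 / 58158

# Venom genes: 58 with multiple transcripts, 28 with a single transcript.
observed = [58, 28]

stat = chi_square_gof(observed, [genome_multi, 1.0 - genome_multi])
p = p_value_df1(stat)
print(f"chi-square = {stat:.2f}, df = 1, p = {p:.3g}")
```

The closed-form p-value uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal variable, which avoids any dependency beyond the standard library.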

FIGURE 1 | Known or putative toxins and spreading factors found in secreted venom (red, maximum number including proteins indistinguishable by MS; orange, minimum number) or upregulated in the venom gland (blue). Overlap between presence in the venom and venom gland upregulation is shown in green. *CRISP*, cysteine-rich secretory protein; *Enz\_other*, hyaluronidase, amylase, lipase; *ICK*, inhibitory cysteine knot toxin; *LRR*, leucine-rich repeat protein; *protease*, serine and metallo-protease; *Novel*, putative toxin family identified in Gendreau et al. (2017).

Forty-four of the 58 multiple transcript genes had transcripts that encoded more than one distinct protein, accounting for 196 of the 210 (**Table S2**: column 2). The greatest number of distinct proteins encoded at a locus occurred at genes with BLAST homology to chitinases (MSTRG.35296, 27 transcripts, 22 distinct proteins), endothelin-converting enzymes (MSTRG.35640, 22 transcripts, 20 distinct proteins), and latrotoxins (MSTRG.45528, 23 transcripts, 17 distinct proteins).

Among the 44 multiple transcript genes producing multiple protein variants via alternative transcripts, peptide evidence from the venom MS experiment was sufficient to distinguish a single protein as present in venom for 15 (**Table S2**, column 3). For example, gene MSTRG.32727, an inhibitory cystine knot toxin, produced four transcripts, two of which were novel to this study and which both possessed a 5′ untranslated region (UTR) bearing a highly supported intron with canonical splice sites not present in the previous genome annotation (**Figure 2**; **Supplementary File 1**; **Table S3**). These four transcripts produced two divergent predicted proteins (each encoded by two distinct transcripts) that differed in predicted disulfide binding pattern, an important determinant of function in ICK toxins. For this gene, however, only one of these proteins was distinguished in venom (**Figure 2**) by unique peptide matches.

# Venom Genes Can Contribute More Than One Protein to Venom

In contrast, eight of the 29 other venom multiple transcript genes producing multiple proteins contributed at least two distinguishable proteins each to the venom (**Table 2**, **Table S2**: column 4) as assessed by the MS data, accounting for 23 total distinct venom proteins (16.5% of all distinct proteins). Each of the 8 genes was from a region of the genome with more than one paralogous gene in the existing annotation of the dovetail genome assembly (Schwager et al., 2017). However, expression data in this study indicated more complex transcriptional patterns in these regions, with novel UTRs, transcriptional start sites (TSS), introns and exons defining new transcriptional units (**Supplementary File 1**; **Figure 3**, **Figure 4**, **Figures S3–S8**). For these eight genes, and for venom multiple transcript genes in general, novel transcripts that yielded a protein in venom generally had strong support, including numerous reads spliced across novel exon-exon junctions (**Table S3**).

FIGURE 2 | A venom protein-encoding multiple transcript gene produces two distinct proteins, but only one is identified in venom. Locus MSTRG.32727 has homology to inhibitory cystine knot toxins, and has four transcripts which produce two different predicted proteins. For each of the two distinct proteins, two different transcripts encode the same protein sequence, which vary by the presence or absence of an intron bearing 5' UTR. The proteins vary in length and are divergent in sequence, with identical residues in red. Each has five predicted disulfide bonds, but the predicted pattern of connectivity is different. Transcripts labeled "V" have protein products identified in venom by MS. A "\*" indicates that these proteins cannot be discriminated by the MS data. Numbers over introns indicate the number of spliced reads supporting novel junctions across all libraries. Also shown is venom gland expression in TPM rounded to the nearest whole number, and whether the transcript is upregulated in the venom gland (dark red text) relative to other tissues.

TABLE 2 | Basic information for eight loci that contribute more than one protein to venom as assessed by tandem mass spectrometry.

*Column 3 lists the total number of transcripts predicted at the locus, the number of distinct proteins predicted from these transcripts, and the total number identified in venom.*

For example, gene MSTRG.15150, a latrotoxin, had four transcripts in our updated annotation, two of which were present in the previous annotation, and which were annotated as separate genes lacking UTRs (**Figure 3**). The two novel transcripts from this study show novel TSS and UTRs with three previously undetected introns. These introns possessed canonical splice sites and were highly supported by spliced reads (**Figure 3**; **Table S3**). Each of the four transcripts produced a unique protein. One transcript novel to this study, MSTRG.15150.1, encompassed the entire region, and combined segments of two distinct coding regions to produce a novel protein sequence. The other, MSTRG.15150.2, produced a longer predicted coding sequence distinct from transcript aug3.g26325.t1, which it subsumed, via an upstream start codon. At least two, and possibly three, proteins predicted from this locus were identified in venom (**Figure 3**). The protein from transcript aug3.g26326.t1 was identified, and while the peptide data indicated that the protein produced by transcript MSTRG.15150.1 was not in the venom, the proteins produced by MSTRG.15150.2 and aug3.g26325.t1 could not be discriminated, and hence one or both may be present.

# Distinct Venom Proteins From Novel Transcripts Are Supported by MS Data

In two cases, transcripts novel to this study received support from the MS data. First, a leucine-rich repeat protein gene, MSTRG.21390, contributed at least two distinct proteins to venom. This locus had six transcripts (four novel to this study), which in total produced five distinct proteins (three novel to this study). Again, these novel transcripts included UTRs not present in the original annotation, as well as novel exons and introns with canonical splice sites, which were generally highly supported by spliced reads (**Figure 4**; **Table S3**; **Supplementary File 1**). Three of these novel exons added coding sequence at the 5′ end of the transcript (**Figure 4**). At least two, and possibly three of the proteins produced, with different arrangements of leucine-rich repeats, were found in venom (**Figure 4B**), including the novel protein predicted from transcript MSTRG.21390.3, which spanned genes from the original annotation, combining exons into a novel combination, and was uniquely identified by MS data. Novel transcript MSTRG.21390.1 produces a protein distinct from one encoded by two different transcripts (MSTRG.21390.5 and aug3.g6329.t1) due to 7 amino acids predicted from a novel upstream exon. However, the two proteins are otherwise identical and cannot be discriminated by peptides in the current MS data set, and hence one or both may be present in venom in addition to the protein from MSTRG.21390.3.

A second novel predicted protein sequence, from the chitinase gene MSTRG.35296, was uniquely identified in venom. This protein was one of two distinct proteins from this locus that were identified in venom, and was encoded by two novel transcripts (**Figure S3**) that spanned and shared exons with three genes in the original annotation (**Figure S3**; **Supplementary File 1**). Although this locus was transcriptionally complex, with 27 transcripts (**Figure S3**; **Supplementary File 1**) and 22 proteins, for the most part novel introns, which contributed to the generation of protein variation, were highly supported by spliced reads and possessed canonical splice sites (**Table S3**).

# Alternative Transcripts May Further Contribute to Venom Protein Diversity

An additional 21 (of 29) genes produced from 2 to 9 different proteins, of which a minimum of 1 and a maximum of 4 (**Table S2**, column 3) per gene were potentially present in venom. However, although the potential venom proteins were distinct in sequence, they were only matched by shared peptides. In sum, these 21 genes may contribute a minimum of 21 and a maximum of 63 proteins to the venom. Gene MSTRG.6569 illustrates this result, having four transcripts (two novel) that produce four distinct proteins that cannot be differentiated by peptides derived from the MS experiment (**Figure S9**), in addition to four other novel transcripts that do not produce a venom-identified protein. As in previous examples, novel transcripts were generally highly supported by overall read counts and by spliced reads across novel exon-exon junctions (**Table S3**). This phenomenon was also observed among the eight genes contributing more than one protein to venom. Five of these genes (MSTRG.25517, MSTRG.15150, MSTRG.21390, MSTRG.2390, MSTRG.32970) also produce additional distinct proteins that may be present in venom, but could not be differentiated by the MS data, as only shared peptides were matched. Thus, up to seven additional distinct venom proteins may be produced by these loci.

FIGURE 3 | A venom protein-encoding multiple transcript gene produces two venom proteins. Locus MSTRG.15150 has homology to latrotoxins, and has four transcripts (two novel to this study), which produce four different predicted proteins, which vary in UTR length and exon-intron structure. Coding regions are shaded pink, and non-coding gray. The proteins produced vary slightly in length but also in primary sequence, as enumerated in the lower table on the right, in which numbers correspond to transcript codes in the upper table. Transcripts labeled "V" have protein products identified in venom by MS. A "\*" indicates that these proteins cannot be discriminated by the MS data. Numbers over introns indicate the number of spliced reads supporting novel junctions across all libraries. Also shown is venom gland expression in TPM rounded to the nearest whole number, and whether the transcript is upregulated in the venom gland (dark red text) relative to other tissues.

expression in TPM rounded to the nearest whole number, and whether the transcript is upregulated in the venom gland (dark red text) relative to other tissues. (B) Three different protein domain architectures were predicted from protein primary sequences at this locus, and the first two shown were identified in venom. The leucine-rich repeat locations and size are identical for MSTRG.21390.3 and MSTRG.21390.2, but LRR L-domain start and end positions differ by 1 basepair, so these sequences are considered to have the same architecture. (C) Sequence of the novel protein from transcript MSTRG.21390.3 indicating peptides mapped to the sequence from MS data (orange).

# Venom Gene Transcripts Show Variable Spatial Expression Patterns

Reads from eight sequenced libraries from venom glands, silk glands, ovaries, and cephalothorax (with venom glands removed) were mapped to the genome (**Table S4**), and used to identify 1,318 venom gland upregulated transcripts (VGTup), transcripts with a pattern of expression highly biased toward the venom gland (**Table S5**), which came from 1,095 genes. This is a substantially higher number than the 355 genes with a VGTup found in a previous study (Gendreau et al., 2017) using an earlier genome assembly, although 108 (30.4%) of the genes found to have a VGTup in the previous study had a VGTup in the current analysis, and 448 of the 1,095 were novel genes defined in this study as expressed regions of the new genome assembly. Overall, the 1,318 transcripts that constitute the venom gland upregulated set had an enhanced likelihood of producing a protein in the secreted venom [1.5% of transcripts in genome, 47.3% (80 of 169) of all transcripts with predicted proteins in venom]. Yet, more than half of all venom proteins were produced by transcripts that were not venom gland upregulated, and instead displayed a range of expression patterns across tissues, and surprisingly included transcripts with zero venom gland expression.
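The enrichment of venom gland upregulated transcripts among venom protein encoding transcripts can be checked directly from the counts in the paragraph above. The Python sketch below is plain arithmetic on those reported numbers; the helper name `enrichment` is hypothetical, and the fold value is a descriptive ratio rather than a statistical test from the study.

```python
def enrichment(hits_in_set, hits_total, set_size, universe_size):
    """Return (background fraction, fraction among hits, fold enrichment)."""
    background = set_size / universe_size
    among_hits = hits_in_set / hits_total
    return background, among_hits, among_hits / background


# Counts reported in the text.
vgtup_transcripts = 1318      # venom gland upregulated transcripts
all_transcripts = 90093       # transcripts in the final annotation
venom_transcripts = 169       # transcripts with predicted proteins in venom
venom_and_vgtup = 80          # of those, venom gland upregulated

background, among_venom, fold = enrichment(
    venom_and_vgtup, venom_transcripts, vgtup_transcripts, all_transcripts
)

# Overlap with the previous study: 108 of 355 genes again had a VGTup.
overlap_previous = 108 / 355

print(f"background {background:.1%}, among venom transcripts {among_venom:.1%}, "
      f"fold {fold:.1f}, overlap with previous study {overlap_previous:.1%}")
```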

This pattern of expression was observed at both single and multiple transcript venom protein encoding genes. Of the 28 single transcript genes that encoded a protein identified in venom (**Table S6**), only 14 (50%) had transcripts that were significantly upregulated in venom gland. These 14 transcripts, including 4 latrotoxins (**Table S6**), were among the most highly expressed transcripts in venom gland, with 11 in the top 1% of venom gland expression rank, and all 14 in the top 5% (average TPM = 278.3), although several showed some level of expression in other tissues (**Table S6**). However, the remaining 14 transcripts, although also producing venom proteins, mostly exhibited low expression across tissues (**Table S6**), including the venom gland (average TPM = 0.6), although 11 were latrotoxins. Furthermore, six of these transcripts had no measurable expression in venom gland (TPM = 0).
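Statements such as "in the top 1% of venom gland expression rank" can be made concrete with a small Python sketch. The text does not give the exact ranking procedure, so this assumes that rank is computed over all transcripts by venom gland TPM and that ties share the best rank. The names `top_fraction` and `in_top` and the toy expression values are hypothetical.

```python
def top_fraction(tpm_by_transcript, transcript):
    """Rank of a transcript by TPM, expressed as a fraction of all transcripts.

    The highest-expressed transcript gets 1/N; ties share the best rank.
    """
    value = tpm_by_transcript[transcript]
    higher = sum(1 for v in tpm_by_transcript.values() if v > value)
    return (higher + 1) / len(tpm_by_transcript)


def in_top(tpm_by_transcript, transcript, percent):
    """True if the transcript ranks within the top `percent` of expression."""
    return top_fraction(tpm_by_transcript, transcript) <= percent / 100.0


# Toy venom gland expression table: 200 transcripts with TPM 1..200.
toy_tpm = {f"t{i}": float(i) for i in range(1, 201)}

print(in_top(toy_tpm, "t200", 1))   # highest expressed transcript
print(in_top(toy_tpm, "t195", 5))   # sixth highest of 200
print(in_top(toy_tpm, "t100", 5))   # mid-ranked transcript
```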

At multiple transcript venom genes, less than half of all venom protein-encoding transcripts exhibited venom gland upregulation (66 of 141: 46.8%) and high abundance in the venom gland (**Table S6**; **Figure 5A**), with 56 transcripts in the top 1% of venom gland expression and 63 of 66 in the top 5% (average TPM = 4697.6), including 18 latrotoxins (**Table S6**). Yet, as with single transcript venom genes, 75 other transcripts from venom multiple transcript genes that encoded a venom protein were not venom gland upregulated, and were generally lowly expressed in all tissues, with median TPM < 1 (**Figure 5B**). Furthermore, 27 had zero expression in venom gland (**Table S6**), including eight latrotoxin transcripts. However, 29 (37.2%) of these 75 transcripts that do not exhibit venom gland upregulation produced a protein that belongs to a group of venom proteins indiscriminable by our MS data (**Table S1**, **Table S6**), in which at least one other transcript is venom gland upregulated. Hence, it is possible in these cases that the venom gland upregulated transcript actually produces the protein found in venom. Also, 12 of these 75 transcripts, while not significantly upregulated, do have their highest average expression in the venom gland (**Table S6**). Surprisingly, however, nine of these transcripts that were not grouped at protein level with a VGTup actually showed relatively substantial expression in all tissues, and 30 had higher average expression in at least one other tissue than in venom gland.

Multiple transcript genes that contribute a protein to venom also had from 0 to 23 transcripts whose encoded protein was excluded from being present in venom by the current MS data (**Table S2**), which in theory could be transcripts that have an alternative function to that of a toxin. Yet, the pattern of expression of these transcripts is similar to those that do produce a venom protein. Approximately half (68 of 135: 50.4%) of these additional transcripts at venom protein encoding loci are significantly upregulated and abundant in the venom gland (**Table S6**; **Figure 5C**), with 48 in the top 1% of venom gland expression and 65 in the top 5%. The 67 remaining transcripts at these genes, not significantly upregulated in venom gland, are generally lowly expressed (<1 TPM) on average in all tissues (**Table S6**; **Figure 5D**), and although 13 had highest average expression in the venom gland, 22 transcripts had 0 TPM in the venom gland. Five of these transcripts showed expression > 5 TPM in all tissues, and 41 had higher average expression in at least one other tissue than in venom gland. Three had zero or very low expression in the venom gland, and expression in at least one other tissue that was at least an order of magnitude greater.

# DISCUSSION

# Common House Spider Venom Is a Diverse Mixture

Animal venoms are complex, heterogeneous solutions, and often contain a diverse suite of molecules (Casewell et al., 2013). We found that common house spider venom contains a variety of proteins, a finding similar to that from other spiders and from more distantly related organisms whose venom composition has been assayed (Liao et al., 2007; Yuan et al., 2007; Duan et al., 2008, 2013; Tang et al., 2010; Palagi et al., 2013; Colinet et al., 2014; Undheim et al., 2014; Brinkman et al., 2015; Himaya et al., 2015; Borges et al., 2016). Proteins identified include homologs of ICKs, CRISPs, and enzymes, including proteases, lipases, chitinases, amylases and hyaluronidases, which may act as spreading factors or be modified as toxins, in addition to other components. However, as in the confamilial Latrodectus hesperus (Haney et al., 2014), latrotoxins, atypically large neurotoxins that act on the presynaptic membrane (Grishin, 1998), were the most diverse group found in P. tepidariorum venom. While previous functional characterization of four sequenced latrotoxin molecules (Rohou et al., 2007) indicated targeting to specific groups (mammals, crustaceans, insects), knowledge of functional variation among the expanding set of sequenced latrotoxins is lacking. Also lacking is an understanding of how these large neurotoxins functionally interact with smaller toxin molecules, including ICKs and CRISPs, present in theridiid venom.

for transcripts encoding a venom identified protein while the bottom panels show the range of values for transcripts from the same set of loci whose encoded proteins could not be identified in venom. Panels (A,C) show values for transcripts that were significantly upregulated in venom gland relative to the other three tissues, while panels (B,D) show values for transcripts that were not significantly upregulated in venom gland. Black bars show median values, while the interquartile range is indicated by the boundaries of the colored boxes. Whiskers are 1.5x the interquartile range (IQR) and dots indicate outliers beyond 1.5 × the IQR.

# Alternative Transcripts and Venom Diversity

The presence of products from numerous toxin genes of the same type in P. tepidariorum venom is consistent with the standard notion of gene duplication as a predominant force in the diversification of venom toxins (Kordis and Gubensek, 2000; Fry et al., 2009; Wong and Belov, 2012; Gacesa et al., 2015). Yet, proteomic diversity can be generated by other mechanisms, and distinct toxins that might vary in mechanism or prey specificity could also be produced by an alternative transcript, arising from alternative start or polyadenylation sites, or by alternative splicing (Graveley, 2001; Maniatis and Tasic, 2002). Our results using an integrative venomics approach (Calvete, 2017) suggest that alternative transcripts may have a role in producing venom complexity, and that multiple mechanisms were involved, with the exception of intron retention. While previous studies have suggested that a few specific toxin variants may be derived from alternative transcripts (Siigur et al., 2001; Fry et al., 2010; Viala et al., 2015; Yan et al., 2017), alternative transcription may in fact be prevalent at genes producing venom proteins, as genes that encoded proteins found in venom were much more likely to produce multiple transcripts than the genomic background rate. These alternative transcripts generated numerous distinct predicted protein sequences, creating a diverse pool of protein variants at individual loci that produce venom proteins.

The involvement of alternative transcripts in the generation of venom protein diversity was most clearly demonstrated by eight genes defined in the new annotation developed for this study that had more than one protein unambiguously identified in venom by mass spectrometry, as these genes together accounted for 23 distinct venom proteins. However, in the pre-existing genome annotation, these regions contained multiple distinct paralogous genes (Schwager et al., 2017). Yet, data from this study indicated that these genes were bridged by highly supported novel transcripts that produced distinct proteins, and these transcripts had generally high levels of read support, including numerous reads spliced across novel exon-exon junctions. Furthermore, support from independent venom proteomic data was found for two novel transcripts from two different genes (MSTRG.21390, a leucine-rich repeat protein, and MSTRG.35296, a chitinase) where the novel protein predicted from region-spanning transcripts, which combined exons from multiple previously annotated genes, was identified. Thus, while it seems likely that the original annotation did not capture the full complexity of the transcriptome, and that the deep sequencing of tissues in this study has revealed more complex multi-exon genes, additional confirmation from proteomic data that these predicted gene-spanning transcripts are not artifacts and thus yield venom proteins, is needed. In addition, in these genomic regions the processes of gene duplication and alternative transcription do not appear independent, and may interact to produce variation, as proteins of a similar length are produced from transcripts spanning one part, or all, of the region (**Figures 3**, **4**, **Figures S3–S8**), and more than one complete coding region appeared to be present. This phenomenon appears similar to that documented in human cells, where transcripts that contain segments from two adjacent genes have been identified (Akiva, 2005; Frenkel-Morgenstern et al., 2012) and labeled "chimeric" RNAs. However, it has recently been argued that this transcriptional pattern is not unexpected, and that transcribed regions may simply be genes requiring re-annotation (He et al., 2018).

Furthermore, the contribution from protein sequence variants at venom multiple transcript genes may be more extensive than those identified from these eight genes. For most venom multiple transcript genes, a level of ambiguity remained as to which of several proteins occurred in the venom, as the only peptides identified as present in the venom by MS are shared among related proteins. Hence, while these genes may contribute only a single protein to venom, the possibility remains that some are actually contributing more than one distinct protein and adding to venom complexity, and in general the transcripts producing protein variants are highly supported by RNA-Seq. The MS data is restricted to a single experiment, and the augmentation of the available proteomic data might aid in resolving this ambiguity, and in confirming whether additional proteins from these genes, including those at low levels, are present in common house spider venom.

# Many Venom Gland Selective Transcripts Are Not in Venom

An expectation of the traditional model of venom evolution is that a switch to selective venom gland expression of a new toxin gene occurs subsequent to its origination via duplication from a non-venom "body" protein gene that fulfills some typical physiological role (Kordis and Gubensek, 2000; Fry et al., 2009; Hargreaves et al., 2014). We identified a complement of transcripts with expression that was significantly biased toward the venom gland (VGTup) as a proxy for this selective expression, as in other theridiid spider species (Haney et al., 2014, 2016). Yet, while this set of venom gland upregulated transcripts was more likely than expected to produce a protein found in venom, most did not, and many actually occur at genes that do not produce any venom protein. These upregulated transcripts could be required for specific cellular functions in the venom glands, rather than acting as venom toxins. Alternatively, and particularly for transcripts that are also homologous to known spider toxins, such as latrotoxins and ICKs, this finding could represent post-transcriptional controls acting in venom gland tissue, which prevent the protein products of transcripts at these genes from appearing in venom. Post-transcriptional processes may also explain why distinct proteins from VGTup at other venom protein encoding loci were not found in venom, although protein products from other transcripts at the same locus were. Given their selective expression in the venom gland, this suggests that the presence or absence in venom of the products of these transcripts could be under control at the level of translation or secretion, and could provide a reservoir of additional toxin variants. Post-transcriptional processes have been put forth as an explanation for the discrepancy in transcript and protein abundance in snakes (Casewell et al., 2014), although this did not lead to the absence of the products of highly expressed venom gland transcripts in the venom, as in P. tepidariorum, suggesting the possibility of differences in levels of post-transcriptional control among phyla.

# Transcripts at Venom Genes Are Not Always Expressed Selectively in Venom Gland

In contrast with the selective expression in the venom gland as posited in the traditional model of venom evolution (Fry et al., 2009), many of the individual transcripts that produce venom proteins in this study have some level of expression in other tissues (silk, ovary, or cephalothorax). Given that protein products found in venom are likely to be toxins, the expression of their encoding transcripts in other tissues could indicate a defensive function. For example, toxins have been identified in confamilial Latrodectus tredecimguttatus eggs, where they may serve to protect from predation (Yan and Wang, 2015). Alternatively, it is conceivable that these transcripts do not yield a translated product in other tissues, a hypothesis that could be tested with proteomic data. The presence of alternative transcripts at venom genes also adds an additional layer of complexity to the pattern of expression expected from the traditional model, which involves selective expression in the venom gland at the level of the gene, which presumably encodes a single transcript. However, venom protein genes are biased toward multiple transcripts, and have transcripts that do not produce products in venom, or show venom gland restricted expression. In fact, a subset show low or no venom gland expression, but do show substantial levels of expression outside the venom gland, and three showed venom gland expression at or near zero, with much higher expression in other tissues. This pattern of variable expression could indicate that in certain cases transcription in the venom gland is achieved at the level of individual transcripts, whether by switching of expression to the venom gland during evolution for a single transcript at a multi-transcript gene, or by generation of alternative transcripts at a gene whose expression has shifted to the venom gland and reversion to expression in other tissues, with functions in other tissues mediated by alternative transcripts as opposed to by paralogs. This topic merits further study, particularly with regard to the gene regulatory mechanisms that might be involved, and how they may evolve, which requires multi-species data on alternative transcription.

However, interpretation of patterns of transcript expression is complicated by the potential influence of environmental conditions on gene expression, and venom gland expression of some transcripts could potentially be enhanced by certain environmental stimuli, including exposure to different prey communities (Daltry et al., 1996; Binford, 2001; Sanz et al., 2006; Gibbs et al., 2011), and hence the full complexity of a venom can only be exposed by assaying venom from populations across a range of environments. This is supported by data from cone snails, which indicates that venom composition can vary in response to environmental stimuli, namely the presence of predators or prey (Dutertre et al., 2014). Environmental variability could also help explain why some venom proteins, including latrotoxins, appeared to be encoded by transcripts with no venom gland expression, which we consider a prerequisite for presence in the venom. Although we controlled environmental conditions in the laboratory for spiders used in gene expression experiments in this study, it was not feasible to maintain spiders milked for venom under exactly the same conditions. The presence of proteins in venom in those spiders would likely indicate that the associated transcripts were expressed in the venom gland. The difference in venom gland expression between populations of spiders used for transcriptomics or proteomics could have an environmental, but also a genetic, origin. However, data from experiments assessing variability of expression in spider venom glands under different environmental conditions is as yet lacking, as is data on genetic structure among populations of P. tepidariorum.

In summary, through the generation and use of genomic, transcriptomic and proteomic data we find that venom complexity may be achieved through both gene duplication and complex patterns of alternative transcription. Patterns of transcript expression do not comport with simple predictions from a traditional model of venom evolution, and may reflect the interplay of environmental and genetic mechanisms. Though this study focused on a single spider species, we expect these findings will be broadly applicable across venomous taxa in explaining how diverse venom complements are generated.

# AUTHOR CONTRIBUTIONS

RH and JG conceived and designed the study. RH and JG performed laboratory procedures. RH, FF, and TM performed the analysis. RH wrote the first draft of the manuscript. All authors contributed to manuscript revision, and read and approved the submitted version.

# ACKNOWLEDGMENTS

We thank Evelyn Schwager for assistance in spider dissections, Chuck Kristensen for spider venom extraction, George Tsaparalis and Linda Breci for mass spectrometry analysis, and Jonathan Coddington and Nico Posnien for access to the Dovetail genome assembly and annotation. This work was supported by funding from the National Institutes of Health (GM097714-01, GM097714-02) to JG.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2019.00085/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Haney, Matte, Forsyth and Garb. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Significance of Comparative Genomics in Modern Evolutionary Venomics

### Stephan Holger Drukewitz<sup>1,2</sup>\* and Björn Marcus von Reumont<sup>2,3,4</sup>\*

<sup>1</sup> Institute for Biology, University of Leipzig, Leipzig, Germany, <sup>2</sup> Department of Bioresources, Fraunhofer Institute for Molecular Biology and Applied Ecology, Giessen, Germany, <sup>3</sup> Institute for Insect Biotechnology, Justus Liebig University, Gießen, Germany, <sup>4</sup> LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany

### Edited by:

Sebastien Dutertre, Centre National de la Recherche Scientifique (CNRS), France

### Reviewed by:

Juan J. Calvete, Spanish National Research Council (CSIC), Spain Greta J. Binford, Lewis & Clark College, United States

### \*Correspondence:

Stephan Holger Drukewitz stephan.drukewitz@uni-leipzig.de Björn Marcus von Reumont bjoern.von-reumont@agrar.uni-giessen.de

### Specialty section:

This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution

Received: 19 February 2019 Accepted: 24 April 2019 Published: 09 May 2019

### Citation:

Drukewitz SH and von Reumont BM (2019) The Significance of Comparative Genomics in Modern Evolutionary Venomics. Front. Ecol. Evol. 7:163. doi: 10.3389/fevo.2019.00163

Venoms evolved convergently in diverse animal lineages as key adaptations that increase the evolutionary fitness of species and are employed in manifold ways for defense, predation, and competition. They constitute complex cocktails of various toxins that feature a broad range of bioactivities. The majority of described venom proteins belong to protein families that are known to comprise housekeeping genes or harbor protein domains that are present in genes with non-venom related functions. However, the evolutionary processes and mechanisms that foster the origin of these venom proteins and triggered their recruitment into the venom delivery system are still critically discussed. In most instances single or combined proteomic and transcriptomic approaches are applied to describe venom compositions and the biological context of venoms. For neglected species these studies represent crucial contributions to improve our understanding of venom diversity on a broader scale. Nonetheless, the inference of the evolutionary origin of putative toxins in these studies could be misleading without appropriate coverage of gene populations from different tissue samples (gene completeness) or complementary genome data. Providing a valid backbone to correctly map transcriptome and proteome data, whole genome sequences facilitate a clear distinction between variability of venom proteins or toxins due to posttranslational modifications, alternative splicing, and false-positive matches that stem from sequencing or read processing and assembly errors. High-quality whole genome sequence data of venomous species are still sparse and unevenly distributed within taxon lineages. However, to reveal the evolutionary pattern of putative toxins in venomous lineages and to identify ancestral variants of venom proteins, the appropriate sampling of genomes from venomous and non-venomous species is crucial. Nevertheless, larger comparative studies that use multiple whole genome data sets to uncover processes of venom evolution are still sparse. Here, we review the general potential of comparative genomics in venomics to unravel mechanisms and patterns of evolutionary origin of toxin genes. Finally, we discuss the benefit of whole genome data to improve transcriptomics and proteomics-only studies, in particular if datasets are applied to assess the evolutionary origin of venom proteins.

Keywords: venom evolution, gene duplication, single gene co-option, origin of toxins, orphan genes, whole genome sequences

# INTRODUCTION

Venomous species are extremely diverse and ubiquitously evolved in all known animal phyla, as for example in old lineages such as marine cnidarians, molluscs, or polychaetes, but also terrestrial groups like reptiles, all major arthropod clades and even mammals (Casewell et al., 2013; Dutertre et al., 2014; von Reumont et al., 2014a,b). It is estimated that around 200,000 animal species (Holford et al., 2018) use venom as a most important molecular trait that guarantees the fitness and survival of species, employing it for defense, predation, and competition (Casewell et al., 2013; von Reumont et al., 2014a; Sunagar et al., 2016), see also **Figure 1**. In contrast to poisons, which are generally composed of less complex mixtures of toxic substances and used in a rather unspecific manner, venoms are constituted by complex toxin components such as peptides, proteins, and other smaller organic molecules (Fry et al., 2009). While a poison is mostly passively delivered for defense purposes, venoms are introduced into an organism by a specialized morphological structure, the venom apparatus or delivery system, which penetrates through the organism's body wall to deploy the venom (Fry et al., 2009; von Reumont et al., 2014c). In several cases species exhibit both traits and increase their evolutionary fitness by being venomous and poisonous at the same time. Some centipede species for example possess venomous claws to hunt prey (Undheim et al., 2015) but also feature sternal glands in their skin that secrete sticky cyanogenic liquids (Vujisić et al., 2013; Zagrobelny et al., 2018). Obviously a long and slender bodied predator with frontal venomous claws benefits from a whole body based poison system to defend itself against other soil living predators, for example if attacked at the rear end by ants (Vujisić et al., 2013).

Of the incredibly diverse venom systems in the animal kingdom, only a small fraction has been studied in detail, and the same holds for the biology of many venomous species (Casewell et al., 2013). Venom research is traditionally focused on a few taxa like snakes, spiders, scorpions, and cone snails that occur in close vicinity to humans, and pose either a risk of envenomation or the chance to utilize venom components as cures. Snakes undoubtedly represent the best-studied venomous animal group, mainly because snakebites kill at least 100,000 people per year and represent a WHO-listed high-priority neglected disease (Arnold, 2016; Chippaux, 2017; Gutiérrez et al., 2017; Williams et al., 2019). The production of an effective antidote is largely dependent on the present knowledge of a species' venom cocktail and possible intra-specific variation in its toxin components (Chippaux et al., 1991; Gutiérrez et al., 2009). Another motivation behind many studies in venomics is to screen for and to harvest the potential of identified venom proteins (single toxins) for applied research, such as the development of highly specific agrochemicals like bio-insecticides or pharmaceutical applications and drug design (Windley et al., 2012; King and Hardy, 2013; Holford et al., 2018; Pennington et al., 2018; Senji Laxme et al., 2019).

As a consequence, biological and ecological constraints, like changes in biotic and abiotic factors that modulate intra-specific and ontogenetic venom composition, are more extensively studied only in a few populations of some snake species (Calvete et al., 2007; Núñez et al., 2009; Durban et al., 2013; Neale et al., 2017; Sanz et al., 2017; Borja et al., 2018; Zancolli et al., 2019). Furthermore, knowledge on gender specific venom variation exists mostly only for snakes and a few spiders (Binford, 2001; Menezes et al., 2005; Pimenta et al., 2007; Herzig and Hodgson, 2009), although the reasons for different toxin cocktails in males and females are still being debated (Binford et al., 2016).

Another reason that venom systems of most taxa remain unstudied is that only in recent years has the modern methodological toolbox been established to easily assess the composition of venoms and the mode of their delivery in so far neglected species, which are in many cases very small and more difficult to access (Sunagar et al., 2016; von Reumont, 2018). The research area that addresses all aspects of venom related research is nowadays called venomics, a term that was originally coined in 2004 for proteomic-based analyses on snake venoms (Juárez et al., 2004; Bazaa et al., 2005). Today the term venomics reflects the combination of several new -omics technologies to conduct general research on venoms in an integrative approach, see also **Figure 2** (Calvete, 2017). Besides new approaches to identify secreted proteins more effectively via proteomics, new sequencing technologies provide in unprecedented detail the framework to study evolution of venoms and toxin expression (Gopalakrishnakone and Calvete, 2016; Sunagar et al., 2016; von Reumont, 2018). Particularly for small, so far neglected venomous species, transcriptomics and proteo-transcriptomics appear as the foremost methods to assess possible venom compositions and contribute to extend our comprehension of venom diversity in general. Recent studies show, however, that de novo transcriptome analyses need to be conducted carefully to avoid misinterpretation of the data, and a combination of both transcriptomics and proteomics is of the utmost importance (Holding et al., 2018; Smith and Undheim, 2018; von Reumont, 2018). Nevertheless, it should be kept in mind that most of these studies aim to describe the venom composition of previously neglected species. 
While such studies on general venom compositions of neglected (and often rather small) organisms have a different angle and might benefit from more or pooled specimens to cover individual venom variation, whole genome sequencing and analyses should optimally be conducted utilizing one individual or highly homozygous specimens. Studies that apply comparative genomics to investigate origin and evolution of toxin genes are currently underrepresented in venomics, mostly because high quality genomes of venomous species are still sparse; see also **Figure 1** and **Supplementary Table 1**. In particular, whole genome data becomes a cornerstone to address questions such as how venom proteins or toxins originate and which mechanisms drive the composition and adaptation of venoms, and also to avoid shortcomings of de novo transcriptome approaches.

Here, we focus on the genomics aspects of evolutionary venomics and the potential that whole genome sequences harbor to understand the processes that underlie the evolutionary origin of venom proteins and toxins. Please note that when we use the term genomic or genome data in subsequent sections, we refer to whole genome sequence data. It is equally important to bear in mind that we will omit specific aspects of venomics that would benefit from high quality whole genome sequences, such as venom evolution in populations. The goal of this review is to cover the predominant trends and patterns and to give an overview of the current situation in whole genome-based venom research.

FIGURE 1 | (Continued) the context of venom evolution are shown in the yellow boxes. For the few venomous groups that are studied in more depth (underlined), gray boxes illustrate the overall number of available genomes according to the NCBI database (asterisk\*, see also Supplementary Table 1) and recent publications (Branstetter et al., 2018; Garb et al., 2018). Animal silhouettes were taken from PhyloPic (Free Silhouette Images of Life Forms) or were generated from own photographs; the phylogeny is based on Casewell et al. (2013).

# GENOMICS

Initially, genomes were sequenced for a few model organisms such as C. elegans, D. melanogaster, or Mus musculus, that were grown and bred in the laboratory to study their biology, development and underlying genetic mechanisms (Dunn and Munro, 2016). Facilitated by the fast progress in sequencing technology, genomic analyses of non-model organisms nowadays improve research in several fields of biology like evolutionary biology, developmental biology, and functional biology. In particular for systematics and phylogenetics, comparative genomics is important to understand how genome changes occurred in different taxon lineages along the tree of life (Dunn and Munro, 2016). On the other hand, the framework of comparative genomics gives insights on how functional DNA elements and adaptive traits evolve, and contributes to identify the linkage between genotype and phenotype (Dunn and Munro, 2016; Yin et al., 2016).

Venoms and their toxin components as highly adaptive traits represent an ideal case study to comprehend how protein functions evolve. The widely accepted hypothesis to explain the shift from a non-venom related gene and its evolution to a toxin is in line with the classic model on the evolution of gene function. An ancestral gene undergoes a duplication event, which is followed by the neo- or subfunctionalization of one of the copies as a toxin (Hughes, 1994; Lynch, 2002; Nei et al., 2002). In the context of venom evolution, authors often refer to the new toxic function of a protein as "recruitment" into the venom gland (Fry et al., 2003, 2009; Casewell et al., 2013; Pineda et al., 2014; Undheim et al., 2014). The crux of testing this hypothesis is to identify the ancestral state of a protein that is later weaponized as a venom component. The origin of every toxin that contributes to a venom composition has to be evaluated independently and in a taxon-specific manner. For example, snake venom evolution could be subject to different constraints and mechanisms than the evolution of venom in assassin bugs or spiders. Even within a taxon the evolutionary patterns could change: processes that cause differences in venom composition between populations are not necessarily identical to those that originally facilitated a venomous lifestyle. An adequate sampling of genomes within the clade of interest and complementary data of representatives from phylogenetically older lineages are crucial to determine ancestral states of genes and to link genotype and phenotype. This applies to the comparison of a venomous to a non-venomous lineage, but also to studies that focus on venom diversity within a venomous lineage or species. Depending on the question or hypothesis that is addressed, it is necessary to include information about the geographic origin of the specimen that was used to sequence and assemble the reference genome (for example: population genomic venom studies).

The evolutionary history and the processes that facilitated the functional switch of a gene from a physiological to a toxic function are per se challenging to assess. It seems that functional and structural constraints on the secreted proteins limit the pool of protein families that contribute to the possible toxin arsenal in venoms (Fry et al., 2009). It is known that protein families such as phospholipase A2, peptidase S1, peptidase S10, several metalloproteinases, kunitz and hyaluronidase are venom components that evolved convergently in phylogenetically distant lineages. Depending on the taxon of interest and the evolutionary history of gene families within this lineage, the ancestral state (including the number of gene variants) of the gene families before being recruited into venom can vary. Consequently, the number of ancestral candidate genes for toxin evolution can differ depending on the respective lineage. We briefly showcase this situation of (venom independent) gene evolution using the example of hyaluronidase-like genes in placental mammals, a group that also comprises the venomous Eurasian water shrew. The different ancestral states of this protein family in each lineage reveal how an insufficient taxon sampling could obscure an analysis of the origin of toxic hyaluronidase in the Eurasian water shrew (Kowalski et al., 2017).

For placental mammals (Placentalia) seven hyaluronidase-like genes are known: HYAL1, HYAL2, HYAL3, HYAL4, HYAL5, SPAM/PH-20, and HYALP1 (**Figure 3**) (Csoka et al., 2001; Hubbard et al., 2002; Kim et al., 2005). These genes are not equally distributed in the genomes of all Placentalia (**Figure 3**). In humans (Homo sapiens), chimps (Pan troglodytes), rats (Rattus norvegicus), and mice (Mus musculus) six hyaluronidase-like genes cluster as two tightly linked triplets on two different chromosomes (Csoka et al., 2001; Hubbard et al., 2002; Kim et al., 2005). Besides those shared genes, it is known that mice and rats possess an additional hyaluronidase variant (HYAL5), which is located on the same chromosome as the triplet HYALP1, HYAL4, and SPAM/PH-20. The HYAL5 gene is missing in the genome of primates and Laurasiatheria, but is shared between rats and mice, which led to the assumption that the duplication event of this gene took place in the last common ancestor of all rodents (at least mice and rats) (Hubbard et al., 2002; Esselstyn et al., 2017). The HYALP1 gene is present in the rodent and primate lineages but is missing in the genomes of the Laurasiatheria representatives, see **Figure 3** (Esselstyn et al., 2017). In both chimps and humans the HYALP1 gene is present, but point mutations led to a frameshift and pseudogenization, while the orthologous gene codes for an active enzyme in rodents (Kim et al., 2005). Depending on the phylogenetic lineage, members of the Placentalia show five or seven functional hyaluronidase genes, while the human/chimp lineage exhibits six hyaluronidase genes. However, one variant was pseudogenized over time and is expressed but not translated (Csoka et al., 2001). An arrangement of five hyaluronidase-like genes in two distinct clusters is most likely the ancestral state of the hyaluronidase protein family in the group of placental mammals. 
This pattern is shared by the analyzed representatives of laurasiatherian and afrotherian mammals (the number of hyaluronidase genes of the African elephant is not shown in **Figure 3** but was verified on ENSEMBL). Diverging patterns in the primate and rodent lineages probably evolved after the last common ancestor of all placental mammals and are lineage specific. In order to address the evolution of venom proteins, the ancestral state(s) of the non-venomous gene variant(s) has to be known to prevent false interpretations.
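The comparison above can be sketched in a few lines of Python. This is only a minimal illustration: the gene sets are transcribed from the text, the Afrotheria entry is an assumption (the text states only that afrotherians share the five-gene pattern), and a plain set intersection is a crude stand-in for a proper phylogeny-aware ancestral state reconstruction, since it cannot distinguish an ancestral absence from a secondary loss.

```python
# Hyaluronidase-like genes per lineage, transcribed from the text.
# HYALP1 is counted as present in primates although it is pseudogenized there.
# The Afrotheria entry is an assumption based on the stated five-gene pattern.
GENES_BY_LINEAGE = {
    "primates": {"HYAL1", "HYAL2", "HYAL3", "HYAL4", "SPAM/PH-20", "HYALP1"},
    "rodents": {"HYAL1", "HYAL2", "HYAL3", "HYAL4", "SPAM/PH-20", "HYALP1", "HYAL5"},
    "Laurasiatheria": {"HYAL1", "HYAL2", "HYAL3", "HYAL4", "SPAM/PH-20"},
    "Afrotheria": {"HYAL1", "HYAL2", "HYAL3", "HYAL4", "SPAM/PH-20"},
}


def candidate_ancestral_set(genes_by_lineage):
    """Genes present in every sampled lineage (candidates for the ancestral set)."""
    return set.intersection(*(set(g) for g in genes_by_lineage.values()))


def lineage_specific_genes(genes_by_lineage):
    """Genes in each lineage that fall outside the shared candidate set."""
    core = candidate_ancestral_set(genes_by_lineage)
    return {
        lineage: sorted(set(genes) - core)
        for lineage, genes in genes_by_lineage.items()
    }


if __name__ == "__main__":
    print(sorted(candidate_ancestral_set(GENES_BY_LINEAGE)))
    print(lineage_specific_genes(GENES_BY_LINEAGE))
```

Dropping a lineage from the dictionary changes the result, which mirrors the point made in the text: with rodents and primates alone, HYALP1 would wrongly appear to belong to the shared set.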

To reveal the recruitment process of a toxic hyaluronidase variant, in our example in the venomous Eurasian water shrew (Neomys fodiens) (Kowalski et al., 2017), the ancestral state of the hyaluronidase protein family in the insect-eating animals, to which shrews belong, has to be known. The genomes of both the non-venomous European hedgehog (Erinaceus europaeus) and the non-venomous common shrew (Sorex araneus) feature the five hyaluronidase genes that are supposed to be ancestral in the placental mammals. All five of these hyaluronidase variants have to be considered to interpret the evolution of hyaluronidase as a venom component in the Eurasian water shrew.

# IMPROVING GENE COMPLETENESS AND HOMOLOGY PREDICTION WITH WHOLE GENOME DATA

The advantage of comparative genomics is that ancestral states of venom proteins can be addressed unambiguously. Transcriptome-only approaches represent in many cases insufficient samples of gene sets and lineage-specific taxon representatives. The level of sequence identity between the venom component and the ancestral genes can help identify the last non-toxic homolog. Nevertheless, predicting the processes of venom evolution, for example whether the functional switch is the result of a gene duplication, a single gene co-option or alternative splicing, is difficult without complete gene sets. In **Table 1** we show the hyaluronidase variants that are known in the common house mouse. The (hypothetical) origin of a venomous salivary hyaluronidase in the house mouse can only be determined if all seven variants are sampled in complete gene sets. Using only single tissue transcriptomics, those possible other variants, but also differences in expression levels, are missed, which might lead to false assumptions. To assess the deeper phylogeny and origin of a single venom protein in general, all representative gene sets of closely related species of the discussed taxon lineage need to be incorporated in the analyses in order to infer the ancestral situation in the last common ancestor (LCA) of the venomous lineage and the closest non-venomous lineage.
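As a rough illustration of how sequence identity can point to the closest non-toxic homolog, the following sketch ranks candidate genes by percent identity to a toxin sequence. It assumes the sequences are already aligned to equal length with `-` as the gap character; the function names and the toy sequences are hypothetical and not taken from any study cited here.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def closest_homolog(toxin: str, candidates: dict) -> tuple:
    """Return (name, identity) of the candidate most similar to the toxin."""
    scores = {name: percent_identity(toxin, seq) for name, seq in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy alignment, for illustration only.
TOXIN = "MKTLLAGC-W"
CANDIDATES = {
    "paralog_A": "MKTLVAGCQW",
    "paralog_B": "MRSLIAGC-W",
}

if __name__ == "__main__":
    print(closest_homolog(TOXIN, CANDIDATES))
```

The ranking is only as good as the candidate list: if a paralog is missing from the gene set, as can happen with single tissue transcriptomes, the best hit may not be the true closest homolog.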

TABLE 1 | Overview of the expression pattern of the seven hyaluronidase genes found in the genome of Mus musculus.

First row shows the gene names, second row the tissues with the highest expression level. All data from the Mouse ENCODE consortium (Yue et al., 2014).

Transcriptome data is generally used to identify highly expressed genes in the venom apparatus in which the toxins are translated, in most cases combined with a proteomic analysis to verify the secretion of these proteins. Subsequently, the venom composition with the predominantly transcribed and secreted genes is estimated, and possible bioactivity either postulated or tested. However, the de novo assembly of transcriptomic data is a computationally challenging task linked to the high variability of expressed transcripts in tissues, which resulted in the modification of genome assembly algorithms for the application on RNA-sequencing data (Grabherr et al., 2011; Xie et al., 2014; Bushmanova et al., 2018). Major drawbacks in the de novo assembly of transcriptomic data are caused by the uneven coverage of transcripts, the difficult distinction between sequencing errors and lowly expressed transcripts, the challenging identification of alternative splicing variants and, finally, an unreliable assembly of recently duplicated paralogous genes. Ambiguous situations are resolved differently depending on the applied assembly software. Consequently, the number and length of finally assembled transcripts, and subsequently the number of identified toxins, might vary. The evaluation and interpretation of such results without the whole genome sequence of the same or a closely related species as a blueprint is a challenging task (O'Neil and Emrich, 2013; Bushmanova et al., 2016). The completeness and the duplication level of single copy orthologs expected in a whole body transcriptome (Simão et al., 2015) can be a valuable reference point to compare different assemblies and to choose the "best" one or to create a hybrid assembly from different assembly programs. However, venom producing organs are specialized in the secretion of toxins; consequently, the number of expressed housekeeping genes is expected to be reduced. Using a metric that scores the quality of the assembly by the presence and duplication level of housekeeping genes, like the approach used in BUSCO, might lead to erroneous conclusions regarding the completeness of assembled toxin transcripts (Holding et al., 2018). Nevertheless, at the same time there is a lack of alternative metrics to evaluate de novo transcriptome assembly. One general focus of current research in evolutionary biology is to improve the precision of complex homology prediction by harnessing whole genome data (Li et al., 2003; Jothi et al., 2006; Altenhoff et al., 2014; Emms and Kelly, 2015, 2018; Kriventseva et al., 2015; Linard et al., 2015; Mesquita et al., 2015; Sonnhammer and Östlund, 2015; Petersen et al., 2017). For instance, information about gene arrangements and position within the genome (synteny) can be additionally utilized to further refine homolog assignment if whole genome sequences are available (Lechner et al., 2014).
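The concern about completeness metrics based on housekeeping genes can be made concrete with a small sketch. It is not the BUSCO implementation or its interface: it merely summarizes, for a list of expected single copy orthologs, how many were recovered once, more than once, or not at all (real tools also report fragmented hits, which are ignored here). All ortholog names and counts are hypothetical.

```python
from collections import Counter


def completeness_summary(copies_found: dict) -> dict:
    """Percent of expected single copy orthologs found once, duplicated, or missing."""
    total = len(copies_found)
    if total == 0:
        raise ValueError("no expected orthologs supplied")
    counts = Counter(
        "missing" if n == 0 else "single" if n == 1 else "duplicated"
        for n in copies_found.values()
    )
    return {
        category: 100.0 * counts[category] / total
        for category in ("single", "duplicated", "missing")
    }


# Hypothetical copy counts per expected ortholog in two assemblies.
WHOLE_BODY = {"og1": 1, "og2": 1, "og3": 2, "og4": 1}
VENOM_GLAND = {"og1": 1, "og2": 0, "og3": 0, "og4": 1}

if __name__ == "__main__":
    print("whole body :", completeness_summary(WHOLE_BODY))
    print("venom gland:", completeness_summary(VENOM_GLAND))
```

In this toy case the venom gland assembly scores far lower, yet that says nothing about how well its toxin transcripts were assembled, which is the limitation described in the text.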

The previously described situation of hyaluronidase genes in placental mammals illustrates how synteny information can be incorporated in the process of homology assignment. Despite the broader phylogenetic distance and the resulting divergence in sequence similarity, the arrangement of the hyaluronidase genes in the genome of placental mammals allows a distinct ortholog prediction.
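How synteny can support ortholog assignment may be sketched as follows. The example is a simplification under stated assumptions: gene order is given as a plain list per chromosome, orthology of the flanking genes is already known, and the score is just the fraction of conserved neighbors. All gene names are hypothetical placeholders, and published tools use considerably more elaborate models.

```python
def flanking(order: list, gene: str, k: int = 1) -> set:
    """Up to k genes on each side of `gene` in a linear gene order."""
    i = order.index(gene)
    return set(order[max(0, i - k):i] + order[i + 1:i + 1 + k])


def synteny_score(order_a, gene_a, order_b, gene_b, orthologs, k=1) -> float:
    """Fraction of gene_a's neighbors whose known orthologs also flank gene_b."""
    neighbors_a = flanking(order_a, gene_a, k)
    neighbors_b = flanking(order_b, gene_b, k)
    if not neighbors_a:
        return 0.0
    conserved = sum(1 for g in neighbors_a if orthologs.get(g) in neighbors_b)
    return conserved / len(neighbors_a)


# Hypothetical gene orders in two species and known orthologs of the neighbors.
ORDER_A = ["nbr1_a", "query_a", "nbr2_a", "other_a"]
ORDER_B = ["nbr1_b", "cand1_b", "nbr2_b", "far_b", "cand2_b", "far2_b"]
ORTHOLOGS = {"nbr1_a": "nbr1_b", "nbr2_a": "nbr2_b"}

if __name__ == "__main__":
    for candidate in ("cand1_b", "cand2_b"):
        score = synteny_score(ORDER_A, "query_a", ORDER_B, candidate, ORTHOLOGS)
        print(candidate, score)
```

When two candidates are similarly divergent in sequence, the one embedded in the conserved neighborhood is the more plausible ortholog, which is the kind of evidence that only genome data can provide.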

The evolutionary origin of snake venom proteins illustrates how more whole genome data could impact the research on venom evolution. Two studies independently revealed the expression of homologous venom protein genes in salivary glands and in several distinct body tissues in venomous and non-venomous snakes and non-venomous lizard species (Hargreaves et al., 2014; Reyes-Velasco et al., 2015). Both studies describe an ancestral expression pattern and hypothesize about the origin of venom proteins, but the lack of high-quality whole genome data for the majority of the analyzed species impeded precise conclusions about the loss, duplication, or changed expression patterns of specific gene variants. Available genome data and knowledge of lineage specific gene variants (comparable to the provided example of hyaluronidase in placental mammals) would provide the basis for a clear inference of such mechanisms and help to untangle these complex situations.

# WHOLE GENOME STUDIES IN VENOMICS

The majority of recent high-quality genome sequencing projects selected taxa driven by economic interests or human impact (Apis mellifera, Aedes aegypti), research on social-ecological questions (higher, social insects such as ants), and partly based on their phylogenetic key position to shed light on animal evolution (e.g., Nematostella vectensis, Ornithorhynchus anatinus) (Weinstock et al., 2006; Nene et al., 2007; Putnam et al., 2007; Warren et al., 2008; Suen et al., 2011; Wurm et al., 2011). However, whole genome sequences of venomous species from several lineages are still rare, see also **Supplementary Table 1**, despite the constant decrease of sequencing costs and the improvement in new long read sequencing techniques like PacBio or Oxford Nanopore. Genome assembly in particular is still challenging, and increasingly demanding in both time and hardware, despite improvements in this field (Richards, 2018). Nevertheless, whole genome data is available for a few venomous species and was used to address venom evolution. The starlet sea anemone (Nematostella vectensis) represents one of very few "venomous model organisms." This cnidarian species is easy to rear, has a relatively short generation period, offers transgenic tools and employs a venom from specialized cells to prey on other small invertebrates (Hand and Uhlinger, 2006). The initial motivation to sequence the whole genome was the phylogenetic position of Cnidaria as sister group to bilaterian animals (Putnam et al., 2007; King and Rokas, 2017) and implications for eumetazoan evolution when comparing genomic organization, gene repertoire and development. This genomic backbone fueled a first whole genome sequence-based analysis on the evolution of a neurotoxin (Nv1) (Moran et al., 2008), and later the analysis of ontogenetic toxin evolution in the complex life cycle of Nematostella (Columbus-Shenkar et al., 2018). 
Ancestral toxin-genes in this species were probably already present in the last common ancestor of stony corals and sea anemones (500 mya) (Columbus-Shenkar et al., 2018), but the deep evolutionary splits and poor taxon sampling prevent more precise statements about ortholog and paralog relationships of different gene variants. Consequently, processes that lead to the evolution of the toxin function are not known at the moment.

The genome of the platypus (Ornithorhynchus anatinus) (Warren et al., 2008) was originally utilized to understand the phylogenetic position of monotremes and early mammalian evolution (Petersen et al., 2017). Nonetheless, a comparative genomic analysis based on the homology assignment present in the ENSEMBL database v61 (Hubbard et al., 2002; Wong et al., 2012) addressed the evolution of proteins in the venom glands, which are used by male platypus for intra-specific competition and defense. The influence of gene duplication on the recruitment of toxin genes was analyzed by filtering genes of the platypus for monotreme lineage specific gene duplication events. The matching sequences were then compared to known toxin domains and expression in the venom gland, which revealed that only 15% (16 out of 107) of putative toxins arose through gene duplication. It is finally concluded that for the venom composition in platypus, gene duplication plays a minor role; the authors hypothesize instead that alternative splicing (see **Figure 4**) is the major driver (Wong et al., 2012).
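The proportion reported above (16 of 107 putative toxins) can be reproduced with a short sketch of the filtering logic. The `Gene` record, its field names and the synthetic gene list are hypothetical and serve only to illustrate the calculation; they do not reflect the data structures or pipeline used by Wong et al. (2012).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    has_toxin_domain: bool
    venom_gland_expressed: bool
    lineage_specific_duplicate: bool


def duplication_share(genes) -> float:
    """Share of putative toxins that stem from lineage specific duplications."""
    putative = [g for g in genes if g.has_toxin_domain and g.venom_gland_expressed]
    if not putative:
        return 0.0
    duplicated = [g for g in putative if g.lineage_specific_duplicate]
    return len(duplicated) / len(putative)


# Synthetic gene list matching the counts given in the text: 107 putative
# toxins, 16 of them lineage specific duplicates, plus unrelated genes.
GENES = (
    [Gene(f"dup_toxin_{i}", True, True, True) for i in range(16)]
    + [Gene(f"other_toxin_{i}", True, True, False) for i in range(91)]
    + [Gene(f"non_toxin_{i}", False, False, True) for i in range(20)]
)

if __name__ == "__main__":
    print(f"{100 * duplication_share(GENES):.1f}% of putative toxins")
```

The exact value is 16/107, roughly 14.95%, which rounds to the 15% given in the text.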

In contrast, snake venom evolution reflects a different pattern, which is dominated by gene duplication followed by neofunctionalization (see **Figure 4**). The king cobra (Ophiophagus hannah) is a flagship species, an iconic snake that draws attention from scientists and the public alike. Its whole genome sequence was published by a consortium in parallel with that of the Burmese python (Python bivittatus) (Castoe et al., 2013; Vonk et al., 2013). The genomes of both species were compared to each other, to the genome sequence of the anole lizard (Anolis carolinensis), and to ortholog and paralog genes from different vertebrate outgroups in order to assess the evolution of key features such as reduced limb development, changes in organ size after feeding, or the use of venom. Patterns of gene duplication coupled with positive selection were revealed as underlying processes in the neofunctionalization of venom proteins in the king cobra (Vonk et al., 2013). Another mechanism that might shape snake venom composition is the loss of genes. This process is illustrated in a recent study on the evolution of PLA<sub>2</sub> toxins from rattlesnakes, which applied an exome capture approach based on genome data for the diamondback rattlesnake (Crotalus scutulatus). The last common ancestor of rattlesnakes featured neurotoxicity based on PLA<sub>2</sub> toxin variants that originated by duplication. During the evolutionary process some rattlesnakes lost several neurotoxic variants, accompanied by a change in their venom phenotype (Casewell, 2016; Dowell et al., 2016). The authors suspect transposable elements as the source of this process (Dowell et al., 2016). It will be interesting to test the genomic mechanisms of this loss of genes, but also the recruitment of lineage-specific genes, in more detail.
In particular, more comparative whole genome data from other snake groups are needed to comprehensively address lineage-specific toxin evolution as well as ancestral gene clusters. This goal is now moving closer as we currently experience a steep increase in genome sequencing projects, especially for snakes, which have been in the public and scientific focus for decades. In 2018 and 2019, 10 new genome projects for snakes were published (of currently 19 species in total), see also **Supplementary Table 1**. Two of those datasets were recently used in venomics studies of the five-pacer viper (Deinagkistrodon acutus) (Yin et al., 2016) and the habu (Protobothrops flavoviridis) (Shibata et al., 2018). Venom evolution was a minor focus of the analysis of the five-pacer viper genome, in which the authors raised the point that younger lineage-specific venom genes (unique to the venom of the elapid or the viper lineage) are often expressed in the liver tissue of the other species. This would suggest an origin in metabolic proteins for some toxins and that snakes of the elapid and the viper lineages recruited new venom proteins independently in a similar way (Yin et al., 2016).

# THE ROLE OF COMPARATIVE GENOMICS TO ENLIGHTEN TOXIN GENE ORIGIN AND VENOM EVOLUTION

The evolutionary patterns and processes that shape venoms can only be elucidated if comparable genome datasets are used that take the phylogenetic distance of taxa into account. The datasets also need to represent a sufficient sampling of species (including an ancestral sister group) for these taxa to reveal the origin of the investigated toxin. For venomous species this is a challenging task, as elaborated above, since only a few lineages are represented by sufficient genome datasets (**Figure 1**), which ultimately means that in most cases the genomes need to be generated from scratch.

However, some hymenopterans are well-studied at the genomic level, in particular the parasitoid wasp Nasonia vitripennis. Its genome is assembled and annotated at chromosome level, which represents the highest possible quality (Werren et al., 2010). Nasonia and closely related parasitoid wasp species are of key interest for understanding parasitoid biology. These wasps paralyze a host with injected venom that alters its immune system to ensure that the offspring develop without being attacked, while the host is kept alive. The venom is also known to change host metabolism and gene expression, a key feature of interest for applied pharmaceutical research (Martinson et al., 2016). To understand which processes shape this obviously targeted venom, a comparative genomics study analyzed four closely related parasitoid wasps of the group Pteromalidae (including Trichomalopsis s., Urolepis r., Nasonia v., and Nasonia g.). These species showed a rather young maximum divergence time of 4.9 million years but displayed patterns of specialization on different hosts. It was revealed that, depending on the host species, different genes are expressed and identified in the proteome of the venom glands in different wasp species. Most of those gene switches are a result of cis-regulated changes in venom gland expression, which do not fit the classical model of gene function evolution. In the analyzed lineage of parasitoid wasps, venom genes are subject to rapid turnover, and the recruitment of single-copy genes by co-option in the venom gland is the dominant process (Martinson et al., 2017). This pattern was identified thanks to the denser taxon sampling and genome data within the (small) clade of interest and would have been missed if more distantly related species had been used for comparison.

Interestingly, Nasonia is also one of the few venomous species for which the mechanism of horizontal gene transfer (HGT) has been more robustly described (Martinson et al., 2016), see also **Figure 4**. HGT, also referred to as lateral gene transfer, is the non-genealogical exchange of genes between species from separate lineages, in contrast to sexual reproduction, in which genes are inherited within a (vertical) lineage (Keeling and Palmer, 2008; Boto et al., 2014). HGT is one supposed mechanism of toxin evolution. However, while HGT is rather common between microbial organisms (for example from bacteria to bacteria), these events are considered to occur less often in lineages from the animal kingdom, and concrete examples are rare (Keeling and Palmer, 2008; Dunning Hotopp, 2011; Martinson et al., 2016). Nevertheless, reports of HGT from bacteria to animals are strikingly on the rise, and it appears to be more common in groups such as nematodes and arthropods (especially insects), which are more often associated with bacterial endosymbionts or are phytophagous (Dunning Hotopp, 2011; Boto et al., 2014; Gerth and Bleidorn, 2016). In Nasonia, an HGT of a GH19 chitinase that derives from unicellular microsporidia, and that likely also occurred in other parasitoid wasps within the larger group of Chalcidoidea, has been described. This gene, which occurs in plants, bacteria, and microsporidia for defense or nutrient acquisition, has not been identified in other animals, except for a second HGT event into mosquitoes (Martinson et al., 2016). RNAi knockdown experiments for GH19 chitinase show that it induces fly hosts to upregulate genes that are involved in immune responses against fungi.

Based on its high-quality genome, Nematostella represents, as previously discussed, an exceptional taxon for understanding in detail the processes of venom evolution and the origin of toxin genes (Columbus-Shenkar et al., 2018). Interestingly, HGT is one of the described mechanisms. A member of the pore-forming toxins (PFTs) of Nematostella featuring an aerolysin domain has apparently been transferred horizontally from the pathogenic bacterium Aeromonas hydrophila to Nematostella (Moran et al., 2012), and knockdown experiments showed that these genes are functional in the genome. HGT events have been described for other venomous species as well, for example latrotoxin genes from spiders (Gendreau et al., 2017). However, our goal here is not to cover HGT as a possible mechanism in full depth. It needs to be considered, though, that most reports of possible HGT lack hard experimental evidence, such as RNAi experiments, illustrating that genes are functionally incorporated into the genome. Several presumed cases of HGT from bacteria to animals have recently been critically disputed; it appears that some studies falsely concluded HGT based on insufficient analyses and possible contamination (Martin, 2017; Salzberg, 2017; Leger et al., 2018). A prominent example is now known as "tardigate" and refers to the work that presented a tardigrade genome featuring large fractions of bacterial DNA obtained via HGT. However, it later turned out that the claimed unusually high percentage of HGT was caused by inadequate analyses and contamination (Arakawa, 2016; Bemm et al., 2016; Luo et al., 2017).

Comparative analyses of increasing numbers of whole genome sequences have identified a mechanism of gene origin that is referred to as de novo gene evolution. De novo evolved genes, or orphan genes, are species- or lineage-specific, and it was revealed that, in a broad range of phylogenetic lineages, up to one third of the genes present in a genome represent orphan genes (Tautz and Domazet-Lošo, 2011). By definition, orphan genes have no detectable homologs in closely related species, so alternative scenarios that differ from the classical model of gene evolution by duplication are required (Ohno, 2006; Tautz and Domazet-Lošo, 2011). Based on Drosophila melanogaster genome data, it was shown that around 12% of the novel genes originated from non-coding DNA rather than from gene duplication or retroposition (Li et al., 2008), and further evidence supports that these genes quickly evolve to become an essential part of the genome (Chen et al., 2010). The evolution of functional genes from non-coding DNA is also known for Saccharomyces cerevisiae (Cai et al., 2008) and for Mus musculus (Heinen et al., 2009). Expression analyses in the genus Mus supported a rapid turnover of genome transcription and suggested that, over evolutionary time, every part of the genome is transcribed at some point (Neme and Tautz, 2016). Because evolutionary pressure on the non-coding regions of a genome is missing, these regions can accumulate mutations in a more or less unconstrained way. Although the origin of new genes from long non-coding RNAs is still enigmatic, there is evidence that ORFs of a suitable length can arise and are translated (Ruiz-Orera et al., 2014, 2018). The translation of the protein finally provides the starting point for selection to eliminate the new protein if it is deleterious, or to fix it in the genome if it is advantageous, see also **Figure 4**.
Currently, the required high-quality data and taxonomically broader samples of whole genome sequences for venomous species are missing to study this phenomenon. However, this scenario of de novo or orphan gene origin demands further attention in the context of toxin origin. Preliminary data from predatory robber flies (Asilidae) hint at the possibility that this mechanism might play a role in venom evolution in this dipteran group.
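To make the notion of an ORF "of a suitable length" concrete, the following minimal sketch scans the forward strand of a DNA string for open reading frames running from an ATG to the next in-frame stop codon. The function name `find_orfs`, the `min_codons` threshold, and the toy sequence are assumptions chosen for illustration and are not taken from the cited studies; a real analysis would also scan the reverse strand and require transcription and translation evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF runs from an ATG to the next in-frame stop codon.
    `end` is exclusive and includes the stop codon; `min_codons`
    counts the ATG but not the stop (hypothetical threshold).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # walk the sequence codon by codon in this reading frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs

# Toy sequence (assumed): ATG AAA TTT GGG TAA embedded in flanking bases
toy = "CCATGAAATTTGGGTAACC"
print(find_orfs(toy))
```

Raising `min_codons` mimics a stricter length filter; such a scan only nominates candidates and says nothing about whether a given ORF is actually transcribed or translated.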

# PERSPECTIVE

Whole genome data have become increasingly important in a variety of research fields, such as evo-devo, social ecology, phylogenetics, and, finally, more applied areas, sometimes referred to as translational genomics. Many techniques and approaches are utilized to understand gene evolution from these multiple perspectives. However, comparative genomics is still a rather new toolbox in venomics.

We discuss here results from the few studies that already use whole genome data to infer venom evolution, including the potential of such data to address current caveats of de novo transcriptome-based approaches, such as assembly artifacts and incorrect ortholog prediction. We further outline the mostly untapped potential of comparative genomics to comprehend processes of toxin evolution in the broader context of gene origin and evolution. Genome backbones are crucial for addressing questions such as where and how toxin genes evolved within taxon lineages. Particularly important in this context is the gene completeness that is provided by whole genomes (in combination with proteomic and transcriptomic data). Equally fundamental is a sufficiently broad taxon sampling with representative genomes to identify the most ancestral variants of the analyzed toxins for the species group under discussion.

Presently, genome consortia are sequencing genomes from organisms of several groups, for example vertebrates (G10K), marine invertebrates (GIGA), ants (GAGA), arthropods (i5K), and fungi and plants (10KP), producing big data output (Koepfli et al., 2015; Pennisi, 2017; Voolstra et al., 2017; Lewin et al., 2018). The future perspective is a global inventory and preservation of the currently declining biodiversity and its genetic information (Pennisi, 2017; Lewin et al., 2018). Genomes from venomous species represent, for example, one target as a bioresource for possible therapeutics and bioinsecticides (Holford et al., 2018; Senji Laxme et al., 2019). Besides these rather translational and applied aspects, combined efforts to generate more genomes from more broadly sampled venomous lineages would provide better datasets to model venom systems in more detail as a major evolutionary key innovation in the animal kingdom. Comparative genomics could contribute significantly to addressing in depth the mechanisms of toxin gene evolution, environmental or prey-specific adaptations, sex-specific differences, or population variation in a variety of animal lineages, and finally, the molecular basis of morphological adaptations in the venom apparatus.

# DATA AVAILABILITY

No datasets were generated in this study.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## FUNDING

BMvR is funded by grants of the German Science Foundation (Asilid and hymenopteran venom evolution DFG RE3454/4-1 and RE3454/6-1) and by the LOEWE Center for Translational Biodiversity Genomics (Hessen State Ministry of Higher Education, Research and the Arts). This work was conducted within the new Animal Venomics working group at the Fraunhofer Institute for Molecular Biology and Applied Ecology (IME) in Giessen. SD is currently funded by a scholarship (Doktorandenförderplatz) from the University of Leipzig and the Fraunhofer IME.

## ACKNOWLEDGMENTS

We thank Andreas Vilicinskas for founding the working group Animal Venomics at the Fraunhofer Institute for Molecular Biology and Applied Ecology, Branch Bioresources. Martin Schlegel has to be thanked for his perpetual support of SD and BMvR at the University of Leipzig. Especially Frank Förster, Andre Billion, and Robin Tobias Jauss gave valuable feedback and reflected different perspectives on bioinformatics and comparative genomics. We are also grateful to Alessandra Dupont for valuable feedback on the manuscript and a final editing. SD and BMvR acknowledge support from the DFG and University of Leipzig within the program of Open Access Publishing.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2019.00163/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor declared a past co-authorship with one of the authors, BMvR.

Copyright © 2019 Drukewitz and von Reumont. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Variation in the Venom of Parasitic Wasps, Drift, or Selection? Insights From a Multivariate QST Analysis

Hugo Mathé-Hubert<sup>1</sup>\*†, Laurent Kremmer<sup>1</sup>, Dominique Colinet<sup>1</sup>, Jean-Luc Gatti<sup>1</sup>, Joan Van Baaren<sup>2</sup>, Émilie Delava<sup>3</sup> and Marylène Poirié<sup>1</sup>

### Edited by:

Kartik Sunagar, Indian Institute of Science (IISc), India

### Reviewed by:

Mrinalini, National University of Singapore, Singapore

Yehu Moran, Hebrew University of Jerusalem, Israel

Darin Rokyta, State College of Florida, United States

Micaiah Ward, Florida State University, United States, in collaboration with reviewer DR

\*Correspondence:

Hugo Mathé-Hubert hugomh@gmx.fr

### †Present Address:

Hugo Mathé-Hubert, LIEC UMR, Université de Lorraine, CNRS, Metz, France

In Memoriam: This paper is dedicated to the memory of Prof. Roland Allemand

### Specialty section:

This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 01 March 2019 Accepted: 23 April 2019 Published: 10 May 2019

### Citation:

Mathé-Hubert H, Kremmer L, Colinet D, Gatti J-L, Van Baaren J, Delava É and Poirié M (2019) Variation in the Venom of Parasitic Wasps, Drift, or Selection? Insights From a Multivariate QST Analysis. Front. Ecol. Evol. 7:156. doi: 10.3389/fevo.2019.00156

<sup>1</sup> Université Côte d'Azur, INRA, CNRS, ISA, Sophia Antipolis, France, <sup>2</sup> Université de Rennes 1, UMR-CNRS ECOBIO 6553, Campus de Beaulieu, Rennes, France, <sup>3</sup> Université Lyon 1, Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Villeurbanne, France

Differentiation of traits among populations can evolve by drift when gene flow is low relative to drift, or by selection when there are different local optima in each population (heterogeneous selection), whereas homogeneous selection tends to prevent the evolution of such differentiation. Analyses of geographical variation in venom composition have been done in several taxa such as wasps, spiders, scorpions, cone snails, and snakes, but surprisingly never in parasitoid wasps, although their venom should constrain their ability to succeed on locally available hosts. Such a study is now facilitated by the development of an accurate method (quantitative digital analysis) that allows the quantitative variation of large sets of proteins to be analyzed in several individuals. This method was used here to analyze the venom-based differentiation of four samples of Leptopilina boulardi and five samples of L. heterotoma from populations along a 300 km long south-north gradient in the Rhône-Saône valley (South-East of France). A major result is that venom composition makes it possible to differentiate the populations studied, even when they are separated by only a few kilometers. We further analyzed this differentiation among populations (reared under similar conditions to exclude environmental variance) with a QST analysis, which compares the variance of a quantitative trait (Q) among the subpopulations (S) to the total variance (T). We also used random forest clustering analyses to detect the venom components most likely to be locally adapted. The signature of natural selection was strong for L. heterotoma and L. boulardi. For the latter, the comparison with the differentiation observed at some neutral markers revealed that differentiation was partly due to some local adaptation. The combination of methods used here appears to be a powerful framework for population proteomics and for the study of eco-evolutionary feedbacks between the proteomic level and the population and ecosystem levels.
This is of interest not only for studying field evolution at an intermediate level between the genome and phenotypes, or for understanding the role of evolution in chemical ecology, but also for more applied issues in biological control.

Keywords: adaptive divergence, local adaptation, multivariate QST, antagonistic coevolution, individual 1D SDS-PAGE, population proteomics, Leptopilina wasps, Drosophila

# INTRODUCTION

Local adaptation, the fact that populations are better adapted to their own local environment than to another environment, has long been sought as the signature of ongoing natural selection in the field (Futuyma and Moreno, 1988; Kawecki and Ebert, 2004; Poisot et al., 2011). It is considered to be a major cause of the emergence and maintenance of polymorphism (Ravigné et al., 2009). Traits involved in antagonistic interactions can also be subject to such local adaptation, but with the particularity that the interacting species each represent the environment of the other, which can produce local adaptation induced by coevolution. However, the "geographic mosaic theory of coevolution" predicts that the intensity of such coevolutionary dynamics can vary in space according to the specific composition of local communities (Thompson, 1994, 1997; Gomulkiewicz et al., 2000), with, for example, the presence of some alternative prey or host species.

Local adaptation induced by coevolution is particularly expected between species whose interactions involve resistance-virulence traits, as is the case for host-parasite or host-parasitoid interactions. Parasitoids are insects that develop at the expense of a host, usually leading to its death. For such interactions, Gandon (2002) has shown that localized coevolution is more likely when the fitness cost to the host and the specificity of the parasite are high. For most parasitoid species, the fitness costs to the host are maximal since most hosts are killed before producing any offspring, and parasitoid virulence has been shown to be specific (Kraaijeveld et al., 1998; Dupas et al., 2009; Rouchet and Vorburger, 2012, 2014). In addition, local adaptation is more likely to occur when organisms are able to choose their environment (Ravigné et al., 2009). Parasitoids may be able to choose host genotypes that maximize their fitness. Indeed, most species of parasitoids were shown to be able to evaluate several traits of an encountered host, such as size, age, parasitism status, or resistance capacity, thanks to numerous types of chemosensory sensilla on their antennae or ovipositor, before deciding whether or not to lay an egg (van Baaren and Boivin, 1998; Dubuffet et al., 2006; van Baaren et al., 2007). The ability of a parasitoid to select its host, combined with the specificity of virulence, may explain the tendency of similar parasitoid genotypes to parasitize similar host genotypes under certain environmental conditions (Lavandero and Tylianakis, 2013).

If the specificity of virulence and local adaptation exist, they may be supported by the specificity and local adaptation of virulence factors. However, geographic variation of parasitoid virulence factors has been little studied to date, although most adaptive evolutionary constraints are determined at this level of organization (Kawecki and Ebert, 2004; Biron et al., 2006; Wagner, 2012). In many parasitoids, the virulence factors are mainly contained in the venom injected into the host at the time of oviposition, which results in the inhibition of the host immune system as well as the regulation of its physiology or behavior, thus optimizing the development of the parasitoid offspring (Poirié et al., 2009, 2014; Mrinalini et al., 2014; Moreau and Asgari, 2015; Martinson and Werren, 2018; Walker et al., 2018).

Although data for parasitoids are still scarce, geographic variations in venom composition have been investigated in most venomous taxa, e.g., wasps (Perez-Riverol et al., 2017), spiders (Binford, 2001), scorpions (Abdel-Rahman et al., 2009; Rodríguez-ravelo et al., 2013), cone snails (Remigio and Duda, 2008; Abdel-Rahman et al., 2011), and snakes (Wuster et al., 1992; Francischetti et al., 2000; Alape-Giron et al., 2008; Calvete et al., 2011; Holding et al., 2016; Hofmann et al., 2018).

To determine whether geographical variation is the consequence of local adaptation or drift, tests can be done experimentally or using statistical inferences from field observations. For example, Holding et al. (2016) provided experimental evidence for local adaptation of rattlesnake venom by showing that venom was most effective on squirrels sampled at an altitude similar to that of the rattlesnake from which it was extracted. In addition to these reciprocal-transplant-like experiments, local adaptation can be demonstrated using QST approaches that compare the variance of a quantitative trait (Q) among subpopulations (S) to the total variance (T). They generally help to disentangle local adaptation from drift by comparing the differentiation for the traits of interest to that for neutral markers (FST). One of these approaches, the multivariate QST analysis developed by Martin et al. (2008), additionally checks whether trait variances among populations are proportional to their variances within populations, which is expected if the traits are neutral.
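As a minimal sketch of the univariate idea, the code below estimates QST for a single trait from a balanced design using the commonly used diploid form QST = σ²B / (σ²B + 2σ²W), where σ²B and σ²W are the among- and within-population variance components. The function name `qst_balanced`, the toy values, and the use of raw trait variances in place of additive genetic variances are assumptions made for illustration; this is not the multivariate procedure of Martin et al. (2008) and not the analysis performed in this paper.

```python
from statistics import mean, variance

def qst_balanced(groups):
    """Estimate Q_ST for one trait from equally sized population samples.

    groups: list of lists, one list of trait values per population.
    Variance components follow a balanced one-way ANOVA.
    """
    k = len(groups)
    n = len(groups[0])
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 populations of equal size >= 2")
    # within-population component: mean of the per-population sample variances
    var_within = mean([variance(g) for g in groups])
    # among-population component: variance of population means, corrected
    # for sampling error, truncated at zero
    var_means = variance([mean(g) for g in groups])
    var_between = max(var_means - var_within / n, 0.0)
    denom = var_between + 2.0 * var_within
    return var_between / denom if denom > 0 else 0.0

# Toy data (assumed): two populations, three individuals each
print(qst_balanced([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
```

The resulting value would then be compared with FST estimated from neutral markers: a QST clearly above FST is usually read as a sign of heterogeneous selection, and a QST clearly below it as a sign of homogeneous selection.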

In this paper, we have used this multivariate QST approach to assess the effect of drift and selection on the venom of different populations of two parasitoid species of Drosophila, Leptopilina boulardi and L. heterotoma, along a south-north gradient in which the relative importance of the two species varies. In France, these parasitoids of Drosophila have been the subject of an in-depth follow-up for about 30 years along the Rhône-Saône valley. Drosophila melanogaster and D. simulans, the main Drosophila species in this area, are parasitized by three larval parasitoid species, Asobara tabida, L. boulardi, and L. heterotoma. Leptopilina boulardi is generally considered a specialist of these two host species, while L. heterotoma has been found on eight other Drosophila species as well as on flies from the genera Chymomyza and Scaptomyza (Allemand et al., 1999; Fleury et al., 2004, 2009).

The gradient of temperature along the Rhône valley prevents the establishment of L. boulardi, a Mediterranean species, in its northern part. In the South, L. boulardi outcompetes the other parasitoid species (Allemand et al., 1999; Fleury et al., 2004, 2009). The community is therefore dominated by D. simulans and L. boulardi south of the valley, and D. melanogaster and L. heterotoma in the north (Fleury et al., 2004). However, changes have occurred over the past 30 years in the community with L. boulardi moving northward rapidly (90 km per decade), probably in relation to the increase in temperature associated with global warming (Delava et al., 2014). The presence of this new competitor affects the life history traits of A. tabida (Vayssade et al., 2012; van Baaren et al., 2016; Moiroux et al., 2018) but not L. heterotoma, possibly due to a wider overlap of its ecological niche with that of A. tabida (Vayssade et al., 2012).

Leptopilina parasitoids have been widely studied for various life history traits (fecundity, egg size, egg load at emergence, lifetime, ovigeny index), physiology (lipid content, metabolic rate), mobility (e.g., locomotor activity), and features associated with parasitism success or competitiveness (Fleury et al., 1995; Vayssade et al., 2012; Vuarin et al., 2012), notably in the Rhône valley. For example, L. heterotoma shows a strong intraspecific genetic variability in the allocation of resources to different life history traits (Vuarin et al., 2012). Genetic differences in locomotor activity were also observed, with parasitoids of the southern populations being more active at the beginning and the end of the photophase (lower temperature), and those of northern populations during the afternoon (higher temperature; Fleury et al., 1995).

The existence of a spatial structuring of populations, a prerequisite for local adaptation, was demonstrated in Iranian populations of L. boulardi using neutral markers (Seyahooei et al., 2011). Moiroux et al. (2013) also showed local adaptation in L. boulardi, with significant differentiation between orchards and forest habitats, possibly related to a different distribution of resources (e.g., host distribution) in these environments. A similar pattern may explain the characteristics of a Congolese line of L. boulardi that is less effective on D. melanogaster than Mediterranean populations but can develop successfully on the tropical species D. yakuba. A genetic analysis using this line and a line from a Mediterranean population suggested that these differences involve two non-linked loci, each necessary for parasitic success on one of the two hosts (Dupas and Carton, 1999). Interestingly, the Congolese line reveals resistance in D. melanogaster and D. yakuba and has highlighted two major resistance genes, one in each species (Hita et al., 1999, 2006). For L. heterotoma, a comparison of the Seattle (USA) and Igé (France) populations revealed a local adaptation to the host D. subobscura, the parasitoid fecundity being higher when parasitizing the sympatric host (Gibert et al., 2010). Overall, the explanations for populations differentiation mainly involved local adaptation, either to host distribution (Moiroux et al., 2013) or to the presence of competitors (Vayssade et al., 2012).

Surprisingly, although frequent local adaptation of traits involved in virulence is expected, the existence of a population structure based on such traits has rarely been tested. For example, Dupas et al. (2003) conducted such analysis worldwide on populations of L. boulardi, showing that the host resistance rate to the Congolese line was broadly similar regardless of their geographic origin, while parasitoids from tropical Africa were less virulent on D. melanogaster compared to those from other areas.

In Leptopilina parasitoids, virulence depends on the venom injected with the egg at oviposition (Poirié et al., 2009, 2014; Moreau and Asgari, 2015; Walker et al., 2018). Although never conclusively demonstrated, there is ample congruent evidence to suggest that differences in venom composition of parasitoids may explain their difference in host range (e.g., Lee et al., 2009). Parasitoid venoms are complex fluids containing many proteins that have been characterized in several species, including L. boulardi, and L. heterotoma (Colinet et al., 2013a; Goecks et al., 2013). Colinet et al. (2013a) investigated the interspecific and intraspecific variation of venom composition by comparing two L. boulardi lines having different virulence properties and a strain of L. heterotoma. L. boulardi and L. heterotoma had no major venom proteins in common while L. boulardi lines shared only about 50% of their most abundant proteins. Finally, interindividual variations of venom composition were also recently evidenced in strains and populations of species of the Leptopilina and Psyttalia genera (Colinet et al., 2013a,b). This variation in the venom composition may also be at the origin of the rapid evolution of parasitoid virulence observed under experimental evolution (Dion et al., 2011; Rouchet and Vorburger, 2014). Indeed, differential selection of the expression levels of genes encoding venom proteins was observed during the evolution of parasitoid virulence in this last experimental evolution (Dennis et al., 2017). Nevertheless, neutral molecular variation is common, and some of the variations in the venom composition are therefore most likely neutral and do not affect virulence. This raises the question of whether parasitoid populations could be differentiated according to the composition of the venom and which of these differences would be caused by drift or local adaptation.

Here, we have investigated the population structure of L. boulardi and L. heterotoma in the Rhône-Saône valley according to the individual composition of the venom. For L. boulardi, we have studied four populations, two in the south of the Rhône-Saône valley that this species invaded more than 40 years ago (Delava et al., 2014) and two at the northern end of this valley (close to Lyon) that was invaded by this species between 1993 and 2003 (Delava et al., 2014). For L. heterotoma, five populations were studied, one in the south, outside the Rhône-Saône valley (200 km east) where L. boulardi predominates, two in the middle of the Rhône-Saône valley where it appeared between 1993 and 2003, and two in the Saône valley, north of Lyon, where its presence was detected between 2003 and 2011 (Delava et al., 2014). The analysis of the venom composition of individual wasps was performed using a recently developed method based on 1D electrophoresis and further digital and statistical analysis of the intensity of the protein bands (Mathé-Hubert et al., 2015).

We report for the first time a structuration of parasitoid populations based on venom composition, a trait often directly related to the outcome of host-parasitoid interactions. This differentiation among populations in venom composition is partly induced by drift but may also be due to some heterogeneity in the selection pressures experienced by different populations. We used the multivariate QST approach developed by Martin et al. (2008) to detect whether there is selection for the same optimal relative amounts of venom components in all populations (homogeneous selection) or selection for different local optima (heterogeneous selection, or local adaptation), possibly in relation with some variation in the host physiology or resistance. For L. boulardi, data showed that part of the venom-based population structure is induced by a certain heterogeneity in selection within different populations, i.e., some local adaptation. For L. heterotoma, we could only perform part of the analysis, which evidenced a high evolutionary potential of the venom composition. These results are compared with those previously obtained for various parasitoid traits in the Rhône-Saône valley and are discussed in the context of coevolutionary theories.

# MATERIALS AND METHODS

# Sampling and Analyzed Individuals

The four L. boulardi populations were sampled in orchards in September 2010. Sampling locations were Ste Foy-Lès-Lyon and S<sup>t</sup> Laurent d'Agny for northern populations (25 and 29 females sampled, respectively) and Éyguières and Avignon for southern populations (20 and 11 females sampled, respectively). Southern and northern L. boulardi populations are 200 km apart (**Figure 1** and **Table S1**). Each field-collected individual was isolated and allowed to parasitize larvae of a D. melanogaster strain sampled in Ste Foy-Lès-Lyon. The offspring of these individuals were then stored separately at −80 °C. The venom of one daughter was analyzed per field-sampled female.

The five populations of L. heterotoma were sampled in orchards in 2008 (n = 15–20) and then maintained on D. melanogaster (Ste Foy-Lès-Lyon) until the beginning of September 2013, when individuals were stored at −80 °C. As discussed later, we cannot exclude that some of the evolution we detected occurred in the laboratory between 2008 and 2013. For this reason, we will refer to these populations as "laboratory-maintained populations." The five laboratory-maintained populations originated from field samples from Uchizy and Montbellet for northern populations, Sonnay and Épinouze for middle populations, and Vence as a southern population located outside the Rhône-Saône valley. Northern and middle populations are 130 km apart, whereas Vence is located 350 km away from northern populations (**Figure 1**). We analyzed 20 individuals for each laboratory-maintained population.

# Venom Characterization: Sample Preparation and Venom Profile Digital Analysis

The venom apparatuses were dissected individually, and the venom extracted from the venom reservoir was treated as described in Mathé-Hubert et al. (2015) and loaded onto 1D SDS-PAGE gels (Any-kD Mini-PROTEAN® TGX™, Bio-Rad) for L. heterotoma (eight gels) and L. boulardi (seven gels). For both species, we ensured that all gels contained individuals from all studied populations. Briefly, gels were silver stained after migration (Morrissey, 1981) and photographed several times during the protein revelation step (digital camera EOS-5D-MkII, Canon, Japan). Taking several pictures during protein revelation allowed us to account for most of the heterogeneity in protein quantities by ensuring that, for all lanes, at least one picture was available on which all shared bands were revealed without being too saturated (see examples of pictures in **Figure 2**). The high-resolution pictures (5,626 × 3,745 pixels; 16 bits; TIFF file) were then analyzed semi-automatically with a quantitative digital analysis based on the transformation of lanes into intensity profiles by Phoretix 1D (TotalLab, UK), which were then analyzed by R functions. These functions use the 2nd derivative of these profiles to semi-automatically identify bands that are common to a large number of individuals (hereafter "reference bands"). The intensity of these reference bands is then quantified in each individual lane and normalized to account for the remaining heterogeneity in protein amounts (see details in Mathé-Hubert et al., 2015). The analysis resulted in the choice of 29 and 32 "reference bands" of identified molecular weight for L. heterotoma and L. boulardi, respectively. Then, to remove the remaining variation in the total lane intensity and to measure the intensity of these reference bands in each lane, we used the following combination of parameters for L. heterotoma (Lh) and L. boulardi (Lb): (i) "height" for Lh and "volume" for Lb (maximal intensity between the borders of reference bands or total intensity between borders, respectively); (ii) background removed in Phoretix-1D with a "rolling ball" with a radius of 10,000 pixels for both species; (iii) cyclic loess normalization for Lh and quantile normalization for Lb. These two combinations of parameters were selected by comparing the signal-to-noise ratio and the multicollinearity of the dataset produced by each combination of parameters (details in Mathé-Hubert et al., 2015). These normalized intensities of the reference bands are the variables that describe the venom composition.
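As an illustration of the last normalization step, the sketch below shows the general idea of quantile normalization across lanes. It is a minimal Python sketch under stated assumptions, not the R code used in the study: the function name `quantile_normalize` and the toy intensities are hypothetical, and ties between band intensities are handled naively.

```python
def quantile_normalize(lanes):
    """Force every lane to share the same distribution of band intensities.

    lanes: list of lanes, each a list of band intensities of equal length.
    """
    n_bands = len(lanes[0])
    sorted_lanes = [sorted(lane) for lane in lanes]
    # Reference distribution: mean of the k-th smallest intensity across lanes.
    reference = [
        sum(s[k] for s in sorted_lanes) / len(lanes) for k in range(n_bands)
    ]
    normalized = []
    for lane in lanes:
        # Band indices ordered from the weakest to the strongest band.
        order = sorted(range(n_bands), key=lambda k: lane[k])
        out = [0.0] * n_bands
        for rank, k in enumerate(order):
            out[k] = reference[rank]  # each band receives the reference value of its rank
        normalized.append(out)
    return normalized


# Made-up intensities: two lanes, three reference bands.
toy_lanes = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized_lanes = quantile_normalize(toy_lanes)
```

After this step, lanes differ only in which band holds which rank, so differences in the total amount of protein loaded no longer drive the comparison.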

FIGURE 2 | Example of SDS-PAGE gels for venom analysis. Example of SDS-PAGE gels used to characterize the variation in the venom composition of L. boulardi (A) and L. heterotoma (B). Each lane contains the venom of a single wasp whose population of origin is indicated at the top of the lane (Éy, Éyguières; Av, Avignon; SFL, Ste Foy-Lès-Lyon; SLA, S<sup>t</sup> Laurent d'Agny; Mo, Montbellet; Ve, Vence; So, Sonnay; Ép, Épinouze; Uch, Uchizy). The reference bands that significantly discriminate between populations after a Bonferroni correction are indicated.

Lastly, the intensities of some reference bands can be highly correlated, either because they partly overlap (see **Figure 2**) or because of linkage disequilibrium (for more details, see Mathé-Hubert et al., 2015). Such a linkage disequilibrium between genes coding for a higher or a lower expression of different venom proteins may indeed induce some correlation between the bands containing the two proteins. Since the linkage disequilibrium between traits should be low for the multivariate QST analysis (Martin et al., 2008), we merged reference bands that were too highly correlated. These sets of correlated reference bands were detected using a UPGMA dendrogram in which the distance between pairs of reference bands was measured as 1 − |r|, where r is the Pearson correlation coefficient (**Figure S2**). Then, we merged the bands whose correlation was higher than 0.6 into a composite reference band estimated as the mean of the merged bands. For L. boulardi, we built four composite reference bands from reference bands (16, 17), (19, 20), (22, 28), and (30, 31, 32) (**Figure S2A**). For L. heterotoma, we built six composite reference bands from reference bands (1–3), (6, 7), (8, 9), (15, 21), (19, 20), and (28, 29) (**Figure S2B**).
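The merging step described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code: it assumes that "correlation higher than 0.6" corresponds to cutting the UPGMA (average-linkage) dendrogram at a distance of 1 − 0.6 = 0.4, and the function names and band intensities are made up for the example.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def upgma_groups(bands, max_dist=0.4):
    """Average-linkage clustering of bands with distance 1 - |r|.

    Merging stops once the closest pair of clusters is farther apart
    than max_dist (assumed here to be 1 - 0.6).
    """
    names = list(bands)
    dist = {
        frozenset(pair): 1 - abs(pearson(bands[pair[0]], bands[pair[1]]))
        for pair in combinations(names, 2)
    }

    def avg_dist(c1, c2):
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [[n] for n in names]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]),
        )
        if avg_dist(clusters[i], clusters[j]) > max_dist:
            break
        merged_cluster = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged_cluster)
    return clusters


def composite_bands(bands, clusters):
    """Replace each cluster of bands by the per-individual mean intensity."""
    return {
        "_".join(group): [mean(vals) for vals in zip(*(bands[n] for n in group))]
        for group in clusters
    }


# Made-up intensities for three bands measured in five wasps.
example = {
    "B1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B2": [1.1, 2.0, 3.2, 3.9, 5.1],
    "B3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
groups = upgma_groups(example)
merged = composite_bands(example, groups)
```

In this toy data, B1 and B2 are almost perfectly correlated and are merged into a single composite band, whereas B3 stays on its own.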

# Statistical Analyses of the Population Structuration

We performed all statistical analyses in R 3.4.4. The workflow of these analyses is described in **Figure S1**.

### Power of the Venom to Discriminate Populations

For each pair of populations, we used a Random Forest (RF) classification algorithm to predict the population of origin of individuals based on the composition of their venom (the centered and scaled intensities of the reference bands). We conducted the RFs using the package party to account for the multicollinearity and the high number of predictors compared to the low number of individuals per population (Hothorn et al., 2006; Strobl et al., 2008). For each RF, we estimated the discriminating power of the venom composition as the proportion of accurately classified individuals. The significance of this discriminatory power was estimated using 10,000 permutations. To avoid overfitting, this was done using predictions based on the out-of-bag data. For all RFs fitted to L. boulardi, we used the weight argument to ensure that all populations had the same weight although the sample sizes were not balanced.
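The logic of the permutation test on classification accuracy can be sketched as follows. This is a hedged Python illustration and not the analysis run in the study: a leave-one-out nearest-centroid classifier stands in for the conditional Random Forest and its out-of-bag predictions, only 200 permutations are used instead of 10,000 to keep the example fast, the p-value formula (exceedances + 1) / (permutations + 1) is a common convention assumed here, and all names and data are hypothetical.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of trait vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Each individual is predicted from a model that never saw it, which
    plays the role of the out-of-bag predictions of a Random Forest.
    """
    hits = 0
    for i in range(len(X)):
        train = [(x, lab) for k, (x, lab) in enumerate(zip(X, y)) if k != i]
        best, best_d = None, float("inf")
        for lab in sorted(set(l for _, l in train)):
            c = centroid([x for x, l in train if l == lab])
            d = sum((a - b) ** 2 for a, b in zip(X[i], c))
            if d < best_d:
                best, best_d = lab, d
        hits += best == y[i]
    return hits / len(X)


def permutation_p_value(X, y, n_perm=200, seed=1):
    """Compare the observed accuracy with accuracies under shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    labels = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if loo_accuracy(X, labels) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Made-up data: two well-separated populations of eight wasps, two bands each.
X = [[0.1 * i, 1.0] for i in range(8)] + [[5.0 + 0.1 * i, 1.0] for i in range(8)]
y = ["A"] * 8 + ["B"] * 8
accuracy, p_value = permutation_p_value(X, y)
```

Shuffling the population labels breaks any link between venom composition and population, so the distribution of accuracies obtained under shuffling is the null distribution against which the observed accuracy is judged.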

### Effect of Selection on the Variation in Venom Composition Within and Among Populations

We used a QST analysis to disentangle drift, heterogeneous selection (local adaptation) and homogeneous selection (**Figures S1A–C**). The QST compares the variance of a quantitative trait (Q) among the subpopulations (S) to the total variance (T).
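As background, for a single trait in a diploid organism this quantity is commonly written as below. This is the standard textbook expression, given here for orientation rather than as the exact estimator used in this study; the symbols $\sigma^2_{B}$ and $\sigma^2_{W}$ (among- and within-population genetic variances) are notation introduced for this illustration.

$$
Q_{ST} = \frac{\sigma^2_{B}}{\sigma^2_{B} + 2\,\sigma^2_{W}}
\quad\Longleftrightarrow\quad
\frac{\sigma^2_{B}}{\sigma^2_{W}} = \frac{2\,Q_{ST}}{1 - Q_{ST}}
$$

The right-hand form expresses the ratio of among- to within-population variance as a function of the QST. For a neutral trait, the QST is expected to equal the FST of neutral markers on average, which is why trait differentiation can be compared with neutral differentiation.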

As we had no information on the genetic relatedness between the analyzed individuals, this QST analysis was not performed on the additive genetic variance (strict-sense QST) but on the variation between individuals raised in a common environment, i.e., the total genetic variance (broad-sense QST). This broad-sense QST may be biased downward, which is further described in the discussion. The multivariate QST analysis developed by Martin et al. (2008) relies on two complementary tests, the first being more powerful when different traits (here venom components) are subjected to different types of selection regime (i.e., homogeneous, heterogeneous, or neutral), the second when they all undergo the same selection pressures (either homogeneous or heterogeneous). For more details see **Figure S1C**. The R code used to perform these tests was developed by Martin et al. (2008). To ensure its long-lasting accessibility, it is also included in the **Supplementary Material S1**.

## **Test 1. H<sub>0</sub>: Covariances among (D) and within (G) populations are proportional (D = ρ × G)**

For both species, we first summarized the total variation in the venom composition using a principal components analysis (PCA; Dray and Dufour, 2007) in order to visualize how this variation is partitioned within and among populations. Then, we plotted the first two principal components (PC) of the PCA using the R function s.class (package ade4) that groups individuals of the same population. This makes it possible to determine whether, as expected under neutrality, the dimension with the greatest variation within each population is also the one with the greatest variation between populations. To improve this visual comparison of the variation within vs. among populations, we projected on these plots the first PC of a PCA performed on the variation within populations and the first PC of a PCA performed on the variation among populations. These two additional PCAs were performed using the functions wca and bca (package ade4), respectively. If the variation among populations is only induced by drift, then these axes of variation within and among populations should be mostly collinear. We measured this collinearity with the Pearson correlation coefficient. A function performing this analysis was added to the **Supplementary Material S1** (see also **Figure S1A**).
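The comparison of the main axes of variation within and among populations can be sketched as follows. This is a simplified Python illustration, not the ade4-based function from the Supplementary Material: it builds within- and among-population sums-of-squares-and-products matrices directly, extracts their leading axes by power iteration, and measures collinearity as the absolute cosine between the two axes, which is an assumed stand-in for the Pearson correlation used in the study. All names and data are hypothetical.

```python
def mean_vec(rows):
    """Column-wise mean of a list of trait vectors."""
    return [sum(c) / len(rows) for c in zip(*rows)]


def sscp(rows, center):
    """Sums of squares and cross-products of rows around a center."""
    p = len(center)
    m = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [a - b for a, b in zip(r, center)]
        for i in range(p):
            for j in range(p):
                m[i][j] += d[i] * d[j]
    return m


def within_among(groups):
    """Partition the variation into within (W) and among (A) population parts."""
    all_rows = [r for rows in groups.values() for r in rows]
    grand = mean_vec(all_rows)
    p = len(grand)
    W = [[0.0] * p for _ in range(p)]
    A = [[0.0] * p for _ in range(p)]
    for rows in groups.values():
        mu = mean_vec(rows)
        w = sscp(rows, mu)                   # individuals around their population mean
        a = sscp([mu] * len(rows), grand)    # population mean around the grand mean
        for i in range(p):
            for j in range(p):
                W[i][j] += w[i][j]
                A[i][j] += a[i][j]
    return W, A


def leading_axis(m, n_iter=500):
    """Leading eigenvector of a symmetric matrix by power iteration."""
    v = [1.0] + [0.5] * (len(m) - 1)  # arbitrary starting vector
    for _ in range(n_iter):
        w = [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def collinearity(u, v):
    """Absolute cosine between two unit-length axes (1 = collinear, 0 = orthogonal)."""
    return abs(sum(a * b for a, b in zip(u, v)))


# Made-up data: populations differ along trait 1 but vary internally along trait 2.
orthogonal = {
    "North": [[0.0, -1.0], [0.0, 0.0], [0.0, 1.0]],
    "South": [[4.0, -1.0], [4.0, 0.0], [4.0, 1.0]],
}
W, A = within_among(orthogonal)
score_orthogonal = collinearity(leading_axis(W), leading_axis(A))

# Made-up data: populations differ along the same trait that varies within them.
aligned = {
    "North": [[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]],
    "South": [[3.0, 0.0], [4.0, 0.0], [5.0, 0.0]],
}
W2, A2 = within_among(aligned)
score_aligned = collinearity(leading_axis(W2), leading_axis(A2))
```

The second toy data set mimics the neutral expectation (collinear axes), while the first mimics the departure from it (orthogonal axes).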

This expectation of proportionality between the variations within and among populations is the null hypothesis of the first test of the multivariate QST analysis developed by Martin et al. (2008). Indeed, under the null hypothesis that the components of venom are neutral, all the variations observed among populations appeared because the isolated populations drifted independently of each other. In each population, for each trait (here venom components), the amount of drift is proportional to the amount of variance within the population. This gives the statistical expectation that the average matrix of covariances within populations (G) is proportional to the matrix of covariances among populations (D): D = ρ × G, where the two matrices D and G are estimated by a MANOVA and the coefficient ρ of proportionality is estimated by a function developed by Martin et al. (2008). As the sample size per population is small, the p-values were computed using the correction based on the Bartlett adjustment of likelihood ratios (Martin et al., 2008). In this analysis, the confidence interval of the proportionality coefficient ρ is estimated by maximum-likelihood (for more details, see Martin et al., 2008). This confidence interval is used for the second test.

### **Test 2. H<sub>0</sub>: The proportionality coefficient** $\rho = 2F_{ST}/(1 - F_{ST})$

The second test relies on the comparison between the average differentiation observed for the traits of interest (measured by the proportionality coefficient ρ) and the differentiation observed for some neutral markers (measured by the FST). Information on such neutral differentiation was only available for three of the four populations of L. boulardi studied (Avignon, Ste Foy-Lès-Lyon, and S<sup>t</sup> Laurent d'Agny) and it was not available for L. heterotoma. This test was therefore performed only on these three L. boulardi populations. Under the null hypothesis that, on average, the traits used to compute ρ are neutral, $\rho = 2F_{ST}/(1 - F_{ST})$. This is typically tested by checking for an overlap between the confidence interval of ρ and that of $2F_{ST}/(1 - F_{ST})$ (Martin et al., 2008). The only information we had on neutral differentiation was a point estimate of the L. boulardi FST (= 0.084). However, we did not need the confidence interval of the FST because it turned out that the point estimate of $2F_{ST}/(1 - F_{ST})$ itself falls within the confidence interval of ρ. This estimation of the FST is based on the RAD sequencing of L. boulardi individuals from the Rhône–Saône valley, which led to the genotyping of about 16 individuals per population (the three studied populations) on 484 loci (Delava, in preparation).

## Identification of the Venom Components Possibly Under Heterogeneous Selection

The identification of venom components possibly under heterogeneous selection was done in two steps. First, we detected the reference bands most likely to be the ones under heterogeneous selection, and then we identified the components of the venom they contain.

For the first step, we used an RF classification algorithm to predict, for both L. heterotoma and L. boulardi, the population of origin of the individuals based on the composition of their venom (the centered and scaled intensities of the reference bands). We conducted the RFs and computed the conditional importance of each explanatory variable (i.e., reference bands) using the package party (Hothorn et al., 2006; Strobl et al., 2008). We then tested the significance of each explanatory variable by comparing its conditional importance with its null distribution, estimated using 10,000 permutations (Hapfelmeier and Ulm, 2013). The p-values resulting from this analysis were then Bonferroni corrected and we selected for further investigation the reference bands that were still significant.
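The final selection step can be sketched as follows. This minimal Python example only illustrates the Bonferroni correction and the resulting selection of bands; the band names and p-values are invented for the example, and the 0.05 significance threshold is an assumption.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return {band: min(1.0, p * n_tests) for band, p in p_values.items()}


def significant_bands(p_values, alpha=0.05):
    """Bands whose Bonferroni-corrected p-value stays below alpha."""
    adjusted = bonferroni(p_values)
    return sorted(band for band, p in adjusted.items() if p < alpha)


# Invented permutation p-values for three reference bands.
toy_p_values = {"B2": 0.0001, "B6": 0.01, "B12": 0.5}
adjusted = bonferroni(toy_p_values)
selected = significant_bands(toy_p_values)
```

With 10,000 permutations, the smallest attainable raw p-value is of the order of 1/10,000, so a band can only survive the correction when the number of tested bands is small relative to the number of permutations.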

We tentatively identified the proteins present in these reference bands by matching these bands with the bands on the 1D electrophoresis gels used for L. boulardi and L. heterotoma venom proteomics (Colinet et al., 2013a). Since one band can contain different proteins and a given protein can be found in different bands, the number of peptide matches from the mass spectrometry performed by Colinet et al. (2013a) was used to classify the proteins in the bands as abundant or not, using a threshold of 10 peptide matches in the proteomic analysis. In these bands, low-abundance proteins are unlikely to drive the observed population structure.

# RESULTS

# Power of the Venom to Discriminate Between the Populations

In the first step of the analysis, we checked whether the composition of the venom differed among populations, and to what extent these differences were strong enough to predict the population of origin of the wasps based on their venom composition. To the best of our knowledge, this question had never been investigated in a parasitoid wasp. Our field sampling design was particularly well-suited for such analysis since the distance between populations varied widely (from 14 to 230 km for L. boulardi and from 3.3 to 350 km for L. heterotoma).

The RFs on each L. boulardi population pair indicated that, on average, venom accurately classified 74–89% of individuals (**Table 1**). These classifications were significant for three combinations: two of the four South-North comparisons (Avignon vs. Ste Foy-Lès-Lyon and Éyguières vs. S<sup>t</sup> Laurent d'Agny) and one of the two comparisons of populations of the same area (Ste Foy-Lès-Lyon vs. S<sup>t</sup> Laurent d'Agny, two populations only 14 km apart). Furthermore, the classification was only a little more accurate when comparing distant populations than when comparing nearby populations (average accuracies: 81 and 74.1% respectively; **Table 1**). The RFs performed on each pair of L. heterotoma laboratory-maintained populations classified 85 to 97% of individuals correctly by their venom (**Table 2**). These classifications were significant in all comparisons. The highest accuracy was achieved when comparing Vence to another population (average accuracy: 95.6%; **Table 2**), but, excluding Vence, there was almost no difference between comparisons involving distant or nearby populations (average accuracies: 90.6 and 87.5% respectively; **Table 2**).

### TABLE 1 | Venom-based discrimination between L. boulardi populations.

For each pair of populations, the proportion of individuals accurately classified by the RF is given with the permutation-based p-value of the null hypothesis (in brackets). Sample sizes are indicated close to the location name.

# Effect of Selection on the Variation in Venom Composition Within and Among Populations

To detect whether selection affected variation in venom composition among and within populations, we used multivariate QST analysis, which performed two tests. The first investigated whether all traits were under the same selection regime (homogeneous, heterogeneous or neutral) by checking whether the (co)variances among populations were proportional to the (co)variances within populations. If such proportionality exists, then the dimensions that have the most variance among and within populations should be collinear. We provided a visual assessment of this part of the analysis based on PCAs. The second test investigated whether, on average, traits were under homogeneous or heterogeneous selection by testing whether the coefficient of proportionality (ρ) estimated in the first test was different from $2F_{ST}/(1 - F_{ST})$.

For L. boulardi, 34.3% of the total variation in venom composition was summarized by the first two axes of the PCA. The axis summarizing the variation among populations mainly discriminated the populations of the North (Ste Foy-Lès-Lyon and S<sup>t</sup> Laurent d'Agny) from those of the South (Avignon and Éyguières), although it also partly discriminated between the two southern populations. This axis was mainly collinear with the first axis of the global PCA. In contrast, the axis summarizing the variation within populations was mainly collinear with the second axis of the global PCA (**Figure 3A**). This reveals that instead of the collinearity expected under neutrality, the two axes were mainly orthogonal (Pearson correlation of 0.09), thus showing that the reference bands did not all undergo the same type of selection (more details in **Figure S1C**).

A similar result was obtained for L. heterotoma. The first two axes of the PCA explained 39.9% of the total variation in venom composition, and the within and among populations axes were also predominantly orthogonal, with a Pearson correlation of 0.16. On the among populations axis, the population of Vence was completely separated from other populations and that of Montbellet was partially separated (**Figure 3B**).

In agreement with this visual analysis of the venom-based structure of populations, the first test of the QST analysis yielded a highly significant result (for both L. boulardi and L. heterotoma p < 0.001; H<sub>0</sub>: D = ρ × G where G and D are the within and among populations covariance matrices, respectively and ρ is the proportionality coefficient; see section Materials and Methods). This analysis estimated the proportionality coefficient and its 95% confidence interval (L. boulardi: ρ = 0.243; ρ ∈ [0.176; 0.395], which corresponds to a QST of 0.108 [0.081; 0.165]; L. heterotoma: ρ = 0.738; ρ ∈ [0.567; 1.06], which corresponds to a QST of 0.270 [0.221; 0.346]). The coefficient of proportionality of L. heterotoma was three times higher than that obtained for L. boulardi and the confidence intervals did not overlap, meaning that this difference was significant. For L. boulardi we also compared the estimated coefficient of proportionality (ρ) with the FST to perform the 2nd test, H<sub>0</sub>: $\rho = 2F_{ST}/(1 - F_{ST})$, which made it possible to check whether there was some homogeneous or heterogeneous selection (more details in **Figure S1C**). This test was not significant since the estimate of FST = 0.084 gave an estimate of $2F_{ST}/(1 - F_{ST}) = 0.183$, which falls within the confidence interval of ρ.
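The arithmetic behind these figures can be checked with a short Python sketch. The conversion from ρ to QST used here, QST = ρ / (2 + ρ), is not stated in the text; it is inferred by inverting ρ = 2QST / (1 − QST), and it reproduces the reported values. The function names are hypothetical.

```python
def neutral_rho(fst):
    """Value of rho expected under neutrality: 2 * FST / (1 - FST)."""
    return 2 * fst / (1 - fst)


def rho_to_qst(rho):
    """Inverse of rho = 2 * QST / (1 - QST) (inferred, see lead-in)."""
    return rho / (2 + rho)


def falls_within(value, interval):
    """True when value lies inside the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high


# Values reported in the text for L. boulardi.
rho_lb, ci_lb = 0.243, (0.176, 0.395)
expected = neutral_rho(0.084)                         # about 0.183
qst_lb = rho_to_qst(rho_lb)                           # about 0.108
qst_ci_lb = tuple(rho_to_qst(r) for r in ci_lb)       # about (0.081, 0.165)
neutral_not_rejected = falls_within(expected, ci_lb)  # test 2 is not significant

# Values reported in the text for L. heterotoma.
qst_lh = rho_to_qst(0.738)                            # about 0.270
```

Because the neutral expectation lies inside the confidence interval of ρ, the average level of differentiation of the venom is not distinguishable from that of neutral markers, even though the first test rejects strict proportionality.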

# Identification of the Venom Components Possibly Under Heterogeneous Selection

Reasoning that the locally adapted reference bands are the ones that discriminate the most between populations, we used RF to predict the populations of origin of individuals according to the intensity of reference bands, and we selected for further investigation those which remained significant after a Bonferroni correction. For L. boulardi, this led to the identification of nine reference bands (**Figure 4**) which, in agreement with the PCA (**Figures 2**, **3A**), mainly discriminated between populations of the north and populations of the south, except for band B12, which mainly discriminated Ste Foy-Lès-Lyon from other populations. Also, as observed on the PCA, Avignon was more strongly discriminated than Éyguières (bands B26 and B29), although this tended to be the opposite for bands B2 and B6.

FIGURE 3 | PCAs on the venom composition. The variation in the venom composition of L. boulardi (A) and L. heterotoma (B) is summarized using a PCA. Each dot corresponds to a single wasp that is positioned according to its venom composition. Then, the individuals of the same population are grouped and the ellipse summarizes their dispersion by indicating the standard deviation within the population. The percentage of variance explained is given at the end of each axis. The solid blue line and the dotted green line are the projections of the axes maximizing the within and the among populations variance, respectively. Under neutrality, they are expected to be collinear.

For L. heterotoma, this allowed the identification of 13 reference bands that, in agreement with the PCA (**Figure 3B**), distinguished mainly between Vence and other laboratory-maintained populations (B4, B6\_7, B8\_9, B16, B23, and B28\_29). However, there were some other differences (**Figure 5**). Montbellet was separated from other laboratory-maintained populations for bands B5 and B10. The band B12 tended to isolate Épinouze, and the bands B8\_9, B14, and B19\_20, Uchizy. The band B11 varied along a south-north gradient, with the lowest intensities at Vence (South), then regrouping Sonnay and Épinouze (middle of the sampled area) and finally Uchizy and Montbellet (North). In contrast, the bands B15\_21 isolated Montbellet and Uchizy (North) from the other three laboratory-maintained populations. Most of these populations had different average intensities for band B16.

For both species, a putative function could be assigned to only a few of these reference bands by comparing the 1D protein electrophoreses of this study to that used by Colinet et al. (2013a) for the venom proteomics (**Tables 3**, **4**). For L. boulardi, the band B2 contained two proteins without any predicted function, the band B24 contained a serine protease inhibitor of the serpin superfamily (LbSPNm) and bands B27 and B29 mainly contained RhoGAP domain-containing proteins (LbGAP1, 2, 3, and 5). For L. heterotoma, bands B4 and B20 contained proteins of no predicted function. Bands B16 and B23 contained RhoGAP domain-containing proteins (LhGAP3 and LhGAP2, respectively), an additional protein in the band B23 being a serine protease inhibitor of the serpin superfamily (LhSPN). Bands B28 and B29 contained an aspartylglucosaminidase (AGA) and a protein of unknown function.

with an underscore in the ID. Bonferroni corrected p-values are shown next to the ID of the reference bands. Reference bands are sorted from low to high molecular weight. These bands can be observed in Figure 2B.

# DISCUSSION

The venom is a crucial component of the success of most endoparasitoids and its composition varies among individuals (Colinet et al., 2013b). However, although population biology studies of the geographical variation in venom composition have been conducted on several venomous taxa (Wuster et al., 1992; Francischetti et al., 2000; Binford, 2001; Alape-Giron et al., 2008; Remigio and Duda, 2008; Abdel-Rahman et al., 2009, 2011; Calvete et al., 2011; Rodríguez-Ravelo et al., 2013; Holding et al., 2016; Perez-Riverol et al., 2017; Hofmann et al., 2018), this was not the case for parasitoids. This is surprising since, in parasitoid wasps, venom should constrain the capacity to succeed on locally available hosts. Performing such a study is now facilitated by the development of a rapid and accurate method to analyse the quantitative variation of large sets of proteins from several individual samples (Mathé-Hubert et al., 2015). This method was used here to analyse the venom-based differentiation of four L. boulardi and five L. heterotoma samples from populations along a 300 km-long south-north gradient in the Rhône-Saône valley (South-East of France).

TABLE 3 | Correspondence between L. boulardi bands that discriminate between populations and their putative protein content.

Band contents were obtained from the 1D-SDS-PAGE proteomic study by Colinet et al. (2013a). Only proteins for which at least 10 peptide matches in Mascot searches against unisequences identified by transcriptomics of the venom apparatus were considered as abundant and used. The number of proteins in the band, their predicted function and the number of peptide matches are provided. NA, data not available.

A major result of these analyses was that the venom composition, characterized by the relative proportions of the different constitutive proteins, allowed a significant discrimination of populations, even if only a few km apart. Such a differentiation among populations can evolve by drift when gene flow is low relative to drift or by selection when there are different local optima in each population (heterogeneous selection), whereas homogeneous selection tends to prevent evolution of differentiation among populations.

We further analyzed these differentiations with a QST analysis that partitions the variance of a quantitative trait (Q) among subpopulations (S) relative to the total population (T), using the multivariate QST analysis developed by Martin et al. (2008). In the strict sense, QST analyses should be based on additive genetic variance, but in practice, they also use QST in the broad-sense that applies to the total genetic variance (additive + epistasis + dominance; Bouétard et al., 2014; Porth et al., 2015), as well as the PST that considers the phenotypic (P) variance, thus also including the non-genetic environmental variance (Brommer, 2011).

In this study, we could exclude the existence of environmental variance since individuals were raised in a common environment, meaning that the phenotypic variance is equal to the total genetic variance. We therefore used broad-sense QST analyses which assume the lack of non-additive genetic variances, an approximation that can bias the QST estimates. This bias is generally considered low because, in most cases, QST is computed for multigenic traits (Leinonen et al., 2013). However, the amount of a given venom component may depend on a low number of loci (although see Albert et al., 2014), so this bias could be more substantial. It generally induces an underestimation of the QST, both when it is due to neglected epistasis variance (Whitlock, 1999) and when due to neglected dominance variance (Goudet and Büchi, 2006; Leinonen et al., 2013), although this has been more controversial (Goudet and Martin, 2007). For a more detailed discussion on this bias, see Leinonen et al. (2013). Downward-biased QST estimates are too conservative for the detection of heterogeneous selection and too sensitive when detecting homogeneous selection, which is reassuring since, in such an analysis, heterogeneous selections are generally more interesting than homogeneous selections.

TABLE 4 | Correspondence between L. heterotoma bands that discriminate between the laboratory-maintained populations and their putative protein content.

Band contents were obtained from the 1D-SDS-PAGE proteomic study by Colinet et al. (2013a). Only proteins for which at least 10 peptide matches in Mascot searches against unisequences identified by transcriptomics of the venom apparatus were considered as abundant and used. The number of proteins in the band, their predicted function and the number of peptide matches are provided. NA, data not available.

# Variation of the Venom Composition Among Populations: Drift or/and Selection?

### Leptopilina boulardi

The venom-based differentiation between distant populations (South vs. North) was only slightly higher than that between nearby populations. This is quite surprising since some populations were very close to each other. For example, venom significantly discriminated Ste Foy-Lès-Lyon and S<sup>t</sup> Laurent d'Agny, two northern populations only 14 km apart. This differentiation can originate from drift, heterogeneous selection, or a combination of both.

A strong drift could come from the northward extension of the distribution of L. boulardi. Indeed, its distribution range has progressed 170 km northwards from 1993 to 2011 (Delava et al., 2014). The wavefront of expanding populations is known to drift significantly because the individuals that drive this expansion are sampled from previously sampled individuals (Travis et al., 2007; Hallatschek and Nelson, 2010). When the wavefront is expanding fast enough, the drift is expected to create a gradient of differentiation from the source to the wavefront (e.g., Moreau et al., 2011). However, only a slight North-South differentiation was observed for L. boulardi, and it was of the same order as the differentiation between nearby populations. Another explanation would be a metapopulation dynamic with frequent extinctions and recolonizations, due to a high migration rate. Such a metapopulation dynamic is expected to create differentiation, even between close populations, since colonization events are associated with founder effects, as already demonstrated (Weisser, 2000; Rauch and Weisser, 2007; Nyabuga et al., 2011).

Alternatively, the similar magnitude of differentiation between distant populations (North vs. South) and nearby populations could also be explained if: (i) the effect of drift on differentiations is low, either due to a low drift or a high migration rate and (ii) the heterogeneous selection is strong enough to counter the effect of drift and migration, thus creating some local adaptation. In keeping with this, the QST analysis detected some selection. Indeed, there was a highly significant lack of proportionality between variations within and among populations. Such a proportionality is expected in the absence of natural selection since if the variation among populations comes only from the drift occurring separately within each, the magnitude of drift should be proportional to the magnitude of the variation within populations. This lack of proportionality thus reveals the variation in the selection pressures experienced by the different venom components.

Such variation among traits (here venom components) can have different origins. While the observed variation is most likely neutral for some traits, the non-neutral variation of others may fall anywhere in a continuum described by the following three categories. Firstly, most selection pressures could be homogeneous among populations, which would keep all populations around the same optimal values and give a proportionality coefficient ($\rho$) lower than $\frac{2F_{ST}}{1-F_{ST}}$. Secondly, most selection pressures could be heterogeneous among populations, which would push them toward different optimal values ($\rho > \frac{2F_{ST}}{1-F_{ST}}$). Thirdly, there could be a combination of both homogeneous and heterogeneous selection pressures, with no detectable differences between the levels of the two types of selection regimes ($\rho \approx \frac{2F_{ST}}{1-F_{ST}}$). This is the situation identified for the venom of L. boulardi. While the presence of homogeneous selection was highly expected for a complex trait related to fitness such as parasitoid wasp venom, that of heterogeneous selection was not, and its level was therefore unknown. The absence of a significant difference between $\rho$ and $\frac{2F_{ST}}{1-F_{ST}}$ suggests that both are of comparable strength.
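The comparison described above can be illustrated with a minimal sketch. It assumes that a confidence interval for the proportionality coefficient ρ and an estimate of FST from neutral markers are already available. The function names and the numerical values are hypothetical and chosen only for illustration; this is not the R code of Supplementary Material S1. Note that a ρ indistinguishable from 2FST/(1 − FST) points to a mix of both selection regimes only if neutrality has already been rejected by the proportionality test.

```python
def neutral_expectation(fst: float) -> float:
    """Return 2*FST / (1 - FST), the value rho is compared against."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("FST must lie in [0, 1)")
    return 2.0 * fst / (1.0 - fst)


def classify_selection(rho_low: float, rho_high: float, fst: float) -> str:
    """Place a confidence interval for rho relative to 2*FST / (1 - FST).

    rho_low and rho_high are the (assumed) bounds of the interval for rho.
    """
    expected = neutral_expectation(fst)
    if rho_high < expected:
        return "mostly homogeneous selection"
    if rho_low > expected:
        return "mostly heterogeneous selection"
    return "rho not distinguishable from 2FST/(1-FST)"


# Illustrative values only (not results from this study).
print(neutral_expectation(0.05))
print(classify_selection(0.02, 0.30, 0.05))
```

Working with an interval rather than a point estimate keeps the third category explicit: the two regimes are called comparable only when the data cannot separate ρ from its reference value.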

### Leptopilina heterotoma

A much higher differentiation of venom was observed for L. heterotoma compared to L. boulardi. This high structuration among populations could have resulted from a low gene flow among populations. However, previous studies in the South-East of France evidenced a low structuration based on neutral markers, indicating thus a consistent gene flow (overall FST = 0.036; Ris, 2003). This leaves two non-mutually exclusive hypotheses for explaining the strength of this venom-based differentiation. Differentiation might be adaptive, resulting from some heterogeneity among populations in the optimal venom composition. Alternatively, all or part of the differentiation may have evolved by drift during the 5 years of laboratory rearing of the populations between field sampling and the venom analysis (see for example Simoes et al., 2010).

As with L. boulardi, there was a very significant lack of proportionality between variations within and among populations that could have been generated by heterogeneous or homogeneous selection or a combination of both. The second test would have allowed discriminating between these hypotheses but was not feasible for L. heterotoma. An additional level of uncertainty comes from where the detected selection may have occurred. Indeed, the laboratory rearing conditions could have imposed certain modifications in the optimal composition of the venom to which the five laboratory-maintained populations could have adapted by homogeneous selection (van Lenteren, 2003).

In summary, for L. heterotoma, strong differentiation and lack of proportionality may reflect two non-mutually exclusive scenarios. Heterogeneous selection in the field alone could explain both. In addition, the combination of homogeneous laboratory and/or field selection and laboratory drift may explain the lack of proportionality and strong differentiation, respectively. Importantly, whatever the reasons for the observed differentiation and non-proportionality, results highlight the high amount of genetic variation in the venom composition of L. heterotoma.

## Local Adaptation of the Venom Composition?

For L. heterotoma, we could not test for local adaptation of the venom composition. However, several ecological features may have created it. Firstly, both spatial and seasonal variations in the relative abundances of Drosophila species have been described in the Rhône valley (Allemand et al., 1999; Ris, 2003). L. heterotoma is a generalist species whose venom may therefore contain either "generalist" virulence factors or a combination of factors, each of which is effective for a subset of the host range. In the latter case, the local availability of the hosts could strongly influence the local composition of the venom. This variation in host availability is also likely affected by the competition with L. boulardi, both a specialist and a powerful competitor, which was absent from northern populations in 2003 and present in 2011. Its arrival may have induced changes in the exploitation pattern of Drosophila species by L. heterotoma, resulting in different selection pressures on venom components.

Interestingly, a strong latitudinal genetic differentiation of L. heterotoma was also previously reported, based on fitness traits (Fleury et al., 1995, 2004). However, this could be explained either by differences in abiotic conditions (e.g., temperature) or by a variation in the presence of L. boulardi as a competitor, or both. Another piece of information suggesting some heterogeneous selection could be the putatively adaptive North-South genetic differentiation for the time of preferred locomotor activity, with an ongoing shift between Lyon and Antibes that has been highlighted by Fleury et al. (1995). Indeed, Vence is not far from Antibes, and the middle populations of the Rhône Valley are not far from Lyon. Such a temperature-related difference in the period of activity could have other consequences on both the host and parasitoid physiology, which may affect the optimal venom composition.

For L. boulardi, the QST analysis detected some local adaptation of the composition of the venom, which could be induced by certain biotic or abiotic variations. Local adaptation could indeed be an optimization of something other than virulence if the expression of venom proteins is associated with other physiological traits. For example, if a venom protein is encoded by a gene (i) that is also expressed elsewhere than in the venom gland and (ii) that shows variation in cis-regulation of expression, then a local adaptation could be detected on the venom composition even though the selection takes place due to the protein effect in tissues other than the venom gland. However, such indirect selection on the composition of the venom should be rather rare because the changes it induces are likely not neutral if these venom proteins are costly to produce (Morgenstern and King, 2013).

Alternatively, this might reflect some variability in host availability. Indeed, in the studied area, L. boulardi parasitizes both D. melanogaster, predominant in the North, and D. simulans, predominant in the South (Fleury et al., 2004), and the optimal venom composition differs for success on each species (Cavigliasso et al., in preparation). The North could therefore be a hotspot of co-evolution with D. melanogaster and a coldspot with D. simulans, and vice versa for the South. However, this North-South variation in host availability cannot explain why venom composition differs between close populations. This variation between neighboring populations is the main surprise of this study and deserves to be analyzed with larger samples. As mentioned in the extinction/recolonization scenario, this differentiation could come from drift. If, alternatively, it is a local adaptation of virulence, it may be an adaptation to an intra-specific variation in D. melanogaster, the main host of L. boulardi in this region (Fleury et al., 2004).

Such a "very local" adaptation may seem implausible given the low FST between these populations. However, local adaptation induced by coevolution differs from classical local adaptation. For example, since parasitoids and hosts interact antagonistically, the local adaptation of one species in a given population is also the local maladaptation of the other species. The interactions between local adaptation and coevolution were studied by Gandon (2002) and Gandon and Nuismer (2009). They reveal that when genetic drift is substantial, higher migration rates actually promote local adaptation, as migrants increase the evolutionary potential (genetic diversity) of the population, which is needed to overcome the effect of drift and win the arms race (Gandon, 2002). Also, although drift generally decreases local adaptation, it may sometimes have the opposite effect in the case of local adaptation of traits involved in an antagonistic co-evolutionary dynamic (Gandon and Nuismer, 2009). Indeed, drift occurring independently in each population of two species of a co-evolving "couple" may induce a certain spatial heterogeneity to which both species can adapt locally, which reinforces the spatial heterogeneity initiated by drift. Such simultaneous substantial amounts of drift within each population combined with migration among populations are precisely the predicted situation under the extinction/recolonization scenario we previously considered. Such a scenario is not unlikely if we consider (i) the short generation time of these species, (ii) their high fertility, (iii) their ability to fly and disperse over long distances with the wind, (iv) the known seasonal variations in population size, and finally, (v) the possibility of local extinctions or large fluctuations in the sizes of populations of hosts and parasitoids driven by the parasitic interaction, which would create a strong drift. These conditions are generally likely to create a metapopulation dynamic (Fronhofer et al., 2012).

If we combine this view of host-parasitoid metapopulation dynamic with the notion of informed dispersal (Clobert et al., 2009), then local adaptation between close populations for a trait as important as virulence may be expected. As parasitoids can accurately estimate the suitability of a host before oviposition (van Baaren and Boivin, 1998; Dubuffet et al., 2006; van Baaren et al., 2007), their decision to settle or continue their migration, when emigrating, may depend on the prevalence of hosts that fit their needs. Such non-random dispersal strongly increases the possibility of local adaptation (Ravigné et al., 2009).

## Identifying the Venom Components Likely Under Heterogeneous Selection

To identify the traits under heterogeneous selection, we used the RF clustering algorithm to detect reference bands that discriminate significantly among populations. Indeed, heterogeneously selected venom components provide the greatest discriminating power among populations (high variance among populations relative to the variance within populations). We identified venom components (reference bands) that significantly discriminate between populations. However, even reference bands for which variation is neutral are expected to sometimes discriminate populations. We thus applied a Bonferroni correction to keep only the bands with the highest discriminating power. This selected 9 and 13 reference bands for L. boulardi and L. heterotoma, respectively. These two sets of reference bands should not be interpreted in the same way since we did not formally identify heterogeneous selection for L. heterotoma, which we did for L. boulardi.
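The filtering step described above can be sketched as follows. This is a minimal illustration of a Bonferroni correction only: it assumes that one p-value per reference band has already been obtained for its ability to discriminate populations. The function name, band labels, p-values, and the 0.05 significance level are hypothetical and are not taken from the analysis.

```python
def bonferroni_select(p_values: dict[str, float], alpha: float = 0.05) -> list[str]:
    """Return the bands whose p-value passes the Bonferroni threshold.

    The per-band threshold is alpha divided by the number of bands tested,
    which controls the chance of keeping even one neutral band by accident.
    """
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return sorted(band for band, p in p_values.items() if p <= threshold)


# Hypothetical p-values for four reference bands (illustration only).
example = {"band_01": 0.0004, "band_02": 0.02, "band_03": 0.011, "band_04": 0.2}
print(bonferroni_select(example))
```

With four bands the threshold is 0.05 / 4 = 0.0125, so a band with p = 0.02 would pass an uncorrected test but is dropped after correction.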

The content of these reference bands was tentatively identified by manually matching them with the 1D electrophoresis gels analyzed by proteomics by Colinet et al. (2013a). This proteomics study characterized the venom content of L. boulardi and L. heterotoma lines, which are probably not representative of the whole diversity observed in the field. Nevertheless, there should be an overall correspondence between the content of the reference bands analyzed in this study and that of the bands analyzed by Colinet et al. (2013a). This identified 16 proteins as potentially under heterogeneous selection, six of which have no identified putative function. This highlights that our method, which requires no prior knowledge of the proteins, can be used to discover novel active proteins. Indeed, venoms have strong and diverse biological activities but they contain a great diversity of proteins that might not all be worth studying. Our approach allows the selection, without a priori assumptions, of proteins likely to contribute to parasitism success on the host. The other identified proteins are members of the RhoGAP and serpin families found in reference bands of both Leptopilina species, as well as an aspartylglucosaminidase (AGA) found in two L. heterotoma reference bands. Among the proteins identified in the discriminating reference bands are LbGAP, a L. boulardi RhoGAP seemingly required for parasitism success on a resistant D. melanogaster strain (Labrosse et al., 2005b; Colinet et al., 2010), and LbSPNm, a serpin of L. boulardi closely related to a member of this family involved in suppressing the Drosophila immune defense (Colinet et al., 2009). This suggests that at least some of the discriminating reference bands contain proteins that may be involved in virulence. The observed variation in their amount in the venom is thus unlikely to be neutral, especially since the production of many venom proteins is probably energetically expensive (Morgenstern and King, 2013).

# CONCLUSION

In this study, we illustrate how the method developed by Mathé-Hubert et al. (2015) can be used for the venomics of populations or, more generally, for population proteomics. It shows for the first time that parasitoid wasp venom can be used to discriminate populations, including geographically very close ones. This is accurately demonstrated for L. boulardi, whereas differentiation among populations of L. heterotoma might be partly due to drift in the laboratory.

We also illustrated the use of the multivariate QST analysis developed by Martin et al. (2008) on the data generated by the method of Mathé-Hubert et al. (2015) to detect the signature of natural selection. This signature is strong for both L. heterotoma and L. boulardi. For the latter, the comparison with the differentiation observed for neutral markers revealed that differentiation was partly due to some local adaptation.

Using proteomic data, we were able to tentatively identify the proteins contained in the bands that discriminate populations and therefore whose quantity might have evolved under heterogeneous selection. This identified several proteins known or suspected to be involved in virulence, but also proteins that have no identified putative function and would therefore deserve further investigation.

The combination of these two methods appears to be a powerful framework for population proteomics (Biron et al., 2006) and for the study of eco-evolutionary feedbacks between the proteomic level and the population and ecosystem levels (Diz et al., 2012). This is of interest not only to study evolution in the field at an intermediate level between the genome and phenotypes, or to understand the role of evolution in chemical ecology, but also for more applied issues. For example, biological control agents, sampled in the field in their area of origin, are often reared in the laboratory on a host species other than the target one (van Lenteren, 2003). The framework presented here could be used to monitor their potential unwanted adaptation to the laboratory host.

# DATA AVAILABILITY

The data generated from the gel pictures analysis are available in the **Supplementary Material S2**. The raw gel pictures will be made available by the authors, upon request, without undue reservation.

# AUTHOR CONTRIBUTIONS

DC, HM-H, J-LG, and MP conceived and designed research. JV was responsible for the sampling and rearing of the L. heterotoma populations. LK performed the analysis of the venom protein composition and data acquisition. HM-H designed and performed the statistical analyses. ÉD estimated the FST. HM-H, MP, DC, J-LG, ÉD, and JV performed the writing and editing of the manuscript. All authors read and approved the final manuscript.

# FUNDING

This work received support from the French National Research Agency (CLIMEVOL project, ANR-08-BLAN-0231) and the Department of Plant Health and Environment from the French National Institute for Agricultural Research (INRA). It was performed in the context of the Investments for the Future LABEX SIGNALIFE: program reference ANR-11-LABX-0028. HM-H was funded by the Provence Alpes Côte d'Azur (PACA) region and the Department of Plant Health and Environment from the French National Institute for Agricultural Research (INRA).

# ACKNOWLEDGMENTS

We are very grateful to Patricia Gibert and Stéphanie Llopis for maintaining and providing the L. boulardi and L. heterotoma populations and for producing and sending us the individuals analyzed in this study. We also thank C. Rebuf for technical assistance.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2019.00156/full#supplementary-material

Table S1 | Sampling locations.

Figure S1 | Workflow of the statistical analysis.

Figure S2 | Analysis of the correlation between the reference bands for L. boulardi and L. heterotoma (panel A and B respectively).

Supplementary Material S1 | R functions allowing to reproduce the Qst analysis (tests and graphics).

Supplementary Material S2 | Data produced in this work.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Mathé-Hubert, Kremmer, Colinet, Gatti, Van Baaren, Delava and Poirié. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Venom Costs and Optimization in Scorpions

### Edward R. J. Evans<sup>1</sup>\*, Tobin D. Northfield<sup>2,3</sup>, Norelle L. Daly<sup>1</sup> and David T. Wilson<sup>1</sup>\*

*<sup>1</sup> Centre for Molecular Therapeutics, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, QLD, Australia, <sup>2</sup> Department of Entomology, Tree Fruit Research and Extension Center, Washington State University, Wenatchee, WA, United States, <sup>3</sup> Centre for Tropical Environmental and Sustainability Science, College of Science and Engineering, James Cook University, Cairns, QLD, Australia*

Scorpions use venoms as weapons to improve prey capture and predator defense, and these benefits must be balanced against costs associated with its use. Venom costs involve direct energetic costs associated with the production and storage of toxins, and indirect fitness costs arising from reduced venom availability. In order to reduce these costs, scorpions optimize their venom use via evolutionary responses, phenotypic plasticity, and behavioral mechanisms. Over long timescales, evolutionary adaptation to environments with different selection pressures appears to have contributed to interspecific variation in venom composition and stinger morphology. Furthermore, plastic responses may allow scorpions to modify and optimize their venom composition as pressures change. Optimal venom use can vary when facing each prey item and potential predator encountered, and therefore scorpions display a range of behaviors to optimize their venom use to the particular situation. These behaviors include varying sting rates, employing dry stings, and further altering the volume and composition of venom injected. Whilst these cost-reducing mechanisms are recognized in scorpions, relatively little is understood about the factors that influence them. Here, we review evidence of the costs associated with venom use in scorpions and discuss the mechanisms that have evolved to minimize them.

Keywords: scorpion, venom, optimization, evolution, plasticity, behavior

# INTRODUCTION

Venomous organisms inject chemical cocktails into their predators and prey in order to disrupt normal biological functioning in their target (Fry et al., 2009; Casewell et al., 2013). These chemical weapons are often rich in proteins, peptides, and small molecules (Inceoglu et al., 2003; Escoubas et al., 2008; Calvete et al., 2009; Fry et al., 2009; Villar-Briones and Aird, 2018). Whilst venom provides survival benefits by aiding in prey capture and predator defense, the benefits come with costs. These costs are two-fold, involving direct energetic costs associated with production and storage of toxins (McCue, 2006; Nisani et al., 2007, 2012), and further indirect costs associated with a reduced capacity to capture prey or defend when supplies are depleted. Whilst these costs have different types of impacts on venomous animals, the methods to reduce these costs can overlap. It has been proposed that due to costs associated with venom use, organisms will meter/optimize the volume of venom they inject in order to use their venom as economically as possible in different situations (Hayes et al., 2002; Wigger et al., 2002; Hayes, 2008; Morgenstern and King, 2013). Research into optimal venom use has primarily focussed on snakes (Hayes, 1995, 2008; Hayes et al., 2002; Young et al., 2002), spiders (Wigger et al., 2002; Wullschleger and Nentwig, 2002; Hostettler and Nentwig, 2006; Nelsen et al., 2014; Cooper et al., 2015), and scorpions (Edmunds and Sibly, 2010; Nisani and Hayes, 2011; Lira et al., 2017), with the latter being the focus of this review.

### Edited by:

*Maria Vittoria Modica, Stazione Zoologica Anton Dohrn, Italy*

### Reviewed by:

*Timothy Jackson, The University of Melbourne, Australia Cassandra Marie Modahl, National University of Singapore, Singapore*

### \*Correspondence:

*Edward R. J. Evans edwardrobertjonathan.evans@my.jcu.edu.au David T. Wilson david.wilson4@jcu.edu.au*

### Specialty section:

*This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution*

> Received: *27 March 2019* Accepted: *13 May 2019* Published: *06 June 2019*

### Citation:

*Evans ERJ, Northfield TD, Daly NL and Wilson DT (2019) Venom Costs and Optimization in Scorpions. Front. Ecol. Evol. 7:196. doi: 10.3389/fevo.2019.00196*

Scorpions utilize venom to both capture prey and defend against predators, which can include organisms with very different physiologies and susceptibility to venom components (Gangur et al., 2017; van der Meijden et al., 2017). Over 400 million years of evolution (Dunlop and Selden, 2009) have led scorpions to develop a wide range of mechanisms that help minimize the costs of venom use. Successful prey capture and predator defense will ultimately affect a scorpion's evolutionary fitness, and therefore selection on venom composition and concentration is generally influenced by both prey and/or predators (Tian et al., 2008; Weinberger et al., 2009; Gangur et al., 2018). As selection pressures vary between environments, so will optimal investment in venom (Gangur et al., 2018). Scorpions adapted to different ecological niches often show large differences in venom composition (de la Vega et al., 2010) and stinger morphology (van der Meijden et al., 2013; van der Meijden and Kleinteich, 2017), and this likely reflects responses to different selection pressures.

# THE COSTS OF VENOM USE IN SCORPIONS

## Direct

Research on the energetic demands of venom use has focussed on the costs of production, rather than maintenance, as it is difficult to measure experimentally the energy used in maintaining and storing toxins. Nisani et al. (2007) showed that after depleting the venom glands of scorpions (Parabuthus transvaalicus), metabolic rates increased by 39% during the first 3 days of regeneration. A later study (Nisani et al., 2012) found that milked P. transvaalicus had on average a 21% higher metabolic rate than un-milked scorpions during the first 8 days of regeneration, but in this second study the rate did not rise as high during the first 3 days as in the earlier one (Nisani et al., 2007). Metabolic rates fluctuated throughout the experiment, and the authors suggested this likely reflected the asynchronous regeneration of toxins (Nisani et al., 2012). Whilst differences were observed between the studies, both identify a large increase in metabolic rate above baseline levels, indicating that in scorpions venom production is an important energetic expense (Nisani et al., 2007, 2012). These studies also likely underestimate total energetic costs, as venom regeneration can take longer than 8 days to complete (Carcamo-Noriega et al., 2019). Furthermore, scorpion venom varies between species in complexity, toxins utilized, and volume stored and injected (de la Vega et al., 2010; Sunagar et al., 2013; van der Meijden et al., 2015), all of which alter energy requirements. Energetic costs are also dependent on the metabolic rate, which is likely to vary between species adapted to different ecological niches. Furthermore, as scorpions are ectotherms, their metabolic rate will vary with environmental conditions; Nime et al. (2013) showed that scorpion activity is positively correlated with temperature.

## Indirect

Indirect costs are associated with the ecological limitations arising from depleted venom supplies, such as increased predation risk or reduced ability to capture prey. Scorpions can store a limited volume of venom (van der Meijden et al., 2015), and regeneration of toxins can take at least 2 weeks to be complete (Carcamo-Noriega et al., 2019), reducing the ability to use venom for prey capture and predator defense until venom supplies are restored. Ecological costs of venom depletion cannot easily be quantified, as they will vary widely between species and environments with fluctuating selection pressures. Nonetheless, evaluation of behavioral changes that arise when venom stores are depleted may serve as evidence of ecological costs. Such a behavioral response has yet to be reported in scorpions, but some spiders with depleted venom supplies will adapt their hunting behavior to target easily caught prey, as has been found in the wandering spider Cupiennius salei (Wullschleger and Nentwig, 2002). Scorpions and spiders often share many of their natural predators and prey items, therefore the ecological costs of venom depletion in both organisms may bear some similarity. Comparable behaviors in scorpions might include a switch to smaller prey that can be successfully captured with only the pedipalps while venom is replenishing.

# THE EVOLUTION OF OPTIMAL VENOM USE

Compared with other venomous taxa, scorpions are unusual in that they possess two main weapons: their stinging apparatus and their pedipalps. Species vary in their relative investment in these two weapons depending on their ecological niche, leading to the great morphological diversity in scorpion stinging apparatus and pedipalps seen between species (**Figure 1**) (van der Meijden et al., 2013). Burrowing species, such as members of the family Scorpionidae, are often sit-and-wait predators (Hadley and Williams, 1968; Bub and Bowerman, 1979; Shachak and Brand, 1983; Shivashankar, 1994), and possess large pedipalps that can be used to dig, grab passing prey, and block predators from entering their burrow (van der Meijden et al., 2010). Large pedipalps are often accompanied by a small stinging apparatus. Small tail size may be due to a trade-off in investment between pedipalps and stinging apparatus (van der Meijden et al., 2013), or a reduced tail may simply improve mobility in the confines of a burrow. Vagrant scorpion species, such as many members of the family Buthidae, have no permanent residence and forage more actively (Hadley and Williams, 1968). This group of scorpions generally rely more heavily on their sting to capture prey and defend themselves, and have evolved powerful and highly mobile tails (Warburg, 1998; Coelho et al., 2017). Many of these scorpions have developed potent venom, and the family Buthidae contains all species with medically significant stings (Santos et al., 2016). Buthids often possess small pedipalps, suggesting an evolutionary trade-off between pedipalps and stinging apparatus may be present (van der Meijden et al., 2013), and/or small pedipalps may improve mobility and energetic efficiency while actively foraging. The evolution of potent venom in buthids may further have reduced the advantages that large pedipalps provide, making them energetically unfavorable.

# BEHAVIORAL MECHANISMS TO OPTIMIZE VENOM USE

Optimal use of venom will vary between each interaction with prey and predators, influenced by the size and identity of the prey/predator. This has led scorpions to evolve a range of behavioral mechanisms allowing them to optimize their venom use when facing specific prey (Edmunds and Sibly, 2010) and under particular levels of threat (Nisani and Hayes, 2011).

## The "Decision" to Sting

To reduce the costs associated with unnecessary venom use, scorpions adapt their hunting strategies to the particular prey items targeted (Simone et al., 2018), and are less likely to use venom when capturing small or easily subdued prey (Rein, 1993; Edmunds and Sibly, 2010). Spiders appear to target the injection of their venom toward the thorax or head of prey items to maximize venom efficiency (Wigger et al., 2002; Carlson et al., 2014). However, to our knowledge it is not known if scorpions seek to apply stings to an optimal location, or if an optimal sting location even exists, as one study on Bothriurus bonariensis found that sting location did not affect the time taken to subdue prey (Simone et al., 2018).

Both the size and activity of prey items can influence a scorpion's choice to sting, as Parabuthus liosoma, Parabuthus pallidus and Hadrurus spadix sting larger and more active prey items more frequently (Rein, 1993; Edmunds and Sibly, 2010). Rein (1993) observed that the scorpions did not use their sting immediately when encountering prey, but would rather grab with their pedipalps and apply stings if the prey continued to struggle, presumably to minimize venom use whilst ensuring predation success. Ontogenetic changes in stinging behavior can also be used to optimize venom use. Older Paruroctonus boreus and Pandinus imperator use their larger pedipalps to overpower prey and sting less often, avoiding using venom (Cushing and Matherne, 1980; Casper, 1985).

In addition to trade-offs between the use of venom and pedipalps for prey capture, there may be trade-offs in venom use and mobility when avoiding predation, as faster scorpions appear less likely to sting predators (Carlson et al., 2014; Miller et al., 2016). For example, female Centruroides vittatus scorpions, which are heavier and less mobile, are more likely to sting a potential predator than males, which are more likely to sprint to safety (Carlson et al., 2014; Miller et al., 2016). Furthermore, within each sex, sprint speed decreases and sting rate increases with mass, indicating that higher rates of aggression are associated with reduced mobility (Carlson et al., 2014; Miller et al., 2016). Through fleeing, the males and smaller scorpions are not only able to avoid being eaten, but they also save their venom supply for future encounters and do not need to expend energy regenerating toxins, thus reducing ecological and energetic venom costs.

Unlike other scorpions, seven species of Parabuthus spray venom defensively, which may cause irritation to the sensitive tissues, such as eyes, of predators (Newlands, 1974; Nisani and Hayes, 2015). P. transvaalicus defensively spray when presented with both air-flow and touch stimuli simultaneously, suggesting the behavior may be optimized toward high-threat scenarios where defensive tactics must be implemented before a predator gets close enough to sting (Nisani and Hayes, 2015). These scorpions are often attacked by predators such as grasshopper mice, which disarm scorpions by biting off their tails, and therefore place their face in close proximity to the telson (Nisani and Hayes, 2015). Compared with injection, sprayed venom has a higher risk of missing its target, likely increasing the costs of venom necessary to deter predators. However, these costs are likely offset by the advantage of deterring predators while the predator is still at a distance.

# "Dry" Stings

Scorpions can still avoid venom use when stinging in defensive encounters by employing "dry" stings, where no venom is injected. Scorpions readily utilize dry stings in defensive situations (Nisani and Hayes, 2011; Lira et al., 2017; Rasko et al., 2018). The factors that determine whether venom is injected in a sting, however, are currently unclear. In P. transvaalicus, dry stinging behavior is correlated with threat level, with the scorpions employing dry stings more frequently when the threat level is low (Nisani and Hayes, 2011). Additionally, this study suggests that P. transvaalicus, when induced to sting multiple times in succession, uses dry stings more frequently early in the stinging sequence and is more likely to inject venom as the threat persists (Nisani and Hayes, 2011). The combination of these results provides evidence that scorpions use dry stings in low-threat situations to optimize their venom use. In contrast to the findings of Nisani and Hayes (2011), studies on dry stinging behavior in other scorpion species have produced differing results. Lira et al. (2017) presented Tityus stigmurus with the same "low-threat" and "high-threat" stimuli described by Nisani and Hayes (2011), and found no correlation between dry sting rate and threat level. Furthermore, it has been shown that repeated simulated attacks against Hadrurus arizonensis lead to an increase in dry sting rate, despite venom remaining in the gland (Rasko et al., 2018). This latter result seems counter-intuitive, as scorpions might be expected to increase their defensive investment as a threat persists. Whilst the studies investigating the factors influencing dry stinging behavior in scorpions are limited, the evidence supports the idea that at least some scorpion species utilize dry stings as a means to optimize their venom in defensive contexts (Nisani and Hayes, 2011), while others may not (Lira et al., 2017; Rasko et al., 2018).
Further research should aim to identify whether interspecific differences are truly occurring, or if methodological differences between the studies are responsible for the observed differences. Furthermore, it is not currently known if scorpions utilize dry stings as a tactic to save venom when capturing prey, as spiders use dry bites (Malli et al., 1998; Wigger et al., 2002). Spiders, however, have the ability to masticate their prey with their fangs, but the dry sting of a scorpion provides comparatively little aid in the incapacitation of prey. It is therefore unlikely that scorpions use dry stings to save venom when targeting prey.

# Volume Injected

While scorpion stinging behavior involves a dichotomy between dry stings and stings with venom injected, scorpions also have the ability to vary the volume of venom they inject, both within each sting and through the application of multiple stings (Nisani and Hayes, 2011; van der Meijden et al., 2015). P. transvaalicus injected twice as much venom per single sting in high-threat situations compared with low-threat situations, indicating scorpions may use this tactic to vary their defensive investment in response to perceived threat level. The defensive sprays of P. transvaalicus display variable duration and flow rate, suggesting the volume expelled could be controlled by contraction of the venom gland, but it is not currently known if the volume sprayed is influenced by threat level (Nisani and Hayes, 2015).

The volume injected in single stings may be limited by morphological constraints, or by the time that the aculeus remains embedded in the target (van der Meijden et al., 2015). Experiments into the defensive investment of scorpions in response to predation threat suggest that scorpions will repeatedly sting predators as the threat persists (Lira et al., 2017), but it is unclear whether the investment per attack increases or decreases with sting number (Nisani and Hayes, 2011; Rasko et al., 2018). When targeting prey, scorpions often hold on with their pedipalps and judiciously apply stings as the prey continues to struggle (Casper, 1985; Rein, 1993). The rate of stings increases with both prey size and activity (Edmunds and Sibly, 2010), suggesting that scorpions are frugal with their venom and only apply extra stings as necessary.

# Composition Injected

In addition to reducing venom costs by metering the volume of venom they inject, scorpions are able to alter the composition of venom injected into their target and avoid unnecessarily injecting costly venom components. Scorpion venom is heterogeneous and changes in composition as it is expelled from the aculeus (**Figure 2**). As the venom is secreted from the aculeus tip, the initial expulsion is a clear liquid, followed by an opalescent liquid, and finally a milky colored and viscous secretion (Yahel-Niv and Zlotkin, 1979). These different secretions also vary in toxicity (Yahel-Niv and Zlotkin, 1979). Inceoglu et al. (2003) found that the initial clear secretion in P. transvaalicus constituted around 5% of the total venom volume within the gland, and they termed this "prevenom". Prevenom and main milky venom have distinct compositions and modes of action in both invertebrate and vertebrate targets (Inceoglu et al., 2003). The different modes of action of prevenom and main venom may act together to induce greater toxicity, in a similar way to the toxin cabals employed by cone snails, which target different pathways simultaneously (Olivera et al., 1999; Inceoglu et al., 2003). P. transvaalicus prevenom was found to have a six-fold lower peptide and protein concentration than the main venom, but a 16-fold higher potassium (K<sup>+</sup>) salt concentration (Inceoglu et al., 2003). The authors suggested this extremely high K<sup>+</sup> concentration causes large and rapid depolarization of nerves in the target, causing quick paralysis in insects and pain in vertebrates (Inceoglu et al., 2003). Prevenom not only contains a much lower peptide/protein concentration, but also has a comparatively simple composition (Inceoglu et al., 2003). It is therefore expected that prevenom is metabolically cheaper to produce than the main venom, as K<sup>+</sup> salt likely requires less energy to be replenished than peptides requiring production and folding (Inceoglu et al., 2003). The relative costs of prevenom vs. main venom have not been calculated experimentally, but in P. transvaalicus prevenom components appear to regenerate quickly and at little metabolic cost compared with other toxins (Nisani et al., 2012). Prevenom may therefore have evolved as a mechanism to avoid injecting larger volumes of peptide-rich mixtures, thereby minimizing both metabolic and ecological costs of depletion, although this connection is difficult to test experimentally (Nisani et al., 2012).
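As a rough illustration only (this calculation is not made by the cited authors), the figures reported by Inceoglu et al. (2003) can be combined to estimate how small a share of the gland's peptide content sits in the prevenom. The sketch assumes, for simplicity, that the gland holds just two fractions, prevenom at a volume fraction `v_p` of about 0.05 and main venom at `1 - v_p`, that each fraction has a uniform peptide concentration (`c_p` and `c_m`), and that `c_p` is one sixth of `c_m`. The symbols `v_p`, `c_p`, `c_m` and `f_p` are introduced here for illustration, and the intermediate opalescent fraction is ignored.

```latex
f_p \;=\; \frac{v_p\, c_p}{v_p\, c_p + (1 - v_p)\, c_m}
    \;=\; \frac{0.05 \cdot \tfrac{1}{6}}{0.05 \cdot \tfrac{1}{6} + 0.95}
    \;=\; \frac{1}{115} \;\approx\; 0.9\%
```

Under these assumptions, a sting delivering only prevenom would draw on less than 1% of the stored peptide, which is in line with the suggestion that prevenom is cheap to expend. The true share will depend on how full the gland is and on how sharply the fractions are separated.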

By using prevenom first, scorpions can save their main peptide-rich venom for high-threat situations or when initially low-threat situations escalate. This hypothesis is supported by evidence that the composition of venom (prevenom vs. main venom) scorpions inject is context dependent. In low-threat situations, scorpions are able to avoid injecting their metabolically "expensive" mixtures of toxins by injecting only prevenom (Nisani and Hayes, 2011). P. transvaalicus inject their main venom more frequently in high-threat situations, and in later stings when induced to sting repeatedly at both low and high threat levels (Nisani and Hayes, 2011). Furthermore, in low-threat situations, T. stigmurus injected prevenom in all trials, but when faced with the high-threat treatments most of the scorpions injected their main milky venom secretion (Lira et al., 2017). The use of prevenom in low-threat situations not only minimizes metabolic costs but also reduces ecological costs, as prevenom appears to regenerate faster than the main venom components (Nisani et al., 2012).

# ADAPTIVE PLASTICITY

Recent evidence suggests scorpions can modify their venom composition in response to predator exposure (Gangur et al., 2017). After repeated periodic encounters with a surrogate vertebrate predator (a taxidermied mouse) over a 6-week period, Hormurus waigiensis appeared to produce a higher relative abundance of some vertebrate-specific toxins used in defensive situations, and a lower relative abundance of certain toxins specific to their invertebrate prey (Gangur et al., 2017). This study provided the first evidence for adaptive plasticity in venom composition, and suggested it has evolved as a mechanism to allow for the optimization of venom use (Gangur et al., 2017). Modification of venom composition in response to environmental pressures could allow scorpions to further optimize venom use in different environments. In environments with few predators, scorpions may not require large quantities of defensive toxins, but as predator abundance increases so does the need to defend themselves. Therefore, the ability to plastically change venom composition can allow scorpions to prioritize their resources and minimize the costs of venom use. It is currently unclear what environmental cues (e.g., olfactory) led to the plastic response observed by Gangur et al. (2017). Furthermore, it is unclear if the response was targeted specifically at the presence of the mouse, or if it was a uniform response to increased predation pressure.

Unlike the response from simulated top-down predation pressure, Gangur et al. (2017) did not identify changes in venom composition in response to a scavenging vs. predacious diet, where venom is not required for prey capture. This may be due to the unpredictable nature of scavenging, and the potential need to kill prey in the future, regardless of current carrion availability. Alternatively, venom may need to be maintained for its defensive function. In contrast, it may also be that experimental conditions do not represent the bottom-up pressures experienced in the wild, as crickets may be more easily subdued than natural prey items. H. waigiensis is a burrowing species that has evolved large pedipalps and a small stinging apparatus, and further studies should evaluate whether more active species that rely more heavily on their sting to capture prey respond differently to a changing diet.

# CONCLUSIONS AND FUTURE DIRECTIONS

Scorpions experience direct costs associated with the production and storage of toxins, and indirect costs associated with impaired ecological function when their venom is depleted. Optimal venom use minimizes these costs, maximizing the survival benefit venom provides. On the broadest scales, optimal venom investment has contributed to the divergence of stinger morphology and venom compositions between species adapted to different environments (Tian et al., 2008; Sunagar et al., 2013; van der Meijden et al., 2013). Optimal venom use can be influenced by factors such as prey/predator identity, and scorpions therefore utilize a suite of behavioral tactics to minimize waste. These include varying sting frequency, employing dry stings, and further controlling the volume and composition of venom injected (Nisani and Hayes, 2011). Scorpions may also plastically adapt their venom composition (Gangur et al., 2017), allowing them to optimize venom use as selection pressures change. Whilst the presence of these mechanisms and behaviors is well documented, the factors influencing them are poorly understood. Current knowledge of venom optimization has generally relied upon correlative research, where the selective forces driving the correlations are inferred rather than directly measured. There is evidence of venom costs, benefits for prey capture and predator defense, and behavioral and trait phenotypes that appear to reduce these costs and maximize benefits. However, there is little direct evidence tying changes in phenotypes to changes in costs or benefits to describe a mechanistic link. Controlled selection experiments or phylogenetic studies that consider species interactions can help describe links between selection and evolutionary response in arms races (Pimentel, 1968; Kursar et al., 2009; Toju et al., 2011; Betts et al., 2018), and may help better describe how observed venom optimization mechanisms have evolved.
Future work is needed to investigate whether observed changes are due to adaptive responses or physiological limitations, the extent that these mechanisms are influenced by the environment, and how widespread they are across different scorpion species.

# AUTHOR CONTRIBUTIONS

The manuscript was written by EE and edited by TN, DW, and ND.

# FUNDING

The Northcote Trust supports EE with a Northcote Graduate Scholarship.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Evans, Northfield, Daly and Wilson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multifunctional Toxins in Snake Venoms and Therapeutic Implications: From Pain to Hemorrhage and Necrosis

Camila R. Ferraz<sup>1,2</sup>, Arif Arrahman<sup>3</sup>, Chunfang Xie<sup>3</sup>, Nicholas R. Casewell<sup>4</sup>, Richard J. Lewis<sup>1</sup>, Jeroen Kool<sup>3</sup> and Fernanda C. Cardoso<sup>1</sup>\*

<sup>1</sup> Centre for Pain Research, Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia, <sup>2</sup> Departamento de Ciências Patológicas, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, Brazil, <sup>3</sup> Division of BioAnalytical Chemistry, Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, Netherlands, <sup>4</sup> Centre for Snakebite Research & Interventions, Liverpool School of Tropical Medicine, Liverpool, United Kingdom

### Edited by:

Kartik Sunagar, Indian Institute of Science (IISc), India

### Reviewed by:

Kempaiah Kemparaju, University of Mysore, India

Stephen P. Mackessy, University of Northern Colorado, United States

> \*Correspondence: Fernanda C. Cardoso f.caldascardoso@uq.edu.au

### Specialty section:

This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 31 March 2019 Accepted: 24 May 2019 Published: 19 June 2019

### Citation:

Ferraz CR, Arrahman A, Xie C, Casewell NR, Lewis RJ, Kool J and Cardoso FC (2019) Multifunctional Toxins in Snake Venoms and Therapeutic Implications: From Pain to Hemorrhage and Necrosis. Front. Ecol. Evol. 7:218. doi: 10.3389/fevo.2019.00218

Animal venoms have evolved over millions of years for prey capture and defense from predators and rivals. Snake venoms, in particular, have evolved a wide diversity of peptides and proteins that induce harmful inflammatory and neurotoxic effects including severe pain and paralysis, hemotoxic effects, such as hemorrhage and coagulopathy, and cytotoxic/myotoxic effects, such as inflammation and necrosis. If untreated, many envenomings result in death or severe morbidity in humans and, despite advances in management, snakebite remains a major public health problem, particularly in developing countries. Consequently, the World Health Organization recently recognized snakebite as a neglected tropical disease that affects ∼2.7 million p.a. The major protein classes found in snake venoms are phospholipases, metalloproteases, serine proteases, and three-finger peptides. The mechanisms of action and pharmacological properties of many snake venom toxins have been elucidated, revealing a complex multifunctional cocktail that can act synergistically to rapidly immobilize prey and deter predators. However, despite these advances many snake toxins remain to be structurally and pharmacologically characterized. In this review, the multifunctional features of the peptides and proteins found in snake venoms, as well as their evolutionary histories, are discussed with the view to identifying novel modes of action and improving snakebite treatments.

Keywords: snake venoms, multifunctional toxins, pathological mechanisms, evolution, snakebite treatment, pain, hemotoxicity, myotoxicity

# INTRODUCTION

The composition and evolutionary histories of animal venoms have fascinated the scientific community for centuries. Venoms have evolved over millions of years to facilitate prey capture and/or defense from predators and rivals. Snake venoms, in particular, likely originated in the Cenozoic Era (Fry, 2005; Fry et al., 2006), and they are amongst the most well-characterized of animal venoms, comprising a complex mixture of toxic and pharmacologically active proteins and peptides (Casewell et al., 2013; Chan et al., 2016). Tragically, snake envenomation is a significant health and economic burden worldwide. It is estimated that 1.8–2.7 million snakebites and 81,410–137,880 deaths occur annually worldwide (Kasturiratne et al., 2008; Gutierrez et al., 2017), a problem which is mostly associated with agricultural work, especially in South and Southeast Asia, sub-Saharan Africa, and Central and South America (Kasturiratne et al., 2008; Harrison et al., 2019). In 2017, the World Health Organization (WHO) finally recognized snakebite as a priority neglected tropical disease in which morbidity and mortality mostly affect individuals under 30 years old, who are often the most economically productive members of a community (WHO, 2018).

Snake venoms have a distinct complexity when compared to venoms from other animals such as spiders, scorpions, and cone snails (Zelanis and Tashima, 2014). In these animal venoms, the pharmacological effects are primarily caused by disulfide-bridged peptides, whilst snake venoms consist of a more diverse array of larger proteins and peptides, which results in a wider variety of pharmacological and toxicological effects (Zhang, 2015). These venoms comprise 50–200 components distributed in dominant and secondary families, which can be present as multiple protein and peptide isoforms (Vonk et al., 2011; Slagboom et al., 2017; Tasoulis and Isbister, 2017). The dominant families are secreted phospholipases A<sub>2</sub> (PLA2s), snake venom metalloproteinases (SVMP), snake venom serine proteases (SVSP), and three-finger peptides (3FTX), while the secondary families comprise cysteine-rich secretory proteins, L-amino acid oxidases, kunitz peptides, C-type lectins, disintegrins, and natriuretic peptides (Slagboom et al., 2017; Tasoulis and Isbister, 2017; Munawar et al., 2018). Interestingly, snake venom composition varies interspecifically (Fry et al., 2008; Tasoulis and Isbister, 2017), as well as intraspecifically, with many factors influencing this diversity including age (Dias et al., 2013), gender (Menezes et al., 2006; Zelanis et al., 2016), location (Durban et al., 2011; Goncalves-Machado et al., 2016), diet (Barlow et al., 2009), and season (Gubensek et al., 1974). This variability underpins toxin diversity and multifunctionality, and is an important consideration in antivenom production and envenomation treatment (Gutierrez et al., 2017).

The pharmacological effects of snake venoms are classified into three main types: hemotoxic, neurotoxic, and cytotoxic (WHO, 2010). The major toxins involved in these effects are the PLA2s, SVMPs, SVSPs, and 3FTXs, which, alone or in combination, are responsible for the multiple pharmacological effects occurring in snakebite victims. For example, some PLA2s and 3FTXs are able to act on pre- or post-synaptic junctions as antagonists of ion channels and nicotinic or muscarinic receptors to induce severe neurotoxicity such as paralysis and respiratory failure (Fry et al., 2003; Lynch, 2007; Casewell et al., 2013; Harris and Scott-Davey, 2013; Tsetlin, 2015). In addition, other PLA2s and 3FTXs, along with SVMPs, cause local tissue damage resulting in swelling, blistering, bruising, and necrosis, and systemic effects such as hypovolemic shock (Gutierrez and Rucavado, 2000; Gutierrez et al., 2005; Harris and Scott-Davey, 2013; Rivel et al., 2016). Furthermore, SVSPs and SVMPs induce hemostatic and cardiovascular effects such as coagulopathy, hypotension and hemorrhage (Slagboom et al., 2017). Interestingly, some PLA2s, SVSPs, and SVMPs are also capable of triggering severe pain by modulating pain pathways through activation of ion channels, such as transient receptor potential vanilloid type 1 (TRPV1) and acid-sensing ion channel (ASIC) (Bohlen et al., 2011; Zhang et al., 2017); and/or by pain sensitization through inflammatory mediators (Zychar et al., 2010; Menaldo et al., 2013; Ferraz et al., 2015; Mamede et al., 2016; Zambelli et al., 2017b). The inflammation induced by elapid and viper venoms is widely reported to produce pain or hyperalgesia in humans and in experimental models (Hifumi et al., 2015; Bucaretchi et al., 2016; Kleggetveit et al., 2016; Mamede et al., 2016). Unfortunately, these effects are not completely reversed by antivenom and anti-inflammatory therapies (Picolo et al., 2002; Ferraz et al., 2015; Hifumi et al., 2015).

The toxicological effects induced by snakebite are currently treated with intravenous administration of antivenom in combination with analgesics, fluid therapy, hemodialysis and/or antibiotics (Gutierrez et al., 2017). Although sufficient in most cases, snakebite treatments have been challenged by the persistently high levels of clinical illness and mortality associated with snakebites worldwide (WHO, 2018). Furthermore, chronic morbidity following snakebite has been underestimated, with many victims reporting chronic symptoms in the bitten region, including complex regional pain syndrome (CRPS) (Seo et al., 2014; Kleggetveit et al., 2016) and musculoskeletal disabilities (Jayawardana et al., 2016). Available snakebite treatments face challenges associated with limited para-specificity, poor antibody specificity, high incidences of adverse reactions, low availability and poor affordability to those who need them, along with poor efficacy against local tissue effects (Williams et al., 2011; Gutierrez et al., 2017; Ainsworth et al., 2018; Harrison et al., 2019). Therefore, current research efforts are directed to the development of more effective snakebite therapies that can broadly and fully inhibit the major toxic components of snake venoms, in order to better overcome the severe acute and chronic effects caused by snakebite.

In light of the public health importance and the complexity of snake venoms, in this review we highlight the multifunctionality, structure-activity relationships and evolution of proteins and peptides in snake venoms. We aim to provide a better understanding of their mechanisms of action and effects, and to bring attention to their undetermined targets and to a host of potential novel therapeutic targets that might have implications for improving the treatment of snakebite.

# PHOSPHOLIPASES (PLA2s)

Phospholipases A<sub>2</sub> play an important role in the neurotoxic and myotoxic effects of snakebites (Harris and Scott-Davey, 2013). These proteins have molecular masses of 13–15 kDa and are classified into groups I and II, which are found as major components in the venoms of Elapidae and Viperidae, respectively (Six and Dennis, 2000; Harris and Scott-Davey, 2013). In addition, a third group of PLA2s, termed IIE, has been predominantly recovered from the venoms of non-front-fanged snakes, although its importance in the venom arsenal remains unclear (Fry et al., 2012; Perry et al., 2018). Studies reconstructing the evolutionary history of this multi-locus gene family have demonstrated that each of these PLA2 types (I, II, and IIE) has been independently recruited into snake venom systems (Fry et al., 2012; Junqueira-De-Azevedo et al., 2015), suggesting they have evolved their toxic properties by convergent evolution (Lomonte and Rangel, 2012). Although PLA2s from vipers and elapids share similar enzymatic properties, both types have undergone extensive gene duplication over evolutionary time, seemingly facilitating the evolution of new toxic functions, and resulting in different patterns of residue conservation (Lynch, 2007; Vonk et al., 2013; Dowell et al., 2016; **Figure 1A**). In addition, the venoms of vipers contain isoforms of group II PLA2s that are catalytically active (e.g., Asp49) and catalytically less active (e.g., Lys49) (Lomonte and Rangel, 2012).

A number of PLA2s exert strong myotoxic effects which often lead to severe necrosis (Harris and Maltin, 1982; Gutierrez and Ownby, 2003). Many of these toxins also promote inflammation (including edema formation, cytokine production and leukocyte recruitment), pain (by inducing thermal allodynia and mechanical hyperalgesia) and paralysis (through block of neuromuscular transmission), and intensify hemorrhage by inhibiting coagulation (**Table 1**) (Camara et al., 2003; Chacur et al., 2003; Camargo et al., 2008; Teixeira et al., 2011; Lomonte and Rangel, 2012; Harris and Scott-Davey, 2013; Casais-E-Silva et al., 2016; Costa et al., 2017; Zambelli et al., 2017b; Zhang et al., 2017). Neurotoxic effects caused by these toxins, as well as some of their proinflammatory effects, occur via the modulation of pre-synaptic terminals as well as sensory nerve endings (Camara et al., 2003; Harris and Scott-Davey, 2013; Sribar et al., 2014; Zhang et al., 2017). The pre-synaptic effects of PLA2s are characteristic of β-neurotoxins and target the motor nerve terminals at the neuromuscular junction (Sribar et al., 2014; Gutierrez et al., 2017). The mechanisms by which certain PLA2s exert their pre-synaptic effects are not fully understood, and the primary targets remain unidentified, although the PLA2 β-bungarotoxin is known to bind to K<sup>+</sup> channels at the pre-synaptic terminals via an accessory kunitz subunit (Benishin, 1990; Sribar et al., 2014; **Figure 1B**). Overall, these pre-synaptic effects induce robust exocytosis of the neurotransmitter vesicle reserves, which consequently leads to the depletion of neurotransmitter release in the neuromuscular junction to promote muscle paralysis (Harris et al., 2000; Sribar et al., 2014; Gutierrez et al., 2017).

The inflammation induced by PLA2s has non-neurogenic and neurogenic (substance-P dependent) components (Camara et al., 2003; Camargo et al., 2008; Costa et al., 2017; Zhang et al., 2017). The non-neurogenic component is mostly caused by the hydrolysis of membrane lipids, which generates potent pro-inflammatory lipid mediators (Costa et al., 2017). Additional non-neurogenic and neurogenic inflammation induced by PLA2s involves more complex mechanisms that are still not fully understood. For example, leukocyte recruitment (De Castro et al., 2000), mastocyte degranulation (Menaldo et al., 2017), and macrophage activation (Triggiani et al., 2000; Granata et al., 2006; Giannotti et al., 2013) were demonstrated to occur independently of the generic lipid hydrolysis catalytic activity of PLA2s. Furthermore, substance-P mediated neurogenic inflammation has been described to be induced by PLA2s from Crotalus durissus cascavella (Camara et al., 2003) and from Naja mossambica (Camargo et al., 2008). Interestingly, the C-terminal of Myotoxin-II (a Lys49-PLA2) isolated from Bothrops asper was able to activate macrophages, showing this region may be crucial for the observed enzymatic-independent inflammation (Giannotti et al., 2013) (**Figure 1A**).

The pain induced by PLA2s is driven by inflammatory processes and sensory neuronal activation. Bradykinin is an important mediator of the inflammatory pain induced by PLA2s (Moreira et al., 2014; Urs et al., 2014; Mamede et al., 2016; Zambelli et al., 2017b). It induces mechanical hyperalgesia dependent on the production of TNF-α, IL-1β, and prostaglandins (Cunha et al., 1992). This suggests that PLA2s contribute to an increase in arachidonic acid release from cell membranes and its availability to be processed by cyclooxygenase resulting in prostaglandin production (Verri et al., 2006). Corroborating this hypothesis, studies performed in rodents have demonstrated that PLA2s isolated from different snake venoms induced hyperalgesia mediated by biogenic amines, cytokines, prostaglandins, sympathomimetic amines, ATP, K<sup>+</sup> release, purinergic receptor activation, and glial cell activation (Nunez et al., 2001; Chacur et al., 2003, 2004; Zhang et al., 2017). Interestingly, the presence of a strong catalytic activity in the PLA2s is not essential for its nociceptive activity as observed by the nociceptive effects of PLA2s "Lys49" variants (Lomonte et al., 1994; Rong et al., 2016; Zhang et al., 2017). Direct activation of sensory neurons was demonstrated by MitTx from Micrurus tener tener, a heteromeric complex between a PLA2 and a kunitz peptide (Bohlen et al., 2011), and by the Lys49 PLA2 BomoTx from the Brazilian viper Bothrops moojeni (Zhang et al., 2017). MitTx activates somatosensory neurons and was found to be a potent and selective agonist of ASIC channels (**Figure 1C**). This agonistic effect induces robust pain behavior in mice via activation of ASIC1 channels on capsaicin-sensitive nerve fibers (Bohlen et al., 2011). BomoTx also activated a cohort of sensory neurons to induce ATP release followed by activation of purinergic receptors (Zhang et al., 2017). Unfortunately, the primary target of this neuronal activation is still unknown. 
This same toxin induced non-neurogenic inflammatory pain, thermal hyperalgesia dependent on TRPV1 channels-expressing nerve fibers, and mechanical allodynia dependent on P2X2/P2X3 expressing fibers (Zhang et al., 2017).

The multifunctionality of PLA2s is evidenced by their myotoxic, neurotoxic and enzymatic functions, as well as by their inflammatory properties. There is evidence that separate domains and regions of the PLA2s structure participate in these various activities (**Figures 1A,B**). For example, for the Lys49-PLA2s from Bothrops asper and Agkistrodon piscivorus piscivorus, the C-terminal region of these toxins (residues 115–129) was identified as the active site responsible for their myotoxic effects (Lomonte et al., 1994; Nunez et al., 2001) (**Figure 1A**). Interestingly, the same C-terminal region in BpirPLA2-I isolated from Bothrops pirajai had anticoagulant activity through inhibition of platelet aggregation (Teixeira et al., 2011). Crotoxin B, an Asp49-PLA2 and a major component of the venom of Crotalus durissus terrificus, has toxic active sites fully independent of its enzymatic activity (Soares et al., 2001; **Figure 1D**), while Lemnitoxin, a basic class I PLA2 isolated from Micrurus lemniscatus, has strong myotoxic and proinflammatory effects but no neurotoxic activity (Casais-E-Silva et al., 2016). A detailed mutational study using the PLA2 OS2 from the Australian Taipan snake Oxyuranus scutellatus scutellatus revealed that a 500-fold loss in enzymatic activity had only a minor effect on its neurotoxicity (Rouault et al., 2006). Furthermore, the enzymatic activity of OS2 was dependent on the N- and C-terminal regions, and the N-terminal region had a major role in the central nervous system neurotoxicity. An alanine scan of the Lys49-PLA2 from Bothrops jararacussu (BThTx-I) demonstrated distinct regions involved in the hyperalgesia and edema (Zambelli et al., 2017a). In this study, the mutant Arg118Ala lost both nociceptive and edematogenic properties, Lys115Ala and Lys116Ala lost the nociceptive effects without interfering with the edema formation, and Lys122Ala lost the nociceptive properties and had weak inflammatory effects (**Figure 1E**). Similarly, an independent study showed the Lys122Ala substitution led to reduced membrane-damaging and myotoxic activities (Ward et al., 2002). This C-terminal region is characterized by the presence of basic and hydrophobic residues, which have been strongly associated with the ability of PLA2s to interact with and penetrate the lipid bilayer (Delatorre et al., 2011; Gutierrez and Lomonte, 2013).

… and in the nociceptive and/or edematogenic properties in BThrTx1 (Zambelli et al., 2017a). The central histidine in the catalytic site of Crotoxin B is highlighted in red.

The variability of PLA2 isoforms observed in the venom of different snakes may reflect a variety of factors, including the evolutionary history, phylogeography, diet, and/or environmental conditions relating to a species or populations within a species (Zancolli et al., 2019). Many snake venom toxins are known to be encoded by multi-locus gene families (Casewell et al., 2013), and the resulting genes within those families have been found clustered together in arrays on microchromosomes, likely as the result of tandem gene duplication events (Vonk et al., 2013). The process of gene duplication and loss underpins the evolution of many snake venom toxin families, including the PLA2s (Lynch, 2007; Vonk et al., 2013; Casewell et al., 2014; Dowell et al., 2016), with duplications likely initially stimulating a gene dosing effect while also freeing duplicates from evolutionary constraints, and thus enabling a scenario that may facilitate protein sub- and/or neo-functionalization (Lynch and Conery, 2000). Indeed, studies have demonstrated that extremely divergent venom phenotypes (e.g., neurotoxic vs. haemorrhagic) observed within populations of the same snake species, or between closely related species, are at least partially the result of major genomic differences in PLA2 toxin loci, with variation at different gene complexes resulting in markedly different haplotypes (Dowell et al., 2018; Zancolli et al., 2019). It remains unclear as to the specific processes that underpin such diversity, although natural selection driven by environmental factors and hybridization events have both been proposed (Dowell et al., 2018).

# SNAKE VENOM METALLOPROTEINASES (SVMPs)

Snake venom metalloproteinases (SVMPs) are zinc-dependent proteinases ranging from 20 to 110 kDa in size and are categorized into P-I, P-II, and P-III classes according to their structural domains (Hite et al., 1994; Jia et al., 1996; Fox and Serrano, 2005) (**Figure 2**). These toxins are major components of viper venoms and play a key role in the toxicity of these snake venoms (**Table 1**; Tasoulis and Isbister, 2017). Venom SVMPs have evolved from ADAM (a disintegrin and metalloproteinase) proteins, specifically ADAM28 (Casewell, 2012), with the P-III class being the most basal structural variant, consisting of metalloproteinase, disintegrin-like, and cysteine-rich domains (Moura-Da-Silva et al., 1996; Casewell et al., 2011; **Figure 2B**). Subsequently, P-II SVMPs diverged from P-IIIs and consist of a metalloproteinase and a disintegrin domain, with the latter typically detected in venom as a proteolytically processed product (Fox and Serrano, 2008; Casewell et al., 2011). The final class, P-I SVMPs, which consist only of the metalloproteinase domain, appears to have evolved on multiple independent occasions in specific lineages as a result of loss of the P-II disintegrin-encoding domain (Casewell et al., 2011). Throughout this diverse evolutionary history, SVMPs show evidence of extensive gene duplication events, coupled with bursts of accelerated molecular evolution (Casewell et al., 2011; Vonk et al., 2013). However, while all three classes of SVMPs have been described from viper venoms, only P-III SVMPs have been detected in elapid and "colubrid" venoms (Casewell et al., 2015). While these P-III SVMPs are typically low-abundance venom components in elapid snakes (e.g., <10% of venom toxins), they can be major components in "colubrids" (Mackessy and Saviola, 2016; Pla et al., 2017; Tasoulis and Isbister, 2017; Modahl et al., 2018a). These abundance differences likely underpin the distinct pathologies observed following envenomings by snakes found in these families. SVMPs contribute extensively to the hemorrhagic and coagulopathic venom activities following bites by viperid snakes, and the diversity of SVMP isoforms often present in their venom likely facilitates synergistic effects, such as simultaneous action on multiple steps of the blood clotting cascade (Kini and Koh, 2016; Slagboom et al., 2017). Certain "colubrid" snakes, whether medically important or not, also show evidence of having multiple SVMP toxins in their venom, even if they do not show the sub-class diversity observed in the vipers (Mackessy and Saviola, 2016; Pla et al., 2017; Modahl et al., 2018a; Perry et al., 2018). However, it is relatively uncommon for elapid snakebites to cause systemic hemotoxicity (Slagboom et al., 2017), and this is likely a consequence of those venoms exhibiting little diversity or abundance of SVMPs, and instead usually being dominated by neurotoxic toxin families such as the 3FTXs and PLA2s (Tasoulis and Isbister, 2017).
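The domain-based classification of SVMPs described above can be restated as a simple lookup. The following is only an illustrative sketch: the function name `classify_svmp` and the domain label strings are hypothetical choices made for this example, and the mapping does nothing more than encode the P-I, P-II, and P-III definitions given in the text.

```python
from typing import Iterable

# Domain labels are illustrative strings chosen for this sketch (assumption).
METALLOPROTEINASE = "metalloproteinase"
DISINTEGRIN = "disintegrin"
DISINTEGRIN_LIKE = "disintegrin-like"
CYSTEINE_RICH = "cysteine-rich"

# Domain composition of each SVMP class, as defined in the text.
SVMP_CLASSES = {
    frozenset({METALLOPROTEINASE}): "P-I",
    frozenset({METALLOPROTEINASE, DISINTEGRIN}): "P-II",
    frozenset({METALLOPROTEINASE, DISINTEGRIN_LIKE, CYSTEINE_RICH}): "P-III",
}


def classify_svmp(domains: Iterable[str]) -> str:
    """Return the SVMP class for a set of domains, or 'unclassified'."""
    return SVMP_CLASSES.get(frozenset(domains), "unclassified")


if __name__ == "__main__":
    print(classify_svmp([METALLOPROTEINASE]))
    print(classify_svmp([METALLOPROTEINASE, DISINTEGRIN]))
    print(classify_svmp([METALLOPROTEINASE, DISINTEGRIN_LIKE, CYSTEINE_RICH]))
```

Domain combinations that match none of the three definitions are reported as "unclassified" rather than being forced into a class.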

As mentioned above, the SVMPs are known for their hemorrhagic activity as well as for their ability to influence multiple steps of the blood clotting cascade, resulting in a lethal combination of systemic hemorrhage and incoagulable blood in prey and/or victims (Markland and Swenson, 2013). Research has revealed that SVMP-induced hemorrhage relies on a two-step mechanism (Gutierrez et al., 2005; Escalante et al., 2011). First, SVMPs cleave the basement membrane and adhesion proteins of the endothelial cell-matrix complex to weaken the capillary vessels. During the second stage, the endothelial cells detach from the basement membrane and become extremely thin, resulting in disruption of the capillary walls and effusion of blood from the weakened capillaries. In addition to the proteinase activity, SVMPs impact hemostasis by altering coagulation, which contributes to their toxic hemorrhagic effects (Markland, 1998; Takeda et al., 2012; Slagboom et al., 2017). This occurs through modulation of factors such as fibrinogenase and fibrolase that mediate the coagulation cascade, depletion of pro-coagulation factors through consumption processes (e.g., Factor X, prothrombin and fibrinogen), platelet aggregation inhibition and inflammatory activities (Kamiguti, 2005; Kini, 2005; Takeda et al., 2012; Kini and Koh, 2016; Slagboom et al., 2017; Ainsworth et al., 2018).

[Caption fragment] Their functional properties associated to pain, inflammation, hemorrhage, necrosis and paralysis are described.

[Figure 2 caption, fragment] … the disintegrin-like domain is highlighted in green and the cysteine-rich domain is highlighted in blue. (B) Cartoon representation of the three-dimensional structure of the class P-III metalloproteinase VAPB2 from Crotalus atrox (PDB 2DW0). The metalloproteinase domain is colored in orange, the disintegrin-like domain (D-like) is colored in green and the cysteine-rich domain (Cys-rich) is colored in blue. The disulphide bridges are colored in yellow.

Some SVMPs also induce inflammation, including edema, and pain by triggering hyperalgesia (Dale et al., 2004; da Silva et al., 2012; Bernardes et al., 2015). The inflammation induced by jararhagin, a class P-IIIb metalloproteinase with potent hemorrhagic and dermonecrotic activity isolated from Bothrops jararaca, involved the production of TNF-α and IL-1β in vivo (Laing et al., 2003; Clissa et al., 2006), while BJ-PI2, isolated from the same snake species, lacked hemorrhagic and necrotic activities but still induced vascular permeability and inflammatory cell migration (da Silva et al., 2012). Curiously, the edema formation induced by jararhagin was independent of pro-inflammatory mediators such as TNF, IL-1β, and IL-6 (Laing et al., 2003). Neurogenic inflammation was also implicated in the local hemorrhage induced by Bothrops jararaca venom, which was shown to be dependent on serotonin and other neuronal factors (Goncalves and Mariano, 2000). The mechanisms by which neurogenic inflammation is triggered by snake venom components, and how it participates in the hemorrhagic process, are still not understood. Pain induced by SVMPs is characterized by hyperalgesia and inflammatory pain, which depend on the production of cytokines, nitric oxide, prostaglandins, histamine, and leukotrienes, as well as on leukocyte migration, mast cell degranulation and NF-κB activation (Fernandes et al., 2007; Bernardes et al., 2015; De Toni et al., 2015; Ferraz et al., 2015). However, the mechanisms underlying SVMP-induced pain are still poorly understood, with neurogenic inflammation and neuronal excitatory properties still underexplored. The research reported to date indicates that SVMPs induce the production of inflammatory mediators and activate cytokine/chemokine cascades and the release of prostaglandins and sympathetic amines to engender nociceptor sensitization (Verri et al., 2006; De Toni et al., 2015; Ferraz et al., 2015).

The multifunctional properties of SVMPs are also well described. Class P-III SVMPs tend to display stronger hemorrhagic activity compared to P-I and P-II SVMPs, possibly due to the disintegrin-like and cysteine-rich domains enabling binding to relevant targets in the extracellular matrix of capillary vessels. The functions of these domains have been investigated in inflammation, revealing that they are sufficient to induce pro-inflammatory responses through production of TNF-α, IL-1β and IL-6, and leukocyte migration, although the mechanisms and primary targets are still unknown (Clissa et al., 2006; Ferreira et al., 2018). These observations suggest that these domains are involved in the inflammatory hyperalgesia induced by SVMPs. Furthermore, the pronounced hemorrhagic and necrotic activities are strongly dependent on biological effects driven by the disintegrin-like and cysteine-rich domains, as observed for BJ-PI2 (da Silva et al., 2012). The hemorrhagic activity of Bothrops jararaca venom was also shown to be dependent on neurogenic inflammation (Goncalves and Mariano, 2000). These findings implicate the disintegrin-like and/or cysteine-rich domains as key player(s) in these neurogenic mechanisms, and possibly in non-inflammatory pain, for which the neuronal targets are still to be identified.

# SNAKE VENOM SERINE PROTEINASES (SVSPs)

Snake venom serine proteinases (SVSPs) belong to the S1 family of serine proteinases and display molecular masses ranging from 26 to 67 kDa and two distinct structural domains (**Figure 3**). These venom toxins have evolved from kallikrein-like serine proteases and, following their recruitment for use in the venom gland, have undergone gene duplication events giving rise to multiple isoforms (Fry et al., 2008; Vaiyapuri et al., 2012). SVSPs catalyze the cleavage of polypeptide chains on the C-terminal side of positively charged or hydrophobic amino acid residues (Page and Di Cera, 2008; Serrano, 2013). Similar to SVMPs, SVSPs have been described in the venom of a wide variety of snake families, although they are typically only abundant in viper venoms, and much less common in the venoms of elapid and "colubrid" snakes (Tasoulis and Isbister, 2017; Modahl et al., 2018a). Whilst the SVMPs are well known for their ability to rupture capillary vessels, SVSPs execute their primary toxicity by altering the hemostatic system of their victims, and by inducing edema and hyperalgesia through mechanisms still poorly understood (**Table 1**). Hemotoxic effects caused by SVSPs include perturbations of blood coagulation (pro-coagulant or anti-coagulant), fibrinolysis, platelet aggregation and blood pressure, with potentially deadly consequences for snakebite victims (Murakami and Arni, 2005; Kang et al., 2011; Serrano, 2013; Slagboom et al., 2017).

Pro-coagulant SVSPs have been described to activate multiple coagulation factors, including prothrombin and factors V, VII, and X (Kini, 2005; Serrano, 2013). For example, the activation of prothrombin produces thrombin, which in turn produces fibrin polymers that are cross-linked. Thrombin also activates aggregation of platelets which, together with the formation of fibrin clots, results in coagulation (Murakami and Arni, 2005). In addition, platelet-aggregating SVSPs activate platelet receptors to promote binding to fibrinogen and clot formation (Yip et al., 2005). These procoagulant and platelet-aggregating activities lead to the rapid consumption of key factors in the coagulation cascade and to clot formation. On the other hand, the effects of anticoagulant SVSPs involve the activation of Protein C, which subsequently inactivates the coagulation factors Va and VIIIa (Kini, 2006). Furthermore, fibrinolytic SVSPs play an important role in the elimination of blood clots by acting as thrombin-like enzymes or plasminogen activators, which eliminate the fibrin in the clots and contribute significantly to the establishment of the coagulopathy (Kang et al., 2011; Serrano, 2013). Through the activation/depletion and inactivation of these coagulation factors, the clotting of blood is prevented, leading to incoagulable blood, and external and internal bleeding.

Little is known about inflammatory responses and hyperalgesia induced by SVSPs. Studies suggest that SVMPs and PLA2s have a pivotal role in the inflammatory responses and pain induced by snake venoms, while SVSPs have an important role in inflammation and a minor role in pain (Zychar et al., 2010; Menaldo et al., 2013). SVSPs in the venoms of Bothrops jararaca and Bothrops pirajai induce inflammation through edema formation, leucocyte migration (mainly neutrophils) and mild mechanical hyperalgesia; however, the mediators involved in these effects are still unknown (Zychar et al., 2010; Menaldo et al., 2013).

# THREE-FINGER TOXINS (3FTXs)

Three-finger toxins (3FTXs) are non-enzymatic neurotoxins ranging from 58 to 81 residues that contain a three-finger fold structure stabilized by disulfide bridges (Osipov and Utkin, 2015; Kessler et al., 2017; **Figure 4A**). They are present mostly in the venoms of elapid and colubrid snakes, and exert their neurotoxic effects by binding postsynaptically at the neuromuscular junctions to induce flaccid paralysis in snakebite victims (Barber et al., 2013). Three-finger toxins differ in length, with short-chain 3FTXs, including α-neurotoxins, β-cardiotoxins, cytotoxins, fasciculins and mambalgins, comprising 57–62 residues and four disulfide bridges, and long-chain 3FTXs, which include α-neurotoxins, γ-neurotoxins, hannalgesin and κ-neurotoxins, comprising 66–74 residues and five disulfide bridges. Furthermore, they can exist as monomers and as covalent or non-covalent homo- or heterodimers. The diversity of 3FTX isoforms described above is a direct result of a diverse evolutionary history, whereby ancestral 3FTXs have diversified by frequent gene duplication and accelerated rates of molecular evolution. These processes, which are broadly similar to those underpinning the evolution of the other toxin families described above, are particularly associated with the evolution of a high-pressure hollow-fanged venom delivery system observed in elapid snakes (Sunagar et al., 2013). For example, gene duplication events have resulted in the expansion of 3FTX loci from one in non-venomous snakes like pythons, to 19 in the elapid Ophiophagus hannah (king cobra) (Vonk et al., 2013), and selection appears to have acted extensively on surface-exposed amino acid residues in these resulting paralogous elapid 3FTX genes (Sunagar et al., 2013). The consequences of this evolutionary history are the differential production of numerous 3FTX isoforms that often exhibit considerable structural differences and distinct biological functions (**Figures 4B–E**). Although many elapid snakes exhibit broad diversity of these functionally varied toxins in their venom (e.g., multiple short- and long-chain 3FTX isoforms), it remains unclear why particular functional variants are enriched in the venoms of certain elapid lineages, such as the cytotoxin-rich venoms of cobras (genus Naja) or the neurotoxin-rich venoms of mambas (genus Dendroaspis) (Tan et al., 2015; Lauridsen et al., 2017; Ainsworth et al., 2018). However, evidence of taxon-specific 3FTXs in the venoms of certain "colubrid" snakes (Pawlak et al., 2006; Mackessy and Saviola, 2016; Modahl et al., 2018a), coupled with pseudogenization of 3FTXs in species that no longer rely on their venom (Li et al., 2005), suggests that selection for prey capture may be at least partially responsible for influencing differential 3FTX representation.

[Figure 4 caption, fragment] … cytotoxin 1 from Naja atra (Uniprot P60304), and calliotoxin from Calliophis bivirgatus (Uniprot P0DL82). (B–E) Cartoon representation of the three-dimensional structure of the 3FTXs short-chain mambalgin-1 from Dendroaspis polylepis polylepis (PDB 5DU1) (B), long-chain α-bungarotoxin from Bungarus multicinctus (PDB 1ABT) (C), non-covalent homodimer α-cobratoxin from Naja kaouthia (PDB 4AEA) (D), and covalent heterodimer irditoxin from Boiga irregularis (PDB 2H7Z) (E). (F) Cartoon representation of the three-dimensional structure of Fasciculin-2 bound to the AChE (PDB 4BDT). The residues Arg24, Lys25, Pro31 and Leu35 which form hydrogen bonds with AChE are shown in orange. (G–J) Cartoon representation of the three-dimensional structure of the muscarinotoxin 1 (MT1, PDB 4DO8) (G) and muscarinotoxin 7 (MT7, PDB 2VLW) (H), and respective analogs displaying the modified loop 1 (PDB 3FEV) (I) and loop 3 (PDB 3NEQ) (J) in light orange color. (K) Neurotoxin II from N. oxiana (PDB 1NOR). The residues Ser29, His31, Gly33 and Thr34 which form hydrogen bonds with the α-subunit of nAChR are shown in orange, and the residue Arg32 forming ionic interactions with the α-subunit of nAChR is shown in yellow. (L) Neurotoxin b (NTb) from O. hannah (PDB 1TXA). The residues Lys24, Trp26, and Asp28 that form hydrogen bonds with the α-subunit nAChR are shown in orange.
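The length and disulfide criteria given in the text for short- and long-chain 3FTXs can be written out as a small rule, which may help when scanning a list of sequences. This is only a rough sketch: the function name `classify_3ftx` is hypothetical, the thresholds are taken directly from the ranges quoted in this section (57–62 residues with four disulfide bridges; 66–74 residues with five), and toxins outside these ranges are simply reported as unclassified.

```python
def classify_3ftx(n_residues: int, n_disulfides: int) -> str:
    """Assign a 3FTX to the short- or long-chain group.

    Thresholds follow the ranges quoted in the text; anything that
    does not match either definition is reported as 'unclassified'.
    """
    if 57 <= n_residues <= 62 and n_disulfides == 4:
        return "short-chain"
    if 66 <= n_residues <= 74 and n_disulfides == 5:
        return "long-chain"
    return "unclassified"


if __name__ == "__main__":
    # Neurotoxin II from Naja oxiana: 61 residues, four disulfide bridges
    # (values as stated in the text).
    print(classify_3ftx(61, 4))
```

Because the text also notes that 3FTXs as a whole span 58 to 81 residues, a real dataset would be expected to contain entries that this simple rule leaves unclassified.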

Despite the shared three-finger fold, the 3FTXs have diverse targets and biological activities. For example, α-neurotoxins inhibit muscle nicotinic acetylcholine receptors (nAChR) (Changeux, 1990), κ-neurotoxins inhibit neuronal AChR (Grant and Chiappinelli, 1985), muscarinic toxins inhibit muscarinic receptors (Marquer et al., 2011), fasciculins inhibit acetylcholinesterase (AChE) (Marchot et al., 1998), calciseptine modulates L-type calcium channels (De Weille et al., 1991; Garcia et al., 2001), cardiotoxins interact non-specifically with phospholipids (Konshina et al., 2017) or induce insulin secretion (Nguyen et al., 2012), mambin interacts with platelet receptors (McDowell et al., 1992), exactin inhibits Factor X (Girish and Kini, 2016), β-cardiotoxins inhibit β-adrenoreceptors (Rajagopalan et al., 2007), MTα inhibits α-adrenoreceptors (Koivula et al., 2010), mambalgins inhibit ASIC channels (Diochot et al., 2012), Tx7335 activates potassium channels (Rivera-Torres et al., 2016) and calliotoxin activates voltage-gated sodium channels (NaV) (Yang et al., 2016). Their toxic biological effects include flaccid or spastic paralysis due to the inhibition of AChE and ACh receptors (Grant and Chiappinelli, 1985; Changeux, 1990; Marchot et al., 1998; Marquer et al., 2011) and the activation of NaV1.4 (Yang et al., 2016) and L-type calcium channels (Garcia et al., 2001) in the periphery, necrosis through the action of cardiotoxins (cytotoxins) (Konshina et al., 2017), alteration of the cardiac rate through modulation of α- and β-adrenoreceptors (Rajagopalan et al., 2007; Koivula et al., 2010), and altered hemostasis through inhibition of platelet aggregation (McDowell et al., 1992) and Factor X (Girish and Kini, 2016). In addition to their multitude of bioactivities, 3FTXs can, remarkably, display toxicities that target distinct classes of organisms, as demonstrated in non-front-fanged snake venoms that produce 3FTX isoforms which are non-toxic to mice but highly toxic to lizards, and vice versa (Modahl et al., 2018b).

Some 3FTXs are able to induce analgesia through inhibition of ASIC channels (Salinas et al., 2014), while no 3FTXs are known to be involved in inflammation and hyperalgesia as commonly reported for other snake toxins. Furthermore, 3FTXs are relatively small compared to the other snake toxins discussed herein, and do not exhibit multiple domains to produce their multiple toxic functions. Nevertheless, the number of receptors, ion channels, and enzymes targeted by snake 3FTXs highlights the unique capacity of this fold to modulate diverse biological functions and the arsenal of toxic effects that are induced by 3FTXs. The unique multifunctionality of the 3FTX scaffold arises from its resistance to degradation and tolerance to mutations and large deletions (Kini and Doley, 2010). Therefore, the structure-activity relationship of the 3FTXs is complex and yet to be fully understood. Their functional sites are located on various segments of the molecule surface. Conserved regions determine structural integrity and correct folding of 3FTXs to form the three loops, including eight conserved cysteine residues found in the core region. Aromatic residues (Tyr25 or Phe27) are conserved in most 3FTXs and essential to their folding. Other conserved features are the antiparallel β-sheet structure and charged amino acid residues, which are also essential to stabilize the native conformation of the protein by forming hydrogen and ionic interactions, respectively (Torres et al., 2001). Additional disulfide bonds can be observed in either loop I or loop II, which can potentially change the activity of the 3FTX in some cases.

Specific amino acid residues in critical segments of the 3FTXs have been identified as important for binding to their targets. For example, the interactions between fasciculin and the AChE enzyme have been studied. The first loop or finger of fasciculin reaches down the outer surface of the enzyme, while the second loop inserts into the active site and exhibits hydrogen bonds and hydrophobic interactions (Harel et al., 1995; **Figure 4F**). Several basic residues in fasciculin make key contacts with AChE. In docking studies, hydrogen bonds and hydrophobic interactions were shown to establish the receptor-toxin assembly. Six amino acid residues (Lys25, Arg24, Asn47, Pro31, Leu35 and Ala12) of fasciculin interact with AChE residues by forming hydrogen bonds at its active site. Hydrophobic interactions are also observed between eight amino acid residues (Lys32, Cys59, Val34, Leu48, Ser26, Gly36, Thr15, Asn20) from fasciculin and the enzyme active site (Waqar and Batool, 2015). These interactions involve charged residues but lack intermolecular salt linkages.

Muscarinic toxins from mamba venoms, such as MT1 and MT7 (**Figures 4G,H**), act as highly potent and selective antagonists of the M1 receptor subtype through allosteric interactions with the M1 receptor. Fruchart-Gaillard et al. (2012) synthesized seven chimeric 3FTXs based on the MT1 and MT7 proteins that have remarkable affinity for the α1A-adrenoceptor subtype but low affinity for the M1 receptor. In this study, substitutions within loop 1 and loop 3 weakened the toxin interactions with the M1 receptor, resulting in a 2-fold decrease in affinity (**Figures 4I,J**). Furthermore, modifications in loop 2 of MT1 and MT7 significantly reduced the affinity for the M1 receptor. Interestingly, a significant increase in affinity for the α1A-adrenoceptor was achieved by combined modifications in loops 1 and 3, where loop 1 forms a critical interaction with the receptor (Fruchart-Gaillard et al., 2012). Another muscarinic toxin named MTβ was designed based on the ρ-Da1a protein from the black mamba, which is known to have affinity for the muscarinic receptor. The Ser/Ile38 and Ala/Val43 substitutions in ρ-Da1a could be responsible for the increased affinity of MTβ for the α1B-adrenoreceptor and α1D-adrenoreceptor (Blanchet et al., 2013). These two residues were not located at the tip of the toxin loop; however, they played a critical role in the interactions with their molecular targets (Bourne et al., 2005). These mutations seem to induce a significant change in the structure of the toxin, which is mostly due to additional hydrophobic interactions between Ile38 and the aliphatic side chains of the toxin that may induce a slight movement of the side chains surrounding Ser/Ile38.

Neurotoxin II (NTII) from Naja oxiana is a potent blocker of nAChR. NTII is a short-chain α-neurotoxin which consists of 61 amino acid residues and four disulfide bridges (**Figure 4K**). The interactions of NTII with the Torpedo californica nAChR have been examined using a computational model (Mordvintsev et al., 2005). The model showed that the binding of the short α-neurotoxin occurs by rearranging the aromatic residues in the binding pocket. The insertion of loop II into the binding pocket of the nAChR induces the neurotoxic activity and significantly determines the toxin-receptor interactions, while loops I and III contact the receptor residues by their tips only and determine the immunogenicity of the short neurotoxins. In the model, the Arg32 residue of NTII forms an ionic pair with Trp149 from the nAChR, and this is observed as the strongest interaction. Hydrogen bond interactions, such as Asp30 from loop II with Tyr198 of the nAChR and Lys46 from loop III with Thr191 of the nAChR, complement the ionic interaction between NTII and its target receptor. Aside from hydrogen bonds, van der Waals interactions were also observed for amino acid residues at the fingertip of loop II (Lys25, Trp27, Trp28, Ser29, His31, Gly33, Thr34 and Arg38) (Mordvintsev et al., 2005).

The structure of neurotoxin b (NTb), a long neurotoxin from Ophiophagus hannah, has been elucidated (Peng et al., 1997; **Figure 4L**). Conserved residues in loop II also play an important role in the toxicity of the long neurotoxins by making ionic interactions between toxin and receptor. The residues Lys24, Trp27 and Asp28 are highly conserved in the long neurotoxins. Furthermore, a modification of Trp27 in the long neurotoxin analog of NTb from king cobra venom led to a significant loss in neurotoxicity. The additional disulphide bridge in loop II of long neurotoxins does not itself affect the toxin activity. Nevertheless, cleavage of this additional disulphide bridge can disrupt the positively charged cluster at the tip of loop II. Changes in loop II conformation affect the binding of the long neurotoxin to the target receptor, resulting in the loss of neurotoxicity (Peng et al., 1997).

Long and short neurotoxins show sequence homology and similar structure. Previous studies show that many residues located at the tip of loop II are conserved in both short and long neurotoxins. The shared structure consists of long central β-sheets forming three loops and a globular core. From studies of α-bungarotoxin and α-cobratoxin, the least conserved regions of the long neurotoxins are the C-terminus and the first loop (Walkinshaw et al., 1980; Juan et al., 1999; Dutertre et al., 2017). However, significant differences between long-chain and short-chain neurotoxins are indicated by their immunological reactivity. Many of the residues involved in antibody binding to long neurotoxins are located in loop II, loop III, and the C-terminus, while in short neurotoxins the antibody epitopes involve loop I and loop II (Engmark et al., 2016).

# THERAPEUTIC IMPLICATIONS

# Treating Snake Envenomation

Animal-derived antivenoms are considered the only specific therapy available for treating snakebite envenoming (Maduwage and Isbister, 2014; Slagboom et al., 2017; Ainsworth et al., 2018). These consist of polyclonal immunoglobulins, such as intact IgGs or F(ab')2, or Fab fragments (Ouyang et al., 1990; Maduwage and Isbister, 2014; Roncolato et al., 2015), derived from hyperimmune animal serum/plasma (typically horse or sheep). When used rapidly and appropriately, they are capable of neutralizing life-threatening systemic envenoming, for example pathologies such as venom-induced coagulopathy, hemorrhage, neurotoxic effects, and/or hypotensive shock (Warrell, 1992; Calvete et al., 2009; Maduwage and Isbister, 2014; Slagboom et al., 2017; Ainsworth et al., 2018).

Antivenoms can be classified as monovalent or polyvalent depending on the immunogen used during production. Monovalent antivenoms are produced by immunizing animals with venom from a single snake species, whereas polyvalent antivenoms contain antibodies produced from a cocktail of venoms of several medically relevant snakes from a particular geographical region. Polyvalent antivenoms are therefore designed to address the limited paraspecific cross-reactivity of monovalent antivenoms by stimulating the production of antibodies against diverse venom toxins found in different snake species, and to avoid issues relating to the wrong antivenom being given due to a lack of existing snakebite diagnostic tools (O'leary and Isbister, 2009; Abubakar et al., 2010). However, polyvalent therapies come with disadvantages: larger therapeutic doses are required to effect cure, potentially resulting in an increased risk of adverse reactions, and in turn increasing the cost to impoverished snakebite victims (Hoogenboom, 2005; O'leary and Isbister, 2009; Deshpande et al., 2013; Roncolato et al., 2015).

Variation in venom constituents therefore poses a great challenge for the development of broadly effective snakebite therapeutics. The diversity of toxins found in the venom of any one species represents considerable complexity, which is further enhanced when trying to neutralize the venom of multiple species, particularly given variations in the immunogenicity of the multi-functional toxins described in this review. Antivenom efficacy is therefore typically limited to those species whose venoms were used as immunogens and, in a number of cases, closely related snake species that share sufficient toxin overlap for the generated antibodies to recognize and neutralize the key toxic components (Casewell et al., 2010; Segura et al., 2010; Williams et al., 2011; Ainsworth et al., 2018).

Because variation in venom composition is ubiquitous at every level of snake taxonomy (e.g., interspecifically, intraspecifically, and even ontogenetically; Chippaux et al., 1991; Gibbs et al., 2013; Casewell et al., 2014; Durban et al., 2017; Ainsworth et al., 2018), gaining an understanding of venom composition in medically important snake species is valuable, and can inform predictions of the likely paraspecific neutralizing capability of an antivenom, and therefore the geographical applicability of a particular therapeutic. Consequently, venom toxicity/pathology analyses in combination with venom proteomics, antivenomics, and/or immunological analyses have been integrated to investigate the paraspecificity of antivenoms (Calvete et al., 2009; Madrigal et al., 2012; Pla et al., 2013; Tan et al., 2015). Such studies have revealed surprising cross-reactivity of antivenoms against distinct, non-targeted, snake species, such as: (i) the potential utility of Asian antivenoms developed against terrestrial elapid snakes at neutralizing the venom toxicity of potent sea snake venoms (Tan et al., 2015), (ii) the seeming utility of African polyvalent antivenom at neutralizing the venom of a genus of elapid snakes not included in the immunizing mixture (Whiteley et al., 2019), and (iii) the potential for saw-scaled viper antivenom to be used as an alternative treatment for bites by the boomslang (Dispholidus typus) in regions where the appropriate species-specific antivenom is unavailable or unaffordable (Ainsworth et al., 2018). The latter of these studies demonstrated cross-neutralization between distinct snake lineages (e.g., viper and colubrid), and this surprising finding was ultimately informed by venom comparisons indicating that these snakes had converged upon similar toxin compositions. Thus, detailed knowledge of venom composition can greatly inform studies assessing the geographical utility of antivenoms.

The application of "venomic" (e.g., transcriptomic, proteomic) approaches to characterize venom composition has revealed that many abundant venom proteins belong to only a few major families of toxins, which, despite their diversity, often share antigenic determinants (Calvete et al., 2007, 2009; Gutiérrez et al., 2009). Such studies have stimulated much research into the development of novel therapeutic approaches to tackle snakebite. These include the use of monoclonal antibody technologies to target key pathogenic toxins found in certain snake species (Laustsen et al., 2018; Silva et al., 2018), pathology-focused rather than geographically focused approaches to neutralizing venom toxins (Ainsworth et al., 2018), and the use of small molecule inhibitors designed to generically target specific toxin classes (Arias et al., 2017; Bryan-Quiros et al., 2019). The priority targets for these "next generation" snakebite therapeutics are the multifunctional toxins described herein (PLA2s, SVMPs, SVSPs, 3FTXs) because of the major pathologies that they cause in snakebite victims. It is anticipated that these new therapeutics may in future offer superior specificity, neutralizing capability, affordability and safety over conventional antivenoms. However, the translation of their early research promise into the mainstay of future snakebite treatments will ultimately rely on further research on the toxins that they are designed to neutralize. Specifically, the selection, testing and optimization of new tools to combat snake envenoming relies upon the characterization of key pathogenic, and often multifunctional, toxins found in the venoms of a diverse array of medically important snake species.

# Snake Toxins as Therapeutics, Cosmetics and in Diagnostics

The first drug derived from animal venoms to be approved by the FDA was captopril, a potent inhibitor of the angiotensin converting enzyme (sACE) used to treat hypertension and congestive heart failure (Cushman et al., 1977; Cushman and Ondetti, 1999). Captopril was derived from proline-rich oligopeptides from the venom of the Brazilian snake Bothrops jararaca (Ferreira et al., 1970; Gavras et al., 1974). This milestone in translational science in the late 1970s revealed the exceptional potential of snake venoms, and possibly other animal venoms such as those of spiders and cone snails, as an exquisite source of bioactive molecules with applications in drug development. More recently, an antiplatelet drug derived from the venom of the southeastern pygmy rattlesnake Sistrurus miliarius barbouri was commercialized as Integrilin by Millennium Pharmaceuticals, and is used to prevent acute cardiac ischemia (Lauer et al., 2001). Furthermore, a group of snake α-neurotoxins named waglerins from the viper Tropidolaemus wagleri (Schmidt and Weinstein, 1995; Debono et al., 2017) was used in the development of anti-wrinkle cosmetics by Pentapharm Ltd., a Swiss-based chemical company. The resulting product is now commercialized as Syn-AKE. The same company commercialized Defibrase®, an SVSP purified from Bothrops moojeni, for use in acute cerebral infarction, angina pectoris and sudden deafness, and Haemocoagulase, purified from Bothrops atrox, for the treatment of hemorrhages of various origins.

Snake toxins have been applied with great success in diagnostics. For example, the Textarin:Ecarin test is commonly used to detect Lupus Anticoagulant (LA) (Triplett et al., 1993), and is composed of an SVSP from the venom of the Australian Eastern brown snake Pseudonaja textilis (Textarin) (Stocker et al., 1994), and an SVMP from the venom of the saw-scaled viper Echis carinatus (Ecarin) (Stocker et al., 1986). Snake toxins also have the potential to become novel painkillers. The toxin crotalphine, from the venom of Crotalus durissus, is a 14-residue peptide able to induce analgesia through modulation of κ-opioid receptors and TRPV1 channels (Gutierrez et al., 2008; Konno et al., 2008; Bressan et al., 2016), while mambalgin, a 3FTX from Dendroaspis polylepis, induces analgesia by inhibiting ASIC channels (Diochot et al., 2012). Other 3FTXs have been applied in studies of novel treatments for blood pressure disorders (MTα), blood coagulation disorders (KT-6.9), type-2 diabetes (cardiotoxin 1) and infertility (actiflagelin) (Utkin, 2019). These findings, alongside current research into venom toxins, suggest an exciting future for the use of snake venoms in the field of drug discovery.

# CONCLUSIONS

Snake venoms are amongst the most fascinating animal venoms regarding their complexity, evolution, and therapeutic applicability. They also present one of the most challenging drug targets due to the variable toxin compositions injected following snakebite. The multifunctional approach adopted by the major components of their venoms, using multidomain proteins and peptides with promiscuous folds (e.g., three-finger fold), as well as their diversity of toxic effects, is unique and has yet to be identified in other animal venoms at such a level of complexity. Gaining a better understanding of the evolution, structure-activity relationships and pathological mechanisms of these toxins is essential to develop better snakebite therapies and novel drugs.

Recent developments in genomics, proteomics and bioactivity assays, as well as in the understanding of human physiology in health and disease, are enhancing the quality and speed of research into snake venoms. We hope to improve the therapies used to neutralize the toxic effects of PLA2s, SVMPs, SVSPs and 3FTXs, and to develop drugs as new antidotes for a broad spectrum of snake venoms that could also be effective in preventing the described inflammatory reactions and pain induced by snakebite. Finally, a diversity of biological functions in snake venoms is yet to be explored, including their inflammatory properties and their intriguing interactions with sensory neurons and other compartments of the nervous system. Exploring these functions should lead to the elucidation of new biological functions and the development of useful research tools, diagnostics and therapeutics.

# AUTHOR CONTRIBUTIONS

FC provided theme, scope, and guidance. FC, CF, AA, CX, NC, RL, and JK wrote the manuscript. FC, NC, RL, and JK critically reviewed the manuscript.

# REFERENCES


# FUNDING

Australian National Health and Medical Research Council (APP1119056) provided a Fellowship to RL; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) provided an international scholarship to CF; Sir Henry Dale Fellowship to NC (200517/Z/16/Z) funded by the Wellcome Trust and the Royal Society.


eptifibatide in patients with acute coronary syndromes: observations from the platelet IIb/IIIa in unstable angina: receptor suppression using integrilin therapy (PURSUIT) trial. Circulation 104, 2772–2777. doi: 10.1161/hc4801.100358


the king cobra (Ophiophagus hannah). J. Biol. Chem. 272, 7817–7823. doi: 10.1074/jbc.272.12.7817


of venom, bring new hope for drug discovery. Bioessays 33, 269–279. doi: 10.1002/bies.201000117


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Ferraz, Arrahman, Xie, Casewell, Lewis, Kool and Cardoso. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Not so Dangerous After All? Venom Composition and Potency of the Pholcid (Daddy Long-Leg) Spider *Physocyclus mexicanus*

Pamela A. Zobel-Thropp<sup>1</sup>, Jennifer Mullins<sup>1</sup>, Charles Kristensen<sup>2</sup>, Brent A. Kronmiller<sup>3</sup>, Cynthia L. David<sup>4</sup>, Linda A. Breci<sup>4</sup> and Greta J. Binford<sup>1</sup>\*

*<sup>1</sup> Department of Biology, Lewis & Clark College, Portland, OR, United States, <sup>2</sup> Spider Pharm, Yarnell, AZ, United States, <sup>3</sup> Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR, United States, <sup>4</sup> Arizona Proteomics Consortium, University of Arizona, Tucson, AZ, United States*

### *Edited by:*

*Kartik Sunagar, Indian Institute of Science (IISc), India*

### *Reviewed by:*

*Volker Herzig, The University of Queensland, Australia*

*Dilza Trevisan Silva, Instituto Butantan, Brazil*

> *\*Correspondence: Greta J. Binford binford@lclark.edu*

### *Specialty section:*

*This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution*

> *Received: 28 March 2019 Accepted: 18 June 2019 Published: 12 July 2019*

### *Citation:*

*Zobel-Thropp PA, Mullins J, Kristensen C, Kronmiller BA, David CL, Breci LA and Binford GJ (2019) Not so Dangerous After All? Venom Composition and Potency of the Pholcid (Daddy Long-Leg) Spider Physocyclus mexicanus. Front. Ecol. Evol. 7:256. doi: 10.3389/fevo.2019.00256*

Pholcid spiders (Araneae: Pholcidae), officially "cellar spiders" but popularly known as "daddy long-legs," are renowned for potentially having deadly toxic venom, even though venom composition and potency have never formally been studied. Here we detail the venom composition of male *Physocyclus mexicanus* using proteomic analyses and venom-gland transcriptomes ("venomics"). We also analyze the venom's potency on insects, and assemble available evidence regarding mammalian toxicity. The majority of the venom (51% of tryptic polypeptides and 62% of unique tryptic peptides) consists of proteins homologous to known venom toxins including enzymes (astacin metalloproteases, serine proteases and metalloendopeptidases, particularly neprilysins) and venom peptide neurotoxins. We identify 17 new groups of peptides (U1−17-PHTX), most of which are homologs of known venom peptides and are predicted to have an inhibitor cysteine knot fold; of these, 13 are confirmed in the proteome. Neprilysins (M13 peptidases) and astacins (M12 peptidases) are the most abundant venom proteins, respectively representing 15 and 11% of the individual proteins and 32 and 20% of the tryptic peptides detected in crude venom. Comparative evidence suggests that the neprilysin gene family is expressed in venoms across a range of spider taxa, but has undergone an expansion in the venoms of pholcids and may play a central functional role in these spiders. Bioassays of crude venoms on crickets resulted in an effective paralytic dose of 3.9 µg/g, which is comparable to that of crude venoms of *Plectreurys tristis* and other Synspermiata taxa. However, crickets exhibit flaccid paralysis and regions of darkening that are not observed after *P. tristis* envenomation. Documented bites on humans make clear that while these spiders can bite, the typical result is a mild sting with no long-lasting effects. Together, the evidence we present indicates that pholcid venoms are a source of interesting new peptides and proteins, and that the effects of bites on humans and other mammals are inconsequential.

Keywords: neprilysin, astacin metalloproteinase, insecticidal neurotoxin, proteome, PD50

# INTRODUCTION

Pholcid spiders (Araneae: Pholcidae) are an example of a disconnect between public perception and scientific understanding. Commonly called "cellar spiders," some pholcids are also referred to as "daddy long-legs" and have been maligned in popular culture as having venoms with the highest mammalian potency of all spiders, venoms that would be lethal if only the spiders had fangs that could penetrate human skin. Pholcids, in reality, are a family of spiders that has been diversifying for over 200 million years (Dimitrov et al., 2012; Eberle et al., 2018) and are among the top six percent of the most diverse spider families (1,714 species in 94 genera, World Spider Catalog, February 2019). Reports of high mammalian potency emerge from rumor and non-scientific anecdote. We are not aware of any scientific work to date that has attempted to characterize the venom composition of any pholcid species. Given their evolutionary depth and diversity, pholcids provide a remarkable opportunity to discover new toxins and gain insight into the broader patterns of venom diversity within spiders, while directly investigating a widespread urban myth.

Spiders of the family Pholcidae are interesting animals in their own right. Found on all continents except Antarctica, they range in size from minute to ∼1 cm in adult body length. They tend to have long, thin (hair-like) legs that are readily dropped defensively and twitch to deter predators after being removed from the body (e.g., Johnson and Jakob, 1999). As generalist foragers of arthropods, they build tangly, non-adhesive webs in protected places, and diversification within the family has involved extensive diversification of web microhabitats (Eberle et al., 2018). In prey capture, they either wait for prey to encounter their web or, in some species, leave their web and enter the webs of other spiders to capture them and/or consume their egg sacs or prey (Jackson and Brassington, 1987). When pholcids encounter prey, they immobilize them by first wrapping them with silk, and then biting to envenomate and eventually feed (Jackson and Brassington, 1987). Many common pholcids readily feed on other spiders, including black widows (Latrodectus) (pers. obs.), whose venoms are highly potent on both arthropods and vertebrates (Wang et al., 2007). Given that spiders are dangerous prey, this observation raises the question of how potent pholcid venoms are on arthropods. While Kirchner and Opderbeck (1990) provide data suggesting that venoms of Pholcus phalangioides only weakly affect insect prey, which are ultimately killed by digestive enzymes, we have much to learn about the details of pholcid venom composition and how the venoms affect prey. Although small, their fangs can certainly penetrate the cuticle of many insects.

Beyond insects, the fangs of adult male Physocyclus mexicanus are able to pierce uncalloused human skin (Kristensen, pers. obs.). In fact, the only publicly available recorded human bites we are aware of are from Mythbusters (Episode 13, premiered 15 Feb 2004), in which male P. mexicanus bit out of aggression, resulting in a brief mild sting (https://www.discovery.com/tv-shows/mythbusters/videos/daddy-longlegsminimyth). The same episode also reports the only direct tests of mammalian toxicity of daddy long-leg venom, with injections of male P. mexicanus venom into mice. In comparative injections of the same volume (one µl, ∼80 µg, Kristensen, pers. obs.), venom from the Western black widow Latrodectus hesperus was more potent than that of P. mexicanus. Obtaining one µl required venom extraction from 50–100 adult males (Kristensen, pers. obs.), and the Mythbusters assays suggest that much larger amounts of venom would be required to obtain an LD<sub>50</sub> for P. mexicanus venom on mice. Together, these first reports suggest bites from this species are not harmful to mammals. We expand on these observations in our discussion.

Shifting focus from potential risks to humans, venoms of pholcids and all spiders are exciting phenotypes because their rich chemical diversity is a natural storehouse within which we can discover new molecular activities, and study patterns of diversity and mechanisms of evolution that have generated these phenotypes (e.g., Escoubas, 2006; King and Hardy, 2013). Venoms contain a complex mixture of bioactive components (proteins, peptides, and small molecules) that vary widely across the over 48,000 described species of spiders (Pineda et al., 2018; World Spider Catalog, 2019). A growing body of research characterizing venoms from diverse representatives of spiders is allowing inference of the phylogenetic distribution of venom components. Pholcids are in a higher lineage of spiders called Synspermiata (Michalik and Ramirez, 2014), and more specifically in the "lost trachea clade" (Wheeler et al., 2016). Within the last few years, select venoms in this clade have been characterized, allowing inference of the presence and distribution of gene families that are expressed in venom. Venomic sampling to date includes venoms of Plectreurys tristis (Zobel-Thropp et al., 2014a), Scytodes thoracica (Zobel-Thropp et al., 2014b), and venoms within the family Sicariidae, which includes brown recluse spiders (review in Chaves-Moreira et al., 2017). Adding a representative pholcid venom to this comparative set allows inference about the phylogenetic distribution and relative abundance of widespread gene families that contribute to spider venoms.

The goal of this study was to use transcriptomic and proteomic methods to characterize venom proteins of the North American pholcid, Physocyclus mexicanus. Our sampling includes venom from adult males, and transcriptomes with mRNAs from adult males and females sampled from the same population used in the Mythbusters episode. To our knowledge, this investigation is the first proteomic characterization of pholcid venom. We consider venom composition of this species in the context of what is known about venoms of other Synspermiata spiders, and discuss evidence for abundant and diverse astacins, comparable to their near relative Plectreurys tristis, and an expanded role of neprilysin homologs and diverse peptides with and without homology to known toxins. To better understand venom effects on natural prey, we use bioassays to evaluate potency on crickets. We also summarize information available on the propensity of these animals to bite humans and the effects of venom on mammals. Together, these results provide an important contribution to comparative venom analyses, and more formally address the urban myth of daddy long-leg spiders having exceptional venom potency on humans.

# MATERIALS AND METHODS

# Spider and Venom Collection

All spiders used in this work were adult Physocyclus mexicanus Banks from a local population in a single building at Spider Pharm (Yarnell, AZ; 34° 13′ 12.34″ N, 112° 44′ 56.43″ W; 4,796 ft elev). After spiders had gone unfed for at least 1 week, venom was extracted using electrostimulation as in Binford and Wells (2003). This was done under two circumstances. The first, at Lewis & Clark College (L&C), was intended to induce upregulation of expression to optimize capture of venom mRNAs. In parallel, at Spider Pharm, spiders were milked for proteomic analyses and insect bioassays, taking care to avoid contamination with regurgitate. Electrostimulation did not yield sufficient pure samples of venom from females in either lab: at L&C, of 12 adult females, six yielded tiny drops of venom, and three of these were contaminated by regurgitate. In contrast, between 500 and 600 adult males were milked at Spider Pharm to obtain sufficient uncontaminated venom for proteomics and bioassays. Therefore, all methods requiring crude venom were done with venom from males only, collected at Spider Pharm, but venom-gland transcriptomes were obtained from a combination of glands from males and females as described below. Voucher specimens are retained in the L&C Natural History collections and deposited in the National Museum of Natural History.

# Proteomic Analyses

### LC-MS/MS Detection of Tryptic Peptides From Crude Venom

Crude venom extracted at Spider Pharm (∼80 µg from ∼100 male spiders) was sent to the Arizona Proteomics Consortium (University of Arizona) for proteomic analysis. Proteins were analyzed from an in-gel tryptic digest (**Figure 1A**) with LC-MS/MS using an LTQ Orbitrap Velos mass spectrometer (Thermo Fisher Scientific) as in Zobel-Thropp et al. (2014a,b). The sample contained a mixture of two digests using trypsin (Princeton Separations) in the presence of ProteaseMAX (Promega, manufacturer's protocol). One digest was from the excision of an entire lane containing 10 µg of crude venom (12% Tris-glycine SDS-PAGE) and the other was from an excised region of <10 kDa only, taken from a separate gel (25 µg crude venom, 16.5% Tris-tricine SDS). The second digest was an attempt to enrich the sample with small venom polypeptides and peptides, which are prone to diffusing out of the protein gel. Samples were desalted using OMIX C18 tips (Agilent Technologies), injected onto a Proxeon Easy nano-HPLC (Proxeon, Odense) precolumn (100 µm, 2 cm, Fisher Scientific) and separated on a C18 column (75 µm, 10 cm, Fisher Scientific) at a flow rate of 300 nl/min over a 145 min gradient of 5–40% solvent B (A = H<sub>2</sub>O, B = acetonitrile, both with 0.1% (v/v) formic acid). Eluate was delivered by a nanospray source (Nanomate, Advion) with a voltage of 1.85 kV to the Orbitrap Velos. Survey scans were acquired at 60,000 resolution and the top 14 m/z values were selected for CID (width = 2 amu) at 35% relative energy in the Velos linear trap and then placed on an exclusion list for 45 s.

Resulting MS/MS data were searched using SEQUEST on Discoverer (v. 1.3.0.339, Thermo Fisher Scientific) against masses of theoretical fragments from two databases. The first included a combined set of sequences generated by the Binford lab including Illumina HiSeq NGS data from 12 Synspermiata species (described below), and Sanger cDNA library sequences from 10 species as in Zobel-Thropp et al. (2014b) (described below). The second included all chelicerate sequences in NCBI (downloaded 2/17/2016), totaling 964,541 sequences. Matches required <10 ppm precursor error. The reversed database was searched to provide quality scores for FDR filtering. The SEQUEST output was organized in Scaffold (version Scaffold\_4.8.6, Proteome Software Inc., Portland, OR; Searle, 2010). Peptides were identified with 95% minimum threshold and 0% false discovery rate (FDR), and proteins were identified with 95% minimum threshold, two peptides minimum per protein, and 0% FDR. Details are in **Table S2**. While our goal for this analysis is thorough representation of peptides present in male crude venom, there are a number of standard reasons peptides or proteins may be missed. These include (but are not limited to) sample degradation or concentrations too low for mass spectrometry requirements, incomplete gel extraction, poor or no tryptic cleavage, lack of peptide binding to the C18 column, poor ionization or fragmentation, or post-translational modifications that were not included in the data search. Proteome profiling data will be available through the PRIDE repository, and amino acid and cDNA sequences for all proteins detected in this study are deposited in GenBank (NGS accession numbers MN004877–MN004966) and included in **Table S1**.
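The two filters described above, a precursor mass tolerance expressed in ppm and a false discovery rate estimated from hits to a reversed (decoy) database, can be sketched as follows. This is a minimal illustration of the general target-decoy idea and is not the SEQUEST or Scaffold implementation; the `PSM` record, its field names, the score scale, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """One peptide-spectrum match. All fields are illustrative assumptions."""
    peptide: str
    score: float           # search-engine score, higher is better
    observed_mz: float     # measured precursor m/z
    theoretical_mz: float  # m/z computed from the candidate peptide
    is_decoy: bool         # True if the match came from the reversed database


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed precursor mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def decoy_fdr(psms, score_cutoff: float, max_ppm: float = 10.0) -> float:
    """Estimate FDR as (decoy hits) / (target hits) among accepted PSMs."""
    kept = [
        p for p in psms
        if p.score >= score_cutoff
        and abs(ppm_error(p.observed_mz, p.theoretical_mz)) < max_ppm
    ]
    decoys = sum(p.is_decoy for p in kept)
    targets = len(kept) - decoys
    return decoys / targets if targets else 0.0


# Hypothetical matches, invented only to exercise the functions.
example_psms = [
    PSM("PEPTIDEA", 3.0, 500.001, 500.0, False),
    PSM("PEPTIDEB", 2.5, 600.002, 600.0, False),
    PSM("AEDITPEP", 2.0, 700.001, 700.0, True),   # decoy hit
    PSM("PEPTIDEC", 3.5, 800.020, 800.0, False),  # about 25 ppm off, rejected
]
```

Raising `score_cutoff` until `decoy_fdr` returns zero corresponds, in spirit, to the 0% FDR setting reported here, at the cost of discarding lower-scoring true matches.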

# Transcriptome Database

### RNA Isolation

A single pool of RNA isolated from 12 adult P. mexicanus was used for generation of Sanger cDNA libraries and Illumina 3000 transcriptomes. To up-regulate venom transcripts, venom was extracted from 12 spiders across two consecutive days. Six were milked on one day (three females and three males) and six were milked the following day (three females and three males). Then, on the same day (three and four days after extraction), all 12 spiders were anesthetized with CO<sub>2</sub> and venom glands were extracted and flash-frozen immediately in liquid nitrogen. Given differential success in obtaining venom, only glands from spiders from which venom was extruded were used for mRNA isolation. We pooled glands between sexes to optimize total mRNA, and to generate a thorough within-species transcriptome with the goal of annotating the proteome. Transcriptional timing is not known for pholcids; however, this window of time has been successful for isolating venom gland mRNAs from other Synspermiata spiders, and combining three- and four-day transcripts likely increased the breadth of transcripts captured (Binford et al., 2005; Zobel-Thropp et al., 2014a,b). Total RNA was isolated using the ChargeSwitch Total RNA Cell kit (Invitrogen) following the manufacturer's protocol.

## Sanger cDNA Library Construction and Screening

We constructed the venom gland cDNA library using the SMART™ cDNA library construction kit (Clontech). Details of all procedures including library construction, cDNA packaging, screening and sequence analysis are the same as in Zobel-Thropp et al. (2014b). For this library, we used 94 ng of total RNA for first strand synthesis of cDNA and followed the manufacturer's protocol for library construction. The titer for this library was 1.4 × 10<sup>6</sup> pfu/ml. We screened 1,685 clones using PCR and sequenced 311 cDNAs ≥500 bp on an Applied Biosystems 3730 Analyzer at the Genomic Analysis and Technology Core (University of Arizona) using TriplEx2 vector-specified primers. We trimmed vector sequence and assembled sequences using Sequencher 5.1 (Gene Codes Corp.). Due to chromatogram ambiguity, we discarded 32 sequences, resulting in 279 high quality sequences for transcriptomic analysis.

detected by Orbitrap. Asterisks (\*) indicate hypothetical proteins with unknown function.

### Illumina 3000 Transcriptomes

From the same P. mexicanus RNA sample used for the Sanger cDNA library, we sequenced a transcriptome using Illumina 3000 at the Center for Genome Research and Biocomputing (CGRB) at Oregon State University. This was done as part of a larger project that included RNA samples from 23 transcriptomes from 11 species. All sequences resulting from these taxa were combined into a single dataset used to annotate the P. mexicanus proteome. The taxon representation in the dataset includes venom-gland and whole body transcriptomes from Plectreurys tristis and Scytodes thoracica (venom gland RNA is from the same samples used in Zobel-Thropp et al., 2014a,b, respectively), Austrarchaea mainae, Periegops suterii, Hexophthalma dolichocephalus, Drymusa serrana, Loxosceles spinulosa, L. rufescens, L. arizonica, and L. reclusa. For each of these samples, total RNA pools (ChargeSwitch, Invitrogen) were quantified and analyzed for quality using a Nano chip on the Agilent 2100 bioanalyzer at the CGRB. For NGS transcriptome construction, each total RNA pool was enriched for mRNA and prepared for multiplexed sequencing with PrepX reagents for RNA-Seq library construction (Wafergen). The final libraries were checked for quality and size distribution, followed by qPCR analysis for quality control. Transcriptomes were multiplexed into two Illumina HiSeq 3000 150 bp paired-end lanes.

### Transcriptome Assembly

Each transcriptome sample was individually de novo assembled using Trinity (Grabherr et al., 2011). Coding regions were identified from the de novo transcriptome assemblies with TransDecoder (Haas et al., 2013). The three-step TransDecoder process was used: first, TransDecoder.LongOrfs was used to identify the long open reading frames; second, BLASTP (Altschul et al., 1990) was used against a set of known proteins [spider proteins from GenBank (Benson et al., 2005), ArachnoServer (Pineda et al., 2018), previously isolated Sanger cDNAs from the Binford lab, Latrodectus hesperus (https://www.hgsc.bcm.edu/arthropods/western-black-widow-spider-genome-project), Loxosceles reclusa (https://www.hgsc.bcm.edu/arthropods/brown-recluse-spider-genome-project), and Centruroides exilicauda (https://www.hgsc.bcm.edu/arthropods/bark-scorpion-genome-project)]; third, TransDecoder.Predict was used in conjunction with the TransDecoder.LongOrfs output and the protein output to predict the likely coding regions. Output contigs from TransDecoder were combined with the known proteins described above and the final two steps of TransDecoder were run a second time.

To make sure the TransDecoder steps did not remove any essential contigs, a second comparison was done against the original Trinity assemblies. Contigs in the Trinity assembly that were not identified by the TransDecoder process were aligned to previously isolated Sanger venom-gland cDNAs (Binford lab, unpublished) to identify important but missing genes. Sequences that had hits in this alignment were then aligned back to the TransDecoder peptides; if no hit was found, they were added to the final TransDecoder set.
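The first TransDecoder step described above, finding long open reading frames in assembled contigs, can be illustrated with a small six-frame scan. This is a conceptual sketch only and not TransDecoder's algorithm: it reports only complete ATG-to-stop ORFs, and the function names and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq: str, frame: int):
    """Yield (start, end) for each ATG..stop ORF in one reading frame.

    `end` is exclusive and includes the stop codon.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield start, i + 3
            start = None


def long_orfs(seq: str, min_codons: int = 100):
    """Return (strand, frame, orf_sequence) for complete ORFs in six frames.

    An ORF is kept if it encodes at least `min_codons` amino acids,
    not counting the stop codon.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for a, b in orfs_in_frame(s, frame):
                if (b - a) // 3 - 1 >= min_codons:
                    found.append((strand, frame, s[a:b]))
    return found


# Toy contig (hypothetical): one short ORF in forward frame 2.
toy_contig = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
toy_orfs = long_orfs(toy_contig, min_codons=3)
```

In the workflow described here, candidate ORFs of this kind are then supported by homology (the BLASTP step) before final coding regions are predicted.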

# Proteome Annotation

Our analysis pipeline for sequence annotation is illustrated in **Figure 1**. The final set of proteins was annotated into functional categories based on results from combined searches in the Conserved Domain Database (CDD v3.16, https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi), BLAST searches in the NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and ArachnoServer (Pineda et al., 2018; http://www.arachnoserver.org) public databases, and our database containing sequences from several Synspermiata venom gland cDNA libraries constructed in the Binford lab (unpublished). Hits were recognized as significant if e-values were ≤10<sup>−5</sup>. All of the sequences that were detected by peptide spectra in mass spec analyses (116 unique polypeptides) were annotated manually using the following steps, which largely follow those in Zobel-Thropp et al. (2014a).

We identified 38 unique polypeptides with close homology to proteins known to be involved in conserved metabolic, housekeeping, or maintenance functions. Hits had strong e-values (10<sup>−20</sup> to 0.0), and all matches agreed between hits in CDD, InterProScan, and NCBI (**Table S1**). Polypeptides that hit domains previously described as venom proteins were categorized as "proteins with venom function." Sequences that had no hits in CDD, InterProScan, or NCBI (60) included peptides with characteristics consistent with venom peptides (secreted with characteristic disulfide patterns) and were categorized as "venom peptides." Finally, 18 polypeptides were categorized as "proteins of unknown function," because they either had strong matches to proteins that were not considered housekeeping/metabolic proteins (14; e-values 10<sup>−29</sup> to 0.0) or they did not have any hits in any databases (4), so potential function in venom could not be determined.
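The categories above were assigned manually, but the underlying decision logic can be approximated in a few lines. The sketch below is an assumption-laden illustration, not the authors' procedure: the keyword sets, the `looks_like_venom_peptide` flag, and the order of the checks are hypothetical, and only the e-value cutoff of 10<sup>−5</sup> is taken from the text.

```python
E_VALUE_CUTOFF = 1e-5  # significance threshold stated in the text


def significant(hits):
    """Keep (description, e_value) hits at or below the cutoff."""
    return [h for h in hits if h[1] <= E_VALUE_CUTOFF]


def categorize(hits, housekeeping_terms, venom_terms,
               looks_like_venom_peptide=False):
    """Assign one polypeptide to a functional category.

    hits: list of (description, e_value) pairs pooled across databases.
    housekeeping_terms, venom_terms: lowercase keywords (hypothetical).
    looks_like_venom_peptide: True if the sequence is secreted and
        cysteine-rich, as judged by a separate check.
    """
    descriptions = [d.lower() for d, _ in significant(hits)]
    if any(term in d for d in descriptions for term in venom_terms):
        return "protein with venom function"
    if any(term in d for d in descriptions for term in housekeeping_terms):
        return "housekeeping/metabolic"
    if not descriptions and looks_like_venom_peptide:
        return "venom peptide"
    return "protein of unknown function"


# Hypothetical keyword sets used only for illustration.
HOUSEKEEPING = {"ribosomal", "actin", "tubulin"}
VENOM = {"neprilysin", "astacin", "serine protease"}
```

A rule of this kind is only a first pass; cases where databases disagree still need the manual inspection described here.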

With the exception of sequences that hit housekeeping/metabolic proteins in our initial homology searches, we identified homologous sequences within venom proteomes across our Synspermiata dataset, and aligned those with homologs in NCBI using MAFFT followed by manual editing in Mesquite [Version 3.51 (build 898), http://mesquiteproject.org]. We identified conserved characteristics consistent with known venom toxins (e.g., cysteine-rich), including signal peptide cleavage sites with SpiderP (http://www.arachnoserver.org/spiderP.html) and knottin or ICK predicted folding patterns using Knoter1D (http://www.dsimb.inserm.fr/KNOTTIN/knoter1d.php) tools. All full-length sequences were analyzed for signal peptide cleavage sites using SignalP 4.1 (http://www.cbs.dtu.dk/services/SignalP/). Protein coding sequences were considered full-length if they contained a methionine and a stop codon.
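As a rough illustration of two of the checks mentioned above, the sketch below tests whether a coding sequence is full-length and counts cysteines in a mature peptide. It assumes that "contained a methionine and a stop codon" means an ATG start codon and an in-frame terminal stop codon, which is an interpretation rather than the authors' stated rule. The six-cysteine floor reflects the three disulfide bonds of an inhibitor cystine knot; it is a necessary condition only and is no substitute for a dedicated predictor such as Knoter1D.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_full_length(cds: str) -> bool:
    """True if the CDS starts with ATG and ends with an in-frame stop codon."""
    cds = cds.upper()
    return (
        len(cds) >= 6
        and len(cds) % 3 == 0
        and cds.startswith("ATG")
        and cds[-3:] in STOP_CODONS
    )


def cysteine_count(peptide: str) -> int:
    """Number of cysteine residues in a protein sequence (one-letter code)."""
    return peptide.upper().count("C")


def has_enough_cysteines_for_ick(peptide: str) -> bool:
    """Three disulfide bonds need at least six cysteines (necessary only)."""
    return cysteine_count(peptide) >= 6
```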

### Sanger cDNA Library Annotation

The cDNA library consists of 251 sequences that were aligned, analyzed, and separated into 24 clusters and 60 singletons (individual sequences) (**Figure S1**, **Table S3**). Of those in the original dataset (279), sequences from two clusters and 11 singletons were deemed "no open reading frame," and 28 sequences were omitted from the dataset because they hit a sequence in NCBI, but when translated they were repetitive and did not contain an ORF. Our annotation of the Sanger cDNA sequences followed the same steps as those detailed above.

Given evidence of high levels of index swapping in our Illumina sequences, we only use our Illumina sequences as a database source for mass spec analyses. We limit our analysis of the transcriptome to sequences isolated from our Sanger cDNA library of P. mexicanus, for which we are confident in the taxonomic and tissue source. The cDNA library includes combined transcripts of both male and female spiders, and highly expressed homologs that we did not detect in the male-only proteome may provide insight into female venom composition.

All non-redundant sequences >200 bp were submitted to GenBank and granted accession numbers (KP138796-KP138905).

### Phylogenetic Analyses

We estimated phylogenetic relationships of the most abundant toxins detected in our proteomes to help interpret diversity within the group, and to infer how these toxins are related to homologs in NCBI and proteomes of other Synspermiata. To do this we identified sets of related sequences using BLAST with a threshold of e < 10<sup>−5</sup>, aligned them initially with the MAFFT online server (http://MAFFT.cbrc.jp/alignment/server/), and then refined the alignments manually in Mesquite v3.2. Bayesian phylogenies were inferred using MrBayes XSEDE through the CIPRES portal. Analyses were run for 50 million generations, or until the average standard deviation of split frequencies decreased below 0.01.
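As an illustration of the e-value threshold described above, the following sketch filters BLAST hits reported in the standard 12-column tabular format (`-outfmt 6`), in which the e-value is the eleventh column. The function name `filter_hits` and the example rows are hypothetical; they are not output from this study.

```python
# Illustrative sketch only: keep BLAST tabular hits with e-value < 1e-5.
# The example rows below are invented for demonstration.
def filter_hits(lines, max_evalue=1e-5):
    """Return (query, subject, evalue) tuples for hits below the threshold.

    Expects BLAST tabular output (-outfmt 6): tab-separated columns with
    the e-value in column 11 (index 10).
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        evalue = float(fields[10])
        if evalue < max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits


example_rows = [
    "query1\tsubjectA\t62.5\t120\t45\t0\t1\t120\t5\t124\t1e-20\t95.0",
    "query1\tsubjectB\t35.0\t80\t52\t1\t3\t82\t10\t89\t2e-05\t40.1",
    "query2\tsubjectC\t30.1\t60\t42\t2\t1\t60\t7\t66\t0.5\t25.3",
]
for query, subject, evalue in filter_hits(example_rows):
    print(query, subject, evalue)
```

The comparison is strict, matching the "e < 10<sup>−5</sup>" wording; a hit with an e-value of exactly 10<sup>−5</sup> would be excluded.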

# Venom Potency Analysis

## Insect Bioassays

To assess the potency of pholcid venom, we quantified the dose at which 50% of crickets (Acheta domesticus) injected were paralyzed after 60 min (PD50). We diluted venom in physiologically buffered lepidopteran saline solution (5 mM KH2PO4/100 mM KCl/4 mM NaCl/15 mM MgCl2/2 mM CaCl2, pH 6.5) and established a dosage range distributed around the PD50 by performing preliminary injections. Experimental doses were 0.25, 0.50, 1.0, 1.5, and 2.0 µg of crude venom protein, with two replicate assays each on 15 crickets per dose, delivered by injection into the dorsal mesothorax using a PB-600 dispenser with a gastight syringe (Hamilton Co.). With each replicate, we injected an additional 15 control crickets with 0.4 µl of saline, a volume equal to the largest dose volume. Paralysis was scored every 10 min for 1 h based on the ability of a cricket to right itself. PD50 values were generated for both absolute amounts of venom injected (µg) and for weight standardized dose (µg/g cricket) using the EPA Probit Analysis Program for calculation of LC/EC values (version 1.5 SAS).
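To make the PD50 calculation concrete, the sketch below estimates a PD50 from dose-response counts by regressing probit-transformed response proportions on log10(dose). This is a simplified, unweighted least-squares approximation and is not the maximum-likelihood procedure of the EPA Probit Analysis Program used in this study. The doses match the experimental design above, but the paralysis counts are hypothetical values invented for illustration, and the function name `estimate_pd50` is an assumption.

```python
# Illustrative sketch only: unweighted probit regression on log10(dose).
# This approximates, but does not reproduce, the EPA Probit Analysis Program.
import math
from statistics import NormalDist


def estimate_pd50(doses, n_tested, n_paralyzed):
    """Approximate the dose paralyzing 50% of animals."""
    std_normal = NormalDist()
    xs, ys = [], []
    for dose, n, k in zip(doses, n_tested, n_paralyzed):
        # Clamp proportions away from 0 and 1, where the probit is undefined.
        p = min(max(k / n, 0.5 / n), 1 - 0.5 / n)
        xs.append(math.log10(dose))
        ys.append(std_normal.inv_cdf(p))  # probit transform
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A probit of 0 corresponds to a 50% response.
    return 10 ** (-intercept / slope)


doses_ug = [0.25, 0.50, 1.0, 1.5, 2.0]   # doses from the experimental design
n_tested = [30, 30, 30, 30, 30]          # two replicates of 15 crickets
n_paralyzed = [3, 9, 17, 23, 27]         # HYPOTHETICAL counts, not study data
pd50 = estimate_pd50(doses_ug, n_tested, n_paralyzed)
print(f"Estimated PD50: {pd50:.2f} ug")
```

A weight-standardized estimate (µg/g cricket) would follow the same steps after dividing each dose by cricket mass.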

# RESULTS AND DISCUSSION

We detail below the chemical composition of P. mexicanus venoms with evidence that they include an expanded presence of neprilysin and astacin-like metalloproteases, and a diverse set of peptide toxins. The venom causes rapid, flaccid paralysis in crickets with potency that is within the norm of those estimated for other Synspermiata spiders. A collection of observations of bites on humans and subsequent effects indicates that P. mexicanus can bite through human skin; however, the effect is minimal.

# Venom Composition

A total of 2,021 tryptic peptides were detected via Orbitrap MS of our crude venom sample. These match 116 unique polypeptide sequences from our combined databases with >95% identity (**Figure 1A**). Of these, 33% have close homology with proteins that are likely involved in basic cellular function (**Table S1**); however, these are represented by only 20% of the unique peptide spectra (**Figure 1B**). Within the remaining 78 polypeptides, the majority are homologs of known venom proteins (39) or venom peptides (21), and the remaining 18 are homologs of proteins of unknown function (**Table S1**; **Figure 1B**).

## Toxic (and Potentially Toxic) Venom Proteins

Metallopeptidases are the most abundant components we detected in the venom, as quantified either by number of spectra or by number of unique polypeptides. These are primarily neprilysins (M13 peptidases) (17 polypeptides, 53% of the unique spectra) and astacin metalloproteases (M12 peptidases) (13 polypeptides, 18% of unique spectra) (**Figure 1B**). Other peptidases were present in lower abundance (1-2 polypeptides each), including neuroendocrine convertase (S8 peptidase) and chymotrypsin-like (S1 peptidase) homologs. Beyond peptidases, we also detected in low abundance (1-3 polypeptides each) chitinase, two cysteine-rich secretory proteins (CRiSP), and phospholipase A2.

### **Neprilysins (M13 peptidases)**

The apparent dominance of neprilysins in P. mexicanus venom is unprecedented in spider venom proteomes, and likely indicates an important functional role. Neprilysins are thermolysin-like zinc metalloendopeptidases (Turner et al., 2001). They are typically membrane-bound proteins that can turn off peptide signaling events at the cell surface, and metabolize regulatory peptides (Turner et al., 2001). As venom components, they are predicted to degrade extracellular matrices around synapses, and possibly facilitate toxin access to targets (Undheim et al., 2013). Of the 17 unique sequences we detected in the venom proteome, only one was full length (MN004884), and it was matched uniquely by 110 spectra. This transcript has a signal peptide and codes for a mature protein of 697 residues with an estimated molecular weight of 79.7 kDa (**Table S1**). MN004884 aligns well with homologs from a broad set of taxonomic sources (**Figure 2**) and retains the conserved zinc binding motif HExxH...ExxA/GD (Turner et al., 2001) present in Gluzincins (Sterchi et al., 2008). The other sequence fragments range in length from 90–504 amino acids, and each was detected by 4–24 unique spectra (**Table S1**).

Within spiders, neprilysins have been found in venoms of the barychelid mygalomorph Trittame loki (Undheim et al., 2013), although in lower abundance than we see in P. mexicanus. We have detected single proteins in each of the Loxosceles arizonica and L. reclusa venom proteomes, but in no other Synspermiata species. Interestingly, even with updated annotations (unpublished) based on our expanded NGS database, we have not detected neprilysin homologs in venoms of P. tristis (Zobel-Thropp et al., 2014a), the closest relative of P. mexicanus with characterized venom.

Outside of spiders, homologs of neprilysins have been detected in snake venoms, including those of the timber rattlesnake Crotalus horridus (annotated as SVMPs; Rokyta et al., 2013, 2015) and the cobra Naja kaouthia (Tan et al., 2017); however, they do not dominate any of these other venoms. Phylogenetic analysis including all venom-expressed spider neprilysins, representative arthropod homologs detected in NCBI, the Crotalus venom toxin, and a human homolog indicates that the diverse set of homologs detected in P. mexicanus venom is monophyletic and sister to neprilysins in Loxosceles venoms (**Figure 2A**). The T. loki sequences are highly divergent from the Synspermiata clade. The latter shares a closer relationship with diverse chelicerate proteins detected in genomes than it does with the venom-expressed T. loki sequences, making convergent recruitment into venoms from a diverse neprilysin lineage a reasonable explanation of this pattern. The detection of single proteins in Loxosceles suggests that neprilysins may be phylogenetically widespread in venoms at low levels, with expansion of the diversity and perhaps functional role within a lineage including P. mexicanus.

### **Astacin metalloproteinases (M12 peptidases)**

Homologs of astacin metalloproteinases (M12 peptidases) are the second most abundant venom proteins detected in P. mexicanus proteomes. Among the 13 distinct polypeptides, seven are full length and all have signal peptides (**Table S1**). The diversity of homologs includes seven polypeptide subgroups that share <95% aa identity and separate into four distinct clades (**Figure 3**). Similar to neprilysins, M12 peptidases represent a gene family that is widespread in spider venoms, and homologs have been well described as contributors to venom function in Loxosceles (annotated as Loxosceles astacin-like proteases – LALPs; da Silveira et al., 2007; Trevisan-Silva et al., 2013), with up to three distinct proteins detected in proteomes of South American species (reviewed in Chaves-Moreira et al., 2017). The diversity of astacin-like metalloproteinases in the P. mexicanus venom proteome is comparable to that detected in P. tristis (Zobel-Thropp et al., 2014a). There are scattered M12 peptidase homologs in other proteomes of Synspermiata (**Figure 3**); however, only P. tristis and P. mexicanus have more than three distinct homologs. Phylogenetic analyses including all venom proteome-confirmed M12 peptidase homologs across Synspermiata resolve a monophyletic P. mexicanus group that is derived from a paraphyletic set of P. tristis proteins (**Figure 3**). This pattern is consistent with an evolutionary expansion of the role of astacins in venom before the common ancestor of these two species, with further diversification in the pholcid lineage after the divergence from the shared ancestor with Plectreurys.

All of the sequences we detect have the conserved catalytic domain HEXXHXXGXXHEXXRXDR and an MXY region that is involved with a sequence turn and Zn-dependent activity (da Silveira et al., 2007). LALPs have proteolytic effects on gelatin, fibronectin, fibrinogen and extracellular matrices, and thus, as venom toxins, they are hypothesized to be spreading factors that act in synergy with other toxins (Trevisan-Silva et al., 2013).

The hydrolytic effects of LALPs are suspected to contribute to hemostatic disturbance during mammalian envenomation, thereby influencing the dermonecrotic or systemic effects of Loxosceles bites (Trevisan-Silva et al., 2013).

Astacin-like metalloproteinases have also been detected as major components in digestive fluids secreted from the mouths of the mygalomorph Acanthoscurria geniculata (Walter et al., 2017) and araneomorphs Stegodyphus mimosarum (Walter et al., 2017), Nephilengys cruentata (Fuzita et al., 2016), and Argiope aurantia (Foradori et al., 2006), indicating a phylogenetically widespread presence in spider digestate. We are confident that the homologs we detect in venom proteomes represent presence in venom rather than contamination by regurgitate, because our venom collection techniques carefully exclude digestate. Expanded phylogenetic analyses including a broad taxonomic set of proteins obtained in oral digestate and venom resolve a monophyletic clade of venom-expressed homologs in P. mexicanus that is sister to all but one divergent venom-expressed homolog in P. tristis (data not shown), reflecting potential diversification in this gene family in the context of venom expression.

FIGURE 4 | Peptides identified in the venom proteome of *Physocyclus mexicanus* summarized by homologous groups (families). The pie chart on the left indicates the proportions of non-redundant polypeptide sequences detected in the proteome for each group, and represents the sequence diversity within and among the peptide families. The pie chart on the right indicates the proportion of peptide families as measured by abundance of unique tryptic peptides in the proteome. The table details the number of non-redundant sequences (#nr seqs), the number of unique tryptic peptides, the number of cysteines in the mature peptide, whether or not the peptides are predicted to have an ICK fold, and the estimated molecular weight of the mature peptide.

### **Other venom proteins**

The remaining two peptidases in P. mexicanus venom are serine proteases and homologs of known venom toxins. Chymotrypsin-like homologs (S1 peptidases) were present with low diversity and abundance. The two homologs are fragments that each hit proteins with CUB domains. The other is a homolog of neuroendocrine convertase 1 (S8 peptidase), a furin-like protease; homologs of this protease cleave N- and C-termini in processing mature latrotoxins (Ushkaryov et al., 2004) and may be common in venoms across multiple lineages.

We detect three other non-peptidase toxic proteins that are homologs of well-established and widespread venom toxins. These include two CRiSP homologs, one of which is full length (425 aa, 48.8 kDa); three chitinase homologs, one of which is full length and corresponds to a 493 aa mature protein (est 54.7 kDa); and a single fragment of a bee venom-like phospholipase A2 (PLA2) with weak support (only 4 tryptic peptides). Of these, the chymotrypsins, chitinase, and PLA2 have not been detected in P. tristis, perhaps because of their low abundance, or because they have a dynamic presence in venom extracts. However, the presence of CRiSP homologs adds to evidence that these are predictable, widespread components of spider venoms (Fry et al., 2009; Undheim et al., 2013).

## Proteins of Unknown Function

Beyond proteins with homology to established venom components, there are proteins in the proteome, some well-represented with respect to peptide spectra, that are homologs of uncharacterized and unnamed genes. Given high rates of discovery of new venom components, we carefully assessed these as potential candidates for venom activity. An intriguing protein is abundant (combined 239 peptide spectra, 62 of which are unique peptides) but not diverse; it is represented by two identical polypeptides, one full length (599 aa), with an estimated size of 67.7 kDa. This protein has clear homology to a domain of unknown function (DUF885); however, Fold Function and Alignment Searches (FFAS, http://ffas.godziklab.org/ffas-cgi/cgi/ffas.pl) detect sequence elements of lipoproteins, and folds similar to M32 carboxypeptidases, which are metalloproteases. High sequence similarity across arachnid genomes suggests that these proteins perform a conserved but undescribed function, which could be involved in venom polypeptide processing.

Of the remaining polypeptides with limited homology to known proteins, four are full length. One is a homolog of a protein we have previously detected in Scytodes venom (RP23 like) that is annotated as "mite allergen." This basic protein (est pI 9.97) has a signal peptide, is estimated 20.7 kDa (188 aa) and has homologs across arachnid genomes. Curiously, proteins referred to as allergens and "mite allergens" in Loxosceles venoms are not homologs of this toxin (MN004921). The two remaining proteins are both secreted and do not have any cysteines. The first (MN004924) is acidic (pI 4.14) estimated to be 22.5 kDa, and is leucine and glutamic acid rich. The other two (MN004934 and MN004937) are from a group of five listed to have leucine-rich repeat domains, based on results from the CDD and InterProScan searches. Of the five, three are homologs (MN004924 is estimated to be 34.9 kDa with a pI of 4.95), and the other two are not homologous. One is estimated to be 34.4 kDa (pI 9.05). Four distinct polypeptides had no hits in any databases (KP138800, KP138796, MN004925, and MN004926); they are distinct from each other, and we have named them "hypothetical proteins." None of these are found in other venom proteomes within Synspermiata, so we conclude they are not abundant or widespread contributors to venoms in this higher lineage.

## Venom Peptide Toxins

While we limited our discussion of venom proteins to those confirmed in the proteome, we discuss the venom peptides in the proteome in the context of homologous sequences in our transcriptomes. We do this because patterns emerge in comparisons between these data sets that potentially reflect sexual dimorphism given that our proteome is male specific and our transcriptome combines male and female gland extracts.

### **Overview**

In our combined transcriptome we detected a total of 103 sequences with homology to venom peptides, or cysteine motifs that are characteristic of venom peptides [91 in our Sanger cDNA library (taxon source confident, **Table S3**) and 12 NGS sequences detected as matches to our proteome]. Among the 103 sequences, 49 are non-redundant and full length. These consist of 17 distinct groups, 13 of which are predicted knottins, and the other four with characteristics of spider venom peptides as described below. The peptides range in cysteine number from 7 to 10 and in predicted size from 3.6 to 11.9 kDa (mature, processed sequences) (**Table S1**). These sequences constitute over one-third (36%) of the venom gland transcriptome (**Figure S1**) and 18% of the proteome that is annotated as proteins with venom function (**Figure 1B**). We assigned the 17 groups of "pholcitoxins" with "unknown" functional notations (U1-U17) and a "PHTX-Pmx" name according to established nomenclatural convention (King et al., 2008). Suffixes of ascending letters distinguish distinct sequences within a group (e.g., 1a, 1b, 1c, etc.) and those with an underscore followed by a number differ only in the signal sequence region (accession numbers KP138863-KP138905 and MN004909-MN004920).

### **Peptides detected in proteome**

Of the 17 peptide groups identified in the transcriptome, 13 (with 36 distinct sequences) were detected in the venom proteome. These vary in abundance (as quantified by numbers of unique tryptic peptides) and diversity (as quantified by the numbers of non-redundant sequences) (**Figure 4**). Over 25% (61/220) of the tryptic peptides that matched peptide toxins matched U17-PHTX-Pmx. The three 7.8 kDa peptides in this group are homologs of U1-HXTX-Iw1, which is abundant in the Australian hexathelid funnel-web spider Illawarra wisharti. While homologs appear to be widespread in spider venoms, the target of U1-HXTX-Iw1 remains unknown despite much experimental effort (reviewed in Wilson, 2015).

The most diverse peptide group in P. mexicanus venom is the U5-PHTX set of 13 non-redundant sequences that correspond to ∼5 kDa peptides and are homologs of ω-PLTX-Pt1a (**Figure 4**; **Table S1**). ω-PLTX-Pt1a blocks presynaptic voltage-gated Ca<sup>2+</sup> channels in invertebrates [(King, 2007), PLTX-II in Branton et al. (1987)], specifically Dmca1A channels (Kuromi et al., 2004), and is the most abundant peptide toxin in venoms of P. tristis (Zobel-Thropp et al., 2014a). While ω-PLTX-Pt1a is the top hit in BLAST searches of U5-PHTX, ω-PLTX-Pt1a is homologous to two other well-characterized toxins from P. tristis, U1-PLTX-Pt1c and δ/ω-PLTX-Pt1a. U1-PLTX-Pt1c (Plt-XI) is a highly potent insecticidal toxin; however, the molecular mechanism of activity remains unknown (Quistad and Skinner, 1994). The activity of δ/ω-PLTX-Pt1a is better understood: it is an excitatory toxin with activity on both Ca<sup>2+</sup> and Na<sup>+</sup> channels (Zhou et al., 2013). All of these toxins share a cysteine pattern of -C6C-CC-C2C1C-C1C-C- and some require O-palmitoylation for activity.

The P. mexicanus venom proteome is well represented by diverse homologs with the same cysteine pattern as ω-PLTX-Pt1a. These include U3-, U4-, U5-, U6-, U8-, U9- and U11-PHTX. While the top BLAST hits for these toxin families varied, they are alignable and share the same motif, and thus represent a superfamily (**Figure 5**). Collectively, the motif spacing is summarized with (1-4)C6C(4-10)CC(1-2)C2C1C(4-9)C1C(3-8)C(0-5). All of the sequences in this superfamily have residues that may be palmitoylated (threonine or serine, in addition to cysteine) near the C-terminus. Together, members of this superfamily make up a large proportion of the venom peptides in P. mexicanus venom (∼45% of tryptic peptides, ∼70% of the unique polypeptides). Moreover, given the similarities and homology with toxins of known activity, members of this superfamily in P. mexicanus are strong candidates for contributing to toxicity of these venoms and may include novel target specificities.
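The superfamily motif summarized above can be expressed as a regular expression, which may help when screening candidate mature peptides. The sketch below assumes that each number or range in the motif gives the count of residues between cysteines and that those spacer residues are never cysteine; both are interpretive assumptions. The function names and the test peptide are hypothetical, and the peptide is a synthetic placeholder rather than a real toxin sequence.

```python
# Illustrative sketch only: the motif
# (1-4)C6C(4-10)CC(1-2)C2C1C(4-9)C1C(3-8)C(0-5)
# written as a regular expression. Assumes spacer residues are non-cysteine.
import re

SUPERFAMILY_MOTIF = re.compile(
    r"[^C]{1,4}C[^C]{6}C[^C]{4,10}CC[^C]{1,2}C[^C]{2}C[^C]C"
    r"[^C]{4,9}C[^C]C[^C]{3,8}C[^C]{0,5}"
)


def matches_superfamily_motif(mature_peptide: str) -> bool:
    """Return True if the whole mature peptide fits the cysteine motif."""
    return SUPERFAMILY_MOTIF.fullmatch(mature_peptide.upper()) is not None


def cysteine_spacing(mature_peptide: str) -> list[int]:
    """Lengths of the non-cysteine stretches before, between, and after
    the cysteines, in order from N- to C-terminus."""
    return [len(part) for part in mature_peptide.upper().split("C")]


# SYNTHETIC placeholder peptide built to satisfy the motif (not a real toxin)
synthetic_peptide = (
    "GS" + "C" + "A" * 6 + "C" + "A" * 5 + "CC" + "A" + "C" + "AA" + "C"
    + "A" + "C" + "A" * 5 + "C" + "A" + "C" + "A" * 4 + "C" + "GS"
)
print(matches_superfamily_motif(synthetic_peptide))
print(cysteine_spacing(synthetic_peptide))
```

Using `fullmatch` requires the entire mature peptide to fit the motif, so the check should be applied after signal peptide and propeptide regions have been removed.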

Of the remaining four peptide families in the proteome, U14-PHTX is the most abundant and diverse with 28 tryptic peptides and four unique polypeptide sequences. These peptides have 8 cysteines with the pattern (2-4)C(6-7)C(5-7)CC(7-8)CRC(8-9)CHC(8-14). While they have no matches in databases, the cysteine pattern is similar to, but larger than, µ-agatoxins and they may share a similar fold with cysteine bonding of I-IV, II-V, III-VIII, VI-VII. The predicted molecular weight is ∼6.65 kDa. U10, U15, and U16 are each represented by single polypeptides and U10 is well represented in the proteome with 17 tryptic peptide hits. Like U14-PHTX, U10 is unique to pholcids with no hits in databases; however, it has seven cysteines and codes for a peptide with estimated molecular mass 5.7 kDa. U15-PHTX is a clear homolog of U3-PLTX-Pt1a, originally described as Plt-X (Quistad and Skinner, 1994) in P. tristis venom, which is insecticidal; however, it is not instantly paralytic but leads to slow lethality in some lepidopteran larvae. U16-PHTX has significant homology with U1-PLTX; however, it has an extended N terminus that is lacking a cysteine that is present in the rest of the superfamily.

correlates with posterior probability support for that branch. All polypeptides included in this phylogeny have been confirmed to be present in the venom proteomes. Branches colored red are all from *P. tristis,* and all other colored branches represent groups of peptides identified in this work from *P. mexicanus.* Annotations of activity reflect the discussion in the text. The alignment includes a subset of the sequences included in this phylogeny, reduced to fit by eliminating highly similar sequences. Predicted disulfide bonding pattern is based on Quistad and Skinner (1994).

FIGURE 6 | *Physocyclus mexicanus* bioassays on crickets (*Acheta domesticus*) result in rapid, flaccid paralysis along with ventral darkened areas. (A) Table presenting the estimated dose at which 50% of crickets are paralyzed at 1 h post injection (PD50) for *P. mexicanus* and other Synspermiata spiders. (B) A ventral view of representative crickets at 1 h post injection by venom from *P. mexicanus* and *P. tristis* for comparison. Notable differences are the relaxed legs and darkened regions on the cricket injected by *Physocyclus* in contrast to the folded legs and lack of darkened regions on *Plectreurys*.

### **Not detected in proteome**

Four peptide families in the transcriptome with characteristics of venom toxins were not detected in the proteome. One group, U2-PHTX, is among the most abundant and diverse peptides in the transcriptome (second only to U5-PHTX) with 16 total sequences, 7 non-redundant. These are homologs of a component in Lycosa singoriensis with unknown molecular target and function (U18-LCTX-Ls1a, Zhang et al., 2010). Nine transcripts (three non-redundant) in the U7-PHTX family are members of the superfamily described above, which is diverse and abundant in the male proteome (**Figure 5**). U12- and U13-PHTX are related to one another, are cysteine rich, and are homologous to peptides isolated from a range of spider venom transcriptomes. They are distinctively large peptides (11.5–12 kDa), are not predicted knottins, and are remote homologs of Dickkopf-related proteins that are involved in development in a wide range of non-arthropod metazoans. Beyond standard analytical reasons why a protein may not be detected in a proteome (see Methods), lack of detection of these peptide families in the proteome could reflect a non-venom function in the cell, or potentially expression in female but not male P. mexicanus venoms.

# Insecticidal Potency

The potency of male P. mexicanus venom, measured by PD50 on crickets, is comparable to that of venom from female P. tristis and other Synspermiata taxa (**Figure 6**, Zobel-Thropp et al., 2014a). Interestingly, crude venom from all Synspermiata taxa tested to date on Acheta domesticus is more potent than comparable estimates of effective doses on multiple taxa using a different cricket model, Gryllus assimilis (**Figure 6**, Manzoli-Palma et al., 2003). However, there are distinct physiological symptoms of envenomation whereby crickets injected with P. mexicanus venom undergo rapid, irreversible flaccid paralysis in which the legs lie loosely extended and do not twitch. In contrast, crickets injected with venom of P. tristis (Zobel-Thropp et al., 2014a) and sicariid venoms (Zobel-Thropp et al., 2010, 2012) undergo rapid and irreversible excitatory paralysis with legs flexed and twitching. Crickets envenomated by P. mexicanus also develop darkened areas around the mouth and sternum, and on the ventral abdomen (**Figure 6**), symptoms we have not observed after injection of other Synspermiata venoms (Zobel-Thropp et al., 2010, 2012, 2014a).

The components responsible for flaccid paralysis and areas of discoloration on crickets following injection with P. mexicanus venom are unknown. Neprilysins are candidates for contributing to these effects given their abundance in the venoms and uniqueness within Synspermiata. While melanism in crickets has been described as a symptom of intoxication by insecticides (Fisher and Brady, 1980), to our knowledge the mechanisms underlying this are unknown. Flaccid paralysis in insects can be induced by some polyamine toxins (Quistad et al., 1991), by Na<sub>V</sub> channel toxins (Bende et al., 2013, 2015) and by a variety of peptide toxins that affect Ca<sup>2+</sup> channels (e.g., Knaus et al., 1987; Troncone et al., 1995; Lipkin et al., 2002; King and Hardy, 2013). Characterization of the nonpolypeptide and peptide components in P. mexicanus may identify homologous or convergent factors that are causing these symptoms.

# Notes on *P. mexicanus* Behavior and Effects of Bites on Humans

While we have not performed controlled behavioral experiments, a dense, natural, local colony of P. mexicanus at Spider Pharm in Yarnell, Arizona, has allowed opportunity for anecdotal observations of their behaviors, including their propensity to bite humans. We include an overview of observations, emphasizing those that inform the role of venom and may inspire future experimental work.

## Propensity to Bite and Effects on Humans

Contrary to popular opinion about pholcid spiders, P. mexicanus readily bite humans when they are disturbed by moving or cleaning activities, but bites are rarely confirmed by direct observation (Kristensen, pers obs.). While working in the lab with an open colony of these animals, members of Spider Pharm staff are frequently bitten when their habitats are disturbed. Most bites are brief nips as the spiders are moving rapidly over exposed skin, possibly with a slight pinch or tug resulting in a mild and brief sting. All confirmed bites by this species have been by adult or subadult males.

In one case, an adult male P. mexicanus bite on the back of the neck produced a mild sting that became a small welt which blackened and eventually opened into a 1 mm diameter wound. Images of the wound were diagnosed by a dermatologist with expertise in spider envenomation (Van Stoeker, pers comm) as a typical reaction to foreign protein, resembling a triggered immune response, rather than necrosis comparable to loxoscelism. In a second case, three subadult males that were disturbed by moving trays of larvae walked down the fingers of the left hand and simultaneously bit the knuckle at the base of the little finger. This resulted in an immediate mild, temporary sting. The knuckle was sore and slightly inflamed for several days. Interestingly, the case of human bites recorded on Mythbusters (Episode 13, premiered 15 Feb 2004) involved adult males that were aggregated and engaged antagonistically with one another. In that case, when human arms were extended deeply into the enclosure, spiders did not initially bite, though several spiders pressed their chelicerae against the skin without biting. A bite occurred on the back of the hand after the hand was pulled back closer to males fighting near the entrance (Kristensen pers obs).

## Potential Venom Dimorphism, Male Aggression, and Feeding Biology

The difference between sexes in our ability to obtain venom by electrostimulation [repeated independently in the Binford Lab and the Kristensen Lab (Spider Pharm)] inspires consideration of potential sexual dimorphism in venom composition and/or functional role. The higher, more easily obtained yields from males could reflect larger volumes of venom in males, or different morphological/physiological circumstances of extrusion. There could also be seasonal differences in venom volume. Adult males and females are similar in size, but males have more ornate and armored or reinforced chelicerae. Cheliceral armor is a known dimorphic trait in pholcids that is a proposed adaptation for mating (Huber, 1999) but may also be an adaptation for fighting and defense. Males are aggressive to one another when at high densities. For example, after placement in enclosures for the Mythbusters staged encounters, males were concentrated on the cloth and interacted aggressively, including biting that resulted in fatalities (Kristensen pers obs). There appears to be less aggression in natural stable colonies, and interactions are typically settled quickly and non-lethally. While most of the aggressive interactions observed were between males, females also exhibit aggression, especially when guarding clusters of hatchlings.

P. mexicanus readily capture and eat a diversity of arthropod prey, including spiders. The role of venom is not obvious, and prey that are apparently envenomated often appear to be little affected. As with other pholcids, most prey are slowed or restrained by silk first and bitten several times during wrapping, frequently with no obvious immediate effect. In other instances, there is rapid paralysis or death after single bites. More rarely, spiders directly attack small, weak prey with an initial bite (frontal attack) without the use of silk. Single, short bites made during frontal attacks may result in paralysis or death. Given evidence that venom has a potent effect on some prey, the spiders may meter the amounts of venom injected. The ability to regulate the quantity of venom injected into prey as a function of size, activity levels, and differences in sensitivity to toxins has been experimentally determined in Cupiennius salei (Wigger et al., 2002; Wullschleger and Nentwig, 2002). The observed differential effects of pholcid bites are consistent with a similar ability to regulate venom injection, and may indicate it is a general attribute of spider envenomation.

In many spider species males reduce or stop feeding when they reach adulthood. This is not the case for P. mexicanus (pers obs; Wilson et al., 2016). Not only do males hunt and capture prey but we have observed them presenting captured prey to females, with and without attempts to mate. We have also observed males with egg sacs in their chelicerae, which could reflect adoption of the egg sacs or infanticide by males. Together, these observations suggest there are interesting sexual dynamics in this species that could influence some degree of sexual dimorphism in venoms.

# CONCLUSIONS

We conclude from the data and observations presented in this work that venoms of male P. mexicanus consist of a rich set of components, predominantly metalloproteinases and peptides. While the activity of none of these components has been directly characterized, they include homologs of proteins and peptides that are demonstrated contributors to toxicity. Our evidence of potent toxicity in arthropods, in combination with homology to known toxins, suggests that detailed activity characterizations have strong potential for discovering new or refined activities. The unprecedented extensive presence of neprilysins is intriguing with respect to contributions to immobilization or consumption of prey. While the information available on the effect of P. mexicanus venom on mammals is limited by lack of careful experimentation, we consider the documented accounts evidence that mouthparts of P. mexicanus are capable of piercing human skin; however, envenomation results in a mild sting, and venom from a single P. mexicanus is not lethal to humans or other mammals. Given the tremendous diversity of pholcid spiders, we do not assume that venom composition in males of this species is representative of the entire family, and broader sampling within this group may uncover exciting differences that correlate with known ecological diversification within the group. Moreover, proteomic characterization of female venoms of P. mexicanus is needed to infer the full set of proteins identified in venom gland cDNA libraries that are present in adult members of this species, given the likelihood of at least some degree of sexual dimorphism. With expansive diversity to explore in pholcid venoms, we present these results as reason to be excited about potential discovery of new chemical activities and, by default, to change the public narrative and assume pholcid spiders are not harmful to humans.

# DATA AVAILABILITY

The datasets generated for this study can be found in GenBank, UniProt Knowledgebase, KP138796-KP138905, and MN004877-MN004966.

# AUTHOR CONTRIBUTIONS

GB and PZ-T conceived of the project, wrote the manuscript, and oversaw all aspects of collecting and analyzing transcriptomic, proteomic, and insect bioassay data. JM generated and analyzed Sanger transcriptomes and conducted insect bioassays. BK generated the Illumina 3000 transcriptomes. CD and LB conducted proteomics analyses. CK provided spiders, collected venom, and documented behavior of the animals including human envenomation.

# ACKNOWLEDGMENTS

This work was supported by funding from National Institutes of Health R15-GM-097696-01 to GB, and summer research support from Lewis & Clark College. Lewis & Clark students Sophia Horigan, Sasha Bishop, Katherine Delgado, Demi Glidden, and Kendra Autumn contributed to considerations of annotating the proteome. Matt Briggs and Owen Hart helped with the PD<sub>50</sub> assays. We thank Matthew Cordes for help thinking about function of proteins with poor homology, Leslie Boyer for discussions about the manuscript, and Anita Kristensen for help with venom collection and observations of the natural colony of P. mexicanus at Spider Pharm. Careful and thoughtful comments from reviewers also improved the manuscript.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2019.00256/full#supplementary-material

Figure S1 | Overview of cDNA library Sanger sequences from a combination of male and female *P. mexicanus* venom glands. (A) Proportionate representation of overall transcript abundance into five general categories based on database hits. (B) Table: the 99 and 85% identity threshold assemblies respectively identify redundant and homologous sequences in the library. Flowchart: clusters and singleton sequences from the 85% threshold analysis were initially divided based on whether or not there were significant tBLASTx hits (e ≤ 10<sup>−5</sup>) in ArachnoServer or NCBI nt/nr databases. General function prediction and analyses of sequences that hit nothing in databases were done as described in the text.

Table S1 | Details of each of the unique polypeptides detected in the proteome of *Physocyclus mexicanus.* Sequences are categorized and color-coded as proteins with venom function (prey capture/feeding/defense) (blue), proteins of unknown function (orange), and proteins with housekeeping/metabolism function (gray). Polypeptide sequences detected in the proteome are in the far right column.

Table S2 | Details of the proteomics analyses conducted at the University of Arizona Proteomics Consortium.

Table S3 | Details of cDNAs identified in the cDNA library transcriptome of *Physocyclus mexicanus*. Sequences are categorized based on predicted proteins as in Table S1.

# REFERENCES

in spider digestive fluid. Comp. Biochem. Physiol. B Biochem. Mol. Biol. 143, 257–268. doi: 10.1016/j.cbpb.2005.08.012

venom collected by electrical stimulation. J. Physiol. Biochem. 63:221–230. doi: 10.1007/BF03165785


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zobel-Thropp, Mullins, Kristensen, Kronmiller, David, Breci and Binford. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Venoms of Rear-Fanged Snakes: New Proteins and Novel Activities

Cassandra M. Modahl <sup>1</sup> and Stephen P. Mackessy <sup>2</sup> \*

*<sup>1</sup> Department of Biological Sciences, National University of Singapore, Singapore, Singapore, <sup>2</sup> School of Biological Sciences, University of Northern Colorado, Greeley, CO, United States*

Snake venom research has focused on front-fanged venomous snakes because of the high incidence of human morbidity and mortality from envenomations and larger venom yields of these species, while venoms from rear-fanged snakes have been largely neglected. Rear-fanged snakes (RFS) are a phylogenetically diverse collection of species that feed on a variety of prey and show varying prey capture strategies, from constriction to envenomation. In general, RFS venoms share many toxin families with front-fanged snakes, and venoms generally are either a neurotoxic three-finger toxin (3FTx)-dominated venom or an enzymatic metalloproteinase-dominated venom. These venoms have also been discovered to contain several unique venom protein families. New venom protein superfamilies in RFS venoms include matrix metalloproteinases, distinct from but closely related to snake venom metalloproteinases, veficolins, and acid lipases. Specialized three-finger toxins that target select prey taxa have evolved in some RFS venoms, and this prey capture strategy has appeared in multiple RFS species, from Old World *Boiga* to New World *Spilotes* and *Oxybelis.* Though this same protein superfamily is commonly found in the venoms of elapid (front-fanged) snakes, no elapid 3FTxs appear to show prey-specific toxicity (with the exception of perhaps *Micrurus*). Neofunctionalization of *Spilotes sulphureus* 3FTx genes has even resulted in the evolution within a single venom of 3FTxs selectively neurotoxic to different prey taxa (mammals or lizards), allowing this non-constricting RFS to take larger mammalian prey. The large number of 3FTx protein sequences available, together with a growing database of RFS venom 3FTxs, make possible predictions concerning structure-function relationships among these toxins and the basis of selective toxicity of specific RFS venom 3FTxs. 
Rear-fanged snake venoms are therefore of considerable research interest due to the evolutionary novelties they contain, providing insights into the evolution of snake venom proteins and potential predator-prey coevolution in a broader phylogenetic context. Because of the limited complexity of these venoms, they represent a more tractable source to inform about the biological roles of specific venom proteins that are found in the venoms of this rich diversity of snakes.

Keywords: evolution, metalloproteinase, neofunctionalization, proteomics, three-finger toxin, toxin, transcriptomics

### Edited by:

*Kartik Sunagar, Indian Institute of Science (IISc), India*

### Reviewed by:

*Wolfgang Wuster, Bangor University, United Kingdom Bryan Fry, University of Queensland, Australia*

> \*Correspondence: *Stephen P. Mackessy stephen.mackessy@unco.edu*

### Specialty section:

*This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution*

> Received: *29 March 2019* Accepted: *09 July 2019* Published: *23 July 2019*

### Citation:

*Modahl CM and Mackessy SP (2019) Venoms of Rear-Fanged Snakes: New Proteins and Novel Activities. Front. Ecol. Evol. 7:279. doi: 10.3389/fevo.2019.00279*

# INTRODUCTION

Venomous snakes and their venoms have instilled both fear and fascination in humans, and they have especially inspired the interest of scientists over the years as unparalleled examples of trophic adaptation. The evolution of venom, venom glands, and specialized maxillary teeth greatly contributed to the radiation of colubroid snakes during the Cenozoic era, generating the diversity of snakes present today (Kardong, 1979; Savitsky, 1980; Vidal, 2002; Jackson, 2003; Vonk et al., 2008; Fry et al., 2012a). There are over 3,700 recognized extant snake species (Uetz et al., 2018); however, only a minority of these, <20%, are known to have venoms that result in medically significant bites to humans (Uetz et al., 2018; www.toxinology.com<sup>1</sup>). Snakebites of medical concern are predominantly from front-fanged snakes of the families Viperidae and Elapidae, but a large number of snakes from various other families, previously classified as the single family Colubridae, have been discovered to be rear-fanged and also venom-producing (reviewed in Mackessy, 2002; Saviola et al., 2014; Junqueira-de-Azevedo et al., 2016).

Front-fanged snakes (FFS) occur in three families, but rear-fanged snake (RFS) species are phylogenetically diverse, presently assigned to at least three (and sometimes five) distinct families: Colubridae, Homalopsidae, and Lamprophiidae (Vonk et al., 2008; Pyron et al., 2013). Front-fanged snakes have tubular fangs positioned anteriorly in the upper jaw and a venom apparatus that includes an encapsulated reservoir with compressor glandulae (Viperidae) or adductor externus superficialis (Elapidae) muscles inserted directly onto the venom gland capsule (Kochva, 1962). Rear-fanged snakes exhibit a venom apparatus that is morphologically variable (Taub, 1967), but typically it is without a storage reservoir or attaching muscles (**Figure 1A**; see also Kardong, 2002). As we have noted previously (Saviola et al., 2014), the venom-producing gland of RFS is referred to as the Duvernoy's venom gland, a name that recognizes the distinctive nature of this venom gland (secretory epithelial structure, mode of secretion/storage, mechanics of venom delivery, etc.) while acknowledging the clear embryonic, evolutionary and biochemical homologies with FFS venom glands. Snakes generally have numerous oral glands (Kardong, 2002), but the Duvernoy's venom gland, which is a serous secretory gland, is histologically distinct from the closely adjacent supralabial gland, which is mucosecretory (**Figure 1B**). A medial venom gland duct conducts secreted venom to one to several, often enlarged, posterior maxillary teeth, and these may be bladed, grooved and/or ungrooved (**Figures 1C–G**; see also Young and Kardong, 1996; Jackson, 2003). Because of the lack of storage reservoir and musculature, venom yields from RFS are considerably lower than those from FFS, and it is difficult to obtain large quantities of venom for protein characterization.
However, venom yields have been improved with the use of ketamine and pilocarpine to sedate snakes and induce gland secretions (Rosenberg, 1992; Hill and Mackessy, 1997).

For some RFS, what appears superficially to be a less well-developed venom system is still capable of producing lethal envenomations in humans. The previously underestimated venom of the RFS Dispholidus typus resulted in the death of eminent herpetologist Karl P. Schmidt in the late 1950s (Pope, 1958; Pla et al., 2017b). Although the large majority of RFS are unable to deliver lethal toxin quantities or even venom yields great enough to result in systemic envenomations in humans (Weinstein et al., 2013), at least three species (D. typus, Thelotornis capensis, and Rhabdophis tigrinus) have caused human fatalities, and bites by two additional species (Philodryas olfersii and Tachymenis peruviana) have resulted in serious human envenomations (Kuch and Mebs, 2002; Mackessy, 2002; Prado-Franceschi and Hyslop, 2002; Weinstein et al., 2011). For two of these genera, Dispholidus and Rhabdophis, antivenom is currently manufactured and is the recommended management of snakebites from these species (Weinstein et al., 2013). Increasing awareness of severe, at times fatal, envenomations from RFS has led to a slowly growing interest in their venoms. Additionally, advances in the sensitivity of research technologies have resulted in the ability to profile venoms with relatively little starting material.

Current –omic technologies have made venom characterization attainable for organisms with small amounts of venom and venom gland tissue. As a result of these technologies, databases are rapidly growing from transcriptomic and proteomic studies. Genomic and transcriptomic fields have become more inclusive with the affordability of next-generation sequencing (NGS), making these technologies available for non-model organisms. There are now a number of snake genome references (Castoe et al., 2013; Vonk et al., 2013; Yin et al., 2016; Perry et al., 2018), and the number of NGS-generated venom gland transcriptomes from snakes is also on the rise (Durban et al., 2011; Rokyta et al., 2011, 2015; Aird et al., 2013; Margres et al., 2013), including for RFS species (McGivern et al., 2014; Zhang et al., 2015; Campos et al., 2016; Pla et al., 2017a,b; Modahl et al., 2018a,b). The sequencing depth of NGS now allows observation of lowly expressed transcripts in snake venom glands that were previously difficult to obtain with expressed sequence tags (ESTs) (Campos et al., 2016). In addition to NGS advancements, the increasing sensitivity of mass spectrometry instruments has made it possible to characterize venom proteomes with small amounts of venom (∼100–200 ng). The integration of these –omic technologies has led to comprehensive venom profiles of RFS venoms. These profiles are now readily generated, are affordable, and provide more accurate evolutionary overviews of venom compositional diversification.

These more complete venom gland transcriptomes and venom proteomes have revealed common patterns of toxin expression and secretion for RFS (Junqueira-de-Azevedo et al., 2016), as well as identified new venom proteins that had previously not been recognized as venom components in FFS species (OmPraba et al., 2010; Ching et al., 2012; Fry et al., 2012b; Campos et al., 2016). Further, these venoms have been shown to possess toxins with unique activities, such as prey-specific toxicity (Mackessy et al., 2006; Pawlak et al., 2006, 2009; Heyborne and Mackessy, 2013; Modahl et al., 2018b). Venom proteins from RFS show distinct gene organization and evolutionary trajectories (Pawlak and Kini, 2008; Dashevsky et al., 2018), making these generally neglected venoms ideal models to study venom as trophic adaptations. By examining a large and divergent clade of snakes, we can begin to address and answer questions as to how venom proteins acquire their functionalities and what biological roles they provide.

<sup>1</sup>Toxinology Department, Women's and Children's Hospital, Adelaide, Australia.

# REAR-FANGED SNAKE VENOMS: KNOWN PROTEINS

Even as phylogenetically diverse as venomous snakes are, there are some venom proteins commonly observed in all snake venoms. These include three-finger toxins (3FTxs), snake venom metalloproteinases (SVMPs), C-type lectins (CTLs), and cysteine-rich secretory proteins (CRiSPs) (Mackessy, 2010b; Junqueira-de-Azevedo et al., 2016). Phospholipase A<sub>2</sub>s (PLA<sub>2</sub>) are ubiquitous in FFS venoms (Mackessy, 2010b), but have been found to be abundant in only a few RFS species (Hill and Mackessy, 2000; Huang and Mackessy, 2004). Overall venom composition observed for snakes is one of two types: either a venom is dominated by smaller, usually neurotoxic 3FTxs, such as those in elapid (cobras, mambas, kraits, etc.) venoms, or it consists primarily of larger enzymatic proteins, such as SVMPs, as in the case of viper and pit viper venoms (Mackessy, 2010b). Interestingly, RFS venoms can have either an elapid-like neurotoxic or viper-like enzymatic composition (McGivern et al., 2014). A summary of the presence and absence of venom proteins in RFS venoms can be found in a colubrid –omics review (Junqueira-de-Azevedo et al., 2016).

Another commonality that is observed for all venomous snakes is the large number of proteoforms present in each venom. This large number of isoforms is a result of multiple gene duplication events and single nucleotide polymorphisms (SNPs), which through neofunctionalization has generated toxins with different activities (Casewell et al., 2013; Modahl et al., 2018b). This has allowed for RFS, which tend to have less complex venom (Peichoto et al., 2012), to expand toxin functionalities.

# Three-Finger Toxins

One venom protein superfamily that is prevalent in snake venoms, with an incredibly vast range of activities, is the three-finger toxins (3FTxs). Three-finger toxins can be neurotoxic, acting as antagonists of nicotinic acetylcholine receptors (nAChRs) (Chang and Lee, 1963; Nirthanan and Gwee, 2004; Bourne et al., 2005), muscarinic acetylcholine receptors (mAChRs) (Karlsson et al., 2000; Chung et al., 2002), adrenergic receptors (Rajagopalan et al., 2007), GABA receptors (Rosso et al., 2015), or even bind to and alter the activation of ion channels (de Weille et al., 1991; Rivera-Torres et al., 2016; Yang et al., 2016). These toxins can also be anticoagulants, inhibiting platelet aggregation (Kini et al., 1988; McDowell et al., 1992). Regardless of activity, all 3FTxs maintain a conserved structural scaffold of three β-stranded loops, crosslinked by four disulfide bridges. This forms the "three-finger" arrangement, appearing like three fingers of a hand. Three-finger toxins are non-enzymatic, small proteins consisting of 60–85 amino acid residues, and occur either as monomers, which is most common, or as dimers (Kini and Doley, 2010).

Elapid venoms are the most well-known sources of 3FTxs (Fry et al., 2003b), but 3FTxs have been documented in the venoms of many RFS species (Fry et al., 2003a, 2008; Pawlak et al., 2006, 2009; Heyborne and Mackessy, 2013; Junqueira-de-Azevedo et al., 2016) and can make up large portions of these venoms, as much as 84–92% of the total venom composition (Pla et al., 2017a; Modahl et al., 2018b). Rear-fanged snake genera that have abundant proteins in the molecular mass range of 3FTxs include Boiga, Spilotes, Trimorphodon, Oxybelis, Leioheterodon, Psammophis, Rhamphiophis, and Thelotornis (**Figure 2A**). Of these snakes mentioned, 3FTxs have been sequenced and functionally characterized from the venoms of Boiga dendrophila (Lumsden et al., 2005; Pawlak et al., 2006), Boiga irregularis (Pawlak et al., 2009), Oxybelis fulgidus (Heyborne and Mackessy, 2013), and Spilotes sulphureus (Modahl et al., 2018b). Transcripts for 3FTx have been found in the venom gland, or in the venom, of Boiga cynodon, Boiga nigriceps, Trimorphodon biscutatus, Ahaetulla prasina, Leioheterodon madagascariensis, Thrasops jacksonii, and Psammophis mossambicus (Fry et al., 2008; Modahl and Mackessy, 2016; Dashevsky et al., 2018; Modahl et al., 2018a), among others.

Three-finger toxins were one of the first venom proteins characterized pharmacologically in RFS venoms and were reported to exhibit postsynaptic neurotoxicity, similar to elapid α-neurotoxins (Broaders et al., 1999; Fry et al., 2003a; Lumsden et al., 2005). Several 3FTxs from RFS venoms have been the only well-characterized venom proteins with taxon-specific toxicities, showing preferential binding to lizard or bird nAChRs, in direct relation to snake diet (Pawlak et al., 2006, 2009). Several elapid venoms have shown differential toxicity toward bird or rodent neuromuscular preparations (e.g., Hart et al., 2012, 2013), but these studies were on crude venoms and did not include toxicity assays on native prey species. Sequences and structures of 3FTxs isolated from RFS have an elongated N-terminal segment, and tend to be larger in size than observed in FFS venoms (Lumsden et al., 2007; Pawlak et al., 2009; Heyborne and Mackessy, 2013; Modahl et al., 2018b). It is currently unknown how this longer N-terminal region is involved in receptor binding.

Several of these 3FTxs are also N-terminally blocked by a pyroglutamic acid residue (Broaders et al., 1999; Fry et al., 2003a; Pawlak et al., 2006; Heyborne and Mackessy, 2013). This pyroglutamic acid blockage especially makes these proteins hard to sequence by Edman degradation, and obtaining venom protein transcript sequences has been the most successful approach to study 3FTx amino acid residue diversity in RFS (Fry et al., 2008; McGivern et al., 2014; Modahl and Mackessy, 2016; Pla et al., 2017a; Dashevsky et al., 2018; Modahl et al., 2018b). Rear-fanged snake 3FTxs demonstrate low amino acid sequence identity with 3FTxs from elapids (usually <50%), and all RFS 3FTxs are members of the "non-conventional" toxin classification, characterized by an additional fifth disulfide bond in the first loop (Nirthanan et al., 2003). This type of 3FTx also occurs in elapid venoms, where it was first characterized in cobra (Naja) venoms (Carlsson, 1975; Utkin et al., 2001), but they are the only 3FTx type present in RFS venoms. This suggests that the non-conventional cysteine pattern may be the more basal arrangement for 3FTxs, and provides insight into the evolution of 3FTxs (cf. Fry et al., 2003b). The incredible diversity of RFS 3FTxs with low identity to elapid 3FTxs, combined with the binding selectivity exhibited by some of these toxins, provides a database of proteins to explore how venom proteins target prey and adapt toxicity, and will be discussed more in another section below.
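As a rough illustration of the scaffold features described above (60–85 residues, four conserved disulfide bridges, and an additional fifth disulfide in non-conventional toxins), the following Python sketch summarizes a mature toxin sequence by length and cysteine count. The function name, labels, and toy sequence are hypothetical and introduced only for illustration. A cysteine count alone cannot show where a fifth disulfide sits or whether it actually forms, so this is a first-pass screen under those stated assumptions, not a classification method.

```python
from dataclasses import dataclass


@dataclass
class ScaffoldSummary:
    length: int
    cysteines: int
    label: str


def summarize_3ftx_scaffold(seq: str) -> ScaffoldSummary:
    """Summarize a mature (signal peptide removed) toxin sequence.

    Thresholds follow the text: 60-85 residues; 8 cysteines would be
    consistent with four disulfide bridges, 10 with five.
    """
    seq = seq.strip().upper()
    n_cys = seq.count("C")
    if not 60 <= len(seq) <= 85:
        label = "outside the 60-85 residue range"
    elif n_cys == 8:
        label = "consistent with four disulfide bridges"
    elif n_cys == 10:
        label = "consistent with five disulfide bridges"
    else:
        label = "unexpected cysteine count"
    return ScaffoldSummary(len(seq), n_cys, label)


if __name__ == "__main__":
    # Synthetic toy sequence (not a real toxin): 65 residues, 10 cysteines.
    toy = ("GSTKA" + "C") * 10 + "GSTKA"
    print(summarize_3ftx_scaffold(toy))
```

Positional information (for example, an alignment against characterized 3FTxs) would be needed to assign the extra cysteine pair to the first loop.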

# Metalloproteinases

Snake venom metalloproteinases occur primarily in viperid venoms (Mackessy, 2010b), and similar to 3FTxs, are also a large multigene family, exhibiting a diversity of activities. These enzymes have effects including hemorrhage, coagulopathy, fibrinolysis, apoptosis, and the activation of factor X and prothrombin (Takeya et al., 1993). Many SVMPs function by degrading endothelial cell membrane components or target proteins involved in coagulation, such as fibrinogen or platelet receptors (Takeda et al., 2012). As part of the metzincin superfamily of proteinases, they are characterized by the presence of the Zn<sup>2+</sup>-binding motif HEXXHXXGXXH at the catalytic site. Snake venom metalloproteinases are closely related to mammalian ADAM (a disintegrin and metalloproteinase) and ADAMTS (ADAM with thrombospondin type-1 motif), but differ in domain organization (Fox and Serrano, 2005; Takeda et al., 2012). The SVMP P-I class has only a catalytic metalloproteinase domain present, P-IIs contain a metalloproteinase domain followed by a disintegrin domain, and P-IIIs have a metalloproteinase, disintegrin-like, and cysteine-rich domain (Hite et al., 1994; Fox and Serrano, 2005).
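The motif and domain rules in this paragraph can be sketched in a few lines of Python. The example below is illustrative only: the function names, the domain labels used as strings, and the toy sequence are assumptions made for this sketch, and the regular expression simply treats each X in HEXXHXXGXXH as any single residue. In practice, domain annotation would come from a dedicated domain-prediction tool rather than hand-entered lists.

```python
import re

# HEXXHXXGXXH with X = any residue; the lookahead also reports overlapping hits.
ZINC_MOTIF = re.compile(r"(?=(HE[A-Z]{2}H[A-Z]{2}G[A-Z]{2}H))")


def find_zinc_motif(seq: str) -> list[int]:
    """Return 0-based start positions of the Zn2+-binding motif."""
    return [m.start() for m in ZINC_MOTIF.finditer(seq.upper())]


def svmp_class(domains: list[str]) -> str:
    """Map an ordered list of annotated domains to the SVMP class in the text."""
    layout = tuple(domains)
    if layout == ("metalloproteinase",):
        return "P-I"
    if layout == ("metalloproteinase", "disintegrin"):
        return "P-II"
    if layout == ("metalloproteinase", "disintegrin-like", "cysteine-rich"):
        return "P-III"
    return "unclassified"


if __name__ == "__main__":
    toy = "AAAHEAAHAAGAAHAAA"  # synthetic sequence, not a real SVMP
    print(find_zinc_motif(toy))
    print(svmp_class(["metalloproteinase", "disintegrin-like", "cysteine-rich"]))
```

A layout that does not match one of the three described classes is reported as unclassified rather than forced into a class.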

In some RFS venoms, SVMPs are the most abundant toxins, making up 62–70% of the total venom composition (Modahl et al., 2018a). SVMP-dominated venoms are found in RFS species such as A. prasina (Modahl et al., 2018a), B. portoricensis (Weldon and Mackessy, 2012; Modahl et al., 2018a), D. typus (Kamiguti et al., 2000; Pla et al., 2017b), Hydrodynastes gigas (Hill and Mackessy, 2000), Hypsiglena torquata (McGivern et al., 2014), Phalotris mertensi (Campos et al., 2016), Pseudoboa neuwiedii (Torres-Bonilla et al., 2018), Thamnodynastes strigatus (Ching et al., 2012), Thamnophis sirtalis (Perry et al., 2018), and several Philodryas species (Ching et al., 2006; Peichoto et al., 2012; Urra et al., 2015) (**Figure 2B**). Potent SVMPs are observed in the venoms of snakes from the genus Philodryas (Assakura et al., 1994; Rocha et al., 2006; Sánchez et al., 2014), and venoms from these species in particular have been commonly reported to induce hemorrhage, myonecrosis and edema (Peichoto et al., 2005; Nery et al., 2014; Sánchez et al., 2014; Oliveira et al., 2017). The proteolytic activity of Philodryas venoms is inhibited by metal chelators (Assakura et al., 1992; Acosta et al., 2003; Peichoto et al., 2005, 2012; Rocha and Furtado, 2007), suggesting that SVMPs are responsible for these clinical snakebite symptomologies; in some species, serine proteinases may also be involved (Assakura et al., 1994; Peichoto et al., 2005; Ching et al., 2006). In comparison to venom from the pit viper species Bothrops jararaca, proteinase activity was 25 times greater for P. baroni venom (Sánchez et al., 2014), and also greater for P. olfersii and P. patagoniensis venoms (Carreiro da Costa et al., 2008). This demonstrates that the SVMP enzymatic activity of RFS venoms can exceed that of some FFS species. Philodryas sp. have been noted to share similar SVMP epitopes with snakes of the genus Bothrops (Assakura et al., 1992; Tancioni et al., 2004; Carreiro da Costa et al., 2008) and antivenom produced from Bothrops venoms has been observed to neutralize the systemic envenomation effects of Philodryas (Rocha et al., 2006).

Several SVMPs from RFS venoms have been purified and characterized, including patagonfibrase from P. patagoniensis (Peichoto et al., 2007), alsophinase from Borikenophis (previously the genus Alsophis) portoricensis (Weldon and Mackessy, 2012), and several from P. olfersii (Assakura et al., 1994). These characterized SVMPs are most noted for fibrin(ogen)olytic activity, preferentially degrading the α-chain of fibrinogen over the β-chain; selectivity was variable, but none degraded the γ-chain (Assakura et al., 1994; Peichoto et al., 2007; Weldon and Mackessy, 2012). Patagonfibrase also impaired platelet aggregation induced by collagen and ADP (Peichoto et al., 2007). Both patagonfibrase and alsophinase were shown to cause hemorrhage and edema in mice, once again suggestive of the significant role SVMPs play in producing the envenomation symptomologies from these snakes (Peichoto et al., 2007, 2011; Weldon and Mackessy, 2012).

Only SVMPs of the P-III class have been reported in RFS (Saviola et al., 2014; Junqueira-de-Azevedo et al., 2016), with the exception of one truncated SVMP of a new class (Campos et al., 2016). This new type of SVMP was identified in the RFS P. mertensi, in which a P-III SVMP was discovered to be truncated in the middle of the disintegrin-like domain by a nonsense mutation in the gene; the resulting truncated SVMP was also observed in the venom (Campos et al., 2016). Rear-fanged SVMPs appear to have such nucleotide mutations frequently; within the venom gland transcriptome of B. portoricensis, a nucleotide substitution resulted in the elimination of the conserved stop codon and produced SVMP transcripts with a 9 residue extension at the C-terminus (Modahl et al., 2018a). P-IIIs were the first class of SVMPs recruited as a venom toxin (Casewell et al., 2011), and these RFS toxin genes and transcripts provide insight into the evolution of SVMPs and where alterations in ancestral gene sequences occurred, allowing the domains of this superfamily of proteins to diversify.

# Cysteine-Rich Secretory Proteins

Cysteine-rich secretory proteins are common in many reptile venoms (Mackessy, 2002; Sunagar et al., 2012), but very little is currently known about their exact biological targets as venom components. These non-enzymatic proteins all share a conserved 16 cysteine residue pattern, forming eight disulfide bonds (Mackessy and Heyborne, 2010). CRiSPs lack proteolytic, hemorrhagic and coagulant activity (Lodovicho et al., 2017). The few snake venom CRiSPs that have been characterized have been found to inhibit various ion channels (Nobile et al., 1996; Brown et al., 1999; Yamazaki et al., 2002; Wang et al., 2006) or induce inflammation, activating the complement system (Lodovicho et al., 2017).

CRiSPs occur in a wide range of RFS venoms (Hill and Mackessy, 2000; Peichoto et al., 2012), and are likely at least present at the transcript level for all species (Junqueira-de-Azevedo et al., 2016). Few CRiSPs have been purified and characterized from RFS venoms, but the CRiSP patagonin was purified from the venom of Philodryas patagoniensis and demonstrated necrotic activity toward murine gastrocnemius muscle when injected intramuscularly at doses of 43 µg and greater (Peichoto et al., 2009). It was suggested that patagonin was potentially binding to ion channels. Patagonin had no effect on the aggregation of human platelets and no proteolytic activity toward azocasein or fibrinogen (Peichoto et al., 2009). Helicopsin, a CRiSP from the RFS Helicops angulatus, was found to have robust neurotoxic activity, causing respiratory paralysis in mice (Estrella et al., 2011). However, tigrin, a CRiSP isolated from the venom of RFS R. tigrinus tigrinus, was found to have no effect on high potassium or caffeine-induced contraction of helical strips of endothelium-free rat-tail arterial smooth muscle (Yamazaki et al., 2002), suggesting a lack of any neurotoxicity. The biological roles of CRiSPs in venoms remain unclear, but given the wide occurrence of these proteins in venoms, their conserved sequence and cysteine scaffold, and moderate-high levels of abundance in RFS venoms (often one of the most abundant venom components; Pla et al., 2017a; Modahl et al., 2018a), they likely serve important biological roles.

# Non-ubiquitous and/or Minor Venom Components

Other proteins that have been found in RFS venom proteomes include serine proteinases, phospholipase A<sub>2</sub>s (type IA), acetylcholinesterases, and C-type lectins. Serine proteinases dominate many viper venoms (Mackessy, 2010b) and are responsible for both promoting and inhibiting blood coagulation, through mechanisms ranging from activation of coagulation factors to induction of platelet aggregation or direct action on fibrinogen (Braud et al., 2000). In RFS venoms, serine proteinases have been identified in the venoms of P. olfersii (Assakura et al., 1994; Ching et al., 2006), P. patagoniensis (Peichoto et al., 2005), and P. mertensi (Campos et al., 2016), but overall these enzymes appear to be uncommon in this group of snakes. Phospholipase A<sub>2</sub>s (type IA) are also uncommon in RFS venoms, but for some species, such as T. biscutatus lambda, PLA<sub>2</sub> enzymatic activity is detectable from crude venom and a PLA<sub>2</sub> has even been purified from this venom (Hill and Mackessy, 2000; Huang and Mackessy, 2004). Front-fanged snake venoms usually contain many PLA<sub>2</sub> isoforms and this large venom protein superfamily has a wide range of pharmacological effects, including neurotoxic, myotoxic, cardiotoxic, anticoagulant, hemolytic, and hypotensive activities (Kini, 2003). PLA<sub>2</sub> enzymes from RFS appear to be more similar to PLA<sub>2</sub> sequences from elapid venoms (Huang and Mackessy, 2004; Fry et al., 2008), and perhaps serve a predigestive or prey capture role in RFS venoms, but they are not an abundant venom component. Acetylcholinesterase is another enzyme that has been detected at low levels in some RFS venoms, such as Boiga species (Broaders and Ryan, 1997; Hill and Mackessy, 2000) and Leptophis ahaetulla marginatus (Sánchez et al., 2018), but it is also not broadly distributed in RFS venoms (Mackessy, 2002; Junqueira-de-Azevedo et al., 2016).

C-type lectins (CTL) do appear to be a ubiquitous component of RFS venoms, and they are consistently present as transcripts in venom glands of RFS species (Junqueira-de-Azevedo et al., 2016). CTLs are non-enzymatic proteins capable of binding reversibly and non-covalently to carbohydrates, inducing hemagglutination by binding surface glycoconjugates on erythrocytes (Lu et al., 2005; Morita, 2005). Lectins have been isolated from many natural sources, and are more common components of viper than elapid venoms (Mackessy, 2010b). For FFS species, CTL abundance appears to be variable in snake venoms (Durban et al., 2011) and has only been detected at a transcript level in the venom gland of some species (e.g., Vonk et al., 2013). A CTL from the RFS Cerberus rynchops shared 79% sequence identity to a CTL from the elapid Bungarus multicinctus (OmPraba et al., 2010), and CTL transcripts from the venom gland of P. olfersii were also found to be more similar to those of elapids than viperids (Ching et al., 2006). All lectins have a carbohydrate recognition domain responsible for the glycan interaction activity, and the EPD triplet and conserved calcium-binding region indicated mannose specificity for the CTL discovered in venom from the RFS C. rynchops. Interestingly, CTLs in RFS have various glycan binding motifs, including EPD, QPD, EPN, RPS, QVE, and EPK (Fry et al., 2008; Junqueira-de-Azevedo et al., 2016). The variability in binding motifs in various FFS species suggests that these genes have greatly diversified in colubroids and might provide different functionalities in RFS venoms.

# REAR-FANGED SNAKE VENOMS: NEW PROTEIN FAMILIES

New venom protein families have been identified from RFS venoms. These previously unknown venom proteins can make up 26% of the total venom gland transcripts in these species (Cerberus rynchops: OmPraba et al., 2010) or almost 50% of the expressed toxins in the venom gland (T. strigatus: Ching et al., 2012), demonstrating that these newly recognized venom proteins are not just minor venom components and are likely biologically relevant as venom toxins. However, transcripts for many toxins have been identified in the venom glands of RFS species but found to be absent from the venom (Fry et al., 2008; Junqueira-de-Azevedo et al., 2016; Modahl et al., 2018a). Additionally, toxin homologs are expressed at low to moderate levels in other tissues and are not restricted to the venom gland (Hargreaves et al., 2014; Junqueira-de-Azevedo et al., 2015; Reyes-Velasco et al., 2015), and mistaking these non-toxin genes for venom proteins can confound evolutionary analyses. Expression of genes belonging to venom toxins is higher in venom gland tissues in comparison to other organ tissues, and examining these expression profiles has been a successful approach to identifying true venom proteins and what could potentially be new toxins (Hargreaves et al., 2014; Campos et al., 2016; Perry et al., 2018). It is still critical that, when characterizing new toxins, any transcriptome study is accompanied by a venom proteome to confirm the presence of the toxin in the venom. Ribonuclease, lipocalin, phospholipase A<sub>2</sub> (type IIE) and vitelline membrane outer layer proteins have been suggested to be new toxins from RFS species (Fry et al., 2012b), but have yet to be confirmed and functionally characterized in these venoms. Therefore, only new venom protein families with detected presence in RFS venoms will be discussed below. These new toxins include veficolins, matrix metalloproteinases and acid lipases.
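The two criteria described above (venom gland enrichment plus confirmation in the venom proteome) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the candidate names, the expression values (in arbitrary TPM-like units), and the 10-fold threshold are invented and do not come from any of the cited studies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str                      # hypothetical transcript label
    venom_gland_tpm: float         # expression in the venom gland
    other_tissue_tpm: List[float]  # expression in non-venom tissues
    in_proteome: bool              # detected in the venom proteome?


def is_supported_toxin(c: Candidate, min_fold: float = 10.0) -> bool:
    """Require venom-gland enrichment AND detection in the venom itself."""
    background = max(c.other_tissue_tpm, default=0.0)
    # A pseudocount of 1 avoids division by zero for unexpressed genes.
    fold = (c.venom_gland_tpm + 1.0) / (background + 1.0)
    return fold >= min_fold and c.in_proteome


# Invented example values, not measurements.
candidates = [
    Candidate("toxin_like_A", 5200.0, [3.0, 1.5, 0.0], True),
    Candidate("homolog_B", 40.0, [35.0, 50.0, 22.0], False),
    Candidate("transcript_only_C", 900.0, [2.0, 4.0], False),
]
supported = [c.name for c in candidates if is_supported_toxin(c)]
print(supported)
```

In this toy data set only the first candidate passes: the second is expressed as strongly in other tissues as in the venom gland, and the third is enriched in the gland but was never detected in the venom.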

# Veficolins

A new venom protein family, named veficolins (venom ficolins), was discovered in the venom of C. rynchops (OmPraba et al., 2010), one of only two RFS species of the family Homalopsidae that has been studied. OmPraba et al. used a combined transcriptomic and proteomic approach to characterize C. rynchops venom, confirming both veficolin gene expression in the venom gland and its presence in the venom. Transcripts for veficolins have been identified in the venom glands of other RFS species (Fry et al., 2012b; Junqueira-de-Azevedo et al., 2016; Pla et al., 2017a; Modahl et al., 2018b), but C. rynchops is the only species reported to have these toxins in its venom proteome (OmPraba et al., 2010). Veficolins share amino acid sequence similarities with mammalian ficolins, which have collagen-like and fibrinogen-like domains. It is possible that these venom proteins can induce platelet aggregation and/or initiate complement activation when delivered into prey (OmPraba et al., 2010), aiding in prey capture. However, the two veficolins from C. rynchops venom have yet to be experimentally characterized.

# Matrix Metalloproteinases

Snake venom matrix metalloproteinases (svMMPs) were first identified as a new venom protein present in the venom of RFS R. tigrinus tigrinus (Komori et al., 2006). Transcripts for svMMPs appear to be abundant in the venom glands of Dipsadinae RFS, including T. strigatus (Ching et al., 2012) and Erythrolamprus miliaris (Junqueira-de-Azevedo et al., 2016), and they are closely related to but functionally distinct from SVMPs. The expression of svMMPs in the venom gland of T. strigatus was considerably higher than that of SVMPs: 46.2% and 8.2%, respectively (Ching et al., 2012). Snake venom matrix metalloproteinases were also detected in the 2-D electrophoresis venom profile of P. mertensi (Campos et al., 2016). RFS svMMP genes cluster with a single MMP-9 ancestral gene, regardless of the presence or absence of ancillary domains (Junqueira-de-Azevedo et al., 2016), suggestive of a separate recruitment event relative to SVMPs. Matrix metalloproteinases are also zinc-dependent enzymes that degrade extracellular matrix proteins, such as collagens, elastin, proteoglycans, and laminins (Ra and Parks, 2007). These enzymes have been well-recognized for their roles in many physiological and pathological processes, including organ growth, wound healing, bone remodeling, immunity modulation, tumor invasion, and metastasis (Vu and Werb, 2000; Parks et al., 2004; Page-McCaw et al., 2007). They likely serve tissue-degrading roles in RFS venoms.

# Acid Lipases

Recently, another new venom protein, snake venom acid lipase (svLIPA) was identified in the venom of the RFS P. mertensi (Campos et al., 2016). The venom gland transcriptome of this species revealed a highly expressed transcript coding for a lysosomal acid lipase protein; P. mertensi venom also exhibited lipase activity (Campos et al., 2016), and the recombinantly expressed svLIPA transcript produced a functional protein. Campos et al. supported their identification of svLIPA as a new venom protein based on its presence in P. mertensi venom, presence in the venom of the FFS Micrurus corallinus (Correa-Netto et al., 2011), high transcript abundance in the venom gland of P. mertensi in comparison to other body tissues, and previous reports of its transcription in the oral glands of other snakes (Hargreaves et al., 2014). Acid lipases hydrolyze cholesteryl esters and triglycerides to free cholesterol and fatty acids, aiding in cell metabolism and immunity (Gomaraschi et al., 2019). As with svMMPs, lipases could serve predigestive roles in RFS venoms.

# REAR-FANGED SNAKE VENOMS: NEW ACTIVITIES

One of the greatest hurdles to overcome when functionally characterizing a venom protein is acquiring enough material to perform assays. Experimental characterization of minor venom components or toxins from venomous animals with low venom yields is usually done by first producing enough of the required protein from a recombinant expression system, such as Escherichia coli, Pichia pastoris, or mammalian cells (Gomes et al., 2016). However, toxins have proven difficult to express in these systems due to being rich in disulfide bonds, leading to protein aggregates and misfolding, and they can be toxic to expressing cells as well (Saez et al., 2014). It is also possible to chemically synthesize toxins, but this is only successful for smaller toxins. With FFS species, venom yields are much larger, so it is possible to avoid these challenges by purifying venom proteins directly from the venom. This has led to a characterization bias of only venom proteins that are abundant and/or that originate from snake species with large venom yields. Because of this, only a few proteins have been characterized from RFS venoms. For the toxins that have been explored, interesting new activities and targets have emerged, especially in relation to snake diet, prey capture strategies and the biological roles RFS venoms provide. Venoms from these species are an additional and highly productive resource for examining the functional diversity of prevalent protein superfamilies.

# Three-Finger Toxins

Crude venoms from several RFS species show drastic toxicity differences depending on the model organism used for lethal dose (LD<sub>50</sub>) experiments (Mackessy, 2002). Philodryas patagoniensis venom was tested on pigeons (Columba livia domestica), guinea pigs (Cavia porcellus), rabbits (Oryctolagus cuniculus), and frogs (Leptodactylus sp.), and pigeons were the most sensitive to this venom (Martins, 1907). Boiga irregularis venom LD<sub>50</sub> values were determined for domestic chickens (Gallus domesticus), geckos (Hemidactylus sp.), skinks (Carlia sp.), and mice (Mus musculus), and it was found that crude venom was much more toxic to birds and lizards than to mammals (**Table 1**) (Mackessy et al., 2006). Venom from S. sulphureus was also more toxic toward geckos (Hemidactylus sp.) than mice (M. musculus), even at 22-fold higher mass-adjusted doses in mice (**Table 1**) (Modahl et al., 2018b). The toxicity of S. sulphureus venom in geckos is equivalent to that observed for venom from the highly toxic FFS Naja kaouthia (**Table 1**). The enhanced toxicity of crude venom toward different taxa has led to explorations into these venoms to discover the components responsible. In most cases, 3FTxs have been identified as the taxon-specific targeting toxins in these venoms; they were first characterized in the venoms of RFS species in the genus Boiga, commonly referred to as cat snakes.

### TABLE 1 | Toxicity of venoms and purified toxins toward lizards and mice.

*i.p., intraperitoneal; i.v., intravenous.*

*Lethal dose (LD<sub>50</sub>) values for B. irregularis venom are from Mackessy et al. (2006), S. sulphureus venom are from Modahl et al. (2018b), and N. kaouthia venom are from Modahl et al. (2016). Purified α-cobratoxin values are from Modahl et al. (2016) (lizard) and Karlsson (1973) (mice), irditoxin values are from Pawlak et al. (2009), and purified toxin values from S. sulphureus are from Modahl et al. (2018b).*

From B. dendrophila venom, two 3FTxs have been characterized with postsynaptic neurotoxicity, boigatoxin-A (Lumsden et al., 2005) and denmotoxin, B. dendrophila monomeric toxin (Pawlak et al., 2006). Purified denmotoxin was found to bind to postsynaptic nAChRs in chick muscle preparations 100-fold more readily than to mouse nAChRs. This is also consistent with the lack of B. dendrophila crude venom toxic effects after injections into mice of doses up to 20 µg/g (Pawlak et al., 2006). Boigatoxin-A shows 78% identity to denmotoxin, but was only tested on chick neuromuscular junctions, so it is unknown if this toxin would display taxon-targeting specificity. For the venom of B. irregularis, taxon-specific toxicity was found to be a result of the 3FTx complex irditoxin, B. irregularis dimeric toxin (Pawlak et al., 2009). Irditoxin inhibited postsynaptic nAChRs in chick biventer cervicis muscle preparations, but was three orders of magnitude less effective at the mammalian neuromuscular junction. This corresponds with the in vivo toxicity of irditoxin, which was nontoxic in mammals at doses up to 25 µg/g, but has an LD<sub>50</sub> value of 0.55 µg/g in Hemidactylus geckos and 0.22 µg/g in chickens (G. domesticus) (**Table 1**; Pawlak et al., 2009). Irditoxin was the first identified covalently linked 3FTx heterodimeric complex; previous 3FTx complexes were observed to be noncovalent homodimers, examples including κ-bungarotoxins (Dewan et al., 1994). Both of these rear-fanged cat snake species, B. dendrophila and B. irregularis, are arboreal snakes with diets primarily of birds and lizards, especially as juveniles (Greene, 1989). Therefore, the 3FTx taxon-specific toxicity appears to be correlated with the prey items these snakes commonly eat.

Taxon-specific 3FTxs have also been isolated from two New World species, O. fulgidus (Heyborne and Mackessy, 2013) and S. sulphureus (Modahl et al., 2018b). Fulgimotoxin, O. fulgidus monomeric toxin, has an LD<sub>50</sub> of 0.28 µg/g in Anolis lizards, but was found to be nontoxic to mice at mass-adjusted doses more than 15 times the observed lizard LD<sub>50</sub> (Heyborne and Mackessy, 2013). Oxybelis fulgidus is an arboreal snake that has a diet of birds and lizards (Robert, 1982), supporting the link between taxon-specific 3FTxs and snake diet. Sulditoxin, S. sulphureus dimeric toxin, was identified as the lizard-specific 3FTx complex in S. sulphureus venom, and has an LD<sub>50</sub> value of 0.22 µg/g in Hemidactylus geckos, but is non-toxic to mammals up to doses of 5 µg/g (**Table 1**) (Modahl et al., 2018b). Sulditoxin was the second heterodimeric 3FTx complex discovered in RFS venoms, and interestingly shares the same cysteine residue pattern as irditoxin, likely using the same additional cysteines in the first and second loops (residues 17 and 42, respectively) to form an intermolecular disulfide linkage of the two subunits. A second taxon-specific toxin exists in S. sulphureus venom, but this toxin demonstrated specific toxicity toward mammals instead of lizards. This mammal-specific toxin was found to be a monomer and was named sulmotoxin 1, S. sulphureus monomeric toxin (**Table 1**; Modahl et al., 2018b). The third most abundant 3FTx in S. sulphureus venom, sulmotoxin 2, did not show taxon specificity. Spilotes sulphureus are arboreal snakes that are generalist predators, feeding on birds, lizards, amphibians, and small mammals (Andrade et al., 2017). The venom of S. sulphureus has two taxon-specific 3FTxs present, and this correlates with its diet, which includes a diversity of prey.
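To make the degree of taxon specificity concrete, the LD<sub>50</sub> values and highest non-lethal mammalian doses quoted above for irditoxin and sulditoxin can be turned into lower bounds on fold-selectivity. The short sketch below uses only numbers stated in the text; the function and dictionary names are illustrative, and because the mammalian LD<sub>50</sub> values were not reached, the results are minimum estimates rather than measured ratios.

```python
# LD50 values (µg/g) and highest doses reported as nontoxic in mammals (µg/g),
# as quoted in the text for the purified toxins.
toxins = {
    "irditoxin (gecko)": {"ld50": 0.55, "max_nontoxic_mammal": 25.0},
    "irditoxin (chicken)": {"ld50": 0.22, "max_nontoxic_mammal": 25.0},
    "sulditoxin (gecko)": {"ld50": 0.22, "max_nontoxic_mammal": 5.0},
}


def min_selectivity(ld50: float, max_nontoxic: float) -> float:
    """Lower bound on fold-selectivity.

    The mammalian LD50 is at least `max_nontoxic`, so the true ratio of
    mammalian LD50 to prey LD50 is at least max_nontoxic / ld50.
    """
    return max_nontoxic / ld50


for name, v in toxins.items():
    fold = min_selectivity(v["ld50"], v["max_nontoxic_mammal"])
    print(f"{name}: at least {fold:.0f}-fold")
```

On these figures, irditoxin is at least about 45-fold (gecko) and 114-fold (chicken) more toxic to its target prey than to mammals, and sulditoxin at least about 23-fold.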

Gene trees of RFS 3FTx sequences suggest that taxon-specific targeting convergently evolved in these separate snake species, at least for the dimeric complexes (irditoxin and sulditoxin), because even though these sequences share the same cysteine pattern, they cluster separately (Modahl et al., 2018b); and the two irditoxin subunit sequences are not common to all species within the Boiga genus (Dashevsky et al., 2018). It is also possible that these genes could have been lost in some Boiga species, which has been suggested for adaptive PLA2s in the rattlesnake clade (Dowell et al., 2016). Until genome sequences are obtained, the evolutionary histories of these genes are difficult to assess, especially given the challenges with assembling de novo venom gland transcriptomes. Currently, most toxin sequences are obtained from venom gland NGS, and assemblers that are commonly used struggle to assemble toxin genes because of the multitude of similar isoforms and high expression levels of these toxins (Macrander et al., 2015; Holding et al., 2018; Modahl et al., 2019).

Selection analyses have revealed conflicting trends in RFS 3FTx evolution: positive selection was reported previously (Sunagar et al., 2013), but it has since been observed that selection is variable for different 3FTx clades (Dashevsky et al., 2018). Some RFS 3FTxs are evolving under negative selection, especially sequences of subunits that form the dimeric complexes, where residues that are responsible for subunit interactions must be maintained (Dashevsky et al., 2018). With the current shortage of 3FTx sequences in databases from RFS species and absence of genomes, it is difficult to fully recognize the evolutionary complexities for this group of toxins. However, by continuing to study these sequences from phylogenetically diverse species that are converging on the same trophic adaptation, it is possible to begin to explore how a venom protein with a conserved structure, like 3FTxs, can evolve specific activities and alter its mechanism of action. These toxins are therefore ideal for examining protein structure-function relationships.

When lizard-specific 3FTx sequences from RFS species were aligned, conserved motifs were observed in the central loop, where residues critical to nAChR binding are found for the FFS 3FTxs erabutoxin-a (Pillet et al., 1993) and α-cobratoxin (Antil et al., 1999). These conserved sequences were identified as CYTLY and WAVK for residues 34–38 and 46–49, respectively (Heyborne and Mackessy, 2013). These two motifs were hypothesized to be involved in toxin specificity to lizard nAChRs. Sulmotoxin 1, which is mammal-specific, instead exhibits the motifs CYNLY (a threonine-to-asparagine substitution) and WTVK (an alanine-to-threonine substitution) in this region (Modahl et al., 2018b); because sulmotoxin 1 lacks lizard toxicity, this supports the hypothesis. The size of the central loops between sulmotoxin, denmotoxin, and fulgimotoxin is also variable (**Figure 3A**), with longer central loops observed in lizard-specific 3FTxs, in addition to a proline in this region, followed by a negatively charged amino acid (either an aspartic acid or glutamic acid, or both; **Figure 3B**). Site-directed mutagenesis studies have yet to be performed for these taxon-specific 3FTxs and would help to elucidate the residues responsible for taxon-specific receptor binding. Key residues involved in receptor interactions could also be different from what is known for elapid 3FTxs, and identification of novel receptor-ligand interactions could also provide insight into how covalent 3FTx complexes (irditoxin or sulditoxin) interact with nAChRs.
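The motif comparison above amounts to listing point substitutions between aligned motifs. A minimal sketch follows, assuming that the residue numbering quoted for the lizard-specific alignment (34–38 and 46–49) can be applied directly to the sulmotoxin 1 motifs; the function name is illustrative.

```python
def substitutions(reference: str, variant: str, start: int) -> list:
    """List point substitutions between two aligned, equal-length motifs.

    `start` is the residue number of the first motif position, so the
    result uses the usual notation <reference residue><position><variant residue>.
    """
    if len(reference) != len(variant):
        raise ValueError("motifs must have equal length")
    return [
        f"{ref}{start + i}{var}"
        for i, (ref, var) in enumerate(zip(reference, variant))
        if ref != var
    ]


# Lizard-specific motifs vs. the mammal-specific sulmotoxin 1 motifs.
print(substitutions("CYTLY", "CYNLY", 34))  # ['T36N']
print(substitutions("WAVK", "WTVK", 46))    # ['A47T']
```

Under that numbering assumption, the two differences fall at positions 36 and 47, which would be natural first targets for the site-directed mutagenesis experiments proposed in the text.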

# Metalloproteinases

All metalloproteinases that have been characterized from RFS venoms are of the P-III class and commonly exhibit fibrinogenolytic activity, with rapid degradation of the α-chain of fibrinogen and lower activity toward β- and γ-chains (Assakura et al., 1992; Peichoto et al., 2007; Weldon and Mackessy, 2012). Several SVMPs purified from the venom of P. olfersii degraded the α-chain of fibrinogen, with some also hydrolyzing the β-chain, but with differences in selectivity and rate (Assakura et al., 1994). In the venom of P. patagoniensis, the isolated SVMP named patagonfibrase was found to be the venom component responsible for the hydrolysis of the α-chain of fibrinogen, but not the β-chain (Peichoto et al., 2007), and for B. portoricensis, the SVMP alsophinase was found to hydrolyze the α-subunit of fibrinogen almost instantly, and to slightly degrade the β-chain after a 60 min incubation (Weldon and Mackessy, 2012). Although no individual SVMPs have been purified from the venom of A. prasina, there are SVMPs in this venom that act at a more rapid rate than those from the venom of B. portoricensis, hydrolyzing both α- and β-chains of fibrinogen (**Figure 4A**) even faster than what is observed with crude Prairie Rattlesnake (Crotalus viridis viridis) venom (**Figure 4B**; Modahl et al., 2018a). SVMPs of the P-III class are also found in venoms of elapid snakes but do not typically degrade the β-chain of fibrinogen and only target the α-chain (Evans, 1981; Guo et al., 2007; Sun and Bao, 2010). These studies demonstrate the underappreciated potency and functional diversity of SVMPs in RFS venoms.

Isoforms of SVMP P-IIIs have been observed to have distinct coagulation mechanisms in humans, birds, and small rodents, with some isoforms more specialized for rats or chickens (Bernardoni et al., 2014). Several studies suggest that SVMPs from RFS venoms might also have taxon selectivity. Higher molecular mass proteins isolated from the venom of P. olfersii were found to produce an irreversible blockade in chick biventer cervicis preparations, but had no effect on mouse phrenic nerve-diaphragm preparations (Prado-Franceschi et al., 1996). A separate study of P. olfersii venom revealed that all large molecular mass protein spots on a 2-D gel were SVMPs (Ching et al., 2006). Venom from P. patagoniensis induced a time-dependent neuromuscular blockade of chick biventer cervicis preparations, but had no effect on mouse phrenic nerve-diaphragm preparations (Carreiro da Costa et al., 2008). Although this has not yet been tested, SVMPs in this venom might also be responsible for the taxon-specific neuromuscular effects, given that this is a closely related species and that the venom lacks 3FTxs (Harvey and Mackessy, unpubl. obs.). An SVMP-dominated pit viper venom (from Bothrops insularis) was shown to have differential effects on mouse phrenic nerve diaphragm and chick biventer cervicis muscle preparations (Cogo et al., 1993). In FFS venoms, variations in SVMP P-III isoform expression have been observed in rattlesnake populations that are locally adapted to prey (Margres et al., 2017), and a similar relationship between SVMP variants and prey targeting has been hypothesized for Bothrops neuwiedi venom SVMPs (Bernardoni et al., 2014). More work in this area is needed to determine if SVMPs from RFS venoms are indeed taxon-specific venom components and linked to snake diet. Dietary studies show that for Philodryas species, amphibians and lizards are the primary prey taken (Marques and Hartmann, 2005; López and Giraudo, 2008), and isolated SVMPs have yet to be evaluated on these taxa. In addition, it would be important to determine what molecular interactions mediate SVMP taxon-specific toxicity.

SVMPs from RFS could therefore provide interesting models for future work focused on structure-function relationships. How do some SVMPs degrade both α- and β-chains while others do not? How can SVMPs be selective to prey type? Venoms from RFS species are ideal for addressing these questions because these venoms are overall less complex. For instance, vipers have several different classes of SVMPs, as well as serine proteinases that can also degrade fibrinogen, making it difficult to study specific toxin activity without multiple purification steps. Because of the abundance and extensive occurrence of P-III SVMPs in RFS venoms, these proteins can provide insight into the evolution of this toxin family and the biological roles that SVMPs provide in venoms.

# REAR-FANGED SNAKE VENOMS: BIOLOGICAL ROLES

Recognized biological roles of venom proteins include: (1) a trophic role, that is, use in prey capture by impairing prey locomotion (immobilizing prey and preventing escape), by inducing quiescence in prey, and/or by causing rapid death; (2) participating in digestion, such as by accelerating digestion of bulky prey items; (3) helping to lubricate prey during ingestion; and (4) functioning as a form of defense (Kardong, 2002). It is likely that prey capture is the primary role, as there have been multiple publications demonstrating a strong link between venom and snake diet (Mackessy, 1985, 1988; da Silva and Aird, 2001; Mackessy et al., 2003; Li et al., 2005a; Pawlak et al., 2006, 2009; Barlow et al., 2009; Modahl et al., 2018b). Venom has been observed to be variable in composition (Chippaux et al., 1991), and this variation has been associated with dietary differences from geographic locality (Daltry et al., 1996; Creer et al., 2003; Smiley-Walters et al., 2017; Sousa et al., 2017) and ontogenetic dietary shifts (Mackessy, 1985, 1988; Andrade and Abe, 1999; Mackessy et al., 2006; Cipriani et al., 2017).

# Prey Capture

Venoms from both FFS and RFS species have been found to show differential toxicity toward prey items (da Silva and Aird, 2001; Mackessy et al., 2006; Gibbs and Mackessy, 2009), and population-level correlations between venomous snakes' venoms and their sympatric prey have been proposed (Margres et al., 2017; Smiley-Walters et al., 2017). The toxicity of venoms from four species of the FFS genus Echis to a natural scorpion prey was found to be associated with the degree of utilization of arthropods in the diet for each Echis species (Barlow et al., 2009), and a similar trend was observed for Sistrurus rattlesnakes that vary in the extent of amphibian, lizard and mammal prey taken for each species (Gibbs and Mackessy, 2009). Further, snakes with a greater dietary breadth appear to have greater toxin diversity within their venoms (Pahari et al., 2007; Calvete et al., 2012; but see Zancolli et al., 2019). The opposite also appears to be true if there is a reduction in dietary diversity or when venom no longer plays a pivotal role in prey capture. The sea snake Aipysurus eydouxii that feeds exclusively on fish eggs has atrophied venom glands, a loss of venom neurotoxicity and a reduction in the diversity of toxins in the venom of this species (Li et al., 2005b); a similar trend is seen in the terrestrial elapid Brachyurophis (Goodyear and Pianka, 2008). One of the advantages of exploring RFS venoms is that various species feed on a variety of specific prey types and show varying prey capture strategies, from constriction to envenomation, making these venoms ideal for predator-prey evolutionary studies.

Predatory strategy, venom vs. constriction, can drive venom evolution and snake behavior, favoring the evolution of venom that allows for a greater diversity and larger prey to be taken with minimum prey retaliation, and transitioning away from reliance on constriction (Savitsky, 1980). Boiga irregularis has the lizard-specific 3FTx complex irditoxin (**Table 1**) in abundance within its venom, and lizard prey is bitten and held by these snakes (Mackessy et al., 2006). However, mammalian prey is constricted by B. irregularis, and this is likely because this venom lacks mammal-specific toxins, as the crude venom murine LD<sub>50</sub> value is >30 µg/g (**Table 1**) (Mackessy et al., 2006). Spilotes sulphureus venom has two taxon-specific 3FTxs, one that is lizard-specific and one that is mammal-specific (**Table 1**), and both are abundant in the venom; constriction is not observed for S. sulphureus (Boos, 2001; Modahl et al., 2018b). Larger mammalian prey are metabolically more favorable to consume than smaller lizard prey, and sulmotoxin 1, the mammal-specific 3FTx, appears to be a more recent adaptation facilitating mammalian prey capture. In the case of the FFS Naja kaouthia, a species that also does not use constriction, this highly toxic snake has one main 3FTx, α-cobratoxin, that is lethal toward both mammal and lizard prey (**Table 1**; Modahl et al., 2016). These examples demonstrate two different venom evolutionary strategies to target a greater diversity of prey: one venom with a single 3FTx that has evolved lethal toxicity to many prey species, and a second venom in which toxin gene duplication and neofunctionalization have resulted in separate 3FTxs that are selectively toxic toward different prey types. Again, toxin evolution in rear-fanged snake venoms can provide key insights into venom evolution and predatory strategies, as these snakes vary greatly in predatory behavior and types of prey consumed.

When atypical prey items are taken, venom proteins with unique activities may be discovered. This has been the case for several FFS species, such as snakes that consume other snakes. The King Cobra, Ophiophagus hannah, is a species that consumes other snakes and was found to have a toxin from a new venom protein superfamily, ohanin, in its venom (Pung et al., 2005). The Long-glanded Coral Snake, Calliophis bivirgatus, also a snake-eating snake, was found to have a 3FTx with unusual activity toward sodium ion channels (Yang et al., 2016). Snakes of the genus Bungarus, which also commonly feed on other snakes, exhibit unique 3FTx homodimer complexes (κ-bungarotoxins; Dewan et al., 1994). Rear-fanged snakes have even more dietary specialists, including species that feed on scorpions, spiders or centipedes, and so it is expected that novel toxins remain to be discovered in venoms of this diverse group of colubroid snakes.

Therefore, venoms from RFS species likely contain many new and currently undiscovered potent toxins. Experimental studies have found that only about half of the total venom expended by the RFS B. irregularis is delivered into the viscera of prey, and the other half remains embedded in the skin (Hayes et al., 1993). But even with these lower yields delivered into prey tissue, effects are still observed, as prey that is removed without being consumed by the snake may become sluggish and eventually die after many minutes or several hours (Hayes et al., 1993). In combination with a less rapid venom delivery system, these snakes also generally have less complex venom (Mackessy, 2010b; Peichoto et al., 2012), resulting in venom toxins that are optimized for specific types of prey. Venoms from most species of RFS are uncharacterized, and because of historic biases, most LD<sub>50</sub> work utilizes domestic murine models due to availability of mice, genetic uniformity and presumed closer applicability to humans. However, by excluding other models, such as lizards, other vertebrates and invertebrates, one can overlook the biological potency of specialized venom proteins that may have been selected for toxic effects on non-mammalian prey (cf. Richards et al., 2012; Smiley-Walters et al., 2018).

# Digestion

Metalloproteinases in rattlesnake venoms have been suggested to facilitate efficient digestion of prey at suboptimal temperatures or when large prey are consumed (Mackessy, 1988, 2010a); however, several studies have indicated that envenomation does not increase digestive efficacy (McCue, 2007; Chu et al., 2009). In the case of many rattlesnake species, the extent of SVMP activity in a venom is negatively correlated with overall venom toxicity (Mackessy, 2010a). This has led to rattlesnake venoms being characterized as type I (SVMP-dominated venoms that are less toxic) or type II (venom that is more toxic but with low metalloproteinase activity; Mackessy, 2010a). A similar dichotomy is found in many RFS venoms, where a venom is dominated either by toxic 3FTxs or by enzymatic SVMPs (McGivern et al., 2014). For the RFS venoms that are heavily dominated by SVMPs, these venom components likely also aid in prey predigestion, as similar fibrinogenolytic activity is observed for SVMPs from both rattlesnakes and RFS species (**Figure 4**; Modahl et al., 2018a). Studying RFS venoms can reveal parallels to trends seen in FFS venoms, emphasizing the importance of specific venom proteins in facilitating prey handling in diverse species of venomous snakes, regardless of the venom delivery system.

# FUTURE RESEARCH

Advances in, and integration of, research technologies now allow much more detailed approaches to characterizing unknown venoms and individual toxins. Transcriptomes assembled from venom glands provide custom databases to be paired with proteomics and make it possible to identify proteins in a venom even when they are currently missing from public databases (Modahl et al., 2019). For the recently characterized sulditoxin, searching the MS/MS spectra of the trypsin-digested toxin against public domain databases alone, without the species-specific NGS transcriptome, did not identify this isolated toxin. By including the NGS venom gland transcriptome, the identification of the exact transcript and thus the full amino acid sequence was readily achieved (Modahl et al., 2018b). Previously, a similar situation produced negative results; from the analysis of RFS Rhamphiophis oxyrhynchus venom, a neurotoxin was isolated and partially characterized, but it showed no similarities to any toxins in public databases at that time, and the exact venom protein classification could not be determined (Lumsden et al., 2007). Unique or unusual toxin sequences still may not be present in public databases, and hence species-specific venom gland transcriptome databases are critical for identification (Campos et al., 2016; Modahl et al., 2018b, 2019).
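The reason a species-specific database matters can be shown with a deliberately simplified sketch of peptide matching. It assumes exact string matching of observed peptides against an in silico trypsin digest (cleavage after K or R, except before P), which is far cruder than real MS/MS search engines; the sequences, entry names and peptides below are invented for illustration and are not real toxin sequences.

```python
import re


def tryptic_peptides(protein: str, min_len: int = 5) -> set:
    """In silico trypsin digest: cleave after K or R unless followed by P."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if len(p) >= min_len}


def identify(observed: set, database: dict) -> list:
    """Rank database entries by how many observed peptides they explain."""
    hits = {}
    for name, seq in database.items():
        matched = observed & tryptic_peptides(seq)
        if matched:
            hits[name] = len(matched)
    return sorted(hits.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical sequences: a "public" entry and a transcriptome-derived entry.
public_db = {"public_entry": "MAGKTLLVRSCDEFGHKLMNPQR"}
custom_db = dict(public_db, novel_transcript="MSTKDPEWTVKGCAATCPKLLYR")

# Hypothetical peptides observed from the digested, purified toxin.
observed = {"DPEWTVK", "GCAATCPK"}

print(identify(observed, public_db))  # no match in the public database
print(identify(observed, custom_db))  # matched once the transcript is included
```

The observed peptides are explained only when the transcriptome-derived sequence is part of the search space, which mirrors the sulditoxin case described above.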

Newer technologies have allowed for modifications to various characterization approaches. For venom gland transcriptome assemblies, venom protein identities are usually based on keyword searches for known toxins. More recently, several machine learning programs have been developed to identify unknown toxins from large transcriptomic datasets (Gacesa et al., 2016; Macrander et al., 2018). Additionally, sequencing other snake tissues besides venom glands has provided insight into how gene expression can help in the identification of true toxins, what mechanisms result in high toxin expression in the venom gland in comparison to other tissues, and what gene homologs are present in other tissues (Hargreaves et al., 2014; Junqueira-de-Azevedo et al., 2015; Reyes-Velasco et al., 2015). Genomes of RFS are also useful for identifying toxin gene duplication events (Perry et al., 2018). They also support approaches for identifying where toxin genes originate and which selection pressures they experience.

Because rear-fanged venomous snakes encompass such a large diversity of colubroid snakes, venom evolution can be studied on a broader evolutionary scale, addressing such questions as the effect of phylogeny on venom evolution or dietary specialization. There is an extreme range of toxicity of RFS venoms to humans, with some species being life-threatening and others being harmless. This gradient of toxicity can be used to explore the biological roles of individual venom proteins, such as how toxicity and prey specificity can develop.

Venoms from rear-fanged snakes show many parallels to those of FFS, but one area that has yet to be explored is the level of venom variation within a single RFS species. The venom of B. irregularis has been found to exhibit ontogenetic variation related to diet, and venoms from different populations (Indonesia vs. Guam) show demonstrably different toxin compositions (Mackessy et al., 2006; Pla et al., 2017a). Conversely, individuals from the same populations of B. portoricensis and A. prasina showed very little venom variation (Modahl et al., 2018a), but population-level variation in venom composition is currently unknown for RFS. Variation in rear-fanged snake venoms deserves more attention, as it can help to uncover the mechanisms behind commonly observed venom variation, which has been an area of controversy. Studies are also still lacking at the level of posttranslational modifications of venom proteins, for most venomous snakes, and how this contributes to overall venom diversity.

Another neglected area of research is the interactive or synergistic potential of toxins. Because studying purified toxins usually requires a reductionist approach, few studies have attempted to evaluate interactions between toxins. Dimeric toxins, such as sulditoxin and irditoxin, consist of two dissimilar 3FTxs, but the importance of dimeric associations to specific pharmacological activity is unknown. Interactive complexes are likely between venom toxins, including RFS toxins, and the generally lower complexity of these venoms should make such investigations more tractable. Venoms from RFS therefore have the potential to contribute importantly to our understanding of many phenomena still outstanding in toxinology, and there is a multitude of opportunities for investigations of these venoms, from basic compositional analyses to detailed structure-function studies to NGS-based investigations of regulation and evolution of venom expression systems.

# AUTHOR CONTRIBUTIONS

CM wrote the original draft. SM edited, contributed to, and modified the manuscript. CM and SM created the figures, formulated the original concepts, and edited the final versions of the manuscript.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Modahl and Mackessy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Coral Venom Toxins

### Casey A. Schmidt, Norelle L. Daly\* and David T. Wilson\*

*Centre for Molecular Therapeutics, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, QLD, Australia*

The phylum Cnidaria contains a wide variety of unique organisms that possess interesting adaptations evolved over many years to help them survive in a competitive environment. One of these adaptations is the presence of venom, which has been of particular interest for studies aimed at identifying novel drug leads and for understanding the mechanisms involved in envenomation. The potency of the venom varies significantly amongst cnidarians, and although corals are often overshadowed by the jellyfish and sea anemone toxins, they also possess a range of interesting bioactive compounds. In this mini-review, we provide an overview of the toxins present in corals, highlighting the diverse structures and bioactivities.

Keywords: coral, sea anemone, toxin, nematocyst, venom

# INTRODUCTION

### Edited by:

*Maria Vittoria Modica, Stazione Zoologica Anton Dohrn, Italy*

### Reviewed by:

*Adam Michael Reitzel, University of North Carolina at Charlotte, United States Yehu Moran, Hebrew University of Jerusalem, Israel*

> \*Correspondence: *Norelle L. Daly norelle.daly@jcu.edu.au David T. Wilson david.wilson4@jcu.edu.au*

### Specialty section:

*This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution*

> Received: *31 March 2019* Accepted: *08 August 2019* Published: *27 August 2019*

### Citation:

*Schmidt CA, Daly NL and Wilson DT (2019) Coral Venom Toxins. Front. Ecol. Evol. 7:320. doi: 10.3389/fevo.2019.00320*

The organisms in the phylum Cnidaria represent some of the oldest living venomous creatures on the planet, including jellyfish, hydroids, sea anemones, and corals (Rachamim et al., 2015). The phylum is primarily defined by the presence of nematocytes, or stinging cells, in the tissue of these organisms (Ozbek, 2011). Nematocyte cells contain a tubule with a capsule of venom that, when stimulated, is everted, striking the prey/predator, penetrating the outer membrane and delivering the venom (Ozbek, 2011). These stinging cells have a variety of purposes but are mainly used for prey capture and defense (Greenwood, 2009).

Venoms often contain a complex mixture of compounds, including small molecules, peptides and proteins. These compounds can be highly potent and specific for biological targets, and the peptides in venoms are generally stable because of the presence of disulfide bonds, making them of interest in drug design (Vetter et al., 2011; Utkin, 2015). Venomous creatures, such as spiders, scorpions, and cone snails, have been well-studied and several databases have been established to collate the data on the toxins present (Kaas et al., 2008, 2012; Kuzmenkov et al., 2016; Pineda et al., 2018). Although cnidarians are generally less well-studied (Macrander et al., 2018), information about the toxins present is currently expanding.

Perhaps the most well-known of the cnidarian organisms are jellyfish. Their venom can be extremely potent, acting not only on small marine prey organisms but also causing severe physiological effects in humans (Tibballs, 2006; Tibballs et al., 2011; Remigante et al., 2018). Although not as harmful as some jellyfish, other cnidarians, such as select sea anemones, can elicit a stinging sensation in humans when the nematocytes in the tentacles are stimulated (Lubbock et al., 1981; Garcia-Arredondo et al., 2016). Several sea anemone toxins have been well-characterized, including an analog of a ShK toxin from Stichodactyla helianthus, which has entered Phase 2 trials for autoimmune diseases (Pennington et al., 2009; Chi et al., 2012; Prentis et al., 2018). There have been several recent reviews of sea anemone toxins regarding their bioactivity as well as their potential uses in the field of pharmaceutical development (Prentis et al., 2018; Thangaraj et al., 2018; Madio et al., 2019; Utkin et al., 2019). Corals are often overshadowed by the highly potent and potentially life-threatening toxins from jellyfish and clinically relevant sea anemones, but they too possess toxins of interest.

Much of the research on corals has focused on climate change impacts and secondary metabolites. In the past year alone, more than 3,750 journal articles were published on "corals and climate change", compared with around 388 on "corals and toxins" (Google Scholar, 2 August 2019). The majority of research on corals and their secondary metabolites is not necessarily venom related. For example, metabolites from the gorgonian coral Erythropodium caribaeorum have been shown to act as a deterrent against reef fishes predating on the coral (Fenical and Pawlik, 1991). However, more recent studies have characterized venom-derived coral toxins with potential in the field of drug development (Radwan et al., 2002; Frazao et al., 2012; Rodriguez et al., 2012; Garcia-Arredondo et al., 2016). There is significant scope for future studies aimed at characterizing coral toxins, and in this mini-review we highlight some of the structural and functional diversity that has already been uncovered.

# CORAL TOXINS

Corals, primarily grouped into stony corals and soft corals, are members of the Anthozoa class of the phylum Cnidaria as shown in the phylogenetic tree in **Figure 1**. Toxins have been characterized from four of the Anthozoa orders, and examples of these toxins are given in **Table 1** to highlight the structural diversity, range of bioactivities and potential applications. The majority of research into toxins from organisms in the Anthozoa class has focused on sea anemones, because of the exciting therapeutic potential of some of the toxins, and several recent reviews have been published in this area (Prentis et al., 2018; Liao et al., 2019; Madio et al., 2019). Although there are distinct differences between sea anemones and corals, with stony corals having calcium carbonate skeletons in contrast to sea anemones (Shick, 1991; Stanley, 2003), the two groups are in the same class and might share similar compounds. We are only just beginning to appreciate the diversity of compounds present in the nematocysts of corals and potentially toxins present elsewhere in the tissue of these organisms. There is evidence that toxins can be delivered from anatomical structures other than nematocytes in sea anemones, with differences in localization between species (Moran et al., 2012; Bastos et al., 2016). It is possible a similar phenomenon also occurs in corals. While sea anemone toxins, such as the ShK toxins, have not been found in corals to date, there are examples of other toxins and toxin families, as outlined below for stony corals and soft corals, which highlight the importance of characterizing coral toxins.

# Scleractinia (Stony Coral)

Stony corals (Scleractinia) are reef building corals that absorb calcium carbonate from the water to form a hard skeleton, and occur in colonial or solitary aggregates (Stanley, 2003). The most well-characterized toxins from stony corals are a family of peptides termed small cysteine-rich peptides (SCRiPs) found in the ectoderm of a common stony coral, Acropora millepora (Sunagawa et al., 2009). SCRiPs contain a conserved eight-cysteine framework, similar to the rattlesnake myotoxin domain in crotamine toxins (Jouiaei et al., 2015a), but there is limited sequence conservation beyond the cysteine residues. SCRiPs were originally thought to be found strictly in Scleractinia corals, but recent studies have shown that homologs of SCRiPs are found in the sea anemones Anemonia viridis and Metridium senile (Jouiaei et al., 2015a). SCRiPs were also originally thought to be involved with biomineralization in reef building corals by playing a role in calcification of the skeleton of the coral, but there is now evidence that they are a family of toxins rather than calcifying proteins (Jouiaei et al., 2015a). Two SCRiPs originally found in the ectoderm of the coral A. millepora were recombinantly expressed and incubated with zebrafish larvae. The larvae became paralyzed and insensitive to touch, consistent with neurotoxic action (Jouiaei et al., 2015a). The presence of SCRiPs in the ectoderm of A. millepora is also consistent with a role in prey envenomation as the ectoderm is lined by nematocytes (Grasso et al., 2011; Jouiaei et al., 2015a). Overall, SCRiPs remain of interest for further research because of their conserved cysteine framework and their presence in a variety of Anthozoa organisms.
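The idea of a conserved cysteine framework with little conservation elsewhere can be made concrete with a short Python sketch. The two peptides below are invented purely for illustration and are not real SCRiP sequences; the helper name `cysteine_framework` is likewise hypothetical. The function reduces a peptide to the spacing between consecutive cysteines, so that two sequences can share a framework even when none of their other residues match.

```python
def cysteine_framework(peptide):
    """Return the spacing between cysteines, e.g. 'C-x3-C-C-x2-C'."""
    positions = [i for i, residue in enumerate(peptide) if residue == "C"]
    if not positions:
        return ""
    parts = ["C"]
    for previous, current in zip(positions, positions[1:]):
        gap = current - previous - 1  # residues between consecutive cysteines
        if gap:
            parts.append(f"x{gap}")
        parts.append("C")
    return "-".join(parts)

# Hypothetical peptides, invented for illustration (not real SCRiP sequences).
peptide_a = "ACKDGCPRGCCSNGKCVCTPWCEKC"
peptide_b = "GCRSTCLNPCCEQAHCFCMDYCTRC"

framework_a = cysteine_framework(peptide_a)
framework_b = cysteine_framework(peptide_b)
print(framework_a)
print(framework_a == framework_b, peptide_a.count("C"))
```

Comparing frameworks in this way is only a first-pass screen; assigning a peptide to a toxin family would still require alignment, structural, and functional evidence of the kind described in the studies cited above.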

It appears likely that SCRiPs are not the only toxins present in stony corals. Analysis of extracts from 11 different Scleractinian coral families collected from Heron Island on the Great Barrier Reef showed variable levels of toxicity in several assays including toxicity to mice, haemolytic activity, and antimicrobial activity (Gunthorpe and Cameron, 1990). While the toxicity was variable, the majority of the species tested displayed some level of toxicity. Although the compounds responsible for the bioactivity were not characterized in this study, a recent analysis of extracts from the nematocysts of three stony corals, Pseudodiploria strigosa, Porites astreoides, and Siderastrea siderea, indicated the presence of a range of toxins and provided insight into the chemical composition. Extracts from all three corals were lethal to crickets, had haemolytic and nociceptive activity to varying extents, and exhibited PLA<sub>2</sub> and serine protease activities (Garcia-Arredondo et al., 2016). Interestingly, although these corals are not considered harmful to humans, the activity of these extracts is consistent with the physiological effects caused in humans by some hydroids, such as Millepora alcicornis and Millepora complanata, where the toxins work as lysins on erythrocytes (Garcia-Arredondo et al., 2016). Analysis of the extracts with SDS-PAGE indicated the presence of a broad range of proteins that differ under reducing conditions, and mass spectrometry analysis of a low molecular weight fraction indicated the presence of peptides with molecular weights in the range 3,000–6,000 Da. These peptide fractions were subsequently shown to be lethal to crickets and cause vasoconstriction. Further study is required to characterize these toxins, but it is possible that some will have similarity to the SCRiP family, or that the proteins with protease activity will have similarity to proteases present in other venoms, such as snake venom.

Toxins have also been predicted from the proteomic analysis of proteins discharged from nematocysts of the stony coral Acropora digitifera, and from the genome of this organism (Gacesa et al., 2015). A total of 55 potential toxins were predicted based on the genome but only 12 were found based on the proteomic analysis of the nematocyst extracts (Gacesa et al., 2015). These toxins are suggested to be phospholipases and toxic peptidases based on their similarity to other known toxins found on Tox-Prot (Gacesa et al., 2015). Furthermore, a haemolytic toxin from the actinoporin family has recently been characterized in Stylophora pistillata and was suggested to be a non-nematocyst protein (Ben-Ari et al., 2018).

TABLE 1 | Selected toxins present in the class Anthozoa, phylum Cnidaria.

*<sup>a</sup>Monastyrnaya et al. (2016).*

*<sup>b</sup>Diochot et al. (2004), Moreels et al. (2017).*

*<sup>c</sup>Sunagawa et al. (2009).*

*<sup>d</sup>Castaneda et al. (1995), Prentis et al. (2018).*

*<sup>e</sup>Liao et al. (2018).*

*<sup>f</sup>Lazcano-Perez et al. (2016, 2018).*

# Alcyonacea (Soft Coral)

Soft corals (Alcyonacea), formerly known as gorgonian corals, contrast with stony corals in that they do not create a calcium carbonate skeleton (Alarif et al., 2019). In a similar study to the Scleractinia coral extract analyses, Radwan et al. (2002) demonstrated the effects of venom from three different soft corals on mice. The corals, Nephthea sp., Dendronephthya sp., and Heteroxenia fuscescens, were collected from the Red Sea and are known to cause a stinging effect in humans (Radwan et al., 2002). The data showed that extracts from nematocysts of all three corals resulted in fractions with bioactive effects including lethality to mice, haemolysis, vasopermeability, or dermonecrosis, with toxins from H. fuscescens the most lethal to mice (Radwan et al., 2002). Similar to the studies on stony corals, SDS-PAGE analysis of the venom extracts indicated the presence of a wide range of proteins, from ∼200 kDa to <6 kDa. The bioactivity was not restricted to one class of protein, with two fractions from the H. fuscescens extract showing potent haemolytic activity, one fraction containing a protein of 116 kDa and the other containing a peptide of <6 kDa. The addition of a variety of lipid membrane components to the venom, followed by addition of this mixture to human red blood cells, protected the cells against the crude venom toxins (Radwan et al., 2002). The inhibition of a physiological response in the cells suggests that the binding site is occupied, with the most effective inhibition occurring by the addition of dihydrocholesterol (Radwan et al., 2002). Occupation of the binding site prevents the toxin from binding and therefore from eliciting a physiological response in the cells. Furthermore, this research also tested the mice for antibody production. Mice injected with a dose of venom, and provided with boosters throughout the study, produced an immune response in 15 days with high levels of antibodies present in the blood (Radwan et al., 2002). Further studies are required to fully characterize the bioactive peptides and proteins present in the extracts.

Non-proteinaceous toxins are also present in soft corals. For example, the small molecule toxin sarcophine was isolated from the soft coral Sarcophyton glaucum and is toxic to fish as well as mice, rats and guinea pigs (Ne'eman et al., 1974). Ingestion led to a decrease in cardiac and pulmonary function as well as motor function and body temperature of the animals (Ne'eman et al., 1974). Using guinea pig ileum, it was shown that sarcophine acts as a competitive inhibitor of cholinesterase (Ne'eman et al., 1974). This coral was originally studied for its ecological characteristics when Ne'eman et al. observed that the fish in the area were not predating on this specific coral (Ne'eman et al., 1974), and subsequent studies led to the characterization of sarcophine.

There have also been large-scale studies on toxins of soft corals found on the Great Barrier Reef in which 136 different specimens from 15 different genera were analyzed (Coll et al., 1982). In this study two genera were found to be the most toxic and lethal to the fish species tested: Lemnalia and Sarcophyton (Coll et al., 1982). The different genera of coral tested exhibited a large range of effects, from no noticeable effect on the fish to causing death. This study and the studies on sarcophine involved extraction of coral tissue rather than nematocyst extracts, which makes it difficult to identify where the compound comes from in the coral, but these studies demonstrate the diversity of the compounds from corals and the potent activity they can possess.

# EVOLUTION OF CORAL TOXINS

Insight into the evolution of coral toxins is primarily based on the SCRiP family of peptides, as these are the most well-characterized to date. As mentioned above, in contrast to the original suggestion, SCRiPs are not only found in corals but have been found in the sea anemones Anemonia viridis and Metridium senile (Jouiaei et al., 2015b). The toxin τ-AnmTx Ueq 12-1 isolated from the sea anemone Urticina eques also shows similarity to SCRiPs; in particular, one cDNA matched the SCRiP Anthopleura elegantissima comp63456\_c0\_seq1 (Logashina et al., 2017). The presence of SCRiPs in corals and sea anemones suggests that these proteins evolved more than 500 million years ago, the estimated time when coral diverged from sea anemones (Shinzato et al., 2011). Furthermore, molecular evolutionary assessments indicate that coral SCRiPs have evolved under negative selection, as no sites were found that were positively selected based on the Bayes Empirical Bayes approach (Jouiaei et al., 2015a). The role of negative selection in coral toxin evolution is supported by studies on the toxins of Acropora digitifera and Stylophora pistillata (Gacesa et al., 2015; Ben-Ari et al., 2018). Interestingly, this appears to be a general phenomenon for venoms of ancient lineages, whereas toxins from lineages that have evolved more recently appear to evolve under positive selection (Lynch, 2007; Casewell et al., 2011, 2012; Sunagar et al., 2012, 2013, 2014; Brust et al., 2013; Dutertre et al., 2014; Jouiaei et al., 2015a; Sunagar and Moran, 2015).
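For readers less familiar with such tests, selection on protein-coding sequences is conventionally summarized by the ratio of the non-synonymous substitution rate ($d_N$) to the synonymous substitution rate ($d_S$). The notation below is the standard textbook form and is not taken from the cited studies; it is included only as a reminder of how negative and positive selection are usually distinguished.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{negative (purifying) selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive (diversifying) selection}
\end{cases}
```

In site-based analyses, a Bayes Empirical Bayes calculation is typically used to estimate, for each codon, the posterior probability that it belongs to a class with $\omega > 1$; finding no such sites is consistent with a gene family that has evolved predominantly under negative selection.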

Given the large number of toxins that appear to be present in coral venom, further molecular characterization is likely to provide further insight into the evolution of this ancient lineage. In particular, characterization of some of the larger toxins might provide insight into origins of a range of toxins, as analysis of the venom from the sea anemone Stichodactyla haddoni showed that some venom peptides have similar sequences to housekeeping proteins involved in regulatory biological functions (Madio et al., 2017). This is a common trend across many venomous taxa because it is suggested that the main way that toxins are recruited is via gene modification of regulatory proteins, such as sequence duplications (Fry et al., 2009). It has been shown that cnidarian organisms rely on similar structural frameworks of their toxins and then modify these toxins for activity on specific targets (Honma and Shiomi, 2006; Prentis et al., 2018). Because of this evolution from non-toxin related proteins, we see variability in the size of peptides found in nematocyte venom as the evolution of each peptide differs greatly (**Table 1**).

Despite the similarities between cnidarian organisms, such as coral and sea anemones, it is also likely that significant differences in the evolution of toxins will be found, as suggested by an analysis of proteins found in the nematocysts of organisms from three different classes of Cnidaria, namely Anthozoa, Scyphozoa, and Hydrozoa (Rachamim et al., 2015). The organisms analyzed from these three classes were the sea anemone Anemonia viridis (Anthozoa), the jellyfish Aurelia aurita (Scyphozoa) and the hydrozoan Hydra magnipapillata (Hydrozoa). Although this analysis led to the identification of hundreds of proteins, only six proteins were common to all three species (Rachamim et al., 2015). Of these, most were structural proteins and only one of the six proteins common to all three species, the dickkopf protein, is predicted to function as a toxin (Rachamim et al., 2015). The A. aurita and H. magnipapillata venoms showed the most similarities, being mainly composed of cytotoxins and enzymes, while the A. viridis venom proteome composition was predominantly related to peptide neurotoxins (Rachamim et al., 2015). The general lack of conservation across these cnidarians might point to significant evolutionary differences, and further study of cnidarian toxins promises to provide interesting insights into toxin evolution in general.

# CHALLENGES IN CNIDARIAN TOXIN ANALYSES

In the study of venomous creatures, such as spiders and cone snails, it can often be quite straightforward to isolate the venom with limited contamination from the environment or other tissues. Indeed, Australian funnel-web spiders (Atracidae) can release microlitres of venom onto their fangs that can be "easily" recovered (Wilson and Alewood, 2004). However, for corals and cnidarian organisms in general this is not always the case, which can complicate the toxin extraction process (Garcia-Arredondo et al., 2016). A range of extraction methods have been used in the analysis of corals and sea anemones, including extraction of the toxins from the nematocyte in the tissue of the coral (Garcia-Arredondo et al., 2016) and homogenization of the whole tentacle of the sea anemone (Prentis et al., 2018). Homogenization of the whole tentacle will clearly yield more than just venom toxins, but even extraction of the toxins from the nematocyst can be complicated, given the small size of the nematocysts. Furthermore, it can be difficult to separate out the calcareous skeleton of the coral from the tissue itself (Garcia-Arredondo et al., 2016). The cellular contents can have implications for interpretation of the results because some components, such as minicollagens, share characteristics similar to the cysteine-rich toxins of interest (Madio et al., 2017).

The difficulties in defining the origin of compounds, either from the nematocyst or other tissue, have significant implications for elucidating evolutionary relationships for these toxins. In particular, it is difficult to determine a common ancestor (Kayal et al., 2018). Genome and transcriptomic analyses are likely to provide further insight into the evolution of cnidarian toxins. Using an integrative approach of both genomic and transcriptomic analyses allows for a better understanding of the active toxins found in organisms compared to the potential toxins seen in the genome and will also allow comparison with other toxins from a range of venomous creatures.

# CONCLUSION

Coral venoms are a source of interesting novel bioactive molecules with significant scope for further characterization of novel toxins. Further application of analysis technologies (e.g., genomics, transcriptomics, and proteomics) is likely to significantly enhance the knowledge in this field and identify novel classes of peptides/proteins. The well-characterized coral toxins, SCRiPs, have now been identified as likely neurotoxins, but given the highly microbial environment in which corals exist, further analysis of the nematocyst components of corals is likely to provide a new and unique source of antimicrobial molecules. Studies on coral extracts have indicated that such compounds exist. In addition, determining the composition of peptides that make up the venom from corals may provide insight into the overlap and differences between cnidarian groups.

# AUTHOR CONTRIBUTIONS

CS, ND, and DW wrote the manuscript.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Schmidt, Daly and Wilson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Snake Venom in Context: Neglected Clades and Concepts

### Timothy N. W. Jackson<sup>1</sup> \* † , Hadrien Jouanne<sup>2</sup> and Nicolas Vidal <sup>2</sup> \* †

<sup>1</sup> Australian Venom Research Unit, Department of Pharmacology and Therapeutics, University of Melbourne, Melbourne, VIC, Australia, <sup>2</sup> Muséum National d'Histoire Naturelle, UMR 7205, MNHN, CNRS, Sorbonne Université, EPHE, Université des Antilles, Institut de Systématique, Evolution et Biodiversité, Paris, France

Despite the fact that venom is an intrinsically ecological trait, the ecological perspective has been widely neglected in toxinological research. This neglect has hindered our understanding of the evolution of venom by causing us to ignore the interactions which shape this evolution, interactions that take place between venomous snakes and their prey and predators, as well as among conspecific venomous snakes within populations. In this opinion piece, we introduce and briefly discuss several ecologically oriented concepts that may be of interest to toxinologists, before reviewing a range of non-front-fanged snake taxa that have been neglected toxinologically, but which represent the majority of extant ecological diversity amongst snakes. We conclude by noting that the ecological perspective even has something to offer to clinical toxinology, in the wake of the World Health Organization reinstating snakebite envenoming to its list of Neglected Tropical Diseases.

Keywords: reptile, snake, evolution, ecology, toxin

# INTRODUCTION

Venom is a functional trait, used by one organism to subjugate or deter another (Jackson and Fry, 2016). Without this relationship between the venomous and the envenomed, it makes no sense to speak of "venom" – an organism may produce a plethora of potentially toxic compounds, but if they have not been selected for functional deployment in the subjugation or deterrence of other organisms, they are not venom. Thus, venom is an intrinsically relational trait. Ecology is the study of relationships amongst organisms and with their environment, therefore venom is "ecological" by definition. Despite this simple truth, the study of venom ecology—the usage and evolution of venom in context—has been a relatively neglected area in snake venom research and in toxinology more broadly. In the following opinion piece, we briefly discuss several ecologically-oriented concepts that have been neglected in venom research, but which we feel are worth considering within the field. Following this, we review some of the extant diversity of non-front-fanged venomous snake species which have received little to no attention in toxinological circles.

# NEGLECTED CONCEPTS

## Context in Molecular Evolution

### Edited by:

Maria Vittoria Modica, Stazione Zoologica Anton Dohrn, Italy

### Reviewed by:

Juan J. Calvete, Spanish National Research Council (CSIC), Spain Kevin Arbuckle, Swansea University, United Kingdom

### \*Correspondence:

Timothy N. W. Jackson timothy.jackson@unimelb.edu.au Nicolas Vidal nicolas.vidal@mnhn.fr

†These authors have contributed equally to this work

### Specialty section:

This article was submitted to Chemical Ecology, a section of the journal Frontiers in Ecology and Evolution

Received: 30 April 2019 Accepted: 21 August 2019 Published: 06 September 2019

### Citation:

Jackson TNW, Jouanne H and Vidal N (2019) Snake Venom in Context: Neglected Clades and Concepts. Front. Ecol. Evol. 7:332. doi: 10.3389/fevo.2019.00332

The importance of context for the evolution of venom begins at the molecular level. Proteins and the genes that encode them exist in their own interdependent ecological webs. Genes are located within structured genomes and their location within these networks has a profound impact on their evolution. For example, the arrangement of a genomic neighborhood affects the propensity that a given segment (containing one or more "genes") within that region has for duplicating (Reams and Roth, 2015). Not only is this broadly relevant to the origin of novel functions via a number of pathways in which duplication plays a role, it is specifically relevant to the evolution of venom, as many important toxins are members of gene families which have expanded through duplication (Fry et al., 2009). Proteins themselves have been described as "fundamentally relational entities" (Guttinger, 2018), since they must interact with other molecules in order to effect their own functional roles; a protein in isolation is impotent. Exophysiological proteins, such as venom toxins, which interact with targets in the bodies of prey or predators (i.e., secondary organisms) vividly illustrate this point. However, it is no less the case for endophysiological molecules functioning within the ecosystem of other molecules that comprises the physiology of the primary organism. Furthermore, the ecology of molecules consists of more than molecular interactions and extends to the "environmental" features, such as pH and temperature, which make these interactions possible and influence their rate, potency, etc.

Once the role of context in influencing the function of a protein is acknowledged, the importance of changes in context for the evolution of novel functions becomes clear. Again, the evolution of venom toxins, in particular their "recruitment" into venom arsenals, is instructive. In a recent study, Koludarov et al. (2019) reconstructed the history of the phospholipase A2g2 gene family, from which all viperid snake phospholipase A2 toxins originate. In tracking all duplication events that have occurred within the family since the most recent common ancestor of amniotes, the authors noted that duplication seemed to follow the origin of novel functions, rather than predate it as in Ohno's (1970) influential "neofunctionalization" model. In consideration of the recruitment of this family as a toxin in toxicoferan reptiles, it was inferred that this novel function arose when a shift in tissue-specific patterns of gene expression facilitated an association between this secretory protein and the dental glands of an early toxicoferan or ancestral snake. The association of this enzyme with an incipient delivery system was the first step toward exposing it to a radically novel context. Whilst the gene family had been ancestrally involved in innate immunity, fighting pathogens in an endophysiological context, its secretion in dental glands made possible its exophysiological deployment against other multicellular organisms. However, it was not until a delivery system capable of inoculating this potential toxin deeply into a prey organism's tissue evolved that the genuine functional novelty of a role within venom emerged in this gene family (Koludarov et al., 2019). Thus, it is only in viperid snakes that these proteins became a significant part of venom, a novel function precipitated by the exposure of the molecule to a novel context. Selection for the toxin function in turn enabled viper-specific expansions of the gene family, with associated neofunctionalizations.

The aforementioned role of secretory context in the origins of novel protein functions is highly analogous to the concepts of ecological opportunity and ecological release. These concepts have been central to discussions of adaptive radiation for at least 70 years (e.g., Simpson, 1949). According to such models, adaptive radiations occur following colonization of new environments, the origin of a novel functional trait/innovation, or escape from antagonistic/competitive species (Gavrilets and Vose, 2005; Yoder et al., 2010). Consideration of the origins of novel functions, however, reveals their similarity to the colonization of new environments—a change of context at either the protein or the organismal level results in opportunity for diversification. Furthermore, "escape from adaptive conflict" is a term used to describe a subfunctionalization process in which gene duplication facilitates the distribution of the multiple functions possessed by a "parent" gene amongst its duplicate "offspring" (Innan and Kondrashov, 2010). This segregation allows each gene to specialize for a particular function, a process often accompanied by segregation of the expression patterns of each gene product. Highlighting the close analogy between gene and organismal evolution, "escape and radiate coevolution" describes a process in which an organism evolves a novel trait which allows it to (literally) escape the adaptive constraints imposed upon it by an antagonistic species, opening a pathway to further diversification (see below for further discussion of coevolution).

# Context and Evolvability

"Evolvability" is a term that has been used to refer to a variety of phenomena influencing the propensity of a lineage to diversify and evolve novel functions. This has been described as the "evolutionary potential" of the lineage (Sterelny, 2007) and as "the ability of a biological system to produce phenotypic variation that is both heritable and adaptive" (Payne and Wagner, 2019). More specifically, evolvability is the result of unselected properties (e.g., mutation rate) a lineage possesses that constrain or facilitate that lineage's access to areas of "phenotype space" (i.e., the range of possible future phenotypes available to that lineage— Brown, 2014). This definition remains problematic, however, because the degree to which such properties remain unselected is controversial. Regardless of this point, venom genes, which appear to evolve rapidly (Chang and Duda, 2012; Casewell et al., 2013; Sunagar et al., 2013) are excellent candidates for research aimed at understanding the factors influencing evolvability.

Although a full treatment of the evolvability of venom is beyond the scope of this article, a few points are worth mentioning. Evolvability is increased in lineages that are "robust" enough to acquire variation neutrally (Wagner, 2012; Jackson et al., 2016). At the molecular level, this neutral variation is the substrate from which novelty emerges—unselected changes may result in the capricious discovery of functional novelty, upon which selection subsequently acts. This process highlights the close relation between "exaptation" (Gould and Vrba, 1982) and evolvability—the former describes an unselected property becoming a positively selected function (Jackson and Fry, 2016), whereas the latter describes the general propensity for this to occur within a lineage. The more neutral variation an evolving lineage can accumulate, the higher the chance of such a fortuitous occurrence and hence the more "evolvable" that lineage becomes.
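The closing claim of this paragraph can be made concrete with a toy calculation. The sketch below is an illustration only, not a model from the cited literature: it assumes that each neutral variant independently has the same small chance of turning out to be useful, and both the function name `chance_of_novelty` and the probability value are hypothetical.

```python
def chance_of_novelty(n_variants: int, p_useful: float) -> float:
    """Probability that at least one of n neutral variants proves useful.

    Assumes (for illustration only) that variants are independent and
    share the same hypothetical per-variant probability p_useful.
    """
    if n_variants < 0:
        raise ValueError("n_variants must be non-negative")
    if not 0.0 <= p_useful <= 1.0:
        raise ValueError("p_useful must lie between 0 and 1")
    # Complement of "none of the n variants is useful".
    return 1.0 - (1.0 - p_useful) ** n_variants


if __name__ == "__main__":
    # Hypothetical per-variant probability; not an empirical estimate.
    for n in (10, 100, 1000):
        print(n, round(chance_of_novelty(n, 0.001), 4))
```

Under these assumptions the chance of a fortuitous discovery rises monotonically with the amount of neutral variation, which is the sense in which a more "robust" lineage is more evolvable.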

Toxin genes have been independently recruited in diverse venomous taxa from a number of gene families that possess properties that facilitate the accumulation of neutral variation (Fry et al., 2009; Casewell et al., 2013). One such property is a high apparent rate of duplication—venom genes often originate from within multi-copy gene families and once a locus is recruited to a venom system further locus-specific expansion frequently takes place (Fry et al., 2009; Koludarov et al., 2019). The propensity for a locus to duplicate is believed to be an unselected consequence of its genomic context—the arrangement of genes surrounding it (Reams and Roth, 2015). Thus, it is not only the context in which a gene product is deployed that influences its functional evolution, but also the intragenomic context in which it is located. As with other factors influencing evolvability, it is controversial whether or not duplication propensity is always unselected, or rather might be positively selected in certain contexts. Evidence of increased duplication rates within sub-families of genes that have been recruited as venom toxins suggests that in some cases the origins of a novel function may lead to selection for increased duplication, or at least influence the preservation of duplicate genes (see Koludarov et al., 2019 for detailed discussion).

# Interactions Between Venomous Snakes and Their Prey

Direct study of the ecological relationships between venomous snakes and their prey, predators, and conspecifics (which may also be predators or prey!) presents many challenges. It is far easier to simply collect venom, analyse it in a laboratory setting, and then correlate the resultant data with previously accumulated knowledge of diet and behavior, than it is to perform integrated "eco-toxinological" studies. Correlating previously collected dietary data with new data on venom composition and activity is illuminating and may result in the generation of hypotheses that guide future research in venom evolution (e.g., Jackson et al., 2016; Zancolli et al., 2019). Adequately testing these hypotheses, however, often requires a more direct approach (e.g., prey-handling experiments; taxon-specific toxicity testing, etc.), which may be difficult to implement (although see Barlow et al., 2009 for a nice example of such work). The simple issue of practicality goes a long way toward explaining the neglect of ecology within toxinology. However, pragmatism should not prevent venom researchers from formulating research questions from an ecologically informed perspective. Given the intrinsic interdependence of venomous organisms and their prey, as well as the interdependence of both with the broader environment, ecology cannot fail to have a profound influence on the evolution of venom.

One area of research that is considerably limited by the challenges of conducting genuinely "ecological" venom research concerns cost-benefit analyses of venom production and deployment. Although an interesting literature exists on this subject (see Morgenstern and King, 2013; Evans et al., 2019), it typically concentrates on the cost of venom, whilst neglecting the benefit it confers. Venomous organisms may utilize various methods (e.g., venom metering; secretions with reduced protein content, etc.) to minimize the cost of venom, but effectively investigating whether or not venom is a "costly" trait, relative to other defensive and offensive strategies, requires an integrated approach. The metabolic cost of venom must be compared to that of other strategies, e.g., constriction in snakes, and the costs of each strategy offset against the energetic gains made accessible by their deployment. There are no free lunches in nature and it may be that venom is a particularly cost-effective strategy for acquiring larger amounts of energy in a single meal (i.e., subduing a larger prey item) than would otherwise be possible. Unfortunately, the kind of experiments required to comprehensively investigate these questions are complicated and costly (and may not provide the requisite career gains to offset these costs!).
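The comparison described above, in which the cost of each strategy is offset against the energy it makes accessible, can be sketched as simple bookkeeping. Everything in the example is an assumption made for illustration: the `Strategy` fields, the expected-value formula, and all numbers are hypothetical and are not measurements from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """A hypothetical prey-subjugation strategy (all values illustrative)."""
    name: str
    cost_kj: float              # assumed metabolic cost per predation attempt
    success_probability: float  # assumed chance the attempt yields a meal
    prey_energy_kj: float       # assumed energy content of the prey subdued


def expected_net_gain(strategy: Strategy) -> float:
    """Expected energy gained per attempt minus the cost of the attempt."""
    expected_intake = strategy.success_probability * strategy.prey_energy_kj
    return expected_intake - strategy.cost_kj


if __name__ == "__main__":
    # Hypothetical numbers: venom costs more per attempt here but gives
    # access to a larger prey item than constriction does.
    venom = Strategy("venom", cost_kj=5.0, success_probability=0.6,
                     prey_energy_kj=200.0)
    constriction = Strategy("constriction", cost_kj=2.0,
                            success_probability=0.6, prey_energy_kj=100.0)
    for s in (venom, constriction):
        print(s.name, expected_net_gain(s))
```

The point of the sketch is only that a strategy with a higher cost per attempt can still yield a larger net gain if it makes larger meals accessible; whether this holds for real snakes is exactly the empirical question raised in the text.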

The primary function of venom for venomous snakes is prey subjugation (Jackson and Fry, 2016). Although many studies have attempted to correlate venom composition with prey type, the nature of the interaction between predator and prey is influenced by far more than the species (or order) that each belongs to. Additional considerations may include the physiological states of both organisms at the time of the encounter, as well as their relative body sizes. Brief examples will serve to illustrate each of these points.

The physiological state of homeothermic endotherms (e.g., typical small mammals) may be relatively constant, whereas that of poikilothermic ectotherms is widely varying. Snakes that feed predominantly upon mammals, therefore, may have options unavailable to those that specialize in feeding upon reptiles. These options may include a high percentage of enzymatic toxins, which are rate-limited by temperature, in their venom (Jackson et al., 2016). Such toxins may be additionally favored for deployment against mammals because of the rapidity of their action against prey which are potentially dangerous and possess a capacity for sustained exertion which far exceeds that of a snake. The situation is considerably more complex for snakes feeding upon poikilothermic reptiles whose physiological state fluctuates widely according to time of day, level of activity, and seasonally (Pough et al., 2016). For a snake that hunts lizards during the day, when the lizards are themselves active and may have been basking or otherwise maintaining an elevated body temperature and metabolic rate, enzymatic toxins may still be an effective strategy. On the other hand, for a snake that feeds on the same species of lizard at night, when it is sheltering and its metabolic rate is depressed, another strategy, perhaps non-enzymatic neurotoxins coupled with constriction, may be preferable.

One of the obvious advantages of venom as a predatory strategy is the ability it confers to subdue larger prey than would be possible with purely physical means. For snakes, there may be trade-offs among prey size (snake), mobility, and subjugation strategy (e.g., venom vs. constriction). Regardless, even for a venomous snake, the relative size of a prey animal may exert considerable influence on the outcome of a predation attempt. Prior conceptions of venom as (definitionally) causing "rapid prey death" have long obscured the details of the "battle" that may take place when a venomous snake encounters a potential prey item. The view has often seemed to be that the primary challenge for the snake is to get close enough to strike successfully, after which the venom takes care of the rest. For a generation of researchers which honed its intuitions by studying rattlesnakes, this is entirely reasonable. As evolutionary toxinologists have diversified their research interests, however, it has become clear that rattlesnakes (along with mambas, taipans, and other venomous snakes employing a bite-and-release strategy) are exceptional cases, and that "rapid prey death" (which is conceptually distinct from rapid incapacitation, but may be equivalent to it at the level of selection; Jackson and Fry, 2016) may be an unusual rather than typical consequence of envenoming (Fry et al., 2012). In contrast, struggles between venomous snakes and some prey animals may take many minutes or even hours to resolve and there is no guarantee of success for the snake even after delivering its venom.

Observations of snakes feeding upon frogs, for example, vary from occasions in which the amphibians are consumed within seconds, apparently whilst still alive, to drawn-out battles. The key variable here does not seem to be the potency of the snake's venom, or the sophistication of its venom delivery system, but rather the size of the frog relative to the size of the snake's head. One of the authors (TNWJ) has observed rear-fanged tree snakes from the genera Boiga and Dendrelaphis swallowing small frogs rapidly (almost too quickly to take photographs) and conversely observed front-fanged elapid snakes struggling to subdue larger frogs over periods exceeding 1 h. In a situation in which predator and prey are almost evenly matched physically, the "chemical edge" provided by venom may be a decisive factor. Indeed, this edge (though initially slight) is likely what first allowed the venomous function to be positively selected in ancestral snakes or even ancestral toxicoferan lizards (Fry et al., 2013; Jackson et al., 2017). Long before venom could rapidly subdue prey, it may have only slightly weakened it. This much is intuitively obvious; what may be less so is the fact that a snake possessing an "advanced high-pressure delivery system" and venom that could subdue a mouse in seconds may have to battle a frog for an hour and still not be guaranteed a meal. Prey type, size, and physiological state, and the interactions among them, must therefore all be considered if we are to bring a genuinely ecological perspective to toxinological research.

# Antagonistic Coevolution

As mentioned above, one of the factors traditionally associated with adaptive radiation is "escape" from antagonistic taxa. This form of adaptive radiation is often referred to as "escape and radiate coevolution" (Hui et al., 2015), and the evolution of chemical weaponry, both offensive and defensive, represents a prime example of this dynamic. Coevolutionary arms races are often invoked to account for diversity in venom composition and activity, as well as the dynamic molecular evolution of toxin genes (e.g., Casewell et al., 2013). On the other hand, interest in the evolution of resistance to venom in both the prey and predators of venomous organisms has also increased recently (after a long history of piecemeal research), highlighted by the publication of two excellent reviews (Holding et al., 2016; Arbuckle et al., 2017) detailing the multiple molecular innovations that confer this resistance. The evolution of chemical weaponry and the reciprocal evolution of resistance may be one of the most important drivers of functional diversity at the molecular level. Thus, the fledgling field of snake venom eco-toxinology has the luxury of being able to draw upon insights from investigations of ecological phenomena as seemingly disparate as plant-herbivore coevolution, responsible for a huge variety of plant secondary metabolites (Fraenkel, 1959), and antimicrobial drug-resistance, which represents "an increasingly serious threat to global public health" (WHO, 2018).

# Competing Strategies

Consideration of the interactions between venomous snakes and their prey/predators is instructive, but it should not obscure the fact that one of the primary drivers of evolution by natural selection is competition amongst varying phenotypes within populations—a snake's conspecifics are at least as influential on the evolution of its venom as its prey are. A recent study (Zancolli et al., 2019) of Mojave rattlesnakes (Crotalus scutulatus), a species that exhibits two distinct venom phenotypes, serves to illustrate this point, as well as the influence of other environmental factors. The study found no discernible difference in diet between Type A (neurotoxic) and Type B (haemotoxic) snakes. However, a strong association was found between venom type and climate, in which the neurotoxic type was found in regions with cooler winters and higher rainfall. Several possible explanations for this pattern are discussed in the paper, including climatic effects on prey availability, which may influence snake foraging strategies, in turn influencing exposure of the snakes to their own predators (a nice illustration of the interdependence of ecological factors). Insightfully, the authors point out that within widely distributed species, which occupy a diversity of environments across that distribution, there may be selection for locally optimal strategies leading to intraspecific diversity. Continuing this line of reasoning, we conjecture that these local optima could also result from the effect of climate on the taxon-specific effectiveness of specific venom compositions, and that the proximal driver of selection may be competition between the two strategies (as opposed to local prey availability, which would be a more distal pressure). Thus, whilst both strategies might be broadly successful, each possesses a slight competitive edge over the other in its "preferred" climate. 
Type B snakes rely on enzymatic metalloprotease (SVMP) toxins, which are likely rate-limited by temperature. The pre-synaptic neurotoxins favored in the Type A phenotype, on the other hand, are phospholipases which act non-enzymatically and are thus less affected by temperature. C. scutulatus are generalist predators and their diets include up to 30% reptiles alongside the more commonly consumed mammalian prey (Zancolli et al., 2019). As previously discussed, mammals, with their elevated metabolic rates, are often rapidly subdued by enzymatic toxins, but the effectiveness of these toxins against reptilian prey may vary according to the physiological state of the latter. In areas with milder winters, SVMPs may be a highly effective strategy for the rattlesnakes, but where temperature is more variable (a factor also influenced by precipitation rates) snakes with a rate-limited toxin arsenal may be outcompeted by those with a slower-acting, but temperature-invariant, neurotoxic strategy.
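The contrast between a temperature-limited and a temperature-invariant toxin can be illustrated with the Q10 temperature coefficient, a common rule of thumb for how biological rates scale with temperature. This is a sketch under stated assumptions rather than anything taken from the study: the source gives no rate law, and the function name `relative_rate`, the reference temperature, and the Q10 values are all hypothetical.

```python
def relative_rate(temp_c: float, ref_temp_c: float = 30.0,
                  q10: float = 2.0) -> float:
    """Rate at temp_c relative to the rate at ref_temp_c under the Q10 rule.

    q10 is the factor by which the rate changes per 10 degree C rise.
    q10 = 2.0 is an assumed, illustrative value for an enzymatic toxin;
    q10 = 1.0 represents a fully temperature-invariant toxin.
    """
    if q10 <= 0:
        raise ValueError("q10 must be positive")
    return q10 ** ((temp_c - ref_temp_c) / 10.0)


if __name__ == "__main__":
    # Hypothetical prey body temperatures in degrees Celsius.
    for prey_temp in (10.0, 20.0, 30.0):
        enzymatic = relative_rate(prey_temp, q10=2.0)
        invariant = relative_rate(prey_temp, q10=1.0)
        print(prey_temp, enzymatic, invariant)
```

With these assumed values, the enzymatic toxin loses half its relative activity for every 10 degrees of cooling in the prey, while the invariant toxin is unaffected, which is the kind of asymmetry the conjecture above relies on.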

# Intra-Populational Variation

Heritable phenotypic variation within populations is a precondition of evolution by natural selection (Godfrey-Smith, 2007). Venom is a highly evolvable trait (see above), and it is thus unsurprising that we should see intraspecific variation in venom phenotypes. This variation is rarely likely to be as neat as that documented for C. scutulatus, however. Another recent study (Smiley-Walters et al., 2019) has demonstrated considerable intra-populational variation in the toxicity of venoms of individual pygmy rattlesnakes (Sistrurus miliarius) to lizards. This work is in its early stages, but it is already significant in adding another wrinkle to our investigations of the evolution and ecology of snake venoms. Variation was found to be considerably higher within the population than between populations and each of these variants may be considered an incipient "strategy," though the majority of them will likely be transient. The authors described the variation as "functional"; in evolutionary terms, however, it might well be epiphenomenal (unselected/neutral; see Jackson and Fry, 2016). Again, a large amount of unselected phenotypic variation is exactly what we should expect to see in a highly evolvable trait: such neutral variation is the substrate from which functional variation grows as effective strategies are hit upon by chance and give their bearers a slight competitive edge over other members of the population. As the authors comment, follow-up studies using additional prey species may show whether there are trade-offs associated with variation in toxicity toward lizards (Smiley-Walters et al., 2019) and thus whether the variation is truly functional.

# NEGLECTED CLADES

The general neglect of an ecological perspective in toxinology has resulted in the neglect of the vast majority of ecological (and phenotypic) variants amongst extant snakes. As before, there are pragmatic reasons for this—venom research has naturally focused on the most common and most dangerous species of snake, and many venom researchers come from either clinical or biochemical backgrounds and may thus be unfamiliar with the cornucopia of research riches represented by snake diversity.

Snakes are the most species-rich squamate clade, numbering more than 3,700 extant species (Uetz et al., 2019). Our understanding of their higher-level relationships has recently progressed thanks to several molecular studies (reviewed in Miralles et al., 2018; Streicher and Ruane, 2018). They include ∼500 species of paraphyletic and fossorial "Scolecophidia" ("worm snakes"), and ∼3,200 species of Alethinophidia ("typical snakes"), almost 3,000 of them being Caenophidia ("advanced snakes"). Caenophidia includes three species of aglyphous file snakes (Acrochordidae), ∼740 front-fanged venomous snakes including Elapidae, Viperidae, and Atractaspidinae (genera Atractaspis and Homoroselaps only), and several lineages formerly known as "colubrids" i.e., Caenophidia devoid of a front-fanged venom system. Although authors continue to debate the subfamilial/familial rank of some of these lineages, they mostly agree on their content, and we will here recognize the following eight families: Xenodermidae, Pareidae, Homalopsidae, Lamprophiidae, Pseudoxenodontidae, Dipsadidae, Natricidae, and Colubridae (Uetz et al., 2019).

Snakes are all carnivorous. The spectrum of their diets is huge, ranging from the eggs of social insects to large mammals. Immobilization is achieved by simply holding the prey firmly within the mouth, encircling the prey in body coils and applying pressure to cause asphyxiation or cardiovascular dysfunction (constriction), or injecting the prey with toxic venom (Lillywhite, 2014). Snake venoms and their associated delivery apparatuses are integrated systems with established functions of subduing prey, and deterring predators (Jackson et al., 2016). They are much more widespread among snakes than previously thought (Vidal, 2002). Moreover, snakes display an exceptional diversity of oral glands (including venom glands), the functions of which remain unknown in many cases (Jackson et al., 2017).

Venoms and toxins of front-fanged dangerous snakes have been the subject of the vast majority of toxinological studies, but snakes devoid of a front-fanged venom system are much more diverse ecologically and phylogenetically and may therefore harbor an untapped potential of new toxins or toxin families. As several studies have found a relationship between diet and venom composition (da Silva and Aird, 2001; Barlow et al., 2009; Jackson et al., 2016; Modahl et al., 2018; Healy et al., 2019), we will here focus on snake lineages with particular diets and/or lineages that have been particularly neglected from a toxinological point of view.

# "Scolecophidia"

Although this assemblage is paraphyletic, all "scolecophidian" snakes feed primarily on insects, most notably social insects and their eggs (with a few exceptions such as the genus Acutotyphlops that feeds on earthworms). Phisalix (1922) first described temporomandibular glands that seem to be composed of serous secretory cells and include a duct. Two recent studies have brought surprising results. Using magnetic resonance imaging, Jackson et al. (2017) identified very large supralabial glands in a typhlopid while Martins et al. (2018) found that the largest glands in leptotyphlopids were infralabial ones. This is significant as leptotyphlopids ingest prey using a specialized mechanism named mandibular raking while typhlopids use an equally specialized maxillary mechanism (Kley and Brainerd, 1999; Kley, 2001). Further investigations of the function and composition of these well-developed serous glands are much needed.

# Homalopsidae

Also called "mud snakes," this family includes ∼55 species, all Asian, most of which are found in estuarine, marine, or freshwater environments. Apart from nine species (genera Brachyorrhos, Calamophis, Karnsophis) that feed primarily on earthworms, they are rear-fanged snakes considered to be mildly venomous. These rear-fanged species feed mostly on aquatic vertebrates (fish and amphibians), but two of them (genera Gerarda, Fordonia) are specialized on crustaceans (crabs) that they grapple with, envenom and dismember before eating (Murphy and Voris, 2014).

The gland transcriptome and venom proteome of one species (Cerberus rhynchops) has been analyzed, resulting in the discovery of a new snake venom protein family named veficolins (OmPraba et al., 2010). Homalopsidae is therefore a promising group in terms of bioprospecting.

# Lamprophiidae: Psammophiinae and Pseudoxyrhophiinae

The subfamily Psammophiinae includes seven genera and 53 species that are all rear-fanged. Most species are active diurnal snakes and opportunistic predators feeding on lizards, frogs and mammals (Pough et al., 2016). With a few exceptions (Psammophis mossambicus, Fry et al., 2008; Brust et al., 2013), psammophiine venoms have not been studied, although a new neurotoxin (rufoxin) was isolated from Rhamphiophis oxyrhynchus by Lumsden et al. (2007).

The subfamily Pseudoxyrhophiinae is the main snake radiation in Madagascar and the nearby Comoro Islands, with 19 endemic genera and 84 species. The genera Alluaudina, Compsophis, Ithycyphus, Langaha, Leioheterodon, Madagascarophis, Micropisthodon, Parastenophis, and Phisalixella are rear-fanged and display a large variety of phenotypes and ecologies (Ruane et al., 2015). With the exception of Leioheterodon madagascariensis (Fry et al., 2008), the composition of their venoms is unknown.

# Non-front-fanged Species Specialized on Venomous Arthropods

In addition to the well-studied case of some saw-scaled vipers (genus Echis) feeding on scorpions (Barlow et al., 2009), several non-front-fanged caenophidian species are specialized for feeding upon potentially dangerous arthropod prey such as scorpions, centipedes and spiders. Moreover, they belong to different families: Lamprophiidae, Colubridae and Dipsadidae. To our knowledge, the venoms from these arthropod-eating species have not been studied. Among Lamprophiidae, the subfamily Atractaspidinae includes the rear-fanged genus Aparallactus (11 species) that feeds on centipedes (with the exception of Aparallactus modestus, a fangless species that feeds on earthworms; Portillo et al., 2018, 2019).

Among Colubridae, at least 13 genera including 95 species (Chionactis, Chilomeniscus, Conopsis, Ficimia, Geagras, Gyalopion, Pseudoficimia, Scolecophis, Sonora, Stenorrhina, Sympholis, Tantilla, Tantillita) feed on scorpions, centipedes and spiders. It is worth noting that almost all, if not all, of these genera belong to one tribe named Sonorini. Among Dipsadidae, at least one species, Philodryas agassizii, is known to feed upon spiders, which it first neutralizes with venom (Marques et al., 2006). In addition to their venoms, the possible mechanisms of immunity of these snakes to the bites of their dangerous prey remain to be investigated.

# Non-front-fanged Species Specialized on Gastropod Molluscs

Snakes belonging to the Neotropical tribe Dipsadini (family Dipsadidae) mostly feed on gastropod molluscs (hence the common name of "snail-eating" snakes) and possess a hypertrophied protein-secreting infralabial gland (de Oliveira et al., 2008). The secretions of this gland are toxic to the snake's molluscan prey (Laporta-Ferreira and Salomão, 1991; Salomão and Laporta-Ferreira, 1994), although a recent study emphasizes a role in mucus control and transport (Zaher et al., 2014; see Jackson et al., 2017 for further discussion). The tribe includes ∼80 species belonging to four genera (Dipsas, Plesiodipsas, Sibon, Tropidodipsas) but none of their venom systems have been investigated proteomically or transcriptomically. Of additional interest is their rear-fanged sister-group (genus Ninia, 11 species) that feeds on the same prey but without the specialized mandibular protein-secreting system.

At least 3 other dipsadid genera feed on molluscs: Contia (2 North American carphophiine species) has long needle-like teeth on its mandibles, probably an adaptation to gripping and eating slugs (Greene, 1997), while Calamodontophis and Tomodon (5 South American xenodontine species) possess specialized long needle-like teeth on their maxillaries (Bizerra et al., 2005).

Convergently, the Asian family Pareidae (20 species belonging to the genera Aplopeltura, Asthenodipsas, Pareas) has adopted a similar diet and associated mandibular specialization to the Dipsadini, but has been even less studied than its Neotropical counterparts.

Among the mostly African/Malagasy family Lamprophiidae, the genus Duberria (four species) feeds on molluscs only and the monotypic genus Micropisthodon is suspected to do the same due to its morphology, which resembles that displayed by Pareidae and Dipsadini (O'Shea, 2018). Finally, the genus Storeria (five species, Natricidae) also includes molluscs in its diet (Rossman and Myer, 1971).

# Non-front-fanged Species Feeding on Squamates (Snakes/Amphisbaenians/Skinks)

Given their potential danger or particular morphology, scalation and strength, this kind of prey requires subjugation by constriction and/or envenomation.

Among snakes considered to be "basal," two unrelated lineages deserve particular interest: Aniliidae (Anilius scytale) and Cylindrophiidae (Cylindrophis, 14 species, and possibly Anomochilus, 3 species; Gower et al., 2005). Aniliidae feeds mostly on amphisbaenians and snakes while Cylindrophis commonly includes snakes in its diet in addition to "invertebrates" and other elongate vertebrates such as eels (Greene, 1997). Neither of these lineages is an effective constrictor and both of them possess large serous-secreting rictal glands, which may secrete 3-finger toxins (Fry et al., 2013). This system may therefore play a functional role in prey subjugation (Jackson et al., 2017).

Several non-front-fanged caenophidian species predominantly feed upon snakes/amphisbaenians. Among Dipsadidae, this includes the following rear-fanged genera: Boiruna, Clelia, Mussurana, Paraphimophis, and Pseudoboa (18 species belonging to the tribe Pseudoboini); Apostolepis, Elapomorphus and Phalotris (53 species, tribe Elapomorphini; Lema et al., 1983; Alencar et al., 2013; Gaiarsa et al., 2013), as well as Erythrolamprus (six species, tribe Xenodontini; Wallach et al., 2014; Marques et al., 2016; Sánchez et al., 2019). The venom composition of Phalotris mertensi has been studied and appears to be distinct from that of other non-front-fanged species as it includes a unique snake venom acid lipase (svLIPA) (Campos et al., 2016; Junqueira-de-Azevedo et al., 2016).

Among Lamprophiidae, the following rear-fanged genera (belonging to the subfamily Atractaspidinae) feed mostly on snakes and fossorial squamates: Amblyodipsas, Brachyophis, Chilorhinophis, Hypoptophis, Macrelaps, Polemon, and Xenocalamus (32 species; Portillo et al., 2018, 2019). Given its peculiar dentition (enlarged front and back maxillary teeth), use of venom and inclusion of skinks in its diet, the "mock viper" (genus Psammodynastes, 2 species, family Lamprophiidae) also deserves special interest (Jackson and Fritts, 1996). Another poorly studied genus is Micrelaps (four species, family Lamprophiidae), which mostly feeds on snakes and has enlarged venom glands (Greene, 1997).

### Ecology and Clinical Toxinology

Situating venom in its ecological, and therefore evolutionary, context may have considerable impact for toxinological research. This extends far beyond the few concepts and taxa we have discussed and reviewed here and has implications even for research in clinical toxinology. Snakebite envenoming was recently reinstated to the World Health Organization's list of Neglected Tropical Diseases (WHO, 2017), in recognition of the fact that it results in well over 100,000 deaths, and 400,000 permanent disablements worldwide annually (Gutiérrez et al., 2017). Whilst there is much important research to be done in that space concerning the improvement of antivenom products and their distribution to those most in need, the ecological perspective should not be neglected. Snakebite envenoming is one of the most impactful forms of human-animal conflict in the modern world and such conflict occurs as the result of an interaction between two organisms in an environment—in this case between a human and a snake.

As with antimicrobial resistance, the fact that humans are involved should not prevent us from considering these interactions "ecological." Each organism employs a range of strategies in meeting the challenges presented by its environment. For humans, these strategies include technological responses such as the use of antibiotics and antivenom. These strategies differ in degree, but not in kind, from strategies employed by other organisms, which may either produce their own antimicrobial or anti-toxin molecules or sequester them from other species (much as we "sequestered" penicillin and continue to sequester horse antibodies). These philosophical points aside, rigorous investigation of the ecology of snakebite (the circumstances in which bites from venomous snakes to humans occur) is perhaps the most neglected of all the areas of research we have discussed in this paper. It is also potentially the most impactful. As snake lovers will attest, the majority of snake bites to humans are defensive, but not all are. Regardless, all are "ecological," and thus gaining a deeper understanding of the nature of the interactions which lead to these occasionally catastrophic incidents may help us decrease their number. For snakebite envenoming, as in so many other cases, prevention is far better than cure, and prevention requires an understanding of the ecology of both venomous snakes and the humans with whom they share their environment.

# CONCLUSION

In this brief article we have highlighted a number of ways in which context, and therefore ecology, is relevant to the study of snake venom and its evolution. We have traversed a considerable amount of conceptual distance, from molecular evolution to clinical toxinology. For this reason, our treatment of each area has been unfortunately abbreviated; each of these topics either has been, or should be, discussed elsewhere in more detail. Nonetheless, we feel that attempting to unite these diverse fields and phenomena within an ecological perspective is a worthwhile exercise. Though this perspective has been somewhat neglected, the ecology of venom is a research agenda which continues to gather steam (see e.g., Diz and Calvete, 2016; papers in this issue; and a forthcoming special edition of the journal Toxins). Another exciting development in ecological research is the fledgling field of "ecological genomics" (Shafer et al., 2016); the introduction of these methods into venom research may contribute to rapidly advancing the agenda we have advocated in this piece. Our modest hope for this article is that it piques the interest of our fellow toxinologists and illustrates some of the future research directions that derive from viewing the venomous world through an ecological lens.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# FUNDING

This work was supported by MNHN.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Jackson, Jouanne and Vidal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.