## THE EVOLUTION OF NEUROPEPTIDES – A STROLL THROUGH THE ANIMAL KINGDOM: UPDATES FROM THE OTTAWA 2019 ICCPB SYMPOSIUM AND BEYOND

EDITED BY : Klaus H. Hoffmann and Elizabeth Amy Williams PUBLISHED IN : Frontiers in Endocrinology and Frontiers in Neuroscience

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88966-017-9 DOI 10.3389/978-2-88966-017-9

Topic Editors: Klaus H. Hoffmann, University of Bayreuth, Germany Elizabeth Amy Williams, University of Exeter, United Kingdom

Citation: Hoffmann, K. H., Williams, E. A., eds. (2020). The Evolution of Neuropeptides – a Stroll Through the Animal Kingdom: Updates from the Ottawa 2019 ICCPB Symposium and Beyond. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88966-017-9

# Table of Contents


*Tachykinins: Neuropeptides That Are Ancient, Diverse, Widespread and Functionally Pleiotropic*

Dick R. Nässel, Meet Zandawala, Tsuyoshi Kawada and Honoo Satake

*Global Neuropeptide Annotations From the Genomes and Transcriptomes of Cubozoa, Scyphozoa, Staurozoa (Cnidaria: Medusozoa), and Octocorallia (Cnidaria: Anthozoa)*

Thomas L. Koch and Cornelis J. P. Grimmelikhuijzen

*Evolution and Comparative Physiology of Luqin-Type Neuropeptide Signaling*

Luis Alfonso Yañez-Guerra and Maurice R. Elphick

*Expression Analysis of Cnidarian-Specific Neuropeptides in a Sea Anemone Unveils an Apical-Organ-Associated Nerve Net That Disintegrates at Metamorphosis*

Hannah Zang and Nagayasu Nakanishi

Gerd Gäde, Petr Šimek and Heather G. Marco

Christine Martin, Lars Hering, Niklas Metzendorf, Sarah Hormann, Sonja Kasten, Sonja Fuhrmann, Achim Werckenthin, Friedrich W. Herberg, Monika Stengl and Georg Mayer

*Comparative Aspects of Structure and Function of Cnidarian Neuropeptides*

Toshio Takahashi

*Function and Distribution of the Wamide Neuropeptide Superfamily in Metazoans*

Elizabeth A. Williams

# Editorial: The Evolution of Neuropeptides - a Stroll Through the Animal Kingdom: Updates From the Ottawa 2019 ICCPB Symposium and Beyond

#### Klaus H. Hoffmann<sup>1</sup> \* † and Elizabeth A. Williams <sup>2</sup> \* †

<sup>1</sup> Faculty for Biology, Chemistry, and Earth Sciences, University of Bayreuth, Bayreuth, Germany, <sup>2</sup> College of Life and Environmental Sciences, University of Exeter, Exeter, United Kingdom

Keywords: neuropeptides, neuropeptide evolution, neuropeptide signaling, neuropeptide function, cnidarians, arthropods, echinoderms, metazoan

**Editorial on the Research Topic**

#### **The Evolution of Neuropeptides - a Stroll Through the Animal Kingdom: Updates From the Ottawa 2019 ICCPB Symposium and Beyond**

#### Approved by:

Hubert Vaudry, Université de Rouen, France

#### \*Correspondence:

Klaus H. Hoffmann Klaus.hoffmann@uni-bayreuth.de Elizabeth A. Williams e.williams2@exeter.ac.uk

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 09 June 2020 Accepted: 24 June 2020 Published: 28 July 2020

#### Citation:

Hoffmann KH and Williams EA (2020) Editorial: The Evolution of Neuropeptides - a Stroll Through the Animal Kingdom: Updates From the Ottawa 2019 ICCPB Symposium and Beyond. Front. Endocrinol. 11:508. doi: 10.3389/fendo.2020.00508

Signaling in the nervous and endocrine systems via neuropeptides is an ancient mechanism found in almost all animals. Active neuropeptides, generated by enzymatic cleavage of larger precursor peptides, play a role as neurotransmitters, neuromodulators, hormones, or growth factors and are involved in the regulation of a huge variety of biological systems. Through binding to their specific G protein-coupled receptors, neuropeptides regulate processes including animal development and growth, feeding, metabolism and digestion, behavior, diuresis and homeostasis, and ecdysis and metamorphosis. The chemical structure of neuropeptides is highly diverse and every peptide may be pleiotropic in function, making the study of neuropeptide signaling a complex yet fascinating subject.

Recent advances in genome/transcriptome sequencing, in concert with mass spectrometry and computational prediction and processing, have enabled the identification of active peptides, neuropeptide precursor proteins, and neuropeptide receptors in species from a growing variety of animal taxa. Also recently, techniques such as RNA interference, morpholino knockdown, and the CRISPR-Cas system for genome editing have been used for specific knockdown of neuropeptide precursor genes and neuropeptide receptors, to evaluate the function of neuropeptide signaling. These technological advancements have provided novel insights into neuropeptide function and evolution.
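As a rough illustration of the computational prediction step mentioned above, the following minimal sketch cleaves a precursor in silico. It assumes the common simplification that precursors are cut at dibasic sites (KR, RR, KK, RK) and that a C-terminal glycine acts as an amidation signal. The function name `cleave_precursor` and the toy precursor sequence are hypothetical and are not taken from any paper in this collection; real prediction pipelines also handle signal peptides, monobasic sites, and further modifications.

```python
import re

# Dibasic cleavage sites, a simplifying assumption for this sketch.
DIBASIC_SITES = re.compile(r"KR|RR|KK|RK")


def cleave_precursor(propeptide):
    """Split a precursor at dibasic sites and mark C-terminal amidation.

    A fragment ending in glycine (G) is treated as amidated: the G is
    removed and "-NH2" is appended. Other fragments are returned unchanged.
    """
    peptides = []
    for fragment in DIBASIC_SITES.split(propeptide):
        if not fragment:
            continue  # skip empty pieces produced by adjacent cleavage sites
        if len(fragment) > 1 and fragment.endswith("G"):
            peptides.append(fragment[:-1] + "-NH2")
        else:
            peptides.append(fragment)
    return peptides


# Hypothetical toy precursor: two amidated peptides separated by a spacer.
TOY_PRECURSOR = "APSGFLGMRGKRAAEPLSDQKRGPSGFYGVRG"

if __name__ == "__main__":
    for peptide in cleave_precursor(TOY_PRECURSOR):
        print(peptide)
```

Candidate peptides predicted in this way would still need confirmation, for example by the mass spectrometry approaches described in the text.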

In August 2019, scientists from around the world gathered in Ottawa (Canada) to discuss all aspects of neuropeptide evolution, from structural diversity to functional condition and potential applications, as part of the 10th International Congress of Comparative Physiology and Biochemistry: mechanisms and evolutionary processes. This Research Topic collects the findings presented at the neuropeptide symposium plus recent associated research, including six original research papers and four review articles. The papers cover a wide range of metazoans, from cnidarians to echinoderms.

The origin of neuropeptide signaling in metazoans is currently a major focus in neuropeptide biology. Three papers in this collection extend our knowledge of neuropeptide diversity in the early branching cnidarians. Analysis of large-scale transcriptome resources from several different classes of Cnidaria by Koch and Grimmelikhuijzen identifies two ancient cnidarian neuropeptide classes, GRFamides and RPRSamides. This study also uncovers the distribution of different neuropeptide classes across largely understudied cnidarians including cubozoans, scyphozoans, staurozoans, and octocorallians. Complementing this study is a review article by Takahashi summarizing the structure and function of cnidarian neuropeptides, with a focus on hydrozoans and hexacorallians. Cnidarian neuropeptides function in a wide range of processes throughout the life cycle, including neuron differentiation, larval locomotion, metamorphosis, muscle contraction, feeding, sensory activity, and reproduction. As in other animal phyla, the cnidarian neuropeptide repertoire includes both novel and conserved neuropeptides. Zang and Nakanishi initiate investigations into cnidarian-specific neuropeptides by mapping expression of RPamide during development in the starlet sea anemone, Nematostella vectensis. This study reveals new features of the sea anemone nervous system, including the sensory larval apical organ, and provides insight into how the nervous system is reshaped during metamorphosis from larval to adult form.

Classic studies of neuropeptide structure and function in arthropods such as fruit fly, cockroach, cricket and crab were instrumental in establishing the neuropeptide biology field. Here, four original research papers expand our knowledge of panarthropod neuropeptide signaling in diverse directions. Martin et al. characterize the expression of two pigment-dispersing factors (PDFs) and a PDF receptor in a velvet worm (Phylum Onychophora), illuminating the organization and function of PDF signaling, largely known for its involvement in the circadian system, in a panarthropod ancestor. The authors also employ receptor deorphanization to uncover the difference in binding specificity of the two PDFs to this G protein-coupled receptor. Bläser and Predel carry out bioinformatic analysis of an impressive 200 insect species from the group Polyneoptera (cockroaches, termites, locusts, and stick insects) to study the evolution of single-copy neuropeptide precursors. These neuropeptides, including ASTC, CCAP, CCHamide, corazonin, elevenin, NPF, proctolin, SIFamide, and sNPF, show a relatively high degree of sequence conservation, although the extent of conservation differs between neuropeptide families. This provides clues regarding the conservation or diversification of function for polyneopteran neuropeptides. Gäde et al. report a mass spectrometry analysis of the presence and structure of adipokinetic hormones (AKHs) in Diptera (flies, mosquitoes) and Mecoptera (scorpion flies). The authors map how changes in single amino acid increments from an ancestral peptide could have occurred during dipteran evolution to generate existing AKH diversity, providing a fine-scale look at neuropeptide evolution. Rocco and Paluzzi investigate the expression of glycoprotein hormone subunits GPA2 and GPB5 and their G protein-coupled receptor LGR1 in the mosquito Aedes aegypti. The authors find that GPA2/GPB5 heterodimers activate the LGR1 receptor, similarly to glycoprotein hormone-receptor interactions in humans and Drosophila. Importantly, the authors note that each subunit may also interact with other unknown proteins to activate different receptors or signaling pathways for different functions. This study contributes to cross-phylum understanding of neuropeptide signaling, as glycoprotein hormone is an ancient neurohormone found across Metazoa.

Three review articles contribute to bridging our understanding of neuropeptide signaling across different metazoan phyla. Williams compares and contrasts Wamide neuropeptide family signaling across cnidarians and protostomes. Yañez-Guerra and Elphick survey the luqin family, a paralog of tachykinins, in metazoans. Luqins function in regulating feeding, diuresis, egg-laying, locomotion and lifespan. The identification of luqins in a starfish and a hemichordate extends this family into deuterostomes. Nässel et al. review the organization and evolution of multi-functional tachykinin precursor peptides across protostomes and deuterostomes. The authors note that a subset of protostome tachykinins have been recruited for use in venom or salivary glands to affect prey. This interesting evolutionary path of neuropeptide recruitment for novel toxins was also recently revealed in Nematostella vectensis through discovery of an ShK-like venom component, which causes muscle contractions in Nematostella but is toxic to fish larvae (1).

Sincere thanks to the authors, reviewers, and editors for their valuable contributions to this Research Topic, and to the ICCPB Symposium presenters and participants for their enthusiastic participation. We look forward to moving toward practical applications utilizing knowledge of neuropeptide signaling in the development of pest and parasite control agents and novel drugs, the improvement of invertebrate culture for food, and in bioconservation, both on the land and in the sea.

### REFERENCES

1. Sachkova MY, Landau M, Surm JM, Macrander J, Singer S, Reitzel AM, et al. Toxin-like neuropeptides in the sea anemone Nematostella unravel recruitment from the nervous system to venom. BioRxiv [Preprint]. (2020). doi: 10.1101/2020.05.28.121442

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

EW was supported by a BBSRC David Phillips Fellowship (BB/T00990X/1).

Copyright © 2020 Hoffmann and Williams. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Tachykinins: Neuropeptides That Are Ancient, Diverse, Widespread and Functionally Pleiotropic

Dick R. Nässel<sup>1</sup> \* † , Meet Zandawala<sup>2</sup>† , Tsuyoshi Kawada<sup>3</sup> and Honoo Satake<sup>3</sup>†

<sup>1</sup> Department of Zoology, Stockholm University, Stockholm, Sweden, <sup>2</sup> Department of Neuroscience, Brown University, Providence, RI, United States, <sup>3</sup> Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan

Edited by: Klaus H. Hoffmann, University of Bayreuth, Germany

#### Reviewed by:

Liliane Schoofs, KU Leuven, Belgium Geoffrey Coast, Birkbeck, University of London, United Kingdom Gáspár Jékely, University of Exeter, United Kingdom

#### \*Correspondence:

Dick R. Nässel dnassel@zoologi.su.se

#### †ORCID:

Dick R. Nässel orcid.org/0000-0002-1147-7766 Meet Zandawala orcid.org/0000-0001-6498-2208 Honoo Satake orcid.org/0000-0003-1165-3624

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 27 September 2019 Accepted: 06 November 2019 Published: 20 November 2019

#### Citation:

Nässel DR, Zandawala M, Kawada T and Satake H (2019) Tachykinins: Neuropeptides That Are Ancient, Diverse, Widespread and Functionally Pleiotropic. Front. Neurosci. 13:1262. doi: 10.3389/fnins.2019.01262

Tachykinins (TKs) are ancient neuropeptides present throughout the bilaterians and are, with some exceptions, characterized by a conserved FX1GX2Ramide carboxy terminus among protostomes and FXGLMamide in deuterostomes. The best-known TK is the vertebrate substance P, which in mammals, together with other TKs, has been implicated in health and disease with important roles in pain, inflammation, cancer, depressive disorder, immune system, gut function, hematopoiesis, sensory processing, and hormone regulation. The invertebrate TKs are also known to have multiple functions in the central nervous system and intestine and these have been investigated in more detail in the fly Drosophila and some other arthropods. Here, we review the protostome and deuterostome organization and evolution of TK precursors, peptides and their receptors, as well as their functions, which appear to be partly conserved across Bilateria. We also outline the distribution of TKs in the brains of representative organisms. In Drosophila, recent studies have revealed roles of TKs in early olfactory processing, neuromodulation in circuits controlling locomotion and food search, nociception, aggression, metabolic stress, and hormone release. TK signaling also regulates lipid metabolism in the Drosophila intestine. In crustaceans, TK is an important neuromodulator in rhythm-generating motor circuits in the stomatogastric nervous system and a presynaptic modulator of photoreceptor cells. Several additional functional roles of invertebrate TKs can be inferred from their distribution in various brain circuits. In addition, there are a few interesting cases where invertebrate TKs are injected into prey animals as vasodilators from salivary glands or paralyzing agents from venom glands. In these cases, the peptides are produced in the glands of the predator with sequences mimicking the prey TKs. Lastly, the TK-signaling system appears to have duplicated in Panarthropoda (comprising arthropods, onychophores, and tardigrades) to give rise to a novel type of peptides, natalisins, with a distinct receptor. The distribution and functions of natalisins are distinct from the TKs. In general, it appears that TKs are widely distributed and act in circuits at short range as neuromodulators or cotransmitters.

Keywords: substance P, neurokinin, neurokinin receptor, natalisin, G protein-coupled receptor, co-transmission, neuropeptide evolution, tachykinin-related peptide

### INTRODUCTION

Substance P, the prototypic tachykinin (TK), was the first neuropeptide to be isolated from brain tissue, as early as 1931 (Von Euler and Gaddum, 1931). For a long time it was the sole brain neuropeptide known, and was joined only in the 1950s by the pituitary peptides oxytocin and vasopressin (Turner et al., 1951; Du Vigneaud et al., 1953; Du Vigneaud, 1955). Today, the number of neuropeptides identified in the animal kingdom is huge and hard to survey [see (Jekely, 2013; Mirabeau and Joly, 2013; Nässel and Zandawala, 2019)]. The number of TK family members has also grown immensely over the years, and we now know that there are often several structural and functional representatives in each species.

From the outset it was recognized that substance P is produced both in the brain and the intestine (Von Euler and Gaddum, 1931; Hökfelt et al., 2001). Today, it is clear that TKs are utilized by neurons in the CNS, by neurons and enteroendocrine cells associated with the intestine (Otsuka and Yoshioka, 1993; Nässel, 1999; Hökfelt et al., 2001; Satake et al., 2013; Steinhoff et al., 2014), as well as by other cells in mammals such as hematopoietic cells (Zhang et al., 2000; Morteau et al., 2001), endothelial cells, Leydig cells and immune cells [see (Almeida et al., 2004)]. Thus, they are widespread and pleiotropic, and are not only neuropeptides but are also produced by other cell types.

Although the first TK was identified in 1931, it was not until 1971 that substance P was purified and sequenced from 20 kg of bovine hypothalamus (Chang et al., 1971), and subsequently synthesized (Tregear et al., 1971). This enabled production of antisera and their application in radioimmunoassay (Powell et al., 1973) and immunocytochemistry (Hökfelt et al., 1975) to localize substance P and demonstrate its release. Thereafter, important experimental work ensued, including development of TK agonists and antagonists [see (Otsuka and Yoshioka, 1993; Hökfelt et al., 2001)], identification of TK receptors [see (Masu et al., 1987; Nakanishi, 1991)], development of genetic approaches and discovery of important roles in health and disease [see (Otsuka and Yoshioka, 1993; Hökfelt et al., 2001; Onaga, 2014; Steinhoff et al., 2014)]. The discovery of the roles of substance P and other TKs in pain, inflammation, cancer, depressive disorder, immune function, gut function, hematopoiesis, sensory processing and hormone regulation [see (Hökfelt et al., 2001; Onaga, 2014; Steinhoff et al., 2014; Zieglgänsberger, 2019)] has led to extensive research into the pharmacology and molecular biology of this signaling system as a therapeutic target [see (Steinhoff et al., 2014)], resulting in a huge number of publications annually. However, it is hard to find recent comprehensive reviews on TKs that cover distribution and functions.

Tachykinins have also been explored outside mammals and other vertebrates. The first TK to be identified in an invertebrate was eledoisin, isolated from salivary glands of the cephalopod Eledone moschata (Erspamer and Anastasi, 1962). Eledoisin was actually the first TK to be sequenced, but since the sequence of substance P was not yet known, the structural relationship was realized only later. The authors, however, recognized that the action of eledoisin on mammalian smooth muscle is similar to that of substance P (Erspamer and Anastasi, 1962; Erspamer and Erspamer, 1962). Many years later, four TKs were isolated from the brain and retrocerebral glands of the locust Locusta migratoria (Schoofs et al., 1990a,b). Today, multiple TKs (more than 350 sequences) have been identified from over 50 insect species [see the DINeR database<sup>1</sup> (Yeoh et al., 2017)], and numerous ones from other invertebrates and protochordates [see e.g., Kawada et al. (2010), Veenstra (2010, 2011, 2016), Conzelmann et al. (2013), Palamiuc et al. (2017), Zandawala et al. (2017), Dubos et al. (2018), Koziol (2018) and **Figure 1** and **Supplementary Table S1**]. In invertebrates too, the common TKs are produced by neurons of the CNS and by endocrine cells of the intestine, but the presence of invertebrate TKs in other cell types has not been reported thus far. Functional analysis has revealed that invertebrate TKs are also pleiotropic. Moreover, recent genetic work in Drosophila suggests that many TK functions are conserved over evolution.

In this review, we first comment on TK terminology in invertebrates since at present it may seem somewhat complex and confusing. Furthermore, we discuss the evolution of genes encoding TK precursors and receptors as well as outline TK signaling systems in various phyla across the animal kingdom. Next, we discuss the distribution of TKs and functions of TK signaling systems; here, we are more comprehensive in dealing with invertebrates since vertebrate TK literature is very extensive. Furthermore, we highlight the functions of TK-signaling that are conserved across different animal phyla. Of note, TKs generally appear to signal over a relatively short range within defined neuronal circuits as neuromodulators or cotransmitters. Only a few examples of intestinal TKs acting as local circulating hormones are available. We also discuss a sister group of the TKs, the natalisins, that seems to have arisen by a gene duplication in the Panarthropoda (comprising arthropods, onychophores, and tardigrades) lineage and appears restricted to this group. The natalisins and their receptors constitute a distinct signaling system that has not been investigated in detail thus far.

### STRUCTURE OF TACHYKININ PEPTIDES AND ORGANIZATION OF GENES ENCODING THEIR PRECURSORS

We start this section with a commentary on TK terminology. We then describe TK precursors, peptides and receptors in mammals, where knowledge is most extensive, before moving on to other vertebrates and finally invertebrates.

### A Note on Major Types of Tachykinins and Terminology

Substance P (RPKPQQFFGLMamide), and other mammalian TKs, are characterized by an FXGLMamide carboxy terminus, and these peptides act on any of three TK receptors (GPCRs; NK1R – NK3R) (Otsuka and Yoshioka, 1993; Onaga, 2014; Steinhoff et al., 2014). The first invertebrate neuropeptides referred to as TKs (locustatachykinin I-IV; LomTK I-IV) were isolated from the brain and retrocerebral complex of locusts, and have a different carboxy terminus, FX1GX2Ramide (Schoofs et al., 1990a,b). This sequence is shared by many other invertebrate TKs and only one type of insect TK receptor is known so far (Nässel, 1999; Van Loy et al., 2010; Satake et al., 2013). The structural difference in the active core of the two groups of TK peptides renders the FX1GX2Ramides inactive on the vertebrate-type TK receptors and conversely, vertebrate TKs do not activate invertebrate receptors (Satake et al., 2013). Thus, these authors suggested that the FX1GX2Ramides should be designated tachykinin-related peptides (TKRPs) to distinguish them from vertebrate TKs with FXGLMamide. In the literature, the individual TKs and TKRPs have been given many different names. In invertebrates, these commonly include a prefix indicating the species of origin (e.g., LomTK in Locusta migratoria) and then numbers if multiple peptide paracopies (isoforms) exist on the same precursor (LomTK-I, LomTK-II etc).

<sup>1</sup>http://www.neurostresspep.eu/diner/insectneuropeptides

FIGURE 1 | Sequence alignments of (A) tachykinin and (B) natalisin peptides from select species. Note that C-terminal amidation is not shown; it is represented by the amidation signal G. Conserved residues are highlighted in black (identical) or gray (similar). Species belonging to the same phyla have been highlighted with the same color. Species names are as follows: Homsa (Homo sapiens), Danre (Danio rerio), Cioin (Ciona intestinalis), Astru (Asterias rubens), Capte (Capitella teleta), Ureun (Urechis unicinctus), Cragi (Crassostrea gigas), Octvu (Octopus vulgaris), Caeel (Caenorhabditis elegans), Hypdu (Hypsibius dujardini), Tetur (Tetranychus urticae), Dappu (Daphnia pulex), Trica (Tribolium castaneum), Bommo (Bombyx mori), Anoga (Anopheles gambiae), Varde (Varroa destructor), Bacdo (Bactrocera dorsalis), Drome (Drosophila melanogaster) and Nemve (Nematostella vectensis). Note that the A. rubens, C. elegans and N. vectensis peptides are unlikely to be TKs as they deviate substantially from the canonical TK sequences (see also text).

To further complicate the terminology of TKs, there are peptides with an FXGLMamide carboxy terminus produced by salivary glands of mosquitoes, sialokinins (Champagne and Ribeiro, 1994) and in cephalopods, eledoisin and octopustachykinin (Erspamer and Anastasi, 1962; Kanda et al., 2003; Satake et al., 2003) (**Figure 2** and **Supplementary Table S2**). These TKs are delivered to prey and meant to act on exogenous receptors, not within the "sender animal" (predator). The peptides of this kind were referred to as invertebrate TKs (Inv-TKs) (Satake et al., 2003), to distinguish them from TKRPs. Similarly, exocrine glands in amphibian skin produce FXGLMamide-type TKs that have been given different exotic names [see (Lazarus and Attila, 1993)]. A recent finding adds to the TK complexity: in the parasitoid jewel wasp (Ampulex compressa), the toxin glands produce a precursor encoding multiple FQGMRamide-containing peptides (Arvidson et al., 2019). The wasp injects the toxin, which contains FQGMRamide peptide and other components, into the cockroach brain to paralyze the host by acting on the cockroach TK receptor in circuits of the central complex. In summary, TKs for exogenous use are produced to act on receptors of target animals and the salivary gland ones deviate structurally from native TKs. We will discuss these in more detail later. In the present review, we use the names originally given to the different TKs when relevant (see **Supplementary Tables S1**, **S2**), but for the sake of simplicity we will henceforth use the term TKs for all FXGLMamide and FX1GX2Ramides when we discuss the peptides in general.

A related, but distinct, invertebrate peptide signaling system is constituted by the natalisins (NTL) and their receptors (Jiang et al., 2013). These will be discussed separately. Also, note that especially in early papers (but also some more recent ones) a family of neuropeptides designated leucokinins (LKs) has been considered related to TKs [see e.g., Holman et al. (1986), Nässel and Lundquist (1991), Al-Anzi et al. (2010)]. The LKs have an FXSWGamide carboxy terminus, and analysis based on precursor structure (and receptors) shows that they are not homologous to TKs (Jekely, 2013; Mirabeau and Joly, 2013).
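The carboxy-terminal motifs named in this section can be expressed as simple patterns. The sketch below is illustrative only: it assumes peptides are written as plain one-letter sequences without the terminal amide, and the names `C_TERMINAL_MOTIFS` and `classify_c_terminus` are hypothetical. A motif match is not evidence of homology, as the leucokinin case above shows.

```python
import re

# C-terminal motifs from the text; "." stands for the variable residue X.
# Sequences are assumed to be given without the terminal "amide".
C_TERMINAL_MOTIFS = {
    "FXGLMamide (vertebrate-type TK)": re.compile(r"F.GLM$"),
    "FX1GX2Ramide (TKRP)": re.compile(r"F.G.R$"),
    "FXSWGamide (leucokinin)": re.compile(r"F.SWG$"),
}


def classify_c_terminus(peptide):
    """Return the label of the first motif matching the peptide's C-terminus."""
    for label, pattern in C_TERMINAL_MOTIFS.items():
        if pattern.search(peptide):
            return label
    return "unclassified"


if __name__ == "__main__":
    # Substance P sequence as given in the text (RPKPQQFFGLMamide).
    print(classify_c_terminus("RPKPQQFFGLM"))
    # Core motif of the wasp venom peptides mentioned in the text.
    print(classify_c_terminus("FQGMR"))
    # Hypothetical fragment carrying the leucokinin-type motif.
    print(classify_c_terminus("AFNSWG"))
```

Such pattern matching is only a first-pass filter; the text assigns peptides to families on the basis of precursor structure and receptors.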

### TKs and Their Receptors in Mammals

In mammals, including humans, there are three genes encoding precursors of tachykinins: preprotachykinin A (PPTA), preprotachykinin B (PPTB) and preprotachykinin C (PPTC), also known as Tac1, Tac3, and Tac4, respectively [see (Onaga, 2014; Steinhoff et al., 2014)]. These genes arose through two rounds of genome duplications in the vertebrate lineage followed by subsequent gene losses and diversification (Elphick et al., 2018). The Tac1 precursor gives rise to Substance P (SP), Neurokinin A (NKA), Neuropeptide K (NPK), and neuropeptide γ (NPγ), Tac3 to Neurokinin B (NKB), and Tac4 to Hemokinin (HK), Endokinin-A (EKA) and Endokinin-B (EKB). Thus, there are nine different TKs in mice, rats and humans. The sequences of the Tac1 and Tac3 derived TKs are conserved in humans, mouse and rat, whereas the ones encoded on Tac4 differ between species (Steinhoff et al., 2014). Tac4 also encodes two further endokinins (EKC and EKD) that are not TKs (Steinhoff et al., 2014). The organization of mammalian TK precursors is shown in **Figure 3** and their sequences in **Table 1**. For comparison, TK precursors in other representative animals are shown in **Figures 4**, **5**, and a cladogram with TK signaling components is found in **Figure 6**. It is also worth noting that the Tac1 and Tac4 genes each give rise to 4 splice variants, α, β, γ, and δ (Onaga, 2014; Steinhoff et al., 2014).

The TKs have differential affinity for three different TK receptors, NK1R - NK3R (or TAC1R - TAC3R) (Nakanishi, 1991; Onaga, 2014; Steinhoff et al., 2014) as shown in **Table 1**; the ligand selectivity is as follows: SP > NKA > NKB for NK1R, NKA > NKB > SP for NK2R, and NKB > NKA > SP for NK3R (Satake et al., 2013; Steinhoff et al., 2014). HK and EKs exhibit the highest affinity to NK1R (Satake et al., 2013; Steinhoff et al., 2014). These are G-protein-coupled receptors (GPCRs) of the rhodopsin family (also known as family A GPCRs). The NK2R (substance K receptor) is of historical interest since it was the first neuropeptide receptor to be cloned (Masu et al., 1987). Signaling through the NK receptors is diverse and complex. For example, the ligand-activated NK1R initiates G-protein mediated signaling that can lead to (1) activation of phospholipase C (PLC), which results in formation of inositol trisphosphate (IP3) and diacylglycerol (DAG), mobilization of intracellular stores of Ca<sup>2+</sup>, and activation of PKC; (2) activation of adenylyl cyclase (AC), resulting in formation of cAMP, and activation of PKA; or (3) activation of phospholipase A2 and production of arachidonic acid (Steinhoff et al., 2014).
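The rank orders of ligand selectivity given above can be written down as a small lookup table. The following is only an illustrative sketch: the rank orders are taken from the text, whereas the dictionary and function names are hypothetical and the code carries no quantitative affinity information.

```python
# Rank order of ligand selectivity per receptor, most preferred ligand first,
# as stated in the text (SP > NKA > NKB for NK1R, and so on).
SELECTIVITY = {
    "NK1R": ["SP", "NKA", "NKB"],
    "NK2R": ["NKA", "NKB", "SP"],
    "NK3R": ["NKB", "NKA", "SP"],
}


def preferred_receptors(ligand):
    """Return the receptors for which `ligand` is ranked first."""
    return [rec for rec, order in SELECTIVITY.items() if order[0] == ligand]


def selectivity_rank(receptor, ligand):
    """Return the 1-based rank of `ligand` at `receptor` (1 = most preferred)."""
    return SELECTIVITY[receptor].index(ligand) + 1


if __name__ == "__main__":
    for ligand in ("SP", "NKA", "NKB"):
        print(ligand, "->", preferred_receptors(ligand))
```

Each of the three classical ligands ranks first at exactly one receptor, which is the pattern the rank orders in the text describe.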

In mammals, TKs play roles as neuromodulators/cotransmitters in central brain circuits, as well as in pain, stress, anxiety, depressive disorder, aggression, memory formation, inflammation, cancer, immune function, gut function, hematopoiesis, sensory processing, reproduction and cytokine and hormone regulation [see (Otsuka and Yoshioka, 1993; Felipe et al., 1998; Hökfelt et al., 2001; Holsboer, 2009; Onaga, 2014; Steinhoff et al., 2014; Lénárd et al., 2018; Zieglgänsberger, 2019)].

Substance P was mapped to neurons in the rat nervous system early on (Hökfelt et al., 1975, 1977; Ljungdahl et al., 1978). Now, we know that the distribution of SP and other TKs, as well as their receptors, is widespread and plastic. Receptor expression is regulated by various transcription factors under different physiological states. For instance, the receptors can be upregulated during inflammation via the transcription factor NF-κB (Onaga, 2014; Steinhoff et al., 2014). SP and NKA and their receptors are not only widely distributed throughout the central and peripheral nervous system, but also in many other tissues including dermal tissue, gastrointestinal tract, as well as the respiratory, urogenital and immune systems (Hökfelt et al., 2001; Onaga, 2014; Steinhoff et al., 2014). Whereas SP and NKA are expressed throughout the brain in mammals, NKB is found mainly in the hypothalamus and spinal cord. Furthermore, SP is present in brain circuits that are involved in the processing of anxiety, such as the amygdala, septum, mid-brain, periaqueductal gray, hippocampus, and hypothalamus (Holsboer, 2009).

#### TABLE 1 | Human tachykinins.<sup>1</sup>


<sup>1</sup>Sequence and receptor data from Onaga (2014); Steinhoff et al. (2014). The sequences of the Tac1 and Tac3 derived peptides are conserved in humans, mouse and rat. <sup>2</sup>These are also known as Ppt-a, Ppt-b and Ppt-c. <sup>3</sup>Tac1 gene gives rise to four splice forms: α, β, γ, and δ, see Figure 3. <sup>4</sup>Tac4 gene also gives rise to four splice forms, α, β, γ, and δ (see Figure 3), and two additional peptides (Endokinin-C and D) that are not TKs.

As another example, in zebrafish, Tac1 transcript has been mapped to neurons in the olfactory bulb, telencephalon, preoptic region, hypothalamus, mesencephalon, and rhombencephalon, whereas Tac3a was observed in the preoptic region, habenula and hypothalamus, and Tac3b was predominantly expressed in the dorsal mesencephalon (Ogawa et al., 2012). Additional details of SP and NKA distribution are beyond the scope of this review.

A few further examples of TK signaling are given here that are of interest for the discussion of TK functions in invertebrates in later sections. Certain taste cells express NK1R, and SP appears to regulate responses not only to toxins, but also to tastants: spicy foods stimulate SP release to enhance umami taste reception (Onaga, 2014). In the dorsal horn of the spinal cord, SP modulates nociceptive signals relayed to the brain, but also in pain-processing areas of the brain cortex (Felipe et al., 1998; Zieglgänsberger, 2019). NKB regulates hormone (e.g., gonadotropin-releasing hormone, GnRH) release in the hypothalamus (Steinhoff et al., 2014). NKRs are densely distributed in the rat olfactory bulb and it was shown that SP acts to depress neuronal activity in glomerular neurons, by triggering release of GABA (Olpe et al., 1987), similar to TKs and GABA in the antennal lobe of Drosophila (Ignell et al., 2009; Ko et al., 2015). The intestine is supplied by processes from TK-expressing neurons in dorsal root ganglia, or from local neurons (Hökfelt et al., 2001). NKRs are widely expressed in the intestine (in a cell-specific manner) by enteric neurons, intestinal muscle, epithelium, vasculature as well as immune system. TK signaling in the gut thus influences motility, electrolyte and fluid secretion, as well as vascular and immune functions (Hökfelt et al., 2001).

Recently, all NKRs were also shown to be expressed in genital organs and cells including the testis, sperm, ovary, granulosa cells, cumulus cells, and the uterus, and shown to be involved in sperm motility and reproduction (Pinto et al., 2010, 2015; García-Ortega et al., 2014; Candenas et al., 2018; Blasco et al., 2019).

Substance P is also known to activate three Mas-related GPCRs (Mrgprs), a promiscuous group of receptors underlying itch: human MRGPRX2, mouse MrgprA1, and mouse MrgprB2 (Bader et al., 2014). It is clear that the Mrgpr group is entirely separated from the TKR group (Bader et al., 2014). Interaction of SP with the Mrgprs induces elevation of intracellular Ca<sup>2+</sup> (Azimi et al., 2016). The EC<sub>50</sub> value of SP for human MRGPRX2 is approximately 150 nM, while EC<sub>50</sub> values of SP for mouse MrgprA1 and MrgprB2 are about 5 µM and 50 µM, respectively (Azimi et al., 2016). Interestingly, analyses using NK1R knockout mice suggested that SP induces itch via Mrgprs rather than the NK1R (Azimi et al., 2017). Mrgprs are specific to mammals, suggesting that new SP-recognizing receptors arose during mammalian evolution.
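To give a feeling for what the different EC<sub>50</sub> values imply, the sketch below compares the predicted fractional responses of the three Mrgprs at a given SP concentration. It is a simplified illustration and not the analysis used in the cited studies: it assumes a Hill equation with a Hill coefficient of 1 (a value not reported in the text), and all names in the code are hypothetical. Only the approximate EC<sub>50</sub> values are taken from the text.

```python
# Approximate EC50 values of substance P (in nM), as quoted in the text.
EC50_NM = {
    "human MRGPRX2": 150.0,
    "mouse MrgprA1": 5_000.0,
    "mouse MrgprB2": 50_000.0,
}


def fractional_response(conc_nm, ec50_nm, hill=1.0):
    """Hill equation: fraction of maximal response at a ligand concentration.

    The Hill coefficient defaults to 1, which is an assumption made for
    illustration only.
    """
    if conc_nm < 0 or ec50_nm <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    return conc_nm**hill / (ec50_nm**hill + conc_nm**hill)


if __name__ == "__main__":
    sp_conc_nm = 1_000.0  # 1 µM substance P, an arbitrary example concentration
    for receptor, ec50 in EC50_NM.items():
        print(f"{receptor}: {fractional_response(sp_conc_nm, ec50):.3f}")
```

By definition, the response is half-maximal when the ligand concentration equals the EC<sub>50</sub>, so under these assumptions a concentration that nearly saturates MRGPRX2 would activate MrgprB2 only marginally.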

### TKs and Receptors in Protochordates and Non-mammalian Vertebrates

Genomes of non-mammalian vertebrates possess receptors that are homologous to NK1R – NK3R (Biran et al., 2012; Satake et al., 2013). Likewise, several genes encoding TK homologs are present in non-mammalian vertebrate species. Tac1 and Tac3 have also been identified from zebrafish, and the Tac3 prototype gene appears to have duplicated to give rise to Tac3a and Tac3b (Biran et al., 2012) as shown in **Figure 4**. In addition, two Tac3 genes have also been characterized from an eel, Anguilla anguilla, which is a basal teleost (Campo et al., 2018). Multiple Tac3 genes were generated in teleost fish by the teleost-specific whole genome duplication (Moriyama and Koshiba-Takeuchi, 2018). Tac3a of zebrafish encodes not only an NKB (NKBa), but also an NKF that is a piscine-specific TK (Biran et al., 2012). Tac3b of zebrafish encodes both an NKB (NKBb) and an NKF, although NKBb contains an FVGLLamide sequence at the C-terminus that differs from the TK consensus sequence FXGLMamide (Biran et al., 2012). Gene duplication of zebrafish TK receptor genes has also occurred, resulting in two NK1R genes (Tacr1a, Tacr1b) and three NK3R genes (Tacr3a, Tacr3b, and Tacr3c) (Biran et al., 2012). Both TACR3a and TACR3b are efficiently activated by zebrafish NKBa and NKF, and their interaction induces both elevation of intracellular Ca<sup>2+</sup> and production of cAMP (Biran et al., 2012). EC<sub>50</sub> values of NKBb for TACR3a and TACR3b are 50–100 fold higher than those of NKBa for TACR3a and TACR3b (Biran et al., 2012). It is not clear whether homologs of HK and EKs are present in non-mammalian vertebrates, although it was proposed that Tac4 is present in fish, including zebrafish, Danio rerio (Biran et al., 2012).

More than 20 TKs have been identified from skin secretion of frogs, including Odorrana grahami, Rana chensinensis, Theloderma kwangsiensis, Kassina senegalensis, and Physalaemus fuscumaculatus (**Supplementary Table S3**). These skin TKs possess the characteristic TK consensus sequence FXGLMamide (Bertaccini et al., 1965; Anastasi et al., 1977; Li et al., 2006; Wu et al., 2013; Zhang et al., 2013). The frog skin TKs are likely to act as exogenous factors, for instance as antimicrobial substances, rather than endogenous neuropeptides or hormones. An endogenous tachykinin has also been isolated from the brain of the frog, Rana ridibunda, with a sequence homologous to that of NKB (O'Harte et al., 1991). Moreover, Tac1 and Tac3 of the frog, Xenopus tropicalis, have been predicted (registered in NCBI databases). The X. tropicalis Tac1 encodes an SP-like and an NKA-like peptide, while the Tac3 gene encodes an NKB-like and an NKF-like peptide. Interestingly, the sequence of the T. kwangsiensis skin TK is identical to that of the X. tropicalis SP-like peptide, except for two amino acid residues. The sequence of the K. senegalensis skin TK is also similar to that of SP (**Supplementary Table S3**). The skin TKs are amphibian-specific, suggesting that TKs acquired new functions in the amphibian lineage.

As shown in **Figure 4** and **Supplementary Table S1**, two TKs (CiTK-I, CiTK-II) were identified from the ascidian (protochordate), Ciona intestinalis (Satake et al., 2004). Like vertebrate TKs, CiTKs contain a C-terminal FXGLMamide (Satake et al., 2004), suggesting that the FXGLMamide sequence of TKs is highly conserved in Olfactores (vertebrates and ascidians). Since the ascidians are among the basal chordates, the CiTk gene is likely to correspond to a prototype of vertebrate TK genes. This single CiTk gene in Ciona encodes CiTK-I and CiTK-II (Satake et al., 2004; **Figure 3B**), suggesting that gene duplications of a prototype TK gene have occurred in the vertebrate lineage, resulting in Tac1, Tac3, Tac4, and amphibian skin TK genes. Furthermore, CiTK-I and -II are located in the same exon of the CiTk gene (Satake et al., 2004), indicating that splice variants of the CiTk gene are absent (**Figure 4**). Therefore, alternative splicing of the TK gene also emerged during vertebrate evolution.

Homology search for mammalian NK1R - NK3R sequences using a C. intestinalis database<sup>2</sup> showed that only one homologous receptor is present in the ascidian (Satake et al., 2004). This receptor, CiTKR, can be activated by CiTKs (Satake et al., 2004), indicating that it is an authentic TK receptor. Phylogenetic analysis reveals that the CiTKR is sister to the vertebrate TK receptor clade, which comprises NK1R - NK3R (**Figure 7**). These results suggest that NK1R – NK3R arose via duplication and diversification in the vertebrate lineage (**Figure 7**).

<sup>2</sup>http://ghost.zool.kyoto-u.ac.jp/SearchGenomekh.html

### TKs and Their Receptors in Insects and Other Protostome Invertebrates

Tachykinins have been identified in a wide range of invertebrates, including annelids, mollusks, arthropods, tardigrades, echinoderms and tunicates, and tentatively in nematodes (see **Supplementary Table S1**). Several of these were isolated biochemically, others cloned, but most were identified by bioinformatics and subsequently confirmed by mass spectrometry. However, many TKs listed in **Supplementary Table S1** have only been predicted from sequences identified from genomes and transcriptomes, based on similarity searches, and await confirmation by mass spectrometry. There are some groups of invertebrates where TKs have not been identified or where sequences are only remotely similar to TKs. For instance, in flatworms (Platyhelminthes) and cubomedusae (Cnidaria) no TKs have yet been discovered (McVeigh et al., 2009; Nielsen et al., 2019), and in sea anemones (Cnidaria) and nematodes (e.g., C. elegans) the "TK sequences" are not clearly TK-related (Ohno et al., 2017; Palamiuc et al., 2017; Hayakawa et al., 2019), see **Supplementary Table S1**. In fact, the proposed C. elegans TK-like receptor (NPR-22) is more closely related to RYamide/Luqin receptors than to TK-like receptors, and luqin-like peptides (from the LURY-1 precursor) are found in the worm and were shown to activate the receptor (Ohno et al., 2017; Yañez-Guerra et al., 2018). The previously proposed ligand (FMRFamide-like peptide 7, FLP-7) activates NPR-22 only at micromolar concentrations in a heterologous assay (Mertens et al., 2006; Palamiuc et al., 2017), suggesting that it is not a ligand [see (Ohno et al., 2017)]. However, it remains to be tested whether FLP-7 peptide is an NPR-22 ligand in vivo. Also, the C. elegans TK-like peptide derived from NPL-8 (SFDRMGGTEFGLM) does not activate the NPR-22 receptor (Mertens et al., 2006). Thus, the presence of a TK signaling system in C. elegans is still unresolved.
However, TK-like receptors are found in Cnidaria (Anctil, 2009; Krishnan and Schiöth, 2015) (bioinformatics only), so the origin of TK signaling could possibly be traced to the common ancestor of Bilateria and Cnidaria. The presence of TK receptors has been demonstrated in the two major clades of Bilateria, the Nephrozoa (protostomes and deuterostomes) and its sister group the Xenacoelomorpha, which includes Xenoturbella, Nemertodermatida, and Acoela (Thiel et al., 2018). In the Xenacoelomorpha, the presence is based on bioinformatics only.

With a few exceptions, each species has one gene encoding a TK precursor with multiple copies of TK peptides. As exceptions, two precursor genes were found in e.g., the limpet Lottia gigantea (Veenstra, 2010), the polychaete worm Platynereis dumerilii (Conzelmann et al., 2013), the crab Carcinus maenas (Christie, 2016), the tardigrade Hypsibius dujardini (Koziol, 2018) and the spider mite Tetranychus urticae (Veenstra et al., 2012; **Supplementary Table S1**). The organization of Drosophila and spider mite TK precursors is shown in **Figure 4** and that of the mosquito Aedes aegypti in **Figure 2B**. The number of peptides that can be cleaved from invertebrate TK precursors ranges from 1 in tardigrades (Koziol, 2018; **Figure 5**) to 15 in the cockroaches Leucophaea maderae and Periplaneta americana (Predel et al., 2005). Commonly, these peptides all have different, but related sequences (designated paracopies). In a few cases, the precursor only has several identical TKs, as in the crayfish Procambarus clarkii, which has seven CabTRP1 (Yasuda-Kamatani and Yasuda, 2004). The TKs are generally between 9 and 11 amino acids long, but a few have only 6, and others up to 18 as in cockroaches (Muren and Nässel, 1996; Predel et al., 2005), or even 37 residues as predicted in some scorpions and spiders (Veenstra, 2016). Interestingly, the N-terminally extended TKs of cockroaches have internal dibasic cleavage sites and it appears as if in the brain these are more likely to be processed and, thus, generate the shortened TKs, whereas the extended TKs are normally found in the midgut (Muren and Nässel, 1996, 1997; Winther et al., 1999; Predel et al., 2005).

All TKs are amidated, but only very few have been detected that may be N-terminally blocked by pyroglutamate (pQ), for instance in hemipteran bugs and the bivalve mollusk Anodonta cygnea (**Supplementary Table S1**). In insects and many other arthropods, it is common to find TKs with a proline (P) in the second position from the N-terminus (e.g., Drosophila DTK-2, APLAFVGLRa). This is likely to render the peptides sensitive to proline-specific dipeptidyl peptidase (DPP-IV) cleavage and inactivation (Nässel et al., 2000; Isaac et al., 2009). Thus, TKs can be specifically inactivated by DPP-IV selectively located in regions of the CNS or in the periphery (Nässel et al., 2000). Other peptidases that have been shown to inactivate TKs are neprilysins, angiotensin converting enzymes and deaminases [see (Isaac et al., 2002, 2009)].
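The criterion described above (a proline in the second position from the N-terminus) is easy to screen for in a list of peptide sequences. The sketch below is a deliberately simplified illustration: it tests only for proline at position 2 and ignores any other determinants of DPP-IV specificity. The function names are hypothetical. DTK-2 and the NPL-8 derived peptide are taken from the text; the substance P sequence (RPKPQQFFGLM) is the commonly cited one and is not given in this section.

```python
def residues(peptide):
    """Keep only upper-case one-letter residue codes.

    This drops notation such as the trailing lower-case 'a' used for a
    C-terminal amide (e.g. 'APLAFVGLRa').
    """
    return "".join(ch for ch in peptide if ch.isupper())


def has_proline_at_position_2(peptide):
    """True if the second residue from the N-terminus is proline."""
    seq = residues(peptide)
    return len(seq) > 2 and seq[1] == "P"


def remove_n_terminal_dipeptide(peptide):
    """Return the sequence left after one X-Pro dipeptide is removed.

    Sequences without proline at position 2 are returned unchanged.
    """
    seq = residues(peptide)
    return seq[2:] if has_proline_at_position_2(seq) else seq


if __name__ == "__main__":
    for name, seq in [("DTK-2", "APLAFVGLRa"),
                      ("substance P", "RPKPQQFFGLM"),
                      ("NPL-8 peptide", "SFDRMGGTEFGLM")]:
        print(name, has_proline_at_position_2(seq), remove_n_terminal_dipeptide(seq))
```

Note that the truncated product of DTK-2 in this sketch retains the C-terminal FXGXR motif; whether such a product is biologically inactive is an experimental question that the code does not address.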

Two putative TK receptors were cloned from Drosophila before the endogenous ligands were known (Li et al., 1991; Monnier et al., 1992). Both receptors displayed significant similarities to mammalian TK receptors. One of these, designated DTKR (CG7887), was confirmed as a receptor for endogenous Drosophila TKs (DTK-1-5) (Birse et al., 2006); the other, NKD (CG6515), was first shown to respond to DTK-6, but not the other Drosophila TKs (Poels et al., 2009). DTK-6 has an FVAVRamide C-terminus instead of the common FX1GX2Ramide. Surprisingly, it turned out several years later that NKD is a receptor for a novel family of neuropeptides called natalisins (NTL) that in Drosophila have a consensus sequence of FX1X2X3Ramide (Jiang et al., 2013). One of these peptides, NTL4, has an FFATRamide, remotely similar to DTK-6, and indeed at high concentrations NTL4 activated the TK receptor DTKR (Jiang et al., 2013). The same authors also showed that DTK-6 at high concentrations activates the NTL receptor NTLR. In many insects, receptors of both DTKR- and NTLR-type have been identified (Jiang et al., 2013). In other invertebrates, such as for instance annelids (Urechis) and mollusks (Octopus), only receptors of DTKR type are known (Kawada et al., 2002; Kanda et al., 2007), suggesting that the NTL signaling system arose in the arthropod lineage (Jiang et al., 2013). However, TK-like precursors with NTL-like peptides are found in the spider mite (chelicerate) as well as tardigrades; the latter suggesting that the NTL signaling might also be present outside arthropods. Natalisin signaling will be discussed in a separate section.

FIGURE 6 | A cladogram showing the occurrence of tachykinin and natalisin signaling systems in Bilateria and Cnidaria. Sequence logos of the peptides have also been provided. These were constructed using the final 10 amino acids at the C-terminus. Note that C-terminal amidation is not shown; it is represented by the amidation signal G. Species in which the receptors have been functionally characterized are indicated by asterisk. Note that the natalisin signaling system appears to have arisen in the lineage leading up to tardigrades and arthropods. The tachykinin-like peptide sequences in Asterias rubens, Caenorhabditis elegans, and Nematostella vectensis diverge from the canonical TK sequences in other phyla and should probably not be classified as TKs (see text). In C. elegans a TK receptor has not been functionally characterized, although an FLP-7/NPR-22 signaling system has been proposed (Palamiuc et al., 2017), but shown to represent a luqin/RYamide signaling system (Ohno et al., 2017). The peptide encoded by the TK2 precursor in Hypsibius dujardini looks more similar to arthropod natalisins. Precursors encoding TK-like peptides have not yet been identified in Branchiostoma floridae and Saccoglossus kowalevskii. However, TK-like receptors are present in all the animals presented here.

full-length receptors were used for the analysis. Sequences were aligned using the MAFFT (E-INS-i algorithm and BLOSUM30 scoring matrix) and the phylogenetic tree constructed using the FastTree plugin in Geneious Prime (2019). Receptors that have been functionally characterized are indicated by a symbol before the species name. Note that in another annelid, Urechis unicinctus (among Lophotrochozoans), the TK receptor has been functionally characterized (not shown here). The figure was constructed in MEGAX. Sequences used to generate the phylogeny are provided in Supplementary Material Text File S1.

Some invertebrate TKs act on exogenous TK receptors in prey animals. TKs with FQGMRa C-termini (**Figure 2** and **Supplementary Table S2**) are produced by venom glands of the Jewel wasp Ampulex compressa and injected into the cockroach brain where action on the TK receptors leads to paralysis (Arvidson et al., 2019). Interestingly, these wasp venom TKs are injected as a precursor protein in the low pH venom, and not as cleaved peptides; only in the cockroach brain with neutral pH will they be slowly liberated to act on TK receptors (Arvidson et al., 2019). Other TKs with C-terminal FXGLMamide are produced in salivary glands of the mosquito Aedes aegypti (sialokinins I and II), and the cephalopods Eledone moschata (eledoisin) and Octopus vulgaris (OctTK-I and II) (Erspamer and Anastasi, 1962; Champagne and Ribeiro, 1994; Kanda et al., 2003). Presumably these TKs cause vasodilation in vertebrate prey animals [see (Erspamer and Erspamer, 1962; Beerntsen et al., 1999)]. The sequences of gland TKs are shown in **Figure 2** and **Supplementary Table S2**.

OSN is activated and as a consequence the DM5 PN relays strong aversive signals and food search is reduced. (D) In the hungry fly the DILP level is low, DTKR expression is high and therefore TK signaling activates DTKR and the OSN releases less ACh resulting in suppressed activation of the aversive DM5 PN and therefore increase food search. (E) A scheme showing the combined signaling from the DM1 and DM2 signaling pathways that increase food search in hungry flies with low circulating insulin. In hungry flies the aversive DM5 odor pathway is inactivated resulting in increased food search (as detailed in Figure 7C). In the DM1 pathway (food odor attraction) signaling with short neuropeptide F (sNPF) and its receptor sNPFR is increased in hungry flies due to low insulin and increased expression of the sNPFR. This leads to presynaptic potentiation of the ACh signaling and increased activation of DM1 PNs, resulting in increased food search. The panels (A,C–E) were redrawn from figures in Ko et al. (2015) and Jékely et al. (2018).

### A Case of Novel Ligands for an Insect TK Receptor

Although most peptides and their cognate receptors co-evolve, there are a few interesting cases where receptors have adopted novel structurally unrelated ligands in addition to their original ligands. A well-known example is the Drosophila sex peptide, produced in male accessory glands and transferred to the female during copulation (Wolfner, 2002; Kubli, 2003). Sex peptide was adopted as an additional ligand for the myoinhibitory peptide (MIP) receptor (Kim et al., 2010). The Drosophila MIP receptor is the only known receptor for sex peptide. Another example, again in Drosophila, is the pigment-dispersing factor (PDF) receptor that has adopted DH31 as an additional ligand (Shafer et al., 2008; Goda et al., 2016). However, unlike sex peptide, DH31 also exerts its effects by binding to its own specific DH31 receptor (Johnson et al., 2005), suggesting that the PDF receptor is promiscuous. A similar phenomenon has been discovered for the silkmoth Bombyx mori TK receptor (BNGR-A24), which seems to have adopted ion-transport peptide (extended ITPL isoform in particular) as a novel additional ligand (Nagai-Okatani et al., 2016). Bombyx ITPL is a large protein (79 amino acids), comprising 6 cysteine residues, which form three disulfide bridges, and is thus structurally very dissimilar to tachykinins (Roller et al., 2008; Nagai-Okatani et al., 2016). Nonetheless, Bombyx ITPL and TKs appear to be orthosteric ligands of the Bombyx TK receptor (BNGR-A24) based on heterologous and homologous cell culture experiments (He et al., 2014; Nagai-Okatani et al., 2016). Moreover, activation of BNGR-A24 by ITPL is coupled to the cGMP pathway, whereas BNGR-A24 activation by TKs can activate different second messenger pathways in a cell-type specific manner. 
More specifically, TK-mediated activation of BNGR-A24 in BmN cells has no effect on cAMP and cGMP levels, but when the receptor is expressed in HEK293 and Sf21 cells it causes an increase in cAMP and Ca<sup>2+</sup> levels (He et al., 2014; Nagai-Okatani et al., 2016). Thus, the two ligands of the Bombyx TK receptor may activate distinct second messenger pathways, at least in vitro. Since ITP or ITPL receptors have not been identified in any other species besides Bombyx, it remains to be determined if this phenomenon is widespread amongst insects, or whether it is just restricted to Bombyx.

### TKs and Their Receptors in Deuterostome Invertebrates

TK-like receptors have also been mined from the genomes and transcriptomes of invertebrate deuterostome phyla such as Cephalochordata (e.g., Branchiostoma floridae), Hemichordata (e.g., Saccoglossus kowalevskii) and Echinodermata (e.g., Asterias rubens) (Jekely, 2013; Mirabeau and Joly, 2013; Yañez-Guerra et al., 2018). Phylogenetic analysis suggests that TK-like receptors and luqin/RYamide-type receptors arose by gene duplication in a common ancestor of the Bilateria (Yañez-Guerra et al., 2018). Precursors encoding TK-like peptides have also been predicted in the starfish, Asterias rubens, and brittle stars (**Figure 1** and **Supplementary Table S1**) (Semmens et al., 2016; Zandawala et al., 2017). However, the predicted TK-like peptides from A. rubens with an XXGL/IFamide C-terminus diverge substantially from FXGXRamide peptides of invertebrates, as well as from TKs of the protochordate Ciona intestinalis and vertebrates (FXGLMamide) (Semmens et al., 2016). Interestingly, the C-terminus of A. rubens (XXGL/IFamide) is somewhat similar to the proposed C. elegans tachykinin-like peptides (XMVRFamide) (Palamiuc et al., 2017), with peptides of both species having an Famide C-terminus and lacking the conserved phenylalanine residue in 5th position from the C-terminus. However, as mentioned above, the C. elegans TK-like receptor is more similar to RYamide/Luqin-like receptors, so it remains to be determined whether the predicted TK-like peptides in echinoderms are bona fide endogenous ligands for the echinoderm TK-like receptors. Moreover, sequences encoding TK-like receptors have been identified in the genomes and/or transcriptomes of hemichordates and cephalochordates (Jekely, 2013; Mirabeau and Joly, 2013; **Figure 6**). However, TK-like peptides have not yet been identified in these taxa. 
Perhaps the difficulty in discovering these peptides can be attributed to substantial diversification in the canonical sequence, which would render the homology-based search protocols ineffective.
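The limitation of motif-based searches can be illustrated with a minimal classifier built on the C-terminal consensus sequences mentioned in this review. This is a hedged sketch and not the search protocol used in the cited studies: the regular expressions are literal translations of the consensus motifs (FXGLMamide, FXGXRamide, FXSWGamide and the Drosophila natalisin consensus FX1X2X3Ramide), amidation is not modeled, and all names in the code are hypothetical. The C-terminal fragments FVGLL (zebrafish NKBb), FFATR (NTL4) and FVAVR (DTK-6) are taken from the text; the substance P sequence is the commonly cited one.

```python
import re

# Ordered list of (label, pattern). Order matters: FXGXR is tested before
# the broader natalisin-type FXXXR pattern, which would also match it.
MOTIFS = [
    ("FXGLM", re.compile(r"F.GLM$")),   # chordate-type TKs
    ("FXGXR", re.compile(r"F.G.R$")),   # protostome-type TKs
    ("FXSWG", re.compile(r"F.SWG$")),   # leucokinins (not homologous to TKs)
    ("FXXXR", re.compile(r"F...R$")),   # natalisin consensus (Drosophila)
]


def classify_c_terminus(peptide):
    """Return the first matching motif label, or None if nothing matches."""
    # Keep upper-case residue codes only (drops e.g. a trailing 'a' for amide).
    seq = "".join(ch for ch in peptide if ch.isupper())
    for label, pattern in MOTIFS:
        if pattern.search(seq):
            return label
    return None


if __name__ == "__main__":
    for pep in ["RPKPQQFFGLM", "APLAFVGLRa", "FFATR", "FVAVR",
                "FVGLL", "SFDRMGGTEFGLM"]:
        print(pep, "->", classify_c_terminus(pep))
```

In this sketch, the NKBb C-terminus (FVGLL) and the C. elegans NPL-8 derived peptide are returned as unclassified, although both are discussed in the TK context above. A strict consensus pattern therefore misses even modestly divergent sequences, which is in line with the difficulty of finding TK-like peptides in taxa where the canonical motif has diversified.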

### EVOLUTION OF TACHYKININ SIGNALING COMPONENTS

We show a cladogram of TKs (**Figure 6**) and a phylogenetic analysis of their receptors (**Figure 7**) in the animal kingdom, as well as the occurrence of natalisin signaling in some groups. TK signaling is evolutionarily ancient (**Figure 6**) and is one of several peptide families that emerged before the split of deuterostomes and protostomes (Jekely, 2013; Mirabeau and Joly, 2013). A few recent studies suggest that it might be more ancient than previously thought. TK-like receptors were recently found in genomes of Xenacoelomorpha, which is a sister group of Nephrozoa (comprising deuterostomes and protostomes) (Thiel et al., 2018). However, no TK-like ligands were identified in these genomes. A TK-like GPCR has also been predicted in the genome of the sea anemone Nematostella vectensis, but this receptor is more closely related to other Nematostella neuropeptide GPCRs than it is to bilaterian TK receptors (Krishnan and Schiöth, 2015; Thiel et al., 2018). Most protostomes have a single TK receptor but protostomian TK precursors encode multiple TK peptides (**Figure 7**). Thus, it appears that protostomian TKRs can all be activated by the different TKs, as shown already in Drosophila and Bombyx (Birse et al., 2006; Jiang et al., 2013; Nagai-Okatani et al., 2016). Our phylogenetic analysis shows that a single TK-like receptor is also found in tardigrades (**Figure 7**; Koziol, 2018). Interestingly, tardigrades have 2 TK-like precursors, one of which encodes a peptide with sequence similarity to TKs and another one, which encodes an NTL-like peptide (**Figure 5**; Koziol, 2018). This suggests that NTL signaling may have arisen by the duplication of the TK gene first and a subsequent duplication and diversification of its receptor. Interestingly, the spider mite genome encodes multiple NTL-like receptors and a single TK receptor. In addition, it possesses two TK-like precursors, one of which contains TK-like and NTL-like peptides and another precursor with only TK-like peptides (**Figure 5**). This perhaps indicates a more advanced point in the diversification of the NTL-signaling, as the receptors now seem to have duplicated. Additional genomes of basal arthropods, onychophorans and tardigrades need to be examined to determine the nature of TK-like and NTL-like signaling present in these animals before we can establish the precise lineage in which the NTL-signaling arose.

In deuterostomes, at least ancient Olfactores (vertebrates and ascidians) acquired a TK receptor that recognized a TK harboring the C-terminal FXGLMamide motif. The three subtypes of TK receptors, namely NK1R, NK2R, and NK3R, appear to have arisen following the whole genome duplications in the vertebrate lineage. Furthermore, these subfamilies might have acquired ligand selectivity during their diversification along with the generation of TK subtypes. The current missing pieces are echinoderm, acorn worm, and amphioxus counterparts, because canonical TKs have not yet been identified in genomes and/or transcriptomes of these deuterostome invertebrates. However, multiple TKRs are present in echinoderms and hemichordates suggesting additional independent gene duplication events within these lineages. In **Supplementary Figure S1** we show multiple sequence alignments of select TK and natalisin receptors.

### TACHYKININS IN INVERTEBRATES, DISTRIBUTION AND FUNCTIONS

### Overview of Functional Diversity From Early Studies

The first TKs isolated from insects were purified with the aid of a hindgut contraction assay that had also been utilized for first discovery of numerous other insect neuropeptides (Holman et al., 1990, 1991; Schoofs et al., 1990b). TKs from the annelid worm Urechis unicinctus were also purified with the aid of a muscle contraction assay (Ikeda et al., 1993). Thus, it was shown early that TKs are myostimulatory on a variety of muscles in the body wall, oviduct, foregut, hindgut, as well as heart [see (Ikeda et al., 1993; Schoofs et al., 1993; Nässel, 1999; Sliwowska et al., 2001)]. Examples of other functions established before employment of genetic tools are modulation of network activity in the stomatogastric ganglion of crustaceans [(Blitz et al., 1995), reviewed in Nusbaum and Blitz (2012), Nusbaum et al. (2017)], activation of dorsal unpaired median neurons in locust (Lundquist and Nässel, 1997), stimulation of release of adipokinetic hormone from locust corpora cardiaca (Nässel et al., 1995b), diuretic action on Malpighian tubules of locust L. migratoria and moth Manduca sexta (Skaer et al., 2002; Johard et al., 2003) and presynaptic inhibition of crayfish photoreceptors, likely as a co-transmitter of GABA (Glantz et al., 2000). From numerous in vitro studies, there is no evidence that the different paracopies of TKs in a species have any major differential activities or functions, except possibly DTK-6 in Drosophila, but it should be noted that the presence of this mature peptide has not been verified by mass spectrometry. Additional functions of TKs discovered using various approaches are discussed separately in the context of different species in sections "Distribution and Function of TKs in Invertebrates" and "Functional Roles of TKs in Drosophila, Genetic Advances".

### Distribution and Function of TKs in Invertebrates

Early work used antisera to substance P and other vertebrate tachykinins to localize presumptive TK neurons in the CNS of several invertebrates, as summarized in Nässel (1999). It should be noted that the earliest of these studies were performed before neuronal/intestinal TKs had been isolated and sequenced in invertebrates. In retrospect, it appears that of the TK antibodies used during this era, one monoclonal antibody (Cuello et al., 1979) actually recognizes the invertebrate TKs [see (Blitz et al., 1995; Johansson et al., 1999; Nässel, 1999)], whereas the polyclonal ones, except anti-Kassinin (Lundquist et al., 1994), seem to label other epitopes, at least in insects. Thus, TK distribution in several crustaceans (Sandeman et al., 1990; Schmidt and Ache, 1994; Blitz et al., 1995; Johansson et al., 1999), and the horseshoe crab Limulus polyphemus (Chamberlain and Engbretson, 1982; Mancillas and Selverston, 1985) is likely to be correctly described in these earlier studies. The first antisera to invertebrate TKs were raised against locust LomTK-I (Nässel, 1993) and LomTK-II (Vitzthum and Homberg, 1998), blowfly CavTKII (Nässel et al., 1995a) and cockroach LemTRP1 (Winther and Nässel, 2001) and these were subsequently used in a large number of invertebrate species, some of which are outlined below.

### Insects

The neuronal distribution of TK immunoreactivity in the CNS is in general fairly well conserved between the insects studied, whereas in other arthropods only some features seem to be shared with insects. Characteristic of TK distribution in insects is the presence of TK in neuronal processes in the antennal lobes, central complex, pars intercerebralis, dorsolateral protocerebrum, optic lobes and subesophageal zone. First, we will outline the neuronal localization of TK in Drosophila and a few other insects, then move on to crustaceans and snails. For these organisms, except Drosophila, we also briefly describe TK functions. Functional roles of TK signaling in Drosophila are described in a separate section.

In the adult Drosophila brain, in situ hybridization and immunolabeling revealed that there are more than 160 TK-expressing neurons that can be divided up into 11 bilateral groups and one unique pair (**Figure 8A**; Winther et al., 2003). Ten large lateral neurosecretory cells (ITPn) express TK, as well as two other peptides (short neuropeptide F, sNPF, and ion transport peptide, ITP) (Kahsai et al., 2010a). The other TK neurons are interneurons of different kinds innervating the fan-shaped body of the central complex, the antennal lobes, the optic lobes, pars intercerebralis, dorsal lateral protocerebrum and the subesophageal zone (Winther et al., 2003). Some of these TK neuron clusters have been functionally investigated by genetic manipulations (colored cells in **Figure 8A**), whereas the functions of other clusters (black cells in **Figure 8A**) remain obscure. Functional aspects will be discussed later in a separate section. Details of some of the TK neurons are shown in **Figure 8B**. In the third instar larva there are only about 44 neurons in the entire CNS that are consistently labeled by TK antisera; 32 of these are in the brain and SEZ (**Figure 8C**) (Siviter et al., 2000; Winther et al., 2003). In both larvae and adults, enteroendocrine cells of the midgut and anterior hindgut express TK (Siviter et al., 2000; Veenstra et al., 2008; Veenstra, 2009).

In some cells in Drosophila, TK is colocalized with other neuropeptides or GABA (**Supplementary Table S4**): in neurosecretory cells (ITPn) with sNPF and ITP, in local neurons of the antennal lobe with either GABA, allatostatin-A or myoinhibitory peptide (MIP; also known as allatostatin-B) and in midgut enteroendocrine cells with either neuropeptide F or diuretic hormone 31 (Veenstra et al., 2008; Ignell et al., 2009; Carlsson et al., 2010; Kahsai et al., 2010a). In addition, single cell transcriptome sequencing of brain neurons shows that TK is coexpressed with glycoprotein hormone beta 5 (GPB5) (Davie et al., 2018). A more systematic screen of colocalized substances in the insect CNS would probably make this list longer.

The blowfly Calliphora vomitoria displays a neuronal distribution of TK very similar to that in Drosophila (Lundquist et al., 1994). Studies of TK distribution in other insects reveal many similarities, except that the numbers of neurons in the different clusters vary between species, as shown next.

In the brain of the honeybee Apis mellifera, TK distribution was mapped by in situ hybridization (Takeuchi et al., 2004). Neuronal cell bodies were revealed in association with the central complex, antennal lobes and optic lobes, as in other insects, but also associated with the mushroom body calyces. The intrinsic mushroom body neurons were identified as the small-type Kenyon cells (class I and II) (Takeuchi et al., 2004). Later, immunolabeling also confirmed the presence of TK in Kenyon cells, including their axons in the lobes (Heuer et al., 2012). The distribution of TK transcript is spatially similar irrespective of sex, caste, or division of labor of workers; quantitatively, however, transcript levels are higher in queens and foragers than in nurse and drone bees (Takeuchi et al., 2004). In bees, the TK in mushroom bodies may be involved in regulation of foraging and social behaviors (Takeuchi et al., 2004; Brockmann et al., 2009; Boerjan et al., 2010). Also in other hymenopterans (Oya et al., 2017) and in the beetle Tribolium castaneum (Binzer et al., 2014), TK was identified in major subpopulations of Kenyon cells, but in other studied insects there are so far no reports of such neurons producing TKs. In the honeybee, quantification of neuropeptides by mass spectrometry was performed after foraging for nectar or pollen (Brockmann et al., 2009). TK was among the three peptides whose levels were most affected in association with foraging for nectar or pollen.

In the moth Spodoptera litura, at least 80 TK neurons were detected in the adult brain, and the innervation of the central complex, the antennal lobes, pars intercerebralis, dorsal lateral protocerebrum and the subesophageal zone is similar to that in Drosophila (Kim et al., 1998; see **Supplementary Figure S2**). Also a pair of large descending neurons was identified. A special feature of the moth is the presence of TK expression in 8 median neurosecretory cells with axon terminations in the retrocerebral complex and anterior aorta (Kim et al., 1998). Similar median neurosecretory cells (MNCs) were also seen in the moth Heliothis virescens (Zhao et al., 2017), and the beetles Tenebrio molitor and Zophobas atratus (Sliwowska et al., 2001). In another moth Manduca sexta, the TK distribution in the brain (except the MNCs) and intestine was found similar to S. litura and it was shown that TK stimulates secretion in the Malpighian tubules in vitro (Skaer et al., 2002).

The brains of the cockroach Leucophaea maderae and the locust Locusta migratoria contain far larger numbers of TK neurons, but the innervation pattern of brain regions is similar to Drosophila and moth (Nässel, 1993; Muren et al., 1995; Vitzthum and Homberg, 1998). In the L. maderae brain (without optic lobes), about 360 TK neurons were found (Muren et al., 1995; **Supplementary Figure S3**), and about 800 in the entire brain of L. migratoria (Nässel, 1993). In contrast to Drosophila, there are efferent TK neurons in the cockroach abdominal ganglia that innervate the hindgut and TK neurons in the stomatogastric ganglia that supply extensive axon terminations over the foregut and midgut (Muren et al., 1995; Nässel et al., 1998). In locusts, TK neurons in the lateral neurosecretory cell group send axons to the corpora cardiaca where they contact cells producing adipokinetic hormone (AKH) and it was shown that TKs induce AKH release in vitro (Nässel et al., 1995b). These neurosecretory cells may be analogous to the ITPn neurons in Drosophila (Kahsai et al., 2010a), although a role in hormone release has not yet been analyzed in the fly. In both locust and cockroach midgut, endocrine cells express TK, and in the locust there are also TK-producing endocrine cells in the six midgut ampullae at the base of the Malpighian tubules (Muren et al., 1995; Winther and Nässel, 2001). TKs stimulate secretion in locust Malpighian tubules (Johard et al., 2003). Calcium-dependent release of TK from the cockroach and locust intestine could be induced by potassium application, and TK was demonstrated in hemolymph, suggesting that hormonal release of intestinal TK regulates tubule secretion (Winther and Nässel, 2001). In locusts, several cases of colocalization of TK and other peptides have been demonstrated. The endocrine cells of the ampullae (but not those in the rest of the midgut) coexpress TK, diuretic hormone 44 (DH44) and FMRFamide-like peptide, and TK was shown to stimulate secretion in locust Malpighian tubules together with DH44 (Johard et al., 2003). In certain central complex neurons there is co-expression of TK and leucokinin and in others TK and octopamine or GABA (Vitzthum and Homberg, 1998). Finally, sensory neurons of the metathoracic legs co-express TK, allatotropin, FMRFamide-like peptide and probably acetylcholine (Persson and Nässel, 1999; **Supplementary Table S4**).

Another insect studied in some detail is the hemipteran bloodsucking bug Rhodnius prolixus, where a total of about 250 TK immunoreactive neurons were found in the brain (Kwok et al., 2005). These are distributed in the optic lobes and in several other clusters in the midbrain. Interestingly, no TK-containing enteroendocrine cells were detected in this species, in contrast to many other studied insects, but the hindgut is innervated by TK axons (Kwok et al., 2005). These TK axons also express leucokinin, another myostimulatory peptide (Haddad et al., 2018). Rhodnius TKs were shown to increase the basal tonus of the hindgut, but also to increase the frequency and amplitude of peristaltic contractions of the salivary gland, a tissue that displays high levels of TK transcript, but no immunoreactive TK (Haddad et al., 2018).

Some further functions of TKs in insects other than Drosophila have been explored that might shed light on TK signaling in general. In female burying beetles, Nicrophorus vespilloides, neuropeptides were quantified in solitary virgins, individuals actively parenting or post-parenting solitary adults to identify neuropeptides associated with parenting (Cunningham et al., 2017). TK was found as one of the few peptides associated with active parenting. In several insects, including Drosophila, oriental fruitfly, cockroaches and moths, TK seems to play a role in modulation of olfactory sensory processing (Ignell et al., 2009; Jung et al., 2013; Fusca et al., 2015; Ko et al., 2015; Gui et al., 2017b; Lizbinski and Dacks, 2017). Other functions have not been investigated in multiple species; however, immunocytochemistry suggests some conservation of the distribution of TK in neurons of specific brain centers, and intestine in insects and crustaceans that might also reflect functional conservation.

### Crustaceans

The TK distribution in the brain of a few crayfish, lobster and crab species has been studied, mostly with a monoclonal antibody to substance P (Goldberg et al., 1988; Sandeman et al., 1990; Schmidt and Ache, 1994; Schmidt, 1997a,b), but one study also employed antiserum to a cockroach TK that is nearly identical to crab TK (Johansson et al., 1999). These studies have mostly focused on TK neurons in the olfactory centers of the brain, but also the stomatogastric nervous system.

As seen in **Supplementary Figure S4**, the brain of crayfish possesses a pair of TK interneurons with large cell bodies and extensive processes in the anterior deutocerebrum and varicose branches among the cell bodies of a group of olfactory interneurons in the lateral deutocerebrum (Sandeman et al., 1990; Johansson et al., 1999). There is another pair of TK neurons with deutocerebral cell bodies and processes in the neuropil of the olfactory lobe, as well as larger numbers of small TK neurons with processes in the olfactory and accessory lobes (Sandeman et al., 1990; Johansson et al., 1999). TK neurons were shown in all the neuropils of the optic lobes of the crayfish and specifically a set of TK- and GABA-expressing amacrine cells was identified in the lamina ganglionaris (Glantz et al., 2000). This study shows that application of GABA and TK to photoreceptor terminals in the lamina induces a short-latency, dose-dependent hyperpolarization with a decay time of a few seconds. TK also acts over several minutes to reduce the photoreceptor potential and to potentiate the action of GABA (Glantz et al., 2000). In the American lobster, the distribution of TK processes is in general similar to that seen in insects, with TK immunolabeling in the protocerebral bridge, central body, olfactory (antennal) lobes, and anterior median protocerebral neuropil (Langworthy et al., 1997). As in insects, midgut enteroendocrine cells in crabs also express TK (Christie et al., 2007).

In decapod crustaceans the stomatogastric nervous system (STN) consists of 25–30 neurons (depending on species) and controls handling of ingested food. Most of these neurons contribute to the activity in one or both of the neural networks in the STN, which regulate (1) gastric mill (chewing) or (2) pyloric circuit (pumping and filtering of food that has been chewed) [see (Nusbaum et al., 2017)]. A pair of neurons (MCN1) that innervate the stomatogastric ganglion produces the TK CabTRP-Ia (Blitz et al., 1995; Christie et al., 1997). The MCN1s also produce GABA and the peptide proctolin (Nusbaum et al., 2017). It was shown that CabTRP-Ia and GABA released from MCN1 are critical for activation of the gastric mill rhythm, whereas MCN1 release of CabTRP-Ia and proctolin predominantly excites the pyloric rhythm (Nusbaum et al., 2017).

### Mollusks

A few mollusks have been investigated with respect to TK distribution (using antiserum to locust TK). These include the pond snail Lymnaea stagnalis, the pulmonate terrestrial snail Helix pomatia and the freshwater bivalve, Anodonta cygnea (Elekes and Nässel, 1994; Elekes et al., 1995). In L. stagnalis, about 180 TK neurons were found, distributed in cerebral and pedal ganglia and TK axons were detected in the intestine (Elekes et al., 1995). About 900 TK neurons were seen in H. pomatia with about 80% of these in cerebral ganglia, whereas in A. cygnea only a smaller number of TK neurons was detected in cerebral, pedal and visceral ganglia (Elekes and Nässel, 1994; Elekes et al., 1995). In the snails, a large number of TK neurons are located in procerebrum of the cerebral ganglia (**Supplementary Figure S5**). The procerebrum is an association center for olfactory information similar to the mushroom bodies of insects and thus TKs seem to be involved in olfactory processing also in mollusks (Elekes and Nässel, 1994). Recently, a TK receptor related to the Drosophila DTKR and responding to endogenous TKs was identified in the bivalve mollusk Crassostrea gigas (Dubos et al., 2018). In the snail Helix, the neuronal membrane effects of locust LomTK-I, and anodontatachykinin, were either depolarizing or hyperpolarizing depending on neuron-type, and voltage-clamp experiments revealed a role of Ca- or K-currents in these peptide effects (Elekes et al., 1995).

### Nematode Worms

In Caenorhabditis elegans, the gene FLP-7 was considered to encode a TK precursor ortholog (Palamiuc et al., 2017). However, as mentioned in section "TKs and Their Receptors in Insects and Other Protostome Invertebrates," this gene encodes FMRFamide-like peptides (Mertens et al., 2006) and the proposed receptor gene (NPR-22) is only remotely related to TK receptors, and more closely related to the RYamide/Luqin receptor (Ohno et al., 2017; Yañez-Guerra et al., 2018; see **Figure 6**). Nevertheless, since the signaling system was referred to as a TK system (Palamiuc et al., 2017) we summarize the findings here. It was shown that FLP-7 is expressed in several tissues, including the head, the nervous system, and the sensillum (wormbase.org). At the cellular level, a fluorescent transgenic reporter line revealed that FLP-7 is expressed in the ALA motor- and the AVG interneurons and in the ASI sensory neurons, and that the reporter is secreted into the "circulation" (Palamiuc et al., 2017). The ALA motor neuron has been shown to regulate locomotion, the AVG neuron influences ventral cord development, and the ASI sensory neuron pair regulates whole body physiology during development, controls lifespan via neurohormones, and regulates 5-HT-induced fat loss (Palamiuc et al., 2017). FLP-7 was shown to act in the intestine to induce lipase activity and fat loss (Palamiuc et al., 2017). In another study, the ligands of NPR-22 were found to be the luqin-like peptides LURY-1 and 2 (AVLPRYa and PALLSRYa) encoded on the gene Y75B8A.11 (Ohno et al., 2017). The LURY peptides are secreted from pharyngeal neurons and regulate feeding, lifespan, egg-laying and locomotor activity (Ohno et al., 2017). In summary, no clear-cut TK signaling system has been discovered in C. elegans so far.

### Ambulacraria

Tachykinin expression has not been mapped yet in echinoderms or hemichordates.

### Functional Roles of TKs in Drosophila, Genetic Advances

With the introduction of the binary Gal4-UAS system (Brand and Perrimon, 1993) it became possible to genetically target components of the TK signaling system spatially and temporally, and thus knock down or increase activity in a neuron-specific fashion. **Table 2** summarizes the known functions of TKs in Drosophila. A first study utilized Tk-RNAi to broadly knock down TK production in neurons by means of ubiquitously expressed drivers (Elav- and tubulin-Gal4) and monitored effects on olfaction and locomotion (Winther et al., 2006). The flies with globally reduced TK signaling displayed decreased responses to certain odors and were hyperactive in locomotor assays. Subsequent studies describe more targeted manipulations where TK functions in smaller populations of neurons could be revealed.

It was found that the TK receptor DTKR is expressed by olfactory sensory neurons (OSNs) of the Drosophila antennae and TK in subpopulations of the local neurons (LNs) of the antennal lobe (Ignell et al., 2009). TK signaling from LNs to OSNs provides presynaptic inhibitory feedback by suppressing calcium and synaptic activity (Ignell et al., 2009). An ensuing study revealed further details on the role of TK signaling in olfaction and food search (Ko et al., 2015). It was shown that in hungry flies where circulating levels of insulin-like peptide (ILP) are low there is an upregulation of the DTKR in OSNs carrying specific odorant receptors (Or42b and Or85a) (**Figure 9**). In the antennal glomerulus DM5, which conveys food odor aversion (negative valence), upregulation of the inhibitory DTKR in a hungry fly leads to increased TK signaling and thus suppressed depolarization and as a consequence decreased synaptic activation of antennal lobe projection neurons (PNs) leading to increased food attraction (Ko et al., 2015). When the fly has fed, and circulating insulin is high, the DTKR expression decreases due to activation of the insulin receptor in OSNs, synaptic signaling increases and food aversion is augmented (**Figure 9**). In glomerulus DM1 (positive valence; wired for food odor attraction) innervated by Or42b expressing OSNs, enhanced signaling with sNPF increases food attraction in hungry flies with low circulating insulin (Ko et al., 2015). This enhanced signaling is caused by up-regulation of sNPF receptor expression on OSNs and strengthened synaptic activation of PNs (**Figure 9**). Together, peptidergic neuromodulation of the two odor channels (DM1 and DM5) ensures that hungry flies increase food search. Whereas it has been shown that sNPF facilitates cholinergic transmission in OSNs to PNs (Ko et al., 2015), it is not clear whether TK acts to modulate inhibitory GABA transmission in LNs (Ignell et al., 2009). Also in the cockroach Periplaneta americana (Jung et al., 2013) and the oriental fruitfly Bactrocera dorsalis (Gui et al., 2017b), TK signaling modulates olfactory sensitivity, and the presence of TK in antennal lobe neurons in all studied insects may suggest a conserved role in olfaction.

TABLE 2 | Functions of TKs in Drosophila.

<sup>1</sup>In brackets: genetic interference with peptide (Tk) or receptor (Dtkr). <sup>2</sup>Ablation of TK-expressing ICN neurons, activation or inactivation of neurons. <sup>3</sup>Interference with signaling in TK expressing endocrines. OSN, olfactory sensory neurons; AL, antennal lobe; SEZ, subesophageal zone; IPC, insulin-producing cells; ITPn, ITP neurons; NSC, neurosecretory cell; ICN, IPC contacting neurons; EGF, epidermal growth factor; IMD, immune-deficiency pathway; DILP, Drosophila insulin-like peptide.

In the central complex of Drosophila, TK is found in a few sets of neurons (light red neurons in **Figure 8A**). In assays of explorative walking, TK knockdown in some of these neurons resulted in flies with increased center zone avoidance, whereas knockdown in other neurons resulted in flies with increased activity-rest bouts (Kahsai et al., 2010b). Thus, TK in the central complex seems to be important for modulation of spatial orientation, activity levels, and temporal organization of spontaneous walking.

A small set of protocerebral TK neurons (light blue in **Figure 8A**) has been shown to regulate levels of aggression in male Drosophila (Asahina et al., 2014). These TK neurons are a small subpopulation of the numerous neurons that express the male splice form of fruitless (FruM+), a transcription factor that specifies male-specific behavior, including male aggression. Thus, a set of 4 pairs of neurons in the brain designated Tk-GAL4FruM neurons controls the level of male-male aggression, but has no influence on male-female courtship behavior (Asahina et al., 2014). The same authors found that the Tk-GAL4FruM neurons may also be cholinergic (they express a marker for acetylcholine signaling) and that this neurotransmitter may thus play an additional role in the circuit.

Another male-specific TK circuit in Drosophila is involved in gustatory detection of an anti-aphrodisiac pheromone (CH503). Gustatory cells (Gr68a) in the forelegs respond to this pheromone and mediate signals to central brain circuits via 8 to 10 TK neurons located in the subesophageal zone and thereby suppress courtship (Shankar et al., 2015). It is not clear from this study which specific TK neurons in **Figure 8** they correspond to.

The insulin-producing cells (IPCs) of the Drosophila brain are modulated by several factors, including TK (Birse et al., 2011; Nässel and Vanden Broeck, 2016). The IPCs produce four insulin-like peptides (DILP1, 2, 3, and 5) and are known to regulate many aspects of development and adult physiology, such as growth, metabolism, stress responses, reproduction and lifespan [reviewed in Owusu-Ansah and Perrimon (2014), Nässel and Vanden Broeck (2016)]. Knockdown of the receptor DTKR in IPCs affected levels of dilp2 and dilp3 transcripts in these cells, increased the fly lifespan and diminished carbohydrate levels during starvation (Birse et al., 2011). Knockdown of the natalisin receptor (NTLR; CG6115; earlier known as NKD) had no effect on IPC activity and the TK cells acting on the IPCs were not identified (Birse et al., 2011). In a more recent paper, a pair of TK neurons was demonstrated in the Drosophila larva, which connect functionally to the IPCs (Meschi et al., 2019). These TK neurons (ICNs) are inhibitory on IPCs. Under protein-rich diet conditions the ICNs respond to growth-blocking peptides secreted from the larval fat body and this alleviates the inhibitory action on IPCs, and DILPs can be released to stimulate growth (Meschi et al., 2019). It is not completely clear from the images of this paper, but it appears as if the ICNs are the same as the descending neurons shown in **Figure 8C** (blue arrows), which also exist in the adults (DN in **Figure 8A**).

In larvae, a nociceptive pathway mediating thermal tissue damage signals was identified and shown to include TK and the receptor DTKR (Im et al., 2015). The DTKR receptor is expressed in the nociceptive sensory neurons and required for mediation of thermal hypersensitivity after tissue damage. A set of TK expressing interneurons in the ventral nerve cord mediates this presynaptic modulation of nociceptive sensory neurons (Im et al., 2015). Substance P is known for its role in modulation of nociceptive sensory signals in the dorsal horn of the spinal cord [see (Hökfelt et al., 2001; Steinhoff et al., 2014)], suggesting a conserved role of tachykinin signaling, although the pathway and mechanisms differ.

A set of five pairs of large neurosecretory cells (ITPn in **Figure 8A**) produces TK, as well as ITP and sNPF. Targeted knockdown of TK (or sNPF) in these cells results in flies that display decreased survival time when exposed to desiccation or starvation, and also suffer increased water loss at desiccation (Kahsai et al., 2010a). ITP acts as an antidiuretic hormone (Galikova et al., 2018), but it is not likely that TK or sNPF are released as circulating hormones from the ITPn cells (Kahsai et al., 2010a). Instead, these peptides might act locally, either presynaptically on ITPn axon terminations, or on other brain neurons/neurosecretory cells to modulate antidiuretic signals or metabolic stress responses. A similar local action of sNPF released from lateral neurosecretory cells in the brain has been demonstrated; it was found that the IPCs in the brain and the AKH-producing cells in the CC were directly regulated by locally released sNPF (Kapan et al., 2012; Oh et al., 2019).

In gut endocrine cells (EECs) of Drosophila, TK was shown to influence lipid homeostasis by controlling lipid production in enterocytes of the midgut (Song et al., 2014). These TK- (and DH31-) producing EECs are nutrient-sensing and can be activated by the presence of circulating dietary proteins and amino acids (Park et al., 2016). The EECs have also been shown to play a role in the innate immune system and development of Drosophila (Kamareddine et al., 2018). Activating the immune deficiency (IMD) pathway in EECs triggers TK signaling, leading to DILP3 upregulation in the gut, mobilization of lipids, increased insulin signaling, and effects on organismal development and growth. Thus, the gut microbiota can influence growth via the immune system and TK and insulin signaling (Kamareddine et al., 2018). TK was also shown to activate peristalsis in the midgut (Siviter et al., 2000), possibly by local paracrine signaling.

### Functional Roles of TKs in Protochordates and Non-mammalian Vertebrates

Also in several non-insect invertebrates and non-mammalian vertebrates, TKs were found to exhibit contractile activity on muscles in the digestive tract (Satake and Kawada, 2006; Satake et al., 2013; Steinhoff et al., 2014). In the following we will discuss additional roles of TKs in sea squirts (ascidians) and fish, exemplified by zebrafish.

### Ascidians

Aoyama et al. (2012) demonstrated that CiTK induces growth of follicles in Ciona during late stage-II (vitellogenic stage) to stage-III (post-vitellogenic stage) via up-regulation of gene expression and the enzymatic activities of follicle-processing proteases: cathepsin D, chymotrypsin, and carboxypeptidase B1. This is consistent with the finding that CiTKR is expressed exclusively in test cells (functional counterparts of vertebrate granulosa cells) residing in late stage-II follicles (Aoyama et al., 2012). Moreover, Ci Cathepsin D, co-localized with CiTKR in test cells, is initially activated, and Ci Carboxypeptidase B1 and Ci Chymotrypsin, localized in follicular cells, are activated 1 h later (Aoyama et al., 2012). These findings provide evidence for a novel tachykininergic follicle growth pathway. In addition, the CiTK-induced follicle growth is suppressed by a Ciona-specific neuropeptide, CiNTLP6, via downregulation of the three aforementioned proteases (Kawada et al., 2011). It would be interesting to reveal whether the tachykininergic regulation of follicle growth is conserved, at least in part, in vertebrates or other invertebrates.

### Teleost Fish

The roles of NKB in reproductive functions are of interest in both teleosts and mammals. NKB is expressed in the hypothalamus of mammals (Satake and Kawada, 2006; Steinhoff et al., 2014). Moreover, NKB is colocalized with kisspeptin and dynorphin A in KNDy neurons in the arcuate nuclei. KNDy neurons and NKB are responsible for the generation of GnRH pulsatility in the hypothalamus, which plays a central role in reproductive functions via induction of secretion of gonadotropins (LH and FSH) from the pituitary to the gonads (Lehman et al., 2010; Navarro, 2012). Interestingly, some mutations were detected in the genomic sequences of human Tac3 and Tacr3 in a portion of patients with hypogonadotrophic hypogonadism (Topaloglu et al., 2009; Chen et al., 2018). NKB is likely to downregulate production of LH and FSH in zebrafish, tilapia and goldfish (Biran et al., 2012; Qi et al., 2015; Hu et al., 2017; Chen et al., 2018; Liu et al., 2019). Several of these studies also indicated that NKB and/or NKF (identical to NKB-related peptide) downregulate the expression of kiss2, which is a homolog of mammalian kisspeptin. The role of mammalian kisspeptin in induction of GnRH synthesis and release may suggest that kisspeptin 2 also upregulates GnRH synthesis and release in teleosts. However, conservation of such kisspeptin 2-directed GnRH regulation in teleosts is not likely, since kisspeptin 2 seems not to be involved in reproduction in teleosts (Nakajo et al., 2017). Collectively, these findings suggest that the biological roles of NKB and NKF in reproduction in teleosts are distinct from those in mammals. In addition, SP and NKA were found to upregulate gene expression and release of LH, prolactin, and somatolactin α in carp pituitary cells (Hu et al., 2017). Interestingly, short-term SP treatment (3 h) induces LH release, but long-term SP treatment attenuates LH gene expression (Hu et al., 2017). Thus, in teleosts SP and NKA are important in reproductive functions.

### CONSERVED ROLES OF TACHYKININ SIGNALING IN THE ANIMAL KINGDOM

Some of the functional roles of TKs that have been described in some detail in earlier sections are evolutionarily conserved, at least in general terms. By general terms we mean that for instance a role in nociception has been found for TKs both in Drosophila (Im et al., 2015) and in mammals [see (Onaga, 2014; Steinhoff et al., 2014; Zieglgänsberger, 2019)], but the neuronal pathways and mechanisms are quite different. In a similar fashion, TKs are acting as cotransmitters in many neuronal circuits and thus play roles in for instance: modulation of olfactory sensory signaling together with GABA in Drosophila (Ignell et al., 2009; Ko et al., 2015) and mammals (Olpe et al., 1987), modulation of rhythm generating motor networks in crustaceans (Nusbaum et al., 2017) and lampreys (Parker and Grillner, 1998), aggression in Drosophila (Asahina et al., 2014) and mammals (Felipe et al., 1998; Katsouni et al., 2009), as well as roles in learning and memory circuits in honey bees (Takeuchi et al., 2004; Brockmann et al., 2009; Boerjan et al., 2010) and mammals (Lénárd et al., 2018). Furthermore, TKs are involved in regulation of several aspects of intestinal function, including electrolyte and fluid secretion in insects (Johard et al., 2003; Veenstra et al., 2008; Lemaitre and Miguel-Aliaga, 2013; Song et al., 2014) and mammals (Hökfelt et al., 2001; Steinhoff et al., 2014), in regulation of gustatory receptors in Drosophila (Shankar et al., 2015) and mammals (Onaga, 2014) and in control of hormone release in insects (Nässel et al., 1995b; Birse et al., 2011; Meschi et al., 2019) and vertebrates (Hu et al., 2014; Steinhoff et al., 2014; Zhang et al., 2019).

Roles of TKs in reproductive functions have been demonstrated in vertebrates (Satake et al., 2013; Steinhoff et al., 2014), but not yet in insects or other invertebrates. However, in insects and other arthropods natalisins seem to be important in reproductive behavior, as outlined in the next section (Jiang et al., 2013; Gui et al., 2018).

### NATALISINS, A SISTER GROUP OF TACHYKININS IN ARTHROPODS

A novel peptide precursor gene that encodes multiple copies of peptides that were designated natalisins (NTLs) was discovered in Drosophila, Tribolium castaneum, and Bombyx mori; these have a consensus sequence FXXXRamide (Jiang et al., 2013). The name NTL is derived from the functional role of the peptide in reproduction (Latin word natalis for birth) (Jiang et al., 2013). The NTLs have so far only been identified in arthropods and tardigrades, and the peptides display minor similarities to TKs. However, the NTL receptor (NTLR) was previously identified as a TK receptor (CG6115; TakR86C; NKD) (Monnier et al., 1992; Poels et al., 2009), suggesting that NTLs are ancestrally related to arthropod TKs (Jiang et al., 2013). In fact, phylogenetic analysis suggests that NTL signaling arose through duplication of the TK signaling early in the arthropod lineage (Jiang et al., 2013; see **Figure 6**). However, TK-like precursors with NTL-like peptides are found in the spider mite (chelicerate) as well as tardigrades as shown in **Figures 4**, **5**, suggesting that the NTL signaling might also be present outside arthropods (Veenstra et al., 2012; Koziol, 2018). It was noted that in the centipede Strigamia maritima (Myriapoda) there is no NTL gene (Veenstra, 2016) and in the spider mite Tetranychus urticae, there is no separate NTL gene in the genome (Veenstra et al., 2012). However, two TK genes were annotated in T. urticae and on one of these precursors two of the three putative mature peptides are similar to NTL, and one is a TK; the second precursor encodes two TKs and an unrelated peptide, but no NTL (**Figure 4**). Thus, with this mix of TKs and NTLs on the spider mite genes, it was suggested that TK and NTL divergence started by internal events on duplicated TK genes and resulted in the evolution of a separate NTL signaling system (Jiang et al., 2013). Additional genomes/transcriptomes of basal arthropods need to be examined to substantiate this claim. As seen in **Supplementary Table S5**, there are five paracopies of natalisins in Drosophila, 7 in Anopheles aegypti, 11 in Bombyx mori and 15 in Manduca sexta, 2 in Tribolium castaneum and only one each in the tardigrades Hypsibius dujardini and Ramazzottius varieornatus (Jiang et al., 2013; Koziol, 2018). The Drosophila peptides (DmNTL1-5) range from 15 to 24 residues and have a consensus C-terminus FXPXRamide (except DmNTL4).

In the brains of Drosophila, B. mori, T. castaneum and Varroa destructor, there are two pairs of identifiable NTL neurons with very similar locations and arborizations (designated ADLI and ICLI in each species) (Jiang et al., 2013, 2016). The Drosophila neurons are shown in **Supplementary Figure S6**. In the B. mori brain, there are two additional pairs of neurons in the subesophageal zone. Another study shows that in the oriental fruit fly Bactrocera dorsalis there are three pairs of NTL neurons (Gui et al., 2017a). Thus, in insects studied so far, the NTL system appears relatively simple and the neuronal branches do not seem to innervate any of the well-defined centers, such as antennal lobes, mushroom bodies, central complex or optic lobes (Jiang et al., 2013). There are a few additional segmental neurons in the ventral nerve cord. The brain ICLI neurons coexpress NTL, Ast-A and MIP (Diesner et al., 2018).

In Drosophila, genetic experiments revealed that NTL and the four NTL neurons are important for male mating success (Jiang et al., 2013). NTL-RNAi in NTL-Gal4 neurons reduces the male copulation success rate. Courtship behavior was affected only in the latency of courtship initiation in males. NTL-RNAi females also displayed reduced mating frequency, but did not actively reject males (Jiang et al., 2013). Silencing of the NTL-Gal4 neurons resulted in complete repression of mating in males, but had no effect in females. Manipulations of NTL neurons had no effect on egg laying; however, in T. castaneum, systemic NTL-RNAi in either sex resulted in reduced egg numbers after mating (Jiang et al., 2013). In the oriental fruit fly Bactrocera dorsalis, NTL and its receptor (NTLR) also play important roles in mating (Gui et al., 2017a, 2018). In this species, NTL signaling is required for regulation of mating frequency in both males and females.

### CONCLUSION AND PERSPECTIVES

We have shown that TKs are neuropeptides that emerged early in bilaterian lineages, but it is not clear what their ancestral form is since cnidarians and other non-bilaterians do not possess typical TKs, although bioinformatics has indicated presence of TK receptors [see (Krishnan and Schiöth, 2015; Hayakawa et al., 2019)]. It is also puzzling that no typical TKs have been identified in echinoderms, acorn worms, or amphioxus. Possibly this is due to species-specific diversification of TK sequences. Thus, important questions regarding the evolution and diversification of this signaling system remain unanswered. Nevertheless, it is clear that TK signaling is widespread and diverse among bilaterians and contributes to many vital functions. Some of these functions appear conserved over evolution, at least in general terms. It is important to stress that TKs seem to have multiple distributed (localized) functions in different neuronal circuits and commonly act as co-transmitters, and thus TK signaling is not likely to orchestrate global functions. Elucidation of TK functions in neglected phyla (such as echinoderms, xenacoelomorphs and cnidarians) can also provide clues on whether the mode of action of TK as a co-transmitter is an ancient or a more derived trait. This is important since besides TK, there are only a few other neuropeptides in protostomes (at least in arthropods) that seem to function mainly as co-transmitters, such as sNPF and proctolin (Nässel, 2018; Nässel and Zandawala, 2019). Did TKs evolve as primary paracrine peptide signals in basal phyla with simple nervous systems, and then diversify functionally to also confer plasticity to more complex neural circuits by providing neuromodulatory actions as co-transmitters? In organisms without nervous systems, such as Trichoplax adhaerens, peptides seem to act as primary messengers that induce simple behaviors (Nikitin, 2015; Senatore et al., 2017; Varoqueaux et al., 2018) and even in more evolved organisms many neuropeptides/peptide hormones seem to relay single global orchestrating actions [see (Nässel and Zandawala, 2019; Nässel et al., 2019)].

In mammals, TK signaling has received extensive attention due to its clinical importance, with roles for instance in pain, inflammation, cancer, depressive disorder and the immune system. Thus, the literature list is huge: searching PubMed for the term "substance P" returns more than 24,000 hits. This means that our coverage of mammalian TKs in this review is very superficial and incomplete. On the other hand, a search for e.g., "tachykinin in insects" yields about 200 hits. Our discussion of invertebrate TKs is therefore somewhat more detailed, but it certainly still provides only a sketchy picture of TK signaling, since functional studies are not yet that numerous.

The development of powerful genetic tools, not only in Drosophila and C. elegans, but also other organisms has improved the possibilities to analyze neuropeptide signaling down to single identified neurons or sets of neurons. Furthermore, with optogenetics and other strategies for temporal control of manipulations and elegant techniques for imaging neuronal connections or activity, we already see an increase in studies of invertebrate neuropeptides. For TKs, it is of importance to note that in the CNS these peptides seem to operate as local neuromodulators and/or co-transmitters. Many (if not most) TK expressing neurons may additionally signal with small molecule transmitters and, therefore, manipulations of TK signaling only remove one layer of the signal transfer.

Whereas many neuropeptides also have functions as circulating hormones [see (Nässel and Zandawala, 2019)], it seems like TKs do not in most organisms studied. In studies of cockroach, locust and Drosophila it was proposed that TKs are released into the circulation from gut endocrine cells to stimulate secretion in nearby Malpighian tubules (Winther and Nässel, 2001; Johard et al., 2003; Söderberg et al., 2011). A recent Drosophila study showed that gut TKs act locally and do not affect behavior, indicating that there is no signaling to the brain via the circulation (Song et al., 2014). If bona fide hormonal roles of TKs can be excluded, we can focus on their local actions, but we still face some difficulties due to the diversity of TK expressing neuronal systems and the co-expression of small molecule transmitters. Hopefully this review will trigger interest in TK signaling in invertebrates in spite of these challenges. It is obvious from the literature that research on TK signaling in mammals is already very extensive, but certainly further basic research and clinical studies are urgently needed to unravel this important and interesting signaling system.

### AUTHOR CONTRIBUTIONS


DN contributed to the conceptualization, prepared the first draft of the manuscript, wrote parts of the manuscript, prepared figures and tables, and coordinated the assembly of the manuscript. MZ contributed to the conceptualization, wrote parts of the manuscript, and prepared figures and tables. TK and HS wrote parts of the manuscript, and prepared figures and tables. All authors edited and finally approved the manuscript.

### FUNDING

This work was funded by the Swedish Research Council (Vetenskapsrådet), grant number 2015-04626 (DN) and the Japan Society for the Promotion of Science, grant number JP19K06752 (HS).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2019.01262/full#supplementary-material





**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Nässel, Zandawala, Kawada and Satake. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Global Neuropeptide Annotations From the Genomes and Transcriptomes of Cubozoa, Scyphozoa, Staurozoa (Cnidaria: Medusozoa), and Octocorallia (Cnidaria: Anthozoa)

### Thomas L. Koch and Cornelis J. P. Grimmelikhuijzen\*

Section for Cell and Neurobiology, Department of Biology, University of Copenhagen, Copenhagen, Denmark

#### Edited by:

Elizabeth Amy Williams, University of Exeter, United Kingdom

#### Reviewed by:

David Plachetzki, University of New Hampshire, United States

Meet Zandawala, Brown University, United States

\*Correspondence: Cornelis J. P. Grimmelikhuijzen cgrimmelikhuijzen@bio.ku.dk

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 15 July 2019 Accepted: 13 November 2019 Published: 06 December 2019

#### Citation:

Koch TL and Grimmelikhuijzen CJP (2019) Global Neuropeptide Annotations From the Genomes and Transcriptomes of Cubozoa, Scyphozoa, Staurozoa (Cnidaria: Medusozoa), and Octocorallia (Cnidaria: Anthozoa). Front. Endocrinol. 10:831. doi: 10.3389/fendo.2019.00831

During animal evolution, ancestral Cnidaria and Bilateria diverged more than 600 million years ago. The nervous systems of extant cnidarians are strongly peptidergic. Neuropeptides have been isolated and sequenced from a few model cnidarians, but a global investigation of the presence of neuropeptides in all cnidarian classes has been lacking. Here, we have used a recently developed software program to annotate neuropeptides in the publicly available genomes and transcriptomes from members of the classes Cubozoa, Scyphozoa, and Staurozoa (which all belong to the subphylum Medusozoa) and contrasted these results with neuropeptides present in the subclass Octocorallia (belonging to the class Anthozoa). We found three to six neuropeptide preprohormone genes in members of the above-mentioned cnidarian classes or subclasses, each coding for several (up to thirty-two) similar or identical neuropeptide copies. Two of these neuropeptide preprohormone genes are present in all cnidarian classes/subclasses investigated, so they are good candidates for being among the first neuropeptide genes evolved in cnidarians. One of these primordial neuropeptide genes codes for neuropeptides having the C-terminal sequence GRFamide (pQGRFamide in Octocorallia; pQWLRGRFamide in Cubozoa and Scyphozoa; pQFLRGRFamide in Staurozoa). The other primordial neuropeptide gene codes for peptides having RPRSamide or closely resembling amino acid sequences. In addition to these two primordial neuropeptide sequences, cnidarians have their own class- or subclass-specific neuropeptides, which probably evolved to serve class/subclass-specific needs. When we carried out phylogenetic tree analyses of the GRFamide or RPRSamide preprohormones from cubozoans, scyphozoans, staurozoans, and octocorallia, we found that their phylogenetic relationships perfectly agreed with current models of the phylogeny of the studied cnidarian classes and subclasses. These results support the early origins of the GRFamide and RPRSamide preprohormone genes.

Keywords: neuropeptide, evolution, nervous system, Cnidaria, phylogeny

### INTRODUCTION

During animal evolution, ancestral Cnidaria, Placozoa, Ctenophora, and Porifera diverged from the Bilateria more than 600 million years ago (1). Neuropeptides have, so far, only been isolated and sequenced from cnidarians (2–4), although peptide-containing endocrine cells can also be found in placozoans and ctenophores (5, 6) and peptides have been annotated in placozoan genomes (7, 8). For understanding the origins and evolution of neuropeptides, therefore, it is important to study the four above-mentioned animal phyla with a focus, perhaps, on cnidarians, because they have well-developed peptidergic nervous systems (2–4).

The phylum Cnidaria consists of six classes: Hydrozoa (Hydra and colonial polyps, such as Clytia), Scyphozoa (true jellyfishes), Cubozoa (box jellyfishes), Staurozoa (stalked jellyfishes), Anthozoa (sea anemones and corals), and Myxozoa (a group of small ectoparasites). Most Hydrozoa, Scyphozoa, Cubozoa, and Staurozoa have a life-cycle that includes a polyp and a medusa stage and these classes are, therefore, often collected into a subphylum named Medusozoa [see **Figure 1**, which is based on (9–11)]. The class Anthozoa is subdivided into two subclasses, Hexacorallia and Octocorallia, which have different rotational (radial) body symmetries, being a 6-fold rotational symmetry in Hexacorallia and an 8-fold rotational symmetry in Octocorallia.

Cnidarians have net-like nervous systems that sometimes are fused to form giant fibers or nerve rings in the bell margins of medusae, or are condensed in the head regions of polyps. These anatomical structures can be easily visualized in whole-mounts with the help of neuropeptide antibodies, because cnidarians are normally transparent (2, 12–14).

Cnidarian neuropeptides have only been isolated and sequenced from a few species, such as the sea anemones Anthopleura elegantissima and Calliactis parasitica (Hexacorallia), Renilla koellikeri (Octocorallia), Hydra magnipapillata (Hydrozoa), and Cyanea lamarckii (Scyphozoa) (2–4, 15, 16). It was, therefore, unclear whether peptides occur ubiquitously in cnidarians and what the structures of these neuropeptides are.

In the last few years, several cnidarian genomes have been published (17–23) together with a large number of cnidarian transcriptomes (24–33). These important advancements in cnidarian biology open the possibility of tracking the evolution of the cnidarian neuropeptides and eventually determining the primordial neuropeptide(s) that evolved together with the early cnidarian nervous systems. In a recent paper, we developed a bioinformatics tool to predict neuropeptide preprohormone genes from several cubozoan transcriptomes (31). In the current paper, we have applied this script to predict neuropeptide preprohormone genes in cnidarian species with publicly accessible genomes or transcriptomes that belong to three classes (Scyphozoa, Staurozoa, Cubozoa), all belonging to the Medusozoa. We have compared these data from Medusozoa with a prediction of neuropeptide genes present in Octocorallia, where this subclass was used as a kind of outgroup (see also **Figure 1**) to generate more contrasts in our results.

Octocorallia. This figure is based on data from Technau and Steele (9), Zapata et al. (10), and Kayal et al. (11).

TABLE 1 | Accession numbers for the different databases used.

The aim of the current paper was to determine whether these cnidarian classes produce the same types of neuropeptides, or whether class-specific neuropeptides exist. If common neuropeptides are present in these classes, they would be good candidates for being among the first neuropeptides that evolved during cnidarian evolution.

### MATERIALS AND METHODS

### Sequence Data

We investigated assembled genomes (WGSs) and transcriptomes (TSAs) from seven octocorallians (Renilla reniformis, Eleutherobia rubra, Xenia sp. KK-2018, Briareum asbestinum, Clavularia sp., Heliopora coerulea, and Acanthogorgia aspera), three scyphozoans (Aurelia aurita, Rhopilema esculentum, and Nemopilema nomurai) and five staurozoans (Calvadosia cruxmelitensis, Haliclystus auricula, Haliclystus sanjuanensis, Craterolophus convolvulus, and Lucernaria quadricornis). The database accession numbers are shown in **Table 1**.

### Identification of Neuropeptide Preprohormones

We screened the translated genomes and transcriptomes for neuropeptide preprohormones using a script that is extensively explained in Nielsen et al. (31). This script is based on the presence of at least three similar peptide sequences followed by classical preprohormone processing sites ("GR" and "GK") in the proteins. This script has, of course, its limitations, because it may miss neuropeptide genes whose preprohormones carry only one or two neuropeptide copies. Proteins with at least three processing sites were manually curated and labeled as neuropeptide preprohormones based on the presence of a signal peptide, the presence of three or more potential neuropeptide sequences, and the overall structure of the protein. The C-terminal parts of the immature neuropeptide sequences are easy to identify, because they are often followed by GR, GRR, or GKR sequences, which are classical processing sites for prohormone convertases (R, RR, KR), while the G residues are a classical processing signal for C-terminal amidation (3, 4). The N-termini, however, are often more difficult to determine as, in Cnidaria, N-terminal processing occurs by an unknown unspecific aminopeptidase cleaving at multiple residues, but stopping at Q residues, which are converted into N-terminal pQ groups (3, 4). In addition, N-terminal processing stops at N-terminal P or X-P sequences, which are also resistant to N-terminal degradation. The residues that are preferred for N-terminal processing are E, D, S, T, N, G, A, L, V, Y, or F (3). These residues often form the spacings in between the immature neuropeptide sequences on cnidarian preprohormones.
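The core screening criterion (at least three similar peptide sequences, each followed by a GR or GK processing site) can be illustrated with a short Python sketch. This is not the script from Nielsen et al. (31): the function name `candidate_motifs`, the regular expression, the fixed motif length, and the toy sequence are all assumptions made for illustration, and "similar" is simplified here to "identical over the last four residues".

```python
import re
from collections import Counter


def candidate_motifs(protein, motif_len=4, min_copies=3):
    """Return C-terminal motifs that directly precede a GR/GK site
    at least `min_copies` times in `protein` (hypothetical helper)."""
    # A lookahead is used so that every site is inspected, even if
    # neighbouring matches overlap; group 1 holds the residues that
    # sit immediately in front of the G[RK] processing site.
    pattern = re.compile(r"(?=([A-Z]{%d})G[RK])" % motif_len)
    counts = Counter(m.group(1) for m in pattern.finditer(protein))
    return {motif: n for motif, n in counts.items() if n >= min_copies}


# Toy sequence invented for this example (not a real preprohormone):
# three RPRS copies, each followed by a GKR processing site.
toy = "MKTLLEDRPRSGKRSEERPRSGKRDAERPRSGKRAA"
print(candidate_motifs(toy))  # {'RPRS': 3}
```

A real screen would additionally need a similarity measure between copies, a signal peptide check, and manual curation, as described above.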

The identified neuropeptide preprohormones were also used as queries in TBLASTN searches against the other data sets using standard settings.

The putative preprohormones were investigated for the presence of signal peptides using SignalP 5.0 (http://www.cbs.dtu.dk/services/SignalP/) (34).

### Phylogenetic Analysis

The preprohormones were aligned using ClustalW (35). For phylogenetic tree analysis the aligned protein sequences were loaded in PAUP\*<sup>1</sup> and the maximum parsimony tree was calculated using p-distance and visualized in FigTree<sup>2</sup>.
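For readers unfamiliar with the term, the p-distance between two aligned sequences is the proportion of compared sites at which they differ. The sketch below only illustrates that definition; it is not a description of how PAUP\* computes or uses distances, and the function name `p_distance` and the choice to skip gapped columns are assumptions.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        # Assumption: columns with a gap in either sequence are ignored.
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# The scyphozoan/cubozoan and staurozoan GRFamide peptides differ at
# one of seven positions (W versus F at position 2).
print(p_distance("QWLRGRF", "QFLRGRF"))
```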

### RESULTS

### Annotation of Neuropeptide Preprohormones in Scyphozoa

Using our script for the discovery of neuropeptide preprohormones in cnidarian genomes and transcriptomes (31), we could detect six neuropeptide preprohormone genes in four publicly accessible databases from three scyphozoans: Rhopilema esculentum (transcriptome), Nemopilema nomurai (transcriptome), and Aurelia aurita (genome and transcriptome) (**Table 1**). The script detects neuropeptide genes that code for preprohormones that have three or more neuropeptide sequences; thus, neuropeptide genes that code for only one or two neuropeptide copies might be missed. **Table 2** gives an overview of the neuropeptides contained in these six preprohormones. **Supplementary Figures 1**–**6** give all the preprohormone sequences identified with our script in the three scyphozoan species.

<sup>1</sup>https://paup.phylosolutions.com/ (accessed September 20, 2019).

<sup>2</sup>https://github.com/rambaut/figtree (accessed September 20, 2019).

#### TABLE 2 | An overview of scyphozoan neuropeptide families.

Only the major neuropeptides located on a preprohormone are listed here. The preprohormones are given in Supplementary Figures 1–6. Amino acid residues that are in common with the first-mentioned neuropeptide sequence from each neuropeptide family are highlighted in yellow.



Only the major neuropeptides located on a preprohormone are listed here. The preprohormones are given in Supplementary Figures 7–9. Amino acid residues are highlighted as in Table 2.

The scyphozoan databases from N. nomurai, R. esculentum, and A. aurita all contain transcripts and genes coding for a preprohormone that produces multiple copies of either LPRSamide or closely related neuropeptide sequences (**Table 2**; neuropeptide family #1). These sequences are flanked by classical GKR or GRR processing sites at their C-termini, where cleavage occurs C-terminally of K or R, after which the C-terminal G residues are converted into a C-terminal amide group (3, 4). At the N-termini of the neuropeptide sequences are acidic (E or D), or S, T, G, N, A, M, L, or V residues, which are processing sites that are often used in cnidarians for N-terminal neuropeptide processing (3, 4) (**Supplementary Figure 1**). In the proposed mature neuropeptide sequences, the C-termini are protected by an amide bond (for example LPRSamide), while the N-termini are protected by a prolyl residue in the second position of the peptide (**Table 2**). Similar LPRSamide neuropeptide sequences can also be detected in databases from Staurozoa (**Table 3**), Cubozoa (**Table 4**), and Octocorallia (**Table 5**).
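The two terminal modifications described in the text (an N-terminal Q converted into pQ, and a C-terminal G followed by basic residues converted into an amide) can be written down as a small naming rule. The sketch below is a simplification under stated assumptions: the function name `mature_form` is hypothetical, the aminopeptidase trimming of the N-terminus is not modelled, and the input is assumed to be an already delimited immature peptide plus the residues that follow it on the preprohormone.

```python
def mature_form(peptide, downstream):
    """Name the predicted mature peptide (hypothetical helper).

    peptide    -- immature peptide sequence, one-letter code
    downstream -- residues that directly follow it on the preprohormone
    """
    name = peptide
    # An N-terminal Q is converted into a pyroglutamyl (pQ) group.
    if name.startswith("Q"):
        name = "p" + name
    # A G followed by a basic residue (as in GR, GK, GKR, GRR) is the
    # signal for C-terminal amidation.
    if len(downstream) >= 2 and downstream[0] == "G" and downstream[1] in "RK":
        name += "amide"
    return name


print(mature_form("LPRS", "GKR"))    # LPRSamide
print(mature_form("QWLRGRF", "GR"))  # pQWLRGRFamide
print(mature_form("QPPGVW", "RR"))   # pQPPGVW (no G, so no amide)
```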

The scyphozoan databases from N. nomurai, R. esculentum, and A. aurita also contain transcripts and genes that code for numerous copies of identical pQWLRGRFamide neuropeptides (**Table 2**; neuropeptide family #2). These GRFamide neuropeptides have classical C-terminal GR or GKR processing sites and, again, acidic (D or E), K, N, G, or S N-terminal processing sites that are often used in cnidarian N-terminal preprohormone processing (3, 4) (**Supplementary Figure 2**). Similar GRFamide peptides occur in Staurozoa (**Table 3**), and identical GRFamide neuropeptides occur in Cubozoa (**Table 4**). In Octocorallia, GRFamide neuropeptides exist that have the C-terminal GRFamide sequence in common with the scyphozoan pQWLRGRFamide peptides (**Table 5**).

All four databases show that scyphozoans also produce a neuropeptide preprohormone that codes for several copies of pQPPGVWamide and pQPPGTWamide and their nonamidated variants pQPPGVW and pQPPGTW (**Table 2**; neuropeptide family #3). These preprohormones have classical GKR, RR, RRR, RKR, or RKK processing sites at the C-termini of the neuropeptide sequences and N-terminal K, S, or N residues that in cnidarians are known to be involved in N-terminal processing (3, 4) (**Supplementary Figure 3**). Similar pQPPGVWamide peptides also occur in Staurozoa (**Table 3**; peptide family #3) and Cubozoa (**Table 4**; peptide family #3), but they are absent in Octocorallia (**Table 5**).

#### TABLE 4 | An overview of cubozoan neuropeptide families.

This table is a shortened version of Table 1 from Nielsen et al. (31). Only the major neuropeptides are shown here. Amino acid residues are highlighted as in Table 2.

The N. nomurai transcriptome database also codes for a preprohormone that contains one copy of a probable cyclic neuropeptide CTSPMCWFRPamide and several other nearly identical peptides, where the two cysteine residues likely form a cystine bridge (**Table 2**; neuropeptide family #4; **Supplementary Figure 4**). Similar peptides can be found in the databases from R. esculentum, where a small number of amino acid residue exchanges occur without, however, changing the consensus sequence CXSPMCWFRXamide (**Table 2**; peptide family #4). In A. aurita, however, the C-termini are extended by one or two amino acid residues (**Table 2**; peptide family #4). The C-termini of all neuropeptide sequences in the preprohormones are followed by classical GR, GKR, GKKR, or GKRR processing sites and the neuropeptide sequences are preceded by G, N, D, E, or S residues, which are known processing sites in cnidarian preprohormones (3, 4) (**Supplementary Figure 4**). Similar cyclic neuropeptides can also be found in the transcriptomes of three cubozoans, which all have the consensus sequence CXGQMCWFRamide (**Table 4**, **Figure 2**). Thus, compared to the scyphozoan neuropeptides, these cubozoan neuropeptides have the sequence CXXXMCWFRamide in common, including a common distance between the cysteine residues forming the presumed cystine bridge (**Figure 2**). These neuropeptides could not be found in Staurozoa (**Table 3**), or Octocorallia (**Table 5**).
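The way the shared sequence CXXXMCWFRamide follows from the scyphozoan and cubozoan consensus sequences can be checked position by position. The sketch below is illustrative only: the function name `consensus` is hypothetical, and comparing just the first nine residues of the scyphozoan consensus (dropping its variable tenth residue) is an assumption made so that both strings have equal length.

```python
def consensus(sequences, wildcard="X"):
    """Column-wise consensus of equal-length sequences: a residue is
    kept where all sequences agree, otherwise `wildcard` is written."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    return "".join(
        column[0] if len(set(column)) == 1 else wildcard
        for column in zip(*sequences)
    )


scyphozoan = "CXSPMCWFR"  # first nine residues of CXSPMCWFRXamide
cubozoan = "CXGQMCWFR"    # CXGQMCWFRamide
print(consensus([scyphozoan, cubozoan]))  # CXXXMCWFR
```

Both cysteines stay at positions 1 and 6 in the result, which is the conserved spacing of the presumed cystine bridge noted in the text.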

Finally, scyphozoans have two neuropeptide families that cannot be found in Staurozoa, Cubozoa, and Octocorallia and which, therefore, appear to be scyphozoan-specific (**Table 2**; peptide families #5 and #6).

Only the major neuropeptides located on a preprohormone are listed here. The preprohormones are given in Supplementary Figures 10–14. Amino acid residues are highlighted as in Table 2.

In N. nomurai, the first neuropeptide family (neuropeptide family #5 of **Table 2**; **Supplementary Figure 5**) consists of members having the sequence pQHLRYamide or other very similar sequences. In R. esculentum, six copies of the sequence pQHVRYamide can be identified (**Table 2**). In A. aurita, a prolyl residue, which also protects the N-terminus of a neuropeptide, is replacing the pyroglutamyl residue: PHVRYamide and PHLRYamide (**Table 2**). In the preprohormones for these neuropeptides the neuropeptide sequences have classical GR or GK processing sites at their C-termini, while at their N-termini they have R, D, T, and A residues, which in cnidarians are known to be involved in N-terminal processing (3, 4) (**Supplementary Figure 5**).

The second scyphozoan-specific neuropeptide family (neuropeptide family #6 of **Table 2**; **Supplementary Figure 6**) consists of pQPLWSARFamide or related sequences in N. nomurai (**Table 2**). For some peptides the N-terminal pyroglutamyl group is lacking, but those peptides still have two sequential N-terminal prolyl residues, which protect the N-termini against enzymatic degradation (**Table 2**). R. esculentum has peptides that are very similar to the ones occurring in N. nomurai, with the exception of three copies of a short peptide pQLRPamide that do not occur in N. nomurai. All peptides from A. aurita lack N-terminal pyroglutamyl residues, but, again, are N-terminally protected by prolyl residues (**Table 2**).

### Annotation of Neuropeptide Preprohormones in Staurozoa

Using our script (31), we discovered three different neuropeptide preprohormone genes in staurozoans (**Table 3**). In Calvadosia cruxmelitensis we found a preprohormone that contained 11 copies of the neuropeptide sequence RPRSamide (**Table 3**, neuropeptide family #1; **Supplementary Figure 7**). These RPRSamide sequences have the classical C-terminal processing site GKR, while N-terminally of the RPRSamide sequences are D, V, I, F, and A residues, which in cnidarians are known preprohormone processing sites (3, 4). Similar preprohormones exist in the transcriptomes from Haliclystus auricula and Haliclystus sanjuanensis (**Supplementary Figure 7**), which contain 15 and 16 copies of the RPRSamide sequence, respectively (**Table 3**). In the Craterolophus convolvulus transcriptome we could identify a preprohormone with 8 copies of the RPRSamide sequence. In the transcriptome from Lucernaria quadricornis we discovered a neuropeptide preprohormone with 3 copies of the RPRSamide and 6 copies of the KPRSamide sequence (**Table 3**; **Supplementary Figure 7**). As mentioned earlier, neuropeptide preprohormones having numerous copies of RPRSamide or similar peptides also occur in Scyphozoa (**Table 2**), Cubozoa (**Table 4**), and Octocorallia (**Table 5**).

The transcriptomes from the four Staurozoa species also contain transcripts coding for GRFamide preprohormones (**Table 3**, neuropeptide family #2). For all species these preprohormones contain numerous copies of the neuropeptide pQFLRGRFamide (**Table 3**; **Supplementary Figure 8**). This neuropeptide is very similar to the one from scyphozoans (**Table 2**, neuropeptide family #2), but it contains an F residue at position 2 instead of a W residue in the scyphozoan peptide. The same is true for cubozoans (**Table 4**, neuropeptide family #2), where the GRFamide peptides also have a W residue at position 2. However, compared to Octocorallia (**Table 5**, neuropeptide family #2) the staurozoan GRFamide peptides are much longer, being N-terminally elongated by three amino acid residues.

The third neuropeptide preprohormone that we discovered in staurozoans contains numerous copies of a pQPPGAWamide neuropeptide or closely related sequences (**Table 3**, neuropeptide family #3; **Supplementary Figure 9**). All peptides are protected by amino-terminal pQ, pQP, or pQPP sequences against unspecific amino-terminal enzymatic degradation. Identical or similar neuropeptides occur in scyphozoans (**Table 2**, neuropeptide family #3) or cubozoans (**Table 4**, neuropeptide family #3). The peptides, however, are absent in Octocorallia (**Table 5**).

### Annotation of Neuropeptide Preprohormones in Cubozoa

We have recently annotated neuropeptide preprohormones in five different cubozoan species (31). A short summary of these results is given in **Table 4**, while the amino acid sequences of the preprohormones are given in Nielsen et al. (31).

Cubozoans produce RPRAamide neuropeptides (**Table 4**, neuropeptide family #1) that are very similar to neuropeptides occurring in scyphozoans (**Table 2**, neuropeptide family #1), staurozoans (**Table 3**, neuropeptide family #1) or octocorallians (**Table 5**, neuropeptide family #1).

Cubozoans also have preprohormones that produce numerous copies of pQWLRGRFamide (**Table 4**, neuropeptide family #2). Identical or very similar neuropeptides can also be found in scyphozoans (**Table 2**, neuropeptide family #2), and staurozoans (**Table 3**, neuropeptide family #2). However, in octocorallians there is a shorter version of these peptides (**Table 5**, neuropeptide family #2).

Cubozoans produce the neuropeptide pQPPGVWamide (**Table 4**, neuropeptide family #3) that is identical or very similar to neuropeptides produced in scyphozoans (**Table 2**, neuropeptide family #3), and staurozoans (**Table 3**, neuropeptide family #3). These peptides, however, do not occur in octocorallians (**Table 5**).

Cubozoans have peptides with the sequence CXGQMCWFRamide, which are probably cyclic after the two cysteine residues have formed a cystine bridge (**Table 4**, neuropeptide family #4). Scyphozoans have similar peptides (**Table 2**, neuropeptide family #4). However, these peptides are lacking in Staurozoa (**Table 3**) and Octocorallia (**Table 5**).

Finally, cubozoans have two peptide families that do not occur in Scypho- and Staurozoa. These are neuropeptides with the C-terminal sequence GLWamide (**Table 4**, neuropeptide family #5), and neuropeptides with the C-terminal sequence GRYamide (**Table 4**, neuropeptide family #6). The C-termini from the GLWamides are relatively well-conserved, but their N-termini are quite variable [see Table 1 from Nielsen et al. (31)]. The same is true for the GRYamide peptides (31).

FIGURE 3 | Phylogenetic tree analysis of all the identified X1PRX2amide preprohormones from scyphozoans, staurozoans, cubozoans, and octocorallians. This phylogenetic tree (left panel) is in complete accordance with the phylogenetic tree given in Figure 1. The abscissa (upper line) gives the length of the preprohormone fragments (in amino acid residues). The middle panel gives schematic representations of the various preprohormones from each species. The right panel gives the amino acid sequences of the major X1PRX2amide peptides produced by these preprohormones. This panel also shows that X1PRX2amide (where X2 is either S, A, or G) is the consensus sequence of all peptides.

### Annotation of Neuropeptide Preprohormones From Octocorallia

We investigated the transcriptomes from seven Octocorallia species for the presence of neuropeptide genes (**Tables 1**, **5**).

Octocorallia produce GPRGamide and closely related peptides (**Table 5**, neuropeptide family #1; **Supplementary Figure 10**) that closely resemble the LPRSamide and RPRAamide peptides from scyphozoans (**Table 2**, neuropeptide family #1), staurozoans (**Table 3**, neuropeptide family #1), and cubozoans (**Table 4**, neuropeptide family #1).

All octocorallians produce a preprohormone that carries numerous copies of the neuropeptide pQGRFamide (**Table 5**, neuropeptide family #2; **Supplementary Figure 11**). This peptide, dubbed Antho-RFamide, was the first cnidarian neuropeptide to be chemically isolated and sequenced from Anthozoa (36), including the octocoral Renilla (37). The presence of acidic residues preceding the peptide sequence in the Renilla preprohormone (**Supplementary Figure 11**) illustrates, again, that processing must occur at E or D residues. Antho-RFamide does not occur in Scyphozoa, Staurozoa, and Cubozoa, but these medusozoans have N-terminally elongated forms that have the C-terminal GRFamide sequence in common with Antho-RFamide (neuropeptide families #2 from **Tables 2**–**4**). Thus, GRFamide neuropeptides are widespread in cnidarians.

In contrast to the two neuropeptide families discussed above that appear to be ubiquitous in cnidarians, octocorallians also produce Octocorallia-specific neuropeptides. The first group (**Table 5**, neuropeptide family #3) has the structure pQLRGamide or a very similar sequence. Their preprohormones are given in **Supplementary Figure 12**.

The second group of neuropeptides (**Table 5**, neuropeptide family #4) has the structure PPFHamide, or pQPFHamide. Both sequences are N-terminally protected by either the N-terminal PP or pQP sequences. Their preprohormones are given in **Supplementary Figure 13**. In addition, we discovered a preprohormone in H. coerulea that produces multiple copies of an RPFLamide sequence. These peptides are also N-terminally protected by prolyl residues at position 2, but they are only 50% identical with the other peptides from this family (**Table 5**, peptide family #4; **Supplementary Figure 13**).

than to the staurozoans, which produce a slightly different peptide.

The third group of Octocorallia-specific neuropeptides (**Table 5**, neuropeptide family #5; **Supplementary Figure 14**) has the sequence GPRRamide, or a closely related sequence. Again, these neuropeptide sequences are protected against N-terminal enzymatic degradation by a prolyl residue at position 2 of the peptides. The R. reniformis preprohormone has at least two copies of GPRRamide, the E. rubra preprohormone produces eight copies of GPRRamide, the Xenia sp. preprohormone contains 15 GPRRamide copies, the B. asbestinum preprohormone 22 copies, the Clavularia sp. preprohormone 5 copies, and the H. coerulea preprohormone 4 copies of GPRRamide (**Table 5**, neuropeptide family #5; **Supplementary Figure 14**).

### Phylogenetic Tree Analyses of the XPRXamide and GRFamide Preprohormones

We carried out a phylogenetic tree analysis of all the XPRSamide/XPRAamide/XPRGamide preprohormones investigated in this paper [neuropeptide families #1, from **Tables 2**–**5**; **Supplementary Figures 1**, **7**, **10**; (31)]. These studies (**Figure 3**) showed that the structural relationships between these preprohormones very precisely followed the established phylogenetic relationships of the classes and subclasses they belong to (**Figure 1**). These findings show that all XPRSamide/XPRAamide/XPRGamide preprohormones are derived from a common ancestor.

When we carried out the same analysis for the GRFamide preprohormones, we came to the same conclusion (**Figures 1**, **4**). Aligning the mature neuropeptide sequences themselves (right panel of **Figure 4**) showed that the octocorallian peptide pQGRFamide is farthest away from the other GRFamide peptides, while the pQWLRGRFamides are identical in both Cubo- and Scyphozoa and only slightly different from the pQFLRGRFamides that occur in Staurozoa. These findings are, again, in complete agreement with the phylogenetic relationships between the classes and subclasses to which these peptides belong (**Figures 1**, **4**), showing that all GRFamide preprohormones are derived from a common ancestor.
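The comparison of mature GRFamide sequences described above can be illustrated with a minimal sketch that reports the C-terminal stretch shared by two peptides. This is not the alignment method used for Figure 4; the function name `shared_c_terminus` is hypothetical, the N-terminal pyroglutamate (pQ) is written simply as Q, and the C-terminal amide is omitted.

```python
def shared_c_terminus(a: str, b: str) -> str:
    """Return the longest C-terminal stretch that two peptides have in common."""
    n = 0
    # Walk backwards from the C-terminus while residues are identical.
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]


# Core sequences as quoted in the text (pQ written as Q, amide omitted).
peptides = {
    "Octocorallia": "QGRF",
    "Cubozoa/Scyphozoa": "QWLRGRF",
    "Staurozoa": "QFLRGRF",
}

if __name__ == "__main__":
    names = list(peptides)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            common = shared_c_terminus(peptides[first], peptides[second])
            print(f"{first} vs {second}: {common} ({len(common)} residues)")
```

Under these assumptions, the cubozoan/scyphozoan and staurozoan peptides share a longer C-terminal stretch with each other than either shares with the octocorallian peptide, which mirrors the ordering described in the text.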

### DISCUSSION

In our paper we have analyzed the neuropeptide preprohormones from three cnidarian classes (Scyphozoa, Cubozoa, Staurozoa) and one subclass (Octocorallia). We did not include the remaining classes (Hydrozoa and Myxozoa) and subclasses (Hexacorallia) in our study, because the current study already includes a large amount of data (**Supplementary Figures 1**–**14**) with altogether 66 preprohormones, each of which contains a varying number of different neuropeptides (**Tables 2**–**5**). These large numbers of preprohormones and neuropeptides are difficult to analyze and present in an understandable and concise way. Yet, although not all cnidarian classes have been included in our analyses, we can already draw conclusions related to the major questions that we asked at the end of the Introduction: (i) Do all cnidarian classes produce the same types of neuropeptides; or (ii) are there class-specific neuropeptides?

We found that Scyphozoa, Cubozoa, Staurozoa, and Octocorallia all produced GRFamide peptides (**Figure 4**), which is an answer to the above-mentioned question (i). This finding also means that the common cnidarian ancestor (red filled circle in **Figure 5**) must have produced GRFamides, suggesting that GRFamides are ancient neuropeptides that probably evolved together with the first cnidarians. pQGRFamide (Antho-RFamide) is a well-established neuropeptide that has been isolated, sequenced and cloned from the sea anemones Anthopleura elegantissima and Calliactis parasitica (Hexacorallia) and from the sea pansy Renilla koellikeri (Octocorallia) (3, 4, 36–39). Using immunocytochemistry, dense nets of Antho-RFamide producing neurons have been found in C. parasitica and R. koellikeri (2, 40), showing that these peptides are genuine neuropeptides.

N-terminally elongated forms of Antho-RFamide have been isolated and sequenced from Scyphozoa, such as pQWLRGRFamide from Cyanea lamarckii (compare right panel of **Figure 4**; and **Table 2**, peptide family #2), and the use of antibodies showed its presence in nerve nets of C. lamarckii (41, 42). Thus, both Antho-RFamide and its N-terminally elongated forms (right panel of **Figure 4**) are well-established neuropeptides in Cnidaria.

There is another peptide-family that occurs in all Octocorallia, Cubozoa, Scyphozoa, and Staurozoa species that we have investigated. These peptides have the general structure X1PRX2amide, where X<sup>1</sup> is quite variable, while X<sup>2</sup> is confined to S, A, or G (neuropeptide family #1 from **Tables 2**–**5**; right panel of **Figure 3**). Because these peptides occur in both Octocorallia and the three classes that belong to the Medusozoa (**Figure 5**), it is likely that they also were present in the common cnidarian ancestor (red filled circle in **Figure 5**).

After submission of our manuscript and two reviewing rounds, a paper was published on neuropeptides present in the hexacoral N. vectensis (43). This paper confirms the presence of GRFamide and X1PRX<sup>2</sup> peptides in Hexacorallia, which supports our conclusion that these two neuropeptide families originated in the common cnidarian ancestor (**Figure 5**). Furthermore, this paper confirms the presence of class-specific neuropeptides (see below).

In contrast to the GRFamides, not much is known about the XPRXamides. However, two peptides related to this peptide family (WPRPamide and RPRPamide) have recently been identified in the hydrozoan jellyfishes Clytia hemisphaerica and Cladonema pacificum as the endogenous neuropeptides inducing oocyte maturation, and oocyte and sperm release (44). The peptides have also been localized in neurons and, therefore, are genuine neuropeptides (44).

If one looks at the structures of the various X1PRX2amide peptides (right panel of **Figure 3**), one can see several interesting features. First, all peptides have a prolyl residue at position 2. This residue might help protect the neuropeptide against non-specific enzymatic N-terminal degradation, because the X-P bond is not an amide, but an imide bond, which likely gives resistance against enzymatic hydrolysis. Prolyl residues have also been found at the N-termini of many other peptides described in this paper (for example **Table 2**, peptide family #6; **Table 5**, peptide families #4 and #5). Second, none of the X1PRX2amide peptides are protected by an N-terminal pQ group, while most of the other neuropeptide families (**Tables 2**–**5**) have such protecting groups, suggesting that the X1PRX2amide peptides need an N-terminal positive charge (by the protonation of the N-terminal primary amine group) for binding to their G-protein-coupled receptors (GPCRs). Third, the R residue at position 3 of the peptide is conserved in all peptides, creating, again, a positive charge in the middle of the peptide and making the overall charge of all peptides in the family quite positive, especially when the residues in position 1 are R or K (**Figure 3**, right panel).
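The charge argument above can be made concrete with a rough calculation. The sketch below is a deliberately simplified model, not a pKa-based computation: it counts +1 for a free N-terminal amine, +1 for each R or K, and −1 for each D or E, ignores histidine, and treats the amidated C-terminus as uncharged. The function name `approx_net_charge` and its `n_blocked` parameter are hypothetical.

```python
def approx_net_charge(core: str, n_blocked: bool = False) -> int:
    """Rough integer net charge near neutral pH for a C-terminally amidated peptide.

    core      -- one-letter residues, without the C-terminal amide
    n_blocked -- True if the N-terminus is blocked (e.g. by pyroglutamate, pQ)
    """
    charge = 0 if n_blocked else 1               # free N-terminal amine is protonated
    charge += sum(core.count(r) for r in "RK")   # basic side chains
    charge -= sum(core.count(r) for r in "DE")   # acidic side chains
    return charge                                # amidated C-terminus adds no negative charge


if __name__ == "__main__":
    # X1PRX2amide peptides named in the text, with a free N-terminus.
    for core in ("LPRS", "RPRA", "GPRG"):
        print(f"{core}amide: {approx_net_charge(core):+d}")
    # Antho-RFamide (pQGRFamide) for comparison: the N-terminus is blocked by pQ.
    print(f"pQGRFamide: {approx_net_charge('QGRF', n_blocked=True):+d}")
```

In this simplified model, every X1PRX2amide peptide carries at least two positive charges, and an R or K at position 1 adds a third, whereas a pQ-protected peptide loses the N-terminal contribution.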

In Octocorallia there is, in addition to the X1PRX2amide peptides (where X<sup>2</sup> is S, A, or G), another peptide family, of which most members have the GPRRamide sequence (**Table 5**, neuropeptide family #5). However, we do not think that these peptides belong to the same family as the X1PRX2amide family, because their preprohormones have a different organization. While in many X1PRX2amide preprohormones, most neuropeptide sequences are followed by a dibasic (KR) processing site [**Supplementary Figures 1**, **7**, **10**; (31)], there are exclusively single basic (R) processing sites at these positions in the GPRRamide precursor (**Supplementary Figure 14**). Furthermore, in all GPRRamide preprohormones, the neuropeptide sequence is frequently preceded by the sequence DEIT (**Supplementary Figure 14**), whereas this sequence is absent in all X1PRX<sup>2</sup> preprohormones [**Supplementary Figures 1**, **7**, **12**; (31)]. We assume, therefore, that the GPRRamide preprohormones might not be evolutionarily closely related to the X1PRX2amide preprohormones.

Besides the GRFamides and X1PRX2amides, all other peptide families described in this paper are novel, with the exception of the peptides given in **Table 4**, which have been published earlier (31), and we do not really know whether they are localized in neurons and, thus, are genuine neuropeptides. These peptides are not ubiquitous in cnidarians, but often occur in more than one class. For example, the presumed cyclic peptides from scyphozoans (**Table 2**, neuropeptide family #4) also occur in cubozoans (**Table 4**, neuropeptide family #4). These results establish the close relationships between scyphozoans and cubozoans, which again is in accordance with current models of the phylogeny of cnidarian classes (**Figure 1**).

The pQPPGVWamide peptide family (**Table 2**, neuropeptide family #3; **Table 3**, neuropeptide family #3; **Table 4**, neuropeptide family #3) also occurs in scypho-, cubo-, and staurozoans. These results confirm the close phylogenetic relationships between these classes, which is in full agreement with the current models for cnidarian phylogeny (**Figure 1**).

In addition to these peptide families that occur in more than one class, there are neuropeptides that are confined to a single class (**Table 2**, neuropeptide families #5 and #6; **Table 4**, neuropeptide families #5 and #6; **Table 5**, neuropeptide families #3, #4, #5). These neuropeptides may serve class-specific physiological processes.

None of the above-mentioned peptides identified in the four cnidarian classes/subclasses have significant structural similarities with any of the known bilaterian neuropeptides.

### DATA AVAILABILITY STATEMENT

The datasets for T. cystophora can be found in GenBank under accession number GGWE01000000.

### AUTHOR'S NOTE

This paper is part of the article collection "The Evolution of Neuropeptides: A Stroll through the Animal Kingdom: Updates from the Ottawa 2019 ICCPB Symposium and Beyond" hosted by Dr. Klaus H. Hoffmann and Dr. Elizabeth Amy Williams.

### AUTHOR CONTRIBUTIONS

TK and CG conceived and designed the project, and analyzed the data. TK carried out the experiments. CG wrote the paper with inputs from TK. All authors approved the final manuscript.

### FUNDING

We thank the Danish Council for Independent Research (grant number 7014-00088 to CG) for financial support. This funding body played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2019.00831/full#supplementary-material

### REFERENCES


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Koch and Grimmelikhuijzen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evolution and Comparative Physiology of Luqin-Type Neuropeptide Signaling

Luis Alfonso Yañez-Guerra\* † and Maurice R. Elphick\*

School of Biological and Chemical Sciences, Faculty of Science and Engineering, Queen Mary University of London, London, United Kingdom

#### Edited by:

Klaus H. Hoffmann, University of Bayreuth, Germany

#### Reviewed by:

Lindy Holden-Dye, University of Southampton, United Kingdom

Liliane Schoofs, KU Leuven, Belgium

#### \*Correspondence:

Luis Alfonso Yañez-Guerra L.Yanez-Guerra@exeter.ac.uk

Maurice R. Elphick m.r.elphick@qmul.ac.uk

#### †Present address:

Luis Alfonso Yañez-Guerra, Living Systems Institute, University of Exeter, Exeter, United Kingdom

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Neuroscience

Received: 22 October 2019 Accepted: 31 January 2020 Published: 18 February 2020

#### Citation:

Yañez-Guerra LA and Elphick MR (2020) Evolution and Comparative Physiology of Luqin-Type Neuropeptide Signaling. Front. Neurosci. 14:130. doi: 10.3389/fnins.2020.00130

Luqin is a neuropeptide that was discovered and named on account of its expression in left upper quadrant cells of the abdominal ganglion in the mollusc Aplysia californica. Subsequently, luqin-type peptides were identified as cardio-excitatory neuropeptides in other molluscs and a cognate receptor was discovered in the pond snail Lymnaea stagnalis. Phylogenetic analyses have revealed that orthologs of molluscan luqin-type neuropeptides occur in other phyla; these include neuropeptides in ecdysozoans (arthropods, nematodes) that have a C-terminal RYamide motif (RYamides) and neuropeptides in ambulacrarians (echinoderms, hemichordates) that have a C-terminal RWamide motif (RWamides). Furthermore, precursors of luqin-type neuropeptides typically have a conserved C-terminal motif containing two cysteine residues, although the functional significance of this is unknown. Consistent with the orthology of the neuropeptides and their precursors, phylogenetic and pharmacological studies have revealed that orthologous G-protein coupled receptors (GPCRs) mediate effects of luqin-type neuropeptides in spiralians, ecdysozoans, and ambulacrarians. Luqin-type signaling originated in a common ancestor of the Bilateria as a paralog of tachykinin-type signaling but, unlike tachykinin-type signaling, luqin-type signaling was lost in chordates. This may largely explain why luqin-type signaling has received less attention than many other neuropeptide signaling systems. However, insights into the physiological actions of luqin-type neuropeptides (RYamides) in ecdysozoans have been reported recently, with roles in regulation of feeding and diuresis revealed in insects and roles in regulation of feeding, egg laying, locomotion, and lifespan revealed in the nematode Caenorhabditis elegans. Furthermore, characterization of a luqin-type neuropeptide in the starfish Asterias rubens (phylum Echinodermata) has provided the first insights into the physiological roles of luqin-type signaling in a deuterostome. In conclusion, although luqin was discovered in Aplysia over 30 years ago, there is still much to be learnt about luqin-type neuropeptide signaling. This will be facilitated in the post-genomic era by the emerging opportunities for experimental studies on a variety of invertebrate taxa.

Keywords: luqin, cardio-excitatory peptide, RYamides, RWamides, neuropeptide evolution, G-protein coupled receptors

### INTRODUCTION


Neuropeptides are evolutionarily ancient neuronal signaling molecules that typically exert their effects on target cells by binding to cognate G-protein coupled receptors (GPCRs) (Elphick et al., 2018; Jékely et al., 2018). Phylogenetic studies have revealed that the evolutionary origin of at least 30 neuropeptide signaling systems can be traced back to the bilaterian common ancestor of deuterostomes and protostomes. However, some neuropeptide signaling systems have been lost in one or more bilaterian phyla/sub-phyla (Jékely, 2013; Mirabeau and Joly, 2013; Elphick et al., 2018). Luqin-type neuropeptide signaling, which is the focus of this review, is one of the bilaterian neuropeptide signaling systems that have been lost in chordates/vertebrates. This may in part explain why less is known about luqin-type neuropeptide signaling than other bilaterian neuropeptide systems that have been retained in vertebrates. Nevertheless, several advances in our knowledge of the evolution and comparative physiology of luqin-type signaling in invertebrates have been made recently, and therefore this first review article on luqin-type neuropeptide signaling is timely.

The neuropeptide luqin and its cognate GPCR were first discovered in molluscs (Shyamala et al., 1986; Fujimoto et al., 1990; Tensen et al., 1998). Subsequently, luqin-like neuropeptides known as RYamides, on account of a C-terminal Arg-Tyr-NH<sub>2</sub> motif, were discovered in the arthropod Cancer borealis (Li et al., 2003). Furthermore, receptors for RYamides were identified in the fruit fly Drosophila melanogaster and in the red flour beetle Tribolium castaneum (Collin et al., 2011; Ida et al., 2011). In 2013, evidence that molluscan luqin-type signaling and arthropod RYamide-type signaling are orthologous was reported. Thus, use of pairwise-based clustering methods revealed that luqin precursors and RYamide precursors form part of the same protein cluster (Jékely, 2013). Furthermore, phylogenetic analysis of G-protein coupled neuropeptide receptors revealed that molluscan luqin receptors are orthologs of arthropod RYamide receptors (Mirabeau and Joly, 2013). In addition, these two studies reported for the first time the discovery of a luqin-type signaling system in the nematode Caenorhabditis elegans, which has subsequently been functionally characterized experimentally (Jékely, 2013; Mirabeau and Joly, 2013; Ohno et al., 2017).

Importantly, phylogenetic analysis of transcriptome/genome sequence data has revealed that while luqin-type signaling has been lost in vertebrates and other chordates (urochordates, cephalochordates), genes encoding luqin-type precursors and receptors are present in ambulacrarian deuterostomes (echinoderms and hemichordates) (Jékely, 2013; Mirabeau and Joly, 2013). Thus, it was established for the first time that the evolutionary origin of luqin-type neuropeptide signaling predates the divergence of protostomes and deuterostomes, but with differential loss in the deuterostome branch of the Bilateria. Luqin-type neuropeptides in ambulacrarians are characterized by a predicted C-terminal RWamide motif (Elphick and Mirabeau, 2014; Rowe et al., 2014; Semmens et al., 2016). Furthermore, biochemical characterization of luqin-type neuropeptide signaling in the starfish Asterias rubens (phylum Echinodermata) demonstrated that a neuropeptide with the confirmed structure EKGRFPKFMRW-NH<sub>2</sub> acts as a ligand for two luqin-type receptors in this species (Yañez-Guerra et al., 2018). Thus, the luqin-type neuropeptides in spiralian protostomes, the luqin-type RYamides in ecdysozoan protostomes, and the luqin-type RWamides in ambulacrarian deuterostomes have been unified as members of a bilaterian family of neuropeptides (Jékely, 2013; Mirabeau and Joly, 2013; Yañez-Guerra et al., 2018). It is against the backdrop of these important recent findings that we review here our knowledge of the evolution and comparative physiology of luqin-type neuropeptide signaling.

### THE DISCOVERY AND FUNCTIONAL CHARACTERIZATION OF LUQIN-TYPE NEUROPEPTIDE SIGNALING IN MOLLUSCS

The discovery of the neuropeptide luqin was enabled by identification of transcripts expressed in the L5 neuron of the mollusc Aplysia californica. This was decades before the development of contemporary single cell transcriptomic methodologies and was facilitated by the large size of the cell body of the L5 neuron (Shyamala et al., 1986). Analysis of the expression of the cardio-excitatory neuropeptide FMRFamide in the central nervous system of A. californica using antibodies to FMRFamide revealed immunostaining in many neurons, including the L5 neuron that is located in the upper quadrant of the left abdominal ganglion (Brown et al., 1985; Schaefer et al., 1985). However, it was found that the FMRFamide gene is not expressed in the L5 neuron. Therefore, to determine the identity of the neuropeptide(s) responsible for FMRFamide-like immunoreactivity in L5, transcripts expressed in this neuron were sequenced. Sequencing of a transcript named L5-67 revealed that it encodes a 112 amino acid residue protein comprising an N-terminal signal peptide followed by a neuropeptide with a predicted C-terminal RFamide motif, which therefore could be cross-reactive with FMRFamide antibodies (Shyamala et al., 1986). Subsequent biochemical analysis revealed that processing of this precursor gives rise to the amidated decapeptide APSWRPQGRF-NH<sub>2</sub>, which was named luqin because it is expressed in the left upper quadrant cells of the abdominal ganglion (Aloyz and DesGroseillers, 1995). Furthermore, a 76 amino acid peptide corresponding to the region of the precursor protein C-terminal to the luqin neuropeptide was also detected and named proline-rich mature peptide (PRMP) (Aloyz and DesGroseillers, 1995). PRMP contains two cysteines separated by a 10 amino acid-residue sequence and subsequent studies have revealed that this is an evolutionarily conserved feature of luqin-type precursor proteins (**Figure 1**) (Jékely, 2013; Mirabeau and Joly, 2013; Yañez-Guerra et al., 2018).
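The conserved precursor feature described above, two cysteines separated by 10 residues, lends itself to a simple pattern search. The following is a minimal sketch, assuming one-letter amino acid strings; the function name `find_cys_pairs` and the toy sequence are hypothetical illustrations and do not reproduce any real precursor.

```python
import re

# Two cysteines separated by exactly 10 residues of any kind.
CYS_PAIR = re.compile(r"C.{10}C")


def find_cys_pairs(protein: str) -> list:
    """Return (start index, matched segment) for each non-overlapping C-x(10)-C motif."""
    return [(m.start(), m.group()) for m in CYS_PAIR.finditer(protein)]


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real luqin-type precursor.
    toy = "AAAC" + "P" * 10 + "CAAA"
    print(find_cys_pairs(toy))
```

Such a scan only flags candidate motifs; whether a hit corresponds to the conserved C-terminal feature of a luqin-type precursor would still need to be judged from an alignment such as the one in Figure 1.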

In addition to luqin and PRMP, a mass spectrometric survey of the left upper quadrant neurons revealed that two other peptides are derived from the A. californica luqin precursor, which were named luqin-B and luqin-C (Li et al., 1998). The luqin-B fragment contains part of the mature luqin neuropeptide and the luqin-C fragment contains a shorter version of PRMP (Li et al., 1998), but it is not known if PRMP, luqin-B, or luqin-C are biologically active molecules. However, it has been shown that alternative splicing by exon skipping of one of the exons of the gene encoding the luqin precursor results in a frame shift and production of a precursor protein comprising the complete mature amidated decapeptide luqin and a short C-terminal region that does not contain the two cysteines characteristic of PRMP. This suggests that PRMP may not be essential for biosynthesis of the mature luqin neuropeptide. Still, from a functional perspective, it is noteworthy that while the full-length transcript is widely expressed in A. californica, the alternative transcript is specifically expressed in the kidney (Angers and DesGroseillers, 1998).

FIGURE 1 | Alignment of the N-terminal neuropeptide-containing and C-terminal regions of luqin-type precursor proteins in bilaterians. Conserved residues are highlighted in black or gray. The C-terminal residues of the luqin-type neuropeptides and species names are highlighted in phylum-specific colors: red (Mollusca), pink (Annelida), orange (Platyhelminthes, Brachiopods, and Nemerteans), green (Arthropoda), purple (Nematoda), yellow (Priapulida and Tardigrada), light blue (Echinodermata), and dark blue (Hemichordata). Species names are as follows: Acal (Aplysia californica), Aful (Achatina fulica), Cgig (Crassostrea gigas), Iobs (Ilyanassa obsoleta), Bgla (Biomphalaria glabrata), Ctel (Capitella teleta), Smed (Schmidtea mediterranea), Lana (Lingula anatina), Llon (Lineus longissimus), Dmel (Drosophila melanogaster), Aaeg (Aedes aegypti), Tcas (Tribolium castaneum), Cqua (Cherax quadricarinatus), Tsui (Trichuris suis), Cele (Caenorhabditis elegans), Pcau (Priapulus caudatus), Hduj (Hypsibius dujardini), Rvar (Ramazzottius varieornatus), Arub (Asterias rubens), Ovic (Ophionotus victoriae), Ajap (Apostichopus japonicus), Spur (Strongylocentrotus purpuratus), Skow (Saccoglossus kowalevskii). The sequences used for this alignment were reported in Koziol et al. (2016), Koziol (2018), Yañez-Guerra et al. (2018), and De Oliveira et al. (2019).

Sequencing of transcripts encoding luqin-type precursors and mass spectrometric identification of neuropeptides derived from them has revealed a high level of sequence conservation of luqin-type neuropeptides in molluscs. For example, in the giant snail Achatina fulica, a luqin-type peptide was identified as an amidated undecapeptide SGQSWRPQGRF-NH<sub>2</sub>, the C-terminal region of which (underlined) is identical to Aplysia luqin (Fujimoto et al., 1990), and in the pond snail Lymnaea stagnalis the luqin-type peptide TPHWRPQGRF-NH<sub>2</sub> was identified. It is noteworthy that the luqin precursors in A. californica, L. stagnalis, and A. fulica comprise a single luqin-type neuropeptide, whereas in some other gastropod molluscs, such as the eastern mudsnail Ilyanassa obsoleta and the freshwater snail Biomphalaria glabrata, the precursor comprises two luqin-type neuropeptides (**Figure 1**). The first insight into the characteristics of luqin receptors was made with the discovery that the luqin-type peptide TPHWRPQGRF-NH<sub>2</sub> is a ligand for an orphan G protein-coupled receptor, GRL106, in the pond snail L. stagnalis. It was also reported that this receptor is closely related to vertebrate tachykinin receptors and the Drosophila neuropeptide-Y-type receptor (Tensen et al., 1998).

Insights into the physiological roles of luqin-type neuropeptides were facilitated by the identification of a luqin-type neuropeptide in the snail A. fulica, a pulmonate gastropod. The peptide was discovered on account of its bioactivity as a cardioactive peptide that causes an increase in the frequency of beating when applied to auricle preparations from A. fulica. Hence, this peptide was named Achatina cardioexcitatory peptide (ACEP) (Fujimoto et al., 1990). However, ACEP also affects other muscle systems in A. fulica, increasing the amplitude of tetanic contraction of the penis retractor muscle and buccal muscle in response to electrical stimulation. Furthermore, ACEP was found to induce depolarization and rhythmic firing of a motor neuron (B4) that innervates buccal muscle (Fujimoto et al., 1990), an effect indicative of a physiological role in regulation of feeding behavior. Consistent with findings from A. fulica, the luqin-type peptide identified as a ligand for the Lymnaea receptor GRL106 was also found to be cardioexcitatory in this pulmonate gastropod species. Hence, the peptide was named Lymnaea cardioexcitatory peptide (LyCEP). Furthermore, the sequence similarity that LyCEP and ACEP share with Aplysia luqin was noted (Tensen et al., 1998). In accordance with the cardioexcitatory actions of LyCEP, immunohistochemical analysis revealed that LyCEP-immunoreactivity is present in nerve fibers ending in the pericardial cavity of the heart, indicating that the peptide is released into the pericardial cavity as a neurohormone (Tensen et al., 1998). However, LyCEP is not, as its name implies, specifically a cardioactive peptide because it is also expressed in nerve fibers associated with inhibition of the egg-laying hormone-producing caudodorsal cells in L. stagnalis. Accordingly, transcripts encoding the LyCEP receptor are present in the caudodorsal cells and LyCEP causes hyperpolarization of these cells in vitro (Tensen et al., 1998).

Analysis of the expression of the luqin precursor in the opisthobranch gastropod A. californica revealed transcripts in approximately 100 neurons of the central nervous system, but predominantly in neurons that innervate the circulatory and reproductive systems. In peripheral tissues, transcripts were detected in the intestine and in the kidneys (Giardino et al., 1996). Whole mount immunolabeling experiments with an antibody directed against the luqin precursor revealed immunoreactive fibers in different regions of the circulatory system of Aplysia, including the auricle, the ventricle, and the aorta. In the reproductive system, immunoreactive fibers were detected in the small and large hermaphroditic ducts and in the ovotestis (Giardino et al., 1996). The kidney also displayed immunoreactivity, located on the inner surface of the kidney wall. Strong immunoreactivity was also seen in neurites located in a large nerve associated with muscles of the renal pore, a sphincter that controls urine efflux (Angers et al., 2000). Altogether, these findings suggest roles of luqin-type peptides in regulation of the reproductive and circulatory systems, in fluid mobilization, and water homeostasis in gastropod molluscs.

While what is known about luqin expression and function in molluscs is largely based on experimental studies on selected gastropod species, as detailed above, genes/transcripts encoding luqin-type precursors have been identified in other gastropod species and in species belonging to other molluscan classes, including Bivalvia, Scaphopoda, Cephalopoda, Monoplacophora, Polyplacophora, Chaetodermomorpha, and Neomeniomorpha (De Oliveira et al., 2019). Thus, the occurrence of luqin-type neuropeptide precursors throughout the phylum Mollusca has been established, providing a basis for extending analysis of luqin-type neuropeptide function beyond gastropods to other classes.

### LUQIN-TYPE SIGNALING IN ANNELIDS

Luqin-type precursors with a similar organization to those in molluscs have been identified in annelids. Analysis of genomic sequence data from Capitella teleta revealed the occurrence of a luqin-type precursor comprising the predicted mature peptide QFAWRPQGRF-NH<sub>2</sub> (Veenstra, 2011) (**Figure 1**). Later, a partial precursor transcript was identified in the annelid Platynereis dumerilii and the structure of the luqin-type peptide derived from this precursor was confirmed as WRPQGRF-NH<sub>2</sub> using mass spectrometry (Conzelmann et al., 2013). Furthermore, a transcript encoding a luqin-type receptor was identified in P. dumerilii and pharmacological characterization of this receptor demonstrated that the luqin-type peptide WRPQGRF-NH<sub>2</sub> acts as a ligand for this receptor (Bauknecht and Jékely, 2015). Currently, there are no data available that provide insights into the physiological roles of luqin-type neuropeptides in annelids.

### LUQIN-TYPE SIGNALING IN OTHER SPIRALIANS

Analysis of genome/transcriptome sequence data has revealed the occurrence of genes/transcripts encoding luqin-type precursors in a variety of species belonging to the phylum Platyhelminthes. For example, in the planarian Schmidtea mediterranea, an expanded family of genes encoding four luqin-type precursors has been identified (Koziol et al., 2016). The predicted neuropeptides derived from these precursors share sequence similarities with luqin-type neuropeptides from molluscs and annelids, including a C-terminal RFamide motif and the N-terminal motif WRPQ, which is conserved with only conservative substitutions (**Figure 1**). Interestingly, only two of the four precursors have a C-terminal pair of cysteines separated by 10 amino acids, which is typically a conserved feature of luqin-type precursors in other phyla (Koziol et al., 2016). The functional significance of the loss of this feature in two precursors is unknown, but it may be reflective of a loss of selection pressure after gene duplications gave rise to the four precursor genes.

Genes encoding luqin-type precursors have also been identified in parasitic platyhelminths, including the fox tapeworm Echinococcus multilocularis, the salmon fluke Gyrodactylus salaris, the rodent tapeworm Hymenolepis microstoma, the broad fish tapeworm Diphyllobothrium latum, and the cestode Mesocestoides corti (Koziol et al., 2016). In all these species, a single luqin-type precursor comprising a single predicted neuropeptide was identified, with the peptides having the N-terminal motif WRPH that is similar to the WRPQ motif that is a feature of luqin-type neuropeptides in molluscs and annelids. Interestingly, however, the luqin-type peptides in these species are positioned in the C-terminal region of the precursor protein. This contrasts with precursors in other taxa, where luqin-type peptides are located N-terminally and proximal to the signal peptide. Furthermore, none of the luqin-type precursors in the parasitic platyhelminth species listed above have a C-terminal pair of cysteines separated by 10 amino acids. Thus, the luqin-type precursors identified in parasitic platyhelminths are highly divergent by comparison with other luqin-type precursors. This makes sequence alignment difficult and for this reason luqin-type precursors from parasitic platyhelminths are not included in **Figure 1** but instead they are shown in **Supplementary Figure 1B**. Nevertheless, evidence that luqin-type neuropeptides in parasitic platyhelminths are functional can be found in the identification of genes encoding candidate luqin-type receptors; for example, in E. multilocularis (Koziol et al., 2016). Furthermore, a comprehensive analysis of GPCRs encoded in the genome of the parasitic helminth Fasciola hepatica has revealed the presence of a receptor that is clearly an ortholog of the luqin-type receptor from the annelid P. dumerilii and the RYamide receptor from D. melanogaster (McVeigh et al., 2018). Further studies are now needed to confirm the predicted luqin-type ligand-receptor partners in platyhelminths and to investigate the expression and pharmacological actions of luqin-type neuropeptides in platyhelminths.

Luqin-type precursors have also been identified in the brachiopod Lingula anatina and in the nemertean Lineus longissimus (bootlace worm) (De Oliveira et al., 2019). In both species, a single luqin-type precursor was identified that contains a predicted mature neuropeptide with the C-terminal sequence WRPQGRF-NH2, which is the same motif identified in molluscan and annelid luqin-type peptides. The precursors identified in these species also have the typical C-terminal region containing the two cysteines separated by 10 amino acid residues (**Figure 1**). Furthermore, two proteins in L. anatina have been annotated as luqin-type receptors (XP\_013402794.1, XP\_013402807.1).

### RYamides: LUQIN-TYPE NEUROPEPTIDES IN ARTHROPODS

Arthropodan neuropeptides with a C-terminal RYamide motif (RYamides) were first identified in the decapod C. borealis by de novo post source decay sequencing of peptides in extracts of the pericardial organs of this species. Five different peptides were identified, all of them sharing the conserved C-terminal motif FXXXRY-NH2, where X is variable (Li et al., 2003). RYamides sharing the same C-terminal motif were subsequently identified by analysis of tissue extracts from other decapods, including Cancer productus (Fu et al., 2005), Pugettia producta (Stemmler et al., 2007), Carcinus maenas (Ma et al., 2009), and in the Pacific white shrimp Litopenaeus vannamei (Ma et al., 2010).

Genes/transcripts encoding precursors of RYamides have been identified in several crustacean species, including the water flea Daphnia pulex (Dircksen et al., 2011), the isopod Proasellus cavaticus (Christie, 2017), the red swamp crayfish Procambarus clarkii (Veenstra, 2015), the Australian crayfish Cherax quadricarinatus (Nguyen et al., 2016), and the freshwater amphipod Hyalella azteca (Christie et al., 2018). Interestingly, while most of these precursors comprise two RYamides, the D. pulex and P. clarkii precursors comprise three predicted RYamides (**Supplementary Figure 1A**). In the case of the D. pulex RYamide precursor, mass spectrometric analysis of brain tissue enabled identification of two of the three RYamides predicted to be derived from this precursor. The first peptide, located immediately after the signal peptide, was identified in its post-translationally modified form as pQTFFTNGRY-NH2, with both C-terminal amidation and N-terminal conversion of a glutamine residue to pyroglutamate. The second RYamide predicted to be derived from this precursor was not detected by mass spectrometry. Two different forms of the third RYamide peptide were detected: the 27-residue peptide SGNGGIVLGNSELDARNPERFFIGSRY-NH2 and a C-terminal fragment of this peptide (NPERFFIGSRY-NH2) generated by cleavage after the arginine residue at position 16 of the longer peptide (Dircksen et al., 2011).
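The sequence features described above can be checked with a short script. The following Python sketch is illustrative only: the function name `has_fxxxry_motif` is hypothetical, sequences are written without the C-terminal amide group, and the N-terminal pyroglutamate is written as Q. It tests the D. pulex peptides for the C-terminal FXXXRY motif and locates the point at which the shorter form of the third peptide begins within the 27-residue form.

```python
import re

# C-terminal motif F-X-X-X-R-Y, where X is any residue.
# The C-terminal amide (-NH2) is not represented in these strings.
FXXXRY = re.compile(r"F.{3}RY$")


def has_fxxxry_motif(peptide: str) -> bool:
    """Return True if the peptide ends with the FXXXRY motif."""
    return FXXXRY.search(peptide) is not None


# D. pulex peptides quoted in the text (pyroglutamate written as Q).
FIRST_PEPTIDE = "QTFFTNGRY"
LONG_FORM = "SGNGGIVLGNSELDARNPERFFIGSRY"
SHORT_FORM = "NPERFFIGSRY"

# 0-based index at which the short form starts inside the long form.
fragment_start = LONG_FORM.index(SHORT_FORM)
# Residue immediately N-terminal to the fragment (the cleavage site).
residue_before_fragment = LONG_FORM[fragment_start - 1]

if __name__ == "__main__":
    for name, seq in [("first", FIRST_PEPTIDE), ("long", LONG_FORM), ("short", SHORT_FORM)]:
        print(name, seq, has_fxxxry_motif(seq))
    print("cleavage after", residue_before_fragment, "at position", fragment_start)
```

A simple regular expression is enough here because the motif is anchored to the C-terminus; it says nothing about receptor binding and is only a sequence-level filter.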

Genes/transcripts encoding precursors of RYamides have also been identified in a variety of insects, including six Drosophila species, the red flour beetle T. castaneum, the silkworm Bombyx mori, the honey bee Apis mellifera, the pea aphid Acyrthosiphon pisum, the mosquitoes Aedes aegypti and Culex pipiens, and the stick insect Carausius morosus (Hauser et al., 2010; Liessem et al., 2018), and these typically comprise two predicted RYamides. Interestingly, in the RYamide precursor of the parasitic wasp Nasonia vitripennis paracopy expansion has occurred to give rise to a precursor comprising seven predicted RYamide-type neuropeptides (**Supplementary Figure 1A**). However, only one of these peptides has thus far been characterized biochemically by mass spectrometry (Hauser et al., 2010).

The first arthropod RYamide receptors to be characterized were from T. castaneum and D. melanogaster. In the case of the red flour beetle T. castaneum, a receptor was identified (GenBank Accession Number: HQ709383) and shown to be activated in a dose-dependent manner by the two RYamide peptides derived from the T. castaneum RYamide precursor (Collin et al., 2011). In D. melanogaster, the RYamide receptor was characterized independently by two laboratories, revealing that the two RYamides derived from the D. melanogaster RYamide precursor activate, in a dose-dependent manner, a receptor encoded by the gene CG5811 (Collin et al., 2011; Ida et al., 2011). Ida et al. (2011) also reported that injection of RYamide-1 suppresses the proboscis extension reflex (PER) in the blowfly Phormia regina, indicating a physiological role in regulation of feeding behavior. Subsequent experimental studies on this species revealed the presence of 26 RYamide-immunoreactive neurons in the brain and showed that injection of RYamide-1 or -2 has no effect on the volume of sucrose solution intake when feeding occurs but causes a reduction in the percentage of flies exhibiting the PER. Furthermore, injection of RYamide-1 was found to cause a significant decrease in the responsiveness to sucrose solution of sugar receptor neurons located on the labellum of the proboscis. Thus, it was concluded that RYamides suppress feeding motivation and sucrose responsiveness in the blowfly P. regina (Maeda et al., 2015). Analysis of the expression of the RYamide precursor in the silkworm B. mori using mRNA in situ hybridization revealed expression in the brain, terminal abdominal ganglion, and midgut. In the larval and adult brain, four to seven pairs of RYamide precursor-expressing neurons were identified in the protocerebrum and tritocerebrum. Expression was also revealed in a pair of posterior dorsomedial neurons in the terminal abdominal ganglion of adults and larvae. Lastly, RYamide precursor expression was revealed in enteroendocrine cells of the anterior midgut of larvae, pupae, and adult specimens of B. mori. This pattern of expression indicates that RYamides are involved in regulation of feeding and digestion in B. mori (Roller et al., 2016).

Consistent with the hypothesis that RYamides may also regulate feeding behavior in other arthropods, use of quantitative real-time PCR revealed that expression of the RYamide precursor gene is significantly downregulated in the brain after starvation in the decapod Marsupenaeus japonicus (kuruma shrimp). Furthermore, injection of RYamides into the muscle of juvenile M. japonicus caused suppression of food intake in some experiments, but this was not consistently reproducible (Mekata et al., 2017). Foregut activity in decapods is controlled by motor neurons in the stomatogastric ganglion (STG), the activity of which is regulated by a variety of different neuropeptide types (Marder and Bucher, 2007). Transcriptomic analysis of the STG in the crab C. borealis, performed to gain further insights into the complexity of neuropeptide signaling in the STG, revealed the presence of 46 transcripts encoding receptors for 27 different neuropeptide types, but interestingly RYamide receptor transcripts were not detected. Consistent with this finding, in vitro application of RYamides to C. borealis STG preparations had no consistent modulatory effect on the motor outputs of the ganglion (Dickinson et al., 2019). Thus, further studies are now required to investigate the mechanisms by which RYamides may affect feeding in decapod crustaceans.

RYamides are not only involved in regulation of feeding behavior in arthropods. Analysis of RYamide expression in several Drosophila species revealed that the RYamide precursor gene is expressed in two abdominal neurons of the adult central nervous system that project to the rectal papillae, organs that mediate water re-absorption in flies (Veenstra and Khammassi, 2017). Consistent with this expression pattern, injection of female mosquitoes with RYamides delays postprandial diuresis (Veenstra and Khammassi, 2017). Thus, it appears that RYamides may act as regulators of urine production in insects. Interestingly, this is consistent with immunohistochemical evidence of a similar role in the mollusc Aplysia, where luqin-immunoreactivity has been localized in a nerve associated with muscles of the renal pore, a sphincter that controls urine efflux (Angers et al., 2000) (see above).

### DETAILED FUNCTIONAL CHARACTERIZATION OF LUQIN-TYPE SIGNALING IN THE NEMATODE Caenorhabditis elegans

In 2003, Keating et al. reported an extensive functional analysis of GPCRs in the nematode C. elegans, using RNA interference (RNAi) methods to knock down GPCR gene expression. Included in this study was the gene Y59H11AL.1 (also now known as npr-22) and a phylogenetic analysis revealed that the protein encoded by Y59H11AL.1 is most closely related to the L. stagnalis luqin receptor GRL106. However, both the receptor encoded by Y59H11AL.1 and GRL106 were classified by the authors under the general descriptor of "tachykinin-like receptors" (Keating et al., 2003). Subsequently, efforts were made to identify the neuropeptide(s) that act as a ligand for the Y59H11AL.1-encoded GPCR (NPR-22) by screening a library of synthetic neuropeptides predicted from analysis of the sequences of C. elegans neuropeptide precursors (Mertens et al., 2006). It was discovered that FMRFamide-related peptides derived from the FLP-7 precursor can activate NPR-22, with the peptide FLP-7.3 (SPMERSAMVRF-NH2) being the most potent, albeit with an EC50 of ∼1 µM (Mertens et al., 2006). The authors also noted that NPR-22 (Y59H11AL.1) is closely related to the D. melanogaster receptor CG5811. However, at the time the peptides that act as ligands for CG5811 were unknown and it was not until 5 years later that it was discovered that CG5811 is a RYamide receptor (Collin et al., 2011; Ida et al., 2011).

In 2013, an extensive phylogenetic analysis of GPCRs and neuropeptide precursors identified a luqin-type precursor in C. elegans (Mirabeau and Joly, 2013). Furthermore, alignment of the C. elegans precursor with luqin-type precursors from other taxa revealed many conserved features. Thus, the precursor comprises two putative luqin-type peptides that have an RYamide motif and the C-terminal region of the precursor contains two cysteines, which are separated by eight amino acid residues (Mirabeau and Joly, 2013). The spacing of the two cysteine residues in the C. elegans luqin-type precursor is atypical of luqin-type precursors where, as highlighted above and shown in **Figure 1**, the two cysteine residues are usually separated by ten residues. However, this feature of the C. elegans luqin-type precursor is probably a derived characteristic because in another nematode species, the parasite Trichuris suis, the luqin-type precursor has two C-terminal cysteines separated by 10 residues (**Figure 1**) (Yañez-Guerra et al., 2018).
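The cysteine spacing discussed above is easy to measure programmatically once a precursor sequence is in hand. The following Python sketch is a minimal illustration under stated assumptions: the function name `cysteine_spacings` is hypothetical, and the two example strings are invented toy sequences, not real precursor sequences from any species.

```python
def cysteine_spacings(c_terminal_region: str) -> list[int]:
    """Return the number of residues lying between each pair of
    consecutive cysteines (C) in the given sequence."""
    positions = [i for i, residue in enumerate(c_terminal_region) if residue == "C"]
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]


# Toy C-terminal regions (hypothetical, for illustration only).
TYPICAL_TOY = "GSC" + "AELKDGTSVP" + "CKR"   # 10 residues between the cysteines
ATYPICAL_TOY = "GSC" + "AELKDGTS" + "CKR"    # 8 residues between the cysteines

if __name__ == "__main__":
    print(cysteine_spacings(TYPICAL_TOY))
    print(cysteine_spacings(ATYPICAL_TOY))
```

Applied to real precursor sequences, a spacing of 10 would correspond to the typical arrangement described in the text and a spacing of 8 to the arrangement reported for C. elegans.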

Recently, a detailed analysis of the phylogenetic relationships of luqin-type receptors revealed that the C. elegans receptor NPR-22 (Y59H11AL.1) is an ortholog of the L. stagnalis luqin receptor (GRL106) and the D. melanogaster receptor CG5811 and belongs to a clade of luqin-type receptors that is quite distinct from the closely related tachykinin-type receptors and neuropeptide-Y-type receptors (Yañez-Guerra et al., 2018) (**Figure 2**). Furthermore, the neuropeptides LURY-1.1 (PALLSRY-NH2) and LURY-1.2 (AVLPRY-NH2) derived from the C. elegans luqin/RYamide precursor (LURY-1) were shown to act as ligands for NPR-22 at nanomolar concentrations (Ohno et al., 2017). Additionally, the authors showed that, as reported previously by Mertens et al. (2006), the FLP-7.3 peptide (SPMERSAMVRF-NH2) derived from the FLP-7 precursor also acts as a ligand for NPR-22, but only at micromolar concentrations (Ohno et al., 2017). Therefore, it is likely that LURY-1.1 and LURY-1.2 are the natural ligands for NPR-22 and the ability of FLP-7-derived peptides to activate NPR-22 at micromolar concentrations may be an in vitro pharmacological and non-physiological phenomenon due to the presence of a C-terminal RYamide-like RFamide motif in these peptides. However, evidence that NPR-22 mediates effects of FLP-7-derived peptides in vivo has been reported in an investigation of neuroendocrine mechanisms of serotonin-induced fat loss in C. elegans (Palamiuc et al., 2017).

FIGURE 2 | Phylogenetic tree showing the occurrence and relationships of luqin/RYamide-type receptors and tachykinin-type receptors in bilaterians. The tree comprises two distinct receptor clades, luqin/RYamide-type receptors and the paralogous tachykinin-type receptors, with thyrotropin-releasing hormone (TRH)-type receptors included as an outgroup. Taxa are color-coded and SH-aLRT support (Guindon et al., 2010) [1000 replicates] for clades is represented with colored stars, as explained in the key. Species in which the peptide ligands that activate luqin/RYamide-type receptors or tachykinin-type receptors have been identified experimentally are shown with blue lettering. Species names are as follows: Aaeg (Aedes aegypti), Acal (Aplysia californica), Apis (Acyrthosiphon pisum), Arub (Asterias rubens), Cele (Caenorhabditis elegans), Cint (Ciona intestinalis), Ctel (Capitella teleta), Dmel (Drosophila melanogaster), Hduj (Hypsibius dujardini), Hsap (Homo sapiens), Lana (Lingula anatina), Lsta (Lymnaea stagnalis), Obim (Octopus bimaculoides), Ovul (Octopus vulgaris), Pcau (Priapulus caudatus), Pdum (Platynereis dumerilii), Rvar (Ramazzottius varieornatus), Skow (Saccoglossus kowalevskii), Spur (Strongylocentrotus purpuratus), Tcas (Tribolium castaneum), Tsui (Trichuris suis), Tpse (Trichinella pseudospiralis), Uuni (Urechis unicinctus). This figure is a modified version of a tree reported previously in Yañez-Guerra et al. (2018), with the addition of published luqin-type receptor sequences from the tardigrades H. dujardini and R. varieornatus (Koziol, 2018) and the brachiopod L. anatina (XP\_013402794.1, XP\_013402807.1) and tachykinin-type receptor sequences from the nematodes C. elegans (tk-1; C38C10.1) (Mirabeau and Joly, 2013) and T. spiralis (KRY8989.1). The alignment was performed using MUSCLE (Edgar, 2004) [16 iterations] and the trimming was made using BMGE (Criscuolo and Gribaldo, 2010) [standard automatic trimming]. The tree was generated in W-IQ-tree (Trifinopoulos et al., 2016) using the maximum likelihood method with automatic selection of the substitution model. The branch support analysis used was SH-aLRT (Guindon et al., 2010) with 1000 iterations.

Having identified the molecular components of a luqin/RYamide-type neuropeptide signaling system in C. elegans, Ohno et al. (2017) performed a detailed functional characterization of this signaling system. Analysis of the expression of the LURY-1 precursor and NPR-22 in C. elegans revealed that LURY-1 is expressed by the pharyngeal neurons M1 and M2, which regulate feeding in C. elegans. Thus, the M1 neuron stimulates spitting behavior whereas the M2 neuron stimulates pharyngeal pumping (Bhatla et al., 2015). With a much wider pattern of expression, NPR-22 is expressed in head muscles, I2 pharyngeal neurons, feeding pacemaker MC neurons, the RIH neuron, the interneurons AIA and AUA, the ASK neurons, the ASI neurons, a few B-type motoneurons in the posterior ventral nerve cord, pharyngeal muscles, body wall muscles, the intestine, and a few unidentified cells anterior to the nerve ring (Ohno et al., 2017; Palamiuc et al., 2017).

To gain insights into the physiological roles of LURY-1 derived peptides, the lury-1 gene was overexpressed in C. elegans, with several phenotypes being observed. First, there was a decrease in the number of unlaid eggs in the uterus but the rate of egg-laying was not affected, indicating that the rate of ovulation is normal but egg-laying is facilitated and embryos are laid prematurely. Accordingly, microinjection of synthetic LURY-1 peptides (10 µM) caused the same phenotype. Second, pharyngeal pumping, which is required for food intake, was reduced and microinjection of synthetic LURY-1 peptides (10 µM) caused the same phenotype. Third, adult lifespan was extended by as much as 21–50%. Finally, a reduction in locomotor activity was observed (Ohno et al., 2017). Importantly, all of these phenotypes were largely suppressed by the deletion of npr-22, indicating that LURY-1 peptides act upstream of NPR-22. Furthermore, these findings were consistent with a previous analysis of npr-22 knockdown, which revealed a phenotype in which animals have a reduced body size and brood size (Ceron et al., 2007). More specifically, cell-specific rescue experiments indicated that NPR-22 acts in the feeding pacemaker MC neurons to control feeding and lifespan and NPR-22 acts upstream of the serotonin-uptaking RIH neuron to control egg-laying (Ohno et al., 2017). The authors conclude that food-evoked activation of the pharynx triggers MC neurons to release LURY-1 peptides, which then act as hormones via NPR-22-dependent mechanisms to control feeding, egg-laying, and roaming behavior (Ohno et al., 2017). Thus, use of C. elegans as a model experimental system has provided the first whole-animal perspective on the physiological/behavioral roles of luqin-type neuropeptide signaling.

### RYamide-TYPE NEUROPEPTIDE PRECURSORS IN OTHER ECDYSOZOANS

Analysis of transcriptome sequence data from the penis worm Priapulus caudatus (phylum Priapulida) revealed the existence of a luqin-type neuropeptide precursor in this species (Yañez-Guerra et al., 2018). This precursor comprises two putative luqin-type neuropeptides with a C-terminal RYamide motif and has the conserved C-terminal region with two cysteines separated by 10 amino acid residues (**Figure 1**). Interestingly, the P. caudatus neuropeptides share similarities with both arthropod/nematode RYamides and spiralian luqins. Thus, while the C-terminal RYamide motif is a conserved feature of this neuropeptide family in ecdysozoans, the N-terminal region of the P. caudatus luqin-type neuropeptides shares more sequence similarity with mollusc/annelid luqins than with arthropod/nematode RYamides. For example, the N-terminal sequence QWRP in one of the P. caudatus RYamides is also a feature of several molluscan luqin-type peptides (**Figure 1**). These "intermediate" characteristics of the P. caudatus RYamides are also reflected in a phylogenetic analysis of the relationships of luqin/RYamide-type precursors, where the P. caudatus RYamide precursor is not positioned in a clade comprising arthropod/nematode precursors, as would be expected based on animal phylogenetic relationships, but instead it is positioned at the base of a clade comprising mollusc/annelid precursors. This suggests that the P. caudatus RYamide precursor may have retained many of the ancestral characteristics of protostome luqin-type precursors, whereas the arthropod/nematode luqin-type precursors appear to be more divergent (Yañez-Guerra et al., 2018). In P. caudatus, two candidate receptors for luqin-type neuropeptides have been identified based on their phylogenetic relationship with luqin-type receptors that have been characterized in other protostomes (**Figure 2**) (Yañez-Guerra et al., 2018). Experimental studies are now needed to determine if the two P. caudatus luqin-type neuropeptides are effective as ligands for both receptors or if the receptors exhibit preferential ligand binding. Furthermore, experimental studies are needed to investigate the physiological roles of luqin-type signaling in priapulids.

Analysis of the genome sequences of the tardigrades Hypsibius dujardini and Ramazzottius varieornatus (phylum Tardigrada) has revealed the occurrence of luqin-type precursors and receptors in both of these species (Koziol, 2018). One luqin-type precursor was identified in H. dujardini and two luqin-type precursors were identified in R. varieornatus (Koziol, 2018). The predicted neuropeptides derived from these precursors have a C-terminal RYamide motif, consistent with other members of this neuropeptide family in ecdysozoans. However, atypical of ecdysozoan luqin-type precursors, the tardigrade precursors comprise only one predicted neuropeptide (**Figure 1**). The C-terminal region of the precursor contains two cysteines separated by 10 amino acid residues in H. dujardini and in one of the two precursors from R. varieornatus, while the second precursor in R. varieornatus lacks the second cysteine (**Figure 1**). Genes encoding a single luqin-type receptor have been identified in both H. dujardini and R. varieornatus (Koziol, 2018) but the ligand-binding properties of these receptors remain to be investigated experimentally. Furthermore, nothing is known about the physiological roles of luqin-type neuropeptides in tardigrades and so this will be an interesting area for investigation in the future, particularly in the context of their remarkable capacity to withstand extreme environmental conditions, including radiation tolerance, desiccation, and both high and low temperature and pressure (Jönsson, 2019; Jönsson et al., 2019; Kamilari et al., 2019).

### DISCOVERY OF LUQIN-TYPE SIGNALING IN AMBULACRARIAN DEUTEROSTOMES REVEALS THE URBILATERIAN ORIGIN OF THIS NEUROPEPTIDE SIGNALING SYSTEM

The discovery of precursors of luqin-type peptides in deuterostomian invertebrates was first reported in 2013. They were identified in the hemichordate Saccoglossus kowalevskii and in the echinoderm Strongylocentrotus purpuratus, which was facilitated by the presence of the aforementioned conserved C-terminal region containing two cysteines separated by 10 amino acid residues (Jékely, 2013). The S. purpuratus and S. kowalevskii precursor proteins comprise the putative neuropeptides EIRSPGGKPHKFMRW-NH2 and EGSNTFLRW-NH2, respectively. Thus, the presence of a C-terminal FXRW-NH2 motif (where X is L or M) was identified as a characteristic feature of luqin-type peptides in the ambulacrarian clade of the deuterostomes (Elphick and Mirabeau, 2014).

Through analysis of transcriptome/genome sequence data, luqin-type neuropeptide precursors have subsequently been identified in other echinoderm classes, including Holothuroidea (sea cucumbers), Asteroidea (starfish or sea stars), and Ophiuroidea (brittle stars). In the sea cucumbers Apostichopus japonicus, Holothuria glaberrima, Holothuria scabra, and Holothuria leucospilota, the precursor comprises a single neuropeptide with the same predicted structure in all four species: KPYKFMRW-NH2 (Rowe et al., 2014; Suwansa-Ard et al., 2018; Chen et al., 2019; Chieu et al., 2019). Luqin-type precursors identified in the starfish species A. rubens and Acanthaster planci comprise a single putative neuropeptide with the amino acid sequence EEKTRFPKFMRW-NH2 and EKGRFPKFMRW-NH2, respectively (Semmens et al., 2016; Smith et al., 2017). Ophiuroid luqin-type precursors also comprise a single putative neuropeptide, which has the predicted sequence QGFNRDGPAKFMRW-NH2 in Ophionotus victoriae, QGFNRGEGPAKFMRW-NH2 in Ophiopsila aranea, and QGFSRDGPAKFMRW-NH2 in Amphiura filiformis (Zandawala et al., 2017). Thus, the C-terminal motif KFMRW-NH2 appears to be a conserved feature of luqin-type neuropeptides in echinoderms.
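The conserved C-terminal motif can be recovered directly from the peptide sequences quoted in this section. The following Python sketch is illustrative only: the function name `longest_common_suffix` is hypothetical, and sequences are written without the C-terminal amide. It computes the longest C-terminal sequence shared by the echinoderm peptides, and then repeats the comparison with the hemichordate (S. kowalevskii) peptide included.

```python
def longest_common_suffix(sequences: list[str]) -> str:
    """Return the longest C-terminal stretch shared by all sequences."""
    shared = []
    # Walk backwards from the C-terminus, one aligned column at a time.
    for column in zip(*(seq[::-1] for seq in sequences)):
        if len(set(column)) != 1:
            break
        shared.append(column[0])
    return "".join(reversed(shared))


# Echinoderm luqin-type peptides quoted in the text (amide omitted).
ECHINODERM_PEPTIDES = [
    "EIRSPGGKPHKFMRW",   # sea urchin
    "KPYKFMRW",          # sea cucumbers
    "EEKTRFPKFMRW",      # starfish
    "EKGRFPKFMRW",       # starfish
    "QGFNRDGPAKFMRW",    # brittle star
    "QGFNRGEGPAKFMRW",   # brittle star
    "QGFSRDGPAKFMRW",    # brittle star
]
HEMICHORDATE_PEPTIDE = "EGSNTFLRW"

if __name__ == "__main__":
    print(longest_common_suffix(ECHINODERM_PEPTIDES))
    print(longest_common_suffix(ECHINODERM_PEPTIDES + [HEMICHORDATE_PEPTIDE]))
```

With these inputs the shared echinoderm suffix is KFMRW, whereas adding the hemichordate peptide reduces the strictly shared suffix to RW, in keeping with the broader FXRW description (X being L or M) for ambulacrarians.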

A large-scale analysis of the phylogenetic distribution of G-protein coupled neuropeptide receptors in bilaterian animals revealed the presence of luqin-type receptors in ambulacrarians (hemichordates and echinoderms) (Mirabeau and Joly, 2013), consistent with the identification of luqin-type neuropeptide precursors in these taxa. However, luqin-type receptors were not identified in vertebrates or other chordates (urochordates and cephalochordates) and accordingly luqin-type neuropeptide precursors have not been identified in these taxa. Thus, it was concluded that the evolutionary origin of luqin-type receptors can be traced to the common ancestor of protostomes and deuterostomes, but with subsequent loss in the chordate lineage (Mirabeau and Joly, 2013). Furthermore, the phylogenetic analysis of neuropeptide receptor relationships reported by Mirabeau and Joly (2013) revealed that luqin-type receptors are paralogs of tachykinin-type receptors and this finding was confirmed recently by a phylogenetic analysis specifically focused on luqin-type receptors and closely related neuropeptide receptors (Yañez-Guerra et al., 2018) (**Figure 2**). Thus, it can be inferred that gene duplication in a common ancestor of the Bilateria gave rise to the paralogous luqin-type and tachykinin-type signaling systems, but with subsequent loss of luqin-type signaling in chordates (**Figure 3**).

determined experimentally. Note the loss of the luqin-type signaling system in the chordate lineage, which is signified by the red cross and the white-filled boxes. Note that Xenacoelomorpha are not included in this diagram because of the controversy regarding the phylogenetic position of this phylum. However, as discussed in this review, luqin-type receptors have been identified in xenacoelomorphs but the precursors of peptides that act as ligands for these receptors have yet to be identified. The cladogram depicting bilaterian relationships is based on a recent phylogenetic study reported by Laumer et al. (2019).

A detailed analysis of luqin-type receptors in ambulacrarians revealed the presence of four genes/transcripts in the hemichordate S. kowalevskii and two genes/transcripts in the echinoderms S. purpuratus (sea urchin) and A. rubens (starfish) that encode members of this family of neuropeptide receptors (Yañez-Guerra et al., 2018). Furthermore, the starfish A. rubens was selected as a model experimental system in which to functionally characterize luqin-type neuropeptide signaling for the first time in a deuterostome. A cDNA encoding the A. rubens luqin-type precursor ArLQP was cloned and the structure of the mature peptide derived from this precursor was determined using mass spectrometry as a 12 amino acid residue peptide that is C-terminally amidated: EEKTRFPKFMRW-NH2 (ArLQ). Cloning, sequencing, and heterologous expression of cDNAs encoding two A. rubens luqin-type receptors (ArLQR1 and ArLQR2) facilitated testing of synthetic ArLQ as a candidate ligand for these receptors. This revealed that ArLQ is a potent ligand for both ArLQR1 and ArLQR2, with EC50 values of 2.4 × 10<sup>−8</sup> and 7.8 × 10<sup>−10</sup> M, respectively (Yañez-Guerra et al., 2018).
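To put the two EC50 values in perspective, the following Python sketch converts them to pEC50 values and computes the fold difference in potency between the two receptors. It is a rough illustration only: the function names are hypothetical, and the simple Hill equation with a Hill coefficient of 1 is an assumption made here for illustration, not a fitted model from the cited study.

```python
import math

# EC50 values (molar) for ArLQ at the two A. rubens receptors, as quoted in the text.
EC50_MOLAR = {"ArLQR1": 2.4e-8, "ArLQR2": 7.8e-10}


def pec50(ec50_molar: float) -> float:
    """Negative log10 of the EC50 expressed in molar units."""
    return -math.log10(ec50_molar)


def hill_response(conc_molar: float, ec50_molar: float, hill_coefficient: float = 1.0) -> float:
    """Fractional response (0 to 1) under a simple Hill model.
    The default Hill coefficient of 1 is an illustrative assumption."""
    ratio = (conc_molar / ec50_molar) ** hill_coefficient
    return ratio / (1.0 + ratio)


# How many times lower the EC50 is at ArLQR2 than at ArLQR1.
fold_difference = EC50_MOLAR["ArLQR1"] / EC50_MOLAR["ArLQR2"]

if __name__ == "__main__":
    for receptor, ec50 in EC50_MOLAR.items():
        print(receptor, f"pEC50 = {pec50(ec50):.2f}")
    print(f"fold difference = {fold_difference:.1f}")
```

On these figures the peptide is roughly 30-fold more potent at ArLQR2 than at ArLQR1, although both values fall in the nanomolar-to-subnanomolar range.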

To gain insights into the physiological roles of luqin-type neuropeptide signaling in A. rubens, mRNA in situ hybridization methods were employed to investigate the expression pattern of ArLQP in adult starfish. ArLQP-expressing cells were revealed in the central nervous system, including the circumoral nerve ring and the radial nerve cords. However, expression was limited to the ectoneural region, which contains sensory and interneurons, with no expression detected in motoneuron cell bodies located in the hyponeural region. ArLQP-expressing cells were also revealed in starfish locomotor organs—tube feet—specifically located in close proximity to the basal nerve ring in the disk region. Lastly, in the digestive system, ArLQP-expressing cells were revealed in the cardiac stomach and pyloric stomach (Yañez-Guerra et al., 2018). Efforts to generate antibodies to ArLQ were also made to enable immunohistochemical analysis of ArLQ expression in A. rubens, but these were unsuccessful. Nevertheless, informed by the pattern of ArLQP expression revealed by use of mRNA in situ hybridization, synthetic ArLQ was tested as a potential myoactive peptide on in vitro preparations of tube feet and cardiac stomach. No effects on cardiac stomach preparations were observed but, interestingly, ArLQ was found to cause dose-dependent relaxation of tube foot preparations (Yañez-Guerra et al., 2018). Furthermore, the relaxing effect of ArLQ was similar in potency and magnitude to SALMFamide-2 (S2), a neuropeptide that has been identified and functionally characterized previously as a myorelaxant in A. rubens (Elphick et al., 1991; Melarange and Elphick, 2003). Clearly, further studies are now needed to gain broader insights into the physiological roles of luqin-type neuropeptides in starfish and other echinoderms. 
Further investigation of the physiological roles of luqin-type neuropeptide signaling in echinoderms could also be extended beyond adult animals to the free-swimming larval stage of these animals. Detailed anatomical analyses of neuropeptide precursor gene expression in larvae of A. rubens and S. purpuratus have been reported recently (Mayorova et al., 2016; Wood et al., 2018) but these studies did not incorporate analysis of the expression of luqin-type precursors. Therefore, this is also an important objective for future work on luqin-type neuropeptide signaling in echinoderms.

### LUQIN-TYPE NEUROPEPTIDE SIGNALING IN XENACOELOMORPHS: RECEPTORS WITH MISSING LIGANDS

The phylum Xenacoelomorpha comprises an assemblage of marine worms that have a simple body plan without a throughgut (Hejnol and Pang, 2016; Telford and Copley, 2016; Gavilán et al., 2019). They are of particular interest for evolutionary studies because of controversy regarding their phylogenetic position in the animal kingdom. On the one hand, they have been placed as a sister group to all other bilaterian animals [Nephrozoa hypothesis] (Cannon et al., 2016; Rouse et al., 2016). Alternatively, they are considered to be closely related to the Ambulacraria, forming a clade known as the Xenambulacraria [Xenambulacraria hypothesis] (Bourlat et al., 2006; Philippe et al., 2011). The most recent analysis of the phylogenetic position of xenacoelomorphs, including a strategy designed to mitigate the effects of systematic errors, has supported the Xenambulacraria hypothesis (Philippe et al., 2019).

Analysis of transcriptome sequence data from 13 xenacoelomorph species revealed the occurrence of luqin-type receptors in this phylum (Thiel et al., 2018). Transcripts encoding luqin-type receptors were identified in species belonging to each of the three xenacoelomorph sub-phyla: Xenoturbellida (Xenoturbella bocki), Nemertodermatida (Nemertoderma westbladi), and Acoela (Hofstenia miamia). Furthermore, phylogenetic analysis revealed that these receptors form part of a clade of luqin-type receptors that includes spiralian luqin receptors and ecdysozoan RYamide receptors. Interestingly, the xenacoelomorph luqin-type receptors are positioned within a branch that also includes ambulacrarian luqin-type receptors (Thiel et al., 2018). Therefore, this may be additional evidence in support of the Xenambulacraria hypothesis. Thus far, precursors of luqin-type neuropeptides have yet to be identified in xenacoelomorphs and so this represents an important objective for future work. In particular, it would be interesting to determine the C-terminal motif of luqin-type neuropeptides in xenacoelomorphs. If the peptides have a C-terminal RWamide motif, which is a characteristic of ambulacrarian luqin-type neuropeptides, then this would be further evidence of a close relationship between xenacoelomorphs and ambulacrarians. Furthermore, discovery of luqin-type neuropeptides in xenacoelomorphs would provide a basis for functional investigation of physiological roles of these neuropeptides in this phylum.

### GENERAL CONCLUSIONS AND SPECULATIONS

The discovery and naming of luqin as a neuropeptide that is expressed in left upper quadrant cells of the abdominal ganglion of the mollusc Aplysia was perhaps a rather esoteric beginning to a new field of neuropeptide research. However, gradually over a period of more than three decades, it has become apparent that in fact this finding has broad relevance to neuropeptide signaling in bilaterian animals, with the notable exception of chordates. Although the name luqin was highly specific in its derivation, it nevertheless provides a useful generic name for the neuropeptide family. Thus, it is preferable to RFamides, RYamides, or RWamides because these C-terminal motifs are not unique to or even generally applicable to the neuropeptide family as a whole. Therefore, our recommendation is that all members of this neuropeptide family are referred to as "luqins," recognizing of course that the derivation of the name is meaningless beyond Aplysia.

Comparison of the sequences of luqin-type precursors has revealed variability in the number of luqin-type neuropeptides derived from these proteins. All the luqin-type precursors identified thus far in ambulacrarians comprise a single luqin-type neuropeptide and this is also a feature of some luqin-type precursors in spiralians and ecdysozoans (**Figure 1**). Therefore, it could be inferred that this may reflect the characteristics of the luqin-type precursor in the common ancestor of the Bilateria. However, the existence of precursors that comprise two luqin-type neuropeptides is a feature of many ecdysozoans and some spiralian species, indicating perhaps that this characteristic has evolved independently in both lineages (**Figure 1**). Further expansions in the number of luqin-type neuropeptides derived from precursor proteins are seen in some crustacean species; for example, in D. pulex and P. cavaticus, where there are three predicted mature peptides (Dircksen et al., 2011; Christie, 2017) (**Supplementary Figure 1A**). However, the most extreme example is seen in the wasp N. vitripennis, where the precursor comprises seven predicted luqin-type neuropeptides (Hauser et al., 2010) (**Supplementary Figure 1A**). The functional significance of this expansion in some arthropods is as yet unknown. Furthermore, the existence of multiple genes encoding luqin-type neuropeptides is a feature of some platyhelminth species (Koziol et al., 2016).

Thus far, genes/transcripts encoding luqin-type precursors and/or luqin-type receptors have been discovered in all of the non-chordate bilaterian phyla that have been investigated, but not in chordates (**Figure 3**). The loss of LQ-type signaling in chordates is of interest from a functional perspective, as discussed below. However, loss of LQ-type signaling in chordates has also influenced efforts to classify neuropeptide receptors. This accounts for the original classification of the C. elegans and L. stagnalis LQ-type receptors as "tachykinin-like receptors" (Keating et al., 2003). More recently, in a phylogenetic analysis of gonadotropin-inhibitory hormone (GnIH)-type signaling, it was concluded that GnIH-type receptors have a "strong evolutionary relationship" with the C. elegans receptor NPR-22 (Ubuka and Tsutsui, 2018). This conclusion was a consequence of a phylogenetic analysis that was restricted to comparison of human GnIH-type receptors with a variety of C. elegans neuropeptide receptors, while more comprehensive phylogenetic analyses clearly show that the C. elegans receptor NPR-22 is an LQ-type receptor (Ohno et al., 2017; Yañez-Guerra et al., 2018).

While genes/transcripts encoding luqin-type precursors and/or luqin-type receptors have been discovered in all of the non-chordate bilaterian phyla that have been investigated, there are still many phyla that remain to be examined and so we cannot rule out the possibility that luqin-type neuropeptide signaling has been lost in some non-chordate phyla. Nevertheless, based on what is currently known, the loss of luqin signaling in chordates appears to be singular and therefore notable. Why then was luqin signaling lost in the chordates? To address this question, we need to consider, first, what is known about the physiological roles of luqin-type neuropeptide signaling and, second, the paralogous relationship of luqin-type signaling with tachykinin-type signaling.

Although luqin-type signaling was first discovered and then functionally characterized in molluscs, it was the recent detailed analysis of this signaling system in the nematode C. elegans, employing use of reverse genetic techniques, that has provided the most comprehensive insights into the physiological roles of luqin-type neuropeptide signaling. Furthermore, the findings from C. elegans provide a broad functional context for comparison with experimental findings from other less intensely studied taxa (**Figure 4**).

A key finding from C. elegans was that luqin-type neuropeptides are secreted by a pair of pharyngeal neurons (M1 and M2) and act as hormones to suppress feeding (Ohno et al., 2017). It is noteworthy, therefore, that luqin-type RYamides suppress feeding motivation and sucrose responsiveness in an insect (Maeda et al., 2015) and brain expression of the luqin-type RYamide precursor gene is significantly downregulated after starvation in a crustacean (Mekata et al., 2017). Thus, in ecdysozoans, there is evidence of an evolutionarily conserved role of luqin-type neuropeptide signaling as an inhibitory regulator of feeding in association with changes in food availability. Further studies are now required to investigate if luqin-type neuropeptides also regulate feeding activity in spiralians and ambulacrarians. Nevertheless, it is noteworthy that in the mollusc A. fulica luqin has excitatory effects on buccal neurons and muscles (Fujimoto et al., 1990) and in the starfish A. rubens a luqin-type neuropeptide precursor is expressed in the cardiac stomach, a region of the digestive system that is everted when starfish feed (Yañez-Guerra et al., 2018).

Luqin-type neuropeptides also regulate egg laying in C. elegans, acting via the luqin-type receptor NPR-22 upstream of the serotonin-uptaking RIH neuron. Thus, by comparison with wild-type animals, mutant worms lacking expression of the luqin-type LURY-1 precursor or NPR-22 exhibit reduced egg-laying during a period of refeeding after starvation (Ohno et al., 2017). Accordingly, detection of luqin-immunoreactivity in nerve fibers associated with the hermaphroditic ducts and the ovotestis and in nerve fibers associated with inhibition of the egg-laying hormone-producing caudodorsal cells in L. stagnalis, which express luqin-type receptor transcripts (Tensen et al., 1998), suggests that luqin-type signaling may likewise regulate egg laying in molluscs.

Overexpression of luqin-type neuropeptides in C. elegans produced a phenotype where worms exhibited reduced roaming activity and, importantly, this was not observed in animals lacking the luqin-type receptor NPR-22. It was concluded that this action may reflect a physiological role of luqin signaling in attenuating locomotory activity when food is abundant (Ohno et al., 2017). It is noteworthy, therefore, that the relaxing effect of the luqin-type neuropeptide ArLQ on starfish tube feet may likewise be consistent with a physiological role in inhibitory regulation of locomotor activity in an echinoderm (Yañez-Guerra et al., 2018). Investigation of the effects of ArLQ on locomotor activity will therefore be an interesting objective for future studies, employing use of methods that have been reported recently to examine the effects of other neuropeptides on starfish locomotor activity (Tinoco et al., 2018).

The discovery that luqin-type signaling is paralogous to tachykinin-type signaling was based on phylogenetic analysis of the relationships of G-protein coupled neuropeptide receptors in the Bilateria (Mirabeau and Joly, 2013; Thiel et al., 2018; Yañez-Guerra et al., 2018) (**Figure 3**). Therefore, it can be inferred that duplication of genes encoding a common ancestral neuropeptide precursor and receptor occurred in a common ancestor of the Bilateria to give rise to the paralogous luqin-type and tachykinin-type signaling systems. Analysis of the phylogenetic distribution of tachykinin-type signaling indicates that it has been retained in all bilaterian phyla that have been analyzed (Mirabeau and Joly, 2013; Elphick et al., 2018).

A possible explanation for the loss of a neuropeptide signaling system in a taxon could be functional redundancy with respect to a paralogous signaling system. However, given (i) that the time elapsed since the gene duplications that gave rise to the paralogous luqin-type and tachykinin-type signaling systems is likely to be in excess of 650 million years, based on the estimated time of divergence of protostomes and deuterostomes (Erwin et al., 2011), and (ii) the preservation of both signaling systems in most bilaterian phyla that have been analyzed, it seems unlikely that this explains the loss of luqin signaling in chordates. Further insights may be obtained as part of a broader investigation of the functional significance of the loss of bilaterian neuropeptide signaling systems. Examples of other neuropeptide types that have been lost in chordates include the leucokinin-type and pigment dispersing factor (PDF)-type signaling systems (Mirabeau and Joly, 2013), while more specifically corazonin-type signaling has been lost in vertebrates and urochordates but not in cephalochordates (Tian et al., 2016; Zandawala et al., 2018). Comparisons with these neuropeptide systems may therefore be informative in future efforts to draw general conclusions on the evolutionary and functional significance of neuropeptide loss in chordates.

### AUTHOR CONTRIBUTIONS

Both authors contributed equally to the planning and writing of the review article. LY-G produced the figures.

### FUNDING

LY-G was supported by a Leverhulme Trust grant (RPG-2016-353) and by a Ph.D. studentship awarded by the Mexican Council of Science and Technology (CONACyT studentship No. 418612) and Queen Mary University of London. ME was supported by grants from the Biotechnology and Biological Sciences Research Council (BB/M001644/1) and the Leverhulme Trust (RPG-2016-353).

### ACKNOWLEDGMENTS


We thank Michael Crossley (University of Sussex, United Kingdom), Marycruz Flores-Flores (Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Mexico), Marina Ezcurra (University of Kent, United Kingdom), and Ray Crundwell (Queen Mary University of London, United Kingdom) for providing copyright-free photographs of Lymnaea stagnalis, Drosophila melanogaster, Caenorhabditis elegans, and Asterias rubens, respectively.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2020.00130/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Yañez-Guerra and Elphick. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Expression Analysis of Cnidarian-Specific Neuropeptides in a Sea Anemone Unveils an Apical-Organ-Associated Nerve Net That Disintegrates at Metamorphosis

#### Hannah Zang<sup>1,2</sup> and Nagayasu Nakanishi<sup>2</sup>\*

<sup>1</sup> Lyon College, Batesville, AR, United States, <sup>2</sup> Department of Biological Sciences, University of Arkansas, Fayetteville, AR, United States

#### Edited by:

Elizabeth Amy Williams, University of Exeter, United Kingdom

#### Reviewed by:

Chiara Sinigaglia, UMR5242 Institut de Génomique Fonctionnelle de Lyon (IGFL), France

Meet Zandawala, Brown University, United States

#### \*Correspondence:

Nagayasu Nakanishi nnakanis@uark.edu

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 12 November 2019 Accepted: 31 January 2020 Published: 19 February 2020

#### Citation:

Zang H and Nakanishi N (2020) Expression Analysis of Cnidarian-Specific Neuropeptides in a Sea Anemone Unveils an Apical-Organ-Associated Nerve Net That Disintegrates at Metamorphosis. Front. Endocrinol. 11:63. doi: 10.3389/fendo.2020.00063

Neuropeptides are ancient neuronal signaling molecules that have diversified across Cnidaria (e.g., jellyfish, corals, and sea anemones) and its sister group Bilateria (e.g., vertebrates, insects, and worms). Lineage-specific neuropeptides emerged over the course of neuropeptide evolution, but their roles in the evolution and diversification of nervous system function remain enigmatic. As a step toward filling in this knowledge gap, we investigated the expression pattern of a cnidarian-specific neuropeptide, RPamide, during the development of the starlet sea anemone Nematostella vectensis, using in situ hybridization and immunohistochemistry. We show that RPamide precursor transcripts first occur during gastrulation in scattered epithelial cells of the aboral ectoderm. These RPamide-positive epithelial cells exhibit a spindle-shaped, sensory-cell-like morphology, and extend basal neuronal processes that form a nerve net in the aboral ectoderm of the free-swimming planula larva. At the aboral end, RPamide-positive sensory cells become integrated into the developing apical organ that forms a bundle of long cilia referred to as the apical tuft. Later during planula development, RPamide expression becomes evident in sensory cells in the oral ectoderm of the body column and pharynx, and in the developing endodermal nervous system. At metamorphosis into a polyp, the RPamide-positive sensory nerve net in the aboral ectoderm degenerates by apoptosis, and RPamide expression begins in ectodermal sensory cells of growing oral tentacles. In addition, we find that the expression pattern of RPamide in planulae differs from that of conserved neuropeptides that are shared across Cnidaria and Bilateria, indicative of distinct functions. Our results not only provide the anatomical framework necessary to analyze the function of the cnidarian-specific neuropeptides in future studies, but also reveal previously unrecognized features of the sea anemone nervous system: the apical organ neurons of the planula larva, and metamorphosis-associated reorganization of the ectodermal nervous system.

Keywords: neuropeptide, evolution, cnidaria, sea anemone, neural development, metamorphosis

## INTRODUCTION

Neuropeptides are short polypeptide hormones that are generated from a larger precursor protein via proteolytic cleavage in neurons and neuroendocrine cells [reviewed in (1)]. The cleaved peptides, in some cases, undergo peptide α-amidation involving post-translational conversion of a C-terminal glycine into an amide group. Processed neuropeptides are stored in membrane-bound secretory vesicles that are distributed throughout the cell. Upon receiving stimuli, the secretory vesicles fuse with the cell membrane, releasing neuropeptides into the extracellular space and/or circulation. The neuropeptides then diffuse and bind to receptors (typically G-protein-coupled receptors) in the cell membrane of target cells, triggering intracellular signaling cascades. Neuropeptides, but not small molecule neurotransmitters such as glutamate, GABA and acetylcholine, are commonly found in the nervous systems of cnidarians (e.g., jellyfishes and sea anemones) and bilaterians (e.g., vertebrates and insects), and thus are thought to be the first neurotransmitter molecules in nervous system evolution (2, 3). In an attempt to better understand how the primordial nervous system may have functioned, efforts have been directed toward resolving deeply conserved functions of neuropeptides that are shared across Cnidaria and Bilateria. These studies suggest a likely ancestral role of deeply conserved Wamide neuropeptides in modulating life cycle transition (4–6). Comparatively little is known, however, about how lineage-specific neuropeptides contribute to the evolution and diversification of nervous system function, particularly in Cnidaria. As a step toward understanding the role of lineage-specific neuropeptides in the evolution of neural function in Cnidaria, we investigated the expression pattern of a cnidarian-specific neuropeptide, RPamide, during development of the starlet sea anemone Nematostella vectensis.

Cnidaria is an early-evolving animal group that diverged from its sister group Bilateria over 600 million years ago in the Precambrian (7–11). Cnidaria represents a diverse clade in which Medusozoa forms a sister group to Anthozoa (12, 13); Medusozoa consists of Hydrozoa (e.g., Portuguese man o' war), Staurozoa (stalked jellyfish), Scyphozoa (e.g., moonjelly), and Cubozoa (e.g., sea wasp), and Anthozoa comprises Octocorallia (e.g., sea fans) and Hexacorallia (e.g., hard corals and sea anemones). Cnidarians are diploblastic, being composed of ectoderm and endoderm separated by an extracellular matrix called the mesoglea, and are characterized by having a phylum-defining stinging cell type, the cnidocyte. Cnidarian development typically entails multiple life cycle stages, whereby gastrulation generates a two-layered, free-swimming planula larva that metamorphoses into a sessile polyp with oral tentacles. The polyp sexually matures in Anthozoa, Staurozoa and some hydrozoan cnidarians (e.g., Hydra); in most medusozoans, the polyp undergoes another round of metamorphosis via transverse fission/strobilation (in Scyphozoa and Cubozoa) or lateral budding (in Hydrozoa) to generate free-swimming medusae that grow and reach sexual maturity. The nervous system develops in the ectoderm [and, in some cases, in the endoderm; e.g., the sea anemone N. vectensis (14, 15), the hydrozoan Hydra littoralis (16), and the scyphozoan Cyanea capillata (17)] at each of the life cycle stages [but see (18) for electron microscopic evidence for the lack of the nervous system in cubozoan planulae]. It is composed of epithelial sensory cells and basally localized ganglion cells, which extend basal neurites that form a basiepithelial network (19). This basic organization is modified to generate neural structures of varying morphological complexities, from the oral nerve ring of Hydra (20) to the lensed eyes of cubozoan medusae (21). Neuropeptides, but not small molecule neurotransmitters, are ubiquitously expressed in cnidarian nervous systems, and thus are thought to be the primary neurotransmitters and neurohormones of Cnidaria [reviewed in (2, 3)].

Among cnidarians, the starlet sea anemone N. vectensis is a useful model to examine neuropeptide function during development, because genomic and transcriptomic resources (8, 22, 23), molecular genetic tools [e.g., in situ hybridization and CRISPR-mediated mutagenesis (24, 25)], and data on neural anatomy and development (14, 15) are available. In N. vectensis, gastrulation occurs by invagination (26, 27), and the site of gastrulation—the blastopore—becomes the mouth of the animal (28). Both sensory cells and ganglion cells begin to develop in the outer ectoderm during gastrulation (14, 29). The gastrula embryo then develops into a free-swimming planula larva that forms a bundle of long cilia at the aboral pole, which is referred to as the apical tuft (14, 30). The ectodermal structure that houses the apical tuft is termed the apical organ, and is often assumed to be a sensory structure that is used to perceive sensory signals for settlement and metamorphosis [e.g., (30)]. Early-born neurons in the ectoderm send basal neurites toward the base of the apical organ, forming a basiepithelial nerve net (14). During planula development, sensory cells develop in the pharyngeal ectoderm, as well as in the endoderm (14). At metamorphosis, the planula larva transforms into a sessile polyp by developing circumoral tentacles with mechanosensory hair cells and losing the apical tuft (14). We note that N. vectensis neurons have been reported to show diverse transcriptome profiles (31), and thus transcriptionally distinct neural cell types may exist among morphologically similar neurons.

Isolated originally from the green aggregating anemone Anthopleura elegantissima (32, 33) and recently from the starlet sea anemone N. vectensis (34), RPamide represents one of the cnidarian-specific peptides. RPamide precursor-encoding genes have been recovered from the genomes of anthozoans [A. elegantissima (35) and N. vectensis (36); but absent in Octocorallia (37)] and medusozoans [hydrozoans Clytia hemisphaerica and Cladonema pacificum (38, 39); scyphozoans Nemopilema nomurai and Rhopilema esculentum (37); but absent in staurozoans and cubozoans (37)], but not from those of other metazoan groups. Immunohistochemical evidence suggests that RPamide is a neuropeptide; immunostaining with an antiserum against RPamide has shown that RPamide is expressed in ectodermal sensory cells of the body wall in the sea anemone Calliactis parasitica (32) and in ectodermal photosensory cells in gonads of the hydrozoan C. hemisphaerica (38). Treatment of isolated sea anemone tentacles with synthetic RPamide changed the rates of spontaneous contractions, consistent with a role in neurotransmission (32, 33). In addition, recent experimental evidence suggests that RPamide regulates oocyte maturation and spawning in hydrozoans (39). However, whether RPamide has any role during cnidarian development is not known.

In this paper, we have investigated the developmental expression pattern of RPamide in N. vectensis in order to build a neuroanatomical framework for understanding RPamide function during development. By using in situ hybridization and immunohistochemistry, we show that RPamide is dynamically expressed during N. vectensis development. RPamide expression begins at gastrulation in scattered epithelial cells in the aboral ectoderm. These RPamide-positive epithelial cells show a spindle-shaped, sensory cell-like morphology and extend basal neurites that form an aboral nerve net of the planula larva. A subset of the RPamide-positive sensory cells located at the aboral end become integrated into the developing apical organ. Later during planula development, RPamide-positive sensory cells become evident in oral ectoderm of the body column and the pharynx, as well as in the endoderm. At metamorphosis, the RPamide-positive aboral sensory nerve net disintegrates by apoptosis, and RPamide becomes expressed in ectodermal sensory cells of growing oral tentacles. In addition, we find that RPamide and the conserved neuropeptide GLWamide are expressed in distinct subsets of neurons in planulae, suggesting that RPamide and GLWamide may perform nonoverlapping functions in N. vectensis planulae.

### MATERIALS AND METHODS

### Animal Culture

N. vectensis were cultured as previously described (40, 41).

### RNA Extraction, cDNA Synthesis, and Gene Cloning

Total RNA was extracted from a mixture of planulae and primary polyps using TRIzol (Thermo Fisher Scientific). cDNAs were synthesized using the BD SMARTer RACE cDNA Amplification Kit (Cat. No. 634858; BD Biosciences, San Jose, CA, USA). RPamide gene sequences [Nv37852 and Nv244953; (36)] were retrieved from the Joint Genome Institute genome database (N. vectensis v1.0; http://genome.jgi-psf.org/Nemve1/Nemve1.home.html). 5′ and 3′ RACE were conducted, following manufacturer's recommendations (BD SMARTer RACE cDNA Amplification Kit, BD Biosciences, San Jose, CA, USA), in order to confirm in silico predicted gene sequences. Gene specific primer sequences used for RACE PCR are: 3′ RACE Forward 5′-GCTCGGTACAGAGCCGAAACCTGAGACAC-3′; 5′ RACE primary PCR Forward 5′-CATGGGCAACGGTCAGCGGCAGATCGATG-3′. RACE PCR fragments were cloned into the pGEM-T plasmid vector using the pGEM-T Vector Systems (Cat. No. A3600; Promega), and were sequenced at Macrogen Corp., Maryland. To generate templates for RNA probes for in situ hybridization experiments, RT-PCR was performed to amplify a 406 bp NvRPa (Nv244953) gene fragment. Gene specific primer sequences used for RT-PCR are: Forward 5′-CGAAGGACCTTGAAAGTGGACTGTTCTCGG-3′; Reverse 5′-CATGGGCAACGGTCAGCGGCAGATCGATG-3′. The PCR product was cloned into a pCR4-TOPO TA vector using the TOPO TA Cloning kit (Cat. No. K457501; ThermoFisher Scientific), and sequenced at Genewiz, New Jersey.
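As a quick sanity check on primer sequences such as those listed above, a short script can report primer length, GC fraction, and reverse complement, and flag primers that are listed more than once. The following is a minimal illustrative sketch, not part of the published protocol; the dictionary labels and the function names `reverse_complement`, `gc_fraction`, and `summarize` are assumptions introduced here, while the sequences are copied from the text.

```python
# Illustrative sketch only: basic checks on the primer sequences quoted in the text.
# Function names and dictionary labels are hypothetical helpers, not from the paper.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}

PRIMERS = {
    "3' RACE forward": "GCTCGGTACAGAGCCGAAACCTGAGACAC",
    "5' RACE primary PCR": "CATGGGCAACGGTCAGCGGCAGATCGATG",
    "RT-PCR forward": "CGAAGGACCTTGAAAGTGGACTGTTCTCGG",
    "RT-PCR reverse": "CATGGGCAACGGTCAGCGGCAGATCGATG",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers: dict) -> dict:
    """Map each primer name to a (length, GC fraction) tuple."""
    return {name: (len(seq), gc_fraction(seq)) for name, seq in primers.items()}


if __name__ == "__main__":
    for name, (length, gc) in summarize(PRIMERS).items():
        print(f"{name}: {length} nt, GC = {gc:.2f}")
    # As listed in the text, the 5' RACE primer and the RT-PCR reverse primer
    # have the same sequence.
    print(PRIMERS["5' RACE primary PCR"] == PRIMERS["RT-PCR reverse"])
```

The same helpers could be pointed at the nested genotyping primers; GC fraction is only a coarse descriptor and does not replace a proper melting-temperature calculation.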

### Generation of an Antibody Against RPamide

An antibody against a synthetic peptide CEDSSNYEFPPGFHRPamide corresponding in amino acid sequence to Nv-RPamide IV [sensu (36); **Figure 1**] was generated in rabbit (YenZym Antibodies, LLC); a recent mass spectrometry study has confirmed that the C-terminal FPPGFHRPamide is secreted by adult N. vectensis (34). The antibody was generated against the predicted Nv-RPamide IV sequence because mass spectrometry data were not available when the antibody was produced. Following immunization, the resultant antiserum was affinity purified with the CEDSSNYEFPPGFHRPamide peptide. The affinity purified antibody was then affinity-absorbed with KWSCSLRPamide and KWSCCLRPamide, which correspond to predicted RPamide peptides encoded by Nv37852 [Nv-RPamide I and Nv-RPamide II, respectively; sensu (36)], in order to generate an Nv-RPamide IV-specific antibody. Immunostaining with the antibody preadsorbed with the synthetic Nv-RPamide IV (CEDSSNYEFPPGFHRPamide) for 2 h at 37 °C showed little immunoreactivity at the mid planula and primary polyp stages (**Supplementary Figure 4**), suggesting that the antibody reacts with Nv-RPamide IV. However, this antibody likely cross-reacts with other RPamide-like peptides, as we have observed a population of neurons that are immunoreactive with the antibody but do not express NvRPa transcripts in the tentacular ectoderm at the tentacle-bud and primary polyp stages (e.g., **Figure 4C**).

### CRISPR-Cas9-Mediated Mutagenesis

20-nt-long sgRNA target sites were manually identified in the genomic locus for NvRPa. To minimize off-target effects, target sites that had sequence identity of 17 bp or higher elsewhere in the genome (N. vectensis v1.0; http://genome.jgi.doe.gov/Nemve1/Nemve1.home.html) were excluded. Selected target sequences were as follows.


The sgRNA species were synthesized in vitro (Synthego), and mixed at equal concentrations. The sgRNA mix and Cas9 endonuclease (PNA Bio, PC15111, Thousand Oaks, CA, USA) were co-injected into fertilized eggs at concentrations of 500 ng/µl and 1,000 ng/µl, respectively.
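The off-target criterion described in this section (excluding target sites with 17 bp or higher identity elsewhere in the genome) can be illustrated with a brute-force scan. The sketch below is an assumption-laden illustration rather than the authors' procedure: it interprets "identity" as the number of matching positions in an ungapped, same-length comparison on either strand, and the function names `max_identity_elsewhere` and `passes_offtarget_filter`, the parameter `max_allowed`, and the toy sequences are all hypothetical.

```python
# Illustrative sketch only: brute-force off-target screen for a 20-nt sgRNA target.
# Assumption: "identity" means matching positions in an ungapped comparison
# against every same-length window of the genome, on both strands.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def max_identity_elsewhere(target: str, genome: str, on_target_start: int) -> int:
    """Highest number of matching positions between the target (or its reverse
    complement) and any window of the genome other than the on-target window."""
    target = target.upper()
    genome = genome.upper()
    k = len(target)
    queries = (target, reverse_complement(target))
    best = 0
    for i in range(len(genome) - k + 1):
        if i == on_target_start:
            continue  # skip the intended target site itself
        window = genome[i:i + k]
        for query in queries:
            matches = sum(a == b for a, b in zip(query, window))
            best = max(best, matches)
    return best


def passes_offtarget_filter(target: str, genome: str, on_target_start: int,
                            max_allowed: int = 16) -> bool:
    """Keep a target only if no other site shares more than max_allowed positions
    (i.e., reject sites with 17 or more matches when max_allowed is 16)."""
    return max_identity_elsewhere(target, genome, on_target_start) <= max_allowed


if __name__ == "__main__":
    toy_target = "ACGT" * 5                          # hypothetical 20-nt target
    toy_genome = toy_target + "N" * 5 + toy_target   # second exact copy downstream
    print(passes_offtarget_filter(toy_target, toy_genome, on_target_start=0))
```

This scan is quadratic in practice and only suitable for short test sequences; a genome-scale screen would normally rely on an indexed aligner or a dedicated guide-design tool.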

### Genotyping of Embryos

Genomic DNA from single embryos was extracted by using a published protocol (24), and the targeted genomic locus was amplified by nested PCR. Primer sequences used for nested genomic PCR are: "1" Forward 5′-CGAAGGACCTTGAAAGTGGACTGTTCTCGG-3′, "1" Reverse 5′-TGTCTGGGACTAGTTTACCTACAGCG-3′, "2" Forward 5′-GTGGTATGAGGCACAAACGTAGATGG-3′, "2" Reverse 5′-CATGGGCAACGGTCAGCGGCAGATCGATG-3′.

### Immunofluorescence, TUNEL and in situ Hybridization

Animal fixation, immunohistochemistry, and in situ hybridization were performed as previously described (14, 25). For immunohistochemistry, we used primary antibodies against RPamide (rabbit, 1:200), GLWamide [rabbit, 1:200; (5)], acetylated α-tubulin (mouse, 1:500, Sigma T6793) and tyrosinated α-tubulin (mouse, 1:500, Sigma T9028), and secondary antibodies conjugated to AlexaFluor 568 [1:200, Molecular Probes A-11031 (anti-mouse) or A-11036 (anti-rabbit)] or AlexaFluor 647 [1:200, Molecular Probes A-21236 (anti-mouse) or A-21245 (anti-rabbit)]. Nuclei were labeled using the fluorescent dye DAPI (1:1,000, Molecular Probes D1306), and filamentous actin was labeled using AlexaFluor 488-conjugated phalloidin (1:25, Molecular Probes A12379). TUNEL assay was carried out after immunostaining, by using in situ Cell Death Detection Kit (TMR red cat no. 1684795, Roche, Indianapolis, IN, USA) or Click-iT Plus TUNEL Assay for in situ Apoptosis Detection (Alexa Fluor 488, cat no. C10617, Molecular Probes) according to manufacturer's recommendation; both TUNEL assay kits showed similar results, and thus only data generated by using the Roche kit are reported. For in situ hybridization, antisense and sense digoxigenin-labeled riboprobes were synthesized by using the MEGAscript transcription kits according to manufacturer's recommendation (Ambion; T7, AM1333; T3, AM1338), and were used at the final concentration of 1 ng/µl. Fluorescent images were recorded using a Leica SP5 Confocal Microscope or a Zeiss LSM900. Images were viewed using ImageJ.

## RESULTS

It has been previously reported that the genome of the starlet sea anemone N. vectensis contains two RPamide precursor genes [Nv37852 and Nv244953; (36)]. However, reverse transcriptase PCR using planula and polyp cDNAs showed detectable expression for Nv244953, but not for Nv37852 (**Supplementary Figure 1**). Thus, we have focused our present analyses on Nv244953, which we will herein refer to as NvRPa (Nematostella vectensis RPamide). Consistent with in silico gene prediction (https://mycocosm.jgi.doe.gov/cgi-bin/dispGeneModel?db=Nemve1&id=244953), comparison of the N. vectensis genome and NvRPa cDNA sequence indicates that NvRPa is a 7-exon gene. We note that the cDNA sequence has a shorter exon 1 and a longer exon 7 relative to the in silico predicted gene model (**Figures 1A,B**); however, the shorter exon 1 could have resulted from failure to synthesize full-length cDNA.

Based on the location of diagnostic endoproteolytic cleavage sites (i.e., acidic residues, aspartic/glutamic acid; basic residues, lysine/arginine) and C-terminal amidation sites (i.e., glycine residue) [reviewed in (2)], NvRPa is predicted to encode single copies of two RPamide peptides that differ in N-terminal sequence [**Figure 1C**; Nv-RPamide III and IV, sensu (36)].
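The C-terminal part of this prediction logic can be sketched in a few lines. The example below scans a precursor for a glycine (the amide donor) immediately followed by a basic residue, and reports the residues that would become the amidated C-terminus. It is a simplified heuristic, not the authors' pipeline: the `TOY_PRECURSOR` string is entirely hypothetical and is not the NvRPa precursor, the function name and `tail` parameter are illustrative, and N-terminal processing is not modeled.

```python
import re

# Gly (amide donor) immediately followed by a basic residue (Lys or Arg).
AMIDATION_MOTIF = re.compile(r"G(?=[KR])")


def find_amidation_sites(precursor: str, tail: int = 2):
    """Return (glycine_index, amidated_C_terminus) for each candidate site.

    `tail` is the number of residues preceding the glycine to report as
    the predicted C-terminus of the mature peptide.
    """
    precursor = precursor.upper()
    sites = []
    for match in AMIDATION_MOTIF.finditer(precursor):
        gly = match.start()
        c_term = precursor[max(0, gly - tail):gly]
        sites.append((gly, c_term + "-NH2"))
    return sites


# Hypothetical toy sequence for illustration only (NOT the NvRPa precursor).
TOY_PRECURSOR = "MSLLKREAFHRPGKRSDDLYRPGRRSA"

if __name__ == "__main__":
    for index, c_term in find_amidation_sites(TOY_PRECURSOR):
        print(f"Gly at position {index}: predicted C-terminus {c_term}")
```

On the toy sequence, both candidate sites end in Arg-Pro, which is the pattern that gives RPamide peptides their name.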

### RPamide Is Expressed in the Aboral Sensory Nerve Net of the Planula Larva

Next we examined the spatiotemporal expression patterns of NvRPa during N. vectensis development. We used in situ hybridization and immunostaining to localize NvRPa transcripts and RPamide peptides, respectively. In situ hybridization using a sense probe did not result in cell-type-specific staining (**Supplementary Figure 2**). An antibody against RPamide peptides was made against Nv-RPamide IV predicted to be generated by NvRPa (see section Materials and Methods), and was validated by CRISPR-Cas9-mediated mutagenesis; a significantly reduced average number of anti-RPamide-immunoreactive neurons was observed in NvRPa F0 mosaic mutants (2; n = 5) relative to the wildtype control (20; n = 5) (two-tailed t-test, p < 0.0001; **Supplementary Figure 3**). In addition, immunostaining with the antibody that was preadsorbed with Nv-RPamide IV peptides did not result in cell-type-specific staining (**Supplementary Figure 4**), indicating that the antibody indeed reacts with Nv-RPamide IV. In situ hybridization shows that NvRPa transcripts first occur at gastrulation in scattered epithelial cells of the aboral ectoderm, which exhibit a spindle-shaped morphology characteristic of cnidarian sensory cells (arrowheads in **Figures 2A,B**). During early planula development, these NvRPa-expressing epithelial cells extend RPamide-positive basal neuronal processes that form an interconnected basiepithelial network in the aboral ectoderm (**Figures 2C,D**). A subset of NvRPa-expressing sensory cells become integrated into the apical organ developing at the aboral end (**Figures 2E–H**). The cell bodies of RPamide-positive neurons are located in the superficial stratum of the ectoderm (arrowheads in **Figures 2G–L**). Consequently, within the apical organ, RPamide-positive sensory cells can be distinguished from apical-tuft cells based on the position of nuclei; the former have superficial nuclei, while the latter have basal nuclei [**Figures 2G,H**; (14)]. Thus, RPamide is expressed in the aborally concentrated sensory nerve net of the planula larva.

FIGURE 2 | riboprobe against NvRPa transcript ("rp") and antibodies against RPamide ("RPa") and acetylated α-tubulin ("acTub"). Nuclei are labeled with DAPI. (A–F) are side views of animals with the blastopore/oral opening facing up. (A,C,E) show superficial planes of section at the level of the ectoderm, and (B,D,F) show longitudinal sections through the center of the animal. (G,H) depict an apical organ (ao) and aboral-lateral ectoderm, respectively; apical surface is to the top. During gastrulation, scattered epithelial cells in the aboral ectoderm begin to express NvRPa transcripts (arrowheads in A,B), which show spindle-shaped morphologies characteristic of cnidarian sensory cells (19). At the early planula stage, NvRPa-expressing sensory cells extend basal RPamide-positive neurites forming an aboral basiepithelial nerve net (arrowheads in C,D), and a subset of NvRPa-expressing sensory cells become integrated into the developing apical organ (ao; arrowhead in F). Note that the cell bodies of NvRPa-expressing sensory cells are located in the superficial stratum of the apical organ (arrowheads in G,H) and of the lateral ectoderm (arrowheads in I–L). ph, pharynx; ec, ectoderm; en, endoderm; at, apical tuft; ne, neurite. Scale bar: 50 µm (A–F); 10 µm (G–L).

At later stages of planula development, NvRPa becomes expressed in sensory cells of the oral body column ectoderm and pharyngeal ectoderm, as well as in the developing endodermal nervous system (**Figures 3A,B**). Some NvRPa-expressing sensory cells of the oral ectoderm extend basal neurites in an aboral direction (ne<sup>l</sup> in **Figure 3C**), while others extend in a transverse orientation (ne<sup>t</sup> in **Figure 3C**). NvRPa-expressing neurons in the pharyngeal ectoderm extend neurites preferentially in an aboral direction (**Figures 3D,E**). In the endoderm, NvRPa expression primarily occurs in neurons with longitudinally oriented neurites (**Figures 3D,F**), which become part of longitudinal neuronal bundles of the polyp endodermal nervous system (see below).

### RPamide-Positive Ectodermal Nervous System Is Reorganized at Metamorphosis, Involving Removal of the Aboral Sensory Nerve Net by Apoptosis, and de novo Development of RPamide-Positive Tentacular Sensory Cells

At metamorphosis, the planula larva transforms into a polyp by developing tentacles around the mouth. During this process, the NvRPa-expressing aboral sensory nerve net disappears (**Figures 4A,B**) along with the apical organ (14). Concomitantly, a subset of sensory cells in the tentacular ectoderm begin to express NvRPa (sc<sup>t</sup> in **Figure 4C**). In addition, NvRPa expression is detected in some of the endodermal neurons with transverse neurites that connect with longitudinal neuronal bundles (blue arrowhead in **Figure 4D**). We note that at this developmental stage (i.e., tentacle bud and primary polyp stages) we observed a population of neurons in the tentacular ectoderm that are immunoreactive with the antibody but do not express NvRPa transcripts (**Figure 4C**); over 92% of RPa-immunoreactive neurons in oral tentacles (n = 77) did not show detectable levels of NvRPa expression. This suggests that RPamide-like peptides that are not generated by NvRPa are expressed in some of the tentacular ectodermal neurons. Alternatively, NvRPa could be expressed in these neurons at levels not detectable by in situ hybridization. The lack of colocalization of NvRPa transcripts and RPa-immunoreactivity is rarely found prior to metamorphosis.

We then examined the mechanism by which the NvRPa-expressing aboral sensory nerve net was removed at metamorphosis in N. vectensis. Specifically, we have considered the possibility that NvRPa-expressing aboral sensory cells undergo programmed cell death at metamorphosis. To test this hypothesis, we labeled apoptotic DNA fragmentation at metamorphosis by using the terminal deoxynucleotidyl transferase dUTP nick end-labeling (TUNEL) assay. We have found that, indeed, NvRPa-expressing aboral sensory cells contain TUNEL-positive nuclei and/or nuclear fragments at the tentacle-bud stage (69.6%, n = 46; **Figure 5**). This finding supports the hypothesis that the NvRPa-expressing sensory nerve net in the aboral ectoderm of the planula larva disintegrates by apoptosis during metamorphosis in N. vectensis. We note that apoptosis in the aboral ectoderm at metamorphosis is not restricted to RPamide-positive neurons, but includes other peptidergic neurons such as GLWamide- and RFamide-immunoreactive neurons (**Supplementary Figure 5**).
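To give a sense of the sampling uncertainty behind the reported 69.6% (n = 46), the sketch below computes the proportion together with a 95% Wilson score interval. The count of 32 TUNEL-positive cells is an assumption inferred from the reported percentage (32/46 rounds to 69.6%); the interval itself is not reported in the text, and the function and variable names are illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Return the (lower, upper) Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


N_CELLS = 46          # reported sample size
TUNEL_POSITIVE = 32   # assumed count, inferred from the reported 69.6%

fraction = TUNEL_POSITIVE / N_CELLS
low, high = wilson_interval(TUNEL_POSITIVE, N_CELLS)

if __name__ == "__main__":
    print(f"TUNEL-positive fraction: {100 * fraction:.1f}%")
    print(f"95% Wilson interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

The Wilson interval is used here rather than the simple normal approximation because it behaves better for modest sample sizes such as n = 46.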

### The Expression Pattern of RPamide Is Distinct From That of GLWamide

Next we asked whether the expression pattern of NvRPa was distinct from that of GLWamide, a neuropeptide thought to be conserved across Cnidaria and Bilateria, which regulates the timing of life cycle transition in N. vectensis (5). To address this, we combined in situ hybridization and immunostaining to examine the spatial localization of NvRPa transcripts and GLWamide peptides during planula development in N. vectensis. We found no evidence of colocalization of NvRPa and GLWamide expression in ectodermal and endodermal nervous systems in N. vectensis planulae (**Figure 6**). Thus, RPamide and GLWamide are expressed in non-overlapping subsets of neurons in the planula nervous systems in N. vectensis, suggesting that RPamide and GLWamide may perform non-overlapping neural functions in planulae. These data are also consistent with multiple peptidergic neuron types having complementary roles in specific biological processes such as metamorphosis.

ectoderm, and endoderm. Z-projections of confocal sections of Nematostella vectensis at mid-planula stages, labeled with an antisense riboprobe against NvRPa transcript ("rp") and antibodies against RPamide ("RPa") and acetylated α-tubulin ("acTub"). Nuclei are labeled with DAPI. (A–D) are side views of animals with the oral opening facing up. (A,C) show superficial planes of section at the level of the ectoderm; (B,D) show longitudinal sections through the center. (E,F) are magnified views of boxed regions in (D) showing NvRPa-expressing cells with their apical side facing up. During planula development, NvRPa becomes expressed in epithelial cells with the sensory-cell-like morphology and RPa-positive basal neurites. These NvRPa-expressing sensory cells reside in the body column oral ectoderm (sc<sup>o</sup> in A–C), pharyngeal ectoderm (sc<sup>p</sup> in B; arrowheads in E) and endoderm (sc<sup>en</sup> in B; an arrowhead in F). Note in C that NvRPa-expressing sensory cells in the body column oral ectoderm can have aborally oriented longitudinal neurites (ne<sup>l</sup>), or transverse neurites (ne<sup>t</sup>). NvRPa-expressing sensory cells in the pharynx have aborally oriented longitudinal neurites (ne in E), and those in the endoderm send longitudinal neurites in both oral and aboral directions (ne in F). ph, pharynx; ec, ectoderm; en, endoderm. Scale bar: 50 µm (A–D); 10 µm (E,F).

FIGURE 4 | During metamorphosis of a planula larva into a sea anemone polyp, the RPamide-positive sensory nerve net of the aboral ectoderm disappears, and RPamide becomes expressed in ectodermal sensory cells of developing oral tentacles. Z-projections of confocal sections of Nematostella vectensis at the tentacle-bud/primary polyp stage, labeled with an antisense riboprobe against NvRPa transcript ("rp") and antibodies against RPamide ("RPa") and acetylated α-tubulin ("acTub"). Nuclei are labeled with DAPI. All panels show side views of the animal with the oral opening facing up. (A,C) show superficial planes of section at the level of surface ectoderm (ec); (B,D) show longitudinal sections through the center. (B,D) depict an aboral half of the animal, while (C) shows an oral half of the animal. The inset in (C) is a magnified view of the boxed region showing an NvRPa-expressing cell with its apical side facing up; note that NvRPa transcripts primarily localize to the cytoplasm so that the nucleus (nu) appears NvRPa-negative. At metamorphosis, RPamide expression becomes undetectable in the aboral ectoderm at the transcript and peptide levels (compare A,B with Figures 2A–D). Concomitantly, NvRPa transcripts become expressed in ectodermal sensory cells of the developing oral tentacles that send RPamide-positive longitudinal neurites in an aboral direction (sc<sup>t</sup> in C); note in the inset in (C) the sensory cell-like spindle-shaped morphology of an NvRPa-expressing tentacular cell. White arrowheads in (C) show anti-RPamide-immunoreactive sensory cells in the tentacular ectoderm that do not express detectable levels of NvRPa transcripts, which suggests that the immunoreactive materials in these cells are not produced by NvRPa. Note in (D) that RPamide expression is evident in endodermal neurons that are integrated within longitudinal neuronal bundles (nb) (orange arrowhead), as well as in those that reside between neighboring neuronal bundles (blue arrowhead); neurites from the latter neurons connect with those from neurons in the neuronal bundles. te, tentacle; sc<sup>o</sup>, sensory cell in the oral body column ectoderm; ph, pharynx; en, endoderm; ap, aboral pole. Scale bar: 50 µm (A–D); 10 µm (inset in C).

## DISCUSSION

In this paper, we report the expression pattern of a cnidarian-specific neuropeptide, RPamide, during development of the starlet sea anemone N. vectensis. We have found that early RPamide-positive neurons form a sensory nerve net in the aboral ectoderm of the planula larva, and that a subset of RPamide-positive sensory cells located at the aboral end become part of the apical organ. During planula development, RPamide becomes expressed in sensory cells of oral ectoderm in the body column and pharynx, as well as in endodermal neurons. At metamorphosis, the RPamide-positive sensory nerve net in aboral ectoderm disappears by apoptosis, and RPamide becomes expressed in a subset of ectodermal sensory cells in growing oral tentacles. RPamide expression and GLWamide expression do not co-occur in the same neurons in planulae, indicative of functional segregation between the two neuropeptides.

### The Sea Anemone Apical Organ Consists of Multiple Cell Types, Including Apical Tuft Cells and Peptidergic Sensory Cells

Our study sheds light on the diversity of cell types that constitute the apical organ in a sea anemone planula. Chia and Koss (30) used electron microscopy to describe the structure of an apical organ in the sea anemone A. elegantissima. They reported that the apical organ consists of columnar epithelial cells, each of which has a long cilium and a collar of microvilli on the apical cell surface and a basally localized nucleus. Long cilia from the columnar epithelial cells form a ciliary bundle known as the apical tuft, and their ciliary rootlets characteristically reach the basal processes of the cells. In addition, Nakanishi et al. (14) identified two types of gland cells in the apical organ of N. vectensis: one with electron-lucent granules and the other with electron-dense granules. In this study, we have discovered that a subset of early-born RPamide-positive sensory cells in the aboral ectoderm become part of the apical organ in N. vectensis. RPamide-positive sensory cells of the apical organ are morphologically distinct from other cell types that are housed in this structure. Specifically, in contrast to the apical-tuft epithelial cells that are column-shaped and have basal nuclei, these RPamide-positive sensory cells are spindle-shaped and contain nuclei located close to the epithelial surface. To our knowledge, this is the first evidence that the sea anemone apical organ contains a peptidergic sensory cell type. Thus, the apical organ is composed of at least three cell types: apical-tuft cells, gland cells, and peptidergic sensory cells.

Motivated by the question of whether the aboral domain of cnidarians is homologous to the anterior domain of bilaterians, several studies investigated the molecular mechanism of apical organ development in N. vectensis [e.g., (42–44)]. It has been proposed that apical organ development in N. vectensis occurs in two phases, involving a set of regulatory factors—Six3, FoxQ2 and FGFs—whose bilaterian homologs play a conserved role in controlling anterior development (43). First, during gastrulation, six3/6 specifies the identity of the broad aboral domain in the ectoderm by positively regulating the expression of fgfa1 and foxQ2a. Subsequently, during planula development, FGFa1 signaling drives the differentiation of apical organ cells by downregulating six3/6 and foxQ2a expression at the aboral pole.

Consistent with this model, experimental perturbation of six3/6 and FGFa1 expression results in planula larvae without apical tufts (42, 43). However, it has not been clear whether all apical organ cell types—besides the apical-tuft cells—develop in this way. In the present study, we have found that development of RPamide-positive sensory cells at the aboral end occurs during gastrulation, and precedes the mid-planula stage at which downregulation of six3/6 is reported to occur (43). This suggests that neurogenesis that generates apical organ neurons takes place prior to differentiation of apical tuft cells, and therefore does not depend on downregulation of six3/6 at the aboral pole. Our data therefore imply that neurogenesis and ciliogenesis in apical organ development are mechanistically decoupled. Consistent with this hypothesis, positive regulators of neurogenesis, such as soxB(2), ath, and ashA, are expressed in the aboral domain during gastrulation, prior to apical tuft formation (29, 45–47). How the mechanisms of neurogenesis and ciliogenesis in the context of sea anemone apical organ development differ from each other, and how they relate to anterior development of bilaterians, are to be addressed in future studies.

### Reorganization of the Ectodermal Nervous System at Metamorphosis Represents an Ancestral Condition in Cnidaria

Nervous system development in the sea anemone N. vectensis has been viewed as a gradual process that entails progressive addition of new features; neurons begin development in ectoderm at gastrulation, then in endoderm during planula development, and finally in the polyp tentacular ectoderm at metamorphosis (14, 15). Only apical tuft cells have been reported to disappear at metamorphosis in N. vectensis (14). In contrast, reorganization of peptidergic ectodermal nervous systems at the planula-polyp transition is commonly observed in non-sea anemone cnidarians such as the coral Acropora (48), moonjelly Aurelia (49), and hydrozoans [Pennaria disticha (formerly Pennaria tiarella) (50); Hydractinia echinata (51, 52); Clava multicornis (53)]. This process typically involves disappearance of the planula peptidergic nervous system housed exclusively in the ectoderm in these cnidarians, likely via apoptosis (49, 53–56), followed by the development of the orally concentrated polyp nervous system. Therefore, reorganization of the ectodermal nervous system at metamorphosis appears to be an ancestral condition in Cnidaria. In this paper, we have discovered that the early-born RPamide-positive sensory nerve net of the aboral ectoderm is removed at metamorphosis by programmed cell death. We also note that early-born GLWamide- and RFamide-positive sensory cells in the aboral ectoderm similarly disintegrate by undergoing DNA fragmentation at metamorphosis (**Supplementary Figure 5**). Hence, we suggest that sea anemones have in fact retained the ancestral pattern of cnidarian neural development, undergoing reorganization of the peptidergic ectodermal nervous system at the planula-polyp transition.

## DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

## AUTHOR CONTRIBUTIONS

NN conceived and designed the study. HZ and NN carried out the experiments and drafted the manuscript.

## FUNDING

This work was supported by funds from Arkansas IDeA Network of Biomedical Research Excellence (INBRE) (to HZ and NN), Arkansas Bioscience Institute (to NN), and the University of Arkansas (to NN).

## ACKNOWLEDGMENTS

We thank Ethan Ozment for animal husbandry, and Arianna Tamvacakis for comments on an earlier version of the manuscript. We are also grateful to Mark Q. Martindale for his encouragement to explore every neuropeptide encoded by the Nematostella genome, which was the impetus for this study. We would also like to thank reviewers for comments on the earlier version of the manuscript, which greatly improved the manuscript.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2020.00063/full#supplementary-material

Supplemental Figure 1 | Reverse Transcriptase (RT) PCR of RPamide-encoding genes. cDNA PCR results showing the presence of Nv244953 transcripts (expected size: 291 bp) and the absence of Nv37852 transcripts (expected size: 207 bp). "–" denotes a negative control without cDNA. 100 bp ladders were used. cDNA was generated from RNA extracted from planulae and primary polyps. Primer sequences used for PCR are: "244953" Forward 5′-GCTCGGTACAGAGCCGAAACCTGAGACAC-3′, Reverse 5′-CATGGGCAACGGTCAGCGGCAGATCGATG-3′, "37852" Forward 5′-GCCTTGGTCGTGTTGCCTTCGCCCTGGTC-3′, "2" Reverse 5′-[…].

Supplemental Figure 2 | In situ hybridization using a sense riboprobe. Z-projections of confocal sections of Nematostella vectensis at mid-planula (A,B) and tentacle-bud/primary polyp (C,D) stages, labeled with a sense NvRPa riboprobe ("sense probe") and an antibody against acetylated α-tubulin ("acTub"). All panels are side views of animals with the oral opening facing up. (A,C) show superficial planes of section at the level of the ectoderm; (B,D) show longitudinal sections through the center. No cell-type-specific staining is observable. Scale bar: 50 µm.

Supplemental Figure 3 | CRISPR mutagenesis and validation of an anti-RPamide antibody. (A) Schematic view of the NvRPa locus. (B) Genomic DNA nested PCR results of Cas9-injected wildtype control embryos ("WT") and F0 embryos injected with NvRPa locus-specific sgRNAs together with Cas9 ("RPa"). Results of secondary PCR are shown. (C,D) Medial-level confocal sections of Cas9-injected wildtype control (C) and NvRPa F0 mutant (D) mid-planulae at 4 days post-fertilization (dpf), labeled with an anti-RPamide antibody ("RPa"). Nuclei are labeled with DAPI. Oral side is up. (E) A histogram showing the number of anti-RPamide-positive neurons in Cas9-injected wildtype control animals ("WT"; n = 5) and NvRPa F0 mutants ("CR-RPa"; n = 5) at the 4 dpf mid-planula stage. In (A), a blue bar shows a predicted translation start site; a red bar shows a predicted translation termination site; yellow bars show predicted RPamide-encoding regions as in Figure 1. Purple arrowheads show sgRNA target sites. Blue arrows mark regions targeted in the PCR analysis shown in (B). Note in (B) that genomic PCR of the WT embryo shows the expected size of PCR fragments (1690 bp for secondary nested PCR), while F0 mutant embryos show additional bands of smaller sizes (arrowhead), indicating that targeted deletions of different sizes have occurred mosaically in each embryo. Over 88% of PCR genotyped embryos showed clear mutant bands (i.e., PCR fragments that are smaller than expected from a wildtype allele; n = 34). DNA sequencing of mutant bands (e.g., arrowhead) has confirmed that excision of RPamide-encoding regions can be induced by this CRISPR-mediated mutagenesis approach. In (C,D), arrowheads show anti-RPa-positive neurons. Note in (D) that an anti-RPa-positive cell is observed in the endoderm (en) but not in the ectoderm (ec). In (E), <sup>∗</sup> denotes a statistically significant difference (α = 0.05). Scale bar: 50 µm.

Supplemental Figure 4 | Immunostaining with a preadsorbed anti-RPamide antibody. Z-projections of confocal sections of Nematostella vectensis at mid-planula (A,C) and primary polyp (B,D) stages, labeled with an anti-RPamide antibody ("anti-RPa"; A,B) and an anti-RPamide antibody preadsorbed with Nv-RPamide IV (CEDSSNYEFPPGFHRPamide) ("Preadsorbed anti-RPa"; C,D). All panels are side views of animals with the oral opening facing up. (A,C) show longitudinal sections through the center; (B,D) show sections through the entire animals. No cell-type-specific staining is observable with the preadsorbed antibody, indicating that the anti-RPamide antibody indeed reacts with Nv-RPamide IV. Scale bar: 50 µm.

Supplemental Figure 5 | GLWamide- and RFamide-immunoreactive sensory cells of the aboral ectoderm undergo apoptosis at metamorphosis. Z-projections of confocal sections of Nematostella vectensis at the tentacle-bud stage, labeled with antibodies against acetylated α-tubulin ("acTub") and GLWamide ["GLWa"; (5)] or RFamide ("RFa"; unpublished). DNA fragmentation is detected by TUNEL assay ("TUNEL"). All panels show side views of the animal with the oral opening facing up; (A,C) represent superficial planes of section at the level of surface ectoderm, while (B,D) depict longitudinal sections near the center. Boxed areas are magnified in insets; note that in insets in (B,D) the apical surface of the ectoderm faces up. Arrowheads point to TUNEL-positive DNA fragmentation, evidencing programmed cell death. Scale bar: 50 µm.

## REFERENCES

repertoire and genomic organization. Science. (2007) 317:86–94. doi: 10.1126/science.1139158


endoderm and shaped by distinct mechanisms. Development. (2012) 139:347– 57. doi: 10.1242/dev.071902


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Zang and Nakanishi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Expression Profiling, Downstream Signaling, and Inter-subunit Interactions of GPA2/GPB5 in the Adult Mosquito Aedes aegypti

David A. Rocco\* † and Jean-Paul V. Paluzzi\* †

Laboratory of Integrative Vector Neuroendocrinology, Department of Biology, York University, Toronto, ON, Canada

#### Edited by:

Elizabeth Amy Williams, University of Exeter, United Kingdom

#### Reviewed by:

Frank Hauser, University of Copenhagen, Denmark Mark R. Brown, University of Georgia, United States

#### \*Correspondence:

David A. Rocco davrocco@yorku.ca Jean-Paul V. Paluzzi paluzzi@yorku.ca

#### †ORCID:

David A. Rocco orcid.org/0000-0001-9545-7713 Jean-Paul V. Paluzzi orcid.org/0000-0002-7761-0590

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 18 December 2019 Accepted: 06 March 2020 Published: 31 March 2020

#### Citation:

Rocco DA and Paluzzi J-PV (2020) Expression Profiling, Downstream Signaling, and Inter-subunit Interactions of GPA2/GPB5 in the Adult Mosquito Aedes aegypti. Front. Endocrinol. 11:158. doi: 10.3389/fendo.2020.00158

GPA2/GPB5 and its receptor constitute a glycoprotein hormone-signaling system native to the genomes of most vertebrate and invertebrate organisms. Unlike the well-studied gonadotropins and thyrotropin, the exact function of GPA2/GPB5 remains elusive, and whether it elicits its functions as heterodimers, homodimers or as independent monomers remains unclear. Here, the glycoprotein hormone signaling system was investigated in adult mosquitoes, where GPA2 and GPB5 subunit expression was mapped and modes of its signaling were characterized. In adult Aedes aegypti mosquitoes, GPA2 and GPB5 transcripts co-localized to bilateral pairs of neuroendocrine cells, positioned within the first five abdominal ganglia of the central nervous system. Unlike GPA2/GPB5 homologs in human and fly, GPA2/GPB5 subunits in A. aegypti lacked evidence of heterodimerization. Rather, cross-linking analysis to determine subunit interactions revealed A. aegypti GPA2 and GPB5 subunits may form homodimers, although treatments with independent subunits did not demonstrate receptor activity. Since mosquito GPA2/GPB5 heterodimers were not evident by heterologous expression, a tethered fusion construct was generated for expression of the subunits as a single polypeptide chain to mimic heterodimer formation. Our findings revealed A. aegypti LGR1 elicited constitutive activity with elevated levels of cAMP. However, upon treatment with recombinant tethered GPA2/GPB5, an inhibitory G protein (Gi/o) signaling cascade is initiated and forskolin-induced cAMP production is inhibited. These results further support the notion that heterodimerization is a requirement for glycoprotein hormone receptor activation and provide novel insight into how signaling is achieved for GPA2/GPB5, an evolutionarily ancient neurohormone.

Keywords: GPA2/GPB5, mosquito, glycoprotein hormone, leucine-rich repeat-containing G protein-coupled receptor 1, homodimer, heterodimer, thyrostimulin

### INTRODUCTION

Members of the cystine knot growth factor (CKGF) superfamily, which are characterized with a CKGF domain as their primary structural feature, include (i) the glycoprotein hormones, (ii) the invertebrate bursicon hormone, (iii) the transforming growth factor beta (TGFβ) family, (iv) the bone morphogenetic protein (BMP) antagonist family, (v) the platelet-derived growth factor (PDGF) family and (vi) the nerve growth factor (NGF) family. Of the members of the CKGF superfamily, the glycoprotein hormones are of fundamental importance in the regulation of both vertebrate and invertebrate physiology.

In vertebrates, members of the glycoprotein hormone family include follicle-stimulating hormone (FSH), luteinizing hormone (LH), thyroid-stimulating hormone (TSH) as well as chorionic gonadotropin (CG), which are implicated in governing several aspects of physiology including reproduction, energy metabolism along with growth and development. Structurally, these hormones are formed by the heterodimerization of two cystine-knot glycoprotein subunits, an α subunit that is structurally identical for each hormone (GPA1), and a hormone-specific β subunit (GPB1-4) (1, 2).

Two additional glycoprotein hormone subunits were more recently identified in the human genome, glycoprotein α2 (GPA2) and glycoprotein β5 (GPB5), and were found to heterodimerize (GPA2/GPB5) and act on the same receptor as TSH. As a result, GPA2/GPB5 was given the name thyrostimulin to differentiate it from TSH in vertebrates (3, 4). Unlike other glycoprotein hormones, which are restricted to the vertebrate lineage, homologous genes encoding GPA2/GPB5 subunits exist in most bilaterian organisms, where its function appears to be pleiotropic (3–5). In vertebrates, GPA2/GPB5 has been implicated in, or at least suggested to be involved in, reproduction (6), thyroxine production (4, 7), skeletal development (8), immunoregulation (9, 10) and the proliferation of ovarian cancer cell lines (11). For invertebrate species, GPA2/GPB5 has been implicated or demonstrated to function in development (12–15), hydromineral balance (13–18) as well as in reproduction (15, 17, 19).

Heterodimerization by non-covalent interactions between the subunits forming FSH, LH, TSH, and CG is required for their respective biological functions (2, 20). However, whether heterodimerization is required for GPA2/GPB5 to activate its receptor and exert its physiological role in vertebrates and invertebrates is debated. For FSH, LH, TSH, and CG, the subunits are co-expressed in the same cells and each hormone is released into circulation as heterodimers (21, 22). On the other hand, GPA2 and GPB5 subunit expression profiles, both for vertebrate and invertebrate organisms, do not always occur in the same cells, and GPA2 is often expressed more widely and abundantly than GPB5 in some tissues (7, 12, 23–25). Additionally, unlike the beta subunits of the classic glycoprotein hormones, the structure of the GPB5 subunit lacks an extra pair of cysteine residues that form an additional disulfide linkage, referred to as the "seatbelt," which strengthens and stabilizes its heterodimeric association with GPA2 (26).

Relative to other G protein-coupled receptors (GPCRs), members of the leucine-rich repeat-containing G protein-coupled receptor (LGR) family are characterized by a large extracellular amino-terminal domain responsible for the selective binding of their large hormone ligands (27). Following its initial genomic and molecular characterization (28), an invertebrate receptor for GPA2/GPB5, called LGR1, was functionally deorphanized in the fruit fly Drosophila melanogaster, where GPA2/GPB5 heterodimers were found to activate LGR1, increasing cyclic AMP (cAMP) levels (29). Interestingly, stimulatory G protein (Gs) coupling and signaling to elevate cAMP was also shown with GPA2/GPB5-TSH receptor activation in humans (4, 6).

In the mosquito Aedes aegypti, genes encoding GPA2, GPB5 and LGR1 were identified and shown to be expressed in all developmental life stages, with expression enriched in adults compared to juvenile stages (13). In adults, LGR1 was found localized to epithelia throughout the gut, where GPA2/GPB5 could regulate feeding-related processes and hydromineral balance (17). Notably, LGR1 transcript expression was also observed in the reproductive organs of males and females (19). In adult male mosquitoes, knockdown of LGR1 expression led to abnormal spermatogenesis, with spermatozoa displaying malformations such as shortened flagella. Consequently, LGR1-knockdown males had 60% fewer spermatozoa as well as significantly reduced fecundity relative to control mosquitoes (19).

With an interest in better understanding GPA2/GPB5 signaling in A. aegypti mosquitoes, our work herein set out to characterize the tissue-specific and cellular expression profile of GPA2/GPB5 in mosquitoes. We also sought to determine whether GPA2/GPB5 could heterodimerize, whether heterodimers were required to activate LGR1, and which downstream signaling events follow receptor activation in a heterologous system. Combining various molecular techniques, we demonstrate GPA2/GPB5 cellular co-expression in the central nervous system of adult mosquitoes, and that both subunits are indeed required to activate LGR1, which exhibits ligand-dependent G protein-coupling activity. Moreover, our results also provide evidence for GPA2 and GPB5 homodimers, with individually-expressed subunits being incapable of activating LGR1. Overall, these findings appreciably advance our understanding of GPA2/GPB5 signaling in mosquitoes and provide novel directions to uncover the functions of homologous systems in other organisms.

### MATERIALS AND METHODS

### Animals

Adult A. aegypti (Liverpool) were derived from an established laboratory-reared colony raised under conditions described previously (17).

### GPA2/GPB5 Transcript Analysis by RT-qPCR

Total RNA was isolated and purified from select A. aegypti tissues and organs and reverse transcribed into cDNA. GPA2 and GPB5 transcript abundance was quantified using a StepOnePlus™ Real-Time PCR System (Applied Biosystems) (see **SI Materials and Methods** for details) and normalized to transcript levels of two reference genes, ribosomal protein 49 (GenBank accession: AY539746) and 60S ribosomal protein S18 (GenBank accession: XM\_001660270), following the ΔΔCt method as previously described (13). Experiments were repeated using a total of three technical replicates per sample and three biological replicates for each tissue/organ.
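The ΔΔCt normalization described above can be sketched as follows. This is a minimal illustration only: it assumes that Ct values are averaged over technical replicates, that the two reference genes are combined by averaging their mean Ct values (one common choice; the exact procedure is given in reference 13, not here), and that amplification efficiency is close to 100%. The function names and all Ct values are hypothetical and are not data from this study.

```python
from statistics import mean

def delta_ct(target_cts, ref1_cts, ref2_cts):
    """ΔCt: mean target Ct minus the average of the two reference-gene mean Cts."""
    reference = mean([mean(ref1_cts), mean(ref2_cts)])
    return mean(target_cts) - reference

def relative_abundance(sample, calibrator):
    """Fold change of a sample relative to a calibrator tissue (2^-ΔΔCt)."""
    ddct = delta_ct(*sample) - delta_ct(*calibrator)
    return 2 ** (-ddct)

# Hypothetical technical triplicates: (target gene, reference 1, reference 2)
abdominal_ganglia = ([24.0, 24.1, 23.9], [18.0, 18.0, 18.0], [20.0, 20.0, 20.0])
thoracic_ganglia = ([26.0, 26.1, 25.9], [18.0, 18.0, 18.0], [20.0, 20.0, 20.0])

fold = relative_abundance(abdominal_ganglia, thoracic_ganglia)
print(round(fold, 2))
```

With these made-up values the target Ct is two cycles lower in the sample than in the calibrator at equal reference levels, which corresponds to roughly a four-fold higher relative abundance.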

### Fluorescence in situ Hybridization

Using gene-specific primers (**Table S1**), A. aegypti GPA2 and GPB5 sequences were amplified from previously prepared constructs (13) that contained the GPA2 and GPB5 complete open reading frames (ORFs). Sense and antisense probes were then generated following a similar protocol as recently reported (19). Briefly, cDNA fragments were ligated to pGEM T Easy vector (Promega, Madison, WI, USA) and used to transform NEB 5-α competent Escherichia coli cells (New England Biolabs, Whitby, ON, Canada). After screening plasmid constructs for directionality using T7 promoter oligonucleotide and gene-specific primers (**Table S1**), template sense or anti-sense cDNA strands for GPA2 and GPB5 probe synthesis were created by PCR amplification and verified by Sanger sequencing for base accuracy (The Centre for Applied Genomics, Sick Kids Hospital, Toronto, ON, Canada). Digoxigenin (DIG)-labeled anti-sense and sense RNA probes corresponding to GPA2 and GPB5 subunits were synthesized using the HiScribe T7 High Yield RNA Synthesis kit (New England Biolabs, Whitby, ON, Canada). Fluorescence in situ hybridization (FISH) was then used to detect GPA2 and/or GPB5 transcript in the mosquito central nervous system using 4 ng µl<sup>−1</sup> (GPB5) and/or 6 ng µl<sup>−1</sup> (GPA2) RNA sense/antisense probes, following a previously established protocol (30). Preparations were analyzed with a Lumen Dynamics X-Cite™ 120Q Nikon fluorescence microscope (Nikon, Mississauga, ON, Canada), or a Yokogawa CSU-XI Zeiss Cell Observer Spinning Disk confocal microscope, and images were processed using Zeiss Zen and ImageJ software. All microscope settings were kept identical when acquiring images of control and experimental preparations.

### Wholemount Immunohistochemistry

GPB5 immunoreactivity in the abdominal ganglia of adult mosquitoes was examined in newly emerged and four-day old A. aegypti that were lightly CO<sub>2</sub> anesthetized and dissected in PBS at RT. Tissues were fixed, permeabilized and incubated for 48 h at 4 °C with rocking in a custom affinity-purified rabbit polyclonal GPB5 antibody (1 µg ml<sup>−1</sup>; Genscript, Piscataway, NJ) designed against an antigen sequence (CDSNEISDWRFP) positioned at residues 69–80 of the deduced A. aegypti GPB5 protein sequence (13). Control treatments involved preincubated A. aegypti GPB5 primary antibody solution containing 100:1 peptide antigen:antibody (mol:mol). After several washes, tissues were incubated overnight at 4 °C in Alexa Fluor 488-conjugated goat anti-rabbit secondary antibody (1:200) (Life Technologies, Carlsbad, CA) in PBS containing 10% normal sheep serum. The next day, samples were washed and then mounted onto coverslips using mounting media containing 5 µg µl<sup>−1</sup> diamidino-2-phenylindole dihydrochloride (DAPI), and analyzed using a Lumen Dynamics X-Cite™ 120Q Nikon fluorescence microscope (Nikon, Mississauga, ON, Canada), or optically sectioned using a Yokogawa CSU-XI Zeiss Cell Observer Spinning Disk confocal microscope. All images were processed using Zeiss Zen and ImageJ software. Further details concerning the wholemount immunohistochemical protocol were reported in an earlier study (17).

### Plasmid Expression Constructs

Plasmid expression constructs were designed to study A. aegypti and H. sapiens GPA2/GPB5 subunit dimerization patterns and receptor signaling. Using previously available hexa-histidine-tagged A. aegypti GPA2-His (13), as well as FLAG (DYKDDDDK)-tagged (-FLAG) H. sapiens GPB5-FLAG (Genscript, Clone OHu31847D) plasmid vectors as template, the full ORF of each (A. aegypti and H. sapiens) GPA2 and GPB5 subunit coding sequence, including a consensus Kozak translation initiation sequence, was amplified and a hexa-histidine or FLAG tag sequence was incorporated on the carboxyl-terminus of subunits to produce the following fusion proteins: A. aegypti GPA2-FLAG and H. sapiens GPB5-His (**Table S2**). A pcDNA3.1<sup>+</sup> mammalian expression construct containing mCherry, which was a gift from Scott Gradia (Addgene plasmid # 30125), was utilized to verify cell transfection efficiency. Experiments also utilized previously prepared pcDNA3.1<sup>+</sup> constructs with A. aegypti GPB5-His and A. aegypti LGR1 coding sequences and the dual promoter vector pBudCE4.1 containing both A. aegypti GPA2-His and GPB5-His (13). Additionally, pcDNA3.1<sup>+</sup> mammalian expression constructs containing FLAG-tagged H. sapiens thyrotropin receptor (TSHR-FLAG) (Genscript USA Inc., Clone OHu18318D), H. sapiens GPA2-FLAG (Genscript USA Inc., Clone OHu31847D) and H. sapiens GPB5-FLAG (Genscript USA Inc., Clone OHu55827D), together with the pGloSensor™-22F cyclic adenosine monophosphate (cAMP) biosensor plasmid (Promega Corp., Madison, WI), were used for receptor activation and intracellular signaling assays.

### Generation of Tethered A. aegypti GPA2/GPB5 Fusion Construct

The ORFs of A. aegypti GPA2 and GPB5 sequences were tethered together in order to promote heterodimer interactions for testing in receptor activity assays with mammalian cell lines. A hexa-histidine-tagged artificial linker sequence involving three glycine-serine repeats was used to fuse the amino-terminus of A. aegypti GPA2 propeptide sequence to the carboxyl-terminus of A. aegypti GPB5 prepropeptide sequence, using multiple PCR amplifications with several primer sets (**Table S2**) as performed previously using lamprey and Amphioxus GPA2 and GPB5 sequences (31, 32) (see **SI Materials and Methods** for details).

### Transient Transfection of HEK 293T Cells

Human embryonic kidney (HEK 293T) cells were grown in complete growth media (Dulbecco's modified Eagle's medium: nutrient mixture F12 (DMEM), 10% heat-inactivated fetal bovine serum (Wisent, St. Bruno, QC) and 1X antimycotic-antibiotic) and maintained in a water-jacketed incubator at 37 °C, 5% CO<sub>2</sub>. When cells reached ∼80–90% confluency, they were transfected with mammalian expression plasmid constructs in 6-well tissue culture plates (Thermo Fisher Scientific, Burlington, ON) using Lipofectamine 3000 transfection reagent (Life Technologies, Carlsbad, CA) at a 3:1 (µL:µg) ratio of transfection reagent to DNA. Before transfection, culture media was replaced with either serum-free medium (DMEM and 1X antimycotic-antibiotic) for experiments that collected secreted proteins, or fresh complete growth medium for experiments in which cells were co-transfected with pGloSensor™-22F cAMP biosensor plasmid and either H. sapiens TSHR, A. aegypti LGR1, or mCherry.

### Preparation of Protein Samples

At 48 h post-transfection, serum-free culture media containing secreted proteins were collected and concentrated in 0.5 mL 3 kDa molecular weight cut-off centrifugal filters (VWR North America). In some experiments, cells were dislodged with PBS containing 5 mM ethylenediaminetetraacetic acid (EDTA; Life Technologies) (PBS-EDTA) pH 8.0, pelleted at 400 × g for 5 min, resuspended in PBS and transferred to 1.5 mL centrifuge tubes for a subsequent centrifugation. Cell lysates were then prepared by resuspending and sonicating cells for 5 s in cell lysis buffer containing 37.5 mM Tris, pH = 7.5, 1.5 mM EDTA, pH 8.0, 3% sodium dodecyl sulfate, 1.5% protease inhibitor cocktail (v/v), and 1.5 mM dithiothreitol (DTT). For receptor activity assays, in order to prevent carry-over of lysis buffer, cell lysates were concentrated in 3-kDa molecular weight cut-off centrifugal filters and re-constituted back to initial volumes with serum-free media for a total of three repetitions. Proteins were then used for crosslinking analysis, deglycosylation, and immunoblotting or used as ligands for functional receptor activation using the cAMP signaling biosensor assays.

Disuccinimidyl suberate (DSS), a chemical cross-linker that is primarily reactive toward amino groups and stabilizes weak or transient intermolecular protein interactions, was employed to study GPA2 and GPB5 protein-protein interactions. A. aegypti and H. sapiens GPA2-FLAG and GPB5-His proteins were kept separate or combined together (GPA2-FLAG/GPB5-His) and then treated with DSS (Sigma Aldrich, Oakville, ON). In some experiments, A. aegypti GPA2-His/GPB5-His were co-expressed using a dual promoter plasmid and as such, media containing both His-tagged subunits were directly treated with DSS. To treat secreted concentrates of culture media containing H. sapiens GPA2 and GPB5 proteins, 0.68 mM DSS was used, whereas both 0.68 mM (data not shown) and 2.04 mM DSS were used to test the dimerization characteristics of A. aegypti GPA2-FLAG/GPB5-His proteins. Cross-linking was performed for 30 min at RT and reactions were quenched with 50 mM Tris, pH 7.4 for 10 min under constant mixing. To remove N-linked sugars, protein samples were treated with peptide-N-glycosidase F (PNGase) (New England Biolabs, Whitby, ON) following the manufacturer's guidelines, with the only modification being that protein samples were not heated to 100 °C before enzymatic deglycosylation. In experiments aimed at determining the effects of protein glycosylation on cross-linking ability, A. aegypti GPA2-His/GPB5-His subunits were first treated with PNGase and then cross-linked with DSS after deglycosylation.

### Western Blot Analysis

Samples were prepared in 2x Laemmli buffer (Sigma Aldrich, Oakville, ON) containing 4% SDS, 20% glycerol, 10% 2-mercaptoethanol, 0.004% bromophenol blue and 0.125 M Tris-HCl, pH ∼6.8, and resolved on 10 or 15% SDS-polyacrylamide gels under reducing conditions at 120 V for 90–110 min. Using a wet transfer system, proteins were then transferred to polyvinylidene difluoride (PVDF) membranes at 100 V for 75 min. For GPA2-FLAG/GPB5-His heterodimerization experiments, samples were run in duplicate on the same gel and following transfer, membranes were cut in half for separate primary antibody incubations. Membrane blots containing protein samples were blocked for 1 h in PBS containing 0.1% Tween-20 (Bioshop, Burlington, ON, Canada) and 5% skim milk powder (PBSTB) rocking at RT. After blocking, membranes were incubated overnight at 4 °C on a rocking platform in PBSTB containing mouse monoclonal anti-His (1:500 dilution) or mouse monoclonal anti-FLAG (1:500) primary antibody solutions. The next day, membranes were washed three times with PBS containing 0.1% Tween-20 (PBST) for 15 min each wash and then were incubated in PBSTB containing anti-mouse HRP-conjugated secondary antibody (1:2000–1:3000 dilution) for 1 h rocking at RT before washing again with PBST (3 × 15 min washes). Finally, blots were incubated with Clarity Western ECL substrate and images were developed using a ChemiDoc MP Imaging System (Bio-Rad Laboratories, Mississauga, ON). Molecular weight measurements and analysis were performed using Image Lab 5.0 software (Bio-Rad Laboratories, Mississauga, ON). Western blots used to study recombinant GPA2 and GPB5 dimerization were repeated at least three times.

### Receptor Functional Activation Bioluminescence Assays

HEK 293T cells were co-transfected to express (i) H. sapiens TSHR, A. aegypti LGR1, or mCherry along with (ii) pGloSensor™-22F cAMP biosensor plasmid (Promega Corp., Madison, WI), which encodes a modified form of firefly luciferase with a fused cAMP binding moiety, providing a biosensor for the direct detection of cAMP signaling in live cells. At 48 h post-transfection, recombinant cells were dislodged with PBS-EDTA, pelleted at 400 × g for 5 min, resuspended in assay media [DMEM:F12 media with 10% fetal bovine serum (v/v)] containing cAMP GloSensor reagent (2% v/v), and incubated for 3 h rocking at RT shielded from light. White 96-well luminescence plates (Greiner Bio-One, Germany) were loaded under low light with previously prepared secreted or cell lysate protein concentrates (described above), forskolin or assay media alone, and incubated at 37 °C for 30 min prior to performing receptor activity assays. For stimulatory G-protein (Gs) pathway detection, recombinant cells expressing either H. sapiens TSHR, A. aegypti LGR1, or mCherry along with the cAMP biosensor were pre-treated with 0.25 mM 3-isobutyl-1-methylxanthine (IBMX) for 30 min at RT with rocking and shielded from light. Using an automatic injector unit (BioTek Instruments Inc., Winooski, VT), co-transfected HEK 293T cells were then loaded into wells containing various treatments, including 250 nM forskolin as a positive control or concentrates of protein fractions from mCherry-transfected cells as a negative control. For inhibitory G-protein (Gi/o) pathway testing, A. aegypti GPA2 and/or GPB5 was tested for the ability to inhibit a forskolin-induced cAMP-mediated bioluminescent response. Using an automatic injector unit (BioTek Instruments Inc., Winooski, VT), cells expressing LGR1 or mCherry (LGR1 activation and signaling negative control) were loaded into wells that contained various ligand treatments. Subsequently, 250 nM or 1 µM forskolin was added to wells immediately after the addition of cells, using a second automatic injector unit. For bioluminescent assays with tethered GPA2/GPB5 fusion proteins, recombinant cells expressing or not expressing LGR1 were equally divided to test Gs and Gi signaling pathways simultaneously using the same batch of cells per biological replicate. In all assays, luminescence was measured every 2 min for 20 min at 37 °C using a Synergy 2 Multimode Microplate Reader (BioTek, Winooski, VT) and averaged over 4–8 technical replicates for each treatment. To calculate the relative luminescent response, data were normalized to luminescent values recorded from treatments with 250 nM or 1 µM forskolin alone. Assays were performed repeatedly and involved 3–6 independent biological replicates. Relative luminescent values were compiled in Excel and transferred into GraphPad Prism 8.0 for figure preparation and statistical analysis. Data were analyzed using one-way or two-way ANOVA with Tukey's multiple comparison post-test (p < 0.05).
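The normalization of luminescence to forskolin-alone treatments can be sketched as below. This is an illustrative outline, not the authors' analysis pipeline (which used Excel and GraphPad Prism): it assumes each well has already been reduced to a single luminescence value, and the function names and readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def relative_response(treatment_wells, forskolin_wells):
    """Mean treatment signal as a percentage of the forskolin-alone mean."""
    return 100.0 * mean(treatment_wells) / mean(forskolin_wells)

def mean_and_sem(values):
    """Mean and standard error across independent biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical luminescence readings (arbitrary units), one value per well,
# i.e. four technical replicates per treatment within one biological replicate.
forskolin_alone = [1000, 1100, 900, 1000]
ligand_treated = [240, 260, 250, 250]

one_replicate = relative_response(ligand_treated, forskolin_alone)
print(one_replicate)

# Hypothetical relative responses (%) from three biological replicates.
avg, sem = mean_and_sem([20.0, 25.0, 30.0])
print(avg, round(sem, 2))
```

Technical replicates are averaged before normalization, and only the per-replicate relative values are carried forward, so that the spread reported across biological replicates is not inflated by well-to-well noise.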

### RESULTS

### A. aegypti GPA2 and GPB5 Subunit Expression Localization

To determine the distribution of GPA2 and GPB5 subunit expression in the central nervous system and peripheral tissues, adult mosquito organs were analyzed using RT-qPCR. GPA2 and GPB5 subunit transcript was detected in the central nervous system of adult male and female mosquitoes, with significantly enriched expression in the abdominal ganglia relative to other regions of the nervous system and peripheral tissues (**Figures 1A,B**). Fluorescence in situ hybridization was employed to localize cell-specific expression of the GPA2 and GPB5 transcripts in the abdominal ganglia. GPA2 and GPB5 anti-sense RNA probes identified two bilateral pairs of neuroendocrine cells (**Figures 1C,D**) within each of the first five abdominal ganglia in male and female mosquitoes whereas the control sense probes did not detect cells in the nervous system (**Figures 1E,F**). Within each of these first five abdominal ganglia, GPA2 transcript localized to similar laterally-positioned cells as the GPB5 transcript (**Figures 1G,H**). To determine if cells expressing GPA2 transcript were the same cells expressing GPB5 transcript, abdominal ganglia were simultaneously treated with both GPA2 and GPB5 anti-sense RNA probes. Using this dual probe approach, results confirmed the detection of only two bilateral pairs of cells (**Figures 1I,J**) that displayed a greater staining intensity compared to the intensity of cells detected when using either the GPA2 or GPB5 anti-sense probe alone (**Figures 1C,D**).

Using a custom antibody targeting A. aegypti GPB5, we next sought to immunolocalize GPB5 protein in the abdominal ganglia. GPB5 immunoreactivity localized to two bilateral pairs of cells (**Figure 2A**) within the first five ganglia, which were in similar positions to cells expressing GPA2 and GPB5 transcript (**Figures 1C,D,G–J**). Regardless of sex, in 30% of mosquitoes examined, three instead of two bilateral pairs of cells immunolocalized to each of the first five abdominal ganglia of the ventral nerve cord (**Figure 2B**), whereas no cells were ever detected in the sixth terminal ganglion (**Figure 2C**). For a given mosquito, there were no differences in the number of GPB5 immunoreactive cells detected between different ganglia. Along the lateral sides of each of the first five abdominal ganglia, GPB5 immunoreactive processes were observed to closely associate into a tract of axons that emanated through the lateral nerve (**Figure 2B**). Control treatments with GPB5 antibody preabsorbed with the GPB5 immunogenic antigen did not detect any cells in ganglia (**Figure 2D**).

FIGURE 1 | GPA2/GPB5 subunit transcript expression and localization in adult A. aegypti. (A,B) RT-qPCR examining GPA2 (A) and GPB5 (B) transcript expression in the central nervous system of adult mosquitoes, with significant enrichment in the abdominal ganglia (AG). Subunit transcript abundance is shown relative to their expression in the thoracic ganglia (TG). Mean ± SEM of three biological replicates. Columns denoted with different letters are significantly different from one another. Multiple comparisons two-way ANOVA test with Tukey's multiple comparisons (P < 0.05) to determine sex- and tissue-specific differences. Reproductive tissues (RT), alimentary canal (Gut), carcass (Carc), brain (B). Fluorescence in situ hybridization anti-sense (C,D,G–J) and sense (E,F) probes to determine GPA2 and GPB5 transcript localization (GPA2 and/or GPB5 transcript, red; nuclei, blue) in the abdominal ganglia of adult mosquitoes. Unlike sense probe controls (E,F), two bilateral pairs of cells (arrowheads) were detected with GPA2 (C,G) and GPB5 (D,H) anti-sense probes in the first five abdominal ganglia of adult male and female A. aegypti. Co-localization of the GPA2 (G) and GPB5 (H) transcript was verified by treating abdominal ganglia dually with a combination of GPA2 and GPB5 anti-sense probes (I,J) that revealed two, intensely-stained bilateral pairs of cells. In (C–F,I,J), microscope settings were kept identical when acquiring images of control and experimental ganglia. The second abdominal ganglion is depicted in each image as a representative, given that no differences in the number nor staining intensity of cells were observed between ganglia of a given mosquito. Scale bars are 50 µm in (C–F,I,J) and 40 µm in (G,H). Experimental procedures were repeated three-four times with five mosquitoes (of each sex) per trial.

experimental ganglia. The second abdominal ganglion is depicted in (A,B,D) as a representative, given that no differences in the number nor staining intensity of cells were observed between ganglia of a given mosquito. Scale bars are 25 µm in (A–D) and 20 µm in (B,C). Experimental procedures were repeated three-four times with five mosquitoes (of each sex) per trial.

### Cross-Linking Analyses to Determine A. aegypti GPA2 and GPB5 Subunit Interactions

A. aegypti GPA2 and GPB5 subunit protein interactions were studied using western blot analysis of recombinant proteins from HEK 293T cells expressing each subunit independently or co-expressing both subunits in the same cells using a dual promoter plasmid. Under control conditions, GPA2 protein is represented as two bands at 16 and 13 kDa, which correspond to the glycosylated and non-glycosylated forms of A. aegypti GPA2, respectively (**Figure 3A**). Following deglycosylation with PNGase, the higher molecular weight band of GPA2 is eliminated, and the non-glycosylated lower molecular weight band intensifies (**Figure 3A**). Interestingly, when the GPA2 subunit was tested to examine potential homodimerization, an additional strong band at ∼32 kDa was detected, which migrates to ∼30 kDa when cross-linked protein samples were deglycosylated using PNGase (**Figure 3A**). Under control conditions, GPB5 protein is represented as a band size at 24 kDa and the migration pattern is not affected by PNGase treatment (**Figure 3B**). Following treatment with DSS, a faint second band appears at 48 kDa, which does not change in molecular weight after treatment with PNGase (**Figure 3B**). Three independent band sizes at 24 kDa (GPB5), 16 kDa (glycosylated GPA2) and 13 kDa (non-glycosylated GPA2) were detected in lanes loaded with protein isolated from HEK 293T cells co-expressing GPA2 and GPB5 subunits (**Figure 3C**). After removal of N-linked sugars, the 24 kDa band is not affected but the 16 kDa band disappears and 13 kDa band intensifies (**Figure 3C**), as observed when assessing the GPA2 subunit independently (**Figure 3A**). Crosslinked samples show the addition of two higher molecular weight bands at ∼48 and ∼32 kDa, the latter of which migrates lower to ∼30 kDa when subjected to PNGase treatment (**Figure 3C**). 
Given the resolution and intensity of the detected bands, the molecular weight band at ∼48 kDa could indicate GPA2/GPB5 heterodimeric interactions (13 or 16 kDa + 24 kDa = 37–40 kDa). Alternatively, this band could reflect GPB5 homodimers (24 + 24 kDa = 48 kDa). As a result of this uncertainty, additional experiments were performed to clarify whether A. aegypti GPA2 and GPB5 subunits are heterodimeric partners.
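The band-assignment arithmetic above can be made explicit with a short sketch that enumerates every pairwise sum of the apparent monomer masses reported in this section and lists which combinations fall near an observed cross-linked band. The ±2 kDa matching tolerance and the function names are assumptions introduced here for illustration; they are not part of the reported analysis.

```python
from itertools import combinations_with_replacement

# Apparent monomer masses (kDa) reported in the text for A. aegypti subunits.
MONOMERS = {
    "GPA2 (non-glycosylated)": 13,
    "GPA2 (glycosylated)": 16,
    "GPB5": 24,
}

def predicted_dimers(monomers):
    """Return {(name_a, name_b): summed mass} for every pairwise combination."""
    return {
        (a, b): monomers[a] + monomers[b]
        for a, b in combinations_with_replacement(monomers, 2)
    }

def candidates(observed_kda, dimers, tolerance_kda=2):
    """List dimer pairs whose predicted mass lies within the tolerance of a band."""
    return [
        pair for pair, mass in dimers.items()
        if abs(mass - observed_kda) <= tolerance_kda
    ]

dimers = predicted_dimers(MONOMERS)
for band in (32, 48):  # cross-linked bands reported for co-expressed subunits
    print(band, candidates(band, dimers))
```

Under this simple additive model the predicted heterodimer masses (37 and 40 kDa) sit well below the upper cross-linked band, whereas a GPB5 homodimer matches it exactly, which illustrates why the co-expression blots alone could not settle the question.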

### Heterodimerization of Mosquito and Human GPA2/GPB5

Using yeast two-hybrid analyses and cross-linking experiments, it was previously shown that human GPA2 (hGPA2) and GPB5 (hGPB5) subunits are capable of heterodimerization (4). As a result, to verify whether A. aegypti GPA2 and GPB5 subunit proteins are heterodimeric candidates, experiments were performed alongside hGPA2/hGPB5 subunit proteins, using the latter as a positive experimental control for heterodimer detection.

Initially, single-promoter expression constructs were designed to incorporate a FLAG-tag and His-tag on the C-terminus of (human and mosquito) GPA2 and GPB5 subunits, respectively (i.e., GPA2-FLAG and GPB5-His). When probed with an anti-His antibody, no bands were detected in lanes containing only GPA2-FLAG (human and mosquito) protein (**Figures 4A,B**). Similarly, when probed with an anti-FLAG antibody, no bands were detected in lanes containing only GPB5-His protein (human and mosquito) (**Figures 4A,B**).

FIGURE 3 | Western blot analyses to determine the effects of glycosylation on homo- and heterodimer formation of the glycoprotein hormone (GPA2/GPB5) subunits in the mosquito, A. aegypti. (A) In untreated conditions, western blot analysis of GPA2 subunit alone reveals two bands at 16 and 13 kDa. Following treatment with PNGase, the higher molecular weight band disappears and the 13 kDa band is intensified. A thick, additional band at ∼32 kDa appears when GPA2 protein is cross-linked with DSS, and this band migrates slightly lower to ∼30 kDa when GPA2 protein is treated with both DSS and PNGase. (B) A 24 kDa band is observed in lanes loaded with untreated GPB5 subunit alone. Upon PNGase treatment, the 24 kDa band is not affected; however, upon treatment with DSS, a second faint band appears at 48 kDa that is not affected by deglycosylation. (C) Western blot analyses of co-expressed GPA2 and GPB5 subunits show three distinct band sizes at 24, 16, and 13 kDa, corresponding to the GPB5 subunit and two forms of GPA2 subunit protein. Similar to (A), after treatment with PNGase, the higher molecular weight form of GPA2 is eliminated and the 13 kDa band intensifies. When GPA2/GPB5 protein is cross-linked, two additional bands are detected at ∼48 and ∼32 kDa; however, following cross-linking and PNGase treatment, the ∼32 kDa band is eliminated leaving only the 30 kDa band along with the unaffected ∼48 kDa band. GPA2 (A) and GPA2/GPB5 (C) subunit protein was resolved on either a 10% (top) or 15% (bottom) polyacrylamide gel under denaturing conditions.

Lanes loaded with cross-linked hGPB5-His protein revealed two bands at 18 kDa (monomer) and 36 kDa (homodimer) (**Figure 4A**). Similarly, in lanes loaded with cross-linked hGPA2-FLAG protein, two bands at 20 kDa (monomer) and 40 kDa (homodimer) were detected (**Figure 4A**). To determine subunit heterodimerization, hGPA2-FLAG protein was combined with hGPB5-His protein (i.e., produced separately in different cell batches) and subsequently crosslinked using DSS. Results showed an intense 38 kDa band size that correlated to the molecular weight of the hGPA2-FLAG/hGPB5-His heterodimers; detected with both anti-His and anti-FLAG primary antibody solutions (**Figure 4A**).

Identical experiments were conducted using A. aegypti GPA2-FLAG/GPB5-His protein, but with a three-fold higher concentration of DSS cross-linker to improve detection of inter-subunit interactions. Lanes loaded with DSS-treated GPB5-His protein showed a 24 kDa monomer band (**Figure 4B**), whereas lanes containing cross-linked GPA2-FLAG showed bands at 16 kDa (glycosylated monomer) and at 30 and 32 kDa (homodimers) (**Figure 4B**). Thus, unlike immunoblots containing hGPA2-FLAG and hGPB5-His protein (**Figure 4A**), lanes loaded with cross-linked A. aegypti GPA2-FLAG and GPB5-His failed to provide evidence of bands correlating to the predicted molecular weight of an A. aegypti GPA2/GPB5 heterodimer, which would be expected at 37–40 kDa (**Figure 4B**).

### A. aegypti GPA2/GPB5 Unable to Activate LGR1-Mediated Gs and Gi/o Signaling Pathways

Bioluminescent assays were employed to confirm LGR1 interaction with mosquito GPA2 and/or GPB5 subunits and elucidate downstream signaling pathways upon receptor activation. Since human GPA2/GPB5 (hGPA2/hGPB5) has previously been shown to bind and activate human thyrotropin receptor (hTSHR) mediating a stimulatory G protein (Gs) signaling pathway (4), we first validated our experimental design using hGPA2/hGPB5 and hTSHR as a positive control. Recombinant hGPA2 and hGPB5 subunit proteins were produced in HEK 293T cells with single promoter expression constructs containing the hGPA2 or hGPB5 sequences. Conditioned culture media containing secreted proteins were subsequently concentrated, and crude extracts containing hGPA2, hGPB5 or a combination of hGPA2 and hGPB5 subunits were tested as ligands on HEK 293T cells expressing the hTSHR and a cAMP-sensitive firefly luciferase biosensor, which produces bioluminescence upon interacting with cAMP. Control treatments involved the incubation of hTSHR/luciferase-expressing cells with concentrated media collected from mCherry-transfected cells (negative control), and data were normalized to treatments with 250 nM forskolin (positive control) (**Figures 5A,B**). As expected, the results indicate that a combination of both hGPA2 and hGPB5 was required to stimulate a cAMP-mediated luminescent response from hTSHR-expressing cells, whereas extracts containing individual subunits did not elicit a response (**Figure 5A**). We next performed identical experiments using A. aegypti GPA2/GPB5 and LGR1. Given the availability of a dual promoter plasmid (pBudCE4.1), an additional treatment was performed whereby GPA2/GPB5 subunits were co-expressed within the same cells and conditioned media was concentrated as described above. Unlike the results using hGPA2/hGPB5 subunit homologs on HEK 293T cells expressing the hTSHR (**Figure 5A**), no combination of mosquito GPA2 and/or GPB5 subunit proteins led to an increase in the cAMP-based luminescent response in HEK 293T cells expressing mosquito LGR1 (**Figure 5B**).

FIGURE 4 | Elucidating heterodimerization of H. sapiens (human) (A) and A. aegypti (mosquito) (B) GPA2 and GPB5 subunits. Single promoter expression constructs for human and mosquito GPA2-FLAG and GPB5-His were used for transient expression in HEK 293T cells. Protein was harvested and subsequently concentrated, treated with DSS cross-linker, and probed with an anti-His or an anti-FLAG antibody after SDS-PAGE. (A,B) No bands were detected in lanes loaded with cross-linked GPA2-FLAG protein (Lane 1) or cross-linked GPB5-His protein (Lane 6), probed with an anti-His antibody or anti-FLAG antibody, respectively. (A) Bands corresponding to the monomeric form (18 kDa) and homodimer (36 kDa) of cross-linked human GPB5-His (Lane 3) and to the monomeric form (20 kDa) and homodimer (40 kDa) of cross-linked human GPA2-FLAG protein (Lane 4). A combination of the subunits with subsequent cross-linking of separately-produced human GPA2-FLAG and GPB5-His protein (Lane 2, 5) revealed a band size correlating to the human GPA2/GPB5 heterodimer (38 kDa), detected using an anti-His (Lane 2) or anti-FLAG (Lane 5) antibody. (B) Bands corresponding to the mosquito GPB5 monomer (24 kDa) (Lane 3), mosquito GPA2 glycosylated monomer (16 kDa) and homodimer pairs (30 and 32 kDa) (Lane 4). No bands correlating to a mosquito GPA2/GPB5 heterodimer (37–40 kDa) were observed when probed with either anti-His (Lane 2) or anti-FLAG (Lane 5) antibodies.

Using an in silico analysis to predict coupling specificity of A. aegypti LGR1 and human TSHR to different families of G proteins (33), we determined that A. aegypti LGR1 is strongly predicted to couple to inhibitory (Gi/o) G proteins (**Table S3**). As a result, to determine whether A. aegypti GPA2 and/or GPB5 activate a Gi/o signaling pathway, various combinations of GPA2 and GPB5 were tested for their ability to inhibit a 250 nM forskolin-induced rise in cAMP measured by changes in bioluminescence (**Figures 5C,D**). Results revealed that treatments with GPB5 proteins alone significantly inhibited a forskolin-induced luminescent response, relative to control treatments with mCherry proteins, when incubated with cells expressing LGR1 (**Figure 5C**). However, similar inhibitory effects of GPA2 and GPB5 proteins were also observed with cells lacking LGR1 expression (**Figure 5D**).

### Characterization of Tethered A. aegypti GPA2/GPB5

Activity on the hTSHR was only observed when both hGPA2 and hGPB5 subunits were coapplied for receptor activation (**Figure 5A**) and, unlike the heterodimerization of hGPA2/hGPB5 observed in our experiments, mosquito GPA2/GPB5 lacked evidence of heterodimerization (**Figure 4**). In light of these observations, we hypothesized that the activation of A. aegypti LGR1 also required subunit heterodimerization. To mimic GPA2/GPB5 heterodimers using the heterologous expression system, both GPA2 and GPB5 mosquito subunits were expressed as a tethered, single-chain polypeptide by fusing the C-terminus of the GPB5 prepropeptide sequence with the N-terminus of the GPA2 propeptide sequence, using a tagged linker sequence composed of twelve amino acids, involving three glycine-serine repeats and six histidine residues.

HEK 293T cells were transfected to transiently express a single promoter plasmid construct containing the tethered GPA2/GPB5 sequence, or the red fluorescent protein (mCherry) as a negative control. To verify expression of the construct, at 48 h post-transfection, cell lysates along with the conditioned culture media, the latter of which contains secreted proteins, were collected for immunoblot analysis. No bands were detected in lanes containing cell lysate or secreted protein fractions of mCherry-transfected cells (**Figure 6A**). However, in the lysates of cells transfected to express tethered GPA2/GPB5, an intense band at 32 kDa and a less intense band at 37 kDa were detected, the latter of which corresponds to the predicted molecular weight of non-glycosylated GPA2 (13 kDa) plus GPB5 (24 kDa) (**Figure 6A**). In lanes containing secreted fractions of tethered GPA2/GPB5-transfected cells, a band at 37 kDa was again detected, as well as a stronger 40 kDa band that correlates to the predicted molecular weight of glycosylated GPA2 (16 kDa) plus GPB5 (24 kDa) (**Figures 6A,B**). After secreted protein extracts containing tethered GPA2/GPB5 proteins were treated with PNGase, the higher 40 kDa molecular weight band disappeared and the 37 kDa band intensified, which confirms that the observed molecular weight shift results from removal of N-linked oligosaccharides (**Figure 6B**). Thus, the tethered GPA2/GPB5 undergoes post-translational processing similar to that observed for the subunits expressed individually.

### Tethered A. aegypti GPA2/GPB5 Activates LGR1

The activity of tethered GPA2/GPB5 proteins on LGR1 activation was examined. Cell lysate or secreted protein fractions collected from mCherry- or tethered GPA2/GPB5-transfected cells were incubated with HEK 293T cells co-expressing the cAMP luciferase biosensor and either A. aegypti LGR1 or mCherry (i.e., not expressing LGR1). Whether tethered GPA2/GPB5 proteins (cell lysate or secreted fractions) could elevate cAMP or inhibit a forskolin-induced rise in cAMP was assessed and compared to negative control treatments with proteins harvested from mCherry-transfected cells.

Without the addition of tethered GPA2/GPB5 proteins, the basal luminescent levels were higher in LGR1-transfected cells compared to mCherry-transfected cells (**Figure 7**), suggesting constitutive activity of A. aegypti LGR1 elevating cAMP levels. However, the application of neither secreted protein fractions nor cell lysates of tethered GPA2/GPB5 transfected cells elicited an increase in the cAMP-mediated luminescent response relative to control treatments with mCherry proteins (**Figures 7A,B**). Secreted protein fractions containing tethered GPA2/GPB5 protein had no effect on the forskolin-induced cAMP-dependent luminescence, compared to control treatments with mCherry secreted proteins in LGR1-expressing cells (**Figure 7C**). Notably, however, treatments of LGR1-expressing cells with cell lysates containing tethered GPA2/GPB5 significantly inhibited the luminescent response owing to the forskolin-induced rise in cAMP relative to treatments with mCherry-transfected cell lysates, which was not observed in cells lacking LGR1 expression (**Figure 7D**).

### DISCUSSION

### GPA2/GPB5: A Neuropeptide Produced in the Abdominal Ganglia of Mosquitoes

The central nervous system (CNS) of adult mosquitoes is comprised of a brain and a ventral nerve cord, consisting of the suboesophageal ganglion, three fused thoracic ganglia and six abdominal ganglia. GPA2 and GPB5 transcripts are significantly enriched in the abdominal ganglia of adult mosquitoes relative to peripheral tissues and other regions of the CNS. Although low levels of GPA2 and GPB5 transcripts were detected in the thoracic ganglia and brain using RT-qPCR, fluorescence in situ hybridization (FISH) techniques used to localize GPA2 and GPB5 transcripts, along with immunohistochemical detection of GPB5, did not identify specific cells in these regions of the nervous system. Instead, GPA2 and GPB5 transcripts, as well as GPB5 immunoreactivity, were identified in 2–3 laterally-localized bilateral pairs of neuroendocrine cells within the first five abdominal ganglia. These findings are consistent with previous findings in the fruit fly D. melanogaster, where GPA2 and GPB5 subunit transcripts were localized to four bilateral pairs of neuroendocrine cells within the abdominal neuromeres in the fused ventral nerve cord that were distinct from cells expressing other neuropeptides including leucokinin, bursicon, crustacean cardioactive peptide or calcitonin-like diuretic hormone (18).

Both FISH and immunohistochemical techniques revealed GPA2 and GPB5 expression within two bilateral pairs of neuroendocrine cells, that were positioned slightly posterior to where the lateral nerves emanate from these ganglia. It is possible that GPB5 immunoreactive axons follow similar projection patterns as ovary ecdysteroidogenic hormone (OEH I), a gonadotropin produced in lateral neurosecretory cells in the abdominal ganglia of mosquitoes (34). OEH I positive cells emanate through the lateral nerves, similar to our findings with GPB5 immunoreactive cells, and terminate on perivisceral organs that function as neurohaemal release sites (34).

In some abdominal ganglia preparations, a third bilateral pair of cells immunoreactive for GPB5 protein was detected; however, these additional cells were not detected using FISH, which suggests GPB5 transcript may be differentially regulated between different bilateral pairs of cells. Given that the same number of cells were observed to express GPA2 transcript, and these cells localized to similar positions as GPB5 expressing cells, the glycoprotein hormone subunits are likely co-expressed in the same cells. To investigate cellular co-expression of GPA2/GPB5, abdominal ganglia were simultaneously treated with both GPA2- and GPB5-targeted anti-sense RNA probes. From this analysis, again only two bilateral pairs of cells were detected and these were more intensely stained compared to preparations treated with either probe alone, which confirms that GPA2 and GPB5 are indeed co-expressed within the same neurosecretory cells of the first five abdominal ganglia in adult mosquitoes. The cellular co-expression of GPA2 and GPB5 proteins implies that both subunits are regulated in a similar manner and are likely released simultaneously following the appropriate stimulus. Importantly, since co-expression and heterodimerization of the classic vertebrate glycoprotein hormone subunits takes place within the same cells (21, 22), these findings indicate that the mosquito GPA2/GPB5 subunits may be produced and released as heterodimers in vivo.

### Heterodimerization and Homodimerization of GPA2/GPB5

To study the interactions of A. aegypti GPA2 and GPB5 subunits in vitro, hexa-histidine tagged proteins secreted into the culture media of transfected HEK 293T cells were collected and analyzed under denaturing conditions after cross-linking treatments, which had been utilized previously to show GPA2/GPB5 heterodimerization in other organisms (4, 6, 29). Cross-linked protein samples were then deglycosylated to identify whether the removal of N-linked sugars affected dimerization. Treatment of mosquito GPA2 and GPB5 subunits individually with cross-linker resulted in the detection of bands with sizes corresponding to homodimers of GPA2 (∼32 kDa) and GPB5 (48 kDa). GPA2 homodimer bands migrated lower to ∼30 kDa following deglycosylation treatment with PNGase. However, experiments performed with cross-linked GPA2/GPB5 protein were not able to confirm heterodimerization since the detected band sizes could also reflect GPB5 homodimeric interactions. Similarly, previous studies that demonstrated GPA2/GPB5 heterodimerization in human (4, 6) and fruit fly (29) did not provide evidence on the interactions of each subunit alone to determine if homodimerization is possible. In these earlier studies, molecular weight band sizes that were identified as heterodimers could have been the result of homodimeric interactions (4, 6, 29). As a result, the findings herein with mosquito GPA2/GPB5 indicate additional experiments are required to confirm GPA2/GPB5 heterodimerization in these organisms.

To clarify whether A. aegypti (mosquito) GPA2/GPB5 heterodimerize, each subunit was differentially tagged (GPA2-FLAG and GPB5-His), and immunoblots containing various combinations of cross-linked subunits were probed with either anti-FLAG or anti-His antibody. As a positive control, experiments were first performed using H. sapiens (human) hGPA2/hGPB5 subunit proteins. Similar to mosquito GPA2 and GPB5 subunits, results showed hGPA2 and hGPB5 subunits are capable of homodimerization. To study heterodimeric interactions, GPA2 and GPB5 subunit proteins were expressed separately in HEK 293T cells. Upon combining and treating protein samples with DSS, a molecular weight band size at 38 kDa, corresponding to the molecular weight of hGPA2/hGPB5 heterodimers (hGPA2-FLAG and hGPB5-His tagged), was detected and migrated differently than bands corresponding to hGPA2 (40 kDa) and hGPB5 (36 kDa) homodimers. Taken together, our results confirm hGPA2 and hGPB5 subunits are indeed capable of heterodimerization in vitro. Further, an induction of cAMP was observed when both subunits were present for TSHR functional activation, whereas treatments with individual subunits failed to significantly increase cAMP-mediated luminescence. As a result, hGPA2/hGPB5 is capable of heterodimerization, and since a combination of both subunits was required to signal a TSHR-mediated elevation in cAMP, these heterodimers are required to functionally activate their cognate glycoprotein hormone receptor.

Analogous experiments using the same concentration of DSS cross-linker (data not shown) as well as three-fold higher concentrations were performed with mosquito GPA2-FLAG and GPB5-His subunit proteins. Irrespective of the DSS concentration used, no bands corresponding to the expected molecular weights of mosquito GPA2/GPB5 heterodimers (37–40 kDa) were observed, but rather, only GPA2 and GPB5 homodimers were detected. Moreover, mosquito GPA2/GPB5, either each subunit alone, mixed from different cell batches or co-expressed in the same cells using a dual promoter vector, was unable to stimulate a cAMP-mediated luminescent response in HEK 293T cells expressing LGR1. Since A. aegypti LGR1 is predicted to couple to a Gi/o signaling pathway (**Table S3**), we examined if mosquito GPA2/GPB5 could inhibit a forskolin-induced cAMP response. Sole treatments of GPA2 and GPB5 subunit proteins inhibited a forskolin-induced rise in cAMP; however, these inhibitory actions were not owed to G protein signaling events related to A. aegypti LGR1, since inhibition was observed in control cell lines in the absence of LGR1. These results suggest mosquito GPA2 and GPB5 subunit proteins may non-specifically interact with other endogenously expressed proteins, like the orphan receptors LGR4 and LGR5 that are highly expressed in HEK 293T cells (35).
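The band assignments discussed in this section rest on simple additive arithmetic, which the following minimal sketch makes explicit. It assumes that the apparent size of a cross-linked dimer is roughly the sum of its monomer sizes, ignoring the mass of the cross-linker and any gel migration anomalies; the function name `dimer_bands` is hypothetical, and the monomer sizes are those given in Figure 4 and in the text.

```python
from itertools import combinations_with_replacement


def dimer_bands(monomers_kda):
    """Return the expected size (kDa) of every homo- and heterodimer.

    Assumes dimer size = sum of the two monomer sizes (a simplification).
    """
    bands = {}
    pairs = combinations_with_replacement(sorted(monomers_kda.items()), 2)
    for (name_a, kda_a), (name_b, kda_b) in pairs:
        bands[f"{name_a}/{name_b}"] = kda_a + kda_b
    return bands


# Monomer sizes as reported in the article (kDa).
human = {"GPA2-FLAG": 20, "GPB5-His": 18}
mosquito_glycosylated = {"GPA2": 16, "GPB5": 24}
mosquito_non_glycosylated = {"GPA2": 13, "GPB5": 24}

for label, monomers in [
    ("human", human),
    ("mosquito, glycosylated GPA2", mosquito_glycosylated),
    ("mosquito, non-glycosylated GPA2", mosquito_non_glycosylated),
]:
    print(label, dimer_bands(monomers))
```

Under this assumption the human heterodimer (38 kDa) falls between the two homodimers (36 and 40 kDa), and the two mosquito heterodimer values bracket the 37–40 kDa range quoted above.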

### A. aegypti GPA2/GPB5 Heterodimers Activate LGR1 and Initiate a Switch From Gs to Gi Coupling

Activation of the human thyrotropin receptor was only observed when both hGPA2 and hGPB5 subunits were present which, unlike data obtained involving mosquito subunits, demonstrated hGPA2/hGPB5 subunit heterodimerization in vitro using the mammalian heterologous system. The inability of mosquito GPA2 and GPB5 subunits to successfully heterodimerize could result from improper protein folding of insect-derived secretory proteins in HEK 293T cells used for heterologous expression. As a result, future experiments should test the expression and heterodimerization of A. aegypti GPA2/GPB5 in other heterologous systems such as insect cell lines, which could provide a more appropriate environment for tertiary and quaternary protein structure formation. Nonetheless, given the observed co-localization of the subunits within bilateral pairs of cells in the first five abdominal ganglia in A. aegypti, which is comparable to cellular colocalization shown earlier in D. melanogaster (18), we proposed that A. aegypti GPA2/GPB5 heterodimers would be required to functionally activate LGR1 in vitro. To confirm this possibility, a tethered construct was designed linking the C-terminus of GPB5 to the N-terminus of GPA2 using a histidine tagged glycine/serine-rich linker sequence. Natural and synthetic linkers function as spacers that connect multidomain proteins, and are commonly used to study unstable or weak protein-protein interactions (36). The incorporation of a linker sequence between glycoprotein hormone subunits has been performed previously, and does not affect the assembly, secretion or bioactivity of human FSH (37), TSH (38), and CG (39). The conversion of two independent glycoprotein hormone subunits into a single polypeptide chain using a glycine-serine repeat linker sequence has also been performed recently with lamprey GPA2/GPB5 (31), which was shown to induce a cAMP response. 
Interestingly, similar proteins involving TSH alpha fused to TSH beta with carboxyl-terminal peptide (CTP) as a linker promoted a three-fold higher induction of cAMP compared to wild-type TSH, likely because the addition of a CTP linker increases protein stability and flexibility (40).

Thus, tethered A. aegypti GPA2/GPB5 was expressed in HEK 293T cells and secreted protein fractions along with cell lysates were collected for expression studies. In cell lysates, immunoblot studies revealed the detection of two bands at 32 and 37 kDa which correlate to immature (incompletely processed) forms of tethered GPA2/GPB5 proteins and non-glycosylated GPA2 (13 kDa) plus GPB5 (24 kDa), respectively. In secreted fractions, a faint band corresponding to non-glycosylated GPA2/GPB5 as well as an intense band at 40 kDa, glycosylated GPA2 (16 kDa) plus GPB5 (24 kDa), was detected. Moreover, treatments with PNGase verified the higher molecular weight band size in secreted fractions at 40 kDa is glycosylated, as observed for GPA2 expressed independently in earlier experiments herein and in previous studies (13). Bands in cell lysates indicate the retention of immature and non-glycosylated GPA2/GPB5 heterodimers. For this reason, cell lysates and secreted protein fractions of tethered GPA2/GPB5 expressing cells were separately tested for their ability to activate A. aegypti LGR1. Unlike secreted fractions that were richer in glycosylated GPA2/GPB5 proteins, cell lysates that lacked glycosylated GPA2/GPB5 were able to activate LGR1, leading to a reduction in cAMP-mediated luminescence. Thus, our results suggest that post-translational modifications like glycosylation may impact ligand-receptor activity. Alternatively, it is possible that a lack of inhibitory activity of secreted fractions could be owed to smaller concentrations of tethered GPA2/GPB5 compared to lysate fractions, given that the amounts of secreted and cell lysate fractions were not equalized. Future experiments should aim to create novel constructs that improve expression and secretion of tethered GPA2/GPB5.

### Evolutionary Conservation and Divergence of GPA2/GPB5 Receptor Signaling

In humans, GPA2/GPB5-TSHR signaling stimulates adenylyl cyclase activity to increase intracellular cAMP via interaction with a Gs protein (4, 6, 11), and these results were confirmed in our studies. Comparatively, GPA2/GPB5 signaling was also shown to increase levels of cAMP upon binding LGR1 in D. melanogaster (29). Interestingly, our experiments demonstrate low level constitutive activity of adenylyl cyclase in LGR1 expressing cells since the cAMP luminescent response was moderately greater in LGR1-transfected cells compared to cells not expressing LGR1. High basal constitutive activity of LGR1 has been observed previously in D. melanogaster (29). Moreover, constitutive activity of glycoprotein hormone receptors is well-known in vertebrates and has been demonstrated to be stronger for the thyrotropin receptor than for the LH/CG receptor (41). Given that GPA2/GPB5 receptor homologs in vertebrate and invertebrate model organisms have been shown previously, and herein, to promote constitutive synthesis of cAMP, this may suggest that a common downstream effector and/or function exists for LGR1 and homologs in different organisms.

Surprisingly, however, our experiments indicate that incubations of LGR1-expressing cells with mosquito GPA2/GPB5 tethered protein trigger a switch from low level constitutive Gs coupling to Gi/o coupling for A. aegypti LGR1, given that tethered GPA2/GPB5 significantly inhibited the forskolin-induced increase in cAMP in LGR1-expressing HEK 293T cells but not in control cells lacking LGR1 expression. This finding, while highly interesting, is not entirely unusual since promiscuous G protein coupling has been reported for glycoprotein hormone receptors like the TSH receptor (Gs and Gq) and LH/CG (Gi and Gs) (42–44).

### Regulation by GPA2/GPB5 Heterodimers

To help stabilize heterodimerization, the beta subunit sequences of the classic glycoprotein hormones (FSH, LH, TSH, and CG) contain two additional cysteine residues that form an additional disulfide bridge which wraps around and "buckles" the alpha subunit (26). Though heterodimerization can occur with mutated forms of this "seatbelt" structure, there is a dramatic decrease in heterodimer stability (21, 22). GPB5 in vertebrates and invertebrates lacks the seatbelt structure required to stabilize heterodimerization (26). Thus, the hypothesis that GPA2/GPB5 functions as a heterodimer in a physiological situation (i.e., without chemical cross-linking) is challenged. The dissociation constant (Kd) associated with heterodimerization of the classic glycoprotein hormone subunits and GPA2/GPB5 is 10<sup>−7</sup> to 10<sup>−6</sup> M, which indicates heterodimeric interactions are favored at these concentrations (45, 46). However, since the classic beta subunits contain an additional disulfide bridge that strengthens their association with the common alpha subunit, heterodimeric interactions are stabilized in circulation at physiological concentrations as low as 10<sup>−11</sup> to 10<sup>−9</sup> M (21). Without this seatbelt structure, GPA2/GPB5 heterodimeric interactions are possible only at micromolar concentrations, which are not typically observed in circulation (21, 26). Together, the limited evidence so far using heterologous expression challenges the possibility of endocrine regulation by GPA2/GPB5 heterodimers. Nonetheless, it was argued in D. melanogaster that the large neurosecretory cells co-expressing the glycoprotein hormone subunits, along with their corresponding axonal projections that localized distinctly from organs that express the GPA2/GPB5 receptor (LGR1), did support that this system is indeed endocrine in nature (18). Alternatively, the subunits could function independently or regulate physiology as a heterodimer in a paracrine/autocrine fashion. In rats, GPA2/GPB5 is expressed in oocytes and may act as a paracrine regulator of TSHR-expressing granulosa cells in the ovary to regulate reproductive processes (6). Another possibility to consider is that additional endogenous co-factors may be involved, but remain unidentified, which help to strengthen interaction between the GPA2 and GPB5 subunits, since the tethered mosquito GPA2/GPB5 was indeed capable of activating LGR1 in vitro inducing a Gi/o signaling cascade.
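The concentration argument above can be illustrated with a small calculation. The sketch below is not a model used in this study: it assumes a simple reversible 1:1 association (A + B ⇌ AB) with equal total concentrations C of both subunits and no seatbelt stabilization, so that the dimer concentration x satisfies Kd = (C − x)² / x. The function name `fraction_heterodimer` is hypothetical.

```python
import math


def fraction_heterodimer(total_conc_m, kd_m):
    """Fraction of each subunit present as heterodimer at equilibrium.

    Assumes A + B <-> AB with equal total concentrations of A and B
    (an illustrative simplification, not a model from the study).
    """
    b = 2.0 * total_conc_m + kd_m
    disc = b * b - 4.0 * total_conc_m ** 2
    # Smaller root of x^2 - b*x + C^2 = 0, written to avoid cancellation.
    dimer = 2.0 * total_conc_m ** 2 / (b + math.sqrt(disc))
    return dimer / total_conc_m


# Kd of 1e-7 M is the lower bound of the range quoted in the text.
for conc in (1e-11, 1e-9, 1e-7, 1e-6):
    print(f"{conc:.0e} M -> {fraction_heterodimer(conc, 1e-7):.4f}")
```

Under these assumptions, only a small fraction of the subunits is heterodimeric at nanomolar or lower concentrations, whereas most of the subunit pool is dimerized at micromolar concentrations. This is in line with the reasoning that unassisted GPA2/GPB5 heterodimers are unlikely to persist at typical circulating hormone concentrations.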

### GPA2 and GPB5 Homodimerization

Our results establish that human and mosquito GPA2 and GPB5 subunits can weakly and strongly, respectively, homodimerize. However, whether these homodimers have a physiological function in vivo is unknown. Treatments of either mosquito or human GPA2 and GPB5 subunits alone did not stimulate specific downstream signaling in LGR1- or TSHR-expressing cells, upholding that only GPA2/GPB5 heterodimers can activate their cognate glycoprotein hormone receptors. However, it is possible that GPA2 and GPB5 homodimers may target other unidentified receptors. In insects, the molting hormone bursicon is a heterodimer of two subunits called burs and pburs. Burs/pburs heterodimers act via a glycoprotein hormone receptor (i.e., LGR2) to regulate processes such as tanning and sclerotization of the insect cuticle as well as wing inflation after adult emergence (47). Recently, it was demonstrated that bursicon subunits can homodimerize (i.e., burs/burs and pburs/pburs) and these homodimers mediate actions independently of LGR2 to regulate immune responses in A. aegypti and D. melanogaster (48, 49).

In addition to the human and mosquito GPA2/GPB5 homodimers observed in our studies, human GPA2 was also shown to interact with the beta subunits of CG and FSH (4). Lastly, the expression patterns of GPA2 and GPB5 in a number of organisms do not always strictly co-localize, since GPA2 expression exhibits a much wider distribution and is expressed more abundantly than GPB5 in a number of vertebrate and invertebrate organisms (7, 12, 23, 24, 50, 51). Taken together, this raises the possibility that GPA2 and GPB5 subunits may interact with other unknown proteins that could activate different receptors or signaling pathways and elicit distinct functions.

### CONCLUDING REMARKS

Although much is known about the classic vertebrate glycoprotein hormones including LH, FSH, TSH, and CG along with their associated receptors, little progress has been made thus far toward better understanding the function of GPA2/GPB5 signaling and subunit interactions, particularly for the invertebrate organisms. To our knowledge, this is the first study to demonstrate A. aegypti and H. sapiens GPA2 and GPB5 subunit homodimerization in vitro. Our results also confirm that heterodimerization of A. aegypti and H. sapiens GPA2/GPB5 is required for the activation of their cognate receptors LGR1 and TSHR, respectively. In contrast to previous reports showing GPA2/GPB5-induced LGR1 activation elevates intracellular cAMP by coupling to a Gs pathway, the current findings provide novel information supporting that A. aegypti LGR1 couples to a Gi/o protein to inhibit cAMP levels following application of the GPA2/GPB5 fusion polypeptide. Further, our results revealed that mosquito LGR1 is constitutively active when overexpressed in the absence of its ligand, GPA2/GPB5, inducing a Gs signaling pathway that raises cAMP levels, which is consistent with previous observations with overexpression of fruit fly LGR1 (29, 52) as well as mammalian receptors including the dog and human TSH receptor (53, 54).

In the mosquito nervous system, our results confirm GPA2 and GPB5 subunits are co-expressed within the same neurosecretory cells of the first five abdominal ganglia where their coordinated release and regulation are likely. However, whether GPA2/GPB5 are secreted as heterodimers, like the classic glycoprotein hormones, and/or as homodimers remains to be determined in vivo. While homodimers were inactive in the heterologous assay used herein, whether these homodimers are functional in vivo and what physiological role they play (if any) is a research direction that should be addressed in future studies. All in all, this investigation has provided novel information for an invertebrate GPA2/GPB5 and LGR1 signaling system and contributes toward advancing our understanding and the functional elucidation of this ancient glycoprotein hormone signaling system common to nearly all bilaterian organisms.

### DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/**Supplementary Material**.

### AUTHOR CONTRIBUTIONS

DR and J-PP contributed to experimental design, wrote the paper, and aided with data analysis. DR performed all the experiments.

### REFERENCES


### FUNDING

This study was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant and an Ontario Ministry of Research and Innovation Early Researcher Award to J-PP. This manuscript has been released as a preprint on bioRxiv, available at https://www.biorxiv.org/content/10.1101/694653v2.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2020.00158/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Rocco and Paluzzi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Adipokinetic Peptides in Diptera: Structure, Function, and Evolutionary Trends

#### Gerd Gäde<sup>1</sup> \*, Petr Šimek <sup>2</sup> \* and Heather G. Marco<sup>1</sup>

<sup>1</sup> Department of Biological Sciences, University of Cape Town, Cape Town, South Africa, <sup>2</sup> Biology Centre, Czech Academy of Sciences, Ceské Budejovice, Czechia

Nineteen species of various families of the order Diptera and one species from the order Mecoptera are investigated with mass spectrometry for the presence and primary structure of putative adipokinetic hormones (AKHs). Additionally, the peptide structures of putative AKHs in other Diptera are deduced from data mining of publicly available genomic or transcriptomic data. The study aims to demonstrate the structural biodiversity of AKHs in this insect order and also possible evolutionary trends. Sequence analysis of AKHs is achieved by liquid chromatography coupled to mass spectrometry. The corpora cardiaca of almost all dipteran species contain AKH octapeptides; a decapeptide is an exception found in only one species. In general, the dipteran AKHs are order-specific: with only two exceptions, they are not found in any other insect order. Four novel AKHs are revealed by mass spectrometry: two in the basal infraorder of Tipulomorpha and two in the brachyceran family Syrphidae. Data mining revealed another four novel AKHs: one in various species of the infraorder Culicomorpha, one in the brachyceran superfamily Asiloidea, one in the family Diopsidae and in a Drosophilidae species, and the last of the novel AKHs is found in yet another Drosophila. In general, there is quite a biodiversity in the lower Diptera, whereas the majority of the cyclorrhaphan Brachycera produce the octapeptide Phote-HrTH. A hypothetical molecular peptide evolution of dipteran AKHs is suggested to start with an ancestral AKH, such as Glomo-AKH, from which all other AKHs in Diptera to date can evolve via point mutation of one of the base triplets, with one exception.

Keywords: Diptera, adipokinetic peptides, mass spectrometry, adipokinetic and hypertrehalosemic biological assays, fly phylogeny

#### Edited by:

Klaus H. Hoffmann, University of Bayreuth, Germany

#### Reviewed by:

Christian Wegener, Julius Maximilian University of Würzburg, Germany Mark R. Brown, University of Georgia, United States

\*Correspondence:

Gerd Gäde gerd.gade@uct.ac.za Petr Šimek simek@bclab.eu

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 03 January 2020 Accepted: 04 March 2020 Published: 31 March 2020

#### Citation:

Gäde G, Šimek P and Marco HG (2020) The Adipokinetic Peptides in Diptera: Structure, Function, and Evolutionary Trends. Front. Endocrinol. 11:153. doi: 10.3389/fendo.2020.00153

### INTRODUCTION

There are surely a few facts about the importance of Diptera for ecology, medicine and agriculture that are not so well-known to the general public and non-dipteran specialists alike. Most associations with "flies" in general are about the role of mosquitoes as vectors of terribly infectious diseases, such as malaria or yellow fever, or about the use of the vinegar fly Drosophila melanogaster as model organism for genetic experiments into development, aging, behavior, metabolism and diseases. Mosquitoes (Family: Culicidae) are widely known as "detrimental" dipteran species, for they are vectors for dengue, West Nile, Zika and yellow fever, as well as for malaria and encephalitis (https://www.who.int/neglected\_diseases/vector\_ecology/mosquito-borne-diseases/en/). Certain house flies (Family: Muscidae) and blow flies (Family: Calliphoridae) transmit disease microorganisms that lead to, e.g., cholera, dysentery and gastroenteritis in humans (1), and horse flies (Family: Tabanidae) can transmit diseases to other animals, such as equine infectious anemia, and anthrax in cattle and sheep (2). Next to such detrimental effectors, it is perhaps understandable that many people overlook the fact that there are also beneficial fly species, e.g., hover flies (Family: Syrphidae); these are the second most important pollinators after the Hymenoptera (3). Additionally, there are specialist pollinators known from other dipteran families, such as the long-tongued horse flies (Family: Tabanidae) or the cocoa-tree pollinator midges (Family: Ceratopogonidae) (3). Other benign functions are fulfilled by certain fly larvae of the family Calliphoridae, used for medicinal purposes to clean wounds by consuming necrotic tissues, and as producers of pharmacologically active substances such as antimicrobial, antiviral, and antitumor compounds (4). Also, larvae of the black soldier fly Hermetia illucens (Family: Stratiomyidae) feed voraciously on household and agricultural waste products. These insect decomposers are now reared at industrial scale, and their pupae used for the economic production of oils and dried protein feed for chicken and fish (5). Finally, in forensic entomology the succession order of larvae of various flies on/in the decomposing human corpse is indicative of the time of death (6) and provides yet another example of the positive usefulness of Diptera.

From the above remarks it should be clear that "the flies" are quite a diverse order with respect to being classified as "pest" or "beneficial." This is not surprising when considering the vast number of fly species: the order Diptera comprises about 152 000 described species (7). Until recently the major divide has been (a) the Nematocera [mostly crane flies (Family: Tipulidae) and mosquitoes, characterized by long antennae] and (b) the Brachycera, which harbor inter alia the above-mentioned families Tabanidae, Drosophilidae and Muscidae, all characterized by short antennae. The newest phylogenomic research takes 149 out of 157 families of Diptera into account and interprets molecular data from 14 nuclear loci and from many complete mitochondrial genomes, as well as 371 morphological characters (8). This comprehensive study presented a phylogenetic tree that shows (i) the monophyly of Diptera with Mecoptera (scorpion flies) and Siphonaptera (fleas) as closest relatives; (ii) paraphyly in the former Nematocera, which are now rather called lower Diptera; the rare and species-poor families Deuterophlebiidae (mountain midges) and Nymphomyiidae are at the base of the fly tree, followed by the lower dipteran infraorders Tipulomorpha and Culicomorpha; (iii) the clade Neodiptera, in which the infraorder Bibionomorpha (marsh flies and gall midges) is the sister group of Brachycera; (iv) the major clades of Brachycera are the Orthorrhapha [Tabanomorpha, Stratiomyomorpha (such as soldier flies) and Asiloidea (such as robber flies)], the Schizophora in the Cyclorrhapha [Tephritoidea (fruit flies) and Ephydroidea (relatives of Drosophila)] and the Calyptratae inside the Cyclorrhapha (such as tsetse flies, house flies and blow flies) (8).

In the current study, we determine the primary structure of an important metabolic hormone from the so-called adipokinetic hormone (AKH) family in major families of lower Diptera and Brachycera with the aim to establish how the biodiversity and distribution of various AKH sequences follow the afore-mentioned phylogeny. Previously, we have successfully implemented the use of the primary structure of AKHs in various insect orders to verify certain phylogenetic trends and ancestral relationships [e.g., (9–13)]. The AKH is one of many biologically active neuropeptides in insects. Its major function is regulation of metabolism and it can be compared in a functional respect with the vertebrate hormone, glucagon. Structurally, however, the AKH peptide and its G-protein coupled receptor are related to the vertebrate gonadotropin releasing hormone (GnRH) system, and the AKH system is suggested to form a large peptide superfamily together with two other insect neuropeptide systems: corazonin, and adipokinetic hormone/corazonin-related peptide (14–16). AKHs are synthesized and released from the corpus cardiacum (CC), a neurohemal organ. The peptides are characterized by a chain length of 8 to 10 amino acids and posttranslationally modified N- and C-termini (pyroglutamate and carboxyamidation, respectively). At position two from the amino terminus, AKHs have either an aliphatic amino acid (leucine, isoleucine, or valine) or an aromatic amino acid (phenylalanine); position three is either a threonine or asparagine residue; at position four one finds either the aromatic phenylalanine or tyrosine; position five has either threonine or serine; position eight is always the aromatic tryptophan and position nine is a simple glycine, whereas at positions six, seven, and ten a large variety of amino acids is possible (17, 18).
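The position rules listed above can be written down as a small check. The following is a minimal sketch, not a validated classifier: the function names are hypothetical, sequences are assumed to be written as in this article with an N-terminal "pE", and the C-terminal amide is not represented.

```python
# Position rules transcribed from the paragraph above; keys are 1-based positions.
# Positions 6, 7 and 10 are left unconstrained because the text calls them variable.
ALLOWED = {
    2: set("LIVF"),  # aliphatic (Leu, Ile, Val) or aromatic Phe
    3: set("TN"),
    4: set("FY"),
    5: set("TS"),
    8: set("W"),
    9: set("G"),     # only checked when the peptide is longer than eight residues
}


def split_residues(seq):
    """Turn 'pELTFSPDW' into ['pE', 'L', 'T', 'F', 'S', 'P', 'D', 'W']."""
    if not seq.startswith("pE"):
        raise ValueError("expected an N-terminal pyroglutamate written as 'pE'")
    return ["pE"] + list(seq[2:])


def fits_akh_pattern(seq):
    """Return True if seq satisfies the length and position rules listed above."""
    residues = split_residues(seq)
    if not 8 <= len(residues) <= 10:
        return False
    return all(
        residues[pos - 1] in allowed
        for pos, allowed in ALLOWED.items()
        if pos <= len(residues)
    )


if __name__ == "__main__":
    for name, seq in [("Phote-HrTH", "pELTFSPDW"), ("Tabat-HoTH", "pELTFTPGWGY")]:
        print(name, seq, fits_akh_pattern(seq))
```

Such a check only reflects the general pattern described in the text; a sequence that passes is not thereby shown to be a biologically active AKH.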

The first two AKH peptides of Diptera fully known by primary structure were isolated from the CC of the horse fly Tabanus atratus (Family: Tabanidae) and had the sequence of an octapeptide (pELTFTPGW amide) and a decapeptide [pELTFTPGWGY amide; thus, the octapeptide extended by two amino acids, (19)]. The code-names for these peptides were derived from their biological activity ascertained in heterologous biological assays: Tabat-AKH for the octapeptide and Tabat-HoTH for the decapeptide because these peptides, respectively, increased the lipids in the hemolymph (adipokinetic effect) or decreased the concentration of trehalose (hypotrehalosemic effect) in the hemolymph of the face fly Musca autumnalis (dipteran Family: Muscidae) (19). Homologous bioassays in the horse fly Tabanus lineolata set the record straight and came up with some strikingly different results: the octapeptide, which was called an "adipokinetic" peptide, had no hyperlipemic effect and caused only a slight increase of carbohydrates in the hemolymph of T. lineolata, whereas the decapeptide, called a "hypotrehalosemic peptide," was a very effective adipokinetic and hyperglycemic hormone (20). Regardless of the specific action of these peptides in different insect species, they are generically known as members of the adipokinetic hormone family, in short AKHs.

The next dipteran AKH was elucidated a year after the tabanid peptides, isolated and sequenced from CC of the blowfly Phormia (=Protophormia) terraenovae (Family: Calliphoridae) as an octapeptide [pELTFSPDW amide; (21)] that regulates carbohydrate metabolism and is released during induced stress (22) and was, hence, code-named Phote-HrTH. This was the first member of the AKH family that had a charged amino acid; all other AKHs up to then were neutral molecules. A peptide with the same sequence was found at the same time in the vinegar fly D. melanogaster [Family: Drosophilidae; (23)], where biological assays established its activity as cardioacceleratory in prepupae of D. melanogaster (24). Phote-HrTH has subsequently been identified in a number of dipteran species (including in larval specimens of Drosophila melanogaster, see **Table 1**) and is also known for its modulatory action on specific crop muscles of Phormia regina (44) that enables delivery of ingested food from the crop to the midgut for digestion. In the post-genomic era, D. melanogaster has become the established model insect for research on energy metabolism at the molecular level, with engineered AKH mutants, transgenic insect lines with specific modified cellular components (e.g., transcription factors, ATP-sensitive K<sup>+</sup> channels, cGMP-dependent protein kinase), ablated endocrine cells, nutrient stress and other parameters, providing information to elucidate the pathways involved in glucose and lipid homeostasis, as well as disease states such as obesity, diabetes and aging [see recent reviews by Gáliková and Klepsatel (45) and Marco and Gäde (18)].

In the malaria mosquito, Anopheles gambiae (Family: Culicidae), genomic information, gene cloning and physiological experimentation led to the identification of a hypertrehalosemic hormone as an octapeptide with the sequence pELTFTPAW amide [= Anoga-HrTH, (27, 46, 47)]. Similar genomic and immunocytochemical methods identified an AKH with the sequence pELTFTPSW amide (= Aedae-AKH) in the CC of the yellow fever mosquito, Aedes aegypti (Family: Culicidae), confirmed by mass spectrometric measurements (25, 26, 48). An AKH gene cloned from the West Nile virus vector, Culex pipiens (= C. quinquefasciatus), yielded an identical AKH, i.e., Aedae-AKH (25). Interestingly, Aedae-AKH as mature peptide was detected and sequenced by mass spectrometry in the alder fly Sialis lutaria, which belongs not to Diptera but to the order Megaloptera (49). Kaufmann et al. (25) identified two AKH genes when mining publicly accessible genetic data from the tsetse fly Glossina morsitans (Diptera, Family: Glossinidae); the deduced mature peptides are the well-known Phote-HrTH and a novel octapeptide with the sequence pELTFSPGW amide (= Glomo-AKH). Both of these genes were later cloned (36) and annotated during whole genome sequencing (50) of G. morsitans, while the mature peptide sequences were verified by mass spectrometry (37). Functionally, Phote-HrTH increased the amount of lipids released from fat body tissue of adult female G. morsitans in vitro (36); the function of Glomo-AKH was not physiologically investigated.

From the above, it is evident that there is some sort of group-specificity in the AKH family peptide sequences of the Diptera investigated to date, but there is also variation in the number of sequences per species and even in the length of the peptide sequence. Further elaboration of AKH sequences may also be useful in ascertaining the overlap of sequence identity in beneficial and pest dipteran species for the potential of finding a suitable lead for developing peptide mimetics that would target specifically pest dipterans. An additional aim of the current study therefore is to provide a comprehensive list of putative AKH sequences from Diptera by mining publicly-accessible data bases.

### MATERIALS AND METHODS

### Insects

Adult specimens of unknown age and both sexes were used in this study. Fly species were either caught in the field by netting, purchased from breeders or received as a gift from a research institution. In total 19 species of Diptera were studied. Details of the fly species and the taxonomic affiliations are given below. For the latter, the phylogenetic outlines given by Pape et al. (51) and Wiegmann et al. (8) were followed (see also section Introduction).

### Suborder Lower Diptera (Formerly: Nematocera), Infraorder Tipulomorpha

Two species of crane fly (Family: Tipulidae) were investigated. Specimens of Tipula paludosa were caught on a meadow in Groß Raden in Mecklenburg-Western Pomerania (Mecklenburg-Vorpommern), Germany, by netting. Specimens of the other species (very likely Tipula oleracea based on morphological characters that we could ascertain) were caught on the wall of a hotel complex in northern Crete close to Heraklion.

### Suborder Brachycera, Infraorder Stratiomyomorpha

One species of soldier fly (Family: Stratiomyidae) was investigated. Specimens of Hermetia illucens were purchased from Illucens GmbH, Ahaus, Germany.

### Suborder Brachycera, Superfamily Syrphoidea

Six species of hover fly (Family: Syrphidae) were investigated. Specimens of Volucella pellucens, Volucella zonaria, Chrysotoxum cautum, and Eristalis ssp. were collected by netting from flowers of wild oregano (Origanum vulgare) in July and August 2017 and 2019 around Bad Iburg and Hagen, Germany. Specimens of Eristalis tenax and Allograpta fuscotibialis were netted on flowers in a private garden in Cape Town, South Africa.

### Suborder Brachycera, Superfamily Tephritoidea

One species of fruit fly (Family: Tephritidae) was investigated. Specimens of the Mediterranean fruit fly ("medfly") Ceratitis capitata were a gift from the Fruitfly Africa Medfly rearing facility in Stellenbosch, South Africa.

### Suborder Brachycera, Superfamily Hippoboscoidea

Three species of tsetse fly (Family: Glossinidae) were investigated. Specimens of Glossina fuscipes and G. austeni came from the Veterinary Institute in Onderstepoort, South Africa, whereas G. morsitans morsitans were a gift from the International Atomic Energy Agency in Vienna, Austria.

### Suborder Brachycera, Superfamily Muscoidea

Two species of the house fly (Family: Muscidae) and one species of kelp fly (Family: Anthomyiidae) were investigated.

TABLE 1 | The distribution of AKH peptides in the Diptera, to date: primary sequence and calculated protonated mass.



\*Mature peptide sequence: in the case of sequences derived from nucleotide databases, the post-translational modifications (i.e., blocked termini) are deduced from the characteristic features of previously-sequenced AKHs. Note that the N-terminal pyroglutamic acid (pE) can arise from the cyclization of a glutamine (Q) or, more rarely, from a glutamate (E) residue. In the case of prepro-AKH sequences translated from mRNA, all have a glutamine amino acid residue at position 1 of the uncleaved AKH sequence. Novel peptide sequences deduced from data mining are consecutively numbered (1–4)—note that we have not confirmed the data experimentally. \*\*Published work or database accession number.

<sup>a</sup>Other Anopheles species (accession no.): A. coluzzii (A0A182LR92); A. stephensi (A0A182Y983); A. albimanus (A0A182FZY6); A. arabiensis (A0A182IIN3); A. atroparvus (A0A182J042); A. christyi (A0A182K374); A. culicifacies (A0A182MX76); A. darlingi (A0A158N7M0-1); A. epiroticus (A0A182PMF71); A. funestus (A0A182S506); A. melas (A0A182TPX4); A. merus (A0A182VPI3); A. minimus (A0A182WR25); A. quadriannulatus (A0A182XUQ9); A. sinensis (A0A084VYF4-1).

<sup>b</sup>Other Drosophila species (accession no.): D. elegans (XP\_017127434.1); D. virilis (XP\_002046309.1); D. novamexicana (XP\_030561075.1); D. serrata (XP\_020807712.1); D. pseudoobscura pseudoobscura (XP\_001353031.1); D. guanche (SPP77431.1); D. obscura (XP\_022220454.1); D. kikkawai (XP\_017019402.1); D. willistoni (XP\_002068218.1); D. ananassae (XP\_001957722.1); D. arizonae (XP\_017860946.1); D. busckii (XP\_017842867.1); D. mojavensis (XP\_002012181.1); D. hydei (XP\_023165811.1); D. bipectinata (XP\_017101344.1); D. takahashii (XP\_017009635.1); D. rhopaloa (XP\_016978805.1); D. suzukii (XP\_016943637.1); D. yakuba (XP\_002093648.1); D. eugracilis (XP\_017071536.1); D. biarmipes (XP\_016966107.1); D. sechellia (XP\_002035290.1); D. erecta (XP\_001971844.1); D. simulans (XP\_002083583.1).

<sup>c</sup>Other Glossina species (gene identification number): G. pallidipes (GPAI036121; GPAI049064); G. austeni (GAUT013261; GAUT013267); G. palpalis (GPPI030617; GPPI030614); G. fuscipes (GFUI054167; GFUI054166); G. brevipalpis (GBRI027509; GBRI045557).

\*\*\*A tipulid species that is not T. paludosa, and most likely T. oleracea (although not confirmed by a dedicated taxonomist) was collected in Crete, and the AKH complement was elucidated via MS.

The phylogenetic outline is as per Wiegmann et al. (8).

Specimens of Musca domestica were a gift from Clintech (Pty), Ltd., Bryanston, South Africa; Musca autumnalis was caught in the wild in August 2019 around Bad Iburg, Germany, and the kelp fly Fucellia capensis was caught by netting on rotting kelp on the beach of Muizenberg, South Africa, in August 2019.

### Suborder Brachycera, Superfamily Oestroidea

Two species of blowfly (Family: Calliphoridae) and one of flesh fly (Family: Sarcophagidae) were investigated. Specimens of Calliphora vicina and Sarcophaga ssp. (very likely carnaria) were collected during July and August 2017 around Osnabrück, Germany, and Lucilia cuprina was a gift of Clintech (Pty), Ltd., Bryanston, South Africa.

### Species of the Order Mecoptera

Adults of both sexes of the common scorpion fly, Panorpa communis, were collected by netting during July 2017 around Bad Iburg, Germany.

### Biological Assays

Conspecific metabolic bioassays were performed with adult specimens of both sexes of the soldier fly Hermetia illucens, of which a larger number of individuals was available. Flies were acclimatized for about 1 h before experimentation at ambient temperature (22 ± 1 °C), each in a 15 ml plastic container with water-soaked cotton wool pieces. After this resting period, 0.5 µl of hemolymph was sampled dorsally from the thorax with a disposable glass microcapillary (Hirschmann Laborgeräte, Eberstadt, Germany), and the hemolymph was blown into concentrated sulfuric acid to measure vanillin-positive material (= total lipids) or anthrone-positive material (= total carbohydrates) according to the phosphovanillin method (52) and anthrone method (53), respectively, as modified by Holwerda et al. (54). Subsequently, the flies were injected ventrolaterally into the abdomen with 3 µl of either water (a control for assessing stress effects of handling), a crude corpora cardiaca extract, or a synthetic peptide using a Hamilton fine-bore 10 µl syringe. A second hemolymph sample was taken 90 min post-injection from the same individuals under resting conditions.
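Because each fly is sampled before and 90 min after injection, the two measurements are paired, which is why a paired t-test is used for Table 2. The sketch below shows how the paired t statistic can be computed from per-individual differences. The function name is hypothetical, and the numbers in the usage example are illustrative placeholders, not data from this study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, degrees_of_freedom) for paired pre- and post-injection values."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    # Work on the within-individual differences (post minus pre).
    diffs = [after - before for before, after in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


if __name__ == "__main__":
    # Placeholder values only (arbitrary units), NOT measurements from the study.
    pre_injection = [10.0, 12.0, 11.0, 13.0]
    post_injection = [12.0, 13.0, 13.0, 16.0]
    t, df = paired_t(pre_injection, post_injection)
    print(f"t = {t:.3f} with {df} degrees of freedom")
```

The significance level then follows from comparing t with the t distribution for the given degrees of freedom, for which a statistics package would normally be used.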

### Dissection of Corpora Cardiaca, Peptide Extraction, Mass Spectrometry, and Sequence Analysis

Corpora cardiaca (CCs) were microdissected from the head/neck region of adult flies from each species under investigation using a stereomicroscope at about 20- to 30-fold magnification; CCs were placed into 80% v/v methanol. Peptide material was extracted from the dissected CCs as outlined previously (55). The vacuum-centrifuged dried extracts were dissolved in 50 µl of aqueous formic acid for liquid chromatography tandem positive ion electrospray mass spectrometry (LC-MS<sup>n</sup>) on an LTQ XL linear ion trap instrument (Thermo Fisher Scientific, San Jose, CA), as described in detail previously (56).

TABLE 2 | Biological activity of a crude methanolic extract of corpora cardiaca from the black soldier fly (Hermetia illucens), and the synthetic peptide Tabat-AKH in homologous bioassays.

<sup>a</sup>Data are presented as Mean ± SD.

<sup>b</sup>Paired t-test was used to calculate the significance between pre- and post-injection. NS, not significant.

For conspecific biological assays with the black soldier fly (see section Biological Assays), the dried CCs extracted from H. illucens were reconstituted in distilled water for injection into the flies.

### Synthetic Peptides

The following peptides were previously synthesized by Peninsula Laboratories (Belmont, CA, USA), Pepmic Co. Ltd. (Suzhou, China), or custom-synthesized by Dr. Kevin Clark (University of Georgia, Athens, USA): Phote-HrTH (= Drome-HrTH; pELTFSPDW amide), Glomo-AKH (pELTFSPGW amide), Tabat-AKH (pELTFTPGW amide), Tabat-HoTH (pELTFTPGWGY amide), Anoga-HrTH (pELTFTPAW amide), and Aedae-AKH (pELTFTPSW amide). The novel peptides of this study, Eriss-CC (pELTFSAGW amide), Volpe-CC (pELTFSPYW amide), Tippa-CC-I (pELTYSPSW amide) and Tippa-CC-II (pELTFSPSW amide) were all custom-synthesized by Pepmic Co. Ltd.

### Mining of AKH Sequences From Publicly Available Data Bases

The primary sequences of AKH family peptides in dipteran species were investigated by mass spectrometry (see Section Dissection of Corpora Cardiaca, Peptide Extraction, Mass Spectrometry, and Sequence Analysis). To compare and analyse dipteran AKHs in a phylogenetic manner, we identified further AKH sequences from other dipteran species via literature searches (i.e., from previously published texts), as well as via bioinformatics. Such in silico searches of dipteran protein, genomic and/or EST databases were conducted to identify translated amino acid sequences and transcripts encoding putative AKH peptide precursors.

Some AKH sequences were retrieved from VectorBase (https://www.vectorbase.org/), which is a Bioinformatics Resource for invertebrate vectors of human pathogens. The searches for AKH genes were conducted using the "search term" function, or via the BLAST (Basic Local Alignment Search Tool) search function, for which the nucleotide sequence of the Glossina morsitans AKH was used as the query. From the search results, the AKH peptide sequences contained within the deduced prepro-hormones were predicted from homology to known insect AKH isoforms. The associated UniProt accession numbers (https://www.uniprot.org/uniprot) were recorded. The UniProt Knowledgebase (UniProtKB) is the publicly available central hub for the collection of functional information on proteins.

The remainder of putative AKH sequences were obtained via homology searches using BLAST from the National Center for Biotechnology Information site (https://blast.ncbi.nlm.nih.gov/); AKH peptide precursors from G. morsitans were used as BLAST query. For all searches resulting in sequence identifications, the BLAST score and BLAST-generated E-value for significant alignment were considered.

### RESULTS AND DISCUSSION

### Function of Dipteran AKH: The Black Soldier Fly as a Test Case

We conducted a conspecific biological assay with only one dipteran species to show proof of principle that the CC extracts of dipteran species do have hypertrehalosemic biological activity. Unambiguous results reveal that conspecific CC extracts of the black soldier fly H. illucens injected into resting adults had a small but statistically significant hypertrehalosemic effect (**Table 2**). Such a hypertrehalosemic effect was also shown when the endogenous AKH octapeptide of the black soldier fly, Tabat-AKH, was injected, whereas the lipid concentration was not affected (**Table 2**). These results confirm previous data on the action of AKH peptides in a blowfly (22) and the malaria mosquito (47), viz. the mobilization of carbohydrates. When working with a crude CC extract, one could argue that the extract contains, of course, all other putative neuropeptide hormones. Although we are not aware of any neuropeptide that is in higher concentration in the CC and has a clear effect of increasing carbohydrates or lipids except the AKHs, we can only assume that the measured effect was induced by an AKH.

However, it is clear from studies on other Diptera that these peptides can also mobilize lipids in certain fly species (20, 36). Thus, as reported for other insect orders, it is the metabolic machinery of the species under investigation and the metabolic needs of an insect that determine activation of a lipase and/or a phosphorylase by AKHs (18, 57).

### Mass Spectral Analyses of AKHs Reveal Octapeptides in Major Clades of Diptera

Methanolic CC extracts from all 19 fly species were analyzed by mass spectrometry. Here we will show a few exemplary cases; we chose species from the infraorders Tipulomorpha and Stratiomyomorpha, as well as from the superfamilies Syrphoidea and Tephritoidea.

### Tipulomorpha: Tipula paludosa

The CC extract of the crane fly T. paludosa was fractionated (separated) by reversed-phase liquid chromatography (LC) and the peptides detected by positive ion electrospray mass spectrometry (MS). **Figure 1A** shows the base peak chromatogram; **Figures 1B–D** depict extracted mass peaks of AKHs at 6.46, 8.44, and 8.55 min, respectively, with the corresponding [M + H]<sup>+</sup> mass ions at m/z 963.5, 947.4, and 917.4, respectively. The primary structures of these peak materials were deduced from the tandem MS<sup>2</sup> spectra obtained by collision-induced dissociation (CID) of the respective m/z ions. The spectrum of m/z 963.4 (**Figure 2**) with clearly defined b, y, b-H2O, y-NH3, and other product ions allowed an almost complete assignment of a typical octapeptide member of the AKH family under the assumption that the peptide has a characteristic pyroglutamate residue at the N-terminus (see schematic inset in **Figure 2**). All other amino acids are assigned except at position two where the remaining mass of 113 can be accredited to one of the isomers leucine or isoleucine. Such a peptide with the sequence pGlu-Ile/Leu-Thr-Tyr-Ser-Pro-Ser-Trp amide had never been found in any insect, to date. It was hence code-named Tippa-CC-I (Tipula paludosa CC peptide I) and awaited clarification of the amino acid residue in position 2 through co-elution experiments with a synthetically made Leu<sup>2</sup> analog of the peptide (see below). The spectrum of m/z 947.4 (**Figure 3**) led to the interpretation of another octapeptide with the sequence pGlu-Ile/Leu-Thr-Phe-Ser-Pro-Ser-Trp amide which is also novel and, thus, called Tippa-CC-II with the same ambiguity about position 2 in the primary sequence. 
The spectrum of the third peptide at m/z 917.4 (**Figure 4**) resulted in the sequence interpretation pGlu-Ile/Leu-Thr-Phe-Ser-Pro-Gly-Trp amide which, with Leu<sup>2</sup>, is well-known under the name Glomo-AKH as one of the two peptides found in the tsetse fly (see section Introduction; **Table 1**). All three peptides were synthesized as Leu<sup>2</sup> isomers and co-elution experiments were performed; previously we had established that the isobaric Leu/Ile peptides have different LC retention times (58, 59). As depicted in **Figures S1A–I**, all three synthetic peptides had identical retention times to the natural peptides in the CC extract and have, therefore, the correctly assigned primary peptide sequence, confirming that leucine is the second amino acid residue also of the novel Tippa-CC-I and II. The same three masses were also identified from the tipulid species caught on Crete and support the finding of these three octapeptides in this infraorder.
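To illustrate why the CID spectra cannot distinguish Leu from Ile at position 2, the following sketch computes theoretical singly charged b and y ions for a pGlu-blocked, C-terminally amidated peptide from standard monoisotopic residue masses. It is a simplified illustration, not the software used in this study: the function names are hypothetical, and neutral losses (b-H2O, y-NH3) and other product ions are not modeled.

```python
# Monoisotopic residue masses (Da) for the residues occurring in the tipulid peptides.
RESIDUE_MASS = {
    "pE": 111.03203, "L": 113.08406, "I": 113.08406, "T": 101.04768,
    "F": 147.06841, "Y": 163.06333, "S": 87.03203, "P": 97.05276,
    "G": 57.02146, "W": 186.07931,
}
PROTON = 1.00728    # mass of the charge-carrying proton
AMMONIA = 17.02655  # NH3: accounts for the amidated C-terminus in y ions


def split_residues(seq):
    """Turn 'pELTYSPSW' into ['pE', 'L', 'T', 'Y', 'S', 'P', 'S', 'W']."""
    if not seq.startswith("pE"):
        raise ValueError("expected an N-terminal pyroglutamate written as 'pE'")
    return ["pE"] + list(seq[2:])


def fragment_ions(seq):
    """Return (b, y): dicts mapping ion number to theoretical m/z (charge 1+)."""
    masses = [RESIDUE_MASS[r] for r in split_residues(seq)]
    n = len(masses)
    # b ions contain the N-terminal residues; y ions contain the C-terminal ones.
    b = {i: sum(masses[:i]) + PROTON for i in range(1, n)}
    y = {i: sum(masses[n - i:]) + AMMONIA + PROTON for i in range(1, n)}
    return b, y


if __name__ == "__main__":
    b_leu, y_leu = fragment_ions("pELTYSPSW")  # Tippa-CC-I with Leu at position 2
    b_ile, y_ile = fragment_ions("pEITYSPSW")  # hypothetical Ile at position 2
    print("b2 (Leu):", round(b_leu[2], 4), " b2 (Ile):", round(b_ile[2], 4))
    print("identical ion series:", (b_leu, y_leu) == (b_ile, y_ile))
```

Since Leu and Ile share the same residue mass, both variants give the same theoretical ion series, which is why the assignment relies on co-elution with the synthetic Leu<sup>2</sup> peptide.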

When we look at the primary structures of the three AKHs of T. paludosa and the other Tipula species, it is obvious that they are closely related to each other and to the AKH known as Phote-HrTH from Phormia and Drosophila (**Table 1**).

### Stratiomyomorpha: Hermetia illucens

An extract from the CC of the black soldier fly shows a peak with a retention time of 8.88 min (**Figure S2A**) which corresponds in MS analysis to an [M + H]<sup>+</sup> ion of m/z 931.4 (**Figure S2B**). The CID spectrum gave clear product ions and led to the interpretation of an AKH with the primary structure pGlu-Ile/Leu-Thr-Phe-Thr-Pro-Gly-Trp-amide (**Figure S2C**) of which the Leu<sup>2</sup> form was established by a co-elution experiment (**Figures S2D–F**). This peptide is known as Tabat-AKH and is found in Tabanus atratus (19). Surprisingly, genetic work has proposed another peptide for this species, i.e., pGlu-Leu-Thr-Phe-Thr-Gly-Gln-Trp-amide, which differs at position 6 (Gly instead of Pro) and 7 (Gln instead of Gly), and in its calculated protonated mass of 962.4730 [see **Table 1**; (28)]. Although there were many other prominent peaks eluting from the CC extract in the current study (**Figure S2A**), these peaks did not correspond with the calculated mass of 962.47, nor did they relate to an AKH peptide sequence. Since a glycine residue has never been found at position 6 in any of the 90 AKHs known so far (G. Gäde personal communication), we suggest that there may be a misread in the genetic code and, hence, do not currently accept this sequence as a novel one until it has been confirmed (or refuted) by mass spectrometric methods.
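The calculated protonated masses quoted here can be reproduced from standard monoisotopic residue masses. The sketch below assumes a pGlu N-terminus and an amidated C-terminus, as for all AKHs in this article; the function name `mh_plus` is hypothetical and the residue table covers only amino acids that occur in the sequences spelled out in the text.

```python
# Monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "pE": 111.03203, "A": 71.03711, "D": 115.02694, "F": 147.06841,
    "G": 57.02146, "I": 113.08406, "L": 113.08406, "P": 97.05276,
    "Q": 128.05858, "S": 87.03203, "T": 101.04768, "V": 99.06841,
    "W": 186.07931, "Y": 163.06333,
}
PROTON = 1.00728    # mass of the added proton in [M + H]+
AMMONIA = 17.02655  # H at the N-terminal side plus NH2 of the C-terminal amide


def mh_plus(seq):
    """Monoisotopic [M + H]+ of a pGlu-blocked, C-terminally amidated peptide."""
    if not seq.startswith("pE"):
        raise ValueError("expected an N-terminal pyroglutamate written as 'pE'")
    residues = ["pE"] + list(seq[2:])
    return sum(RESIDUE_MASS[r] for r in residues) + AMMONIA + PROTON


if __name__ == "__main__":
    for name, seq in [
        ("Tabat-AKH", "pELTFTPGW"),
        ("sequence proposed from genetic data", "pELTFTGQW"),
    ]:
        print(f"{name}: {seq} amide, [M + H]+ = {mh_plus(seq):.4f}")
```

The two values differ by about 31 Da, far more than the mass accuracy of the measurement, so the two candidate sequences are readily told apart by their [M + H]<sup>+</sup> ions.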

### Syrphoidea: Eristalis ssp

An extract from the CC of a number of Eristalis specimens, possibly a mixture of a few species that are difficult to distinguish taxonomically, yielded four clearly identified AKHs. The extracted chromatograms reveal three AKHs with near-identical retention times but with different [M + H]<sup>+</sup> ions of m/z 891.4, 917.4, and 975.5; the fourth AKH with [M + H]<sup>+</sup> ion of m/z 1023.5 had a longer retention time (**Figures S3A–E**). The CID spectra gave clear evidence for the sequence of the respective AKHs. Here we show only the CIDs for [M + H]<sup>+</sup> 891.4 and 1023.5 because they represent novel AKHs. The peptide with [M + H]<sup>+</sup> 917.4 (= Glomo-AKH) has been dealt with in the earlier example (see Tipulomorpha: Tipula paludosa) while [M + H]<sup>+</sup> 975.5 (= Phote-HrTH) will be discussed later (see Tephritoidea: Ceratitis capitata). The primary structure of an AKH representing [M + H]<sup>+</sup> 891.4 was deduced as pGlu-Leu/Ile-Thr-Phe-Ser-Ala-Gly-Trp amide (**Figure 5A**) and the one with [M + H]<sup>+</sup> 1023.5 was assigned the sequence pGlu-Leu/Ile-Thr-Phe-Ser-Pro-Tyr-Trp amide (**Figure 5B**). The ambiguity at position two was resolved in favor of Leu in both cases by co-elution experiments with the synthetic peptide (**Figures S3F–Q**). Both peptides are novel, i.e., they have not been found in any insect before, and we assign them the code-name Eriss-CC for the peptide with MH<sup>+</sup> 891.4 because it occurs in an Eristalis subspecies (see **Table 1**: Eristalis subspecies 2), and Volpe-CC for the peptide with MH<sup>+</sup> 1023.5 as this peptide was discovered in the unambiguously defined species Volucella pellucens and all other syrphid species in the current study (**Table 1**).

### Tephritoidea: Ceratitis capitata

An extract from the CC of the fruit fly shows only one peak with a retention time of 8.81 min (**Figure S4A**) which corresponds in MS analysis to an [M + H]<sup>+</sup> ion of m/z 975.4 (**Figure S4B**); the CID (**Figure S4C**) determines a sequence of pGlu-Leu/Ile-Thr-Phe-Ser-Pro-Asp-Trp amide. Co-elution with synthetic peptide (**Figures S4D–F**) unequivocally determines Leu at position 2 and, hence, characterizes this peptide as Phote-HrTH found previously in certain flies [(21, 23); see **Table 1**].

### AKHs of Diptera Are Order-Specific

The CC extracts of all the other Diptera investigated in the current study were analyzed in the same way as in the above examples. The results are given in **Table 1**. In this table we present the taxonomic units of Diptera in the phylogenetic relationship as outlined by Wiegmann et al. (8).

Additionally, previously published data on AKH sequences are incorporated in this table as are sequences that were "mined" from genomic or transcriptomic data sets as outlined in Materials and Methods.

It is apparent from the combined data sets in **Table 1** that there are a few main points to be made about the AKHs in Diptera which will be discussed in more detail hereafter: (i) Dipteran AKHs are octapeptides. There is only one decapeptide exception. (ii) Current mass spectral investigations find 4 novel octapeptides from fly corpora cardiaca; an additional 4 novel octapeptides are found through data mining. (iii) Dipteran AKHs are specific for the order. There are only two exceptions.

To date, we know roughly 90 sequences of AKHs from insects. There is a clear bias toward the production of octapeptides as only one third of the known AKHs are decapeptides and only three are nonapeptides. Most of the decapeptide AKHs are found in the orders Hymenoptera, Hemiptera and Caelifera, while the nonapeptides are in Lepidoptera and Hemiptera. The order under scrutiny here, the Diptera, contains up to now only a single AKH decapeptide, viz. Tabat-HoTH in the horse flies. The presence of Tabat-HoTH can be well-explained by gene duplication since this decapeptide has the same amino acid sequence as the accompanying octapeptide in the horse fly but is elongated by two amino acids at the C-terminus. It remains to be seen whether another genus of the tabanids also has such a complement of AKHs. Unfortunately, our efforts with an extract prepared from long-frozen CCs of the common horse fly, Haematopota fluvialis, did not yield any mass spectral data for inclusion in the current work. Although gene duplication has taken place in other dipteran species, such as in crane flies and hover flies as shown in the current work, this has not led to decapeptides but to mutations yielding other octapeptides.

Before we started this study, 6 AKHs were known from the order Diptera (see history in section **Introduction**). The current study adds 8 further sequences: 4 novel AKH peptides (2 in the tipulids and 2 in the syrphids) found by unequivocal mass spectrometric studies, and 4 novel sequences (distributed in Culicomorpha, Stratiomyomorpha, Asiloidea, Diopsidae, and Drosophilidae) from data mining of public data bases, although it should be cautioned that the "mined" sequences have not been substantiated by mass spectrometry or peptide chemistry. Three of the four characterized novel AKHs and 1 of the mined AKHs share the same N-terminal pentapeptide pELTFS, while the fourth novel AKH shares pELTYS with one of the mined AKHs (**Table 1**). The remaining 2 of the 4 novel peptides found by data mining have the pentapeptide sequence pELTFT (**Table 1**). A scan of the dipteran AKH sequences in **Table 1** shows that the most prevalent amino acid residue in position 6 is proline, and there is a clear distribution pattern: AKHs with pELTFSP occur in the lower Diptera Tipulomorpha, while pELTFTP is present from the lower Diptera Psychodomorpha until its last appearance in the orthorrhaphan Brachycera Asiloidea, after which AKHs with pELTFSP reappear from the Brachycera Syrphoidea onwards.
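The grouping by N-terminal pentapeptide can be illustrated with the sequences that are spelled out in the text of this article. The sketch below is only a tabulation aid under that restriction: it does not include the mined sequences listed solely in Table 1, and the function name is hypothetical.

```python
from collections import defaultdict

# Sequences as given in the text (C-terminal amide omitted from the notation).
AKHS = {
    "Phote-HrTH": "pELTFSPDW",
    "Glomo-AKH": "pELTFSPGW",
    "Tabat-AKH": "pELTFTPGW",
    "Tabat-HoTH": "pELTFTPGWGY",
    "Anoga-HrTH": "pELTFTPAW",
    "Aedae-AKH": "pELTFTPSW",
    "Eriss-CC": "pELTFSAGW",
    "Volpe-CC": "pELTFSPYW",
    "Tippa-CC-I": "pELTYSPSW",
    "Tippa-CC-II": "pELTFSPSW",
}


def group_by_n_terminus(peptides, n_residues=5):
    """Group peptide names by their first n_residues residues."""
    groups = defaultdict(list)
    for name, seq in peptides.items():
        # 'pE' is one residue written with two characters, hence the +1.
        prefix = seq[: n_residues + 1]
        groups[prefix].append(name)
    return dict(groups)


if __name__ == "__main__":
    for prefix, names in sorted(group_by_n_terminus(AKHS).items()):
        print(prefix, "->", ", ".join(sorted(names)))
```

Calling the function with `n_residues=6` gives the corresponding split into pELTFSP, pELTFTP and the remaining hexapeptide prefixes.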

In contrast to almost all other insect orders it appears that AKHs in Diptera are relatively order-specific. Of the 14 AKHs now known from Diptera, only two are found outside of this order, viz. Aedae-AKH which is found in the only Megaloptera species investigated to date (49), and a new discovery from the current work: Glomo-AKH is found in a species from Mecoptera (see Putative Molecular Evolution of Dipteran AKHs). This is quite remarkable when considering the diversity and species richness of Diptera. Such order-specificity is at the moment also known from Lepidoptera (60) but not from Coleoptera (59, 61) and Orthoptera (13, 62). We cannot, at this point, exclude the possibility that future investigations may reveal AKH structures that are common to Diptera and various other orders.

### Putative Molecular Evolution of Dipteran AKHs

If we want to speculate about the molecular evolution of dipteran AKHs, we first have to find out what the putative ancestor AKH may be. For this it would be advantageous to know the AKHs of the closest relatives of the Diptera. According to Wiegmann et al. (63) the orders Mecoptera (scorpion flies) and Siphonaptera (fleas) are evolutionarily closest, and not the parasitic Strepsiptera as once postulated. Data mining revealed genomic information for a member of the Siphonaptera proposing a mature AKH of the cat flea, Ctenocephalides felis, with the sequence pELTFTPVW amide (XP\_026477356.1), which is also a predicted AKH for the robber fly Dasypogon diadema (QYTT01077274.1; see **Table 1**). We did not find any database information on Mecoptera and thus decided to include the common scorpion fly, Panorpa communis, in our study. The mass spectrometric data from the CC of this insect are clear: the AKH has the ionized mass (MH+) of 917.5 and the sequence of pELTFSPGW amide, thus Glomo-AKH (results not shown). Thus, both related orders have AKHs with sequences that also occur in Diptera, viz. Glomo-AKH in members of the Tipulidae, Syrphidae, and Glossinidae and the novel peptide (novel 2) in a member of the Asilidae (see **Table 1**). This may point to a common ancestor of the three orders with the sequence pELTFXPXW amide. Unfortunately, we do not have data on AKHs from the most basal fly families, which are species-poor and not very accessible, i.e., the Deuterophlebiidae and Nymphomyiidae (8). Currently, the AKHs known from Tipulidae are the ones from the most basal dipteran family. Since Glomo-AKH also occurs in the closely related order Mecoptera and in a number of dipteran families, we currently assume this molecule as parent to speculate and sketch a molecular evolution of the AKHs occurring in the various families of Diptera, as inferred by single point mutations. **Figure 6** clearly shows that this hypothetical peptide evolution could have occurred without introducing any new sequences to the ones listed in **Table 1**. Additionally, only in one case indicated in the figure would two nucleotides of the triplet have to mutate. Although speculative, this scheme of putative evolution has all the known AKHs of the lower Diptera (Tipulomorpha and Culicomorpha) on one side (left) of the flow chart, giving it some credibility.

FIGURE 6 | Hypothetical molecular evolution of adipokinetic peptides in Diptera. Glomo-AKH is assumed as ancestral peptide for this order. The amino acid substitution in each peptide is indicated in a larger font than in the peptide from which it is hypothetically derived. All substitutions are point mutations except the change from Tabat-AKH to the unconfirmed novel peptide 2 found by data mining. \*The switch from Gly<sup>7</sup> to Val<sup>7</sup> requires two base changes.
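The reported MH+ of 917.5 for pELTFSPGW amide can be checked with a short calculation. The sketch below is not part of the original analysis; it assumes standard monoisotopic residue masses, and the function name `mh_plus` is purely illustrative. The N-terminal pyroglutamate is modelled as a glutamate residue that has lost one water (cyclisation of glutamine with loss of ammonia would give the same product).

```python
# Monoisotopic residue masses (Da) for the amino acids occurring in pELTFSPGW amide.
RESIDUE_MASS = {
    "E": 129.04259, "L": 113.08406, "T": 101.04768, "F": 147.06841,
    "S": 87.03203, "P": 97.05276, "G": 57.02146, "W": 186.07931,
}
WATER = 18.01056          # H2O
AMIDATION_SHIFT = -0.98402  # C-terminal -OH replaced by -NH2
PROTON = 1.00728          # H+

def mh_plus(sequence, pyroglu=False, amidated=False):
    """Monoisotopic [M+H]+ of a linear peptide given in one-letter code.

    Illustrative helper (assumption, not from the article).
    """
    # Free peptide: sum of residues plus one water for the two termini.
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if pyroglu:
        # Cyclisation of the N-terminal Glu to pyroglutamate releases one water.
        mass -= WATER
    if amidated:
        mass += AMIDATION_SHIFT
    return mass + PROTON

if __name__ == "__main__":
    print(round(mh_plus("ELTFSPGW", pyroglu=True, amidated=True), 2))
```

Under these assumptions the calculated value is about 917.45, which agrees with the measured 917.5 at the stated precision.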

As previously shown, e.g., for Odonata, sequences of mature AKHs do not allow an insight into deeper phylogenetic relationships but give some indications of relatedness (10, 64). For example, the three peptides of the Tipulidae are very similar, as are the four of the Syrphidae and also the three in Drosophilidae (see **Table 1** and **Figure 6**). It is also clear that the vast majority of the cyclorrhaphan flies, which are monophyletic (8), only produce Phote-HrTH. It appears that quite a radiation of AKHs occurred early in the evolution of the Diptera, thus in the Tipulomorpha and Culicomorpha, which are thought to have radiated in the Triassic (8). The next wave of radiation according to Wiegmann et al. (8) occurred in the early Jurassic in the lower Brachycera, including the clades Tabanomorpha, Stratiomyomorpha and Asiloidea, all of which have their AKHs clustered together at the right side of **Figure 6**. The latest radiation involves the clade Schizophora between the earliest Paleocene and the Tertiary and includes the species in the Tephritoidea, Ephydroidea and Oestroidea, which all produce Phote-HrTH and its closest relatives (see middle of **Figure 6**).

### DATA AVAILABILITY STATEMENT

All datasets for this study are included in the article/**Supplementary Material**. Any other requests can be directed to Gerd Gäde, gerd.gade@uct.ac.za for metabolic data and Petr Šimek, simek@bclab.eu for mass spectrometric data.

### AUTHOR CONTRIBUTIONS

GG: concept and design of the study, acquisition of insect species and synthetic peptides, interpretation of the data and writing the draft manuscript. HM: co-designed the study, data acquisition (biological assays, dissection of insect corpora cardiaca and preparation of extracts, mining data bases for AKH sequences), interpretation and analyses of data, writing and refining the draft manuscript. PS: mass spectrometric analyses, data interpretation, and drafting of MS Figures for the manuscript.

### FUNDING

This work is based on the research supported in part by the National Research Foundation of South Africa: grant numbers 85768 [IFR13020116790] to GG and 109204 [IFR170221223270] to HM, and the University of Cape Town (Block Grants to GG and HM). PS was supported by the Czech Science Foundation No. 17-22276S.

### ACKNOWLEDGMENTS

The authors thank Ms. Pavla Kružberská for excellent technical support with MS measurements, all suppliers of insects mentioned in the main text, and colleagues for identification of certain species (Wolfgang Rutkies and Frank Wolf).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2020.00153/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Gäde, Šimek and Marco. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evolution of Neuropeptide Precursors in Polyneoptera (Insecta)

#### Marcel Bläser and Reinhard Predel\*

Department of Biology, Institute for Zoology, University of Cologne, Cologne, Germany

Neuropeptides are among the structurally most diverse signaling molecules and participate in intercellular information transfer from neurotransmission to intrinsic or extrinsic neuromodulation. Many of the peptidergic systems have a very ancient origin that can be traced back to the early evolution of the Metazoa. In recent years, new insights into the evolution of these peptidergic systems resulted from the increasing availability of genome and transcriptome data which facilitated the investigation of the complete neuropeptide precursor sequences. Here we used a comprehensive transcriptome dataset of about 200 species from the 1KITE initiative to study the evolution of single-copy neuropeptide precursors in Polyneoptera. This group comprises well-known orders such as cockroaches, termites, locusts, and stick insects. Due to their phylogenetic position within the insects and the large number of old lineages, these insects are ideal candidates for studying the evolution of insect neuropeptides and their precursors. Our analyses include the orthologs of 21 single-copy neuropeptide precursors, namely ACP, allatotropin, AST-CC, AST-CCC, CCAP, CCHamide-1 and 2, CNMamide, corazonin, CRF-DH, CT-DH, elevenin, HanSolin, NPF-1 and 2, MS, proctolin, RFLamide, SIFamide, sNPF, and trissin. Based on the sequences obtained, the degree of sequence conservation between and within the different polyneopteran lineages is discussed. Furthermore, the data are used to postulate the individual neuropeptide sequences that were present at the time of the insect emergence more than 400 million years ago. The data confirm that the extent of sequence conservation across Polyneoptera is remarkably different between the different neuropeptides. Furthermore, the average evolutionary distance for the single-copy neuropeptides differs significantly between the polyneopteran orders. 
Nonetheless, the single-copy neuropeptide precursors of the Polyneoptera show a relatively high degree of sequence conservation. Basic features of these precursors in this very heterogeneous insect group are explained here in detail for the first time.

Keywords: neuropeptides, transcriptome, Polyneoptera, insect evolution, Blattodea, Orthoptera, Phasmatodea, Dermaptera

#### Edited by:

Elizabeth Amy Williams, University of Exeter, United Kingdom

#### Reviewed by:

Sheila Ons, National University of La Plata, Argentina

Sven Bradler, University of Göttingen, Germany

> \*Correspondence: Reinhard Predel rpredel@uni-koeln.de

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 30 January 2020 Accepted: 19 March 2020 Published: 15 April 2020

#### Citation:

Bläser M and Predel R (2020) Evolution of Neuropeptide Precursors in Polyneoptera (Insecta). Front. Endocrinol. 11:197. doi: 10.3389/fendo.2020.00197

## INTRODUCTION

Neuropeptides are among the structurally most diverse signaling molecules in multi-cellular animal organisms. As such, they participate in intercellular information transfer from neurotransmission to intrinsic or extrinsic neuromodulation and regulate physiological processes including growth, reproduction, development, and behavior. For a single insect species, up to 50 neuropeptide genes can be expected coding for single or multiple copies of neuropeptides (1, 2). The sequences of single-copy neuropeptides, which are the focus of our study, are on average better conserved than those of multiple-copy peptides because amino acid (AA) substitutions in the single ligand of a particular neuropeptide receptor are potentially more likely to lead to a general loss of function than substitutions involving only one of several related neuropeptides produced from the same precursor. Thus, mutations that alter the sequences of single-copy neuropeptides must either be accompanied by parallel mutations in receptor genes that maintain the binding properties of the respective receptors or should not alter the steric properties of the peptides to maintain functionality (3). Most neuropeptides activate peptide-specific G-protein coupled receptors and many of these peptidergic systems have a very ancient origin that can be traced back to the early evolution of the Metazoa [e.g., (4)]. In fact, in many cases orthologies between neuropeptide and/or corresponding receptor genes of distantly related lineages can be identified (5–7). The identification of cockroach sulfakinins (8) with an already then suspected relationship to the cholecystokinins of vertebrates was a first strong indication of the conservation of peptidergic systems across protostomes and deuterostomes; these taxa diverged more than 700 million years ago (9, 10). Mirabeau and Joly (6) later described eight conserved peptidergic systems (vasopressin, neuropeptide Y/F, tachykinin, gonadotropin-releasing hormone/adipokinetic hormone, cholecystokinin/sulfakinin, neuromedin U/pyrokinin, corticotropin-releasing factor, calcitonin) which probably already occurred in the last common ancestor of Bilateria. More recently, Elphick et al. (4) have described as many as 30 neuropeptide signaling systems with orthologs in protostomes and deuterostomes.

Insects have always been a focus of neuropeptide research (11). Today, the fruit fly Drosophila melanogaster is also the model organism for the study of neuropeptide functions (12). However, due to the large number of harmful insects that have an impact on human health or reduce yields in agriculture and forestry, other groups of insects, in particular true bugs, beetles, lepidopterans, and various groups of flies, have also been intensively examined [e.g., (13–18)]. Beneficial insects such as honey bees and predators or parasites of pest insects [e.g., (19–21)] were also investigated in detail. Nevertheless, many neuropeptides in insects were first described from Polyneoptera such as locusts, cockroaches and stick insects, including proctolin, the first insect neuropeptide ever identified (22). In recent years, most of the insights into the evolution of peptidergic systems resulted from the increasing availability of genome and transcriptome data, which facilitated the investigation of the complete neuropeptide precursor sequences. Such data have been used in several comprehensive studies, mainly to compile neuropeptide precursor sequences within higher taxa (23–25). In a recent study, a large dataset of neuropeptide precursors from Blattodea was used to demonstrate the considerable phylogenetic information contained in these sequences (3). A particular focus on the general evolution of neuropeptide precursors was placed in a study on the precursors of 12 Drosophila species (26). In this study the authors also discussed the potentially different evolution of single-copy and multiple-copy precursors, with mutations in the neuropeptide sequences of single-copy precursors being exposed to stronger stabilizing selection.

In our study, we used a comprehensive transcriptome dataset of about 200 species from the 1KITE initiative (http://www.1kite.org/) to study the evolution of single-copy neuropeptide precursors in Polyneoptera. This group comprises well-known orders such as cockroaches and termites (Blattodea), locusts (Orthoptera) and stick insects (Phasmatodea), but also rather unknown orders with few species such as ice crawlers (Grylloblattodea), heel walkers (Mantophasmatodea), and angel insects (Zoraptera). The internal relationships of Polyneoptera were recently resolved as part of the 1KITE project (27). Due to their phylogenetic position within the Insecta (**Figure 1**) and the large number of old lineages, these insects are ideal candidates for studying the evolution of insect neuropeptides and their precursors. Furthermore, the taxon sampling of the 1KITE initiative includes multiple species from all higher lineages of Polyneoptera (with the exception of Zoraptera with only one species). This enabled us to discuss changes in single-copy neuropeptide precursor sequences with respect to insect evolution.

Our analyses include the orthologs of 21 single-copy neuropeptide precursors. The neuropeptides of several of these precursors were first described from Polyneoptera (cockroaches: proctolin, corazonin, myosuppressin; stick insects: HanSolin, RFLamide), but were later also found in other insects. Based on the sequences obtained, the degree of sequence conservation between and within the different polyneopteran lineages is discussed. Furthermore, the data are used to postulate the individual neuropeptide sequences that were present at the time of the insect emergence more than 400 million years ago, as well as to detect taxon-specific losses of peptidergic systems. The dataset from the 1KITE initiative has become "historic" and contains a number of transcriptomes with partly insufficient coverage of neuropeptide precursors. The few new transcriptomes we have prepared specifically for this study (Mantophasmatodea) show much better data coverage. Nevertheless, the comprehensive taxon sampling of the 1KITE initiative allows a sufficient validation of almost all statements we provide about the evolution of neuropeptide precursors in Polyneoptera.

## MATERIALS AND METHODS

### Orthology Assessment and Alignment of Neuropeptide Precursor Sequences

Orthology assessment and alignments were performed as described in Bläser et al. (3). Briefly, we mined transcriptome sequences, provided by the 1KITE initiative (GenBank Umbrella BioProject ID PRJNA183205), for single-copy neuropeptide precursors in the datasets from each order of Polyneoptera, starting with neuropeptide precursor sequences of Carausius morosus (2), Locusta migratoria (1) and Blattodea (3). Once a full set of single-copy neuropeptide precursors was obtained, we used this information to search for precursors in the remaining species of this order or species from related orders. Assembled transcripts were analyzed with the tblastn algorithms provided by NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Identified candidate nucleotide precursor gene sequences were translated into AA sequences using the ExPASy Translate tool (29) with the standard genetic code. Orthologous neuropeptide precursors were aligned using the MAFFT-L-INS-i algorithm (30) (dvtditr (aa) Version 7.299b alg=A, model=BLOSUM62, 1.53, −0.00, −0.00, noshift, amax=0.0). Alignments generated with the MAFFT-L-INS-i algorithm were then manually checked for misaligned sequences using N-termini of signal peptides and conserved AA residues (cleavage signals, Cys as target for disulphide bridges) as anchor points. Incompletely translated transcripts of neuropeptide precursors and transcripts of questionable quality were either combined to generate complete precursors when possible, or labeled with question marks at the respective AA positions.
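The translation step can be illustrated with a minimal sketch. It is not the ExPASy Translate tool itself, only a stand-in that applies the standard genetic code in all six reading frames; the function names and the example nucleotide string are hypothetical (the string was constructed so that one frame encodes the proctolin sequence RYLPT).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate one reading frame; unknown codons become 'X', stops '*'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frames(dna):
    """Translations of the three forward and three reverse frames."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

if __name__ == "__main__":
    # Made-up transcript fragment: Met, then RYLPT, then a stop codon.
    for frame, protein in six_frames("ATGCGTTATCTGCCGACCTAA").items():
        print(frame, protein)
```

In practice the frame of interest is the one in which the tblastn hit was found; the other frames are shown only to make the sketch complete.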

different lineages are added at the nodes (28). Ma, million years ago.

### Assessment of Precursor Characteristics

Individual AA alignments of each group of orthologous neuropeptide precursors from each order were merged in BioEdit 7.2.5 (31). The coverage of single-copy neuropeptide precursors in our dataset (**Additional File 1**), minimal and maximal length of precursors as well as number of identified transcripts and the position of the conserved neuropeptide sequences in the precursor were manually determined for each neuropeptide precursor in each order of Polyneoptera, respectively. Additionally, the predicted neuropeptide sequences in the individual AA alignments were determined and further analyzed using BioEdit 7.2.5. The lengths of these sequences as well as N-terminal and C-terminal cleavage sites were manually determined.

Alignments of single-copy neuropeptide precursors of the individual polyneopteran orders as well as combined alignments of all polyneopteran lineages were used to estimate the average evolutionary divergence (AED) over all sequence pairs in Mega X (32). Standard error estimates were obtained by implementing 500 bootstrap replicates using the Poisson correction model (33). We used the pairwise deletion option to remove all ambiguous sites. The results of these analyses are shown in **Additional File 3**. The median AED of each single-copy neuropeptide precursor was calculated with Microsoft Excel and compared to the overall AED value of the respective single-copy neuropeptide precursor. To calculate the overall AED value, complete sequences of each single-copy neuropeptide from each polyneopteran order (see **Additional File 1**) were merged into a single file and aligned again. Furthermore, the alignments of the predicted neuropeptide sequences were used to calculate the overall AED value for the conserved neuropeptide sequences for each neuropeptide. Finally, the overall median AED of all single-copy neuropeptides for each order was calculated using Microsoft Excel.

These analyses enabled us to compare internal sequence variation for each neuropeptide in all polyneopteran orders (AED for each order), as well as between orders (overall AED). The median AED also allowed an assessment of relative levels of sequence conservation. The overall AED of the predicted neuropeptide sequences enables a comparison of sequence conservation between neuropeptide sequences and the complete precursor sequence.
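The AED calculation can be sketched as follows. This is a simplified illustration rather than the Mega X implementation: it assumes that the Poisson-corrected distance between two aligned sequences is d = −ln(1 − p), where p is the proportion of differing sites, that pairwise deletion ignores any site where either sequence has a gap or ambiguity symbol, and that the AED is the plain mean over all sequence pairs. Bootstrap standard errors are omitted, and the function names and example sequences are illustrative only.

```python
import math
from itertools import combinations

# Symbols treated as missing data (assumption for this sketch).
MISSING = set("-?X")

def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance with pairwise deletion of missing sites."""
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in MISSING or b in MISSING:
            continue  # pairwise deletion: skip the site for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differences / compared
    if p >= 1.0:
        raise ValueError("distance undefined when all sites differ")
    return -math.log(1.0 - p)

def average_divergence(alignment):
    """Mean Poisson-corrected distance over all pairs of aligned sequences."""
    distances = [poisson_distance(a, b) for a, b in combinations(alignment, 2)]
    return sum(distances) / len(distances)

if __name__ == "__main__":
    # Three aligned decapeptides differing at single positions (toy input).
    toy = ["QVTFSRDWNA", "QITFSRDWNA", "QVTFSKDWNA"]
    print(round(average_divergence(toy), 3))
```

The median across orders, which the study computed in Microsoft Excel, could then be taken with `statistics.median` over the per-order values.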

### Sequence Logo Generation and Topology Mapping

Sequence logos of the aligned neuropeptide precursor orthologs were generated using the tool WebLogo version 2.8.2 (34). Each stack represents one position in the multiple sequence alignment. The overall height of a stack indicates the sequence conservation at this AA position; the height of letters within the stack indicates the relative frequency of each AA at that position. For the color scheme of AA residues, the default settings were selected. Resulting sequence logos were manually mapped (Adobe Illustrator CS6; version 16.0.0) on a simplified tree showing the phylogeny of Polyneoptera (27).
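How stack and letter heights arise can be shown with a small sketch. It assumes the usual information-content definition for protein logos (stack height = log2(20) minus the Shannon entropy of the column, letter height = relative frequency times stack height) and leaves out the small-sample correction that WebLogo applies. The function name and the treatment of "-", "X" and "?" as missing data are assumptions made for illustration.

```python
import math
from collections import Counter

def column_heights(alignment, alphabet_size=20):
    """Per column: (stack height in bits, {residue: letter height in bits})."""
    max_bits = math.log2(alphabet_size)
    stacks = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa not in "-X?"]
        total = len(residues)
        if total == 0:
            stacks.append((0.0, {}))
            continue
        freqs = {aa: n / total for aa, n in Counter(residues).items()}
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        stack = max_bits - entropy
        # Each letter takes a share of the stack proportional to its frequency.
        stacks.append((stack, {aa: f * stack for aa, f in freqs.items()}))
    return stacks

if __name__ == "__main__":
    for stack, letters in column_heights(["QVTF", "QITF"]):
        print(round(stack, 2), {aa: round(h, 2) for aa, h in letters.items()})
```

A fully conserved column reaches the maximum of log2(20) bits, while a column split evenly between two residues loses exactly one bit.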

## RESULTS AND DISCUSSION

The BLAST searches in the Polyneoptera transcriptome assemblies of the 1KITE initiative were performed with single-copy neuropeptide precursor sequences of C. morosus (2), L. migratoria (1), and Blattodea (3). Due to the varying quality of the transcriptome data and the generally low quantity of several neuropeptide-coding RNA sequences in whole body transcriptomes (see **Additional File 1**), the number of identified precursor sequences is significantly lower than the number of species analyzed. In addition, the yield differs between precursors; for example, far fewer CCHamide-1 precursors could be identified across the different lineages than precursors for other neuropeptides such as proctolin and NPF-1 (see **Additional File 1**). Nevertheless, the extensive material of the 1KITE initiative guaranteed sufficient information for almost all orders of Polyneoptera. The only exception was Zoraptera, where only a single transcriptome is available. In total, we have included 21 different single-copy precursors in our analysis. The precursors for adipokinetic hormones (AKHs) were not included in our study because the number of AKH genes varies considerably between and within the polyneopteran orders and the orthologies could not be resolved. All neuropeptide precursor sequences identified in this study are listed in **Additional File 2**, sorted by the different polyneopteran lineages. The phylogenetic relationships in Polyneoptera are illustrated in **Figure 2**, which also shows the estimated divergence times of the different orders. The estimated divergence times between orders vary between 300 and 150 Ma, which corresponds to the time scale of the parallel (independent) evolution of the respective precursors. Due to the long separate history of the two orthopteran lineages Caelifera and Ensifera (**Figure 2**), we have treated them separately in our analyses. The information about the neuropeptide sequences for each lineage is used in the following to determine the respective ancestral neuropeptide sequences for the Polyneoptera. A comparison with orthologs of Zygentoma and Remipedia (24) in many cases also allows a statement about the possible ancestral neuropeptide sequences of the insects or even of the hexapods.

### ACP (Additional File 3A)

A single adipokinetic hormone/corazonin-like peptide (ACP) precursor with a length of 85–109 AA is present in almost all polyneopteran orders. The only exception was found in Dermaptera, where the ACP precursor is absent. It is noteworthy that of the 37 phasmatodean species analyzed, an ACP precursor was found only in 5 species of Oriophasmata (35). However, the transcriptomes of Phasmatodea in general show a rather high percentage of missing data (see **Additional File 1**) and therefore the low number of ACP precursors in Phasmatodea might also be a result of this incomplete dataset. Each precursor contains a usually very well conserved ACP motif with an amidation site. The ACP sequence immediately follows the signal peptide and terminates upstream of a dibasic KR cleavage site.

FIGURE 3 | Sequence logo representation showing the degree of amino acid sequence conservation of the ACP neuropeptides for each order; mapped on a phylogenetic tree of Polyneoptera [modified from Wipfler et al. (27)]. Only the completely obtained precursor sequences were considered, the respective number is given in parentheses for each taxon. An "X" in the sequence represents a gap. The hypothetical ancestral state of the ACP sequence in Polyneoptera is listed at the top.

The ACP sequences are mostly decapeptides; only in one species of Embioptera (Rhagadochir virgo) is ACP a dodecapeptide. The sequence (p)QVTFSRDWNA-NH2, which also occurs in the remipedian X. tulumensis, was found in various orders of Polyneoptera (**Figure 3**). This sequence is likely ancestral for all Hexapoda. Amino acid substitutions in ACPs of Polyneoptera are largely limited to substitutions from Val<sup>2</sup> to Ile<sup>2</sup> (few Ensifera and Embioptera), Arg<sup>6</sup> to Lys<sup>6</sup> (Zoraptera, all Plecoptera), and several substitutions of the two C-terminal AA (**Figure 3**). The median average evolutionary divergence (AED) for the ACP precursor is 0.40 (**Figure 4**). Grylloblattodea show the lowest ACP precursor variation, while Plecoptera possess the most variable ACP precursor sequences. The overall AED for the ACP precursors is 0.79; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.14; **Figure 4**).
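The position-based notation used for substitutions (e.g., Val<sup>2</sup> to Ile<sup>2</sup>) can be generated mechanically from two aligned sequences of equal length. The sketch below is illustrative only: the function name is hypothetical, and the two derived sequences were constructed by applying a single substitution each to the proposed ancestral ACP, so they do not necessarily match the complete sequence of any particular species.

```python
# One-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def substitutions(ancestral, derived):
    """List substitutions between two equal-length sequences, 1-based positions."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{THREE_LETTER[a]}{pos}->{THREE_LETTER[b]}{pos}"
        for pos, (a, b) in enumerate(zip(ancestral, derived), start=1)
        if a != b
    ]

if __name__ == "__main__":
    ancestral_acp = "QVTFSRDWNA"
    print(substitutions(ancestral_acp, "QITFSRDWNA"))
    print(substitutions(ancestral_acp, "QVTFSKDWNA"))
```

Sequences of different length, such as the extended ACP of R. virgo, would first have to be aligned with gaps before a comparison of this kind is meaningful.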

### AST-CC (Additional File 3B)

A single allatostatin-CC (AST-CC) precursor with a length of 108–172 AA is present in all polyneopteran orders. The precursors contain a well-conserved AST-CC motif without a C-terminal amidation site. Only in Apteroperla tikumana (Plecoptera) was a second partial precursor with an alternative AST-CC motif found. The AST-CC sequence is always located at the C-terminus of the precursor. It is N-terminally flanked by a dibasic RR cleavage site [monobasic Arg in Tryonicus parvus (Blattodea)] and either terminates upstream of variable C-terminal cleavage sites or forms the very end of the precursor.

Most AST-CC sequences of Polyneoptera are nonadecapeptides; only some derived AST-CC sequences of Orthoptera have 18 (few Caelifera) or 20 AA [few Ensifera and T. parvus (Blattodea)]. The sequence GQQKGRVYWRCYFNAVTCF-OH, which also occurs in the silverfish T. domestica, was found in most orders of Polyneoptera (not in Dermaptera, Caelifera, Embioptera). This sequence might therefore be regarded as ancestral for all Pterygota. While the C-terminal motif YWRCYFNAVTCF-OH is highly conserved in all Polyneoptera, the N-terminus shows a number of lineage-specific AA substitutions, particularly at positions 2 and 3. The median AED for the AST-CC precursor is 0.24. Grylloblattodea and Mantophasmatodea show the lowest AST-CC precursor variation, while Embioptera possess the most variable AST-CC precursors. The overall AED for the AST-CC precursor is 0.49; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.08).

### AST-CCC (Additional File 3C)

A single allatostatin-CCC (AST-CCC) precursor with a length of 93–121 AA is present in all polyneopteran orders. The precursors contain a highly conserved AST-CCC motif with an amidation site. The AST-CCC sequence is always located at the C-terminus of the precursor; it is N-terminally flanked by a dibasic KR cleavage site and terminates upstream of a monobasic K cleavage site (KK in few species of Caelifera).
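The precursor layout described above (N-terminal KR site, peptide, amidation signal, C-terminal basic site) can be turned into a simple prediction rule. The sketch below is a deliberately simplified heuristic, not the procedure used in this study: it assumes that the peptide starts right after the first KR and that the amidation signal is a Gly immediately followed by a basic residue. The function name and the flanking residues of the toy precursor are hypothetical; only the embedded AST-CCC sequence is taken from the text.

```python
import re

def predict_amidated_peptide(precursor, n_terminal_site="KR"):
    """Predict an amidated peptide following a dibasic cleavage site.

    Returns the peptide with an '-NH2' suffix, or None if the expected
    pattern (site ... Gly + basic residue) is not present.
    """
    site = precursor.find(n_terminal_site)
    if site == -1:
        return None
    tail = precursor[site + len(n_terminal_site):]
    # Gly directly upstream of Lys/Arg is taken as the amidation signal.
    match = re.search(r"G(?=[KR])", tail)
    if match is None:
        return None
    return tail[:match.start()] + "-NH2"

if __name__ == "__main__":
    # Toy precursor fragment with invented flanking residues.
    print(predict_amidated_peptide("MAAAKRSYWKQCAFNAVSCFGKSEE"))
```

Note that the lysine inside the peptide is not treated as a cleavage site here because it is not preceded by Gly; real precursors need manual inspection, as was done for the alignments in this study.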

All AST-CCC sequences of Polyneoptera are tetradecapeptides. The sequence SYWKQCAFNAVSCF-NH<sub>2</sub>, which also occurs in the remipedian X. tulumensis, was found in all orders of Polyneoptera. This sequence might therefore be regarded as ancestral for all Hexapoda. Amino acid substitutions in AST-CCC are limited to a substitution from Lys<sup>4</sup> to Arg<sup>4</sup> in few species of Dermaptera and Embioptera and all species of Mantodea. The median AED for the AST-CCC precursor is 0.23. Mantophasmatodea show the lowest AST-CCC precursor variation, while Ensifera possess the most variable AST-CCC precursors. The overall AED for the AST-CCC precursor is 0.36; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.03).

In our study we could not find an AST-C precursor in any transcriptome. The presence of an AST-C precursor was previously suggested for L. migratoria (36, 37), but the corresponding AST-C neuropeptide has not been confirmed biochemically (e.g., fragment analysis).

### AT (Additional File 3D)

A single allatotropin (AT) precursor with a length of 102–142 AA is present in all polyneopteran orders. The precursor contains a usually well-conserved AT motif with a C-terminal amidation site. The AT sequence follows a precursor peptide (17–21 AA) inserted between the signal peptide and the AT sequence. Only in Dermaptera is the AT sequence directly C-terminal of the signal peptide. In all other taxa, the AT sequence is N-terminally flanked by a monobasic Arg cleavage site, while all sequences terminate upstream of a dibasic KR cleavage site. Specific features of AT precursors were found in Embioptera, where the species Haploembia palaui has a second AT precursor and another species (Ptilocerembia catherinae) has a second, longer AT motif immediately C-terminal of the first AT sequence.

With a single exception (Ensifera: Comicus calcaris; 12 AA), the AT sequences (AT-1 of P. catherinae) are tridecapeptides. The sequence GFKNVALSTARGF-NH<sub>2</sub>, which also occurs in the silverfish T. domestica, was found in many orders of Polyneoptera (not in Zoraptera, Dermaptera, Mantophasmatodea, Grylloblattodea, Embioptera). This sequence might therefore be regarded as ancestral for all Pterygota. Common AA substitutions of AT sequences affect positions 5 and/or 6 from the N-terminus, resulting in lineage-specific AA at these positions. Highly derived sequences of AT are typical of all Dermaptera and most Embioptera; in Dermaptera these substitutions even affect the N- and C-terminal AA, which are conserved in all other polyneopteran orders. The median AED for the AT precursor is 0.3. Grylloblattodea and Mantophasmatodea show the lowest AT precursor variation in Polyneoptera, while Embioptera possess the most variable AT precursors. The overall AED for the AT precursors is 0.48; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.10).

### CCAP (Additional File 3E)

A single crustacean cardioactive peptide (CCAP) precursor with a length of 143–174 AA is present in all polyneopteran orders. The precursor contains a fully conserved CCAP motif with a C-terminal amidation site. The CCAP sequence follows a precursor peptide (24–27 AA) inserted between the signal peptide and the CCAP sequence. The CCAP sequences are N-terminally flanked by dibasic KR cleavage sites and terminate upstream of KKR or RKR (few Orthoptera and Embioptera) cleavage sites. The sequence PFCNAFTGC-NH<sub>2</sub>, which also occurs in the remipedian X. tulumensis, was found in all orders of Polyneoptera. This sequence might therefore be regarded as ancestral for all Hexapoda. A single AA substitution from Phe<sup>6</sup> to Leu<sup>6</sup> was found in Creoxylus spinosus (Phasmatodea). The median AED for the CCAP precursor is 0.22. Grylloblattodea and Mantophasmatodea show the lowest CCAP precursor variation in Polyneoptera, while Plecoptera possess the most variable CCAP precursors. The overall AED for the CCAP precursors is 0.45; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.00).

### CCHamide-1 (Additional File 3F)

A single CCHamide-1 precursor with a length of 115–241 AA is present in almost all polyneopteran orders. The only exception was found in Ensifera, where the CCHamide-1 precursor is absent. The precursor contains a usually well-conserved CCHamide-1 motif with a C-terminal amidation site. The CCHamide-1 sequence in the precursor immediately follows the signal peptide and terminates upstream of a dibasic KR cleavage site.

With very few exceptions (N-terminally extended in Embioptera and possibly also in Caelifera), the predicted CCHamide-1 sequences are always tetradecapeptides. The sequence GSCLSYGHSCWGAH-NH<sub>2</sub>, which also occurs in the silverfish T. domestica, was found in most orders of Polyneoptera (not in Zoraptera). This sequence might therefore be regarded as ancestral for all Pterygota. Significant intraordinal variation is only present in Dermaptera, Plecoptera, Blattodea, and Mantodea. The most common AA substitution across different orders was that of Ala<sup>13</sup> to Gly<sup>13</sup> (Zoraptera and few Mantophasmatodea, Mantodea and Blattodea). The median AED for the CCHamide-1 precursor is 0.52. Mantophasmatodea show the lowest CCHamide-1 precursor variation, while Plecoptera have the most variable CCHamide-1 precursors. The overall AED for the CCHamide-1 precursors is 0.81; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.07).

### CCHamide-2 (Additional File 3G)

A single CCHamide-2 precursor with a length of 97-174 AA is present in all polyneopteran orders. The precursors contain a well-conserved CCHamide-2 motif with a C-terminal amidation site. The CCHamide-2 sequence in the precursor immediately follows the signal peptide and terminates upstream of a dibasic KR cleavage site. An analysis of the C. morosus peptidome (2) has shown that the N-terminal KR of CCHamide-2 is not recognized as a cleavage site, and we assume that the same N-terminus occurs in all polyneopteran CCHamide-2 peptides. Two precursors with different CCHamide-2 sequences were found in the species H. palaui (Embioptera). Additionally, in several species of Caelifera a second transcript of the CCHamide-2 precursor was found. In these species, 16 AA (P/SYGVRR/TPGD/AIQI/TRRAG) are inserted following the N-terminal KR in the respective CCHamide-2 sequences.

With few exceptions in Blattodea (Nocticola sp.: 16 AA; Catara rugosicollis, Coptotermes sp.: 14 AA), the predicted CCHamide-2 sequences are always pentadecapeptides. The sequence KRGCSAFGHSCFGGH-NH<sup>2</sup>, which also occurs in several silverfish species, but not T. domestica, was found in most orders of Polyneoptera (not in Plecoptera, Dermaptera). This sequence might therefore be regarded as ancestral for all Pterygota. Common AA substitutions in the CCHamide-2 sequences are Ala<sup>4</sup> to Ser<sup>4</sup> (Phasmatodea, several Blattodea, few Plecoptera) and Phe<sup>10</sup> to Tyr<sup>10</sup> (few Embioptera, Plecoptera, and Dermaptera). The AA at position 3 of the N-terminus shows lineage-specific substitutions in the majority of Dermaptera (Ser to Gln), Plecoptera (Ser to Thr) and Caelifera (Ser to Met). The median AED for the CCHamide-2 precursor is 0.37. Mantophasmatodea show the lowest CCHamide-2 precursor variation in Polyneoptera, while Dermaptera possess the most variable CCHamide-2 precursors. The overall AED for the CCHamide-2 precursors is 0.72; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.10).

### CNMamide (Additional File 3H)

CNMamide precursors with a length of 127–179 AA are present in all polyneopteran orders. Two transcripts with different CNMamide sequences were found in several Blattodea. The less commonly found longer transcripts (7 species of Blattodea) are orthologs of the single CNMamide precursor of Mantodea (sister group of Blattodea). In Caelifera, we identified two precursors in the species Haplotropis brunneriana and Pielomastax soochowensis. One of these precursors is very similar to the single CNMamide precursors of other Polyneoptera, while the second precursor, found in the majority of caeliferan species (27), has a rather variable and N-terminally extended sequence. In Ensifera, we have also identified two very different precursor sequences, but these sequences were always found in different species. It therefore remains unclear whether these sequences represent different transcripts or result from the rapid sequence diversification of CNMamide precursors. All precursors contain the CNMamide motif with a C-terminal amidation site. The CNMamide sequence is located C-terminally in the precursor, is N-terminally flanked by a dibasic KR cleavage site (KK in the ensiferan Acheta domesticus and Phaeophila crisbredoides), and terminates upstream of variable C-terminal cleavage sites (mostly RKR).

Most CNMamides are tetradecapeptides, but the full variation is from 13 to 18 AA. The sequence GSYMSLCHFKICNM-NH<sup>2</sup>, which also occurs in the silverfish T. domestica, was found in many orders of Polyneoptera (not in Dermaptera, Caelifera, Mantophasmatodea, Embioptera, Mantodea). This sequence might therefore be regarded as ancestral for all Pterygota. Common AA substitutions in the CNMamide sequences are Gly<sup>1</sup> to Asn<sup>1</sup> (all Mantophasmatodea), Gly<sup>1</sup> to Thr<sup>1</sup> (all Embioptera), and Ser<sup>5</sup>-Leu<sup>6</sup> to Thr<sup>5</sup>-Met<sup>6</sup> (all Mantodea). The CNMamide sequences of Caelifera, Blattodea, and Ensifera, in which two different precursors are present, are particularly variable at the N-terminus. A similar sequence variation was also found in Dermaptera and Plecoptera. The median AED for the CNMamide precursor is 0.44. Mantophasmatodea show the lowest CNMamide precursor variation in Polyneoptera, while Plecoptera possess the most variable CNMamide precursors. The overall AED for the CNMamide precursors is 0.95; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.26).

### Corazonin (Additional File 3I)

A corazonin precursor with a length of 85-140 AA was found in almost all polyneopteran orders. The only exception was found in Zoraptera, where the corazonin precursor is absent. For two species, Nippancistroger testaceus (Ensifera) and Medauroidea extradentata (Phasmatodea), we identified a second precursor with different corazonin sequences. Otherwise, the corazonin precursors contain highly conserved corazonin motifs with C-terminal amidation sites. The corazonin sequence in the precursor immediately follows the signal peptide and terminates upstream of an RKR cleavage site.

Corazonin sequences are almost exclusively undecapeptides. Only in one species of Dermaptera (Nesogaster amoenus), corazonin has 9 AA and the C-terminal dipeptide is missing. The sequence (p)QTFQYSRGWTN-NH2, which also occurs in Malacostraca and even Myriapoda, was found in most orders of Polyneoptera (not in Phasmatodea and Dermaptera). This sequence might therefore be regarded as ancestral for all Hexapoda. Peptidomics confirmed for different polyneopteran taxa that the N-terminal Gln of corazonin is almost completely converted to pGlu (38). Amino acid substitutions in corazonin of Polyneoptera are largely limited to substitutions from Arg<sup>7</sup> to His<sup>7</sup> (many Phasmatodea, Caelifera and Dermaptera). Considering the phylogenetic position of the respective insect taxa (**Figure 1**), the His<sup>7</sup> -corazonins probably evolved several times independently of each other. Significant intraordinal variation is only present in Dermaptera. In Mantophasmatodea, all species of a single lineage (Austrophasmatidae) have a unique corazonin sequence with two AA substitutions (Gln<sup>4</sup> to His<sup>4</sup> ; Arg<sup>7</sup> to Gln<sup>7</sup> ), while in all other Mantophasmatodea the original sequence is retained (39). The median AED for the corazonin precursor is 0.37. Mantophasmatodea show the lowest corazonin precursor variation, while Dermaptera possess the most variable corazonin precursor sequences. The overall AED for the corazonin precursors is 0.82, the actual neuropeptide sequence is significantly better conserved (overall AED: 0.07).

### CRF-DH (Additional File 3J)

A precursor for the corticotropin-releasing factor-like diuretic hormone (CRF-DH) was found with a length of 138-284 AA in almost all polyneopteran orders. A partial sequence (N-terminal) of a possible CRF-DH was identified for the single species of Zoraptera. This sequence is not further considered here. The CRF-DH precursors contain a mostly well-conserved CRF-DH motif with C-terminal amidation site. The CRF-DH sequence is located in the middle of the precursor and is flanked by dibasic KR cleavage sites in most polyneopteran taxa. Notable exceptions are the precursors of Dermaptera which have a C-terminal RKR cleavage site (not in Parapsalis infernalis) and lack the N-terminal KR cleavage motif. Therefore, the sequences of the mature CRF-DHs of Dermaptera cannot be predicted with certainty and require biochemical confirmation first.

Most CRF-DHs of Polyneoptera consist of 46 AA; shorter sequences are indicated for several Plecoptera (42-46 AA), Ensifera (45-46 AA), Caelifera (44-46 AA), Mantophasmatodea (44 AA), and a single species of Blattodea (45 AA in Nocticola). Dermaptera have N-terminally extended CRF-DHs (see above). Due to considerable sequence variations, particularly in Dermaptera, an ancestral sequence of CRF-DH for Polyneoptera cannot be determined with certainty. All species contain a consensus sequence of PSLSIVNxxDVLRQRxxLExxRxRMR within the CRF-DH. The number of variable AA (x) within this sequence decreases significantly if the CRF-DHs of Dermaptera are not considered (PSLSIVNxxDVLRQRLLLExARRRMR). The median AED for the CRF-DH precursor is 0.30. Mantophasmatodea show the lowest CRF-DH precursor variation, while Ensifera possess the most variable CRF-DH precursor sequences. The overall AED for the CRF-DH precursors is 0.62; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.31).
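A consensus of this kind, with invariant positions written as letters and variable positions as x, can be derived from an alignment as in the following sketch. It assumes gap-free aligned sequences of equal length; the three input fragments are hypothetical and serve only to show the notation, they are not CRF-DH sequences from the data set.

```python
def consensus_with_x(aligned):
    """Keep a residue where all aligned sequences agree, write 'x' elsewhere."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("expected a non-empty set of equal-length aligned sequences")
    return "".join(
        column[0] if len(set(column)) == 1 else "x"
        for column in zip(*aligned)
    )


# Hypothetical aligned fragments (illustration only).
fragments = [
    "PSLSIVNPMDVLRQR",
    "PSLSIVNSLDVLRQR",
    "PSLSIVNPLDVLRQR",
]
print(consensus_with_x(fragments))
```

Removing the most divergent sequences from the input reduces the number of x positions, which corresponds to the effect described above when the CRF-DHs of Dermaptera are excluded.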

### CT-DH (Additional File 3K)

A precursor for the calcitonin-like diuretic hormone (CT-DH) was found with a length of 107-178 AA in all polyneopteran orders. The CT-DH precursors contain a highly conserved CT-DH motif with C-terminal amidation site. The CT-DH sequence is located in the middle of the precursor, N-terminally flanked by a dibasic KR cleavage site, and terminates upstream of an RRRR cleavage site (RKRR in Plecoptera).

All CT-DHs of Polyneoptera consist of 31 AA. The sequence GLDLGLSRGFSGSQAAKHLMGLAAANYAGGP-NH<sup>2</sup>, which also occurs in the silverfish T. domestica, was found in most orders of Polyneoptera (not in Zoraptera, Dermaptera, Caelifera). This sequence might therefore be regarded as ancestral for all Pterygota. The remarkable sequence conservation of CT-DH is unique for such a long neuropeptide. The few AA substitutions in CT-DHs of Polyneoptera are often lineage-specific and cover all species within the corresponding insect orders: Phe<sup>10</sup> to Tyr<sup>10</sup> and Tyr<sup>27</sup> to Phe<sup>27</sup> (Zoraptera), Leu<sup>6</sup> to Met<sup>6</sup> (Dermaptera), Ser<sup>7</sup> to Asn<sup>7</sup>, and Ser<sup>13</sup> to Ala/Thr<sup>13</sup> (Caelifera). Significant intraordinal variation is only present in Caelifera. The median AED for the CT-DH precursor is as low as 0.15. Mantophasmatodea and Grylloblattodea show the lowest CT-DH precursor variation, while Dermaptera possess the most variable CT-DH precursor sequences. The overall AED for the CT-DH precursors is 0.44; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.05).

### Elevenin (Additional File 3L)

An elevenin precursor with a length of 99-170 AA was found in all polyneopteran orders. For a single species, Paratemnopteryx couloniana (Blattodea), we have identified a second and sequence-related precursor. The elevenin precursors contain quite variable elevenin motifs without C-terminal amidation site. The elevenin sequence in the precursor immediately follows the signal peptide and terminates upstream of a dibasic KR cleavage site (RKR or KKR in some Caelifera). In Caelifera, the predicted elevenin sequence contains a Lys<sup>2</sup>Arg<sup>3</sup> motif that might be used as cleavage site. However, as noted above for CCHamide-2, KR motifs that immediately follow the signal peptide sequence do not necessarily function as cleavage signals for prohormone convertases.

The elevenins of Polyneoptera are variable in length and consist of 17 (Embioptera) up to 22 AA (Dermaptera, multiple species of Blattodea). Due to considerable sequence variations (**Figure 5**), an ancestral sequence of elevenin for Polyneoptera cannot be determined. Most polyneopteran taxa have species with a consensus C-terminus of CRGVAA-OH (CRGASA-OH in Mantophasmatodea) and a conserved position of the two Cys residues; specific features also found in T. domestica. Significant intraordinal variation is present in Dermaptera, Caelifera, Embioptera, Phasmatodea, and Blattodea. The median AED for the elevenin precursor is 0.42. Grylloblattodea and Mantophasmatodea show the lowest elevenin precursor variation, while Caelifera possess the most variable elevenin precursor sequences. The overall AED for the elevenin precursors is 0.87, the actual neuropeptide sequence is better conserved (overall AED: 0.45).

### HanSolin (Additional File 3M)

HanSolin was recently described from C. morosus (2). A subsequent search for HanSolin in Coleoptera (25) also revealed orthologous precursors in these holometabolous insects. Here we found a single HanSolin precursor with a length of 88-139 AA in all polyneopteran orders. The HanSolin precursors contain quite variable HanSolin motifs with a conserved C-terminus, including a C-terminal amidation site. HanSolin is always located C-terminal in the precursor, N-terminally mostly flanked by a monobasic Arg cleavage site (RR in Mantophasmatodea, multiple Caelifera and Ensifera), and terminates upstream of a dibasic C-terminal RR cleavage site (occasionally KR or monobasic R for different orders).

The HanSolins of Polyneoptera seem to be very variable in length and consist of 8 (Caelifera) up to 16 AA (Ensifera). In many cases, the N-terminal sequences of the predicted mature peptides require biochemical confirmation. Due to considerable sequence variations, an ancestral sequence of HanSolin for Polyneoptera cannot be determined. Most polyneopteran taxa have species with a consensus C-terminal hexapeptide of GQPLRW-NH<sup>2</sup> (GMPLRF-NH<sup>2</sup> in Zoraptera, GLPLRW-NH<sup>2</sup> in Mantophasmatodea); this C-terminus is also found in T. domestica (Bläser and Predel, unpublished). Significant intraordinal variation is present in Dermaptera, Plecoptera, Caelifera, Ensifera, Embioptera, and Phasmatodea. The median AED for the HanSolin precursor is 0.39 (**Figure 6**). Mantophasmatodea show the lowest HanSolin precursor variation, while Ensifera and Plecoptera possess the most variable HanSolin precursor sequences. The overall AED for the HanSolin precursors is 0.83; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.33).

### MS (Additional File 3N)

A myosuppressin (MS) precursor with a length of 143-174 AA is present in all polyneopteran orders. All of these precursors contain a highly conserved MS motif with a C-terminal amidation site. For three species, L. migratoria, Prosarthria teretrirostris (both Caelifera) and Hemimerus sp. (Dermaptera), we identified a second precursor with related MS sequences. We also found a second transcript with identical MS sequences in five species of Mantodea, but with an insertion of 39 AA in the middle part of the precursor. The MS sequence is always located C-terminal in the precursor, N-terminally flanked by a dibasic KR cleavage site and terminates upstream of an RRR cleavage motif.

FIGURE 5 | Sequence logo representation showing the degree of amino acid sequence conservation of the HanSolin neuropeptides for each order; mapped on a phylogenetic tree of Polyneoptera [modified from Wipfler et al. (27)]. Only the completely obtained precursor sequences were considered, the respective number is given in parentheses for each taxon. An "X" in the sequence represents a gap. The hypothetical ancestral C-terminus of the HanSolin sequence in Polyneoptera is listed at the top.

MS sequences are almost exclusively decapeptides. Only in one species of Blattodea (Reticulitermes santonensis) is MS a highly derived undecapeptide (KEDSQHMFLRF-NH2). The sequence (p)QDVDHVFLRF-NH2, which also occurs in Remipedia, was found in most orders of Polyneoptera (not in Caelifera and Dermaptera). This sequence might therefore be regarded as ancestral for all Hexapoda. Peptidomics confirmed for different polyneopteran taxa that the N-terminal Gln of MS is only partially converted to pGlu [e.g., (2, 40)]. With the exception of the MS of R. santonensis (see above), AA substitutions are restricted to the N-terminal AA (P/T in Caelifera, H in several Ensifera) and position 6 (Val<sup>6</sup> to Ile<sup>6</sup> in Dermaptera). The second precursor of few species contains additional AA substitutions (EDVGHVFLRF-NH<sup>2</sup> in L. migratoria; KDIEHVFLRF-NH<sup>2</sup> in P. teretrirostris, QDVHHNFLRF-NH<sup>2</sup> in Hemimerus sp.). The median AED for the MS precursor is 0.26. Grylloblattodea and Mantophasmatodea show the lowest MS precursor variation in Polyneoptera, while Ensifera possess the most variable MS precursors. The overall AED for the MS precursors is 0.57; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.05).
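To give an intuition for how sequence divergence between peptides of equal length can be quantified, the following sketch computes an uncorrected mean pairwise distance (proportion of differing positions) for the MS sequences quoted above. This measure is an assumption chosen for illustration: the AED values reported in this study were obtained as described in the Methods and are not reproduced by this code.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_distance(seqs) -> float:
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# MS decapeptides quoted in the text (C-terminal amide omitted).
ms_sequences = [
    "QDVDHVFLRF",  # most common sequence
    "EDVGHVFLRF",  # second precursor, L. migratoria
    "KDIEHVFLRF",  # second precursor, P. teretrirostris
    "QDVHHNFLRF",  # second precursor, Hemimerus sp.
]
print(p_distance(ms_sequences[0], ms_sequences[1]))
print(mean_pairwise_distance(ms_sequences))
```

Sequences of different length, such as the undecapeptide of R. santonensis, would first have to be aligned, with gaps treated according to the chosen distance model.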

### NPF-1 (Additional File 3O)

The insect neuropeptide F-1 (NPF-1) precursor with a length of 81–97 AA was found in all polyneopteran orders. In most taxa we have also identified a second and longer transcript showing an insertion of about 40 AA in the middle of the NPF-1 neuropeptide (= NPF-1b). Such transcripts are also known from T. domestica and are therefore a basic feature of Pterygota. The remaining sequences of NPF-1 are identical in both transcripts. The NPF-1 precursors contain a well-conserved NPF-1 motif with C-terminal amidation site. To a slightly lesser extent, this also applies to the insertion in the long transcript. The NPF-1 sequence in the precursor immediately follows the signal peptide and terminates upstream of a dibasic KR cleavage site.

Most NPF-1a neuropeptides of Polyneoptera consist of 33 AA, but the full range is 30–36 AA. The C-terminus LQELDRYYSQVARPRF-NH<sup>2</sup> is fully conserved in the majority of Polyneoptera (E to M/R in Embioptera and Mantophasmatodea; V to N/K in Plecoptera). Particularly significant intraordinal variation of the N-terminus is present in Dermaptera, Plecoptera, Caelifera, Grylloblattodea, and Embioptera. The median AED for the NPF-1 precursor is 0.27. Mantophasmatodea and Mantodea show the lowest NPF-1 precursor variation, while Ensifera and Plecoptera possess the most variable NPF-1 precursor sequences. The overall AED for the NPF-1 precursors is 0.45; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.19).

### NPF-2 (Additional File 3P)

An insect neuropeptide F-2 (NPF-2) precursor with a length of 85–134 AA was found in all polyneopteran orders. The NPF-2 precursors contain a quite variable NPF-2 motif with a well-conserved C-terminus and C-terminal amidation site. The NPF-2 sequence in the precursor immediately follows the signal peptide and terminates upstream of a dibasic KR cleavage site.

The NPF-2 neuropeptides of Polyneoptera are variable in length and consist of 43–47 AA. Only the C-terminus PRF-NH<sup>2</sup> is fully conserved in all analyzed species, while a C-terminal RPRF-NH<sup>2</sup> was found in at least some members of every order of Polyneoptera. Thus, the information about the C-terminal AA is not sufficient to distinguish between the neuropeptides NPF-1 and NPF-2. Particularly significant intraordinal variation of the N-terminus is present in Dermaptera, Plecoptera, Ensifera, and Embioptera. The median AED for the NPF-2 precursor is 0.29. Mantophasmatodea and Mantodea show the lowest NPF-2 precursor variation, while Plecoptera possess the most variable NPF-2 precursor sequences. The overall AED for the NPF-2 precursors is 0.65; the actual neuropeptide sequence is slightly better conserved (overall AED: 0.41).

### Proctolin (Additional File 3Q)

A proctolin precursor with a length of 74–104 AA is present in almost all polyneopteran orders. The only exception was found in Dermaptera, where the proctolin precursor is absent. For the single species of Zoraptera (Zorotypus caudelli) we identified a second precursor. All these precursors contain a highly conserved proctolin motif without a C-terminal amidation site. The proctolin sequence immediately follows the signal peptide and terminates upstream of a monobasic Arg cleavage site.

Proctolin sequences are exclusively pentapeptides. The sequence RYLPT-OH, which also occurs in Remipedia and even Myriapoda, was found in all orders of Polyneoptera. This sequence might therefore be regarded as ancestral for Hexapoda. With the exception of the proctolin of Systella rafflesii (Pro<sup>4</sup> to His<sup>4</sup>) and Zubovskia sp. (Thr<sup>5</sup> to Val<sup>5</sup>; both Caelifera), all species possess the original sequence RYLPT-OH. The median AED for the proctolin precursor is 0.31. Grylloblattodea, Mantophasmatodea, and Mantodea show the lowest proctolin precursor variation in Polyneoptera, while Ensifera and Caelifera possess the most variable proctolin precursors. The overall AED for the proctolin precursors is 0.52; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.00).

### RFLamide (Additional File 3R)

RFLamides were only recently described from C. morosus (2). An RFLamide precursor with a length of 122–211 AA is present in all polyneopteran orders. Most precursors contain a well-conserved RFLamide with a C-terminal amidation site. The RFLamide sequence is always located C-terminal in the precursor, N-terminally flanked by an RKR or RRR cleavage site (dibasic RR in Protonemura ausonia, Plecoptera) and terminates upstream of quite variable cleavage motifs (monobasic Arg up to 5 basic AA which terminate the precursor sequence).

RFLamides are mostly duodecapeptides. Only in Mantophasmatodea and Grylloblattodea are the RFLamides 14-mers with an extended C-terminus. In these taxa the first Arg of the original cleavage motif is replaced by a Met, resulting in RFLamides with two additional AA without a C-terminal amidation site. The unique C-terminus of these insects is a remarkable synapomorphy of Mantophasmatodea and Grylloblattodea. The sequence PASAIFTNIRFL-NH<sup>2</sup> was found in most orders of Polyneoptera (not in Zoraptera, Mantophasmatodea, Grylloblattodea). This sequence might therefore be regarded as ancestral for all Polyneoptera (**Figure 7**). Amino acid substitutions in RFLamides of Polyneoptera are largely limited to substitutions of Ser<sup>3</sup>, Ala<sup>4</sup>, and Ile<sup>5</sup>. Mantophasmatodea and Grylloblattodea have the most derived sequences, each with several lineage-specific features in addition to the distinct C-terminus, which is identical in both taxa. Significant intraordinal variation is present in Plecoptera, Ensifera, and Embioptera. The median AED for the RFLamide precursor is 0.28 (**Figure 8**). Grylloblattodea (only 2 species) show the lowest RFLamide precursor variation in Polyneoptera, while Plecoptera possess the most variable RFLamide precursors. The overall AED for the RFLamide precursors is 0.62; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.14).

### SIFamide (Additional File 3S)

A SIFamide precursor with a length of 71–103 AA is present in all polyneopteran orders. In two blattodean species (Prorhinotermes and Schultesia) we found a second SIFamide precursor with identical SIFamide sequences; these probably represent slightly different alleles. Most precursors contain a very well-conserved SIFamide with a C-terminal amidation site. An exception was found in R. virgo (Embioptera), which has three consecutive copies of SIFamide in the precursor and therefore cannot be treated as a single-copy peptide. The SIFamides (actually SIYamides) of this species do indeed have more derived sequences and are not considered in our analyses. Whether the transition to multiple copies is a specific feature of Embioptera cannot yet be determined, since we could not find SIFamide precursors in the other Embioptera species. The SIFamide sequence in the precursors of all other species immediately follows the signal peptide and terminates upstream of a dibasic KR cleavage site.

The SIFamides are mostly duodecapeptides or longer. Only in the SIFamides of Dermaptera the N-terminal AA is missing (= undecapeptides). Since the length of the signal peptide of SIFamide precursors cannot always be predicted with certainty (41), the N-terminus of SIFamides should be confirmed by peptidomics in taxa not yet examined. The sequence TYRKPPFNGSIF-NH<sup>2</sup> was found in most orders of Polyneoptera (not in Dermaptera). This sequence might therefore be regarded as ancestral for all Polyneoptera. In the SIFamide of T. domestica (Zygentoma) only the N-terminal AA is different (Thr<sup>1</sup> -Gly<sup>1</sup> ). Amino acid substitutions in SIFamides of Polyneoptera are largely limited to substitutions of Thr<sup>1</sup> and Tyr<sup>2</sup> . The median AED for the SIFamide precursor is 0.22. Grylloblattodea show the lowest SIFamide precursor variation in Polyneoptera, while Caelifera possess the most variable SIFamide precursors. The overall AED for the SIFamide precursors is 0.5, the actual neuropeptide sequence is significantly better conserved (overall AED: 0.09).

A second gene coding for a related neuropeptide, SMYamide, was described for L. migratoria (Caelifera) and Zootermopsis nevadensis [Blattodea; (1)]. We found SMYamide precursors in Caelifera, Ensifera, Embioptera, Phasmatodea, Mantodea, and Blattodea. The phylogenetic position of these taxa suggests that the SMYamide gene has evolved within the Polyneoptera. This is corroborated by the fact that so far no orthologs of SMY genes have been reported from any other insect.

### sNPF (Additional File 3T)

A short neuropeptide F (sNPF) precursor with a length of 86–134 AA is present in all polyneopteran orders. All of these precursors contain a highly conserved sNPF motif with a C-terminal amidation site. A specific feature of most Caelifera is the presence of a second and longer sNPF neuropeptide immediately after the first sNPF sequence in the precursor. While the two species of Xya (Caelifera) still show the original pattern with a single sNPF sequence, P. teretrirostris (Caelifera) even has three consecutive sNPFs in the precursor. The sequence of the N-terminal sNPF of the Caelifera with multiple sNPFs is highly conserved and closely resembles the orthologous sNPFs of the other polyneopteran orders. Therefore, it was included in our analyses. The sNPF sequence is always located in the middle of the precursor, N-terminally flanked by a dibasic RK cleavage site, and terminates upstream of a dibasic RR cleavage site.

Short NPF sequences (in Caelifera only the N-terminal sNPF sequence) are exclusively undecapeptides with a potential secondary cleavage site (Arg<sup>3</sup>). The sequence SNRSPSLRLRF-NH<sup>2</sup>, which also occurs in T. domestica, was found in several orders of Polyneoptera (Zoraptera, Dermaptera, Plecoptera, Caelifera and Ensifera). This sequence might therefore be regarded as ancestral for all Pterygota. Apparently the sister group of Zoraptera + Dermaptera (i.e., the remaining polyneopteran orders) originally had two alleles coding for Ser or Ala as the N-terminal AA. Several orders of this group (Plecoptera, Caelifera, Ensifera, Grylloblattodea, Phasmatodea) still have species either with Ser<sup>1</sup> or Ala<sup>1</sup>, while in Mantophasmatodea, Embioptera, Mantodea, and Blattodea the sNPF with Ala<sup>1</sup> has completely replaced the original Ser at this position. Other AA substitutions are restricted to the second AA (Asn<sup>2</sup> to Ser<sup>2</sup>) and have been detected in a few Grylloblattodea and Caelifera and in all Embioptera. The median AED for the sNPF precursor is 0.20. Grylloblattodea and Mantophasmatodea show the lowest sNPF precursor variation in Polyneoptera, while Embioptera and Plecoptera possess the most variable sNPF precursors. The overall AED for the sNPF precursors is 0.44; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.05).

### Trissin (Additional File 3U)

A trissin precursor with a length of 88–117 AA is present in almost all polyneopteran orders. The only exceptions were found in Dermaptera and Zoraptera, where trissin precursors are absent. For two species of the genus Xya (Caelifera), we have identified two trissin precursors with moderately (trissin 1) or strongly modified N-termini (trissin 2). Otherwise, the trissin precursors usually contain a well-conserved trissin motif without C-terminal amidation site. The trissin sequence in the precursor follows immediately after the signal peptide and terminates upstream of a tribasic RKR cleavage site (KKR in 2 of 26 Mantodea species and dibasic KR in 2 of 6 Ensifera species).

Most trissins of Polyneoptera consist of 27 AA. For most Caelifera a truncated sequence of trissin with a single AA (instead of two) preceding the N-terminal Cys is predicted by SignalP-5.0; trissin 1 of Xya probably starts directly with the N-terminal Cys. Trissins 1 of Xya variegata and X. japonica additionally show an insertion of Ser downstream of the N-terminal Cys. In Ensifera, the N-terminus of trissin is not always clearly predicted; probably it starts directly with the N-terminal Cys. Generally, the N-terminal cleavage of trissins should be confirmed by peptidomics. However, trissin has not been detected biochemically from any polyneopteran species so far. In two species of Ensifera (Ceuthophilus sp. and Diestrammena asynamora) the first Arg of the C-terminal cleavage motif is replaced by Ser, which probably leads to an extended C-terminus (NYLS-OH instead of NYL-OH). All species of the ensiferan infraorder Gryllidea whose trissin sequence has been identified (Ceuthophilus sp., Gryllotalpa sp., Neonetus sp.) show an insertion of Asp in the middle of the sequence, indicating a synapomorphy. The sequence LSCDSCGRECXXXCGTRNFRTCCFNYL-OH (XXX: no ancestral AA assigned) was found in T. domestica and most orders of Polyneoptera. This sequence might therefore be regarded as ancestral for all Pterygota. Amino acid substitutions in trissins of Polyneoptera are mainly limited to substitutions of AA at positions 11–13 and 18. Significant intraordinal variation is present in Plecoptera, Caelifera, Ensifera, and Blattodea. Distinct lineage-specific features are substitutions of Ser<sup>4</sup> to Phe<sup>4</sup>/Val<sup>4</sup> (Caelifera) or Ile<sup>4</sup> (Ensifera), Phe<sup>24</sup> to Leu<sup>24</sup>/Tyr<sup>24</sup> (Caelifera/Ensifera), and Arg<sup>21</sup> to Val<sup>21</sup>/His<sup>21</sup> (Ensifera). The median AED for the trissin precursor is 0.23. Mantophasmatodea show the lowest trissin precursor variation, while Ensifera possess the most variable trissin precursor sequences. The overall AED for the trissin precursors is 0.59; the actual neuropeptide sequence is significantly better conserved (overall AED: 0.20).

### CONCLUSIONS

In our analysis we examined the single-copy precursor sequences of 21 neuropeptide genes of Polyneoptera. The neuropeptides of 17 of these precursors are C-terminally amidated (not AST-CC, elevenin, proctolin, trissin), which prevents rapid degradation by exopeptidases and thus supports their functions as hormones. Only very few neuropeptide genes coding for single-copy neuropeptides are completely missing in a given polyneopteran order. Dermaptera have no ACP, proctolin and trissin, Ensifera do not have CCHamide-1, and in most Embioptera we could not detect any SIFamide precursor (with the exception of a multiple-copy SIFa precursor in R. virgo, see above). Furthermore, we did not find precursors for corazonin, CRF-DH and trissin in Zoraptera, but only one species of this order could be analyzed. Therefore, the absence of the respective neuropeptide genes has yet to be confirmed for Zoraptera. For most orders and also for the individual species within these orders, we have found all single-copy precursors, a feature already documented for the "basal" hexapods, which represent the sister group of the Pterygota [winged insects; (24)]. In contrast, peptide gene losses are more frequent in the much more species-rich and ecologically significant Holometabola. The fruit fly Drosophila melanogaster, which is used as a model organism in molecular biology, neurobiology and also physiology, is a good example in this context as it lacks no fewer than 6 of the 21 peptidergic systems analyzed here (ACP, AT, Elevenin, HanSolin, NPF-2, RFLamide) (12).

The sequence conservation of the precursor sequences, including the signal peptides, varies for the different neuropeptide genes. Low overall AED values (AST-C: 0.36; sNPF, CT-DH: 0.44; CCAP, NPF-1: 0.45; see **Additional File 3**) contrast with high AED values (CNMamide: 0.95; Elevenin: 0.87; HanSolin: 0.83; see **Additional File 3**), which are significantly above the average value of 0.63 calculated for all neuropeptide precursors (**Figure 9**). As expected, the sequences of single-copy neuropeptides within the precursors are much better conserved (overall AED 0.16; **Figure 9**). However, the extent of sequence conservation across Polyneoptera is remarkably different between the different neuropeptides. Neuropeptides such as proctolin, CCAP, AST-C, sNPF, MS, and CT-DH (overall AED ≤ 0.05) are almost identical in all taxa and the most common sequence always represents the predicted ancestral sequence of Pterygota (sNPF, CT-DH) or even the ancestral sequence of Hexapoda (proctolin, CCAP, AST-C, MS). For all neuropeptides with very high AED values (Elevenin: 0.45; NPF-2: 0.41; HanSolin: 0.33), the sequence ancestral to Polyneoptera could not be determined. Many of the neuropeptides with high AED values have long sequences, but this does not necessarily lead to high AED values, as is shown for example with CT-DH (31 AA; AED: 0.05).

The overall median AEDs for the single-copy neuropeptides and precursors differ significantly between the polyneopteran orders. This was to be expected, since the different lineages evolved independently of each other over different periods of time (see **Figure 2**). In addition, several polyneopteran orders (e.g., Grylloblattodea and Mantophasmatodea) represent relict groups with only a few extant and rather closely related taxa. Thus, these orders show particularly low AEDs, while Dermaptera, Plecoptera and Orthoptera (Ensifera + Caelifera) have much higher intra-ordinal sequence diversity (**Figure 9**). The relatively high AEDs for Embioptera were somewhat unexpected in this context. Although the AEDs for the various neuropeptide precursors of the different Polyneoptera are mostly in the range of the median AED for all neuropeptide precursors, there are striking exceptions. This is especially true for Mantodea (significantly lower AEDs for NPF-1 and -2) and Dermaptera (significantly lower AED for AST-CC). A comparison of AEDs in the orthopteran sister groups Ensifera and Caelifera also shows very different AEDs for the different neuropeptide precursors, either in favor of Ensifera or Caelifera (**Additional File 3**). This means that in the evolution of the sequences of neuropeptide precursors there have been some striking increases or decreases in the AA substitution rate, which cannot be directly related to a uniform development of the peptidergic system of a given taxon or to a specific neuropeptide gene.

A number of derived neuropeptide sequences were found, showing sequence motifs (= synapomorphies) typical only for representatives of a specific polyneopteran lineage. These must be distinguished from intra-ordinal variation. Within the respective lineages, the derived sequences are often well-conserved (**Additional File 3**). However, surprisingly few examples of derived sequences have been found that are typical of two or more polyneopteran orders. One clear example is the substitution within the C-terminal cleavage motif of RFLamides, which probably occurred in the last common ancestor of Mantophasmatodea and Grylloblattodea. This substitution prevents the C-terminal amidation and is typical of all Mantophasmatodea and Grylloblattodea. Furthermore, the absence of trissin in both Dermaptera and Zoraptera (here, however, only a single transcriptome was available) indicates that the loss of this neuropeptide already occurred in the last common ancestor of these two lineages. Typical for most Dictyoptera (Mantodea + Blattodea) is Gln<sup>11</sup> of trissin, which is only found in this taxon.

Overall, the single-copy neuropeptide precursors of the Polyneoptera show a relatively high degree of sequence conservation. Basic features of these precursors in this very heterogeneous insect group are explained here in detail for the first time. Further insights into the evolution of neuropeptides can be expected from future analyses of the much more variable multiple-copy neuropeptides.

### DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/**Supplementary Material**.

### AUTHOR CONTRIBUTIONS

MB and RP contributed to the conception and design of the study. MB mined the transcriptomes for neuropeptide precursors and wrote the first draft of the manuscript. All authors contributed to the final version of the manuscript and approved the submitted manuscript.

### FUNDING

This study was funded by the Deutsche Forschungsgemeinschaft (RP 766/11-1).

### ACKNOWLEDGMENTS

We are grateful to the entire 1KITE consortium, especially to Bernhard Misof, Alexander Donath, Benjamin Wipfler (ZFMK Bonn, Germany) and Karen Meusemann (University of Freiburg, Germany) for providing early access to the 1KITE transcriptome assemblies and raw data. Furthermore, we thank Sander Liessem and Lapo Ragionieri (University of Cologne, Germany) for supporting the transcriptome analysis.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2020.00197/full#supplementary-material


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Bläser and Predel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Analysis of Pigment-Dispersing Factor Neuropeptides and Their Receptor in a Velvet Worm

Christine Martin<sup>1†</sup>, Lars Hering<sup>1†</sup>, Niklas Metzendorf<sup>1</sup>, Sarah Hormann<sup>1</sup>, Sonja Kasten<sup>1</sup>, Sonja Fuhrmann<sup>1</sup>, Achim Werckenthin<sup>2</sup>, Friedrich W. Herberg<sup>3</sup>, Monika Stengl<sup>2</sup> and Georg Mayer<sup>1\*</sup>

<sup>1</sup> Department of Zoology, Institute of Biology, University of Kassel, Kassel, Germany, <sup>2</sup> Department of Animal Physiology, Institute of Biology, University of Kassel, Kassel, Germany, <sup>3</sup> Department of Biochemistry, Institute of Biology, University of Kassel, Kassel, Germany

#### Edited by:

Elizabeth Amy Williams, University of Exeter, United Kingdom

#### Reviewed by:

Bo Joakim Eriksson, University of Vienna, Austria; Dusan Zitnan, Slovak Academy of Sciences, Slovakia

> \*Correspondence: Georg Mayer georg.mayer@uni-kassel.de

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 17 February 2020 Accepted: 14 April 2020 Published: 12 May 2020

#### Citation:

Martin C, Hering L, Metzendorf N, Hormann S, Kasten S, Fuhrmann S, Werckenthin A, Herberg FW, Stengl M and Mayer G (2020) Analysis of Pigment-Dispersing Factor Neuropeptides and Their Receptor in a Velvet Worm. Front. Endocrinol. 11:273. doi: 10.3389/fendo.2020.00273

Pigment-dispersing factor neuropeptides (PDFs) occur in a wide range of protostomes including ecdysozoans (= molting animals) and lophotrochozoans (mollusks, annelids, flatworms, and allies). Studies in insects revealed that PDFs play a role as coupling factors of circadian pacemaker cells, thereby controlling rest-activity rhythms. While the last common ancestor of protostomes most likely possessed only one pdf gene, two pdf homologs, pdf-I and pdf-II, might have been present in the last common ancestors of Ecdysozoa and Panarthropoda (Onychophora + Tardigrada + Arthropoda). One of these homologs, however, was subsequently lost in the tardigrade and arthropod lineages followed by independent duplications of pdf-I in tardigrades and decapod crustaceans. Due to the ancestral set of two pdf genes, the study of PDFs and their receptor (PDFR) in Onychophora might reveal the ancient organization and function of the PDF/PDFR system in panarthropods. Therefore, we deorphanized the PDF receptor and generated specific antibodies to localize the two PDF peptides and their receptor in the onychophoran Euperipatoides rowelli. We further conducted bioluminescence resonance energy transfer (BRET) experiments on cultured human cells (HEK293T) using an Epac-based sensor (Epac-L) to examine cAMP responses in transfected cells and to reveal potential differences in the interaction of PDF-I and PDF-II with PDFR from E. rowelli. These data show that PDF-II has a tenfold higher potency than PDF-I as an activating ligand. Double immunolabeling revealed that both peptides are co-expressed in E. rowelli but their respective levels of expression differ between specific cells: some neurons express the same amount of both peptides, while others exhibit higher levels of either PDF-I or PDF-II. The detection of the onychophoran PDF receptor in cells that additionally express the two PDF peptides suggests autoreception, whereas spatial separation of PDFR- and PDF-expressing cells supports hormonal release of PDF into the hemolymph. This suggests a dual role of PDF peptides, as hormones and as neurotransmitters/neuromodulators, in Onychophora.

Keywords: PDF, PDFR, BRET, Epac, E. rowelli, Onychophora, Panarthropoda, Ecdysozoa

### INTRODUCTION

Pigment-dispersing factors (PDFs) and pigment-dispersing hormones (PDHs) are neuropeptides that occur in various protostomes and have been mainly studied in nematodes, crustaceans and insects [reviewed in (1, 2)]. In decapod crustaceans, PDH controls a circadian relocation of the retinal pigment to shield the photoreceptor cells from light and also orchestrates circadian migrations of pigment within the integumental chromatophores (3–6). The crustacean PDHs, however, seem to not only play a hormonal role in mediating pigment dispersion but might additionally act as neurotransmitters or neuromodulators in the nervous system (7–12). The same holds true for the insect PDFs—homologs of the crustacean PDHs (4, 13). PDFs are synthesized by the circadian clock neurons in various insect species and are involved in different aspects of circadian timing (1, 14–34).

The PDFs of the fruit fly Drosophila melanogaster and the cockroach Rhyparobia maderae are analogs of the vasoactive intestinal peptide (VIP) of mammals (35). It must be noted, however, that despite their similar function in the circadian pacemaker systems, the insect PDFs (and crustacean PDHs) do not share a common ancestry with VIP. The ancestral pdf gene rather evolved in the protostome lineage, as it occurs in both spiralians and ecdysozoans but not in deuterostomes (36–38). The report of potential PDF precursors in echinoderms and an enteropneust (39) should be taken with caution due to the low sequence similarity with the PDFs/PDHs of insects and crustaceans [cf. Figure 7 in (39)] and the lack of a phylogenetic analysis. Interestingly, while the last common ancestor of protostomes possessed only one pdf gene, which has been retained at least in mollusks and annelids [(36, 37); cf. supporting information Supplementary Figures 1 and 2 in (38)], a duplication of this gene might have occurred in the ecdysozoan lineage (**Figure 1**). Two homologs, pdf-I and pdf-II, have been identified thus far in priapulids, nematodes and onychophorans (38, 41, 42), whereas pdf-II seems to have been subsequently lost in tardigrades and arthropods. Even more intriguing is the independent duplication of the retained pdf-I gene in tardigrades, which show three in-paralogs, and decapod crustaceans, which express two to three PDH isoforms but seem to possess only two pdh genes (11, 38, 43–48).

Since two PDF peptides were most likely encoded in the genome of the last common ancestor of Ecdysozoa (**Figure 1**), the question arises of whether their ancestral role was in light-dependent pigment dispersion, in coupling of circadian clocks or in other processes. Alternatively, these peptides might have played divergent or multiple roles. Research on nematodes revealed that their two pdf genes express three PDFs that most likely also have an impact on circadian timing (49). There, the same peptides serve several functions including the control of locomotion, mate searching, mechano- and chemosensation as well as the sensation of oxygen (41, 50). Accordingly, in the nematode nervous system PDFs occur in nearly all sensory neurons that also control mate searching and locomotor activity patterns (42). Insects instead possess fewer PDF-expressing cells organized in 2–4 clusters associated with each compound eye and few additional somata showing a species-specific distribution in the protocerebrum [e.g., (14–16, 22, 23, 25, 29, 30, 51–56)]. The situation in crustaceans seems to be more complex, but they also possess PDH-immunoreactive somata in the median protocerebrum, in addition to those associated with the eye stalks (7, 8, 11).

Similar to crustaceans, Mayer et al. (38) detected numerous immunoreactive somata in the median protocerebrum of six distantly related onychophoran species using a broadly reactive antiserum raised against the synthetic β-PDH peptide of the crustacean Uca pugilator (7). These and other results indicate a broad and elaborate distribution of PDFs in the peripheral and central nervous system of onychophorans. However, since the applied antiserum recognized both onychophoran peptides, Ony-PDF-I and Ony-PDF-II, despite its higher affinity to Ony-PDF-I [83.3% sequence similarity as opposed to 55.6–61.1% for Ony-PDF-II; cf. Figure 1 in (38)], it remains unclear whether these two peptides are co-expressed or show distinct distributions in various tissues and cells. To clarify this issue, we obtained specific antibodies against each PDF peptide of the onychophoran Euperipatoides rowelli (Peripatopsidae) and performed double immunolabeling. The gathered data provide insights into the ancestral distribution and potential roles of the two PDF peptides in the last common ancestors of Panarthropoda and Ecdysozoa.

Another important goal of our study was to deorphanize and immunolocalize the PDF receptor (PDFR) of onychophorans. Investigations of PDFRs are generally scarce. These membrane proteins belong to the ancient family of G protein-coupled receptors (GPCRs) and have a similar function to the VPAC<sub>2</sub> receptors in the circadian system of mammals [reviewed in (35)]. However, while PDFRs are activated by PDF/PDH peptides, the VPAC<sub>2</sub> receptors rather bind the aforementioned vasoactive intestinal peptide (VIP). Despite their important functions, little is known about the origin and evolutionary history of PDFRs and VPAC<sub>2</sub> receptors.

The PDFR was initially deorphanized in the fruit fly D. melanogaster by three simultaneous studies: one of them examined a range of GPCRs for their sensitivity to PDF (57) and the other two reported that mutants lacking the groom-of-PDF (gop) or the han gene (referred to as pdfr hereafter) mimic the behavioral phenotype of pdf null mutants (58, 59). Subsequently, PDFR was shown to be expressed in 60% of pacemaker neurons (60), a few additional, clock-unrelated neurons as well as within the visual system of the fruit fly (61). Like D. melanogaster, the nematode Caenorhabditis elegans possesses one pdfr gene but three PDFR splice variants (41), all of which show a dose-dependent sensitivity to the three nematode PDFs (PDF-1a, PDF-1b, and PDF-2). The nematode PDFRs have been localized in the motor neurons of the nervous system as well as in nearly all muscle cells (41). Apart from D. melanogaster and C. elegans, to our knowledge there are no immunohistochemical data on the local distribution of PDFRs in other animals.

To improve our understanding of the PDF/PDFR system in onychophorans, we screened the transcriptome of E. rowelli (62) for GPCR genes and identified a single pdfr candidate.

We further performed functional analyses using synthetic PDF-I and PDF-II peptides (based on sequences from E. rowelli) in conjunction with bioluminescence resonance energy transfer (BRET) to confirm the identity of the putative receptor. Finally, we generated a specific antibody against the newly detected PDFR of E. rowelli and localized this receptor protein in the animal. The obtained data provide insights into the onychophoran PDF/PDFR network and contribute to a better understanding of its evolution in panarthropods and ecdysozoans.

### MATERIALS AND METHODS

### Collection and Maintenance of Specimens

Specimens of Euperipatoides rowelli Reid, 1996 (63) were obtained from decaying logs and leaf litter in the Tallaganda State Forest (New South Wales, Australia; 35°30′31″S, 149°36′14.3″E, 934 m) in October 2016. They were collected under the permit number SPPR0008 issued by the Forestry Commission of New South Wales and exported under the permit number WT2012-8163 provided by the Department of Sustainability, Environment, Water, Population and Communities. The collected specimens were maintained in the laboratory as described previously (64).

### Transcriptome de novo Assembly and Identification of Pigment-Dispersing Factor Receptor Genes

Illumina short reads from two specimens (male and female) of E. rowelli, sequenced as part of the i5K initiative (65, 66), were obtained from the Sequence Read Archive (SRA) of GenBank (ER9 female: accession number SRR1946792; ER10 male: accession number SRR1946791). Prior to the assembly step, sequencing adapters were trimmed using cutadapt v1.8.1 (67) (parameters: --max-n=0 -u 10 -u -5 -U 10 -U -5) and short reads with more than five bases below a Phred quality score of 15 were removed using ConDeTri v2.2 (68) (-hq=14 -lq=10 -sc=33 -frac=0.95 -minlen=85). Both data sets were then assembled de novo using the software IDBA-Tran v1.1.1 (69) (--mink 20 --maxk 85 --step 5 --max\_isoforms 1 --min\_contig 100), yielding 92,426 unique contigs for ER9 (mean contig length = 717, N50 = 851) and 59,131 for ER10 (mean contig length = 707, N50 = 817). Both assemblies are available upon request.
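The assembly statistics quoted above (number of contigs, mean contig length, N50) can be recomputed from any list of contig lengths. The following is a minimal sketch, not the procedure used in the study; the function name `assembly_stats` and the toy lengths are illustrative assumptions. N50 is taken here in its usual sense: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length.

```python
def assembly_stats(contig_lengths):
    """Return (number of contigs, mean contig length, N50).

    N50: walking through the contigs from longest to shortest, the length
    of the contig at which the running sum first reaches half of the
    total assembly length.
    """
    if not contig_lengths:
        raise ValueError("no contig lengths given")
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    n50 = lengths[-1]
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return len(lengths), total / len(lengths), n50


# Toy input with made-up contig lengths (not data from this study).
toy_lengths = [100, 200, 300, 400, 500]
count, mean_length, n50 = assembly_stats(toy_lengths)
print(count, mean_length, n50)
```

In practice the lengths would be read from the assembled FASTA file; that step is omitted here to keep the sketch self-contained.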

To obtain the sequences of putative pdfr genes from the transcriptome of E. rowelli, two bait sequences of known pigment-dispersing factor receptors from the kuruma shrimp Marsupenaeus japonicus (accession number BAH85843.1) and the fruit fly D. melanogaster (NP\_570007.2) were used in tBLASTn v2.4.0+ searches (70) against three assemblies in total [ER9 and ER10 from this study; Filter15 assembly from Hering et al. (62)], yielding 27 candidate sequences with an E value below the threshold of 1e−5.
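The E-value filtering step can be illustrated with a short sketch. It assumes that the tBLASTn results were written in the standard 12-column tabular format (`-outfmt 6`), in which the second column holds the subject (contig) identifier and the eleventh column the E value; whether the authors used this output format is not stated. The function name `filter_hits` and the example rows are hypothetical.

```python
import csv
import io


def filter_hits(tabular_text, max_evalue=1e-5):
    """Collect subject IDs that have at least one hit with E value < max_evalue.

    Assumes BLAST tabular output (-outfmt 6) with the default 12 columns:
    column 2 = subject ID, column 11 = E value.
    """
    candidates = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        subject_id, evalue = row[1], float(row[10])
        if evalue < max_evalue:
            candidates.add(subject_id)
    return candidates


# Made-up example rows (identifiers and scores are not from this study).
example = (
    "bait_fly\tcontig_A\t45.0\t300\t160\t5\t1\t300\t10\t909\t2e-60\t210\n"
    "bait_fly\tcontig_B\t28.0\t90\t60\t3\t5\t94\t100\t369\t0.003\t35\n"
    "bait_shrimp\tcontig_A\t47.1\t310\t158\t4\t1\t310\t10\t939\t1e-72\t245\n"
)
print(sorted(filter_hits(example)))
```

Collecting subject IDs in a set means that a contig hit by both bait sequences is counted only once as a candidate.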

### Cluster Analysis and Phylogenetic Reconstruction

After screening for putative pigment-dispersing factor receptor genes in E. rowelli transcriptomes, a cluster analysis of the 27 candidate sequences together with 18,337 bilaterian G protein-coupled receptors (GPCRs) obtained from the GPCR database (71) was performed using CLANS (72). In short, all sequences were clustered based on pairwise sequence similarities obtained from all-vs.-all BLAST searches. Because the best BLAST hit does not necessarily imply the closest relationship (73), a phylogenetic tree was reconstructed afterwards from all formerly obtained class B GPCRs, to which the PDFRs most likely belong (57, 59).
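The idea of grouping sequences by pairwise similarity can be sketched in a few lines. Note that this is a deliberately simplified stand-in and not the CLANS algorithm: CLANS arranges sequences by a force-directed layout based on pairwise BLAST scores, whereas the sketch below merely links any two sequences whose pairwise E value falls below a cut-off and reports the resulting connected groups. The function name, the cut-off and the example identifiers are assumptions made for illustration.

```python
def cluster_by_similarity(pairs, max_evalue=1e-10):
    """Group sequence IDs into clusters of transitively linked sequences.

    pairs: iterable of (id_a, id_b, evalue) from an all-vs.-all comparison.
    Two sequences are linked if their E value is below max_evalue.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for id_a, id_b, evalue in pairs:
        root_a, root_b = find(id_a), find(id_b)
        if evalue < max_evalue and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for seq_id in list(parent):
        clusters.setdefault(find(seq_id), set()).add(seq_id)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


# Made-up pairwise hits (identifiers and E values are illustrative only).
hits = [
    ("cand1", "pdfr_fly", 1e-40),
    ("cand2", "opsin", 1e-30),
    ("cand1", "opsin", 0.5),
]
print(cluster_by_similarity(hits))
```

As the source text points out, similarity-based grouping of this kind only narrows down the candidates; orthology was established by the subsequent phylogenetic analysis.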

For phylogenetic analyses, the transmembrane domains of all class B GPCRs obtained from the cluster analysis [993 in total, mainly hormone receptors, including PDF receptors; reviewed in (74, 75)] were predicted using Pfam-A v29 (76, 77) and aligned using the MAFFT online version v7.452 (78) with the most accurate option L-INS-i and default parameters. To remove homoplastic and random-like positions, the alignment was masked with the software Noisy rel. 1.15.12 (79) (-seqtype=P -shuffles=10,000) prior to the analyses. Two maximum likelihood analyses were conducted with the Pthreads version of RAxML v8.2.10 (80). For each run, the best tree was obtained from a combined analysis (-f a option) of 10 independent inferences and GAMMA correction of the final tree under either the empirical JTT substitution model or a dataset-specific GTR substitution matrix, including calculation of bootstrap support values from 100 pseudoreplicates using the rapid bootstrapping algorithm implemented in RAxML. The JTT model was automatically selected by RAxML (PROTGAMMAAUTO option) as the best-fitting empirical substitution model. Notably, the analysis using the dataset-specific GTR+G model yielded a better log likelihood score (−87,233.49) for the best tree than the analysis using the best empirical model JTT+G (−88,041.12). The phylogenetic tree was visualized with iTOL v2 (81) and edited with Adobe Illustrator CS 5.1 (Adobe Systems Incorporated, San Jose, CA, USA).
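Because a dataset-specific GTR matrix has many more free parameters than a fixed empirical matrix, a higher log likelihood alone does not show that the richer model is preferable. One common check, which is not reported in the study, is the Akaike information criterion, AIC = 2k − 2 ln L. The sketch below applies it to the two log likelihoods given above under the assumption that the GTR model adds 189 free exchangeability parameters relative to JTT (190 amino acid pair rates minus one for normalization) and that all other parameters are shared and cancel out; both points are assumptions of this sketch.

```python
def aic(log_likelihood, n_free_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_free_params - 2 * log_likelihood


# Log likelihoods of the best trees as reported in the text.
LNL_GTR_G = -87233.49
LNL_JTT_G = -88041.12

# Assumption of this sketch: only the exchangeability matrix differs,
# adding 189 free parameters to the GTR model (20 * 19 / 2 - 1).
EXTRA_PARAMS_GTR = 189

delta_lnl = LNL_GTR_G - LNL_JTT_G
delta_aic = aic(LNL_GTR_G, EXTRA_PARAMS_GTR) - aic(LNL_JTT_G, 0)

print(f"delta lnL (GTR - JTT): {delta_lnl:.2f}")
print(f"delta AIC (GTR - JTT): {delta_aic:.2f}")
```

Under these assumptions the AIC difference is negative, i.e., the likelihood gain of about 808 log units outweighs the penalty for the additional parameters.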

### Amplification of Gene Fragments and Directional Cloning

The whole coding sequence (CDS) of the identified Er-pdfr gene was amplified from previously synthesized cDNA of E. rowelli [see (82)] using gene-specific primers (Er\_pdfr\_HindIII\_F: 5′-tcgaagcttgccaccATGTGGTATTTAATTTATTGTATTTTATCCTTC-3′; Er\_pdfr\_XhoI\_R: 5′-tgcgctcgagCTAAATACACAATTCCTCCAACAC-3′), which contained restriction sites (HindIII, XhoI) required for subsequent directional cloning into the mammalian expression vector pcDNA3.1(+) (Invitrogen, Carlsbad, CA, USA) to generate the plasmid pcDNA3.1-ErPDFR-3K10. The plasmid DNA was purified from 100 mL of transformed E. coli bacterial cell culture using the PureYield™ Plasmid Midiprep System (Promega GmbH, Walldorf, Germany) including an endotoxin removal step. The sequence and orientation of the cloned insert were verified by Sanger sequencing (Eurofins Genomics GmbH, Ebersberg, Germany) and the sequence was deposited in GenBank under the accession number Er-pdfr MT080366.
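The design of the two cloning primers can be checked programmatically. The sketch below is an illustration rather than part of the published workflow: it verifies that the forward primer carries a HindIII site (AAGCTT) directly followed by the start codon context, and that the reverse primer carries an XhoI site (CTCGAG) directly followed by the reverse complement of a stop codon. Reading the lower-case "gccacc" in front of the ATG as a Kozak-like sequence is an interpretation made here, not a statement from the source; the function names are hypothetical.

```python
HINDIII_SITE = "AAGCTT"
XHOI_SITE = "CTCGAG"
KOZAK_LIKE = "GCCACC"          # interpretation of the lower-case spacer
STOP_CODONS = {"TAA", "TAG", "TGA"}

FORWARD = "tcgaagcttgccaccATGTGGTATTTAATTTATTGTATTTTATCCTTC"
REVERSE = "tgcgctcgagCTAAATACACAATTCCTCCAACAC"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def forward_primer_ok(primer):
    """HindIII site followed by the Kozak-like spacer and the start codon."""
    seq = primer.upper()
    site = seq.find(HINDIII_SITE)
    if site == -1:
        return False
    return seq[site + len(HINDIII_SITE):].startswith(KOZAK_LIKE + "ATG")


def reverse_primer_ok(primer):
    """XhoI site followed by the reverse complement of a stop codon."""
    seq = primer.upper()
    site = seq.find(XHOI_SITE)
    if site == -1:
        return False
    after_site = seq[site + len(XHOI_SITE):site + len(XHOI_SITE) + 3]
    return len(after_site) == 3 and reverse_complement(after_site) in STOP_CODONS


print(forward_primer_ok(FORWARD), reverse_primer_ok(REVERSE))
```

Placing HindIII upstream and XhoI downstream of the coding sequence is what makes the cloning directional: the insert can be ligated into the vector in one orientation only.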

### Antibody Generation

Polyclonal antibodies against Er-PDF-I and Er-PDF-II were newly generated by using HPLC-purified synthetic Er-PDF-I (CNAELINSLLGLPKMMNDA-NH<sub>2</sub>) and Er-PDF-II (CNAELINSLLNLPQKLQEA-NH<sub>2</sub>) peptides (peptides&elephants GmbH, Hennigsdorf, Germany). Prior to immunization in rabbits, the peptides were coupled via an additional cysteine at the N-terminus to Sulfo-SMCC-activated bovine thyroglobulin as carrier. The anti-Er-PDF antibodies (anti-Er-PDF-I: IG-P1035, LOT# 2078B, 14 µg/mL; anti-Er-PDF-II: IG-P1036, LOT# 2079B2, 21 µg/mL) were purified from sera of immunized rabbits by two-fold depletion of one antibody against the antigen of the other and subsequent affinity purification on its own antigen to minimize cross-reactivity (ImmunoGlobe GmbH, Himmelstadt, Germany).

An HPLC-purified synthetic peptide from the C-terminal region of Er-PDFR (Ac-LRDHQGGQLSEDRRDC-NH<sub>2</sub>, Schafer-N ApS, Copenhagen, Denmark) was coupled to keyhole limpet hemocyanin (KLH) and used to newly generate a polyclonal antibody (anti-Er-PDFR; IG-P1044, LOT# 2130-5.6, 25 µg/mL), which was affinity purified from blood serum of an immunized goat (ImmunoGlobe GmbH).

### BRET Assay

HEK293T cells were seeded on 96-well Nunc plates (Thermo Fisher Scientific, Waltham, MA, USA) at a density of 2 × 10<sup>4</sup> cells/well and cultured in DMEM high glucose (Sigma-Aldrich Chemie GmbH, Munich, Germany) supplemented with 10% fetal bovine serum at 37°C and 5% CO<sub>2</sub>. After 24 h, cells were transfected using polyethyleneimine (25 kDa PEI, linear; Polysciences Europe GmbH, Hirschberg an der Bergstrasse, Germany) with GFP2-hEpac1E157-P881-Rluc8 [Epac-L; modified from (83–85)] and pcDNA3.1-ErPDFR-3K10 in a 1:1 ratio (100 ng total DNA/well). Cells were washed with HBSS (Thermo Fisher Scientific) 2 days after transfection and stimulated with different concentrations of synthetic PDF peptides (10<sup>−11</sup> to 10<sup>−5</sup> M final concentration) together with the substrate coelenterazine 400A ("DeepBlueC"; BIOTREND Chemikalien GmbH, Cologne, Germany) at a final concentration of 5 µM in a total volume of 50 µL HBSS. Control experiments were conducted with cells expressing Epac-L alone, stimulation without PDF peptides, stimulation using an unrelated peptide (myoinhibitory peptide 7 from the cockroach Rhyparobia maderae), and stimulation with forskolin (50 µM final concentration; Sigma-Aldrich Chemie GmbH) + IBMX (100 µM final concentration; Sigma-Aldrich Chemie GmbH), respectively. The raw luminescence was measured immediately after stimulation at wavelengths of 410 ± 80 nm for the donor and 515 ± 30 nm for the acceptor in eight biological replicates for each tested PDF peptide concentration (n = 8) using a POLARstar Omega microplate reader (BMG Labtech, Cary, NC, USA). The ratios [emission (410 nm)/emission (515 nm)] were plotted as box plots of relative luminescence against the respective PDF concentration to obtain dose-response curves.
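The ratio calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function names and the toy readings are hypothetical, and the median per concentration is used here only as a simple summary of the values that a box plot would display.

```python
from statistics import median


def bret_ratio(donor_410, acceptor_515):
    """Ratio of donor emission (410 nm) to acceptor emission (515 nm)."""
    if acceptor_515 <= 0:
        raise ValueError("acceptor emission must be positive")
    return donor_410 / acceptor_515


def median_ratio_per_concentration(readings):
    """Summarize replicate wells per peptide concentration.

    readings: dict mapping concentration (M) to a list of
    (donor_410, acceptor_515) raw luminescence pairs, one pair per well.
    Returns a dict mapping concentration to the median 410/515 ratio,
    ordered by increasing concentration.
    """
    return {
        concentration: median(bret_ratio(donor, acceptor)
                              for donor, acceptor in wells)
        for concentration, wells in sorted(readings.items())
    }


# Made-up luminescence counts (three wells per concentration for brevity;
# the study used eight replicates per concentration).
toy_readings = {
    1e-9: [(1000, 1000), (1100, 1000), (900, 1000)],
    1e-6: [(1500, 1000), (1600, 1000), (1400, 1000)],
}
print(median_ratio_per_concentration(toy_readings))
```

With Epac-based sensors of this type, an increase of the donor/acceptor ratio is typically read as a rise in intracellular cAMP, so a dose-dependent increase of the summarized ratios would indicate receptor activation.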

### Western Blots and Specificity Tests

Western blots for rabbit anti-Er-PDF-I and rabbit anti-Er-PDF-II were performed using synthetic Er-PDF-I and Er-PDF-II peptides. Western blots for goat anti-Er-PDFR were conducted using synthetic peptide coupled to bovine serum albumin (BSA) as well as tissue lysate from dissected heads. For lysate preparation, five specimens of E. rowelli were anesthetized in chloroform vapor and dissected in cold phosphate-buffered saline (PBS; 0.1 M, pH 7.4). Tissues of onychophoran heads were homogenized on ice in a solution containing 500 µL NP-40 lysis buffer (pH 8.0) and the protease inhibitor cOmplete™ Mini (Roche Diagnostics GmbH, Mannheim, Germany). Tissue was incubated in lysis buffer for 2 h at 4°C. Subsequently, the lysed tissue was centrifuged and the supernatant stored at −80°C.

Synthetic Er-PDF-I and Er-PDF-II peptides were diluted in 4x Laemmli sample buffer (pH 6.8), heated for 5 min at 95°C, separated on an acrylamide Tris-tricine gel (10%) for 30 min at 30 V and 120 min at 50 V and then transferred to a Roti®-PVDF 0.2 membrane (Carl Roth GmbH & Co. KG, Karlsruhe, Germany) via semi-dry blot for 120 min at 84 mA. The synthetic Er-PDFR peptides were diluted in 2x Laemmli sample buffer, heated for 5 min at 95°C and, together with the proteins of the lysate (lysed 1:1 in 2x Laemmli sample buffer), separated using SDS-PAGE on a 10% gel for 15 min at 30 V and 40 min at 200 V and then transferred to a Porablot NCP nitrocellulose membrane (Macherey-Nagel GmbH & Co. KG, Düren, Germany) via semi-dry blot for 60 min at 150 mA. Membranes were blocked for 30 min using 4% powdered milk in PBS at room temperature and then incubated with rabbit anti-Er-PDF-I, rabbit anti-Er-PDF-II (1 µg/mL in PBS each) or goat anti-Er-PDFR (1 µg/mL in PBS) overnight at 4°C. Following several washing steps with PBS, membranes were incubated with either goat anti-rabbit or donkey anti-goat antibodies, respectively, conjugated with alkaline phosphatase (1:500; dianova GmbH, Hamburg, Germany) and washed again in PBS. The signal was developed using a solution containing 175 µg/mL BCIP (5-bromo-4-chloro-3-indolyl phosphate; Thermo Fisher Scientific) in dimethylformamide and the reaction was stopped with PBS.

In order to further test the specificity of the generated antibodies, additional western blots were conducted as described above with the following changes: Er-PDF-I antibody was tested against the Er-PDF-II peptide and, vice versa, the Er-PDF-II antibody was tested against the Er-PDF-I peptide.

### Whole-Mount Preparation, Vibratome Sectioning, and Immunohistochemistry

For immunohistochemistry, specimens were anesthetized in chloroform vapor and fixed according to Stefanini et al. (86) in 4% paraformaldehyde (PFA) containing 7.5% picric acid in phosphate-buffered saline (PBS) overnight. Specimens were washed in several rinses of PBS, cut in halves and embedded in 31.25% albumin from chicken egg (Sigma-Aldrich Chemie GmbH) and 4.17% gelatin from porcine skin (Type A; Sigma-Aldrich Chemie GmbH) in distilled water. Albumin-gelatin blocks were hardened for 4 h at 4°C and then fixed in 10% PFA overnight. After several rinses in PBS, the blocks were cut in 80–100 µm thick sections using a vibratome (MICROM 650 V; Microm International GmbH, part of Thermo Fisher Scientific, Walldorf, Germany). The sections used for anti-PDFR labeling were treated with a mixture of collagenase/dispase (Roche Diagnostics GmbH; 1 mg/mL each) and hyaluronidase (Sigma-Aldrich Chemie GmbH; 1 mg/mL) in PBS for 40 min at 37°C and subsequently washed in PBS. Sections were blocked either in 10% normal goat serum (NGS; Sigma-Aldrich Chemie GmbH) for anti-Er-PDF-I and anti-Er-PDF-II labeling or in 10% bovine serum albumin (BSA; Carl Roth GmbH & Co. KG) for anti-Er-PDFR labeling in PBS containing 1% Triton X-100 (PBS-Tx, Sigma-Aldrich Co) for 1.5 h, followed by an incubation with the primary antibodies, either (i) rabbit anti-Er-PDF-I, (ii) rabbit anti-Er-PDF-II or (iii) goat anti-Er-PDFR (0.2 µg/mL in PBS-Tx, 1% NGS) for 2 days (rabbit anti-Er-PDF-I and rabbit anti-Er-PDF-II) or 5 days (goat anti-Er-PDFR) at 4°C. After several changes of PBS for 8–24 h, sections were incubated with either goat anti-rabbit Alexa Fluor® 488 (1:500; Thermo Fisher Scientific) or donkey anti-goat Alexa Fluor® 488 or Alexa Fluor® 680 (1:500; Thermo Fisher Scientific) for 2 days at 4°C. After washing steps in PBS-Tx, sections were further washed using PBS alone and then incubated in a solution containing 4′,6-diamidino-2-phenylindole (DAPI; 1 ng/mL; Carl Roth GmbH & Co. KG) and phalloidin-rhodamine (25 µg/mL in PBS; Thermo Fisher Scientific) for 2 h at room temperature. After rinsing the sections several times in PBS they were mounted between two coverslips in ProLong™ Gold Antifade Reagent (Invitrogen).

For double labeling of both PDF peptides, the procedure was the same as for single labeling with the following changes: the sections were incubated in 10% BSA in PBS-Tx for 1.5 h. Thereafter, the sections were first incubated with the rabbit anti-Er-PDF-II (0.2 µg/mL in PBS-Tx, 1% BSA) at 4°C for 2 days. After several changes of PBS-Tx for 12 h, the sections were then incubated in a solution containing goat anti-rabbit Fab fragments (Jackson Immuno Research Europe Ltd., Cambridgeshire, UK; 40 µg/mL in PBS-Tx, 1% BSA) for 2 days at 4°C. After changes of PBS-Tx for 8–24 h, the sections were incubated in donkey anti-goat Alexa Fluor® 594 (1:500, Thermo Fisher Scientific). Thereafter, the sections were incubated with rabbit anti-Er-PDF-I (0.2 µg/mL in PBS-Tx, 1% BSA) at 4°C for 2 days, followed by an incubation in goat anti-rabbit Alexa Fluor® 488 (1:500; Thermo Fisher Scientific) and counterstained with DAPI (Carl Roth GmbH & Co. KG).

For double labeling with anti-Er-PDFR and either rabbit anti-Er-PDF-I or rabbit anti-Er-PDF-II, the procedure was as described for single labeling with the following changes: the incubations with the primary antibodies anti-Er-PDFR with either rabbit anti-Er-PDF-I or rabbit anti-Er-PDF-II (0.2 µg/mL in PBS-Tx) as well as the secondary antibodies donkey anti-goat Alexa Fluor® 488 and donkey anti-rabbit Alexa Fluor® 568 (1:500; Thermo Fisher Scientific) were performed simultaneously for 5 days at 4°C.

Additional double labeling of both PDF peptides was performed on whole-mount preparations of the brain. Specimens of E. rowelli were first anesthetized in chloroform vapor and their brains were dissected in physiological saline (87). Following the fixation after Stefanini et al. (86) for 3 h at room temperature, the brains were washed in PBS, dehydrated through an ascending ethanol series (70, 90, 95, 100, and 100%, 10 min each), cleared in xylene for 10 min twice, rehydrated through a descending ethanol series (100, 100, 95, 90, 70 and 50%, 5 min each) and washed several times in PBS. The samples were then treated with a mixture of collagenase/dispase (Roche Diagnostics GmbH; 1 mg/mL each) and hyaluronidase (Sigma-Aldrich Chemie GmbH; 1 mg/mL) in PBS for 40 min at 37°C and washed in several rinses of PBS. The staining procedure was as described for sections with the following modifications: the incubation times of the primary antibodies, the Fab fragments and the secondary antibodies were increased to 4 days. After the incubation with the secondary antibodies, the samples were washed in PBS-Tx, dehydrated through an ascending ethanol series (70, 90, 95, 100, and 100%, 10 min each), cleared and mounted between two coverslips in methyl salicylate using custom-made carbon fiber-reinforced polymer microscope slides as spacers.

### Confocal Laser Scanning Microscopy and Image Processing

Whole mount preparations and vibratome sections were analyzed using a confocal laser scanning microscope (Zeiss LSM 880; Carl Zeiss Microscopy GmbH, Jena, Germany) equipped with an Airyscan module. Images were acquired and raw Airyscan datasets processed using the ZEN 2 (Black edition) imaging software (Carl Zeiss Microscopy GmbH). Selected substacks were created and adjusted to optimal brightness and contrast using Fiji v.1.52 (88, 89). Panels and diagrams were designed using Illustrator CS5.1 and the video was processed with Adobe Premiere Pro CS5.1 (Adobe Systems Incorporated).

### RESULTS

### Identification of the Onychophoran PDF Receptor

The initial BLAST search for PDF receptor homologs in three transcriptomes of E. rowelli yielded 27 candidate sequences. We included these in subsequent phylogenetic analyses of known bilaterian GPCRs to obtain the onychophoran PDFR ortholog. After prescreening the most likely candidate sequences in an all-vs.-all BLAST cluster analysis (**Figure 2A**), a maximum likelihood tree was reconstructed using the previously obtained class B GPCR members, to which PDF receptors most likely belong (57, 59). Using this approach, we identified one PDF receptor gene (Er-pdfr) in each analyzed transcriptome of E. rowelli; only in one of the transcriptomes (ER10) was the respective transcript split into two contigs (transcripts 10,426 and 39,667; see **Supplementary Figures 1, 2**). The detected Er-pdfr sequence clearly falls into the monophyletic group of protostome PDFRs (**Figure 2B**; **Supplementary Figures 1, 2**).
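The idea of grouping sequences by all-vs.-all similarity hits can be illustrated with a minimal sketch. The text does not state which clustering software or thresholds were used, so everything below is an assumption for illustration only: the sequence names, the E-values, the cutoff, and the choice of single-linkage grouping (connected components of significant hits) are hypothetical and should not be read as the authors' method.

```python
from collections import defaultdict


def cluster_by_hits(hits, evalue_cutoff):
    """Group sequences into connected components of significant pairwise hits.

    hits: iterable of (query, subject, evalue) tuples (hypothetical input format).
    """
    parent = {}

    def find(name):
        # Register unseen names as their own root, then walk up to the root.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for query, subject, evalue in hits:
        root_q, root_s = find(query), find(subject)
        # Only hits at or below the cutoff link two sequences.
        if evalue <= evalue_cutoff and root_q != root_s:
            parent[root_q] = root_s

    clusters = defaultdict(set)
    for name in list(parent):
        clusters[find(name)].add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical all-vs.-all hits; names and E-values are invented for illustration.
example_hits = [
    ("cand_1", "pdfr_ref", 1e-80),
    ("cand_2", "pdfr_ref", 1e-60),
    ("cand_3", "other_gpcr", 1e-40),
    ("cand_1", "other_gpcr", 1e-2),  # too weak to link the two groups
]
example_clusters = cluster_by_hits(example_hits, evalue_cutoff=1e-10)
print(example_clusters)
```

In this toy input, candidates that share a strong hit with the same reference sequence end up in one group, which mirrors the purpose of the prescreening step: narrowing the candidates to those that cluster with known PDF receptors before tree reconstruction.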

### In vitro Analysis of the Functionality of the Identified PDF Receptor

To test the functionality of the identified Er-PDF receptor, we further performed bioluminescence resonance energy transfer (BRET) experiments on cultured human cells (HEK293T) using an Epac-based cAMP-sensor (Epac-L). The cAMP response data revealed that the transfected Er-PDFR is activated by both PDF peptides of E. rowelli, Er-PDF-I and Er-PDF-II, in a dose-dependent manner (**Figures 3A,B**). In contrast, the myoinhibitory peptide Rm-MIP7, which is co-localized with PDF in circadian pacemaker cells of the cockroach Rhyparobia maderae [see (91)] and which we used as a control, did not stimulate the receptor of E. rowelli. The minimum concentration required for PDFR stimulation differs between the two PDFs: while Er-PDF-I showed an activation threshold at 10<sup>−6</sup> M, the threshold for Er-PDF-II was lower by one order of magnitude, at 10<sup>−7</sup> M, indicating a higher receptor sensitivity to Er-PDF-II (**Figures 3A,B**).
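The relation between the two activation thresholds can be expressed as a fold change. The following minimal Python sketch uses only the two threshold concentrations reported above; the function name `fold_difference` is hypothetical, and the calculation illustrates the arithmetic rather than the statistical analysis used in the study.

```python
import math

# Activation thresholds reported in the text (mol/L).
THRESHOLD_ER_PDF_I = 1e-6
THRESHOLD_ER_PDF_II = 1e-7


def fold_difference(higher_threshold: float, lower_threshold: float) -> float:
    """Return how many times lower the second threshold is than the first."""
    return higher_threshold / lower_threshold


fold = fold_difference(THRESHOLD_ER_PDF_I, THRESHOLD_ER_PDF_II)
orders_of_magnitude = math.log10(fold)
print(f"Threshold ratio: {fold:.0f}-fold ({orders_of_magnitude:.0f} order of magnitude)")
```

A lower threshold concentration means that less peptide is needed to elicit a response, so a tenfold lower threshold corresponds to a roughly tenfold higher apparent potency under this simple comparison.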

### Specificity Tests of Er-PDF-I, Er-PDF-II, and Er-PDFR Antibodies

To test the specificity of the generated Er-PDF-I and Er-PDF-II antibodies, we performed western blot analyses using different concentrations of synthetic peptides (**Figures 4A–D**). The western blots show a distinct band below 4.6 kDa for the Er-PDF-I antibody (tested on 500 ng of synthetic Er-PDF-I peptide) and the Er-PDF-II antibody (using 50 ng and 500 ng of synthetic Er-PDF-II peptide), respectively (**Figures 4A,D**). The control experiments reciprocally using the non-corresponding peptides revealed no signal (**Figures 4B,C**).

The specificity of the Er-PDFR antibody was tested using both the synthetic peptide and lysates of entire heads of E. rowelli at different dilutions (**Figure 5**). The western blot employing the synthetic PDFR peptide coupled to BSA exhibits a distinct band at ∼70 kDa and an additional band above 100 kDa. The western blots using different dilutions of lysates show distinct bands at ∼70 kDa and a considerably weaker band at ∼120 kDa in all three lanes (**Figure 5**). The lane carrying 15 µl of lysate exhibits an additional weak band at ∼250 kDa.

### Immunolocalization of PDF-I and PDF-II Peptides and Their Receptor

Immunolabeling against the Er-PDF-I and Er-PDF-II peptides revealed high numbers of somata and fibers in the brain and the ventral nerve cords of E. rowelli (**Figures 6A–D**, **7A–C,A′–C′**). In each brain hemisphere, Er-PDF-I and Er-PDF-II immunoreactive (ir) somata occur mainly in one large ventromedian (**Figures 7A′–C′**) and two dorsal groups, the latter comprising a smaller anterior and a larger median group (**Figures 7A–C**). While the somata of the anterior group innervate the anterior neuropil, the larger median group is associated with the central body and the central neuropil and is located closer to the dorsal surface of the brain (**Figures 6A,B**; **Supplementary Video 1**). The somata of the ventromedian group occupy mainly the median portion of the ventral perikaryal layer, in which they are situated at different

FIGURE 2 | Phylogenetic analyses to deorphanize the PDF receptors of E. rowelli. (A) Cluster analysis of ∼18,300 bilaterian G protein-coupled receptor genes (GPCRs) including 27 PDF receptor candidate sequences from E. rowelli. The position of the PDF receptor family is highlighted in red. (B) Maximum likelihood tree under a dataset-specific GTR+G model of amino acid sequence evolution of ∼1,000 class B GPCRs (including PDF receptors) obtained from the former cluster analysis. Note the occurrence of a putative PDF receptor from E. rowelli in a clade of protostome PDF receptors (highlighted in red). See Supplementary Figures 1, 2 for the full trees.

cAMP-sensor after exposure to different concentrations of the synthetic PDF peptides Er-PDF-I (A) and Er-PDF-II (B). Experiments without transfecting Er-PDFR (–ErPDFr), without Er-PDF stimulus (–PDF) and stimulating with an unrelated peptide (myoinhibitory peptide; +RmMIP7) were conducted as negative controls. Forskolin/IBMX (+Forsk./IBMX), stimulating adenylyl cyclase and inhibiting phosphodiesterases, respectively, was applied as a positive control in order to generate a massive increase of cAMP within the cell [reviewed in Prinz et al. (90)]. Box plots of eight biological replicates each (n = 8). Interquartile range (IQR) is delimited by the 25th (lower border) and 75th percentile (upper border); whiskers depict the 1.5-fold IQR. The median (50th percentile) is indicated by a horizontal line within the box, while squares (✷) demarcate the mean, and crosses (×) the minimum or maximum values, respectively.

depths (**Figures 6A,B**, **7A′,B′**). Notably, no Er-PDF-I-ir or Er-PDF-II-ir somata are found in the so-called hypocerebral organs, vesicle-like structures associated with the ventral cortex of each brain hemisphere (**Supplementary Figures 3A,B**). Additional groups of Er-PDF-I-ir and Er-PDF-II-ir somata occur in the ventral part of the deutocerebrum as well as in the connecting cords, which link the brain with the ventral nerve cords (**Figure 6C**). Within the ventral nerve cords, the somata are mainly located in the ventral and median perikaryal layers (**Figure 6D**). While most somata with a strong Er-PDF-II signal appear ventrally, those with a strong Er-PDF-I signal show a wide mediolateral distribution in each nerve cord.

In addition to somata, our immunolabeling revealed numerous Er-PDF-I-ir and Er-PDF-II-ir fibers, which are characterized by typical varicosities and occur in all major neuropils of the E. rowelli brain including the central neuropil, the central body, the mushroom bodies, the olfactory lobes, and the optic tracts and neuropils (**Figures 6A–C**). Despite the nearly ubiquitous distribution of Er-PDF-I-ir/Er-PDF-II-ir fibers in all brain neuropils, their density differs among specific regions. While the central neuropil shows a dense mesh of internal Er-PDF-I-ir and Er-PDF-II-ir fibers, the olfactory glomeruli are rather surrounded by them (**Figures 6A,B**). The lowest density of Er-PDF-I-ir and Er-PDF-II-ir fibers is found in the antennal tracts and the mushroom bodies, both of which exhibit fibers mainly in the periphery, except for the median lobe of the mushroom body, which additionally shows strong internal Er-PDF-I immunoreactivity (**Figure 6B**). In contrast, the hypocerebral organs do not contain any Er-PDF-I-ir or Er-PDF-II-ir fibers (**Supplementary Figures 3A,B**). Like in the brain, numerous Er-PDF-I-ir and Er-PDF-II-ir fibers are evident in the neuropils of each nerve cord (**Figure 6D**). Most fibers with a prominent Er-PDF-II immunoreactivity are confined to the ventral portion of the ventral cord neuropil, whereas those exhibiting strong Er-PDF-I signal are more widely distributed, thus reflecting the wide distribution of the corresponding somata in the perikaryal layer of the nerve cord.

To determine whether Er-PDF-I and Er-PDF-II are coexpressed at least in some cells and regions of the nervous system, we performed double labeling. Our data indeed show that both peptides are found in the same sets of neuronal somata and fibers within the brain and the ventral nerve cords (**Figures 7C**, **8A–H**). However, the signal intensity differs markedly between

FIGURE 6 | Immunolocalization of Er-PDF-I and Er-PDF-II in E. rowelli. Confocal laser scanning micrographs of vibratome sections. Dorsal is up in all images. Hatched line indicates median regions. Er-PDF-I-ir (magenta), Er-PDF-II-ir (green) and DNA-labeling (gray) from anterior to posterior through head (A–C) and trunk (D). Note similar distribution of either peptide in brain and ventral nerve cords. (A) Arrows indicate dorsal groups of somata in protocerebrum. Insets show large varicosities in PDF-immunoreactive fibers. (B) Arrowheads point to large ventral groups of somata in protocerebrum. Asterisks indicate four lobes of mushroom bodies. (C) Filled arrowheads point to somata in deutocerebrum. Empty arrowheads demarcate somata in connecting cords. (D) Cross sections of ventral nerve cords. at, antennal tract; cb, central body; cc, connecting cord; cn, central neuropil; dc, deutocerebrum; dl, dorsal perikaryal layer; ol, olfactory lobe; on, optic neuropil; ot, optic tract; vl, ventral perikaryal layer; vn, neuropil of ventral nerve cord. Scale bars: 50µm (A–D) and 500 nm (insets).

micrographs. Dotted lines indicate outline of brain. Anterior is up in all images. Er-PDF-I-ir (magenta), Er-PDF-II-ir (green), and co-localization (white). (A,B) Dorsal Er-PDF-I-ir /Er-PDF-II-ir cell group is subdivided in anterior (filled arrowheads) and median cell groups (open arrowheads). Ventrally located somata with strong Er-PDF-I immunoreactivity occur anteriorly and laterally (A′ ,C′ ), whereas those with strong Er-PDF-II immunoreactivity are located further posteriorly (B′ ,C′ ). (C) Note that all Er-PDF-I-ir and Er-PDF-II-ir somata of dorsal group appear in white, indicating co-localization of both peptides at similar levels. Scale bar: 50µm.

tissues and cells. For example, while Er-PDF-I and Er-PDF-II are both expressed at a high level in the dorsal groups of somata within the brain (**Figures 7C**, **8A–C**), the ventromedian groups show a more complex pattern, with Er-PDF-I exhibiting a stronger signal in the anterolateral and Er-PDF-II in the median regions (**Figures 7C′**, **8A,B,D**). A differentiated signal is evident even within individual cells. While some somata show a completely overlapping signal, others seem to express either Er-PDF-I or Er-PDF-II (**Figures 7C,C′**, **8A–D**). However, high resolution imaging confirmed the presence of both peptides in all labeled somata, although they seem to be expressed at different levels (**Supplementary Figures 4, 5**).

somata, albeit at different intensities. (C) Somata (arrowheads) and processes (arrows) of dorsal cell group exhibit equal intensity levels of Er-PDF-I-ir and Er-PDF-II-ir. (D) In contrast, somata and processes of ventral cell group occur in three variants: (i) Er-PDF-I-ir and Er-PDF-II-ir at equal levels (open arrowheads), (ii) Er-PDF-I-ir at higher level (arrows), and (iii) Er-PDF-II-ir at higher level (filled arrowheads). (E–H) Differences in expression levels of Er-PDF-I-ir and Er-PDF-II-ir are also seen in optic neuropil (E), inner lobe (black asterisk) as opposed to remaining lobes of the mushroom bodies (white asterisks) (F), somata of connecting cords (G), and somata and neuropil of nerve cords (H). an, anterior neuropil; ad, anterior division of central body; at, antennal tract; cb, central body; cc, connecting cord; cn, central neuropil; dl, dorsal perikaryal layer; ey, eye; mb, mushroom body; ol, olfactory lobe; on, optic neuropil; ot, optic tract; pd, posterior division of the central body; vl, ventral perikaryal layer; vn, neuropil of ventral nerve cord; vp, perikaryal layer of ventral nerve cord. Scale bars: 50 µm (A,B) and 20 µm (C–H).

Different signal intensities are also evident among the fibers. Like the somata, the fibers show three different expression patterns with either both peptides expressed similarly or one or the other occurring at a higher level (**Figures 8A–H**, **9A**). Differences are evident, for example, in the central body, which shows a stronger Er-PDF-II signal (**Figure 8B**), whereas the optic neuropil exhibits a higher expression of Er-PDF-I (**Figure 8E**). Double labeling of the mushroom body confirms the predominant occurrence of Er-PDF-I inside the median lobe (**Figure 8F**), which was detected by separately labeling the individual peptides (cf. **Figure 6B**). Likewise, the double labeling confirms the differences in the distribution of Er-PDF-I and Er-PDF-II in the neuropil of the ventral nerve cords, which shows a condensation of fibers with a strong Er-PDF-II signal in its ventral portion but a more ubiquitous distribution of fibers with a strong Er-PDF-I signal (**Figure 8H**).

Although the somata of the Er-PDF-I-ir and Er-PDF-II-ir neurons are exclusively found in the central nervous system of E. rowelli, it is worth mentioning that at least some of their fibers extend into the peripheral nervous system including various peripheral nerves (such as leg nerves, slime papilla nerves, tongue nerve, oral and pharyngeal nerves), the ring commissures, and the heart nerve (**Figure 9A**). In the heart wall, the fibers show the same differential expression pattern of the two peptides as in the remaining nervous system. While the fibers of the longitudinal, dorsomedian heart nerve express both peptides at a similar level, the associated transverse fibers exhibit a higher intensity of Er-PDF-II signal than Er-PDF-I. Remarkably, the high resolution confocal microscopy used for imaging reveals diverging patterns of Er-PDF-I and Er-PDF-II immunoreactivity even within the individual fibers in our specimens (**Figures 8C,D**, **9A**).

In addition to the Er-PDF-I and Er-PDF-II peptides, we immunolocalized the newly identified and deorphanized pigment-dispersing factor receptor (PDFR) of E. rowelli. As expected from its nature as a membrane receptor protein, the Er-PDFR immunoreactivity is generally found in the periphery of each cell (e.g., **Figures 9B,C**). Like the two analyzed PDF peptides, Er-PDFR is localized in a dorsal and a ventral group of somata within the brain (**Figures 10A–C**, **11A–C**). However, double labeling with either Er-PDF-I or Er-PDF-II revealed no co-expression of Er-PDFR in the dorsal groups of neuronal somata, although at least some PDFR-ir cells are either directly adjacent to or located in close proximity to the Er-PDF-I-ir/Er-PDF-II-ir somata (**Figures 10B**, **11B**). In contrast to the dorsal group, the Er-PDFR-ir somata of the ventral group show three different conditions: they are either (i) not closely associated with the Er-PDF-I-ir/Er-PDF-II-ir somata, (ii) directly adjacent to them, or (iii) co-express the two peptides and the receptor (**Figures 10C**, **11C**). The hypocerebral organs exhibit no PDFR-ir somata (**Supplementary Figure 3C**).

Besides the dorsal and ventral perikaryal layers of the brain, additional Er-PDFR-ir somata occur, for example, next to the olfactory glomeruli (**Supplementary Figure 6A**) and the median lobe of the mushroom body (**Supplementary Figure 6B**). Moreover, Er-PDFR is expressed in a few dorsolateral somata of the brain near the eye as well as in some somata of the optic ganglion itself (**Figures 10A, 11A**, **12A–C**). Er-PDFR-expressing somata are absent in the ventral nerve cords, but are abundant in the connecting cords (**Supplementary Figures 6C,D**). Here, double labeling solely revealed somata with either Er-PDFR or Er-PDF-I/Er-PDF-II signal, whereas co-expression of both the peptide and the receptor is not evident.

In addition to the described Er-PDFR immunoreactivity in the neuronal somata, the receptor is identifiable in the neuropils of the brain, the eyes and the ventral nerve cords (**Figures 10A**, **11A**, **Supplementary Figures 6A–C**). However, the signal in the neuronal processes is generally weaker and less defined than in the somata. Single fibers are thus difficult to follow and appear as numerous dots in each neuropil region (**Supplementary Figures 6A–C**).

Notably, in contrast to the PDF signal, Er-PDFR immunoreactivity is not restricted to the nervous system of E. rowelli but is additionally found in other cells and tissues (**Figures 9B,C**, **10A**, **11A**, **12A–D**). The most prominent signal outside the nervous system occurs in the visual system including the microvilli of the photoreceptor processes (**Figures 12A,B**) and the pigment granules [cf. (86)] of the supportive cells (**Figures 12C,D**). Er-PDFR immunoreactivity is further associated with membranes of certain blood cells (= hemocytes), which appear blue under the bright-field microscope (**Figures 9B,C**, **Supplementary Figures 7A,B**). These blueish Er-PDFR-ir hemocytes occur not only in the heart lumen but also in other regions of the circulatory system of E. rowelli (**Figure 9C**).

### DISCUSSION

### Deorphanization of the Onychophoran PDF Receptor

Using a combination of transcriptomic and phylogenetic analyses, we were able to identify a putative PDF receptor in the onychophoran E. rowelli. PDFRs are generally involved in cAMP-mediated signaling pathways and, in response to PDF, increase intracellular cAMP levels in vivo (92) and also in vitro, for example, when expressed in human embryonic kidney cells [HEK293; (57)]. We used the latter approach to clarify whether the identified receptor is functional and whether it responds to both peptides by measuring its activity in the presence of either Er-PDF-I or Er-PDF-II in vitro. This was done by detecting the cAMP-induced activation of an Epac-based sensor ["Exchange protein activated by cAMP"; (93)] using bioluminescence resonance energy transfer [BRET; (84, 90, 94)]. Our data revealed that Er-PDFR is activated by both peptides, although Er-PDF-II seems to be one order of magnitude more potent than Er-PDF-I. A similar result was obtained from nematodes (41), but the reason for the observed differences in stimulation of the PDF receptor by different ligands remains unknown. Taken together, the results of our phylogenetic analyses and BRET experiments suggest that the identified receptor is indeed functional and most likely represents the sole PDF receptor of the onychophoran E. rowelli. Our BRET data further suggest that Er-PDFR is activated by Er-PDF-I and Er-PDF-II in a distinct and dose-dependent manner.

### Specificity of the Generated Antibodies

The western blots conducted for testing the affinity of the newly generated antibodies against Er-PDF-I and Er-PDF-II revealed specific bands below 4.6 kDa, consistent with the expected molecular weight of 1.9–2.0 kDa for each peptide. Specificity tests using the two antibodies (anti-Er-PDF-I and anti-Er-PDF-II) against the respective non-corresponding synthetic peptide were negative, suggesting that the two antibodies are specific and do not cross-react.

Additional western blots for testing the affinity of the antibody generated against the PDF receptor of E. rowelli yielded more intricate but consistent results. For these tests, we used a synthetic PDFR peptide coupled to BSA together with a lysate of dissected heads at different dilutions. All of these tests revealed a distinct band at ∼70 kDa, corresponding well to the expected molecular weight of the synthetic PDFR peptide-BSA conjugate (∼68 kDa: 1.9 kDa Er-PDFR peptide fragment plus 66 kDa BSA) and of the entire receptor protein (61.06 kDa) in the head lysate. The additional bands at ∼120 kDa and ∼250 kDa may result from the formation of dimers and multimers by the receptor protein itself or by the BSA coupled to the synthetic peptide. We believe that this result does not compromise the specificity of the Er-PDFR antibody generated for this study, as it might be due to a technical artifact.
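The expected band positions follow from simple addition and multiplication of the component masses given in the text. The Python sketch below works through that arithmetic; it assumes a 1:1 peptide-to-BSA stoichiometry, and the tetramer is included only as one hypothetical example of a "multimer" (the text does not specify the oligomeric state). The function names are likewise illustrative.

```python
# Molecular weights taken from the text (kDa).
PEPTIDE_FRAGMENT_KDA = 1.9   # synthetic Er-PDFR peptide
BSA_KDA = 66.0               # carrier protein
RECEPTOR_KDA = 61.06         # full-length Er-PDFR


def conjugate_mass(peptide_kda: float, carrier_kda: float) -> float:
    """Mass of a peptide-carrier conjugate, assuming 1:1 stoichiometry."""
    return peptide_kda + carrier_kda


def oligomer_mass(monomer_kda: float, copies: int) -> float:
    """Mass of an oligomer built from identical subunits."""
    return monomer_kda * copies


conjugate = conjugate_mass(PEPTIDE_FRAGMENT_KDA, BSA_KDA)
conjugate_dimer = oligomer_mass(conjugate, 2)
receptor_dimer = oligomer_mass(RECEPTOR_KDA, 2)
receptor_tetramer = oligomer_mass(RECEPTOR_KDA, 4)  # hypothetical multimer

print(f"Peptide-BSA conjugate: {conjugate:.1f} kDa")
print(f"Conjugate dimer:       {conjugate_dimer:.1f} kDa")
print(f"Receptor dimer:        {receptor_dimer:.2f} kDa")
print(f"Receptor tetramer:     {receptor_tetramer:.2f} kDa")
```

Under these assumptions, the conjugate and the full-length receptor both fall close to the observed ∼70 kDa band, a receptor dimer lies near the ∼120 kDa band, and a conjugate dimer lies above 100 kDa, in line with the additional bands described in the Results.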

### Differential Expression of PDF-I and PDF-II Peptides in the Onychophoran Nervous System

Our immunolabeling using two specific antibodies against the PDF-I and PDF-II peptides of E. rowelli largely confirms the broad distribution of PDFs in the onychophoran nervous system demonstrated previously (38). In accord with these previous findings, we encountered PDF immunoreactivity in groups of neuronal somata in the ventral and dorsal protocerebrum, the deutocerebrum, the ventral nerve cords, as well as an elaborate fiber network within the brain, the ventral nerve cords and the peripheral nervous system including the heart nerve. Interestingly, we found that both peptides are co-expressed in all immunoreactive structures of E. rowelli, although their respective levels of expression differ at least in some cells and tissues. We were able to recognize three major patterns: (i) equal staining intensities of both peptides, (ii) higher Er-PDF-I signal, and (iii) higher Er-PDF-II signal. These different patterns were particularly evident in specific groups of somata within the brain (**Figure 13**).

Currently, we can only speculate about the functional significance of these observations. One possible reason for the differential expression of Er-PDF-I and Er-PDF-II might be their different functionality, as evidenced by our BRET assays with the deorphanized PDF receptor. Interestingly, the nematode C. elegans also possesses two pdf genes (albeit three PDF isoforms), which are co-expressed in certain cells including interneurons, and chemosensory and motor neurons (42). The PDF peptides in these cells seem to have opposing effects on locomotion, as PDF-1 deletion mutants and animals with overexpressed PDF-2 show the same atypical locomotory behavior (41, 49). Irrespective of whether or not similar opposing effects of the two peptides also exist in onychophorans, their coexpression in all immunoreactive cells of E. rowelli and at least in some cells of C. elegans suggests that this feature might have been present in the last common ancestor of Ecdysozoa.

The co-localization of the two PDF peptides in onychophorans and nematodes contrasts with what is known from arthropods that exhibit more than one PDF/PDH isoform such as decapod crustaceans. These animals express two to three PDH peptides (isoforms of either α-PDH or β-PDH), which are assumed to fulfill different functions based on their distinct localization (11, 43, 44, 46–48, 95–100). For example, while β-PDH II is expressed in the sinus gland of Cancer productus, and therefore most likely acts as a neurohormone, β-PDH is localized in the eyestalk, thus serving as a neuromodulator or neurotransmitter (11). It must be noted, however, that all crustacean PDH isoforms are derivatives and in-paralogs of pdf-I (**Figure 1**), whereas onychophorans and nematodes most likely inherited two pdf genes, pdf-I and pdf-II (pdf-1 and pdf-2 sensu 41), from the last common ancestor of Ecdysozoa (38). From the evolutionary point of view, the multiple isoforms of decapod crustaceans are a derived feature of this group.

### Immunolocalization of the Onychophoran PDF Receptor Indicates Different Signal Transduction Mechanisms and Supports Hormonal Control

Localization of the deorphanized PDF receptor in E. rowelli using a specific Er-PDFR antibody revealed prominent immunoreactivity associated with membranes of specific cells in different organ systems. In contrast to the two PDF peptides, PDFR is not expressed in the ventral nerve cords but only in the brain and the connecting cords (101) within the central nervous system. There are three types of PDFR-expressing cells in the brain including those (i) co-expressing PDFR and PDF-I/PDF-II, (ii) not co-expressing but directly adjacent to PDF-I/PDF-II-expressing cells, and (iii) not closely associated with PDF-I/PDF-II immunoreactive cells. This indicates three different mechanisms of signal transduction:

(i) Co-localization of PDFR and the two PDF peptides suggests autoreception, i.e., feedback on its own cell, which has also been reported from fruit flies and Madeira cockroaches (34, 102). In insects, however, the majority of PDF-expressing cells are autoreceptive and they are typically outnumbered by PDFR-expressing cells (60), whereas PDF-immunoreactive cells are more numerous than PDFR-expressing cells in the onychophoran brain.


the dorsal cell group (diagram on the left) express both peptides at equal levels, whereas those of the ventral cell group show a more elaborate pattern of expression (diagram on the right). While the anterior somata exhibit higher levels of PDF-I and the posterior ones higher levels of PDF-II, the median somata express both peptides at similar levels.

indeed characterized by numerous varicosities and axonal terminals, which are reminiscent of typical release sites (38).

The so-called hypocerebral organs (enigmatic, vesicle-like structures associated with the onychophoran brain) have been repeatedly proposed to play a neurosecretory role (104, 105). Although we cannot rule out this potential function, the lack of PDF-I, PDF-II, and PDFR immunoreactivity in their tissues suggests that the hypocerebral organs do not produce or secrete PDF peptides; neither do they seem to be involved in the PDF/PDFR system.

Apart from the central nervous system, prominent PDFR signal is associated with the visual system of E. rowelli. In particular, membranes surrounding the microvilli of the photoreceptor cell processes (rhabdoms) and the pigment granules of the supportive cells [cf. (106)] are highly immunoreactive. PDH has been described to induce retinal pigment dispersion in the eyes of crustaceans under bright light conditions (1, 3, 5, 6). Whether or not such pigment dispersion occurs in the onychophoran eye is unclear, but it seems unlikely due to the generally low resolution of vision in these animals (107) and the position of pigment granules at the base of the rhabdomeric layer (106, 108, 109). Regardless of in which direction the pigment granules would move within the supportive cells, their basal position prevents shading/protection of photoreceptor processes from incident light. Additional PDFR immunoreactivity within the eye occurs in a few somata of the optic ganglion [cf. (109)]. Whether this signal is associated with glial cells, as reported from D. melanogaster (61), remains to be clarified.

Most intriguingly, besides the nervous and visual systems of E. rowelli, we detected PDFR in membranes of specific hemocytes. These hemocytes appeared blueish under the bright-field microscope and were localized in different parts of the circulatory system including the heart. Up to five major types of hemocytes have been described from onychophorans based on their ultrastructure (110, 111), but beyond this little is known about their possible functions. At this point, we can only speculate that the blue-pigmented granules might be due to storage of hemocyanin (112) or, alternatively, might be associated with a role of these cells in immune defense, as hemocytes have been proposed to absorb dead ectodermal cells (113), which in E. rowelli contain blue pigment granules. Irrespective of whether these hemocytes are involved in the potential storage of hemocyanin or innate immune response, each of these roles might be controlled by the PDF/PDFR system depending on the amount of PDF peptides released into the hemolymph. This potentially novel function of PDF peptides associated with hemocytes requires further investigation. Nonetheless, the demonstrated occurrence of PDF receptor in cells of the visual and circulatory systems of E. rowelli clearly supports the suggested (38) hormonal role of PDF peptides in onychophorans.

### CONCLUSIONS

In this study, we deorphanized and immunolocalized the onychophoran PDF receptor and performed double labeling using specific antibodies against the two onychophoran PDF peptides. We further explored potential differences in the stimulation of PDFR by each peptide, revealing that PDF-II has a tenfold higher potency than PDF-I as an activating ligand. Double immunolabeling demonstrates that both onychophoran PDF peptides are co-expressed but their respective levels of expression show cell-specific variation. For example, some neurons express the same amount of both peptides, while others exhibit higher levels of either PDF-I or PDF-II. Whether this variation is due to cyclic changes, as for example in insects (33, 114, 115), remains to be clarified.

The detection of the onychophoran PDF receptor in cells that additionally express the two PDF peptides suggests autoreception, whereas spatial separation of PDFR- and PDF-expressing cells is consistent with hormonal release into the hemolymph (38). Hence, the PDF peptides of onychophorans might play a dual role, as hormones and as neurotransmitters or neuromodulators, similar to the PDH peptides of decapod crustaceans (7–12). Whether the PDF-releasing cells of onychophorans are light-responsive ultradian and circadian oscillators, as for example in the Madeira cockroach (116–119), is unknown. Future studies should therefore focus on clarifying whether there are any cycling patterns in the expression of the two peptides. Establishment of cell cultures (e.g., for in vivo calcium imaging) and corresponding behavioral assays would contribute to a better understanding of the PDF/PDFR system in Onychophora and the last common ancestors of Panarthropoda and Ecdysozoa.

### DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

### AUTHOR CONTRIBUTIONS

LH, CM, AW, FH, MS, and GM designed the research. LH conducted the BRET experiments and the phylogenetic analyses. CM, NM, and SH performed the immunolabeling experiments. NM, SH, SK, and SF carried out the specificity tests. CM, LH, NM, SH, and GM analyzed the data and wrote the first draft. All authors have read and approved the final manuscript.

### FUNDING

This work was supported by the German Research Foundation (DFG: MA 4147/10-1) to GM and by the University of Kassel, Programmlinie Aufbau Graduiertenprogramme (Biological Clocks: 1.10.01.EWN/E1) to MS, FH, and GM.

### ACKNOWLEDGMENTS

The authors are thankful to Noel N. Tait and Dave M. Rowell for help with the permits. Dave M. Rowell, Ivo de Sena Oliveira, Isabell Schumann, and Alexander Baer are acknowledged for their support with collecting the specimens. We gratefully acknowledge Alexander Meier for providing the Epac-L cAMP sensor and all members of the Department of Zoology, University of Kassel, for their assistance with animal husbandry. We are thankful to Thordis Arnold, Erik Machal, and Irmtraud Hammerl-Witzel for assistance in laboratory work. Vladimir Gross kindly proofread the manuscript. We thank the staff of the Department of Sustainability, Environment, Water, Population and Communities of the Australian Government for providing collecting and export permits.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2020.00273/full#supplementary-material

### REFERENCES

authentic β-PDH as a local neurotransmitter and β-PDH II as a humoral factor. J Comp Neurol. (2008) 508:197–211. doi: 10.1002/cne.21659


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Martin, Hering, Metzendorf, Hormann, Kasten, Fuhrmann, Werckenthin, Herberg, Stengl and Mayer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Comparative Aspects of Structure and Function of Cnidarian Neuropeptides

Toshio Takahashi\*

Suntory Foundation for Life Sciences, Bioorganic Research Institute, Kyoto, Japan

Cnidarians are early-branching animals in the eukaryotic tree of life. The phylum Cnidaria is divided into five classes: Scyphozoa (true jellyfish), Cubozoa (box jellyfish), Hydrozoa (e.g., Hydra and Hydractinia), Anthozoa (sea anemones, corals, and sea pens), and Staurozoa (stalked jellyfish). Peptides play important roles as signaling molecules in development and differentiation in cnidarians. For example, cnidarians use peptides for cell-to-cell communication. Recent discoveries show that Hydra neuropeptides control several biological processes, including muscle contraction, neuron differentiation, and metamorphosis. Here, I describe the structures and functions of neuropeptides in Hydra and other cnidarian species. I also discuss evidence that the so-called primitive nervous system of Hydra is more complex than generally believed, and how cnidarians use peptides for communication among cells in comparison with higher animals.

#### Edited by:

James A. Carr, Texas Tech University, United States

#### Reviewed by:

Hervé Tostivint, Muséum National d'Histoire Naturelle, France

Joao Carlos dos Reis Cardoso, University of Algarve, Portugal

> \*Correspondence: Toshio Takahashi takahashi@sunbor.or.jp

#### Specialty section:

This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology

Received: 03 March 2020 Accepted: 30 April 2020 Published: 27 May 2020

#### Citation:

Takahashi T (2020) Comparative Aspects of Structure and Function of Cnidarian Neuropeptides. Front. Endocrinol. 11:339. doi: 10.3389/fendo.2020.00339

#### Keywords: Hydra, Cnidaria, neuropeptide, metamorphosis, myoactivity, interstitial stem cell, neuron differentiation

### INTRODUCTION

Molecular phylogenetic studies show that Cnidaria is the sister group of Bilateria. Ancestral cnidarians diverged over 500 million years ago in animal evolution. Despite this long course of evolution, the nervous systems of cnidarians are differentiated (1). Cnidarian species are broadly classified into two groups according to their life cycles, the anthozoans and the medusozoans (1). Anthozoans live exclusively as polyps. Among medusozoans, Cubozoa and Scyphozoa live predominantly as medusae. Hydrozoans, on the other hand, usually alternate between these two forms, with the exception of Hydra and Hydractinia. Staurozoans live exclusively as polyps.

Cnidarians such as Hydra are composed of multiple cell types that represent the fundamental architecture of multicellular organisms. Hydra exhibits a simple body plan, with a head (hypostome and tentacles) at one end and a foot at the opposite end. The gastric region is located between the head and the foot. The body is composed of two layers, ectoderm and endoderm, which are separated by an extracellular matrix, the mesoglea. The cells of both epithelial layers also function as muscle cells. Hydra also has multipotent interstitial stem cells, which differentiate into nerve cells (2), nematocytes (2), gland cells (3), and germ cells (4). As a member of the Cnidaria, Hydra represents an attractive model for understanding axial pattern formation into head- and foot-specific tissues.

The nervous system of Hydra is simple and is composed of a nerve net that extends throughout the animal. The cnidarian nervous system is mainly peptidergic (5). Classical molecules such as acetylcholine also contribute to the Hydra nervous system (6).

Peptides play important roles as hormones and neurotransmitters, and they are involved in the maintenance of a variety of developmental stages. However, little is known about whether they are involved in differentiation and development. In Hydra, a theoretical model suggests that small molecules such as peptides are transported to establish morphogenetic gradients that regulate patterning processes. To systematically identify and characterize peptide signaling molecules, we started the Hydra Peptide Project (7). Using the strategy illustrated in **Figure 1**, many peptides were extracted and purified by successive steps of high-performance liquid chromatography (HPLC). Signaling peptides were identified by their effects on the gene expression profile of Hydra using differential display (DD)-PCR. Positive peptides were chemically synthesized, and the synthetic peptides were used in biological assays, including assays of behavior (muscle contraction), neuron differentiation, and other processes. Furthermore, the introduction of the Hydra Expressed Sequence Tag (EST) Project has enabled us to identify transcripts for novel peptides even more efficiently (**Figure 1**) (8).

The primary aim of the present review is to describe the structures and functions of peptide signaling molecules such as neuropeptides in cnidarians, especially in Hydra.

### CNIDARIAN NEUROPEPTIDES

### FMRFamide-Like Peptides (FLPs)

The peptide FMRFamide was originally purified from the cerebral ganglion of the clam Macrocallista nimbosa (9, 10). Other mollusks and members of most other phyla express peptides with a similar sequence. FMRFamides are categorized into two groups depending on their structural similarity to FMRFamide. The first category consists of FMRFamide-related peptides (FaRPs), which include multiple peptides with a C-terminal FMRFamide or FLRFamide sequence (11). The second category consists of FLPs, which are peptides that have only the RFamide sequence at their C-termini (12). Therefore, FaRPs and all other RFamide peptides are considered FLPs. Krajniak (13) provided an excellent review of FaRPs in invertebrates. This overview primarily focuses on cnidarian FLPs.
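The two-category scheme above amounts to a simple C-terminal motif rule. The following is a minimal sketch of that rule, assuming peptides are written in one-letter code with an `amide` suffix (and an optional `pQ` prefix for pyroglutamate). The function name `classify_rfamide` and the returned labels are illustrative and are not taken from the cited literature.

```python
def classify_rfamide(peptide: str) -> str:
    """Classify a peptide by its C-terminal motif.

    Hypothetical helper: expects a one-letter sequence ending in 'amide',
    optionally starting with 'pQ' (pyroglutamate).
    """
    suffix = "amide"
    if not peptide.endswith(suffix):
        return "not amidated"
    core = peptide[: -len(suffix)]
    # FaRPs end in FMRFamide or FLRFamide; they are a subset of the FLPs.
    if core.endswith(("FMRF", "FLRF")):
        return "FaRP (also an FLP)"
    # Every other peptide with a C-terminal RFamide counts as an FLP.
    if core.endswith("RF"):
        return "FLP"
    return "other"


# Example sequences that appear in this review.
for name in ("FMRFamide", "pQWLSGRFamide", "YVPGRYamide"):
    print(name, "->", classify_rfamide(name))
```

A rule of this kind only sorts sequences by motif; it says nothing about whether a given peptide is actually expressed or biologically active.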

A variety of FLPs are expressed in the evolutionarily ancient nervous system of cnidarians (**Table 1**). Peptides with GRFamide at the C-terminus have been found in a scyphozoan (the jellyfish Cyanea lamarckii) (15), three hydrozoans (Hydra magnipapillata, the hydromedusa Polyorchis penicillatus, and Hydractinia echinata) (16–21), and an anthozoan (the sea anemone Anthopleura elegantissima) (14), whereas peptides with TRFamide and/or RRFamide at the C-terminus have been described in another anthozoan (the sea anemone Nematostella vectensis) (22). All mature neuropeptides are produced through highly regulated secretory pathways. Usually, a neuropeptide precursor is incorporated as a preprohormone into the endoplasmic reticulum, where it is converted into a prohormone. The prohormone then moves to the Golgi apparatus for endoproteolysis and/or C-terminal amidation, which yields the final active peptide. FLPs have been identified in numerous cnidarians. A Calliactis parasitica cDNA includes 19 copies of Antho-RFamide (**Table 1**), two copies of FQGRFamide, and one copy of YVPGRYamide (24). Two cDNAs have been isolated from Anthopleura elegantissima; one cDNA includes 13 copies of Antho-RFamide (**Table 1**) and nine other FLPs; the second cDNA includes 14 copies of Antho-RFamide and eight other FLPs (25). Renilla koellikeri has 36 copies of Antho-RFamide (26). A Polyorchis penicillatus cDNA includes one copy of Pol-RFamide I (**Table 1**) and 11 copies of Pol-RFamide II (**Table 1**), in addition to another predicted FLP (27). In Hydra, RFamides are processed from three different preprohormones called A, B, and C (**Figure 2A**). Preprohormone-A includes six Hydra-RFamides (Hydra-RFamide I-VI) (**Figure 2A**) (**Table 1**) (19). Preprohormone-B has one copy of Hydra-RFamide I and Hydra-RFamide II and three probable Hydra-RFamides (Hydra-RFamide V, VII, and VIII) (**Figure 2A**) (19). Preprohormone-C has one copy of Hydra-RFamide I and seven copies of additional neuropeptide sequences (one copy of pQWFSGRFamide and six copies of pQWLSGRFamide) (**Figure 2A**) (19). In Hydractinia echinata, one copy of He-RFamide is present (**Table 1**) (21). In Nematostella vectensis, three FLPs [Nv-RFamide I and II and RFamide (ID:17)] are present (**Table 1**) (22, 23). Collectively, precursor-encoding cnidarian FLP cDNAs yield many neuropeptides with great structural diversity, suggesting that they have great functional diversity as well.

#### TABLE 1 | FLPs in cnidarians.

pQ, pyroglutamate.

Cnidarian FLPs control several functions, such as muscle contraction, feeding, sensation, reproduction, metamorphosis, and movement of larvae. Treatment of the sea anemone Calliactis parasitica with Antho-RFamide increases muscle tone, contraction amplitude, and contraction of slow muscles (28). In individual autozooid polyps of Renilla koellikeri, Antho-RFamide also leads to tonic contractions in the rachis and peduncle (29). In Hydra, Hydra-RFamide III mediates pumping of the peduncle in a dose-dependent manner (30).

FMRFamide activates a Na<sup>+</sup> channel identified in snails (31, 32). Three cation channel subunits of the degenerin (DEG)/epithelial Na<sup>+</sup> channel (ENaC) gene family were cloned from the freshwater polyp Hydra magnipapillata and designated Hydra Na<sup>+</sup> channel (HyNaC)2–4 (33). Subsequently, a novel subunit, designated HyNaC5, was cloned, and expression of the gene was shown to be co-localized with HyNaC2 and HyNaC3 at the base of the tentacles (34). Co-injection of HyNaC5 with HyNaC2 and HyNaC3 genes in Xenopus oocytes strongly enhances the current amplitude after peptide application and increases the affinity of the channel for Hydra-RFamide I and II (34). HyNaC2/3/5 is assembled into a functional heterotrimeric channel that is activated by Hydra-RFamide I with high affinity. The experimental data on HyNaCs suggest that secretion of Hydra-RFamide I and/or II induces tentacle contraction, perhaps during feeding (33, 34). Seven additional HyNaC subunits, HyNaC6-HyNaC12, were cloned, and all belong to the DEG/ENaC gene family (35). These subunits and the four originally identified subunits self-assemble in Xenopus oocytes to create 13 different ion channels that show high-affinity binding of Hydra-RFamide I and II. The HyNaC inhibitor, diminazene, slows tentacle movement in Hydra. Because Hydra expresses multiple peptide-gated ion channels with a restricted number of FLPs as ligands (35), FLPs may be important for fast transmission at the neuromuscular junction in cnidarians. The function of Hydra-RFamide IV in Hydra is unknown.

Highly specialized mechanoreceptor cells, called stinging cells or nematocytes, that are important for capturing prey and defense are present in cnidarians (36). Two- and three-cell synaptic pathways, including synapses between nematocytes and nearby nerve cells, are present in the epidermis of the sea anemone tentacles (37, 38). Cnidarian sensory function is probably mediated by FLPs, as evidenced by anti-FMRFamide and anti-RFamide antibody staining in the tentacles of four classes of cnidaria. Thus, FLPs probably mediate chemosensory regulation of cnidocyte discharge (39). The epidermal sensory cells of the spot ocellus in Aurelia are also positive for FMRFamide (40), which may inhibit spontaneous firing of nematocytes.

FLPs also play a key role in cnidarian reproduction, larval movement, and metamorphosis. Reproduction of colonial octocorals such as Renilla koellikeri occurs via spawning and exfoliation. Intact gamete follicles are released into the water during spawning. These follicles rupture during exfoliation, releasing the gametes. Antho-RFamide is present in ciliated neurons in the epithelium of follicles of Renilla koellikeri and induces exfoliation of the epithelium and subsequent release of the gametes into water (41). Light enhances the potency of Antho-RFamide (41).

The colony-forming marine hydroid, Hydractinia echinata, is closely related to freshwater Hydra. Fertilized eggs of this species undergo rapid cleavage divisions for about 1 day and develop into spindle-shaped planula larvae in about 3 days (42). Planula larvae are capable of migrating toward light (43), and they metamorphose into adult polyps when they receive appropriate environmental stimuli (44, 45). Hydra-RFamide I inhibits the migration of planula larvae, thus modulating phototaxis by inhibiting myomodulation (43). Metamorphosis is also inhibited by this peptide, leading to the suggestion that the function of endogenous FLPs is to stabilize the larval stage (46). Thus, FLPs may play a role in regulating the movement of planula larvae prior to metamorphosis, possibly linking movement to chemotactic or phototactic processes (47). Sensory neurons that express FLPs are present in planula larvae, suggesting that migration and metamorphosis of these animals may be mediated by secretion of endogenous neuropeptides in response to environmental stimuli.

#### TABLE 2 | GLWamide family peptides in cnidarians.

pQ, pyroglutamate; <sup>h</sup>P, hydroxyproline.

### GLWamides

GLWamides are characterized by certain features at their N- and C-termini. Most GLWamides have a GLWamide motif at the C-terminus (**Table 2**). Seven GLWamide peptides are found in Hydra, and they include X-Pro or X-Pro-Pro at their N-termini (**Table 2**) (7, 49). In the anthozoan Anthopleura elegantissima, Metamorphosin A (MMA), which is a member of the GLWamide family, has an N-terminal pyroglutamate (**Table 2**) (48). Both N-terminal modifications confer resistance to aminopeptidases (52).

GLWamide cDNAs are found in other cnidarians as well. A cDNA encoding a preprohormone with 11 immature peptide sequences, nine of which are unique, was cloned from Hydra magnipapillata (**Figure 2B**) (50). The corresponding gene includes one copy of Hym-53 (NPYPGLWamide), Hym-54 (GPMTGLWamide), Hym-249 (KPIPGLWamide), and Hym-370 (KPNAYKGKLPIGLWamide); two copies of Hym-248 (EPLPIGLWamide); and three copies of Hym-331 (GPPPGLWamide), as well as two additional putative GLWamides (Hydra-LWamide VI and VIII) (**Table 2**). Hydra-LWamide VIII is predicted from this cDNA and probably includes GMWamide at the C-terminus (50). A cDNA encoding GLWamides has been cloned from Hydractinia echinata (51) and includes one copy of He-LWamide I and 17 copies of He-LWamide II (**Table 2**). Two unique cDNAs have been cloned from the anthozoans Actinia equina and Anemonia sulcata (51). The Actinia gene includes one copy of MMA, Ae-LWamide IV, Ae-LWamide V, Ae-LWamide VI, and Ae-MWamide; two copies of Ae-LWamide I and Ae-LWamide III; and four copies of Ae-LWamide II (**Table 2**). In contrast, the Anemonia gene has one copy of MMA, Ae-LWamide II, and As-IWamide; two copies of As-LWamide II; and four copies of As-LWamide I (**Table 2**) (51). The preprohormones of anthozoans but not hydrozoans include MMA. The peptide is probably a prototype of the family (53). Two other peptides that are possibly generated from the preprohormones of Actinia and Anemonia are likely processed into -GMWamide (Ae-MWamide) and -GIWamide (As-IWamide) at their C-termini (**Table 2**). Whether these two peptides and Hydra-LWamide VIII belong to the GLWamide family is uncertain, as substitution of the Leu residue in GLWamide with Met or Ile results in loss of contractile activity in the retractor muscle of the anthozoan Anthopleura fuscoviridis (54).
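The sequence features that define the Hydra GLWamides (a C-terminal GLWamide motif and an N-terminal X-Pro or X-Pro-Pro) can be checked directly against the six sequences quoted above. The sketch below does this. The dictionary `HYDRA_GLWAMIDES` and the helper `describe_glwamide` are hypothetical names introduced here for illustration, and the regular expressions assume one-letter uppercase residue codes.

```python
import re

# Hydra GLWamide sequences as quoted in the text (one-letter code + 'amide').
HYDRA_GLWAMIDES = {
    "Hym-53": "NPYPGLWamide",
    "Hym-54": "GPMTGLWamide",
    "Hym-248": "EPLPIGLWamide",
    "Hym-249": "KPIPGLWamide",
    "Hym-331": "GPPPGLWamide",
    "Hym-370": "KPNAYKGKLPIGLWamide",
}


def describe_glwamide(seq: str) -> dict:
    """Report the sequence features named in the text (hypothetical helper)."""
    return {
        "c_terminal_GLWamide": seq.endswith("GLWamide"),
        # X-Pro: any residue followed by proline at the N-terminus.
        "n_terminal_X_Pro": re.match(r"[A-Z]P", seq) is not None,
        # X-Pro-Pro: any residue followed by two prolines.
        "n_terminal_X_Pro_Pro": re.match(r"[A-Z]PP", seq) is not None,
    }


for name, seq in HYDRA_GLWAMIDES.items():
    print(name, describe_glwamide(seq))
```

Peptides written with a lowercase `pQ` prefix, such as MMA, would not match the N-terminal patterns; this matches the text, where pyroglutamate is a separate N-terminal modification.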

The various species of Hydractinia generally live on hermit crab shells. The Hydractinia life cycle includes a planula larval stage but no medusa stage. After attaching to snail shells, planula larvae undergo MMA-induced metamorphosis and become polyps after about 1 week (48, 55). MMA thus works as a neurohormone to mediate development in addition to its roles as a neurotransmitter and neuromodulator. In Hydractinia serrata, Hydra GLWamides also cause polyp development from planula larvae (7, 49). A common GLWamide sequence is required to induce metamorphosis in Hydractinia, and the GLWamide terminus and amidation are essential and specific for inducing metamorphosis (56). Substitution of Gly in GLWamide with another common amino acid (except Cys) decreases or completely inhibits potency of the peptide, and substitution of Leu or Trp in GLWamide with another common amino acid (except Cys) partially or completely blocks its potency for muscle contraction in Anthopleura fuscoviridis (54). The precise mechanism of how these peptides induce metamorphosis remains to be determined. Bacteria in the environment produce a chemical that can induce larvae to undergo metamorphosis (48). This chemical signal probably affects sensory neurons in the planula larvae that secrete endogenous GLWamides to induce a phenotypic change in the surrounding epithelial cells. Hydra lack a larval stage and develop directly into adults from embryos, and thus, how GLWamide peptides function during early development in Hydra is unclear.

Motile planula larvae play a role in sexual reproduction in reef-building corals. These larvae undergo complex metamorphosis after adhering to a substrate, and a juvenile coral colony results. In Acropora, Hym-248 induces dose-dependent metamorphosis of nearly 100% of planula larvae into polyps (57). However, the effect of Hym-248 on metamorphosis is species-specific (57, 58). A Hym-248-specific receptor appears to exist in Acropora. The receptor may serve as a barrier to ensure specification in corals. In Hydractinia, by contrast, the peptide specificity of the receptors is loose. The possible receptors may share certain common sequences and binding sites. Hym-248-related peptide(s) are expected to be identified in Acropora.

In Hydra, all GLWamide peptides serve as myoactive peptides to activate sphincter muscle contraction and bud detachment (7). The sphincter muscle is involved in bud detachment. To test myoactivity in Hydra, nerve-free tissue of epithelial Hydra is typically used (59, 60). When normal Hydra that contain nerve cells are treated with the peptides, they exhibit the same effect as epithelial Hydra. GLWamides are synthesized and expressed in nerve cells (49) and thus function as neurotransmitters or neuromodulators at the neuromuscular junction. Hym-248, which is a Hydra GLWamide, induces both bud detachment and body elongation (49). The muscle processes of the ectodermal and endodermal epithelial cells in Hydra run perpendicular to each other. Hym-248 may bind to two different types of receptors, one that binds all types of GLWamides and one that specifically binds to Hym-248. Substance P (SP) is a highly conserved member of the tachykinin peptide family that is widely expressed throughout the animal kingdom (61). It binds to tachykinin receptors [neurokinin-1, 2, and 3 receptors (NK1R, NK2R, and NK3R)], which are G protein-coupled receptors (GPCRs). SP preferentially activates NK1R. This difference in specificity relative to other tachykinin peptides can be accounted for by the conformational flexibility of the short, linear peptides and their ligand binding affinity for the receptors (62). The features of both receptors for Hym-248 probably depend on the ligand structure and binding affinity for the receptors.

All GLWamide family peptides enhance retractor muscle contraction of Anthopleura (49). Nerve cells in the sea anemone retractor muscle stain strongly with a GLWamide motif-specific antibody, similar to the nervous system of Hydra (49).

In Hydractinia echinata, GLWamide and RFamide neuropeptides modulate planula larva migration. He-LWamide II, which is a GLWamide, induces migration by extending the active period (43). GLWamides and FLPs antagonize one another to modulate migration of Hydractinia echinata planula larvae.

In hydrozoan jellyfish, maturation of oocytes and spawning are initiated by light-dark cycles in natural conditions within 1 second (63). Exposure to Hym-53 for < 2 min is sufficient for oocyte maturation and spawning (64). Thus, neuropeptides function as hormones that modulate the first step that determines whether oocytes undergo irreversible meiosis after light exposure.

#### TABLE 3 | Hym-176, Hym-357, and their related peptides in Hydra.


### Hym-176 (APFIFPGPKVamide)

Hym-176 was newly identified as a neuropeptide (**Table 3**) (7, 65). The gene that encodes Hym-176 is strongly expressed in the neurons of the lower peduncle and weakly expressed in the gastric region (67). This peptide induces contraction of the ectodermal muscle in Hydra (65). This region-specific neuron subset correlates with the myoactivity of the peptide. Hym-176 has no effect on muscle contraction in Anthopleura, metamorphosis in Hydractinia, or oocyte maturation and spawning in Cytaeis. In addition, the gene encoding the peptide (Hym-176A) has been isolated only from Hydra (**Figure 2C**) (66, 67). Thus, the peptide appears to be species-specific.

The gene that encodes Hym-176 also encodes a second peptide, Hym-357 (KPAFLFKGYKPamide) (**Figure 2C**) (**Table 3**). This neuropeptide was identified in a screen for myoactive peptides (20). Detailed observations suggest that Hym-357 neurons activate other neurons to release neurotransmitters for induction of muscle contraction.

In a search for genes homologous to the one that encodes Hym-176, Noro and coworkers found four candidate genes in the freshwater polyp Hydra magnipapillata (66). No authentic Hym-176 is present in the four paralogues (**Figure 2C**) (66). The cDNAs Hym-176C and Hym-176D each encode one copy of a Hym-176 homologous peptide (**Figure 2C**) (**Table 3**). Hym-357 is encoded in both the gene that encodes Hym-176 and the gene that encodes Hym-176B (**Figure 2C**) (66). Hym-176C encodes Hym-690 (KPLYLFKGYKPamide), which is closely related to Hym-357 (**Figure 2C**) (**Table 3**) (20). Hym-176E appears not to encode Hym-176- or Hym-357-related peptides (**Figure 2C**). The functions of the peptides encoded by Hym-176C and D, including Hym-690, have not yet been characterized in Hydra.

### Hym-355 (FPQSFLPRGamide)

Hym-355 is a member of the PRXamide family of peptides that have PRXamide at their C-terminal region (**Figure 2D**) (**Table 4**) (68) and are subdivided into three groups in invertebrates: (a) neuropeptides that induce pheromone biosynthesis (70) and similar molecules, (b) small cardioactive peptides (71– 73), and (c) antho-RPamide (52) and similar molecules. Antho-RPamide (LPPGPLPRPamide) is located in neurons of sea anemones and induces tentacle contraction. Thus, the peptide is involved in neurotransmission. PRXamide peptides have been identified in many invertebrates. Hym-355 is homologous to members of sub-group (c), including LPPGPLPRPamide (Anthopleura elegantissima) (**Table 4**), AAPLPRLamide (Urechis unicinctus) (74), QPPLPRYamide (Helix pomatia), and pQPPLPRYamide (Helix pomatia) (75). GPRGGRATEFGPRGamide and GPRGGREVNLEGPRGamide both have PRGamide at their C-termini and are expressed in the sea anemone Nematostella vectensis (**Table 4**) (23). The gene encoding the PRGamides is expressed in neurons (23), indicating that the PRGamides are neuropeptides.

Oxytocin/vasopressin superfamily peptides are neuropeptides synthesized in the hypothalamus and secreted from the posterior pituitary gland in mammals. Whether cnidarians express oxytocin/vasopressin superfamily peptides remains an open question in the field of comparative physiology of nervous systems. Immunohistochemical staining suggests that oxytocin/vasopressin superfamily peptides exist in the Hydra nervous system (76, 77). Morishita and coworkers (78) purified two peptides, Hym-355 and SFLPRGamide, from Hydra magnipapillata using HPLC fractionation and immunologic assays. They demonstrated that the antigen for vasopressin-like immunoreactivity is Hym-355 in the Hydra nervous system. The C-terminal region of Hym-355 (PRGamide) is identical to that of vasopressin. Neither antibody against the two peptides discriminates one peptide from the other. Thus, Koizumi et al. (79) performed immunohistochemistry with an anti–Hym-355 antibody and demonstrated immunoreactivity in the nerve rings of Cladonema radiatum and Turritopsis nutricula. However, whether Hym-355 functions as a neurohypophysial hormone is not well understood.

MIHs, maturation-inducing hormones.

The tissue of Hydra undergoes continuous renewal (**Figure 3A**). The number of neurons remains constant. Two groups of peptides, Hym-355 and PW family peptides, regulate this state (7, 68, 80). PW family peptides share the same sequence of Pro-Trp and are identified as epitheliopeptides (81).

Hym-355 increases early neuron differentiation, and Hym-33H (AALPW) blocks neuron differentiation (68, 80). Simultaneous treatment with Hym-355 and Hym-33H results in a normal level of neuron differentiation. Taken together, the observations are consistent with a feedback model that modulates the homeostasis of neuronal differentiation in Hydra (**Figure 3B**) (68). This model suggests that Hym-355, which is synthesized by neurons, enhances early neuronal differentiation. To balance differentiation, epithelial cells produce PW peptides. A third factor, termed X in **Figure 3B**, may control synthesis and secretion of PW family peptides. Hym-355, PW peptides, and the putative third factor may work together to maintain a constant neuronal density in Hydra. Hym-355 induces interstitial stem cells to undergo neuron differentiation and also induces retractor muscle contraction in the sea anemone Anthopleura fuscoviridis (68).
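The feedback model described above can be illustrated with a toy simulation. Everything quantitative in this sketch is an assumption made for illustration only: the functional forms, the parameter values, and in particular the shortcut of letting PW peptide levels track neuron density (standing in for the uncharacterized factor X) are hypothetical and are not taken from the cited studies. The sketch only shows how an enhancer made by neurons and an opposing inhibitor could jointly hold neuron density at a constant level.

```python
def simulate_neuron_density(n0, steps=20000, dt=0.01,
                            k=1.0, a=0.5, b=2.0, c=1.0, d=0.1):
    """Euler integration of a toy feedback model (all terms are assumptions).

    n    : neuron density (arbitrary units), starting at n0
    a*n  : enhancement of differentiation by neuron-derived Hym-355
    b*pw : inhibition by epithelial PW peptides, with pw = c*n standing in
           for the unknown factor X that couples PW release to neurons
    d*n  : loss of neurons through continuous tissue renewal
    """
    n = n0
    for _ in range(steps):
        pw = c * n
        differentiation = k * (1.0 + a * n) / (1.0 + b * pw)
        n += dt * (differentiation - d * n)
    return n


# Start below and above the steady state and compare the end points.
low_start = simulate_neuron_density(0.5)
high_start = simulate_neuron_density(10.0)
print(f"from low start: {low_start:.3f}, from high start: {high_start:.3f}")
```

With these illustrative parameters, both runs settle at the same density, which is the qualitative behavior the homeostasis model predicts. Measured rates would be needed before any quantitative conclusion could be drawn.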

A member of the GLWamide family, Hym-53 (NPYPGLWamide) (**Table 2**), and Hym-355 induce oocyte maturation and spawning, but the effect of Hym-53 is stronger than that of Hym-355. Hym-355-like immunoreactivity is observed in neurons in Cytaeis (63). Possibly, neurons expressing Hym-53- and Hym-355-like peptides contribute downstream of light receptors in oocyte maturation and spawning in Cytaeis. Takeda and coworkers demonstrated that endogenous peptides including W/RPRPamide peptides are involved in oocyte maturation (**Table 4**) (69). RPRYamide, RPRGamide, WPRAamide, and RPRAamide may act as maturation-inducing hormones (MIHs) (**Table 4**) (69). Takeda et al. (69) also demonstrated that MIH peptides are synthesized by neurons in the gonad, and probably act on the oocyte surface. They propose that hydrozoan MIHs and neuropeptides are evolutionarily linked to regulate reproduction upstream of MIHs in bilaterian species (69).

### FRamide Family

During research aimed at systematic identification of peptide signaling molecules in Hydra (7), two novel neuropeptides, FRamide-1 (IPTGTLIFRamide) and FRamide-2 (APGSLLFRamide), were identified (**Table 5**) (82). Using the Hydra EST and genome databases (8), peptide transcripts and their genes can be identified rapidly. The two peptides and the single gene encoding both peptides were identified using this approach (**Figure 2E**).

FRamide-1 (IPTGTLIFRamide) and FRamide-2 (APGSLLFRamide) exhibit opposing effects even though they are encoded by the same gene. The former peptide evokes body column elongation due to endodermal muscle contraction, whereas the latter peptide evokes body column contraction due to ectodermal muscle contraction (82). Two explanations for these seemingly contradictory observations are possible. One possibility is that the release of each peptide is differentially regulated (83, 84), and the other possibility is that each peptide is processed in a different type of neuron (85). Additionally, the opposing effects of FRamide family peptides may reflect differences in ligand binding affinity for a single receptor (62). In higher animals, most neuropeptides bind to GPCRs that are localized at the target cell. To understand the opposite effects, identification of FRamide-specific receptors on the target cells is important.

molecules function together and/or separately to maintain the organism's lifestyle in response to stress stimulation, light reception, mechanical stimulation, and chemical stimulation.

### CONCLUSION

Neuropeptides released from nerve cells in response to a variety of stimuli are mandatory for fine-tuned regulation of behavior, reproduction, metamorphosis, and tissue maintenance (**Figure 4**). Here, I described 57 types of neuropeptides so far identified in cnidarians. However, the study of neuropeptides is still in its infancy. Additional novel peptides will likely be found (86), including neuropeptides, thus enabling elucidation of the mechanisms that regulate the physiology and development of cnidarians and increasing our understanding of peptide function in other species.

It is important to elucidate the functional interactions between neuropeptides and their receptors to verify their biological roles and evolutionary processes. However, receptors for the neuropeptides remain to be identified in Hydra and other cnidarians. Recently, Shiraishi and coworkers developed a machine-learning-assisted strategy for the identification of novel peptide–receptor pairs (87). Because they indicate that the strategy is broadly applicable, it is worth using it to expand the known receptor (especially GPCR) repertoire of Hydra and other cnidarians as far as possible. When neuropeptide-GPCR pairs are efficiently and systematically elucidated in the phylogenetically critical hydrozoan Hydra magnipapillata, Hydra will provide cnidarian perspectives on the evolution of GPCRs.

The cells of Hydra are well-characterized and belong to the epithelial cell lineage and the interstitial stem cell lineage (**Figure 3A**). However, knowledge of the molecules and biochemical mechanisms of the cells remains limited. Single-cell RNA sequencing sheds light on the complete molecular diversity of the cells in Hydra. Siebert and coworkers (88) applied this approach to the homeostatic adult Hydra. They drew a molecular map of the Hydra nervous system and opened the door to understanding the molecular basis of morphogenesis and regeneration in Hydra.

### AUTHOR CONTRIBUTIONS

TT wrote the original review manuscript draft.

### FUNDING

This work was supported by a Grant-in-Aid for Scientific Research (C) to TT (Grant Number 17K07495).

### ACKNOWLEDGMENTS

The author acknowledges a grant from the JSPS KAKENHI to TT (Grant Number 17K07495).






**Conflict of Interest:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Takahashi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Function and Distribution of the Wamide Neuropeptide Superfamily in Metazoans

#### Elizabeth A. Williams\*

*Living Systems Institute, University of Exeter, Exeter, United Kingdom*

The Wamide neuropeptide superfamily is of interest due to its distinctive functions in regulating life cycle transitions, metamorphic hormone signaling, and several aspects of digestive system function, from gut muscle contraction to satiety and fat storage. Due to variation among researchers in naming conventions, a global view of Wamide signaling in animals in terms of conservation or diversification of function is currently lacking. Here, I summarize the phylogenetic distribution of Wamide neuropeptides based on current data and describe recent findings in the areas of Wamide receptors and biological functions. Common trends that emerge across cnidarians and protostomes are the presence of multiple Wamide receptors within a single organism, and the fact that Wamide signaling likely functions across an extensive variety of biological systems, including visual, circadian, and reproductive systems. Important areas of focus for future research are the further identification of Wamide-receptor pairs, confirmation of the phylogenetic distribution of Wamides through large-scale sequencing and mass spectrometry, and assignment of different functions to specific subsets of Wamide-expressing neurons. More extensive study of Wamide signaling throughout larval development in a greater number of phyla is also important in order to understand the role of Wamides in hormonal regulation. Defining the evolution and function of neuropeptide signaling in animal nervous systems will benefit from an increased understanding of Wamide function and signaling mechanisms in a wider variety of organisms, beyond the traditional model systems.

#### Keywords: neuropeptide, wamide, myoinhibitory peptide, GLWamide, allatostatin B

#### Edited by:

*Lee E. Eiden, National Institutes of Health (NIH), United States*

#### Reviewed by:

*Meet Zandawala, Brown University, United States*

*Taka-aki Koshimizu, Jichi Medical University, Japan*

> \*Correspondence: *Elizabeth A. Williams e.williams2@exeter.ac.uk*

#### Specialty section:

*This article was submitted to Neuroendocrine Science, a section of the journal Frontiers in Endocrinology*

Received: *16 February 2020* Accepted: *01 May 2020* Published: *28 May 2020*

#### Citation:

*Williams EA (2020) Function and Distribution of the Wamide Neuropeptide Superfamily in Metazoans. Front. Endocrinol. 11:344. doi: 10.3389/fendo.2020.00344*

### INTRODUCTION

Neuropeptides are short peptidergic molecules released by animal neurons that act as modulators or hormones to regulate biological processes. These signaling molecules are notable for being present in the nervous system of early metazoans, and for their important functions in regulating animal behavior and physiology. Historically, neuropeptides have been named according to their function, or where function is unknown, according to repetitive conserved sequence motifs found in the precursor peptide. These naming strategies have the unfortunate consequence of often obscuring neuropeptide relationships across species or phyla. The Wamide neuropeptide superfamily is a striking example of this. Wamides are repetitive proneuropeptides that contain multiple cleavage sites flanking short, amidated active peptides with a conserved C-terminal tryptophan (W). This neuropeptide superfamily is of ancient origin, already present in the last common ancestor of cnidarians and protostomes (1). Depending on the species in which they were studied, Wamides have been referred to as myoinhibitory peptide (MIP), allatostatin B (ASTB), prothoracicostatic peptide (PTSP), WWamide, GLWamide, or metamorphosin A (MMA). Even the name "Wamide" is not ideal for defining this neuropeptide family, since neuropeptides from other families may also contain a C-terminal amidated tryptophan residue. For example, adipokinetic hormone (AKH) in some insect species, molluscan APGWamides and echinoderm luqins, and some insect short neuropeptide F's (sNPF) also have C-terminal Wamide motifs, but phylogenetic analyses indicate that these neuropeptides belong to the AKH/corazonin/ACP/GnRH superfamily, the RYa/luqin, and the sNPF/prolactin superfamilies, respectively (2–5), therefore they are not considered here. In this mini review, I aim to unite recent knowledge of Wamide expression and function in diverse phyla to improve understanding of their evolutionary history and identify remaining knowledge gaps where future research could enlighten the evolution of Wamide function and mechanism of action.
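The precursor architecture described above (repeated short peptides ending in tryptophan, each followed by a glycine that donates the C-terminal amide and by a basic cleavage site) can be illustrated with a small motif scan. This is a minimal sketch under stated assumptions, not a validated prediction tool: the precursor sequence is a made-up toy example, the function name `predict_wamides` is hypothetical, and the pattern only recognizes dibasic cleavage sites, whereas real precursors may also use monobasic or other sites.

```python
import re

# Hypothetical toy precursor (NOT a real sequence): a signal-like stretch,
# then three W-ending peptides, each followed by Gly (amide donor) and a
# dibasic cleavage site (KR, KK, or RR).
TOY_PRECURSOR = (
    "MKLSVLLA" "KR"
    "AWQDLNSAW" "G" "KR"
    "DPE" "RR"
    "GWQDMQGLW" "G" "KR"
    "SEE" "KK"
    "APNFHGSW" "G" "RR"
    "EL"
)

# Simplifying assumption: a candidate mature peptide starts right after a
# basic residue, contains 4-14 non-basic residues, ends in W, and is followed
# by G plus two basic residues. The G is consumed by the match because it
# would be converted to the C-terminal amide rather than retained.
WAMIDE_SITE = re.compile(r"(?<=[KR])([^KR]{4,14}W)G(?=[KR]{2})")


def predict_wamides(precursor: str) -> list[str]:
    """Return candidate W-amide peptide sequences (amide group implied)."""
    return [match.group(1) for match in WAMIDE_SITE.finditer(precursor)]


if __name__ == "__main__":
    for peptide in predict_wamides(TOY_PRECURSOR):
        print(f"{peptide}-NH2")
```

Because peptides from unrelated families can also end in an amidated tryptophan, as noted above, hits from a scan like this would only be candidates that still need phylogenetic and biochemical confirmation.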

### A BRIEF HISTORY OF WAMIDE DISCOVERY

The first Wamide to be discovered was locust myoinhibitory peptide (LOM-MIP). LOM-MIP was identified as a suppressor of visceral muscle contraction in hindgut and oviduct in 1991 (6). Subsequently, WWamide neuropeptide was identified in the mollusc Achatina fulica (7) and the first cnidarian GLWamide, known as metamorphosin A (MMA), was identified in Anthopleura elegantissima (8). Initially, both WWamide and MMA were considered novel peptides unrelated to insect MIPs. The term allatostatin B was used to describe MIPs that inhibited juvenile hormone III synthesis in the cricket Gryllus bimaculatus (9). In the silkworm Bombyx mori, MIPs were named "prothoracicostatic peptide" (PTSP), due to their inhibition of ecdysone synthesis in this species (10), although cricket ASTBs had already been found to inhibit ovarian ecdysteroid synthesis (11). The first crustacean Wamide was identified in a crab in 2005 and was noted to be related to insect ASTBs (12).

Fifteen years after the discovery of LOM-MIP, the term "Wamides" was first used to describe this neuropeptide superfamily in the construction of a metazoan neuropeptide database by Liu et al. (13), who grouped arthropod MIP/ASTB/PTSP, cnidarian GLWamides, molluscan WWamides and nematode MIPs within the Wamide family. Annelid Wamides were identified in 2011 by Veenstra (14), who noted their link to insect and mollusc Wamides. Genome analysis of the gastropod Lottia gigantea reinforced the link between molluscan WWamides and insect ASTB (15). Large-scale similarity-based clustering of metazoan neuropeptides revealed that Wamides form part of the ancient central cluster of repetitive proneuropeptides that give rise to short, amidated peptides, with representative sequences from annelids, molluscs, platyhelminths, nematodes, arthropods, and cnidarians (1).

## PHYLOGENETIC DISTRIBUTION OF WAMIDES

Following initial discoveries of Wamides through peptide purification and sequencing, recent large-scale genomic and transcriptomic analyses have enabled a more complete view of Wamide distribution throughout the animal kingdom (**Figure 1**). Wamide neuropeptides also occur in brachiopods (18), tardigrades and priapulids (19–21). Transcriptome analysis of xenacoelomorph neuropeptides did not find myoinhibitory peptide orthologs, although a putative MIP receptor was found in an acoel (22). However, a transcript fragment of a MIP can be found in a Xenoturbella bockii transcriptome dataset used to identify homeobox genes (21); thus, the presence of MIP signaling in xenacoelomorphs remains uncertain for now. Extensive transcriptome analyses recently confirmed the presence of Wamides in all molluscan groups, but also supported the absence of Wamides (ASTB) in phoronids, platyhelminths and rotifers, as well as ectoprocts and entoprocts (23).

Despite extensive sequencing data, Wamides have not been identified in deuterostomes, although phylogenetic analysis of GPCRs identified the orphan human receptors GPR139 and GPR142 as potential MIP receptor orthologs (24). Ligands for these receptors have not been confirmed in vivo but are suggested to be small peptides or amino acids, such as L-Trp (W) and L-Phe (F) (25, 26). Therefore, while a Wamide signaling system arose early in metazoan evolution, in a cnidarian/bilaterian ancestor, it appears to have been lost multiple times during evolution, including in ambulacrarians and tunicates. This trend also occurs within phyla: for example, although widespread among insects, Wamides have not been found in the honey bee, wasp, or leaf-cutting ants (27, 28). A precise picture of Wamide gain and loss throughout metazoan evolution requires further sequence data, particularly from lesser-studied spiralian and ecdysozoan phyla.

Wamide distribution among early branching metazoan phyla is of ongoing interest. Recent bioinformatic analyses indicate that among cnidarians, GLWamides are absent in class Scyphozoa, Staurozoa, and Octocorallia (29). The same analysis uncovers new G{A/V/T}Wamides in Cubozoa, Scyphozoa, and Staurozoa. How these new cnidarian Wamides relate to GLWamides and the rest of the Wamide neuropeptide superfamily awaits further investigation; however, it is interesting to note that an {A/V/T}Wamide C-terminal motif also occurs in arthropod, mollusc and annelid Wamides (**Figure 2**). In the placozoan Trichoplax adhaerens, an RWamide precursor was found through bioinformatic prediction (30). Placozoan RWamide strongly resembles cnidarian GLWamides (**Figure 2**); however, since some luqin/RYamide and sNPF/prolactin family neuropeptides also show a conserved RWamide C-terminal motif (4, 5), it is currently unclear to which superfamily the placozoan RWamide belongs. Similarly, although putative neuropeptide precursors were found in ctenophores, these did not have homology to other metazoan neuropeptides. However, since the cross-phylum conservation of neuropeptides can be limited to just a few residues (1, 31), the existence of ctenophore Wamides remains a possibility. No neuropeptide-like precursors have been identified in poriferans to date, despite neuropeptide-processing enzymes being present in the Amphimedon queenslandica genome (32). Further peptide and receptor characterization through mass spectrometry, large-scale receptor deorphanization and functional studies are needed to clarify the occurrence and evolution of Wamide signaling in poriferans, ctenophores, placozoans, and cnidarians.

… conserved structure of Wamides within each phylum. In the "GPCR" column, black boxes indicate the presence of a MIP/sex peptide receptor GPCR ortholog confirmed by receptor deorphanization assay; white boxes indicate that no orthologous GPCR has yet been biochemically confirmed. Phylogeny and presence of genome based on Bezares et al. (16), Figure 6, with authors' permission. Tree structure based on a phylogenomic study with Bayesian inference under the CAT + GTR + Γ4 model to suppress long-branch attraction artifacts (17).

### WAMIDE RECEPTORS

The most well-described Wamide receptor is a rhodopsin family G-protein-coupled receptor (GPCR). This was initially called the sex peptide receptor (SPR) after its first characterized ligand, the Drosophila sex peptide (33). Whereas sex peptides only occur in the family Drosophilidae and specifically activate the Drosophila receptor, Drosophila MIPs can activate orthologous receptors from other insect groups and from Aplysia californica, a mollusc (34). Following this discovery, it was deduced that the sex peptide receptor's ancestral ligands were in fact MIPs (34, 35), causing a reassignment of name to myoinhibitory peptide receptor. Sex peptides differ significantly from MIPs in their primary amino acid sequence; they are larger (36aa cf. 9-12aa), lack C-terminal amidation, and contain a disulfide bridge. Like MIPs, however, sex peptides contain a pair of tryptophan (W) residues which are predicted to stabilize a beta-turn secondary structure adopted by both peptides (34). The conserved tryptophans are required for receptor binding in both sex peptide and MIPs, indicating that they may interact with a common binding site on the receptor (34).

Orthologs of the MIP receptor have been biochemically characterized in arthropods [fruitfly, mosquito (34), tick (36), kissing bug (37), silkworm (35)], a mollusc (34), a nematode (38), and annelids (39). Receptor deorphanization assays with modified peptides show that the two conserved tryptophan residues are important for receptor activation. The C-terminal tryptophan residue is especially critical for receptor activation; when this residue is replaced with an alanine, all receptor activity is lost in both annelids and insects (34, 37, 39, 40).
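The substitution logic behind such modified-peptide assays can be sketched in a few lines. This is only an illustration of how single tryptophan-to-alanine variants might be enumerated, not the protocol of the cited studies: the peptide `AWQDLNSAW` is a made-up, MIP-like sequence, and the function name is hypothetical.

```python
def tryptophan_to_alanine_variants(peptide: str) -> dict[str, str]:
    """Return the wild-type peptide plus every single W->A substitution.

    Variant names follow the usual mutation shorthand, e.g. "W9A" means the
    tryptophan at position 9 (1-based) is replaced by alanine.
    """
    variants = {"wild_type": peptide}
    for index, residue in enumerate(peptide):
        if residue == "W":
            name = f"W{index + 1}A"
            variants[name] = peptide[:index] + "A" + peptide[index + 1:]
    return variants


if __name__ == "__main__":
    # Hypothetical MIP-like peptide with an N-terminal-side and a C-terminal W.
    for name, sequence in tryptophan_to_alanine_variants("AWQDLNSAW").items():
        print(f"{name:>9}: {sequence}")
```

In an assay of the kind described above, each variant would be tested against the receptor, with the C-terminal substitution expected to have the stronger effect.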

Unlike in protostomes, no cnidarian Wamide receptor has yet been identified. Receptor identification based on phylogenetic analyses alone may prove difficult, as most cnidarian GPCRs are more closely related to each other than to specific nephrozoan receptors (22). A large-scale combinatorial receptor deorphanization approach, such as that used to identify several novel peptide-receptor pairs in Platynereis dumerilii (41), would be useful in this endeavor. Without a known receptor, cnidarian Wamides cannot be definitively confirmed as orthologs of protostome Wamides. However, sequence similarity clustering (1), the conservation of the C-terminal GLWamide motif in nematode and some insect MIPs (**Figure 2** and **Supplementary Data**), the greater importance of the C-terminal tryptophan for receptor binding (see above), and the occasional occurrence of an N-terminal tryptophan in cnidarian GLWamides (e.g., Hydra vulgaris, **Supplementary Data**) support orthology.

The presence of multiple Wamide receptors in the same organism is emerging as a common theme. Caenorhabditis elegans has three receptors related to arthropod MIP receptors (38). In Drosophila melanogaster, loss of the MIP/SPR GPCR does not affect the function of MIP in appetite control or female mating behavior, indicating that MIP may act through one or more additional receptors (42, 43). In the hydrozoan Hydra magnipapillata, the GLWamide Hym-248 activates both endodermal muscle contraction and sphincter muscle contraction, whereas other GLWamides activate only sphincter muscle contraction (44). This additional function of Hym-248 suggests that a receptor specific only to this peptide is expressed in endodermal muscle, while a more generalist GLWamide receptor is expressed in sphincter muscle (45). These findings indicate that a common evolutionary strategy for the diversification of Wamide peptide function is through the addition of receptors with varying binding specificities.

A new family of Wamide receptors has recently been identified in the marine worm Platynereis dumerilii (46). These receptors are peptide-gated ion channels from the degenerin/epithelial sodium channel family. The MIP-gated ion channel (MGIC) receptor is paralogous to the FMRFamide-gated sodium channels found in snails and cnidarians (47, 48). The mechanism of MIPs binding to the MGIC receptor is similar to that of binding to the MIP GPCR in that the conserved tryptophans are essential for receptor activation. As with the MIP GPCR, replacement of the N-terminal tryptophan with an alanine reduced MGIC receptor activity, while replacement of the C-terminal tryptophan abolished MGIC receptor activity (46). Although all eleven mature MIP peptides from the same precursor activate both the MIP GPCR and the MGIC receptor, each mature peptide preferentially activates one of the receptor types, suggesting a mechanism for diversification of peptide function. Comparison of in vivo concentrations of the different mature MIPs to receptor activation concentrations determined in vitro would indicate whether different mature MIPs really activate both MGIC and GPCR receptor types in vivo. Phylogenetic analysis of peptide-gated ion channels identified nematode, amphioxus, and cnidarian sequences related to MGIC/FaNaC peptide-gated ion channels (46); however, these putative peptide-gated ion channels are yet to be assigned a peptidergic ligand through physiological assays.
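One way to frame the proposed comparison of in vivo peptide concentrations with in vitro activation concentrations is a simple Hill (dose-response) model, in which the fractional response at concentration c is c^n / (EC50^n + c^n). The sketch below assumes this model applies to both receptor types. The EC50 values and the peptide concentration are hypothetical placeholders chosen for illustration, not measurements from (46), and the function names are invented here.

```python
def hill_response(conc: float, ec50: float, hill: float = 1.0) -> float:
    """Fractional receptor activation (0 to 1) under a Hill model."""
    return conc**hill / (ec50**hill + conc**hill)


def preferred_receptor(ec50_by_receptor: dict[str, float]) -> str:
    """Return the receptor with the lowest EC50, i.e. the most sensitive one."""
    return min(ec50_by_receptor, key=ec50_by_receptor.get)


if __name__ == "__main__":
    # Hypothetical EC50 values (molar) for one mature peptide on each receptor.
    ec50 = {"MIP GPCR": 1e-8, "MGIC": 1e-6}
    # Hypothetical in vivo peptide concentration (molar).
    in_vivo_conc = 1e-8

    print("preferred receptor:", preferred_receptor(ec50))
    for receptor, value in ec50.items():
        activation = hill_response(in_vivo_conc, value)
        print(f"{receptor}: {activation:.3f} fractional activation")
```

With these placeholder numbers the peptide would half-activate the more sensitive receptor while barely activating the other, which is the kind of outcome that would argue against both receptor types being engaged in vivo by the same peptide.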

### KNOWN FUNCTIONS OF WAMIDES

Throughout metazoans, Wamides have a wide variety of functions and often carry out multiple functions within the same organism. One known function of Wamides shared between cnidarians and protostomes is the regulation of life cycle transitions. Cnidarian GLWamides induce larval settlement and metamorphosis in the hydroid Hydractinia echinata, as well as the larvae of several coral species, and the hydrozoan Clytia hemisphaerica (8, 49–52). Knockdown of the GLWamide precursor gene in Nematostella vectensis showed that GLWamide is not necessary for metamorphosis, at least in this species, but plays a modulatory role in determining metamorphic timing (53). Similar to cnidarians, treatment of larvae of Platynereis dumerilii with synthetic MIP peptide induces settlement (39). Both cnidarian and P. dumerilii Wamide-expressing cells are sensory-neurosecretory, suggesting that this neuropeptide plays a role in activating a settlement and metamorphosis program in response to specific environmental cues. However, the nature of the environmental cues that trigger Wamide peptide release, and the downstream signaling pathways activated or repressed by Wamides, require further characterization. Microarray and RNA-Seq studies of the GLWamide-treated branching coral Acropora millepora identified significant changes in transcription after peptide exposure (54, 55). These studies may be useful for identifying candidate genes and pathways for functional investigations of Wamide signaling networks acting in coral metamorphosis, now that tools for genome editing are available for coral (56). Curiously, the Hydra GLWamide, Hym-248, also promotes settlement and metamorphosis in two species of sponge (57). This suggests that although Wamides have not been identified in sponges, there may be some overlap between the signal transduction pathways involved in sponge and cnidarian metamorphosis, particularly at the level of neuropeptide receptors.

Similar to regulating marine invertebrate larval settlement and metamorphosis, in some insect species, Wamides regulate levels of metamorphic hormones. In the silkworm, PTSPs suppress ecdysteroidogenesis in the prothoracic glands (10, 35, 58), while in crickets, allatostatin B inhibits juvenile hormone production (59, 60). MIPs also regulate aspects of the behavioral sequence underlying ecdysis in D. melanogaster and Manduca sexta, as part of a peptidergic signaling cascade initiated by ecdysis triggering hormone (61–64). However, functions of Wamides during insect larval development and how Wamides regulate hormone production or release still require further investigation. Do Wamides also regulate hormones similar to juvenile hormone or ecdysone in the induction of marine invertebrate metamorphosis? Juvenile hormone and its precursor, methyl farnesoate, can regulate larval metamorphosis in polychaetes and barnacles (65, 66) and ecdysone signaling may play a role in crustacean and mollusc metamorphosis (67), however Wamide function in mollusc and crustacean larval stages has not yet been reported, nor has the effect of Wamides on specific marine invertebrate hormones.

Perhaps the most widely conserved function of Wamides is the regulation of muscle contraction. In arthropods, MIPs inhibit muscle contraction in the hindgut, oviduct and heart/pyloric system (6, 68–72). Contrary to their name, MIPs promote muscle contractions in annelid gut muscles (73). Cnidarian GLWamides also promote muscle contractions in the longitudinal (ectodermal) muscles and circumferential (endoderm) muscles (44). A specialization of the muscle contraction function of some Hydra GLWamides is in inducing the contraction of sphincter muscles, which causes detachment of buds from a parental polyp, thereby linking Wamides to the regulation of asexual reproduction in Hydra (74). Mollusc WWamides inhibit the phasic contractions of the anterior byssus retractor muscle in mussel, but potentiate contractions of the penis and radula retractor muscles in land snail, as well as inducing contractions of the radula protractor muscle in a gastropod (7). Studies of the effects of Wamides on muscle contraction have revealed that Wamide effects are dependent on the current physiological state of the system (71), the concentration of peptide applied (7, 75), the type of receptor activated (45), and whether the peptide acts pre- or postsynaptically (7).

In addition to the regulation of muscle contraction in the digestive system, Wamides have been implicated in a variety of feeding-related functions, primarily in insects. In the kissing bug and ticks, MIP expression in the salivary glands suggests a role in salivation (36, 37, 76, 77). Activating MIP neurons in D. melanogaster adults decreased food intake and body weight and reduced the sensitivity of starved flies toward food (42). Also in D. melanogaster, MIPs modulate attraction to polyamine food odors in mated females (78). This sex-specific modulation is through an autocrine signaling mechanism, with MIP and the MIP/sex peptide receptor both expressed in taste and olfactory neurons. Recent findings from C. elegans show a role for MIP in aversive gustatory short-term learning, and long-term learning of salt avoidance behaviors (38). MIP's function in sensing specific food cues in adult fly and nematode shows interesting parallels with the role of MIP in regulating marine invertebrate metamorphosis. In both cases, MIP is expressed in neurons with chemosensory morphology, and the detection of specific cues and release of MIPs activate a switch in state. Cues for marine invertebrate metamorphosis are also often associated with preferred food sources of their future adult stages (79).

Beyond the regulation of life cycle transitions, muscle contraction and feeding and digestive functions, MIPs may also play a role in diverse biological systems, including reproductive, visual, and circadian systems. In female D. melanogaster, MIP-expressing abdominal interneurons enhance mating receptivity in mated females (43). Additionally, MIP-expressing interneurons in the central brain form part of a mechanosensory circuit that informs female mating decisions (80). Also in D. melanogaster, some optic lobe neurons express MIP, although it is not yet clear if these regulate signaling in the visual or circadian system, or both (81). MIP expression in D. melanogaster cycles with circadian rhythm, in line with the role of MIP in maintaining a sleep-like state in adults (82). Spatial expression of MIP in cockroach brains also suggests a role in the circadian system (83, 84). MIP expression in insects has some similarities to GLWamide expression in jellyfish. For example, GLWamide expression is seen in the gonadal ectoderm cells of Clytia hemisphaerica and Cytaeis uchidae (85, 86), as well as the photoreceptive organs of the jellyfish Cladonema radiatum, Aurelia aurita, and Tripedalia cystophora (87). In both insects and cnidarians, further functional studies are needed to reveal the function of Wamides in reproduction, vision or circadian rhythm. These previous studies of spatial expression highlight the usefulness of detailed expression atlases that encompass different life cycle stages, sexes and whole-body analyses for additional species and phyla, to provide initial indications of Wamide function in distinct biological systems.

### CONCLUSIONS AND FUTURE DIRECTIONS

Wamides are clearly neuropeptides of significant importance to nervous system signaling, with a role in diverse biological systems throughout an organism's life cycle. The fact that Wamide signaling is lost in some species or phyla indicates that Wamides function within more complex networks of neuropeptide signaling, sometimes playing a modulatory but non-essential role, and their function may be taken on or replaced by other neuropeptides. Several similarities are seen in both the function and spatial expression of Wamides in cnidarians and protostomes, supporting the definition of this neuropeptide superfamily, however the identification of cnidarian Wamide receptors will further enlighten the evolution of Wamide signaling in metazoans.

The most detailed studies of Wamide signaling to date have been carried out in model organisms D. melanogaster and C. elegans and these studies show that subsets of MIP-expressing neurons are likely different cell types (e.g., sensory neurons vs. interneurons) responsible for different aspects of Wamide function. It is therefore important for future studies aiming to uncover mechanisms of Wamide signaling to develop methods for manipulating specific subsets of Wamide-expressing neurons. One approach to this is through the development of libraries of reporter constructs driven by different promoters with a range of cell-specificities. Further understanding of Wamide signaling can also be achieved by analysis of genes co-expressed in Wamide- and Wamide receptor-expressing cells and more widespread analyses of Wamide receptor expression. These analyses can indicate mechanisms of signaling, such as autoregulation in cells expressing both Wamide and receptor, or association with specific neurotransmitters, such as GABA or acetylcholine. They can also be used to generate maps of potential signaling cascades, as in insect ecdysis (61), through the comparison of expression of other neuropeptides and receptors. Receptor-ligand expression mapping based on single cell transcriptome data, as in (88), which should then be functionally tested, will facilitate these analyses.

Another aspect of Wamide signaling of importance for future studies is the identification of signaling cascades activated following Wamide release and receptor binding, in terms of which class of G protein is recruited, if indeed the Wamide signal is GPCR-mediated, and which downstream signaling pathways are activated or repressed. Again, single cell RNA-Seq analyses or precise spatial expression mapping can be used to identify which elements of the signaling pathway are present in the target cells of Wamide signaling. With both distinctive and conserved functions, the Wamide superfamily is an excellent model for studying the evolution of neuropeptide signaling and patterns of peptide-receptor coevolution in animal nervous systems.

### AUTHOR CONTRIBUTIONS

EW conceived the study and wrote the paper.

### FUNDING

EW is supported by a BBSRC David Phillips Fellowship (BB/T00990X/1).

### ACKNOWLEDGMENTS

The author is grateful to members of the Jékely lab for informative discussions and valuable advice, and the two referees whose constructive comments improved the manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2020.00344/full#supplementary-material

### REFERENCES

a deuterostome. Sci Rep. (2018) 8:7220. doi: 10.1038/s41598-018-25606-2

in the hydrozoan jellyfish Cytaeis uchidae. Mol Reprod Dev. (2013) 80:223–32. doi: 10.1002/mrd.22154


**Conflict of Interest:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Williams. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.