Losing Genes: The Evolutionary Remodeling of Cetacea Skin

The skin is a multi-layered organ, often displaying associated structures, that establishes a protective interface between the organism and the surrounding environment. In mammals, the skin provides a physical and immune barrier, while contributing to thermoregulation and water balance. Within cetaceans, the archetypal mammalian skin was drastically reshaped and remodeled, emerging as a striking feature of their successful adaptation to a fully aquatic lifestyle. In fact, cetacean skin is extremely thick, displays a high cellular turnover rate, and lacks the typical mammalian pelage, as well as sebaceous glands, resulting in a smooth and drag-reducing skin. Curiously, at the genome level, the majority of cetacean skin-related innovations resulted from episodes of gene loss, spanning diverse processes such as skin keratinization and cornification, immunity and inflammation, or lubrication. Here, we review how the cetacean skin was shaped by such an evolutionary mechanism, by describing the full set of genes with inactivating mutations in the various functional compartments of the skin.


Innovations in the Cetacea Skin
The (re-)colonization of marine environments by mammals, such as Cetacea, more than 50 million years ago resulted in the appearance of an extensive array of evolutionary novelties (McGowen et al., 2014). These were critical for the successful adaptation of Cetacea ancestors to a novel environment. One of the most noticeable innovations encompasses the secondary modification of cetacean skin in terms of morphology, physiology and overall anatomy (Mouton and Both, 2012). For example, the epidermis is 50× thicker in cetaceans compared with their terrestrial counterparts, allowing higher resistance to external aggression and better maintenance of homeostasis in water (Eckhart et al., 2019); the upper layers of the epidermis are not fully cornified (parakeratosis) (Morales-Guerrero et al., 2017); the skin is relatively smooth and sloughing is rapid (Hicks et al., 1985), while there is no scab formation (Mouton and Both, 2012). Additionally, the hypodermis is comparatively thick and fatty (Mouton and Both, 2012) and sebaceous and sweat glands, as well as pelage, are absent (Giacometti, 1967). The lipid skin compartment is also considerably different, as Cetacea possess a phospholipid-rich cornified layer allowing for waterproofing (Spearman, 1972), and lipokeratinocytes act as a physical barrier within a hypertonic environment, conferring buoyancy, streamlining, insulation and caloric properties on the skin (Menon et al., 1986). Importantly, Cetacea skin lacks the stratum granulosum and stratum lucidum, melanin is spread across all layers, and melanophage clusters are common in most species (Morales-Guerrero et al., 2017). Finally, keratohyalin granules are absent in the epidermis of some cetacean species (Pfeiffer and Menon, 2002; Strasser et al., 2015; but see Reeb et al., 2007).

Gene Loss as a Source of Innovation
The availability of high-quality genomes from several eukaryotic species during the last two decades, covering many evolutionary lineages, has allowed comparative studies to fully appreciate the power of gene loss in specific lineages and to identify which molecular pathways have been affected. Gene loss as a major evolutionary force in eukaryotic lineages was first proposed by Olson (1999) as a complement to other evolutionary processes such as mutation (amino acid alterations), duplication and translocation. More recently, various studies have provided a solid genomic foundation for the power and prevalence of gene loss events throughout Metazoa (Albalat and Cañestro, 2016; Fernández and Gabaldón, 2020; Guijarro-Clarke et al., 2020). Overall, a clear trend emerged suggesting that the mechanism is much more pervasive than previously assumed, and is not restricted to parasitic and symbiotic species as classically thought (Albalat and Cañestro, 2016). Gene loss is not necessarily the complete removal of a gene from a genome, but any change to it that leads to loss of function, that is, to inactive RNA or protein products. Furthermore, this evolutionary process may occur when the fitness cost of maintaining superfluous genes outweighs the cost of inactivating them. There are two main mechanisms by which a gene can be lost: a sudden event such as unequal crossing over leading to the physical removal of a gene from a chromosome; or an incremental process such as pseudogenization, where sequence errors accumulate after initial loss-of-function mutations (Albalat and Cañestro, 2016). Evolution by gene loss has been extensively described in bacteria, paralleling horizontal gene transfer in frequency (e.g., Lawrence and Roth, 1999), as the small size of their genomes allowed their composition to be compared much earlier than that of eukaryotic ones. Yet, gene loss has now been described in many types of organisms, including protists, fungi, plants and animals.
In Cetacea, the emergence of genome sequences in various species allowed in the past decade the identification of ancestral events of gene inactivation underscoring various lineage-specific phenotypic adaptations (e.g., McGowen et al., 2014, 2020; Sharma et al., 2018). Here, we review the contribution of gene loss in the molecular make-up of the unique skin phenotype of this iconic group of marine mammals: the skin lossOme.

Epidermal barrier

Keratins
The keratin cytoskeleton has been entirely remodeled in cetaceans. Cytoskeleton keratins are absent in marine mammals, but stress-inducible epithelial keratins are conserved in cetaceans and expressed in the skin (Eckhart et al., 2019). K2 and K9 are lost in Cetacea, manatee and some terrestrial mammals; Krt23 is absent in most cetaceans (Ehrlich et al., 2019a). While K1/K10 are lost in aquatic mammals, K6/K17 are conserved and expressed in the skin (Eckhart et al., 2019). K6/K16 or K17 are usually expressed in terrestrial mammals during the stress response in order to repair damaged skin, but in marine mammals they are continually expressed, having become a normal constituent of the skin. Together with the loss of DSG4 (see below), the hyperproliferation of keratinocytes resulting from the continual activation of K16/K17 may be responsible for the thickening of the skin, the rapid wound healing and the lack of scarring observed in cetaceans (e.g., Krützen et al., 2002). Sirenians have also lost the K1/K10 program, seemingly independently. Hair keratins were repurposed to form baleen or are expressed in the epithelium of the tongue (Ehrlich et al., 2019a). KRT24 is expressed in several epithelia including the cornea and the oesophagus (Ehrlich et al., 2019b). It originated in a common ancestor of amniotes and was lost in three mammalian clades: camels, a subclade of pinnipeds (eared seals and the walrus) and cetaceans (Figure 1; Ehrlich et al., 2019b). Krt24 remnants were found in several cetaceans, such as the bottlenose dolphin (Tursiops truncatus). Whales are also missing K3, as are rodents (Nery et al., 2014; Figure 1). Overall, the keratin gene complement (Type I and Type II) is severely impaired in Cetacea (Nery et al., 2014).

Differentiation, proliferation and adhesion
The cornified lipid envelope is a layer of long chain ceramides and fatty acids surrounding corneocytes. The epidermis-type lipoxygenase 3 (AloxE3) and the arachidonate 15-lipoxygenase (Alox15) are epidermal lipoxygenases, acting mostly in the skin (Figure 1). Functionally, they are critically involved in forming the cornified lipid envelope (Krieg et al., 2013). AloxE3 is an atypical lipoxygenase, important for skin barrier function (Krieg et al., 2013). The gene was lost most likely in two independent events in toothed and baleen whales (Sharma et al., 2018). Alox15 regulates inflammation and immunity, and has been associated with hair loss and skin integrity in mice (Kim et al., 2018). Coding disabling mutations were noted in killer whales (probably in all Cetacea), with a convergent event of inactivation in manatees (Figure 1; Huelsmann et al., 2019). Transglutaminase 5 (TGM5) cross-links structural corneocyte proteins, and is associated with peeling skin syndrome (Sharma et al., 2018). Previous studies have determined the loss of this gene before the split of toothed and baleen whale lineages (Figure 1; Sharma et al., 2018).
Filaggrin, an S100-fused type protein that associates with keratins and participates in the establishment of the skin barrier, has been lost in whales, but not dolphins. All other S100-fused type proteins have been lost in cetaceans (Strasser et al., 2015).
Desmocollin-1 (Dsc1) and its specific binding partner desmoglein-4 (Dsg4) encode components of desmosomes that mediate cell adhesion in the upper epidermis. Dsg4 has been associated with specific disease phenotypes such as inherited hypotrichosis (Kljuic et al., 2003). Importantly, mutations in this gene in mice disrupt desmosomal adhesion and alter keratinocyte behavior (Kljuic et al., 2003). Both genes have been lost at the base of the split of cetaceans into toothed and baleen whales (Figure 1; Sharma et al., 2018). Serpin B7 (SerpinB7) is a member of a protease inhibitor gene family, and it has been associated with plantar skin thickening (Nagashima-type palmoplantar keratoderma) in humans. Previous studies strongly suggest that it has been lost in the Cetacea stem lineage (Figure 1; Huelsmann et al., 2019).
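Statements such as "lost at the base of the split" or "lost in the Cetacea stem lineage" rest on the reasoning that an inactivating mutation shared by toothed and baleen whales is most parsimoniously dated to their common ancestor, whereas mutations confined to each suborder are compatible with independent losses. The following Python sketch is a simplified illustration of that logic only, not the method of the cited studies; the function names, species labels and mutation labels are hypothetical.

```python
def shared_mutations(mutations_by_species):
    """Return the inactivating mutations found in every species provided."""
    mutation_sets = list(mutations_by_species.values())
    if not mutation_sets:
        return set()
    return set.intersection(*mutation_sets)


def classify_loss(odontocetes, mysticetes):
    """Give a coarse, parsimony-style reading of where a gene was inactivated."""
    if not odontocetes or not mysticetes:
        return "unresolved"
    # A mutation present in all sampled species of both suborders
    # suggests inactivation before the two lineages diverged.
    if shared_mutations({**odontocetes, **mysticetes}):
        return "shared mutation: consistent with loss in the cetacean stem lineage"
    # Mutations shared only within each suborder suggest separate events.
    if shared_mutations(odontocetes) and shared_mutations(mysticetes):
        return "no shared mutation: consistent with independent losses"
    return "unresolved"


# Hypothetical toy data: all labels are invented for illustration only.
gene_a = classify_loss(
    {"dolphin": {"exon2_stop", "exon5_del1"}, "sperm_whale": {"exon2_stop"}},
    {"minke_whale": {"exon2_stop", "exon3_ins2"}},
)
gene_b = classify_loss(
    {"dolphin": {"exon1_del1"}, "sperm_whale": {"exon1_del1"}},
    {"minke_whale": {"exon4_stop"}},
)
print(gene_a)
print(gene_b)
```

A real analysis would additionally need to rule out assembly and alignment artifacts and to consider selection-rate estimates, which this sketch ignores.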
The gene psoriasis susceptibility 1 candidate 2 (Psors1C2) is within the psoriasis susceptibility locus 1, and is exclusively present in mammals. It was found to be inactivated in dolphins and whales by a frameshift mutation (Huelsmann et al., 2019).
We additionally investigated the coding condition of two novel genes: secreted Ly-6/uPAR-related protein 1 (Slurp1) and serpin family A member 12 (SerpinA12) (see Supplementary Material for details). Mutations in Slurp1 cause a rare palmoplantar keratoderma, Mal de Meleda (Fischer et al., 2001). Knockout mice lacking exon 2 of Slurp1 develop severe palmoplantar keratoderma characterized by increased keratinocyte proliferation, lipid droplets in the stratum corneum and a water barrier defect (Adeyo et al., 2014). In contrast, Vaspin, or SerpinA12, is an anti-inflammatory adipokine, and is mostly expressed in keratinocytes in skin. In psoriatic skin lesions, its expression is reduced compared with uninvolved skin (Saalbach et al., 2012). A detailed analysis unveiled a diverse set of ORF-inactivating mutations, along with other signs of gene erosion, in the SerpinA12 and Slurp1 genes across all cetacean species (Figure 2). The mutational landscape of SerpinA12 includes, among others: (i) the loss of exons 1 and 2 of the gene for the entire set of species belonging to the Delphinidae, Phocoenidae, Monodontidae, Pontoporiidae, Iniidae, and Lipotidae families; (ii) a conserved in-frame premature stop codon in exon 1 of Eubalaena japonica and Eubalaena glacialis, as well as a single-nucleotide deletion in the same exon for the Balaenopteridae and Eschrichtiidae species, with the exception of Balaenoptera musculus, Balaenoptera bonaerensis, and Balaenoptera acutorostrata scammoni, a pattern not consistent with the species tree topology (Figure 2); (iii) a conserved in-frame premature stop codon in exon 3 of Monodon monoceros and Delphinapterus leucas, accompanied by a single-nucleotide deletion in exon 4 of the same species; and (iv) a conserved in-frame premature stop codon in exon 4 of all Balaenopteridae and Eschrichtiidae species, except for E. glacialis, which displayed the loss of the same exon.
With respect to Slurp1, severe gene erosion was observed in Ziphius cavirostris and Mesoplodon bidens (Ziphiidae family), where no sign of the gene was found; a conserved single-nucleotide deletion was detected in exon 3 of the gene in all remaining species, except for Kogia breviceps; an in-frame premature stop codon was found in the same exon of B. acutorostrata scammoni, and an additional one was observed in the same exon of all screened Balaenopteridae and Eschrichtiidae species (Figure 2). Hippopotamus amphibius, in contrast, presented an intact version of both the SerpinA12 and Slurp1 genes.
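The classes of ORF-inactivating mutation listed above (in-frame premature stop codons and frameshifting single-nucleotide deletions) can be illustrated with a minimal Python sketch. It is not the annotation procedure used for this review (see Supplementary Material); the function name find_orf_disruptions and the toy sequences are hypothetical, and the check assumes a coding sequence that runs from the start codon to the terminal stop codon, with any alignment gaps written as "-".

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf_disruptions(cds):
    """List simple ORF-disrupting features found in a coding sequence."""
    seq = cds.upper().replace("-", "")  # drop alignment gaps, if any
    problems = []
    if not seq.startswith("ATG"):
        problems.append("start codon lost")
    if len(seq) % 3 != 0:
        # A net insertion/deletion that is not a multiple of three
        # shifts the reading frame downstream of the event.
        problems.append("frameshift (length not a multiple of 3)")
    # Split into complete codons only; a trailing partial codon is ignored.
    complete = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, complete, 3)]
    # Any stop codon before the last codon is premature.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon at codon {index + 1}")
            break
    if codons and codons[-1] not in STOP_CODONS:
        problems.append("terminal stop codon missing")
    return problems


# Hypothetical toy sequences, invented for illustration only.
intact = "ATGGCTGAATAA"          # ATG GCT GAA TAA
premature_stop = "ATGTAAGAATAA"  # second codon mutated to TAA
one_base_deleted = "ATGGCGAATAA"  # one base removed from the intact ORF

print(find_orf_disruptions(intact))
print(find_orf_disruptions(premature_stop))
print(find_orf_disruptions(one_base_deleted))
```

Exon loss, as reported for SerpinA12 in several families, cannot be detected from the sequence of a single species in this way and would require comparison against an intact reference such as the Hippopotamus amphibius ortholog.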

The Pilo-Sebaceous Gland

Sebum production
Cetaceans and some other mammal lineages do not have sebaceous glands. Several genes are known to be central for the unique lipid production in sebaceous glands, such as Mogat3, Dgat2l6, Awat1, Awat2, Elovl3, and Fabp9 (Lopes-Marques et al., 2019b). All of these genes present inactivating mutations in cetaceans, with intermediate profiles in other lineages lacking sebaceous glands such as hippopotamuses, pigs, manatees and elephants (Figure 1; Lopes-Marques et al., 2019a).

Hair follicle
The hairless gene (Hr) is a putative zinc finger transcription factor, with its protein product functioning as a transcriptional corepressor that interacts with nuclear receptors. Hr regulates basic hair follicle functions, regulating specific genes involved in hair morphogenesis and hair follicle cycling. In cetaceans, a series of apparent deletions and amino acid changes in important functional domains of Hr is present in Odontocetes (Figure 1), but Mysticetes have an intact open reading frame (Chen et al., 2013).
Fibroblast growth factors (FGF) are a family of highly conserved growth factors with 22 members in mammals. They have a variety of functions including hair growth in adults, and the evolution of these genes may be associated with aquatic adaptations in cetaceans (Imamura, 2014). Several FGFs, such as Fgf5, Fgf7, Fgf10, Fgf18, and Fgf22, are expressed in hair follicles and regulate hair growth. In Fgf22, nonsense and frameshift mutations have been detected, and the lack of heparin-binding sites, receptor interaction sites and localization signals has been noted, with other evidence corroborating the pseudogenization of the gene after the divergence of toothed and baleen whales (Figure 1; Nam et al., 2017).

Melanocortin receptor 5
Sebaceous glands have been lost or are poorly developed in Cetacea and other mammals, in association with a low density of hairs. It has been hypothesized that these features are adaptive specializations to aquatic environments, at least within Cetartiodactyla, where the only species with these features are cetaceans and their closest extant relatives, the hippopotamuses (see references in Springer and Gatesy, 2018). The melanocortin receptor 5 (MC5R) is expressed not only in sebaceous glands in association with hair follicles but also in other exocrine glands (e.g., Thiboutot et al., 2000; Springer and Gatesy, 2018). A detailed analysis of MC5R occurrence in mammalian genomes was unable to detect an ORF in the genomes of seven cetaceans (with some evidence of it being deleted in the Orca genome) and two Pholidota species (Figure 1; Springer and Gatesy, 2018). In the manatee and aardvark, there is an incomplete sequence of MC5R, and inactivating mutations occur in elephants, rhinoceros, colugo, gibbon, orangutan and other mammals which either show evidence for a lack of sebaceous glands or have specialized skin (Springer and Gatesy, 2018).

Skin Immune System

Inflammatory response
Cetaceans do not develop epidermal scabs when injured. The rapid renewal of skin layers minimizes the need for elaborate inflammatory mechanisms. In accordance, the skin-specific C-C motif chemokine ligand 27 (Ccl27) was found to be inactivated in Cetacea, as well as in Pholidota, Sirenia, Chiroptera, and Rodentia, by a number of insertions, deletions, and start and stop codon mutations (Figure 1; Lopes-Marques et al., 2019a). Ccl27 is produced by keratinocytes, particularly during inflammation, and mediates adhesion and homing of skin-infiltrating T cells. SerpinA12, mentioned above, may play a role in inflammation, being involved in a regulatory chain including KLK7 and interleukin-1β (Heiker, 2014).

Interleukins
The skin is often the first barrier against infection, and so it is not a surprise that the immune system plays an integral role in mammalian skin. Immunity genes are often subject to diversifying selection, as this increases the repertoire available to deal with a variety of pathogens. In Cetacea, several interleukin genes show signs of erosion (Figure 1) (e.g., Lopes-Marques et al., 2018). For example, interleukin 1F, comprising 10 interleukin genes, shows significant signs of inactivation, which may be due to the loss of the epidermal cornification program (Lachner et al., 2017); Il20 shows ORF-disrupting mutations in whales and dolphins, as well as in pig, mole rats and manatees (Lopes-Marques et al., 2018). Cetaceans do not have functional orthologs of Il36A, Il36B, Il37, and Il38 (Lachner et al., 2017). Other IL1F genes are conserved (Lachner et al., 2017).

Pyroptosis-related proteins
Gasdermins are a family of cell death inducers (Ramos-Junior and Morandini, 2017). Gasdermin D has pore-forming activity and has been implicated in pyroptotic death of macrophages, while Gasdermin A is expressed in the epidermis and hair follicles, and during epidermal differentiation before cornification of keratinocytes (Lachner et al., 2017). Comparative genomics showed conservation of GSDMD and inactivation of the GSDMA, GSDMB, GSDMC, and GSDME genes in cetaceans (Figure 1; Lachner et al., 2017).
Other pyroptosis-related proteins include caspase-14 (Casp14), Nlrp10 and Pydc1 (Lachner et al., 2017). Casp14 is lost in all cetaceans (Figure 1; Lachner et al., 2017). Nlrp10, Pydc1 and Casp14 are all transcriptionally upregulated during keratinocyte differentiation in humans, yet they lack functional orthologs in cetaceans. NLRP10 and PYDC1 are not conserved in several lineages, and have probably been lost in a common ancestor of cattle and cetaceans (Lachner et al., 2017).

Other
Cetaceans have lost KLK8, a protease with distinct roles in the skin and brain (Yoshida, 2010). In the skin, KLK8 takes part in keratinocyte proliferation and desquamation, and activates antimicrobial peptides in sweat (Hecker et al., 2017). It is lost in dolphins and some whales and in manatee (Figure 1; Huelsmann et al., 2019). As neither cetaceans nor manatees have sweat glands, the epidermal function of KLK8 most likely became obsolete as a consequence of the adaptation to the aquatic environment.

FUTURE PERSPECTIVES: CETACEA GENOMES AND THE ROLE OF GENE LOSS
The breadth and quality of genomic data available from multiple metazoan lineages has allowed the comparative genomics field to operate on a much wider scale, overcoming some of the limitations of earlier studies, where only a couple of genomes were compared and the order of events (e.g., whether a gene was lost or gained) was on many occasions complex to ascertain (e.g., Guijarro-Clarke et al., 2020). Since the description of the first set of cetacean genomes (Foote et al., 2015), many have followed, totaling at the moment 31 species with available resources in GenBank (https://www.ncbi.nlm.nih.gov/datasets/genomes/?txid=9721; e.g., Fan et al., 2019; Morin et al., 2020). Together with published genomes from related species, such as Artiodactyla, our capacity to understand the evolution of complex phenotypes has greatly improved. This is particularly relevant in the context of gene loss and the unique phenotype adaptations of Cetacea. Recently, a surge of studies has specifically demonstrated the relevance of disabling mutations in the coding region of genes as a source of adaptive success in land-to-water habitat transitions (e.g., Meyer et al., 2018; Huelsmann et al., 2019; Lopes-Marques et al., 2019b). This evolutionary trend of gene reduction has been particularly prevalent in the gene families involved in skin physiology (Figure 1), with a minimum of 36 described events of gene inactivation, which strongly correlate with specific Cetacea phenotypic traits (e.g., keratins and absence of pelage). Other significant aspects which deserve consideration as future research themes include: (i) whether lost genes in Cetacea species are evolutionary "novelties" in the mammal lineage or correspond to "older" gene families; (ii) whether less pathway-entrenched genes are more easily lost; and (iii) the incorporation of comparative analysis at the transcriptome level between Cetacea and other mammalian lineages as RNA-Seq resources become available.
In conclusion, the land-to-water habitat transition experienced by marine mammals such as Cetacea displays a clear genome signature in extant species, namely one involving a mutational path of gene inactivation.

AUTHOR CONTRIBUTIONS
GE and LA: manuscript draft, investigation, and formal analysis. MF, AM, ML-M, and RF: investigation. RR: conceptualization, manuscript draft, investigation, and methodology. LC: conceptualization, funding acquisition, manuscript draft, investigation, and methodology. All authors contributed to the article and approved the submitted version.

FUNDING
This research was developed under Project No. 32030, cofinanced by COMPETE 2020, Portugal 2020 and the EU through the ERDF, and by FCT (PTDC/BIA-EVL/32030/2017) through national funds. It was also supported by the strategic funding UIDB/04423/2020 and UIDP/04423/2020 through national funds provided by FCT.