Adaptation-Driven Evolution of Sirtuin 1 (SIRT1), a Key Regulator of Metabolism and Aging, in Marmot Species

The sirtuin protein family plays a role in the lifespan of various species and is involved in numerous key metabolic processes. To understand the evolutionary role of sirtuins in marmots, a long-living rodent species group with remarkable metabolic shutdown during hibernation, we conducted a phylogeny-based substitution rate analysis of coding genes based on genetic information of seven marmot species. We show that sirtuin 1 (SIRT1) has evolved under positive selection in the marmot lineage. We pinpoint three amino acid changes in four different marmot species that underlie the signal of positive selection and that may favor increased longevity in marmots. Based on a computational structural analysis we can show that all three substitutions affect the secondary structure of the same region in human SIRT1. We propose that the identified region is close to the catalytic domain and that the potential structural changes may impact the catalytic activity of the enzyme and therefore might be playing a functional role in marmot's extended lifespan and metabolic shutdown.


INTRODUCTION
Maximum lifespans of mammals differ greatly between species (Kowalczyk et al., 2020). Although the rodent species evolved a tenfold variability in lifespan (Gorbunova et al., 2014), the underlying molecular and evolutionary mechanisms are still little understood. Studies on ageing processes have identified the highly conserved sirtuin family (Zhao et al., 2020), whose effect on lifespan was first demonstrated in yeasts and is suspected to have a similar influence in mammals (Haigis and Sinclair, 2010).
The mammalian sirtuins have seven members (SIRT1 to SIRT7) (Frye, 2000) and are a protein family of nicotinamide adenine dinucleotide (NAD+)-dependent deacylases (Haigis et al., 2006). While they have a conserved catalytic core domain in common (Frye, 2000), they differ by cellular localization, functions, and substrates (Maissan et al., 2021). The best studied protein of this family is SIRT1 which is predominately a nuclear protein and acts as a transcription regulator and deacetylates certain histones such as H3, H4, and H1 (Toiber et al., 2011), but also modifies more than 50 non-histone proteins (Nakagawa and Guarente, 2014), such as p53 (Luo et al., 2001). Generally, it influences energy metabolism, cell survival, DNA repair, and tissue regeneration as well as numerous other important cellular processes (Bonkowski and Sinclair, 2016). Like SIRT1, SIRT6, and SIRT7 are located in the nucleus and involved in histone modification. While SIRT6 controls the expression of multiple glycolytic genes and is a regulator of glucose homeostasis (Zhong et al., 2010), SIRT7 is a positive regulator of RNA polymerase transcription and is linked to cell proliferation (Ford et al., 2006). In contrast, SIRT2 is primarily located in the cytoplasm where it regulates various metabolic enzymes. However, under stress and during mitosis, it also migrates into the cell nucleus where it regulates the cell cycle and cell differentiation (Vaquero et al., 2006). The other mammalian sirtuins SIRT3, SIRT4, and SIRT5 are mitochondrially localized. There, SIRT3 and SIRT5 influence a variety of metabolic enzymes (e.g., in fatty acid oxidation) (Verdin et al., 2010), while SIRT4 is active in amino acid metabolism by inhibiting glutamate dehydrogenase (Haigis et al., 2006).
In addition to the functions already mentioned, sirtuins, especially SIRT1, have been investigated for their role in caloric restriction (Finkel et al., 2009;Derous et al., 2021). It has long been known that calorie restriction significantly delays the aging process and extends life span in a variety of animal species, including rodents, and has been shown to reduce the incidence of age-related disorders (e.g., diabetes, cancer, and cardiovascular disease) in mammals (Bordone and Guarente, 2005). In humans, the association between SIRT1 expression and caloric restriction (Civitarese et al., 2007) as well as an age-related increase in SIRT1 expression intensity (Kilic et al., 2015) has been demonstrated. Furthermore show transgenic mice with brain-specific SIRT1 overexpression a significant delay of the aging process and an extension of life span (Satoh et al., 2013).
Since sirtuins play an important role in aging processes it has been postulated that the evolution of sirtuins may be associated with the heterogeneity of maximum lifespan between species (Tian et al., 2017). Despite the numerous analyses of sirtuin functions, the insights of evolutionary processes of this important protein family is fairly limited. Currently existing studies mainly focus on model organisms, such as mice (Vanhooren and Libert, 2013). Exceptionally, a recent study pointed out amino acid substitutions in beaver SIRT6 relative to mouse SIRT6 leading to lifespan extension (Tian et al., 2019). Based on these promising results we were intrigued whether positive selection might have been played a role in the evolution of sirtuin protein members in other rodent species. A sequence-based analysis of positive selection might reveal particular amino acid with potential impact on sirtuin catalytic activity and may explain an association with longevity. As the sirtuins are associated with dietary restricted longevity here we focused on a species group that experiences long periods of dietary restriction in form of deep hibernation and conducted a phylogeny-based substitution rate analysis based on genetic information of seven marmot species (Cardini and O'Higgins, 2004). We find that sirtuin 1 (SIRT1) has evolved under positive selection in the marmot lineage.

Data Retrieval
For the analysis of the seven mammalian sirtuin protein family members SIRT1 to SIRT7 we obtained coding DNA sequences, including several isoforms or different prediction entries from NCBI Genbank (Clark et al., 2015). The sequences were retrieved for human (Homo sapiens), mouse (Mus musculus), rat (Rattus norvegicus), American beaver (Castor canadensis), and guinea pig (Cavia porcellus). The coding DNA sequence for SIRT5 for rats could not be retrieved. Additionally, we obtained some sirtuin coding DNA sequences for marmots from Genbank, including sequences of SIRT1, SIRT3, SIRT5, and SIRT6 for yellow-bellied marmot (Marmota flaviventris) and the sequence of SIRT 5 for Alpine marmot (Marmota marmota marmota, Gossmann et al., 2019). A complete list of retrieved sirtuin sequences along with their respective identifiers can be seen in Supplementary Table 1 (see Supplementary Material). To conduct an analysis tailored to marmots, additional publically available coding sequences for SIRT1 to SIRT7 homologs were predicted. For this, coding sequences were predicted from genomic data based on sequence homology to the available Refseq sirtuin marmot sequences using Exonerate v2.2.0 (Slater and Birney, 2005). The genomes of the woodchuck (Marmota monax), the Himalayan marmot (Marmota himalayana) and the Vancouver Island marmot (Marmota vancouverensis) were retrieved from Genbank. Furthermore, coding sequences for the Mongolian marmot (Marmota mongolia), long-tailed marmot (Marmota caudata) and Altai marmot (Marmota baibacina) could be obtained by using re-sequencing information (Bai et al., 2019) (http://www. marmotdb.org/). Specifically, coding information were obtained by using the vcf 's information obtained from mapping short read data onto the Himalayan reference genome by Bai et al. (2019). Each species for which sequence data was used was included in the phylogenetic tree that was provided obtained from Timetree (Kumar et al., 2017).

Sequence Preparation and Selection Analysis
To prepare the obtained sequences a customized python pipeline was designed based on available scripts from Afanasyeva et al. (2018). In brief, it contained general automated preparation steps, e.g., renaming data files, adjusting identifier information and storing the coding sequences for each sirtuin SIRT1 to SIRT7 from each species. Note, that this pipeline could be easily extended to handle large scale data, such as entire proteomes or across large trees (Afanasyeva et al., 2018;Yusuf et al., 2020). Then, each coding DNA sequence was translated into the corresponding protein sequence using the standard genetic code. All sequences containing a stop codon within the sequence were disregarded for further analysis. The translated sequences were then aligned using MUSCLE v3.8 using default parameter settings (Edgar, 2004). These alignments were used for a manual inspection and limited to the selection of one sequence per species and protein. This was necessary, because alignment artifacts may artificially create a signature of positive selection (Afanasyeva et al., 2018). Criterions for the exclusions of sequences included sequences with missing fragments in comparison to the human canonical sequence as well as sequences with potentially erroneous intron exon boundaries. The identifier of the selected sequences which were retrieved from Genbank are listed in Supplementary Table 2, while the selected sequences of the predicted sirtuin sequences for SIRT1 are shown in Supplementary Table 3

(Supplementary Material).
Alignments of the selected and translated sequences where further processed using PRANK v.170427 using default parameter settings (Loytynoja, 2013) and critical residues were masked with ZORRO (Wu et al., 2012) when the confidence score was smaller than nine. Combined with the previous coding DNA sequences the results were used as input for PAL2NAL v14 (Suyama et al., 2006) to create coding DNA sequence alignments. Thereby, output files with renamed identifier were created as input for PAML (Yang, 2007). Prepared codeml files and phylogenetic tree files pruned down to the number of available proteins were then used as input for codeml of the PAML package.
To conduct a selection analysis based on substitution rates, codeml from the PAML package version 4.9j was used. Two model types were considered. First, a branch model comparison was executed to determine heterogeneity in dN/dS between the marmot clade and the rest of the tree (two-ratio model) relative to a model where a single dN/dS ratio (one-ratio model) was assumed. Secondly, branch-site comparisons were used to test for the evidence of positive selection within the marmot clade. We compared the branch-site model A to a model where a potentially positively selected site category was restricted to drift (e.g., dN/dS = 1). Furthermore, the site models model 1 (M1a, NearlyNeutral, 2 categories) (Wong et al., 2004;Yang et al., 2005) and model 22 (M2a_rel, PositiveSelection, 3 categories) (Weadick and Chang, 2012), that allow the ω ratio to vary among sites (among codons or amino acids in the protein), were compared to a Clade model C (Bielawski and Yang, 2004).

Selection Model Comparison
The codeml-generated output files were further processed using a customized python script. For the branch test comparison dN/dS values were extracted from the one ratio and two ratio models and twice the log likelihood difference was assumed to be χ 2 distributed with one degree of freedom. Same accounts for the three different branch-site model comparisons. Here, the dN/dS values were extracted for model A, model A fixed, model=1 (M1a) and model=22 (M2a_rel), as well as for Clade model C. Again, twice the log likelihood difference of model A against model A fixed (with ω = 1) was assumed to be χ 2 distributed with one degree of freedom. For model M1a against Clade model C d.f.=3 was used while one degree of freedom was used for model=22 (M2a_rel) against Clade model C. Furthermore, the NEB (Naïve Empirical Bayes) and BEB (Bayes Empirical Bayes) values were extracted from rst-files from the codeml output for data plotting.

Structural Prediction
A 3D model was calculated using the SWISS-MODEL (Guex et al., 2009) (with access on SWISS-MODEL Interactive Workspace, n.d.) a template for human SIRT1 protein structure from Protein Data Bank (4I5I) (Zhao et al., 2013) and the coding sequence of human SIRT1. With FirstGlance (FirstGlance in Jmol, n.d.) the model was visually processed. The influence of the substitution of selected amino acids regarding the secondary structure of the protein was analyzed using Jpred4 (Drozdetskiy et al., 2015) as well as Porter 5.0 (Torrisi et al., 2019) submitting the protein sequence of human SIRT1 in comparison to the same sequence with single amino acid substitution [with substitution of lysin (K) to threonine (T) in amino acid sites 499, serine (S) to leucin (L) in amino acid sites 508 and substitution of arginine (R) to isoleucine (I) in amino acid sites 613] deducted from the marmot sequences of M. sibirica, M. himalayana and M. caudata, and M. flaviventris, respectively.

Branch Test: Differences in Evolutionary Rates in Marmot Clade
Differences in the evolutionary rates, recovered from betweenspecies comparisons, may reveal molecular footprints of selective pressures. Here we investigated heterogeneity in the rate of molecular evolution of sirtuins between the marmots and other mammalian species (Figure 1).
The ratio of non-synonymous to synonymous substitutions ω = dN/dS was used as a measure for the rate of molecular evolution at the protein level. The dN/dS ratio indicates the driving selective forces in the evolutionary process, as elevated dN/dS ratios above one suggest the influence of positive selection (ω > 1) while a low ratio (ω < 1) reveals a process driven mainly by negative selection (Gossmann and Schmid, 2011). The comparison of the estimated evolutionary rate between the one ratio (single dN/dS ratio for all considered species) and two ratio model (different dN/dS ratio for marmot clade) revealed higher dN/dS ratios in the marmot clade relative to the compared species for six of the seven sirtuin proteins (SIRT1, SIRT3, SIRT4, SIRT5, SIRT6, and SIRT7) ( Table 1).
Indeed, we find a significant difference of the d N /d S ratios between the marmot clade and the remaining tree for SIRT1. This shows that the marmot clade has a different evolutionary rate regarding SIRT1 than the remaining branches of the phylogenetic tree. The explanation for the higher dN/dS ratios could be relaxed purifying selection, but it could also suggest a possible influence of positive selection specific to marmots.

Branch-Site Test: SIRT1 Repeatedly Evolved Under Positive Selection
Since the branch test revealed an increased rate of molecular evolution in the marmot clade relative to other mammalian species, a possible role of positive selection events was further analyzed. To investigate positive selection in the evolution of SIRT1 in the marmot clade three branch-site model comparisons were executed. The branch-site model A was compared to the model A fixed where a potentially positively selected site category was restricted to drift (with dN/dS = 1). Furthermore, Clade C was compared to model 1 (M1a, NearlyNeutral) and also to model 22 (M2a_rel, PositiveSelection). All three comparisons revealed a significant difference (Figure 2), indicating the occurrence of positive selection events within the marmot clade.
A significant difference for the three different likelihood comparisons is assumed if twice the log likelihood difference  For the comparison of the estimated evolutionary rate between the one ratio (single dN/dS ratio for all considered species) and two ratio model (different dN/dS ratio for A: marmot clade and B: other mammalian species) a significant difference is assumed if twice the log likelihood difference (2 l = 2 · (l1 -l2) with l1 and l2 as the log of the maximum likelihood (ML) estimated of the two models) is χ 2 distributed with d.f. = 1.
(2 l = 2 · (l1 -l2) with l1 and l2 as the log of the maximum likelihood (ML) estimated of two models) is χ 2 distributed. A likelihood comparisons between the branch-site model A and model A fixed, where a potentially positively selected site category was restricted to drift (e.g., dN/dS = 1) was performed to test for the evidence of positive selection within the marmot clade (χ 2 distribution with d.f. = 1). Furthermore, results of a likelihood comparisons between Clade model C (Bielawski and Yang, 2004) and site models model 1 (M1a, NearlyNeutral, 2 categories) (Wong et al., 2004;Yang et al., 2005) (χ 2 distribution with d.f. = 3) and a comparison between Clade model C and model 22 (M2a_rel, PositiveSelection (3 categories)) (Weadick and Chang, 2012) (χ 2 distribution with d.f. = 1) differed significantly. To identify individual amino acid substitutions that were likely promoted by positive selection, we determined every amino acid's probability to belong to a class with either low (dN/dS < 1), neutral (dN/dS = 1) or a high dN/dS ratio (dN/dS > 1). As the different models revealed similar results, we focused on model A results. The gained NEB (Naive Emperical Bayes) probabilities show three amino acid sites with high posterior probabilities (> 75%) for site class 2 with ω = dN/dS = 1.68 (Figure 3), indicating that the footprints of positive selection can be pinpointed to particular amino-acid replacements. Similar results have been obtained for posterior probabilities based on BEB (Bayes empirical Bayes, Figure 3) probabilities. This approach accounts for sampling error (Yang et al., 2005) illustrating that the amino acids sites underlying the signal of positive selection can be robustly identified.
The three outlined amino acid sites presumably have undergone positive selection. The alignment of the analyzed sequences for SIRT1 shows the amino acid sites and the corresponding substitutions within the marmot clade (Figure 4). In respective to coding sequence of human SIRT1 the amino acid sites are 499 with substitution of lysin (K) to threonine (T) in M. sibirica, 508 with substitution of serine (S) to leucin (L) in M. himalayana and M. caudata and 613 with substitution of arginine (R) to isoleucine (I) in M. flaviventris.

Secondary Structure Prediction
Since we identified substitutions that presumably evolved under positive selection, we were interested whether we could identify a possible impact of these residues on the structural level. To investigate this effect further, a secondary structure prediction of human SIRT1 protein was performed along with in-silico mutated variants of the human protein, substituting the three residues we identified from the marmot species. Intriguingly, we find that a differently predicted substructure within the amino acid sites 518 to 523 of the human SIRT1 sequence is observed for each of the 3 residues (Figure 5). Interestingly the secondary structure of this 6 amino acid long sequence stretch can be predicted only with low confidence, but it is embedded in a coiled coiled region that can be confidently predicted. The observed effect on the secondary structure is also remarkable, because one of the mutations is 270 bp (90 aa) away from the affected amino acid stretch. While the human sequence is predicted with a helix structure for the mentioned amino acid sites, the sequences with the single substitutions each are predicted with a predominantly coiled structure instead. We conducted control mutations where we randomly substituted amino acid residues nearby the identified positions and did not find such an effect on the structure prediction for serine and arginine (Supplementary Figures 3E-G). We find that changes of nearby lysin residues to the observed mutations had a similar impact that lead to a coiled-coiled prediction in the target region (Supplementary Figures 3H-J).
This structural difference may alter the enzyme function. As Zhao et al. (2013) reported the catalytic domain of SIRT1 reaches up to the residue 516. Their estimated structure (shown in Supplementary Figure 1) which ends with the amino acid E 512 suggests a very close position of the subsequence QKELAY with amino acid sites 518 to 523 to the substrate bounded in the catalytic domain. Assuming a coiled structure for the subsequence (as predicted in Figure 5) instead of the shown strand in Supplementary Figure 1 it is conceivable that the subsequence QKELAY might have an influence on the substrate affinity.

DISCUSSION
The association of sirtuins and dietary restricted longevity has been at the center of ageing research for more than a decade (Dali-Youcef et al., 2007;Guarente, 2007;Imai, 2007). Nevertheless, little is known about the role of molecular adaptations of sirtuins especially in non-model animals experiencing long periods of dietary restriction under natural conditions. These studies hardly deviate from the usual models, such as yeast, mouse and human, leading to a limited amount of insights (Vanhooren and Libert, 2013;Mitchell et al., 2015). Our analyses, as well as work of Tian et al. (2019) with highlighted results found in beaver SIRT6 protein, underline the importance of considering animals outside the common model organisms.
To understand the influence of sirtuins on dietary restricted longevity, it is important to compare the different species with their different phenotypic characteristics regarding dietary FIGURE 3 | Quantitative representation of the Naive Empirical Bayesian (NEB) and Bayes Emperical Bayesian (BEB) class probabilities derived from the branch-site model comparison as a test for positive selection. The amino acid sites respective to human SIRT1 protein sequence fit into the classes with certain dN/dS ratios with the given posterior probabilities. Posterior probabilities for site class 2 indicate the probability for positive selection and are shown in red. restriction and longevity. Learning about the correlation of the genomic differences and phenotypic characteristics helps discovering the genetic basis for a long life span. In order to gain such important insights, species with extreme characteristics in the respective area, such as in this case animals with extreme dietary restriction, should be especially considered.
Here, we report differences in sirtuin evolution in the marmot clade relative to other mammalian species. We find a significantly increased rate of molecular evolution for one of the sirtuin family members, SIRT1. Increased rates of d N /d S might stem from relaxed purifying selection, or reflect an overall lower effective population size in the marmot lineage (Gossmann et al., 2019). We tested a third hyporthesis that could drive this pattern: positive selection. Indeed, We identify three amino acid substitutions occurring in four specific marmot species but not in humans or the common model organism mouse. Interestingly, for two species, M. himalayana and M. caudata the same amino acid substitution may have appeared independently or may stem from common ancient genetic variation, illustrating the possibility of parallel evolution. The fact that significant differences were observed for three branchsite model comparisons underlines the robustness of the analysis and demonstrates the independence of the result from the model assumptions.
With the huge difference in dietary restriction between these species the results outline the importance of an even broader analysis. Our phylogeny-based substitution rate analysis presented here can easily be extended to sequences of many other species to learn more about sirtuins and their evolution. The investigation of other deep hibernators such as dormice, chipmunks, hamsters, fat-tailed lemurs, bats and hedgehogs (Geiser, 2011) could provide further findings. For this gaining further insights depend on the genome sequences of additional species. Large scale genome projects such as the Bat1K (Teeling et al., 2018) will allow for more phylogenetic analyses in the future. Furthermore, population genetic analyses can be performed to better understand the recent evolutionary mechanisms underlying lineage specific mutations.
FIGURE 5 | Relevant section of secondary structure prediction. Protein sequences for secondary structure prediction are (A) human SIRT1 protein sequence, and human SIRT1 protein sequences with substitution of (B) lysin (K) to threonine (T) in amino acid site 499, (C) serine (S) to leucin (L) in amino acid site 508, (D) arginine (R) to isoleucine (I) in amino acid site 613. Predictions were performed with Porter 5.0 (Torrisi et al., 2019). Secondary structures are divided into three classes: E, Strand; H, Helix; C, Coil; SS3, 3 class secondary structure; SS8, 8 class secondary structure, each followed by the line of confidence values for 3 and 8 class predictions, respectively (with 9, maximal confidence; 0, very little confidence).
As experimentally shown in the work of Tian et al. (2019), the discovered amino acid substitutions have an influence on lifespan, demonstrated by swapping the critical amino acids of mouse and beaver SIRT6. Similar experiments using mouse models and inserting the amino acid substitutions highlighted in this work could verify the effect on molecular mechanisms and life span. The protein changes observed are in regions of human Sirt1 involved in NAD binding and a potential phosphorylation site at residue 530 (Shan et al., 2017;Ling et al., 2018). It would be intriguing to investigate further whether this could indicate the regulation of the Sirt1 interaction networks (Elibol and Kilic, 2018) being under selection. Due to lack of quality sequencing data of the annotated marmot genomes that were used in this study, we could not analyse the region that is homologous to the N-terminus of human/mouse Sirt1. Since the SIRT1 N-terminal region that has been shown to interact with histone proteins (Vaquero et al., 2004) future investigation could shed further light on the functional implications of evolutionary transitions in SIRT1.
In conclusion, this work demonstrates the ability of phylogenetic studies to determine the signature of molecular adaptation, especially when looking at a range of species with specific life-history features, which is enabled by large quantities of genome information and the use of computational approaches. However we note, that studies on a comparable scale involving other species clades will be crucial to shed light on the evolutionary signatures of extreme life styles to deduct the mechanistic underpinnings of metabolic adaptation.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
NJ and TG contributed to conception and design of the study. NJ performed the analysis and wrote the first draft of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.