Genetic Analysis of Tryptophan Metabolism Genes in Sporadic Amyotrophic Lateral Sclerosis

The essential amino acid tryptophan (TRP) is the initiating metabolite of the kynurenine pathway (KP), which can be upregulated by inflammatory conditions in cells. Neuroinflammation-triggered activation of the KP and excessive production of the KP metabolite quinolinic acid are common features of multiple neurodegenerative diseases, including amyotrophic lateral sclerosis (ALS). In addition to the KP, genes involved in other aspects of TRP metabolism, including its incorporation into proteins and the synthesis of the neurotransmitter serotonin, have also been genetically and functionally linked to these diseases. ALS is a late-onset neurodegenerative disease that is classified as familial or sporadic, depending on the presence or absence of a family history of the disease. Heritability estimates support a genetic basis for all ALS, including the sporadic form of the disease. However, the genetic basis of sporadic ALS (SALS) is complex, with multiple gene variants acting to increase disease susceptibility, and is further complicated by interaction with potential environmental factors. We aimed to determine the genetic contribution of 18 genes involved in TRP metabolism, including protein synthesis, serotonin synthesis and the KP, by interrogating whole-genome sequencing data from 614 Australian sporadic ALS cases. Five genes in the KP (AFMID, CCBL1, GOT2, KYNU, HAAO) were found to have novel protein-altering variants and/or a burden of rare protein-altering variants in SALS cases compared to controls. Four genes involved in TRP metabolism for protein synthesis (WARS) and serotonin synthesis (TPH1, TPH2, MAOA) were also found to carry novel variants and/or gene burden. These variants may represent ALS risk factors that act to alter the KP and lead to neuroinflammation. These findings provide further evidence for the role of TRP metabolism, the KP and neuroinflammation in ALS disease pathobiology.

INTRODUCTION
Amyotrophic lateral sclerosis (ALS) is a devastating neurodegenerative disease caused by the loss of upper and lower motor neurons resulting in progressive muscle weakness, wasting, spasticity and eventual paralysis (1). Disease generally occurs between 50 and 60 years of age, and death usually occurs within three to five years from symptom onset, though survival can vary greatly (2). Ten percent of ALS cases are classified as familial, where there is clear evidence of a family history of disease, while the remaining 90% are considered sporadic (SALS), seemingly occurring at random in the population (3).
The genetics of ALS is heterogeneous, with over 40 genes and 850 variants now implicated as causal or associated with the disease (3,4). In European populations, approximately 60% of familial and 10% of SALS cases are attributed to a known causal mutation in these genes (4-6). Additionally, there is strong evidence of a complex genetic contribution to SALS. Studies on the heritability of the disease suggest that 40-60% of SALS risk may be attributed to genetic factors (7-9). A multi-step hypothesis has been described to explain the late onset and sporadic nature of ALS, whereby six 'steps' are required for disease onset to occur (10,11). These steps may include mutations, genetic risk factors, environmental exposures, or other unknown events. Recent genetic analysis identified genes with an increased load, or burden, of rare protein-altering variants in ALS cases. These included TBK1 and NEK1, as well as the known ALS genes SOD1, TARDBP and OPTN (12). Gene burden complements the multi-step hypothesis for the late onset of ALS, where the presence of genetic alterations may contribute to presentation of disease (10,11).
Tryptophan (TRP) is an essential amino acid that is either used for the synthesis of proteins, catabolised for the biosynthesis of serotonin and melatonin, or shuttled through the kynurenine pathway (KP) to produce nicotinamide adenine dinucleotide (NAD+). A single enzyme, tryptophanyl-tRNA synthetase, encoded by WARS (cytoplasmic) and WARS2 (mitochondrial), acts in the aminoacylation of TRP to its tRNA for protein synthesis; four enzymes are involved in serotonin synthesis, and 13 enzymes are involved in the KP (Figure 1). The KP enzymes act to generate several bioactive intermediates including kynurenine (KYN), kynurenic acid (KYNA), picolinic acid (PIC) and quinolinic acid (QUIN), as well as NAD+ (13). In physiological conditions, QUIN is usually in low abundance and rapidly transaminated into nicotinic acid, and ultimately NAD+. Under neuroinflammatory conditions, QUIN is an excitotoxin that is excessively produced by activated microglia in the brain (14), while KYNA and PIC, produced by astrocytes and neurons respectively, partly prevent QUIN toxicity (14,15). Increased QUIN levels can amplify neuroinflammation by stimulating neuronal release and inhibiting astroglial uptake of glutamate, leading to high extracellular glutamate and excitotoxicity, subsequent mitochondrial dysfunction, and activation of proteases (16).
Altered TRP levels and KP dysfunction have been linked to neurodegenerative diseases both genetically and functionally. Multiple mutations in WARS have been found to cause distal hereditary motor neuropathy, a form of motor neuron disease characterised by slowly progressive muscle weakness and atrophy (17,18). Protein-altering missense, nonsense and splicing variants present in KP genes have also been identified as associated with diseases such as multiple sclerosis, Parkinson's disease, schizophrenia, autism and others (19).
Neuroinflammation and the KP have been functionally implicated in neurodegenerative diseases including ALS (14), multiple sclerosis (20), Parkinson's (21), Alzheimer's (22), and Huntington's diseases (16). Altered levels of KP metabolites present in cerebrospinal fluid (CSF), serum and spinal cord tissues of ALS patients have been significantly associated with disease. CSF and serum levels of TRP, KYN and QUIN were found to be significantly increased, and serum PIC levels were significantly decreased, in ALS patients compared to controls (14). Similarly, KYNA levels in serum were found to be decreased in ALS patients with severe clinical status, as compared to controls. Conversely, in CSF, KYNA levels were lower in controls, indicating a difference in KYNA production between the CNS and blood, as well as the presence of immune activation (23). Additionally, increased levels of IDO1 (the first and rate-limiting enzyme of the KP) and QUIN were identified in the motor cortex and spinal cord of patients (14). KP metabolites (KPMs) also represent promising biomarkers for ALS progression [reviewed in (24)].
Although altered TRP metabolism, serotonin synthesis, the KP and neuroinflammation have all been functionally implicated in ALS, the contribution of variation in key genes from these pathways has not been reported. We aimed to determine the contribution of sequence variants in these genes to ALS through the identification of novel and rare protein-altering variants, and by performing gene burden analysis in a large cohort of Australian sporadic ALS cases.

Subjects
Six hundred and fourteen sporadic ALS cases were recruited through the Macquarie University Neurodegenerative Disease Biobank, Australian MND DNA bank (Royal Prince Alfred Hospital) and the Brain and Mind Centre (University of Sydney). All individuals provided informed consent for research participation as approved by the human research ethics committees of Macquarie University (5201600387), Sydney South West Area Health District and The University of Sydney. All sporadic ALS cases were of predominantly European descent, and were diagnosed with probable or definite ALS according to El Escorial criteria (25). Demographic characteristics of the cohort, such as sex, age of onset, and mutation status, were consistent with those of other European datasets, where a subset of patients carried mutations in known ALS genes including C9orf72, SOD1 and TARDBP or disease-associated variation in other ALS genes, as previously reported in McCann et al. (4). Control genotype data were ascertained from the non-neurological subset of non-Finnish Europeans (nNFE, n=51,592) from the Genome Aggregation Database (gnomAD) (26). Population-specific Australian control genotype data were ascertained through the Diamantina control dataset (AOGC, n=967) and the Medical Genetics Reference Bank (MGRB, n=1,144) (27). The AOGC dataset comprises whole-exome sequencing data from neurologically healthy Australians of predominantly Western European descent. The MGRB dataset comprises PCR-amplified whole-genome sequencing data from healthy Australians of >70 years of age with no history of dementia.

Data Processing
All sporadic ALS samples underwent whole-genome sequencing (WGS, Illumina 150bp PCR-free library, X-Ten sequencer) at The Kinghorn Cancer Centre (Sydney, Australia), as detailed by McCann et al. (4). Data were annotated to hg19 using ANNOVAR and included in silico protein prediction tools from the database for nonsynonymous SNPs' functional predictions (dbNSFP) v4.1a (4,28-30). Custom UNIX scripts were used to parse variant call format files for all variants in the target genes. RStudio v3.6.3 (31) was used for all subsequent analyses. Novel variants were considered accurate with base coverage equal to or greater than 25X, reference/alternate read

Variant Filtering and Pathogenicity Scoring
Filtering criteria were applied to identify qualifying variants present in WGS data for burden analysis (both heterozygous and homozygous variants were included). Qualifying variants were defined as those that alter the protein sequence, including missense, insertion or deletion, splicing and stop gain or loss variants, and that were rare in the population. Rare variants were defined as present at a minor allele frequency (MAF) equal to or less than 0.005, with the exception of the gnomAD nNFE controls, where a MAF equal to or less than 0.0001 was used due to the large sample size. Novel genetic variants were defined as those present in SALS, and absent from, or only present in a single individual in, all control datasets including the National Centre for Biotechnology Information (NCBI) dbSNP153 database (https://www.ncbi.nlm.nih.gov/snp/). The potential pathogenicity of novel gene variants was assessed using 12 functional prediction tools from dbNSFP, including SIFT, PolyPhen2-HDIV, PolyPhen2-HVAR, LRT, MutationTaster, MutationAssessor, FATHMM, PROVEAN, MetaSVM, MetaLR, M-CAP and CADD (29). The percentage of deleterious predictions was used to calculate a pathogenicity score, whereby a score of 1 indicates that 100% of tools predicted a deleterious effect. Meta-analysis prediction tools REVEL (nonsynonymous variants only) and BayesDel (nonsynonymous and splicing variants) were also noted from dbNSFP annotation, as these tools were recently found to outperform other in silico prediction tools (33-35). Pathogenic cut-off scores were 0.5 for REVEL and -0.057 for BayesDel. The splicing variants were analysed for functional effects using Human Splicing
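The rarity thresholds and the pathogenicity score described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study, which relied on custom UNIX scripts and R. The function names and dataset labels are hypothetical, and two choices are assumptions made for illustration: tools that return no prediction are treated as missing rather than benign, and scores at or above the REVEL and BayesDel cut-offs are treated as pathogenic.

```python
from typing import Dict, Optional, Sequence

# MAF thresholds taken from the text; the dataset label is a hypothetical key.
DEFAULT_MAF = 0.005
GNOMAD_NNFE_MAF = 0.0001

# Cut-offs quoted in the text for the two meta-predictors.
REVEL_CUTOFF = 0.5
BAYESDEL_CUTOFF = -0.057


def is_rare(maf: float, dataset: str) -> bool:
    """Return True if a variant counts as rare in the named control dataset."""
    threshold = GNOMAD_NNFE_MAF if dataset == "gnomAD_nNFE" else DEFAULT_MAF
    return maf <= threshold


def pathogenicity_score(predictions: Sequence[Optional[bool]]) -> Optional[float]:
    """Fraction of tools predicting a deleterious effect.

    Each entry is True (deleterious), False (tolerated) or None (the tool
    gave no prediction). Assumption: None entries are excluded from the
    denominator. A score of 1.0 means every tool with a call said deleterious.
    """
    called = [p for p in predictions if p is not None]
    if not called:
        return None
    return sum(called) / len(called)


def meta_pathogenic(revel: Optional[float], bayesdel: Optional[float]) -> Dict[str, Optional[bool]]:
    """Compare meta-predictor scores with their cut-offs (assumed inclusive)."""
    return {
        "REVEL": None if revel is None else revel >= REVEL_CUTOFF,
        "BayesDel": None if bayesdel is None else bayesdel >= BAYESDEL_CUTOFF,
    }
```

For example, a variant called deleterious by 8 of the 10 tools that returned a prediction would score 0.8 under this definition, regardless of how many of the 12 tools returned no call.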

Gene Burden
Burden analysis was performed on qualifying variants only, as defined above. For burden testing, the total number of qualifying variants per gene in sporadic ALS cases was compared to that of each control dataset separately. Fisher's exact test (from the R package exact2x2) was used for analysis. As 18 genes were analysed in this project, a Bonferroni-corrected significance threshold was applied (n=18, p=0.00278).
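The burden comparison and the corrected threshold can be sketched as follows. This is a pure-Python illustration, not the exact2x2 analysis used in the study. The 2x2 table layout (qualifying versus remaining allele counts in cases and controls) and the function names are assumptions. The two-sided p-value sums the probabilities of all tables no more likely than the observed one, which is one common definition and may differ from the option the authors used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Assumed layout: a = qualifying alleles in cases, b = other alleles in
    cases, c = qualifying alleles in controls, d = other alleles in controls.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x qualifying alleles in cases,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as extreme as the observed one; the small
    # relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def has_burden(a: int, b: int, c: int, d: int,
               alpha: float = 0.05, n_genes: int = 18) -> bool:
    """True if the gene passes the corrected threshold (0.05 / 18 here)."""
    return fisher_exact_two_sided(a, b, c, d) < bonferroni_threshold(alpha, n_genes)
```

With 18 genes and a nominal alpha of 0.05, `bonferroni_threshold(0.05, 18)` gives approximately 0.00278, matching the threshold stated above.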

RESULTS
Eighteen genes involved in TRP metabolism and the KP (Figure 1) were screened for genetic variants in whole-genome sequencing data from 614 Australian sporadic ALS patients. Three hundred and eleven single nucleotide non-intergenic variants were identified, including 50 synonymous, 76 nonsynonymous, one stop gain, one frameshift, four splicing, 128 3'UTR, and 51 5'UTR variants. Of these, 84 rare protein-altering variants that qualified for burden analysis were identified, and all genes had at least one such variant. Five genes (AFMID, HAAO, KYAT1/CCBL1, TPH1 and WARS) showed a burden of qualifying variants in SALS cases compared to the gnomAD nNFE dataset; however, this was not replicated when compared to the Australian control cohorts (Table 1). Nine novel variants in six genes were identified, each in a single individual (Table 2). In silico assessment of novel missense variants indicated that three variants present in GOT2 (1), KYNU (1) and MAOA (1) were predicted to be pathogenic by more than 80% of the protein prediction tools that provided prediction results (Table 2). Meta-analysis prediction scores from REVEL and BayesDel also correlated with these predictions (Table 2). The MAOA (X chromosome) variant was present in the heterozygous state in one female. None of these variants were present in additional ALS cohorts (MAOA data not present in Project MinE), nor were they previously implicated in other diseases (NCBI ClinVar database, https://www.ncbi.nlm.nih.gov/clinvar/). The novel HAAO intronic splicing variants were also predicted to affect splicing by altering intronic acceptor sites using Human Splicing Finder (36), and to be deleterious by MutationTaster, CADD and BayesDel.

DISCUSSION
We sought to determine the prevalence of novel genetic variants or burden of rare protein-altering variants in genes that play a key role in TRP metabolism or the KP in Australian sporadic ALS. Nine novel genetic variants (absent from public control databases, including population-specific controls) were identified in WARS (protein synthesis), TPH2 and MAOA (serotonin synthesis), and GOT2, KYNU and HAAO (KP) (Table 2). The genes WARS and TPH1, and KP genes AFMID, HAAO, and KYAT1/CCBL1 were shown to have a significant burden of qualifying rare protein-altering variants in sporadic ALS compared to the non-neurological non-Finnish European subset of the gnomAD dataset (Table 1), although this was not replicated when compared to Australian controls. This may be due to technical differences in data generation (whole-exome, PCR-amplified or PCR-free whole-genome sequencing), sample size or unidentified differences in population structure due to the highly multicultural and diverse Australian population. The increased burden of rare protein-altering variants, including the presence of novel variants, provides support for the role of TRP metabolism and the KP in ALS, and suggests these variants may act to increase risk of developing disease.
Aminoacyl-tRNA synthetases (ARSs) such as WARS are responsible for the first step of translation and protein synthesis. Mutations in the tryptophan ARS gene, WARS, have been found to cause the neurodegenerative disease distal hereditary motor neuropathy (17,18). WARS mutations were found to negatively affect protein synthesis and cell viability and cause neurite degeneration in neuronal cell lines and rat motor neurons (17,18). We identified three additional novel WARS variants in sporadic ALS cases. Two variants (c.G91A, p.A31T and c.T107C, p.I36T) were located in close proximity within the N-terminal helix-turn-helix (WHEP) domain, responsible for protein-protein interactions (17). Interestingly, deletion of the WHEP domain of a Caenorhabditis elegans glycyl-tRNA synthetase was found to affect protein structure and reduce enzyme function (38). However, these WARS WHEP domain variants were predicted to be benign by protein prediction software tools, and therefore, further analysis is required to establish their potential pathogenicity.
The neurotransmitter serotonin acts as a critical mood regulator, with its depletion highly associated with depression. This depletion may be a result of decreased availability of TRP due to activation of IDO1 and the KP, which is associated with neuroinflammation and psychological or physiological (illness) stress (39,40). Four enzymes are involved in serotonin synthesis from TRP, with TPH1/TPH2 converting TRP to the serotonin precursor 5-hydroxytryptophan (5-HTP), and MAOA converting 5-HTP to 5-hydroxyindoleacetic acid (5-HIAA, Figure 1). Serotonin depletion has also been associated with neurodegenerative diseases including Alzheimer's disease and frontotemporal dementia. Decreased levels of serotonin and 5-HIAA have also been found in the spinal cord of ALS patients (41,42), as well as in ALS patient platelets, with serotonin levels positively correlating with improved survival (41). Interestingly, administration of 5-HTP in an ALS SOD1 mouse model significantly improved phenotype, which also corresponded with increased platelet serotonin levels in the animals (43). In an alternate ALS SOD1 mouse model, degeneration of serotonergic neurons in the brainstem was found to lead to spasticity, a common clinical feature of ALS. Expression of mutant SOD1 caused a loss of serotonergic neurons in the brainstem, a phenotype that was rescued with SOD1 deletion. This, in turn, abolished spasticity in the mouse (44). We found a burden of qualifying variants in TPH1, and novel variants in TPH2 and MAOA, in sporadic ALS cases compared to controls. These genes encode the tryptophan hydroxylases (TPHs) and monoamine oxidase A, involved in 5-HTP synthesis and 5-HIAA synthesis, respectively. Additionally, the MAOA variant p.I138F was predicted to have a pathogenic effect by eight prediction tools (Table 2).
In the central nervous system, neuroinflammatory conditions result in increased numbers of M1 neurotoxic microglia, which produce excessive levels of QUIN (45). QUIN acts to agonise the N-methyl-D-aspartate (NMDA) receptor, resulting in an excitotoxic cascade that ultimately results in neuronal death (45). Mechanisms of QUIN neurotoxicity include protein dysfunction, oxidative stress, glutamate excitotoxicity, mitochondrial dysfunction, neuroinflammation, autophagy and apoptosis (46,47). In ALS, several studies have found increased levels of QUIN in the CSF of patients as well as in spinal cord neuronal and microglial cells (46). Additionally, raising QUIN levels by intracerebral injection into rat striatum resulted in increased astrocyte expression of the major ALS protein, SOD1. As SOD1 is a free superoxide radical scavenger, the increased SOD1 levels were thought to be a neuroprotective response to limit QUIN oxidative toxicity, a function that may be inhibited by ALS-causing mutant SOD1 protein forms (46,48). QUIN excitotoxicity can partly be mitigated by KYNA, which is produced by astrocytes (49). Interestingly, KYNA levels were also found to be higher in ALS patient CSF compared to controls, which may reflect an astroglial attempt to produce the neuroprotective metabolite (13). In serum, however, KYNA levels were conversely found to be significantly lower in ALS patients with severe clinical status compared to both patients with mild clinical status and controls (23). In a separate study, we have found similarly decreased levels of KYNA in the serum of patients with ALS as compared to controls (n=238, p<0.001, Student's t-test; data not shown). Of the five KP genes found to carry novel variants and/or a significant burden of qualifying variants in this study, four were directly involved in KYNA (KYAT1/CCBL1 and GOT2) and QUIN (KYNU and HAAO) synthesis from 3-hydroxykynurenine (Figure 1).
The role of TRP and the KP in neuroinflammation, and its link to several major neurodegenerative diseases including ALS, has been widely studied. We have shown for the first time that genetic variation in these genes may be associated with sporadic ALS and may confer risk of developing disease; however, replication in additional cohorts is required to confirm this relationship. The protein-altering variants in the genes involved in these pathways may trigger functional effects that influence disease risk and, when combined with other pathogenic 'steps', may progressively lead to ALS onset. Further studies can now commence to determine the specific pathogenic role of the novel variants and genes that carry a burden of variants in sporadic ALS.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by human research ethics committees of Macquarie University (5201600387), Sydney South West Area Health District and The University of Sydney. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
JF, IF and GG conceptualised and designed the studies and experiments. Experiments were performed by JF, SC, and EM. Data was curated by KW, NT, DB and EM. Data was analysed by JF, SC and VT. Resources were obtained by RP, MK and DR. JF wrote the manuscript. All authors contributed to the article and approved the submitted version. IB supervised the project and acquired funding.