- 1Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Monterrey, Mexico
- 2Instituto de Neurología y Neurocirugía, Centro Médico Zambrano Hellion, TecSalud, San Pedro Garza Garcia, Mexico
- 3Department of Neurology, University of Wisconsin School of Medicine and Public Health, Madison, WI, United States
- 4Tecnologico de Monterrey, Institute for Obesity Research, Monterrey, Mexico
- 5Tecnologico de Monterrey, Proyecto oriGen, Monterrey, Mexico
Introduction: Amyotrophic lateral sclerosis (ALS) is a fatal progressive neurodegenerative disease characterized by the deterioration of upper and lower motor neurons. Affected patients experience progressive muscle weakness, including difficulty in swallowing and breathing, and respiratory failure is the main cause of death. However, there is considerable phenotypic heterogeneity, and diagnosis is based on clinical criteria. Moreover, most ALS cases remain unexplained, suggesting a complex genetic background.
Methods: To better understand the molecular mechanisms underlying ALS, we comprehensively analyzed, filtered and classified genes from 4,293 abstracts retrieved from PubMed, 7,343 variants from ClinVar, and 33 study accessions from GWAS catalog. To address the importance of ALS-associated genes and variants, we performed diverse bioinformatic analyses, including gene set enrichment, drug-gene interactions, and differential gene expression analysis using public databases.
Results: Our analysis yielded a catalog of 300 genes with 479 ALS-associated variants. Most of these genes and variants are found in coding regions and their proteins localize to the cytoplasm and the nucleus, underscoring the relevance of toxic protein aggregates. Moreover, protein-coding genes were enriched in ALS-related gene sets, for example spasticity, dysarthria and dyspnea. ALS-associated genes are targeted by commonly used drugs, including Riluzole and Edaravone, and by the recently approved antisense oligonucleotide therapy (Tofersen). In addition, we observed transcriptional dysregulation of ALS-associated genes in peripheral blood mononuclear cell and postmortem cortex samples.
Conclusion: Overall, this ALS catalog can serve as a foundational tool for advancing early diagnosis, identifying biomarkers, and developing personalized therapeutic strategies.
1 Introduction
Amyotrophic Lateral Sclerosis (ALS) is a devastating, progressive neurodegenerative disorder whose hallmark is the degeneration of both upper and lower motor neurons in the cerebral cortex, brainstem, and spinal cord (Swinnen and Robberecht, 2014; Galvin et al., 2017; Hardiman et al., 2017). ALS is one of the most common adult motor neuron diseases, representing a significant socioeconomic burden (Logroscino et al., 2018). Although affected by ethnicity, the global incidence of ALS is approximately 2 cases per 100,000 person-years with a prevalence of 6-9 per 100,000 individuals (Martínez et al., 2011; Mehta et al., 2018; Longinetti and Fang, 2019; Brown et al., 2021; Wolfson et al., 2023). The number of ALS cases is rapidly increasing, mainly due to population aging. Furthermore, projections based on a meta-analysis of reported cases show that the number of ALS patients worldwide will likely increase to 375,000 in 2040, representing a 69% rise (Arthur et al., 2016). Moreover, the cumulative lifetime risk for developing ALS is estimated to be 1:350 in men and 1:400 in women (Ryan et al., 2019). The age of onset is approximately between 50 and 75 years, and even though the rate of disease progression is variable, most patients die of respiratory muscle failure within 2-3 years of symptom onset (Masrori and Van Damme, 2020).
There is considerable variation in the phenotypic expression of ALS, including the onset site, age of onset, type and degree of motor neuron involvement, disease progression, symptom severity, and survival time (Tard et al., 2017; Feldman et al., 2022). Approximately 70% of ALS patients present a spinal onset characterized by muscle weakness of the limbs, 30% present with a bulbar onset distinguished by dysarthria, dysphagia, and dysphonia, and a minority (3-5%) have a respiratory or cognitive onset (Tard et al., 2017; Masrori and Van Damme, 2020; Feldman et al., 2022). The bulbar and respiratory onsets are generally associated with a poor prognosis (Kiernan et al., 2011), the former being more common in women (Salameh et al., 2015). Despite being predominantly considered a motor disease, ALS patients also present cognitive and behavioral changes (Beeldman et al., 2020; Pender et al., 2020). Thus, there is a molecular overlap between ALS and Frontotemporal dementia (FTD) since approximately 15% of ALS patients meet FTD diagnostic criteria (Ringholz et al., 2005; Pender et al., 2020). Due to the heterogeneous presentation of ALS, diagnostic criteria are available, including the revised El Escorial criteria (Brooks et al., 2000), the Awaji Shima criteria (de Carvalho et al., 2008), and recently, the simplified Gold Coast criteria (Shefner et al., 2020). However, there is no definitive test for ALS diagnosis. Instead, a clinical investigation, consisting of blood tests, imaging of the brain and spine, and neurophysiological evaluations, is performed to exclude mimic disorders (Turner and Talbot, 2013). Thus, there is an urgent need for accurate diagnostic criteria for ALS to reduce the diagnostic delay (∼1 year after disease onset), allowing early treatment initiation and enabling an improved prognosis.
ALS is mainly considered a sporadic disease (sALS) because 80–90% of cases show no known genetic mutation. In contrast, 10% of ALS patients have a family history of the disease with an autosomal dominant inheritance pattern (fALS). Intriguingly, mutations in just over 30 genes have been identified as causative or as conferring an increased risk of developing ALS, explaining 70% of fALS and only 15% of sALS (Renton et al., 2014; Chia et al., 2018; Mead et al., 2023). However, the heritability of ALS, both sporadic and familial, has been estimated to be approximately 50% (Ryan et al., 2019; Trabjerg et al., 2020). To explain the heritability of ALS, genome-wide association studies (GWAS) and next-generation sequencing technologies have led to the identification of several ALS risk loci (Van Rheenen et al., 2016; Nicolas et al., 2018; van Rheenen et al., 2021). Nevertheless, these loci explain less than 10% of ALS cases, suggesting that a large number of ALS risk genes are still unknown. Known risk genes converge on common biological pathways such as oxidative stress, mitochondrial function dysregulation, protein homeostasis, RNA processing, DNA damage, and excitotoxicity.
Interestingly, mutations in four genes account for 70% of fALS cases, namely, C9orf72, TARDBP, SOD1, and FUS (Chiò et al., 2014; Chia et al., 2018). However, the genetic architecture of ALS is complex because a minority of patients exhibit a monogenic inheritance, while the majority have an oligogenic pattern characterized by the inheritance of mutations in several genes. Furthermore, researchers have identified a shared polygenic risk of ALS with traits and conditions including smoking, physical activity, cognitive performance, and educational attainment (Bandres-Ciga et al., 2019). In addition, unprecedented recent research combining transcriptomic and epigenetic profiling of motor neurons, GWAS statistics, and machine learning methods identified 690 potential ALS-associated genes, representing a 5-fold increase in the heritability of the disease (Zhang et al., 2022).
Mounting evidence demonstrates the contribution of numerous genetic variants to the risk of ALS in different cohorts. However, an updated and comprehensive collection of ALS-associated genes and variants is needed. To this aim, we performed a systematic review analyzing 4,293 abstracts from PubMed, 7,343 variants from ClinVar, and 33 study accessions from GWAS catalog. Furthermore, we performed numerous functional analyses to verify that our list of genes was associated with ALS.
2 Materials and methods
2.1 Systematic literature search
A systematic search for relevant articles was performed in February 2023 using PubTerm (Garcia-Pelaez et al., 2019), a curation and annotation web tool. Our search strategy included two main terms and their synonyms: amyotrophic lateral sclerosis and variants. Articles were restricted to original research papers and to humans. Thus, the following query was used to retrieve records from the PubMed database: (“Amyotrophic Lateral Sclerosis”[TIAB] OR “Lou Gehrig”*[TIAB] OR “ALS”[TIAB]) AND (mutation*[TIAB] OR polymorphism*[TIAB] OR variant*[TIAB] OR SNP*[TIAB]) AND English[Language] NOT review[Publication Type] NOT mouse[TIAB] NOT mice[TIAB] NOT animal*[TIAB] (Supplementary Table 1). The process for record selection followed the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) 2020 guidelines (Page et al., 2021), as depicted in Figure 1. Our systematic literature search was complemented with searches for ALS-associated variants performed in the ClinVar and GWAS databases, as described below.

Figure 1. Flowchart of literature selection. PRISMA flowchart summarizing the steps of the screening process with details regarding the number of publications retrieved by the initial query, the number of reports excluded at each screening step, and the number of studies selected for the review from PubMed, ClinVar, and GWAS.
2.2 Curation and gene categorization
Using the PubTerm annotation web tool, we organized retrieved articles by gene, and we defined criteria for filtering records (Figure 1). Records corresponding to non-human genes or to genes resulting from annotation or nomenclature errors were eliminated. The remaining records underwent a thorough review through manual curation of the title, abstract, and in some cases, the complete manuscript. At least two authors performed the manual curation. Genes in the following exclusion categories were filtered out: (1) unrelated, (2) not reported, (3) conflicting evidence, (4) negative evidence, (5) related disease, (6) other genetic alteration. Genes were “unrelated” when they were not associated with ALS or any other neurological disorder. Additionally, “not reported” genes were those that were related to ALS, but their variants were not described. Moreover, genes were annotated as “conflicting evidence” when articles disagreed on the association of the gene variant with ALS risk. Genes were noted as “negative evidence” when researchers reported no association between the gene variant and ALS. Furthermore, genes with variants found enriched in diseases related to ALS (neurodegenerative, neuromuscular, neurological) were labeled as “related disease.” Finally, genes in which the variant was associated with ALS but there was uncertainty in its locus, or its single-nucleotide polymorphism (SNP) was found in an intronic or intergenic region, were considered as “other genetic alterations.”
Abstracts organized per gene were carefully analyzed until enough evidence was obtained to classify the gene into a relevant inclusion category. We defined six categories of genes with variants associated with ALS: (1) gene variants with experimental evidence of association to ALS, (2) gene variants found in gene exons and with evidence in a GWAS, (3) gene variants described in a case report, (4) gene variants found to be related to ALS complications, (5) gene variants related to treatment response, and (6) genes (with or without variants) that were suggested as biomarkers of ALS. Genes that potentially fit more than one category were assigned to the class with the strongest evidence or the highest number of supporting abstracts. The resulting genes and their categories are included in Supplementary Table 2. For every gene, sentences from abstracts that were critical for assigning a gene’s category, along with the abstracts’ PubMed ID, were annotated in PubTerm’s notes. Annotations for each gene are available in Supplementary Table 3.
2.3 Collection of ALS-associated variants
To find additional genes whose variants were not explicitly described in PubMed records, we searched ClinVar, a public database of human variant-phenotype associations (Landrum et al., 2018). We downloaded ALS associated variants from ClinVar website1 (Accessed on June 15, 2023) using the search term “Amyotrophic Lateral Sclerosis.” Variants without a defined SNP identifier (dbSNP ID) were filtered out. Variants were assigned a severity score according to their clinical significance and these scores were used for filtering redundant variants (same SNP identifier), keeping unique variants of highest severity (lowest score). Furthermore, only variants with severity scores in the categories “pathogenic” or “likely pathogenic” were used for downstream analysis. A list of genes mapped to variants was compiled and filtered to exclude genes resulting from the PubTerm revision. The remaining genes were classified as having experimental evidence of variants. The list of variants and clinical significance are included in Supplementary Table 4.
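The ClinVar de-duplication step described above can be sketched as follows. This is a minimal Python illustration and not the authors' pipeline: the numeric ranking in `SEVERITY`, the record layout, and the placeholder identifiers (`rs0001`, `GENE_A`, and so on) are all hypothetical, since the text only states that a lower score corresponds to a higher severity.

```python
# Hypothetical severity ranking: lower score = more severe.
# The paper does not list its numeric scores; these values are assumptions.
SEVERITY = {
    "Pathogenic": 1,
    "Likely pathogenic": 2,
    "Conflicting interpretations of pathogenicity": 3,
    "Uncertain significance": 4,
    "Likely benign": 5,
    "Benign": 6,
}

def dedupe_by_rsid(records):
    """Drop records without a dbSNP ID and keep, for each rsID,
    the single record with the lowest (most severe) score."""
    best = {}
    for rec in records:
        rsid = rec.get("rsid")
        if not rsid:
            continue  # variants without a dbSNP ID are discarded
        # Unknown labels are ranked after every known category.
        score = SEVERITY.get(rec["significance"], len(SEVERITY) + 1)
        if rsid not in best or score < best[rsid][0]:
            best[rsid] = (score, rec)
    return [rec for _, rec in best.values()]

def keep_pathogenic(records):
    """Keep only pathogenic or likely pathogenic variants."""
    wanted = {"Pathogenic", "Likely pathogenic"}
    return [rec for rec in records if rec["significance"] in wanted]

# Toy input with placeholder identifiers (not real ClinVar entries).
records = [
    {"rsid": "rs0001", "gene": "GENE_A", "significance": "Uncertain significance"},
    {"rsid": "rs0001", "gene": "GENE_A", "significance": "Pathogenic"},
    {"rsid": None, "gene": "GENE_B", "significance": "Pathogenic"},
    {"rsid": "rs0002", "gene": "GENE_C", "significance": "Benign"},
]
unique = dedupe_by_rsid(records)
selected = keep_pathogenic(unique)
```

Keeping de-duplication and the pathogenicity filter as separate functions mirrors the two-stage description in the text, where all unique variants are first retained and only later restricted for downstream analysis.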
To further collect ALS-associated variants, we explored the GWAS Catalog (Sollis et al., 2023), a comprehensive database of human genotype-phenotype associations derived from curated GWAS. All association studies related to the “Amyotrophic Lateral Sclerosis” trait and corresponding to the MONDO_0004976 disease ontology were retrieved from the GWAS Catalog webpage2 (Accessed on June 15, 2023) and can be found in Supplementary Table 4. Only unique variants with a defined SNP identifier and with a p-value less than 5 × 10−8 were considered for downstream analysis. In the cases of redundant variants (same SNP identifier), the smallest p-value was considered. A list of genes spanning the remaining GWAS variants was obtained and filtered to exclude genes resulting from the PubTerm search.
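The GWAS Catalog filter can be illustrated in the same way. The sketch below assumes associations are available as simple `(rsid, p_value)` pairs; the function name `filter_gwas` and the example values are hypothetical and only demonstrate the three rules stated in the text (defined SNP identifier, p-value below 5 × 10−8, smallest p-value per identifier).

```python
GENOME_WIDE_P = 5e-8  # genome-wide significance threshold stated in the text

def filter_gwas(associations, threshold=GENOME_WIDE_P):
    """Return {rsid: smallest significant p-value} for a list of
    (rsid, p_value) pairs, skipping entries without an identifier."""
    best = {}
    for rsid, p_value in associations:
        if not rsid or p_value >= threshold:
            continue  # no SNP identifier, or not genome-wide significant
        if rsid not in best or p_value < best[rsid]:
            best[rsid] = p_value  # redundant variant: keep the smallest p-value
    return best

# Toy input with placeholder identifiers (not real GWAS Catalog entries).
associations = [
    ("rs1", 3e-9),
    ("rs1", 1e-12),   # redundant entry with a smaller p-value
    ("rs2", 2e-6),    # not significant
    (None, 1e-10),    # no SNP identifier
    ("rs3", 4.9e-8),
]
significant = filter_gwas(associations)
```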
A catalog of SNP identifiers was compiled from the PubMed, ClinVar, and GWAS systematic revisions. We extracted genomic information of each variant from either GWAS, ClinVar or through the rentrez R package (Winter, 2017). Variants were classified as intronic, intergenic or exonic according to their genomic location using Homer annotation tool (Heinz et al., 2010) and the hg38 annotation file version 21 downloaded from GENCODE3. The ALS variants catalog and relevant genomic information is included in Supplementary Table 5.
2.4 In-silico determination of gene functions
Genes resulting from our systematic review were classified according to the gene type (protein-coding, non-protein coding and others) specified in the hg38 GENCODE annotation file version 21 downloaded from GENCODE (see text footnote 3). The subcellular location of the molecules coded by each protein-coding gene was then retrieved from the UniProtKB/Swiss-Prot database (Bateman et al., 2023) accessed at https://www.uniprot.org/ on June 15, 2023. For genes coding for multiple isoforms, the cellular location of the first isoform was used. Only records with a “reviewed” status and “Homo sapiens” category were retrieved. Additionally, all genes that encode known transcription factors (TF) were further identified using a list of human TFs obtained by combining the AnimalTFDB v4.0 (Shen et al., 2023) and The Human Transcription Factors database (Lambert et al., 2018) downloaded from http://humantfs.ccbr.utoronto.ca/ on June 15, 2023 (gene type, subcellular location data, and TF classification are included in Supplementary Table 6).
2.5 Gene set enrichment analysis
We used ALS-associated genes to estimate the enrichment of gene sets through a hypergeometric statistical test (phyper R function). We used the msigdbr R package to download the Molecular Signatures Database (MSigDB) (Subramanian et al., 2005; Liberzon et al., 2015) and obtained human-specific gene sets (hallmark gene sets, Gene Ontologies, Biocarta, Reactome, and KEGG pathways). Gene sets with FDR < 0.05 and at least 5 overlapping genes were considered significantly enriched and are listed in Supplementary Table 7. All protein-coding genes were included in the gene universe for the hypergeometric test because any of them can harbor variants. GraphPad Prism 9 was used to generate figures depicting the enrichment of significant ALS-related gene sets grouped by clinical phenotype and manifestations. Overlaps of genes in significant gene sets related to ALS, cognitive impairment, depression, dementia, and FTD were computed with the InteractiVenn online tool (Heberle et al., 2015).
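As an illustration of this over-representation test, the following Python sketch computes the upper-tail hypergeometric probability, which corresponds to `phyper(k - 1, K, M - K, n, lower.tail = FALSE)` in R, and then applies a false discovery rate correction. The text does not name the FDR procedure, so the Benjamini-Hochberg method used here is an assumption, as are the function names and the toy gene identifiers.

```python
from math import comb

def hypergeom_sf(k, M, K, n):
    """P(X >= k): probability of at least k hits when drawing n genes from
    a universe of M genes that contains K members of the gene set."""
    total = comb(M, n)
    hits = sum(comb(K, i) * comb(M - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def enrich(query, gene_sets, universe, fdr_cutoff=0.05, min_overlap=5):
    """Return (name, overlap, fdr) for gene sets passing both thresholds."""
    universe = set(universe)
    query = set(query) & universe
    names, overlaps, pvals = [], [], []
    for name, members in gene_sets.items():
        members = set(members) & universe
        overlap = len(query & members)
        names.append(name)
        overlaps.append(overlap)
        pvals.append(hypergeom_sf(overlap, len(universe), len(members), len(query)))
    fdrs = benjamini_hochberg(pvals)
    return [(n, o, q) for n, o, q in zip(names, overlaps, fdrs)
            if q < fdr_cutoff and o >= min_overlap]

# Toy data with placeholder gene names (G0 ... G999).
universe = [f"G{i}" for i in range(1000)]
query = [f"G{i}" for i in range(20)]
gene_sets = {
    "toy_enriched": [f"G{i}" for i in range(10)],
    "toy_random": [f"G{i}" for i in range(500, 550)] + ["G0"],
}
results = enrich(query, gene_sets, universe)
```

Exact integer binomial coefficients keep the sketch dependency-free; for a universe the size of all protein-coding genes, a statistics library would be the more practical choice.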
2.6 Comparison with canonical and machine learning-based predicted ALS-associated genes
The compiled set of ALS-associated genes was compared to an independently curated list of ALS genes (n = 260) (Eitan et al., 2022) and to a machine learning-derived set (n = 690) (Zhang et al., 2022). Lists are included in Supplementary Table 8.
2.7 Analysis of drug-gene interactions
A drug-gene interaction analysis was performed using the drug-gene interaction repository (February 2022) from The Drug Gene Interaction Database (DGIdb) (Cannon et al., 2024). We compiled drug-gene interactions by using the list of ALS-associated genes as input with default parameters. We downloaded the list of interactions and filtered them for further analysis and visualization using Cytoscape V3.10 (Shannon et al., 2003). Genes with fewer than two drug interactions were filtered out. The full list of interactions obtained is provided in Supplementary Table 9.
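The final filtering rule (discarding genes that interact with fewer than two drugs) can be sketched as below. The `(gene, drug)` pair layout, the function name, and the placeholder names are hypothetical and do not reflect the actual DGIdb export format.

```python
from collections import defaultdict

def genes_with_min_drugs(interactions, min_drugs=2):
    """Group (gene, drug) pairs by gene and keep genes that interact
    with at least `min_drugs` distinct drugs."""
    drugs_by_gene = defaultdict(set)
    for gene, drug in interactions:
        drugs_by_gene[gene].add(drug)  # a set ignores duplicated pairs
    return {gene: sorted(drugs)
            for gene, drugs in drugs_by_gene.items()
            if len(drugs) >= min_drugs}

# Toy input with placeholder names (not real DGIdb interactions).
interactions = [
    ("GENE_A", "drug_1"),
    ("GENE_A", "drug_2"),
    ("GENE_B", "drug_1"),
    ("GENE_B", "drug_1"),  # duplicate pair counts once
    ("GENE_C", "drug_3"),
]
network_genes = genes_with_min_drugs(interactions)
```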
2.8 Differential gene expression analysis
To further assess the relevance of our list of ALS-associated genes with variants, we performed a differential expression analysis using two recently available high-throughput sequencing ALS datasets downloaded from NCBI’s Gene Expression Omnibus (Barrett et al., 2013). We downloaded the FPKM expression matrix of GSE183204, which includes transcriptomic data of peripheral blood mononuclear cells isolated from 18 ALS patients and 12 age- and sex-matched healthy controls. In that study, the authors stratified ALS patients according to levels of nuclear SOD1 as high and low (Garofalo et al., 2022). Given that high levels of nuclear SOD1 are hypothesized to have a protective mechanism in sALS patients (Garofalo et al., 2022), we compared healthy controls against sALS patients with low nuclear accumulations of SOD1. Similarly, we downloaded the collection GSE124439, composed of transcriptomic datasets obtained from 148 ALS postmortem cortex samples and 17 neurologically healthy controls (Tam et al., 2019). In that study, the authors identified 3 distinct molecular ALS subtypes: retrotransposon activation, oxidative stress, and activated glia. Because they showed the lowest transcriptional heterogeneity, we selected samples classified as activated microglia (ALS-glia) to compare against neurologically healthy controls. Raw count matrices were downloaded from GEO.
For the first data set (GSE183204), gene expression values were log-transformed and quantile normalized, whereas the second data set (GSE124439) was processed using the variance stabilizing transformation (VST) included in DESeq2 (Love et al., 2014) and quantile normalization. For each data set we performed a principal component analysis (PCA) using the 1,000 most variable genes according to the median absolute deviation. Outlier samples were identified by plotting the first two principal components and were eliminated from downstream analysis. Focusing only on the list of ALS-associated genes, we performed a differential expression analysis using Limma (Ritchie et al., 2015) and DESeq2 (Love et al., 2014) for the first and second data sets, respectively. Genes with a normalized expression fold-change > 1.1 and a false discovery rate (FDR) < 0.1 were considered differentially expressed. The Gencode v43 human gene annotation file was downloaded3 and used. All statistical analyses were performed in the R programming language. Heatmaps of normalized expression matrices depicting differentially expressed genes were constructed (Figure 6). The matrices of normalized counts and differential expression metrics of both public data sets are included in Supplementary Table 10.
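Two of the steps above lend themselves to a short illustration: ranking genes by median absolute deviation (MAD) before PCA, and applying the fold-change and FDR cut-offs. The Python sketch below is not the authors' R code. In particular, treating the fold-change cut-off symmetrically (so that a ratio of 1/1.1 or lower also counts) is an assumption, because the text does not state how down-regulated genes were handled; all names and values are placeholders.

```python
from statistics import median

def mad(values):
    """Median absolute deviation of a list of expression values."""
    center = median(values)
    return median([abs(v - center) for v in values])

def top_variable_genes(expression, n=1000):
    """Return the n gene names with the largest MAD across samples.
    `expression` maps gene name -> list of normalized values."""
    ranked = sorted(expression, key=lambda gene: mad(expression[gene]), reverse=True)
    return ranked[:n]

def is_differentially_expressed(fold_change, fdr, fc_cutoff=1.1, fdr_cutoff=0.1):
    """Apply the cut-offs from the text to a linear-scale fold-change.
    Assumption: up- and down-regulation are treated symmetrically."""
    magnitude = max(fold_change, 1.0 / fold_change)
    return magnitude > fc_cutoff and fdr < fdr_cutoff

# Toy expression matrix with placeholder gene names.
expression = {
    "g1": [1, 1, 1, 1],    # MAD = 0
    "g2": [1, 5, 9, 13],   # MAD = 4
    "g3": [2, 3, 4, 5],    # MAD = 1
}
most_variable = top_variable_genes(expression, n=2)
```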
3 Results
3.1 The systematic review identifies and classifies genes with ALS-associated variants
We performed a systematic review using PRISMA’s standards to comprehensively collect all genes with variants associated with ALS. A total of 4,293 abstracts (Supplementary Table 1) matching the PubMed query described were imported into the PubTerm web tool (Garcia-Pelaez et al., 2019). Through automatic annotation, abstracts yielded 2,420 genes that were subsequently filtered eliminating non-human (n = 978) and incorrectly annotated genes (n = 427). The remaining 1,015 genes were categorized for exclusion through a manual screening of the titles and abstracts of all records associated with each gene. If needed, the complete publications were reviewed for adequate gene categorization. Overall, we found 294 genes that were not related to ALS nor to any other neurological disorder and 288 genes described as associated with ALS but without a variant description. Furthermore, genes related to ALS with variants described were classified as “conflicting evidence” (n = 10) or “negative evidence” (n = 44) when studies disagreed on the association of a gene variant to ALS or when they reported no association, respectively. The remaining 379 genes were filtered if their variants were associated with ALS-related diseases, (neurodegenerative, neuromuscular, and neurological disorders) (n = 87) or if gene variants were found in intronic or intergenic regions (n = 43). Overall, our systematic review yielded 249 genes from PubMed with genomic variants related to ALS (Figure 1).
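The screening counts reported above can be cross-checked with a short tally. The sketch below simply subtracts each exclusion count given in the text from the 2,420 automatically annotated genes; the step labels are paraphrased and the variable names are illustrative.

```python
# Exclusion counts as reported in the text, in screening order.
steps = [
    ("non-human genes", 978),
    ("incorrectly annotated genes", 427),
    ("unrelated to ALS or other neurological disorders", 294),
    ("no variant described", 288),
    ("conflicting evidence", 10),
    ("negative evidence", 44),
    ("related disease", 87),
    ("intronic or intergenic variants", 43),
]

remaining = 2420  # genes obtained through automatic annotation
trace = []
for label, removed in steps:
    remaining -= removed
    trace.append((label, remaining))

for label, left in trace:
    print(f"after removing {label}: {left} genes")
```

The running totals reproduce the intermediate values given in the text (1,015 after annotation filtering, 379 after removing conflicting and negative evidence) and end at the 249 genes retained from PubMed.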
The resulting genes (n = 249) were further assigned to inclusion categories according to the type of evidence provided in their related articles. A total of 120 and 34 genes reported variants with experimental or GWAS evidence, respectively, whereas 4 and 6 genes had variants related to disease complications and treatment response, respectively. Moreover, 19 and 66 genes were described as having variants found in case reports or suggested as potential biomarkers, respectively. The classification of genes in either exclusion or inclusion categories is provided in Supplementary Table 2. The critical sentence considered as supporting evidence of classification into inclusion or exclusion categories is included in Supplementary Table 3.
To further collect ALS-associated variants, we explored the ClinVar database (Landrum et al., 2018) and the GWAS Catalog (Sollis et al., 2023) using the “Amyotrophic Lateral Sclerosis” search query. We retrieved 7,343 variants from the ClinVar database and filtered out 3,613 variants that lacked an SNP identifier, yielding 3,730 variants. The remaining variants were assigned a severity score based on their clinical significance. A total of 107 variants were redundant and only those with the highest severity (lowest score, corresponding to pathogenic or likely pathogenic) were maintained (n = 3,623). Interestingly, almost 50% of ALS variants (1,725/3,623) had an uncertain significance and 13% (474/3,623) were classified as “Conflicting interpretations of pathogenicity”, suggesting that more studies associating those genes and their variants to ALS pathogenesis are needed.
We further filtered variants leaving only those with pathogenic and/or likely pathogenic scores, yielding 344 variants mapped to 48 genes (included in Supplementary Table 4). The SNP identifiers of these variants were aggregated to our list of ALS-related SNPs (Supplementary Table 5). We compared the 48 ClinVar genes to the list of ALS genes derived from PubTerm and we found that 35 genes overlapped and were distributed among the inclusion categories of “With experimental evidence of variants” (n = 32), “With GWAS evidence” (n = 1), and “Case report” (n = 2). After filtering these overlapping genes, we obtained 13 genes annotated exclusively in ClinVar (included in Supplementary Table 4) and we assigned them to the “experimental evidence” category.
Similarly, from the GWAS Catalog we retrieved 346 variants reported in 33 different study accessions (27 publications) and matching the defined query (Supplementary Table 4). We found 85 variants without an assigned SNP identifier, 40 redundant variants, and 173 non-significant (p-value > 5 × 10−8) variants and filtered them out. We found significant variants annotated as intergenic in the GWAS Catalog (n = 10) that were mapped to more than one gene due to their proximity. In such cases, the same variant was assigned individually to all neighboring genes. Thus, we obtained 58 variants (48 unique ones) that mapped to 49 genes, of which 11 were found in our PubTerm search and were classified as having either experimental (n = 6) or GWAS evidence (n = 5). Subsequently, the 38 genes found exclusively in the GWAS Catalog were added to the category of genes with GWAS evidence of variants (Figure 1). Intriguingly, filtered variants were derived from studies based mainly on genome-wide/exome-wide genotyping arrays or targeted genotyping sequencing, potentially overlooking non-coding variants.
To evaluate the strength of ALS-associated genes we combined the genes found in all databases, yielding 300 ALS-associated genes (249 PubTerm, 13 ClinVar, 38 GWAS). From these, 71 genes were found to have at least 5 publications associated in the PubTerm/PubMed search (Supplementary Table 4). Interestingly, genes with the highest number of publications are commonly known ALS-associated genes: SOD1, TARDBP, C9orf72, and FUS (Table 1). For each gene, we added its inclusion category, and whether it was found in the ClinVar search after filtering redundant and unidentified variants. Moreover, for each gene found with variants in ClinVar, we added its clinical significance. If a gene had more than one variant, we selected the one with the highest severity. We also labeled genes with significant (p-value < 5 × 10−8) variants registered in the GWAS Catalog. Surprisingly, only 6 genes (with at least 5 publications) were annotated in the GWAS Catalog as having ALS-associated variants, suggesting that more GWAS in diverse ethnicities are needed. We built a boxplot from the combined datasets to determine if there was a correlation between a gene’s clinical significance and the number of abstracts assessed in PubTerm. As shown in Figure 2A, genes with pathogenic variants have a higher median number of abstracts/publications associated with ALS. Similarly, the top 20 genes with the highest number of abstracts are all found in ClinVar and have a pathogenic clinical significance (see Table 1).

Table 1. Top 20 genes with the highest number of abstracts associated with ALS retrieved with our search query using PubTerm.

Figure 2. Characterization of variants according to clinical significance, genomic location, and subcellular location of proteins. (A) Number of abstracts (log10-transformed scale) found per ALS-associated gene and the clinical significance category of their variants as classified by ClinVar. (B) Classification of variants associated with ALS-genes according to their genomic location. (C) Schematic representation of the subcellular location of proteins coded by ALS-associated genes.
To analyze the ALS-associated variants found, we compiled a list of 479 SNP identifiers mapped to 146 unique genes. Most variants were unique to ClinVar (70%), PubMed (16%), or GWAS (11%); a minority were shared among PubMed/ClinVar (2%) and PubMed/GWAS (1%), as depicted in Supplementary Figure 1. Furthermore, we extracted genomic information of ALS-associated variants using the rentrez R package (Winter, 2017) and the Homer annotation tool (Heinz et al., 2010). As observed in Figure 2B, ALS-associated variants are located mainly in protein-coding regions including exons (31%), promoter-transcription start sites (TSS) (17%) and transcription termination sites (TTS) (23%). However, variants are also found in intronic (12%) and intergenic regions (6%), and the remainder (11%) in regulatory regions such as long intergenic RNAs (lincRNAs), antisense lncRNAs, and others (pseudogenes and small nucleolar RNAs). A catalog of all compiled variants, RS identifiers, mapped genes, genomic location, source database, and region type is found in Supplementary Table 5.
3.2 ALS-associated genes code for proteins mainly found in the cytoplasm, cell membrane, and nucleus
To assess the functional relevance of our list of 300 ALS-associated genes, we searched for gene type information in the GENCODE annotation file and found that 275, 11, and 4 genes were classified as protein-coding, lncRNA, and miRNA, respectively. The remaining 10 genes belonged to snRNA or pseudogene categories. Furthermore, we analyzed the subcellular location of ALS-associated proteins using the UniProtKB/Swiss-Prot database and found that while the majority of proteins are found in the cytoplasm (n = 78), cell membrane (n = 62) or nucleus (n = 39), others are secreted (n = 32) or located in the endoplasmic reticulum (n = 14) or mitochondrion (n = 12), suggesting diverse dysregulated cellular pathways. For example, the proteins encoded by the major ALS-related genes C9orf72, FUS, and TARDBP localize to the nucleus, whereas SOD1 is found in the cytoplasm. Mutations in TARDBP, FUS, C9orf72, and SOD1 may result in toxic protein aggregates in neurons, leading to degeneration in ALS. Interestingly, we found that only 4% (12/300) of ALS-associated genes, including CAMTA1, CEBPD, KCNIP3, LHX8, MEF2C, MTF1, RUNX2, SFPQ, SREBF1, TFAM, ZNF704 and ZNF746, are annotated as transcription factors. The subcellular location of ALS-associated proteins is depicted in Figure 2C. Supplementary Table 6 includes gene type, subcellular location, and the classification of transcription factors.
3.3 Functional gene set enrichment analysis identifies ALS-relevant gene sets
To summarize the biological significance of the ALS-associated genes found, we performed a gene set enrichment analysis. To investigate the most significant gene sets across a variety of domains such as diseases, bioprocesses and cellular functions, we used the GO, Hallmark Gene Sets, Reactome, KEGG pathways and the Biocarta databases. Gene sets were considered enriched with a False Discovery Rate (FDR) < 0.05 and at least 5 overlapping genes. The full list of enriched gene sets is included in Supplementary Table 7. Similar functional terms were manually grouped into panels for comprehensive interpretation and expository purposes as shown in Figure 3. Significant terms that were related to different components of ALS were grouped under spinal/muscular, bulbar, respiratory and upper motor neuron categories. Terms related to biochemical processes, cellular components and metabolism were grouped in a physiology category and common ALS phenotypes and symptoms were classified as “other.”

Figure 3. Highly enriched gene sets and ontologies, classified by category. Gene sets were identified using a hypergeometric test (FDR < 0.05 and at least 5 common genes). Enriched gene sets were grouped by relevant physiological categories. Enrichment is indicated as -log10(FDR) in black bars and the number of common genes is represented with the yellow line. AB, Abnormal; UMN, Upper Motor Neuron; REG, Regulation; SYN, Synaptic.
As expected, Amyotrophic Lateral Sclerosis was identified as the most significantly enriched disease gene set (−log10 FDR = 66.7). Other significant disease gene sets found included depression, cognitive impairment, frontotemporal dementia and dementia, suggesting that ALS-associated genes are part of a genetic background shared with other neurodegenerative disorders. Figures 4A,B depict the genetic overlap between these disorders. Genes such as C9orf72, FUS, VCP, TBK1, CCNF, HNRNPA1, and SQSTM1, among others, are common to all of these disorders. The ALS-depression pair shared the highest number of genes (31 of the curated genes), followed by ALS-cognitive impairment with 22, and finally ALS-dementia and ALS-frontotemporal dementia with 19 and 15 genes, respectively.

Figure 4. Comparison of ALS-associated genes with disorders depicting similar symptoms and with curated/machine-learning ALS lists. Venn diagrams depicting the ALS-associated genes that are common and specific between different enriched human phenotype gene sets. (A) ALS, Cognitive Impairment, and Depression. (B) ALS, Dementia, and Frontotemporal Dementia (FTD). (C) Venn diagram depicting common genes between ALS-associated genes and an independently curated list of ALS genes proposed by Eitan et al. (2022) and a list of genes inferred by Zhang et al. (2022) through a machine learning approach using transcriptomic and epigenomic cell profiling. The list of genes proposed in this study is labelled A, while those proposed by Zhang et al. and Eitan et al. are labelled B and C, respectively. (D) List of the common genes found in the comparisons depicted in panel (C).
Enriched gene ontologies included terms from the biological process, cellular component and molecular function categories. Among biological processes, the most significant gene sets contained terms that closely relate to ALS pathophysiology, for example, neuron death, regulation of apoptosis, response to oxidative stress, post-transcriptional protein modification, microglial activation, axonal transport, mitochondrial organization, vesicle-mediated transport, autophagy regulation and ion transport. Furthermore, the most significant cellular component terms were related to axons, neuron projections, dendrites and vacuoles. Likewise, significantly enriched molecular functions included receptor binding, hydrolase activity, and growth factor activity.
3.4 Publicly available ALS curated gene lists and machine learning predictions overlap with our list of ALS-associated genes
To further assess the validity of our systematic review, we compared the list of ALS-associated genes with other curated lists found in the literature. Zhang et al. (2022) identified 690 ALS risk genes through regional fine-mapping (RefMap), a machine learning method that integrates epigenetic profiling with GWAS summary statistics. They applied RefMap to ALS GWAS data together with transcriptomic and epigenetic profiling of iPSC-derived motor neurons (MNs). The ALS GWAS data used for RefMap gene enrichment included an independently curated list of genes (Eitan et al., 2022) and genes from the ClinVar database. With RefMap, they identified ALS-active genomic regions, which were mostly non-coding. They also determined that ALS pathogenesis is initiated in the distal axon of affected MNs. Finally, they established KANK1 as a novel ALS gene, which is found in human neurons and leads to TDP-43 mislocalization (Neumann et al., 2006; Zhang et al., 2022). After manually comparing our list with theirs, we observed an overlap of 16 genes, including KANK1.
Eitan et al. (2022) performed a region-based burden analysis of variants in untranslated regions, including microRNAs (miRNAs), in whole genomes of ALS patients and non-ALS controls. They used whole-genome sequencing data from Project MinE ALS and NYCG ALS to analyze regions of interest. After performing the region-based burden test, in which they combined rare genetic variants with minor allele frequencies (MAF) ≤ 0.01 to weigh their contribution to ALS, they identified 260 candidate genes associated with sporadic ALS. Overall, the strongest association was found for the untranslated region of IL18RAP, which was considered a protective non-coding allele that reduces the chance of developing ALS five-fold and delays the onset in people who develop the disease. After comparing the three lists and eliminating duplicates, we observed that 29% (86/300) of our ALS-associated gene list has been reported in at least one of two independent lists of ALS-associated genes (Eitan et al., 2022; Zhang et al., 2022), as depicted in Figure 4C. Intriguingly, only 6 genes are found in the overlap between the three data sets. The lists of genes are included in Supplementary Table 8.
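The three-list comparison can be expressed as plain set operations. The following is a minimal sketch, assuming gene symbols are normalized to upper case before comparison; the function names and the gene identifiers are hypothetical placeholders and do not reproduce the actual lists in Supplementary Table 8.

```python
def normalise(symbols):
    """Strip whitespace, upper-case, and de-duplicate gene symbols."""
    return {s.strip().upper() for s in symbols if s.strip()}


def overlap_summary(ours, list_b, list_c):
    """Summarise how many of our genes appear in two external lists."""
    ours, list_b, list_c = normalise(ours), normalise(list_b), normalise(list_c)
    in_b = ours & list_b
    in_c = ours & list_c
    in_any = in_b | in_c
    return {
        "shared_with_b": len(in_b),
        "shared_with_c": len(in_c),
        "shared_with_any": len(in_any),
        "shared_with_all": len(in_b & in_c),
        "fraction_with_any": len(in_any) / len(ours),
    }


# Placeholder identifiers only (not genes from the study).
ours = ["GENE_1", "GENE_2", "GENE_3", "GENE_4", "GENE_5"]
list_b = ["gene_1 ", "GENE_2", "GENE_9"]
list_c = ["GENE_2", "GENE_3", "GENE_8"]
summary = overlap_summary(ours, list_b, list_c)
print(summary)
```

Reporting both the union ("shared with any list") and the three-way intersection separately avoids ambiguity about which overlap a percentage refers to.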
3.5 Drug-gene interactions analysis reveals hub genes
The druggable genome consists of the group of genes that are known or predicted to interact with drugs in diverse conditions or disorders. To explore which of the ALS-associated genes are part of the druggable genome, we used the DGIdb (Cannon et al., 2024), which contains over 10,000 genes and 20,000 drugs involved in nearly 70,000 drug-gene interactions. We identified a total of 2,836 drug-gene interactions involving 120 genes included in our systematic review. Among those, 357 interactions had a defined interaction type (inhibitor, modulator, agonist, among others) and 2,479 were non-specific. The full list of retrieved interactions can be found in Supplementary Table 9. A drug-gene interaction network including only defined interactions and genes interacting with more than 2 drugs was created for visualization purposes (Figure 5). Although its interaction is non-specific, we added Edaravone and its interactor CYP1A2 to the network to give an overview of both FDA-approved drugs for the treatment of ALS (Edaravone and Riluzole). Interestingly, the network analysis revealed that two genes that encode alpha subunits of sodium channels, SCN4A and SCN7A, have the highest number of drug interactions and can also be inhibited by Riluzole.

Figure 5. Drug-gene interaction network. Drug-gene interaction network depicting drugs (Blue, Yellow) targeting relevant ALS-associated genes (green) found in the systematic review. Hub genes with the highest number of targeting drugs are shown. Arrow colors indicate the interaction category.
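The filtering used to build the network (defined interaction types only, genes interacting with more than 2 drugs) can be sketched as follows. This is an illustrative outline under assumptions: the record fields (`gene`, `drug`, `interaction_type`) are hypothetical and are not the actual DGIdb export schema, and the gene and drug names are placeholders.

```python
from collections import defaultdict


def build_network(records, min_drugs=3):
    """Keep interactions with a defined type, then keep genes targeted by
    at least `min_drugs` distinct drugs (i.e. more than 2 by default)."""
    defined = [r for r in records if r["interaction_type"]]
    drugs_per_gene = defaultdict(set)
    for r in defined:
        drugs_per_gene[r["gene"]].add(r["drug"])
    hubs = {gene for gene, drugs in drugs_per_gene.items() if len(drugs) >= min_drugs}
    # Each edge is (drug, gene, interaction type); the type can colour the arrow.
    edges = sorted(
        {(r["drug"], r["gene"], r["interaction_type"]) for r in defined if r["gene"] in hubs}
    )
    degree = {gene: len(drugs_per_gene[gene]) for gene in hubs}
    return edges, degree


# Placeholder records (not interactions retrieved in the study).
records = [
    {"gene": "GENE_X", "drug": "drug_1", "interaction_type": "inhibitor"},
    {"gene": "GENE_X", "drug": "drug_2", "interaction_type": "inhibitor"},
    {"gene": "GENE_X", "drug": "drug_3", "interaction_type": "modulator"},
    {"gene": "GENE_Y", "drug": "drug_1", "interaction_type": "agonist"},
    {"gene": "GENE_Y", "drug": "drug_4", "interaction_type": None},  # non-specific
    {"gene": "GENE_Y", "drug": "drug_5", "interaction_type": None},  # non-specific
    {"gene": "GENE_Z", "drug": "drug_6", "interaction_type": None},  # non-specific
]
edges, degree = build_network(records)
print(degree)
```

Counting distinct drugs after removing non-specific interactions means a gene with many uncategorized interactions, such as `GENE_Y` here, does not appear as a hub.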
SCN4A encodes a voltage-gated sodium channel and SCN7A a type II sodium channel whose activation is proportional to the extracellular sodium concentration, thus mediating sodium homeostasis. These channels are essential for proper neuronal and muscular membrane depolarization during an action potential. Two mechanisms have been proposed through which mutations in SCN4A can lead to an ALS phenotype: excessive sodium permeability leading to hyperexcitability and excitotoxicity, or retrograde motor neuron toxicity caused by muscular hyperexcitability (Franklin et al., 2020). SCN7A loss of function has been proposed to disrupt extracellular sodium homeostasis and lead to neuronal hyperexcitability (Franklin et al., 2020).
The main therapeutic action of Riluzole is the down-regulation of glutamatergic neurotransmission, which diminishes neuronal excitotoxicity. Part of this effect is due to the inhibition of glutamate release from presynaptic terminals, which may result from the drug’s role as a modulator of voltage-gated sodium channels (Doble, 1996; Mohammadi et al., 2002; Sever et al., 2022). Modulating the propagation of action potentials indirectly diminishes glutamate exocytosis and, hence, excitotoxicity. Riluzole has been described to act on a variety of sodium channels and their subunits (Song et al., 1997; Weiss et al., 2010), including SCN7A and SCN4A, which are also reported in the ChEMBL Bioactivity Database (Mendez et al., 2019). Whether the effect of Riluzole on mutant SCN7A and SCN4A is intact has not yet been studied. There are two case reports of patients with SCN4A mutations who were treated with Riluzole and died less than 2 years after symptom onset (Franklin et al., 2020). It is possible that Riluzole’s effectiveness is partially compromised in patients who are known carriers of mutations in these genes.
Interestingly, three of the genes with the most common and penetrant ALS mutations known (TARDBP, FUS, and SOD1) are drug targets; however, their interaction type is not categorized and thus they are not shown in the network (Figure 5). An outstanding example is SOD1, whose variants are targeted by Tofersen, an antisense oligonucleotide (Miller et al., 2022). Tofersen, a treatment designed specifically for SOD1-associated ALS, was approved by the FDA in April 2023 after clinical trials (Miller et al., 2020, 2022; Benatar et al., 2022). As an antisense oligonucleotide, Tofersen works by binding SOD1 transcripts and reducing the synthesis of the SOD1 protein (Weishaupt et al., 2024).
Among the genes with interactions classified as “non-specific,” CYP1A2 was found to be targeted by Edaravone, an FDA-approved drug for ALS treatment. Edaravone’s mechanism of action is still unclear; however, it is described as a reactive oxygen species scavenger. It is thought to trap free radicals and increase the expression of nuclear factor erythroid 2-related factor 2 (Nrf2), which activates antioxidant response genes, thus protecting cells from ferroptosis (Homma) (Johnson et al., 2022; Soares et al., 2023). Even though Edaravone has been broadly approved for the ALS population, its main benefit is in patients with definite or probable ALS who have either milder symptoms (scoring at least 2 points in each item of the ALSFRS-R) or a disease duration of less than 2 years (Soares et al., 2023).
3.6 Transcriptional dysregulation is observed in ALS public databases
To evaluate the transcriptional dysregulation of ALS-associated genes, we downloaded two datasets comprising ALS patients and neurologically healthy controls, corresponding to peripheral blood mononuclear cells (GSE183204) and postmortem cortex samples (GSE124439). Raw count matrices were log-transformed and normalized, and outlier samples were identified using PCA plots of the 1,000 most variable genes. In the first dataset (GSE183204), consisting of peripheral blood mononuclear cells from ALS patients with low levels of nuclear SOD1 and age- and sex-matched healthy controls (Garofalo et al., 2022), we identified four outlier samples (3 controls and 1 ALS). The remaining 9 ALS samples and 9 controls were compared. Focusing on the list of ALS-associated genes, we found 52 differentially expressed genes using Limma (Ritchie et al., 2015) with fold-change > 1.1 and FDR < 0.1. The heatmap in Figure 6A shows the row z-scores of the 52 differentially expressed genes, which clearly separate ALS cases from healthy controls. The majority of these genes are downregulated (39 down vs. 13 up). The top 5 most downregulated genes are CXCR4, CYLD, HLA-DRA, MATR3, and C9orf72.

Figure 6. Heatmaps of differentially expressed ALS-associated genes in public ALS databases. The differential expression of ALS-associated genes was evaluated in (A) peripheral blood mononuclear cells (GSE183204) (Garofalo et al., 2022) and (B) postmortem cortex samples (GSE124439) (Tam et al., 2019). Raw count matrices were log-transformed and normalized. Genes were considered differentially expressed with fold-change > 1.1 and FDR < 0.1. Gene expression is depicted as row-z-scores.
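The thresholds and the heatmap scaling described above can be made concrete with a short sketch. It assumes that the fold-change cutoff of 1.1 is applied symmetrically to up- and downregulated genes on the log2 scale (so |log2FC| > log2(1.1), approximately 0.14) and that row z-scores use the sample standard deviation; both are assumptions, as the text does not spell them out, and the function names are illustrative rather than part of Limma or DESeq2.

```python
from math import log2
from statistics import mean, stdev


def is_differentially_expressed(log2_fc, fdr, min_fold_change=1.1, max_fdr=0.1):
    """Apply a symmetric fold-change cutoff on the log2 scale plus an FDR cutoff."""
    return abs(log2_fc) > log2(min_fold_change) and fdr < max_fdr


def row_zscores(values):
    """Centre and scale one gene's expression values across samples."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Illustrative values only (not results from the study).
print(is_differentially_expressed(log2_fc=-0.2, fdr=0.05))  # passes both cutoffs
print(is_differentially_expressed(log2_fc=0.1, fdr=0.05))   # fold-change too small
print(row_zscores([1.0, 2.0, 3.0]))
```

Row z-scores make genes with very different expression levels comparable within a heatmap, at the cost of hiding absolute abundance.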
Similarly, we identified and filtered three outlier samples in the postmortem cortex dataset (GSE124439), corresponding to two controls and one ALS sample. The remaining 27 ALS samples, classified as activated microglia by the authors (Tam et al., 2019), were compared against 15 neurologically healthy controls using DESeq2 (Love et al., 2014). We found 163 differentially expressed genes with a fold-change > 1.1 and FDR < 0.1 (Figure 6B). Of these genes, 89 were downregulated and 74 upregulated in ALS samples, suggesting a more extensive dysregulation in postmortem cortex tissue than in peripheral blood mononuclear cells. The top 5 differentially expressed genes are all upregulated: PXK, MOBP, SH3TC2, MBP, and MOB3B. Interestingly, Myelin Basic Protein (MBP) and Myelin-Associated Oligodendrocyte Basic Protein (MOBP) are both involved in oligodendrocyte-driven myelination, whereas SH3 domain and tetratricopeptide repeats-containing protein 2 (SH3TC2) is thought to be expressed in the Schwann cells that wrap the myelin sheath around nerves (Stendel et al., 2010; Barateiro et al., 2016). These results support the proposition that oligodendrocyte dysfunction and myelin damage contribute to neuronal death in ALS (Kang et al., 2013). The molecular role of PXK and MOB3B in the pathophysiology of ALS is not clear. However, they are involved in cellular processes related to synaptic transmission, neuroprotection, and maintenance of cellular integrity (Mao et al., 2005; Takeuchi et al., 2010; Elkholi et al., 2023). As observed in Figure 7, 28 ALS-associated genes are differentially expressed in both datasets, and 23 of them show a significant fold-change in the same direction. The matrices of normalized counts and the differential expression metrics of ALS-associated genes in the peripheral blood mononuclear cell and postmortem cortex samples, together with the list of common differentially expressed genes and their fold-changes in each dataset, are included in Supplementary Table 10.

Figure 7. Comparison of differentially expressed ALS-associated genes. Comparison of log2-fold changes of differentially expressed ALS-associated genes found in two public datasets: GSE183204 (Garofalo et al., 2022) and GSE124439 (Tam et al., 2019). Axes represent log2-fold changes. The regression line is depicted in red.
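The comparison shown in Figure 7 amounts to intersecting the two sets of differentially expressed genes, checking whether the sign of the log2-fold change agrees, and fitting a line through the paired values. A minimal sketch follows, assuming an ordinary least-squares fit (the fitting method is not stated in the text); the function names, gene identifiers and values are hypothetical placeholders.

```python
def compare_fold_changes(lfc_a, lfc_b):
    """Return genes present in both datasets and those changing in the same direction."""
    common = sorted(set(lfc_a) & set(lfc_b))
    same_direction = [g for g in common if lfc_a[g] * lfc_b[g] > 0]
    return common, same_direction


def least_squares_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder log2-fold changes (not values from Supplementary Table 10).
pbmc = {"GENE_1": -1.0, "GENE_2": 1.0, "GENE_3": -0.5, "GENE_4": 0.8}
cortex = {"GENE_1": -2.0, "GENE_2": 2.0, "GENE_3": 0.5, "GENE_5": 1.5}

common, same_direction = compare_fold_changes(pbmc, cortex)
slope, intercept = least_squares_line(
    [pbmc[g] for g in common], [cortex[g] for g in common]
)
print(common, same_direction, slope, intercept)
```

A slope greater than 1 in such a fit would be consistent with larger fold-changes in the dataset plotted on the y-axis, although with few shared genes the estimate is sensitive to individual points.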
4 Discussion
In ALS, pathological processes arise mainly from toxic gain-of-function, loss-of-function or toxic aggregates of proteins derived from variations/mutations in genes. Until recently, only about 40 ALS genes were known (Chia et al., 2018; Goutman et al., 2018), and isolated efforts have been made to increase this number using multi-omics and machine-learning approaches (Eitan et al., 2022; Zhang et al., 2022). Here, we performed a systematic review to identify genes with variants associated with ALS. We comprehensively analyzed, filtered, and classified genes from 4,293 abstracts retrieved from PubMed, 7,343 variants from ClinVar, and 33 study accessions from the GWAS catalog. Our analysis yielded 300 genes with 479 ALS-associated variants. These genes were classified according to their association with ALS, and the largest number of them were assigned to the experimentally validated group. Other genes with ALS variants were classified as having GWAS evidence or suggested as biomarkers. As expected, the genes with the highest number of related publications are those with the most common and penetrant ALS mutations known: C9orf72, TARDBP, FUS, and SOD1 (Goutman et al., 2018). Moreover, the top 20 genes with the highest number of publications carry mainly pathogenic variants. Interestingly, 71% of ALS-associated variants are found in protein-coding regions (exons, TSS, and TTS), in contrast to previous reports that most disease-associated variants lie in non-coding regions (Smigielski et al., 2000; Hindorff et al., 2009). Furthermore, our results show that ALS-associated genes are largely protein-coding genes with enzymatic activity. Although our aim was to capture both coding and non-coding variants associated with ALS, our search strategy was optimized for sensitivity to gene-linked evidence, thereby prioritizing variants with established gene associations.
As a result, non-coding variants located in intergenic or enhancer regions without annotated gene links were excluded. While this may have introduced a gene-centric bias, the lack of gene annotation in these regions meant their inclusion would not have impacted our downstream analyses. Still, these non-coding elements remain underexplored and may play a more significant role in ALS pathogenesis, as has been suggested (Cooper-Knock et al., 2021; van Rheenen et al., 2021). Moreover, to fully address the importance of non-coding variants, more GWAS based on whole-genome sequencing are needed.
We also investigated the molecular function and subcellular location of the proteins encoded by our list of ALS-associated genes and found that 26% and 21% were located in the cytoplasm and in the nucleus, respectively, consistent with pathophysiological mechanisms of ALS that include disturbed RNA metabolism, impaired autophagy/proteostasis, impaired DNA repair, and cytoskeletal defects (Nguyen et al., 2018). Moreover, we found significantly enriched gene sets relevant to ALS pathophysiology using our list of ALS-associated genes, and we observed genetic overlap with other disorders, for example, depression, cognitive impairment, frontotemporal dementia and dementia. This genetic overlap may underlie the shared disease symptoms and may help explain the difficulty in their diagnosis.
We found that 29% of our list of ALS-associated genes has been reported in a manually curated or a machine-learning predicted list. These results likely reflect distinct biological dimensions of ALS, with literature-supported annotation on one side and data-driven functional predictions on the other. The minimal convergence observed across gene sets suggests that machine learning and transcriptomic methodologies, while powerful, may not yet capture the full spectrum of ALS-relevant genes recognized in the literature, or are designed to selectively target rare genetic regions.
To analyze potential drugs for repurposing, we explored the drug-gene interactions using our list of ALS-associated genes. We retrieved a list of potential drugs targeting ALS-associated genes, including commonly used drugs (Riluzole and Edaravone) and recently FDA-approved drugs (Tofersen). Tofersen belongs to a novel category of drugs consisting of antisense oligonucleotides that bind transcripts originating from genes with specific variants, thereby reducing the synthesis of the mutant protein and decreasing its toxic effects. It is important to mention that these drugs only provide benefit in individuals carrying the specific gene variants, highlighting the importance of genetic screening and GWAS. Moreover, an increasing number of antisense oligonucleotide drugs are being developed (Crooke et al., 2021); thus, it is important to compile comprehensive catalogs of variants and genes associated with ALS.
We hypothesized that ALS-associated genes would show transcriptional dysregulation. To verify this, we downloaded two public ALS data sets corresponding to samples from peripheral blood mononuclear cells and postmortem cortex (Tam et al., 2019; Garofalo et al., 2022). We performed a differential expression analysis and found that 17% and 54% of our list of 300 ALS-associated genes were significantly dysregulated in peripheral blood mononuclear cell and postmortem cortex samples, respectively. Interestingly, even though the fold-changes were considerably higher in the postmortem cortex samples, 28 genes were differentially expressed in both data sets, with 23 of them showing dysregulation in the same direction. These results suggest that transcriptional dysregulation is observed not only in neurons but also in mononuclear cells; thus, these transcripts are potential biomarkers of ALS. Among the common ALS-associated dysregulated genes, we found PXK, MOB3B, and CXCR4. Intriguingly, while the role of PXK and MOB3B in ALS remains elusive, researchers have demonstrated that they are involved in synaptic signaling and axonal survival and maintenance (Mao et al., 2005; Elkholi et al., 2023). Similarly, increasing evidence implicates the CXC chemokines/cognate receptors signaling axes in the pathophysiology of ALS, suggesting that monitoring CXC chemokines and their receptors (e.g., CXCR4) in ALS is important for tracking disease progression (La Cognata et al., 2024).
Several limitations in ALS genetic research should be highlighted. One significant challenge is the discrepancy between the number of ALS-related genes identified in PubMed compared to ClinVar or GWAS databases. This suggests a need for more comprehensive genetic studies and better registration practices, particularly in ClinVar, to ensure clinical significance and facilitate the application of findings. Furthermore, the heterogeneity in study design and data analysis complicates the integration of findings across studies, underscoring the importance of standardized methodologies and reporting practices. Another critical limitation is the lack of reproducibility across independent cohorts, often influenced by differences in sample size, geographic regions, and genetic backgrounds. Addressing these issues is vital for advancing ALS genetic research. Finally, there remains a significant gap in studying underrepresented ethnic populations. Most genetic research in ALS has focused on populations of European descent, leaving many other ethnic groups largely unexplored. Expanding genetic studies to include diverse populations is critical to gaining a comprehensive understanding of ALS pathogenesis and addressing disparities in disease outcomes.
Overall, this systematic review consolidates data from multiple databases to create a comprehensive catalog of genes and variants associated with ALS, presenting promising candidates for further validation studies. Our findings emphasize the need to transition from genetic associations to larger, more diverse case-control and cohort studies to deepen our understanding of ALS pathogenesis. Additionally, registering these variants in databases like ClinVar can enhance their utility in clinical and research contexts.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary material, further inquiries can be directed to the corresponding authors.
Author contributions
CA-A: Conceptualization, Data curation, Investigation, Supervision, Writing – original draft, Writing – review and editing. RC-DD: Conceptualization, Formal Analysis, Methodology, Funding acquisition, Supervision, Writing – original draft, Writing – review and editing. HRM: Conceptualization, Funding acquisition, Supervision, Writing – review and editing. RO-L: Conceptualization, Writing – review and editing. JAF-S: Conceptualization, Formal Analysis, Methodology, Writing – review and editing. JAM-G: Conceptualization, Formal Analysis, Methodology, Funding acquisition, Writing – original draft, Writing – review and editing. GKP-M: Conceptualization, Formal Analysis, Methodology, Funding acquisition, Writing – original draft, Writing – review and editing. REF-S: Conceptualization, Formal Analysis, Methodology, Funding acquisition, Writing – original draft, Writing – review and editing. EM-L: Formal Analysis, Methodology, Writing – review and editing. LM-RM: Formal Analysis, Investigation, Writing – original draft. KRR-A: Formal Analysis, Investigation, Writing – original draft. DM-C: Formal Analysis, Investigation, Writing – original draft. PJA-M: Formal Analysis, Investigation, Writing – original draft.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This project was supported by the Challenge-Based Research Funding Program 2023 (IJUT070-23DG93002). LM-RM received a graduate fellowship from CONAHCyT (1078461).
Acknowledgments
We would like to thank Deborah Garza-Hernandez for her advice on the use of the PubTerm web tool, and Línea de Servicios de Neurociencias de TecSalud.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The authors declare that no Generative AI was used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2025.1598336/full#supplementary-material
Supplementary Figure 1 | Schematic diagram depicting the retrieval and selection of ALS variants associated with genes. Variants were compiled from ClinVar, PubMed, and GWAS Catalog databases while performing the systematic review. This figure is related to Figure 1.
Supplementary Table 1 | List of publications retrieved from PubMed using the defined search query.
Supplementary Table 2 | Genes retrieved through the systematic review, including their inclusion and exclusion categories.
Supplementary Table 3 | Sentences from abstracts critical for assigning a gene’s category as annotated in PubTerm’s notes.
Supplementary Table 4 | Collection of variants found in ALS-associated genes in GWAS and ClinVar databases.
Supplementary Table 5 | Catalog of SNP identifiers compiled from PubMed, ClinVar, and GWAS databases through a systematic review.
Supplementary Table 6 | Classification of ALS-associated genes corresponding to gene type, subcellular location data, and TF function.
Supplementary Table 7 | Gene set enrichment analysis using ALS-associated genes.
Supplementary Table 8 | Comparison of ALS-associated genes to a manually curated list of ALS genes and a machine learning driven approach.
Supplementary Table 9 | Table of drug-gene interactions retrieved from DGIdb (Cannon et al., 2024).
Supplementary Table 10 | Differential gene expression analysis metrics of ALS-associated genes in public databases of peripheral blood mononuclear cells and postmortem cortex samples.
References
Arthur, K. C., Calvo, A., Price, T. R., Geiger, J. T., Chiò, A., and Traynor, B. J. (2016). Projected increase in amyotrophic lateral sclerosis from 2015 to 2040. Nat. Commun. 7:12408. doi: 10.1038/NCOMMS12408
Bandres-Ciga, S., Noyce, A. J., Hemani, G., Nicolas, A., Calvo, A., Mora, G., et al. (2019). Shared polygenic risk and causal inferences in amyotrophic lateral sclerosis. Ann. Neurol. 85, 470–481. doi: 10.1002/ANA.25431
Barateiro, A., Brites, D., and Fernandes, A. (2016). Oligodendrocyte development and myelination in neurodevelopment: Molecular mechanisms in health and disease. Curr. Pharm. Des. 22, 656–679. doi: 10.2174/1381612822666151204000636
Barrett, T., Wilhite, S. E., Ledoux, P., Evangelista, C., Kim, I. F., Tomashevsky, M., et al. (2013). NCBI GEO: Archive for functional genomics data sets–update. Nucleic Acids Res. 41:93. doi: 10.1093/NAR/GKS1193
Bateman, A., Martin, M. J., Orchard, S., Magrane, M., Ahmad, S., Alpi, E., et al. (2023). UniProt: The universal protein knowledgebase in 2023. Nucleic Acids Res. 51, D523–D531. doi: 10.1093/NAR/GKAC1052
Beeldman, E., Govaarts, R., De Visser, M., Klein Twennaar, M., Van Der Kooi, A. J., Van Den Berg, L. H., et al. (2020). Progression of cognitive and behavioural impairment in early amyotrophic lateral sclerosis. J. Neurol. Neurosurg. Psychiatry 91, 779–780. doi: 10.1136/JNNP-2020-322992
Benatar, M., Wuu, J., Andersen, P. M., Bucelli, R. C., Andrews, J. A., Otto, M., et al. (2022). Design of a randomized, placebo-controlled, phase 3 trial of tofersen initiated in clinically presymptomatic SOD1 variant carriers: The ATLAS study. Neurotherapeutics 19, 1248–1258. doi: 10.1007/S13311-022-01237-4
Brooks, B. R., Miller, R. G., Swash, M., and Munsat, T. L. (2000). El Escorial revisited: Revised criteria for the diagnosis of amyotrophic lateral sclerosis. Amyotroph. Lateral Scler. Other Motor Neuron Disord. 1, 293–299. doi: 10.1080/146608200300079536
Brown, C. A., Lally, C., Kupelian, V., and Dana Flanders, W. (2021). Estimated prevalence and incidence of amyotrophic lateral sclerosis and SOD1 and C9orf72 genetic variants. Neuroepidemiology 55, 342–353. doi: 10.1159/000516752
Cannon, M., Stevenson, J., Stahl, K., Basu, R., Coffman, A., Kiwala, S., et al. (2024). DGIdb 5.0: Rebuilding the drug-gene interaction database for precision medicine and drug discovery platforms. Nucleic Acids Res. 52, D1227–D1235. doi: 10.1093/NAR/GKAD1040
Chia, R., Chiò, A., and Traynor, B. J. (2018). Novel genes associated with amyotrophic lateral sclerosis: Diagnostic and clinical implications. Lancet Neurol. 17, 94–102. doi: 10.1016/S1474-4422(17)30401-5
Chiò, A., Battistini, S., Calvo, A., Caponnetto, C., Conforti, F. L., Corbo, M., et al. (2014). Genetic counselling in ALS: Facts, uncertainties and clinical suggestions. J. Neurol. Neurosurg. Psychiatry 85, 478–485. doi: 10.1136/JNNP-2013-305546
Cooper-Knock, J., Zhang, S., Kenna, K. P., Moll, T., Franklin, J. P., Allen, S., et al. (2021). Rare variant burden analysis within enhancers identifies CAV1 as an ALS risk gene. Cell Rep. 34:108730. doi: 10.1016/j.celrep.2021.108730
Crooke, S. T., Baker, B. F., Crooke, R. M., and Liang, X. (2021). Antisense technology: An overview and prospectus. Nat. Rev. Drug Discov. 20, 427–453. doi: 10.1038/S41573-021-00162-Z
de Carvalho, M., Dengler, R., Eisen, A., England, J. D., Kaji, R., Kimura, J., et al. (2008). Electrodiagnostic criteria for diagnosis of ALS. Clin. Neurophysiol. 119, 497–503. doi: 10.1016/J.CLINPH.2007.09.143
Doble, A. (1996). The pharmacology and mechanism of action of riluzole. Neurology 47, S233–S241. doi: 10.1212/WNL.47.6_SUPPL_4.233S
Eitan, C., Siany, A., Barkan, E., Olender, T., van Eijk, K. R., Moisse, M., et al. (2022). Whole-genome sequencing reveals that variants in the Interleukin 18 receptor accessory protein 3’UTR protect against ALS. Nat. Neurosci. 25, 433–445. doi: 10.1038/S41593-022-01040-6
Elkholi, I. E., Boulais, J., Thibault, M. P., Phan, H. D., Robert, A., Lai, L. B., et al. (2023). Mapping the MOB proteins’ proximity network reveals a unique interaction between human MOB3C and the RNase P complex. J. Biol. Chem. 299:5123. doi: 10.1016/J.JBC.2023.105123
Feldman, E. L., Goutman, S. A., Petri, S., Mazzini, L., Savelieff, M. G., Shaw, P. J., et al. (2022). Amyotrophic lateral sclerosis. Lancet 400, 1363–1380. doi: 10.1016/S0140-6736(22)01272-7
Franklin, J. P., Cooper-Knock, J., Baheerathan, A., Moll, T., Männikkö, R., Heverin, M., et al. (2020). Concurrent sodium channelopathies and amyotrophic lateral sclerosis supports shared pathogenesis. Amyotroph. Lateral Scler. Frontotemporal Degener. 21, 627–630. doi: 10.1080/21678421.2020.1786128
Galvin, M., Gaffney, R., Corr, B., Mays, I., and Hardiman, O. (2017). From first symptoms to diagnosis of amyotrophic lateral sclerosis: Perspectives of an Irish informal caregiver cohort-a thematic analysis. BMJ Open 7:e014985. doi: 10.1136/BMJOPEN-2016-014985
Garcia-Pelaez, J., Rodriguez, D., Medina-Molina, R., Garcia-Rivas, G., Jerjes-Sánchez, C., and Trevino, V. (2019). PubTerm: A web tool for organizing, annotating and curating genes, diseases, molecules and other concepts from PubMed records. Database (Oxford) 2019:bay137. doi: 10.1093/DATABASE/BAY137
Garofalo, M., Pandini, C., Bordoni, M., Jacchetti, E., Diamanti, L., Carelli, S., et al. (2022). RNA molecular signature profiling in PBMCs of sporadic ALS patients: HSP70 overexpression is associated with nuclear SOD1. Cells 11:293. doi: 10.3390/CELLS11020293
Goutman, S. A., Chen, K. S., Paez-Colasante, X., and Feldman, E. L. (2018). Emerging understanding of the genotype-phenotype relationship in amyotrophic lateral sclerosis. Handb. Clin. Neurol. 148, 603–623. doi: 10.1016/B978-0-444-64076-5.00039-9
Hardiman, O., Al-Chalabi, A., Chio, A., Corr, E. M., Logroscino, G., Robberecht, W., et al. (2017). Amyotrophic lateral sclerosis. Nat. Rev. Dis. Prim. 3:17071. doi: 10.1038/NRDP.2017.71
Heberle, H., Meirelles, V. G., da Silva, F. R., Telles, G. P., and Minghim, R. (2015). InteractiVenn: A web-based tool for the analysis of sets through Venn diagrams. BMC Bioinform. 16:169. doi: 10.1186/S12859-015-0611-3
Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y. C., Laslo, P., et al. (2010). Simple combinations of lineage-determining transcription factors prime CIS-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589. doi: 10.1016/J.MOLCEL.2010.05.004
Hindorff, L. A., Sethupathy, P., Junkins, H. A., Ramos, E. M., Mehta, J. P., Collins, F. S., et al. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. U.S.A. 106, 9362–9367. doi: 10.1073/PNAS.0903103106
Johnson, S. A., Fang, T., De Marchi, F., Neel, D., Van Weehaeghe, D., Berry, J. D., et al. (2022). Pharmacotherapy for amyotrophic lateral sclerosis: A review of approved and upcoming agents. Drugs 82, 1367–1388. doi: 10.1007/S40265-022-01769-1
Kang, S. H., Li, Y., Fukaya, M., Lorenzini, I., Cleveland, D. W., Ostrow, L. W., et al. (2013). Degeneration and impaired regeneration of gray matter oligodendrocytes in amyotrophic lateral sclerosis. Nat. Neurosci. 16, 571–579. doi: 10.1038/NN.3357
Kiernan, M. C., Vucic, S., Cheah, B. C., Turner, M. R., Eisen, A., Hardiman, O., et al. (2011). Amyotrophic lateral sclerosis. Lancet 377, 942–955. doi: 10.1016/S0140-6736(10)61156-7
La Cognata, V., Morello, G., Guarnaccia, M., and Cavallaro, S. (2024). The multifaceted role of the CXC chemokines and receptors signaling axes in ALS pathophysiology. Prog. Neurobiol. 235:102587. doi: 10.1016/J.PNEUROBIO.2024.102587
Lambert, S. A., Jolma, A., Campitelli, L. F., Das, P. K., Yin, Y., Albu, M., et al. (2018). The human transcription factors. Cell 172, 650–665. doi: 10.1016/J.CELL.2018.01.029
Landrum, M. J., Lee, J. M., Benson, M., Brown, G. R., Chao, C., Chitipiralla, S., et al. (2018). ClinVar: Improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067. doi: 10.1093/NAR/GKX1153
Liberzon, A., Birger, C., Thorvaldsdóttir, H., Ghandi, M., Mesirov, J. P., and Tamayo, P. (2015). The molecular signatures database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425. doi: 10.1016/J.CELS.2015.12.004
Logroscino, G., Piccininni, M., Marin, B., Nichols, E., Abd-Allah, F., Abdelalim, A., et al. (2018). Global, regional, and national burden of motor neuron diseases 1990-2016: A systematic analysis for the global burden of disease study 2016. Lancet Neurol. 17, 1083–1097. doi: 10.1016/S1474-4422(18)30404-6
Longinetti, E., and Fang, F. (2019). Epidemiology of amyotrophic lateral sclerosis: An update of recent literature. Curr. Opin. Neurol. 32, 771–776. doi: 10.1097/WCO.0000000000000730
Love, M. I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15:550. doi: 10.1186/S13059-014-0550-8
Mao, H., Ferguson, T. S., Cibulsky, S. M., Holmqvist, M., Ding, C., Fei, H., et al. (2005). MONaKA, a novel modulator of the plasma membrane Na,K-ATPase. J. Neurosci. 25, 7934–7943. doi: 10.1523/JNEUROSCI.0635-05.2005
Martínez, H. R., Molina-López, J. F., Cantú-Martínez, L., González-Garza, M. T., Moreno-Cuevas, J. E., Couret-Alcaraz, P., et al. (2011). Survival and clinical features in Hispanic amyotrophic lateral sclerosis patients. Amyotroph. Lateral Scler. 12, 199–205. doi: 10.3109/17482968.2010.550302
Masrori, P., and Van Damme, P. (2020). Amyotrophic lateral sclerosis: A clinical review. Eur. J. Neurol. 27, 1918–1929. doi: 10.1111/ENE.14393
Mead, R. J., Shan, N., Reiser, H. J., Marshall, F., and Shaw, P. J. (2023). Amyotrophic lateral sclerosis: A neurodegenerative disorder poised for successful therapeutic translation. Nat. Rev. Drug Discov. 22, 185–212. doi: 10.1038/S41573-022-00612-2
Mehta, P., Kaye, W., Raymond, J., Wu, R., Larson, T., Punjani, R., et al. (2018). Prevalence of amyotrophic lateral sclerosis - United States, 2014. MMWR Morb. Mortal. Wkly. Rep. 67, 216–218. doi: 10.15585/MMWR.MM6707A3
Mendez, D., Gaulton, A., Bento, A. P., Chambers, J., De Veij, M., Félix, E., et al. (2019). ChEMBL: Towards direct deposition of bioassay data. Nucleic Acids Res. 47, D930–D940. doi: 10.1093/NAR/GKY1075
Miller, T. M., Cudkowicz, M. E., Genge, A., Shaw, P. J., Sobue, G., Bucelli, R. C., et al. (2022). Trial of antisense oligonucleotide tofersen for SOD1 ALS. N. Engl. J. Med. 387, 1099–1110. doi: 10.1056/NEJMOA2204705
Miller, T., Cudkowicz, M., Shaw, P. J., Andersen, P. M., Atassi, N., Bucelli, R. C., et al. (2020). Phase 1-2 trial of antisense oligonucleotide tofersen for SOD1 ALS. N. Engl. J. Med. 383, 109–119. doi: 10.1056/NEJMOA2003715
Mohammadi, B., Lang, N., Dengler, R., and Bufler, J. (2002). Interaction of high concentrations of riluzole with recombinant skeletal muscle sodium channels and adult-type nicotinic receptor channels. Muscle Nerve 26, 539–545. doi: 10.1002/MUS.10230
Neumann, M., Sampathu, D. M., Kwong, L. K., Truax, A. C., Micsenyi, M. C., Chou, T. T., et al. (2006). Ubiquitinated TDP-43 in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Science 314, 130–133. doi: 10.1126/SCIENCE.1134108
Nguyen, H. P., Van Broeckhoven, C., and van der Zee, J. (2018). ALS genes in the genomic era and their implications for FTD. Trends Genet. 34, 404–423. doi: 10.1016/J.TIG.2018.03.001
Nicolas, A., Kenna, K., Renton, A. E., Ticozzi, N., Faghri, F., Chia, R., et al. (2018). Genome-wide analyses identify KIF5A as a novel ALS gene. Neuron 97:1268–1283.e6. doi: 10.1016/J.NEURON.2018.02.027
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., et al. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 372:n71. doi: 10.1136/BMJ.N71
Pender, N., Pinto-Grau, M., and Hardiman, O. (2020). Cognitive and behavioural impairment in amyotrophic lateral sclerosis. Curr. Opin. Neurol. 33, 649–654. doi: 10.1097/WCO.0000000000000862
Renton, A. E., Chiò, A., and Traynor, B. J. (2014). State of play in amyotrophic lateral sclerosis genetics. Nat. Neurosci. 17, 17–23. doi: 10.1038/NN.3584
Ringholz, G. M., Appel, S. H., Bradshaw, M., Cooke, N. A., Mosnik, D. M., and Schulz, P. E. (2005). Prevalence and patterns of cognitive impairment in sporadic ALS. Neurology 65, 586–590. doi: 10.1212/01.WNL.0000172911.39167.B6
Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., et al. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47. doi: 10.1093/NAR/GKV007
Ryan, M., Heverin, M., McLaughlin, R. L., and Hardiman, O. (2019). Lifetime risk and heritability of amyotrophic lateral sclerosis. JAMA Neurol. 76, 1367–1374. doi: 10.1001/JAMANEUROL.2019.2044
Salameh, J. S., Brown, R. H., and Berry, J. D. (2015). Amyotrophic lateral sclerosis: Review. Semin. Neurol. 35, 469–476. doi: 10.1055/S-0035-1558984
Sever, B., Ciftci, H., Demirci, H., Sever, H., Ocak, F., Yulug, B., et al. (2022). Comprehensive research on past and future therapeutic strategies devoted to treatment of amyotrophic lateral sclerosis. Int. J. Mol. Sci. 23:2400. doi: 10.3390/IJMS23052400
Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., et al. (2003). Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504. doi: 10.1101/GR.1239303
Shefner, J. M., Al-Chalabi, A., Baker, M. R., Cui, L. Y., de Carvalho, M., Eisen, A., et al. (2020). A proposal for new diagnostic criteria for ALS. Clin. Neurophysiol. 131, 1975–1978. doi: 10.1016/J.CLINPH.2020.04.005
Shen, W. K., Chen, S. Y., Gan, Z. Q., Zhang, Y. Z., Yue, T., Chen, M. M., et al. (2023). AnimalTFDB 4.0: A comprehensive animal transcription factor database updated with variation and expression annotations. Nucleic Acids Res. 51, D39–D45. doi: 10.1093/NAR/GKAC907
Smigielski, E. M., Sirotkin, K., Ward, M., and Sherry, S. T. (2000). dbSNP: A database of single nucleotide polymorphisms. Nucleic Acids Res. 28, 352–355. doi: 10.1093/NAR/28.1.352
Soares, P., Silva, C., Chavarria, D., Silva, F. S. G., Oliveira, P. J., and Borges, F. (2023). Drug discovery and amyotrophic lateral sclerosis: Emerging challenges and therapeutic opportunities. Ageing Res. Rev. 83:101790. doi: 10.1016/J.ARR.2022.101790
Sollis, E., Mosaku, A., Abid, A., Buniello, A., Cerezo, M., Gil, L., et al. (2023). The NHGRI-EBI GWAS Catalog: Knowledgebase and deposition resource. Nucleic Acids Res. 51, D977–D985. doi: 10.1093/NAR/GKAC1010
Song, J.-H., Huang, C.-S., Nagata, K., Yeh, J. Z., and Narahashi, T. (1997). Differential action of riluzole on tetrodotoxin-sensitive and tetrodotoxin-resistant sodium channels. J. Pharmacol. Exp. Ther. 282, 707–714. doi: 10.1016/S0022-3565(24)36856-9
Stendel, C., Roos, A., Kleine, H., Arnaud, E., Özçelik, M., Sidiropoulos, P. N. M., et al. (2010). SH3TC2, a protein mutant in Charcot-Marie-Tooth neuropathy, links peripheral nerve myelination to endosomal recycling. Brain 133, 2462–2474. doi: 10.1093/BRAIN/AWQ168
Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., et al. (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U.S.A. 102, 15545–15550. doi: 10.1073/PNAS.0506580102
Swinnen, B., and Robberecht, W. (2014). The phenotypic variability of amyotrophic lateral sclerosis. Nat. Rev. Neurol. 10, 661–670. doi: 10.1038/NRNEUROL.2014.184
Takeuchi, H., Takeuchi, T., Gao, J., Cantley, L. C., and Hirata, M. (2010). Characterization of PXK as a protein involved in epidermal growth factor receptor trafficking. Mol. Cell Biol. 30, 1689–1702. doi: 10.1128/MCB.01105-09
Tam, O. H., Rozhkov, N. V., Shaw, R., Kim, D., Hubbard, I., Fennessey, S., et al. (2019). Postmortem cortex samples identify distinct molecular subtypes of ALS: Retrotransposon activation, oxidative stress, and activated glia. Cell Rep. 29:1164–1177.e5. doi: 10.1016/J.CELREP.2019.09.066
Tard, C., Defebvre, L., Moreau, C., Devos, D., and Danel-Brunaud, V. (2017). Clinical features of amyotrophic lateral sclerosis and their prognostic value. Rev. Neurol. (Paris) 173, 263–272. doi: 10.1016/J.NEUROL.2017.03.029
Trabjerg, B. B., Garton, F. C., Van Rheenen, W., Fang, F., Henderson, R. D., Mortensen, P. B., et al. (2020). ALS in Danish Registries: Heritability and links to psychiatric and cardiovascular disorders. Neurol. Genet. 6:e398. doi: 10.1212/NXG.0000000000000398
Turner, M. R., and Talbot, K. (2013). Mimics and chameleons in motor neurone disease. Pract. Neurol. 13, 153–164. doi: 10.1136/PRACTNEUROL-2013-000557
Van Rheenen, W., Shatunov, A., Dekker, A. M., McLaughlin, R. L., Diekstra, F. P., Pulit, S. L., et al. (2016). Genome-wide association analyses identify new risk variants and the genetic architecture of amyotrophic lateral sclerosis. Nat. Genet. 48, 1043–1048. doi: 10.1038/NG.3622
van Rheenen, W., van der Spek, R. A. A., Bakker, M. K., van Vugt, J. J. F. A., Hop, P. J., Zwamborn, R. A. J., et al. (2021). Common and rare variant association analyses in amyotrophic lateral sclerosis identify 15 risk loci with distinct genetic architectures and neuron-specific biology. Nat. Genet. 53, 1636–1648. doi: 10.1038/S41588-021-00973-1
Weishaupt, J. H., Körtvélyessy, P., Schumann, P., Valkadinov, I., Weyen, U., Hesebeck-Brinckmann, J., et al. (2024). Tofersen decreases neurofilament levels supporting the pathogenesis of the SOD1 p.D91A variant in amyotrophic lateral sclerosis patients. Commun. Med. 4:150. doi: 10.1038/S43856-024-00573-0
Weiss, S., Benoist, D., White, E., Teng, W., and Saint, D. A. (2010). Riluzole protects against cardiac ischaemia and reperfusion damage via block of the persistent sodium current. Br. J. Pharmacol. 160, 1072–1082. doi: 10.1111/J.1476-5381.2010.00766.X
Winter, D. J. (2017). rentrez: An R package for the NCBI eUtils API. R. J. 9, 520–526. doi: 10.32614/RJ-2017-058
Wolfson, C., Gauvin, D. E., Ishola, F., and Oskoui, M. (2023). Global prevalence and incidence of amyotrophic lateral sclerosis: A systematic review. Neurology 101, E613–E623. doi: 10.1212/WNL.0000000000207474
Keywords: Amyotrophic Lateral Sclerosis, systematic review, genes, variants, mutations
Citation: Arreola-Aldape CA, Moran-Guerrero JA, Pons-Monnier GK, Flores-Salcido RE, Martinez-Ledesma E, Ruiz-Manriquez LM, Razo-Alvarez KR, Mares-Custodio D, Avalos-Montes PJ, Figueroa-Sanchez JA, Ortiz-Lopez R, Martínez HR and Cuevas-Diaz Duran R (2025) A systematic review and functional in-silico analysis of genes and variants associated with amyotrophic lateral sclerosis. Front. Neurosci. 19:1598336. doi: 10.3389/fnins.2025.1598336
Received: 23 March 2025; Accepted: 13 May 2025; Published: 16 June 2025.
Edited by:
Francesca Luisa Conforti, University of Calabria, Italy
Reviewed by:
Silvia Corrochano, Health Research Institute of the Hospital Clínico San Carlos (IdISSC), Spain
Zhangyu Zou, Fujian Medical University Union Hospital, China
Copyright © 2025 Arreola-Aldape, Moran-Guerrero, Pons-Monnier, Flores-Salcido, Martinez-Ledesma, Ruiz-Manriquez, Razo-Alvarez, Mares-Custodio, Avalos-Montes, Figueroa-Sanchez, Ortiz-Lopez, Martínez and Cuevas-Diaz Duran. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Hector R. Martínez, hector.ramon@tec.mx; Raquel Cuevas-Diaz Duran, raquel.cuevas.dd@tec.mx
†These authors have contributed equally to this work and share first authorship