Large-scale analyses of CAV1 and CAV2 suggest their expression is higher in post-mortem ALS brain tissue and affects survival

Introduction: Caveolin-1 and Caveolin-2 (CAV1 and CAV2) are proteins associated with intercellular neurotrophic signalling. There is converging evidence that CAV1 and CAV2 (CAV1/2) genes have a role in amyotrophic lateral sclerosis (ALS). Disease-associated variants have been identified within CAV1/2 enhancers, which reduce gene expression and lead to disruption of membrane lipid rafts. Methods: Using large ALS whole-genome sequencing and post-mortem RNA sequencing datasets (5,987 and 365 tissue samples, respectively), and iPSC-derived motor neurons from 55 individuals, we investigated the role of CAV1/2 expression and enhancer variants in the ALS phenotype. Results: We report a differential expression analysis between ALS cases and controls for CAV1 and CAV2 genes across various post-mortem brain tissues and three independent datasets. CAV1 and CAV2 expression was consistently higher in ALS patients compared to controls, with significant results across the primary motor cortex, lateral motor cortex, and cerebellum. We also identify increased survival among carriers of CAV1/2 enhancer mutations compared to non-carriers within Project MinE and slower progression as measured by the ALSFRS. Carriers showed a median increase in survival of 345 days. Discussion: These results add to an increasing body of evidence linking CAV1 and CAV2 genes to ALS. We propose that carriers of CAV1/2 enhancer mutations may be conceptualised as an ALS subtype who present a less severe ALS phenotype with a longer survival duration and slower progression. Upregulation of CAV1/2 genes in ALS cases may indicate a causal pathway or a compensatory mechanism. Given prior research supporting the beneficial role of CAV1/2 expression in ALS patients, we consider a compensatory mechanism to better fit the available evidence, although further investigation into the biological pathways associated with CAV1/2 is needed to support this conclusion.


Introduction
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease affecting upper and lower motor neurons. It is characterised by the progressive loss of motor function, leading to muscle weakness, difficulty breathing and swallowing, and paralysis. There is currently no cure, and mean life expectancy is 3 years (Al-Chalabi and Hardiman, 2013). ALS is comorbid with fronto-temporal dementia (FTD), with an estimated 50% of ALS patients experiencing impaired executive function (Lomen-Hoerth et al., 2003; Strong et al., 2017). These diseases are often conceptualised as two ends of a disease spectrum with a shared pathogenesis and clinical overlap (Phukan et al., 2007; Conlon et al., 2018).
Individuals who have a first-degree relative with ALS are twice as likely as average to develop ALS (Al-Chalabi et al., 2010), and patients with a family history (familial ALS) make up approximately 5%-10% of cases (Zou et al., 2017). A pathogenic variant for familial patients can be identified in over 50% of cases (Turner et al., 2017). However, most cases have no family history (sporadic ALS), and the majority have no identified genetic aetiology. A recent genome-wide association study (GWAS) estimates the narrow-sense heritability of ALS due to SNPs at 8.5% (Van Rheenen et al., 2016). This represents a minimum heritability value based upon the variation of SNPs included in sequencing arrays. Broad-sense heritability estimates for ALS vary between 43% and 53% (Ryan et al., 2019; Trabjerg et al., 2020). Of all currently known pathogenic variants, the most common is a hexanucleotide repeat expansion within the C9orf72 gene, which accounts for 30%-40% of familial cases and 5%-10% of sporadic cases (Brown and Al-Chalabi, 2015; Braems et al., 2020). Individuals with this mutation display an earlier age of onset and faster disease progression (Iacoangeli et al., 2019). A further 5% of sporadic cases are attributable to mutations in SOD1, FUS, and TARDBP genes (Jones et al., 2021).
Despite these known genetic variants, a large proportion of ALS heritability remains unaccounted for. Most ALS genetic studies focus on rare single nucleotide variants (SNVs) and small insertions and deletions (indels) in the coding regions of the genome, or on common single nucleotide polymorphisms (SNPs). As a consequence, structural and rare variants in non-coding regions of the genome are largely under-investigated and could represent a potential source of the missing heritability (Young, 2019; Cooper-Knock et al., 2020; Theunissen et al., 2020).
Caveolin-1 and Caveolin-2 (CAV1 and CAV2, or CAV1/2) genes code for proteins that are associated with the function of membrane lipid rafts. These are regions of low fluidity within the cellular membrane, which act as anchoring points for intercellular signalling (Igarashi et al., 2020). Converging evidence links CAV1 and CAV2 genes to ALS pathology; CAV1 is associated with neuronal survival and is upregulated during induced ischemia in mice, aiding the uptake of extracellular vesicles and reducing apoptosis (Yue et al., 2019). CAV1 may also play a role in the cognitive decline associated with ALS/FTD (Tang et al., 2021), with overexpression increasing neuroplasticity, pro-growth signalling, learning, and memory in mice (Head et al., 2011; Mandyam et al., 2017). Additional evidence using male SOD1 mice showed that the promotion of neuron-specific CAV1 expression increases body weight and improves longevity and motor function (Sawada et al., 2019). In a subsequent mouse study, subpial administration of synapsin-promoted CAV1 also increased survival, although no changes to body weight or motor function were observed (Ichinomiya et al., 2021). Conversely, increased neurodegeneration and synaptic reduction were observed in CAV1 knock-out mice (Head et al., 2010).
In humans, CAV1 coding regions are enriched for ALS-associated variants and CAV1 and CAV2 enhancer mutations are significantly associated with an increased risk of ALS (Cooper-Knock et al., 2020). An expression analysis revealed that two mutations within CAV1 and CAV2 enhancer regions reduced CAV1/2 expression in patient-derived non-neuronal cells, which was supported by CRISPR-Cas9 editing in neuronal cells (Cooper-Knock et al., 2020). Together, evidence from human and mouse studies indicates that CAV1/2 is neuroprotective, and that CAV1/2 mutations are a risk factor for ALS pathology, likely as a consequence of reduced gene expression.
In this study, we aim to investigate whether these mutations lead to differences in disease-related phenotypes, as well as changes in ALS risk, and explore whether CAV1/2 expression plays a role in the disease beyond enhancer mutations. In the first set of analyses, we used an RNA-sequencing pipeline to perform expression analysis of the CAV1 and CAV2 genes. The results supported our hypothesis that CAV1 and CAV2 genes would be differentially expressed between ALS cases and controls, with patients showing increased expression. In the second set of analyses, we investigated differences in survival duration and age of onset between ALS patients with and without CAV1/2 enhancer mutations. Considering the evidence that CAV1/2 enhancer mutations reduce CAV1/2 expression and that CAV1/2 expression is beneficial to ALS phenotypes, we hypothesised a reduced survival duration and earlier age of onset in ALS patients who have CAV1 or CAV2 enhancer mutations. Results were opposite to our expectation, showing increased survival duration among carriers of CAV1/2 enhancer mutations. No difference in age of onset was observed between groups.
To confirm whether differential expression of CAV1/2 occurred in neurons specifically, we ran an RNA-seq expression analysis in iPSC-derived motor neurons (MNs) from ALS patients and neurologically normal controls. Additionally, we examined the presence of a correlation between the expression of CAV1/2 in the iPSC-derived MNs and survival, age of onset, and disease progression as measured by the ALSFRS.

Sequencing and clinical data
Datasets for RNA-seq differential expression
RNA-seq datasets for the differential expression analyses were obtained from TargetALS at the New York Genome Centre (NYGC; NCBI GEO ID: GSE116622 and GSE124439), the Florida Mayo Clinic (NCBI GEO ID: GSE67196), and the King's College London and MRC London Neurodegenerative Diseases Brain Bank (Smith et al., 2015; Iacoangeli et al., 2021; Jones et al., 2021).
Sample collection and data generation were previously described (Jones et al., 2021). Briefly, frozen human post-mortem samples were used in all cases, and tissue was taken across multiple brain areas. The KCL MRC Brain Bank samples were taken from the primary motor cortex. The Mayo Clinic samples were obtained from the lateral hemisphere of the cerebellum, Brodmann areas 9 and 44 (prefrontal cortex) and Brodmann area 4 (primary motor cortex). The TargetALS (NYGC) samples were obtained from the cerebellum, the lateral and medial motor cortex, and various locations within the frontal cortex.

Project MinE
Whole genome sequencing and clinical data of ALS cases from Project MinE (data freeze 2) were used for the survival and age of onset analyses (Zhang et al., 2022). Samples were filtered to remove common variants (MAF > 0.01) in the enhancer regions of CAV1 and CAV2 genes, which are defined in Cooper-Knock et al. (2020). Individuals with missing data for sex, survival, and age of onset for the corresponding analysis, or those that failed quality controls (Project MinE ALS Sequencing Consortium, 2018) were removed. This retained 5,987 cases for analysis, including 44 individuals with at least one CAV1 or CAV2 enhancer mutation (individual variants and their frequencies can be found in Supplementary Material). Data generation and whole-genome sequencing quality controls, including principal component analysis, were previously described (Project MinE ALS Sequencing Consortium, 2018;Van Rheenen et al., 2021;Zhang et al., 2022).
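The variant filtering step can be illustrated with a minimal sketch. It assumes each enhancer variant is available as a record with a minor allele frequency and a set of carrier sample IDs; the `Variant` class, the variant names, and the sample IDs are hypothetical and are not part of the Project MinE pipeline. Only the 0.01 threshold and the 44 of 5,987 carrier count come from the text.

```python
from dataclasses import dataclass

MAF_THRESHOLD = 0.01  # variants with MAF above this are treated as common and removed


@dataclass
class Variant:
    variant_id: str       # hypothetical identifier
    maf: float            # minor allele frequency in a reference panel
    carriers: frozenset   # IDs of samples carrying the variant


def rare_variant_carriers(variants, threshold=MAF_THRESHOLD):
    """Return the samples carrying at least one variant with MAF <= threshold."""
    carriers = set()
    for v in variants:
        if v.maf <= threshold:
            carriers |= v.carriers
    return carriers


# Toy data: variant B is common and is therefore ignored.
variants = [
    Variant("enh_var_A", 0.0004, frozenset({"s1", "s2"})),
    Variant("enh_var_B", 0.2500, frozenset({"s2", "s3", "s4"})),
    Variant("enh_var_C", 0.0090, frozenset({"s5"})),
]
carriers = rare_variant_carriers(variants)

# Carrier proportion reported in the text: 44 of 5,987 retained cases.
carrier_fraction = 44 / 5987
```

With these counts, carriers make up roughly 0.7% of the retained cases.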

Answer ALS
Total RNA-seq gene expression profiling of iPSC-derived MNs and phenotype data were obtained for 55 ALS patients and 15 controls from AnswerALS (Baxi et al., 2022). Gene expression was normalized for gene length and then sequencing depth to produce transcripts per kilobase million (TPM). Age of onset and disease status were available for all individuals and these parameters were used to check for the correlation between the expression of top-ranked RefMap ALS genes and age at disease onset.
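The TPM normalisation described above (gene length first, then sequencing depth) can be sketched as follows. This is an illustrative implementation of the standard TPM definition, not the AnswerALS processing code; the gene names, counts, and lengths are invented for the example.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per kilobase million (TPM).

    counts     -- dict mapping gene -> raw read count
    lengths_bp -- dict mapping gene -> gene length in base pairs
    """
    # Step 1: normalise for gene length (reads per kilobase).
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    # Step 2: normalise for sequencing depth so that values sum to one million.
    scale = sum(rpk.values()) / 1e6
    return {gene: value / scale for gene, value in rpk.items()}


# Toy example with three hypothetical genes.
example_counts = {"geneA": 100, "geneB": 300, "geneC": 600}
example_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}
example_tpm = tpm(example_counts, example_lengths)
```

Because length normalisation is applied before depth normalisation, TPM values sum to one million within every sample, which makes them comparable across samples.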

RNA-seq differential expression analysis
FIGURE 1
Diagrammatic representation of RNA-seq differential expression pipeline. Each RNA-seq step is shown in the blue circles, with the tool used at each step given beside each red circle.

An RNA-seq based differential expression analysis was performed for CAV1 and CAV2 genes on samples across three datasets. A detailed protocol of library preparation is described by Tam et al. (2019) for TargetALS samples, Prudencio et al. (2015) for Mayo Clinic samples, and Prudencio et al. (2015) and Jones et al. (2021) for the KCL MRC Brain Bank samples. Figure 1 illustrates the stages performed in the RNA-seq analysis.
MultiQC was used for all datasets to assess read quality pre- and post-alignment. Ribosomal RNA transcripts were removed by filtering with SortMeRNA, using rRNA databases. BBDuk was used to filter adapters and low-quality reads. RNA-seq reads were aligned with STAR v2.7 using the GRCh37.89 reference genome.
Read counts were imported into R using Tximport and DESeq2. Only transcripts with at least 10 reads were retained for analysis. Available data for disease status, gender, quintiles of age, quintiles of PMI, RIN, and flow-cell were imported into R. SVA and SVAseq were used to generate surrogate variables for each sample, which estimate expression heterogeneity. These were included as covariates in subsequent analyses to control for unaccounted confounding factors such as cell heterogeneity and extraneous variation.
Raw read counts were supplied to DESeq2, which was used to perform a differential expression analysis across ALS cases and controls. Differential expression was estimated using log2 fold-change, a Wald test, and FDR p-value correction. Analyses were run using covariates of age, gender, post-mortem delay, RIN, and surrogate variables, where data were available.
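To make the testing step concrete, the sketch below shows a two-sided Wald p-value computed from a log2 fold-change and its standard error, followed by a Benjamini-Hochberg adjustment. Using Benjamini-Hochberg as the FDR procedure is an assumption here (it is the DESeq2 default); the function names and input values are illustrative only, and the code does not reproduce DESeq2's dispersion estimation or shrinkage.

```python
from statistics import NormalDist


def wald_pvalue(log2fc, standard_error):
    """Two-sided Wald test p-value for a log2 fold-change estimate."""
    z = log2fc / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four tests.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```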
The final differential expression results were meta-analysed for each brain tissue type using the Stouffer method (Stouffer et al., 1949). This uses the p-value, sample size, and log2 fold-change from each dataset to produce meta-analysed test statistics, and considers the direction of effect.
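A minimal sketch of a direction-aware, sample-size-weighted Stouffer combination is given below. It assumes two-sided input p-values and weights equal to the square root of each dataset's sample size, which is a common choice but is not stated in the text; the function name and the toy inputs are hypothetical.

```python
from math import copysign, sqrt
from statistics import NormalDist

_NORMAL = NormalDist()


def stouffer(pvalues, log2fcs, sample_sizes):
    """Combine two-sided p-values into a meta-analysed Z score and p-value.

    The sign of each log2 fold-change gives the direction of effect, and
    sqrt(sample size) is used as the weight (an assumption of this sketch).
    Input p-values must lie strictly between 0 and 1, or equal 1.
    """
    z_scores, weights = [], []
    for p, lfc, n in zip(pvalues, log2fcs, sample_sizes):
        magnitude = _NORMAL.inv_cdf(1 - p / 2)      # |Z| from a two-sided p-value
        z_scores.append(copysign(magnitude, lfc))   # attach direction of effect
        weights.append(sqrt(n))
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    z_meta = numerator / sqrt(sum(w * w for w in weights))
    p_meta = 2 * (1 - _NORMAL.cdf(abs(z_meta)))
    return z_meta, p_meta


# Two hypothetical datasets agreeing in direction, then disagreeing.
z_same, p_same = stouffer([0.05, 0.05], [0.4, 0.7], [50, 50])
z_opposite, p_opposite = stouffer([0.05, 0.05], [0.4, -0.7], [50, 50])
```

Because each Z score carries the sign of its fold-change, datasets that disagree in direction cancel rather than reinforce one another, which is why consistency of direction matters before pooling.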

Project MinE survival and age of onset analyses
Multiple Cox proportional hazard survival analyses were run and visualised in R using the survival and survminer packages. These analyses assessed whether the presence of CAV1/2 enhancer mutations impacts patient survival. Analyses were run with sex at birth and age of onset as covariates, comparing individuals with no CAV1/2 mutations against: individuals with CAV1 mutations only, individuals with CAV2 mutations only, and individuals with mutations in either gene.
C9-related ALS is characterised by different clinical presentations (Al-Chalabi et al., 2016), earlier age of onset, and faster disease progression compared to non-C9 ALS, suggesting a separate disease mechanism (Iacoangeli et al., 2019). Analyses were therefore run with and without individuals carrying a pathogenic repeat expansion of the C9orf72 gene (Iacoangeli et al., 2019) to assess whether increasing sample homogeneity would reveal a stronger effect of CAV1/2 mutations on survival. Analyses were additionally run excluding samples from patients with other well-known ALS mutations (SOD1, FUS, TARDBP), and matching samples based on nationality. Finally, survival analyses were run when stratifying samples by type of CAV enhancer mutation (CAV1 or CAV2).
A second set of analyses was run to determine whether CAV1/2 status affected age of onset, using sex at birth as a covariate. These were linear regression and Cox proportional hazard models, run in R using the survival package. Analyses were run with and without carriers of a pathogenic C9orf72 repeat expansion. They compared samples with no CAV1/2 mutation to: (1) samples with CAV1 enhancer mutations; (2) samples with CAV2 enhancer mutations; and (3) samples with either mutation.

Samples and datasets
Differential expression analysis datasets
Samples were matched across disease status by age and sex within each dataset, where the data permitted. Cases comprised samples from sporadic and familial ALS patients, including C9orf72- and SOD1-associated ALS. Control samples were obtained from individuals with non-neurological or non-ALS disease. An outline of each dataset is provided in Figure 2.
FIGURE 2
Sample overview across the RNAseq datasets used in the differential expression analyses. Datasets were obtained from the KCL Brain Bank (green), Mayo Clinic (orange), and TargetALS (NYGC; blue).

Project MinE dataset for CAV1/2 enhancer mutation analyses
CAV1/2 enhancer variants of MAF > 0.01 in gnomAD were removed prior to analysis. A total of 5,987 samples passed the quality controls and were used for analysis. Of these, 356 were carriers of the C9orf72 repeat expansion. In total, 44 patients had at least one CAV1/2 enhancer mutation, of which 34 were carriers of CAV1 mutations and 10 were carriers of CAV2 mutations. Figure 3 shows sample sizes for the four primary Project MinE survival analyses.
FIGURE 3
Sample sizes for each Project MinE survival analysis. Samples are divided by those with CAV1/2 enhancer mutations (orange) and without (blue).

Bulk RNAseq reveals higher expression of CAV1 and CAV2 in ALS patient tissue compared to controls
Considering converging evidence that CAV1/2 genes are neuroprotective and the previous association between ALS disease status and CAV1/2 enhancer regions, we hypothesised that CAV1 and CAV2 genes would be differentially expressed between ALS patients and controls within brain tissue. Results from the differential expression analysis for CAV1 and CAV2 are outlined in Table 1 and shown in violin plots in Figure 4. CAV1 showed statistically significant differential gene expression within the KCL primary motor cortex (Log2FC = 0.396, p = 0.04) and the NYGC cerebellum (Log2FC = 0.751, p = 0.02). CAV2 was differentially expressed in the primary motor cortex within the KCL Brain Bank sample (Log2FC = 0.183, p = 0.01), in addition to the cerebellum (Log2FC = 0.669, p = 0.004) and lateral motor cortex (Log2FC = 0.691, p = 0.029) within TargetALS (NYGC) samples. Dataset-tissues almost universally showed a positive log2 fold-change (with the exception of the NYGC frontal cortex), suggesting that CAV1/2 is consistently upregulated among ALS cases. This direction of effect is contrary to previous evidence if we conclude that a higher expression level in cases corresponds to gene expression increasing ALS risk. However, this aligns with a compensatory model, in which expression of CAV1/2 genes is increased to mitigate ALS-related pathology.
Log2 fold-change for CAV1 and CAV2 was in a consistent direction across all datasets and tissues except for the CAV2 NYGC frontal cortex. For this reason, a Stouffer meta-analysis was run for the motor cortex, frontal cortex, and cerebellum, the results of which are shown in Table 2. Two TargetALS NYGC tissue regions were available within the motor cortex, the lateral and medial motor cortex, of which only the lateral motor cortex reached statistical significance (Log2FC = 0.691, p = 0.029). These datasets were separately meta-analysed with the KCL Brain Bank dataset. These analyses were statistically significant for both CAV1 and CAV2 genes, and all showed a large log2 fold-change over 2.

CAV1/2 expression is higher in iPSC-derived motor neurons from ALS patients
FIGURE 4
Violin plots of significant gene expression for CAV1 and CAV2 between cases and controls. The X-axis indicates tissue/dataset combination and case/control status. The Y-axis is normalised gene expression. Coloured dots inside violin plots are jittered gene expressions for each sample. Boxplots inside each violin plot show gene expression for each category. Violin plot colour: Condition (case: red; control: blue). Note that CAV1 differential expression in the lateral motor cortex is significant only to p < 0.1. Violin plots for all analyses are available in Supplementary Figure 1.

Bulk RNA-seq in post-mortem brain tissue has shown that expression of both CAV1 and CAV2 genes is higher in ALS patients compared to controls. Enhanced CAV1 expression has previously been associated with neuroprotection (Sawada et al., 2019) and reduced CAV1 expression has been associated with risk for ALS (Cooper-Knock et al., 2020). Therefore, the observed higher expression of CAV1 and CAV2 might represent a compensatory reaction to neurotoxicity. However, the bulk RNA-seq analysis does not allow us to determine which cell types are responsible for observed changes in CAV1/2 expression. To address this, we analysed gene expression in iPSC-derived MNs from ALS patients (n = 55) and neurologically normal controls (n = 15). Mean expression of both genes was higher in ALS patients compared to controls, although this difference was not statistically significant (CAV1: mean ALS = 1.46 TPM, mean control = 1.3 TPM, t = 0.48, Log2FC = 0.1575, p = 0.31; CAV2: mean ALS = 1.67 TPM, mean control = 1.39 TPM, t = 1.43, Log2FC = 0.2647, p = 0.08).
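As a worked illustration of how a log2 fold-change relates to group means, the sketch below recomputes the CAV2 value from the mean TPM figures quoted above. It assumes the fold-change is the log2 ratio of mean case expression to mean control expression; the function name is illustrative. Because the quoted means are rounded to two decimal places, a recomputed value can only be expected to match the reported one approximately.

```python
from math import log2


def log2_fold_change(mean_case, mean_control):
    """Log2 ratio of mean expression in cases to mean expression in controls."""
    return log2(mean_case / mean_control)


# CAV2 mean TPM values quoted in the text (ALS = 1.67, control = 1.39).
cav2_log2fc = log2_fold_change(1.67, 1.39)  # approximately 0.265
```

A log2 fold-change of about 0.26 corresponds to roughly 20% higher mean expression in cases, a modest difference that fits the non-significant result in this smaller cohort.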

Correlation analyses between CAV1/2 expression and phenotypic measures in AnswerALS
Using RNAseq from iPSC-derived MNs, we examined the association between CAV1/2 expression and phenotypic measures. An outline of these results is shown in Table 3. Age of onset was quantified in days; there was no significant correlation between CAV1/2 expression and age of onset (Pearson correlation p > 0.05). Survival was measured in days from the date of onset to death, and censored samples were not included because of the lack of longitudinal data; the date of death was available for 27 ALS patients. A Cox proportional hazards model was used to determine whether survival was significantly associated with CAV1/2 expression. The first 10 principal components were used as covariates to control for population structure. Neither CAV1 (p = 0.96) nor CAV2 (p = 0.70) was significantly associated with survival in this cohort. Next, we tested whether CAV1/2 expression was correlated with the rate of change in ALSFRS, which is a measure of the rate of disease progression. The ALSFRS was measured longitudinally between 2 and 10 times (with a median of four measurements). The delta-ALSFRS was calculated using linear regression based upon patient visit time and was available for 43 ALS patients. CAV2 expression, but not CAV1 expression, was negatively correlated with the rate of change of ALSFRS score (Figure 5); iPSC-derived MNs with higher CAV2 expression were derived from patients with a faster rate of decline in the ALSFRS (Pearson correlation p = 0.04, t = −1.78, r = −0.27). In view of our previous data, this could suggest that a compensatory increase in CAV2 expression is highest in patients with more rapid disease progression. It is interesting that CAV1 has previously been associated with neuroprotection but was not significant in this test, which may indicate opposing forces of compensatory upregulation with more aggressive disease and a therapeutic effect slowing disease progression. Table 4 outlines the results from four of these survival analyses.
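The two calculations described above can be sketched in a few lines: a per-patient delta-ALSFRS obtained as the least-squares slope of score against visit time, and the t statistic implied by a Pearson correlation for a given sample size. The visit times and scores are invented for illustration, and the function names are not taken from the AnswerALS analysis code.

```python
from math import sqrt


def ols_slope(times, scores):
    """Least-squares slope of score against time (needs >= 2 distinct times)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, scores))
    return sxy / sxx


def pearson_t(r, n):
    """t statistic for a Pearson correlation r from n paired observations."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Hypothetical patient: four visits, losing 3 ALSFRS points every 90 days.
delta_alsfrs = ols_slope([0, 90, 180, 270], [40, 37, 34, 31])  # points per day

# Reported correlation for CAV2: r = -0.27 with 43 patients.
t_stat = pearson_t(-0.27, 43)
```

With r = −0.27 and n = 43 this gives a t statistic of about −1.8, close to the reported t = −1.78; the small gap is consistent with r having been rounded to two decimal places.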

Survival analyses in Project MinE
In the first set of analyses (1-2), we tested the difference in survival of the patients carrying a mutation in the enhancer of either gene (CAV1/2) against non-carriers (Figure 6). The decision was made to combine CAV1 and CAV2 enhancer mutations due to their related biological function, co-expression, overlapping enhancers, and to maximise the statistical power. CAV1/2 mutations were significantly associated with longer survival (HR = 0.694, p = 0.043; HR = 0.674, p = 0.034). This was the case irrespective of whether C9orf72 samples were included or removed.
The following analyses were then stratified by the presence of CAV1 or CAV2 enhancer mutations. These analyses excluded C9orf72 samples. Although not significant, the effects on survival of CAV1 and CAV2 enhancer mutations were similar and consistent with analyses 1-2. This supports our initial choice to aggregate them to increase statistical power based on the hypothesis that mutations in the enhancers of both genes have a similar role in ALS. Descriptive statistics for these analyses are available in Table 5.

TABLE 4
The left side of the table displays the inclusion criteria of each analysis, and the right side displays the results. The first two columns specify whether samples with a C9orf72 mutation have been included (green tick) or excluded (red cross). CAV+ denotes the number of samples with CAV1/2 enhancer mutations, and CAV- indicates the sample size of those without CAV1/2 enhancer mutations. *p < 0.05; **p < 0.01; ***p < 0.001.

FIGURE 6
Survival curves comparing survival of patients with vs without any CAV1/2 mutation. The left graph is based upon data from Analysis 1, inclusive of patients with C9orf72 repeat expansion. The right graph is from Analysis 2, with C9orf72 samples removed. Patients with CAV1/2 mutations have a longer survival time (C9orf72-inclusive analysis: median survival difference of 345 days. See Table 5 for a full descriptive summary). Y-axis is the fraction of surviving sample. X-axis is time in days. Dashed lines indicate 95% confidence intervals. Orthogonal lines indicate death or censoring event. Graphs exclude 22 samples from patients surviving over 10,000 days to improve scaling. Complete graphs are available in Supplementary Figure 2.
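To illustrate how a median survival difference between two groups can be read from survival curves, the following sketch implements a basic Kaplan-Meier estimator with right-censoring. It is a simplified stand-in for the R survival and survminer workflow used in the study, without covariate adjustment or confidence intervals; the survival times and event indicators are invented and do not come from Project MinE.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(time, survival probability)] at each death time.

    times  -- follow-up time in days for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival falls to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical non-carriers (all deaths observed) and carriers (two censored).
non_carrier_median = median_survival(
    kaplan_meier([300, 500, 700, 900, 1100], [1, 1, 1, 1, 1]))
carrier_median = median_survival(
    kaplan_meier([400, 600, 800, 1000, 1200, 1400], [1, 0, 1, 1, 1, 0]))
median_difference = carrier_median - non_carrier_median
```

Censored patients leave the risk set without lowering the curve, which is why medians estimated this way can differ from a simple median of observed death times.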

Age of onset in Project MinE
Similarly to the survival analyses, each age of onset analysis was performed using differing inclusion criteria. For the Cox proportional hazards models, the event status indicator was set to 1 (the event has occurred) for each sample. In parallel, a linear regression was performed using the same inclusion criteria as analysis 1. No analysis found any effect of CAV1/2 mutation on the age of onset.

TABLE 5
Analyses correspond to the rows in Table 4. Analyses 1 and 2 include and exclude patients with the C9orf72 repeat expansion, respectively. Analyses 3 and 4 are stratified by CAV1 and CAV2 enhancer mutation, and do not include samples with C9orf72 mutations.

Discussion
We report increased expression of CAV1 and CAV2 in ALS cases when compared to controls using bulk RNA sequencing from post-mortem brain tissue samples. Statistically significant differential expression was found in the KCL Brain Bank and TargetALS (NYGC) samples, but not in Mayo Clinic samples, although the direction of effect was consistent. Non-significant results may be due to a lack of power, as the sample size was substantially smaller in the Mayo Clinic samples than in the other datasets. Additionally, meta-analyses revealed significant differences within the cerebellum and motor cortex for both CAV1 and CAV2 expression, but not the frontal cortex. One possible interpretation is that overexpression of CAV1/2 genes increases ALS risk. However, this is inconsistent with evidence that CAV1/2 expression is protective in ALS (Head et al., 2011; Cooper-Knock et al., 2020) and more generally promotes neuronal growth and improves motor function (Egawa et al., 2018). An alternative interpretation consistent with previous literature is that the gene upregulation is indicative of a compensatory mechanism; CAV1/2 expression is increased as a response to ALS pathology, which affords greater protection.
Survival analyses showed that among ALS patients, carriers of CAV1/2 enhancer mutations had longer survival compared to non-carriers, with a median survival difference of 345 days in the Project MinE dataset. No correlation was demonstrated between gene expression and survival in the AnswerALS iPSC-derived MNs, although this analysis was limited by the small sample size. We observed a negative correlation between CAV2 expression and the rate of change in the ALSFRS in the iPSC-derived MNs. Given the seemingly protective role of CAV1/2, it was expected that mutations in CAV1/2 enhancers, which purportedly decrease CAV1/2 expression, would in turn reduce survival. We consider two possible explanations for observing the opposite outcome. First, CAV1/2 enhancer mutations exist in non-coding regions and have an unknown impact on gene expression. Cooper-Knock and colleagues (Cooper-Knock et al., 2020) ran an expression analysis using a single CAV1/2 enhancer mutation (chr7:116222625:T > C), finding an association with reduced CAV1/2 expression in patient-derived neuronal cells. However, this is not sufficient evidence to conclude the global effect of CAV1/2 mutations on expression, as enhancer mutations may also increase gene expression (Corradin and Scacheri, 2014; Sur and Taipale, 2016). The effects of other variants on gene expression may account for the increased survival duration that we observed. Further investigation into the effect of CAV1/2 enhancer mutations on gene expression would be beneficial to build evidence for or against this interpretation.
An alternative hypothesis is that patients with CAV1/2 mutations represent a subset of ALS patients with a less aggressive phenotype. In this framework, CAV1/2 enhancer mutations reduce CAV1/2 expression, leading to dysfunctional neuronal signalling and accelerated neurodegeneration. However, the dysfunction associated with CAV1/2 is on average less severe than non-CAV-related ALS phenotypes, leading to the longer survival time found in our analyses. Rare variants occurring within enhancer regions are more likely to be deleterious, reducing enhancer function and therefore expression, than to improve enhancer function and increase CAV1/2 expression. This prior expectation makes this interpretation more biologically plausible.
Whether CAV1/2 enhancer mutations increase or decrease CAV1/2 gene expression, both scenarios align with the "compensatory model" of CAV1/2 overexpression in ALS patients. If CAV1/2 are neuroprotective and are upregulated to compensate for ALS pathology, CAV1/2 enhancer mutations which increase expression simply boost this effect, leading to increased survival. If these mutations decrease expression and subsequently increase neurodegeneration, the "increased survival" we observe among patients with CAV1/2 enhancer mutations may be explained by CAV-mediated ALS being on average less severe than non-CAV ALS.
Individuals with CAV1/2 mutations represent a small but relevant proportion of ALS patients (0.7%). Our results add to an increasing body of evidence linking CAV1 and CAV2 genes to ALS, help to elucidate the role of their enhancer mutations and gene expression in ALS, and support the positioning of CAV1/2 genes as potential targets for the development of treatment. However, further research into the functional effect of CAV1/2 mutations is needed to clarify their role in the pathogenesis of ALS.

Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here:

Author contributions
BA, JC-K, AJ, and AI contributed to concept, design of the study, running the analyses, and drafted the manuscript. All authors contributed to the article and approved the submitted version.

Conflict of interest
JV reports sponsored research agreements with Biogen and AstraZeneca.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

SUPPLEMENTARY FIGURE 1
Violin plots of gene expression for CAV1 and CAV2 between cases and controls. The X-axis indicates tissue/dataset combination and case/control status. The Y-axis is normalised gene expression. Coloured dots inside violin plots are jittered gene expressions for each sample. Boxplots inside each violin plot show gene expression for each category. Violin plot colour: Condition (case: red; control: blue).

SUPPLEMENTARY FIGURE 2
Uncropped survival curves comparing survival of patients with vs without any CAV1/2 mutation. The top graph is based upon data from Analysis 1, inclusive of patients with C9orf72 repeat expansion. The bottom graph is from Analysis 2, with C9orf72 samples removed. Patients with CAV1/2 mutations have a longer survival time (C9orf72-inclusive analysis: median survival difference of 345 days. See Table 5 for a full descriptive summary). Y-axis is the fraction of surviving sample. X-axis is time in days. Dashed lines indicate 95% confidence intervals. Orthogonal lines indicate death or censoring event.

SUPPLEMENTARY TABLE 1
List of CAV1/2 enhancer variants and their frequency within the Project MinE dataset.