Cathepsins and Parkinson’s disease: insights from Mendelian randomization analyses

Background: Parkinson’s disease (PD), the second most prevalent neurodegenerative condition, has a multifaceted etiology. Cathepsin-cysteine proteases situated within lysosomes participate in a range of physiological and pathological processes, including the degradation of harmful proteins. Prior research has pointed towards a potential link between cathepsins and PD; however, the precise causal relationship between the cathepsin family and PD remains unclear.

Methods: This study employed univariate and multivariate Mendelian randomization (MR) analyses to explore the causal relationship between nine cathepsins and PD risk. For the primary analysis, genome-wide association study (GWAS) summary statistics for the plasma levels of the nine cathepsins and for PD were obtained from the INTERVAL study and the International Parkinson’s Disease Genomics Consortium, respectively. GWAS data for the PD replication analysis were obtained from the FinnGen consortium, and a meta-analysis was performed for the primary and replication analyses to evaluate the association between genetically predicted cathepsin plasma levels and PD risk. After identifying significant MR estimates, genetic colocalization analyses were conducted to determine whether shared or distinct causal variants influenced both cathepsins and PD.

Results: Elevated cathepsin B levels were associated with a decreased risk of PD in univariate MR analysis (odds ratio [OR] = 0.890, 95% confidence interval [CI]: 0.831–0.954, pFDR = 0.009). However, there was no indication that PD affected cathepsin B levels (OR = 0.965, 95% CI: 0.858–1.087, p = 0.852). In addition, after adjusting for the remaining cathepsins, cathepsin B levels independently and significantly contributed to the reduced risk of PD in multivariate MR analysis (OR = 0.887, 95% CI: 0.823–0.957, p = 0.002). The results of the replication MR analysis with the FinnGen GWAS for PD (OR = 0.921, 95% CI: 0.860–0.987, p = 0.020) and meta-analysis (OR = 0.905, 95% CI: 0.862–0.951, p < 0.001) were consistent with those of the primary analysis. Colocalization analysis did not provide any evidence of a shared causal variant between cathepsins and PD (PP.H4.abf = 0.005).

Conclusion: This genetic investigation supports the hypothesis that cathepsin B exerts a protective effect against PD. The quantification of cathepsin B levels could potentially serve as a predictive biomarker for susceptibility to PD, providing new insights into the pathomechanisms of the disease and possible interventions.


Introduction
Parkinson's disease (PD) is a neurodegenerative condition characterized by degeneration of dopaminergic neurons in the substantia nigra pars compacta. Disease progression is closely linked to the accumulation of alpha-synuclein and abnormal protein degradation, and proteases, particularly cathepsins, play a key role in the attenuation of pathological protein aggregates (Reiser et al., 2010; Rai et al., 2022; Stoka et al., 2023). Several studies have established an association between PD and cathepsin activity, indicating their possible role in the etiology of the disease (Yelamanchili et al., 2011; Pišlar et al., 2018; McGlinchey et al., 2019; Pal et al., 2019; Yuan et al., 2021; Milanowski et al., 2022; Stoka et al., 2023).
Proteases, including those of the cathepsin family, are lysosomal enzymes that are essential for maintaining cellular homeostasis. Cathepsins are cysteine proteases that belong to the papain superfamily. They are involved in various cellular processes such as autophagy, cell signaling, and protein and lipid turnover (Fonović et al., 2014). Owing to their diverse functions, they contribute to many diseases, including neurological conditions such as Parkinson's disease (Stoka et al., 2023).
Experimental studies have repeatedly identified cathepsins as important contributors to PD pathogenesis. Although the study by Mantle et al. (1995) did not identify significant differences in cathepsin activity between PD patients and controls, other studies have identified an increase in the expression of cathepsin B, D, and X in animal models of PD (Pišlar et al., 2018; Gan et al., 2019). This finding suggests a possible association with disease initiation and development. Furthermore, degradation of the alpha-synuclein C-terminal, which is caused by cathepsin activity, has been observed in Lewy bodies. This is believed to be related to the formation of amyloid plaques and development of Parkinson's disease (McGlinchey et al., 2019). In addition, interactions between cathepsins and other biomarkers, as well as genetic variabilities such as apolipoprotein E and Brain-Derived Neurotrophic Factor (BDNF), may contribute to the risk of Parkinson's disease. This highlights the genetic complexity of disease development (Schulte et al., 2003; Pal et al., 2019; Milanowski et al., 2022).
Advances in genomic science have strengthened our understanding of the role of heredity in disease development. Genetic variants from genome-wide association studies (GWAS) can be used as instrumental variables in Mendelian randomization (MR) studies to establish causal relationships between exposure and outcome. We conducted MR analyses in the context of PD in order to determine the causal effect of various cathepsins on the risk of developing PD (Emdin et al., 2017). In this study, univariate and multivariate MR techniques were employed to identify genetic level associations, and colocalization analyses were performed to examine shared genetic loci.

Data sources
This study used publicly accessible datasets. GWAS summary statistics for cathepsin levels were obtained from the INTERVAL study, which included 3,301 European participants (Sun et al., 2018). This study was approved by The National Research Ethics Service, and informed consent was obtained from all participants. PD GWAS data were obtained from the International Parkinson's Disease Genomics Consortium, which consists of 33,674 PD cases and 449,056 controls (Nalls et al., 2019). To ensure the stability of the significant results, we extracted GWAS data on PD (4,681 cases and 407,500 controls) from the FinnGen Consortium Freeze 10 database for replication analysis (Kurki et al., 2023). The data sources and study flowchart are presented in Table 1 and Figure 1.

Instrument selection
The cathepsin-related instrumental variables (IVs) for this study were selected using stringent criteria. The IVs were required to exhibit low linkage disequilibrium (LD), with an r² value below 0.001 within a 10,000 kb window, and to have p-values below 5 × 10⁻⁶. For the reverse Mendelian randomization analysis with PD as the exposure, the same criteria were applied, with the p-value threshold set at 5 × 10⁻⁸. The single nucleotide polymorphisms (SNPs) in the exposure data can be found in Supplementary Tables S1, S2. The selection process involved identifying SNPs that met the significance threshold (p < 5 × 10⁻⁶) as potential instrumental variables, excluding SNPs associated with the outcome (p < 0.05), accounting for linkage disequilibrium through a clumping procedure, assessing and correcting for pleiotropy using the MR-PRESSO test, verifying instrument strength with the F-statistic, and filtering IVs based on exposure-outcome associations. These steps support the robustness and validity of the instrumental variables used in this study for causal inference in Mendelian randomization analysis.
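As a rough illustration of the p-value and F-statistic screening steps described above, the following Python sketch filters candidate SNPs. The field names, the example SNP identifiers and effect sizes, the approximation F ≈ (beta/SE)², and the conventional cutoff of F > 10 are assumptions made for illustration; the study does not state them, and LD clumping and MR-PRESSO are not shown.

```python
import math

def two_sided_p(beta, se):
    """Two-sided p-value from a normal approximation to beta / se."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2))

def f_statistic(beta, se):
    """Per-SNP instrument strength, approximated as (beta / se)^2."""
    return (beta / se) ** 2

def select_instruments(snps, p_threshold=5e-6, min_f=10.0):
    """Keep SNPs that pass the p-value threshold and the assumed F cutoff."""
    kept = []
    for snp in snps:
        p = two_sided_p(snp["beta"], snp["se"])
        f = f_statistic(snp["beta"], snp["se"])
        if p < p_threshold and f > min_f:
            kept.append({**snp, "pval": p, "f_stat": f})
    return kept

# Hypothetical exposure summary statistics (not taken from the study).
candidates = [
    {"rsid": "rs_demo1", "beta": 0.30, "se": 0.05},
    {"rsid": "rs_demo2", "beta": 0.12, "se": 0.05},
    {"rsid": "rs_demo3", "beta": 0.24, "se": 0.05},
]
selected = select_instruments(candidates)
selected_ids = [snp["rsid"] for snp in selected]
print(selected_ids)
```

Under this approximation, any SNP passing p < 5 × 10⁻⁶ already has F well above 10, so the F check acts mainly as a safeguard.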
Considering that PD is susceptible to lifestyle factors, smoking, alcohol consumption, use of psychotropic drugs, and type 2 diabetes, we queried the SNPs from the above positive results in the NHGRI-EBI GWAS Catalog database with a threshold of p = 5 × 10⁻⁵ and found 2 SNPs (rs1260326 and rs34593439) among the cathepsin IVs that were associated with these confounding factors (detailed in Supplementary Table S6).

MR analysis
The inverse-variance weighted (IVW) method has been predominantly utilized in MR investigations to estimate effect size (Emdin et al., 2017). The Wald ratio in IVW was used to weigh the effect of each variant on exposure in relation to the risk of disease. A random-effects inverse variance meta-analysis was employed to merge the individual MR estimates. MR findings were validated using the MR-PRESSO test (Verbanck et al., 2018).
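To make the Wald ratio and IVW steps concrete, here is a minimal Python sketch. It uses the simpler fixed-effect weighting with first-order standard errors, whereas the text describes a random-effects combination, so it should be read as an approximation of the idea rather than the authors' pipeline. The function names and the three SNP effect sizes are hypothetical.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate and its first-order standard error."""
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se

def ivw_estimate(instruments):
    """Inverse-variance weighted mean of the per-SNP Wald ratios."""
    weighted_sum = 0.0
    weight_total = 0.0
    for beta_x, beta_y, se_y in instruments:
        estimate, se = wald_ratio(beta_x, beta_y, se_y)
        weight = 1.0 / se ** 2
        weighted_sum += weight * estimate
        weight_total += weight
    return weighted_sum / weight_total, math.sqrt(1.0 / weight_total)

# Hypothetical (beta_exposure, beta_outcome, se_outcome) triples.
demo_instruments = [
    (0.20, -0.020, 0.010),
    (0.30, -0.030, 0.010),
    (0.40, -0.040, 0.010),
]
ivw_beta, ivw_se = ivw_estimate(demo_instruments)
ivw_or = math.exp(ivw_beta)  # log-odds scale converted to an odds ratio
print(round(ivw_beta, 4), round(ivw_se, 4), round(ivw_or, 4))
```

Because every demo SNP has the same ratio of outcome effect to exposure effect, the pooled estimate equals that common ratio; with real data the weights determine how much each variant contributes.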
To further evaluate the independent effects of cathepsins, multivariate Mendelian randomization was used to investigate whether the impact of each individual cathepsin was dependent on other cathepsins. This study employed multivariable MR to assess the direct causal impact of several cathepsins on the risk of PD in a single analysis using the MendelianRandomization package (Yavorska et al., 2017).

Replication analysis and meta-analysis
To validate the robustness of the results, the FinnGen GWAS database was used as a second independent consortium for data on PD (Kurki et al., 2023). We conducted a replication MR analysis for significant results, and a meta-analysis to explore the combined effects.

Colocalization analysis
To identify whether cathepsins genetically linked with PD share a causal variant, we conducted colocalization analysis. Bayesian testing was used to conduct colocalization analysis, utilizing the minor allele frequency (MAF) for approximations (Giambartolomei et al., 2014). We used the coloc.abf function to examine the genetic regions around the Cathepsin B gene, specifically focusing on a 50 kb window centered on the gene's location on chromosome 8. For each pair of traits, we examined five hypotheses: H0 (no SNP associated with either trait), H1 (association with trait 1 only), H2 (association with trait 2 only), H3 (two separate SNPs causing the traits independently), and H4 (one SNP causing both traits). Colocalization was considered to have occurred when the posterior probability (SNP.PP.H4) was greater than 0.8. The R package coloc was used in this study.
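The five-hypothesis calculation can be sketched in Python as follows. This is a simplified re-expression of the approximate Bayes factor approach of Giambartolomei et al. (2014), not the coloc package itself. The priors (p1 = p2 = 1e-4, p12 = 1e-5), the single prior standard deviation of 0.15 applied to both traits, and the three-SNP demo regions are assumptions for illustration.

```python
import math

def log_sum_exp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_diff_exp(a, b):
    """log(exp(a) - exp(b)), valid when a > b."""
    return a + math.log(-math.expm1(b - a))

def log_abf(beta, se, prior_sd=0.15):
    """Wakefield approximate Bayes factor for one SNP, on the log scale."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for one region."""
    sum1 = log_sum_exp(labf1)
    sum2 = log_sum_exp(labf2)
    sum12 = log_sum_exp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                              # H0: no association
        math.log(p1) + sum1,                              # H1: trait 1 only
        math.log(p2) + sum2,                              # H2: trait 2 only
        math.log(p1) + math.log(p2)
        + log_diff_exp(sum1 + sum2, sum12),               # H3: distinct SNPs
        math.log(p12) + sum12,                            # H4: shared SNP
    ]
    total = log_sum_exp(log_h)
    return [math.exp(h - total) for h in log_h]

SE = 0.02  # hypothetical standard error shared by all demo SNPs

def region_posteriors(betas1, betas2):
    return coloc_posteriors([log_abf(b, SE) for b in betas1],
                            [log_abf(b, SE) for b in betas2])

# Same lead SNP for both traits, then different lead SNPs.
pp_shared = region_posteriors([0.010, 0.120, 0.006], [0.004, 0.110, -0.008])
pp_distinct = region_posteriors([0.120, 0.006, 0.004], [0.002, 0.008, 0.110])
print([round(p, 4) for p in pp_shared])
print([round(p, 4) for p in pp_distinct])
```

In the first demo region the H4 posterior dominates, which would meet the 0.8 threshold; in the second, most of the mass falls on H3 (distinct variants), the pattern that argues against colocalization.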

Forward univariable MR analysis
In the forward univariate MR analysis, we investigated the impact of nine cathepsins (B, E, F, G, H, L2, O, S, and Z) on the risk of PD. The analysis employed multiple MR methods, including Inverse Variance Weighted (IVW), MR Egger, Weighted Median, and Weighted Mode, using 9-22 single nucleotide polymorphisms as instrumental variables.
The results showed that Cathepsin B exposure was associated with a decreased risk of PD across all the MR methods. Specifically, the IVW method showed an odds ratio (OR) of 0.890 (95% CI: 0.831-0.954), and the result was statistically significant after multiple testing corrections (p_FDR = 0.009). The MR Egger, Weighted Median, and Weighted Mode methods also supported this finding, with consistent directions of effects and significance levels. For cathepsins E, F, G, H, O, S, Z, and L2, none of the MR methods indicated a significant relationship with PD risk, with p-values exceeding the conventional threshold of 0.05 and odds ratios close to null. The heterogeneity tests (Q_pval) were mostly non-significant, suggesting that the effect estimates were consistent across the different genetic instruments. The MR-Egger intercept and PRESSO did not indicate the presence of directional pleiotropy or outliers, supporting the robustness of our findings (Figure 2; Supplementary Table S3).
To address potential pleiotropic bias arising from trans-pQTLs, we conducted a univariate Mendelian randomization (MR) analysis using exclusively the cis-pQTLs for each cathepsin protein. Specifically, we included rs1692819 for Cathepsin B, rs1791679 for Cathepsin F, rs62013235 for Cathepsin H, and rs41271951 for Cathepsin S as the sole cis-pQTLs in our MR analysis, employing the Wald Ratio method. The results revealed that utilizing rs1692819 as the cis-pQTL for Cathepsin B showed a significant association with an odds ratio (OR) of 0.829 (95% CI: 0.752-0.915, p_FDR < 0.001). However, no significant associations were found for Cathepsin F (OR = 0.897, 95% CI: […] stability and reliability of our results, corroborating the initial findings of our study (Supplementary Figure S2).

Reverse univariable MR analysis
We conducted a reverse MR analysis to explore the potential causal effect of PD on the expression levels of various cathepsins. Multiple MR methods were employed, including Inverse Variance Weighted (IVW), MR Egger, Weighted Median, and Weighted Mode, using 5-11 single nucleotide polymorphisms as instrumental variables.
None of the MR methods showed a significant effect of PD on cathepsin B levels. The IVW method yielded a beta coefficient (b) of −0.035 (standard error [SE] = 0.060, p = 0.560), indicating a non-significant effect of PD on Cathepsin B levels. The MR Egger, Weighted Median, and Weighted Mode methods all supported these findings, with p-values exceeding the threshold for statistical significance. Furthermore, for PD on cathepsin E, F, G, H, O, S, L2, and Z levels, non-significant results suggest that, within the power of our analysis, PD does not have a detectable causal effect on cathepsin expression levels (Figure 3; Supplementary Table S4).
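As a worked check of how a beta coefficient and standard error map onto an odds ratio, confidence interval, and p-value, the sketch below uses the reported IVW values (b = −0.035, SE = 0.060). The normal approximation and the 1.96 multiplier are standard conventions assumed here, not details stated in the text.

```python
import math

def summarize_effect(beta, se, z_crit=1.96):
    """Convert a log-odds estimate into OR, 95% CI bounds, and two-sided p."""
    odds_ratio = math.exp(beta)
    ci_low = math.exp(beta - z_crit * se)
    ci_high = math.exp(beta + z_crit * se)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    return odds_ratio, ci_low, ci_high, p_value

# Reverse IVW estimate for PD on cathepsin B, as reported in the text.
rev_or, rev_low, rev_high, rev_p = summarize_effect(-0.035, 0.060)
print(round(rev_or, 3), round(rev_low, 3), round(rev_high, 3), round(rev_p, 3))
```

The resulting odds ratio and interval are close to the values given in the abstract for the reverse analysis (OR = 0.965, 95% CI: 0.858–1.087), and the p-value is close to the 0.560 reported here.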

Multivariable MR analysis
In the multivariable Mendelian randomization (MVMR) analysis evaluating the nine cathepsins jointly as exposures for PD, instruments were screened using thresholds of p = 5 × 10⁻⁶, r² = 0.001, and kb = 10,000. After deduplication, clumping, and harmonization, 10 SNPs were retained as IVs for the MVMR analysis. Only cathepsin B showed a significant negative association with PD risk (OR = 0.887, 95% CI = 0.823-0.957, p = 0.002), indicating a potential protective effect against PD. None of the other cathepsins (E, F, G, H, O, S, L2, Z) was significantly associated with PD, with p-values exceeding the threshold for significance (Figure 4). The lack of significant associations for these cathepsins suggests that they might not be causally related to PD, at least within the scope of this analysis.
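The idea behind multivariable IVW can be sketched as a weighted regression, without an intercept, of SNP-outcome effects on the SNP-exposure effects of several exposures at once. The Python below is an illustrative simplification with two hypothetical exposures and made-up effect sizes; it returns point estimates only and is not the MendelianRandomization package's implementation.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def mvmr_ivw(beta_exposures, beta_outcome, se_outcome):
    """Weighted least squares of outcome effects on exposure effects."""
    k = len(beta_exposures[0])
    xtwx = [[0.0] * k for _ in range(k)]
    xtwy = [0.0] * k
    for row, by, se in zip(beta_exposures, beta_outcome, se_outcome):
        weight = 1.0 / se ** 2
        for i in range(k):
            xtwy[i] += weight * row[i] * by
            for j in range(k):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve_linear(xtwx, xtwy)

# Hypothetical SNP effects on two exposures; only exposure 1 affects the
# outcome, with a direct effect of -0.12 on the log-odds scale.
demo_bx = [[0.30, 0.00], [0.10, 0.20], [0.00, 0.25], [0.20, 0.10]]
demo_by = [-0.12 * b1 for b1, _ in demo_bx]
demo_se = [0.01] * len(demo_bx)
direct_effects = mvmr_ivw(demo_bx, demo_by, demo_se)
print([round(b, 4) for b in direct_effects])
```

The regression recovers the effect for the first exposure and an effect near zero for the second, which mirrors the reported pattern of one cathepsin retaining its association after adjustment for the others.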

Replication and meta-analysis
To verify the stability of the results, the independent FinnGen database was used for a replication MR analysis, and a further meta-analysis was performed. The replication MR analysis between Cathepsin B and PD showed a similar effect in the FinnGen consortium (OR = 0.921, 95% CI = 0.860-0.987, p = 0.020 for the IVW method) (Table 2) and remained significant in the combined meta-analysis (OR = 0.905, 95% CI = 0.862-0.951, p < 0.0001) (Figure 5).
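The pooled estimate can be approximately reproduced from the two reported odds ratios. The sketch below recovers log-odds standard errors from the confidence intervals and combines them with a DerSimonian-Laird random-effects model; the choice of that estimator and the 1.96 multiplier are assumptions, since the text does not specify the meta-analysis details.

```python
import math

def log_or_and_se(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log odds ratio and its SE from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return math.log(odds_ratio), se

def dersimonian_laird(estimates, z_crit=1.96):
    """Random-effects pooling; returns OR and its 95% CI bounds."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    fixed = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, (y, _) in zip(weights, estimates))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c)
    re_weights = [1.0 / (se ** 2 + tau2) for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(re_weights, estimates)) / sum(re_weights)
    pooled_se = math.sqrt(1.0 / sum(re_weights))
    return (math.exp(pooled),
            math.exp(pooled - z_crit * pooled_se),
            math.exp(pooled + z_crit * pooled_se))

# Reported IVW results: primary (IPDGC) and replication (FinnGen) analyses.
studies = [
    log_or_and_se(0.890, 0.831, 0.954),
    log_or_and_se(0.921, 0.860, 0.987),
]
pooled_or, pooled_low, pooled_high = dersimonian_laird(studies)
print(round(pooled_or, 3), round(pooled_low, 3), round(pooled_high, 3))
```

Because the two estimates are very similar, the between-study variance is estimated as zero and the random-effects result coincides with the fixed-effect one, landing close to the reported OR of 0.905 (95% CI: 0.862-0.951).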

Colocalization analysis
Colocalization analysis was used to detect genetic variants shared between cathepsin B and PD. We did not find any evidence of a shared causal variant (PP.H4.abf = 0.005).

Forest plots of univariable Mendelian randomization analysis of the relationship between Parkinson's disease and various cathepsins.

Discussion
This study aimed to investigate the complex association between cathepsins and PD using Mendelian randomization and colocalization approaches. Our study represents a first attempt of this kind in the field of PD pathogenesis, shedding light on the potential protective role of cathepsin B, an enzyme integral to the degradation of pathological proteins such as alpha-synuclein.
Using univariate and multivariate MR analyses, our study provides evidence that genetically predicted cathepsin B levels are negatively associated with susceptibility to PD. A variety of sensitivity and replication analyses provided further support for this association, thereby enhancing the consistency and dependability of our findings. Notably, our data show no evidence that cathepsin B and PD share a causal genetic variant, indicating a complex interaction that requires further research.
At the molecular level, the protective mechanisms of cathepsin B against PD may involve its crucial function in the degradation of alpha-synuclein through autophagy, a critical process that prevents the harmful accumulation of this protein. The results of Jones-Tabah et al. (2023) and McGlinchey et al. (2019) corresponded with our findings, highlighting the crucial role of cathepsin B in maintaining lysosomal function and preventing the development of neurotoxic aggregates.
To expand our understanding of the possible consequences of cathepsin B activity, Bai et al. (2018) and Nakanishi (2020) investigated the effect of this enzyme on oxidative stress and neuroinflammation, which are known to contribute to PD and other neurodegenerative disorders. Moreover, research conducted by Kim et al. (2022) and Almeida et al. (2020) indicated that focusing on the lysosomal pathway, specifically cathepsin B, could potentially become a highly effective therapeutic approach not only for PD but also for various other neurodegenerative disorders.
By contrast, Tsujimura et al. (2015) offer an alternative viewpoint by suggesting that cathepsin B could potentially facilitate the development of intracellular alpha-synuclein aggregates, which are characteristic features of PD pathology. The apparent contradiction in cathepsin B function underscores the enzyme's intricate and situationally dependent roles in cellular processes, which may differ during the distinct phases of the disease.
However, our study had some limitations. Firstly, the homogeneity of our sample population, which primarily consisted of individuals of European ancestry, may have limited the generalizability of our results. To enhance the generalizability of our findings and firmly establish cathepsin B as a feasible biomarker for PD, future research should incorporate a more heterogeneous sample. Secondly, our study did not observe colocalization between cathepsin B and PD, indicating a potential confounding effect of linkage disequilibrium between the cathepsin B pQTL and PD risk-associated variants. This observation aligns with findings from Zheng et al. (2020), underscoring the likelihood that the significant associations identified through MR analysis may be influenced by LD rather than reflecting a direct causal relationship. The absence of colocalization emphasizes the complexity inherent in interpreting MR results in the context of potential LD confounders, shedding light on the interplay between genetic factors and disease susceptibility in PD. Lastly, tissue specificity in protein biomarker analysis is a pertinent limitation of our study: plasma served as the primary source of protein measurements despite the relevance of brain tissue for PD research. This discrepancy relates to the work of Yang et al. (2021), in which distinct tissue-specific protein quantitative trait loci (pQTL) profiles are reported. The observation that different tissues have unique pQTL effects emphasizes the importance of utilizing brain-related tissues for PD studies to capture more relevant and context-specific insights. Our reliance on plasma-derived data, while informative, limits the interpretation of our findings, as the tissue specificity of pQTLs may not be fully captured in this context.
In summary, our MR analysis provides significant data supporting the neuroprotective effect of cathepsin B and its potential as a therapeutic target in PD. The findings of this study strongly encourage additional investigations into the biological roles of cathepsin B and its possible use as an early biomarker for diagnosing PD. Subsequent investigations should focus on overcoming the recognized constraints and deepening our understanding to effectively utilize the therapeutic potential of cathepsin B in combating PD and other neurodegenerative disorders.

FIGURE 1
Study design for Mendelian randomization and colocalization analyses between cathepsins and Parkinson's disease. The study approach contained two main phases. The first phase comprised two-sample Mendelian randomization (on the left) and multivariable Mendelian randomization (on the right). The procedures encompass primary analysis techniques, such as inverse variance weighted, as well as secondary analysis techniques, such as MR-Egger and weighted medians. The sensitivity analyses included Cochran's Q test, the MR-Egger intercept, the MR-PRESSO global test, and leave-one-out analysis. During the second phase, we conducted colocalization analyses to determine whether there was a shared genetic variant between the positive cathepsin from the first phase and Parkinson's disease.

FIGURE 2
Forest plots of univariable Mendelian randomization analysis of the relationship between various cathepsins and Parkinson's disease.

FIGURE 5
A meta-analysis of the causal association of cathepsin B and Parkinson's disease. IPDGC, International Parkinson's Disease Genomics Consortium; FinnGen, the FinnGen consortium; OR, odds ratio; CI, confidence interval.

TABLE 1
Data sources for cathepsins and Parkinson's disease.

TABLE 2
Replication MR analysis of cathepsin B on Parkinson's disease in the FinnGen consortium.