Expression of Human Endogenous Retrovirus-K in Spinal and Bulbar Muscular Atrophy

Background: Spinal and Bulbar Muscular Atrophy (SBMA) is caused by the expansion of the polyglutamine tract within the androgen receptor (AR) gene, and results in a multisystem presentation, including the degeneration of lower motor neurons. AR is known to modulate the expression of endogenous retrovirus-K (ERVK), a pathogenic viral genomic symbiont. Since ERVK is associated with motor neuron diseases such as Amyotrophic Lateral Sclerosis (ALS), we sought to determine whether patients with SBMA exhibit evidence of ERVK reactivation. Results: In this pilot study, peripheral blood mononuclear cell (PBMC) samples from controls and patients with SBMA were examined ex vivo for the expression of ERVK viral transcripts and proteins. No differences in ERVK RNA expression were observed between the clinical groups. In contrast, enhancement of processed ERVK Gag and integrase proteins was observed in SBMA-derived PBMC as compared to healthy control specimens. Increased ERVK protein maturation co-occurred with elevated expression of the pro-inflammatory transcription factor IRF1 in SBMA. Conclusions: Our findings indicate that ERVK viral protein maturation in SBMA is an unrecognized biomarker and facet of the disease. We discuss how our current understanding of ERVK-driven pathology may tie into key aspects of multi-system dysfunction in SBMA, with a focus on inflammation, proteinopathy, and DNA damage and repair.

and its pre-mutation expansion with Amyotrophic lateral sclerosis (ALS) (5,6), Atrophin-1 in Dentatorubral pallidoluysian atrophy (DRPLA) and several genes implicated in distinct types of spinocerebellar ataxia (4). Generational trinucleotide repeat expansions in risk genes can lead to offspring with earlier disease onset and more severe clinical symptoms in SBMA (2,7). PolyQ expansions longer than 37 amino acids are considered pathogenic (8).
Trinucleotide repeat expansion disorders (TREDs) such as SBMA have been associated with both loss-of-function and gain-of-function effects (9). It remains uncertain how the CAG repeat expansion induces neurodegeneration in SBMA. However, as in all other polyglutamine diseases, intranuclear inclusions containing misfolded polyglutamine-expanded proteins accumulate in certain neuronal populations. In SBMA cases, these deposits can occur in the anterior horn motor neurons of the spinal cord (10). The pathogenic mechanism is not well understood, but increasing evidence suggests that toxicity of the mutant AR protein is due primarily to androgen-dependent impairment of its receptor function, in addition to effects related to proteinopathy. When androgen binds mutant AR, this leads to AR aggregation, nuclear inclusions in certain tissues, and altered function as a transcriptional regulator (11-13).
Mutant polyQ proteins can accumulate into proteinaceous deposits in neural cells, as well as in peripheral cell types such as muscle (14) and immune cells (15). Our work has previously demonstrated that aggregate-prone, mutated TDP-43 protein can facilitate the accumulation of endogenous viral proteins within cells (16). Over 8% of human DNA is of retroviral origin: scattered throughout our genome are thousands of retrovirus-like sequences called endogenous retroviruses (ERVs) (17,18). When activated by signals such as inflammation, select pre-existing viruses in our DNA can produce viral proteins within human cells (19). The cellular consequences of the expression of these viral proteins are largely unknown. However, accumulating evidence points toward endogenous retrovirus-K (ERVK) driving neurodegeneration in ALS (19-22). Therefore, due to overlapping cellular mechanisms and aspects of clinical presentation, we postulated that ERVK expression may be enhanced in SBMA, as seen in ALS.
were anonymized by clinician Dr. Kerri Schellenberg prior to processing by the Douville lab.

Diagnosis and Demographics of Participant Samples
Clinical examination and genetic screening for CAG repeats in AR were used to confirm the clinical diagnosis of SBMA. Genetic testing was not performed on control specimens. Table 1 indicates the individual patient diagnosis and number of CAG repeats in AR for the samples used in this study.

PBMC Isolation
Whole blood samples were diluted in saline solution and processed on a Ficoll gradient (GE Healthcare 17-5442-02) as previously described (23). The time from collection to processing for all patient samples was <20 h. Extracted ex vivo PBMC were counted, aliquoted into dry pellets of 5 × 10^6 cells each, and frozen until subsequent batched analysis.

Western Blots
PBMC were lysed on ice with 50 µl of in-house lysis buffer (0.05 M Tris (pH 7.4), 0.15 M NaCl, 0.002 M EDTA, 10% glycerol and 1% NP-40 in ultra-pure water) to extract proteins. The lysis buffer was supplemented with 1x HALT protease and phosphatase inhibitor cocktail (Thermo Scientific #78442). A BCA assay (Thermo Scientific #PI23227) was used to determine the protein content of each sample as per the manufacturer's instructions. Cell lysates were prepared for SDS-PAGE and heated at 95 °C for 10 min. Proteins (15 µg per lane) were separated by SDS-PAGE using a 10% BioRad Quick Cast gel (#161-0173) and transferred onto a PVDF membrane (BioRad #162-0260). The membrane was blocked in 5% skim milk solution for 30 min and probed with the desired primary antibody overnight at 4 °C, followed by incubation at room temperature for 2 h. Primary antibodies used were: mouse anti-ERVK Gag (LifeSpan Biosciences #LS-C65287), rabbit anti-ERVK integrase (Pierce, custom antibody), rabbit anti-ERVK SU (Pierce, custom antibody), rabbit anti-human IRF1 (Santa Cruz #SC497) and chicken anti-human β-actin (Abcam #ab13822; loading control). The membrane was probed with fluorophore-conjugated anti-mouse, anti-chicken or anti-rabbit IgG secondary antibodies (1:1000 dilution; Molecular Probes #21449, A11072, A21246) for 2 h at room temperature. The membrane was imaged using a Protein Simple FluorChem M chemiluminescent imager, with multiplexed readouts on the same blot acquired in separate fluorescent channels. Image Lab software was used to determine the molecular weight and relative density (normalized to β-actin) of each band. The identity of each band was based on Gag-Pro-Pol processing, as previously described (24).
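The normalization to β-actin described above can be illustrated with a minimal Python sketch. This is not the Image Lab procedure itself; the function names and all density values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def relative_density(target_band, actin_band):
    """Normalize a target band density to the β-actin band from the same lane."""
    if actin_band <= 0:
        raise ValueError("loading-control density must be positive")
    return target_band / actin_band


def fold_change(case_values, control_values):
    """Ratio of the mean normalized density in cases to that in controls."""
    return mean(case_values) / mean(control_values)


# Hypothetical raw densities (arbitrary units): (target band, β-actin band) per lane.
control_lanes = [(100.0, 200.0), (150.0, 300.0)]
sbma_lanes = [(200.0, 200.0), (300.0, 300.0)]

control_rel = [relative_density(t, a) for t, a in control_lanes]
sbma_rel = [relative_density(t, a) for t, a in sbma_lanes]

print(fold_change(sbma_rel, control_rel))
```

Dividing each band by the loading control from its own lane removes lane-to-lane differences in loaded protein before groups are compared.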

Statistical Analyses
GraphPad Prism version 8.1.2 was used to carry out statistical analyses, including column statistics. Unpaired t-tests were used to assess clinical group differences in western blot and Q-PCR quantifications.
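For readers who want to see the arithmetic behind an unpaired t-test, the following Python sketch computes the test statistic and degrees of freedom. It assumes the equal-variance (Student) form; the text does not state whether a Welch correction was applied, and the input values are hypothetical. The p-value would then be read from the t distribution with the returned degrees of freedom, which a statistics package such as Prism does automatically.

```python
import math
import statistics


def unpaired_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for Student's unpaired t-test.

    Assumes equal variances in the two groups (pooled-variance form).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    df = n_a + n_b - 2
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical normalized band densities for four controls and four cases.
controls = [1.0, 2.0, 3.0, 4.0]
cases = [3.0, 4.0, 5.0, 6.0]

t_stat, df = unpaired_t(cases, controls)
print(f"t = {t_stat:.3f}, df = {df}")
```

With four samples per group the test has only six degrees of freedom, so moderate differences between groups can fail to reach significance.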

Nomenclature
As with human genes, ERVK viruses (or HERV-K) are assigned names by the HUGO Gene Nomenclature Committee, as recommended by Mayer et al. (25). Gene names in the text are italicized, whereas protein names are not.

A Pilot Cohort of Patients With SBMA
SBMA, also known as Kennedy's disease, is very rare and likely underdiagnosed; prevalence estimates of 1-2 per 100,000 individuals may therefore be underestimates (26,27). Founder effects are associated with regional increases in SBMA prevalence (28,29).
With the goal of generating preliminary data for future studies, we recruited four patients with SBMA and four control participants to donate blood samples. The number of CAG repeats in the AR gene of patients with SBMA in this study is listed in Table 1. All SBMA cases had polyQ tracts extending beyond 38 CAG repeats, which is representative of pathological AR disruption (8).

Control of ERVK Expression
The ERVK provirus comprises the typical retrovirus genes gag, pro, pol, and env. Multiple ERVK RNA transcripts are produced from a provirus. An essential transcript encoding both structural (Gag) and enzymatic (Pol) proteins is translated to produce the Gag-Pro-Pol polyprotein, which is cleaved by the mature protease enzyme to produce individual viral proteins (Figure 1). Protease cleavage sites in the ERVK Gag polyprotein are known, allowing the identification of mature viral proteins (30). Many inflammatory diseases are associated with elevated ERVK expression, although putative pathological contributions remain contentious (31,32). Pro-inflammatory cytokine signaling has repeatedly been shown to contribute toward reactivation of endogenous retroviruses.

IRF1 Expression Is Elevated in PBMC From Patients With SBMA
TNFα signaling can enhance IRF1 activity. The action of this inflammatory transcription factor has previously been implicated in the reactivation of ERVK (19). In PBMC, TNFα treatment dose-dependently increases both IRF1 and ERVK reverse transcriptase (RT) protein expression (Figure 1B). One notable observation in PBMC from patients with SBMA is a 2.4-fold enhanced expression of IRF1 as compared with controls (p < 0.05, Figure 1D), despite no evidence of differences in the IRF1 transcript between clinical groups (Figure 1C).

ERVK Expression in PBMC From Patients With SBMA and Controls
We assessed the protein expression of the Gag-Pro-Pol polyprotein (218 kDa), and its protease-derived cleavage products, by targeting either an N-terminal Gag epitope (Figure 2) or C-terminal integrase epitope (Figure 3) through a multiplex western blot analysis of ex vivo PBMC.
Basal expression of ERVK is expected in many human tissue types (33). Indeed, ERVK gag transcripts were readily measured in both controls and patients with SBMA, with no significant differences between clinical groups (Figure 2B). Figure 2A shows that ERVK Gag polyproteins are evident in both controls and patients with SBMA. However, samples from patients with SBMA display more viral polyprotein processing, leading to the formation of mature viral proteins, than their control counterparts. This is evident when examining the formation of the mature structural Gag proteins capsid (CA, 28 kDa), matrix (MA, 15 kDa) and nucleocapsid (NC, 15 kDa). The summed intensity of the MA/NC bands indicates a trend toward more Gag polyprotein processing in PBMC from patients with SBMA as compared with controls (Figure 2C, p = 0.06). Figure 3 shows that there are similar levels of ERVK pol transcript (Figure 3B) and ERVK Gag-Pro-Pol polyprotein (218 kDa, Figure 3A) in PBMC from controls and patients with SBMA. Similar to what was observed for ERVK Gag, the processing of viral proteins containing an integrase epitope was greater in SBMA samples than in controls. Dimeric integrase is required for enzymatic activity (34); we observed monomeric (32 kDa) and an abundance of dimeric (52 kDa) integrase bands. Band intensity quantification of ERVK integrase protein indicates that there is significantly more Pol polyprotein processing occurring in PBMC from patients with SBMA as compared with controls (Figure 3C, p < 0.01).

FIGURE 1 | Genomic insertions of ERVK with intact open reading frames can produce a variety of mature viral proteins and are induced by pro-inflammatory signals. (A) The ERVK provirus is comprised of the typical retrovirus genes gag, pro, pol, and env. Multiple ERVK RNA transcripts are produced from a provirus. This diagram depicts the gag-pro-pol transcript. This transcript is translated to produce the Gag-Pro-Pol polyprotein, which is cleaved by the mature protease enzyme to produce individual viral proteins. Gag structural proteins (capsid, matrix and nucleocapsid) and the pol-derived reverse transcriptase (RT/RT-RH heterodimer) and integrase (IN) proteins were examined in this study. The cellular role of most ERVK proteins remains unknown; however, the ERVK Env protein is known to be neurotoxic (20). (B) ERVK reverse transcriptase (RT) levels dose-dependently increase with pro-inflammatory stimulus in immune cells. Peripheral blood mononuclear cells (PBMC) were treated with increasing doses of the pro-inflammatory cytokine TNFα and evaluated for ERVK RT and IRF1 expression. As described elsewhere (19), the pro-inflammatory transcription factor IRF1 participates in driving increased levels of ERVK transcription and RT protein expression. (C,D) Similar IRF1 transcript expression in control and SBMA cases (C), despite evidence of elevated IRF1 protein expression in SBMA as compared with controls (D, *p < 0.05).

Together these data show that ERVK viral protein maturation is enhanced in SBMA PBMC, as compared with control cells. The implications of ERVK viral protein activity as it may relate to SBMA will be discussed below.
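As an illustration of how bands can be matched to expected cleavage products by size, the Python sketch below uses the molecular weights quoted in the text. The grouping of products by antibody, the 10% tolerance, and the function name are assumptions made for this example and do not describe the authors' Image Lab workflow.

```python
# Expected ERVK products (kDa) as quoted in the text, grouped by the
# antibody epitope used to detect them. MA and NC are both reported at
# 15 kDa, so they are treated here as a single combined band.
EXPECTED_KDA = {
    "gag": {
        "Gag-Pro-Pol polyprotein": 218.0,
        "capsid (CA)": 28.0,
        "matrix/nucleocapsid (MA/NC)": 15.0,
    },
    "integrase": {
        "Gag-Pro-Pol polyprotein": 218.0,
        "integrase dimer": 52.0,
        "integrase monomer": 32.0,
    },
}


def assign_band(antibody, observed_kda, tolerance=0.10):
    """Return the expected product nearest to the observed size.

    Returns None when no product lies within the relative tolerance
    (an assumed cutoff, here 10% of the expected size).
    """
    candidates = EXPECTED_KDA[antibody]
    label, expected = min(
        candidates.items(), key=lambda item: abs(item[1] - observed_kda)
    )
    if abs(expected - observed_kda) <= tolerance * expected:
        return label
    return None


print(assign_band("integrase", 51.0))
print(assign_band("gag", 15.5))
print(assign_band("gag", 100.0))
```

Bands that fall outside the tolerance return None rather than being forced onto the nearest product, which keeps unexplained bands visible for manual review.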

DISCUSSION
PolyQ diseases highlight the complexity of translating the human genome into a given cellular state. The role of endogenous retroviruses further complicates matters. However, it is of critical importance to consider both cellular and viral contributors to disease processes. Here, for the first time we show evidence of ERVK viral protein maturation in SBMA. Albeit a preliminary study, our work points to additional avenues of investigation into the role of ERVK in this motor neuron disease.

ERVK LTR and the AR Paradox
ERVK promoters contain binding sequences for AR. We have shown that the 5′ and 3′ long terminal repeats (LTRs; viral promoters) which control ERVK expression contain conserved androgen response elements (35). Experimental evidence also supports a role for AR in enhancing ERVK expression (36). Therefore, we hypothesized that disrupted AR activity in SBMA would result in decreased ERVK protein levels; however, this was not the case. Our results consistently show that there is enhanced ERVK polyprotein processing and formation of mature (and potentially pathogenic) viral proteins in SBMA PBMC. This could be considered a novel gain-of-function in SBMA. Further validation using tissue samples from individuals with SBMA is warranted to confirm or refute a pathological impact from mature viral proteins, as ERVK expression in PBMC often co-occurs with evidence of viral proteins in other tissue types (37,38).

Inflammation as a Driver of ERVK Expression in SBMA
Inflammation and chronic immune stimulation are becoming recognized as a distinct feature of trinucleotide repeat expansion disorders (15). Peripheral immune activation is observed before clinical onset of Huntington's disease and in several murine models of polyQ disease (15,39). Indeed, we observed enhanced expression of IRF1 in ex vivo (non-cultured/stimulated) PBMC from patients with SBMA, indicative of an ongoing inflammatory response. IRF1-dependent enhancement of inflammatory signaling is a potential underlying mechanism for ERVK expression in SBMA tissues, based on experimental cell culture models and observations in autopsied brain tissue specimens from patients with ALS (19).

Failure to Degrade ERVK Proteins in SBMA?
Ubiquitination and digestion of unwanted cellular proteins, including protein aggregates and viral proteins, is crucial to maintain cellular homeostasis and is particularly important for neuronal health (40). Several studies indicate that, similar to ALS, SBMA is also characterized by a failure in protein clearance mechanisms such as lysosomal degradation and autophagy (41,42). Given that we observed similar levels of ERVK-derived transcripts between controls and patients with SBMA, the elevated ERVK polyprotein processing and mature ERVK protein levels observed in SBMA may be related to a failure to degrade these viral proteins. Cell culture models show that inhibition of the proteasome can lead to an accumulation of ERVK proteins (16). ALS-associated TDP-43 mutants are aggregate-prone and can facilitate the accumulation of ERVK proteins within cells (16). PolyQ-expanded AR also forms toxic proteinaceous deposits in cells, which is associated with a loss-of-function as a transcriptional regulator (9,43,44). It remains unclear whether AR proteinopathy impacts the accumulation of ERVK proteins in SBMA.

Potential Effects of ERVK on DNA Damage in SBMA
Several polyQ diseases exhibit evidence of heightened DNA damage and genomic instability (45,46). In a murine model of SBMA, the extended polyQ tract in AR100 mice is associated with enhanced expression of the DNA damage marker γH2AX in motor neurons, in conjunction with decreased expression of genes involved in DNA repair, such as p53, Sesn1, ATR, Gadd45, Xrcc5, and Tp63 (42). Mutant AR protein can also act as a sink for the DNA repair protein PTIP by sequestering it away from sites of DNA damage (47). Excessive ubiquitination of polyQ proteins may further compromise nuclear DNA repair processes through depletion of nuclear ubiquitin and histone de-ubiquitination (48). Enzymatic activity of retroviral integrase proteins can lead to significant DNA damage accumulation over time (49). The loss of DNA repair function in SBMA may render cells more vulnerable to DNA-damaging insults such as ERVK integrase activity (21) (unpublished data). Therefore, our observation of mature ERVK integrase protein expression in PBMC from patients with AR polyQ repeats is potentially pathologically relevant to the underlying multi-system disease processes that occur in SBMA.

CONCLUSION
A novel gain-of-function effect in SBMA appears to be the enhancement of ERVK polyprotein processing into mature viral proteins in immune cells, despite an overall similar abundance of viral polyprotein in controls and patients with SBMA. While other ERVK-associated disease states exhibit increased levels of ERVK transcripts and viral protein (32), only viral protein processing seems to be altered in SBMA immune cells. As PBMC models do not necessarily reflect disease-relevant tissues, additional investigation into the role of endogenous retroviruses in SBMA is warranted. Should ERVK be further shown to contribute to the pathogenesis of SBMA, an antiviral therapeutic opportunity could be identified for this disease.

DATA AVAILABILITY
All datasets generated or analyzed for this study are included in the manuscript and the supplementary files.

ETHICS STATEMENT
Ethics approval was granted, and consent obtained from all participants in this study.