Evidence for a Pan-Neurodegenerative Disease Response in Huntington's and Parkinson's Disease Expression Profiles

Huntington's and Parkinson's Diseases (HD and PD) are neurodegenerative disorders that share some pathological features but are disparate in others. For example, while both diseases are marked by aberrant protein aggregation in the brain, the specific proteins that aggregate and types of neurons affected differ. A better understanding of the molecular similarities and differences between these two diseases may lead to a more complete mechanistic picture of both the individual diseases and the neurodegenerative process in general. We sought to characterize the common transcriptional signature of HD and PD as well as genes uniquely implicated in each of these diseases using mRNA-Seq data from post mortem human brains in comparison to neuropathologically normal controls. The enriched biological pathways implicated by HD differentially expressed genes show remarkable consistency with those for PD differentially expressed genes and implicate the common biological processes of neuroinflammation, apoptosis, transcriptional dysregulation, and neuron-associated functions. Comparison of the differentially expressed (DE) genes highlights a set of consistently altered genes that span both diseases. In particular, processes involving nuclear factor kappa-light-chain-enhancer of activated B cells (NFkB) and transcription factor cAMP response element-binding protein (CREB) are the most prominent among the genes common to HD and PD. When the combined HD and PD data are compared to controls, relatively few additional biological processes emerge as significantly enriched, suggesting that most pathways are independently seen within each disorder. Despite showing comparable numbers of DE genes, DE genes unique to HD are enriched in far more coherent biological processes than the DE genes unique to PD, suggesting that PD may represent a more heterogeneous disorder. The complexity of the biological processes implicated by this analysis provides impetus for the development of better experimental models to validate the results.


INTRODUCTION
Transcriptional dysregulation has been observed in both Huntington's disease (HD) and Parkinson's disease (PD) (Cha, 2007;Elstner et al., 2011). Transcription, neuroinflammation, and developmental processes have been shown to be dysregulated in the brains of HD individuals , while inflammation and mitochondrial dysfunction were observed to be altered in the brains of PD individuals (Dumitriu et al., 2012). However, a systematic comparison of the transcriptional signatures of HD and PD has not been performed to date, and those genes and biological processes common to both diseases, if any, remain to be determined. To address this question, we sought to identify genes that are consistently differentially expressed (DE) in the post-mortem brains of HD and PD human subjects compared to neuropathologically normal control brains using mRNA-Seq. We hypothesize that common altered genes and pathways in HD and PD will elucidate the mechanistic underpinnings of the neurodegenerative process.
This study presents the results of a comparison of DE genes for each of HD and PD vs. controls (C) analyzed separately. In addition, in order to identify consistent effects with lower effect size across diseases, an analysis was performed where the HD and PD datasets are concatenated as a single category, neurodegenerative disease (ND), and compared with C. DE genes are determined using a tailored form of logistic regression as previously described (Choi et al., 2017), which improves control of type I errors when compared to negative binomial based DE detection techniques.

Human Subjects
This study has been designated exempt (Protocol # H-28974) by the Boston University School of Medicine Institutional Review Board, as no human subjects were studied and all data are derived from post-mortem human brain specimens.

Sample Collection, Processing, and Sequencing
The HD, PD, and C samples used in this study are those previously described in our past work (Dumitriu et al., 2012;Labadorf et al., 2015). Nine additional HD brain samples were included in this study beyond those in , including two HD gene positive asymptomatic individuals, obtained from the Harvard Brain Tissue Resource Center. All samples underwent the same tissue dissection and RNA extraction sample preparation protocol performed by the same individuals. Briefly, RNA was extracted from the prefrontal cortex of post-mortem brains of HD and PD subjects, as well as neuropathologically normal controls. RNA was poly-A selected and subjected to mRNA sequencing on the Illumina HiSeq 2000 platform. Sample statistics are contained in Table 1. See Labadorf et al. (2015) and Dumitriu et al. (2012) for more detailed information about sample preparation.

mRNA-Seq Data Processing
Each FASTQ file containing mRNA sequences was first trimmed for sequence quality using the sickle software package (Joshi and Fass, 2011) with default arguments. The resulting short reads were aligned against the hg38 build of the human reference genome using the STAR aligner v2.4.0h1 (Dobin et al., 2013) with permissive multimapping parameters (200 maximum alignments-outFilterMultimapNmax 200) and otherwise parameter values suggested in the STAR manual. Multimapped reads were assigned unique alignment locations using the ORMAN software package (Dao et al., 2014). Aligned reads were counted against GENCODE v21 annotation (Harrow et al., 2012) using the HTSeq package v0.6.1p1 . Read counts for all samples were normalized using the method described in the DESeq2 package v1.10.1 (Love et al., 2014) and outlier counts were trimmed using the strategy as previously described . Since the original mRNAs were poly-A selected, only genes with biotypes known to be polyadenylated (i.e., protein_coding, lincRNA, processed_transcript, sense_intronic, sense_overlaping, IG_V_gene, IG_D_gene, IG_J_gene, IG_C_gene, TR_V_gene, TR_D_gene, TR_J_gene, and TR_C_gene) as annotated by Ensembl BioMart (Kinsella et al., 2011) downloaded on May 27th, 2015. To avoid spurious results due to low abundance, genes were further filtered if more than half of the ND or C samples had zero counts.

Differential Expression and Assessment of Batch Effects
DE genes were identified using Firth's logistic (FL) regression (Firth, 1993;Heinze and Schemper, 2002) applied to mRNA-Seq data as previously described (Choi et al., 2017). Logistic regression is a well-established statistical model that can identify consistent differences in one or more variables between two groups of samples. In contrast to negative binomial regression models like edgeR (Robinson et al., 2010) and DESeq2 (Love et al., 2014), this method models a binomial status variable (e.g., case vs. control) as a function of gene counts and any other potentially confounding variables (RIN value, PMI, etc.). Classical logistic regression has historically not been used to determine DE genes because of the so-called "complete separation" problem, where model parameter estimation fails when there is perfect or nearly perfect separation of a predictor with respect to a binomial variable (e.g., one condition has extremely high read counts and the other has very low read counts). FL regression addresses this issue by using a modified likelihood function to enable reliable parameter estimation, and has other statistical advantages with respect to type I error rates and power. Note the DE statistic from FL regression is log odds ratio (LOR) of case vs. controls, that is, positive LOR indicates greater mRNA abundance in case and negative LOR indicates greater abundance in control. All reported p-values are Benjamini-Hochberg (BH) adjusted (Benjamini and Hochberg, 1995) unless noted otherwise. See Choi et al. (2017) for further information on this method applied to mRNA-Seq data.
The data sets in this study were sequenced in five separate batches. To evaluate whether there was evidence of systemic batch effects confounding the identification of DE genes, we ran separate Firth DE models with and without a categorical variable representing batches and compared the beta estimates using Spearman correlation. Beta estimate ranks were highly correlated for HD vs. C (ρ = 0.84, p ≪ 0.001), PD vs. C (ρ = 0.99, p ≪ 0.001), and ND vs. C (ρ = 0.97, p ≪ 0.001). We therefore did not include a batch variable in the DE models.

Identification of ND DE Genes and Enriched Genesets
DE genes were identified as those with BH adjusted p-values < 0.01 from the Firth's logistic regression models of HD vs. C, PD vs. C, and ND vs. C models, yielding three independent DE gene lists. Read counts for each gene were scaled to have a mean of zero and standard deviation of one to obtain standardized regression coefficients, which makes coefficients comparable across genes. All controls were used in each model. Gene set enrichment analysis was performed on each gene list ranked by read counts beta coefficient using the GSEA (Subramanian et al., 2005) implementation in the DOSE software package (Yu et al., 2015) against the MsigDB Canonical Pathway (C2) geneset database (Subramanian et al., 2005). GSEA enrichment was computed using the complete list of genes irrespective of significance ranked by standardized beta coefficient of the count variable. The robust rank aggregation (RRA) algorithm (Kolde et al., 2012) was used to identify individual genes that were consistently altered across these gene lists. Briefly, RRA is a probabilistic, non-parametric, rank-based method for detecting genes ranked consistently better than expected under the null hypothesis of uncorrelated inputs in the presence of noise and outliers. The genes identified as most significant by RRA are the most likely to be implicated in the general ND phenotype.
In addition to producing independent HD and PD DE gene lists, we sought to functionally characterize the genes that are uniquely significant to each disease as well as those in common. To accomplish this, the DE genes from HD and PD were intersected, partitioning the genes into HD-specific, PD-specific, and DE genes common to the two gene lists. Each of these partitioned gene lists were then subjected to gene set enrichment on the MsigDB Canonical Pathway (C2), Transcription Factor Targets (C3), and Gene Ontology (C5) gene set databases (Subramanian et al., 2005) using a hypergeometric test.

RESULTS
Firth's logistic (FL) regression identified 2,427, 1,949, and 4,843 significantly DE genes for HD, PD, and ND, respectively, at qvalue < 0.01. Gene set enrichment analysis of MsigDB C2 gene sets identified 226, 199, and 250 gene sets significantly enriched at q-value < 0.05 for HD, PD, and ND, respectively. Due to the large number of DE genes in each dataset, we focus exclusively on the GSEA enrichment results here. Complete DE gene list and gene set enrichment statistics for HD, PD, and ND are included in Supplemental File 1.
There was a high degree of overlap between the significantly enriched gene sets of HD and PD. 145 gene sets were significantly enriched in both DE gene lists, while 81 and 54 gene sets were uniquely significant in HD and PD, respectively. When a pathway was enriched in more than one list, the pathway was always, without exception, enriched in the same direction, either positively (genes are more abundant in disease) or negatively (genes are less abundant in disease). There were 24 gene sets uniquely significant in ND. Figure 1 depicts the enriched gene sets grouped by high-level biological category for HD, PD, and ND.
We make several observations of Figure 1A. First, the plurality of enriched gene sets across all three data sets are related to immune processes (IM) and are with few exceptions positively enriched. Pathways related to neuronal processes (NE) are largely negatively enriched and there is a subset of these gene sets that are exclusively enriched in HD. With the exception of DNA damage, all remaining biological categories are represented for both HD and PD. DNA damage related pathways (DN) are unique to the PD dataset and are negatively enriched. Multiple apoptosis (AP), developmental (DE), transcription/translation (TR), and transcription factor target (TF) gene sets are also enriched in all three gene lists.
There were 83 gene sets that did not fit cleanly into a single category (OT), which notably include pathways related to endocytosis, signaling, cellular adhesion and extracellular matrix, glycans, and metabolism. A tabular form of the data underlying Figure 1A is in Supplemental File 1. RRA identified 1,353 genes with a score <0.01. The top ten genes identified by RRA as most highly ranked across all three gene lists are reported in Table 2. The rank of each gene in the individual gene lists are also reported in the table, showing that most genes are relatively highly ranked across all three studies as expected. The list of all significant genes identified by RRA analysis is included in Supplemental File 1.
The most consistently ranked gene is RP1-93H18.7 (ENSG00000272403.1), a lncRNA, which was removed from Ensembl starting at version GRCh38.p2, but shows consistent transcription in these data. The annotation of this gene was likely removed because it was found to be the 3 ′ UTR end of one transcript (ENST00000452085.7 as of human genome release GRCh38.p10) of the gene dermatan sulfate epimerase (DSE). DSE is also DE in both HD and PD, is involved in embryonic development (Habicher et al., 2015;Stachtea et al., 2015), and has been related to the immune response in cancer patients (Mizukoshi et  al., 2012). The significance of ENSG00000272403 is likely attributable to the differential expression of DSE itself. See Supplemental File 2 for a more thorough investigation of this gene.
Deficiencies in the second ranked gene, SPR (sepiapterin reductase), have been linked to DOPA-responsive dystonia (Wijemanne and Jankovic, 2015) and previously implicated in PD (Tobin et al., 2007). The third gene, DDIT4 (DNA-Damage-Inducible Transcript 4), is a multi-functional gene which, via its inhibition of the mammalian target of rapamycin complex 1 (mTORC1), regulates in cell growth, proliferation, and survival (Dennis et al., 2013), controls p53/TP53-mediated apoptosis in response to DNA damage (Vadysirisack et al., 2011;Cam et al., 2014), and plays a role in neurodegeneration, neuronal death and differentiation, and neuron migration during embryonic brain development (Malagelada et al., 2011;Canal et al., 2014;Ota et al., 2014;Romaní-Aumedes et al., 2014). TRIP10 (thyroid hormone receptor interactor 10), another multi-functional gene, is involved in insulin signaling (Chang et al., 2013), endocytosis (Feng et al., 2010), and structures specific to monocyte-derived cells (Linder et al., 2000). TNFRSF10D (Tumor Necrosis Factor Receptor Superfamily, Member 10d, Decoy With Truncated Death Domain) inhibits certain types of apoptosis and may play a role in NfkB pathway (Degli-Esposti et al., 1997). Figure 2 illustrates the differences in normalized counts for the top genes identified by RRA. With the exception of (12) PITX1, which is driven entirely by HD, all top genes demonstrate substantial differences between both disease conditions and control.
Finally, we examined the significant DE genes from HD and PD for intersection. Figure 3 illustrates the overlap of DE genes between diseases and describes gene set enrichment results for the intersection. Figure 4 contains the enrichment results for the HD unique genes.
HD specific enrichments are shown in Figure 4, where a broad spectrum of biological processes are implicated in the HD-unique DE genes. The most striking enriched gene set is KEGG_RIBOSOME, with many other related gene sets involved in translation and molecular metabolism similarly enriched. Multiple gene sets that share >20% of their DE genes are associated with Jun Proto-Oncogene (AP1), BTB And CNC Homology 1, Basic Leucine Zipper Transcription Factor 1 and 2 (BACH1, BACH2), and NRF2/TCF11 are also implicated. Other strongly implicated biological processes are ion channel activity, plasma membrane and signaling, apoptosis, immune system and inflammatory processes, developmental genes, neuron-related signaling pathways, many transcription factors, and two families of miRNAs.

DISCUSSION
RNA-sequencing was performed in 29 HD, 29 PD, and 49 control prefrontal cortex (BA9) samples to assess the common and unique transcriptional profiles of these two protein aggregation related neurodegenerative diseases. Firth's logistic (FL) regression identified 2,427, 1,949, and 4,843 significantly DE genes for HD, PD, and ND, respectively, at q-value < 0.01. Gene set enrichment analysis of MsigDB C2 gene sets identified 226, 199, and 250 gene sets significantly enriched at q-value < 0.05 for HD, PD, and ND, respectively. Gene sets related to immune processes and inflammatory pathways were highly enriched for both diseases. Notably, Figure 1 shows that the overwhelming majority of enriched biological pathways are common to both diseases and that they are invariably perturbed in the same direction in both diseases. To the authors knowledge, this study presents the first comprehensive comparative analysis of DE gene expression from HD, PD, and ND in post-mortem human brains assessed with mRNA-Seq. In one previous study, Capurro et al analyzed HD and PD microarray data using a cell-type deconvolution method to identify cell-type specific differences in gene expression between cases and controls for both diseases (Capurro et al., 2014). Only one gene from the study, doublecortin-like kinase 1 (DCLK1), was found to be differentially expressed in both HD and PD cortex, and this gene also appears as DE in common between diseases in the analysis presented here.
The comparison of HD and PD in particular is motivated by the fact that these diseases can be viewed as mirror-images of each other. GABAergic medium spiny interneurons, which compose most of the neurons in the striatum and selectively die in HD but are spared in PD, project directly into the substantia nigra and coordinate motor activity throughout the brain via dopamine-induced signaling (Yager et al., 2015). Dopaminergic neurons in the substantia nigra, on the other hand, which also are important in coordinating motor activity as well as arousal, reinforcement, and reward (Schultz, 2007), selectively degenerate in PD but are spared in HD. It was observed in a study of 523 HD subjects that the incidence of PD in this cohort was lower than that of the general population, though both HD and PD individuals develop Alzheimer's disease at the same rate , suggesting the selective death of medium spiny neurons might be neuroprotective of dopaminergic neuron death. Given the intimate neurological link between the affected neurons in HD and PD, and the mutual exclusivity of their degeneration, this comparison poses a very interesting contrast FIGURE 2 | Box plots of normalized counts for top RRA genes split by condition, RRA rank is in parenthesis. Whiskers extend to 25th and 75th percentile counts, white bars are median counts. With the exception of (12) PITX1, which is driven entirely by HD, all top genes demonstrate substantial differences between both disease conditions and control.
to identify common responses to neurodegeneration that are not confounded by neuron type. Unfortunately, a direct comparison of neurons in these regions of post-mortem human brains is not possible, precisely due to this mirror-image pathology. The choice of the BA9 brain region is motivated by the fact that, due to degeneration, the affected neurons are largely missing from the striatum and substantia nigra in HD and PD, respectively, whereas BA9 is generally free of involvement in both diseases (Braak et al., 2003;Halliday et al., 2006;Hadzi et al., 2012).
Because the primarily affected neurons in HD and PD do not exist in BA9, the biological processes implicated by this analysis may likely represent the response to, rather than direct cause of, the respective diseases. Nonetheless, the remarkable consistency between HD and PD observed in this analysis points to important mechanisms that further our understanding of neurodegenerative disease as a general process.
The biological processes implicated by DE gene lists identified from each condition separately are compellingly similar. From   Figure 3A and reported similarly as in Figure 3C. Note 17 redundant or uninformative GO gene sets and 7 TF gene sets for motifs with unknown transcription factors were omitted from the figure results but are included in Supplemental File 1. Figure 1, we see that the majority of enriched biological pathways are common and that they are invariably perturbed in the same direction in both diseases. Furthermore, combining HD and PD data into an ND condition does not yield significantly more novel biological insights. This remarkable consistency between the pathway enrichment results suggest that the underlying molecular responses to neurodegeneration in HD and PD may be more similar than they are different, despite their different pathological underpinnings. Of particular significance is the strong positive enrichment of immune and inflammatory pathways, which have been convincingly implicated in both diseases separately (Dobbs et al., 1999;Jenner, 2003;Kwan et al., 2012;Dexter and Jenner, 2013;Ellrichmann et al., 2013;Crotti et al., 2014;Allen Reish and Standaert, 2015;Labadorf et al., 2015), but the compelling similarity of these signatures between HD and PD revealed by this analysis has not been illustrated to date.
The negative regulation of neuron-related pathways is also noteworthy, since the BA9 brain region, from which these samples are derived, is not known to be heavily involved in either of these diseases. Despite the lack of clear and consistent neurodegeneration in this brain region, the widespread biological pathways shown to be affected in this analysis strongly suggest neurons in BA9 do indeed experience a common set of effects in the neuropathology for HD and PD.
Many of the individual genes identified by RRA as most consistently different in HD, PD, and ND have previously been the focus of studies in neurodegeneration. The second highest ranked gene SPR has been the focus of significant study in PD and is related to the PARK3 gene locus (Tobin et al., 2007;Sharma et al., 2011), but has not been previously implicated in HD. Inhibition of DNA-damage inducible transcript 4 (DDIT4/RTP801/REDD1) has been associated with neuroprotection in PD models and patients (Malagelada et al., 2011) and is involved with mutant Huntingtininduced cell death (Martín-Flores et al., 2015). Thyroid hormone receptor interactor 10 (TRIP10) has been shown to interact directly with mutant huntingtin (Holbert et al., 2003), and while it is not known to play a role in PD pathology, its elevated mRNA abundance in these PD samples suggest it may indeed be implicated. Other top genes have also been implicated in neurodegeneration: tumor necrosis factor receptor superfamily 10D (TNFRSF10D) (López-Gómez et al., 2011;Frenkel, 2015), protein arginine methyltransferase 6 (PRMT6) (Scaramuzzino et al., 2015), and toll-like receptor 5 (TLR5) (Arroyo et al., 2011). Further investigation of this list of genes is likely to yield novel insights into the mechanisms of neurodegeneration.
The intersection of DE genes between HD and PD also affords important insight into genes relevant to fundamental neurodegenerative processes. Most notably, pathways related to NFkB and transcriptional targets of CREB are prominent in the enrichment results. The NFkB pathway is prominent in both HD (Marcora and Kennedy, 2010;Träger et al., 2014) and PD pathology (Ghosh et al., 2007;Flood et al., 2011) through its central role in inflammatory signaling. CREB is directly targeted by brain derived-neurotrophic factor (BDNF) (Pizzorusso et al., 2000), an important trophic factor in the brain. Both BDNF (Zuccato et al., 2008), and CREB (Obrietan and Hoyt, 2004;Choi et al., 2017) have been directly implicated in HD pathology, while CREB is also believed to play a critical role in dopamine receptor-mediated nuclear signaling (Andersson et al., 2001), and disruption of the protein's function leads to neurodegeneration (Mantamadiotis et al., 2002;Devi and Ohno, 2014). The specific inflammation-related gene sets (HSF1 transcription factor targets, CXCR4, IL12) suggests there is some specificity in the aspects of the pan-neurodegenerative neuroimmune response. Recent studies in both HD and PD have focused on the role of insulin sensitivity and metabolism in patients (Block et al., 2010;Aviles-Olmos et al., 2013;Russo et al., 2013), supporting the role of insulin synthesis as an enriched biological pathway in the common gene list. While the enrichment of apoptosis-related pathways was not surprising, pathways related to angiopoietin, ephrin, and axon guidance suggest that biological processes related to neuronal plasticity are active in both of these diseases and may even indicate that neuroprotective or neuroregenerative processes are a component of the neurodegenerative response.
These data also point to compelling differences between HD and PD. Interestingly, two groups of genes, DNA damage and repair and tRNA related processes, seem to be uniquely perturbed and negatively enriched in PD. The DNA damage and repair gene set enrichment may be a reflection of mitochondrial DNA damage. In PD, dopaminergic neurons of the substantia nigra (though not cortical neurons) were found to be particularly vulnerable to mitochondrial DNA damage (Sanders et al., 2014), and Lewy body pathology, the histological hallmark of PD, is associated with mitochondrial DNA damage (Müller et al., 2013). More generally, mitochondrial DNA damage and oxidative stress are associated with several neurodegenerative diseases including PD, Alzheimer's disease (Moreira et al., 2010), and ALS (Coppedè, 2011). There is evidence supporting the involvement of aminoacyl tRNA synthetases in neurological disease, including ALS, leukoencephalopathy, and PD (Park et al., 2008).
In HD, a number of uniquely perturbed gene sets related to glycan biosynthesis and metabolism are negatively regulated, and these pathways have not been previously implicated in HD. The 1687 HD-unique DE genes are enriched for many gene sets across a broad spectrum of biological processes, including mRNA and protein metabolism, ion channel activity, signaling and kinase activity, apoptosis, immune response, and development. Other, more specific gene sets related to a large number of transcription factors further support the observation of transcriptional dysregulation in HD (Cha, 2000;Labadorf et al., 2015). The specificity of these enriched TF gene sets is quite striking, as the targeted DE genes appear to be largely disjoint between them, suggesting potential, specific causes of the dysregulated transcriptional effects in HD. The enrichment of two miRNA families are also particularly relevant in light of recent reports of miRNA dysregulation in HD (Hoss et al., 2014. It is interesting to note the disparity in enrichment between the HD and PD unique DE genes. Though the numbers of unique DE genes are comparable, the large number of enriched gene sets in HD stands in sharp contrast to the almost total absence of enrichment in PD. This result implies that the DE genes in HD are more consistently related to one another than in PD. One possible, and potentially important explanation for this is that HD is a much more homogeneous disease than PD. It is well established that PD has a significant sporadic component (Lesage and Brice, 2012), caused by a combination of genetic and environmental factors. The relative heterogeneity of PD may make finding consistently effective treatments difficult, and the absence of biological enrichment in specific pathways, other than those common to both diseases, from this analysis may be a reflection of an underlying molecular basis for this effect. It may be that, given sufficient sample size, coherent subgroups of patients may be identified by examining patterns in their gene expression using datasets such as those analyzed here. On the other hand, despite extensive molecular characterization of HD, effective, widely available therapies for HD have remained elusive despite the relative homogeneity of the disease process among HD patients.
Despite the strong literature support of the genes and pathways implicated by this analysis, the findings are purely computational and, therefore, remain hypothetical. Experimentally validating these results poses significant challenges due to the complexity of the pathways implicated in whole tissue from human brain in many respects. First, a fundamental finding of this study is that thousands of genes are implicated between the two diseases, and no one individual gene is immediately apparent as a driver justifying specific follow-up study. Even if an expanded set of genes were chosen while still making wet lab interrogation tractable, the fundamental questions raised by the study would still not be meaningfully addressed. Second, the most compelling result-that these diseases share nearly identical inflammatory signatures-was discovered on the pathway level, not the gene level, suggesting that any functional study would have to modulate entire pathways to make a comparison meaningful. Third, the model systems currently available for HD and PD are very poor with respect to the human disease presentation, in particular to the inflammatory pathways implicated here. For instance, HD mouse models require substantially higher CAG expansions than are seen in humans to show any symptoms, and no mouse studies to date have implicated the strong inflammatory response observed in human tissues by this and other studies. Fourth, when using cell culture systems, since the immune response involves interaction between different cell types, monoculture in vitro systems cannot recapitulate this phenotype. While co-culture of neurons and glia is possible, these systems are not advanced to the point where it is possible to invoke anything like the complex response implicated by the human data. And last, related to this last point, the contribution of age to neuroinflammation is not known, and since these individuals are mostly 50 years old and greater, we would have to find an appropriate model to assess age vs. acute inflammatory vs. chronic inflammatory effects to properly interpret a meaningful experiment. Thus, the available tools as of this study are generally inadequate to meaningfully interrogate these findings in a wet lab setting without substantial improvement and significant effort. The results here are thus proposed as important and potentially fruitful avenues of future study, both in the development of model systems to aid in understanding the mechanism of the neuroinflammatory process in ND and in better understanding the neurodegenerated human brain itself.
The data quality of these datasets has been validated in their original studies. A set of genes across the dynamic range of the sequencing results in 20 of the 29 HD datasets when compared to all 49 C were successfully validated by qPCR . In the PD study, an earlier microarray dataset using many of the same samples was conducted to assess gene expression differences between 19 PD and 24 C and was directly compared to the sequencing data. There was a high level of agreement between the DE genes found in both analyses, which are orthogonal platforms and were generated using different RNA extractions of the same brains (Dumitriu et al., 2012). These previously published results show the validity of the data, which is further enhanced by the agreement between these previous studies and this one, despite using a different analytical methodology.
Nonetheless, these findings have important implications on our understanding of the neurodegenerative disease process. The significant involvement of the inflammatory pathways in both diseases in an area not thought to be directly involved in disease pathogenesis suggests the response to neurodegeneration is widespread throughout the brain. NFkB in particular appears to be a major player, which is well supported in the HD and PD literature. It is unclear whether the neuroinflammatory response is protective, deleterious, or both from these data, but investigation into the role these processes play, and the potential therapeutic value of modulating them, should be made a high priority.

ADDITIONAL INFORMATION
Raw FASTQ reads for these samples are available under GEO accession numbers GSE64810 and GSE68719. The code used to perform the statistical analysis and generate figures is available at https://bitbucket.org/adamlabadorf/hdvpd_mrnaseq.

AUTHOR CONTRIBUTIONS
AL: designed the implementation of the experiments, wrote all analysis code, and prepared the manuscript; SC: provided guidance and critical feedback on the use of Firth logistic regression; RM: conceived of the overall study, made available and processed the samples, and provided critical study and manuscript feedback.
Arizona, the Harvard Brain Tissue Resource Center McLean Hospital, Belmont, Massachusetts, and the Human Brain and Spinal Fluid Resource Center VA, West Los Angeles Healthcare Center, California for providing the brain samples used in these studies.