Pathways to decoding the clinical potential of stress response FOXO-interaction networks for Huntington's disease: of gene prioritization and context dependence

The FOXO family of transcription factors is central to the regulation of organismal longevity and cellular survival. Several studies have indicated that FOXO factors lie at the center of a complex network of upstream pathways, cofactors and downstream targets (FOXO-interaction networks), which may have developmental and post-developmental roles in the regulation of chronic-stress response in normal and diseased cells. Notably, FOXO factors are important for the regulation of proteotoxicity and neuron survival in several models of neurodegenerative disease, suggesting that FOXO-interaction networks may have therapeutic potential. However, the status of FOXO-interaction networks in neurodegenerative disease remains largely unknown. Systems modeling is anticipated to provide a comprehensive assessment of this question. In particular, interrogating the context-dependent variability of FOXO-interaction networks could predict the clinical potential of cellular-stress response genes and aging regulators for tackling brain and peripheral pathology in neurodegenerative disease. Using published transcriptomic data obtained from murine models of Huntington's disease (HD) and post-mortem brains, blood samples and induced-pluripotent-stem cells from HD carriers as a case example, this review briefly highlights how the biological status and clinical potential of FOXO-interaction networks for HD may be decoded by developing network and entropy based feature selection across heterogeneous datasets.


INTRODUCTION
The FOXO family of transcription factors is well-known for its role in regulating longevity as initially uncovered in the nematode C. elegans (Kenyon et al., 1993). This activity of FOXO proteins may hold true in humans since allelic variation in FOXO3A was associated with the ability to be long-lived in several populations of centenarians (Willcox et al., 2008;Anselmi et al., 2009;Flachsbart et al., 2009;Li et al., 2009;Soerensen et al., 2010;Kleindorp et al., 2011). Besides their role as longevity-promoting factors, FOXO proteins are known to regulate a variety of biological processes which are important for development, metabolism and tumor suppression (Calnan and Brunet, 2008). Multiple studies have indicated that FOXO factors may lie at the center of a complex network of upstream pathways such as the PI3K-AKT (insulin/IGF-1 signaling cascade), MST-1, JNK, SIR-2, and AMPK pathways, cofactors such as 14-3-3 proteins and β-catenin and a fairly large number of either established or putative transcriptional targets. This notion has been extensively reviewed in several articles to which the reader is referred for more details (Greer and Brunet, 2005;Calnan and Brunet, 2008;Landis and Murphy, 2010;Yen et al., 2011;Neri, 2012;Eijkelenboom and Burgering, 2013). These studies have emphasized a model in which, through a series of context-dependent post-translational modifications and nucleo-cytoplasmic interactions, FOXO proteins are signal integrators that may be repressed by insulin/IGF-1 signaling and that may function developmentally or post-developmentally to modulate cell cycle arrest, apoptosis, autophagy, angiogenesis, differentiation, stress resistance, stem cell maintenance, glucogenesis, and food intake.
Consistent with their role in protecting from chronic-stress in a variety of cellular contexts, FOXO factors and interactors such as sir-2.1/SIRT1 and β-catenin also regulate cellular proteotoxicity and neuron survival in models for neurodegenerative diseases such as Huntington's disease (HD) (Morley et al., 2002;Parker et al., 2005, 2012;Burnett et al., 2011) and Alzheimer's disease (AD) (Cohen et al., 2006;Kim et al., 2007). The same notion has been exemplified in more generic models of neurodegeneration (Calixto et al., 2012) and models for muscle cell dysfunction in oculo-pharyngeal muscular dystrophy (Catoire et al., 2008;Pasco et al., 2010). Interestingly, the protective effect of reducing the insulin/IGF-1 signaling cascade, which activates FOXO, is conserved from C. elegans to mammals (Cohen et al., 2009;Freude et al., 2009;Killick et al., 2009). FOXO proteins are not the sole proteins that may regulate cellular proteotoxicity by acting downstream of the insulin/IGF-1 signaling cascade, as other transcription factors such as the heat shock factor HSF-1 may also be involved (Cohen et al., 2010;Teixeira-Castro et al., 2011;Zhang et al., 2011;Chiang et al., 2012). Collectively, these observations suggest that the activity of stress-response networks such as FOXO-interaction networks could modify the speed at which the pathogenic process develops in neurodegenerative disease. The FOXO-interaction networks contain several genes that are both potential drug targets (Russ and Lampel, 2005) and evolutionarily conserved, which provides a positive framework for investigating whether these networks might contain targets and markers of interest to tackle neurodegenerative diseases such as HD and AD.

PATHWAYS TO DECODING THE CLINICAL VALUE OF FOXO-INTERACTION NETWORKS
Given the importance of FOXO-interaction networks for the regulation of cellular homeostasis, understanding the cellular, mechanistic and time requirements for these networks to regulate diseased neuron resistance and, possibly, modify the onset and progression of neurodegenerative disease has great therapeutic implications. Insight into this question may be provided by the unbiased analysis of stress-response network activity. One challenge is to identify the FOXO targets that may be involved in neuronal resistance in specific neurodegenerative disease conditions. Another challenge is to define the clinical potential of this information, which may be achieved by means of candidate gene prioritization. The knowledge required to prioritize genes in stress-response networks can be found in molecular profile datasets such as transcriptomic data. As molecular profile datasets are becoming increasingly available for studying neurodegenerative disease, a timely question is whether there is some supporting evidence for specific genes in FOXO-interaction networks to regulate the pathogenic process in neurodegenerative disease. A related question is whether some of these genes could be viewed as "privileged disease targets" whereas other genes might be viewed as "privileged predictors" of disease onset and progression. Here, the analysis of HD datasets may provide some answers.

HUNTINGTON'S DISEASE DATASETS
HD is a dominantly-inherited disease with CAG expansion in the huntingtin (htt) gene and expanded polyglutamine (polyQ) tracts in the htt protein causing striatal and cortical degeneration (Walker, 2007). While HD is inherited, this disease shows a great deal of phenotypic variability and has become a subject of intense research to understand neurodegenerative disease biology due to genetic tractability, large number of models across species, shared disease mechanisms between HD and other neurodegenerative diseases (Zuccato et al., 2010) and availability of well-characterized cohorts of HD subjects (Orth et al., 2011). Among the many genes that could be targeted in HD (Zuccato et al., 2010), those genes which belong to stress response networks are of high interest as they regulate survival mechanisms that may be central to diseased-neuron resistance. Stress response networks encompass a large number of pathways that have been interrogated in genome-wide studies. Most particularly, transcriptomic data have been generated from (1) the striatum of several murine models of HD such as N-terminal htt transgenic mice R6/2 (at 6 and 12 weeks) and D9-N171-98Q (a.k.a. DE5; at 14 months) (Kuhn et al., 2007;Thomas et al., 2011), full length htt transgenic mice YAC128 (at 12 and 24 months) and knock-in mice CHL2 (at 22 months) and HdH(Q92/Q92) (at 18 months) and (2) caudate nucleus and BA4/BA9 cortex from post-mortem HD brains (Hodges et al., 2006), blood samples from pre-symptomatic and symptomatic HD carriers (Borovecki et al., 2005) and HD induced pluripotent stem (iPS) cells that were differentiated into neural stem cell (NSC) lines and that expressed 60 or 180 CAG repeats (2012). This represents a total of 14 HD contexts (seven murine and seven human contexts), all of them assessed on Affymetrix platforms. 
It is important to note that these 14 HD-related studies are heterogeneous in terms of htt gene species, genetic background, cellular/tissular context and pathological stage (Table S1). Additionally, there might be some level of cross-study variability. Nonetheless, these data provide a case example to illustrate how the status and properties of FOXO-interaction networks might be explored in HD in the context of heterogeneous datasets.

FOXO-INTERACTION NETWORKS
Putative targets of mammalian FOXO, namely FOXO3, have been identified in mouse NSCs (Paik et al., 2009;Renault et al., 2009;Ro et al., 2012), which represents a total of 374 genes emphasized by any of these studies. As inferred from the probabilistic functional network STRING (Franceschini et al., 2013), 350 out of 374 mouse FOXO3 targets have a total of 5859 high-confidence (STRING score > 0.4) and first-degree interactors. This analysis results in a FOXO3-interaction network that contains 6209 genes (Figures 1, S1; Table S2) and that shows a large proportion of evolutionarily conserved genes. This network has small-world network characteristics (network in which most genes are not neighbors of one another, but most genes can be reached from every other by a small number of interactions), which may reflect the existence of "functional units" in this network. Several FOXO3 target groups are represented in this network, comprising, for example, genes involved in mTOR signaling, p53 signaling, metabolic pathways, glycolysis, regulation of actin cytoskeleton, cancer pathways, focal adhesion and cell cycle (Figure S1).
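The network construction described above (seed genes plus their first-degree interactors above a score cutoff) can be sketched as follows. This is a minimal illustration only: the function name, the edge-list format and the toy gene names are assumptions made for the example and do not reproduce the actual STRING data or the authors' pipeline.

```python
from collections import defaultdict

def first_degree_network(edges, seeds, min_score=0.4):
    """Return (seeds present in the graph, their first-degree interactors).

    edges: iterable of (gene_a, gene_b, score) tuples with score in [0, 1].
    seeds: iterable of seed gene names (e.g., putative FOXO3 targets).
    Only edges with score > min_score are kept.
    """
    neighbors = defaultdict(set)
    for a, b, score in edges:
        if score > min_score:
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Seeds without any retained interaction are dropped from the network.
    connected_seeds = {s for s in seeds if s in neighbors}
    interactors = set()
    for s in connected_seeds:
        interactors |= neighbors[s]
    # Interactors are counted separately from the seeds themselves.
    interactors -= connected_seeds
    return connected_seeds, interactors

# Toy data (hypothetical gene names and scores, for illustration only).
toy_edges = [
    ("SeedA", "Gene1", 0.9),
    ("SeedA", "Gene2", 0.3),   # below the cutoff, ignored
    ("SeedB", "Gene1", 0.5),
    ("SeedB", "SeedA", 0.8),
    ("Gene3", "Gene4", 0.95),  # not adjacent to any seed
]
seeds_kept, partners = first_degree_network(toy_edges, ["SeedA", "SeedB", "SeedC"])
network = seeds_kept | partners
```

In this sketch the network size is the number of connected seeds plus the number of distinct interactors, which mirrors the arithmetic reported above (350 targets + 5859 interactors = 6209 genes).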
Intersecting this information with the HD microarray data mentioned above and selecting the genes which are instructed by at least 6 contexts within species and 11 contexts across species retains 4436 genes from murine datasets, 4881 genes from human datasets (here the best reciprocal hits as indicated by Ensembl, http://www.ensembl.org/), and 4634 genes from both murine and human datasets. How do these genes behave across HD contexts? This question can be addressed by using entropy based feature selection.
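The coverage criterion above (a gene is retained only if enough contexts provide a measurement for it) can be illustrated with a short sketch. The data layout, the helper names and the toy genes are hypothetical; only the thresholds (6 within species, 11 across species) come from the text.

```python
def genes_with_coverage(status_by_gene, contexts, min_contexts):
    """Keep genes with a measured status in at least min_contexts of the given contexts.

    status_by_gene: dict mapping gene -> {context: status}; a missing key or a
    None value means the gene was not measured in that context.
    """
    kept = []
    for gene, statuses in status_by_gene.items():
        n_measured = sum(1 for c in contexts if statuses.get(c) is not None)
        if n_measured >= min_contexts:
            kept.append(gene)
    return sorted(kept)

# Hypothetical context labels: seven murine and seven human contexts.
mouse = [f"mouse_{i}" for i in range(1, 8)]
human = [f"human_{i}" for i in range(1, 8)]

def measured(contexts):
    """Toy helper: mark a gene as measured (status 0 = no effect) in these contexts."""
    return {c: 0 for c in contexts}

toy = {
    "GeneA": measured(mouse + human),          # 7 mouse + 7 human
    "GeneB": measured(mouse[:6] + human[:4]),  # 6 mouse + 4 human
    "GeneC": measured(mouse[:5] + human),      # 5 mouse + 7 human
}

mouse_genes = genes_with_coverage(toy, mouse, min_contexts=6)
human_genes = genes_with_coverage(toy, human, min_contexts=6)
both_genes = genes_with_coverage(toy, mouse + human, min_contexts=11)
```

With these toy inputs, GeneB passes only the murine filter, whereas GeneC passes the human and cross-species filters, which shows why the three retained gene sets need not be nested.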

ENTROPY BASED FEATURE SELECTION
[Figure caption fragment] ... studies. The confidence score was set at 0.7 to ensure clarity of the graph. Biological content is illustrated in Figure S1.
Entropy is a mathematically defined quantity that helps to account for the flow of information through a biologically regulated process, and this quantity can be defined for individual genes as inferred from the change in status or level of activity across a number of experimental contexts. High gene-entropy values indicate context-dependent activity, and, conversely, low gene-entropy values indicate context-independent activity. In other words, entropy, as per the mRNA-level standard, is a measure of how dependent the variation of gene expression level is on experimental context. As such, entropy allows detecting the genes that may be particularly important for adaptability and homeostasis across cell types and across time. Regarding HD datasets (Table S1), entropy allows detecting the genes that may be particularly important for adaptability and response to mutant htt expression in specific cell types and species, and as pathology develops. A simple entropy analysis (Fuhrman et al., 2000) of HD datasets in which the signal is the change of gene status among three possibilities (up-regulated, down-regulated, no effect) allows genes with low-to-high entropy values to be identified in FOXO-interaction networks (Figure 2; Tables S3-5), which is also true when considering gene subsets such as FOXO3 targets (Paik et al., 2009;Renault et al., 2009;Ro et al., 2012) and potential drug targets (Russ and Lampel, 2005). Permutation analysis and statistical comparisons of gene entropy distributions before and after permutations (Mielke and Berry, 2007) suggest that these observations did not occur by chance (Figure S2). In these networks, FOXO1 and FOXO3 showed moderate to high entropy values across datasets (except, however, for FOXO3 across human datasets). The sirtuin SIRT1 showed high entropy values and sirtuin SIRT3 showed moderate entropy values across the murine and murine/human datasets (Tables S3-5). These observations are consistent with the notion that SIRTs such as SIRT1 and FOXOs such as FOXO3 may be highly sensitive to the context (cell type, species) in which they operate and they support the previously-emphasized importance of these genes in regulating mutant htt cytotoxicity (Parker et al., 2005, 2012;Jeong et al., 2011;Jiang et al., 2011;Fu et al., 2012).
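As a worked illustration of the three-state signal described above, the sketch below computes a Shannon entropy, H = -sum over states s of p(s) log p(s), where p(s) is the fraction of contexts in which a gene shows status s. The logarithm base and any normalization used for the values reported in this article are not stated in the text, so the base is left as a parameter and the default of 2 (bits) is an assumption; the function name and toy status calls are hypothetical.

```python
import math
from collections import Counter

def status_entropy(statuses, base=2.0):
    """Shannon entropy of a gene's status calls across experimental contexts.

    statuses: sequence of labels such as "up", "down" or "none" (no effect),
    one per context. Returns 0.0 when the gene has the same status everywhere.
    """
    counts = Counter(statuses)
    n = len(statuses)
    h = 0.0
    for k in counts.values():
        p = k / n
        h -= p * math.log(p, base)
    return h

# Hypothetical gene: unchanged in seven murine contexts, variable in human ones.
mouse_calls = ["none"] * 7
all_calls = mouse_calls + ["up", "up", "down", "none", "up", "down", "none"]

h_mouse = status_entropy(mouse_calls)  # same status in every context
h_all = status_entropy(all_calls)      # status varies once human contexts are added
```

A gene whose status never changes has zero entropy, and entropy is maximal when the three states are equally frequent. Because the signal is the status call only, a strong and a weak up-regulation contribute identically.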
Overall, it appears that the number of very high entropy (>0.4) genes is much smaller than that of low-to-moderate entropy genes, a trend notably illustrated by gene-entropy distributions across the 14 mouse and human contexts (Figure 2). This suggests that a rather limited proportion of genes in FOXO3-interaction networks might be highly sensitive to change in cellular context. While this might be unexpected considering that FOXO pathways are believed to be highly context-dependent, the genes that have middle to high entropy values (>0.2) are in larger numbers and may also be involved in context dependency. Additionally, the signal analyzed herein is based on the change of gene status, and not the amplitude of this change, which may limit the sensitivity of the analysis.
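The permutation analysis mentioned in this section compares gene-entropy distributions before and after network genes are randomly replaced. A minimal sketch of the two ingredients, random replacement and the two-sample Kolmogorov-Smirnov statistic, is given below. It assumes a simple replacement scheme, toy entropy values and hypothetical gene names, and it computes only the statistic D; P-values such as those shown in Figure S2 would require a statistical library (for example, scipy.stats.ks_2samp).

```python
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def replace_genes(network_genes, background_genes, n_replace, rng):
    """Swap n_replace network genes for random background genes not in the network."""
    network = list(network_genes)
    in_network = set(network)
    pool = [g for g in background_genes if g not in in_network]
    out_idx = rng.sample(range(len(network)), n_replace)
    incoming = rng.sample(pool, n_replace)
    for idx, gene in zip(out_idx, incoming):
        network[idx] = gene
    return network

# Toy entropy values (hypothetical): network genes low, background genes higher.
entropy = {f"net{i}": 0.1 for i in range(20)}
entropy.update({f"bg{i}": 0.5 for i in range(50)})
network = [f"net{i}" for i in range(20)]
background = list(entropy)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
permuted = replace_genes(network, background, n_replace=5, rng=rng)
d = ks_statistic([entropy[g] for g in network], [entropy[g] for g in permuted])
```

In an analysis of this kind, the replacement would be repeated many times for each value of n_replace, and the fraction of significant comparisons tracked as a function of the number of replaced genes.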
What about the biological content of gene-entropy categories? Enrichment analysis using KEGG annotations (Kanehisa et al., 2002) suggests that low-to-high entropy gene categories significantly differ in biological content, a phenomenon that is true for FOXO3 targets as well as larger gene sets and that is observed within and across species (Figures 3, 4). More precisely, gene-entropy categories may have specific KEGG-pathway profiles, and a given gene-entropy category may change biological profile across models of HD, providing a comprehensive view on how cells and tissues might respond to mutant htt expression within and across species. For example, regarding FOXO3 targets (Figure 3; Tables S6-8), the pathway "regulation of actin cytoskeleton" is specific to low-entropy genes in murine striatum datasets whereas it is specific to high-entropy genes in human datasets. This suggests that whereas this pathway may poorly respond to variable contexts such as htt gene species in HD mice, it may greatly respond to variable contexts such as cellular context (e.g., caudate nucleus, cortex, blood, and iPS cells) as contributed by the human datasets. Similarly clear-cut evidence is provided by the KEGG-pathway profiles when considering FOXO3 targets and their high confidence first-degree neighbors (Figures 4, S3-5; Tables S9-11). For example, the annotation of gene-entropy categories corresponding to all datasets (14 conditions) highlights pathways that are more specifically linked to low, middle or high entropy (Figures 4, S6), providing a global signature for the biological significance of gene entropy categories. Interestingly, this signature dramatically changes when considering either murine or human datasets only (Figures S4, S5; Tables S10, S11), illustrating how FOXO3-interaction networks may be sensitive to the HD context(s) in which they operate.
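The text does not specify which statistic was used for the KEGG enrichment analysis. As an illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for over-representation analysis; this is an assumption, not a description of the authors' method, and the function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: number of genes in the background (e.g., the whole interaction network).
    K: number of background genes annotated to the pathway.
    n: number of genes in the entropy category being tested.
    k: number of category genes annotated to the pathway.
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Toy numbers: background of 10 genes, 5 in the pathway, category of 5 genes,
# all 5 of which fall in the pathway. Only one such draw exists out of C(10, 5).
p_toy = hypergeom_upper_tail(k=5, n=5, K=5, N=10)
```

A pathway would then be called enriched in an entropy category when this probability falls below a chosen cutoff (the supplementary figure legends mention P < 0.001), ideally after correction for the number of pathways tested.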
How is this translated at the gene level? If considering a rather homogeneous group of conditions, here murine striatum datasets, it appears that the corresponding FOXO3-interaction network mostly contains low entropy genes, suggesting that only a few genes (medium to high entropy genes) such as, for example, FOXO1 (Table S6) may respond to change of context (e.g., htt gene species, genetic background, pathological stage) across murine models (Figure 5A). In contrast, many of the previously low-entropy genes such as, for example, the PTEN and DUSP6 phosphatases (Tables S6, S8) become middle-to-high entropy genes when adding the context variability associated with the human datasets (Figure 5B). This illustrates how the response of FOXO3-interaction networks to a change of context across HD-related conditions may be precisely mapped at the gene level (see also Figure S6). In summary, the examples provided herein illustrate how entropy based feature selection may allow stress response genes to be finely prioritized in HD. Of note, these examples are biased toward a FOXO3-interaction network that was selected "by hand" for the purpose of illustration. As such, they do not constitute final results nor do they support final conclusions.

GENE PRIORITIZATION USING ENTROPY BASED FEATURE SELECTION
Many pathways may be closely related to the pathogenesis of neurodegenerative diseases such as HD. While hypothesis-driven approaches may allow the pathological or protective role of these pathways to be predicted, it remains unclear how these pathways may behave across multiple experimental models, whether they may be prominently associated with specific contexts and how this may impact gene prioritization. The approach illustrated herein provides a glimpse of this problem. In particular, this approach roughly highlights two categories of genes, namely genes with a rather limited (low entropy genes or "stable genes") or significant (moderate to high entropy genes or "unstable genes") change of status across HD conditions. What is the significance of gene-entropy categories for prioritizing candidate disease targets? Whereas one may propose to employ stable genes as "privileged candidate targets" since there would be little doubt about how to manipulate their activity in order to protect from the disease, others may consider that stable genes are targets of poor interest because they are unlikely to reflect the changes that may occur in specific disease contexts (e.g., cellular contexts, time requirements). These two possibilities are two perspectives on a single property, namely, the ability to change status across contexts. The negative perspective highlights insufficient selectivity in responding to a particular context, which might make stable genes practically useless in discovery efforts aiming at targeting specific aspects of the disease process. However, low entropy across heterogeneous datasets might constitute a preferred criterion for prioritizing candidate biomarkers. The more positive perspective may be based on the consideration that focusing on stable genes may allow a quick identification of "targets" with sufficient information whose therapeutic value can be established in subsequent target validation steps.
In the context of the currently available HD datasets, all of them greatly differing at several levels such as htt gene type, genetic background, cellular context and disease stage, our bias is that genes having moderate to high entropy values may represent candidate targets of higher interest because manipulating their activity could more significantly impact the homeostasis and survival of specific cell types at a given phase (e.g., early vs. late stage) and site (e.g., brain/neuronal vs. peripheral pathology) of the pathogenic process in HD. Considering additional parameters in the target gene prioritization model such as local network entropy, proximity to htt, brain gene expression and time requirement will refine the view on this question. An important aspect of entropy based systems modeling is the context of the experiments used for gene prioritization. As illustrated herein, the nature and diversity of HD-related conditions may impact the gene content and biological profiles of signal entropy categories. Specific experimental models may be responsible for most of the entropy observed (herein, the models based on human samples and cells), especially if the number of models is rather limited, which needs to be accounted for. More broadly, using comparable rather than divergent models of disease may change the interpretation of low-to-high entropy values in terms of clinical potential and the final selection of genes as either promising targets or markers, or both. In theory, if analyzing multiple studies all examining the same context (e.g., same cellular context at the same stage of pathology vs. control), then high entropy could indicate that a gene is unreliable, and middle-to-low entropy might be viewed as a robust criterion for prioritizing candidate targets. In practice, this type of situation is unlikely to be frequently encountered.
A major trend for understanding context dependency in HD is to examine the effects of multiple variables such as expanded-polyQ length, cellular context and pathological stage in individual species and models, then to select variables of interest and compare their effects on gene behavior across species and models. As such, context heterogeneity is a constitutive element of the framework for modeling HD datasets. High entropy is then anticipated to be relevant to several situations in which the comparison of the molecular and biological features that may be recapitulated by individual experimental models (model sensitivity analysis) and the subsequent selection of targets of interest relative to multiple contexts and variables (target prioritization) are two complementary aspects of systems modeling for HD.

PERSPECTIVES
System-level approaches are key to target and marker discovery because these approaches are able to summarize and prioritize the biological information buried in large and complex datasets. Applying system-level approaches to the study of neurodegenerative diseases such as HD is an emerging approach that is anticipated to become widely used in the field as more diverse genome-wide datasets become available. Although developing system-level approaches requires specific skills in software computing and mathematics, databases and knowledge discovery platforms may become available and facilitate the access to these approaches. There is an increasingly large repertoire of methods for the interrogation of complex datasets and development of systems modeling in disease research, ranging from bionetwork mapping (Rapaport et al., 2007;Lejeune et al., 2012) to reverse engineering of gene networks (Lefebvre et al., 2012) and analysis of network rewiring response and differential entropy properties (Bandyopadhyay et al., 2010;Califano, 2011;Shou et al., 2011;West et al., 2012). Regardless of how disease-associated networks are generated, understanding how these networks may differ and how they may reconfigure activity as a function of the context in which they operate is essential to gene prioritization. In this respect, entropy-based approaches may capture gene essentiality changes across multiple conditions that encompass several species and models. This is particularly attractive for studying the dynamics of chronic-stress response networks that, similarly to FOXO-interaction networks, may be strongly dependent on the context in which they operate. Entropy based feature selection may indeed help in understanding how chronic-stress response networks allow specific cell types to maintain function and resist degeneration, and how cellular resistance may develop over time.
This knowledge may in turn help unravel the clinical potential of these networks for neurodegenerative diseases such as HD, which might foster the identification of successful disease-modifying strategies.
It is remarkable that longevity-promoting factors such as FOXO proteins may have a role in development and phenotypic plasticity (De La Torre-Ubieta et al., 2010;Christensen et al., 2011;Tang et al., 2011;Mei et al., 2012;Salih et al., 2012). FOXO proteins may thus be important throughout the entire lifetime of an individual. This raises the possibility that FOXO proteins regulate developmental, post-developmental and late determinants of the pathogenic process in HD. Today, data are relatively scarce to study context dependency in neurodegenerative disease. However, molecular profile data allowing context dependency to be examined in a deeper manner are becoming increasingly available, and their analysis using systems modeling is expected to tell us more about the clinical potential of stress response genes in HD and, perhaps, other neurodegenerative diseases.

ACKNOWLEDGMENTS
Frédéric Parmentier is supported by the Agence Nationale de la Recherche et de la Technologie (ANRT) and GlaxoSmithKline, France. François-Xavier Lejeune is supported by the CHDI Foundation (USA). This work was supported by Inserm and the Agence Nationale de la Recherche (ANR 08-MNPS-024-01), Paris, France, and by the European Huntington Disease Network (EHDN, Germany) and the Hereditary Disease Foundation (USA).

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Aging_Neuroscience/10.3389/fnagi.2013.00022/abstract

[Figure caption fragment] ... and their first neighbors were randomly replaced by genes that belong to the STRING network and that are instructed by at least 6 HD-related conditions in either the mouse or human datasets, and this was performed thousands of times. The Kolmogorov-Smirnov test was then used to compare the distributions of entropy values. Differences between distributions before and after n permutations (Y axis) are shown for P < 0.05 (red curves) and P < 0.01 (blue curves). One hundred percent of the shuffled differences are statistically significant for replacement of less than 80-100 (mouse datasets) to 60-70 (human datasets) genes, suggesting that FOXO-interaction networks have specific entropy features. The number of permuted genes required to significantly alter the initial distributions of entropy values was greater for the mouse datasets compared to the human datasets, which reflects stronger heterogeneity in human datasets.

Figure S3 | Heat-map representing the biological content for low to high entropy genes in the FOXO3-interaction network and across seven mouse models (striatum) of HD. Biological content is inferred from enrichment in KEGG pathways (P < 0.001).

Figure S4 | Heat-map representing the biological content for low to high entropy genes in the FOXO3-interaction network and across seven human HD (post-mortem caudate nucleus and cortex, blood samples from pre- and post-symptomatic HD subjects, iPS cells) datasets. Biological content is inferred from enrichment in KEGG pathways (P < 0.001).