# SYSTEMS BIOLOGY AND THE CHALLENGE OF DECIPHERING THE METABOLIC MECHANISMS UNDERLYING CANCER

EDITED BY: Osbaldo Resendis-Antonio and Christian Diener

PUBLISHED IN: Frontiers in Physiology and Frontiers in Cell and Developmental Biology

#### *Frontiers Copyright Statement*

*© Copyright 2007-2017 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

> *The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-333-7 DOI 10.3389/978-2-88945-333-7

## **SYSTEMS BIOLOGY AND THE CHALLENGE OF DECIPHERING THE METABOLIC MECHANISMS UNDERLYING CANCER**

Topic Editors:

**Osbaldo Resendis-Antonio,** National Autonomous University of Mexico (UNAM), Instituto Nacional de Medicina Genomica (INMEGEN), Mexico

**Christian Diener,** Instituto Nacional de Medicina Genomica (INMEGEN), Mexico

Since the discovery of the Warburg effect in the 1920s, cancer has been tightly associated with the genetic and metabolic state of the cell. One of the hallmarks of cancer is the alteration of cellular metabolism in order to promote proliferation and undermine cellular defense mechanisms such as apoptosis or detection by the immune system. However, the strategies by which this is achieved in different cancers, and sometimes even in different patients with the same cancer, are very heterogeneous, which hinders the design of general treatment options.

Recently, there has been an ongoing effort to study this phenomenon on a genomic scale in order to understand the causality underlying the disease. Current "omics" technologies have helped to identify and monitor biological components at different levels, such as genes, proteins or metabolites. These technological capacities have provided us with vast amounts of clinical data in which a single patient may give rise to several tissue samples, each characterized in detail by genome-scale data on the sequence, expression, proteome and metabolome level. Data with such detail pose the immediate problem of extracting meaningful interpretations and translating them into specific treatment options. To this end, Systems Biology provides a set of promising computational tools to decipher the mechanisms that drive a healthy cell's metabolism into a cancerous one. However, this enterprise requires bridging the gap between large data resources, mathematical analysis and modeling specifically designed to work with the available data. This is by no means trivial and requires high levels of communication and adaptation between the experimental and theoretical sides of research.

**Citation:** Resendis-Antonio, O., Diener, C., eds. (2017). Systems Biology and the Challenge of Deciphering the Metabolic Mechanisms Underlying Cancer. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-333-7

# Table of Contents

### **1. Editorial**

*Editorial: Systems Biology and the Challenge of Deciphering the Metabolic Mechanisms Underlying Cancer*

Osbaldo Resendis-Antonio and Christian Diener

### **2. Systems Biology and the Metabolic Pathways Involved in Cancer**

*High Concentrations of H2O2 Make Aerobic Glycolysis Energetically More Favorable for Cellular Respiration*

Hamid R. Molavian, Mohammad Kohandel and Sivabal Sivaloganathan

Mahua Roy and Stacey D. Finley

*Modeling the Pro-inflammatory Tumor Microenvironment in Acute Lymphoblastic Leukemia Predicts a Breakdown of Hematopoietic-Mesenchymal Communication Networks*

Jennifer Enciso, Hector Mayani, Luis Mendoza and Rosana Pelayo

### **3. Integrating Diverse Data to Understand the Cancer Phenotype**

*Metabolomics of Head and Neck Cancer: A Mini-Review*

Jae M. Shin, Pachiyappan Kamarajan, J. Christopher Fenno, Alexander H. Rickard and Yvonne L. Kapila

*Host-Microbiome Interaction and Cancer: Potential Application in Precision Medicine*

Alejandra V. Contreras, Benjamin Cocom-Chan, Georgina Hernandez-Montes, Tobias Portillo-Bobadilla and Osbaldo Resendis-Antonio

*Personalized Prediction of Proliferation Rates and Metabolic Liabilities in Cancer Biopsies*

Christian Diener and Osbaldo Resendis-Antonio

### **4. The Interplay Between Cancer and Cellular Physiology**

*Cancer Clocks Out for Lunch: Disruption of Circadian Rhythm and Metabolic Oscillation in Cancer*

Brian J. Altman

*"Gestaltomics": Systems Biology Schemes for the Study of Neuropsychiatric Diseases*

Nora A. Gutierrez Najera, Osbaldo Resendis-Antonio and Humberto Nicolini

# Editorial: Systems Biology and the Challenge of Deciphering the Metabolic Mechanisms Underlying Cancer

#### Osbaldo Resendis-Antonio<sup>1,2</sup>\* and Christian Diener<sup>2</sup>

<sup>1</sup> Coordinación de la Investigación Científica, Red de Apoyo a la Investigación, UNAM, Mexico City, Mexico, <sup>2</sup> Human Systems Biology Laboratory, INMEGEN, Mexico City, Mexico

Keywords: systems biology, cancer, modeling and simulation, metabolic modeling, dynamical systems

#### **Editorial on the Research Topic**

#### **Systems Biology and the challenge of deciphering the metabolic mechanisms underlying cancer**

Cancer is one of the major causes of mortality worldwide. One of the particular challenges in battling the disease is that cancers manifest in many different forms, each with its own specific genotype and phenotype. Specifically, there is a large variety of genetic and metabolic strategies that cancers employ in order to ensure proliferation, metastasis and escape from the immune system of the host. Understanding this mixture of common and specific alterations poses a particular challenge for science, as there are many scales on which to study the disease, ranging from metabolic mechanisms common to all cancers to patient-specific alterations affecting treatment. The intrinsic complexity is astonishing, and in order to defeat the disease we still have a long road to travel. With this purpose in mind, it is crucial to propose new quantitative schemes to gain a better understanding of the mechanisms that underlie modern cancer treatments. Here, Systems Biology approaches have the potential to characterize the metabolic and regulatory mechanisms that support the cancer phenotype and may provide new hypotheses that can cut down the malignant phenotype in clinical treatments. In this context, the works presented in this Research Topic and EBook paint a representative picture of the research landscape and address cancer at various levels of detail and mechanism.

One of the most studied alterations in cancer is the Warburg effect, a switch toward aerobic glycolysis, marked by lactate secretion and a decreased entry of glycolytic intermediates into the citric acid cycle even with an excess of oxygen. Using kinetic models of glycolysis, Molavian et al. show how large amounts of oxidative stress byproducts make aerobic glycolysis favorable, and Marín-Hernández et al. identify efficient knockout strategies for the increased aerobic glycolysis of cancer cells. Addressing the question of which regulatory events may induce the Warburg effect, Beltran-Anaya et al. review the impact of non-coding RNAs on glycolysis and related pathways. Those studies are complemented by a set of cancer type-specific works in which the inclusion of specific metabolic pathways and transcriptional regulation gives additional perspectives. Roy and Finley use a detailed kinetic model of glycolysis, glutaminolysis, the tricarboxylic acid cycle and the pentose phosphate pathway in KRAS-mediated pancreatic cancer in order to reproduce known large-scale knockdown experiments and suggest novel targets to combat pancreatic cancer, and Enciso et al. employ a Boolean model to describe a loss of intercellular communication in the molecular regulatory network involved in the development of Acute Lymphoblastic Leukemia.

## Edited and reviewed by:

Raina Robeva, Sweet Briar College, United States

> \*Correspondence: Osbaldo Resendis-Antonio oresendis@inmegen.gob.mx

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 06 June 2017 Accepted: 11 July 2017 Published: 28 July 2017

#### Citation:

Resendis-Antonio O and Diener C (2017) Editorial: Systems Biology and the Challenge of Deciphering the Metabolic Mechanisms Underlying Cancer. Front. Physiol. 8:537. doi: 10.3389/fphys.2017.00537

One of the challenges in studying a disease as diverse as cancer is the integration of novel large-scale data in order to increase our understanding of the etiology and progression of the disease, identify proper biomarkers for diagnosis, and promote computational models with higher capacities to predict clinical outcomes in personalized medicine. In this context, Shin et al. review how metabolomics data have helped to characterize particular metabolic mechanisms in the development of head and neck cancers, whereas Contreras et al. review the interplay between the microbiota profile and the development of several subtypes of cancer. In our own study (Diener and Resendis-Antonio) we show how large-scale genomic data sets from cell lines and cancer biopsies can be combined in order to improve knowledge about the phenotype of individual biopsies and how this strategy can unravel individual metabolic alterations in a personalized manner.

Additionally, it is important to note that many local alterations in cancer cells take place in tight interplay with the microenvironment and many complex regulatory programs in the surrounding tissues. Thus, one has to be aware that human diseases are not isolated units and one disease can promote or coexist with other diseases in the organism. Altman reviews the interplay between cancer and the circadian cycle in affected cells and its importance in studying metabolic alterations in cancer, whereas Gutierrez Najera et al. suggest a standardization of the phenotypes in neuropsychiatric diseases and their interplay with other diseases such as cancer.

In total, the presented works span a wide variety of approaches to study the metabolic alterations in cancer, showing how methods from Systems Biology can be used to formulate more stringent hypotheses about the alterations causing cancer and their potential remedies. Furthermore, there is clear agreement that any metabolic modeling approach has to be combined with experimental data obtained from various sources. This opens up great possibilities but also poses challenges for the coming years. As more and larger data sets become available, sometimes spanning hundreds of thousands of samples, there is a large demand for strategies that can turn them into knowledge and optimized treatment suggestions. Systems Biology will play a large role in addressing those challenges and will have to find novel approaches in order to extend its applicability from general to specific models that may address metabolic alterations in a patient- or sample-specific manner. Finally, we perceive that the near future may bring a breakthrough to the healthcare sectors by combining new high-throughput technologies, Bioinformatics and Systems Biology to implement Leroy Hood's and Stephen H. Friend's proposal for a precision medicine capable of being predictive, personalized, preventive and participatory.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### ACKNOWLEDGMENTS

The authors would like to acknowledge the financial support coming from an internal grant of the INMEGEN, Mexico. In addition, we appreciate the help of the Frontiers in Physiology editing staff during the preparation of the Research Topic.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Resendis-Antonio and Diener. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# High Concentrations of H2O2 Make Aerobic Glycolysis Energetically More Favorable for Cellular Respiration

Hamid R. Molavian, Mohammad Kohandel and Sivabal Sivaloganathan\*

Department of Applied Mathematics, University of Waterloo, Waterloo, ON, Canada

Since the original observation of the Warburg Effect in cancer cells, over 8 decades ago, the major question of why aerobic glycolysis is favored over oxidative phosphorylation has remained unresolved. An understanding of this phenomenon may well be the key to the development of more effective cancer therapies. In this paper, we use a semi-empirical method to shed light on this puzzle. We show that aerobic glycolysis is in fact energetically more favorable than oxidative phosphorylation for concentrations of peroxide (H2O2) above some critical threshold value. The fundamental reason for this is the activation and high engagement of the pentose phosphate pathway (PPP) in response to the production of the reactive oxygen species (ROS) H2O2 by mitochondria and the high concentration of H2O2 (produced by mitochondria and other sources). This makes oxidative phosphorylation an inefficient source of energy since it leads (despite high levels of ATP production) to a concomitant high energy consumption in order to respond to the hazardous waste products resulting from cellular processes associated with this metabolic pathway. We also demonstrate that the high concentration of H2O2 results in increased glucose consumption, and also increases lactate production in the case of glycolysis.

Keywords: cancer cell metabolism, warburg effect, glycolysis, oxidative phosphorylation, pentose phosphate pathway, reactive oxygen species

### INTRODUCTION

Increased aerobic glycolysis (the Warburg Effect) in proliferating cancer cells has been a perplexing puzzle that has remained unresolved for more than 80 years (Warburg, 1930, 1956; Gatenby and Gillies, 2004; Vander Heiden et al., 2009; Cairns et al., 2011; Schulze and Harris, 2012). The observation that cancerous cells are dominated by aerobic glycolysis is confounded by the fact that this metabolism produces far less energy than oxidative phosphorylation: glycolysis generates 2 ATP from one molecule of glucose, whereas oxidative phosphorylation generates 36 ATP (in the ideal case; Warburg, 1930, 1956). Initially, it was conjectured that defects in mitochondria might be the main reason for the increased aerobic glycolysis (Gatenby and Gillies, 2004; Vander Heiden et al., 2009; Cairns et al., 2011; Schulze and Harris, 2012), but successive experimental investigations have failed to confirm this scenario (Warburg, 1930, 1956).

#### Edited by:

Osbaldo Resendis-Antonio, Instituto Nacional de Medicina Genómica, Mexico

#### Reviewed by:

Malkhey Verma, University of Manchester, UK Logan Ganzen, Purdue University, USA

> \*Correspondence: Sivabal Sivaloganathan ssivalog@uwaterloo.ca

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 25 March 2016 Accepted: 08 August 2016 Published: 23 August 2016

#### Citation:

Molavian HR, Kohandel M and Sivaloganathan S (2016) High Concentrations of H2O2 Make Aerobic Glycolysis Energetically More Favorable for Cellular Respiration. Front. Physiol. 7:362. doi: 10.3389/fphys.2016.00362

Mitochondria produce the reactive oxygen species (ROS) H2O2 in non-cancerous and cancerous cells during oxidative phosphorylation (Turrens, 2003). Moreover, in cancerous cells the concentration of H2O2 is further enhanced by its production through tumor suppressor and oncogenic agents (Szatrowski and Nathan, 1991; Vafa et al., 2002; Turrens, 2003; Sablina et al., 2005; Nogueira et al., 2008; Bensaad et al., 2009). The accumulation of H2O2 results in a toxic environment for cell compartments; moreover, mitochondria, as a source of H2O2, are much more vulnerable to H2O2, and as a result the development of conducive intracellular conditions can trigger tumor necrosis factors (TNF; Comporti, 1987; Schulze-Osthoff et al., 1993). As a defense mechanism, mitochondria import reduced glutathione (GSH), which is produced mainly through the activation of the pentose phosphate pathway (PPP) in the cytoplasm, to detoxify the H2O2 (Deneke and Fanburg, 1989; Fernandez-Checa et al., 1997; Anastasiou et al., 2011). The removal of ROS is critical for cell survival since, under high concentrations of H2O2, cell metabolism pathways are shut down in order to drive the flow of glucose to the PPP and thus produce enough GSH to detoxify the H2O2 (Anastasiou et al., 2011). However, the activation and maintenance of the PPP requires ATP; hence an active, ramped-up, productive cell metabolism is needed in order to produce more ATP when the PPP is highly activated.

To understand the mechanism behind aerobic glycolysis and the role of ROS, we consider the following major metabolic and detoxification pathways: oxidative phosphorylation, glycolysis, and the PPP. We assume that ATP, GSH, and H2O2 are the major players in the cell metabolism dynamics and the three chemical reactions involved are therefore (Supplementary Information),

$$\text{Glucose} + 6\,\text{O}_2 \rightarrow 6\,\text{CO}_2 + 6\,\text{H}_2\text{O} \quad (\text{energy} = 36\ \text{ATP}) \quad \text{Respiration} \tag{1}$$

$$\text{Glucose} \rightarrow 2\,\text{Lactate}^{-} + 2\,\text{H}^{+} \quad (\text{energy} = 2\ \text{ATP}) \quad \text{Glycolysis} \tag{2}$$

$$\text{Glucose} + \text{ATP} + \text{H}_2\text{O} + 2\,\text{GSH} \rightarrow \text{R5P} + 4\,\text{GSH} + \text{CO}_2 + \text{ADP} \quad \text{Detox} \tag{3}$$

In reaction (3), R5P may be used for synthesis of nucleotides and nucleic acids, which are necessary for cell proliferation, and GSH is used in the following reaction to detoxify H2O2:

$$\text{H}_2\text{O}_2 + 2\,\text{GSH} \xrightarrow{\text{GPx}} \text{GSSG} + 2\,\text{H}_2\text{O} \tag{4}$$

This reaction involves intermediate steps in which GPxr interacts directly with H2O2 and GSH is a co-factor which produces GPxr (Supplementary Information). The concentrations of GSH and GPxr in cells are about 0.1–7 mM (Deneke and Fanburg, 1989; Li et al., 2000; Ng et al., 2007; Anastasiou et al., 2011) and 10 nM–5 µM (Antunes and Cadenas, 2001; Stone, 2004), respectively; hence it seems that there is always enough GSH to detoxify H2O2. However, at high concentration levels, H2O2 primarily modulates the concentration levels of GSH. Cells respond to these elevated levels by producing more GSH to detoxify the accumulated H2O2 (Bellomo et al., 1992). Therefore, at high concentrations of H2O2, the production rate of GSH depends in turn on the concentration level of H2O2 (Li et al., 2000; Ng et al., 2007). A good indication of this behavior is the full diversion of glucose flux into the PPP when cells are contaminated with high concentrations of H2O2 (Fernandez-Checa et al., 1997). Increasing the concentration of H2O2 further eventually results in cell death. Thus, we can define an upper limit for the concentration of H2O2 above which cells go through apoptosis, and we call this the cell sensitivity concentration (CSC) level. For concentrations of H2O2 much lower than the CSC, the production rate of GSH is very low. As the concentration of H2O2 becomes comparable to the CSC, cells start to activate the PPP to produce more GSH. The H2O2 produced by mitochondria is a major player in this response, since mitochondria are at the center of H2O2 production and, if functional, can activate TNF. At high concentrations of H2O2, the H2O2 produced by mitochondria does not diffuse into the cell cytoplasm and accumulates instead around the mitochondria which, as a result, activate TNF.

We include these observations in the form $P_{GSH} = \beta P_{ROS}^{mt} + \gamma P_{ROS}^{ext}$, where $P_{GSH}$, $P_{ROS}^{mt}$, and $P_{ROS}^{ext}$ are respectively the production rates of GSH, of H2O2 by mitochondria and of H2O2 by external sources, and β and γ are functions of the concentration of GSH and of the difference between the CSC ($C_0$) and the concentration of H2O2 ($C_{ROS}$). We choose β and γ to be different since, in general, the response of mitochondria to the accumulation of H2O2 differs from that of the rest of the cell. β and γ are very small for low concentrations of H2O2 and start to increase as the concentration of H2O2 increases. Close to $C_0$ they become very large in order to drive most of the consumed glucose to the PPP. Since mitochondria can activate TNF in their oxidative phosphorylation state (Schulze-Osthoff et al., 1993), there is a stronger response at high concentrations of H2O2; hence β should be significantly larger than γ.

We assume that both oxidative phosphorylation and glycolysis are activated to produce energy for cell needs and for the PPP, and investigate the production rate of ATP in the presence of H2O2. The net production of ATP is the sum of the production of ATP by oxidative phosphorylation and glycolysis minus the consumption of ATP by the PPP (which primarily detoxifies the H2O2 generated by mitochondria and other sources). We obtain the production and consumption as functions of the oxygen and glucose consumption and GSH production ($P_{GSH} = \beta P_{ROS}^{mt} + \gamma P_{ROS}^{ext}$). Using $P_{ROS}^{mt} = \alpha q_O$, where α is the fraction of oxygen consumption ($q_O$) converted to H2O2 by mitochondria (about 1/100–2/100; Turrens, 2003), we have (Supplementary Information):

$$P_{ATP} = \frac{\left(\frac{17}{3} - \frac{3}{4}\alpha\beta\right)\left(q_G - \frac{\gamma}{4}P_{ROS}^{ext}\right)}{1 + \frac{r\alpha\beta}{4}}\,r + 2\,q_G - \frac{3\gamma P_{ROS}^{ext}}{4} \tag{5}$$

where $q_G$ is the total consumption of glucose and r is the ratio between oxygen and metabolic glucose consumptions. We now consider the case when the net production rate of ATP exactly balances the energy requirements of the cell and derive the following result for the consumption of glucose in terms of the concentration and production rate of H2O2 (Supplementary Information):

$$q_G^{SS} = \frac{(9+17r)\,\gamma}{24+68r-3r\alpha\beta}\,P_{ROS}^{ext} + \frac{12+3r\alpha\beta}{24+68r-3r\alpha\beta}\,q_{ATP}^{cell} \tag{6}$$

where $q_{ATP}^{cell}$ and $q_G^{SS}$ are respectively the ATP and glucose consumption by the cell in the equilibrium state. Also, the total amount of lactate production for the case of pure glycolysis (r = 0) reads:

$$P_{Lact}^{r=0} = q_{ATP}^{cell} + \frac{\gamma}{4} P_{ROS}^{ext} \tag{7}$$
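A short bookkeeping argument reproduces the three expressions above. It is only a sketch under assumptions read off reactions (1) to (3), since the full derivation is deferred to the Supplementary Information: each glucose sent to the PPP costs 1 ATP and delivers 4 GSH, a fraction $r/6$ of the remaining (metabolic) glucose $q_M$ is respired at 36 ATP per glucose, and the rest is fermented at 2 ATP per glucose. The symbol $q_M$ is introduced here for illustration, and the coefficient of the external term is written γ, following the definition of $P_{GSH}$.

$$
\begin{aligned}
q_G &= q_M + \tfrac{1}{4}P_{GSH}, \qquad P_{GSH} = \alpha\beta\, r\, q_M + \gamma P_{ROS}^{ext}
\;\;\Rightarrow\;\;
q_M = \frac{q_G - \frac{\gamma}{4}P_{ROS}^{ext}}{1 + \frac{r\alpha\beta}{4}}, \\
P_{ATP} &= \Bigl[36\,\tfrac{r}{6} + 2\bigl(1 - \tfrac{r}{6}\bigr)\Bigr] q_M - \tfrac{1}{4}P_{GSH}
= \Bigl(2 + \tfrac{17}{3}r - \tfrac{1}{4}r\alpha\beta\Bigr) q_M - \tfrac{\gamma}{4}P_{ROS}^{ext} \\
&= \Bigl(\tfrac{17}{3} - \tfrac{3}{4}\alpha\beta\Bigr) r\, q_M + 2\,q_G - \tfrac{3}{4}\gamma P_{ROS}^{ext},
\end{aligned}
$$

where the last line uses $2q_M = 2q_G - \tfrac{1}{2}P_{GSH}$. Setting $P_{ATP} = q_{ATP}^{cell}$ and solving for $q_G$ gives the steady-state glucose consumption, and at $r = 0$ the lactate production is $2q_M = q_{ATP}^{cell} + \tfrac{\gamma}{4}P_{ROS}^{ext}$.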

We first consider the special case of zero external production and low concentrations of H2O2 (αβ is very small). In this case, and for a purely glycolytic metabolism (r = 0), $P_{ATP} = 2q_G$, which implies that 2 ATP are produced for each molecule of glucose; this is the well-known case of pure glycolysis. In the case of dominant respiration (r = 6), $P_{ATP} = 36q_G$, which is again the net energy production rate for a cell under oxidative phosphorylation. A simple comparison of 2 versus 36 ATP leads to the superficial, albeit prominent, conclusion that, for the case of low H2O2 concentrations, oxidative phosphorylation is the most efficient metabolism.
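The two limiting cases can be checked numerically. The following is a minimal Python sketch of Equation (5); the function and argument names are illustrative assumptions rather than anything defined in the paper, and the coefficient of the external term is written `gamma`, following the definition of the GSH production rate.

```python
def atp_production(q_g, r, alpha_beta, gamma=0.0, p_ext=0.0):
    """Net ATP production rate, following Equation (5).

    q_g         total glucose consumption rate
    r           oxygen consumed per unit of metabolic glucose
                (0 = pure glycolysis, 6 = full respiration)
    alpha_beta  the product alpha * beta
    gamma       GSH response coefficient for externally produced H2O2
    p_ext       external H2O2 production rate
    """
    # Glucose left for energy metabolism after the PPP has taken its share.
    metabolic_glucose = (q_g - gamma * p_ext / 4.0) / (1.0 + r * alpha_beta / 4.0)
    return ((17.0 / 3.0 - 0.75 * alpha_beta) * metabolic_glucose * r
            + 2.0 * q_g
            - 0.75 * gamma * p_ext)


if __name__ == "__main__":
    # Low-H2O2 limit with no external source: alpha_beta -> 0, p_ext = 0.
    print("pure glycolysis, ATP per glucose:", atp_production(1.0, 0.0, 0.0))
    print("full respiration, ATP per glucose:", atp_production(1.0, 6.0, 0.0))
```

With `alpha_beta = 0` and no external source, the function reduces to `2 * q_g` at `r = 0` and to `36 * q_g` at `r = 6`, up to floating-point rounding.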

We now search for the most efficient metabolism in the presence of H2O2 by finding the maximum production rate of ATP ($P_{ATP}$) as a function of r. For $\alpha\beta < \alpha_T$, where $\alpha_T = 68/9$, the maximum production of ATP arises at r = 6. This is in agreement with the well-known fact that oxidative phosphorylation is the most efficient way for ATP production. However, rather interestingly, for $\alpha\beta > \alpha_T$ the maximum production rate of ATP occurs at r = 0, and the transition point is independent of $P_{ROS}^{ext}$. This implies that for some concentrations of H2O2, glycolysis is in fact energetically more efficient than oxidative phosphorylation, which directly contradicts current prevailing explanations. To illustrate this metabolic transition in terms of the concentration of H2O2 we choose two functions, $\beta = \frac{100}{1 - C_{ROS}/C_0}$ and $\gamma = \frac{10}{1 - C_{ROS}/C_0}$, and substitute these into Equation (5). These functions mimic the real behavior of the system in which the production rate of GSH increases as the concentration of H2O2 increases. The coefficient β is chosen to be larger because (a) TNF is activated by mitochondria and (b) at high concentrations of H2O2, the H2O2 produced cannot diffuse into the cell and accumulates in the vicinity of the mitochondria, which occupy a much smaller space within the cell. In **Figure 1** we plot the normalized $P_{ATP}$ ($P_{ATP}$ is normalized to be one for any given concentration of ROS) as a function of $C_{ROS}$ and r. For $C_{ROS} < 0.87C_0$, which corresponds to $\alpha\beta < \alpha_T$, the maximum production rate arises at r = 6, and for $C_{ROS} > 0.87C_0$, which corresponds to $\alpha\beta > \alpha_T$, it transitions to pure glycolysis (r = 0).
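The location of the transition can be reproduced with a few lines of Python. This is a sketch under stated assumptions: α = 0.01 is chosen here because it lies in the quoted range of 1/100 to 2/100 and reproduces the 0.87 value, β takes the illustrative form 100/(1 − C_ROS/C_0), there is no external H2O2 source, and all function names are hypothetical.

```python
ALPHA_T = 68.0 / 9.0  # transition value of alpha * beta quoted in the text
ALPHA = 0.01          # assumed mitochondrial H2O2 fraction (1/100)


def beta(c_ratio):
    """Illustrative response function, with c_ratio = C_ROS / C_0 < 1."""
    return 100.0 / (1.0 - c_ratio)


def atp_per_glucose(r, alpha_beta):
    """Equation (5) divided by q_G, with no external H2O2 (P_ext = 0)."""
    return (17.0 / 3.0 - 0.75 * alpha_beta) * r / (1.0 + r * alpha_beta / 4.0) + 2.0


def best_ratio(alpha_beta, steps=600):
    """Scan r on a uniform grid over [0, 6] and return the ATP-maximizing value."""
    grid = [6.0 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda r: atp_per_glucose(r, alpha_beta))


def critical_ros_ratio(alpha):
    """C_ROS / C_0 at which alpha * beta equals ALPHA_T for the beta above."""
    return 1.0 - 100.0 * alpha / ALPHA_T


if __name__ == "__main__":
    print("critical C_ROS / C_0:", round(critical_ros_ratio(ALPHA), 4))
    for c_ratio in (0.50, 0.80, 0.90, 0.95):
        print(c_ratio, "-> best r =", best_ratio(ALPHA * beta(c_ratio)))
```

Because ATP per glucose is a ratio of two linear functions of r, it is monotonic on [0, 6], so the grid scan can only select one of the two endpoints; which endpoint wins flips when αβ crosses 68/9.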

As a first step to better understand the nature of the transition from oxidative phosphorylation to glycolysis, we note that the inefficiency observed in oxidative phosphorylation is fundamentally due to the fact that, for some concentrations of H2O2, a cell must expend most of its produced ATP in detoxifying the H2O2 generated by its own mitochondria.

This is supported by the fact that for α = 0 (i.e., no production of H2O2 by mitochondria), respiration is the more efficient metabolism for any concentration of H2O2. In contrast, for α ≠ 0, no matter how small α is, the shift in metabolism (from respiration to glycolysis) occurs when concentrations of H2O2 exceed some critical threshold value. At these concentration levels of H2O2, the net energy produced by oxidative phosphorylation for one molecule of glucose is less than that produced through glycolysis. To gain a more quantitative understanding of this phenomenon, we first observe that α is a property of mitochondrial efficiency. It can safely be assumed that this remains constant, independent of H2O2. Meanwhile, β changes through either an increase in the concentration levels of H2O2 or a decrease in the concentration levels of GSH, and thus results in a crossing of the transition point to glycolysis. In non-cancerous cells and under normal conditions, H2O2 is mainly produced by mitochondria and diffuses through the cell. In this case, the low production of GSH and other antioxidants is sufficient to detoxify the H2O2; hence normal cells function in the $\alpha\beta < \alpha_T$ regime, in which oxidative phosphorylation is the most efficient mechanism for ATP production. However, in proliferating cancer cells, H2O2 is produced by growth factors and by mitochondria that work at higher rates to compensate for the increased energy needs of proliferating cells. Hence, the concentrations of H2O2 are much higher and this can push the cell into the regime $\alpha\beta > \alpha_T$, in which glycolysis is the more efficient metabolism.

In **Figure 2** we plot the glucose consumption as a function of $C_{ROS}$ for the two cases r = 0 and r = 6 under constant cell needs. These plots show the significant increase in glucose consumption with the concentration of H2O2. Therefore, the H2O2 generated by growth factors and other sources is one of the major reasons for the increase in glucose consumption. Also, the two plots intersect at $C_{ROS} = 0.87C_0$, which means that beyond this point, for a given cell requirement, the expenditure of glucose through oxidative phosphorylation exceeds the corresponding expenditure when the cell has shifted to a glycolytic metabolism; this is due to the production of H2O2 by mitochondria.

In **Figure 3** we plot the lactate production for r = 0 as a function of CROS, assuming that q<sup>cell</sup><sub>ATP</sub> remains constant. An increased concentration of H2O2 leads to enhanced lactate production. This suggests that the observed high lactate production in cancer cells does not occur solely because of the cell needs, but may also be related to the increase in the concentration of H2O2. In the inset of **Figure 3** we plot the ratio between lactate production and glucose consumption against the ROS concentration. This figure demonstrates that the larger portion of glucose is consumed by the cell metabolism; however, the share consumed by the PPP increases as the ROS concentration rises.

Notice that the presented results are qualitatively independent of the form of β and γ: they can be derived for any β and γ as long as these increase with increasing H2O2. These functions could be measured in vitro by putting different cell lines in a steady-state flow of H2O2 and measuring the production rate of GSH for different concentrations of GSH. We also note that, at the same concentration of H2O2, glycolytic cells produce less GSH than cells that use oxidative phosphorylation.

Shi et al. (2009) showed that enhancing H2O2 in hepatoma cells increases glycolysis activity, whereas reducing H2O2 levels decreases it. These observations are consistent with our prediction of a concomitant increase in glycolysis activity with an increase in H2O2, and vice versa. We also note that Brand and Hermfisse (1997) observed that proliferating rat thymocytes switch to glycolysis to protect themselves against H2O2.

In **Figure 4** we illustrate how the dominant mechanisms in cell metabolism and detoxification evolve as the concentrations of oxygen and H2O2 vary. For concentrations of oxygen less than the hypoxic concentration (CH) and CROS less than the critical value for the transition from oxidative phosphorylation to glycolysis (COG), the metabolism is anaerobic glycolysis. When the concentration of oxygen passes the hypoxic level, cells transition to oxidative phosphorylation or aerobic glycolysis depending on whether CROS is less or greater than COG. As the concentration of H2O2 increases and exceeds CGP, cells close all their metabolic pathways to drive the whole consumption of glucose toward the PPP in order to reduce cell damage by H2O2. However, this process cannot continue indefinitely because the conversion of glucose to G6P is ATP-dependent. Thus, cells need to keep their glycolytic metabolism active in order to continue generating GSH and detoxifying H2O2. When the concentration of H2O2 exceeds the critical concentration C0, tumor cells undergo apoptosis. Notice that this diagram is based on the most efficient mechanism of producing ATP and the availability of oxygen. It is possible that mutated cells activate less dominant metabolic pathways in any of these regions.

FIGURE 3 | Production of lactate as a function of the concentration of H2O2. The inset shows the ratio between lactate production and glucose consumption. PLact is in units of q<sup>cell</sup><sub>ATP</sub>, and P<sup>ext</sup><sub>ROS</sub> = q<sub>G</sub>/50.

When some cells adopt the glycolytic phenotype, they have chosen the most efficient way of generating energy under H2O2 stress and are more resistant to ROS. Hence, at high concentrations of H2O2, they have a survival advantage over cells that rely on respiration. As a result, the glycolytic phenotype becomes the dominant population under ROS stress. Switching to the glycolytic phenotype may be realized through overexpression of glycolytic agents. In fact, experimental results report that PKM2 is activated in cancer cells, which serves to shift the metabolism from oxidative phosphorylation to aerobic glycolysis (Christofk et al., 2008; Hitosugi et al., 2009). Interestingly, as the concentration of H2O2 gets close to the CSC (at which cell damage may occur), PKM2 is inhibited to drive the whole glucose flux to the PPP and thus minimize the adverse effects of ROS (Anastasiou et al., 2011). Hence, PKM2 may be one of the primary candidates driving the described transition from oxidative phosphorylation to glycolysis.

We anticipate that there are other physiological and pathological situations in which our results might be pertinent and might help to explain certain biological behaviors. Two such examples are the observation of glycolytic metabolism in embryos (Kondoh et al., 2007) and in skeletal muscle (Richardson et al., 1998), which could well be described and understood based on our proposed model. Another example is the observation of the transition between glycolysis and oxidative phosphorylation in yeast (Chen et al., 2007). However, further investigation and more detailed discussion of these systems are beyond the scope of the current manuscript.

In conclusion, we have demonstrated that aerobic glycolysis is energetically more favorable than oxidative phosphorylation when the concentration of H2O2 exceeds a certain critical value. This is because the energy generated by mitochondria is consumed by the PPP to respond to the production and high concentrations of H2O2 generated by mitochondria. This makes oxidative phosphorylation an inefficient source of energy, since it results in high energy consumption in order to respond to the production of H2O2 by mitochondria under high concentrations of H2O2. We have also shown that, with increasing H2O2 levels, cells need to increase their glucose consumption via glycolysis and the PPP in order to satisfy their nutritional needs and to remove H2O2. Thus, we propose that H2O2 is the major player behind the observed shift toward aerobic glycolysis in proliferating cancer cells.

### AUTHOR CONTRIBUTIONS

HM, MK and SS conceived and designed the research project. HM carried out the calculations and simulations. HM, MK and SS analyzed the results and wrote the manuscript.

#### FUNDING

SS and MK acknowledge NSERC funding through individual Discovery Grants.

#### ACKNOWLEDGMENTS

We thank M. Milosevic, M. Gingras, M. Khorami, C. Phipps, and K. Kaveh for discussion and critical reviews. This work was financially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC, discovery grant to SS) as well as NSERC/CIHR Collaborative Health Research grant (to MK and SS).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphys.2016.00362


of murine embryonic stem cells. Antioxid. Redox Signal. 9, 293–299. doi: 10.1089/ars.2006.1467


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Molavian, Kohandel and Sivaloganathan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Inhibition of Non-flux-Controlling Enzymes Deters Cancer Glycolysis by Accumulation of Regulatory Metabolites of Controlling Steps

Álvaro Marín-Hernández \*, José S. Rodríguez-Zavala, Isis Del Mazo-Monsalvo, Sara Rodríguez-Enríquez, Rafael Moreno-Sánchez and Emma Saavedra \*

Departamento de Bioquímica, Instituto Nacional de Cardiología, Mexico City, Mexico

#### Edited by:

Osbaldo Resendis-Antonio, Instituto Nacional de Medicina Genomica, Mexico

#### Reviewed by:

Julio Vera González, University Hospital Erlangen, Germany Anshu Bhardwaj, Institute of Microbial Technology (CSIR), India

#### \*Correspondence:

Álvaro Marín-Hernández marinhernndez@yahoo.com.mx; alvaro.marin@cardiologia.org.mx Emma Saavedra emma\_saavedra2002@yahoo.com

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 12 May 2016 Accepted: 02 September 2016 Published: 23 September 2016

#### Citation:

Marín-Hernández Á, Rodríguez-Zavala JS, Del Mazo-Monsalvo I, Rodríguez-Enríquez S, Moreno-Sánchez R and Saavedra E (2016) Inhibition of Non-flux-Controlling Enzymes Deters Cancer Glycolysis by Accumulation of Regulatory Metabolites of Controlling Steps. Front. Physiol. 7:412. doi: 10.3389/fphys.2016.00412

Glycolysis provides precursors for the synthesis of macromolecules and may contribute to the ATP supply required for the constant and accelerated cellular duplication in cancer cells. In consequence, inhibition of glycolysis has been repeatedly considered as an anti-cancer therapeutic option. In previous studies, kinetic modeling of glycolysis in cancer cells allowed the identification of the main steps that control the glycolytic flux: glucose transporter, hexokinase (HK), hexose phosphate isomerase (HPI), and glycogen degradation in human cervix HeLa cancer cells and rat AS-30D ascites hepatocarcinoma. It was also previously experimentally determined that simultaneous inhibition of the non-controlling enzymes lactate dehydrogenase (LDH), pyruvate kinase (PYK), and enolase (ENO) brings about a significant decrease in the glycolytic flux of cancer cells and accumulation of intermediate metabolites, mainly fructose-1,6-bisphosphate (Fru1,6BP) and dihydroxyacetone phosphate (DHAP), which are inhibitors of HK and HPI, respectively. Here it was found by kinetic modeling that inhibition of cancer glycolysis can be attained by blocking downstream non-flux-controlling steps as long as Fru1,6BP and DHAP, regulatory metabolites of flux-controlling enzymes, are accumulated. Furthermore, experimental results and further modeling showed that oxamate and iodoacetate inhibitions of PYK, ENO, and glyceraldehyde-3-phosphate dehydrogenase (GAPDH), but not of LDH and phosphoglycerate kinase, induced accumulation of Fru1,6BP and DHAP in AS-30D hepatoma cells. Indeed, PYK, ENO, and GAPDH exerted the highest control on the Fru1,6BP and DHAP concentrations. The high levels of these metabolites inhibited HK and HPI and led to glycolytic flux inhibition, ATP diminution, and accumulation of toxic methylglyoxal. Hence, the anticancer effects of downstream glycolytic inhibitors are very likely mediated by this mechanism. In parallel, it was also found that uncompetitive inhibition of the flux-controlling steps is a more potent mechanism than competitive and mixed-type inhibition to efficiently perturb cancer glycolysis.

Keywords: cancer glycolysis, metabolic regulation, uncompetitive inhibition, feed-back inhibition, enolase inhibition, pyruvate kinase inhibition, oxamate

## INTRODUCTION

In recent years it has been extensively documented that oxidative phosphorylation (OxPhos) is predominant for supplying ATP in cancer cells under aerobic conditions (Zu and Guppy, 2004; Moreno-Sánchez et al., 2007; Ralph et al., 2010). However, cancer glycolysis becomes prevalent when OxPhos is downregulated by long-term hypoxia or high incidence of mutations in mitochondrial DNA (Carew and Huang, 2002; Gatenby and Gillies, 2004; Rodríguez-Enríquez et al., 2010; Hernández-Reséndiz et al., 2015). Glycolysis also provides precursors for the synthesis of the macromolecules required for the constant and accelerated cellular duplication of cancer cells (Bauer et al., 2005). In addition, the enhanced lactic acid (a glycolytic end-product) production and secretion by cancer cells has been proposed to promote evasion of the immune system and induction of angiogenesis and metastasis (Lardner, 2001; Fischer et al., 2007). In consequence, glycolysis inhibition has re-emerged as an alternative therapeutic option for cancer (Warmoes and Locasale, 2014). In addition, cancer cells may induce oxidative stress on neighboring stromal fibroblasts triggering mitophagy and hence re-directing their energy metabolism toward glycolysis. In return, the lactate produced and expelled by fibroblasts, as well as ketone bodies, are now taken up and actively oxidized by cancer cells to drive OxPhos, which presumably favors tumor growth. This cell-cell interplay has been called reverse Warburg effect (Pavlides et al., 2009; Martinez-Outschoorn et al., 2011).

By applying the fundamentals of metabolic control analysis (Fell, 1997; Moreno-Sánchez et al., 2008, 2010), the enzymes and transporters that control the glycolytic flux of cancer cells have been identified. These are indeed the targets with the highest therapeutic potential because their inhibition will have greater negative effects on tumor glycolysis than inhibition of steps with low or negligible flux control. It was determined by both elasticity analysis and kinetic modeling (experimental strategies of metabolic control analysis and bottom-up Systems Biology, respectively) that the main controlling steps of cancer glycolysis are the glucose transporter (GLUT), hexokinase (HK), hexose phosphate isomerase (HPI), and glycogen degradation, regardless of the environmental conditions to which the cells were exposed (normoxia/normoglycemia, hypoxia/hyperglycemia, and normoxia/hypoglycemia; Marín-Hernández et al., 2006, 2011, 2014). Although the degree of flux control exerted by these steps changes slightly among the different conditions, the main controlling steps remain the same, which emphasizes the fact that cancer glycolysis is also tightly regulated despite its flux enhancement.

However, non-flux-controlling glycolytic steps such as glyceraldehyde-3-phosphate dehydrogenase (GAPDH), pyruvate kinase (PYK), and lactate dehydrogenase (LDH) have also been proposed as suitable targets for inhibition of cancer glycolysis (Ganapathy-Kanniappan et al., 2012; Ganapathy-Kanniappan and Geschwind, 2013; Daniele et al., 2015). Inhibition of any of these three non-controlling enzymes induces a moderate decrease in the growth of cancer cells (Tang et al., 2012; Daniele et al., 2015). However, this anticancer effect could rather be linked to inhibition of the "moonlighting" or accessory functions of glycolytic enzymes, which include roles in cancer development and promotion and in cell cycle progression (Ganapathy-Kanniappan and Geschwind, 2013; Hu et al., 2014).

On the other hand, the glycolytic metabolites glucose-6-phosphate (Glc6P) and fructose-1,6-bisphosphate (Fru1,6BP) and the pentose phosphate pathway metabolites erythrose-4-phosphate (Ery4P) and 6-phosphogluconate (6PG) can modulate the activities of the controlling enzymes HK and HPI through competitive and mixed-type inhibitions. Furthermore, some metabolites that are innocuous at low, physiological concentrations may become inhibitors of the controlling steps HK (Fru1,6BP) and HPI (dihydroxyacetone phosphate; DHAP) at high concentrations, inducing significant inhibition of the glycolytic flux in cancer cells (Moreno-Sánchez et al., 2016). Elevated levels of Fru1,6BP and DHAP in cancer cells can be achieved by simultaneously inhibiting downstream enzymes with negligible flux control, such as enolase (ENO), PYK, and LDH. Therefore, inhibitors of these latter enzymes may also function as anti-glycolytic drugs because they may indirectly induce inhibition of the high flux-controlling HK and HPI. To understand the mechanistic basis of why inhibition of downstream non-controlling glycolytic enzymes may affect the pathway flux, it appears necessary to determine which downstream steps exert high control on the concentrations of the regulatory metabolites Fru1,6BP and DHAP (i.e., to determine the metabolite concentration control coefficients).
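The distinction between flux control and metabolite concentration control can be illustrated numerically. The sketch below is not the AS-30D or HeLa model: it assumes a hypothetical two-step pathway S → X → P with linear kinetics and arbitrary parameter values (all names are illustrative), and estimates the scaled sensitivities d ln J / d ln e<sub>i</sub> and d ln [X] / d ln e<sub>i</sub> by finite differences.

```python
import math

def steady_state(e1, e2, s=1.0, k1=2.0, k2=1.0, keq1=5.0):
    """Steady state of a toy pathway S -> X -> P (illustrative assumption).

    v1 = e1*k1*(s - x/keq1)   reversible supply step
    v2 = e2*k2*x              irreversible demand step
    Returns (x, flux) at the point where v1 == v2.
    """
    a, b = e1 * k1, e2 * k2
    x = a * s / (a / keq1 + b)
    return x, b * x

def control_coefficients(e1=1.0, e2=1.0, rel_step=1e-6):
    """Central finite-difference estimates of d ln(Y) / d ln(e_i)."""
    coeffs = {}
    dln_e = math.log(1 + rel_step) - math.log(1 - rel_step)
    for i in (1, 2):
        up, down = [e1, e2], [e1, e2]
        up[i - 1] *= 1 + rel_step
        down[i - 1] *= 1 - rel_step
        x_up, j_up = steady_state(*up)
        x_dn, j_dn = steady_state(*down)
        coeffs[f"C_J_{i}"] = (math.log(j_up) - math.log(j_dn)) / dln_e
        coeffs[f"C_X_{i}"] = (math.log(x_up) - math.log(x_dn)) / dln_e
    return coeffs

if __name__ == "__main__":
    for name, value in control_coefficients().items():
        print(f"{name} = {value:.3f}")
```

In this toy case the flux control coefficients sum to 1 and the concentration control coefficients sum to 0 (the summation theorems of metabolic control analysis), and the downstream step has the smaller share of flux control while exerting a strong negative control on [X]. This is only a qualitative analogue of the situation described above for downstream glycolytic enzymes.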

Such a goal was pursued and resolved in the present paper by using our published AS-30D and HeLa cells glycolysis kinetic models (Marín-Hernández et al., 2011, 2014; Moreno-Sánchez et al., 2016). Previous theoretical studies have suggested that uncompetitive inhibition induces more severe toxic effects on a metabolic pathway than competitive inhibition (Cornish-Bowden, 1986; Eisenthal and Cornish-Bowden, 1998). Therefore, in silico simulations of how different mechanisms of inhibition (competitive, mixed-type, uncompetitive) on controlling enzymes impact the pathway systemic properties (fluxes and metabolite concentrations) were also carried out using the kinetic glycolysis models.
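The reasoning behind the different potency of inhibition mechanisms can be sketched for a single isolated enzyme using the textbook irreversible Michaelis-Menten rate laws. The function names and parameter values below are illustrative assumptions and are unrelated to the glycolysis models; the sketch only shows that substrate accumulation relieves competitive inhibition but reinforces uncompetitive inhibition.

```python
def rate(s, i, mode, vmax=1.0, km=1.0, ki=1.0, alpha=2.0):
    """Irreversible Michaelis-Menten rate with a single inhibitor.

    mode is one of "none", "competitive", "uncompetitive", "mixed".
    For mixed inhibition, ki applies to the free enzyme and alpha*ki
    to the enzyme-substrate complex.
    """
    free = 1 + i / ki              # inhibitor bound to free enzyme
    bound = 1 + i / (alpha * ki)   # inhibitor bound to ES (mixed only)
    if mode == "none":
        return vmax * s / (km + s)
    if mode == "competitive":
        return vmax * s / (km * free + s)
    if mode == "uncompetitive":
        return vmax * s / (km + s * free)
    if mode == "mixed":
        return vmax * s / (km * free + s * bound)
    raise ValueError(f"unknown inhibition mode: {mode}")

def fractional_inhibition(s, i, mode):
    """Fraction of the uninhibited rate that is lost at substrate level s."""
    return 1 - rate(s, i, mode) / rate(s, i, "none")

if __name__ == "__main__":
    for mode in ("competitive", "uncompetitive", "mixed"):
        low = fractional_inhibition(0.1, 1.0, mode)
        high = fractional_inhibition(10.0, 1.0, mode)
        print(f"{mode:>13}: {low:.2f} at low [S], {high:.2f} at high [S]")
```

In a pathway, inhibition of a step tends to raise the concentration of its substrate, so the trend at high [S] is the relevant one for comparing the mechanisms.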

It was concluded that (i) inhibition of GAPDH with iodoacetate, or PYK/ENO with oxamate but not LDH, PGK, or PGAM, induces Fru1,6BP and DHAP accumulation and methylglyoxal production, leading to significant suppression of glycolysis; and (ii) uncompetitive inhibition of the most controlling pathway steps is the most direct and potent mechanism to efficiently perturb cancer glycolysis.

#### MATERIALS AND METHODS

#### Chemicals

HK, Glc6PDH, HPI, aldolase, α-glycerophosphate dehydrogenase, triosephosphate isomerase (TPI), LDH, and Fru6P were purchased from Roche (Mannheim, Germany). Glucose, iodoacetate, methylglyoxal, NADH, NAD+, NADP+, and oxamate were from Sigma Chemical (St Louis, MO, USA).

**Abbreviations:** DHAP, dihydroxyacetone phosphate; ENO, enolase; Ery4P, erythrose-4-phosphate; Fru1,6BP, fructose-1,6-bisphosphate; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; Glc6P, glucose-6-phosphate; GLUT, glucose transporter; HK, hexokinase; HPI, hexose phosphate isomerase; LDH, lactate dehydrogenase; 6PG, 6-phosphogluconate; PYK, pyruvate kinase; TPI, triosephosphate isomerase.

#### Isolation of Tumor Cells

AS-30D hepatocarcinoma cells were propagated in 200–250 g weight female Wistar rats by intraperitoneal inoculation of 3 mL of ascitic liquid containing ∼4–6 × 10<sup>8</sup> cells/mL. After 5–6 days, the intraperitoneal cavity liquid was extracted and tumor cells were isolated by centrifugation as previously described (López-Gómez et al., 1993). Animal manipulation was carried out in accordance with the recommendations of Mexican Official Standard NOM-062-ZOO-1999. This study did not require approval by the Ethics Committee of the Instituto Nacional de Cardiología de México.

### Glycolytic Fluxes and Metabolite Concentrations

Hepatocarcinoma AS-30D cells (15 mg cell protein/mL) were incubated in saline Krebs-Ringer medium supplied with oxamate (10 or 20 mM) or iodoacetate (2 or 4 mM) for 60 min under orbital shaking at 150 rpm and 37 °C; under such conditions cell viability was always higher than 90%. Thereafter, a cell sample was withdrawn (time 0) and 5 mM glucose was added; after a further 10 min of incubation another cell sample (time 10) was withdrawn. The cell samples were immediately mixed with ice-cold perchloric acid (final concentration of 3% v/v), vortexed, and centrifuged at 1800 × g for 1 min at 4 °C. The supernatants were neutralized with 3 M KOH/0.1 M Tris, further incubated on ice for at least 30 min, and then centrifuged. The supernatants were stored at −72 °C until use for determination of Glc6P, Fru6P, Fru1,6BP, G3P, DHAP, ATP, ADP, and L-lactate contents as described by Bergmeyer (1974). The rate of the glycolytic flux was estimated from the difference in L-lactate contents between the t = 0 and t = 10 min samples. As glycogen degradation and glutaminolysis are negligible in AS-30D cells (Marín-Hernández et al., 2006), total L-lactate production did not require the correction provided by 2-DOG inhibition.
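The flux estimate described above is a two-point rate calculation. A minimal sketch follows; the function name, the normalization to cell protein, the units (nmol lactate per min per mg protein), and the example numbers are all assumptions made for illustration and are not taken from the measurements.

```python
def glycolytic_flux(lactate_t0_nmol, lactate_t10_nmol, protein_mg, minutes=10.0):
    """Two-point estimate of the lactate-based glycolytic flux.

    Returns nmol lactate / (min * mg cell protein), assuming both lactate
    contents refer to the same amount of cell protein (protein_mg).
    """
    if protein_mg <= 0 or minutes <= 0:
        raise ValueError("protein_mg and minutes must be positive")
    return (lactate_t10_nmol - lactate_t0_nmol) / (minutes * protein_mg)

if __name__ == "__main__":
    # Hypothetical contents for a sample holding 15 mg of cell protein.
    print(glycolytic_flux(150.0, 4650.0, protein_mg=15.0))
```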

Methylglyoxal was determined by gas chromatography in a Shimadzu GC2010 apparatus (Shimadzu; Kyoto, Japan) equipped with an HP-PLOT/U capillary column of 30 m length, 0.32 mm I.D., and 10 µm film (Agilent, USA), and a flame ionization detector (FID). A methylglyoxal standard curve was constructed in the range of 0.3–30 nmol, and the retention time was 4.7 min. The equipment conditions were FID temperature 200 °C, column temperature 180 °C, oven temperature 180 °C, and linear velocity 26.4 cm/s. A mix of He (10 mL/min) and H2 (40 mL/min) was used as carrier gas. The cell sample (15 mg/mL) was withdrawn after time 10 and centrifuged at 1800 × g for 2 min. Then 0.5 mL of the supernatant was removed and the cell pellet was resuspended in the remaining supernatant. The suspension was sonicated in an ice bath with a Branson sonicator three times for 15 s at 60% of maximal output, with 1 min rest. The sonicate was centrifuged at 20,800 × g for 5 min. The supernatant was filtered and 1–2 µL was injected into the gas chromatograph. The limit of detection of methylglyoxal was lower than 0.3 nmol. The concentration of the methylglyoxal stock solution was enzymatically calibrated by using human ALDH2 and saturating NAD+.

#### Kinetic Modeling

The previous kinetic models of glycolysis built for HeLa and AS-30D cells (Marín-Hernández et al., 2014; Moreno-Sánchez et al., 2016) using the metabolic simulator GEPASI version 3.3 (Mendes, 1993) were modified for the HK, HPI, TPI, and GAPDH rate equations as described below. The other rate equations remained unaltered; however, they are fully described here (Supplementary Table 1) because previously developed model updates are scattered across several papers (Marín-Hernández et al., 2011, 2014; Moreno-Sánchez et al., 2016). The models and simulations were also run in COPASI software (Hoops et al., 2006) with no significant differences from those of GEPASI (SBML files are included in Supplementary Presentation 1). The great majority of the kinetic parameter values used in the models were determined under the same experimental conditions (K+-based medium at pH 7.0 and 37 °C; Marín-Hernández et al., 2006, 2011, 2014; Rodríguez-Enríquez et al., 2009; Moreno-Sánchez et al., 2012, 2016).

In the AS-30D model, kinetics of GLUT was defined as a monosubstrate reversible Michaelis-Menten equation [Haldane equation (Equation 1)] as it was previously determined (Rodríguez-Enríquez et al., 2009; Marín-Hernández et al., 2011):

$$\nu = \frac{Vmf\left([Glc_{out}] - \frac{[Glc_{in}]}{Keq}\right)}{K_{Glcout}\left(1 + \frac{[Glc_{in}]}{K_{Glcin}}\right) + [Glc_{out}]} \tag{1}$$

in which Glcout and Glcin are the extra- and intra-cellular glucose concentrations, and KGlcout and KGlcin are the respective Km values; Keq is the equilibrium constant; and Vmf is the maximal velocity in the forward reaction.

In the HeLa model, the rate-equation for GLUT was changed to a double mono-substrate reversible Michaelis-Menten equation (Equation 2), representing the co-existence of two isoforms (Marín-Hernández et al., 2014),

$$\begin{split} \nu = Vmf \Bigg( &\left[ \frac{f1\left([Glc_{out}] - \frac{[Glc_{in}]}{Keq}\right)}{K_{Glcout1}\left(1 + \frac{[Glc_{in}]}{K_{Glcin1}}\right) + [Glc_{out}]} \right] \\ + &\left[ \frac{f2\left([Glc_{out}] - \frac{[Glc_{in}]}{Keq}\right)}{K_{Glcout2}\left(1 + \frac{[Glc_{in}]}{K_{Glcin2}}\right) + [Glc_{out}]} \right] \Bigg) \end{split} \tag{2}$$

in which KGlcout1 and KGlcout2 are the Km values for extracellular glucose of each GLUT isoform, and KGlcin1 and KGlcin2 are the Km values for intracellular glucose of each GLUT isoform. f1 and f2 are the fractional isoform contents determined by Western blot analysis and enzyme kinetics (Marín-Hernández et al., 2014). This two-component equation was proposed in the previous study because cells grown in low glucose express significant contents of both isoforms, GLUT1 and GLUT3 (Marín-Hernández et al., 2014).
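As a reading aid, the reversible GLUT rate law can be written as a short function. The sketch below assumes the form of Equations 1 and 2 as described in the text (a weighted sum of mono-substrate reversible Michaelis-Menten terms sharing Vmf and Keq); the function and argument names are illustrative, and the numerical values are arbitrary rather than model parameters.

```python
def glut_rate(glc_out, glc_in, vmf, keq, isoforms):
    """Reversible Michaelis-Menten glucose transport with one or more isoforms.

    isoforms is an iterable of (fraction, k_out, k_in) tuples, where fraction
    is the fractional isoform content and k_out / k_in are the Km values for
    extra- and intracellular glucose. A single isoform with fraction 1.0
    gives the one-isoform rate law (Equation 1).
    """
    driving_force = glc_out - glc_in / keq
    total = 0.0
    for fraction, k_out, k_in in isoforms:
        total += fraction * driving_force / (k_out * (1 + glc_in / k_in) + glc_out)
    return vmf * total

if __name__ == "__main__":
    # Arbitrary illustrative values (concentrations and Km in mM).
    single = glut_rate(5.0, 0.0, vmf=10.0, keq=1.0, isoforms=[(1.0, 5.0, 10.0)])
    double = glut_rate(5.0, 0.0, vmf=10.0, keq=1.0,
                       isoforms=[(0.3, 5.0, 10.0), (0.7, 1.0, 10.0)])
    print(single, double)
```

The net rate vanishes when [Glcin] = Keq·[Glcout], whatever the isoform composition, because both terms share the same thermodynamic driving force.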

The HK rate equation (Equation 3) used for the present updated AS-30D model was a random Bi-Bi system (Segel, 1975) with mixed type inhibition by Fru1,6BP based on previous experimental kinetic analysis (Moreno-Sánchez et al., 2016):

$$\nu = \frac{\frac{Vmf}{\alpha_1 Ka\,Kb}\left([A][B] - \frac{[P][Q]}{Keq}\right)}{1 + \frac{[A]}{Ka} + \frac{[B]}{Kb} + \frac{[A][B]}{\alpha_1 Ka\,Kb} + \frac{[P]}{Kp} + \frac{[Q]}{Kq} + \frac{[P][Q]}{Kp\,Kq} + \frac{[A][Q]}{Ka\,Kq} + \frac{[P][B]}{Kp\,Kb} + \cdots} \tag{3}$$

where A = Glcin, B = ATP, P = Glc6P, and Q = ADP. Ka, Kb, Kp, and Kq are the Km values for the corresponding substrates and products. α1 and α2 are the factors by which Ka (Kmglc) changes when B (ATP) and I (Fru1,6BP) are bound to the enzyme, respectively. Ki is the inhibition constant for Fru1,6BP (KiFru1,6BP).

In the HeLa model, the HK rate equation (Equation 4) was a double random-bisubstrate Michaelis-Menten equation, to also represent the coexistence of two enzyme isoforms as previously reported (Marín-Hernández et al., 2014),

$$\begin{aligned} \nu = Vmf \Bigg( &\left[ \frac{\frac{f1}{Ka1\,Kb}\left([A][B] - \frac{[P][Q]}{Keq}\right)}{1 + \frac{[A]}{Ka1} + \frac{[B]}{Kb} + \frac{[A][B]}{Ka1\,Kb} + \frac{[P]}{Kp} + \frac{[Q]}{Kq} + \frac{[P][Q]}{Kp\,Kq} + \frac{[A][Q]}{Ka1\,Kq} + \frac{[P][B]}{Kp\,Kb}} \right] \\ + &\left[ \frac{\frac{f2}{Ka2\,Kb}\left([A][B] - \frac{[P][Q]}{Keq}\right)}{1 + \frac{[A]}{Ka2} + \frac{[B]}{Kb} + \frac{[A][B]}{Ka2\,Kb} + \frac{[P]}{Kp} + \frac{[Q]}{Kq} + \frac{[P][Q]}{Kp\,Kq} + \frac{[A][Q]}{Ka2\,Kq} + \frac{[P][B]}{Kp\,Kb}} \right] \Bigg) \end{aligned} \tag{4}$$

in which Ka1 and Ka2 represent the Km values for Glcin of each isoform; f1 and f2 are the fractional isoform contents experimentally determined from the activities of HKI and HKII in HeLa cellular extracts (Marín-Hernández et al., 2014).

The HPI rate equation in the HeLa model was considered as a monoreactant reversible Michaelis-Menten equation (Equation 5) with (a) competitive (Marín-Hernández et al., 2011, 2014), (b) uncompetitive (Segel, 1975), and (c) mixed-type inhibition (Segel, 1975) by Ery4P, 6PG, and Fru1,6BP.

$$\begin{aligned} \text{(a)}\quad \nu &= \frac{Vmf\frac{[Glc6P]}{K_{glc6p}} - Vmr\frac{[Fru6P]}{K_{fru6p}}}{1 + \frac{[Glc6P]}{K_{glc6p}} + \frac{[Fru6P]}{K_{fru6p}} + \frac{[ERY4P]}{K_{ery4p}} + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[6PG]}{K_{6pg}}} \\[2ex] \text{(b)}\quad \nu &= \frac{Vmf\frac{[Glc6P]}{K_{glc6p}} - Vmr\frac{[Fru6P]}{K_{fru6p}}}{1 + \frac{[Glc6P]}{K_{glc6p}}\left(1 + \frac{[ERY4P]}{K_{ery4p}} + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[6PG]}{K_{6pg}}\right) + \frac{[Fru6P]}{K_{fru6p}}\left(1 + \frac{[ERY4P]}{K_{ery4p}} + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[6PG]}{K_{6pg}}\right)} \\[2ex] \text{(c)}\quad \nu &= \frac{\dfrac{Vmf\frac{[Glc6P]}{K_{glc6p}}}{1 + \frac{[ERY4P]}{K_{ery4p}} + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[6PG]}{K_{6pg}}} - \dfrac{Vmr\frac{[Fru6P]}{K_{fru6p}}}{1 + \frac{[ERY4P]}{K_{ery4p}} + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[6PG]}{K_{6pg}}}}{1 + \frac{[Glc6P]}{K_{glc6p}}\left(\dfrac{1 + \frac{[ERY4P]}{\alpha K_{ery4p}} + \frac{[Fru1,6BP]}{\alpha K_{fru1,6bp}} + \frac{[6PG]}{\alpha K_{6pg}}}{1 + \frac{[ERY4P]}{K_{ery4p}} + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[6PG]}{K_{6pg}}}\right) + \frac{[Fru6P]}{K_{fru6p}}\left(\dfrac{1 + \frac{[ERY4P]}{\alpha K_{ery4p}} + \frac{[Fru1,6BP]}{\alpha K_{fru1,6bp}} + \frac{[6PG]}{\alpha K_{6pg}}}{1 + \frac{[ERY4P]}{K_{ery4p}} + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[6PG]}{K_{6pg}}}\right)} \end{aligned} \tag{5}$$

The HPI rate equation in the AS-30D model was a monoreactant reversible equation (Equation 6) with competitive inhibition by four modulators: Ery4P, 6PG, Fru1,6BP, and DHAP, as experimentally determined (Moreno-Sánchez et al., 2016):

$$\nu = \frac{Vmf\frac{[Glc6P]}{K_{glc6p}} - Vmr\frac{[Fru6P]}{K_{fru6p}}}{1 + \frac{[Glc6P]}{K_{glc6p}} + \frac{[Fru6P]}{K_{fru6p}} + \frac{[ERY4P]}{K_{ery4p}} + \frac{[6PG]}{K_{6pg}} + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[DHAP]}{K_{dhap}}} \tag{6}$$

The rate equation for PFK-1 (Equation 7) in all kinetic models was the concerted transition model of Monod, Wyman and Changeux for exclusive ligand binding (Fru6P, activators, and inhibitors) together with mixed-type activation and simple Michaelis-Menten terms for ATP and the reverse reaction (Marín-Hernández et al., 2011, 2014), as established by experimental kinetic analysis (Moreno-Sánchez et al., 2012). ATP and citrate are the allosteric inhibitors. L is the allosteric transition constant; KaFru26BP is the activation constant for Fru26BP; KiCIT and KiATP are the inhibition constants for citrate and ATP; α and β are the factors by which KFru6P and Vmax change when a mixed-type activator is bound to the active enzyme.

Probably owing to the high complexity of the PFK-1 rate equation, the algorithms used by COPASI to generate the ordinary differential equations that describe the variation in metabolite concentrations assign the role of first substrate (or product) to the one defined first in the reaction specification window. If the order of substrates and products in the reaction does not match that stated in the rate equation, the program mixes up the identity of the ligands in the ordinary differential equations. Therefore, to avoid this type of error, both the reaction syntax and the rate equation have to list substrates and products in the same order; one should then be aware that the reaction syntax does not necessarily reflect the order of binding to the enzyme, which defines the type of reaction mechanism.

Reaction: ATP + Fru6P = ADP + Fru1,6BP

$$\begin{aligned} \nu = {} & Vm \left(\frac{\frac{[ATP]}{K_{ATP}}}{1 + \frac{[ATP]}{K_{ATP}}}\right) \left(\frac{1 + \frac{\beta\,[Fru26BP]}{\alpha K_{aFru26BP}}}{1 + \frac{[Fru26BP]}{\alpha K_{aFru26BP}}}\right) \\ & \times \left(\frac{\dfrac{[Fru6P]\left(1 + \frac{[Fru26BP]}{\alpha K_{aFru26BP}}\right)}{K_{Fru6P}\left(1 + \frac{[Fru26BP]}{K_{aFru26BP}}\right)} \left[1 + \dfrac{[Fru6P]\left(1 + \frac{[Fru26BP]}{\alpha K_{aFru26BP}}\right)}{K_{Fru6P}\left(1 + \frac{[Fru26BP]}{K_{aFru26BP}}\right)}\right]^{3}}{\dfrac{L\left(1 + \frac{[CIT]}{K_{iCIT}}\right)^{4}\left(1 + \frac{[ATP]}{K_{iATP}}\right)^{4}}{\left(1 + \frac{[Fru26BP]}{K_{aFru26BP}}\right)^{4}} + \left[1 + \dfrac{[Fru6P]\left(1 + \frac{[Fru26BP]}{\alpha K_{aFru26BP}}\right)}{K_{Fru6P}\left(1 + \frac{[Fru26BP]}{K_{aFru26BP}}\right)}\right]^{4}}\right) \\ & - \left(\frac{\frac{[ADP][Fru1,6BP]}{K_{ADP}\,K_{Fru1,6BP}\,Keq}}{\frac{[ADP]}{K_{ADP}} + \frac{[Fru1,6BP]}{K_{Fru1,6BP}} + \frac{[ADP][Fru1,6BP]}{K_{ADP}\,K_{Fru1,6BP}} + 1}\right) \end{aligned} \tag{7}$$

The ALDO rate equation was the reversible Uni-Bi random Michaelis-Menten equation (Equation 8) in all kinetic models as was reported in previous kinetic models (Marín-Hernández et al., 2011, 2014).

$$\nu = \frac{Vmf \frac{[Fru1,6BP]}{K_{fru1,6bp}} - Vmr \frac{[DHAP][G3P]}{K_{DHAP}K_{G3P}}}{1 + \frac{[Fru1,6BP]}{K_{fru1,6bp}} + \frac{[DHAP]}{K_{DHAP}} + \frac{[G3P]}{K_{G3P}} + \frac{[DHAP][G3P]}{K_{DHAP}K_{G3P}}} \tag{8}$$

Kinetics for TPI in the AS-30D model was here depicted by a mono-substrate simple reversible Michaelis-Menten equation (Equation 9) with mixed type inhibition by Fru1,6BP as it was previously determined (Moreno-Sánchez et al., 2016):

$$\nu = \frac{Vmf\frac{[S]}{Ks\left(1 + \frac{[I]}{Ki}\right)} - Vmr\frac{[P]}{Kp\left(1 + \frac{[I]}{Ki}\right)}}{1 + \frac{[S]}{Ks\left(\frac{1 + \frac{[I]}{Ki}}{1 + \frac{[I]}{\alpha Ki}}\right)} + \frac{[P]}{Kp\left(\frac{1 + \frac{[I]}{Ki}}{1 + \frac{[I]}{\alpha Ki}}\right)}} \tag{9}$$

α value is the factor by which Ks and Kp change when Fru1,6BP is bound to the enzyme; Ki is the inhibition constant for Fru1,6BP (KiFru1,6BP).
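As an illustration, the mixed-type rate law of Equation 9 can be written as a short function. This is a minimal sketch, assuming the standard mixed-type form in which Ki applies to the free enzyme and αKi to the enzyme-ligand complexes. The function name and all parameter values are hypothetical and are not taken from the AS-30D model.

```python
def tpi_rate(s, p, i, vmf, vmr, ks, kp, ki, alpha):
    """Reversible mono-substrate Michaelis-Menten rate with mixed-type inhibition.

    s, p, i: substrate, product and inhibitor concentrations (same units as ks, kp, ki).
    """
    comp = 1.0 + i / ki              # inhibitor bound to the free enzyme (Ki)
    uncomp = 1.0 + i / (alpha * ki)  # inhibitor bound to enzyme-ligand complexes (alpha*Ki)
    numerator = (vmf * s / ks - vmr * p / kp) / comp
    denominator = 1.0 + (s / ks + p / kp) * uncomp / comp
    return numerator / denominator


# Hypothetical parameter values, chosen only to exercise the function.
params = dict(vmf=10.0, vmr=5.0, ks=1.0, kp=2.0, ki=0.5, alpha=2.0)
v_free = tpi_rate(s=1.0, p=0.5, i=0.0, **params)
v_inhibited = tpi_rate(s=1.0, p=0.5, i=1.0, **params)
print(f"rate without inhibitor: {v_free:.4f}")
print(f"rate with inhibitor:    {v_inhibited:.4f}")
```

With the inhibitor concentration set to zero, the expression reduces to the uninhibited reversible Michaelis-Menten form. The inhibitor scales the rate but does not change the substrate/product ratio at which the net rate is zero.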

In the HeLa model, the TPI rate equation (Equation 10) was a mono-substrate simple reversible Michaelis-Menten equation with no inhibitors, as previously determined (Marín-Hernández et al., 2011, 2014):

$$\nu = \frac{Vmf\frac{[S]}{Ks} - Vmr\frac{[P]}{Kp}}{1 + \frac{[S]}{Ks} + \frac{[P]}{Kp}} \tag{10}$$

GAPDH kinetics in the AS-30D model were described here by a simplified ordered Ter-Bi Michaelis-Menten equation (Equation 11) with mixed-type inhibition by Fru1,6BP, as previously determined (Moreno-Sánchez et al., 2016):

$$\nu = \frac{Vmf\frac{[A][B][C]}{KaKbKc} - Vmr\frac{[P][Q]}{KpKq}}{1 + \frac{[A]}{Ka} + \frac{[A][B]}{KaKb} + \frac{[A][B][C]}{KaKbKc} + \frac{[P][Q]}{KpKq} + \frac{[Q]}{Kq} + \frac{[I]}{Ki} + \frac{[A][B][I]}{KaKb\alpha Ki} + \frac{[A][B][C][I]}{KaKbKc\alpha Ki} + \frac{[P][Q][I]}{KpKq\alpha Ki}} \tag{11}$$

where A = [NAD+], B = [G3P], C = [Pi], P = [BPG], Q = [NADH] with their respective affinity constants (Ka, Kb, Kc, Kp, and Kq). Ki is the inhibition constant for Fru1,6BP (KiFru1,6BP). The α value is the factor by which Kb (KmG3P) changes when Fru1,6BP is bound to the enzyme.

In the HeLa model, the GAPDH rate equation (Equation 12) was a simplified ordered Ter-Bi Michaelis-Menten equation, as previously determined and used in previous models (Marín-Hernández et al., 2011, 2014).

$$\nu = \frac{Vmf\frac{[A][B][C]}{KaKbKc} - Vmr\frac{[P][Q]}{KpKq}}{1 + \frac{[A]}{Ka} + \frac{[A][B]}{KaKb} + \frac{[A][B][C]}{KaKbKc} + \frac{[P][Q]}{KpKq} + \frac{[Q]}{Kq}} \tag{12}$$

In all models, the rate equations for PGAM and ENO were described by a mono-substrate simple reversible Michaelis-Menten equation (Equation 13):

$$\nu = \frac{Vmf \frac{[S]}{Ks} - Vmr \frac{[P]}{Kp}}{1 + \frac{[S]}{Ks} + \frac{[P]}{Kp}} \tag{13}$$

in which [S] and [P] are the respective concentrations of substrates and products, and Ks and Kp are their respective Km values, which were experimentally determined in previous works (Marín-Hernández et al., 2011, 2014).

The PYK rate equation (Equation 14) in all models was defined as a simple random bi-substrate Michaelis-Menten equation that represents the kinetics of the prevalent PYK isoform in cancer cells, with no cooperativity or allosteric modulation by typical metabolites, as experimentally determined for AS-30D cells (Marín-Hernández et al., 2014; Moreno-Sánchez et al., 2016). A, B, P, and Q are PEP, ADP, Pyr, and ATP, respectively; Ka, Kb, Kp, and Kq are the Km values for the corresponding substrates and products.

$$\nu = \frac{\frac{Vmf}{KaKb} \left( [A][B] - \frac{[P][Q]}{Keq} \right)}{1 + \frac{[A]}{Ka} + \frac{[B]}{Kb} + \frac{[A][B]}{KaKb} + \frac{[P]}{Kp} + \frac{[Q]}{Kq} + \frac{[P][Q]}{KpKq} + \frac{[A][Q]}{KaKq} + \frac{[P][B]}{KpKb}} \tag{14}$$

In all models, the rate equations (Equation 15) for PGK and LDH were defined by a random Bi-Bi reversible Michaelis-Menten equation for non-interacting substrates (α and β = 1) according to the reported literature (Marín-Hernández et al., 2011, 2014; Moreno-Sánchez et al., 2016); Ka, Kb, Kp, and Kq are the Km values for the corresponding substrates and products previously determined (Marín-Hernández et al., 2011, 2014; Moreno-Sánchez et al., 2016). In the case of PGK, A, B, P, and Q are 1,3BPG, ADP, 3PG, and ATP, respectively, whereas for LDH they are NADH, Pyr, Lactate, and NAD+, respectively.

$$\nu = \frac{Vmf\frac{[A][B]}{\alpha KaKb} - Vmr\frac{[P][Q]}{\beta KpKq}}{1 + \frac{[A]}{Ka} + \frac{[B]}{Kb} + \frac{[A][B]}{\alpha KaKb} + \frac{[P][Q]}{\beta KpKq} + \frac{[P]}{Kp} + \frac{[Q]}{Kq}} \tag{15}$$

In the HeLa model, the rate equation (Equation 16) for the monocarboxylate transporter (MCT) was incorporated (Marín-Hernández et al., 2014) because MCT catalyzes the expulsion of lactate and H+. The inclusion of this reaction helps to explain why HeLa cell cultures rapidly become acidic. This was a mono-substrate reversible Michaelis-Menten equation in which Lacin and Lacout are the intra- and extracellular lactate concentrations; KLacin and KLacout are the Km values for intra- and extracellular lactate; and Keq is the equilibrium constant of the reaction. The equation only included lactate as a ligand because kinetic parameters for the proton are not available.

$$\nu = \frac{Vmf\left([Lac\_{in}] - \frac{[Lac\_{out}]}{Keq}\right)}{K\_{Lacin}\left(1 + \frac{[Lac\_{out}]}{K\_{Lacout}}\right) + [Lac\_{in}]}\tag{16}$$
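The direction of lactate transport implied by Equation 16 can be checked with a minimal sketch. It assumes the reading of Equation 16 in which the numerator is Vmf([Lacin] − [Lacout]/Keq) and the denominator is KLacin(1 + [Lacout]/KLacout) + [Lacin]. The function name and the parameter values are hypothetical.

```python
def mct_rate(lac_in, lac_out, vmf, k_lac_in, k_lac_out, keq):
    """Mono-substrate reversible Michaelis-Menten rate for lactate export.

    Positive values mean net efflux; negative values mean net uptake.
    """
    numerator = vmf * (lac_in - lac_out / keq)
    denominator = k_lac_in * (1.0 + lac_out / k_lac_out) + lac_in
    return numerator / denominator


# Hypothetical constants (arbitrary units), for illustration only.
constants = dict(vmf=1.0, k_lac_in=1.0, k_lac_out=1.0, keq=1.0)
for lac_in, lac_out in [(5.0, 1.0), (2.0, 2.0), (1.0, 5.0)]:
    rate = mct_rate(lac_in, lac_out, **constants)
    print(f"Lac_in={lac_in}, Lac_out={lac_out}: rate={rate:+.3f}")
```

The net rate is zero when [Lacin] = [Lacout]/Keq, and extracellular lactate slows efflux both through the numerator and through the denominator.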

#### Molecular Docking Analysis

Crystal structures of human HK type I, TPI and GAPDH, and mouse HPI, were obtained from the Protein Data Bank (accession numbers 4FOI, 1HTI and 3PFW, and 1U0F, respectively). The three-dimensional models of the ligands used in the study were obtained from different crystal structures found in the Protein Data Bank or downloaded from PubChem (https://pubchem.ncbi.nlm.nih.gov/). The models were optimized using ArgusLab 4.0.1 (Planaria Software LLC, Seattle, WA; available at http://www.arguslab.com) and Maestro, version 9.1 (Schrodinger, LLC, New York, NY, 2010). For docking analysis, the ligands from the above protein crystal structures were removed using the UCSF Chimera package 1.6 (Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, CA; supported by NIH P41 RR001081; Pettersen et al., 2004). The protein structure and ligand models were prepared for docking using ADT 1.5.2 (Sanner, 1999; Morris et al., 2009). Docking analysis of the enzyme structures and ligands was carried out using Autodock 4.2.5.1 (Huey et al., 2007), available at http://autodock.scripps.edu/. After docking, 100 conformations for each ligand were obtained and then clustered for analysis using ADT 1.5.2. The conformations selected corresponded to the lowest values of binding energy and Ki. Analysis of the resulting structures and generation of the figures were achieved with PyMOL (The PyMOL Molecular Graphics System, Version 1.2.r1, Schrodinger, LLC).

### RESULTS

### Effect of Metabolic Inhibition of Low Flux–Controlling Enzymes in the Pathway Flux

Inhibition of a low- or non-flux-controlling step might be regarded as a mistaken or misleading option for decreasing the pathway flux because it requires almost complete inhibition (a "pharmacological knock out" or >80% inhibition) of the activity to decrease the pathway flux to the levels reached by inhibiting a flux-controlling step. However, there is experimental evidence indicating that oxamate and iodoacetate, presumed specific inhibitors of LDH and GAPDH, i.e., two non-controlling glycolytic enzymes, do affect the glycolytic flux of cancer and non-cancer cells (Goldberg et al., 1965; Elwood, 1968; McKee et al., 1968; Coe and Strunk, 1970; Chatham et al., 1988; Moreno-Sánchez et al., 2016). Remarkably, this inhibition induces accumulation of Fru1,6BP and DHAP; the former is an inhibitor of HK (a controlling enzyme), TPI, and GAPDH, whereas the latter is an inhibitor of HPI (another controlling enzyme; published data summarized in Supplementary Table 2).

The effect of oxamate on the activities of LDH, GAPDH, ENO, and PYK, which in turn affected the Fru1,6BP and DHAP concentrations (Moreno-Sánchez et al., 2016), was examined here in silico, focusing on the concentration control coefficients for Fru1,6BP and DHAP and using the updated and modified kinetic models of AS-30D and HeLa glycolysis. These models were further refined (described in Section Kinetic Modeling) with respect to the models previously published (Marín-Hernández et al., 2014; Moreno-Sánchez et al., 2016). Enzymes that produce a metabolite have positive concentration control coefficients whereas those that consume it have negative ones (Fell, 1997). For glycolysis of AS-30D hepatocarcinoma cells, the model simulations indicated that GLUT, HK, and HPI have high positive concentration control coefficients (from 1 to 2.3), whereas ENO (−0.57 and −0.99), PYK (−0.4 and −0.7), and GAPDH (−0.16 and −0.27) have high negative concentration control coefficients on Fru1,6BP and DHAP, respectively (**Table 1**). For HeLa cells cultured for 24 h under different glucose concentrations (normo-, hypo-, or hyper-glycemic conditions), kinetic model simulations showed that GLUT, HK, and HPI also exert control on the synthesis of the same two metabolites, whereas GAPDH (−0.7 and −1.3) and PYK (−1.3 and −3) control their consumption (**Table 1**). Hence, in both types of cancer cells, the enzymes with high positive concentration control coefficients on Fru1,6BP and DHAP were those that also exerted the main control on the glycolytic flux (GLUT, HK, and HPI), and therefore their inhibition should decrease Fru1,6BP and DHAP concentrations. In contrast, inhibition of GAPDH, ENO, and PYK, which have high negative concentration control coefficients, should increase the levels of Fru1,6BP and DHAP.

It is worth noting that without kinetic modeling it would not have been possible to unveil the important role of GAPDH, ENO, and PYK in the indirect modulation of the flux-controlling enzymes by Fru1,6BP and DHAP. By the present in silico analysis, these other metabolic regulatory mechanisms of cancer glycolysis became apparent and were further analyzed.
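The sign rule for concentration control coefficients can be illustrated with a minimal sketch that is not part of the AS-30D or HeLa models. It assumes a hypothetical two-step pathway in which enzyme E1 produces a metabolite X at a constant rate and enzyme E2 consumes it with irreversible Michaelis-Menten kinetics. All names and parameter values are illustrative. Each coefficient is estimated as a scaled finite-difference derivative, (∂X/∂E)(E/X).

```python
def steady_state_x(e1, e2, k_supply=1.0, vmax2=2.0, km2=0.5):
    """Steady-state [X] for a hypothetical toy pathway.

    E1: -> X at rate e1*k_supply; E2: X -> at rate e2*vmax2*X/(km2 + X).
    """
    v_in = e1 * k_supply
    v_cap = e2 * vmax2
    if v_in >= v_cap:
        raise ValueError("no steady state: supply exceeds consumption capacity")
    return km2 * v_in / (v_cap - v_in)


def concentration_control_coefficient(enzyme, e1=1.0, e2=1.0, rel_step=1e-6):
    """Central-difference estimate of (dX/dE)*(E/X) for enzyme 'e1' or 'e2'."""
    def x_at(scale):
        if enzyme == "e1":
            return steady_state_x(e1 * scale, e2)
        if enzyme == "e2":
            return steady_state_x(e1, e2 * scale)
        raise ValueError("enzyme must be 'e1' or 'e2'")

    x0 = steady_state_x(e1, e2)
    dx = x_at(1.0 + rel_step) - x_at(1.0 - rel_step)
    return (dx / x0) / (2.0 * rel_step)


c_producer = concentration_control_coefficient("e1")
c_consumer = concentration_control_coefficient("e2")
print(f"producer E1: {c_producer:+.3f}, consumer E2: {c_consumer:+.3f}")
```

In this toy case the producing step has a positive coefficient and the consuming step a negative one of equal magnitude, consistent with the summation theorem, by which the concentration control coefficients on a given metabolite sum to zero.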

The effect of LDH, ENO, or PYK inhibition on the levels of Fru1,6BP and DHAP was evaluated in the AS-30D kinetic model. LDH showed low control on their concentrations (−0.4 and −0.02; **Table 1**) since an 80% decrease in its activity only induced a marginal increase in their concentrations (**Figure 1**); identical results were attained with PGK and PGAM (data not shown). In contrast, a similar inhibition of ENO and PYK activities led to marked accumulation of Fru1,6BP and DHAP (**Figures 1A,C**). Similar results were obtained with the HeLa model under hypoglycemia when modulation of the GAPDH, PYK, PGK, PGAM, and LDH activities was simulated (**Figures 1B,D**). These analyses strongly suggested that the increase in Fru1,6BP and DHAP levels reported in AS-30D, Ehrlich ascites, sarcoma 37 ascites and HeLa cells treated with oxamate (Goldberg et al., 1965; Elwood, 1968; Coe and Strunk, 1970; Moreno-Sánchez et al., 2016) was a consequence of ENO and PYK inhibition rather than of LDH inhibition, contrary to the most common interpretation.

TABLE 1 | Concentration control coefficients of glycolytic steps on Fru1,6BP and DHAP obtained in silico using the updated kinetic models of glycolysis in AS-30D and HeLa cells (see Section Materials and Methods) under normoxia (20% O2).

reference 100% enzyme activity values were those corresponding to the respective Vmax values for the forward reaction whereas Fru1,6BP and DHAP concentrations were those predicted by each model (A,C for AS-30D cells; B,D for hypoglycemic HeLa cells). When two enzymes were simultaneously titrated, identical variation in the activities was applied. In the case of LDH, ENO, and GAPDH, a decrease of the Vmaxf value was accompanied by a proportional decrease in the Vmaxr value.

To further establish the influence on the glycolytic flux and ATP concentration of the inhibitory effect of Fru1,6BP and DHAP on flux-controlling (HK and HPI) and non-flux-controlling (TPI and GAPDH) enzymes, several simulations were made with the AS-30D model (**Figure 2**). When inhibition by Fru1,6BP and DHAP was not included in the rate equations of HK, HPI, TPI, and GAPDH, the decrease of ENO and PYK activities still increased the levels of Fru1,6BP and DHAP, but the ATP concentration and glycolytic flux were not modified (**Figure 2**). A similar effect was observed when inhibition by Fru1,6BP was included in the TPI and GAPDH rate equations (data not shown). These last observations indicated that inhibition of ENO and PYK activities per se was not sufficient to decrease the glycolytic flux. Only when the Fru1,6BP and DHAP inhibitions were included in the HPI and HK rate equations did the glycolytic flux and ATP concentration decrease (**Figure 2**).

Furthermore, the Fru1,6BP and DHAP levels indeed increased in cells treated with oxamate (reported by Moreno-Sánchez et al., 2016) or iodoacetate (**Table 2**). These experimentally determined metabolite concentrations were also closely simulated by kinetic modeling when decreasing the ENO+PYK activities by ∼75% and including the Fru1,6BP and DHAP inhibition of HK, or HPI, or both HK+HPI, or HPI+HK+TPI+GAPDH (**Figure 2**). Hence, the interactions of these metabolites with HK and HPI are apparently also involved in the mechanisms of control of their own intracellular levels. Incubation with oxamate or iodoacetate promoted a severe decrease (3.5–4.6 times vs. control) in the intracellular ATP (**Table 2**). Although cell viability remained high (>90%), it is possible that these inhibitors also affect mitochondrial function, thus perturbing the cell ATP levels (Martin-Requero et al., 1986; Cano-Ramírez et al., 2012). Since a significant HK fraction in cancer cells is bound to mitochondria, OxPhos also provides ATP for this glycolytic reaction. However, this interplay between glycolysis and OxPhos through the ATP/ADP ratio has not been included in the present kinetic models because the subcellular distribution of the HK isoforms has not been determined under the different O<sub>2</sub> and glucose culture conditions.

FIGURE 2 | Effect of ENO and PYK inhibition on Fru1,6BP, DHAP, and ATP concentrations and glycolytic flux. The kinetic model for AS-30D glycolysis was used. ENO plus PYK activities were modulated and the effects on Fru1,6BP (A), DHAP (B), glycolytic flux (C), and ATP (D) were determined. Several simulations were made with or without enzyme inhibition by Fru1,6BP and DHAP on no enzyme (a); on HPI (b); on HK (c); on both HPI and HK (d); and on HPI, HK, TPI, and GAPDH (e). The respective HPI, HK, TPI, and GAPDH rate equations with Fru1,6BP and DHAP inhibition are depicted in the Methods Section.

### Accumulation of Toxic Metabolites Contributes to Decrease Cancer Glycolysis

Another possible consequence of DHAP accumulation is the production of methylglyoxal, which can also inhibit the glycolytic flux in cancer and normal cells (Leoncini et al., 1989; Halder et al., 1993; Biswas et al., 1997). In this regard, cells treated with oxamate showed increased methylglyoxal levels (**Table 2**). Similarly, significant increases in Fru1,6BP, DHAP, and methylglyoxal were observed in cells treated with iodoacetate (**Table 2**). This last inhibitor primarily affects GAPDH (Sabri and Ochs, 1971), which also exerts control on the concentrations of Fru1,6BP and DHAP (**Table 1**). In addition, in the iodoacetate-treated cells, significant decreases in the ATP concentrations and glycolytic flux were observed with respect to control cells, whereas the Glc6P and Fru6P levels did not change (**Table 2**). All these changes in metabolites and glycolytic flux were similar to those previously observed in cells treated with oxamate (**Table 2**; and Moreno-Sánchez et al., 2016). Therefore, it is suggested that iodoacetate induces glycolysis inhibition mainly through accumulation of Fru1,6BP and DHAP.

Methylglyoxal affects PYK and GAPDH activities (Leoncini et al., 1989; Halder et al., 1993; Biswas et al., 1997). However, in cells treated with iodoacetate (2 mM) or oxamate (20 mM) no changes were observed in these enzyme activities (data not shown). This discrepancy may be attributed to the high concentrations of methylglyoxal (2.5 mM) used in previous papers (Leoncini et al., 1989; Halder et al., 1993). Alternatively, the glyoxalase system in AS-30D cells might be highly efficient for methylglyoxal detoxification, a hypothesis that remains to be experimentally tested.

TABLE 2 | Fru1,6BP and DHAP concentrations in AS-30D cells treated with oxamate or iodoacetate.

Cells were incubated in Krebs-Ringer medium with no glucose added at 37°C for 1 h with the indicated inhibitor concentrations. Then, 5 mM glucose was added and the metabolite concentrations and glycolytic flux were determined after 10 min. Metabolites in mM. Fluxes in nmol/min\*mg protein.

<sup>a</sup>For comparison, these values were taken from Moreno-Sánchez et al. (2016).

<sup>b</sup>Methylglyoxal in nmol/mg protein.

<sup>c</sup>The independent experiment values showed a 15% difference between them.

<sup>d</sup>The limit of methylglyoxal detection was ∼0.3 nmoles.

Asterisks indicate statistically significant differences compared with control (\*P ≤ 0.05, \*\*P ≤ 0.0005) using Student's t-test for non-paired samples. The data shown represent the mean ± standard deviation with the number of independent preparations assayed between parentheses. N.M. not measured.

### Effect of the Inhibition Mechanism of HPI on Pathway Properties

In the previous sections it was shown that it is indeed possible to significantly inhibit glycolysis by affecting non-flux-controlling enzymes. Traditionally, competitive inhibitors have been studied or designed for drug therapy. However, this type of inhibitor induces substrate accumulation, which in turn eventually displaces the inhibitor from the enzyme binding site, attenuating its inhibitory impact. Therefore, it was of interest to test with the present kinetic model the effect of different types of inhibition mechanisms on the pathway flux. This is relevant because the experimental results above showed accumulation of metabolites that affect the activities of the main controlling enzymes. Therefore, the kinetics of a flux-controlling step such as HPI was analyzed. This enzyme catalyzes a monosubstrate reaction and is strongly regulated by Fru1,6BP, Ery4P, and 6PG (Supplementary Table 2), which are competitive inhibitors (Marín-Hernández et al., 2011). In these simulations, DHAP was not included as an HPI inhibitor because it is inhibitory only at high concentrations (KiDHAP = 9.4 mM, whereas physiological concentrations of DHAP in HeLa cells are 0.5–0.8 mM; Marín-Hernández et al., 2014).

The first versions of the kinetic model of cancer glycolysis previously published predicted low levels of Glc6P and high glycolytic flux, which were in disagreement with the experimental values. The model refinement process indicated that HPI activity should be inhibited to properly simulate the experimental values (Marín-Hernández et al., 2011). Thus, it was experimentally determined that physiological levels of Ery4P, 6PG, and Fru1,6BP competitively inhibited HPI activity vs. Fru6P or Glc6P (Marín-Hernández et al., 2011; Moreno-Sánchez et al., 2016; published data summarized in Supplementary Table 2). Then, multiple competitive-type inhibition was incorporated in the HPI rate equation to accurately simulate the experimental Glc6P concentrations and glycolytic flux (Marín-Hernández et al., 2011). Here, the effect that different types of HPI inhibition have on pathway metabolite concentrations and flux was explored to establish which kind of inhibitor is more efficient in blocking controlling steps and glycolytic flux.

For competitive inhibition, the effect of changing the affinities (1/Ki) of HPI inhibitors was modeled. A decrease of 90% in the Ki value (i.e., the affinity values for inhibitors were increased 10-fold) increased the HPI flux-control coefficient to a value of 0.27 (**Figure 3A**). The reason for this behavior was that accumulation of Glc6P (**Figure 3C**) attenuated the binding of the physiological inhibitors to HPI and also exerted strong inhibition on HK. Furthermore, the HPI flux control remained essentially unchanged (0.25) when the Ki value was decreased 100-fold (Ki = 0.01; **Figure 3A**), but the glycolytic flux drastically decreased as a consequence of HK inhibition by accumulated Glc6P.

When HPI inhibitors were all considered as uncompetitive or mixed-type inhibitors in the kinetic model of hypoglycemic HeLa cells, it was necessary to decrease their affinities by 6–10 times (i.e., their Ki values were increased 6–10-fold) to keep the HPI flux-control coefficient, pathway flux, and Glc6P concentration unaltered (**Figures 3A–C**, respectively). However, it was noted that with uncompetitive inhibition, an increase in the Ki values by only three-fold yielded a high flux control coefficient of 0.65 with concomitant remarkable suppression of pathway flux and accumulation of Glc6P. This enhanced accumulation of substrates of the inhibited enzyme by uncompetitive inhibitors (vs. competitive inhibitors) was envisioned three decades ago (Cornish-Bowden, 1986), but perhaps because few examples of uncompetitive inhibition have been found, studies on this issue have not been developed. With mixed-type inhibition, the three-fold increase in Ki values brought about milder effects on HPI flux control, pathway flux, and Glc6P concentration.
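The substrate accumulation envisioned by Cornish-Bowden (1986) can be sketched numerically. The example below is not the HeLa model. It assumes a single irreversible Michaelis-Menten enzyme that must carry a fixed upstream flux v, and it solves each rate law for the substrate concentration needed to sustain that flux. The function name and all parameter values are hypothetical.

```python
import math


def substrate_for_flux(v, vmax, km, i, ki, mechanism):
    """Substrate concentration at which the inhibited enzyme carries flux v.

    Requires 0 < v < vmax. Returns math.inf when no finite concentration suffices.
    """
    factor = 1.0 + i / ki
    if mechanism == "competitive":
        # v = vmax*S / (km*factor + S), solved for S
        return v * km * factor / (vmax - v)
    if mechanism == "uncompetitive":
        # v = vmax*S / (km + S*factor), solved for S
        headroom = vmax - v * factor
        return v * km / headroom if headroom > 0.0 else math.inf
    raise ValueError("mechanism must be 'competitive' or 'uncompetitive'")


# Hypothetical values: the enzyme runs at half of its Vmax before inhibition.
for inhibitor in (0.0, 0.5, 0.9, 1.0):
    s_comp = substrate_for_flux(0.5, 1.0, 1.0, inhibitor, 1.0, "competitive")
    s_unc = substrate_for_flux(0.5, 1.0, 1.0, inhibitor, 1.0, "uncompetitive")
    print(f"[I]={inhibitor}: competitive S={s_comp:.2f}, uncompetitive S={s_unc:.2f}")
```

Under these assumptions the substrate level required with a competitive inhibitor rises linearly with the inhibitor concentration, whereas with an uncompetitive inhibitor it diverges once the inhibited Vmax falls to the imposed flux, so the flux itself must drop.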

These in silico simulations suggested that both uncompetitive and mixed-type inhibition can perturb the pathway flux to a significantly greater extent than competitive inhibition, because these types of inhibition affect Vmax (which is not altered by competitive inhibitors) and catalytic efficiency (Vmax/Km). Recall that the Vmax value is directly linked to the content of active enzyme (Vmax = kcat × [enzyme]total) and hence to transcriptional/translational regulation and protein degradation. The design of new inhibitors should consider the uncompetitive and mixed-type inhibition mechanisms to generate potent drugs against cancer glycolysis.

#### Docking Analysis Predicts Potency of Regulatory Metabolites of HPI

To explore why regulatory metabolites may have different potencies, a docking analysis of metabolic inhibitors was performed on HPI. This analysis showed that the HPI competitive inhibitors can indeed adequately bind to the substrate binding site and be stabilized by the same amino acids in the active site (**Figure 4**). The binding energies were −5.62 (Ery4P), −4.57 (6PG), −3.63 (Fru1,6BP), and −2.64 (DHAP) kcal/mol. The estimated Ki values (in mM) were 0.076 (Ery4P), 0.45 (6PG), 2.2 (Fru1,6BP), and 11.6 (DHAP), which indicated that Ery4P and 6PG bind more tightly to the enzyme active site than the other two metabolites. These results correlated with previous data indicating that the most potent HPI inhibitors are Ery4P (Ki = 0.8–2.5 µM) and 6PG (Ki = 6.8–18 µM; Marín-Hernández et al., 2011). The discrepancy between the theoretical and experimental Ki values may be due to limitations in the docking procedure, since for the analysis the enzyme structures were considered rigid whereas only the ligands were flexible (because of the limit on the number of rotatable bonds that can be assigned). However, enzyme structures are flexible and, upon substrate or modifier binding, the active sites generally undergo conformational changes that in most cases favor tighter ligand coupling. Despite these limitations, the docking data analysis predicted the order of binding efficiency and potency of the HPI inhibitors. Also, docking analysis showed that Fru1,6BP can bind to the active sites of HK, TPI, and GAPDH (Supplementary Figure 1).
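The estimated Ki values are related to the docking binding energies through Ki = exp(ΔG/RT). The sketch below reproduces that conversion for the four HPI ligands. The temperature of 298.15 K is an assumption, since the text does not state it, and the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ki_from_binding_energy(delta_g_kcal_per_mol, temperature_k=298.15):
    """Inhibition constant (mol/L) from a binding free energy: Ki = exp(dG/(R*T))."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))


# Binding energies (kcal/mol) reported in the text for the HPI ligands.
binding_energies = {"Ery4P": -5.62, "6PG": -4.57, "Fru1,6BP": -3.63, "DHAP": -2.64}
for ligand, delta_g in binding_energies.items():
    ki_mm = ki_from_binding_energy(delta_g) * 1e3  # convert mol/L to mM
    print(f"{ligand}: estimated Ki = {ki_mm:.3g} mM")
```

Under the assumed temperature, the computed values agree, to within rounding, with the estimated Ki values quoted above (0.076, 0.45, 2.2, and 11.6 mM).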

### DISCUSSION

### Feedback Inhibition by Glycolytic Intermediaries on Flux Controlling Enzymes

A recent study by our group (Moreno-Sánchez et al., 2016) showed that oxamate inhibition of cancer glycolysis was mediated by the direct moderate inhibition of several pathway sites such as LDH, PYK, and ENO. The simultaneous oxamate inhibition of these non-controlling enzymes induced the accumulation of Fru1,6BP and DHAP, which behaved as inhibitors of HK, HPI, TPI, and GAPDH. In the present study, it was shown that inhibition of downstream non-controlling enzymes may affect pathway flux only if Fru1,6BP and DHAP accumulate. It was also shown that the mechanistic basis of this glycolysis suppression was the specific blockade of the steps with predominant control on the Fru1,6BP and DHAP concentrations, which were PYK, ENO, and GAPDH, but not LDH or PGK. Perturbation of other pathways by inhibiting non-controlling steps may occur as long as the ensuing accumulation of metabolites affects the activities of the main controlling steps. For instance, inhibition of malate dehydrogenase, a non-controlling step of the Krebs cycle, brings about accumulation of NAD+, malate, fumarate, and succinate, and high levels of these metabolites may alter the activities of isocitrate and 2-oxoglutarate dehydrogenases, the main controlling steps of the Krebs cycle.

Although iodoacetate is an unspecific drug that may covalently alkylate thiol groups at the active sites of many enzymes and hence may show toxicity, treatment of Ehrlich ascites carcinoma-bearing mice with iodoacetate significantly increases the median cumulative survival time and percentage of survivors, and decreases the tumor size (Fahim et al., 2003). Similarly, oxamate is able to inhibit chondrosarcoma and nasopharyngeal carcinoma growth in nude mice (Li et al., 2013; Hua et al., 2014). It should be noted that the observed improvement in tumor-bearing animals treated with these compounds is the result of several combined processes, including abolishment of tumor glycolysis and activation of several rescue pathways such as the immune system (Rheins et al., 1975) and antioxidant defense (Fahim et al., 2003).

Although the drugs tested (oxamate and iodoacetate) might have similar effects on non-cancer cells, both inhibitors are well tolerated in animals and human non-cancer cell lines, suggesting that normal cells are less sensitive to glycolysis inhibition, likely because their proliferation depends less on glycolysis than that of tumor cells. In addition, glycolysis inhibition of tumor-associated fibroblasts (reverse Warburg effect) may also be beneficial to deter tumor growth (Pavlides et al., 2009; Martinez-Outschoorn et al., 2011).

As suggested by the data of the present study, the observed anticancer effect of these unspecific drugs (Fahim et al., 2003; Li et al., 2013; Hua et al., 2014) may be associated with the inhibition of glycolysis mediated by (i) the accumulation of Fru1,6BP and DHAP, which in turn inhibit the main flux-controlling enzymes HK and HPI (**Figure 5**); and (ii) the accumulation of methylglyoxal. For cancer treatment, this mechanism may help in the design of new strategies to inhibit essential metabolic pathways such as OxPhos, the antioxidant system, and anabolic routes.

Based on the in silico kinetic modeling analysis indicating that both uncompetitive and mixed-type inhibitors can perturb the pathway flux and metabolite concentrations to a significantly greater extent than competitive inhibitors, it is concluded that elevation of the Fru1,6BP levels will have a more severe depressing effect on the glycolytic flux of cancer cells than elevation of the DHAP levels, because the former behaves as a mixed-type inhibitor of HK whereas the latter competitively inhibits HPI (Moreno-Sánchez et al., 2016).

### Synthesis of Toxic Metabolites for Cancer Glycolysis

Fru1,6BP is a product of the PFK-1 reaction and a weak inhibitor of HK, HPI, GAPDH, and TPI (Marín-Hernández et al., 2011; Moreno-Sánchez et al., 2016). In addition to being an activator of PYKM2, Fru1,6BP may indirectly inhibit oxidative phosphorylation (Mazurek et al., 2002; Díaz-Ruiz et al., 2008). On the other hand, DHAP is one of the two products of Fru1,6BP breakdown; it is a weak HPI inhibitor (Moreno-Sánchez et al., 2016) and, together with glyceraldehyde-3-phosphate, represents the most important endogenous source of methylglyoxal (Allaman et al., 2015). The latter compound is one of the most potent glycating agents naturally produced within cells; it reacts with proteins, lipids, and nucleic acids to form advanced glycation end products (AGEs; Allaman et al., 2015). High levels of this metabolite can be reached when the concentrations of its precursors are elevated, such as in impaired glucose utilization and TPI deficiency (Ahmed et al., 2003). The glyoxalase system is the main ubiquitous pathway for methylglyoxal detoxification (**Figure 5**) and is involved in tumor development, growth, migration, apoptotic evasion, and multidrug resistance. Increased levels and activities of glyoxalases 1 and 2 have been observed in diverse types of cancer (bladder, breast, colon, lung, and prostate) (Thornalley and Rabbani, 2011; Geng et al., 2014). Therefore, they have been considered as malignancy biomarkers and potential anti-cancer targets.

One attractive novel approach for targeting cancer cells, derived from the present study and deserving further experimental assessment, is the use of inhibitors of GAPDH, ENO, and PYK together with glyoxalase inhibitors, which at relatively low doses do not perturb host homeostasis. This particular multi-drug treatment would induce DHAP accumulation, which in turn would lead to enhanced levels of methylglyoxal, severely compromising cancer cell growth and viability. Although methylglyoxal has several toxic effects, it has shown anticancer activity in tumor-bearing mice with only slight side effects (Ghosh et al., 2006). Furthermore, high concentrations of methylglyoxal (2–7.5 mM) strongly inhibit OxPhos and glycolysis, drastically decreasing the ATP level in cancer cells and apparently showing no effect on normal cells and tissues (Ray et al., 1991; Biswas et al., 1997). In tumor-bearing animals, similarly high methylglyoxal concentrations in blood (13–19 mM) had no apparent toxic effect on vital organs (liver, kidney) but increased their life span by inhibiting tumor growth (Ghosh et al., 2006). This therapeutically exciting difference has been attributed to alterations in complex I and GAPDH of tumor cells that increase their sensitivity to methylglyoxal with respect to non-tumor enzymes (Biswas et al., 1997; Ray et al., 1997). Other reports indicate that methylglyoxal (30 µM) inhibits complex III and ATP synthesis in vascular smooth A-10 cells (Wang et al., 2009).

### Uncompetitive Inhibition Is the Most Potent Mechanism to Block Glycolytic Flux

There are three mechanisms by which a reversible inhibitor may interact with an enzyme: competitive, uncompetitive, and mixed-type inhibition (Segel, 1975); non-competitive inhibition should be considered as a special, uncommon case of mixed-type inhibition. A molecule that is structurally similar to the natural substrate may reversibly bind to the enzyme active site and act as a competitive inhibitor. In this regard, docking simulations were performed to support this assumption for HPI, since the regulatory metabolites Fru1,6BP, Ery4P, 6PG, and DHAP readily bind to the substrate binding site and are stabilized by the same amino acids involved in Glc6P and Fru6P binding (**Figure 4**). Thus, competitive inhibitors are common in metabolic pathways because the products of each reaction and several other pathway intermediaries have structural similarity with the substrate. As a consequence, Fru1,6BP, Ery4P, and 6PG behave as competitive inhibitors of HPI vs. the substrate Glc6P and product Fru6P, regulating the supply of Glc6P for the pentose phosphate and glycogen synthesis pathways.

Nevertheless, competitive inhibitors can also be readily displaced from the active site by high substrate concentrations, thereby restoring enzyme activity. Thus, the physiological effect of competitive inhibitors is to provide an immediate response of the targeted enzyme/transporter, which will be attenuated in the medium term. Hence, although competitive inhibitors are easier to find in nature or to design and manufacture, they are not pharmacologically efficient drugs.

In contrast, the effects of uncompetitive and mixed-type inhibitors cannot be overcome by increasing the substrate concentration; in fact, uncompetitive inhibition becomes more pronounced at increasing substrate concentrations (Cornish-Bowden, 1986). Using a kinetic model of glycolysis in the parasite Trypanosoma brucei, it was concluded that inhibition of pyruvate transport would be more effective for perturbing the pathway with an uncompetitive inhibitor (followed by mixed-type) than with a competitive one (Eisenthal and Cornish-Bowden, 1998). Although uncompetitive inhibitors are not common, there are recent reports about the identification of uncompetitive inhibitors of human γ-glutamyl transpeptidase and P-glycoprotein, proteins that can play an important role in drug resistance in cancer (Wickham et al., 2013; Teng et al., 2015).
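This contrast can be made concrete with a minimal numerical sketch, assuming a single irreversible Michaelis-Menten enzyme with hypothetical parameters (it is not one of the pathway models). The function returns the inhibited rate as a fraction of the uninhibited rate at the same substrate concentration.

```python
def remaining_activity(s, km, i, ki, mechanism, alpha=1.0):
    """Inhibited rate divided by uninhibited rate at substrate concentration s."""
    comp = 1.0 + i / ki              # factor acting on Km (inhibitor binds free enzyme)
    uncomp = 1.0 + i / (alpha * ki)  # factor acting on S (inhibitor binds the ES complex)
    if mechanism == "competitive":
        denominator = km * comp + s
    elif mechanism == "uncompetitive":
        denominator = km + s * comp
    elif mechanism == "mixed":
        denominator = km * comp + s * uncomp
    else:
        raise ValueError("unknown mechanism")
    return (km + s) / denominator


# Hypothetical values: Km = 1 and [I] = Ki = 1 (arbitrary units).
for substrate in (0.1, 1.0, 10.0, 100.0):
    competitive = remaining_activity(substrate, 1.0, 1.0, 1.0, "competitive")
    uncompetitive = remaining_activity(substrate, 1.0, 1.0, 1.0, "uncompetitive")
    print(f"S={substrate}: competitive {competitive:.3f}, uncompetitive {uncompetitive:.3f}")
```

As the substrate concentration increases, the remaining activity approaches 1 with the competitive inhibitor, whereas with the uncompetitive inhibitor it falls toward 1/(1 + [I]/Ki).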

Uncompetitive and mixed-type inhibitions modify the Vmax value, which is a kinetic parameter that has a strong influence on the degree of control that each pathway enzyme exerts (Marín-Hernández et al., 2014). GLUT was the main controlling step of glycolysis in HeLa hypoglycemic cells and T. brucei because its activity (i.e., Vmax) was the lowest (Bakker et al., 1999; Marín-Hernández et al., 2014). In contrast, PFK-I has low activity in tumor cells but it has no control on the pathway flux because Fru2,6BP activation increases several-fold its activity (Moreno-Sánchez et al., 2012).
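The dependence of Vmax on the inhibition type can be made explicit with the standard textbook rate law for mixed-type inhibition. The symbols $K_{ic}$ and $K_{iu}$ (competitive and uncompetitive inhibition constants) are notation assumed here for illustration and are not taken from the original article:

$$
v = \frac{V_{max}\,[S]}{K_m\left(1 + \dfrac{[I]}{K_{ic}}\right) + [S]\left(1 + \dfrac{[I]}{K_{iu}}\right)},
\qquad
V_{max}^{app} = \frac{V_{max}}{1 + [I]/K_{iu}},
\qquad
K_m^{app} = K_m\,\frac{1 + [I]/K_{ic}}{1 + [I]/K_{iu}}
$$

For a purely competitive inhibitor ($K_{iu} \to \infty$), $V_{max}^{app} = V_{max}$ and only $K_m^{app}$ increases, so saturating substrate restores the full rate. For uncompetitive ($K_{ic} \to \infty$) and mixed-type inhibitors, $V_{max}^{app} < V_{max}$ at any substrate concentration.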

### CONCLUSION

Kinetic modeling studies have shown that only the simultaneous inhibition of several flux-controlling steps will have significant impact on glycolytic flux and ATP concentration in cancer cells. This can be accomplished by direct inhibition using, preferentially, uncompetitive specific drugs or indirectly through the accumulation of regulatory metabolites of the flux-controlling steps by inhibiting enzymes that exert low flux-control.

### AUTHOR CONTRIBUTIONS

AM-H, IDM-M, and JSR-Z performed the in vitro and in silico experiments and docking analysis; AM-H, SR-E, RM-S, and ES planned experiments, analyzed data, contributed reagents, and wrote the paper.

#### ACKNOWLEDGMENTS

This research was partially supported by grants Nos. 180322 (AM-H), 178638 (ES), and 239930 (RM-S) from CONACyT-Mexico. Authors thank Dr. Ricardo Jasso Chávez (Instituto Nacional de Cardiología, SSA) for gas chromatography technical assistance.

#### REFERENCES

Fell, D. (1997). Understanding the Control of Metabolism. London: Portland Press.

Fischer, K., Hoffmann, P., Voelkl, S., Meidenbauer, N., Ammer, J., Edinger, M., et al. (2007). Inhibitory effect of tumor cell-derived lactic acid on human T cells. Blood 109, 3812–3819. doi: 10.1182/blood-2006-07-035972

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphys.2016.00412


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Marín-Hernández, Rodríguez-Zavala, Del Mazo-Monsalvo, Rodríguez-Enríquez, Moreno-Sánchez and Saavedra. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Insights into the Regulatory Role of Non-coding RNAs in Cancer Metabolism

Fredy O. Beltrán-Anaya, Alberto Cedro-Tanda, Alfredo Hidalgo-Miranda\* and Sandra L. Romero-Cordoba\*

Cancer Genomics Laboratory, National Institute of Genomic Medicine, Mexico City, Mexico

Cancer is a complex disease originating from alterations in several genes, leading to disturbances in signaling pathways that are important in tumor biology and favoring heterogeneity that promotes adaptability and pharmacological resistance of tumor cells. Metabolic reprogramming has emerged as an important hallmark of cancer, characterized by the presence of aerobic glycolysis, increased glutaminolysis and fatty acid biosynthesis, as well as altered mitochondrial energy production. The metabolic switches that support the energetic requirements of cancer cells are closely related to either activation of oncogenes or down-modulation of tumor-suppressor genes, finally leading to dysregulation of cell proliferation, metastasis and drug resistance signals. Non-coding RNAs (ncRNAs) have emerged as one important kind of molecule that can regulate altered genes contributing to the establishment of metabolic reprogramming. Moreover, diverse metabolic signals can regulate ncRNA expression and activity at genetic, transcriptional, or epigenetic levels. The regulatory landscape of ncRNAs may provide a new approach for the understanding and treatment of different types of malignancies. In this review we discuss the regulatory role exerted by ncRNAs on metabolic enzymes and pathways involved in glucose, lipid, and amino acid metabolism. We also review how metabolic stress conditions and the tumoral microenvironment influence ncRNA expression and activity. Furthermore, we comment on the therapeutic potential of metabolism-related ncRNAs in cancer.

#### Edited by:

Firas H. Kobeissy, University of Florida, USA

#### Reviewed by:

Guanglong Jiang, Indiana University School of Medicine, USA Noriko Hiroi, Keio University, Japan

#### \*Correspondence:

Alfredo Hidalgo-Miranda ahidalgo@inmegen.gob.mx Sandra L. Romero-Cordoba sromero\_cordoba@hotmail.com

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 25 May 2016 Accepted: 25 July 2016 Published: 08 August 2016

#### Citation:

Beltrán-Anaya FO, Cedro-Tanda A, Hidalgo-Miranda A and Romero-Cordoba SL (2016) Insights into the Regulatory Role of Non-coding RNAs in Cancer Metabolism. Front. Physiol. 7:342. doi: 10.3389/fphys.2016.00342

Keywords: metabolic reprogramming, cancer metabolism, ncRNA regulation, miRNAs

### METABOLIC REPROGRAMMING: CANCER METABOLISM CHANGING THE ENERGETIC STATE TO FULFILL CELLULAR REQUIREMENTS

Deregulation of cellular energetics has been pointed out as one of the emerging hallmarks of cancer, both during early cellular transformation and as a driving phenotype of several tumorigenic programs (Kroemer and Pouyssegur, 2008; Munoz-Pinedo et al., 2012). Under physiological conditions, cells maintain a regulated and complex metabolic homeostasis through diverse signaling pathways that function as energetic sensors. Metabolic sensors act within a network of cooperative signaling cascades, not only to fulfill the energetic requirements of the cells, but also to influence cellular processes such as cell growth, proliferation, and death (Dumortier et al., 2013). In contrast, cancer cells lose this regulated homeostasis in several ways, including alterations in the intrinsic and extrinsic molecular mechanisms that govern cellular metabolism, in order to provide the basic metabolic requirements of tumoral cells, such as quick biosynthesis of ATP, accelerated biosynthesis of macromolecules, and maintenance of optimal redox status (Cairns et al., 2011). To satisfy their metabolic needs, cancer cells also present changes in energetic pathways such as elevated glucose uptake, aerobic glycolysis and altered lipid and fatty acid metabolism (Newsholme et al., 1985; Vander Heiden et al., 2009). This advantageous bioenergetic state is not only related to the metabolic requirements imposed by the higher biological activity of the tumoral cells; it can also promote a proliferative phenotype and facilitate cell survival and movement under adverse conditions like hypoxia or glucose and nutrient deprivation, becoming a major player in the development and evolution of a tumor (Jones and Thompson, 2009).

This metabolic shift, known as metabolic reprogramming, has been correlated with the activity of oncogenes and loss of tumor suppressor molecules (Esquela-Kerscher and Slack, 2006). Furthermore, once a tumor has developed and reached a certain volume, it becomes difficult to maintain optimal oxygen levels in its cells, creating a hypoxic environment. This also promotes a metabolic reprogramming which includes an elevated glycolytic rate, preferentially through oxidative phosphorylation and suppression of gluconeogenesis, creating complex glucose-lactate fluxes, as well as a pro-tumorigenic environment (Reyes et al., 2014).

Non-coding RNAs (ncRNAs), mainly microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), have been defined as important regulators of several metabolic pathways. miRNAs are small ncRNAs (between 19 and 22 nt) with an important role as post-transcriptional regulators (Bartel, 2009). LncRNAs are transcripts from 200 nt to 100 kilobases (kb) lacking an open reading frame and without evident protein-coding function (Rinn and Chang, 2012). Both participate in many physiological processes through the modulation of gene expression at the epigenetic, transcriptional, and post-transcriptional levels (**Figure 1**).

ncRNAs can actively regulate energetic signaling by targeting key metabolic transporters and enzymes, or by directly or indirectly controlling the expression of tumor suppressors or oncogenes (Iorio and Croce, 2012). Analysis of the correlation between oncogenic programs, metabolic reprogramming and aberrant ncRNA expression has highlighted the crucial role of these metabolic aspects in the initiation, promotion, and progression of cancer (Arora et al., 2015).

Several lines of evidence suggest that ncRNAs play an important role in the establishment of metabolic reprogramming in cancer cells, as well as in the feedback regulation between alterations in energetic signaling and ncRNA expression or activity. In this review, we will discuss the evidence that describes the roles of ncRNAs as modulators of cancer metabolism and as molecules which contribute to the establishment of a diversity of mechanisms that govern the heterogeneity and plasticity of the energetic metabolism of cancer cells.

### ncRNAs REGULATE GLYCOLYTIC FLUXES: A SWEET STORY

One of the most significant changes induced by cancer metabolic reprogramming involves the bypass of oxidative phosphorylation (Tricarboxylic Acid cycle) to a non-oxidative pathway led by aerobic glycolysis and lactate production, in order to satisfy the energetic demands of the tumor cells (Vander Heiden et al., 2009). One of the better characterized metabolic phenotypes present in tumor cells is the Warburg effect, which gives preference to ATP generation through glycolysis, even under normal oxygen concentrations, over ATP synthesis through the electron transport chain in the mitochondria (Warburg, 1956; Gatenby and Gillies, 2004; Kim and Dang, 2006). Consequently, most of the glucose in the cell is converted to lactate, rather than being metabolized through the Krebs cycle (Warburg, 1956; Semenza et al., 2001; Gatenby and Gillies, 2004). Although the energetic balance established by glycolysis is less efficient (lower quantity of ATP generated per unit of glucose) than oxidative phosphorylation, it is quicker. However, oxidative phosphorylation is not completely abolished and still functions at a low level (**Figure 2A**). Therefore, this abnormal and accelerated metabolism meets the energetic needs for cellular functions and for the construction of biological building blocks (fatty acids, lipids, nucleotides, and proteins) for cancer cells (Zheng, 2012).

The first step in energy metabolism is the entry of glucose into the cells through glucose transporters (GLUTs). To date, 14 isoforms of GLUTs have been identified, of which GLUT1, 2, 3, and 4 are well-characterized and expressed in different tissues, some of them in a specific manner (Thorens and Mueckler, 2010). ncRNAs actively regulate the intracellular glucose levels by modulating gene expression of glucose transporters. For instance, GLUT1 is targeted by miR-340, which is up-regulated in oral squamous cell carcinoma (Xu et al., 2016). In renal cell tumors, miR-199a, miR-138, miR-150, and miR-532-5p down-regulate GLUT1 expression, whereas miR-130b, miR-19a/b, and miR-301a increase GLUT1 (Chow et al., 2010). Additionally, loss of miR-1291 enhances the development of renal tumors through targeting GLUT1 (Yamasaki et al., 2013). In prostate tumors, the PCGEM1 lncRNA promotes the expression of GLUT1. Similarly, lncRNA-p21 expression is related to HIF-1α and its responsive genes, such as GLUT1, promoting its expression in diverse cancer cell lines (Yang et al., 2014). In bladder cancer, down-modulation of miR-195-5p allows the expression of GLUT3 (Fei et al., 2012; Peschiaroli et al., 2013). Additionally, reduced levels of miR-150 negatively regulate GLUT4 expression in pancreatic cancer cells (Srivastava et al., 2011). Such alterations in the expression of ncRNAs and their effect on GLUT expression represent possible mechanisms through which tumors may bypass regulatory energetic checkpoints by promoting glycolysis, as well as other oncogenic pathways like proliferation, migration, and invasion (**Figure 2B**).
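The ncRNA-transporter relationships listed above can be kept in a small lookup structure when screening expression data. The following is a minimal sketch, not a curated database: the tuple layout and the `regulators` helper are hypothetical, and only relationships whose direction is stated unambiguously in the paragraph are included (the miR-340, miR-1291, and miR-150/GLUT4 statements are deliberately left out).

```python
# Illustrative sketch: records transcribed from the paragraph above.
# Layout (ncRNA, target, effect on target expression, context) is an assumption.
REGULATION = [
    ("miR-199a", "GLUT1", "down", "renal cell tumors"),
    ("miR-138", "GLUT1", "down", "renal cell tumors"),
    ("miR-150", "GLUT1", "down", "renal cell tumors"),
    ("miR-532-5p", "GLUT1", "down", "renal cell tumors"),
    ("miR-130b", "GLUT1", "up", "renal cell tumors"),
    ("miR-19a/b", "GLUT1", "up", "renal cell tumors"),
    ("miR-301a", "GLUT1", "up", "renal cell tumors"),
    ("PCGEM1", "GLUT1", "up", "prostate tumors"),
    ("lncRNA-p21", "GLUT1", "up", "diverse cancer cell lines"),
    # Down-modulation of miR-195-5p allows GLUT3 expression, i.e. it represses GLUT3.
    ("miR-195-5p", "GLUT3", "down", "bladder cancer"),
]


def regulators(target, effect=None):
    """Return ncRNAs reported to act on `target`, optionally filtered by effect."""
    return [
        ncrna
        for ncrna, tgt, eff, _context in REGULATION
        if tgt == target and (effect is None or eff == effect)
    ]


print(regulators("GLUT1", effect="down"))
print(regulators("GLUT3"))
```

A flat list of records is used here because the same ncRNA can act on several targets and in several tumor types; filtering by any field then needs no restructuring.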

ncRNAs can also affect the patterns and mechanisms of glucose uptake and glucose/lactate fluxes in cancer cells, promoting aggressive behavior through the establishment of a glycolytic phenotype. The CRNDE (Colorectal Neoplasia Differentially Expressed) lncRNA responds to insulin-like growth factors (IGF), promoting glucose uptake in colorectal cancer (Ellis et al., 2014). Furthermore, the over-expression of the ceruloplasmin lncRNA (NRCP) in ovarian and breast cancer cells, along with the LINK-A lncRNA in triple negative breast cancer, promotes glucose uptake, favoring lactate production and consequently enhancing tumor progression (Rupaimoole et al., 2015; Lin et al., 2016). In breast tumors, ncRNAs can also function as modifiers of the tumor microenvironment. Under metastatic conditions, tumor cells secrete vesicles that carry high levels of miR-122 to non-tumor cells, repressing glucose uptake in the normal cells and facilitating metastasis by increasing nutrient availability for the cancer cells (Fong et al., 2015; **Figure 2B**).

FIGURE 1 | Biological and mechanical overview of non-coding RNAs. (1, 2) Biogenesis of microRNAs and their main mechanisms of action. The pri-miRNA is transcribed by RNA polymerase II and digested by the RNase DROSHA, originating a pre-miRNA (70 nt), which is exported to the cytoplasm by exportin 5. Then, another RNase, Dicer, digests the pre-miRNA to generate a mature duplex miRNA (∼22 nt). One strand of this duplex is then incorporated in the miRISC complex (Ago2-microRNA) to target mRNA by perfect complementarity, producing transcript degradation (1), or by imperfect complementarity, promoting translation repression (2). On the right side, general functions of lncRNAs are described. (3) Recruitment of transcription factors to promote transcription of target genes or (4) recruitment of chromatin modifiers, thus (6) promoting remodeling of the chromatin architecture. Other functions of lncRNAs are (5) control of alternative splicing of mRNA, (7) control of translation rates, favoring or inhibiting polysome loading to mRNAs, and (8) acting as a decoy to preclude access of regulatory proteins to DNA. (9) The interaction between microRNAs and competing endogenous RNAs (ceRNAs) is a redundant system to regulate mRNA expression by lncRNAs-microRNAs; this mechanism is known as the sponge function of lncRNAs. Thus, microRNA sequestration by a lncRNA prevents the microRNA from acting on its target.

After glucose uptake, numerous enzymes take part in the catabolism of trioses, pyruvate, and finally lactate. Regulation of glycolytic enzymes by ncRNAs further increases this biological complexity. Hexokinases (HK) catalyze ATP-dependent phosphorylation of glucose to glucose-6-phosphate (Robey and Hay, 2006). Interestingly, HK2 is overexpressed in various tumors and contributes to the establishment of aerobic glycolysis (Mathupala et al., 2009; Vander Heiden et al., 2011). In lung, colon, prostate, and head and neck squamous cell cancers, loss of miR-143 allows HK2 expression (Fang et al., 2012; Peschiaroli et al., 2013). Similarly, the miR-143 locus is deleted in other malignancies (Volinia et al., 2010), and miR-143 has also been found down-regulated in cervical tumors (Michael et al., 2003; Lui et al., 2007). In bladder cancer cells, miR-155 represses miR-143, allowing up-regulation of HK2 (Jiang et al., 2012). Moreover, the up-regulation of hypoxia factors suppresses the expression of miR-199a-5p and promotes glycolysis in liver cancer, since the miRNA normally interferes with the expression of HK2 (Guo et al., 2015). The Urothelial Cancer-Associated 1 lncRNA (UCA1) modulates HK2 by activation of STAT3 through the repression of miR-143 (Li Z. et al., 2014). Another member of the hexokinases, HK1, is also regulated by miR-138 (Peschiaroli et al., 2013). Additionally, in colorectal cancer, rosmarinic acid suppresses miR-155, repressing the Warburg effect by inactivating the IL-6/STAT3 pathway (Xu et al., 2015).

#### FIGURE 2 | Continued

internalized into the cell through the membrane transporters GLUT1, 2, 3, and 4. Through a system of coupled enzymatic reactions, D-glucose is converted into pyruvate, which enters into the TCA cycle, and OXPHO. When the amount of oxygen is limited, pyruvate is converted into lactate. Conversely, in the mitochondria, the TCA cycle is coupled to OXPHO which represents the largest source of metabolic energy. Pyruvate is oxidized and converted into acetyl coenzyme A, which enters the TCA cycle that generates reducing molecules (NADH and FADH2) to produce ATP by oxidative phosphorylation. Finally, fatty acids can be converted into acetyl coenzyme A by ß-oxidation to then generate energy through the TCA cycle and OXPHO. (B) Glycolysis regulation by miRNAs and lncRNAs under oncogenic conditions. Expression of the GLUT transporter family is regulated by ncRNAs, thus altering the internalization rate of glucose. Other molecules are under ncRNAs regulation pathways, such as hexokinase-2 enzyme, which mediates the transformation of glucose to glucose 6-phosphate, PKM2 enzyme involved in pyruvate synthesis, LDHB and LDHA enzymes that convert pyruvate into lactate, and PDHK, responsible for the synthesis of Acetyl coenzyme A from pyruvate.

Another important intermediate step in glycolysis is the conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate by the aldolase enzyme, which is a direct target of miR-122 in liver cells (Fabani and Gait, 2008).

Pyruvate kinase (PKM) catalyzes the final rate-limiting step of glycolysis, generating two molecules of pyruvate and two molecules of adenosine triphosphate (ATP; Mazurek, 2011). MiR-124, miR-137, and miR-340 regulate alternative splicing of the PKM gene in colorectal cancer. The switch from isoform PKM2 to PKM1 inhibits the glycolysis rate and promotes oxidative phosphorylation (Sun et al., 2012). PKM2 is also regulated by miR-326, which is down-modulated in glioblastoma cells (Kefas et al., 2010). Furthermore, the pyruvate dehydrogenase protein X component (PDHX), part of the complex that catalyzes the conversion of pyruvate to acetyl coenzyme A, is down-modulated by miR-26a in colorectal cancer, thus impairing mitochondrial metabolism (Chen et al., 2014). Let-7 is a microRNA that is commonly down-regulated in several cancer types. Since PDK1 is a physiological target of let-7, the down-regulation of let-7 in tumors facilitates aerobic glycolysis. Furthermore, PDK1 is critical for Lin28A/B-mediated cancer proliferation, establishing a precise mechanism by which Lin28/let-7 facilitates the Warburg effect to promote cancer progression (Ma et al., 2014; **Figure 2B**).

Under aerobic glycolysis conditions, oncogenic lesions convert pyruvate to lactate through lactate dehydrogenase (LDH) to fulfill their energetic needs (Hatziapostolou et al., 2013). LDHB is a target of miR-375, which is down-regulated in esophageal squamous cell and maxillary sinus squamous cell carcinomas (Isozaki et al., 2012; Kinoshita et al., 2012). Another important enzyme is LDHA, frequently overexpressed in tumor cells and targeted by miR-34a, miR-34c, miR-369-3p, miR-374a, and miR-4524a/b, which are down-modulated in colorectal cancer tissues (Wang J. et al., 2015). Moreover, lncRNA-p21 positively modulates LDHA, Enolase 1, Pyruvate Dehydrogenase Kinase Isozyme 4 (PDK4), Phosphoglycerate mutase (PGAM2), Glucose-6-Phosphate Isomerase (GPI), and Pyruvate Kinase (PKM2) in diverse tumors (Hung et al., 2014). The ability of cells to maintain optimal lactate fluxes depends on monocarboxylate transporters (MCTs). Specifically, MCT1 is targeted by miR-29a, miR-29b, and miR-124 in pancreatic cancer (Pullen et al., 2011). Additionally, let-7b, usually inhibited in tumors, has been shown to target basigin (BSG), which interacts with MCT1 (Fu et al., 2011; **Figure 2B**).

Cancer cells reprogram their metabolism based on complex regulatory networks involving diverse oncogenes and tumor suppressor genes, including PI3K/Akt, Myc, hypoxia inducible factor (HIF), Ras, Src, p53, and PTEN, that promote increased glucose uptake and glycolysis (Dang et al., 2009; Luo and Semenza, 2011). Those genes are targets of ncRNA regulation networks in cancer (**Table 1**).

Not only can the human genome-encoded ncRNAs modulate glucose metabolism in cancer cells. Kaposi's sarcoma-associated herpesvirus (KSHV), the etiological agent of Kaposi's sarcoma, has been shown to express microRNAs in its genome that collaborate to induce aerobic glycolysis in infected cells, mainly through the down-regulation of EGLN2 and HSPA9, which cooperate to form the glycolytic phenotype (Yogev et al., 2014).

### LIPID METABOLISM: A FAT STORY

Lipids constitute a major building block for organelles and cells: they maintain cellular function and structure, provide energy, and orchestrate different cellular pathways. As part of lipid metabolism (anabolism and catabolism), a variety of biological intermediates are generated as second messengers (Huang and Freter, 2015), which manage multiple signaling pathways like cell growth, proliferation, differentiation, survival, apoptosis, inflammation, motility, and membrane homeostasis (Mattes, 2005; Krycer et al., 2010; Zechner et al., 2012). Alterations in lipid metabolism can affect cell function, promoting the establishment and development of cancer (Santos and Schulze, 2012). In fact, lipid biosynthesis is limited to a subgroup of tissues and organs, including adipose, liver, and breast, but its reactivation or reprogramming is commonly observed in tumor cells (Menendez and Lupu, 2007; Abramson, 2011; Beloribi-Djefaflia et al., 2016). The activation or inhibition of lipid signaling pathways is aimed at fulfilling the cell energy requirements and responds to environmental changes. There are numerous enzymes regulating lipid metabolism in the cells, and recent data show that the expression of many of these enzymes is regulated by ncRNAs (Huang and Freter, 2015; **Figure 3**).

LC, lung cancer; HCC, hepatocellular carcinoma; BlaCa, Bladder cancer; CC, colon cancer; BRCA, breast cancer; PCA, prostate cancer; GC, glioblastoma cancer.

In prostate cancer cells, miR-185 and miR-342 control lipogenesis and cholesterol synthesis by down-modulating the expression of sterol regulatory element binding protein 1 and 2 (SREBP-1, SREBP-2), repressing their responsive genes, including fatty acid synthase (FASN) and 3-hydroxy-3-methylglutaryl CoA reductase (HMGCR; Li X. et al., 2013). In lymphocytic leukemias, metabolic enzymes related with lipid biosynthesis, such as lipase A (LIPA) and pyruvate dehydrogenase lipoamide kinase isozyme 1 (PDK1), are targets of miR-125b (Tili et al., 2012). Recently, miR-205 has been associated with lipid metabolism de-regulation in hepatocellular carcinoma, acting on acyl-CoA synthetase long-chain family member 1 (ACSL1), a lipid metabolism enzyme in liver (Liu et al., 2012; Cui M. et al., 2014). Additionally, the loss of miR-122, an abundant liver-specific miRNA, alters fat and cholesterol metabolism through modulation of genes involved in lipid synthesis, including Agpat1, Mogat1, Agpat3, Agpat9, Ppap2a, Ppap2c (Hsu et al., 2012; Tsai et al., 2012).

Over-expression of miR-27a in hepatitis C virus-infected liver cells vs. hepatitis B virus-infected cells has been recently described. MiR-27a targets the lipid synthetic transcription factor RXR and the lipid transporter ATP-binding cassette subfamily A member 1 in hepatocarcinoma. Moreover, miR-27a down-modulates the expression of several lipid metabolism-related genes (FASN, SREBP1, SREBP2, PPARα, PPARγ, ApoA1, ApoB100, and ApoE3), some of which also participate in the production of infectious viral particles (Shirasaki et al., 2013).

The over-expression of HULC contributes to the malignant development of hepatocellular carcinoma by supporting abnormal lipid metabolism via activation of the acyl-CoA synthetase subunit ACSL1. This results in promotion of lipogenesis and the accumulation of intracellular triglycerides and cholesterol in experimental models. HULC induces methylation of the miR-9 promoter, regulating its expression and favoring alterations in lipid metabolism (Cui et al., 2015). LncRNA SPRY4-IT1 was first identified in adipose tissue (Ota et al., 2004) and was recently found up-regulated in melanoma (Khaitan et al., 2011). Expression of this lncRNA shows a strong correlation with lipid metabolism alterations, including the increase of acyl carnitine, fatty acyl chains, and triacylglycerol, as well as the down-modulation of phosphatidic acid, phosphatidylcholine, phosphatidylinositol, and phosphatidylserine. It is probable that the significant changes in lipid profiles are correlated with the oncogenic modulation of SPRY4-IT1 over the lipid phosphatase lipin 2, an enzyme that converts phosphatidate to diacylglycerol (Mazar et al., 2014).

The oncogene ANRIL is up-regulated in gastric, lung, hepatocellular, cervical, ovarian, and bladder cancer and in melanoma, among other tumors (Li Z. et al., 2016). Interestingly, ANRIL regulates genes involved in glucose and fatty acid metabolism (Bochenek et al., 2013), such as ADIPOR1. Furthermore, ANRIL can epigenetically regulate the expression of miRNAs in gastric cancer cells, particularly miR-99a/miR-449a, which target the CDK6/E2F1 and mTOR pathways (Zhang et al., 2014) that regulate lipid metabolism and adipose tissue function (Cai et al., 2015).

The steroid receptor RNA activator gene is an unusual gene that expresses two different transcripts by alternative splicing of the first intron: (1) the lncRNA SRA and (2) the SRAP protein gene (Hube et al., 2006). SRA co-activates PPAR-gamma, inducing adipogenesis (Xu et al., 2010; Liu et al., 2014), and it may also regulate lipid metabolism (Marion-Letellier et al., 2015). Interestingly, the over-expression of SRA has been associated with poor prognosis in endometrial cancer (Smolle et al., 2015). The lncRNA-DYNLRB2-2 responds to oxidized-LDL to promote ABCA1-mediated cholesterol efflux (Hu et al., 2014). In prostate cancer, the ox-LDL/lncRNA-DYNLRB2-2 circuit might be involved in the promotion of proliferation, migration and invasion rates (Wan et al., 2015). Experiments in animal models showed that lncLSTR (lncRNA-liver-specific triglyceride regulator), a liver-enriched lncRNA, physiologically contributes to triglyceride metabolism by enhancing the expression of Cyp8b1, a molecule involved in bile acid synthesis. Furthermore, Cyp8b1 is down-modulated in primary hepatocytes in which lncLSTR is depleted, suggesting a regulatory activity over Cyp8b1 as one of its downstream responsive genes (Li et al., 2015a).

#### AMINO ACID METABOLISM

Apart from other energetic sources, amino acids are important substrates that sustain mitochondrial metabolism and support the biosynthesis of proteins, lipids, and other macromolecules. Alterations in amino acid metabolism are also common in cancer cells (**Figure 3**).

Glutamine metabolism seems to have a critical role in cancer programs, and has been implicated in tumor formation and metastasis (DeBerardinis and Cheng, 2010), as well as being an important source of tumor energy (Li and Zhang, 2016). miRNAs have also been reported to regulate amino acid catabolism; for example, in kidney cancer miR-23b<sup>∗</sup> regulates proline oxidase, which is the first enzyme involved in the conversion of proline to glutamic acid (Liu et al., 2012). Interestingly, the lncRNA CCAT2 participates in the alternative splicing of glutaminase (GLS), an enzyme that catalyzes the hydrolysis of glutamine to glutamate (Redis et al., 2016); glutamate can be further deaminated to α-ketoglutarate by glutamate dehydrogenase (GDH) and incorporated into the tricarboxylic acid cycle (Li and Zhang, 2016). Another lncRNA involved in glutamine metabolism is PCGEM1, an androgen-induced prostate-specific lncRNA, which regulates expression of enzymes such as GLS, Glutathione Reductase (GSR), and type I gamma-glutamyltransferase (GGT1) in prostate tumors (Hung et al., 2014).

Redundant regulation by ncRNAs reveals that metabolic pathways in cancer are finely regulated, acting at different cellular levels. Consequently, understanding these processes will enable future development of anti-metabolite therapies to target specific energetic signals altered in oncogenic lesions.

#### MITOCHONDRIAL METABOLISM IN CANCER AND ITS RELATION WITH ncRNA

The partial maintenance of mitochondrial function in glycolytic cells appears essential for cancer cell development. Thus, the tumor must balance the bioenergetic requirements to grow, proliferate, and survive within the energetic restrictions and metabolic pathways. Mitochondria are, in fact, the main intracellular producers of reactive oxygen species (ROS) as part of adenosine triphosphate (ATP) production through oxidative phosphorylation (OXPHOS). This organelle is responsible for converting available nutrients into the fundamental building blocks required for cell maintenance (Ahn and Metallo, 2015), such as fatty acids, cholesterol and proteins (Kamphorst et al., 2013). Therefore, mitochondrial alterations have been implicated in the etiology of many diseases including cancer. The metabolic reprogramming of the mitochondrial network in tumoral programs is achieved through several mechanisms, including ncRNAs transcribed from both the nuclear and the mitochondrial genome (mtDNA). ncRNAs can actively regulate mitochondrial metabolism by controlling structural and functional mechanisms that respond to changes in energy requirements or environmental conditions (Benard et al., 2010; **Figure 4**).

For example, miR-210 is up-regulated by hypoxia (Dang and Myers, 2015), and can block mitochondrial respiration through down-modulation of the electron transport chain (ETC) complexes (Huang and Zuo, 2014). Particularly, miR-210 targets ISCU1 and ISCU2, suppressing mitochondrial function and disrupting iron homeostasis in colon, breast, and esophageal cancer (Chen et al., 2010; Favaro et al., 2010). In breast cancer cells, miR-378<sup>∗</sup> promotes a metabolic shift by inhibiting the expression of important regulators of energy metabolism such as estrogen-related receptor-γ and GA-binding protein transcription factor. This reduces the tricarboxylic acid cycle (TCA) rates, decreasing the dependency on OXPHOS, and increasing lactate production (Eichner et al., 2010). Similarly, in hepatocellular carcinomas miR-23a modulates a metabolic switch from OXPHOS to anaerobic glycolysis by targeting the glucose-6-phosphatase catalytic subunit (G6PC), which plays an important role in mitochondrial respiration (Wu et al., 1999; Wang et al., 2012). Likewise, overexpression of miR-125b in lymphocytic leukemia models represses many transcripts implicated in energetic and lipid metabolism including phosphatidylcholine transfer protein, lipase A, lysosomal acid, cholesterol esterase, glutathione synthetase, acyl-CoA synthetase short-chain family member 1, HK2, stearoyl-CoA desaturase 1, AKT2, and pyruvate dehydrogenase kinase 1 (PDK1; Tili et al., 2012).

Some of the most important by-products of the electron transport chain in the mitochondria are reactive oxygen species (mROS). Increased production of ROS can lead to activation of tumorigenic signaling and metabolic reprogramming. This tumorigenic signaling includes mechanisms to prevent imbalances in the production of mROS, maintaining redox homeostasis (Sullivan and Chandel, 2014). Emerging evidence shows that control of ROS levels is mediated in part by ncRNAs. One of the first pieces of evidence was the miR-17–92 cluster, overexpressed in small-cell lung cancer, which reduces DNA damage to a tolerable level and consequently leads to the accumulation of genetic instability (Ebi et al., 2009). miR-141 and miR-200a contribute to ovarian tumorigenesis by targeting p38α and modulating the oxidative stress response in mouse models (Mateescu et al., 2011). In addition, miR-21 and miR-34a promote tumor malignancy by the formation of ROS through the mediation of SOD3 and TNFα expression in cancer cells (Zhang et al., 2012). In medulloblastoma, miR-128a regulates ROS by specific inhibition of the Bmi-1 oncogene, which participates in maintaining mitochondrial function and redox homeostasis (Venkataraman et al., 2010). Let-7a promotes OXPHOS in breast cancer (Serguienko et al., 2015) and hepatocellular carcinoma by directly modulating PDK1, which, as mentioned previously, is a negative regulator of OXPHOS activity (Ma et al., 2014). In bladder cancer, the lncRNA UCA1 participates in ROS formation and promotes mitochondrial glutaminolysis by its sponge effect on miR-16 (Li H. J. et al., 2015). SOD2, which has response elements for NF-κB, wipes out the superoxide anion radicals generated by OXPHOS and converts them into hydrogen peroxide in cancer cells. Although it is known that the lncRNA Lethe prevents binding of NF-κB to NF-κB response elements, resulting in the suppression of SOD2 (Rapicavoli et al., 2013), the impact of Lethe on the energetic metabolism of cancer cells is poorly understood.

Apart from glucose, cancer cells exhibit increased glutamine intake and glutamine metabolism (glutaminolysis). The accelerated glutamine metabolism provides enough substrate to increase lipogenesis and nucleic acid biosynthesis, necessary for the proliferative phenotype of the cancer cells (Gao et al., 2009). Of particular importance, mitochondrial enzymes participate in the metabolism of glutamine and other metabolites (glutamate, proline, aspartate, and alanine) as part of the tumor programs (Dang, 2010). One of the major regulators of glutaminolysis is MYC. Along the same line, the suppression of miR-23a/b by MYC enhances mitochondrial glutaminase expression and glutamine metabolism in prostate cancer (Gao et al., 2009). Additionally, the deregulation of the HOTTIP lncRNA by miR-192 and miR-204 can produce abnormal glutaminolysis through positive regulation of GLS1 in hepatocellular tumors (Ge et al., 2015). Furthermore, the CCAT2 lncRNA modulates GLS alternative splicing through an allele-specific regulatory mechanism (Redis et al., 2016). Moreover, in bladder cancer cells the UCA1 lncRNA promotes glutamine metabolism through its sponge function over miR-16, allowing the expression of GLS2, an enzyme that participates in the hydrolysis of glutamine to glutamate (Li H. J. et al., 2015).

The involvement of mitochondrial miRNAs (mitomiRs) and mitochondrial lncRNAs in regulating the OXPHOS system is of particular interest. These regulatory molecules have either a prooxidant or antioxidant effect (Bai et al., 2011; Aschrafi et al., 2012; Li P. et al., 2012). Therefore, mitochondrial ncRNAs may participate in the fine-tuning of the mitochondrial energy supply. A recent study identified 13 miRNAs significantly enriched in mitochondria of HeLa cells, which actively participate in cell cycle and apoptosis through regulation of mitochondrial activity (Bandiera et al., 2011; Demongeot et al., 2013). The lncRNAs encoded by mtDNA, ASncmtRNA-1/2, are down-regulated in cancer cells and take part in the mitochondrial reprogramming of oncogenic pathways (Burzio et al., 2009). Biological activity of ASncmtRNAs results in survivin inhibition at the RNA level, probably mediated by microRNAs (Vidaurre et al., 2014). Survivin enhances the stability of oxidative phosphorylation complex II, which promotes cellular respiration (Rivadeneira et al., 2015).

Another type of non-coding RNA, the P-element-induced wimpy testis (PIWI)-interacting RNAs (piRNAs), has recently been recognized as relevant in cancer metabolic reprogramming. piRNAs are small non-coding RNAs (26–31 nt) that form the piRNA-induced silencing complex (piRISC). The main function of piRNAs is to silence transposable elements (TEs) in the germ line, but also in cancer cells (Siomi et al., 2011), mainly through epigenetic regulation, genome re-arrangement, and stem cell self-renewal (Ross et al., 2014). piRNA expression has been detected in mitochondrial RNAs of HeLa cells, and piRNAs are possibly implicated in diverse functions related to energetic homeostasis, bioenergetics and cell growth (Kwon et al., 2014).

### INTERPLAY BETWEEN ncRNAs, TUMOR MICROENVIRONMENT, AND METABOLIC CONDITIONS

Novel data suggest that the regulatory role of ncRNAs during carcinogenesis is not limited to cancer cells; they are also implicated in the activation of the tumor stroma and in its transition into a cancer-associated microenvironment. In fact, tumor development involves a fine interplay between malignant and stromal cells. Secreted ncRNAs can serve as regulatory signals promoting cancer cell proliferation, migration, communication, and stromal modification, thereby enhancing an optimal microenvironment for oncogenesis (Soon and Kiaris, 2013). The tumor microenvironment presents a complex architecture, comprising fibroblasts, vascular endothelial cells, immune cells, adipocytes, and extracellular matrix, which together form the stromal tissue that surrounds and interacts with tumor cells (Hanahan and Weinberg, 2011).

Importantly, cancer-associated fibroblasts (CAFs) can modify the metabolism of the adjacent cancer cells; as a consequence, their activity can promote tumor growth, invasion and angiogenesis (Franco et al., 2010). CAFs originate from normal fibroblasts (NFs) that are in contact with tumor cells, receiving and sending signals to co-evolve with the tumor cells and support their biological requirements. Communication pathways between CAFs and neoplastic cells include ncRNA-mediated signaling (**Table 2**; Erez et al., 2010).

Additionally, the metabolic status in cancer lesions is also balanced by different micro-environmental components. For instance, surrounding immune cells present active alternative pathways to overcome tumor energetic limitations. In particular, the metabolic switch in tumor cells promotes the presence of tumor-infiltrating lymphocytes (T cells) which is a crucial tumoral adaptation to dampen antitumor immunity (Molon et al., 2016; Zhao et al., 2016). Maintaining tumor metabolic homeostasis requires a balanced immune response, which is achieved by extracellular signals that can be induced or repressed by ncRNA activity (**Table 2**; Dumortier et al., 2013).

Adipocytes are another important component of the tumor microenvironment. They are considered an energy storage depot, as well as endocrine cells that produce hormones, growth factors, cytokines, and adipokines (Rajala and Scherer, 2003). Therefore, mature adipocytes influence tumor behavior through heterotypic signaling processes, providing fatty acids for rapid tumor growth, and also promoting homing, migration, and invasion of tumor cells (Nieman et al., 2011). ncRNAs can actively participate as important modulators of lipid metabolism in tumors where adipocytes represent the major component of the tumoral microenvironment (**Table 2**).

Emerging data suggest a fine regulatory loop between the HIF system, microenvironment and tumor cells, governed by diverse regulatory molecules like ncRNAs. Given the fact that HIF target genes include many metabolism-induced genes, such as ncRNAs (Semenza, 2010; Masson and Ratcliffe, 2014), and both tumor and stromal hypoxia, along with deregulated metabolism, characterize aggressive cancer phenotypes, it is tempting to conclude that activation and regulation of HIF pathways by complex signaling processes is one of the most important causes of deregulated tumor metabolism (Höckel and Vaupel, 2001; **Table 3**). A more detailed overview of hypoxia and lncRNAs is given in Chang et al. (2016).

#### TABLE 2 | ncRNAs and tumor microenvironment.

Endogenous and exogenous hormone-signaling pathways serve as metabolic regulatory networks that control fuel and energy metabolism in both tumor and stromal cells, and connect nutrient availability with cell growth and proliferation. ncRNA modulation by hormones can reinforce hormone-signaling activity. For example, insulin, a major hormone in the homeostasis of energy and metabolism, has been implicated in the regulation of miRNA expression (Granjon et al., 2009). Additionally, the estrogen receptor activates autophagic fluxes as a response to metabolic damage, in part by regulating ncRNA expression (Bernales et al., 2007; **Table 3**).

### ncRNAs MEDIATING METABOLIC STRESS RESPONSES: AUTOPHAGY, EMT, ANGIOGENESIS, AND INFLAMMATION

When metabolic stress triggers energetic and nutritional changes in tumor cells, the metabolic stress responses collaborate to maintain homeostasis. Metabolic changes take place in reaction to stress in the tumor and stromal cells through the activation of several mechanisms, including autophagy, epithelial-mesenchymal transition (EMT), angiogenesis, and inflammation.

Autophagy is a catabolic process indispensable for the maintenance of cellular homeostasis. Alterations of autophagy are described in cancer and are due to alterations in the expression of various genes that promote or suppress it (Lozy and Karantza, 2012). Autophagic programs consist of the degradation of cellular organelles, cytoplasmic proteins and lipids, allowing recycling of the resulting catabolites for biosynthesis and energy metabolism, in order to satisfy the nutrient, energy and hormonal demands of the tumor cells (Jing et al., 2015). The metabolic requirements of cancer cells are maintained, in part, by autophagy pathways present in tumor or stroma cells (Martinez-Outschoorn et al., 2011; Mathew and White, 2011). Considering the vast implications of ncRNAs in stress responses, their activity might contribute to the dynamics of autophagy during cancer progression (Leung and Sharp, 2010; **Table 4**).

Metabolic conditions, mainly hypoxia, can drive EMT through NF-κB, PI3K/Akt/mTOR, Notch, Wnt/β-catenin, and Hedgehog signaling pathways (Fan et al., 2013). EMT refers to a complex molecular and cellular program by which epithelial cells lose their epithelial attributes such as cell–cell adhesion, planar-basal polarity, and limited motility, but acquire mesenchymal features, including increased motility, invasiveness and development of escape routes for apoptosis (Polyak and Weinberg, 2009). Modulation of EMT pathways by ncRNAs has been described in several tumors (**Table 4**). Another important feature that characterizes the most advanced and aggressive tumors is angiogenesis, meaning the development of tumor neovasculature. This mechanism is crucial to satisfy nutrient and oxygen demands, as well as to provide routes for metabolic waste excretion (Carmeliet and Jain, 2000). To achieve this oncogenic hallmark, tumor cells induce pro-angiogenic factors or block anti-angiogenic signals, in part by modulating ncRNA expression profiles (**Table 4**). For a more detailed overview of ncRNAs implicated in EMT and angiogenesis, refer to Wang W. et al. (2015) and Choudhry et al. (2016).

#### TABLE 3 | ncRNA regulation by hypoxia and hormone environment.

Finally, inflammation is considered an oncogenic feature that allows the acquisition of carcinogenic capacities by the provision of biomolecules to the tumor and to the cells of the microenvironment, such as transcription factors which can enhance proliferative signaling, proangiogenic factors, invasion and metastasis (Hanahan and Weinberg, 2011; **Table 4**). A more detailed overview of ncRNAs implicated in inflammation is given in O'Connell et al. (2012).

#### TABLE 4 | ncRNAs and their contribution to events in the metabolic changes in cancer.


CCC, Colorectal cancer; GBM, glioblastoma; HNSCC, Head and neck squamous cell carcinoma; PCA, Prostate Cancer; CML, Chronic myeloid leukemia; OvCa, Ovary cancer; BRCA, Breast cancer; GC, Gastric Cancer; ↑, up-expression; ↓, down-expression.

tamoxifen-induced autophagy and enhance the sensitivity of breast cancer cells to tamoxifen treatment (Frankel et al., 2011). (8) Recombinant lentivirus administration of miR-30a (inhibitor of autophagy by down-modulating BECN1), can enhance sensitivity to imatinib cytotoxicity in chronic myeloid leukemia, increasing tumor cell apoptosis (Yu et al., 2012). In vitro (cell line models). (9) Up-regulation of miR-125a in cervical cancer (CC) models sensitized to paclitaxel by down-regulating STAT3 (Fan et al., 2016). (10) Re-expression of miR-30a can sensitize tumor cells to cisplatin via mediating autophagy in HeLa, MCF-7 and HepG2 (Zou et al., 2012). (11) Over-expression of miR-101 sensitized human lung carcinoma cells to radiation treatment (Yan et al., 2010).

### NOVEL INSIGHTS: ncRNAs AS THERAPEUTIC TOOLS IN CANCER METABOLISM

The advent of novel knowledge and high-throughput technologies, such as RNA-seq, ChIP-seq, and metabolomic analysis, has allowed us to gain insight into the versatility of the mechanisms that regulate metabolism and how the disturbance of specific factors, in particular ncRNAs, might impact the altered phenotypes of cancer cells. During the past years, we have gained important understanding of the biological activity of ncRNAs, although more research is needed to better understand the complex mechanisms that orchestrate tumor metabolism. Furthermore, pharmacological intervention in cell metabolism is emerging as a potential therapeutic strategy in some cancers (Ahn and Metallo, 2015), giving us the opportunity to explore new sources for biomarker discovery and the development of new targeted drugs. The crucial role of ncRNAs in metabolism and associated mechanisms raises the possibility of developing ncRNA-targeted therapies. miRNA and lncRNA mimics or inhibitors can be used to elevate or block the activity of metabolism-related genes that drive cancer initiation and/or progression programs. **Figure 5** summarizes some of the current and future therapeutic applications of metabolism-related ncRNAs in cancer treatment.

## AUTHOR CONTRIBUTIONS

SR and AH coordinated the review process. SR, AH, FB, and AC performed the literature review, organized the information and wrote the paper. All authors read and approved the final version of the manuscript.

### ACKNOWLEDGMENTS

We thank Dr. Elizabeth Langley McCarron for her comments and critical review of the manuscript. FB and AC received a Ph.D. scholarship from the Mexican National Council of Science and Technology. This paper is part of the Ph.D. thesis project of FB and AC, students of the Biomedical Sciences Ph.D. Program of the National Autonomous University of Mexico.

#### REFERENCES




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Beltrán-Anaya, Cedro-Tanda, Hidalgo-Miranda and Romero-Cordoba. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Computational Model Predicts the Effects of Targeting Cellular Metabolism in Pancreatic Cancer

Mahua Roy<sup>1</sup> and Stacey D. Finley<sup>1,2</sup>\*

<sup>1</sup> Biomedical Engineering, University of Southern California, Los Angeles, CA, USA, <sup>2</sup> Chemical Engineering, University of Southern California, Los Angeles, CA, USA

Reprogramming of energy metabolism is a hallmark of cancer that enables the cancer cells to meet the increased energetic requirements due to uncontrolled proliferation. One prominent example is pancreatic ductal adenocarcinoma, an aggressive form of cancer with an overall 5-year survival rate of 5%. The reprogramming mechanism in pancreatic cancer involves deregulated uptake of glucose and glutamine and other opportunistic modes of satisfying energetic demands in a hypoxic and nutrient-poor environment. In the current study, we apply systems biology approaches to enable a better understanding of the dynamics of the distinct metabolic alterations in KRAS-mediated pancreatic cancer, with the goal of impeding early cell proliferation by identifying the optimal metabolic enzymes to target. We have constructed a kinetic model of metabolism represented as a set of ordinary differential equations that describe time evolution of the metabolite concentrations in glycolysis, glutaminolysis, tricarboxylic acid cycle and the pentose phosphate pathway. The model is comprised of 46 metabolites and 53 reactions. The mathematical model is fit to published enzyme knockdown experimental data. We then applied the model to perform in silico enzyme modulations and evaluate the effects on cell proliferation. Our work identifies potential combinations of enzyme knockdown, metabolite inhibition, and extracellular conditions that impede cell proliferation. Excitingly, the model predicts novel targets that can be tested experimentally. Therefore, the model is a tool to predict the effects of inhibiting specific metabolic reactions within pancreatic cancer cells, which is difficult to measure experimentally, as well as test further hypotheses toward targeted therapies.

#### Edited by:

Osbaldo Resendis-Antonio, National Autonomous University of Mexico, Mexico

#### Reviewed by:

Maria Suarez Diez, Helmholtz Zentrum für Infektionsforschung, Germany Monika Heiner, Brandenburg University of Technology, Germany

> \*Correspondence: Stacey D. Finley sfinley@usc.edu

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 18 November 2016 Accepted: 27 March 2017 Published: 12 April 2017

#### Citation:

Roy M and Finley SD (2017) Computational Model Predicts the Effects of Targeting Cellular Metabolism in Pancreatic Cancer. Front. Physiol. 8:217. doi: 10.3389/fphys.2017.00217

Keywords: metabolic modeling, systems biology, kinetic model, sensitivity analysis, parameter optimization

### 1. INTRODUCTION

Pancreatic ductal adenocarcinoma (PDAC) is a particularly aggressive and challenging form of cancer (Hidalgo, 2010; Oberstein and Olive, 2013; Siegel et al., 2013; Blum and Kloog, 2014) that is highly resistant to conventional chemotherapy. Mutations mediated by the KRAS or MYC oncogenes, found in 95% of cases of PDAC (Almoguera et al., 1988; Uemura et al., 2004; Löhr et al., 2005; Hezel et al., 2006; Kimmelman, 2015), promote reprogramming of the cellular metabolism, enabling the cancer cells to optimally use available resources (Ying et al., 2012). Specifically, KRAS promotes glucose uptake (Donahue and Dawson, 2016) and rewiring of glucose and glutamine metabolism (Kerr et al., 2016) to satisfy the excess demand for nutrients and cellular resources needed to sustain proliferation. The cells use glycolysis (glucose metabolism) to generate cellular resources needed to produce more cells. Similarly, increased glutamine consumption enables the cells to meet the larger demand for nitrogen needed to generate building blocks such as amino acids and lipids (Eagle, 1955; Vasseur et al., 2010; Pavlova and Thompson, 2016). The cells exhibit high survival and minimal death, even when the primary nutrients and energetic resources are scarce, suggesting that the cells adapt to the challenging conditions by altering their metabolism (Yoshida, 2015). This reprogramming of metabolic pathways is considered to be an emerging hallmark of most cancers (Hanahan and Weinberg, 2011) and is a driver of malignant growth. Moreover, the metabolic stress that occurs as a result of KRAS-mediated metabolic alterations can lead to further mutations and continued cell proliferation and tumor progression (Cairns et al., 2011; Misale et al., 2012). For these reasons, the dysregulated metabolic pathways can be used to identify biomarkers to support cancer diagnosis (Chung et al., 2003; Serkova and Boros, 2005; Pelicano et al., 2006). The altered metabolism also represents potential therapeutic targets (Macheda et al., 2005).

Pancreatic cancer cells are particularly reliant on glutamine to sustain proliferation and promote cell survival. Glutamine is a conditionally essential amino acid that fuels the tricarboxylic acid (TCA) cycle. Upon being taken up by the cell, glutamine is converted to glutamate by the glutaminase (GLS) enzyme, and then enters the TCA cycle as α-ketoglutarate (Wise and Thompson, 2010). Interestingly, PDAC has been characterized by non-canonical metabolism of glutamine, whereby the enzyme glutamic-oxaloacetic transaminase (GOT1) catalyzes the conversion of cytosolic aspartate to oxaloacetate. This enzyme is used in pancreatic cancer, instead of the glutamate dehydrogenase enzyme (GLUD1) used by normal cells to convert glutamate derived from glutamine to α-ketoglutarate in the mitochondria (McGivan and Chappell, 1975; Newsholme et al., 2003).

Glutamate, α-ketoglutarate, and aspartate are all important glutamine metabolism intermediates needed for cell proliferation. Glutamate-pyruvate transaminase (GPT), also known as alanine amino-transferase, transfers nitrogen from glutamate to pyruvate to make alanine and α-ketoglutarate. This nitrogen supports amino acid synthesis needed to produce cellular building blocks (i.e., lipids and nucleic acids). The α-ketoglutarate obtained by the conversion of glutamate promotes citrate production and lipid biosynthesis (Wise et al., 2011; Metallo et al., 2012). Aspartate is converted to oxaloacetate (Cohen et al., 2015), which is further converted to malate and then to pyruvate through the action of malic enzyme (ME1). The action of ME1 increases the NADPH/NADP ratio to maintain the redox balance and to replenish the glutathione (GSH) pool to quench the reactive oxygen species (ROS) (Gaglio et al., 2011). Given the importance of glutamine in pancreatic cancer, the enzymes that catalyze its metabolism, including GLS, GOT1, and ME1, are potential targets for impeding cell growth (Weinberg et al., 2010; Gross et al., 2014). For example, knocking down GOT1 activity alters the cells' reductive capacity and is shown to inhibit cell proliferation in vitro and tumor growth in vivo (Son et al., 2013).

Pancreatic cancer cells also utilize the glycolytic pathway to metabolize glucose. Glycolysis converts glucose to pyruvate, most of which is used to form lactate, producing some ATP, rather than being incorporated into the TCA cycle for ATP production. The increased reliance on glycolysis, despite the fact that oxidative phosphorylation is more efficient in generating ATP, is termed the "Warburg effect" (Warburg, 1956) and has been widely studied (Vander Heiden et al., 2009). However, glycolysis enables the cells to meet their demand for precursors needed for biomass synthesis, which outweighs their energetic demands for ATP or NADH from the TCA cycle. The demand for the generation of amino acids, lipids, and nucleic acids is further satisfied by branching pathways that exploit the elevated glucose uptake, including the pentose phosphate pathway (PPP) (DeBerardinis et al., 2008; Weinberg et al., 2010; Patra and Hay, 2014). The PPP provides NADPH for macromolecule biosynthesis and quenching of reactive oxygen species (ROS), termed reductive biosynthesis. It also generates ribose-5-phosphate (R5P) required as a precursor for DNA and RNA biosynthesis (Recktenwald et al., 2008; DeNicola et al., 2011). Glucose metabolism has been targeted in attempts to inhibit cancer cell proliferation (El Mjiyad et al., 2011), and it remains a target in pancreatic cancer (Vander Heiden, 2011).

Mathematical modeling is necessary to understand metabolic reprogramming in cancer cells. Predictive mathematical models can incorporate the many metabolites, enzymes, and regulatory mechanisms that characterize cellular metabolism to enable a better understanding of the metabolic pathways (Vazquez et al., 2010; Alberghina et al., 2012; Cazzaniga et al., 2014; Le Novère, 2015). Many published metabolic modeling techniques have focused on constraint-based approaches in which certain physical, chemical, or biological constraints are applied to predict the metabolic phenotypes (Resendis-Antonio et al., 2010; Bordbar et al., 2014). These are steady state stoichiometric models that can predict the flux distributions, but fail to capture the kinetic aspects (time course of metabolite concentrations) in the system or time-varying heterogeneities that arise due to environmental fluctuations. When considering processes that are inherently transient, such as the effects of reprogramming of cancer metabolism, kinetic modeling is required to understand the dynamic relationship between metabolic fluxes and metabolite concentrations (Markert and Vazquez, 2015). Therefore, models that represent the metabolic pathways using a system of nonlinear ordinary differential equations (ODEs) are of particular importance. These kinetic models provide a mechanistic description of the transient dynamics of the system (Machado et al., 2012; Cazzaniga et al., 2014), as well as provide steady state measurements. When constructed and validated using experimental measurements, kinetic models can be used to perform in silico experiments to predict the dynamic effects of perturbing the metabolic network. In this way, the models are a valuable alternative to wet experiments that can be expensive and time-consuming.

In this study, we construct such a kinetic model of pancreatic cancer cell metabolism. Given the importance of glutamine and glucose metabolism in promoting pancreatic cancer cell proliferation, we apply the model to identify effective metabolic targets for impeding proliferation. The model is used to simulate the effects of altering specific metabolic enzymes and is a valuable tool to quantitatively understand the dynamics of cancer cell metabolism.

### 2. MATERIALS AND METHODS

### 2.1. Model Structure and Numerical Implementation

We constructed a kinetic model of pancreatic cancer cell metabolism using previously published models of metabolism from various cell types (Mulukutla et al., 2010; Marín-Hernández et al., 2011; Mulukutla et al., 2012; Marín-Hernández et al., 2014; Shestov et al., 2014; Mulukutla et al., 2015). Our model is comprised of a total of 46 metabolites and 53 enzymatic reactions including glycolysis, glutaminolysis, the TCA cycle, the PPP, and malate-aspartate-ketoglutarate-glutamate shuttles between the cytosolic and mitochondrial compartments (**Figure 1**). Each step in the metabolic pathway is modeled according to known enzymatic reactions, which include reaction mechanisms ranging from simple Michaelis-Menten to complicated random bi-bi kinetics, expressed as different mathematical formulations. Rate laws for each reaction mechanism are incorporated into a system of 46 nonlinear ordinary differential equations (ODEs) that describe how the metabolite concentrations evolve over time. There is a single ODE for each metabolite, representing the rate of change of the species concentration, which depends on the rates at which the species is produced and consumed in the reaction network. We used an implicit differential equation solver in MATLAB (Guide, 1998) to numerically integrate the equations and solve for the metabolite concentrations. This is a deterministic model, which simulates the concentrations in a homogeneous ensemble of cells that experience, on average, similar intra- and extra-cellular environmental conditions. By integrating the ODEs, we calculate the average dynamics of the cell population.
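To make the "one ODE per metabolite" structure concrete, the following is a minimal sketch in Python rather than the MATLAB implementation described above. It assumes a hypothetical two-reaction chain A → B → C with irreversible Michaelis-Menten rate laws; the metabolite names, parameter values, and the explicit Runge-Kutta integrator are illustrative assumptions and are not taken from the published model, which uses an implicit solver and far more detailed rate laws.

```python
# Toy pathway A -> B -> C. Everything here (names, values, integrator)
# is a hypothetical illustration, not the published 46-metabolite model.

def michaelis_menten(vmax, km, substrate):
    """Irreversible Michaelis-Menten rate (mM/min)."""
    return vmax * substrate / (km + substrate)

def derivatives(conc, params):
    """One ODE per metabolite: production rate minus consumption rate."""
    a, b, _c = conc
    v1 = michaelis_menten(params["vmax1"], params["km1"], a)  # A -> B
    v2 = michaelis_menten(params["vmax2"], params["km2"], b)  # B -> C
    return [-v1, v1 - v2, v2]

def rk4_step(conc, params, dt):
    """Single classical fourth-order Runge-Kutta step (explicit)."""
    def shifted(base, slope, h):
        return [x + h * k for x, k in zip(base, slope)]
    k1 = derivatives(conc, params)
    k2 = derivatives(shifted(conc, k1, dt / 2), params)
    k3 = derivatives(shifted(conc, k2, dt / 2), params)
    k4 = derivatives(shifted(conc, k3, dt), params)
    return [x + dt / 6 * (p + 2 * q + 2 * r + s)
            for x, p, q, r, s in zip(conc, k1, k2, k3, k4)]

def simulate(conc0, params, t_end, dt=0.01):
    """Integrate the toy system from t = 0 to t_end (minutes)."""
    conc = list(conc0)
    for _ in range(int(round(t_end / dt))):
        conc = rk4_step(conc, params, dt)
    return conc

toy_params = {"vmax1": 1.0, "km1": 0.5, "vmax2": 0.5, "km2": 0.2}  # assumed
final_conc = simulate([1.0, 0.0, 0.0], toy_params, t_end=20.0)
```

Because every unit of A consumed appears as B or C, the total concentration is conserved in this toy system, which gives a simple sanity check on the integration. A stiff, full-scale network would more plausibly call for an implicit method, as the authors chose.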

We briefly summarize the model equations below, and the full set of ODEs is provided in the Supplementary Material (model files: "S1.m" and "S2.xml"). Abbreviations for the metabolites and reaction names are given in Supplementary File S3 and the values of the fixed parameters are listed in Supplementary File S4. The detailed rate equations for glycolysis and their corresponding kinetic constants are primarily based on the glycolysis model for HeLa cells (Marín-Hernández et al., 2011, 2014). This glycolysis reaction network was extended to include reactions from the TCA cycle and PPP using kinetic rate laws and parameters from Mulukutla and coworkers (Wu et al., 2007; Mulukutla et al., 2010, 2012, 2015). Reactions that involve ATP consumption and production in the cytoplasm were defined as in the model of Shestov et al. (2014), and the ATP and ADP concentrations in mitochondrial compartment were kept constant as in Mulukutla et al. In addition, glutamine transport parameters were obtained from Pingitore et al. (2013).

AKT is a strong promoter of KRAS-mediated pancreatic cancer tumorigenicity (Asano et al., 2004) due to its influence on the rates of metabolic reactions in glycolysis. It is known that PDAC cells have increased glucose uptake (Ying et al., 2012), which is mediated by upregulation of specific glycolytic enzymes, including the glucose transporter-1 (GLUT1), hexokinase (HK), and lactate dehydrogenase A (LDHA). Additionally, AKT promotes increased glucose uptake by activating GLUT1, HK, and the phosphofructokinase (PFK) enzyme (Rathmell et al., 2003; Elstrom et al., 2004). We have incorporated the effect of AKT in our metabolic model, simulating AKT-mediated enhanced glycolytic activity. Specifically, the activities of the GLUT1, HK, and PFK enzymes (represented by their individual Vmax values) are modeled to have 20% basal activity, while 80% of their activity is due to activation by AKT (Mosca et al., 2012; Mulukutla et al., 2012).
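The 20%/80% split of enzyme activity can be illustrated with a short sketch. The text gives only the fractions, so the linear dependence on a relative AKT activity between 0 and 1, and the function and variable names, are assumptions made here for illustration; the published model may implement the activation differently.

```python
BASAL_FRACTION = 0.2  # AKT-independent share of Vmax (stated in the text)
AKT_FRACTION = 0.8    # AKT-dependent share of Vmax (stated in the text)

def effective_vmax(vmax_full, akt_activity=1.0):
    """Scale Vmax by relative AKT activity in [0, 1].

    The linear form is an assumption; only the 20/80 split comes
    from the model description.
    """
    if not 0.0 <= akt_activity <= 1.0:
        raise ValueError("akt_activity must lie between 0 and 1")
    return vmax_full * (BASAL_FRACTION + AKT_FRACTION * akt_activity)

# Hypothetical Vmax values (mM/min) for the three AKT-regulated enzymes.
vmax_values = {"GLUT1": 10.0, "HK": 4.0, "PFK": 6.0}
without_akt = {name: effective_vmax(v, 0.0) for name, v in vmax_values.items()}
```

Under this assumption, complete loss of AKT activity leaves each of the three enzymes at one fifth of its full Vmax.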

In order to predict how the intracellular metabolic pathways influence cell growth, we incorporate cell number with the enzyme-catalyzed reactions. Specifically, the model is augmented to include a 47th ODE that describes the time evolution of the number of cancer cells, C<sub>N</sub>. Cell growth is implemented as a logistic equation (Enderling and Chaplain, 2014) that accounts for the maximal carrying capacity of the tumor, K<sub>CC</sub> (Equation 1).

$$\frac{dC_N}{dt} = \lambda \left(1 - \frac{C_N}{K_{CC}}\right) C_N - \alpha_d C_N \tag{1}$$

The number of cancer cells is directly linked to the metabolism, where the growth rate depends on the intracellular concentrations of three primary metabolites known to influence cell proliferation: glucose, glutamine and ATP (Venkatasubramanian et al., 2008; Zhu et al., 2012). The dependence on these three metabolites is modeled assuming Monod-type functions (Higuera et al., 2009) (Equation 2).

$$\lambda = \alpha_{atp}\left(\frac{ATP}{k_{ap} + ATP}\right) + \alpha_{glc}\left(\frac{Glc_{in}}{k_{gc} + Glc_{in}}\right) + \alpha_{gln}\left(\frac{Gln_{in}}{k_{gn} + Gln_{in}}\right) \tag{2}$$

The growth and death parameters α<sub>atp</sub>, α<sub>glc</sub>, α<sub>gln</sub>, and α<sub>d</sub> have units of min<sup>−1</sup>. The concentration parameters k<sub>gc</sub>, k<sub>gn</sub>, and k<sub>ap</sub> have units of mM.
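Equations (1) and (2) can be sketched in a few lines of Python. In this illustration the metabolite concentrations are held constant so that the growth rate is a fixed number, which is a simplification: in the full model they come from the other 46 ODEs. All parameter values below are placeholders chosen for readability, not fitted values, and the forward Euler integrator is likewise only an assumption for the sketch.

```python
def growth_rate(atp, glc_in, gln_in, p):
    """Equation 2: sum of three Monod-type terms (1/min)."""
    return (p["alpha_atp"] * atp / (p["k_ap"] + atp)
            + p["alpha_glc"] * glc_in / (p["k_gc"] + glc_in)
            + p["alpha_gln"] * gln_in / (p["k_gn"] + gln_in))

def cell_number_rate(c_n, lam, k_cc, alpha_d):
    """Equation 1: logistic growth minus first-order death (cells/min)."""
    return lam * (1.0 - c_n / k_cc) * c_n - alpha_d * c_n

def simulate_cells(c_n0, lam, k_cc, alpha_d, t_end, dt=1.0):
    """Forward Euler integration of Equation 1 with a constant growth rate."""
    c_n = c_n0
    for _ in range(int(round(t_end / dt))):
        c_n += dt * cell_number_rate(c_n, lam, k_cc, alpha_d)
    return c_n

# Placeholder parameters (rates in 1/min, concentrations in mM).
growth_params = {"alpha_atp": 1.0e-3, "alpha_glc": 1.0e-3, "alpha_gln": 1.0e-3,
                 "k_ap": 1.0, "k_gc": 1.0, "k_gn": 1.0}
K_CC = 1.0e6      # carrying capacity (cells), assumed
ALPHA_D = 5.0e-4  # death rate (1/min), assumed

lam = growth_rate(atp=1.0, glc_in=1.0, gln_in=1.0, p=growth_params)
final_cells = simulate_cells(1.0e4, lam, K_CC, ALPHA_D, t_end=30000.0)
steady_state = K_CC * (1.0 - ALPHA_D / lam)
```

Setting the right-hand side of Equation (1) to zero shows that, for λ > α<sub>d</sub>, the cell number settles at K<sub>CC</sub>(1 − α<sub>d</sub>/λ) rather than at K<sub>CC</sub> itself, so any intervention that lowers λ reduces both the growth rate and the plateau.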

#### 2.2. Initial Conditions

We simulated the model with multiple sets of starting metabolite concentrations to identify the appropriate range of initial conditions. There is limited information regarding the initial intracellular metabolite concentrations in pancreatic cancer cells. Therefore, we allowed the initial concentration for each metabolite to vary within a specified range. We specified the concentration ranges based on published measurements obtained from various cell lines, including human cervical cancer (Marín-Hernández et al., 2011, 2014), diseased and surrounding normal tissue samples from stomach and colon cancer patients (Hirayama et al., 2009), breast cancer cell extracts (Le Guennec et al., 2012), PDAC cancer patient samples (Fontana et al., 2016) and mouse myeloma and CHO cell lines (Mulukutla et al., 2012, 2015). Additional uncertainty for pancreatic cancer cells was considered by increasing and decreasing the upper and lower bounds, respectively, by 20%. Due to the lack of measurements that distinguish the metabolite levels in different cellular compartments, the initial concentrations of metabolites that were present in both mitochondrial and cytosolic compartments were assumed to be the same. The ranges of metabolite concentrations given in **Table 1** account for variability in literature measurements as well as an additional uncertainty for the unknown intracellular concentrations of pancreatic cancer cell lines in particular.

Latin Hypercube Sampling (LHS; McKay et al., 2000; Oguz et al., 2013) was applied to sample within the ranges selected for each metabolite. LHS separates the range of concentrations for each metabolite into multiple intervals and samples from each interval exactly once, thereby efficiently exploring the entire possible range of initial conditions for each metabolite. We generated 100 sets of initial conditions for parameter identifiability analysis (Section 3.1.1), and then randomly selected 50 of those sets to be used in parameter estimation (Section 3.1.3). This procedure adequately explores the possible ranges of initial conditions while balancing the computational resources required for global parameter optimization.
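The sampling procedure can be sketched as follows. This is a plain-Python illustration of Latin Hypercube Sampling with the 20% widening of literature bounds described in Section 2.2; the metabolite names and ranges are hypothetical, the random seed is arbitrary, and the authors' actual implementation is not specified in the text.

```python
import random

def widen(lower, upper, frac=0.2):
    """Lower the lower bound and raise the upper bound by frac (20%)."""
    return lower * (1.0 - frac), upper * (1.0 + frac)

def latin_hypercube(bounds, n_samples, rng):
    """Draw n_samples sets so each metabolite's range is split into
    n_samples equal intervals and each interval is sampled exactly once."""
    columns = {}
    for name, (lo, hi) in bounds.items():
        width = (hi - lo) / n_samples
        points = [lo + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(points)  # decouple interval order across metabolites
        columns[name] = points
    return [{name: columns[name][i] for name in bounds}
            for i in range(n_samples)]

rng = random.Random(0)  # fixed seed for reproducibility (arbitrary choice)

# Hypothetical literature ranges in mM, widened by 20% on each side.
bounds = {"Glc_in": widen(1.0, 5.0), "Gln_in": widen(0.5, 4.0),
          "ATP": widen(2.0, 8.0)}

identifiability_sets = latin_hypercube(bounds, 100, rng)  # 100 sets
estimation_sets = rng.sample(identifiability_sets, 50)    # random 50 of them
```

Compared with drawing each concentration independently and uniformly, the stratification guarantees that no part of a metabolite's range is left unsampled, which matters when only 100 simulations are affordable.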

#### 2.3. Parameter Estimation

The baseline model, adapted from literature, has a total of 372 parameters, which includes 71 reaction velocities (the forward and reverse rates, V<sup>f</sup> and V<sup>r</sup>, respectively). The reaction velocities reflect the amount of enzyme present and the corresponding enzyme activity. Conventionally, the reaction velocities are thought to distinguish the metabolism across different cell types. Therefore, of the many kinetic parameters included in the reaction rate equations, only the reaction velocities were fit to the training data, and the other rate constants were held at their literature values. We also fit the cell growth parameters shown in Equations (1) and (2). Below, we describe the experimental data used to train the model and the method used to perform the parameter estimation.

The model is fit to experimental measurements from Son et al. (2013), who measured the concentrations of 14 intracellular metabolites using targeted metabolomic analysis. Son and coworkers sought to understand the non-canonical glutamine metabolism in pancreatic cancer cells following the knockdown of GOT1, a major enzyme in glutamine metabolism. The metabolite concentrations were measured when the GOT1 enzyme was knocked down, relative to the no knockdown condition. Thus, they quantified the fold-change in the metabolite concentrations.

The experimental protocol used by Son and coworkers is illustrated in Figure S1. We simulated the same sequence of steps to predict the fold-change in the concentrations of the 14 metabolites. Since the relative enzyme expression level can be correlated with the regulation of enzyme activity levels, we simulate enzyme knockdown by multiplying V<sup>f</sup> by the factor (1 - α) (Nolan and Lee, 2012). We take the value of α to be 0.85, based on the average GOT1 expression level from two knockdown experiments reported in Son et al. (2013). The model is simulated for GOT1 knockdown to predict the fold-change in the concentration of the 14 metabolites relative to the no knockdown case. We sought to minimize the weighted sum of squared residuals (WSSR) between the experimental data and the model predictions.
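As a minimal sketch of the two quantities described above, the snippet below scales a forward velocity by (1 - α) and evaluates a weighted sum of squared residuals. The weighting by 1/σ² is an assumption, since the weights are not specified in this section, and the velocity and fold-change values are hypothetical.

```python
import numpy as np

def apply_knockdown(v_forward, alpha):
    """Scale a forward reaction velocity by the factor (1 - alpha)."""
    return v_forward * (1.0 - alpha)

def wssr(predicted, measured, sigma):
    """Weighted sum of squared residuals.
    Assumption: each residual is weighted by 1 / sigma**2."""
    predicted, measured, sigma = (np.asarray(x, dtype=float)
                                  for x in (predicted, measured, sigma))
    return float(np.sum(((predicted - measured) / sigma) ** 2))

# Hypothetical baseline forward velocity for the GOT1 reaction.
vf_got1 = 2.0
vf_got1_kd = apply_knockdown(vf_got1, alpha=0.85)  # 15% of baseline remains

# Hypothetical fold-changes for three metabolites (knockdown / no knockdown).
measured = np.array([0.5, 1.2, 0.8])
sigma = np.array([0.1, 0.2, 0.1])
predicted = np.array([0.6, 1.0, 0.8])
error = wssr(predicted, measured, sigma)
```

In a full fit, `predicted` would come from simulating the model with and without the knockdown and taking the ratio of the resulting concentrations.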

Additionally, Son and colleagues use in vitro cell culture to investigate how intracellular metabolism influences cell proliferation. They measure the number of cells with and without GOT1 knockdown and in the presence of varying extracellular nutrient concentrations. We also simulate their experimental protocols and compare the model predictions to their experimental measurements.

Particle swarm optimization (PSO) was used to identify the parameter values that enable the model predictions to best fit the data, i.e., that minimize the WSSR. PSO (Iadevaia et al., 2010; Kennedy, 2010; Tashkova et al., 2011) is a biologically inspired stochastic global optimization technique developed by Kennedy and Eberhart (1995). It is based on the concept of the social behavior observed in nature. In PSO, many particles (sets of parameters) are iteratively updated from their random starting values to identify the parameter values that best fit the experimental data. Each particle has a position, representing its location in the multi-dimensional parameter space, and a velocity with which it moves toward a local minimum in the WSSR. The particles communicate with one another to update their position and velocity, ultimately moving toward the global minimum in the WSSR. We used PSO to estimate the reaction velocities for the baseline model. Each particle represents a vector of all reaction velocities to be optimized, where the initial parameter values are taken from a well-sampled space within the given bounds. To specify the bounds, the reaction velocities were allowed to vary 100-fold up and down from their starting values (taken from the literature, see Materials and Methods). Each run of the PSO algorithm executes 2,500 iterations, a user-defined value chosen to limit the computational expense of the parameter search. We performed the parameter estimation twice for each set of initial conditions (i.e., a total of 5,000 iterations per initial condition) and, for each case, selected the set of parameters that generated the lowest error. This gave a total of 50 best-fit parameter sets, one set for each initial condition.
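The position and velocity updates described above can be sketched with a basic global-best PSO. This is a generic textbook variant, not necessarily the implementation used in the study: the inertia and acceleration coefficients, the swarm size, the search in log10 space (so that bounds of ±2 correspond to a 100-fold range up and down), and the toy cost function standing in for the WSSR are all assumptions.

```python
import numpy as np

def particle_swarm(cost, lower, upper, n_particles=30, n_iter=200,
                   inertia=0.7, c_personal=1.5, c_social=1.5, seed=0):
    """Basic global-best PSO; returns the best position and its cost."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))  # random starts
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                                     # personal bests
    best_cost = np.array([cost(p) for p in pos])
    g = int(np.argmin(best_cost))
    g_pos, g_cost = best_pos[g].copy(), best_cost[g]          # swarm best
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Each particle is pulled toward its own best and the swarm's best.
        vel = (inertia * vel
               + c_personal * r1 * (best_pos - pos)
               + c_social * r2 * (g_pos - pos))
        pos = np.clip(pos + vel, lower, upper)                # respect bounds
        costs = np.array([cost(p) for p in pos])
        improved = costs < best_cost
        best_pos[improved] = pos[improved]
        best_cost[improved] = costs[improved]
        g = int(np.argmin(best_cost))
        if best_cost[g] < g_cost:
            g_pos, g_cost = best_pos[g].copy(), best_cost[g]
    return g_pos, float(g_cost)

# Toy stand-in for the WSSR: distance to a hypothetical optimum, expressed as
# log10 fold-changes of three reaction velocities relative to literature values.
target = np.array([0.5, -1.0, 1.2])

def toy_wssr(log10_scale):
    return float(np.sum((log10_scale - target) ** 2))

lower = np.full(3, -2.0)   # 100-fold below the literature value
upper = np.full(3, 2.0)    # 100-fold above the literature value
best_log10_scale, best_error = particle_swarm(toy_wssr, lower, upper)
best_fold_change = 10.0 ** best_log10_scale
```

Repeating the call with different `seed` values and keeping the lowest `best_error` mirrors the idea of running the estimation more than once per set of initial conditions.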

Estimating the reaction velocities for each initial condition was the first step of model fitting. In the second step of model fitting, we sought to estimate the growth parameters by minimizing the WSSR. Since there are fewer parameters to fit compared to the first fitting step, we used nonlinear least squares optimization. We performed the fitting 100 times for each initial condition to approach the global minimum in the model error. Given limited prior knowledge of the range of base values for growth parameters (Higuera et al., 2009), we searched over a parameter space spanning seven orders of magnitude for each parameter. The model simulations to optimize for cell growth were conducted such that the same set of seven growth parameters could fit the experimental growth curve for both no knockdown and GOT1 knockdown conditions.

#### 2.4. Data Extraction

Experimental data for model training and validation was extracted from Son et al. (2013) using the MATLAB GRABIT program (Guide, 1998). Training data includes the fold-change in metabolite concentrations and cell number under GOT1 knockdown. Validation data includes the cell number under nutrient deprivation.

#### 2.5. Parameter Identifiability Analysis

We used structural parameter identifiability analysis (Maly and Petzold, 1996; Ascher and Petzold, 1998; Shampine et al., 1999; Finley et al., 2011; Berthoumieux et al., 2013) to reduce the number of model parameters being fit to the training data. Parameter identifiability determines implicit dependencies among parameters. If two parameters are found to be correlated, we can specify a mathematical relationship between the parameters and only fit one in the parameter estimation procedure. Here, we only specify the relationship between correlated forward and reverse reaction velocities, where the reverse reaction velocity, V<sup>r</sup>, is expressed as a function of the forward reaction velocity, V<sup>f</sup>, with the equilibrium constant, V<sub>eq</sub>: V<sup>r</sup> = V<sup>f</sup>/V<sub>eq</sub>. In these cases, only the forward reaction velocity is fit to the experimental data, thereby reducing the number of fitted parameters. The V<sub>eq</sub> is calculated using the published works from which our model is derived (Wu et al., 2007; Mulukutla et al., 2010, 2012, 2015; Marín-Hernández et al., 2011, 2014).
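One common way to detect such dependencies is to correlate the columns of an output sensitivity matrix. The sketch below illustrates that idea on a toy output function. It is an assumption about how correlated pairs might be flagged, not a description of the exact procedure used here: the function `model_output`, the 0.95 threshold, and all parameter values are hypothetical.

```python
import numpy as np

def model_output(params, t):
    """Toy output that depends on vf and vr only through their ratio,
    plus an independent decay term governed by k (hypothetical model)."""
    vf, vr, k = params
    return t * vf / vr + np.exp(-k * t)

def sensitivity_matrix(params, t, rel_step=1e-6):
    """Central finite-difference sensitivities d(output)/d(parameter)."""
    params = np.asarray(params, dtype=float)
    columns = []
    for j in range(params.size):
        h = rel_step * params[j]
        up, down = params.copy(), params.copy()
        up[j] += h
        down[j] -= h
        columns.append((model_output(up, t) - model_output(down, t)) / (2.0 * h))
    return np.column_stack(columns)

t = np.linspace(0.1, 5.0, 50)
params = np.array([2.0, 4.0, 1.0])          # vf, vr, k (hypothetical values)
S = sensitivity_matrix(params, t)
corr = np.corrcoef(S, rowvar=False)         # parameter-by-parameter correlation

threshold = 0.95                            # assumed cutoff for "correlated"
pairs = [(i, j) for i in range(3) for j in range(i + 1, 3)
         if abs(corr[i, j]) > threshold]

def reverse_velocity(v_forward, v_eq):
    """Reverse velocity implied by the forward velocity: Vr = Vf / Veq."""
    return v_forward / v_eq

v_eq = params[0] / params[1]                # equilibrium constant of the toy pair
vr_from_vf = reverse_velocity(params[0], v_eq)
```

In this toy case only the (vf, vr) pair is flagged, so vr would be computed from vf rather than fitted.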

#### 2.6. Sensitivity Analysis

We applied global sensitivity analysis (Saltelli et al., 2008) to determine which of the model parameters most significantly influence the predicted metabolite concentrations. Specifically, we used the extended Fourier Amplitude Sensitivity Test (eFAST) method (Marino et al., 2008), a variance-based approach, to understand the robustness of the model outputs (metabolite concentrations) given variance in the model inputs (the reaction velocities) (Zi, 2011). We allowed the model inputs to vary two orders of magnitude up and down from their literature values. The eFAST method calculates two indices that provide an estimate of the sensitivity of the model outputs with respect to the model parameters. The first order index, S<sub>i</sub>, quantifies the variance of the model output with respect to the variances of each individual input, and the total FAST index, S<sub>Ti</sub>, quantifies the variance of the model output with respect to the variances of each input and covariances between combinations of inputs. The S<sub>i</sub>, then, is a measurement of local sensitivity of the model output to each individual input, whereas S<sub>Ti</sub> is a measure of the global sensitivity, accounting for the interactions or correlations among multiple inputs.
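For reference, the two indices can be written using the standard variance-based definitions, where Y denotes a model output, X<sub>i</sub> the i-th input, and X<sub>~i</sub> all inputs except the i-th. This notation is introduced here for illustration and does not appear in the original text.

$$
S_i = \frac{\operatorname{Var}_{X_i}\!\left(\mathbb{E}\left[Y \mid X_i\right]\right)}{\operatorname{Var}(Y)},
\qquad
S_{Ti} = 1 - \frac{\operatorname{Var}_{X_{\sim i}}\!\left(\mathbb{E}\left[Y \mid X_{\sim i}\right]\right)}{\operatorname{Var}(Y)}
$$

Under these definitions S<sub>Ti</sub> ≥ S<sub>i</sub>, and the difference S<sub>Ti</sub> - S<sub>i</sub> reflects the contribution of interactions between input i and the other inputs.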

### 3. RESULTS

We have constructed a kinetic model that predicts the dynamics of cellular metabolism in pancreatic cancer cells. The model is based on a priori knowledge of the molecular species involved and the reactions and interactions between the species. The complete model describing the metabolic network dynamics incorporates enzymatic reactions involved in glycolysis, glutaminolysis, the tricarboxylic acid (TCA) cycle, and the pentose phosphate pathway (PPP) (**Figure 1**). We represent the cell using a cytoplasmic compartment and the mitochondria. Through glycolysis, glucose is metabolized to pyruvate, which enters the TCA cycle (in the mitochondria), or pyruvate can form lactate (in the cytoplasm), which is excreted from the cell. Glycolysis and the PPP take place in the cytoplasm and are linked through three metabolites: G6P, F6P and G3P. The TCA cycle in the mitochondrial compartment takes the influx of cytoplasmic pyruvate from glycolysis. Additionally, the following metabolites are exchanged between the cytoplasm and the mitochondria: malate, aspartate, citrate, glutamate and alpha-ketoglutarate. In total, the model includes 46 metabolites interacting through 53 enzymatic reactions, where the evolution of the metabolites' concentrations is calculated by solving a set of nonlinear ODEs. The complete set of model reactions and the baseline parameter values from literature are included in the Supplementary Material.
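To illustrate what solving such ODEs involves, the sketch below integrates a single reversible reaction A ⇌ B with a generic reversible Michaelis-Menten rate law and a fixed-step fourth-order Runge-Kutta scheme. The rate law, the parameter values, and the integrator are assumptions chosen for illustration; the actual rate equations of the model are those given in the Supplementary Material.

```python
import numpy as np

def reversible_mm_rate(a, b, vf, vr, ka, kb):
    """Generic reversible Michaelis-Menten rate for A <-> B (assumed form)."""
    return (vf * a / ka - vr * b / kb) / (1.0 + a / ka + b / kb)

def rhs(y, p):
    """Time derivatives of [A, B]: A is consumed at the net rate, B is produced."""
    a, b = y
    v = reversible_mm_rate(a, b, **p)
    return np.array([-v, v])

def rk4(y0, p, dt, n_steps):
    """Fixed-step classical Runge-Kutta integration; returns the trajectory."""
    y = np.array(y0, dtype=float)
    trajectory = [y.copy()]
    for _ in range(n_steps):
        k1 = rhs(y, p)
        k2 = rhs(y + 0.5 * dt * k1, p)
        k3 = rhs(y + 0.5 * dt * k2, p)
        k4 = rhs(y + dt * k3, p)
        y = y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        trajectory.append(y.copy())
    return np.array(trajectory)

# Hypothetical forward/reverse velocities and affinity constants.
params = dict(vf=1.0, vr=0.5, ka=1.0, kb=1.0)
trajectory = rk4([2.0, 0.0], params, dt=0.01, n_steps=4000)
final_a, final_b = trajectory[-1]
```

With these toy values the system relaxes to the point where the forward and reverse terms balance (A = 2/3, B = 4/3), while the total A + B stays constant.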

#### 3.1. Training of the Complete Kinetic Model

We performed parameter estimation to fit the model to quantitative experimental data and estimate the reaction velocities (V<sup>f</sup> and V<sup>r</sup>) that allow the model predictions to best match the available experimental data. As described in the Methods, the complete model is constructed using equations from multiple sources, each of which contains parameters that characterize the rates of the metabolic reactions. Therefore, we fit the model to data specific to pancreatic cancer in order to obtain a validated model that can be used to predict the dynamics of metabolism in pancreatic cancer cells.

#### 3.1.1. Parameter Identifiability Analysis

We first performed parameter identifiability (PI) analysis to determine the pairs of correlated parameters. Specifically, we aimed to identify which of the total 71 forward and reverse reaction velocities are mathematically correlated. Completing this analysis allowed us to fit the forward rate and calculate the reverse rate using the equilibrium constant. Initially, 100 sets of initial conditions were chosen using Latin Hypercube Sampling. We summed the calculated correlation coefficients over the 100 initial conditions and subsequently normalized the summed correlation coefficients. When the forward and reverse reaction velocities (V<sup>f</sup> and V<sup>r</sup>, respectively) for a particular reaction are shown to be highly correlated for multiple sets of initial conditions, we fit the V<sup>f</sup> and calculate V<sup>r</sup> using the equilibrium constant, V<sub>eq</sub>. We performed the PI analysis once using the baseline model and all 71 reaction velocities, identifying 10 correlated pairs ("round 1"). We then performed the analysis again, after specifying the V<sup>r</sup> values found to be correlated in round 1, which identified another two correlated pairs ("round 2"). Through this analysis, we reduced the number of reaction velocities to be fit from 71 to 59. The results of the parameter identifiability analysis are shown in Figures S2–S4.

#### 3.1.2. Global Sensitivity Analysis

Next, we performed global sensitivity analysis to determine which of the reaction velocities most significantly influence the model outputs. Ideally, estimating the sensitivity of the predicted concentrations of the 14 metabolites to variance in the reaction velocities reduces the number of fitted parameters, where only the values of the most influential parameters are estimated. Therefore, we applied the eFAST method (see Section 2.6) to calculate the sensitivity of the fold-change in the metabolite concentrations given variance in the 59 reaction velocities included in the model, for each set of initial conditions. The cumulative result of the sensitivity analysis is shown in Figure S5, where we sum the sensitivity coefficients for the 50 sets of initial conditions. However, the results show that each of the parameters influences at least one of the predicted fold-changes for each set of initial conditions. Therefore, we moved forward with fitting all 59 parameters, so as not to omit any parameter that affects the predicted fold-changes.

#### 3.1.3. Parameter Estimation

Finally, we used particle swarm optimization (PSO) to find the optimal values for each reaction velocity that allow the fold-changes in the metabolite concentrations predicted by the model to accurately match the fold-changes measured experimentally. After model training, the predicted fold-changes closely match the experimental data, as shown in **Figure 2**. As a result, we estimated the values of the reaction velocities for each set of initial conditions. The estimated parameter values are given in the Supplementary Material ("S5.xlsx").

We incorporated growth kinetics with the trained metabolic model to predict the number of cells over time. The cell growth is simulated in the presence of complete media (35 mM of glucose and 6 mM of glutamine) for a total time period of 5 days. The model is able to match the training data for the growth curves measured by Son et al. (2013) (**Figure 3A**). By training the model, we estimated the cell death rate and growth parameters that characterize how the concentrations of glucose, glutamine, and ATP contribute to the rate of cell proliferation (Equations 1 and 2). As a result, four initial conditions out of the total 50 starting initial conditions obtained from LHS were able to fit the data equally well (**Figure 3A**).

#### 3.2. Model Validation

We validated the model with available experimental measurements for cell proliferation under conditions of nutrient deprivation. The validation step confirms that the model is able to predict data not used in the model training. Two initial conditions with their corresponding fitted parameters (reaction velocities and growth parameters) could successfully validate the experimental growth curves measured under minimal glucose and glutamine concentrations (**Figure 3B**). These validated sets of initial conditions (**Table 2**) represent physiologically possible intracellular levels of metabolites present in pancreatic cancer cells. We therefore used only these sets of initial conditions and their corresponding fitted parameters in simulating various conditions and generating predictions that provide novel insight into pancreatic cancer metabolism.

The best-fit parameter sets estimated using these two initial conditions are remarkably consistent. A total of 69% of the reaction velocities and 71% of the growth parameters are within 100-fold of one another, as highlighted in Supplementary File S3. This consistency supports the robustness of the identified parameter values and their physiological plausibility within the intracellular environment of a pancreatic cancer cell, which is difficult to determine experimentally. However, given the large number of parameters that needed to be optimized, along with their interdependence due to upstream and downstream metabolite concentrations, some parameters showed high variability, as is common in systems biology models. Specifically, two growth parameters, αglc and kgc, vary by as much as seven orders of magnitude between the two sets of best-fit parameter values estimated using the two validated initial conditions. These parameters characterize the contribution of glucose to the overall rate of cell proliferation. However, the ratio of α to k for glucose is very similar across the two sets of initial conditions, again pointing to the robustness of the estimated parameter values. The occurrence of high variability in the best-fit parameters is to be expected in highly nonlinear and complex kinetic models (Bellu et al., 2007). However, the strength of the model optimization lies in the fact that despite high variability in certain parameters, the model validation for both initial conditions is highly comparable, as evident from Figure S6.

#### 3.3. Model Robustness

To test the robustness of the model predictions, we predicted how the number of cancer cells increases for varying metabolite initial conditions. We performed a Monte Carlo analysis, running the model with 1,000 different sets of initial conditions randomly selected from a Gaussian distribution. The baseline initial condition for each metabolite is allowed to vary 50% up and down. Here, the mean is the baseline value for the initial condition, and the standard deviation is 1/6 of the mean, so that the ±50% range corresponds to three standard deviations on either side of the mean and contains about 99.7% of the values drawn from the Gaussian distribution. The predicted results for one of the validated sets of initial conditions are shown in **Figure 4**. The simulations indicate that cell proliferation is fairly sensitive to the initial metabolite concentrations. Therefore, our careful procedure of identifying an appropriate set of initial conditions is important in generating valid model predictions.
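A minimal sketch of this sampling scheme is given below. The baseline concentrations are hypothetical, and redrawing the rare values that fall outside the ±50% range is an assumption made here to keep every sample within three standard deviations; the text does not state how such values were handled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical baseline initial concentrations (mM) for three metabolites.
baseline = np.array([0.5, 2.0, 0.05])
n_sets = 1000

# Gaussian with mean = baseline and standard deviation = mean / 6,
# so that +/-50% of the baseline equals three standard deviations.
sigma = baseline / 6.0
samples = rng.normal(loc=baseline, scale=sigma, size=(n_sets, baseline.size))

# Assumption: redraw the few values that land outside the +/-50% range.
low, high = 0.5 * baseline, 1.5 * baseline
outside = (samples < low) | (samples > high)
while outside.any():
    redraw = rng.normal(loc=baseline, scale=sigma, size=samples.shape)
    samples[outside] = redraw[outside]
    outside = (samples < low) | (samples > high)
```

Each row of `samples` would then be used as one set of initial conditions for a model run, and the spread of the predicted cell counts across the 1,000 runs summarizes the sensitivity to the starting concentrations.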

#### 3.4. Predicted Effects of Nutrient Availability

We applied the model to investigate the effects of the availability of glucose and glutamine in the extracellular environment. The cell proliferation rate is explicitly dependent on the concentrations of glucose and glutamine (Ramanathan et al., 2005; Yun et al., 2009), as well as the ability to convert the nutrient sources into ATP. Therefore, we explored how the cell count varied given changes in the extracellular levels of glucose and glutamine. We simulated the model under varying conditions of both glucose and glutamine (**Figure 5**). The model predicts that nutrient availability influences cell proliferation in a nonlinear manner. Additionally, the number of pancreatic cells is predicted to be more dependent on glutamine availability, as compared to glucose, particularly given longer times for cell growth. This result, which holds true for both validated sets of initial conditions, is consistent with experimental observations (Gaglio et al., 2011).

#### 3.5. Predicted Effects on Metabolic Fluxes

The model predicts the dynamic reaction fluxes under varying conditions, providing insight into the metabolic phenotype of the pancreatic cancer cells. The flux through the enzyme-catalyzed reactions indicates the functional impact of each connection in the metabolic network (Sauer, 2006). Therefore, we applied the model to predict the dynamic reaction fluxes through the metabolic reactions both in the baseline model with no GOT1 knockdown (**Figure 6A**) and under GOT1 knockdown (**Figure 6B**). The differences in the reaction fluxes between these two conditions provide mechanistic insight into how altering a single enzyme-catalyzed reaction has a systemic effect on the metabolic network. The model predicts that GOT1 knockdown influences the magnitude and direction of the adenylate kinase (AK) reaction. The AK enzyme catalyzes the production of ADP from ATP and AMP, and in the baseline model, this reaction mostly proceeds in the reverse direction (i.e., there is a net production of ATP). With GOT1 knockdown, the flux through the AK reaction switches after 24 h of cell growth. In this case, less ATP is available to be consumed for proliferation, hence lower cell growth is observed. Additionally, GOT1 knockdown causes the glutamate-pyruvate transaminase (GPT) reaction to proceed in the opposite direction, as compared to the no knockdown case. This means that with GOT1 knockdown, the GPT reaction works to produce glutamate rather than consume it, compensating for the lower glutamate production that occurs when the GOT1 enzyme is not fully active.

TABLE 2 | Final sets of initial conditions that fit the training data and match the validation data well.

#### 3.6. Predicted Response to Metabolic Perturbations

The model predicts the systems-level response to various metabolic perturbations. With the ability to predict the number of pancreatic cancer cells over time and the dynamic reaction fluxes, the model can help identify the enzyme-catalyzed reactions that are effective therapeutic targets to inhibit tumor metabolism and impede cell growth. Therefore, we applied the model to predict the effects of inhibiting various enzymes in the metabolic network. We implemented enzyme knockdowns by decreasing the forward reaction velocity (V<sup>f</sup>) by 85%, either alone or in combination with GOT1 knockdown. We first targeted enzymes that directly influence the three metabolites involved in the cell proliferation rate (glucose, glutamine, and ATP). These enzymes include GLUT1, which catalyzes glucose uptake by the cell, GLS, the enzyme that converts glutamine to glutamate, and OXPHOS, the reaction simulating oxidative phosphorylation. The model predicts that inhibiting these enzymes influences cell growth to varying degrees. GLUT1 knockdown alone is not as effective in reducing cell growth as GOT1 knockdown (**Figure 7A**). Moreover, knockdown of both GLUT1 and GOT1 is as effective in reducing cell growth as GOT1 knockdown alone. Thus, the model indicates that GLUT1 is not an optimal target, as compared to GOT1. In comparison, OXPHOS knockdown leads to lower cell proliferation compared to GOT1 knockdown (**Figure 7B**). Also, under GLS knockdown, cell growth is significantly reduced (**Figure 7C**), alone or in combination with GOT1 knockdown.

The model predicts novel strategies to reduce pancreatic cancer cell metabolism that lead to reduced cell proliferation. After targeting enzymes that directly influence the metabolites whose concentrations influence the cell proliferation rate, we examined the effects of altering other enzymes in the metabolic network, individually and in combination. We conducted a local sensitivity analysis by varying the reaction velocities and predicting the effects on the relative cell number. We systematically reduced each of the 59 fitted reaction velocities in the trained model from 5% knockdown up to complete knockout (Burgard et al., 2003; Meister et al., 2013). In this way, the model is used to specifically pinpoint which enzyme-catalyzed reactions contribute most to cell growth inhibition. Reducing the reaction velocity in the GOT1 reaction produced the expected progressive decrease in cell growth with increasing knockdown strength (Figure S7). However, it is more interesting to apply the model to identify combination therapies, i.e., systematic combinations of knockdown of essential enzymatic reactions. Therefore, we identified how knockdown (reducing the reaction velocity by 85%) for a target enzyme influences the predicted cell growth, alone and in combination with GOT1 knockdown. The model predicts three relevant classes of behaviors that lead to a reduction in cell proliferation, as described below. We show the relative cell count for a representative example from each case in **Figure 7D**, MALPi; **Figure 7E**, GAPDH; and **Figure 7F**, GOT2.
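The structure of such a knockdown sweep can be sketched as follows. The function `toy_relative_cell_count` is a hypothetical stand-in for a full model simulation, the weights are invented for illustration, and the 5% step size is an assumption; only the 5% to 100% range and the 85% combination knockdown come from the text.

```python
import numpy as np

reactions = ["GOT1", "GLUT1", "GAPDH"]
# Hypothetical weights describing how strongly each reaction limits growth.
weights = np.array([0.8, 0.0, 0.4])

def toy_relative_cell_count(activity):
    """Hypothetical stand-in for a model run: relative cell count as a
    function of the remaining activity fraction (1 - alpha) of each reaction."""
    return float(np.prod(activity ** weights))

# Knockdown strengths from 5% up to complete knockout (assumed 5% steps).
knockdowns = np.linspace(0.05, 1.0, 20)

# Single-enzyme sweep: reduce one forward velocity at a time.
single = np.empty((len(reactions), knockdowns.size))
for j in range(len(reactions)):
    for a, alpha in enumerate(knockdowns):
        activity = np.ones(len(reactions))
        activity[j] = 1.0 - alpha
        single[j, a] = toy_relative_cell_count(activity)

# Combination sweep: 85% knockdown of a target together with 85% GOT1 knockdown.
got1 = reactions.index("GOT1")
combined = {}
for j, name in enumerate(reactions):
    if j == got1:
        continue
    activity = np.ones(len(reactions))
    activity[got1] = 1.0 - 0.85
    activity[j] = 1.0 - 0.85
    combined[name] = toy_relative_cell_count(activity)
```

Comparing `combined[name]` with the single-knockdown values for the same target and for GOT1 alone is what separates targets that add to the GOT1 effect from those that do not.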


reduction of NAD to NADH. Interestingly, over-expression of GAPDH has been observed in many types of cancers (Norris et al., 2008; Ganapathy-Kanniappan et al., 2012; Krasnov et al., 2013). Inhibiting GAPDH would decrease the production of downstream metabolites, hence reducing the formation of lipids and amino acids, which are required for cell proliferation (Pereira et al., 2009). As expected, the model predicts reduced cell proliferation upon inhibiting the GAPDH enzyme (**Figure 7E**).

3. Knockdown of the target enzyme alone is very effective in reducing cell proliferation, and combining it with GOT1 knockdown does not have any additional effect. A representative example of this behavior is shown by targeting glutamate oxaloacetate transaminase 2 (GOT2). This enzyme promotes synthesis of OAA by AKG via glutamate. The expression level and activity of the GOT2 enzyme have been found to be highly elevated in pancreatic and breast cancer cells (Chakrabarti et al., 2015; Korangath et al., 2015; Yang et al., 2016). The model predicts that targeting GOT2 activity is a potentially lethal approach to target glutamine metabolism to inhibit tumor growth (**Figure 7F**).

### 4. DISCUSSION

#### 4.1. Robust and Predictive Computational Model

We present a predictive model that enables quantification of the kinetics of the intracellular metabolism of pancreatic cancer cells. The model provides an understanding of how the cells depend on the extracellular conditions (Vander Heiden et al., 2009) and the resulting dynamic reaction fluxes. The ultimate goal is to use the model to tackle this aggressive disease by identifying novel strategies to alter the reprogrammed metabolism within cancer cells (Hanahan and Weinberg, 2011).

The model is predictive of pancreatic cancer cell metabolism in particular, as we carefully calibrated the model to pancreatic cancer-specific data from the 8988T cell line. The calibrated model predicts the metabolite concentrations, reaction fluxes, and number of pancreatic cells over time. As a result of model calibration and validation to data not used in training, we identify feasible sets of initial conditions and kinetic parameters that together provide a model that is specific to pancreatic cancer. We apply the validated model to predict the effects of perturbing specific metabolic reactions, alone and in combination. Interestingly, the model simulations show that targeting the PPP, TCA cycle, or mitochondrial-cytoplasmic shuttle reactions can play an equally important role and can act synergistically with other targets to regulate tumor metabolism.

Computational modeling offers a powerful tool to incorporate the complexity and robustness of the interconnected metabolic pathways and predict how individual and subsets of metabolic reactions give rise to the systemic behavior of the cells. Through parameter identification, sensitivity analyses, and parameter estimation, we obtained a predictive computational model that matches experimental data and can be used to predict metabolic phenotypes of pancreatic cancer. We utilized a quantitative approach to predict how altering nutrient availability and enzyme activity inhibits cancer cell metabolism, and ultimately, cancer cell proliferation. In this way, the model is a valuable framework that generates hypotheses regarding novel therapeutic strategies. The model provides quantitative insight into how the dynamics of metabolism are affected by strategic knockdown of enzyme activity. The strategies that we implemented computationally can be tested experimentally using shRNA to selectively reduce the activity of the targeted enzyme(s). Thus, when combined with experimental studies, the model can prove useful in designing and understanding pre-clinical trials.

Our approach of fitting the model with different sets of initial conditions to generate multiple parameter sets is akin to ensemble modeling for metabolic systems (Tran et al., 2008; Srinivasan et al., 2015; Saa and Nielsen, 2016). The ensemble modeling approach, which has been applied to build dynamic genome-scale models, generates multiple parameter sets (an ensemble of models) that produce the same steady state conditions. Given additional data, such as the distributions of the reaction fluxes under certain perturbations, the number of feasible models can be reduced. The ensemble of models is produced by sampling the parameter space for the kinetic rates, given certain constraints (i.e., thermodynamics or growth requirements). Analogously, we have sampled the space of possible initial metabolite concentrations and trained the model for each set of initial conditions to generate a set of possible kinetic parameters. We then use the cell proliferation data to further identify the sets of appropriate parameters and initial metabolite concentrations. This procedure resulted in two possible models, which are then evaluated to determine their robustness, and finally applied to generate novel predictions.

#### 4.2. Comparison to Other Studies

The metabolic model constructed in this work is a significant expansion beyond existing kinetic models of cancer metabolism. Previously published kinetic models in the context of cancer have mostly focused on the glycolytic pathway. Such models have successfully identified enzymes that are associated with tumor growth and malignancy and are important targets in inhibiting metabolism, including GLUT, HK, PFK-1, and GAPDH (Marín-Hernández et al., 2011, 2014; Shestov et al., 2014). However, the enzymes involved in the TCA cycle and glutaminolysis also significantly contribute to cancer cell proliferation, particularly in the case of pancreatic cancer. Our paper is the first to combine these pathways, along with cell growth, in a model for pancreatic cancer, thereby advancing the field of dynamic metabolic modeling of cancer. The impact of enzymes that catalyze glutaminolysis and TCA cycle reactions was demonstrated experimentally by Son et al. (2013), and our simulations are consistent with their importance.

We can compare the model predictions to experimental studies published in the literature. Over-expression of GLUT has been identified in almost all types of cancer and hence is a key signature of malignancy (Ganapathy-Kanniappan and Geschwind, 2013). Targeting GLUT has been shown to inhibit glucose transport and reduce cell growth (Liu et al., 2012; Granchi et al., 2014). However, due to the ubiquitous expression of GLUT in all cell types, blockage of GLUT remains a critical challenge. Using the model, we could successfully confirm the presence of alternative targets described in the literature, as well as identify novel targets. The model predicts the effects of targeting other pathways by which tumor cells metabolize nutrients and produce building blocks needed for cell proliferation. For example, the model predicts that targeting oxidative phosphorylation (via the OXPHOS enzyme) can significantly reduce cell growth, in combination with inhibition of the GOT1 enzyme. Indeed, the literature has shown that targeting this pathway by which the cell generates ATP in the mitochondria (Caro et al., 2012; Haq et al., 2013; Vazquez et al., 2013; Viale et al., 2014; Weinberg and Chandel, 2015), synergistically with optimal inhibition of glycolysis and glutaminolysis, may increase the effectiveness of cancer therapeutics (Lu et al., 2015; Yadav et al., 2015). Another example is inhibition of glutaminase (GLS), the enzyme responsible for converting glutamine to glutamate. The glutamate produced in this reaction subsequently enters the TCA cycle to ultimately generate metabolites such as OAA, AKG, acetyl-CoA, and citrate for lipid production and nitrogen for DNA synthesis (Chen and Cui, 2015). The GLS enzyme is reported to have a positive correlation with cancerous tumor growth from normal cells due to enhanced glutaminolysis (Lora et al., 2004; Xiang et al., 2015), making it a potential target for effective cancer therapeutics. The model predicts a synergistic effect when GLS is inhibited in combination with GOT1. Interestingly, inhibitors of GLS are being explored: BPTES (DeLaBarre et al., 2011; Hartwick and Curthoys, 2012) and CB839 (Gross et al., 2014) have been shown to induce apoptosis in cancer cells. These predicted effects of targeting OXPHOS and GLS, along with those described in Section 3.6 and illustrated in **Figure 7**, demonstrate the utility of the model and support its validity. This agreement between the model results and known experimental studies lends confidence to the model's predictions.

#### 4.3. Model limitations

Our model accurately reproduces, both quantitatively and qualitatively, experimental data used for training and validation. However, there are certain limitations that can be addressed as additional quantitative data become available for model fitting. Currently, the model only considers cancer cells; however, it is important to consider additional cell types within the tumor. We can extend the model to predict the effects of interactions between multiple cell types and to understand the dynamics of exchange of nutrients between the cells. Expanding the model in this way could enable a better understanding of the symbiosis between cells (Mendoza-Juez et al., 2012) and how the tumor microenvironment can alter the cells' metabolic dependencies and induce apoptosis (Phipps et al., 2015). Another limitation is that the model does not include intracellular recycling pathways or scavenging mechanisms such as autophagy (organelle degradation by autophagosomes) or macropinocytosis (engulfing the nutrients followed by lysosomal degradation). Additionally, the model assumes that the concentrations of glucose, glutamine, and ATP directly correlate to the cellular resources required for biomass production and cell proliferation. Therefore, we do not include the steps toward amino acid synthesis or nucleotide synthesis through the non-oxidative arm of the PPP or the hexosamine biosynthesis pathway. These are processes that enable cancer cells to promote biomass synthesis and could be added as future extensions to the existing model. Finally, given additional data, the model can be adapted to predict the metabolism in a range of cancer cell types beyond pancreatic cancer.

#### 5. CONCLUSION

The metabolic model presented here is a novel computational tool for investigating the metabolism of pancreatic cancer cells. The model includes enzyme-catalyzed reactions in central metabolic pathways and is trained and validated using quantitative experimental measurements, specific to pancreatic cancer lines. As a result, we have constructed the first kinetic model of pancreatic cancer metabolism. The model predicts the effects of both intracellular and extracellular perturbations, providing the metabolic fluxes and the number of cancer cells over time. With a successful identification of appropriate initial conditions and parameter values for pancreatic cancer, the model serves as a good starting point to predict the dynamic metabolism in other pancreatic cancer cell lines as well as a template for studying cell growth in other cell types. Additionally, using model simulations, we can design novel in silico combinatorial therapies toward impeding cancer cell proliferation. Thus, the model can be used to complement in vitro and in vivo pre-clinical studies.

### AUTHOR CONTRIBUTIONS

SF designed the research. MR constructed the model and performed the simulations and analyses. All authors contributed to writing the manuscript and approved of its final version.

### FUNDING

This work is supported by The Rose Hills Foundation and the USC Provost's Office (research grant to SF).

### ACKNOWLEDGMENTS

The authors thank members of the Finley research group for helpful discussions. Computation for the work described in this paper was supported by the University of Southern California's Center for High-Performance Computing (https://hpcc.usc.edu).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphys.2017.00217/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Roy and Finley. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Modeling the Pro-inflammatory Tumor Microenvironment in Acute Lymphoblastic Leukemia Predicts a Breakdown of Hematopoietic-Mesenchymal Communication Networks

#### Jennifer Enciso<sup>1,2</sup>, Hector Mayani<sup>1</sup>, Luis Mendoza<sup>3</sup>\* and Rosana Pelayo<sup>1</sup>\*

<sup>1</sup> Oncology Research Unit, Mexican Institute for Social Security, Mexico City, Mexico, <sup>2</sup> Biochemistry Sciences Program, Universidad Nacional Autónoma de Mexico, Mexico City, Mexico, <sup>3</sup> Departamento de Biología Molecular y Biotecnología, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de Mexico, Mexico City, Mexico

#### Edited by:

Christian Diener, National Institute of Genomic Medicine, Mexico

#### Reviewed by:

Oksana Sorokina, University of Edinburgh, UK; Marcio Luis Acencio, Norwegian University of Science and Technology, Norway

#### \*Correspondence:

Luis Mendoza, lmendoza@biomedicas.unam.mx; Rosana Pelayo, rosanapelayo@gmail.com

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 22 March 2016; Accepted: 02 August 2016; Published: 19 August 2016

#### Citation:

Enciso J, Mayani H, Mendoza L and Pelayo R (2016) Modeling the Pro-inflammatory Tumor Microenvironment in Acute Lymphoblastic Leukemia Predicts a Breakdown of Hematopoietic-Mesenchymal Communication Networks. Front. Physiol. 7:349. doi: 10.3389/fphys.2016.00349

Lineage fate decisions of hematopoietic cells depend on intrinsic factors and extrinsic signals provided by the bone marrow microenvironment, where they reside. Abnormalities in composition and function of hematopoietic niches have been proposed as key contributors to acute lymphoblastic leukemia (ALL) progression. Our previous experimental findings strongly suggest that pro-inflammatory cues contribute to mesenchymal niche abnormalities that result in maintenance of ALL precursor cells at the expense of normal hematopoiesis. Here, we propose a molecular regulatory network interconnecting the major communication pathways between hematopoietic stem and progenitor cells (HSPCs) and mesenchymal stromal cells (MSCs) within the BM. Dynamical analysis of the network as a Boolean model reveals two stationary states that can be interpreted as the intercellular contact status. Furthermore, simulations describe the molecular patterns observed during experimental proliferation and activation. Importantly, our model predicts instability in the CXCR4/CXCL12 and VLA4/VCAM1 interactions following microenvironmental perturbation caused by temporal signaling from Toll-like receptor (TLR) ligation. Therefore, aberrant expression of NF-κB induced by intrinsic or extrinsic factors may contribute to create a tumor microenvironment where a negative feedback loop inhibiting the CXCR4/CXCL12 and VLA4/VCAM1 cellular communication axes allows for the maintenance of malignant cells.

Keywords: cancer systems biology, acute lymphoblastic leukemia, tumor microenvironment, CXCL12, pro-inflammatory bone marrow, early hematopoiesis, network modeling, dynamical systems

## INTRODUCTION

Cancer is currently considered a global child health priority (Gupta et al., 2014). The application of effective treatments to decrease overall childhood cancer mortality requires a comprehensive understanding of its origins and pathobiology, along with accurate diagnosis and early identification of high-risk groups (reviewed in Vilchis-Ordoñez et al., 2016). Strikingly, the clinical, molecular and biological heterogeneity of malignant diseases indicating an unsuspected multiclonal diversity has highlighted their complexity and the uncertainty of their cell population dynamics. Novel theoretical and experimental integrative strategies have changed our perspective of cancer, from a hierarchical, deterministic and unidirectional process to a multi-factorial network where genetics interacts with micro and macro environmental cues that contribute to the etiology and maintenance of tumor cells (Notta et al., 2011; Davila-Velderrain et al., 2015; Tomasetti and Vogelstein, 2015). Furthermore, stochastic effects associated with the number of stem cell divisions have been proposed as major contributors, often even more significant than hereditary or external factors (Tomasetti and Vogelstein, 2015).

B-cell acute lymphoblastic leukemia (B-ALL) is largely the result of a growing number of cooperating genetic and epigenetic aberrations that corrupt hematopoietic developmental pathways and ultimately lead to uncontrolled production of malignant B lymphoid precursor cells within the bone marrow (BM) (Pelayo et al., 2012; Purizaca et al., 2012). Leukemic cell infiltration and treatment failure worsen the outcome of the disease and remain the foremost cause of relapse. Recent advances suggest the ability of leukemia initiating cells to create abnormal BM microenvironments, promoting high proliferation and early differentiation arrest at the expense of normal cell fate decisions (Colmone et al., 2008; Raaijmakers, 2011; Vilchis-Ordoñez et al., 2015). Intrinsic damage and/or remodeling of cell compartments that shape the distinct BM niches may account for microenvironmental regulation of quiescence, proliferation, differentiation and blastic cell migration. Leukemic cells compete for niche resources with their normal hematopoietic counterparts (Wu et al., 2009), culminating in the displacement of the latter, as observed in mouse xenotransplantation models (Colmone et al., 2008). Moreover, the marrow microenvironment provides leukemic precursors with dynamic interactions and regulatory signals that are essential for their maintenance, proliferation and survival. Although the underlying molecular mechanisms are poorly defined, these niches protect tumor cells from chemotherapy-induced apoptosis, offering a new perspective on the evolution of chemoresistance (Ayala et al., 2009; Shain et al., 2015; Tabe and Konopleva, 2015), and emphasizing the need for new models that theoretically or experimentally replicate the interplay between tumor and stromal cells under normal and pathological settings.

As suggested by our previous findings, ALL lymphoid precursors are able to respond to pathogen- or damage-associated molecular patterns via Toll-like receptor signaling by secreting soluble factors and altering their differentiation potentials (Dorantes-Acosta et al., 2013). The resulting pro-inflammatory microenvironment may expose them to prolonged proliferation, contributing to tumor maintenance in a self-sustaining way while prompting the NF-κB-associated proliferation of normal progenitor cells (Vilchis-Ordoñez et al., 2015, 2016). Some hematopoietic growth factors and pro-inflammatory cytokines, including granulocyte-colony stimulating factor (G-CSF), IFNα, IL-1α, IL-1β, IL-7, and TNFα, were highly produced by ALL cells from a conspicuous group of patients co-expressing myeloid markers (Vilchis-Ordoñez et al., 2015). Of note, mesenchymal stromal cells (MSCs) from ALL BM have shown atypical production of pro-inflammatory factors, whereas disruption of the major cell communication pathway is apparent from the impairment of CXCL12 expression and biological function (Geay et al., 2005; Colmone et al., 2008; van den Berk et al., 2014).

Considering that the CXCL12/CXCR4 axis constitutes the most critical component of the perivascular and reticular BM niches supporting the hematopoietic stem and progenitor cells (HSPCs) differentiation and maintenance within the BM, as well as the early steps of B cell development (Ma et al., 1998; Tokoyoda et al., 2004; Sugiyama et al., 2006; Greenbaum et al., 2013), an obstruction of the HSPC-MSC interaction may have substantial implications in the overall stability of these processes. Whether the inflammation-derived signals provide a mechanism for leukemic cells to survive, to induce changes in lineage cell fate decisions, or to prompt niche remodeling in leukemia settings, are currently topical questions.

Mathematical modeling strategies have become powerful approaches to complex biological systems and may help unravel the hematopoietic-microenvironment interplay that facilitates tumor cell prevalence (Altrock et al., 2015; Enciso et al., 2015). Through continuous dynamic modeling with differential equations we have learned seminal aspects of multi-compartment and multi-clonal behavior of leukemic cell populations (Stiehl and Marciniak-Czochra, 2012; Enciso et al., 2015), leading to novel proposals on disease development driven by unbalanced competition between normal and preleukemic cells (Swaminathan et al., 2015). Both stochastic and deterministic models have been useful to simulate cell fate decisions and predict clonal evolution (reviewed in Enciso et al., 2015). Certainly, incorporating the tumor microenvironment in cancer modeling is expected to change our vision of biochemical interactions in niche remodeling-dependent hematopoietic growth, as recently demonstrated for myeloma disease (Coelho et al., 2016).

By developing and simulating a dynamic Boolean system, we now investigate the biological consequences of microenvironmental perturbation caused by temporal TLR signaling on crucial communication networks between stem/progenitor cells (HSPCs) and MSCs in ALL. We propose that NF-κB-dependent tumor-associated inflammation co-participates in malignant progression concomitant with normal hematopoietic failure through disruption of the CXCL12/CXCR4 and VLA4/VCAM-1 communication axes.

### MATERIALS AND METHODS

### Manual Curation Strategy

Based on the crucial and unique role of the CXCL12/CXCR4 axis in the regulation of maintenance, biological activity, and niche communication-derived cell fate decisions of seminal cells, including pluripotent embryonic stem cells and multipotent hematopoietic stem cells, construction and updating of the relevant molecular interactions involved careful manual curation of primary hematopoietic cell research. Special attention was given to hematopoietic malignancies, which, in contrast to solid tumors, display a distinct CXCL12-mediated microenvironmental behavior. Thus, although the modeled signaling pathways could be considered generic to all tissues, the organ, stage of cell differentiation and surrounding microenvironment may influence the net result of interactions. Taking these considerations into account, most of the published work used for the reconstruction of our proposed model includes data from molecular interactions in HSPCs. Some of the interactions have been reported in a number of different tissues and are predicted to be conserved in the hematopoietic system. Finally, as there are not enough data to model the hematopoietic microenvironment restricted to Homo sapiens and some interactions might be crucial for the molecular connectivity of the model, we have used information from different species when needed. A detailed referencing of all reports used for the model reconstruction is provided as Supplemental Material (Tables S1, S2, and reference list).

### Molecular Basis for the Network Reconstruction

The connectivity among key molecules involved in the communication between HSPCs and MSCs within the BM was inferred from the curated experimental literature. Specifically, we were interested in recovering the network components, their interactions, and the nature of the interactions (activation/positive or inactivation/negative). The resulting general network incorporates transcription factors, kinases, membrane receptors, interleukins, integrins, growth factors, and chemokines from Homo sapiens and Mus musculus. Importantly, to simplify the modeling process, some groups of molecules were considered as single functional modules, thus encompassing a series of sequential steps that lead to the activation or inactivation of a certain node (e.g., PI3K/Akt). The following paragraphs summarize the principal evidence used to reconstruct the HSPC-MSC network and infer the logical rules for computational simulation of the system as a discrete dynamical model. A detailed referencing is provided as Supplemental Material (Tables S1, S2, and reference list).

The CXCR4/CXCL12 chemokine pathway was considered as the central axis for the network construction considering its essential role in homeostasis maintenance (Sugiyama et al., 2006; Tzeng et al., 2011) and B lineage support (Ma et al., 1998; Tokoyoda et al., 2004). Furthermore, recent observations suggest that this axis is disrupted by up-stream molecular deregulations both in MSC and leukemic blasts harvested from ALL patients, affecting the maintenance of hematopoietic cells within their regulatory niches (Geay et al., 2005; Colmone et al., 2008; van den Berk et al., 2014). Besides the well-studied CXCR4/CXCL12 chemotactic interaction, CXCR4 activation increases the affinity between vascular cellular adhesion molecule-1 (VCAM-1) expressed on the surface of MSC and its receptor VLA-4 on HSPC. Both pathways, CXCR4/CXCL12 and VLA-4/VCAM-1, are known to play coordinately a central role in HSPC migration, engraftment and retention within the BM (Peled et al., 2000; Ramirez et al., 2009), converge in triggering the PI3K/Akt and ERK signals, and share common up-stream regulators involving molecular factors guiding inflammatory responses.

As mentioned in the Introduction, recent evidence indicates the secretion of high levels of pro-inflammatory cytokines by a conspicuous group of ALL patients (Vilchis-Ordoñez et al., 2015), thereby presumably contributing to remodeling of the normal hematopoietic microenvironment (Colmone et al., 2008). Of note, interleukin-1α (IL-1α) and IL-1β, which were substantially elevated, play an amplification role in inflammation, increasing the expression of other cytokines, like G-CSF (Majumdar et al., 2000; Allakhverdi et al., 2013), and setting a positive feedback loop with the PI3K co-activation of NF-κB (Reddy et al., 1997; Sizemore et al., 1999; Carrero et al., 2012; Bektas et al., 2014). IL-1 and G-CSF inhibit the CXCR4/CXCL12 axis directly and indirectly. G-CSF negatively regulates CXCL12 transcription and increases the secretion of matrix metalloproteinase-9, which is able to degrade both CXCL12 (Lévesque et al., 2003; Semerad et al., 2005; Christopher et al., 2009; Day et al., 2015) and CXCR4 (Lévesque et al., 2003). Moreover, G-CSF promotes up-regulation of Gfi1, which in turn inhibits the transcription of CXCR4 (Zhuang et al., 2006; De La Luz Sierra et al., 2007; de la Luz Sierra et al., 2010). Thus, considering this information from experimental data, we have included IL-1 and G-CSF as key elements of the BM microenvironment in the HSPC-MSC communication network.

In concordance, we incorporated as a "positive control condition" an input node representing the Toll-like receptor ligand (lTLR) lipopolysaccharide (LPS), which binds TLR4 and triggers the conventional and well-known NF-κB-dependent pro-inflammatory response, promoting, among other transcriptional targets, the transcription of pro-IL-1β (Jones et al., 2001; Tak and Firestein, 2001; Wang et al., 2002; Khandanpour et al., 2010; Higashikuni et al., 2013).

Downstream of NF-κB, the expression of CXCR7 has been shown to be upregulated (Tarnowski et al., 2010), which in turn down-regulates CXCR4 by heterodimerization, promoting its internalization and further degradation. In parallel, activated CXCR7 presents a higher affinity for CXCL12 and β-arrestin, reducing CXCR4 signaling in cells expressing both CXCR7 and CXCR4 (Uto-Konomi et al., 2013; Coggins et al., 2014). However, CXCR7 is unable to couple with G-protein, transducing through recruitment of β-arrestin and leading to MAP kinases Akt and ERK activation (Tarnowski et al., 2010; Uto-Konomi et al., 2013; Torossian et al., 2014). As with CXCR4, CXCR7, and VLA-4 activation in HSPC, the PI3K/Akt pathway is activated in HSPC and MSC via G-CSF receptor signaling (Liu et al., 2007; Vagima et al., 2009; Ponte et al., 2012; Furmento et al., 2014), and after LPS stimulation (Guha and Mackman, 2002; Wang et al., 2009; McGuire et al., 2013). Apparently, PI3K/Akt acts at overlapping levels on the modulation of inflammation. On the one hand, it increases the production of IL-1 antagonist molecules (Williams et al., 2004; Molnarfi et al., 2005; Li and Smith, 2014) and inhibits secretion of mature IL-1β (Tapia-Abellán et al., 2014). On the other hand, it promotes nuclear translocation of the transcription factor Foxo3a (Brunet et al., 1999; Miyamoto et al., 2008; Park et al., 2008), indirectly down-regulating the transcription of antioxidant enzymes and enabling reactive oxygen species (ROS) accumulation, which in turn promotes maturation of pro-IL-1β and IL-1β secretion (Hsu and Wen, 2002; Yang et al., 2007; Gabelloni et al., 2013).

On the mesenchymal side, in addition to a number of molecules participating in the sensitivity of the MSC subsystem to microenvironmental cues, we incorporated an input node representing the gap junction formed by connexin-43 (Cx43), which mediates direct intercellular communication between mesenchymal cells. Strikingly, its integral activity as a calcium channel conductor has been shown to be a potent positive regulator of CXCL12 transcription and secretion (Schajnovitz et al., 2011). Furthermore, Cx43 expression appears to be critically dysregulated in the BM stromal cells from acute leukemia patients, suggesting an important role in the hypothesized dysregulation of hematopoietic-stromal intercellular communication (Liu et al., 2010; Zhang et al., 2012). The inclusion of GSK3β and β-catenin in both subsystems was relevant due to their roles as intermediates of signaling transduction and regulation of the main intracellular communication elements proposed for our network reconstruction. The model is available in XML format (GINML) on the GINsim Model Repository page (http://ginsim.org/models\_repository) (Chaouiya et al., 2012), under the title "HSPCs-MSCs. Communication pathways between Hematopoietic Stem Progenitor Cells (HSPCs) and MSCs."

### Dynamical Modeling of the HSPC-MSC Network

For the computational modeling of the HSPC-MSC complex system, we followed the standard steps to convert it into a discrete dynamical system, as described by Albert and Wang (2009) and Assman and Albert (2009). The Boolean approach is useful when quantitative and detailed kinetic information is lacking. In such a case, each node of the network is represented as a binary element, allowed only to have an "active" (ON) or "inactive" (OFF) state, numerically represented by 1 and 0, respectively. The activation state of each node is dependent on the activation state of its regulators, as described by Boolean functions, also called logical rules. The classical Boolean operators employed in Boolean functions are AND (&), OR (|) and NOT (!). The AND operator is used to represent the requirement of the conjunction of two or more nodes participating in the regulation of a certain node (e.g., VLA-4 = CXCR4 & VCAM-1, representing that optimal VLA-4 activation requires its ligand VCAM-1 and the signaling due to CXCR4 activation). When more than one node is able to regulate another, but any one of them is sufficient to exert the effect, the OR operator is applied (e.g., PI3K/Akt = GCSF | ROS | TLR, representing that the activation of the G-CSF receptor, the increase of intracellular ROS concentration or the binding of a TLR ligand may activate PI3K/Akt signaling). Finally, the NOT operator represents repression of one node by another (e.g., IL-1 = (NF-κB & ROS) & !PI3K/Akt, meaning that IL-1 requires the transcriptional activation of pro-IL-1 promoted by NF-κB and the post-transcriptional maturation mediated by ROS, but its signaling is inhibited by the presence of PI3K/Akt). A detailed compilation of the references reviewed for the network reconstruction and the development of the logical rules can be found in Tables S1, S2.
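The three example rules quoted above can be written as ordinary Boolean functions. The following is a minimal Python sketch for illustration only; the simulations reported here were run in BoolNet (R), and the function names and dictionary keys below are assumptions introduced for this example, not the identifiers used in Table 1.

```python
# Illustrative encoding of the three example rules from the text.
# Keys such as "NFkB" and "PI3K_Akt" are hypothetical labels chosen here.

def vla4(state):
    # VLA-4 = CXCR4 & VCAM-1 (both regulators are required)
    return bool(state["CXCR4"] and state["VCAM1"])

def pi3k_akt(state):
    # PI3K/Akt = GCSF | ROS | TLR (any one regulator is sufficient)
    return bool(state["GCSF"] or state["ROS"] or state["TLR"])

def il1(state):
    # IL-1 = (NF-kB & ROS) & !PI3K/Akt (activation blocked by a repressor)
    return bool(state["NFkB"] and state["ROS"] and not state["PI3K_Akt"])

example = {"CXCR4": True, "VCAM1": True, "GCSF": False, "ROS": True,
           "TLR": False, "NFkB": True, "PI3K_Akt": False}
print(vla4(example), pi3k_akt(example), il1(example))
```

With this input, IL-1 is active because both of its activators are present and its repressor is absent; setting `PI3K_Akt` to `True` switches it off.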

Given that each node in the network has an activation state, then the general state of a network at a given time t can be represented by a vector of n elements, where n is the number of nodes in the network. For example, the vector (00000010000000000100001000), represents a network state where only the 7th, 18th, and 23rd elements are active. In our model, this particular state represents the pattern of activation where only GSK3B\_H, GSK3B\_M, and VCAM1\_M are active. Now, since we are implementing a dynamical system, it is necessary to specify how the network may evolve from a time t to t+1.
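As a small illustration of this notation, the state vector from the example can be decoded in a few lines of Python. This is only a sketch of the bookkeeping; the variable names are assumptions made for this example.

```python
# Decode the example network state: one character per node, 1 = active.
state = "00000010000000000100001000"

# 1-based positions of the active nodes, matching the counting in the text.
active_positions = [i + 1 for i, bit in enumerate(state) if bit == "1"]

print(len(state))          # number of nodes in the network
print(active_positions)    # positions of the active nodes
```

The vector has 26 entries, one per node, and the active positions are 7, 18, and 23.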

There are two possible schemes for modeling the transition from one state of the network to another. The synchronous scheme updates the activation state of all nodes at each time-step, assuming that all the biological processes involved in the model occur at similar time scales. The asynchronous scheme updates only one of the logical rules per time-step, reflecting a more complex behavior of biological processes in which molecular signaling is likely to change at different time points depending on the nature of the interaction (Albert and Wang, 2009). Either update scheme takes an initial combination of node states (the initial state) and applies the logical rules successively for an established number of time-steps or until a steady state or attractor is reached. Attractors may consist of a single state (fixed point attractors) or a set of states (cyclic or complex attractors, depending on whether they have one or more possible transition paths among their constituent states). The analysis of the node activation patterns in the attractors gives the biological significance of the computational simulations of the model (Albert and Wang, 2009; Assman and Albert, 2009).
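The difference between the two update schemes can be sketched on a toy example. The three-node network below (A, B, C) is hypothetical and is not part of the HSPC-MSC model; it was chosen only because its input A switches the dynamics between a fixed point and a cycle. All function names are assumptions made for this sketch.

```python
import random

# Hypothetical toy network: A is an input, B and C form a negative feedback loop.
RULES = {
    "A": lambda s: s["A"],                  # input node keeps its value
    "B": lambda s: s["A"] and not s["C"],
    "C": lambda s: s["B"],
}

def sync_step(state):
    """Synchronous update: every rule is applied to the same old state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}

def async_step(state, rng):
    """Asynchronous update: one randomly chosen rule is applied per step."""
    node = rng.choice(sorted(RULES))
    new_state = dict(state)
    new_state[node] = bool(RULES[node](state))
    return new_state

def sync_attractor(state):
    """Iterate synchronously until a state repeats; return the repeating part."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = sync_step(state)
    return seen[seen.index(state):]

off = sync_attractor({"A": False, "B": True, "C": True})
on = sync_attractor({"A": True, "B": False, "C": False})
print(len(off), len(on))
```

With the input off, the toy network settles into a single fixed point; with the input on, the synchronous scheme cycles through four states. Under asynchronous updating the trajectory depends on the random order in which rules are applied, which is why many random initial states are typically sampled.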

The dynamical behavior of the network was analyzed implementing the logical rules into BoolNet (R open-source package), and obtaining its attractors (stationary states) by applying asynchronous update strategies (Müssel et al., 2010). Under the asynchronous updating scheme, the simulation was performed using 50,000 random initial states, updating the network until either a fixed point attractor or a complex attractor was reached. Confidence of the model was tested through the simulation of all possible mutants (constitutive and null activation of every node) and the comparison of the resultant attractors with experimental reports about the biological effects in vivo or in vitro after the use of antagonists or the generation of knock-in and knock-out models.
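The mutant analysis can be illustrated by replacing the rule of one node with a constant. The sketch below reuses a hypothetical three-node toy network (not the HSPC-MSC model) and finds fixed points by exhaustive enumeration, which is feasible only for very small networks; the attractors reported in this work were obtained with BoolNet. The helper names `fix_node` and `fixed_points` are assumptions made for this example.

```python
from itertools import product

# Hypothetical toy network: A is an input, B and C form a negative feedback loop.
RULES = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"] and not s["C"],
    "C": lambda s: s["B"],
}

def fix_node(rules, node, value):
    """Return a mutant rule set: value=False is a null mutant (knock-out),
    value=True is a constitutively active mutant."""
    mutant = dict(rules)
    mutant[node] = lambda s, v=value: v
    return mutant

def fixed_points(rules):
    """List states that every rule maps to itself. Fixed points do not
    depend on whether updating is synchronous or asynchronous."""
    nodes = sorted(rules)
    found = []
    for bits in product([False, True], repeat=len(nodes)):
        state = dict(zip(nodes, bits))
        if all(bool(rules[n](state)) == state[n] for n in nodes):
            found.append(state)
    return found

wild_type = fixed_points(RULES)
knock_out = fixed_points(fix_node(RULES, "C", False))
constitutive = fixed_points(fix_node(RULES, "C", True))
print(len(wild_type), len(knock_out), len(constitutive))
```

In this toy case the wild type has one fixed point, whereas either mutant of C breaks the feedback loop and yields two. Comparing such mutant attractors with reported knock-out or knock-in phenotypes is the kind of check described above.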

### Dynamical Multicellular Approach

Assuming that every simulation beginning at a certain initial state of the network represents the dynamical profile of a single cell, Wu and collaborators proposed a "population-like" analysis for a discrete model (Wu et al., 2009). Similarly, we asynchronously ran the simulations of the network from 50,000 random initial states, updated each for 2000 time-steps, and then calculated the average activation value over the 50,000 simulations for each node at each time-step. These data were plotted as multi-cellular average activation graphs. Furthermore, we evaluated the effect of a short (1 time-step) and a sustained (699 time-steps) temporary induction of lTLR at time-steps 700 and 1400, and analyzed the dynamical effects in the wild type network and in some relevant mutant networks.
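The population-like averaging can be sketched as follows. This is a minimal Python illustration on a hypothetical three-node toy network with a single input A standing in for an inducible signal; it is not the HSPC-MSC model, and the function name, the `pulse` parameter, and the small population size are assumptions made for this example.

```python
import random

# Hypothetical toy network: A is an input, B and C form a negative feedback loop.
RULES = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"] and not s["C"],
    "C": lambda s: s["B"],
}

def average_activation(n_cells, n_steps, pulse=None, seed=0):
    """Simulate n_cells asynchronously and return, per time-step, the
    fraction of cells in which each node is active.
    pulse=(start, length) forces input A on for `length` time-steps."""
    rng = random.Random(seed)
    nodes = sorted(RULES)
    # Random initial states, except the input, which starts inactive.
    cells = [{n: rng.random() < 0.5 for n in nodes} for _ in range(n_cells)]
    for cell in cells:
        cell["A"] = False
    averages = []
    for t in range(n_steps):
        if pulse is not None:
            start, length = pulse
            if t == start:
                for cell in cells:
                    cell["A"] = True
            elif t == start + length:
                for cell in cells:
                    cell["A"] = False
        averages.append({n: sum(c[n] for c in cells) / n_cells for n in nodes})
        for cell in cells:
            node = rng.choice(nodes)            # one rule per cell per step
            cell[node] = bool(RULES[node](cell))
    return averages

avgs = average_activation(n_cells=200, n_steps=60, pulse=(20, 10))
print(avgs[19]["A"], avgs[20]["A"], avgs[30]["A"])
```

Plotting each node's average against the time-step gives a curve of the kind described above: non-input nodes start near 0.5, relax toward their attractor values, and shift while the input pulse is applied.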

### RESULTS

### Network Reconstruction

The inferred HSPC-MSC network (**Figure 1**) constitutes the first attempt to model relevant interaction axes between undifferentiated hematopoietic cells and the BM microenvironment, which may bring us closer to a deeper understanding of the numerous molecular signals influencing the regulation of the hematopoietic system during normal and malignant processes. Our current ALL network has 26 nodes and 80 interactions. Among them, twelve nodes correspond to molecules that are expressed in HSPC and are either involved in intracellular signaling (PI3K/Akt, Gfi1, NF-κB, GSK3β, FoxO3a, ERK, β-catenin, and ROS) or are cell-membrane receptors for communication with the microenvironment (CXCR4, CXCR7, VLA-4, and TLR). Eleven nodes make up the MSC subsystem, which comprises intracellular signaling molecules (PI3K/Akt, NF-κB, GSK3β, FoxO3a, ERK, β-catenin, and ROS), a gap-junction protein regulating communication among MSC (Cx43), ligands for communication with HSPC (VCAM-1 and CXCL12), and TLR. Common internal nodes in both HSPC and MSC systems are representative molecules from the most studied pathways influencing proliferation, migration, survival, and, for some of them, differentiation. Finally, the microenvironmental compartment is represented by G-CSF secreted by myeloid and stromal cells (Majumdar et al., 2000; Allakhverdi et al., 2013; Tesio et al., 2013; Boettcher et al., 2014), its inducer IL-1, which is secreted by MSC and HSPC, and lTLR, included to model a homeostasis disruption that is known to drive pro-inflammatory signaling. Model inputs are Cx43 and lTLR, while the activation value of the other 24 nodes is dependent on the network topology and the initial state of the input nodes. All logical rules used for the computational simulation with BoolNet are shown in **Table 1**. Note that the logical rules for the input nodes include self-regulations, but these are for computational purposes to represent their sustained activation, rather than a biological reality.

FIGURE 1 | Regulatory HSPC-MSC network. The network is constituted by three compartments represented with different geometric shapes: HSPC, MSC, and microenvironmental soluble factors. HSPC and MSC have intracellular nodes regulating the response and expression of elements mediating the communication between them. The CXCR4/CXCL12 and VLA-4/VCAM-1 axes are suggested to be the most crucial communicating elements. HSPC and MSC are both susceptible to TLR stimulation with the lTLR input. HSPC, hematopoietic stem and progenitor cell; MSC, mesenchymal stromal cell.

#### TABLE 1 | Logical rules used for HSPC-MSC modeling as a Boolean system on BoolNet.

Nodes representing molecules in HSPC are denoted with "\_H" at the end of the node name, while nodes representing molecules in MSC are denoted with "\_M." Logical rules were constructed using the logical operators AND ( & ), OR ( | ) and NOT ( ! ). The corresponding common names and gene IDs are found in Table S3.

### Attractors of the Wild-Type Network: Searching for the Relevance of TLR in the Biology of CXCL12

The asynchronous simulation of the Boolean model returned 4 attractors: 2 fixed points and 2 complex attractors (**Figure 2**). The first two attractors, fixed point attractor 1 and 2, were identified with the physiological detached and attached state of the HSPC with its MSC counterpart, respectively.

Both fixed point attractors depend on the initial states of TLR and Cx43. Thus, in the absence of lTLR, the final fates depend on the initial activation state of Cx43. However, once TLR is activated, the final fates no longer depend on the activation state of Cx43.

Loss of HSPC-MSC communication, corresponding to a detachment state, is due to the absence of Cx43 and the consequent inactivation of CXCL12. In the activation pattern of this attractor, only VCAM-1, accompanied by GSK3β in both subsystems, remained active (Tabe et al., 2007). On the contrary, when Cx43 is active (as in fixed point attractor 2), CXCL12 is expressed by the MSC, which in turn positively regulates the CXCR4 receptor required for the activation of the VLA-4/VCAM-1 axis. The pattern in HSPC corresponds to ERK and PI3K/Akt activation, well-described elements downstream of CXCR4 and VLA-4 (Tabe et al., 2007). β-catenin, whose function in stem cell maintenance is a subject of debate, is turned on as a consequence of GSK3β inhibition by PI3K/Akt (Dao et al., 2007).

Complex attractors 1 and 2 share the same activation values in all nodes except Cx43, which is an input and therefore remains either active or inactive throughout the simulation, according to its initial state. Importantly, these two attractors have the lTLR node active, so that under induced pro-inflammatory conditions the resultant perturbation of CXCR4/CXCL12 and VLA-4/VCAM-1 is exclusively dependent on CXCL12 down-regulation in MSC by NF-κB. The network attractors are concordant with experimental observations (Ueda et al., 2004; Wang et al., 2012; Yi et al., 2012), with the exception of the inactivation of IL1 and GCSF despite lTLR-induced NF-κB signaling in the hematopoietic and mesenchymal compartments. To explain this discrepancy, we note that an attractor is a stable network state or set of states, reached after the network has gone through a sequence of transient states. In most biological systems there is cross-pathway communication that modulates the cellular response (Williams et al., 2004; Tapia-Abellán et al., 2014), so IL1 and GCSF could be activated in some transient states but down-regulated by other pathways responding to lTLR activation. Due to the existence of regulatory circuits among pathways, in the presence of lTLR there is an oscillatory behavior of ERK and Gfi1. Therefore, we applied the dynamic multicellular approach described by Wu et al. (2009) to gain a deeper understanding of the HSPC-MSC model upon perturbations. The average activation value of 50,000 simulations for all nodes within the HSPC-MSC network was plotted and is presented in **Figure 3**. The plots represent a qualitative approach for the analysis of the cell population trend under specific conditions. Because the initial activation values are randomly chosen, with the exception of lTLR, TLR\_M, and TLR\_H, whose activation values were set to 0, the average initial activation value for the rest of the nodes is 0.5.
Time-steps 0 to 499 correspond to the stabilization of the dynamics. Of note, the plateau obtained around time-steps 500-699 corresponds to the average of the two fixed point attractors.
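The population-level averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Wu et al. (2009) or the BoolNet workflow used in the study: the six rules are hypothetical, the population is scaled down from 50,000 to 200 simulated cells, and the function name, pulse window and seed are arbitrary choices for the example.

```python
import random

NODES = ["lTLR", "Cx43", "NFkB", "CXCL12", "CXCR4", "VLA4"]

# Hypothetical toy rules for illustration only, NOT the rules of Table 1.
RULES = {
    "lTLR":   lambda s: s["lTLR"],
    "Cx43":   lambda s: s["Cx43"],
    "NFkB":   lambda s: s["lTLR"],
    "CXCL12": lambda s: s["Cx43"] and not s["NFkB"],
    "CXCR4":  lambda s: s["CXCL12"],
    "VLA4":   lambda s: s["CXCR4"],
}

def simulate_population(n_cells=200, n_steps=300, pulse=(150, 200), seed=1):
    """Average node activation over many asynchronously updated cells."""
    rng = random.Random(seed)
    cells = []
    for _ in range(n_cells):
        state = {n: rng.random() < 0.5 for n in NODES}  # random initial values
        state["lTLR"] = False                            # stimulus starts switched off
        cells.append(state)
    averages = {n: [] for n in NODES}
    for t in range(n_steps):
        for state in cells:
            if t == pulse[0]:
                state["lTLR"] = True    # start of the transient stimulation
            if t == pulse[1]:
                state["lTLR"] = False   # end of the transient stimulation
            node = rng.choice(NODES)    # asynchronous scheme: one node per step
            state[node] = bool(RULES[node](state))
        for n in NODES:
            averages[n].append(sum(s[n] for s in cells) / n_cells)
    return averages

averages = simulate_population()
print(averages["VLA4"][149], averages["VLA4"][199], averages["VLA4"][-1])
```

Each list in `averages` is a population trend for one node. With randomly chosen initial values, the starting average of each non-clamped node is expected to lie near 0.5, in line with the reasoning given for **Figure 3**.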

### Analysis of Transitory States Applied to a Multicellular Approach: from Pro-inflammatory Signals to CXCL12 Downregulation

The short lTLR stimulations at time-steps 700 and 1400 (**Figures 3A–C**) induce up-regulation of Gfi1 in HSPC (**Figure 3A**), and of NF-κB and PI3K/Akt in both HSPC and MSC compartments (**Figures 3A,B**). These nodes maintain a sustained activation as long as lTLR is present (**Figures 3D–F**). In contrast, ERK, ROS and FoxO3a show an increase but are regulated by other nodes, which provide feedback that returns them to basal values. Accompanying the cross-regulation of intracellular pathways, a decrease in CXCR4, CXCL12, VLA-4, and VCAM-1 activation is observed. As expected, there is positive signaling of the pro-inflammatory cytokines with a parallel increase of CXCR7, signals that are dampened by PI3K/Akt and by CXCL12 down-regulation, respectively.

### Model Validation by Mutant Analysis

Listed in **Table 2** are the observations from comparisons between the resultant attractors of simulations with null ("loss of function") and constitutive expression ("gain of function") mutants and those of the wild-type model. We focused on the activation value changes in the two axes of interest, CXCR4/CXCL12 and VCAM-1/VLA-4. Even though the nodes included in the reconstruction of the present model are well-studied elements of cell fate-related pathways, there is a lack of experiments correlating their perturbation with microenvironment modifications that impact HSPC behavior (**Table 2**, Table S4). Because of these missing data, and in order to validate the model, we used available information on general alterations in hematopoiesis in the presence of lTLR.
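The comparison of mutant and wild-type attractors can be illustrated by clamping one node to a constant value and recomputing the steady states. The sketch below assumes a hypothetical six-node toy network (not the rules of **Table 1**); the helpers `mutant` and `fixed_points` are illustrative names and do not reproduce the BoolNet analysis used in the study.

```python
from itertools import product

# Hypothetical toy rules for illustration only, NOT the rules of Table 1.
RULES = {
    "lTLR":   lambda s: s["lTLR"],
    "Cx43":   lambda s: s["Cx43"],
    "NFkB":   lambda s: s["lTLR"],
    "CXCL12": lambda s: s["Cx43"] and not s["NFkB"],
    "CXCR4":  lambda s: s["CXCL12"],
    "VLA4":   lambda s: s["CXCR4"],
}

def mutant(rules, node, value):
    """Copy the rules with one node clamped: True = gain, False = loss of function."""
    clamped = dict(rules)
    clamped[node] = lambda s, v=value: v
    return clamped

def fixed_points(rules):
    """Return every state that is left unchanged by all of the rules."""
    nodes = sorted(rules)
    found = []
    for values in product([False, True], repeat=len(nodes)):
        state = dict(zip(nodes, values))
        if all(bool(rules[n](state)) == state[n] for n in nodes):
            found.append(state)
    return found

wild_type = fixed_points(RULES)
gain = fixed_points(mutant(RULES, "NFkB", True))
loss = fixed_points(mutant(RULES, "NFkB", False))
print("wild type, axis on:", sum(fp["VLA4"] for fp in wild_type))
print("NFkB gain of function, axis on:", sum(fp["VLA4"] for fp in gain))
print("NFkB loss of function, axis on:", sum(fp["VLA4"] for fp in loss))
```

In this toy setting, constitutive NFkB switches the CXCL12/CXCR4/VLA4 chain off in every steady state, while the null mutant keeps it on whenever Cx43 is active, even with lTLR present.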

Reports of the MSC ERK, FoxO3a, and PI3K/Akt nodes participating in the regulation of the CXCR4/CXCL12 and VCAM-1/VLA-4 axes were not found in the reviewed literature. β-catenin in MSC has a role in osteoblastogenesis, and its constitutively induced expression in osteoblasts in a mouse model results in acute myeloid leukemia (AML) induction (Kode et al., 2014). The constitutive expression of β-catenin showed an outcome where, under non-induced inflammation, the CXCR4/CXCL12 axis is disrupted. This gives support to our hypothesis that CXCR4/CXCL12 is probably involved in the maintenance of leukemic cells. Furthermore, the dynamic multicellular approach applied to the gain of function of β-catenin in MSC reproduced the recovery of VCAM-1 expression upon stimulation of lTLR, as reported by Kincade in OP9 cells (Figure S1; Malhotra and Kincade, 2009).

GSK3β inhibition in MSC is known to participate in the regulation of osteoblast and adipocyte differentiation. In addition, the experimental effect of a GSK3β inhibitor on osteoblastogenesis has shown that a decrease of this kinase induces downregulation of CXCL12 expression (Satija et al., 2013). The model is consistent with this observation, showing unsteady CXCL12 activation in the simulation of the mutant (Figures S2A,B).

According to our hypothesis, a pro-inflammatory-induced CXCR4/CXCL12 disruption results in support of leukemic progression. In the proposed model, overexpression of NF-κB disrupts the HSPC-MSC communication (Figure S2C). This is in agreement with the reported leukocytosis associated with upregulation of NF-κB within BM MSCs from a mouse model of high-fat diet (Cortez et al., 2013). Finally, modeling of a gain of function mutation in ROS resulted in the blocking of CXCL12 activation (Figure S2D). This is also in accordance with the recent report of oxidative damage induced by iron in MSC, resulting in down-regulation of CXCL12 expression and reduction of their hematopoietic supporting function (Zhang et al., 2015). Moreover, the iron-induced hematopoietic alterations previously observed by other groups are attenuated by treatment with ROS inhibitors (Lu et al., 2013).

Nodes in HSPC which have been experimentally reported as dispensable for hematopoiesis, which did not show any alterations in the CXCR4/CXCL12 and VLA-4/VCAM-1 axes on the mutant simulations, are β-catenin (Figures S3A–D; Cobas et al., 2004; Jeannet et al., 2008) and CXCR7 (Figures S3E,F).

However, even though the in vivo β-catenin null mutant HSPC does not lose long-term reconstitution capacity or multipotentiality, its overexpression produces loss of stemness and a block in differentiation to erythroid and lymphoid lineages (Kirstetter et al., 2006; Scheller et al., 2006). Simulations of the gain of function of β-catenin resulted in the appearance of additional attractors where FoxO3a and GSK3β are increased (Figures S4A,B, S5B), suggesting a reduction in proliferation and/or apoptosis induction (Maurer et al., 2006; Yamazaki et al., 2006). In turn, the simulation of overexpression of FoxO3a showed a downregulation of ERK and PI3K (Figures S4C, S5C). Gfi1 and GSK3β, also reported as proliferative repressors in HSPC (Hock et al., 2004; Zeng et al., 2004; Holmes et al., 2008), inhibited ERK activation in their overexpression mutants; additionally, Gfi1 induced the downregulation of the PI3K/Akt node and of the CXCR4/CXCL12 and VLA-4/VCAM-1 axes (Figures S4 and S5). In disagreement with experimental data (Holmes et al., 2008), the GSK3β null mutant resulted in an additional attractor where PI3K/Akt and ERK are inactive, notwithstanding CXCR4 and VLA-4 activation (Figure S6).

Of interest, constitutive expression of NF-κB (**Figure 4**) and ROS (Figures S4F, S5F) in HSPC induces additional attractors with activation of IL-1 and G-CSF, and inhibition of the axes regulating HSPC-MSC contact. A number of investigations on cancer cells report a correlation between increased NF-κB levels and CXCR4 (Richmond, 2002; Ayala et al., 2009; Shin et al., 2014). Nonetheless, a recent study in human leukemic cell lines has shown that LPS treatment increases the activity of MMP-9, a metalloproteinase known to efficiently degrade CXCR4 and CXCL12 (Hajighasemi and Gheini, 2015).

### NF-κB Gain of Function Mutant as ALL Simplified Model

How common alterations in ALL cells may induce BM microenvironment remodeling, regardless of the underlying genetic aberration, was investigated by running a dynamic multicellular simulation using the mutant network for NF-κB gain of function within the HSPC sub-system. The results shown in **Figure 4** confirm that NF-κB mutation in HSPC may perturb HSPC-MSC communication in parallel with the induction of other alterations previously reported in ALL cells, such as the increase of Gfi1 expression (Purizaca et al., 2013) and a pro-inflammatory milieu (Vilchis-Ordoñez et al., 2015). IL1 and G-CSF activation by HSPC upregulates ERK, NF-κB and PI3K/Akt in MSC. As a consequence of the PI3K/Akt increase in MSC, β-catenin is up-regulated through the inhibition of GSK3β. Strikingly, sustained activation of CXCR7 arose as a consequence of NF-κB constitutive expression in HSPC and residual CXCL12 expression from MSC. The CXCR7/CXCL12 axis was recently reported to be increased in ALL cells, and a possible participation in abnormal cell migration was suggested (Melo et al., 2014).

#### TABLE 2 | Results from the model outcome for single node mutations.

### DISCUSSION

According to the classical model of hematopoiesis, normal blood cells are replenished throughout life by stem and early progenitor populations undergoing stepwise differentiation processes in the context of intersinusoidal specialized niches (Purizaca et al., 2012; Vadillo et al., 2013). Cell cycle status, self-renewing capability and the central cell fate decisions depend, in great part, on the microanatomic organization and signals from the BM environment. Endosteal, perivascular and reticular niches provide support by cell-cell interactions and growth/differentiation factors that control the expression of lineage-specific transcription factors, among other elements. Within the reticular niche, mainly composed by CXCL12 abundant reticular cells (CARs), a special category of MSCs, the chemokine CXCL12 and its receptor CXCR4 play a pivotal role in the regulation of lymphopoiesis from the earliest stages of the pathway (Tokoyoda et al., 2004; Nagasawa, 2015). The transcription factor Foxc1 governs CXCL12 and stem cell factor expression, allowing the CAR niche formation for maintenance of HSC, common lymphoid progenitors, B cells, NK and plasmacytoid dendritic cells (Omatsu et al., 2014). The net balance of its disruption is instability of adaptive and innate immune cell production. Recent findings suggest that elevation of cytokines and growth factors, including G-CSF and TNFα, due to infectious stress, substantially reduce the expression of CXCL12, SCF and VCAM-1, further impairing primitive cell maintenance and prompting their proliferation and migration (Kobayashi et al., 2015, 2016).

Much remains to be unraveled about CXCL12-related mechanisms of intercommunication damage that may favor growth of cancer cells at the expense of healthy hematopoiesis during biological contingencies such as hematological malignancies and biological stress. Although genetic heterogeneity may be co-responsible for differences in ALL overall survival, response to treatment, differentiation-stage arrest or even predisposition to metastasis, a common need might be the development of biological features that give pre-malignant cells a decisive advantage over normal cells in competing for the same ecological niche. Given the importance of the CXCR4/CXCL12 axis for homeostatic hematopoiesis and of its presumptive disruption in ALL BM, we propose a Boolean model reconstructed with some of the most studied elements upstream and downstream of this key communication axis. Our model is able to simulate several phenotypes relevant to ALL. In line with previous experimental research, the major assumption of this model is that the integrity of CXCR4/CXCL12 signaling, which promotes the required activation of the VLA-4/VCAM-1 integrin interaction, is absolutely necessary for HSPC retention in the mesenchymal niche and, in consequence, indispensable for optimal regulation of hematopoiesis (Lévesque et al., 2003; Lua et al., 2012; Greenbaum et al., 2013; Park et al., 2013). The asynchronous simulation of the HSPC-MSC model in the absence of lTLR returned two attractors, corresponding to HSPC attachment to and detachment from MSC. The 'attachment' status, represented by the induction of the CXCR4/CXCL12 and/or VLA-4/VCAM-1 axes, also exhibited PI3K/Akt and β-catenin activation within the HSPC compartment. Although there is some controversy about the role of β-catenin in HSC regulation (Kirstetter et al., 2006; Duinhouwer et al., 2015), the co-activation of PI3K/Akt and β-catenin is known to promote self-renewal and HSC expansion (Perry et al., 2011).
Two core pathways downstream of CXCR4/CXCL12 binding are PI3K/Akt and ERK, both promoters of cell survival and regulators of proliferation. Considering that the mesenchymal stromal niche has been identified as the interface between the quiescence-promoting osteoblastic niche and the vascular niche regulating final lineage commitment and cell migration, the signals provided by mesenchymal cells should tightly regulate proliferation/expansion in order to further allow differentiation. In line with this, in the absence of aberrant NF-κB expression, the attractor representing the detached state leads to pro-apoptotic signaling that relies on the cytochrome C release-associated normal functions of GSK3β in HSPC (Maurer et al., 2006).

Using elegant mouse disease models and controlled culture systems, a wealth of studies has recently highlighted the co-participation of inflammation and infectious stress in the exit of HSPC from quiescence, as well as in cancer etiology and progression (Baldridge et al., 2011; Vilchis-Ordoñez et al., 2015). Chronic inflammation and carcinogenesis have been closely connected via either an oncogene-derived intrinsic pathway or an extrinsic pathway in which external factors promote latent inflammatory responses involving signaling pathways such as MyD88, NF-κB, and STAT3 (Mantovani et al., 2008; Krawczyk et al., 2014).

Interestingly, pattern recognition receptors (PRRs), including Toll-like receptors (TLRs) are functionally expressed from the most primitive stages of hematopoiesis and contribute to emergent cell replenishment in response to life-threatening infections or disease-associated cell damage (Nagai et al., 2006; Welner et al., 2008; Dorantes-Acosta et al., 2013; Vadillo et al., 2014). This phenomenon is called emergency hematopoiesis and is regulated at the most primitive cell level (Kobayashi et al., 2015, 2016).

The potential relevance of this mechanism in leukemogenesis was the focus of this investigation, and our model allowed for the analysis of most behaviors observed under experimental settings. The discrete simulation of the NF-κB constitutive expression mutant in HSPC gave further support to our hypothesis that a pro-inflammatory microenvironment perturbs the CXCR4/CXCL12 communication axis. The single mutation of NF-κB was sufficient to remodel the dynamical behavior of the three sub-systems represented, which was an unexpected behavior of the model. The dynamic analysis of the ALL-like network also suggested the activation of an alternative communication pathway mediated by CXCR7 binding CXCL12. Inhibition of CXCL12 within the mesenchymal niche may be fundamental for cell migration to adjacent BM structures unable to sustain proper differentiation, or even to extramedullary tissues, accounting for a predictable role of this axis in metastasis.

### CONCLUDING REMARKS

The proposed HSPC-MSC model is the first systemic approximation to understand the intercommunication pathways underlying primitive cell retention/proliferation in the mesenchymal niche as a determinant factor for progression of hematological hyperproliferative diseases. We applied conventional discrete dynamical modeling and non-conventional population-like approaches as an average behavior of the network model. Future improvement of discrete dynamical modeling for ALL system will provide a powerful tool for investigation of unbalanced competitions between leukemic and normal hematopoietic cells within the BM. Overall, systems biology will advance our comprehensive view of the mechanisms involved in the pathogenesis of leukemic niches that may illuminate therapeutic strategies based on cell-to-cell crosstalk manipulation.

## AUTHOR CONTRIBUTIONS

JE designed the work; generated, analyzed and interpreted data; wrote the paper. HM interpreted data; revised the work for intellectual content; wrote the paper. LM designed the work; interpreted data; revised the work for intellectual content; wrote the paper. RP designed the work; interpreted data; revised the work for intellectual content; wrote the paper.

### ACKNOWLEDGMENTS

This work was supported by the National Council of Science and Technology (CONACyT) (Grant CB-2010-01-152695 to RP), by the Mexican Institute for Social Security (IMSS) (Grant FIS/IMSS/PROT/G14/1289 to RP) and by the "Red Temática de Células Troncales y Medicina Regenerativa" from CONACyT. LM acknowledges the sabbatical scholarships from PASPA-DAPA UNAM and CONACyT 251420. JE is scholarship holder from CONACyT and IMSS, and was awarded by the PRODESI IMSS Program.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphys.2016.00349

## REFERENCES


expression in the bone marrow. Blood 106, 3020–3027. doi: 10.1182/blood-2004-01-0272


Toll-like receptor(TLR)-4 and PI3K/Akt. Cell Biol. Int. 33, 665–674. doi: 10.1016/j.cellbi.2009.03.006


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Enciso, Mayani, Mendoza and Pelayo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Metabolomics of Head and Neck Cancer: A Mini-Review

Jae M. Shin<sup>1,2</sup>, Pachiyappan Kamarajan<sup>3,4</sup>, J. Christopher Fenno<sup>1</sup>, Alexander H. Rickard<sup>2</sup> and Yvonne L. Kapila<sup>3,4</sup>\*

*<sup>1</sup> Department of Biologic and Materials Sciences, University of Michigan School of Dentistry, Ann Arbor, MI, USA, <sup>2</sup> Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI, USA, <sup>3</sup> Department of Periodontics and Oral Medicine, University of Michigan School of Dentistry, Ann Arbor, MI, USA, <sup>4</sup> Division of Periodontology, Department of Orofacial Sciences, University of California San Francisco, San Francisco, CA, USA*

Metabolomics is used in systems biology to enhance the understanding of complex disease processes, such as cancer. Head and neck cancer (HNC) is an epithelial malignancy that arises in the upper aerodigestive tract and affects more than half a million people worldwide each year. Recently, significant effort has focused on integrating multiple "omics" technologies for oncological research. In particular, research has been focused on identifying tumor-specific metabolite profiles using different sample types (biological fluids, cells and tissues) and a variety of metabolomic platforms and technologies. With our current understanding of molecular abnormalities of HNC, the addition of metabolomic studies will enhance our knowledge of the pathogenesis of this disease and potentially aid in the development of novel strategies to prevent and treat HNC. In this review, we summarize the proposed hypotheses and conclusions from publications that reported findings on the metabolomics of HNC. In addition, we address the potential influence of host-microbe metabolomics in cancer. From a systems biology perspective, the integrative use of genomics, transcriptomics and proteomics will be extremely important for future translational metabolomic-based research discoveries.

#### Edited by:

*Osbaldo Resendis-Antonio, National Autonomous University of Mexico, Mexico*

#### Reviewed by:

*Sudipto Saha, Bose Institute, India; Nikolaos Psychogios, Massachusetts General Hospital, USA*

> \*Correspondence: *Yvonne L. Kapila yvonne.kapila@ucsf.edu*

#### Specialty section:

*This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology*

Received: *04 August 2016* Accepted: *24 October 2016* Published: *08 November 2016*

#### Citation:

*Shin JM, Kamarajan P, Fenno JC, Rickard AH and Kapila YL (2016) Metabolomics of Head and Neck Cancer: A Mini-Review. Front. Physiol. 7:526. doi: 10.3389/fphys.2016.00526*

**Keywords:** head and neck cancer, oral cancer, squamous cell carcinoma, metabolomics, microbiome

## INTRODUCTION

The incidence of head and neck cancer (HNC) exceeds half a million cases annually worldwide and accounts for approximately 3% of adult malignancies (Johnson et al., 2011; National Cancer Institute, 2013). HNC is defined as epithelial malignancies that arise in the aerodigestive tract (paranasal sinuses, nasal and oral cavity, pharynx and larynx) and can metastasize to different locations (Rezende et al., 2010). About 75% of HNCs are oral cancers and 90% of oral cancers are diagnosed as oral squamous cell carcinomas (OSCC) (Rezende et al., 2010; National Cancer Institute, 2013). Despite therapeutic and technological advances, the prognosis for HNC has not improved in decades due to its malignant and recurrent properties (Forastiere et al., 2001; Mao et al., 2004). The most widely accepted risk factors for HNC include tobacco (smoked or chewed), alcohol use, and human papillomavirus (HPV) infection (Gillison, 2004; Schmidt et al., 2004). However, these risk factors alone cannot explain the observed incidence and pathogenesis of HNC, since some patients are not in these risk categories. Thus, it is likely that other unknown factors play important roles in tumorigenesis, tumor progression and metastasis of HNC.

There has been an increasing trend to incorporate "omics" technology, including metabolomics, into oncological research (Vucic et al., 2012; Cho, 2013; Armitage and Barbas, 2014; Yu and Snyder, 2016). Investigators have explored different technologies and analytical methods to better understand the metabolomic properties of cancers, including HNC (Bathen et al., 2010; Blekherman et al., 2011; Beger, 2013; Liesenfeld et al., 2013; Olivares et al., 2015). As more independent reports on metabolomics of HNC are being published, a comprehensive meta-analysis of these large "omics" data sets will be of potential value in the near future to enhance translational studies. Specifically, metabolomic studies can help to potentially identify clinically relevant biomarkers that may be useful in early detection of cancer, to enhance the accuracy of diagnosis and prognosis, and to aid in the development of new drug targets to help improve therapeutic outcomes (Olivares et al., 2015; Yu and Snyder, 2016).

The objective of this mini-review is to summarize and discuss the published studies on HNC metabolomics. We will discuss the different technological tools utilized in metabolomics, and focus on the findings from studies that used different types of patient samples (i.e., saliva, serum, blood, urine, tissues). In addition to the host-metabolomic profiles, we discuss the potential relationship and influence of the microbial metabolome in cancers. By coupling metabolomics data with other omics data, we can achieve a greater understanding of complex cancer processes and derive new information that may help to better target aggressive and malignant cancer types, such as HNC.

### Biological Samples Used for Head and Neck Cancer Metabolomics

A broad array of biological fluids, such as saliva, blood and urine have been used in metabolomic-based studies (Nagana Gowda et al., 2008; Psychogios et al., 2011; Bouatra et al., 2013; Dame et al., 2015). These biofluids contain hundreds to thousands of detectable metabolites that can be obtained non- or minimally invasively (Beger, 2013). In addition, cell and tissue extracts can be a source of samples for metabolomic-based studies (Beger, 2013). With current diagnostic procedures requiring a tissue biopsy, a portion of the tissue samples can be harvested for further metabolomic analyses. The following discussion will focus on the findings, postulated hypotheses, and conclusions from the published metabolomic studies that used different biofluids and cell/tissue extracts to study HNC metabolomics.

#### Saliva Metabolomics

Saliva is an important biological fluid required for multiple functions that maintain adequate oral health, including speech, taste, digestion of foods, and antiviral and antibacterial protection (Loo et al., 2010; Spielmann and Wong, 2011). Saliva is readily available, and the collection process is simple and non-invasive. Thus, saliva has been a popular medium for "omics"-based research studies (Zhang et al., 2012; Cuevas-Córdoba and Santiago-García, 2014). Two types of saliva can be used for metabolomics studies: stimulated and unstimulated whole saliva. These two saliva types vary in their chemical composition, so it is important to identify the specific type of saliva that was used for a given study (Humphrey and Williamson, 2001; Carpenter, 2013; Cuevas-Córdoba and Santiago-García, 2014).

Amongst the different HNC types, OSCC is associated with a high morbidity rate and a poor 5-year survival rate of less than 50% (Epstein et al., 2002; Mao et al., 2004). To improve the prognosis for HNC, investigators have proposed using saliva metabolites to differentiate between precancerous and malignant lesions. Using hierarchical principal component analysis (PCA) and discriminant analysis algorithms, Yan and colleagues were able to distinguish between OSCC and its precancerous lesions, oral lichen planus (OLP) and oral leukoplakia (OLK) (Yan et al., 2008; **Table 1**). Although the OLP and OLK groups were not as well separated in the PCA plot, the OSCC group showed a clear separation from the healthy and precancerous groups (Yan et al., 2008). In addition, Wei and others used ultra-performance liquid chromatography coupled with quadrupole/time-of-flight mass spectrometry (UPLC-QTOFMS) to identify a signature panel of salivary metabolites that could distinguish OSCC from healthy controls (Wei et al., 2011; **Table 1**). Wei and colleagues selected a panel of five salivary metabolites: γ-aminobutyric acid, phenylalanine, valine, n-eicosanoic acid and lactic acid. This combination of metabolites accurately distinguished OSCC from the control samples, suggesting that metabolomic approaches could complement the clinical detection of OSCC for improved diagnosis and prognosis (Wei et al., 2011).

Almadori and colleagues found that salivary glutathione (an antioxidant), but not uric acid (also an antioxidant), was significantly increased in patients with oral and pharyngeal SCC compared to healthy controls (Almadori et al., 2007; **Table 1**). However, although there were significant alterations in glutathione levels, potentially due to the metabolism of malignant cells, the concentrations were too inconsistent to suggest glutathione as a definitive SCC diagnostic marker (Almadori et al., 2007). Furthermore, Sugimoto and colleagues identified 28 metabolites that correctly differentiated oral cancers from control samples in their study (Sugimoto et al., 2010). Among these differentially expressed metabolites, salivary polyamine levels were markedly higher in oral cancer samples compared to other cancer samples (breast and pancreatic) and controls (Sugimoto et al., 2010). Polyamines are small molecules derived from amino acids that are essential for many biological functions (Dimery et al., 1987; Pegg, 2009). Increased polyamine levels have been associated with increased cell proliferation, decreased apoptosis and elevated expression of genes affecting tumor invasion and metastasis (Gerner and Meyskens, 2004). Thus, it is hypothesized that polyamine homeostasis is important for the regulation of cancer-related functions, such as cell proliferation and apoptosis.

Based on published studies that analyzed the salivary metabolome of HNC, there is a general consensus that unique metabolites specific to HNC exist. However, due to differences in detection and analytical methods, the current data still lacks coherency, and a common HNC metabolomic signature has yet to be identified.

**Abbreviations:** Ala, (alanine); Asp, (aspartate); Bet, (betaine); Cit, (citrate); Cr, (creatinine); Cho, (choline); Glu, (glutamate); Gluc, (glucose); Gln, (glutamine); Glut, (glutathione); Gly, (glycine); GPC, (glycerophosphocholine); His, (histidine); Ile, (isoleucine); Lac, (lactate); Leu, (leucine); Lys, (lysine); PCho, (phosphocholine); Phe, (phenylalanine); Pro, (proline); Pyr, (pyruvate); Tau, (taurine); Thr, (threonine); Tyr, (tyrosine); Val, (valine).

#### TABLE 1 | Continued

*Cap IC-MS, Capillary anion exchange ion chromatography-mass spectrometry; CE-TOF/MS, Capillary electrophoresis-time-of-flight mass spectrometry; GC/MS, Gas chromatography/mass spectrometry; <sup>1</sup>H-NMR, Proton nuclear magnetic resonance; HR-MAS, High resolution magic angle spinning; <sup>1</sup>H-MRS, Proton magnetic resonance spectroscopy; HPLC, High performance liquid chromatography; LC/GC, Liquid chromatography/gas chromatography; NMR, Nuclear magnetic resonance; UPLC-QTOFMS, Ultra-performance liquid chromatography coupled with quadrupole/time-of-flight spectrometry; LN-Met, lymph node metastasis.*

#### Blood and Urine Metabolomics

In addition to saliva, blood and urine are commonly used for metabolomic-based studies (Psychogios et al., 2011; Bouatra et al., 2013). Blood consists of a cellular portion, containing red and white blood cells and platelets, and a liquid portion: plasma is the cell-free liquid obtained from blood treated with an anticoagulant, whereas serum is the protein-rich liquid that remains after blood coagulation. Both plasma and serum contain a wide variety of metabolites, and current studies suggest that plasma and serum are similar in terms of metabolite content within the aqueous phase (Psychogios et al., 2011). Importantly, numerous studies have demonstrated that an altered chemical and protein metabolic composition can be detected in blood samples obtained from subjects with pathology or diseases, such as cancer (Psychogios et al., 2011; DeBerardinis and Thompson, 2012). Tiziani and colleagues reported that OSCC patients exhibited abnormal metabolic activity in blood serum, wherein altered activity related to lipolysis, the TCA cycle and amino acid catabolism was detected (Tiziani et al., 2009; **Table 1**). For example, there was an increased level of ketone bodies present in OSCC samples, suggesting that increased lipolysis was a backup mechanism for energy production (Tiziani et al., 2009). Furthermore, a common signature of many cancers is the "Warburg effect": a high rate of glycolysis followed by lactic acid fermentation in the cytosol, rather than a comparatively low rate of glycolysis followed by oxidation of pyruvate in the mitochondria. Similarly in HNC, Tiziani and colleagues demonstrated that OSCC tumors relied heavily on glycolysis as a main energy source (Warburg, 1956; Tiziani et al., 2009).

Yonezawa and others identified several metabolites that were altered in serum and tissue samples of HNSCC patients who experienced relapse (Yonezawa et al., 2013). The four metabolites that were significantly altered were glucose, methionine, ribulose, and ketoisoleucine (Yonezawa et al., 2013). Interestingly, when the authors compared the metabolomic profiles of the OSCC serum and tissue samples, an inverse relationship was observed in the differentially expressed metabolites (Yonezawa et al., 2013; **Table 1**). Metabolites associated with glycolytic pathways (e.g., glucose) were lower in the tissues, whereas amino acids (e.g., valine, tyrosine, serine, and methionine) were present at higher levels in the tissues than in the serum (Yonezawa et al., 2013). In addition, the serum metabolomic profiles differed between patients with and without HNSCC relapse (Yonezawa et al., 2013). Several other studies further support that serum and plasma samples from HNC subjects possess distinct metabolomic profiles. For example, elevated levels of choline-containing compounds were detected in OSCC samples in numerous studies (Maheshwari et al., 2000; El-Sayed et al., 2002; Bezabeh et al., 2005; Tiziani et al., 2009; Zhou et al., 2009). Choline is an important constituent of phospholipid metabolism in cellular membranes and is considered a biomarker for cancer cell proliferation, survival and malignancy (Ackerstaff et al., 2003; Glunde et al., 2006, 2011). Through our comprehensive analysis, choline was identified as one of the metabolites that was consistently overexpressed in HNC samples regardless of sample type (**Figure 1B**). Studies have suggested a link between cancer feedback cell signaling and choline metabolism (Aboagye and Bhujwalla, 1999; Ackerstaff et al., 2003; Janardhan et al., 2006; Glunde et al., 2011; Ridgway, 2013).
Thus, an abnormal choline metabolism in cancer has gained much attention and is regarded as a metabolic hallmark for tumor development and progression (Glunde et al., 2011).

**Figure 1** (caption fragment): [...] the altered cancer metabolism. (B,C) Venn diagrams showing (B) overlap of differentially expressed metabolites identified in HNC in saliva, blood and urine, and cells and tissues, and (C) overlap of differentially expressed metabolites in HNC identified by different detection methods such as HPLC/GC/MS, NMR/MAS, MRS and other. Metabolites were selected and compiled from studies in Table 1. Red, detected in increased levels; Blue, detected in decreased levels; Green, detected in increased and decreased levels.

The use of urine samples in HNC metabolomic studies is less common than the use of the other biofluids mentioned above. However, urine is widely used by metabolomic researchers for other conditions or diseases due to its ease of collection and the wide coverage of metabolites that is possible with urine samples (Bouatra et al., 2013). Thus far, only a single study has been reported on HNC metabolomics using urine. From patient urine samples, Xie and colleagues identified a panel of differentially expressed metabolites and demonstrated their utility by logistic regression (LR) modeling (Xie et al., 2012; **Table 1**). When two metabolites, valine and 6-hydroxynicotic acid, were entered together in the LR prediction model, the authors were able to identify OSCC with 98.9% accuracy and greater than 90% sensitivity, specificity and positive predictive value (Xie et al., 2012). However, similar to saliva and blood metabolomics, the use of urine samples for HNC metabolomics will require further validation through more independent studies.
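The four figures of merit quoted for the LR model (accuracy, sensitivity, specificity and positive predictive value) all derive from the same confusion matrix. The following is a minimal sketch of how they relate; the function name and the patient counts are hypothetical illustrations and are not taken from Xie et al. (2012).

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: cancer cases classified as cancer / as control
    tn/fp: controls classified as control / as cancer
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,       # all correct calls
        "sensitivity": tp / (tp + fn),       # fraction of cases detected
        "specificity": tn / (tn + fp),       # fraction of controls cleared
        "ppv": tp / (tp + fp),               # positive predictive value
    }


# Hypothetical counts for illustration only (not data from the cited study).
metrics = diagnostic_metrics(tp=45, fp=3, tn=47, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that positive predictive value, unlike sensitivity and specificity, depends on the proportion of cases in the cohort, which is one reason validation in larger independent populations matters.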

### Cell and Tissue Metabolomics

The current gold standard for diagnosis of HNC is a scalpel-obtained biopsy and subsequent histopathological interpretation. However, the current procedure is subjective and does not capture the full heterogeneity of neoplastic processes, as it is difficult to distinguish precancerous from cancerous and malignant lesions (Rezende et al., 2010; Yu and Snyder, 2016). Early studies with magnetic resonance spectroscopy (MRS) using patient tissue samples demonstrated a higher choline to creatine ratio in HNC samples compared to healthy controls (Mukherji et al., 1997; El-Sayed et al., 2002; **Table 1**). In addition, Mukherji and colleagues reported that elevated levels of amino acids, such as alanine, glutathione, histidine, isoleucine, valine, lysine, and polyamines were more likely found in tumors compared to controls, and similar metabolites, such as glutathione and polyamines, were also elevated in saliva associated with HNC (Mukherji et al., 1997; Almadori et al., 2007; Sugimoto et al., 2010). Srivastava and others used proton high-resolution magic angle spinning magnetic resonance (HR-MAS MR) spectroscopy to identify the metabolic perturbations of OSCC tumors compared to healthy controls. The data revealed higher levels of lactate, phosphocholine, choline and amino acids, and decreased levels of PUFA and creatine in OSCC samples compared to nonmalignant samples (Srivastava et al., 2011). As previously mentioned, higher levels of detected choline in HNC tissues may indicate increased cancer cell proliferation and membrane biosynthesis, as a result of reciprocal interactions between oncogenic signaling and choline metabolism (Glunde et al., 2011). The reduced level of creatine could also be an indication of increased energy metabolism in tumors (Mukherji et al., 1997; El-Sayed et al., 2002).

Somashekar and colleagues reported that tumorous tissues biopsied from different anatomical locations (tongue, lip, oral cavity, and larynx) displayed metabolomic profiles similar to one another, suggesting that HNSCC tissues share similar metabolic activity during malignant transformation (Somashekar et al., 2011; **Table 1**). Primary and metastatic HNSCC tissues both showed increased/altered levels of branched chain amino acids, lactate, alanine, glutamine, glutamate, glutathione, aspartate, creatine, taurine, phenylalanine, tyrosine and choline compounds, with decreased levels of triglycerides (Somashekar et al., 2011; **Table 1**). In addition, Tripathi and others demonstrated that the cell extracts of HNSCC displayed metabolic phenotypes comparable to those observed in the HNSCC tissues (Tripathi et al., 2012; **Table 1**). Thus, based on published reports, the metabolites associated with malignant transformation of HNC are associated with multiple dysregulated metabolic pathways, including glycolysis, glutaminolysis, oxidative phosphorylation, energy metabolism, TCA cycle, osmo-regulatory and anti-oxidant mechanisms (**Figure 1**; Somashekar et al., 2011; Tripathi et al., 2012; Wang et al., 2014).

#### Influence of Microbial Metabolomics

The human body is host to taxonomically diverse multispecies microbial communities. In particular, the oral cavity and the gut are home to hundreds of transient and resident microbial species (Eckburg et al., 2005; Dewhirst et al., 2010). Several publications suggest that the microbiota that colonize the human body (particularly the oral cavity and gut) contribute to the etiology of different types of cancers because of their ability to alter the community composition and induce inflammatory reactions, DNA damage and apoptosis, and an altered metabolism (Meurman, 2010; Chen et al., 2012; Farrell et al., 2012; Louis et al., 2014). Thus, when considering cancer-associated metabolomics, the influence of the microbiota and its repertoire of metabolites should also be considered, since the microbiota are profoundly abundant in the human body and cancerous tissues.

Colorectal cancer (CRC), like HNC, is associated with risk factors that include diet and lifestyle (Gingras and Béliveau, 2011). Specific bacterial genera, like Fusobacterium, are found in greater abundance in patients diagnosed with CRC, colorectal adenomas, pancreatic cancer and HNC (Castellarin et al., 2012; Farrell et al., 2012; Kostic et al., 2012; McCoy et al., 2013). Accumulated data suggest that diverse polymicrobial communities can produce a wide range of metabolites by metabolic fermentation (Tang, 2011). For instance, gut microorganisms can secrete a variety of metabolites that may play a role in the etiology and prevention of complex diseases (Heinken and Thiele, 2015). These microbial metabolites can directly regulate and modulate the host-tumor cell metabolism (**Figure 1A**); bacteria isolated from the gut can produce metabolites that are protective or detrimental to the host tissues and cells. For example, short-chain fatty acids (SCFAs) like butyrate, acetate, and propionate function in the suppression of inflammation and cancer, whereas other metabolites, such as polyamines, are toxic and cancer-promoting at high levels (Louis et al., 2014). Alterations in microbial diversity and function due to known risk factors for HNC (alcohol and tobacco use) and unknown factors could actively contribute to HNC tumorigenesis (Schwabe and Jobin, 2013; **Figure 1A**).

### CONCLUDING REMARKS

The complement of "omics" based approaches could significantly enhance our understanding of the complex processes of HNC tumorigenesis. Although it is extremely complex, progress has been made in integrating two or more omics data sets to study cancer (Cho, 2013). For example, studies have examined the molecular differences between HPV+ and HPV− HNCs by comparing the differences in their genomic, transcriptomic, and proteomic profiles (Sepiashvili et al., 2015). Since Otto Warburg's first hypothesis of the altered metabolism of cancer cells, the field of cancer metabolomics has rapidly expanded and revealed intriguing new data regarding metabolic pathways associated with cancers (Warburg, 1956). With fast-moving advancements in technology and bioinformatics, the quality of data output and the ability to detect small molecular metabolites have significantly improved. Thus, investigators will likely soon be able to transition from untargeted global metabolomic approaches to more focused targeted and mechanistic-based metabolomic studies. In addition, with the availability of growing public databanks, investigators can now search for specific omics variations that characterize different types of cancers and phenotypes of a cancer (Cho, 2013).

From the clinical perspective, understanding the metabolic pathways associated with life-threatening conditions, such as cancer, could be extremely valuable in decreasing the burden of disease. With saliva-based DNA screening tests already available for chair-side use in dentistry for HNC, we can envision a saliva-based screening or diagnostic test that incorporates omics, replaces the surgical biopsy and provides a more individualized and robust patient health, disease, or risk profile. Here, we discussed the metabolomics of both the host (normal and cancerous conditions) and co-existing microbiota (**Figure 1A**). In addition, we organized the differentially expressed metabolites from previous publications by sample types (saliva, blood and urine, cells and tissues) and detection methods (**Figures 1B,C**). The full integration and routine inclusion of metabolomics in the clinic has yet to be implemented; however, continued research and translational efforts will reinforce the promise of this evolving technology and science. Studies to date have been conducted with relatively small patient sample sizes, with different sample types and detection methods. In the future, it will be critical to follow up with larger, more comprehensive population studies to confirm the validity of the current findings. In addition, sharing detailed sample collection and analytical methods between investigators will be essential to conduct sound HNC metabolomics research. From the systems biology perspective, the integration of other omics data with metabolomics data will be required for a greater understanding of cancer biology.

### AUTHOR CONTRIBUTIONS

JS wrote the manuscript, put the figure and table together, and edited the manuscript. PK, JF, and AR edited the manuscript and the figure and table. JF, AR, and YK conceived the topic for the mini review, assisted with the manuscript writing, assisted with the figure and table, and edited the manuscript.

#### ACKNOWLEDGMENTS

Special thanks to So Young Han for the graphic illustration of the tumor microenvironment depicted in this manuscript. This work was supported by an NIH grant (R56 DE023333; Biomarkers of Aggressive Oral Cancer; awarded to PK, YK).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Shin, Kamarajan, Fenno, Rickard and Kapila. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Host-Microbiome Interaction and Cancer: Potential Application in Precision Medicine

Alejandra V. Contreras<sup>1†</sup>, Benjamin Cocom-Chan<sup>1,2†</sup>, Georgina Hernandez-Montes<sup>3</sup>, Tobias Portillo-Bobadilla<sup>3</sup> and Osbaldo Resendis-Antonio<sup>1,2,3</sup>\*

<sup>1</sup> Instituto Nacional de Medicina Genómica, Mexico City, Mexico, <sup>2</sup> Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica, Mexico City, Mexico, <sup>3</sup> Coordinación de la Investigación Científica, Red de Apoyo a la Investigación-National Autonomous University of Mexico (UNAM), Mexico City, Mexico

It has been experimentally shown that host-microbial interaction plays a major role in shaping the wellness or disease of the human body. Microorganisms coexisting in human tissues provide a variety of benefits that contribute to proper functional activity in the host through the modulation of fundamental processes such as signal transduction, immunity and metabolism. An imbalance of this microbial profile, or dysbiosis, has been correlated with the genesis and evolution of complex diseases such as cancer. Although cancer has been thoroughly studied using different high-throughput (HT) technologies, its heterogeneous nature makes understanding it and treating patients properly a remaining challenge in clinical settings. Notably, given the outstanding role of host-microbiome interactions, the ecological interactions with microorganisms have become a significant new aspect of the systems that can contribute to the diagnosis and potential treatment of solid cancers. As a part of expanding precision medicine in the area of cancer research, efforts aimed at effective treatments for various kinds of cancer based on the knowledge of genetics, biology of the disease and host-microbiome interactions might improve the prediction of disease risk and enable potential microbiota-directed therapeutics. In this review, we present the state of the art of sequencing and metabolome technologies, computational methods and schemes in systems biology that have driven recent breakthroughs in uncovering relationships or associations between microorganisms and cancer. Together, microbiome studies extend the horizon of new personalized treatments against cancer from the perspective of precision medicine through a synergistic strategy integrating clinical knowledge, HT data, bioinformatics, and systems biology.

Keywords: microbiome, cancer metabolism, systems integration, metabolome, next generation sequencing (NGS), precision medicine

### INTRODUCTION

**Edited by:** Alessio Mengoni, University of Florence, Italy

**Reviewed by:** Yang Dai, University of Illinois at Chicago, USA; Amedeo Amedei, University of Florence, Italy

**\*Correspondence:** Osbaldo Resendis-Antonio, resendis@ccg.unam.mx; oresendis@inmegen.gob.mx

† These authors have contributed equally to this work.

**Specialty section:** This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 08 August 2016 Accepted: 21 November 2016 Published: 09 December 2016

**Citation:** Contreras AV, Cocom-Chan B, Hernandez-Montes G, Portillo-Bobadilla T and Resendis-Antonio O (2016) Host-Microbiome Interaction and Cancer: Potential Application in Precision Medicine. Front. Physiol. 7:606. doi: 10.3389/fphys.2016.00606

Our body hosts a legion of microorganisms that coexist in all our tissues and, notably, serve a symbiotic functional purpose. Furthermore, host-microbial interactions are beginning to be recognized for their outstanding influence on well-being or the emergence of diseases such as cancer. The advent of high-throughput (HT) technologies has allowed significant advancements in uncovering these correlations through the diversity and abundance of microorganisms in samples of normal and dysfunctional cohorts of human tissues associated with complex diseases such as obesity, type 2 diabetes, and cancer. For instance, in 2015, Mitra and co-workers reported the characterization of the microbiota at different stages of development of cervical intraepithelial neoplasia, and they observed a strong association between the severity of the disease and the vaginal microbiota diversity (Mitra et al., 2015). Furthermore, the association of the microbiota and obesity has also been explored, with observations of changes in the balance and relative abundances of Bacteroidetes and Firmicutes (Ley et al., 2006). Overall, these and other studies provide a glimpse of the central role that the microbiome has in a variety of biological processes in the human body such as in the regulation of fat storage, lipogenesis, fatty acid oxidation and energy balance (Gérard, 2016).

These findings, which associate the microbiome with dysfunctional phenotypic states, have contributed to a change in paradigms regarding the relationship between the human body and microorganisms, and highlight the need to elucidate the rules by which this interaction can confer wellness or disease. To this end, some challenges must be overcome. These include the development of new computational paradigms that contribute to the coherent interpretation of data from heterogeneous HT technologies, such as Next Generation Sequencing (NGS) and Metabolomics, and the construction of quantitative schemes capable of influencing clinical decisions in precision medicine.

In this review, we present the forefront of HT technologies and conceptual schemes in bioinformatics and systems biology for surveying the host-microbiome association and cancer progression. We expect that our review will serve as a technical and conceptual guide in human microbiome studies, present and discuss the advances in the field, and offer an introspective analysis of the next steps for linking microbiome studies and precision cancer medicine.

### CHARACTERIZATION OF THE MICROBIOME USING HIGH-THROUGHPUT TECHNOLOGIES

The advent of HT technologies has positively impacted the elucidation of the metabolic and regulatory mechanisms by which hosts and microbes interact to determine a health or disease state in the host. In particular, NGS and techniques related to metabolome analysis such as mass spectrometry (MS) are valuable technologies for analyzing the microbiota composition and exploring the genetic, functional, and metabolic activity of the microbial community. Moreover, the use of these technologies enables us to explore the implications of the human microbiome to induce functional and dysfunctional states in a variety of human tissues. Here, we present the state of the art of these technologies and discuss some key findings to elucidate the relationship between the human microbiome and cancer.

#### Next Generation Sequencing

Sanger sequencing, the first generation of DNA sequencing technology, was developed by Frederick Sanger and is based on the selective incorporation of chain-terminating dideoxynucleotides by DNA polymerase; it established the methodological principles for DNA sequencing (Sanger et al., 1977). The Sanger sequencing technique constituted the main part of the Human Genome Project in 2001 and was the principle for the first automatic sequencing machine (AB370) produced by Applied Biosystems (Liu et al., 2012). However, limitations in throughput and the high cost of Sanger DNA sequencing reduced the potential of sequencing for other applications, such as for the characterization of personal genomes and cancer whole-genome sequencing. In fact, the cost of the Human Genome Project was estimated to be approximately 1–3 billion dollars over a 15-year period (International Human Genome Sequencing Consortium, 2004). After 2004, when the International Human Genome Sequencing Consortium published the completed sequence of the human genome, different HT sequencing technologies emerged, promoting decreasing costs and increasing potential applications for human health (Reuter et al., 2015).

Through automated DNA sequencing instruments that combine chemistry, engineering, software and molecular biology, dramatic improvements in sequencing technology have allowed revolutionary advances in our understanding of health and disease (Mardis, 2011, 2013). The launch of the Genome Sequencer system by 454 Life Sciences in 2005 highlighted the use of second-generation sequencing techniques employing massively parallel analysis. The second- and third-generation sequencing platforms, collectively known as NGS, are characterized by high data throughput, which can be used for a diverse range of scientific applications by changing the sample type and the manner of its preparation.

Many commercial second-generation sequencing platforms are now available, which follow a similar protocol: library/template preparation, clonal amplification and massively parallel sequencing. In terms of throughput per run, read length and accuracy, each platform has different specific features that make them useful for particular applications. Moreover, the newly emerged third-generation sequencing techniques, such as PacBio (Brown et al., 2014) and MinION (Quick et al., 2014), are performed on a single-molecule basis with no necessary initial DNA amplification step. These newer technologies can produce much longer reads compared with the second-generation sequencing platforms and have the potential to be less costly and less time-consuming.

Several reviews have covered these major platforms in great detail (Metzker, 2010; Mardis, 2013; Reuter et al., 2015). Of particular interest for this review is the application of NGS as an important tool that can provide detailed information about the taxonomic composition and the functional capabilities of the human microbiome for modern biomedical research. Some platforms are not discussed in this review, including Roche-454's pyrophosphate Genome Sequencer and ABI's SOLiD; instead, we focus on the platforms most commonly used today as technological tools in microbiome analysis as well as recent developments (**Table 1**).

The appropriate selection of one platform depends on the particular aim and design of the study. Illumina's technology has had tremendous advances in output and reduction in costs over the last few years and, as a consequence, currently dominates the NGS market (Dohm et al., 2008; Reuter et al., 2015). Illumina's sequencing technology has been widely used in microbiome projects (Evans et al., 2014; Lambeth et al., 2015; Yasir et al., 2015), including the Human Microbiome Project (HMP Consortium, 2012a).

Although both the Illumina and Ion Torrent systems offer a number of advantages in terms of utility for generating usable sequences, their short read lengths make them less suited for some particular scientific questions, including genome assembly, gene isoform detection, and methylation detection (Rothberg et al., 2011). Single-molecule real-time (SMRT) sequencing (a third-generation sequencing platform) offers an approach to overcome these limitations. De novo genome assembly is one of the main applications of PacBio sequencing because long reads can provide large scaffolds (Travers et al., 2010; Carneiro et al., 2012; Rhoads and Au, 2015). In addition, the direct sequencing protocol without library preparation offers the advantage of requiring a small quantity of DNA, just 1 ng for small genomes, compared with the other protocols that require 400–500 ng (Coupland et al., 2012). Moreover, SMRT sequencing methods can be used to study molecules other than DNA, for instance ribosomes (Uemura et al., 2010).

DNA sequencing using nanopore technology is another alternative method for producing long-read sequence data. The recent distribution of the MinION by Oxford Nanopore Technologies has made it possible to evaluate the utility of long-read sequencing using a device that resembles a USB memory stick (Ashton et al., 2015; Jain et al., 2015). Speed, single-base sensitivity and long read lengths make nanopore-based technology a promising method for HT sequencing. The MinION system has been used to sequence genomes of infectious agents, such as the influenza virus (Wang J. et al., 2015), to identify the position and structure of a bacterial antibiotic resistance island (Ashton et al., 2015), and as part of a genomic surveillance system of Ebola virus in which the sequencing process took as little as 16–60 min (Quick et al., 2016).

Rapid advances in sequencing technologies present widespread opportunities for microbiome studies using different platforms; however, the performance of the sequencing should be considered for the study design. Loman et al. reported that MiSeq had the highest throughput per run (1.6 Gb/run, 60 Mb/h) and the lowest error rates compared with 454 GS Junior or Ion Torrent PGM (Loman et al., 2012). In addition, Clooney et al. compared Illumina HiSeq, MiSeq and Ion PGM shotgun sequencing on six human stool samples, and found that optimal assembly values for the HiSeq were obtained for 10 million reads per sample, whereas the MiSeq and PGM sequencing depths were not sufficient to reach an optimal level of assembly (Clooney et al., 2016). Furthermore, MiSeq and PGM technologies provide a better functional categorization for predicting core genes from assembled contigs, possibly due to their longer read lengths (Clooney et al., 2016). Therefore, in some cases a combination of platforms could provide a more complete coverage of the studied genome.

The current sequencing assay protocols allow for two types of microbiome studies: (a) marker gene sequencing community identification, which surveys and counts microbes using amplicon sequencing of a single marker gene that is usually the 16S rRNA gene, and taxonomic assignment by bioinformatic methods; and (b) shotgun metagenomic sequencing, which surveys the entirety of all microbial DNA present in a sample using a collection of ad-hoc bioinformatic methods for gene and species identification purposes (Brown, 2015).

#### Amplicon Sequencing

Classic microbiology methods are limited to the study of microbes that grow under specific sets of culture conditions; however, most microbial species are difficult or impossible to culture in vitro. For that reason, their full genetic spectrum was unknown until the advent of HT sequencing technologies, expanding our knowledge of the microbial world. The similarities and distinctions among bacterial species have become complex (Konstantinidis et al., 2006), so that, instead of a "species," the term "operational taxonomic unit" (OTU) is used to characterize and infer the phylogenetic relationships between organisms grouped by sequence similarity (Blaxter et al., 2005; Koeppel and Wu, 2013; Schmidt et al., 2014). Usually, the 16S rRNA gene, which is a highly conserved gene in all prokaryotes, is amplified to analyze prokaryotic taxonomic composition in samples. However, this gene is approximately 1550 base pairs long making it difficult to sequence the whole gene through HT sequencing methods without an assembly step (Di Bella et al., 2013).

Instead of sequencing the entire 16S gene, one or more of its nine variable (V) regions are amplified using particular sets of primers. The choice of which variable region to use and amplify depends on factors related to the sample and experiment. For instance, evidence suggests that the V1–V3 region is better for taxonomical classification of species; however, some predictive studies show that the V3–V5 region results in a better classification of microbiota from disease vs. healthy specimens (Statnikov et al., 2013). Kim et al. analyzed different variable regions and recommended targeting of the V1–V3 and V4–V7 regions for the analysis of archaea and the V1–V3 and V1–V4 regions for the analysis of bacteria (Kim et al., 2011).

#### Shotgun Metagenomic Sequencing

Although the 16S rRNA gene is the gene most frequently used for studies of microbial community membership and structure, it has some limitations. The use of a particular set of primers for amplification of 16S and its PCR conditions can favor some taxa over others, creating bias in abundance counts (Statnikov et al., 2013). In addition, the 16S primers do not capture viruses and eukaryotes. Therefore, the shotgun metagenomic sequencing approach is commonly used to describe microbial communities without the biases inherent to PCR amplification of a single gene. In principle, shotgun sequencing provides robust estimates to identify the whole genomes present in a biological sample, including genome sequences of viruses and other functional DNA elements (Brown, 2015).

Metagenomic analysis is much more challenging than amplicon sequencing due to the consideration of whole genomes instead of a particular gene. Indeed, hundreds of millions of reads must be generated and analyzed for each sample, taking advantage of very deep sequencing on the Illumina HiSeq or similar instruments. In addition to shotgun metagenomic analysis, metatranscriptomic analysis using direct cDNA sequencing, which is known as RNA sequencing (RNAseq), allows for the analysis of all of the RNA of a sample to determine which genes are transcribed and for monitoring gene regulation over time, which is particularly interesting when studying changes in the microbiota in response to perturbations (Valles-Colomer et al., 2016).

Due to technical difficulties such as isolation of high quality RNA from biological samples or the presence of mRNA from the host, the application of RNA-seq to the study of the human microbiota in cancer is still limited. To date, a couple of interesting studies related to the metatranscriptome and the microbiome have been published. In 2014, Franzosa and coworkers reported the correlation between the metagenome and metatranscriptome of the healthy human gut microbiome. These findings showed that 41% of microbial transcripts are in concordance with their genomic abundances, while sporulation and some pathways of amino acid biosynthesis are underexpressed, and methanogenesis and ribosome biogenesis are upregulated. Interestingly, the subject-specific metatranscriptomic variation was more significant than the metagenomic variation (Franzosa et al., 2014). In 2015, Versluis and coworkers explored the gut metatranscriptomes for the expression of antibiotic resistance genes. Their results showed that resistance gene expression could be constitutive or could have different roles other than antibiotic resistance (Versluis et al., 2015).

After sequence data have been obtained, the next step in the NGS pipeline is the bioinformatics analysis of the reads, which includes quality control, assembly and, finally, microbiome profiling (**Figure 1**). In each step of the bioinformatic pipeline, there are diverse computational methods that can be applied based on the organisms, the biological question being explored, and the technology applied to the samples. Three initial steps are common to analyses based on the 16S rRNA gene (for prokaryotes), the nuclear ribosomal internal transcribed spacer (ITS) region (for fungi) or shotgun sequencing: (1) data acquisition or generation of FASTQ files (a common format for sharing sequencing read data); (2) quality control; and (3) assembly of the reads (**Figure 1**).

**Figure 1** (caption fragment): [...] genome pipeline. In the functional assignment step, we gather a biological understanding for regulation and gene pathway reconstruction, obtaining finally the microbiome profiling.

#### Data Acquisition

The NGS methodologies provide data files in different formats depending on the platform used. For instance, the Illumina platform generates \*.bcl binary files containing the base call and quality for each tile in each cycle, while Oxford Nanopore Technologies provides the data in binary files in HDF5/FAST5 format, which contain a number of hierarchical groups, datasets and attributes (Watson et al., 2014). However, to proceed with the analysis, both file types need to be converted to FASTQ format. FASTQ files have four lines per sequence: sequence identifier, raw sequence, quality score identifier and quality scores encoded in Phred format. Phred quality scores measure the confidence of each base call in the sequence.
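The four-line FASTQ record and its Phred encoding can be illustrated with a short parser. This is a minimal sketch, not a replacement for an established library: it assumes the common ASCII offset of 33 for quality characters (older Illumina pipelines used other offsets), it uses the standard Phred definition Q = -10 log10(P), where P is the estimated probability that the base call is wrong, and the function names and example record are hypothetical.

```python
import io


def read_fastq(handle):
    """Yield (identifier, sequence, quality_string) for each 4-line record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # line 3: '+' quality score identifier, ignored here
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality  # drop the leading '@'


def phred_scores(quality_string, offset=33):
    """Decode quality characters to integer Phred scores (offset is an assumption)."""
    return [ord(char) - offset for char in quality_string]


def error_probability(q):
    """Invert Q = -10 * log10(P) to get the base-call error probability."""
    return 10 ** (-q / 10)


# Hypothetical single-record FASTQ content for illustration.
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
name, seq, qual = next(read_fastq(example))
scores = phred_scores(qual)
print(name, seq, scores, [error_probability(q) for q in scores])
```

Under this encoding, a score of 20 corresponds to an estimated 1% chance of a wrong base call, and a score of 40 to 0.01%.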

#### Quality Control

Routinely, before starting a data analysis, a primary sequence analysis should be performed, in which various data parameters are evaluated, such as the quality scores of the sequences, global GC content, repeat abundance and the proportion of duplicated reads. The main tools for this are FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and the FASTX-Toolkit, which is a collection of command-line tools. Parameters for good quality data include a Phred quality score above 28, a low percentage of duplicated sequences, no adapter content, and a GC content per read close to the theoretical distribution. Another useful tool for quality assessment and processing of HT DNA sequence data is the Bioconductor package ShortRead (Morgan et al., 2009).
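To make these parameters concrete, the sketch below computes three of them (GC fraction, mean Phred score and the fraction of duplicated reads) for a handful of reads. It is a deliberately simplified illustration and does not reproduce how FastQC or the FASTX-Toolkit compute their reports; the function name, the offset of 33 and the toy reads are assumptions.

```python
from collections import Counter


def qc_summary(reads, offset=33):
    """Summarize reads given as (sequence, quality_string) pairs.

    Assumes each quality string has the same length as its sequence.
    """
    n_bases = sum(len(seq) for seq, _ in reads)
    gc_bases = sum(seq.upper().count("G") + seq.upper().count("C") for seq, _ in reads)
    quality_total = sum(ord(char) - offset for _, qual in reads for char in qual)
    # A read counts as duplicated if an identical sequence was already seen.
    sequence_counts = Counter(seq for seq, _ in reads)
    duplicated = sum(count - 1 for count in sequence_counts.values())
    return {
        "gc_fraction": gc_bases / n_bases,
        "mean_phred": quality_total / n_bases,
        "duplicate_fraction": duplicated / len(reads),
    }


# Toy reads for illustration only.
reads = [("ACGT", "IIII"), ("ACGT", "IIII"), ("GGCC", "5555"), ("ATAT", "IIII")]
summary = qc_summary(reads)
print(summary)
```

In this toy set, the mean Phred score is above the threshold of 28 mentioned above, while a quarter of the reads are exact duplicates.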

#### Assembly

Assembly into contigs consists of searching for overlapping reads, aligning them, and merging the sequences to reconstruct the original sequence. There are two main approaches for genome assembly: de novo and reference-guided. In the de novo assembly approach, there are currently two main methods: Overlap-Layout-Consensus (OLC) and De Bruijn Graph (BG). OLC methods are based on overlap graphs, and their process has three steps: (1) searching for overlapping reads by comparing all against all, (2) construction and manipulation of an overlap graph leading to an approximate read layout, and (3) constructing the consensus sequence using multiple sequence alignments (Miller et al., 2010). The BG method involves the definition and alignment of k-mers, where the parameter k denotes the length in bases of these sequences; the overlap is between k-mers, not between reads.
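
The idea that overlaps are taken between k-mers rather than between reads can be illustrated with a toy de Bruijn graph. The sketch below is a simplification under stated assumptions: the reads are invented and error-free, only the forward strand is considered, and the contig is obtained by following nodes with a single successor. Real assemblers additionally handle sequencing errors, reverse complements, repeats and coverage.

```python
from collections import defaultdict


def kmers(read, k):
    """All substrings of length k, in order of appearance."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn(reads, k):
    """Nodes are (k-1)-mers; each distinct k-mer adds an edge prefix -> suffix."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return {node: sorted(successors) for node, successors in graph.items()}


def walk_unambiguous(graph, start):
    """Spell a contig by following edges while the current node has one successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in seen:  # stop on a cycle instead of looping forever
            break
        seen.add(node)
        contig += node[-1]
    return contig


# Invented overlapping reads sampled from the sequence ATGGCGTGCA.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")
print(contig)
```

Note that no read-against-read comparison is performed: the reads are only decomposed into k-mers, which is what makes this approach attractive for large numbers of short reads.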

In the context of obtaining the microbiome profile of a sample using the 16S rRNA gene, three phases can be distinguished: OTU clustering, OTU classification and diversity assessment. The metrics for microbiota description include species richness and phylogenetic diversity, distance matrices of samples, alpha and beta diversity, rank abundance distributions and statistical analysis of ordination and classification. OTU clustering is a key step for de novo OTU construction that has an important effect on the estimation of species abundance and diversity. There are some recent comparisons of several of these clustering methods (Chen et al., 2013; Kopylova et al., 2016). As an alternative to the direct construction of OTU clusters, DADA2 more recently models and corrects sequencing errors to properly identify sequence variants at the strain level (Callahan et al., 2016). Further taxonomic assignment of the sequence table can be accomplished via Greengenes (DeSantis et al., 2006), SILVA (Quast et al., 2013) or a dedicated human intestinal 16S database (Ritari et al., 2015). There are different software options to analyze this kind of data from end to end, such as QIIME or Mothur (Schloss et al., 2009; Caporaso et al., 2010; Navas-Molina et al., 2013), MICCA (Albanese et al., 2015) or phyloseq, developed in the R language (Mcmurdie and Holmes, 2013; Heazlewood et al., 2015).
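
Some of the diversity metrics listed above have simple closed forms. As a hedged illustration, the sketch below computes species richness and the Shannon index (two common alpha diversity measures) and the Bray-Curtis dissimilarity (a common beta diversity measure) from invented OTU count vectors. The choice of these particular indices is an assumption; the text does not prescribe them, and packages such as QIIME or phyloseq offer many more.

```python
import math


def richness(counts):
    """Number of OTUs observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same OTUs."""
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Invented OTU counts for two samples (same OTU order in both vectors).
sample_a = [10, 10, 0, 0]
sample_b = [5, 5, 5, 5]
print(richness(sample_a), richness(sample_b))
print(shannon(sample_a), shannon(sample_b))
print(bray_curtis(sample_a, sample_b))
```

Here the second sample is both richer and more even, so its Shannon index is higher, while the Bray-Curtis value of 0.5 reflects that half of the total counts are not shared between the two samples.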

While amplification of the 16S rRNA gene is performed to determine the diversity and quantify the abundance of bacteria, metagenomic shotgun sequencing aims to recover genomes (Smits et al., 2015), describe the genomic structure and survey the metabolic capabilities of the different microorganisms in a community. The most common strategy to reconstruct genomes and recover global functional pathways from metagenomic reads involves: (1) gene prediction, (2) functional assignment, and (3) pathway reconstruction (Abubucker et al., 2012).

Accurate gene prediction is critical for functional assignment. To increase the accuracy of prediction, some authors recommend using algorithms that identify open reading frames by taking into account significant differences between coding and non-coding sequences, such as di-codon frequency, GC content of coding sequences, codon usage bias and patterns in the use of start and stop codons (Escobar-Zepeda et al., 2015).
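
The simplest of these signals, start and stop codon patterns, can be sketched as a naive open reading frame scan. The snippet below is only an illustration under explicit assumptions: it searches the three forward frames of an invented sequence, uses ATG as the only start codon and the standard stop codons, and ignores codon usage and di-codon statistics, which the gene finders discussed in this section model explicitly.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, orf) for ATG...stop stretches in the three forward frames.

    min_codons counts the start and stop codons; end is exclusive.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


def gc_content(seq):
    """Fraction of G and C bases, one of the features mentioned above."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Invented sequence containing two short ORFs in different frames.
sequence = "CCATGGCCTGATTATGAAATAG"
orfs = find_orfs(sequence)
for start, end, orf in orfs:
    print(start, end, orf, round(gc_content(orf), 2))
```

A scan like this over-predicts heavily on real data, which is why statistical models of coding sequence composition are needed, particularly for short or error-containing reads.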

From a practical point of view, there are several packages and suites to perform metagenomic analysis taking into account a variety of statistical tools (Supplementary Table 1). For instance, MetaGeneMark uses direct polynomial and logistic approximations of oligonucleotide frequencies, and it evaluates the dependencies between the frequencies of oligonucleotides with different lengths and the GC% of a nucleotide sequence (Zhu et al., 2010); Glimmer-MG, which is based on Glimmer, uses variable-order interpolated Markov models to capture the sequence composition of protein-coding genes (Kelley et al., 2012); FragGeneScan incorporates sequencing error models and codon usage in a hidden Markov model to predict ORFs in short reads (Rho et al., 2010); and Orphelia is a gene finder based on a machine learning approach (Hoff et al., 2008).

A common strategy in metagenomics pipelines is the partitioning or clustering of reads (for example, for the exclusion of rRNA, tRNA or other specific DNA) by alignment methods (Kopylova et al., 2012; Wood and Salzberg, 2014). This allows taxonomy assignment and classification of reads. Improvements in the speed and accuracy of these tasks have been achieved by various methods implemented in Phymm and PhymmBL (Brady and Salzberg, 2009), LMAT (Ames et al., 2013), mOTUs (Sunagawa et al., 2013), and more recently Kraken (Wood and Salzberg, 2014), MetaPhlAn2 (Truong et al., 2015), and SMART (Lee et al., 2016). For a better estimation of gene abundances, there are methods that use a machine learning approach, such as MUSiCC (Manor et al., 2015). All these methods rely on a reduced database search of single copy genes, wide coverage phylogenetic markers or hidden Markov models using training sets. Others use combined methods of genomic signatures, marker genes and optional contig coverages (Lin and Liao, 2016). Peabody and coworkers present a recent comprehensive evaluation of metagenomic classification methods (Peabody et al., 2015).
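
As a rough intuition for how reads can be assigned to taxa without full alignment, the following toy classifier looks up the k-mers of each read in an index built from reference sequences and lets taxon-specific k-mers vote. This is a deliberately simplified sketch and does not implement the algorithm of any of the tools cited above; the reference sequences, taxon labels and function names are all invented for illustration.

```python
from collections import Counter


def kmer_set(seq, k):
    """Set of all k-mers occurring in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in references.items():
        for kmer in kmer_set(seq, k):
            index.setdefault(kmer, set()).add(taxon)
    return index


def classify(read, index, k):
    """Return the taxon with the most taxon-specific k-mer hits, or None if unclear."""
    votes = Counter()
    for kmer in kmer_set(read, k):
        taxa = index.get(kmer, ())
        if len(taxa) == 1:  # k-mers shared between taxa do not vote
            votes[next(iter(taxa))] += 1
    ranked = votes.most_common(2)
    if not ranked:
        return None  # no informative k-mer found
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two best taxa
    return ranked[0][0]


# Invented reference sequences for two hypothetical taxa.
references = {"taxon_A": "ATGGCGTGCAATTC", "taxon_B": "TTACCGGATCCGTA"}
index = build_index(references, k=5)
calls = [classify(r, index, k=5) for r in ("GGCGTGCA", "CCGGATCC", "AAAAAAAA")]
print(calls)
```

Restricting the index to marker genes or single copy genes, as the methods above do, keeps such a lookup table small enough to search quickly.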

Functional assignment is performed on the predicted open reading frames or predicted proteins by sequence similarity search against well-curated databases, using tools such as BLAST (local alignments), FASTA (global alignment) or HMMER (hidden Markov model profiles) when sequence identity is low. These analyses can be performed using locally installed software; alternatively, for users with no bioinformatic training, there are different suites for analysis, such as MG-RAST (Wilke et al., 2016), IMG/M (Markowitz et al., 2012; Wilke et al., 2016), JCVI and Metagenomics Reports (METAREP) (Goll et al., 2010; Markowitz et al., 2012; Wilke et al., 2016) or MEGAN (Huson and Weber, 2013), MetAMOS (Treangen et al., 2013), MOCAT2 (Kultima et al., 2016), and MetaTrans (Martinez et al., 2016), which are software designed to simplify the entire metagenomics or metatranscriptomics pipeline: preprocessing, assembly, annotation and analysis.

Having obtained a high quality functional annotation, the process of metabolic pathway reconstruction is extremely useful to identify, at a systemic level, those pathways with a primary role in supporting the phenotype. For mapping each gene to a metabolic pathway and analyzing missing enzymes (which may be absent because an analogous enzyme performs the same function), two different databases can be used: KEGG (Ogata et al., 1999) and MetaCyc (Karp, 2002). For instance, KEGG has implemented GhostKOALA as a tool for metagenomic analysis, which is based on a non-redundant dataset of pangenome sequences (Kanehisa et al., 2016).

#### Metabolomics

Host-microbiome interactions encompass an exchange of metabolites and signaling molecules, some of which have an essential role in establishing proper functionality in the host and the microbial community. This crosstalk depends on a variety of factors such as the microbiome composition and the external environment. Understanding the metabolic activity of these communities and how it impacts the host has been the focus of many studies, some of which associate metabolic biomarkers with the development of disease.

With the aim of disentangling this complex metabolic communication and surveying the metabolic pathways that actively participate in the community, metabolomics, which embraces the massive quantitative measurement of intracellular or extracellular metabolites in biological samples such as human stool (Weir et al., 2013), has been established as the most suitable HT technology to characterize the phenotype and dynamic response of living systems (Nicholson and Lindon, 2008; Marcobal et al., 2013; Diener et al., 2016).

Metabolomic studies can be performed by using three basic approaches: (1) fingerprinting or endo-metabolome, searching for metabolites within the organisms under study; (2) footprinting or exo-metabolome, analyzing metabolites from the environment around the organism under study; and (3) metabolome profiling, where the goal is to screen one or more specific compounds (Patel et al., 2015). A typical metabolomic study has four basic steps: sample collection, data acquisition, bioinformatic analyses and biological interpretation (briefly described in **Figure 2**).

Currently, two main technologies are used in metabolomics: mass spectrometry (MS) and nuclear magnetic resonance spectroscopy (NMR). MS is a highly sensitive method for detection, quantification and structure elucidation of hundreds of metabolites. Given the wide spectrum of molecular weights of metabolites in samples, it is necessary to separate metabolites to improve the sensitivity and accuracy of detection. Thus, MS is often coupled with different separation techniques such as gas chromatography (GC-MS), liquid chromatography (LC-MS) and capillary electrophoresis (CE-MS) (Gowda and Djukovic, 2014). All of these techniques have been used for clinical studies, and each has advantages and limitations. For instance, GC-MS has high-resolution capability, but it requires volatile compounds or compounds made volatile by chemical derivatization. LC-MS is a very sensitive technique, and it has the advantage of not requiring chemical derivatization of compounds; however, it has poor resolution. Also, the high capacity of CE-MS to separate compounds allows its use as a platform for multiplexing samples (Johanningsmeier et al., 2014; Nagana Gowda and Raftery, 2015).

FIGURE 2 | Workflow for metabolomics analysis. Metabolomic studies involve four general steps: (1) sample collection, which depends on the type of tissue and must consider the type of storage, preservation and preparation of each sample; (2) data acquisition, which involves sample analysis and quality control; (3) data analysis, which includes normalization and identification of metabolites using specialized software for statistical analysis; and (4) data interpretation, in which the data must be integrated and modeled to raise new hypotheses.

On the other hand, NMR spectroscopy is a technique with high reproducibility and is able to absolutely quantify metabolites using a single reference; because it is a non-destructive technique, the samples can be used for re-analyses using other methods (Nagana Gowda and Raftery, 2015). NMR spectroscopy has two variations: <sup>1</sup>H-NMR and high-resolution magic angle spinning NMR (HR-MAS-NMR).

After analyzing samples, it is necessary to interpret the data. Common analysis procedures involve data conversion, detection of signal peaks, alignment (i.e., comparison between different datasets to eliminate migration time shifts) (Katajamaa and Orešič, 2007), normalization and identification of metabolites. Processed data require multivariate statistical analysis to find the samples or variables accounting for most of the variability between datasets and to suggest potential biological roles; therefore, methods such as partial least squares discriminant analysis (PLS-DA), principal component analysis (PCA), hierarchical clustering analysis (HCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) are widely used. A number of free software packages and databases for metabolic analysis are available, and these are summarized in Supplementary Table 1. Visualization tools can aid the interpretation of results; both heatmaps and pathway maps are widely used for this task (Supplementary Table 1).
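
Of the multivariate methods named above, PCA is the easiest to sketch from first principles. The example below centers an invented samples-by-metabolites matrix, builds its covariance matrix and extracts the first principal component. To keep it dependency-free it uses power iteration rather than a full eigendecomposition, which is a substitution made here for brevity; a numerical library would normally be used, and PLS-DA or OPLS-DA require dedicated statistical packages.

```python
def center(matrix):
    """Subtract the column (metabolite) mean from every value."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def covariance(centered):
    """Sample covariance matrix of the columns of a centered matrix."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols] for ci in cols]


def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov, obtained by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(vi * sum(c * x for c, x in zip(row, v)) for vi, row in zip(v, cov))
    return v, eigenvalue


# Invented intensities: rows are samples, columns are two metabolites.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(data)
cov = covariance(centered)
pc1, variance = first_component(cov)
explained = variance / sum(cov[i][i] for i in range(len(cov)))
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]
print(pc1, explained, scores)
```

Because the two invented metabolites are perfectly correlated, the first component captures all of the variance; with real data, the explained fraction indicates how much of the variability between samples a low-dimensional plot retains.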

Finally, it is important to standardize data so that they can be shared in public databases; this could facilitate experimental replication between laboratories and maximize the value of metabolomic data (Fiehn et al., 2006). Additionally, the Human Metabolome Database (HMDB) is a metabolome project, analogous to the Human Genome Project, which aims to provide a comprehensive database of detected and biologically expected human metabolites. Currently, the HMDB has more than 40,000 metabolite entries (Wishart et al., 2012). The enrichment of these valuable tools can provide a better understanding of the characteristics of health and disease states when combined with other clinical and modeling approaches to fill the gap between genotype and phenotype (Diener et al., 2016).

To study the metabolic changes in health and disease, we can analyze the metabolites produced solely by the host, those produced or modified by the microbiome, or the metabolites jointly contributed by host-microbiome interactions (Guo et al., 2015). In cancer metabolomic research, there are different types of samples to study, including tissue and fluids such as urine, blood, saliva, breath condensate, cerebrospinal fluid, and pancreatic juices, and each requires a particular method for storing and preparing the sample for processing (Spratlin et al., 2009). Additionally, metabolomics can help us to track those factors found in our environment that can influence the phenotype, such as diet, chemical exposure, xenobiotics, supplements or drugs (de Raad et al., 2016). Here, we briefly review some studies related to cancer metabolomics and host-microbiome co-metabolism.

Cancer cells have a specific metabolic demand to proliferate, increase their growth and sustain their malignant phenotype (Resendis-Antonio et al., 2015). Notably, this physiological state is represented by changes in the metabolic profile of human tissue. The identification of these metabolic alterations is a crucial point to define the phenotype, design new therapeutic targets and explore the evolution of the disease (Locasale et al., 2009; Yun et al., 2009; Ramirez et al., 2013).

Metabolomic studies have led us to search for new biomarkers in cancer, and these findings have had important implications for surveying the mechanisms of a variety of cancers such as bladder (Rodrigues et al., 2016), breast (Jobard et al., 2014), pancreatic (Di Gangi et al., 2015), gastroesophageal (Abbassi-Ghadi et al., 2013), gastric (Abbassi-Ghadi et al., 2013; Chan et al., 2016), and oral (Mikkonen et al., 2015) cancer. For instance, in the case of gastric cancer, three potential biomarkers, 2-hydroxyisobutyrate, 3-indoxylsulfate and alanine, were identified in urine samples using <sup>1</sup>H-NMR spectroscopy, revealing that those patients have a particular metabolic profile (Chan et al., 2016).

Other more comprehensive approaches involve the study of microbiome metabolites and their interactions with the host, i.e., synthesis, absorption, and potential physiological effects on the host. There are several studies that have been able to discern the different metabolites in the human gut microbiome and their relationships with health and disease (Sharon et al., 2014). Additionally, there are in vivo studies observing the effects of the human gut microbiota on the metabolism of biofluids of humanized mice (Marcobal et al., 2013; Smirnov et al., 2016). By characterizing, discerning and associating metabolite levels with genetics and external factors such as diet and the microbiome, metabolomics can aid in diagnostics and expand the clinical scope toward the realization of precision medicine (Beebe and Kennedy, 2016).

For instance, Guo et al. analyzed the plasma metabolites from healthy volunteers, identifying 600 metabolites covering 72 biochemical pathways, including biosynthesis, catabolism, gut microbiome activities, and xenobiotic metabolism. Also, the metabolome profiles were associated with whole-exome sequencing and clinical records to identify metabolic abnormalities associated with disease (Guo et al., 2015). This approach exemplifies how complementing genetic and metabolic analysis can help to improve diagnosis and medical interventions such as dietary changes, to evaluate drug response and to discover biomarkers.

### ELUCIDATING THE HOST-MICROBIOME INTERACTIONS AND CANCER DEVELOPMENT

In the emergence of complex diseases such as cancer, the relationship between the environmental influence, the microbiome and cancer appearance can be very entangled. The body offers a suitable and nutrient-rich microenvironment to resident microbes, while the microbiome assists humans in metabolic or immune tasks. Additionally, the microbiota provides humans with non-nutrient essential factors, such as vitamins, and impedes pathogens from establishing (Zitvogel et al., 2015). Differences in microbial and possibly viral compositions between healthy subjects and those affected by diseases have been identified (Blumberg and Powrie, 2012; Koeth et al., 2013; Bultman and Jobin, 2014; Clavel et al., 2014; Tilg and Moschen, 2014). Broadly defined, this imbalance, referred to as dysbiosis, implies deviations in the composition of resident commensal communities from those found in healthy individuals (Petersen and Round, 2014).

Most of the current research exploring the effects of host-microbe interplay in cancer is focused on colorectal cancer (CRC). By using genomic approaches, some studies have compared the mucosal surface and the intestinal lumen microbiota between healthy subjects and patients with CRC (Chen et al., 2012; Kostic et al., 2012; Sanapareddy et al., 2012). Although there is no consensus between studies, some taxa are associated with a protective function (e.g., Roseburia) while others are associated with potentially detrimental effects (e.g., Fusobacterium, Klebsiella, and Escherichia/Shigella) (Jobin, 2013; Thomas and Jobin, 2015). This suggests a dysbiotic or differential community composition correlated with CRC development. However, among the open issues about host-microbiome interactions in disease, it remains unknown whether the microbiome is a driver or a consequence of cancer development (Tjalsma et al., 2012).

Altered cellular metabolism and inflammation are proposed host-dependent hallmarks of cancer (Hanahan and Weinberg, 2011). Even when host-microbiome interactions might not be considered essential for cancer appearance, or their effects are indirect, some cancers, such as CRC, might have an important microbial component. In vitro studies have reported a signaling process between bacterial quorum-sensing peptides (QSPs) and cancer cells. Bacillus-derived QSPs are synthesized in response to bacterial stressors and are able to induce tumor cell invasiveness through an epithelial-mesenchymal transition-like (EMT-like) process (involved in CRC metastasis) (Wynendaele et al., 2015). The QSPs contributed to both metastatic and angiogenic behaviors under these settings (De Spiegeleer et al., 2015; Wynendaele et al., 2015). Furthermore, in other kinds of cancer, the result of microbial activities can reduce the effectiveness of chemotherapy (Wallace et al., 2010) or influence the development of tumors distant from the gut (Iida et al., 2013).

Genetic and environmental factors disrupting the healthy relationship between hosts and microbiomes can provoke dysbiosis and promote cancer development (**Figure 3**). Lifestyle, diet, and early exposure have been recognized as major players in determining the microbiome composition. Additionally, different metabolites produced by the intestinal microbiota are proposed to play both cancer-promoting and cancer-protecting roles; however, the factors determining different outcomes are not completely understood (Bultman and Jobin, 2014). Characterizing bacterial OTUs that are consistently altered across studies, and attributing the presence of specific diseases to them, can be difficult given the inter-individual variations (Zackular et al., 2013). This suggests the need to understand the possible roles of the microbiome in this process. In this regard, we will review three major factors that can promote microbial dysbiosis and cancer development: (1) infectious agents, (2) diet- and microbial-derived metabolites, and (3) inflammatory mediators.

FIGURE 3 | … development. Differences in microbial composition between healthy individuals and those affected by cancer have been identified. Genetic and environmental factors can disrupt the healthy condition of the human microbiome and promote microbial dysbiosis. Infectious agents are one of the main contributors to dysbiosis and cancer development, in addition to diet, which has been recognized as one of the major players in determining microbiome composition. Moreover, microbes associated with cancer appear to activate pro-inflammatory pathways in host tissues.

### Infectious Agents in Cancer

Infectious agents are one of the main contributors to cancer development. The linkage between infection with some biological agents and carcinogenesis in humans was first made more than a century ago, when Francis Peyton Rous began his famous cancer virus transmission experiments at the Rockefeller Institute, USA (Moore and Chang, 2010). Eleven biological agents have been identified as group 1 carcinogens by the International Agency for Research on Cancer (IARC) (Bouvard et al., 2009). These include Epstein-Barr virus (EBV), hepatitis B and C viruses (HBV and HCV, respectively), Kaposi sarcoma herpesvirus (KSHV, also known as human herpesvirus type 8, HHV-8), human immunodeficiency virus type 1 (HIV-1), human papillomavirus (HPV) type 16 (HPV-16), human T-cell lymphotropic virus type 1 (HTLV-1), Helicobacter pylori (H. pylori), Clonorchis sinensis (C. sinensis), Opisthorchis viverrini (O. viverrini), and Schistosoma haematobium (S. haematobium). Although HIV does not directly cause cancer, its infection strongly increases the incidence of many different human cancers. Among these cancers, those associated with the herpesviruses KSHV and EBV are the most strongly enhanced by immunosuppression (Bouvard et al., 2009).

Specific infections represent major cancer risk factors, with an estimated 2.1 million (16.4%) of the 12.7 million new cases in 2008 attributable to infection. This fraction is substantially higher in less developed regions of the world (23.4% of all cancers) than in more developed regions (7.5%). The most important infectious agents are H. pylori, hepatitis B and C viruses and HPV, which together are responsible for 1.9 million cases, mainly of gastric, liver and cervix uteri cancers, respectively (de Martel et al., 2012). A better understanding of the role of infectious agents in the etiology of cancer is an essential element for precision medicine, because such cancers are theoretically preventable by proper vaccination or early treatment of infection (IARC Working Group on the Evaluation of Carcinogenic Risks to Humans, 2012).

The IARC estimates that one in five cancer cases worldwide is caused by infection, with most being caused by viruses (Bouvard et al., 2009). The first human tumor virus, Epstein-Barr virus, also known as human herpesvirus 4 (HHV-4), was described in 1964 in cell lines from African patients with Burkitt's lymphoma (Epstein et al., 1964). EBV is invariably associated with the non-keratinizing type of nasopharyngeal carcinoma (NPC), which represents 80% of NPC cases, and new evidence points to a role for EBV in 5–10% of gastric carcinomas. EBV infection is observed to occur mostly in the upper and middle portions of the stomach rather than in the lower part (Shah and Young, 2009).

Chronic infection with hepatitis B virus (HBV) and hepatitis C virus (HCV) is known to cause hepatocellular carcinoma (Song et al., 2016). Several epidemiological studies suggest that HCV may be involved in the pathogenesis of several B-cell lymphoproliferative disorders. In particular, sufficient evidence is available to indicate that chronic infection with HCV can also cause non-Hodgkin lymphoma (Hermine et al., 2002). Evidence of HTLV-1 infection was initially found in at least 90% of adult T-cell leukemia and lymphoma (ATLL) cases; subsequently, HTLV-1 infection became part of the diagnostic criteria for ATLL (Oh and Weiderpass, 2014). KSHV is a causal factor for Kaposi sarcoma. More recently, Merkel cell polyomavirus (MCV), a novel member of the polyomavirus family, has been identified, and there is some evidence that MCV has an important role in the development of Merkel cell carcinoma, a rare skin cancer arising in elderly and chronically immunosuppressed individuals (Shuda et al., 2008).

It is very well established that infection with specific types of HPV can cause cervical cancer. Global epidemiological studies identified HPV 16, 18 and a few others as major risk factors for cervical cancer (zur Hausen, 2009). In addition, there is strong epidemiological evidence for the involvement of HPV infection in carcinomas of the cervix, penis, vulva, vagina, anus, upper aerodigestive tract, and head and neck. The majority of HPV-related head and neck cancers are located in the oropharynx (Hettmann et al., 2015). Multiple meta-analyses support the finding of a higher HPV detection rate in regions associated with high risk for esophageal squamous cell carcinoma (ESCC) compared to low-risk areas. Additionally, a potential role of HPV in the rise of esophageal adenocarcinoma (EAC) was proposed recently; however, further studies are required (Xu et al., 2015).

The prevalence of H. pylori infection varies widely by geographic area, age and socioeconomic status. In less developed regions, it may reach 80%, while in more developed regions the prevalence is 40% or less (Brown, 2000). H. pylori infection is limited to the distal part of the stomach, and chronic infection is associated with non-cardia gastric carcinoma. H. pylori produces various virulence factors that may dysregulate host intracellular signaling pathways, controlling the immune response associated with the induction of carcinogenesis. Of all virulence factors, cagA (cytotoxin-associated gene A), its pathogenicity island (cag PAI), and vacA (vacuolating cytotoxin A) are the major pathogenic factors (Ahn and Lee, 2015). H. pylori can modulate the immune response by activating growth factors and cytokines (Amedei et al., 2009). For instance, the H. pylori-secreted peptidyl-prolyl cis-trans isomerase HP0175 is one of the bacterial antigens recognized by sera of H. pylori-infected patients; it is able to activate both the epidermal growth factor receptor and the NF-κB pathway, and it drives gastric T helper 17 (TH17) responses in patients with distal gastric adenocarcinoma (Amedei et al., 2014).

Regarding helminth infections, chronic infections with the liver flukes C. sinensis and O. viverrini are associated with cholangiocarcinoma. Liver fluke antigens stimulate both inflammatory and hyperplastic changes in the infected bile ducts, which undergo severe pathological transformations. The relative risk for this adenocarcinoma is estimated to be 7.8 for individuals infected with O. viverrini and 7.7 for those infected with C. sinensis (IARC Working Group on the Evaluation of Carcinogenic Risks to Humans, 2012). Approximately 5–10% of cholangiocarcinoma cases are caused by chronic C. sinensis infection in endemic areas, which are located in China, Korea, Thailand, Laos, Vietnam, and Cambodia (Oh and Weiderpass, 2014). On the other hand, S. haematobium is a parasitic flatworm associated with bladder cancer that infects millions of people, mostly in the developing world. In in vitro models exposed to total antigens, Botelho et al. found increased cell proliferation, decreased apoptosis, up-regulation of the anti-apoptotic molecule Bcl-2, down-regulation of the tumor suppressor protein p27, and increased cell migration and invasion (Botelho et al., 2010).

Infectious agents can be direct carcinogens, such as HTLV-1 and KSHV, which express viral oncogenes that directly contribute to cancer cell transformation, or indirect carcinogens that cause chronic inflammation, which eventually leads to carcinogenic mutations in host cells, such as H. pylori, the major cause of gastric carcinogenesis. In addition, carcinogenesis may result from the interaction of multiple risk factors, including those related to the infectious agent itself (virulence factors, variants, or subtypes), host-related factors (gene polymorphisms and immune system status) and environmental aspects (smoking, chemicals, ionizing radiation, immunosuppressive drugs, or another infection that may lead to reactivation of latent oncogenic viruses such as EBV or KSHV) (IARC Working Group on the Evaluation of Carcinogenic Risks to Humans, 2012). Further studies should be conducted to elucidate in detail the contribution of these additional factors to the development of cancers associated with infectious agents.

### Diet and Microbial-Derived Metabolites in Cancer

Microbiome-derived metabolites are gaining recognition for their potential participation in cancer development (Louis et al., 2014). Clearly, diet is a major source for the production of those metabolites and has to be taken into account along with microbiome composition and activities. For example, high fat and high protein consumption is characteristic of modern western diets (Hughes et al., 2000; Albenberg and Wu, 2014), and this particular dietary composition is currently recognized as a risk factor for cancer occurrence (Bouvard et al., 2015; Gallagher and LeRoith, 2015). In this section, we present some examples of in vitro and in vivo studies of microbiome-derived metabolites related to cancer development, and explore their possible application as biomarkers.

#### Secondary Bile Acids

In the liver, enzymatic oxidation of cholesterol generates bile acids (BAs) that function as detergents that facilitate digestion and absorption of lipids, while also acting as signaling molecules related to metabolic homeostasis (de Aguiar Vallim et al., 2013). The presence of BAs in the colon promotes their subsequent conversion to secondary bile acids (SBAs) by means of bacterial enzymes. Species with 7-α-dehydroxylating enzymes can convert the host's BAs into SBAs (Ou et al., 2013), and those can act as carcinogens (Bernstein et al., 2005).

In vitro studies have shown that 1-h exposure to the SBAs deoxycholic acid (DCA) or lithocholic acid (LCA) causes extensive DNA damage at physiological concentrations in a dose-dependent manner (Booth et al., 1997). Moreover, those compounds induced the production of reactive oxygen species (ROS) by acting as detergents on membrane enzymes, such as phospholipase A2, resulting in the formation of prostaglandins and leukotrienes (Bernstein et al., 2005).

Pro-cancerous activity derived from SBA has also been described in vivo. In a mouse model, treatment with a carcinogen at the neonatal stage and subsequent feeding with a high-fat diet induced the appearance of hepatocellular carcinoma, showing a senescence-associated secretory phenotype (SASP) in hepatic stellate cells (Yoshimoto et al., 2013). The level of DCA produced by enteric bacteria was increased under these conditions, and OTU analysis revealed an increase in DCA-producing bacteria belonging to Firmicutes from Clostridium cluster XI (Yoshimoto et al., 2013).

Human studies indicate that African Americans have a higher incidence of and higher mortality from CRC than other ethnic populations in the USA (O'Keefe et al., 2007). In a search for possible mechanisms, microbiome compositions of African Americans and native Africans were compared; the former group was enriched in Bacteroides spp., whereas the latter was dominated by Prevotella spp. (Ou et al., 2013). This reflected the differences in bacterial enrichment between western and fiber-rich diets. Additionally, genes coding for SBA-producing enzymes and fecal SBA concentrations were higher in African Americans, whereas short-chain fatty acids were higher in native Africans (Ou et al., 2013). This scenario suggests that populations with similar genetic backgrounds can differ in phenotype and in their proclivity to develop a certain disease, and that this difference is mainly driven by diet and microbiome composition.

#### Short Chain Fatty Acids

Consumption of dietary fiber stimulates saccharolytic fermentation by diverse gut microbes that produce short-chain fatty acids (SCFA), mainly acetate, propionate, and butyrate (Holmes et al., 2012). Bacteroidetes produce high levels of acetate and propionate, whereas Firmicutes bacteria produce high amounts of butyrate. Acetate and propionate are found in portal blood and are eventually metabolized by the liver or peripheral tissues (Honda and Littman, 2012). Butyrate is considered a pleiotropic metabolite, functioning as the primary energy source for colonocytes, reducing oxidative stress and inhibiting inflammation (Hamer et al., 2008).

Some anticancer activities have been attributed to butyrate. By functioning as an inhibitor of histone deacetylase (HDAC), butyrate induces hyperacetylation of core histone proteins (H3 and H4) when compared with other SCFA. Among its effects as an HDAC inhibitor, butyrate can induce in vitro S-phase arrest of colorectal adenocarcinoma cells and inhibit their growth by inducing apoptosis and the expression of the cell cycle regulators p21 and cyclin B1 (Hinnebusch et al., 2002).

Interestingly, those effects depend on cell status, i.e., normal vs. cancer. In normal cells, butyrate stimulates proliferation (functioning as an energy source), whereas in cancerous cells, butyrate inhibits proliferation and induces apoptosis (Comalada et al., 2006). Donohoe et al. analyzed these context-dependent effects from the perspective of the Warburg effect (Donohoe et al., 2012). Due to the Warburg effect, cancer cells primarily depend on aerobic glycolysis instead of oxidative metabolism for survival. In this context, butyrate is not used as an energy source and instead accumulates inside the nuclei, where it inhibits HDAC in cancer cells. Experimental inhibition of the Warburg effect in cancerous colonocytes induced cell proliferation, suggesting that the Warburg effect is necessary for observing the butyrate antiproliferative effect (Donohoe et al., 2012).

On the other hand, CRC-prone mice revealed a paradoxical effect of butyrate on colonic cancer cells. By using a mouse model with mutations in the adenomatous polyposis coli (APC) and DNA mismatch repair (MMR) genes (as commonly observed in humans), Belcheva et al. observed an anomalous proliferation of colonic epithelial cells and formation of polyps (Belcheva et al., 2014). Furthermore, using antibiotics or lowering carbohydrates in the diet reduced the development of tumors. This indicates an involvement of microbial metabolism and diet in cancer development under this particular host genetic background. The authors identified butyrate as a causative agent of disease onset, and the sole administration of butyrate was sufficient to increase polyp number and epithelial cell proliferation. Given the apparently paradoxical effects of butyrate on cancerous phenotypes, modifying bacterial activities with antibiotics and/or diet is a potential therapeutic avenue for improving outcomes in cancer patients.

#### Proteins and Red Meat Diet-Associated Compounds

When carbohydrates are depleted in the proximal colon, protein fermentation can occur in the distal colon (Windey et al., 2012). This activity is mainly driven by colonic bacteria and results in the production of noxious metabolites such as ammonia, amines, phenols and sulfides. Western diets provide compounds such as fats, heme and heterocyclic amines, which are suggested to play a role in CRC development (Windey et al., 2012).

Amino acids fermented by colonic bacteria include lysine, arginine, glycine, and the branched chain amino acids (BCAA) leucine, valine, and isoleucine. This generates a diversity of end products including ammonia, SCFA, and branched-chain fatty acids (BCFA) valerate, isobutyrate, and isovalerate. Microbial metabolism of amino acids can also produce biogenic amines by decarboxylation of amino acids (Windey et al., 2012; Neis et al., 2015).

Bacterial metabolism of aromatic amino acids results in the production of phenolic and indolic compounds that are excreted as p-cresol. In vitro studies in epithelial colonic cells have shown detrimental effects and genomic DNA damage by ammonia, sulfides, p-cresol and phenolic compounds (Pedersen et al., 2002; Attene-Ramos et al., 2010; Windey et al., 2012). Hydrogen sulfide also inhibits cellular respiration, at least in part by acting as an inhibitor of cytochrome c oxidase, which participates in the final step to produce ATP. These noxious effects have been associated with inflammatory bowel disease and cancer (Medani et al., 2011).

Additionally, epidemiological and experimental studies have shown that red meat induces more genetic damage than white meat (Toden et al., 2007). By studying the characteristic compound of red meat, heme, Ijssennagger et al. reported that the colon microbiota facilitates heme-induced epithelial injury and hyperproliferation as a result of the activity of hydrogen sulfide-producing and mucin-degrading bacteria. They observed that the microbiota facilitates heme-induced hyperproliferation by opening the mucus barrier. Bacterial hydrogen sulfide can reduce the S-S bonds in polymeric mucin, thereby increasing the mucus layer permeability for mucin-degrading bacteria and cytotoxic micelles (Ijssennagger et al., 2015). Antibiotic treatment prevented the heme-induced cell damage and diminished the expression of cell cycle genes.

It has been shown that a small set of metabolites can modify host physiology; however, numerous metabolites in humans have not been investigated (da Silva et al., 2015). Therefore, further research to categorize new metabolites and their transport mechanisms, and to characterize the biotransformation processes carried out by the microbiome, is a top priority for identifying biomarkers such as compounds, specific taxonomic components or metagenomic-enriched functions. Integrating these studies with epidemiological, clinical or nutritional data can provide clues for the search for these biomarkers.

### Microbiota-Mediated Inflammation in Cancer

The symbiotic nature of the intestinal host-microbial relationship poses health challenges. The immune system has developed adaptations to contain the microbiome while preserving the symbiotic relationship (Hooper et al., 2012). However, opportunistic invasion of host tissue by resident bacteria has serious health consequences including inflammation. Chronic inflammation and inflammatory factors, such as reactive oxygen and nitrogen species, cytokines, and chemokines, can contribute to tumor growth and spread (Garrett, 2015).

Increasing evidence indicates that colonizing microbes can drive cancer development and progression by direct or indirect effects on host tissues (Gagliani et al., 2014). Pattern recognition receptors (PRR) recognize specific conserved microbial patterns (bacterial cell walls, nucleic acids, motility apparatuses). The most studied PRR related to CRC belong to the groups of intracellular Nod-like receptors (NLR) and Toll-like receptors (TLR). Following microbial sensing, these PRR engage a complex set of signaling proteins that shape the host immune and inflammatory response (Jobin, 2013). Some NLR family members, such as NOD-2, NLRP3, NLRP6, and NLRP12, may play a role in mediating CRC (Garrett, 2015). Mice deficient in NOD-2 showed a proinflammatory microenvironment that enhanced epithelial dysplasia following chemically induced injury (Couturier-Maillard et al., 2013), and those deficient in NLRP6 showed enhanced inflammation-induced CRC formation (Hu et al., 2013).

Activation of TLR results in feed-forward loops of activation of NF-κB. Microbes associated with cancer appear to activate NF-κB signaling within the tumor microenvironment. NF-κB was more activated (increased nuclear translocation of the p65 NF-κB subunit) in tumors with a high Fusobacterium nucleatum (F. nucleatum) abundance in human colorectal cancer (Kostic et al., 2013). NF-κB is a master regulator of the inflammatory response, and it acts in a cell type-specific manner, activating survival genes within cancer cells and inflammation-promoting genes in components of the tumor microenvironment. NF-κB activation is prevalent in carcinomas and is mainly driven by inflammatory cytokines within the tumor microenvironment (Didonato et al., 2012). The FadA adhesin of F. nucleatum has also been shown to bind to E-cadherin, activate β-catenin signaling and differentially regulate the inflammatory and oncogenic responses in the colon tissue from patients with adenomas and adenocarcinomas (Rubinstein et al., 2013). In vitro studies have also revealed that the Fap2 protein from F. nucleatum can help tumor cells evade the immune system by binding the inhibitory receptor TIGIT in natural killer cells and inhibiting their cytotoxic activities (Gur et al., 2015). These observations of tumor zones enriched in Fusobacterium indicate that the local microbiome conformation is not random and can play an important role in the procancerous phenotype.

The immune system within the tumor microenvironment is not restricted to the innate cells, which present infectious agents to cells of the adaptive immune system so that these can respond selectively and specifically to them. Some adaptive immune responses can be protumorigenic; for instance, upon contact with specific bacteria, CD4+ T cells can produce cytokines that promote tumor progression (Gagliani et al., 2014). IL-23, a cytokine mainly produced by tumor-associated myeloid cells activated by microbial products such as flagellin, promotes tumor growth and progression and the development of a tumoral IL-17 response (Grivennikov et al., 2012). Enterotoxigenic Bacteroides fragilis, which secretes B. fragilis toxin, causes inflammation in humans, triggers colitis, and strongly induces colonic tumors in multiple intestinal neoplasia (Min) mice. The enterotoxigenic B. fragilis induces STAT3 signaling characterized by a selective TH17 response for colonic hyperplasia and tumor formation (Wu et al., 2009). TH17 cells produce other cytokines besides IL-17, such as IL-22, another cytokine linked to human colon cancer by activation of STAT3 (Jiang et al., 2013).

Notably, inflammation can be associated with other malignant phenotypes that can synergistically act as risk factors for cancer development. For instance, obesity can also generate overrepresentation of bacterial species that produce procarcinogenic metabolites, such as SBAs (Louis et al., 2014). Dysbiosis present in obese individuals alters the gut epithelial barrier, making it more permeable to microbial products that activate immune cells in the lamina propria and reach the liver via the portal circulation; this contributes to the production of proinflammatory cytokines, such as TNF and IL-6 (Font-Burgada et al., 2016). Barrier deterioration was shown to be a major contributor to colorectal tumorigenesis by microbial products that trigger tumor-elicited inflammation (Grivennikov et al., 2012).

#### INTEGRATIVE ANALYSIS AND THE CHALLENGES IN SYSTEMS BIOLOGY

Given that cancer can be produced by a myriad of genetic and environmental factors, understanding its mechanisms and designing optimal treatments calls for computational schemes capable of integrating heterogeneous HT data to move toward personalized and predictive medicine. Among these factors, the microbiome composition in patients constitutes an important component to induce carcinogenesis or other dysfunctional states in human tissues (Thomas and Jobin, 2015).

An explanation of how the microbiome contributes to the physiological state of the host emerged from the observation that microbes are metabolic partners, so that the nutritional habits of the host can induce the dysregulation of biological processes and consequently alter the phenotypic state. For instance, foods enriched in phosphatidylcholine, choline or carnitine, such as red meat and fatty foods, can be metabolized by gut microbes to produce trimethylamine. Liver enzymes can further convert it to trimethylamine-N-oxide (TMAO), and this metabolite has proatherogenic properties (Koeth et al., 2013). Knowledge about the microbiome composition and levels of its derived metabolite TMAO predicted the probability of suffering a cardiovascular problem, by means of platelet hyperresponsiveness. Moreover, the thrombosis potential was transmissible as a microbiome-dependent trait (Zhu et al., 2016). In the case of type 2 diabetes, fasting plasma concentrations of branched-chain (BCAA) and aromatic amino acids were higher in people who developed diabetes, and this signature was predictive of developing the disease more than a decade later (Wang et al., 2011). Interestingly, a metagenomic signature identified in fecal samples from patients with diabetes was the enrichment in metabolic pathways for transport of BCAA and oxidative stress (Qin et al., 2012). It is to be expected that, in the near future, the identification of cancer biomarkers and microbiome signatures, and their implementation in mechanistic models, will also aid in predicting cancer risk and prognosis.

Thus, microbiota metabolism is a cornerstone for maintaining human and microbial symbiosis, whose involvement in signal transduction and transcriptional regulation is capable of inducing wellness or disease in the human body (Chubukov et al., 2014). More importantly, the heterogeneous composition observed in individual microbiota provides evidence for the usefulness of personalized studies in terms of genetic backgrounds, lifestyle, nutrition and environmental factors. Even though these findings are currently supported with experimental evidence, how a community of organisms consumes and exchanges its metabolic and cross-signaling products, and how this dynamical behavior influences the phenotypic state of the human host, is still an open question.

To decode this bewildering complexity and uncover its underlying mechanisms, combined strategies with available data coming from different HT technologies and conceptual schemes from systems biology have been employed. Currently, in systems biology, some paradigms have been suggested to reach this combined description, including genome-scale metabolic reconstructions and constraint-based modeling (Bordbar et al., 2014). The implementation of this paradigm has made it possible to explore the metabolic phenotypes of isolated microorganisms and has successfully contributed to areas such as in vitro microbial evolution and organisms with biotechnological and therapeutic applications (Resendis-Antonio et al., 2007; Bordbar et al., 2014). More fundamentally, these schemes have served as a guide to characterize the metabolic activity of human tissues and explore the metabolic phenotypes in cancer (Resendis-Antonio et al., 2010; Lewis et al., 2012). Remarkably, genome-scale metabolic reconstruction and computational modeling have extended their scope. Currently, it is possible to model the metabolic interaction between different tissues in the human body (Bordbar et al., 2011), and new approaches are pointing toward the integration of models for human-microbiome interaction to explore the metabolic activity in a community of microorganisms (Heinken and Thiele, 2015; Shoaie et al., 2015). Notably, these approaches pave the path toward quantitative models able to predict the metabolic profile in a community of microorganisms and to explore the mechanisms by which their metabolic products could drive the development of cancer.
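As a point of reference, constraint-based modeling of a genome-scale reconstruction is commonly posed as flux balance analysis, a linear program over steady-state reaction fluxes. The formulation below is the standard textbook form rather than one stated in this review; the symbols $S$ (stoichiometric matrix), $v$ (flux vector), $c$ (objective weights, e.g., a biomass reaction), and the bounds $v^{\min}$, $v^{\max}$ are notation introduced here for illustration.

$$
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
$$

Under this reading, the steady-state constraint $S\,v = 0$ encodes mass balance for every internal metabolite, while the bounds can carry diet- or condition-specific information such as nutrient uptake limits.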

Although this is a titanic enterprise, systems biology is a cornerstone in precision medicine for moving toward: (1) the coherent interpretation of heterogeneous HT data; (2) identification of potential biomarkers in cancer; and (3) the optimal design of personalized treatments in clinical trials (Wang R.-S. et al., 2015). Among the immediate challenges that must be overcome to materialize those aims, the development of integrative conceptual schemes for HT data is important. Nonetheless, their capacity to provide meaningful biological insight will be the proof of concept. The development of methods for a coherent interpretation of data is particularly important in cancer studies, where massive genome characterization of a variety of cancers has been reported (Cancer Genome Atlas Network, 2015). The accumulation of enormous quantities of molecular data has led to the emergence of systems biology as a set of principles that underlie the basic functional properties of living organisms, evaluating and interpreting interactions between molecules (Kristensen et al., 2014). From a systems biology perspective, the use of genomic technologies and computational procedures may provide molecular approaches to early disease detection and opportunities for identifying high-risk individuals, thus contributing to timely diagnosis (Stewart et al., 2015). In terms of cancer and the microbiome, the computational platform from systems biology should be able to integrate HT data such as metagenome and metatranscriptome data for building hypotheses of host-microbiota metabolic activity, and eventually evaluate its role in cancer development (Bäckhed et al., 2012). The hypotheses generated using this approach can be contextualized with the nutritional information of patients, genetic variability, immune status or clinical record. As stated before, integrating microbiome analysis and host data has the potential to predict the disease outcome and has recently been explored in microbiome-related diseases.

Studying the microbiome variation over time offers an exceptional window to understand the properties leading to health and disease states. To date, few longitudinal microbiome studies have been conducted on humans, mainly using 16S rRNA sequencing and observing changes in microbial diversity for over a year (Caporaso et al., 2011; David et al., 2014). Results from whole shotgun metagenomics over time are consistent with 16S studies, indicating both small taxonomic and functional variation over time in the absence of perturbations (Voigt et al., 2015). Although those whole shotgun metagenomic studies are scarce, it is expected that falling sequencing prices will promote their application. From early exposure at birth to adulthood, factors such as diet, immunological tolerance, environment and microbe-microbe interactions can account for preferred taxonomic compositions (Wu et al., 2011; Costello et al., 2012; Nutsch et al., 2016). Although these factors can include a stochastic component, robustness is observed in tissue-specific microbiome identities maintained over time (Caporaso et al., 2011). Notably, when the community suffers a perturbation, taxonomically related bacteria are preferred as substitutes and subject-specific proportions are maintained within the same taxa (David et al., 2014).
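To make the notion of tracking diversity over time concrete, the following minimal sketch computes two widely used summary statistics for a longitudinal series of taxon counts: the Shannon index within each sample and the Bray-Curtis dissimilarity between consecutive samples. It is an illustration only; the function names and the toy count table are hypothetical and are not taken from the cited studies.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for four taxa at three consecutive time points.
samples = [
    [40, 30, 20, 10],
    [38, 32, 19, 11],
    [5, 70, 20, 5],  # a perturbed time point, for contrast
]

# Within-sample diversity at each time point.
diversity_over_time = [shannon_diversity(s) for s in samples]

# Compositional change between each pair of consecutive time points.
turnover = [bray_curtis(a, b) for a, b in zip(samples, samples[1:])]
```

In this toy series the second transition yields a larger Bray-Curtis value than the first, which is the kind of signal a longitudinal analysis would flag as a perturbation.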

Understanding the principles that rule microbiome dynamics is an important challenge for systems biology; nonetheless, new paradigms capable of integrating databases (Hood et al., 2014; Integrative HMP Research Network Consortium, 2014), empirically dissected patterns (Caporaso et al., 2011; David et al., 2014), and computational models (Stein et al., 2013; Mcgeachie et al., 2016) can help to achieve this goal. A hypothesis to explore in the future is the idea of early warning signals in the dynamical behavior of the microbiome that precede the progression of a human disease (Faust et al., 2015). Progress toward this aim will have a strong impact on translating basic knowledge into precision medicine.
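One common family of computational models for microbiome time series is the generalized Lotka-Volterra (gLV) system, in which each taxon has an intrinsic growth rate and pairwise interaction coefficients with the others. The sketch below integrates a two-taxon gLV model with a simple forward-Euler scheme. It is an assumption-laden illustration: the parameter values are hypothetical, and no claim is made that this reproduces the model of any cited study.

```python
def glv_step(x, growth, interactions, dt):
    """One forward-Euler step of dx_i/dt = x_i * (growth_i + sum_j A_ij * x_j)."""
    n = len(x)
    new_x = []
    for i in range(n):
        per_capita = growth[i] + sum(interactions[i][j] * x[j] for j in range(n))
        # Abundances cannot be negative; clip at zero.
        new_x.append(max(x[i] + dt * x[i] * per_capita, 0.0))
    return new_x


def simulate(x0, growth, interactions, dt=0.01, steps=5000):
    """Return the full trajectory (list of abundance vectors), including x0."""
    trajectory = [list(x0)]
    x = list(x0)
    for _ in range(steps):
        x = glv_step(x, growth, interactions, dt)
        trajectory.append(x)
    return trajectory


# Hypothetical two-taxon community with self-limitation and mutual competition.
growth = [1.0, 0.8]
interactions = [
    [-1.0, -0.5],
    [-0.4, -1.0],
]

trajectory = simulate([0.1, 0.1], growth, interactions)
final_state = trajectory[-1]
```

For these illustrative parameters the coexistence equilibrium can be found analytically by setting both per-capita growth terms to zero, which gives abundances of 0.75 and 0.5; the simulated trajectory settles close to that point. Early warning indicators, such as rising variance near a transition, would be computed on trajectories of this kind.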

In summary, systems biology suggests that human diseases are fundamentally a systems issue in which our phenotype (functional or dysfunctional) is an emergent property that results from host-microbiome interactions. Understanding how this property emerges at a molecular level is valuable for reaching one of the aims of precision medicine: more effective cancer treatments based on personalized genetic background and lifestyle. In this context, HT technologies and biochemical, physiological and clinical data can be organized and evaluated using a network approach that can be useful for predicting disease expression or response to therapies (Loscalzo and Barabasi, 2011). Finally, addressing these aims will contribute positively to understanding the biological mechanisms in human diseases and to providing the right treatment for the right patients at the right moment, with clinical strategies based on genomics, proteomics, and metabolomics that take into account the behavioral and environmental background of individual patients. All these schemes aim to improve diagnostic power.

### TOWARD THE CLINICAL APPLICATIONS OF HOST-MICROBIOME INTERACTIONS IN CANCER

The development of diagnostic tests using biomarkers to be applied for early detection is likely a key aspect for precision medicine. For example, the immunosignature approach leverages the response of antibodies to disease-related changes and can be used for the simultaneous classification of multiple cancers (Stafford et al., 2014). In addition, researchers have evaluated the potential of the fecal microbiota for early-stage detection of CRC and as a screening tool to differentiate between healthy, adenoma, and carcinoma clinical groups (Zackular et al., 2014). Using metagenomic sequencing, it is possible to identify microbiome signatures able to distinguish CRC patients from tumor-free controls (Zeller et al., 2014).

Conversely, germ-free status and treatment with antibiotics have been shown to reduce the number of tumors in genetic experimental models of CRC, suggesting the use of antibiotics to knock out cancer-promoting gut microbes (Schwabe and Jobin, 2013; Thomas and Jobin, 2015). For instance, cefoxitin treatment resulted in complete clearance of enterotoxigenic Bacteroides fragilis, a microbe that causes IL-17A-dependent colon tumors. Bacteroides fragilis eradication reduced tumorigenesis and decreased mucosal IL-17A expression (DeStefano Shields et al., 2016). Nonetheless, clinical studies must be developed to test the clinical effectiveness and the potential effect on the whole human microbiome.

Other players must be taken into account in shaping the microbiome. From environmental studies, it has been established that bacteriophages shape bacterial community structure and function via predation and gene transfer (Chibani-Chennoufi et al., 2004). In contrast to antibiotics, lytic phages used to treat human bacterial diseases are fairly specific, usually targeting only a subgroup of strains within one bacterial species. For instance, when a bacteriophage cocktail was used to treat Shigella sonnei in a mouse model, bacteriophage administration significantly reduced Shigella colonization without deleterious side effects or distortions in the gut microbiota (Mai et al., 2015). Taking this into account, bacteriophages have been proposed as a way to target specific strains of bacteria that are implicated in cancer, while leaving the rest of the microbiome unchanged (DeWeerdt, 2015).

In addition, with diet being a key determinant shaping the gut microbiome, dietary interventions and probiotics that promote the development of microorganisms providing health benefits are an attractive way to prevent or treat diseases such as cancer. For example, a curcumin-supplemented diet increased survival and entirely eliminated tumor burden in a mouse model of colitis-associated colorectal cancer. The beneficial effect of curcumin on tumorigenesis was associated with the maintenance of a more diverse colonic microbial ecology (Mcfadden et al., 2015). Furthermore, dietary intervention with polyphenol extracts modulates the human gut microbiota toward a healthier profile, increasing the relative abundance of bifidobacteria and lactobacilli (Marchesi et al., 2015). The beneficial effects of natural polyphenols and their synthetic derivatives are extensively studied in the context of cancer prophylaxis and therapy (Lewandowska et al., 2016).

In terms of reducing gastrointestinal inflammation and preventing CRC, beneficial roles of probiotics have been demonstrated. Moreover, a novel probiotic mixture suppressed hepatocellular carcinoma growth in mice; shotgun-metagenome sequencing revealed the crosstalk between gut microbial metabolites and hepatocellular carcinoma development (Li et al., 2016). Probiotics shifted the gut microbial community toward certain beneficial bacteria, including the genera Prevotella and Oscillibacter, which are producers of anti-inflammatory metabolites (Li et al., 2016; **Figure 4**).

Another area of clinical implications of the microbiome relates to its influence on the host's immune system response against pathogens and cancer (Abt et al., 2012; Belkaid and Hand, 2014). For instance, the reduction of intestinal microbes with antibiotics ablated not only the effect of immunotherapy directed at TLRs but also the effectiveness of platinum chemotherapy (Iida et al., 2013). Another type of immunotherapy against cancer relies on immune-checkpoint blockers (ICB). Both Cytotoxic T-Lymphocyte Antigen-4 (CTLA-4) and Programmed Death-1 (PD-1) are receptors that dampen T cell responses, and blocking these receptors with antibodies is approved for patients with advanced melanoma to enhance tumor recognition and elimination (Rotte et al., 2015). In vivo studies have shown that CTLA-4 blockade reduces tumor growth in specific pathogen-free mice but not in germ-free mice. This effect relied on the presence of the gut microbiota and the activation of both CD4+ TH1 cells and dendritic cells (DCs). Moreover, in melanoma patients who responded to anti-CTLA-4 treatment, the abundance in the stool of Bacteroides thetaiotaomicron and B. fragilis correlated with the response to therapy. This protective effect can be transferred to mice by a fecal microbial transplant (FMT) (Vétizou et al., 2015). In addition, Sivan et al. found that differences in the gut microbiome composition in mice of the same strain can alter the response to PD-1 blockade, with mice whose stool samples were enriched in Bifidobacterium spp. showing robust CD8+ T cell tumor infiltration and DC activation. The protective effect was also transferred between mice by means of FMT (Sivan et al., 2015).

Those results indicate an important modulatory role of both the activities and the composition of the microbiota on the immune response to cancer. Importantly, the effectiveness of treatment in those experiments depended on the integrity of the gut microbiome, but its effects extend to the systemic level. From a translational point of view, the manipulation of the microbiome composition by means of probiotics, prebiotics or even FMT can have therapeutic benefits in cancer treatment.

#### INFORMATION MANAGEMENT IN PRECISION MEDICINE

Analysis of the massive amounts of data generated by HT technology and personal clinical records requires computational capacity to handle these data and, as a consequence, unveil the biological information of interest. Data collection, storage, and handling, together with privacy policies for personalized genome data, become central issues that must be solved. Information management faces different challenges that can be classified into three aspects: storage, structural organization, and safety. The storage problems have been addressed by buying or leasing space in the cloud hosting systems of large technology companies that simultaneously have been developing applications for data analysis. The structural organization involves the appropriate classification of personal records and HT data and the development of an efficient and optimized mechanism to look for the desired information across heterogeneous biological databases.

Finally, given the social, ethical and legal implications of personalized information, it should be stored in a protected way. To this end, the management system can include protocols for prevention and protection, access control and a plan of action to prevent the loss of information when some event endangers the integrity and security of the data (https://www.whitehouse.gov/sites/whitehouse.gov/files/documents/PMI\_Security\_Principles\_and\_Framework\_FINAL\_022516.pdf).

#### PERSPECTIVES

Compositional and functional alterations of the human microbiome have been related to the development of complex diseases such as cancer, type 2 diabetes and obesity. As previously mentioned, host-microbiome interactions play a major role in determining the metabolic phenotype in the host, and, more importantly, their particular composition can serve as a potential measurement for establishing wellness and monitoring the evolution of diseases. This notion has not only changed our paradigm of how our body works, like a superorganism, but also unveiled the outstanding role that microorganisms play in establishing wellness or disease states. Although outstanding breakthroughs have been accomplished in uncovering this connection, new conceptual schemes able to integrate innovative HT technologies and computational modeling are required to improve our measurements and elucidate the fundamental mechanisms.

For instance, single-cell genomics has the potential to assemble the genomes of viruses and microorganisms that are at low frequencies, thus contributing to a better characterization of the biological samples (Gawad et al., 2016). There has been extraordinary progress in single-cell DNA and RNA sequencing for cancer research, specifically regarding evolution, diversity of cells in tumor progression, and intra-tumor heterogeneity depending on spatial localization of single cancer cells in tissue sections (Crosetto et al., 2015; Navin, 2015; Gawad et al., 2016). In the context of host-microbiome interactions, using the spatial information of the surrounding microbiome state and measurements of intra-tumor genetic heterogeneity might have prognostic utility for predicting which patients will be more likely to show poor response to therapy, higher probability of metastasis, or poor overall survival (Burrell et al., 2013; Murugaesu et al., 2013; Almendro et al., 2014). These and other HT technologies permit us to characterize the microbiome composition and open the possibility of therapeutic applications with a focus on precision medicine: the notion of a precision medicine with treatments applied at the right time, at the right dose, and for the right patient.

Precision medicine based on powerful HT technologies for characterizing patients, such as genomics, proteomics and metabolomics, and computational tools for analyzing large sets of data will integrate the discovery of biomarkers and the electronic medical records to provide evidence for the improvement of clinical practice. The big challenge of data analysis of HT technologies is the development of new computational algorithms to improve the integration of the information from different platforms. In this context, deep learning and machine learning have been proposed as good alternatives to perform these tasks (Eddy, 2009; https://arxiv.org/pdf/1603.06430.pdf). In addition, ambitious projects in precision medicine need to leverage important resources, such as research cohort biobanks for longitudinal research studies, and an efficient bioinformatics system that aids in the translation from biomedical research to molecular targeting and identification of biomarkers that correlate with the disease state. Intensive investigations are being conducted to illustrate how microbiome profiles, taking into account relationships with the host, could be used as biomarkers to revolutionize prognostication in cancer.

However, the interindividual variations in microbiome composition can potentially influence cancer evolution and the effectiveness of treatment. Cross-sectional studies in large cohorts showed no evidence of a "core" of OTUs shared among healthy subjects (Huse et al., 2012). This highlights geography, ancestry, diet and age as crucial factors shaping the microbiome composition (Yatsunenko et al., 2012). On the other hand, at the metagenomic level, several functions or pathways are more consistent across individuals than taxonomic composition (HMP Consortium, 2012b; Knights et al., 2014). This evidence suggests both a robustness in functions and membership redundancy in the microbiome.
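The contrast between a missing taxonomic core and a conserved functional core can be illustrated with a small prevalence calculation. The sketch below is a toy example under stated assumptions: the function name, subject IDs, OTU labels, and pathway labels are all hypothetical, and real analyses would also need to handle sequencing depth and detection thresholds.

```python
def core_features(samples, min_prevalence=1.0):
    """Return features detected in at least `min_prevalence` fraction of subjects.

    `samples` maps a subject ID to the set of features (OTUs or pathways)
    detected in that subject.
    """
    if not samples:
        return set()
    n_subjects = len(samples)
    counts = {}
    for features in samples.values():
        for feature in set(features):
            counts[feature] = counts.get(feature, 0) + 1
    return {f for f, c in counts.items() if c / n_subjects >= min_prevalence}


# Hypothetical detections: taxa differ between subjects, functions overlap.
otus = {
    "s1": {"otu_a", "otu_b"},
    "s2": {"otu_b", "otu_c"},
    "s3": {"otu_c", "otu_d"},
}
pathways = {
    "s1": {"butyrate_synthesis", "bcaa_transport"},
    "s2": {"butyrate_synthesis"},
    "s3": {"butyrate_synthesis", "bcaa_transport"},
}

strict_otu_core = core_features(otus)                 # shared by all subjects
relaxed_otu_core = core_features(otus, 0.5)           # shared by at least half
strict_pathway_core = core_features(pathways)         # shared by all subjects
```

In this toy data no OTU is present in every subject, whereas one pathway is, mirroring the pattern of functional consistency despite taxonomic variation described above.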

Harnessing the microbiome to improve cancer diagnosis and treatment is challenging given this interindividual variation. As we are still uncovering the mechanisms behind the emergent phenotypes from host-microbiome interactions, profiling the microbiome composition and its functional properties (i.e., metatranscriptomics and metabolomics) can provide more insight on the significance of different compositions in different states (Shade and Handelsman, 2012; Shafquat et al., 2014). Nonetheless, identifying conserved or consistently altered functions of the microbiome can also be elusive. A comparison of functional alterations at the metagenomic level revealed some overlapping, but no universal biomarkers between cohorts with type 2 diabetes (Qin et al., 2012; Karlsson et al., 2013). Thus, characterizing microbiome biomarkers should take into account the specific traits of the populations under study.

Finally, precision medicine faces many other challenges that will need to be addressed not only from a scientific point of view but also from a social and ethical one, including the proper distribution of the benefits of these technologies across most regions of the world and the development of reliable computational platforms that allow data to be stored in a confidential, private, and protected manner. Likewise, coordination with ethics committees and regulators will be necessary to use this information without risk of infringing patients' rights, and to understand and regulate the legal, social, and economic implications.

#### AUTHOR CONTRIBUTIONS

All authors contributed extensively to the design and discussions of the material presented in this review. All authors participated in writing the final version of the paper.

#### FUNDING

The authors acknowledge financial support from the Research Chair on Systems Biology-FUNTEL and Instituto Nacional de Medicina Genomica, Mexico.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphys.2016.00606/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer AA and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Contreras, Cocom-Chan, Hernandez-Montes, Portillo-Bobadilla and Resendis-Antonio. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Personalized Prediction of Proliferation Rates and Metabolic Liabilities in Cancer Biopsies

Christian Diener<sup>1</sup> and Osbaldo Resendis-Antonio<sup>1,2</sup>\*

<sup>1</sup> Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica, México City, Mexico, <sup>2</sup> Coordinación de la Investigación Científica, Red de Apoyo a la Investigación, UNAM, México City, Mexico

Cancer is a heterogeneous disease and its genetic and metabolic mechanisms may manifest differently in each patient. This creates a demand for studies that can characterize phenotypic traits of cancer on a per-sample basis. Combining two large data sets, the NCI60 cancer cell line panel and The Cancer Genome Atlas, we used a linear interaction model to predict proliferation rates for more than 12,000 cancer samples across 33 different cancers from The Cancer Genome Atlas. The predicted proliferation rates are associated with patient survival and cancer stage and show a strong heterogeneity in proliferative capacity within and across different cancer panels. We also show how the obtained proliferation rates can be incorporated into genome-scale metabolic reconstructions to obtain the metabolic fluxes for more than 3000 cancer samples, which identified specific metabolic liabilities for nine cancer panels. Here we found that affected pathways coincided with the literature, with pentose phosphate pathway, retinol, and branched-chain amino acid metabolism being the most panel-specific alterations and fatty acid metabolism and ROS detoxification showing homogeneous metabolic activities across all cancer panels. The presented strategy has potential applications in personalized medicine since it can leverage gene expression signatures for cell line-based prediction of additional metabolic properties, which might help in constraining personalized metabolic models and improve the identification of metabolic alterations in cancer for individual patients.

#### Edited by:

Linda Pattini, Politecnico di Milano, Italy

#### Reviewed by:

Scott H. Harrison, North Carolina Agricultural and Technical State University, USA Marco Vanoni, University of Milano-Bicocca, Italy

\*Correspondence:

Osbaldo Resendis-Antonio oresendis@inmegen.gob.mx; resendis@cic.unam.mx

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 03 October 2016 Accepted: 09 December 2016 Published: 27 December 2016

#### Citation:

Diener C and Resendis-Antonio O (2016) Personalized Prediction of Proliferation Rates and Metabolic Liabilities in Cancer Biopsies. Front. Physiol. 7:644. doi: 10.3389/fphys.2016.00644

Keywords: systems biology, personalized medicine, proliferation, flux balance analysis, TCGA, NCI60

## INTRODUCTION

Cancer is a heterogeneous disease that manifests in a wide variety of geno- and phenotypes. There is no single treatment that works for all cancer types, and even cancers of the same phenotype may show large genomic or metabolic differences (Hu et al., 2013; Andor et al., 2015; Hensley et al., 2016). Due to this, there has been an ongoing effort to characterize the particular signatures of cancer in the genome and transcriptome (Mazor et al., 2016; Tirosh et al., 2016) and elucidate their tissue-specific consequences for cancer patients. Two of the largest projects describing genomic and expression features of several cancers are the NCI60 and TCGA projects (Scherf et al., 2000; Shoemaker, 2006; Koboldt et al., 2012; Zheng et al., 2016). Currently, NCI60 comprises 60 cancer cell lines and their full genetic, transcriptomic and proteomic characterization. The Cancer Genome Atlas project has similar goals but for cancer samples coming from several thousand patients. Detailed studies of those data sets have revealed the variation inherent even within a single cancer panel and provide great potential for uncovering the genomic differences that drive the strong variability in cancer phenotypes (Hoadley et al., 2014).

The NCI60 and TCGA databases concentrate on genomic characterizations of distinct cancers, which creates the challenge of connecting those data to metabolism, which itself is closely connected to the cancer phenotype by providing the macromolecules required for proliferation (Boroughs and DeBerardinis, 2015). Here, the cell lines contained in NCI60 have been characterized in more detail by providing the proliferation rates for the majority of the 60 cell lines (in the form of doubling times). Due to the inherent complications in measuring those quantities in patients, TCGA includes clinical indicators but lacks biological characterizations of the cancer samples outside of genomic data. In particular, TCGA lacks quantification of cancer proliferation.

In general, inference of metabolic properties from genome and gene expression data is a difficult task due to the many post-transcriptional and post-translational regulatory mechanisms involved in central carbon metabolism that are usually not fully captured by mRNA or protein abundance alone. Consequently, there have been many attempts to infer the metabolic state by computational methods. Here, flux balance analysis (FBA) is the most prominent one and has proven to be helpful in the analysis of cancer metabolism in cell lines and tissue-specific metabolic models (Orth et al., 2010; Resendis-Antonio et al., 2010; Agren et al., 2014; Yizhak et al., 2015). There are several algorithms performing this task but they all aim to reconcile gene expression or proteome data with the presence of distinct biochemical reactions in the model in some way or another (Becker and Palsson, 2008; Agren et al., 2012; Wang et al., 2012). The major limitations of those models are the lack of metabolic data and the weak association between enzyme expression and metabolic fluxes. Due to this, many of the reconstruction methods use discretized enzyme expression values in order to exclude biochemical reactions whose enzyme is lacking (Wang et al., 2012; Pornputtapong et al., 2015; Schultz and Qutub, 2016). This strategy has been shown to be a promising approach for constraining the feasible metabolic space in cells or tissues and predicting the metabolic capacities of several cancers (Agren et al., 2014). One of the challenges in using FBA-based methods is finding sufficient constraints to identify a unique set of metabolic fluxes for a biological sample. Here, parsimonious FBA, where one only uses the most economic flux distribution for a metabolic objective, has been shown to reproduce experimentally measured fluxes and may in some cases even outperform methods based on gene expression data (Lewis et al., 2010; Machado and Herrgård, 2014). Furthermore, it has also been shown that knowledge of the associated proliferation rate improves those predictions, making it desirable to complement expression data with at least a limited set of fluxome data such as growth rates or measurements of key fluxes (Yizhak et al., 2014). Growth rates for simpler eukaryotes can be predicted from gene expression signatures (Airoldi et al., 2009), thus raising the question of whether one can identify growth or proliferation rates for clinical samples from gene expression data.

The combination of genome-scale metabolic modeling, personalized reconstruction, and inference of additional metabolic constraints forms the core of a strategy that shows high promise for personalized medicine. Here, accurate prediction of metabolic fluxes may help to identify distinct metabolic alterations and the causality underlying diseases in individual patients by identifying a patient-specific set of altered metabolic processes (Bordbar et al., 2015; Resendis-Antonio et al., 2015).

In this work we present a strategy capable of predicting proliferation rates for more than 12,000 cancer samples in the Cancer Genome Atlas by training a machine learning model for proliferation on the NCI60 data set. We show that the predicted proliferation rates correspond well with clinical data and employ them to estimate the fluxes driving cancer proliferation for more than 3500 samples from nine different cancer subtypes. Overall, our study provides a computational strategy that is able to predict the proliferation rate of cancer biopsies from cell line gene expression data alone and this allows detailed surveys of the potential metabolic activity underlying each case. As a result, our methodology can contribute to the identification of the common and specific metabolic alterations associated with cancers across different tissues, which is of importance during the development of personalized treatments for cancer.

### DATA AND METHODS

### Data Availability and Software

All source code and additional data needed to run the analysis are hosted on GitHub in a dedicated paper repository at https://github.com/cdiener/proliferation and are archived by Zenodo (http://doi.org/10.5281/zenodo.166813). We also provide intermediate data sets for the NCI60 (http://doi.org/10.5281/zenodo.61980) and TCGA data (http://doi.org/10.5281/zenodo.61982). The repository includes Rmarkdown documents (http://rmarkdown.rstudio.com/) detailing the exact steps to produce the reported results and this information is also contained in the Supplementary Protocol S1 in PDF format. Respective software versions are reported in Protocol S1 under "Software versions." We also provide a docker image in order to reproduce our entire analysis interactively on a local machine or in the cloud at https://hub.docker.com/r/cdiener/proliferation.

### NCI60 and TCGA Data Sets

HuEx ST 1.0 gene expression data for the NCI60 cancer cell lines was obtained from the GEO database from experiment GSE29682 (Reinhold et al., 2010; Barrett et al., 2013). The data was read using the oligo package from Bioconductor and normalized by RMA (Carvalho and Irizarry, 2010). This was followed by a summary step where we calculated the expression for each gene in each sample as the mean log expression across all probesets that were mapped to this gene. Here, the probeset-gene mapping was obtained from biomart (http://www.biomart.org) and is also provided in the paper repository (Smedley et al., 2015). Finally, replicates for a given cell line were summarized again by obtaining the mean log expression values across all replicates for a given cell line and gene.
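The two summary steps described above (probesets to genes, then replicates to cell lines) can be sketched as follows. This is a minimal Python illustration and not the R code from the paper repository; the function names `summarize_genes` and `summarize_cell_lines` as well as all identifiers and values in the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_genes(probe_expr, probe_to_gene):
    """Mean log expression per gene across all probesets mapped to it.

    probe_expr: {probeset_id: log_expression} for a single sample.
    probe_to_gene: {probeset_id: gene_id}; unmapped probesets are skipped.
    """
    by_gene = defaultdict(list)
    for probe, value in probe_expr.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            by_gene[gene].append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}


def summarize_cell_lines(sample_expr, sample_to_line):
    """Mean log expression per gene across all replicates of a cell line.

    sample_expr: {sample_id: {gene_id: log_expression}}.
    sample_to_line: {sample_id: cell_line_name}.
    """
    collected = defaultdict(lambda: defaultdict(list))
    for sample, genes in sample_expr.items():
        line = sample_to_line[sample]
        for gene, value in genes.items():
            collected[line][gene].append(value)
    return {
        line: {gene: mean(values) for gene, values in genes.items()}
        for line, genes in collected.items()
    }


# Toy data: two replicates of one cell line, three probesets, two genes.
mapping = {"p1": "geneA", "p2": "geneA", "p3": "geneB"}
rep1 = summarize_genes({"p1": 6.0, "p2": 8.0, "p3": 5.0}, mapping)
rep2 = summarize_genes({"p1": 7.0, "p2": 9.0, "p3": 3.0}, mapping)
lines = summarize_cell_lines(
    {"s1": rep1, "s2": rep2}, {"s1": "line1", "s2": "line1"}
)
print(lines)
```

Averaging on the log scale, as the text describes, corresponds to a geometric mean of the raw intensities, which is less sensitive to single very bright probesets.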

TCGA data was obtained and parsed from the NCI Genomic Data Commons (GDC) repository (see https://gdc-portal.nci.nih.gov/). HuEx 1.0 ST data was obtained from the GDC legacy archive (https://gdc-portal.nci.nih.gov/legacyarchive). Download and parsing was performed in an automated manner using the tcgar package for the R programming language (https://github.com/cdiener/tcgar) which we created for that purpose. A complete list of downloaded files can be found in the "GDC" subfolder of the data repository (https://github.com/cdiener/proliferation). All analysis was based on Level 3 data (already preprocessed data) since this subset is available to the general public.

### Generalized Linear Models

Generalized linear models were fitted using the glmnet package for R (Friedman et al., 2010). Regularization was performed using the L1 norm where the regularization strength λ was chosen as the one yielding the smallest mean squared error during cross-validation. In order to improve regularization we also discarded very small coefficients in the final step of feature selection. Thus, for the final model we included only coefficients larger than the first quartile (25% quantile) of the non-zero absolute coefficients (see Protocol S1). The resulting fits were analyzed using a set of five metrics, namely mean squared error (mse), root mean squared error (rmse), mean absolute error (mae), mean relative error (mre) and R<sup>2</sup>. Those metrics were calculated for the training set as well as for leave-one-out cross-validation. Here, predictive power was evaluated by the leave-one-out cross-validation alone.
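The five metrics and the coefficient cutoff can be written out explicitly. The sketch below is a plain Python illustration rather than the glmnet-based R code used in the paper. It assumes that R<sup>2</sup> is the coefficient of determination and that the quartile follows the linear-interpolation definition (the default of R's `quantile`); the function names and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, quantiles


def fit_metrics(measured, predicted):
    """Return mse, rmse, mae, mre and R^2 for paired values."""
    errors = [p - m for m, p in zip(measured, predicted)]
    mse = mean(e * e for e in errors)
    mae = mean(abs(e) for e in errors)
    # Relative error is taken with respect to the measured value.
    mre = mean(abs(e) / abs(m) for e, m in zip(errors, measured))
    m_bar = mean(measured)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((m - m_bar) ** 2 for m in measured)
    return {
        "mse": mse,
        "rmse": sqrt(mse),
        "mae": mae,
        "mre": mre,
        "r2": 1.0 - ss_res / ss_tot,  # assumed definition of R^2
    }


def apply_cutoff(coefs, n_quantiles=4):
    """Keep coefficients larger than the first quartile of non-zero |coef|."""
    nonzero = [abs(c) for c in coefs.values() if c != 0.0]
    threshold = quantiles(nonzero, n=n_quantiles, method="inclusive")[0]
    return {name: c for name, c in coefs.items() if abs(c) > threshold}


metrics = fit_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
kept = apply_cutoff({"g1": 0.5, "g2": -0.01, "g3": 0.0, "g4": 0.2, "g5": -0.3})
print(metrics)
print(kept)
```

In the toy example the zero coefficient and the smallest non-zero coefficient are dropped, while the three larger coefficients are retained.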

### Flux Analysis

Flux analysis was performed using the Python programming language (https://python.org) and the COBRApy package (Ebrahim et al., 2013). Metabolic models were obtained from the Human Metabolic Atlas (https://metabolicatlas.org) using the available cancer models which contain a proliferation objective function (Gatto et al., 2014). Given the predicted proliferation rates r<sub>p</sub>, fluxes for the models were obtained by parsimonious flux balance analysis (pFBA) by first splitting each reversible reaction into its forward and backward reaction and then solving the resulting linear programming problem for each sample (Lewis et al., 2010):

$$\begin{array}{c} \text{minimize } \sum_{i} v_{i} \\ \text{subject to } S\mathbf{v} = \mathbf{0} \\ v_{i} \ge 0 \\ v_{p} = r_{p} \end{array} \tag{1}$$

Here, S denotes the stoichiometric matrix of the respective irreversible metabolic model, v<sub>i</sub> denotes the flux with index i and v<sub>p</sub> denotes the flux of the proliferation objective. Note that, given the proliferation rate r<sub>p</sub>, this does not require constraints for the fluxes other than positivity. Given the large number of optimization problems we employed a strategy similar to FastFVA during optimization, where each optimization was performed once de novo for each model and subsequent optimizations on the same model recycled the previous solution basis, which allows for fast computation of the fluxes (Gudmundsson and Thiele, 2010). Optimization was only performed for samples with a positive proliferation rate and in further analysis we only used fluxes which were non-zero for at least one sample, yielding a total of 1026 used fluxes.
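To make the structure of Equation (1) concrete, the following sketch builds the irreversible model for a toy network and compares two feasible flux vectors. It is a plain Python illustration, not the COBRApy workflow used in the paper, and it does not solve the linear program; the three-reaction network, the function names and the flux values are hypothetical.

```python
def split_reversible(S, reversible):
    """Return the stoichiometric matrix of the irreversible model.

    S: list of rows (metabolites), one column per reaction.
    reversible: one boolean per reaction.
    Every reversible reaction j gains an extra backward column equal to
    the negated column j, so that all fluxes can be bounded by v_i >= 0.
    """
    S_irr = [list(row) for row in S]
    origin = list(range(len(reversible)))  # source reaction of each column
    for j, is_rev in enumerate(reversible):
        if is_rev:
            for row_in, row_out in zip(S, S_irr):
                row_out.append(-row_in[j])
            origin.append(j)
    return S_irr, origin


def is_steady_state(S, v, tol=1e-9):
    """Check the mass balance constraint S v = 0."""
    return all(abs(sum(s * f for s, f in zip(row, v))) <= tol for row in S)


def pfba_objective(v):
    """Objective of Equation (1): the sum of all non-negative fluxes."""
    return sum(v)


# Toy network: R1: -> A, R2: A <-> B (reversible), R3: B -> (proliferation).
S = [[1, -1, 0],
     [0, 1, -1]]
S_irr, origin = split_reversible(S, [False, True, False])

r_p = 0.03  # fixed proliferation flux (arbitrary units)
direct = [r_p, 0.03, r_p, 0.0]   # no backward flux through R2
cycling = [r_p, 0.05, r_p, 0.02]  # same net flux with a futile cycle

for v in (direct, cycling):
    print(is_steady_state(S_irr, v), pfba_objective(v))
```

Both vectors satisfy the constraints and carry the same proliferation flux, but the one with the futile cycle has a larger total flux, so the parsimonious objective prefers the direct route.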

Specificity for a given cancer subtype was scored for each flux as the relative difference of the mean flux within the cancer panel vs. all other cancer panels.

$$s_p^i = \log_2 \mu_p^i - \log_2 \mu_o^i \tag{2}$$

Here, µ<sup>i</sup><sub>p</sub> denotes the mean of flux v<sub>i</sub> across all samples in cancer panel p and µ<sup>i</sup><sub>o</sub> the mean of flux v<sub>i</sub> in all other samples. Thus, the resulting specificity score s<sup>i</sup><sub>p</sub> describes the log fold-change of the target flux between the target cancer panel and all remaining samples.
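As a small worked example of Equation (2), the sketch below computes the specificity score of a single flux for one target panel. The panel labels and flux values are hypothetical, and the code assumes that both means are strictly positive since the text does not describe how zero means are handled.

```python
from math import log2
from statistics import mean


def specificity(flux_by_panel, target):
    """Equation (2) for one flux: log2 mean(target) - log2 mean(all others).

    flux_by_panel: {panel_name: [flux value per sample]}.
    """
    inside = flux_by_panel[target]
    outside = [
        v
        for panel, values in flux_by_panel.items()
        if panel != target
        for v in values
    ]
    return log2(mean(inside)) - log2(mean(outside))


flux = {"panelA": [4.0, 4.0], "panelB": [1.0, 1.0], "panelC": [1.0, 1.0]}
score_a = specificity(flux, "panelA")
score_b = specificity(flux, "panelB")
print(score_a, score_b)
```

Here the flux is four times higher in `panelA` than in the remaining samples, which corresponds to a score of 2, whereas `panelB` obtains a negative score because its mean lies below that of the other samples.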

Pathway enrichment was obtained by using an enrichment score similar to GSEA (Mootha et al., 2003; Subramanian et al., 2005). First, specificity scores s<sup>i</sup><sub>p</sub> were sorted from highest to lowest absolute value across all panels and fluxes, yielding the ranked list R containing n elements. Then we calculated a raw enrichment score for a metabolic pathway mp mapping to n<sub>h</sub> elements in R as

$$\begin{aligned} \text{ES} &= \max_{i} / \min_{i} \, \left[ P_{h}(i) - P_{m}(i) \right] \\ \text{where } P_{h}(i) &= \sum_{v_{j} \in mp,\; j \le i} R_{j} / n_{r}, \quad n_{r} = \sum_{v_{j} \in mp} R_{j} \\ \text{and } P_{m}(i) &= \sum_{v_{j} \notin mp,\; j \le i} 1 / (n - n_{h}) \end{aligned} \tag{3}$$

ES will be large when the respective pathway is enriched in the beginning of R (specific fluxes are enriched in the pathway), and will be negative when the pathway occurs in the tail of R (specific fluxes are depleted in the pathway). The score was then normalized by randomly permuting the pathway labels 100 times for each pathway, obtaining the respective mean permuted enrichment score ES<sub>perm</sub>, and calculating the normalized enrichment score as NES = ES/ES<sub>perm</sub>. Empirical p-values for the normalized enrichment scores were obtained from the 100 random permutations separately for the positive and negative tails. Thus, the normalized enrichment score NES denotes the fold change between the real pathway mapping and a randomly generated one. If NES is larger than one this denotes an enrichment of the given pathway in the specific fluxes, whereas a NES smaller than one denotes absence of the given pathway in the specific fluxes. Hence, NES > 1 identifies metabolic pathways that are active in a cancer panel-specific manner, whereas NES < 1 identifies metabolic pathways that are underrepresented in the panel-specific fluxes and thus form a set of core pathways whose activity does not vary across the cancer panels.
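The running-sum statistic of Equation (3) and its permutation-based normalization can be sketched as follows. This is an illustrative Python version and not the code from the paper repository. In particular, averaging only over permuted scores that share the sign of the observed ES is an assumption borrowed from GSEA, and the function names, the seed and the toy scores are hypothetical.

```python
import random


def enrichment_score(weights, in_pathway):
    """Raw ES of Equation (3) for a list ranked by decreasing |score|.

    weights: R_j, the absolute specificity scores in rank order.
    in_pathway: booleans, True if the flux at that rank is in the pathway.
    Requires at least one hit with positive weight and at least one miss.
    """
    n, n_h = len(weights), sum(in_pathway)
    n_r = sum(w for w, hit in zip(weights, in_pathway) if hit)
    p_h = p_m = 0.0
    best = 0.0
    for w, hit in zip(weights, in_pathway):
        if hit:
            p_h += w / n_r
        else:
            p_m += 1.0 / (n - n_h)
        # Keep the deviation from zero with the largest magnitude.
        if abs(p_h - p_m) > abs(best):
            best = p_h - p_m
    return best


def normalized_enrichment(weights, in_pathway, n_perm=100, seed=0):
    """Return (ES, NES) with NES = ES / mean permuted ES of the same sign."""
    rng = random.Random(seed)
    es = enrichment_score(weights, in_pathway)
    labels = list(in_pathway)
    null = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute the pathway labels
        null.append(enrichment_score(weights, labels))
    same_sign = [x for x in null if (x > 0) == (es > 0)]
    if not same_sign:
        return es, float("nan")
    return es, es / (sum(same_sign) / len(same_sign))


weights = [5.0, 4.0, 3.0, 2.0, 1.0, 0.5]
head = [True, True, False, False, False, False]  # pathway at the top of R
tail = [False, False, False, False, True, True]  # pathway at the bottom of R
es_head, nes_head = normalized_enrichment(weights, head)
es_tail, nes_tail = normalized_enrichment(weights, tail)
print(es_head, nes_head)
print(es_tail, nes_tail)
```

A pathway whose fluxes all sit at the top of the ranked list reaches the maximal positive ES, whereas a pathway confined to the tail reaches the most negative ES, matching the interpretation given in the text.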

#### RESULTS

### Identification of Stable Gene Signatures Across Technologies and Cell Types

One of the major challenges when studying two large data sets such as NCI60 and TCGA together is the conservation of gene expression across different technologies and cell types. In the NCI60 data set gene expression was measured by microarrays with the HuEx 1.0 ST arrays being the most recent technology used. TCGA however mostly used RNA-seq for the quantification of gene expression and provides microarray data for only a small subset of cancer panels. For instance, TCGA includes HuEx 1.0 ST data for 1211 samples across 3 cancer panels but RNA-seq data for 11,093 samples across all 33 cancer panels. In order to include the majority of cancer panels in TCGA into the analysis, we thus tried to identify a subset of genes that showed similar global expression across NCI60 and TCGA. We first obtained the mean log expression values for all genes contained in the NCI60 HuEx 1.0 ST data as well as in the TCGA RNA-seq and HuEx 1.0 ST data. For the NCI60 data set this mean log expression was calculated across all cell lines for which proliferation rates were available (57 of 60), whereas the mean log expression for the TCGA data set was obtained by averaging over all samples.

Within the NCI60 and TCGA sample subsets that were measured by the HuEx microarrays, expression values were similar (correlation 0.82, p < 2.2e-16, compare **Figure 1A**), indicating that the used cell lines are an adequate model system for human cancer cells. Comparing the microarray log expression values from NCI60 to RNA-seq log expression values from TCGA we found a more complex relation. Here, genes that showed a high expression in the RNA-seq data showed a linear relationship with the NCI60 microarray log expression values (compare **Figure 1C**). However, most of the genes with low expression in the TCGA RNA-seq data showed almost random expression values in the NCI60 HuEx data and a similar behavior could be observed when comparing the TCGA microarray data with the TCGA RNA-seq data (see **Figure 1B**). There are several possible explanations for this discrepancy, such as the low dynamic range of microarrays, cell line-specific expression of some genes, or technical errors. Thus, we aimed at selecting only those genes that showed a globally correlated expression between the NCI60 microarray data and the TCGA RNA-seq data. Genes whose expression was conserved across both platforms were identified by a linear model relating mean log gene expression values from the NCI60 HuEx experiments (e<sup>i</sup><sub>N</sub>) and the TCGA RNA-seq experiments (e<sup>i</sup><sub>T</sub>) as

$$e_T^i = \alpha e_N^i + \beta \tag{4}$$

Here, α denotes a platform-specific factor that describes the mapping from microarray to RNA-seq expression values for the same samples, whereas β denotes a sample parameter which adjusts for different sample quantities between the NCI60 and TCGA data sets. One could fit those parameters directly using the NCI60 HuEx and TCGA RNA-seq data; however, we chose to use a more robust approach in which each of the two parameters was obtained individually from other data set combinations. Here, α was obtained by fitting the HuEx and RNA-seq data contained in TCGA to a zero-intercept linear model (same samples implies β = 0), whereas β could be obtained by calculating the difference between the mean log expression values of the HuEx data from NCI60 and TCGA (same platform implies α = 1). The full model was then validated using the NCI60 HuEx and TCGA RNA-seq data and showed good agreement with the data, as shown in **Figure 1**. As a consequence, the fitted model could be used to correct the NCI60 log expression values to their respective TCGA RNA-seq log expression values.
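The two-step estimation of α and β can be illustrated with a few lines of Python. This is a sketch under stated assumptions and not the R code of the paper: α is taken as the closed-form least-squares slope of a zero-intercept fit, β as the mean per-gene difference between TCGA and NCI60 HuEx values, and all function names and numbers are hypothetical.

```python
from statistics import mean


def fit_alpha(huex_tcga, rnaseq_tcga):
    """Zero-intercept least squares slope: alpha = sum(x*y) / sum(x*x)."""
    numerator = sum(x * y for x, y in zip(huex_tcga, rnaseq_tcga))
    denominator = sum(x * x for x in huex_tcga)
    return numerator / denominator


def fit_beta(huex_nci60, huex_tcga):
    """Offset between the data sets on the same platform (alpha fixed to 1)."""
    return mean(t - n for n, t in zip(huex_nci60, huex_tcga))


def correct(huex_nci60, alpha, beta):
    """Equation (4): map NCI60 HuEx values onto the TCGA RNA-seq scale."""
    return [alpha * e + beta for e in huex_nci60]


# Mean log expression of three genes (toy values).
huex_tcga = [2.0, 4.0, 6.0]
rnaseq_tcga = [3.0, 6.0, 9.0]
huex_nci60 = [1.5, 3.5, 5.5]

alpha = fit_alpha(huex_tcga, rnaseq_tcga)
beta = fit_beta(huex_nci60, huex_tcga)
corrected = correct(huex_nci60, alpha, beta)
print(alpha, beta, corrected)
```

Estimating each parameter from a comparison in which the other one is fixed by design avoids fitting both simultaneously on the noisier cross-platform, cross-sample data.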

Following the model fit, genes with conserved expression across both data sets could be obtained by enforcing the linear relationship described before. In detail, genes were considered acceptable for further analysis if


FIGURE 1 | Gene expression across NCI60 and TCGA. In all panels the red dots denote the genes that were used in the final predictor for proliferation rates and dashed lines enclose the area used for filtering viable gene candidates. (A) HuEx expression data across NCI60 and TCGA. The blue solid line denotes a 1:1 relationship offset by the parameter beta. (B) Gene expression between microarray and RNA-seq data within TCGA. The solid blue line denotes the slope given by alpha and passes through the origin. (C) Gene expression between microarray and RNA-seq data across NCI60 and TCGA. The solid blue line is given by the slope alpha and intercept beta which were obtained individually from the data shown in (A,B).

• The distance between the corrected mean log expressions of the gene in the NCI60 HuEx data set and the normalized TCGA RNA-seq data set was less than one (corrected maximal difference of 2-fold)

Of the 14,943 genes contained in all three data sets, 7799 passed the filter and showed a correlation of 0.91 (Pearson correlation, p < 2.2e-16) between NCI60 HuEx 1.0 ST and TCGA RNA-seq log expression values. Consequently, the filtered genes could now be used to construct a predictor for proliferation rates.
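The distance criterion used for filtering can be expressed directly in terms of Equation (4). The sketch below is a hypothetical Python illustration that covers only the criterion stated above; the parameter values, gene names and expression values are invented, and log2 units are assumed so that a distance of one corresponds to a 2-fold difference.

```python
def conserved_genes(e_nci60, e_tcga, alpha, beta, max_dist=1.0):
    """Genes whose corrected NCI60 value lies within max_dist of TCGA.

    e_nci60: {gene: mean log expression in NCI60 HuEx}.
    e_tcga: {gene: mean log expression in TCGA RNA-seq}.
    """
    shared = e_nci60.keys() & e_tcga.keys()
    return sorted(
        gene
        for gene in shared
        if abs(alpha * e_nci60[gene] + beta - e_tcga[gene]) < max_dist
    )


e_nci60 = {"g1": 2.0, "g2": 4.0, "g3": 6.0, "g4": 5.0}
e_tcga = {"g1": 3.4, "g2": 9.0, "g3": 9.6}
passed = conserved_genes(e_nci60, e_tcga, alpha=1.5, beta=0.5)
print(passed)
```

In this toy example `g2` is rejected because its corrected value differs from the RNA-seq value by more than one log unit, and `g4` is ignored because it is not present in both data sets.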

### Expression Interactions Enable a Strong Predictor for Cancer Proliferation

The statistical model chosen for the prediction of the NCI60 proliferation rates was a LASSO generalized linear model (Friedman et al., 2010). Here, we aimed at obtaining a predictor which would not only have good prediction properties on the training data, but would also be able to generalize to new data. Thus, all models were evaluated in a training and validation setting. In the training setting the models were trained using the entire NCI60 data set as in usual linear regression. For the validation step, in each iteration one of the 57 data points was removed from the data set, the model trained on the remaining 56 data points and the proliferation rate predicted for the omitted data point. The strategy of predicting and evaluating each data point by a model trained on all other data points is commonly known as leave-one-out cross-validation or LOOCV. Performance was evaluated across a set of five different metrics shown in **Table 1**.
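The leave-one-out procedure can be sketched as follows. For brevity the example uses ordinary least squares with a single predictor as a stand-in for the LASSO model, which is an assumption made purely for illustration; the function names and the toy data are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a*x + b (stand-in for the LASSO fit)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def loocv_predictions(x, y):
    """Predict every data point from a model trained on all other points."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        predictions.append(a * x[i] + b)
    return predictions


# Toy data: one expression feature and a noisy proliferation rate.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.011, 0.019, 0.032, 0.041, 0.048, 0.062]
held_out = loocv_predictions(x, y)
for measured, predicted in zip(y, held_out):
    print(f"{measured:.3f} {predicted:.3f}")
```

Because every prediction comes from a model that never saw the corresponding data point, errors computed on `held_out` estimate how well the model generalizes rather than how well it fits the training data.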

We observed that a simple linear model (1st order model) yielded good performance in the training step but poor performance in the validation step, denoting a strong overfitting to the training data and poor generalization (see **Figure 2**). To alleviate this limitation we increased the order of the model by allowing products of the expression values of two genes (including the squares of individual genes) as variables. This increases the computational complexity of the model training drastically since one would now have to consider more than 30 million possible combinations of the more than 7700 input genes. However, we found that it was sufficient to only consider combinations of those genes that had obtained non-zero coefficients in the 1st order case. Because merely 54 genes showed clearly non-zero regression coefficients in the 1st order model, the number of tested combinations could be reduced to 1485 (1431 combinations between 2 genes and 54 squares of the individual genes). Training a pure 2nd order model with those 1485 interaction variables yielded a much stronger predictor than the 1st order case, particularly in the validation step where the R<sup>2</sup> was raised from 0.2 to 0.85 compared to the 1st order model (see **Figure 2**, **Table 1**). Adding the original 1st order variables to the 2nd order ones, however, did not improve the performance of the model further and we thus decided to continue with the pure 2nd order model. In a final step we tried to further improve the generalization of the predictor by removing those gene combinations with only very small regression coefficients to avoid overfitting. This was achieved by removing the 25% smallest non-zero absolute coefficient values from the model. This gave a slight improvement in the validation step to an R<sup>2</sup> of 0.98, which now allowed stable prediction of the NCI60 proliferation rates with a relative error of 4% (**Figure 2**, **Table 1**).
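The combinatorics behind the 2nd order model can be checked with a short calculation. The sketch below counts pairwise products including squares and builds the interaction features for a toy sample; the gene names and expression values are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb


def second_order_features(expr, genes):
    """Products of log expression values for all gene pairs and squares."""
    return {
        (g1, g2): expr[g1] * expr[g2]
        for g1, g2 in combinations_with_replacement(genes, 2)
    }


# All pairs plus squares for the 7799 genes that passed the filter.
n_filtered = 7799
n_all_pairs = comb(n_filtered, 2) + n_filtered

# Restricting to the 54 genes selected by the 1st order model.
selected = [f"gene{i}" for i in range(54)]
expr = {gene: 1.0 + 0.1 * i for i, gene in enumerate(selected)}
features = second_order_features(expr, selected)

print(n_all_pairs)
print(comb(54, 2), len(features))
```

Restricting the search to the 54 pre-selected genes reduces the candidate set from about 30 million to 1485 variables, which keeps the 2nd order fit tractable for only 57 training samples.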

Using the trained model we now predicted proliferation rates for all 11,483 tumor tissue and all 756 normal tissue samples in TCGA having either associated RNA-seq or HuEx data (see **Figure 3**). Since the prediction is bound to make some errors it is possible that some of the proliferation rates are predicted to be negative which has no clear interpretation. In our analysis more than 98% of the predicted proliferation rates were larger than zero and negative proliferation rates were in the order of the absolute error predicted by the leave-one-out cross-validation (LOOCV 8e-3 vs. 9e-3 observed) suggesting that the negative proliferation rates actually were from samples that did not proliferate (proliferation rate is zero). As shown in **Figure 3** proliferation rates were heterogeneous within and across the different cancer panels. Interestingly the separation between normal and tumor samples was only pronounced in some of the cancer panels. This is consistent with previous studies that have found large heterogeneities in proliferation rates where proliferation rates may differ even more between different cancer panels than between normal and tumor cells within the same panel (Burrell et al., 2013; Wang et al., 2013; Tomasetti and Vogelstein, 2015). For instance, the predicted proliferation rates for normal and tumor tissue samples separated well for lung squamous cell carcinomas, but not for lung adenocarcinomas.

Unlike for cancer cell lines, there are no reported proliferation rates across the analyzed cancer panels. Thus, we looked for alternative strategies to validate the predicted proliferation rates and studied their association with clinical data. Here, 12,111 samples had reported clinical data from 10,706 unique individuals. Comparing the Kaplan-Meier survival curves of the lower and upper quartiles of predicted proliferation rates (**Figure 4A**) we found a clear protective effect of lower proliferation rates on patient survival which could also be confirmed by a Cox proportional hazards model (β = 16.7, p < 2.2e-16). This indicates that, for instance, an increase of 0.01 in predicted proliferation rate leads to a 19% increase in risk. This is consistent with the expectation that more proliferative cancers should be more aggressive in general. Because cancer is mostly characterized by its ability for uncontrolled proliferation we also hypothesized that the tumor samples should show globally higher predicted proliferation rates than the normal tissue samples. This was indeed the case, with tumor samples having 75% higher proliferation rates than normal tissue samples on average (**Figure 4B**, Wilcoxon rank sum test p < 2.2e-16, see Protocol S1). Finally, we also tested the association of the predicted proliferation rates with the cancer TNM staging system. Here, we found a significant association of the predicted proliferation rates with 3 of the 4 substages (Kruskal-Wallis rank sum test for T, N, stage with all p-values smaller than 2.2e-16, see Protocol S1); however, this was accompanied by large variations. Proliferation rates across the subclasses of the staging system are shown in **Figure 5**. Predicted proliferation rates seemed to increase linearly across the T subclass between classes T1-T4 (associated with tumor size) and general tumor stage between stages I-IV (**Figures 5A,D**). Interestingly, subclasses such as T0, N0, or Stage 0, which are carcinomas in situ or tumors that were too small to be classified, showed higher proliferation rates than many of the higher classes (compare for instance T0 and T1), suggesting that correct diagnosis of those small tumors is important since they might be more aggressive than tumors in the other low stages.

FIGURE 2 | Predictors for proliferation rates. Panels above the figures denote the order of the model where 1st order means just the log expression values and 2nd order products between two log expression values. Black lines denote a hypothetical perfect fit (1:1 relation between measurement and prediction). "Cutoff" denotes a model where variables with very small fitted coefficients were removed from the model. Panels to the right denote the used predictions where "train" means performance on the training set and "validation" the predictions obtained from leave-one-out cross-validation (LOOCV).

### Flux Analysis Suggests the Metabolic Liabilities of Cancer

As mentioned earlier, one of the prevalent methods to study metabolism in cancer patients is the use of metabolic modeling and FBA. One of the usual limitations in trying to obtain the flux distribution for a specific tissue or sample is that even when the model is known there is some uncertainty about the upper and lower flux bounds, which may strongly influence the solution. One method to overcome this limitation is parsimonious FBA, which looks for the most economic flux distribution yielding a predefined metabolic target (Lewis et al., 2010). In cancer proliferation this target can be set to be the measured or predicted proliferation rate of the cancer. Parsimonious FBA can then be used to obtain the flux distribution yielding the given proliferation rate and minimizing the sum of absolute flux values. Since this is a minimization problem it can be obtained from a model with infinitely large upper bounds and, thus, requires no knowledge about constraints in an irreversible model. Here the limiting factor is the availability of tissue reconstructions that allow for the required metabolic function (in our case proliferation). Unfortunately, many previously published tissue reconstructions obtained by mCADRE or tINIT do not use a growth objective and are therefore not suitable for parsimonious FBA with known proliferation rates (Wang et al., 2012; Pornputtapong et al., 2015). However, there are some cancer-specific reconstructions which do allow for proliferation and have been validated qualitatively with experimental data (Gatto et al., 2014). Those models were reconstructed using proteome data specific for the cancer panel, thus representing the inclusion of an additional data source in addition to the gene expression data used to predict the proliferation rates.

Here, we used parsimonious FBA to obtain the flux distributions for 3825 samples from nine cancer panels across five unique tissues. Fluxes were split up into their forward and reverse reactions and we only considered fluxes that were non-zero in at least one sample (1026 fluxes, see **Figure 6A**). We observed varying usage of Glycolysis/Gluconeogenesis, Oxidative phosphorylation and the TCA cycle across the nine cancer panels (shown in **Figure S1**). Here, bladder cancers and breast cancers showed the highest fluxes in Glycolysis, whereas breast cancers showed diminished fluxes in the TCA cycle compared to bladder cancers. All other panels showed relatively low metabolic fluxes compared to bladder and breast cancers. Fluxes varied considerably within and across different samples (compare **Figure 6A**). Within a single cancer panel, this is expected since all samples in a panel used the same metabolic model constrained by the predicted proliferation rates, which show strong variations as shown in **Figure 3**. However, the clearest pattern could be observed in the presence or absence of particular fluxes across cancer panels, indicating that the model reconstruction has more impact than the exact flux values. Direct comparison of fluxes or metabolic processes between normal and tumor conditions is difficult because of the lack of reconstructions for normal tissues with the ability to grow. Thus, we rather tried to find metabolic processes that were either regulated specifically in one cancer panel or homogeneously across all cancer panels. In order to identify pathways which were specific for a particular cancer panel we calculated a specificity score s<sup>i</sup><sub>p</sub> as the log fold-change of the mean for each flux v<sup>i</sup> between the target panel and all other panels (see Data and Methods). A value of 0 marks fluxes that are homogeneous across all cancer panels, whereas a high positive or negative value denotes fluxes which are higher (or lower, respectively) in the target cancer panel. The distribution of specificity scores across different metabolic pathways and cancer panels is shown in **Figure S2**.
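A specificity score of this kind can be sketched in a few lines. The exact formula is given in Data and Methods and is not reproduced here, so the base-2 logarithm and the small pseudo-count (added so that zero fluxes do not produce infinite values) are assumptions of this sketch, as are the function and argument names.

```python
import numpy as np


def specificity_scores(fluxes, panels, target, pseudo=1e-6):
    """Log fold-change of mean flux: target panel versus all other panels.

    fluxes : array of shape (samples, fluxes), non-negative flux values
    panels : panel label for each sample
    target : label of the panel of interest
    pseudo : assumed pseudo-count guarding against log(0)
    """
    fluxes = np.asarray(fluxes, dtype=float)
    in_target = np.asarray(panels) == target
    if in_target.all() or not in_target.any():
        raise ValueError("need samples both inside and outside the target panel")
    mean_target = fluxes[in_target].mean(axis=0)
    mean_other = fluxes[~in_target].mean(axis=0)
    return np.log2((mean_target + pseudo) / (mean_other + pseudo))


# Two hypothetical panels with two fluxes each: the first flux is identical
# everywhere, the second is four times higher in panel "A".
example_fluxes = [[1.0, 4.0], [1.0, 4.0], [1.0, 1.0], [1.0, 1.0]]
example_panels = ["A", "A", "B", "B"]
scores = specificity_scores(example_fluxes, example_panels, target="A")
print(scores)
```

The first flux scores 0 (homogeneous), while the second scores close to 2, a four-fold increase in the target panel.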

Finally, cancer panel-specificity of the fluxes was mapped to the metabolic pathway level by calculating an enrichment score as used by GSEA (Subramanian et al., 2005) for metabolic pathways based on the specificity scores (shown in **Figure 7**). Here, an enrichment score of 1 denotes that the pathway is not enriched in any manner, whereas scores larger than one denote pathways whose fluxes are specific across cancer cell panels and scores smaller than one denote pathways which are homogeneous across panels and define a set of core pathways (see Data and Methods). The most specific pathways were the Pentose phosphate pathway, retinol metabolism and the metabolism of branched amino acids, whose specificity scores are shown in **Figures 6B–D**. Our results suggest that pentose phosphate activity is highly heterogeneous across the studied cancer panels with metabolic fluxes being specifically up-regulated in breast cancer, cholangiocarcinoma, hepatocellular carcinoma and lung cancers (**Figure 6B**). The observed heterogeneity of pentose phosphate pathway activity is consistent with the literature (Cancer Genome Atlas Research Network, 2013; Du et al., 2013; Li et al., 2014; Patra and Hay, 2014; Dick and Ralser, 2015). Retinol metabolism has been shown to be altered in breast cancer and, as shown in **Figure 6C**, we find its fluxes specifically up-regulated in the breast and bladder cancer panel (Chen et al., 1997; Wei et al., 2015). Similarly, branched amino acid metabolism was specifically up-regulated in the bladder cancer panel (**Figure 6D**). Branched chain amino acid metabolism is known to be affected in cancers as well (Mayers et al., 2014; Chang et al., 2016); however, its relation to cancers is complex since it may also indicate a prior diabetic condition (O'Connell, 2013). Pathways showing homogeneous activity across the cancer cell panels all fell in the category of fatty acid metabolism-related pathways or reactive oxygen species detoxification. This is not surprising since fatty acid metabolism and oxidative stress have long been known to be involved in various cancers (Moreno-Sánchez et al., 2007; Reuter et al., 2010; Carracedo et al., 2013; Currie et al., 2013; Sosa et al., 2013; Camarda et al., 2016; Yang et al., 2016).
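A GSEA-style pathway score with a neutral value of 1 can be sketched as follows. The precise definition is in Data and Methods; this sketch assumes that fluxes are ranked by absolute specificity score, that the raw score is the maximum of the weighted running sum, and that it is normalized by the mean raw score under randomly permuted pathway labels. All function names are hypothetical.

```python
import numpy as np


def enrichment_score(scores, in_pathway):
    """Maximum of the GSEA-like running sum over fluxes ranked by |score|."""
    scores = np.abs(np.asarray(scores, dtype=float))
    in_pathway = np.asarray(in_pathway, dtype=bool)
    order = np.argsort(-scores)          # most specific fluxes first
    weights = scores[order]
    hits = in_pathway[order]
    hit_total = weights[hits].sum()
    n_miss = (~hits).sum()
    if hit_total == 0 or n_miss == 0:
        raise ValueError("pathway must contain some, but not all, scored fluxes")
    # Step up (weighted) at pathway members, step down (uniform) elsewhere.
    steps = np.where(hits, weights / hit_total, -1.0 / n_miss)
    return np.cumsum(steps).max()


def normalized_enrichment(scores, in_pathway, n_perm=100, seed=0):
    """Observed score divided by its mean under permuted pathway labels."""
    rng = np.random.default_rng(seed)
    in_pathway = np.asarray(in_pathway, dtype=bool)
    observed = enrichment_score(scores, in_pathway)
    null = [enrichment_score(scores, rng.permutation(in_pathway))
            for _ in range(n_perm)]
    return observed / np.mean(null)


# Hypothetical example: 40 fluxes with decreasing specificity; one pathway
# holds the 5 most specific fluxes, another the 5 most homogeneous ones.
flux_scores = np.linspace(2.0, 0.1, 40)
specific_pathway = np.arange(40) < 5
core_pathway = np.arange(40) >= 35

nes_specific = normalized_enrichment(flux_scores, specific_pathway)
nes_core = normalized_enrichment(flux_scores, core_pathway)
print(nes_specific, nes_core)
```

Under these assumptions a pathway concentrated among the most specific fluxes scores above 1, and one concentrated among the most homogeneous fluxes scores below 1, which matches the interpretation given in the text.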

### DISCUSSION

In this study we extended the gene expression profiling data contained in the Cancer Genome Atlas with predictions of proliferation rates for more than 12,000 samples. Our results suggest that the heterogeneity between and within different cancer panels is also found on the level of proliferation. Even though there is a tendency for certain cancer types to have higher proliferation rates, there is a large overlap in proliferative capabilities between different cancers. As we show, the predicted proliferation rates are associated with patient survival and help to differentiate normal from tumor samples, and thus might be consequential for clinical investigations, particularly in early cancer stages where pathological classification is difficult.

This opens the door for more complex schemes where phenotypic traits from model systems such as cancer cell lines can be extrapolated to individual patients. However, the proliferation rate is only one of many features that determine the outcome of a particular cancer. Additionally, metabolic fluxes seem to depend more on the presence or absence of biochemical reactions than on the bounds imposed by achieving a particular proliferation rate. In this analysis we used the same metabolic model for all samples of a given cancer panel. This is obviously only an approximation, although a recent study found sample-specific metabolic reconstructions to differ only moderately within a single cancer panel (preprint, http://dx.doi.org/10.1101/050187). There may exist many additional metabolic constraints that vary across different cancer cell samples and cancer panels, such as availability of metabolites in the microenvironment, mutations of metabolic enzymes and the required metabolic capacities to resist the immune system or apoptosis. Therefore, it would be worthwhile to predict several additional phenotypic traits for the samples of The Cancer Genome Atlas. This could for instance be based on particular metabolic indicators such as the redox balance, the level of oxidative stress or the balance between Glycolysis and the TCA cycle. As we have shown, data obtained from cell lines can be an acceptable alternative and has the potential to further constrain the solution space of metabolic modeling.

The advantage of having predictions for distinct biological phenotypes for single patient data lies in the ability to predict metabolic alterations in a more complex fashion than just analyzing the gene expression and mutations of metabolic enzymes. In particular, it allows the inclusion of additional information through the metabolic model, such as the fulfillment of metabolic requirements like the maintenance of a viable redox balance and the uptake of the necessary nutrients from the microenvironment. As shown in **Figure 4**, this allows the identification of metabolic liabilities within and across cancer panels and could also be used to find metabolic alterations specific to a single patient. Here, we found that the identified metabolic liabilities were consistent with previous publications, predicting alterations in lipid metabolism as a general theme across different cancers and identifying several specific metabolic alterations in the pentose phosphate pathway, retinol metabolism, and branched chain amino acid metabolism. As more reconstructions for normal tissues become available, this list is likely to be extended by comparisons between normal and tumor tissues; however, that would require the inference of metabolic constraints beyond proliferation or growth rates, as many normal tissues do not grow significantly. Additionally, the methodology could probably be improved by using patient-specific reconstructions for the metabolic models that better capture the inherent heterogeneity. However, that would require fast reconstruction methods in order to produce personalized models in a high-throughput fashion.

Finally, after initial model training, prediction for new samples is very efficient and can help to reduce the amount of required data. In our study we only required gene expression levels for 38 unique genes in order to predict proliferation rates with an accuracy of 96%. Additionally, all of those genes were consistently expressed across all cancer panels and cell lines and had sufficiently high expression values to be quantified reliably by RNA-Seq and microarrays. This enables cost efficient clinical probing in order to quantify phenotypic traits that can usually not be observed directly.

#### AUTHOR CONTRIBUTIONS

CD developed the methods, performed the analysis and wrote the paper. OR developed the methods and wrote the paper.

#### FUNDING

The authors acknowledge the financial support of the Research Chair on Systems Biology (INMEGEN-FUNTEL Mexico) and of an internal grant of the National Institute of Genomic Medicine.


#### ACKNOWLEDGMENTS

The authors would like to acknowledge the individuals that donated samples to the Cancer Genome Atlas as well as the Cancer Genome Atlas Research Network, whose work forms the basis for the results presented here.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphys.2016.00644/full#supplementary-material

Figure S1 | Fluxes for the major metabolic pathways across the 9 used cancer panels. Each point denotes a single flux value for a specific sample. Shown are 141,525 individual flux values.

Figure S2 | Specificity scores across all metabolic pathways in the 1026 non-zero fluxes stratified by cancer panel.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Diener and Resendis-Antonio. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Cancer Clocks Out for Lunch: Disruption of Circadian Rhythm and Metabolic Oscillation in Cancer

Brian J. Altman<sup>1,2,3</sup>\*

<sup>1</sup> Abramson Family Cancer Research Institute, Philadelphia, PA, USA, <sup>2</sup> Abramson Cancer Center, Philadelphia, PA, USA, <sup>3</sup> Division of Hematology-Oncology, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA

Circadian rhythms are 24-h oscillations present in most eukaryotes and many prokaryotes that synchronize activity to the day-night cycle. They are an essential feature of organismal and cell physiology that coordinate many of the metabolic, biosynthetic, and signal transduction pathways studied in biology. The molecular mechanism of circadian rhythm is controlled both by signal transduction and gene transcription as well as by metabolic feedback. The role of circadian rhythm in cancer cell development and survival is still not well understood, but as will be discussed in this Review, accumulated research suggests that circadian rhythm may be altered or disrupted in many human cancers downstream of common oncogenic alterations. Thus, a complete understanding of the genetic and metabolic alterations in cancer must take potential circadian rhythm perturbations into account, as this disruption itself will influence how gene expression and metabolism are altered in the cancer cell compared to its non-transformed neighbor. It will be important to better understand these circadian changes in both normal and cancer cell physiology to potentially design treatment modalities to exploit this insight.

Keywords: circadian rhythm, oncogenes, metabolism, cancer metabolism, molecular clock, oscillation, gene expression regulation

#### Edited by:

Osbaldo Resendis-Antonio, Instituto Nacional de Medicina Genomica, Mexico

#### Reviewed by:

Didier Gonze, Université Libre de Bruxelles, Belgium; Oksana Sorokina, The University of Edinburgh, UK; Ricardo Orozco Solis, Instituto Nacional de Medicina Genomica, Mexico

#### \*Correspondence:

Brian J. Altman altman@upenn.edu

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Cell and Developmental Biology

> Received: 25 March 2016 Accepted: 08 June 2016 Published: 24 June 2016

#### Citation:

Altman BJ (2016) Cancer Clocks Out for Lunch: Disruption of Circadian Rhythm and Metabolic Oscillation in Cancer. Front. Cell Dev. Biol. 4:62. doi: 10.3389/fcell.2016.00062

### INTRODUCTION: THE CIRCADIAN CLOCK CONTROLS GENE EXPRESSION AND METABOLISM

The majority of eukaryotes possess a circadian clock to optimize gene expression and metabolism to the day-night cycle. Cancer cells may disrupt normal circadian oscillation to release cells from control of gene expression and metabolism and provide a growth advantage. In mammals, many familiar processes such as sleep/wakefulness, feeding, blood pressure, and body temperature are synchronized by the circadian clock (Millar-Craig et al., 1978; Spiteri et al., 1982; Cagnacci et al., 1992; Bass, 2012). The "central clock" is governed by blue-light sensing in the eye and subsequent processing in the hypothalamic suprachiasmatic nucleus (Moore and Eichler, 1972; Liu et al., 1997; Ruby et al., 2002), while "peripheral clocks," which will be the focus of this Review, are present in virtually all organs and individual cells in the body, and are synchronized by the central clock through signals such as hypothalamic-pituitary-directed release of adrenal corticosteroids, but can also operate independently of central clock input (Buijs et al., 1999). Peripheral clocks are strongly entrained by the time of feeding, and misalignment of feeding and the central clock has recently been shown to lead to metabolic syndrome (Mukherji et al., 2015a,b). Synchronization of the peripheral clock can be simulated in cell culture by treatment with the corticosteroid dexamethasone (Balsalobre et al., 2000), or the simple act of changing culture media (Yeom et al., 2010), and thus, circadian oscillations are likely common in most non-transformed cell lines and many cancer lines as well.

The molecular circadian clock is governed by several feedback loops (**Figure 1**) that lead to 24-h oscillations of target gene expression, defined by their amplitude (height), phase (position), and period (length). Several well-described and detailed mathematical models of this molecular oscillation exist, which have been used to make predictions about perturbations of the molecular clock (Leloup and Goldbeter, 2003; Relogio et al., 2011; Hirota et al., 2012; Kim and Forger, 2012). The best-characterized organ with respect to circadian rhythm is liver, where more than 20% of mRNAs oscillate (Panda et al., 2002; Storch et al., 2002; Ueda et al., 2002; Koike et al., 2012). In the whole mammal, up to 50% of protein-coding RNAs and 30% of non-coding RNAs oscillate in at least one organ, with the liver, kidney, and lung being the most "circadian"; however, there is little overlap in circadian gene expression between organs, with only 10 genes oscillating in all examined cell types (Zhang et al., 2014). Ribosome occupancy of mRNA and protein translation also demonstrate rhythmicity (Jang et al., 2015; Janich et al., 2015; Lipton et al., 2015), and thus, circadian rhythm strongly controls gene expression and translation, though the specific identity of oscillating genes may vary.

2012). In the first and most important loop, CLOCK-BMAL1 upregulates PER and CRY through binding to E-box DNA elements. Unbound PER and CRY proteins are phosphorylated by casein kinase 1 ε/δ (CK1ε/δ) and AMPK (AMP-kinase), respectively, to lead to degradation. GSK3 (glycogen synthase kinase 3, not pictured) can also phosphorylate PER and CRY to promote their degradation (Harada et al., 2005; Iitaka et al., 2005). Otherwise, PER and CRY form a complex with CK1, which translocates to the nucleus to repress CLOCK-BMAL1 activity. PER and CRY are then eventually degraded in a CK1-dependent manner (not pictured), and the time delay in the first loop forms an approximately 24-h cycle which is particularly dependent on dynamics of PER regulation (D'alessandro et al., 2015). In the second loop, CLOCK-BMAL1 upregulates the negative transcription factors REV-ERBα and β (gene names NR1D1 and NR1D2) and the positive transcription factors RORα, β, or γ (not pictured), which repress or activate BMAL1 (gene name ARNTL) transcription, respectively, through binding to RRE (R-response element) DNA sequences. The importance of this second loop is underscored by the fact that mice lacking REV-ERBα and β, which form a complex and act together, lack normal circadian gene oscillation in the liver (Bugge et al., 2012; Cho et al., 2012). Several accessory loops exist; in one that will be highlighted in this review, SIRT1 (sirtuin 1) deacetylase tunes CLOCK-BMAL1 activity by opposing the histone acetyl-transferase (HAT) activity of CLOCK (Asher et al., 2008; Nakahata et al., 2008, 2009; Ramsey et al., 2009). SIRT1 is regulated by the metabolite NAD, which in turn is produced by the NAD-salvage enzyme NAMPT (nicotinamide phosphoribosyltransferase), the rate-limiting enzyme of the NAD salvage pathway involved in NAD recycling and synthesis from dietary nicotinamide or niacin. Together, these primary and accessory loops lead to the 24-h expression of target genes and oscillation of downstream metabolic processes. Figure reprinted and modified from Altman et al. (2015), with permission from Elsevier.

Circadian control of metabolism has been extensively studied on the level of organs. Many specific metabolites, including lipids, amino acids, and glycolytic intermediates, oscillate in mouse liver and human blood, saliva, and even breath (Dallmann et al., 2012; Eckel-Mahan et al., 2012; Kasukawa et al., 2012; Martinez-Lozano Sinues et al., 2014). Anabolic pathways in liver, including nucleotide biosynthesis and ribosomal biogenesis, also showed circadian oscillation (Fustin et al., 2012; Jouffe et al., 2013). On the other hand, the oscillation of metabolism on a cell-autonomous level (as observed in tissue culture) is only beginning to be appreciated. Two studies demonstrated that NAD (nicotinamide adenine dinucleotide) oscillates in cell culture and liver (**Figure 1**) (Nakahata et al., 2009; Ramsey et al., 2009), which controls rhythmic mitochondrial oxidation (Peek et al., 2013). More recently, we observed in U2OS osteosarcoma cells, a commonly used model of circadian rhythm, that intracellular glucose showed circadian oscillation (Altman et al., 2015). This finding is supported by another study showing oscillation of NADH/NAD+ ratio in epidermal stem cell culture, which may reflect oscillation in glucose metabolism (Stringari et al., 2015). An unbiased metabolomic analysis is still needed to determine the extent of cell-autonomous metabolic oscillations.
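One common way to summarize such an oscillation in a measured time series is to fit a cosine model, y(t) = M + A cos(2π(t − φ)/τ), where M is the mean level, A the amplitude, φ the time of peak (phase), and τ the period. The sketch below is an illustration only and is not taken from the cited studies: it assumes the period is fixed at 24 h rather than estimated, uses an ordinary least-squares fit, and runs on synthetic data.

```python
import numpy as np


def fit_cosinor(t, y, period=24.0):
    """Least-squares fit of y = M + A*cos(2*pi*(t - phi)/period).

    The model is linear in cos/sin terms, so amplitude and peak time
    are recovered from the two fitted coefficients.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    omega = 2.0 * np.pi / period
    design = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    mesor, b_cos, b_sin = np.linalg.lstsq(design, y, rcond=None)[0]
    amplitude = np.hypot(b_cos, b_sin)
    peak_time = (np.arctan2(b_sin, b_cos) / omega) % period
    return mesor, amplitude, peak_time


# Synthetic metabolite level sampled every 4 h over two days,
# with mean 10, amplitude 3 and a peak 6 h after time zero.
hours = np.arange(0, 48, 4)
levels = 10.0 + 3.0 * np.cos(2.0 * np.pi * (hours - 6.0) / 24.0)

mesor, amplitude, peak_time = fit_cosinor(hours, levels)
print(mesor, amplitude, peak_time)
```

With noisy measurements the same fit applies, but the fitted amplitude should then be compared against a null (for example, a flat model) before calling a metabolite rhythmic.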

Metabolism itself may also control the clock. Several nearly simultaneous studies uncovered that the NAD- and NAMPT-regulated deacetylase SIRT1 opposes the acetyltransferase activity of CLOCK protein (Doi et al., 2006) to deacetylate PER2, BMAL1, and histones, leading to alterations in both phase and amplitude of circadian gene oscillation (Asher et al., 2008; Nakahata et al., 2008, 2009; Ramsey et al., 2009). NAD availability may also influence circadian rhythm through regulation of PARP (poly-ADP-ribose polymerase) to regulate CLOCK-BMAL1 protein and DNA binding (Asher et al., 2010). Emerging evidence suggests that glucose availability may affect circadian rhythm, in part by contributing to O-GlcNAcylation of PER2 to control its activity (Kaasik et al., 2013; Oosterman and Belsham, 2016). It has long been observed that cancers have altered metabolism (Warburg, 1956; Vander Heiden et al., 2011; Stine and Dang, 2013), and that many cancers may have disrupted circadian rhythm (Levi et al., 2008); however, the significance and mechanism of the circadian dysrhythmia in cancer are poorly understood.

#### ONCOGENIC ALTERATION OF CIRCADIAN RHYTHM

Mutations in molecular clock genes, including promoter methylation, coding region mutation, deletion, or rare amplification, have been documented at a low frequency (less than 20% incidence per tumor type) across many different types of cancer (Cerami et al., 2012; Savvidis and Koutsilieris, 2012; Gao et al., 2013; Uth and Sleigh, 2014). Given that these mutations disrupt normal oscillation, it has been suggested that the clock may be tumor suppressive. Many proto-oncogenes and tumor suppressors are normally under circadian control (Sahar and Sassone-Corsi, 2009), and so disruption of oscillation could potentially release these proteins to be constitutively overexpressed or suppressed. This Review will focus on several notable examples of oncogenic pathways that are often mutated in cancer and have a well-described relationship to circadian rhythm. Given the frequency of mutation in the pathways detailed below, it can be speculated that many cancers with these and perhaps other oncogenic mutations have altered or disrupted circadian rhythm and altered oscillation of gene expression and metabolism.

## RAS

The RAS family of GTP-ases (H-, K-, and N-RAS) is mutated in many cancers to constitutively activate their GTPase function and hyperstimulate downstream mitogen-activated kinase (MAPK) signaling. Oncogenic RAS is known to promote transformation and altered cell metabolism (Pylayeva-Gupta et al., 2011; Kimmelman, 2015), and work spanning decades suggests that wild-type RAS is both influenced by and influences the circadian clock, and thus, mutated oncogenic RAS may potentially alter circadian rhythm. RAS is highly conserved among lower organisms in Animalia, and it was shown in Drosophila that RAS and the MAPK signaling family mediated circadian rhythm, and inversely that the MAPK pathway itself was governed by circadian oscillation (Williams et al., 2001). Further studies in Drosophila revealed that ERK (a critical downstream target of RAS) could directly phosphorylate CLOCK and thus increase the output of clock-controlled genes (CCGs) (Weber et al., 2006). Similarly, clock-controlled genes were increased by active RAS in the bread mold Neurospora crassa (Belden et al., 2007). In mammals, RAS and downstream MAPK signaling oscillate in neurons and in the liver, suggesting circadian control in both the central and peripheral clocks (Tsuchiya et al., 2013; Serchov et al., 2016). Neuronal constitutively activated RAS dramatically disrupted circadian gene oscillation and mouse circadian activity through upregulation of CCGs, in a pathway that was dependent on downstream activity of GSK3β (Serchov et al., 2016), and another study further implicated RAS in disruption of CCGs downstream of GSK3 (Spengler et al., 2009). As discussed in the **Figure 1** legend, GSK3 is a regulator of CRY and PER stability. While little work has been done to demonstrate this mechanism in cancer, one recent study identified mutated RAS as a mediator of circadian rhythm disruption in colon cancer cells, potentially through upregulation of CRY1 (Relogio et al., 2014). 
Thus, while strong evidence exists in multiple organisms and model systems that active RAS can alter circadian rhythm, specifically by upregulating CCGs, the potential role in cancer cell metabolism and physiology remains unclear.

## LKB1/AMPK

The AMP-kinase (AMPK) is an ancient protein complex conserved in nearly all eukaryotes that responds to metabolic stress (Hardie, 2014) by sensing increases in the AMP:ATP ratio, and inhibiting biosynthetic processes while upregulating catabolic metabolism to restore ATP levels (Hardie and Alessi, 2013). The chief upstream kinase responsible for phosphorylating and activating AMPK downstream of metabolic stress, LKB1 (liver kinase B1), is mutated or lost in many cancers, including up to 35% of non-small-cell lung carcinomas (Shackelford and Shaw, 2009). Thus, AMPK may function as a tumor suppressor in some cancers, and indeed, AMPK-promoting compounds such as the widely used complex-I inhibitor metformin and related biguanides have been investigated in preclinical and clinical models (Pollak, 2012).

AMPK plays a strong role in controlling circadian rhythm, and regulates the clock by directly phosphorylating and promoting the degradation of CRY1 (Lamia et al., 2009), and promoting the degradation of PER2 through CK1ε activation (Eide et al., 2005; Um et al., 2007), which both lead to upregulation of CCGs; however, whether this led to a shortening or lengthening of the period was unclear. Underscoring the importance of CK1ε downstream of AMPK, metformin was shown to upregulate Csnk1 (protein CK1) isoforms in the mouse and alter oscillation of circadian and metabolic genes (Barnea et al., 2012). In a separate pathway, AMPK increases NAD+ levels to activate SIRT1, leading to additional clock modulation (Fulco et al., 2008; Canto et al., 2009; Um et al., 2011; Brandauer et al., 2013). Cancer treatments that activate AMPK, including metformin or anti-metabolic therapies such as the lactate dehydrogenase A inhibitor FX11 (Le et al., 2010), would be expected to alter the molecular clock in affected cells. Strikingly, loss of either LKB1 or of both catalytic subunits of AMPK completely abrogated circadian oscillation, even in the absence of metabolic stress, in several models such as MEFs or mouse liver (Lamia et al., 2009; Um et al., 2011). This raises two interesting possibilities: first that AMPK is an integral accessory regulator of the circadian clock, and second, that cancers deficient in AMPK activity through loss of LKB1 may have a deficient clock.

## p53

The p53 tumor suppressor protein is mutated or lost in a large number of cancers, leading to dysregulation of metabolism, cell cycle, and apoptosis (Berkers et al., 2013; Chen, 2016). Recent evidence suggests an interdependent relationship exists between p53 and PER2, which has fascinating implications for circadian rhythm and metabolism. PER2 may directly regulate p53 activity: inactivation of PER2 by mutation delayed p53 accumulation after ionizing radiation, sensitizing mice to both cancer development and death (Fu et al., 2002). Supporting these data, two studies showed that high levels of PER2 in cancer cell lines and glioma xenografts correlated with increased p53 induction and apoptosis (Hua et al., 2006; Zhanfeng et al., 2016). However, the possible molecular mechanism of p53 activity regulation by PER2 was not well described in these studies.

This relationship is bidirectional, as p53 can influence PER2 both at the gene expression and protein level. p53 can antagonize PER2 expression by directly binding to the PER2 promoter and blocking CLOCK-BMAL1 transactivation of the gene (Miki et al., 2013). Either loss of p53 or accumulation of p53 protein caused phase shifts in mouse circadian behavior, suggesting that both basal and induced p53 can regulate the clock through PER2 modulation. Adding another layer of complexity, two complementary studies demonstrated that PER2 protein can form a dimer with p53 in the cytoplasm to stabilize p53 and allow translocation to the nucleus, either under basal conditions or genotoxic stress (Gotoh et al., 2014, 2015). Once in the nucleus, PER2-p53 also binds its E3 ubiquitin ligase MDM2 (mouse double minute 2 homolog), and this trimeric complex prevents p53 ubiquitination and degradation, allowing for increased transactivation of p53 targets. The authors hypothesized that PER2 may exist in two pools: one bound to p53, and one bound to CRY and CK1ε for control of circadian rhythm and subsequent degradation (Gallego and Virshup, 2007).

Several interesting conclusions can be made from the above findings. First, given that PER2 strongly controls p53 gene expression, stability, and localization, and that PER2 levels oscillate in the cell, wild-type p53 protein and activity itself must oscillate, making these cells more or less sensitive to DNA damage at certain times. p53 mRNA and protein oscillation was observed in several studies (Horiguchi et al., 2013; Miki et al., 2013), and in fact, circadian sensitivity to p53 activity was demonstrated in several older studies that identified circadian variation in radiation toxicity in rodents (Pizzarello et al., 1964; Lappenbusch, 1972). However, it remains unclear whether oscillation of p53 activity was due to TP53 mRNA oscillation, or oscillation of the upstream E3 ubiquitin ligase MDM2 to control p53 protein stability (Horiguchi et al., 2013). Since p53 feeds back to suppress PER2 expression and alter protein localization, the above pathway may be an as-of-yet uncharacterized accessory loop of endogenous clock control. Additionally, it has been appreciated in recent years that DNA damage induces oscillatory p53 activity and protein levels, with a period of about 6 h and dependent on phosphorylation of both p53 and MDM2 (Lahav et al., 2004; Geva-Zatorsky et al., 2010). It is likely that, after DNA damage, this inherent p53 oscillation, circadian control of p53, and p53 control of PER2 interact in some significant way, but this has not yet been studied.

Another upshot of this relationship is that altered p53 status should disrupt circadian oscillation. DNA damage and other insults induce and stabilize p53 (Chen, 2016), and p53 can control circadian rhythm through its modulation of PER2 transcription, protein stability, and protein localization (Miki et al., 2013; Gotoh et al., 2014, 2015), so it can be hypothesized that under stress p53 induction will dramatically alter the circadian clock through its modulation of PER2, which may perhaps be an adaptive pro-survival process. On the other hand, p53 loss or mutation in cancer would dramatically affect circadian rhythm, both by allowing for increased PER2 gene expression (Miki et al., 2013) and by altering the availability of PER2 protein to bind to other partners such as CRY (Gotoh et al., 2014, 2015). One interesting question is how mutant p53 that has acquired novel DNA-binding and transactivation functions would affect PER2 and circadian rhythm (Muller and Vousden, 2013). Thus, loss or mutation of p53 in cancer may alter or disrupt circadian rhythm, with unknown consequences to cancer physiology.

## MYC

The MYC and related MYCN oncogenes (encoding MYC and N-MYC) are translocated, amplified, or mutated in many cancers, and can dramatically upregulate genes involved in glucose and glutamine metabolism, ribosomal, lipid, and nucleotide biogenesis, and cell cycle progression (Stine et al., 2015). Given that MYC recognizes and binds to E-Box DNA promoter elements identical to those recognized by CLOCK-BMAL1, it was theorized that CLOCK-BMAL1 could bind to MYC target genes (Fu et al., 2002; Fu and Lee, 2003), an idea later borne out by observation that CLOCK-BMAL1 could inhibit N-MYC-dependent gene transactivation (Kondratov et al., 2006). Given that the MYC gene itself contains multiple E-box elements (Battey et al., 1983), it was shown that CLOCK-BMAL1 regulates endogenous MYC circadian oscillation and oscillation of MYC-target genes, both by direct BMAL1 binding to the MYC promoter, as well as by additional translational and posttranslational control by the molecular clock machinery (Fu et al., 2002, 2005; Okazaki et al., 2010; Repouskou et al., 2010; Repouskou and Prombona, 2016). It is also likely that endogenous MYC influences the clock, but this potential role has not been elucidated.

Given that MYC rewires the cell for altered metabolism and growth, we hypothesized that hyperactivated oncogenic MYC could disrupt the molecular clock and thus alter circadian oscillation of metabolism. We found that overexpressed MYC and N-MYC upregulated many clock family members, including PER2, CRY1, and most notably, REV-ERBα (Altman et al., 2015), leading to a dramatic suppression of BMAL1 expression and oscillation, which could be rescued by knockdown of REV-ERBα and its binding partner REV-ERBβ (Bugge et al., 2012; Altman et al., 2015). Our study also showed that oncogenic MYC dramatically altered and disrupted circadian oscillation of glucose and AMPK phosphorylation (Altman et al., 2015), thus suggesting that oncogenic mutation may disrupt circadian gene expression, metabolic oscillations, and oscillation of cellular bioenergetics.

FIGURE 2 | Interdependent relationship of oncogenesis, metabolism, and the circadian clock. Oncogenesis (defined as hyperactivation of pro-growth pathways downstream of mutations or alterations in RAS or MYC, or loss of normal function in growth-suppressive pathways such as p53 or LKB1/AMPK, that lead to uncontrolled cell growth and transformation) is well known to alter cell metabolism, and these metabolic changes are necessary to support oncogenesis (Hirschey et al., 2015). As discussed in the Introduction, circadian rhythm strongly influences metabolism, and several metabolic pathways can feed back to control circadian rhythm. This Review demonstrates that oncogenic pathways, such as RAS, LKB1/AMPK, p53 (in part through p53 regulation of PER2), or MYC (in part through MYC activation of REV-ERBα), may disrupt or alter the normal peripheral circadian clocks of organs and individual cells. On the other hand, it has been shown that endogenous RAS, p53 (through PER2 regulation), and MYC oscillate on the genetic and functional level, and so it has been suggested that the clock itself is tumor suppressive (by regulating these oncogenes and tumor suppressors) and thus can prevent oncogenesis. What is still unknown is the extent to which altered metabolism downstream of cancer (and pathways such as RAS, LKB1/AMPK, p53, and MYC) contributes to suppression of the molecular clock. Red slash indicates pathways and proteins that are often lost in cancer, making them tumor-suppressive pathways.

Interestingly, MYC alteration of circadian gene expression seems to be highly cell-type specific. For instance, a recent study in HEK-293 and colon cancer cells showed that overexpressed MYC bound the PER1 promoter exclusively, and rather than transactivating expression, this binding led to a downregulation of PER1 due to competitive inhibition of CLOCK-BMAL1 promoter occupancy, which would presumably also lead to circadian disruption (Repouskou and Prombona, 2016). Alternatively, MYC overexpression in embryonic stem cells led to PER cytoplasmic accumulation rather than upregulation (Umemura et al., 2014). Another study identified CSNK1e (protein CK1ε) as a synthetic lethal target of MYC and N-MYC upregulated in neuroblastoma and other human cancers (Toyoshima et al., 2012), and upregulation of CK1ε would be expected to destabilize the clock through its promotion of PER degradation and activation of BMAL1 (Gallego and Virshup, 2007). It remains to be determined in what contexts overexpressed MYC in cancer deregulates clock genes through either promoter co-occupancy, competition with CLOCK-BMAL1 to transactivate or repress target genes, or through forming novel complexes with either CLOCK or BMAL1. Nonetheless, all of the above studies documented a role for overexpressed MYC in disruption of circadian oscillation, which as we showed has consequences for metabolic oscillation and cell physiology (Altman et al., 2015).

### CONCLUSIONS AND PERSPECTIVES: CONNECTIONS BETWEEN ONCOGENIC MUTATION, METABOLISM, AND CIRCADIAN RHYTHM, WITH AN EYE TOWARD CHRONOTHERAPY

Circadian rhythm is an essential part of cell physiology that underlies many biological processes. Common pathways involved in oncogenesis alter the molecular clock through a diverse set of mechanisms, and RAS, p53, and MYC are strongly regulated by the circadian machinery, suggesting a deep interdependent relationship that is lost when these genes are altered in cancer. The manner by which circadian oscillation is altered is varied: active RAS causes increases in amplitude, p53 loss causes phase shifts, and MYC seems to cause a suppression of overall oscillation. Adding another layer of complexity, both oncogenic alterations and circadian rhythm regulate metabolism, and metabolism itself can feed back to control circadian rhythm. An interesting consequence is that oncogenic alterations can potentially disrupt circadian rhythm both through direct effects on gene expression and protein regulation, and also through alteration of metabolism (**Figure 2**). However, the potential role of altered cancer metabolism in disruption of circadian rhythm has not been addressed. Additionally, it is not clear how potential oncogenic alterations of circadian rhythm respond to or modify synchronizing signals from the central clock.
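The three qualitative patterns described above (increased amplitude, phase shift, and suppressed oscillation) can be pictured with a simple cosine model of a clock-controlled gene. The following is only a minimal sketch: the function names, the baseline level, and all parameter values are illustrative assumptions and are not fitted to data from any of the studies cited here.

```python
import math

def oscillation(t_hours, mesor=1.0, amplitude=0.5, phase_hours=0.0, period_hours=24.0):
    """Level at time t under a cosine model: baseline (mesor) plus a rhythmic term."""
    angle = 2 * math.pi * (t_hours - phase_hours) / period_hours
    return mesor + amplitude * math.cos(angle)

def peak_time(profile):
    """Sampled time point (h) at which the profile is highest."""
    return max(profile, key=profile.get)

def peak_to_trough(profile):
    """Difference between the highest and lowest sampled levels."""
    return max(profile.values()) - min(profile.values())

times = range(0, 24, 2)  # hypothetical sampling every 2 h over one day

# Hypothetical reference rhythm and three altered rhythms (values are arbitrary).
normal = {t: oscillation(t) for t in times}
higher_amplitude = {t: oscillation(t, amplitude=1.0) for t in times}  # amplitude increase
phase_shifted = {t: oscillation(t, phase_hours=6.0) for t in times}   # peak moved by 6 h
suppressed = {t: oscillation(t, amplitude=0.05) for t in times}       # nearly flat rhythm

for name, profile in [("normal", normal), ("higher amplitude", higher_amplitude),
                      ("phase shifted", phase_shifted), ("suppressed", suppressed)]:
    print(f"{name}: peak at {peak_time(profile)} h, "
          f"peak-to-trough {peak_to_trough(profile):.2f}")
```

In this toy model the period is held at 24 h in every case, so the sketch separates changes in amplitude and phase from changes in period, which are not considered here.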

Several unanswered questions arise from the work reviewed here. First, why do many cancers potentially disrupt circadian rhythm? One can imagine that circadian oscillation, which imposes a "rest" phase every 24 h, is maladaptive to cancer cells, and so altering or destroying this rhythm might allow transformed cells to outcompete their non-transformed neighbors. The clock may be upstream of normal tumor suppressors and proto-oncogenes (Sahar and Sassone-Corsi, 2009) to regulate normal metabolism and growth, and as shown above, these pathways seem to form feedback mechanisms with the clock that are lost in cancer, perhaps releasing oncogenes, tumor suppressors, and even metabolism from circadian control.

Second, how can the cancer research community use this knowledge of circadian disruption to better treat cancer? The answer may lie in chronotherapy, or timed administration of treatment to patients, based on circadian rhythm, to increase efficacy and reduce toxicity of drugs or radiation. Dozens of traditional cancer therapeutics, including the anti-metabolite folate pathway antagonist methotrexate, have known circadian-dependent toxicity (Levi et al., 2010). Excitingly, recent research indicates that several targeted therapies currently in clinical use have strongly circadian-dependent efficacy depending on the time of day given, including but not limited to erlotinib (inhibits EGFR, used in lung cancer), lapatinib (inhibits HER2/Neu and EGFR, used in breast cancer), and everolimus (inhibits mTOR, used in some breast cancers and pancreatic neuroendocrine tumors), and in fact there are several chronotherapy dosing schedules under clinical trial (Dallmann et al., 2016). Better knowledge of how specific oncogenes disrupt normal oscillation of tumor cells could lead to more effective strategies in delivery of targeted or metabolic therapies. Circadian disruption is potentially an essential part of the evolution of cancer, and further study will allow us to better understand both the benefits to cancer of this disruption, and how this knowledge can be used to help patients.

### AUTHOR CONTRIBUTIONS

BA conceived of the review topic, performed the literature research for the review, wrote the review and designed the figures, and edited the review for final submission and revision.

### ACKNOWLEDGMENTS

I would like to acknowledge Chi Dang, Annie Hsieh, Zandra Walton, and Zachary Stine (University of Pennsylvania) as well as John Hogenesch (University of Cincinnati) and Arvin Gouw (Stanford University) for helpful commentary and discussion. I would also like to acknowledge and thank Elsevier for permission to reprint and modify the Graphical Abstract from Altman et al. (2015) for **Figure 1**. I apologize to authors whose work could not be included in this minireview due to space limitations. I am supported by the National Cancer Institute of the National Institutes of Health under F32CA180370. The content is solely my responsibility and does not necessarily represent the official views of the National Institutes of Health.

## REFERENCES


differentiation-coupled circadian clock development in mammalian cells. Proc. Natl. Acad. Sci. U.S.A. 111, E5039–E5048. doi: 10.1073/pnas.1419272111


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer RS and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Altman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# "Gestaltomics": Systems Biology Schemes for the Study of Neuropsychiatric Diseases

Nora A. Gutierrez Najera<sup>1</sup>, Osbaldo Resendis-Antonio<sup>1, 2</sup> and Humberto Nicolini<sup>1</sup>\*

<sup>1</sup> Instituto Nacional de Medicina Genómica, Mexico City, Mexico, <sup>2</sup> Human Systems Biology Laboratory, Coordinación de la Investigación Científica - Red de Apoyo a la Investigación, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, National Autonomous University of Mexico (UNAM), Mexico City, Mexico

The integration of different sources of biological information about what defines a behavioral phenotype is difficult to unify in an entity that reflects the arithmetic sum of its individual parts. In this sense, the challenge of Systems Biology for understanding the "psychiatric phenotype" is to provide an improved vision of the shape of the phenotype as it is visualized by "Gestalt" psychology, whose fundamental axiom is that the observed phenotype (behavior or mental disorder) will be the result of the integrative composition of every part. Therefore, we propose the term "Gestaltomics" as a term from Systems Biology to integrate data coming from different sources of information (such as the genome, transcriptome, proteome, epigenome, metabolome, phenome, and microbiome). In addition to this biological complexity, the mind is integrated through multiple brain functions that receive and process complex information through channels and perception networks (i.e., sight, hearing, smell, memory, and attention) that in turn are programmed by genes and influenced by environmental (epigenetic) processes. Today, the approach of medical research in human diseases is to isolate one disease for study; however, the presence of an additional disease (co-morbidity) or more than one disease (multimorbidity) adds complexity to the study of these conditions. This review will present the challenge of integrating psychiatric disorders at different levels of information (Gestaltomics). The implications of increasing the level of complexity, for example, studying the co-morbidity with another disease such as cancer, will also be discussed.

Keywords: systems biology, psychiatry, lung cancer, diagnosis, omics

### INTRODUCTION

According to the World Health Organization (WHO), the frequency of psychiatric diseases has been steadily increasing (World Health Organization, 2011). Furthermore, many patients do not fully respond to therapy. Knowledge of the pathophysiology of neuropsychiatric disorders is currently limited, which in turn diminishes the ability to identify clinical biomarkers for the early diagnosis of patients at risk (Martins-de-Souza, 2014; Sethi and Brietzke, 2015).

**Abbreviations:** CSF, cerebrospinal fluid; RBC, red blood cells; CNV, copy number variation; SNV, single nucleotide variation.

#### Edited by:

Natalia Polouliakh, Sony Computer Science Laboratories, Japan

#### Reviewed by:

Maria Suarez-Diez, Wageningen University and Research Centre, Netherlands Hisham Bahmad, American University of Beirut, Lebanon

> \*Correspondence: Humberto Nicolini hnicolini@inmegen.gob.mx

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

Received: 20 September 2016 Accepted: 19 April 2017 Published: 09 May 2017

#### Citation:

Gutierrez Najera NA, Resendis-Antonio O and Nicolini H (2017) "Gestaltomics": Systems Biology Schemes for the Study of Neuropsychiatric Diseases. Front. Physiol. 8:286. doi: 10.3389/fphys.2017.00286

The classical approach for psychiatric diagnosis includes an essential evaluation of the mental health of the patient, by means of an interview, to determine the presence of a series of signs and symptoms (Fatemi and Clayton, 2008). For instance, paranoid schizophrenia is diagnosed by the presence of delirium, hallucinations, self-inflicted injuries, personality disorders, lack of substance abuse, and the continuity of this clinical picture for more than 6 months. In addition, the Diagnostic Interview for Genetic Studies (DIGS) is widely used in the diagnosis of schizophrenia, validated for both USA and non-USA populations, along with additional sources of information such as the Family Interview for Genetic Studies (Contreras et al., 2009). In this case, laboratory studies such as urine drug screens or sleep-deprived electroencephalograms are used to exclude stimulant-induced psychosis or complex partial (temporal lobe) seizures (Lishman, 1987). A positive family history provides further support in the diagnosis of schizophrenia. Thus, the diagnostic process in psychiatry is analogous to other branches of medicine where personal and family history, physical examination, and laboratory tests constitute essential steps. Regardless, it is difficult to obtain an accurate description without careful and skillful probing during face-to-face interviews. However, this phenomenology can be interpreted under different theoretical frames of reference pertaining to the formulation of the case but not to diagnosis (Fatemi and Clayton, 2008).

Although the Diagnostic and Statistical Manual of Mental Disorders (DSM) is often useful in classical diagnoses, it is not designed to facilitate the development and integration of biomedical knowledge. Therefore, the National Institute of Mental Health has developed an alternative tool known as the Research Domain Criteria (RDoC). This multidimensional approach utilizes units of information beyond clinical phenotypes, e.g., imaging and behavior. Thus, a matrix is developed with constructs that can be related to different elements of information ranging from imaging to genetics (American Psychiatric Association).

The Human Genome Project, along with high throughput technologies, has increased the biological knowledge of several human illnesses. Genome sequencing and analyses of physiological states have further contributed to this purpose. However, the genome as a whole is difficult to interpret, and multifactorial diseases such as diabetes, cancer, and neurological disorders, which often involve a large number of genes, biological pathways, and environmental factors, further complicate any assessment. Therefore, the combination of genomic information with a detailed molecular analysis will be important in the prediction, diagnosis and treatment of diseases, also allowing the understanding of initiation, progression, and prevalence of disease states (Williams et al., 2004; Shi et al., 2009). In this regard, metabolomics is the newest of the "omics" sciences; it provides a comprehensive approach to understanding the biochemical regulation of metabolic pathways and networks in a biological system. Metabolomics is able to complement the data from genomics, transcriptomics, and proteomics to provide a potentially systemic approach in the study of central nervous system (CNS) diseases (Weckwerth and Morgenthal, 2005). However, few studies in neuroscience currently address the integration of data from different "omics" sciences.

Often, neuropsychiatric diseases are biologically difficult to define partly because the brain is more difficult to access than other parts of the body. Moreover, research in psychiatry is compounded by the complexity of the brain and the heterogeneity of phenotypes in psychiatric disorders. Brain imaging, genotyping, and immune system testing are important approaches in understanding the biology of psychiatric illness. Advances in technology have made possible the analysis of whole units of cellular components. Regardless, the study of protein and metabolic function in the CNS is made difficult because of intricate cellular heterogeneity with a complex neuronal morphology that includes cellular compartments such as neural dendrites, postsynaptic dendritic spines, axons and presynaptic terminals. Another factor contributing to the difficulty in studying the metabolome of the CNS in humans is the limited access to either tissue or fluids, such as cerebrospinal fluid (CSF), in order to study molecular alterations in psychiatric disorders. Due to ethical considerations, it is often preferable to analyze peripheral samples such as plasma, serum, leukocytes and platelets, which are more easily available (Hayashi-Takagi et al., 2014). An "omics" approach has the potential to accelerate the discovery of markers for CNS diseases (Niculescu et al., 2015a). For example, Systems Biology is already used in the analysis of data from several "omics" technologies, such as proteomics, improving the discovery of pathophysiological mechanisms and biomarkers for brain injuries that could lead to Alzheimer's and Parkinson's diseases (Abou-Abbass et al., 2016; Jaber et al., 2016a,b).

The tendency today is to integrate the data from clinical and "omics" studies to obtain a final behavioral phenotype (phenome) (Williams et al., 2004; Monteith et al., 2015; Sethi and Brietzke, 2015). Genome-wide association (GWA) studies with metabolic measurements have shown that genetic variation in metabolic enzymes and transporters leads to concentration changes of their respective metabolites (Suhre et al., 2011; Krumsiek et al., 2012). The main goal of these studies is to identify new interactions between genomic and metabolic systems, yielding valuable insight for basic research and clinical application. The analysis of metabolic data is often the result of several processes where a substance can be identified as unique in the sample but the specific process from which it was derived is unknown. This concept is similar to the identification of a fingerprint: each one is identifiable as unique, but it must be registered in a database before we can know who owns that print. The association with genetics provides evidence of the metabolic pathway wherein such a metabolite is involved and the process from which it originates (Suhre et al., 2011; Krumsiek et al., 2012).
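The idea that a genetic variant shifts the concentration of a metabolite can be illustrated with a simple additive model, in which concentration is regressed on the number of copies of an allele (0, 1, or 2). The sketch below is only an illustration under stated assumptions: the function name and the genotype and concentration values are hypothetical, and actual GWA analyses of metabolic traits involve covariates, very large numbers of variants, and correction for multiple testing that are omitted here.

```python
def additive_effect(genotypes, concentrations):
    """Least-squares slope of metabolite concentration on allele dosage (0, 1, 2)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_c = sum(concentrations) / n
    # Covariance between dosage and concentration, and variance of dosage.
    cov = sum((g - mean_g) * (c - mean_c) for g, c in zip(genotypes, concentrations))
    var = sum((g - mean_g) ** 2 for g in genotypes)
    return cov / var

# Hypothetical data: allele dosage for six individuals and the concentration
# (arbitrary units) of one metabolite measured in each of them.
genotypes = [0, 0, 1, 1, 2, 2]
concentrations = [1.0, 1.2, 1.5, 1.7, 2.0, 2.2]

beta = additive_effect(genotypes, concentrations)
print(f"Estimated change in concentration per allele copy: {beta:.2f}")
```

A slope clearly different from zero would suggest that the variant, or something linked to it, is connected to the pathway in which the metabolite participates; in practice that inference also requires an estimate of statistical uncertainty, which this sketch does not compute.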

From the point of view of Gestalt psychology, the first biological response is organized as units or structures; these organized units or "gestalten" correspond to the exchange of information and interactions between environmental stimuli and the individual. The resulting "gestalten" are different than the sum of their factors so "there is a tendency not only to perceive the gestalten but also to complete and reorganize them according to biological principles, which will vary in the different levels of maturation or growth and the pathological states" (Bender, 1938). Currently, the approach based on systems biology methods is the most suitable for data integration from different levels of information (genome, transcriptome, proteome, epigenome, metabolome, phenome, and microbiome), in order to unify and reorganize these "gestalten" (organized units of biological or clinical data) in an integrated view of the psychiatric patient. Therefore, "Gestaltomics" is an integrated view of different levels of information ranging from clinical to "omics" data, aimed at the diagnosis of neuropsychiatric diseases. Early diagnosis of these disorders could reduce the risk of developing chronic diseases such as obesity, diabetes, and cancer. Past research has proposed that affective disturbances involving mood alterations, anxiety, and irritability may be signals of medical conditions along with psychiatric diseases (Cosci et al., 2015). In this regard, depressive symptoms are the first to occur in approximately 38–45% of pancreatic carcinoma cases, and symptoms of a major depressive illness may precede the diagnosis of lung cancer (Jacobsson and Ottosson, 1971; Hughes, 1986). These studies conclude that the development of psychiatric illness early in the course of a medical condition could affect the prognosis and therapy for patients diagnosed with the same medical disease.

On the other hand, addictive disorders are a class of chronic, relapsing mental disorders that often result in death. In fact, tobacco dependence is associated with a higher risk for disease and premature death because of its association with several major health problems including respiratory and cardiovascular diseases, and cancer. There is a current initiative to test the Smokescreen genotyping array, a research tool intended to advance the understanding of addiction and the development of predictive models for personalized treatment strategies. This array includes markers related to addiction and, interestingly, it also has an additional set of comorbidity markers for lung cancer and other psychiatric disorders (Baurley et al., 2016).

Therefore, understanding the molecular factors contributing to psychiatric illness and identifying new biomarkers is essential in the proposal of alternative tools for diagnosis, prognosis, screening, or therapeutic targets. This manuscript describes some examples on the current knowledge of the "omics" field in three psychiatric conditions and their correlation with complex diseases, mainly cancer.

#### "Omics" Technologies Applied in Schizophrenia

Schizophrenia was described by Emil Kraepelin as "dementia praecox, separated from manic-depressive psychosis" (Kraepelin, 1893). The current criteria for schizophrenia diagnosis have been compiled from years of empirical testing and recorded in the Diagnostic and Statistical Manual, 5th edition (DSM-5). The existence of different types of schizophrenia has been proposed, each one with its own phenotype and genotype. Most research has been focused on loci in chromosomes 6, 8, 13, and 22. Of these chromosomes, chromosome 22 calls for attention since it contains the comt (catechol-O-methyltransferase) gene, involved in dopamine metabolism. Therefore, individuals with a particular comt genotype (e.g., val/val allele) are at risk of developing schizophrenia (Combs et al., 2012). Research conducted on samples from schizophrenia patients, both peripheral and postmortem brain samples, revealed a correlation, although low, in the results obtained from peripheral samples (blood, plasma, serum, and platelets) compared to CNS samples (CSF, prefrontal cortex and other brain tissues). In one of these studies using DNA microarrays, postmortem analyses detected 177 genes in schizophrenia-related brains. Of these genes, only 6 correlated with the results obtained in blood (Glatt et al., 2005). In another study, half of the genes found related to schizophrenia in the prefrontal cortex were also found in blood from the same patients (Sullivan et al., 2006). The hypomethylation of st6galnac1 in the blood and brain of schizophrenia patients has been previously reported (Dempster et al., 2011). Allele copy number variations (CNVs) seem to be the most relevant risk factor for schizophrenia, and the 15q11.2 (BP1-BP2) deletion confers the risk for developing schizophrenia (Stefansson et al., 2013). Using metabolomics, an increase in free fatty acids and ceramide in blood and brain samples was observed (Schwarz et al., 2008). Proteomics experiments using SELDI-TOF MS showed that the ApoA1 protein was downregulated in CSF and blood (red blood cells; RBC) (Huang et al., 2007). On the other hand, current advances in schizophrenia physiopathology research and the molecular effects of anti-psychotic drugs have made clear the need for biomarkers for this disease. Metabolomics techniques are useful not only for this purpose but also for monitoring the effect of these types of drugs in psychiatric patients.

A metabolomics study of serum using mass spectrometry (MS) reported 20 metabolites in patients with schizophrenia whose levels were modified when compared with the controls. These metabolites include citrate, palmitic acid, allantoin, and myo-inositol (Xuan et al., 2011). He et al. (2012) performed a nuclear magnetic resonance (NMR) study in the plasma from schizophrenia patients. In this study, the patients were diagnosed before starting the treatment. There was also a group of subjects under medication. Both groups were compared to the control group (no schizophrenia), identifying metabolites different from those in the study performed by Xuan et al. (2011).

#### "Omics" Technologies Applied in Autism

Autism spectrum disorders (ASDs) are highly heritable and genomic studies have revealed that a substantial proportion of ASD risk resides in rare variations ranging from chromosome abnormalities (CNV) to single-nucleotide variations (SNV). These studies highlight a striking degree of genetic heterogeneity, implicating both de novo germline mutation and rare inherited ASD variations (Pinto et al., 2014). De novo CNVs are observed in 5–10% of screened ASD-affected individuals, and after further follow-up studies, some of them have been shown to alter high-risk genes. De novo or transmitted CNVs, such as 15q11.2–q13 duplications of the affected region in Prader-Willi and Angelman syndromes, the 16p11.2 deletion, 16p11.2 duplication, and X-linked deletions, including the PTCHD1-PTCHD1AS locus, have also been found to contribute to this risk (Stefansson et al., 2013). Exome and whole-genome sequencing studies have estimated at least another ∼6% contribution to ASD and an additional 5% conferred by rare inherited recessive or X-linked loss-of-function (LoF) SNVs (Pinto et al., 2014 and references therein). A genetic overlap between ASD and other neuropsychiatric conditions has been increasingly recognized. Informative studies on the metabolome of ASD individuals showed alterations in the levels of amino acids in plasma, platelets, urine and CSF (Ming et al., 2012). Further, it has been reported that the neurotransmitter and hormone metabolism of serotonin, catecholamines, melatonin, oxytocin, GABA, and endorphins, for example, are altered. In a case-control study, changes in the levels of succinate and glycolate in urine were observed (Emond et al., 2013). Therefore, alterations in metabolism are common features of ASD. In this regard, gut microbiota has important effects in the development of behavioral symptoms relevant to ASD and other neurodevelopmental disorders in a mouse model (Hsiao et al., 2013).

#### "Omics" Technologies Applied in Suicide

In the area of mental health, suicide is a particular prevention priority as it accounted for an estimated 804,000 deaths in 2012 (World Health Organization, 2015b). An objective of the WHO Mental Health Action Plan calls for a 10% reduction in the rate of suicide by 2020. Men are four times more likely to commit suicide than women. However, women make more nonfatal suicide attempts than men. There are several factors involved in suicide and suicide attempts, the most important of which is having a psychiatric disorder. More than 90% of people who die by suicide have a diagnosable psychiatric disorder at the time of death, mood disorders being the most common (Fatemi and Clayton, 2008). The origin of suicidal behavior is multifactorial and includes genetic, biological, and psychosocial factors. The slc6a4 gene has been associated with suicide but only in women (Gaysina et al., 2006). The gene comt has been related to suicide in both women and men, but the degree of association differs between genders (Kia-Keating et al., 2007). GWA studies have found gene markers for suicidal ideation such as polymorphisms rs11628713 and rs109030324 of genes papln and il28ra, respectively (Laje et al., 2009). A study addressing the relationship between genotype and brain transcriptome reported that the GABA A receptor gamma 2 (gabrg2) had lower postmortem expression in the brains of suicide cases and was thus associated with suicide (Yin et al., 2016). Amongst the polygenes implicated with 590 suicide attempts (SA) were several associated with important developmental functions (cell adhesion/migration, small GTPase and receptor tyrosine kinase signaling), and 16 of these SA polygenes have previously been studied in suicidal behavior (bdnf, cdh10, cdh12, cdh13, cdh9, creb1, dlk1, dlk2, efemp1, foxn3, il2, lsamp, ncam1, ngf, ntrk2, and tbc1d1) (Sokolowski et al., 2016). A recent study sought biomarkers for suicidal ideation using functional genomics. The authors identified genes involved in neuronal connectivity and schizophrenia, and the biomarkers validated for suicidal behavior included a wide number of genes involved in neuronal activity and mood. The 76 biomarkers validated for suicidal behavior map to biological pathways involved with the immune and inflammatory response, mTOR signaling, and growth factor regulation. Further, other potential therapeutic targets or biomarkers for drugs known to mitigate suicidality were identified, such as omega-3 fatty acids, lithium, and clozapine. These biomarkers are also involved in psychological stress response and in programmed cell death (apoptosis) (Niculescu et al., 2015b). A proteomics study of prefrontal cortex tissues showed that alpha-crystallin B chain (CRYAB), glial fibrillary acidic protein (GFAP), and manganese superoxide dismutase (SOD2) appear only in suicide victims (Schlicht et al., 2007). Despite the vast amount of information from suicide "omics," it has not been possible to integrate the data to provide a "gestalt" view of the individual that would allow the prevention of this behavior and its outcome. Thus, the integration of this knowledge will provide new methods for the diagnosis and treatment of this complex behavior.

### Adding One Level of Complexity: Comorbidity of Cancer and Psychiatric Disorders

To drive progress toward a new era of precision medicine, in 2015 President Obama proposed a research initiative (www.whitehouse.gov/precisionmedicine). Precision medicine includes prevention and treatment strategies taking individual variability into account. This concept has been improved by the development of large-scale biological databases, powerful methods for characterizing patients, and computational tools for the analysis of large data sets. The proposed initiative has two main components: a near-term focus on cancer and a long-term aim to generate applicable knowledge for the whole range of health and disease (Collins and Varmus, 2015).

Cancer is a major public health problem and a challenge that needs to be solved by a multidisciplinary approach (World Health Organization, 2015a). Its appropriate control includes health care education to improve prevention and early detection programs; and optimizing diagnosis to determine specific treatment and provide palliative care improving the patients' quality of life (Mohar et al., 2009).

The need for psychiatric services in hospitals can be observed by the high prevalence of psychiatric disorders. In oncology hospitals, the prevalence of these disorders is approximately 50% (Citero Vde et al., 2003). A study evaluating the prevalence of psychiatric illness in cancer patients reported that 47% of cancer patients were diagnosed with mental disorders, amongst them 85% with anxiety and depression, 8% with cerebral organic disorders, and 7% with personality disorders (Citero Vde et al., 2003). Another study reported this type of disorder in 11–21% of patients at the hospital (Razavi et al., 1990). Delirium is often found in patients at the general hospital. The prevalence is 25% in cancer patients, and 85% in terminally ill patients. Psychoses and cognitive impairment have demonstrated a key role in slowing down the progress of cancer treatment in these patients (Citero Vde et al., 2003). The comorbidity of smoking and psychosis has severe effects on morbidity and mortality in those patients and results in an increased number of deaths by suicide. It has been estimated that schizophrenia patients are addicted to nicotine in 80% of cases compared to 22% in the healthy population (Brown, 2000). On the other hand, obsessive-compulsive disorder seems to be a protective pathology against nicotine addiction (Dell'Osso et al., 2015).

Cancer is a disease whose treatment has high personal, financial, and social costs. These factors influence the development of anxiety and depression disorders in the cancer patient. Cancer treatments such as chemotherapy, which produces serious side effects, also contribute to these disorders. For instance, breast cancer comorbidity with depression is associated with a poorer quality of life, poor treatment adherence, impaired physical and cognitive function, and cancer progression or survival. Understanding the etiology of depression associated with breast cancer is a major concern. Depression in breast cancer patients is often the result of several contributing biological factors, amongst them hormonal, inflammatory, and genetic mechanisms, together with psychological factors such as bodily disfigurement and impaired sexual function. Genetic risk is important in the etiology of depression precipitated by medical conditions like cancer, which has been proposed as an environmental risk factor (Caspi et al., 2010). Smoking is one of the main environmental risk factors. In fact, the WHO global initiative, the Framework Convention on Tobacco Control (Deland et al., 2003), is one of the first strategies for primary prevention of cancer, because tobacco is related to 16 different types of cancer and smoking is the cause of 71% of deaths due to lung cancer (2015). Knowledge from biological, molecular, and clinical data could improve the outcome of this disease and help avoid behaviors that increase the susceptibility to cancer development. A broad research program to encourage creative approaches to precision medicine, test them rigorously, and use them to build the evidence base needed to guide clinical practice is essential (Collins and Varmus, 2015). A clear example of this is the relation between smoking and lung cancer.

#### Nicotine Addiction and Lung Cancer

The clearest example of how a psychiatric disorder influences the development of cancer is the relation between smoking and lung cancer. Smoking is an addictive disorder and a major public health concern. It is the leading preventable cause of death worldwide, as active smoking causes several chronic diseases, including several types of cancer and respiratory and cardiovascular diseases (World Health Organization, 2008).

The evidence presented in the 2014 Surgeon General's Report (US Health Department) is modifying cancer care. The detrimental consequences of smoking in patients with cancer are mediated by the activation of tumorigenic pathways and by physiological alterations, including complications associated with cancer treatment and the development of comorbidities. However, no cancer treatment has been proved more effective in cancer patients who smoke than in non-smoking patients, nor are there any prognostic biomarkers for cancer patients who continue to smoke (US Department of Health, 2014).

If both processes share the same molecular basis, and therefore the same biological pathways, it is important to study psychiatric diseases together with co-morbidities such as cancer. The neuronal nicotinic acetylcholine receptors (nAChRs), a protein family of pentameric ligand-gated ion channels, are potential candidates. These receptors can mediate signal transmission through the synapse as well as the release of several neurotransmitters. The main receptor subtype in the brain is the α4β2 form. Some α4β2 receptors also contain the α5 subunit, which is regulatory, inactivating the receptor. Nicotine is an exogenous agonist of these receptors. Seconds after a person starts to smoke, nicotine produces a physical response. Recent studies show that nicotine, despite not being carcinogenic, promotes cell proliferation, metastasis, angiogenesis, and resistance to apoptosis (Warren et al., 2014, and references therein). These processes, mediated by nAChRs, may influence the effectiveness of anti-cancer treatment (chemotherapy, radiotherapy, or targeted therapy). The evidence indicates that patients who smoke have lower survival rates than patients who give up smoking before starting treatment, suggesting that nicotine supplied as part of smoking cessation treatment may reduce the response to anticancer drugs (Czyzykowski et al., 2016).

Nicotine and its metabolites activate nAChRs and β-adrenergic receptors, which in turn activate several pathways, such as the Ras/Raf/MEK/MAPK and PI3K/Akt oncogenic pathways; cross-activation of these pathways produces a tumor-promoting phenotype. Furthermore, nicotine and the activation of nAChRs decrease the therapeutic response to chemotherapy and radiotherapy both in vitro and in vivo (Dasgupta et al., 2006; Warren et al., 2010; Momi et al., 2012).

Genetic variations in nAChRs have been proposed as strong risk factors for nicotine dependence and susceptibility to lung cancer. GWAS involving human addictions in lung cancer patients have reported the same variants in the gene cluster chrna5/a3/b4 that were previously associated with nicotine dependence and lung cancer susceptibility (Wang et al., 2009). This gene cluster plays a key role in nicotine dependence, lung cancer, and loss of lung function when the A allele of the polymorphism rs16969968 is present (Gabrielsen et al., 2013). Moreover, nicotine was suggested as an intermediary factor between variants at the chrna5/a3/b4 region and lung cancer (Tseng et al., 2014). Although it was previously considered that rare nonsynonymous variants in this region played a protective role, the variant rs56501756, encoding R336C, confers a risk for nicotine dependence, lung cancer, and other smoking-related diseases (Thorgeirsson et al., 2016).

Moreover, there is evidence that smoking cessation treatments are affected by genetics. The chrna5/chrna3/chrnb4 cluster defines haplotypes of low, intermediate and high risk of cessation treatment failure, according to the presence of polymorphisms rs16969968 and rs680244 (Chen and Bierut, 2013). Therefore, the identification of smokers with different haplotypes implies the need for personalized smoking cessation treatments.

However, research is not limited to genetic data; there is also research on the association between nicotine metabolism and genotype. One example concerns the cyp2a6 gene coding for P450 2A6, the major nicotine-metabolizing enzyme. Genetic variations in the cytochrome cyp2a6 gene contribute greatly to the observed differences in nicotine metabolism, thus influencing smoking habits in different populations (Park et al., 2016). Differences in nicotine metabolism and risk of nicotine addiction have been attributed to functional allelic variation in cyp2a6 (Mwenifumbo and Tyndale, 2009; Al Koudsi and Tyndale, 2010). The meta-analysis of samples from the ENGAGE consortium demonstrated the association between SNPs in this locus and the number of cigarettes smoked per day (Thorgeirsson et al., 2010). Further evidence on the association of cyp2a6 with the number of cigarettes smoked per day and nicotine dependence comes from the combined effects of the chrna5/chrna3/chrnb4 cluster and this gene, which show independent and additive effects of allelic risk for these two chromosomal regions in two phenotypes (Wassenaar et al., 2011).

Active smoking is an established critical factor for epigenetic modification. Methylation changes have been detected in studies of the association between active smoking exposure and methylation patterns, amongst them epigenome-wide association studies (EWASs) and gene-specific methylation studies (GSMSs) (Gao et al., 2015). At the molecular level, epigenetic factors such as DNA methylation have been proposed as biomarkers for both psychiatric disease and cancer (Ai et al., 2012). A correlation between methylation in leukocytes and in neurons from the same patient with Parkinson's disease has been reported (Masliah et al., 2013). In breast cancer, methylation of the bdnf gene (brain-derived neurotrophic factor) has been studied in relation to depression in mastectomy patients (Kim et al., 2013; Kang et al., 2015). In fact, the onset of smoking has been associated with bdnf, a neurotrophin identified as a possible candidate gene (Tobacco Genetics Consortium, 2010).

### Systems Biology and the Challenges in Understanding the Underlying Mechanisms of Human Behavior

Data-intensive science consists of three basic activities: capture, curation, and analysis. Each of these phases poses a challenge for systems biology, owing not only to the size of the data but also to their increasing complexity. Curation and analysis become important after data have been captured from several experiments. Curation includes storage, retrieval, dissemination, and data filtering and integration. Algorithms and software tools developed for the analysis of biological data also face the problem of scalability as data sets grow. Nevertheless, several large databases have been created around the world for the curation and analysis of biological data, and their data volume and performance are gradually improving. These databases include GenBank and the Gene Expression Omnibus (GEO) from NCBI (Altaf-Ul-Amin et al., 2014). Other projects have been initiated more recently, such as ENCODE (Encyclopedia of DNA Elements), a project supported by an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI/NIH). ENCODE aids biologists who use human and/or animal genetic data to study disease by providing a comprehensive list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control the cells and circumstances in which a gene is active.

Further, global metabolomics is used for the identification of metabolic pathways altered due to disturbances in biological systems. The statistical analysis involves an extensive process that may sometimes lead to the identification of a very narrow range of metabolites as biomarkers. In this regard, the Human Metabolome Project, funded by Genome Canada, was launched in 2005. The purpose of the project is to facilitate metabolomics research by providing a linkage between the human metabolome and the human genome. The project's mission is to identify, quantify, catalog, and store all metabolites that can potentially be found in human tissues and biofluids at concentrations greater than one micromolar. These data are freely accessible through the Human Metabolome Database (www.hmdb.ca) (Wishart, 2007; Wishart et al., 2009, 2012). The application of metabolomics in cancer research has led to a renewed appreciation of metabolism in cancer development and progression. It has also led to the discovery of biomarkers and novel cancer-causing metabolites. However, with so many cancer-associated metabolites being identified, it is often difficult to associate these compounds with their respective cancer type. It is also challenging to track down information on the specific pathways that particular metabolites, drugs, or drug metabolites may be affecting (Wishart et al., 2016).

The ENIGMA Consortium is an initiative seeking to integrate genetics, genomics, and brain imaging (http://enigma.ini.usc.edu); it is a global alliance of over 500 scientists spread across 200 institutions in 35 countries collectively analyzing brain imaging, clinical, and genetic data. ENIGMA has grown to over 30 working groups studying 12 major brain diseases, pooling and comparing brain research data. In some of the largest neuroimaging studies to date, such as in schizophrenia and major depression, ENIGMA has found replicable disease effects that are consistent worldwide, as well as common factors that modulate disease effects in different populations (Thompson et al., 2015, 2016).

Systems biology is being used to analyze data from different levels of information in psychiatric disease. In a study of CNVs and SNVs in genes related to ASD, chromatin remodeling and transcription regulation were inferred from functional gene networks related to neuronal signaling and development, synapse function, chromatin regulation, MAPK, and other signaling pathways (Pinto et al., 2014). Other studies in systems biology suggest that the interplay between sleep, stress, and neuropathologies emerges from genetic influences on gene expression and their collective organization through complex molecular networks relating to underlying sleep mechanisms, stress susceptibility, and neuropsychiatric disorders (Jiang et al., 2015). In animal models, a systems biology study based on proteomic and metabolomic research developed a schematic model summarizing the most prominent molecular network findings in the Df(16)A± mouse (a model of the 22q11.2 deletion syndrome). Interestingly, the implicated pathways were linked to one of the proteomic candidates, O-linked N-acetylglucosaminyltransferase (OGT1), a predicted miR-185 target and a new mechanism associated with 22q11DS, which may be linked to cognitive dysfunction and an increased risk of developing schizophrenia (Wesseling et al., 2016).

An analysis comparing the proteome and biological pathways involved in different psychiatric illnesses showed molecular similarities across all major neuropsychiatric disorders. These results, analyzed by systems biology methods, showed an overlap of pathways affecting protein expression in a similar manner in these disorders. This supports the hypothesis that major neuropsychiatric disorders represent a disease of the brain with a spectrum of phenotypes derived from the genotype and the effect of environmental stimuli (**Figure 1**; Gottschalk et al., 2014).

One of the best efforts to integrate the phenome with the genome is exemplified by the Consortium for Neuropsychiatric Phenomics at the University of California, Los Angeles (UCLA) (Bilder et al., 2009b). Besides making available a brain imaging database of healthy individuals and patients with neuropsychiatric disorders such as schizophrenia, bipolar disorder, and attention deficit/hyperactivity disorder, it also provides bioinformatics tools to visualize and analyze these datasets in a "systematic study of phenotypes on a genome-wide scale," including basic and clinical information (Poldrack et al., 2016). The concept of phenome is evolving into phenomics, or "the discipline to enable the development and adoption of high-throughput and high-dimensional phenotyping" (Bilder et al., 2009a; Houle et al., 2010). The "phenomics" proposal of the Consortium includes an integrative vision of data in other complex biological systems and is already achieving that integration (Bilder et al., 2013). We wish to carry this vision into medical practice, where it should also consider the socio-cultural issues and comorbidities of patients. In this sense, we propose that "phenomics" applied to patient-centered medical practice will become "gestaltomics" in the near future.

Despite the efforts to integrate several networks of information, it has not yet been possible to personalize medicine through an integrative view of the individual across different levels of information. "Gestaltomics" is therefore a unifying vision of different sources of information through a systems biology approach that is not limited to a biological understanding of the disease and instead follows an old medical principle from Hippocrates: "It is far more important to know what person the disease has than what disease the person has." The onset of symptoms identifies the clinical stage of the disease at the time of diagnosis. The disease can progress to a mild, severe, or fatal form, i.e., "the spectrum of disease." The disease process results in recovery, disability, or death, which is why it is important to identify the individuals at risk (The Center for Disease Control and Prevention, US Department of Health Human Services, 1992). The early screening of a high-risk group, such as smokers, during the subclinical stage of the disease could make a difference in the development of a disease such as cancer or influence its outcome. These screenings could involve the analysis of blood and urine samples, which are easy to obtain, and the genotyping of genes such as the cluster chrna5/chrna3/chrnb4. Further, the appropriate diagnosis of the psychiatric disease at the onset of symptoms could lead to an adequate treatment or therapy for the patient.

When psychiatric illness and cancer develop together, the complexity of both diseases increases. Thus, a global view of the individual is vital for cancer survival. **Figure 2** shows a diagram describing the major levels of information regarding both psychiatric diseases and cancer implicated in the "gestaltomics" approach for disease diagnosis, prognosis, and discovery of therapeutic targets. The mechanistic view of these diseases, obtained from clinical and biological information, seems to be unified by common genetic factors leading to the activation of major biological pathways, in turn influenced by environmental factors (epigenetics), which regulate the signal intensity and cause several phenotypes of psychiatric illnesses as a disease spectrum (**Figure 1**). The task of systems biology is to unravel the complex mechanisms orchestrating such behavior. The construction of ontologies, whose principles could be applied to the systems biology of complex diseases, has been proposed in order to cope with this biological complexity.

The formation of ontologies in which human agents and software organize information and pursue a common goal in healthcare was proposed in 1998 (Falasconi et al., 1998; Falasconi and Stefanelli, 1998). This began with computer-based patient record (CPR) prototypes (Webster, 2001). However, to achieve this goal, the problem of harmonizing data from one database to another had to be solved; this problem consisted in defining concepts or entities by unifying or integrating different data. The purpose of a medical ontology library is to analyze, integrate, and formalize medical terminologies of different areas or applications (Pisanelli et al., 2004). As an example, the concept of cancer can be defined from several points of view: morphological, biochemical, pathological, etiological, etc. The ontology library would serve as an informatics platform including every definition according to specified parameters. Therefore, the following principles must be followed in the construction of ontologies: (a) logical consistency (logical language and explicit formula semantics), (b) semantic coverage (all entities of its domain and all entity types of its domain), (c) modeling precision (only represents the intended models to accomplish the task of the ontology), (d) strong modularity (to organize the domain into different descriptions), and (e) scalability (the language used expresses the intended meaning according to the domain or tasks to accomplish) (Pisanelli et al., 2004).

[Figure caption fragment: "... from the influence of epigenetic factors (environmental stimuli)."]

The increasing amount of data derived from genomics led to the development of biological ontologies (Fernández-Bries et al., 2004), which also introduced an integrative approach using bioinformatics (Gopalacharyulu et al., 2008). Afterwards, cognitive ontologies, based on structure–function data from neurologically affected patients, integrated cognitive and anatomical models and organized the cognitive components for diverse tasks into a single framework (Price and Friston, 2005). Currently, ontologies serve as "a means to standardize terminology, to enable access to domain knowledge representation, cognitive science, to verify data consistency and to facilitate integrative analysis over heterogeneous biomolecule data" (Hoehndorf et al., 2013).

The ontology proposed by the Consortium for Neuropsychiatric Phenomics continues the sequence of platforms being implemented to improve the definition of psychiatric phenotypes through different levels or domains of knowledge (syndrome, symptom, cognitive phenome, neural systome, cellular signalome, proteome, genome). It seeks to define a disease more accurately by including the data derived from each domain, and focuses mainly on defining the cognitive phenome of psychiatric diseases. The multivariate definition of a phenotype can lead to advances in the face of complex diseases, such as cancer and psychiatric diseases. This not only improves the definition of phenotypes but also establishes connections between intermediate phenotypes (Bilder et al., 2009a). Together with the National Institute of Mental Health Research Domain Criteria (RDoC) initiative, it will have a direct impact on the improvement of the diagnostic taxonomy of mental disorders based on brain biology (Bilder et al., 2013).

It is interesting that some of the ontologies available on the web are achieving the harmonization of different formats of bioinformatics data or reservoirs of information. Since the principles used to construct these ontologies can be applied to the bioinformatics of complex diseases, initiatives of this type from multidisciplinary groups can be a more effective approach, through Systems Biology, to address the complexity of diseases such as cancer and psychiatric disorders in an organized framework that would provide an integral picture of the individual and their illness.

### "Omics" Studies on Neuropsychiatric Disorders and Cancer

There are few studies regarding the association of psychiatric diseases and cancer, such as schizophrenia and breast cancer (Catts et al., 2008; Bushe et al., 2009), or Alzheimer's disease with reduced risk for cancer (Roe et al., 2005), which points to a potential opportunity for biomedical research (Catalá-López et al., 2017). A promising field in "omics" studies is the association between alcohol drinking behavior and cancer.

Alcohol abuse has been recognized as a common component in different types of cancer (World Health Organization, 2014). Alcoholism is accepted as a disease and, though DSM-V criteria distinguish between alcohol dependence and alcohol abuse, the diagnostic criteria are evolving. There is also a variety of phenotypes of alcoholism. Polymorphisms of the alcohol dehydrogenase (ADH1B Arg48His) and aldehyde dehydrogenase (ALDH2 Glu487Lys) genes are commonly associated with alcohol consumption and cancer.

The ADH1B gene and its alleles, Arg48His (rs1229984) and Arg370Cys (rs2066702), are associated with alcohol metabolism and drinking behavior, cancer, and human phenomes (Polimanti and Gelernter, 2017). Esophageal cancer is associated with the Arg/Arg genotype of ADH1B Arg48His, whereas the 48His allele has been shown to have a protective effect against this type of cancer (Mao et al., 2016). The association of ADH1B with colorectal cancer risk in a Chinese population has been reported (Zhong et al., 2016). It has also been shown that this gene is correlated with gastric cancer (Chen et al., 2016). In addition, the ADH1B Arg48His allele increases lung cancer risk in carriers (Álvarez-Avellón et al., 2017). ALDH2 and ADH1B polymorphisms are associated with a higher risk for bladder cancer and alcohol abuse (Masaoka et al., 2016). Alcohol abuse also mediates the ADH1B effect on the risk of hepatitis B-related hepatocellular carcinoma (Liu et al., 2016) and of head and neck squamous cell carcinoma (Ji et al., 2015). There are few omics studies in the field; however, a noteworthy study on the microbiome in fecal samples of alcoholic individuals could help to understand the phenotype of individuals at risk of developing colorectal cancer (Tsuruya et al., 2016).

There is yet an enormous task to be undertaken in the "omics" field of comorbidity of psychiatric diseases and cancer. The knowledge gathered from this exciting field will contribute to the successful development of personalized medical care for these patients.

### CONCLUSION AND PERSPECTIVES

The present review highlights how the vast amount of information from omics technologies in complex diseases, such as schizophrenia, presents several challenges regarding data management and format harmonization of output data. Despite these challenges, some studies have performed successful analyses starting from different technological platforms (see **Table 1**).


CSF, cerebrospinal fluid; RBC, red blood cells; CNV, copy number variation; SNV, single nucleotide variation.

Most studies in the "omics" field stand alone and do not integrate other levels of information; only a few have taken an integrative approach (van Eijk et al., 2014). van Eijk et al. performed an "omics" analysis across different levels or "layers" of genomic information (such as SNPs, methylation, and gene expression) and identified disease susceptibility loci for neuropsychiatric traits, because combining different genomic layers to prioritize genomic loci enriched disease-specific signals. This approach supported the use of whole blood for the study of brain-related diseases (van Eijk et al., 2014). Integrative studies could address the same question for other peripheral samples.
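The idea of prioritizing loci by combining evidence from several genomic layers can be illustrated with a small sketch. The procedure used by van Eijk et al. is not described in this review, so the example below substitutes Fisher's method for combining p-values, which assumes the layers are independent (an assumption that methylation and expression data may well violate). The function name, the locus labels, and all p-values are hypothetical and serve only as illustration.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom, whose upper tail has a closed form.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    # Upper tail for 2k degrees of freedom:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical per-locus p-values for three layers:
# SNP association, methylation, and gene expression.
layers = {
    "locus_A": [0.04, 0.03, 0.20],
    "locus_B": [0.001, 0.60, 0.70],
    "locus_C": [0.50, 0.45, 0.90],
}

# Loci with the smallest combined p-value are prioritized first.
ranked = sorted(layers, key=lambda locus: fisher_combined_p(layers[locus]))
print(ranked)
```

In this toy data set, a locus with moderate support in several layers (locus_A) ranks ahead of a locus with a strong signal in one layer only (locus_B), which is the kind of enrichment that a multi-layer analysis aims to capture.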

Systems Biology must be able to provide proper quantitative schemes that contribute to the understanding of underlying mechanisms and to phenotype prediction in psychiatric diseases, as well as their association with comorbid diseases such as cancer. Some groups have developed mathematical analyses using model systems to explore feasible metabolic phenotypes in human cancer cell lines and tissues (Lewis and Abder-Haleem, 2013). In this regard, the metabolic phenotype modeling performed by Diener et al. (2016) used metabolome and expression data to infer the metabolic phenotype of HeLa cancer cells. The mathematical modeling in this study, based on metabolite concentrations, set the basis for inferring affected enzymes in a diseased state when the alteration is not evident at the genomic level. Another important advance in exploring metabolic phenotypes is the Human Metabolic Atlas database, which contains a set of tissue-specific genome-scale metabolic reconstructions of human tissues (Pornputtapong et al., 2015). Therefore, advances in multiscale modeling promise the inference of the metabolic phenotype from a cell to a whole organism. Notably, studies of this type could improve the decision-making process regarding the type of chemotherapy administered to a cancer patient (Diener and Resendis-Antonio, 2016).


Ontologies are an excellent proposal for the integration of clinical, biological and behavioral information enabling a precise description of the disease presented by an individual. The use of multidisciplinary platforms, integrating the intermediate phenotypes contributing to the global phenotype, will provide the necessary tools for data analyses. We have already discussed the existence of different databases and software available from various platforms, which can be used to analyze experimental data derived from patient samples. We propose the development of a network derived from each type of data; the elements of such a network should be shared with the other networks of biological information. The convergence of evidence provided by bioinformatics analyses will allow the visualization of a characteristic phenotype pattern exhibited by psychiatric patients. Such evidence will lead to personalized diagnosis for each patient and, if appropriate, will also contribute to disease prognosis.
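The proposal of one network per data type, with elements shared across networks, can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation of any existing platform: each layer is assumed to have already been reduced to a list of pairwise interactions, the function name is hypothetical, and the edges are invented for illustration (only the gene names come from the studies discussed in this review).

```python
from itertools import combinations


def shared_elements(networks):
    """For every pair of layer networks, return the nodes they have in common."""
    # Collect the node set of each network from its edge list.
    nodes = {
        name: {node for edge in edges for node in edge}
        for name, edges in networks.items()
    }
    # Intersect node sets for each pair of layers, in alphabetical order.
    return {
        (a, b): nodes[a] & nodes[b]
        for a, b in combinations(sorted(nodes), 2)
    }


# Hypothetical edge lists, one per "omics" layer.
networks = {
    "genomic": [("CHRNA5", "CHRNA3"), ("CHRNA3", "CHRNB4"), ("CYP2A6", "CHRNA5")],
    "epigenomic": [("BDNF", "CHRNA5")],
    "metabolomic": [("CYP2A6", "nicotine")],
}

overlap = shared_elements(networks)
for pair, common in overlap.items():
    print(pair, sorted(common))
```

Elements that appear in more than one layer (here CHRNA5 and CYP2A6) are the natural points at which separate networks can be joined into a single integrated view of the patient.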

However, much work remains in order to (i) integrate clinical and "omics" data, (ii) integrate the networks from different "omics" technologies, (iii) complete data analyses from different levels of information, and (iv) compare different networks from two or more diseases affecting one individual to improve the description of their health/disease states. In this regard, the concept of "gestaltomics" will be developed through a better understanding of complex Systems Biology.

#### AUTHOR CONTRIBUTIONS

NG, Research, writing, and discussion of this review; HN, Original idea, writing, discussion, and revision of this review; OR, Original idea, writing, and revision of this review.

#### FUNDING

The present study was supported by CONACYT, Mexico. Funding code: SALUD-2015-2 No. 261516.


#### REFERENCES

depression in breast cancer. Psychiatry Invest. 12:523. doi: 10.4306/pi.2015.12.4.523


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Gutierrez Najera, Resendis-Antonio and Nicolini. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.