The Large Scale Structure of Human Metabolism Reveals Resilience via Extensive Signaling Crosstalk

Metabolism is loosely defined as the set of physical and chemical interactions associated with the processes responsible for sustaining life. Two features are evident whenever one looks at metabolism. First, metabolism is organized as a very complex and intertwined construct of many associated biomolecular processes. Second, metabolism is characterized by a high degree of stability, reflected in the organism's resilience to environmental changes and pathogenic conditions. Here we investigate the relationship between these two features. Using the full set of human metabolic interactions reported in the highly curated KEGG database, we built an integrated human metabolic network comprising metabolic, transcriptional regulation, and protein-protein interaction networks. We hypothesized that a metabolic process may exhibit resilience if it can recover from perturbations at the pathway level; in other words, metabolic resilience could be due to pathway crosstalk, which may imply that a metabolic process can proceed even after a perturbation has occurred. By analyzing the topological structure of the integrated network, as well as the hierarchical structure of its main modules or subnetworks, we observed that behind biological resilience lies an intricate communication structure at the topological and functional levels, with pathway crosstalk as the main component. The present findings, alongside the advent of large biomolecular databases such as KEGG, may allow the study of the consequences of this redundancy and resilience in healthy and pathological phenotypes, with many potential applications in biomedical science.

All different subtypes of regulatory interactions included in the KEGG database. The number of molecular interactions derived from each interaction subtype is shown. A group refers to a protein-complex (a stable interaction formed between two proteins). The interaction subtypes refer to the following: gene-gene: a TF that regulates the expression of a target gene; group-gene: a protein-complex that regulates the expression of a target gene; gene-group: a TF that regulates the expression of the genes involved in a protein-complex; group-group: a protein-complex that regulates the expression of the genes involved in a protein-complex. The molecular direction of the interaction is indicated by the direction of the arrow.
Figure S3. All different subtypes of interactions derived from enzymatic reactions included in the KEGG database. In the left panel, the enzyme encoded by geneA catalyzes the enzymatic reaction in which metabolites X and Y are transformed into metabolites Z and W. In the right panel, the reverse reaction is catalyzed by the same enzyme. The number of molecular interactions derived from each interaction subtype is shown. In total, there are 1416 genes and 1547 metabolites participating in enzymatic reactions. In the directed metabolic network, the interactions from a reversible reaction must account for both directions of the reaction.
Figure S4. All different subtypes of interactions derived from metabolic interactions included in the KEGG database. The number of molecular interactions derived from each interaction subtype is shown. A group refers to a protein-complex (a stable interaction formed between two proteins). The interaction subtypes refer to the following: metabolite-gene: a metabolite has an effect on the activity of a protein; gene-metabolite: the activity of a protein exerts an effect on the amount of a metabolite; metabolite-group: a metabolite has an effect on the activity of a protein-complex; group-metabolite: the activity of a protein-complex exerts an effect on the amount of a metabolite; gene-gene: a protein has a post-translational effect on the activity of another protein; gene-group: a protein has a post-translational effect on the activity of a protein-complex; group-gene: a protein-complex has a post-translational effect on the activity of a protein; group-group: a protein-complex has a post-translational effect on the activity of a protein-complex. The molecular direction of the interaction is indicated by the direction of the arrow. For each subtype of interaction, the number of interactions included in our integrated network/the number of total interactions in the KEGG database is shown.
Figure S5. All different subtypes of interactions derived from the category of successive enzymatic reactions included in the KEGG database. The number of molecular interactions derived from each interaction subtype is shown. A group refers to a protein-complex (a stable interaction formed between two proteins). The interaction subtypes refer to the following: gene-gene: two enzymes catalyze successive reactions; gene-group: an enzyme and a protein-complex catalyze successive reactions; group-gene: a protein-complex and an enzyme catalyze successive reactions. There is no annotation of two protein-complexes catalyzing successive reactions. The molecular direction of the interaction is indicated by the direction of the arrow. For each subtype of interaction, the number of interactions included in our integrated network/the number of total interactions in the KEGG database is shown.
Figure S6. Closeness centrality of the GCC.
No core-periphery structure is evident in the GCC network. Closeness centrality approximately follows a normal distribution, so there is no clear distinction between central and peripheral nodes.
Figure S7. A network of human pathways. In this pathway crosstalk network, the nodes represent pathways, and an interaction exists between two nodes when the corresponding pathways share at least one molecule. The inset shows how dense this network is (⟨k⟩ = 103).
Figure S14. A null model was built by edge swapping from the original KEGG network, keeping the connectivity distribution fixed. A pathway network and a percolation analysis were done from this null model. The results for the percolation analysis removing edges ordered by descending values of edge betweenness (Panel A), nodes ordered by descending values of node degree (Panel B) and nodes chosen at random (Panel C) are shown. Each plot shows either the number of edges removed (A) or nodes removed (B and C) from the molecular interaction network on the X axis and the number of components, the number of edges, the mean degree, the number of nodes (pathways) and the average shortest path length of the pathway network on the Y axis for each iteration (top to bottom).
Figure S15. A random Erdős-Rényi network the same size as our KEGG network was built. A pathway network and a percolation analysis were done from this network. The results for the percolation analysis removing edges ordered by descending values of edge betweenness (Panel A), nodes ordered by descending values of node degree (Panel B) and nodes chosen at random (Panel C) are shown. Each plot shows either the number of edges removed (A) or nodes removed (B and C) from the molecular interaction network on the X axis and the number of components, the number of edges, the mean degree, the number of nodes (pathways) and the average shortest path length of the pathway network on the Y axis for each iteration (top to bottom).
Figure S16. A scale-free Barabási-Albert network 1.5 times as big as the KEGG network was built.
A pathway network and a percolation analysis were done from this network. The results for the percolation analysis removing edges ordered by descending values of edge betweenness (Panel A), nodes ordered by descending values of node degree (Panel B) and nodes chosen at random (Panel C) are shown. Each plot shows either the number of edges removed (A) or nodes removed (B and C) from the molecular interaction network on the X axis and the number of components, the number of edges, the mean degree, the number of nodes (pathways) and the average shortest path length of the pathway network on the Y axis for each iteration (top to bottom).
Figure S17. A scale-free Barabási-Albert network 2 times as big as the KEGG network was built. A pathway network and a percolation analysis were done from this network. The results for the percolation analysis removing edges ordered by descending values of edge betweenness (Panel A), nodes ordered by descending values of node degree (Panel B) and nodes chosen at random (Panel C) are shown. Each plot shows either the number of edges removed (A) or nodes removed (B and C) from the molecular interaction network on the X axis and the number of components, the number of edges, the mean degree, the number of nodes (pathways) and the average shortest path length of the pathway network on the Y axis for each iteration (top to bottom).
Table S1. Number of nodes and edges in the whole network and in the GCC for the human TRN, MN and PPN.
Table S2. The goodness of fit for the TRN, the MN and its GCC, and the PPN is shown. R is the log likelihood ratio between the two candidate distributions (the power law distribution and the one in the corresponding row). This number is positive if the data are more likely to follow a power law distribution, and negative if the data are more likely to follow the tested distribution. The significance value for that direction is p. The statistics derived from the GCCs of the different networks are not shown, since they behave practically identically to their complete network counterparts.
Table S3. Power law best fit parameters for the human TRN, MN, MN GCC and PPN. The fitted parameters for the GCCs are identical to the best fit parameters of the respective complete network.
Table S4. R is the log likelihood ratio between the two candidate distributions. This number is positive if the data are more likely to follow a power law distribution, and negative if the data are more likely to follow the tested distribution. The significance value for that direction is p. The goodness of fit for the whole network (removing isolated nodes) is shown.
We must note that almost all of the network nodes and edges belong to the Giant Connected Component (GCC), which shares the same statistics and will be a central object of study in this work from now on.
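The pathway crosstalk network of Figure S7 and the degree-ordered percolation of Panel B in Figures S14 to S17 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy molecules and pathways, the function names, the static removal order (degrees computed once, ties broken alphabetically) and the rule that a pathway disappears once it has no molecules left are all assumptions made for illustration.

```python
from itertools import combinations

def pathway_network(pathways):
    """Link two pathways when they share at least one molecule."""
    return {(a, b) for a, b in combinations(sorted(pathways), 2)
            if pathways[a] & pathways[b]}

def count_components(nodes, edges):
    """Count connected components with a small union-find structure."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})

def summarize(step, pathways):
    """Return (molecules removed, pathways, edges, components, mean degree)."""
    edges = pathway_network(pathways)
    n = len(pathways)
    mean_degree = 2 * len(edges) / n if n else 0.0
    return (step, n, len(edges), count_components(pathways, edges), mean_degree)

def percolate_by_degree(mol_edges, pathways):
    """Remove molecules by descending degree and re-measure the pathway network."""
    degree = {}
    for u, v in mol_edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Assumption: the order is fixed from the initial degrees.
    order = sorted(degree, key=lambda n: (-degree[n], n))
    current = {p: set(m) for p, m in pathways.items()}
    rows = [summarize(0, current)]
    for step, mol in enumerate(order, start=1):
        # Assumption: a pathway with no remaining molecules is dropped.
        current = {p: m - {mol} for p, m in current.items() if m - {mol}}
        rows.append(summarize(step, current))
    return rows

# Hypothetical toy data, not taken from KEGG.
mol_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D")]
pathways = {"P1": {"A", "B"}, "P2": {"B", "C"}, "P3": {"C", "D"}}

rows = percolate_by_degree(mol_edges, pathways)
for row in rows:
    print(row)
```

Each printed row corresponds to one iteration of the percolation plots: the number of molecules removed so far, followed by the number of pathways, edges, components and the mean degree of the resulting pathway network. Removal by edge betweenness or at random would only change how the removal order is built.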

SUPPLEMENTARY GLOSSARY
• Bipartite graph: A network with two different types of nodes. The nodes form two disjoint sets, and every edge (or link) of the network connects nodes of different types. A bipartite graph does not contain odd-length cycles.
• Community (see Module)
• Component: A network component, often called a connected component or island, in an undirected network is a subgraph (a part of the network) in which any two nodes are connected to each other by one or more paths, and which is connected to no additional nodes in the supergraph (the full network).
• Degree centrality: Degree centrality, or simply degree, is defined as the number of links incident on a node (i.e., the number of ties that a node has). The degree is often used as a proxy for the flow of information through the node in the network. In the case of a directed network, two separate degree centralities are defined: in-degree and out-degree. Accordingly, in-degree is the number of links directed to the node and out-degree is the number of links that the node directs to others.
• Edge betweenness: Edge betweenness, or edge betweenness centrality, is a centrality measure for links, defined as the number of shortest paths that go through a given link in a network.
• Enzyme graph: In the context of the present paper, an enzyme graph or enzyme network is a biological network that relates enzyme proteins and chemical compounds. It is the graph-theoretical depiction of a metabolic pathway or a series of metabolic pathways.
• Giant Connected Component (GCC): A giant connected component is a connected component of a given network that contains a significant fraction (more than 50%) of the nodes of the network.
• Hypergraph: A hypergraph is a generalization of a network in which a link can connect an arbitrary number of nodes. In contrast, in an ordinary network, a link connects exactly two nodes. In the context of this work, KEGG hypergraphs have been "flattened" by multiplying edges.
• Metabolite interactions graph: In the context of KEGG, a metabolite interactions graph differs from an enzyme network in that it only contains metabolites as nodes.
• Module: Network modules are understood as subnetworks formed by sets of nodes (or vertices) that are more densely connected among themselves than with the rest of the network. Modules are often viewed as semi-autonomous (but not independent) components of a network that are responsible for functionality in real networks.
• Shortest path: A shortest, or geodesic, path between two nodes in a network is a path with the minimum number of edges. If the network edges are weighted, it is a path with the minimum sum of edge weights. The length of a geodesic path is called the geodesic distance or shortest distance. Geodesic paths are not necessarily unique, but the shortest path distance is well-defined since all shortest paths have the same length.
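Several of the glossary terms above (hypergraph flattening, degree, component, GCC and geodesic distance) can be made concrete with a short Python sketch on a toy undirected network. The node names and function names are hypothetical, and the flattening shown here (every hyperedge replaced by all pairwise links among its members) is only one possible scheme, assumed for illustration; the text does not specify which scheme was applied to KEGG.

```python
from collections import deque
from itertools import combinations

def flatten_hyperedges(hyperedges):
    """Assumed scheme: replace each hyperedge by all pairwise links."""
    edges = set()
    for hyperedge in hyperedges:
        edges.update(combinations(sorted(hyperedge), 2))
    return edges

def adjacency(nodes, edges):
    """Build an undirected adjacency map; isolated nodes get an empty set."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def geodesic_distances(adj, source):
    """Breadth-first search: number of edges on a shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def components(adj):
    """Group nodes into connected components (sets of mutually reachable nodes)."""
    seen, found = set(), []
    for node in adj:
        if node not in seen:
            component = set(geodesic_distances(adj, node))
            seen |= component
            found.append(component)
    return found

def giant_component(adj):
    """Return the largest component if it holds more than 50% of the nodes."""
    largest = max(components(adj), key=len)
    return largest if len(largest) > 0.5 * len(adj) else None

# Hypothetical toy data: two reaction-like hyperedges sharing geneA,
# one ordinary link, and one isolated node.
nodes = ["X", "Y", "Z", "W", "geneA", "M", "N", "Q"]
hyperedges = [{"X", "Y", "geneA"}, {"geneA", "Z", "W"}, {"M", "N"}]

edges = flatten_hyperedges(hyperedges)
adj = adjacency(nodes, edges)
gcc = giant_component(adj)
degree = {n: len(neighbors) for n, neighbors in adj.items()}
print(len(edges), degree["geneA"], gcc, geodesic_distances(adj, "X")["Z"])
```

The sketch treats the network as undirected and unweighted; for a directed network, in-degree and out-degree would be counted separately, and for weighted edges the breadth-first search would need to be replaced by a weighted shortest-path method.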