Analysis Tools for Interconnected Boolean Networks With Biological Applications

Boolean networks with asynchronous updates are a class of logical models particularly well adapted to describe the dynamics of biological networks with uncertain measures. The state space of these models can be described by an asynchronous state transition graph, which represents all the possible exits from every single state, and gives a global image of all the possible trajectories of the system. In addition, the asynchronous state transition graph can be associated with an absorbing Markov chain, further providing a semi-quantitative framework where it becomes possible to compute probabilities for the different trajectories. For large networks, however, such direct analyses become computationally untractable, given the exponential dimension of the graph. Exploiting the general modularity of biological systems, we have introduced the novel concept of asymptotic graph, computed as an interconnection of several asynchronous transition graphs and recovering all asymptotic behaviors of a large interconnected system from the behavior of its smaller modules. From a modeling point of view, the interconnection of networks is very useful to address for instance the interplay between known biological modules and to test different hypotheses on the nature of their mutual regulatory links. This paper develops two new features of this general methodology: a quantitative dimension is added to the asymptotic graph, through the computation of relative probabilities for each final attractor and a companion cross-graph is introduced to complement the method on a theoretical point of view.


INTRODUCTION
An intuitive representation of system interactions, an algorithmic description of state transitions, and the capacity to capture the global dynamics of the system, list some of the advantages of Boolean models, which remain a powerful tool in the modeling and analysis of biological networks (Wang et al., 2012;Abou-Jaoudé et al., 2016). Successfully predictive examples of Boolean models cover complex networks across many different organisms, from cell cycle (Li et al., 2004;Fauré et al., 2006), to fly or plant morphogenesis (Albert and Othmer, 2003;García-Gómez et al., 2017), and highly complex networks such as T-cell induction (Mendoza and Xenarios, 2006;Saez-Rodriguez et al., 2007), leukemia (Zhang et al., 2008) or apoptosis (Calzone et al., 2010).
In a modular view of a biological organism, each task is executed by a specific set of interactions among an ensemble of biological components; in other words, it can be said that there is a specifc network, or module, for each specific task (signaling, metabolic, physiological, etc.). These modules often interact with each other, one task triggering the next in a chain of events or cyclic phenomena. Examples include chains of signaling networks such as MAPK cascades, geneticmetabolic interactions (Baldazzi et al., 2010), or coupled oscillations (Gérard and Goldbeter, 2012). However, in many cases, while experimental evidence supports the existence of links between two modules, their modes of interaction are still unclear (as in the case of mammalian cell cycle and circadian clock, see Feillet et al., 2015). In this context, mathematical tools are necessary to facilitate the analysis of the complex behavior obtained from the interconnection of two or more known modules.
One of the challenges in the analysis of Boolean networks is attractor computation, particularly for high-dimensional networks. For a network of dimension n, the size of the state transition graph is 2 n . A direct analysis of such a graph may become computationally costly, in terms of space and time, when n ≥ 20. This is especially true with asynchronous updating, which includes numerous dynamical trajectories. Two very efficient methods have recently been developed: Zañudo and Albert (2013) compute all attractors of a network (up to n ≈ 100), by isolating special properties of the state transition graph's components; Veliz-Cuba et al. (2014) compute all singletons (attractors containing a single state) for networks up to n = 1,000, by using a computational algebra approach.
In this paper, we propose a methodology aimed specifically at analyzing the interconnection between several known Boolean modules. The interconnection between two biological networks can be very hard to test in vivo: our methodology provides a platform for hypothesis testing, confirming or disproving assumptions regarding mutual regulatory effects, simulating and comparing various forms of interconnection schemes and corresponding emergent dynamical behavior. Our method relies on the construction of a new object, the asymptotic graph, introduced by Tournier and Chaves (2013), which is a directed graph constructed only from the set of attractors of each module and that captures all the asymptotic behaviors of the interconnected network.
After a brief review of Boolean network interconnections, two improvements to the asymptotic graph are introduced in this paper, to mitigate two of its known limitations. First, it was observed that the asymptotic graph may also recover spurious attractors, in addition to the true attractors of the full network (Tournier and Chaves, 2013); we introduce an extension, called the cross graph that solves this issue from a theoretical point of view. The cross graph is constructed from the set of strongly connected components of each separate module, while the asymptotic graph is constructed from terminal strongly connected components only. Second, to enrich the traditional ON/OFF representation inherent to Boolean models, we propose a method to assign probabilities to the edges of the asymptotic graph, thereby allowing a probabilistic representation of the various possible trajectories of the composed network. Our methodology is applied first to a class of general randomly generated Boolean models and then to two state-of-the-art biological models in two different organisms: (i) to explore the interplay between mammalian cell cycle and circadian clock oscillators and (ii) to test hypotheses on the regulatory links between budding yeast cell cycle and cell size, where our analysis suggests that the START signal should come from mitosis phase.

INTERCONNECTIONS OF ASYNCHRONOUS BOOLEAN NETWORKS: A SHORT REVIEW
Throughout this paper, we will consider Boolean networks under asynchronous updates. An interconnected Boolean network is, briefly, the combined network formed by linking together, in an approriately prescribed way, two or more separate Boolean modules. In previous works (Chaves and Tournier, 2011;Tournier and Chaves, 2013) we have introduced a new object, the asymptotic graph, that characterizes the attractors of the combined Boolean network in terms only of the attractors of the separate modules-hence with no need to compute the larger state transition graph. In the following, the definition of the main objects needed to introduce the asymptotic graph are briefly reviewed.
h A (x 1 , x 2 ) = x 2 . Graphically, this network can be represented as a simple cascade u → x 1 → x 2 . Its dynamics are characterized by the two graphs G A,0 and G A,1 , represented below in graphical and matricial forms: In adjacency matrices, by convention the (i, j) entry equals 1 iff state j is a successor of state i. Here, the four states (rows and columns of the matrix) are intended in the following order: 00, 01, 10, 11. In G A,0 , state 00 does not have any successor, implying the first row of its adjacency matrix is zero: 00 is a steady state of the network. Similarly, 11 is a steady state of G A,1 .
Classically, an asynchronous transition graph G A,u is analyzed by first computing its decomposition into strongly connected The set of all SCCs forms a partition of the state space {0, 1} n A and their computation can be efficiently achieved in O(2 n A +m A ). By contracting each SCC to a single vertex, a directed acyclic graph (dag) is constructed, sometimes called condensation graph or simply SCC graph. This dag provides a useful description of key dynamical behaviors of the network; in particular terminal SCCs (the leafs of the dag) correspond to the attractors of the network. More details about these graph theoretical tools can be found, for instance, in the textbook by Cormen et al. (2001).
Consider now two IO ABN A and B , of respective dimensions (n A , p A , q A ) and (n B , p B , q B ) and state variables x ∈ {0, 1} n A and y ∈ {0, 1} n B . Note that all the methods presented in this paper generalize to more than two modules; however, in order to maintain a clear exposition of the results, the definitions are given for interconnections of two modules. An interconnection scheme of A and B consists in two interconnecting functions µ A : {0, 1} q B → {0, 1} p A and µ B : {0, 1} q A → {0, 1} p B mapping the outputs of each module to the inputs of the other module. For convenience, throughout this paper we will make the assumption that q B = p A and q A = p B and that the interconnecting functions are simply identity maps. Following Tournier and Chaves (2013), with this assumption the resulting interconnected network is the ABN of dimension n A + n B , with no input and no output, defined by the following transition function: One can then consider the interconnection as a standalone network: its transition graph G can be constructed from this transition function f . Alternatively, one can also build the graph G directly from the set of transition graphs G A,u , u ∈ {0, 1} p A and G B,υ , υ ∈ {0, 1} p B as follows. Let (x, y) and (x ′ , y ′ ) be two Boolean vectors in {0, 1} n A × {0, 1} n B , then (x ′ , y ′ ) is a (asynchronous) successor of (x, y) if • either x = x ′ and y ′ is a successor of y in G B,h A (x) , • or y = y ′ and x ′ is a successor of x in G A,h B (y) .
It is possible to summarize this definition in a simple matricial form. First, for each α ∈ {0, 1} q A , introduce the 2 n A ×2 n A diagonal Boolean matrix A,α such that A,α ii = 1 if the output of state i is equal to α and 0 otherwise. Similarly, for module B introduce the 2 n B × 2 n B diagonal Boolean matrices B,β , with β ∈ {0, 1} q B . Then, G can be reconstructed by the formula: (2) where ⊗ designates the classical Kronecker product. By replacing matrices with identity matrices, one may recognize in this definition of G the notion of Cartesian product of graphs, first introduced by Sabidussi (1959). To be more precise, (2) generalizes the notion of Cartesian product to interconnections, by including only transitions that are consistent with the inputoutput scheme. EXAMPLE 2. Consider module A defined in Example 1 and let the one-dimensional SISO module B defined by f B (υ, y 1 ) = ¬υ and h B (y 1 ) = y 1 . Its dynamics are given by The interconnected network can be reconstructed by using (1), leading to the 3-dimensional transition function f (x 1 , x 2 , y 1 ) = (y 1 , x 1 , ¬x 2 ). Alternatively, the transition graph G can also be computed directly as the interconnection of the dynamics of the two separated modules by using (2): In graphical form, this transition graph G of the interconnected network can be represented as: In the present paper, note that we assume the modules and the interconnection scheme are given. It is also possible to consider interconnections as a general model reduction technique, where a large network is first decomposed into a priori unknown subnetworks. The identification of an efficient decomposition, with the corresponding interconnecting scheme, would then become critical. This problem is related to the general problem of graph partitioning and is addressed elsewhere (Tournier and Chaves, 2013).

The Asymptotic Graph of an Interconnection
We can now give the definition of the asymptotic graph (Tournier and Chaves, 2013). First, list all the terminal SCCs of module For some α such a set may be empty, in that case we will simply omit it. Similarly, The asymptotic graph of the interconnection is then defined as the directed graph G as = (V as , E as ) such that the vertex set V as is composed of all the cross products A i uα × B j υβ and the arc set E as is constructed as follows: such that there exists a path from x to x ′ in G A,β , αβ ′ such that there exists a path from y to y ′ in G B,α .
Finally, introduce the function π as follows: The interest of the asymptotic graph lies in the following theorem, a proof of which can be found in Tournier and Chaves (2013). THEOREM 1. If Q is an attractor of the interconnected network, then there exists a terminal SCC R of G as such that π(R) ⊆ Q. EXAMPLE 3. Consider the interconnection of Example 2 above. The asymptotic graph is given by with: Therefore, G as is composed of a single terminal SCC R, and π(R) = {001, 111, 000, 110} is actually included into the (unique) attractor of the interconnected network.
Thanks to Theorem 1, the asymptotic graph is a powerful analytic tool as it recovers all the attractors of an interconnection (without missing any), by constructing a graph significantly smaller than the full interconnected graph G (section 4 below provides numerical results for random interconnections). However, it may happen that some terminal SCC of G as does not correspond to an actual attractor of the interconnection. Such terminal SCCs, called spurious attractors, appear very rarely and there exist some sufficient conditions to detect a priori spurious attractors in certain cases. The most simple one, particularly useful for biological applications is the fact that when R is a singleton then it cannot be a spurious attractor. The proof, along with additional conditions are provided elsewhere (Tournier and Chaves, 2013;Chaves and Carta, 2015).

NEW ANALYSIS TOOLS
This section describes our new contributions. Our first goal is to improve the asymptotic graph construction to avoid the generation of spurious attractors (section 3.1) and our second goal is to update the asymptotic graph by adding quantitative information (probabilistic) on the state transitions (section 3.2).

A Theoretical Tool to Recover All the Dynamics of an Interconnection
The asymptotic graph of an interconnection is constructed only from the modules' attractors, generally implying a relatively manageable size allowing to analyze a wide range of practical examples of interconnections (see sections 4 and 5). Nevertheless, ignoring transient dynamical behaviors of the modules also implies two drawbacks for Theorem 1. First, spurious attractors may appear, although this phenomenon seems to be relatively rare as illustrated in section 4. Second, when a terminal SCC of G as corresponds to an actual attractor, Theorem 1 only ensures an inclusion, meaning the predicted attractor may contain only a small proportion of states that are in the real attractor. We now propose a new graph, called the cross-graph, overcoming those two issues and ensuring, at the price of a higher computational cost, a one-to-one recovery of all the attractors of the interconnected network. Note that Tournier and Chaves (2013) already introduced a notion of cross-graph, however the cross-graph described in the following is significantly improved. In particular, its size is bounded by the size of the full interconnected graph, which was not the case for the older version. Let A and B be two IO ABN of respective dimensions (n A , p A , q A ) and (n B , p B , q B ). As before, suppose for convenience that p A = q B , p B = q A and the interconnecting maps are simply identity maps. We also assume that each module has been separately analyzed: the transition graphs G A,u , u ∈ {0, 1} p A and G B,υ , υ ∈ {0, 1} p B have been constructed and decomposed into strongly connected components A i u , 1 ≤ i ≤ N A u for each u ∈ {0, 1} p A and B j υ , 1 ≤ j ≤ N B u for each υ ∈ {0, 1} p B . Let G Frontiers in Physiology | www.frontiersin.org denote the full transition graph of the interconnected network, of size 2 n A +n B . It can be computed thanks to (2), by interconnecting the modules' transition graphs. The idea behind the cross-graph is to generalize formula (2) in order to interconnect directly the SCCs of those graphs instead of the whole graphs themselves, thus potentially saving a significant amount of space when constructing the dynamics of the interconnection. First, observe that the strongly connected components Let P denote the set of all partitions of . Given two partitions P 1 , P 2 ∈ P , P 1 is said finer than P 2 , denoted by P 1 ≤ P 2 if, for each element p in P 1 there is an element q in P 2 such that p ⊆ q (in other words, partition P 1 is a fragmentation of partition P 2 ). The set (P , ≤) has the structure of a geometric lattice (see eg. Birkhoff, 1940). Consequently, for any set S ⊆ P , there exists a (unique) greatest lower bound of S denoted by S ∈ P . Coming back to the SCC decompositions, introduce the following partition: which is the coarsest partition of {0, 1} n A that is finer than every SCC decomposition of all the transition graphs G A,u . Once this partition is constructed, following the same idea as before it is further refined by cutting each set A i according to their outputs: with the convention that such sets are simply omitted when they are empty. Therefore, we finally obtain a partition that is compatible with every SCC decompositions of the dynamics of modules A . By construction, the number of elements in this partition, denoted by M A , verifies 1 ≤ M A ≤ 2 n A . Applying the exact same procedure for module B , one obtains a similar partition Once partitions Z A h and Z B h are defined, the construction of the cross graph closely resembles the one of the asymptotic graph. The cross graph is the digraph G cr = (V cr , E cr ), where the vertex set V cr is composed of all cross-products A i α × B j β and the arc set is constructed as follows: There is also a matricial form for the definition of G cr . First, project each transition graph G A,u onto Z A h , leading to 2 p A graphs, represented by their M A × M A adjacency matrices H A,u , u ∈ {0, 1} p A . These projections can be rather straightforwardly h is equal to α and 0 otherwise. Once similar objects H B,υ and B β have been constructed for module B , the cross-graph is simply defined by a generalization of formula (2): EXAMPLE 4. To illustrate this definition, let us consider two 2dimensional, single-input single-output modules A and B , defined by their transition graphs given in Figure 1A and their output functions h A (x) = x 2 , h B (y) = y 1 . The full transition graph of the interconnection, built from (2), is depicted in Figure 1B and the cross-graph is depicted in Figure 1C: it is constructed from the two partitions The interest of the cross-graph lies in the following theorem, establishing the one-to-one correspondence between the terminal SCCs of G cr and the attractors of the interconnected network.
FIGURE 1 | Comparison between the cross graph of an interconnection and the full transition graph. (A) Transition graphs of two SISO modules (see Example 4); (B) full transition graph G of the interconnection; (C) cross graph G cr of the interconnection. For each graph, dotted regions denote strongly connected components. There is a bijection between the SCC decomposition of the two graphs G (16 vertices) and G cr (8 vertices), illustrating Theorem 2.
Frontiers in Physiology | www.frontiersin.org THEOREM 2. Graphs G and G cr have the same decomposition into strongly connected components. Furthermore, terminal SCCs of G cr fully recover the attractors of the interconnected network.
A proof of Theorem 2 is given in appendix. The size of the cross-graph is M A × M B , which by construction is always less or equal than 2 n A +n B , the size of the full interconnected graph G.
The difference in size between the two graphs may vary greatly, and strongly depends on (i) the SCC decompositions of the two modules and (ii) as for the asymptotic graph, the numbers of inputs and outputs (and therefore the general modularity of the initial network). Part 4 proposes a brief evaluation of the performance of the method for a set of randomly generated interconnections. Although the interest of the cross-graph is mainly theoretical, in certain practical cases the full graph G can be too big to be stored easily while G cr could. Two possible extensions of the cross-graph method are noted here. First, Bérenguier et al. (2013) proposed a compression of the SCC graph of a network, called the hierarchical transition graph (HTG). As the cross-graph is constructed from a combination of the modules' SCC decompositions, it would be possible to consider similarly a combination of the modules' HTG decompositions. Benefiting from the compactness of HTGs, such a construction would be even more compact than the crossgraph. Second, note that both the cross graph and the asymptotic graph methods require prior analysis of the modules' dynamics and the computation of their attractors, implicitly implying the dimensions of the modules are manageable. For a large network, Zañudo and Albert (2013) proposed an efficient characterization of attractors with the notion of "stable motifs, " based on the network's interaction graph (see also Klarner et al., 2015). When considering interconnections of large modules, the investigation of the stable motifs of an interconnection would therefore constitute an interesting extension of Theorem 2.

A Probabilistic Asymptotic Graph
One of the limitations of Boolean models is the lack of quantitative details: while the state transition graph describes all possible dynamical behaviors, it gives no indication as to which trajectory is more likely to be observed under a given set of initial conditions. To circumvent this problem, Boolean models can be combined with probabilistic frameworks that account for biological perturbations and variability in the logical rules (Shmulevich et al., 2002;Mori et al., 2015). Another approach is to exploit the Markov chain description of the transition graph associated to the asynchronous Boolean model (Calzone et al., 2010;Stoll et al., 2017). Based on this description, Stoll et al. (2017) developed the MaBoSS software, which then applies Gillespie algorithm to produce continuous time trajectories.
We also use the Markov chain description to assign probabilities to the edges of the asymptotic graph, an approach which will lead to a more quantitative analysis of the interconnected network's dynamics. The output of our probabilistic asymptotic graph is thus the set of attractors of the full network, under a particular interconnection scheme, together with a relative probability for each of them (e.g., "there is a probability p 1 that phenotype Q 1 is the outcome of this experiment").
The originality of our approach consists in assigning incidence probabilities to the attractors of each separate module, which can be obtained through the biological observations and measurements available for each module. The goal is to include biological information as an input and provide predictions that can be confronted to biological observations and therefore lead to validate or disprove the given interconnecting scheme.

Initializing Incidence Probabilities
Each transition in the asymptotic graph depends on two factors: which module is first "updated" (A or B) and, in response to an input change, how frequently does a switch occur from A i uα to A k uα (or from B j υβ to B k υβ ). These quantities may be represented by probabilities, defined a priori, from known data, experimental observations, or other modeling considerations. Define Assume Boolean module A has a total of L A same-output attractor-sets and B a total of L B same-output attractor-sets, and each of these has a given incidence probability (meaning that it is observed with a certain frequency) defined as The probabilities w i A and w j B may be assigned in different ways, for instance using experimental observations, or setting uniform probabilities (w i A = 1/L A for all i), or else from the size of their respective basin of attraction but in any case they should satisfy L A i=1 w i A = 1. Using these initial probabilities, a joint incidence probability may similarly be defined for each product of attractor-sets:

Transition Probabilities in the Asymptotic Graph
The probability of switching between two attractor-sets of the same module, but different inputs, can be defined in terms of conditional probabilities: define s ik A to be the probability that attractor A [k] is reached, conditional to the fact that the initial state is some a i ∈ A [i] . In other words, w k A must be weighted by the probability of a i reaching any attractor in G A,u k : Frontiers in Physiology | www.frontiersin.org where J = {j : u j = u k and a i A [j] } means that there exists a path in G A,u k leading from a i to A [j] , where A [j] is an attractor of G A,u k . A similar definition holds for s ik B . Next, we can define the probability associated to an edge of V as as: with an "effective" probability̺ A , computed based on the set of all ougoing edges from node In other words,̺ A = 0 if all outgoing edges have a fixed A- outgoing edges may be of both types. Note that these definitions ensure that the probabilistic asymptotic graph matrix has the property that all rows add up to 1: since both k s ik A = 1 and k s jk B = 1.

Relative Probabilities of the Attractors of an Interconnection
If the asymptotic graph G as has two or more attractors, in addition to the transition probabilities, another useful information is the frequency of observing a given attractor, or in other words the relative probability of each attractor of the interconnection. This probability can be computed from the SCC graph G Sd = (V Sd , E Sd ) corresponding to G as , which is an acyclic graph and can be represented by an absorbing Markov chain. By definition, V Sd is composed of the strongly connected components of G as . Let C ∈ V Sd contain L C elements of V as . Define the incidence probability of observing C as: Moreover, a probability of transition can also be associated to each edge of E Sd , P(C i → C j ), computed by adding all the probabilities of the edges in E as that link elements of C i to elements of C j . Suppose there are m strongly connected components, |V Sd | = m, and let the m × m matrix M with M ij = P(C i → C j ), be the absorbing Markov chain associated with the graph G Sd . Suppose M has r absorbing states, {C k a : k = 1, . . . , r}, these are also the attractors of G Sd . Matrix M can be written in the following canonical form (Feller, 1970): where I r is the r×r identity matrix, Q is the (m−r)×(m−r) matrix of transitions between transient states and R is the (m − r) × r matrix of transitions from transient states to absorbing states. Since M is irreducible, it follows that (I−Q) has an inverse (where I is the (m − r) × (m − r) identity matrix). Then the probability that there exists a path from a given state to one of the r absorbing states is given by the probability of being absorbed by r: where M absorp (i, k) is the probability that transient state i converges to absorbing state k.
If, in addition, we wish to weigh these absorption probabilities by the incidence probabilities of observing C k a , we can define the relative probability of an attractor of the asymptotic graph: where C k a denotes each attractor and P(C k a ) is the incidence probability of C k a .

PERFORMANCE ON RANDOM NETWORKS' INTERCONNECTIONS
In this part we propose a series of computational experiments to assess the efficiency of the asymptotic graph and the cross graph to recover the attractors of random interconnected Boolean networks. Following the general idea of inputs/outputs at the core of this paper, we start with a brief description of the algorithm used to generate random IO modules. We then present numerical results computed on random interconnections with varying connectivity, showing the respective advantages and limitations of the two methods in practice.

Generation of Random IO Networks With Varying Connectivity
The NK-model, introduced by Kauffman (1969), is a general statistical model to represent random Boolean networks by controlling their dimension N and their inner connectivity K. It is used for instance by Zañudo and Albert (2013) and Veliz-Cuba et al. (2014). Here it is slightly adapted to include inputs and outputs. Let be an IO Boolean network of dimension (n, p, q), of transition function f : {0, 1} p × {0, 1} n → {0, 1} n and output function h : {0, 1} n → {0, 1} q . A usual way to depict such a network is by its wiring diagram, showing the dependencies between the different variables of the network. Equivalently, the wiring diagram can be represented by a (n + q) × (p + n) Boolean matrix where submatrices A (n × p), B (n × n) and D (q × n) are defined as follows: Let C designate the matrix [A|B]. The sum of the i-th row of C is the number of essential variables of logical function f i , also called the connectivity of f i . Given integers n > 0, p, q ≥ 0 and a real number K mean ∈ [1, n], we construct a random IO network of dimension (n, p, q) and of average connectivity K mean by applying the following procedure, which generates a dependency matrix M: 1. Let D : = 0. For each 1 ≤ i ≤ q, pick at random j ∈ {1, . . . , n} and set d ij : = 1. 2. Generate n integers k i in {0, . . . , n+p} according to a binomial distribution of parameters n + p (number of trials) and K mean n+p (probability of success). 3. Let C = [A|B] : = 0. For each 1 ≤ i ≤ n, pick a random combination (j 1 , . . . , j k i ) ∈ {1, . . . , n + p} k i (without replacement) and set c i,j l : = 1 for all 1 ≤ l ≤ k i . 4. Check that each column of A is non-zero; while it is not the case, repeat step 3.

Set
Step 4 ensures the generated module actually depends of every inputs. Once the dependency matrix M is obtained, the last step consists in generating the n + q Boolean functions according to M. A Boolean function of k variables is picked randomly among the 2 2 k possibilities; in case it is degenerate (i.e., at least one of the k variables is not essential), another one is chosen so as to ensure exact compatibility with M.

Complementarity of the Cross and Asymptotic Graph Methods
With this algorithm, it is possible to generate a IO module by controlling its inner connectivity, that is the number of actual dependencies in the wiring diagram. Thus, it becomes possible to generate random interconnections with varying degrees of modularity, according to the average connectivity of each module. We used this algorithm to generate 2,000 interconnections of two modules A and B of dimensions (n A , p A , q A ) = (n B , p B , q B ) = (10, 2, 2): where the mean connectivity of A and B varies in {1, . . . , 10}. For each interconnection, both 10-dimensional modules were analyzed separately (including the computation of the transition graphs, their SCC decompositions and the computation of their attractors), then the cross graph and the asymptotic graph were computed and compared. The main results are presented in Figure 2 and summarized below. All computations were made with Matlab R2016b, The MathWorks, Inc. First, we compare the respective sizes N cr and N as of the cross and the asymptotic graphs (ie. their number of vertices). Figures 2A,B show respectively the evolution of log 2 (N cr ) and log 2 (N as ) with respect to the connectivity of the two modules. Obviously, both N cr and N as are below N = 2 20 , which is the size of the full transition graph of the interconnected network. The cross graph, which captures both the transient and the asymptotic dynamics of the interconnection is relatively large, however its size seems to vary greatly with the modules' connectivity. When the connectivity increases, implying a highly modular interconnection, the ratio N cr /N can reach very small values, emphasizing the interest of the cross graph to efficiently store the dynamics of large, modular interconnected networks. On the other hand, the asymptotic graph is always much smaller, several orders of magnitude under the size of the full transition graph. Contrary to the cross graph, it is particularly small when the modules have lower connectivity, making it particularly well adapted for biological networks. Interestingly, its size seems to reach a plateau when the mean connectivity is above n 2 = 5. Another way to compare the two approaches is by studying their average execution times. The times shown in Figure 2C include the analysis of the two 10-d modules and of the cross and asymptotic graph methods. The latter comprise the construction of G cr (respectively, of G as ), the SCC decomposition of G cr (respectively, of G as ) and the reconstruction of the attractors (respectively, of π(R) for all terminal SCCs R of G as ). For the cross graph, the majority of the time is taken by the SCC decomposition of G cr while for the asymptotic graph, the most time-consuming step is the construction of G as itself (data not shown). For comparison, we also computed the complete dynamics of the 20-d interconnected network by using formula (2); on average, such direct method amounted to around 83 seconds (dotted line). Therefore, both methods are faster than the direct analysis of the full interconnected network. As before, the asymptotic graph is particularly efficient for low connectivity modules, while the cross graph is more efficient when the modules have high connectivity. Interestingly, for connectivity K mean = 5 and higher, when both graphs have roughly the same size, the cross graph method becomes even more rapid than the asymptotic graph.
Finally, since both graphs were computed it was possible to evaluate the quality of the asymptotic graph predictions. Recall that according to Theorem 1, the asymptotic graph has two drawbacks. First, it may predict spurious attractors and second, when it identifies a true attractor it only predicts a subset π(R) of the states lying in the attractor Q. The ratio |π (R)| |Q| is called the accuracy of the prediction. Among the 2,000 interconnections, 11 presented spurious attractors that is only 0.55% of the total. In all but one case, only one spurious attractor was detected. This result confirms the rarity of the appearance of spurious attractors. In total, we identified 3,693 true attractors. Among them more than 73% were completely recovered (see Figure 2D); overall, the mean accuracy is about 0.86, exhibiting the excellent predictive power of the asymptotic graph when it comes to uncover the asymptotic behaviors of an interconnection.

A Powerful Tool to Analyze Large Interconnections of Biological Networks
According to the previous results, the asymptotic graph seems particularly well adapted when the mean connectivity of the modules is low (≤ 5), which is arguably where biological networks generally operate (Zañudo and Albert, 2013;Veliz-Cuba et al., 2014). Therefore we decided to test it further with higher dimensional interconnections, including four modules A , B , C , D of dimension n = 15, with K mean ∈ {1, . . . , 5}, p A = q A = p D = q D = 1 and p B = q B = p C = q C = 2: When N cr < 10 7 , the cross graph was also constructed and analyzed, in order to check the existence of spurious attractors.
Since the global state space is 2 60 > 10 18 , we skipped the last treatment (identification of the attractors in {0, 1} 60 ) to avoid possible explosions. Therefore, we only computed the terminal SCCs of G as and, when available, the terminal SCCs of G cr . The results are presented in Table 1. When G cr could be analyzed, we were able to detect spurious attractors in G as : none were found. If the cross graph method is not practical for small K mean , the asymptotic graph was always manageable, confirming its practical interest to analyze large biological networks, as long as they can be expressed as interconnections of modules with a reasonable number of inputs and outputs.

TWO BIOLOGICAL APPLICATIONS
The asymptotic graph construction and its probabilistic interpretation are now applied to two biological examples, centered on the mammalian and yeast cell cycles. Both cases illustrate the asymptotic graph concept, its informative description of a composite system, and its usefulness for testing biological hypotheses.

Mammalian Cell Cycle, Circadian Clock and Their Interconnection
There are two basic cellular oscillators in mammalian cells: cell cycle describes the different phases of cellular growth and division, while circadian clock decribes the mechanism responsible for anticipating environmental changes and adapting the organism to deal with these changes (most notably, day-night differences). The interactions between these two oscillators are The cross graph is treated (constructed and analyzed) only when N cr < 10 7 (log 2 (10 7 ) ≈ 23.25). The column #treated/(#exp.) indicates the number of times it was treated over the total number of experiments. When it is treated, we further verify the presence of spurious attractors in the asymptotic graph. The column #spurious/(#treated) indicates the number of times the asymptotic graph predicts a spurious attractor over the number of times the cross graph could be treated. Symbol -indicates that the corresponding value could not be computed.
still not fully understood, but recent works by Feillet et al. (2014) and Bieler et al. (2014) have uncovered unexpected bi-directional links between the two modules. Successful mathematical models for the cell cycle and clock have been developed, as well as some studies on their interactions (Gérard and Goldbeter, 2012), but many questions remain (Feillet et al., 2015).

Mammalian Boolean Modules
At the discrete level, a reference model of the cell cycle was developed and discussed by Fauré et al. (2006) (see Figure 3). It comprises 10 variables: (CycD, Rb, E2F, CycE, CycA, p27, Cdc20, Cdh1, Ubc, CycB), where CycX (X ∈ {A, B, D, E}) represent four cyclins, each roughly corresponding to one of the four phases of the cell cycle. This constitutes our module A , and its rules can be found in the Supplementary Material. The clock model (module B ) has 7 variables and is based on the work of Comet et al. (2012). To account for transcription shutdown during mitosis, the input v negatively affects all mRNAs: In the clock model, mX and pX denote mRNA and protein coded by gene X, while PC denotes the complex formed by the proteins PER and CRY, and PCnuc denotes this complex in the nucleus. A well established link between these two oscillators is that protein BMAL acts on the cell cycle, possibly at different stages (Feillet et al., 2015). In our analysis, we will consider BMAL acting during G1 phase. Although no conclusive evidence exists on how the cell cycle may affect the clock, we have considered that during cell division (or mitosis phase) gene expression is stopped (in the model, mitosis can be modeled as Cdc20 ∧ CycB, see Figure 3). The interconnection between modules is thus given by: so that u = 0 (resp., u = 1) represents absence (resp., presence) of BMAL and υ = 1 represents mitosis. In the cell cycle model, BMAL affects negatively the G1 phase, leading to a logical equation for cyclin E of the form cycE + = ¬u ∧ (E2F ∧ ¬Rb) (see Figure 3 and Supplementary Material). Module A has a total of six, and module B has a total of three, same-output attractor sets. For algorithmic convenience, these are labeled using the lexicographic convention, that is A jûα forû,α ∈ {1, 2}, where "decimal 1 = logical 0" and "decimal 2 = logical 1." The attractors for both modules are as follows: In the case u = 0, module A becomes exactly the original model constructed by Fauré et al. (2006). Therefore, as expected, the attractors found for G A,u=0 correspond exactly to those listed by Fauré et al. (2006). Attractors A 1 11 and A 4 21 correspond to a steady state where the only expressed proteins are Rb, p27, and Cdh1, hence representing the quiescent cell state. The (full) attractor A 2 11 ∪ A 3 12 is a cyclic attractor containing 112 distinct states and corresponds to the known G1/S/G2/M cell cycle progression (Fauré et al., 2006). Similarly, A 5 21 ∪ A 6 22 is a cyclic attractor of the graph G A,u=1 , with 56 states. It tends to describe the cell cycle progression, with the difference that u = 1 implies CycE ≡ 0. In either of the cyclic attractors, the attractor-sets A 3 12 and A 6 22 contain states representing mitosis, that is, the output of any state a ∈ A 6 22 ∪ A 3 12 satisfies h A (a) = Cdc20 ∧ CycB = 1. The clock mechanism admits a cyclic attractor with 120 states, B 1 11 ∪ B 2 12 , which corresponds to regular circadian oscillations in the case υ = 0. At mitosis, represented by υ = 1, the clock network admits a single steady state attractor (B 3 22 = {1000000}), where all gene expression is arrested.

Asymptotic and Cross Graphs
The asymptotic graph for the interconnection of the two mammalian oscillators has 18 nodes and two attractors, with separate basins of attraction (Figure 4): The cross graph contains 54,272 nodes (compare to the full size of the interconnection, 2 17 = 131072) and confirms the existence of exactly two cyclic attractors for the interconnected system and returns all their elements: attractor R 1 is composed of 120 states and R 2 is composed of 13,552 states. Our methodology predicts two distinct operating modes for the coupled oscillators: R 1 corresponds to a quiescent cell with oscillatory clock, since it is the product of state 0100010100 representing a quiescent cell in module A and of cyclic attractor B 1 11 ∪ B 2 12 representing regular clock oscillations. The attractor R 1 is thus in agreement with observations by Plikus et al. (2013) (hair cells in quiescent phase seem to have a running clock). In contrast, R 2 represents joint oscillations of the cell progression cycle (A 2 11 ∪ A 3 12 ) and clock (B 1 11 ∪ B 2 12 ) (see Figure 4 for the dynamics within R 2 ). The cell cycle and clock may jointly oscillate and alternate states with a regular cycle of cyclin E (which is present mostly through S phase and mitosis) or eventually switch to a joint cycle with absence of cyclin E (A 5 21 ×B 1 1· → A 2 1· ×B 1 11 → A 2 11 × B 2 12 → A 5 21 × B 1 1· ). However, at mitosis (A 3 12 ), the clock may switch to its arrested steady state (A 3 12 × B 1 11 → A 3 12 × B 3 22 ), which leads directly to a full degradation of cyclin E in the cell cycle (A 5 2· × B 3 22 ). To assign transition probabilities to the asymptotic graph, there are essentially two elements to define: ̺ A which is the probability of updating first the component from module A ; and the incidence probability of each attractor from each module, w i A and w j B . To compute the incidence probabilities w i A and w j B , we have used the size of the original basins of attraction of A i uα in A and B j υβ in B , as in (4). However, for both modules, each attractor can be reached from any state, implying that the joint incidence probabilities, P( , are equal for all nodes of the asymptotic graph with: w i A = 1/6 (i = 1, . . . 6) and w j B = 1/3 (j = 1, . . . 3). Figure 4 shows the transition probabilities obtained for two different values of the updating probability ̺ A . These two graphs are very similar, differing only on the most frequent transitions (bold arrows, above 0.5). As should be expected, whenever the probability of first updating components from A is larger (̺ A = 0.6), the cell cycle oscillations dominate the global dynamics: most of the bold transitions in Figure 4 (bottom) concern switches between attractor-sets of A . In contrast, circadian clock oscillations are dominant for ̺ A = 0.2 (Figure 4, top). The evolution from mitosis phase toward cell cycle progression ( Computation of the relative probabilities (8) of reaching one of the attractors of the interconnected network yields P rel (R 1 ) = 0.333, P rel (R 2 ) = 0.667, independently of the updating probability ̺ A . An interpretation of these relative probabilities is that, in a typical population of cells, about one third are arrested in quiescent G0 state while the other two thirds follow the normal cell cycle progression G1/S/G2/M.

Budding Yeast Cell Growth and Cell Cycle START
Cell cycle and division is intimately linked with cell growth: a cell cannot divide into two daugther cells if its size is too small. There are many other factors that play a role in cell division (concentration of certain proteins, volume), but it remains unclear how a cell is able to perceive its own size and evaluate whether all conditions are in place for cell division (Turner et al., 2012).
In budding yeast, cell cycle is triggered by a START signal which is dependent on cell size. Li et al. (2004) propose a Boolean model that accurately describes cell cycle progression, taking START as an external input and stopping at a G1 phase steady state. One of the most important proteins involved in START is cyclin Cln3, which in involved in the G1-S phase transition and initiates cell cycle in the model of Li et al. (2004). Cyclin Cln3 forms a complex with another protein Whi3 but, in order to initiate cell cycle, Cln3 must be folded and released from this complex, which is achieved with the help of a chaperon protein Ydj1. Recent work by Aldea et al. (2017) suggests that cell size is growth rate dependent and that Ydj1 is one of the most important factors relating growth rate to cell size at START.
To describe cell size dependence on growth rate Aldea et al. (2017) proposes a model where Cln3 competes with a second hypothetical protein Prot for binding with Ydj1 for folding: and Prot would be a growth rate dependent protein. Here, we propose a basic Boolean network of this model, where the dependence on growth rate is modeled by an input υ: ProtF + = YP Cln3 + = ¬Whi3 The competition of Prot and Cln3 for Ydj1 is represented by the term ¬(Prot ∧ Cln3) in the rule for Ydj1 meaning that, in the absence of both Prot and Cln3, "free" protein Ydj1 will be available. Both Prot and Whi3 depend on growth rate, here given by input υ. Later on, υ will be computed as an output from the cell cycle model. Computation of the graphs G A,u and G B,v yields the following attractors: The symbol * in A i 1 * or A i 2 * means that the output of this attractor depends on the function h A (a): three different forms for h A (a) will be tested (see 13-15 below). For instance, we have h A (A 4 1 * ) = 2 whenever h A (a) is given by (15), so we should write A 4 12 ; but h A (A 4 1 * ) = 1 in the other two cases, hence A 4 11 . In the case u = 0, the yeast cell cycle model is exactly the one studied by Li et al. (2004) hence, as expected, the seven attractors A i 1 * of G A,u=0 are those listed in Table 1 of this reference. According to Li et al. (2004), attractor A 4 1 * represents the G1 steady state and has the largest attraction basin. Attractor A 2 11 is also close to G1 phase and has the second largest attraction basin. Using the size of the attractions basins, the incidence probabilities w i A have been computed according to Equation (4) and they are listed in Table 2.

Network Interconnection, Asymptotic and Cross Graphs
To establish a scheme of interconnection, observe that the cell size model acts on the cell cycle by triggering the start signal, that is START is given by (folded/free) protein Cln3F. Conversely, the input of the cell cycle to the cell size module is still unknown, the combination of variables and/or quantities used by the cell to detect its own size is a question for further analysis. As an hypothesis, we will assume that growth rate is detected through cell phase, since the cell cycle model provides this information. To explore the plausibility of this hypothesis, we will thus consider three different indicators of the cell cycle phase (M, S, and G1 phases) and compare the asymptotic graphs of the three corresponding interconnection schemes: In the case of growth measured by M phase (h A (a) = Swi5 ∧ Cdc20), the asymptotic graph has a unique, cyclic, attractor (Figure 6, top): setting Cln3 back to its OFF state (B 2 21 ) and ending "near" M phase (A 11 22 ). At this point, the system returns to stationary G1 and repeats the cycle, waiting for cell to grow and again send the start signal. Two alternative paths are proposed for the cell cycle, with G1-phase described either by A 4 11 or similar state A 2 11 . Since G as contains a unique attractor, its relative probability P rel is necessarily 1.
In the case of growth rate measured by S phase (h A (a) = Clb5), the asymptotic graph (Figure 6, middle) has three attractors, two single steady state and one cyclic attractor: In this case, however, computation of the cross graph shows that R S 1 is a spurious attractor, implying that the asymptotic graph has lost some information on transient pathways. In practice, the full graph contains pathways eventually leading from R S 1 to either R S 2 or R S 3 . This example shows the importance of verifying whether any of the asymptotic graph's attractors is spurious, and hence the usefulness of a complementary method as the cross graph. In this situation, the probabilistic interpretation of the asymptotic graph is unclear. The relative probabilities computed according to (8) yield equal probabilities for reaching attractors R S 2 and R S 3 (see Table 3). In contrast, R S 1 must now be interpreted as a transient set of states.
In the case G1 is used as measure of growth rate, we have h A (a) = Cdh1 ∧ Sic and the asymptotic graph (Figure 6, bottom) has six single state attractors but no cyclic attractor: All these attractors are confirmed by the cross graph. Computation of relative probabilities shows that the single steady state A 11 21 × B 1 12 is more frequently observed (with a percentage of around 54%, see Table 3). In this state all proteins of the cell cycle are expressed except for Cdh1 and Sic1, which characterize stationary G1 phase. The cell growth module is in a state where Cln3F is available, thus setting START to 1. The interconnected system is thus locked in a steady state where the interaction links are fixed: A 11 21 × B 1 12 = 11110111011 × 10100110, since the output of each attractor is equal to the input of the other.

Hypotheses Discrimination
These results appear to support a model for START signal of the form (12), as suggested by Aldea et al. (2017). Indeed, if cell size triggers START, then it can be assumed that there is a "critical size" which will be attained most probably at the end of G2 phase. And, in fact, the interconnected system exhibits an oscillatory cycle only in the case of M phase used as cell size indicator. This cycle is in agreement with cell cycle progression, meaning that the cell size module is able to trigger the START signal.
In contrast, when G1 or S phases are used as cell size indicator, the interconnected system has no oscillatory behavior. For the G1 case, the most frequent steady state (A 11 21 × B 1 12 ) represents a configuration where the cell size module permanently sets Cln3F = 1, and does not admit cell size to reset to zero. Note that G1 is the beginning of the cell cycle and a misleading indicator of "critical" size; in this case, the "critical" size is so small that the cell size module sets START permanently to 1 thus preventing the cell cycle to reset to zero and initiate a new cycle. Cells are locked in a steady state near mitosis and before early G1.
In conclusion, our analysis shows that neither G1 nor S phases are reliable cell growth indicators, but components from M phase are plausible candidates for detecting cell growth. We point out that the cell size Boolean network and the feedback interconnection points may admit several improvements, which are outside the scope of our paper. Nevertheless, we believe this first approach provides useful hints on how to further investigate and model the START signal in yeast.

DISCUSSION AND CONCLUSIONS
Our work illustrates a new concept for the analysis of an interconnection of Boolean networks: the goal is to study the coupled behavior of two or more modules, using only the dynamics of each separate module. A new methodology has been discussed, based on construction of the asymptotic and cross graphs both representative of the full network transition graph and guaranteed to compute all attractors of the interconnected network. The two graphs have different properties but also complement each other. The cross graph provides exact results, in the sense that it contains all transient and asymptotic behaviors of the interconnected network. The asymptotic graph is a lighter construction containing a minimal number of nodes while recovering all attractors. In contrast to the cross graph, no bijection with the full network transition graph is guaranteed, implying that spurious attractors may appear; however, this happens at an extremely low rate (less than 1%).
Construction of the two graphs for random input/output networks with varying connectivity reveals their complementarity in terms of modules' connectivity: for low connectivity (K mean ≤ 5), the asymptotic graph is much smaller (on average 0.01% of the full graph, against 28% for the cross graph; Figure 2B) and faster to compute; in contrast, for high connectivity (K mean > 5), the size of the cross graph drastically reduces to 0.04% of the full graph (Figure 2A) becoming even faster to analyze than the asymptotic graph ( Figure 2C). In addition, even though the asymptotic graph involves a drastic simplification of the state space, it has an unexpectedly high rate of accuracy, as shown in Figure 2D.
The practical advantages of our methodology are illustrated by the study of two well known biological networks. Among other useful characteristics, the asymptotic graph can greatly reduce the size of the state space, especially in the case of single-input single-output modules. For instance the mammalian and yeast interconnected networks, with an average connectivity of K = 2.76 and K = 2.68 respectively, have asymptotic graphs of only 18 and 22 nodes (compared to 2 17 or 2 19 ).
The analysis of the coupling between cell cycle and circadian clock shows that, according to experimental observations (for instance by Plikus et al., 2013), the asymptotic graph predicts that mammalian cells in the quiescent state may have a working clock. Furthermore, under general hypotheses, the probabilistic approach predicts that one third of cells are arrested in the quiescent state but still have circadian oscillations, while the other two thirds follow a normal cell cycle progression intertwined with circadian oscillations. In the budding yeast example, we have explored a recent hypothesis by Aldea et al. (2017) for a mechanism to trigger the START signal and initiate cell cycle. The mechanism is based on cell size detection through cell growth rate. Our analysis supports such a mechanism as a possible START trigger, and suggests that cell size indicator should come from an element during M phase.
The advantages of our analysis tools are multiple and particularly suited to the modeling of biological regulatory networks: by manipulating existing models as building blocks, the presented tools allow to rapidly simulate, compare, and test different coupling schemes or hypotheses on mutual regulatory effects, and therefore advance in the understanding of highly modular regulatory networks. The probabilistic interpretation and the analysis of transient behaviors emerge as two noteworthy directions for future developments in logical models.

AUTHOR CONTRIBUTIONS
MC and LT: equally contributed to conception, analysis and design of the study; MC and LT: wrote and revised the manuscript.

SUPPLEMENTARY MATERIAL
The Boolean models used for mammalian and budding yeast cell cycles are provided as Supplementary Material.