Analysis of Biased Competition and Cooperation for Attention in the Cerebral Cortex

A new approach to understanding the interaction between cortical areas is provided by a mathematical analysis of biased competition, which describes many interactions between cortical areas, including those involved in top-down attention. The analysis helps to elucidate the principles of operation of such cortical systems, and in particular the parameter values within which biased competition operates. The analytic results are supported by simulations that illustrate the operation of the system with parameters selected from the analysis. The findings provide a detailed mathematical analysis of the operation of these neural systems with nodes connected by feedforward (bottom-up) and feedback (top-down) connections. The analysis provides the critical value of the top-down attentional bias that enables biased competition to operate for a range of input values to the network, and derives this as a function of all the parameters in the model. The critical value of the top-down bias depends linearly on the value of the other inputs, but the coefficients in the function reveal non-linear relations between the remaining parameters. The results provide reasons why the backprojections should not be very much weaker than the forward connections between two cortical areas. The major advantage of the analytical approach is that it discloses relations between all the parameters of the model.


INTRODUCTION
Biological systems employ a selection processing strategy for managing the enormous amount of information resulting from their interaction with the environment. This selection of relevant information is referred to as attention. One type of visual attention is the result of top-down influences on the processing of sensory information in the visual cortex, and therefore is intrinsically associated with neural interactions within and between cortical areas. Thus, elucidating the neural basis of visual attention is an excellent paradigm for understanding some of the basic mechanisms for interactions between cortical areas.
Observations from a number of cognitive neuroscience experiments have led to an account of attention termed the "biased competition hypothesis," which aims to explain the computational processes governing visual attention and their implementation in the brain's neural circuits and neural systems. According to this hypothesis, attentional selection operates in parallel by biasing an underlying competitive interaction between multiple stimuli in the visual field toward one stimulus or another, so that behaviorally relevant stimuli are processed in the cortex while irrelevant stimuli are filtered out (Chelazzi et al., 1993; Duncan, 1996; Chelazzi, 1998). Thus, attending to a stimulus at a particular location or with a particular feature biases the underlying neural competition in a certain brain area in favor of neurons that respond to the location, or the features, of the attended stimulus. This attentional effect is produced by generating signals in areas outside the visual cortex which are then fed back to extrastriate visual cortical areas, where they bias the competition such that when multiple stimuli appear in the visual field, the cells representing the attended stimulus win, thereby suppressing the firing of cells representing distracting stimuli (Duncan and Humphreys, 1989; Desimone and Duncan, 1995; Duncan, 1996). According to this line of work, attention appears as a property of competitive/cooperative interactions that work in parallel across the cortical modules. Neurophysiological experiments are consistent with this hypothesis in showing that attention serves to modulate the suppressive interaction between the neuronal firing elicited by two or more stimuli within the receptive field (Motter, 1993; Chelazzi, 1998).
Further evidence comes from functional magnetic resonance imaging (fMRI) in humans (Kastner et al., 1998, 1999), which indicates that when multiple stimuli are present simultaneously in the visual field, their cortical representations within the object recognition pathway interact in a competitive, suppressive fashion, which is not the case when the stimuli are presented sequentially. It was also observed that directing attention to one of the stimuli counteracts the suppressive influence of nearby stimuli. Neurodynamical models providing a theoretical framework for biased competition have been proposed and successfully applied in the context of attention and working memory (Rolls, 2016). In the context of attention, Usher and Niebur (1996) introduced an early model of biased competition to explain the attentional effects in neural responses observed in the inferior temporal cortex, and this was followed by a model for V2 and V4 based on the shunting equations of Grossberg (1988). Deco and Zihl (2001) extended Usher and Niebur's model to simulate the psychophysics of visual attention by visual search experiments in humans. Their neurodynamical formulation is a large-scale hierarchical model of the visual cortex whose global dynamics is based on biased competition mechanisms at the neural level. Attention then appears as an effect related to the dynamical evolution of the whole network. This large-scale formulation has been able to simulate and explain in a unifying framework visual attention in a variety of tasks and at different cognitive neuroscience experimental measurement levels, namely: single cells (Deco and Lee, 2002; Rolls and Deco, 2002), fMRI (Deco, 2002, 2004), psychophysics, and neuropsychology (Heinke et al., 2002).
In the context of working memory, further developments (Deco and Rolls, 2003; Szabo et al., 2004) managed to model in a unifying form attentional and memory effects in the prefrontal cortex, integrating single-cell and fMRI data and different paradigms in the framework of biased competition.
A detailed dynamical analysis of the synaptic and spiking mechanisms underlying biased competition was produced by Deco and Rolls (2005). However, the parameter regions within which biased competition operates were identified by a mean field analysis, which consisted of testing a set of parameters until effective regions in the state spaces were identified.
Here we treat the biased competition system analytically for the first time. This mathematical analysis complements previous numerical results and improves our understanding of the principles of operation of the system. Although the results are presented in the context of attention, they apply more generally to interactions between cortical areas. The dynamics of cortical attractor networks, and the dynamical interactions between cortical areas and the strength of the connections between them that enable them to interact usefully for short-term memory and attention yet maintain separate attractors, have been analyzed with the rather different approaches of theoretical physics elsewhere (Treves, 1993; Battaglia and Treves, 1998; Renart et al., 1999a,b, 2000, 2001; Panzeri et al., 2001; Rolls, 2016).

The Biased Competition Network
The network to be analyzed is shown in Figure 1, and has the same general architecture used by Deco and Rolls (2005) to investigate the mechanisms of biased competition in a range of neurophysiological experiments. The system has forward connections J f and K f , and top-down backprojection connections J b and K b , as shown in Figure 1. The top-down connections are weaker, so that they can bias the bottom-up inputs, but not dominate them, so that the system remains driven by the world. The top-down connections in the model correspond to the backprojections found between adjacent cortical areas in a cortical hierarchy, and in the area of memory recall, to the backprojections from the hippocampus to the neocortex (Kesner and Rolls, 2015; Rolls, 2016, 2018). The anatomical arrangement that facilitates this is that the backprojections end on the apical dendrites of cortical pyramidal cells far from the cell body, where their effects can be shunted by forward inputs that terminate on the parts of the dendrite that are electrically closer to the cell body (Rolls, 2016). In both top-down attention, and in memory recall, it is important that any bottom-up inputs from the world take priority, so that the organism is sensitive to events in the world, rather than being dominated by internal processing (Rolls, 2016). Interestingly, there is evidence that this situation is less the case in schizophrenia, in which some key forward connections are reduced in magnitude relative to the backprojections (Rolls et al., 2019). The type of neurophysiological experiment for which this model was designed is described by Deco and Rolls (2005), with one of the original neurophysiological investigations performed on object-based attention in the inferior temporal visual cortex (IT) by Chelazzi et al. (1993). The overall operation of this architecture is conceptualized as follows [see Deco and Rolls (2005)].
Activity in the neuron or population of neurons L 1 has a strong driving effect on H 1 (via J f ), as stimulus 1 acting via λ 1 is the preferred (i.e., most effective) stimulus, and a weaker effect on H 2 (via the crossed connection K f ) for which stimulus 1 is the less preferred stimulus. There are weaker corresponding backprojections J b and K b . Correspondingly, L 2 has a strong driving effect on H 2 (via J f ) as stimulus 2 acting via λ 2 is the preferred stimulus, and a weaker effect on H 1 (via the crossed connection K f ) for which stimulus 2 is the less preferred stimulus. L 1 is in competition with L 2 (with strength c L ), and H 1 is in competition with H 2 (with strength c H ). The present model does not imply that L 1 and L 2 are in different areas in a topographical map, but that can be easily implemented by decreasing the strength of c L if L 1 and L 2 are not close together in the map, and the same general results hold (see Deco and Rolls, 2005). It is noted that convergence in topographically mapped systems from stage to stage is important in providing one of the bases for translation invariant visual object recognition as modeled in VisNet (Rolls, 2012, 2016), and that the dendritic morphology at different stages of processing in the visual cortical hierarchy may facilitate this (Elston, 2002; Elston and Fujita, 2014).
The nodes L correspond to V4, and the nodes H to inferior temporal visual cortex, in the neurophysiological experiment of Chelazzi et al. (1993) on object-based top-down attention. The lower nodes are L 1 and L 2 , and receive inputs λ 1 and λ 2 . L 1 has forward connections of strength J f to higher node H 1 , and L 2 has forward connections of strength J f to higher node H 2 . The corresponding backprojections have strength J b . In addition, there are crossed connections K f and K b shown with dashed lines. The rationale for this connectivity is that the preferred stimulus for L 1 has strong effects on H 1 , but weaker effects on H 2 for which it is not the preferred stimulus. In a corresponding way, the preferred stimulus for L 2 has strong effects on H 2 , but weaker effects on H 1 for which it is not the preferred stimulus. This is implemented as follows. L 1 has forward connections of strength K f to higher node H 2 , and L 2 has forward connections of strength K f to higher node H 1 . The corresponding backprojections have strength K b . The K connections are weaker than the J connections. A top-down attentional bias signal λ H 2 can be applied to node H 2 , and can bias the network to emphasize the effects of λ 2 even if λ 2 is less than or equal to λ 1 . There is competition within a level, with H 1 and H 2 competing with strength c H , and L 1 and L 2 competing with strength c L . The conditions for these effects to occur and for the network to be stable are analyzed. In addition, all four of the nodes can have recurrent collateral connections that produce self-excitatory effects, but the effects of these in the simulations do not affect the generic results obtained, and they are not included in the mathematical analyses for simplicity.
The network can operate as follows. If an input λ 2 that is weaker than λ 1 is applied to L 2 , then neural populations L 1 and H 1 win the competition. However, if a top-down biased competition input λ H 2 is applied to H 2 , then this can bias the network so that H 2 wins the competition over H 1 , and L 2 also has higher activity than L 1 . Here we analyse the relation between the parameters shown in Figure 1 that enable these effects to emerge and to be stable.
FIGURE 1 | Model architecture. The lower nodes are L 1 and L 2 , and receive inputs λ 1 and λ 2 . L 1 has forward connections of strength J f to higher node H 1 for which stimulus 1 acting via λ 1 is the preferred input, and L 2 has forward connections of strength J f to higher node H 2 for which stimulus 2 acting via λ 2 is the preferred input. The corresponding backprojections have weaker strength J b . In addition, there are crossed connections shown with dashed lines. In particular, L 1 has forward connections of weak strength K f to higher node H 2 as stimulus 1 is not the preferred stimulus for H 2 , and L 2 has forward connections of strength K f to higher node H 1 as stimulus 2 is not the preferred stimulus for H 1 . The corresponding backprojections have weaker strength K b . A top-down attentional bias signal λ H 2 can be applied to node H 2 , and can bias the network to emphasize the effects of λ 2 even if λ 2 is less than or equal to λ 1 . There is competition within a level, with H 1 and H 2 competing with strength c H , and L 1 and L 2 competing with strength c L . The conditions for these effects to occur and for the network to be stable are analyzed. In addition, all four of the nodes can in the simulations have recurrent collateral connections that produce self-excitatory effects, consistent with cortical architecture, and these produce the expected effects, but are not considered further here because the mathematical analysis focuses on the tractable case in which they are not present.
A feature of the model described here and also by Deco and Rolls (2005) is that local attractor dynamics are implemented within each of the neuronal populations, as recurrent collateral connections are a feature of the cerebral neocortex, and are one way to incorporate non-linear effects into the operation of the system. The new results are derived here analytically, and are illustrated with simulations with the parameters originating from the formulas obtained.

The Attentional Effects to be Modeled: Investigation 1
One top-down biased competition effect to be considered is as follows. If λ 1 is greater than λ 2 , then the activity of L 1 will be greater than the activity of L 2 . But if we apply a top-down biased competition input λ H 2 to H 2 , then at some value of λ H 2 the top-down bias will result in L 2 having as much activity as L 1 . We seek to analyse the exact conditions under which this biased competition effect occurs, in terms of all the parameters of the system. This system has been studied previously as follows, but with a mean-field analysis to search the parameter space, rather than the analytic approach described here. Figure 1 shows a design associated with a prediction that can be made by setting the contrast-attention interaction study in the framework of the experimental biased competition design of Chelazzi et al. (1993) involving object attention. Deco and Rolls (2005) modeled this experiment by measuring neuronal responses from neurons in neuronal population (or pool) H 1 in the inferior temporal cortex (IT) to a preferred and a non-preferred stimulus simultaneously presented within the receptive field. They manipulated the contrast or efficacy of the stimulus that was non-preferred for the neurons H 1 . They analyzed the effects of this manipulation for two conditions, namely without object attention, or with top-down object attention on the non-preferred stimulus 2 that produced input λ 2 , implemented by adding an extra bias λ H 2 to H 2 . In the previous integrate-and-fire simulations (Deco and Rolls, 2005), top-down biased competition was demonstrated, and it was found that the attentional suppressive effect implemented through λ H 2 on the responses of neurons H 1 to the competing non-preferred stimulus (2) was higher when the contrast of the non-preferred stimulus (2) was at intermediate values. However, the operation of this type of network was not examined analytically, which is the aim of the present investigation.

The Attentional Effects to be Modeled: Investigation 2
A second top-down competition effect might be considered as follows. If λ 1 is greater than λ 2 and both are applied simultaneously, then the activity of H 1 will be greater than the activity of H 2 . However, if we apply top-down bias λ H 2 to H 2 , then we can influence the activity of H 2 and H 1 (through all the connections in the system) until H 2 and H 1 have the same activity (and at the same time there will be an effect on L 1 and L 2 ). We wish to quantify these effects analytically in terms of all the parameters of the system.

DYNAMICS
We shall use the notation [x] + = max{x, 0} for any real value of x. We assume that L 1 (t), L 2 (t), H 1 (t), H 2 (t) are defined by the system of recurrent equations (3.1) and (3.2) in discrete time t = 0, 1, . . . One can also use here any time step t = 0, τ , 2τ , . . . , treating τ as another parameter. Here I denotes the indicator function, i.e., I{H i (t) > T H } equals 1 if H i (t) > T H and 0 otherwise, and similarly for I{L i (t) > T L }. This means that the terms in (3.1) and (3.2) which involve these indicator functions for the recurrent dynamics are present only if the activity in L i (t) or H i (t) is greater than the corresponding threshold T L or T H . In the analysis described here, we assumed that T L and T H were infinity, so that the recurrent dynamics were not in operation, but the recurrent dynamics were tested in the simulations.
The constants w f ij represent the synaptic weight of the connection from node L j to node H i , while w b ij is the synaptic weight of the connection from node H j to node L i . In line with the architecture of Figure 1, we shall assume that $w^f_{11} = w^f_{22} = J_f$, $w^f_{12} = w^f_{21} = K_f$, $w^b_{11} = w^b_{22} = J_b$, and $w^b_{12} = w^b_{21} = K_b$, where indexes b and f correspond to "back" and "forward". Furthermore, the connection strengths are assumed to be related through some non-negative coefficient q (typically 0 ≤ q < 1, see more comments below). Functions λ i (t), λ H i (t) represent the external inputs. All the remaining constants in the system (3.1), (3.2) are free parameters; they are assumed to be non-negative, as they represent the following characteristics:
β L is the decay term for the L nodes,
c L is the competition between the L nodes,
T L is the threshold at which an L node enters its attractor dynamics as was explained above,
α L is the gain factor for self-excitation in an L node for attractor dynamics,
β H is the decay term for the H nodes,
c H is the competition between the H nodes,
T H is the threshold at which an H node enters its attractor dynamics,
α H is the gain factor for self-excitation in an H node for attractor dynamics.
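Note that when all the constant parameters of the system are set to zero, including the input functions λ i , λ H i , the system remains at the initial state (L 1 (0), L 2 (0), H 1 (0), H 2 (0)). We shall assume zero initial conditions, $L_1(0) = L_2(0) = H_1(0) = H_2(0) = 0$ (3.5), and that all the input functions are non-negative constants, i.e., $\lambda_i(t) = \lambda_i \ge 0$ and $\lambda^H_i(t) = \lambda^H_i \ge 0$ for all t. We shall address the following problems. Assume that λ 1 > λ 2 . Problem I: for which parameters and top-down inputs λ H 1 < λ H 2 does L 1 (t) < L 2 (t) hold for all large values of t? This is investigated in Investigation 1. Problem II: for which parameters and top-down inputs λ H 1 < λ H 2 does H 1 (t) < H 2 (t) hold for all large values of t? This is investigated in Investigation 2.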
In other words, we are looking for the conditions on the parameters of the system which allow biased competition. Deco and Rolls (2005) found numerically a region of parameters which yields such an effect. Here we derive this analytically. This allows us to treat a wide range of parameters and, moreover, to find the exact relations between all the parameters of the network when biased competition takes place.
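The dynamics described in this section can be illustrated with a minimal simulation sketch. Because equations (3.1) and (3.2) are not reproduced in this text, the update rule in `step` is an assumption: each node keeps a fraction 1 − β of its own activity, is inhibited by its competitor with strength c, receives its weighted inputs, and is rectified by [x] + , with the attractor terms switched off (T L = T H = ∞). The values of K f , J b and K b are likewise assumed (J b and K b are the defaults quoted later in the text, and K f = 0.015/3 is an assumption). The function and dictionary names are hypothetical.

```python
def relu(x):
    """[x]_+ = max(x, 0): firing rates cannot be negative."""
    return x if x > 0.0 else 0.0


# Assumed default parameters (Kf is an assumption; see the lead-in).
PARAMS = dict(Jf=0.15 / 3, Kf=0.015 / 3, Jb=0.05 / 3, Kb=0.005 / 3,
              bL=0.35, cL=0.3, bH=0.35, cH=0.3)


def step(state, lam, p):
    """One discrete time step of the ASSUMED update rule (no attractor terms)."""
    L1, L2, H1, H2 = state
    lam1, lam2, lamH1, lamH2 = lam
    nL1 = relu((1 - p["bL"]) * L1 - p["cL"] * L2 + p["Jb"] * H1 + p["Kb"] * H2 + lam1)
    nL2 = relu((1 - p["bL"]) * L2 - p["cL"] * L1 + p["Kb"] * H1 + p["Jb"] * H2 + lam2)
    nH1 = relu((1 - p["bH"]) * H1 - p["cH"] * H2 + p["Jf"] * L1 + p["Kf"] * L2 + lamH1)
    nH2 = relu((1 - p["bH"]) * H2 - p["cH"] * H1 + p["Kf"] * L1 + p["Jf"] * L2 + lamH2)
    return (nL1, nL2, nH1, nH2)


def simulate(lam, p=PARAMS, steps=3000):
    """Iterate from zero initial conditions and return the final state."""
    state = (0.0, 0.0, 0.0, 0.0)
    for _ in range(steps):
        state = step(state, lam, p)
    return state


# No top-down bias: the stronger bottom-up input (lambda_1 = 6 > lambda_2 = 5) wins.
L1, L2, H1, H2 = simulate((6.0, 5.0, 0.0, 0.0))
print("L1=%.3f L2=%.3f H1=%.3f H2=%.3f" % (L1, L2, H1, H2))
```

Under these assumptions the unbiased network settles with L 1 above L 2 and H 1 above H 2 , which is the starting point for the two problems stated above.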

Investigation 1
The top-down biased competition effect to be considered here is as follows. If λ 1 is greater than λ 2 , then the activity of L 1 will be greater than the activity of L 2 . But if we apply a top-down biased competition input λ H to H 2 , then at some value of λ H the top-down bias will result in L 2 having as much activity as L 1 . The aim was to discover the critical value of λ H by simulation, for comparison with the analytic value.
The system specified by (3.1) and (3.2) was implemented in Matlab. The parameters shown in (3.1) and (3.2) were set as follows:
J f = 0.15/3 (the values for these synaptic weights are in the same ratio as in Deco and Rolls, 2005)
β L = 0.35 (the decay term for the L nodes)
c L = 0.3 (the competition between the L nodes)
T L = 5.0 (the threshold at which an L node enters its attractor dynamics; in practice this was set to infinity in most of the work described, in order to prevent attractor dynamics operation in the nodes)
α L = 0.0 (the gain factor for self-excitation in an L node for attractor dynamics; α L can also be set to 0.1 if recurrent dynamics are required)
β H = 0.35 (the decay term for the H nodes)
c H = 0.3 (the competition between the H nodes)
T H = 5.0 (the threshold at which an H node enters its attractor dynamics; in practice this was set to infinity in most of the work described, in order to prevent attractor dynamics operation in the nodes)
α H = 0.0 (the gain factor for self-excitation in an H node for attractor dynamics; α H can be set to 0.1)
λ 1 = 6.0
λ 2 = 5.0
λ H 2 = 0.0 or the critical value λ H,cr 2 of λ H 2 derived below in the analysis for the top-down bias to overcome the bottom-up inputs shown as λ 1 and λ 2 .
For all of the simulations illustrated in this paper and for the analytic investigations, the recurrent collateral self-excitatory effects were turned off (i.e., the α values were set to zero, and the parameters T L and T H were set to infinity). However, as stated in the Legend to Figure 1, the effects obtained when these were enabled were generically the same with respect to the biased competition effects described in this research.
Simulation results for a system with λ 1 = 6.0, λ 2 = 5.0, and the top-down bias λ H = λ H,cr 2 are shown in Figure 2. λ H,cr 2 is the value derived in the analysis for the top-down bias to just overcome the bottom-up inputs shown as λ 1 and λ 2 , as shown below in (5.54). λ H,cr 2 was 22.816. Now the H 2 node has high activity as expected due to the application of λ H 2 , but this has the effect that the network settles into a state where the L 2 node has just higher activity than the L 1 node, despite λ 1 > λ 2 . This demonstrates that the analysis described below that derives the critical value λ H,cr 2 for the biased attention signal to just reverse the difference between the bottom up inputs to make the network respond preferentially to the weaker incoming signal, is accurate.
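The comparison between simulation and analysis described above can be sketched numerically by bisecting on the top-down bias until L 2 just matches L 1 at steady state. This is only an illustration under stated assumptions: the update rule below (decay 1 − β, subtractive competition, rectification, no attractor terms) and the values of K f , J b and K b are assumed, since (3.1), (3.2) and the full parameter list are not reproduced in this text. The function names are hypothetical.

```python
def relu(x):
    """[x]_+ = max(x, 0)."""
    return x if x > 0.0 else 0.0


def steady_state(lam1, lam2, lamH1, lamH2, steps=3000):
    """Iterate the ASSUMED update rule from zero initial conditions."""
    Jf, Kf, Jb, Kb = 0.15 / 3, 0.015 / 3, 0.05 / 3, 0.005 / 3  # Kf, Jb, Kb assumed
    bL, cL, bH, cH = 0.35, 0.3, 0.35, 0.3
    L1 = L2 = H1 = H2 = 0.0
    for _ in range(steps):
        L1, L2, H1, H2 = (
            relu((1 - bL) * L1 - cL * L2 + Jb * H1 + Kb * H2 + lam1),
            relu((1 - bL) * L2 - cL * L1 + Kb * H1 + Jb * H2 + lam2),
            relu((1 - bH) * H1 - cH * H2 + Jf * L1 + Kf * L2 + lamH1),
            relu((1 - bH) * H2 - cH * H1 + Kf * L1 + Jf * L2 + lamH2),
        )
    return L1, L2, H1, H2


def critical_bias(lam1, lam2, lo=0.0, hi=50.0, iters=40):
    """Bisection on the bias to H2 for the point where L2 catches up with L1."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        L1, L2, H1, H2 = steady_state(lam1, lam2, 0.0, mid)
        if L2 >= L1:
            hi = mid   # bias is already sufficient
        else:
            lo = mid   # bias is still too weak
    return 0.5 * (lo + hi)


crit = critical_bias(6.0, 5.0)
print("simulated critical bias: %.3f" % crit)
```

With these assumed equations and parameters, the bisection lands close to the value 22.816 quoted above, which suggests that the assumed form is at least compatible with the reported result.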
In further investigations, it was found that the analysis made correct predictions over a wide range of values of the parameters that were confirmed by numerical simulations. For example, the value λ H,cr 2 was correctly calculated by the analysis over a 10-fold variation of K f and K b as shown by the performance of the numerical simulations. It was also found in the simulations that ratios for J b /J f and K b /K f in the range of 0.1-0.5 produced top-down biased competition effects with values for λ H,cr 2 that were in the range of the other activities in the system.
FIGURE 3 | The top-down bias λ H,cr 2 needed to make the activity of L 2 as large as L 1 as a function of λ 1 − λ 2 denoted δλ. λ 1 was fixed at 6.
The interpretation of the analytical results shown in, for example, (5.54) is now facilitated by graphical analysis. Figure 3 for Investigation 1 shows the top-down bias λ H,cr 2 needed to make the activity of L 2 as large as L 1 as a function of λ 1 − λ 2 , which is termed δλ. λ 1 was fixed at 6, and λ H 1 = 0. Figure 3 shows that λ H,cr 2 is a linear function of δλ. The slope of this function is high (140/6). The implication is that for reasonable values of the top-down bias, limited to perhaps 50 in this model, the working range for δλ is relatively small, approximately 2.
FIGURE 4 | Analytical results for Investigation 1 for a system with λ 1 − λ 2 , which is termed δλ, fixed at 1. It is shown that λ H,cr 2 , the top-down bias needed to make the activity of L 2 as large as L 1 , was relatively independent of the absolute values of λ 1 and λ 2 .
In another example to illustrate the utility of the analysis that produced (5.54), it was found that the top-down bias λ H,cr 2 needed to make the activity of L 2 as large as L 1 was relatively independent of the absolute values of λ 1 and λ 2 , and so depended on the difference between them, i.e., λ 1 −λ 2 termed δλ. This is illustrated in Figure 4 in which δλ was set to 1.0, and the value of λ 1 was increased from 1 to 10. It is clear that the top-down bias required depends very little on the absolute values of λ 1 and λ 2 , but instead on their difference as shown in Figure 3.
We can also understand quantitatively the effect of the other parameters using (5.54). For example, if J b is increased from its default value of 0.05/3 to 0.1/3, then the slope of the function illustrated in Figure 3 falls to a lower value (66/6). This makes the important point that top-down biased competition can only operate with rates for the top-down bias (in this case λ H 2 ) in a reasonable range (not too high) if the backward connection strength (here J b with a default value of 0.05/3) is not too low compared to the forward connection strength (here J f with a default value of 0.15/3). This places important constraints on the ratio of the strengths of backprojections relative to forward projections between cortical areas (Rolls, 2016).
In another example to illustrate the utility of the analysis that produced (5.54), it was found that if the crossed connections K b (default 0.005/3) were increased (for example to 0.01/3), then more top-down bias was needed, with the slope of the function shown in Figure 3 now 158/6. The reason for this is clear: some of the top-down bias λ H 2 acts on L 1 via K b . The analytic result in (5.54) quantifies this, as it does the effects of the other parameters involved in this approach to biased competition.
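The parameter dependence discussed above can be made concrete with a short fixed-point calculation. This is a hand-derived sketch and is not claimed to be the paper's expression (5.54), which is not reproduced in this text. It assumes the update rule x(t+1) = [(1 − β)x − c·x_other + weighted inputs + λ] + , assumes K f = 0.015/3, and assumes the regime in which the strong bias on H 2 has driven H 1 to zero while L 1 = L 2 > 0. The function names are hypothetical.

```python
def critical_bias_inv1(lam1, lam2, Jf=0.15 / 3, Kf=0.015 / 3,
                       Jb=0.05 / 3, Kb=0.005 / 3, bL=0.35, cL=0.3, bH=0.35):
    """Bias on H2 that gives L1 == L2 at the fixed point, ASSUMING H1 == 0."""
    # Subtracting the L1 and L2 fixed-point equations with L1 == L2 gives H2.
    H2 = (lam1 - lam2) / (Jb - Kb)
    # Common value L of L1 and L2, from the L1 equation with H1 == 0.
    L = (Kb * H2 + lam1) / (bL + cL)
    # The H2 fixed-point equation then fixes the required external bias.
    return bH * H2 - (Jf + Kf) * L


def slope(**overrides):
    """Increase in critical bias per unit increase in delta-lambda (lam1 = 6)."""
    return (critical_bias_inv1(6.0, 4.0, **overrides)
            - critical_bias_inv1(6.0, 5.0, **overrides))


print("critical bias at delta=1:", round(critical_bias_inv1(6.0, 5.0), 3))
print("6 * slope, defaults     :", round(6 * slope(), 1))
print("6 * slope, Jb = 0.1/3   :", round(6 * slope(Jb=0.1 / 3), 1))
print("6 * slope, Kb = 0.01/3  :", round(6 * slope(Kb=0.01 / 3), 1))
```

In this sketch the slope is dominated by β H /(J b − K b ), which makes explicit why raising J b lowers the required bias and raising K b increases it. The small offset −(J f + K f )λ 1 /(β L + c L ) is the only place the absolute input level enters, consistent with the weak dependence on λ 1 shown in Figure 4.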

Investigation 2
The second top-down competition effect considered was as follows. If λ 1 is greater than λ 2 and both are applied simultaneously, then the activity of H 1 will be greater than the activity of H 2 . However, if we apply top down biased competition λ H to H 2 , then we can influence the rates H 2 and H 1 (through all the connections in the system) until H 2 and H 1 have the same activity (and at the same time there will be an effect on L 1 and L 2 ). We wished to quantify these effects analytically in terms of all the parameters of the system.
Simulation results for a system with λ 1 = 6.0, λ 2 = 5.0, and the top-down bias λ H 2 = 0 or = λ H,cr 2 are shown in Figure 5. λ H,cr 2 is the value derived in the analysis for the top-down bias to make the rates in H 1 equal those in H 2 , to just overcome the difference between the bottom-up inputs λ 1 and λ 2 , as shown in (5.60). The upper graph in Figure 5 shows the results when the top-down bias λ H = 0. It can be seen that the rates in H 1 are higher than in H 2 , which is as expected, because λ 1 = 6.0, and λ 2 = 5.0. In the lower part of Figure 5 the top-down bias λ H 2 = 0.775, the value derived in the analysis for the top-down bias to just produce equal rates in H 1 and H 2 , overcoming the difference between the bottom-up inputs shown as λ 1 and λ 2 . The simulation illustrated in the lower part of the figure thus shows that when the analytically calculated value for λ H,cr 2 is used, the numerical simulation confirms that this is the correct value. When a different value is used, as shown in the top part of Figure 5, then the correct results are not obtained.
The interpretation of the analytical results shown in for example (5.60) for Investigation 2 is now facilitated by graphical analysis. Analytical results for Investigation 2 to show how λ H,cr 2 is a function of λ 1 are shown in Figure 6, for a system with λ 2 = 5.0. λ H,cr 2 is the value derived in the analysis for the top-down bias to just produce equal rates in H 1 and H 2 , overcoming the difference between the bottom-up inputs shown as λ 1 and λ 2 . Figure 6 shows that λ H,cr 2 is a linear function of λ 1 , when λ 2 = 5.0. Moreover, Figure 6 shows that only a small variation of λ H,cr 2 is sufficient to counteract large changes in λ 1 . Moreover, the implication of 5.60 is that provided that the conditions shown in (5.59) are met, the operation is relatively independent of λ 2 . The understanding to which this leads is that the relative outputs measured at the H nodes are relatively little affected by the values of λ 1 and λ 2 , compared to the effects of the input biases to λ H 2 and λ H 1 . An implication for the operation of the brain is that top-down biased competition can have useful effects on the lower (L) nodes in the system, which could then influence other systems. Another implication is that the output from the higher (H) nodes is relatively strongly affected by any direct inputs to the H nodes, compared to effects mediated by top-down biases acting through the backward connections to the L nodes, and on systems connected to these L nodes.
Analytical results for Investigation 2 to show how λ H,cr 2 depends on λ H 1 are shown in Figure 7, for a system with λ 2 = 5.0. λ H,cr 2 is the value derived in the analysis for the top-down bias to just produce equal rates in H 1 and H 2 , overcoming the difference between the bottom-up inputs shown as λ 1 and λ 2 . This figure shows that λ H,cr 2 is a linear function of λ H 1 with a slope of approximately 1. The implication here is that inputs to the H nodes influence each other almost equally, and this will occur primarily through the inhibition between these nodes implemented by c H , rather than through the top-down connections to the L nodes, and then the return effects from the L to the H nodes.
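The trends reported for Investigation 2 can be sketched with a similar hand-derived fixed-point calculation. It is not claimed to be the paper's expression (5.60), which is not reproduced in this text. The sketch assumes the same update rule as before (decay 1 − β, subtractive competition, rectification, no attractor terms), assumes K f = 0.015/3, and assumes the regime in which L 2 is held at zero by competition from L 1 , so that λ 2 drops out. The function name is hypothetical.

```python
def critical_bias_inv2(lam1, lamH1=0.0, Jf=0.15 / 3, Kf=0.015 / 3,
                       Jb=0.05 / 3, Kb=0.005 / 3, bL=0.35, bH=0.35, cH=0.3):
    """Bias on H2 that gives H1 == H2 at the fixed point, ASSUMING L2 == 0."""
    # With H1 == H2 == H:  (bH + cH) * H = Jf * L1 + lamH1   (H1 equation)
    #                      bL * L1 = (Jb + Kb) * H + lam1    (L1 equation)
    g = (Jb + Kb) / (bH + cH)
    L1 = (lam1 + g * lamH1) / (bL - g * Jf)
    # H2 equation: (bH + cH) * H = Kf * L1 + lamH2, so the biases differ by:
    return (Jf - Kf) * L1 + lamH1


crit = critical_bias_inv2(6.0)
slope_lam1 = critical_bias_inv2(7.0) - critical_bias_inv2(6.0)
slope_lamH1 = critical_bias_inv2(6.0, lamH1=1.0) - critical_bias_inv2(6.0, lamH1=0.0)
print("critical bias      :", round(crit, 3))
print("slope w.r.t. lam1  :", round(slope_lam1, 3))
print("slope w.r.t. lamH1 :", round(slope_lamH1, 3))
```

Under these assumptions the critical bias is close to the 0.775 used in Figure 5, changes only slowly with λ 1 , and tracks λ H 1 with a slope just above 1, in line with the qualitative statements made about Figures 6 and 7.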

MATHEMATICAL ANALYSIS
To solve the named problems we shall study the system (3.1) and (3.2), which describes the dynamics of (L 1 (t), L 2 (t), H 1 (t), H 2 (t)) in $\mathbb{R}^4_+$. Note that the entire state space $\mathbb{R}^4_+ = \{(x_1, x_2, y_1, y_2) : x_i \ge 0,\ y_i \ge 0\}$ is decomposed into $2^4 = 16$ areas depending on whether $x_i \in [0, T_L]$ and $y_i \in [0, T_H]$:
$$\mathbb{R}^4_+ = \bigcup_{(e_1,\dots,e_4)\,:\, e_i \in \{0,1\}} X_{e_1} \times X_{e_2} \times Y_{e_3} \times Y_{e_4}, \qquad (5.7)$$
where each $X_e$ is either the interval $[0, T_L]$ or its complement $(T_L, \infty)$, and each $Y_e$ is either $[0, T_H]$ or $(T_H, \infty)$. In each area $X_{e_1} \times X_{e_2} \times Y_{e_3} \times Y_{e_4}$ the behavior of the system (3.1), (3.2) is linear; the non-linear nature of this system is seen, roughly speaking, only at the borders of these areas. In particular, there is a "cut" or "threshold" at zero for all the functions involved, as their meaning as rates allows only non-negative values.
We are mostly interested in modeling the dynamics of a vector (L 1 (t), L 2 (t), H 1 (t), H 2 (t)) which does not escape to infinity in any coordinate. Therefore, we shall first study the system in the bounded area S defined in (5.8). Let us first make the simplifying assumption (5.9), which means that there is no facilitation of rates in the network. Consider the linear system (5.10) associated with the system (3.1), (3.2), and let us fix λ̄ ∈ S (5.11). Then, assuming also the zero initial conditions as in (3.5), we observe that as long as (5.12) holds, i.e., L i (t) ≥ 0 and H i (t) ≥ 0 [recall assumption (5.9)], the system (5.10) describes exactly the same system as (3.1) and (3.2), which is the statement (5.13). Therefore we first derive the conditions on the matrix in (5.10) under which relations (5.12) hold for all t ≥ 0, i.e., the system remains in the bounded area S. The boundedness of the solution to (5.10) is determined entirely by the eigenvalues of the corresponding matrix. However, deriving the eigenvalues even for a (4×4)-matrix already requires heavy computations. Instead we shall take advantage of some particular properties and symmetries of our model which allow us to reduce the original 4 dimensions of the system.

Boundedness of the Trajectories
Let us consider the dynamics of two systems related to (5.10): the system (5.14) for the sums $L_1(t) + L_2(t)$ and $H_1(t) + H_2(t)$, and the system (5.15) for the differences $L_1(t) - L_2(t)$ and $H_1(t) - H_2(t)$. Observe that the sum or the difference of the $H$-rates tends to infinity if and only if at least one $H_i(t) \to \infty$, in which case the original system $(L_1(t), L_2(t), H_1(t), H_2(t))$ cannot remain in the bounded area $S$. A similar statement holds for the sum and the difference of the $L$-rates. Therefore, we begin the analysis by providing the necessary conditions for the stability of systems (5.14) and (5.15), and thus for the stability of (5.10). Then, assuming that the connection parameters provide stability for the sums (5.14), we investigate the behavior of the system of differences (5.15) for different inputs $\lambda_1 > \lambda_2$ and $\lambda^H_1 < \lambda^H_2$.
Our goal here is the following: given asymmetric inputs $\lambda_1 > \lambda_2$, find parameters and a top-down bias $\lambda^H_1 < \lambda^H_2$ such that the difference $L_1(t) - L_2(t)$, or the difference $H_1(t) - H_2(t)$, converges to a negative value for large $t$.
If the equality (5.13) still holds for all $t$, this means that, correspondingly, $L_1(t) < L_2(t)$ or $H_1(t) < H_2(t)$ for large $t$. Hence, correspondingly, we have a solution to Problem I or Problem II.
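The reduction to sums and differences can be checked numerically on a toy example. The sketch below does not use the model (3.1)-(3.2); it assumes a hypothetical symmetric linear system with unit decay, a lateral inhibition weight `w`, forward weight `j_f` and backward weight `j_b`, all chosen only for illustration. For such a system the sums obey a 2-D system with decay `1 + w` and the differences a 2-D system with decay `1 - w`.

```python
def euler_full(lam, lam_h, w, j_f, j_b, dt, n_steps):
    """Euler-integrate the hypothetical linear 4-D system from zero initial conditions."""
    l1 = l2 = h1 = h2 = 0.0
    for _ in range(n_steps):
        # Compute all derivatives from the current state before updating.
        dl1 = -l1 - w * l2 + j_b * h1 + lam[0]
        dl2 = -l2 - w * l1 + j_b * h2 + lam[1]
        dh1 = -h1 - w * h2 + j_f * l1 + lam_h[0]
        dh2 = -h2 - w * h1 + j_f * l2 + lam_h[1]
        l1, l2 = l1 + dt * dl1, l2 + dt * dl2
        h1, h2 = h1 + dt * dh1, h2 + dt * dh2
    return l1, l2, h1, h2


def euler_pair(decay, drive, j_f, j_b, dt, n_steps):
    """Euler-integrate a 2-D system: x couples to y via j_b, y couples to x via j_f."""
    x = y = 0.0
    for _ in range(n_steps):
        dx = -decay * x + j_b * y + drive[0]
        dy = -decay * y + j_f * x + drive[1]
        x, y = x + dt * dx, y + dt * dy
    return x, y


def compare_reduction(lam=(3.0, 2.0), lam_h=(5.0, 8.0), w=0.5,
                      j_f=0.4, j_b=0.2, dt=0.01, n_steps=8000):
    """Compare the 4-D solution with the one rebuilt from sums and differences."""
    full = euler_full(lam, lam_h, w, j_f, j_b, dt, n_steps)
    s_l, s_h = euler_pair(1.0 + w, (lam[0] + lam[1], lam_h[0] + lam_h[1]),
                          j_f, j_b, dt, n_steps)
    d_l, d_h = euler_pair(1.0 - w, (lam[0] - lam[1], lam_h[0] - lam_h[1]),
                          j_f, j_b, dt, n_steps)
    rebuilt = ((s_l + d_l) / 2, (s_l - d_l) / 2, (s_h + d_h) / 2, (s_h - d_h) / 2)
    gap = max(abs(a - b) for a, b in zip(full, rebuilt))
    return full, (d_l, d_h), gap


if __name__ == "__main__":
    full, diffs, gap = compare_reduction()
    print("full state:", full)
    print("differences:", diffs)
    print("max gap:", gap)
```

With these illustrative values the difference of the $L$-rates ends up negative although $\lambda_1 > \lambda_2$, which is the kind of outcome the analysis aims to characterize; the numbers themselves carry no meaning for the actual model.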
First we derive from (5.10) the equations for the sums and, similarly, the equations for the differences. Effectively, using the symmetries we have reduced the 4 dimensions of our model to 2 dimensions, since the solution to our original system (5.10) is recovered from these sums and differences. Recall that for a $2 \times 2$ matrix in which all the parameters are positive (as in our model), the eigenvalues are given by (5.23) and (5.24). Note that, owing to the positivity assumption, the two eigenvalues are distinct. Hence, we have two linearly independent eigenvectors $E_1$ and $E_2$. Note here for further reference that the associated inequalities hold for all positive parameters $A$, $B$, $C$, $D$.
Since the vectors $E_1$ and $E_2$ form a basis in $\mathbb{R}^2$, for any vector $\bar{y}$ there are numbers $q_1$ and $q_2$ such that $\bar{y} = q_1 E_1 + q_2 E_2$. Then the solution to (5.29) is given by (5.31). Observe the system (5.32). This system, together with the assumption that both $C$ and $D$ are positive, yields (5.33). Under the last condition, equation (5.31) yields the convergence to the fixed point stated in (5.34).
Consider $W_S$ defined in (5.17) as an example of the generic matrix $M$ in (5.22). Let us denote its eigenvalues by $\kappa_i(W_S)$, $i = 1, 2$, with $\kappa_1(W_S) \geq \kappa_2(W_S)$. Under assumption (5.35) we have, according to (5.33), condition (5.36). Next we study the eigenvalues $\kappa_1(W) \geq \kappa_2(W)$ of $W$. Under assumption (5.37), which holds whenever (5.35) holds, we have, according to (5.33), condition (5.38). Observe that when $c_L = c_H = 0$, condition (5.38) follows from condition (5.36).
Notice that, together with assumption (3.3), when all the parameters except $J_b$, $K_b$ are fixed, the ratio $q = J_b/J_f$ has to be small to ensure the boundedness of the functions $L_i(t)$, $H_i(t)$.
In words, conditions (5.36) and (5.38) tell us that, under assumption (5.35), the stability of the system (i.e., the boundedness of the trajectories) requires that at least some of the connections, forward or backward, be sufficiently small.
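The role of the eigenvalues in the boundedness of a 2-D linear system can be illustrated with a short, generic check. The closed-form expression below is the standard one for any $2 \times 2$ matrix, written via trace and determinant; the example matrices, with decay 0.5 on the diagonal and coupling weights off the diagonal, are hypothetical and are not the matrices $W_S$ or $W$ of the analysis.

```python
import cmath


def eigenvalues_2x2(m11, m12, m21, m22):
    """Eigenvalues of [[m11, m12], [m21, m22]] from its trace and determinant."""
    trace = m11 + m22
    det = m11 * m22 - m12 * m21
    root = cmath.sqrt(trace * trace - 4.0 * det)  # complex-safe square root
    return (trace + root) / 2.0, (trace - root) / 2.0


def is_stable(m11, m12, m21, m22):
    """True if both eigenvalues have negative real part (bounded trajectories)."""
    return all(k.real < 0.0 for k in eigenvalues_2x2(m11, m12, m21, m22))


if __name__ == "__main__":
    # Weak coupling: product of off-diagonal weights 0.08 < 0.25, stable.
    print(is_stable(-0.5, 0.2, 0.4, -0.5))
    # Strong coupling: product of off-diagonal weights 1.0 > 0.25, unstable.
    print(is_stable(-0.5, 1.0, 1.0, -0.5))
```

In this toy setting stability is lost once the product of the two coupling weights exceeds the product of the decay rates, which mirrors the qualitative statement that some of the forward or backward connections must be sufficiently small.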

Dynamics of the Differences for a Biased Input When All Rates Remain Strictly Positive
Here we study the system (3.1)-(3.2) under conditions in which the functions $L_i(t)$, $H_i(t)$ remain strictly positive and bounded. This means that the dynamics are described by a linear system.
Assume that initially the input to $L_1$ is greater than the one to $L_2$, i.e., $\lambda_1 - \lambda_2 > 0$. We shall find here sufficient conditions on the connection parameters that yield the existence of values $\lambda^H_1 - \lambda^H_2 < 0$ such that eventually, contrary to the initial bias, the state of the system satisfies $L_1(t) < L_2(t)$ (5.39) for all large $t$.
Consider system (5.19) with the input formed by the differences of the inputs. Let us decompose this input vector along the eigenvectors of $W$ as in (5.30), which gives (5.40). Then by (5.34) the convergence (5.41) takes place as $t \to \infty$. Hence, given a positive $\lambda_1 - \lambda_2 = x + y$, we want to find a value $x$, depending on $\lambda_1 - \lambda_2$, which satisfies the required condition and moreover minimizes a function [see (5.40)] which is negative. Then for all the $x$ which satisfy (5.44) we have (5.45). Finally, we can define a negative difference $\lambda^H_1 - \lambda^H_2$ such that (5.40) holds and, moreover, the limiting state defined in the first row on the right-hand side of (5.41) satisfies (5.42). Observe that condition (5.42), by (5.41), guarantees condition (5.39). Substituting the previously derived formulas into (5.45), we find that (5.39) holds for any given positive $\lambda_1 - \lambda_2$ and any $\lambda^H_1 < \lambda^H_2$ satisfying the resulting bound. We conclude that when (5.13) still holds for all large $t$, i.e., when the system remains in the positive area, we have $L_2(t) \geq L_1(t)$ for all large $t$ provided that conditions (5.35), (5.36), and (5.38) are fulfilled and $\lambda^H_2$ is greater than the critical value given in (5.47).
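How a critical top-down bias arises from a linear difference system can be seen in a toy calculation. The derivation below is a hypothetical sketch and is not a reconstruction of formula (5.47): it assumes a difference system with a common decay rate $a > 0$ and coupling weights $J_f$, $J_b$ satisfying $a^2 > J_f J_b$, and uses the illustrative symbols $x = L_1 - L_2$, $y = H_1 - H_2$, $\delta = \lambda_1 - \lambda_2 > 0$ and $\eta = \lambda^H_1 - \lambda^H_2$.

```latex
% Hypothetical 2-D difference system (an assumption, not the model (3.1)-(3.2)):
\begin{align*}
  \dot{x} &= -a\,x + J_b\,y + \delta, \\
  \dot{y} &= -a\,y + J_f\,x + \eta .
\end{align*}
% Setting both derivatives to zero and solving the linear system gives the fixed point:
\begin{align*}
  x^{*} &= \frac{a\,\delta + J_b\,\eta}{a^{2} - J_f J_b}, &
  y^{*} &= \frac{a\,\eta + J_f\,\delta}{a^{2} - J_f J_b}.
\end{align*}
% Since a^2 - J_f J_b > 0, the sign of x^* is the sign of its numerator, so
\begin{equation*}
  x^{*} < 0
  \quad\Longleftrightarrow\quad
  \lambda^{H}_{2} - \lambda^{H}_{1} \;>\; \frac{a}{J_b}\,\bigl(\lambda_1 - \lambda_2\bigr).
\end{equation*}
```

In this toy setting the threshold on the bias grows linearly with the input difference and becomes large when $J_b$ is small; the coefficients in the actual model depend on all of its parameters and are not captured by this sketch.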

Operation With a Threshold at a Rate of Zero
Here we explore the non-linear effects in the system (3.1)-(3.2), considering the case when some of the rates $L_i(t)$ or $H_i(t)$ become zero and may stay at zero because of the non-linear threshold function $(\cdot)_+$, which does not allow negative rates.
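The effect of a threshold at zero can be illustrated on a toy simulation in which one rate is pinned at zero. As an assumption made for illustration only, the sketch uses a hypothetical symmetric system with lateral inhibition weight `w` (not the model (3.1)-(3.2)) and applies the rectification $(\cdot)_+$ to the state after each Euler step; all parameter values are arbitrary.

```python
def simulate_rectified(lam, lam_h, w=0.5, j_f=0.4, j_b=0.2, dt=0.01, n_steps=8000):
    """Euler-integrate a hypothetical 4-D rate system with a threshold at zero."""
    l1 = l2 = h1 = h2 = 0.0
    for _ in range(n_steps):
        # Linear drift evaluated at the current state.
        dl1 = -l1 - w * l2 + j_b * h1 + lam[0]
        dl2 = -l2 - w * l1 + j_b * h2 + lam[1]
        dh1 = -h1 - w * h2 + j_f * l1 + lam_h[0]
        dh2 = -h2 - w * h1 + j_f * l2 + lam_h[1]
        # Euler step followed by rectification: rates cannot become negative.
        l1 = max(0.0, l1 + dt * dl1)
        l2 = max(0.0, l2 + dt * dl2)
        h1 = max(0.0, h1 + dt * dh1)
        h2 = max(0.0, h2 + dt * dh2)
    return l1, l2, h1, h2


if __name__ == "__main__":
    # Bottom-up inputs favor node 1; the top-down bias favors node 2.
    print(simulate_rectified(lam=(3.0, 2.0), lam_h=(0.0, 8.0)))
```

With these values the rate `h1` never leaves zero, so the remaining three rates evolve according to a different linear system from the one valid in the strictly positive regime. This is why the fixed point, and hence any critical bias, has to be computed separately for each such case.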

Equality in the L-Nodes
We shall find here the conditions under which a state $L_1(t) = L_2(t) = L$, $H_1(t) = 0$, $H_2(t) = H$ (5.48) with strictly positive $L$ and $H$ can be a fixed point for the dynamical system (3.1)-(3.2), under the assumption that $\lambda_1 > \lambda_2$, $\lambda^H_2 > \lambda^H_1$ and that (5.9) holds, i.e., when no facilitation of the rate above a certain threshold is applied.
Assuming that the trajectories of $L_1(t)$, $L_2(t)$, and $H_2(t)$ remain strictly positive and do not hit zero, while $H_1(t)$ decays to zero, the constants in (5.48) should satisfy the system (5.49) derived from (5.10) [see also (3.1)-(3.2)]. Observe that the inequality in the third line of (5.49), after the threshold at a rate of zero as in the original system (3.1)-(3.2), yields the limiting state $H_1(t) = 0$. The system (5.49) is equivalent to (5.50). For the last system to have a solution, the parameters must satisfy the conditions (5.51), which in turn require (5.52). Observe that when $\lambda^H_1 = 0$ the last condition is equivalent to (5.53). Assuming that (5.52) holds, we derive from (5.51) the critical value $\lambda^{H,cr}_2$ given in (5.54). The above analysis yields the following statement. Assume also (5.52) [or (5.53) if $\lambda^H_1 = 0$] and (5.55). Then the system (3.1)-(3.2) converges to a limiting state. Notice that the limiting state described in this Proposition satisfies (5.48).
Observe that the formula (5.54) for the critical value $\lambda^{H,cr}_2$ is in good agreement with the previous case (5.47); in fact, the same condition as in (5.47) can be read directly from the inequality in (5.50). However, it is precisely the non-linearity of the system that we use here to derive the exact formula (5.54) for the critical value.

Equality in the H-Nodes
We shall find here the conditions under which a state $L_1(t) = L$, $L_2(t) = 0$, $H_1(t) = H_2(t) = H$ (5.56) with strictly positive $L$ and $H$ can be a fixed point for the dynamical system (3.1)-(3.2), under the assumption that $\lambda_1 > \lambda_2$, $\lambda^H_2 > \lambda^H_1$ and that (5.9) holds, i.e., when no facilitation of the rate above a certain threshold is applied.
Assuming that the trajectories of $L_1(t)$, $H_1(t)$, and $H_2(t)$ remain strictly positive and do not hit zero, while $L_2(t)$ decays to zero, the constants in (5.56) should satisfy the system (5.57) derived from (5.10) [see also (3.1)-(3.2)]. Observe that the inequality in the second line of (5.57), after the threshold at zero as in the original system (3.1)-(3.2), yields the limiting state $L_2(t) = 0$. The system (5.57) is equivalent to (5.58). First we derive the condition (5.59) on the parameters under which the last system has a solution. Assuming that the latter holds, we derive from (5.57) the critical value $\lambda^{H,cr}_2$ given in (5.60). If $\lambda^H_2 \geq \lambda^{H,cr}_2$, then the system ends up in a state where $L_1(t) = L > L_2(t) = 0$ and $H_2(t) = H_1(t) = H$.

6. DISCUSSION

The Analysis
We consider here a 4-dimensional system of linear equations with thresholds for the rates at zero. Although it is "almost" a linear system, which admits rather straightforward analysis, the focus is on the relations between the numerous parameters. Notably, the latter relations are mostly non-linear.
Our approach takes advantage of the intrinsic symmetries of the system, which allow us to reduce the original 4 dimensions to 2 dimensions. This method may be of independent interest, as it can be used in other similar situations.
The derived conditions for the parameters which yield certain desired properties (specified as Problems I and II) disclose nontrivial relations between the parameters.
We considered two cases: (i) when the system keeps a positive rate at each node (section 5.2), and (ii) when the rate at one node, namely $L_2(t)$ (section 5.3.2) or $H_1(t)$ (section 5.3.1), is suppressed to zero after a long enough time.
Consider first our solution to Problem I. Remarkably, the formulas for the critical value of the bias $\lambda^H_2$ that yields the success of competition are different for the above two cases, namely when all the rates are strictly positive and when one rate is zero. These formulas are given by (5.47) and (5.54). This level of accuracy would not be possible without analytic formulas. Observe that (5.47) is the first condition in system (5.50), which also has to be fulfilled for the formula (5.54) to work.
As we mentioned above, formula (5.47) holds only under the assumption that all the rates remain strictly positive, which can be achieved, for example, by choosing $\lambda^H_1$ sufficiently large. Further, we notice that the solution to Problem II provided by formula (5.60) reveals an interesting relation: as long as $\lambda_2$ satisfies condition (5.59), it does not enter formula (5.60) directly. Deco and Rolls (2005) inferred from their particular simulations that a ratio of 2.5 between $J_f$ and $J_b$ provides a good working point for biased competition. As we do not find any universal ratio between $J_f$ and $J_b$ in our analysis, we conclude that the ratio 2.5 reflects a particular scaling when the remaining parameters are fixed at certain values. On the other hand, our analysis tells us that the product $J_f J_b$ has to be sufficiently small for the boundedness of the trajectories. More precisely, (5.36) and (5.38) under assumption (3.3) lead to the sufficient conditions (6.61) for our analysis. Furthermore, each of the formulas (5.54) and (5.47) works under additional conditions, as specified in the text. In particular, formula (5.54) requires (5.53). A reasonably large set of parameters satisfies the above conditions, as shown by the computational results.

Implications for Understanding Biased Competition and the Interaction Between Neural Systems
The analysis elucidates some interesting properties of the biased competition system described. For example, the system is sensitive to the difference between $\lambda_1$ and $\lambda_2$, with the amount of top-down bias required to produce the biased competition effects described being related to this difference, as shown by the analytical results leading to (5.54) and by the results shown in Figures 3, 4. These analyses and results show that while the system tolerates a wide range for the absolute values of $\lambda_1$ and $\lambda_2$, the difference $\delta\lambda$ between them must be relatively small for the values of the top-down bias $\lambda^{H,cr}_2$ to be within a reasonable range of activity values, which in the context of the simulations described here might be up to 50.
The mean value of the λ inputs on the other hand influences how high the rates are of the output neurons. Another feature revealed by the analysis is how the parameters can be set to achieve asymptotically stable performance.
The analysis has interesting implications for understanding the operation of the backprojections that are important in top-down biased competition mechanisms of attention. Equation (5.54) and the associated results in Figure 3 show that, in order for the top-down critical value $\lambda^{H,cr}_2$ not to have to be too large, the backprojections $J_b$ must not be too weak. At the same time, $J_b$ must be less than $J_f$, so that perceptual bottom-up inputs can dominate neural processing, which must not be dominated by internally generated top-down signals. This leaves a relatively small region for $J_b/J_f$, perhaps between 0.3 and just below 1.0. However, this ratio must be kept fairly low so that the two systems being coupled in this way can operate with separate attractors at the bottom (L) and top (H) ends (Renart et al., 1999a,b).
Another interesting property of this top-down biased competition system elucidated by the analysis is that the operation of the system, including the effects of the top-down bias $\lambda^H_2$, is influenced especially by the difference $\delta\lambda$ between $\lambda_1$ and $\lambda_2$, rather than by their absolute values, as shown in (5.54) and illustrated in Figures 3, 4. This is similar to the operation of integrate-and-fire decision-making networks (Rolls and Deco, 2010; Deco et al., 2013; Rolls, 2016), with the similarity reflecting the way in which the competition between the nodes or attractors operates.
The key correspondence between the mathematical analysis and the numerical simulations is that the simulations show that the mathematical analysis very accurately specifies the exact value of the top-down bias that is needed. That is useful confirmation that the analysis accurately specifies the interactions between the parameters in the biased competition system. The simulations are additionally useful in illustrating the operation of the biased competition system investigated analytically.
In conclusion, the major advantage of the analytical approach brought to bear here for the first time on biased competition between cortical areas is that it discloses relations between all the parameters of the model, and helps to identify those values that yield the desired effect of biased competition. This task cannot be fulfilled purely by numerical simulations.

DATA AVAILABILITY
All datasets analyzed for this study are included in the manuscript and the supplementary files.

AUTHOR CONTRIBUTIONS
TT performed the mathematical analyses. ER performed the numerical simulations and assessments. ER and TT wrote the paper jointly.