REVIEW article

Front. Artif. Intell., 21 May 2021

Sec. Medicine and Public Health

Volume 4 - 2021 | https://doi.org/10.3389/frai.2021.530937

Neuronal Sequence Models for Bayesian Online Inference

  • Department of Psychology, Technische Universität Dresden, Dresden, Germany

Abstract

Various imaging and electrophysiological studies in a number of different species and brain regions have revealed that neuronal dynamics associated with diverse behavioral patterns and cognitive tasks take on a sequence-like structure, even when encoding stationary concepts. These neuronal sequences are characterized by robust and reproducible spatiotemporal activation patterns. This suggests that the role of neuronal sequences may be much more fundamental for brain function than is commonly believed. Furthermore, the idea that the brain is not simply a passive observer but an active predictor of its sensory input is supported by an enormous amount of evidence in fields as diverse as human ethology and physiology, besides neuroscience. Hence, a central aspect of this review is to illustrate how neuronal sequences can be understood as critical for probabilistic predictive information processing, and what dynamical principles can be used as generators of neuronal sequences. Moreover, since different lines of evidence from neuroscience and computational modeling suggest that the brain is organized in a functional hierarchy of time scales, we will also review how models based on sequence-generating principles can be embedded in such a hierarchy, to form a generative model for recognition and prediction of sensory input. We briefly introduce the Bayesian brain hypothesis as a prominent mathematical description of how online (i.e., fast) recognition and prediction may be computed by the brain. Finally, we briefly discuss some recent advances in machine learning, where spatiotemporally structured methods (akin to neuronal sequences) and hierarchical networks have independently been developed for a wide range of tasks.
We conclude that the investigation of specific dynamical and structural principles of sequential brain activity not only helps us understand how the brain processes information and generates predictions, but also informs us about neuroscientific principles potentially useful for designing more efficient artificial neuronal networks for machine learning tasks.

1. Introduction

In the neurosciences, one important experimental and theoretical finding of recent years was that many brain functions can be described as predictive (Rao and Ballard, 1999; Pastalkova et al., 2008; Friston and Kiebel, 2009; Aitchison and Lengyel, ). This means that the brain not only represents current states of the environment but also potential states of the future to adaptively select its actions and behavior. For such predictions, one important feature of neuronal dynamics is their often-observed sequence-like structure. In this review, we will present evidence that sequence-like structure in neuronal dynamics is found over a wide range of different experiments and different species. In addition, we will also review models for such sequence-like neuronal dynamics, which can be used as generative models for Bayesian inference to compute predictions. To familiarize readers of different backgrounds with each of these topics, we first briefly give an overview of the topics of sequences, predictions, hierarchical structure, the so-called Bayesian brain hypothesis and provide a more precise definition of the kind of sequence-like neuronal dynamics that we consider in this review.

1.1. Sequences in the Brain

The brain is constantly receiving spatiotemporally structured sensory input. This is most evident in the auditory domain where, when listening to human speech, the brain receives highly structured, sequential input in the form of phonemes, words, and sentences (Giraud and Poeppel, 2012). Furthermore, even in situations which apparently provide only static sensory input, the brain relies on spatiotemporally structured coding. For example, when observing a static visual scene, the eyes constantly perform high-frequency micro-oscillations and exploratory saccades (Martinez-Conde et al., 2004; Martinez-Conde, 2006), which renders the visual input spatiotemporally structured, and yet the visual percepts appear stationary. Another example is olfaction, where in animal experiments, it has been shown that neurons in the olfactory system respond to a stationary odor with an elaborate temporal coding scheme (Bazhenov et al., ; Jones et al., 2007). In the state space of those neurons, their activity followed a robust and reproducible trajectory, a neuronal sequence (see Table 1), which was specific to the presented odor. Similarly, in a behavioral experiment with monkeys, spatial information of an object was encoded by a dynamical neural code, although the encoded relative location of the object remained unchanged (Crowe et al., ). In other words, there is evidence that the brain recognizes both dynamic and static entities in our environment on the basis of sequence-like encoding.

Table 1

Neuronal sequence: Spatiotemporal patterns of neuronal activity that encode stimulus properties, abstract concepts, or motion signals (see Figure 1). Can be described by a specific, sequential trajectory in the so-called state space of the system, see also Figure 3 for an example.
State space/Phase space: A multidimensional space that encompasses all possible states a system can be in. Every possible state is defined by a unique point in the space.
Continuodiscrete dynamics/Trajectory: Reproducible spatiotemporal trajectories characterized by discrete points in state space (see Figure 3).
Winnerless Competition (WLC): Type of dynamic behavior of a system where the system shortly settles into a stable or metastable state before being forced away from it (by internal or external mechanisms) (see Figures 3, 6).
Metastable state/Saddle state: A state in the state space of a dynamical system. A metastable state of a system is stable in some directions and unstable in others. A saddle point is a metastable point where the first derivative vanishes.
Stable heteroclinic channel (SHC): Type of dynamic behavior of a system where the system goes through a succession of saddle points (metastable states) forming heteroclinic state-space trajectories (orbits). Importantly, small deviations from those trajectories will not diverge away from the heteroclinic orbit. See section 2.2.2.
Heteroclinic orbit/Trajectory: A path in the state space of a system that connects two equilibrium points.
Limit cycle: Attractor type occurring in some complex dynamical systems. Closed, continuous trajectory in state space with fixed period and amplitude. The regular firing behavior of neurons can be described by limit cycle behavior. See section 2.2.1.
Synfire chain: A feed-forward neuronal network architecture. See section 2.1.

Glossary.

Neuronal sequences have been reported in a wide range of experimental contexts. For example, in the hippocampus of mice and rats (MacDonald et al., 2011; Pastalkova et al., 2008; Bhalla, ; Skaggs and McNaughton, 1996; Dragoi and Tonegawa, ), the visual cortex of cats and rats (Kenet et al., 2003; Ji and Wilson, 2007), the somatosensory cortex of mice (Laboy-Juárez et al., 2019), the parietal cortex of monkeys and mice (Crowe et al., ; Harvey et al., 2012), the frontal cortex of monkeys (Seidemann et al., 1996; Abeles et al., ; Baeg et al., ), the gustatory cortex of rats (Jones et al., 2007), the locust antennal lobe (Bazhenov et al., ), specific song-related areas in the brain of songbirds (Hahnloser et al., 2002), and the amygdala of monkeys (Reitich-Stolero and Paz, 2019), among others. Even at the cellular level, there is evidence of sequence-processing capacities of single neurons (Branco et al., ). Neuronal sequences seem to serve a variety of different purposes. While sequences in specific brain regions drive the spatiotemporal motor patterns during behavior like birdsong rendition (Hahnloser et al., 2002) (Figure 1B), in other studies of different brain areas and different species, neuronal sequences were found to encode stationary stimuli (Seidemann et al., 1996; Bazhenov et al., ) and spatial information (Crowe et al., ), to represent past experience (Skaggs and McNaughton, 1996) (see also Figure 1A), and to be involved with both working memory and memory consolidation (MacDonald et al., 2011; Harvey et al., 2012; Skaggs and McNaughton, 1996). Behaviorally relevant neuronal sequences were reported to occur before the first execution of a task (Dragoi and Tonegawa, ), and in some behavioral tasks sequences were found to be predictive of future behavior (Abeles et al., ; Pastalkova et al., 2008).

Figure 1

As these findings show, neuronal sequences can be measured in different species, in different brain areas and at different levels of observation, where the expression of these sequences depends on the measurement and analysis method. A neuronal sequence can appear as the successive spiking of neurons (Figures 1A,B), or the succession of more abstract compound states (Figure 1C), or in yet different forms, depending on the experimental approach. For example, evidence for sequences can also be found with non-invasive cognitive neuroscience methods like magnetoencephalography (MEG) as shown in Figure 1D. Given these very different appearances of experimentally observed neuronal sequences, it is clear that an answer to the question of “What is a neuronal sequence?” depends on the experimental setup. In the context of this article, we understand a “neuronal sequence” quite broadly as any kind of robust and reproducible spatiotemporal trajectory, where stimulus properties, abstract concepts, or motion signals are described by a specific trajectory in the state space of the system (see Table 1). The brain may use such trajectory representations, whose experimental expressions are measured as neuronal sequences, to form a basis for encoding the spatiotemporal structure of sensory stimuli (Buonomano and Maass, ) and the statistical dependencies between past, present, and future (Friston and Buzsáki, 2016). Here, we will review evidence for this type of encoding and discuss some of the implications for our understanding of the brain's capacity to perform probabilistic inference, i.e., recognition based on spatiotemporally structured sensory input.

1.2. Hierarchies in the Brain

The brain's structure and function are often described with reference to a hierarchical organization, which we will cover in more detail in section 3.2. Human behavior can be described as a hierarchically structured process (Lashley and Jeffress, 1951; Rosenbaum et al., 2007; Dezfouli et al., ), as can memory, where the grouping of information-carrying elements into chunks constitutes a hierarchical scheme (Bousfield, ; Miller, 1956; Fonollosa et al., 2015). Similarly, the perception and recognition of spatiotemporally structured input can be regarded as a hierarchical process. For example, a percept such as the observation of a walking person can be regarded as a percept of higher order (“walking person”), as it emerges from the combination of simpler, lower-order percepts, e.g., a specific sequence of limb movements. Critically, the concept “someone walking” is represented at a slower time scale than the faster movements of individual limbs that constitute the walking. There is emerging evidence that the brain is structured and organized hierarchically along the relevant time scales of neuronal sequences (e.g., Murray et al., 2014; Hasson et al., 2008; Cocchi et al., ; Mattar et al., 2016; Gauthier et al., 2012; Kiebel et al., 2008). Such a hierarchy allows the brain to model the causal structure of its sensory input and form predictions at slower time scales (“someone walking”) by representing trajectories capturing the dynamics of its expected spatiotemporal sensory input at different time scales, and by representing causal dependencies between time scales. This allows for inference about the causes of sensory input in the environment, as well as for inference of the brain's own control signals (e.g., motor actions). In this paper, we will review some of the experimental evidence and potential computational models for sequence generation and inference.

In section 1.3, we will first give a short introduction to the Bayesian brain hypothesis and the basic concept of the brain as a predictor of its environment. In section 1.4 we will go into more detail about the question “What is a sequence?” and will further discuss the trajectory representation. In section 2, we will provide an overview of several dynamical principles that might underlie the generation of neuronal trajectories in biological networks. Importantly, we are going to focus on general dynamical network principles that may underlie sequence generation, and which may differentiate types of sequence-generating networks. We are therefore not going to cover the vast field of sequence learning (e.g., Sussillo and Abbott, 2009; Tully et al., 2016; Lipton et al., 2015; Wörgötter and Porr, 2005), which mainly investigates neurobiologically plausible learning rules and algorithms that can lead to neuronal sequences, and thus possibly to the network types discussed in this article. In section 3, we review some approaches in which sequences are used to model recognition of sensory input. To highlight the relevance of sequence generators to a large variety of problems, we will visit methods and advances in computer science and machine learning, where structured artificial recurrent neural networks (RNNs) that are able to generate spatiotemporal activity patterns are used to perform a range of different computational tasks. This will, however, only serve as a rough and incomplete overview of some common machine learning methods, and we will not cover methods like Markov Decision Processes (Feinberg and Shwartz, 2012) and related approaches, as an overview of research on sequential decision making is beyond the scope of this review. Finally, we will briefly discuss functional hierarchies in the brain and in machine learning applications. A glossary of technical terms that we will use in the review can be found in Table 1.

1.3. The Bayesian Brain Hypothesis

Dating back to Hermann von Helmholtz in the 19th century, the idea that the brain performs statistical inference on its sensory input to infer the underlying probable causes of that same input (Helmholtz, 1867), started gaining considerable traction toward the end of the 20th century and had a strong influence on both computer science and neuroscience (Hinton and Sejnowski, 1983; Dayan et al., ; Wolpert et al., 1995; Friston, 2005; Friston et al., 2006; Beck et al., ; see also Rao and Ballard, 1999; Ernst and Banks, 2002; Körding and Wolpert, 2004). In particular, research into this interpretation of brain function led to the formulation of the Bayesian brain hypothesis (Knill and Pouget, 2004; Doya et al., ; Friston, 2010). The Bayesian brain hypothesis posits that aspects of brain function can be described as equivalent to Bayesian inference based on a causal generative model of the world, which models the statistical and causal regularities of the environment. In this framework, recognition is modeled as Bayesian inversion of the generative model, which assigns probabilities, that is, beliefs to different states of the world based on perceived sensory information. This process of Bayesian inference is hypothesized to be an appropriate basis for the mathematical description of most, if not all, brain functions (Friston, 2010; Knill and Pouget, 2004). 
Although the hypothesis that the brain is governed by Bayesian principles has met with criticism since human behavior does not always appear to be Bayes-optimal (Rahnev and Denison, 2018; Soltani et al., 2016), and because the definition of Bayes-optimality can be ambiguous (Colombo and Seriès, ), there is growing evidence that human behavior can indeed be explained by Bayesian principles (Figure 2) (Ernst and Banks, 2002; Körding and Wolpert, 2004; Weiss et al., 2002; Feldman, 2001), and that even phenomena like mental disorders might be explained by Bayesian mechanisms (Adams et al., ; Leptourgos et al., 2017; Fletcher and Frith, 2009) (see Knill and Pouget, 2004 and Clark, for reviews on the Bayesian brain hypothesis). How Bayesian inference is achieved in the human brain is an ongoing debate, and it has been proposed that the corresponding probabilities are encoded on a population level (Zemel et al., 1998; Beck et al., ) or on single-neuron level (Deneve, ).

Figure 2


Under the Bayesian view, model inversion, i.e., recognition, satisfies Bayes' theorem, which states that the optimal posterior belief about a state is proportional to the generative model's prior expectation about the state multiplied by the probability of the sensory evidence under the generative model. In Bayesian inference, prior expectation, posterior belief, and sensory evidence are represented as probability distributions and accordingly called prior distribution, posterior distribution, and likelihood (Figure 2). The posterior can be regarded as an updated version of the prior distribution, and will act as the prior in the next inference step. Importantly, the prior is part of the generative model as different priors could lead to qualitatively different expectations (Gelman et al., 2017).
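
The recursive use of Bayes' theorem described above, where each posterior serves as the prior of the next inference step, can be illustrated with a minimal Python sketch. The two-state world, the observation labels "x" and "y", and all likelihood values are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
def bayes_update(prior, likelihood):
    """Posterior is proportional to prior times likelihood; renormalize to sum to 1."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    return [u / evidence for u in unnormalized]


# Hypothetical generative model: LIKELIHOODS[obs][state] = p(obs | state)
# for two hidden states of the world (state 0 and state 1).
LIKELIHOODS = {
    "x": [0.8, 0.3],
    "y": [0.2, 0.7],
}


def infer_online(observations, prior=(0.5, 0.5)):
    """Update the belief after every observation; the posterior becomes the next prior."""
    belief = list(prior)
    history = [belief]
    for obs in observations:
        belief = bayes_update(belief, LIKELIHOODS[obs])
        history.append(belief)
    return history


beliefs = infer_online(["x", "x", "y"])
for step, belief in enumerate(beliefs):
    print(step, [round(b, 3) for b in belief])
```

In this toy setting, two observations of "x" shift the belief toward state 0, and the subsequent "y" partially reverses that shift without discarding the accumulated evidence.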

The quality of the inference, that is, the quality of the belief about the hidden states of the world, is dependent on the quality of the agent's generative model, and the appropriateness of a tractable (approximate) inference scheme. In this review paper, we suggest that good generative models of our typical environment should generate, that is, expect sequences, and that such a sequence-like representation of environmental dynamics is used to robustly perform tractable inference on spatiotemporally structured sensory data.

The theory of predictive coding suggests that the equivalent of an inversion of the generative model in the cortex is achieved in a hierarchical manner by error-detecting neurons which encode the difference between top-down predictions and sensory input (Friston and Kiebel, 2009; Rao and Ballard, 1999; Aitchison and Lengyel, ) (Figure 2). The fact that sequences in specific contexts appear to have predictive properties (Abeles et al., ; Pastalkova et al., 2008) is interesting in light of possible combinations of the frameworks of predictive coding and the Bayesian brain hypothesis (Knill and Pouget, 2004; Doya et al., ; Friston, 2010). One intriguing idea is that the brain's internal representations and predictions rely on sequences of neuronal activity (FitzGerald et al., 2017; Kiebel et al., 2009; Hawkins et al., 2009). Importantly, empirical evidence suggests that these approximate representations are structured in temporal and functional hierarchies (see sections 1.2 and 3.2) (Koechlin et al., 2003; Giese and Poggio, 2003; Botvinick, ; Badre, ; Fuster, 2004). Combining the Bayesian brain hypothesis with the hierarchical aspect of predictive coding provides a theoretical basis for computational mechanisms that drive a lifelong learning of the causal model of the world (Friston et al., 2014). Examples for how these different frameworks can be combined can be found in Yildiz and Kiebel (2011) and Yildiz et al. (2013).
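
The idea that error signals between top-down predictions and sensory input can drive the inversion of a generative model can be sketched for the simplest possible case. The following Python example assumes a single hidden cause, an identity mapping from cause to sensory input, and Gaussian prior and sensory noise; these are simplifying assumptions made here for illustration and do not reproduce the hierarchical models of the cited papers. All function and parameter names are hypothetical.

```python
def predictive_coding_estimate(u, prior_mean, var_u, var_p, dt=0.01, steps=5000):
    """Estimate a hidden cause by descending on precision-weighted prediction errors."""
    mu = prior_mean  # start at the top-down expectation
    for _ in range(steps):
        err_sensory = (u - mu) / var_u        # sensory input minus top-down prediction
        err_prior = (mu - prior_mean) / var_p  # estimate minus prior expectation
        mu += dt * (err_sensory - err_prior)   # the two errors push in opposite directions
    return mu


def exact_posterior_mean(u, prior_mean, var_u, var_p):
    """Closed-form posterior mean of the same linear Gaussian model, for comparison."""
    return (u / var_u + prior_mean / var_p) / (1.0 / var_u + 1.0 / var_p)


estimate = predictive_coding_estimate(u=2.0, prior_mean=0.0, var_u=1.0, var_p=1.0)
print(estimate, exact_posterior_mean(2.0, 0.0, 1.0, 1.0))
```

Under these assumptions the error-driven dynamics settle at the precision-weighted average of prior expectation and sensory input, which is the Bayesian posterior mean for this model.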

As an example of a tight connection between prediction and sequences, one study investigating the electrophysiological responses in the song nucleus HVC of the Bengalese finch (Bouchard and Brainard, ) found evidence for an internal prediction of upcoming song syllables, based on sequential neuronal activity in HVC. As another example, a different study investigating single-cell recordings of neurons in the rat hippocampus found that sequences of neuronal activations during wheel-running between maze runs were predictive of the future behavior of the rats, including errors (Pastalkova et al., 2008). This finding falls in line with other studies showing that hippocampal sequences can correlate with future behavior (Pfeiffer, 2020).

1.4. What Are Sequences?

What does it mean to refer to neuronal activity as sequential? In the most common sense of the word, a sequence is usually understood as the serial succession of discrete elements or states. Likewise, when thinking of sequences, most people intuitively think of examples like “A, B, C,…” or “1, 2, 3,….” However, when extending this discrete concept to neuronal sequences, there are only a few compelling examples where spike activity is readily interpretable as a discrete sequence, like the “domino-chain” activation observed in the birdbrain nucleus HVC (Hahnloser et al., 2002) (Figure 1B). As mentioned before, we will use the word “sequence” to describe robust and reproducible spatiotemporal trajectories, which encode information to be processed or represented. Apart from the overwhelming body of literature reporting sequences in many different experimental settings (section 1.1), particularly interesting are the hippocampus (Bhalla, ; Pfeiffer, 2020) and entorhinal cortex (Zutshi et al., 2017; O'Neill et al., 2017). Due to the strong involvement of the hippocampus and the entorhinal cortex with sequences, the idea that neuronal sequences are also used in brain areas directly connected to them is not too far-fetched. For example, hippocampal-cortical interactions are characterized by sharp wave ripples (Buzsáki, ), which are effectively compressed spike sequences. Recent findings suggest that other cortical areas connected to the hippocampus use grid-cell like representations similar to space representation in the entorhinal cortex (Constantinescu et al., ; Stachenfeld et al., 2017). This is noteworthy because grid cells have been linked to sequence-like information processing (Zutshi et al., 2017; O'Neill et al., 2017). This suggests that at least areas connected to the hippocampus and entorhinal cortex are able to decode neuronal sequences.

The example of odor recognition shows that sequences are present even in circumstances where one intuitively would not expect them (Figure 1C). This very example also shows an interesting gap between a continuous and a discrete type of representation: The spatiotemporal trajectory is of a continuous nature, while the representation of the odor identity is characterized by discrete states and at a slower time scale. This gap also presents itself on another level. While we understand the term “neuronal sequence” to refer to a robust and reproducible spatiotemporal trajectory, in many cases these continuous state-space trajectories appear as a succession of quasi-discrete states (Abeles et al., ; Seidemann et al., 1996; Mazor and Laurent, 2005; Jones et al., 2007). To emphasize this interplay between continuous dynamics and discrete points, we will denote such dynamics as continuodiscrete (see Table 1). In continuodiscrete dynamics, robust and reproducible spatiotemporal trajectories are characterized by discrete points in state space. As an example, in Figure 1C one can see the response of in vivo neurons in the gustatory cortex of rats, which is determined by the odor that is presented to the animal. The activity patterns of the neurons were analyzed with a hidden Markov model, which revealed that the activity of the neuron ensemble can be described as a robust succession of discrete Markov states, where the system remains in a state for hundreds of milliseconds before quickly switching to another discrete state. These sequential visits to discrete states and the continuous expression of these states, specifically the switching between them, in terms of fast neuronal dynamics (here spiking neurons) is what we consider as continuodiscrete dynamics. Similar observations have been made in other experiments (Abeles et al., ; Seidemann et al., 1996; Mazor and Laurent, 2005; Rabinovich et al., 2001; Rivera et al., 2015) (see also Figure 3).
The discrete states of a continuodiscrete sequence can be for example stable fixed points (Gros, 2009), or saddle points (Rabinovich et al., 2006, 2001) of the system, or simply points along a limit cycle trajectory (Yildiz and Kiebel, 2011; Yildiz et al., 2013), depending on the modeling approach (see section 2). Depending on the dynamical model, the system might leave a fixed point due to autonomously induced destabilization (Gros, 2007, 2009), noise (Rabinovich et al., 2006, 2001), or external input (Kurikawa and Kaneko, 2015; Toutounji and Pipa, 2014; Rivera et al., 2015; Hopfield, 1982).

Figure 3

Concepts similar to continuodiscrete trajectories have been introduced before. For example, in winnerless competition (WLC) (Rabinovich et al., 2000; Afraimovich et al., ; Rabinovich et al., 2008), a system moves from one discrete metastable fixed point (see Table 1) of the state space to the next, never settling into any state, similar to the fluctuations in a Lotka-Volterra system (Rabinovich et al., 2001) (see Figure 3). In winner-take-all (WTA) dynamics, like during memory recall in a Hopfield network (Hopfield, 1982), the system is attracted to one fixed point in which it will settle. Both WLC and WTA are thus examples of continuodiscrete dynamics. The concept of continuodiscrete dynamics also allows for dynamics which are characterized by an initial alternation between discrete states, before settling into a final state, as for example in Rivera et al. (2015). In section 2, we will look at different ways to model continuodiscrete neuronal dynamics.
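
The winner-take-all case, in which the system is attracted to one fixed point and settles there, can be illustrated with a very small Hopfield-style network. The Python sketch below assumes binary (+1/-1) units, a Hebbian outer-product learning rule, and asynchronous updates in a fixed order; the two stored patterns and the network size are hypothetical and chosen only so that the example is easy to follow.

```python
def train_hebbian(patterns):
    """Build symmetric weights from +1/-1 patterns with the Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until no unit changes, i.e., a fixed point is reached."""
    s = list(state)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * s[j] for j in range(n))
            new_value = s[i] if field == 0 else (1 if field > 0 else -1)
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s


# Two hypothetical, mutually orthogonal memory patterns.
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]
WEIGHTS = train_hebbian([PATTERN_A, PATTERN_B])

noisy_cue = [-1] + PATTERN_A[1:]  # PATTERN_A with its first unit flipped
recalled = recall(WEIGHTS, noisy_cue)
print(recalled == PATTERN_A)
```

Starting from the corrupted cue, the network state moves to the stored pattern and then stops changing, which is the settling behavior that distinguishes WTA from WLC dynamics.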

For the brain, representing continuodiscrete trajectories seems to combine the best of two worlds: Firstly, the representation of discrete points forms the basis for the generalization and categorization of the sequence. For example, for the categorization of a specific movement sequence, it is not necessary to consider all the details of the sensory input, as it is sufficient to categorize the sequence type (dancing, walking, running) by recognizing the sequence of discrete points, as e.g., in Giese and Poggio (2003). Secondly, the brain requires a way of representing continuous dynamics to not miss important details. This is because, as is often the case in our environment, key information can only be inferred from subtle variations within a sequence. For instance, when someone is talking, most of the speech content, i.e., what is being said, is represented by discrete points that describe a sequence of specific vocal tract postures. Additionally, there are subtle variations in the exact expression of these discrete points and the continuous dynamics connecting them, which let us infer otherwise hidden states like the emotional state of the speaker (Birkholz et al., ; Kotz et al., 2003; Schmidt et al., 2006). Some of these subtle variations in the sensory input may be of importance to the brain, while others are not. For example, when listening to someone speaking, slight variations in the speaker's talking speed or pitch of voice might give hints about her mood, state of health, or hidden intentions. In other words, representing sensory input as continuodiscrete trajectories enables the recognition of invariances of the underlying movements without losing details.

There is growing evidence that sequences with discrete states like fixed points are a fundamental feature of cognitive and perceptual representations (e.g., Abeles et al., ; Seidemann et al., 1996; Mazor and Laurent, 2005; Jones et al., 2007). This feature may be at the heart of several findings in the cognitive sciences which suggest that human perception is chunked into discrete states, see VanRullen and Koch (2003) for some insightful examples. Assuming that the brain uses some form of continuodiscrete dynamics to model sensory input, we will next consider neuronal sequence-generating mechanisms that may implement such dynamics and act as a generative model for recognition of sensory input. Importantly, as we are interested in generative models of sequential sensory input, we will only consider models that have the ability to autonomously generate sequential activity. Therefore, we are not going to discuss models where sequential activity is driven by sequential external input, as in models of non-autonomous neural networks (Toutounji and Pipa, 2014), or in models where intrinsic sequential neural activity is disrupted by bifurcation-inducing external input (Kurikawa and Kaneko, 2015).

2. Neuronal Network Models as Sequence Generators

In order to explain sequential neuronal activity in networks of biological neurons, several models have been proposed, some of which we are going to review in the following sections. As this paper aims at a general overview of neuronal sequence-generating mechanisms and less at a detailed analysis, we will not cover the details and nuances of the presented dynamical models and refer the interested reader to the references given in the text.

2.1. Synfire Chains

Synfire chains are concatenated groups of excitatory neurons with convergent-divergent feed-forward connectivity, as illustrated in Figure 4A (Abeles, ; Diesmann et al., ). Synchronous activation of one group leads to the activation of the subsequent group in the chain after one synaptic delay (Figure 4B). It has been shown that the only stable operating mode in synfire chains is the synchronous mode where all neurons of a group spike in synchrony (Litvak et al., 2003). Synfire chains create sequences that are temporally highly precise (Abeles, ; Diesmann et al., ). Such temporally precise sequences have been observed in slices of the mouse primary visual cortex and in V1 of anaesthetized cats (Ikegaya et al., 2004), as well as in the HVC nucleus of the bird brain during song production (Hahnloser et al., 2002; Long et al., 2010), and in the frontal cortex of behaving monkeys (Prut et al., 1998; Abeles and Gat, ). While synfire chains make predictions that agree well with these observations, a striking mismatch between synfire chains and neuronal networks in the brain is the absence of recurrent connections in the synfire chain's feed-forward architecture. Modeling studies have shown that sequential activation similar to synfire chain activity can be achieved by changing a small fraction of the connections in a random neural network (Rajan et al., 2016; Chenkov et al., ), and that synfire chains can emerge in self-organizing recurrent neural networks under the influence of multiple interacting plasticity mechanisms (Zheng and Triesch, 2014). Such fractional changes of network connections were used to implement working memory (Rajan et al., 2016) or give a possible explanation for the occurrence of memory replay after one-shot learning (Chenkov et al., ). Such internally generated sequences have been proposed as a mechanism for memory consolidation, among other things (see Pezzulo et al., 2014 for a review).
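
The propagation of a synchronous volley through a feed-forward chain can be sketched with a strongly simplified model. The Python example below assumes binary threshold units, all-to-all connections between consecutive groups, and one time step per synaptic delay; real synfire chains use spiking neurons and partial convergent-divergent connectivity, so this is only an abstraction, and all parameter values are hypothetical.

```python
def run_synfire_chain(n_groups, group_size, threshold, n_initial, n_steps):
    """Return, for every time step, the number of active neurons in each group."""
    active = [0] * n_groups
    active[0] = n_initial  # size of the initial volley in the first group
    history = [list(active)]
    for _ in range(n_steps):
        next_active = [0] * n_groups
        for g in range(1, n_groups):
            # Every neuron in group g receives input from all neurons in group g - 1,
            # so either the whole group crosses threshold together or none of it does.
            if active[g - 1] >= threshold:
                next_active[g] = group_size
        active = next_active
        history.append(list(active))
    return history


strong = run_synfire_chain(n_groups=5, group_size=10, threshold=6, n_initial=7, n_steps=4)
weak = run_synfire_chain(n_groups=5, group_size=10, threshold=6, n_initial=3, n_steps=4)
for step, (s, w) in enumerate(zip(strong, weak)):
    print(step, s, w)
```

In this abstraction, a sufficiently large initial volley is restored to a fully synchronous one and advances by one group per time step, whereas a volley below threshold dies out after the first group.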

Figure 4

2.2. Attractor Networks

2.2.1. Limit Cycles

Limit cycles are stable attractors in the phase space of a system, and they occur in practically every physical domain (Strogatz, 2018). A limit cycle is a closed trajectory, with fixed period and amplitude (Figure 5). Limit cycles occur frequently in biological and other dynamical systems, and the beating of the heart, or the periodic firing of a pacemaker neuron are examples of limit cycle behavior (Strogatz, 2018). They are of great interest to theoretical neuroscience, as periodic spiking activity can be represented by limit cycles, both on single-cell level (Izhikevich, 2007) and population level (Berry and Quoy, ; Jouffroy, 2007; Mi et al., 2017). They also play an important role in the emulation of human motion in robotics. While there are numerous ways to model human motion, one interesting approach is that of dynamic motion primitives (DMPs) (Schaal et al., 2007), which elegantly unifies the two different kinds of human motion, rhythmic and non-rhythmic motion, in one framework. The main idea of DMPs is that the limbs move as if they were pulled toward an attractor state. In the case of rhythmic motion, the attractor is given by a limit cycle, while in the case of motion strokes the attractor is a discrete point in space (Schaal et al., 2007). In Kiebel et al. (2009), Yildiz and Kiebel (2011), and Yildiz et al. (2013), the authors used a hierarchical generative model of sequence-generators based on limit cycles to model the generation and perception of birdsong and human speech.
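
The defining property of a limit cycle, namely that nearby trajectories converge to a closed orbit with fixed amplitude and period, can be illustrated with a standard two-dimensional oscillator. The Python sketch below uses the normal form of a supercritical Hopf bifurcation, integrated with a simple Euler scheme; this system is chosen here purely for illustration and is not the model used in the cited studies. The function name and parameter values are assumptions.

```python
import math


def simulate_hopf(x0, y0, omega=1.0, dt=0.001, n_steps=20000):
    """Integrate dx/dt = x(1 - r^2) - omega*y, dy/dt = y(1 - r^2) + omega*x.

    Returns the final distance from the origin. The unit circle (r = 1) is the
    limit cycle of this system; omega sets the angular frequency.
    """
    x, y = x0, y0
    for _ in range(n_steps):
        r2 = x * x + y * y
        dx = x * (1.0 - r2) - omega * y
        dy = y * (1.0 - r2) + omega * x
        x, y = x + dt * dx, y + dt * dy
    return math.hypot(x, y)


radius_from_inside = simulate_hopf(0.1, 0.0)   # starts inside the cycle
radius_from_outside = simulate_hopf(2.0, 0.0)  # starts outside the cycle
print(radius_from_inside, radius_from_outside)
```

Trajectories started inside and outside the cycle both approach a radius close to 1, up to a small discretization error of the Euler scheme.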

Figure 5

2.2.2. Heteroclinic Trajectories

Another approach to modeling continuodiscrete dynamics is provided by heteroclinic networks (Ashwin and Timme, 2005; Rabinovich et al., 2008) (see also Table 1). A heteroclinic network is a dynamical system with semi-stable states (saddle points) which are connected by invariant manifolds, so-called heteroclinic connections. Networks of coupled oscillators have been shown to give rise to phenomena like heteroclinic cycles (Ashwin and Swift, 1992; Ashwin et al., 2007). It has therefore been proposed that neuronal networks exhibit such heteroclinic behavior as well, which has been demonstrated in simulations of networks of globally coupled Hodgkin-Huxley neurons (Hansel et al., 1993a,b; Ashwin and Borresen, ). Interestingly, heteroclinic networks can be harnessed to perform computational tasks (Ashwin and Borresen, ; Neves and Timme, 2012), and it has been shown that it is possible to implement any logic operation within such a network (Neves and Timme, 2012). Furthermore, the itinerancy in a heteroclinic network can be guided by external input, where the trajectory of fixed points discriminates between different inputs (Ashwin et al., 2007; Neves and Timme, 2012), which means that different inputs are encoded by different trajectories in phase space.

While theoretical neuroscience has progressed with research on heteroclinic behavior of coupled neural systems, direct biological evidence is still sparse, since testing for such behavior requires a detailed and often complex mathematical model, which typically lies beyond the more directly accessible research questions in biological science. Despite this, heteroclinic behavior has been shown to reproduce findings from single-cell recordings in insect olfaction (Rabinovich et al., 2001; Rivera et al., 2015) and olfactory bulb electroencephalography (EEG) in rabbits (Breakspear, 2001). Another study replicated the chaotic hunting behavior of a marine mollusk based on an anatomically plausible neuronal model with heteroclinic winnerless competition (WLC) dynamics (Varona et al., 2002), which is closely related to the dynamic alternation between states in a heteroclinic network (Rabinovich et al., 2000; Afraimovich et al., ; Rabinovich et al., 2008). WLC was proposed as a general information processing principle for dynamical networks and is characterized by dynamic switching between network states, where the switching behavior is based on external input (Afraimovich et al., ) (see Table 1). Importantly, the traveled trajectory identifies the received input, while any single state of the trajectory generally does not (see, for example, Neves and Timme, 2012). In phase space representation, WLC can be achieved by open or closed sequences of heteroclinically concatenated saddle points. Such sequences are termed stable heteroclinic sequences (SHS) if the heteroclinic connections are dissipative, i.e., when a trajectory starting in a neighborhood close to the sequence remains close (Afraimovich et al., ). While perturbations and external forcing can destroy stable heteroclinic sequences, it can be shown that even under such adverse circumstances, in many neurobiologically relevant situations the general sequential behavior of the system is preserved (Rabinovich et al., 2006). 
Such behavior is described by the concept of Stable Heteroclinic Channels (SHC) (see Figure 3 and Table 1) (Rabinovich et al., 2006). A simple implementation of SHCs is based on the generalized Lotka-Volterra equations (Bick and Rabinovich, 2010; Rabinovich et al., 2001), which are a type of recurrent neural network implicitly implementing the WLC concept. The temporal precision of a system that evolves along an SHC is determined by the noise level as well as the eigenvalues of the invariant directions of the saddle points. Therefore, sequences along heteroclinic trajectories are reproducible, although the exact timing of the sequence elements may be subject to fluctuation.
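A minimal sketch of sequential switching in a generalized Lotka-Volterra system is given below. It uses three units with asymmetric mutual inhibition; the specific parameter values (`alpha`, `beta`), the small constant input `eps` that stands in for noise, and the function name are assumptions chosen for illustration and are not taken from the cited studies.

```python
def simulate_glv(alpha=2.0, beta=0.5, eps=1e-4, dt=0.01, t_max=200.0,
                 x0=(0.6, 0.3, 0.1)):
    """Euler-integrate a 3-unit generalized Lotka-Volterra system.

    Returns the order in which units become the most active one.
    """
    # rho[i][j]: inhibition of unit i by unit j. The asymmetry
    # (alpha > 1 > beta) makes each single-unit state a saddle that
    # is left in favour of exactly one successor.
    rho = [[1.0, alpha, beta],
           [beta, 1.0, alpha],
           [alpha, beta, 1.0]]
    x = list(x0)
    visit_order = []
    for _ in range(int(t_max / dt)):
        winner = max(range(3), key=lambda i: x[i])
        if not visit_order or visit_order[-1] != winner:
            visit_order.append(winner)
        # eps is a small constant drive (an illustrative stand-in for noise)
        # that keeps the trajectory from sticking to a saddle forever.
        dx = [x[i] * (1.0 - sum(rho[i][j] * x[j] for j in range(3))) + eps
              for i in range(3)]
        x = [x[i] + dt * dx[i] for i in range(3)]
    return visit_order


visit_order = simulate_glv()
print(visit_order)
```

The order in which the units take over is fixed by the inhibition matrix, while the time spent near each saddle depends on `eps`. This mirrors the point made above: the sequence is reproducible, but its exact timing is sensitive to the noise level.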

In a similar approach, recent theoretical work on the behavior of RNNs has introduced the concept of excitable network attractors, which are characterized by stable states of a system connected by excitable connections (Ceni et al., 2019). The conceptual idea of orbits between fixed points may further be implemented in different ways. For instance, transient activation of neuronal clusters can be achieved by autonomously driven destabilization of stable fixed points (Gros, 2007, 2009).

2.3. Hierarchical Sequence Generators

As briefly introduced in section 1.2, growing evidence suggests that the brain is organized into a hierarchy of different time scales, which enables the representation of different temporal features in its sensory input (e.g., Murray et al., 2014; Hasson et al., 2008; Cocchi et al., 2016; Mattar et al., 2016; Gauthier et al., 2012). Here the idea is that lower levels represent dynamics at faster time scales, which are integrated at higher levels that represent slower time scales. For example, speech consists of phonemes (fast time scales), which are integrated into increasingly slower representations of syllables, words, sentences, and a conversation (Hasson et al., 2008; Ding et al., 2016; Boemio et al., 2005). The combination of this hierarchical aspect of brain function with the Bayesian brain hypothesis and the concept of neuronal sequences suggests that the brain implicitly uses hierarchical continuodiscrete dynamical systems as generative models. One illustrative example of a hierarchical continuodiscrete process is given in Figure 6. In this example, the dynamics of the 2nd and 3rd level of the hierarchy are modeled by limit cycles and govern the evolution of parameters of the sequence-generating mechanisms at the levels below. Such an approach for a generative model for prediction and recognition of sensory data has been used to model birdsong and human speech recognition (Yildiz and Kiebel, 2011; Yildiz et al., 2013; Kiebel et al., 2009) (see Figure 6). In Yildiz and Kiebel (2011), the 3rd level represented sequential neuronal activity in area HVC (proper name, see also Figure 1B), and the 2nd level modeled activity in the robust nucleus of the arcopallium (RA). Similarly, in Rivera et al. (2015) the authors employed a hierarchical generative model with a heteroclinic sequence as the sequence-generating mechanism to model odor recognition in the insect brain. 
In a slightly different approach to hierarchical continuodiscrete modeling, hierarchical SHCs, implementing winnerless competition, were used to demonstrate how chunking of information can emerge, similar to memory representation in the brain (Fonollosa et al., 2015). One computational study provided a proof of principle that complex behavior, like handwriting, can be decomposed into a hierarchical organization of stereotyped dynamical flows on manifolds of lower dimensions (Perdikis et al., 2011). These stereotyped dynamics can be regarded as the discrete points in a continuodiscrete sequence, which gave rise to complex and flexible behavior.

Figure 6

In the following section, we will briefly review how sequential methods have been used for problems in neuroscience and especially AI. Afterwards, we will review evidence for the organization of neuronal sequences into a hierarchy of time scales.

3. Recognition of Sequences

Although neuronal sequence models, such as the ones introduced in the preceding sections, have been used to explain experimentally observed neuronal activity, these models by themselves do not explain how predictions are formed about the future trajectory of a sequence. To take the example of song production and recognition in songbirds, a sequence-generating model of birdsong generation is not sufficient to model or explain how a listening bird recognizes a song (Yildiz and Kiebel, 2011). Given a generative model, recognition of a song corresponds to statistical model inversion (Watzenig, 2007; Ulrych et al., 2001). A simple example of such a scheme is provided in Bitzer and Kiebel (2012), where RNNs are used as a generative model such that model inversion provides an online recognition model. As shown in Friston et al. (2011), one can also place such a generative model into the active inference framework to derive a model that not only recognizes sequential movements from visual input but also generates continuodiscrete movement patterns. Generative models are not only interesting from a cognitive neuroscience perspective but also point to an interest shared with the field of artificial intelligence, and specifically machine learning: to find a mechanistic understanding of how spatiotemporally structured sensory input can be recognized by an artificial or a biological agent. In the following, we will discuss how both fields seem to converge on the conceptual idea that generative models should be spatiotemporally structured and hierarchical.

3.1. Sequence Recognition in Machine Learning

The most widely used models for discrete sequence generation are hidden Markov models (HMMs) and their time-dependent generalisation, hidden semi-Markov models (HSMMs) (Yu, 2015). In particular, HMMs and HSMMs are standard tools in a wide range of applications concerned with, e.g., speech recognition (Liu et al., 2018; Zen et al., 2004; Deng et al., 2006) and activity recognition (Duong et al., 2005). Furthermore, they have often been used for the analysis of neuronal activity (Tokdar et al., 2010) and human behavior in general (Eldar et al., 2011). Similar to HSMMs, artificial RNNs are used in machine learning for classifying and predicting time series data. When training a generic RNN for prediction and classification of time series data, one faces various challenges, most notably incorporating information about long-term dependencies in the data. To address these dependencies, specific RNN architectures have been proposed, such as long short-term memory (LSTM) networks (Gers et al., 1999) and gated recurrent units (GRU) (Chung et al., 2014). In a common LSTM network, in addition to the output variable, the network computes an internal memory variable. This endows the network with high flexibility. LSTM networks are among the most successful and most widely applied RNN architectures, with applications in virtually every field involving time-series data, or any data structure with long-range dependencies (Yu et al., 2019; LeCun et al., 2015). Another RNN approach is reservoir computing (RC), which started with the development of echo-state networks and liquid state machines in the early 2000s (Lukoševičius et al., 2012; Jaeger, 2001; Maass et al., 2002). In RC, sequential input is fed to one or more input neurons. Those neurons are connected with a reservoir of randomly connected neurons, which in turn are connected to one or more output neurons. 
Connections in the reservoir are pseudo-randomized to elicit dynamics at the edge of chaos (Yildiz et al., 2012), leading to a spatiotemporal network response in the form of reverberations over multiple time scales. RC networks have successfully been applied in almost every field of machine learning and data science, such as speech recognition, handwriting recognition, robot motor control, and financial forecasting (Lukoševičius et al., 2012; Tanaka et al., 2019).
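The role of the internal memory variable in an LSTM network, mentioned above, can be made concrete with a single scalar LSTM cell. The sketch below uses the standard gate equations with hand-set rather than learned weights; the parameter dictionary and its values are assumptions chosen only to show how a forget gate close to one lets the cell retain information over many time steps.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, p):
    """One step of a scalar LSTM cell with input, forget and output gates."""
    i = sigmoid(p["w_i"] * x + p["u_i"] * h + p["b_i"])    # input gate
    f = sigmoid(p["w_f"] * x + p["u_f"] * h + p["b_f"])    # forget gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h + p["b_g"])  # candidate value
    c_new = f * c + i * g           # internal memory variable
    h_new = o * math.tanh(c_new)    # output variable
    return h_new, c_new


def run_cell(inputs, params):
    """Feed a sequence through the cell and return the final memory value."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, params)
    return c


# Hand-set (not learned) weights: the input gate opens only for x = 1.
base = {"w_i": 10.0, "u_i": 0.0, "b_i": -5.0,
        "w_f": 0.0, "u_f": 0.0, "b_f": 5.0,
        "w_o": 0.0, "u_o": 0.0, "b_o": 5.0,
        "w_g": 2.0, "u_g": 0.0, "b_g": 0.0}
leaky = dict(base, b_f=0.0)  # forget gate fixed at 0.5: memory decays fast

pulse = [1.0] + [0.0] * 20   # one input pulse followed by silence
c_retained = run_cell(pulse, base)
c_forgotten = run_cell(pulse, leaky)
print(c_retained, c_forgotten)
```

With the forget gate near one, most of the pulse is still present in the memory variable after twenty silent steps, whereas with a forget gate of 0.5 it has essentially vanished. In a trained network these gate values are of course functions of the input and the previous output rather than constants.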

While there is a lot of research on neurobiologically plausible learning paradigms for RNNs (Sussillo and Abbott, 2009; Miconi, 2017; Taherkhani et al., 2020), one possible approach for understanding the role of neuronal sequences is to use neurobiologically more plausible sequence generation models, which can act as generative models of the causal dynamic relationships in the environment. A natural application would be the development of recognition models based on Bayesian inference (Bitzer and Kiebel, 2012), and more specifically in terms of variational inference (Friston et al., 2006; Daunizeau et al., 2009).
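As a minimal illustration of recognition as the inversion of a sequence-generating model, the following sketch runs the standard forward (filtering) recursion of a discrete hidden Markov model. It is much simpler than the RNN-based and variational schemes cited above and is not meant to reproduce them; the three-state generator, its transition and emission probabilities, and the function name are all hypothetical.

```python
def forward_filter(observations, transition, emission, prior):
    """Online Bayesian filtering in a discrete hidden Markov model.

    Returns the posterior over hidden states after each observation.
    """
    n = len(prior)
    belief = list(prior)
    posteriors = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight the prediction by the likelihood of the observation.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        posteriors.append(belief)
    return posteriors


# Hypothetical 3-state sequence generator: each state tends to persist
# and otherwise hands over to the next state in a fixed cyclic order.
transition = [[0.7, 0.3, 0.0],
              [0.0, 0.7, 0.3],
              [0.3, 0.0, 0.7]]
# Each state most often emits "its own" symbol (0, 1 or 2).
emission = [[0.8, 0.1, 0.1],
            [0.1, 0.8, 0.1],
            [0.1, 0.1, 0.8]]
prior = [1 / 3, 1 / 3, 1 / 3]

posteriors = forward_filter([0, 0, 1, 1, 2, 2], transition, emission, prior)
decoded = [max(range(3), key=lambda s: b[s]) for b in posteriors]
print(decoded)
```

Each update only needs the previous belief and the current observation, which is what makes the scheme online: the generative model supplies the prediction, and the sensory input corrects it.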

3.2. Biological and Artificial Inferential Hierarchies

In neuroscience and the cognitive sciences, the brain is often viewed as a hierarchical system, where a functional hierarchy can be mapped to the structural hierarchy of the cortex (Badre, 2008; Koechlin et al., 2003; Kiebel et al., 2008). The best example of such a hierarchical organization is the visual system, for which the existence of both a functional and an equivalent structural hierarchy is established (Felleman and Van Essen, 1991). Cells in lower levels of the hierarchy encode simple features and have smaller receptive fields than cells further up the hierarchy, which possess larger receptive fields and encode more complex patterns by integrating information from lower levels (Hubel and Wiesel, 1959; Zeki and Shipp, 1988; Giese and Poggio, 2003). This functional hierarchy is mediated by an asymmetry of recurrent connectivity in the visual stream, where forward connections to higher layers are commonly found to have fast, excitatory effects on the post-synaptic neurons, while feedback connections act in a slower, modulatory manner (Zeki and Shipp, 1988; Sherman and Guillery, 1998). Moreover, neuroimaging studies have shown that the brain is generally organized into a modular hierarchical structure (Bassett et al., 2010; Meunier et al., 2009, 2010). This is substantiated by other network-theoretical characteristics of the brain, like its scale-free property (Eguiluz et al., 2005), which is a natural consequence of modular hierarchy (Ravasz and Barabási, 2003). Hierarchies also play an important role in cognitive neuroscience as most if not all types of behavior, as well as cognitive processes, can be described in a hierarchical fashion. For example, making a cup of tea can be considered a high-order goal in a hierarchy with subgoals that are less abstract and temporally less extended. 
In the example of making a cup of tea, these subgoals can be: (i) putting a teabag into a pot, (ii) pouring hot water into the pot, and (iii) pouring tea into a cup (example adopted from Botvinick, ).

3.2.1. A Hierarchy of Time Scales

Importantly, all theories of cortical hierarchies of function share the common assumption that primary sensory regions encode rather quickly changing dynamics representing the fast features of sensory input, and that those regions are at the bottom of the hierarchy, while temporally more extended or more abstract representations are located in higher order cortices. This principle has been conceptualized as a “hierarchy of time scales” (Kiebel et al., 2008; Hasson et al., 2008; Koechlin et al., 2003; Badre, 2008; Kaplan et al., 2020). In this view, levels further up the hierarchy code for more general characteristics of the environment and inner cognitive processes, which generally change slowly (Hasson et al., 2008; Koechlin et al., 2003; Badre, 2008). For example, although the visual hierarchy is typically understood as a spatial hierarchy, experimental evidence is emerging that it is also a hierarchy of time scales (Cocchi et al., 2016; Gauthier et al., 2012; Mattar et al., 2016). Importantly, the information exchange in such a hierarchy is bidirectional. While top-down information can be regarded as the actions of a generative model trying to predict the sensory input (Dayan et al., 1995; Friston, 2005), recognition is achieved by bottom-up information that provides higher levels in the hierarchy with information about the sensory input; see also Yildiz and Kiebel (2011) and Yildiz et al. (2013) for illustrations of this concept. A related finding is an experimentally observed hierarchy of time scales with respect to the time lag of the autocorrelation of neuronal measurements (e.g., Murray et al., 2014). Here, it was found that the decay of autocorrelation was fastest for sensory areas (<100 ms) and slowest for prefrontal areas like ACC (>300 ms).
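How a time scale can be read off from the decay of an autocorrelation function is sketched below on synthetic data. This is a simplified illustration and not the analysis used in the cited study: the signals are first-order autoregressive processes, the time constants of 80 ms and 320 ms are hypothetical values, and the estimate is based on the lag-1 autocorrelation only.

```python
import math
import random


def simulate_ar1(tau_ms, dt_ms, n_samples, seed):
    """Simulate a unit-variance AR(1) process whose autocorrelation
    decays as exp(-lag / tau_ms)."""
    rng = random.Random(seed)
    phi = math.exp(-dt_ms / tau_ms)
    noise_sd = math.sqrt(1.0 - phi * phi)
    x = rng.gauss(0.0, 1.0)
    samples = []
    for _ in range(n_samples):
        samples.append(x)
        x = phi * x + noise_sd * rng.gauss(0.0, 1.0)
    return samples


def estimate_timescale(samples, dt_ms):
    """Estimate the decay constant from the lag-1 autocorrelation,
    assuming an exponentially decaying autocorrelation function."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    cov1 = sum((samples[k] - mean) * (samples[k + 1] - mean)
               for k in range(n - 1)) / (n - 1)
    rho1 = cov1 / var
    return -dt_ms / math.log(rho1)


# Two synthetic "areas" with hypothetical fast and slow intrinsic time scales.
tau_fast = estimate_timescale(simulate_ar1(80.0, 10.0, 50000, seed=1), 10.0)
tau_slow = estimate_timescale(simulate_ar1(320.0, 10.0, 50000, seed=2), 10.0)
print(tau_fast, tau_slow)
```

Ordering a set of recorded areas by such an estimate is one simple way to expose a hierarchy of time scales in data; with real recordings one would typically fit the decay over several lags rather than rely on a single lag.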

The importance of cognition based on spatiotemporal structure at multiple time scales is also illustrated by various computational modeling studies. In one study, robots were endowed with a neural network whose parameters were left free to evolve over time to optimize performance during a navigation task (Nolfi, 2002). After some time, the robots had evolved neural assemblies with representations at clearly distinct time scales: one assembly had assumed a quickly changing, short time scale associated with immediate sensory input while another assembly had adopted a long time scale, associated with an integration of information over an extended period of time, which was necessary for succeeding at the task. Another modeling study showed that robots with neuronal populations of strongly differing time-constants performed their tasks significantly better than when endowed only with units of approximately identical time-constants (Yamashita and Tani, 2008). In Botvinick () it was shown that, after learning, a neural network with a structural hierarchy similar to the one proposed for the frontal cortex had organized in such a way that high-level units coded for temporal context while low-level units encoded fast responses, similar to the role assigned to sensory and motor regions in theories of hierarchical cortical processing (Kiebel et al., 2008; Alexander and Brown, 2018; Rao and Ballard, 1999; Botvinick, ; Badre, 2008; Koechlin et al., 2003; Fuster, 2004).

The principle of representing spatiotemporal dynamics at multiple time scales has also been used to model birdsong generation and inference in songbirds by combining a hierarchically structured RNN with a model of songbirds' vocal tract dynamics (Yildiz and Kiebel, 2011). The system consisted of three levels, each of which was governed by the sequential dynamics of an RNN following a limit cycle. The sequential dynamics were influenced both by top-down predictions and by bottom-up prediction errors. In another study, the same concept was applied to the recognition of human speech (Yildiz et al., 2013). The resulting inference scheme was able to recognize spoken words, even under adverse circumstances like accelerated speech, since it inferred and adapted parameters in an online fashion during the recognition process. The same principle can also be translated to very different types of input; see Rivera et al. (2015) for an example of insect olfaction.

3.2.2. A Hierarchy of Time Scales: Neuroimaging Evidence

Experimental evidence for the hypothesis of a hierarchy of time scales has been reported in several neuroimaging studies (Koechlin et al., 2003; Hasson et al., 2008; Lerner et al., 2011; Gauthier et al., 2012; Cocchi et al., 2016; Mattar et al., 2016; Baldassano et al., 2017; Gao et al., 2020), two of which we briefly discuss in the following. One functional magnetic resonance imaging (fMRI) study investigated the temporal receptive windows (TRW) of several brain regions in the human brain (Hasson et al., 2008). The TRW of an area is the time-interval over which the region “integrates” incoming information, in order to extract meaning over a specific temporal scale. It was found that regions such as the primary visual cortex exhibited rather short TRWs, while high order regions exhibited intermediate to long TRWs (Hasson et al., 2008). Similarly, in Lerner et al. (2011) the same principle was tested with temporally structured auditory input, i.e., speech. Using fMRI, the authors found evidence for a hierarchy of time scales in specific brain areas. The different time scales represented fast auditory input, words, sentences and paragraphs (see Figure 7).

Figure 7

3.2.3. A Hierarchy of Time Scales: Machine Learning

Not surprisingly, the importance of hierarchies of time scales is well-established within the machine learning community (El Hihi and Bengio, 1996; Malhotra et al., 2015). Current state-of-the-art RNN architectures used for prediction and classification of complex time series data are based on recurrent network units organized as temporal hierarchies. Notable examples are the clockwork RNN (Koutnik et al., 2014), gated feedback RNN (Chung et al., 2015), hierarchical multi-scale RNN (Chung et al., 2016), fast-slow RNN (Mujika et al., 2017), and higher order RNNs (HORNNs) (Soltani and Jiang, 2016). These modern RNN architectures have found various applications in motion classification (Neverova et al., 2016; Yan et al., 2018), speech synthesis (Wu and King, 2016; Achanta and Gangashetty, 2017; Zhang and Woodland, 2018), speech recognition (Chan et al., 2016), and other related areas (Liu et al., 2015; Krause et al., 2017; Kurata et al., 2017). These applications of hierarchical RNN architectures further confirm the relevance of hierarchically organized sequence generators for capturing complex dynamics in our everyday environments.

4. Conclusion

Here, we have reviewed the evidence that our brain senses its environment as sequential sensory input, and consequently, uses neuronal sequences for predicting future sensory input. Although the general idea that the brain is a prediction device has by now become a mainstream guiding principle in cognitive neuroscience, it is much less clear how exactly the brain computes these predictions. We have reviewed results from different areas of the neurosciences suggesting that the brain may achieve this by using a hierarchy of time scales, specifically a hierarchy of sequential dynamics. If this were the case, the question would be whether already known neuroscience results in specific areas can be re-interpreted as evidence for the brain's operations in such a hierarchy of time scales. Such an interpretation is quite natural for neuroscience fields like auditory processing, where such a temporal hierarchy is most evident. But it is much less evident for other areas, such as decision-making. To further test this suggested theory of brain function, researchers need to design experimental paradigms which are specifically geared toward testing what probabilistic inference mechanisms the brain uses to predict its input at different time scales and to select its own actions. Importantly, hierarchical computational modeling approaches as reviewed here could be used to further provide theoretical evidence of the underlying multi-scale inference mechanism and generate new predictions that can be tested experimentally.

What we found telling is that recent advances in machine learning converge on similar ideas of representing multi-scale dynamics in sensory data, although with a different motivation and different aims. The simple reason for this convergence may be that much of the sensory data that is input to machine learning implementations is similar to the kind of sensory input experienced by humans, as for example in videos and speech data. Therefore, we believe that as computational modeling in the neurosciences, as reviewed here, gains traction, there will be useful translations from the neurosciences to machine learning applications.

Statements

Author contributions

DM and SK contributed to the conception of the manuscript. SF wrote the manuscript, with contributions by DM and SK. All authors contributed to the article and approved the submitted version.

Funding

This work was funded by the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft), SFB 940/2 - Project ID 178833530 A9, TRR 265/1 - Project ID 402170461 B09, and as part of Germany's Excellence Strategy - EXC 2050/1 - Project ID 390696704 - Cluster of Excellence Centre for Tactile Internet with Human-in-the-Loop (CeTI) of Technische Universität Dresden.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  • 1

    Abeles, M. (1991). Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge, UK: Cambridge University Press. doi: 10.1017/CBO9780511574566

  • 2

    Abeles, M., Bergman, H., Gat, I., Meilijson, I., Seidemann, E., Tishby, N., et al. (1995). Cortical activity flips among quasi-stationary states. Proc. Natl. Acad. Sci. U.S.A. 92, 8616–8620. doi: 10.1073/pnas.92.19.8616

  • 3

    Abeles, M., and Gat, I. (2001). Detecting precise firing sequences in experimental data. J. Neurosci. Methods 107, 141–154. doi: 10.1016/S0165-0270(01)00364-8

  • 4

    Achanta, S., and Gangashetty, S. V. (2017). Deep Elman recurrent neural networks for statistical parametric speech synthesis. Speech Commun. 93, 31–42. doi: 10.1016/j.specom.2017.08.003

  • 5

    Adams, R. A., Stephan, K. E., Brown, H. R., Frith, C. D., and Friston, K. J. (2013). The computational anatomy of psychosis. Front. Psychiatry 4:47. doi: 10.3389/fpsyt.2013.00047

  • 6

    Afraimovich, V., Zhigulin, V., and Rabinovich, M. (2004a). On the origin of reproducible sequential activity in neural circuits. Chaos 14, 1123–1129. doi: 10.1063/1.1819625

  • 7

    Afraimovich, V. S., Rabinovich, M. I., and Varona, P. (2004b). Heteroclinic contours in neural ensembles and the winnerless competition principle. Int. J. Bifurc. Chaos 14, 1195–1208. doi: 10.1142/S0218127404009806

  • 8

    Aitchison, L., and Lengyel, M. (2017). With or without you: predictive coding and Bayesian inference in the brain. Curr. Opin. Neurobiol. 46, 219–227. doi: 10.1016/j.conb.2017.08.010

  • 9

    Alexander, W. H., and Brown, J. W. (2018). Frontal cortex function as derived from hierarchical predictive coding. Sci. Rep. 8:3843. doi: 10.1038/s41598-018-21407-9

  • 10

    Ashwin, P., and Borresen, J. (2004). Encoding via conjugate symmetries of slow oscillations for globally coupled oscillators. Phys. Rev. E 70:026203. doi: 10.1103/PhysRevE.70.026203

  • 11

    Ashwin, P., and Borresen, J. (2005). Discrete computation using a perturbed heteroclinic network. Phys. Lett. A 347, 208–214. doi: 10.1016/j.physleta.2005.08.013

  • 12

    Ashwin, P., Orosz, G., Wordsworth, J., and Townley, S. (2007). Dynamics on networks of cluster states for globally coupled phase oscillators. SIAM J. Appl. Dyn. Syst. 6, 728–758. doi: 10.1137/070683969

  • 13

    Ashwin, P., and Swift, J. W. (1992). The dynamics of n weakly coupled identical oscillators. J. Nonlin. Sci. 2, 69–108. doi: 10.1007/BF02429852

  • 14

    Ashwin, P., and Timme, M. (2005). Nonlinear dynamics: when instability makes sense. Nature 436:36. doi: 10.1038/436036b

  • 15

    Badre, D. (2008). Cognitive control, hierarchy, and the rostro-caudal organization of the frontal lobes. Trends Cogn. Sci. 12, 193–200. doi: 10.1016/j.tics.2008.02.004

  • 16

    Baeg, E., Kim, Y., Huh, K., Mook-Jung, I., Kim, H., and Jung, M. (2003). Dynamics of population code for working memory in the prefrontal cortex. Neuron 40, 177–188. doi: 10.1016/S0896-6273(03)00597-X

  • 17

    Baldassano, C., Chen, J., Zadbood, A., Pillow, J. W., Hasson, U., and Norman, K. A. (2017). Discovering event structure in continuous narrative perception and memory. Neuron 95, 709–721. doi: 10.1016/j.neuron.2017.06.041

  • 18

    Bassett, D. S., Greenfield, D. L., Meyer-Lindenberg, A., Weinberger, D. R., Moore, S. W., and Bullmore, E. T. (2010). Efficient physical embedding of topologically complex information processing networks in brains and computer circuits. PLoS Comput. Biol. 6:e1000748. doi: 10.1371/journal.pcbi.1000748

  • 19

    Bazhenov, M., Stopfer, M., Rabinovich, M., Abarbanel, H. D., Sejnowski, T. J., and Laurent, G. (2001). Model of cellular and network mechanisms for odor-evoked temporal patterning in the locust antennal lobe. Neuron 30, 569–581. doi: 10.1016/S0896-6273(01)00286-0

  • 20

    Beck, J. M., Ma, W. J., Kiani, R., Hanks, T., Churchland, A. K., Roitman, J., et al. (2008). Probabilistic population codes for Bayesian decision making. Neuron 60, 1142–1152. doi: 10.1016/j.neuron.2008.09.021

  • 21

    Berry, H., and Quoy, M. (2006). Structure and dynamics of random recurrent neural networks. Adapt. Behav. 14, 129–137. doi: 10.1177/105971230601400204

  • 22

    Bhalla, U. S. (2019). Dendrites, deep learning, and sequences in the hippocampus. Hippocampus 29, 239–251. doi: 10.1002/hipo.22806

  • 23

    Bick, C., and Rabinovich, M. I. (2010). On the occurrence of stable heteroclinic channels in Lotka-Volterra models. Dyn. Syst. 25, 97–110. doi: 10.1080/14689360903322227

  • 24

    Birkholz, P., Kroger, B. J., and Neuschaefer-Rube, C. (2010). Model-based reproduction of articulatory trajectories for consonant-vowel sequences. IEEE Trans. Audio Speech Lang. Process. 19, 1422–1433. doi: 10.1109/TASL.2010.2091632

  • 25

    Bitzer, S., and Kiebel, S. J. (2012). Recognizing recurrent neural networks (RRNN): Bayesian inference for recurrent neural networks. Biol. Cybernet. 106, 201–217. doi: 10.1007/s00422-012-0490-x

  • 26

    Boemio, A., Fromm, S., Braun, A., and Poeppel, D. (2005). Hierarchical and asymmetric temporal sensitivity in human auditory cortices. Nat. Neurosci. 8, 389–395. doi: 10.1038/nn1409

  • 27

    Botvinick, M. M. (2007). Multilevel structure in behaviour and in the brain: a model of Fuster's hierarchy. Philos. Trans. R. Soc. B Biol. Sci. 362, 1615–1626. doi: 10.1098/rstb.2007.2056

  • 28

    Botvinick, M. M. (2008). Hierarchical models of behavior and prefrontal function. Trends Cogn. Sci. 12, 201–208. doi: 10.1016/j.tics.2008.02.009

  • 29

    Bouchard, K. E., and Brainard, M. S. (2016). Auditory-induced neural dynamics in sensory-motor circuitry predict learned temporal and sequential statistics of birdsong. Proc. Natl. Acad. Sci. U.S.A. 113, 9641–9646. doi: 10.1073/pnas.1606725113

  • 30

    Bousfield, W. A. (1953). The occurrence of clustering in the recall of randomly arranged associates. J. Gen. Psychol. 49, 229–240. doi: 10.1080/00221309.1953.9710088

  • 31

    Branco, T., Clark, B. A., and Häusser, M. (2010). Dendritic discrimination of temporal input sequences in cortical neurons. Science 329, 1671–1675. doi: 10.1126/science.1189664

  • 32

    Breakspear, M. (2001). Perception of odors by a nonlinear model of the olfactory bulb. Int. J. Neural Syst. 11, 101–124. doi: 10.1142/S0129065701000564

  • 33

    Buonomano, D. V., and Maass, W. (2009). State-dependent computations: spatiotemporal processing in cortical networks. Nat. Rev. Neurosci. 10:113. doi: 10.1038/nrn2558

  • 34

    Buzsáki, G. (2015). Hippocampal sharp wave-ripple: a cognitive biomarker for episodic memory and planning. Hippocampus 25, 1073–1188. doi: 10.1002/hipo.22488

  • 35

    Ceni, A., Ashwin, P., and Livi, L. (2019). Interpreting recurrent neural networks behaviour via excitable network attractors. Cogn. Comput. 12, 330–356. doi: 10.1007/s12559-019-09634-2

  • 36

    Chan, W., Jaitly, N., Le, Q., and Vinyals, O. (2016). Listen, attend and spell: a neural network for large vocabulary conversational speech recognition, in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Shanghai: IEEE), 4960–4964. doi: 10.1109/ICASSP.2016.7472621

  • 37

    Chenkov, N., Sprekeler, H., and Kempter, R. (2017). Memory replay in balanced recurrent networks. PLoS Comput. Biol. 13:e1005359. doi: 10.1371/journal.pcbi.1005359

  • 38

    Chung, J., Ahn, S., and Bengio, Y. (2016). Hierarchical multiscale recurrent neural networks. arXiv:1609.01704.

  • 39

    Chung, J., Gulcehre, C., Cho, K., and Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv:1412.3555.

  • 40

    Chung, J., Gulcehre, C., Cho, K., and Bengio, Y. (2015). Gated feedback recurrent neural networks, in International Conference on Machine Learning (Lille), 2067–2075.

  • 41

    Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav. Brain Sci. 36, 181–204. doi: 10.1017/S0140525X12000477

  • 42

    Cocchi, L., Sale, M. V., Gollo, L. L., Bell, P. T., Nguyen, V. T., Zalesky, A., et al. (2016). A hierarchy of timescales explains distinct effects of local inhibition of primary visual cortex and frontal eye fields. Elife 5:e15252. doi: 10.7554/eLife.15252

  • 43

    Colombo, M., and Seriès, P. (2012). Bayes in the brain – on Bayesian modelling in neuroscience. Br. J. Philos. Sci. 63, 697–723. doi: 10.1093/bjps/axr043

  • 44

    Constantinescu, A. O., O'Reilly, J. X., and Behrens, T. E. (2016). Organizing conceptual knowledge in humans with a gridlike code. Science 352, 1464–1468. doi: 10.1126/science.aaf0941

  • 45

    Crowe, D. A., Averbeck, B. B., and Chafee, M. V. (2010). Rapid sequences of population activity patterns dynamically encode task-critical spatial information in parietal cortex. J. Neurosci. 30, 11640–11653. doi: 10.1523/JNEUROSCI.0954-10.2010

  • 46

    Daunizeau, J., Friston, K. J., and Kiebel, S. J. (2009). Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models. Phys. D 238, 2089–2118. doi: 10.1016/j.physd.2009.08.002

  • 47

    Dayan, P., Hinton, G. E., Neal, R. M., and Zemel, R. S. (1995). The Helmholtz machine. Neural Comput. 7, 889–904. doi: 10.1162/neco.1995.7.5.889

  • 48

    Deneve, S. (2008). Bayesian spiking neurons I: inference. Neural Comput. 20, 91–117. 10.1162/neco.2008.20.1.91

  • 49

    Deng, L., Yu, D., and Acero, A. (2006). Structured speech modeling. IEEE Trans. Audio Speech Lang. Process. 14, 1492–1504. 10.1109/TASL.2006.878265

  • 50

    Dezfouli, A., Lingawi, N. W., and Balleine, B. W. (2014). Habits as action sequences: hierarchical action control and changes in outcome value. Philos. Trans. R. Soc. B Biol. Sci. 369:20130482. 10.1098/rstb.2013.0482

  • 51

    Diesmann, M., Gewaltig, M. O., and Aertsen, A. (1999). Stable propagation of synchronous spiking in cortical neural networks. Nature 402:529. 10.1038/990101

  • 52

    Ding, N., Melloni, L., Zhang, H., Tian, X., and Poeppel, D. (2016). Cortical tracking of hierarchical linguistic structures in connected speech. Nat. Neurosci. 19, 158–164. 10.1038/nn.4186

  • 53

    Doya, K., Ishii, S., Pouget, A., and Rao, R. P. (2007). Bayesian Brain: Probabilistic Approaches to Neural Coding. Cambridge, MA: MIT Press. 10.7551/mitpress/9780262042383.001.0001

  • 54

    Dragoi, G., and Tonegawa, S. (2011). Preplay of future place cell sequences by hippocampal cellular assemblies. Nature 469:397. 10.1038/nature09633

  • 55

    Duong, T. V., Bui, H. H., Phung, D. Q., and Venkatesh, S. (2005). Activity recognition and abnormality detection with the switching hidden semi-Markov model, in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), Vol. 1 (San Diego, CA: IEEE), 838–845. 10.1109/CVPR.2005.61

  • 56

    Eguiluz, V. M., Chialvo, D. R., Cecchi, G. A., Baliki, M., and Apkarian, A. V. (2005). Scale-free brain functional networks. Phys. Rev. Lett. 94:018102. 10.1103/PhysRevLett.94.018102

  • 57

    El Hihi, S., and Bengio, Y. (1996). Hierarchical recurrent neural networks for long-term dependencies, in Advances in Neural Information Processing Systems, 493–499.

  • 58

    Eldar, E., Morris, G., and Niv, Y. (2011). The effects of motivation on response rate: a hidden semi-Markov model analysis of behavioral dynamics. J. Neurosci. Methods 201, 251–261. 10.1016/j.jneumeth.2011.06.028

  • 59

    Ernst, M. O., and Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature 415:429. 10.1038/415429a

  • 60

    Feinberg, E. A., and Shwartz, A. (2012). Handbook of Markov Decision Processes: Methods and Applications, Vol. 40. Boston, MA: Springer Science & Business Media.

  • 61

    Feldman, J. (2001). Bayesian contour integration. Percept. Psychophys. 63, 1171–1182. 10.3758/BF03194532

  • 62

    Felleman, D. J., and Van Essen, D. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cereb. Cortex 1, 1–47. 10.1093/cercor/1.1.1

  • 63

    FitzGerald, T. H., Hämmerer, D., Friston, K. J., Li, S. C., and Dolan, R. J. (2017). Sequential inference as a mode of cognition and its correlates in fronto-parietal and hippocampal brain regions. PLoS Comput. Biol. 13:e1005418. 10.1371/journal.pcbi.1005418

  • 64

    Fletcher, P. C., and Frith, C. D. (2009). Perceiving is believing: a Bayesian approach to explaining the positive symptoms of schizophrenia. Nat. Rev. Neurosci. 10, 48–58. 10.1038/nrn2536

  • 65

    Fonollosa, J., Neftci, E., and Rabinovich, M. (2015). Learning of chunking sequences in cognition and behavior. PLoS Comput. Biol. 11:e1004592. 10.1371/journal.pcbi.1004592

  • 66

    Friston, K. (2005). A theory of cortical responses. Philos. Trans. R. Soc. B Biol. Sci. 360, 815–836. 10.1098/rstb.2005.1622

  • 67

    Friston, K. (2010). The free-energy principle: a unified brain theory? Nat. Rev. Neurosci. 11, 127–138. 10.1038/nrn2787

  • 68

    Friston, K., and Buzsáki, G. (2016). The functional anatomy of time: what and when in the brain. Trends Cogn. Sci. 20, 500–511. 10.1016/j.tics.2016.05.001

  • 69

    Friston, K., and Kiebel, S. (2009). Predictive coding under the free-energy principle. Philos. Trans. R. Soc. B Biol. Sci. 364, 1211–1221. 10.1098/rstb.2008.0300

  • 70

    Friston, K., Kilner, J., and Harrison, L. (2006). A free energy principle for the brain. J. Physiol. 100, 70–87. 10.1016/j.jphysparis.2006.10.001

  • 71

    Friston, K., Mattout, J., and Kilner, J. (2011). Action understanding and active inference. Biol. Cybernet. 104, 137–160. 10.1007/s00422-011-0424-z

  • 72

    Friston, K. J., Stephan, K. E., Montague, R., and Dolan, R. J. (2014). Computational psychiatry: the brain as a phantastic organ. Lancet Psychiatry 1, 148–158. 10.1016/S2215-0366(14)70275-5

  • 73

    Fuster, J. M. (2004). Upper processing stages of the perception-action cycle. Trends Cogn. Sci. 8, 143–145. 10.1016/j.tics.2004.02.004

  • 74

    Gao, R., van den Brink, R. L., Pfeffer, T., and Voytek, B. (2020). Neuronal timescales are functionally dynamic and shaped by cortical microarchitecture. Elife 9:e61277. 10.7554/eLife.61277

  • 75

    Gauthier, B., Eger, E., Hesselmann, G., Giraud, A. L., and Kleinschmidt, A. (2012). Temporal tuning properties along the human ventral visual stream. J. Neurosci. 32, 14433–14441. 10.1523/JNEUROSCI.2467-12.2012

  • 76

    Gelman, A., Simpson, D., and Betancourt, M. (2017). The prior can often only be understood in the context of the likelihood. Entropy 19:555. 10.3390/e19100555

  • 77

    Gers, F. A., Schmidhuber, J., and Cummins, F. (1999). Learning to Forget: Continual Prediction With LSTM. Stevenage: Institution of Engineering and Technology. 10.1049/cp:19991218

  • 78

    Giese, M. A., and Poggio, T. (2003). Cognitive neuroscience: neural mechanisms for the recognition of biological movements. Nat. Rev. Neurosci. 4:179. 10.1038/nrn1057

  • 79

    Giraud, A. L., and Poeppel, D. (2012). Cortical oscillations and speech processing: emerging computational principles and operations. Nat. Neurosci. 15:511. 10.1038/nn.3063

  • 80

    Gros, C. (2007). Neural networks with transient state dynamics. New J. Phys. 9:109. 10.1088/1367-2630/9/4/109

  • 81

    Gros, C. (2009). Cognitive computation with autonomously active neural networks: an emerging field. Cogn. Comput. 1, 77–90. 10.1007/s12559-008-9000-9

  • 82

    Hahnloser, R. H., Kozhevnikov, A. A., and Fee, M. S. (2002). An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature 419:65. 10.1038/nature00974

  • 83

    Hansel, D., Mato, G., and Meunier, C. (1993a). Clustering and slow switching in globally coupled phase oscillators. Phys. Rev. E 48:3470. 10.1103/PhysRevE.48.3470

  • 84

    Hansel, D., Mato, G., and Meunier, C. (1993b). Phase dynamics for weakly coupled Hodgkin-Huxley neurons. Europhys. Lett. 23:367. 10.1209/0295-5075/23/5/011

  • 85

    Harvey, C. D., Coen, P., and Tank, D. W. (2012). Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature 484:62. 10.1038/nature10918

  • 86

    Hasson, U., Yang, E., Vallines, I., Heeger, D. J., and Rubin, N. (2008). A hierarchy of temporal receptive windows in human cortex. J. Neurosci. 28, 2539–2550. 10.1523/JNEUROSCI.5487-07.2008

  • 87

    Hawkins, J., George, D., and Niemasik, J. (2009). Sequence memory for prediction, inference and behaviour. Philos. Trans. R. Soc. B Biol. Sci. 364, 1203–1209. 10.1098/rstb.2008.0322

  • 88

    Helmholtz, H. V. (1867). Handbuch der Physiologischen Optik. Leipzig: Voss.

  • 89

    Hinton, G. E., and Sejnowski, T. J. (1983). Optimal perceptual inference, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Vol. 448 (New York, NY: Citeseer).

  • 90

    Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. U.S.A. 79, 2554–2558. 10.1073/pnas.79.8.2554

  • 91

    Hubel, D. H., and Wiesel, T. N. (1959). Receptive fields of single neurones in the cat's striate cortex. J. Physiol. 148, 574–591. 10.1113/jphysiol.1959.sp006308

  • 92

    Ikegaya, Y., Aaron, G., Cossart, R., Aronov, D., Lampl, I., Ferster, D., et al. (2004). Synfire chains and cortical songs: temporal modules of cortical activity. Science 304, 559–564. 10.1126/science.1093173

  • 93

    Izhikevich, E. M. (2007). Dynamical Systems in Neuroscience. Cambridge, MA: MIT Press. 10.7551/mitpress/2526.001.0001

  • 94

    Jaeger, H. (2001). The “Echo State” Approach to Analysing and Training Recurrent Neural Networks-With an Erratum Note. Bonn: German National Research Center for Information Technology GMD Technical Report 148.

  • 95

    Ji, D., and Wilson, M. A. (2007). Coordinated memory replay in the visual cortex and hippocampus during sleep. Nat. Neurosci. 10, 100–107. 10.1038/nn1825

  • 96

    Jones, L. M., Fontanini, A., Sadacca, B. F., Miller, P., and Katz, D. B. (2007). Natural stimuli evoke dynamic sequences of states in sensory cortical ensembles. Proc. Natl. Acad. Sci. U.S.A. 104, 18772–18777. 10.1073/pnas.0705546104

  • 97

    Jouffroy, G. (2007). Design of simple limit cycles with recurrent neural networks for oscillatory control, in Sixth International Conference on Machine Learning and Applications (ICMLA 2007) (Cincinnati, OH: IEEE), 50–55. 10.1109/ICMLA.2007.99

  • 98

    Kaplan, H. S., Thula, O. S., Khoss, N., and Zimmer, M. (2020). Nested neuronal dynamics orchestrate a behavioral hierarchy across timescales. Neuron 105, 562–576. 10.1016/j.neuron.2019.10.037

  • 99

    Kenet, T., Bibitchkov, D., Tsodyks, M., Grinvald, A., and Arieli, A. (2003). Spontaneously emerging cortical representations of visual attributes. Nature 425:954. 10.1038/nature02078

  • 100

    Kiebel, S. J., Daunizeau, J., and Friston, K. J. (2008). A hierarchy of time-scales and the brain. PLoS Comput. Biol. 4:e1000209. 10.1371/journal.pcbi.1000209

  • 101

    Kiebel, S. J., Von Kriegstein, K., Daunizeau, J., and Friston, K. J. (2009). Recognizing sequences of sequences. PLoS Comput. Biol. 5:e1000464. 10.1371/journal.pcbi.1000464

  • 102

    Knill, D. C., and Pouget, A. (2004). The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 27, 712–719. 10.1016/j.tins.2004.10.007

  • 103

    Koechlin, E., Ody, C., and Kouneiher, F. (2003). The architecture of cognitive control in the human prefrontal cortex. Science 302, 1181–1185. 10.1126/science.1088545

  • 104

    Körding, K. P., and Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature 427:244. 10.1038/nature02169

  • 105

    Kotz, S. A., Meyer, M., Alter, K., Besson, M., von Cramon, D. Y., and Friederici, A. D. (2003). On the lateralization of emotional prosody: an event-related functional MR investigation. Brain Lang. 86, 366–376. 10.1016/S0093-934X(02)00532-1

  • 106

    Koutnik, J., Greff, K., Gomez, F., and Schmidhuber, J. (2014). A clockwork RNN. arXiv 1402.3511.

  • 107

    Krause, J., Johnson, J., Krishna, R., and Fei-Fei, L. (2017). A hierarchical approach for generating descriptive image paragraphs, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Honolulu, HI), 317–325. 10.1109/CVPR.2017.356

  • 108

    Kurata, G., Ramabhadran, B., Saon, G., and Sethy, A. (2017). Language modeling with highway LSTM, in 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (Okinawa: IEEE), 244–251. 10.1109/ASRU.2017.8268942

  • 109

    Kurikawa, T., and Kaneko, K. (2015). Memories as bifurcations: realization by collective dynamics of spiking neurons under stochastic inputs. Neural Netw. 62, 25–31. 10.1016/j.neunet.2014.07.005

  • 110

    Kurth-Nelson, Z., Economides, M., Dolan, R. J., and Dayan, P. (2016). Fast sequences of non-spatial state representations in humans. Neuron 91, 194–204. 10.1016/j.neuron.2016.05.028

  • 111

    Laboy-Juárez, K. J., Langberg, T., Ahn, S., and Feldman, D. E. (2019). Elementary motion sequence detectors in whisker somatosensory cortex. Nat. Neurosci. 22, 1438–1449. 10.1038/s41593-019-0448-6

  • 112

    Lashley, K. S. (1951). The problem of serial order in behavior, in Cerebral Mechanisms in Behavior; The Hixon Symposium, ed L. A. Jeffress (Wiley), 112–146.

  • 113

    LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521, 436–444. 10.1038/nature14539

  • 114

    Leptourgos, P., Denève, S., and Jardri, R. (2017). Can circular inference relate the neuropathological and behavioral aspects of schizophrenia? Curr. Opin. Neurobiol. 46, 154–161. 10.1016/j.conb.2017.08.012

  • 115

    Lerner, Y., Honey, C. J., Silbert, L. J., and Hasson, U. (2011). Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. J. Neurosci. 31, 2906–2915. 10.1523/JNEUROSCI.3684-10.2011

  • 116

    Lipton, Z. C., Berkowitz, J., and Elkan, C. (2015). A critical review of recurrent neural networks for sequence learning. arXiv 1506.00019.

  • 117

    Litvak, V., Sompolinsky, H., Segev, I., and Abeles, M. (2003). On the transmission of rate code in long feedforward networks with excitatory-inhibitory balance. J. Neurosci. 23, 3006–3015. 10.1523/JNEUROSCI.23-07-03006.2003

  • 118

    Liu, H., He, L., Bai, H., Dai, B., Bai, K., and Xu, Z. (2018). Structured inference for recurrent hidden semi-Markov model, in IJCAI (Stockholm), 2447–2453. 10.24963/ijcai.2018/339

  • 119

    Liu, P., Qiu, X., Chen, X., Wu, S., and Huang, X. (2015). Multi-timescale long short-term memory neural network for modelling sentences and documents, in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (Okinawa), 2326–2335. 10.18653/v1/D15-1280

  • 120

    Long, M. A., Jin, D. Z., and Fee, M. S. (2010). Support for a synaptic chain model of neuronal sequence generation. Nature 468:394. 10.1038/nature09514

  • 121

    Lukoševičius, M., Jaeger, H., and Schrauwen, B. (2012). Reservoir computing trends. Künstl. Intell. 26, 365–371. 10.1007/s13218-012-0204-5

  • 122

    Maass, W., Natschläger, T., and Markram, H. (2002). Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Comput. 14, 2531–2560. 10.1162/089976602760407955

  • 123

    MacDonald, C. J., Lepage, K. Q., Eden, U. T., and Eichenbaum, H. (2011). Hippocampal “time cells” bridge the gap in memory for discontiguous events. Neuron 71, 737–749. 10.1016/j.neuron.2011.07.012

  • 124

    Malhotra, P., Vig, L., Shroff, G., and Agarwal, P. (2015). Long short term memory networks for anomaly detection in time series, in Proceedings (Louvain-la-Neuve: Presses Universitaires de Louvain), 89.

  • 125

    Martinez-Conde, S. (2006). Fixational eye movements in normal and pathological vision. Prog. Brain Res. 154, 151–176. 10.1016/S0079-6123(06)54008-7

  • 126

    Martinez-Conde, S., Macknik, S. L., and Hubel, D. H. (2004). The role of fixational eye movements in visual perception. Nat. Rev. Neurosci. 5, 229–240. 10.1038/nrn1348

  • 127

    Mattar, M. G., Kahn, D. A., Thompson-Schill, S. L., and Aguirre, G. K. (2016). Varying timescales of stimulus integration unite neural adaptation and prototype formation. Curr. Biol. 26, 1669–1676. 10.1016/j.cub.2016.04.065

  • 128

    Mazor, O., and Laurent, G. (2005). Transient dynamics versus fixed points in odor representations by locust antennal lobe projection neurons. Neuron 48, 661–673. 10.1016/j.neuron.2005.09.032

  • 129

    Meunier, D., Lambiotte, R., and Bullmore, E. T. (2010). Modular and hierarchically modular organization of brain networks. Front. Neurosci. 4:200. 10.3389/fnins.2010.00200

  • 130

    Meunier, D., Lambiotte, R., Fornito, A., Ersche, K., and Bullmore, E. T. (2009). Hierarchical modularity in human brain functional networks. Front. Neuroinform. 3:37. 10.3389/neuro.11.037.2009

  • 131

    Mi, Y., Katkov, M., and Tsodyks, M. (2017). Synaptic correlates of working memory capacity. Neuron 93, 323–330. 10.1016/j.neuron.2016.12.004

  • 132

    Miconi, T. (2017). Biologically plausible learning in recurrent neural networks reproduces neural dynamics observed during cognitive tasks. Elife 6:e20899. 10.7554/eLife.20899

  • 133

    Miller, G. A. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol. Rev. 63:81. 10.1037/h0043158

  • 134

    Mujika, A., Meier, F., and Steger, A. (2017). Fast-slow recurrent neural networks, in Advances in Neural Information Processing Systems, 5915–5924.

  • 135

    Murray, J. D., Bernacchia, A., Freedman, D. J., Romo, R., Wallis, J. D., Cai, X., et al. (2014). A hierarchy of intrinsic timescales across primate cortex. Nat. Neurosci. 17:1661. 10.1038/nn.3862

  • 136

    Neverova, N., Wolf, C., Lacey, G., Fridman, L., Chandra, D., Barbello, B., et al. (2016). Learning human identity from motion patterns. IEEE Access 4, 1810–1820. 10.1109/ACCESS.2016.2557846

  • 137

    Neves, F. S., and Timme, M. (2012). Computation by switching in complex networks of states. Phys. Rev. Lett. 109:018701. 10.1103/PhysRevLett.109.018701

  • 138

    Nolfi, S. (2002). Evolving robots able to self-localize in the environment: the importance of viewing cognition as the result of processes occurring at different time-scales. Connect. Sci. 14, 231–244. 10.1080/09540090208559329

  • 139

    O'Neill, J., Boccara, C., Stella, F., Schönenberger, P., and Csicsvari, J. (2017). Superficial layers of the medial entorhinal cortex replay independently of the hippocampus. Science 355, 184–188. 10.1126/science.aag2787

  • 140

    Pastalkova, E., Itskov, V., Amarasingham, A., and Buzsáki, G. (2008). Internally generated cell assembly sequences in the rat hippocampus. Science 321, 1322–1327. 10.1126/science.1159775

  • 141

    Perdikis, D., Huys, R., and Jirsa, V. K. (2011). Time scale hierarchies in the functional organization of complex behaviors. PLoS Comput. Biol. 7:e1002198. 10.1371/journal.pcbi.1002198

  • 142

    Pezzulo, G., van der Meer, M. A., Lansink, C. S., and Pennartz, C. M. (2014). Internally generated sequences in learning and executing goal-directed behavior. Trends Cogn. Sci. 18, 647–657. 10.1016/j.tics.2014.06.011

  • 143

    Pfeiffer, B. E. (2020). The content of hippocampal “replay”. Hippocampus 30, 6–18. 10.1002/hipo.22824

  • 144

    Prut, Y., Vaadia, E., Bergman, H., Haalman, I., Slovin, H., and Abeles, M. (1998). Spatiotemporal structure of cortical activity: properties and behavioral relevance. J. Neurophysiol. 79, 2857–2874. 10.1152/jn.1998.79.6.2857

  • 145

    Rabinovich, M., Huerta, R., and Laurent, G. (2008). Transient dynamics for neural processing. Science 321, 48–50. 10.1126/science.1155564

  • 146

    Rabinovich, M., Huerta, R., Volkovskii, A., Abarbanel, H., Stopfer, M., and Laurent, G. (2000). Dynamical coding of sensory information with competitive networks. J. Physiol. 94, 465–471. 10.1016/S0928-4257(00)01092-5

  • 147

    Rabinovich, M., Volkovskii, A., Lecanda, P., Huerta, R., Abarbanel, H., and Laurent, G. (2001). Dynamical encoding by networks of competing neuron groups: winnerless competition. Phys. Rev. Lett. 87:068102. 10.1103/PhysRevLett.87.068102

  • 148

    Rabinovich, M. I., Huerta, R., Varona, P., and Afraimovich, V. S. (2006). Generation and reshaping of sequences in neural systems. Biol. Cybernet. 95:519. 10.1007/s00422-006-0121-5

  • 149

    Rahnev, D., and Denison, R. N. (2018). Suboptimality in perceptual decision making. Behav. Brain Sci. 41, 1–107. 10.1017/S0140525X18000936

  • 150

    Rajan, K., Harvey, C. D., and Tank, D. W. (2016). Recurrent network models of sequence generation and memory. Neuron 90, 128–142. 10.1016/j.neuron.2016.02.009

  • 151

    Rao, R. P., and Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 2:79. 10.1038/4580

  • 152

    Ravasz, E., and Barabási, A. L. (2003). Hierarchical organization in complex networks. Phys. Rev. E 67:026112. 10.1103/PhysRevE.67.026112

  • 153

    Reitich-Stolero, T., and Paz, R. (2019). Affective memory rehearsal with temporal sequences in amygdala neurons. Nat. Neurosci. 22, 2050–2059. 10.1038/s41593-019-0542-9

  • 154

    Rivera, D. C., Bitzer, S., and Kiebel, S. J. (2015). Modelling odor decoding in the antennal lobe by combining sequential firing rate models with Bayesian inference. PLoS Comput. Biol. 11:e1004528. 10.1371/journal.pcbi.1004528

  • 155

    Rosenbaum, D. A., Cohen, R. G., Jax, S. A., Weiss, D. J., and Van Der Wel, R. (2007). The problem of serial order in behavior: Lashley's legacy. Hum. Mov. Sci. 26, 525–554. 10.1016/j.humov.2007.04.001

  • 156

    Schaal, S., Mohajerian, P., and Ijspeert, A. (2007). Dynamics systems vs. optimal control—a unifying view. Prog. Brain Res. 165, 425–445. 10.1016/S0079-6123(06)65027-9

  • 157

    Schmidt, K. L., Ambadar, Z., Cohn, J. F., and Reed, L. I. (2006). Movement differences between deliberate and spontaneous facial expressions: zygomaticus major action in smiling. J. Nonverb. Behav. 30, 37–52. 10.1007/s10919-005-0003-x

  • 158

    Seidemann, E., Meilijson, I., Abeles, M., Bergman, H., and Vaadia, E. (1996). Simultaneously recorded single units in the frontal cortex go through sequences of discrete and stable states in monkeys performing a delayed localization task. J. Neurosci. 16, 752–768. 10.1523/JNEUROSCI.16-02-00752.1996

  • 159

    Sherman, S. M., and Guillery, R. (1998). On the actions that one nerve cell can have on another: distinguishing “drivers” from “modulators”. Proc. Natl. Acad. Sci. U.S.A. 95, 7121–7126. 10.1073/pnas.95.12.7121

  • 160

    Skaggs, W. E., and McNaughton, B. L. (1996). Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience. Science 271, 1870–1873. 10.1126/science.271.5257.1870

  • 161

    Soltani, A., Khorsand, P., Guo, C., Farashahi, S., and Liu, J. (2016). Neural substrates of cognitive biases during probabilistic inference. Nat. Commun. 7:11393. 10.1038/ncomms11393

  • 162

    Soltani, R., and Jiang, H. (2016). Higher order recurrent neural networks. arXiv 1605.00064.

  • 163

    Stachenfeld, K. L., Botvinick, M. M., and Gershman, S. J. (2017). The hippocampus as a predictive map. Nat. Neurosci. 20:1643. 10.1038/nn.4650

  • 164

    Strogatz, S. H. (2018). Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering. Boca Raton, FL: CRC Press. 10.1201/9780429492563

  • 165

    Sussillo, D., and Abbott, L. F. (2009). Generating coherent patterns of activity from chaotic neural networks. Neuron 63, 544–557. 10.1016/j.neuron.2009.07.018

  • 166

    Taherkhani, A., Belatreche, A., Li, Y., Cosma, G., Maguire, L. P., and McGinnity, T. M. (2020). A review of learning in biologically plausible spiking neural networks. Neural Netw. 122, 253–272. 10.1016/j.neunet.2019.09.036

  • 167

    Tanaka, G., Yamane, T., Héroux, J. B., Nakane, R., Kanazawa, N., Takeda, S., et al. (2019). Recent advances in physical reservoir computing: a review. Neural Netw. 115, 100–123. 10.1016/j.neunet.2019.03.005

  • 168

    Tokdar, S., Xi, P., Kelly, R. C., and Kass, R. E. (2010). Detection of bursts in extracellular spike trains using hidden semi-Markov point process models. J. Comput. Neurosci. 29, 203–212. 10.1007/s10827-009-0182-2

  • 169

    Toutounji, H., and Pipa, G. (2014). Spatiotemporal computations of an excitable and plastic brain: neuronal plasticity leads to noise-robust and noise-constructive computations. PLoS Comput. Biol. 10:e1003512. 10.1371/journal.pcbi.1003512

  • 170

    Tully, P. J., Lindén, H., Hennig, M. H., and Lansner, A. (2016). Spike-based Bayesian-Hebbian learning of temporal sequences. PLoS Comput. Biol. 12:e1004954. 10.1371/journal.pcbi.1004954

  • 171

    Ulrych, T. J., Sacchi, M. D., and Woodbury, A. (2001). A Bayes tour of inversion: a tutorial. Geophysics 66, 55–69. 10.1190/1.1444923

  • 172

    VanRullen, R., and Koch, C. (2003). Is perception discrete or continuous? Trends Cogn. Sci. 7, 207–213. 10.1016/S1364-6613(03)00095-0

  • 173

    Varona, P., Rabinovich, M. I., Selverston, A. I., and Arshavsky, Y. I. (2002). Winnerless competition between sensory neurons generates chaos: a possible mechanism for molluscan hunting behavior. Chaos 12, 672–677. 10.1063/1.1498155

  • 174

    Watzenig, D. (2007). Bayesian inference for inverse problems-statistical inversion. Elektrotech. Inform. 124, 240–247. 10.1007/s00502-007-0449-0

  • 175

    Weiss, Y., Simoncelli, E. P., and Adelson, E. H. (2002). Motion illusions as optimal percepts. Nat. Neurosci. 5, 598–604. 10.1038/nn0602-858

  • 176

    Wolpert, D. M., Ghahramani, Z., and Jordan, M. I. (1995). An internal model for sensorimotor integration. Science 269, 1880–1882. 10.1126/science.7569931

  • 177

    Wörgötter, F., and Porr, B. (2005). Temporal sequence learning, prediction, and control: a review of different models and their relation to biological mechanisms. Neural Comput. 17, 245–319. 10.1162/0899766053011555

  • 178

    Wu, Z., and King, S. (2016). Investigating gated recurrent networks for speech synthesis, in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE), 5140–5144. 10.1109/ICASSP.2016.7472657

  • 179

    Yamashita, Y., and Tani, J. (2008). Emergence of functional hierarchy in a multiple timescale neural network model: a humanoid robot experiment. PLoS Comput. Biol. 4:e1000220. 10.1371/journal.pcbi.1000220

  • 180

    Yan, S., Smith, J. S., Lu, W., and Zhang, B. (2018). Hierarchical multi-scale attention networks for action recognition. Signal Process. 61, 73–84. 10.1016/j.image.2017.11.005

  • 181

    Yildiz, I. B., Jaeger, H., and Kiebel, S. J. (2012). Re-visiting the echo state property. Neural Netw. 35, 1–9. 10.1016/j.neunet.2012.07.005

  • 182

    Yildiz, I. B., and Kiebel, S. J. (2011). A hierarchical neuronal model for generation and online recognition of birdsongs. PLoS Comput. Biol. 7:e1002303. 10.1371/journal.pcbi.1002303

  • 183

    Yildiz, I. B., von Kriegstein, K., and Kiebel, S. J. (2013). From birdsong to human speech recognition: Bayesian inference on a hierarchy of nonlinear dynamical systems. PLoS Comput. Biol. 9:e1003219. 10.1371/journal.pcbi.1003219

  • 184

    Yu, S. Z. (2015). Hidden Semi-Markov Models: Theory, Algorithms and Applications. Burlington, MA: Morgan Kaufmann.

  • 185

    Yu, Y., Si, X., Hu, C., and Zhang, J. (2019). A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 31, 1235–1270. 10.1162/neco_a_01199

  • 186

    Zeki, S., and Shipp, S. (1988). The functional logic of cortical connections. Nature 335:311. 10.1038/335311a0

  • 187

    Zemel, R. S., Dayan, P., and Pouget, A. (1998). Probabilistic interpretation of population codes. Neural Comput. 10, 403–430. 10.1162/089976698300017818

  • 188

    Zen, H., Tokuda, K., Masuko, T., Kobayashi, T., and Kitamura, T. (2004). Hidden semi-Markov model based speech synthesis, in Eighth International Conference on Spoken Language Processing (Jeju Island).

  • 189

    Zhang, C., and Woodland, P. C. (2018). High order recurrent neural networks for acoustic modelling, in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Calgary, AB: IEEE), 5849–5853. 10.1109/ICASSP.2018.8461608

  • 190

    Zheng, P., and Triesch, J. (2014). Robust development of synfire chains from multiple plasticity mechanisms. Front. Comput. Neurosci. 8:66. 10.3389/fncom.2014.00066

  • 191

    Zutshi, I., Leutgeb, J. K., and Leutgeb, S. (2017). Theta sequences of grid cell populations can provide a movement-direction signal. Curr. Opin. Behav. Sci. 17, 147–154. 10.1016/j.cobeha.2017.08.012

Summary

Keywords

neuronal sequences, Bayesian inference, generative models, Bayesian brain hypothesis, predictive coding, hierarchy of time scales, recurrent neural networks, spatiotemporal trajectories

Citation

Frölich S, Marković D and Kiebel SJ (2021) Neuronal Sequence Models for Bayesian Online Inference. Front. Artif. Intell. 4:530937. doi: 10.3389/frai.2021.530937

Received

30 January 2021

Accepted

13 April 2021

Published

21 May 2021

Volume

4 - 2021

Edited by

Bertram Müller-Myhsok, Max Planck Institute of Psychiatry (MPI), Germany

Reviewed by

Hazem Toutounji, University of Nottingham, United Kingdom; Philipp Georg Sämann, Max Planck Institute of Psychiatry, Germany

Copyright

*Correspondence: Sascha Frölich

This article was submitted to Medicine and Public Health, a section of the journal Frontiers in Artificial Intelligence

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
