An enactivist-inspired mathematical model of cognition

In this paper we start from the philosophical position in cognitive science known as enactivism. We formulate five basic enactivist tenets that we have carefully identified in the relevant literature as the main underlying principles of that philosophy. We then develop a mathematical framework to talk about cognitive systems (both artificial and natural) which complies with these enactivist tenets. In particular we pay attention that our mathematical modeling does not attribute contentful symbolic representations to the agents, and that the agent's nervous system or brain, body and environment are modeled in a way that makes them an inseparable part of a greater totality. The long-term purpose for which this article sets the stage is to create a mathematical foundation for cognition which is in line with enactivism. We see two main benefits of doing so: (1) It enables enactivist ideas to be more accessible for computer scientists, AI researchers, roboticists, cognitive scientists, and psychologists, and (2) it gives the philosophers a mathematical tool which can be used to clarify their notions and help with their debates. Our main notion is that of a sensorimotor system which is a special case of a well studied notion of a transition system. We also consider related notions such as labeled transition systems and deterministic automata. We analyze a notion called sufficiency and show that it is a very good candidate for a foundational notion in the “mathematics of cognition from an enactivist perspective.” We demonstrate its importance by proving a uniqueness theorem about the minimal sufficient refinements (which correspond in some sense to an optimal attunement of an organism to its environment) and by showing that sufficiency corresponds to known notions such as sufficient history information spaces. In the end, we tie it all back to the enactivist tenets.

In this paper we start from the philosophical position in cognitive science known as enactivism. We formulate five basic enactivist tenets that we have carefully identified in the relevant literature as the main underlying principles of that philosophy. We then develop a mathematical framework to talk about cognitive systems (both artificial and natural) which complies with these enactivist tenets. In particular we pay attention that our mathematical modeling does not attribute contentful symbolic representations to the agents, and that the agent's nervous system or brain, body and environment are modeled in a way that makes them an inseparable part of a greater totality. The long-term purpose for which this article sets the stage is to create a mathematical foundation for cognition which is in line with enactivism. We see two main benefits of doing so: ( ) It enables enactivist ideas to be more accessible for computer scientists, AI researchers, roboticists, cognitive scientists, and psychologists, and ( ) it gives the philosophers a mathematical tool which can be used to clarify their notions and help with their debates. Our main notion is that of a sensorimotor system which is a special case of a well studied notion of a transition system. We also consider related notions such as labeled transition systems and deterministic automata. We analyze a notion called su ciency and show that it is a very good candidate for a foundational notion in the "mathematics of cognition from an enactivist perspective." We demonstrate its importance by proving a uniqueness theorem about the minimal su cient refinements (which correspond in some sense to an optimal attunement of an organism to its environment) and by showing that su ciency corresponds to known notions such as su cient history information spaces. In the end, we tie it all back to the enactivist tenets. KEYWORDS enactivism, transition systems, automaton, cognitive modeling, information spaces, robotics . Introduction: Mathematizing enactivism The premise of this paper is to lay down a logical framework for analyzing agency in a novel way, inspired by enactivism. Classically, mathematical and logical models of cognition are in line with the cognitivist paradigm in that they rely on the notion of symbolic representation and do not emphasize embodiment or enactment (Newell and Simon, 1972;Fodor, 2008;Gallistel and King, 2009;Rescorla, 2016).

FIGURE
The environment, body, and nervous system (or brain) will be modeled as inseparable parts of a coupled transition system.
Cognitivism presumes that the world possesses objective structure and the contentful information of this structure is acquired and represented by the cognitive agent. This aligns well with the classical model-theoretic paradigm. In this paradigm a formal language is describing a static model (such as when sentences in the language of rings describe algebraic structuressuch as rings).
In the cognitivist analogy, the agent possesses ("in its head") formulas of the language and the model is the world or the environment of the agent. If the formulas possessed by the agent hold in the model, then the agent's representation of the world is correct; otherwise, it is incorrect. Such view of cognitive agency is rejected by the enactivists either weakly or strongly depending on the branch of enactivism. For example, radical enactivism Myin, 2012, 2017) rejects this view strongly. Our question for this paper is: What would the mathematical logic of cognition look like, if even the radical enactivists were to accept it?
We do not take part in the cognitivist-enactivist, or the representationalist-antirepresentationalist debate (Pezzulo et al., 2011;O'Regan and Block, 2012;Gallagher, 2018;Fuchs, 2020). Rather, we take a somewhat extreme enactivist and antirepresentational view as our axiomatic starting point and as a theoretical explanatory target. Then we develop a mathematical theory that attempts to account for cognition in a way congruent with this view. Even though most forms of enactivism (even radical ones) have room for representation, it is not our main goal at the moment to bridge the gap between "basic minds" and "scaffolded minds, " to use terminology of (Hutto and Myin, 2017). Thus, in this terminology, we are going to explore a mathematical (only) of basic minds.
The following "axioms" we take as fundamentals for our work: (EA1) Embodiment. "From a third-person perspective the organism-environment is taken as the explanatory unit" (Gallagher, 2017). The environment, the body, and the nervous system are inseparable parts of the system which they form by coupling; see Figure 1. They cannot be meaningfully understood in isolation from each other. "Mentality is in all cases concretely constituted by, and thus literally consists of, the extensive ways in which organisms interact with their environments, where the relevant ways of interacting involve, but are not exclusively restricted to, what goes on in brains" (Embodiment Thesis Hutto and Myin, 2012). (EA2) Groundedness. The brain does not "acquire" or "possess" contentful states, representations, or manipulate semantic information in any other way. "Mentalityconstituting interactions are grounded in, shaped by, and explained by nothing more, or other, than the history of an organism's previous interactions. Nothing other than its history of active engaging structures or explains an organism's current interactive tendencies." [Developmental-Explanatory Thesis (Hutto and Myin, 2012)]. (EA3) Emergence. The crucial properties of the brain-bodyenvironment system from the point of view of cognition emerge from the embodiment, the brainbody-environment coupling, the situatedness, and the skills of the agent. The agent's and the environment's prior structure come together to facilitate new structure which emerges through the sensorimotor engagement. " [T]he mind and world arise together in enaction, [but] their manner of arising is not arbitrary" (i.e. it is structured) (Varela et al., 1992). (EA4) Attunement. Agents differ in their ways of attunement and adaptation to their environments, and in the skills they have. A skill is a potential possibility to engage reliably in complex sensorimotor interactions with the environment (Gallagher, 2017). (EA5) Perception. Sensing and perceiving are not the same thing. Perception arises from skillful sensorimotor activity. To perceive is to become better attuned to the environment. O' Regan and Noë (2001) and Noë (2004) "Perception and action, sensorium and motorium, are linked together as successively emergent and mutually selecting patterns." Varela et al. (1992).
The mathematics we use to capture those ideas is a mixture of known and new concepts from theoretical robotics, (non-)deterministic automata and transition systems theory, and dynamical systems (Goranko and Otto, 2007). It will also build upon the information spaces framework, introduced in LaValle (2006) as a unified way to model sensing, actuation, and planning in robotics; the framework itself builds upon earlier ideas such as dynamic games with imperfect information (von Neumann and Morgenstern, 1944;Başar and Olsder, 1995), control with imperfect state information (Kumar and Varaiya, 1986;Bertsekas, 2001), knowledge states (Lozano-Pérez et al., 1984;Erdmann, 1993), perceptual equivalence classes (Donald . /fnbot. . and Jennings, 1991;Donald, 1995), maze and graph-exploring automata (Shannon, 1952;Blum and Kozen, 1978;Fraigniaud et al., 2005), and belief spaces (Kaelbling et al., 1998;Roy and Gordon, 2003). Although information spaces refer to "information, " they are not directly related to Shannon's information theory (Shannon, 1948), which came later than von Neumann's use of information in the context of sequential game theory. Neither does "information" here refer to content-bearing information. One important intuition behind the information in information spaces is that more information corresponds to narrowing down the space of possibilities (for example of future sensorimotor interactions).
The main mathematical concept of this paper is a sensorimotor system (SM-system), which is a special case of a transition system. Sensorimotor systems can describe the bodybrain system, the body-environment system as well as other parts of the brain-body-environment system. Given two SMsystems they can be coupled to produce another (third) SMsystem. Mathematically, the coupling operation is akin to a direct product. We introduce several notions that describe the coupling of the agent and the environment from an outside perspective (not from the perspective of the agent or the environment). The main notion is that of sufficiency. In some sense it guarantees that the coupling is of "high fidelity." It does not compare "internal" models of the agent to "external" states of affairs. Rather it asks whether the way in which the agent engages in sensorimotor patterns is well structured. The notion of sufficiency compares the sensorimiotor capacity of the agent to itself by asking whether the past sensorimotor patterns (in a given environment) determine reliably the future sensorimotor patterns. We then introduce several related notions. The degree of insufficiency is a measure by which various agents can be compared in their coupling versatility (Def 4.11). Minimal sufficient refinement is a concept that can be used in the most vivid ways to illustrate how the sensorimotor interaction "enacts" properties of the brain-body-environment system. The notion of minimal sufficient refinement ties together mathematics of sensorimotor systems and the philosophical ideas of emergence, structural coupling and enactment of the "world we inhabit" (cf. Varela et al., 1992); see Example 4.25. We prove the uniqueness of minimal sufficient refinements (Theorem 4.19) and point out their connection to the notions of bisimulation and sufficient information mappings. Strategic sufficiency is a mathematically more challenging concept, but has appealing properties in the philosophical and practical sense. A sensor mapping is strategically sufficient for some subset of the state space G, if that sensor can (in principle) be used by the agent to reach G . Again, any sensor mapping has minimal strategic refinements, but this time they are not unique. Different This idea of a reachable set G is the simplest way to formalize a ordances. minimal refinements in this case can be thought of as different adaptations to the same environmental demands.
Mathematically, sufficiency is a relative concept to some known notions in theoretical computer science and robotics: that of bisimulation in automata and Kripke model theory (Goranko and Otto, 2007), and sufficient information mappings in information spaces theory (LaValle, 2006).
Minimal sufficient refinements lead to unique classifications of agent-environment states that "emerge" from the way in which the agent is coupled to the environment, not merely from the way the environment is structured on its own. Thus, the world is simultaneously objectively existing (from the global "god" perspective), but also "brought about" by the agent.
This should be enough to answer the two questions that, according to Paolo (2018), any embodied theory of cognition should be able to provide precise answers to: What is its conception of bodies? What central role do bodies play in this theory different from the roles they play in traditional computationalism?
Section 2 introduces the basics of transition and SMsystems, their coupling, and other mathematical constructs such as quotients. Section 3 illustrates the introduced notions with detailed examples. Section 4 introduces the notion of sufficiency, sufficient refinements, and minimal sufficient refinements. We will prove the uniqueness theorem for the latter and illustrate the notions in a computational setting. We will explore the importance of sufficiency and related notions for the enactivist way of looking at cognitive organization. Finally, Section 5 ties the mathematics back to the philosophical premises.

. Transition systems and sensorimotor systems
At the most abstract level, the central concept for our mathematical theory is that of a transition system. This is a standard definition from automata theory (for instance Goranko and Otto, 2007): Definition 2.1. A transition system is a triple (X, U, T) where X is the state space (mathematically it is just a set), U is the set of names for outgoing transitions (another set), and T ⊆ X ×U ×X is a ternary relation.
The intuitive interpretation of (X, U, T) is that it is possible to transition from the state x 1 ∈ X to the state x 2 ∈ X via u ∈ U iff (x 1 , u, x 2 ) ∈ T. We use the notation x 1 u → x 2 to mean that (x 1 , u, x 2 ) ∈ T. Our notion of transition system is often called a labeled transition system in the literature, because each potential transition has a name or label, u ∈ U. However, we drop the term "labeled" because in Section 2.5 we will introduce a version of transition systems in which the states are relabeled, thereby introducing a new kind of labeling. Note that when working with such transition .
/fnbot. . systems as modeling agency, we are safely within the realm of the Developmental-Explanatory Thesis (EA2). The following definitions are standard (although we do not restrict X to be finite): A bisimulation is a relation R such that both R and R T = {(y, x) :(x, y) ∈ R} are simulations.
The notation X ∼ = X ′ means that X , X ′ are isomorphic, and X ∼ X ′ means that there is a bisimulation R such that X = dom(R) and X ′ = ran(R). We speak of automorphism and autobisimulation, if X = X ′ .
We are ready to make the first observation:

. . Transition systems as a unifying concept
There are several ways in which transition systems and their relatives appear in the literature relevant to us.
Examples 2.4. Let (X, U, T) be a transition system.
Then (X, U,T, x 0 , F) is a nondeterministic automaton. If in addition X and U are finite, then it is a nondeterministic finite automaton (NFA). 2. LetT : X × X → P(U) be the functionT(x 1 , ThenT(x 1 , x 2 ) is the set of all u that take x 1 to x 2 . Then, (X,T) is a labeled directed graph in which the labels are subsets of U. Another way to think of it is as a labeled directed multigraph: the multiplicity of the edge from x 1 to x 2 is n = |T(x 1 , x 2 )| and these n edges are labeled by the labels from the setT(x 1 , x 2 ). 3. If for all x 1 ∈ X and u ∈ U there is a unique Then (X, U, τ , x 0 , F) is a deterministic automaton, and if X and U are finite, then it is a deterministic finite automaton (DFA). Without F, (X, U, τ , x 0 ) also satisfies the definition of the temporal filter of LaValle (2012, 4.2.3). In this case X is the information space or the I-space (usually denoted by I instead of X), and U is the observation space (usually denoted by Y instead of U).

. . Information spaces and filters
We can reformulate the notion of a history information space introduced by LaValle (2006) as follows. In this context, X is an external state space that characterizes the robot's configuration, velocity, and environment, U is an action space, f is a state transition mapping that produces a next state from a current state and action, h is a sensor mapping that maps states to observations, and Y is a sensor observation space. As in LaValle (2006), for each x ∈ X, let (x) be a finite set of "nature sensing actions" and for each x ∈ X and u ∈ U let (x, u) be a finite set of "nature actions." Let X = {(x, ψ) | ψ ∈ (x)} and let h : X → Y be a "sensor mapping" where Y is a set called the "observation space." Let X = {(x, u, θ ) | θ ∈ (x, u)} and let f : X → X be the "transition function." The following definition is an adaptation from LaValle (2006).
Suppose for each x, y ∈ X there is at most one u ∈ U with x u → y. Let and let l : E T → U be defined so that l((x, y)) is the unique u such that x u → y. Then (X, E T , l, x 0 ) with x 0 ∈ X is a passive I-state graph as in O'Kane and Shell (2017, Def 1).
The following definition is more of a notational than mathematical value.
. /fnbot. . Definition 2.6. Let X = (X, U, T) be a transition system. If for all (x, u) ∈ X × U there is a unique y ∈ X with (x, u, y) ∈ T, then we denote the function (x, u) → y by τ , and write (X, U, τ ) instead of (X, U, T). In this case we call X an automaton. Note that usually in computer science literature an automaton is finite and also has an initial state and a set of accepting states, but we do not have those in our definition. For automata we also use the notation x * u = τ (x, u) and if u = (u 0 , . . . , u k−1 ), then x * ū is defined by induction for k > 1 as follows: x * (u 0 , . . . , u k−1 ) = (x * (u 0 , . . . , u k−2 )) * u k−1 .
Examples 2.7. Automata and transition systems can model agent-environment and related dynamics.
1. If (X, ·) is a group, U ⊆ X is a set of generators, and τ (x, u) = x · u, then (X, U, τ ) is an automaton. For example, consider the situation in which X = Z × Z and U = {a, b, a −1 , b −1 } in which a = (1, 0) and b = (0, 1). Thus, X is presented with generators a, b, and relation a · b = b · a. This models an agent moving without rotation in an infinite 2D-grid and the agent can move left, right, up and down. There are no obstacles. The standard Cayley graph is equivalent to the graph based representation of the automaton. 2. Let U * be the set of all finite sequences ("strings") of elements of U. Ifū = (u 0 , . . . , u k−1 ) ∈ U * and u k ∈ U, we denote bȳ u ⌢ u k the concatenation (u 0 , . . . , u k−1 , u k ). Ifū 0 ,ū 1 ∈ U * , thenū 0 ⌢ū 1 is similarly the concatenation of two strings. The operation of concatenation turns U * into a monoid. Suppose τ : X ×U * → X is an action of the monoid U * on X meaning that it satisfies τ (τ (x,ū),ū ′ ) = τ (x,ū ⌢ū′ ) and τ (x, ∅) = x. Then the automaton (X, U, τ ) is a discrete-time control system. A sequence of controlsū = (u 0 , . . . , u k−1 ) produces a unique trajectory (x 0 , . . . , x k ), given the initial state x 0 by induction: x i+1 = τ (x i , u i ) for all i < k. 3. Consider an automaton (X, U, τ ) in which U is a group, and τ is a group action of U on X. In some situations it can be natural to consider the set of motor-outputs of an agent to be a group: the neutral element is no motoroutput at all, every motor-output has an "inverse" for which the effect is the opposite, or negating (say, moving right as opposed to moving left), the composition of movements is many movements applied consecutively. The action τ of U on X is then the realization of those motor-outputs in the environment. In realistic scenarios, however, this is not a good way to model the sensorimotor interaction because of the following reason. Suppose the agent has actions "left" and "right, " but it is standing next to an obstacle on its left. Then moving "left" will result in staying still (because of the obstacle), but moving "right" will result in actually moving right, if there is no obstacle at the right of the agent. In this situation the sequence "left-right" results in a different position of the agent than the sequence "right-left, " so if "left" and "right" are each other's inverses in G, then the axioms of group action are violated.
4. Note that if T = ∅, then (X, U, T) is a transition system. 5. Let X = {0, 1} * as in (2), U = {0}, and (x, 0, y) ∈ T if and only if |y| = |x| + 1, then (X, U, T) is a transition system, where |x| is the length of the string x. 6. If (X, U, T) is a transition system and E ⊆ X an equivalence relation, then (X/E, U, T/E) is a transition system, where | (x, u, y) ∈ T}, and / denotes a quotient space; see Definition 2.33.

. . Sensorimotor systems
Next, we will define a sensorimotor system, which is a special case of a transition system. Following the tenet (EA1) that "environment is inseparable from the body which is inseparable from the brain, " our sensorimotor systems can model any part of the environment-body-brain coupling. The model that describes the environment differs from the one that describes the agent merely in the type of structure it possesess, but not in an essential mathematical way.
SM-systems can be thought of as a partial specification of (some part of) the brain-body-environment coupling. Physicalist determinism demands that under full specification we are left with a deterministic system. A specification is partial when it leaves room for unknowns in some, or all, parts of the system. Definition 2.8. A sensorimotor system (or SM-system) is a transition system (X, U, T) where U = S × M for some sets S and M, which we call in this context the sensory set and the motor set, respectively.
The interpretation is that if x (s,m) −→ y, then s is the sensation that either occurs at x, or along the transition to the next state y, and m the motor action which leads to the transition. We will show later how SM-systems can be connected together (Definition 2.22) to form coupled systems. Sometimes an SMsystem is modeling a brain-body totality, and other times it is modeling body-environment totality. A coupling between these two will model the brain-body-environment totality. This is a flexible framework which enables enactivist-style analysis. We do not assume that the agent "knows" the effect of a given m ∈ M or that the "meaning" of a given s ∈ S. The sets S and M are purely mathematical sets denoting the interface between the agent and the environment from the third person perspective.
In fact, the sensory and motor components can be decoupled which might be more natural from the mathematics' point of view in some cases. The following shows that we can look at it both ways.
This means a full specification of the environment, the agent's body, its brain, their coupling, as well as the initial states.
Frontiers in Neurorobotics frontiersin.org . /fnbot. . Definition 2.9. An asynchronous SM-system is a transition system (X, U, T) such that there exist partitions U = S ∪ M and X = X s ∪ X m such that for all (x, u, y) ∈ T we have 1. if x ∈ X s , then u ∈ S, 2. if x ∈ X m , then u ∈ M, and 3.
x ∈ X m ⇐⇒ y ∈ X s .
Thus, the state space of a sequential SM-system contains separate sensory states and motor states.
Definition 2.10. Suppose E is an equivalence relation on a set X.
We say that a map f : There is a natural correspondence between SM-systems and their asynchronous counterpart: Theorem 2.11. Let SM and aSM be the classes of SM-systems and asynchronous SM-systems, respectively. There are functions F : SM → aSM and G : aSM → SM such that 1. F and G are isomorphism and bisimulation preserving, 2. restricted to finite systems, F and G are polynomial-time computable, and restricted to the infinite ones they are Borelfunctions in the sense of classical descriptive set theory (Kechris, 1994).
Another type of a system, which is in a similar way equivalent to a special case of an SM-system, is a state-labeled transition system which we will introduce next, and prove a similar result, Lemma 2.19.

. . Quasifilters and quasipolicies
The amount of information specified in a given SM-system depends on which part of the brain-body-environment system we are modeling. At one extreme, we specify the environment's dynamics down to the small detail and leave the brain's dynamics completely unspecified. In this case the SM-system will have only one sensation corresponding to each state and the transition to the next state will be completely determined by knowing the motor action. This is, in a sense, the environment's perspective. At the other extreme, we specify the brain completely, but leave the environment unspecified. We "don't know" which sensation comes next, but we "know" which motor actions are we going to apply. This is in a sense the perspective of the agent. The first extreme case is the perspective often taken in robotics and other engineering fields when either specifying a planning problem (Ghallab et al., 2004;Choset et al., 2005;O'Kane and LaValle, 2008), or designing a filter (Hager, 1990;Thrun et al., 2005;LaValle, 2012;Särkkä, 2013) (also known as sensor fusion). This is why we call SM-systems of that sort quasifilters (Definition 2.12). The other extreme is the perspective of a policy. The policy depends on sensory input, but the motor actions are determined (by the policy). This is why we call the SM-systems of the latter sort quasipolicy. The "quasi-" prefix is used because both are weaker and more general notions than those that appear in the literature; see Remarks 2.20 and 2.21.
Another way to look at this is the dichotomy between virtual reality (VR), and robotics. In virtual reality, scientists are designing the (virtual) environment for an agent whereas in robotics they are typically designing an agent for an environment. In the former case the agent is partially specified: the type of embodiment is known (S and M are known) and some types of patterns of embodiment are known (eye-hand coordination). However, the specific actions to be taken by the agents are left unspecified. The job of the designer is to specify the environment down to the smallest detail, so that every sequence of motor actions of the agent yields targeted sensory feedback. The VR-designer is designing a quasifilter constrained by the partial knowledge of the agent's embodiment and internal dynamics. The case for the robot designer is the opposite. She has a partial specification of the robot's intended environment and usually works with a complete specification of the robot's mechanics. She is designing a quasipolicy. For VR-designers the agent is a black box; for roboticists the agent is a white box (Suomalainen et al., 2020) (unless the task is to reverse engineer an unknown robot design). For the environment, the roles are reversed. A similar dichotomy can be seen between biology (in which the agent is a black box) and robotics (in which it usually is a white box).
All the definitions in this section are new.
Definition 2.12. Suppose that (X, S × M, T) is an SM-system with the property that for all x 1 ∈ X there exists s x 1 ∈ S such that for all x 2 ∈ X and all (s, m) ∈ S × M we have that x 1 (s,m) −→ x 2 implies s = s x 1 . Then, (X, S × M, T) is a quasifilter.
In a quasifilter the sensory part of the outgoing edge is unique. The dual notion (quasipolicy) is when the motor part is unique: Before explaining the connections between quasifilter and a filter and quasipolicy and a policy, let us define projections of the sensorimotor transition relation to "motor" and to "sensory": Definition 2.14. Given an SM-system (X, S × M, T), let These are called the motor and the sensory projections, respectively of the sensorimotor transition relation. They are .
/fnbot. . also called the motor transition relation and the sensory transition relation, respectively. The corresponding transition systems (X, M, T M ) and (X, S, T S ) are called the motor and the sensory projection systems.
Definition 2.15. Given a transition system (X, U, T), and x ∈ X, Combining this notation with the one introduced in Example 2.4(2), given x, y ∈ X, we have Mathematically coupling of two transition systems is symmetric [see Theorem 2.24(3)], but from the cognitive perspective there is (usually) an asymmetry between the agent and the environment (which can be evident from some specific properties of the agent and of the environment). Because of the partial symmetry, many properties of an agent can dually be held by the environment and vice versa. The following proposition highlights the duality between quasipolicy and quasifilters: reversing the roles of the environment and the agent.
Proposition 2.16. For an SM-system X = (X, S × M, T) the following are equivalent: Similarly, X is a quasipolicy if and only if O T M (x) = (O T (X)) 1 is a singleton for each x ∈ X.
Proof: A straightforward consequence of all the definitions.

. . State-relabeled transition systems
It will become convenient in the coming framework to assign labels to the states. The elements x of the state space X are already named; thus, our labeling can be more properly considered as a relabeling via a function h : X → L, in which L is an arbitrary set of labels. This allows partitions to be naturally induced over X by the preimages of h. Intuitively, this will allow the state space X to be characterized at different levels of "resolution" or "granularity." Thus, we have the following definition: Definition 2.17. A state-relabeled transition system (or simply labeled transition system) is a quintuple (X, U, T, h, L) in which h : X → L is a labeling function and (X, U, T) is a transition system.
We think of state-relabeled to be a more descriptive term, but we shorten it in the remainder of this paper to being simply labeled.
Remark 2.18. In an analogy to Definition 2.6, a labeled transition system is a labeled automaton, if T happens to be a function; in other words, for all (x, u) ∈ X × U there is a unique y ∈ X with (x, u, y) ∈ T. In this case we may denote this function by τ : (x, u) → y and work with the labeled automaton (X, U, τ , h, L). For example, the temporal filter in Section 2.1 is a labeled automaton.
The isomorphism and bisimulation relations are defined similary as for transition systems, but in a label-preserving way.
One intended application of a labeled transition system (X, U, T, h, L) is that h is a sensor mapping, L is a set of sensor observations, and U is a set of actions. Thus, actions u ∈ U allow the agent to transition between states in X while h tells us what the agent senses in each state. We intend to show that this can be seen as a special case of an SM-system by proving a theorem similar to Theorem 2.11, but stronger, namely these corresponces preserve isomorphism: Lemma 2.19. Let F be the class of quasifilters, P the class of quasipolicies, and L the class of labeled systems. Then there are one-to-one maps LTS P : P → L and LTS F : F → L such that 1. LTS P and LTS F are isomorphism and bisimulation preserving, 2. restricted to finite systems, LTS P and LTS F are polynomialtime computable, and restricted to the infinite ones they are Borel-functions in the sense of classical descriptive set theory.
Proof: See Appendix B Remark 2.20. Let X = (X, S × M, T) be a quasifilter and X ′ = LTS F (X ) = (X, M, T M , h, S) as in Lemma 2.19. Suppose further that for each x, y ∈ X there is at most one u ∈ U with x u → y. Let Then (X, M, E T , x 0 ) coincides with the definition of a filter (O'Kane and Shell, 2017, Def 3). If it is also an automaton, meaning that above we replace "at most one" by "exactly one, " then every sequence of motor actions (m 0 , . . . , m k−1 ) determines a unique resulting state x k−1 ∈ X. This is analogous, and can be proved in the same way, as the fact that each sequence of sensory data determines a unique resulting state in Remark 2.21 below.
Remark 2.21. Usually, a policy is a function which describes how an agent chooses actions based on its own past experience. Thus, if M is the set of motor commands and S is the set of sensations, a policy is a function π : S * → M where S * is the set of finite sequences of sensory "histories"; see for example (LaValle, 2006). Now, suppose that an SM-system X = (X, S × M, T) is a quasipolicy in the sense of Definition 2.13 and let x → m x be as in that Definition. Assume further that X is an automaton (Section 2.1) and let τ : X × (S × M) → X be the corresponding transition function so that for all x ∈ X and (s, m) ∈ S × M we have (x, (s, m), τ (x, (s, m))) ∈ T. Let x 0 ∈ X be an initial state. We will show how the pair (X , x 0 ) defines a function π : S * → M in a natural way. Lets = (s 0 , . . . , s k−1 ) ∈ S k be a sequence of sensory data. If k = 0, and sos = () = ∅, let π (s) = m x 0 . If k > 0, assume that π (s 0 , . . . , s k−2 ) and x k−1 are both defined (induction hypothesis). Then let The idea is that because of the uniqueness of m x , a sequence of sensory data determines (given an initial state) a unique path through the automaton X .

. . Couplings of transition systems
The central concept of this work pertaining to all principles (EA1)-(EA5) is the coupling of SM-systems. We define coupling, however, for general transition systems with the understanding that our most interesting applications will be for SM-systems where U 0 = U 1 = S × M. The idea is that in every transition there is a sensory component and a motor component. The set S could be thought of as all possible events that trigger afferent nervous signals, or their combinations. The elements of M are those events that are triggered by efferent nervous signals. This is an abstract space and in transitioning from one state to another some subset of S × M is "active." If we know little of what kind of sensory data the agent receives during the transition, then that transition will occupy a subset of S × M whose projection to the Scoordinate is large. If, on the other hand we know a lot, and can specify the exact sensory data, then the projection to the S-coordinate is small. Vice versa, if we do not know which motor actions lead from one state to another, then the projection of the corresponding subset to the M-coordinate is large etc. This was made more precise in Section 2.4. The fact that the transition consists of pairs (s, m) where s is a sensory input and m is a motor command does not mean that the agent is equipped with the semantics of what m "means, " or what it "does" in the world. The effect of m is "computed" by the environment and the agent only receives the next "s" as the feedback. It might have been more intuitive, but more cumbersome to make this definition in terms of functions that map events of the environment to sensory stimuli and internal events of the nervous system to motor actions, and further functions that map the motor actions to the actual events in the environment, etc., but from the point of view of essential mathematical structure these extra identifications wouldn't add anything qualitatively new.
Example 2.23. A simple example of coupling is illustrated in Figure 2.
Mathematically the coupling is a product of sorts. If we think of one transition system as "the environment" and the other as "the agent, " then the coupling tells us about all possible ways in which the agent can engage with the environment. The fact that the state space of the coupled system is the product of the state spaces of the two initial systems reflects the fact that the coupled system includes information of "what would happen" if the environment was in any given state while the agent is in any given ("internal") state.
We immediately prove the first theorem concerning coupling: 1} are four SM-systems. Then the following hold: Coupling provides an interesting way to compare SM-systems from the "point of view" of other SM-systems. For example, given an SM-system E one can define an equivalence relation on SM-systems by saying that If E is the "environment" and I, I ′ are "agents, " this is saying that the agents perform identically in this particular environment. Or vice versa, for a fixed I, the relation E * I = E ′ * I means that the environments are indistinguishable by the agent I.
Remark 2.25. In the definition of coupling we see that the two SM-systems constrain each other. This is seen from the fact that in the definition we take intersections. For example, when an agent is coupled to an environment, it chooses certain actions from a large range of possibilities. In this way the agent structures its own world through the coupling (EA3). To make this notion further connect to enactivist paradigm, we invoke the dynamical systems .
approach to cognition (Tschacher and Dauwalder, 2003). An attractor in a transition system X = (X, U, T) is a set A ⊆ X with the property that for all infinite sequences there are infinitely many indices n such that x n ∈ A. There could be other possible definitions, such as "for all large enough n, x n ∈ A". For the present illustration purposes it is, however, irrelevant. It could be the case that A ⊆ X is not an attractor of X , but after coupling with X ′ = (X ′ , U ′ , T ′ ), A × X ′ may be an attractor of X * X ′ . Thus, if X is the environment and X ′ is the agent and A is a set of desirable environmental states, then we may say that the agent is well attuned to X , if A was not initially an attractor, but in X * X ′ , then A × X ′ becomes one. It could also be that the agent needs to arrive to A while being in a certain type of an internal state B ⊆ X ′ , for example, if A is "food" and B is "hungry". Then it is not important that A × X ′ is an attractor, but it is imperative that A × B is one.

. . Unconstrained and fully constrained SM-systems
As we mentioned before, the information specified in an SMsystem depends on which part of the brain-body-environment system we are modeling. In the extreme case we do not specify anything, except for the very minimal information. Consider a body of a robot for which the set of possible actions (or motor commands) is M and the set of possible sensor observations is S. Suppose that is all we know about the robot. We do not know what kind of environment it is in and we do not know what kind of "brain" (a processor or an algorithm) it is equipped with. Thus, we do not know of any constraints the robot may have in sensing or moving. We then model this robot as an unconstrained SM-system: There are many intuitions behind the above. An unconstrained system is one where anything could happen: the agent might perform any actions in any order and the environment could provide the agent with any sensory data. Such a world is reminiscent of white noise. Such a system is only interesting from an abstract mathematical perspective, it is in some sense "maximal". The content of Proposition 2.27 is that such systems are indistinguishable from each other. An unconstrained system has a similar role with respect to all SM-systems as the free group has to other groups, although we haven't made this universality claim precise in the present paper. Intuitively it means that every possible agent-environment combination can be found as a subsystem (or possibly a quotient) of the unconstrained one. The term "unconstrained" refers in particular to that when coupled to other systems, this system doesn't constrain them, so it acts in the same way as 0 in arithmetic addition (Proposition 2.29). The opposite is the fully constrained system (Definition 2.31, Proposition 2.32). In that case, the intuition is the opposite: in environments where nothing happens and actions do not have any effects, any agent is as good as any other and vice versa: agents that don't do anything are equivalent. Corollary 2.30. If X and X ′ are SM-systems and X ′ is unconstrained, then X * X ′ ∼ X .
The opposite of an unconstrained system is a fully constrained one: Proposition 2.32. Dually to the propositions above, we have that (1) all fully constrained systems are bisimulation equivalent to each other, (2) the simplest example being λ = ({0}, S×M, ∅), and (3) if X = (X, S × M, T) is another SM-system, then X * λ ∼ λ.
All transition systems are in some sense between the fully constrained and the unconstrained, these being the two theoretical extremes.

. . Quotients of transition systems
When considering labelings and their induced equivalence relations, it will be convenient to develop a notion of quotient systems, analogous to quotient spaces in topology. Suppose X = (X, U, T) is a transition system and E is an equivalence relation on X. We can then form a new transition system, called the quotient of X by E in which the new states are E-equivalence classes and the transition relation is modified accordingly.
The following definition of a quotient is standard in Kripke model theory, especially bisimulation theory: Definition 2.33. Suppose X = (X, U, T) and E are as above. Let The following definition is inspired by the idea of sensory pre-images, see LaValle (2019), but is also needed for technical reasons.

Definition 2.34. Given any function
The equivalence relation E h partitions X according to the preimages of h, as considered in the sensor lattice theory of LaValle (2019). The partition of X induced by h directly yields an quotient transition system by applying the previous two definitions: Definition 2.35. Let X = (X, U, T) be a transition system and h : X → L be any mapping. Then define X /h to be X /E h where we combine Definitions 2.34 and 2.33.
Proposition 2.36. If h is one-to-one, then X /h ∼ = X .
Proof: h is one-to-one if and only if E h is equality, in which case it is straightforward to verify that the function x → [x] E h is an isomorphism.
For h : X → L, the transition system (X/h, U, T/h) is essentially a new state space over the preimages of h. In this case X /h is called the derived information space (as used in LaValle, 2006). More precisely: Then (X/h, U, T/h) is isomorphic to (L ′ , U, T ′ ) via the isomorphism f : The intuitive meaning of the quotient is the following. There is a Soviet comedy film from the 1970's where the main character ends up in an apartment in Leningrad, while he thinks that he is actually in Moscow. The apartement in Leningrad is identical to his home in Moscow and he cannot distinguish between them. He thinks for a while that he is at his home in Moscow while being in an apartment in Leningrad. Even his key from Moscow worked for the Leningrad apartment. The pun is that in Soviet times all houses were built according to the same blueprint. Now, before he realized his situation, as far as he was concerned, he was in Moscow. He thought he came to the same place in the evening as in the morning, while he actually didn't. The idea of the quotient captures exactly that: We identify those states that "look the same" (the label is the same) even though they are actually different states. In fact, let us look at a cognitive system on several levels of granularity: When I type on my laptop at home or in a cafeteria, my fingers experience the keyboard in (approximately) the same way. As far as my fingers (and associated motor areas) are concerned, we can identify all situations where they are pressing keys on my keyboard. On a higher level, I might be coming home after a 10 h time and experience as if I am in the same place, but we all know that the planet, on which my home is, has moved, so I actually am not in the same place, just like the main character in the movie referenced above.

. Illustrative examples of SM-systems
We next illustrate how sensorimotor systems model body-environment, brain-body, and brain-body-environment couplings. Consider a body in a fully understood and specified deterministic environment. In this case the body-environment system will be modeled by a quasifilter, Definition 2.12. Instead of using the quasifilter definition, we work with a labeled transition system which, according to Proposition 2.19, is equivalent. According to the assumption of full specification, we will in fact work with labeled automata.
The body has a set M of possible motor actions each of which has a deterministic influence on the body-environment dynamics. Denote the set of body-environment states by E 0 . Whenever a motor action m ∈ M is applied at a bodyenvironment state e ∈ E, a new body-environment state A(e, m) ∈ E is achieved. At each state e ∈ E the body senses data σ (e). Denote the set of sensations by S. In this way, the labeled automaton E 0 = (E, M, A, σ , S) models this body-environment system. This model is ambivalent toward the agent's internal dynamics, its strategies, policies and so on, but not ambivalent toward its embodiment and its environment's structure. In fact, it characterizes them completely.
Alternatively, consider a brain in a body, and suppose that the brain is fully understood and deterministic (for example, perhaps it is designed by us), but we do not know which environment it is in. We model this by an SM-system which is a quasipolicy. Again, by the analogous considerations as above, we work directly an equivalent labeled automaton specification. Denote the set of internal states of the brain by I. The agent's internal state is a function of the sensations; therefore, let B : I × S → I be a function (B stands for brain) that takes one internal state to another based on new sensory data. At each internal state, the agent produces a motor output which is an element of the set M; therefore, let µ : I → M be a function assigning a motor output to each internal state. Now, I = (I, S, B, µ, M) is a labeled transition system modeling this agent. It is ambivalent toward the type of the environment the agent is in, but it is not ambivalent toward the agent's internal dynamics, policies, strategies and so on; in fact, it determines them completely. Now, the coupling of the environment E and the agent A is the SM-system obtained as The sensory and motor sets S and M capture the interface between the brain and the environment because they characterize the body (but not the embodiment).
Example 3.1. Consider an agent that has four motor outputs, called "up" (U), "down" (D), "left" (L), and "right" (R), and there is no sensor feedback (this defines the body). In Corollary 2.28 we gave a minimal example of an unconstrained SM-system. On the other extreme one can give large examples. For instance the free monoid generated by the set M = {U, D, L, R}.
Let X be the set of all possible finite strings in the four "letter" alphabet M, let T = {(x, m, y) | x ⌢ m = y}. "No sensor data" is equivalent to always having the same sensor data; thus, we can assume that S = {s 0 } is a singleton and the sensor mapping h : X → S is constant. The resulting unconstrained transition system U = (X, T, M, σ , S) can be represented by an infinite quaternary tree, shown in Figure 3A.
Suppose that this body is situated in a 2 × 2 grid. The body can occupy one of the four grid's squares at a time, and when it applies one of the movements, it either moves correspondingly, or, if there is a wall blocking the movement, it doesn't. This defines the body-environment system. The set of states is now E and has four elements corresponding to all the possible positions of the body. The transition function A : E × M → E tells where to move, and the rest is as above. The system E = (E, A, M, σ , S) is shown in Figure 3B. Let us now look at the agent. Suppose that it applies the following policy: (1) In the beginning move left; (2) if the previous move was to the left, then move right, otherwise We do not mean to say that no data is always the same as some other data. We are talking here about an agent that never receives any data, or an agent that always receives the same data. Thus, it cannot rely on any "change" between having and not having any sensory input. Thus, there is no "presense in absence" paradox here.
If the agent has a different embodiment in the same environment, then all of the automata will look different. Suppose that instead of the previous four actions, the agent has two: "rotate 90-degrees counterclockwise" (C),"forward one step" (F). Note that these are expressed in the local frame of the robot: It can either rotate relative to its current orientation, or it can move in the direction it is facing; the previous four actions were expressed as if in a global frame or the robot is incapable of rotation. Under the new embodiment, the unconstrained automaton with no sensor feedback is an infinite binary tree, with every node having two outgoing edges, labeled C and F, respectively, instead of the quaternary infinite tree depicted on Figure 3A. Instead of the four-state automaton of Figure 3, the automaton describing the environment transitions is a 16 state-automaton, because the orientation of the agent can now have four different values. See Figure 4B. Finally the automaton describing the internal mechanics of the agent I is a quasipolicy in these two actions, and finally, the coupling corresponds essentially to taking a path in the 16-state automaton above.
Note that there is a bisimulation between U and E which reflects the fact that from the point of view of an agent they are indistinguishable. This is natural because there is no sensory data, so from the agent's viewpoint it is unknowable whether or not it is embedded in an environment. A bisimulation is given as follows: Let y 0 ∈ Y be the top-right corner and x 0 ∈ X the root of the tree. Define R ⊆ X × Y be the minimal set satisfying the following conditions: 2. If (x, y) ∈ R and m ∈ M, then (T(x, m), U(y, m)) ∈ R.
Example 3.2. The 16-state automaton of Example 3.1 has four automorphisms corresponding to the rotation of the environment by 90 degrees counterclockwise. Each of those automorphisms corresponds to an auto-bisimulation. Mirroring is not an automorphism because the agent's rotating action fixes the orientation of the automaton. Example 3.3. Figure 5 shows an example of how an automaton with non-trivial sensing could look. Jumping a little bit ahead, it will be seen that the labeling provided by h in this figure is not sufficient (a notion introduced in Definition 4.2).

. Su cient refinements and degree of insu ciency
This section presents the concept of sufficiency, which will be the main glue between enactivist philosophy and mathematical understanding of cognition. In Section 4.1 we introduce the main concepts and explain its profound relevance to enactivist modeling and how it can be a precursor to the emergence of meaning from meaningless sensorimotor interactions. In Section 4.2 we introduce the notion of minimal sufficient refinements, prove a uniqueness result about them, and show how they are connected to the classical notions of bisimulation as well as derived information state spaces .
There could be an interesting relationship between this concept and the free energy principle proposed by K. Friston. A system which is attuned to its environment in a su cient way can be interpreted by an inspector as a system that is making ] predictions about its environment.

. . Su ciency
The following consider the main definition of this work. It is based on the idea of sufficiency in LaValle (2006, Ch.11).
Definition 4.1. Let (X, U, T) be a transition system and E ⊆ X × X an equivalence relation. We say that E is sufficient or completely sufficient, if for all (x, y) ∈ E and all u ∈ U, if (x, u, x ′ ) ∈ T and (y, u, y ′ ) ∈ T, then (x ′ , y ′ ) ∈ E.
This means that if an agent cannot distinguish between states x and y, then there are no actions it could apply to later distinguish between them. To put it differently, if the states are indistinguishable by an instant sensory reading, then they are in fact indistinguishable even through sensorimotor interaction. This is related to the equivalence relation known as Myhill-Nerode congruence in automata theory.
The equivalence relation of indistinguishability in the context of sensorimotor interactions is at its simplest the consequence of indistinguishability by sensors. Thus, we define sufficiency for labelings or sensor mappings: Definition 4.2. A labeling h : X → L is called sufficient (or completely sufficient) iff for all x, y, x ′ , y ′ ∈ X and all u ∈ U, the following implication holds: Proposition 4.3. If (X, U, τ ) is an automaton, then h : X → L is sufficient if and only if for all x, y ∈ X and all u ∈ U, we have that if h(x) = h(y), then h(τ (x, u)) = h(τ (y, u)). Proof: Checking the definitions.
The above proposition is saying that when the sensorimotor system is deterministic, then sufficiency is equivalent to predictability.
There is a connection with the classical notion of bisimulation in classical transition systems theory (recall Definition 2.2): . /fnbot. .

Proposition 4.4. An equivalence relation on a state space of an automaton (X, U, τ ) is sufficient if and only if it is an autobisimulation.
Proof: See Appendix B.
The above proposition can intuitively be interpreted as saying that a sufficient relation is one where different states with the same label are not only indistinguishable on their own, but are actually indistinguishable even by their consequences. Starting from one of two states with same labels, there is no way to ever find out which one of them it was, no matter how much will the agent investigate its environment, compare to the discussion in the end of Section 2.8.
Proposition 4.5 below is an important proposition on which the idea of derived I-spaces and combinatorial filters builds upon (LaValle, 2006(LaValle, , 2012O'Kane and Shell, 2017), although as far as the authors are aware, in the literature, only the "if "-direction is mentioned. We say that a transition system (X, U, T) is full, if for all x 1 ∈ X and all u ∈ U there exists at least one x 2 ∈ X with (x 1 , u, x 2 ).
Proposition 4.5. Suppose X = (X, U, T) is a transition system. Let h : X → L be a labeling. Then X /h is an automaton if and only if X is full and h is sufficient.
Proof: See Appendix B.
The above proposition brings together the ideas of a quotient, automaton and sufficiency. The idea of the quotient is that indistinguishable states can be in some circumstances considered the same and the idea of an automaton is that it is deterministic. The above proposition says that as far as the agent is concerned, if it equalizes indistinguishable states, then the world looks deterministic from the agent's perspective if and only if the underlying labeling satsifies Definition 4.2.
The sufficiency of an information mapping was introduced in LaValle (2006, Ch 11), and is encompassed by a sufficient labeling in this paper. In the prior context, it has meant that the current sensory perception together with the next action determine the next sensory perception. The elegance with respect to our principle (EA2) is that sufficiency is not saying that the agent's internal state corresponds to the environment's state (as is in representational models). Nor is it saying that the agent predicts the next action. It is saying, rather, that the agent's current sensation together with a choice of a motor command determine the agent's next sensation; and this statement is true only as a statement made about the system from outside, not as a statement which would reside "in the agent." The sensation may carry no meaning at all "about" what is actually "out there." However, if the agent has found a way to be coupled to the environment in a sufficient way, then sensations begin to be about future sensation. In this way meaning emerges from sensorimotor patterns. This relates to (EA3) and somewhat touches on the topic of perception (EA5). Furthermore, the property of determining future outcomes is related to (EA4) because that is what skill is. There is no potential to reliably engage with the environment in complex sensorimotor interactions, if the sensations do not reliably follow certain historical patterns.
Thus, the notion of sufficiency is considered by us to be of fundamental importance for enactivist-inspired mathematical modeling of cognition. The violation of sufficiency means that the current sensation-action pair does not correlate with the future sensation, making it harder to be attuned to the environment. Having a different sensation following the same pattern can be seen as a primitive notion of a "surprise." This can be seen as aligning with the predictive coding and the free energy principle from neuroscience (Rao and Ballard, 1999;Friston and Kiebel, 2009;Friston, 2010), although our framework leaves the space to a clean non-representational interpretation while this is not obvious for these other frameworks. Does the notion of sufficient labelings capture the same ideas in a more general way? This is an open question for further research.
A generalization of sufficiency is n-sufficiency, in which the data of n previous steps is needed to determine the next label. Here, we define an n-chain.
Definition 4.6. An n-chain in X = (X, U, T) is a sequence c = (x 0 , u 0 , · · · , x n−1 , u n−1 , x n ) ∈ (X × U) n × X such that x i u → x i+1 for all i < n. If n = 0, then by convention c = (x n ). Let E ⊆ X × X be an equivalence relation. Let k < n. We say that two n-chains c = (x 0 , u 0 , . . . , x n−1 , u n−1 , x n ), An ∞-chain is defined in the same way as n-chain, except the sequences are infinite, without the "last" x n .
Definition 4.7. For a transition system X = (X, U, T), an equivalence relation E on X is called n-sufficient if there are no two (T, E, n)-equivalent n-chains c = (x 0 , u 0 , . . . , x n−1 , u n−1 , x n ) and This enables us to define the degree of insufficiency: Definition 4.11. The degree of insufficiency of the labeled automaton X = (X, U, τ , h, L) is defined to be the smallest n such that h is n-sufficient, if such n exists, and ∞ otherwise. Denote the degree of insufficiency of X by degins(X ), or degins(h) if only the labeling needs to be specified and X is clear from the context.
The intuition is that the larger the degree of insufficiency of an environment X , the harder it is for an agent to be attuned to it. We talk more about the connection between attunement and sufficiency in the following sections.

. . Minimal su cient refinements
In this section we prove that the minimal sufficient refinements are always unique (Theorem 4.19). This will follow from a deeper result that the sufficient equivalence relations form a complete sublattice of the lattice of all equivalence relations. This does not hold for n-sufficient equivalence relations for n > 1 (Example 4.20). We will then explore how the minimal sufficient refinements can be thought of as an enactive perceptual construct that emerges from the body-environment, brain-body, and brain-body-environment dynamics. The idea is that a minimal sufficient refiniment corresponds to an optimal attunement of the agent to the base labeling which corresponds to some minimal information that the agent is interested in the environment, such as death or life, danger or safety information. It is "optimal" by minimality and "attunement" by sufficiency. Our Theorem 4.19 states that such attunement is mathematically unique.

Definition 4.12. An equivalence relation
An important interpretation of the concept of a refinement is that a better sensor provides the agent with more information about the environment . Each sensor mapping h induces a partition of X via its preimages, and refinement applies in the usual set-theoretic sense to the partitions when comparing sensors mappings. If a sensor mapping h is a refinement of h ′ , then it enables the agent to react in a more refined way to Here we are not talking about contentful or semantic information, but merely about correlational information in the philosophical sense. nuances in the environment. Using the partial ordering given by refinements, we obtain the sensor lattice (LaValle, 2019).
By a referee's request, let us give a couple of biological examples.
Example 4.13 (First biological example). There is an accepted theory that primates see red color wavelength, because it enables them to distinguish ripe fruit from non-ripe. Assuming this theory is true, it is an example of a refinement which is to some extent "minimal" and to some extent "sufficient" (of course strictly speaking it is neither -in the same way as there is no ideal circle in the physical world). The minimality is seen in this example, because we perceive other things as red, even if it is completely unnecessary (certainly unnecessary to tell the ripeness of fruits). So we are not distinguishing "too much." On the other hand, perceiving red color is a refinement of ripe/non-ripe which is only detected through stomach ache after the fruit has been already consumed. And it is sufficient in the sense that it is predictive of the original "base" labeling (ripe/non-ripe).
Example 4.14 (Second biological example). Where our eyes look depends on the position of our head as well as the position of our eyes. Despite this, "looking up" (or "left, " "right" etc..) are not ambiguous, even though these can be achieved with virtually infinitely many different head-eye configurations. One way to understand how this invariance could emerge is through minimal sufficient refinements. Suppose at birth, every headeye configuration is considered as a separate state, but we label them by what we see in any given (stable) situation. A minimal sufficient refinement of that labeling will never distinguish between different states in which the eyes are pointing in the same direction. So then, by learning the minimal sufficient refinements, the agent may learn eye-direction invariance.

. . Lattice of su cient equivalence relations
Please refer to Appendix A in the Supplementary material for notations and definitions used in this section.
We will prove in this section that if (X, U, τ ) is an automaton, the sufficient equivalence relations form a complete sublattice of (E(X), ⊆). Given an automaton X = (X, U, τ ), denote by E U,τ suf (X) ⊆ E(X) the set of sufficient equivalence relations on X. When U and τ are clear from the context, we write just E suf (X) = E U,τ suf (X).
Theorem 4.15. Suppose (X, U, τ ) is an automaton and suppose that E ⊆ E suf (X) is a set of sufficient equivalence relations. Then E and E are sufficient. Thus, (E suf (X), ⊆) is a complete sublattice of (E(X), ⊆).
Suppose that a labeling h is very important for an agent. For example, h could be "death or life, " or it could be relevant for a robot's task. Suppose that h is not sufficient. The robot may want to find a sufficient refinement of h. Clearly a oneto-one h ′ would do. However, assume that the agent has to use resources for distinguishing between states; thus, the fewer distinctions the better. This motivates the following definition. Recall Definition 4.12 of refinements.
Definition 4.16. Let (X, U, T) be a transition system and E 0 ⊆ X × X an equivalence relation. A minimal sufficient refinement of E 0 is a sufficient equivalence relation E which is a refinement of E such that there is no sufficient E ′ with E 0 r E ′ < r E.
Given a labeling h 0 of a transition system (X, U, T), a minimal sufficient refinement of h 0 is a labeling h such that E h is a minimal sufficient refinement of E h 0 (recall Definition 2.34). Example 4.18. Let X be as above and let h : Then h ′ is a minimal sufficient refinement of σ .
Theorem 4.19. Consider an automaton X = (X, U, τ ) and let E 0 be an equivalence relation on X. Then a minimal sufficient refinement of E 0 exists and is unique.
Let E 0 be an equivalence relation on X such that the equivalence classes are {0, 1, 3, 4}, {2} and {5}. Then this relation is not 2-sufficient, because (0, u 0 , 1, u 0 , 2) and (3, u 0 , 4, u 0 , 5) are (T, E 0 , 2)-equivalent, but 2 and 5 are not E 0 -equivalent. Let E 1 , E 2 ⊆ E 0 be equivalence relations with equivalence classes as follows: Then E 1 and E 2 are refinements of E 0 . They are both 2-sufficient, because there doesn't exist any (T, E 1 , 1) or (T, E 2 , 1) equivalent 2-chains. They are also both r -minimal with this property which can be seen from the fact that they are actually rminimal refinements of E 0 as equivalence relations (not only as sufficient ones). . Then E 0 is not sufficient, because (0, 2) ∈ E 0 , but (3, 4) / ∈ E 0 . Let E 1 and E 2 be the refinements of E 0 with the following equivalence classes: Now it is easy to see that both E 1 and E 2 are sufficient refinements of E 0 , and by a similar argument as in Example 4.20 they are both minimal. The reason why this is possible is the odd behavior of the state 2 which doesn't have out-going connections. Such odd states are the reason why the decision problem "Does there exist a sufficient refinement with k equivalence classes?" is NP-complete for finite transition systems (O'Kane and Shell, 2017).
Remark. It is worth noting that Theorems 4.15 and 4.19 do not assume anything about the cardinality of X or of U, other structure on them (such as metric or topology) nor anything about the function τ or the relation E 0 . Keeping in mind potential applications in robotics, X and U could be, for instance, topological manifolds, and τ a continuous function, or X could be a closed subset of R n , U discrete and τ a measurable function, or any other combination of those. In each of those cases, the sublattice of sufficient equivalence relations is complete, as per Theorem 4.15, and every equivalence relation E 0 on X admits a unique minimal sufficient refinement as per Theorem 4.19.
Recall Definition 2.10 of an equivalence relation preserving function. We say that an equivalence relation E on X is closed Theorem 4.23. If f is an automorphism of the automaton (X, U, τ ), then E f is a sufficient equivalence relation.  Figure 6A. Consider two agents in this environment. Both are equipped with the same h sensor, but their action repertoires differ. Both have two possible actions. One has actions L = "move left one space" and R = "move right one space, " and the other one has actions T = "turn 180 degrees" and F = "go forward one space." Let M 0 = {L, R} and M 1 = {T, F}. Thus, these agents have a slight difference in embodiment. Although both of them can move to every square of the lattice in a very similar way (almost indistinguishable from the outside perspective), we will see that the differences in embodiment will be reflected in that the minimal sufficient refinements will produce nonequivalent "categorizations" of the environment. The structures that emerge from these two embodiments will be different. These agents enact different environments, although physically the environments are the same, as congruent with tenet (EA3). First, we define the SM-systems that model these agents' embodiments in E. The first agent does not have orientation. It can be in one of the five states, and the state space is X 0 = E ( Figure 6B). For the second agent, the effect of the F action depends on the orientation of the agent (pointing left or pointing right). Thus, there are ten different states the agent can be in, yielding X 1 = E × {−1, 1} ( Figure 6C). The effects of motor outputs are specified completely (L means moving left, and so on), whereas the agent's internal mechanisms are left completely open, so our systems will be quasifilters. According to Remark 2.18, we can work with a labeled automaton instead. Hence, let τ 0 : X 0 × M 0 → X 0 be defined by τ 0 (x, L) = max(x − 1, −2) and τ 0 (x, R) = min(x + 1, 2). For the other agent, let τ 1 ((x, b), T) = (x, −b) and τ 1 ((x, b), F) = (min(max(x + b, −2), 2), b). Now we have labeled automata X 0 = (X 0 , M 0 , τ 0 , h, S) and X 1 = (X 1 , M 1 , τ 1 , h, S).
It is not hard to see that the one-to-one map h 0 : X 0 → {−2, −1, 0, 1, 2} with h 0 (x) = x is a sufficient refinement of h which is minimal (see Figure 7A). Thus, every state needs to be distinguished by the agent for it to be possible to determine the following sensation from the current one. The derived information space automaton X 0 /h 0 isomorphic to X 0 (Proposition 2.36).
Claim. h 1 is a minimal sufficient refinement of h in X 1 .

Proof: See Appendix B
Both minimal sufficient labelings, h 0 and h 1 have five values; thus, they categorize the environment into five distinct statetypes. However, the resulting derived information spaces are different in the sense that the quotients X 0 /h 0 and X 1 /h 1 are not isomorphic; compare Figure 7B with Figure 7D.
Example 4.26. Figure 8A shows a filtering example from Tovar et al. (2014). More complex versions have been studied more recently in O' Kane and Shell (2017), and are found through automaton minimization algorithms and some extensions. It can be shown that this example's four-state derived information space depicted on Figure 8B corresponds to the unique minimal sufficient refinement of the labeling that only distinguishes between "are in the same region" and "are not in the same region." To see this, first note that this labeling is sufficient (since it can be represented as an automaton, this follows from Theorem 4.5). It follows from Theorem 4.19 that if this labeling is not minimal, then there is a minimal one which is strictly coarser, and so can be obtained by merging the states in the automaton of Figure 8B. This is impossible: the state T cannot be merged with anything because it violates the base-labeling; if, say D a and D c , are merged, then transition a will lead to inconsistency as it can lead either to D b (from D c ) or to T (from D a ). This proves that this derived information space is indeed minimal sufficient, and by Corollary 4.19 there are no others up to isomorphism.

. . Computing su cient refinements
This section sketches some computational problems and presents computed examples. The problem of computing the minimal sufficient refinement in some cases reduces to classical deterministic finite automaton (DFA) minimization, and in other cases it becomes NP-hard (O'Kane and Shell, 2017).
Consider an automaton (X, M, τ ) and a labeling function h 0 , and the corresponding labeled automaton described using the quintuple (X, M, τ , h 0 , L). Suppose that the automaton (X, M, τ ) corresponds to that of an body-environment system. Hence, X corresponds to the states of this coupled system. Suppose h 0 is not sufficient and consider the problem of computing a (minimal) sufficient refinement of h 0 , that is, the coarsest refinement of h 0 that is sufficient.
Despite the uniqueness of the minimal sufficient refinement of h 0 (by Corollary 4.19), we can argue that the formulation of the problem, in particular, the input, can differ based on the level at which we are addressing the problem (for example, global perspective, agent perspective or something in between). Since the labeled automaton corresponding to an agent-environment coupling is described from a global perspective, the input to an . /fnbot. . algorithm that addresses the problem from this perspective is the labeled automaton A = (X, τ , M, h 0 , L) itself. Then, the problem is defined as given A compute A ′ = (X, M, τ , h, L) such that h is the minimal sufficient refinement of h 0 . A special case of this problem from the global perspective occurs if the preimages of h 0 partition X in two classes which can be interpreted as the "accept" and "reject" states, for example, goal states at which the agent accomplishes a task and others. Furthermore, suppose that the initial state of the agent is known to be some x 0 ∈ X. Then, computing a minimal sufficient refinement becomes identical to minimization of a finite automaton, that is, given a DFA (X, M, τ , x 0 , F) in which x 0 is the initial state and F is the set of accept states find (X ′ , M, τ ′ , x ′ 0 , F ′ ) such that no DFA with fewer states recognizes the same language. Existing algorithms, for example Hopcroft (1971), can be used to compute a minimal automaton.
Here, we also consider this problem from the agent's perspective for which the information about the environment states is obtained through its sensors, more generally, through a labeling function. Note that by agent's perspective we do not necessarily imply that the agent is the one making the computation (or any computation) but it means that no further information can be gathered regarding the environment other than the actions taken and what is sensed by the agent. At this level we address the following problem; given a set M of actions, a domain X, and a labeling function h 0 defined on X, compute the minimal sufficient refinement of h 0 . The crux of the problem is that unlike the global perspective described above, the labeled automaton A is not given, in particular, the state transitions are not known a priory. Instead, the information regarding the state transitions can only be obtained locally by means of applying actions and observing the outcomes, that is, through sensorimotor interactions. Hence, the current bodyenvironment state is also not observable. To show that an algorithm exists to compute a sufficient refinement of h 0 at this level, we propose an iterative algorithm (Algorithm 1) that explores X through agent's actions and sensations by keeping the history information state, that is, the history of actions and sensations (labels). We then show, by empirical results, that the . /fnbot. .

FIGURE
(A) Two point-sized independent bodies move along continuous paths in an annulus-shaped region in the plane. There are three sensor beams, a, b, and c. When each is crossed by a body, its corresponding symbol is observed. Based on receiving a string of observations, the task is to determine whether the two bodies are together in the same region, with no beam separating them.
(B) The minimal filter as a transition system has only states: T means that they are together, and D x means that are in di erent regions but beam x separates them. Each transition is triggered by the observation when a body crosses a beam.
sufficient refinement computed by Algorithm 1 is minimal for the selected problem.
The functioning of Algorithm 1 is as follows. Starting from an initial sensation s 0 = h(x 0 ), the agent moves by taking an action given by the mapping policy : L → M. Particularly, we used a fixed policy which samples an action m from a uniform distribution over M for each s ∈ S. In principle, any policy that ensures all states that are reachable from x 0 will be visited infinitely often should be enough. The history information state is implemented as a list, denoted by H, of triples (s, m, s ′ ) such that s = h(x) and s ′ = h(x ′ ) in which x ′ = τ (x, m). At each step, it is checked whether the current sensation is consistent with the history (Line 7). Current sensation is inconsistent with the history if there exists a triple (s, m, s ′′ ) in the history such that This can either be in a real environment or in a realistic simulation. s ′ = s ′′ . If it is not consistent then the label is split, which means that h −1 (s) is partitioned into two parts P and Q. In particular, we apply a balanced random partitioning, that is, we select P and Q randomly from a uniform distribution over the partitions of h −1 (s) that have two elements with balanced cardinalities. The labeling function is updated by a split operation as Recall that labels or subscripts do not carry any meaning from the agent's perspective. Even a trivial strategy that splits the preimage of the label seen at each step would succeed computing a sufficient refinement. However, this would result in h being a one-toone mapping. Hence, the finest possible refinement. Splitting only at the instances when an inconsistency is detected might reach a coarser refinement that is sufficient but there might be more equivalence classes than the ones induced by the minimal sufficient refinement of h 0 . Therefore, a merge operation is introduced (Line 10). Let s and s ′ be two distinct labels for which . Let t denote a triple in H and let t k , k = 1, 2, 3, denote the k th element of that triple. Suppose s ′ = s, if there are at least N number of triples in H such that for each triple t, (t 1 , t 2 ) = (s, m) and ∀m ∈ M and ∀t, t ′ ∈ H such that (t 1 , t 2 ) = (t ′ 1 , t ′ 2 ) = (s, m) it is true that t 3 = t ′ 3 then labels s and s ′ are merged. The merge procedure goes through all labels and updates h as for each pair of labels s and s ′ that satisfies the aforementioned condition. Note that in principle, one can merge two labels regardless of the number of occurrences in the history. However, we noticed that this can result in oscillatory behaviour between split and merge operations especially for states that are reached less frequently. At present, we considered N as a tunable parameter and we know that it depends on the cardinality of the state space X such that larger the number of states, larger N should be. The problem of defining N as a function of the problem description remains open.
In the following, we present an illustrative example to show the practical implications of the previously introduced concepts in Section 4.2. In particular, we show how a simple algorithm like Algorithm 1 can be used by a computing unit which relies only on the sensorimotor interactions of an agent to further categorize the environment such that there are no inconsistencies in terms of the actions taken by the agent and the resulting sensations with respect to an initial categorization induced by h 0 (Figure 9C).
. /fnbot. . Example 4.27. Consider an agent (a mouse) that is placed in a maze where certain paths lead to cheese and others do not (see Figure 9A). At each intersection the agent can go either left or right and it can not go back. Hence, at each step the agent takes one of the two actions; go right or go left. Figure 9B shows the corresponding automaton with 15 states describing the agent-environment system together with the initial labeling h 0 that partitions the state space into states in which the agent has reached a cheese (light blue) and others (dark blue). The initial state x 0 is when the agent is at the entrance of the maze. Once the end of the maze is reached (a leaf node) the state does not change regardless of which action is taken. After a predetermined number of steps the system reverts back to the initial state, similar to an episode in the reinforcement learning terminology (see, for example, Sutton and Barto, 2018). However, despite the system going back to the initial state the history information state still includes the prior actions and sensations. Figure 10 reports the updates of h, initialized at h 0 , by Algorithm 1 being run for 1,000 steps. It converged to a final labeling h ( Figure 10R), that is the minimal sufficient refinement of h 0 , in 435 steps. For 20 initializations of Algorithm 1 for the same problem, on average, it took 364 steps to converge to a minimal sufficient refinement of h 0 ( Figure 9C).
We have also applied the same algorithm to variations of this example with different depths of maze and different number of cheese and cheese placements (varying h 0 ). Empirical evidence shows that the same algorithm was capable of consistently finding the minimal sufficient refinement of the initial labeling. However, it is likely that it might fail for more complicated problems, for example, when the number of actions are significantly larger. It remains an open problem finding a provably correct algorithm for computing the minimal sufficient refinement of h 0 from the agent's perspective.

. . Su ciency for coupled SM-systems
Section 2 introduced SM-systems, including the special class of quasifilters. We showed that quasifilters can be thought of as labeled transition systems, and we worked with such systems in Sections 4, 4.4. Let us see how do the concepts introduced in those sections work for SM-systems. We also defined coupling of SM-systems (Definition 2.22), but we have not defined what it means for a coupling to be "good." We will use sufficiency to approach this subject.
Let E = (E, (S × M), T) and I = (I, S × M, B) be SM-systems. We think intuitively of E as "the environment" and I as the "agent, " even though they share the set of sensorimotor parameters S × M. When is the coupling E * I "successful"? Given another I ′ = (I ′ , S × M, B ′ ), how can we compare I and I ′ in the context of E? The coupled system E * I is not labeled; therefore, we cannot apply the definition of sufficiency. However, as soon as we apply some labeling to it, we can. There are many different ways to do it, intuitively corresponding to the "agent's perspective, " the "environment's perspective" and a "god's perspective" (or "global perpsective").
The first one is the labeling h : E × I → I, which is the projection to the right coordinate, h I (e, i) = i. The second one is the projection to the left coordinate h E (e, i) = i, and the third one is the labeling of states by themselves, h G (e, i) = (e, i). Clearly, h G is a refinement of both h E and h I . Yet another option is to use the sensory data as labelings, which is a coarser labeling than h I . Or perhaps there was already a labeling h : E → S to begin with, so then we can ask about the property ofĥ : E×I → S defined byĥ(e, i) = h(e). We focus on what we called the agent's perspective, h I , for the rest of this section.
Recall Definition 4.11 of the degree of insufficiency. Given SM-systems E (environment) and I (agent), we can ask what is the degree of insufficiency of h I in E * I? The smaller the Frontiers in Neurorobotics frontiersin.org . /fnbot. . degree, the better the agent is attuned to the environment. This says something about the way in which the agent is adapted or attuned to the environment without attributing contentful states or representations to the agent in alignment with (EA2) and (EA4).
Let E, I, and I ′ be SM-systems. When is degins(E * I, h I ) < degins(E * I ′ , h I ′ )? Of course, if I is fully constrained (Definition 2.31), then degins(E * I) = ∞. This corresponds to the agent never engaging in any sensorimotor interaction with the environment. No wonder that it can always "predict" the .

Discussion
In the introduction we defined our basic enactivist tenets: (EA1) Embodiment and the inseparability of the brain-bodyenvironment system, (EA2) Grounding in sensorimotor interaction patterns, not in contentful representations. (EA3) Emergence from embodiment, enactment of the world, (EA4) Attunement, adaptation, and skill as possibilities to reliably engage in complicated patterns of activity with the environment. (EA5) Perception as sensorimotor skills.
We developed a model of sensorimotor systems and coupling for which the purpose is to account for cognition mathematically, but in congruence with the principles (EA1)-(EA5). The principle (EA1) is intrinsic in the ways SM-systems are supposed to model brain-body and body-environment dynamics. The central ingredient is the control set S × M in all of those systems which include "motor" and "sensory" part; it is impossible in our framework to model say the environment without acknowledging the way in which the body is part of it. The approach that the actions of an agent depend solely on the history of its sensorimotor interactions with the environment, our approach is well in the scope of (EA2). We do not assume any representational or symbolic content possessed by the SM-systems. We do not evaluate them normatively by the "correctness" of their internal states, but rather by the ways in which they are, or can be, coupled to the environment and whether their sensory apparatus generates a sufficient sensor mapping or not. Coupling of SM-systems is defined so that two systems constrain each other. Thus, when an agent is coupled to the environment, they constrain each other, thereby creating new global properties of the body-environment system.
The principle (EA4) is mostly discussed in connection with minimal sufficient refinements. Given a labeling, or a categorization, or an equivalence relation on the state space, one can ask how well does this labeling "predict itself." The interpretation of this labeling can be anything from a sensor mapping to the labeling of environmental states by the internal states of the agent which coincide with them (this is not representation, this is mere co-occurence; see enactivist interpretation of the place cells in Hutto and Myin (2017) for Further research will indicate how much of this will be accepted by the most radical enactivists. comparison). A sufficient sensor mapping can be achieved in many different ways. In Section 4.4 we present a way in which the agent "develops" new sensors to be better attuned to the environment and in that way finds a sufficient sensor mapping. Another way for the agent would be to learn to act in a way that excludes "unpredictability." Both are examples of situations where the agent "structures" its own body-environment reality and gains skill. Finally, perception (EA5) can be understood as sensorimotor patterns on a microlevel. On the other hand, the agent engage in a sensorimotor activity locally without making big moves, such as moving the eyes without moving the body. The result of such sensorimotor interaction is another labeling function on a macro level.
In this paper, we not only presented mathematical definitions, but proved a number of propositions and theorems about them. There would be (and we hope there will be!) much more of them, but they did not fit in this expository work for which the main purpose was to demonstrate the connection of the mathematics in question with the enactive philosophy of mind.
We have already developed more concepts and theorems on top of this framework, including notions of degree of insufficiency, universal covers, hierarchies, and strategic sufficiency, but these are omitted here due to space limitations.
In other, more mathematical work, we plan to concentrate on working out mathematical and logical details of the proposed theory as well as applying the ideas to fundamental questions in robotics and autonomous systems, control theory, machine learning, and artificial intelligence.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary material, further inquiries can be directed to the corresponding authors.

Author contributions
VW, BS, and SL developed the mathematical theory together over the past 2 years after extensive collaborative sessions. The primary author is VW, who wrote the most among the authors. VW contributed more to mathematical proofs. BS contributed more to computation. In addition to individual contributions, SL also played a supervisory role. All authors contributed to writing.