Planar Formation Control of a School of Robotic Fish: Theory and Experiments

This paper presents a nonlinear control design for the stabilization of parallel and circular motion in a school of robotic fish actuated with internal reaction wheels. The closed-loop swimming dynamics of the fish robots are represented by the canonical Chaplygin sleigh. They exchange relative state information according to a connected, undirected communication graph to form a system of coupled, nonlinear, second-order oscillators. Prior work on collective motion of constant-speed, self-propelled particles serves as the foundation of our approach. However, unlike a self-propelled particle, the fish robots follow limit-cycle dynamics to sustain periodic flapping for forward motion with time-varying speed. Parallel and circular motions are achieved in an average sense without feedback linearization of the agents’ dynamics. Implementation of the proposed parallel formation control law on an actual school of soft robotic fish is described, including system identification experiments to identify motor dynamics and the design of a motor torque-tracking controller to follow the formation torque control. Experimental results demonstrate a school of four robotic fish achieving parallel formations starting from random initial conditions.


INTRODUCTION
Collective behavior of mobile agents has received significant interest recently in fields such as biology, physics, computer science, and control engineering (Reynolds, 1987;Vicsek et al., 1995;Ren et al., 2007). Research in this area is allowing scientists to better understand swarming behavior in nature and benefits control engineers in numerous applications by mimicking nature's behavior in engineered mobile systems such as unmanned ground, air, and underwater vehicles.
Previous investigations of bioinspired underwater vehicles include the design, sensing, and control of a single fish-inspired robot that is driven by an internal reaction wheel (Zhang et al., 2016;Free et al., 2017;Lee et al., 2019). Here, we present control laws that stabilize planar formations of a school of such robotic fish (Figure 1). Related work involving formation experiments of fish robots propelled by tail flapping was presented in (Berlinger et al., 2021;Zhang et al., 2021). In (Berlinger et al., 2021), a school of fish robots achieves circular formations and other collective behaviors using vision-based behaviors based on relative position. Similarily, in (Zhang et al., 2021), parallel and circular formations are achieved using an overhead camera to provide absolute positions of all the agents. Our work differs in that we investigate synchronized motion of multiple fish robots driven by an internal reaction wheel. We utilize consensus control to achieve collective motion by communicating only relative position and/or orientation with nearby agents. This approach is particularly well suited to challenging underwater environments where small, low-power robots have limited communication or sensing range.
Consensus control in Euclidean space, which assumes that the states of the system live on R N , is a well-studied topic (Cao et al., 2013). The goal of consensus control is to steer N agents into identical states. Similarly, average-consensus control laws steer agents towards the average value of the initial conditions of the agents (Olshevsky, 2015). Consensus and average consensus are typically studied for single-integrator dynamics (Ren and Beard, 2005), which may contain linear or nonlinear drift vector fields (Cao and Ren, 2012). Interactions between agents can be static (Chopra, 2012), time-varying (Moreau, 2005), all-to-all (Chopra, 2012), or limited (Li et al., 2011). These interactions are typically described using the Laplacian matrix from algebraic graph theory (Horn and Johnson, 1990) to compute relative state information, such as relative position. Consensus and average consensus in Euclidean space have also been studied for double-integrator dynamics (Zhang and Tian, 2009) and second-order systems with a nonlinear drift vector field that represents the vehicle dynamics (Yu et al., 2010). Furthermore, consensus control on a nonlinear manifold has been investigated (Scardovi et al., 2007;Paley, 2009). For example, consensus on the N-torus-also called synchronization-arises in the control of planar formations, where the heading orientation is a phase angle on the unit circle . Orientation and translation control of agents in the plane utilizes the special Euclidean group (Justh and Krishnaprasad, 2004). Many synchronization approaches are based on the theory of coupled oscillators, such as the celebrated Kuramoto model , and invoke the graph Laplacian for cooperative control of first-order dynamics on the N-torus (Sepulchre et al., 2008). Second-order consensus of coupled oscillators with double-integrator dynamics (Napora and Paley, 2013) uses the gradient of a phase potential.
Another class of collective behaviors of multi-agent systems are circular formations. Previous work in this area studied circular formations of first-order, self-propelled particles with unit velocity. Feedback control laws designed in  stabilize a circular formation having a fixed center and a constant radius. Some extensions to this work consider a circular formation in a flow field (Paley, 2008) and constant non-unitary velocity, or with a constraint bounding the circular formation to a region of interest (Jain and Ghose, 2019). Other extensions include time-varying centers, so that the circular formation position is not fixed (Brinón-Arranz et al., 2014;Yu and Liu, 2017). Some authors assume agents use relative-position sensing to achieve circular formations around a given center and radius that is known only to a subset of agents (Yu et al., 2018). Circular formation control on the tangent bundle of the N-torus has also been investigated where agents are second-order self-propelled particles Napora and Paley, 2013).
This work investigates planar formations in a novel setting: a system of second-order oscillators with nonlinear dynamics and nonholonomic constraints on the tangent bundle of the Ntorus. The closed-loop swimming dynamics of the fish robots are represented by the Chaplygin sleigh (Kelly et al., 2012), (Lee et al., 2019), a nonholonomic mechanical system driven by an internal reaction wheel. Our control design is inspired by prior work on collective motion of self-propelled particles (Paley, 2007;Sepulchre et al., 2007;Paley, 2008;Napora and Paley, 2013); however, a key distinction is that agents have secondorder limit-cycle dynamics with time-varying speed. Thus, novel parallel and circular formations are achieved in an average sense.
The contributions of this paper are 1) a control design that achieves parallel motion for a school of robotic fish, represented by a system of coupled, nonlinear, second-order oscillators with Chaplygin sleigh dynamics using only relative state information; 2) a control design that achieves circular motion for the same system; 3) system identification of the reaction-wheel motor dynamics and the design of an optimal estimation and tracking controller that follows the torque commands of the formation control; and 4) experimental validation of the parallel formation control law on a school of bio-inspired robotic fish (Figure 1). The proposed control algorithms are illustrated through both numerical simulations and experiments in the University of Maryland's Neutral Buoyancy Research Facility.
The remainder of the paper is organized as follows. Section 2 provides preliminaries on graph theory, the self-propelled particle model, and Chaplygin sleigh dynamics. Section 3 present control designs to achieve parallel and circular formations for a robotic fish school. Section 4 presents the experimental implementation and results for the parallel formation control for a school of robotic fish. Lastly, Section 5 summarizes the paper and discusses ongoing and future work. This section reviews concepts from graph theory, presents the self-propelled particle model, and summarizes the dynamics of a Chaplygin sleigh, used to model our robotic fish.

Graph Theory
A graph is used to represent the communication topology of an interacting system of agents. The communication graph is built upon a set of nodes V {1, . . . , N} that represent agents. An edge denoted by the pair (k, j) exists between agent k ∈ V and j ∈ V if information flows from j to k. The set of all edges is denoted E4V 2 . The set of nodes V and the edges E define a graph G (V, E) (Diestel, 2000). A sequence of edges {(k, k 1 ), (k 1 , k 2 ), . . ., (k ξ , j)} with distinct nodes k l ∈ V, k l ≠ k, k l ≠ j, for l 1, 2, . . ., ξ is called a path from node k to node j. A graph G is called undirected if (k, j) ∈ E implies (j, k) ∈ E. For a undirected graph, the set of neighbors to node k is denoted N k {v ∈ V : (k, v) ∈ E}. If there exists a path between any pair of distinct nodes k, j ∈ V, then an undirected graph G is called connected. Edges are expressed using the adjacency matrix A ∈ R N×N , where the entry on the kth row and jth column is The degree matrix D ∈ R N×N encodes how many unique edges are connected to each node and has nonzero elements on the diagonal, i.e., The symmetric and positive semi-definite Laplacian matrix L ∈ R N×N associated with the undirected graph G is L D − A. The Laplacian is used to compute relative state information communicated between agents. The quadratic form z T Lz ≥ 0, where z ∈ R N , is equal to zero if and only if z k z j , for all k, j ∈ V.

Self-Propelled Particle Model
The self-propelled particle model (Justh and Krishnaprasad, 2004) has often been used to describe the collective motion of N planar vehicles that move at a constant speed with steering controls inputs. The planar position of the kth particle with respect to the origin of the inertial frame is expressed using complex coordinates as r k x k + iy k ∈ C, where k ∈ V. The dynamics of the kth particle are where, for the kth particle, v k b _ x 2 k + _ y 2 k ∈ R is a constant speed, θ k batan( _ y k / _ x k ) ∈ T is the orientation of the velocity (also called the phase of the particle), TbS is the torus, and u k ∈ R is the steering control. The unit vector e iθ k is called the phasor of particle k and is aligned with its heading, whereas ie iθ k is perpendicular to the heading (Figure 2A). For a constant speed, v k v 0 , and a constant turn-rate, _ θ k ω 0 , the particle moves on a circle with radius |v 0 ω −1 0 | and center c k r k + iv 0 ω −1 0 e iθ k . This fixed-radius circle will later serve as a reference for stabilizing circular formations.
When referring to the positions, phase arrangement, reference circle centers, and control inputs of the collective of N particles, we use bold letters, i.e., rb[r 1 , . . . , Similarly, e iθ b[e iθ1 , . . . , e iθ k ] T ∈ C N . For complex numbers, z 1 , z 2 ∈ C, the inner product is defined as 〈z 1 , z 2 〉 Re{z 1 z 2 }, where z 1 is the complex conjugate of z 1 . This inner product is equivalent to the standard inner product on R 2 . For complex vectors, z, y ∈ C N , the inner product is similarly defined as 〈z, y〉 N i 1 Re{z i y i }. The modulus of a complex number is denoted | · | 〈·, ·〉 √ . Cooperative control laws for stabilizing the collective motion of identical, unit-speed, self-propelled particles in parallel or circular formations have been extended to include an external flow field (Paley, 2008), motion on spherical surfaces (Paley, 2009), and various communication topologies (Sepulchre et al., FIGURE 2 | Coordinates and unit vectors: (A) the self-propelled particle; (B) the Chaplygin-sleigh model of a robotic fish. In (B), the hydrofoil shape represents the fish robot body and a bronze-colored reaction wheel is shown at the center of mass. 2007; Sepulchre et al., 2008). For parallel formations, all particles are synchronized when they have equal and constant phase, θ θ 0 1, where 1 [1,. . .,1] T is the N-by-1 vector of ones, for some constant θ 0 ∈ T. For synchronization, the relative positions of particles are arbitrary. For circular formations, all particles move in the same direction along the same circle, that is, _ θ ω 0 1 for some constant ω 0 and c c 0 1 for some constant c 0 ∈ C. In a circular formation, the relative phases of the particles are arbitrary.
Parallel and circular formations may be achieved using Lyapunov-based control design to minimize a potential function for a desired formation. Consider the Laplacian parallel formation potential (Paley, 2007) which is minimized when the agents are synchronized. Assume the Laplacian matrix L corresponds to a time-invariant, connected, and undirected graph G representing the communication topology of the agents. The time-derivative of U p (θ) along trajectories of Eq. 1 is (Paley, 2007) where L k is the kth row of the Laplacian matrix. The term L k e iθ is the sum of the phasor of the kth agent relative to the phasors of all connected agents, i.e., L k e iθ |N k |e iθ k − j∈N k e iθj . Choosing the gradient control (Paley, 2007) for K > 0, makes Eq. 3 negative semi-definite and drives U p (θ) to zero so that agents converge to the set of synchronized parallel formations.
Similarly, to achieve a circular formation, the Laplacian circular formation potential (Paley, 2007) may be used. The potential U c (r, θ) has a minimum value when the agents are in a circular formation. The time-derivative of U c (r, θ) along trajectories of the self-propelled particle Eq. 1 is (Paley, 2007) Choosing the circular formation control (Paley, 2007) makes Eq. 6 negative semi-definite and drives U c (r, θ) towards zero so that the agents' time-averaged circle centers coincide to a common point.

Chaplygin Sleigh Dynamics
The Chaplygin sleigh is a canonical nonholonomic mechanical system consisting of a rigid body moving in the plane that is supported by two frictionless sliding points and a single knife edge that allows no motion perpendicular to its edge (Bloch, 2003). Previous studies have demonstrated that a fish robot driven by an internal reaction wheel can be modeled as a Chaplygin sleigh due to the nonholonomic constraint imposed by the Kutta condition (Kelly et al., 2012), (Lee et al., 2019), which constrains the fluid flow at the trailing edge. As the reaction wheel spins back and forth, it flaps the robot's body, which interacts with the surrounding fluid to generate thrust. Consider a system of N fish robots each modeled as a Chaplygin sleigh with the following dynamics in state-space form (Lee et al., 2019): where r k ∈ C is the position of the trailing edge of the fish robot ( Figure 2B), v k ∈ R is the swimming speed, θ k ∈ T is the velocity orientation, ω k ∈ R is the angular rate of the kth fish, and u k ∈ R is the applied torque, where k 1, . . ., N. Furthermore, d ≥ 0 is the drag coefficient, and m > 0, l > 0, and b > 0 are the mass, length, and moment of inertia, respectively. Unlike the self-propelled particle Eq. 1, the speed of the Chaplygin sleigh Eq. 8 is not constant and the control input is a torque rather than an angular rate. Prior work has established that the Chaplygin-sleigh model exhibits limit-cycle dynamics under open-loop periodic control inputs (Pollard et al., 2019), as well as feedback control (Lee et al., 2019) (Free et al., 2020). Consider the feedback control (Lee et al., 2019) where θ k is the desired heading angle of the kth fish, and K 1 , K 2 > 0 are feedback gains. Substituting Eq. 9 into Eq. 8 yields the closed-loop system (Lee et al., The system Eq. 10 can be divided into a slow and fast subsystem (Lee et al., 2019), where the fast v k -subsystem (Lee et al., 2019) Observe that Eq. 11 gives the equations of motion of a pendulum with nonlinear damping and natural frequency K 2 √ (Lee et al., 2019). The system Eq. 11 has two equilibrium points corresponding to a fish robot heading that is parallel (θ k , ω k ) ( θ k , 0) and anti-parallel (θ k , ω k ) ( θ k − π mod 2π, 0) to the desired heading. Both equilibria are unstable and the system exhibits a stable limit cycle centered on ( θ k , 0) in the (θ k , ω k ) plane (Lee et al., 2019). The corresponding limit cycle of Eq. 10 is evident in the (v k , ω k ) plane as well. The limit cycle propels the robot in the desired direction by flapping; however, the limit cycle is achieved only for certain values of the control gains K 1 and K 2 (Lee et al., 2019). The average swimming velocity is proportional to K 1 , but if K 1 is too large, then the angular rate in the resulting limit cycle does not switch signs and the robot spins in a circle (Lee et al., 2019). The control law Eq. 9 that enables each fish robot to swim in a desired direction can be modified, with interactions from neighboring fish, to achieve collective motion of the school, as described next.

PLANAR FORMATION CONTROL
We propose a nonlinear control design for the stabilization of parallel and circular formations in a model of a school of robotic fish. Our approach bridges collective motion of self-propelled particles (Paley, 2007) and feedback control of a fish robot modeled by Chaplygin sleigh dynamics (Lee et al., 2019). Since (Paley, 2007) assumes a constant-speed particle, it cannot be applied directly to control fish robots that follow limit-cycle dynamics with a varying speed. Furthermore, since the fish robots oscillate, parallel and circular motions are achieved in an average sense. Novel formation potential functions are required for Lyapunov-based control design.

Parallel Formations
Consider a collection of N identical fish robots modeled by the Chaplygin sleigh system Eq. 8. Assume a sufficiently large drag coefficient so that v k → (l/d)ω 2 k and the (θ k , ω k ) dynamics follow Eq. 11. For the purposes of control design, the simplified Chaplygin sleigh system Eq. 8 becomes Inspired by the Laplacian parallel formation potential Eq. 2 for the self-propelled particle, consider the potential The time-derivative of V p (θ) is where, along trajectories of Eq. 12, and By choosing the control and substituting Eqs 15-17 into Eq. 14, _ V p (θ, ω) becomes The feedback control law Eq. 17 relies only on relative-state measurements between agents and does not include feedback linearization of the agents' dynamics. Recall K 1 , K 2 > 0 are control gains. Since Eq. 18 is a summation of quartic functions with roots at ω k 0 and Therefore, all trajectories are trapped in Ω p . The gain K 2 in Eq. 17 is chosen to ensure forward flapping motion for Eq. 8, as discussed in Section 2.3.
The control law Eq. 17 is illustrated by numerical simulation using control Eq. 17 and the full dynamics Eq. 8. The simulation was conducted for 150 s with N 8 robots using the parameters listed in Table 1. The robots were initialized with random headings and zero linear and angular velocities. A communication range of 3 m determined the communication topology, which remained invariant during the simulation based on the agent's random initial positions. Figures 3A,B show all N robots converging to the same limit cycle in the (θ k , ω k ) and (v k , ω k ) planes. As a result, all robots move in the same direction (on average), as shown in Figure 3C. The parallel potential, V p (t), initially decreases ( Figure 3D), but instead of converging to zero, it oscillates around a fixed value as the robots converge to different phases on the same limit cycle.

Circular Formations
The parallel formation control Eq. 17 is based on the forward swimming control Eq. 9; however, a desired heading was not prescribed, but rather the average heading emerged through interactions among agents depending on their initial conditions. Similarly, a circular formation control is proposed here that drives the fish robots to continuously adjust their heading at a known average rate, while aligning the center position of the nominal circles to an (average) consensus value. Virtual fish can be introduced to achieve a reference heading or position (Sepulchre et al., , 2008. Consider the following circular formation potential inspired by Eq. 5: where γ [c 1 , . . . , c k ] T with γ k (t) cos(θ k − ω 0 t) and K 2 , K 3 > 0. The time-derivative of V c along trajectories of Eq. 12 is where _ γ T 1 − N k 1 sin(θ k − ω 0 t)(ω k − ω 0 ) , and _ ω T ω is given in Eq. 15.
When averaged over time, the motion of a fish robot resembles that of a self-propelled particle. Recall that the reference circle center for a self-propelled particle is c k r k + v 0 ω −1 0 ie iθ k . Since the average swimming speed of the robots is bK 1 /ml (Lee et al., 2019), then by setting v 0 bK 1 /ml, the parameter ω 0 may be chosen to yield an average turn rate and reference circle with radius |v 0 ω −1 0 |. However, ω 0 should be sufficiently small to ensure that ω k switches signs along the limit-cycle so that the robots flap. The third term in Eq. 20, evaluated along trajectories of Eq. 12, is Therefore Eq. 20 may be rewritten as Choosing the control Thus, , which implies the system is driven to a bounded set containing the desired limit cycle. The K 3 term in Eq. 23 biases the torque to align the reference circles, whereas the remaining terms produce a flapping motion with a given average turn rate. Although Eq. 24 remains negative for any positive K 3 , this gain must be chosen sufficiently small to ensure flapping. As with Eq. 17, the feedback control Eq. 23 relies only on relative-state measurements between agents and does not include feedback linearization of the agents' dynamics.
The circular formation feedback control Eq. 17 is numerically illustrated by simulating Eq. 8 with parameters from Table 1 and using ω 0 0.05 rad/s. The simulation was conducted for ten minutes to demonstrate circular motion. Figure 4A shows all N robots converge to the same limit cycle in the (v k , ω k ) plane. The orbits in Figure 4B resemble the limit cycle in Figure 3B; however, due to the time-varying term in Eq. 19, they translate along the perimeter of the phase cylinder. The net result is motion along a circle whose center position is determined by the initial conditions of the agents ( Figure 4C). Since the oscillating centers are aligned when all robots have identical position and phase, the controller drives all the robots to one side of the circle. In ongoing work, we seek to stabilize symmetric circular formations Sepulchre et al., 2008), which evenly distribute the agents around the formation. The circular potential exhibits low and high frequency oscillations ( Figure 4D), which correspond to motion around the reference circle and flapping, respectively.

EXPERIMENTAL RESULTS
This section describes the implementation of the parallel formation control law Eq. 17 on a school of robotic fish and experimental results. First, the experimental testbed used in this work is briefly described. Next, results from system identification experiments are presented that provide parameters for a model of the reaction wheel dynamics. Based on this model, an inner-loop linear-quadratic-Gaussian (LQG) controller is implemented to track a desired reference torque generated from the formation control law using the onboard motor's angular velocity measurements. Lastly, results from a series of in-water experiments demonstrating the parallel formation control are presented.

Experimental Testbed
The fish-inspired soft robots used in the experiments (Figure 1) are each driven by a Pololu 12 V DC motor (with a 4.4 to 1 gear ratio) that oscillates a reaction wheel located at each robot's center of mass. Each fish robot measures its orientation with an onboard BNO055 inertial measurement unit (IMU) sensor that features a built-in extended Kalman filter. A micro-SD card is used to store sensor data and a 900 MHz xBee radio enables each fish robot to communicate with a ground station and other fish robots on the water's surface. Each fish robot performs onboard sensor processing and control with a Teensy 3.2 microcontroller. For a more detailed discussion of the fish robots' design, refer to (Lee et al., 2019).
The experiments were conducted in a 367,000 gallon water tank at the Neutral Buoyancy Research Facility at the University of Maryland, College Park. An overhead camera is mounted above the experimental area to record the true position of each robotic fish; however, the position data is not used in real time by the fish for formation control. Instead, during formation control experiments, each robot exchanges orientation data from their onboard IMU with other fish in the school using the xBee radios. Since the position of each robotic fish is not computed in realtime, an invariant complete communication graph was used in the experiments; whereas, a proximity-based communication graph was used in simulation. AT The overhead camera images are post-processed after each experiment to visualize the trajectory of each robotic fish. The image processing uses MATLAB's built-in corner/object detection based on a minimum eigenvalue algorithm (The MathWorks, 2019) and a built-in constant velocity Kalman filter for object tracking. Since all of the sensing and control law computations occur onboard, the school of robotic fish are a self-contained system when performing formation control experiments. The block diagram in Figure 5 gives an overview of the experimental testbed.
The nonlinear control laws Eq. 17 or Eq. 23 generate a reference torque command that steers the fish robots to their respective formations. However, this torque cannot be commanded directly since the control input into the reaction wheel system is the voltage applied to the DC motor generated by the motor driver and Teensy micro-controller. Furthermore, the only available measurement is the angular velocity of the motor's shaft measured by an encoder. Thus, to track the reference torque a linear-quadratic-Gaussian (LQG) controller and estimator was implemented. The LQG controller assumes the following motor dynamics (Kim, 2017): where Λ is the voltage input, R is the motor's electrical resistance, μ is the inductance, i is the current, e is the motor's back electromotive force (EMF), _ ψ is the angular rate of the output shaft, J is the sum of inertias between the reaction wheel and motor's output shaft, τ is the torque applied by the motor, ζ m is the internal damping friction applied to output shaft, and K e and K τ are the motor's generator and torque constants, respectively. To identify the values of the DC motor and reaction wheel Frontiers in Control Engineering | www.frontiersin.org December 2021 | Volume 2 | Article 782121 parameters, a series of system identification experiments were conducted, as described next.
The resistance of the motor R was measured directly with an ohm meter, and the sum of the motor and reaction wheel inertias J was approximated using analytical expressions for the moment of inertia of a cylinder about its axis of symmetry with a known mass and diameter. To determine the motor generator constant K e , substitute Eq. 26 into Eq. 25 and examine the system at steady current state, i.e., di/dt 0. Solving for _ ψ yields Using a Rigol DP711 power supply and quadrature encoder attached to the motor's output shaft, a series of constant voltages were applied to the motor. For each voltage, the current through the motor and the angular velocity of the output shaft at steady state were recorded. A linear regression was used to determine the quantity 1/K e , which is the slope of the line in Eq. 29 and Figure 6A.
The internal frinction coefficient ζ m was found by setting the torque to zero, i.e., τ 0 in Eq. 28, and solving the resulting differential equation to obtain where _ ψ(0) is the initial shaft angular velocity. Notice that Eq. 30 is a linear equation that describes ln( _ ψ) as a function of time.   A similar technique as described previously can be used to find − ζ m /J from a time-history of torque-free motor data. To identify the damping parameter, the motor was initialized with a constant nonzero angular velocity and the input voltage was removed. The motor's angular velocity was recorded as it decays under internal friction. As before, a linear regression was used to determine the slope − ζ m /J in Eq. 30 from Figure 6B and, hence, ζ m can be inferred since J is known.
To determine the torque constant K τ , substitute Eq. 27 into Eq. 28 at steady state ( _ ψ constant) and solve for _ ψ to obtain By repeating the procedure used to determine K e , using the value of ζ m , the value of K τ is found from the slope K τ /ζ m in Eq. 31 and Figure 6C.
Since the inductance μ only plays a role in the transient response of the motor, which are sufficiently fast, we estimate this parameter heuristically. The simulated response of the motor to a sinusoidal voltage input is visually compared to the actual response of the motor under the same input. The process is repeated while adjusting the value of μ to obtain a similar response, as shown in Figure 6D.
Lastly, the motor was found to exhibit a range of deadband voltages near zero that resulted in the motor being unresponsive.
To determine the range of this deadband, a series of incrementally increasing voltages were applied to the motor, giving an approximate deadband range of ± 4V. The parameter values determined through this system identification process are summarized in Table 2.

DC Motor Torque Tracking Controller
To implement the LQG torque controller and state estimator for the DC motor, convert Eqs 25-28 into state space form where q [τ, _ ψ] T denotes the states of the motor and the output is Y _ ψ. The continuous-time state space equations take the form For implementation onboard the micro-controller, the continuous system Eqs 32, 33 is converted into a discrete-time system (with an addition of torque process noise and heading measurement noise): where (Crassidis and Junkins, 2011) k is an integer indexing the discrete state, Δt 100Hz is the timestep of the microcontroller, w k is zero-mean, Gaussian, additive process noise with variance σ 2 w , and η k is zero-mean, Gaussian, additive measurement noise with variance σ 2 η . We adopt the standard approach outlined in (Crassidis and Junkins, 2011) and (Brown and Hwang, 1997) to implement a discrete-time Kalman filter for the system Eqs 35, 36. The estimated stateq k is used in a linear-quadratic-regulator (LQR) feedback control law of the form Λ k κ T (q d −q k ), where the κ is a gain matrix found by solving the discrete-time algebraic Riccati equation and q d are the desired state values. Numerical values used in the LQG controller and estimator design are given in Table 3.
To evaluate the LQG torque controller/estimator, the control law Eq. 9 was used to generate a reference torque for a step input change in desired heading. The performance of the torque tracking controller is shown in Figure 7A, and the heading trajectory of the fish robot during this experiment is examined in the next section.

Heading Control Experimental Results
We first examine the accuracy of the closed-loop directional controller Eq. 9, where θ d is a desired heading. After testing Eq. 9 on a single robotic fish, we found that the DC motor's deadband does not allow the average heading to completely converge to θ d . When the fish robot's heading is close to the desired heading, the voltage required to close the gap is too small and falls within the dead zone making the motor unresponsive. The accuracy of Eq. 9 is limited on our testbed due to this voltage deadband, but can also be mitigated by choice of the values of the nonlinear control gains K 1 and K 2 . In both Eq. 9 and Eq. 17, increasing the value of K 2 makes the control laws more sensitive to relative heading errors and increases the torque required to minimize the error. Similarly K 1 amplifies the torque and, therefore the voltage input required to keep the fish swimming. Tuning these control gains increases the accuracy of the aforementioned control laws. The experimental results of this test with a step input for θ d is shown in Figure 7B. The fish robot's heading oscillates about a desired heading with a small persistent error between the average and desired headings. The control gains used in this experiment are K 1 0.5 and K 2 7.

Parallel Formation Experimental Results
To validate the theoretical results of the parallel formation control law Eq. 17, the results from six experiments are presented. The fish robots' micro-controller uses IMU measurements for the heading and xBee radios for communication within the school. An overhead camera observes the positions of each fish robot; these were not used by the school during the experiments.
Four fish robots were initialized with random initial positions and orientations at the beginning of each parallel formation control experiment (implemented with gains K 1 3 and K 2 5). Consensus was achieved by the four fish with a small phase shift (up to 20 degrees) of the mean heading for each fish. This offset may be attributed to excess noise in angular velocity measurements and the aforementioned voltage dead zone. An example heading time-history from one of the experiments is shown in Figure 8A.
The excess noise in the angular velocity measurements caused large spikes to appear when computing the Lyapunov potential Eq. 13 from experimental data. Thus, to illustrate the convergence of this potential, the 1 2 ω T ω term was removed, leaving only the heading alignment term K2 2N < e iθ , Le iθ > plotted in Figure 8B for each of the six experiments. As mentioned in Section 3, the potential function does not decrease completely to zero when consensus is achieved because each robot converges to a different phase on the limit cycle. An example trajectory of the fish robots in Figure 9A and the Supplementary Material S1 shows the fish robots achieving consensus and swimming in a parallel formation. However, since the ground truth orientations of the fish robots are not used, errors in the IMU measurements sometimes cause a subset of the school to swim in a different direction than the rest of the school, as shown in Figure 9B. The error in IMU measurements may be caused by an erroneous calibration or magnetic anomalies in the water tank interfering with the onboard magnetometer.

CONCLUSION
Nonlinear control laws are proposed that stabilize parallel and circular formations in a model of N planar fish robots. The control design approach extends prior work on collective motion of self-propelled particles to a school of robotic fish with Chaplygin sleigh dynamics. The feedback control laws rely only on relative-state measurements between agents that interact according to a connected, undirected, communication graph and do not include feedback linearization of the agents' dynamics. Implementing the parallel control law on a testbed of fish robots required conducting system identification experiments to characterize the motor dynamics and designing a torque tracking motor controller and estimator. Numerical simulations and experiments on a school of robotic fish demonstrate the proposed approach. In ongoing work, we seek to model the fluid interactions between fish robots and instrument the robotic fish with pressure sensors to exploit the hydrodynamic benefits of close-proximity swimming.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
DP, AW, and PG contributed to conception and design of the study. AT performed the experiments and analyzed the experimental data. AT and AW wrote the experimental sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.