How to: Using Mode Analysis to Quantify, Analyze, and Interpret the Mechanisms of High-Density Collective Motion

2 While methods from statistical mechanics were some of the earliest analytical tools used 3 to understand collective motion, the ﬁeld has expanded in scope beyond phase transitions 4 and ﬂuctuating order parameters. In part, this expansion is driven by the increasing variety of 5 systems being studied, which in turn, has increased the need for innovative approaches to 6 quantify, analyze, and interpret a growing zoology of collective behaviors. For example, concepts 7 from material science become particularly relevant when considering the collective motion that 8 emerges at high densities. Here, we describe methods originally developed to study inert jammed 9 granular materials that have been borrowed and adapted to study dense aggregates of active 10 particles. This analysis is particularly useful because it projects difﬁcult-to-analyze patterns of 11 collective motion onto an easier-to-interpret set of eigenmodes. Carefully viewed in the context 12 of non-equilibrium systems, mode analysis identiﬁes hidden long-range motions and localized 13 particle rearrangements based solely on the knowledge of particle trajectories. In this work, we 14 take a “how to” approach and outline essential steps, methods, diagnostics, and know-how used 15 to apply this analysis to study densely-packed active systems.


INTRODUCTION
The vast complexity of human neurobiology gives rise to a rich interior life filled with thoughts, moods, motivations, ideas, discourse, and imagination.Given this lived experience, it's remarkable that the challenges for explaining an individual's specific actions washes away when we instead consider groupscale human collective behavior.This observation has fueled a surge of interest at the intersection of social psychology, behavioral economics, and data science that has led to highly-effective and systematic strategies for broad-based social engineering [1,2,3].These behavioral interventions, often called "nudges," are developed through A/B testing, which is a form of two-sample statistical hypothesis assessment.
Examples of successful nudges are abundant, and they have become a staple for organizations ranging from governments to Fortune 500 companies that seek to broadly reshape human decision making at a large scale [4,5].However, despite the well-deserved excitement, there are distinct limitations, especially when it comes to controlling spontaneous collective behavior.
A substantial historical record shows music concerts, religious pilgrimages, sporting competitions, political protests, and consumer shopping holidays have all played host to the shocking disaster of injurious human collective behavior [6,7,8].In these circumstances, high crowd densities and limited communication are the foundation for stampedes, crowd crush, and escape panic.Though these negative outcomes provide broad impetus for developing preventative safety strategies, reshaping collective behavior through A/B testing of nudges is impractical at best and unethical at worst due to the "black swan" nature of crowd disasters [9,10].Thus, alternative approaches need to be undertaken in order to establish effective life-saving interventions for these low-probability high-impact events.With these challenges in mind, we describe a physical and mathematical approach to understand, predict, and ultimately prevent tragic human collective motion at mass gatherings.This course aims to unravel the basic mechanisms of emergent collective motion in order to ground and inspire future intervention strategies.
The methods for analyzing high-density human crowds described here stem from an uncanny resemblance between mass gatherings and the disordered granular packings of soft matter physics (Fig. 1) [11,12,13,14].
The existing research on these complex materials provides a systematic framework for the characterization of collective motion along with theoretical tools that connect local structure to dynamical response [15,16,17,18,19,20,21].The specific technique derives from the analysis of linear systems at equilibrium wherein eigenvalues and eigenmodes of the local displacement correlation matrix convey information about structural stability [18].To the extent that linearized approximations effectively quantify jammed active matter [19], we apply this framework to study aggregated self-propelled particles.In the context of human crowds, our results enable an understanding of specific mechanisms for dangerous collective motion [22].
In the sections that follow, we describe how to implement the basic steps of eigenmode analysis and effectively interpret the results for high-density human crowds.While the method itself is quite general, we demonstrate the protocol through the example of an asocial model for high-density human collective behavior.In step-by-step fashion, we specifically emphasize practical tips for working with data, whether it be numerically simulated or empirically collected.

METHODS
We divide the methods section into two subsections.The first subsection describes a physical model for asocial human collective behavior in high-density crowds.This model is motivated by previous work and provides a specific means for simulating individual human trajectories [22,23].The second subsection details an analysis protocol that takes trajectory data as input, and converts this information into a quantitative description of emergent collective behavior.This protocol may be broadly applied to trajectory data from sources other than the asocial model.

How to: Mode analysis 2.1 A physical model to study high-density human collective behavior
To quantitatively model human collective behavior, we take a Newtonian force-based approach to generate complex emergent phenomena [24,25,26,27,28].While systems studied within this framework are generally non-equilibrium, the resulting phenomenology is often reminiscent of behaviors analyzed in the fields of statistical mechanics, granular materials, and fluid dynamics.As such, there is a rich tradition of concepts from these fields intermixing over the study of these "active matter" models [12,13,14].

Equations of asocial model
We specifically investigate human collective motion in high-density crowds.We therefore simplify the richness of human life to four forces, numerically simulate the resulting equations of motion, and investigate the emergent collective behavior.Referring to these simplified humans as Self-Propelled Particles (SPPs), we assume they are all interested in going toward a single point of interest, P.This point could be the front of a concert stage, the police line at a protest, or the exit of a stadium.While these activities have clear social differences, their commonality is in the accumulation of a large densely-packed crowd drawn by a common attraction.Each SPP indexed by i can now be described as a disk with radius r 0 positioned at a point r i (t) and subject to: a pairwise soft-body repulsive collision force which is non-zero only when the distance between two particles [23,22]; a self-propulsion force where v 0 is a constant preferred speed, v i is the current speed of the i th SPP, and pi is a unit vector pointing from each particle's center to the common point of interest P; a randomly fluctuating force with components drawn from a zero-mean Gaussian distribution and standard deviation σ defined by the correlation function , ensuring noise is spatially and temporally decorrelated; and finally, a wall collision force used to construct a confining simulation environment which is pointed along each wall's outward normal direction ŵ and non-zero when the distance of the SPP from the wall r i,w < r 0 .While other repulsion forces have been used in similar models of human collective behavior [29,30,31], the functional form of F repulsion i and F wall i used here comes from treating SPP collisions as a Hertzian contact mechanics problem involving frictionless elastic spherical bodies [23,32,33].Summing the forces from Eqs. ( 1)-( 4), we find the evolution of each SPPs dynamics is driven by where the relative magnitudes of individual force terms can be tuned through the scalar coefficients and µ.Because Eq. ( 5) lacks any terms that reflect social interaction, we refer to it as an asocial model for high-density human collective behavior [22].

Numerical implementation
Simulations take place in a rectangular room with wall length L r 0 centered at the origin (0, 0).The attraction point P is placed at the center of the right wall at (L/2, 0) and N SPPs are seeded at random initial positions with zero initial speed and acceleration.At every integration time-step we compute the total force acting on individual SPPs at their current positions r i (t) and evolve their trajectories according to Eq. ( 5) (Fig. 2).This calculation is performed numerically with the Newton-Stomer-Verlet algorithm, which finds the next position using the current velocity ˙ r i (t) = v i (t) and current acceleration ¨ according to: The next acceleration ¨ r i (t + ∆T ) = F total i (t + ∆T ) is then calculated at this new position, so that the next velocity can be determined by an average of the current and next accelerations: Looping this sequence of calculations produces trajectories for each of the N SPPs (Fig. 2).In Eqs. ( 6) and ( 7) the parameter ∆T defines the integration step, which should be small enough to ensure smooth trajectories, but large enough to achieve reasonable computation times.Here, we run our simulations for t = 3, 000τ units of time with ∆T = τ /10 = 0.10, yielding a total of 30, 000 integration steps.Every 10 time steps we record each SPPs position r i (t) as well as the pressure due to radial contact forces where the dot product is with the unit normal vector centered on the i th SPP.We consistently find transient motion dominates the first ≈ 50τ of the simulation resulting in far-from-equilibrium effects (Fig. 2; Fig. 3, linear path segments).By 300τ the SPPs have aggregated near P and formed a stable, dense, disordered aggregate with F propulsion i acting similar to an external field confining the SPPs.Within the aggregate, collision and noise forces are responsible for position fluctuations, causing each particle to move randomly around its average position (Fig. 3 inset, densely accumulated trajectory data near the point of interest P denoted by ).

Model parameters and time scales
Setting model parameters in terms of the fundamental simulation unit length and unit time τ allows us to maintain careful control over the relative force and time scales while not explicitly committing to dimensionful units.As such, we simulate N = 200 SPPs of radius r 0 = /2 in a region of size L = 50 .
These choices ensure the simulation box size L is larger than the typical aggregated SPP group size ∼ √ N .
We also set the SPP preferred speed v 0 = /τ , the random force standard deviation σ = /τ 2 , and the force scale coefficients = 25 /τ 2 , µ = τ −1 [23].For our analysis, we run 10 independent simulations of the dynamics with this set of parameters and random initial conditions.
Setting r 0 = /2 means that in the absence of any other interactions a SPP would move a distance equal to its diameter in the time τ .This choice approximates relaxed pedestrian motion if we were to have τ equal to one second [29].
The coefficient µ relates to an exponential relaxation time for the self-propulsion force, which can be seen by solving for free acceleration of a SPP.Specifically, dv/dt = µ(v 0 − v) has a solution v(t) = v 0 [1 − exp(−µt)] when v(t = 0) = 0.This expression shows that an unobstructed SPP will exponentially approach its preferred speed v 0 with a timescale µ −1 .
Both SPP-SPP and SPP-wall collisions are subject to a Hertzian repulsion force directed normally to the surface of contact with a magnitude set by the coefficient .These forces are non-singular, making them numerically stable and easy to simulate, which is particularly useful for studies of jammed soft granular matter [33].In the context of human crowds, Eqs. ( 1) and ( 4) greatly simplify collisional interactions, while still capturing the fact that people can be partially compressed with a rising non-linearity as the stresses increase.Here, is a constant for all SPPs equal to 25 /τ 2 , which guarantees the collision force scale is substantially larger than the self-propulsion force scale µv 0 = /τ 2 .
To ensure collisions and random force fluctuations contribute roughly equally to the SPPs dynamics, the collision time scale τ coll and the random force time scale τ noise must be comparable.In our asocial model, the random collision time scale τ coll = 1/(2r 0 v 0 n) ≈ (π/4)τ is given by the mean-free path by an amount equal to v 2 0 .Hence, τ noise = µv 2 0 /2σ 2 .Because the unit speed v 0 is fixed by the fundamental simulation units, and µ is set by the self-propulsion relaxation time, we simply let σ = /τ 2 to satisfy τ noise = τ /2 ≈ τ coll at steady-state.

Analysis protocol of trajectory data
In section 2.1, we outlined a physical model for generating trajectory data of simulated human crowds using SPPs.In this section, we provide a step-by-step protocol for analyzing trajectory data using mode analysis.While we demonstrate the protocol with simulated data from the asocial model, our analysis can be applied equally well to other active jammed systems.As such, we cast our discussion in terms of "agents," which could be either SPPs, empirical measurements of human motion, or other examples of high-density active matter under consideration.

2.2.1
Step 1: Calculate the displacement covariance matrix components C ij to estimate the ground-truth correlation matrix C p Each of the trajectories r i (t) = x i (t), y i (t) provide spatially resolved position data on the i = 1, . . ., N agents at discrete time points t.From these trajectories, we want to determine the displacement correlation matrix C p .Ideally, this matrix is a statistical quantity averaged over all realizations of the underlying system.
In practice, there are a limited number of computational runs or experimental measurements available.Thus, we calculate the covariance matrix [C ij ], whose components C ij converge to the ground-truth correlation matrix C p in the t → ∞ limit.These components are where time averages • • • are calculated for each realization of the underlying random process over a statistically-independent sub-sampling of the time series data.Conventionally, this is the equivalence between time and ensemble averaging.Critical Note 1: While Eq. ( 9) is standard notation in the literature, it obscures the fact that there are actually two sets of covariance matrix components to calculate: one each for the x and y directions.A more transparent representation would be However, the cumbersome nature of these expressions tends to favor the aesthetics and compactness of Eq. ( 9).Critical Note 2: Time-averaging calls for a judicial eye that balances two competing demands: subsampling in time should be spaced out to reduce effects from auto-correlated motion, while simultaneously leaving a sufficient number of statistically-independent "snap-shots" of the system for the components of the covariance matrix [C ij ] to converge to the correlation matrix C p .A straight-forward convergence criteria is to ensure the ratio 2N/N t < 1.5, where N t is the number of independent snap-shots [15].Critical Note 3: Because mode analysis depends on the eigenvalues and eigenmodes of [C ij ], the quality of time averaging can be tested by examining how the eigenvalue spectrum converges as a function of the temporal sampling rate [15].Specifically, the spectrum should be insensitive to the sampling rate; comparing spectra generated at different sampling rates will indicate whether a specific value is in this regime.Critical Note 4: Many physical models can be treated with scaling analysis to determine a minimal estimate for the time scale of auto-correlated motion.In the asocial model described in section 2.1, noise and repulsive collision forces dissipate auto-correlation.Thus, the trajectory data should be sub-sampled at temporal intervals spaced out longer than τ noise and τ coll .Critical Note 5: In some circumstances, the agents being studied can undergo large rearrangements in position.When this occurs, the covariance matrix components C ij must be recalculated from post-rearrangement trajectory data and convergence of [C ij ] → C p must be rechecked.
Example: Determining appropriate temporal sub-sampling of asocial model.For the parameter values described above, we find the simulation reaches steady state after ≈ 50τ .We generously discard the first 300τ to eliminate far-from-equilibrium transients leaving 2, 700τ of trajectory data for each SPP.
Because τ noise = τ /2 ≈ τ coll , then τ /2 is the minimal estimate for temporal sub-sampling.From this baseline, we check convergence of the eigenvalue spectrum for intervals longer than τ /2, and find a temporal sub-sampling of 10τ is adequate.We are then left with N t = (2, 700τ )/(10τ ) = 270 statisticallyindependent snap-shots of the system to use when calculating temporal averages in Eq. ( 9), which is consistient with the 2N/N t = 1.48 < 1.5 criteria.

Step 2: Calculate the eigenmodes and eigenvalue spectrum
Having calculated a displacement covariance matrix [C ij ] that approximates the ground-truth correlation matrix C p , we can now use standard numerical techniques to compute the eigenvalues λ m and their corresponding eigenmodes e m .The index m = 1, . . ., N runs over all agents under consideration.Often, the terminology "eigenmode" is simply shortened to "mode," which is a callback to the field's roots in analyzing harmonic vibrational motion.Critical Note 1: Sorting the eigenvalues in decreasing order and plotting as a function of index m gives the spectrum of the covariance matrix, which in the t → ∞ limit converges to the spectrum of the correlation matrix.Critical Note 2: As in the previous step, notational conventions obscure the fact that there are two sets of eigenvalues and eigenmodes.Specifically, both x and y directions have their own eigenvalues λ x m and λ y m for a total of 2N eigenvalues with two distinct spectra.Likewise, each eigenmode is a two-dimensional displacement vector field.For example, the m = 1 eigenmode is more transparently expressed as e x 1 , e y 1 , with e Note 4: Within a linear approximation, the eigenmodes express relative magnitude and directions for harmonic oscillation of the agents about their mean positions.The eigenvalues relate to the corresponding excitation energies of these modes.Trajectory data r(t) can then be decomposed into a linear combination of these modes with a collection of 2N time-dependent coefficients.As the non-linearity and non-equilibrium nature of the system asserts itself, these approximations breakdown.

Step 3: Find and remove rattlers
Plotting low-m eigenmodes as two-dimensional displacement fields frequently reveals a small number of agents N r N that represent a disproportionately large amount of the overall motion.This phenomenon is well-known to arise in jammed systems, where such agents are called "rattlers."The naming reflects the fact that rattler motion resembles a free rattling within a cage of highly-constrained and tightly-packed neighbors.As such, rattlers are an artifact of local structure [21].Because (i) rattlers tend not to participate in global collective motion and (ii) the location of rattlers is impossible to predict a priori, we must identify them a posteriori, remove the N r rattlers from consideration, and recompute Eq. ( 9) with the subset of N − N r agents.A threshold identification criteria is useful for systematically finding rattlers.In particular, for each mode m we check for agents with index j whose individual displacement is greater than the mode's average displacement plus ξ r times the mode's standard deviation More succinctly, rattlers are defined as agents with index j that obey | e m,j | ≥ | e m | + ξ r σ m , for a given threshold value ξ r .Once identified, these N r agents can be removed from further consideration.Critical Note 1: The removal of rattlers ensures low-m modes reflect genuine collective motion of the system instead of the free vibrations of under-constrained agents.Critical Note 2: Selection of an appropriate threshold should be done by testing multiple values of ξ r and examining how the fraction N r /N varies.No single value of N r /N will be appropriate for all circumstances, and the selection of ξ r should be given due consideration.Additional checks described in the next step provide further useful information to aid in the choice of ξ r .

2.2.4
Step 4: Re-calculate the eigenmodes and eigenvalue spectrum without rattlers Having provisionally identified the N r rattlers in step 3, we repeat steps 1 and 2 to re-calculate the eigenmodes and eigenvalue spectrum from [C ij ] with the remaining N − N r agents.When examining the new modes and spectra produced by step 2, we generally find the first attempt at removing rattlers is insufficient and steps 1 through 3 should be performed as an iterative process to optimally determine ξ r .
Critical Note 1: Concretely, this strategy starts with an initial guess for the threshold ξ r , and determines a final value by qualitatively and quantitatively checking how ξ r affects the eigenmodes and eigenvalue spectrum.The goal is to find a value of ξ r that filters rattlers from the overall collective motion in low-m eigenmodes.Critical Note 2: While iterating, it's useful to: (i) visually confirm whether the eigenmode plots are being affected by under-constrained motions, and (ii) confirm the eigenvalue spectra retain their basic shape.This information is essential for making an informed decision on the next threshold value to test.Critical Note 3: A reasonable initial guess for the rattler threshold is to set ξ r = 1 with the expectation that the final value will be larger.
Example: Iteratively selecting ξ r .Working with trajectory data generated by the asocial model, we perform steps 1 through 4 of the analysis protocol and test various values for the rattler threshold ξ r (Fig. 4).In this instance, we seek to find and remove rattlers from the first 10 modes by examining values of ξ r ranging from 1 to 6.As prescribed in the protocol, we calculate and re-calculate the eigenvalue spectrum eliminating provisionally identified rattlers from consideration.These comparisons show the general form of the eigenvalue spectra remain largely unchanged for a range of ξ r , and the number of rattlers decreases substantially for ξ r > 1 (Fig. 4D).In addition to examining the spectra and N r /N ratio, we also examine plots of the modes before (Fig. 5, red) and after (Fig. 5, blue) rattlers are removed.We see for two simulation runs how modes m = 1, 2, 3, 4, and 100 are affected by the removal of rattlers; in both instances, we notice an absence of large irregular vector arrows in low-m modes as the value of ξ r is optimized.In this case, we have successfully filtered rattler motion out from the overall collective behavior, leaving a set of N − N r eigenmodes and eigenvalues for downstream analysis.

(Optional) Step 5: Alternative heuristic method for finding rattlers
We briefly mention an alternative method for finding rattlers.This heuristic involves the computation of each agent's positional auto-correlation time ∆t * from the trajectory data r i (t) = x i (t), y i (t) and the auto-correlation functions Here, the normalizations are over all T time steps, and the averaging • • • is calculated for individual agents.We seek minimal time delays ∆t x and ∆t y such that χ x i (∆t x ) = χ y i (∆t y ) = 0, which average to the auto-correlation time ∆t * = (∆t x + ∆t y )/2.Because rattlers are free to move within their local region, they are generally not influenced by their neighbors and we expect these agents to have the highest auto-correlation times.Critical Note 1: The auto-correlation times are independently calculated for each of the i agents even though ∆t * does not have an explicit index i.Critical Note 2: Auto-correlation measurements can have non-trivial behavior that requires individualized assessment.Some cases of note include: when multiple time delays where χ i = 0 are found, we simply take the smallest value of ∆t; when there are no time delays with χ i = 0, it may be appropriate to fit a smooth function and extrapolate the delay time that χ i intersects zero; when the auto-correlation functions are noisy and it appears that ∆t as-defined is erroneous, it may be appropriate to smooth the data and infer a revised value of ∆t.Critical Note 3: While the heuristic approach for finding rattlers has the advantage of being rapidly calculated directly from raw trajectory data, it does not have the same degree of accuracy as the threshold identification method.
This cost-benefit assessment suggests the heuristic approach may find its most effective applications in real-time inference of crowd diagnostics.
Example: Testing heuristic identification of rattlers.
Step 3 of the mode analysis protocol identifies rattlers based on each agent's positional fluctuations within a given mode.Here, we consider a side-by-side comparison for three simulation runs showing how a threshold of ξ r = 4 compares with the distribution of auto-correlation times.Evidently, the agreement is considerable (Fig. 6, overlapping red circles and dark squares, especially in runs 1 and 3), though certainly not perfect (Fig. 6, non-overlapping red circles and dark squares, especially run 2).When an abundance of data and analysis time is available, the threshold approach is clearly preferable for its accuracy.However, in circumstances that limit the availability of data or when analysis is needed in near-real-time, the heuristic may be preferable.

(Optional) Step 6: Calculate the Density of States (DOS)
In mode analysis, the Density of States (DOS) D(ω 2 ) is used to quantify the probability of having a certain number of excitable oscillations at frequency ω within a given energy range ∼ ω 2 .This concept is well-defined at equilibrium, but more tenuous in non-equilibrium systems such as the asocial model considered here.The value of using a DOS analysis is that it sheds light on how a perturbation will transfer energy into various modes, as long as the perturbation itself does not substantially disrupt the organization of agents that defines the modal structure.To calculate D(ω 2 ), we note the eigenvalues are related to harmonic frequencies by λ m = ω −2 m .Since oscillation energy is proportional to the frequency squared, D(ω 2 ) is essentially a histogram of the inverse-eigenvalues.Critical Note 1: The DOS conveys information about the rigidity of a solid; when there are many low-energy modes, the system is "soft" and will appear unstable to excitations.Here again, we mention the caveat that "low-energy modes" in the asocial model are a linearized approximation of the true response.Critical Note 2: A useful conceptual touchstone is the Debye law for regular lattices wherein the DOS is often expressed as the density of frequencies ω as opposed to energies ω 2 .In d dimensions, the Debye law is D(ω) ∼ ω d−1 .
Example: Interpreting eigenvalue spectra through a DOS lens.We previously analyzed the eigenvalue spectra to examine their dependence on the rattler threshold ξ r .To provide additional context for these plots, we can explicitly plot the DOS (Fig. 4D) to reveal the distribution of low-frequency excitations.
These measurements provide a potential target of opportunity for theoretical predictions.

(Optional) Step 7: Find soft spots
When studying jammed granular materials, certain regions are often found to be partially underconstrained resulting in the presence of "soft spots" that are more likely to undergo large structural rearrangements when the system is perturbed [18].In the context of analyzing dense human crowds, soft spots localize the people undergoing the largest displacements.The agents in these soft spots are known as "bucklers" [21], and they can be identified with a thresholding process similar to the one used to find rattlers.
In this case, we seek the N s non-rattler agents indexed by j from the collection of low-m modes that obey where each term is defined as in Eqs. ( 11)-( 13), and ξ S is a yet-to-be-determined threshold for finding agents in soft spots.We also seek the N D non-rattler agents indexed by i whose dynamics in x, y obey in mind, we note the asocial model involves a propulsion force aligned to a specific point of interest P that explicitly breaks x, y translational symmetry (Eq.2).We therefore predict a low-energy long-range pseudo-Goldstone mode to be a fundamental collective excitation of the asocial model.
One way to check if one or more of our system's modes is a pseudo-Goldstone mode is to examine whether the polarization correlation length is system-spanning [45].Practically, this is accomplished by calculating • [ e j m − Φ(m)] d ij =d about this average value [45].In this last expression, the average is over all particles i and j whose pairwise distance d ij is equal to the distance d.
We then define the correlation length l c (m) as the minimum distance at which C m (l c ) = 0. Plotting C m (d) for a few modes shows that most have a relatively short correlation length, while the m = 1 mode extends across the entire system (Fig. 10).This system-wide excitation is at the lowest possible mode number, which in linear mode theory corresponds to the lowest possible excitation energy.Thus, these two pieces of evidence combine to strongly implicate the m = 1 mode as a pseudo-Goldstone mode.Because its origins can be traced to a broken continuous symmetry, this long-range highly correlated collective motion is an intrinsic effect of densely aggregated agents.
In our asocial model, the m = 1 mode is a collective motion that slides up-and-down along the right-most edge of the simulation box (Fig. 9C).In real-world circumstances this means the most easily-excitable mode would result in a large number of people being suddenly displaced together, possibly toward a wall, concert stage, or some other barrier.Such a situation has been widely observed in conjunction with the emergence of shock waves and density waves during "crowd turbolence" [11,46].Since its origins can be traced to the general principle of symmetry breaking, this type of long-range collective motion can be expected as a latent excitation arising in a wide range of circumstances with the potential for causing crowd crush casualties [47].

Topological defect density drives disorder in the modes
If broken symmetries and pseudo-Goldstone modes can explain the m = 1 long-range collective motion, then what is the most useful way to understand the remaining m > 1 modes?Remarkably, and somewhat surprisingly, topological principles provide useful insights.Two modes are considered topologically equivalent if their vector fields can be continuously deformed to match one another, and as such, the difference in excitation energy will become arbitrarily small as their alignment converges [40].
However, the introduction of a topological defect, such as vortices in superfluids, magnetic flux tubes in superconductors and edge-dislocations in liquid crystals, prevent convergence and drive a persistent non-zero energetic difference.To check whether topological defects play a meaningful role in explaining the structure of eigenmode m, we calculate the winding number charge q i = (2π) −1 ∇θ • d i for each non-edge agent i using a path i that loops around i's nearest-neighbors.Here, ∇θ measures the change in orientation between each agent's eigenmode vector along the loop.This measure is 0 when there are no topological defects, +1 if there is a vortex centered on the agent, or −1 if there is an anti-vortex centered on the agent.We can therefore identify each agent as coinciding with a topological defect depending on whether the local vector field makes a vortex with positive or negative charge q i .Qualitatively examining q i for a handful of different modes, it is clear that there are a number of topological defects, especially for m > 1 (Fig. 11A).Recognizing that the rotation of the mode's vector field around these defects will reduce the correlation length, we sum the total absolute defect charge Q abs (m) = N i=1 |q i | for each mode (Fig. 11B) and estimate the expected correlation length.Assuming n defects are randomly distributed among N agents in a half disk of radius R and area πR 2 /2, the spatial distribution of defects can be Bottinelli et al.

How to: Mode analysis
expressed by a Poisson process of intensity ρ q : where ρ q = Q abs /N is the defect density.The probability to find the first defect closest to the disk's origin at a distance greater than R is equal to F (> R) = P (no point within R) = exp[−ρ q πR 2 /2].Differentiating with respect to R and being careful with the sign of F (> R), we find the probability f (R) for the first neighbor at a distance R is f (R) = πρ q Re −ρqπR 2 /2 .Thus, the average nearest-neighbor distance R between any two points in the semi-disc is: which is the average distance between two topological defects at density ρ q .To connect R with the polarization fluctuation correlation length, we notice that if two neighboring defects are both positive or both negative, the mode's vector field cannot change direction in the space separating them, therefore l c = R .However, if two neighboring defects have opposite signs, the vector field must change sign and l c ≈ R /2.Therefore, on average we have This last expression is a parameter-free theoretical prediction that readily agrees with our numerical computations (Fig. 11C), suggesting that disordered features of modes m > 1 arise from topological defects.For m < 50 (Fig. 11B), we interpret this result as indicating that modes cannot be continuously deformed into one another, while for m > 50 a maximum disorder is reached due to a saturation in defect density.

Collective motion has microscopic structural origins
Understanding collective motion in high-density crowds is motivated by an impetus to predict and prevent human disaster.Thus far, we have successfully linked individual trajectories to emergent collective motion through mode analysis.While the analysis provides insights such as the existence of soft spots and psuedo-Goldstone excitations, the specific locations and orientations of these phenomena depend on trajectory data.Consequently, these details can only be unlocked through an after the fact analysis, as opposed to the more-desirable goal of assessing risk in real-time.Nevertheless, it should still be possible to Frontiers perform real-time risk assessment given that these collective motions ultimately depend on the microscopic self-organized structure of the crowd.The question remains: how do we make such an inference?
When we examine the structure of high-density crowds, it clearly deviates from the well-known hexagonal packing of 2D hard disks under uniform conditions.In the asocial model, these irregularities arise from a pressure gradient, which is calculated by averaging Eq. ( 8) as a function of distance from the point of interest P (Fig. 12A).The structural irregularities created by this pressure gradient can be quantified by similarly computing the average number of interacting neighbors, also as a function of distance from P.
Within the bulk of the crowd, this value is typically equal to six (Fig. 12B, black line), which would be expected for homogeneous packing, while increasing to seven near P due to the higher pressure (Fig. 12B, dashed black line).If we now filter our averaging to examine just the bucklers within soft spots, we see the average number of interacting neighbors is measurably higher, suggesting that the local coordination number contains a signature of these potentially high-risk locations (Fig. 12B, solid red line).Examining specific runs and comparing the local coordination number (Fig. 12C) to the location of bucklers in soft spots (Fig. 12D), we find a broad consistency between these two measures.The critical point here is that the coordination number can be extracted from a single "snap-shot" of the crowd, whereas soft spots are identified through the full machinery of mode analysis.Even if the mapping is not perfectly one-to-one, coordination number may provide a valuable correlate for predicting the location of high-risk areas before collective motions become excited.
If soft spots are indeed a consequence of local structure, the mechanistic connection remains to be identified and understood.Therefore, we measure the two particle radial structure factor , which quantifies the radial distribution of distances between neighboring agents.Unlike globally averaged properties, g(d) provides information about the local structure [36].For the asocial model, it reveals that the overall structure has clear short range order (Fig. 13, peaks for d < 3) but no periodic long range order (Fig. 13, generally smooth distribution for d > 3).We see the position of the first peak centered at d = 0.8 suggests that, on average, the agents slightly overlap due to the self-propulsion forces.When we filter our measurement and reexamine g(d) strictly for the bucklers in soft spots, the data shows a new sub-peak around 0.5 d 0.8, while a second more prominent peak shifts to d ≈ 0.9.This seems to suggest the structure within a soft spot is asymmetrically squeezed with nearest-neighbors somewhat closer than average in one direction, presumably in the direction of P, while also somewhat further away than average from other neighbors.As a result, this irregular structure provides a microscopic mechanism for bucklers to easily displace when perturbed.
In terms of high-density human crowds, motion can be thought of as the superposition of the most easily excited modes.When this happens, our analysis predicts that people in soft spots would be the ones displacing the most.Since perturbations typically come abruptly and unexpectedly, we interpret soft spots as posing the highest risk for tripping and subsequent trampling.This phenomenon has been documented in a number of crowd disasters, when sudden unexpected movements of the crowd cause individuals to trip and fall, resulting in injury or death due to trampling or compressive asphyxia [11,47,48].Furthermore, the observation that these areas can be heuristically detected through heterogeneity in the local coordination number provides a potential target for real-time prediction and prevention.

3.4
The "participation ratio" and "effective coordination number" do not measure collectiveness in the asocial model While we were able to successfully co-localize soft spots with the local coordination number, there are two additional metrics we tested that were found to provide insufficiently detailed information.Nevertheless,

Bottinelli et al.
How to: Mode analysis because these metrics are more widely used to study densely-packed jammed systems, we provide an overview of our findings so that others may have our null results as a reference.Our main finding with these metrics is that they seem to detect a difference between high-and low-m modes with a transition around m ≈ 50, similar to the transition found when measuring the density of topological defects (Fig. 11B).
However, this says little about soft modes, the eigenvalue spectrum, the DOS, or auto-correlation length.
Any deeper significance for active matter systems apparently requires further analysis.
A standard measure for the collectiveness of a mode is given by the participation ratio PR(m), which quantifies how many agents in the system move when a given mode is excited.In the literature there are several definitions of this metric [20,15], and while we tested them all, we present results when PR(m) is calculated as which, respectively, takes values between 0 and 1 for fully localized and fully extended collective motion [20].Plotting the participation ratio against mode number m provides a signature of the system and gives an overview of the collective nature of modes.In our case, we find the participation ratio for soft modes is lower than the random matrix RM σ , and increases toward 1/2 with mode number m (Fig. 14A).
This occurs because the typical length of the displacements on the high-m modes are highly similar while their direction is random.Conversely, soft mode displacements are more variable in length but more correlated in direction.This seems to suggest that in the framework considered here, the participation ratio is not an appropriate measure for detecting collective behavior.
Another commonly used metric to characterize modes is the effective coordination number [20]: where z i is the number of neighbors interacting with the i th agent defined as d ij < 2r 0 , and -3 is to remove degrees of freedom associated with global rotational/translational symmetries.This expression calculates the average number of constraints per agent in each mode, weighted by their displacement on that mode.In jammed solids, its value depends on the amount of compression and affects the frequency ω of modes [20].
Here, we cannot precisely relate the eigenvalues to the energy of the system, therefore z eff (m) simply helps identify over-or under-constrained modes.In particular, rigid stability requires z eff (m) ≥ 3; in light of our results (Fig. 14B), the system appears generally non-rigid.

DISCUSSION
With an eye toward understanding, predicting, and preventing tragedies at mass gatherings, we view our main results as revealing mechanisms for the emergence of potentially dangerous collective motion.By first identifying these principles and outlining a quantitative framework for measuring their existence, we are now in position to test their real-world applicability using video data of concerts, pilgrimages, and sporting events.This next step is a straight-forward empirical data collection process, given the current availability of low-cost high-definition digital cameras and inexpensive cloud-computing resources for rapid image analysis.The only remaining obstacle, therefore, is to develop computer vision algorithms that robustly and automatically track individual trajectories in footage of high-density crowds.While this image

FIGURES AND CAPTIONS
This is a provisional file, not the final typeset article    .Heuristic identification of rattlers through their auto-correlation time ∆t * .Each SPP is shown as a square and shaded according to ∆t * .In runs 1 and 3 the SPPs with the highest auto-correlation time are also identified as rattlers by the threshold method (red circles, ξ r = 4).In run 2 this heuristic is less successful at identifying rattlers near the center of the aggregate.Winding number charge q i for the soft modes of two independent simulation runs.Each SPP is colored according to its winding number and topological defects are centered on SPPs with q i = 0.In both simulation runs the number of topological defects increases with mode number.(B) The total absolute charge Q abs plotted against mode number m for 10 independent simulation runs shows the number of defects increases with increasing mode number.(C) Scatterplot of the total absolute charge Q abs versus correlation length l c for each mode of the 10 independent runs (circles) compared to the predicted estimate from our geometric argument (line).(B) Zooming in on the blue boxed region from (A), we see differences in local structure between bucklers in soft spots (red solid line) the averaged aggregate (black dashed line).All data is generating by averaging over 10 independent simulation runs.

Figure 1 .Figure 2 .
Figure 1.Typical examples of non-human granular media including (A) quinoa, (B) muesli, (C) mixed candy, and (D) rice, compared to high-density human crowds at (E) a Black Friday sales event and (F) an outdoor concert.Similarities with the former inspires an analysis of the latter.

Figure 3 .Figure 4 .
Figure 3. Trajectories r i (t) for N = 200 SPPs in a typical simulation run of the asocial model for highdensity human crowds.After a transient period (radial lines toward ), the SPPs reach a stable state where they randomly oscillate about their average position (inset, dense squiggles).Each SPP is given a different color to more easily distinguish it from its neighbors.

Figure 5 .
Figure 5. Visualization of five eigenmodes for two independent simulation runs before (red) and after (blue) removing the rattlers with a threshold choice of ξ r = 4. Removing rattlers affects the first few eigenmodes by filtering out large irregularly-oriented vectors, but has little affect at higher m.

Figure 6
Figure 6.Heuristic identification of rattlers through their auto-correlation time ∆t * .Each SPP is shown as a square and shaded according to ∆t * .In runs 1 and 3 the SPPs with the highest auto-correlation time are also identified as rattlers by the threshold method (red circles, ξ r = 4).In run 2 this heuristic is less successful at identifying rattlers near the center of the aggregate.

Figure 7 .Figure 8 .
Figure 7.The agreement functions for the correlation between being in a soft spot (S), undergoing large pressure (P ), and large displacement (D) before (top) and after (bottom) the normalization by the weighting functions (center) that measure the size of the sets S, P, D.

5 Figure 11 .
Figure 11.Topological defects drive disorder in m > 1 modes.(A) Winding number charge q i for the soft modes of two independent simulation runs.Each SPP is colored according to its winding number and topological defects are centered on SPPs with q i = 0.In both simulation runs the number of topological defects increases with mode number.(B) The total absolute charge Q abs plotted against mode number m for 10 independent simulation runs shows the number of defects increases with increasing mode number.(C) Scatterplot of the total absolute charge Q abs versus correlation length l c for each mode of the 10 independent runs (circles) compared to the predicted estimate from our geometric argument (line).

Figure 12 .
Figure 12.Local coordination number correlates with soft spots.(A) The average radial pressure normalized by P 0 = µv 0 (2πr 0 ) −1 reveals a gradient within the aggregate.This pressure drives deviations (B) from homogenous hexagonal packing as measured by the average number of interacting neighbors at a distance r from the point of interest P. Hexagonal close packings of hard disks have six interacting neighbors, which is added as a dashed line for reference.(C) The local coordination number, shown here for 4 runs, can be computed from a snap-shot of the system, which (D) correlates strongly with the location of soft spots that are found through more lengthy computations involving the identification of aggregated bucklers (red).

Figure 13 .
Figure 13.The structure factor g(d) helps explain why local coordination predicts location of soft spots.(A)The structure factor g(d) shows short range order (peaks for d 3), but generally no long-range order.(B) Zooming in on the blue boxed region from (A), we see differences in local structure between bucklers in soft spots (red solid line) the averaged aggregate (black dashed line).All data is generating by averaging over 10 independent simulation runs.
r 0 divided by the preferred speed v 0 .The average crowd density n ≈ N/π(

Bottinelli et al. How to: Mode analysis
Figure 14.While consistent with our general findings, not all measures of structure explain collective motion.(A) The participation ratio PR(m) initially decreases as a function of mode number, but after m ∼ 6 increases towards the random case (dotted line).(B) The effective coordination number z eff (m) shows the lowest modes are under-constrained.