^{1}Toyota Central R&D Labs, Inc., Tokyo, Japan^{2}Department of Physics, Nagoya University, Nagoya, Japan

Glass transitions are widely observed in various types of soft matter systems. However, the physical mechanism of these transitions remains elusive despite years of ambitious research. In particular, an important unanswered question is whether the glass transition is accompanied by a divergence of the correlation lengths of the characteristic static structures. In this study, we develop a deep-neural-network-based method that is used to extract the characteristic local meso-structures solely from instantaneous particle configurations without any information about the dynamics. We first train a neural network to classify configurations of liquids and glasses correctly. Then, we obtain the characteristic structures by quantifying the grounds for the decisions made by the network using Gradient-weighted Class Activation Mapping (Grad-CAM). We consider two qualitatively different glass-forming binary systems, and through comparisons with several established structural indicators, we demonstrate that our system can be used to identify characteristic structures that depend on the details of the systems. Moreover, the extracted structures are remarkably correlated with the non-equilibrium aging dynamics in thermal fluctuations.

## 1 Introduction

When a liquid is cooled while preventing crystallization by quenching or adding impurities, a liquid state can be maintained below melting temperature, resulting in a so-called supercooled liquid state. Further cooling of the supercooled liquid results in a dramatic increase in the viscosity of the liquid and yields a glass (more generally, an amorphous solid). In such a system, the particle motion is frozen, and the structure remains disordered. Various materials, e.g. oxides, alloys, polymers, and colloids, take on glassy states. Glassy materials are generally considered disordered and homogeneous because they basically cannot be distinguished from simple liquids that are also disordered in structure using analytical methods such as neutron, X-ray, or light scattering and other two-body correlations in the density field. However, dramatic changes to their properties can occur, for example, a 15-order-of-magnitude increase in the viscosity from a temperature change of only approximately 20% [1]. Although the glass transition phenomenon has been studied for more than 150 years, its mechanism has not yet been clarified [2–5].

Heterogeneity in particle motion develops in supercooled liquids near the glass transition temperature, and the spatial length scale increases on such a glass transition [6–12]. This behavior is called dynamic heterogeneity and is a potential cause for the rapid increase in viscosity at the glass transition point. However, to date, the origin of this dynamic heterogeneity has not been clarified; in particular, questions remain as to whether it is formed entirely dynamically or whether a static structure exists in the background. The “dynamical facilitation theory” describes the heterogeneity associated with glass transitions as a fully dynamic phenomenon, and explains the experimental results and numerical analysis of glass transitions [13]. In contrast, “the theory of random first-order transition” (RFOT), which considers the glass transition as a thermodynamic phase transition and proposes a scenario in which a static conceptual structure called a “mosaic” develops, also explains the experimental results and numerical analysis of glass transitions [4, 14]. Thus, although these theories are contradictory in terms of whether the glass transition is a purely dynamic transition or a thermodynamic phase transition governed by a static structure, they can explain various aspects of the glass transition phenomenon. Hence, in the current state, there appears to be no definitive theory for understanding the full picture of glass transitions.

Many attempts have been made to explore the specific structures that exist in supercooled liquids. For instance, icosahedral-like structures in metallic glasses [15, 16] and medium-range crystalline order in colloidal glasses with small particle-size dispersity have been found [17–20]. Order parameters are introduced on a system-by-system basis to extract these characteristic structures, but no order parameter applicable to *all* amorphous solids has been found. It is also unclear whether such characteristic structures are universal; this is a topic of active debate. Therefore, elucidating the presence or absence of a universal structure in amorphous solids is a significant and challenging problem in fundamental physics. Tong and Tanaka recently developed a new order parameter consisting of the bond angles of particle structures and successfully extracted the characteristic structures correlated with particle dynamics for a wide range of glass-forming systems, including binary mixtures and polydisperse glassy systems with large particle size dispersions [21, 22]. However, as indicated in the corresponding literature [21, 22], this method has not been able to extract the characteristic structures in the Kob–Andersen system [23], a typical glass-forming model, and consequently, attempts to develop a universal structural analysis method for a variety of glass-like systems continues to the present day. We mention that as another branch of examples of promising static information-based approaches, a method relying on the Franz-Parisi potential has been proposed [24]. Its effectiveness was demonstrated by the quantitative correspondence with the structural relaxation time. The recently proposed microscopic version of a similar Franz-Parisi potential-based quantity would allow us to specify the local characteristic structures that govern the dynamics on the purely static basis [25].

In recent years, machine-learning approaches have been widely used to investigate the characteristic structures governing glass dynamics [26–30]. In particular, recent studies have successfully predicted the dynamics from the static structure in Kob–Andersen systems by learning from a large amount of structural data, as well as the corresponding dynamic data, using graph neural networks [28, 29]. In addition to these supervised approaches, unsupervised counterparts have also been applied to the extraction of characteristic structures from glasses, pioneered by Ronhovde and co-workers [31, 32]. Interestingly, many researchers have recently reported that the structures extracted using unsupervised methods [27, 30] exhibit correlations with the long-time dynamics, despite no information about the dynamics being provided during the training. However, although machine learning is very promising for exploring the structures of glasses, accurate learning (including preparation of the training data) is computationally expensive, and the results are difficult to interpret.

In this work, we propose a method to extract the characteristic multi-particle structures of glasses solely from the static configurations using a deep learning-based approach. To this end, we work on the classification problem for the random structures in glasses and liquids using a convolutional neural network (CNN) [33] and then identify the structures that the CNN relied on to make decisions using gradient-weighted class activation mapping (Grad-CAM) [34]. We applied our proposed method to two representative glass-forming liquid systems and compared the obtained structures with well-established structural indicators. The results demonstrate that the proposed method can extract qualitatively different characteristic structures in a system-detail-dependent manner. Surprisingly, although our method does not refer to information about the dynamics during the learning process and extracts the characteristic structure solely from the instantaneous static configurations, the obtained structures strongly correlate with the non-equilibrium aging dynamics.

The remainder of this paper is organized as follows. In Section 2, we summarize the simulation methods and protocols used for sample preparation for the deep-learning tasks. In Section 3, we introduce the CNN and Grad-CAM, and provide a brief explanation of the established structural indicators used as a reference. In Section 4, the results of the structural analyses are presented, and the correlation between distinct indicators, as well as the predictability of our method with respect to the dynamics, is discussed. Finally, in Section 5, we provide a summary of this study and an overview of future research directions.

## 2 Simulations

### 2.1 System setups

In this study, we consider two distinct systems: the Kob–Andersen model (KAM) [23] and the additive binary mixture (ABM) [35]. Both systems are two-dimensional (2D) and are described by the Lennard–Jones (LJ) potential with linear smoothing terms:

where the subscript *ij* indicates that the variable is between particles *i* and *j*, *ϵ*_{ij} sets the energy scale, *σ*_{ij} determines the interaction range, and *ϕ*_{ij}, whereas *ϕ* guarantees the continuity of the potential and force at cutoff distance

Both systems are composed of two different types of particles (A and B) and are characterized by different parameter sets, such as *ϵ*_{ij} and *σ*_{ij}. In the case of the KAM, the LJ parameters are non-additive: *σ*_{AA} = 1, *σ*_{AB} = 0.8, *σ*_{BB} = 0.88, *ϵ*_{AA} = 1, *ϵ*_{AB} = 1.5, and *ϵ*_{BB} = 0.5. Therefore, the concept of “particle size” is not well defined in the KAM system. For the ABM, on the other hand, the parameters are simply additive; thus, we can unambiguously say that particle A is small, and B is large (that is, *σ*_{AA} = 5/6, *σ*_{AB} = 1, *σ*_{BB} = 7/6, and *ϵ*_{ij} = 1) regardless of the combination of types of particles *i* and *j*.

All observables were non-dimensionalized using characteristic length *σ**, characteristic energy *ϵ**, and particle mass *m** (the characteristic variables are listed in Table 1). The total number of particles was fixed at *N* = *N*_{A} + *N*_{B} = 2000. The number density *ρ* = *N*/*L*^{2} and the number ratio of the two-particle species *N*_{A}/*N*_{B} also differ between the two systems, i.e., the KAM and ABM. With the values of *ρ* used here, the systems entered the glassy phase once the temperature was sufficiently low [23, 35]. Information about the parameters mentioned here is summarized in Table 1. Although we consider only 2D systems in this article for the sake of simplicity, we stress that all the analyses here can be easily extended to three-dimensional systems, which will be performed in the future.

### 2.2 Sample preparation protocol

We performed molecular dynamics (MD) simulations using the open-source Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS: https://www.lammps.org/). We generated samples *via* NVT simulations using the Nosé–Hoover thermostat. Periodic boundary conditions were set in all directions. In this study, we aim to address a simple binary classification problem. For the two classes, we chose configurations at the temperature where the dynamic slowing-down starts at (*T*_{L} = 0.8 for KAM and *T*_{L} = 2.0 for ABM) and at a very low temperature (*T*_{G} = 0.05 for both models). For both KAM and ABM, we first generated 5000 independent random configurations (4,000 were used for training, 400 for validation, and 600 for the test data) and equilibrated them at a very high temperature of *T* = 4.0. The obtained configurations were then cooled at a constant cooling rate *T*_{L} and *T*_{G}. The samples at *T*_{L} correspond to “equilibrium” supercooled liquids in the sense that their dynamics exhibit time-translational invariance, whereas those at *T*_{G} are regarded as “non-equilibrium” glasses in the sense that they are expected to experience aging. Note that, judging from the evolution of the potential energy of the system as a function of the temperature (Supplementary Figure S6) [37], crystallization is avoided in both systems at this cooling rate (i.e., we did not observe any discontinuous jumps in the energy). Consequently, the radial distribution function *g*(*r*) of the configurations at *T*_{G} does not show any signs of global crystallization (Supplementary Figure S7).

## 3 Analytical methods

We train a neural network to distinguish two classes of systems, “glass” and “liquid,” based purely on the instantaneous configurations. Then, the characteristic structures of glasses are identified by extracting the meso-scale structures that the trained network relied on to provide a correct classification. In this section, we explain the methods used to achieve these classifications and identifications of characteristic structures.

### 3.1 Convolutional neural network (CNN)

We first perform supervised learning to train a CNN [33, 38] to predict whether a given configuration is glass or liquid. Following Ref. [38] in which the authors tackled a similar classification task successfully, our network has no pooling layers. It is then simply composed of three convolutional layers and subsequent activations (the rectified linear units), followed by the fully connected layer, dropout layer, and final fully connected layer as the output layer. Note that, although we apply the “softmax” function subsequently to obtain the final results, the output layer of the network is the fully connected one to make it compatible with Grad-CAM, as explained in Section 3 2. The full details of the network and learning protocol, including the precise values of the hyperparameters, are summarized in the Supplementary Material. After training, the softmax layer outputs a value in the range of [0,1] which can then be interpreted as the probability for a configuration to be assigned to one of these classes.

Importantly, when we feed the particle configurations,

### 3.2 Gradient-weighted class activation mapping (Grad-CAM)

Once a CNN is able to classify glasses and liquids correctly after training, we aim to extract the characteristic mesoscale structures that the CNN relies on when classifying. This identification of crucial information is called “class activation mapping” (CAM). The first-proposed simple CAM [39] assumed a global average pooling at the end of the network, and thus, cannot be utilized for networks with different types of architectures. This problem has been solved using a method called gradient-weighted class activation mapping (Grad-CAM) [34]. In Grad-CAM, CAM is calculated based on the differential of the output of the network with respect to the feature maps as

where *y*^{C} is the score for class *C* (*C* ∈ {*glass*, *liquid*} in the current setup) before softmax, *m*-th feature map activation of a convolutional layer, *k*, *l*) component of *Z* = *uv* is the normalizing factor for the global pooling calculation. The rectified linear unit *ReLU* simply returns *x* if *x* > 0 and zero otherwise. Thus, in this Grad-CAM method, the characteristic part of the input information is identified as the weighted sum of the feature maps after a specified convolution layer (usually the last layer), and the weights are obtained depending on the global average of the sensitivity (gradient) of the output with respect to each pixel of the feature maps. Importantly, this method can be applied to networks with any architecture if the backpropagation is tractable.

The results presented in this paper are all particle-based Grad-CAM scores, _{i} is the Grad-CAM score of particle *i*. We simply call Γ the Grad-CAM score. Note that, hereinafter, all particle-based variables are coarse-grained and normalized (see the Supplementary Material for technical details, including the precise definition of Γ_{i}).

### 3.3 Voronoi volume

In this study, we compared the results of the proposed method with those of handcrafted structural indicators to interpret the obtained Grad-CAM score Γ. The first reference indicator is the volume of the Voronoi cells ϒ that particles reside in (here, we call them volumes, although they are in fact areas because the system is 2D). The volume of the Voronoi cell allows us to quantify the so-called free volume of each particle, which is considered a significant static characteristic that explains the divergence of the viscosity in glass transition (the free volume theory) [40]. We note that, in Refs. [41, 42], the *microscopic* correlation between the free volume and the dynamics (i.e., the dynamical propensity) was studied and concluded to be not significantly correlated. On the contrary, a strong correlation between the free volume and bond-bond breakage occurring over long periods of time in low-temperature glassy systems has been reported [43]. Despite this controversial situation, because there is no doubt that the free volume of the particles is an important interpretable static property determined geometrically from the particle structure, we will refer to it here as one of the structural indicators.

The Voronoi cell to which particle *i* belongs can be uniquely defined without the introduction of any additional parameters, as follows:

where ** a** and

**. The point**

*b***in the equation is an arbitrary point in the system that is independent of the particle density field**

*r**ρ*(

**). The volume of the Voronoi cell for particle**

*r**i*can then be obtained as

*V*. To achieve Voronoi tessellation, we used the

*freud*[44, 45] Python library, which properly considers periodic boundary conditions. The Voronoi cell volumes provide a quantitative measure of the (inverse) local packing density. We call

### 3.4 Tong–Tanaka order parameter

Tong and Tanaka [21, 22] proposed an excellent order parameter that can characterize structures correlated with long-time dynamics even in glass-forming systems with large particle size dispersion, where characteristic structures are difficult to capture with bond-orientation order parameters [19]. We call this order parameter the Tong–Tanaka order parameter (TT-OP). The TT-OP has been successfully used as a structural indicator of the dynamic properties of various glasses and hence we measure it as a reference below.

To calculate the TT-OP, we first look at each particle (e.g., particle *o*) and its neighbors (the particles sharing the edges of the Voronoi cell with center particle *o*). Then, particle *o*’s TT-OP, Θ_{o}, is obtained as the average difference between the angle formed by particle *o* and two of its neighbors that are adjacent to each other,

where *N*_{o} is the number of particles neighboring particle *o* (this number agrees with the number of neighboring pairs of neighbors). The TT-OP

### 3.5 Dynamic propensity

As a measure of the dynamic heterogeneity that appears originated from a specific configuration of particles, the so-called dynamic propensity is usually employed [46–48]. To define this variable, we introduce the isoconfigurational ensemble first: in this special ensemble, samples share an identical initial particle configuration, *T* (the statistics of velocities obeys the Maxwell-Boltzmann distribution with this temperature). In this study, for each initial configuration, we performed MD simulations with 30 different initial velocity distributions. For each realization, we calculated intensity of the so-called cage-relative displacement (CRD)^{1}, *i* at time *t*, *j* runs over the neighbors of *i* including *i* itself, and *N*_{i} stands for the number of particles involved here) is the average displacement vector of the cage to which particle *i* belongs, and the superscript *s* is the sample index (which distinguishes different realizations of the velocity distribution at *t* = 0). The dynamic propensity field Δ is then defined as the average of the sample-dependent values of the CRD field, *N*_{s} samples as *N*_{s}, we basically employed *N*_{s} = 30 unless stated otherwise.

### 3.6 Coarse-graining of particle-base indicators

In References [21, 22], Tong and Tanaka showed that, when properly coarse-grained, the TT-OP introduced in Section 3.4. correlates strongly with the dynamic propensity field. In our analyses, we have coarse-grained all the particle-based indicators by a similar method to the one proposed in Ref. [22]:

where *ξ*_{X} is the coarse-graining length for variables *X* ∈ {Γ, ϒ, Θ, Δ}. For the calculation of the coarse-graining of the variables *X*, the cutoff distance *P*(*r*) is introduced, which is fixed as

We stress that, in this work, we coarse-grain not only the structural indicators but also the dynamic propensity field. We explain how we determine the coarse-graining lengths *ξ*_{X} in Section 4.2. Additionally, all particle-based variables are further normalized to [0,1] by simply subtracting the minimum value and then dividing by the maximum.

## 4 Results and discussions

### 4.1 Extraction of characteristic structures using Grad-CAM

The CNN introduced in Section 3.1. was run over 250 epochs. During the training process, 4,000 training data samples (for both glasses and liquids: 8,000 samples in total) were provided with the correct labels indicating whether the samples were glasses or liquids. For both systems (KAM and ABM), the learning stage proceeded smoothly, and the classification accuracy reached almost 100% both for the training and validation data after these relatively small epochs. The same degree of accuracy was achieved for the test data (the results for the test data are summarized in Table 2). We stress here that the calculation cost for the training part is very low in our setup: the entire 250 epochs of learning only took approximately 8 h using an NVIDIA Quadro P4000 (GP104GL).

Subsequently, using Grad-CAM, we further extracted the characteristic structures that the CNN relied on when identifying glass samples as *glasses*. Notably, this calculation requires only a trivial cost (much less than a second for each sample). We present the typical results obtained for the KAM system in Figure 1A and those of ABM in Figure 2A (notice that Γ visualized here are coarse-grained with the length *ξ*_{Γ} determined in the next Section 4.2). Both these results are for the glass configurations: although we can also investigate the characteristic structures of liquids and try to extract glass-like structures from liquids (and *vice versa*) within the Grad-CAM framework, we restricted ourselves to the investigation on the characteristic structures of glasses in this study^{2}.

**FIGURE 1**. Visualization of particle-based structural indicators for a typical glass configuration of the Kob–Andersen model (KAM) system: **(A)** Grad-CAM score (Γ), **(B)** Voronoi volume (ϒ), **(C)** TT-OP (Θ), and **(D–F)** Dynamic propensity (Δ) at different “time” scales. The argument of Δ stands for the mean intensity of the displacements *δ* at which Δ is measured: as indicated in the panel titles, *δ* ≈ 0.3, 1.0, 3.0, which roughly corresponds to *t* ≈ *τ*_{α}, 3*τ*_{α}, 10*τ*_{α}, are employed. Notice that all indicators are normalized to [0,1], and the different colors distinguish the values as shown in the color bar. In addition, 1 − Γ and 1 − Θ are shown in panels **(A)** and **(C)**, respectively, rather than Γ and Θ, for ease of comparison with the dynamic propensity. The precise values of coarse-graining length *ξ*_{X} employed here are summarized in Table 3.

**FIGURE 2**. Visualization of particle-based structural indicators for a typical glass configuration in the ABM system. The meanings of the panels are basically the same as those presented in Figure 1 (**(A)** Grad-CAM score, **(B)** Voronoi volume, **(C)** TT-OP, **(D–F)** Dynamic propensity at different time scales), while 1 − ϒ and Θ are shown in panels **(B)** and **(C)**, rather than ϒ and 1 − Θ.

### 4.2 Determination of coarse-graining length

In this subsection, we explain how we determined the coarse-graining lengths *ξ*_{X} (*X* ∈ {Γ, ϒ, Θ, Δ}) that are used for the analyses in the following sections (or already used in Figures 1A, 2A). Since the coarse-graining lengths for the structural indicators are determined depending on the coarse-grained dynamic propensity field, we explain that for the dynamic propensity *ξ*_{Δ}, first.

Coarse-grained structural indicators exhibit spatially smooth profiles as already presented in Figures 1A, 2A. On the other hand, as shown in Supplementary Figures S4, S5, the bare dynamic propensity field without coarse-graining shows noisy profiles even within each mobile/immobile domain. When we try to quantify the dynamic heterogeneity, we are interested in the meso-scale domain exhibited by the propensity field. However, in the presence of these intra-domain noises, the estimation of correlation with structural indicators suffers from high-frequency modulations. To exclude this unintentional underestimation of the correlation, we coarse-grained the propensity field as well. To systematically determine the coarse-graining length *ξ*_{X}, we first measure the spatial correlation function of the dynamic propensity *X* = Δ:

where *r* is the distance from the reference particle *i*, and ⟨⋅⟩_{|r|=r} stands for the spherical average over particle pairs separated by a distance *r*. Since we aim to smooth out the intra-domain noises here, we employ the decay length *r** defined by *C*_{Δ}(*r**) ≈ 1/*e* as the coarse-graining length *ξ*_{Δ}. In Figure 3, we plot the measurement results of *C*_{Δ} at three different time scales for both KAM (panel **A**) and ABM (panel **B**). For later convenience, in this study, we express the dynamic propensities at different time scales as functions of the mean intensity of the displacement (here, the displacement is cage-relative one. And the average is taken over the isoconfigurational samples and particles), *δ*). In Figure 3, *C*_{Δ} at *δ* = 0.3, 1.0, 3.0 are shown. These values of *δ* are expected to correspond to approximately *t* ≈ *τ*_{α}, 3*τ*_{α}, 10*τ*_{α}, where *τ*_{α} is the *α* relaxation time [9]. We summarize the values of extracted coarse-graining length in Table 3. Below, we use these values of *ξ*_{Δ} for Δ at these three time scales.

**FIGURE 3**. The spatial correlation function of the dynamic propensity Δ, *C*_{Δ}, as a function of the distance from the reference particle *r*. Results for **(A)** KAM and **(B)** ABM systems. Different line styles distinguish different time scales as shown in the legend. The horizontal dotted lines depict *C*_{Δ} = 1/*e*.

We would like to stress that the coarse-graining of the dynamic propensity Δ introduced here seems not just an artificial operation but a physically reasonable one. To show this, we prepared 100 independent isoconfigurational samples and considered four different ensembles. For three of the ensembles, we employed *N*_{s} = 30, and completely different sets of samples are composed for each. We call these ensembles *N*_{s} = 100): we call this *e*_{100}. Because the number of samples is different, we expect that the ensemble *e*_{100} should provide the statistically most reliable result. In Figure 4, we compare the bare dynamic propensity fields and the coarse-grained ones obtained from these four ensembles (only results for the KA system are shown). Although we see strong fluctuations among the bare fields of ensembles *e*_{100} (Figure 4D) is much smoother than the ones of ensembles *C*_{Δ,Δ}(*e*_{α}, *e*_{β}), the correlations between the propensity fields of ensemble *e*_{i} and *e*_{j}, are presented. When we consider the coarse-grained field instead of the bare ones, we denote them as *N*_{s} = 30 are very close to each other (*e*_{30}, the average value over all combinations of *i* and *j* is shown, where *i*, *j* ∈ {*A*, *B*, *C*} and *i* ≠ *j*.) while the correlations between the bare fields are much smaller *N*_{s} further provides an important insight into the meaning of the coarse-graining of Δ. As expected, the correlation between non-coarse-grained and coarse-grained fields of *e*_{100}, namely *e*_{30}, *N*_{s} = *∞* would be identical to its bare field. Moreover, we also mention that the correlation between coarse-grained ensembles *N*_{s} → *∞*) from the numerical results with a finite *N*_{s}.

**FIGURE 4**. Comparisons between the bare and the coarse-grained dynamic propensity fields Δ of different ensembles. The results of Δ(1.0) of the KA system are shown. **(A–C)** The bare Δ fields of **(D)** the bare Δ field of *e*_{100}, **(E–G)** the coarse-grained Δ of **(H)** the coarse-grained field of *e*_{100}. The coarse-graining length of *ξ*_{Δ} = 4.0 is employed as in Figure 1E.

The coarse-graining lengths *ξ*_{α} for structural indicators *α*(*α* ∈ {Γ, ϒ, Θ}) are then determined in the same manner as the one in Ref. [22]: the values that maximize the Pearson’s correlation coefficient [27, 28, 30, 53] between structural indicators and the dynamic propensity are chosen. Here, as the dynamic propensity field, we employed the coarse-grained ones with *ξ*_{Δ} determined in the previous paragraph. The determined values of *ξ*_{α} are summarized in Table 3. See Supplementary Material for the detailed *ξ*_{α} dependence of the correlations. In Figures 1B,C, 2B,C, we show the visualization results of the coarse-grained Voronoi volume ϒ and TT-OP Θ fields (the results for the same configurations as Figures 1A, 2A). We stress that all panels present much larger domains than the size simply expected from the value of *ξ*_{α} (e.g., linear spanning of 2*ξ*_{α}).

As a reference, we also present the results without the coarse-graining of the dynamic propensity Δ (that is, results with *ξ*_{Δ} = 0) in Supplementary Figures S4, S5). We note that, as shown in Supplementary Figures S4, S5, the consequences of the coarse-graining of Δ are fairly consistent with the bare Δ field.

### 4.3 Predictability of the dynamics

Now we ask the following question: *Are the extracted structures correlated with some material properties, for example, the dynamics?* To address this, we compared the Grad-CAM scores (Γ) with the dynamic propensity (Δ). Owing to the computational cost, we calculated the propensities only for the configurations shown in Figures 1, 2 (only one configuration for each system).

For each configuration, we performed MD simulations with 30 different initial velocity distributions and calculated the dynamic propensity field Δ following the procedure summarized in Section 3.5. Regarding the temperature during the measurement of the dynamics, we considered temperatures slightly above the glass transition point, *T**, (whose empirical definition is provided in the Supplementary Material; the obtained values are *T** ≈ 0.37 for KAM and *T** ≈ 1.0 for ABM) because we cannot expect any cage-breaking relaxational dynamics below *T** within the computationally accessible time window. To investigate the possible temperature dependence of the dynamics, we performed simulations at temperatures up to approximately 1.5*T**. We stress again that although the initial velocities follow the specified temperatures (which are higher than the glass transition point *T**), the initial configurations are drawn from the sample at *T*_{G} = 0.05 (those shown in Figures 1, 2). In Figures 1D–F and Figures 2D–F, the propensity Δ at *δ* ≈ 0.3, 1.0, 3.0 are shown. Note again that we express the time-dependence indirectly *via δ*, the mean intensity of the cage-relative displacement, and these values of *δ* correspond roughly to *t* ≈ *τ*_{α}, 3*τ*_{α}, 10*τ*_{α} respectively. Interestingly, there is agreement between 1 − Γ and Δ at long times (Δ(1.0) and Δ(3.0)) for both systems.

To quantify the correlation between Γ and Δ further, we calculated the Pearson’s correlation coefficients, *C*_{Γ,Δ}. Although the coarse-graining length of Δ is dependent on the value of *δ* (i.e., the time scale), for simplicity, we employed a fixed value for each system here (see Table 3). The results are presented in Figure 5. In this plot, the time dependence is indirectly reflected by the value of *δ*. Such a presentation allows us to compare the correlation between the dynamics (at different temperatures) and the static structure directly, thus ruling out the effect of the non-trivial dependence on time. Note that the correlation between 1 − Γ and Δ is quantified, not Γ, in agreement with the visualization. From Figure 5, we observe several striking consequences. First, *C*_{Γ,Δ} rises in accordance with the increase in *δ*, reaching its maximum value at *δ* ≈ 1 in both the KAM and ABM. This indicates that the structures extracted by our method are responsible for the dynamics at a longer time scale than the *α* relaxation (note that these are non-equilibrium aging dynamics and not intra-metabasin equilibrium relaxation). Secondly, the change in the correlation *C*_{Γ,Δ} is non-monotonic in the KAM system and starts decreasing for *δ* ≥ *δ**, while plateauing for *δ* ≥ *δ** in the ABM system. These results indicate that the specified characteristic “well-ordered” clusters are transient in the KAM system, whereas they seem very stable within the time window of our calculation in the ABM system. Thirdly, the maximum correlation, *C*_{max}, reaches very high values in both systems: 0.925 and 0.712 in the KAM and ABM, respectively. The predictability of the dynamics is surprising because our method does not require any information about the dynamics during the training process; thus, the computational cost for both the training and the sample-preparation part is low. Finally, the results of different *T* follow a single master curve. This result confirms the fact that the dynamics are indeed governed by the static “glass structures,” at least in the temperature regime under study and concerns non-equilibrium aging dynamics.

**FIGURE 5**. The “time” evolution of the correlation between Grad-CAM score 1 − Γ and dynamic propensity Δ as a function of the intensity of the cage-relative displacement *δ*. Results for **(A)** KAM and **(B)** ABM. Each simulation was performed for 3 × 10^{8} steps, and *δ*(*t*) during those simulations is plotted on the abscissa for each plot. Different markers are used to distinguish different temperatures as shown in the legend. The vertical dotted lines indicate the delta values visualized in Figures 1, 2.

### 4.4 Interpretable structural indicators

In this section, we measure two distinct local multibody structural indicators to interpret the Grad-CAM score Γ. Because these indicators are handmade, we can take advantage of their interpretable nature. In the Supplementary Material, we present the two-body correlation function *g*(*r*) for reference (Supplementary Figure S7). We again stress that all the particle-based indicators, including Δ, were coarse-grained and further normalized to the interval [0,1].

#### 4.4.1 Voronoi volume

The Voronoi volume ϒ is the first interpretable local multiparticle structural indicator. In this subsection, we briefly explain the obtained ϒ values for the KAM and ABM systems. A typical result for the KAM system is shown in Figure 1B in which particles with small values of ϒ_{i} _{i}

These distinct trends are derived entirely from the difference in the set of interaction parameters (*ϵ*_{ij} and *σ*_{ij}) and the number ratio of the particle species (*N*_{A}/*N*_{B}). For instance, in the KAM system, the interaction energy is most stable when different species are in contact, and the interaction range is also the shortest in this situation (see Table 1). Therefore, small values of ϒ_{i} are energetically favored in the KAM. In the ABM system, on the other hand, because the area occupied by particles A is almost half that of particles B (the area fraction is 1: 1.96), the region with a large Voronoi volume (corresponding to particles B) tends to be slightly dominant. Because samples with a very low temperature *T* = 0.05 are shown in Figures 1, 2, the structurally low-energy states are expected to be more probable. We also note that the small value of ϒ_{i} does not necessarily mean that the local structure around particle *i* is highly ordered, as is evident in the case of the KAM.

#### 4.4.2 Tong-Tanaka order parameter

The second interpretable structural indicator is the TT-OP Θ. As mentioned in Section 3.4, of the various locally defined structural indicators reported to date, TT-OP captures the dynamical behavior of many classes of glasses very well, especially universally. The characteristic structures of glasses in terms of the TT-OP are specified by small values of Θ_{i}, which means that the local structure is highly ordered.

The results for the KAM system are shown in Figure 1C. We note that because of the non-additive nature of the potential parameters, defining a reference three-particle ideal configurational angle *θ*^{1} in the case of the KAM is non-trivial. In this study, we employed the definition using the additive assumption (*σ*_{AA} = 1.0, *σ*_{BB} = 0.88, and *σ*_{AB} = (*σ*_{AA}+*σ*_{BB})/2 = 0.94) rather than the parameters used in the simulations because the reference structure is easier to interpret with this additive assumption. Surprisingly, the coarse-grained Θ field looks very similar to other indicators and appears correlated to the long-time propensity field. The results for the ABM system are shown in Figure 2C. In contrast to the findings of Tong and Tanaka [21, 22], the structure is not well developed, and the spatial modulation is much smaller than that in the KAM: Intermediate values are system-spanning. This is likely because our samples were generated by quenching at a fixed cooling rate and, thus, were not well annealed. We expect these samples to exhibit aging behavior.

### 4.5 Correlations between different indicators

To interpret the Grad-CAM scores (Γ) obtained by our method, we further calculated the Pearson’s correlation coefficients between different indicators for both the KAM and ABM. The results are summarized in Figure 6. In this figure, we show violin plots of the coefficients between different indicators (Γ, ϒ and Θ) calculated using 600 samples for each case (KAM or ABM and glasses or liquids). In this subsection, for clarity, we employ different values of *ξ*_{α} from those used in other subsections as shown in Table 3. The change in the value of *ξ*_{α} does not change the qualitative discussion here, while the distributions (those are plotted in Figure 6) become broader, and thus the differences between them are less pronounced when *ξ*_{α} increases.

**FIGURE 6**. Violin plots of the Pearson’s correlation coefficients between distinct structural indicators. Results for the **(A)** ABM and **(B)** KAM systems. The dark and light gray parts represent the results for glasses liquids, respectively, as shown in the legend. The dashed lines indicate the quartiles.

We call the correlation coefficients between the Grad-CAM score and the Voronoi volume *C*_{Γ,ϒ} and define those between different pairs in a similar manner: *C*_{Γ,Θ} and *C*_{Θ,ϒ}. Although the results below are mostly those for the glass configurations only, we also mention the results for the liquids when we discuss their differences from those of the glass samples. Finally, we stress that, in the main text, all correlations are based on the Pearson’s definition. As presented in the Supplementary Material, however, we also obtained semi-quantitatively consistent results using Spearman’s definition. Below, we explain the results for the ABM and KAM systems.

#### 4.5.1 ABM

In the ABM system, *C*_{Γ,ϒ} (the Grad-CAM score vs. the Voronoi volume) is the largest in terms of the intensity (Figure 6B), and its average is an intermediate positive value: *C*_{Γ,Θ} and *C*_{Θ,ϒ} (those for the Grad-CAM score vs. the TT-OP and the TT-OP vs. the Voronoi volume, respectively) are negative, with the intensities being slightly smaller (

However, importantly, the difference between glasses and liquids is most evidently quantified by *C*_{Γ,Θ}, which becomes almost completely negative for glasses but positive for liquids. On the other hand, the probability distribution of *C*_{Γ,ϒ} shows a large overlap between glasses and liquids; moreover, in the case of liquids, the distribution is centered around zero, indicating that the Voronoi-volume-like aspect of the Grad-CAM score is likely unable to distinguish glasses and liquids accurately. Therefore, our method seems to rely on structures that are qualitatively consistent with the TT-OP rather than on the Voronoi volume when a decision is made, although ϒ is closer to Γ than Θ in terms of the correlation for glass samples, as mentioned above

#### 4.5.2 KAM

In the case of the KAM system, *C*_{Γ,ϒ} and *C*_{Γ,Θ} for glasses are negative and positive, respectively, at odds with the results for the ABM (Figure 6A). Such a qualitative difference indicates that the structures extracted by our method respect the details of the systems. It should also be noted that, from the perspective of the intensity of the correlation coefficients, *C*_{Γ,ϒ} is significantly larger than *C*_{Γ,Θ}, and only *C*_{Γ,ϒ} exhibits a clear difference in the signs between the results for glasses and liquids. This is another qualitative difference from the ABM system, where the difference in sign is evident for *C*_{Γ,Θ}. However, although the TT-OP is a good descriptor for the ABM, as presented above, it is unlikely to characterize the properties of KAM systems. Thus, our method regards structures with high *C*_{Γ,ϒ} as characteristic while *C*_{Γ,Θ} is small. In particular, the intensity of *C*_{Γ,Θ} is lower than that of *C*_{Θ,ϒ} (

#### 4.5.3 Summary of this subsection

Interestingly, although we could interpret the Grad-CAM scores in terms of other conventional indicators (the Voronoi volume and the TT-OP) to some extent in both the KAM and ABM, the correlations are not perfect, and Grad-CAM seems to blend different indicators in an “appropriate” manner. In particular, we emphasize that the precise recipe of such blending is obviously dependent on the system details. Therefore, it would be meaningful to regress the obtained Grad-CAM score field Γ symbolically to achieve a fuller interpretation using recently invented methods [26, 54, 55].

### 4.6 Dynamics vs. other indicators

In Section 4.3, we studied the predictability of the Grad-CAM score Γ with respect to the dynamics Δ by measuring the correlation coefficient between them. In this subsection, we further investigate the correlation between dynamics and other indicators, namely, the Voronoi volume ϒ and the TT-OP Θ. Figure 7 presents the correlation coefficient between the dynamic propensity and the structural indicators. Note that, in this subsection, when calculating the correlation *C*_{α,Δ}, we sometimes use 1 − *α* instead of *α* to obtain a positive value (the choices obey those in Figures 1, 2). We explain the results for ϒ and Θ one by one below.

**FIGURE 7**. The evolution of the correlation between structural indicators (Grad-CAM score Γ, Voronoi volume ϒ, and TT-OP Θ) and dynamic propensity Δ (or 1 − Δ depending on the target indicator and system) as a function of the mean intensity of the cage-relative displacement *δ*. Results for **(A)** KAM and **(B)** ABM. The meaning of the abscissa is the same as the one in Figure 5. Different markers distinguish different indicators, as shown in the legend. Data for all temperatures are plotted without distinction. The vertical dotted lines indicate the delta values visualized in Figures 1, 2.

#### 4.6.1 Voronoi volume

The time evolution of the correlation coefficient between the Voronoi volume ϒ and the dynamic propensity Δ, *C*_{ϒ,Δ}, is quite similar to that of *C*_{Γ,Δ}: it changes non-monotonically (reaches the maximum at *δ*∗ > 1.0 and then starts decreasing) in the KAM system, and grows monotonically as a function of *δ* and saturates at *δ*^{∗} > 1.0 in the ABM system. The maximum correlation *δ*^{∗} at which *α* ∈ {Γ, ϒ, Θ} distinguishes the indicator of interest. The maximum

#### 4.6.2 Tong-Tanaka order parameter

Interestingly, the correlation between the TT-OP and the dynamic propensity, *C*_{Θ,Δ}, reaches high values: 0.885 and 0.790, respectively, in the KAM and ABM systems (1 − Θ and Θ were employed). It is unexpected that the TT-OP provides a good predictability of the dynamics even in the KAM (the correlation is higher in the KAM than in the ABM). This good predictability is achieved maybe because we employed the additive convention of the reference angle *α* relaxation regime (*δ* ≈ 0.3). Further comprehensive investigations are necessary to identify the cause of the unexpectedly high predictability.

In the KAM system, the qualitative trend is the same as those for correlations of other indicators (*C*_{Γ,Δ} and *C*_{ϒ,Δ}): it starts from a small value and follows an upward convex curve. In the ABM system, in contrast, the time evolution of *C*_{Θ,Δ} is qualitatively different from those of *C*_{Γ,Δ} or *C*_{ϒ,Δ}. It is high even at the early-stage small *δ* regime and changes in a non-monotonic manner with the increase in *δ*: it increases only a little bit, reaches the maximum value at *δ** ≈ 0.148, remains almost at the same level, and then starts decreasing.

#### 4.6.3 Summary of this subsection

To summarize, first, the Voronoi volume has the largest correlation with the dynamics in both KAM and ABM systems, in terms of the

The results presented in this article indicate that the characteristic structures extracted by the Grad-CAM capture information consistent with that of other coarse-grained structural indicators proposed in previous works [27, 30] in the sense that all structures are correlated with the dynamic propensity to some extent. However, we do observe clear differences between the correlation coefficient for the TT-OP and the other two indicators (the Voronoi volume and the Grad-CAM score), particularly in the ABM: although *C*_{Γ,Δ} and *C*_{ϒ,Δ} reach their maximum values at *δ* > 1.0, which corresponds to the longer time scales than *α* relaxation time, only *C*_{Θ,Δ} exhibits clearly smaller values of *δ** (Table 5). This may suggest that the structures specified by the TT-OP and those identified by the other indicators signal qualitatively different aspects of heterogeneous dynamics. Indeed, while the TT-OP focuses on angular information, the other two indicators take into account the whole structural information.

### 4.7 Do the coarse-graining lengths have a structural origin?

Thus far, we have shown that, coarse-grained with proper choices of *ξ*_{α}, structural indicators show strong correlations with the dynamics Δ. In this subsection, we discuss whether we can find the structural origin of these “proper” coarse-graining lengths. To this end, we measured the same spatial correlation functions as the one in Eq. 8 for structural indicators. The results are presented in Figure 8. We can first tell, from this figure, that ϒ and Θ decay very fast to *C*_{α} = 0 (particularly in the case of KAM). In contrast, Γ decays relatively slowly in both KAM and ABM systems: *C*_{Γ} reaches zero at *r*_{Γ} ≈ 3.0 − 4.0. This indicator-dependence of the correlation length is evident in the visualizations of bare fields shown in Supplementary Figures S6, S7. Results in Figure 8 suggest that relying on Γ, and we can extract relatively long purely structurally-based correlation length. Note that, here, we define the correlation length as the one where *C*_{Γ} becomes zero, not 1/*e*. We employed this choice because, unlike the case of Δ for which we introduced the coarse-graining to smooth out the intra-domain noises, we are interested in the average size of the domains as discussed below.

**FIGURE 8**. The spatial correlation functions of structural indicators *α*(*α* ∈ {Γ, ϒ, Θ}), *C*_{α}, as a function of the distance from the reference particle *r*. Results for **(A)** KAM, **(B)** ABM systems. Different line styles distinguish different indicators as shown in the legend. In the insets of both panels, the magnified images of *C*_{Γ} in the vicinity of *C*_{Γ} = 0 with error bars. In these insets, the vertical axes are in the logarithmic scale. Although the values of *C*_{Γ} are negative when markers are missing, the error bars are crossing *C*_{Γ} = 0.

In both systems (KAM and ABM), the length scale *r*_{Γ} defined here as *C*_{Γ}(*r*_{Γ}) = 0 is roughly half of the optimal coarse-graining length *ξ*_{α} used in the aforementioned analyses (*ξ*_{α} = 8.0 for KAM and *ξ*_{α} = 5.0 for ABM). Because *r*_{Γ} is expected to correspond to the average domain size, the length scale *ξ*_{α} ≈ 2*r*_{Γ} corresponds to the average distance between domains with the same sign of *via* elastic interactions.

To further give a concrete interpretation of the coarse-graining length on a purely structural basis, we must understand the interactions between domains *via* the elastic field. To provide useful data for such an exploration of new understandings, it is important to perform a comprehensive measurement of Γ for equilibrated well-annealed low-temperature samples. We leave this problem as a future work.

### 4.8 Relation to recent works

In the closing remarks for this section, we discuss the relation of our work to several recent works using machine learning-based methods. Recently, much effort has been dedicated to challenges in explaining the heterogeneous slow dynamics of glasses from a purely structural perspective. For instance, in Refs. [28, 29], supervised learning of graph neural networks was performed with information on dynamics as part of the training data. The trained networks succeeded in predicting the long-time glassy dynamics of the KAM system at very low temperatures (the lowest value *T* = 0.44 is comparable to the mode coupling transition point *T*_{MCT} = 0.435) solely from static structures with high precision (the correlation coefficient exceeds 0.6). As examples of unsupervised approaches, Refs. [27, 30] similarly tried to extract characteristic structures of glasses from static configurations. In these studies [27, 30], information from dynamics was not used, even in the training stage, similarly to our method. The major difference to our method was that only glass configurations were provided during the training stage [27, 30]; the liquid samples were absent. Surprisingly, the obtained structures were well-correlated with the long-time dynamics, particularly the dynamic propensity at approximately the *α* relaxation time. For the KAM system, the correlation coefficient reaches around 0.4 and 0.7 in Refs. [27, 30] respectively.

Because our method similarly does not require any information about the dynamics, we can say that it is also unsupervised in regard to dynamics prediction. Accordingly, it is non-trivial and interesting that the structures extracted using our method exhibit a strong correlation with the long-time heterogeneous dynamics at a longer time than the *α* relaxation time, as was the case for methods in Refs. [27, 30]. This implies that the characteristic structures governing the relaxational dynamics are extracted in a similar way, whether we try to identify the structural difference between glasses and liquids (our approach) or specify structurally distinct parts from the glass sample only (approaches in Refs. [27, 30]). This may support the view that the characteristic glass structures, if exist, grow gradually from completely random liquid configurations as the temperature decreases. It would be very meaningful to investigate the similarity of structures extracted by different machine learning methods.

We also note the quantitative difference in the predictability of different machine-learning methods with respect to the dynamics. Although we cannot make direct quantitative comparisons because of the varied setups in the references, our method provides the highest-level performance in terms of the simple correlation coefficient between extracted characteristic structures and long-time dynamic propensity: for our 2D ABM and KAM system, the correlation between the Grad-CAM score and dynamics reached approximately 70% and 90%, respectively.

## 5 Summary and overview

In this work, we proposed a method to extract the characteristic structures of amorphous systems solely from a couple of classes of static random configurations by means of classification with a CNN and quantification of the grounds for classification using Grad-CAM. We applied the proposed method to two qualitatively different binary glass-forming mixtures, *viz.* The ABM and KAM, and showed that our method could automatically extract the system-detail-dependent mesoscopic characteristic structures of glasses. The proposed method has three outstanding features. First, our method can extract characteristic structures solely from the instantaneous static structures without any information about the dynamics. Second, the extracted characteristic structures are system-dependent; in other words, our method automatically identifies the tailor-made structural indicator suitable for each distinct system. Finally, the extracted structures are strongly correlated with the dynamic propensity. The time evolution of the correlation reveals that the characteristic structure is closely related to the dynamics at a longer time scale than that for the *α* relaxation, where the mean intensity of the cage-relative displacement is of the order of unity: *δ* ⪆ 1. Moreover, such a correlation is robust over a wide range of temperatures, at least in the range *T** ≤ *T* ≤ 1.5*T**. We again stress that, unlike in the previously reported studies, our dynamic propensity quantifies the non-equilibrium aging processes, not the intra-metabasin equilibrium relaxational dynamics, and is coarse-grained.

In addition, we discuss several future research directions. First, we should conduct a similar analysis using well-annealed glass configurations because the equilibrium dynamics are important to understand the properties of glasses more deeply. In particular, it is challenging to determine whether the characteristic structures that our method extracts for equilibrium glass configurations are correlated with the equilibrium dynamics. Moreover, our method can find the characteristic structures from the static configurations alone, even when the microscopic physical quantity that characterizes the two classes (e.g., the dynamic propensity) is not available, as long as the two different classes are defined, for example, by specifying macroscopic quantities such as the temperature. Therefore, it allows us to directly ask, for example, whether we observe any structural differences between normal and ultrastable glasses [59, 60], for which the dynamical properties are numerically intractable. Because Ref. [61] reported that the stability of glass samples is structurally reflected by the density of the quasi-localized vibrational (QLV) modes, it would be interesting to see if Grad-CAM also quantifies the QLV modes or highlights completely different structures. Similarly, it has been shown that the structural difference between instantaneous configurations under different external shear speeds is quantified by the density of the imaginary normal modes [62]. Our method is also applicable in these situations.

In general, we expect that the extracted structures are dependent on the precise setup of the classification problem, such as the temperature choice for each class. It would be interesting to compare the characteristic glass structures obtained from different reference high temperatures. Further, such structures could be sensitive to the details of the protocols, for example, the network architecture or the number of epochs. The investigation of the effects of these factors would be valuable.

## Data availability statement

The datasets generated during the current study are available from the corresponding authors on reasonable request.

## Author contributions

NO and SK performed the numerical simulations and analyzed the data. NO and TK designed the study. All authors wrote this paper.

## Funding

This work was financially supported by the JST FOREST Program (Grant No. JPMJFR212T) and JSPS KAKENHI (Grant Nos. 22H04472, 20H05157, 20H00128, 19K03767, and 18H01188).

## Acknowledgments

The authors thank Satoshi Koide, Takenobu Nakamura, Hayato Shiba, and Ludovic Berthier for fruitful discussions and comments.

## Conflict of interest

Authors NO and SK were employed by company Toyota Central R&D Labs, Inc., Tokyo, Japan.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

## Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphy.2022.1007861/full#supplementary-material

## Footnotes

^{1}We employed the cage-relative displacements to exclude the undesired effects due to anomalous fluctuations that are specific to two-dimensional systems [49–52].

^{2}Importantly, we discarded a test sample for which the trained CNN gave the wrong classification (only 1 out of 2,400 samples: in the case of liquid in the KAM) to rule out the possible influence from such an abnormal sample, e.g., when evaluating the probability distribution function shown later in Figure 6. However, we would like to mention that the investigation of such samples is still important since they can reside in the vicinity of the “boundary” between glass and liquid classes and thus provide meaningful information about their structural differences. This investigation should be performed as future work.

## References

1. Angell CA. Formation of glasses from liquids and biopolymers. *Science* (1995) 267:1924–35. doi:10.1126/science.267.5206.1924

2. Debenedetti PG, Stillinger FH. Supercooled liquids and the glass transition. *Nature* (2001) 410:259–67. doi:10.1038/35065704

3. Dyre JC. Colloquium : The glass transition and elastic models of glass-forming liquids. *Rev Mod Phys* (2006) 78:953–72. doi:10.1103/revmodphys.78.953

4. Cavagna A. Supercooled liquids for pedestrians. *Phys Rep* (2009) 476:51–124. doi:10.1016/j.physrep.2009.03.003

5. Berthier L, Biroli G. Theoretical perspective on the glass transition and amorphous materials. *Rev Mod Phys* (2011) 83:587.

6. Kob W, Donati C, Plimpton SJ, Poole PH, Glotzer SC. Dynamical heterogeneities in a supercooled Lennard-Jones liquid. *Phys Rev Lett* (1997) 79:2827–30. doi:10.1103/physrevlett.79.2827

7. Yamamoto R, Onuki A. Dynamics of highly supercooled liquids: Heterogeneity, rheology, and diffusion. *Phys Rev E* (1998) 58:3515–29. doi:10.1103/physreve.58.3515

8. Yamamoto R, Onuki A. Heterogeneous diffusion in highly supercooled liquids. *Phys Rev Lett* (1998) 81:4915–8. doi:10.1103/physrevlett.81.4915

9. Lačević N, Starr FW, Schrøder TB, Glotzer SC. Spatially heterogeneous dynamics investigated via a time-dependent four-point density correlation function. *J Chem Phys* (2003) 119:7372–87. doi:10.1063/1.1605094

10. Dauchot O, Marty G, Biroli G. Dynamical heterogeneity close to the jamming transition in a sheared granular material. *Phys Rev Lett* (2005) 95:265701. doi:10.1103/physrevlett.95.265701

11. Karmakar S, Dasgupta C, Sastry S. Growing length and time scales in glass-forming liquids. *Proc Natl Acad Sci U S A* (2009) 106:3675–9. doi:10.1073/pnas.0811082106

12. Keys AS, Hedges LO, Garrahan JP, Glotzer SC, Chandler D. Excitations are localized and relaxation is hierarchical in glass-forming liquids. *Phys Rev X* (2011) 1:021013. doi:10.1103/physrevx.1.021013

13. Chandler D, Garrahan JP. Dynamics on the way to forming glass: Bubbles in space-time. *Annu Rev Phys Chem* (2010) 61:191–217. doi:10.1146/annurev.physchem.040808.090405

14. Kirkpatrick TR, Thirumalai D, Wolynes PG. Scaling concepts for the dynamics of viscous liquids near an ideal glassy state. *Phys Rev A* (1989) 40:1045–54. doi:10.1103/physreva.40.1045

15. Hirata A, Guan P, Fujita T, Hirotsu Y, Inoue A, Yavari AR, et al. Direct observation of local atomic order in a metallic glass. *Nat Mater* (2011) 10:28–33. doi:10.1038/nmat2897

16. Hirata A, Kang LJ, Fujita T, Klumov B, Matsue K, Kotani M, et al. Geometric frustration of icosahedron in metallic glasses. *Science* (2013) 341:376–9. doi:10.1126/science.1232450

17. Kawasaki T, Araki T, Tanaka H. Correlation between dynamic heterogeneity and medium-range order in two-dimensional glass-forming liquids. *Phys Rev Lett* (2007) 99:215701. doi:10.1103/physrevlett.99.215701

18. Tanaka H, Kawasaki T, Shintani H, Watanabe K. Critical-like behaviour of glass-forming liquids. *Nat Mater* (2010) 9:324–31. doi:10.1038/nmat2634

19. Kawasaki T, Tanaka H. Structural signature of slow dynamics and dynamic heterogeneity in two-dimensional colloidal liquids: glassy structural order. *J Phys : Condens Matter* (2011) 23:194121. doi:10.1088/0953-8984/23/19/194121

20. Leocmach M, Tanaka H. Roles of icosahedral and crystal-like order in the hard spheres glass transition. *Nat Commun* (2012) 3:974. doi:10.1038/ncomms1974

21. Tong H, Tanaka H. Revealing hidden structural order controlling both fast and slow glassy dynamics in supercooled liquids. *Phys Rev X* (2018) 8:011041. doi:10.1103/physrevx.8.011041

22. Tong H, Tanaka H. Structural order as a genuine control parameter of dynamics in simple glass formers. *Nat Commun* (2019) 10:5596. doi:10.1038/s41467-019-13606-3

23. Kob W, Andersen HC. Testing mode-coupling theory for a supercooled binary Lennard-Jones mixture. II. Intermediate scattering function and dynamic susceptibility. *Phys Rev E* (1995) 52:4134–53. doi:10.1103/physreve.52.4134

24. Berthier L. Self-Induced heterogeneity in deeply supercooled liquids. *Phys Rev Lett* (2021) 127:088002. doi:10.1103/physrevlett.127.088002

25. Guiselin B, Tarjus G, Berthier L. Static self-induced heterogeneity in glass-forming liquids: Overlap as a microscope. *J Chem Phys* (2022) 156:194503. doi:10.1063/5.0086517

26. Schoenholz SS, Cubuk ED, Kaxiras E, Liu AJ. Relationship between local structure and relaxation in out-of-equilibrium glassy systems. *Proc Natl Acad Sci U S A* (2017) 114:263–7. doi:10.1073/pnas.1610204114

27. Boattini E, Marín-Aguilar S, Mitra S, Foffi G, Smallenburg F, Filion L. Autonomously revealing hidden local structures in supercooled liquids. *Nat Commun* (2020) 11:5479. doi:10.1038/s41467-020-19286-8

28. Bapst V, Keck T, Grabska-Barwińska A, Donner C, Cubuk ED, Schoenholz SS, et al. Unveiling the predictive power of static structure in glassy systems. *Nat Phys* (2020) 16:448–54. doi:10.1038/s41567-020-0842-8

29. Shiba H, Hanai M, Suzumura T, Shimokawabe T. Unraveling intricate processes of glassy dynamics from static structure by machine learning relative motion. *arxiv* [Preprint] (2022). doi:10.48550/arXiv.2206.14024

30. Paret J, Jack RL, Coslovich D. Assessing the structural heterogeneity of supercooled liquids through community inference. *J Chem Phys* (2020) 152:144502. doi:10.1063/5.0004732

31. Ronhovde P, Chakrabarty S, Hu D, Sahu M, Sahu KK, Kelton KF, et al. Detecting hidden spatial and spatio-temporal structures in glasses and complex physical systems by multiresolution network clustering. *Eur Phys J E* (2011) 34:105. doi:10.1140/epje/i2011-11105-9

32. Ronhovde P, Chakrabarty S, Hu D, Sahu M, Sahu KK, Kelton KF, et al. Detection of hidden structures for arbitrary scales in complex physical systems. *Sci Rep* (2012) 2:329. doi:10.1038/srep00329

33. Valueva M, Nagornov N, Lyakhov P, Valuev G, Chervyakov N. Application of the residue number system to reduce hardware costs of the convolutional neural network implementation. *Mathematics Comput Simulation* (2020) 177:232–43. doi:10.1016/j.matcom.2020.04.031

34. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. *Int J Comput Vis* (2020) 128:336–59. doi:10.1007/s11263-019-01228-7

35. Oyama N, Mizuno H, Ikeda A. Unified view of avalanche criticality in sheared glasses. *Phys Rev E* (2021) 104:015002. doi:10.1103/physreve.104.015002

36. Shimada M, Mizuno H, Ikeda A. Anomalous vibrational properties in the continuum limit of glasses. *Phys Rev E* (2018) 97:022609. doi:10.1103/physreve.97.022609

37. Sastry S, Debenedetti PG, Stillinger FH. Signatures of distinct dynamical regimes in the energy landscape of a glass-forming liquid. *Nature* (1998) 393:554–7. doi:10.1038/31189

38. Swanson K, Trivedi S, Lequieu J, Swanson K, Kondor R. Deep learning for automated classification and characterization of amorphous materials. *Soft Matter* (2020) 16:435–46. doi:10.1039/c9sm01903k

39. Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning deep features for discriminative localization. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016 Jun 27–30; Las Vegas, NV. IEEE (2016). p. 2921–9.

40. Cohen MH, Turnbull D. Molecular transport in liquids and glasses. *J Chem Phys* (1959) 31:1164–9. doi:10.1063/1.1730566

41. Widmer-Cooper A, Harrowell P. On the relationship between structure and dynamics in a supercooled liquid. *J Phys : Condens Matter* (2005) 17:S4025–34. doi:10.1088/0953-8984/17/49/001

42. Widmer-Cooper A, Harrowell P. Free volume cannot explain the spatial heterogeneity of Debye–Waller factors in a glass-forming binary alloy. *J Non-Crystalline Sol* (2006) 352:5098–102. doi:10.1016/j.jnoncrysol.2006.01.136

43. Shiba H, Kawasaki T. Spatiotemporal heterogeneity of local free volumes in highly supercooled liquid. *J Chem Phys* (2013) 139:184502. doi:10.1063/1.4829442

44. Ramasubramani V, Dice BD, Harper ES, Spellings MP, Anderson JA, Glotzer SC. freud: A software suite for high throughput analysis of particle simulation data. *Comput Phys Commun* (2020) 254:107275. doi:10.1016/j.cpc.2020.107275

45. Dice B, Ramasubramani V, Harper E, Spellings M, Anderson J, Glotzer S. Analyzing particle systems for machine learning and data visualization with freud. In: Proceedings of the 18th Python in Science Conference (SciPy 2019); 2019 Jul 8–14; Austin, TX (2019). p. 27–33.

46. Widmer-Cooper A, Harrowell P, Fynewever H. How reproducible are dynamic heterogeneities in a supercooled liquid? *Phys Rev Lett* (2004) 93:135701. doi:10.1103/physrevlett.93.135701

47. Widmer-Cooper A, Harrowell P. On the study of collective dynamics in supercooled liquids through the statistics of the isoconfigurational ensemble. *J Chem Phys* (2007) 126:154503. doi:10.1063/1.2719192

48. Berthier L, Jack RL. Structure and dynamics of glass formers: Predictability at large length scales. *Phys Rev E* (2007) 76:041509. doi:10.1103/physreve.76.041509

49. Shiba H, Yamada Y, Kawasaki T, Kim K. Unveiling dimensionality dependence of glassy dynamics: 2D infinite fluctuation eclipses inherent structural relaxation. *Phys Rev Lett* (2016) 117:245701. doi:10.1103/physrevlett.117.245701

50. Illing B, Fritschi S, Kaiser H, Klix CL, Maret G, Keim P. Mermin–Wagner fluctuations in 2D amorphous solids. *Proc Natl Acad Sci U S A* (2017) 114:1856–61. doi:10.1073/pnas.1612964114

51. Shiba H, Keim P, Kawasaki T. Isolating long-wavelength fluctuation from structural relaxation in two-dimensional glass: cage-relative displacement. *J Phys : Condens Matter* (2018) 30:094004. doi:10.1088/1361-648x/aaa8b8

52. Shiba H, Kawasaki T, Kim K. Local density fluctuation governs the divergence of viscosity underlying elastic and hydrodynamic anomalies in a 2D glass-forming liquid. *Phys Rev Lett* (2019) 123:265501. doi:10.1103/physrevlett.123.265501

53. Coslovich D, Ikeda A. Revisiting the single-saddle model for the *β* -relaxation of supercooled liquids. *J Chem Phys* (2022) 156:094503. doi:10.1063/5.0083173

54. Brunton SL, Proctor JL, Kutz JN. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. *Proc Natl Acad Sci U S A* (2016) 113:3932–7. doi:10.1073/pnas.1517384113

55. Rudy SH, Brunton SL, Proctor JL, Kutz JN. Data-driven discovery of partial differential equations. *Sci Adv* (2017) 3:e1602614. doi:10.1126/sciadv.1602614

56. Donati C, Glotzer SC, Poole PH, Kob W, Plimpton SJ. Spatial correlations of mobility and immobility in a glass-forming Lennard-Jones liquid. *Phys Rev E* (1999) 60:3107–19. doi:10.1103/physreve.60.3107

57. Matharoo GS, Razul MSG, Poole PH. Structural and dynamical heterogeneity in a glass-forming liquid. *Phys Rev E* (2006) 74:050502. doi:10.1103/physreve.74.050502

58. Ozawa M, Biroli G. Elasticity, facilitation and dynamic heterogeneity in glass-forming liquids. *arxiv* [Preprint] (2022). doi:10.48550/arXiv.2209.08861

59. Berthier L, Charbonneau P, Flenner E, Zamponi F. Origin of ultrastability in vapor-deposited glasses. *Phys Rev Lett* (2017) 119:188002. doi:10.1103/physrevlett.119.188002

60. Khomenko D, Scalliet C, Berthier L, Reichman DR, Zamponi F. Depletion of two-level systems in ultrastable computer-generated glasses. *Phys Rev Lett* (2020) 124:225901. doi:10.1103/physrevlett.124.225901

61. Wang L, Ninarello A, Guan P, Berthier L, Szamel G, Flenner E. Low-frequency vibrational modes of stable glasses. *Nat Commun* (2019) 10:26. doi:10.1038/s41467-018-07978-1

Keywords: deep neural networks, machine learning, molecular dynamics simulations, glass transition, dynamical heterogeneity of liquid state

Citation: Oyama N, Koyama S and Kawasaki T (2023) What do deep neural networks find in disordered structures of glasses?. *Front. Phys.* 10:1007861. doi: 10.3389/fphy.2022.1007861

Received: 31 July 2022; Accepted: 05 December 2022;

Published: 09 January 2023.

Edited by:

Alberto Fernandez-Nieves, Catalan Institution for Research and Advanced Studies (ICREA), SpainReviewed by:

Gerhard Jung, UMR5221 Laboratoire Charles Coulomb (L2C), FranceHai-Bin Yu, Huazhong University of Science and Technology, China

Copyright © 2023 Oyama, Koyama and Kawasaki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Norihiro Oyama, Norihiro.Oyama.vb@mosk.tytlabs.co.jp; Takeshi Kawasaki, kawasaki@r.phys.nagoya-u.ac.jp