Edited by: Joseph Betouras, Loughborough University, United Kingdom
Reviewed by: Pantelis Rafail Vlachas, ETH Zürich, Switzerland; Alexandre M. Zagoskin, Loughborough University, United Kingdom
This article was submitted to Quantum Computing, a section of the journal Frontiers in Physics
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Chimeras and branching are two archetypical complex phenomena that appear in many physical systems; because of their different intrinsic dynamics, they delineate opposite non-trivial limits in the complexity of wave motion and present severe challenges in predicting chaotic and singular behavior in extended physical systems. We report on the long-term forecasting capability of Long Short-Term Memory (LSTM) and reservoir computing (RC) recurrent neural networks, when they are applied to the spatiotemporal evolution of turbulent chimeras in simulated arrays of coupled superconducting quantum interference devices (SQUIDs) or lasers, and branching in the electronic flow of two-dimensional graphene with random potential. We propose a new method in which we assign one LSTM network to each system node except for “observer” nodes which provide continual “ground truth” measurements as input; we refer to this method as “Observer LSTM” (OLSTM). We demonstrate that even a small number of observers greatly improves the data-driven (model-free) long-term forecasting capability of the LSTM networks and provide the framework for a consistent comparison between the RC and LSTM methods. We find that RC requires smaller training datasets than OLSTMs, but the latter require fewer observers. Both methods are benchmarked against Feed-Forward neural networks (FNNs), also trained to make predictions with observers (OFNNs).
Predicting the state of complex, non-linear dynamical systems as a function of time is an important problem of great practical utility. Recent advances of artificial neural networks and machine learning (ML) methods have made possible significant applications in science, industry, and technology [
LSTM networks [
Reservoir computing (RC), first proposed by Jaeger and Haas [
LSTM networks have proven successful in learning and generalizing sequential tasks from isolated sequences, such as handwriting and speech. Inspired by this success, we first considered a model in which a single LSTM network is assigned to each system node, each network being independent of all the others. The prediction error for this approach turned out to be very large. This is not surprising: chimera states are collective phenomena, and one independent LSTM network per node cannot capture the interaction between the nodes, because the sequences of different nodes are not isolated but correlated.
In order to address the independent LSTMs' limited ability to capture dependencies between multiple correlated sequences, we propose and demonstrate a new method, which we call “Observer LSTM (OLSTM),” based on the extension of the notion of “reservoir observers” to the LSTM networks. In the OLSTM method, we assign one LSTM to each (non-observer) system node but also assign “observer” status to certain system nodes (“LSTM observers,” taken at equidistant positions for simplicity) which provide continual “ground truth” measurements as input to the prediction method. We demonstrate that their presence even in small numbers (of order <10% of the total number of nodes) greatly improves the long-term data-driven forecasting capability of the LSTM networks and provides the framework for a consistent comparison between the RC and LSTM methods.
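The equidistant placement of observers described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `equidistant_observers` and the array size of 256 nodes are assumptions made for the example.

```python
def equidistant_observers(n_nodes, n_observers):
    """Return n_observers node indices spread evenly over n_nodes nodes."""
    if not 0 < n_observers <= n_nodes:
        raise ValueError("need 0 < n_observers <= n_nodes")
    # Integer arithmetic keeps the spacing as even as the node count allows.
    return [(k * n_nodes) // n_observers for k in range(n_observers)]


# Hypothetical array of 256 nodes with 16 observers (about 6% of the nodes).
observers = equidistant_observers(256, 16)
# Every remaining node would get its own LSTM network in the OLSTM scheme.
predicted_nodes = [n for n in range(256) if n not in set(observers)]
```

The non-observer list is what a per-node training loop would iterate over; the observer list marks the nodes whose measured values are fed in as input.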
Time-series data are used to train each network while no knowledge of the underlying system equations is required. Each individual LSTM network is trained by taking as input a number of past values (denoted as
We study the networks' long-term forecasting capability as a function of the number of observers and of the training-set size, and we compare the OLSTM performance with that of “reservoir observers” trained by RC, which uses a single (“global”) network for the entire system. We benchmark both methods against a standard Feed-Forward neural network (FNN) method with the same number of observers (OFNN). We compare the networks' performance quantitatively by calculating the normalized root mean square error (RMSE) at each time step, for all system nodes, over the predicted time steps, as in [
We structure our study of predicting complex dynamics in extended physical systems with ML approaches in two parts: the first part concerns chimeras and the second concerns branching in flows. Together, these two cases capture extremes of complex spatiotemporal behavior that severely challenge any method aspiring to predict long-term behavior. Chimera states challenge the ML methods to predict partial, self-organized coherence, while the stochastic onset of branching challenges them to predict stochastic yet singular events.
Chimera states are collective, self-organized patterns of coexisting coherence and incoherence in coupled oscillator systems. Following the first discovery of chimeras in symmetrically coupled, identical Kuramoto oscillators in 2002 [
Chimeras can be stationary or turbulent. Turbulent chimeras have been observed experimentally [
SQUID metamaterials constitute a subclass of superconducting artificial media whose function relies both on the geometry and the extraordinary properties of superconductivity and the Josephson effect [
We investigate the long-term prediction capability of the ML methods under study on turbulent single-headed and double-headed chimeras (“head” stands for incoherent cluster) observed numerically in an array of N identical
where the indices
is the current in the n-th SQUID given by the resistively and capacitively shunted junction (RCSJ) model [
We apply RC, OLSTM, and OFNN methods for long-term prediction of the dynamics of these single-headed and double-headed chimeras. Prediction snapshots for the single-headed chimera are presented in
Spatiotemporal plots of single- and double-headed chimeras and predicted time series and fluxes.
In our implementation of OLSTMs we found that a large value of the number of past steps
In laser systems, chimeras were first reported both theoretically and experimentally in a virtual space-time representation of a single laser system subject to long delayed feedback [
where all indices have to be taken modulo
Shena et al. [
In the synchronization regime the local curvature is close to zero while in the asynchronous regime it is finite and fluctuating.
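As a rough illustration of this diagnostic, the sketch below computes a local curvature for a single snapshot, assuming it is taken as the magnitude of the discrete Laplacian of the field along the array with periodic boundaries (indices modulo N). The function name `local_curvature` and the synthetic snapshot are assumptions for the example; they are not taken from the original text.

```python
import numpy as np


def local_curvature(field):
    """Magnitude of the discrete Laplacian along a ring of oscillators."""
    field = np.asarray(field, dtype=float)
    left = np.roll(field, 1, axis=-1)    # value at node i-1 (periodic)
    right = np.roll(field, -1, axis=-1)  # value at node i+1 (periodic)
    return np.abs(left - 2.0 * field + right)


# Synthetic snapshot: a coherent half (constant) and an incoherent half (random).
rng = np.random.default_rng(0)
n_nodes = 64
snapshot = np.ones(n_nodes)
snapshot[n_nodes // 2:] = rng.uniform(0.0, 2.0, n_nodes // 2)

curvature = local_curvature(snapshot)
```

Inside the coherent half the curvature vanishes, while in the incoherent half it takes finite, irregular values, which is the distinction the text describes.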
Since our objective is to obtain “wild” turbulent chimeras in order to test the long-term predictions of the ML methods, we have chosen the turbulent chimera of Shena et al. [
Spatiotemporal plot of a turbulent chimera state [as is generated in the 1-dimensional semiconductor Class B laser array of Shena et al. [
In
Normalized RMSE (<R>) of RC, OLSTM, and the OFNN methods calculated for each time step over all system nodes at each predicted time step (top panel) and over the predicted time steps
Wave focusing due to refractive index variation is a common occurrence in many physical systems, as, for example, in optical media where the index of refraction changes in a statistical way due to small imperfections or distributions of defects in the medium through which the wave propagates. Random spatial variability of the index leads to local focusing and defocusing of the waves and the formation of caustics (or wave “branches”) with substantially increased local wave intensity [
It is a formidable challenge to predict singular events like branching in wave propagation or electron flow because of the stochastic nature of the onset of such events. An important question is whether or not ML methods can dissect the stochastic nature of branching and “learn” the interactions that take place among trajectories, thereby providing an accurate detection mechanism for the caustics that mark the onset of branching. We attempt to resolve this issue with results from the RC, OLSTM, and OFNN methods, on singular branched flows in graphene with random potentials. Our results demonstrate that the ML methods we considered can adequately capture the stochastic temporal dependencies of the time series in this prototypical complex dynamical system.
Caustic event prediction in a 2D electron flow is facilitated if we consider one of the spatial dimensions (we refer to it as the “longitudinal” x-direction) as the “time-coordinate,” and therefore, map the stationary phenomenon of caustic formation onto a 1D spatio-temporal dynamical problem. In the framework of this approach, we model the motion of electrons as individual rays whose density matrix is transformed onto a vector of time series with dimension
Branching in electronic flows in graphene with random potentials (intensity values are color-coded); the graphene sheet has size 176 nm (vertical) × 84 nm (horizontal). The insets show snapshots of the flow intensity as found in the simulation and as predicted by the ML methods. In predicting the time evolution of the flows, 10 “observers” have been placed at the positions marked by the tips of the arrows, monitoring the entire “time” (vertical) axis. The thick horizontal white line marks the end of the RC training, while the dotted white line marks the end of the OLSTM (and OFNN) training. Inset outlined in pink: the actual and predicted time series for the entire system at “time coordinate” point xn = 100 nm, corresponding to 501 time steps. The pink-shaded curve represents the actual (ground-truth) data, the red line is the OLSTM prediction, the green line is the RC prediction, and the blue line is the OFNN prediction. Inset outlined in blue: same as for the other inset, but at “time coordinate” point xn = 160 nm, corresponding to 801 time steps, and with the blue-shaded curve depicting the actual (ground-truth) data.
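The correspondence between longitudinal position and time step quoted in the caption can be checked with a short calculation. The sketch assumes a uniform grid that starts at x = 0 with a spacing of 0.2 nm; this spacing is inferred from the two caption values (100 nm at step 501, 160 nm at step 801) and is not stated explicitly in the text.

```python
DX_NM = 0.2  # assumed grid spacing in nm, inferred from the caption values


def time_step_index(x_nm, dx_nm=DX_NM):
    """1-based time-step index of longitudinal position x_nm on a uniform grid."""
    return round(x_nm / dx_nm) + 1


step_pink_inset = time_step_index(100.0)  # position of the pink-outlined inset
step_blue_inset = time_step_index(160.0)  # position of the blue-outlined inset
```

Under this assumption each longitudinal grid line plays the role of one time step, and the transverse intensity profile at that line is the system state at that step.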
The issue of predicting complex spatiotemporal behavior using ML approaches is one of central importance for their potential applications in the physical sciences and beyond. Here, we attempted to address this issue by considering two distinct prototypical phenomena, viz. partially coherent chimera states and the stochastic onset of branching in 2D wave flows, as these are realized in coupled arrays of SQUIDs or lasers, and in the flow of electrons in graphene with random potentials, respectively. We find that ML approaches like LSTM and RC recurrent neural networks can perform well in predicting complex dynamics in extended physical systems when they involve “observers” that monitor the system evolution throughout its time dimension. The presence of observers is an inherent requirement of the second approach (RC), but not of the first (LSTM). Accordingly, we proposed a new method, which we call “Observer LSTM (OLSTM),” to address the limitations of single, independent LSTM networks in capturing dependencies between multiple correlated sequences. We have also considered an observer-enhanced Feed-Forward network (OFNN) and tested the long-term prediction performance of the three approaches, OLSTM, OFNN, and reservoir observers trained by RC, on the two difficult problems of turbulent chimeras and 2D branching flows. Our results quantify how the prediction error (root mean squared error, RMSE, of the predicted values) varies as a function of the number of observers and of the size of the training datasets.
We conclude that “observers” comprise
The RC network architecture used in predicting turbulent chimeras comprises 1,000 reservoir nodes, with spectral radius ρ = 1.0, average degree D = 80, input-weight scale σ = 1.5, bias constant ξ = 0.0, leakage rate α = 0.9, ridge regression parameter β = 0.5, and time interval Δt = 0.01. The RC network applies to the system as a whole (single network architecture). For branching, the RC network comprises 3,000 reservoir nodes, with spectral radius ρ = 0.9, average degree D = 50, input-weight scale σ = 1, bias constant ξ = −0.4, leakage rate α = 0.5, ridge regression parameter β = 0.05, and Δt = 0.05. This network also applies to the system as a whole (single network architecture).
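To make the role of these hyper-parameters concrete, the following is a minimal sketch of a reservoir observer, assuming a standard leaky echo state network with a tanh nonlinearity and a ridge-regression readout. The update rule, the function names, the reduced reservoir size, and the toy travelling-wave data are all assumptions for illustration; the original text lists only the parameter values.

```python
import numpy as np


def make_reservoir(n_res, n_in, rho, degree, sigma, rng):
    """Sparse random reservoir rescaled to spectral radius rho, plus input weights."""
    mask = rng.random((n_res, n_res)) < degree / n_res   # average degree D
    A = rng.uniform(-1.0, 1.0, (n_res, n_res)) * mask
    A *= rho / np.max(np.abs(np.linalg.eigvals(A)))      # spectral radius rho
    W_in = rng.uniform(-sigma, sigma, (n_res, n_in))     # input scale sigma
    return A, W_in


def run_reservoir(A, W_in, inputs, alpha, xi):
    """Drive the reservoir with observer measurements; return all states."""
    r = np.zeros(A.shape[0])
    states = np.empty((len(inputs), A.shape[0]))
    for t, u in enumerate(inputs):
        # Leaky update with leakage rate alpha and bias constant xi.
        r = (1.0 - alpha) * r + alpha * np.tanh(A @ r + W_in @ u + xi)
        states[t] = r
    return states


def train_readout(states, targets, beta):
    """Ridge regression with parameter beta: states @ W_out approximates targets."""
    gram = states.T @ states + beta * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


# Toy data (assumption): a travelling wave on 12 nodes, 3 of them observers.
rng = np.random.default_rng(0)
steps = np.arange(600)[:, None]
data = np.sin(0.05 * steps + 2.0 * np.pi * np.arange(12) / 12)
observers = [0, 4, 8]
hidden = [n for n in range(12) if n not in observers]

# Chimera-case parameter values from the text, with a smaller reservoir for speed.
A, W_in = make_reservoir(200, len(observers), rho=1.0, degree=20, sigma=1.5, rng=rng)
states = run_reservoir(A, W_in, data[:, observers], alpha=0.9, xi=0.0)
W_out = train_readout(states[50:400], data[50:400][:, hidden], beta=0.5)
inferred = states[400:] @ W_out  # estimates of the non-observer nodes
```

The first 50 states are discarded as a transient before fitting the readout; that cut-off is an arbitrary choice for this toy example.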
The OLSTM network architecture used in this study consists of 400 LSTM cells with ReLU activation functions (one hidden layer). A single LSTM network is applied to each (non-observer) system node (SQUID, laser oscillator, or electron flow). Similarly, a single fully connected (dense) Feed-Forward network (OFNN), with ReLU activation functions, is applied to each (non-observer) system node (SQUID, laser oscillator, or electron flow). Optimization for LSTMs during training is performed using the Adam stochastic optimization method [
In case of the SQUIDs system (
The time-series data used (chimeras and electronic flows in graphene) were preprocessed as in [
Each trained OLSTM network is then used to forecast the node's state at the next time step in an iterative fashion (also receiving input from the observers). All hyper-parameters of the OLSTM and OFNN models, including the number of training epochs, the number of training batches, the number of neurons, and the number of hidden layers, are optimized to yield the smallest RMSE. Similarly, for RC we optimized the parameters using the smallest RMSE as the criterion.
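The iterative forecasting loop can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: each per-node model is represented by a generic callable, and its input is assumed to be a window holding the node's own past values followed by the observers' past values. A trained LSTM would take the place of the stand-in persistence model used here.

```python
import numpy as np


def forecast_with_observers(models, history, observer_truth, observers, n_steps):
    """Closed-loop forecast in which observer nodes are reset to measured values.

    models:         {node index: callable(window) -> float} for non-observer nodes
    history:        array (n_past, n_nodes) with the last known states
    observer_truth: array (n_steps, n_observers) measured during the forecast
    """
    window = np.array(history, dtype=float)
    n_nodes = window.shape[1]
    forecast = np.empty((n_steps, n_nodes))
    for t in range(n_steps):
        nxt = np.empty(n_nodes)
        for node, model in models.items():
            # Assumed input layout: own past values, then the observers' values.
            features = window[:, [node] + list(observers)]
            nxt[node] = model(features)
        nxt[observers] = observer_truth[t]        # continual ground-truth input
        forecast[t] = nxt
        window = np.vstack([window[1:], nxt])     # predictions are fed back in
    return forecast


# Stand-in model (assumption): repeat the node's most recent value.
def persistence(window):
    return window[-1, 0]


history = np.arange(12, dtype=float).reshape(3, 4)   # 3 past steps, 4 nodes
observers = [0, 2]
models = {1: persistence, 3: persistence}
observer_truth = np.full((5, 2), 7.0)
forecast = forecast_with_observers(models, history, observer_truth, observers, 5)
```

Only the non-observer nodes accumulate forecast error in this loop, since the observer columns are overwritten with measurements at every step.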
As a comparison measure for the networks' performance, we use the (normalized) root mean square error calculated at each time step, for all system nodes and over the predicted time steps (unless otherwise specified).
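A minimal sketch of such an error measure is given below. The normalization is an assumption: here the per-step RMSE over all nodes is divided by the standard deviation of the ground-truth data, which is one common convention; the convention actually used in the study may differ.

```python
import numpy as np


def normalized_rmse(predicted, actual):
    """Per-time-step RMSE over all nodes, divided by the std of the true data.

    Both inputs have shape (n_steps, n_nodes); the result has shape (n_steps,).
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    rmse_per_step = np.sqrt(np.mean((predicted - actual) ** 2, axis=1))
    return rmse_per_step / np.std(actual)  # assumed normalization


actual = np.array([[0.0, 2.0], [0.0, 2.0]])
perfect = normalized_rmse(actual, actual)                   # exact forecast
mean_guess = normalized_rmse(np.ones_like(actual), actual)  # predicts the mean
overall = mean_guess.mean()                                 # average over steps
```

With this normalization, a forecast that always returns the data mean scores 1, so values well below 1 indicate that the forecast captures real structure.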
The machine learning library
The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.
GN, MM, and GDB designed and performed research, analyzed data, and wrote the manuscript. JH provided data on chimeras, contributed to the manuscript, and offered constructive comments. GPT and EK contributed to the design of research, interpretation of results, and to the writing of the manuscript.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The reviewer AZ declared a past co-authorship with one of the authors GT to the handling editor.
GPT acknowledges useful discussions with Prof. Edward Ott. GN and GPT acknowledge support by the European Commission under project NHQWAVE (MSCA-RISE 691209). JH acknowledges support by the General Secretariat for Research and Technology (GSRT) and the Hellenic Foundation for Research and Innovation (HFRI) (Code: 203). MM and EK acknowledge partial support from EFRI 2-DARE NSF Grant No. 1542807 and from ARO MURI Award No. W911NF14-0247. We used computational resources on the Odyssey cluster of the FAS Research Computing Group at Harvard University.