Cocaine-Induced Changes in Low-Dimensional Attractors of Local Field Potentials in Optogenetic Mice

Optogenetically evoked local field potential (LFP) recorded from the medial prefrontal cortex (mPFC) of mice during basal conditions and following a systemic cocaine administration were analyzed. Blue light stimuli were delivered to mPFC through a fiber optic every 2 s and each trial was repeated 100 times. As in the previous study, we used a surrogate data method to check that nonlinearity was present in the experimental LFPs and only used the last 1.5 s of steady activity to measure the LFPs phase resetting induced by the brief 10 ms light stimulus. We found that the steady dynamics of the mPFC in response to light stimuli could be reconstructed in a three-dimensional phase space with topologically similar “8”-shaped attractors across different animals. Therefore, cocaine did not change the complexity of the recorded nonlinear data compared to the control case. The phase space of the reconstructed attractor is determined by the LFP time series and its temporally shifted versions by a multiple of some lag time. We also compared the change in the attractor shape between cocaine-injected and control using (1) dendrogram clustering and (2) Frechet distance. We found about 20% overlap between control and cocaine trials when classified using dendrogram method, which suggest that it may be possible to describe mathematically both data sets with the same model and slightly different model parameters. We also found that the lag times are about three times shorter for cocaine trials compared to control. As a result, although the phase space trajectories for control and cocaine may look similar, their dynamics is significantly different.


INTRODUCTION
Pyramidal cells together with spiny stellates constitute more than 70% of the excitatory neural population of the cortex (Feldman, 1984;Bannister, 2005). Pyramidal cells form wide interconnected network spanning layers 2-6 (Bannister, 2005). They serve, among other functions, maintaining prefrontal cortex activity during working memory (Sanchez-Vives and McCormick, 2000) and generate the UP states of persistent network activity (Luczak et al., 2007;Compte et al., 2008).
The pyramidal neurons receive inhibitory inputs from GABA interneurons, of which the most prominent group expresses the calcium-binding protein parvalbumin (PV) (Takahata and Kato, 2008;Casanova and Trippe, 2009;Kana et al., 2011). It is thought that the PV-positive (PV+) neurons projections coordinate the output of local minicolumns (Galarreta and Hestrin, 2001;Sultan et al., 2013). GABAergic interneurons present significant density and morphological heterogeneities across different cortical areas (Rodriguez et al., 2002) in order to regulate the firing rate of pyramidal cells and facilitate information processing (Guidotti et al., 2005;Fuchs et al., 2007;Schmidt and Mirnics, 2015). PV+ interneurons also present a significant heterogeneity of projections to pyramidal cells processes that allow them a fine-tuned functional control of pyramidal cells (DeFelipe and Farinas, 1992). It has been established that projections of PV+ neurons to pyramidal cells soma and proximal dendrites are very effective in modulating their firing rate (Halasy et al., 1996;Booker et al., 2013), whereas synapses on the axons of pyramidal cells can block action potentials (Melchitzky and Lewis, 2003;Henry et al., 2004;Micheva et al., 2016).
It is also believed that PV+ neurons are instrumental in maintaining and/or modulating both beta (15-30 Hz) and gamma (25-40 Hz) rhythms of the brain. It has been shown that beta rhythm activity could be linked to autism, as it coordinates activity across fronto-parietal networks (Schnitzler and Gross, 2005;Peter and Wolf, 2010), and schizophrenia (Liddle et al., 2016). Beta rhythm has also been associated with sensory gating (Michael and Zoe, 2006;Hong et al., 2008;Cheng et al., 2016). GABAergic neurons in beta band are thought to guide early stage development of radial columnar circuits of pyramidal and radial interneurons (Rippon et al., 2007;Casanova and Trippe, 2009). In this respect, GABAergic neurons could also be involved in schizophrenia (Schmidt and Mirnics, 2015), although recent studies (O'Connell et al., 1997;Muraki and Tanigaki, 2015) suggested that this disorder results from embryonic developmental abnormalities rather than from neuronal degenerations .
Gamma band activity has been linked to sensory processing of stimulus characteristics (Kaiser and Lutzenberger, 2003) and is determined by the details of the local circuits involving PV+ fast-spiking interneurons (Cardin et al., 2009;Sohal et al., 2009Sohal et al., , 2016. PV+ interneurons are thought to also promote gamma band signal transmission both within (Veit et al., 2017) and between neocortical areas (Cantero and Atienza, 2005), while corresponding abnormalities in this process may contribute to schizophrenia (Lewis et al., 2005;Lewis and Hashimoto, 2007) and autism (Levy, 2007;Orekhova et al., 2007). Since PV+ neurons provide inhibitory modulation, it has been found that a decrease in local inhibition could lead to sensory hypersensitivity and neural hyper-excitability (Gibson et al., 2008;Rotschafer and Razak, 2014;Contractor et al., 2017;Ethridge et al., 2017).
We used optogenetic tools to investigate the response of the local network in the medial prefrontal cortex (mPFC) of mice under brief light stimuli. This study used the same knock-in mouse model together with optogenetics and in vivo electrophysiology described in detail in Dilgen et al. (2013). The goal was to investigate the effects of acute cocaine on mPFC gamma oscillation and their relationship to more permanent cortical changes of long-term use of stimulants. We previously carried out a similar data mining study on the same animals under control conditions (Oprisan et al., 2015). The present study is a continuation of Oprisan et al. (2015) in which the same mice were systemically injected with cocaine. We performed a nonlinear time series analysis of LFPs recorded from PV+ neurons using time reversal asymmetry and false nearest neighbor (FNN) statistics between the original signal and surrogate data (Oprisan et al., 2015) to identify the nonlinearity in the data set.
The local field potential (LFP) is the sum of excitatory and inhibitory dendritic potentials in a small region (approximately 200-400 µm Katzner et al., 2009) around the tip of the electrode (Scherberger et al., 2005;Kajikawa and Schroeder, 2011). As opposed to spike recording through intra/extracellular microelectrodes, which represent neural outputs, the LFPs represent inputs and local processing from synaptic activity (Mitzdorf, 1987). Additionally, LFPs are easier to record, which could be useful for practical implementations of control mechanisms similar to deep brain stimulation or brain-computer interfaces (Hatsopoulos and Donoghue, 2009). It has been shown that cognitive processes could modulate the temporal structure of the LFPs (Pesaran et al., 2002;Mehring et al., 2003). Such a temporal structure could be captured by the lag time distribution of delay-embedding method used here.

Human Search and Animal Research
A detailed description of the procedures can be found in the first paper of this series (Oprisan et al., 2015) and we only briefly summarize them here. All procedures were done in accordance to the National Institute of Health guidelines as approved by the Medical University of South Carolina Institutional Animal Care and Use Committee.

Experimental Protocol
The experimental protocol is the same as in Dilgen et al. (2013) and Oprisan et al. (2015). Briefly, male PV-Cre mice (B6; 129P2 -Pval btm1(Cre)Arbr/J Jackson Laboratory (Bar Harbor, ME, USA) were infected with the viral vector (AAV2/5. EF1a. DIO. hChR2(H134R) -EYFP. WPRE. hGH, Penn Vector Core, University of Pennsylvania) delivered to the mPFC as described in detail in Dilgen et al. (2013). The extracellular signals were amplified using a Grass amplifier (Grass Technologies, West Warwick, RI, USA), digitized at 10 kHz by a 1401plus data acquisition system, visualized using Spike2 software (Cambridge Electronic Design, LTD., Cambridge, UK) and stored on a PC for offline analysis. A HumBug 50/60 Hz Noise Eliminator (Quest Scientific Inc., Canada) filter canceled out the line noise and the signal was band-pass filtered online between 0.1 and 130 kHz to obtain the LFPs. A 473 nm laser (DPSS Laser System, OEM Laser Systems Inc., East Lansing, MI, USA) delivered the light stimulation via a 1401plus digitizer and Spike2 software (Cambridge Electronic Design Ltd., Cambridge, UK).

DATA ANALYSIS
As in the first paper of this series (Oprisan et al., 2015), for each 2 s long LFP recording in response to a brief 10 ms light pulse, the first approximately 0.5 s were discarded to remove the transient response of the neural network and only analyze the last 1.5 s of steady oscillatory activity of the network. The transient response is dominated by the transient phase resetting in response to a brief 10 ms optical stimulation (see Oprisan et al., 2015 for a detailed procedure of measuring network-level phase resetting). While phase resetting at the neural network level can provide invaluable information regarding the ability of stimuli to drive the network with meaningful applications in, for example, epilepsy (Osorio and Frei, 2009;Parastarfeizabadi and Kouzani, 2017) or Parkinson's (Tass, 2000(Tass, , 2003, here we focused on the steady state, unperturbed, activity of the network. The goals of this study were to (1) identify possible low-dimensional stable attractors of neural activity during the steady activity of the network, and (2) compare neural activity under systemic cocaine against the control data published in Oprisan et al. (2015).

Nonlinearity Tests
Without repeating all the mathematical details from Oprisan et al. (2015), we again tested for nonlinearity in cocaine-induced changes in neural activity. The nonlinearity tests are necessary since some algorithms, e.g., for computing the embedding dimension of a time series, give similar results both for linear stochastic processes and for deterministic data (Osborne and Provencale, 1989). To distinguish nonlinearity from purely stochastic time series (Osborne and Provencale, 1989;Small et al., 2001), we used surrogate data (Theiler et al., 1992;Small, 2005) with the null hypothesis that the data is linearly correlated in the temporal domain, but are random otherwise (Cogranne and Retraint, 2013). The surrogate data were generated by randomizing the Fourier transform phases, which is known to preserve the linear correlations within the original data while destroying any nonlinear structure (Theiler et al., 1992;Garcia et al., 2013).
The rate of false rejections of the null hypothesis determines the necessary number of surrogates to be generated (Jung et al., 2003). At least n = 1/l surrogates should be generated to attain a certain level l of significance, e.g., for a significance level of l = 0.05 at least n = 1/l = 20 surrogates are required (Jung et al., 2003;Yuan et al., 2004). In general, a set of values λ i (with i = 1, . . . , n) of the discriminating statistics is computed for surrogates and compared against the value λ 0 for the original time series. Rejecting the null hypothesis can be done using rank ordering, in which case λ 0 must occur either on the first or on the last place in the ordered list of all values of the discriminating statistics to reject the null hypothesis. Alternatively, the null hypothesis could be rejected using the average statistical method, in which case a score γ is derived as follows: where λ = 1 n n i=1 λ i is the mean value of the discriminating statistics over all surrogates. If γ > 1, then the original data and the surrogates are significantly different and the null hypothesis is rejected (Yuan et al., 2004). Finally, the null hypothesis could be rejected using the coefficient of variation statistical method, in which case a score γ is derived as follows: where σ λ is the standard deviation of the discriminating statistics over all surrogates. Assuming a normal distribution for λ i , rejection of the null hypothesis requires γ > 1.96 at a 95% confidence level (Stam et al., 1998;Jung et al., 2003). We used the time reversal asymmetry method (Costa et al., 2005) for surrogate data to compute both the individual trial measure, λ, and the cumulative statistical scores, γ , for both (1) the average statistics (Jung et al., 2003;Yuan et al., 2004) and (2) the coefficient of variation of the distributions of λ's (Theiler et al., 1992;Kugiumtzis, 2002;Jung et al., 2003). As before (Oprisan et al., 2015), for every trial and every animal we generated 100 surrogates and, in addition to the γ scores (Stam et al., 1998;Jung et al., 2003), we used the percentage of false nearest neighbors to check the time series nonlinearity (Birkelund et al., 2004;Kugiumtzis and Tsimpiris, 2010). Although our time series were long enough and recorded with a high sampling rate, we also explored additional null hypothesis testing methods used for testing very short time series (Caillec and Montagner, 2013).
A time series is said to be reversible if its probabilistic properties are invariant with respect to time reversal (Diks et al., 1995). For example, a simple Gaussian random walk is timereversal invariant (Weiss, 1975). Practical implementations of temporal asymmetry measures use, for example, the difference between the probability density functions of the original and time-reversed series, or of their corresponding variances (Zumbach, 2009(Zumbach, , 2012, or temporal-based correlation measures over different temporal windows, or between the past and future data blocks of the same temporal length (Zamparo et al., 2013), or by using Granger causality (Winkler et al., 2016). Time irreversibility is a strong signature of nonlinearity (Schreiber and Schmitz, 2000) and is commonly related to entropy production by the underlying (often unknown) mechanism that generated the time series (see Vladimirov and Petersen, 2010;Roldán, 2014 and references therein). As in the previous study (Oprisan et al., 2015), the null hypothesis assumes that the time series is produced by a linear Gaussian random process (Diks et al., 1995). We used the Tisean software package to compute the time reversal asymmetry statistics both for the original and the surrogate data (Hegger et al., 1999;Schreiber and Schmitz, 2000;Oprisan et al., 2015). Given that (1) our time series did not require a significant computational overhead by generating the surrogates, and (2) we already know the time scale of the processes we are interested in capturing, we did not consider additional nonlinearity tests, such as the horizontal visibility algorithm and the Kullback-Leibler divergence (Lacasa et al., 2012). When using the rank ordered statistics for time reversal asymmetry (see Oprisan et al., 2015 for detailed definitions), the original data had a value of λ 0 = 0.2, and the surrogate data had a significantly different value of λ = 2.5, which rejects the null hypothesis. Additionally, the γ score of the coefficient of variation statistics was above the 1.96 threshold, and therefore we rejected the null hypothesis (see Oprisan et al., 2015 for a detailed mathematical description of the statistical tests). As a result, we concluded that the above statistical tests support the hypothesis of a nonlinear structure in our data.

Phase Resetting of LFP
External perturbations, such as synaptic inputs, light, or mechanical pressure, alter the ongoing rhythm of oscillators by changing both their phases and amplitudes (Oprisan and Canavier, 2002;Oprisan et al., 2004;Oprisan, 2013;Oprisan and Austin, 2017). Brief perturbations applied to intrinsic oscillatory activities lead only to transient changes of the rhythm, which eventually dissipate after a few cycles. Since our focus is on identifying similarities among steady state LFPs, any transient changes in the phases of LFP oscillations should be removed (Oprisan, 2017). As previously described (Oprisan, 2017;Oprisan and Austin, 2017), the phase resetting was estimated by the amount of required circular shift on each LFP trace ( Figure 1A) in order to maximize the coefficient of correlation between any trial and an arbitrary selected "reference" trial ( Figure 1B). Phase resetting correction led to a significant increase in the coefficient of correlation from 0.025 ± 0.035 (red trace in Figure 1C) to 0.414 ± 0.089 (blue trace in Figure 1C). Additionally, the root-mean-square (rms) error, i.e., the Euclidian norm of the difference between each 1.5 s long trial and the arbitrary "reference" trial, decreased from 22.3 ± 5.5 (red line in Figure 1D) to 16.6 ± 3.8 (blue line in Figure 1D).

Dendrograms of Phase Shifted LFPs
A dendrogram is a visual representation of "relationships" among trials. A possible quantitative measure of the "relationship" is through the correlation coefficient (Saraçli et al., 2013), although in this study we used the Euclidian distance as implemented in Matlab (Hill and Lewicki, 2005). The trials are aligned along the horizontal direction of the dendrogram plot and are called "leaf " nodes, whereas the vertical axis is an appropriately-defined "distance." For example, if the correlation coefficient (c) is the measure of trials' similarity, then the distance between two trials is d = 1 − c, i.e., higher the correlation coefficient smaller the distance between "leaf " nodes (trials).
The circular shift, performed in the previous section with the purpose of maximizing the coefficient of correlation between any trial and an arbitrary "reference" trial, helps defining the relative phase of the trials with respect to each other. The hierarchical classification of trials in dendrogram groups suggests "similar"looking clusters of trials that may have a similar mathematical description. The rms error computed between each trial and its corresponding cluster average further decreases (see green solid circles in Figure 1D), suggesting a strong correlation among the clustered trials. Throughout this study, we only used the Euclidian distance to measure similarities among the "leaf " nodes of the dendrogram (see Figure 2A). As we noticed, phase resetting correction significantly lowered the distance among similar trials. For example, the distance threshold for breaking the trials in six clusters before correcting for phase resetting was about 80 units (see Figure 2A) and after phase correction was about 60 units (see Figure 2B). In Figures 2C,D we plotted the average trace of each cluster from the optimized dendrogram shown in Figure 2B. There are some clear differences between the cluster traces, e.g., clusters 1 (magenta trace in Figure 2C) and 2 (green trace in Figure 2C) have high amplitude oscillations whereas cluster 6 (yellow trace in Figure 2D) has a very small peak-to-peak amplitude. Although we only show figures for one of the six animals, the same numerical procedure was applied to all the data.

DELAY EMBEDDING METHOD
Membrane potential oscillations are generated by intricate feedback mechanisms that involve many different types of ionic channels (Hille, 2001). Although in the simplest possible conductance-based model only a fast sodium and a potassium delayed rectifier suffice (Hodgkin and Huxley, 1952), more accurate models have hundreds of compartments each populated FIGURE 1 | Phase resetting of neural network activity from LFPs. Steady LFP activity between a "reference" trial (blue) and another arbitrary trial (A) without phase shift and (B) with a circular phase shift that maximizes the coefficient of correlation. (C) Without phase shifting to correct for the phase resetting, the average correlation was 0.025 ± 0.035 (red trace), which increased after appropriately phase shifting to 0.414 ± 0.089 (blue trace). (D) The root-mean-square (rms) error between the two trials decreases from 28.3 ± 7.8 without phase resetting correction (red line), to 22.3 ± 5.5 after phase-shifting (blue line), to 16.6 ± 3.8 for phase-shifted dendrogram-based correlation (green solid circles).
with tens of different ionic channels (Buzsáki and Draguhn, 2004;Schnitzler and Gross, 2005). Even the simplest conductancebased model requires evolution equations for four independent variables (membrane potential, activation variable for both sodium and potassium and inactivation of potassium channels) (Hodgkin and Huxley, 1952). The number of independent variables required for a model is its dimensionality. For pyramidal cells realistically interconnected to mimic the mPFC network, we would expect an extremely large number of dimensions (Schnitzler and Gross, 2005). However, when the actual model equations are not known, finding the dimensionality of a nonlinear system falls on experimental data. Nonlinear dynamics (Abarbanel, 1996;Kantz and Schreiber, 1997;Schuster and Just, 2005) developed a set of tools for data mining, which include phase space embedding. In this paper series (see also Oprisan et al., 2015), we investigated the LFPs using delay-embedding method of nonlinear dynamics to estimate the number of degrees of freedom of the steady activity of mPFC neural network.
One of the challenges of delay-embedding is that we only record one-dimensional data (time series) of the membrane potential and use it to identify all the other independent variables that describe the system. The delay embedding method (Packard et al., 1980;Takens, 1981), takes a time series x i = x(i t) with i = 1, 2, . . . , N where N is the number of data points and t is the uniform sampling time, and expands it into a d−dimensional vector: where τ = n t is the delay, or lag, time.

The Lag Time
One potential problem with selecting the "right" delay time is that a too small value leads to highly correlated embedded vectors. Geometrically, all the data points cluster along the diagonal direction of the embedding space leading to a one-dimensional attractor, regardless the complexity of the original data. This issue is known as redundancy, and the obvious solution is to increase the delay time τ until the components of the embedded vectors become independent (Casdagli et al., 1991). However, the delay time cannot be arbitrarily large because then the reconstructed vectors are completely de-correlated. Geometrically, the data points will uniformly fill out the entire phase space without showing any particular structure (Casdagli et al., 1991). As in the previous study (Oprisan et al., 2015), we used both (1) the autocorrelation (Holzfuss and Mayer-Kress, 1986;King et al., 1987;Zeng et al., 1991;Schiff and Chang, 1992;Schuster and Just, 2005) and (2) the average mutual information (AMI) (Fraser and Swinney, 1986;Kantz and Schreiber, 1997;Hegger et al., 1999) for estimating the lag time τ .

The Embedding Dimension
As before (Oprisan et al., 2015), Takens' theorem (Takens, 1981) provided a rough estimate of the embedding dimension through its practical implementation in the false nearest neighbors (FNN) algorithm (Kennel et al., 1992;Hegger et al., 1999;Sen et al., 2007). The intuitive idea behind FNN is that high-dimensional phase space trajectories projected onto a lower dimensional embedding space will show self-crossing points. Such false crossing points could be eliminated by unfolding the attractor in the right dimensional space (Kennel et al., 1992).

Experimental Data
Every 1.5 s-long LFP trial was first circularly shifted to correct for the phase resetting induced by the brief 10 ms light stimulus (see Figure 1B).
The lag time was estimated both with (1) the autocorrelation (Casdagli et al., 1991), and (2) the average mutual information method (Fraser and Swinney, 1986). The autocorrelation measures the amount of linear correlation between the time series and a time-shifted version of itself. By selecting for the lag time the first zero crossing of the autocorrelation (see Figure 3A), we ensure that any (linear) correlation between the two time series was removed (Abarbanel, 1996). Since the autocorrelation method only eliminates the linear correlation between a time series and its time-shifted version, we also used the first minimum of the nonlinear autocorrelation function called Average Mutual Information (AMI) (Fraser and Swinney, 1986;Kantz and Schreiber, 1997) to estimate the lag time (see Figure 3B). The distribution of all autocorrelation-based lag times for animal #1 is shown in Figure 3C and the corresponding dendrogrambased cluster averages are given in Table 1. Although only the autocorrelation-based lag times are shown both in Figure 3C and Table 1, the AMI-based lag time values (not shown) were FIGURE 3 | Time lag estimation. The first zero crossing of the autocorrelation function is around τ ≈ 2,400 t (A) and the first minimum of the average mutual information is around τ ≈ 2,300 t (B) with t = 10 −4 s. The histogram of all lag times for animal #1 has a broad range with a mean of about τ mean = (2,428 ± 707) t ≈ (0.24 ± 0.07) s (C). within 5% of those obtained with the autocorrelation method. As before (Oprisan et al., 2015), we used the Tisean function autocor to compute the autocorrelation (see Figure 3A) and mutual to compute the AMI (Figure 3B).

Embedding Dimension
Once we obtained a consistent estimate of the lag time, the Tisean package was used for computing the embedding dimension of the data (see Oprisan et al., 2015 for explicit Tisean function calls).
In the examples shown in Figure 4A, we used a lag time τ mean = 2428 t to estimate the embedding dimension with variable distance ratios, f , between 2 and 20. The distance ratio f is given by the distance between two points in a (d + 1)dimensional space relative to its value in the d-dimensional space. Distance ratio (Kennel et al., 1992;Abarbanel, 1996) is sometimes called "the escape factor" (Kugiumtzis, 2002) since by increasing the embedding dimension from d to (d + 1) false neighbors move quickly apart. Too small of a distance ratio leads to an overestimation of the percentage of false neighbors, whereas too large of a distance ratio gives a large number of false positives. For example, for large distance ratios, e.g., f > 7, the percentage of false nearest neighbors drops below 1% for an embedding dimension d E = 3 (see Figure 4A). For all values of the distance ratio f > 12, the percentage of FNN dropped below 10 −6 % at embedding dimension d E = 3 (not shown in Figure 4A). Strong temporal correlations are expected among data points that are close to each other (Theiler, 1990;Theiler et al., 1992). As a result, such data points should be removed, i.e., the time series must be "windowed" to avoid spurious temporal correlations (Grassberger, 1987;Theiler, 1990). Among the most used Theiler window criteria is three times the correlation time (Heath, 2000), (d − 1)τ , or the space-time separation distance (Provenzale et al., 1992). We tested a wide range of Theiler windows from 100 to 8,000 sampling times, which allowed us to account for any possible strong temporal correlation of closely spaced data points. For all Theiler windows tested, the percentage of FNN dropped below 0.1% at embedding dimension of d E = 3 (see Figure 4B). Although we tested a wide range of Theiler windows, we only show five representative results in Figure 4B for two reasons: (1) the trend in the omitted data is similar and does not add anything to the data shown in Figure 4B, and (2) avoid figure cluttering.
For each trial, the phase space attractor was reconstructed using its corresponding delay (lag) time, of which we only show two examples from each cluster, represented by the red and green thin lines in Figure 5. For each cluster, we also show the average reconstructed trace with the thick blue line in Figure 5. Although the reconstructed cluster average (blue thick traces in Figure 5) does not represent any "true" attractor, we used it here mostly as a visual aid that helps us gauge how individual phase space trajectories relate to the average. Although it seems as if all the attractors are different, they are topologically equivalent (see Oprisan et al., 2015), i.e., by changing the delay time one phase space trajectory could be morphed into the other. The attractors in Figures 5A,B show a double loop (period-2 attractor), which can be "unfolded" into the "8"-shape in Figures 5E,F by slightly changing the delay time. Furthermore, any "8"-shaped attractor is topologically equivalent with a closed elliptic attractor as shown in Figures 5C,D. Indeed, the attractor in Figure 5D could be morphed into the one shown in Figure 5F by twisting the upper half of the attractor with respect to the lower half. Moreover, an "8"-shaped attractor such as the one shown in Figure 5F could be morphed into a period-2 attractor as in Figure 5B by folding the front two loops.
The detailed procedure described above was also applied to the other five data sets from the different animals (not shown). In summary, the delay embedding method gave consistent delay (lag) time estimations with both the autocorrelation and the average mutual information methods (see Table 1) and the optimum delay embedding dimension was d E = 3.

FIGURE 4 | Percentage of false nearest neighbors. (A)
A semi-log plot of the percentage of FNN shows that for a distance ratio f > 7 the percentage drops below 1% for an embedding dimension of d E = 3. This suggests that an optimum ratio is above f = 7, in agreement with results from others (Abarbanel, 1996;Konstantinou, 2002). (B) Spurious temporal correlations among too closely spaced data points were removed by different Theiler windows from 100 to 8,000 samples, i.e., from 0.01 to 0.8 s (only a small subgroup is shown here). The percentage of FNN drops below 0.1% for an embedding dimension of d E = 3. Figure 2B) two randomly selected trials (red and green thin lines) together with the cluster average (blue thick line) show the reconstructed attractors. The cluster average (blue thick line) only serves as a visual aid to guide us gauge if the two randomly selected trials from the same cluster remain close to each other al all times.

CONTROL VS. COCAINE NEURAL ACTIVITY
What did we learn from analyzing the neural activity under cocaine? First, we found that the delay embedding method also works for the cocaine case as well (see Oprisan et al., 2015 for a discussion of control data). Second, we found that the embedding dimension is still three, the same as for the control data in Oprisan et al. (2015). This means that the mathematical model that could describe both the control and the cocaine cases only require three independent variables. This is significant because it opens the possibility of deriving a single mathematical model, possibly with only (slightly) different control parameters to describe both control and cocaine results.
Are control data truly different in a significant way when compared against cocaine to warrant such a hypothesis? To answer this question, we compared the data side-by-side from the same animals before and after cocaine using a dendrogram similarity measure. For each of the six animals, we concatenated the 100 trials before cocaine (trial index from 1 to 100) with the 100 trials after cocaine (trial index 101 to 200) in a single 200trial file. As it was described above, after performing appropriate phase shifting to account for the transient phase resetting induced by the light stimulus (see also Oprisan et al., 2015), we performed a dendrogram analysis. If the neural activity patterns before and after cocaine are totally distinct, then we would expect that they separate in clusters with no overlap. A very high degree of similarity would be troubling from the modeling point of view since it would indicate that we cannot distinguish between the two conditions using this dendrogram method.
For each cluster, we computed the percentage of mixing between before and after cocaine trials by using the formula: % overlap = min(#trials before , #trials after ) #trials cluster , where #trials cluster represents the total number of trials in a given cluster, of which #trials before belong to the experiments before cocaine injection (trial index 1 to 100) and #trials after belong to the experiments after cocaine injection (trial index 101 to 200). If a dendrogram cluster contains only one category of trials (either only before or only after cocaine trials), then the overlap is zero and the separation between before and after cocaine trials is maximum possible. However, if half of the trials of a cluster are from before and the other half are from after cocaine, then the overlap is maximum possible and the dendrogram cannot discriminate between those trials. We used both the six-cluster (see Figure 6A) and the 12-cluster (see Figure 6B) dendrograms to classify the 200 combined trials. The difference in the average percentage of trial overlap between the two cases is not significant, i.e., (21.2 ± 1.1)% for six-cluster and (18.9 ± 1.1)% for 12-cluster dendrograms. The fact that the average percentage overlap does not change with the number of clusters of the dendrogram suggests that the amount of overlap could be determined by a true similarity among the three-dimensional reconstructed dynamics both before cocaine (see Oprisan et al., 2015) and after cocaine (as in this work). As a result, it may be possible to fit the same three dimensional mathematical model to the data before cocaine with one set of model parameters and the after cocaine data with the same model but with a different set of parameters. The almost 20% average mix between before and after cocaine could be the result of a small set of parameters that remain constant between the two models, i.e., the most stable, or invariant, part of the model. We also know that between before and after cocaine, there are significant differences that a possible mathematical model must capture (besides the common or invariant part represented by a 20% similarity among phase space reconstructions). Indeed, from a dynamical point of view, we found significant differences between before and after cocaine trials. For example, the delay times for the phase space reconstructions are significantly different (see Figure 7). Although for each trial the optimal delay is different, it is clear that the distributions of delay times are also significantly different between before and after cocaine (see Figure 7A). We notice that, on average, all delay times are larger for before cocaine trials compared to after cocaine (see Figure 7A). We fitted the distributions with lognormal functions to capture correctly the long tail of distributions and showed that the center of the delay time distributions before and after cocaine are well separated (see Figure 7B). In all animals except one, the centers of lognormal delay time distributions for before cocaine trials are two to three times longer (see Figure 7B) than for cocaine trials. This has a significant impact on the mathematical modeling since the delay time sets the time scale of model equations. One possible approach to modeling such a difference in delay times between control and cocaine trials could be along the line of inquiry of previous experiments done by Dilgen et al. (2013), who speculated that cocaine may enhance synchronization of neural activity via a tighter PV-cell induced oscillation.
The above classification method is entirely based on dendrogram's Euclidian distance between phase shifted trials. However, the one-dimensional time series do not tell the entire story of neural activity. Therefore, after we reconstructed the three-dimensional attractors using the delay-embedding method, we computed the Frechet distance between phase space reconstructed trajectories (Frechet, 1906). Frechet distance can be intuitively formulated in terms of a man walking a dog on a leash. The man walks along one curve whereas the dog along the other and Frechet distance is the shortest leash that is sufficient for traversing both curves (Eiter and Mannila, 1994). In other words, the Frechet distance is a measure of the similarity between two curves in any metric space by taking into account the location and ordering of the points along the curves (Eiter and Mannila, 1994). We used a readily available Matlab implementation of Frechet distance algorithm (Danziger, 2013) and computed all distances between any possible combinations of the 100 trials before cocaine (bc) with the 100 trials after cocaine (ac) for all six animals. In Figure 8, each white-bordered square corresponds to a combination of 100 × 100 trials. For example, the top left square shows the color-coded Frechet distance between the 100 trials before cocaine for the first animal (labeled bc1) and the 100 trials after cocaine for the first animal (labeled ac1). Deep blue colors indicate small Frechet distances, i.e., more similar reconstructed attractors. We must emphasize that this is a different kind of similarity measure compared to the dendrogram similarity described above. This is because the Frechet distance was computed between the three-dimensional reconstructed attractors rather than the one-dimensional LFPs. As a result, Frechet distance includes dynamic information regarding both the lag time and the embedding dimension of the reconstructed attractors. A cursory inspection of Figure 8 suggests some patterns. For example, all after cocaine trials for animal #3 (ac3) have large Frechet distances to any control trial across all animals (bc1 to bc6). Similarly, it seems that the control data for animal #4 (bc4) have consistent large Frechet distances with respect to all after cocaine trials across all animals (ac1 to ac6). The dynamic information contained in the Frechet distance plots shown in Figure 8 could potentially provide additional hints regarding the time scales of variables involved in a future modeling of LFPs. A consistent blue color due to a small Frechet distance between trials suggests very similar phase space dynamics, i.e., possibly with very close time scales.

DISCUSSION
As in the previous study (Oprisan et al., 2015), we performed both a phase shifting on the LFPs to correct for the phase resetting induced by light stimulus, and a grouping of the shifted LFPs in similar patterns of activity using a dendrogram (see Figure 2). FIGURE 6 | Overlap index between before and after cocaine for each of the six animals. The overlap index varies from zero (total separation of before from after cocaine trials) to maximum possible overlap where 50% of trials are from before and the other 50% are from after cocaine in the same cluster. The overlap seem to be consistent regardless the number of clusters: in six-cluster dendrogram (A) the mean percentage overlap is (21.2 ± 1.1)%, whereas in 12-cluster dendrogram of the same data the mean overlap is (18.9 ± 1.1)% (B).
FIGURE 7 | Distribution of delay times before and after cocaine. Typical delay time distributions show a clear time scale separation between conditions: after cocaine the delay times are significantly smaller than before cocaine (A). The average value of the lognormal fit for the six animals show that for all animals except one, the after cocaine mean delay time is shorter (two to three times shorter) than for before cocaine (B).
We successfully reconstructed three dimensional attractors based on LFPs from mPFC of ChR2 expressing PV+ interneurons. The delay time for each trial was estimated using both the autocorrelation and the average mutual information functions (see Table 1). We used the false nearest neighbor method and found that the minimum embedding dimension was d E = 3 for all six animals.
We found topologically similar phase space attractors that could be morphed into each other through an appropriate change in delay time (see Oprisan et al., 2015 for details). The characteristic "8"-shaped attractor (see Figure 5) was also found in the control cases (Oprisan et al., 2015).
Both the study on control data (Oprisan et al., 2015) and this study on cocaine modulated neural activity suggest that the local network could be described by a model with only three variables. Furthermore, the control and cocaine trials are classified as similar and placed in the same cluster by the dendrogram method in about 20% of the cases, regardless the number of clusters of the dendrogram (see Figure 6). This overlap suggests that there may be a common, invariant, part of the mathematical model that describes both control and cocaine trials. At the same time, the significant difference in the delay times between control and cocaine trials (see Figure 7) could lead to different phase space dynamics. This finding suggests that although there may be some similarities between control and cocaine trials, the differences in the delay times could be interpreted as two different time scales for the two experiments. Additionally, the Frechet distance plots shown in Figure 8 provide an intuitive and global understanding of potentially similar time scales between different trials as represented by blue colors.
This study complements a previous Fourier-based analysis done by our group (Dilgen et al., 2013). For the Fourierbased analysis, power spectral densities before and during the optogenetic stimulation were computed. It was found that stimulation at 40 Hz significantly increased oscillations and induced a clear peak in the gamma range (see Figure 3 in FIGURE 8 | Frechet distance between the control, before cocaine (bc) and after cocaine (ac) trials for all six animals. Each white-bordered square contains the color-coded Frechet distance between the 100 control (bc) trials and the 100 after cocaine (ac) trials. Deep blue colors mean small Frechet distance and suggests more similar phase space trajectories. Dilgen et al., 2013). The effect of cocaine in the Fourier-based analysis was reflected in a decrease in bandwidth of induced oscillations (see Figure 4 in Dilgen et al., 2013). As a result, it was concluded that the main effect of cocaine is that "it increases the entrainment of the laser-induced oscillation to the driving frequency, resulting in a very narrow-bandwidth gamma oscillation centered at 40 Hz." Applications of nonlinear dynamics methods to neuroscience are based on the assumption that the central nervous system is a nonlinear dynamical system exhibiting deterministic chaos. The combination of the two words, "deterministic" and "chaotic, " seem unlikely, since we associate deterministic behavior of systems with predictable, well-behaved, responses governed by precise (deterministic) evolution equations whereas chaotic behavior is associated with unpredictable and "random" responses. More than a century ago, Poincaré shifted the paradigm by noticing that "... it may happen that small differences in the initial conditions produce very great ones in the final phenomena. A small error in the former will produce an enormous error in the latter. Prediction becomes impossible, and we have the fortuitous phenomenon" (Poincare, 1920). Poincaré showed that the three celestial bodies problem, although mathematically described by Newton's law of gravity, has "chaotic" behavior determined by initial conditions, i.e., sometimes small difference in initial conditions (positions and velocities of the three bodies) remain small and produce a well-behaved solution whereas for other infinitesimally close initial conditions the trajectories diverge exponentially. Such sensitivity of complex, nonlinear, systems to small variations in initial conditions are particularly important in computational neuroscience since the evolution equations are solved using digital computers that can only represent data with a finite precision. The nonlinearity and sensitivity to initial conditions is ubiquitous in neuroscience (Faure and Korn, 2001) and has been captured starting with early models, such as Hindmarsh and Rose model (Hindmarsh and Rose, 1984), which can produce a "chaotic" looking spike train that is actually deterministic. The interest in deterministic chaos and nonlinear dynamics is also clinically motivated as it has been shown that some nonlinear dynamics measures are very effective in detecting pathological conditions such as epileptic seizures, coma and dementia from electroencephalogram (EEG) recordings (Galka, 2000;Müller et al., 2001;Lehnertz, 2008). Others related the unfolding rate of the system attractor obtained with delay-embedding to different pathological cases of epilepsy (Pravitha et al., 2001), or used delay-embedding to filter various EEG artifacts (Anderson et al., 2006). Another important clinical application of delay-embedding include the detection of cardiac arrhythmia (Ravelli and Antolini, 1992;Richter and Schreiber, 1998) or the correlation between anxiety and electrocardiogram complexity (Radhakrishna and Vikram, 2001).
Other optogenetic modeling studies used a bottom-up approach in which relevant brain areas subject to optogenetic stimulation, such as the subthalamic nucleus (STN)external Globus Pallidus (GPe) subnetwork, are modeled with conductance-based model neurons (Ratnadurai-Giridharan et al., 2017). Although limited to only ten STN and ten GPe neurons, the model seems to capture patterns of synchronized oscillatory activity observed in Parkinsonian patients. Their numerical simulations show that optogenetic inhibition is more effective than electrical deep brain stimulation (DBS) (Rosa et al., 2012;Wichmann and DeLong, 2016). While further studies are needed, a data-driven model obtained with the delay-embedding method could provide subject-tailored clinical support for alternative optogenetic solutions to DBS.
Other recent studies used a combination of delay embedding (Kantz and Schreiber, 1997;Hegger et al., 1999) and statistical learning theory (Hastie et al., 2001). The method was applied to a different data structure than ours, i.e., to analyze multiple single-unit recordings from the rat anterior cingulate cortex while the animals performed decision-making tasks in a radialarm-maze (see Balaguer-Ballester et al., 2011 for details), or from the medial prefrontal cortex of rats while the animals performed a foraging task guided by working memory (Lapish et al., 2015). The authors augmented the embedding space with extra dimensions represented by interactions between units' firing rates to account for neuronal cross-correlations. In their study, the dimension of the embedding space (10 6 ) was significantly larger than in our case. To handle the statistical analysis in such a high dimensional space, they used kernelmethods (Hastie et al., 2001). To visualize the phase space trajectories, they used a kernel-PCA (Principal Component Analysis) (Scholkopf et al., 1998;Zheng et al., 2005) by only retaining the three most variance-explaining dimensions of the neural dynamics. PCA algorithms find directions which have minimal reconstruction error by describing as much variance of the data as possible with a (relatively) small number of orthogonal directions. To further aid with visual interpretation of the data, they also projected neural dynamics onto a 3-dimensional space using a Fisher discriminant analysis (FDA) technique (Fisher, 1936;Mika et al., 1999;Sugiyama et al., 2009). FDA maximizes the separability of classes in a dataset by constructing projection vectors that maximize the scatter between the classes, while minimizing the scatter within each class (Duda et al., 2000). By augmenting the embedding space, the 3-dimensional visualizations seem to uniquely identify task-specific phases space attractors associated with different population states.
In this study, we corrected for the transient change in the LFPs by computing the amount of phase resetting required to maximize the correlation among trials. For truly nonstationary recordings, new methods have been suggested that range from machine learning to a novel class-trajectory coherence algorithm that can estimate the departures from deterministic nature in multi-attracting dynamics (Balaguer-Ballester et al., 2014). The method of class-trajectory coherence is particularly suitable for detecting subtile dynamical changes that are not detected by statistical moments and, therefore, when significant trends cannot be statistically proven (Balaguer-Ballester et al., 2014).
Our ultimate goal is to better understand the mechanisms of information processing and the role of the coherent 40 Hz oscillations. It is believed that this gamma rhythm of the brain is involved in higher cognitive function or consciousness (Mashour, 2006;Lee et al., 2009). The rhythm supports information integration across different areas of the brain and helps binding neural processes that generate consciousness as shown both experimentally (Tononi et al., 1998;Tononi and Sporns, 2003) and through computer modeling (Hauptmann et al., 2005). Loss of consciousness has been revealed by decoherence of gamma rhythm (John et al., 2001;Lee et al., 2009). The delay-embedding method has been previously used to investigate the change in the embedding dimension and lag times due to anesthetics (Walling et al., 2006;Lee et al., 2009). The "cognitive rebinding" associated with the emergence from unconsciousness was associated with the significant change from an ordered to a chaotic reconstructed attractor. Since the mice in our experiments were awake during both control and cocaine experiments, we did not find such a dramatic change in the attractors' structure. However, the underlying dynamics has a faster time scale for cocaine experiments, a detail that is only captured by analyzing the lag times.

CONCLUSIONS
The activity of the medial prefrontal cortex in six mice systemically injected with cocaine (20 mg/ kg ip) was optogenetically perturbed with brief laser pulses. The permanent phase resetting induced by a light stimulus was removed using the pair correlations between recorded local field potentials. The phase-corrected trials were embedded in a three dimensional phase space using a delay embedding method. The main results are as follows: (1) The reconstructed attractors for cocaine trials are three-dimensional. (2) The 20% classification overlap between control and cocaine trials suggests a possible common, invariant, mathematical description of network activity. (3) At the same time, the cocaine dynamics is about three times faster than the control, suggesting different time scales for a possible mathematical model. Since previous experiments using EEGs for delay-embedding focused only on comparing the wakefulness vs. unconsciousness attractors, we can only speculate that a gradual increase of the anesthetic could be useful in revealing not just the topological, i.e., the appearance and dimensionality of the attractor, but also the dynamical aspects, i.e., the lag time distribution, of phase space activity. Similarly, we would like to devote future experimental studies to investigate the effect of cocaine dose on the phase space reconstruction. We hypothesize that increasing the cocaine dose should shorten the lag times to the point where neural activity leads to a dramatic change even in the topology of the attractors. Such a hypothesis aligns with the general consensus of a communication breakdown between different parts of the cerebral cortex as consciousness fades (Massimini et al., 2005;Ferrarelli et al., 2010).

AUTHOR CONTRIBUTIONS
SO contributed to the analysis and interpretation of data, and wrote the manuscript. JI and JH contributed to delay-embedding numerical simulations and reviewed the manuscript. TT and AL contributed to the design of the work, performed the experiments, and reviewed the manuscript.

FUNDING
SO acknowledges support for this research from NSF-CAREER award IOS 1054914.