A Framework for Adapting Deep Brain Stimulation Using Parkinsonian State Estimates

The mechanisms underlying the beneficial effects of deep brain stimulation (DBS) for Parkinson's disease (PD) remain poorly understood and are still under debate. This has hindered the development of adaptive DBS (aDBS). For further progress in aDBS, more insight into the dynamics of PD is needed, which can be obtained using machine learning models. This study presents an approach that uses generative and discriminative machine learning models to more accurately estimate the symptom severity of patients and adjust therapy accordingly. A support vector machine is used as the representative algorithm for discriminative machine learning models, and the Gaussian mixture model is used for the generative models. Therapy is effected using the state estimates obtained from the machine learning models together with a fuzzy controller in a critic-actor control approach. Both machine learning model configurations achieve PD suppression to desired state in 7 out of 9 cases; most of which settle in under 2 s.


INTRODUCTION
Continuous deep brain stimulation (DBS) for Parkinson's disease (PD) uses high frequency stimulation to ameliorate patient condition. However, this induces side effects in patients and shortens pacemaker battery life (Little et al., 2013). Both can be resolved using adaptive deep brain stimulation (aDBS). Adaptive DBS driven by feedback signals provides an approach that optimizes clinical benefits whilst minimizing side effects and battery depletion (Little et al., 2013;Arlotti et al., 2016). A commonly adopted feedback signal for aDBS are local field potentials (LFP) (Arlotti et al., 2018). LFP are used due to their correlation to patient clinical states and the ease with which they can be acquired (Priori et al., 2012;Little et al., 2016). In conventional DBS, programming of stimulation parameters are done by trained clinicians (Picillo et al., 2016). Thus, aDBS techniques that imitate human reasoning into decision making could be adopted-an example of which is fuzzy control.
Fuzzy control is found in numerous applications for closed loop therapy (Zarkogianni et al., 2011;Soltesz et al., 2013;Zavitsanou et al., 2016). It has the potential to achieve a level of expertise close to (and possibly better than) human expertise in therapy modulation (Barro and Marin, 2002). However, the capabilities of fuzzy control are dependent on the level of sophistication of its rules and input signal. In this paper, state estimates are used as input to a fuzzy controller to achieve a critic-actor control policy as shown in Figure 1. It leverages on a machine learning model as the critic and a fuzzy controller as the actor. This individualizes therapy by means of patient-specific state estimates which are obtained through the machine learning models. Machine learning models were selected because of their ability to create adaptable models for complex signals using statistical attributes from the signals (Sajda, 2006). The choice of fuzzy control was driven by their ability to provide computationally efficient and robust decision making. Consequently, as more knowledge on PD and DBS is gained, the fuzzy rules could be updated, which provides an adaptable control scheme. The scheme has the potential to be developed into a fully implantable closed-loop DBS system. The rest of the paper is organized as follows. Section Machine Learning for Disease Tracking describes the methods adopted for disease tracking. Sections Models and Metrics and Fuzzy Controller Design describe the materials used for implementing the adopted methods. Section Performance Evaluation presents the results obtained. Sections Discussion and Conclusion present discussion and concluding remarks, respectively.

MACHINE LEARNING FOR DISEASE TRACKING
Disease tracking is important because dynamic changes in PD pathophysiology could help inform treatment strategies. This can be achieved using machine learning models as they provide insights on disease progression. In brain machine interface applications, machine learning provides the ability to notify caregivers of life-threatening events related to chronic disease diagnosis and management (Johnson et al., 2016;Mohammed and Demosthenous, 2018). Using closed-loop control strategies, this useful information can be used to generate actionable outputs-mostly from stimulation devices-to mitigate patient conditions (Csavoy et al., 2009). Machine learning models for disease tracking are intended to achieve one of two outcomes: prediction or state estimation. For optimal delivery of bio-electronic therapy, prediction is the most desirable outcome. Nevertheless, early and accurate state estimation can be used to adjust therapy to suit patients' needs. State estimation tracks fluctuations in PD symptom severity so that stimulation can be modulated correspondingly. Machine learning algorithms can be used to obtain state estimates. Generally, machine learning algorithms are classified into supervised (using labeled data), semi-supervised (using partly labeled data) and unsupervised (using unlabeled data) learning algorithms. This work will focus on the use of supervised learning algorithms.

Supervised Learning Algorithms
These algorithms are not only concerned with detecting patient states, but can also be used in understanding the evolution of the pathophysiological processes in patients; thus, modeling transitions between various states in a disease. Supervised learning algorithms are divided into discriminative and generative machine learning models. For both algorithms, the major pre-processing approach adopted before state estimation is scaling the features using mean normalization. This is represented mathematically as follows, where µ j = 1 m m i=1 x j (i) is the mean of feature j, x j (i) is feature j of training example i (with a total of m training examples x 1 (i) , x 2 (i) , . . . x m (i) ) and s j is the standard deviation of feature j. Contour plot for state estimates using SVM, with a range from 0 to 1 representing levels of severity from non-PD to PD for (A). (D) Contour plot for state estimates using GMM, with a range from 0 to 1 representing levels of severity from non-PD to PD for (A). The two features are, Feature 1 (21-26 Hz band) and Feature 2 (18-23 Hz band).
Feature scaling using mean normalization scales features such that features have a comparable range of values. Discriminative models focus on detecting disease or nondisease states, in this case PD and non-PD states. On the other hand, generative algorithms are particularly useful in applications were the sequence of transition between states is essential in determining future states, like in sleep-stage monitoring applications (Rossow et al., 2011). In aDBS, they can be principally useful in applications were stimulation parameters are defined by the evolution of the sensed neural potentials.

Representative State Estimators
Generative algorithms model the data based on the joint probability distributions between its classes (PD and non-PD) while discriminative algorithm models data based on their conditional probability distribution. One example in each of the two models was used to test the soundness of the proposed framework for aDBS. A linear kernel support vector machine (SVM) was selected as the representative algorithm for the discriminative models while the Gaussian mixture model (GMM) was selected for the generative algorithms. SVM and GMM were selected because of their computational efficiency compared to other similar algorithms. Figure 2 shows the contour plot for features space using the conditional probability from the SVM as state estimate and the joint probability from the GMM as state estimate. PD regions are points on the plot where the probability is >0.5, while non-PD regions are those in which the probability is <0.5. Thus, from non-PD to severe PD is a transition in probability from 0 to 1.
The SVM uses the widest margin between differing states to discriminate. For a linear SVM kernel, the discriminating function f SVM (x), used in classifying test cases is obtained using the training examples as in Cristianini and Shawe-Taylor (2000). where x i are the support vectors and their labels y i , x is the test case, (x i , x) is the kernel transformation (linear kernel), α i is a weight vector and b represents the classification threshold. Figure 2C depicts the state estimates obtained using SVM on the feature space in Figure 2A, whose probability density function (PDF) is shown in Figure 2B.
The GMM estimates conditional probability using a weighted sum of a number of PDFs. These PDFs are used to form the Gaussian models. The weighted Gaussian functions f GMM (x) modeling the underlying processes are where w i is the weight assigned to a particular Gaussian model i, − → x is the input feature vector, µ i is the mean vector and i is the covariance matrix. The major assumption employed in GMM is that the population of feature vectors can be represented by N Gaussian models. Thus, two Gaussian models are fitted in the training data, in order to represent each of the patient states (PD and non-PD). Figure 2C shows the state estimates obtained using SVM on the feature space in Figure 2A, whose PDF is shown in Figure 2B.

MODELS AND METRICS
The proposed model in Figure 1 consists of a basal ganglia network, a feature extraction stage, a state estimation stage for diagnosing PD severity and an adaptive stimulator for delivering therapy. The basal ganglia network uses LFP recordings to mimic the underlying mechanism of PD. LFP signals from the basal ganglia model are applied to a feature processing stage, and the output from this stage is applied to a state estimation stage. Stimulation parameters are adjusted based on patient state estimates. The model was developed using custom SIMULINK blocks. The SIMULINK blocks were implemented using level-2 MATLAB S-functions. This was used to validate the complete aDBS system.

Basal-Ganglia Network Model
In order to validate these methods a basal ganglia model using LFP recordings obtained from measurements made on patients exhibiting a combination of bradykinesia and/or rigidity during the onset of PD, with less noticeable tremor was employed. The network which is shown in Figure 3A, consists of: patient LFP signals, modulating network and the modulated LFP signal.

Patient LFP Signals
These are LFP signals consisting of PD and non-PD periods synthesized from real-life LFP recordings. The LFP synthesis, involved fitting autoregressive moving average (ARMA) models to the LFP recordings to produce semi-synthetic LFP signals. Fitting an ARMA model provides the flexibility to manipulate the signal characteristics such that all underlying conditions can be represented. Also, it offers the opportunity to generate LFP signals consisting of PD and non-PD episodes of variable duration. The LFP dataset consists of LFP recording for nine patients. The recordings were obtained from the subthalamic nucleus (STN) of subjects exhibiting a combination of bradykinesia and/or rigidity during the onset of PD, with less noticeable tremor.
The permanent quadri-polar macro-electrode used was model 3,389 (Medtronic Neurologic Division, Minneapolis, MN) consisting of 4 platinum-iridium cylindrical contacts. Its contacts were numbered 0, 1, 2, and 3, with 0 being the most caudal and 3 being the most cranial for both right and left electrodesmaking a total of eight monopolar channels for each patient. The recorded signals were amplified using a low-noise amplifier and band-pass filtered. Figure 4 shows a snapshot of OFF and ON L-dopa recordings of the left DBS lead for patient/dataset A. The complete LFP data synthesis process and a detailed description of the LFP recordings are provided in Mohammed et al. (2017). For the basal-ganglia model in Figure 3A, the applied stimulus regulates the patient LFP signal such that the modulated LFP characteristics are restored to those resembling non-PD LFP. Stimulation is not applied on detecting patient LFP with non-PD characteristics.

Modulating Network and Modulated LFP Signals
The therapeutic mechanisms of DBS on neuronal activities is still not clear. Various studies suggest that it reduces neuronal activities (Kiss et al., 2002), while others claim that it increases neuronal activities (Carlson et al., 2010). Later studies provide other alternative explanations (Chiken and Nambu, 2016). Generally, studies show that the frequency settings of DBS of the STN influence the motor symptoms of PD. For example, the study in Su et al. (2018), observed that frequency-specific effects can ultimately inform the frequency programming of STN-DBS in the clinical use. From the studies, what is clear is that DBS has a multimodal and modulating effect on neuronal activities at the stimulation site (Hell et al., 2019). In addition, the various clinical aspects related to bradykinesia and other PD symptoms are still unclear (Bologna et al., 2020). As such, to model the effect of stimulation on patient LFP signals, a black-box approach was used as shown in Figure 3A.
For the black box model, changes in the coefficient of variation (CV) of neuronal signals during DBS supports the hypothesis that modulating LFP signals is one of the mechanisms that can lead to PD suppression (Birdno and Grill, 2008;Dorval et al., 2010). PD symptoms have been found to correlate with beta band LFPs (Little et al., , 2013Grant and Lowery, 2013); gamma (Brown and Williams, 2005;Brittain and Brown, 2014); and tremor (Heida et al., 2013) bands. Hence, the neuromodulatory effect of DBS on PD occurs in multiple bands. This prompted the two-degrees-offreedom (2-DOF) changes in CV applied by the modulating network as shown in Figures 3B-D. For 2-DOF modulation in CV, the modulating network varies the amplitude of patient LFP signals in the two bands with the most pronounced variation between non-PD and PD bands as shown in Figure 3C. For both bands, the headroom of variation for the magnitude of the filter response is between 1 and the ratio of CV for non-PD to PD (CV NPD /CV PD ), as is shown in Figure 3D. Figures 3B-D show all the cases of CV ratio between PD and non-PD for 2-DOF variation. Figure 3B shows a situation where CV NPD in both bands is greater than CV PD . Figure 3C shows a situation where CV NPD in both bands is less than CV PD . Finally, Figure 3D shows a situation where CV NPD in one of the bands is greater than CV PD . This makes the modulating network unique for each patient since the frequency response of the modulating network is dependent on the relationship between PD and non-PD periods of each patient.
The modulated LFP signals are extracellular/LFP signals resulting from the modification of patient LFP signals by the modulating network. The modulated LFP signals are the signals monitored by aDBS in order to adjust the stimulation.

Feature Extraction and Selection
For feature extraction, the fast Fourier Transform (FFT) is used to obtain time-frequency data. This is achieved by dividing the signal into windows and applying FFT to each window (Prandoni and Vetterli, 2008). Mathematically STFT is given by where m is the discrete time index, L is the window length into which the signal is split and k is the discrete frequency index. For this application, the time-stamped measurements are split into 2 s overlapping epochs, with 50% overlap between epochs. In addition, the power bands (features) are divided into 5 Hz bands, with 3 Hz overlap between bands; 0 to 5, 3 to 8 Hz, . . . 45 to 50 Hz. This provides a total of 16 features. The window is chosen such that a balance between time and frequency resolution is obtained.
More so, feature selection involves reducing the number of features that will be used for state estimation. For this study, the maximum ratio method is used (Mohammed et al., 2017). It starts by identifying the channel having the two bands with the most pronounced variation in activity between PD and non-PD LFP signals. The goal is to obtain the frequency bands that make state estimation easier and computationally efficient. The maximum ratio method is a computationally simple method. A more detailed description is presented in Mohammed et al. (2017).

Stimulation Parameters
Stimulation is used to respond to fluctuations in the dynamics of patient LFP data. The estimated patient state is applied to the fuzzy controller and the fuzzy controller determines the appropriate stimulation parameters. The fuzzy controller is designed in section Fuzzy Controller Design. The poorly understood mechanisms for DBS makes the selection of stimulation parameters (i.e., amplitude, frequency and pulse duration) difficult (Kuncel and Grill, 2004). Experimental studies have been undertaken regarding the most beneficial stimulation parameter for reduction in motor symptoms. Some studies suggest that there are more noticeable improvements when stimulation frequency is adjusted (Moreau et al., 2008;Xie et al., 2012;Belasen et al., 2016). However, other studies maintain that stimulation amplitude is more critical (Moro et al., 2002;Eusebio et al., 2011;Whitmer et al., 2012). More research has focused on stimulation frequency alone (Birdno and Grill, 2008;Baker et al., 2011;Brocker et al., 2013). Varying the stimulation frequency is essential for the therapeutic effects of STN-DBS on motor symptoms in PD (Su et al., 2018). Nevertheless, the major  considerations in selecting stimulation parameters are patient responses to stimulation patterns and power consumption to conserve battery life (Kuncel and Grill, 2004). The consensus is that the most beneficial stimulation frequency occurs at 130 Hz (Birdno and Grill, 2008;Moreau et al., 2008;Vercruysse et al., 2014).
Based on the therapeutic benefits of varying the stimulation frequency, the fuzzy rules are designed to adjust the stimulation frequency. Adjusting the stimulation frequency modifies the modulating effect in a linear fashion as depicted in Figure 5. The headroom for the frequency response of the modulating network in Figure 3 (i.e., a magnitude response of between 1 and CV NPD /CV PD ) corresponds to a stimulation frequency ranging from 0 to 90 Hz. This is shown in the contour plot of Figure 5, where increasing the stimulation frequency by 45 Hz moves the test case from the point marked X to the center of the non-PD cluster, while a decrease of 45 Hz moves it to the center of the PD cluster. In theory, a 90 Hz increase/decrease in stimulation frequency maintained over 2 s can move a test case from the center of one cluster to the other (PD to non-PD cluster or vice versa). The range of stimulation frequency is between 0 and 180 Hz, which is within the limit for conventional DBS.

Evaluation Metrics
In assessing the performance of the different state estimatorbased approaches, three measures that are indicative of accuracy, latency and computational complexity have been used.

Accuracy
The state estimators are evaluated using Mathews correlation coefficient (MCC) and weighted classification error (WCE). MCC and WCE are balanced measures used in assessing state estimator quality that can be used even for cases with skewed classes. MCC measures the correlation coefficient between the observed and predicted binary classifications. It has a range between −1 (total disagreement) and +1 (total agreement); with 0 representing a random prediction. Mathematically, where TP are the true positives, FP the false positives, FN the false negatives and TN the true negatives. The major shortcoming of MCC is that it can only be used when one of the denominators TP + FN, TP + FP, TN + FP and TN + FN is not a zero. For WCE, it can be represented mathematically as, The first part of Equation (5) represents type I error (falsepositive rate), while the second part represents type II error (false-negative rate). In Equation (5), WCE uses equal weights to compute the average of type I and type II error.

Latency
Detection latency in this work, measures the total time required by the system (or controller) to settle to the modal state interval for non-PD defined by the fuzzy controller. For SVM driven control policy, the modal state interval is between 0.15 and 0.35, while for GMM, it is a state between 1 × 10 −8 and 0.1. This were empirically obtained considering that from non-PD to severe PD is a transition from 0 to 1.

Computational Complexity
In this work, the primary concern is the computational cost of the critic-actor control algorithm consisting of the state estimator and the fuzzy controller. Computational cost consists of two components, number of operations (NOP) and where N add(sub) is the number of 1-bit additions or subtractions; N mult(div) is the number of 1-bit multiplications and divisions; and Res is the resolution of the data converter used. For memory estimates, the number of 1-bit registers required are obtained.

FUZZY CONTROLLER DESIGN
Based on parkinsonian state estimates, fuzzy rules are defined to regulate stimulation. The fuzzy controller modifies the stimulation parameters applied to the modulating network to suppress PD-related oscillations. A fuzzy controller was chosen because it uses a reasoning which is similar to human reasoning and decision making. This makes it superior in handling non-linearity and uncertainty compared to schemes like proportional-integral-derivative controllers, lead-lag and state feedback control (Feng, 2006;Wu et al., 2017). Fuzzy controller design essentially involves the following: • Choosing the fuzzy controller inputs and outputs.
• Choosing the pre-processing that is required for the controller inputs and the post-processing for the controller outputs. • Designing the four components of the fuzzy controller (ruletable, inference mechanism, fuzzification and defuzzification).
To facilitate the design of the fuzzy controller, Figures 6A,B shows the desired average profile for the effect of incremental stimulation frequency for all possible input combinations for the SVM and GMM driven approaches, respectively. Figure 6 represents the average 3-D profile that maps inputs (state estimates and change in state) to outputs (stimulation frequency). To obtain the profile or each patient dataset, training examples at discrete points on the feature space representing states estimates ranging from 0 to 1 in steps of 0.1 are identified.
For training examples at each discrete point on the feature space, stimulation frequency is increased in steps from −45 to +45 Hz (in steps of 5 Hz). The corresponding rate of change in patient state is obtained for each discrete pair consisting of patient state estimate and applied stimulation frequency. This produces a mapping of three variables (patient state estimates, change in state and applied stimulation frequency). This means, for every patient state, there is an applied stimulation frequency that results in a specific rate of change in patient state. The process is repeated for all nine patient datasets and the average for the various profiles are obtained as Figures 6A,B. For the SVM based approach, Figure 6A represents the average 3-D profile that maps state estimates and change in state to stimulation frequency. This is represented by Figure 6B for the GMM based approach. The average profiles in Figure 6 are used to guide the rule-tables for controlling PD suppression.
The profile for the change in state (measured in s −1 ) targets a settling time of between 1 and 1.5 s from the center of the modal class of the PD state (with a probability 0.75 for SVM, and 0.9999 for GMM) to the center of the modal class of the non-PD state (with a probability 0.25 for SVM and 1 × 10 −4 for GMM). From Figure 6 it is obvious that from a PD state of 1, the SVM-driven approach has a more gradual descent, while the GMM has a sharper descent at the edges, plateaus for a range of input values in which change in input only causes a slight change in stimulation frequency before it finally descends steeply. This surface plot guided the choice of membership function and rule table for the fuzzy controller, which are normally chosen heuristically. The input-output relationship was obtained using the average profile for state estimate and incremental stimulation frequency which are depicted in Figures 2, 5, respectively.

Fuzzification
This is the encoding step. It modifies the inputs so that they can be interpreted and compared to the rules in the rule-table. The controller inputs are converted to information usable by the inference mechanism. Obtaining a value for an input variable and finding the numeric values of the membership functions that are defined for that variable. It can also be seen as an encoding of the fuzzy controller inputs. The encoded information is used in the fuzzy inference process that begins with matching.

Fuzzy Rules and Membership Functions
Fuzzy rules and membership functions are obtained by studying the plant dynamics (using modeling and simulation), based on these, a set of control rules that make sense are adopted. This makes fuzzy controller design subjective and dependent on expert designer (Passino and Yurkovich, 1998). In addition, the adaptable nature of a fuzzy controller makes a suitable candidate, since the mechanisms of DBS are still under debate. The control scheme uses a two-input one-output fuzzy controller. The inputs are PD state estimate and the rate of change in state. The state estimates and rate of change in state quantify the dynamics of the underlying process to enable control. State estimates are obtained using SVM and GMM. The output is the incremental stimulation frequency. Based on the contour plot of the state estimates in Figures 2C,D, and the contour plot depicting the effect of stimulation frequency in Figure 5, triangular membership functions were used for the inputs and output of the SVM driven approach. While for the GMM based approach, Gaussian functions were adopted. The rule table for the SVM-driven approach is shown in Table 1. It is obtained using the 3-D profile in Figure 7A representing the mapping between inputs (state estimates and change in state) and outputs (stimulation frequency). The input membership function for the rules in Table 1 are summarized in Figures 7A,B. While the membership functions for the output (incremental stimulation frequency) is summarized in Figure 7C.
For the GMM-based control approach, its rule table is shown in Table 2. It uses Gaussian membership functions. Its input membership functions for the rules in Table 2 are summarized in Figures 8A,B. While the membership functions for the output (incremental stimulation frequency) is shown in Figure 8C. The universe of discourse for the state estimates is [0, 1] as can be seen in Figure 7A and Figure 8A for the SVM and GMM, respectively. The input fuzzy sets for the SVM are represented by alphanumeric variables A 0 A 1 . . . A 10 , and that of the GMM is D 0 , D 1 . . . D 7 . This means for state estimates, the SVM driven approach has eleven fuzzy sets and the GMM driven approach has eight fuzzy sets. The membership functions for the SVM and GMM driven controllers are summarized in Figures 7,  8, respectively.
For the second input which is change in state, the fuzzy sets of the SVM driven approach are represented by alphanumeric variables B −4 . . . B 0 . . . B 4 , making a total of nine fuzzy sets. Their membership functions are summarized in Figure 7. From Figure 7, it can be seen that negative subscripts represent a change from one toward zero (PD to non-PD) and positive subscripts represent a change from zero toward one (non-PD to PD). This is the same for the change in state of the GMM-driven approach with fuzzy sets represented by alphanumeric variables E −4 . . . E 0 . . . E 4 , and their respective membership functions summarized in Figure 8. As summarized in Figures 7, 8, the universe of discourse for the SVM-driven approach is [−0.31, 0.31] s −1 and that of the GMM-driven approach is [−5, 5] s −1 . The fuzzy sets representing the output (incremental stimulation frequency) are labeled C −3 . . . C 3 , for the SVM approach and that of the GMM are labeled F −3 . . . F 3 . Like in the fuzzy sets for the change in state, the negative subscripts represent an output representing a reduction in stimulation frequency, while a positive subscript represents an output resulting in an increase in stimulation frequency. Both have a universe of discourse of [−60, 60] Hz. Based on heuristics, the SVM rule-table has an 11 × 9 array making a total of 99 possible rules, which are summarized in Table 1. For the GMM rule-table in Table 2, it is made up of an 8 × 9 array making a total of 72 possible rules.
The desired fuzzy set for the SVM driven approach is shaded in Figure 7. The desired fuzzy set for state estimates is between the intervals 0.15 and 0.35 (represented by A 2 in Figure 7A). This represents the modal class for non-PD cases. In terms of the change in state, the desired interval is between −0.1 s −1 and 0.1 s −1 (represented by B 0 in Figure 7B). The modal class interval for the state estimate (A 2 ) was made not to overlap with other classes to avoid ambiguity in fuzzy quantification. The outermost membership functions for the inputs can be seen to saturate and values outside the range are grouped to their closest fuzzy set. However, this is not the case for the output, due to  expert designer. That is why a wide desired range was selected in both approaches to ensure convergence. In addition, the selected range represents the modal state for stable and non-disease conditions when projected to the patient feature space, which could be demonstrative of symptom severity. The membership functions and fuzzy rules were defined carefully based on the gradation of the state estimates on the patients feature space. This was to enable a gradual and deliberate PD suppression as against abrupt and jerky response.

Inference Mechanism
The inference mechanism generally involves two steps: premise quantification and determining conclusions. Premise quantification compares the premise of all rules to the controller inputs to determine which rules are applicable to the current situation. It involves determining the certainty with which rules apply. The recommendations from rules that we are more certain with are adopted. Next is the determination of conclusions. This decides the control action to take using the applicable rules at the current time instant. The conclusions are characterized with a fuzzy set that represents the certainty with which the input should take various values. Premise quantification using the minimum f the applicable rules is adopted, while conclusion determination is obtained by ANDing the applicable rules.

Defuzzification
This is the final operation of the fuzzy controller. It operates on implied fuzzy sets (output fuzzy sets) produced by the inference mechanism. It combines the effects of the various fuzzy sets to produce the "most certain" controller output (plant output). Defuzzification can be considered as decoding. As the fuzzy sets produced by the inferencing process (implied fuzzy sets) is converted to numerical controller outputs. The center of gravity (COG) method for combining recommendations was adopted. More detail of defuzzication is given in Passino and Yurkovich (1998). From both Tables 1, 2, the pattern of rule consequents shows a certain symmetry. For states estimates approaching a state of 1 and having a positive rate of change in state (positive subscript i.e., moving from non-PD to PD), there is a positive increase in stimulation frequency (positive subscript). Similarly, for state estimates approaching 0 and having a negative rate of change in state (negative subscript i.e., moving from PD to non-PD), the incremental stimulation frequency is negative (negative subscript). Note, the diagonals of near zero for the incremental stimulation frequency from state A 0 to state A 6 , for the SVM and A 0 to state A 5 , for the GMM in Tables 1, 2, respectively.

PERFORMANCE EVALUATION PD Suppression
PD suppression is depicted in Figure 9 using the GMM and SVM driven approaches. From Figure 9A which depicts SVM state estimation, it can be seen that the test case travels from the PD region and converges at the non-PD region as desired. It is also the case for the GMM approach in Figure 7C, but with a smoother trajectory. Figures 9A,C show the feature space profile and Figures 9B,D display the time profile. For the time profile, it can be seen that both cases cross the desired interval exactly after 2 s and both present the same settling profile. After settling, the SVM based approach has a mean PD state of 0.3137 and GMM-driven approach has a PD state of 1.3 × 10 −2 , both of which fall within the desired range. The stimulation profile for both cases is shown in Figure 10. Both cases present almost the same stepwise pattern, with the SVM having a more gradual ascent to the required stimulation frequency compared to the GMM which overshoots before finally settling. The settling stimulation frequency for both cases are not far apart. The feature space profile on the feature space and the time profile (both in Figure 9) display a stable PD suppression profile. In addition, the stimulation profile in Figure 10 also displays a stable stimulation profile. Both of these are indicative of a stable PD suppression.
For the rest of the datasets, Table 3 summarize their mean PD state and settling time. For the mean PD state in Table 3, the SVM has a lower quartile of 0.2514 and an upper quartile of 0.316, which both fall within the desired range (0.15-0.35). For the GMM, it has an upper quartile of 0.085 and a lower quartile of 2.5 × 10 −4 , which are both within the desired range (1 × 10 −8 −0.1). For the settling times in Table 3, the SVM-driven approach has a median of 1.5 s, lower quartile of 1.25 s and an upper quartile of 1.875 s. While for the GMM, it has a median of 1.25 s, lower quartile of 0.25 and an upper quartile of 1.75 s. This shows that on average, the GMM based approach settles faster than the SVM based approach; however, the GMM has more variation in settling time as shown in Table 3.

Performance of State Estimators
To assess the quality of the SVM and GMM state estimators, the MCC and WCE which are skew insensitive measures were used. The MCC measured the correlation coefficient between the original dataset and the models fitted using each of the state estimators. On the other hand, the WCE consisted of weightings of type I and type II error. This was because in aDBS, high false positive-rate will result in administering stimulation when it is not required, and this may lead to stimulation induced side effects (Baizabal-Carvallo and Jankovic, 2016). High false-negative rate will result in the non-administering of stimulation when it may be required, which could worsen patient condition (Hacker et al., 2015). The real-time detection performance of the state estimator was investigated. Both models used 128 training examples and PD events were detected from 2 s overlapping epochs (with 50% overlap). For MCC in Table 4, both state estimators present a positive correlation for all datasets, with the SVM having a median of 1 and the GMM with a median of 0.9433. Of the 9 cases, both SVM and GMM have 7 cases with strong positive correlation (MCC ≥ 0.5). Only the state estimates of dataset G have a weak positive correlation in both cases. This is due to the high overlap between its PD and non-PD clusters which makes it difficult to fit the classifier to the data. From the MCC results, it can be seen that SVM fits the data better than the GMM. Similarly, the WCE results present a superior performance of the SVM over the GMM. The SVM presents a mean and median WCE of 9.03 and 0%, respectively. While the GMM presents a mean and median of 11 and 1.98%, respectively. This further confirms the superiority of the SVM over the GMM in fitting the data.

Relative Complexity
To ensure that the approach is effective for real LFP recordings, the semi-synthetic LFP were made from real LFP recordings to mimic PD progression in real LFP recordings. In addition, state estimators that are size and power conscious were implemented. Complexity estimates for both approaches were obtained using 128 training examples were assumed to be used with 8-bit quantization (GMM inputs to fuzzy controller were assumed to have 32-bit quantization due to their resolution requirements) and 10% of the training examples were assumed to be support vectors of the SVM. The relative complexity between the SVMdriven and GMM-driven approach for each of the two stages of the critic-actor control policy are shown in Figure 11. From Figure 11A, it can be seen that at the state estimation stage the SVM-driven approach requires more NOP, with the GMM approach requiring only about 5% SVM NOP. At the state estimation stage, computation in the GMM is dominated by memory while for the SVM it is dominated by NOP. This is because the GMM is a population dependent algorithm, while the SVM only uses the footprint from the population to infer properties. In Figure 11B, the GMM requires a higher NOP for fuzzy inferencing due to its adoption of Gaussian functions as against the triangular function used by the SVM-where triangular COG is simpler to calculate. In terms of memory the GMM requires fewer rules compared to the SVM. It is clear that in the state estimation stage the GMM has less computation and more memory, while at the fuzzy control stage the reverse is the case.

Critic-Actor Control Policy
The critic-actor approach models the relationship between the physician and the automated neuromodulation system. The critic like the "trained clinician" assesses the state of the system based on a cost function (in this case state estimates) and provides the information to the actor. The actor provides control signal based on evaluation from the "informed critic." In this configuration the state estimator is the critic, while the fuzzy controller is the actor. The main motivation for adopting the critic-actor control policy is because PD suppression can be extremely difficult to achieve due to the limited understanding of the mechanisms underlying PD. This makes it difficult to produce an accurate model that could be used for controller development. It is for this reason that more heuristic methods are proposed. The adaptive scheme exhibits the ability to restore patient LFP characteristics to PD-free conditions for different patients without a change in controller parameters. Changing conditions were monitored through the state estimates, which was the feedback signal. The feedback-loop consists of parkinsonian state (representing symptom severity) determination and stimulation facilitated by the fuzzy controller. The control signal modulates the spectral features to match PD free conditions of each individual patient. The resulting spectral features show that the adaptive scheme has the capacity to restore PD signals to their primary oscillations present under PD-free conditions. More so, using fuzzy inference mechanisms to quantify the dynamics of PD can be very intuitive for modulating therapy. Since it uses rule-based decision making that combines human heuristics into decision making; these rules could be updated into the controller as more knowledge regarding PD is acquired. Effective fuzzy control can only be achieved by adopting the right input pre-processing, in this case state estimates and their rate of change over time were chosen. In the future, external signals e.g., accelerometry activity can be incorporated to produce comprehensive rules that cover an increased number of possible situations. As things stand, optimal control can only be achieved by having a deeper understanding of the underlying mechanisms of DBS and PD-which is more of a clinical challenge. Ultimately, this tool could provide a paradigm on which stimulation can be adapted. The study provides a scheme in which DBS can be adapted using heuristics. To validate the efficacy of the approach, state estimates were obtained using both generative and discriminative machine learning models. Both showed promising results, which are attributable to their self-adjusting nature due to periodic training.

Model Limitations
At present, a model representing all possible dynamics is far from being realized because there is insufficient knowledge to produce models which closely represent the expected behavior of the system. This is why PD symptom severity is represented by the probability that a patient LFP signal is a PD condition. Apart from clinically sound PD state estimates, several other issues are necessary in order to achieve efficient PD onset control, such as optimal stimulation parameters and how they vary across patients and time. More specifically, the study focused on modulating DBS frequency, it is still under debate which of the parameters (stimulation intensity, pulse width and frequency) is the most beneficial. Nevertheless, controlling one of the parameters could shed more light on how best to control therapy. Currently, a number of assumptions regarding the effect of stimulation on neuronal signals are used to create a stimulation model that draws on the common denominator in all of the theories in Kiss et al. (2002), Carlson et al. (2010), and Chiken and Nambu (2016); which suggest a modulating effect on neuronal signals. This model could be improved if more detailed information on experimental LFP data consisting of stimulation parameters and PD symptom severity are obtained. Achieving significant progress in aDBS will depend on the correlation between patient state and LFP signal, as well as how stimulation modulates patient LFP. This would require a large set of LFP representing the effect of stimulation on the progression in PD symptoms for a wide range of patients. Presently, the major challenge in adaptive DBS is the difficulty in establishing a direct relationship between patient state and stimulation parameters. This is mainly due to the complexity of post-surgery programming of stimulation parameters by trained clinicians, which can take up to 6 months or more (Bronstein et al., 2011). Because of the limited availability of PD data incorporating the effects of stimulation, stimulation was modeled only by varying stimulation frequencies. This was chosen because stimulation frequency has proven to be more beneficial and reliable than other stimulation parameters (Birdno and Grill, 2008;Baker et al., 2011;Brocker et al., 2013;Su et al., 2018).
Finally, the control policy proposed tends to work better on cases with separable classes and clear states. A summary of the various transition profiles for PD suppression of datasets A to I is presented in the Supplementary Material of the paper. As presented in the Supplementary Material, for non-binary clusters (like the XOR classification problem) or binary clusters with large overlap, additional input information may be required to enable convergence. Convergence of the state estimates to the modal interval of the non-PD state can only be guaranteed for feature spaces with binary clusters and machine learning algorithms that produce an MCC >0.5.

CONCLUSION
The work provides theoretical evidence on the possibility of mitigating intractable Parkinsonism by adaptively regulating stimulation using recorded neurophysiological signals. It provides a framework for which if fine-tuned, could lead to the suppression of LFP characteristics in PD patients based on their state estimates (symptom severity) obtained using machine learning algorithms. The dynamic progression of neural signals in PD patients necessitated the adoption of machine learning models for tracking PD. The fuzzy control approach was adopted for computational efficiency and robustness to non-linearity. This was done with hardware implementation in mind, so that the architecture can be deployed in fully implantable aDBS systems that automatically adjust stimulation parameters in real-time in response to changes neurophysiological signals.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
AM performed simulations, analyzed the results, and wrote the manuscript. AD and RB supervised the work and reviewed the manuscript. All authors conceptualized the ideas and approved the manuscript.