Deep Brain Stimulation Does Not Modulate Auditory-Motor Integration of Speech in Parkinson's Disease

Deep brain stimulation (DBS) has significant effects on motor symptoms in Parkinson's disease (PD), but existing studies on the effect of DBS on speech are rather inconclusive. It is assumed that deficits in auditory-motor integration strongly contribute to Parkinsonian speech pathology. The aim of the present study was to assess whether subthalamic DBS can modulate these deficits. Twenty PD patients (15 male, 5 female; 62.4 ± 6.7 years) with subthalamic DBS were exposed to pitch-shifted acoustic feedback during vowel vocalization and subsequent listening. Voice and brain activity were measured ON and OFF stimulation using magnetoencephalography (MEG). Vocal responses and auditory evoked responses time locked to the onset of pitch-shifted feedback were examined. A positive correlation between vocal response magnitude and pitch variability was observed for both, stimulation ON and OFF (ON: r = 0.722, p < 0.001, OFF: r = 0.746, p < 0.001). However, no differences of vocal responses to pitch-shifted feedback between the stimulation conditions were found [t(19) = −0.245, p = 0.809, d = −0.055]. P200m amplitudes of event related fields (ERF) of left and right auditory cortex (AC) and superior temporal gyrus (STG) were significantly larger during listening [left AC P200m: F(1, 19) = 10.241, p = 0.005, f = 0.734; right STG P200m: F(1, 19) = 8.393, p = 0.009, f = 0.664]. Subthalamic DBS appears to have no substantial effect on vocal compensations, although it has been suggested that auditory-motor integration deficits contribute to higher vocal response magnitudes in pitch perturbation experiments with PD patients. Thus, DBS seems to be limited in modulating auditory-motor integration of speech in PD.


INTRODUCTION
Deep brain stimulation (DBS) is known to have strong beneficial effects on motor symptoms in Parkinson's disease (PD) (1,2). However, research on the effects of DBS on speech is rather inconclusive and the effects seem to critically depend on unidentified individual factors (3,4). Untreated, up to 90% of PD patients develop severe speech or swallowing difficulties in the later course of their disease (5). In particular, reduced voice volume (hypophonia) and monopitch (hypoprosodia) are typically part of speech characteristics in PD (6).
Sensorimotor deficits are thought to contribute to speech symptoms in PD such as hypophonia and hypoprosodia (7). Many PD patients tend to overestimate the volume of their own voice and therefore reduce their output volume (8,9). When the loudness of auditory feedback is changed, PD patients compensate with the amplitude of their voice substantially more strongly than healthy individuals (10). The reduced ability to modulate pitch represents a significant component of dysprosody and may be at least partly explained by sensorimotor deficits in auditory-motor integration as well (6,11).
To scrutinize the neural mechanisms underlying auditory-motor integration of speech, pitch perturbation experiments were used in earlier studies (10,12,13). In these experiments, auditory feedback is artificially pitch-shifted in real-time and a vocal compensation to these changes is provoked. PD patients compensated more strongly for pitch-shifted feedback than healthy individuals and their vocal response magnitudes correlated with the pitch variability of unaltered vowel vocalizations (12,13). This suggests that PD patients rely more on auditory feedback during speech production than healthy individuals, reflecting deficits of auditory-motor integration of speech in PD (13). Interestingly, the amplitude of the P200 event-related potential was larger for patients in this experiment. These P200 responses demonstrated a left-lateralized cortical activation pattern, including superior and inferior frontal gyrus (SFG/IFG), premotor cortex (PMC), inferior parietal lobule (IPL), and superior temporal gyrus (STG) (13).
Recent work suggested that the subthalamic nucleus (STN) is also involved in speech production. Increased activity in the gamma band in the STN was shown before speech onset and during articulation, similar to the sensorimotor cortex (14). In fact, DBS of the STN can alleviate specific speech symptoms like hypophonia and voice tremor (15). A trend toward increasing loudness and pitch variability of fluent speech could also be observed (3). Yet, in some patients DBS can also lead to speech deterioration based on perceptual ratings, acoustical measures of verbal fluency, as well as self-reported speech difficulties (16,17).
In the present study, we aimed at revealing how DBS of the STN influences the auditory-motor integration of speech. To this end, we studied behavioral and neurophysiological responses in a pitch perturbation experiment using magnetoencephalography (MEG). Due to deficits in auditory-motor integration, we expected PD patients to overestimate auditory feedback changes and therefore, to compensate strongly to pitch-shifted feedback (12,13), especially when DBS is turned OFF. If increased pitch variability during vocalizations is associated with deficits in auditory-motor integration (13), vocal response magnitudes will correlate with pitch variability. Previous work already demonstrated ameliorating tendencies of DBS on the acoustic measure of pitch variability, representing an improved modulation of pitch in fluent speech (3,4). Therefore, we expected that turning DBS ON would attenuate vocal response magnitudes, i.e., diminish vocal compensations toward similar magnitudes as in healthy individuals (4,13). Additionally, we hypothesized that P200m amplitudes would be reduced accordingly when the stimulation is turned ON as P200 amplitudes have been suggested to represent the neural correlate of auditory-motor integration deficits in speech perturbation experiments (13). Finally, we expected task specific differences of ERF amplitudes, that have been described earlier in pitch perturbation experiments (18,19).

Patients
Twenty German-speaking PD patients (15 male, 5 female; 62.4 ± 6.7 years) were recruited during their annual DBS control visit at the Center for Movement Disorders and Neuromodulation at the University Hospital Düsseldorf. The mean disease duration was 9.6 ± 4.4 years. All patients were right-handed as assessed by the Edinburgh Handedness Test (20). The patients were implanted with a DBS system targeting the STN 24.9 ± 21.3 months prior to testing and had a significant therapeutic effect with their respective clinically used monopolar stimulation settings regarding motor scores of the unified Parkinson's Disease rating scale (UPDRS III) (ON: 15 ± 6, p < 0.001 vs. OFF: 28 ± 12). UPDRS scores provided in Table 1 have been rated with monopolar stimulation settings as part of the clinical evaluation of the DBS control visit. To minimize DBS artifacts, stimulation was switched from monopolar to bipolar for MEG recordings (Table 1). These bipolar settings were installed and evaluated by a physician trained in DBS programming. Stimulation amplitudes were raised to values below individual side effect thresholds if the stimulation did not suppress motor symptoms sufficiently. To achieve an equivalent clinical effect and a similar volume of tissue activated (VTA) of bipolar stimulation compared to monopolar stimulation, an amplitude increase of about 30% has been suggested earlier (21-23). During MEG recordings patients were in their medication ON state. The levodopa equivalent daily doses (LEDD) can be found in Table 1. The study was approved by the local ethics committee (study number: 6211) and performed in accordance with the Declaration of Helsinki (24). All patients gave their prior written informed consent.

Procedure
After turning stimulation OFF for ≥30 min, patients were comfortably seated in the MEG scanner, asked to vocalize the German vowel "E" [e] (vocal task) and instructed to hold their tone irrespective of changes in the feedback. Each block consisted of 30 vocalizations. A visible count-down from 3 to 1, lasting 3 s in total, was displayed on a screen to prepare patients for vocalization. Afterwards, a blue circle containing the letter "E" appeared, indicating the beginning of the first vocalization period. The circle disappeared clockwise within 6 s. A white screen indicated a pause including the countdown for the next vocalization period. These pauses increased from 5 to 9 s to prevent vocal fatigue toward the end of the experiment. During each vocalization period, patients' voice was pitch-shifted downwards by 200 cents (two semitones) for 200 ms up to 6 times. The altered as well as the unaltered vocalizations were recorded. Afterwards, the recording, including the pitch-shifted sequences, was played back to the patient (listen task). The entire experiment was repeated after stimulation was turned ON again for ≥30 min. The time of each pitch shifting onset was saved in a separate audio file. Subsequently, this information was used to extract the time locked pitch response contours of each trial offline.

Apparatus
An optical microphone (Sennheiser MO 2000, Wedemark, Germany) was installed at a distance of 5 cm to the patients' mouth. After the signal was processed on the computer's built-in audio interface (SoundMX integrated Digital HD, Intel Corporation©, 64 bits; 33 MHz), a pitch-shifted signal was played back to the participant through insert earphones (ER-1, Etymotic Research Inc., Illinois, USA) via a mixing console (Behringer© XENYX 502 PA). The built-in audio interface had a hardware delay of 5.4 ms and recordings were sampled at 96 kHz. The audio system was calibrated, so that the feedback channel was more than 10 dB louder than the input channel of patients' voice (18). A dummy head microphone (Neumann KU100, Berlin, Germany), typically used for binaural audio recordings, was utilized to calibrate the system. Additionally, a visual presentation with the experimental instructions was executed on another computer independent from sound-processing and displayed using a rear projection system.

Pitch Shifting
The experiment required a small change in pitch of a voice signal in real-time without changing its intraspectral relations. Not changing these relations allows us to largely preserve the voice signal's natural sound, thereby avoiding possible dynamic artifacts and assuring that the patients still recognize their own voice. To this end, we employed a novel custom-made setup for real-time speech perturbation experiments (25). The speech signal was recorded online into a 4 s buffer pre-allocated to allow for the maximal duration of the pitch shift. During the pitch shifting, its play-back rate was reduced to 0.891 ≈ 2^(−200/1,200) using cubic interpolation, effectively lowering the signal's pitch by 200 cents. This modified signal replaced the live signal for 200 ms. It was cross-faded back to the live signal over a period of 100 ms. This simple method is easily replicable and adjustable in SuperCollider (26) using our source code, which is publicly available online under the GNU general public license, version 3 (https://github.com/musikinformatik/pspeech).
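The relation between a pitch shift in cents and the play-back rate described above can be illustrated with a short Python sketch. It is not taken from the published SuperCollider code; the function names `cents_to_rate` and `hz_to_cents` are hypothetical and only serve to show the arithmetic (1,200 cents correspond to one octave, i.e., a factor of 2).

```python
import math


def cents_to_rate(cents: float) -> float:
    """Play-back rate ratio that shifts pitch by `cents` (1,200 cents = one octave)."""
    return 2.0 ** (cents / 1200.0)


def hz_to_cents(f_hz: float, ref_hz: float) -> float:
    """Distance of f_hz from ref_hz on the cent scale."""
    return 1200.0 * math.log2(f_hz / ref_hz)


rate = cents_to_rate(-200.0)
print(round(rate, 3))  # 0.891

# A 200 Hz voice played back at this rate is heard 200 cents lower.
print(round(hz_to_cents(200.0 * rate, 200.0), 6))  # -200.0
```

A rate below 1 stretches the buffered signal in time, which is why the approach needs a pre-allocated buffer and is limited to short perturbations.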

MEG Acquisition
During the experiment, neuromagnetic activity was measured using MEG (Elekta Oy, Helsinki, Finland). Prior to this, patients' head shape and head position indicator coils (HPI) were digitized by means of a 3D-digitizer (Fastrak Digitizer, Polhemus©, Vermont, USA). Eye movements (EOG) and heart activity (ECG) were monitored throughout the measurements. Additionally, the audio signal played back to the patients was taken from the 2-track output of the mixing console to record it synchronously with the MEG data via one of the miscellaneous channels of the MEG system. Larynx accelerations were acquired the same way. These were measured with an MEG compatible accelerometer to monitor vocalization periods independently from sound (27). To mark pitch shifting events in the MEG data, we sent transistor-transistor logic (TTL) pulses via a parallel port from the experimental computer to the MEG acquisition computer. The pulses were generated with a simple shell script activated by SuperCollider. Each vocalization on- and offset as well as pitch shifting on- and offset was encoded as a specific TTL trigger pulse. Precision of TTL pulses was adjusted with the help of the miscellaneous channels of larynx acceleration monitoring and the original audio signal with a hardware induced jitter of ±1 ms. The combined audio delay of hardware and software components was measured with a SuperCollider script resulting in 9 ms delay. Taking the microphone as well as the insert earphones into account, the total delay amounted to 10 ms.

Vocal Response Analysis
To analyze vocal compensating responses to pitch-shifted feedback, we extracted the individual pitch contours of every patient's recording in PRAAT, a free computer software package for speech analysis in phonetics (28). The pitch contours were transferred to the cent scale (4). We then extracted the responses time locked to the pitch shifting onset, using voice fundamental frequency (f0) values 100 ms before and 500 ms after the onset. By rejecting trials with negative response magnitude values (following responses), we assured that only opposing, i.e., compensating responses, were considered for the response analysis (12). The mean downward response magnitudes were: OFF: 6.05 cents and ON: 8.77 cents. On average, 3.9% of trials in OFF and 3.8% of trials in ON were rejected due to following (downward) responses. We also excluded trials that were omitted in MEG preprocessing, so that the number of averaged trials was the same for vocal response and MEG analysis (Supplementary Table 1). We calculated the response magnitude by subtracting the mean baseline f0 (100 ms before pitch shifting onset) from the maximum f0 value in a time window of 100-300 ms after the pitch shifting onset (13). We made this calculation for each trial and averaged the response magnitudes. The standard deviation of f0 in the baseline period (see above) was calculated as a measure for pitch variability. Also, voice intensity and voice jitter (f0 cycle-to-cycle perturbation) were extracted from the vocalization sequences before pitch shifts to assess overall voice quality. Voice jitter was calculated as the average absolute difference between voice f0 of consecutive cycles (4). After extraction, all further processing was conducted in MATLAB (2018b, MathWorks Inc.).
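The per-trial calculation described above can be sketched in Python as follows. This is an illustration only, not the MATLAB code used in the study. Two points are assumptions: the cent scale is referenced to the mean baseline f0 of each trial, and the sample standard deviation is used for pitch variability. The function names `trial_measures` and `mean_compensation` are hypothetical.

```python
import math
from statistics import mean, stdev

BASELINE_MS = (-100, 0)   # baseline window before pitch-shift onset
RESPONSE_MS = (100, 300)  # window searched for the compensation peak


def trial_measures(times_ms, f0_hz):
    """Return (response magnitude, pitch variability) in cents for one trial."""
    base_hz = [f for t, f in zip(times_ms, f0_hz)
               if BASELINE_MS[0] <= t < BASELINE_MS[1]]
    ref_hz = mean(base_hz)  # assumed reference for the cent scale
    cents = [1200.0 * math.log2(f / ref_hz) for f in f0_hz]
    baseline = [c for t, c in zip(times_ms, cents)
                if BASELINE_MS[0] <= t < BASELINE_MS[1]]
    window = [c for t, c in zip(times_ms, cents)
              if RESPONSE_MS[0] <= t <= RESPONSE_MS[1]]
    magnitude = max(window) - mean(baseline)  # maximum f0 minus mean baseline f0
    variability = stdev(baseline)             # SD of baseline f0
    return magnitude, variability


def mean_compensation(trials):
    """Average magnitude over trials, rejecting following (negative) responses."""
    magnitudes = [trial_measures(t, f)[0] for t, f in trials]
    kept = [m for m in magnitudes if m >= 0]
    return mean(kept) if kept else float("nan")
```

Each trial is passed as a pair of equally long sequences: time in ms relative to the pitch shifting onset and f0 in Hz.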

MEG Data Analysis
MEG data was sampled at 1,000 Hz with a high-pass filter of 0.1 Hz and a low-pass filter of 330 Hz. The analyses were restricted to the 204 gradiometers of the MEG system. Data analysis was performed with Brainstorm (29), a documented and freely available toolbox for the analysis of brain signals (http://neuroimage.usc.edu/brainstorm). Event related fields (ERF) were previously demonstrated to be nearly unaffected by DBS induced artifacts, due to the fact that DBS pulses are not time-locked to the stimulus (30). Artifacts induced by the DBS hardware, however, might affect ERF (31). To minimize these artifacts, we pre-selected PD patients with DBS systems using only slightly or nonmagnetic hardware components at the skull, i.e., DBS systems by Abbott® with their low-iron extension cable or DBS systems by Boston Scientific® (Table 1). Indeed, in most cases, no signs of artifact contamination were visible after averaging and 40 Hz low-pass filtering at the channel level (Figure 1). Still, in about 5 patients there were focal low and high frequency artifacts in channels over the right side of the skull, even after averaging (Figure 2). Therefore, we worked with Linearly Constrained Minimum Variance (LCMV) beamforming, which was demonstrated to reduce artifacts caused by movements of the magnetic DBS hardware components (32). Even unfiltered, the obtained ERF from the source level presented no signs of artifact contamination after LCMV beamforming (Figure 2).
Signal space projection (SSP) as implemented in Brainstorm was used to eliminate cardiac artifacts. Then, we inspected and removed trials affected by eye movement, muscle, and sensor artifacts. Trials resulting in negative response magnitudes in the vocal response analysis were excluded (Supplementary Table 1). Clean trials were averaged and projected to the individual anatomical source level using LCMV beamformer (33). For source reconstruction, individual anatomical cortical surfaces were used and an overlapping sphere head model was constructed in Brainstorm. The anatomical surfaces were extracted from clinical MRIs, which every patient had received before DBS surgery, using Freesurfer (http://surfer.nmr.mgh.harvard.edu/). A z-score baseline normalization (−100 to −1 ms) was applied to the individual source level data. Baseline noise of averaged trials before source localization is given in Supplementary Table 2. Next, all individual source level data were projected to MNI space using Freesurfer's registered spheres (34).
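As a rough illustration of the z-score baseline normalization mentioned above, the following Python sketch rescales one source time series by the mean and standard deviation of its −100 to −1 ms baseline. It does not reproduce Brainstorm's implementation; the function name `zscore_baseline` and the use of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev


def zscore_baseline(times_ms, values, start_ms=-100, end_ms=-1):
    """Express a time series in units of its baseline standard deviation."""
    base = [v for t, v in zip(times_ms, values) if start_ms <= t <= end_ms]
    mu = mean(base)
    sigma = stdev(base)  # assumption: sample SD over the baseline window
    if sigma == 0:
        raise ValueError("baseline has zero variance; z-score is undefined")
    return [(v - mu) / sigma for v in values]
```

After this step, the baseline segment of every series has a mean of 0 and a standard deviation of 1, so amplitudes become comparable across patients and regions.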
The analysis focused on four regions of interest in the right and left hemisphere, previously described to relate to P200 changes in PD (13,19). To extract individual time series for each region, a principal component analysis (PCA) was conducted. The first principal component was selected, resulting in one time series for each patient and condition. Data was then low-pass filtered (40 Hz) and time series with positive N100m peaks were flipped. Thus, in every acoustically evoked field (AEF), the N100m was a negative peak. Sign flipping was necessary to deal with sign ambiguity of MEG data. Afterwards, we automatically detected the minima of N100m (100-200 ms) and P200m maxima (200-300 ms) and used these for statistical analysis.
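The sign flipping and automatic peak detection described above might look roughly like the following Python sketch. It covers only these two steps, not the PCA or the filtering. The flipping criterion (the largest absolute deflection in the N100m window being positive) and the handling of the shared 200 ms boundary are assumptions, and the function names are hypothetical.

```python
def flip_if_needed(times_ms, series):
    """Flip the series if its dominant deflection at 100-200 ms is positive."""
    window = [v for t, v in zip(times_ms, series) if 100 <= t < 200]
    dominant = max(window, key=abs)  # assumed criterion for a positive N100m
    return [-v for v in series] if dominant > 0 else list(series)


def detect_peaks(times_ms, series):
    """Return (latency in ms, amplitude) of the N100m minimum and P200m maximum."""
    n100_amp, n100_lat = min(
        (v, t) for t, v in zip(times_ms, series) if 100 <= t < 200)
    p200_amp, p200_lat = max(
        (v, t) for t, v in zip(times_ms, series) if 200 <= t <= 300)
    return {"N100m": (n100_lat, n100_amp), "P200m": (p200_lat, p200_amp)}
```

Calling `detect_peaks(times, flip_if_needed(times, series))` yields the same peaks for a series and its sign-inverted copy, which is the purpose of the flipping step.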
Statistics
SPSS (v.25.0) was used for the statistical analyses of both behavioral and neurophysiological data. To test for differences between ON and OFF stimulation, the magnitudes of vocal responses as well as voice jitter and voice intensity were subjected to paired t-tests. To explore the relation between vocal response magnitude and pitch variability, we calculated Pearson's r, separately for ON and OFF stimulation. Repeated-measures Analyses of Variance (RM-ANOVA) were conducted to analyze differences between ERF amplitudes and latencies (N100m and P200m). Here, task (vocal vs. listen) and stimulation condition (ON vs. OFF) were within-subject factors. Finally, we calculated Cohen's d_z for each t-test and the effect size f for each of the RM-ANOVAs using release 3.1.9.4 of G*Power (35).
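For readers who want to check the reported effect sizes, the following Python sketch uses the standard conversions from test statistics: d_z = t / sqrt(n) for a paired t-test and f = sqrt(eta_p² / (1 − eta_p²)) with partial eta squared derived from F and its degrees of freedom. Whether G*Power was configured in exactly this way is an assumption; the function names are hypothetical. The input values are taken from the abstract.

```python
import math


def cohens_dz_from_t(t: float, n: int) -> float:
    """Cohen's d_z for a paired t-test with n pairs."""
    return t / math.sqrt(n)


def cohens_f_from_F(F: float, df_effect: int, df_error: int) -> float:
    """Effect size f via partial eta squared computed from an F statistic."""
    eta_p2 = (F * df_effect) / (F * df_effect + df_error)
    return math.sqrt(eta_p2 / (1.0 - eta_p2))


# Vocal response ON vs. OFF: t(19) = -0.245 with 20 patients
print(round(cohens_dz_from_t(-0.245, 20), 3))    # -0.055

# Left AC P200m, main effect of task: F(1, 19) = 10.241
print(round(cohens_f_from_F(10.241, 1, 19), 3))  # 0.734
```

Both results agree with the values given in the abstract (d = −0.055 and f = 0.734), which suggests that the reported effect sizes follow these conversions.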

Behavioral Data
In Figures 3A,C

DISCUSSION
In this study, we investigated the effect of DBS on auditory-motor integration of speech. While we could not find an effect of subthalamic DBS on vocal compensation to pitch-shifted feedback, there was a positive correlation between vocal response magnitudes and pitch variability in both conditions.
In line with the behavioral findings, a difference between ERF amplitudes, comparing ON and OFF stimulation, was not observed. However, when looking at differences between vocalization and listening, amplitudes were larger and latencies shorter for listening over right and left AC and STG.

Auditory-Motor Integration Is Not Modulated by DBS
Analyzing voice recordings in the stimulation ON and OFF, we found vocal response magnitudes opposing the downward pitch-shifted feedback of about +24 cents, which is similar to results of earlier studies with this experimental design (12,13). In addition, we could replicate the positive correlation between vocal response magnitude and pitch variability (12,13). This means that the stronger a patient compensated for pitch-shifted feedback, the larger their own vocal pitch variability was. This observation tallies with earlier work and is probably related to deficits in the mechanisms of auditory-motor integration, as it was only observed in patients (12,13). Notably, the positive correlation of these two parameters, f0 response and f0 variability, was similar with DBS ON and OFF (Figure 3B). This suggests that the deficits underlying this relation were not modulated by subthalamic DBS. The fact that we could not find a difference between vocal responses in the stimulation ON vs. OFF supports this notion further. In a recent study, subthalamic DBS was shown to attenuate compensating vocal response magnitudes to pitch-shifted feedback and also improved voice jitter (4). However, these results were based on only 10 PD patients. Earlier work already suggested that DBS effects on acoustic parameters are highly individual (3). Thus, Skodda et al. could only find tendencies of amelioration of pitch variability and concluded that DBS effects on Parkinsonian speech differ considerably between patients. With the present findings based on 20 PD patients, we observed an effect on neither vocal nor neurophysiological responses. Additionally, we could not identify any clinical or acoustical parameter predicting individual performances. Thus, DBS might have critical limitations when it comes to influencing the modulation of speech in PD. One parameter, which we did not include in our analysis, however, is electrode placement. A recent study demonstrated that electrode placement in the anterior portion of the STN was associated with an improvement of voice-related outcomes in PD patients (36). Future studies investigating larger patient samples should assess whether differences in individual speech performance and modulation of speech can be explained by the electrode location.

Vocalization Induced Suppression
In accordance with our behavioral findings, we could not see a significant difference between ERF amplitudes, comparing ON and OFF stimulation. Still, the ERF amplitudes were larger in the listen task than in the vocal task over the right and left AC and STG (Figures 4, 5). These results seem to contradict earlier findings, where a so-called vocalization-induced enhancement of P200 amplitudes was reported for healthy individuals and was even augmented in PD (13,18). Within a previous EEG experiment, the P200 response for the vocalization task was increased over the Cz electrode (13). The P200 peak for the vocalization task was followed by a sustained amplitude plateau. This plateau might be interpreted as a P300 component combined with an enhanced P200 response. However, MEG normally fails to represent magnetic field P300 equivalents due to the deep localization of their generators (37). Indeed, an MEG study examining vocalization-induced enhancement in 11 healthy individuals could not find an enlargement of P200m amplitudes as clear as in the EEG experiment (19). To solve the issue of limited comparability between MEG and EEG findings, experiments focusing on late auditory potentials should probably rather be conducted with high-density EEG measurements or a combination of EEG and MEG.
Since we assessed responses to pitch changes in self-generated speech and P200 changes in PD relate to a left-lateralized network (13), we expected changes to be localized mainly to the left hemisphere. Indeed, amplitudes appeared to be higher in the left hemisphere (Figure 4). However, when comparing effect sizes of left and right STG, there is a stronger main effect of task (vocal vs. listen) for the right STG [right STG: P200m: F(1, 19) = 8.393, p = 0.009, f = 0.664; left STG: P200m: F(1, 19) = 5.758, p = 0.027, f = 0.551]. Additionally, there is robust evidence concerning vocalization-induced suppression, especially for N100m amplitudes, probably reflecting auditory cortex sensitivity to self-generated sounds (18,19,38). The right AC is known to be especially sensitive to the spectral dimension of sound (39). In line with these observations, N100m amplitudes were suppressed during vocalization at the right STG (Figure 4). Similarly, N100m latencies were longer during vocalization at left and right AC and STG as well as left PMC, which has been described before (13,38).

DBS Artifacts
Measuring brain activity during active DBS using MEG is an emerging field of research (40). As DBS-MEG recordings are associated with more or less severe artifacts, the use of artifact reduction methods is most often necessary (30,41). In case it is not necessary, however, these methods should not be applied because they bear the risk of altering brain signals, e.g., amplitude reduction (30). Here, we investigated ERF, which are comparably robust to DBS artifacts (Figure 1). Moreover, using LCMV beamforming, we reduced artifacts caused by the movement of ferromagnetic DBS components additionally (32) (Figure 2). Due to the fact that the source level DBS ON data revealed similar ERFs as DBS OFF data, we can assume that the stimulation artifact itself was sufficiently reduced with that approach. These findings might therefore facilitate and pave the way for further investigations on ERFs during DBS to better understand the cortical effects of DBS. The use of recent more noise-resistant SQUIDs in newer MEG systems might even further improve data quality in future combined MEG-DBS-studies.

CONCLUSION
Auditory-motor deficits play an important role for Parkinsonian speech pathology and are represented by strong pitch compensations to pitch-shifted auditory feedback correlating with pitch variability. Subthalamic DBS appears not to modulate these compensations in PD and therefore seems to have no substantial effect on the auditory-motor integration of speech. Moreover, we were able to demonstrate that it is possible to explore auditory ERFs in DBS patients using LCMV beamforming without additional artifact reduction methods.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Faculty's Ethics Committee, Heinrich-Heine University Düsseldorf, Germany. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
BB, JR, HK, AS, and MB contributed conception and design of the study. BB, EF, HK, JR, MB, and JH contributed to and developed the methodology and conducted parts of the formal and statistical analysis. RV developed the voice analysis scripts and wrote parts of the methods section. BB, AS, and MB recruited patients and conducted the measurements. BB and HK organized the database. BB wrote the first draft of the manuscript. MB, EF, JH, RV, and JR wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.
FUNDING
EF gratefully acknowledges support by the Volkswagen Foundation (Lichtenberg program 89387).