Spectral Resting-State EEG (rsEEG) in Chronic Aphasia Is Reliable, Sensitive, and Correlates With Functional Behavior

We investigated spectral resting-state EEG in persons with chronic stroke-induced aphasia to determine its reliability, sensitivity, and relationship to functional behaviors. Resting-state EEG has not yet been characterized in this population and was selected given the demonstrated potential of resting-state investigations using other neuroimaging techniques to guide clinical decision-making. Controls and persons with chronic stroke-induced aphasia completed two EEG recording sessions, separated by approximately 1 month, as well as behavioral assessments of language, sensorimotor, and cognitive domains. Power in the classic frequency bands (delta, theta, alpha, and beta) was examined via spectral analysis of resting-state EEG data. Results suggest that power in the theta, alpha, and beta bands is reliable for use as a repeated measure. Significantly greater theta and lower beta power was observed in persons with aphasia (PWAs) than controls. Finally, in PWAs theta power negatively correlated with performance on a discourse informativeness measure, while alpha and beta power positively correlated with performance on the same measure. This indicates that spectral rsEEG slowing observed in PWAs in the chronic stage is pathological and suggests a possible avenue for directly altering brain activation to improve behavioral function. Taken together, these results suggest that spectral resting-state EEG holds promise for sensitive measurement of functioning and change in persons with chronic aphasia. Future studies investigating the utility of these measures as biomarkers of frank or latent aphasic deficits and treatment response in chronic stroke-induced aphasia are warranted.


INTRODUCTION
For over two million adults living with aphasia in the United States, effective neurorehabilitation will be critical to successful return to everyday life pursuits. Resting-state EEG (rsEEG) holds potential as a measure of neurorehabilitation success because it is inexpensive, has few contra-indications for use, and is already widely used in hospitals. Further, data acquisition is straightforward for trained users, even in individuals with severe impairments, and commercial systems with data processing applications are already positioned in healthcare settings, making it highly feasible.
Resting-state assessments offer a simple complement to task-based paradigms in the identification of functional network connectivity. While task-based paradigms provide important insights into behaviors, they can be difficult to interpret when completed by individuals with an impairment in the behavior of interest (Price et al., 2006). For example, investigations of language in persons with aphasia (PWAs) often require participants to name pictures or match pictures to a word while undergoing fMRI. Because these individuals have difficulty processing and producing language, incorrect responses are common and must be accounted for in paradigm design and subsequent analyses. Typically, this means excluding incorrect trials from analysis, at the cost of power to detect effects. To offset this loss of statistical power, more trials can be included, but at the risk of increasing fatigue, frustration, and emotional distress. Further, the cognitive load of a given task will vary greatly across participants, depending on their level of deficit and residual abilities (Logan, 1985; Borghini et al., 2017). These differences in cognitive load mean that recruitment of brain regions for task completion will also vary greatly, further complicating inferences about task-based activation. Similarly, event-related potential (ERP) analyses are used to investigate patterns in the EEG signal during task completion, with the same interpretive difficulties observed as in task-based fMRI. Recently, resting-state fMRI (rsfMRI) has allowed investigation of brain function in the absence of task-based behavior (Veldsman et al., 2014), which has led to an improved understanding of how networks communicate at rest and when on-task (Klingbeil et al., 2019). It is reasonable to expect that spectral analysis of rsEEG may offer benefits similar to those of rsfMRI, enabling assessment of PWAs even when language deficits limit or prevent completion of tasks that rely on intact language systems.
Spectral rsEEG in the acute and sub-acute stages following stroke reveals increased low frequency (delta and theta) activity and reduced high frequency (alpha and beta) activity compared to controls (for a review, see Finnigan and van Putten, 2013). These differences between controls and PWAs seem to be related to functional outcomes as measured by general stroke scales (de Vos et al., 2008; Sheorajpanday et al., 2009; Dubovik et al., 2013; Nicolo et al., 2015), where increased low frequency activity and reduced high frequency activity are often associated with poorer outcomes. Only a few studies have examined spectral EEG in chronic stroke (Rozelle and Budzynski, 1995; Spironelli and Angrilli, 2009; Spironelli et al., 2013; Herron et al., 2014; Song et al., 2015). The utility and generalizability of these studies are limited due to their reliance on behavioral tasks, restricted spectral frequencies investigated, and restricted behavioral profiles (e.g., persons with Broca's aphasia only). While completing language tasks, including lexical judgment, rhyming judgment, semantic matching, and orthographic matching, 17 Italian-speaking PWAs (3 non-fluent, 13 fluent, 1 unclassified) demonstrated increased delta power in the left hemisphere compared to healthy controls (Spironelli and Angrilli, 2009). In a related study with an overlapping sample, Spironelli et al. (2013) reported that 11 Italian-speaking PWAs also had reduced high beta power (calculated by averaging activity in the 21-28 Hz range, the upper half of the classic beta frequency band) in the lesioned left hemisphere compared to controls.
While not always specific to PWAs, several investigations have examined changes in spectral EEG over time. Spectral task-based EEG in 11 German-speaking PWAs (five non-fluent, six fluent) demonstrated that decreased left hemisphere delta power corresponded to significant language recovery in the first year post stroke, but no changes in language recovery or delta power were seen in the second year post stroke (Hensel et al., 2004). In a similar study of cognitive function (measured by the MoCA) following mild right hemisphere stroke in 10 Serbian-speaking participants, Petrovic et al. (2017) reported that spectral rsEEG features measured approximately 10 days post-stroke did not fully resolve 1-1.5 years post-stroke, even when cognitive behavioral performance did. It was posited that these permanent changes in rsEEG were a result of neural adaptations to support cognition in the face of a lesion.
Some researchers have also examined spectral EEG before and after rehabilitation in sub-acute and chronic PWAs (Rozelle and Budzynski, 1995; Stojanovic et al., 2013). A single case study of an English-speaking person with chronic non-fluent aphasia investigated the use of spectral EEG as a biofeedback method to lower theta and increase beta activity (Rozelle and Budzynski, 1995). Following training, the participant demonstrated significantly reduced theta activity in spectral rsEEG, alongside behavioral improvements in speech, language, motor, mood, and cognitive domains. More recently, a study investigated aphasia treatment response in a sample of 32 Serbian-speaking PWAs (26 non-fluent, 6 fluent) using spectral rsEEG to compare the hemispheric and regional symmetry of delta, theta, alpha and beta power (Stojanovic et al., 2013). Prior to treatment, hemispheric and regional asymmetry were increased, and variability was decreased, in PWAs compared to 86 age- and sex-matched healthy controls. Following treatment, between-group differences were significantly decreased, driven primarily by the better responders to aphasia therapy.
Taken together, these studies provide preliminary evidence that spectral rsEEG may be useful for prognosis and measurement of treatment response. However, the psychometric properties of spectral rsEEG have not yet been characterized in PWAs, limiting the validity and accuracy of predictive inferences that can be made. A necessary step in the translation of spectral rsEEG measures from research to clinical practice is ensuring they possess adequate psychometric properties for use as repeated measures (i.e., treatment monitoring). Thus, it is paramount that the psychometric properties of spectral rsEEG are defined in PWAs, particularly in the chronic stage, as this has been understudied in comparison to the acute and sub-acute stages. Investigations of spectral EEG variability in adults have generally demonstrated good stability for healthy populations (Oken and Chiappa, 1988; Salinsky et al., 1991; Corsi-Cabrera et al., 1997; Kondacs and Szabó, 1999; Gudmundsson et al., 2007; Suarez-Revelo et al., 2015), though electrode montage (i.e., spatial arrangement of electrodes) can have a significant impact on reliability (Gudmundsson et al., 2007), warranting further investigation. Acceptable specificity and test-retest reliability have yet to be investigated or established for specific patient populations. Despite this very limited psychometric evidence to support the use of spectral EEG as a repeated measure, it is already being used in this manner in research (e.g., Rozelle and Budzynski, 1995; Hensel et al., 2004; Stojanovic et al., 2013; Wu et al., 2015; Petrovic et al., 2017). Defining specificity and reliability of spectral rsEEG measures utilizing varied electrode montages post-stroke is critical to ensuring appropriate application of these measures and preventing research waste (for a discussion of research waste as it pertains to biomarkers, see Ioannidis and Bossuyt, 2017).
The current study seeks to improve our understanding of brain activation changes and their potential as a biomarker (i.e., indicator of presence of aphasia, indicator of treatment response) in persons with chronic stroke-induced aphasia by repeated examination of the four most frequently used spectral EEG frequency bands during two rest conditions (eyes-open and eyes-closed). We focus our investigation on the chronic stage for two reasons. First, spectral rsEEG is understudied in the chronic phase compared to acute and sub-acute stages; and second, persons with chronic aphasia are likely to have more stable brain functions, making this the ideal population for investigating spectral rsEEG's appropriateness for repeated measurement. This study will provide confirmation of spectral rsEEG changes persisting into the chronic stage post-stroke and will have wide application to PWAs of varying severities, where accurate completion of language tasks may not be possible. We report results for varied montages given the reported differences in the reliability of montages, the use of varied montages in the literature, and the contrast between clinically focused research, which uses fewer electrodes (typically no more than 19), and laboratory-focused research, which uses large numbers of electrodes (64-256). The specific aims of this study are to:
1. Determine the reliability of spectral rsEEG measures in healthy controls and PWAs for four montages in order to establish suitability as a repeated measure.
2. Examine differences in spectral rsEEG between well-described samples of healthy controls and PWAs.
3. Examine relationships between spectral rsEEG measures and performance on behavioral measures.

Persons With Aphasia
Twenty-one PWAs were recruited into the study. Following inspection of EEG data, two participants were removed from analysis due to poor data quality, leaving 19 (seven females) PWAs (Table 1). Participants who had experienced multiple strokes (2-4) were included to ensure that the results are maximally applicable to the general aphasia rehabilitation population served by speech-language pathologists. Fifteen participants experienced left hemisphere stroke(s) while four experienced stroke(s) of mixed hemisphericity. Western Aphasia Battery-Revised (WAB-R; Kertesz, 2006) Aphasia Quotient (AQ) scores were used to classify participants into the following aphasia subtypes: 6 anomic, 3 conduction, 1 Wernicke's, 1 Broca's, 1 transcortical motor, and 7 who experienced left hemisphere strokes but tested above the WAB cut-off for clinical aphasia (not aphasic by WAB; NABW). PWAs NABW demonstrated communication deficits based on discourse and naming performance (described in detail below) and were therefore included in the study, consistent with previous research (Spironelli and Angrilli, 2009; Spironelli et al., 2013; Fromm et al., 2017; Dalton and Richardson, 2019). All PWAs were greater than 1-year post-stroke to ensure that spontaneous recovery was not a factor in the reliability analysis. As in previous studies (e.g., Finnigan et al., 2007, 2016; Leon-Carrion et al., 2009; Herron et al., 2014; Nicolo et al., 2015; Song et al., 2015; Wu et al., 2015; Gorišek et al., 2016), potential participants with a diagnosis of significant psychiatric mood disorders were excluded (including major depressive disorder or generalized anxiety disorder), but persons with self-reported symptoms of mild depression and anxiety with no clinical diagnosis were allowed to participate. All PWAs were right-handed prior to their stroke.
English was the primary language used by all participants at the time of testing, and had been for many years, according to responses on the Language Experience and Proficiency Questionnaire (LEAP-Q; Marian et al., 2007). Three participants with stroke reported speaking more than one language, and all but one participant (who reported Spanish as the first language) reported English as their first language. The average age of PWAs was 58.2 years (SD = 14.5). Average education was 15 years (SD = 3.1).

Healthy Controls
Twenty-six control participants were recruited into the study. Following inspection of EEG data, two participants were removed from analysis, leaving 24 (15 females) healthy, native English-speaking controls (Table 1). Participants were screened to ensure no history of neurological disease or injury that might affect brain function. Potential participants with a diagnosis of significant psychiatric mood disorders were excluded (including major depressive disorder or generalized anxiety disorder), but persons with self-reported symptoms of mild depression and anxiety with no clinical diagnosis were allowed to participate. All healthy control participants were right-handed, and the majority were monolingual; six reported speaking at least one other language in addition to English, but English was the first language for all control participants according to LEAP-Q responses (Marian et al., 2007). Healthy controls were matched to PWAs primarily on age, with years of education as a secondary matching variable. For one PWA, we were unable to match both age and education, due to the low level of education (seventh grade level) attained. The average age of control participants was 59.3 years (SD = 15.1 years). Control participants averaged 16.8 years of education (SD = 2.7).

Assessments
In order to fully describe the characteristics of the sample, all participants (PWAs and healthy controls) completed sensorimotor, cognitive, and speech-language testing (Table 2). Sensorimotor testing was included since many individuals with aphasia experience these deficits given the close proximity of primary and supplementary motor cortices to perisylvian language areas. Sensorimotor function was measured via an in-house battery that included sensation, proprioception, motricity, and fine motor coordination tasks. Cognitive function was assessed with the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS; Randolph, 1998) and the Wechsler Adult Intelligence Scales-Picture Completion subtest (WAIS-PC; Wechsler et al., 2008). Motor speech function was assessed using subtests of the Apraxia Battery for Adults-2 (ABA-2; Dabul, 2000). Processing and production of prosodic information in spoken language was assessed using the Aprosodia Battery (Ross and Monnot, 2011). Narrative abilities were assessed with the Discourse Production Test (DPT; Fromm et al., 2020), which requires participants to tell a story depicted in pictures, retell the story of Cinderella, and describe how to make a peanut-butter and jelly sandwich. Additional language measures were administered to PWAs only and included the WAB-R (Kertesz, 2006) to assess overall aphasia severity, the Boston Naming Test (BNT; Kaplan et al., 2001) to assess anomia severity, and a shortened version of the Discourse Comprehension Test (DCT; Brookshire and Nicholas, 1997). The DPT was scored using a main concept analysis (MCA; Nicholas and Brookshire, 1995), which evaluates how well persons communicate the gist, or essential elements, of a story.
MCA was scored using standardized checklists with accompanying normative data (Nicholas and Brookshire, 1995;Dalton, 2016, 2020;Dalton and Richardson, 2019); scoring involves comparing participant-produced utterances to the checklists to evaluate the accuracy and completeness of each utterance. For example, during a retelling of the Cinderella story, one of the main concepts is "Cinderella danced with the prince" (with three essential elements: 'Cinderella,' 'danced,' and 'with the prince'). If an individual attempted to produce this main concept and said, "Cinderella danced", it would be considered accurate but incomplete, because it is missing the essential element 'with the prince.' If an individual said "Cinderella walked with the prince" it would be considered inaccurate (because they dance, not walk) but complete, because all three essential elements are represented. A score is assigned for each main concept ranging from 0 for absent to 3 for accurate and complete, and then scores are summed to yield the MC composite score. In this investigation we focused on the MC composite score and the number of MCs coded as accurate and complete since they are the most straightforward to interpret. Using this system, we can sensitively assess the quality of discourse production in individuals across the impairment spectrum, from healthy controls (e.g., no history of stroke and aphasia) to profound aphasia (Dalton and Richardson, 2019; Dalton et al., 2020).

EEG Recording
EEG data was recorded from 64 active electrodes placed in an elastic cap according to the 10-10 International system of classification (Chatrian et al., 1985) which extends the 10-20 placement system for more dense electrode arrays. The ground electrode was located at Fpz (the default in the electrode caps used in this study) with the reference electrode at CPz. Blinks and eye movement were recorded via vertical electrooculography using paired electrodes placed above and below the left eye. Data were recorded on a BrainVision actiCHamp system with a 500 Hz sampling rate (corresponding to a data point recorded every 2 ms) and online bandpass filtering from 0.01-100 Hz. Participants were seated in front of a computer in a dimly lit room. rsEEG was recorded for 2 min with eyes-open and 2 min with eyes-closed. During the eyes-open recording, participants were asked to fixate on a white cross presented on a black background to limit eye movement artifacts.
Both eyes-open and eyes-closed conditions were used as there is inconsistency in the literature, with most reporting only the eyes-closed condition (e.g., Dubovik et al., 2013; Finnigan et al., 2016), others reporting only the eyes-open condition (e.g., Herron et al., 2014; Wu et al., 2015), and some reporting both conditions (e.g., Cuspineda et al., 2003; Stojanovic et al., 2013). In addition, research indicates that these conditions are not equivalent, even as individuals age, suggesting a need to determine psychometric properties for each condition separately (Barry et al., 2007; Barry and De Blasio, 2017).
Table 2 note: The value next to each task name represents the maximum score on the measure. Performance on each task is reported as mean (standard deviation). All effect sizes, except those for left hand function, left index/middle finger tap, and left foot tap, correspond to large effects. RBANS, Repeatable Battery for the Assessment of Neuropsychological Status; WAIS-PC, Wechsler Adult Intelligence Scales-Picture Completion; WAB-R-AQ, Western Aphasia Battery-Revised Aphasia Quotient; BNT, Boston Naming Test.

Data Processing
Standard offline pre-processing using BrainVision Analyzer 2.1 was conducted to ensure adequate data quality. First, noisy channels were identified through visual inspection and interpolated. Data were high (0.1 Hz) and low (50 Hz) pass filtered using infinite impulse response zero-phase shift Butterworth filters to minimize distortion and preserve phase information (Hamming, 1998;Oppenheim et al., 1999). After filtering, bad segments (i.e., muscle activity) were manually rejected and independent component analysis was conducted to remove eye movement artifacts (Makeig et al., 1996). Channels were re-referenced to an average reference and channel CPz was interpolated from the average referenced data. After the pre-processing steps above, the following steps were conducted to calculate absolute spectral power. First, data were epoched into 2048 ms bins with 1024 data points per epoch. Epochs with data values greater than ±100 microvolts and/or changes in value greater than ±25 microvolts were rejected. Second, data were subjected to a fast Fourier transform (FFT) with Hanning window and tapering at the beginning and end of the window totaling 10% of the epoch length.
The resolution of an FFT can be calculated as the sampling frequency divided by the number of data points in the epoch. With a sampling frequency of 500 Hz and 1024 data points in each epoch, this resulted in 0.49 Hz resolution, consistent with previous research (e.g., Hensel et al., 2004; Herron et al., 2014; Schleiger et al., 2014). Participants with fewer than 30 s of artifact-free data per condition were excluded from the analysis, resulting in the exclusion of data for two PWAs and two healthy controls. Third, the absolute sum of spectral power in the four classic frequency bands was calculated, accounting for the 0.49 Hz resolution of the FFT (delta: 0.98-2.93 Hz; theta: 3.91-6.84 Hz; alpha: 7.81-12.21 Hz; beta: 13.18-30.27 Hz). These frequency bands were selected based on the most commonly reported values in the articles we reviewed (e.g., Leon-Carrion et al., 2009; Herron et al., 2014). All measures were calculated for each electrode separately, then averaged across electrodes, a common approach in the literature (e.g., Cuspineda et al., 2003; Hensel et al., 2004; Finnigan et al., 2007; Song et al., 2015; Carrick et al., 2016).
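The arithmetic behind the frequency resolution, the band edges, and the epoch rejection criteria can be illustrated with a minimal Python sketch. This is not the BrainVision Analyzer pipeline used in the study: the function names are hypothetical, the `tol` parameter is an assumption that absorbs the two-decimal rounding of the reported band edges, and "changes in value" is interpreted here as the difference between adjacent samples, which the text does not specify.

```python
FS_HZ = 500       # sampling rate reported in the text
N_POINTS = 1024   # data points per epoch reported in the text

# Band edges in Hz, as reported in the text.
BANDS = {
    "delta": (0.98, 2.93),
    "theta": (3.91, 6.84),
    "alpha": (7.81, 12.21),
    "beta": (13.18, 30.27),
}


def frequency_resolution(fs_hz, n_points):
    """FFT bin spacing: sampling frequency divided by points per epoch."""
    return fs_hz / n_points


def band_bins(lo_hz, hi_hz, fs_hz=FS_HZ, n_points=N_POINTS, tol=0.005):
    """Indices of FFT bins whose frequency lies within [lo_hz, hi_hz].

    tol is an assumed tolerance for the rounding of the reported edges.
    """
    res = frequency_resolution(fs_hz, n_points)
    return [k for k in range(n_points // 2 + 1)
            if lo_hz - tol <= k * res <= hi_hz + tol]


def band_power(power_spectrum, lo_hz, hi_hz):
    """Absolute band power: sum of spectral power over the band's bins."""
    return sum(power_spectrum[k] for k in band_bins(lo_hz, hi_hz))


def is_bad_epoch(samples_uv, max_abs_uv=100.0, max_step_uv=25.0):
    """Flag an epoch with any |value| > max_abs_uv or any
    sample-to-sample change > max_step_uv (assumed interpretation)."""
    if any(abs(v) > max_abs_uv for v in samples_uv):
        return True
    return any(abs(b - a) > max_step_uv
               for a, b in zip(samples_uv, samples_uv[1:]))
```

Under these assumptions, each reported band edge falls on an integer multiple of the 500/1024 Hz bin spacing (for example, delta spans bins 2 through 6), and one bin is left unassigned between adjacent bands.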
Calculations were performed for the following anatomical electrode montages (see Figure 1): whole brain (all 64 electrodes), clinical (19 electrodes corresponding to 10-20 International system electrode locations; e.g., Finnigan et al., 2016), left hemisphere (excluding midline electrodes), and right hemisphere (excluding midline electrodes). For the left and right hemisphere montages, the average reference was calculated separately using only the electrodes in that hemisphere to ensure that only activation in the hemisphere of interest was included. Previous research has reported on hemispheric comparisons in both healthy controls and individuals poststroke, indicating a need for information regarding the stability and reliability of such measures (e.g., Hensel et al., 2004; Spironelli and Angrilli, 2009; Spironelli et al., 2013; Herron et al., 2014; Petrovic et al., 2017). Also, neuroimaging studies have long utilized hemispheric comparisons and/or examined hemispheric differences and related behaviors (e.g., French and Beaumont, 1984; Bolduc et al., 2003; Szaflarski et al., 2006; Learmonth et al., 2017; Othman et al., 2020). Hemispheric comparisons are frequent in the aphasia literature where lesions affect a behavior (e.g., language) that is typically strongly lateralized to the left hemisphere (e.g., Gow and Ahlfors, 2017; Piai et al., 2017; Sandberg, 2017; Wilson et al., 2019). By providing hemispheric results in this investigation, we pave the road for future investigations that seek to compare individuals with left and right hemisphere strokes, or to compare individuals with either left or right hemisphere strokes to healthy controls or other clinical populations.
FIGURE 1 | All labeled electrodes were included in the whole brain montage. Red circles show electrodes included in the clinical (10-20) montage, and blue boxes show electrodes included in the left and right hemisphere montages.
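As a rough illustration of how the four montages partition the electrodes, the following Python sketch groups channels by the 10-10 naming convention, in which odd-numbered labels lie over the left hemisphere, even-numbered labels over the right, and labels ending in "z" on the midline. The function names and the example channel labels are hypothetical, the clinical list uses the 10-10 names T7, T8, P7, and P8 for the sites called T3, T4, T5, and T6 in older 10-20 nomenclature, and the separate per-hemisphere average reference described above is not reproduced here.

```python
import re

# The 19 scalp sites of the 10-20 system, written with 10-10 labels.
CLINICAL_19 = [
    "Fp1", "Fp2", "F7", "F3", "Fz", "F4", "F8",
    "T7", "C3", "Cz", "C4", "T8",
    "P7", "P3", "Pz", "P4", "P8", "O1", "O2",
]


def hemisphere(label):
    """Classify a 10-10 channel label as 'left', 'right', or 'midline'."""
    if label.lower().endswith("z"):
        return "midline"
    match = re.search(r"(\d+)$", label)
    if match is None:
        raise ValueError(f"unrecognized channel label: {label}")
    return "left" if int(match.group(1)) % 2 == 1 else "right"


def build_montages(labels):
    """Group the recorded channel labels into the four montages."""
    return {
        "whole_brain": list(labels),
        "clinical": [ch for ch in labels if ch in CLINICAL_19],
        "left": [ch for ch in labels if hemisphere(ch) == "left"],
        "right": [ch for ch in labels if hemisphere(ch) == "right"],
    }


def montage_mean(power_by_channel, channels):
    """Average per-electrode band power across the electrodes in a montage."""
    values = [power_by_channel[ch] for ch in channels]
    return sum(values) / len(values)
```

For example, `build_montages(["F3", "Fz", "F4", "AF7"])` would place F3 and AF7 in the left montage, F4 in the right montage, and F3, Fz, and F4 in the clinical montage.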

Data Analysis
All statistical analyses were conducted using SPSS v27. The reliability of spectral rsEEG absolute power was calculated between sessions one and two via single-measure intra-class correlations using a two-way mixed effect model with absolute agreement (ICC; Shrout and Fleiss, 1979;McGraw and Wong, 1996;Suarez-Revelo et al., 2015;Koo and Li, 2016). ICCs are widely used to evaluate the psychometric properties of newly developed assessment instruments. The ICCs conducted here assessed the exactness of the match between, for example, whole brain relative delta power at session one and whole brain relative delta power at session two during eyes-closed rest. The closer these values are to each other, the stronger the correlation, and the more stable the measure over time. Only the point estimates are reported in the text; however, 95% confidence intervals are reported in tables to allow readers a more nuanced interpretation of reliability. ICCs were identified as poor (<0.5), moderate (≥0.5 and <0.75), good (≥0.75 and <0.9), or excellent (≥0.9) (Koo and Li, 2016).
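For readers who wish to see how a single-measure, absolute-agreement ICC is obtained from two sessions, a minimal Python sketch follows. It uses the mean-square formulation of ICC(A,1) given by McGraw and Wong (1996); it is not the SPSS routine used in this study, the function names are hypothetical, and confidence intervals are not computed.

```python
def icc_a1(scores):
    """Single-measure, absolute-agreement ICC from a two-way layout.

    scores: one row per participant, one column per session.
    """
    n = len(scores)      # participants
    k = len(scores[0])   # sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Mean squares for participants (rows), sessions (columns), and error.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def classify_icc(icc):
    """Qualitative labels following the Koo and Li (2016) cut-offs."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.9:
        return "good"
    return "excellent"
```

Because this form penalizes systematic shifts between sessions, hypothetical scores of `[[1, 2], [2, 3], [3, 4]]` (every participant one unit higher at session two) yield an ICC of 2/3 rather than 1, even though the rank order is perfectly preserved.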
Before examining between-group differences, descriptive statistics (mean, median, standard deviation, range, skew, and kurtosis) were calculated for both groups; normality was assessed using skew and kurtosis. Student's t-tests to compare differences between groups were planned. While many variables reported here violate the assumption of normality, previous research has shown parametric statistics such as the t-test to be robust to violations of normality (using Bradley's [1978] definition of robustness where deviation from p = 0.05 is ∼ ±0.005). Simulation studies have demonstrated the robustness of the t-test in the face of non-normal distributions when the absolute value of skew is less than 2, and the absolute value of kurtosis is less than 9 (Boneau, 1960; Posten, 1978; Bradley, 1982; Schmider et al., 2010). Data with skew or kurtosis outside the range for which t-tests are robust were assessed using the Mann-Whitney U test, consistent with previous research (Hensel et al., 2004; Nolfe et al., 2006; Sheorajpanday et al., 2009; Herron et al., 2014). Homogeneity of variance was assessed using Levene's test. For variables that violated the homogeneity of variance assumption, Welch's t-tests were used (Ruxton, 2006). Between-group comparisons were conducted using the data from the first recording session only. Effect size calculations (Hedges' g for t-tests and η² for Mann-Whitney U tests) were conducted for all comparisons, and medium to large effects are reported. Hedges' g and η² have different ranges of possible values, so for ease of interpretation, both the numeric value and commonly accepted estimates of effect size (small, medium, or large) are reported (Cohen, 1988). Holm-Bonferroni corrections for multiple comparisons (Holm, 1979) were used to control type I error.
We selected the Holm-Bonferroni approach over the more widely used Bonferroni technique as the Bonferroni tends to inflate Type II errors, while the Holm-Bonferroni approach better balances Type I and Type II error (Aickin and Gensler, 1996), and will be more sensitive to group differences.
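The step-down logic of the Holm-Bonferroni procedure (Holm, 1979) can be sketched in a few lines of Python, shown next to the single-step Bonferroni rule for comparison. The function names are hypothetical and the p-values in the usage note are invented for illustration; they are not results from this study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a reject/retain decision for each p-value (input order).

    The smallest p-value is tested at alpha/m, the next at alpha/(m-1),
    and so on; testing stops at the first non-significant result.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


def bonferroni(p_values, alpha=0.05):
    """Single-step Bonferroni: every p-value is tested at alpha/m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]
```

With the illustrative p-values `[0.04, 0.005, 0.02, 0.015]`, `holm_bonferroni` rejects all four null hypotheses, whereas `bonferroni` rejects only the one with p = 0.005. Both procedures control the family-wise error rate at the chosen alpha.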
Finally, an analysis of correlations between spectral rsEEG measures and performance on behavioral tasks in PWAs was conducted. Correlations were calculated to identify the relationship between spectral rsEEG measures and performance on cognitive (RBANS index score), and language (WAB-R AQ, number of correct items named on BNT, and production of Main Concepts [MCs] during storytelling) assessments. For all behavioral measures, a lower score represents poorer performance.

RESULTS
Independent samples t-tests were conducted to compare age and education between the two groups to ensure adequate matching. No significant difference was observed between the two groups for age (t = 0.239; p = 0.812). A significant difference was observed between groups on education (t = 2.081; p = 0.044), driven by the PWA with a seventh grade education. When this individual was removed from the PWA group, the t-test for education was no longer significant (t = 1.715; p = 0.094).

Aim 1 -Reliability
Reliability was examined for each of the four montages (whole brain, clinical, left hemisphere, right hemisphere) in each spectral frequency band (delta, theta, alpha, beta). This resulted in the calculation of ICCs for 16 montage + spectral band combinations.

Eyes-Open Rest
Better intra-individual reliability was observed for PWAs than controls during eyes-open rest (Table 3 and Figure 2). PWAs demonstrated moderate-good reliability in all montage + spectral band combinations (good reliability in 9/16 and moderate reliability in 7/16), while healthy controls demonstrated moderate-good reliability in 14/16 montage + spectral band combinations (good reliability in 4/16 and moderate reliability in 10/16). Healthy controls demonstrated poor reliability for delta and theta power in the right hemisphere. For PWAs, the highest reliability was observed with theta and alpha power, then beta power, with the lowest reliability in delta power. In healthy controls the highest reliability was observed with beta power, followed by alpha, then delta and finally theta power.
Table 3 note: The average reliability for electrode montages and frequency band is reported in the columns and rows labeled "average". Cells are color-coded according to the guidelines reported by Koo and Li (2016) as poor (red), moderate (yellow), and good (green).

Eyes-Closed Rest
Reliability in the eyes-closed condition was similar across groups, with better reliability for PWAs than controls (Table 3 and Figure 3). For PWAs, moderate-good reliability was demonstrated in 15/16 montage + spectral band combinations (good reliability in 7/16 and moderate reliability in 8/16). Controls also demonstrated moderate-good reliability in 15/16 montage + spectral band combinations (good reliability in 5/16 and moderate reliability in 10/16). Both groups demonstrated poor reliability of delta power in the left hemisphere. For PWAs, highest reliability was observed for theta and alpha power, then beta, and finally delta. For healthy controls, highest reliability was observed for theta power, followed by alpha and beta, and finally delta power.

Aim 2 -Group Differences
A repeated measures ANOVA was conducted to investigate potential differences in spectral rsEEG over time. The results of this analysis revealed that neither the main effect [F(1,1278) = 0.130, p = 0.718] nor the interaction term [F(1,1278) = 1.267, p = 0.261] were statistically significant, indicating that spectral power in the first and second session did not significantly differ. For this reason, only data from the first session was used in the following analyses. Descriptive statistics for both groups are reported in Table 4, with boxplots providing a visual representation of the data in Figure 4. Examination of skew and kurtosis values indicated that parametric t-tests were acceptable for all montages and frequencies except alpha power during the eyes-open condition (please see Supplementary Table 2 for skew and kurtosis values for all measures). Therefore, Mann-Whitney U tests were conducted for alpha power comparisons.
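The rule used to choose between parametric and non-parametric group comparisons can be summarized in a short Python sketch. The thresholds (|skew| < 2, |kurtosis| < 9) come from the Data Analysis section; everything else is an assumption made for illustration: the function names are hypothetical, skew and kurtosis are computed as simple moment-based estimates (excess kurtosis) rather than the bias-corrected statistics that SPSS reports, each group is checked separately, and the outcome of Levene's test is passed in as a flag rather than computed.

```python
def skew_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis of a sample.

    Assumes the sample has non-zero variance.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def choose_test(group_a, group_b, equal_variances=True,
                max_skew=2.0, max_kurtosis=9.0):
    """Name the between-group test implied by the decision rule."""
    for group in (group_a, group_b):
        skew, kurt = skew_and_kurtosis(group)
        if abs(skew) >= max_skew or abs(kurt) >= max_kurtosis:
            return "mann-whitney"
    return "student-t" if equal_variances else "welch-t"
```

In this sketch a single extreme value in an otherwise tightly clustered group is enough to push skew past 2 and route the comparison to the Mann-Whitney U test.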

Eyes-Open Rest
Significant differences between controls and PWAs were observed for beta power in the whole brain (t = 2.818; p = 0.007), clinical (t = 2.729; p = 0.009), and left hemisphere (t = 2.968; p = 0.005) montages, and for delta power in the right hemisphere (t = 2.97; p = 0.005) montage. No significant between-group differences were observed in alpha or theta power, or for delta power in the remaining montages. For all significant comparisons, healthy controls demonstrated greater average power than PWAs. See Table 5 for complete results.

Eyes-Closed Rest
Significant differences between controls and PWAs were observed for beta (t between 2.44 and 4.529; p between 0.019 and <0.001) and theta (t between −2.928 and −3.122; p between 0.011 and 0.004) power in all montages. No significant between-group differences were observed for alpha or delta power. Compared to PWAs, healthy controls demonstrated greater beta power and lower theta power. See Table 5 for complete results.

Aim 3 - Functional Relationship
Prior to computing correlations, data were checked for normality and linearity. Due to observations of non-normality in the data, non-parametric Spearman correlations were calculated. Additionally, when plotting data to determine linearity, both the BNT and WAB-R AQ (completed by PWAs only) were determined to be non-linearly related to rsEEG power, so correlations were only computed for the RBANS and Main Concept (MC) scores.
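For readers unfamiliar with the statistic, Spearman's rho is the Pearson correlation computed on the ranks of the data, which is why it suits non-normal measures. The sketch below illustrates the computation in plain Python, with tied values assigned the average of their ranks. It is not the authors' analysis code; the function names and the example values (labeled as theta power and MC scores) are made up for illustration, and no p-value is computed.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving ties the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined if either input is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values, for illustration only (not the study's data).
theta_power = [0.9, 1.4, 2.0, 2.7, 3.5]
mc_scores = [48, 45, 30, 31, 12]
print(round(spearman_rho(theta_power, mc_scores), 2))
```

Because only ranks enter the calculation, any monotonic but non-linear relationship still yields a rho of 1 or −1, whereas relationships that are not monotonic (as noted above for the BNT and WAB-R AQ) are not well summarized by a single coefficient.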

Eyes-Open Rest
Moderate to strong negative correlations were observed between MC scores and theta power in the whole brain (

Eyes-Closed Rest
Strong negative correlations were observed between MC scores and theta power in the whole brain (rho = −0.53, p = 0.02), clinical (rho = −0.53, p = 0.02), and left hemisphere (rho = −0.55, p = 0.014) montages in PWAs. Positive correlations were also observed between MC scores and alpha (rho = 0.59, p = 0.008) and beta power (rho = 0.48, p = 0.039) in the left hemisphere in PWAs. In healthy controls, strong positive correlations were observed between RBANS scores and theta power in all montages (rho between 0.56 and 0.6, p between 0.002 and 0.005).

Figure 4 caption (partial): Gray bars indicate comparisons that were not significant after corrections for multiple comparisons but that had medium to large effect sizes (corresponding to italicized cells in Table 5). In a boxplot, data are split into quartiles and the figure attributes are as follows: the top of the box represents the third quartile; the bottom of the box represents the first quartile; the length of the box from the top to the bottom represents the interquartile range; the horizontal line within the box is the median of the dataset; the upper whisker is the line from the top of the box to the maximum value, or in the presence of an outlier, to 1.5 times the interquartile range above the third quartile; the lower whisker is the line from the bottom of the box to the minimum value, or in the presence of an outlier, to 1.5 times the interquartile range below the first quartile; circles and stars represent outliers, defined as any point beyond 1.5 times the interquartile range either above the third quartile or below the first quartile.

Table 5 note: Comparisons that survived correction for multiple comparisons are bolded. Comparisons that were not statistically significant but showed a medium or large effect size are italicized. For the Eyes Open Alpha comparison only, we report the Z statistic in addition to the Mann-Whitney U test statistic to allow a more direct comparison to the Eyes Closed Alpha condition t-test statistic results.

DISCUSSION
The results of this study suggest that spectral rsEEG is suitably reliable for use as a repeated measure in both controls and PWAs. Additionally, spectral rsEEG is sensitive to differences between controls and PWAs. Finally, spectral rsEEG measures were also related to behavioral measures of cognition and language.

Reliability of Spectral rsEEG for Repeated Measurement
To our knowledge, this is the first study to examine the reliability of spectral rsEEG in persons with chronic stroke-induced aphasia, an important step toward ensuring these measures are viable for use as neurophysiological biomarkers of treatment response. We selected an approximately 1-month delay between the first and second recording sessions since many aphasia research studies involving a treatment component last approximately 1 month from pre-treatment assessment to post-treatment assessment. Our results demonstrated moderate to good reliability for healthy controls and PWAs in alpha, beta, and theta bands, with relatively poorer reliability in the delta band. Previous research has examined changes in spectral rsEEG over time as a response to treatment (e.g., Rozelle and Budzynski, 1995; Stojanovic et al., 2013). Our results support the use of spectral rsEEG in this manner and strengthen the inferences that can be drawn from such studies. Both groups demonstrated similar spectral rsEEG reliability during the eyes-open and eyes-closed rest conditions, suggesting that either condition may be used for repeated measures, although eyes-closed rest demonstrated slightly better reliability across the board. However, our findings of the overall poor reliability of delta power suggest that researchers should proceed with caution when delta is the spectral band of greatest interest, as may often be the case given that it is frequently the focus in acute and sub-acute studies of stroke recovery. When examining the overall reliability of the various montages, there was not a clear pattern of better reliability in one montage over the others in either condition. However, the left and right hemisphere montage + spectral band combinations were the only ones that demonstrated poor reliability (i.e., left hemisphere delta in the eyes-closed condition for controls and PWAs, right hemisphere delta and theta in the eyes-open condition for controls). This suggests that researchers can have greater confidence in the reliability of results when the montage includes electrodes across both the left and right hemispheres, and that additional caution should be taken when examining smaller montages that do not include representation over the whole scalp. Future investigations should continue to explore the impact of montage on reliability, particularly as this sample comprised mostly individuals with single-hemisphere lesions. However, these results suggest that more clinically focused montages with fewer electrodes should not have significantly worse reliability than the dense electrode arrays more frequently used in research.
When considering the reliability for each group, PWAs demonstrated a pattern of numerically better reliability, especially in the eyes-open condition. It is not immediately clear what is driving this pattern, as our naïve assumption entering into the study was that controls would demonstrate better reliability, given the well-known behavioral variability of PWAs. Although speculative, one possible explanation of this result is that the increased variability in controls is indicative of a system with more flexibility and greater capacity. On the other hand, the decreased variability in PWAs could be indicative of a system that, due to damage, is less able to respond flexibly or has limited capacity. There is some limited preliminary evidence for this in traumatic brain injury during task completion (see Beharelle et al., 2012); however, additional research would be needed to determine if this holds for resting-state activation and in PWAs.

Utility of Spectral rsEEG to Identify Group Differences
In this study, we observed significant differences between controls and PWAs in beta power (in both conditions) and theta power (in the eyes-closed condition only). We attribute the absence of a significant theta difference in the eyes-open condition to the improved signal-to-noise ratio of the eyes-closed condition, which may have improved statistical power to detect differences; the eyes-closed condition may therefore have greater clinical and research utility. As expected from research in the acute and sub-acute stages post-stroke, healthy controls demonstrated a pattern of greater power in high frequencies and lower power in low frequencies than PWAs (but see our discussion of delta power below). These results indicate that some degree of the spectral rsEEG slowing observed in the acute and sub-acute stages post-stroke persists into the chronic phase, at least for persons with chronic aphasia. However, our results revealed significant between-group differences in theta power, while much of the previous acute and sub-acute literature has reported significant between-group differences in delta power. While studies examining spectral EEG in the chronic phase have utilized task-based paradigms, thereby limiting the comparisons that can be made to our findings, those investigations did report differences in delta, theta, and beta bands, mostly consistent with our results (Spironelli and Angrilli, 2009; Spironelli et al., 2013; Herron et al., 2014).
Interestingly, while we did observe a single significant difference in delta power between the groups (right hemisphere delta in the eyes-open condition), the direction of that difference ran contrary to previous literature, with healthy controls demonstrating higher delta power than PWAs in the right hemisphere. Indeed, when examining the descriptive statistics of delta power, healthy controls demonstrated numerically greater delta power across all montages. One possible explanation for this finding is body positioning, which has demonstrated effects on cortical activation (e.g., Spironelli et al., 2016; Thibault and Raz, 2016): the supine position (lying down) is associated with increased delta activation compared to upright positions. Given that much of the previous research has been conducted in settings where patients are often reclined or lying down in bed, but our study only included participants who were sitting upright in a chair, it is possible that the positioning differences between this study and previous research at least partially explain this dissimilar finding.
Additionally, since most studies have examined persons post-stroke in the acute and sub-acute, rather than chronic, phase, it is possible that the slowing of spectral rsEEG resolves to a certain extent over time. This is consistent with Hensel et al. (2004), who conducted a longitudinal study of post-stroke recovery and reported a gradual increase in the frequency of spectral rsEEG over the course of 2 years. It is also consistent with patterns observed via fMRI, where the locus of activation during language tasks shifts from the right hemisphere in the acute stage back toward the left hemisphere in the sub-acute and chronic stages (Heiss et al., 1999; Saur et al., 2006). Another possible explanation lies in the sample characteristics of participants in our study (e.g., specifically investigating aphasia, including persons with multiple strokes, mild depression or anxiety, and a realistic range of control performance). Still, while our more lenient inclusion/exclusion criteria may have increased the variability of the sample, the ability to directly compare our results to the typical clinical population served by practicing speech-language pathologists, and to a more realistic cross-section of the population, compensates for the loss of specificity.

Relationship of Spectral rsEEG Power With Behavior
We investigated the correlation between spectral rsEEG measures and performance on behavioral tasks, as the relationship between spectral rsEEG and behavioral performance has been frequently reported in the acute and sub-acute spectral rsEEG literature. Our results provide an important assurance that spectral rsEEG measures are not only reliable over time, but that the differences observed between PWAs and controls relate to functionally relevant behaviors. Results also support the notion of a continued relationship between spectral rsEEG and behavioral function in the chronic phase of stroke recovery. Of greatest interest, we found that the main concept (MC) composite score, a measure of discourse informativeness, was positively correlated with power in high frequency bands, and negatively correlated with power in low frequency bands. These findings indicate that the spectral rsEEG slowing observed in PWAs in the chronic stage is pathological and suggest a possible avenue for directly altering brain activation to improve behavioral function. However, additional research is needed to fully elucidate the relationship between spectral rsEEG and the behaviors included here. While it may be tempting to interpret these results in the context of spectral task-based EEG findings (e.g., cortical inhibition and/or activation variously associated with the spectral bands; Engel and Fries, 2010; Palva and Palva, 2011; Cavanagh and Frank, 2014; Cavanagh, 2015; Antzoulatos and Miller, 2016; Richter et al., 2017), these associations do not necessarily hold for resting-state measurements. Therefore, cautious interpretation of the mechanistic processes inferred from spectral bands at rest is recommended.
Our finding of a strong positive correlation between theta power and RBANS scores in healthy controls was relatively unexpected. This result suggests there might be a range of resting theta power that is functionally appropriate, and within this range, increased theta is generally associated with better cognition. However, once this range of theta values is exceeded (as in persons with stroke), higher resting theta power may be functionally maladaptive. This interpretation is supported by the fact that higher resting-state theta power is associated with better cognitive function in healthy aging (Cummins and Finnigan, 2007) and when comparing young adults to older adults (Finnigan and Robertson, 2011).
These results also highlight the importance of the behavioral measures selected to quantify functional abilities. While the WAB-R is a standardized and norm-referenced assessment, it demonstrated a ceiling effect in this sample, as it often does in aphasia research and practice. The WAB-R is scored out of 100 points, and the cut-off score to distinguish between PWAs and those without aphasia is 93.8. This means that the entire range of "normal" or "not aphasic" performance is constrained to less than seven points, in comparison to the range of 93.8 points available to quantify aphasic performance. Further, it is well-known that this score is not sensitive to mild aphasic deficits that may be most prominent in connected speech, which it does not adequately assess. Similarly, the BNT is scored out of 60 points, and most healthy controls correctly name 53-60 items (depending on age and education; Tombaugh and Hubley, 1997). While naming is the hallmark impairment of aphasia, confrontation naming tasks such as the BNT may not adequately represent difficulties in connected speech (e.g., Richardson et al., 2018). In contrast, the MC composite score has demonstrated sufficient sensitivity to describe both impaired and control discourse across a wide range of performance (Dalton and Richardson, 2019; Richardson et al., 2021). These results further amplify calls to utilize discourse measures as the primary outcome for aphasia research studies since discourse performance is a better index of the functional communication outcomes desired by PWAs (Brady et al., 2016). To our knowledge, this is the first study to relate spectral rsEEG to discourse production abilities in PWAs.

Limitations and Future Directions
Future research should more directly examine the effect of lesion location in PWAs as research in individuals in the chronic stage post-stroke suggests this may have an effect on spectral rsEEG (e.g., Park et al., 2016). Because this study included PWAs with fluent and non-fluent aphasia, lesion locations included both posterior and anterior left hemisphere perisylvian regions (see Supplementary Table 1), as well as subcortical areas and right hemisphere (for individuals with mixed hemisphericity and/or multiple strokes). A more nuanced understanding of the results with improved localization of the spectral rsEEG sources and region of interest analyses with functional networks may be possible when EEG is paired with other neuroimaging modalities, such as MRI or MEG. Additional avenues for future research include examining changes in spectral rsEEG before and after speech-language therapy to determine the sensitivity and predictive capacity of spectral rsEEG and investigating the reliability of functional behavioral measures, such as the discourse measures included in this study, to identify sources of non-informative variability that can be minimized in the future.
Improvements in the reporting of methodology and of basic descriptive statistics are needed. For example, we excluded individuals with clinically identified depression and anxiety, but allowed individuals without a clinical diagnosis who reported experiencing mild symptoms of depression and/or anxiety. This decision was based on the reportedly high prevalence of depression and/or anxiety in PWAs (e.g., Hilari et al., 2012; Ayerbe et al., 2013; Døli et al., 2017; Morris et al., 2017) as well as precedent set by previous foundational spectral EEG studies in chronic stroke with and without aphasia. Most studies of spectral EEG in individuals post-stroke with or without aphasia do not mention mental health disorders as an exclusion criterion, so it is unclear whether individuals in their samples experienced these difficulties. Those studies that do report an exclusion criterion for mental health issues range from excluding only individuals with refractory (or treatment-resistant) depression (Carrick et al., 2016), to excluding individuals who are taking specific classes of medications (such as benzodiazepines or tricyclic antidepressants, e.g., Finnigan et al., 2007), to excluding based on known (Nicolo et al., 2015; Song et al., 2015; Finnigan et al., 2016; Gorišek et al., 2016), major active (Wu et al., 2015), major pre-morbid (Leon-Carrion et al., 2009), and/or uncontrolled (Herron et al., 2014) psychiatric illness. In this study, prospective data regarding the types or severity of depression and anxiety symptoms were not collected since mental health concerns were asked about during pre-enrollment screening only. The lack of these data limits the interpretation of the results presented here, as it is possible that between-group differences, or the lack of differences, were driven by altered spectral rsEEG secondary to depression and/or anxiety. In the future, more clearly defining inclusion/exclusion criteria (e.g., what is the difference between "major active" and "known" psychiatric illness?), prospectively collecting depression and anxiety symptoms and severity in both clinical and healthy control participants using measures validated for those populations, and clearly reporting these characteristics may yield even greater insights into brain function.
By providing more detailed methodology and data reporting, we can increase the confidence in reported results and strengthen the inferences that can be drawn from published findings. Ideally, a database, such as those already established for healthy controls and some populations with disorders (e.g., Brain Research and Integrative Neuroscience Network; Patient Repository for EEG Data + Computational Tools), would be populated with data from PWAs to allow for sharing of data and use of big data analytics that are currently unavailable. The current lack of such data in existing repositories highlights the need for continued research specific to PWAs. Much of the stroke literature actively excludes PWAs from participation, typically due to concerns regarding language abilities, informed consent, and the ability to complete language-based study activities (Thorne and Paterson, 2000; Townend et al., 2007; Dalemans et al., 2009; Brady et al., 2013). Unfortunately, even that research which does not exclude PWAs often neglects to report their data as a separate subset of the participants. Since there is sufficient evidence in the literature that PWAs are impacted differently, and more severely, than individuals with other types of post-stroke deficits (e.g., Hilari, 2011; Brady et al., 2013; Hilari and Northcott, 2017; Pike et al., 2017; Simmons-Mackie, 2018), the inclusion of, and separate reporting of data for, PWAs is warranted.

CONCLUSION
Within the field of stroke rehabilitation, and especially within the field of aphasiology, there is great need for improved individualization and optimization of rehabilitation. One of the primary limiting factors in achieving this goal is the lack of sensitive measures that can predict treatment response. Simply looking at an individual's behavioral profile or examining structural or functional MRI records has proven insufficient to determine the treatment course that will lead to greatest functional recovery for that individual. This is especially critical for adults engaging in rehabilitation, since private insurance companies may impose annual caps on total therapy hours or dollars, or limit access to therapy for individuals with mild or latent, but functionally debilitating or limiting, deficits. By improving individualization of treatment and thereby maximizing outcomes, PWAs will be more likely to experience meaningful improvements in everyday living.
Resting-state investigations, both rsEEG and rsfMRI, hold great advantages for the inclusion of PWAs (or others with specific post-stroke deficits that alter behavioral task performance) in research studies. Resting-state paradigms allow investigations of functional networks without the confounds of response accuracy, variable cognitive load across participants, and the differential recruitment of brain areas in response to cognitive load. In this study, speech-language pathologists and speech-language pathology master's students with little to no background in EEG were able to collect high-quality data, regardless of the severity of participant deficits, following provision of a detailed study procedures manual, several training (including practice) sessions, and minimal supervision during initial participant sessions. Spectral rsEEG, especially when recorded in an eyes-closed condition, shows potential as a means of improving diagnostic sensitivity, tracking outcomes, and individualizing therapy for maximum benefit given its reliability, persistent changes after stroke, and correlations with behavioral function.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Institutional Review Board - Human Research Review Committee University of New Mexico Health Sciences. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
SD contributed to all aspects of the research process including study conception and design, data collection, data analysis, and manuscript drafting. JC contributed to study conception and design, data analysis, and manuscript revisions. JR contributed to study conception and design, data collection, and manuscript revisions. All authors contributed to the article and approved the submitted version.