The Effects of Repeated Testing, Simulated Malingering, and Traumatic Brain Injury on High-Precision Measures of Simple Visual Reaction Time

Woods, David L.; Wyma, John M.; Yund, E. William; Herron, Timothy J.

doi:10.3389/fnhum.2015.00540

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 09 November 2015

Sec. Motor Neuroscience

Volume 9 - 2015 | https://doi.org/10.3389/fnhum.2015.00540

David L. Woods¹,²,³,⁴*

John M. Wyma¹

E. William Yund¹

Timothy J. Herron¹

¹Human Cognitive Neurophysiology Laboratory, Veterans Affairs Northern California Health Care System, Martinez, CA, USA
²UC Davis Department of Neurology, University of California, Davis, Sacramento, CA, USA
³UC Davis Center for Neurosciences, University of California, Davis, Davis, CA, USA
⁴UC Davis Center for Mind and Brain, University of California, Davis, Davis, CA, USA

Simple reaction time (SRT), the latency to respond to a stimulus, has been widely used as a basic measure of processing speed. In the current experiments, we examined clinically relevant properties of a new SRT test that presents visual stimuli to the left or right hemifield at varying stimulus onset asynchronies (SOAs). Experiment 1 examined test-retest reliability in 48 participants who underwent three test sessions at weekly intervals. In the first test, log-transformed SRT (log-SRT) z-scores, corrected for the influence of age and computer-use, were well predicted by regression functions derived from a normative population of 189 control participants. Test-retest reliability of log-SRT z-scores, measured with an intraclass correlation coefficient (ICC = 0.83), equaled or exceeded that of other SRT tests and of other widely used tests of processing speed that are administered manually. No significant learning effects were observed across test sessions. Experiment 2 investigated the same participants when instructed to malinger during a fourth testing session: 94% showed abnormal log-SRT z-scores, with 83% producing log-SRT z-scores exceeding a cutoff of 3.0, a degree of abnormality never seen in full-effort conditions. Thus, a log-SRT z-score cutoff of 3.0 had a sensitivity (83%) and specificity (100%) that equaled or exceeded those of existing symptom validity tests. We argue that even expert malingerers, fully informed of the malingering-detection metric, would be unable to successfully feign impairments on the SRT test because of the precise control of SRT latencies that would be required. Experiment 3 investigated 26 patients with traumatic brain injury (TBI) tested more than 1 year post-injury. The 22 patients with mild TBI showed nonsignificantly faster SRTs than controls, but a small group of four patients with severe TBI showed slowed SRTs.
Simple visual reaction time is a reliable measure of processing speed that is sensitive to the effects of malingering and TBI.

Introduction

Simple reaction time (SRT) tests are basic measures of processing speed that index the minimal time needed to respond to a stimulus (Woods et al., 2015b). SRTs are weakly correlated with general intelligence (Deary et al., 2001) and are slowed in many neurological disorders, including traumatic brain injury (TBI; Stuss et al., 1989b; Willison and Tombaugh, 2006; Neselius et al., 2014), Parkinson’s disease (PD; Camicioli et al., 2008), post-concussion syndrome (Makdissi et al., 2001), cerebrovascular disease (D’Erme et al., 1992), and mild cognitive impairment (Christensen et al., 2005).

In most visual SRT tests, participants respond as rapidly as possible with the keyboard or mouse to stimuli presented at central fixation following randomized stimulus onset asynchronies (SOAs). Here, we describe the clinically relevant characteristics of a new SRT test (Woods et al., 2015b) that quantifies SRTs separately for stimuli presented in the left and right hemifields, analyzes SRT latencies as a function of the preceding SOAs (Niemi and Naatanen, 1981), and isolates stimulus detection time (SDT) by subtracting the time needed to depress the mouse button (movement initiation time) from the SRT.

The new SRT paradigm was previously used to study age-related (age range 18–82 years) changes in SRTs (Woods et al., 2015b) and revealed shorter SRT latencies (mean = 238 ms) than reported in most prior large-scale studies. In addition, SRT standard deviations (28 ms) were smaller than in previous studies, and within-subject (trial-to-trial) standard deviations were also reduced (53 ms). The differences with previous studies were attributed to the improved precision of the computer hardware and software used for SRT measurement. We also found that SRT latencies increased with age at a rate of 0.55 ms/year, while age effects on SDT latencies were insignificant. SRT latencies decreased by 27 ms as SOAs lengthened, but were minimally affected by the hemifield of stimulation.

The current experiments were designed to evaluate the characteristics of the new SRT paradigm most relevant to its potential clinical deployment: its test-retest reliability, sensitivity to malingering, and sensitivity to the effects of TBI.

Test-Retest Reliability

In Experiment 1, we examined the test-retest reliability of the new SRT test. Our objective was to compare the test-retest reliability of SRT latency measures with those of previous commercial and non-commercial SRT tests, and to examine the test-retest reliability of the additional measures provided by the new test (e.g., the effects of SOAs on SRT latencies).

Sensitivity to Malingering

In Experiment 2, we examined the effects of simulated malingering on test performance with the goal of discriminating simulated malingerers from control participants. Based on previous studies that found much longer SRT latencies in malingerers than in either control subjects or brain-injured patients (Kertzman et al., 2006; Willison and Tombaugh, 2006; Reicker, 2008; Marx et al., 2009), we examined the sensitivity and specificity of simple z-score cutoffs. We also examined the malingering-detection utility of the additional measures provided by the test.

The Effects of TBI

In Experiment 3, we evaluated whether the new SRT test would detect impairments in patients with chronic mild and severe traumatic brain injury (mTBI and sTBI). While previous studies of patients in the chronic phase have generally found that SRT slowing is restricted to patients with sTBI (Stuss et al., 1989a; Ferraro, 1996; Bashore and Ridderinkhof, 2002; Tombaugh et al., 2007), we hypothesized that a more sensitive SRT test might also reveal abnormalities in patients with mTBI. We further evaluated whether other performance measures, such as trial-to-trial latency variability, would show TBI-related abnormalities (Stuss et al., 1989a; Collins and Long, 1996; Tombaugh et al., 2007).

Experiment 1: Test-Retest Reliability

In Experiment 1, we examined the test-retest reliability of SRT measures in a group of 48 young control participants. Previous studies have generally shown that SRT latencies show high test-retest reliability (Lemay et al., 2004; Sakong et al., 2007). However, little is known about the test-retest reliability of SRT latencies for laterally-presented stimuli, or the test-retest reliability of the additional SRT measures gathered in the current paradigm, including SDTs and the increase in SRT latencies that occurs when stimuli are presented at short SOAs.

Previous studies have found minimal learning effects on SRTs (Lemay et al., 2004; Straume-Naesheim et al., 2005; Sakong et al., 2007; Eckner et al., 2011). Here, we examined whether learning would occur in a somewhat more complex SRT paradigm.

Timing Precision and Replicability

Although the timing calibration of computer hardware is essential for providing accurate estimates of SRT latencies (Plant and Quinlan, 2013), timing calibration data have not previously been published for commercial or non-commercial SRT paradigms. Hardware delays are generally constant for different tests performed on a given computer, but can change substantially (e.g., by 40 ms or more) if the same test is run with a different computer monitor and response device (Plant and Turner, 2009). We therefore provide additional information about hardware and software timing precision for the SRT tests reported here.

Computer software can also introduce timing imprecision when multiple operations (e.g., monitoring for responses, loading files from disk, writing data to disk, etc.) occur concurrently and when paradigm execution is interrupted by other processes executing concurrently on the test computer. Unlike hardware delays, software delays occur unpredictably, and hence need to be measured for each stimulus and response event during a test to assure optimal precision. Previous tests used to evaluate monitor precision have found infrequent software delays of 17–51 ms that vary with SOA and the software platform used (Garaizar et al., 2014). However, software delays have not been measured in more complex paradigms where multiple operations occur concurrently. Here, we provide measures of the software delays associated with each stimulus and response event during SRT test execution.

Methods

Participants

The demographic characteristics of the participants are shown in Table 1. The 48 young volunteers (mean 26.2 years, range 18–46 years, 48% male) were recruited from advertisements in the San Francisco Bay Area on Craigslist (sfbay.craigslist.org), and from pre-existing control populations. All participants were required to meet the following inclusion criteria: (a) fluency in the English language; (b) no current or prior history of psychiatric illness; (c) no current substance abuse; (d) no concurrent history of neurologic disease known to affect cognitive functioning; (e) auditory functioning sufficient to understand normal conversational speech; and (f) visual acuity normal or corrected to 20/40 or better. They were recruited alongside a larger control population (Experiment 2 in Woods et al., 2015b) and had agreed to participate in four test sessions: three test sessions to evaluate test-retest reliability, and a fourth session to study the effects of malingering (see “Experiment 2”, below). All participants signed written consent forms approved by the institutional review board (IRB) at the Veterans Affairs Northern California Health Care System (VANCHCS), and were compensated for their participation. Fifty-eight percent of the participants were college students, and the group as a whole was very well educated (mean 15.1 years of education). Ethnically, 68% were Caucasian, 11% Hispanic, 9% African American, 9% Asian, and 3% other. The data were compared with those from a normative control group of 189 participants ranging in age from 18 to 82 years whose results have been described in detail elsewhere (Woods et al., 2015b).

Table 1. Demographic characteristics of the normative population (Norm) and the participants in the three experiments.

Methods and Procedures

SRT testing occurred midway through a series of tests that required approximately 2 h to complete¹. The SRT paradigm is shown in Figure 1 and is available for download at www.ebire.org/hcnlab/cognitive-tests/SRT. Participants responded to the occurrence of a bulls-eye stimulus in either the left or right hemifield by pressing the response button of the mouse with their index finger. After 20 practice trials, 100 stimuli of 200 ms duration were presented randomly to the left and right hemifield at SOAs ranging from 1000 to 2000 ms in 250 ms steps.
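The trial structure described above can be sketched in a few lines of Python. This is only an illustration, not the Presentation script used in the study: the function name `make_schedule`, the balanced-then-shuffled assignment of hemifields and SOAs, and the seed are assumptions introduced here, since the text states only that presentation was random.

```python
import random

SOAS_MS = [1000, 1250, 1500, 1750, 2000]  # SOAs from 1000 to 2000 ms in 250 ms steps
N_TRIALS = 100                            # test trials (practice trials not included)
STIM_DURATION_MS = 200                    # stimulus duration, for reference

def make_schedule(n_trials=N_TRIALS, seed=None):
    """Return a list of (onset_ms, hemifield, soa_ms) tuples.

    Assumption: hemifields and SOAs are balanced across trials and then
    shuffled independently; the study's actual randomization may differ.
    """
    rng = random.Random(seed)
    hemifields = ["left", "right"] * (n_trials // 2)
    soas = SOAS_MS * (n_trials // len(SOAS_MS))
    rng.shuffle(hemifields)
    rng.shuffle(soas)

    schedule = []
    onset = 0
    for hemi, soa in zip(hemifields, soas):
        onset += soa  # the SOA preceding this stimulus
        schedule.append((onset, hemi, soa))
    return schedule

schedule = make_schedule(seed=1)
```

Storing the preceding SOA with each trial makes it straightforward to analyze SRT latencies as a function of SOA later on.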

Figure 1. The SRT paradigm. Stimuli were high-contrast bulls-eyes presented to the left or right hemifield for a duration of 200 ms at five randomized stimulus onset asynchronies (SOAs) ranging from 1000 to 2000 ms in 250 ms steps. Stimuli could occur in the visual hemifield ipsilateral (shown) or contralateral to the responding hand.

A response window of 110–1000 ms was used. Responses outside this range were categorized as false alarms (FAs). The failure to respond during the 110–1000 ms interval following the presentation of a stimulus was categorized as a miss. Hit rate was defined as the percentage of stimuli associated with valid responses. For each participant, hit-rate, false-alarm rate, and mean SRT latency were calculated along with trial-to-trial SRT variance.
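The scoring rules above can be illustrated with a minimal Python sketch. The function `score_session` and its return format are hypothetical, and two details are assumptions not stated in the text: each response is attributed to the most recent stimulus, and a second response to an already-answered stimulus counts as a false alarm.

```python
from bisect import bisect_right
from statistics import mean, variance

WINDOW_MS = (110.0, 1000.0)  # valid response window after stimulus onset

def score_session(stim_onsets, response_times, window=WINDOW_MS):
    """Classify responses as hits or false alarms and summarize SRTs.

    stim_onsets must be sorted in ascending order (times in ms).
    """
    lo, hi = window
    latencies = {}      # stimulus index -> latency of its first valid response
    false_alarms = 0
    for t in sorted(response_times):
        i = bisect_right(stim_onsets, t) - 1  # most recent stimulus at or before t
        if i >= 0 and lo <= t - stim_onsets[i] <= hi and i not in latencies:
            latencies[i] = t - stim_onsets[i]
        else:
            false_alarms += 1                 # outside the window, or a repeat
    srts = list(latencies.values())
    n = len(stim_onsets)
    return {
        "hit_rate": 100.0 * len(srts) / n,    # percent of stimuli with a valid response
        "misses": n - len(srts),
        "false_alarms": false_alarms,
        "mean_srt": mean(srts) if srts else float("nan"),
        "srt_variance": variance(srts) if len(srts) > 1 else float("nan"),
    }

# Four stimuli; one anticipatory response (50 ms) and one missed stimulus.
result = score_session([0, 1000, 2250, 3500], [240, 1050, 1230, 3760])
```

In this toy session the response at 1050 ms falls only 50 ms after the second stimulus and is therefore scored as a false alarm, while the third stimulus receives no response and is scored as a miss.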

While our primary focus was on SRT latency, we were also interested in measuring SDT, the difference between SRTs and movement initiation times measured in a finger-tapping task performed on the same day of testing (Hubel et al., 2013a,b). In addition, we evaluated the reliability of other potentially useful metrics, including hit rate, trial-to-trial SRT standard deviations, the coefficient of variation (CV; the trial-to-trial standard deviation divided by the mean SRT), differences between SRT latencies for stimuli presented to the left and right hemifields, and differences between SRT latencies for stimuli presented at short and long SOAs. Other methodological details have been described elsewhere (Woods et al., 2015b).
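As a rough illustration of how these derived metrics relate to the trial data, the following Python sketch computes them for a toy set of trials. The function `derived_metrics` is hypothetical, and the choice to contrast the shortest and longest SOA (rather than some other grouping of short and long SOAs) is an assumption made here for simplicity.

```python
from statistics import mean, stdev

def derived_metrics(trials, movement_initiation_ms):
    """trials: list of (srt_ms, hemifield, soa_ms) tuples for valid responses."""
    srts = [srt for srt, _, _ in trials]
    mean_srt = mean(srts)
    soas = [soa for _, _, soa in trials]
    shortest, longest = min(soas), max(soas)

    def mean_where(keep):
        return mean(srt for srt, hemi, soa in trials if keep(hemi, soa))

    return {
        "mean_srt": mean_srt,
        # SDT: SRT minus movement initiation time from the finger-tapping task
        "sdt": mean_srt - movement_initiation_ms,
        # CV: trial-to-trial standard deviation divided by the mean SRT
        "cv": stdev(srts) / mean_srt,
        # left-minus-right hemifield latency difference
        "hemifield_diff": mean_where(lambda h, s: h == "left")
                          - mean_where(lambda h, s: h == "right"),
        # Assumption: short vs. long SOA means shortest vs. longest SOA
        "soa_effect": mean_where(lambda h, s: s == shortest)
                      - mean_where(lambda h, s: s == longest),
    }

toy_trials = [(250, "left", 1000), (230, "right", 2000),
              (260, "right", 1000), (220, "left", 2000)]
metrics = derived_metrics(toy_trials, movement_initiation_ms=70)
```

A positive `soa_effect` corresponds to longer latencies at short SOAs, the direction reported for this paradigm.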

Hardware and Software Calibration

There are two principal sources of hardware delay. First, there is a delay in the appearance of the stimulus after the computer video card sends the stimulus image to the LCD monitor, which depends on monitor electronics. We measured the delay for the 17″ Samsung Syncmaster monitor with a photodiode (StimTracker, Cedrus, San Pedro, CA, USA) and found a mean delay of 11.0 ms (sd = 0.1 ms). Second, there is a variable delay between the moment that the response button is pressed and the moment that the response is registered by the device driver and detected by the computer software controlling the paradigm. The magnitude of this delay (often 20 ms or more) depends on mouse design and the device driver software that signals responses to the operating system (Plant et al., 2003). In the current experiment, we used a PC gaming mouse (Razer Sidewinder, Carlsbad, CA, USA) that required minimal (2.0 mm) movement for button closure and incorporated a device driver with a high USB sampling rate (1.0 kHz). We measured response delays by disassembling the mouse and simulating button closure with an electronic relay. The average response delay was 6.8 ms (sd = 1.8 ms). Thus, total delays introduced by the video display and response device were 17.8 ms.

In addition to hardware delays, stimulus-delivery software can introduce unpredictable delays and latency variability. Presentation software (Neurobehavioral Systems, Inc., Berkeley, CA, USA) is designed so that resource-demanding operations (e.g., loading a stimulus from disk) are multiplexed with continuous high-precision monitoring for event occurrences. This feature enables Presentation to report event times with 0.1 ms precision using the 100 kHz programmable clock. Event-time uncertainties, the difference between times recorded before the event occurred and times recorded after the event, are also recorded for each event. Thus, there will be a gap in the otherwise continuous timing record and a corresponding increase of the event-time uncertainties if stimulus delivery or response monitoring is interrupted by a resource-demanding operation or an extraneous process. For example, if a response occurred during a 5.0 ms interruption, its latency would be logged at the beginning of the interruption and would be associated with a 5.0 ms event-time uncertainty. In the current experiments, the PC was configured to minimize extraneous operating system interruptions. Event-time uncertainties for 5,279 stimulus presentations in Experiment 1 averaged 0.16 ms (sd = 0.05 ms) with a maximal uncertainty of 2.2 ms, and the mean uncertainty for 5,226 response events was 0.22 ms (sd = 0.11 ms) with a maximal uncertainty of 1.3 ms.
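The logic of event-time uncertainty can be made concrete with a small Python sketch. It does not use or reproduce the Presentation software interface; `log_event` and the list of poll times are hypothetical stand-ins for a monitoring loop that records a timestamp on every pass.

```python
def log_event(poll_times_ms, event_time_ms):
    """Return (logged_time_ms, uncertainty_ms) for an event detected by polling.

    The event is detected on the first poll at or after it occurred. The
    logged time is the last poll before detection, and the uncertainty is
    the gap between those two polls.
    """
    for before, after in zip(poll_times_ms, poll_times_ms[1:]):
        if before < event_time_ms <= after:
            return before, after - before
    raise ValueError("event outside the polled interval")

# Polling every 0.1 ms, with a 5.0 ms interruption between 10.0 and 15.0 ms.
polls = [9.8, 9.9, 10.0, 15.0, 15.1]
interrupted = log_event(polls, event_time_ms=12.3)  # response during the interruption
normal = log_event(polls, event_time_ms=9.85)       # response during normal polling
```

The first call reproduces the 5.0 ms example in the text: the response is logged at the start of the interruption with a 5.0 ms uncertainty, whereas the second call yields an uncertainty of about 0.1 ms.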

Data Analysis

The distribution of mean SRTs was asymmetrical (skew = 0.90), so SRTs were first log-transformed to reduce skew. Examination of the normative data from participants ranging in age from 18 to 82 years (Woods et al., 2015b) showed that both age (r = 0.34, t₍₁₈₇₎ = 4.94, p < 0.0001) and computer-use (r = −0.28, t₍₁₈₇₎ = −3.99, p < 0.0001) had significant effects on log-transformed SRT latencies. When analyzed conjointly, these factors accounted for 16% of log-SRT variance (r = 0.40) in the normative population, with both age (t₍₁₈₆₎ = 4.26, p < 0.0001) and computer-use (t₍₁₈₆₎ = −3.08, p < 0.003) independently influencing log-SRTs. Therefore, the regression functions from the normative data were used to calculate log-SRT z-scores after correcting for the influence of age and computer-use (see Table 2). SDTs (skew = −0.90) were not significantly influenced by either age (r = −0.07) or computer-use (r = 0.01), so no regression functions were applied when calculating SDT z-scores.
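The regression-based z-score procedure can be sketched as follows. This is a minimal illustration on synthetic data, not the normative regression itself: the function names, the base-10 logarithm, the 1 to 7 computer-use scale, the simulated coefficients, and the use of the residual standard deviation as the z-score denominator are all assumptions made for the example.

```python
import numpy as np

def fit_norms(age, computer_use, srt_ms):
    """Fit log-SRT ~ intercept + age + computer_use on normative data."""
    y = np.log10(srt_ms)                       # assumption: base-10 log
    X = np.column_stack([np.ones_like(y), age, computer_use])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_sd = np.std(y - X @ beta, ddof=X.shape[1])
    return beta, resid_sd

def log_srt_z(beta, resid_sd, age, computer_use, srt_ms):
    """z-score of an observed SRT relative to the age/computer-use prediction."""
    predicted = beta[0] + beta[1] * np.asarray(age) + beta[2] * np.asarray(computer_use)
    return (np.log10(srt_ms) - predicted) / resid_sd

# Synthetic normative sample (values are illustrative only).
rng = np.random.default_rng(0)
age = rng.uniform(18, 82, 189)
use = rng.integers(1, 8, 189)                  # hypothetical 1-7 computer-use scale
srt = 10 ** (2.35 + 0.001 * age - 0.005 * use + rng.normal(0, 0.04, 189))

beta, resid_sd = fit_norms(age, use, srt)
z_norm = log_srt_z(beta, resid_sd, age, use, srt)
z_slow = log_srt_z(beta, resid_sd, 30, 4, 400.0)   # a slow 30-year-old
z_fast = log_srt_z(beta, resid_sd, 30, 4, 250.0)
```

Positive z-scores indicate responses slower than predicted for a participant's age and computer-use, so the same raw SRT yields a larger z-score in a younger participant than in an older one when the fitted age slope is positive.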

Table 2. Mean values for all experiments.

Statistical analysis was performed with multifactor mixed analysis of variance (ANOVA). Separate ANOVAs were performed for different metrics, with Greenhouse-Geisser corrections of degrees of freedom used in computing p values in order to correct for covariation within factors or interactions. Effect sizes are reported as partial ω² values or as Cohen’s d. Test-retest correlations were measured with intraclass correlation coefficients (ICCs) using SPSS (IBM, version 22). Pearson product moment correlation coefficients are also reported when appropriate.
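For readers without access to SPSS, an intraclass correlation can be computed directly from the two-way ANOVA mean squares. The sketch below implements one common variant, ICC(3,1) (two-way mixed model, single measures, consistency). Which ICC model was selected in SPSS is not stated in the text, so this choice is an assumption, and other variants can give different values for the same data.

```python
import numpy as np

def icc_consistency(scores):
    """ICC(3,1): two-way mixed model, single measures, consistency.

    scores: array of shape (n_subjects, k_sessions).
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_subjects = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_sessions = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_error = np.sum((x - grand) ** 2) - ss_subjects - ss_sessions
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)

# Toy data: three participants, three sessions, constant session-to-session shift.
shifted = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
icc_shifted = icc_consistency(shifted)
```

Because the consistency form ignores uniform shifts between sessions, a pure practice effect that speeds every participant by the same amount would not lower this coefficient, whereas an absolute-agreement ICC would penalize it.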

Results

Figure 2 shows SRTs as a function of age for the participants in the normative database (blue diamonds) and participants in the first session of Experiment 1 (1a, open red squares). Figure 3 shows SDTs as a function of age, and Figure 4 shows age- and computer-use regressed log-SRT z-scores and SDT z-scores. Summaries of the results from the different experiments are presented in Table 2.

Figure 2. Mean SRT latencies as a function of age. SRT latencies from individual participants in normative data (norm, blue diamonds), Experiment 1a (open red squares), Experiment 2 (simulated malingering, green triangles) and Experiment 3 (patients with mTBI, red circles; patients with sTBI, striped red circles). The normative age-regression slope is shown. Simulated malingerers with SRT latencies >600 ms are not included.

Figure 3. Mean stimulus detection times (SDTs) as a function of age. SDTs were derived by subtracting movement initiation time (measured in a finger-tapping experiment performed in the same test session) from SRTs. SDTs are shown for normative data (norm, blue diamonds), Experiment 1a (open red squares), Experiment 2 (simulated malingering, green triangles) and Experiment 3 (patients with mTBI, red circles; patients with sTBI, striped red circles). The normative age-regression slope is shown.

Figure 4. Log-SRT z-scores and SDT z-scores for the normative group and the three experiments. Data from two simulated malingerers with SDT z-scores greater than 12.0 and two simulated malingerers with SDT z-scores less than −4.0 are not shown. The red lines show p < 0.05 thresholds for normative log-SRT and SDT z-scores.

Comparison with Normative Data

We first compared the performance of participants in Experiment 1a (the first test session) with the performance of participants in the normative database (Table 2, norm). No significant group differences were seen in log-SRT z-scores (mean z-score = 0.10, F₍₁,₂₃₄₎ = 1.16, NS), hit rates (F₍₁,₂₃₄₎ = 2.46, p < 0.15), SDTs (F₍₁,₂₃₄₎ = 0.16, NS), or SOA effects (F₍₁,₂₃₄₎ = 1.30, NS). However, Experiment 1a participants had slightly reduced CVs (z-score = −0.33, F₍₁,₂₃₄₎ = 4.92, p < 0.03, partial ω² = 0.02) compared to the normative population. In addition, the participants in Experiment 1a were a more homogeneous group than the participants in the normative group, resulting in reduced intersubject standard deviations for SRTs (17.6 ms vs. 27.8 ms) and log-SRT z-scores (0.78 vs. 1.0). Table 3 shows the percentage of abnormal test results (based on single-sided p < 0.05 cutoffs in the normative data). The incidence of abnormal results in Experiment 1a ranged from 0.0 to 4.2%.
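One way to derive a single-sided p < 0.05 cutoff and the resulting percentage of abnormal scores is sketched below. The function `percent_abnormal` is hypothetical, and the use of an empirical 95th percentile of the normative z-scores is an assumption; a parametric cutoff (z > 1.645 under a normal distribution) would be an equally plausible reading of the text.

```python
import numpy as np

def percent_abnormal(norm_z, group_z, alpha=0.05):
    """Return (cutoff, percent of group_z above the cutoff).

    Assumption: the cutoff is the empirical (1 - alpha) quantile of the
    normative z-scores, with higher z-scores meaning slower responses.
    """
    cutoff = np.quantile(norm_z, 1 - alpha)
    return cutoff, 100.0 * np.mean(np.asarray(group_z) > cutoff)

# Toy values chosen so the cutoff is easy to verify by hand.
toy_norm = np.arange(101)             # 0, 1, ..., 100
toy_group = [90, 96, 99, 50]
cutoff, pct = percent_abnormal(toy_norm, toy_group)
```

With a fixed 5% false-positive rate in the normative sample, abnormality rates near or below 5% in a test group, as in Experiment 1a, are what would be expected from a comparable full-effort population.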

Table 3. Percentage of abnormal results (p < 0.05).