Identifying Electrophysiological Prodromes of Post-traumatic Stress Disorder: Results from a Pilot Study

Wang, Chao; Costanzo, Michelle E.; Rapp, Paul E.; Darmon, David; Bashirelahi, Kylee; Nathan, Dominic E.; Cellucci, Christopher J.; Roy, Michael J.; Keyser, David O.

doi:10.3389/fpsyt.2017.00071

ORIGINAL RESEARCH article

Front. Psychiatry, 15 May 2017

Sec. Psychopathology

Volume 8 - 2017 | https://doi.org/10.3389/fpsyt.2017.00071

This article is part of the Research Topic: Trauma, Psychosis, and Posttraumatic Stress Disorder

Chao Wang¹,²

Michelle E. Costanzo²,³

Paul E. Rapp¹*

David Darmon¹,²

Kylee Bashirelahi¹,²

Dominic E. Nathan¹,²,⁴

Christopher J. Cellucci⁵

Michael J. Roy³

David O. Keyser¹

¹Traumatic Injury Research Program, Department of Military and Emergency Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
²The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., Bethesda, MD, USA
³Department of Medicine and Center for Neuroscience and Regenerative Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
⁴Graduate School of Nursing, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
⁵Aquinas LLC, Berwyn, PA, USA

The objective of this research project is the identification of a physiological prodrome of post-traumatic stress disorder (PTSD) that has a reliability that could justify preemptive treatment in the sub-syndromal state. Because abnormalities in event-related potentials (ERPs) have been observed in fully expressed PTSD, the possible utility of abnormal ERPs in predicting delayed-onset PTSD was investigated. ERPs were recorded from military service members recently returned from Iraq or Afghanistan who did not meet PTSD diagnostic criteria at the time of ERP acquisition. Participants (n = 65) were followed for up to 1 year, and 7.7% of the cohort (n = 5) were PTSD-positive at follow-up. The initial analysis of the receiver operating characteristic (ROC) curve constructed using ERP metrics was encouraging. The average amplitude to target stimuli gave an area under the ROC curve of greater than 0.8. Classification based on the Youden index, which is determined from the ROC, gave positive results. Using average target amplitude at electrode Cz yielded Sensitivity = 0.80 and Specificity = 0.87. A more systematic statistical analysis of the ERP data indicated that the ROC results may simply represent a fortuitous consequence of small sample size. Predicted error rates based on the distribution of target ERP amplitudes approached those of random classification. A leave-one-out cross validation using a Gaussian likelihood classifier with Bayesian priors gave lower values of sensitivity and specificity. In contrast with the ROC results, the leave-one-out classification at Cz gave Sensitivity = 0.65 and Specificity = 0.60. A bootstrap calculation, again using the Gaussian likelihood classifier at Cz, gave Sensitivity = 0.59 and Specificity = 0.68. Two provisional conclusions can be offered. First, the results can only be considered preliminary due to the small sample size, and a much larger study will be required to assess definitively the utility of ERP prodromes of PTSD. Second, it may be necessary to combine ERPs with other biomarkers in a multivariate metric to produce a prodrome that can justify preemptive treatment.

Introduction

Historically, psychiatric practice has been reactive rather than preemptive. It has been recognized that a transition to preemptive psychiatry requires the identification of prodromes of psychiatric disorders that have a predictive reliability that justifies intervention in the absence of a fully expressed disorder. A prodrome is not a risk factor. A prodrome is a physiological change antecedent to a full expression of the disorder. Costello and Angold (1) provide the following definition: “… a prodrome is a premonitory manifestation of the disease. It is not a characteristic of the individual or their environment or a causal agent of the disease. A prodromal symptom may or may not continue to be manifest once the full disease appears. Conversely, the same disease may or may not manifest prodromal symptoms in different episodes.” Emerging genetic, epigenetic, and psychophysiological technologies offer the possibility of identifying prodromes or combinations of prodromes (where a combination of metrics may improve specificity) that can warrant preemptive treatment (2, 3). Prior research has investigated prodromes of several psychiatric disorders including psychosis (4–7), depression (8), autism (9, 10), dementia (11), alcoholism and substance abuse (1, 12), and post-traumatic stress disorder [PTSD (13–15)].

The objective of this research project is the identification of a physiological prodrome of PTSD that has a reliability that could justify preemptive treatment in the sub-syndromal state. The search for statistically reliable prodromes requires two things: a sub-syndromal period where physiological changes prior to the disease onset have been initiated, and a measure that can quantify these changes. In the ideal case, a third element can facilitate the search for prodromes: the identification of an at-risk population, because an enriched population (a population where incidence is higher than in the general population) will increase the statistical likelihood of identifying a prodrome. In this contribution, we address a specific question: can event-related potentials identify individuals at risk of delayed-onset PTSD? As preceding questions, we must ask whether an at-risk population can be identified and whether there is evidence indicating that PTSD can, in some instances, present with delayed onset. It is the period between trauma exposure and the presentation of a fully expressed PTSD that provides the window of opportunity for preemptive treatment.

Can a PTSD At-Risk Group Be Identified?

Military deployment is a risk factor for PTSD. The reported incidence of PTSD in veterans varies greatly between studies. A critical review found that PTSD incidence in US Iraq veterans ranges from 4 to 17% (16). Reports of the incidence of PTSD in the general population are similarly varied, but the National Comorbidity Survey Replication Study (17, 18) estimated the lifetime prevalence of PTSD in adult Americans to be 6.8%. Current past year prevalence was estimated at 3.5%. This suggests that military service members (SMs) who have returned from deployment will provide a statistically enriched population increasing the likelihood of identifying prodromes of PTSD. When making this observation, it is recognized that it is possible that military-related PTSD and PTSD in civilian populations may have distinct pathophysiological etiologies. This would potentially limit the general utility of results obtained with a military population.

Can PTSD Present with Delayed Onset?

Meta-analysis indicates that approximately 25% of PTSD cases present with delayed onset, where delayed onset is defined as meeting diagnostic criteria after a sub-syndromal or asymptomatic period of at least 6 months after the precipitating traumatic event (19, 20). In a military population, Grieger et al. (21) found that the majority of individuals PTSD-positive 7 months after serious combat injury did not meet diagnostic threshold at 1 month post-injury. In cases of PTSD following mild traumatic brain injury (TBI), the fraction of cases presenting with delayed onset can be higher. Bryant et al. (22) found that of those who met PTSD criteria at 24 months following a TBI, 44.1% reported no PTSD at 3 months. The analysis of Smid et al. (20) and Andrews et al. (19) indicates that PTSD can present after a symptom-free period, but it has been found to be more likely after a period of sub-syndromal PTSD in which two or three of the symptom clusters are endorsed (22). The factors contributing to delayed-onset PTSD in the absence of mild TBI are incompletely understood (15). On reviewing the trajectories of full and sub-syndromal PTSD, Bryant et al. (22) reached the following conclusions: “The present study demonstrates longitudinally that there is not a linear relationship between acute trauma response and long-term PTSD and highlights that PTSD levels fluctuate markedly in the initial years after trauma exposure. This pattern can explain the modest predictive capacity of acute markers to identify subsequent PTSD status. The complexity of these trajectories is further indicated by the delayed occurrence of PTSD responses, which appears to result from a combination of the immediate stress response and cumulative stress in the aftermath of the trauma.” These clinical observations further encourage the search for reliable physiological prodromes of PTSD.

Is There a Prior Literature Reporting Alterations of Event-Related Potentials in Fully Expressed PTSD?

As noted above, an additional requirement in the search for prodromes is the identification of a measure that can quantify physiological changes antecedent to disease onset. This search can be informed by asking whether there are markers that show alteration in the fully expressed disease, since it seems possible that these alterations may have begun prior to reaching diagnostic threshold. In the specific context of this investigation, the question becomes: is there a prior literature showing abnormalities in event-related potentials in PTSD patients? An examination of the prior literature summarized in Table 1 suggests that event-related potentials can be altered in the fully expressed PTSD state.

Table 1. Studies reporting ERP abnormalities in PTSD-positive participants.

The divergence of electrophysiological results across studies is consistent with the emerging understanding that PTSD is not a discrete clinical entity and that different pathophysiological processes may be active in different individuals. The results do, however, suggest that alterations of brain electrical behavior can be associated with the disorder. As indicated in Table 1, alterations in P300 are most frequently reported.

There is an emerging understanding of the neurological origin of the empirical results reported in Table 1 that suggests why alterations of P300 may be associated with both fully expressed PTSD and the sub-syndromal state. P300 has been hypothesized to reflect neural activity associated with attention and subsequent memory processing (43), with larger P300 amplitude associated with greater attentional resources employed in the task (44, 45). The prior studies with PTSD-positive participants that report reduced P300 amplitude to target stimuli in the PTSD group compared to the control group suggest impairment of attentional processes, which is consistent with clinical observation. In addition, a meta-analysis examining ERP components and PTSD revealed that the P300 amplitude may also be sensitive to contextual cues such that information processing is modulated based on the situation and environment (31). These dynamics are consistent with functional changes of two reported neural generators of the P300 (46, 47): the anterior cingulate cortex (ACC) and the hippocampus, which are also altered in individuals with PTSD (48). The ACC is critical to attentional processing and fear inhibition (49, 50), and the hippocampus is involved in memory and contextual representations (51). Araki et al. (23) revealed that lower P300 amplitude in patients with PTSD was associated with smaller ACC volume, which linked the P300 abnormality to underlying brain morphological abnormality.

It should be recognized that the results in Table 1 were obtained from participants who were diagnostically PTSD-positive at the time of recording. The question of the utility of ERPs as a predictor of a transition to PTSD is not addressed by these studies, but these studies do suggest that altered ERPs may be present in the sub-syndromal state. This possibility is investigated in this study. The study was sponsored by the Department of Defense to investigate the utility of using a reduced montage that could be implemented in a military field hospital environment. Event-related potentials can be elicited by visual, auditory, somatosensory, and olfactory stimuli, with visual and auditory stimuli being the most commonly used. Hearing and vision can be compromised after blast exposure, but visual disturbances typically resolve faster. We therefore used visual stimuli in this study. As indicated in Table 1, several ERP components [P50, P200, N200, and contingent negative variation (CNV)] can be altered in PTSD-positive participants. Typically, however, the P300 is the most robust component. Since the object of this research program is the development of a robust technology that can be implemented in an austere medical environment, we focused on the P300.

Methods

Subjects

We recruited 85 military SMs within 2 months of their return from an Operation Enduring Freedom (OEF)/Operation Iraqi Freedom (OIF) deployment of at least 3 months’ duration in either Iraq or Afghanistan. The Clinician-Administered PTSD Scale (CAPS) (52) and the PTSD Checklist-Military Version (PCL-M) (53) were administered to assess PTSD. The Patient Health Questionnaire-9 (PHQ-9) (54) and the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10) criteria for postconcussional syndrome (PCS) were administered to determine the presence of depression and PCS, respectively. Exclusion criteria included a history of head injury resulting in loss of consciousness for 60 min or more; a current Glasgow Coma Scale score less than 13; visual acuity lower than 20/100 after correction; psychosis; active suicidal or homicidal ideation; pregnancy; a diagnosis of PCS according to ICD-10; a PHQ-9 score greater than or equal to 10; and a PCL-M score greater than or equal to 50 or a diagnosis of PTSD made by an experienced psychologist using the CAPS based on the DSM-IV criteria. All subjects provided written informed consent in accordance with the protocol approved by institutional review boards at Uniformed Services University, Walter Reed National Military Medical Center, and the National Institutes of Health.

Out of the 85 participants, 8 were excluded after baseline assessment: 2 for PCL-M ≥50, 2 for PHQ-9 scores ≥10, and 4 for problems with electroencephalogram (EEG) recording. Among the remaining 77 participants, 65 completed at least one follow-up psychological evaluation (52 at 3 months, 33 at 6 months, and 53 at 12 months). On serial follow-up evaluations, 5 of the 65 participants developed PTSD as determined by PCL-M scores (4 PTSD, 1 PTSD with depression). We therefore separated the 65 participants into 5 cases (referred to as Converters, mean age 35.6 ± 6.2 years, 4 men and 1 woman) and 60 controls (referred to as Stables, mean age 30.5 ± 8.0 years, 54 men and 6 women). The 5 Converters and 60 Stables are the final set of subjects in this study. In this paper, we focus on electrophysiological data from baseline assessment as we are trying to identify neural markers that predict the development of PTSD.

All participants in the group of 65 were exposed to relatively severe traumatic experiences. The types of index trauma reported by those who developed PTSD included experiencing a base attack (e.g., mortar or rocket fire, n = 1), engaging in combat-related violence (e.g., firefights, being hit by an improvised explosive device, IED, killing enemy, n = 2), witnessing combat-related violence (e.g., watching a truck in a convoy hit by an IED, witnessing death, n = 1), and deployment bullying and abuse (n = 1). Those who did not develop PTSD also reported experiencing base attacks (n = 24), engaging in combat-related violence (n = 23), and witnessing combat-related violence (n = 13). Two factors, however, preclude a meaningful search for correlations between ERP abnormalities and cause of trauma. The first is the small size of the study population. The second would be applicable even in a larger study: many, if not most, of these participants have experienced multiple traumas from many causes.

Electrophysiological Recording

A visual oddball task was performed by subjects in an acoustically and electrically shielded room. Visual stimuli were presented by a digital tachistoscope of our own design and construction. The tachistoscope is a 5 × 5 square array of yellow light-emitting diodes (LEDs). Each diode is 1 cm in diameter. Given the spacing between LEDs, the array is 6 cm × 6 cm. The standard visual stimulus was a vertical stimulus consisting of the five vertical center line LEDs illuminated simultaneously for 40 ms. The target visual stimulus was a horizontal stimulus consisting of the five horizontal center line LEDs illuminated simultaneously for 40 ms. Each subject received 125 stimuli in total, of which about 21% (26 ± 1 trials) were target and 79% (99 ± 1 trials) were standard stimuli. The subjects were instructed to maintain a silent count of the number of target stimulus presentations and to report their count at the end. The inter-stimulus onset time was varied randomly between 1.4 and 1.8 s. The number of trials in the current study is sufficient to elicit a valid P300 response. For example, a classic P300 study by Polich et al. (55) used 25 target trials. Cohen and Polich (56) found that the P300 stabilized with approximately 20 trials.

The scalp EEG was recorded using the EPA6 amplifier (Sensorium Inc.) and the Grass electrodes (Natus Neurology Inc.) at Fz, Cz, Pz, Oz, C3, and C4 according to the standard 10-20 electrode system, with linked earlobes as reference and a forehead ground. Electrode impedances were maintained under 5 kΩ. EOG was recorded from two electrodes placed below and above the right eye. The sampling rate was 2,048 Hz, and the analog filter band-pass was 0.02–500 Hz.

Data Processing of Electrophysiological Data

Data processing was performed offline using custom scripts written in MATLAB (www.mathworks.com). Channels contaminated by artifacts were removed from analysis. This resulted in one Fz channel (from the Stable group) and four Oz channels (one from the Converter group and three from the Stable group) being removed. EOG artifacts were corrected by using a regression approach (57). The data after EOG correction were high-pass filtered at 0.5 Hz, low-pass filtered at 50 Hz, and downsampled to 256 Hz. The analysis period was −200 to 1,000 ms, where time zero denotes stimulus onset. Trials with peak potentials exceeding 75 μV or exhibiting abnormal trends were excluded from ERP averaging. The overall trial rejection rate was 4.84%. Target trials and standard trials were averaged separately. P300 amplitude was measured as the voltage of the largest positive peak of the target ERP within 250–500 ms. P300 latency was measured as the time from stimulus onset to the maximum positive amplitude within 250–500 ms.
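The peak measurement described above can be sketched as follows. This is a minimal illustration in Python, not the authors' MATLAB scripts; the function name `p300_peak` and the synthetic waveform are hypothetical, and the sketch simply takes the largest sample in the 250–500 ms window rather than testing for a true local peak.

```python
import math

def p300_peak(erp_uv, times_ms, window=(250.0, 500.0)):
    """Return (amplitude, latency) of the largest value inside the window.

    erp_uv   : averaged target ERP, one value per sample (microvolts)
    times_ms : sample times relative to stimulus onset (milliseconds)
    """
    candidates = [(v, t) for v, t in zip(erp_uv, times_ms)
                  if window[0] <= t <= window[1]]
    if not candidates:
        raise ValueError("no samples fall inside the search window")
    amplitude, latency = max(candidates, key=lambda pair: pair[0])
    return amplitude, latency

# Hypothetical averaged ERP: 256 Hz sampling, -200 to about 1000 ms,
# with a 10 uV Gaussian bump centred at 350 ms.
times_ms = [-200.0 + 1000.0 * i / 256.0 for i in range(308)]
erp_uv = [10.0 * math.exp(-((t - 350.0) ** 2) / (2 * 40.0 ** 2))
          for t in times_ms]

amplitude, latency = p300_peak(erp_uv, times_ms)
print(f"P300 amplitude = {amplitude:.2f} uV at {latency:.1f} ms")
```

Because the latency is read from the same sample that supplies the amplitude, the two measures are always taken at the same point of the averaged waveform.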

Statistical Analyses

Differences between groups in demographics, psychological measures, and task performance (accuracy of target count) were examined by Student’s t-tests for numerical data or Fisher’s exact tests for categorical data. Because the Oz channel was lost in some recordings (including one in the Converter group), the statistical analysis is limited to the Fz, Cz, Pz, C3, and C4 electrode sites. Group differences in P300 amplitude and latency at each electrode site were tested by Student’s t-tests. Correlations between P300 amplitude and the psychological measures were examined by Pearson’s correlation coefficient. p-Values less than 0.05 were considered statistically significant.

To examine the efficacy of using P300 amplitude as the predictor for PTSD, we performed several statistical analyses including approximate classification error rate, receiver operating characteristic (ROC) curve, leave-one-out cross validation, and bootstrapping. The detailed mathematical methods and equations can be found in the Mathematical Appendices.

Results

Subject Characteristics and Baseline Psychological Measures

The subject characteristics and baseline psychological measures are summarized in Table 2. Age, gender, handedness, and history of mild TBI (mTBI) were not significantly different between the Converter and Stable groups. At the baseline assessment, the Converter group reported significantly higher CAPS, PHQ-9, and PCL-M scores than the Stable group.

Table 2. Subject characteristics and baseline psychological measures.

Behavioral Data

The accuracy of target count at baseline assessment was not significantly different between Converters and Stables. For Converters, the mean accuracy of target count was 93.1% (SD 5.0%), and for Stables the mean accuracy was 97.4% (SD 5.5%) (t = 1.70, df = 63, p = 0.095).

P300 Data: Amplitude and Latencies of Averaged Responses

We computed the approximate signal-to-noise ratios (SNRs) for both target and standard trials within the P300 time window for each subject. The SNR was calculated from the power of the ERP during the P300 window (300–400 ms) minus the power of the ERP during baseline (−200 to 0 ms) and then divided by the power of the ERP during baseline window. The mean SNR for single subject ERP for target trials at Pz is 145 (21.6 dB). The mean SNR for single subject ERP for standard trials at Pz is 87 (19.4 dB).
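The SNR definition above can be written out as a short sketch. It assumes that "power" means the mean squared amplitude of the averaged ERP within each window, which the text does not state explicitly; the function names are hypothetical. The decibel conversion reproduces the reported values.

```python
import math

def mean_power(samples):
    """Mean squared amplitude (assumed definition of power)."""
    return sum(x * x for x in samples) / len(samples)

def erp_snr(erp, times_ms, signal_win=(300.0, 400.0), baseline_win=(-200.0, 0.0)):
    """SNR = (P300-window power - baseline power) / baseline power."""
    signal = [v for v, t in zip(erp, times_ms)
              if signal_win[0] <= t <= signal_win[1]]
    baseline = [v for v, t in zip(erp, times_ms)
                if baseline_win[0] <= t < baseline_win[1]]
    p_signal, p_baseline = mean_power(signal), mean_power(baseline)
    return (p_signal - p_baseline) / p_baseline

def to_db(snr):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(snr)

print(round(to_db(145), 1))  # 21.6, target trials at Pz
print(round(to_db(87), 1))   # 19.4, standard trials at Pz
```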

The P300 waveforms of average responses to standard stimuli do not have a well-defined single peak that can provide a unique amplitude and latency measure that can be incorporated into statistical analysis. Statistical analysis is therefore limited to the average responses to target stimuli, where well-defined P300 waveforms make precise measurements possible. Figure 1 displays the grand average ERPs in response to target and standard stimuli at the six electrodes in Converters and Stables. Because the Oz channel was lost in some recordings, the statistical analysis is further limited to the Fz, Cz, Pz, C3, and C4 electrode sites. We found that for all these electrode sites, the P300 amplitude was significantly smaller (p < 0.05) for the Converter group compared to the Stable group. The P300 latency was not significantly different (p > 0.05) between the two groups. The statistical results for each electrode are summarized in Table 3. We also explored the correlation between the P300 amplitudes and the psychological measures (CAPS, PHQ-9, and PCL-M) across subjects. No significant correlations were found (p > 0.05).

Figure 1. P300 waveforms in converters and stables. Grand average ERPs in response to target and standard stimuli at the six electrodes. Blue lines represent waveforms for Stables. Red lines represent waveforms for Converters.

Table 3. Baseline results from participants who remained PTSD-negative for one year after enrollment (N = 60) and those who converted to PTSD-positive (N = 5).

Diagnostic Validity

Approximate Classification Error Rate

As summarized in Table 3, there was a statistically significant difference in the target amplitude between the participants who remained PTSD-negative throughout the study and those who became PTSD-positive. A statistically significant between-group separation does not, however, establish the efficacy of these measures as predictors. The most commonly applied quantitative measure of between-group separation is the t-test. As shown in Table 3, a naive calculation (a two-tailed t-test that assumes unequal variances) suggests a significant separation between the two participant groups. Two essential observations should be made. First, the asymptotic assumptions of the t-test cannot be meaningfully satisfied when N_C = 5. Second, a separation of means, which is what the t-test assesses, does not of itself ensure a successful classification even in those instances where the assumptions of the test are satisfied. An estimate of classification error rates can be made by again assuming normality of the two populations. The equations used are given in the Mathematical Appendices. This estimate often substantially underestimates the true error rate, particularly when sample sizes are small (58). Table 3 shows that even this admittedly optimistic estimate predicts unacceptable classification error rates of P_ERROR = 0.29 to P_ERROR = 0.32 when target amplitude is used, where it should be remembered that random assignment results in an error rate of 0.50 if we assume that the two populations occur in equal proportions. This negative conclusion will be supported by the more reliable empirical determinations of classification error. It should be noted, however, that the error rates differ between the amplitudes and latencies, namely approximately 30% for the amplitudes and 50% for the latencies.

ROC Curve

Prediction using prodromes can be treated as a diagnostic problem in which the disease-positive state corresponds to being a member of the group that becomes PTSD-positive. Calculation of the ROC curve is a commonly employed method for characterizing a diagnostic classification. The first row of Table 4 shows the area under the curve (AUC) for the electrophysiological measures. The mathematical methods used to determine the AUC and its confidence intervals are given in the Mathematical Appendices. A value of AUC > 0.5 indicates better than random assignment. The P300 amplitude at Cz showed the highest predictive power, with an AUC of 0.85 (confidence interval of [0.67, 0.94]). The ROC curve of the P300 amplitude at Cz is shown in Figure 2. While the values of the AUC are encouraging, the very large confidence intervals diminish confidence in the result.
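For readers who want a concrete picture of the AUC, it can be computed without tracing the curve: it equals the probability that a randomly chosen Converter has a smaller amplitude than a randomly chosen Stable, with ties counted as one half. The sketch below uses that rank-based identity; it is not necessarily the method of the Mathematical Appendices, the function name is hypothetical, and the amplitudes are invented for illustration.

```python
def auc_lower_is_positive(converters, stables):
    """AUC when smaller values indicate the positive (Converter) class.

    Counts the fraction of (converter, stable) pairs in which the
    converter value is lower; ties contribute one half.
    """
    wins = 0.0
    for c in converters:
        for s in stables:
            if c < s:
                wins += 1.0
            elif c == s:
                wins += 0.5
    return wins / (len(converters) * len(stables))

# Invented amplitudes (microvolts), for illustration only.
print(auc_lower_is_positive([1.0, 2.0], [3.0, 4.0]))  # perfect separation: 1.0
print(auc_lower_is_positive([1.0, 4.0], [2.0, 3.0]))  # no separation: 0.5
```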

Table 4. Area under the receiver operating characteristic curve and measures of diagnostic efficacy computed using the smallest value of threshold giving the maximum value of the Youden index.

Figure 2. The receiver operating characteristic (ROC) curve of the P300 amplitude at Cz. The horizontal axis is the false positive rate (1 − specificity), which equals the number of false positives divided by the sum of false positives and true negatives. The vertical axis is the true positive rate (sensitivity), which equals the number of true positives divided by the sum of true positives and false negatives. The solid line represents the ROC curve for using the P300 amplitude at Cz as the diagnostic test. The dashed line represents the ROC curve for a random test.

Diagnostic Efficacy and Determination of the Diagnostic Cut Score

The results of a diagnostic calculation (and by implication for the present context the identification of a prodrome) can be expressed in the canonical four element diagnostic matrix: true positive, false positive, false negative, and true negative. There is no single fully satisfactory summary measure for characterizing the diagnostic matrix. Each has advantages and limitations. The limitations are particularly evident in studies like this one where disease prevalence is low. We will therefore examine six common measures of diagnostic efficacy: diagnostic accuracy, sensitivity, specificity, the positive likelihood ratio, the negative likelihood ratio, and the diagnostic odds ratio. Their definitions are given in the Mathematical Appendices.
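The six measures can be computed directly from the four cells of the diagnostic matrix. The sketch below uses the standard textbook definitions, which are assumed to match those in the Mathematical Appendices. The example counts (4 true positives, 1 false negative, 52 true negatives, 8 false positives) are back-calculated from the rounded Cz values of Sensitivity = 0.80 and Specificity = 0.87 with 5 Converters and 60 Stables; they are an inference for illustration, not numbers taken from Table 4.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Standard summary measures of a 2 x 2 diagnostic matrix.

    Raises ZeroDivisionError when a required cell is empty (for example
    fp == 0 for the positive likelihood ratio); a fuller implementation
    would handle those cases explicitly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }

# Counts inferred from the rounded Cz results (an assumption, see above).
measures = diagnostic_measures(tp=4, fp=8, fn=1, tn=52)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With a prevalence as low as 5 in 65, accuracy is dominated by the Stables, which is one reason no single measure summarizes the matrix adequately.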

The values of elements in the diagnostic matrix, and therefore measures of diagnostic efficacy like sensitivity and specificity, are critically dependent on the cut score used to assign individuals to the disease-positive and disease-negative groups. The choice of the cut value is therefore a central problem in the implementation of a diagnostic procedure. As outlined in the Mathematical Appendices, more than one candidate procedure has been proposed. In the calculations summarized in Table 4, the diagnostic threshold was determined by the value of threshold that gave the maximum value of J, the Youden index (59). The value of sensitivity, specificity, and other measures of diagnostic efficacy reported in Table 4 are the values obtained when the threshold was set to the smallest value of threshold giving the maximum J. Because the results of Table 3 indicate that target latencies cannot discriminate between-group means, the analysis is limited to target amplitudes.
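The cut-score selection can be illustrated with a short sketch. The Youden index is J = sensitivity + specificity − 1, and the code keeps the smallest threshold at which J is maximal, as described above. The decision rule (classify as Converter when the amplitude is at or below the threshold) is an assumption motivated by the smaller amplitudes seen in Converters; the function name and the data are hypothetical.

```python
def youden_cut(converters, stables):
    """Return (threshold, J) for the smallest threshold maximizing J.

    A participant is called a Converter when amplitude <= threshold
    (assumed decision rule).
    """
    best_j, best_threshold = -1.0, None
    for threshold in sorted(set(converters) | set(stables)):
        sensitivity = sum(c <= threshold for c in converters) / len(converters)
        specificity = sum(s > threshold for s in stables) / len(stables)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # strict inequality keeps the smallest threshold on ties
            best_j, best_threshold = j, threshold
    return best_threshold, best_j

# Invented amplitudes (microvolts), for illustration only.
converters = [2.0, 3.0, 4.0, 9.0]
stables = [5.0, 6.0, 7.0, 8.0, 10.0]
threshold, j = youden_cut(converters, stables)
print(threshold, j)  # 4.0 0.75
```

For the reported Cz result, Sensitivity = 0.80 and Specificity = 0.87 correspond to J = 0.67.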

Leave-One-Out Cross Validation

The results presented in Table 4 are encouraging, particularly in the cases of average Cz amplitude and average C4 amplitude, which give sensitivity and specificity values of at least 0.8. Measures of diagnostic efficacy obtained by examination of the ROC can be misleadingly optimistic if sample sizes are small. A fast, albeit imperfect, reality check can be implemented by a leave-one-out cross validation. In this calculation, one of the values is removed from the sample. A between-group classifier is constructed from the remaining data, and the omitted value is classified. It is then replaced. Another value is removed and classified. This procedure continues to exhaustion, and the classification results are used to populate the diagnostic matrix (true positive, false positive, false negative, true negative). The measures of diagnostic efficacy introduced in the previous section are then calculated.

In order to implement a leave-one-out cross validation the choice of classifier must be addressed. In these calculations, a classifier based on Gaussian populations with prior probabilities was used. The mathematical structure of the classifier is given in the Mathematical Appendices. Two sets of prior probabilities were considered. In the first set of calculations, equal priors were used. In the second, it was supposed that the prior probability of delayed-onset PTSD was 0.25 which is the value derived from a review of the clinical literature (19, 20).
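The leave-one-out procedure with a Gaussian classifier and prior probabilities can be sketched as follows. This is one plausible reading under stated assumptions: each group is modeled by a univariate Gaussian with its own sample mean and standard deviation, and a value is assigned to the group with the larger log prior plus log likelihood. The exact classifier in the Mathematical Appendices may differ; all names and data here are hypothetical.

```python
import math
from statistics import mean, stdev

def gaussian_log_pdf(x, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)

def is_converter(x, converters, stables, prior_converter=0.5):
    """True when the Converter class has the larger posterior score."""
    score_c = math.log(prior_converter) + gaussian_log_pdf(
        x, mean(converters), stdev(converters))
    score_s = math.log(1.0 - prior_converter) + gaussian_log_pdf(
        x, mean(stables), stdev(stables))
    return score_c > score_s

def leave_one_out(converters, stables, prior_converter=0.5):
    """Return (tp, fp, fn, tn) from leave-one-out classification."""
    tp = fp = fn = tn = 0
    for i, x in enumerate(converters):
        rest = converters[:i] + converters[i + 1:]  # classifier never sees x
        if is_converter(x, rest, stables, prior_converter):
            tp += 1
        else:
            fn += 1
    for i, x in enumerate(stables):
        rest = stables[:i] + stables[i + 1:]
        if is_converter(x, converters, rest, prior_converter):
            fp += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Invented, well-separated amplitudes (microvolts), for illustration only.
converters = [1.0, 1.2, 0.8, 1.1, 0.9]
stables = [5.0, 5.5, 4.5, 5.2, 4.8, 5.1]
print(leave_one_out(converters, stables))                        # equal priors
print(leave_one_out(converters, stables, prior_converter=0.25))  # literature prior
```

With only five Converters, each left-out Converter is classified by a model fitted to the other four, so every single value has a large influence on the estimated mean and standard deviation.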

With both sets of prior probabilities, the sensitivity and specificity values are considerably less encouraging (Table 5). In the previous calculations, the sensitivity and specificity obtained at Cz are 0.80 and 0.87, respectively. In the leave-one-out calculation using equal priors, the corresponding values are 0.60 and 0.65. Similarly, the previous sensitivity and specificity results obtained at C4 were 0.80 and 0.90, respectively. The leave-one-out values with equal priors are 0.80 and 0.62. This divergence counsels interpretive caution when evaluating the results summarized in Table 4.

Table 5. Classification based on average target amplitudes determined by a leave-one-out calculation.

Populating the Diagnostic Matrix by Bootstrapping

A deficiency of the results presented in the previous section is immediately apparent on examining Table 5. The sensitivities and specificities are reported without confidence intervals. This deficiency can be addressed with a bootstrap calculation. The procedure is outlined in the Mathematical Appendices. Two thousand bootstrap samples were used to estimate the bootstrapped distribution. The results are shown in Table 6. The confidence intervals provide an essential clarification to the preceding results. The sample size precludes a dispositive response to the hypothesis that the amplitudes of average ERPs can serve as a predictor of delayed-onset PTSD.

Table 6. Classification based on average target amplitudes determined by a bootstrap calculation.

The confidence intervals reported for sensitivity values, [0,1] in all cases, are particularly telling. The definition of sensitivity is

$$\text{Sensitivity} = \text{True Positive Ratio} = \frac{N_{TP}}{N_{TP} + N_{FN}}$$

where N_TP is the number of true positives and N_FN is the number of false negatives. There are only five elements in the Converter set, and two of these elements are used to build the classifier. Therefore, N_TP is frequently zero, giving Sensitivity = 0. Similarly, in other cases N_TP ≠ 0 and N_FN = 0, giving Sensitivity = 1 as another frequent value. This results in a bootstrapped confidence interval of [0,1].
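The discreteness of bootstrapped sensitivity with five Converters can be explored with a small simulation. The sketch below is not the procedure of the Mathematical Appendices: it assumes an out-of-bag scheme in which each group is resampled with replacement to fit a Gaussian classifier with equal priors, and the Converters not drawn are then classified. All names and amplitudes are hypothetical.

```python
import math
import random
from statistics import mean, stdev

def log_gauss(x, mu, sigma):
    """Gaussian log density up to an additive constant."""
    return -math.log(sigma) - (x - mu) ** 2 / (2.0 * sigma ** 2)

def bootstrap_sensitivity(converters, stables, n_boot=2000, seed=0):
    """Percentile 95% interval for out-of-bag sensitivity (assumed scheme)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        drawn = [rng.randrange(len(converters)) for _ in converters]
        in_bag = set(drawn)
        train_c = [converters[i] for i in drawn]
        test_c = [c for i, c in enumerate(converters) if i not in in_bag]
        train_s = [rng.choice(stables) for _ in stables]
        # Skip resamples where sensitivity or the classifier is undefined.
        if not test_c or len(set(train_c)) < 2 or len(set(train_s)) < 2:
            continue
        mu_c, sd_c = mean(train_c), stdev(train_c)
        mu_s, sd_s = mean(train_s), stdev(train_s)
        n_tp = sum(log_gauss(x, mu_c, sd_c) > log_gauss(x, mu_s, sd_s)
                   for x in test_c)
        values.append(n_tp / len(test_c))
    if not values:
        raise ValueError("no usable bootstrap resamples")
    values.sort()
    lower = values[int(0.025 * (len(values) - 1))]
    upper = values[int(0.975 * (len(values) - 1))]
    return lower, upper

# Invented, overlapping amplitudes (microvolts), for illustration only.
converters = [6.1, 7.4, 8.0, 9.3, 11.2]
stables = [7.9, 9.5, 10.4, 11.0, 11.8, 12.6, 13.1, 14.0, 15.2, 16.7]
lower, upper = bootstrap_sensitivity(converters, stables)
print(lower, upper)
```

Because each resample leaves only a handful of Converters to classify, the per-resample sensitivity can take only a few values such as 0, 1/2, or 1, which is why the percentile interval tends to be very wide.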

Discussion

In this analysis, the identification of individuals who will present delayed-onset PTSD is treated as a diagnostic process where the diagnostic groups are Converters (those who present delayed-onset PTSD) and Stables (those who do not). Sensitivity values based on average target stimulus amplitude range from 0.58 to 0.68. Specificity values range from 0.61 to 0.70, suggesting that event-related potentials may be helpful in identifying at-risk individuals.

The results in this study can only be considered preliminary due to the small sample size of Converters. The limitations of the sample size are indicated by the calculations presented in Table 6. Suppose the objective is to know sensitivity to an accuracy of ±0.1 with 95% confidence. A calculation given in the Mathematical Appendices indicates that N ≥ 185 is required, where it must be emphasized that this N is the number of Converters. If Converters are 10% of the population, then the projected requirement is for 1,850 participants in the study. The implications of this simple calculation extend beyond the study of PTSD and generalize to all of neuropsychiatry where conversion rates even in enriched populations are low. Large participant numbers will be required. Additionally, by definition, the search for prodromes requires a longitudinal study extended, perhaps, over a period of years. The challenges of supporting and implementing very large longitudinal studies are formidable.
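
The arithmetic behind these figures can be checked with a short sketch. It assumes a distribution-free Hoeffding bound, under which an estimated proportion lies within ±ε of its true value with probability at least 1 − α once N ≥ ln(2/α)/(2ε²). This bound reproduces N ≥ 185 for ε = 0.1 and α = 0.05, but whether the Mathematical Appendices use exactly this bound is an assumption. The function names are hypothetical.

```python
import math
from fractions import Fraction


def hoeffding_sample_size(half_width, alpha):
    """Smallest n satisfying 2 * exp(-2 * n * half_width**2) <= alpha."""
    return math.ceil(math.log(2.0 / alpha) / (2.0 * half_width ** 2))


def required_participants(n_converters, conversion_rate):
    """Total enrollment needed to expect n_converters at the given conversion rate."""
    return math.ceil(n_converters / conversion_rate)


# Sensitivity known to within +/-0.1 with 95% confidence.
n_converters = hoeffding_sample_size(half_width=0.1, alpha=0.05)
# Converters assumed to be 10% of the enrolled population, as in the text.
n_total = required_participants(n_converters, Fraction(1, 10))
print(n_converters, n_total)  # 185 1850
```

At the 7.7% conversion rate observed in this cohort, the same calculation would call for an even larger enrollment.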

Further limitations should be acknowledged. Electrophysiological abnormalities associated with neuropsychiatric disorders are non-specific. For example, in addition to PTSD, alterations in EEG synchronization have been observed in AD/HD, alcohol abuse, alexithymia, autism, bipolar disorder, dementia, depression, migraine, multiple sclerosis, Parkinson’s disease, TBI, schizophrenia, and other psychotic disorders (60). The potential loss of electrophysiological specificity is particularly likely in a military population where PTSD is often associated with TBI and is comorbid with depression and substance abuse. Additionally, medications can alter event-related potentials and will complicate diagnosis based on ERPs.

Statistical identification of individuals who will present with PTSD might, however, be improved by two extensions to the present analysis. First, the analysis of ERPs reported here was limited to calculation of average ERPs. More recently developed methods of analysis, for example, information dynamics (61) and network analysis of brain electrical activity (62), might improve results. Second, specificity and sensitivity may be improved by combining electrophysiological measures with other biomarkers and clinical information. Incorporating scores from psychological questionnaires with electrophysiological results in a multivariate discrimination would be an obvious possibility. The psychological measures, including CAPS, PHQ-9, and PCL-M scores, showed significant differences between Stables and Converters at the baseline assessment, but none of the scores correlated significantly with the P300 amplitude. The discordance between neural responses and self-reported symptoms may be partially a consequence of psychological defensive denial (63, 64). Some SMs recruited in this study may have denied the presence of their PTSD symptoms because of military training or concerns that disclosure might jeopardize their job, promotion, and self-image. This defensive denial may soften after a prolonged period. Consistent with this possibility, a review by Andrews et al. (19) reported that most delayed-onset PTSD cases occurred in military samples rather than in civilian samples. If this is the case, objective biomarkers would be fundamentally more favorable than self-report psychological measures in identifying SMs at risk of PTSD.

While additional forms of electrophysiological analysis in combination with other classes of data may improve the likelihood of success, this will not eliminate the previously documented requirement for large sample sizes in a longitudinal study. Detection of at-risk individuals would nonetheless be critical to the military because research on early intervention to prevent PTSD has revealed a critical window for fear activation and extinction of conditioned responses related to traumatic memories (65).

Ethics Statement

This research protocol was approved by the Institutional Review Board of the Uniformed Services University and by the Institutional Review Board of the Walter Reed National Military Medical Center. All participants gave written informed consent in accordance with the Declaration of Helsinki.

Author Contributions

CW performed the analysis of the event-related potentials and the preliminary statistical analysis. MC screened participants for eligibility and conducted the psychological assessments. PR performed the literature search and statistical analysis and wrote the final drafts of the paper. DD participated in developing and implementing the statistical analysis plan. KB and DN obtained the electrophysiological data. CC designed and built the ERP acquisition system. MR participated in the design of the investigation. DK led the research effort and participated in acquisition of the electrophysiological data.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We would like to acknowledge support from the Uniformed Services University, the Defense Medical Research and Development Program, and the Center for Neuroscience and Regenerative Medicine. The opinions and assertions contained herein are the private opinions of the authors and are not to be construed as official or reflecting the views of the United States Department of Defense.

Funding

Funding was provided by the Center for Neuroscience and Regenerative Medicine Project 351030 and by the Defense Medical Research and Development Program Project D10_1_AR_J5_605.

References

1. Costello EJ, Angold A. Developmental transitions to psychopathology: are there prodromes of substance use disorder? J Child Psychol Psychiatry (2010) 51(4):526–32. doi: 10.1111/j.1469-7610.2010.02221.x
