Impaired self-awareness after traumatic brain injury: inter-rater reliability and factor structure of the Dysexecutive Questionnaire (DEX) in patients, significant others and clinicians

Aims: This study sought to address two questions: (1) what is the inter-rater reliability of the Dysexecutive Questionnaire (DEX) when completed by patients, their significant others, and clinicians; and (2) does the factor structure of the DEX vary for these three groups? Methods: We obtained DEX ratings for 113 patients with an acquired brain injury from two brain injury services in the UK and two services in Ireland. We gathered data from two groups of raters—“significant others” (DEX-SO) such as partners and close family members and “clinicians” (DEX-C), who were psychologists or rehabilitation physicians working closely with the patient and who were able to provide an opinion about the patient’s level of everyday executive functioning. Intra-class correlation coefficients and their 95% confidence intervals were calculated between each of the three groups (self, significant other, clinician). Principal axis factor (PAF) analyses were also conducted for each of the three groups. Results: The factor analysis revealed a consistent one-factor model for each of the three groups of raters. However, the inter-rater reliability analyses showed a low level of agreement between the self-ratings and the ratings of the two groups of independent raters. We also found low agreement between the significant others and the clinicians. Conclusion: Although there was a consistent finding of a single factor solution for each of the three groups, the low level of agreement between significant others and clinicians raises a question about the reliability of the DEX.


INTRODUCTION
Impaired self-awareness is a common cognitive deficit after traumatic brain injury and can lead to problems with self-monitoring and behavioral self-regulation (McBrinn et al., 2008). These problems may contribute to difficulties undertaking many everyday functions such as engaging in interpersonal communication, budgeting, household chores and carrying out vocational activities (Godbout et al., 2005). The cognitive capacities associated with self-awareness and self-regulation are considered to be part of the executive system in the frontal lobes of the brain. Executive functions include cognitive operations that contribute to the ability to initiate, inhibit and integrate other functions, variously termed supervisory, attentional or control processes (Shallice and Burgess, 1991; Miyake et al., 2000; Stuss and Alexander, 2007).
A cluster of symptoms associated with these functions is thought to present in "dysexecutive syndrome", the central component of the cluster being impairment in self-awareness and self-regulation. This impairment is assumed to arise from damage to critical areas of the brain that are integral to behavioral self-regulation, typically the frontal lobes. There is, however, growing recognition that self-awareness is a highly complex and multifaceted process that is not exclusive to the frontal lobes. Efforts to identify specific brain areas that may be responsible for self-monitoring and self-regulation have led researchers to acknowledge that multiple pathways may be involved: impaired self-awareness does not appear to be linked exclusively to focal or generalized brain damage or to specific neurocognitive test profiles (Philippi et al., 2012; Caldwell et al., 2014; Ham et al., 2014).
While the study of the underlying processes involved in impaired self-regulation continues, clinicians agree that the capacity for self-monitoring and behavioral self-regulation is important to successful rehabilitation outcomes after brain injury (Winkens et al., 2014). To that end, clinicians are reliant on existing psychometric tools for identifying and quantifying impaired self-regulation. However, measurement of executive ability is challenging as executive function tests are not process-pure: they will invariably and unavoidably involve other non-executive functions that may be variously spared or compromised after brain injury. One commonly used measure of the behavioral manifestation of dysexecutive impairment is the Dysexecutive Questionnaire (DEX; Wilson et al., 1997). The DEX is purported to be an ecologically valid test; that is, it provides an estimation of executive function as applied to everyday life challenges. The interpretation of the DEX score is based on the difference between the client's self-report and the report of another person who knows the client well, with any resultant discrepancy assumed to reflect a lack of self-awareness in the brain-injured person.
As a relatively quick and easy questionnaire to complete, the DEX offers an appealing method of quantifying a complex neuropsychological process. However, the utility of the test relies on two important premises: (1) the third party respondent can give a true and accurate account of the injured person's functioning; and (2) the psychometric validity of the measurement tool is constant across users (i.e., both client and independent rater "versions" are measuring the same construct or factor[s]). Each of these premises is considered briefly.
Regarding the first premise, there is certainly evidence that patient self-reports differ from the reports of their significant others. This finding is not unexpected as the DEX is designed to identify discrepancies in scores that may reflect impairment in self-awareness in people following brain injury. However, some evidence suggests that independent raters may not respond in a similar way about the same person. For example, a study of the inter-rater reliability of the ratings of family members found a low level of agreement among three independent raters reporting about the same individual with a brain injury (Barker et al., 2011). The authors concluded that all raters do not respond in a comparable manner and, thus, it would be erroneous for clinicians to assume that DEX ratings by significant others are always accurate (Barker et al., 2011). The problem of ascertaining whether a rating by a family member is accurate is more complex than it may appear. For example, if one does not use independent ratings of the level of impaired awareness of the person with brain injury, then the other main source of information is objective neuropsychological data. However, the situation here is far from clear: there is not a direct correlation between overall severity of cognitive impairment and level of impaired self-awareness or between scores on specific tests of executive function and level of impaired self-awareness (Barker et al., 2004). Thus, there are ongoing questions regarding the precise nature of impaired self-awareness, its link to overall executive functioning, and the extent to which the construct can be measured by existing questionnaire-based tools.
With respect to the second premise of whether the DEX self-rated questionnaire measures the same construct(s) as the DEX completed by independent others, several studies have examined the factor structure of the DEX focusing on DEX self-ratings. Variable findings have been obtained. For instance, a study using a large community sample of more than 1100 people identified a single underlying factor (Gerstorf et al., 2008). Conversely, a study using non-clinical (N = 293) and clinical (N = 49) samples found a 4-factor solution with factors best described as inhibition, intention, social regulation, and abstract problem-solving (Mooney et al., 2006). A 4-factor model was also identified by Bodenburg and Dopslaff (2008); however, based on different loadings, their interpretations of these factors were: initiating and sustaining actions, impulse control, psychophysical and mental excitability, and social conventions. A study of the factor structure of the DEX in the context of normal aging (Amieva et al., 2003) identified a 5-factor solution: intentionality, interference management, inhibition, planning, and social regulation. Thus, substantial variability is evident in the dimensionality of the DEX.
Only one previous study has tested the factor structure of the DEX amongst independent raters. Using the significant others of 46 adults with varying neurological conditions, that study obtained a 3-factor solution described as behavioral inhibition, goal-directed behavior/intentionality, and executive memory/cognition (Chaytor and Schmitter-Edgecombe, 2007). No studies have yet examined the factor structure of the DEX when completed by independent raters who are reporting about the degree of impairment associated with acquired brain injury. Further, no previous study has compared the factor structure of the DEX when completed by two or more independent raters in relation to the same patient. The fundamental questions addressed by this study are: (1) what are the levels of inter-rater consistency when the DEX is completed by patients, significant others, and clinicians; and (2) does the dimensionality of the DEX vary as a function of the individuals completing it (e.g., client vs. clinician)?

METHOD

MEASURES
The Behavioral Assessment of the Dysexecutive Syndrome (BADS) is considered an ecologically valid, multidimensional measure of executive function comprising six sub-tests and a questionnaire which probes symptoms of dysexecutive syndrome, called the DEX (Wilson et al., 1997). The DEX is a 20-item questionnaire which the authors describe as having three factors assessing everyday changes in cognition, emotion and behavior after an acquired brain injury or other brain trauma. The DEX is completed by the patient (self-rating: DEX-S) and by a person who knows the patient well (independent rater). In this study, we gathered data from two groups of independent raters: "significant others" (DEX-SO) such as partners and close family members and "clinicians" (DEX-C), who were psychologists or rehabilitation physicians working closely with the patient and who were able to provide an opinion about the patient's level of everyday executive functioning.
Ethical Approval: Each participant and their significant other provided consent to take part and each of the participating services received ethical approval from their local institutional research ethics committee.
Frontiers in Behavioral Neuroscience | www.frontiersin.org | October 2014 | Volume 8 | Article 352

PARTICIPANTS
The number of patients included in this study was 113 (87 males, M age = 37.77, SD = 12.76; 26 females, M age = 38.96, SD = 12.06) from two brain injury services in the UK and two services in Ireland. The participants were identified by the service managers and clinicians by virtue of being a client of the service and meeting the inclusion criteria. Inclusion criteria for the study were: being 18 years or older; having experienced an acquired brain injury; having sufficient cognitive and physical ability to give informed consent to participate; and being able to read and respond to the questionnaires. Exclusion criteria were major psychiatric illness or cognitive impairment of such severity that it would prevent the ability to consent and/or to respond to the questionnaires. None of those identified by the services as potentially suitable participants refused to participate. In each center, an unspecified number of patients were deemed by the clinician or service manager not to meet the inclusion criteria. The sample in the study would, in the authors' view, be considered typical of those accessing brain injury support services in the UK and Ireland with moderately severe brain injury. All were in the post-acute phase of rehabilitation, typically receiving support services focused on optimizing independent functioning. The mean duration of injury was 57.49 months (SD = 44.24), with minimal and maximal time periods of 10 months and 168 months, respectively. The median for duration was 36 months (25th percentile = 24 months; 75th percentile = 84 months). Type of injury data indicated that an overwhelming majority of clients had experienced traumatic head injuries (95%). Finally, with respect to current occupation, the most commonly selected options were: currently unemployed (24.1%), supported training/employment (20.4%) and retired (20.4%).

RESULTS
Cronbach's alpha coefficients and their 95% confidence intervals suggest excellent scale score reliability within each respondent group. However, upper bound estimates, particularly for the DEX-SO and DEX-C, suggest that item redundancy may be of concern (Streiner, 2003). As noted in previous research, DEX-SO scores were higher than DEX-S ratings, although this difference was not statistically significant (LSD, p = 0.07). DEX-C scores were lowest of all and differed significantly from DEX-SO scores (LSD, p = 0.001).
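For readers less familiar with the statistic, the following is a minimal sketch of how Cronbach's alpha is commonly computed from a respondents-by-items score matrix. It is not the authors' analysis code; the function name `cronbach_alpha` and the small `demo` matrix are illustrative assumptions, and the ratings are made up.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) matrix of item scores."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]                                # number of items
    item_var_sum = x.var(axis=0, ddof=1).sum()    # sum of the item variances
    total_var = x.sum(axis=1).var(ddof=1)         # variance of the total score
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Made-up ratings for illustration only: 5 respondents x 4 items.
demo = np.array([
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [4, 4, 3, 4],
])
alpha = cronbach_alpha(demo)
print(f"alpha = {alpha:.3f}")
```

Alpha approaches 1 as items become interchangeable, which is why very high values can signal item redundancy rather than only good reliability.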
Intra-class correlation coefficients and their 95% confidence intervals were calculated between DEX-S and DEX-SO items; DEX-S and DEX-C items; and DEX-SO and DEX-C items. This analysis permits one to determine the degree of consistency between self-, significant other, and clinician ratings, with ICC values >0.74 representing an excellent level of agreement; values between 0.60 and 0.74 reflecting good agreement; and values between 0.40 and 0.59 representing fair agreement. Absolute agreement ICCs were estimated using a one-way random effects model (see Table 2).
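As an illustration of the computation described above, the sketch below estimates a one-way random effects ICC from a patients-by-raters matrix and maps it onto the agreement bands quoted in the text. It assumes the single-measure form, often written ICC(1,1); the source does not state whether single or average measures were used. The function names and the example ratings are hypothetical.

```python
import numpy as np


def icc_oneway(ratings):
    """Single-measure, one-way random effects ICC for an
    (n_targets x k_raters) matrix of ratings."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    target_means = x.mean(axis=1)
    # Between-target and within-target mean squares from a one-way ANOVA.
    ms_between = k * np.sum((target_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((x - target_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def agreement_band(icc):
    """Label an ICC using the cut-offs given in the text; values
    below 0.40 are treated as poor."""
    if icc > 0.74:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


# Made-up ratings of one DEX item: 6 patients, 2 raters (e.g., self vs. other).
example = np.array([[0, 1], [1, 1], [2, 3], [3, 2], [4, 4], [2, 2]])
icc = icc_oneway(example)
print(f"ICC = {icc:.2f} ({agreement_band(icc)})")
```

Because the one-way model pools rater differences into the error term, systematic over- or under-rating by one group lowers the coefficient, which is what an absolute-agreement index is meant to capture.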
The average level of agreement between self- and significant other ratings was 0.41 (SD = 0.09). The averages for self- and clinician ratings and significant other and clinician ratings were 0.15 (SD = 0.09) and 0.31 (SD = 0.13), respectively. Post-hoc testing revealed that these averages differed significantly: self and significant other vs. self and clinician (LSD, p < 0.001); self and significant other vs. significant other and clinician (LSD, p < 0.01); self and clinician vs. significant other and clinician (LSD, p < 0.001). Importantly, the average level of agreement between self and significant other ratings was at the bottom end of the stratum denoting "fair agreement". The remaining averages were poor. These findings suggest there is only nominal consistency in ratings on the DEX among patients, their caregivers, and clinicians.
To assess the dimensionality of the DEX when completed by patients, significant others and clinicians, three principal axis factor (PAF) analyses were conducted. This factor analytic technique is recommended when data have the potential to be non-normally distributed (Finch and West, 1997). Diagnostics, such as the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity, were conducted for each PAF analysis and deemed to be satisfactory (i.e., KMO exceeded 0.90 and Bartlett's test was statistically significant, permitting one to reject the null hypothesis that associations among the DEX items may be represented as an identity matrix). Parallel analysis and inspection of the unrotated factor solution were used to assist with factor retention.
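The logic of parallel analysis can be sketched as follows: eigenvalues from the observed correlation matrix are compared with eigenvalues from random data of the same dimensions, and factors are retained only while the observed value exceeds the random benchmark. This is a simplified illustration, not the authors' procedure. It uses the full correlation matrix rather than the reduced matrix of a PAF, a 95th-percentile benchmark, and simulated one-factor data; all names and parameter values here are assumptions.

```python
import numpy as np


def sorted_eigenvalues(data):
    """Eigenvalues of the item correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data, n_sims=100, percentile=95, seed=0):
    """Return observed eigenvalues, random-data benchmarks, and the
    number of leading factors whose eigenvalue exceeds its benchmark."""
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    rng = np.random.default_rng(seed)
    observed = sorted_eigenvalues(x)
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        simulated[i] = sorted_eigenvalues(rng.standard_normal((n, p)))
    benchmark = np.percentile(simulated, percentile, axis=0)
    n_retain = 0
    for obs, ref in zip(observed, benchmark):
        if obs <= ref:
            break
        n_retain += 1
    return observed, benchmark, n_retain


# Simulated data for illustration: 113 respondents, 20 items, one common factor.
rng = np.random.default_rng(42)
factor = rng.standard_normal((113, 1))
items = 0.7 * factor + np.sqrt(1 - 0.7 ** 2) * rng.standard_normal((113, 20))
observed, benchmark, n_retain = parallel_analysis(items)
print("factors retained:", n_retain)
```

A strict cut-off like this one is only a convention; in the study, the retention decision also drew on inspection of the unrotated solution.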
When completed by patients, a one-factor solution appeared to best represent the data (i.e., the unrotated solution revealed that no items loaded uniquely on any factor besides the first one and there was a negligible difference between the eigenvalue associated with factor 2 [1.09] for the random data and the eigenvalue associated with factor 2 [1.49] for the real data). The eigenvalue associated with the first factor was 8.46 (42.29% of the variance accounted for). Factor loadings ranged from 0.33 to 0.78 (see Table 3).

Table 2 | Intra-class correlation coefficients and their 95% confidence intervals between DEX-Self (DEX-S), DEX-Significant Other (DEX-SO) and DEX-Clinician (DEX-C). Columns: DEX item; DEX-S and DEX-SO; DEX-S and DEX-C; DEX-SO and DEX-C.
A similar solution emerged when significant others completed the DEX. Specifically, one factor appeared to best represent the data (eigenvalue = 10.46, accounting for 52.32% of the variance). The factor loadings ranged from 0.58 to 0.83 (see Table 3). Finally, a one-factor solution also was optimal for the clinicians (eigenvalue = 12.54, accounting for 62.72% of the variance). For this group, factor loadings ranged from 0.49 to 0.92.
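The percentages of variance reported for the first factor follow from dividing each eigenvalue by the number of DEX items (20), since each standardized item contributes one unit of variance. The short check below illustrates this arithmetic; the variable names are illustrative, and the small differences from the reported percentages presumably reflect rounding of the published eigenvalues.

```python
N_ITEMS = 20  # the DEX has 20 items

# First-factor eigenvalues and percentages as reported in the text.
reported = {
    "DEX-S": (8.46, 42.29),
    "DEX-SO": (10.46, 52.32),
    "DEX-C": (12.54, 62.72),
}


def percent_variance(eigenvalue, n_items=N_ITEMS):
    """Share of total standardized variance carried by one factor."""
    return 100.0 * eigenvalue / n_items


for group, (eigenvalue, reported_pct) in reported.items():
    computed = percent_variance(eigenvalue)
    print(f"{group}: computed {computed:.2f}%, reported {reported_pct:.2f}%")
```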

DISCUSSION
This study sought to address two questions: (1) what is the inter-rater reliability of the DEX when completed by patients, their significant others, and clinicians; and (2) what is the factor structure of the DEX for these three groups?
Results suggest there is only nominal agreement in item ratings on the DEX among patients, their caregivers, and clinicians. The fact that self-ratings and ratings by others differ is not surprising: the very purpose of the measure is to detect a lack of self-awareness in people with brain injury, operationalized as a discrepancy between the patient and their significant other. However, it is concerning that there is a large discrepancy in the ratings of other people who know the patient well: significant others and clinicians attributed quite variable scores across the range of items, indicating a low level of agreement between raters. This finding is in keeping with previous research which showed that the DEX ratings of significant others (mainly family members rather than clinicians) were variable (Barker et al., 2011).
The fact that third party raters can differ quite significantly when reporting about the same individual raises an important question about the reliability of the DEX. It could be suggested that clinician respondents might, by virtue of their professional training, be able to provide a more accurate appraisal of the level of executive function impairment. This is difficult to confirm, however, since clinical judgment is inherently subjective. In the authors' experience, neuropsychological testing may also not be especially helpful in this regard, as performance on tests of executive function does not always correlate strongly with functional ability (Chaytor et al., 2006; Razani et al., 2007). It may be the case that the best and, perhaps, only reliable way to measure executive function impairment in everyday situations is through a combination of behavioral, task-based measures such as the Multiple Errands Test (Shallice and Burgess, 1991) and a consensus-based response to the DEX where a discussion of the individual items between the respondents may lead to a more accurate description of the problems encountered by the person with brain injury. The feasibility of including ecologically-valid behavioral testing has been improved with the development of virtual reality-based technologies. For example, a virtual reality version of the Multiple Errands Test (Raspelli, 2014) has been developed and offers the potential to measure real-life challenges coupled with the convenience of being able to do the assessment within a clinical setting.
With respect to the dimensionality of the DEX, PAF analysis suggested that a single factor offered the best fit for all three groups. This finding indicates that DEX items are best construed as representing a single construct of executive dysfunction. It should be noted that other researchers have identified a similar factor structure. For instance, using two independent samples of community-dwelling persons, Gerstorf et al. (2008) identified a single factor solution as being optimal for the self-rated version. Specifically, these authors report that "independent of specifying an orthogonal or oblique solution, we found that the eigenvalue for one factor was consistently above 7 whereas four or five other factors could have been extracted but their eigenvalues were only marginally larger than 1" (pp. 432-433). We observed a similar outcome across three different categories of respondent: self, significant other, and clinician.
To our knowledge, only one other study (Chaytor and Schmitter-Edgecombe, 2007) has examined the factor structure of the DEX when completed by third party respondents (N = 46). These researchers identified a five-component solution, with the first three components corresponding well with the inhibition, intentionality, and executive memory factors specified in other psychometric studies assessing the self-rated version. The authors conclude that these three components appear to be replicable whereas components 4 and 5 are, perhaps, idiosyncratic (i.e., components unique to the specific sample being tested). However, the validity of their three-component interpretation may be questioned. First, the authors appear to have relied on the "eigenvalue greater than 1 rule", which many have argued is among the least accurate methods for identifying factor/component retention (i.e., it often results in over-extraction) (Costello and Osborne, 2005). Second, although the authors do not provide the intercorrelations among the components, based on the study conducted by Gerstorf and associates (2008), it is possible they are of sufficient magnitude so as to suggest redundancy (i.e., for the factors representing inhibition, intentionality, and executive memory, r-values obtained by Gerstorf et al. (2008) ranged from 0.93 to 0.99).
As with all studies, the current investigation possesses certain limitations that warrant discussion. First, the number of participants recruited was modest, especially for the clinician subsample. It should be noted, however, that other researchers have published psychometric assessments of the DEX using similar (or smaller) numbers of participants (e.g., N = 20 (Amieva et al., 2003); N = 46 (Chaytor and Schmitter-Edgecombe, 2007); N = 93 (Bachmann et al., 2008)). Further, MacCallum et al. (1999) demonstrate that "rules of thumb" about sample size are less important than the degree to which a factor solution is characterized by factor over-determination (i.e., the number of indicators per factor, with a common ratio being 5:1) and strong communality values (i.e., the proportion of variance in each item accounted for by the extracted factor[s]). In the current study, communality estimates were variable; strong over-determination was present (i.e., p [variable]: r [factor] ratio was 20:1); and, for the smallest subsample (clinician group), 19 of the 20 variables had large structure coefficients (>0.60), suggesting that one can be reasonably confident in the reproducibility of the obtained factor solutions. Larger samples are clearly needed, however, if one were to conduct subgroup analyses based on variables such as type of injury, gender of patient, or relationship between patient and significant other (e.g., spouse vs. sibling).
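To make the communality and over-determination points concrete: in a single-factor solution, each item's communality equals its squared loading, so the loading ranges reported in the Results imply the communality ranges computed below. This is an illustrative calculation from the published ranges only; item-level communalities are not given in this text, and the variable names are assumptions.

```python
# Minimum and maximum factor loadings reported for each rater group.
loading_ranges = {
    "DEX-S": (0.33, 0.78),
    "DEX-SO": (0.58, 0.83),
    "DEX-C": (0.49, 0.92),
}


def communality(loading):
    """Communality of an item in a one-factor solution (squared loading)."""
    return loading ** 2


communality_ranges = {
    group: (communality(low), communality(high))
    for group, (low, high) in loading_ranges.items()
}

# Over-determination: indicators per factor (20 items, 1 factor).
N_ITEMS, N_FACTORS = 20, 1
indicator_ratio = N_ITEMS / N_FACTORS

for group, (low, high) in communality_ranges.items():
    print(f"{group}: communalities from {low:.2f} to {high:.2f}")
print(f"indicators per factor: {indicator_ratio:.0f}:1")
```

The spread, from roughly 0.11 to 0.85 across groups, is consistent with the description of the communality estimates as variable.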
Another limitation pertains to the small set of variables that were measured. Gerstorf et al. (2008), for example, assessed a host of individual difference variables including neuroticism, depression, subjective health, trait anxiety, positive and negative affect, and cognitive functioning. In the current study, as only the DEX and a small number of sociodemographic items (e.g., age) were used, the convergent validity of this instrument when completed by patients, significant others and clinicians could not be tested. Future studies might consider the use of alternative methodologies, such as those used in clinical judgment studies (e.g., Bachmann et al., 2008), to look at the cues and weightings used by respondents to arrive at their judgments regarding the presence and extent of any difficulties in executive functioning.
In conclusion, our dimensionality evaluation suggests that the DEX is best construed as a single factor measure of dysexecutive syndrome. The inter-rater reliability analysis suggests that there is a low level of agreement in item ratings on the DEX among patients, their caregivers, and clinicians. The fact that evaluations by two raters are not highly correlated in reference to the same patient raises a question about this element of the reliability of the DEX. While it is well recognized that executive function deficits occur frequently after traumatic brain injury and that these are often associated with impaired self-awareness, we are as yet limited in our ability to measure and quantify these impairments. The difficulties arising from measuring deficits in executive function also present challenges in how best to involve patients in aspects of their own rehabilitation such as patient-determined goals and outcomes (Hogan et al., 2013) when self-awareness is compromised.