Mood Instability and Irritability as Core Symptoms of Major Depression: An Exploration Using Rasch Analysis

Background: Mood instability (MI) and irritability are related to depression but are not considered core symptoms. Instruments typically code clusters of symptoms that are used to define syndromic depression, but the place of MI and irritability has been under-investigated. Whether they are core symptoms can be examined using Rasch analysis.
Method: We used the UK Psychiatric Morbidity Survey 2000 data (n = 8,338) to determine whether the nine ICD/DSM symptoms, plus MI and irritability, constitute a valid depression scale. Rasch analysis was used, a method concerned with ensuring that items constitute a robust scale and tests whether the count of symptoms reflects an underlying interval-level measure. Two random samples of 500 were drawn, serving as calibration and validation samples. As part of the analysis, we examined whether the candidate symptoms were unidimensional, followed a Guttman pattern, were locally independent, invariant with respect to age and sex, and reliably distinguished different levels of depression severity.
Results: A subset of five symptoms (sad, no interest, sleep, cognition, suicidal ideas) together with mood instability and irritability satisfactorily fits the Rasch model. However, these seven symptoms do not separate clinically depressed persons from the rest of the population with adequate reliability (Cronbach α = 0.58; Person Separation Index = 0.35), but could serve as a basis for scale development. Likewise, the original nine DSM depression symptoms failed to achieve satisfactory reliability (Cronbach α = 0.67; Person Separation Index = 0.51).
Limitations: The time frame over which symptoms were experienced varied, and some required recall over the last year. Symptoms other than those examined here might also be core depression symptoms.
Conclusion: Mood instability and irritability are candidate core symptoms of the depressive syndrome and should be part of its clinical assessment.

Introduction
Depression is a common condition with an estimated lifetime prevalence in the USA of about 16% (1). It is an important cause of workdays lost to disability (2) and is as impairing as arthritis, diabetes, and cardiovascular disease (3). The cost of sub-syndromal symptoms probably exceeds that of formally diagnosed major depression (3)(4)(5)(6). It is a concern that the incidence of suicide, the most tragic consequence of depression, has not decreased over decades (7). Clearly, we need to better understand the depressive syndrome and the symptoms used in its assessment. The conceptualization, assessment, and measurement of major depression are difficult, and this shows in the poor reliability and validity of its instruments (8)(9)(10). Non-cohesive symptoms might partly explain why specific genetic, biological, or psychological underpinnings are poorly understood (11)(12)(13). While depressive symptoms diverge in their association with external variables, as with cognitive and neuro-vegetative symptoms (14), a particular symptom can be shared by different disorders. For example, it is unclear whether agitation is an indicator of anxiety or depression, and whether this ambiguity arises because agitation is related to the higher-order construct of distress (14,15). Two individuals can share the same major depression diagnosis without sharing a single symptom (16). Calculating the prevalence and burden of depression is made challenging by the heterogeneity of studies, partly a result of differences in measurement (17).
In clinics worldwide, diagnosing major depression is fairly straightforward. Primary care and specialist physicians follow the DSM (which requires 5 of 9 symptoms) or the ICD (which requires 4 of 10 symptoms) (18). Interestingly, prevalence estimates are similar between systems, although somewhat different populations are identified (19). Having equivalent diagnostic systems has simplified the work of health systems with regard to billing for services and clinical communication (18), but has left important conceptual work unattended. Two problems with the DSM criteria and, by extension, the ICD criteria were raised by Kendler (20). First, the criteria are narrower than the symptoms known to the Western tradition of psychiatry, resulting in an impoverished concept (20). This is perhaps understandable because a list of diagnostic symptoms needs to be brief. Second, the DSM criteria are reified, in the sense that they are thought to constitute depression itself, instead of selected signs of depression (20). In health systems where time is at a premium, relying solely on the checklist of symptoms, and ultimately on symptom counts, is a common practice.
Two candidate symptoms of the depressive syndrome are MI and irritability. By mood, we mean a valenced emotional state (i.e., positive or negative) (21) in a patient. MI can be defined as "rapid oscillations of intense affect, with a difficulty in regulating these oscillations or their behavioral consequences" (22). MI is closely associated with depression in both cross-sectional and longitudinal studies (23,24). The prevalence of MI is about 14% in the UK general population and about 61% in participants with depression (25), suggesting that MI could be important in diagnosing depression. MI is central to neuroticism (26), which, in turn, is the personality trait that most consistently predicts depression (27).
DSM-V and ICD-10 accounts of major depression mention irritability in their narrative descriptions, but do not include it in the list of diagnostic symptoms (4,28). Hence, it could be ignored by clinicians who follow the nine standard symptoms as if they were an exhaustive list. Yet, it is reported that irritability occurs in one-third to one-half of child and adult patients with major depression (29)(30)(31), and is part of a strong principal factor of major depression (29). Irritable depression is also associated with greater severity, lower quality of life, and a history of suicide attempts, which is itself a criterion for depression (29). These findings, as well as Kendler's critique, suggest that the ICD or DSM symptom lists are incomplete.
Our research questions are: (i) Do the DSM/ICD symptoms for major depression constitute a valid measure? (ii) Are MI and irritability symptoms of depression?
We addressed these questions using Rasch analysis, which tests a crucial assumption in scales: the total score (or count of symptoms) is an adequate, equidistant representation of depression levels. In brief, the objective of Rasch modeling is to verify that questionnaires have the properties of physical measures (e.g., a ruler). The units are equally spaced, measure a single attribute (i.e., length), and are additive. Moreover, the reading does not depend on the properties of the entity being measured or the person making the measurement.
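To make the ruler analogy concrete, the dichotomous Rasch model can be sketched as follows. Under this model, the probability that a person endorses a symptom depends only on the difference between the person's location θ and the item's location b, both in logits: P(X = 1) = exp(θ − b) / (1 + exp(θ − b)). The function name and the numeric values below are illustrative assumptions and are not taken from the analysis reported here.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of endorsing an item under the dichotomous Rasch model.

    theta: person location (e.g., depression severity) in logits.
    b: item location (how severe a person must be to endorse it) in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def log_odds(p: float) -> float:
    """Convert a probability to log odds (logits)."""
    return math.log(p / (1.0 - p))


# Hypothetical person and item locations, for illustration only.
p_matched = rasch_probability(theta=0.0, b=0.0)  # person and item at the same location
p_above = rasch_probability(theta=1.0, b=0.0)    # person one logit above the item
```

When person and item share the same location, the endorsement probability is one half, and the log odds of endorsement always equal θ − b. This is the sense in which the scale is additive and equally spaced.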

Data
We used data from the 2000 Psychiatric Morbidity Survey (PMS) of Great Britain. The main purpose of the survey was to estimate the prevalence of psychiatric disorders and their correlates using a stratified random multistage design. Participants were 8,580 adults, aged 16-74 years, living in private households in Great Britain. Of these, the 8,338 people (97%) who had complete records of symptoms of interest were the population from which our calibration and validation samples were drawn. Full details of the PMS methods are available in the main survey report (32).

Depressive Symptoms
Participants were assessed for depression and anxiety disorders by trained lay interviewers who used the Clinical Interview Schedule-Revised (CIS-R). This is a reliable and valid instrument that can be used to algorithmically assign an ICD-10 diagnosis (33). We selected CIS-R questions that were similar in meaning and wording to the nine symptoms of major depression specified by DSM-V. The DSM-V depression symptoms show only "subtle changes" from DSM-IV (34), while both ICD-10 and the upcoming ICD-11 are designed to harmonize with DSM (35,36). Where the DSM-V symptoms had multiple parts, we combined the participant's answers to multiple CIS-R questions. The CIS-R questions we included for analysis are the following: "sad, miserable, or depressed," "unable to enjoy or take an interest in things," "loss of appetite/weight except on a diet," "problems getting to sleep/sleeping more than usual," "restless, walking more slowly, less talkative," "tired except from doing exercise/lacking in energy," "felt guilty/blamed self/felt not as good as other people," "problems concentrating/forgetting things," "life not worth living/wished for death/thought of suicide." The time frame over which symptoms were experienced differed for different symptoms (weeks to years), so the duration and timing of occurrence were disregarded.

MI and Irritability
These were assessed within the participant-completed Structured Clinical Interview for DSM-IV Axis II Personality Disorders (SCID-II): borderline personality disorder section (37). The question that assessed MI was "Do you have a lot of sudden mood changes?" There were two questions on irritability: (a) "Many people become irritable or short tempered at times, though they may not show it. Have you felt irritable or short tempered with those around you in the past month?" and (b) "During the past month did you get short tempered or angry over things which now seem trivial when you look back on them?"

Analytical Strategy
Although Rasch analysis is increasingly used in other medical specialties, it is still largely underutilized in psychiatry (38,39). As mentioned in the Introduction, Rasch analysis determines whether mental or psychological scales have the characteristics of physical measures. For this to be the case, five conditions must be met (Figure 1). First, the instrument is designed to measure a single attribute (unidimensionality). Just as a ruler measures length only, depression scales must measure depression alone. Second, the responses follow a Guttman pattern: persons are ranked from lowest to highest levels of the trait, while items are ranked from highest to lowest levels of endorsement. A Guttman pattern looks like a staircase. The ranking of persons and items in this manner produces units that are interval-level measures called logits ("log odds units"). Third, endorsing a particular question should be independent of the endorsement of another question except with respect to the attribute being measured (local independence). This requirement guards against spurious correlations, that is, those that are due to external factors such as wording or position in the scale. Fourth, the items and the overall instrument must not show differential item or test functioning. This means that at the question and instrument levels, there must be invariance with respect to person characteristics such as age and gender. Fifth, the overall scale must be internally consistent (as measured by Cronbach's alpha) and able to distinguish different strata of respondents along the latent trait (40,41).
We performed Rasch analysis in two samples of 500 subjects, one serving as calibration and the other as validation sample. The requirements, tests performed, and the criteria in each test are summarized in Table 1. For the complete description of analysis steps, please refer to the Appendix in Data Sheet S1 in Supplementary Material.
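The staircase shape of a Guttman pattern can be illustrated with a small sketch. It orders persons by total score and items by endorsement frequency, then counts Guttman errors, i.e., cases where a person endorses a rarer item without endorsing a more common one. The response matrix is hypothetical and the helper names are our own for illustration; the Rasch model is probabilistic, so some such errors are expected, and this deterministic count is not the fit statistic used in the analysis.

```python
def guttman_sort(responses):
    """Return rows sorted by total score (ascending), with columns
    reordered from most to least frequently endorsed item."""
    n_items = len(responses[0])
    item_totals = [sum(row[j] for row in responses) for j in range(n_items)]
    # Most commonly endorsed item first; ties keep their original order.
    item_order = sorted(range(n_items), key=lambda j: -item_totals[j])
    rows = [[row[j] for j in item_order] for row in responses]
    rows.sort(key=sum)
    return rows, item_order


def guttman_errors(row):
    """Count item pairs where a rarer item is endorsed (1) while a more
    common item is not (0). Assumes columns are already sorted."""
    errors = 0
    for i in range(len(row)):
        for j in range(i + 1, len(row)):
            if row[i] == 0 and row[j] == 1:
                errors += 1
    return errors


# Hypothetical responses: 5 persons x 3 symptoms (1 = endorsed).
responses = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],  # endorses only the rarest-ordered item: breaks the staircase
]

sorted_rows, item_order = guttman_sort(responses)
total_errors = sum(guttman_errors(row) for row in sorted_rows)
```

In this toy matrix, four of the five persons form a perfect staircase, and only the last person contributes Guttman errors.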

Results
We refer the interested reader elsewhere (32) for a description of the demographic characteristics of all 8,580 PMS participants. Our calibration and validation samples were similar in age distribution (mean = 45 years), mean number of depression symptoms endorsed (about four symptoms), sex, and living arrangements. Please refer to Table 2 for details.

Unidimensionality
Parallel analysis of the 11 candidate items in the calibration sample showed two dimensions (adjusted eigenvalues: 2.82 and 1.02). In the validation sample, a similar result was reached with

Assessment of Fit with a Probabilistic Guttman Pattern
Of the 11 symptoms we examined, the most common symptom was irritability, while MI was the least common. Please refer to Table 3 for the complete list of item locations in logits. Figure 2 is a visual representation of the item locations and the corresponding fraction of the population that they demarcate. The symptoms that misfit the Rasch model were weight/appetite change, agitation/retardation, fatigue, and self-blame. With consistent findings in both samples, these symptoms were eliminated. See Table 4 for the initial and final symptom lists and fit statistics.

Test of Local Independence
In the calibration sample, a large residual correlation was observed between the item pair fatigue and cognition, χ² = 14.36, df = 1, p with Holm's adjustment = 0.004. The validation sample showed a similarly large residual correlation for these two items, χ² = 11.26, df = 1, p with Holm's adjustment = 0.02. After fatigue was eliminated from the symptom list, no large residuals remained.
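Because every pair of items is tested for residual correlation, the p-values need to be corrected for multiple comparisons. Holm's step-down adjustment can be sketched as below: the smallest p-value is multiplied by the number of tests, the next smallest by one fewer, and so on, while keeping the adjusted values non-decreasing and capped at 1. The input p-values are hypothetical and do not reproduce the item-pair tests reported above.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply by the number of hypotheses not yet processed.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value can never be smaller
        # than that of a more significant test.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise tests.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
```

Holm's method controls the family-wise error rate like the Bonferroni correction but is uniformly less conservative, which matters when the number of item pairs is large.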

Differential Item/Test Function
With the assessment of differential response by gender, no items were flagged for DIF in the calibration sample. In the validation sample, irritability showed both uniform and non-uniform effects. Female respondents more frequently endorsed irritability, and the disparity with male respondents also differed by depression level. The impact of DIF by gender with respect to irritability had a medium sized effect on the test for the validation sample. See Table 5 for details. With the DIF assessment by age group, cognition showed both uniform DIF. In the calibration sample, respondents above 45 years of age endorsed cognitive problems more frequently. In the validation sample, respondents above age 45 endorsed irritability more frequently. Both cognition and irritability had large test effects in both samples. See Table 6 for details.

Test Reliability
The PSI for the initial 11 symptoms was 0.60 for the calibration sample and 0.58 for the validation sample. For the seven retained symptoms (i.e., sad, no interest, sleep, cognition, suicidal ideas, MI, irritability) the PSIs were 0.38 and 0.35 for the calibration and validation samples, respectively. Cronbach alphas for the 11 symptoms (nine original, plus MI and irritability) were 0.71 and 0.70 for the calibration and validation samples, respectively, and 0.58 for both samples for the seven retained symptoms.
In post hoc analysis, we examined the PSI and alpha for the standard nine DSM/ICD symptoms. PSIs were 0.52 and 0.51, and Cronbach alphas were 0.70 and 0.67 for calibration and validation, respectively.
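For readers unfamiliar with the internal consistency index reported here, Cronbach's alpha can be sketched as follows: for k items, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The response matrix is hypothetical and serves only to show the calculation; it does not reproduce the survey data or the alphas reported above, and the Person Separation Index is not covered by this sketch.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a persons-by-items matrix of scored responses.

    Uses sample variances throughout. Requires at least two items, at
    least two persons, and non-zero variance in the total scores.
    """
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 persons x 3 symptoms (1 = endorsed).
responses = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]

alpha = cronbach_alpha(responses)
```

Alpha rises when items covary strongly relative to their individual variances, and it also tends to rise with the number of items, which is one reason why dropping symptoms from a short list can lower it.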

Discussion
In this work, we sought to clarify what symptoms form the most statistically cohesive set to measure depression as a construct. We now discuss the practical and theoretical implications of our findings.

Measuring Depression
We found that the nine depression symptoms in the DSM/ICD systems are not unidimensional. In practice, this means that the "5 of 9" rule (or "4 of 10" for ICD) is probably not warranted because the symptoms do not all tap the same attribute. A more homogeneous set of indicators is achieved by removing weight and appetite change, agitation and retardation, feelings of worthlessness and inappropriate guilt, and fatigue from the core of the major depression syndrome. The finding that MI and irritability fit the Rasch model indicates that they belong to the core network/cluster of symptoms that includes sadness and anhedonia (49). While irritability appears to be endorsed differentially by sex and by age group, MI is invariant with respect to both characteristics. That irritability is identified as a DIF item should not automatically exclude it from the list. One possibility is to adjust for the DIF effect of gender and/or age in assessing the severity of depression (50). This would be difficult to implement in a paper-and-pencil test, but could be solved by computer adaptive testing that takes covariates into account.
Seven items (sad, lack of interest, sleep, cognition, suicidal ideas, MI, and irritability) could serve as the kernel for a depression measure, but on their own do not reliably separate depressed persons from the rest. Likewise, the list of DSM/ICD symptoms fails the typical criterion for internal consistency (its alpha is below 0.80) and also falls short of distinguishing the depressed from the rest (its PSI is below 0.70). Clinical judgment and contextual information may need to be taken into account apart from the canonical list of symptoms. For use outside of the clinic, scales such as the PHQ-9, Beck Depression Inventory, HADS, and CES-D have only recently begun to be examined using item response theory (51)(52)(53).

Reconceptualizing Depression
Mood instability is common in depression (23,25) and has been shown to be a precursor of depression (24). Our current results provide evidence that MI is a symptom of depression. According to DSM, relatively short durations of MI phases would not meet episode criteria for major depression (2 weeks) or for hypomania (4 days) (4). If the patient reports rapid mood fluctuations, it is typical to either dismiss MI as clinically unimportant or to concurrently diagnose a personality disorder, particularly borderline personality disorder where both MI and irritability are DSM-V criteria (4). Unfortunately, people who do not fulfill duration criteria may also be considered "well" or at least not depressed and receive no treatment (54), but these people are at higher risk of developing future depression (55). The evidence indicates that intense and frequent mood swings are associated with severe distress (56,57), and there is merely a quantitative difference between the mood fluctuations of normal individuals and those of patients (58). MI is linked to other indicators of distress and impairment such as health care utilization, medication use, and suicidal thoughts (25,59) and has recently been proposed to fit the characteristics of the Research Domain Criteria (60).
Irritability is associated with emotional lability in patients with unipolar depression (30) and in university students (30,61) and is common in depression (29,30). Mixed depression, which may be defined as "an overlapping of manic and depressive symptoms," includes irritability and emotional lability among its symptoms (62). This presentation is characterized by psychic and motor agitation, accompanied by intense suffering, which puts the patient at increased risk of suicide (54). Irritability could be a core symptom of depression (29), an indicator of a more severe and chronic course (30,63), or a feature of bipolar disorders (64). DSM-V has included irritability as a core symptom of mania, generalized anxiety disorder, and borderline personality disorder, but excluded it as a symptom of major depression. It has previously been rejected as a symptom of major depression because it does not appreciably increase the prevalence above that of sadness and loss of interest (29,65). Increased prevalence is not necessarily a good basis for defining a syndrome. Conversely, irritability, along with MI, could lead to longstanding interpersonal and adjustment difficulties that could lead to depression (30,49). In summary, both MI and irritability are observed in a range of psychiatric disorders.
It is uncertain whether agitation and retardation are specific distinguishing features of major depression, melancholia, mixed mood states, atypical depression, bipolar II depression, or anxiety comorbid with mood disorders (66,67). Agitation is a defining characteristic of a proposed mixed depressive state that has both melancholic and excitatory features, but which does not have the levity in mood of hypomanic patients (62). Our finding is more consistent with a major depression study that reported that agitation could be dropped from the definition of major depression with no loss of validity (68). The Feighner symptom of "self-reproach or guilt" was expanded in DSM-III to include "feelings of worthlessness …" as part of the DSM trend to broaden the criteria for major depression (69). There is a clear semantic difference between guilt (worry) about past misdeeds and anxiety (worry) about future threats and perhaps feeling helpless, but this distinction might not be meaningful for people with common mental disorders, high comorbidity, or high distress (70)(71)(72)(73). Our results replicate findings of a previous Rasch analysis of the PHQ-9 scale that guilt was not coherent with the model of depression (51).
We eliminated fatigue because of its higher-than-expected correlation with cognitive problems. This could be the result of similar wording: both symptoms are presented as diminished ability. An alternative to eliminating this item is rewording either fatigue or cognitive problems. Retaining this item is probably the more prudent course of action. Although one study reported that fatigue is not unidimensional with the other depression symptoms (74), several other studies reported that it satisfies the Rasch model (11,52,75).
It should be emphasized that the misfitting items are frequently experienced by patients with major depression. What is in question is whether they are central to the network of symptoms comprising the depressive syndrome and whether they are useful in its assessment. The search for underlying biological or psychological aberrations (76) or treatment (77) for depression is probably hampered by a heterogeneous cluster of symptoms.
Our study has several limitations. First, the data are based on retrospective recall with all of the disadvantages of this method (78). People who are depressed have a general negative recall bias that might affect reporting of symptoms (78). Second, the CIS-R was designed to elicit the ICD-10 criteria for depression, although the symptoms are very similar to those of DSM-V. Third, we did not consider duration criteria for the individual symptoms, and thus cannot be certain that all symptoms occurred at the same time. However, all duration criteria of major depression and similar groupings are arbitrary and, theoretically, symptoms may occur sequentially and still indicate the same syndrome (55,79). Fourth, we studied MI and irritability, but other common symptoms such as anxiety, rumination, and physical symptoms should also be studied (73,80). Fifth, we performed DIF analysis only by age and gender in a British sample, so further analysis is required to determine if MI and irritability are part of the syndrome across cultures. The differential item status of sleep, irritability, and MI should be addressed by future work. Sixth, while MI and irritability were shown to load on a single factor with depression, we did not have an independent external criterion to serve as a reference. Finally, the PMS did not assess bipolar disorder, so it is possible that some of the respondents had bipolar, instead of unipolar, depression. We do not think this limitation undermines our findings because, first, MI is a feature of a wide range of psychiatric disorders (22,81) and, second, the prevalence of bipolar depression in the sample, in comparison to unipolar depression, is likely to have been small. It would be beneficial in replication studies to correlate patients' scores on our proposed 7-item scale with standard psychometric questionnaires, such as the Mood Disorders Questionnaire or the Affective Lability Scale.
A strength of our study is that it was based on empirical data obtained from an epidemiological sample of the population. Accordingly, it was not constrained by a pre-selected sample with major depression as diagnosed by the criteria being studied.
Conclusion
Mood instability and irritability are candidate core symptoms of the depressive syndrome and should be part of its clinical assessment.