Transdiagnostic clustering of self-schema from self-referential judgements identifies subtypes of healthy personality and depression

Introduction The heterogeneity of depressive and anxiety disorders complicates clinical management as it may account for differences in trajectory and treatment response. Self-schemas, which can be determined by Self-Referential Judgements (SRJs), are heterogeneous yet stable. SRJs have been used to characterize personality in the general population and shown to be prognostic in depressive and anxiety disorders. Methods In this study, we used SRJs from a Self-Referential Encoding Task (SRET) to identify clusters from a clinical sample of 119 patients recruited from the Institute of Mental Health presenting with depressive or anxiety symptoms and a non-clinical sample of 115 healthy adults. The generated clusters were examined in terms of most endorsed words, cross-sample correspondence, association with depressive symptoms and the Depressive Experiences Questionnaire and diagnostic category. Results We identify a 5-cluster solution in each sample and a 7-cluster solution in the combined sample. When perturbed, metrics such as optimum cluster number, criterion value, likelihood, DBI and CHI remained stable and cluster centers appeared stable when using BIC or ICL as criteria. Top endorsed words in clusters were meaningful across theoretical frameworks from personality, psychodynamic concepts of relatedness and self-definition, and valence in self-referential processing. The clinical clusters were labeled “Neurotic” (C1), “Extraverted” (C2), “Anxious to please” (C3), “Self-critical” (C4), “Conscientious” (C5). The non-clinical clusters were labeled “Self-confident” (N1), “Low endorsement” (N2), “Non-neurotic” (N3), “Neurotic” (N4), “High endorsement” (N5). The combined clusters were labeled “Self-confident” (NC1), “Externalising” (NC2), “Neurotic” (NC3), “Secure” (NC4), “Low endorsement” (NC5), “High endorsement” (NC6), “Self-critical” (NC7). 
Cluster differences were observed in endorsement of positive and negative words, latency biases, recall biases, depressive symptoms, frequency of depressive disorders and self-criticism. Discussion Overall, clusters endorsing more negative words tended to endorse fewer positive words, showed more negative biases in reaction time and negative recall bias, reported more severe depressive symptoms and a higher frequency of depressive disorders and more self-criticism in the clinical population. SRJ-based clustering represents a novel transdiagnostic framework for subgrouping patients with depressive and anxiety symptoms that may support the future translation of the science of self-referential processing, personality and psychodynamic concepts of self-definition to clinical applications.


Introduction
Depression is one of the most prevalent mental disorders worldwide (Bromet et al., 2011), as well as a leading cause of disability (World Health Organization, 2017). However, depression is heterogeneous, with substantial variability in causes, symptomatology, and course of development (Rush, 2007; Goldberg, 2011; Ulbricht et al., 2018). Due to such heterogeneity, depressed individuals may exhibit differing responses to treatments such as psychotherapy. A meta-analysis showed that a large proportion of depressed individuals were non-responsive to psychotherapy, with intention-to-treat remission rates ranging from 32 to 37% depending on the severity of depression (De Maat et al., 2007). Further, a high drop-out rate from psychotherapy among depressive outpatients (17.5% in Cooper and Conklin, 2015; 24.6% in Hans and Hiller, 2013) can compromise treatment efficacy. As such, it is important to better understand the heterogeneity of depression in order to personalize treatment.
According to cognitive models of depression, the presence of negatively-focused self-schemas and negative self-referential processing biases is central to the onset, maintenance, and recurrence of clinical depression (Beck, 1967; Ingram et al., 1983; Scher et al., 2005). Self-referential processing (SRP) refers to the processing of information as related to one's self (Northoff et al., 2006). Incoming information is remembered best when it is encoded with reference to one's self as compared to other-reference, semantic, phonemic, and structural encoding, a phenomenon termed the self-reference effect (Rogers et al., 1979; Symons and Johnson, 1997; Bentley et al., 2017). This supports the notion that one's self-concepts serve as an important framework for the encoding, processing, interpretation, and storage of incoming information (Rogers et al., 1979).
Notably, self-schemas implicated in depression are usually characterized by themes of loss, failure, worthlessness, rejection, and hopelessness (Phillips et al., 2010). Such negatively-focused self-schemas often lead to biases in processing self-referential information. Individuals tend to prioritize the encoding and retention of negative self-concepts, thus reinforcing depressive cognitive patterns (LeMoult and Gotlib, 2019). For example, Disner et al. (2017) found that the valence of a person's self-referential schema significantly predicted the severity of their depressive symptoms. Specifically, having a stronger negative self-schema, as opposed to a positive one, was associated with more severe depressive symptoms. Moreover, Dozois (2007) highlighted the stability of the structure of these negative self-schemas over time: the self-schemas often remain stable even when individuals experience improvements in their depressive symptoms. Similarly, in a longitudinal study of pregnant women, Evans et al. (2005) found that negative self-schemas remained a significant predictor of the onset of depression more than 3 years later. This supports the notion that negative self-schemas represent a long-lasting vulnerability to depression.
It is within this framework that the use of self-referential judgements (SRJs) becomes particularly relevant, as they serve as an explicit manifestation of these negative self-schemas. By analyzing the content and frequency of SRJs made by individuals, we gain deeper insights into how negative self-concepts are constructed and perpetuated, shedding light on the relationship between SRJs, self-schemas and depression.
The Self-Referential Encoding Task (SRET; Derry and Kuiper, 1981) is a key measure of biases in self-schemas and self-referential processing. In the SRET, participants are asked to make binary decisions about whether positive and negative adjectives describe themselves or not (the endorsement phase), after which they go through a distractor task, and then complete an incidental free recall for those same adjectives (the recall phase). Biases in self-schemas are indicated by the number of positive/negative words that people endorse; biases in SRP are usually assessed by their speed of endorsement or rejection, as well as subsequent recall for positive and negative endorsed words.
Biases in the endorsement and recall of positive and negative self-relevant stimuli in the SRET are associated with depression and depressive symptoms (Derry and Kuiper, 1981; Gotlib et al., 2004; Auerbach et al., 2015; Goldstein et al., 2015; Connolly et al., 2016). In the original paper that developed the SRET for assessing self-schemas in clinical depression, Derry and Kuiper (1981) found that, compared to non-depressed psychiatric control and healthy control participants, clinically depressed participants showed superior recall for depressive/negative (rather than non-depressive/positive) adjectives endorsed as self-descriptive. Subsequent research demonstrated similar patterns of results across age samples: Gotlib et al. (2004) found that adult patients with major depressive disorder (MDD) endorsed more negative words and fewer positive words as self-descriptive, and recalled a higher proportion of negative endorsed words and a lower proportion of positive endorsed words than psychiatric control and healthy control participants. Similar results in another adult sample were obtained by Fritzsche et al. (2010). Likewise, Auerbach et al. (2015) demonstrated that depressed adolescents endorsed more negative and fewer positive words, and recalled fewer positive words, compared to healthy controls. Goldstein et al. (2015) found that depressive symptoms were positively correlated with the proportion of negative self-referent words recalled, and negatively correlated with the proportion of positive self-referent words recalled, in a community sample of children at age 6. Similarly, among a community sample of 12-year-old adolescents, depressive symptoms were correlated with higher endorsement of negative words, lower endorsement of positive words, slower RT in rejecting negative words as self-descriptive, as well as higher recall of negative self-referent words and lower recall of positive self-referent words (Connolly et al., 2016). In an adult sample with elevated depressive symptoms, the numbers of positive and negative words endorsed negatively and positively predicted baseline depressive symptoms, respectively (Disner et al., 2017). Lastly, using a best subset regression approach, Dainer-Best et al. (2018) found that the number of positive and negative words endorsed and the recall of negative endorsed words were strong predictors of depressive symptoms in an adolescent, an undergraduate, and an adult sample. These results show that SRP biases are robust predictors of depressive symptoms across age and sample types.
However, an important limitation in prior SRET research is the tendency to construe self-schemas and SRP biases as coherent, homogeneous variables by primarily investigating the relationship between the valence of one's SRP biases and the underlying self-schemas and depression. While this approach is valuable in demonstrating a strong link between negatively-biased SRP and depression, it nonetheless ignores more nuanced individual differences in self-concepts. To our knowledge, studies using the SRET have not investigated the specificity of one's self-concepts beyond positive and negative valences in relation to depressive symptoms and subtypes of depression.
Self-concept has also been conceptualized in terms of personality traits. The Five Factor Model of personality (FFM) is a widely recognized framework that categorizes personality traits into five broad dimensions: Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness to Experience (McCrae and John, 1992). A meta-analysis revealed that depressed patients exhibited higher neuroticism, lower extraversion, lower conscientiousness, and no differences in agreeableness and openness as compared to non-depressed individuals (Kotov et al., 2010). Further, a study showed that neuroticism as indicated by the revised NEO Personality Inventory (NEO PI-R; Costa and McCrae, 2008) was positively correlated with depression scores measured by the Beck Depression Inventory-II (BDI-II), whereas conscientiousness was negatively correlated with depressive severity among depressed individuals (Jourdy and Petot, 2017). Similarly, another study showed that across 10 community samples, extraversion and conscientiousness were negatively associated with depressive symptoms, while neuroticism was positively associated with depressive symptoms (Hakulinen et al., 2015). There is a high concordance between self-ratings of personality trait adjectives and self-reported personality questionnaires for the five personality dimensions (McCrae and John, 1992).

Self-concepts and theoretical subtypes of depression
Research also indicates that distinct self-concepts may underlie theoretical subtypes of depression, such as Blatt and colleagues' two personality subtypes of depression: dependent (or "anaclitic") and self-critical (or "introjective") depression (Blatt and Zuroff, 1992). While the dependent subtype focuses on interpersonal relatedness and is characterized by fear of abandonment, longing for care from others, loneliness, and helplessness, the self-critical subtype is characterized by self-criticism, feelings of unworthiness and failure, and need for achievement and approval (Blatt and Zuroff, 1992; Abramson et al., 1997). Previous theoretical research suggested that depressed individuals with concerns in different dimensions (interpersonal vs. self-focused) are likely to use different self-referent adjectives to describe themselves (Dobson, 1986). Thus, it is worth investigating whether these two personality subtypes of depression can be distinguished by clustering the words that people endorse in the SRET.

The utility of a clustering approach to investigate subtypes of self-concepts in relation to depressive symptoms and depression subtypes
Given the central role that self-schemas play in depression, a more fine-grained analysis of the relationship between self-concepts, depressive symptoms, and depression subtypes through a clustering approach is both conceptually and clinically useful. First, by identifying natural subgroups based on SRET endorsement data in a clinical sample with elevated depressive symptoms, clustering allows us to explore whether heightened depressive symptoms are underpinned by an overall bias toward negative information and against positive information, or driven by specific patterns of self-concepts. Further, comparing the self-concepts of clusters of individuals in a clinical versus a non-clinical sample will provide further evidence of the kinds of self-concepts that underlie clinical symptoms of depression.
In terms of clinical utility, due to the high non-response and drop-out rates for psychotherapy among depressed patients (De Maat et al., 2007; Hans and Hiller, 2013; Cooper and Conklin, 2015), it is crucial to take into account individual idiosyncrasies to develop better personalized treatments for patients with depression. Investigating subtypes of self-concepts is especially useful for this purpose, since negative self-referential cognitions are an important target of cognitive-behavioral therapy (Yoshimura et al., 2014). Thus, a bottom-up approach that applies unsupervised clustering to each individual's data may uncover patterns of self-concepts. Subsequently, if we can profile people based on their self-endorsement patterns and examine how their different self-concepts relate to depressive symptoms and depression subtypes, then we can better personalize treatment. Specifically, we can not only identify people who are more vulnerable to heightened depressive symptoms, but also identify specific aspects of their self-concepts to tackle, to improve treatment efficiency.

The present study
The way in which biases in self-schemas and self-referential processing may contribute to the heterogeneity of depression is not well understood. Since both self-concepts and depression are complex, heterogeneous constructs (Delugach et al., 1992; Bracken et al., 2000; Rush, 2007), the present study aims to contribute to the literature by using a clustering approach to examine the content of one's self-concepts beyond valence, as indicated by self-endorsement of adjectives in the SRET, in relation to depressive symptoms. This may help to uncover subgroups of individuals with specific constellations of self-concepts who are more vulnerable to depression.
Additionally, we investigate recall bias, which pertains to participants' memory of self-referential adjectives. We aim to examine whether there are significant differences between the identified clusters in terms of their memory biases. However, in line with prior research highlighting the limitations of recall bias as a consistent metric compared to endorsement data, recall was not used as a clustering input. Notably, Dainer-Best et al. (2018) found that the number of positive or negative words remembered was not strongly associated with the severity of depressive symptoms. Models based solely on recall bias data explained relatively little variance compared to models based on endorsement data. Consequently, the current study focuses on the pattern of endorsements as the basis for clustering individuals, rather than the pattern of memory biases.
Since self-concepts are multidimensional and heterogeneous (Delugach et al., 1992; Bracken et al., 2000) and depression tends to be associated with certain domains of self-concepts but not others (Beck, 1967, 1987), the richness of self-endorsement data in the SRET may offer us more nuanced insights into the relationship between specific self-concepts and depression. Specifically, through a clustering approach, we can uncover whether there are naturally existing, data-driven subtypes of self-concepts that differentially relate to depression, and find the extent to which they agree with theorized subtypes of depression. Identifying subtypes of self-concepts that are more strongly related to depression and depression subtypes can then inform personalization of clinical treatments by tackling specific aspects of one's self-concepts, and in turn, improve treatment efficacy.
Given the richness of self-referential data in the SRET and the lack of studies investigating how clusters of self-concepts relate to subtypes of depression, the present study aims to examine whether a clustering approach can reveal meaningful subtypes of self-concepts that differentially relate to depression severity and subtypes of depression, using three existing Singapore-based datasets that are later reconfigured into a clinical and a non-clinical group. Clusters are first generated using the self-endorsement data in the SRET based on the clinical, non-clinical, and overall samples, and reliability and correspondence of the clusters are examined. Subsequently, the characteristics of each cluster are examined to see if meaningful subtypes of self-concepts are revealed through the clusters. The clusters are also compared on their level of positive/negative endorsement and other clinical measures to investigate what the clusters are associated with.
We expect that clinical and non-clinical clusters will show good stability and correspondence when compared to combined clusters. Further, different patterns of self-endorsement are expected to be revealed between clinical and non-clinical clusters. We also hypothesize that within clinical and non-clinical clusters, clusters will show differences in their endorsement of positive and negative words, depressive symptoms, as well as the two personality subtypes of depression.

Methods

Participants
The clinical sample comprised 119 patients from the Institute of Mental Health (IMH) with past or current anxiety or depressive symptoms who were literate in English. They were recruited from triage, outpatient clinics, and referrals from therapists. They were recruited from three studies: 85 IMH patients from the "Understanding the person, exploring change across psychotherapies" (XChange) study, which included data from the "Understanding the Person, Improving Psychotherapy: Preventing Relapse by targetting Emotional bias Modulation in PsychoTherapy" (PRE-EMPT) study, and 34 patients and 18 healthy controls from "The role of cholinergic dysfunction in the progression of depression" (CholDep) study. In the CholDep study, healthy controls were also recruited by word of mouth.
The non-clinical community/university sample comprised 97 participants mainly recruited at the National University of Singapore as part of an undergraduate thesis project. For the purposes of clustering analysis, this sample was merged with the 18 healthy controls from the CholDep project to form a total of 115 participants in the "non-clinical sample". A meta-analysis reported that the relationship between implicit cognitive biases and depression showed similar effect sizes in studies with clinical, community, and undergraduate samples (Phillips et al., 2010). The purpose of a non-clinical sample was to identify how clustering solutions differed across disparate clinical and non-clinical populations.

Procedure
Participants in all samples completed questionnaires and the Self-Referential Encoding Task (SRET) remotely online using the Inquisit platform (Inquisit 5 [Computer software], 2016) by Millisecond. The procedure consisted of the following steps. First, during the endorsement phase of the SRET, participants were presented with one word at a time in a random order and indicated whether the word described them by pressing a corresponding keyboard button. Following the endorsement phase, participants worked on a digit-symbol substitution distractor task for 5 min. After the distractor task, participants were asked to recall as many words as possible from the endorsement phase.

Measures
Self-referential encoding task (SRET)
The SRET is a computer-based task used to assess one's self-relevant schemas (Derry and Kuiper, 1981) that typically includes three segments in order: endorsement, distractor task, and incidental recall. In a given trial of the endorsement phase, participants judged whether presented adjectives described them ("Describes me?"). The SRET presented both positive (e.g., "popular," "successful") and negative adjectives (e.g., "awful," "ugly") in random order. The participants responded by pressing "Yes" or "No" keys on a computer keyboard. Participants' responses and reaction time (measured in ms) were recorded for each trial. After the endorsement phase, participants worked on a distractor task for 5 min to minimize interference and memory consolidation of the endorsed words before undertaking the incidental recall task.
Three SRET metrics were calculated to assess the responses in the SRET, namely the endorsement rate, reaction time (RT) and recall bias.
Endorsement rate represents the proportion of positive/negative words that participants endorsed as describing themselves. It was calculated as the number of positive/negative words endorsed, divided by the total number of words presented to the participants.
Two RT variables, Negative RT bias and Positive RT bias, were calculated to assess participants' reaction time differences between endorsing and rejecting negative/positive words during the SRET. The formula for Negative RT bias is as follows:
Negative RT Bias = (Mean RT of Endorsement of Negative Words − Mean RT of Rejection of Negative Words) / Average RT Across all Trial Types.
Similarly, Positive RT bias was calculated using the following formula:
Positive RT Bias = (Mean RT of Endorsement of Positive Words − Mean RT of Rejection of Positive Words) / Average RT Across all Trial Types.
This method of calculating RT bias aligns with the approach used in prior SRET studies (Connolly et al., 2016).
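The two RT bias formulas above can be illustrated with a minimal Python sketch. The trial records, the tuple layout, and the function names `rt_bias` and `bias_for` are hypothetical and chosen only for illustration; they do not reproduce the study's actual processing pipeline.

```python
from statistics import mean

def rt_bias(endorse_rts, reject_rts, all_rts):
    """(mean RT when endorsing - mean RT when rejecting) / mean RT over all trials."""
    return (mean(endorse_rts) - mean(reject_rts)) / mean(all_rts)

# Hypothetical trials: (valence, endorsed?, reaction time in ms)
trials = [
    ("neg", True, 900), ("neg", True, 1100),
    ("neg", False, 700), ("neg", False, 500),
    ("pos", True, 600), ("pos", True, 800),
    ("pos", False, 1000), ("pos", False, 800),
]

def bias_for(valence, trials):
    """RT bias for one valence, normalised by the mean RT across all trial types."""
    endorse = [rt for v, e, rt in trials if v == valence and e]
    reject = [rt for v, e, rt in trials if v == valence and not e]
    all_rts = [rt for _, _, rt in trials]
    return rt_bias(endorse, reject, all_rts)

negative_rt_bias = bias_for("neg", trials)  # (1000 - 600) / 800
positive_rt_bias = bias_for("pos", trials)  # (700 - 900) / 800
```

In this sketch a participant who endorses (or rejects) no words of a given valence would have an undefined bias, since the corresponding mean cannot be computed; how such cases were handled is not stated in the text.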
Recall bias was computed from words that were both endorsed and correctly recalled. Positive recall bias was obtained by dividing the number of positive words endorsed and recalled by the total number of words endorsed and recalled, while negative recall bias was calculated by dividing the number of negative words endorsed and recalled by the same total. The difference between negative and positive recall bias was also taken to measure the relative strength of memory biases for positive and negative self-referential information.
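As a rough illustration of the recall bias computation, the sketch below takes the valences of the words a participant both endorsed and correctly recalled. The function name `recall_biases`, the input format, and the direction of the difference score (negative minus positive) are assumptions made for this example.

```python
def recall_biases(recalled_endorsed):
    """Compute recall biases from the valences ('pos' or 'neg') of words
    that were both endorsed and correctly recalled."""
    total = len(recalled_endorsed)
    if total == 0:
        return None  # bias is undefined when nothing was endorsed and recalled
    positive = sum(1 for v in recalled_endorsed if v == "pos") / total
    negative = sum(1 for v in recalled_endorsed if v == "neg") / total
    # Assumed direction: negative minus positive recall bias
    return {"positive": positive, "negative": negative,
            "difference": negative - positive}

# Hypothetical participant: three positive and one negative word recalled
biases = recall_biases(["pos", "pos", "pos", "neg"])
```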
While the standard SRET was administered in the XChange and CholDep studies, some variations to the SRET were used in the university sample. Besides the typical three segments, the SRET in XChange and CholDep also included an additional endorsement task in a matrix format, in which participants were presented with a matrix of words all at once and were asked to tick the box under words that they identified themselves with. Sixty words were presented in the first endorsement task, and 200 words were presented in the matrix task.
The SRET in the undergraduate-student sample included 179 words in the endorsement task: 40 words from LeMoult et al. (2017), 120 words from IASR-B5 (Revised), and 19 words from the List of Threatening Experiences (LTE; Brugha and Cragg, 1990). The processing of SRET data into a standardized and comparable format is explained in a later section on data processing.

Depressive symptoms
The Inventory of Depressive Symptomatology, Self-Report (IDS-30-SR) is a 30-item self-report measure of depressive symptoms that includes all criterion symptoms for a major depressive episode, as well as all criterion symptoms for melancholic and atypical subtypes of depression (Rush et al., 1996). Participants were asked to rate the severity of each of the 30 symptoms in the preceding 7 days on a scale of 0-3, with higher scores indicating greater symptom severity. The total score is calculated by summing 28 of the 30 items (for the appetite and weight change questions, only either the increase or the decrease item was scored for any participant). The total score ranges from 0 to 84. The IDS-30-SR has been shown to have satisfactory psychometric properties: Cronbach's alpha was 0.94 for an overall sample including depressed individuals and healthy controls, and 0.77 for symptomatic-only individuals, indicating acceptable internal consistency across depressed and non-depressed individuals (Rush et al., 1996). The IDS-30-SR also correlates highly with other scales measuring depressive symptoms such as the 17-item HRS-D (r = 0.88, p < 0.0001) and the BDI (r = 0.93, p < 0.0001), showing good convergent validity (Rush et al., 1996). The IDS-30-SR was also demonstrated to significantly discriminate between symptomatic depressed individuals and non-symptomatic euthymic individuals. The suggested optimal cut-off score for the IDS-30-SR is 18, as determined by ROC analysis, with a sensitivity of 1.0 and specificity of 0.94 (Rush et al., 1996). People scoring 18 and above are considered symptomatically depressed. The numbers of people who met this cut-off score in the clinical and non-clinical groups are reported in this study. In the current study, Cronbach's alpha for the IDS-30-SR was 0.909 for the clinical group, 0.912 for the non-clinical group, and 0.938 for the overall sample.

Depression subtypes
The Reconstructed Depressive Experiences Questionnaire (RecDEQ) is a 19-item measure that differentiates between the dependent (anaclitic) and self-critical (introjective) personality subtypes of depression (9 items for dependency, 10 items for self-criticism). Sample items for the dependency scale include "I become frightened when I feel alone," and sample items for the self-criticism scale include "I tend not to be satisfied with what I have." Participants were asked to rate each item on a Likert scale of 1-7 (1 = strongly disagree, 7 = strongly agree). All items for each of the two scales are summed to obtain a total score for each scale. The RecDEQ showed excellent fit to a two-factor model in both an undergraduate and a clinical sample, and an association with depressive symptoms that is in line with theoretical predictions (Desmet et al., 2007). Test-retest reliability was 0.75 for the dependency scale, and 0.83 for the self-criticism scale. Cronbach's alpha for the two scales ranged from 0.69 to 0.80 across four samples (normal adults, university students, depressed patients, and panic disorder patients; Bagby et al., 1994).

Data processing and reorganization
To maximize the self-endorsement data used as input in the clustering analysis, endorsement data in the matrix task in XChange and Choldep were included to identify the maximum overlap across three samples.Endorsement data on 90 overlapping words (48 negative and 42 positive) across all participants were used in the clustering analyses (see the word list in Supplementary Appendix 1).
For grouping of participants, all participants in XChange and the clinical participants in Choldep were combined to form the clinical group, and all participants in the undergraduate sample and healthy control participants in Choldep were combined to form the non-clinical group.
Demographic variables, SRET variables, namely endorsement rate, reaction time and recall bias (participants' ability to recall positive or negative self-referential adjectives), and depressive symptom severity were examined and compared to confirm that participants grouped into the same group were similar on these measures and that the grouping decision was reasonable.See Supplementary Table S1 for the full list of demographic and clinical characteristics of each subsample and Supplementary Table S2 for correlations of age and SRET variables with depressive symptoms.

Clustering analysis methodology
The clustering analysis was conducted separately for the clinical-only sample, the non-clinical-only sample and the combined sample with the Rmixmod package in RStudio. Rmixmod is a package devoted to clustering (or unsupervised classification) using mixture modeling. Other approaches, including K-means clustering, hierarchical clustering and Gaussian models, were considered, but given that 10 multivariate multinomial mixture models were available, Rmixmod was deemed the most suitable tool as it could effectively manage high-dimensional binary data. Rmixmod specializes in finite mixture modeling and latent class analysis, making it well suited for data that arise from multiple underlying distributions, as in the case of our binary SRET endorsement data (Lebret et al., 2015). Within the Rmixmod package, the mixmodCluster() function was used to obtain clustering solutions. Arguments required for the function included the criteria used for optimization of models and a seed number that fixes the sequence of random numbers generated.
Optimization of models was done according to the Bayesian Information Criterion (BIC), Integrated Completed Likelihood (ICL), and Normalized Entropy Criterion (NEC). BIC, a widely used statistical criterion for model selection, aims to strike a balance between model fit and model complexity (Neath and Cavanaugh, 2012). Lower BIC values indicate a better fit to the data while remaining parsimonious, making models with the lowest BIC value preferable. ICL, another model selection criterion, is typically used in mixture modeling and clustering analyses (Biernacki et al., 2000). It evaluates the quality of clusters, with higher ICL values suggesting more distinct and better-defined clusters, indicating a more appropriate clustering solution. NEC, on the other hand, is a model selection criterion that measures the quality of clusters by assessing the dispersion of data points within them (Biernacki et al., 1999). Lower NEC values indicate more compact and well-separated clusters, which are considered better for clustering solutions.
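To make the fit-versus-complexity trade-off in BIC concrete, the following Python sketch uses the standard form BIC = -2 log L + k ln(n), in which lower values are preferred. The log-likelihoods are invented, and the free-parameter counts assume an unconstrained latent class model with 90 binary items ((K - 1) mixing proportions plus 90K item probabilities); the constrained multinomial models in Rmixmod may count parameters differently.

```python
import math

def bic(log_likelihood, n_free_params, n_obs):
    """Standard BIC: -2 * log-likelihood + (free parameters) * ln(sample size)."""
    return -2.0 * log_likelihood + n_free_params * math.log(n_obs)

# Hypothetical fits: {number of clusters: (log-likelihood, free parameters)}
fits = {
    2: (-5200.0, 181),
    3: (-5050.0, 272),
    4: (-5000.0, 363),
}
n_obs = 119  # size of the clinical sample

scores = {k: bic(ll, p, n_obs) for k, (ll, p) in fits.items()}
best_k = min(scores, key=scores.get)  # lowest BIC is preferred
```

In this made-up example the gain in likelihood from adding clusters does not offset the penalty for the extra parameters, so the smallest model is selected.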
Clustering solutions were generated for seeds 1-30, and each set of solutions was sorted according to the 3 criteria used for optimisation of models. To compare the solutions, criterion values were extracted. We also considered the likelihood value of each solution as a measure of how well the clustering model explains the observed data; a higher likelihood value indicates better clustering. Two clustering evaluation metrics were used to choose among the optimisation criteria: the Davies-Bouldin index (DBI) and the Calinski-Harabasz Index (CHI). The DBI is calculated as the average similarity of each cluster with its most similar cluster. A lower DBI value means the clusters are better separated (Davies and Bouldin, 1979). The CHI is a variance ratio criterion that evaluates the ratio of between-cluster variance to within-cluster variance. A higher value suggests a better clustering solution (Caliński and Harabasz, 1974).
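The variance ratio behind the CHI can be sketched in plain Python as follows. The data points and labels are toy values, and the function `calinski_harabasz` is an illustrative implementation of the textbook definition rather than the routine used in the study.

```python
def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion,
    each divided by its degrees of freedom (k - 1 and n - k)."""
    n = len(points)
    dims = len(points[0])
    clusters = sorted(set(labels))
    k = len(clusters)

    def centroid(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dims)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0
    within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        center = centroid(members)
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(p, center) for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Toy example: two tight, well-separated clusters
points = [[0, 0], [0, 2], [10, 0], [10, 2]]
labels = [0, 0, 1, 1]
chi = calinski_harabasz(points, labels)  # (100 / 1) / (4 / 2)
```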
Following Bozdogan's (1993) recommendation, the number of clusters tested in each clustering analysis ranged from 2 to the smallest integer larger than the cube root of the number of observations in each sample. Based on this guideline, the number of clusters tested in both the clinical and non-clinical sample was between 2 and 5, and the number of clusters tested in the combined sample was between 2 and 7. The lowest criterion value was used to choose the optimal number of clusters.
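The cube-root guideline can be checked with a short Python sketch. The function name `max_clusters` is hypothetical; integer arithmetic is used so that floating-point rounding of cube roots cannot affect the result.

```python
def max_clusters(n_obs):
    """Smallest integer strictly larger than the cube root of n_obs."""
    k = 1
    while k ** 3 <= n_obs:
        k += 1
    return k

# Sample sizes reported in the text
upper_clinical = max_clusters(119)      # 4^3 = 64 <= 119 < 125 = 5^3
upper_non_clinical = max_clusters(115)
upper_combined = max_clusters(234)      # 6^3 = 216 <= 234 < 343 = 7^3

# Candidate numbers of clusters for the combined sample
candidates_combined = list(range(2, upper_combined + 1))
```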
Clustering analyses were run on the clinical, non-clinical, and overall sample, respectively, to identify meaningful subgroups of participants based on their self-endorsement patterns in SRET.

Analyses
A simplified thematic analysis using a deductive approach was conducted on the top-endorsed words of each clinical and non-clinical cluster to characterize the patterns of self-concepts in each cluster (Boyatzis, 1998; Braun and Clarke, 2006) by mapping the most endorsed words in each cluster to the five-factor model of personality (McCrae and John, 1992) and Blatt's two personality subtypes of depression (Blatt and Zuroff, 1992). Words with a mean endorsement equal to or above the upper quartile of mean endorsements of words within a cluster were defined as the top-endorsed words for that cluster. These top-endorsed words were then reviewed and mapped onto the five dimensions of the Five Factor Model of Personality (FFM) and the two personality subtypes of depression (dependent and self-critical).
Correspondence of the clusters was examined by comparing the clinical and non-clinical clustering solutions to the overall sample clusters.
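The upper-quartile rule for top-endorsed words can be sketched as follows. The endorsement values are hypothetical and not taken from the study, and the quartile is computed with the inclusive method of Python's `statistics.quantiles`, which is an assumption since the paper does not state which quantile definition was used.

```python
import statistics

def top_endorsed(mean_endorsement):
    """Words whose mean endorsement is at or above the cluster's upper quartile."""
    cutoff = statistics.quantiles(mean_endorsement.values(), n=4, method="inclusive")[2]
    ranked = sorted(mean_endorsement.items(), key=lambda item: item[1], reverse=True)
    return [word for word, value in ranked if value >= cutoff]

# Hypothetical mean endorsement rates for one cluster (illustrative only).
cluster_means = {
    "relaxed": 0.9, "at ease": 0.8, "successful": 0.7, "popular": 0.6,
    "stupid": 0.2, "awful": 0.1, "coward": 0.1, "ugly": 0.0,
}
print(top_endorsed(cluster_means))
```

With eight words, the inclusive upper quartile falls between the second and third highest values, so two words are retained for thematic mapping.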
One-way univariate between-subject ANOVAs on positive and negative endorsement and depressive symptoms were conducted on the clinical and non-clinical clusters, respectively. Post-hoc Tukey's Honestly Significant Difference (HSD) tests were conducted for pairwise comparisons following the ANOVAs. This test corrects for the inflation of Type I error that can occur when conducting multiple pairwise comparisons by controlling the overall (family-wise) error rate across all comparisons between group means. To examine differences in the frequency of the different diagnoses between the clusters, a multinomial test was conducted on the clinical clusters using JASP 0.16.3.
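For readers less familiar with the omnibus test, a one-way between-subject ANOVA compares the variance between group means with the variance within groups. The following is a minimal sketch on hypothetical scores for three clusters; it is not the JASP analysis, and it stops at the F statistic without computing a p-value or the Tukey comparisons.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way between-subject ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared distance to grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared distance of each score to its group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical negative-endorsement counts for three clusters (illustrative only).
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_b, df_w = one_way_anova(scores)
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```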

Clustering evaluation and stability
The cluster validation metrics for goodness of split and stability are presented in Table 2. When we perturbed the clustering by varying the initialisation seed across 30 seeds, optimisation with BIC and ICL yielded the most stable solutions, with no differences in the number of clusters identified and minimal differences in criterion values. With NEC, the optimum number of clusters differed between seeds and variance across seeds was relatively higher. While BIC and ICL yielded similar DBI and CHI values, the BIC criterion values and likelihood values showed less variance between seeds. Thus, BIC optimization was used for the final clustering solution. We also examined the centroid locations: in the majority of seed-to-seed comparisons, a unique centroid with a correlation coefficient of 0.7 or above was identifiable for a given cluster. The clustering solution for seed number 3 was chosen as a representative solution as the various solutions appeared to have trade-offs between metrics.
Broadly, the majority of algorithms generated 5 clusters as the optimal solution for both the clinical (C1-C5) and non-clinical group (N1-N5). Seven clusters were generated as the optimal solution for the combined sample (NC1-NC7). All three optimal clustering solutions have the maximum possible number of clusters for their respective sample size.
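The seed-to-seed centroid check can be sketched as follows, assuming each centroid is a vector of mean endorsement rates over the same ordered word list and that a match means exactly one centroid in the other solution correlates at 0.7 or above. The centroid values and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def match_centroids(reference, other, threshold=0.7):
    """Map each reference centroid to its unique counterpart, or None if not unique."""
    matches = {}
    for i, centre in enumerate(reference):
        hits = [j for j, cand in enumerate(other) if pearson(centre, cand) >= threshold]
        matches[i] = hits[0] if len(hits) == 1 else None
    return matches

# Hypothetical centroids over four words for two seeds; cluster order is swapped.
seed_a = [[0.9, 0.8, 0.1, 0.2], [0.1, 0.2, 0.9, 0.8]]
seed_b = [[0.2, 0.1, 0.8, 0.9], [0.8, 0.9, 0.2, 0.1]]
print(match_centroids(seed_a, seed_b))
```

Matching by correlation rather than by cluster index matters because cluster labels are arbitrary and can be permuted between seeds.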

Top endorsed words in clinical, non-clinical and combined clusters
In order to characterize and label the clusters, we ranked words by both mean endorsement rates and relative endorsement rates between clusters (see Supplementary Appendix 2). The most representative words are listed in Table 3.
A closer examination of the top words endorsed by the combined sample revealed significant overlaps with the clinical and non-clinical clusters. NC1, the "Self-confident" cluster, endorsed the same words as N1 (Self-confident): "unworrying," "successful," "popular," as well as N3 (Non-neurotic): "at ease." Similarly, NC4, the "Secure" cluster, also endorsed the same words as N3: "relaxed," "at ease." In the case of NC7, the "Self-critical" cluster, there were overlaps with C4 (Self-critical): "awful," "coward," "abandoned," "stupid," "bad," "ugly." Given this significant overlap in the top words endorsed, we did a correspondence analysis between the clustering solutions.
In the combined (NC) sample solution, two clusters fully included clusters from the clinical (C) or non-clinical (N) solutions: NC1 (Self-confident)/ N1 (Self-confident), NC7 (Self-critical)/ C4 (Self-critical), and an additional three were dominated by two clusters (NC2 (Externalising)/ C2 (Extraverted)/ N5 (High endorsement), NC3 (Neurotic)/ C1 (Neurotic)/ N4 (Neurotic), NC6 (High endorsement)/ C1 (Neurotic)/ N5 (High endorsement)), and two comprised several clusters. While the combined solution had two clusters comprising primarily the non-clinical sample (NC1 (Self-confident)/ N1 (Self-confident), NC4 (Secure)/ N2 (Low endorsement)/ N3 (Non-neurotic)/ N5 (High endorsement)) and one comprising a clinical cluster (NC7 (Self-critical)/ C4 (Self-critical)), there was meaningful co-occurrence of clinical with non-clinical clusters in the combined solution as well (C1 (Neurotic)/ N4 (Neurotic), C2 (Extraverted)/ N5 (High endorsement), C3 (Anxious to please)/ C5 (Conscientious)/ N2 (Low endorsement)). Furthermore, the distribution of clusters within the combined solution demonstrated varying levels of preservation. For example, C2, C3, and C4 are each primarily distributed in only two combined clusters and are thus well-preserved. In contrast, C1 is distributed across four combined clusters (NC2, NC3, NC6, NC7), and C5 is distributed across three combined clusters (NC1, NC4, NC5); thus, these two clinical clusters are less well-preserved in the combined solution. For non-clinical clusters, N1 is only found in NC1, while N2 and N3 are each found primarily in only two combined clusters and are thus relatively well-preserved. In contrast, N4 and N5 are distributed across six and four combined clusters respectively, and are thus less well-preserved. However, even the less well-preserved clusters tend to be predominantly distributed across only two to three combined clusters, which suggests a certain degree of cluster consistency for these clinical and non-clinical clusters. Refer to Figure 1 for the mapping of clinical and non-clinical clusters onto combined clusters and Supplementary Figure S1 for reverse mapping.
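One way to quantify this kind of correspondence is to cross-tabulate each participant's cluster in the sample-specific solution against their cluster in the combined solution. The sketch below does this on hypothetical memberships. The 80% coverage cut-off used to decide which combined clusters a source cluster is "primarily" distributed in is an arbitrary illustrative choice, not a rule taken from the study.

```python
from collections import Counter, defaultdict

def cross_tabulate(source_labels, combined_labels):
    """Count how members of each source cluster fall into combined clusters."""
    table = defaultdict(Counter)
    for source, combined in zip(source_labels, combined_labels):
        table[source][combined] += 1
    return table

def main_destinations(counts, coverage=0.8):
    """Fewest combined clusters that together hold at least `coverage` of members."""
    needed = coverage * sum(counts.values())
    kept, running = [], 0
    for cluster, n in counts.most_common():
        kept.append(cluster)
        running += n
        if running >= needed:
            break
    return kept

# Hypothetical memberships for 11 participants (not the study data).
source = ["C4"] * 5 + ["C1"] * 6
combined = ["NC7"] * 5 + ["NC3", "NC3", "NC3", "NC6", "NC6", "NC2"]

table = cross_tabulate(source, combined)
for name in sorted(table):
    print(name, dict(table[name]), "->", main_destinations(table[name]))
```

In this toy example the first cluster is fully preserved in a single combined cluster, whereas the second is spread over three but concentrated in two.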

Cross-cluster differences in negative and positive endorsement
Cluster differences in negative and positive endorsement were examined through scatter plots and one-way ANOVAs.

Clinical clusters
The scatterplot of cluster distribution on the number of positive and negative words endorsed (see Figure 2) showed that C1 and C4 endorsed the most negative words, followed by C2 and C3. C5 endorsed the fewest negative words, but showed a wide spread in the number of positive words endorsed within this cluster.

Non-clinical clusters
The scatterplot of non-clinical cluster distribution (see Figure 3) showed that both N4 and N5 endorsed a high number of negative words, but N5 endorsed more positive words than N4. While N1, N2, and N3 fell in the same range for endorsing negative words, N1 endorsed the greatest number of positive words, followed by N3 and N2.

Cross-cluster difference in reaction time of endorsement of positive/negative words
One-way between-subject ANOVA was also performed to examine the effect of clusters on reaction time in response to positive and negative words.

Cross-cluster difference in recall bias
One-way between-subject ANOVA was also performed to examine the effect of clusters on recall bias.

Clinical clusters
The results show that clusters significantly predicted negative recall bias in the clinical group (F(4, 114) = 3.55, p = 0.009, η² = 0.11; see Table 7). Tukey's HSD post hoc tests were performed to examine group differences. Notably, we applied a p-value threshold of 0.1 for interpretation, allowing us to emphasize practical significance. This more lenient threshold was chosen to reduce the risk of Type II errors, in line with the exploratory nature of this study. Using this threshold, individuals in C4 (M = 0.11, SD = 0.08) had a weaker recall bias for negative words as compared to C2 (M = 0.19, SD = 0.10, 95% CI [−0.001, 0.15], p = 0.06). Similarly, individuals in C5 (M = 0.10, SD = 0.12) also had a weaker recall bias for negative words as compared to C2 (95% CI [−0.001, 0.15], p = 0.06).

Combined clusters
In the combined sample, on the other hand, clusters significantly predicted positive recall bias (F(6, 227) = 2.21, p = 0.04, η² = 0.06; see Table 7) and the difference between negative recall bias and positive recall bias (F(6, 225) = 2.22, p = 0.04, η² = 0.06; see Table 7). While Tukey's HSD did not reveal a significant difference between the combined clusters for positive recall bias, it did reveal a significant difference in terms of the difference between negative recall bias and positive recall bias. Specifically, NC1 (M = −0.10, SD = 0.17) exhibited a stronger memory bias towards positive words than negative words, as compared to NC7, who exhibited a stronger memory bias towards negative words (M = 0.007, SD = 0.15, 95% CI [−0.22, −0.001], p = 0.045). See Supplementary Appendix 6 for pairwise comparisons of recall bias across all samples.
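The effect sizes reported for recall bias can be cross-checked from the F statistics and their degrees of freedom, because in a one-way ANOVA η² equals the between-group sum of squares divided by the total sum of squares, which can be rewritten in terms of F. The helper name below is hypothetical.

```python
def eta_squared(f_value, df_between, df_within):
    """Eta squared for a one-way ANOVA: F * df1 / (F * df1 + df2)."""
    return (f_value * df_between) / (f_value * df_between + df_within)

# F values and degrees of freedom as reported for recall bias.
reported = [
    ("clinical, negative recall bias", 3.55, 4, 114),
    ("combined, positive recall bias", 2.21, 6, 227),
    ("combined, negative minus positive", 2.22, 6, 225),
]
for label, f_value, df1, df2 in reported:
    print(label, round(eta_squared(f_value, df1, df2), 2))
```

The recomputed values round to 0.11, 0.06 and 0.06, in agreement with the reported effect sizes.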

Distribution of psychiatric diagnoses between clinical clusters
Among diagnostic categories, only Depressive Disorders showed significant differences in frequency between clusters (χ²(4) = 34.70, p < 0.001; see Table 9). C1 and C4 appeared to have higher proportions of individuals with depressive disorders, while C3 and C5 had lower proportions of individuals with depressive disorders.
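The reported p-value can be checked from the test statistic alone. For an even number of degrees of freedom, the upper-tail probability of the chi-square distribution has a closed form, so no statistics library is needed. The sketch below is only a consistency check on the reported χ²(4) = 34.70, not a re-analysis of the diagnosis counts, and the function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    Uses Q(x; 2m) = exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form shown here requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

p_value = chi2_sf_even_df(34.70, 4)
print(p_value < 0.001)
```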

Discussion
While grouping individuals by SRJs has been applied in the field of personality (Scully and Terry, 2011), it has not been applied clinically to depression despite evidence that SRJs have important prognostic value (Nejad et al., 2013; LeMoult et al., 2017) both from the perspectives of self-referential processing and psychodynamic constructs of depression. In applying clustering of SRJs across clinical and non-clinical populations, our findings may further our understanding of how these very different theoretical frameworks relate to one another. Metrics such as optimum cluster number, criterion value, likelihood, DBI and CHI remained relatively stable when perturbed by varying the seed used to initialize clustering, and this reflected our observations that cluster centers also remained relatively stable.
Traits from the FFM, psychodynamic constructs and anxiety appeared to inform the most endorsed words in each cluster. Within the clustering solutions, five-factor adjectives such as "neurotic" in C1 and N4, "extraverted" in C2 and "conscientious" in C5 appeared among the most endorsed words. The link between the FFM and depression has been well-explored, with common findings of high neuroticism, low conscientiousness and low extraversion being personality traits strongly associated with depression (Malouff and Thorsteinsson, 2005; Kotov et al., 2010; Grav et al., 2012).
"Self-critical" clusters C4 and NC7 can also be understood in terms of Blatt's introjective subtype of depression, which emphasizes self-criticism and an internalized focus, that also corresponds to endorsement of words for neuroticism and negative self-referential processing.Meanwhile, the "Anxious to please" cluster, C3, could be understood in terms of Blatt's anaclitic subtype, which emphasizes dependency and a strong desire for external validation in individuals experiencing depressive symptoms, as they endorsed words related to agreeableness and introversion (Blatt and Zuroff, 1992;Grav et al., 2012;Marfoli et al., 2021).
Among the clusters that appeared consistent in the combined solution, some reflected endorsement patterns unique to either the clinical or the non-clinical solution, such as the self-critical NC7/ C4 clusters and the self-confident NC1/ N1 clusters, while others reflected common endorsement patterns, such as NC2 (Externalising)/ C2 (Extraverted)/ C1 (Neurotic)/ N4 (Neurotic). The unique patterns may relate to patterns of self-schema that are either absent in or protective against depression, or associated with subtypes of depression not present in the normal population. The common ones could reflect overlap in underlying self-schema across both clinical and non-clinical populations.
Overall, clusters endorsing more negative words also tended to endorse fewer positive words, showed more negative biases in reaction time and negative recall bias, reported more severe depressive symptoms and a higher frequency of depressive disorders and more self-criticism in the clinical population. Previous studies have found that depressive symptoms and depressive disorders are predicted by negative self-referential processing during similar tasks (Beevers et al., 2019). C1 (Neurotic) and C4 (Self-critical) members endorsed more negative words and had more severe depressive symptoms than the other clusters and reported higher introjection/self-criticism scores than C5 (Conscientious). C4 (Self-critical) members also endorsed fewer positive words than C1 (Neurotic) and were slower when endorsing positive words compared to those in C2 (Extraverted). Taken together, the results show that the C4 (Self-critical) members displayed a heightened propensity for negative self-referential processing and a deficit in endorsing positive SRJs. These findings are consistent with network analysis research showing that depressed individuals with self-critical views tend to maintain highly interconnected negative self-perceptions while undervaluing their positive self-schemas (Collins et al., 2021). C5 (Conscientious) emerged as the least depressed clinical cluster and endorsed the fewest negative words. C5 endorsed fewer positive words, but demonstrated a weaker negative recall bias than C2 (Extraverted). This is in agreement with a previous finding that low extraversion and low conscientiousness predict the development of depressive symptoms (Hakulinen et al., 2015; Jourdy and Petot, 2017).
Although there were no differences in depression symptom severity between non-clinical clusters, N4 (Neurotic) and N5 (High endorsement) members endorsed more negative self-schema and N4 members had slower RTs in positive self-evaluations compared to their self-confident counterparts in N1. This aligns with other studies showing that more severely depressed patients also exhibited slower RTs when endorsing positive words, distinguishing them from non-depressed individuals (Collins and Winer, 2023). Conversely, N1 (Self-confident) members endorsed more positive words compared to all other clusters, indicating that individuals with greater self-confidence tend to perceive themselves more positively. This positive self-perception may contribute to their overall psychological well-being. Numerous studies have demonstrated that positive self-esteem acts as a protective buffer against negative influences (Mann et al., 2004).
In the combined sample, NC2 (Externalising), NC3 (Neurotic), and NC7 (Self-critical) clusters exhibited higher levels of depressive symptoms when compared to the NC4 (Secure) cluster. Additionally, both NC2 and NC7 had higher depressive symptom scores than NC1 (Self-confident). Interestingly, NC2 tended to endorse more positive words than NC3 and NC7. Prior research has found that more aggressive groups of children demonstrated similar levels of positive self-perception and did not differ in the number of positive words they endorsed as compared to children in the control group (Burgess and Younger, 2006). Hence, this points to a unique cognitive pattern in the externalizing cluster, whose members endorsed more positive self-referential words, that is distinct from the neurotic and self-critical clusters.
NC3 (Neurotic) comprised C1 (Neurotic) and N4 (Neurotic) suggesting consistency as a construct across populations.Both NC3 (Neurotic) and NC7 (Self-critical) endorsed fewer positive words and responded more slowly when endorsing positive words compared to NC1, indicating difficulties in making positive SRJs, but NC7 additionally had a faster RT endorsing negative words than NC1, which suggests that self-critical individuals may have a heightened awareness of negative self-referential information and may readily endorse such negative self-attributes.
We have identified clusters on the basis of SRJs using words that are meaningful across theoretical frameworks from personality, psychodynamic concepts of relatedness and self-definition, and self-referential processing, with key distinctions that may be useful for further study both in healthy populations and clinically. While positive and negative self-referential processing is typically highly correlated, identifying subgroups where they differ may be clinically meaningful by providing targets for interventions focused on positive psychology or anhedonia (Sandman and Craske, 2021). Further work could characterize additional differences in clinical characteristics and interpersonal patterns.
We also considered various limitations of our approach. Firstly, the clinical and non-clinical participants were obtained from different datasets and not matched on demographic and psychiatric characteristics, resulting in heterogeneity between groups. Secondly, there is substantial heterogeneity in terms of clinical diagnosis within the clinical group. Further, the clustering analyses may be limited in their generalizability due to the relatively small sample sizes. Our ability to cluster using additional data such as recall bias or latency was limited by the sparse nature or small number of measures that could be derived from them; however, it would be useful to develop ways to integrate such data into our clustering approach. It could also be useful to incorporate neuroimaging or EEG data in conjunction with behavioral data from SRET. While behavioral data can reveal overt manifestations of these conditions, it is often limited in its ability to uncover underlying neural mechanisms. The incorporation of neuroimaging data provides a means to directly visualize and measure brain activity and structure, helping us go beyond mere classification to uncover the neural signatures that differentiate healthy and affected individuals. Moreover, such data have been used with deep learning methods such as Convolutional Neural Networks (CNNs), which have shown promise in the classification of depression and other psychiatric conditions (Strambo et al., 2018; Ke et al., 2019, 2020, 2022, 2023).
The study was not designed to test the directionality of the relationship between SRP/self-concepts and depression. While negative self-schemas and SRP biases are posited to be stable individual characteristics that precede depression and can confer risk to depression (Beck, 1967; Derry and Kuiper, 1981), in reality, the relationship between them is more complex and bidirectional. For instance, Hayden et al. (2013) found that while negative and positive SRP prospectively predicted depressive symptoms in a community sample of children, depressive symptoms also prospectively predicted their negative SRP. As such, it is uncertain whether the current association found between self-concept-based clusters and depressive symptoms is due to such self-concepts contributing to the experience of depressive symptoms, or depressive moods leading to the development of certain self-concepts. It is also uncertain whether currently found self-endorsement clusters will remain stable after participants' depressive symptoms subside, highlighting the importance of longitudinal studies to investigate the directionality of relationships and stability of self-concept subtypes linked to depression.
SRJ-based clustering represents a novel transdiagnostic framework for subgrouping patients with depressive and anxiety symptoms that may support the future translation of the science of self-referential processing, personality and psychodynamic concepts of self-definition to clinical applications.
Five clinical clusters, five non-clinical clusters and seven combined clusters were generated from clustering analysis. Clusters are labeled according to the top endorsed words. Top-endorsed words have a mean relative endorsement rate equal to or above the upper quartile of mean relative endorsements of words within a cluster.

FIGURE 1. Mapping of clinical and non-clinical clusters onto combined clusters.

FIGURE 2. Scatterplot showing the distribution of positive and negative words endorsed by participants in the five clinical clusters.

FIGURE 3. Scatterplot showing the distribution of positive and negative words endorsed by participants in the five non-clinical clusters.

FIGURE 4. Depression symptom severity differences (IDS-30 scores) across all clusters. Error bars denote the standard error of the mean.

TABLE 1
Demographic and clinical characteristics.
M, mean; SD, standard deviation; %, percentage of each category out of total. Clinical diagnoses are categorized into five groups (depression, anxiety, mixed anxiety and depression, adjustment disorder and bipolar disorder). Additional comorbid psychiatric diagnoses (schizophrenia, cluster B traits, eating disorder, obsessive-compulsive personality disorder, gambling, alcohol dependence, substance abuse, insomnia, attention deficit disorder with hyperactivity) were accounted for.

TABLE 2
Descriptive statistics of cluster validation metrics for goodness of split and stability.
Mean and standard deviation of the cluster validation metrics for the clustering solutions of seeds 1-30.

TABLE 3
Top endorsed words for every cluster.

TABLE 4
Analysis of variance (ANOVA) results for the effect of clusters on endorsement rate in all samples.

TABLE 6
Analysis of variance (ANOVA) results for the effect of clusters on reaction time to negative words.

TABLE 8
Analysis of variance (ANOVA) results for the effect of clusters on depressive scores (IDS-30) on all samples.

TABLE 5
Analysis of variance (ANOVA) results for the effect of clusters on reaction time to positive words.

TABLE 7
Analysis of variance (ANOVA) results for the effect of clusters on recall bias in all samples.

TABLE 9
Diagnostic characteristics of participants in the five clinical clusters.
Distribution of diagnoses amongst the clinical clusters. A multinomial test found that only Depressive Disorders significantly differed in frequency between the clusters. ***p < 0.001.