Original Research ARTICLE
Use of Recurrence Quantification Analysis to Examine Associations Between Changes in Text Structure Across an Expressive Writing Intervention and Reductions in Distress Symptoms in Women With Breast Cancer
- 1Psychooncology Research Unit, Department of Oncology, Aarhus University Hospital, Aarhus, Denmark
- 2Department of Psychology, University of Aarhus, Aarhus, Denmark
- 3Department of Oncology, Aarhus University Hospital, Aarhus, Denmark
- 4UPMC-Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, United States
- 5Department of Language and Communication, Centre for Human Interactivity, University of Southern Denmark, Odense, Denmark
- 6Department of Culture and Society, Interacting Minds Centre, Aarhus University, Aarhus, Denmark
- 7Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany
The current study presents an exploratory analysis using Recurrence Quantification Analysis (RQA) to analyze text data from an Expressive Writing Intervention (EWI) for Danish women treated for breast cancer. The analyses are based on essays from a subsample with an average age of 54.6 years (SD = 9.0), who completed questionnaires for cancer-related distress (IES) and depression symptoms (BDI-SF). The results show a significant association between an increase in recurrent patterns of text structure from first to last writing session and a decrease in cancer-related distress at 3 months post-intervention. Furthermore, the change in structure from first to last essay displayed a moderate but significant correlation with change in cancer-related distress from baseline to 9 months post-intervention. The results suggest that changes in recurrence patterns of text structure might be an indicator of cognitive restructuring that leads to amelioration of cancer-specific distress.
Expressive writing interventions (EWI) have long been viewed as a potentially effective means to reduce the negative psychological and physical consequences of stressful or traumatic experiences for individuals interested in a largely self-help approach, with relatively low cost and relatively little involvement of professionals. EWI is considered to be a psychological/behavioral intervention since it entails having participants write essays on a given topic, encouraging them to express and reflect upon their thoughts and emotions, which may not previously have been shared with others. EWI has been widely investigated as a possible intervention to help individuals who have experienced a variety of stressful or traumatic experiences, notably including the diagnosis and treatment of cancer [3–6]. Results in this literature have been mixed, however, leading to calls for greater research attention to how and for whom EWI is efficacious [5, 7].
Both self-reports and ratings by judges regarding the content of the expressive writing have pointed toward the importance of cognitive change in the form of restructuring, understanding, or the construction of a coherent narrative. Firstly, participants writing about upheavals displayed a greater amount of cognitive change, measured by ratings of understanding of the problem and alternative explanations for it [8, 9]. The presence of cognitive change was furthermore displayed in self-reports, and raters found a progressive construction of a narrative throughout the writings for the participants improving in physical health. Furthermore, raters reported a difference in organization, acceptance, and optimism over time between the texts of those who improved in physical health vs. those who did not improve, with the improvers increasing in these dimensions. Together these findings suggest that an effective intervention consists in the creation of a coherent narrative, rather than the mere pre-existence of one. Hence, it seems that changes in the narrative structure of the text are indicative of cognitive restructuring processes, which might have been stimulated through EWI, and can lead to beneficial effects.
A major methodological impediment to research exploring relationships between the benefits of EWI and changes in the content of the writing has been the time demand of the manual coding process. To address this problem, Pennebaker et al. developed an automated quantification application termed the Linguistic Inquiry and Word Count (LIWC) [12–15]. This application can analyze texts on a word-by-word basis and is able to recognize almost 4,500 words and word stems, whereby content-driven predictors for later physical and psychological adjustment can be derived. One major problem with this and other similar tools (e.g., LSA-based quantification), however, is that they mainly quantify elemental semantic aspects of the texts, which do not necessarily capture large-scale changes of text structure that are related to narrative cohesion as outlined above. A more practical problem is the language-dependence of such tools. While LIWC exists for languages with many users, such as English or German, this is not the case for languages with fewer users, such as Danish.
The aim of the present study is to investigate the potential utility of Recurrence Quantification Analysis (RQA) to address the limitations of LIWC for investigation of the relationships between the benefits of EWI and changes in the content of the writing, by providing a simple, language-free analysis of the degree of text structure that can be interpreted as reflecting narrative coherence. RQA quantifies the temporal correlations in groups of letters within text samples, without any need for prior training or a corpus base. These temporal correlations are argued to constitute fundamental relations within text data, revealing information about the process creating them.
RQA is a type of non-linear correlation technique that can be applied in the study of temporal correlation within time series data, as well as sequential correlations within nominal data sets, such as texts [19, 20]. The core concept of RQA is the examination of recurrence (i.e., patterns that repeat over time). Recurrences are caused by data periodically returning to identical or similar states, and are commonly observed within data from a wide range of phenomena, including the basic sciences (e.g., chemistry), life sciences (e.g., biology), earth sciences (e.g., geology), as well as economics and business. For example, recurrences are seen in the form of protein folding [21, 22], in heart rate [23, 24], skin conduction, joint physiological arousal [26, 27], in brain activity [28, 29], in earthquakes [30, 31], in lake eutrophication, and in stock market exchanges. The method's flexibility resides in the fact that it makes no assumptions about the mathematical structure, stationarity, or even origin of the data [18, 34, 35], making it robust against outliers and misapplications. The method thus provides a means to study non-stationary, non-linear, and relatively short data series.
To our knowledge, only Orsucci et al. [18, 37, 38] have applied RQA to text analysis at the letter level (for a review of RQA applied to language and conversation see ). Orsucci et al. [18, 38] conducted a study of 18 texts (i.e., speech samples and poems) across 3 languages (Italian, Swedish, and English) and found a linear relationship (r = 0.87, p < 0.001) between repetition in letters and overall text structure (i.e., whether these are part of a longer sequence of repeated letters, also termed %determinism). Furthermore, they discovered that simple repetitions of letters (termed %recurrence) can be spread across a wide array of values, depending on the genre (e.g., poems vs. speech samples) and content (e.g., topic) of the text. In other words, this first explorative study found that there exist certain patterns of structure in text across languages that are quantifiable and applicable in the study of language-associated variables.
The objective of the study reported in the present article is to explore whether symptom improvements following an EWI for women undergoing treatment for breast cancer can be predicted by RQA, and how such findings can be interpreted. Only participants from the intervention group in the original study were included in the analysis, since the intervention did not yield a significant main effect on symptoms compared to the control group. The sample of individuals included in the present analysis was selected based on their symptom variation, in order to test whether RQA can be utilized to detect any related variations within text structure. If RQA parameters predict symptom improvement following an EWI, this would provide some encouragement for the utility of the method in the context of EWIs.
Materials and Methods
The cohort investigated has been previously described. Briefly, all women were treated surgically (mastectomy or lumpectomy) for invasive breast cancer, stage I or II, within 3 weeks of their diagnosis between March and September 2006. The eligible patients were contacted by mail 8 to 12 weeks after surgery, or 4 weeks after completion of chemotherapy and/or radiation therapy. From this initial population, 507 eligible participants responded. The responding participants were randomized into an intervention group with three writing sessions (N = 253) and a control group (N = 254) using a computerized stratified sampling method with four mutually exclusive strata reflecting the four standard adjuvant cancer treatment protocols (chemotherapy, radiotherapy, both, or none).
From the 253 participants in the intervention group, a selection procedure from Pennebaker was approximated, resulting in a subsample of 55 participants representing those whose symptoms improved most or least following the EWI. Furthermore, this re-analysis of a subsample was chosen due to the time-consuming work of digitizing the handwritten essays, in our case (3 × 55) 165 essays with an average length of 347.9 words (SD = 132.3 words). Of the 55 participants, 22 wrote about their breast cancer diagnosis, while 28 wrote about other traumatic experiences such as physical abuse, death of a family member, and divorce (5 participants were missing a topic registration). The selection of the subsample was based on the two main outcomes in Jensen-Johansen et al., the two symptom variables: we selected the 15 women showing the largest improvement in depressive symptoms (BDI-SF) and the 15 women showing the least improvement (incl. worsening) in depressive symptoms from baseline to 3 months follow-up. Likewise, the 15 women showing the largest improvement in cancer-related distress (IES) and the 15 women showing the least improvement (incl. worsening) in cancer-related distress from baseline to 3 months follow-up were selected. Improvement was computed by simple subtraction of baseline and 3 months follow-up symptom levels. Due to an overlap of 5 patients between the samples (e.g., highest improver in both depression and cancer-related distress), the sample size did not amount to 60 but to 55 individual participants. Furthermore, two participants had not reported any scores for cancer-related distress, resulting in 53 participants in the analyses of this symptom variable.
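The selection logic described above can be sketched in a few lines. This is a minimal illustration, assuming that change is computed as follow-up minus baseline (so the most negative values are the largest improvements). The function name `select_extremes`, the participant IDs, and the scores are hypothetical, and the handling of ties is not specified in the text.

```python
from typing import Dict, List, Tuple


def select_extremes(
    baseline: Dict[str, float], followup: Dict[str, float], k: int = 15
) -> Tuple[List[str], List[str]]:
    """Return (k largest improvers, k least improvers) ranked by change score."""
    # Change = follow-up minus baseline; negative values mean fewer symptoms.
    change = {pid: followup[pid] - baseline[pid] for pid in baseline if pid in followup}
    ranked = sorted(change, key=lambda pid: change[pid])
    return ranked[:k], ranked[-k:]


# Hypothetical scores for five participants (not data from the study).
baseline = {"p1": 10, "p2": 8, "p3": 12, "p4": 5, "p5": 9}
followup = {"p1": 4, "p2": 9, "p3": 11, "p4": 5, "p5": 2}
improvers, non_improvers = select_extremes(baseline, followup, k=2)
print(improvers, non_improvers)
```

Running the same selection separately on the BDI-SF and IES change scores and taking the union of the selected IDs gives fewer than 60 participants whenever the selections overlap, which is how the subsample of 55 arises.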
The EWI group was asked to write about a traumatic or distressing event, according to the procedure described by Pennebaker and Beall. Within this framework, the participants were encouraged to write freely about the cancer or any other traumatic experience. An excerpt of the instructions is provided below:
“We will ask you to write about the most traumatic, or the most stressful experience in your life. In your writing, we will ask you to explore your deepest emotions and feelings. It is important that you seek deep (…) Ideally we would like you to write about the parts of your experience you may have found difficult to share with others (…)” (p. 16)
The participants were instructed not to focus on spelling and grammar. Both groups were instructed to complete three 20-min writing exercises, 1 week apart, for 3 weeks. Writing took place at home according to previously applied procedures of Zakowski et al. and was facilitated by research assistants calling to initiate and terminate the writing, and to perform a manipulation check. The manipulation check consisted of a short pre- and post-rating on the Profile of Mood State questionnaire (POMS). This was implemented due to an established finding of a temporary peak in negative mood immediately after EWI. Post-intervention questionnaires were mailed to the participants at 3 and 9 months.
Cancer-related distress was measured by the Impact of Event Scale (IES). This questionnaire assesses the amount of intrusive and avoidant symptoms related to the cancer diagnosis and treatment during the last 7 days.
Transcription and Analysis of Expressive Writings
The original documents were hand-written and, thus, had to be digitized. In this process, all grammatical and spelling mistakes were preserved as in the original writings in one version and corrected in a second version. However, no differences in outcome were found between the two versions, so we report the calculations on the uncorrected essays. Punctuation marks as well as signs were omitted for simplicity, but spaces were left in, because they carry information by delimiting and therefore defining word boundaries.
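A preprocessing step of this kind could look as follows. This is only a sketch under stated assumptions: the text says that punctuation and signs were omitted and spaces kept, whereas lowercasing and the collapsing of repeated whitespace are assumptions added here, and `normalize_essay` is a hypothetical name.

```python
def normalize_essay(text: str, lowercase: bool = True) -> str:
    """Drop punctuation and signs, keep letters, digits, and single spaces."""
    # str.isalnum() is Unicode-aware, so Danish letters such as æ, ø, å are kept.
    kept = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    # Assumption: runs of whitespace (including line breaks) become one space.
    kept = " ".join(kept.split())
    # Assumption: case is folded so that "Jeg" and "jeg" count as the same string.
    return kept.lower() if lowercase else kept


print(normalize_essay("Jeg var bange, meget bange!"))
```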
RQA was applied using a toolbox for Matlab R2015b (the CRP toolbox) to quantify the recurrence variables of all 3 × 55 essays. This code automatically calculates all 7 standardized RQA-variables. However, following the methods applied in Orsucci et al., we focus our analysis on the variables of %recurrence and %determinism (see description below).
Parameters for the analysis were likewise chosen based on Orsucci et al., applying a delay of 1 (delaying each essay one letter at a time and checking for recurrences with itself) and an embedding dimension of 3. Thus, we analyzed how triplets of letters recurred in the EWI texts. In this manner, a higher resolution was obtained than with a higher dimensionality (e.g., 4 or 5 letter units), because all tenses of a word can be included and counted as a recurrent word. Conversely, choosing a dimension of 2 (i.e., 2 letter units) might have resulted in an outcome that is almost impossible to interpret, due to the many arbitrary recurrences of two-letter strings in language.
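To make the parameter choice concrete, the following sketch shows what an embedding dimension of 3 with a delay of 1 does to a letter sequence: each position is represented by the triplet of letters starting there. The function name `embed` is hypothetical and is not the CRP toolbox interface.

```python
from typing import List


def embed(text: str, dim: int = 3, delay: int = 1) -> List[str]:
    """Return the delay-embedded states of a character sequence."""
    # Each state takes `dim` characters spaced `delay` apart.
    span = (dim - 1) * delay
    return [text[i : i + span + 1 : delay] for i in range(len(text) - span)]


# With dim=3 and delay=1, the states are overlapping letter triplets.
print(embed("jeg var"))
```

With a dimension of 2 the same text would be cut into letter pairs, which recur far more often by chance. This is the interpretability concern raised above.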
The core tool of RQA is the recurrence plot (RP), originally developed by Eckmann et al. The RP is a similarity matrix indicating the existence of similarity between all specified aspects of the data. This matrix is structured in the same manner as a correlation matrix, charting similarities within a sequence. Importantly, the RP is not just a visualization tool of correlation patterns in a sequence, but further allows for quantifying those correlations. In the present study, the amount of recurrence was measured within each written essay by intra-textual comparison of the letter-series of each essay through the method of applied delays: continuously delaying the letter-series by one letter and marking all instances of identical letters. With this procedure, all possible recurrences within each text were identified and marked with a black dot on a recurrence plot (see Figure 1 below for an illustrative example of the recurrence plot). In the recurrence plot, the development of time is presented along the diagonal, with the main diagonal representing the fact that the time-series without delays will always recur 100 percent with itself. Further, the two triangles on each side of the main diagonal are identical and, thus, present the same information twice. Lastly, due to the applied delay for every round of comparison, the lines parallel to the diagonal represent longer sequences of recurrent letters within each essay, i.e., entire words or even parts of sentences that recur at two different times in the text.
Figure 1. Graphical model of a recurrence plot illustrating the procedure of investigating recurrence in a string of letters. An identical string of letters is presented on the x- and y-axis of the plot and any recurrent letters are marked by a black dot in the plot. In this fashion, the plot will end up as a visual pattern of all recurrent letters throughout the text. To the right are formulas for quantifications of the recurrence plot. The first variable, %recurrence, quantifies the amount of recurrence present relative to all possible recurrences (only the lower triangle is counted, as the upper triangle is its mirror image). %Determinism informs about the amount of structure, given that it measures the proportion of dots (recurrences) that are placed on a line, implying letters that are part of longer repeating sections of text.
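The construction of a recurrence plot for text can be illustrated with a small sketch. It assumes exact matching of letter triplets (embedding dimension 3, delay 1) and uses hypothetical function names; it is a toy illustration and not the CRP toolbox implementation used in the study.

```python
from typing import List


def recurrence_matrix(text: str, dim: int = 3, delay: int = 1) -> List[List[int]]:
    """Binary matrix with 1 where the embedded states at i and j are identical."""
    span = (dim - 1) * delay
    states = [text[i : i + span + 1 : delay] for i in range(len(text) - span)]
    n = len(states)
    return [[int(states[i] == states[j]) for j in range(n)] for i in range(n)]


def render(matrix: List[List[int]]) -> str:
    """Text rendering; rows are reversed so the main diagonal rises to the right."""
    return "\n".join(
        "".join("#" if cell else "." for cell in row) for row in reversed(matrix)
    )


# "abcd" is repeated once, so a short line appears parallel to the main diagonal.
print(render(recurrence_matrix("abcdabcd")))
```

The matrix is symmetric and its main diagonal is always filled, which is why only one triangle needs to be counted when the plot is quantified.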
Webber and Zbilut developed a way to quantify the recurrences and thereby enabled the step from visualization to quantification, i.e., to derive statistics of the recurrent behavior. The quantification furthermore makes the outcome of this non-linear analysis method available for further conventional statistical analyses. The most common quantification variables are %recurrence, describing the amount of recurrence present in the plot relative to all possible recurrences (see Figure 1), and %determinism, describing the percentage of recurrence points that are part of diagonal lines.
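Written out, the two measures can be expressed as follows. This is the conventional formulation, stated here under assumptions: R_{i,j} is 1 if the embedded states at positions i and j are identical and 0 otherwise, N is the number of embedded states, P(l) is the number of diagonal lines of length l in the lower triangle, and l_min is the minimum line length (commonly 2; the value used in the study is not stated in this section).

```latex
\%\mathrm{REC} = 100 \cdot \frac{\sum_{i > j} R_{i,j}}{N(N-1)/2},
\qquad
\%\mathrm{DET} = 100 \cdot \frac{\sum_{l \ge l_{\min}} l \, P(l)}{\sum_{i > j} R_{i,j}}
```

Both sums run over the lower triangle only, in line with the description in Figure 1.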
To understand the measurement variable of %determinism, we need to specify what this means in a textual context. In the reference study by Orsucci et al. [18, 38], the RQA parameters were set to measure 3 letter unit sizes of text, quantifying the repetition of 3 letter units anywhere in the text. By further applying %determinism, the study identified an extension of this phenomenon, measuring consecutive repetition of these 3 letter units, and additionally indicating parts of longer words, complete words, or even parts of longer sections of text that were repeated within the given text. Orsucci et al. reported that %recurrence (repetition of 3 letter units) can vary depending on the type of text quantified, whereas its linear relationship to %determinism indicates that increasing this type of “simple” repetition will further be reflected in an increase of longer sequences of repeated text (%determinism). Longer sequences will most likely consist of parts of or even complete sentences, further indicating that increases in “simple” 3 letter unit repetitions are not random, but embedded within a larger meaningful body of a syntactic structure or narrative. For an overview of parameter estimation and application of recurrence quantification analysis, see Wallot.
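The two measures described above can be computed directly from the letter triplets without storing the full plot. The sketch below is an illustration only: it assumes exact matching of triplets, counts only the lower triangle, and uses a minimum diagonal line length of 2, which is a common default rather than a value stated in the text. The name `rqa_measures` is hypothetical.

```python
from typing import Tuple


def rqa_measures(
    text: str, dim: int = 3, delay: int = 1, min_line: int = 2
) -> Tuple[float, float]:
    """Return (%recurrence, %determinism) for a character sequence."""
    span = (dim - 1) * delay
    states = [text[i : i + span + 1 : delay] for i in range(len(text) - span)]
    n = len(states)
    recurrent = 0  # recurrence points in the lower triangle
    on_lines = 0  # recurrence points lying on diagonal lines >= min_line
    for k in range(1, n):  # k-th diagonal below the main diagonal
        run = 0
        for i in range(k, n):
            if states[i] == states[i - k]:
                recurrent += 1
                run += 1
            else:
                if run >= min_line:
                    on_lines += run
                run = 0
        if run >= min_line:  # close a line that reaches the edge of the plot
            on_lines += run
    possible = n * (n - 1) / 2
    rec = 100 * recurrent / possible if possible else 0.0
    det = 100 * on_lines / recurrent if recurrent else 0.0
    return rec, det


# Two recurrent triplets ("abc", "bcd") that follow each other form one line.
print(rqa_measures("abcdabcd"))
```

In the example, every recurrence point lies on a diagonal line, so %determinism is 100 even though %recurrence is low. A text in which the same triplets recur only in isolation would show recurrence without determinism.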
A series of repeated-measures ANOVAs were run with the RQA-variables for the three written EWI essays, and two independent grouping variables: (1) direction of change in depressive symptoms (BDI-SF) scores from baseline to 3 months follow-up, and (2) direction of change in cancer-related distress (IES) scores from baseline to 3 months follow-up. Further, in order to test any possible long-term predictability of development in text structure, correlations between change scores of %recurrence and %determinism from writing session 1 to 3 and symptom change at 9 months post-intervention were investigated. Lastly, we investigated whether the linear relationship between the two RQA-variables found in Orsucci et al. was present in the two samples of improvers and non-improvers in psychological symptoms.
Note that our data analysis is performed on an extreme-group contrast (i.e., between participants that show most and least improvement in symptoms). Extreme-group contrasts such as ours are based on a pre-selection of participants using prior information about the sample, which can lead to an inflation of the type-I error rate. For the current study, however, we deem this procedure acceptable, because we do not present tests of hypothesis or theory; rather, our research questions are exploratory in nature. The goals are to assess whether RQA can be applied as a text analysis tool in the context of EWIs, and how the results of such analyses can be interpreted. Moreover, this procedure is in line with the general practice of accepting higher type-I error rates for exploratory research.
With regard to depressive symptom improvement, our analysis showed that the sample included in the present study had a slightly higher symptom score at baseline compared to the cohort, mainly driven by the subgroup improving in symptoms, whereas the non-improvers were fairly close to the level of the EWI-arm of the cohort (see Table 1 below). This difference between improvers and non-improvers in baseline depressive symptoms was highly significant [t(35.253) = 4.254, p < 0.001]. Furthermore, they showed a significantly different development in symptoms across time [F(2, 106) = 45.792, p < 0.001]. Lastly, it should be noted that all 4 samples had a BDI-symptom level at baseline that was either minimal (0–4) or mild (5–7), with only the improvers reaching a score of moderate symptoms (8–15).
Table 1. Depressive symptoms and cancer-related distress at baseline for the intervention group of the cohort (N = 273), the subpopulation (BDI: N = 55, IES: N = 53), and improvers (BDI: N = 29; IES: N = 30) and non-improvers (BDI: N = 26, IES: N = 23).
Concerning cancer-related distress at baseline, the symptom score for the subpopulation is slightly higher than for the EWI-group of the cohort. There is a significant difference in baseline symptoms of cancer-related distress [t(51) = 3.543, p = 0.001] between the improvers and non-improvers. Furthermore, the development in symptoms from baseline to 9 months follow-up also differed highly significantly between the two groups [F(2, 102) = 49.430, p < 0.001].
Differences in Development of Structure Between Groups
To create an impression of the development in text structure that may be observed for a specific participant, Figure 2 below displays distance plots (a color-saturated form of a recurrence plot) of essays 1 and 3 for a participant in the group improving in symptoms.
Figure 2. Distance plots from a participant improving her text structure from essay 1 to 3. (A) Essay from writing session 1, and (B) Essay from writing session 3. Distance plots are related to the conventional recurrence plot, but can be more informative in cases where the recurrence rate is relatively low, as is the case for our transcribed texts (for examples of a recurrence plot of an expressive essay, see Appendix). In distance plots the black dots of recurrences are replaced with a color indicating the distance to the next recurrence point, as shown by the color bar at the bottom of the plot, with red indicating short distances and white and blue indicating longer distances. This creates a plot fully saturated in color that is therefore more intuitively “read”.
Interpreting plot A, it can be observed that the distribution of recurrence dots displays a pattern almost resembling a chess board, with white stripes crossing the plot in both directions (see gray arrows). These white stripes are indicative of periods where the distance to the next recurrence point is relatively far, implying that these are periods of low redundancy or, in other words, a high degree of change in the data. In contrast, plot B is slightly more saturated with red and shows fewer white stripes, in total indicating higher redundancy.
To investigate the assumption that any improvement in symptoms is related to change in text structure, we performed 4 repeated-measures ANOVAs with symptom change at 3 months follow-up as the independent variable, and either %determinism or %recurrence as the dependent variable (see Table 2 below).
For depressive symptoms (BDI-SF), two repeated-measures ANOVAs were performed, showing no significant differences in the development of %recurrence and %determinism between the improvers and non-improvers in depressive symptoms.
The same analyses were performed for cancer-related distress (IES), with %recurrence and %determinism as the dependent variables. Again, the development of %recurrence did not significantly differ for the improvers and non-improvers in cancer-related distress, while %determinism reached a marginally significant effect (p = 0.059) (see Figure 3 below).
Figure 3. Graph of development in %determinism across the EWI for improvers and non-improvers in cancer-related distress at 3 months follow up.
Independent t-tests showed that this result was due to the increase in %determinism from essay 1 to 3 [t(51) = 2.296, p = 0.026], representing a medium effect (d = 0.636) (see Figure 4 below). It is, however, important to note that the 95% confidence intervals are slightly overlapping.
Figure 4. Change in %determinism from essay 1 to 3 for improvers and non-improvers in cancer-related distress at 3 months follow-up.
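As a consistency check on the reported effect size, Cohen's d for two independent groups can be recovered from a pooled-variance t statistic as d = t · sqrt(1/n1 + 1/n2). The sketch below assumes a pooled-variance (Student) t-test, which is consistent with the reported 51 degrees of freedom for groups of 30 and 23 participants. The function names are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t_and_d(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, int]:
    """Student's t, Cohen's d (pooled SD), and degrees of freedom for two groups."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df)
    d = (mean(a) - mean(b)) / pooled_sd
    t = d / math.sqrt(1 / n1 + 1 / n2)
    return t, d, df


def d_from_t(t: float, n1: int, n2: int) -> float:
    """Recover Cohen's d from a pooled-variance t statistic and group sizes."""
    return t * math.sqrt(1 / n1 + 1 / n2)


# Reported values: t(51) = 2.296 with 30 improvers and 23 non-improvers.
print(round(d_from_t(2.296, 30, 23), 3))  # approximately 0.636, matching the reported d
```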
Correlation Between Change in RQA Measures of Text Structure and Symptom Change at 3 and 9 Months
To investigate whether the text structure was associated with symptom change at 9 months follow-up, Pearson correlations were computed between change scores for %recurrence and %determinism from essay 1 to 3 and change scores of depressive symptoms and cancer-related distress from baseline to 9 months follow-up. The chosen change scores of text structure (from essay 1 to 3) are guided by the findings in Pennebaker and the findings of the ANOVAs reported above.
A correlation analysis (see Table 3 below) showed no significant correlation between change in %determinism and change in depressive symptoms from baseline to 9 months follow-up. The change scores of %determinism from essay 1 to 3 however showed a significant correlation of r = −0.372 with change in cancer-related distress from baseline to 9 months follow-up.
Table 3. Correlation between change scores of RQA-variables from first to third writing session and change scores of psychological symptoms from baseline to 9 months post-intervention.
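The change-score correlation can be sketched as follows. The values are hypothetical and serve only to show the direction of the computation: text change is essay 3 minus essay 1, symptom change is 9 months minus baseline, so a negative r means that increases in %determinism go along with decreases in distress. The function names are illustrative.

```python
import math
from typing import List, Sequence


def change_scores(first: Sequence[float], last: Sequence[float]) -> List[float]:
    """Per-participant change: last measurement minus first measurement."""
    return [b - a for a, b in zip(first, last)]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for four participants (not data from the study).
det_essay1 = [20.0, 25.0, 22.0, 30.0]
det_essay3 = [26.0, 24.0, 27.0, 29.0]
ies_baseline = [30.0, 18.0, 25.0, 20.0]
ies_9_months = [15.0, 19.0, 14.0, 22.0]

r = pearson_r(
    change_scores(det_essay1, det_essay3),
    change_scores(ies_baseline, ies_9_months),
)
print(round(r, 3))
```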
Correlation Between RQA Measures of Improvers and Non-improvers
Given that the linear relationship between text structure variables indicates some form of “natural” or “normal” relationship (i.e., the correlation of r = 0.87 found by Orsucci et al.), it would be interesting to test whether this relationship may serve as a predictor of psychological symptoms. To explore how this relationship between the two RQA-variables changes throughout the writings of both improvers and non-improvers in psychological symptoms, we investigated the correlation coefficients between the two variables for both groups (see Table 4 below).
Table 4. Correlation between %recurrence and %determinism for each of the three writing sessions for the improvers and non-improvers in cancer-related distress.
As can be observed in Table 4, more associations are observed between %recurrence and %determinism for the improvers in both symptom groups. In both improver groups, significant positive correlations are observed within each essay and between the two RQA-variables across essays. However, the correlation strengths differ across the two symptom groups, with increases in strength across essays for the improvers in cancer-related distress. This is observed within essays from the first to the third, but also between %recurrence in essay 1 and %determinism in essays 1, 2, and 3. Similarly, improvers in depressive symptoms show significant correlations for similar associations, that is, within every essay and, conversely, between %determinism in essay 1 and %recurrence in essays 1, 2, and 3. In conclusion, the findings suggest that both improver groups display the expected associations between %recurrence and %determinism, indicating that more short repetitions in a text are associated with more of the longer repeated sequences in the same text.
Lastly, non-improvers in both symptom groups display a significant correlation within essay 1, but no further significant associations. This overall relationship between simple (%recurrence) and longer repetitive sequences (%determinism) therefore seems to be violated for non-improvers, leading to texts with high amounts of repeated 3 letter sequences, distributed in a less deterministic manner, implying less coherence or structure in the repetitions.
The study showed that change in %determinism from essay 1 to 3 was related to improvement in cancer-related distress at 3 months post-intervention, as well as significantly correlated with improvement in cancer-related distress at 9 months post-intervention. This conforms to the findings in Pennebaker, who reported associations between physical health and change in word use from first to last writing session, as well as increases in qualitative ratings of the organization of text across time. No other significant associations between symptom change and the RQA parameters %recurrence or %determinism in the texts were found.
Relating these findings to the results of Orsucci et al. [18], we first discuss the findings for %recurrence. In our results, %recurrence consistently showed no significant associations, displaying an almost identical change for the improvers and non-improvers. Across the entire re-analysis sample, an increase in %recurrence was observed across time [F(2, 108) = 13.914, p < 0.001]. This means that participants came to use a specific set of triplets (perhaps syllables or words) more systematically. They may have settled on a specific vocabulary to describe their situation. However, this narrowing in on a specific set of elements per se did not seem to capture linguistic structure relatable to the change of symptoms.
A different pattern was observed for our findings of %determinism. As mentioned, %determinism refers to the degree of linguistic structure present in the essays (in the form of several consecutive 3 letter units being repeated). Improvement in cancer-related distress was associated with an increase in this structure, ranging from a minimum of 6 repeated letters up to a maximum of 51 repeated letters within the current texts. The linear relationship discovered by Orsucci et al. implies that the amount of repeated letters (%recurrence) shows a linear relation with the degree to which these are part of a longer sequence of repeated letters (%determinism). According to this relationship, %determinism would be expected to increase along with %recurrence across EWI essays. The increase of %determinism for improvers in cancer-related distress corroborates this relationship, displaying an increase across time, along with the increase in %recurrence. For non-improvers in cancer-related distress, the opposite was the case, as these texts displayed a slight decrease in %determinism, despite the increase in %recurrence. This means that even though more repetitions of 3 letter units are observed for both improvers and non-improvers in cancer-related distress, these text bits are not part of a larger coherent word or sentence structure in the case of the non-improvers, suggesting a more sporadic pattern of repetitions in the text.
Due to the correlational nature of this study, it is not possible to determine whether %determinism in text is causally related to changes in psychological symptoms. Alternatively, the association could be explained by an underlying relationship, reflected in any of the writings of individuals. Benefits of EWI do have a component of reflection about one's situation in them, and there is surely an ongoing feedback loop between reflection and action; that is, one changes one's habits and perspective on things based on a reflection process, and those new habits and perspectives are the basis for new reflection processes. As we will discuss further down, we interpret the current findings in terms of change in cognition, and insofar as this change in cognition does not (exclusively) happen during the writing sessions, changes in RQA-measures probably pick up changes that have, in substantial part, been happening outside of a concrete writing session. Future experimental studies are required to address this question. However, for the present re-analysis it is highly relevant to discuss what in the texts might potentially cause an increase in %determinism from first to last writing session.
What Can Increased Linguistic Structure Imply?
A prior study of three different EWI populations showed significant correlations between the average change of pronouns from essay to essay (i.e., non-similarity across essays) and number of visits to the doctor following the intervention, with a higher average change correlating with fewer visits to the doctor. Interestingly, the following words: “I, my, it, you, me, she, he, her, we, they, your, him, his, them, our, myself, their, us, and its.” (, p. 63) accounted for only 0.06% of the total unique words in the texts, but produced significant correlations with the number of doctor visits ranging from r = 0.35 to 0.50. In addition, a meta-analysis of the texts of the three populations showed a weighted effect size of d = 1.15 (i.e., weighted by degrees of freedom). Surprisingly, only pronouns showed significant correlations, whereas other function words (i.e., prepositions, conjunctions, or auxiliary verbs) and content words did not. Additionally, an identical analysis applied to a control population writing about facts did not find any significant correlations with doctor visits, which ties this correlation to the act of writing in an emotionally expressive paradigm. Unfortunately, these findings do not provide any information on what caused this dissimilarity in pronouns. However, the current re-analysis provides a hypothesis, given that improvements in psychological symptoms correlate with a rise in %determinism, implying a higher degree of repeated syllables, words (or sentences) in the third essay compared to the first. Combined, this suggests that differences in pronouns across essays could be caused by fewer but more often repeated pronouns in the last essay.
In order to generate future hypotheses for this rise in %determinism, a qualitative analysis was performed on a subsample of texts from the re-analysis. This pilot analysis was performed on essays 1 and 3 for the 10 patients showing the greatest increase in %determinism, as well as the 10 patients showing the least improvement or even worsening (Bjørndahl and Lyby, unpublished). In this naive analysis, both groups (improvers and non-improvers) displayed a shift in their use of pronouns from the first to the last essay. Patients increasing in %determinism tended to move from a frequent change in pronouns providing a multiplicity of different perspectives in essay 1 (e.g., she, they, it, etc.) to a more consistent focus on pronouns expressing the perspective of the self (e.g., I, me, it) in essay 3. In contrast, the participant essays that did not show an increase in %determinism showed the reverse pattern, with a frequent use of I, there, or it as pronouns in essay 1, and a change to pronouns expressing different perspectives (e.g., we, they, he, everybody) in essay 3. Furthermore, for the group with increasing %determinism, a qualitative shift in narrative structure was observed in essay 3, toward more coherent and structured essays that focused on a single story and/or few topics. Additionally, the group improving in symptoms displayed a majority of regular, medium-length sentences, whereas the group that was worsening showed a mix of very long, very short, and medium-length sentences. To sum up, the shift in types of applied pronouns, the decrease in number of topics, and the heightened regularity in sentence length all point toward the improving group experiencing a form of condensation or resolution in the last essay, whereas the group worsening in symptoms displayed the opposite movement of more varied pronouns, more topics, and less regularity in sentence length in the last essay.
While we acknowledge that this preliminary qualitative pilot analysis cannot claim to be representative or statistically significant, it was performed to inform hypotheses on the cause of this change in structure. Based upon the pilot analysis, themes such as a shift in consistency of pronouns, narrative structure, and consistency of sentence lengths may be suggested for future investigations.
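As one possible starting point for such investigations, two of these themes could be quantified along the following lines. This is a rough Python sketch under stated assumptions and not a procedure used in the pilot analysis: the pronoun lists are illustrative English examples (the essays in this study were written in Danish), sentences are split naively on terminal punctuation, and the function names `pronoun_self_share` and `sentence_length_cv` are hypothetical.

```python
import re
import statistics

# Illustrative English pronoun sets; a Danish analysis would need its own lists.
SELF_PRONOUNS = {"i", "me", "my", "myself"}
OTHER_PRONOUNS = {"you", "your", "she", "he", "her", "him", "his",
                  "we", "us", "our", "they", "them", "their"}


def pronoun_self_share(text):
    """Share of self-referential pronouns among all counted pronouns (0 to 1)."""
    words = re.findall(r"[a-z']+", text.lower())
    self_n = sum(w in SELF_PRONOUNS for w in words)
    other_n = sum(w in OTHER_PRONOUNS for w in words)
    total = self_n + other_n
    return self_n / total if total else float("nan")


def sentence_length_cv(text):
    """Coefficient of variation of sentence length in words (lower = more regular)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return float("nan")
    return statistics.pstdev(lengths) / statistics.mean(lengths)


essay = "I told my sister. She told me."
print(pronoun_self_share(essay))   # 0.75 (I, my, me versus she)
print(sentence_length_cv(essay))   # about 0.143 (sentence lengths 4 and 3)
```

Comparing these two values between the first and last essay of a participant would give a simple numerical counterpart to the shifts described qualitatively above.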
Further, a tentative interpretation supported by the studies presented in the introduction is that a shift in narrative structure, reflected in the increase in consistency within pronouns, could be one of the phenomena reflected in the change in %determinism from the first to third essay. However, this interpretation is tentative, given that the cause of the increased %determinism is not investigated in this study. Moreover, an alternative (or additional) hypothesis is that triplet-recurrences could be driven by basic prosodic changes that lead to regular structures in orthography on the syllabic level. The linking factor could be the correlation between emotion regulation and arousal-inducing formulations, which are primarily or substantively based on prosodic features. Such features trickle from prosody into orthography through the grapheme-phoneme relations of a language, and are most salient in poetic language via features such as rhyme and meter, which are explicitly used to induce emotional changes on the side of the reader or recipient [54, 55]; see also Orsucci et al.
Such changes would be part of our results and add to the overall RQA measures, because participants write about an emotional, arousing topic. However, our texts did not seem to feature strong patterns of metric components; still, prosody-driven formulations have to be considered as a source of recurrences in the present context.
Creation of a Narrative
It is tempting to speculate on what this heightened consistency or change in %determinism in word use means psychologically. In the following paragraph, we will investigate potential hypotheses on what the change in %determinism of text structure may imply. In the qualitative study mentioned above, the observed progression in pronouns may reflect the progression expected in a narrative. Narrative is a widely used concept with many definitions, but in this discussion it refers to an organization and story-telling aspect. A narrative may be a structured story-telling including information about the circumstances prior to and during an event, followed by consequences of the event, among others what happened and what the involved individual was thinking and feeling. It is possible that structured story-telling such as a narrative implies a variety of different pronouns during the first essay(s) to describe the event and the circumstances leading up to it. In contrast, the last essay is likely to contain more self-references due to reflection on one's thoughts and feelings about the event. Participants showing a pattern of no increase in %determinism may lack this reflection on personal consequences during the latter part of the intervention, reflected by the lack of transition in use of pronouns. This interpretation is based on the finding that pronouns often refer to relational information (e.g., us and them, me and her, we and them, I and him, etc.) or others in general (he, they, it, she, us, I, etc.). Additionally, it is even argued that resolving a traumatic event revolves around “…thinking about oneself in relation to others…” (, p. 64; cf. [57–59], all in ). The importance of relationships is furthermore seen in the effects of dyadic coping during cancer treatment [60, 61], which may support the use of relational pronouns, while the application of a general other construct may resemble a lack of integrating oneself into the situation.
Concluding this interpretation, one hypothesis could be that the creation of a narrative mediates the association between %determinism and cancer-related distress. Such an interpretation also fits the findings of the vast body of research on how changes in narrative structure are associated with mental health status (e.g., [62–64]).
The third and last question in this section is why the depressive symptoms did not show any association to %determinism. Depressive symptoms and cancer-related distress display very different characteristics. Depressive symptoms revolve around affective dimensions such as sadness, loss of pleasure, loss of interest, and irritability, etc., whereas cancer-related distress consists of the dimensions intrusion and avoidance. The two symptom categories can be argued to consist of different processes, one of over-engagement (intrusion and avoidance) and the other of lack of engagement (loss of interest and pleasure). A writing intervention may therefore affect these processes differently, e.g., desensitizing the over-engaged mind in the case of IES symptoms, but not necessarily engaging the depressed mind.
However, other hypotheses can be proposed. Firstly, given the assumption of pronouns resembling sociality or processing of social relationships, depressive symptoms may be less associated with this part of the individual's psychology. Secondly, cancer-related distress is an affect related to a very specific theme, and processing this theme may be more likely to influence the related symptoms than to influence a trait-dependent tendency toward depressive symptoms.
However, it is worth noticing that the depressive symptom level in this sample at baseline was below the suggested clinical cut-off, and a third and likely explanation may therefore be that the participants were too well-adapted to experience any significant improvements, which is supported by a rather stable condition (a change of only 1.51 points) across the intervention.
Limitations and Future Research
There are several limitations or drawbacks in this study. First, the selection of a subpopulation introduces some weaknesses concerning possible conclusions of the analysis, i.e., it leaves the data non-representative of the investigated population and reduces the findings to being exploratory.
Secondly, due to the correlational character of the study, the causality between change in text structure and psychological symptoms is not investigated, leaving the findings hard to interpret.
Thirdly, a limitation exists in the fact that the initial study did not find a significant effect of the intervention, which may also be reflected in the missing findings in the present study. Since the EWI was not associated with significant improvements in either depressive symptoms or cancer-related distress, the chance of success when analyzing textual structure as a potential effect mediator is very limited. However, since repeated ANOVAs did reveal a significant difference in the development of symptom variables across time, the current text analysis probably quantifies changes in text structure associated with naturally occurring symptom reduction rather than intervention-induced reductions.
Lastly, the inclusion procedure based on two different symptom variables may not have been the optimal selection paradigm, causing the findings to represent a mixed population. The alternative would have been to keep the four selection groups intact (Impr. BDI-SF, Non-impr. BDI-SF, Impr. IES, and Non-impr. IES), and to omit the inclusion of existing values for the second symptom category. This would, however, call for the inclusion of more transcribed essays to reach an acceptable level of power.
Regarding future research pursuing the results of this study, an instruction facilitating higher structure in the writings may be the most straightforward way to test causality. However, testing this hypothesis would require an instruction facilitating texts with a change toward higher %determinism from first to last essay. The lack of knowledge of the words causing an increase in %determinism implies that the most straightforward hypothesis would be to assume a generally higher repetitiveness in the latter essay for the participants who improve. Testing this hypothesis would require asking participants to write as variedly as possible in the first essay and as repetitively as possible in the last essay, or to use a variety of pronouns in the first couple of essays and to sum up their personally experienced learning in the last. A pilot study explored an attempt to boost an effect by asking 14 students to write an essay applying the specific word categories associated with improved physical health, and found that although the students found these writings less personal and more difficult, they also experienced them as more valuable and meaningful compared to the regular writing exercises. Moreover, such a test should be performed on a new, randomly drawn sample of participants in order to preserve the type-I error rate.
However, given that we do not know anything about the causality behind this correlation, it could also be that these instructions may only have an instrumental effect not causing any symptom improvement, due to no change in narrative. In this case, more studies of narrative creation may be called for, asking participants to first unfold their thoughts during the first writing exercises and later to synthesize their learning from this in the latter exercise.
Results from the current analysis of the writing content from a published EWI study with depressive symptoms and breast cancer-specific distress as outcomes partly support the hypothesis that change in text structure is associated with change in psychological symptoms in an EWI with a subsample of Danish women treated for breast cancer. This was, however, only confirmed for the relative change from essay 1 to 3 in the text structure variable of %determinism and its association with cancer-related distress. No significant results emerged for either the text structure variable of %recurrence or depressive symptoms.
More specifically, the results showed that a change in %determinism from essay 1 to 3 was associated with the direction of change in cancer-related distress at 3 months post-intervention, with improvers in symptoms showing an increase in %determinism. Furthermore, a significant correlation was observed between change in %determinism from essay 1 to 3, and change in cancer-related distress at 9 months post-intervention, with increased %determinism predicting a decrease in cancer-related distress.
However, it is important to remember that these associations were present in a selected pilot population. This stresses the need to replicate the findings in a freshly sampled and representative population, to test whether these text structure effects are still present. The findings therefore rather point to RQA being able to capture changes in text related to psychological symptoms, possibly through mechanisms of narrative restructuring. Testing this implicit hypothesis would, however, require controlled experiments manipulating this variable.
Recurrence quantification analysis may therefore function not only as a monitoring tool in interventions facilitating adaptation to the impacts of a life event, but also as a general monitoring tool of psychological well-being.
ML prepared the data and wrote the manuscript. ML, JP, and SW conducted data analysis. ML, JP, MM, AJ, SW, and DB provided critical revisions to the manuscript.
Preparation of this manuscript was supported by the Danish Cancer Society (grant no. PP04034), as well as by Seed Funding from the Interacting Minds Centre at Aarhus University (SEED 2014-2, Expressive writing in cancer patients—a one-person dialogue?) to SW. Furthermore, SW acknowledges funding by the Marie-Curie Initial Training Network, TESIS: Toward an Embodied Science of InterSubjectivity (FP7-PEOPLE-2010-ITN, 264828).
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We want to thank Mikael Jensen-Johansen for kindly collecting and providing the subsample of data.
5. Zachariae R, O'Toole M. The effect of expressive writing intervention on psychological and physical health outcomes in cancer patients - a systematic review and meta-analysis. Psychooncology. (2015) 24:1349–59. doi: 10.1002/pon.3802
6. Zhou C, Wu Y, An S, Li X. Effect of expressive writing intervention on health outcomes in breast cancer patients: a systematic review and meta-analysis of randomized controlled trials. PLoS ONE. (2015) 10:e0131802. doi: 10.1371/journal.pone.0131802
7. Nyssen OP, Taylor SJC, Wong G, Steed E, Bourke L, Lord J, et al. Does therapeutic writing help people with long-term conditions? Systematic review, realist synthesis and economic considerations. Health Technol Assess. (2016) 20:1–367. doi: 10.3310/hta20270
17. Wolf M, Horn AB, Mehl MR, Haug S, Pennebaker JW, Kordy H. Computer-aided quantitative textanalysis: equivalence and reliability of the German adaptation of the Linguistic Inquiry and Word Count. Diagnostica. (2008) 54:85–98. doi: 10.1026/0012-1924.54.2.85
21. Webber CL Jr, Giuliani A, Zbilut JP, Colosimo A. Elucidating protein secondary structures using alpha-carbon recurrence quantifications. Proteins Struct Function Bioinform. (2001) 44:292–303. doi: 10.1002/prot.1094
22. Giuliani A, Benigni R, Zbilut JP, Webber CL, Sirabella P, Colosimo A. Nonlinear signal analysis methods in the elucidation of protein sequence-structure relationships. Chem Rev. (2002) 102:1471–92. doi: 10.1021/cr0101499
23. Zbilut JP, Webber CL Jr, Zak M. “Quantification of heart rate variability using methods derived from nonlinear dynamics,” in: Analysis and Assessment of Cardiovascular Function. New York, NY: Springer. (1998) 324–334.
24. Zbilut JP, Hu Z, Giuliani A, Webber CL. Singularities of the heart beat as demonstrated by recurrence quantification analysis. In: Proceedings of the 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Vol. 4, Cat. No. 00CH37143 IEEE. (2000). p. 2406–9.
25. Goshvarpour A, Abbasi A, Goshvarpour A, Daneshvar S. Discrimination between different emotional states based on the chaotic behavior of galvanic skin responses. Signal Image Video Process. (2017) 11:1347–55. doi: 10.1007/s11760-017-1092-9
26. Konvalinka I, Xygalatas D, Bulbulia J, Schjødt U, Jegindø EM, Wallot S, et al. Synchronized arousal between performers and related spectators in a fire-walking ritual. Proc Natl Acad Sci USA. (2011) 108:8514–19. doi: 10.1073/pnas.1016955108
27. Mønster D, Håkonsson DD, Eskildsen JK, Wallot S. Physiological evidence of interpersonal dynamics in a cooperative production task. Physiol Behav. (2016) 156:24–34. doi: 10.1016/j.physbeh.2016.01.004
32. Zaldívar JM, Strozzi F, Dueri S, Marinov D, Zbilut JP. Characterization of regime shifts in environmental time series with recurrence quantification analysis. Ecol Modell. (2008) 210:58–70. doi: 10.1016/j.ecolmodel.2007.07.012
33. Belaire-Franch J, Contreras-Bayarri D. Assessing non-linear structures in real exchange rates using recurrence plot strategies. Econ Anal. (2002) 171:249–64. doi: 10.1016/S0167-2789(02)00625-5
35. Webber CL, Zbilut JP. Recurrence quantification analysis of nonlinear dynamical systems. In: Riley A, Van Orden GC, editors. Contemporary Nonlinear Methods for the Behavioral Sciences. (2005). pp. 26–94. Available online at: http://www.nsf.gov/sbe/bcs/pac/nmbs/nmbs.jsp (accessed March 20, 2015).
37. Orsucci F, Walter K, Giuliani A, Webber CL, Zbilut JP. Orthographic structuring of human speech and texts: linguistic application of recurrence quantification analysis. Int J Chaos Theory Appl. (1999) 4:21–8.
40. Jensen-Johansen MB, Christensen S, Valdimarsdottir H, Zakowski S, Jensen AB, Bovbjerg DH, et al. Effects of an expressive writing intervention on cancer-related distress in Danish breast cancer survivors - results from a nationwide randomized clinical trial. Psychooncology. (2013) 22:1492–500. doi: 10.1002/pon.3193
42. Zakowski SG, Ramati A, Morton C, Johnson P, Flanigan R. Written emotional disclosure buffers the effects of social constraints on distress among cancer patients. Health Psychol. (2004) 23:555–63. doi: 10.1037/0278-6133.23.6.555
47. Furlanetto LM, Mendlowicz MV, Bueno JR. The validity of the Beck Depression Inventory - short form as a screening and diagnostic instrument for moderate and severe depression in medical patients. J Affect Disord. (2005) 86:87–91. doi: 10.1016/j.jad.2004.12.011
53. Wiethoff S, Wildgruber D, Kreifelts B, Becker H, Herbert C, Grodd W, et al. Cerebral processing of emotional prosody-influence of acoustic parameters and arousal. Neuroimage. (2008) 39:885–93. doi: 10.1016/j.neuroimage.2007.09.028
54. Menninghaus W, Bohrn IC, Knoop CA, Kotz SA, Schlotz W, Jacobs AM. Rhetorical features facilitate prosodic processing while handicapping ease of semantic comprehension. Cognition. (2015) 143:48–60. doi: 10.1016/j.cognition.2015.05.026
60. Lafaye A, Petit S, Richaud P, Houédé N, Baguet F, Cousson-Gélie F. Dyadic effects of coping strategies on emotional state and quality of life in prostate cancer patients and their spouses. Psychooncology. (2014) 23:797–803. doi: 10.1002/pon.3483
61. von Heymann-Horan A, Bidstrup PE, Johansen C, Rottmann N, Andersen EAW, Sjøgren P, et al. Dyadic coping in specialized palliative care intervention for patients with advanced cancer and their caregivers: effects and mediation in a randomized controlled trial. Psychooncology. (2019) 28:264–70. doi: 10.1002/pon.4932
62. Adler JM, Turner AF, Brookshier KM, Monahan C, Walder-Biesanz I, Harmeling LH, et al. Variation in narrative identity is associated with trajectories of mental health over several years. J Pers Soc Psychol. (2015) 108:476. doi: 10.1037/a0038601
63. Dunlop WL, Tracy JL. Sobering stories: narratives of self-redemption predict behavioral change and improved health among recovering alcoholics. J Pers Soc Psychol. (2013) 104:576. doi: 10.1037/a0031185
Figure A1. Example recurrence plot of the text data. The plot was generated by the CRP toolbox for MatLab .
Keywords: expressive writing intervention, text structure, recurrence quantification analysis, cognitive restructuring, narrative
Citation: Lyby MS, Mehlsen M, Jensen AB, Bovbjerg DH, Philipsen JS and Wallot S (2019) Use of Recurrence Quantification Analysis to Examine Associations Between Changes in Text Structure Across an Expressive Writing Intervention and Reductions in Distress Symptoms in Women With Breast Cancer. Front. Appl. Math. Stat. 5:37. doi: 10.3389/fams.2019.00037
Received: 08 February 2019; Accepted: 12 July 2019;
Published: 30 July 2019.
Edited by: Peter beim Graben, Brandenburg University of Technology Cottbus-Senftenberg, Germany
Reviewed by: Franco Orsucci, University College London, United Kingdom
Michael Spivey, University of California, Merced, United States
Qian Lu, University of Houston, United States
Alex Karan, University of California, Riverside, United States, in collaboration with reviewer QL
Copyright © 2019 Lyby, Mehlsen, Jensen, Bovbjerg, Philipsen and Wallot. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Sebastian Wallot, firstname.lastname@example.org