The Healing Encounters and Attitudes Lists (HEAL): Psychometric Properties of a German Version (HEAL-D) in Comparison With the Original HEAL

Introduction: In recent years, interest has grown in understanding health improvements that occur due to non-specific treatment effects rather than in response to the specific active treatment ingredients. Nevertheless, investigations of patients’ idiosyncratic perspectives on the non-specific aspects of the healing encounter or of the treatment itself that contribute to placebo effects are still rare. The Healing Encounters and Attitudes Lists (HEAL) offer a unique and parsimonious set of instruments to measure patients’ views on a variety of non-specific aspects of the caring encounter. The HEAL items can be administered as computerized adaptive tests or short forms that assess the patient-provider connection, the healthcare environment, treatment expectancy, positive outlook, spirituality, as well as attitudes towards complementary and alternative medicine. So far, no German version of the HEAL exists. Methods: The original 168 HEAL items were translated into German (HEAL-D) applying a translation-back-translation procedure. We examined the psychometric properties of HEAL-D in a sample of 165 participants who reported at least one healthcare visit during the last year. Results: The German short forms of HEAL (HEAL-D-SF) showed good internal consistency and test-retest reliability. The factor structure observed in the English original items showed low to moderate model fit in our sample. Discussion: The development of a German version of HEAL in addition to the original English items offers new possibilities for investigating patients’ idiosyncratic perspectives on the non-specific aspects of treatments across language borders. We close by presenting possible clinical applications as well as promising and relevant future research directions using HEAL-D-SF, including, for instance, large-scale, cross-national investigations.


INTRODUCTION
Different aspects of healthcare interventions and of the healing encounter itself may influence health outcomes and well-being of patients. Typically, these aspects have been classified into two groups: First, certain treatment components are deduced from specific treatment theories, and have been referred to as characteristic, active, or (disorder-)specific treatment components (1). They are assumed to actively and directly affect health and symptom improvement (e.g. pharmacological ingredients in medications, particular exercises in physiotherapy, or the confrontation with a feared stimulus in exposure-based psychotherapy). Second, healthcare interventions typically take place in a context of care (2) in which additional aspects, such as the therapeutic bond or relationship between a healthcare professional and a patient (3), a plausible rationale for the treatment (4), the treatment providers' warmth (5,6) as well as aspects of the treatment setting and environment, impact treatment success (7). These aspects have previously been labelled as non-specific, common, general, incidental, or contextual and their effects are typically described as placebo effects. While there are conceptual differences between the individual labels, all these aspects are assumed to be interacting with the characteristic, active or specific treatment components in contributing to health improvements. In the following we will use the terms specific effects when referring to the first kind of treatment effects and non-specific effects when referring to the latter kind of treatment effects.
In healthcare outcome research, which aims at identifying efficacious active treatments and treatment components, placebos (and other inert treatments) are used to keep all of the non-specific treatment components constant, while manipulating the presence of the specific treatment component. Accordingly, controlling for the non-specific treatment effects in placebo-controlled randomised trials became the gold standard in healthcare research (8). However, when evaluating more complex treatment packages, the realization of a high-quality placebo-controlled study design, intended to control for the non-specific treatment effects, turned out to be a challenge (9-12). In addition, the validity of distinguishing between specific and non-specific treatment components has been questioned empirically (13-16), as well as theoretically (17-19).
When turning from the highly controlled setting of health outcome research towards the practice of healthcare, where the actual improvement of a presenting patient's health is the major goal, several questions regarding the role of the non-specific treatment aspects and their potential effects arise: How relevant are the placebo effects, and thus the effects of the non-specific treatment components? How much do they contribute to patients' health improvement? Do certain patients benefit more from non-specific treatment components than others? And can non-specific treatment aspects support and boost the effectiveness of a standard treatment (18,20,21)?
Recently, an increased interest in understanding and investigating the effects of non-specific treatment aspects can be observed. This research has shown that in addition to the above-mentioned non-specific aspects of the healthcare encounter itself, patients' perceptions and attitudes are associated with health-related outcomes across diverse healthcare settings. These perceptions and attitudes include patients' treatment outcome expectations (22-26), patients' trust in their treatment provider (27), or patients' spirituality (28,29). Accordingly, detailed knowledge about a particular patient's perception of and attitudes towards certain non-specific treatment aspects might enable treatment providers to specifically tailor the context in which interventions take place, as well as the intervention itself, to that patient's needs.
The "Healing Encounters and Attitudes Lists" (HEAL) have been developed as a precise and concise set of patient-report measures for assessing attitudes towards and perceptions of several treatment components that are associated with non-specific treatment effects (30). HEAL item banks were constructed following the rigorous instrument development methodology of PROMIS® (31, 32), which combines literature reviews, surveys, clinician interviews, focus groups, cognitive interviews to assess item clarity, exploratory and confirmatory factor analyses, and item response theory methods. The convergent and discriminant validity of the initial items was demonstrated in two samples with over 1600 participants (30). The final item banks include a total of 168 items reflecting six scales: patient-provider connection (57 items, e.g., I trust my healthcare provider.), healthcare environment (25 items, e.g., My care was well organized.), positive outlook (27 items, e.g., I am hopeful about my future.), treatment expectancy (27 items, e.g., I expect good outcomes of this treatment.), spirituality (26 items, e.g., Spiritual beliefs give me hope.), and attitude toward complementary and alternative medicine (CAM; 6 items, e.g., I prefer natural remedies.). Participants are asked to rate items in relation to their current treatment on a five-point Likert scale (never, rarely, sometimes, often, and almost always). The items are generally applicable in clinical practice, and are not restricted to any type of treatment modality. The HEAL scales are independent of one another: researchers or clinicians can choose which HEAL scales to use. HEAL scales can also be administered as computerized adaptive tests. In computerized adaptive testing, the test is adapted individually to the test taker's responses: if the HEAL items were administered as computerized adaptive tests, not all items belonging to one scale would be administered; instead, each subsequent item would be selected dynamically based on the respondent's previous answers.
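The item-selection step of a computerized adaptive test can be illustrated with a minimal sketch. It is not the HEAL algorithm: HEAL items are rated on a five-point scale, so an operational item bank would need a polytomous model, whereas the dichotomous two-parameter logistic (2PL) model used here is a simplifying assumption. The item parameters and the function names are hypothetical, and the re-estimation of the respondent's trait level after each answer is left out.

```python
import math

def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta, items, administered):
    """Index of the most informative item that has not been administered yet."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))

# Hypothetical (discrimination a, location b) parameters for a three-item bank.
bank = [(1.0, 0.0), (2.0, 0.0), (1.5, 2.0)]

# For a respondent currently estimated at theta = 0, pick two items in turn.
first = next_item(0.0, bank, administered=set())
second = next_item(0.0, bank, administered={first})
```

In a full adaptive test the trait estimate would be updated after every response, so that the selected items track the respondent's level and fewer items are needed per scale.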
Short forms of the HEAL (HEAL-SF) have been proposed, with seven items for patient-provider connection, and six items each for healthcare environment, positive outlook, treatment expectancy, spirituality, and attitude toward CAM (30). Clinical experts selected items for the short forms that had excellent psychometric properties and that were considered to represent the clinical range of each scale of items. The HEAL-SF demonstrated excellent internal consistency, which ranged between 0.92 and 0.97.
For clinical practice, the HEAL-SF scales in particular may be applied as a parsimonious assessment tool for complementing the treatment process. Certainly, the use of HEAL items is not meant to replace the necessary exchange between a healthcare provider and the patient regarding the patient's idiosyncratic perceptions of and attitudes towards the treatment. Rather, HEAL item responses provide a formalized assessment of a given patient's attitudes towards a number of non-specific treatment aspects, which may result in shared reflections about the treatment implementation, and may indicate necessary adaptations of the treatment in order to meet the patient's needs.
So far, no comparable item banks have been available in German that assess patients' perceptions of and attitudes towards non-specific treatment components that contribute to placebo effects. Therefore, we translated the English version of the HEAL item banks into a German version of HEAL (i.e., HEAL-D). The aim of this study was to investigate the psychometric properties of HEAL-D, with a specific focus on the short versions (HEAL-D-SF), as these have the greatest potential for use in clinical practice as well as in research.

Translation
We translated the HEAL item banks by means of a translation-back-translation procedure in line with the guidelines proposed by Beaton and colleagues (33). First, the original 168 HEAL items were translated into German independently by two translators (MG and a student research assistant) without adding words or introducing new expressions, and a team of the two independent translators and two supervisors (HG and CL) agreed on one German version of the HEAL items. Second, this version was translated back into English by two independent translators (DS and a research assistant), and again a team including the two independent translators and two supervisors (HG and CL) compared the English back-translations with the original HEAL items. If both back-translated versions indicated meaningful deviations from the original HEAL items, adjustments in the German wording were applied until a consensus was reached within the team of translators and supervisors.

Sample
We tested the German version of the HEAL items (HEAL-D) in a sample of 165 subjects who were recruited via an internet survey service of the University of Basel (baps.sona-systems.com). Subjects who had received healthcare treatment within the past year, were over 18 years of age, were fluent in reading and speaking German, and were not under the acute influence of psychoactive drugs were invited to participate in the online survey.
The local ethics committee (Ethikkommission Nordwest- und Zentralschweiz, Switzerland) approved the design and informed consent of the study. The database project and the server were coordinated and located at the Division of Clinical Psychology and Psychotherapy of the Faculty of Psychology at the University of Basel, Switzerland.

Demographic Variables
Demographic variables such as age, gender, mother tongue, and education were initially assessed.

Health-Related Questions
Our sample consisted of subjects who had received at least one healthcare treatment within the past year. We assessed health-related characteristics of the sample, such as information regarding the main diagnosis, the corresponding treatment, the practitioner providing the treatment, as well as the place where the treatment was delivered. We asked our participants to refer to the same treatment context in the first and second assessment.

Healing Encounters and Attitudes Lists-German Version (HEAL-D)
The HEAL item banks consist of 168 items reflecting six scales: patient-provider connection (PPC; 57 items), healthcare environment (HCE; 25 items), positive outlook (PO; 27 items), treatment expectancy (TE; 27 items), spirituality (SP; 26 items), and attitude toward CAM (CAM; 6 items). We used the translated parallel German version (HEAL-D) of the 168 HEAL items. Additionally, we used the German version of the HEAL-SF (30), with seven items for PPC, and six items each for HCE, PO, TE, SP, and CAM. The original HEAL-SF scales demonstrated excellent internal consistencies, which ranged between 0.92 and 0.97.

Balanced Inventory of Desirable Responding (BIDR)
The short form of the BIDR (34, 35) contains 20 items, 10 of which capture self-deception (BIDR-SD) and 10 of which tap impression management (BIDR-IM). Internal consistencies of the German version of the two subscales ranged between 0.61 and 0.69 across three studies (34).

Procedure
Recruitment of participants took place online between July and December 2018. The online survey was advertised on markt.unibas.ch, studienteilnahme.ch, a faculty-internal student platform, and in various pharmacies in Basel, and was open to the public. Students received course credit for their participation.
After giving informed consent, participants were asked to generate a personalized token and were invited to participate in a secure online survey that included demographic and health-related questions as well as standardized questionnaires (including the HEAL-D items, for details see section Measures). The items of the standardized questionnaires were presented in random order, in order to prevent carry-over effects when answering a relatively large number of items which all belong to one scale (as is the case in the long version of HEAL-D). Participants had to indicate their preference on a 5-point response scale with 0 = not at all, 1 = a little bit, 2 = somewhat, 3 = quite a bit, and 4 = very much. The online survey was created and conducted in LimeSurvey (36). For the purpose of assessing the retest reliability of the HEAL-D items, participants were invited to complete the survey twice, whereby the median time interval between the first and the second assessment was 31 days (range 20-56). Since participants' answers were anonymized, the individual tokens allowed us to match the first and second assessments. Participants had to provide their email addresses in the first assessment, so that we were able to contact them 4 weeks later for the second assessment. Afterwards, email addresses were deleted so that the anonymity of the data was guaranteed.

Statistical Analyses
The major goal of our study was the development of HEAL-D-SF, a parallel version of HEAL-SF in German. Initially, we excluded those cases from our sample that did not complete at least one entire scale, as well as cases that did not report a current healthcare provider. If participants reported diagnoses and healthcare providers in the second assessment that differed from the first assessment, the second assessment was not considered for retest reliability assessments. Then we checked for floor and ceiling effects as well as for the presence of central tendency bias, and excluded respective cases.
In the remaining sample of 165 participants who completed the first assessment, all individual item responses were analyzed with respect to their psychometric properties according to the principles of classical test theory. We analyzed the item difficulties and skewness across all 168 items. In addition, we checked for items that showed high correlations with social desirability, in order to identify inadequate items (i.e., items with restricted validity that reflect a high tendency towards socially desirable responses). Then, we selected the respective German items that constitute the original HEAL-SF. Based on these short forms of HEAL-D, we calculated the internal consistency (Cronbach's α) and the discrimination (corrected item-total correlation) per scale, and the skewness for each of the 6 HEAL scales, as well as the correlation of the scales with social desirability. We assessed the comparability between the German short and long versions by correlating the scale means of both versions. Finally, we tested the retest reliability by correlating the item means, as well as the scale means, between the first and second assessment using the data from 115 participants who completed both assessments.
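The two classical test theory statistics named above can be written out in a few lines. The following is a minimal sketch, assuming a complete respondents-by-items score matrix for one scale; the function names and the toy responses are hypothetical and are not taken from the study data, and the study's own analyses were run in R.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the sum score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

def corrected_item_total(scores):
    """Correlation of each item with the sum of the remaining items."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    return np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])

# Hypothetical responses of five people to a three-item scale (coded 0-4).
toy = [[4, 3, 4],
       [3, 3, 3],
       [2, 1, 2],
       [1, 1, 0],
       [0, 1, 1]]

alpha = cronbach_alpha(toy)
discrimination = corrected_item_total(toy)
```

An item with a negative corrected item-total correlation runs against the rest of its scale, which is the criterion for flagging poorly discriminating items.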
Next, confirmatory factor analyses (CFA) were carried out with the HEAL-D-SF in the sample of 165 participants who completed the first assessment, using the R package "lavaan" (37). Maximum likelihood estimation was used, with full information maximum likelihood for the missing data. Latent factors were standardized, allowing free estimation of all factor loadings. Following recommendations of Kline (38), Hu and Bentler (39), and McDonald (40), four fit indices were used to examine the data-model fit of the CFA: (a) the chi-square test statistic, (b) the root-mean-square error of approximation (RMSEA), (c) the standardized root-mean-square residual (SRMR), and (d) the comparative fit index (CFI). As the chi-square test statistic is known to be influenced by sample size, model fit was assessed by determining whether the observed chi-square value divided by df (χ²/df) was smaller than three (41). Regarding RMSEA, a cutoff value of 0.06 or lower was required for a relatively good fit (39), whereas values between 0.061 and 0.08 indicate a reasonable model fit (42). For the SRMR, Hu and Bentler (39) recommended a value close to 0.08 or lower. Finally, the CFI has a cutoff value close to 0.95 (39). Regarding differences between the models of invariance, changes in CFI of 0.01 or less reveal that the invariance hypothesis should not be rejected (43). Given that the interpretation of model fit in CFA is not without some degree of controversy, all these indices of fit were used, and evaluation was based on convergence among findings (39,44).
Modification indices indicated how the model fit would have changed if we had added new parameters to the model. However, since the CFA model was not exploratory, we decided to only specify a particular modification of the model if this was theoretically justifiable (45).
All analyses were conducted using the open-source software environment R (version 3.3.1; 46). We assumed statistical significance if the 2-sided p was smaller than 0.05.

Socio-Demographic and Clinical Sample Characteristics
Two hundred forty-four participants provided informed consent and started the online survey. Of those, 59 had to be excluded because they submitted an empty survey or did not complete at least one of the HEAL-D scales. In 32 cases we had to omit the second assessment because it provided insufficient data for the retest reliability calculations, and in 10 cases we did not use the second assessment because the healthcare provider differed between the first and second assessment. No case had to be excluded because of floor or ceiling effects or central tendency bias. The final sample, which completed the first assessment and was used for most analyses, consisted of 165 participants (86.7% female). The median age was 22 years (ranging from 19 to 48 years). Ninety-eight percent of participants had at least a high school degree. The included participants reported a variety of reasons for seeking treatment. The most prevalent health complaints in our sample were affective, emotional, or behavioral problems (including depression, posttraumatic stress disorder, anxiety disorders, bipolar disorder, attention deficit and hyperactivity disorder, and anorexia; mentioned by 33 participants), followed by pain (mentioned by 31 participants). Ten participants referred to check-ups (e.g., yearly check-up at the dentist). Two authors independently classified the mentioned health issues as chronic, acute, or unclear. In the chronic category, chronic headaches, migraines, anxiety disorders, depression, allergies, and asthma were mentioned most often. Less frequently mentioned were chronic infections, irritable bowel syndrome, neurodermatitis, and chronic orthopedic dysfunctions including scoliosis and instability of joints. We rated health issues as chronic in 85 cases (52%). In 32 cases (19%) we rated the mentioned problems as acute. In this category most participants referred to accidents, surgeries, or check-ups; dental issues were also rated as acute. In the unclear category (48 cases; 29%) we included pain-related issues (e.g., headaches and back pain that were not described as chronic), sleep problems, premenstrual and menstrual complaints, deficiency symptoms, and problems with the digestive system that were neither explicitly described as a particular syndrome nor as chronic. Table 1 summarizes the main characteristics of the study sample.

Item Characteristics of HEAL-D-SF
The items for the short forms were selected in parallel to the original HEAL-SF. Table 2 displays the item characteristics of the HEAL-D-SF. Table 3 shows the relevant psychometric properties of the applied scales. The HEAL-D-SF scales showed acceptable to excellent internal consistencies between 0.74 and 0.93. The retest reliability ranged between 0.71 and 0.96. Five of the scales were significantly skewed (all p < 0.02).

Characteristics of the HEAL-D-SF Scales and the BIDR Subscales
The BIDR-SD showed an unacceptably low internal consistency (0.31), and the BIDR-IM showed a questionable internal consistency (0.61). As we found three items with negative discrimination among the BIDR items, we deleted those items and repeated the analyses using the adapted BIDR subscales. In the adapted version the BIDR subscales' internal consistency improved slightly, with Cronbach's α = 0.54 for BIDR-SD and Cronbach's α = 0.65 for BIDR-IM. The retest reliability of the adapted BIDR scales was r = 0.67 for SD and r = 0.82 for IM, and both subscales were significantly skewed (p = 0.01 and p = 0.009, respectively). Due to the poor reliability of the BIDR-SD subscale (even after adaptation), we did not use this scale for further correlation analyses, and we used the adapted version of BIDR-IM for the following correlation analyses.

Correlation Analyses
Four of the HEAL-D-SF scales showed significant correlations with BIDR-IM. The correlations between the short and long versions of HEAL-D were moderate to high ranging from r = 0.66 (positive outlook) to r = 0.98 (spirituality), indicating that the two versions are highly consistent. Table 3 shows the respective correlation coefficients.

Testing the Factor Structure of the HEAL-D-SF Scales
In our CFA, the standardized factor loadings of most items were significant and most were larger than 0.4, except for the loadings of five items (see Table 2 for details). Nevertheless, the initial model fit of the German version of the HEAL-SF was not satisfactory [χ²: 2237.04; df: 614; p < 0.001; RMSEA: 0.13 with 90% CI (0.12, 0.13); SRMR: 0.18; and CFI: 0.68] (Table 4). Modification indices indicated that specifying covariances between the error terms of one pair of items on the HCE factor, two pairs of items on the PO factor, and one pair of items on the CAM factor would significantly improve model fit (see Table 4 for details). Given that the items of each pair had related content and belonged to the same factor, it was judged appropriate to adjust the model such that the error terms of these items were allowed to covary (see footnote 1). All indicators of model fit (Table 4) suggested that the adjusted model had a slightly better, but still unacceptable, fit with the data.
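The cutoffs described in the statistical analyses section can be applied to the fit indices of the initial model as a quick check. This is a minimal sketch with two simplifying assumptions: the "close to" recommendations for SRMR and CFI are treated as hard thresholds, and the residual index reported for the initial model is taken to be the SRMR named in the methods. The function name and dictionary keys are hypothetical.

```python
def evaluate_fit(chi2, df, rmsea, srmr, cfi):
    """Compare CFA fit indices with the cutoffs named in the methods."""
    ratio = chi2 / df
    return {
        "chi2_df_ratio": round(ratio, 2),
        "chi2_df_ok": ratio < 3.0,                    # chi-square / df below three
        "rmsea_good": rmsea <= 0.06,                  # relatively good fit
        "rmsea_at_least_reasonable": rmsea <= 0.08,   # good or reasonable fit
        "srmr_ok": srmr <= 0.08,                      # assumption: hard threshold
        "cfi_ok": cfi >= 0.95,                        # assumption: hard threshold
    }

# Fit indices of the initial six-factor model as reported in the text.
initial_model = evaluate_fit(chi2=2237.04, df=614, rmsea=0.13, srmr=0.18, cfi=0.68)
```

For the initial model the chi-square to df ratio is about 3.64 and none of the four criteria is met, which matches the conclusion that the fit was not satisfactory.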

Main Findings
We set out to evaluate a parallel version of the HEAL-SF in German. The HEAL items assess patients' attitudes towards and perceptions of the so-called non-specific treatment components that have been shown to contribute to the effectiveness of inert treatments (e.g., sham interventions or placebos) but also to be responsible for a considerable amount of the effectiveness of pharmacological and psychotherapeutic treatments. The HEAL items have been developed applying rigorous methodology.
Footnote 1: Modification indices also suggested that specifying a covariance between the error terms of the items "My healthcare provider pays attention to my individual needs" and "The staff was helpful", as well as of the items "This treatment is right for me" and "It is important to be open to CAM", would improve model fit. However, as each item pair was from separate scales and the item content was judged as non-similar, we felt it was not theoretically justifiable to specify these particular modifications of the model.
In the present study, the German HEAL items were used for the first time in an online survey in Switzerland. The six scales of HEAL-D-SF have demonstrated acceptable to excellent internal consistency and retest reliability, which indicates that the HEAL-D-SF scales are reliably applicable instruments. Most of the scales were skewed in our sample, with most participants indicating high endorsement, except for the scale CAM. Given the well-organized and high-quality healthcare system in Switzerland, the skewness towards positive responses in the scales PPC, HCE, TE, and PO is no surprise. The scale SP was skewed towards negative responses, which may be explained by the low relevance of spirituality in the selective sample of our study.
Using CFA, the six-factor structure of HEAL and HEAL-SF reported by Greco and colleagues (30) was partly confirmed using HEAL-D-SF: while factor loadings indicate a good fit of the items with the latent factors (i.e., scales), the overall model fit of the CFA was moderate to low. However, the model fit indices have been shown to largely depend on the sample size, which was comparatively small in our study. By adjusting the original model following the highest modification index, which allows for covariation of error terms of several items, the model fit for the assessed fit indices slightly improved. Four items showed very low factor loadings as well as a low discrimination (HCE: "The waiting area was comfortable."; PO: "I feel I can cope with my problems." "I am satisfied with my life."; CAM: "It is important to be open to CAM."). If confirmed in future studies, these findings might indicate that the respective items represent different latent constructs compared with the other items of the respective scales.
Due to the poor psychometric quality of the BIDR scales, no conclusions are possible based on the significant correlations between the HEAL-D items and social desirability. In future studies the HEAL-D items need to be validated with additional reliable instruments.

Relation to Relevant Previous Conceptual and Theoretical Work
HEAL and HEAL-SF have been constructed as a set of individual scales, which represent different aspects of treatments and of the corresponding treatment context. The development of HEAL included a comprehensive overview of existing scales, and of expert and patient opinions. Although the authors of the original HEAL item banks did not explicitly relate the HEAL items to theoretical frameworks of non-specific factors, when relating the items to a prominent model of context factors proposed by Frank and Frank (47), the HEAL scales can be considered as operationalizations of the proposed factors: First, the scale HCE can be seen as including operational definitions of the professional healthcare environment. Second, the scale PPC can be seen as an operationalization of the healing relationship. Third, the scales TE, PO, SP, and CAM can be seen as contributing to ensuring that the advised and prescribed treatment (i.e., the ritual) and the rationale for this treatment are in line with patients' expectations and attitudes and that they are thus acceptable for the patient, as described by Budge and Wampold (48). Nevertheless, given the extreme variety of potentially relevant non-specific treatment aspects, the defined scales can only cover a part of all potentially relevant aspects, and additional operationalizations of the theoretical contextual factors are possible. In the future, depending on the context in which the HEAL items are to be administered, more scales tapping additional non-specific treatment components might be considered and added to the HEAL item lists. For instance, items focusing on the provider's empathy might be added in future studies, as empathy has been demonstrated to be associated with treatment effects across different kinds of treatments, and is not explicitly addressed in the current HEAL item lists.
The possibility of assessing patients' idiosyncratic perceptions of and attitudes towards treatment aspects besides the actively prescribed treatment components can be seen as a further step towards overcoming the invalid distinction between non-specific and specific treatment components and towards defining non-specific aspects of treatments as specific, as described for instance by Kaptchuk (49). The idea of "making the non-specifics specific" is not new: As early as 1973, Jefferson M. Fish proposed that therapeutic processes have significant parallels to those taking place in faith-healing and placebo mechanisms in general (50). Along similar lines, Frank characterized healing as a social influence process (47), and emphasized the relevance of the non-specific treatment components by presenting a contextual treatment model. More recently, Weinberger argued against using the term non-specific in the context of psychotherapeutic treatments: "I would prefer to say that some important factors may have not been operationalized well enough to be studied empirically; they have not yet been specified. Thus, they are non-specified, not non-specific. Contrary to the views of those questioning their scientific bona fides …, so-called non-specific effects are not ontologically non-specific. They are capable of being empirically specified." (17). The outlined views on the relevance of "making the non-specifics specific" are also reflected by a recent feature in The British Medical Journal entitled "Social prescribing: coffee mornings, singing groups, and dance lessons on the NHS" (51), which outlines the idea to formalize physicians' referrals of patients to community activities, and highlights the relevance of the entire healing context for clinical practice.
Note to Table 4: *Model included specified covariance between error terms for the item "The staff was friendly" and the item "The staff was helpful" (both factor HCE); the item "I am satisfied with my life" and the item "I feel I can cope with my problems" (both factor PO); the item "I feel positive about my life" and the item "I am satisfied with my life" (both factor PO); as well as the item "It is important to be open to CAM" and the item "I prefer natural remedies" (both factor CAM). CFI, comparative fit index; df, degrees of freedom; RMSEA, root-mean-square error of approximation; SRMR, standardized root-mean-square residual.

Implications for Clinical Practice
In clinical practice, placebo effects, and thus non-specific treatment aspects, moderate and mediate treatment outcomes significantly. However, if healthcare providers are not particularly sensitive towards the relevance of the non-specific treatment aspects, issues associated with these treatment aspects are likely to remain undetected. If a given patient had for instance a low expectancy regarding the efficacy of a necessary standard treatment, the patient's negative perceptions might have negative consequences with respect to the administration of or the adherence to the prescribed treatment, which might lead to a treatment failure. The low expectancy, however, might not appear to be relevant to the patient (and neither to the provider), and thus, might remain uncovered. In such a case, the administration of the HEAL items could help detecting the issue at hand. Then, the treatment provider could first take action in improving the patients' outcome expectancy, before initiating the actual standard procedure. As many of the non-specific treatment aspects, that impact treatment outcomes, are largely neglected in the context of standard treatment administration, the implementation of HEAL items in clinical practice might be seen as facilitating the detection of problematic aspects of a treatment, that are routed in the non-specific aspects of treatments. A deeper knowledge of patients' idiosyncratic perceptions of and attitudes towards these would thus allow tailoring interventions in line with individual patients' needs by facilitating an ethical and research-based conversation regarding what works in an intervention. This may in turn contribute to positive treatment expectations by providing a plausible treatment rationale.
It is important to note that we see the HEAL-D-SF as a flexible tool: Depending on the context of implementation, some scales may be of greater importance than others. For instance, the spirituality (SP) scale might help some patients to understand their symptoms within the context of their culture and religious beliefs. Concordantly, a recent meta-analysis revealed that treatments which are tailored to patients' religious or spiritual beliefs are significantly more effective than no treatment or non-religious/spiritual psychotherapies in terms of psychological functioning (29). Along similar lines, a feature recently published in The British Medical Journal stated that there is a "high demand among the public for someone to talk to about spiritual matters in times of crisis" (52). The HEAL-SF spirituality scale can help to detect such needs in individual patients, and in turn the treatment provider and the patient can collaboratively discuss and decide how the treatment can be adapted or complemented in order to satisfy the patient's need. Nevertheless, and to come back to the argument that a treatment should be credible and plausible, spirituality may not be relevant for every patient. We would therefore advise judging from patient to patient (or from context to context, respectively) whether the assessment of spirituality seems appropriate. The same holds true for the other scales of the HEAL-D-SF.

Implications for Research
The development of HEAL-D-SF as a parallel version of the original HEAL-SF is of importance for healthcare research: The HEAL items offer the possibility to investigate the impact of non-specific treatment components across diverse interventions as well as across treatment contexts. The theory behind non-specific treatment components claims that these components have comparable effects across various interventions, treatment settings, and contexts. It would be interesting to test this assumption, e.g., to evaluate whether there is one factor which is the most reliable predictor of treatment success across cultures, populations, and treatment approaches, using the parallel English and German versions of HEAL-SF. Importantly, these findings would be based on the patient's own idiosyncratic views and assumptions, rather than relying on theoretical models or assumptions. Since HEAL-D-SF was developed in parallel to the existing English HEAL-SF, cross-cultural studies become possible in the future. Thus, a specific focus of future research projects can be the detection of similarities as well as dissimilarities in patients' perception of the impact of non-specific treatment components on treatment outcomes, depending on patients' cultural background.

Limitations
Some limitations of the present study should be considered. First, and most important, the presented data are based on a comparatively small sample, which may have negatively affected the overall model fit of the CFA, since fit indicators depend strongly on the sample size. Hence, further studies with larger samples are necessary in order to conclusively assess the factor structure of the German HEAL items. Second, our sample was heterogeneous with respect to the reported health conditions. While about half of the participants reported rather chronic conditions, the other half reported rather acute conditions. It is possible that the impact of the non-specific aspects of treatments on health outcomes varies depending on the chronicity of health conditions, and is, for instance, mediated by the intensity, frequency, and duration of the treatment. Along similar lines, non-specific aspects may have a greater impact in an ongoing treatment for a clinical condition than in a medical check-up. On the other hand, however, the HEAL item banks are considered to be condition-insensitive. Thus, the diversity of our sample with respect to the reported health complaints could be considered a strength of our study. Nevertheless, future studies should consider and test these possible moderators or mediators. Third, study participants were rather homogeneous with respect to educational level and age. It is possible that our findings will not generalize to populations with other socio-demographic characteristics. Therefore, future studies should include a broader range of study participants.
Fourth, validation studies are necessary to test the convergent and discriminant validity as well as the prognostic value of the HEAL-D-SF items. These could include, for instance, comparisons with existing scales that assess non-specific factors more extensively using longer item lists, as well as prospective studies testing the prognostic value of HEAL items in predicting, for instance, health improvements or well-being. Fifth, the presented study relies on outpatient data collected via an online survey. For future studies, it would be interesting to apply HEAL-D-SF in a clinical context.

Conclusion
To conclude, we presented a German translation and a first evaluation of the HEAL items that assess patients' attitudes towards so-called non-specific treatment components. The German version (HEAL-D-SF) proved to be a reliable set of measures in an initial study. With six scales and six to seven items per scale, the HEAL-D-SF are a parsimonious set of measures to assess the relevance of diverse non-specific treatment aspects. Especially when implemented in clinical practice, the brevity of HEAL-SF and HEAL-D-SF constitutes a particular strength. However, before a possible application of HEAL-D items in clinical practice, additional validation studies are needed.

DATA AVAILABILITY STATEMENT
The datasets used for the analyses in this study are available from the corresponding author upon reasonable request.

ETHICS STATEMENT
This study was carried out in accordance with the recommendations of Ethikkommission Nordwest- und Zentralschweiz (Project-ID 2017-00870), with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethikkommission Nordwest- und Zentralschweiz.