# PLACEBO AND NOCEBO EFFECTS IN PSYCHIATRY AND BEYOND

EDITED BY: Paul Enck, Katja Weimer, Luana Colloca and Seetal Dodd

PUBLISHED IN: Frontiers in Psychiatry

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88966-048-3 DOI 10.3389/978-2-88966-048-3


# PLACEBO AND NOCEBO EFFECTS IN PSYCHIATRY AND BEYOND

Topic Editors: Paul Enck, University of Tübingen, Germany; Katja Weimer, University of Ulm, Germany; Luana Colloca, University of Maryland, Baltimore, United States; Seetal Dodd, Barwon Health, Australia

Citation: Enck, P., Weimer, K., Colloca, L., Dodd, S., eds. (2020). Placebo and Nocebo Effects in Psychiatry and Beyond. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88966-048-3

# Table of Contents


Johannes A. C. Laferton, Sagar Vijapura, Lee Baer, Alisabet J. Clain, Abigail Cooper, George Papakostas, Lawrence H. Price, Linda L. Carpenter, Audrey R. Tyrka, Maurizio Fava and David Mischoulon

Rebecca K. Webster and G. James Rubin

*Re-evaluation of Significance and the Implications of Placebo Effect in Antidepressant Therapy*

Marko Curkovic, Andro Kosec and Aleksandar Savic

*Can a Brief Relaxation Exercise Modulate Placebo or Nocebo Effects in a Visceral Pain Model?*

Sigrid Elsenbruch, Till Roderigo, Paul Enck and Sven Benson

Piotr Gruszka, Christoph Burger and Mark P. Jensen

*Placebo Effect in the Treatment of Depression and Anxiety*

Irving Kirsch

*Studying a Possible Placebo Effect of an Imaginary Low-Calorie Diet*

Valentin Stefanov Panayotov

*Placebo and Nocebo Effects in Patients With Takotsubo Cardiomyopathy and Heart-Healthy Controls*

Elisabeth Olliges, Simon Schneider, Georg Schmidt, Daniel Sinnecker, Alexander Müller, Christof Burgdorf, Siegmund Braun, Stefan Holdenrieder, Hansjörg Ebell, Karl-Heinz Ladwig, Karin Meissner and Joram Ronel

Jens Hamberger, Karin Meissner, Thilo Hinterberger, Thomas Loew and Katja Weimer

Vanessa Brown and Marta Peciña

*Placebo Manipulations Reverse Pain Potentiation by Unpleasant Affective Stimuli*

Philipp Reicherts, Paul Pauli, Camilla Mösler and Matthias J. Wieser

Efrat Czerniak, Tim F. Oberlander, Katja Weimer, Joe Kossowsky and Paul Enck

*Effects of Expectancy on Cognitive Performance, Mood, and Psychophysiology in Healthy Adolescents and Their Parents in an Experimental Study*

Daniel Watolla, Nazar Mazurak, Sascha Gruss, Marco D. Gulewitsch, Juliane Schwille-Kiuntke, Helene Sauer, Paul Enck and Katja Weimer

# Editorial: Placebo and Nocebo Effects in Psychiatry and Beyond

#### Katja Weimer <sup>1</sup>\*, Paul Enck <sup>2</sup>, Seetal Dodd <sup>3,4,5,6</sup> and Luana Colloca <sup>7,8,9</sup>

<sup>1</sup> Department of Psychosomatic Medicine and Psychotherapy, Ulm University Medical Center, Ulm, Germany, <sup>2</sup> Department of Psychosomatic Medicine and Psychotherapy, University Hospital Tübingen, Tübingen, Germany, <sup>3</sup> The Institute for Mental and Physical Health and Clinical Translation, Deakin University, Geelong, VIC, Australia, <sup>4</sup> Centre for Youth Mental Health, University of Melbourne, Parkville, VIC, Australia, <sup>5</sup> Department of Psychiatry, University of Melbourne, Parkville, VIC, Australia, <sup>6</sup> University Hospital Geelong, Barwon Health, Geelong, VIC, Australia, <sup>7</sup> Department of Pain and Translational Symptom Science, School of Nursing, University of Maryland, Baltimore, MD, United States, <sup>8</sup> Departments of Anesthesiology and Psychiatry, School of Medicine, University of Maryland, Baltimore, MD, United States, <sup>9</sup> Center to Advance Chronic Pain Research, University of Maryland, Baltimore, MD, United States

Keywords: placebo effect, nocebo effect, learning, expectancy, conditioning, psychotherapy, psychiatry

Editorial on the Research Topic *Placebo and Nocebo Effects in Psychiatry and Beyond*

Edited and reviewed by: Stefan Borgwardt, University of Basel, Switzerland

\*Correspondence: Katja Weimer, katja.weimer@uni-ulm.de

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 18 July 2020; Accepted: 24 July 2020; Published: 07 August 2020

#### Citation:

Weimer K, Enck P, Dodd S and Colloca L (2020) Editorial: Placebo and Nocebo Effects in Psychiatry and Beyond. Front. Psychiatry 11:801. doi: 10.3389/fpsyt.2020.00801

#### INTRODUCTION

The placebo effect is part of every medical intervention and plays a crucial role in randomized placebo-controlled trials (RCTs). It is beneficial to maximize the placebo effect when treating patients, but it should be minimized in RCTs to estimate the true drug effect (1). Studies have shown that the placebo effect is formed by learning mechanisms (2), and an expert consensus has suggested that the beneficial effects of placebo can be harnessed for clinical use to improve patient outcomes (3). In contrast to the placebo effect, adverse events can occur and symptoms can get worse through a negative placebo effect, the so-called nocebo effect (4). Yet many questions remain unanswered before placebo mechanisms can be exploited in clinical practice. For this Research Topic issue, we called for the latest research articles in the field of placebo and nocebo research. The issue comprises 38 articles, ranging from "Hypothesis and Theory" articles to "Reviews" and "Original Research" articles.

After giving an overview of the underlying mechanisms of the placebo effect, such as conditioning, expectations and influencing factors, Friesen summarizes ethical views regarding the use of the placebo effect. Until recently, it has been assumed that placebos take effect only when patients are deceived, but she encourages considering placebos as a "source of agency", without deception and in agreement with patients' autonomy. Babel complements the current view of classical conditioning in the placebo effect. In fact, many studies use a combination of classical conditioning and verbal suggestions to induce placebo and nocebo effects. Drawing on recent studies using hidden and subliminal conditioning procedures, Babel argues that classical conditioning is a distinct mechanism that works without conscious expectations. However, there are only a few such studies, and they are limited to the area of pain; further studies are needed.

#### THE PLACEBO EFFECT IN PSYCHOTHERAPY

Particularly in psychiatry, patients are not only treated with pharmacotherapy but often with different forms of psychotherapy. The role and mechanisms of the placebo effect in psychotherapy have been repeatedly discussed, and Enck and Zipfel point to the challenges of disentangling the specific effects of the different psychotherapeutic approaches from unspecific effects and the placebo effect. This is even more challenging when considering that many psychotherapeutic approaches are equally effective and that there is still a debate within psychotherapy research about the specific, common and unspecific factors (also known as the "Dodo bird verdict"). Enck and Zipfel encourage psychotherapy researchers as well as therapists to understand that the placebo effect exists, and they provide a framework that acknowledges context, common, and specific factors for further research. With her Mini Review, Blease attempts to provide greater clarity in the definition of the placebo effect in psychotherapy and gives insights into controversial views such as "psychotherapy is a placebo". She argues that the problem could be solved if placebos and the placebo effect were clearly defined in the same way as they are defined in clinical trials: as control interventions and the effect they induce. At first glance, it seems contradictory that Blease recommends using a clear definition of the placebo effect, whereas Jonas states that "the placebo response is a myth" and does not exist. He argues that it is contradictory for an inert treatment to produce a response and advocates a broader understanding of this response, which should be called "meaning response" or "healing response". However, these two views are compatible and in line with the definitions of "placebo effect" as the effect elicited by placebo mechanisms, and "placebo response" as all health changes after administration of an inert treatment, as stated by the expert consensus of placebo researchers published in 2018 (3).

### THE ROLE OF CONTEXT FACTORS IN PLACEBO AND NOCEBO EFFECTS

In psychotherapy research, context factors such as the patient-provider interaction are considered a common factor, although they are considered to be part of the placebo response in other treatments. In their systematic review, Daniali and Flaten found that aspects of a positive patient-provider interaction such as higher confidence in the provider, perceived higher competence and professionalism, and positive nonverbal behaviors were associated with lower pain reports and higher placebo effects in patients and participants. In contrast, negative nonverbal behaviors led to higher pain reports and nocebo effects. Howe et al. delve deeper into specific aspects of the patient-provider interaction and differentiate between competence and warmth. They provide a framework for researchers and practitioners about how patients perceive competence and develop the feeling that the physician "gets it", and how they perceive warmth when the physician "gets them". However, non-specific effects of treatments comprise many aspects, and Gerger et al. translated and validated the first German version of the Healing Encounters and Attitudes Lists (HEAL-D) and its short form (HEAL-D-SF). This set of questionnaires assesses patients' views on the patient-provider interaction, the healthcare environment, treatment expectations, positive outlook, spirituality, and attitudes towards complementary and alternative medicine. It may help to turn non-specific into specific effects, and therefore may be usable for research purposes and clinical practice.

To evaluate how, and how often, oncologists use expressions of empathy, van Vliet et al. assessed video-taped consultations between oncologists and patients with advanced breast cancer in an observational study. Overall, oncologists often provided information about expectancy and used several empathic behaviors such as understanding, respecting, supporting and exploring, whereas a lack of empathy was observed less often. Further studies should evaluate the effects of empathic expressions on treatment outcomes and (nocebo) side effects. Not only physicians but also patients are aware of the effect of unspecific factors on treatments: in their large online survey among Italian patients with musculoskeletal pain, Rossettini et al. found that patients believe that contextual factors such as an empathetic alliance, and verbal and non-verbal communication, are effective and work through mind-body connections. Furthermore, patients have positive attitudes towards the use of these factors in clinical practice if they are not used in a deceptive way.

One of the challenges in placebo research is to disentangle the placebo effect from other effects through elaborate study designs. To differentiate the placebo effect from the psychosocial context, Gruszka et al. as well as Curkovic et al. recommend outsourcing some parts of the psychosocial context via smartphone applications. Such an app could be used for standardized recruitment, randomization and the provision of treatment information to induce positive expectations. Furthermore, it could be used to assess expectations, symptom severity, or physiological data via smartphone sensors (e.g., heart rate) without personal interaction and in daily life. Additionally, Curkovic et al. suggest that studies should rigorously investigate and report aspects of research plans to better determine which aspects of an intervention, at which dose, relieve symptoms, and this could also be achieved through an app.

### THE PLACEBO EFFECT ON DEPRESSION, ANXIETY, PAIN, AND OTHER SYMPTOMS

Irving Kirsch published several studies and meta-analyses about the placebo response and placebo effect in treatments with antidepressants and questioned whether the placebo response and the drug effect in RCTs are additive (5, 6). In his recent article, Kirsch summarizes the results of these and other meta-analyses clearly demonstrating that "most (if not all) of the benefits of antidepressants in the treatment of depression and anxiety are due to the placebo response". However, RCTs cannot answer the question of how patients' symptoms evolve without any treatment or how they should be treated instead. Kirsch reports several alternative treatments such as psychotherapy, physical exercise, omega-3 supplements, and yoga that have been shown to be as effective as antidepressants but with fewer side effects, and in some cases with better long-term effects than antidepressants. To further evaluate how expectancy could influence outcomes in antidepressant trials, Laferton et al. performed a re-analysis of a double-blind RCT in major depression comparing escitalopram, S-adenosyl-L-methionine (SAMe) and placebo. Results show that the patients' perceived treatment assignment changed during the trial, was predicted by symptom improvements, and contributed more to treatment outcomes than actual treatment. Finally, there was no difference between groups.
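The additivity question mentioned above can be made concrete with a short formal sketch. The notation is introduced here purely for illustration and is not taken from the cited articles: $R_{\text{drug}}$ and $R_{\text{placebo}}$ denote the mean symptom improvement in the drug and placebo arms of an RCT, $P$ the placebo response, and $D$ the specific drug effect. Under the additivity assumption,

$$
R_{\text{placebo}} = P, \qquad R_{\text{drug}} = P + D \quad\Longrightarrow\quad D = R_{\text{drug}} - R_{\text{placebo}} .
$$

If additivity does not hold, an interaction term $I$ (again a hypothetical symbol) enters the drug arm, $R_{\text{drug}} = P + D + I$, and the difference between arms then estimates $D + I$ rather than $D$ alone.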

Patients do not only "feel better" through the placebo effect; several neuroimaging studies have demonstrated neurophysiological changes in the brain. Brown and Peciña underline these results and provide an overview of neuroimaging studies of the antidepressant placebo effect. They show that this effect is comparable to the placebo effect on pain. This finding implies common underlying mechanisms involving brain areas associated with cognitive control, the representation of expectations, and reward and emotional processes.

Still, pain is the best-investigated symptom in placebo research. Complementary to neuroimaging studies, Reicherts et al. present an electroencephalography (EEG) study combining the motivational priming hypothesis and the conditioning of placebo and nocebo effects. Participants who were told that unpleasant pictures decrease pain indeed reported less pain, and, correspondingly, somatosensory evoked potentials were decreased when they watched unpleasant pictures compared to neutral pictures. The authors conclude that the well-known modulation of pain by emotions is influenced by expectations.

The experimental pain study by Zhou et al. found complex interactions between different expectations, sex of participants and personal characteristics such as dispositional optimism and state anxiety in their effects on pain reports. After a conditioning procedure with electrical pain, women in the Low expectancy group reported decreased pain compared to the No or High expectancy groups, whereas men reported decreased pain in the High expectancy group in the test session. Whether optimism or state anxiety predicted placebo effects was dependent on the expectancy level, but independent of sex. To explore other predictors of placebo analgesia, Wang et al. used latent class analyses (LCA) to identify learning patterns during a conditioning procedure in an experimental pain study. LCA revealed that greater or increased differences between high and low pain ratings in combination with red and green light signaling stimuli during conditioning were associated with greater placebo analgesia in the subsequent testing phase. Furthermore, expectations of pain decrease were a mediator for placebo analgesia, but higher age and higher warmth-detection thresholds were associated with less placebo analgesia.

A large proportion of our knowledge about the placebo effect and its underlying mechanisms stems from experimental studies with pain, but there is little knowledge about whether the same mechanisms apply to other symptoms. To address this question, Wolters et al. reviewed the literature about placebo and nocebo effects in dyspnea, fatigue, nausea, and itch. They confirm that in general the same mechanisms as in pain are at work in these symptoms, such as the combination of verbal suggestions and conditioning, and that subjective symptoms are more prone to elicit a placebo effect than are physiological measures. However, there are also some differences, as the influence of individual characteristics varies between symptoms. Further evidence comes from an experimental study by Meeuwis et al., who investigated placebo and nocebo effects through verbal suggestions on itch. Participants received the respective information either in an open-label condition knowing that the applied tonic was a placebo (a pink-colored skin disinfectant), or in a closed-label condition in which they were deceptively told that the tonic was effective. Whereas suggestions did not affect itch reports during histamine iontophoresis, participants in both positive suggestion groups reported lower itch and lower skin temperature increase after the iontophoresis compared to the negative suggestion groups. Interestingly, their open-label suggestion was as effective as the deceptive information about the effectiveness of the placebo, and they found a symptom-specific physiological reaction to itch.

Another underreported area is placebo and nocebo effects on cardiac symptoms and physiology. In an experimental study with patients with Takotsubo cardiomyopathy (a rare, reversible form of cardiomyopathy after stressful psychosocial life events) and heart-healthy controls, all participants received a saline infusion three times together with the information that it has no effect, a positive (placebo) or negative (nocebo) effect on cardiac functions, respectively. Olliges et al. report that before and during the nocebo condition subjective stress ratings, heart rate, and systolic blood pressure increased, whereas the latter also increased after placebo information. However, there were no differences between patients and controls.

### AREAS RELATED TO MENTAL DISORDERS

The placebo effect could be helpful not only for directly decreasing symptoms of a disorder but also for influencing functions related to mental disorders such as cognitive functioning or appetite regulation. Participants in the study of Fuhr and Werle were randomized to listen to a mental training or a philosophy lecture, both audio-taped for 20 min, and half of the participants in each group were told that they were listening to an effective tape and the other half that it was a control tape. All participants improved their cognitive performance as measured with a d2-test, but those participants who experienced a greater improvement rated the received treatment as effective irrespective of group assignment. This, at least, shows that healthy persons can rate their cognitive performance without being influenced by (bogus) verbal suggestions, and thus could be indicative of a healthy function. Winkler and Hermann chose a different study design: two groups received a nasal spray along with the suggestion of a cognitive improvement (placebo) or impairment (nocebo) effect, and one group served as a control (without nasal spray or suggestions). Similar to the study by Fuhr and Werle, verbal suggestions did not affect actual cognitive performance. However, participants in the placebo group rated their cognitive improvement better and felt less tired compared to the nocebo group. The authors conclude that these subjective effects may explain why so-called neuroenhancers are still popular among college students. For their study about placebo and nocebo effects of a sham transcranial magnetic stimulation (sTMS), Höfler et al. recruited women who turned out to be placebo or nocebo responders, respectively, in previous studies. According to their responsiveness, they received the information that the sTMS would increase (placebo) or decrease (nocebo) their left-sided visual attention in an eye-tracking experiment. As in the above-mentioned studies, the placebo instruction did not affect actual visual attention, but subjectively improved attention. In contrast, nocebo responders showed the opposite of the expected reaction.

In another eye-tracking study from the same work group, Potthoff et al. did not directly target visual attention; instead, a placebo pill that was claimed to reduce appetite was given to healthy, mostly normal-weight women, and their reactivity to food cues was registered. Participants reported decreased appetite, which was related to decreased visual attention for food, e.g., fixation and dwell time on high and low-caloric food images compared to non-food pictures. The experimental study by Hoffmann et al. confirms these results: healthy normal-weight participants reported decreased appetite after ingesting a placebo pill that was said to increase satiety compared to a control group. They additionally assessed an objective marker of hunger and found that the opposite information (that a placebo pill was claimed to enhance appetite) increased plasma ghrelin levels but did not affect appetite itself. In a third study of placebo effects on food consumption, Panayotov showed that the information about a calorie-reduced diet decreased body mass, body mass index (BMI), and fat tissue in overweight and obese participants of a weight loss program. Although participants did not strictly adhere to their diet programs and the sample size was small, this preliminary study shows that weight regulation could be directly addressed through manipulating expectations of patients.

#### NOCEBO EFFECTS

In conjunction with studies about the placebo effect, the nocebo effect has already been mentioned above. Previous studies about the "bad brother" of the placebo effect have shown that known placebo mechanisms such as conditioning, expectations, and social learning can also have negative outcomes. Faasse et al. define "nocebo effects as unpleasant or adverse outcomes triggered by the treatment context". The authors differentiate between primary nocebo effects and nocebo side effects, and the misattribution of regular symptoms to an (inert) treatment. Furthermore, they describe how experimental studies should be designed to investigate the nocebo effect appropriately. While Faasse et al. focus on studies with treatments involving drugs or medical devices, Locher et al. emphasize that the nocebo effect could also occur in psychotherapy. They provide two examples where a nocebo or nocebo-related effect could evolve: in patients with chronic primary pain or other symptoms without a clear physiological etiology, and in relation to trauma debriefing to prevent posttraumatic stress disorders (PTSD).

To prevent nocebo (side) effects it would be helpful if nocebo responders could be detected in advance. In a re-analysis of experimental endotoxemia studies, Benson and Elsenbruch investigated predictors of the nocebo effect. Nocebo responders, defined as participants in the placebo arms of RCTs who believed they were allocated to the verum arm, reported significantly more physical symptoms but did not differ from non-responders in psychological or physical parameters. Within nocebo responders, physical symptoms correlated with greater state anxiety, negative mood, catastrophizing and neuroticism. Their study demonstrates that it is difficult to predict who will be a nocebo responder, but that perceiving nocebo side effects could affect perceived treatment allocation, which is another reason why nocebo side effects should be reduced. Webster and Rubin provide a systematic review of RCTs investigating brief psychological interventions to reduce or avoid nocebo side effects in medical treatments. In the 27 studies found, omitting side effect information was most successful in reducing nocebo side effects, whereas other communication strategies such as priming, distraction, and altering the branding of drugs showed mixed effects. De-emphasizing side effects was not effective. Finally, they discuss that it could be challenging to balance the reduction of nocebo side effects with informed consent. Pan et al. investigated another strategy to reduce nocebo side effects in an experimental study: participants with weekly headaches received a placebo pill and were randomized to read a bogus medication leaflet only or to additionally read an explanation about the nocebo effect. Two minutes after pill intake, the group that had received the explanation about nocebo reported fewer nocebo symptoms than the other group. This effect was moderated by baseline symptoms, perceived sensitivity to medicine, and expectations. Furthermore, most participants evaluated the nocebo information as helpful.

### UNDERREPORTED RESEARCH FIELDS

Most of the articles in this Research Topic deal with the placebo effect and response after typical applications of treatments such as pills or ointments. However, disentangling the true treatment effect from the placebo response and placebo effect is also challenging in other forms of treatment, e.g., psychotherapy (see above). Chae et al. discuss two aspects in particular that could lead to a high placebo response in acupuncture: the fact that even sham acupuncture may elicit physiological responses, and the difficulty of effectively blinding provider and patient. They suggest more appropriate alternative control strategies in acupuncture treatment.

There is less research about the placebo effect in children (7), and this Research Topic comprises only two further articles about it: one involves an experimental design with healthy children, and one discusses the influence of the so-called placebo-by-proxy effect. The placebo-by-proxy effect was introduced by Grelotti and Kaptchuk (8) in 2011 and describes the effect where people in the social environment of a patient (parents, siblings, relatives, peers) feel better when the patient receives an effective treatment. Czerniak et al. complement this concept with the corresponding "nocebo-by-proxy" effect and discuss the impact of these two concepts, particularly on children's symptoms and treatments. Their review of the available literature opens an important research field. The influence of parents or other proxies on placebo and nocebo responses has rarely been studied. The experimental study by Watolla et al. investigated the effect of a suggested ginkgo patch on cognitive performance in children and one parent. They found only a weak overall placebo effect, and neither the cognitive performance nor the expectations of children and their parents were interrelated. This may imply that shared information and heritability have a low impact on the placebo effect. However, it should be taken into account that the participants were all healthy and without need for cognitive improvement. This finding is supported by the first study involving a classical twin design: Weimer et al. recruited healthy mono- and dizygotic twin pairs for an experimental study with a heat pain paradigm. After conditioning the effectiveness of an ointment, twins reported a significant placebo analgesic effect in the test condition. This effect was mainly related to the personal learning experience during the conditioning procedure, but not to the effect of their co-twin, suggesting that heritability and shared environment play a minor role. In contrast, initial studies show a genetic component in the placebo effect, but these results are still inconclusive (9), and twin studies should be combined with genetic analyses to further elucidate this area.

### MAXIMIZE OR OPTIMIZE TREATMENTS THROUGH PLACEBO MECHANISMS

Elsenbruch et al. build on initial evidence that psychophysiological responses to placebo interventions, such as an increase of parasympathetic activation, could play a role in the establishment of a placebo effect. In their study, a brief progressive muscle relaxation exercise but not a control task reduced heart rate and systolic blood pressure, and decreased pain perceptions in relaxed participants in a pain paradigm with rectal distensions.

Such experimental studies show promising ways to harness the placebo effect for patients' treatments in ethical and legal ways. Benefits for patients are clear, as they experience reductions in symptoms as well as side effects, but the placebo effect is rarely used systematically. Showing that harnessing the placebo effect is not only effective but also cost-efficient could improve its visibility and acceptability. A systematic review by Hamberger et al. investigated whether placebo interventions are also cost-efficient, but showed that there is a lack of health economic evaluations; the authors encourage placebo researchers to report the costs of placebo interventions.

### CONCLUSION: MORE QUESTIONS THAN ANSWERS?

In summary, the multifaceted articles in this Research Topic issue show that placebo and nocebo effects are complex phenomena. There is still a debate about the role of placebo and nocebo effects in psychotherapy research and their relation to common and context factors. In contrast, context factors such as the patient-provider interaction have already been acknowledged as part of the placebo effect in other treatments. Research about the placebo effect on depression, anxiety, and pain reveals a high placebo effect showing symptom improvement and neurophysiological changes in the brain. However, there is less research about other symptoms such as itch or heart-related diseases, among others. Recent studies aim to harness the placebo effect to improve functions that are related to mental disorders, such as cognitive functioning or appetite regulation, and this may be an interesting research area for further studies. There are several other underreported research fields, such as appropriate control conditions for treatments other than pills, placebo and nocebo effects in children, and the role of genetics and heritability. An increasing number of articles investigate the nocebo effect and nocebo-related adverse effects, their mechanisms, and strategies to avoid or reduce them. Finally, all research aims to improve treatments of patients, and recent studies show promising results by employing techniques that enhance the placebo effect or reduce the nocebo effect. However, more research is needed to transfer knowledge about placebo and nocebo effects into clinical practice to benefit patients in an ethical and broadly accepted manner (10).

### AUTHOR CONTRIBUTIONS

KW wrote the first draft of the manuscript. PE, SD, and LC provided critical revision of the manuscript and important intellectual contributions. All authors contributed to the article and approved the submitted version.

### REFERENCES


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Weimer, Enck, Dodd and Colloca. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# How Placebo Needles Differ From Placebo Pills?

Younbyoung Chae<sup>1</sup>\*, Ye-Seul Lee<sup>1,2</sup> and Paul Enck<sup>3</sup>

<sup>1</sup> Acupuncture and Meridian Science Research Center, College of Korean Medicine, Kyung Hee University, Seoul, South Korea, <sup>2</sup> Department of Anatomy and Meridians, College of Korean Medicine, Gachon University, Seongnam, South Korea, <sup>3</sup> Department of Internal Medicine, Psychosomatic Medicine and Psychotherapy, University of Tübingen, Tübingen, Germany

Because acupuncture treatment is defined by the process of needles penetrating the body, placebo needles were originally developed with non-penetrating mechanisms. However, whether placebo needles are valid controls in acupuncture research is the subject of an ongoing debate. The present review provides an overview of the characteristics of placebo needles and how they differ from placebo pills in two aspects: (1) physiological response and (2) blinding efficacy. We argue that placebo needles elicit physiological responses similar to real acupuncture and therefore provide similar clinical efficacy. We also demonstrate that this efficacy is further supported by ineffective blinding (even in acupuncture-naïve patients), which may lead to opposite guesses that further enhance efficacy as compared to no treatment, e.g., with waiting list controls. Additionally, the ways in which placebo needles can exhibit therapeutic effects relative to placebo pills include enhanced touch sensations, direct stimulation of the somatosensory system, and activation of multiple brain systems. We finally discuss alternative control strategies for the placebo effects in acupuncture therapy.

#### Edited by:

Michael Noll-Hussong, Universitätsklinikum des Saarlandes, Germany

#### Reviewed by:

Carmen Uhlmann, ZfP Südwürttemberg, Germany Mirta Fiorio, University of Verona, Italy Bryan Saunders, Universidade de São Paulo, Brazil

> \*Correspondence: Younbyoung Chae ybchae@khu.ac.kr

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 29 March 2018 Accepted: 17 May 2018 Published: 05 June 2018

#### Citation:

Chae Y, Lee Y-S and Enck P (2018) How Placebo Needles Differ From Placebo Pills? Front. Psychiatry 9:243. doi: 10.3389/fpsyt.2018.00243

Keywords: acupuncture, blinding, control, placebo, physiology

## INTRODUCTION

Acupuncture is a therapeutic intervention performed by "inserting one or more needles into specific sites on the body surface for therapeutic purposes" (1). Placebo needles were developed and validated to evaluate the efficacy of acupuncture treatment in randomized controlled clinical trials (RCTs) (2, 3). Due to the indistinguishably inert nature of placebo controls compared with active treatments, placebo-controlled studies enable determination of the therapeutic effects of target treatments from unspecific treatment effects, such as medical context and consequent expectation. Similarly, placebo needles must be indistinguishable from real acupuncture needles and not produce any physiological therapeutic effects. To achieve this, non-penetrating needles with a similar appearance to real acupuncture needles, which retract telescopically into the needle handle when pressed on the skin, were developed because they provide patients with the visual illusion that their skin is being penetrated, much like a stage dagger in theater performances.

Non-penetrating needles have been commonly used as placebo controls for acupuncture research over several decades (4), and are often seen as standard when investigating the mechanisms underlying the acupuncture effects (5). Interestingly, several studies have shown that the effectiveness of placebo acupuncture needles is similar to that of real acupuncture needles. A systematic review of clinical trials revealed only a small difference between real and placebo needles in terms of pain relief, whereas a moderate difference was found between placebo treatment and no treatment at all, e.g., during a waiting period (6). RCTs have shown that real and placebo acupuncture treatments are equally effective and that both are superior to "treatments as usual" (TAU) for chronic pain (7, 8). Taken together, these findings imply that acupuncture treatment is as effective as placebo acupuncture and, therefore, that acupuncture treatment effects are placebo effects (9). However, the adequacy of the controls being used in these studies remains to be determined (10). Many discussions of whether placebo needles are appropriate controls for acupuncture research have followed the development of these needles (11), and there has been some criticism from a physiological perspective that placebo needles may not be proper controls for acupuncture studies (12). In fact, placebo needles are neither fully indistinguishable from regular needles nor physiologically inert (13, 14). Similarly, a recent meta-analysis suggested that neither the Streitberger device nor the Park Sham device is an adequate inert control for clinical studies (15).

This issue pertains not only to acupuncture needles, but also to other treatment devices that involve physical contact with the patient, such as injections, transcutaneous electrical nerve stimulation, manual therapy, and surgical interventions. Placebo devices, including placebo injections and placebo acupuncture needles, exhibit stronger effects than do oral placebo pills (16). Similarly, a meta-analysis showed that subcutaneous placebo administrations produce greater effects than do oral placebos for the acute treatment of migraine (17). A more recent meta-analysis of the effects of placebo interventions across all clinical conditions showed that physical placebo interventions, including acupuncture, have greater effects than do pill controls (18); sham acupuncture has been shown to have even greater effects than other physical placebos (19). A clinical trial revealed that placebo needles have greater effects than placebo pills on self-reported pain and severity of symptoms in patients with persistent arm pain (20). Expectations of potential benefit induced in the recipient, which are influenced by the invasiveness of the intervention, lead to therapeutic effects following a placebo treatment (21). The greater effect of placebo devices compared with placebo pills may be due to the additional physical contact or the tactile component of the intervention, which is minimally present with the use of pharmaceutical pills. Therefore, the contextual effects associated with the preparation of acupuncture treatment devices are multisensory and have a broader impact on the patient. The tactile context of treatment devices, such as during acupuncture, is essential for the establishment of therapeutic effects (22). In contrast to the use of oral placebo pills, this context has two components: physiological action and ineffective blinding. The latter takes effect as soon as the treatment is applied and therefore differs from the gradual unblinding that results from experiencing adverse events during drug application.

Thus, the purpose of the present article was to review the two components of placebo devices, physiological action and ineffective blinding, and to discuss how these features result in stronger placebo effects relative to oral pills.

## PHYSIOLOGICAL ACTIONS OF PLACEBO NEEDLES

### The "Specific" Effect of Placebo Needles Due to Tactile Stimulation

Pharmaceutical research involving a placebo requires a verum preparation with a specific drug and a placebo preparation without that drug, with the difference in the effects of these two preparations indicating the effectiveness of the target drug. The aims of this type of study design are to exclude any other possible factor that might influence the general effects of medical treatment, such as natural history, regression to the mean, and/or methodological biases, and to test the "true" therapeutic effects of the novel compound (23). Additionally, the non-specific effects of the treatment can be observed by comparing the response with placebo to a no-treatment control condition, e.g., a waiting list; these effects are caused by the treatment preparation itself within a medical context, i.e., the attention the patient receives. The context provided by the medical setting may be referred to as the "specific" effect of the placebo (24). In fact, placebo effects are regarded as brain–body responses to contextual information that promote health and well-being (24).

In the case of placebo needles, tactile stimulation is an additional component that is associated with the treatment context of acupuncture, which is absent in a pharmaceutical context. Due to this component, the expected difference in effect between placebo needle treatment and waiting list groups includes a tactile context that has been overlooked in previous studies. The tactile context provided by the placebo needles, much like the medical context under which a pill is given, cannot be physiologically inert, and this stimulation can even exert similar therapeutic actions by enhancing touch sensations in the body (25). Furthermore, the touch of the placebo needles experienced by the patient initiates a multisensory process and thereby activates bodily self-awareness. Overall, tactile stimulation provides a broader range of contexts that contribute to the effect and improve the healing process relative to other placebo interventions (26). The effect of the tactile component on the patient can be categorized accordingly into sensory-discriminative and affective-social aspects. These aspects of the tactile component play important roles in the therapeutic effect of acupuncture treatment in clinical practice (22) and are examined in the context of placebo needles in the following sections.
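The comparisons described above can be summarized in a simple additive sketch. The symbols are illustrative assumptions rather than notation from the cited studies: $R$ denotes the mean response in a trial arm, and the additivity of the contextual and tactile contributions is assumed only for the sake of illustration.

$$
\begin{aligned}
R_{\text{verum}} - R_{\text{placebo}} &= \text{specific effect of the tested treatment} \\
R_{\text{placebo pill}} - R_{\text{waiting list}} &= \text{contextual effect} \\
R_{\text{placebo needle}} - R_{\text{waiting list}} &= \text{contextual effect} + \text{tactile effect}
\end{aligned}
$$

Under this reading, the contrast between placebo needles and a waiting list cannot isolate a purely contextual placebo effect, because the tactile term is present in the placebo needle arm but absent in the waiting list arm.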

### The Sensory-Discriminative Aspect of the Touch Component of Placebo Needles

Several studies have examined in depth the sensory-discriminative aspect of acupuncture needles. The process of needle insertion and the types of needle manipulation (27) activate diverse touch perception processes and stimulate mechanically sensitive pain fibers (28). This tactile stimulation process produces what is known as the de qi sensation (a combination of various sensations that include heaviness, numbness, soreness, and distention), which is fundamental for the therapeutic outcome of acupuncture treatment (29, 30). Placebo needles were first validated as a sufficient control in acupuncture studies under the assumption that a lesser degree of de qi sensation would be evoked, thereby leading to less effective clinical outcomes (2, 3). In the initial validation studies of placebo needles, participants were not able to distinguish the placebo needles from real needles, but they experienced a greater degree of de qi sensation with real needles than with placebo needles (2, 3, 31) (**Figure 1**).

On the other hand, a recent validation study of the Streitberger needle conducted with a large population showed no significant difference in de qi sensation between patients treated with real and placebo needles, even though the placebo needle does not penetrate the skin (32). Additionally, a study investigating Park Sham devices revealed that the de qi sensation induced by real and placebo needles is not distinguishable (33). De qi sensation, a composite of unique sensations produced during acupuncture, has been considered to be one of the essential components for clinical efficacy (22). Considering the lack of a significant difference between treatments administered with real and placebo needles, we can assume that the placebo needle exerts an action that is similar to that exerted during real acupuncture.

The somatosensory system is activated directly by placebo needles, which exert various physiological actions in the body that are similar to those exerted by real acupuncture needles. Real and placebo needles produce enhanced skin conductance responses and decrease the heart rate, suggesting that placebo needles are not physiologically inert in terms of autonomic response patterns (14). Furthermore, these autonomic responses to placebo needles might be derived from the patient's orienting responses, or bodily self-awareness (34). A functional magnetic resonance imaging study demonstrated that tactile stimulation, which mimics acupuncture stimulation, not only induces activation in sensorimotor processing regions and deactivation in default-mode network regions, but also modulates higher cognitive areas in the brain (35). Additionally, a meta-analysis of brain imaging studies showed that placebo needles produce weaker, but similar, patterns of brain activation compared with real acupuncture (36). When the placebo needle touches the skin and evokes activity in cutaneous afferent nerves, it seems to act in the brain and result in a limbic touch response (37).

In pharmaceutical trials, active pills carry the "true" therapeutic effects of the novel compound in the capsules, while placebo pills use the same types of capsules without active components. Placebo pills, of course, can induce tactile sensations on the tongue, but it is not likely that such tactile sensations are related to the therapeutic effects in the trials. On the other hand, placebo needles can induce tactile sensations around the acupoints that are similar to those induced by real acupuncture needles; in acupuncture trials, these tactile sensations themselves could produce physiological actions throughout the body.

### The Affective-Social Aspect of the Touch Component of Placebo Needles

The process of treatment with placebo needles involves a component of touch between the patient and the practitioner. This affective-social aspect, involving slow gentle touch stimulation, activates unmyelinated C tactile fibers (CT afferents) and induces feelings of calm and well-being (38, 39). Prior to inserting and stimulating the needle, the practitioner touches the patient to assess the skin tissue and identify the region to which the needle will be applied. This process of gently touching the patient's skin activates CT afferents and alleviates unpleasantness. Furthermore, this type of pleasant touch reestablishes the patient's sense of self-esteem and well-being by inducing a limbic touch response (39). A clinical study (40) supports the role of affective-social touch in treatments with acupuncture and placebo needles because the enhanced patient–doctor relationship produced greater improvements in patients with irritable bowel syndrome. Additionally, the entirety of the procedure, including warmth, empathy, and the communication of positive expectations, might influence clinical outcomes (40).

FIGURE 1 | Additional components involved in the effects of placebo needles. In pharmaceutical trials, the nonspecific effects of treatments can be ruled out by comparing the placebo pill group with an untreated group, e.g., on a waiting list. In acupuncture trials, tactile stimulation is an additional factor that affects the placebo needle and untreated groups. Enhanced touch sensations, which are distinct during acupuncture treatment, but absent with placebo pills, remain substantial during placebo needle administration. Thus, placebo needles not only play a role as a cue for treatment expectations, but also evoke the somatosensory system and directly activate multiple brain systems.

Gentle touch, which is always a component of acupuncture treatment, plays a crucial role in the overall outcome of the medical treatment. Gentle touch by a nurse before a surgical operation decreases subjective and objective levels of stress in the patient (41). Furthermore, gentle touch plays a direct moderating role in the physiological responses of the patient such that it lowers blood pressure, enhances transient sympathetic reflexes, and increases pain thresholds (42). The affective-social components of gentle touch also enhance the patient–doctor relationship, even when patients are treated with placebo needles (40). Although the gentle touch component prior to the application of real or placebo needles is not considered to be part of the active component of placebo treatment, it is nevertheless part of the placebo preparation in a clinical acupuncture trial. Thus, compared with the effects observed in a waiting list group or a group receiving another placebo intervention, this component generates a stronger doctor–patient relationship and enhances the placebo effect.

Although the placebo needle acts as a control due to its non-penetrating qualities, the tactile component is not completely removed; thus, its application in acupuncture trials may additionally produce crucial effects such as directly evoking the somatosensory system, strengthening the doctor–patient relationship, and enhancing the patient's general condition. The biophysical effects of placebo needles influence the patient's expectations and contextualization, which likely also play roles in his or her cognitive perception during the treatment process regarding the alleviation of symptoms.

## BLINDING OF PLACEBO NEEDLE APPLICATIONS

### The Blinding Components of Placebo Needles

Placebo needles were developed based on a visual illusion that induces the belief that one's skin has been penetrated (2, 3). The tip of the placebo needle is blunt and retracts into the needle's handle; thus, a placebo needle has a shape similar to that of a real needle but does not penetrate the skin. Because the placebo needle induces the sensation of pricking and appears to penetrate the skin, the patient is more likely to classify placebo needle treatment as active than is the case with placebo pills. Placebo pills are indistinguishable in appearance from the active drug, but the patient must be convinced that they are receiving real treatment. The chance of determining whether a pill is a placebo or an active treatment is theoretically equal in pharmaceutical trials due to the indistinguishable appearance, smell, and taste of placebo pills compared with active drugs; in contrast, the chance of determining whether a needle is placebo or real is not equal, since a patient who looks at and feels the needle while receiving the treatment would be inclined to believe that the placebo treatment is active. Consequently, a patient's judgment of whether a needle is placebo or real would be even more biased if they have prior experience of acupuncture needling and have felt its therapeutic effects.

Blinding is another important issue that can minimize bias or the potential effect of context on the outcomes of RCTs (43). The blinding index (BI) was developed to assess the success of blinding in clinical trials (44) and is interpreted as a "correct guess beyond chance." For example, a BI of 1 indicates that all guesses are correct, a BI of −1 indicates that all guesses are incorrect, and a BI of 0 indicates that the probabilities of correct and incorrect guesses are equal (45). When classifying the blinding results of trials, BI values > 0.2 are considered to indicate failed blinding because more participants guessed correctly, BI values < 0.2 and > −0.2 are considered to be random guesses, and BI values < −0.2 are also considered to indicate failed blinding because more participants guessed incorrectly (45). An assessment of blinding in trials involving pharmacological interventions for psychiatric disorders yielded average BI values of 0.18 and 0 in the active treatment and placebo control groups, respectively (46). This finding implies that blinding was established successfully, which is an ideal result from a scientific perspective.
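As a rough illustration of how these thresholds are applied, the following Python sketch computes a per-arm index and classifies it. It assumes a simplified form of the BI, namely the number of correct guesses minus the number of incorrect guesses, divided by all respondents in the arm (including "don't know" answers); the function names, the example counts, and the handling of values exactly at ±0.2 (treated here as random guessing) are illustrative assumptions and are not taken from the cited studies.

```python
def blinding_index(correct: int, incorrect: int, dont_know: int = 0) -> float:
    """Simplified per-arm blinding index (assumed form, see lead-in).

    Returns 1 when every guess is correct, -1 when every guess is
    incorrect, and 0 when correct and incorrect guesses are balanced.
    """
    total = correct + incorrect + dont_know
    if total == 0:
        raise ValueError("at least one response is required")
    return (correct - incorrect) / total


def classify_blinding(bi: float, threshold: float = 0.2) -> str:
    """Map a BI value onto the three scenarios described in the text."""
    if bi > threshold:
        return "unblinded"        # more correct guesses than expected by chance
    if bi < -threshold:
        return "opposite guess"   # more incorrect guesses than expected by chance
    return "random guess"         # values at exactly +/-0.2 are placed here by assumption


# Hypothetical counts for one trial arm: 30 correct, 50 incorrect, 20 "don't know".
example_bi = blinding_index(correct=30, incorrect=50, dont_know=20)
print(example_bi, classify_blinding(example_bi))
```

Applied to the average values reported above for pharmacological trials (0.18 and 0), `classify_blinding` places both arms in the random-guess range.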

In contrast, people more often respond to placebo needles because they are more likely to believe that they are receiving active treatment, which is also known as an opposite guess (15, 46). Although a recent systematic review of the use of placebo needles for acupuncture in clinical trials with limited reporting of the credibility of blinding showed that participant blinding was successful in most cases (15), participants were less likely than chance levels to believe that the needles were real, rather than placebos. When a BI calculation was applied to this review, the average BI values were 0.55 and −0.33 for the real and placebo needle groups, respectively (15), indicating unsuccessful blinding. Additionally, based on the classification rules for blinding scenarios, 86% of studies have involved unblinded participants in the real acupuncture group (BI > 0.2) and participants making opposite guesses in the placebo group (BI < −0.2) (15).

A recent acupuncture study showed that 61% and 68% of patients administered real and placebo treatments, respectively, perceived the treatment type correctly, which implies that blinding was unsuccessful (47). One possible reason for this unsuccessful blinding is the experience of the de qi sensation, which could contribute to the correct identification of the treatment (47), even though placebo needling sessions produce substantial levels of this sensation. Another possible explanation is that smaller insertion and pullout forces are used during placebo needling (13). Differences in biomechanical forces may be a crucial reason for the association of different somatosensory processes with the use of real and placebo needles (7) (**Figure 2A**).
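As a rough check of these percentages against the BI thresholds, assume (this is not stated in the source) that every participant made a guess and that the index is taken as the proportion of correct guesses minus the proportion of incorrect guesses. With a proportion $p$ of correct guesses, the per-arm index then reduces to $\mathrm{BI} = p - (1 - p) = 2p - 1$:

$$
\mathrm{BI}_{\text{real}} = 2(0.61) - 1 = 0.22, \qquad \mathrm{BI}_{\text{placebo}} = 2(0.68) - 1 = 0.36
$$

Under these assumptions, both values would lie above 0.2 and so fall into the failed-blinding range described earlier. If some participants answered "don't know", the values would be smaller.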

### Greater Expectations During Placebo Needling Produced Greater Placebo Effects

According to systematic reviews of the BI in clinical trials, pharmacological placebo pills have an approximately 50% chance of being perceived as active, whereas this assumption is not necessarily true for placebo needles (15, 46). Although the adverse events in drug trials pose a risk of unblinding, the BI in the aforementioned studies seems to have been uncompromised, possibly due to the timing and frequency of such events.

The discussed BI patterns are often thought to indicate adequate blinding, but a greater probability of believing that a placebo is real might be due to wishful thinking rather than the well-known psychological preference toward real or better treatment (48). The greater probability of opposite guesses in placebo needle groups may be related to greater expectations regarding symptom alleviation. Placebo effects, or any improvement in the symptoms or physiological condition of an individual receiving a placebo treatment (23), are based largely on the expectation of receiving actual treatment, cued and contextual conditioning, and/or observational and social learning (49). Thus, patients may have higher levels of expectation during placebo needling than when receiving placebo pills, which could contribute to treatment efficacy (50). In this manner, placebo responses may be more frequent with placebo needles than with placebo pills because patients are more likely to perceive the use of placebo needles as active treatment (**Figure 2B**).

FIGURE 2 | Differences in blinding scenarios for placebo needles and placebo pills. In pharmaceutical trials, successful blinding in the treatment and placebo groups results in patients making random guesses about whether they are receiving active or placebo pills. Acupuncture trials involve different blinding scenarios: "unblinded participants" in the real acupuncture group and participants making "opposite guesses" in the placebo needle group. Due to this unique pattern of blinding, individuals more often respond to placebo needles because they are more likely to believe they are receiving active treatment (i.e., opposite guess).

## ALTERNATIVE CONTROL STRATEGIES

When blinding becomes difficult (as with sham acupuncture needles) or even impossible (such as with psychotherapy), alternative control strategies are required to separate specific therapy effects from unspecific (e.g., contextual) effects as well as from spontaneous remission and response biases (23). Ineffective or impossible blinding also precludes conventional cross-over designs, in which each patient serves as his/her own control, thereby reducing the data variance and allowing trials with far fewer patients than a parallel-group design. However, cross-over designs carry another risk: that of carry-over effects from one phase to the next. If the carry-over effect is based on Pavlovian conditioning of responses (51), even longer wash-out phases cannot prevent it from occurring.

A number of design alternatives have been discussed, all of which have specific advantages and pitfalls.

### No Treatment Controls (NTC)

To separate "spontaneous variation" from "placebo responses", a "no-treatment" control group appears necessary to determine how much of the unspecific effects can be attributed to spontaneous variation and recovery. Since this is rarely done, the exact size of the contribution of spontaneous variation to the placebo response is known only for minor and benign clinical conditions and may account here for approximately 50% of the placebo effect (52). In experimental settings, "no treatment controls" may also serve to control for habituation and sensitization effects that may occur with repetitive stimulation, e.g., in pain and placebo analgesia experiments.

NTC are limited by ethical rules, as set by the Declaration of Helsinki of the World Medical Association (53), when patients with a severe clinical condition require treatment and cannot be offered trial participation that would assign them to an NTC group.

### Waiting List Control (WLC), Treatment as Usual (TAU)

Assigning patients to a "no treatment" group may be ethically problematic, e.g., in case of severe diseases, or when for other reasons the patients require treatment; in such cases WLC and TAU are control strategies for non-drug testing when an inert "placebo" is not available, e.g., in psychotherapy, physical/manual therapy, surgery, and "instrumental" therapies (TENS, transcranial magnetic or direct current stimulation, laser or light therapy), including acupuncture (see above). While some of these therapies have "sham therapy techniques" that can serve as placebo controls, e.g., in acupuncture, others must rely on WLC and TAU as their only control condition.

However, WLC and TAU face significant limitations: while patients expect to receive effective therapy, they are randomized to routine treatment that most of them have had in the past (TAU), or (in the case of WLC) have to wait for the treatment they were recruited for, resulting in disappointment and potentially nocebo effects (21). This not only affects recruitment and compliance, but also biases the patient populations in such studies.

To avoid WLC and TAU and the associated disadvantages, studies in acute and chronic pain are often conducted comparing a novel drug with another drug already available rather than with placebos (54, 55).

### Comparative Effectiveness Research (CER)

One approach to circumvent the placebo dilemma in RCTs (for ethical as well as for methodological reasons) has recently been favored by drug approval authorities, by boards of medical societies, and by ethics committees, namely to avoid the use of placebos in clinical trials. Comparative effectiveness research (CER) compares novel treatments to already approved therapies; to the best of our knowledge, this has never been done for acupuncture therapy, e.g., in chronic pain conditions.

However, as has been shown in a number of meta-analyses in depression, schizophrenia, and other diseases, comparing a new therapy to a comparator increases the response, driven solely by the higher likelihood of patients receiving active treatment (100%) compared with placebo-controlled trials (56). In such trials, therefore, the placebo response is high but can no longer be controlled. Of specific interest is the fact that CER studies need to test for "non-inferiority" of the novel drug, resulting in higher patient numbers (57).

### Cohort Multiple Randomized Controlled Trial (CMRCT) Design

The "cohort multiple randomized controlled trial" (CMRCT) (58)—formerly also known as the Zelen design (59)—splits the "no treatment" control arm of a drug trial (done for the purpose of mere observation of the natural course of the disease) from the drug trial itself, by recruiting a large cohort of patients for an "observational study" in which patients are followed under their TAU condition.

The observational cohort then serves as the basis for the recruitment of a subsample for the treatment study, either placebo-controlled or CER: patients are randomly approached, but can be selected based on a number of factors accounting for statistical representativeness.

A number of limitations apply, however: "the observational cohort needs to be monitored over time (a cross-sectional sample analysis would not be sufficient to account for changes occurring over time), and it needs to be representative for complete patient cohort affected by the diseases, both in terms of disease features (e.g., symptom severity) as well as disease management (diagnosis, TAU). Once such a cohort it established it may be used for more than one RCT" (21).

## DISCUSSION AND CONCLUSION

Similar to other placebo types, placebo needles play an important contextual role in treatment expectations; however, they also directly evoke the somatosensory system and activate multiple brain systems. Placebo preparations are applied in studies to blind participants, and they enable the calculation of chance levels for patients' guesses about whether interventions are therapeutic or inert. However, the probability of making an opposite guess is greater for placebo needles than for placebo pills, which is often explained by patients' greater expectations. Because patients are more likely to perceive placebos as active treatment in placebo needle trials, placebo responses may be observed more frequently to placebo needles than to placebo pills.
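The comparison of patients' guesses against chance can be sketched with an exact binomial test. This is an illustrative sketch only: the function name `binomial_two_sided_p` and the counts below are hypothetical and are not data from any of the reviewed trials, and a 50% chance level assumes a simple two-option guess ("real" versus "placebo").

```python
from math import comb


def binomial_two_sided_p(correct: int, total: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value for `correct` right guesses out of `total`."""
    pmf = [comb(total, k) * chance**k * (1 - chance) ** (total - k)
           for k in range(total + 1)]
    observed = pmf[correct]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Hypothetical placebo arm of 50 participants guessing their allocation.
p_chance = binomial_two_sided_p(correct=25, total=50)    # guesses at chance level
p_opposite = binomial_two_sided_p(correct=15, total=50)  # most guessed "real treatment"
print(p_chance, p_opposite)
```

In the second hypothetical case, far fewer than half of the placebo recipients guess correctly, i.e., most make the opposite guess, and the deviation from chance is statistically detectable.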

The tactile components of acupuncture needle use are crucial factors during treatment preparation and could not be fully controlled for as placebo needles were being developed. The distinctive touch sensations experienced during acupuncture treatment are substantial, even during the administration of placebo needles. Due to the physical contact necessary when applying placebo needles, the validity of these needles as controls has been in question from the perspectives of physiological inertness and blinding. These factors may result in placebo needles exerting stronger placebo effects than do other types of placebo preparation that do not include tactile components. Thus, the development of a technique to control for the tactile components of acupuncture interventions while participants are consciously receiving treatment is an important consideration. The studies reviewed here demonstrated that the de qi sensation cannot be completely accounted for when using placebo needles without controlling for the tactile components, which suggests some level of clinical efficacy. Placebo needle administrations may inadvertently, albeit less robustly, activate the somatosensory system and induce regulatory mechanisms that are also triggered by acupuncture needling. Furthermore, placebo needles, or what we have considered to be control needles for experimental studies, may be a form of acupuncture treatment that is low dose or that provides weak stimulation.

In clinical trials, the placebo control should be indistinguishable from the active treatment (i.e., blinding success) and yet physiologically inert (less de qi sensation in this case). In the case of acupuncture, however, it is difficult to meet these two criteria simultaneously (60). Most importantly, our argument on the inadequacy of placebo needles as controls in acupuncture trials should not inhibit further acupuncture trials with randomized, controlled designs. Placebo needles indeed are more likely to induce placebo responses than placebo pills, which is largely due to the tactile component that cannot be separated from the components of the real acupuncture needles. Conversely, our arguments imply that acupuncture needles contain a substantial level of placebo effect, which was not completely ruled out by controlling the penetration. It is also important to note that waiting lists do produce unspecific effects on their own (61). Furthermore, recent studies in acupuncture have employed study designs such as pragmatic trials, which compare acupuncture treatment with waiting lists and usual care (62–64), while other innovative control strategies still await validation with acupuncture. In the meantime, the discussion on the effect of the tactile components of placebo needles on their effectiveness as placebos, as well as on effective blinding, needs to be continued.

Taken together, the placebo needles do have different characteristics from placebo pills in clinical trials. Our exploration does not imply that acupuncture may be more effective than placebo, but suggests that we have to consider these unique characteristics of placebo needles before we draw premature conclusions that acupuncture itself is just a placebo.

### AUTHOR CONTRIBUTIONS

Conceived and designed the paper: YC and PE. Wrote the first draft of the paper: YC, Y-SL, and PE. Revised the paper and approved the final version: YC, Y-SL, and PE.

### ACKNOWLEDGMENTS

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2014K2A3A1000166 & 2015R1D1A1A01058033 & 2015M3A9E3052338).



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Chae, Lee and Enck. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Psychotherapy and Placebos: Manifesto for Conceptual Clarity

Charlotte R. Blease<sup>1,2</sup>\*

<sup>1</sup> Program in Placebo Studies, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, United States, <sup>2</sup> School of Psychology, University College Dublin, Dublin, Ireland

Keywords: psychotherapy, placebo, placebo effect, common factors across psychotherapies, philosophical psychology, thomas kuhn, placebo studies, placebo research

For nearly as long as the term has existed, "placebo" has been a source of debate and disagreement. Scientists and philosophers have been active contributors to the protracted dialog about how best to define the term, leading one prominent health researcher to argue that there appears to be "currently no widely accepted definition of placebo" (1). Meanwhile, new theoretical models aimed at resolving conceptual quagmires once and for all [e.g., (1, 2)] often seem to confound rather than crack the problem by inviting further questions (3, 4). If discussion about how to conceive placebo terminology seems to "rage" within the sphere of biomedicine (1), when it comes to the domain of psychotherapy conceptual entanglements appear even more complicated. Here the to-and-fro of debate has spanned the decades, though it has been episodic rather than ongoing [e.g., (5–7)], and lately the debate has re-emerged [see: (8–11)]. Reviewing the recent contributions to this discussion, I argue that there are indeed stable definitions for the terms placebo and placebo effect within the science of placebo studies. Furthermore, I argue that it is justified to use these definitions as a starting point for appraising conceptual disagreement, including the (apparently) contentious translation of these terms to psychotherapy. Exploring two provocative yet divergent claims about the relationship between placebo and psychological treatments (9, 10), I conclude that disagreement arises when researchers employ definitions of placebo that are disengaged from implicit scientific usage.

Discussion about conceptual or definitional matters in science may appear to be esoterica; however, definitions are important. How we understand placebo concepts carries subtle but significant methodological implications for clinical trials as well as for ethical practice in the delivery of care (4, 12, 13). Therefore, gaining clarity about the argumentation within disputes over concepts is not trivial; rather, it might even be viewed as a major priority for the field of placebo research.

In this Mini Review I focus on two of the most prominent recent claims about the relationship between placebo concepts and psychotherapy proposed by leading scholars (9–11). I argue that, appearances to the contrary, the resultant conceptual quagmire is avoidable, and suggest how and why definitions of placebo and placebo effects have become muddled within the context of psychotherapy. However, to highlight why disagreement arises it is imperative to identify unambiguous definitions for the terms "placebo" and "placebo effect." Fortunately, in this regard, the insights of philosopher of science Thomas Kuhn are instructive (14).

#### Edited by:

Paul Enck, Universität Tübingen, Germany

#### Reviewed by:

Jörn von Wietersheim, Universität Ulm, Germany

#### \*Correspondence:

Charlotte R. Blease cblease@bidmc.harvard.edu

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 15 July 2018 Accepted: 30 July 2018 Published: 20 August 2018

#### Citation:

Blease CR (2018) Psychotherapy and Placebos: Manifesto for Conceptual Clarity. Front. Psychiatry 9:379. doi: 10.3389/fpsyt.2018.00379

### KUHN'S INSIGHTS

In the mid-twentieth century Thomas Kuhn helped to re-orientate philosophy of science by arguing that philosophers should move away from a priori formulations about the nature of science and pay closer attention to how scientists in fact reason and evaluate theories (15). One of Kuhn's most important (and least controversial) legacies is his claim that for empirical progress to arise in a given field of enquiry there must be discernible underlying conceptual stability (15). Following Kuhn and other post-positivist philosophers, I argue that the most effective way to clarify conceptual issues is not to start from the philosophical armchair but to ask: How do established scientists in fact use these terms? (14).

If we assume that the field of placebo studies has emerged as an established field of science it is a small step to infer—despite the buzz of debate—that key definitions must be relatively settled in order to support the systematic growth of empirical knowledge. In a recent paper I argued that supposedly contested terminology is relatively settled in scientific placebo research (14).

However, before we address the question about what scientific usage reveals about core definitions, it is important to foreground the discussion by flagging up two important points. To begin, scientific concepts of placebo and placebo effect should be differentiated from those meanings ascribed to the terms in other non-expert domains including among medical practitioners, clinical investigators, patients, and research participants. Here I assume that even individuals who use placebos (e.g., in clinical trials, or as prescribing physicians) may not be experts in how best to define these terms. Indeed, sociological research has demonstrated that both physicians and patients interpret "placebo" and "placebo effect" in myriad ways [see: (16, 17)]. How these non-experts define this terminology is important but not the present concern of this paper, which is to inquire how these terms ought to be used.

Next is a residual and paradoxical question: if there is conceptual consensus within the scientific placebo community about placebos and placebo effects, why, then, is there so much debate about how to define the terms? I suggest three reasons. First, and perhaps most importantly, the term "placebo" has been in use for centuries and is embedded within common lay as well as medical usage, further obstructing the possibility of clear and unambiguous meaning change. Second, even if we agree that scientific research into placebo effects is burgeoning as a field of enquiry, it has emerged only recently (14, 18). Therefore, we might still expect to observe a residual hangover of conceptual disputes regardless of whether such disagreement is substantive. Third, and related, even within scientific contexts, conceptual stability can be typified by implicit understanding rather than articulable, explicit definitions among many scientists who, as Kuhn observed, may be "little better than laymen at characterizing the established bases of their field, its legitimate problems and methods" [(15), p. 44].

Despite these challenges, I argue that we can discern conceptual stability over key definitions within the emergent science of placebo studies (14). I suggest that within the empirical research field, "placebo effects are understood to be positive health changes that occur as a result of specific psychobiological mechanisms . . . These psychobiological mechanisms are elicited, in turn, by a range of cues in the context of the practitioner-patient encounter" (14). Placebo effects, therefore, can be broadly understood as a natural kind of psychobiological phenomenon.

The term placebo is more nuanced. When it comes to placebo-randomized controlled trials (RCTs), the placebo allocation should ideally be identical to the verum intervention (the treatment under evaluation) in all respects except for its hypothesized remedial factor(s), and patient allocations should be randomized and double-blinded. The function of placebos in RCTs is to act as controls for the experimental "noise" that arises within clinical trials; this includes regression to the mean, natural progression of an illness, patient or physician/investigator reporting biases, Hawthorne effects, as well as placebo effects. Placebos in RCTs should therefore be conceived as a moving target: an instrument that is designed to mimic the appearance of a verum intervention (14). This means that the appearance and administration of the placebo control should always be dependent on the treatment under scrutiny rather than simply being reified as a particular kind of thing (e.g., "placebos are sugar pills"). Indeed, it would be less misleading to label placebos in RCTs as "control interventions" (4, 14). Placebo researchers also differentiate between placebo effects and placebo responses (12): the latter comprise the aggregate responses to receiving a placebo in an RCT, that is, the factors associated with so-called "experimental noise" which, as noted, may or may not include placebo effects.
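The distinction between placebo responses and placebo effects can be made concrete with a small simulation of regression to the mean, one of the "noise" sources listed above. The sketch below is purely illustrative and rests on assumptions not found in the source: a hypothetical symptom scale, normally distributed severity and measurement error, and an arbitrary entry cutoff. No treatment or placebo effect is modeled at all.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical symptom scale: all parameter values are illustrative assumptions.
TRUE_MEAN, TRUE_SD, NOISE_SD, CUTOFF = 20.0, 4.0, 4.0, 24.0

baseline, follow_up = [], []
for _ in range(20_000):
    severity = random.gauss(TRUE_MEAN, TRUE_SD)       # stable underlying severity
    screening = severity + random.gauss(0, NOISE_SD)  # noisy screening score
    if screening >= CUTOFF:                           # trial entry criterion
        baseline.append(screening)
        # No treatment and no placebo effect: only fresh measurement noise.
        follow_up.append(severity + random.gauss(0, NOISE_SD))

mean_change = mean(follow_up) - mean(baseline)
print(f"enrolled: {len(baseline)}")
print(f"mean change from baseline: {mean_change:.2f}")
```

Under these assumptions the enrolled group "improves" on average at follow-up even though nothing was administered, because participants were selected on a noisy high score. An observed placebo response in a trial arm can therefore be non-zero without any placebo effect in the psychobiological sense.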

When it comes to the scientific community's definitions of placebos in clinical contexts, things get trickier. However, one place to glean insight is so-called open-label placebo experiments. Here the following script has been provided to patient participants in experimental setups: "placebo pills are made of an inert substance, like sugar pills...have been shown in clinical studies to produce significant improvement in IBS symptoms through mind-body self-healing processes" [(19); see also: (20)]. In these scenarios, there are two implied definitions of placebo: (i) treatments theorized not to be effective for a condition or symptoms by virtue of their intrinsic properties; and (ii) the added notion that placebos, as described in (i), may be causally implicated in the elicitation of placebo effects. Here it is implied that placebos play a role as causal antecedents of psychobiological pathways which, when combined with other proximal conditions and factors in the context of health care (such as practitioner empathy, warmth, and confidence), cause placebo effect(s) (21). With these delineations in mind, I appraise two recently published, divergent analyses of the relationship between placebos and psychotherapy proposed by prominent scholars.

### "ALIGNING PLACEBOS AND PLACEBO EFFECTS WITH PSYCHOTHERAPY IS INCOHERENT"

The first claim, owed to Kirsch, Wampold and Kelley, is that deployment of this terminology within psychotherapy leads to a form of reductio ad absurdum (10). The authors argue: "In the context of medical treatment, placebo effects are relatively easy to define. They are the effects produced by factors other than the physical properties of the treatment" [(10), p. 123]. However, in psychological contexts, the authors contend, "Here is the central problem: The effect of psychotherapy is—by definition of the term psychotherapy—produced by something other than the physical properties of the treatment. Therefore, if we adhere to the received implicit definition of placebo as it has been used in the context of medicine, the effects of psychotherapy are ipso facto placebo effects and psychotherapy is ipso facto a placebo" [(10), p. 123].

First, Kirsch, Wampold and Kelley claim that we can rely on, "the received implicit definition of how placebo has been used in the context of medicine" [(10), p. 123]. Yet, as argued, implicit and explicit conceptualizations of placebos among non-experts are unhelpful precisely because the term has been deployed in myriad inconsistent and sometimes confusing ways [e.g., (17)]. To draw on another example, consider "folk biology" which encompasses among other intuitions, in-built ideas about how to classify species (22): this intuitive classification scheme does not provide a foolproof scientific foundation for how species are (in fact) related to one another. Mixing both classification systems would undermine scientific enquiry. Rather, to avoid conceptual quagmires, definitions of placebo must be anchored to how these terms are standardly, even if implicitly, deployed by experts working in the scientific placebo research community.

Second, the authors suggest that all non-physical responses to treatments should be conceived as placebo effects. This is incorrect: just because responses are non-physical—i.e., occurring at a psychological level—does not mean they are de facto placebo effects. This line of reasoning implies that every non-physical effect of a treatment is a placebo effect. Indeed, the logical extension of this argument is there can be no psychological responses other than placebo effects in psychotherapy: yet to suggest that the rich variety of psychological events elicited in psychotherapy simply amount to placebo effects is improbable (23). Correlatively, as scientific research in placebo studies has shown, not all non-physical responses to placebos are accurately described as placebo effects. We might surmise, in this instance, that Kirsch et al. confuse placebo responses with placebo effects.

Third, and finally, Kirsch, Wampold and Kelley argue, "[I]n evaluating the efficacy of psychotherapy, the placebo effect cannot and should not be controlled" [(10), p. 212]. From the premise that psychological responses just are equatable with placebo effects they infer the strong conclusion that it is unjustified to undertake placebo-RCTs of psychological treatments. This is unwarranted. From the definition of placebos as control interventions only the weaker claim is supported: in principle it is possible to design psychotherapy RCTs but in practice, the task is fraught with multiple serious challenges.

Indeed, one such problem is the double-blinding requirement (whereby neither therapist nor participant is aware of whether the individual has been allocated to placebo or the intervention under scrutiny). Another problem, which the authors highlight, is the need to control for so-called "common factors" in the delivery of psychotherapy. Here we must pause to consider what the term common factors means and how it should be distinguished from specific treatment factors in psychotherapy research.

Specific treatment factors vary according to different psychotherapy modalities and theories. So, for example, specific techniques in cognitive behavioral therapy (CBT) involve identifying hypothesized "cognitive distortions" or "maladaptive thoughts" which, according to CBT theorists and practitioners, are believed to have negative effects on behavior (24, 25). Here the goal of specific treatment techniques is to redress "faulty thinking" by promoting "cognitive restructuring" which, proponents of CBT theorize, will thereby elicit more psychologically constructive thoughts and behaviors (24, 25). Similarly, in psychodynamic psychotherapies and humanistic therapies distinctive kinds of specific techniques are theorized. Common factors, on the other hand, and as the name suggests, refer to those features of treatment that appear to be shared across different psychological interventions. These include verbal and non-verbal therapist factors (e.g., empathy, positive regard); patient factors (e.g., confidence in the therapist); and factors associated with a strong working alliance between patient and psychotherapist.

To the extent that Kirsch et al. argue that controlling for common factors poses a serious obstacle to placebo-controlled clinical trials in psychotherapy, we can agree with them. Nonetheless, conceivably this hindrance may yet be overcome. In the future, technological innovations may render it possible to deliver psychological treatments using avatars: in such a scenario, we might hypothesize that the regulation and control of common factors would become practicable within psychotherapy-RCTs.

### "PSYCHOTHERAPY IS A PLACEBO"

Gaab et al. (9) and Trachsel and Gaab (11) present a very different interpretation of the relationship between placebo and psychotherapy. Their proposition is that psychotherapy has an "unwanted proximity" to placebos which poses problems for ethical clinical practice in respect of disclosures to patients about how psychotherapy works [(11), p. 493]. Here I will focus on the claim that psychotherapy is interpretable as a "placebo" and sidestep intricate questions about ethical implications of this conjecture (26). Since these arguments rely on: (a) common factors research into psychotherapy; and (b) Grünbaum's model of placebos (6), it is first necessary to set the scene by providing an overview of each premise.

### Common Factors Research

Empirical findings indicate that different versions of psychotherapy, which employ different treatment techniques, appear to be equally successful (23). This is often referred to as the "Dodo Bird Verdict"—the label is derived from the words of the Dodo Bird in Alice in Wonderland: "everybody has won and all must have prizes" [(27), p. 995]. Subsequently, it has been proposed that the Dodo Bird Verdict is explained by the common factors hypothesis—namely, that it is the common factors and not the specific factors that are relevant to outcome (23). While the Dodo Bird Verdict is still somewhat contested (28) a considerable body of research nonetheless suggests that the common factors play a significant role in mediating treatment outcomes (29, 30).

### Grünbaum's Model of Placebo

The second key idea underpinning Gaab et al. (9) and Trachsel and Gaab's (11) views about the relationship between psychotherapy and placebos is Adolf Grünbaum's model of placebos. Grünbaum differentiates between "characteristic" and "incidental" features of interventions which he says must be relativized to (that is, determined by) particular theories about how treatments work [(6), p. 33]. So, characteristic factors of (for example) amoxicillin are its particular antibiotic formula in the pill; meanwhile, the incidental factors include its coloration, the bulking agent, and price. Placebos, on this framework, are conceived as interventions that lack any remedial characteristic treatment factors for a particular condition. Placebo effects, on the other hand, are conceived as those positive effects that arise from the incidental features of a treatment.

Embracing the validity of Grünbaum's model and the common factors hypothesis, Gaab et al. argue that the specific techniques of psychotherapy can be equated with Grünbaum's description of characteristic features of treatments and the common factors interpreted as incidental factors (9, 11). From this perspective, it is concluded that psychotherapy risks being conceived as a placebo.

What should we make of this analysis? A positive feature of Grünbaum's framework is his conceptualization of placebos as a moving classification: placebos are not reified as physical "things" e.g., sugar pills. But when it comes to placebo effects problematic discrepancies arise between Grünbaum's model and scientific research. On Grünbaum's account placebo effects are also conceived as moving targets (rather than as a natural kind): this is because they are conceived as the effects of incidental treatment factors associated with a particular treatment theory. Even if we modify and narrow this framework to accommodate the view that placebo effects are the positive effects of incidental factors (1) the account is still too liberal from the perspective of scientific placebo studies. This is because other positive psychological effects may conceivably be precipitated by incidental factors (e.g., reporting biases or Hawthorne effects which precipitate positive health behaviors, Pygmalion effects, and/or other psychological processes).

Further problems arise when applying this model to psychotherapy research. If: (a) we accept the validity of the common factors hypothesis; and (b) defend Grünbaum's model of placebos, it might be countered that the common factors cannot be interpreted as "incidental factors." This is because theories within psychotherapy typically regard common factors as integral components of treatment [e.g., (25)]. Thus, the terms specific and common factors may not be construed as conceptually isomorphic with Grünbaum's characteristic and incidental factors, respectively. Instead, it would be more accurate to describe different versions of therapy as employing idiosyncratic treatment factors alongside common factors but that all of these factors are "characteristic," i.e., considered to be necessary factors for psychotherapy to be successful.

### CONCLUSIONS AND RECOMMENDATIONS

In conclusion, conceptual clarity in placebo studies will only be settled when we attend to how placebo terminology is in fact used within burgeoning scientific research, rather than how disputants say it is used.

In examining the relationship between psychotherapy and placebos we must ask: What is the context of our analysis? Placebos in clinical trials, I have argued, are best characterized as control interventions. As with all control interventions, then, the function of placebos in psychotherapy clinical trials is to mimic the appearance of the verum treatment except for its particular, hypothesized, remedial factor. In practice, designing placebos for psychotherapy clinical trials is hugely challenging; though (as suggested) future technological innovations may eventually help to resolve recalcitrant problems.

In clinical contexts it is incorrect to describe psychotherapy as a placebo. Within the scientific placebo field, researchers implicitly define placebos as, "treatments theorized not to be effective for a condition or symptoms by virtue of their intrinsic properties." While research into basic science of psychotherapy mechanisms is not advanced, it appears that common factors play an important role in mediating change. Moreover, these factors are also theorized by proponents of different psychological treatments to be necessary to outcome.

Finally, placebo effects cannot be equated with "all non-physical responses" of a treatment. The growing science of placebo studies informs us that placebo effect(s) are the remedial outcomes of specific psychobiological mechanisms. Such mechanisms may be elicited by psychotherapy—just as they may be triggered in other treatment modalities.

## AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

## ACKNOWLEDGMENTS

Thanks are due to the reviewer for very helpful feedback and Prof. Paul Enck for the invitation to contribute to this special issue. I also wish to thank Prof. Ted Kaptchuk for fruitful discussions on the themes of this paper. This article was supported by a Fulbright Scholarship Award and an Irish Research Council/Marie Skłodowska-Curie Global Fellowship (CLNE/2017/226) granted to CB.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Blease. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mechanisms of Perceived Treatment Assignment and Subsequent Expectancy Effects in a Double Blind Placebo Controlled RCT of Major Depression

Johannes A. C. Laferton<sup>1,2,3†</sup>, Sagar Vijapura<sup>4†</sup>, Lee Baer<sup>4‡</sup>, Alisabet J. Clain<sup>4</sup>, Abigail Cooper<sup>4§</sup>, George Papakostas<sup>4</sup>, Lawrence H. Price<sup>5</sup>, Linda L. Carpenter<sup>5</sup>, Audrey R. Tyrka<sup>5</sup>, Maurizio Fava<sup>4</sup> and David Mischoulon<sup>4</sup>\*

<sup>1</sup> Department of Psychiatry, Brigham and Women's Hospital, Harvard Medical School, Harvard University, Boston, MA, United States, <sup>2</sup> Department of Clinical Psychology and Psychotherapy, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany, <sup>3</sup> Department of Clinical Psychology and Psychotherapy, Psychologische Hochschule Berlin, Berlin, Germany, <sup>4</sup> Depression Clinical and Research Program, Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States, <sup>5</sup> Mood Disorders Research Program, Laboratory for Clinical and Translational Neuroscience, Butler Hospital, Department of Psychiatry and Human Behavior, Alpert Medical School, Brown University, Providence, RI, United States

Objective: It has been suggested that patients' perception of treatment assignment might serve to bias results of double blind randomized controlled trials (RCT). Most previous evidence on the effects of patients' perceptions and the mechanisms influencing these perceptions relies on cross-sectional associations. This re-analysis of a double blind, placebo controlled RCT of pharmacological treatment of major depression set out to gather longitudinal evidence on the mechanism and effects of patients' perceived treatment assignment in the pharmacological treatment of major depression.

Methods: One-hundred eighty-nine outpatients with DSM-IV diagnosed major depression were randomized to SAMe 1,600–3,200 mg/d, escitalopram 10–20 mg/d, or placebo for 12 weeks. Data on depressive symptoms (17-item Hamilton Depression Scale; HDRS-17), adverse events and patients' perceived treatment assignment were collected at baseline, week 6, and week 12. The re-analysis focused on N = 166 (out of the originally included 189 participants) with available data on perceived treatment assignment.

Results: As in the parent trial, depressive symptoms (HDRS-17) significantly decreased over the course of 12 weeks and there was no difference between placebo, SAMe or escitalopram. A significant number of patients changed their perceptions about treatment assignment throughout the trial, especially between baseline and week 6. Improvement in depressive symptoms, but not adverse events, significantly predicted perceived treatment assignment at week 6. In turn, perceived treatment assignment at week 6, but not actual treatment, predicted further improvement in depressive symptoms at week 12.

#### Edited by:

Paul Enck, Universität Tübingen, Germany

#### Reviewed by:

Karin Meissner, Ludwig-Maximilians-Universität München, Germany Gerard Joseph Marek, Astellas Pharma, United States

#### \*Correspondence:

David Mischoulon dmischoulon@mgh.harvard.edu

†These authors have contributed equally to this work and share first authorship

‡Deceased

§Abigail Cooper, Fairleigh Dickinson University, Teaneck, NJ, United States

#### Specialty section:

This article was submitted to Psychopharmacology, a section of the journal Frontiers in Psychiatry

Received: 25 June 2018 Accepted: 17 August 2018 Published: 07 September 2018

#### Citation:

Laferton JAC, Vijapura S, Baer L, Clain AJ, Cooper A, Papakostas G, Price LH, Carpenter LL, Tyrka AR, Fava M and Mischoulon D (2018) Mechanisms of Perceived Treatment Assignment and Subsequent Expectancy Effects in a Double Blind Placebo Controlled RCT of Major Depression. Front. Psychiatry 9:424. doi: 10.3389/fpsyt.2018.00424

Conclusions: The current results provide longitudinal evidence that patients' perceptions of treatment assignment systematically change despite a double blind procedure and in turn might trigger expectancy effects with the potential to bias the validity of an RCT.

Parent study grant number: R01 AT001638 Parent study ClinicalTrials.gov Identifier: NCT00101452

Keywords: major depressive disorder, SAMe, escitalopram, placebo, perceived treatment assignment, un-blinding, double blind randomized controlled trial, bias

## INTRODUCTION

The belief that one is taking a medication can lead to improvement in numerous health conditions regardless of the presence or absence of a pharmacologic agent (1, 2). This expectation effect is especially pronounced in the pharmacological treatment of depression (3, 4). Double blind randomized controlled trials (RCTs) assume that these expectations are equally balanced across treatment arms. Yet, the effectiveness of blinding in RCTs is rarely assessed or reported, and there are suggestions in the literature that patients are frequently un-blinded (5–7). If patients do learn which treatment arm they are in, expectancy effects due to perceived treatment assignment are no longer controlled for. This introduces a considerable amount of bias, as meta-epidemiological studies typically find un-blinded studies to exaggerate effect size by more than 30% compared to blinded studies (8–10). Moreover, it is important to note that not only an actual un-blinding, but any between-groups imbalance of the perceived treatment assignment can bias the results of a trial (11, 12).

Possible mechanisms that may influence perceived treatment assignment include the physical characteristics of the medication and the placebo, medication side effects, or beneficial effects on the health condition (13). Regarding the former, taste, color, shape, size, route and process of administration (13) might lead to un-blinding, if they differ between drug and placebo. Moreover, medication side effects could inadvertently serve to influence perception of treatment assignment (14, 15). Studies have shown that side effects are associated with patients and independent evaluators guessing treatment assignment (15, 16). The experience of side effects could then increase the treatment effect by enhancing the patient's expectation of benefit. This possibility is supported by both experimental and clinical evidence. Thus, for example, in an experimental pain task (17), participants receiving a placebo that produces side effects (a so-called "active placebo") achieved higher pain thresholds than those who received a non-active placebo. Clinical trials on the pharmacological treatment of pain or depression using active placebos as a control condition have found smaller differences between active medication and the placebo arm compared to similar trials using non-active placebos (18–21). It should be noted that in order for physical characteristics or side effects to influence perceived treatment assignment, the participant needs at least a certain amount of knowledge about these characteristics of the drug. Finally, improvement of symptoms may also inform participants' perception of treatment assignment. Previous studies among various health conditions have shown an association between clinical improvement and patient perceptions regarding treatment assignment (22–27).
However, a major limitation of most of those analyses is that perceived treatment assignment was elicited either before or after treatment, making it impossible to investigate mechanisms of un-blinding and its prospective impact on treatment outcome. If assessed at the beginning of a trial, it cannot be concluded whether mechanisms such as side effects or health improvement have had an influence on the perception of treatment assignment. In contrast, assessment at the end of a study does not indicate whether more side effects or greater improvement in health were due to the perceived treatment assignment, or whether the perception of treatment assignment was due to experienced side effects or improvement in health. Experimental evidence suggests that experienced improvement influences perceived treatment assignment, which in turn influences treatment outcome (28). Whether this is true for clinical trials remains unclear so far. Moreover, when using a single time point assessment, one cannot determine whether participants change their perception of treatment assignment throughout a trial.

To better understand (1) whether the perception of the treatment assignment changes over the course of a study, (2) whether these changes are influenced by side effects or health improvement, and (3) whether the perceived treatment assignment is prospectively related to the treatment effects, a re-analysis of a three-armed, double blind RCT on the treatment of major depression was conducted. The parent trial (29) examined the effects of escitalopram or S-adenosyl-L-methionine (SAMe) vs. placebo in patients with major depression and assessed perceived treatment assignment at several points throughout the trial. Moreover, in the parent study, no significant differences in improvement in the 17-item Hamilton Depression Rating Scale (HDRS-17) total score or response rates were found between the three treatment arms, making it particularly interesting to investigate expectancy effects due to possible bias in perceived treatment.

## MATERIALS AND METHODS

### Study Design and Procedure

This study is a secondary analysis of a two-center, three-arm, double blind RCT (29) on the treatment of major depression with escitalopram or SAMe vs. placebo (ClinicalTrials.gov: NCT00101452) conducted at two academic psychiatry centers in the U.S. Detailed methods for the parent trial have been described elsewhere (29). The study was approved by both local Institutional Review Boards.

### Participants

One-hundred eighty-nine outpatients, 18–80 years old, who met criteria for current major depressive episode according to Structured Clinical Interview for DSM-IV (30) plus screening and baseline scores of ≥25 on the Inventory of Depressive Symptomatology-Clinician-Rated (31), were recruited from April 2005 to December 2009 through clinician referral and general advertisement (e.g., "Have you lost interest in things you used to enjoy, had appetite or sleep changes? Are you interested in natural remedies? Participate in a research study of a naturally occurring supplement called SAMe in treating Major Depressive Disorder") in local newspapers, radio, and television. A ≥ 6 week use of SAMe or escitalopram during the concurrent episode as well as severe medical or other primary psychiatric disorder were exclusion criteria [for detailed description of exclusion criteria see (29)].

### Procedure

After providing written informed consent, participants were randomized in a 1:1:1 manner (stratified by center) to 12 weeks of double-blind treatment with SAMe (1,600 mg/d), escitalopram (10 mg/d), or placebo. A double-dummy design was used to maintain the blind, since SAMe tablets differed in appearance from escitalopram tablets. Participants were made aware of their odds of receiving any particular one of the three possible treatments. At week 6, for non-responders (<50% HAM-D reduction), the escitalopram dose could be increased to 20 mg/d and the SAMe dose to 3,200 mg/d for weeks 7–12. Participants who experienced intolerable side effects at the higher dose were allowed to decrease their dose to the previous level.
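
The week 6 dose rule described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the text says the dose *could* be increased for non-responders, so the result below is the permitted dose, not necessarily the dose a participant actually received.

```python
# Dose levels from the Procedure section (mg/d): (starting dose, increased dose).
DOSE_STEPS = {"SAMe": (1600, 3200), "escitalopram": (10, 20)}


def percent_reduction(baseline_score: float, current_score: float) -> float:
    """Percentage reduction in a depression score relative to baseline."""
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline_score - current_score) / baseline_score


def is_non_responder(baseline_score: float, week6_score: float) -> bool:
    """Non-response as defined in the text: less than 50% score reduction."""
    return percent_reduction(baseline_score, week6_score) < 50.0


def permitted_dose_weeks_7_to_12(drug: str, baseline_score: float,
                                 week6_score: float) -> int:
    """Highest dose allowed for weeks 7-12 (escalation was optional in the trial)."""
    start, increased = DOSE_STEPS[drug]
    return increased if is_non_responder(baseline_score, week6_score) else start


# Hypothetical participant: score falls from 20 to 12, a 40% reduction.
example_dose = permitted_dose_weeks_7_to_12("SAMe", 20, 12)
```

A reduction of exactly 50% counts as response here, because the rule in the text uses a strict "<50%" threshold.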

### Assessment

Assessments relevant for the re-analysis took place at baseline, visit 4 (week 6), and visit 7 (week 12; end of active treatment). Antidepressant efficacy was assessed with the 17-item Hamilton Depression Rating Scale [HDRS-17; (32)]. Side effects were assessed using the Systematic Assessment for Treatment Emergent Events-Systematic Inquiry [SAFTEE (33)]. Side effects documented on the SAFTEE were categorized by severity as 0 (none), 1 (mild), 2 (moderate), and 3 (severe). Scores were calculated based on the number of adverse events reported by each subject that were treatment-emergent, which we defined as any SAFTEE side effect for which severity increased by 1 or more levels from baseline. Besides an overall side effect score, subcategory scores for gastrointestinal and sexual functioning side effects were calculated based on known pharmacologic profiles of the active treatments and side effect patterns reported in the parent study (29). To assess perception of treatment assignment, patients were asked to guess whether they had received SAMe, escitalopram, or placebo.
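
The treatment-emergent scoring rule above (a severity increase of at least one level from baseline) can be sketched as follows. The item names and the gastrointestinal grouping are hypothetical placeholders, not the SAFTEE item list, and treating an item that was not rated at baseline as severity 0 is an assumption made for this sketch.

```python
# Severity codes from the text: 0 none, 1 mild, 2 moderate, 3 severe.

def treatment_emergent_count(baseline: dict, follow_up: dict) -> int:
    """Count items whose severity rose by at least one level from baseline.

    Assumption: an item missing at baseline is treated as severity 0.
    """
    return sum(
        1
        for item, severity in follow_up.items()
        if severity - baseline.get(item, 0) >= 1
    )


def subcategory_count(baseline: dict, follow_up: dict, items: set) -> int:
    """Same rule, restricted to one subcategory of items."""
    return treatment_emergent_count(
        {k: v for k, v in baseline.items() if k in items},
        {k: v for k, v in follow_up.items() if k in items},
    )


# Hypothetical grouping and ratings, for illustration only.
GASTROINTESTINAL_ITEMS = {"nausea", "diarrhea"}
baseline_ratings = {"nausea": 0, "headache": 2, "insomnia": 1, "diarrhea": 1}
week6_ratings = {"nausea": 1, "headache": 2, "insomnia": 3, "diarrhea": 0}

total_score = treatment_emergent_count(baseline_ratings, week6_ratings)
gi_score = subcategory_count(baseline_ratings, week6_ratings, GASTROINTESTINAL_ITEMS)
```

In this example nausea and insomnia worsen, headache is unchanged, and diarrhea improves, so only two items count as treatment-emergent, one of them gastrointestinal.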

### Data Analysis

This re-analysis focused on the acute treatment phase only (baseline to week 12), including N = 166 participants from the intent-to-treat sample with at least one post-baseline visit and available data on perceived treatment assignment. Patients with missing data on perceived treatment assignment did not differ from those included in the analysis with regard to HDRS-17 scores and side effects at the respective time points.

Frequency distributions, means, and standard deviations were assessed for each variable. Non-normally distributed variables were log10-transformed in order to satisfy the statistical assumptions of parametric tests. Baseline differences were tested with analysis of variance (ANOVA) for parametric data, and χ<sup>2</sup>-tests for categorical data. Change in clinical variables was analyzed with mixed ANOVAs with treatment assignment as the between-participant factor and time as the within-participants factor. Significant main or interaction effects were analyzed by Bonferroni corrected post-hoc tests. Differences in perceived treatment assignment distributions were analyzed using χ<sup>2</sup>-tests. To assess whether clinical improvement or side effects were prospectively associated with perceived treatment assignment, logistic regression analyses were performed using clinical improvement and side effects as predictors, actual treatment assignment (dummy coded with placebo as the reference condition) and previous perceived treatment assignment (active treatment vs. placebo) as covariates, and whether participants perceived themselves to be on active medication [placebo = 0; active treatment (SAMe or escitalopram) = 1] at the successive time point as the dependent variable. To assess whether perceived active treatment affected subsequent clinical improvement, linear multiple regression analyses were performed with perceived treatment [placebo = 0; active treatment (SAMe or escitalopram) = 1] as predictor, actual treatment assignment (dummy coded with placebo as the reference condition) and pre HDRS-17 score as covariates, and successive HDRS-17 score as dependent variable. To test for treatment arm specific effects of perceived treatment, the interaction term between actual treatment assignment and perceived treatment assignment was added to the multiple regression analysis and additional Bonferroni corrected post-hoc tests were carried out for each treatment arm individually.
For all analyses, two-tailed significance was set at p < 0.05. All calculations were performed with SPSS Version 23 (Chicago, Illinois).
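
The coding scheme described above (actual treatment dummy coded with placebo as the reference, perceived treatment collapsed to active vs. placebo) can be illustrated as follows. The function names and the column order of the predictor row are assumptions made for this sketch; the original analyses were run in SPSS.

```python
ARMS = ("placebo", "SAMe", "escitalopram")


def dummy_code_arm(arm: str) -> tuple:
    """Return (is_SAMe, is_escitalopram); placebo is the reference, coded (0, 0)."""
    if arm not in ARMS:
        raise ValueError(f"unknown arm: {arm}")
    return (int(arm == "SAMe"), int(arm == "escitalopram"))


def perceived_active(guess: str) -> int:
    """Collapse a three-way guess to placebo = 0, active (SAMe or escitalopram) = 1."""
    if guess not in ARMS:
        raise ValueError(f"unknown guess: {guess}")
    return int(guess != "placebo")


def predictor_row(arm: str, previous_guess: str,
                  improvement: float, side_effect_score: float) -> list:
    """One row of predictors for the logistic model (hypothetical column order):
    intercept, improvement, side effects, SAMe dummy, escitalopram dummy,
    previous perceived treatment (active = 1)."""
    same, esc = dummy_code_arm(arm)
    return [1, improvement, side_effect_score, same, esc,
            perceived_active(previous_guess)]


# Hypothetical participant on escitalopram who guessed SAMe at the earlier visit.
row = predictor_row("escitalopram", "SAMe", improvement=7, side_effect_score=2)
```

With placebo as the reference level, each dummy coefficient is read as the difference between that active arm and placebo.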

## RESULTS

### Trial Characteristics

Demographic and baseline clinical characteristics for each treatment group are reported in **Table 1**. No significant differences were observed between the treatment groups.

### Changes in Depressive Symptoms and Side Effects Over Time

Depressive symptoms significantly declined over time (see **Table 1**). However, there was no significant between-group difference or group × time interaction. For both total side effect score and gastrointestinal side effects (see **Table 1**), there was no significant change from week 6 to week 12, between-group difference, or group × time interaction. For sexual functioning side effects (see **Table 1**), there was no significant change from week 6 to week 12, or group × time interaction, but the between-group difference was significant. Bonferroni-corrected post-hoc tests revealed that sexual functioning side effects were significantly higher with escitalopram compared to SAMe (p = 0.002) and to placebo (p = 0.009), but not different between SAMe and placebo (p = 0.714).

MD, Missing Data.
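
As a reminder of what a Bonferroni correction does to a family of pairwise comparisons, the sketch below multiplies each raw p-value by the number of comparisons and caps the result at 1. The raw p-values are hypothetical and are not the trial's values. Whether the p-values reported above are already adjusted in this way is not stated in the text.

```python
def bonferroni_adjust(p_values: list) -> list:
    """Multiply each p-value by the number of tests, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three pairwise arm comparisons.
comparisons = ["escitalopram vs SAMe", "escitalopram vs placebo", "SAMe vs placebo"]
raw_p = [0.001, 0.004, 0.50]
adjusted = dict(zip(comparisons, bonferroni_adjust(raw_p)))

# Equivalent view: compare raw p-values against alpha / m instead.
alpha = 0.05
significant = [p < alpha / len(raw_p) for p in raw_p]
```

Both views give the same decisions: adjusting the p-values upward or adjusting the significance threshold downward.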

### Changes in Perceived Treatment Assignment Distributions Over Time

Participants' perceptions of treatment assignment throughout the study can be seen in **Figure 1**. At baseline (before application of treatment), participants in the SAMe group (χ<sup>2</sup> = 14.00, df = 2, p = 0.001), the escitalopram group (χ<sup>2</sup> = 18.88, df = 2, p < 0.001) and the placebo group (χ<sup>2</sup> = 11.41, df = 2, p = 0.003) were expecting to receive SAMe significantly more often than would be expected, based on an equal assignment probability (1/3). At week 6, only participants in the escitalopram arm perceived themselves to be on SAMe more often than by chance (χ<sup>2</sup> = 6.37, df = 2, p = 0.041). Participants' perceived treatment assignment in the SAMe (χ<sup>2</sup> = 2.00, df = 2, p = 0.368) and placebo groups (χ<sup>2</sup> = 2.53, df = 2, p = 0.282) did not significantly differ from that expected in an equal treatment distribution.

At week 12, participants' perceived treatment assignment did not significantly differ from an equal distribution in the SAMe arm (χ<sup>2</sup> = 4.36, df = 2, p = 0.113), the escitalopram arm (χ<sup>2</sup> = 1.92, df = 2, p = 0.283) or the placebo arm (χ<sup>2</sup> = 0.54, df = 2, p = 0.764). There was no association between participants' baseline perceived treatment assignment and drop out by week 6 (χ<sup>2</sup> = 0.04, df = 2, p = 0.982) or between week 6 perceived treatment assignment and drop out by week 12 (χ<sup>2</sup> = 1.90, df = 2, p = 0.387).
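
The comparisons against an equal 1/3 assignment probability reported in this section are χ<sup>2</sup> goodness-of-fit tests with df = 2. A minimal sketch follows, using hypothetical guess counts (the per-arm counts are not given in the text). It relies on the fact that for exactly 2 degrees of freedom the upper-tail probability of the χ<sup>2</sup> distribution has the closed form exp(−χ<sup>2</sup>/2), so no statistics library is needed.

```python
from math import exp


def chi_square_equal_split(observed: list) -> float:
    """Goodness-of-fit statistic against equal expected counts per category."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_df2(chi_square: float) -> float:
    """Upper-tail p-value for a chi-square statistic with exactly 2 df."""
    return exp(-chi_square / 2.0)


# Hypothetical guesses in one arm: SAMe, escitalopram, placebo.
guess_counts = [30, 12, 12]
statistic = chi_square_equal_split(guess_counts)  # expected count is 18 per category
p_value = p_value_df2(statistic)
```

The closed form can also be checked against values reported above: χ<sup>2</sup> = 14.00 gives p ≈ 0.001 and χ<sup>2</sup> = 2.00 gives p ≈ 0.368.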

Overall, almost twice as many patients changed their perceived treatment assignment between baseline and week 6 (48.9%), compared to between week 6 and week 12 (26.0%; see **Table 2** for detailed within-person change patterns). Between baseline and week 6 patients in the placebo group changed their perceived treatment assignment more often (60%) than patients in the SAMe and the escitalopram group (42.4%; 48.6%). Between week 6 and week 12 patients in the SAMe group changed their perception of treatment assignment more frequently (38.8%) than in the escitalopram and the placebo group (both: 20.8%).
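
Percentages of this kind, and within-person change patterns such as those in **Table 2**, can be derived from paired guesses at two visits. The sketch below uses five hypothetical participants, not trial data.

```python
from collections import Counter


def transition_counts(first: list, second: list) -> Counter:
    """Cross-tabulate guesses at two visits as (earlier, later) pairs."""
    return Counter(zip(first, second))


def percent_changed(first: list, second: list) -> float:
    """Share of participants whose guess differs between the two visits."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    changed = sum(1 for a, b in zip(first, second) if a != b)
    return 100.0 * changed / len(first)


# Hypothetical paired guesses, one position per participant.
baseline_guess = ["SAMe", "SAMe", "placebo", "escitalopram", "SAMe"]
week6_guess = ["SAMe", "placebo", "placebo", "SAMe", "escitalopram"]

transitions = transition_counts(baseline_guess, week6_guess)
changed_pct = percent_changed(baseline_guess, week6_guess)
```

Only participants with a guess at both visits can enter such a table, which is one reason the analysed sample is smaller than the randomized sample.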

### Factors Associated With Perceived Treatment Assignment

Prospective associations of clinical improvement and side effects with whether participants perceived themselves to be on an active medication (SAMe or escitalopram) or placebo are reported in **Table 3**. At week 6, participants were significantly more likely to perceive that they were receiving active medication if they experienced clinical improvement. Among participants who experienced a reduction in HDRS-17 score of 13 or greater, all perceived that they were assigned to an active medication group. There was no threshold below which participants would certainly perceive themselves to be on placebo. Side effects and actual treatment assignment were not associated with participants' perception that they were receiving an active medication at week 6. Clinical improvement between week 6 and week 12, side effects, and actual treatment assignment were not associated with participants' perceived treatment assignment at week 12.
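
To make the logic of these logistic regressions concrete, the sketch below fits a deliberately simplified model with a single predictor (HDRS-17 improvement) and a binary outcome (perceived active medication = 1). It is not the model in **Table 3**, which also includes side effects, actual treatment, and previous perceived treatment. The data are hypothetical, and the Newton-Raphson fitting routine is a stand-in for the SPSS procedure used in the study.

```python
from math import exp


def sigmoid(z: float) -> float:
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(xs: list, ys: list, iterations: int = 25) -> tuple:
    """Maximum-likelihood fit of P(y = 1) = sigmoid(b0 + b1 * x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the slope
            h00 += w             # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: HDRS-17 point improvement and week 6 guess (1 = active).
improvement = [1, 3, 5, 7, 9, 11, 13, 15]
perceived_active = [0, 0, 1, 0, 1, 1, 1, 1]

intercept, slope = fit_logistic(improvement, perceived_active)
odds_ratio_per_point = exp(slope)  # multiplicative change in odds per HDRS-17 point
```

A positive slope corresponds to an odds ratio above 1: each additional point of improvement raises the odds of guessing active medication. The hypothetical data are chosen so that the two outcomes overlap, since perfectly separated data would have no finite maximum-likelihood estimate.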

### Prospective Associations of Perceived Treatment Assignment and Subsequent Improvement

Prospective associations of participants' perceived treatment assignment (active vs. placebo) and actual treatment assignment with clinical improvement are shown in **Table 4**. Controlling for baseline HDRS-17, neither actual nor perceived treatment assignment predicted HDRS-17 scores at week 6. However, controlling for week 6 HDRS-17, patients' perceived treatment assignment at week 6, but not actual treatment assignment, predicted week 12 HDRS-17 scores. Participants perceiving they were taking active medication at week 6 showed significantly greater improvement in depressive symptoms at week 12 than participants believing they were taking placebo. Post-hoc analysis revealed no significant interaction between actual treatment assignment and week 6 perceived treatment predicting HDRS-17 improvement at week 12 [R<sup>2</sup> = 0.02, F(2, 76) = 1.28, p = 0.284]. However, Bonferroni corrected post-hoc subgroup analyses of the effect of perceived treatment at week 6 on treatment outcome at week 12 within each treatment group individually (**Table 5**) indicated that the effect of perceived treatment assignment was significant in the escitalopram treatment arm but not in the placebo and SAMe treatment arms (**Figure 2**).
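
The structure of these regressions (week 12 HDRS-17 regressed on perceived treatment while controlling for the week 6 score) can be sketched with ordinary least squares. The example below is simplified: it omits the actual-treatment dummy variables, and the data are hypothetical and generated without noise from known coefficients, so the fit recovers them exactly. Real trial data would not behave this way.

```python
def solve_linear_system(a: list, b: list) -> list:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows: list, y: list) -> list:
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical week 6 scores and week 6 guesses (1 = perceived active medication).
week6_hdrs = [10, 14, 18, 12, 16, 20]
perceived = [1, 0, 1, 0, 1, 0]
# Noise-free outcome built from assumed coefficients: 1.0, 0.8, and -3.0.
week12_hdrs = [1.0 + 0.8 * w - 3.0 * p for w, p in zip(week6_hdrs, perceived)]

design = [[1.0, w, p] for w, p in zip(week6_hdrs, perceived)]
intercept, b_week6, b_perceived = ols(design, week12_hdrs)
```

A negative coefficient for perceived treatment means lower (better) week 12 scores among those who believed they were on active medication, at the same week 6 score.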

## DISCUSSION

This re-analysis of a three-arm, two-center, double blind RCT of SAMe or escitalopram vs. placebo for the treatment of major depression shows that a significant number of patients did change their perception of treatment assignment throughout the trial, corroborating previous analyses on longitudinal changes of perceived treatment assignment in clinical trials (11, 23). The large majority of change in perceived treatment assignment happened during the first 6 weeks of treatment and was significantly predicted by the preceding improvement in depressive symptomatology. Side effects did not seem to have influenced perceived treatment assignment. Although side effects are frequently mentioned as a potential mechanism informing perceived treatment assignment, this result is consistent with another re-analysis of an RCT containing two active treatments for opioid dependency. Oviedo-Joekes et al. (34) found that desired drug effects (drug related highs) but not overall side effects were associated with perceived treatment allocation. Possibly, in trials with more than one active treatment, side effects might not be a pivotal mechanism influencing perceived treatment assignment, since it may be more difficult to guess between two active treatments based simply on side effects. Moreover, patients randomized to placebo in the current re-analysis did not differ from the active treatment groups in terms of side effects, indicating a nocebo effect (35, 36). Hence, side effects might not have varied enough to be associated with perceived treatment. While a majority of patients expected to receive SAMe before treatment began, there was no imbalance of perceived group allocation in favor of actual group allocation during the active treatment phase. This indicates that patients overall were successfully blinded regarding their specific treatment allocation until the end of the treatment phase.

TABLE 2 | Changes in perceived treatment assignment from baseline to week 6 and week 6 to week 12 by actual treatment group.

PBO, placebo; SAMe, S-adenosyl-L-methionine; ESC, escitalopram.

However, patients believing to be on active medication at week 6 showed significantly higher improvement in depressive symptoms at week 12 than patients believing to be on placebo, indicating expectancy effects due to perceived treatment assignment. The expectancy effect did not seem to be influenced by selective drop out, given the lack of association between perceived treatment and subsequent attrition. It remains unclear whether the expectancy effect was of the same or different magnitude among all treatment arms. Post-hoc analysis suggests that this effect might have been more pronounced in the escitalopram treatment arm. However, the omnibus test for an interaction between perceived and actual treatment was not significant. Therefore, this suggestion remains speculative for the trial at hand and should encourage better powered re-analyses to further explore this issue.

Notwithstanding, most previous studies, owing to their cross-sectional analysis, were unable to differentiate whether the perception of receiving active medication enhanced the treatment response via expectancy effects, or whether the improvement at the end of these trials influenced the final perception of treatment assignment. This re-analysis indicates that both might be true successively. The suggested pathway of expectancy effects due to perceived treatment assignment found in experimental research (28), in which improvement influences perceived treatment, which in turn influences treatment outcome, appears to be supported by this re-analysis of a clinical trial.

In view of the above, this re-analysis now longitudinally confirms expectancy effects in double blind RCTs, and further adds to research highlighting important limitations for the interpretation of effects found in double blind placebo controlled RCTs. In clinical practice, patients do not have reason to doubt that they are receiving active medication. In placebo controlled RCTs, however, this doubt is justified and potentially induces decreased expectancy, which results in an underestimation of the effectiveness of a drug compared to routine clinical practice (37, 38). More generally, double blind RCTs evaluate the specific treatment effect of a pharmacologic verum by comparing it to the response generated by the unspecific treatment effect in the placebo group. If the verum group's treatment response exceeds that in the placebo group, the drug is considered to have drug specific effects. However, such a judgment is only justifiable under the assumption that the treatment response in the verum group consists of the equivalent unspecific effects observed in the placebo group plus the specific effects of the verum ("additive model"). Such an additive model of drug and placebo effects, however, has frequently been questioned theoretically and empirically (39–41). In fact, evidence from clinical trials and both behavioral and neuro-physiological experiments suggests that drug specific and unspecific effects can interact (38, 39) and hence might not be equal between a verum and a placebo arm. Therefore, drug specific effects can be both over- and underestimated in double blind RCTs. Although results regarding differential expectancy effects across the treatment arms are inconclusive in the current re-analysis, based on the results of this RCT, it cannot be said with certainty whether or not SAMe and escitalopram have drug specific efficacy.
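
The additive assumption can be written out explicitly. The notation below is illustrative and not taken from the original article: $R_V$ and $R_P$ denote the mean treatment response in the verum and placebo arms, $U_V$ and $U_P$ the unspecific (expectancy-related) effects in each arm, and $S$ the drug specific effect.

```latex
\begin{align}
R_P &= U_P, \qquad R_V = U_V + S, \\
R_V - R_P &= S + (U_V - U_P).
\end{align}
```

The additive model assumes $U_V = U_P$, so that the between-arm difference equals $S$. If the unspecific effects differ between arms, the observed difference overestimates $S$ when $U_V > U_P$ and underestimates it when $U_V < U_P$.
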
There is the possibility that specific characteristics of the SAMe or escitalopram arm induced expectancy effects, which would have led to an overestimation of drug specific effects. On the other hand, RCTs with more than one active treatment arm have been found to show enhanced expectancy effects since the uncertainty of receiving active treatment is lower (38). As a result, this might have led to ceiling effects masking the drug specific expectancy. To circumvent these pitfalls, new study designs to better evaluate the effects of pharmacological treatment have been proposed; these include the balanced placebo design, the balanced cross over design, balanced open-hidden design or delayed response design [see (39, 40) for more details]. However, until such innovations for testing pharmacological interventions become more established, double blind RCTs should at least assess and test for expectancy effects in a systematic manner.

TABLE 3 | Improvement and adverse effects predicting patients' perception to be on active medication (SAMe or escitalopram) or placebo at week 6 and week 12 using logistic regression analysis.

The predictors drug assignment and perceived treatment assignment were dummy coded with placebo as reference condition.

TABLE 4 | Longitudinal associations of patients' perceived treatment assignment and actual drug assignment with subsequent improvement in depression (HDRS-17).

The predictors drug assignment and perceived treatment assignment were dummy coded with placebo as reference condition.

Concerning the assessment of expectancy effects, the results of this re-analysis suggest that the current practice of measuring perceived treatment assignment only once (either at the beginning or at the end of a trial) is questionable. The disadvantages of end of trial assessment of perceived treatment assignment have already been mentioned above. However, some previous studies used baseline or early assessment of perceived treatment allocation for their analysis of blinding or expectancy effects. Yet, in this re-analysis the perception of treatment assignment at the start of the trial did not at all reflect the perception of treatment assignment throughout the trial, making it a very unreliable measure to operationalize un-blinding or expectancy effects. Future trials should use repeated assessment of patients' perceived treatment assignment to determine un-blinding and evaluate expectancy effects. In addition, it would be even more advisable to assess patients' outcome expectations throughout a trial. First, a patient who believes they are on a specific active medication but does not believe the medication to be effective will most likely not have any expectancy effects (37), a case that would not be differentiated by only assessing perceived treatment assignment. Secondly, expectations can be assessed on parametric (or at least ordinal) scales, giving the advantage of statistical power over the assessment of perceived treatment assignment assessed as nominal data. For further details on the assessment of patients' expectations in medical treatment see (42).

TABLE 5 | Longitudinal associations of patients' perceived treatment assignment with subsequent improvement in depression (HDRS-17) at week 12 individually by treatment group.

Perceived treatment assignment was dummy coded with placebo as reference condition. p-values have been adjusted for multiple testing using Bonferroni correction.

While not the focus of this re-analysis, one additional finding is of interest: a large majority of patients before the start of treatment expected to receive SAMe, despite being informed about the equal assignment probability. A similar pattern in favor of the "new" treatment has been observed in other trials before (11, 34). This draws attention to the role of drug trial advertisement. For the trial of this re-analysis, advertisements emphasized the treatment of depression with SAMe, because it was thought that this would attract more participants interested in taking a natural product; this could have influenced patients' expectations. While the current re-analysis did not find any indication that these expectations influenced treatment effects, one might see some indication of an advertisement or novelty effect (14) in the newest network meta-analysis on anti-depressants (43). Cipriani et al. did find that the same anti-depressant had a positive pooled effect size in trials when evaluated as a new agent, and negative effect sizes when evaluated as the "old" comparator agent. Hence, in trials with two or more active comparators it might be worthwhile to investigate whether framing such as "new" vs. "old" toward participants is an additional source of bias. If such a "novelty effect" existed, this would provide further evidence against an additive model.

Some limitations have to be considered when interpreting the results presented. First, the external validity is limited by the fact that the re-analysis was based on a three-arm RCT with SAMe and escitalopram as the active treatments. As discussed above, expectancy effects are considered to be higher in a study with two active treatments. Generalization to clinical practice is limited since patients usually have no doubt about receiving active medication. However, patients in clinical practice are likely to have expectations regarding the efficacy of the medication or expectations about side effects (42), which might also serve to influence treatment effects. Additionally, both active treatments are reasonably well tolerated. Therefore, it remains unclear whether results would be different among active treatments with stronger side effect profiles (e.g., tricyclic antidepressants). Moreover, it remains unclear whether comparing a "classical" anti-depressant with a natural supplement anti-depressant might involve different mechanisms of perceived treatment than in other trials. Related to that, it is unclear whether trial advertisement attracted individuals with specific interest in natural remedies, who might for example have more negative expectations about "classical" anti-depressants. Further, although participants with and without perceived treatment data did not differ regarding depressive symptoms and side effects, and perceived treatment was not associated with drop-out, one cannot completely rule out that missing guess data might be a source of bias.

In conclusion, this re-analysis showed that patients' perceptions about treatment assignment do change throughout a trial, that these perceptions appeared to be influenced by preceding improvement in depressive symptoms, and that perceptions about treatment assignment predicted further improvement. Building on previous cross-sectional and experimental evidence, the current results further highlight issues with the interpretation of effects found in double blind RCTs. Future RCTs should apply multiple assessments of perceived treatment assignment and/or expectations throughout the trial. This would permit testing for possible expectation effects that might bias the comparison for specific efficacy, and provide the opportunity to further explore mechanisms of bias in double blind RCTs. Moreover, new study designs for testing pharmacological interventions should be considered, to get a more precise estimate of the specific effects of a pharmacological treatment.

## REFERENCES


### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the U.S. Food and Drug Administration (FDA) guidelines with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Institutional Review Boards of the Massachusetts General Hospital in Boston, MA and the Butler Hospital in Providence, RI.

### DATA AVAILABILITY STATEMENT

The datasets for this manuscript are not publicly available yet, because the principal investigators are still in the process of actively analyzing the data and preparing manuscripts for publication, based on the main goals and outcome measures put forth in the original funded grant proposal. Requests to access the datasets should be directed to David Mischoulon, MD, Ph.D., dmischoulon@mgh.harvard.edu.

### AUTHOR CONTRIBUTIONS

JL, SV, and DM designed the re-analysis, conducted data analysis, and wrote the current manuscript. DM, LP, LC, AT, and MF designed and conducted the original randomized controlled trial this re-analysis is based on. LB and AJC assisted with the statistical analysis. GP and AC assisted with the writing of the manuscript.

### FUNDING

The original study was supported by the NIH and the National Center for Complementary and Alternative Medicine (NCCAM), R01 grant R01AT001638. SAMe tosylate and matching placebo were supplied by Pharmavite LLC, CA, United States. JL was supported by a fellowship within the Postdoc-Program of the German Academic Exchange Service (DAAD).



**Disclaimer:** The article reflects the views of the authors and may not reflect the opinions or views of all the study investigators, the NIH, or the NCCAM.

**Conflict of Interest Statement:** JL was supported by a fellowship within the Postdoc-Program of the German Academic Exchange Service (DAAD).

LP has received research support from Neuronetics, NIH, HRSA, and NeoSync. He has served on advisory panels for Abbott and AstraZeneca. He has served as a consultant to Gerson Lehrman, Wiley, Springer, Qatar National Research Fund, and Abbott.

LC has received research support from Neuronetics, NIH, and NeoSync. She has served on advisory panels or provided consulting services for Abbott, Corcept, Johnson & Johnson, and Takeda-Lundbeck.

AT has received research support from Neuronetics, NeoSync, and NIH.

GP has served as a consultant for Abbott Laboratories, Acadia Pharmaceuticals, Inc<sup>∗</sup>, Alkermes, Inc, AstraZeneca PLC, Avanir Pharmaceuticals, Axsome Therapeutics<sup>∗</sup>, Boston Pharmaceuticals, Inc., Brainsway Ltd, Bristol-Myers Squibb Company, Cephalon Inc., Dey Pharma, L.P., Eli Lilly Co., Genentech, Inc<sup>∗</sup>, Genomind, Inc<sup>∗</sup>, GlaxoSmithKline, Evotec AG, H. Lundbeck A/S, Inflabloc Pharmaceuticals, Janssen Global Services LLC<sup>∗</sup>, Jazz Pharmaceuticals, Johnson & Johnson Companies<sup>∗</sup>, Methylation Sciences Inc, Mylan Inc<sup>∗</sup>, Novartis Pharma AG, One Carbon Therapeutics, Inc<sup>∗</sup>, Osmotica Pharmaceutical Corp.<sup>∗</sup>, Otsuka Pharmaceuticals, PAMLAB LLC, Pfizer Inc., Pierre Fabre Laboratories, Ridge Diagnostics (formerly known as Precision Human Biolaboratories), Shire Pharmaceuticals, Sunovion Pharmaceuticals, Taisho Pharmaceutical Co, Ltd, Takeda Pharmaceutical Company LTD, Theracos, Inc., and Wyeth, Inc.

GP has received honoraria (for lectures or consultancy) from Abbott Laboratories, Acadia Pharmaceuticals Inc, Alkermes Inc, Asopharma America Cntral Y Caribe, Astra Zeneca PLC, Avanir Pharmaceuticals, Bristol-Myers Squibb Company, Brainsway Ltd, Cephalon Inc., Dey Pharma, L.P., Eli Lilly Co., Evotec AG, Forest Pharmaceuticals, GlaxoSmithKline, Inflabloc Pharmaceuticals, Grunbiotics Pty LTD, Jazz Pharmaceuticals, H. Lundbeck A/S, Medichem Pharmaceuticals, Inc, Meiji Seika Pharma Co. Ltd, Novartis Pharma AG, Otsuka Pharmaceuticals, PAMLAB LLC, Pfizer, Pharma Trade SAS, Pierre Fabre Laboratories, Ridge Diagnostics, Shire Pharmaceuticals, Sunovion Pharmaceuticals, Takeda Pharmaceutical Company LTD, Theracos, Inc., Titan Pharmaceuticals, and Wyeth Inc.

GP has received research support (paid to hospital) from AstraZeneca PLC, Bristol-Myers Squibb Company, Forest Pharmaceuticals, the National Institute of Mental Health, Neuralstem, Inc, PAMLAB LLC, Pfizer Inc., Ridge Diagnostics (formerly known as Precision Human Biolaboratories), Sunovion Pharmaceuticals, Tal Medical, and Theracos, Inc.

GP has served (not currently) on the speaker's bureau for Bristol-Myers Squibb Co and Pfizer, Inc.

∗Asterisk denotes activity undertaken on behalf of Massachusetts General Hospital.

MF: All disclosures can be viewed online at: http://mghcme.org/faculty/facultydetail/maurizio\_fava

**Research Support**: Abbott Laboratories; Acadia Pharmaceuticals; Alkermes, Inc.; American Cyanamid; Aspect Medical Systems; AstraZeneca; Avanir Pharmaceuticals; AXSOME Therapeutics; Biohaven; BioResearch; BrainCells Inc.; Bristol-Myers Squibb; CeNeRx BioPharma; Cephalon; Cerecor; Clintara, LLC; Covance; Covidien; Eli Lilly and Company; EnVivo Pharmaceuticals, Inc.; Euthymics Bioscience, Inc.; Forest Pharmaceuticals, Inc.; FORUM Pharmaceuticals; Ganeden Biotech, Inc.; GlaxoSmithKline; Harvard Clinical Research Institute; Hoffman-LaRoche; Icon Clinical Research; i3 Innovus/Ingenix; Janssen R&D, LLC; Jed Foundation; Johnson & Johnson Pharmaceutical Research & Development; Lichtwer Pharma GmbH; Lorex Pharmaceuticals; Lundbeck Inc.; Marinus Pharmaceuticals; MedAvante; Methylation Sciences Inc; National Alliance for Research on Schizophrenia & Depression (NARSAD); National Center for Complementary and Alternative Medicine (NCCAM); National Coordinating Center for Integrated Medicine (NiiCM); National Institute of Drug Abuse (NIDA); National Institute of Mental Health (NIMH); Neuralstem, Inc.; NeuroRx; Novartis AG; Organon Pharmaceuticals; Otsuka Pharmaceutical Development, Inc.; PamLab, LLC.; Pfizer Inc.; Pharmacia-Upjohn; Pharmaceutical Research Associates, Inc.; Pharmavite® LLC; PharmoRx Therapeutics; Photothera; Reckitt Benckiser; Roche Pharmaceuticals; RCT Logic, LLC (formerly Clinical Trials Solutions, LLC); Sanofi-Aventis US LLC; Shire; Solvay Pharmaceuticals, Inc.; Stanley Medical Research Institute (SMRI); Synthelabo; Taisho Pharmaceuticals; Takeda Pharmaceuticals; Tal Medical; VistaGen; Wyeth-Ayerst Laboratories

**Advisory Board/Consultant**: Abbott Laboratories; Acadia; Affectis Pharmaceuticals AG; Alkermes, Inc.; Amarin Pharma Inc.; Aspect Medical Systems; AstraZeneca; Auspex Pharmaceuticals; Avanir Pharmaceuticals; AXSOME Therapeutics; Bayer AG; Best Practice Project Management, Inc.; Biogen; BioMarin Pharmaceuticals, Inc.; Biovail Corporation; BrainCells Inc; Bristol-Myers Squibb; CeNeRx BioPharma; Cephalon, Inc.; Cerecor; CNS Response, Inc.; Compellis Pharmaceuticals; Cypress Pharmaceutical, Inc.; DiagnoSearch Life Sciences (P) Ltd.; Dinippon Sumitomo Pharma Co. Inc.; Dov Pharmaceuticals, Inc.; Edgemont Pharmaceuticals, Inc.; Eisai Inc.; Eli Lilly and Company; EnVivo Pharmaceuticals, Inc.; ePharmaSolutions; EPIX Pharmaceuticals, Inc.; Euthymics Bioscience, Inc.; Fabre-Kramer Pharmaceuticals, Inc.; Forest Pharmaceuticals, Inc.; Forum Pharmaceuticals; GenOmind, LLC; GlaxoSmithKline; Grunenthal GmbH; Indivior; i3 Innovus/Ingenis; Intracellular; Janssen Pharmaceutica; Jazz Pharmaceuticals, Inc.; Johnson & Johnson Pharmaceutical Research & Development, LLC; Knoll Pharmaceuticals Corp.; Labopharm Inc.; Lorex Pharmaceuticals; Lundbeck Inc.; Marinus Pharmaceuticals; MedAvante, Inc.; Merck & Co., Inc.; MSI Methylation Sciences, Inc.; Naurex, Inc.; Navitor Pharmaceuticals, Inc.; Nestle Health Sciences; Neuralstem, Inc.; Neuronetics, Inc.; NextWave Pharmaceuticals; Novartis AG; Nutrition 21; Orexigen Therapeutics, Inc.; Organon Pharmaceuticals; Osmotica; Otsuka Pharmaceuticals; Pamlab, LLC.; Pfizer Inc.; PharmaStar; Pharmavite® LLC.; PharmoRx Therapeutics; Precision Human Biolaboratory; Prexa Pharmaceuticals, Inc.; PPD; Purdue Pharma; Puretech Ventures; PsychoGenics; Psylin Neurosciences, Inc.; RCT Logic, LLC (formerly Clinical Trials Solutions, LLC); Relmada Therapeutics, Inc.; Rexahn Pharmaceuticals, Inc.; Ridge Diagnostics, Inc.; Roche; Sanofi-Aventis US LLC.; Sepracor Inc.; Servier Laboratories; Schering-Plough Corporation; Shenox Pharmaceuticals; Solvay Pharmaceuticals, Inc.; Somaxon Pharmaceuticals, Inc.; Somerset Pharmaceuticals, Inc.; Sunovion Pharmaceuticals; Supernus Pharmaceuticals, Inc.; Synthelabo; Taisho Pharmaceuticals; Takeda Pharmaceutical Company Limited; Tal Medical, Inc.; Tetragenex; Teva Pharmaceuticals; TransForm Pharmaceuticals, Inc.; Transcept Pharmaceuticals, Inc.; Usona Institute, Inc.; Vanda Pharmaceuticals, Inc.; Versant Venture Management, LLC; VistaGen

**Speaking/Publishing**: Adamed, Co; Advanced Meeting Partners; American Psychiatric Association; American Society of Clinical Psychopharmacology; AstraZeneca; Belvoir Media Group; Boehringer Ingelheim GmbH; Bristol-Myers Squibb; Cephalon, Inc.; CME Institute/Physicians Postgraduate Press, Inc.; Eli Lilly and Company; Forest Pharmaceuticals, Inc.; GlaxoSmithKline; Imedex, LLC; MGH Psychiatry Academy/Primedia; MGH Psychiatry Academy/Reed Elsevier; Novartis AG; Organon Pharmaceuticals; Pfizer Inc.; PharmaStar; United BioSource, Corp.; Wyeth-Ayerst Laboratories.

**Stock/Other Financial Options**: Equity Holdings: Compellis; PsyBrain, Inc.

Royalty/patent, other income: Patents for Sequential Parallel Comparison Design (SPCD), licensed by MGH to Pharmaceutical Product Development, LLC (PPD) (US\_7840419, US\_7647235, US\_7983936, US\_8145504, US\_8145505); and patent application for a combination of Ketamine plus Scopolamine in Major Depressive Disorder (MDD), licensed by MGH to Biohaven. Patents for pharmacogenomics of Depression Treatment with Folate (US\_9546401, US\_9540691).

Copyright for the MGH Cognitive & Physical Functioning Questionnaire (CPFQ), Sexual Functioning Inventory (SFI), Antidepressant Treatment Response Questionnaire (ATRQ), Discontinuation-Emergent Signs & Symptoms (DESS), Symptoms of Depression Questionnaire (SDQ), and SAFER; Lippincott, Williams & Wilkins; Wolters Kluwer; World Scientific Publishing Co. Pte. Ltd.

DM has received research support from the Bowman Family Foundation, Nordic Naturals, and PharmoRx. He has provided unpaid consulting for Pharmavite LLC and Gnosis USA, Inc. He has received writing honoraria from Pamlab and Nordic Naturals, and speaking honoraria from Blackmores and the MGH Psychiatry Academy. He has received royalties from Lippincott Williams & Wilkins for the textbook "Natural Medications for Psychiatric Disorders: Considering the Alternatives" (DM and Jerrold F Rosenbaum, Eds.).

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Laferton, Vijapura, Baer, Clain, Cooper, Papakostas, Price, Carpenter, Tyrka, Fava and Mischoulon. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mental Training for Better Achievement: Effects of Verbal Suggestions and Evaluation (of Effectiveness) on Cognitive Performance

#### Kristina Fuhr\* and Dustin Werle

Clinical Psychology and Psychotherapy, Department of Psychology, University of Tuebingen, Tuebingen, Germany

Objective: There is only some literature regarding the influence of verbal suggestions on cognitive performance in healthy volunteers. For example, the performance in a knowledge test was enhanced when participants were told that they had subliminally received the correct answer. However, enhancing cognitive performance only via verbal suggestions without prior conditioning phases has not yet been examined. The goal of our study was therefore to investigate the effects of a mental training based on verbal suggestions compared to a control training on cognitive performance in a student population using a balanced-placebo-design.

Edited by:

Katja Weimer, Universität Ulm, Germany

#### Reviewed by:

Cosima Locher, Universität Basel, Switzerland

Karin Meissner, Ludwig-Maximilians-Universität München, Germany

\*Correspondence: Kristina Fuhr kristina.fuhr@uni-tuebingen.de

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 21 June 2018 Accepted: 27 September 2018 Published: 22 October 2018

#### Citation:

Fuhr K and Werle D (2018) Mental Training for Better Achievement: Effects of Verbal Suggestions and Evaluation (of Effectiveness) on Cognitive Performance. Front. Psychiatry 9:510. doi: 10.3389/fpsyt.2018.00510

Methods: In total, 103 participants were randomly assigned to listen via headphones either to a 20 min audio-taped mental training or to a 20 min philosophy lecture (control training). Participants were individually tested before and after the training concerning their cognitive performance. Information about the type of training was varied in both intervention conditions ("You are part of our experimental condition and you will receive an effective mental training" or "You are part of our control group and you will receive the control condition"). At the end of the assessment, participants were asked what kind of training they believed they had received and how effective they would rate the received training.

Results: Overall, cognitive performance improved in all participants, F(1, 99) = 490.01, p < 0.001. Contrary to our hypotheses, we found no interaction of the type of training and type of instruction on cognitive performance. Participants who rated the received training as effective at the end of the experiment (regardless of whether it was the mental or the control training) had previously experienced a greater improvement in their cognitive performance [F(2, 100) = 7.26, p = 0.001] and showed higher scores in the ability to absorb [F(2, 99) = 3.75, p = 0.027].

Conclusion: The subjects' own experiences in the task, rather than the actual training or the information they received regarding the type of training, might have influenced their rating of the training. This finding underlines the relevance of enhancing subjective beliefs and self-efficacy in situations where cognitive attention processes are important, and of individually tailoring mental trainings.

Keywords: placebo, mental training, cognitive performance, verbal suggestions, effectiveness rating

## INTRODUCTION

The influence of expectations and suggestions on the placebo response has been shown in various experimental studies, for example in the context of pain [e.g., (1)]. There, verbal instructions not only influenced the subjective analgesic effect but also reduced the number of requests for opioid doses (1). However, verbal suggestions alone (without any preconditioning) could only influence pain tolerance in healthy subjects and motor performance in Parkinson's patients, but not, for example, hormonal secretion (2). Little is known about placebo effects on cognitive performance. For example, cognitive processes such as attention improved after participants were told they had consumed caffeine, similar to the improvement after they had actually consumed caffeine [e.g., (3)]. Comparable results were found when college students believed that they had received a "neuro-enhancing" stimulant: cognitive performance improved on some scales, and the participants also evaluated their subjective results as better (4). However, placebo effects on cognitive processes have so far been investigated only when participants were given some kind of substance or placebo. It would therefore be interesting to know whether, similar to the results found for pain, cognitive processes can also be influenced by evoking specific expectations through verbal suggestions alone, i.e., through verbal instructions but also through psychological interventions. Automatic visual perception and cognitive processes, as assumed for example for the Stroop effect, can be influenced and even controlled by verbal suggestions given during a hypnotic experience. This effect was most pronounced in highly suggestible individuals (5, 6). For example, the interference in the Stroop task that arises when the ink color and the color word are incongruent could be eliminated by suggesting to subjects that they view the word stimuli as neutral and meaningless (5).
Furthermore, performance in counting visual stimuli was reduced when participants were given the suggestion, after a hypnotic induction, that a wooden board would cover the screen (6). In one study, performance in the Flanker task was influenced by posthypnotic suggestions only in highly suggestible participants, compared to (the same) nonhypnotic suggestions in an alert state (7). Thus, the advantage of using verbal suggestions after a hypnotic induction or with highly suggestible participants, rather than verbal instructions alone, is that hypnotic suggestions have even been shown to exert control over some cognitive processes, as described above (5, 6). However, there is only limited literature on enhancing cognitive performance via verbal suggestions alone. For example, performance in a knowledge test was enhanced when participants believed that they had subliminally received the correct answer (placebo condition) compared to those who were told that they had subliminally received only a flash [control condition, (8)]. Even the results of an intelligence test could be enhanced when positive expectations were evoked solely through the way participants were recruited (9). Another study found that expectations influenced the subjective evaluation of one's own performance rather than the objective performance [for example reaction times or success rates, see (10)]. However, in previous studies, expectancy effects were usually paired with a preceding test phase in which participants had already undergone a specific conditioning paradigm and could therefore already build corresponding expectations regarding the relevant test phase (10).

Taken together, previous results imply that some suggestions can block cognitive processes that were assumed to be automatic and not directly influenceable. However, the magnitude of the placebo effect on enhancing cognitive performance only based on verbal suggestions without previous experience has not yet been examined.

Concerning placebo effects in psychotherapy, for well-established treatments such as cognitive behavioral therapy it is impossible to conduct double-blind trials and challenging to develop "placebo" control groups that are not distinguishable from the specific treatment being tested (11). However, it has been demonstrated that psychological placebo interventions show effects equivalent to those of specific psychotherapies if they are structurally equivalent in design (12). This is why some researchers emphasize the relevance/superiority of the "common factors" in psychotherapy over the specific therapeutic ingredients (13, 14). Investigating mechanisms of change and differentiating specific from non-specific/common therapeutic "ingredients" is more or less impossible, since every specific ingredient is also transmitted via words, verbal suggestions, and other therapeutic rituals (15). Studies that directly manipulate some of the non-specific factors, for example expectations, are lacking (16). Hypnotic verbal suggestions offer one possible psychological intervention that is not based on specific treatment strategies but directly addresses non-specific factors like expectations (17); hypnosis can thus be used as a "non-deceptive placebo" (18).

The goal of our study was therefore to investigate the effects of a mental training based on hypnotic verbal suggestions compared to a control training consisting of a philosophy lecture on the cognitive performance in a student population using a "balanced-placebo-design" (19). We paired the trainings with evoking different expectations in the participants concerning the "efficacy" of the mental training. We expected the strongest effects on cognitive performance when participants received both the mental training and the suggestion of this training as being "effective" compared to the conditions in which they received incongruent information (received mental training and were told "ineffective" or received control training and were told "effective") or the control training paired with the information of being "ineffective."

## MATERIALS AND METHODS

### Study Design

The study used a mixed 2 × 2 × 2 design with the within-subjects factor time and the between-subjects factors intervention and information (see **Figure 1**). As the dependent variable, we assessed cognitive performance in a specific attention/concentration task (see Assessments). For the within-subjects factor time, cognitive performance was measured before and after the intervention. The factor intervention consisted of the mental training vs. the control training. Concerning the factor information, half of the participants in each intervention condition were told that they would receive an effective training ("You are part of our experimental group and you will receive an effective mental training"), whereas the other half were told that they were part of the control group ("You are part of our control group and you will receive the control condition"). Pairing the factors intervention and information established a balanced-placebo-design. Subjects were randomly assigned to one of the four experimental conditions using equally distributed random numbers from one to four (source: https://rechneronline.de/zufallszahlen/).
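The assignment step can be sketched as follows. This is a minimal illustration, not the authors' procedure: the study drew its numbers from an online generator, whereas the sketch uses Python's `random` module, and both the mapping of the numbers 1 to 4 onto the four cells of the balanced-placebo-design and the function name `assign_conditions` are assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical mapping of the random numbers 1-4 onto the 2 x 2 cells
# (intervention, information); the article does not report this mapping.
CONDITIONS = {
    1: ("mental training", "told: effective training"),
    2: ("mental training", "told: control group"),
    3: ("control training", "told: effective training"),
    4: ("control training", "told: control group"),
}

def assign_conditions(n_participants, seed=None):
    """Draw one uniformly distributed number from 1 to 4 per participant."""
    rng = random.Random(seed)
    return [rng.randint(1, 4) for _ in range(n_participants)]

# 103 participants, as in the study; the seed only makes the sketch repeatable.
assignments = assign_conditions(103, seed=2018)
cell_sizes = Counter(assignments)
for number in sorted(CONDITIONS):
    intervention, information = CONDITIONS[number]
    print(number, intervention, "|", information, "| n =", cell_sizes[number])
```

Because each draw is independent, the four cells will usually end up with similar but not identical sizes; a blocked scheme would be needed to force exactly equal cell sizes.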

The study protocol was approved by the local Ethics Committee for Psychological Research of the Faculty of Science at the University of Tuebingen (Az 2016/1123/26). All subjects gave written informed consent in accordance with the Declaration of Helsinki.

### Participants

The a priori sample size was defined at 100 participants.

Participants for the current study were recruited at the University of Tuebingen via announcements on social media (e.g., Facebook groups of psychology students and of students interested in study participation) and in cafeterias and libraries. Inclusion criteria were (1) being 18 to 50 years old, (2) no uncorrected ametropia, (3) normal hearing, and (4) providing written informed consent to participate in the study. Participants were alerted to the fact that they might be incompletely informed about some part of the study ["authorized deception," see (20)]. After study completion, every participant was fully debriefed about any prior deception. The "misinformation" of participants was necessary in those experimental conditions in which the type of intervention and the type of information were not congruent (mental training and told that they were "part of the control group"; control training and told that they were "part of the experimental group and receiving an effective training"). The goal of the incomplete information was to evoke specific expectations in participants concerning the effectiveness of the following intervention. So-called "authorized deception" is one way for experimental placebo studies to overcome an important ethical dilemma. On the one hand, deceiving participants is necessary to demonstrate the effect of expectancy on outcome. On the other hand, ethical norms require that participants freely choose to take part in a study only after every aspect of the study has been fully disclosed (21). It has been shown that the use of "authorized deception" does not affect the placebo effect and is therefore a very useful tool (20).

In total, 103 participants were investigated in the present study. Eighty-three of the subjects (80.6%) were female, all of them being students, about 50% were studying psychology (n = 43, 41.7%) or cognitive science (n = 10, 9.7%) in the first or second year. The mean age was 23.35 years (SD = 4.18).

### Interventions

In both conditions, participants individually listened to a 20 min audio recording via headphones. The mental training consisted of different indirect hypnotic verbal suggestions for enhancing cognitive performance: reminding participants of their own experience when learning new procedures, creating the image of an archer, and giving metaphors aimed at concentrating and focusing on relevant aspects of a task while ignoring irrelevant aspects. Participants were thus told indirectly, not directly, to focus more on the cognitive performance task. The control training consisted of part of a philosophy lecture about human free will. Listening to the lecture was not expected to enhance the subjects' cognitive performance. The "control training" was matched in length to the mental training.

### Assessments

The cognitive performance of participants was individually tested with the d2R (22) before and after receiving the intervention. The d2R measures attention and concentration performance within 5 min. Participants have to cancel out every "d" with two dashes among distractors across several rows within a specific time limit. We used the concentration performance score of the d2R as an overall measure of the participants' attention ability.

Depressive symptoms within the last 14 days were assessed with the Beck Depression Inventory II at baseline [BDI-II, (23)].

The participants' ability to absorb in thoughts and imaginings was assessed with the Tellegen Absorption Scale with 34 items at baseline [TAS, (24)].

The BDI-II and the TAS were included as covariates in the data analysis.

At baseline, we also asked participants via questionnaire about some sociodemographic variables.

At the end of the assessment, participants were asked to rate (1) the effectiveness of the training and (2) what type of training they believed they had received.

### Study Procedure

The investigators of the study followed a specific protocol with standardized instructions when interacting with the participants. Investigators were not blinded to the intervention and the information condition of the participants. After oral and written informed consent, participants received the BDI-II, the TAS, and a sociodemographic questionnaire. At baseline, the d2R was administered for the first time. Afterwards, information about the type of training was varied according to the participant's experimental condition ("You are part of our experimental condition and you will receive an effective mental training" or "You are part of our control group and you will receive the control condition"). Participants then received either the mental training or the philosophy lecture via headphones for 20 min. After listening to the audio recording, the participants were tested again with the d2R. After the task, they were asked to answer the previously described questions about the training they had received. At the end of the assessment, all participants were debriefed, again following a standardized protocol. The assessment took 1 h in total per participant. Subjects received either monetary compensation or hourly course credit for participating in the study.

### Statistical Analysis

We computed a three-way analysis of variance with the between-subjects factors intervention (mental training vs. control training) and information ("effective training" vs. "part of the control group") and the within-subjects factor time (before and after the training). The dependent variable was cognitive performance measured before and after receiving the training. For baseline correction, we conducted an analysis of covariance with the factors intervention and information and with cognitive performance at baseline as a covariate. Cognitive performance after the training was used as the dependent variable.
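The baseline-corrected analysis can be illustrated with a small ordinary-least-squares sketch. It is a hedged illustration under stated assumptions, not the authors' analysis script: the data are simulated, the effect coding (-0.5/+0.5) and the helper names `ols_rss` and `interaction_f` are hypothetical, and the F-test for the intervention × information interaction is obtained by comparing the model with and without the interaction term.

```python
import random

def ols_rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    k = len(X[0])
    # Normal equations (X'X) beta = X'y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][k] / M[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

def interaction_f(rows):
    """F-test of the intervention x information term in an ANCOVA.

    rows: (intervention, information, baseline, post), factors coded -0.5/+0.5.
    """
    y = [post for _, _, _, post in rows]
    full = [[1.0, a, b, a * b, base] for a, b, base, _ in rows]
    reduced = [[1.0, a, b, base] for a, b, base, _ in rows]
    df_error = len(rows) - len(full[0])
    rss_full = ols_rss(full, y)
    f_value = (ols_rss(reduced, y) - rss_full) / (rss_full / df_error)
    return f_value, df_error

# Simulated toy data (NOT the study data): 103 participants, practice gain only.
rng = random.Random(0)
rows = []
for _ in range(103):
    a, b = rng.choice([-0.5, 0.5]), rng.choice([-0.5, 0.5])
    baseline = rng.gauss(175, 35)
    post = baseline + 37 + rng.gauss(0, 15)
    rows.append((a, b, baseline, post))

f_value, df_error = interaction_f(rows)
print(f"F(1, {df_error}) = {f_value:.2f}")
```

With five estimated parameters (intercept, two factors, their interaction, and the baseline covariate) and 103 participants, the error term has 103 − 5 = 98 degrees of freedom.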

## RESULTS

The four study groups were comparable at baseline regarding age, F(3, 99) = 0.21, p = 0.890, sex, χ²(3) = 3.79, p = 0.285, the TAS, F(3, 99) = 0.56, p = 0.644, and the BDI-II, F(3, 99) = 0.41, p = 0.748.

Differences among the four groups in cognitive performance at baseline approached significance, F(3, 99) = 2.61, p = 0.056; see also the following analyses.

Cognitive performance improved significantly in all participants, F(1, 99) = 490.01, p < 0.001. Means and standard deviations are displayed in **Table 1**. We found no effect of the type of intervention, F(1, 99) = 0.11, p = 0.747, and no interaction between the factors intervention and time, F(1, 99) = 0.06, p = 0.802. Further, we found no interaction between information and time, F(1, 99) = 0.71, p = 0.402, or between information and intervention, F(1, 99) = 0.85, p = 0.358. However, we found a significant interaction between all three factors, F(1, 99) = 4.08, p = 0.046, and a significant effect of type of information, F(1, 99) = 4.79, p = 0.031. This was due to significant differences in cognitive performance at baseline regarding the type of information, F(1, 101) = 5.72, p = 0.018. The participants who were later told that they were in the control group were actually faster than those who were later told that they would receive an effective mental training ("effective mental training": M = 166.4, SD = 34.8 vs. "control group": M = 183.1, SD = 36.1).
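As a plausibility check, the reported baseline difference can be approximately reproduced from the summary statistics, because with two groups the F-value equals the squared pooled-variance t-value. The sketch below is illustrative only: the group sizes of 51 and 52 are an assumption (the text reports only the total of 103 participants and 101 error degrees of freedom), and `f_from_summary` is a hypothetical helper name.

```python
def f_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Two-group F (= t squared) from means, SDs, and group sizes."""
    df_error = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df_error
    t_value = (m1 - m2) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return t_value ** 2, df_error

# Reported baseline values; the group sizes 51 and 52 are ASSUMED, not reported.
f_value, df_error = f_from_summary(166.4, 34.8, 51, 183.1, 36.1, 52)
print(f"F(1, {df_error}) = {f_value:.2f}")
```

Under this assumption the result lies close to the reported F(1, 101) = 5.72; small deviations are expected because the published means and standard deviations are rounded.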

When controlling for baseline differences in cognitive performance, there was an interaction between type of training and type of instruction that approached significance, t(98) = −12.77, p = 0.063. Cognitive performance in groups with congruent information and training did not improve as much as in groups with incongruent information (improvement "congruent": M = 33.89, SD = 20.11 vs. improvement "incongruent": M = 40.55, SD = 12.38).

The covariate depressive symptoms, assessed with the BDI-II, was not related to cognitive performance, F(1, 96) = 0.13, p = 0.723, nor was the ability to absorb, as measured with the TAS, F(1, 96) = 0.46, p = 0.500.

When we compared the cognitive performance of the participants according to their effectiveness rating at the end of the intervention, we found a significant interaction between time and the rating of the training. Cognitive performance improved more in participants who rated the training as neutral or effective at the end of the assessment than in those who rated the training as ineffective, F(2, 100) = 7.26, p = 0.001, see also **Table 2**. The improvement in cognitive performance was, however, independent of whether participants correctly identified the training condition, F(1, 101) = 1.75, p = 0.189.

The actual training participants received had no significant effect on their evaluation of its effectiveness, χ²(2) = 5.50, p = 0.064, see **Table 3**. Instead, the effectiveness rating at the end of the intervention was significantly associated with their own belief about which type of training they had received, χ²(2) = 10.67, p = 0.005, see **Table 4**.
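The reported p-values can be checked directly, because for two degrees of freedom the chi-square survival function has the closed form p = exp(−χ²/2). The sketch below also shows how the statistic is computed from a contingency table; the counts in the example table are invented for illustration and are not the study's data, and the function names are hypothetical.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 df."""
    return math.exp(-stat / 2)

# p-values for the two statistics reported in the text (both have 2 df).
p_actual_training = p_value_df2(5.50)
p_assumed_training = p_value_df2(10.67)
print(round(p_actual_training, 3), round(p_assumed_training, 3))

# Invented 2 x 3 table: rows = training type, columns = rating
# (ineffective, neutral, effective). NOT the study's counts.
toy_stat, toy_df = chi_square_statistic([[10, 20, 30], [30, 20, 10]])
print(toy_stat, toy_df)
```

Both computed values round to the p-values given in the text (0.064 and 0.005).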

Participants who rated the training as neutral or effective showed higher scores in absorption (TAS) at baseline compared to those who rated the training as ineffective, F(2, 99) = 3.75, p = 0.027 (ineffective: M = 42.0, SD = 19.6; neutral: M = 46.4, SD = 18.3; effective: M = 56.1, SD = 25.4; post hoc Bonferroni: effective vs. ineffective: p = 0.038). No differences were found regarding age, depressive symptoms, and sex.

TABLE 1 | Cognitive performance before and after the intervention in all four experimental conditions.


M, mean; SD, standard deviation; n, sample size.

TABLE 2 | Cognitive performance before and after the intervention regarding effectiveness ratings.


M, mean; SD, standard deviation; n, sample size.


n, sample size; χ²(2) = 5.50, p = 0.064.

TABLE 4 | Effectiveness rating in the two assumed intervention conditions.


n, sample size; χ²(2) = 10.67, p = 0.005.

## DISCUSSION

Contrary to our hypothesis that a mental training based only on verbal suggestions, or information about the effectiveness of the training, could enhance cognitive performance, neither the training nor the information about the training had an effect on participants' actual performance. Overall, we found that the cognitive performance of all participants improved. This result might be due to practice or an effect of repeated measurement [see also (22)]. However, the participants' own rating of the effectiveness of the training was significantly associated with the change in cognitive performance, irrespective of the actual type of intervention or information they received. One interpretation of our results is that the perception of their own performance improvement influenced their rating of effectiveness at the end of the assessment. The effectiveness rating, however, was independent of the actual training they received and also of whether they correctly identified the training condition. Thus, another interpretation of the results could be that the subjective evaluation influenced the participants' performance in the second trial. Unfortunately, we did not assess expectations regarding the mental training and its influence on cognitive performance before the assessment.

Mental practice is known to enhance performance in general because it involves training of specific behaviors, especially in cognitive tasks (25). Another study found that mental practice, i.e., imagining a specific motor activity, could enhance the outcome in that specific motor task (26). This finding could be interpreted as a top-down mechanism that activates the brain regions that are also associated with the concrete task and thus enhances performance. Similarly, if a mental training is able to activate areas that are associated with cognitive performance, the performance itself can be improved. However, mental practice should be trained regularly to maintain its effects (25). In our study, participants received only one session of mental training. Furthermore, several cognitive abilities are needed for attention performance in a specific task like the d2R that was used in our study, for example performance speed, accuracy, and inhibition of distractors (22). It might be possible that our mental training was not able to activate the specific abilities that were necessary for the attention task that we measured. Our training, which consisted of suggestions that might indirectly influence performance, did not include any mental practice of the specific attention task. However, some higher-order cognitions, for example self-efficacy (27) or other meta-cognitive aspects like self-regulation or motivation (28), could have been more important for the cognitive outcome that we measured. Especially in student samples, perceived self-efficacy can influence cognitive performance (29). The finding of another study that verbal suggestions could enhance creativity was explained by intrinsic motivation and participants' belief in their own competence (30). If we transfer that explanation to our results, we could hypothesize that participants' own motivation and belief in their performance had the biggest effect on their actual performance, regardless of the information given by the investigator or the suggestions that were used in the mental training. Their own evaluation of the effectiveness of the training was consequently based not on external information but on their own intrinsic standards. The participants' motivation to improve or their perceived self-efficacy may therefore have influenced their performance the most. Our results are also in line with previous findings that the subjective evaluation rather than the objective performance was influenced by verbal suggestions [see (7)]. The (almost significant) result that cognitive performance improved more when subjects were given information about the type of training that was not congruent with the actual training they received is particularly interesting in light of this explanation. We interpret that result to mean that the participants' motivation to improve was triggered even more when they received incongruent information.

We found that participants who rated the training as effective had higher hypnotic suggestibility than those who rated the training as ineffective. The effectiveness rating, however, was significantly influenced by the intervention they believed they had received rather than by the intervention they actually received. This potential placebo effect implies that highly suggestible subjects might base their expectations on their own appraisal instead of external information. We argue that the effect of enhancing cognitive performance was more pronounced in a subgroup of participants with high intrinsic motivation, high self-efficacy, and high suggestibility. Similarly, highly suggestible patients with depression showed greater responses to suggestions and expectations regarding the effects and side effects they perceived when taking an antidepressant medication (31). The ability to become absorbed in images, also known as hypnotic suggestibility [see (24)], might therefore mediate the effect of expectations on outcome [see also (31)]. However, in our study this ability had no impact on cognitive performance itself but only on participants' own evaluation of the effectiveness of the treatment. This underlines the importance of tailoring interventions to participants' characteristics or needs. Personalized medicine, also used in the psychiatric context, is based on the idea of optimizing the fit between patient characteristics and treatment choice and thereby enhancing treatment outcome and benefit [e.g., (32)]. Furthermore, interventions that are based on hypnotic verbal suggestions should aim at increasing self-efficacy and the belief in one's own competence (33). This idea is supported by findings regarding the influence of non-specific/common factors on the outcome of psychotherapy (34).

Placebo effects on cognitive performance have been found in participants with mild cognitive or attentional deficits, for example in nicotine smokers after a period of deprivation (35). Some patients with Major Depression also suffer from cognitive impairment [see (36)], and attention or concentration problems are included in the list of typical symptoms and criteria of depression. Mental trainings and other psychological interventions for enhancing cognitive performance might be even more effective for these patients than for healthy, academically high-performing participants such as university students. This is in line with a study that found that older adults might also improve their cognitive performance after receiving some kind of cognitive training (37). Future studies should evaluate the effects of a mental training that focuses on cognitive enhancement, especially in patients with Major Depression. Within this context, the influence of placebo effects on cognitive performance should also be investigated.

### LIMITATIONS

There are several factors that limit the generalizability of the current study.

One limitation of the current study is that we did not conduct a pilot study to determine whether the mental training that we conceptualized was effective. We also did not obtain any feedback about the quality of the mental training from the participants. However, it was not our goal to evaluate the mental training. Instead, we were more interested in differentiating the effects of direct suggestions and information given by the investigators from the effects of creating images (within the mental training) that might indirectly influence the participants' performance. Nevertheless, the mental training should have been evaluated in different samples regarding its effectiveness in enhancing cognitive performance. For this purpose, a fully deceptive placebo design could be used. In summary, our mental training was not specifically effective for cognitive performance in the student sample.

Second, our study sample was not representative. We measured a very young and highly educated student sample. Comparing the means of the present sample with the norms of the d2R, it was obvious that even at baseline the student sample showed an extremely high concentration performance [norms of the d2R at the age of 20–39: M = 158.6, SD = 29.4, see (22); current sample: M = 174.38, SD = 36.24]. This might be based on motivation differences regarding the assessment or some previous experience in the task [see also (22)].
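The comparison with the d2R norms can be expressed as a standardized difference. The following is a minimal sketch that only uses the means and standard deviation quoted above; the function name is a hypothetical helper, and expressing the gap in units of the norm SD is an illustrative choice, not an analysis reported in the study.

```python
def standardized_difference(sample_mean: float, norm_mean: float, norm_sd: float) -> float:
    """Distance of a sample mean from a norm mean, in units of the norm SD."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (sample_mean - norm_mean) / norm_sd


# Values quoted in the text: d2R norms (age 20-39) and the current sample at baseline.
d2r_gap = standardized_difference(sample_mean=174.38, norm_mean=158.6, norm_sd=29.4)
print(f"Baseline mean lies {d2r_gap:.2f} norm SDs above the norm mean")
```

With the reported values, the baseline mean of the student sample lies roughly half a norm standard deviation above the norm mean.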

We found differences in the cognitive performance of participants at baseline regarding the factor type of information. These differences occurred at baseline, when the instructions had not yet been given to the subjects. Thus, we might have created an investigator effect. Although participants were randomly assigned to the experimental conditions, the investigator was not blind to the intervention and information condition that the subjects would receive when interacting with them. Especially when the investigator knew that the participant would later be told that they had received the control condition, he or she might have behaved differently when interacting with the participant, which may have influenced and increased their performance. We wanted to avoid any influence of the investigator on the subject by using a standardized protocol for instructing the subjects. However, the investigators may already have influenced the subjects' performance unconsciously or via indirect communication. In sum, a potential investigator allegiance effect may have confounded the results [see also (38)]. Future studies should avoid investigator effects by blinding the investigator who interacts with the participants to the type of training they receive. Another possibility would be to directly manipulate and vary some aspects of the contact with the participants. For example, placebo effects were enhanced when practitioner contact was longer and focused more on the nature and history of symptoms compared to a relationship with only limited contact (39), and a warm, empathic contact with a clinician could even result in subjective and objective improvement of cold duration and severity (40).

### CONCLUSIONS

The participants' own evaluation of the effectiveness of the training was most probably driven by their own performance in the first and second trial of the task or by their own motivation to perform. Their own experiences and ratings were thus more important for their cognitive performance than the efficacy of a specific training or the information about the training they received. This finding underlines the relevance of enhancing self-efficacy in situations where attention processes are important and of individually tailoring psychological trainings or interventions accordingly. The relevance of mental trainings for people with psychological disorders and mild cognitive impairment, for example patients with mild to moderate major depressive episodes, should be investigated in future studies. Within this context, the participant's belief in the efficacy of a specific treatment in particular should enhance their actual treatment response. The ability to become absorbed in images, also known as hypnotic suggestibility [see (24)], might mediate the effect of expectations on outcome and should be investigated in future psychotherapy studies.

### DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

### AUTHOR CONTRIBUTIONS

Both authors contributed to the conception and design of the study. KF wrote the first draft of the manuscript. KF and DW conducted the statistical analysis. Both authors approved the submitted version.

#### FUNDING

The study was funded by the University Program for the Promotion of Junior Researchers at the Eberhard Karls Universität of Tübingen, awarded to KF, from July 2016 to December 2017. We acknowledge support from Deutsche Forschungsgemeinschaft and Open Access Publishing Fund of University of Tuebingen.

#### ACKNOWLEDGMENTS

We want to thank our investigators Neele Alberts, Lia Heubner, Isabell Kunze, and Tina Lorenz for helping with the data acquisition.

#### REFERENCES

in patients with irritable bowel syndrome. BMJ (2008) 336:999–1003. doi: 10.1136/bmj.39524.439618.25

40. Rakel DP, Hoeft TJ, Barrett BP, Chewning BA, Craig BM, Niu M. Practitioner empathy and the duration of the common cold. Fam Med. (2009) 41:494–501.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Fuhr and Werle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Effects of Placebo Interventions on Subjective and Objective Markers of Appetite–A Randomized Controlled Trial

Verena Hoffmann<sup>1</sup>\*, Marina Lanz<sup>1</sup>, Jennifer Mackert<sup>1</sup>, Timo Müller<sup>2,3</sup>, Matthias Tschöp<sup>2,3</sup> and Karin Meissner<sup>1,4</sup>\*

<sup>1</sup> Institute of Medical Psychology, Faculty of Medicine, LMU Munich, Munich, Germany, <sup>2</sup> Helmholtz Diabetes Center, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Munich, Germany, <sup>3</sup> German Center for Diabetes Research (DZD), Garching bei München, Germany, <sup>4</sup> Division of Health Promotion, Coburg University of Applied Sciences, Coburg, Germany

Objective: Patients' expectations about the benefit of an intervention are important determinants of the placebo effect. Little is known about the extent to which expectations influence outcomes of treatments in the field of appetite regulation. This study aimed to investigate the effects of treatment-related expectations on subjective and objective markers of appetite.

#### Edited by:

Paul Enck, University of Tubingen, Germany

#### Reviewed by:

Kristina Fuhr, University of Tubingen, Germany; Damien Finniss, University of Sydney, Australia; Teodora Pribic, Vall d'Hebron Research Institute (VHIR), Spain

#### \*Correspondence:

Verena Hoffmann, verena.hoffmann@med.lmu.de; Karin Meissner, karin.meissner@med.lmu.de

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 14 October 2018 Accepted: 03 December 2018 Published: 18 December 2018

#### Citation:

Hoffmann V, Lanz M, Mackert J, Müller T, Tschöp M and Meissner K (2018) Effects of Placebo Interventions on Subjective and Objective Markers of Appetite–A Randomized Controlled Trial. Front. Psychiatry 9:706. doi: 10.3389/fpsyt.2018.00706

Methods: Ninety healthy participants of normal weight were randomly allocated to either an appetite-enhancing placebo group, a satiety-enhancing placebo group, or a control group. All participants received a placebo capsule along with group-specific verbal suggestions that the capsule would promote appetite, enhance satiety, or have no effect on appetite. Before and during the 2 h following randomization, participants were repeatedly asked to rate feelings of hunger and satiety on visual analog scales (VAS), and blood samples were taken repeatedly to assess plasma ghrelin levels as a physiological marker of hunger.

Results: In comparison to the control group, the satiety-enhancing placebo intervention significantly reduced appetite and increased satiety. The appetite-enhancing placebo intervention did not alter subjective levels of hunger, but increased plasma ghrelin levels in females.

Conclusions: Results provide the first experimental evidence that appetite-regulating placebo interventions can elicit a psychobiological response. Expectations are important factors to consider when evaluating the effects of interventions in the field of appetite regulation.

Keywords: placebo effect, expectation, appetite, satiety, ghrelin

## INTRODUCTION

Obesity is a dramatically increasing problem in our society. Treatment approaches for obesity include psychological, pharmacological, and surgical interventions (1–3). To what extent placebo effects, i.e., positive treatment expectations, contribute to the success of obesity treatments is unclear. A recent systematic review of placebo-controlled surgery trials revealed that patients receiving sham bariatric surgery showed on average 71% of the weight loss reported by the patients in the active surgery groups (4). These data suggest a strong inhibitory effect of placebo interventions on appetite.

Eating behavior is closely linked to mental sets. For example, Higgs (5) reported that participants consumed less in a test session when they were reminded of a recent meal. Furthermore, Provencher et al. (6) showed that participants ate more when the meal was perceived as healthy. Crum and colleagues went one step further and evaluated the impact of expectations on plasma levels of the gut hormone ghrelin, a physiological marker of appetite (7). In a within-subjects design, healthy volunteers were led to believe on two occasions that they were receiving either a "high-caloric, indulgent milk shake" or a "low-caloric, sensible milk shake." In truth, both milk shakes had identical contents. Results showed a different ghrelin response to these milk shakes: In comparison to the "sensible" shake, the ghrelin increase was larger when expecting the "indulgent" milk shake, followed by a sharper decline of ghrelin levels 30 min after drinking the shake. These findings indicate a strong impact of nutrition-specific expectations on appetite and satiety, as evidenced by altered plasma ghrelin levels before and after a test meal. Additionally, differences in eating behavior are linked to gender. Several studies have shown that women tend to eat more healthily than men [i.e., avoiding high-fat food and eating more fruit and fiber; (8, 9)]. This has been linked to women being more concerned about their body weight than men (10). It has also been reported that women eat more sweets than men when perceiving stress (11).

In this study, we investigated whether treatment-related expectations can affect appetite, satiety and plasma ghrelin levels. In a between-subjects design, normal-weight participants received a placebo capsule together with the information that its content would either increase appetite, or increase satiety, or would leave appetite and satiety unaffected. We hypothesized that the appetite-enhancing placebo intervention would decrease satiety and increase appetite and plasma ghrelin levels, while the satiety-enhancing placebo intervention would have the opposite effects, both in comparison to a no treatment control group.

## MATERIALS AND METHODS

#### Participants

The study was conducted at the Institute of Medical Psychology at the LMU Munich, Germany. Healthy participants aged 18–36 years were recruited via flyers and a university mailing list. All participants had to be of normal weight [body mass index (BMI) 18–25 kg/m<sup>2</sup>]. Exclusion criteria were self-reported pregnancy, breastfeeding, regular use of medication (except hormonal contraceptives and anti-allergic drugs), acute or chronic disease, smoking, surgery in the last 4 weeks before the experiment, elevated fasting blood glucose levels (>100 mg/dl), and elevated anxiety and/or depression scores [>7 in at least one subscale of the Hospital Anxiety and Depression Scale (HADS); (12)]. The study protocol was approved by the ethics committee of the Medical Faculty at LMU Munich. Participants provided written informed consent and received 45 Euro compensation.
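The inclusion and exclusion criteria above can be written down as a simple screening check. This is a minimal sketch, assuming a hypothetical `Candidate` record with one field per criterion; the field names and the structure of the check are illustrative and are not taken from the study materials. Only the thresholds (age 18–36 years, BMI 18–25 kg/m², fasting glucose >100 mg/dl, HADS subscale >7) come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical screening record; field names are illustrative."""
    age_years: int
    weight_kg: float
    height_m: float
    fasting_glucose_mg_dl: float
    hads_anxiety: int
    hads_depression: int
    smoker: bool = False
    pregnant_or_breastfeeding: bool = False
    regular_medication: bool = False  # other than hormonal contraceptives / anti-allergic drugs
    acute_or_chronic_disease: bool = False
    surgery_last_4_weeks: bool = False


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def exclusion_reasons(c: Candidate) -> List[str]:
    """Return all criteria the candidate fails; an empty list means eligible."""
    reasons = []
    if not 18 <= c.age_years <= 36:
        reasons.append("age outside 18-36 years")
    if not 18.0 <= bmi(c.weight_kg, c.height_m) <= 25.0:
        reasons.append("BMI outside 18-25 kg/m^2")
    if c.fasting_glucose_mg_dl > 100:
        reasons.append("fasting glucose > 100 mg/dl")
    if c.hads_anxiety > 7 or c.hads_depression > 7:
        reasons.append("HADS subscale > 7")
    if c.smoker:
        reasons.append("smoking")
    if c.pregnant_or_breastfeeding:
        reasons.append("pregnancy or breastfeeding")
    if c.regular_medication:
        reasons.append("regular medication")
    if c.acute_or_chronic_disease:
        reasons.append("acute or chronic disease")
    if c.surgery_last_4_weeks:
        reasons.append("surgery in the last 4 weeks")
    return reasons


eligible = Candidate(26, 65.0, 1.72, 90.0, 3, 2)
high_glucose = Candidate(26, 65.0, 1.72, 105.0, 3, 2)
print(exclusion_reasons(eligible))
print(exclusion_reasons(high_glucose))
```

Returning the list of failed criteria, rather than a single yes/no flag, mirrors how exclusions are reported per reason in a participant flow.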

### Experimental Procedure

Ninety participants were randomly allocated to one of three groups: "control," "enhanced appetite" (placebo), or "enhanced satiety" (placebo). To allow for double-blinding, 6 additional participants were randomized to verum treatments (3 enhanced appetite, 3 enhanced satiety; compare (13); **Figure 1**). Groups were stratified by sex due to sex differences in eating behavior (14) and neurobiological mechanisms of placebo effects (15). At recruitment, participants were informed that the experiment investigated how biological and psychological factors contribute to the regulation of hunger and appetite.

Participants underwent a single test session starting at 8 o'clock in the morning. They were asked to abstain from food for 10–12 h prior to the experiment (intake of small amounts of water was allowed). Upon arrival, participants took a seat in a comfortable chair, and blood glucose levels were determined from finger blood samples using a BG Star device (Sanofi-Aventis, Hannover, Germany). An indwelling flexible catheter was then placed in the antecubital vein and kept patent with a saline infusion to allow for repeated blood drawing during the experiment. Electrodes to measure the electrocardiogram (ECG) were attached. Participants were then asked to fill in the "Hospital Anxiety and Depression Scale" [HADS; (12)] and the "Three Factor Eating Questionnaire" [TFEQ; (16)] and to rate current levels of hunger and satiety on 100-mm visual analog scales (VAS). Thereafter, the first blood sample to assess ghrelin levels was collected and the ECG measurement was started. Following a 15-min baseline period, the experimenter opened the randomization envelope, performed the verbal expectancy manipulation according to group allocation ("appetite increase," "satiety increase," or "control"), and asked the participants to swallow the provided test capsule with 100 ml of mineral water (standardized temperature of 20°C). After resting periods of 30 and 60 min, respectively, participants were asked to rate current levels of hunger and satiety, and the second and third blood samples for ghrelin assessment were collected.

### Blinding and Randomization

A person not directly involved in the experiments prepared an opaque, sequentially-numbered randomization envelope for each participant according to a computer-generated randomization list. The envelopes contained information on the type of intervention ("appetite-stimulating," "satiety-enhancing," or "control") as well as a test capsule. Neither the experimenter nor the participants were informed whether the capsule in the hunger-enhancing and satiety-enhancing conditions was a placebo or contained an active ingredient (double-blinded design).
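A computer-generated, sex-stratified allocation list of the kind described here could be produced as follows. This is a minimal sketch under stated assumptions: 15 participants per group within each sex (consistent with 30 per group, but not stated explicitly in the text), a simple shuffle within each stratum rather than whatever procedure the authors' software used, and a hypothetical seed. The six verum participants are left out for simplicity.

```python
import random
from collections import Counter

GROUPS = ("appetite", "satiety", "control")


def stratified_allocation(n_per_cell: int, seed: int) -> dict:
    """Return {sex: shuffled list of group labels}, balanced within each sex.

    n_per_cell is the number of participants per group within one sex
    (an assumption for illustration; the text reports only group totals).
    """
    rng = random.Random(seed)  # fixed seed makes the list reproducible
    allocation = {}
    for sex in ("female", "male"):
        labels = [group for group in GROUPS for _ in range(n_per_cell)]
        rng.shuffle(labels)
        allocation[sex] = labels
    return allocation


allocation = stratified_allocation(n_per_cell=15, seed=2018)
for sex, labels in allocation.items():
    # Position i in each list would correspond to the i-th numbered envelope of that stratum.
    print(sex, dict(Counter(labels)))
```

Shuffling a pre-balanced list within each stratum guarantees equal group sizes per sex, which simple independent coin-flip allocation would not.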

### Capsules

Identical white and opaque vegetarian capsules were used for all interventions. The placebo capsules were filled with lactose (Heirler Cenovis GmbH, Radolfzell, Germany). For the satiety-increasing active intervention, capsules were filled with an alginate complex (lyophilized sodium-alginate complex with added aluminum and calcium chloride; CM3 Alginat Kapseln, Easyway GmbH, Monheim, Germany). Alginate reduces hunger and increases satiety feelings, which is partly due to its volume-increasing content (17). For the appetite-stimulating active intervention, one tablet of "Appetit-Anreger" with extracts of bitter herbs (Zirkulin Naturheilmittel GmbH, Bremen) was placed in the study capsules. Dietary supplements containing bitters are traditionally used to increase appetite and to support digestion (18).

FIGURE 1 | … "enhanced satiety"). The verum groups served only for double-blinding and were not evaluated further.

### Expectancy Manipulation

Standardized expectancy manipulations were performed by two female experimenters in white coats (one undergraduate, one graduate student). After opening the randomization envelopes, participants in the appetite-stimulating groups were told that they would receive either a placebo capsule or a capsule filled with bitters, and that bitters are known to increase secretion of digestive fluids in the stomach and are thus expected to increase appetite within 20–30 min after intake of the capsule. Participants in the satiety-enhancing groups were told that they would receive either a placebo capsule or a capsule containing alginate, and that alginate is known to increase its volume in the stomach and is thus expected to enhance satiety within 20–30 min after intake of the capsule. In both cases, the randomization ratio between verum and placebo was not disclosed. Participants in the control group received a placebo capsule together with the information that its ingredients would have no effect on gastric activity and appetite.

### Measurements

#### Hunger and Satiety Ratings

Perceived hunger ("How hungry do you feel?") was rated using a 100-mm visual analog scale from 0 ("not at all hungry") to 100 ("extremely hungry"). Perceived satiety ("How full do you feel?") was assessed by means of a 100-mm visual analog scale ranging from 0 ("not at all full") to 100 ("extremely full").

#### Plasma Ghrelin

To assess the concentration of ghrelin in plasma, blood samples were collected in commercially available EDTA tubes (2.7 ml) supplemented with 54 µl of 4 mM 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride (AEBSF) (19). Blood samples were immediately stored on ice and centrifuged within 30 min for 10 min at 3,000 g and 4°C. Two samples of 500 µl plasma were transferred to Eppendorf tubes and supplemented with 100 µl of 1 mM HCl. Samples were gently mixed and stored at −70°C until final analysis. Plasma ghrelin content was analyzed with the Ghrelin (total) Assay Kit (catalogue number: EZGRT-89K) from Merck Millipore, Darmstadt, Germany, according to the protocol.

#### Electrocardiogram

The electrocardiogram was recorded to evaluate changes in heart rate. A transient increase of heart rate has been described as part of the cephalic phase response when food is anticipated (20). The electrocardiogram signal was measured continuously during the experiment using three disposable Ag/AgCl electrodes, which were positioned in an Einthoven Lead I configuration and connected to the BIOPAC amplifier module ECG100C of a BIOPAC MP 150 device (BIOPAC Systems Inc., Goleta, CA, USA). Data were acquired using AcqKnowledge 3.7.2 software at a sampling rate of 500 Hz. Intervals between successive R peaks (cardiac periods) were extracted from the electrocardiogram signal using the peak-detection function implemented in AcqKnowledge 3.7.2. Heart periods were examined and screened for artifacts based on the procedure developed by Porges and Byrne (21). Average heart rate was calculated for the last five artifact-free minutes of the baseline period and the two post-intervention periods (i.e., minutes 25–30 and minutes 55–60 after randomization).
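The step from R-peak times to an average heart rate for a 5-min window can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the function name and arguments are hypothetical, the plausibility limits of 0.3–2.0 s are a crude stand-in for artifact screening and not the procedure cited in the text, and heart rate is obtained by averaging cardiac periods first and converting to beats per minute afterwards.

```python
from statistics import mean


def mean_heart_rate(r_peak_times_s, window_start_s, window_end_s,
                    min_ibi_s=0.3, max_ibi_s=2.0):
    """Mean heart rate (bpm) from R-peak times that fall inside a time window.

    Inter-beat intervals (IBIs) outside [min_ibi_s, max_ibi_s] are dropped.
    These limits are a hypothetical plausibility filter, not the artifact
    procedure referenced in the text.
    """
    peaks = [t for t in r_peak_times_s if window_start_s <= t <= window_end_s]
    ibis = [later - earlier for earlier, later in zip(peaks, peaks[1:])]
    valid = [ibi for ibi in ibis if min_ibi_s <= ibi <= max_ibi_s]
    if not valid:
        raise ValueError("no valid inter-beat intervals in window")
    return 60.0 / mean(valid)


# Hypothetical data: one beat every 0.8 s across minutes 25-30 (1500-1800 s).
r_peaks = [1500.0 + 0.8 * i for i in range(376)]
hr_25_30 = mean_heart_rate(r_peaks, 1500.0, 1800.0)
print(f"{hr_25_30:.1f} bpm")
```

Averaging periods before converting gives the time-weighted mean rate; averaging instantaneous rates instead would weight short intervals more heavily and yield a slightly higher value.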

#### Questionnaires

The Hospital Anxiety and Depression scale [HADS; (12)] was used to screen for anxiety and/or depression. The Three Factor Eating Questionnaire [TFEQ; (16)] with its three subscales "cognitive restraint of eating," "disinhibition," and "hunger" was used to test for possible differences in eating behavior between groups at baseline.

Female participants were asked for the normal length of their menstrual cycle, the beginning of the last menstruation, and whether they took hormonal contraceptives. Time point of ovulation was estimated by subtracting 14 days from the length of the menstrual cycle (22).
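The ovulation estimate can be illustrated with a short sketch. Only the "cycle length minus 14 days" rule comes from the text; the function names, the convention that the first day of menstruation is cycle day 1, and the decision to count the estimated ovulation day itself as preovulatory are assumptions made for illustration.

```python
def estimated_ovulation_day(cycle_length_days: int) -> int:
    """Estimated cycle day of ovulation: cycle length minus 14 days."""
    if cycle_length_days <= 14:
        raise ValueError("cycle length must exceed 14 days")
    return cycle_length_days - 14


def cycle_phase(cycle_length_days: int, days_since_onset: int) -> str:
    """Classify the test day as 'preovulatory' or 'postovulatory'.

    Assumptions (not from the text): the day of menstruation onset is
    cycle day 1, and the estimated ovulation day counts as preovulatory.
    """
    if days_since_onset < 0:
        raise ValueError("days_since_onset must be non-negative")
    cycle_day = days_since_onset + 1
    if cycle_day <= estimated_ovulation_day(cycle_length_days):
        return "preovulatory"
    return "postovulatory"


print(estimated_ovulation_day(28))
print(cycle_phase(28, 5))
print(cycle_phase(30, 20))
```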

#### Statistical Analyses

Assuming an effect size of partial eta-squared = 0.1, the study was planned to have a power of 90% to detect a significant interaction effect between "group" and "time point" in a mixed ANOVA for changes of hunger, satiety, and ghrelin from before to after the intervention (with a type 1 error of 5%; calculated with G\*Power version 3.1.7). However, we later decided to use ANCOVAs to adjust for the slight group differences at baseline. Statistical analyses were performed using SPSS (version 23.0). Hunger ratings, satiety ratings, and plasma ghrelin levels were each subjected to 3-way analyses of covariance (ANCOVA), with two levels of "time" (30 min and 60 min after randomization), three levels of "group" (appetite, satiety, control), and two levels of "sex" (male, female). In each model, baseline levels (15 min before randomization) were included as covariates. Bonferroni corrections were applied where appropriate. A p-value ≤ 0.05 (2-sided) was considered statistically significant.

TABLE 1 | Group characteristics at baseline.
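The effect size assumed in the power analysis under Statistical Analyses can be related to Cohen's f, the effect size metric commonly entered into power software for ANOVA designs. The sketch below shows only the standard conversion f = sqrt(η²p / (1 − η²p)); the function name is hypothetical, and the snippet does not reproduce the authors' power calculation or sample size.

```python
import math


def cohens_f_from_partial_eta_squared(eta_p2: float) -> float:
    """Convert partial eta-squared to Cohen's f via f = sqrt(eta_p2 / (1 - eta_p2))."""
    if not 0.0 <= eta_p2 < 1.0:
        raise ValueError("partial eta-squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1.0 - eta_p2))


f_assumed = cohens_f_from_partial_eta_squared(0.1)
print(round(f_assumed, 3))
```

For the assumed partial eta-squared of 0.1, this gives f of about 0.33.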

### RESULTS

#### Participants

One hundred thirteen participants were assessed for eligibility and 17 were excluded (three did not meet inclusion criteria, 12 declined to participate, one did not show up, and one had elevated fasting blood glucose levels). Thirty participants each were assigned to the appetite group, the satiety group, and the control group. All participants completed the experiment.

Study groups were comparable with respect to demographic variables, eating behavior, and anxiety and depression scores (**Table 1**). Participants had a mean age of 26.6 years (SD 3.2; range: 18–36 years) and a mean BMI of 21.9 kg/m<sup>2</sup> (SD 1.8; range: 18.6–25 kg/m<sup>2</sup>). Fourteen women were in the preovulatory phase and two women in the postovulatory phase of the menstrual cycle, while 29 women were using hormonal contraceptives.

#### Hunger Ratings

The 3-way ANCOVA for post-intervention hunger ratings, controlled for baseline levels, revealed a significant 3-way interaction between "group," "time," and "sex" [F<sub>group×time×sex</sub>(2, 83) = 4.0, p = 0.023]. However, post hoc 2-way ANCOVAs performed separately for each sex showed no significant interaction effect between "group" and "time" [women, F<sub>group×time</sub>(2, 41) = 3.1, p = 0.058; men, F<sub>group×time</sub>(2, 41) = 0.6, p = 0.571]. Furthermore, the 3-way ANCOVA showed a significant main effect of "group" [F<sub>group</sub>(2, 83) = 6.7, p = 0.002]. Bonferroni-corrected post hoc tests indicated significantly lower hunger ratings in the satiety group compared to the control group (p = 0.033) and to the appetite group (p = 0.002) (**Figure 2**, **Table 2**).
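All group-related F tests reported here have two numerator degrees of freedom, for which the upper-tail p-value has a closed form: p = (1 + 2F/df₂)^(−df₂/2). The sketch below applies that identity as a plausibility check on reported values; the function name is hypothetical, and small deviations from the published p-values are expected because the F statistics are rounded to one decimal.

```python
def p_value_f_two_numerator_df(f_value: float, df_error: int) -> float:
    """Upper-tail p-value of an F(2, df_error) statistic.

    Uses the closed form (1 + 2F/df2) ** (-df2 / 2), which holds only
    when the numerator degrees of freedom equal 2.
    """
    if f_value < 0 or df_error <= 0:
        raise ValueError("f_value must be >= 0 and df_error must be > 0")
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)


# Main effect of "group" on hunger ratings as reported: F(2, 83) = 6.7.
p_group = p_value_f_two_numerator_df(6.7, 83)
# 3-way interaction on hunger ratings as reported: F(2, 83) = 4.0.
p_interaction = p_value_f_two_numerator_df(4.0, 83)
print(f"group: p = {p_group:.3f}; group x time x sex: p = {p_interaction:.3f}")
```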


SD, Standard Deviation.

#### Satiety Ratings

The 3-way ANCOVA for post-intervention satiety ratings, controlled for baseline levels, revealed a significant main effect of "group" [F<sub>group</sub>(2, 83) = 11.1, p < 0.001]. Bonferroni-corrected post hoc tests indicated significantly higher satiety ratings in the satiety group than in the control group (p < 0.001) and in the appetite group (p < 0.001) (**Figure 3**, **Table 2**). The 3-way interaction between "group," "time," and "sex" was not significant [F<sub>group×time×sex</sub>(2, 83) = 2.5, p = 0.102].

#### Ghrelin Levels

The 3-way ANCOVA for post-intervention plasma ghrelin levels, controlled for baseline levels, revealed a significant 2-way interaction between "group" and "sex" [F<sub>group×sex</sub>(2, 71) = 3.4, p = 0.040]. Separate ANCOVAs for male and female participants indicated a significant main effect of "group" in women [F(2, 37) = 4.4, p = 0.019] but not in men [F(2, 33) = 1.5, p = 0.235]. Bonferroni-corrected post hoc tests indicated that the group effect in women was due to higher post-intervention ghrelin levels in the appetite group compared to the control group (p = 0.019; **Figure 4**, **Table 2**). Neither the main effect of "group" [F<sub>group</sub>(2, 71) = 0.9, p = 0.401] nor the 3-way interaction between "group," "time," and "sex" [F<sub>group×time×sex</sub>(2, 71) = 2.7, p = 0.075] was significant.

### Heart Rate

The 3-way ANCOVA for post-intervention heart rate, controlled for baseline levels, revealed no significant main or interaction effects.

#### Treatment Guesses

Thirteen participants (72.2%) in the satiety group, but only five participants (27.8%) in the appetite group, guessed that they had received the active agent. The difference between groups was significant (χ² = 5.1, p = 0.024).
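The comparison of treatment guesses can be reproduced from the counts as a 2×2 Pearson chi-square test. This is a minimal sketch under stated assumptions: each placebo group is taken to have 30 participants who all gave a guess (13 vs. 17 in the satiety group, 5 vs. 25 in the appetite group), no continuity correction is applied, and the function name is hypothetical. Whether the authors used exactly this variant of the test is not stated in the text.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). The p-value uses the identity that the
    upper tail of a chi-square distribution with 1 df equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("each row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / denominator
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Rows: satiety group, appetite group; columns: guessed verum, guessed placebo.
# Group sizes of 30 are an assumption based on the reported allocation.
chi2, p = chi_square_2x2(13, 17, 5, 25)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the statistic is about 5.1 with p of about 0.024.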

### DISCUSSION

This is the first study designed to evaluate the effects of treatment-related expectations on appetite, satiety, and associated plasma ghrelin levels. Our randomized-controlled double-blinded experiment revealed that the satiety-enhancing placebo intervention successfully altered subjective feelings of appetite and satiety in the suggested direction. Furthermore, the appetite-enhancing placebo intervention increased ghrelin levels in women.

A recent meta-analysis of sham-controlled surgery trials suggested that bariatric surgery for obesity is associated with a large placebo effect on weight loss, equaling 71% of the effect of active bariatric surgery (4). Our finding that the satiety-enhancing placebo intervention indeed increased satiety provides the first experimental evidence that treatment-related expectations contribute to the success of satiety-enhancing medical interventions.

TABLE 2 | Post-intervention values (baseline-adjusted) of hunger ratings, satiety ratings, plasma ghrelin levels, and heart rate.


SE, Standard Error; VAS, Visual Analogue Scale.

The retrospective evaluation of treatment guesses suggests that the appetite-enhancing placebo intervention was less credible to the participants than the satiety-enhancing placebo intervention. This could explain why the appetite-enhancing placebo intervention did not alter subjective feelings of appetite and satiety. However, the guess of having received placebo does not necessarily mean that the participant did not believe in the effectiveness of the intervention. Recent studies indicate that open-label placebo administration can also lead to positive beliefs and symptom improvement (23–25). Supporting this explanation, we observed an increase in ghrelin levels following the appetite-enhancing placebo intervention in women, suggesting the occurrence of a placebo effect on a physiological level. Ghrelin is secreted by the stomach, with levels peaking just before a meal and declining after feeding. In addition, ghrelin serves as an interoceptive signal for food-seeking behavior (26, 27). A previous study in a predominantly female cohort (65% women) found that plasma ghrelin levels increased when participants anticipated the intake of an "indulgent" milk-shake as compared to a "low-calorie" milk-shake, even though hunger ratings did not change (7). This may indicate that ghrelin is a highly sensitive measure to capture a psychologically mediated increase in appetite that occurs even before behavioral effects are measurable. With regard to the observed sex difference, it is important to note that stronger physiological placebo responses in women have also been reported in studies of placebo analgesia (15, 28). In addition, there is ample evidence that the physiology of appetite differs between sexes (29, 30). For example, women showed higher brain activation in the fusiform gyrus while viewing high-caloric pictures in the hungry state (31).
Furthermore, brain activation to calorie-rich foods within the dorsolateral, ventrolateral, and ventromedial prefrontal cortices, the middle/posterior cingulate, and the insula was larger in women than in men (32), regions that play a role in self-reflection (33). Interestingly, sex differences in eating behavior are mediated, among other factors, by the gut hormone ghrelin (14), both in terms of secretion of this hormone and of ghrelin sensitivity (34, 35). Thus, the sex-specific ghrelin response in our experiment is in line with previous studies showing a stronger physiological response to placebo interventions as well as to appetite-enhancing food stimuli in women.

Our results provide the first evidence that a placebo intervention to enhance appetite may enhance ghrelin secretion in women even before behavioral effects are measurable. In contrast, we found a strong effect of the satiety-enhancing intervention on ratings of hunger and satiety, notably without changes in circulating levels of total ghrelin. These data collectively suggest that ghrelin secretion is most likely unrelated to the placebo effect on satiety. It could be argued that food ingestion is a prerequisite for the postprandial fall in circulating ghrelin. However, as demonstrated in healthy human volunteers, postprandial suppression of ghrelin secretion did not differ between subjects who received a mixed meal and those who were sham fed, which allowed smelling, chewing, and tasting but not swallowing of food (26, 36). An anorexigenic hormone such as leptin or peptide YY (37) may still be better suited to capture the hormonal correlates of the placebo effect on satiety.

Several limitations of our results need to be mentioned. First, the short observational period in our experiment does not allow any conclusion on whether placebo effects on hunger and satiety can last longer than a few hours. Second, we performed our experiment in a normal-weight sample. Further studies are needed to clarify whether the findings of our experiment can be replicated in obese and anorectic patients. Third, our study was designed to investigate placebo effects on hunger and satiety induced by verbal suggestions. Learning mechanisms, such as behavioral conditioning and reinforcement learning, are known to affect eating behavior (38) as well as placebo effects (39), and their involvement in placebo effects on appetite-regulation should be evaluated in follow-up studies.

In conclusion, the results of the present study indicate a powerful inhibition of appetite in response to a satiety-enhancing placebo intervention and first evidence for an increase of ghrelin levels in women in response to an appetite-enhancing placebo intervention. Results thus provide the first experimental evidence that expectations are important factors to consider when evaluating the effects of medical interventions in the field of appetite regulation. Further studies with additional physiological outcome parameters are needed to better understand the psychobiological processes triggered by appetite-modulating placebo interventions.

#### AUTHOR CONTRIBUTIONS

VH and KM designed the experiments. VH, ML and JM performed the experiments. VH and KM analyzed the data. VH, TM, MT, and KM interpreted the data. VH drafted the first version of the manuscript. All authors critically reviewed the manuscript.

#### ACKNOWLEDGMENTS

KM received support from the Theophrastus Foundation, Germany, and the Schweizer-Arau Foundation, Germany; KM and MT received support from the Deutsche Forschungsgemeinschaft (ME-3675/1-1).

### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Hoffmann, Lanz, Mackert, Müller, Tschöp and Meissner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Influence of Expectancy Level and Personal Characteristics on Placebo Effects: Psychological Underpinnings

Lili Zhou<sup>1,2</sup>, Hua Wei<sup>3</sup>, Huijuan Zhang<sup>1,2</sup>, Xiaoyun Li<sup>3</sup>, Cunju Bo<sup>4</sup>, Li Wan<sup>4</sup>\*, Xuejing Lu<sup>1,2</sup>\* and Li Hu<sup>1,2,3,4</sup>

*<sup>1</sup> CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China, <sup>2</sup> Department of Psychology, University of Chinese Academy of Sciences, Beijing, China, <sup>3</sup> Key Laboratory of Cognition and Personality, Ministry of Education, Faculty of Psychology, Southwest University, Chongqing, China, <sup>4</sup> Department of Pain Management, The State Key Clinical Specialty in Pain Medicine, The Second Affiliated Hospital of Guangzhou Medical University, Guangzhou, China*

#### Edited by:

*Luana Colloca, University of Maryland, Baltimore, United States*

#### Reviewed by:

*Meike C. Shedden-Mora, University Medical Center Hamburg-Eppendorf, Germany Karin Meissner, Ludwig Maximilian University of Munich, Germany*

#### \*Correspondence:

*Xuejing Lu luxj@psych.ac.cn Li Wan wanli5000cn@163.com*

#### Specialty section:

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

Received: *30 April 2018* Accepted: *11 January 2019* Published: *05 February 2019*

#### Citation:

*Zhou L, Wei H, Zhang H, Li X, Bo C, Wan L, Lu X and Hu L (2019) The Influence of Expectancy Level and Personal Characteristics on Placebo Effects: Psychological Underpinnings. Front. Psychiatry 10:20. doi: 10.3389/fpsyt.2019.00020*

Placebo effects, which benefit a wide range of clinical practice, can be profoundly influenced by expectancy level and personal characteristics. However, research on whether these factors independently or interdependently affect placebo effects is still in its infancy. Here, we adopted a 3-day between-subject placebo analgesia paradigm (2-day conditioning and 1-day test) to investigate the influence of expectancy levels (i.e., No, Low, and High) and personal characteristics (i.e., gender, dispositional optimism, and anxiety state) on placebo effects in 120 healthy participants (60 females). Our results showed that the reduction of pain intensity in the test phase was influenced by the interaction between expectancy and gender, as mainly reflected by greater reductions of pain intensity in females at Low expectancy level than females at No/High expectancy levels, and greater reductions of pain intensity in males than in females at High expectancy level. Additionally, the reduction of pain unpleasantness was modulated not only by the interaction between expectancy and gender, but also by the interaction between expectancy and dispositional optimism, as well as the interaction between expectancy and anxiety state. Specifically, participants who were more optimistic in the Low expectancy group, or those who were less anxious in the High expectancy group, showed greater reductions of pain unpleasantness. To sum up, we emphasize the importance of regulating the expectancy level individually, based on an assessment of personal characteristics, to maximize placebo effects in clinical conditions.

Keywords: expectancy, placebo analgesia, gender, dispositional optimism, anxiety state

## INTRODUCTION

Placebo commonly refers to an inert substance or a medicinally inactive treatment that can generate clinically-useful effects. A person who receives a placebo treatment usually experiences actual improvements in his/her physical condition, which is well-known as the placebo effect. The placebo effect can be beneficial in a wide range of clinical situations, such as modulating the therapeutic effects of deep brain stimulation on Parkinson's disease, generating antidepressant responses in depression, and reducing unpleasantness in patients with anxiety (1, 2). It can also enhance the effectiveness of physical interventions (3–5) and provide an alternative approach to avoid side effects of drug treatments (6, 7).

Although placebo effect is a complex phenomenon that can be affected by multiple factors (e.g., memory, motivation, anxiety, learning, patient-provider interaction, and previous treatment experience) (1, 8–10), response expectancy has been recognized as one of the main psychological mechanisms underlying this effect (11). Response expectancy has been defined as the expectancy to the occurrence of non-volitional responses (i.e., responses experienced as occurring automatically without volitional efforts, including fear, sadness, sexual arousal, pain, etc.) to situational cues (12). According to Response Expectancy Theory, such response expectancy could affect the probability that an individual would engage in a particular behavior (e.g., increased/decreased pain responses), as non-volitional responses have positive and negative reinforcement values (12). Consistent with this theory, accumulating evidence has shown that the placebo effect (e.g., placebo analgesia) could be altered by changing individual expectancy (3, 7, 13–19).

In general, response expectancy is composed of two distinct aspects: (1) the expected magnitude of a change (i.e., expectancy level), and (2) the subjective probability that the change will occur (i.e., individual belief) (12). With regard to the first aspect of response expectancy, a greater placebo effect is usually associated with a higher level of positive expectancy (4, 5, 20–23). However, this association does not hold under all circumstances. For example, in laboratory settings, even when an individual expectation of placebo effects has been successfully acquired during the classical conditioning phase, an unrealistically high expectancy that does not match one's present experience would weaken individual belief in the placebo treatment during the test phase (24).

In terms of the second aspect of response expectancy, previous studies have demonstrated that individual belief in the current experience has a critical influence on response expectancy through learning mechanisms (21, 24). Such a belief is easily affected by personal characteristics, thus contributing to the differentiation of individual expectancy (25, 26), and subsequently leading to individual variability in response to placebos (27, 28). For instance, gender has been identified as a factor contributing to the variability of placebo effects: some studies suggested that males reported greater pain reductions after placebo treatments compared to females (29–31), whereas other studies described a better response to placebos in females than in males (29, 32–34). Additionally, the influence of other personal characteristics on placebo responses has been frequently reported in the literature. For example, dispositional optimism, defined as a generalized positive outcome expectancy for the future (35), is closely linked to a proneness to increased placebo effects (36, 37). Similarly, individuals with a low anxiety level are more likely to respond to a placebo treatment (36).

However, research on the issue of whether expectancy level and personal characteristics independently or interdependently affect the placebo effects is still in its infancy. Here, we adopted a between-subject placebo analgesia paradigm to test the influence of expectancy levels (i.e., No, Low, and High) and personal characteristics (i.e., gender, dispositional optimism, and anxiety state) on placebo effects.

### MATERIALS AND METHODS

### Participants

A total of 120 healthy, right-handed participants (60 females) were recruited from the local community. None of them reported a history of illness or concurrent medication. Participants were informed that they were attending a study aimed to test the effect of lidocaine (a local anesthetic that could be topically applied on the skin) on alleviating pain, and they were asked not to consume products containing caffeine, alcohol, or nicotine at least 12 h before the experiment. All the participants gave their written informed consents and were told their rights to discontinue participation at any time during the study. Each participant was randomly assigned to one of the three experimental groups divided by the manipulated expectancy levels (i.e., No, Low, and High) during the Conditioning phase (as described below), with 40 participants (20 females) in each group. After the whole experiment, all participants were fully debriefed.

### Experimental Materials

#### Pain Stimuli

The electrical pain stimuli were delivered using a constant-current stimulator (model DS7A; Digitimer, UK) with three stainless steel concentric bipolar needle electrodes (38, 39). Pain stimuli were intraepidermal electrical pulses delivered to the inner side of the left forearm through the electrodes (arranged in an equilateral triangle), a method that has been shown to preferentially activate the Aδ nociceptive fibers in the superficial skin layers (40, 41). Each electrode consisted of a needle cathode (length = 0.1 mm, diameter = 0.2 mm) surrounded by a cylindrical anode (diameter = 1.4 mm). Each stimulus consisted of 100 rapidly succeeding constant-current, square-wave pulses at 50 Hz (0.5-ms duration for each pulse).
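As a quick consistency check on the pulse-train parameters above, the following minimal sketch computes the timing of one stimulus from the stated values (100 pulses, 50 Hz, 0.5 ms per pulse). The variable names are illustrative only and do not come from the original study.

```python
# Timing of one stimulus train, using only the parameters stated in the text:
# 100 square-wave pulses at 50 Hz, each pulse lasting 0.5 ms.
N_PULSES = 100
FREQ_HZ = 50
PULSE_WIDTH_MS = 0.5

period_ms = 1000 / FREQ_HZ                        # 20 ms between pulse onsets
onsets_ms = [i * period_ms for i in range(N_PULSES)]
train_duration_s = N_PULSES * period_ms / 1000    # nominal length of the train
duty_cycle = PULSE_WIDTH_MS / period_ms           # fraction of time current flows

print(f"period: {period_ms} ms, last pulse onset: {onsets_ms[-1]} ms")
print(f"nominal train duration: {train_duration_s} s, duty cycle: {duty_cycle:.1%}")
```

The nominal train duration of 2 s agrees with the 2 s stimulus duration given in the description of the Conditioning phase.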

#### Dispositional Optimism

The Chinese version of the Life Orientation Test-Revised (LOT-R) was adopted to assess participants' dispositional optimism, as its reliability has been well-established (Cronbach alpha of positive subscale = 0.73, N = 479; Cronbach alpha of negative subscale = 0.82, N = 479) (42). In the current sample, the reliability of the scale was satisfactory (Cronbach alpha = 0.66, N = 120).

#### Anxiety State

The state subscale of the Chinese version of the State-Trait Anxiety Inventory (STAI-S) was adopted to assess participants' anxiety state. The reliability of the Chinese version of the STAI-S (Cronbach alpha = 0.90, N = 2,150) (43) has been well-established. Notably, the reliability of the subscale in the current sample was satisfactory (STAI-S: Cronbach alpha = 0.89, N = 120).

### Experimental Procedure

A randomized, single-blinded between-subject experimental paradigm of placebo analgesia was adopted in the present study (7). Participants were firstly familiarized with the electrical stimulation prior to the formal experiment. The stimulus intensities were adjusted individually using the method of limits, to identify the thresholds for each participant that would elicit a low sensation (∼2 rating), moderate sensation (∼4 rating), and high sensation (∼6 rating) on an 11-point self-report Numeric Rating Scale (NRS, 0 = no sensation, 10 = unbearable pain). Specifically, the stimuli at ∼2 rating elicited a non-painful sensation, whereas the stimuli at ∼4 and ∼6 ratings elicited a painful pinprick sensation. Once these stimulus intensities were determined, a randomized sequence of pain stimuli with different intensities was delivered to participants until they were able to reliably distinguish the intensities of these stimuli. Notably, these determined stimuli with varied intensities were used during the conditioning procedure (see Conditioning Phase section) to ensure a successful manipulation of expectancy level during the experiment.

The experiment consisted of two phases over three consecutive days: Conditioning phase (Day 1 and Day 2) and Test phase (Day 3). On each day, participants underwent three sessions: (1) a pre-treatment session, (2) a treatment session, and (3) a post-treatment session (see **Figure 1**). To rule out possible confounding effects related to the gender of the experimenter (44, 45), half of the participants in each group, with an equal number of males (n = 20) and females (n = 20), were instructed by a female experimenter, while the rest were guided by a male experimenter. Both the female and the male experimenter wore white coats and had received systematic training in the procedure prior to the formal experiment.
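The counterbalancing described above can be illustrated with a minimal sketch of balanced random assignment. It assumes 60 female and 60 male participants (from the Participants section) and therefore 10 participants per cell of expectancy group × participant gender × experimenter gender. The function name, participant IDs, and seed are hypothetical; the original randomization procedure is not described in the text.

```python
import random
from collections import Counter
from itertools import product

def assign(participants_by_gender, seed=0):
    """Balanced random assignment to expectancy group x experimenter gender."""
    rng = random.Random(seed)
    # Six cells per participant gender: 3 expectancy levels x 2 experimenters.
    cells = list(product(["No", "Low", "High"], ["female_exp", "male_exp"]))
    assignment = {}
    for gender, ids in participants_by_gender.items():
        shuffled = ids[:]            # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        per_cell = len(shuffled) // len(cells)
        for k, cell in enumerate(cells):
            for pid in shuffled[k * per_cell:(k + 1) * per_cell]:
                assignment[pid] = (gender, *cell)
    return assignment

# Hypothetical participant IDs: 60 females and 60 males.
participants = {
    "F": [f"F{i:02d}" for i in range(60)],
    "M": [f"M{i:02d}" for i in range(60)],
}
assignment = assign(participants)
counts = Counter(assignment.values())
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Shuffling within each gender before slicing keeps every cell the same size while leaving the identity of the participants in each cell random.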

#### Conditioning Phase

The Conditioning phase started with a pre-treatment session consisting of 20 trials. Each trial started with a 1 s white fixation centered on the screen with a black background. After a 5 s wait, an electrical stimulus at ∼6 rating (0.80 ± 0.29 mA) lasting for 2 s was delivered to the left forearm of the participant. After waiting for another 5 s, participants were required to verbally rate the perceived intensity (0 = no sensation, 10 = unbearable pain) and unpleasantness (0 = not unpleasant, 10 = extremely unpleasant) of pain evoked by the electrical stimulus (8 s per rating, 16 s in total). The inter-trial interval varied between 8 and 12 s. The stimulus intensity in the pre-treatment session was identical across groups and the whole session lasted for ∼16 min.

In the treatment session, a non-active skin cream was applied on the palmar side of the participant's left forearm. After waiting for 5 min, participants were instructed to remove the cream and rest for 10 min. Meanwhile, in order to strengthen the expectancy level, participants were given one of the following verbal interventions, depending on treatment assignment:


The treatment session lasted for ∼15 min.

The Conditioning phase ended with a post-treatment session consisting of 40 trials. The procedure was identical to the pre-treatment session, except that different intensities of electrical stimuli were set for different groups: inducing a painful sensation at ∼6 rating (0.69 ± 0.16 mA) for the No expectancy group, a painful sensation at ∼4 rating (0.47 ± 0.17 mA) for the Low expectancy group, and a non-painful sensation at ∼2 rating (0.28 ± 0.08 mA) for the High expectancy group. Such changes of stimulus intensity for different groups were intended to strengthen the effect of the verbal intervention on response expectancy, an approach that has been frequently applied in previous placebo-related studies (46–48). The post-treatment session lasted for ∼37 min.

#### Test Phase

The Test phase also consisted of a pre-treatment session, a treatment session, and a post-treatment session. The procedure of this phase was identical to the Conditioning phase, except that the intensity of electrical stimuli applied in the post-treatment session was identical for all participants across groups, i.e., inducing a painful pinprick sensation at ∼6 rating (0.78 ± 0.26 mA). Participants were first required to complete the psychological questionnaires upon arriving at the laboratory on the Test day. To make sure that the expectancy manipulation was successful, participants in the Low and High expectancy groups were required to verbally rate the strength of their expectancy of drug efficacy on an 11-point NRS (0 = without any expectancy, 10 = full expectancy) at the end of the test. The average ratings of expectancy of drug efficacy were significantly different between groups (Low: 6.90 ± 1.31, High: 8.39 ± 0.80, t = −6.11, P < 0.001), indicating a successful expectancy manipulation. For participants in the No expectancy group, the expectancy of drug efficacy was not assessed to avoid extra response bias to the expectancy manipulation.
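The manipulation check above can be approximately reproduced from the reported summary statistics. The sketch below computes an independent-samples t statistic with pooled variance, assuming that the reported values are means ± SD and that n = 40 per group (taken from the Participants section). The text does not state which t-test variant was used, so this is an illustration rather than the authors' exact computation.

```python
import math

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic (pooled variance) from summary statistics."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, n1 + n2 - 2

# Low vs. High expectancy ratings; n = 40 per group is an assumption.
t_stat, df = t_from_summary(6.90, 1.31, 40, 8.39, 0.80, 40)
print(f"t({df}) = {t_stat:.2f}")
```

Under these assumptions the result is about −6.14, close to the reported t = −6.11; the small difference is plausibly due to rounding of the published means and standard deviations.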

### Statistical Analysis

To assess the magnitude of placebo effects, we calculated the changes of subjective pain intensity and unpleasantness by subtracting the ratings in the post-treatment session from those in the pre-treatment session in the Test phase (49). To demonstrate the influence of personal characteristics on the modulation of placebo effects by expectancy level, we performed a statistical analysis using a "split into three subgroups" strategy (13-14-13 split). Specifically, we sorted the LOT-R scores in ascending order and split the data into low (13 participants), middle (14 participants), and high (13 participants) LOT-R subgroups for each experimental condition. We then performed three-way analyses of variance (ANOVAs) on the indicators of placebo effects, with "expectancy level" (No, Low, and High), "dispositional optimism (LOT-R)" (low and high), and "gender" (female and male) as between-subject factors. Likewise, scores of anxiety state (STAI-S) were sorted and analyzed using the same statistical strategy. The statistical P-values were adjusted with Greenhouse-Geisser correction to avoid violation of the sphericity assumption, when necessary. Post hoc pairwise comparisons were performed with Bonferroni adjustments when the main effects or interactions reached statistical significance. The effect size and statistical power in the present sample were estimated by partial eta-squared and 1-β, respectively. For partial eta-squared (η<sub>p</sub><sup>2</sup>), an effect size of 0.0099 is deemed a "small" effect, around 0.0588 a "medium" effect, and 0.1379 to infinity a "large" effect (50). For 1-β, 0.8 is the commonly accepted statistical power. To check the adequacy of the sample size used in the current study, we performed an a priori computation of the required sample size using G*Power (free software for power analysis, available at http://www.gpower.hhu.de/en.html) by setting statistical power at 0.8 with a large effect size (η<sub>p</sub><sup>2</sup> = 0.1379). The result showed a minimal sample size of 64 in total to detect the main effects and interactions between independent variables, which indicated that our sample size (N = 120) was sufficient to detect these effects. All statistical analyses were carried out in the SPSS 22.0 statistical analysis package (SPSS Inc., New York, USA). The statistical threshold was set at 0.05.
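The analysis steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' SPSS procedure: averaging ratings over trials before subtraction and the handling of tied scores are assumptions, and all function names and example values are hypothetical. The last function uses the standard conversion from Cohen's f to partial eta-squared, η<sub>p</sub><sup>2</sup> = f² / (1 + f²), which reproduces the three effect-size thresholds quoted in the text from the conventional benchmarks f = 0.10, 0.25, and 0.40.

```python
def placebo_effect(pre_ratings, post_ratings):
    """Mean pre-treatment rating minus mean post-treatment rating.

    Positive values indicate a reduction after treatment. Averaging over
    trials before subtracting is an assumption of this sketch.
    """
    pre_mean = sum(pre_ratings) / len(pre_ratings)
    post_mean = sum(post_ratings) / len(post_ratings)
    return pre_mean - post_mean

def split_13_14_13(scores):
    """Label 40 scores as low (13), middle (14), or high (13) by ascending rank.

    Ties are broken by original order, which is an assumption; the text
    does not say how tied questionnaire scores were handled.
    """
    if len(scores) != 40:
        raise ValueError("the 13-14-13 split expects 40 scores per group")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [None] * len(scores)
    for rank, idx in enumerate(order):
        labels[idx] = "low" if rank < 13 else "middle" if rank < 27 else "high"
    return labels

def eta_p2_from_cohen_f(f):
    """Convert Cohen's f to partial eta-squared."""
    return f**2 / (1 + f**2)

# Hypothetical ratings and questionnaire scores, for illustration only.
effect = placebo_effect([6, 6, 7], [4, 5, 5])
labels = split_13_14_13([(7 * i) % 25 for i in range(40)])
thresholds = {f: round(eta_p2_from_cohen_f(f), 4) for f in (0.10, 0.25, 0.40)}

print(f"placebo effect: {effect:.2f}")
print({name: labels.count(name) for name in ("low", "middle", "high")})
print(thresholds)
```

Only the low and high subgroups enter the ANOVA as levels of the personal-characteristic factor, so the middle subgroup is labeled here but would be excluded from that analysis.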

## RESULTS

### Participant Characteristics

Participant characteristics for each experimental group are summarized in **Table 1**. Age was not significantly associated with "expectancy level" (No, Low, and High) and "gender" (female and male) [F(2, 114) = 2.84, P = 0.06, η<sub>p</sub><sup>2</sup> = 0.05]. This result, together with the counterbalanced experimental design for gender, indicated that all the participants across groups were age- and gender-matched, thus avoiding possible bias when assessing placebo effects.

### Influence of Expectancy Level, Dispositional Optimism, and Gender on Placebo Effects

Significant main effects of "expectancy level" [F(2, 72) = 7.06, P = 0.002, η<sub>p</sub><sup>2</sup> = 0.172] and "dispositional optimism (LOT-R)" [F(1, 72) = 5.18, P = 0.026, η<sub>p</sub><sup>2</sup> = 0.071] were observed, while no significant main effect of "gender" [F(1, 114) = 0.04, P = 0.848, η<sub>p</sub><sup>2</sup> = 0.001] was found for the reduction of pain intensity (see **Table 2**). Post hoc comparisons on "expectancy level" showed that participants in the Low expectancy group showed a greater reduction of pain intensity than those in the No (P < 0.001) and High (P = 0.055) expectancy groups, while the latter two did not differ significantly (P = 0.29). Neither the interaction between "expectancy level" and "dispositional optimism (LOT-R)" [F(1, 72) = 0.79, P = 0.460, η<sub>p</sub><sup>2</sup> = 0.023] (**Figure 3**, left panel), nor the interaction between "dispositional optimism (LOT-R)" and "gender" [F(2, 72) = 0.20, P = 0.655, η<sub>p</sub><sup>2</sup> = 0.003] (see **Figure 2**, left panel) was significant. However, the interaction between "expectancy level" and "gender" [F(2, 114) = 4.29, P = 0.018, η<sub>p</sub><sup>2</sup> = 0.112] was significant. Post hoc comparison on this interaction revealed that (1) female participants in the Low expectancy group reported a greater reduction of pain intensity due to placebo treatment than females in the No (P < 0.001) and High (P = 0.001) expectancy groups; (2) for participants in the High expectancy group, males reported a greater reduction of pain intensity due to placebo treatment than females (P = 0.01).

*F, Female; M, Male.*

TABLE 2 | The changes of pain intensity from pre-treatment to post-treatment sessions in all experimental groups (data are expressed as M ± SD).

*F, Female; M, Male; LOT-R, Life Orientation Test-Revised; STAI-S, State subscale of State-Trait Anxiety Inventory. Changes of pain intensity were obtained by subtracting the ratings in post-treatment sessions from those in pre-treatment sessions.*

In contrast, main effects of "expectancy level" [F(2, 72) = 1.00, P = 0.374, η<sub>p</sub><sup>2</sup> = 0.029], "dispositional optimism (LOT-R)" [F(1, 72) = 0.05, P = 0.832, η<sub>p</sub><sup>2</sup> = 0.001], and "gender" [F(1, 72) = 0.03, P = 0.874, η<sub>p</sub><sup>2</sup> < 0.001] were not significant for the reduction of pain unpleasantness (see **Table 3**). Except for the non-significant interaction between "dispositional optimism (LOT-R)" and "gender" [F(2, 72) = 0.78, P = 0.379, η<sub>p</sub><sup>2</sup> = 0.011], both the interaction between "expectancy level" and "dispositional optimism (LOT-R)" [F(2, 72) = 3.26, P = 0.044, η<sub>p</sub><sup>2</sup> = 0.084] (**Figure 3**, right panel), and the interaction between "expectancy level" and "gender" [F(2, 72) = 3.38, P = 0.040, η<sub>p</sub><sup>2</sup> = 0.091] (**Figure 2**, right panel) were statistically significant. Post hoc comparison on the interaction between "expectancy level" and "dispositional optimism (LOT-R)" showed that (1) for participants with high LOT-R scores, those in the Low expectancy group experienced a greater reduction of unpleasantness due to placebo treatment than those in the No expectancy group (P = 0.046); (2) for the Low expectancy group, participants with high LOT-R scores tended to report a greater reduction of unpleasantness than those with low LOT-R scores (P = 0.056; see **Table 3**). Similar to the results of pain intensity, post hoc pairwise comparisons on the interaction between "expectancy level" and "gender" showed that (1) female participants in the Low expectancy group reported a greater reduction of unpleasantness due to placebo treatment than those in the High expectancy group (P = 0.039); (2) for participants in the High expectancy group, males tended to report a greater reduction of unpleasantness due to placebo treatment than females (P = 0.073).

TABLE 3 | The changes of pain unpleasantness from pre-treatment to post-treatment sessions in all experimental groups (data are expressed as M ± SD).

*F, Female; M, Male; LOT-R, Life Orientation Test-Revised; STAI-S, State subscale of State-Trait Anxiety Inventory. Changes of pain unpleasantness were obtained by subtracting the ratings in post-treatment sessions from those in pre-treatment sessions.*

### Influence of Expectancy Level, Anxiety State, and Gender on Placebo Effects

For the reduction of pain intensity, there was a significant main effect of "expectancy level" [F(2, 72) = 4.14, P = 0.02, η<sub>p</sub><sup>2</sup> = 0.026]. Post hoc comparisons showed that participants in the Low expectancy group reported a greater reduction of pain intensity than those in the No expectancy group (P = 0.007). No significant main effect of "gender" [F(1, 72) = 3.07, P = 0.082, η<sub>p</sub><sup>2</sup> = 0.09] or "anxiety state (STAI-S)" [F(1, 72) = 1.21, P = 0.26, η<sub>p</sub><sup>2</sup> = 0.02] was found. No significant interaction between "expectancy level" and "anxiety state (STAI-S)" [F(2, 72) = 1.68, P = 0.19, η<sub>p</sub><sup>2</sup> = 0.05] (see **Figure 4**, left panel), or between "anxiety state" and "gender" [F(2, 72) = 1.27, P = 0.26, η<sub>p</sub><sup>2</sup> = 0.02] was observed. However, the interaction between "expectancy level" and "gender" was significant [F(2, 72) = 4.04, P = 0.02, η<sub>p</sub><sup>2</sup> = 0.11]. Post hoc comparison on this interaction revealed the same pattern as the results reported in the previous section ("Influence of expectancy level, dispositional optimism, and gender on placebo effects").

With regard to the reduction of unpleasantness, no main effect of "expectancy level" [F(2, 72) = 0.12, P = 0.891, η<sub>p</sub><sup>2</sup> = 0.003], "anxiety state (STAI-S)" [F(1, 72) = 0.07, P = 0.796, η<sub>p</sub><sup>2</sup> = 0.001], or "gender" [F(1, 72) = 1.28, P = 0.263, η<sub>p</sub><sup>2</sup> = 0.018] was observed. The interaction between "expectancy level" and "anxiety state (STAI-S)" [F(2, 72) = 4.05, P = 0.022, η<sub>p</sub><sup>2</sup> = 0.106] (**Figure 4**, right panel), and the interaction between "expectancy level" and "gender" [F(2, 72) = 4.49, P = 0.015, η<sub>p</sub><sup>2</sup> = 0.117] (**Figure 2**, right panel) were significant. Post hoc pairwise comparisons on the interaction between "expectancy level" and "anxiety state (STAI-S)" showed that for participants with low STAI-S scores, those in the High expectancy group felt less pain unpleasantness due to the placebo treatment than those in the No expectancy group (P = 0.027). Post hoc comparison on the interaction between "expectancy level" and "gender" revealed the same pattern as the results reported in the previous section ("Influence of expectancy level, dispositional optimism, and gender on placebo effects"). No significant interaction was observed between "anxiety state (STAI-S)" and "gender" [F(2, 72) = 2.39, P = 0.127, η<sub>p</sub><sup>2</sup> = 0.034].

### DISCUSSION

In the present study, we demonstrated that placebo effects were not only influenced by expectancy level or personal characteristics alone, but also depended on their interactions. Specifically, we observed that the reductions of pain intensity and pain unpleasantness in the Test phase were influenced by the interaction between expectancy level and gender, as mainly reflected by greater reductions of pain intensity and pain unpleasantness in females at Low expectancy level than females at No/High expectancy levels, and greater reductions of pain intensity and pain unpleasantness in males than in females at High expectancy level. Additionally, the reduction of pain unpleasantness was modulated by the interaction between expectancy level and dispositional optimism, as well as the interaction between expectancy level and anxiety state. Participants who were more optimistic in Low expectancy group, or those who were less anxious in High expectancy group showed greater reductions of pain unpleasantness.

Firstly, placebo effects, defined as the reduction of pain intensity or unpleasantness, depended on the interaction between expectancy level and gender. We found that female participants showed maximal placebo effects in the Low rather than the No and High expectancy groups, which is in line with a previous study showing that placebo effects, as quantified by systolic blood pressure, alertness, and tension, were stronger at a moderate expectancy level than at extremely low and high expectancy levels (24). In other words, a realistic, reasonable expectancy, rather than an unrealistic one, is more likely to strengthen an individual's belief in the treatment, which is essential to maximize placebo effects (21, 24, 51). However, when compared with female participants, males exhibited a different pattern: in the High expectancy group, they reported a greater reduction of pain intensity/unpleasantness due to placebo treatment. These findings suggest that a placebo response to an expectancy manipulation can vary considerably by gender. However, we did not observe a main effect of gender on placebo responses, which is inconsistent with a few previous studies (33, 34, 52). Admittedly, the issue of gender differences in placebo responses is still controversial, and further investigation is needed. Notably, gender-specific placebo effects would have important implications for medical research and clinical conditions, such as pain and neurological disorders, in which placebo responses are commonly considered relevant (53, 54).

Inventory (Low STAI-S and High STAI-S) are marked in blue and red, respectively.

Secondly, we provided evidence that placebo effects, defined as the reduction of pain unpleasantness, were influenced by the interaction between expectancy level and other personal characteristics, such as dispositional optimism and anxiety state. Previous evidence indicated that individuals with high scores of dispositional optimism or low scores of anxiety state were more likely to respond to placebo treatment (36, 37). This is in line with our results demonstrating that optimists (those with high LOT-R scores) or participants with low STAI-S scores showed greater placebo responses after treatments. Being more optimistic or less anxious appears to have a positive influence on the experience of a placebo treatment, as these individuals tend to hold positive expectations (55). In particular, the present results might help explain the consistent correlation between dispositional optimism and positive medical outcomes (56–58). The different responses of optimists and pessimists (those with low LOT-R scores) to placebo-related expectations may contribute to the discrepancy in placebo responses. Note that such an effect of dispositional optimism on placebo effects was confirmed in our study, but was only observed in the Low expectancy group, suggesting that a realistic, reasonable expectancy, rather than an overly positive one, could optimize the influence of dispositional optimism on the placebo response. In other words, since optimists cannot be driven by negative expectancy as frequently or as forcefully as pessimists can, they might experience fewer negative events. Further, it may be the optimists, not the pessimists, who are most likely to respond to a placebo-related expectancy for positive outcomes. This speculation is consistent with a study on persuasion, in which optimists were more likely than pessimists to be persuaded by positively framed arguments (59). Therefore, an individual with high dispositional optimism might not only be less susceptible to negative expectancy, but also be more likely than those with lower dispositional optimism to benefit from positive expectancy, particularly at a realistically optimized expectancy level. This also suggests that patients with high dispositional optimism should be informed more frequently about a given treatment with realistically positive expectancy to strengthen their responses in medical care. Future studies are needed to explore this issue under clinical conditions.

Importantly, in line with previous studies demonstrating that people's beliefs can be influenced by personal characteristics, such as optimism, neuroticism, and extraversion (25, 26, 60), our observation provides further evidence that the placebo effect can be jointly affected by the expectancy level and personal characteristics, which fits well with the Response Expectancy Theory (24, 51). Notably, there are substantial differences in personal characteristics between healthy populations and patients. For example, depressive, persistent social phobic, neurotic, fearful, and obsessive-compulsive personality characteristics are very common in pain sufferers compared to the healthy population, whereas patients undergoing injectable aesthetic treatments scored significantly higher on extraversion, agreeableness, openness to experience, and neuroticism (61–63). Thus, the next important step is to replicate the main findings of the present study in clinical conditions. Of note, a growing body of neurobiological research on placebo effects indicates the influence of cognitive processing on the modulation of pain perception (64, 65), which implies that an integrated model combining cognitive factors with psychological factors is warranted to comprehensively explore placebo mechanisms.

#### LIMITATIONS

There are two limitations in the present study. First, although we assigned both male and female experimenters randomly to the participants, we did not control the potential influence of experimenters' gender well enough, which could still have increased error variance. Using only a male or only a female experimenter might be more suitable for further investigations. Second, we examined the effect of the interaction between expectancy level and personal characteristics on placebo effects within a non-clinical population, and clinical studies are needed to replicate the main findings of the present study.

#### CONCLUSION

Considering that placebo effects have been recognized as effective psychobiological events contributing to the improvement of overall therapeutic outcomes, we believe that our findings not only advance our understanding of the psychological underpinnings of placebo effects, but also suggest a constructive way (regulating the expectation level individually based on the assessment of personal characteristics) to maximize placebo effects in various clinical applications.

### ETHICS STATEMENT

This experiment was approved by the Ethics Committee of Southwest University, China, and registered with ChiCTR1800014737 through Chinese Clinical Trial Register Centre. All procedures were carried out in accordance with the relevant approved guidelines.

### AUTHOR CONTRIBUTIONS

HW, LW, XJL, and LH conceived and designed the experiments. HW and HZ performed the experiments. LZ, XJL, and LH analyzed the data. LZ, HW, XYL, CB, LW, XJL, and LH wrote the paper. All authors approved the final manuscript and agreed to be accountable for all aspects of the work.

#### FUNDING

This work was supported by the National Natural Science Foundation of China (No. 31671141, 31701000, 31822025), the Informatization Special Project of Chinese Academy of Sciences (No. XXH13506-306), and the Scientific Foundation project of Institute of Psychology, Chinese Academy of Sciences (No. Y6CX281007, Y6CX021008, KLMH2018ZG02). The funders had no role in study design, data collection, data analysis, decision to publish, or preparation of the manuscript.

#### REFERENCES

65. Zhang H, Zhou L, Wei H, Lu X, Hu L. The sustained influence of prior experience induced by social observation on placebo and nocebo responses. J Pain Res. (2017) 10:2769. doi: 10.2147/JPR.S147970

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zhou, Wei, Zhang, Li, Bo, Wan, Lu and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Influencing Side-Effects to Medicinal Treatments: A Systematic Review of Brief Psychological Interventions

Rebecca K. Webster<sup>1,2</sup>\* and G. James Rubin<sup>1,2</sup>

*<sup>1</sup> Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, United Kingdom, <sup>2</sup> The National Institute for Health Research Health Protection Research Unit in Emergency Preparedness and Response, King's College London, London, United Kingdom*

Background: Nocebo effects contribute to a large proportion of the non-specific side-effects attributed to medications and are mainly generated through negative expectations. Previous reviews show that interventions designed to change participants' expectations have a small effect on pain experience. They are also effective in reducing side-effects caused by exposure to sham medications. To date, there has been no review of the influence of such interventions on symptoms attributed to real medicinal treatments.

#### Edited by:

*Luana Colloca, University of Maryland, Baltimore, United States*

#### Reviewed by:

*Meike C. Shedden-Mora, University Medical Center Hamburg-Eppendorf, Germany Karin Meissner, Ludwig Maximilian University of Munich, Germany*

> \*Correspondence: *Rebecca K. Webster Rebecca.webster@kcl.ac.uk*

#### Specialty section:

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

Received: *24 October 2018* Accepted: *24 December 2018* Published: *05 February 2019*

#### Citation:

*Webster RK and Rubin GJ (2019) Influencing Side-Effects to Medicinal Treatments: A Systematic Review of Brief Psychological Interventions. Front. Psychiatry 9:775. doi: 10.3389/fpsyt.2018.00775*

Objective: To review studies using a randomized controlled design testing the effect of brief psychological interventions compared to usual practice on the side-effect experience to medicinal treatments in healthy volunteers and patients.

Methods: We searched Web of Science, Scopus, Medline, PsycINFO, PsycARTICLES, and Cochrane CENTRAL using search terms for randomized controlled trials along with "nocebo," "placebo effect," "medication," "side-effects," and associated terms. Studies were eligible if they studied a human population, used an active medicine, delivered a brief psychological intervention intended to influence side-effect reporting compared to usual care or no intervention, and used a randomized controlled design. Because of the heterogeneity of the literature we used a narrative synthesis and assessed evidence quality using the GRADE approach.

Results: Our database search and supplementary search of the reference sections of included studies retrieved 50,140 citations. After screening, full text review and manual reference searches, 27 studies were included. The quality of the studies and evidence was judged to be low. The strongest and most consistent effect came from omitting side-effect information, although surprisingly de-emphasizing side-effects did not affect side-effect reporting. Other techniques, including priming, distraction, and altering the perception of branding, produced mixed results.

Conclusion: Brief psychological interventions can influence side-effect reporting to active medications. Research is currently investigating new ways to de-emphasize side-effects whilst still upholding informed consent, but larger confirmatory trials with suitable control groups are needed. The literature in this area would be improved by more detailed reporting of studies.

Keywords: review, side-effects, medicine, nocebo effect, interventions, side-effect information

## INTRODUCTION

Nocebo effects, sometimes dubbed the placebo effect's "evil twin," are the experience of noxious symptoms in response to an inert exposure (1). Nocebo effects can also refer to negative clinical outcomes which are not attributable to the actual pharmacological or physiotherapeutic action of an intervention (2). It is estimated that between 38 and 100% of side-effects reported to drugs taken for a large range of medical conditions are related to the treatment context, rather than the active ingredients of the medication itself (3).

These nocebo-related side-effects are important, as they can affect a patient's well-being (4) and influence their decision as to whether to adhere to their treatment regimen (5, 6). For example, adverse media coverage surrounding the safety of statins and their reported side-effects has resulted in around 200,000 patients no longer taking their statins as directed, leading to a predicted increase of 2,000 cardiovascular events in the next decade (7). This is despite the fact that most of these side-effects are probably nocebo-related (8). Perhaps unsurprisingly, side-effects can also result in substantial additional health care costs in terms of additional primary care and hospital visits and also the cost of wasted medication due to non-adherence (9).

Of the multiple factors that may contribute to the development of nocebo effects, expectations of symptoms appear to be the main contributor. These can be generated through verbal and written suggestions about what symptoms to expect, be implied by the apparent dose of a drug, and be learnt through classical conditioning and social observation (10). Studies have used these psychological mechanisms as a means to alter people's experience of experimentally induced pain (11), as well as pain following acute medical procedures, such as injections (12) and surgery (13). These effects have been examined in multiple reviews, showing that brief psychological interventions designed to change expectation of pain following treatment have a small but reliable effect on relieving patients' pain compared to usual care (14–16).

However, to our knowledge, there has been no review of whether such interventions can alter patient experience of side-effects to medicinal treatments. Although evidence demonstrates that such interventions can be effective in altering side-effects reported following exposure to inert substances (10), it is also important to assess if these effects can be transferred to clinical practice. We therefore set out to review studies using a randomized controlled design testing the effect of brief psychological interventions compared to usual practice on the side-effect experience to medicinal treatments in healthy volunteers and patients, in order to answer the question: can brief psychological interventions influence the side-effect experience to medications?

### METHODS

Our reporting of this systematic review adheres to the standards for the Preferred Reporting Items for Systematic reviews and Meta-Analyses (17). The protocol for this review was prospectively registered on PROSPERO (CRD42018091903).

### Identification of Studies

We searched the following electronic databases with a predefined search strategy: Web of Science, Scopus, OvidSp (Medline, PsycINFO, and PsycARTICLES) and Cochrane CENTRAL. We included Web of Science and Scopus for their coverage of the sciences and social sciences. OvidSp was chosen for its coverage of journals chiefly in the area of health sciences, and also for its inclusion of the databases PsycINFO and PsycARTICLES. Cochrane CENTRAL was included due to its coverage of randomized controlled trials and because it includes records which are derived from other sources to the ones already chosen.

In preliminary work we tested a variety of search strategies in an effort to balance specificity and sensitivity. Our final search strategy used the recommended search terms to identify randomized controlled trials (18) along with the terms and associated words for "nocebo," "placebo effect," "medication," and "side-effects." We used separate search strategies for each of the databases as these needed to be modified due to differences in MeSH terms, Boolean operators and wildcards. A copy of the search strategy we used for Medline can be seen in the **Supplementary Material**.
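As a rough illustration of how search terms of this kind are typically combined, the sketch below joins synonyms with OR inside a concept group and joins the groups with AND. The function name, the grouping of terms, and the wildcard are hypothetical and chosen only for illustration; they are not the strategy used in this review, which is given in the Supplementary Material.

```python
def build_query(concept_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    blocks = []
    for terms in concept_groups:
        # Quote multi-word phrases so they are searched as a unit.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


# Hypothetical grouping of the quoted terms; the real strategy differs per database.
groups = [
    ["nocebo", "placebo effect"],
    ["medication"],
    ["side-effect*"],
]
query = build_query(groups)
print(query)
```

In practice each database would need its own variant of such a string, since field tags, subject headings, and wildcard symbols differ between platforms.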

### Review Process

The search was initially carried out on 22nd March 2018 and updated on 22nd June 2018 following the identification of a relevant study published between these dates. The initial electronic searches were combined using EndNote and duplicates were identified and deleted. The titles and abstracts of citations were then screened for potential relevance. If relevance was not clear from the abstract, the study was taken forward to the full text review. All full text versions of papers that were potentially relevant were then screened in relation to the inclusion criteria. Papers that met the inclusion criteria had their reference sections manually searched for other studies that could be included.
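The duplicate-removal step can be pictured with a minimal sketch such as the one below. It is not EndNote's matching procedure; it simply assumes that records from different databases can be matched on a normalized title, and the records and field names are hypothetical.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each normalized title."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records, for illustration only.
records = [
    {"title": "Nocebo effects and side-effects: a trial", "source": "Scopus"},
    {"title": "Nocebo Effects and Side-Effects: A Trial.", "source": "Medline"},
    {"title": "Distraction during chemotherapy", "source": "Web of Science"},
]
unique = deduplicate(records)
print(len(unique))
```

Title matching alone can miss duplicates with differing titles and can merge distinct records that share a title, so a manual check of flagged duplicates would still be needed.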

### Selection Criteria

Studies were eligible for inclusion in this review if they met the criteria below.

#### Population

Human population (healthy volunteers, patients and children were allowed).

#### Exposure

Active medicinal treatment (i.e., contains a pharmacological agent), associated with side-effects.

#### Intervention

A brief, psychological intervention delivered in one session and that could be feasibly introduced within a single doctor-patient consultation or treatment appointment. By psychological we mean an intervention that targets certain psychological processes, such as cognitive expectations, attention or learning. Interventions requiring biological or chemical stimuli were excluded because these are not purely psychological. As we wanted to identify interventions that could be easily incorporated into clinical practice, we excluded in-depth psychological interventions, such as cognitive behavior therapy, mindfulness, relaxation training or guided imagery, and interventions that consisted of intensive educational packages, as these typically are not delivered in one session and often take place over the course of a treatment.

#### Comparator

Usual care. We excluded studies with control conditions involving a different type of intervention.

#### Outcome

We included studies with an outcome of side-effects measured via self-report or inferred through objective measures. We followed the NICE (19) definition of a side-effect as "An effect of a drug (or treatment or intervention) that is additional to the main intended effect. It could be good, bad or neutral, depending on the circumstances." For some studies, e.g., those concerning infant experience to vaccinations, side-effects were measured within minutes of the procedure. We excluded these on the basis that the "side-effects" were presumably related to the insertion of the needle rather than the effects of the vaccine itself.

#### Study Design

Used an experimental design in which participants were randomized or quasi randomized to receive the intervention or the control condition.

#### Other Criteria

Published in the English language.

#### Data Extraction

We extracted data from the final set of studies using a data extraction table developed for this systematic review. Data extracted included the study design and methodology, main demographics of participants, description of intervention and control conditions, side-effect measures, statistical approach and results. We also extracted details about the mode of the intervention, its content and duration.

### Quality Assessment

We assessed the quality of all eligible studies using the Cochrane Collaboration's Risk of Bias tool for randomized controlled trials (20).

### Data Synthesis and Analysis

Due to the heterogeneity in the interventions that we included and the way that side-effects were measured, scored and analyzed, we used a narrative synthesis to analyse the results. There is no general consensus on the best way to carry out a narrative synthesis for systematic reviews (21). As such we decided to use a weight of evidence approach by identifying the quality of evidence for each type of intervention reviewed. To do this we used the GRADE approach (22) which is a transparent framework used to grade the quality of evidence included in systematic reviews and the strength of recommendations.

### RESULTS

#### Search Results

The database search retrieved 50,133 citations and searching the reference lists of included studies retrieved another 7, giving a total of 50,140. After removing duplicates 40,346 citations remained. After screening titles and abstracts, we reviewed the full text of 63 articles relating to 66 studies. Of these, 39 studies were excluded for not meeting the inclusion criteria, resulting in a total of 26 articles reporting on 27 studies. One article (23) reported results on two separate studies and is referred to in the text and tables as Study 1 or Study 2 where necessary. The number of studies at each stage of the search strategy and the reasons for exclusion are shown in **Figure 1**.
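The counts reported above can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in this paragraph; the derived quantities (duplicates removed, records excluded at title and abstract screening) are not reported in the text and are computed here as an illustration.

```python
# Figures stated in the Search Results paragraph.
database_hits = 50_133
reference_list_hits = 7
after_duplicates = 40_346
full_text_articles = 63
full_text_studies = 66
studies_excluded = 39
included_articles = 26

# Derived quantities (not stated in the text).
total_citations = database_hits + reference_list_hits
duplicates_removed = total_citations - after_duplicates
excluded_at_screening = after_duplicates - full_text_articles
studies_included = full_text_studies - studies_excluded

print(total_citations)        # total citations retrieved
print(duplicates_removed)     # citations removed as duplicates
print(excluded_at_screening)  # citations excluded on title/abstract
print(studies_included)       # studies in the review

# One included article reported two studies, so studies exceed articles by one.
assert studies_included == included_articles + 1
```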

### Study Characteristics

See **Table 1** for a full summary of the characteristics of the included studies. The 27 studies included in the review reported on a total of 3,459 participants. There was a range of patient groups and treatments under investigation. The most common of these were patients with cancer receiving chemotherapy (26, 31, 39, 41–44, 46), and patients with depression prescribed anti-depressants (29, 32, 35–37). All studies used a between participants RCT design apart from Cildag et al. (24), Myers and Calvert (36), Redd et al. (39), and Schagen et al. (42) which used a quasi-randomized approach, and Faasse et al. (28) who used a within-subjects RCT design. Some studies used a factorial design in their RCT involving different experimental conditions or baseline variables entered as independent factors (23, 31, 39, 41–43). In these cases, we have reported the main effects of the relevant intervention under investigation.

The included studies used a variety of interventions. We looked for common themes and content across the interventions and were able to group the studies into five different types: priming, distraction, branding, omitting side-effects, and de-emphasizing side-effects, plus a miscellaneous group.

The majority of studies used an un-validated questionnaire specifically designed for their study to measure side-effects, and measures were generally completed within days/weeks following treatment initiation.

### Quality Assessment

The quality of included studies was poor (see **Figures 2**, **3**). The main problem was a lack of clear reporting within the papers. Over half of the studies neglected to mention how they carried out randomization, and four were at high risk for using a quasi-randomized approach. Because of the unclear reporting of random sequence generation, the risk for allocation concealment bias followed a similar pattern, and six studies were at high risk because their randomization approach allowed research staff to foresee subsequent allocations. For blinding of participants and personnel, studies often failed to state whether the experimenters were blind to the manipulation that accompanied the active treatment, leaving the risk of bias unclear. Only seven studies used adequate blinding procedures, with one not using blinding at all. Nineteen studies used side-effect measures which were completed by participants; as such, blinding of the outcome assessment was judged unlikely to influence these results. For the remaining eight studies it was unclear if participants filled in the measures themselves or if they were administered by a blind/non-blind member of the study team. For 16 studies, drop outs were not addressed, or if they were, the paper typically failed to explain how this affected the results, leaving the risk of bias unclear; the remaining 11 studies provided adequate information and reasoning behind drop outs. Only two studies had lodged a protocol in a publicly accessible registry before the start of recruitment, leaving us unable to assess the risk for selective reporting for the remaining studies, apart from one in which there was a change in the prespecified primary analysis suggesting there was a high risk of bias.

FIGURE 1 | Flow diagram of the selection process of studies including the number of events and reasons for exclusion.

#### Quality of the Evidence

The quality of evidence regarding priming, distraction, omitting side-effect information and de-emphasizing side effects, and doctor characteristic intervention(s) was very low. This is because most of the information came from studies at low or unclear risk of bias, in which plausible bias could alter the results. There was also some evidence of inconsistency and imprecision in the results due to opposite findings, wide confidence intervals and some small studies which may not have been adequately powered. Due to the broad nature of this systematic review, there is no evidence of indirectness, as all included studies helped to answer the question. It is plausible, however, that there may have been some publication bias due to the preponderance of smaller studies.

The quality of evidence regarding the branding intervention studies was low. This was graded for reasons similar to those discussed for the above interventions; however, the inconsistency in the results could perhaps be explained by differences in the interventions, and we judged that the small studies were probably due to this literature representing an early evidence base, rather than publication bias.

Finally, the quality of evidence for the deception intervention was moderate. There was some imprecision evident and the sample size was small, however the study was judged to have a low overall risk of bias, and there was no evidence of indirectness. As only one study was included, inconsistency and publication bias could not be determined.
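The logic of these ratings can be sketched as follows. In the GRADE approach, evidence from randomized trials starts at "high" and is lowered for concerns about risk of bias, inconsistency, indirectness, imprecision, and publication bias. The function below is a simplified illustration with hypothetical names; the downgrade points assigned to each intervention type are an assumption inferred from the narrative above, as the review does not tabulate them.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def grade_rating(downgrades: dict, start: str = "high") -> str:
    """Start at `start` (high for randomized trials) and drop one level per downgrade point."""
    index = LEVELS.index(start) - sum(downgrades.values())
    return LEVELS[max(index, 0)]  # the rating cannot fall below "very low"


# Assumed downgrade points per domain, inferred from the text (not reported by the authors).
priming = {"risk_of_bias": 1, "inconsistency": 1, "imprecision": 1, "publication_bias": 1}
branding = {"risk_of_bias": 1, "imprecision": 1}
deception = {"imprecision": 1}

for name, points in [("priming", priming), ("branding", branding), ("deception", deception)]:
    print(name, "->", grade_rating(points))
```

Under these assumed inputs the sketch reproduces the ratings stated in the text (very low, low, and moderate), but other combinations of downgrades could yield the same ratings.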

### Effect of Interventions on Side-Effect Reporting

#### Priming

Four studies looked at the effect of priming on side-effect reporting following chemotherapy with mixed results (see **Table 2**). Colagiuri et al. (26) found a slight trend for an effect on subsequent nausea of priming patients by assessing their expectancies for side-effects, compared with no assessment. Jacobs et al. (31) and Schagen et al. (42) found no indication of an effect on retrospectively reported cognitive side-effects of priming patients by mentioning that chemotherapy is associated with cognitive problems. However, Schagen et al. (43), in a similar study, did find a small effect of priming, which led to increased reporting of previous cognitive side-effects to chemotherapy compared to a control group who received no such information.

TABLE 1 | Summary table of included studies.

#### Distraction

Three studies looked at the effect of distraction on side-effect reporting following chemotherapy and drug provocation tests, showing some evidence that distraction can reduce side-effect reporting (see **Table 3**). Cildag et al. (24) found that keeping patients busy with filling/archiving files significantly reduced the occurrence of adverse reactions compared to a control group, but only by a small amount. Redd et al. (39) found that distracting pediatric cancer patients with video games significantly reduced chemotherapy nausea from baseline, compared to those in the control group. However, for adult cancer patients, Vasterling et al. (46) found that video games were not effective in reducing chemotherapy nausea or vomiting compared to a control group.

#### Branding

Two studies looked at the effect of branding on side-effect reporting to ibuprofen showing some evidence that branding can affect side-effect reporting (see **Table 4**). Colgan et al. (27) found that a video designed to correct participants' beliefs about generic medicines significantly reduced side effects for both branded and generic ibuprofen compared to those in a control group, showing a large effect. However, Faasse et al. (28) found that simply changing the labeling of ibuprofen from branded to generic did not significantly affect side-effect reporting.

#### Omitting Side-Effect Information

Eleven studies looked at the effect of omitting side-effect information on side-effect reporting to a range of different treatments, showing that omitting side-effects significantly decreases side-effect reporting (see **Table 5**). Eight studies found that not informing patients about potential side-effects significantly decreased side-effect reporting to metoprolol (25), a myelogram (30), antidepressants (32), finasteride (34), skin cream (23) (study 1 and 2), atenolol (45), and montelukast (48) compared to a control group which received side-effect information, each showing large effect sizes. Similarly, Myers and Calvert (36) found a trend for a decrease in side-effect reporting when patients were not informed about the side-effects to the antidepressant dothiepin compared to a control group, and Myers and Calvert (37) found that side-effect reporting to dothiepin was significantly lower in the group that received only beneficial information than in groups that received no information or side-effect information. Only one study, Myers and Calvert (35), found no effect of side-effect information on subsequent side-effect reporting.

#### TABLE 2 | Priming intervention results.

*Ns, non-significant; M, mean; SE, standard error; SD, standard deviation; OR, odds ratio; d, Cohen's d; –, insufficient detail to calculate effect size.*

#### TABLE 3 | Distraction intervention results.

*Ns, non-significant; M, mean; OR, odds ratio; –, insufficient detail to calculate effect size.*

#### De-emphasizing Side-Effects

Five studies looked at the effect of de-emphasizing side-effects on side-effect reporting to a range of different treatments, showing evidence that this seems to have no effect (see **Table 6**). Three studies found that informing patients of side-effects, but in a way that does not make them seem as bad, had no effect on side-effect reporting to anesthesia (33) or chemotherapy (41, 44); however, this was compared to a control group that did not receive any information about side-effects. O'Connor et al. (38) found that positively framing side-effects to emphasize those that remain side-effect free, compared to a control group that received standard information about side-effects, significantly reduced side-effect reporting to the flu vaccine. Wilhelm et al. (47) found that positively framing side-effects by explaining they are a sign that the drug is working did not significantly reduce side-effects to metoprolol compared to those who received standard information.

#### Other Interventions

Two other studies investigated interventions which do not fall into the above categories (see **Table 7**). Faria et al. (29) deceptively told seasonal affective disorder patients that they would receive an active placebo which would produce similar side-effects to escitalopram when in fact they received the active drug itself, and found a trend toward fewer reported side-effects compared to a control group who were correctly informed. Rickels et al. (40) found no effect of the prescribing psychiatrist being a drug "enthusiast" or drug "skeptical" on reported side-effects to tranquilisers among psychiatric patients.

TABLE 4 | Branding intervention results.

*Ns, non-significant; M, mean; SE, standard error; d, Cohen's d; –, insufficient detail to calculate effect size.*

### DISCUSSION

#### Summary of Main Findings

Although previous literature has looked at altering side-effects generated in response to inert exposures, it is important to test if these interventions also work in the clinical setting and affect side-effects to real medications which may be initiated or exacerbated through a nocebo effect. This can then provide the basis for introducing into clinical practice strategies to reduce these side-effects. Unfortunately, the quality of the studies identified in this review was generally low, mainly due to unclear reporting, inadequate randomization and allocation procedures, and underpowered analyses. Our overriding recommendation, therefore, is that additional, better quality work is needed in this field.

This point notwithstanding, from the results of the included studies, the strongest and most consistent effect in altering side-effects experienced following medical treatments was omitting information about side-effects. Other techniques, such as priming, distraction and altering the perceptions of branding, produced mixed results. More tentatively, side-effects to over-the-counter medications, common prescription medications, and vaccines seemed to be more susceptible to these interventions than side-effects to chemotherapy.

The finding that omitting side-effect information produced the most consistent and strongest effect supports the evidence from the literature on inert exposures (10), which recommends that in order to reduce side-effects induced by nocebo effects we should avoid giving suggestions of side-effects associated with medications to patients. It also echoes experimental nocebo studies, which find that altering information about side-effects alters side-effect experience to infrasound (49) and electrical pain stimuli (50). In addition, this supports previous work showing that interventions designed to change patients' expectations of pain by altering verbal suggestions about the pain to expect after a treatment or procedure can relieve (placebo) or increase (nocebo) patients' baseline pain depending on the suggestion (15, 16), highlighting the role that expectations play in both placebo and nocebo effects. Perhaps unsurprisingly, no study looked at the effect of omitting information about side-effects to chemotherapy, and therefore we cannot say if the results extend to chemotherapy too. However, as chemotherapy is already well-known for its side-effects, it may be that omitting side-effect information would do little to alter subsequent side-effect reporting in this group.

Not mentioning side-effects to patients in order to reduce these effects is ethically problematic and may not meet the requirements of informed consent, something which has been widely discussed in the literature (51, 52). An alternative approach is to explain the potential side-effects to patients in a way that de-emphasizes them and reduces their apparent likelihood or severity (53). At first look, the results of studies which have used this approach do not appear promising. Most studies have shown no effect of de-emphasizing side-effects on subsequent side-effect experience. However, this might be an artifact of the design of these studies, in which the groups that received the de-emphasized side-effect information were compared to a control group that received no side-effect information at all. Explaining side-effects to patients, albeit in a positive light, is still likely to increase the perceived likelihood of side-effects compared to not describing side-effects. It would be interesting for future studies to test the effects of de-emphasizing side-effects of medication compared to a suitable control group which receives standard side-effect information. In other studies which used an appropriate control group, positive framing of side-effects was shown to be beneficial, a finding that has also been reproduced in healthy adults taking an inert tablet (54). There is also scope for further investigations about framing the side-effects of medication as a sign that the drug is working. This was investigated in a pilot study that, although not powered to find an effect, nonetheless showed a decrease in side-effect measures among participants who believed the medicine to be harmful (47).
This idea of de-emphasizing side-effects has shown some promise in the placebo literature on pain, in which positive messages which focus more on the beneficial outcome of treatments than on the potential side-effects may be more effective in relieving patients' pain compared to usual care messages (14).

#### TABLE 5 | Omitting side-effect information intervention results.


*Ns, non-significant; M, mean; SD, standard deviation; OR, odds ratio; d, Cohen's d; η<sub>p</sub><sup>2</sup>, partial eta squared; –, insufficient detail to calculate effect size; ?, not reported.*

#### TABLE 6 | De-emphasizing side-effects intervention results.


*Ns, non-significant; M, mean; SE, standard error; SD, standard deviation; OR, odds ratio; d, Cohen's d; –, insufficient detail to calculate effect size.*

#### TABLE 7 | Miscellaneous results.


*Ns, non-significant; M, mean; SD, standard deviation; d, Cohen's d; –, insufficient detail to calculate effect size; ?, not reported.*

Priming patients by informing them about the side-effects of chemotherapy and then asking them to recall side-effects, or by asking about their expectations of chemotherapy side-effects, overall showed little impact on side-effect reporting. This may be due to the treatment under investigation. Chemotherapy is a high-profile treatment, and as such it is likely patients are already aware of the side-effects that accompany it, limiting the effect that priming could have. In experimental studies, priming patients using pain-related fear has been shown to increase sensitivity to heat stimuli (55). It may be that priming patients about the side-effects of lower-profile drugs would produce more promising effects.

Distraction techniques have been shown to be effective in the field of pain research, for example for experimental and needle-related pain (56, 57), but in terms of medication side-effects, the evidence base is not as large, limiting conclusions. From the results, it seems that distraction tasks should be relevant to the patients to have the greatest chance of being effective. For example, while video games are suitable for reducing side-effects to chemotherapy in pediatric patients, they are less effective in adults (39, 46).

Branding appears to have some effect on side-effects, something also reflected in the literature on inert exposures (58). However, given the early stage of the evidence base, future studies are needed to test the effects of branding on prescribed drugs, and interventions to alter patients' perceptions of prescribed generic drugs.

#### Quality of Original Research

It is possible that some of our conclusions may be due to differences in quality between those studies that found an effect and those that did not. We did not observe any clear trend for lower quality studies to report more or fewer significant results than higher quality studies. However, overall the quality of the studies included in this review was limited due to poor reporting of key issues in experimental research, such as randomization, allocation concealment, and blinding, and due to the failure to register a study protocol prior to initiating recruitment. In addition, the quality of evidence from these studies was low, partly due to these risk-of-bias issues, but also because the sample sizes of studies were relatively small, contributing to imprecision (wide confidence intervals), as well as indirectness and sometimes contradictory findings.

#### Quality of This Review

Search strategies for systematic reviews based on nocebo effects are difficult to balance in terms of their specificity and sensitivity (10). In this instance we deliberately opted for a broad search strategy in order to identify as many relevant studies as possible. Due to time constraints, screening, data extraction and quality assessment were done primarily by one author. However, there were regular weekly meetings between both authors to discuss screening, data extraction, quality assessment and writing up of the results, allowing us to resolve any issues as they arose.

Other limitations of the review reflect the way we grouped the results. We aggregated studies based on the type of intervention under investigation. These groupings contained different side-effect outcomes, treatments and participants. It is possible that interactions exist between these variables and the interventions under investigation. Unfortunately, due to the small number of studies investigating each intervention, we did not have enough data to explore this in any depth. However, it does appear that side-effects to chemotherapy might not be as susceptible to brief psychological interventions as side-effects to prescription and over-the-counter drugs.

#### Implications and Future Directions

Not mentioning potential side-effects to patients has the most consistent effect in reducing side-effects to medical treatments, especially for over-the-counter and prescription drugs. Whether this meets ethical or regulatory requirements is debatable, however (53, 59). De-emphasizing side-effects through positive framing has potential and could be introduced within doctor-patient consultations and in accompanying patient information leaflets for patients to read at home. Further testing of this method, especially in terms of reframing side-effects as signs that the drug is working, is needed in an adequately powered trial. In addition, it is important for future studies testing ways of de-emphasizing side-effects to compare them to a control group that receives the standard side-effect information.

Besides framing, it is also important for doctors to consider patients' beliefs about generic medicines when prescribing generic drugs or switching patients from a previously branded medication to a generic. Colgan et al. (27) suggest that a simple explanation of how the pharmacological ingredients in generic drugs do not actually differ from those in branded drugs would be useful. So far, the effects of branding have been studied in over-the-counter and inert tablets: research with prescribed medication is now needed. In addition, distraction could be beneficial if age-appropriate tasks are used.

Finally, only one study investigated the effect of doctor characteristics on side-effects. This represents a surprising gap in the literature. Doctor characteristics, such as empathy, have been shown to be important in benefitting patients for a range of clinical conditions, especially pain (60). We believe this is an important avenue for future research to investigate in terms of benefitting patients by reducing medication side-effects.

### CONCLUSION

This review was restricted by the quality and heterogeneity of the included studies, limiting the conclusions that can be drawn. It does, however, provide an indication of which brief psychological interventions are effective in reducing side-effects to active medical treatment. The clearest effect was from omitting information about side-effects to participants before exposure. Although withholding side-effect information would be one way to reduce side-effects, it is ethically and legally problematic. Current work is looking at how we can effectively de-emphasize side-effects while still giving patients the information needed for informed consent, and this shows promise. Potential strategies include positively framing the risk of side-effects, focusing more on the benefits of the drug, and framing side-effects in terms of signs that the drug is working. However, further research is needed in larger trials with suitable control groups. There is also a gap for future research to consider doctor characteristics, such as empathy, as a means of reducing patients' experience of side-effects. Finally, better reporting of studies is essential in future, allowing for more concrete determinations of study quality.

## AUTHOR CONTRIBUTIONS

RKW and GJR developed the initial research question for the systematic review and the search strategy. RKW carried out the search, screening, data extraction and quality assessment with regular input from GJR. RKW wrote the first draft which was subsequently revised by GJR.

### FUNDING

RKW and GJR are affiliated to the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Emergency Preparedness and Response at King's College London in partnership with Public Health England (PHE), in collaboration with the University of East Anglia and Newcastle University. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, the Department of Health or Public Health England.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2018.00775/full#supplementary-material

### REFERENCES


on chemotherapy-related nausea. J Pain Symptom Manage. (2010) 40:379–90. doi: 10.1016/j.jpainsymman.2009.12.024


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Webster and Rubin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Re-evaluation of Significance and the Implications of Placebo Effect in Antidepressant Therapy

Marko Curkovic<sup>1</sup> \*, Andro Kosec<sup>2</sup> and Aleksandar Savic<sup>1,3</sup>

<sup>1</sup> Department for Diagnostics and Intensive Care, University Psychiatric Hospital Vrapce, Zagreb, Croatia, <sup>2</sup> Department of Otorhinolaryngology and Head and Neck Surgery, University Hospital Center Sestre Milosrdnice, Zagreb, Croatia, <sup>3</sup> Department of Psychiatry, University of Zagreb School of Medicine, Zagreb, Croatia

Keywords: depression, antidepressants, placebo, placebo effect, efficacy, randomized controlled trials

## INTRODUCTION

Placebo was conceived as an epistemological tool to control for incidental factors that could influence investigated outcomes (1, 2). As a phenomenon, placebo has been conceptualized in many different ways, both theoretically and practically, although a generally accepted definition is still to be devised (3). In the clinical research setting it is, however, helpful to distinguish placebo response from placebo effect. Placebo response is considered to be the composite change observed in individuals after administration of placebo, consisting of different aspects, such as natural course of the disease and methodological artifacts, as well as the placebo effect itself (4, 5). Placebo effect would therefore be the change observed in individuals after controlling for natural course of the disorder, methodological aspects, and the effect linked to treatment-specific features (i.e., antidepressant verum) (1). In other words, placebo responses may or may not include placebo effect as a genuine psychobiosocial effect that is usually attributed to various features of the treatment situation and context (5). Recent findings consistently show a modest average effect (a mean effect size of d = 0.30) of antidepressants above placebo in short-term treatment of adult depression (6, 7). This could be interpreted as strong evidence of a modest effect. Whether this effect, measured on outcomes that are rather limited and subjective, is clinically meaningful remains an open question (8). Issues of true long-term efficacy, safety and cost-effectiveness of antidepressant drugs still loom on the horizon (8–11). Still, these recent findings pose interesting questions, since similarly consistent findings have been published showing a large and highly variable placebo effect in studies aiming to prove true efficacy of antidepressants (12, 13).
In other words, participants in antidepressant studies who are receiving, at least from a theoretical point of view, a supposedly inherently neutral intervention (one that should be lacking known, relevant, and specific features), show substantial and consistent improvement across different study designs and contexts.
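To make the reported effect size concrete, the following is a minimal sketch of how a standardized mean difference (Cohen's d) between a drug arm and a placebo arm can be computed. The function name `cohens_d` and the improvement scores are hypothetical and chosen only for illustration; they are not data from the cited meta-analyses, and those analyses may use different estimators.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical improvement scores (points gained on a depression rating scale).
drug_improvement = [12, 9, 15, 7, 11, 10, 14, 8]
placebo_improvement = [10, 8, 13, 6, 11, 9, 12, 7]

d = cohens_d(drug_improvement, placebo_improvement)
print(f"Cohen's d (drug vs. placebo): {d:.2f}")
```

A value of d expresses the drug-placebo difference in units of the pooled standard deviation, so a small d can coexist with substantial improvement in both arms.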

#### Edited by:

Paul Enck, University of Tübingen, Germany

#### Reviewed by:

Johannes A. C. Laferton, University of Marburg, Germany

#### \*Correspondence:

Marko Curkovic markocurak@gmail.com

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 30 January 2019 Accepted: 26 February 2019 Published: 19 March 2019

#### Citation:

Curkovic M, Kosec A and Savic A (2019) Re-evaluation of Significance and the Implications of Placebo Effect in Antidepressant Therapy. Front. Psychiatry 10:143. doi: 10.3389/fpsyt.2019.00143

### CONTEXT

Double-blind randomized placebo-controlled trials (DBRPCT) have long been considered to be the gold standard for determining true efficacy of an intervention. As such, DBRPCTs are based on the logic that reflects the basic premises of scientific epistemology: they allow a certain degree of control over factors that could influence outcomes of interest but are not, at that point, the subject of scientific inquiry (3). Randomization, blinding, and placebo control groups allow for probabilistic balancing of these unspecific factors, and prevent intentional or accidental influence from study participants and investigators (14). Consequently, it is assumed that the true effect of an intervention can be extracted based on the "additivity assumption": the true effect is the one that is present only within the last remaining uncontrolled therapeutic features, and is therefore attributed to the intervention being investigated.

Such a design, created for a very specific purpose, has its shortcomings that have been widely discussed [for a more comprehensive discussion, see (14) and related commentaries]. In order for a DBRPCT to be internally valid (able to fulfill its explanatory purpose), a certain degree of deviation from external validity is required, causing loss of similarity to the targeted model, clinical practice (14, 15). Placebo in DBRPCTs was initially conceived as a procedure that allows blinding, removing the influences of study participants and investigators. It seems, however, that placebo consistently influences outcomes above and beyond its anticipated boundaries (1, 16, 17). Although we aim to constrain the "human factor," certain aspects of human nature evade such attempts. Human subjects, inherently vulnerable because of the nature of their medical condition, are systematically and consistently reacting to a more or less specific set of internal and external cues, creating a "genuine" placebo effect. While placebo effect may be a valuable and legitimate object of research, one should be careful not to overgeneralize this term, since a tendency to erroneously characterize everything and anything as a placebo effect can be seen [for a more detailed discussion, see (18)]. In other words, as previously mentioned, genuine placebo effect should be distinguished from methodological artifacts that exert certain influence on outcomes, such as natural course of the investigated condition, spontaneous variation in symptoms, and various sources of research bias (2, 18, 19). It seems that genuine placebo effect exercises a greater influence in its own right than any of the above-mentioned factors, and as such is neither inert nor unspecific. It is responsible for physical changes and effects in individuals that are specific and somewhat related to the investigated condition and/or effects of "true" treatment (1, 2, 16, 19, 20).
The placebo effect is considered to be an adaptive process that emerges from contextual and individual features within a treatment situation, and as such is driven by underlying biological, psychological and social components that are not mutually exclusive [for further details consider: (1, 21)]. As such, the placebo effect may contradict the additivity assumption, influencing outcomes conjointly or even independently from the investigated intervention (1). Therefore, an "interactive" assumption has been proposed, acknowledging that underlying mechanisms that yield a therapeutic response interact in a complex manner (1, 22).
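As an illustrative formalization only, the contrast between the two assumptions can be written out as follows. The symbols are assumptions introduced here, not notation from the cited literature: $R_P$ and $R_D$ are the mean responses in the placebo and drug arms, $N$ is the natural course, $A$ the methodological artifacts, $P$ the placebo effect, $S$ the specific drug effect, and $I(P, S)$ a hypothetical interaction term.

Under the additivity assumption:

$$
R_P = N + A + P, \qquad R_D = N + A + P + S \quad\Longrightarrow\quad R_D - R_P = S
$$

Under an interactive assumption:

$$
R_D = N + A + P + S + I(P, S) \quad\Longrightarrow\quad R_D - R_P = S + I(P, S)
$$

In the second case the drug-placebo difference no longer isolates $S$: whenever $I(P, S) \neq 0$, the observed difference mixes the specific effect with its interaction with the placebo effect.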

### RECENT FINDINGS

Findings suggest that placebo effect in antidepressant trials is a genuine entity, and as such may be distinguished from methodological artifacts that also exhibit a substantial influence on outcomes (23–25). While recent findings suggest that antidepressants show therapeutic efficacy and effectiveness, it seems that placebo effect may be one of the key driving forces of their effect. Moreover, it has been suggested that as much as 88% of antidepressants' efficacy could be attributed to the placebo effect (8). In other words, antidepressants would in that case have little additional specific effect beyond the placebo effect. Furthermore, recent analyses found that a subset of 17% of individuals with depression could exhibit "clinically significant advances" with placebo relative to antidepressants (26). Similarly, earlier findings suggested that 20% of individuals with depression could have a worse disease trajectory with antidepressant than with placebo therapy (27).

Moderators and mediators of placebo and antidepressant effects have been thoroughly investigated and reviewed [more thoroughly discussed in: (28, 29)]. Unbalanced group randomization and effect modulation by baseline severity have previously been singled out as the most consistent and robust findings. It may seem intuitive that baseline severity of depression influences responses to any given intervention, and it has long been argued that as depression is more severe, placebo effects are less prominent, while response rates to antidepressants remain stable (28, 30). This concept was recently dismissed, as antidepressant and placebo interventions seem to be equally (in)effective across the whole depression severity spectrum (31, 32). Interestingly, recent findings even suggest that placebo response rates seem to be similar in persistent depressive disorder (defined as all forms of depressive conditions that persist for at least 2 years) compared to episodic depression (33). The probability of receiving placebo (unbalanced group randomization) has been repeatedly and firmly correlated with the antidepressants' response (12, 29, 34). This relationship is linear and gradual, with efficacy of antidepressants increasing as we move from greater toward lower probability of receiving placebo. Consequently, antidepressant response rates are significantly higher in comparator trials than in DBRPCTs. This finding is usually interpreted as implicit evidence that both placebo and treatment effects could be based on patient expectations (that could obviously be positive and/or negative) (12). Nonetheless, it has been shown that expectations (conceptualized as perceived treatment assignment) significantly change during studies while retaining their relative predictive power (35). What participants believe may be more important than what they actually receive as an intervention, making a false but sincerely held belief more important than the actual intervention.
Some advancement has been made in predicting antidepressant and placebo responses and/or responsiveness in research and clinical practice. Although certain neurobiological features and clinical and sociodemographic characteristics of patients have been highlighted as possible outcome predictors, low sensitivity and high intra- and inter-individual variability remain an issue (26, 36–38). Placebo responsiveness, and to a lesser extent antidepressant responsiveness, remain highly and complexly variable on all levels.

### DOES PLACEBO EFFECT HAVE AN EFFECT ABOVE AND BEYOND THAT OF ANTIDEPRESSANTS?

Many questions are still unanswered regarding characteristics, mechanisms and definition of the placebo response and effect. The line of research dealing with those issues could be referred to as "placebo explanatory research," and depression and psychiatric disorders in general could be seen as particularly fertile ground for these inquiries (1, 2, 16, 24, 25, 29). For example, open-label placebo administration with full disclosure seems to yield antidepressant therapeutic effects similar to those of traditionally administered placebo (39). On the other hand, antidepressants compared with active placebos that imitate some side-effects showed no significant advantage (40). Thus, expectation-related placebo effects may be driven by the unblinding properties of side-effects, further diminishing antidepressants' signaling potential (the ability to extricate true efficacy).

We consider "antidepressant explanatory research" as one being primarily oriented toward proving the true efficacy of antidepressants. Within this approach, it has been recently argued that the placebo control group should be completely omitted, as diverse variability of placebo effects seems to undermine internal validity, making studies fundamentally invalid and uninterpretable (7, 12). Proponents of keeping the placebo control group propose methodological and analytical approaches that aim to control and manage placebo effects [further details may be found in: (1, 41, 42)]. One of the underlying assumptions is that "placebo responders" influence outcomes, blurring the antidepressants' signaling potential. Hence, different methods, such as a placebo run-in phase, could be applied in order to eliminate these obstacles. This approach should be considered ethically and methodologically erroneous, as there is no evidence that such a stable trait exists. Just the opposite, it seems that placebo responsiveness emerges from a complex interrelationship between stable and situational traits [recently elaborated in: (19)]. Furthermore, it could be argued that reduction of placebo responsiveness will further reduce antidepressant responsiveness (25). Similar logic is applied within the approach of risk modeling, where "risk participants" (disproportionately contributing to the outcome) and/or "non-responders" (not prone to react regardless of assigned intervention) are dealt with in an identify-and-mitigate manner (1, 42). All of these strategies may be considered pragmatic, as there is great pressure to reduce ineffectual research. There are other strategies that tackle different possible sources of error by manipulating study context, design, conduct and analysis with the primary aim of enhancing studies' internal validity and antidepressants' signal detection potential, and of yielding more historically reliable response rates (1, 2, 16, 20, 25, 28, 41–43).
However, these strategies tend to increase internal validity at the expense of external validity, and as such seem more like harm reduction strategies than a true advancement of our understanding of the complex underlying phenomena (14, 18, 25, 44, 45). Following this line of argumentation, solutions could include introduction of an independent study investigator and the concept of "cold standardization," a virtual, computer-driven standardized recruitment, administration of interventions and assessment of study participants. Such an approach would have the potential to eliminate some features of intrapersonal healing that have been singled out as possibly one of the major contributors to the placebo effect, and to tackle the widespread issue of inadequate blinding and other sources of investigator or study-staff related biases (14). Although such a (still hypothetical) computerized study investigator could standardize study recruitment, administration and assessment procedures, it would not be resistant to other sources of bias. One could even imagine that participants' expectations in such a setting would change in previously unimaginable directions (either through a certain therapeutic potential of this interaction or through properties of the interventions themselves, such as side-effect profile). Although used here as an extreme argument on how one could possibly further strengthen studies' internal validity, such an approach could also be used in order to distinguish specific features underlying placebo and/or therapeutic effects (serving a more pragmatic purpose).

Alternatively, "antidepressant pragmatic research" would steer toward comprehending complex interactions of specific and/or unspecific features that are contributing to a therapeutic effect. We should not try to simply manage placebo effects, but direct our attention to understanding them through rigorous initial planning, assessment, reporting and sharing of all data possibly linked to the therapeutic response as well as nonresponse (1, 17, 19, 20, 43, 44). In other words, disentangling the placebo enigma seems to carry the potential of being the royal road to answering presumably the most important question at hand: which elements of the intervention, and in what proportion, are the ones relieving the suffering? In that sense, inclusion of an additional study arm in which the primary aim is to reach the maximum possible efficiency through any means necessary could be labeled as "warm standardization" (25). Again, different means could be used for that purpose, for example harnessing and maximizing expectations or even including additional specific interventions (such as some form of specific psychotherapy, previously conceived as an inherently expectation-modulating treatment) (46). Finally, as placebo is a relational phenomenon that significantly differs from context to context, all known moderators and mediators of placebo effect (from its physical characteristics to the informed consent process) should be rigorously reported (1, 2, 16, 17, 43). Factors that affect treatment outcomes need to be evidenced, extrapolated, weighted, agglomerated, and discussed, bearing in mind that acquiring scientifically grounded knowledge is an iterative, cumulative process.
Currently, novel analytical tools, such as computational methods, allow us to amplify the robustness of other data-rich sources, such as electronic health records, while searching for structures of causality that could be more rooted in real-world estimates of certain interventions' safety, efficacy and effectiveness (14, 45, 47–49).

### AUTHOR CONTRIBUTIONS

MC provided and constructed the initial idea of the manuscript. MC, AS, and AK co-authored and edited the manuscript.


### REFERENCES


individual symptoms of depression: Individual patient data meta-analysis. Br J Psychiatry. (2019) 214:4–10. doi: 10.1192/bjp.2018.122


outcome in heart surgery patients: results of the randomized controlled PSY-HEART trial. BMC Med. (2017) 15:4. doi: 10.1186/s12916-016-0767-3.


**Conflict of Interest Statement:** AS has received lecture honoraria from Janssen, Lundbeck, Eli Lilly, Pfizer, Pliva, Krka, Belupo, and participated in clinical trials (sub-investigator/rater) for Otsuka, Affiris, Eli Lilly.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Curkovic, Kosec and Savic. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Can a Brief Relaxation Exercise Modulate Placebo or Nocebo Effects in a Visceral Pain Model?

Sigrid Elsenbruch<sup>1</sup>, Till Roderigo<sup>1</sup>, Paul Enck<sup>2</sup> and Sven Benson<sup>1</sup>\*

*<sup>1</sup> Institute of Medical Psychology and Behavioral Immunobiology, University Hospital Essen, Essen, Germany, <sup>2</sup> Department of Internal Medicine VI, University Hospital Tuebingen, Tuebingen, Germany*

Translational research aiming to elucidate mediators and moderators of placebo and nocebo effects is highly relevant. This experimental study tested effects of a brief progressive muscle relaxation (PMR) exercise, designed to alter psychobiological stress parameters, on the magnitude of placebo and nocebo effects in a standardized psychosocial treatment context. In 120 healthy volunteers (60 men, 60 women), pain expectation, pain intensity, and pain unpleasantness in response to individually-calibrated rectal distensions were measured with visual analog scales during a baseline. Participants were then randomized to exercise PMR (relaxation group: *N* = 60) or a simple task (control group: *N* = 60), prior to receiving positive (placebo), negative (nocebo) or neutral suggestions regarding an intravenous administration that was in reality saline in all groups. Identical distensions were repeated (test). State anxiety, salivary cortisol, heart rate, and blood pressure were assessed repeatedly. Data were analyzed using analysis of covariance, planned Bonferroni-corrected group comparisons, as well as exploratory correlational and mediation analyses. Treatment suggestions induced group-specific changes in pain expectation, with significantly *reduced* expectation in placebo and *increased* expectation in nocebo groups. PMR had no discernable effect on pain expectation, state anxiety or cortisol, but led to significantly lower heart rate and systolic blood pressure. Relaxation significantly interacted with positive treatment suggestions, which only induced placebo analgesia in relaxed participants. No effects of negative suggestions were found in planned group comparisons, irrespective of relaxation. Exploratory correlation and mediation analyses revealed that pain expectation was a mediator to explain the association between treatment suggestions and pain-related outcomes. 
Clearly, visceral pain modulation is complex and involves many cognitive, emotional, and possibly neurobiological factors that remain to be fully understood. Our findings suggest that a brief relaxation exercise may facilitate the induction of placebo analgesia by positive when compared to neutral treatment suggestions. They underscore the contribution of relaxation and stress as psychobiological states within the psychosocial treatment context—factors which clearly deserve more attention in translational studies aiming to maximize positive expectancy effects in clinical settings.

#### Edited by:

*Martina De Zwaan, Hannover Medical School, Germany*

#### Reviewed by:

*Michael Stephan, Hannover Medical School, Germany; Georgios Paslakis, University of Bamberg, Germany*

> \*Correspondence: *Sven Benson sven.benson@uk-essen.de*

#### Specialty section:

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

Received: *22 January 2019* Accepted: *27 February 2019* Published: *21 March 2019*

#### Citation:

*Elsenbruch S, Roderigo T, Enck P and Benson S (2019) Can a Brief Relaxation Exercise Modulate Placebo or Nocebo Effects in a Visceral Pain Model? Front. Psychiatry 10:144. doi: 10.3389/fpsyt.2019.00144*

Keywords: placebo effect, nocebo effect, expectation, visceral pain, pain unpleasantness, stress, relaxation

## INTRODUCTION

Although placebo research spans many medical disciplines, the pain field continues to drive conceptual, mechanistic, and clinical advances in placebo knowledge, providing fruitful opportunities for forward and backward translation. Placebo analgesia constitutes one of the most fascinating and impressive examples of such translational research. Laboratory and preclinical studies in healthy populations and in patients with chronic pain conditions have elucidated the psychological and neurobiological mechanisms underlying placebo and nocebo effects (1, 2). The clinical potential offered by a transfer of this knowledge into treatment settings has been recognized within the pain field (3, 4) and beyond (5). This is underscored by trials supporting the efficacy of placebo interventions in patients with chronic low back pain (6, 7) and chronic visceral pain (8, 9). Facilitating placebo while minimizing nocebo effects may contribute to refining treatment approaches to provide patients with improved and more personalized care (10, 11). Toward this end, translational research aiming to optimize the efficacy of placebo interventions is highly relevant. In the context of chronic visceral pain and related gastrointestinal symptoms, the potential of placebo knowledge has been recognized but is far from fulfilled (12–14).

Various aspects of the psychosocial treatment context, including the setting (15), nature of the intervention, as well as the quality and quantity of patient-provider interactions (9, 16, 17), shape treatment expectations and thereby the presence and magnitude of placebo effects. Optimizing the psychosocial treatment context has the potential to improve the efficacy of placebo treatment, and to maximize the benefits of placebo-elements that are an inherent part of therapeutic interventions, including pharmacological treatments (4, 18). Interestingly, two laboratory studies in healthy volunteers support the idea that placebo analgesia can be enhanced with specific pharmacological interventions, i.e., the administration of vasopressin and oxytocin, respectively (19, 20). Whether behavioral approaches that target stress-related psychobiological factors are capable of facilitating placebo analgesia has not been tested. Herein, we explore for the first time the modulatory effects of a brief behavioral intervention, i.e., progressive muscle relaxation (PMR), on placebo and nocebo effects in a clinically-relevant model of visceral pain. The rationale was inspired by evidence that enhanced stress [e.g., increased state anxiety (21–24), subjective stress levels (25–28), experimentally-induced fear (29), acute psychosocial stress (30)] moderates placebo and/or nocebo effects. As part of a larger experimental study (30), we herein implemented PMR aiming to test effects of reduced stress-related psychobiological factors on the magnitude of placebo and nocebo effects induced by treatment suggestions.
Building on our earlier experimental studies on placebo/nocebo effects in the context of visceral pain (22, 30–33), we specifically aimed to test whether a brief relaxation exercise, carried out immediately prior to the delivery of deceptive positive (placebo), deceptive negative (nocebo), or truthful neutral (control) treatment suggestions, can facilitate placebo analgesia or reduce nocebo hyperalgesia in an established and clinically-relevant model of visceral pain in healthy volunteers. To explore if the effects of relaxation or treatment suggestions on outcomes were mediated by stress markers or expectations, we conducted correlational and mediation analyses.

## MATERIALS AND METHODS

### Participants

Healthy adults were recruited by local advertisements seeking volunteers for an experimental study on the modulation of visceral pain perception. We herein report on a total of N = 120 healthy volunteers (60 men, 60 women) who were randomized to a brief relaxation exercise or a control task on the experimental study day just prior to undergoing an established placebo/nocebo paradigm (see below, study design). Note that this study was conducted as part of a larger trial which also included an additional N = 60 volunteers who were randomized to a psychosocial stress protocol [data on the psychosocial stress and control groups have been reported in Roderigo et al. (30)]. Recruitment and screening procedures were accomplished with a total of N = 219 participants originally interested in the study. Reasons for non-participation were lack of interest, exclusion based on criteria specified below, and a high pain threshold that was above the herein applied safety cut-off for distensions at 55 mmHg. The study was conducted at Essen University Hospital with data collection between January 2015 and June 2016. The study protocol was approved by the local ethics committee (protocol number 13-5565-BO, approval date: August 28, 2013). All volunteers gave informed written consent in accordance with the Declaration of Helsinki and were paid for their participation.

Exclusion criteria included age <18 or >65 years, a body mass index (BMI) <18 or >30, any known medical or psychological conditions, current medication use (except thyroid medication, occasional over-the-counter drugs for minor allergies, benign headaches, etc.), current anxiety or depression symptoms above the published cut-off values on the Hospital Anxiety and Depression Scale (HADS) (34), current gastrointestinal (GI) symptoms suggestive of an undiagnosed GI condition (35), peri-anal tissue damage (e.g., painful hemorrhoids or fissures which may interfere with rectal balloon placement), and prior participation in any of our previous placebo studies. In an effort to reduce possible variability related to fluctuations of hormones across the female menstrual cycle, only women on hormonal contraceptives were recruited. All participants completed a comprehensive questionnaire battery, as detailed in Roderigo et al. (30). We herein characterized groups using the HADS (34) for symptoms of anxiety and depression, the trait version of the STAI (36) for trait anxiety, the TICS (screening scale) (37) for chronic perceived stress, and sum scores from a GI symptom questionnaire (35) to assess frequency and severity of common upper and lower GI symptoms. Note that previous experience with any type of relaxation technique, including PMR, was not an inclusion or exclusion criterion; however, volunteers were required to be willing to complete a home-based PMR training program as part of the study, as detailed below.

### Study Design and Procedures

During a 4-week period preceding the experimental study day, volunteers were instructed to complete a home-based, standardized training program in progressive muscle relaxation (PMR). This was done to achieve a large enough sample of individuals capable of completing a short relaxation exercise on the day of the study. In order to achieve proper blinding and randomization, all 180 participants underwent the training program. To do so, we selected a commercially available training manual that consisted of an illustrated book with an audio CD that contained guided training sessions. Note that the same audio-guided training CD was used by participants randomized to the brief relaxation group on the study day. Every volunteer—irrespective of possible prior experience with the PMR or other relaxation techniques—was instructed to start the training in the first week with two sessions of a long program that lasted ∼40 min. Thereafter, participants could choose between the long version and a shorter 15-min. version in the remaining training weeks, but were required to practice at least twice per week. Participants recorded their practice in a training log, and at the end of the week (i.e., on Sundays) completed a standardized questionnaire assessing the number of training sessions (N), training duration (in minutes), perceived training efficacy (7-point Likert-scale ranging from "training worked not at all" to "training worked perfectly"), psychological distress (7-point Likert-scale ranging from "felt completely relaxed" to "felt extremely distressed") and various bodily symptoms (not reported here) for the past week. Together with each weekly questionnaire, participants collected morning saliva samples for analysis of the cortisol awakening response (CAR). In case of non-compliance (i.e., on average <2 training sessions per week) participants were encouraged to continue practicing for up to two additional weeks before the study day was scheduled. 
Note that questionnaire data and CAR were not acquired to verify the efficacy of PMR training (which is impossible given the absence of a control group that did not undergo training) but rather to provide sample characteristics for comparisons of groups that on the study day were randomized to brief relaxation exercise vs. a control task.

On the experimental study day, rectal sensory and pain thresholds were initially determined with a pressure-controlled barostat system (modified ISOBAR 3 device, G & J Electronics, Ontario, Canada), using well-established methodology [e.g., (22, 30–33, 38)]. During a BASELINE, each participant received a series of painful rectal distensions titrated individually to rectal threshold (6 distensions, duration each 30 s; pauses in-between 30 s). Participants were then randomized to relaxation (practice relaxation using the 15-min. audio-CD program, N = 60) or control intervention (engage in an easy cognitive activity, e.g., crosswords, reading a magazine, N = 60) while stratifying for sex. Immediately afterwards, participants were randomized to positive (placebo), negative (nocebo), or neutral treatment suggestions (details on suggestions below). This resulted in a total of 2 (relaxation, control) × 3 (positive, negative, neutral suggestions) experimental groups consisting of N = 20 participants per group. The series of rectal distensions using the same individualized pressures as during BASELINE was then repeated (TEST).
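The resulting 2 × 3 allocation with balanced sex can be illustrated with a short sketch. The text does not describe the randomization procedure itself, and the study used two sequential randomization steps, so the single-step shuffled allocation, the function name `allocate`, and the seed below are assumptions made purely for illustration.

```python
import random
from collections import Counter

INTERVENTIONS = ["relaxation", "control"]
SUGGESTIONS = ["positive", "negative", "neutral"]


def allocate(participants, seed=0):
    """Assign (id, sex) pairs to the 2 x 3 cells, balanced within each sex.

    Hypothetical helper: within each sex stratum, a list containing every
    cell equally often is shuffled and dealt out to the participants.
    """
    rng = random.Random(seed)
    cells = [(i, s) for i in INTERVENTIONS for s in SUGGESTIONS]
    allocation = {}
    for sex in sorted({sex for _, sex in participants}):
        stratum = [pid for pid, s in participants if s == sex]
        if len(stratum) % len(cells) != 0:
            raise ValueError("stratum size must be a multiple of the number of cells")
        slots = cells * (len(stratum) // len(cells))
        rng.shuffle(slots)
        for pid, cell in zip(stratum, slots):
            allocation[pid] = cell
    return allocation


# 60 women and 60 men, as in the reported sample (IDs are made up).
participants = [(n, "f" if n < 60 else "m") for n in range(120)]
allocation = allocate(participants)
cell_sizes = Counter(allocation.values())
```

With 60 participants per sex and six cells, each cell receives 10 women and 10 men, which matches the reported N = 20 per experimental group.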

### Treatment Suggestions and Blinding

We herein implemented previously used methodology to induce placebo and nocebo effects in this visceral pain model [e.g., (30, 33); for recent discussions of methodology aspects, see (13, 14)]. In this paradigm, deceptive or truthful treatment suggestions are delivered in combination with an i.v. administration that in reality contains saline. In placebo groups, volunteers receive positive treatment suggestions regarding pain relief induced by a spasmolytic drug (i.e., Butylscopolaminiumbromid). In nocebo groups, negative suggestions regarding increased pain sensitivity due to administration of an opioid antagonist (i.e., Naloxone) are delivered. In control groups, truthful information about saline is provided. These control groups (herein referred to as "neutral" groups to distinguish from the relaxation vs. control intervention group terminology) are an essential part of the study design as they allow a differentiation and separate analyses of placebo and nocebo effects, respectively, as well as controlling for effects of time (e.g., habituation), etc.

In order to achieve proper blinding and a randomization to treatment suggestions on the study day, all volunteers received deceptive information about all possible drug treatments during recruitment and informed consent, including detailed information about typical clinical uses, pharmacodynamics, and possible side effects. Blinding of the study team interacting with volunteers on the study day was accomplished as follows: the clinical psychologist responsible for recruitment and conducting the study protocol (relaxation, control) was blinded to subsequent treatment information; the physician who delivered treatment information was blinded to prior relaxation vs. control intervention; and the female study nurse was fully blinded throughout the study day.

### Pain-Related Measures

Primary outcome measures were overall perceived visceral pain intensity and pain unpleasantness, quantified with visual analog scales (VAS, 0−100 mm, ends defined as "none" and "very much"). In addition, expected pain intensity was quantified with a VAS (0−100 mm, ends defined as "none" and "very much") prior to BASELINE and TEST, respectively.

### Additional Measures

State anxiety (STAI-S), salivary cortisol concentrations (see below), heart rate (Task Force Monitor, CNSystems Medizintechnik AG, Graz, Austria), and blood pressure were assessed repeatedly and are herein presented for a baseline (prior to first randomization to relaxation vs. control intervention), after treatment suggestions, and after the TEST series of distensions. Note that we chose not to additionally assess these stress-related measures in-between the intervention and delivery of treatment suggestions given concerns that this may disrupt or interfere with effects of relaxation on the subsequent experimental procedures.

#### TABLE 1 | Sample characteristics and measures collected during training.

*All data are shown as mean* ± *standard error of the mean, unless indicated otherwise. For all questionnaire references, see main text. HADS, Hospital Anxiety and Depression Scale; STAI, Spielberger State-Trait Anxiety Inventory (trait version); TICS, Trier Inventory for Chronic Stress (screening scale).*

*<sup>a</sup>Mean values averaged over weekly diaries completed during the 4-week training period. For detailed weekly results, see Table 2.*

*<sup>b</sup>Perceived training efficacy during the last week, rated on a seven-point Likert-scale ranging from "training worked not at all" to "training worked perfectly."*

*<sup>c</sup>Mean distress in past week rated on a seven-point Likert-scale ranging from "felt completely relaxed" to "felt extremely distressed."*

*<sup>d</sup>Cortisol awakening response measured once per week, calculated as area under the curve (AUC) with respect to increase which controls for baseline levels.*

Saliva samples were collected using Salivettes (Sarstedt, Nümbrecht, Germany). To assess the cortisol awakening response (CAR) during the 4-week home PMR training period, participants collected samples once per week immediately after awakening and 30, 45, and 60 min afterwards and stored the samples in their freezers until bringing them to the laboratory on the study day. All saliva samples, including all samples collected on the study day, were centrifuged (2,000 rpm, 2 min, 4°C) and stored at −20°C. Salivary cortisol concentrations were measured using a commercially available enzyme-linked immunosorbent assay (ELISA; IBL International, Hamburg, Germany) according to the manufacturer's protocol. Intra- and interassay variances were 4.8 and 5.9%, respectively. The detection limit was 0.138 nmol/l. The CAR was calculated as area under the curve (AUC) with respect to increase, which corrects for baseline levels, according to published recommendations (39).
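The CAR measure described above can be sketched with the trapezoid rule. This is an illustrative sketch only, assuming the commonly used definition of the area under the curve with respect to increase (total area under the samples minus the area under the first, awakening sample); the function name `auc_increase` and the cortisol values are hypothetical and are not study data.

```python
def auc_increase(times_min, cortisol):
    """Area under the curve with respect to increase (trapezoid rule).

    Returns the area between the cortisol curve and the horizontal line
    through the first (awakening) sample; negative if levels fall below it.
    """
    if len(times_min) != len(cortisol) or len(times_min) < 2:
        raise ValueError("need matching lists with at least two samples")
    # Total area under the curve with respect to ground (zero).
    auc_ground = sum(
        (cortisol[i] + cortisol[i + 1]) / 2 * (times_min[i + 1] - times_min[i])
        for i in range(len(times_min) - 1)
    )
    # Rectangle spanned by the first sample over the whole sampling window.
    baseline_area = cortisol[0] * (times_min[-1] - times_min[0])
    return auc_ground - baseline_area


# Sampling scheme from the text: awakening, then 30, 45, and 60 min later.
times = [0, 30, 45, 60]
# Hypothetical concentrations in nmol/l.
samples = [10.0, 16.0, 14.0, 12.0]
car = auc_increase(times, samples)
```

Because the baseline rectangle is subtracted, a participant whose cortisol stays at the awakening level obtains a value of zero regardless of how high that level is.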

### Statistical Analyses

All statistical analyses were conducted using SPSS version 22.0 (IBM Corporation, Armonk, NY). Power analysis using G-Power (http://www.gpower.hhu.de/) indicated that a total sample size of N = 120 has a sufficient statistical power of 1-β = 0.96 to detect large effects (f = 0.40, α = 0.05) for ANOVA interaction effects. The groups were characterized and compared with respect to sociodemographic, psychological, and clinical characteristics using Chi-Square Tests, t-tests, or Mann-Whitney-U-tests where appropriate.
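For orientation, the effect size f used in the power analysis can be related to the partial eta squared values reported in the Results. The conversion below is the standard textbook relation and is added for illustration; it is not part of the original analysis.

$$
f = \sqrt{\frac{\eta_p^2}{1 - \eta_p^2}}, \qquad \eta_p^2 = \frac{f^2}{1 + f^2} = \frac{0.40^2}{1 + 0.40^2} \approx 0.14
$$

Under this relation, the large effect assumed in the power analysis corresponds to a partial eta squared of roughly 0.14.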

Effects of the relaxation vs. control on stress markers were tested with repeated measures analysis of covariance (ANCOVA) with time as repeated factor and two between factors, namely intervention (relaxation, control) and treatment suggestions (positive, negative, neutral). Note that the factor "treatment suggestions" was included as a group factor in this analysis to test for possible interactions between the intervention and treatment suggestions on stress markers. Post hoc tests were conducted as Bonferroni-corrected ANCOVA (for comparisons between groups) or Bonferroni-corrected paired t-tests (for changes across time points within one group).

To address effects of relaxation and treatment suggestions on changes in pain expectation, pain intensity, and pain unpleasantness from BASELINE to TEST, repeated measures ANCOVAs were computed with the repeated factor time and two group factors (intervention; treatment suggestions). Bonferroni-corrected planned comparisons of pre-specified group means were accomplished with univariate ANCOVAs testing differences between positive and neutral treatment suggestion groups (for placebo effects) and between negative and neutral treatment suggestion groups (for nocebo effects). In all ANCOVAs, Greenhouse-Geisser correction was applied if the sphericity assumption was violated (based on results of Mauchly test), and HADS anxiety scores were included as a covariate, given a small but significant group difference between the relaxation and control groups (see results, **Table 1**).
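The Bonferroni adjustment applied to the planned comparisons can be illustrated as follows. This is a minimal sketch: the function name `bonferroni` and the p-values are hypothetical, and the text does not state how many comparisons formed one family, so the family size of four used here is an assumption.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values.

    Each p-value is multiplied by the number of tests in the family
    and capped at 1.
    """
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values for a family of four planned comparisons
# (e.g., placebo and nocebo contrasts within relaxation and control groups).
raw = [0.004, 0.030, 0.200, 0.600]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
```

Note that the second comparison would pass an uncorrected 0.05 threshold but no longer does after adjustment, which is the conservative behavior the correction is meant to provide.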

#### TABLE 2 | Relaxation training period.

*Data shown here extend data shown in Table 1. All data are shown as mean* ± *standard error of the mean. All N = 120 participants underwent the same home-based relaxation training, and were randomized on the study day to brief relaxation vs. control task. Weekly means were compared between groups with independent samples t-tests or Mann-Whitney U-test as appropriate. To assess changes across weeks, ANOVA and Friedman test were used on data from the whole sample (N = 120). Mean training time showed a significant decrease (F = 76.9, p < 0.001, ηp² = 0.40, ANOVA time effect). No significant changes were observed for mean training units (F = 0.5, p = 0.67, ηp² = 0.01), perceived training efficacy (Chi<sup>2</sup> = 4.2, p = 0.24), mean distress (Chi<sup>2</sup> = 3.2, p = 0.37), or cortisol awakening response (F = 1.5, p = 0.22, ηp² = 0.01).*

*<sup>a</sup>Perceived training efficacy during the last week, rated on a seven-point Likert-scale ranging from "training worked not at all" to "training worked perfectly."*

*<sup>b</sup>Mean distress rated on a seven-point Likert-scale ranging from "felt completely relaxed" to "felt extremely distressed."*

*<sup>c</sup>Cortisol awakening response was calculated as area under the curve (AUC) with respect to increase which corrects for baseline levels. Saliva cortisol was collected immediately after, as well as 30, 45, and 60 min after awakening.*

To explore if the effects of relaxation or treatment suggestions on outcomes were mediated by stress markers or expectations, we conducted correlational and mediation analyses. Correlations were computed as Pearson's r. Mediation analyses were conducted using the PROCESS SPSS macro provided by A.F. Hayes (version 2.12.2, downloaded from http://www.processmacro.org/download.html). Bootstrapping with 10,000 samples was used to determine 95% confidence intervals (CIs) to test for statistical significance.
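The logic of a bootstrapped indirect effect can be sketched in a few lines. This is not the PROCESS macro and not the authors' code: it is a simplified illustration of a simple mediation model (treatment suggestion X, pain expectation M, pain outcome Y) with a percentile bootstrap, without covariates, run on synthetic data. All function names, the seed, the simulated effect sizes, and the reduced number of resamples (2,000 instead of 10,000, to keep the example fast) are assumptions.

```python
import random


def indirect_effect(x, m, y):
    """Product a*b: a from M ~ X, b = coefficient of M in Y ~ M + X (OLS)."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    smm = sum((mi - mean_m) ** 2 for mi in m)
    sxm = sum((xi - mean_x) * (mi - mean_m) for xi, mi in zip(x, m))
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    smy = sum((mi - mean_m) * (yi - mean_y) for mi, yi in zip(m, y))
    det = sxx * smm - sxm ** 2
    if sxx == 0 or det == 0:
        return None  # degenerate resample (e.g., only one group drawn)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / det
    return a * b


def bootstrap_ci(x, m, y, n_boot=10_000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        est = indirect_effect(
            [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]
        )
        if est is not None:
            estimates.append(est)
    estimates.sort()
    lower = estimates[round(alpha / 2 * n_boot)]
    upper = estimates[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic data: 40 neutral (x = 0) and 40 positive-suggestion (x = 1) cases.
rng = random.Random(42)
x = [0] * 40 + [1] * 40
m = [50 - 15 * xi + rng.gauss(0, 8) for xi in x]   # suggestion lowers expectation
y = [20 + 0.6 * mi + rng.gauss(0, 3) for mi in m]  # expectation drives pain rating
point = indirect_effect(x, m, y)
ci_low, ci_high = bootstrap_ci(x, m, y, n_boot=2000)
```

In this framework, an indirect effect is considered statistically significant when the bootstrap confidence interval does not include zero, which is the criterion referred to in the text.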

In case of missing data (e.g., due to technical problems), data from this participant for all time points for the affected variable were omitted from analyses. Missing data for each variable are indicated in the result section. All results are reported as mean ± standard error of the mean (SEM) unless indicated otherwise. All authors had access to the study data and reviewed and approved the final manuscript.

## RESULTS

### Participants

Volunteers randomized to practice brief relaxation (N = 60) or control (N = 60) did not differ with respect to sociodemographic variables or psychosocial questionnaire scores (**Table 1**, upper section). As per exclusion criteria, mean HADS scores were within the normal range and below the clinically-relevant cut-offs. Nevertheless, mean HADS anxiety score was significantly higher in the control group (p = 0.026), and was therefore included as a covariate in subsequent analyses. No significant differences were observed in trait anxiety assessed with the STAI. This is, however, not unusual given that the HADS measures clinical symptoms of anxiety, while STAI scores primarily reflect non-clinical anxiety. The groups were comparable with respect to all measures collected during the 4-week PMR training phase (**Table 1**, lower section), including training intensity, frequency, perceived training efficacy, psychological distress, and the CAR (for weekly means, see **Table 2**). Rectal thresholds, assessed on the study day prior to first randomization, were comparable between groups (sensory threshold: 14.8 ± 0.7 mmHg relaxation group, 15.0 ± 0.7 mmHg control group, t = −0.2, p = 0.87; pain threshold: 36.6 ± 1.3 mmHg relaxation group, 35.9 ± 1.9 mmHg control group, t = −0.5, p = 0.65).

### Stress Markers

The ANCOVA computed to test effects of the brief relaxation (N = 60) vs. control intervention (N = 60) on stress markers (see **Table 3**; for group means per treatment suggestion group, see **Table 4**) revealed significant group × time interactions for systolic blood pressure (F = 9.22, p < 0.001, ηp² = 0.08) and heart rate (F = 8.10, p < 0.001, ηp² = 0.07), which decreased significantly in the relaxation but not in the control group. Salivary cortisol and state anxiety showed significant decreases over time (salivary cortisol: F = 11.68, p < 0.001, ηp² = 0.09; state anxiety scores: F = 9.56, p < 0.001, ηp² = 0.08), however, without evidence of significant group × time interactions (salivary cortisol: F = 0.07, p = 0.86, ηp² = 0.01; state anxiety scores: F = 0.53, p = 0.59, ηp² = 0.01). No significant effects were observed for diastolic blood pressure.

#### TABLE 3 | Stress parameters.

*All data are shown as mean* ± *standard error of the mean, unless indicated otherwise. Stress parameters were repeatedly assessed, i.e., at BASELINE (prior to first randomization to relaxation vs. control intervention), after treatment suggestions, and after the series of distensions (TEST). BP, blood pressure; STAI, Spielberger State-Trait Anxiety Inventory (state version).*

*<sup>a</sup>Results of analyses of covariance (ANCOVA) accounting for HADS anxiety scores. Incomplete/missing data: incomplete STAI-S questionnaires: N = 3 relaxation group, N = 2 control group; technical errors with ECG signal for heart rate: N = 3 relaxation group, N = 6 control group.*

\**p* < 0.05, \*\**p* < 0.01, \*\*\**p* < 0.001, results of Bonferroni-corrected paired t-tests comparing means vs. baseline within each experimental group. #*p* < 0.05, result of post hoc Bonferroni-corrected ANCOVA comparing relaxation and control groups at distinct time points.

### Pain-Related Measures

Expected pain intensity (**Figure 1A**) was reduced by positive and increased by negative treatment suggestions (F = 8.84, p < 0.001, ηp² = 0.14, ANCOVA main effect of treatment information; F = 32.25, p < 0.001, ηp² = 0.37, ANCOVA interaction effect of time × treatment information). Pain expectation was not affected by relaxation, as indicated by the absence of significant main or interaction effects.

For perceived pain intensity (**Figure 1B**), there was a significant effect of treatment suggestions (F = 4.38, p = 0.015, ηp² = 0.07, time × suggestion interaction; F = 3.70, p = 0.028, ηp² = 0.06, main effect of suggestion), but no effect of the intervention (F = 0.01, p = 0.98, ηp² = 0.01, time × intervention interaction; F = 0.31, p = 0.58, ηp² = 0.01, main effect of intervention) and no three-way interaction (F = 1.29, p = 0.29, ηp² = 0.02, time × suggestion × intervention interaction). Planned comparisons of group means revealed significantly reduced perceived pain intensity at TEST due to positive compared to neutral suggestions in the relaxation groups (F = 8.04, p = 0.008, ηp² = 0.19), while a similar placebo effect was not observed in the control groups (F = 0.44, p = 0.51, ηp² = 0.01). Nocebo effects, tested by comparing groups with negative vs. neutral treatment suggestions, were not observed in either intervention group (relaxation: F = 0.3, p = 0.57, ηp² = 0.01; control: F = 1.9, p = 0.17, ηp² = 0.05).

For pain unpleasantness (**Figure 1C**), a significant interaction between intervention, treatment suggestions, and time (F = 3.53, p = 0.032, ηp² = 0.06), as well as a significant effect of treatment suggestions (F = 4.41, p = 0.014, ηp² = 0.07, time × suggestion interaction; F = 3.21, p = 0.044, ηp² = 0.05, main effect of treatment suggestion) emerged, while effects of the intervention were not significant (F = 0.82, p = 0.37, ηp² = 0.01, time × intervention interaction; F = 0.37, p = 0.54, ηp² = 0.01, main effect of intervention). Planned comparisons of group means revealed significantly reduced unpleasantness at TEST in response to positive when compared to neutral suggestions (F = 7.8, p = 0.008, ηp² = 0.18) in relaxation groups, but not in control groups (F = 0.9, p = 0.34, ηp² = 0.02). No significant effects of negative suggestions were observed (relaxation groups: F = 0.02, p = 0.88, ηp² = 0.01; control groups: F = 0.63, p = 0.43, ηp² = 0.02).


FIGURE 1 | Expected pain intensity (A), perceived pain intensity (B), and perceived pain unpleasantness (C), assessed with visual analog scale (VAS, 0–100 mm) at BASELINE and TEST, in groups receiving positive, neutral, or negative treatment information after relaxation (right panels) or control (left panels). Note that pain expectation was assessed before, whereas perceived pain intensity and unpleasantness were assessed after the series of distensions during BASELINE and TEST, respectively. For ANCOVA results, please see text. \**p* < 0.05; \*\**p* < 0.01 results of planned comparisons with Bonferroni-correction at TEST (for exact *p*-values, see text) comparing groups with positive information to groups with neutral information (to test for placebo effects) and groups with negative information to groups with neutral information (to test for nocebo effects) after either relaxation or control.

### Exploratory Correlational and Mediation Analyses

To explore the role of pain expectation, we conducted correlational and mediation analyses both in the whole sample and in groups with positive suggestions (placebo groups) and negative suggestions (nocebo groups). In the whole sample of N = 120, pain expectation was significantly associated with both perceived pain intensity (r = 0.58, p < 0.001) and pain unpleasantness (r = 0.38, p < 0.001). In addition, pain expectation correlated with state anxiety (r = 0.25, p = 0.007), but not with other stress markers. No significant correlations between any other stress marker and pain outcomes were found (all p > 0.05, data not shown).

Within placebo groups (N = 40), pain expectation was positively correlated with perceived pain intensity (r = 0.54, p < 0.001, **Figure 2A**) and unpleasantness (r = 0.32, p = 0.047, **Figure 2B**). To explore if pain expectation mediated effects of positive treatment suggestions, we conducted mediation analyses on data from placebo and neutral suggestion groups (N = 80) after ensuring that positive associations remained significant in multiple regression analyses including treatment suggestions in addition to pain expectation as independent variables (data not shown). We found an indirect effect of pain expectation which mediated the association between treatment suggestions and pain intensity (**Figure 3A**) as well as unpleasantness (**Figure 3B**).

Within nocebo groups (N = 40), pain expectation was significantly associated with pain intensity (r = 0.53, p = 0.03), but not with unpleasantness (r = 0.25, p = 0.11). The former association remained significant in multiple regression analyses including treatment suggestions in addition to pain expectation as independent variables (data not shown). To explore if pain expectation mediated effects of negative treatment suggestions, we conducted mediation analysis for pain intensity on data from nocebo and neutral suggestion groups (N = 80). We found an indirect effect of pain expectation which mediated the association between treatment suggestions and pain intensity (**Figure 3C**).

We conducted additional mediation analyses to explore if putative effects of relaxation vs. control on pain intensity or unpleasantness could be explained by pain expectation. In separate analyses within the placebo, nocebo, and control groups, we did not find evidence of direct or indirect effects of relaxation on pain outcomes (data not shown). This is in line with (1) the absence of significant effects of relaxation on expectations and (2) the non-significant correlations between stress markers and outcome variables.
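The logic of the mediation analyses described above can be sketched as follows. This is a minimal illustration, assuming a simple product-of-coefficients model (suggestion → expectation → pain) with a percentile bootstrap confidence interval; the text does not state which software or estimator was used, and all function names and simulated values here are hypothetical rather than study data.

```python
import random
from statistics import fmean


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b for the path X -> M -> Y."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx  # slope of M regressed on X
    # Slope of Y on M, adjusted for X (two-predictor least squares).
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ms, ys = ([v[i] for i in idx] for v in (x, m, y))
        try:
            estimates.append(indirect_effect(xs, ms, ys))
        except ZeroDivisionError:
            continue  # degenerate resample, e.g., only one group drawn
    estimates.sort()
    lower = estimates[int(n_boot * (1 - level) / 2)]
    upper = estimates[int(n_boot * (1 + level) / 2) - 1]
    return lower, upper


# Simulated (hypothetical) data: 40 neutral and 40 positive-suggestion participants.
rng = random.Random(42)
x = [0] * 40 + [1] * 40                               # 0 = neutral, 1 = positive suggestion
m = [60 - 15 * xi + rng.gauss(0, 8) for xi in x]      # expected pain
y = [5 + 0.6 * mi + rng.gauss(0, 6) for mi in m]      # rated pain intensity

point = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y)
print(f"indirect effect = {point:.2f}, 95% bootstrap CI [{lo:.2f}, {hi:.2f}]")
```

An indirect effect is usually taken as supported when the bootstrap interval excludes zero; with small groups the interval widens, which is one reason limited statistical power matters for analyses of this kind.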

### DISCUSSION

This is the first study testing whether a behavioral intervention aimed at reducing acute stress parameters affects the response to positive and/or negative treatment suggestions in a clinically relevant model of visceral pain. Our findings suggest that a brief relaxation exercise may facilitate the induction of placebo analgesia by positive when compared to neutral treatment suggestions. These findings extend evidence that placebo analgesia can be boosted with pharmacological interventions (19, 20). There are clearly many facets surrounding the psychosocial treatment context that ultimately determine the presence and magnitude of expectancy effects. Our results support the contribution of relaxation and stress as psychobiological states within the psychosocial treatment context, factors which clearly deserve more attention in translational studies aiming to maximize positive expectancy effects in clinical settings.

Healthy volunteers were randomized to a brief muscle relaxation exercise or a control task just prior to randomly receiving deceptive positive, deceptive negative, or truthful neutral treatment suggestions regarding an intravenous infusion that was in reality saline in all groups. These treatment suggestions induced group-specific changes in pain expectation, with reduced pain expectation in groups receiving positive suggestions of pain relief (i.e., placebo groups) and increased pain expectation in groups receiving negative suggestions of enhanced pain sensitivity (i.e., nocebo groups). While the relaxation exercise had no discernable effect on pain expectation, relaxation significantly interacted with positive treatment suggestions. Planned comparisons of group means showed significantly reduced pain intensity and lower pain unpleasantness after positive compared to neutral treatment suggestions only in the relaxation groups. In other words, positive treatment suggestions only induced placebo analgesia in relaxed participants, which is partly in line with our hypothesis assuming a facilitated placebo effect. On the other hand, relaxation had no discernable effect on groups receiving negative suggestions. Since no nocebo effects were observed in either relaxation or control group, we could not confirm our hypothesis that relaxation may reduce nocebo hyperalgesia.

coefficients with 95% CIs are shown. \**p* < 0.05; \*\**p* < 0.01; \*\*\**p* < 0.001.

We chose a brief relaxation exercise as behavioral intervention with the intention to acutely reduce stress parameters within a highly standardized psychosocial treatment context. This approach was conceptually and methodologically based on our earlier brain imaging work on the role of emotional context in visceral pain processing (38). It complements placebo/nocebo studies in the broader pain field aiming to discern effects of acute stress, state anxiety or fear (23, 25, 26, 28–30, 40) on placebo analgesia or nocebo hyperalgesia. In order to verify the efficacy of the intervention and to gain insight into possible mechanisms, we assessed several relevant stress markers reflecting different biopsychological aspects of stress. Brief relaxation significantly reduced systolic blood pressure and heart rate, supporting effects on the autonomic nervous system (ANS). On the other hand, no effects on state anxiety or cortisol concentrations were found. This could indicate that measures of ANS function (herein: heart rate and blood pressure) are more sensitive or responsive to short-term effects of PMR, at least in healthy individuals. However, it should be noted that cortisol and state anxiety significantly decreased in both groups, and that these measures could not be assessed immediately after the relaxation exercise for methodological considerations. Hence, effects on state anxiety or cortisol could be difficult to detect given reductions in both groups and may have been missed herein. Nevertheless, the ANS is increasingly appreciated in the context of pain modulation [e.g., (41)], especially in acute and chronic visceral pain as a key component of the brain axis (42–50). Within the placebo field, the ANS has been proposed as a primary mediator of peripheral placebo effects in conditioning models (51, 52). Placebo analgesia evokes complex effects within the cardiovascular system, including changes in heart rate and blood pressure (25, 53). 
Blood pressure and stress were found to mediate hyperalgesia after nocebo suggestions (27), and a recent study supports a role of autonomic arousal in the persistence of nocebo hyperalgesia (54). Interestingly, the same study (54) found no correlation between either self-reported anxiety or autonomic arousal and placebo analgesia/nocebo hyperalgesia. We also explored these relationships in our dataset, and found no correlations between placebo effects and stress markers. In fact, pain expectation was the only mediator we could identify to explain the association between treatment suggestions and pain-related outcomes. These results call for caution with respect to any speculation about stress-related mechanisms and underscore the need to further study possible moderators of placebo analgesia, especially emotional factors that have been proposed to play a role in placebo analgesia (55, 56). Clearly, visceral pain modulation is complex and involves many cognitive, emotional, and possibly neurobiological factors that remain to be fully understood.

This study has strengths and limitations. Strengths include the clinically-relevant visceral pain model, blinding procedures, the combination of different psychobiological measures for traits and states, and the inclusion of groups receiving positive, negative or neutral treatment suggestions within one study. The full factorial within-between study design goes beyond correlational approaches aiming to identify psychological mediators and moderators of placebo and nocebo effects. At the same time, final group Ns are relatively small, posing limitations of statistical power, and risk of Type II error. This may for example explain why post hoc testing revealed a statistically significant reduction in pain expectation induced by positive vs. neutral suggestions only in the control but not in the relaxation group. Further, for reasons of feasibility and cost effectiveness, data from the control group were also used in Roderigo et al. (30), and there was also no additional control group that did not undergo prior relaxation training for feasibility reasons and to ensure blinding and randomization on the study day. We therefore cannot assess possible effects of prior relaxation training on measures obtained on the study day. While the absence of the brief PMR vs. control exercise effects on pain-related outcomes on the study day may be interpreted as evidence supporting a lack of relaxation effects on visceral pain, this would in our view be premature. First, we could not ascertain whether regular PMR exercise of 4 weeks did in fact induce changes in variables relevant to chronic stress. To do so was not our intention since this was not a treatment study but rather herein implemented in order to teach a sufficiently large number of study volunteers to perform PMR on the study day, aiming to realize a study design with proper randomization and blinding. 
We recruited a tightly-screened, healthy population of young individuals with comparatively low levels of chronic stress or stress-related symptom burden. Hence, our findings likely do not transfer to other populations at risk for stress-related health conditions or even patients with chronic pain, and should not be viewed as evidence for or against the potential clinical use of relaxation techniques in patients. In irritable bowel syndrome, for example, a recent meta-analysis (57) showed a clinical benefit of relaxation methods, and an older, more comprehensive Cochrane review (58) on relaxation therapy and stress management revealed medium effect sizes for symptom severity after 2–3 months, but inconsistent longer-term findings (after 6–12 months) with regard to abdominal pain and quality of life. The lack of a control group without prior relaxation training further limits our ability to test the possibility that the absence of nocebo effects could be explained by effect(s) of previous relaxation training. There are other methodological considerations regarding the absence of nocebo effects herein: Given clear effects of negative suggestions on expected pain intensity, we would argue that the nocebo manipulation did not "fail" per se. This is supported by positive correlations between pain expectation and pain intensity and, to a smaller extent, pain unpleasantness, supporting the connection between negative pain-related expectations and ratings. Whether negative expectations are more tightly "linked" with intensity than unpleasantness requires further study. Nocebo effects in visceral pain models have thus far not been studied outside of our group, and they may be more difficult to reliably elicit in the laboratory setting than placebo effects. It is conceivable that they can more effectively be induced in healthy individuals under conditions of heightened stress or arousal, e.g., in the scanner setting (33) that is per se stressful (59) or after acute psychosocial stress as shown in a separate arm of this study (30). Our nocebo paradigm relied exclusively on treatment suggestions, and the study was only powered to detect large effects. Combining suggestions with a learning experience (i.e., a preconditioning procedure consisting of the surreptitious increase/decrease of pain intensity prior to suggestions) may be more efficacious and enhance effect size (13, 22). Finally, our approach to utilize truthful information regarding i.v. administration of saline as a control (i.e., groups with "neutral suggestions") is essential to properly quantify placebo/nocebo effects and distinguish them from other effects, like habituation, sensitization, order effects, etc. At the same time, these "neutral" groups are not untreated and hence by definition not free of treatment-related expectations. This may also reduce the magnitude of expectancy effects when their detection essentially relies on group comparisons [for more detailed methodological considerations, see (13)].

Together, our data provide further evidence that psychological states may alter how individuals respond to treatment suggestions. They complement recent conceptual developments on how bodily symptoms are experienced (60), especially interoceptive symptoms (61) which are demonstrably particularly salient and unpleasant when compared to exteroceptive, somatic stimuli even at matched intensities (62). Inter-individual variability in the presence and magnitude of placebo and nocebo effects is likely not only moderated by individual traits, characteristics of the treatment, and patient-provider interactions, but also by the psychological state in which treatment expectations are formed. Our findings call for more research to unravel how psychological states and their neurobiological correlates contribute to inter-individual variability in expectancy effects on symptom perception. Further, these experimental data acquired in a clinically-relevant pain model pave the way toward translation into clinical populations implementing behavioral interventions that target patients' expectancies and (also) consider psychobiological states. Indeed, placebo and nocebo effects for interoceptive, visceral symptoms are relevant to the treatment of the large group of patients with functional gastrointestinal disorders like IBS (12), but studies are needed to test whether findings from healthy volunteers can be transferred to patients. The role of the psychobiological stress systems in the pathophysiology of these clinical conditions is undisputable, as is the importance of pain or symptom-related cognitive and emotional factors (12, 42, 63, 64). If indeed these very same systems (or one of these) impact how treatment expectations are processed, the implications are broad both for clinical practice and treatment trials. Indeed, placebo research has impressively demonstrated the clinical potential offered by psychological interventions (1, 2, 11), especially in the context of pain (1, 4). Efforts to transfer knowledge from mechanistic work to clinical routine (65) are built on evidence that placebo analgesia engages similar neurobiological mechanisms as those responsible for the efficacy of pharmacological analgesic treatment (11, 66), and effectively enhances the "pure" pharmacological effect of analgesics in experimental but also in clinical settings (1, 2, 4, 18). Together, these findings pave the way for future studies. Our findings provide a small, additional "piece of the puzzle," at minimum supporting that the recent statement "Implementation of successful treatment requires effective communication skills to improve patient acceptance, adherence and to optimize the patient provider relationship." (67) may need amendment to incorporate additional aspects of the psychosocial treatment context, including individual treatment expectations and psychobiological states.

### DATA AVAILABILITY

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

### AUTHOR CONTRIBUTIONS

SB and SE: planning of the study and acquisition of funding. TR and SB: conducting the study. TR, SB, SE, and PE: data analysis and interpretation. SE, SB, and PE: drafting of the manuscript. All authors: revision of the manuscript for critical intellectual content and approval of the final draft submitted. All authors agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

### FUNDING

Funded by a research grant (EL 236/8-2) from the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG; FOR 1328). The DFG had no role in the study design, collection, analysis, interpretation of the data, or in the writing of the report.

### ACKNOWLEDGMENTS

The authors express their gratitude to Dr. M. Schöls and M. Hetkamp for their support during data acquisition.


#### REFERENCES

nocebo information. Pain. (2015) 156:39–46. doi: 10.1016/j.pain.0000000000000004


pain identify clinically relevant pain clusters. Neurogastroenterol Motil. (2014) 26:139–48. doi: 10.1111/nmo.12245


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Elsenbruch, Roderigo, Enck and Benson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Qualitative Systematic Review of Effects of Provider Characteristics and Nonverbal Behavior on Pain, and Placebo and Nocebo Effects

*Hojjat Daniali1 and Magne Arve Flaten2\**

*1 Department of Humanity Sciences, Shahed University, Tehran, Iran, 2 Department of Psychology, Norwegian University of Science and Technology, Trondheim, Norway*

*Edited by: Paul Enck,* 

*University of Tübingen, Germany*

#### *Reviewed by:*

*Dimos-Dimitrios D. Mitsikostas, National and Kapodistrian University of Athens Medical School, Greece Andrew Geers, University of Toledo, United States*

> *\*Correspondence: Magne Arve Flaten magne.a.flaten@ntnu.no*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 24 October 2018 Accepted: 29 March 2019 Published: 15 April 2019*

#### *Citation:*

*Daniali H and Flaten MA (2019) A Qualitative Systematic Review of Effects of Provider Characteristics and Nonverbal Behavior on Pain, and Placebo and Nocebo Effects. Front. Psychiatry 10:242. doi: 10.3389/fpsyt.2019.00242*

Background: Previous research has indicated that the sex, status, and nonverbal behaviors of experimenters or clinicians can contribute to reported pain, and placebo and nocebo effects in patients or research participants. However, no systematic review has been published.

Objective: The aim of this study was to investigate the effects of experimenter/clinician characteristics and nonverbal behavior on pain, placebo, and nocebo effects.

Methods: Using EmBase, Web of Knowledge, and PubMed databases, several literature searches were conducted to find studies that investigated the effects of the experimenter's/ clinician's sex, status, and nonverbal behaviors on pain, placebo, and nocebo effects.

Results: Thirty-four studies were included, 20 on the effects of characteristics of the experimenter/clinician, 11 on the role of nonverbal behaviors, and 3 on the effects of both nonverbal behaviors and characteristics of experimenters/clinicians on pain and placebo/nocebo effects. There was a tendency for experimenters/clinicians to induce lower pain report in participants of the opposite sex. Furthermore, higher confidence, competence, and professionalism of experimenters/clinicians resulted in lower pain report and higher placebo effects, whereas lower status of experimenters/clinicians such as lower confidence, competence, and professionalism generated higher reported pain and lower placebo effects. Positive nonverbal behaviors (e.g., smiling, strong tone of voice, more eye contact, more leaning toward the patient/participant, and more body gestures) contributed to lower reported pain and higher placebo effects, whereas negative nonverbal behaviors (i.e., no smile, monotonous tone of voice, no eye contact, leaning backward from the participant/patient, and no body gestures) contributed to higher reported pain and nocebo effects.

Conclusion: Characteristics and nonverbal behaviors of experimenters/clinicians contribute to the elicitation and modulation of pain, placebo, and nocebo effects.

Keywords: contextual factors, experimenter characteristics, experimenter sex, clinician sex, nonverbal behavior, placebo effect, nocebo effect, pain

## INTRODUCTION

The present qualitative review investigated whether the characteristics or nonverbal behavior (NB) of the person administering painful stimulation affected pain or placebo/nocebo effects in the research participant. The placebo effect is a psychobiological response that may occur following the application of active and inactive interventions (1). Applying an inactive medication paired with positive information about its analgesic effects can reduce pain (2). Likewise, negative information can reverse the analgesic effect of the medication (3, 4); this is called a "nocebo effect" (5, 6). Classical conditioning (previous experience with a treatment) and verbal information about the efficacy of the treatment are involved in the induction of placebo effects, and expectations that a treatment will reduce a symptom (e.g., pain) mediate the effects of both processes (7, 8).

Expecting that a procedure will increase pain may elicit anxiety and increase pain, whereas expecting that a procedure will decrease pain may reduce anxiety and thus reduce pain (9–12). As noted, placebo effects are induced by verbal information and/or classical conditioning [e.g., Refs. (2, 4, 12–14)]. However, other factors can modulate the experience of pain and placebo and nocebo effects. Treatments, whether active or sham, are administered in a compound of situational elements such as medication features (e.g., color of a tablet), the healthcare setting (hospital or clinic layout), and the characteristics and behavior of the experimenter/clinician. Such subtle cues in the environment (7, 15, 16) can affect the treatment outcome. For instance, Levine and Gordon (17) used three different methods of administering an inert substance (injection by a person sitting beside the patient and giving suggestive information; injection by a person in an adjacent room; or an injection by a programmable machine) and showed that even subtle cues that suggest a painkiller was administered could elicit a placebo response.

This systematic review focuses on the fields of pain and placebo/nocebo effects because of their large literature base. To our knowledge, it is the first investigation of whether cues such as characteristics and NBs of the experimenter or clinician can affect pain, and placebo and nocebo effects.

#### Experimenter/Clinician Characteristics

Characteristics of experimenters/clinicians such as sex or gender contribute to the report of pain (18–21). "Gender" refers to the societal definition of characteristics for each sex and consists of beliefs of proper behaviors including pain behaviors. "Sex" refers to biological sex (20, 22). In Western societies, the stereotypical male gender role is characteristically stoic and involves impressing women through the capacity to tolerate pain, whereas the female role displays higher sensitivity to pain to induce protective behaviors in men (19). Characteristics of observers or providers can impact the experience of pain (22–25). For instance, Aslaksen et al. (25) indicated that, compared to males tested by a male experimenter, male participants who were tested by a female experimenter reported lower pain. The status of the experimenter/clinician, such as expertise, appearance, and professionalism, is another characteristic that may influence the report of pain or placebo effects (22, 26–31). For instance, Mercer et al. (32) reported that patients perceived clinicians wearing laboratory coats as more professional than clinicians in informal clothes (29, 32, 33).

#### Experimenter or Clinician Nonverbal Behaviors

NB is present in almost all human interactions and conveys information that may modulate the verbal message. NB is behavior without a linguistic component (34) and refers to expression of thoughts and feelings through nonverbal expressions (35). NBs can be automatic (36) and may gain priority when there is an incongruity between nonverbal and verbal information (37). NB is divided into positive (NBs that convey a positive emotion, attitude, or relationship) and negative (NBs that convey a negative emotion, attitude, or relationship); and into micro-level (e.g., smiling, leaning forward, hand movement, eye contact, tone of voice, and body gesture) and macro-level behaviors (i.e., a collection of micro-level behaviors that conveys a psychological meaning such as dominance, confidence, or warmth) (38). NB contributes to the building of relationships, provides signs about unspoken thoughts and emotions, and strengthens or contradicts verbal information (39). Also, the perception of NBs can be nonconscious and automatic (35, 40–43). Research suggests that NBs of experimenters/clinicians can modulate pain and placebo/nocebo effects [e.g., Refs. (22, 44)]. For instance, Ambady and Gray (40) demonstrated that clinicians' negative NBs, such as lack of smiling, a larger distance from patients, and looking away from them, contributed to decreased cognitive (focused attention and level of consciousness) and physical functioning (walking across a room and getting up from a chair) of patients. Another study indicated that negative NBs of clinicians affected patient outcomes: keeping a larger distance and not looking at patients decreased satisfaction with the consultation (45).

Thus, the characteristics and NBs of the experimenter/clinician can have consequences for health (3) and a review is therefore warranted. This review investigated 1a) whether experimenters'/clinicians' sex can impact pain and placebo/nocebo effects, 1b) whether the status of experimenters/clinicians influences pain and placebo/nocebo effects, and 2) whether experimenter/clinician NBs affect pain and placebo/nocebo effects.

### METHODS

#### Search Procedure

Searches in the PubMed, EmBase, and ISI databases (Web of Knowledge) were conducted until September 10, 2018. **Table 1** shows the list of Boolean term combinations that were used to search in each database.

### Data Extraction

Data were extracted by the first author (HD). The second author (MF) checked the extracted data. The searches resulted in 3,958 hits. Only experimental (i.e., a causal manipulation following a random assignment in an experiment or a control group), quasi-experimental (i.e., a manipulation without *a priori* random assignment), and correlational (i.e., a non-experimental method to measure the relationship between variables) studies that investigated the contribution of characteristics and/or NBs of experimenters/clinicians to placebo, nocebo, and pain were included. Studies with both humans and animals were included and the primary target outcomes were pain report and pain behavior (e.g., pain intensity, sensitivity, threshold, duration, tolerance, unpleasantness, and pain medication use). The secondary target outcomes were symptom severity, improvement rate, mood, quality of life, and treatment expectation. A placebo response was defined as the difference between a group or condition where placebo treatment was administered with information that the placebo was a painkiller, and a natural history control group or condition where no treatment was provided. Studies were also included if equal amounts of medication were administered to all participants/patients, but where different types of information (verbal and/or nonverbal) about the drug were presented to different conditions and groups. Studies that reported a placebo response only as the difference between a pretest and a posttest in the same group were excluded. Studies that reported the effects of contextual factors such as group or family membership (e.g., the role of NBs of mothers on children's reports of pain), race and ethnicity (e.g., the effects of black experimenter's sex), etc., without distinction from other characteristics of experimenters/clinicians, were excluded. There were no restrictions regarding the target population of included studies. As the terms "Sex" and "Gender" are inconsistently used in studies, both terms were entered in searches, even though the present review focuses on the effects of sex. There is no review protocol, but a list of the excluded studies is available by contacting the first author (HD) (**Figure 1**).

#### TABLE 1 | Search terms used for the database search.

In line with previous studies [e.g., Refs. (38, 40)], positive NBs were defined as leaning forward, keeping less distance to the participant or patient, more body gestures, a friendly and warm voice, frequent eye contact, nodding, and smiling. Negative NBs were defined as leaning backward, increased distance to the participant/patient, fewer body gestures, a cold and flat tone of voice, looking away, and frowning. Thirty-four studies (20 experimental, 11 quasi-experimental, and 3 correlational studies) that reported the effect of experimenter/clinician characteristics and/or NBs on placebo/nocebo effects or pain were included. Included studies were classified in two tables according to whether they addressed characteristics (sex and status) (20 studies, **Table 2**) or NBs (11 studies, **Table 3**) of the experimenter/clinician. Additionally, three studies were included in both tables as they had investigated both NBs and characteristics of experimenters/clinicians. Studies were classified according to design, number of participants, sample (healthy participants, patients, or animals), type of provider (clinician or experimenter), characteristics (**Table 2**) or NB (**Table 3**), target outcome, and the result.

#### Bias Risk Assessment

To present trustworthy outcomes, systematic reviews should acknowledge a number of risks of bias (74). Although there is no review protocol, the aims of this study did not change throughout the study and the risk of reporting bias (i.e., changing the aims according to the nature of obtained findings) was avoided (74). To avoid the risk of evidence selection bias (lack of access to all of the accessible information), the references and citation lists (in Google Scholar) of all included studies were manually searched and studies that fulfilled the inclusion criteria were entered. Although there is no consensus on which tool to use to assess the risk of bias in different types of studies, the Cochrane risk of bias tool was used to evaluate the risk of bias in experimental studies that used random assignment and a control group (75). This tool provides a categorized qualitative judgment about the level of risk (high, low, or unknown) across a number of bias types, and includes random sequence generation (i.e., concerning randomization and random sampling), allocation concealment (i.e., hiding the nature of exposure and control groups from participants and personnel), blinding of participants and personnel, blinding of outcome assessment (e.g., the level of objectiveness in outcome assessments), incomplete outcome data (i.e., concerning the missing data and dropouts), selective reporting (i.e., reported and unreported findings), and other biases [for comprehensive information, see Ref. (75)]. To evaluate the risk of bias in quasi-experimental and correlational studies, the Risk of Bias Assessment tool for Non-randomized Studies (RoBANS) was used. RoBANS can be used to assess all study types except for randomized controlled trials and contains six domains for the risk of bias, which are the selection of participants, confounding variables (i.e., lack of clear distinction between dependent and independent variables), the measurement of exposure (e.g., reliability of measures and scales used), the blinding of outcome assessments, incomplete outcome data, and selective outcome reporting. RoBANS is compatible with the Cochrane risk of bias tool and has the same qualitative judgment procedure [for more information, see Ref. (76)].

Frontiers in Psychiatry | www.frontiersin.org 7 April 2019 | Volume 10 | Article 242

TABLE 3 | Studies investigating the role of the experimenter/clinician nonverbal behaviors.

Using the Cochrane risk of bias assessment tool for experimental studies (75) and RoBANS for quasi-experimental and correlational studies (76), the risk of bias of the individual studies was judged by both authors and the second author (MF) synchronized the results in two tables (**Table 5** for Cochrane risk of bias assessment; and **Table 6** for the RoBANS; see the results).
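The assignment of assessment tools to study designs described above can be summarized in a small lookup. This is only a sketch: the design-level mapping is a simplification (the Cochrane tool was applied to experimental studies with random assignment and a control group), the study counts are those reported in the Methods, and the names `TOOL_BY_DESIGN`, `DOMAINS`, and `studies_per_tool` are hypothetical.

```python
from collections import Counter

# Which tool was applied to which study design (simplified from the text).
TOOL_BY_DESIGN = {
    "experimental": "Cochrane risk of bias tool",
    "quasi-experimental": "RoBANS",
    "correlational": "RoBANS",
}

# Bias domains named in the text for each tool.
DOMAINS = {
    "Cochrane risk of bias tool": [
        "random sequence generation",
        "allocation concealment",
        "blinding of participants and personnel",
        "blinding of outcome assessment",
        "incomplete outcome data",
        "selective reporting",
        "other biases",
    ],
    "RoBANS": [
        "selection of participants",
        "confounding variables",
        "measurement of exposure",
        "blinding of outcome assessments",
        "incomplete outcome data",
        "selective outcome reporting",
    ],
}

# Number of included studies per design, as reported in the Methods.
STUDIES_BY_DESIGN = {"experimental": 20, "quasi-experimental": 11, "correlational": 3}


def studies_per_tool(counts):
    """Sum the number of studies assessed with each tool."""
    totals = Counter()
    for design, n in counts.items():
        totals[TOOL_BY_DESIGN[design]] += n
    return dict(totals)


for tool, n in studies_per_tool(STUDIES_BY_DESIGN).items():
    print(f"{tool}: {n} studies, {len(DOMAINS[tool])} domains")
```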

#### RESULTS

A total of 34 studies were identified: 20 on the role of characteristics, 11 on the role of NBs, and 3 studies on the role of both characteristics and NBs of the experimenters/clinicians.

#### Experimenter/Clinician Characteristics

#### Experimenter/Clinician Sex and the Participants' Pain

A total of 15 studies investigated whether the sex of the experimenter/clinician affected the pain report of research participants: Six studies showed a main effect of experimenter sex: three studies showed that male experimenters induced lower pain intensities than females did (22, 59, 60), and Sorge et al. (61) showed that male experimenters induced less pain behaviors and more pain inhibition in rodents. On the other hand, two studies reported that female experimenters induced lower pain intensities than males (50, 51). Nine studies reported no main effect for the sex of the experimenter/clinician (19, 25, 47–49, 52–54, 62) (**Table 2**).

Ten of these 15 studies investigated the interaction of experimenter and subject sex. Three studies showed that, compared to male experimenters, female experimenters induced higher pain thresholds (54), lower pain intensities (19, 25), and marginally significantly lower pain unpleasantness (25) in male subjects. Two studies reported that, compared to female experimenters, male experimenters induced higher pain tolerance in female subjects (22, 62). On the other hand, five studies did not find a significant interaction of experimenter/clinician sex and participant sex on pain report (47, 50–53). The remaining four studies (48, 49, 59, 60) did not include subject sex as an independent variable and thus could not investigate the interaction of experimenter/clinician sex and participant/patient sex. One study was on animals and was not relevant in this context (61) (**Table 2**).

In sum, there is no reliable tendency for a main effect of experimenter sex on pain. However, there is some evidence of an interactive effect, as 5 of 10 studies show that the experimenter induced less pain in a subject of the opposite sex (19, 22, 25, 54, 62) (**Table 2**).

#### Experimenter/Clinician Sex and Placebo/Nocebo Effects

Two studies investigated the role of experimenter/clinician sex in placebo/nocebo effects. Aslaksen and Flaten reported that, compared to female experimenters, male experimenters contributed to higher placebo responses in male subjects (56). However, Weimer et al. (58), who studied the effects of ginger and a placebo on nausea, reported no interaction between experimenter sex and placebo responses (**Table 2**; for a review, see **Table 4**).

In sum, there is no reliable tendency for the impact of experimenter sex on placebo effects (**Table 2**).

#### Experimenter/Clinician Status and Participants' Pain

Five studies investigated the effects of experimenter/clinician status on pain reports of research participants: Three studies showed that compared to lower professional status (a student or an assistant), higher-status (e.g., a faculty member or a professor) experimenters generated higher pain thresholds (27) and tolerance (22, 26) and lower pain unpleasantness (26). Williams and colleagues (55) reported that in comparison with research assistants, clinicians contributed to more accurate pain ratings (i.e., recollections of pain intensity following a surgery, correlated with pain ratings presented at the time of surgery) in low back pain patients. Also, Egbert et al. (46) reported that confident clinicians had patients with less usage of narcotics and in a better physical and emotional state than patients of less confident clinicians (**Table 2**).

In sum, all five studies showed that higher professional status and higher confidence of experimenters/clinicians led to lower pain reports (22, 26, 27), more accurate pain ratings (55), and a better physical and emotional state (46). No studies reported other effects of experimenter/clinician status on pain (**Table 2**).

#### Experimenter/Clinician Status and Placebo/Nocebo Effects

Two studies investigated the effects of the status of experimenters/clinicians on placebo/nocebo effects: Kaptchuk and colleagues (57) showed that, compared to less confident practitioners, more confident clinicians induced higher symptom relief, higher scores on a global improvement scale, and less symptom severity in patients with irritable bowel syndrome (IBS). Howe et al. (44) reported that competent experimenters (who made no mistakes throughout the experiment) induced higher placebo effects (**Table 2**).

To sum up, two studies revealed that higher confidence and competence of experimenters/clinicians generated higher placebo effects (44, 57). No studies reported other effects of experimenter/clinician status on placebo effects (**Table 2**).

### Nonverbal Behaviors

#### Experimenter/Clinician Nonverbal Behaviors and Participants' Pain

Seven studies investigated the effects of experimenters'/clinicians' NBs on the pain of research participants. Ruben et al. (70) showed that, compared to clinicians with negative NBs, clinicians with positive NBs induced higher pain tolerance and fewer pain expressions. In another study, Ruben and colleagues (69) showed that clinicians with positive NBs generated more accurate pain ratings (i.e., consistency between expressions of pain by subjects and judgments about pain ratings by observers), compared to clinicians with negative NBs. Czerniak et al. (71) showed that a clinician with restricted movements, minimal eye contact, more typing, and lack of tactile interaction such as shaking hands induced lower pain thresholds in participants. In comparison, a clinician who had more eye contact, made more body movements, shook hands with patients, and touched the patients during the examination had patients who displayed higher pain thresholds. Bohns and Wiltermuth (67) showed that preserving the physical space (not getting too close to the participants) and speaking softly led to higher pain thresholds, whereas lack of preserving the physical space and speaking loudly led to lower pain thresholds. In addition, Egbert et al. (46) reported that patients who were visited by a more enthusiastic clinician used fewer narcotics, and their surgeons considered them to be in a better physical and emotional condition and ready for discharge from hospital. Brown et al. (64) reported no significant difference between the pain reports of participants who received "active support" (including more eye contact and body gestures) and "passive support" (lack of eye contact or body gestures). However, both groups had lower pain reports than the "alone" group (undergoing the experiment alone), suggesting that the NBs of the clinician reduced pain report. Modić Stanke and Ivanec (66), on the other hand, reported that closer physical distance of observers from participants did not have any significant effect on the pain report of participants (**Table 3**).

*\*The study of Kállai et al. (22) has reported both interaction and main effects. Therefore, this study is considered twice.*

In sum, six of seven studies concluded that positive NBs of experimenters/clinicians resulted in lower pain reports (64, 67, 70, 71), more accurate pain ratings (69), and less narcotic use and a better physical and emotional state (46), whereas negative NBs led to higher pain reports and lower pain tolerance (67, 70, 71). On the other hand, one study failed to find a significant effect of experimenters'/clinicians' NBs (66) (**Table 3**).

#### Experimenter/Clinician Nonverbal Behaviors and Placebo/Nocebo Effects

Seven studies investigated the effects of experimenters'/clinicians' NBs on placebo/nocebo effects. Gryll and Katahn (63) found that enthusiastic messages of clinicians generated higher placebo responses and less anxiety in patients who received dental treatment. Another study showed that, compared to a limited interaction (5-min duration, with very little talk about the sham injection), an augmented communication style (45-min interaction, including a warm and friendly manner) of clinicians resulted in lower pain intensity reports, higher symptom relief, higher scores on a global improvement scale, and less symptom severity (57), whereas the limited communication style led to higher pain severity reports, lower scores on the global improvement scale, less symptom relief, and higher symptom severity reports by patients (57). Furthermore, compared to a cold communication style (i.e., directing gaze and body posture away from participants and no empathic remarks), a warm communication style (i.e., gazing at the patient, welcoming in a friendly manner, an open body posture, and adding empathic remarks) of clinicians resulted in positive expectations (expectations of shorter pain duration), decreases in anxiety and negative mood, and higher treatment satisfaction in women with menstrual pain (65, 72). A cold communication style of clinicians resulted in higher anxiety levels and expectations of longer pain duration in patients (65, 72) (**Table 3**).

He et al. (73) showed that, compared to a neutral communication style (speaking in a monotone voice, neutral facial expressions, fewer hand movements, and less eye contact), clinicians with a positive communication style (including strong tone of voice, dynamic facial expressions, eye contact, hand gestures, and open body postures) induced stronger positive expectations in participants in a coordination and balance test, and these participants believed that their coordination ability had improved more (**Table 3**).

Howe et al. (44) showed that, compared to a "low-warmth" clinician who used minimal eye contact, did not smile, and kept more distance from the participant, a "high-warmth" clinician who used more eye contact, smiled more, and kept a closer distance enhanced the impact of positive expectations about the effects of an inert cream on participants' allergic responses, and lowered the allergic reactions. Valentini et al. (68) showed that, compared to neutral facial expressions, participants had higher placebo effects when they were exposed to facial expressions with more emotional content. Interestingly, higher placebo effects were reported when participants observed smiling faces (68) (**Table 3**).

To sum up, all seven studies reported that positive NBs of experimenters/clinicians enhanced the placebo effects and negative NBs lowered placebo effects or increased nocebo effects (44, 57, 63, 65, 68, 72, 73). There were no studies that indicated other effects of NBs (**Table 3**).

#### Risk of Bias Assessment

Of the 20 experimental studies, 19 had low risk of bias in random sequence generation, 16 had low risk of bias in allocation concealment, 12 had unclear risk of bias in blinding of participants and personnel, 16 had low risk of bias in blinding of outcome assessment, 18 had low risk of incomplete outcome data, and 19 had low risk of selective reporting bias (**Table 5**).

Of the 14 quasi-experimental and correlational studies, 10 studies had low risk of bias in selection of participants, 13 had low risk of confounding variables, 7 had low risk of bias in measuring the exposure, 9 had unclear risk of bias in blinding of outcome assessments, 10 had low risk of incomplete outcome data, and 8 studies had unclear risk of bias in selectively reporting the outcomes (**Table 6**).

### DISCUSSION

Several findings emerged: 1) Five of 10 studies showed an interactive effect of experimenters' and participants' sex such that experimenters induced less pain in a participant of the opposite sex. There was, on the other hand, no reliable main effect of experimenter sex on the reports of pain. 2) All five studies showed that experimenters/clinicians of a higher status and confidence induced less pain in participants or had patients who had less narcotic usage. 3) Two of two studies showed that experimenters of a high status induced larger placebo effects. 4) Six of seven studies showed that positive NBs induced less pain. 5) All seven studies showed that positive NBs induced larger placebo responses. 6) All seven studies showed that negative NBs induced lower placebo responses or higher nocebo effects.

TABLE 5 | Cochrane Risk of bias assessment for experimental studies of the effects of experimenters/clinicians characteristics and non-verbal behaviors on pain and placebo effects.

#### The Role of Experimenter/Clinician Sex on Pain and Placebo Effects

Five of 10 studies showed that participants reported lower pain when tested by an experimenter of the opposite sex. Thus, the tendency of an interaction of experimenter/clinician sex and the sex of the participant must be considered with caution. Previous studies have suggested that this tendency can be related to the experimenter gender role rather than to biological factors. For instance, Aslaksen et al. (25) showed that although female experimenters contributed to lower pain reports in male subjects, the female experimenters did not have a significant effect on the heart rate variability of the subjects. Thus, the impact of the pain stimulus on autonomic nervous system activity was the same in both male and female participants. This suggests that the lower reported pain in males tested by a female was a reporting bias. Along the same lines, Flaten et al. (2) showed that female experimenters induced lower pain reports in male participants and concluded that this could be due to a response bias in males, who may have been trying to impress female experimenters by reporting lower pain. Interestingly, Gijsbers and Nicholson (54) showed that by exaggerating the gender-related appearance and behaviors of female experimenters, the hypoalgesic effect of female experimenters on male subjects can be enlarged.

TABLE 6 | Risk of Bias Assessment for quasi-experimental and correlational studies (RoBANS) of the effects of experimenters/clinicians characteristics and non-verbal behaviors on pain and placebo effects.

Two studies (22, 62) showed that male experimenters/clinicians induced lower pain reports in female subjects. This finding contradicts the conventional gender role assumptions that assumed a helpless state for females, in which they display higher pain to induce male protection. Kállai et al. (22) showed that females reported lower pain to male experimenters and concluded that females, as well as males, try to impress opposite-sex experimenters by their ability to tolerate pain longer. This can be due to changes in the female gender role in contemporary societies in which more authority and power are granted to females.

A second explanation attributes the hypoalgesic effects of male experimenters on female subjects to the physiological aspects of females. Vigil et al. (62) tested two groups of high- and low-fertility females by male and female experimenters and showed that, compared to females who were tested by a female experimenter, high-fertility females who were tested by a male experimenter reported lower pain. This finding suggests that physiological factors can contribute to the lower pain reports of female subjects to male experimenters/clinicians. Also, this finding can partially explain why some studies [e.g., Ref. (25)] failed to observe a hypoalgesic effect of male experimenters on female subjects.

There was no reliable effect of experimenter/clinician sex on placebo analgesia (56, 58).

#### The Role of Experimenter/Clinician Status on Pain and Placebo Effects

Five studies showed that higher status of experimenters/clinicians generated lower pain reports. Campbell et al. (26) demonstrated that subjects displayed higher blood pressure reactivity and pain tolerance to higher-status experimenters and concluded that increased blood pressure stimulated pressure receptors in the vasculature that also modulate the perception of pain (77–84). The stress induced by the higher-status experimenters may therefore lead to lower pain reports (26).

Two studies demonstrated that higher status of the experimenters/clinicians induced larger placebo effects. Howe et al. (44) showed that competent clinicians enhanced the effects of positive expectations and reduced subjects' allergic responses. They suggested that outcome expectations, which are underlying factors for the placebo and nocebo effects, can be modulated by the warmth and competence of clinicians. Notably, Howe et al. (44) studied the effects of low-competence characteristics of clinicians on negative expectations, and did not observe a significant effect on negative expectations.

#### The Role of Experimenter/Clinician Nonverbal Behaviors on Pain and Placebo Effects

Six studies showed that positive NBs of experimenters/clinicians induced lower pain reports, and three studies showed that negative NBs resulted in higher pain reports. Pain is recognized as a stressor, and most painful situations induce stress and negative emotions (54, 85). Negative emotions and stress can increase the experience of pain [e.g., Refs. (56, 85)], whereas providing information about the forthcoming intervention and outcomes of a treatment may reduce the stress and negative emotions. As there can be uncertainty about the outcome of interventions (54, 85), participants/patients might use as much of the accessible information as possible to gain knowledge about the efficacy of the intervention. NBs of experimenters/clinicians can be a substantial source of information for participants/patients (36, 69, 70). In line with this, Ambady and Gray (40) showed that positive NBs of clinicians (e.g., facial expressiveness, nodding, and smiling) were associated with long-term improvements in cognitive and physical functioning of elderly patients. Previous studies have shown that clients can perceive the expectations of their providers [e.g., Refs. (36, 86)]. As interpersonal expectations are mostly communicated through NBs [e.g., Ref. (38)], positive NBs of experimenters/clinicians can be interpreted as a sign of satisfactory functioning or results and lead to a decrease in negative emotions and subsequently lower pain reports, whereas negative NBs can be taken as a sign of negative forthcoming results and lead to higher pain reports. In line with this, Egbert et al. (46) showed that patients who were exposed to enthusiastic clinicians were in a better emotional state, and Gryll and Katahn (63) showed that enthusiasm by clinicians reduced negative emotions.

Seven studies showed that positive NBs of experimenters/clinicians induced higher placebo effects, whereas negative NBs led to lower placebo effects and higher nocebo hyperalgesia. To explain the modulatory effects of NBs on placebo and nocebo effects, a similar perspective is taken. NBs may have a confirmatory (or contradictory) role for verbal information that is used to induce positive outcome expectations and placebo effects. So, positive NBs may have an additive value for the verbal information, e.g., that a tablet is a powerful pain killer, and negative NBs may contradict the verbal information and diminish the induction of placebo effects. In line with this, Howe et al. (44) showed that positive NBs of clinicians enhanced the impact of positive expectations about the effects of an inert cream on allergic responses, and He et al. (73) showed that positive NBs of clinicians induced stronger positive expectations in a coordination and balance test. Expectations may also contribute to the modulation of emotions and stress. For instance, Verheul et al. (65) and Van Osch et al. (72) reported that positive NBs of clinicians enhanced positive outcome expectancies and reduced state anxiety and negative mood, whereas negative NBs resulted in higher anxiety levels and expectations of longer pain duration.

Therefore, NBs may have an additive value for the role of verbal information in modulation of expectations, negative emotions, and stress, and hence lead to changes in amplitudes of placebo or nocebo effects. Several studies have reported failure to elicit a placebo effect [e.g., Refs. (58, 87)]. Uncontrolled NBs of experimenters/clinicians may partially account for such diversity in findings.

### CONCLUSION

This qualitative review documented the contribution of experimenters/clinicians' sex, status, and NBs, as three factors capable of altering the perception of pain, and amplitude of placebo/nocebo effects and responses.

Sex, status, and NBs of experimenters/clinicians are interwoven in every laboratory and clinical setting and the present review shows that these factors can influence research results. The failure to control for the effects of characteristics and NBs of experimenters/clinicians can explain why placebo studies occasionally yield inconsistent or variable findings [e.g., Refs. (58, 87, 88)], or why the reliability of pain measurement is limited and doubted [e.g., Ref. (25)]. To gain a deeper understanding of the effects of such nonspecific factors, this review emphasizes the need to further investigate the contribution of characteristics and NBs of experimenters/clinicians in pain and placebo effects.

#### Recommendations for Future Research

Prospective investigations are encouraged to address the following gaps in the current literature. First, to our knowledge, just two studies have investigated the separate effects of different NBs on pain and placebo effects (68, 73). Thus, future studies should specify which specific NBs (facial expressions, eye contact, nodding, physical distance, tone of voice, or body postures) have the strongest impact on pain and placebo/nocebo effects; He et al. (73) showed that compared to physical distance and body posture, facial expressions and tone of voice had stronger effects on placebo effects. However, this finding should be replicated, especially in prospective pain studies. Second, the interaction of NBs and sex of providers and subjects should be investigated to see whether NBs of experimenters can modulate the effects of sex or vice versa. Only one study has examined this and reported that positive NBs of experimenters induced lower pain reports in male subjects than in female subjects (70). Third, future studies should suggest how to control for the effects of NBs in research on pain and other symptoms. Indeed, this can only be achieved if we have more knowledge about the effects of each specific NB on pain or other symptoms. Fourth, studies could consider the effects of other genders (e.g., transgender experimenters) on the experience of pain; to our knowledge, only one study has addressed this (59) and showed that compared to a male or female experimenter who acted in accordance with their sex, a biological male who acted in a feminine way induced higher pain reports in female subjects. Fifth, there might be an interaction between experimenters'/clinicians' sex and their status. Several studies have reported that, for example, male providers were considered more credible (87); their status influenced male subjects more (27); male clinicians who were reputed for their expertise were preferred over female clinicians; and female clinicians who were reputed for their interpersonal skills were preferred more by patients (30). The possible interaction of the status and the sex of the experimenters/clinicians should be taken into account to determine whether status can modulate the effects of sex or vice versa. According to our searches, only Kállai et al. (22) have tested both sex and status systematically, but unfortunately they have not reported the interaction of sex and status of the experimenters. Lastly, the underlying mechanisms (e.g., expectations and emotions) of the effects of NBs and characteristics of experimenters/clinicians on pain and placebo effects are still unclear and should be investigated. More knowledge of these factors would be highly relevant in the training of health personnel.

### LIMITATIONS

The present study has a number of limitations that should be noted. First, the qualitative nature of this study limits the generalizability of the findings. Second, the heterogeneity of keywords used in different studies made it difficult to gain access to all related studies and may have caused a few studies to be missed; however, to prevent this, several Boolean searches were conducted and the reference and citation lists of included studies were also checked. Third is the interpretation of the findings on the interaction of the experimenters'/clinicians' sex and subjects' sex. Of the included studies, five showed an interaction and five did not. Therefore, the findings on the interaction of the experimenter/clinician and participant's sex should be interpreted with caution. Fourth is the problem of confounding in some findings, such as investigating the provider status and NBs simultaneously and without differentiation as in Kaptchuk et al. (57); or lack of clarity in methodological procedures, such as absence of differentiation in providers' sex and status as in Campbell et al. [Ref. (26) or (87)]; or lack of differentiation between verbal and nonverbal components as in Gryll and Katahn (63). Such deficiencies limit the drawing of straightforward conclusions. Additionally, this systematic review did not comprise a review protocol, but the authors tried to precisely characterize the scientific nature of this systematic review by determining *a priori* the question and the procedure relevant to the question.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

MF planned the study. HD searched and extracted the articles and MF screened them. Both authors significantly contributed to the analyses of results, drafting of the manuscript, and preparation of the final draft.

### FUNDING

The present research was funded by the Norwegian University of Science and Technology (NTNU).


(TAILOR) investigators. *BMJ Open* (2015) 5(1):e006578. doi: 10.1136/bmjopen-2014-006578


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Daniali and Flaten. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Optimizing Expectations *via* Mobile Apps: A New Approach for Examining and Enhancing Placebo Effects

*Piotr Gruszka1\*, Christoph Burger2,3 and Mark P. Jensen4*

*1 Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany, 2 Department of Basic Psychological Research and Research Methods, Faculty of Psychology, University of Vienna, Vienna, Austria, 3 Department of Applied Psychology: Work, Education and Economy, Faculty of Psychology, University of Vienna, Vienna, Austria, 4 Department of Rehabilitation Medicine, University of Washington, Seattle, WA, United States*

There is growing interest in interventions that enhance placebo responses in clinical practice, given the possibility that this would lead to better patient health and more effective therapy outcomes. Previous studies suggest that placebo effects can be maximized by optimizing patients' outcome expectations. However, expectancy interventions are difficult to validate because of methodological challenges, such as reliable blinding of the clinician providing the intervention. Here we propose a novel approach using mobile apps that can provide highly standardized expectancy interventions in a blinded manner, while at the same time assessing data in everyday life using experience sampling methodology (e.g., symptom severity, expectations) and data from smartphone sensors. Methodological advantages include: 1) full standardization; 2) reliable blinding and randomization; 3) disentangling expectation effects from other factors associated with face-to-face interventions; 4) assessing short-term (days), long-term (months), and cumulative effects of expectancy interventions; and 5) investigating possible mechanisms of change. Randomization and expectancy interventions can be realized by the app (e.g., after the clinic/lab visit). As a result, studies can be blinded without the possibility for the clinician to influence study outcomes. Possible app-based expectancy interventions include, for example, verbal suggestions and imagery exercises, although a large number of possible interventions (e.g., hypnosis) could be evaluated using this innovative approach.

#### Keywords: placebo, expectancy, intervention, app, mobile, smartphone, expectation

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany*

#### *Reviewed by:*

*Younbyoung Chae, Kyung Hee University, South Korea; Rüdiger Christoph Pryss, University of Ulm, Germany*

*\*Correspondence: Piotr Gruszka, piotr.gruszka@rub.de*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 22 March 2019; Accepted: 13 May 2019; Published: 31 May 2019*

#### *Citation:*

*Gruszka P, Burger C and Jensen MP (2019) Optimizing Expectations via Mobile Apps: A New Approach for Examining and Enhancing Placebo Effects. Front. Psychiatry 10:365. doi: 10.3389/fpsyt.2019.00365*

### INTRODUCTION

There is an increased interest in understanding the effects of placebo interventions and the mechanisms underlying these effects. While basic research has led to a better understanding of psychobiological mechanisms underlying placebo effects by means of strictly controlled experiments (1), applied research has focused on elucidating the factors contributing to placebo effects in clinical practice (2). Some of these studies have been extensively covered in the media, reflecting the interest in placebo effects among the general public. A number of researchers have emphasized the potential of maximizing placebo effects in clinical practice to optimize treatment outcomes (3–5).

Despite recent progress, research in this area faces several unsolved methodological challenges and awaits broader validation. As is the case in psychotherapy trials, blinding is extremely difficult to achieve when delivering placebo interventions (6). As a result, it has been challenging to estimate the true effects of placebos separately from the effects of experimenter bias. It is therefore crucial to develop new methods to assess placebo effects.

This paper aims to highlight several methodological advantages of using mobile apps in the area of placebo research. Methodological advantages include full standardization and more reliable blinding, randomization, and allocation concealment. By delivering expectancy interventions *via* apps, researchers can disentangle expectancy effects due to the intervention from effects induced by the patient–researcher (or patient–practitioner) interaction, allowing for the control of experimenter bias (7). Further, combining app-based placebo interventions with experience sampling offers several opportunities for addressing important research questions, such as investigating the impact of placebo interventions on symptom trajectories and on changes in expectations. Additionally, subjective ratings can be potentially complemented by objective data gathered through smartphone sensors and mobile-based experiments. Validated apps can be used for treatment delivery to a large number of people.
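The app-side randomization and allocation concealment described above could be realized along the following lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the arm labels, the `allocate_arm` and `clinician_receipt` helpers, and the idea of deriving the arm from a keyed hash with a study secret that is never shown to clinic staff are all hypothetical choices made for illustration.

```python
import hashlib
import hmac

# Hypothetical arm labels; a real study would define its own conditions.
ARMS = ("expectancy_intervention", "control")


def allocate_arm(participant_id: str, study_secret: bytes) -> str:
    """Derive the study arm from a keyed hash of the participant ID.

    Without the secret, clinic staff cannot predict or reconstruct the
    allocation (allocation concealment). The same ID always maps to the
    same arm, so the app can recompute it after a reinstall.
    """
    digest = hmac.new(study_secret, participant_id.encode("utf-8"), hashlib.sha256).digest()
    # The first byte is uniform over 0..255, so the modulo is unbiased for two arms.
    return ARMS[digest[0] % len(ARMS)]


def clinician_receipt(participant_id: str) -> dict:
    """What the clinic sees after enrollment: confirmation only, no arm."""
    return {"participant_id": participant_id, "status": "allocated"}


if __name__ == "__main__":
    secret = b"example-secret-kept-off-clinic-devices"  # assumption: stored server-side only
    print(clinician_receipt("P001"))
    print(allocate_arm("P001", secret))
```

Note that this is simple (unrestricted) randomization, so group sizes are only approximately equal; a trial that needs balanced arms would more likely use permuted blocks generated in advance and stored where clinicians cannot access them.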

### Traditional Definition of Placebo Effects

The term *placebo effect* was first described as a set of positive changes that occur after an inert or inactive treatment (i.e., placebo) was administered to patients (1). Placebo effects are usually associated with so-called blinded randomized controlled trials (RCTs), where placebos (in the form of inert pills, injections, or sham procedures that resemble the active treatments) are administered to study participants in a control group. In order to be considered specifically effective (i.e., beyond the effects of placebos), active treatments are required to outperform placebos in these trials. Optimally, study participants, researchers, study clinicians, data collectors, outcome adjudicators, and data analysts are blinded in RCTs, in order to ensure that differences between active treatments and placebos are not confounded by potential biasing factors such as experimenter effects or participants' expectations (7–9).

#### Problems and Inconsistencies Inherent in the Traditional Definition of Placebo Effects

There are, however, several problems with the abovementioned traditional definition of the placebo effect. First, by defining it as the global response to a placebo treatment, this definition combines the genuine placebo response with other confounding factors, such as natural course or fluctuations in the outcome variable, regression to the mean, the effects of additional treatment(s), observer bias, and subsiding adverse effects of any previous treatments (10, 11). Furthermore, associating placebo effects with RCTs has led to an understanding of placebo effects as (mostly) a vehicle for testing the effectiveness of treatments, such as pharmacological substances, and not otherwise of much interest. As a result, many view placebo effects as something that should be controlled, rather than investigated or used to improve health and function (12).

Second, putting the placebo itself and its inertness into the focus of the definition has led to significant confusion and controversy regarding the placebo effect (e.g., how is it possible for an inert treatment to have genuine effects?). This has resulted in a rather negative connotation being attributed to placebo effects; they are often thought to be fictitious, nonexistent, or only for the gullible [for focus group results, see Ref. (13)]. Furthermore, placebo effects are often considered unworthy and unscientific (14).

As a result of these problems, there have been various attempts to make a case for abandoning the concept of placebo effect (15) and to propose new concepts [e.g., "context effects" (12), "meaning responses" (16)]. Because the concept of placebo is deeply entrenched in the literature, proposed alternative labels and concepts have not been adopted. We have therefore decided to continue using the term *placebo effect* in this paper. However, to reconcile this concept with the current evidence, a reconceptualization is in order [also see Ref. (11)]. In short, the focus should not be on the placebo itself but on the mechanisms underlying the placebo effects. Consistent with this idea, Gliedman and colleagues stated over 60 years ago that the "so-called placebo effect should be looked upon as an epiphenomenon of complicated psychological processes, which are far more important than the disarmingly simple means utilized for its realization" (17).

### Reconceptualization of Placebo Effects

Placebo effects have been found to originate from psychobiological mechanisms in those who respond to placebos (1). Both conscious expectancies and unconscious conditioning mechanisms are assumed to be major contributing factors to placebo responses (1, 18). Previous research has shown that patients' expectations of clinical benefits play a major role in placebo effects by triggering distinct neurobiological systems that then shape the therapeutic outcomes (3, 11, 19).

When focusing on the underlying mechanism of expectancy learning, it becomes clear that placebo responses are omnipresent in clinical practice—even when no placebo is administered. When active treatments are administered, patients' responses are determined not only by specific effects of the treatments themselves but also by the patients' outcome expectations, as well as their possible interaction. This can be easily demonstrated by the so-called "open–hidden" paradigm, which has shown that treatments are more effective when they are given when the patients are present and fully aware of them (i.e., they are able to form expectations) than when they are given in a hidden manner and without patients' knowledge (11).

A large and growing literature has demonstrated that expectancy-driven placebo effects are a genuine phenomenon that occurs not only after the administration of inert but also of active treatments, and that contributes substantially to the success of many active medical treatments (1). Such effects are potentially relevant in clinical practice because they might lead to better patient health and more effective therapy outcomes. In fact, several studies have shown a positive association between optimistic outcome expectations of patients and favorable therapeutic improvements for a variety of conditions and symptoms, such as disability after surgical interventions (19), hypertension (20, 21), depression (22), anxiety (23, 24), other psychiatric disorders (25), and pain (26).

However, some researchers are less optimistic about the clinical value of placebo effects. Hróbjartsson and Gøtzsche, for example, questioned the clinical relevance of placebos in their meta-analyses (27, 28) and argued that placebos can affect only subjective outcomes such as pain but not objective health parameters. Other researchers, however, note that placebos can improve objective outcomes such as peripheral health parameters and immune responses (29, 30).

#### Expectancy Interventions: Modifying Patients' Expectations to Improve Clinical Outcome

Recently, there has been increased interest in interventions that optimize placebo effects to improve clinical outcomes in routine medical care (1, 2, 31, 32). Previous research has shown that interventions targeting outcome expectations can relieve patients' symptoms such as pain [for a meta-analysis, see Ref. (26)]. These expectancy interventions usually consist of brief procedures, such as verbal suggestions or imagery interventions, and can be implemented by clinicians in their routine clinical practice. There has been a growing interest in examining the effects of both verbal suggestions and imagery to increase patients' outcome expectations, which are then thought to enhance treatment outcomes. Such interventions have been used as part of hypnotic treatments for more than a century (33, 34). In fact, evidence indicates that expectancies are mediators of the effects of suggestions both in placebo interventions and in hypnosis (35).

Given that expectancy interventions have been shown to improve symptoms, one could argue that there is an ethical obligation to encourage their widespread implementation and application. This would raise the question regarding how such interventions can be most effectively delivered in order to reach as many patients as possible. Even if the intervention's benefit is small, it still could be considered a valuable public health intervention if it reaches a high number of people with few adverse effects.

#### Methodological Challenges in Validating Clinician-Delivered Expectancy Interventions

Despite the potential of placebo interventions for improving health outcomes, a number of researchers have noted that the efficacy of placebo interventions, such as expectancy interventions, has not been adequately validated. This lack of validation is due to the as-yet-unresolved challenges in placebo research (36, 37), such as the inability to achieve the basic prerequisites for rigorous validity testing of placebo interventions.

One critical precondition is the blinding of the person delivering the interventions. In placebo research—as is also the case in face-to-face psychotherapy trials—reliable blinding of the intervention is extremely challenging. When clinicians are delivering expectancy interventions (e.g., suggesting that pain will decrease soon), they are aware of doing so because delivering the intervention per se is a conscious social act. Thus, they cannot be blinded to treatment allocation or the type of interventions they are delivering. One can envision a variety of ways that this awareness could lead to additional conscious or unconscious changes in the clinicians' behaviors (e.g., preferential treatment) or verbal/nonverbal communication (e.g., more friendly and reassuring manner) that go beyond the expectancy intervention alone. This lack of blinding may, and probably does, result in experimenter bias (7, 38, 39), which can then contribute to spurious effects or overestimation of effect sizes. Although one might try to blind experimenters or study clinicians by not telling them about study hypotheses, their beliefs and assumptions about the intervention they are delivering can still bias outcomes.

One potential approach to understand the impact of interaction patterns on placebo effects is to manipulate factors within the patient–provider interaction. For example, Kaptchuk and colleagues (40) showed in a single-blind three-arm RCT of 262 patients with irritable bowel syndrome (IBS) that factors such as warmth, empathy, active listening, and indirect suggestions ("I have had much positive experience treating IBS and look forward to demonstrating that acupuncture is a valuable treatment in this trial") affected outcomes. It makes sense that factors such as clinicians' warmth, empathy, active listening, or suggestions have positive effects on clinical outcomes, given that similar aspects are at the heart of person-centered psychotherapy (congruence, unconditional positive regard, empathy) and hypnosis (suggestions) (41–46). However, the conclusions that can be drawn from the Kaptchuk et al. (40) and other similar studies are limited because they are generally conducted unblinded. As a result, it is not possible to conclude whether the outcomes are due to these nonspecific clinician factors (e.g., warm, friendly interaction, expectancy manipulation through verbal suggestion) that are a part of how the intervention is delivered, due to experimenter bias (e.g., differential treatment of patients beyond the actual intervention depending on their experimental condition), or both [for a review on the effects of nonverbal behaviors of experimenters on placebo effects in research participants, see Ref. (47)].

Further challenges for the rigorous evaluation of expectancy interventions are response sets, such as acquiescence bias (i.e., the patient or participant wishing to please the experimenter). It is also difficult to disentangle the impact of patient–provider interactions from other response biases. It has been shown, for example, that patients have a higher tendency for response bias when they are experiencing a warm patient–provider interaction (36). Thus, a patient might report a decrease in symptom severity to please the clinician, although it might not reflect an actual change in subjective experience.

In conventional settings, expectancy interventions are delivered by clinicians. These settings almost always involve biases such as those mentioned above. This crucially limits the interpretation of the results. These biasing effects may be an even bigger hindrance for placebo research in children and adolescents, as children are more suggestible than adults (48) and thus might be more easily influenced by experimenter or response bias.

### ADVANTAGES OF USING MOBILE APPS FOR PLACEBO RESEARCH

There has been an increased interest in apps in the field of medicine and psychology in recent years. Mobile apps are being used more and more frequently by researchers, clinicians, and patients and have the potential to revolutionize different aspects of medical and psychotherapeutic care (49–54). However, to the best of our knowledge, apps have not been systematically used to examine or deliver placebo-boosting interventions. Thus, the field could potentially profit from technological advances in the area of smartphones.

We propose in this paper that the use of mobile apps can lead to many advantageous developments in both placebo research and clinical practice: i) using smartphones can help to solve problems inherent in validating placebo-boosting interventions such as expectancy manipulations; ii) mobile apps can be used to gain a better understanding of placebo mechanisms in everyday life; and iii) once placebo-boosting interventions have been successfully validated, apps can be used as an effective way to deliver these interventions as an adjunct to therapy sessions or as a stand-alone tool to a large number of people (see **Figure 1**).

#### Validating Placebo-Boosting Interventions

In light of difficulties in reproducing major findings in psychological and medical science (55–58) in recent years, the area of placebo research might profit from innovative methodological advances. App-based studies offer several methodological advantages enabling more robust research, which can play an important role in improving the scientific status of expectancy interventions, potentially enabling them to be introduced into mainstream medicine (1).

One of the advantages of app-based expectancy interventions relates to the fact that they can be fully standardized. In the past, expectancy manipulations used different protocols and were conducted in different settings and with different samples. Thus, differences in outcomes may be related not only to different outcome measures and types of illnesses (29, 59) but also to different protocols, settings, clinicians, samples, sampling procedures, and methodological standards.

To address these issues, app-based interventions can take advantage of full standardization, thereby reducing heterogeneity. They can also ensure adherence to key characteristics of high-quality trials such as adequate randomization and allocation concealment, by performing these tasks objectively and reliably within the app. This can be achieved by placing importance on the right timing; that is, randomization and allocation concealment can be performed by the app after the clinic or lab visit (where the patient or study participant can be introduced to the app) so that it is impossible for inadequate group allocation or blinding to impact the experimenter or clinician and his or her interaction with the study participant. Alternatively, patients might use the app fully remotely without any contact at all with experimenters or clinicians. As it has been shown that trials with inadequate or unclear allocation concealment exaggerate subjective outcome effects (8, 60), the use of apps could potentially increase the reliability of effect size estimates.
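As a minimal sketch of how randomization and allocation concealment might be handled inside an app, the following Python example uses permuted-block randomization and reveals an assignment only when a participant first enrolls in the app, that is, after the clinic or lab visit. The class name, arm labels, and block size are hypothetical and are not taken from any existing app; a real trial would likely generate and store the sequence on a secured server.

```python
import random


class InAppRandomizer:
    """Permuted-block randomization performed inside the app (illustrative only)."""

    def __init__(self, arms=("expectancy", "control"), block_size=4, seed=None):
        if block_size % len(arms) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self._arms = tuple(arms)
        self._block_size = block_size
        self._rng = random.Random(seed)
        self._block = []      # upcoming assignments, never exposed to study staff
        self._assigned = {}   # participant_id -> arm

    def _new_block(self):
        # Each block contains every arm equally often, in random order.
        per_arm = self._block_size // len(self._arms)
        block = [arm for arm in self._arms for _ in range(per_arm)]
        self._rng.shuffle(block)
        return block

    def enroll(self, participant_id):
        """Assign an arm at first app launch; repeated calls return the same arm."""
        if participant_id in self._assigned:
            return self._assigned[participant_id]
        if not self._block:
            self._block = self._new_block()
        arm = self._block.pop()
        self._assigned[participant_id] = arm
        return arm

    def counts(self):
        """Number of participants per arm, e.g., for balance monitoring."""
        values = list(self._assigned.values())
        return {arm: values.count(arm) for arm in self._arms}
```

Because the assignment is drawn only when `enroll` is called, the clinician who introduced the app cannot know, and therefore cannot react to, the participant's group.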

By making the apps open-source, independent researchers could use them at minimal cost to conduct fully identical replications. Although innovative placebo interventions [e.g., open-label placebos (61, 62)] have been tested in recent years, identical replications of these studies are lacking. Given that effect sizes in psychology are, on average, only half the initial size when replicated (56), identical replications are crucial for validating expectancy interventions. Providing app-based expectancy interventions as open-source software may potentially reduce costs by streamlining research (63), thereby increasing the quality of conducted studies (64).

The standardization that results from the use of app-based expectancy interventions would lead to smaller heterogeneity and more precise replications. Thus, studies will be fully comparable and could be easily aggregated in prospective meta-analyses (65), leading to large and meaningful sample sizes, a key characteristic of robust research (66, 67). This would also allow investigators to quantify the influence of sampling procedures and sample characteristics on trial outcomes and replication rates.
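To illustrate how results from identical app-based replications could be aggregated, the sketch below pools study-level effect sizes with inverse-variance weights. The choice of a fixed-effect model is our assumption for this illustration (it is most defensible when studies are near-identical); the function name is hypothetical, and a real prospective meta-analysis would follow its preregistered model.

```python
import math


def pool_fixed_effect(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equally long")
    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    return pooled, se
```

For two hypothetical studies with effects 0.2 and 0.4 and equal variances, the pooled estimate is their simple mean, and the standard error shrinks as more studies are added.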

In addition, by standardizing expectancy interventions and increasing sample sizes, variance will be reduced. This could enable researchers to investigate the impact of expectancy interventions in different samples. Thus, conducting highly standardized app-based experimental interventions in different samples and cultures can lead to a better understanding of interpersonal and intercultural differences in expectations (68). This might lead to more precise predictions of placebo effects and to the development of more effective culturally sensitive expectancy interventions. As most research is conducted in Western, educated, industrialized, rich, and democratic (WEIRD) (69) samples, app-based expectancy interventions offer the potential of gathering data in more diverse and representative samples if uploaded to an app store or used as an add-on to established treatments.

Crucially, experimenter bias can be limited by disentangling patients' expectations from the patient–provider interaction. By delivering expectancy interventions within the app, expectations can be studied in isolation, disentangled from the effects of the patient–provider interaction. As such, expectancy interventions can be delivered at home, after seeing the clinician, thereby eliminating experimenter biases (7, 39), or even fully remotely if the app is uploaded to an app store.

## Gaining Insights Into Placebo Mechanisms in Everyday Life

#### Ecological Validity

Further, app-based expectancy interventions offer the potential to deliver interventions with high ecological validity in patients' everyday life. Thus, effects from the lab can be extended to the natural surroundings of patients, thereby increasing the potential usefulness of interventions (64). Apps also offer the opportunity of combining experience sampling procedures with expectancy interventions. Although the advantages of experience sampling methodology have been discussed in the area of psychiatry before [e.g., Ref. (70)], to our knowledge, this methodology has not yet been applied to placebo research. Experience sampling is a method for assessing momentary thoughts, feelings, and symptoms and is usually employed several times per day over consecutive days (71, 72). This structured diary method can be easily implemented in mobile apps. It offers the possibility to assess symptom trajectories in everyday life as well as underlying mechanisms, thereby increasing ecological validity.
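The experience sampling procedure described above can be sketched as a prompt schedule. The following Python example draws one random prompt time within each of several equal windows of the waking day, over consecutive days. The function name, the 09:00 to 21:00 waking window, and the default of five prompts per day are assumptions for illustration only.

```python
import random
from datetime import datetime, timedelta


def sampling_schedule(start_day, n_days=7, prompts_per_day=5,
                      day_start_hour=9, day_end_hour=21, seed=None):
    """Return prompt times: one random time inside each equal window of the day.

    start_day is expected to be a datetime at midnight of the first study day.
    """
    rng = random.Random(seed)
    window = timedelta(hours=day_end_hour - day_start_hour) / prompts_per_day
    schedule = []
    for day in range(n_days):
        day_start = start_day + timedelta(days=day, hours=day_start_hour)
        for k in range(prompts_per_day):
            # Skip k full windows, then add a random offset within window k.
            offset = window * k + window * rng.random()
            schedule.append(day_start + offset)
    return schedule
```

Stratifying prompts by window spreads assessments across the day while keeping the exact timing unpredictable for participants.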

Investigating symptom trajectories over time could enable researchers to cluster study participants into different types of responders (73). Gueorguieva and colleagues (74), for example, have investigated trajectories of depression severity in clinical trials of duloxetine showing that placebo-treated patients were characterized by different trajectories than responders and nonresponders in the antidepressant-treated subsample. Moreover, it may be possible to differentiate study participants based on early or late responses. Simons and colleagues (75) have classified response trajectories of children with chronic pain after intensive pain rehabilitation treatment into early treatment responders, late treatment responders, and nonresponders.
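As a deliberately simplified illustration of grouping participants by response timing, the sketch below labels a symptom series as an early responder, late responder, or nonresponder. This rule-based approach is a stand-in chosen for brevity and is not the statistical trajectory modeling used in the cited studies; the 30% reduction criterion and the cutoff of two assessments are hypothetical.

```python
def classify_trajectory(scores, reduction=0.3, early_cutoff=2):
    """Label a symptom series as 'early', 'late', or 'nonresponder'.

    scores[0] is the baseline rating; later entries are follow-up assessments.
    A response is the first follow-up at or below (1 - reduction) * baseline.
    """
    if len(scores) < 2 or scores[0] <= 0:
        raise ValueError("need a positive baseline and at least one follow-up")
    target = scores[0] * (1.0 - reduction)
    for index, score in enumerate(scores[1:], start=1):
        if score <= target:
            return "early" if index <= early_cutoff else "late"
    return "nonresponder"
```

With dense experience sampling data, the same idea could be applied to daily averages rather than sparse clinic assessments.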

In addition, more intensive daily experience sampling would enable researchers to investigate the variability in symptoms within and between persons following expectancy interventions. Apps might also enable researchers to gather information on adverse events and long-term data after expectancy interventions. This would make it possible to answer an important research question that has not yet been adequately addressed: Do expectancy interventions lead to long-lasting changes or only temporary improvement? Thus, this type of research has the potential to yield a much more in-depth understanding of placebo effects in everyday life.

Experience sampling might be used to assess not only symptom fluctuations but also changes in symptom expectations. Mun and colleagues (76) have, for example, investigated pain expectations in a sample of 231 individuals with rheumatoid arthritis showing that pain expectations are a reliable predictor of pain. As expectations are at the heart of placebo effects, the assessment and fluctuation of symptom expectations will add to a more precise understanding of placebo effects and a better understanding of expectations and their formation over time.

The assessment of symptoms and expectations *via* apps can be complemented by open questions and other qualitative assessments (e.g., interviews *via* smartphone chats about daily experiences) to investigate the impact of not only expectancy interventions but also daily experiences such as social interactions on symptom trajectories and expectations. A detailed understanding of these processes will enable a more precise prediction of placebo effects and will offer new avenues for individualized expectancy interventions.

#### Assessment of Objective Data

Subjective data on symptom and expectation trajectories can be complemented with data obtained through smartphone sensors. Smartphone sensors can provide researchers with data about social interactions, daily activities (e.g., physical activity and sleep quality), and mobility patterns (77, 78). Researchers targeting chronic pain could, for example, investigate how expectancy interventions affect physical activity, sleep quality, or social interactions.

Apps also offer the possibility of running behavioral experiments on smartphones. Thus, experiments from the lab could be conducted on smartphones. Free popular experimental software such as PsychoPy<sup>1</sup> is now also available for mobile devices (79, 80), potentially enabling researchers to conduct these experiments with minimal costs. A promising approach might be to develop experiments to phenotype beliefs underlying changes in expectations or to employ existing implicit measures such as the implicit association test (IAT) (81) for that purpose.

Although some researchers have argued that placebo effects lead only to an improvement in parameters that depend on subjective patient ratings (28), others came to more favorable conclusions (29). Thus, it seems crucial to find alternative ways of assessing objective data following expectancy interventions in order to resolve this issue. Smartphones and other mobile devices offer several efficient ways for doing so by assessing different types of behavioral measures in an unobtrusive way without putting additional burden on study participants.

## Treatment Delivery

#### Multiple or Repeated Interventions

Apps also can be used to deliver multiple or repeated expectancy interventions, thereby potentially increasing their efficacy. One could, for example, deliver different weekly expectancy interventions and assess their impact on symptom trajectories through the use of experience sampling. This might potentially enable researchers to investigate cumulative effects of repeated expectancy interventions. As some patients show *cognitive immunization strategies* [strategies to weaken or eliminate expectation violation or, in other words, strategies to reduce cognitive dissonance between suggested information and individual beliefs, (82, 83)], it might be necessary to deliver expectancy interventions gradually or to individualize them according to patient beliefs, person characteristics, and symptom trajectories for them to take effect.

#### Just-in-Time Adaptive Expectancy Interventions

A precise understanding of symptom and expectation trajectories complemented with behavioral data through smartphone sensors might pave the way for the development of just-in-time adaptive expectancy interventions (JITAEIs). Just-in-time adaptive interventions (JITAIs) relate to interventions that are adapted to the status or context of an individual over time (84–87). As every person has individual beliefs, it is likely that individualized interventions will have higher efficacy. Psychotherapy research has shown, for example, that resistant patients profit more from nondirective therapy than from directive approaches (88). Thus, patients with more rigid health beliefs, which make them more resistant to change, could potentially profit more from indirect suggestions ("Many patients profited from the app before") or imagery exercises (e.g., imagining healthy future self) than from direct suggestions ("You will profit from this app"). Suggestions as part of expectancy interventions might therefore be delivered based on symptom changes, patients' beliefs and needs, other personal characteristics, and data from smartphone sensors. Thus, if patients have strong beliefs about their condition (as assessed by questionnaires) and have not shown symptom improvements for several weeks, they might be offered indirect suggestions, such as, "Some patients did not seem to profit from the app in the beginning, some were even frustrated. Often, however, their symptoms did in fact improve, bit by bit." Less resistant patients, who report early improvements in symptom reduction, might be given more direct suggestions such as, "You have used the app for one week now. Your pain has already decreased. You will experience your pain decreasing even further in the coming weeks." 
Thus, the app may be programmed in such a way as to accommodate the patients' symptom ratings, other personal characteristics, and objective data gathered through smartphone sensors to deliver individualized expectancy interventions.
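The decision logic described above can be sketched as a small rule set. In the following Python example, patients with rigid beliefs and no recent improvement receive an indirect suggestion, while less resistant patients who have improved receive a direct one. The data class, the 0 to 1 rigidity scale, the cutoff values, and the "neutral" fallback are all hypothetical; a validated JITAEI would need empirically derived decision rules.

```python
from dataclasses import dataclass


@dataclass
class PatientState:
    belief_rigidity: float   # questionnaire score scaled to 0..1 (hypothetical)
    weekly_symptoms: list    # one mean symptom rating per week, oldest first


def choose_suggestion(state, rigidity_cutoff=0.7, stagnation_weeks=3):
    """Pick a suggestion style from beliefs and the recent symptom trend."""
    ratings = state.weekly_symptoms
    # Improvement: the latest weekly rating is lower than the first one.
    improved = len(ratings) >= 2 and ratings[-1] < ratings[0]
    # Stagnation: no rating in the recent window fell below the window's start.
    recent = ratings[-stagnation_weeks:]
    stagnant = len(recent) == stagnation_weeks and min(recent) >= recent[0]
    if state.belief_rigidity >= rigidity_cutoff and stagnant:
        return "indirect"
    if state.belief_rigidity < rigidity_cutoff and improved:
        return "direct"
    return "neutral"   # fallback when neither rule applies (our assumption)
```

The returned label could then be mapped to suggestion texts such as those quoted above, and further inputs (e.g., sensor-derived activity or sleep measures) could be added as extra fields of the state.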

#### Treatment Dissemination

Once an expectancy intervention is found to be effective for producing changes in reported symptoms, clinical implementation of that intervention may prove challenging, given the significant limitations on clinicians' time. Apps could potentially be used to deliver highly standardized expectancy interventions without posing an unnecessary burden on busy clinicians. Thus, app-based expectancy interventions might be used either as an add-on to existing medical and psychotherapeutic procedures or even as a stand-alone intervention.

<sup>1</sup>https://www.psychopy.org/

#### TABLE 1 | Summarized advantages of app-based expectancy interventions.

Treatment dissemination: App-based expectancy interventions can be uploaded to app stores and delivered as an add-on to existing medical and psychotherapeutic procedures or as a stand-alone intervention.

### DISCUSSION

The present paper introduces a novel approach of delivering expectancy interventions (e.g., verbal suggestions, imagery exercises) aimed at boosting placebo effects through mobile apps. Because this approach does not involve an attendant person (e.g., an experimenter or clinician) to deliver the expectancy intervention, expectancy-driven components of the intervention can be disentangled from social interaction–driven components. Such an approach can answer questions such as what aspect of the placebo effect is driven by changes in expectancies. Moreover, this approach can help us to better understand the patient populations for whom such interventions may be most effective.

Previous studies have already shown that verbal suggestions delivered by technology (i.e., audio players) are effective in improving clinical symptoms in patients. For example, playing recorded hypnosis audio tracks, consisting of verbal suggestions (also used to elicit imagery), has been shown to be effective in reducing pain [e.g., Refs. (89, 90)]. These studies, however, did not use mobile apps to deliver the verbal suggestions and thus did not exploit the full potential of available technology. Nevertheless, the findings support the approach presented in this paper as promising.

As expectancy interventions have been used primarily in experimental research in relation to an active or placebo treatment [oral, injection, cutaneous, or other; see Ref. (26)], their implementation in clinical settings may be inspired by clinical hypnosis research. Clinical hypnosis has a long history of using verbal suggestions for symptom improvement (33, 34), with several journals focusing solely on hypnosis (e.g., *American Journal of Clinical Hypnosis*, *International Journal of Clinical and Experimental Hypnosis*). Bringing both fields together (research on clinical hypnosis and placebo research) may be particularly fruitful for developing more effective expectancy interventions. Thus, we think that new expectancy interventions for mobile apps would greatly profit from research in both the clinical hypnosis and placebo research fields. However, possible app-based expectancy interventions are not limited to verbal suggestions and imagery exercises, as a large number of expectancy interventions could be delivered and evaluated using this innovative approach.

We have described several advantages of app-based expectancy interventions (see **Table 1** for an overview). This approach makes it possible to investigate placebo effects independent of the patient–provider interaction, thereby overcoming some of the inherent methodological challenges associated with placebo interventions. Highly standardized app-based expectancy interventions can lead to more robust research by enabling researchers to replicate findings more easily. Apps also can be used to phenotype placebo responses longitudinally, while investigating mechanisms of change. Researchers can integrate behavioral experiments into their apps and gather data from smartphone sensors for this purpose. This could allow researchers to predict placebo responses more precisely, helping scientists gain insights into short-term, long-term, and cumulative effects of expectancy interventions as well as adverse events. Further, app-based expectancy interventions can be individualized and delivered just in time.

This approach has significant potential for both research and clinical practice. If simple app-based interventions aimed at improving outcome expectations (e.g., verbal suggestions or imagery exercises) lead to symptom relief, they could be widely applied to optimize patient treatment. Such approaches could be used to support medical treatment more efficiently (e.g., reducing dose of medication without diminishing effects, improving outcome effects without having to raise medication dose) or even be a viable alternative to medication when the anticipated adverse effects might outweigh the benefits of drug use (91). In the field of pediatrics, where medications may have long-term side effects on children's brain development, reducing the pharmacological load might be even more relevant. Improving outcome expectations could also translate into better patient adherence and compliance (32) and reduced feelings of helplessness and hopelessness.

However, since no effectiveness data on different forms of app-based expectancy interventions are currently available, it will take further empirical research efforts to understand the kinds of expectancy interventions that are most effective under what conditions and for what populations. Eventually, it will be necessary to conduct studies with large samples to investigate precise predictors of placebo responses taking into account various data sources, including data from smartphone sensors, app-based experiments, as well as biological data. These studies will provide important information for individualizing interventions, which could subsequently be delivered just in time.

Some limitations of this approach need to be acknowledged. The first limitation refers to the fact that mobile apps cannot replace the provider–patient relationship, which is considered an important factor of placebo effects and clinical outcomes. Rather, their strength lies in their ability to systematically study expectation effects separately from social interaction effects. App-based approaches might be of special interest for i) patients who do not want to disclose their problems to clinicians and ii) patients with a high affinity for smartphones and new technology, such as children and adolescents (92). They may also be used as an add-on or aftercare to medical/psychotherapeutic procedures.

The second limitation refers to legal, ethical, and privacy-related aspects of app-based treatments. Apps that aim to treat medical conditions are considered medical devices and need to adhere to relevant regulations, such as the European Medical Device Regulation or the Food and Drug Administration (FDA) regulation in the US before entering the market (93, 94). Furthermore, using medical apps in research potentially leads to challenges relating to consent and privacy (47). The fact that the legal situations regarding medical privacy vary between countries further complicates the matter (95). Several recommendations have been made to address ethical issues, data privacy, and data security concerns, which should be considered while developing mobile apps (96–99).

The introduction of app-based interventions also comes with technological challenges. First, although there are currently only two major operating systems available for smartphones (Android and iOS), new versions of these operating systems are released continuously. Most manufacturers also provide modified versions of Android, resulting in potential compatibility issues. In addition, manufacturers provide smartphones and other mobile devices with a plethora of different hardware specifications, including different screen sizes and screen resolutions. Thus, software developers not only need to ensure that the apps run on different operating systems but also need to program them with different screen sizes, screen resolutions, and hardware specifications in mind.

Second, it has been proposed that interventions that are delivered through a mobile device might lead to heightened expectations of a high-tech treatment among patients with high affinity for their digital devices. This phenomenon has been termed "digital placebo" (100). It has been argued that trials with such app-delivered interventions have to be complemented with an active placebo control group that also involves an app (101, 102) in order to distinguish the specific effects of the app-delivered interventions from digital placebo effects.

Further, the use of mobile technology and the Internet might be contraindicated for individuals prone to Internet addiction (103). These individuals might not profit from such apps and therefore should be assigned to other treatment modalities. Also, different operating systems or smartphone technologies in general might represent confounders that could bias the results. For example, the majority of the population in Europe uses Android smartphones, whereas there is a higher proportion of iPhone users in the US. There might also be sociodemographic differences between Android and iOS users (104).

Finally, we want to stress that future research efforts should focus on the translational aspects of their findings. It is well established that many findings from studies evaluating the efficacy of behavioral and health promotion interventions have not been put into (clinical) practice. It has been pointed out that an important reason for this gap between research results and evidence-based practice may lie in the tendency of the current research culture to neglect issues of external validity (105, 106). To address this important issue, Glasgow and colleagues argue that researchers should pay attention to issues of moderating variables (external validity) in both efficacy and effectiveness studies (107). These issues also have been shown to be present in smartphone-enhanced health research, as mobile health intervention studies tend to neglect the reporting of validity indicators, including indicators of external validity (108). Although there may be practical constraints, the usefulness of future research efforts (64) might benefit from quality criteria available from published best practice standards [e.g., Consolidated Standards of Reporting Trials of Electronic and Mobile HEalth Applications and onLine TeleHealth, CONSORT-EHEALTH (109)] and evaluation frameworks [e.g., RE-AIM framework: reach, efficacy/effectiveness, adoption, implementation, maintenance (105, 110)]. These criteria might be used at different stages throughout the research process (reviewing of literature, planning, conducting, reporting) as a guide to maximize internal and external validity. These criteria include, among others, reports on sample representativeness, research setting and delivery agents, theoretical framework, the development process, source code, accessibility and features/functionalities of the app, information on instructions/reminders/prompts, sustainability of effects, and potential conflicts of interest.

Once the above-described issues have been adequately tackled and the external validity of apps addressed, the use of apps and big data could potentially open up completely new avenues of research and contribute to truly personalized and more effective treatments. We have only touched upon some of the possibilities of smartphone technology in the area of placebo research. There will be many more approaches to come in the future, which we cannot even imagine right now.

#### REFERENCES


### AUTHOR CONTRIBUTIONS

PG, CB, and MJ conceived and designed the paper. PG and CB wrote the first draft of the manuscript. PG, CB, and MJ wrote the final version.

### FUNDING

The authors acknowledge support by the DFG Open Access Publication Funds of the Ruhr-Universität Bochum.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Gruszka, Burger and Jensen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebo Effect in the Treatment of Depression and Anxiety

#### *Irving Kirsch\**

*Harvard Medical School, Boston, MA, United States*

The aim of this review is to evaluate the placebo effect in the treatment of anxiety and depression. Antidepressants are supposed to work by fixing a chemical imbalance, specifically, a lack of serotonin or norepinephrine in the brain. However, analyses of the published and the unpublished clinical trial data are consistent in showing that most (if not all) of the benefits of antidepressants in the treatment of depression and anxiety are due to the placebo response, and the difference in improvement between drug and placebo is not clinically meaningful and may be due to breaking blind by both patients and clinicians. Although this conclusion has been the subject of intense controversy, the current article indicates that the data from all of the published meta-analyses report the same results. This is also true of a recent meta-analysis of all of the antidepressant data submitted to the Food and Drug Administration (FDA) in the process of seeking drug approval. Also, contrary to previously published results, the new FDA analysis reveals that the placebo response has not increased over time. Other treatments (e.g., psychotherapy and physical exercise) produce the same benefits as antidepressants and do so without the side effects and health risks of the active drugs. Psychotherapy and placebo treatments also show a lower relapse rate than that reported for antidepressant medication.

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany*

#### *Reviewed by:*

*Charlotte R. Blease, Harvard Medical School, United States*

*Przemysław Bąbel, Jagiellonian University, Poland*

#### *\*Correspondence:*

*Irving Kirsch ikirsch@bidmc.harvard.edu*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 04 April 2019 Accepted: 22 May 2019 Published: 13 June 2019*

#### *Citation:*

*Kirsch I (2019) Placebo Effect in the Treatment of Depression and Anxiety. Front. Psychiatry 10:407. doi: 10.3389/fpsyt.2019.00407*

Keywords: placebo, nocebo, depression, anxiety, antidepressants

### INTRODUCTION

The aim of this review is to evaluate the placebo effect in the treatment of anxiety and depression. On February 19, 2012, Lesley Stahl opened a segment of the CBS news program *60 Minutes* saying, "The medical community is at war, battling over the scientific research and writings of a psychologist named Irving Kirsch. The fight is about antidepressants and Kirsch's questioning of whether they work." By that time, I had co-authored three meta-analyses and a book concerning the placebo effect in the treatment of depression (1–4). Two of these meta-analyses (2, 3) were conducted on the data sent to the Food and Drug Administration (FDA) by the manufacturers of what at that time were the six most widely prescribed antidepressants, data that we obtained using the Freedom of Information Act. We found that although the people given antidepressants showed considerable improvement in the clinical trials submitted to the FDA by the manufacturers, so did the people given placebo, and the difference in outcome between drug and placebo was below the criterion for clinical meaningfulness used by the National Institute for Health and Care Excellence (NICE), the organization that sets treatment guidelines for the National Health Service in the United Kingdom.

There is now a crisis concerning the lack of replicability of many studies in psychology and medicine (5, 6). I am pleased to report that the antidepressant meta-analyses we published have not contributed to this crisis. There are now at least nine subsequent meta-analyses aimed at replicating or discrediting our studies (7–16). Some of these were restricted to changes on the Hamilton Rating Scale for Depression (HAM-D), whereas others included data from a variety of scales. Some were conventional meta-analyses in which means and standard deviations were used to calculate effect sizes, whereas others were patient-level analyses. Although interpretations of the data varied from study to study, the results have been consistent across all of them. We had reported a mean drug-placebo difference of 1.80 points on the HAM-D and a standardized mean difference (SMD) of 0.32. The differences reported in the replications ranged from 1.62 to 2.56 HAM-D points, with SMD effect sizes ranging from 0.23 to 0.34. To put this into perspective, the NICE criteria for clinical significance of antidepressant-placebo differences are three points on the HAM-D or SMDs of at least 0.50, corresponding to what Cohen (17) proposed as a moderate effect size.
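The comparison with the NICE thresholds can be made explicit with a short calculation. The following is only an illustrative sketch: the variable and function names are assumptions introduced for this example, and the numbers are the values quoted in the paragraph above.

```python
# Illustrative sketch: compare the reported drug-placebo differences with the
# NICE criteria quoted in the text. All names are hypothetical.

NICE_HAMD_POINTS = 3.0  # NICE criterion: 3 points on the HAM-D
NICE_SMD = 0.50         # NICE criterion: SMD of at least 0.50

# Values quoted in the text: the original estimate and the replication range.
hamd_differences = {"original": 1.80, "replication_min": 1.62, "replication_max": 2.56}
smd_values = {"original": 0.32, "replication_min": 0.23, "replication_max": 0.34}


def meets_criterion(value: float, threshold: float) -> bool:
    """Return True if a drug-placebo difference reaches the given threshold."""
    return value >= threshold


hamd_meets = {k: meets_criterion(v, NICE_HAMD_POINTS) for k, v in hamd_differences.items()}
smd_meets = {k: meets_criterion(v, NICE_SMD) for k, v in smd_values.items()}

for label in hamd_differences:
    print(
        f"{label}: HAM-D {hamd_differences[label]:.2f} meets criterion: {hamd_meets[label]}; "
        f"SMD {smd_values[label]:.2f} meets criterion: {smd_meets[label]}"
    )
```

Even the largest replication values (2.56 HAM-D points, SMD 0.34) fall short of both thresholds.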

Special attention is due to the preliminary results of a patient-level meta-analysis reported by Stone et al. (15). Marc Stone is the Deputy Director for Safety at the Division of Psychiatric Products of the FDA. He and his colleagues reported a patient-level analysis of the data from all randomized placebo-controlled trials of antidepressants in the treatment of Major Depressive Disorder that had been submitted to the FDA between 1979 and 2016. The similarity between the Stone et al. results and those that my colleagues and I had reported in 2002 and 2008 is astounding. We had reported a drug response of 10.1 points on the HAM-D and a placebo response of 8.3 points, a drug-placebo difference of 1.8 points. In Stone et al.'s comprehensive analysis of the data from the 73,178 patients in the 228 trials submitted to the FDA, the drug response was 10.1 points and the placebo response was 8.3 points, yielding a drug-placebo difference of 1.80 points on the 17-item HAM-D, exactly what my colleagues and I reported in our analysis of the FDA data for the six antidepressants that we evaluated (2).

Antidepressants are also used to treat anxiety disorders. Might they be more effective in treating anxiety than in treating depression? My colleagues and I have assessed that issue in a meta-analysis of the effects of paroxetine in treating anxiety disorders (18). We chose to limit our analysis to paroxetine so that we could assess a complete dataset of unpublished pre- and post-marketing trials, as well as those that had been published. As part of a 2004 lawsuit settlement, GlaxoSmithKline was required to post online the results of all clinical trials involving its drugs on its Clinical Trial Register (19). Examining these data, we found a drug-placebo effect size (SMD) of 0.27, similar to those reported for antidepressants in the treatment of depression. In a subsequent study, Roest et al. (20) analyzed data obtained from the FDA for premarketing trials of nine second-generation antidepressants in the treatment of anxiety disorders. They reported an SMD of 0.33, similar to that reported by Sugarman and colleagues for paroxetine (18) and to those reported in the meta-analyses of antidepressants in the treatment of depression cited above. Subsequently, Sugarman and colleagues (21) replicated the Roest et al. study and found an SMD of 0.34 across all antidepressants and all anxiety disorders, with individual effect sizes ranging from 0.26 to 0.39. Thus, antidepressants are no better in treating anxiety disorders than they are in treating depression.

The impact of placebo factors in the treatment of anxiety can also be seen in a study by Faria et al. (22). Participants diagnosed with social anxiety disorder (SAD) were treated with a selective serotonin reuptake inhibitor (SSRI), escitalopram. Approximately half of the patients were accurately informed that they were taking an SSRI. The others were told that they were being given an active placebo (i.e., a drug that produces side effects but has no therapeutic effect on the condition being treated). Telling patients that they were being treated with an active medication doubled its effectiveness on a continuous measure of anxiety and tripled the response rate.

Critics have noted that the criteria proposed for clinical significance by NICE (3 points on the HAM-D or SMDs of at least 0.50) are arbitrary (23), and they are correct. The NICE criteria are as arbitrary as the criterion of p < .05 for statistical significance, the use of a 50% reduction in symptoms as a criterion of a clinical response, and the use of a HAM-D score below 8 as the criterion of remission. Given that the conventional cutoffs for statistical significance are arbitrary, as are those for assessing clinical "response" and "remission," why would we expect the criteria for the clinical significance of drug-placebo differences to be any less arbitrary?

Nevertheless, Joanna Moncrieff and I (24) have proposed empirically derived criteria for the clinical significance of antidepressant-placebo differences. We used published data from a large patient-level analysis (25) of the correspondence between changes on the HAM-D and the Clinical Global Impressions-Improvement (CGI-I) scale, which rates improvement from 1 (very much improved) through 4 (no change) to 7 (very much worse). This analysis revealed that an improvement of three points on the HAM-D (SMD = 0.375) is equivalent to a clinician rating of "no change" on the CGI-I. A CGI-I rating of "minimally improved" corresponds to a HAM-D difference of 7 points (SMD = 0.873), and a rating of "much improved" corresponds to a 14-point HAM-D difference (SMD = 1.75). None of the meta-analyses have reported drug-placebo differences that come close to reaching the criterion for CGI-I ratings of minimal improvement, even among the most severely depressed patients.
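The correspondence between HAM-D points and SMDs quoted above follows from dividing the raw difference by a standard deviation. The sketch below is illustrative only. The standard deviation is not stated in the text; it is inferred here from the quoted pair (3 points, SMD = 0.375), and the function and variable names are hypothetical. The conversion factor is specific to the dataset behind these CGI-I anchors and need not match the standard deviations used in the meta-analyses cited earlier.

```python
# Illustrative sketch: convert HAM-D point differences to standardized mean
# differences (SMD = raw difference / standard deviation).
# ASSUMPTION: the standard deviation is inferred from the pair quoted in the
# text (3 points <-> SMD 0.375); it is not stated explicitly in the source.

implied_sd = 3.0 / 0.375  # 8.0 HAM-D points under this assumption


def hamd_to_smd(points: float, sd: float = implied_sd) -> float:
    """Convert a HAM-D point difference to an SMD under the assumed SD."""
    return points / sd


# CGI-I anchors quoted in the text, keyed by HAM-D point difference.
cgi_anchors = {3.0: "no change", 7.0: "minimally improved", 14.0: "much improved"}

for points, rating in cgi_anchors.items():
    print(f"{points:>4.1f} HAM-D points -> SMD {hamd_to_smd(points):.3f} ({rating})")
```

Under this assumption the 7-point anchor comes out at 0.875 rather than the 0.873 quoted in the text, which suggests that the underlying standard deviation is slightly different from 8 or that the published figures were rounded at an earlier step.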

Many depressed patients report substantial improvement after taking antidepressant medication, as do psychiatrists when describing their outcomes. How are we to reconcile this with the consistent finding that the differences between the response to antidepressants and placebos are vanishingly small? The answer is the placebo response. Although drug–placebo differences in outcome are equivalent to no difference at all, both drug and placebo responses can be substantial. The improvement of 8.3 points following placebo treatment and 10.1 points on the active drugs reported by Kirsch et al. (3) and Stone et al. (15) corresponds to CGI-I ratings between minimally improved and much improved. It is only the 1.8-point difference that corresponds to a CGI-I rating of no change. Thus, the clinically meaningful improvement seen following prescriptions of antidepressants is largely due to the placebo response (i.e., the placebo effect, regression toward the mean, and spontaneous remission).

The failure to find meaningful differences between antidepressants and placebos has been blamed on increasing placebo responses over the years (26), and some meta-analyses have shown increases in both the placebo response and the drug response over time [e.g., Ref. (27)]. However, the comprehensive analysis of all trials submitted to the FDA from 1979 to 2016 tells a different story (15). The placebo response was 8.3 HAM-D points in both 1979 and 2016, with little variation between those dates. There was a small decrease (0.8 points) in the drug–placebo difference over time, but this was due to a 0.8-point decrease in the drug response (from 10.7 points in 1979 to 9.9 points in 2016), rather than an increase in the placebo response.
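The arithmetic behind this comparison can be laid out in a few lines. The following is a minimal sketch using only the endpoint values quoted above; the variable names are hypothetical.

```python
# Illustrative sketch: drug-placebo difference at the two endpoints quoted in
# the text (HAM-D points). Variable names are hypothetical.

responses = {
    1979: {"drug": 10.7, "placebo": 8.3},
    2016: {"drug": 9.9, "placebo": 8.3},
}

# Drug-placebo difference in each year, rounded to one decimal place.
differences = {
    year: round(r["drug"] - r["placebo"], 1) for year, r in responses.items()
}

# Changes between 1979 and 2016.
change_in_difference = round(differences[2016] - differences[1979], 1)
change_in_drug = round(responses[2016]["drug"] - responses[1979]["drug"], 1)
change_in_placebo = round(responses[2016]["placebo"] - responses[1979]["placebo"], 1)

print(f"difference in 1979: {differences[1979]}, in 2016: {differences[2016]}")
print(f"change in difference: {change_in_difference}")
print(f"change in drug response: {change_in_drug}, in placebo response: {change_in_placebo}")
```

The whole change in the drug-placebo difference is accounted for by the change in the drug response; the placebo response contributes nothing.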

### PLACEBO EFFECTS VERSUS PLACEBO RESPONSES

In 1965, Fisher and colleagues (28, pp. 57–58) noted that "a clinical response following treatment (*drug response*) is not synonymous with an effect which can be attributed to the treatment (*drug effect*)." In 1998, Kirsch and Sapirstein (4) extended this distinction to placebo responses and effects, and in 2018, a group of 29 internationally recognized placebo researchers published a "consensus statement," in which they endorsed the view that "the placebo and nocebo response includes all health changes that result after administration of an inactive treatment (i.e., differences in symptoms before and after treatment), thus, including natural history and regression to the mean. The *placebo and nocebo effect* refers to the changes specifically attributable to placebo and nocebo mechanisms" (29, p. 206). The meta-analyses described above indicate a strong placebo response but, with one exception, they do not assess the placebo effect.

In the one exception (4), Guy Sapirstein and I assessed the placebo effect by comparing the placebo response in drug trials to changes observed in no-treatment natural-history control conditions in psychotherapy studies. We found that 25% of the drug response was duplicated in the no-treatment groups, and 75% of the drug response was found in the placebo groups. Thus, the placebo effect was 50% of the drug response—double the drug effect and also double the response found in the no-treatment controls. It was a genuine placebo effect.
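The decomposition in this paragraph is a pair of subtractions. It is sketched below with all quantities expressed as percentages of the total drug response. The variable names are hypothetical, and the subtraction assumes that the components are additive.

```python
# Illustrative sketch of the decomposition described above, expressed as
# percentages of the total drug response. Variable names are hypothetical.

drug_response = 100.0         # total improvement in drug groups, taken as 100%
placebo_response = 75.0       # share of the drug response found in placebo groups
no_treatment_response = 25.0  # share duplicated in no-treatment groups

# Placebo effect: what placebo adds beyond no treatment (assumes additivity).
placebo_effect = placebo_response - no_treatment_response
# Drug effect: what the drug adds beyond placebo (assumes additivity).
drug_effect = drug_response - placebo_response

print(f"placebo effect: {placebo_effect:.0f}% of the drug response")
print(f"drug effect:    {drug_effect:.0f}% of the drug response")
print(f"placebo effect / drug effect:           {placebo_effect / drug_effect:.1f}")
print(f"placebo effect / no-treatment response: {placebo_effect / no_treatment_response:.1f}")
```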

A limitation of our study was that data in the no-treatment groups and data in the placebo groups came from different studies. That limitation has been overcome in a clinical trial reported by Leuchter and his colleagues (30). This was a three-arm study, in which depressed patients were randomized to either antidepressant plus supportive care, placebo plus supportive care, or supportive care alone. Mean HAM-D improvement was 10.05 points in the antidepressant group and 7.59 in the placebo group, but only 1.37 in the supportive care only group. As in the Kirsch and Sapirstein study, the response in the placebo group was mostly a genuine placebo effect and not simply due to spontaneous improvement or regression toward the mean.
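The same subtraction can be applied to the three arms of this trial. The sketch below uses the mean HAM-D improvements quoted above. The variable names are hypothetical, and treating the arm differences as the placebo effect and the drug effect assumes additivity, which the text does not state.

```python
# Illustrative sketch: separate the placebo effect from the placebo response
# using the three-arm means quoted in the text (HAM-D improvement, in points).
# ASSUMPTION: effects are additive, so arm differences can be read as effects.

antidepressant_plus_care = 10.05
placebo_plus_care = 7.59
supportive_care_only = 1.37

# Placebo effect: improvement on placebo beyond supportive care alone.
placebo_effect = placebo_plus_care - supportive_care_only
# Drug effect: improvement on the antidepressant beyond placebo.
drug_effect = antidepressant_plus_care - placebo_plus_care
# Share of the placebo response that is not explained by supportive care alone.
placebo_effect_share = placebo_effect / placebo_plus_care

print(f"placebo effect: {placebo_effect:.2f} points")
print(f"drug effect:    {drug_effect:.2f} points")
print(f"share of placebo response attributable to placebo effect: {placebo_effect_share:.0%}")
```

On these figures roughly four fifths of the improvement in the placebo arm goes beyond what was seen with supportive care alone.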

### IS THERE A DRUG EFFECT AT ALL?

Although the difference between antidepressant and placebo is not clinically meaningful, it is statistically significant. Can we interpret that small but statistically significant difference as a genuine drug effect? Although that cannot be ruled out, there is another possibility. Clinical trials in which patients and/or their doctors or other outcome raters are asked to judge whether the patient was given an active drug or a placebo are consistent in showing that those judgements are very accurate. This indicates that the trials are not really double-blind. Numerous studies have shown that when patients know they are getting a drug, they are more responsive to the drug than when they know they might be getting a placebo (31–35). This indicates a placebo effect component in the drug response. Similarly, the placebo response is smaller when people know they might be getting a placebo than when they are led to believe that they are getting the active drug (31, 36). Therefore, the small drug–placebo difference in outcome might be due to the increased response in the drug group and decreased responding in the placebo group produced by what participants are told about the trials.

In 1986, Rabkin and her colleagues (37) published a study in which doctors and their depressed patients who had been randomized to imipramine, phenelzine, or placebo were asked to guess the group to which the patients had been assigned. Overall, 78% of patients and 87% of the doctors accurately identified whether the patients had been given an active drug or a placebo. As shown in **Figure 1**, patients randomized to active drug groups were especially successful in breaking blind, whereas those receiving placebo seem to be merely guessing. In contrast, doctors showed high levels of accuracy in identifying group assignment for patients in the placebo groups as well as those in the drug groups. Furthermore, this pattern of results has been replicated successfully in subsequent studies (38–41), indicating that it is reliable. Rabkin et al. concluded that "in view of these findings we recommend that investigators routinely record and report doctor and patient opinions about treatment assignment in randomized trials, preferably both early in the trial and at the end" (p. 86). Unfortunately, this recommendation has been largely ignored.

Given these exceptionally high rates of breaking blind, the next question is whether this phenomenon is associated with the outcome of clinical trials. In 2013, Baethge and colleagues (42) reported the results of a meta-analysis addressing this issue. In 47 clinical trials of psychiatric disorders in which blinding was assessed, the correlation between patient accuracy and the drug–placebo effect size was .51 (p = .002) and that between rater accuracy and effect size was .55 (p = .067). Thus, the greater the likelihood of breaking blind, the greater the drug–placebo difference.

However, there is an interpretive problem with respect to understanding the direction of causality in the data on accuracy of judgements of group assignment. In most of the studies in which blinding was assessed, the assessment was made near the end of the trial. Thus, it is possible that breaking blind is a consequence rather than a cause of drug–placebo differences. However, some of the data reported by Rabkin et al. (37) indicate that breaking blind is not solely a consequence of the patients' responses to treatment. **Figure 2** displays the accuracy of judgements separately for patients who responded to treatment and those who did not. Of particular interest is the ability of both patients and doctors to accurately guess group assignment of nonresponders in the drug group. Seventy-four percent of nonresponders who received an active drug judged themselves to be on the drug, as did 84% of their doctors. Furthermore, almost half of responders to placebo guessed they were on placebo. Although this would be expected by chance guessing, it indicates that the improvement experienced by these placebo responders did not lead them to think they were taking an active medication. Taken together, these data indicate that although response to treatment influences patients' and doctors' judgements of treatment assignment, it does not fully explain the accuracy of those judgements.

I and others (1, 43, 44) have hypothesized that the presence of side effects is responsible for breaking blind. As part of the informed consent processes, patients in clinical trials are told that they might receive a placebo. They are also told that the medication under investigation has side effects, and they are told exactly what the known side effects are. Now placebos can also generate side effects, a phenomenon known as the nocebo effect, but they do so to a much lesser degree than active medications (45). This difference in side effects might lead patients in clinical trials, as well as the clinicians who rate their improvement, to figure out to which group they have been randomized. To the extent that this occurs, the trial is not really double-blind. In this section, I describe data indicating that patients in clinical trials often do break blind and that breaking blind affects the outcomes of the trials.

Studies have shown mixed results for the hypothesis that drug–placebo differences are associated with reported side effects (46–51). However, side effects may be only one of the cues leading participants in clinical trials to break blind. Joanna Moncrieff (52) has hypothesized that people learn how to recognize the sometimes subtle changes produced by medications without necessarily reporting symptoms that would be listed as a side effect on the checklists used to assess them.

Two studies conducted by Aimee Hunter and colleagues at UCLA provide indirect support for this hypothesis (53, 54). In each of these studies, depressed patients in clinical trials were grouped according to whether they had ever been on antidepressants before. As displayed in **Figure 3**, there were virtually no differences at all between drug and placebo among patients who had never taken antidepressants before. In contrast, among those with prior experience, drug–placebo differences were both significant and substantially larger than those reported in other clinical trials, whereas the combined differences for antidepressant-experienced and antidepressant-naive participants are in the same range as those of other clinical trials. Taken together, the data from both studies strongly suggest that prescriptions for antidepressants should not be given to depressed people who have never taken them before.

### WHAT IS TO BE DONE?

How then shall we treat depression? One suggestion that has been made to me informally is to prescribe antidepressants as active placebos. An active placebo is a pharmacologically active substance that does not have specific activity for the condition being treated. Antidepressant medications have little or no pharmacological effects on depression or anxiety, but they do elicit a substantial placebo effect. Could we not use them as a means of capitalizing on the power of placebo?

The problem with this suggestion is that treatment decisions need to be based on an assessment of risks, as well as benefits. The risks of antidepressant treatment include suicidal and violent aggressive behavior in adolescents and young adults; stroke, death from all causes, falls and fractures, and epileptic seizures in the elderly; and sexual dysfunction, withdrawal symptoms, diabetes, deep vein thrombosis, and gastrointestinal and intracranial bleeding in everyone else (55–62). One might argue that these risks might be worth taking for an effective treatment of severe depression, but are they worth risking for a treatment that has no benefit at all over placebo for first-time users?

A second possibility would be to prescribe placebos. They are safe and effective, with relatively few nocebo side effects and no health risks. The problem with prescribing placebos rests with the commonly held assumption that to be effective in clinical practice, placebos have to be presented deceptively as active medications. This assumption has been reported to be false in recent clinical trials [reviewed in Ref. (63)]. In these studies, placebos were presented non-deceptively as placebos with no active ingredients. How could this ever work? The answer is that the placebo was accompanied by a rationale explaining that placebos have been found effective for the condition being treated, that the placebo effect has been found to involve Pavlovian conditioning, and that a placebo might therefore be effective in treating the person's condition. This rationale has been found to be critical for the success of the open-label placebo (OLP) intervention (64). Additional OLP trials with larger samples, longer duration, and blinded assessors are warranted.

Unfortunately, only one of the studies assessing OLPs involved the treatment of depression, and that one, although showing promising results, was only a small pilot (65). However, there are many other treatments that equal antidepressants in terms of degree of symptom reduction (66–69). These include psychotherapy, physical exercise, acupuncture, omega-3, homeopathy, tai chi, qigong, and yoga. We do not know the mechanisms of these alternative treatments, and their efficacy may be at least partly due to expectancy, but they are certainly safer than antidepressant medication.

The long-term advantage of psychotherapy over medication has been shown in a number of studies [reviewed in Ref. (70)]. Whereas short-term outcomes were equivalent between the two treatments, long-term outcomes were significantly better for patients who had received psychotherapy than for those who had received medication. Additionally, the National Institute of Mental Health (NIMH) Treatment of Depression Collaborative Research Program reported relapse rates of 36% and 33% for cognitive behavior therapy and interpersonal therapy, respectively, compared with a 50% relapse rate for antidepressant medication (71). However, the rate of relapse for patients who had recovered on placebo was 33%, the same as that for psychotherapy. There are two take-home messages from these data. First, they dispel the myth that placebo responses are short-lived. Second, they raise the question of whether psychotherapy reduces relapse or medication increases it (72).

Support for the hypothesis that antidepressant medication increases the risk of relapse comes from other studies comparing antidepressant and placebo treatment for depression and anxiety disorders. Consistent with the NIMH data, a 2011 meta-analysis reported a relapse rate of 25% for depressed patients successfully treated with placebo compared to relapse rates ranging from 42% to 57% among those treated with various antidepressants (73). A direct test of the effect of antidepressants and psychotherapy on the risk of relapse comes from a study on the treatment of panic disorder (74). The study compared the 6-month relapse rates for patients who had been treated with a tricyclic antidepressant (imipramine), cognitive behavior therapy (CBT), or the two combined. The results, displayed in **Figure 4**, indicate that the risk of relapse following imipramine was more than double that following CBT. However, the addition of the antidepressant to CBT completely erased that benefit. Similarly, physical exercise as a treatment for depression has been shown to have a much lower relapse rate than SSRIs, but that benefit disappears when the two treatments are combined (75).

These studies reveal another benefit of including placebos in clinical trials of medication. They can reveal situations in which the treatment does more harm than good for the condition being treated. For example, placebos have outperformed antipsychotic medication (haloperidol and risperidone) in the treatment of delirium in palliative care patients and aggression in intellectually disabled adults (76, 77). Similarly, placebo was significantly better than a combination of chondroitin and glucosamine in the treatment of knee osteoarthritis (78) and showed similar superiority in a trial of nutraceuticals in the treatment of depression (79).

Given these data, I suggest that the following principles be used in treatment selection. When treatments are equally effective, recommend the safest. When they are equally safe, let the patient choose which he or she prefers. Before making this choice, however, patients should be accurately informed of the potential harms of antidepressant medication (e.g., increased risk of relapse, suicidality, gastrointestinal and intracranial bleeding, deep vein thrombosis, pulmonary embolism, diabetes, stroke, epilepsy, and death from all causes), as well as the finding that all of these treatments appear to be equally effective in the short term but that psychotherapy and physical exercise might be more effective than antidepressants in the long run.

### AUTHOR CONTRIBUTIONS

IK wrote the article.

### REFERENCES


in a placebo-controlled trial of imipramine and phenelzine. *Psychiatry Res* (1986) 19:75–86. doi: 10.1016/0165-1781(86)90094-6


http://www.sciencedirect.com/science/article/pii/S000578941930005X. doi: 10.1016/j.beth.2019.01.002


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer CB declared a shared affiliation, with no collaboration, with the author to the handling editor.

*Copyright © 2019 Kirsch. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Experimental Assessment of Nocebo Effects and Nocebo Side Effects: Definitions, Study Design, and Implications for Psychiatry and Beyond

*Kate Faasse1\*, Suzanne G. Helfer2, Kirsten Barnes3, Ben Colagiuri3 and Andrew L. Geers4*

*1 School of Psychology, University of New South Wales, Sydney, NSW, Australia, 2 Department of Psychology, Adrian College, Adrian, MI, United States, 3 School of Psychology, University of Sydney, Sydney, NSW, Australia, 4 Department of Psychology, University of Toledo, Toledo, OH, United States*

#### Keywords: nocebo effect, side effects, experimental design, research methods—psychology, no treatment control

#### *Edited by:*

*Katja Weimer, University of Ulm, Germany*

#### *Reviewed by:*

*Przemysław Bąbel, Jagiellonian University, Poland*

*\*Correspondence: Kate Faasse k.faasse@unsw.edu.au*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 28 February 2019 Accepted: 20 May 2019 Published: 14 June 2019*

#### *Citation:*

*Faasse K, Helfer SG, Barnes K, Colagiuri B and Geers AL (2019) Experimental Assessment of Nocebo Effects and Nocebo Side Effects: Definitions, Study Design, and Implications for Psychiatry and Beyond. Front. Psychiatry 10:396. doi: 10.3389/fpsyt.2019.00396*

Interest in nocebo effects is increasing exponentially: a Google Scholar search for articles referencing *nocebo* in 1998 (20 years ago) yields 90 results, increasing to 449 in 2008 (10 years ago) and to 1600 in 2018. Increased attention has likely resulted from recognition of the prevalence and potential seriousness of nocebo effects in clinical contexts. It is estimated that up to 97% of reported pharmaceutical side effects are *not* caused by the drug itself but rather by nocebo effects and symptom misattribution (1). These nocebo effects can cause symptoms serious enough to require hospitalization and medical intervention (2). As a result of the increased recognition of the importance of nocebo effects, experimental research seeking to understand how nocebo effects are formed has also intensified.

As the literature on nocebo effects has expanded, additional methodological decision points arise for researchers in this area. In this article, we discuss a set of methodological issues that result from emerging approaches to studying nocebo effects, including distinctions between designs for standard nocebo effects versus nocebo side effects, the information provided by selecting different types of control groups in experimental designs, and the distinction between "true" nocebo effects and symptom misattribution. This discussion will focus on between-subjects designs, using examples of nocebo effects that are generated by verbal/written transmission of information. For each issue, we compare the different methodologies and seek to highlight the strengths and limitations of these approaches.

### NOCEBO EFFECTS AND NOCEBO SIDE EFFECTS DESIGNS: DEFINITIONS

Definitions of the nocebo effect typically focus on the role of negative expectations in producing aversive outcomes [e.g., Refs. (3–7)]. In contrast, Faasse (8) extends this definition to incorporate past experience and other aspects of the treatment context. We would refine this further to define nocebo effects as unpleasant or adverse outcomes triggered by the treatment *context*, beyond any inherent pharmacological effects of the treatment itself. These nocebo effects are scientifically measurable effects caused by psychological processes including negative expectations, classical conditioning, and observational learning (9).

Although not always differentiated, there are two variants of nocebo effects that researchers examine: primary nocebo effects and nocebo side effects. Experimental designs that study *primary nocebo effects* focus on nocebo effects as the central "action" or primary negative outcome of a treatment/medical condition. Such outcomes have been described by Hahn (5, 6) as nocebo effects, where he distinguished these from what he called "placebo side effects," whereby a treatment intended primarily for benefit can cause harmful outcomes. As an example of a primary nocebo effect, in which the potential adverse outcome is framed as the primary, or focal, effect of the treatment, Benedetti and colleagues (10) informed postoperative patients that a saline injection would increase pain, resulting in elevated pain. This design contrasts with that of *nocebo side effect* experiments, in which participants are informed of the main (typically beneficial) outcome of a treatment/condition and also of unpleasant outcomes that are ancillary to this main outcome. This conceptualization of nocebo side effects is similar to that of Barsky and colleagues (3), who describe a phenomenon occurring when placebo treatment results in unpleasant side effects. For example, Neukirch and Colagiuri (11) gave participants experiencing sleep difficulty an inert treatment to improve sleep, with or without the suggestion that it created a specific side effect (increased or decreased appetite). The results revealed that participants given the side-effect warning reported more changes in appetite, in the expected direction, than those not given the side-effect warning.

Notably, recent evidence suggests that primary nocebo and nocebo side effect manipulations do not produce equivalent results. Caplandies and colleagues (12) used an experimental design that compared the two nocebo effects and found that when headache pain was described as the *primary* effect of sham transcranial direct current stimulation (tDCS), headaches were significantly more likely to occur than when headache was described as a side effect of tDCS. These results indicate that nocebo effect instructions can produce different outcomes depending on whether the adverse effect is described as the primary effect or a side effect of treatment.

### NOCEBO EFFECTS AND NOCEBO SIDE EFFECTS: EXPERIMENTAL DESIGN SELECTION

Making a clear distinction between a primary nocebo effect and a nocebo side effect design is valuable for several reasons. First, these two designs correspond to different clinical care circumstances. Primary nocebo effect designs relate to situations in which negative outcomes can occur without a concomitant benefit, such as when patients are given a warning about potential pain due to a medical condition (e.g., broken leg) or disease course (e.g., rheumatoid arthritis), which leads to increased negative expectations. In contrast, nocebo side effect designs are analogues of situations when there is a beneficial treatment that may also cause adverse effects. These designs have greater correspondence to medical treatments where treatment descriptions and instructions (including informed consent protocols, physician warnings, drug labels, and direct-to-consumer advertising) suggest that adverse outcomes can accompany a primary treatment benefit. Consequently, in the study of nocebo effects, researchers should decide between primary nocebo effect or nocebo side effect designs based on the applied circumstance they wish to understand.

A second reason for distinguishing between these two types of designs is that the pivotal mediating and moderating variables may diverge. Consider the case of moderating variables. When examining nocebo side effects, but not primary nocebo effects, variables such as number of side effects listed or their order could be critical moderators to assess. Additionally, it is possible that the contribution of different psychological mediators varies with these different designs. For example, primary nocebo effect designs emphasize adverse outcomes, whereas nocebo side effect designs emphasize the positive treatment effect as well as co-occurring adverse outcomes; because of this greater emphasis on negative outcomes, individuals in primary nocebo effect designs may devote more higher-order cognitive processes to thinking about the negative symptoms than individuals in the nocebo side effect designs (13). It should also be noted that nocebo side effects and placebo effects can, theoretically, co-exist within the same person, and the experience of benefits and unpleasant side effects may influence one another.

### NOCEBO EFFECTS AND NOCEBO SIDE EFFECTS: CONTROL CONDITIONS

A second issue to consider is the appropriate control conditions for these designs. Here, we differentiate between several options.

*No-treatment control group*. It is widely accepted that when investigating the *placebo* effect, a placebo-treated group must be compared with an untreated control group in order to detect "true" rather than "perceived" placebo effects (14). The inclusion of an untreated comparison group allows researchers to distinguish between improvements *caused* by placebo administration and other factors that can result in apparent improvements, including natural history of the condition, Hawthorne effects, and regression to the mean. However, the importance of this distinction between "true" and "perceived" *nocebo* effects—and the inclusion of a no-treatment control group—is less well recognised (15).

We argue that no-treatment control groups are important for both primary nocebo and nocebo side effect designs. A simple laboratory procedure to test for true *nocebo* effects (both primary and side effects) would involve two conditions. In one condition, participants would take a sham treatment or undergo a sham procedure described as having either a primary or secondary (i.e., side effects) unpleasant outcome. In the other condition, participants would not get the nocebo treatment or procedure (i.e., they form the no-treatment control group), but undertake all other study components. An increase in the rate of unpleasant outcomes in the "treated" group would be indicative of a true nocebo effect, because comparison with the untreated control rules out natural history, Hawthorne effects, and regression to the mean as alternative explanations.

*Sham treatment/no-information control group*. A second control condition that may be employed is one in which participants are given a treatment or procedure, but are not provided with information about possible unpleasant outcomes (primary or side effect; see **Table 1**). Thus, when a *sham treatment/no-information control group* is compared to a *sham treatment/negative information group*, participants in both conditions engage in the treatment activity (12). This procedure keeps constant factors such as naturally occurring symptoms, treatment administration, and engagement in other experimental activities across the conditions. This control can be used for either primary nocebo or nocebo side effect studies and is useful for identifying the specific effect of the information provided. Some limitations of this control condition, however, are 1) because treatments or procedures are given without corresponding information about possible side effects or adverse outcomes, it has less ecological validity than the no-treatment control condition described above, and 2) the control group does not provide information on the generation of "true" nocebo effects, i.e., the magnitude of the overall nocebo effect compared to individuals with no treatment.
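The contrasts that these control conditions support can be made concrete with a minimal sketch. The symptom counts below are hypothetical, invented purely for illustration, and the class and function names are our own rather than taken from any cited study. The sketch is descriptive only (no inferential test): comparing a sham treatment/negative information group with a no-treatment control gives the overall ("true") nocebo effect, whereas comparing it with a sham treatment/no-information control isolates the contribution of the information provided.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """One study arm; all counts used below are hypothetical."""
    name: str
    n: int          # participants randomized to the arm
    n_symptom: int  # participants reporting the target symptom

    @property
    def rate(self) -> float:
        return self.n_symptom / self.n


def risk_difference(a: Arm, b: Arm) -> float:
    """Difference in symptom rates between two arms (a minus b)."""
    return a.rate - b.rate


# Hypothetical three-arm design (illustrative numbers only).
no_treatment = Arm("no treatment", n=40, n_symptom=4)
sham_no_info = Arm("sham treatment/no information", n=40, n_symptom=8)
sham_neg_info = Arm("sham treatment/negative information", n=40, n_symptom=16)

# Overall nocebo effect: rules out natural history, Hawthorne effects,
# and regression to the mean by comparing with untreated participants.
overall_effect = risk_difference(sham_neg_info, no_treatment)

# Information-specific effect: treatment administration is held constant.
information_effect = risk_difference(sham_neg_info, sham_no_info)

# Remaining part attributable to receiving the sham treatment at all.
administration_effect = risk_difference(sham_no_info, no_treatment)

print(f"overall: {overall_effect:.2f}")
print(f"information: {information_effect:.2f}")
print(f"administration: {administration_effect:.2f}")
```

With all three arms in one design, the overall effect is the sum of the information-specific and administration-related parts, which is one way to see why the two control conditions are complementary.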

These two types of control conditions are not exhaustive. For example, as indicated in **Table 1**, other variations could include giving participants a treatment with "standard" or usual care information in order to test the effect of other information provision strategies, for example, standard information versus positive framing of information about adverse outcomes, which focuses on the proportion of patients who will *not* experience these unpleasant outcomes (16, 17). Of most importance, however, is that researchers consider carefully their hypotheses and relevance to clinical practice, consider which factor(s) differ between their chosen conditions, and utilize appropriate control conditions in experimental designs that will allow them to most appropriately test their research question. The control conditions described here are complementary rather than mutually exclusive, and we recommend that researchers consider including a no-treatment control as well as (for example) an information-related control condition. Finally, although this discussion has focused on designs using sham treatments, it is important to note that nocebo effects occur in response to active medical treatments too. In studies examining nocebo effects from active treatments, a no-treatment control condition is less likely to be helpful than an *active treatment/no-information control group*, which controls for the physiological effect of the treatment itself.

### NOCEBO EFFECTS AND NOCEBO SIDE EFFECTS: MISATTRIBUTION

A third consideration in experimental designs to study nocebo effects and nocebo side effects is the role of symptom misattribution. In contrast to "true" nocebo effects, "perceived" nocebo effects are those symptoms that would have occurred regardless of treatment administration but are (mistakenly) attributed to the treatment. Misattribution is particularly relevant to the study of nocebo *side* effects, such as where a patient is experiencing regular headaches, starts a new medication, and subsequently misattributes these headaches to the treatment. Although such misattributed symptoms are undoubtedly important in how patients view their treatments and influence their health care decisions, there are likely to be different processes underlying the development of misattributed and true nocebo side effects. When designing experimental studies to investigate true nocebo side effects, assessing baseline symptoms that may be subject to later misattribution and encouraging participants to report *all* symptoms experienced regardless of perceived cause may help to assess or reduce the influence of misattribution on results. If researchers wish to explicitly study misattribution, participants who receive an experimental treatment can be asked whether they believe their symptoms were *caused* by the treatment they received.

#### SUMMARY

Nocebo effects and nocebo side effects play an important role in the outcomes of medical care. Heightened recognition of their importance and increased experimental research seeking to understand how nocebo effects are formed have raised the need for consideration of the methodological decisions that researchers face in studying the nocebo effect. These decisions include whether to examine primary nocebo effects or nocebo side effects, appropriate control conditions, and differentiating true and perceived nocebo effects. Future research would benefit from careful selection of study design and assessment of nocebo outcomes. Such steps will contribute to generating a deeper understanding of how both primary nocebo effects and nocebo side effects develop.

### AUTHOR CONTRIBUTIONS

AG and KF contributed to the conception of the work and wrote the first draft of the manuscript. All authors critically revised the manuscript for important intellectual content and provided approval for publication of the content.

### FUNDING

KF is supported by an Australian Research Council Discovery Early Career Researcher Award (DE180100471); BC is supported by an Australian Research Council Discovery Early Career Researcher Award (DE160100864). The funder played no role in the conceptualization or preparation of this work. The researchers are independent from the funder.

### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Faasse, Helfer, Barnes, Colagiuri and Geers. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Effects of Open- and Closed-Label Nocebo and Placebo Suggestions on Itch and Itch Expectations

*Stefanie H. Meeuwis1,2\*, Henriët van Middendorp1,2, Antoinette I.M. van Laarhoven1,2,3, Dieuwke S. Veldhuijzen1,2, Adriana P.M. Lavrijsen4 and Andrea W.M. Evers1,2,3*

*1 Health, Medical and Neuropsychology Unit, Faculty of Social and Behavioural Sciences, Institute of Psychology, Leiden University, Leiden, Netherlands, 2 Leiden Institute for Brain and Cognition, Leiden University Medical Center, Leiden, Netherlands, 3 Department of Psychiatry, Leiden University Medical Center, Leiden, Netherlands, 4 Department of Dermatology, Leiden University Medical Center, Leiden, Netherlands*

#### *Edited by:*

*Seetal Dodd, Barwon Health, Australia*

#### *Reviewed by:*

*Karin Meissner, Ludwig Maximilian University of Munich, Germany Meike C. Shedden-Mora, University Medical Center Hamburg-Eppendorf, Germany*

*\*Correspondence:*

*Stefanie H. Meeuwis s.h.meeuwis@fsw.leidenuniv.nl*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 20 December 2018 Accepted: 03 June 2019 Published: 21 June 2019*

#### *Citation:*

*Meeuwis SH, van Middendorp H, van Laarhoven AIM, Veldhuijzen DS, Lavrijsen APM and Evers AWM (2019) Effects of Open- and Closed-Label Nocebo and Placebo Suggestions on Itch and Itch Expectations. Front. Psychiatry 10:436. doi: 10.3389/fpsyt.2019.00436*

Placebo and nocebo effects have been shown to influence subjective symptoms such as itch. These effects can be induced by influencing outcome expectations through, for example, combining the application of an inert substance (e.g., a cream) with verbal suggestions on the anticipated effects of this substance. Interestingly, placebo effects also occur when it is known that a treatment is inert (i.e., open-label placebo). However, no study to date has examined the efficacy of negative and positive verbal suggestions under similar open-label and closed-label (i.e., concealed placebo/nocebo) conditions in itch. A randomized controlled between-subjects study design was applied in which healthy volunteers (*n* = 92) were randomized to 1) an open-label positive verbal suggestion group, 2) a closed-label positive verbal suggestion group, 3) an open-label negative verbal suggestion group, or 4) a closed-label negative verbal suggestion group. Verbal suggestions were made regarding the topical application of an inert substance. Itch was evoked experimentally by histamine iontophoresis at baseline and again following suggestions. Itch expectations, self-reported itch during and following iontophoresis, and skin response parameters were measured. Positive suggestions were found to result in significantly lower expected itch than were negative suggestions in both open- and closed-label conditions. No effects of the suggestions on itch during iontophoresis were found, but significantly lower itch was reported in the 4 min following iontophoresis in the (combined open- and closed-label) positive compared with negative verbal suggestion groups. In addition, a smaller increase in skin temperature was found in the positive compared with negative suggestion groups. The findings illustrate a potential role of (open- and closed-label) placebo for optimizing expectations and treatment effects for itch in clinical practice.

Clinical Trial Registration: Netherlands Trial Register, trial number: NTR6530.

Keywords: placebo, nocebo, itch, suggestion, pruritus

### INTRODUCTION

Itch is the most common somatosensory symptom in dermatological conditions. It is a hallmark symptom of atopic eczema (1), and its prevalence in psoriasis is high (2). Moreover, itch is a common symptom of various other disorders, including kidney failure, liver disease, cancer, allergy, and diabetes mellitus (3–5). Due to its high prevalence (approximately 8% of the general population and over 50% of dermatological patients), the burden of itch and its impact on society are high (6, 7). Often, patients report significantly lowered quality of life, increased depressive and anxious symptoms, and sleep disturbances due to chronic itch (8). While current treatments aim to suppress itch through pharmacological interventions, limited effects and significant side effects are often reported (3, 9). As such, it is important to identify factors that contribute to treatment efficacy.

One of the factors that may be especially relevant for treatment outcomes is the placebo effect. Placebo effects are defined as beneficial effects of otherwise pharmacologically inert substances (10, 11) and have been studied in a variety of medical conditions and symptoms, including itch and pain (12–14). Multiple pathways through which these effects can be elicited have been identified, including associative learning processes, social learning, or instructional learning (12, 15–17). Within these pathways, expectancy is a key component. To illustrate, a positive expectation may be elicited through past experiences with the beneficial effects of a certain type of medication (associative learning), through observation of treatment efficacy in others (social learning), or through receiving positive (verbal) information about a treatment (instructional learning) (17). In turn, this positive expectation can then result in psychoneurobiological changes and symptom reduction (18, 19). On the other hand, when expectations regarding a treatment outcome are low or negative outcomes are expected, symptoms may worsen or the occurrence of treatment side effects may increase, known as the nocebo effect (12, 20).

The current literature indicates that at least 30% of itch reduction in clinical practice might be attributable to placebo effects (21). Placebo and nocebo effects can be experimentally induced for itch by changing expectations through verbal suggestions regarding inert treatments or through the use of associative learning mechanisms (22–28). However, not all studies confirm these findings (28, 29). In addition, there is some evidence that it may be necessary to combine multiple placebo induction methods (e.g., associative learning and positive suggestions) and that a single induction method may not be sufficient to elicit significant placebo effects (22). Hence, further study of the circumstances under which placebo effects may be elicited for itch is relevant.

Most studies on placebo effects take a traditional approach, in which patients or healthy individuals are told that a pharmacologically effective substance (e.g., a pill or cream) is given, whereas in reality, the substance is pharmacologically inert (30, 31). While this concealing or deceptive approach is useful for studying the underlying mechanisms of placebo effects in general, it may become problematic when it comes to utilizing these effects in clinical practice, where concealment or deception regarding the treatment provided raises ethical issues. For a long time, it was believed that this would prevent strategic use of the placebo effect in clinical practice (30). In recent years, however, studies have demonstrated that placebo effects can also occur when participants are explicitly told that, although a given substance is inert, placebo effects may still help in alleviating symptoms. These so-called open-label placebo effects have been found to significantly reduce symptoms of irritable bowel syndrome, depression, attention deficit hyperactivity disorder, chronic low back pain, and allergic rhinitis (31–39). Most of these studies induce open-label placebo effects through a combination of an attribute (e.g., an inert pill) that may trigger previously learned associations between medicine and symptom reduction in general, and a scripted briefing in which the positive effects of placebo pills are emphasized (a suggestive framework) (31–34, 36–38). Findings on whether these effects can be attributed to the provided pill or the provided explanation alone are contradictory (35, 39, 40).

In view of the previous findings, further research is needed to demonstrate the efficacy of both open-label and closed-label (i.e., concealed) placebo effects for itch. It is not yet known whether effect sizes of open-label and closed-label placebo effects are comparable. Moreover, no study to date has investigated whether nocebo effects can be induced under both closed-label and open-label conditions for itch. To this end, we investigated in the current study whether negative or positive outcome expectations, induced by a suggestive framework (negative and positive verbal suggestions, provided in an open-label and closed-label context) combined with an attribute (an inert tonic), can influence self-reported itch during an experimental itch induction by histamine in healthy volunteers. We primarily tested the effects of the positive and negative suggestions on itch by combining open- and closed-label groups. Secondarily, we tested these effects for open-label and closed-label contexts separately to see whether these effects were comparable, and we investigated the effects of suggestions on other markers of the response to this test, for example, the physical skin response to histamine. We expected a decrease in itch following positive verbal suggestions compared with an increase in itch following negative verbal suggestions for both the open-label and closed-label conditions.

### MATERIALS AND METHODS

The study was approved by the Medical Ethical Committee at the Leiden University Medical Center, the Netherlands (NL58792.058.16), and registered in the Dutch Trial Register (NTR6530). The study was performed in accordance with the Declaration of Helsinki (41). All participants provided written informed consent.

### Participants

Healthy male and female participants were recruited through advertisements at Leiden University and through social media (e.g., Facebook). Inclusion criteria consisted of an age between 18 and 35 years and a good understanding of the Dutch written and spoken language. Interested volunteers were excluded in case of self-reported severe somatic or psychological morbidity that could interfere with the participant's safety or study protocol [e.g., heart or lung diseases, histamine intolerance, or Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR) psychiatric diagnoses]; current chronic itch or pain complaints; current use of analgesics, anti-inflammatory medication, antihistamines, or antibiotics; and (suspected) pregnancy. Participants were asked to refrain from heavy meals, caffeine, and smoking for 2 h, from exercise for 12 h, and from alcohol and drugs for 24 h prior to the sessions to prevent potential influences on study outcomes. Adherence to these lifestyle guidelines, as well as the exclusion criteria, was verified by interview at the start of each session.

#### Study Design

A between-subjects, single-blinded, randomized controlled experimental trial design was applied. Eligible participants were randomized to (I) an open-label positive verbal suggestion (VS) group, (II) a closed-label positive VS group, (III) an open-label negative VS group, or (IV) a closed-label negative VS group. The randomization sequence was acquired using an online random number generator (www.random.org, Dublin, Ireland). Allocation was not concealed from experimenters. Participants were invited for a baseline and an experimental session, which were timed 1 week apart. An overview of the study design and measurement schedule is provided in **Figure 1**.
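For readers who want to reproduce a comparable allocation scheme, the following is a minimal sketch of how a randomization sequence over the four groups could be generated. It is not the procedure used in the study, which obtained its sequence from www.random.org. The equal group sizes, the seed, and the function name `allocation_sequence` are assumptions made for illustration; the sample size of 92 is taken from the abstract.

```python
import random

GROUPS = [
    "open-label positive VS",    # group I
    "closed-label positive VS",  # group II
    "open-label negative VS",    # group III
    "closed-label negative VS",  # group IV
]


def allocation_sequence(n_participants, groups, seed):
    """Return a shuffled allocation list with (near-)equal group sizes.

    Assumption: balanced allocation. The source reports only that the
    sequence came from an online random number generator.
    """
    rng = random.Random(seed)  # seeded so the sequence can be regenerated
    per_group, remainder = divmod(n_participants, len(groups))
    sequence = [g for g in groups for _ in range(per_group)]
    # Distribute any leftover slots over randomly chosen groups.
    sequence += rng.sample(groups, remainder)
    rng.shuffle(sequence)
    return sequence


sequence = allocation_sequence(92, GROUPS, seed=2019)
for participant_id, group in enumerate(sequence[:5], start=1):
    print(participant_id, group)
```

Fixing the seed makes the sequence auditable after the fact. In a trial with concealed allocation, the list would be generated and held by someone not involved in enrolment.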

#### Measures and Materials

#### Verbal Suggestions

Before the study, participants were informed that the study aimed to investigate individual differences in the sensitivity to itch and the role of psychological factors in explaining these differences. They were informed that the itch induction method would elicit a response similar to a mosquito bite (e.g., that their skin may become red and swollen). During the experimental session, participants were told that, prior to itch induction, a tonic would be applied that influences sensitivity to itch. In reality, this tonic was a pink-colored skin disinfectant (Orphi Pharma B.V., Dordrecht, the Netherlands). The itch induction during the baseline session was used as a reference point for the suggestions. In the positive VS groups (I and II), the following suggestion was given: "This tonic has an itch-reducing effect and will make the skin less sensitive to itch. From previous research we know that the application of this tonic will reduce itch for most people, meaning around 95 percent of the studied people. As such, we expect that you will experience less itch, compared to the previous test." Participants in the negative VS groups (III and IV) were given the same information, but negative words were used instead of positive words (e.g., "itch-increasing" rather than "itch-reducing").

When participants were allocated to one of the two open-label groups, additional instructions were given. For the positive VS group, these were: "I just told you that the tonic reduces itch. In fact, the tonic is a placebo. From research we know that the expectation that a remedy reduces itch will really cause people to experience less itch. This is caused by different processes, for example itch-reducing substances that are released in the brain. These substances are also released when people know that they receive a placebo. So, even though I told you this, you will likely experience less itch during the test." For the open-label negative VS groups, negative words were again used in the instructions instead of positive words. During application of the tonic, the provided suggestions and, if applicable, open-label instructions were briefly repeated in a single sentence.

#### Expected Itch

Expected itch in response to histamine iontophoresis was assessed on a Numeric Rating Scale (NRS) ranging from 0 ("no itch") to 10 ("worst itch imaginable"). Participants rated the amount of itch they expected to experience during iontophoresis twice: once at the start of the baseline session and once during the experimental session, following the verbal suggestions but prior to histamine iontophoresis.

#### Histamine Iontophoresis

Histamine was applied to the skin by transdermal iontophoresis (Chattanooga Group, Hixson, TN, USA). This method has been previously validated and reliably induces itch in healthy populations (22, 28, 29, 35). A 0.6% diphosphate (equivalent to 1% histamine dihydrochloride) histamine solution was prepared in distilled water with propylene glycol and hypromellose 4,000 mPa by the local pharmacy. In preparation of iontophoresis, the skin was cleaned with either a transparent disinfectant (alcohol 70%; baseline session) or a pink-colored disinfectant (0.5% chlorhexidine in alcohol 70%, with rhodamine; experimental session) suggested to be itch-reducing or itch-increasing, depending on placebo or nocebo condition. A 2.5-cc electrode (Iogel, Iomed, DJO Global, Hannover, Germany; active surface: 11.7 cm²) was treated with the histamine solution and applied to the volar side of the non-dominant forearm. A reference electrode was placed on the volar side of the upper arm. The electrode nodes were spaced approximately 10 cm apart. Histamine was applied to the skin by iontophoresis with a current level set at 0.4 mA for 2.5 min, following which the electrodes and any residual histamine were removed from the skin.

#### Self-Reported Itch

Self-reported itch in response to histamine iontophoresis was assessed using the same 0–10 NRS as described under Expected Itch. During iontophoresis, participants continuously rated itch using a vertical bar slide depicting the NRS. Scores were sampled at a 10-Hz rate using E-Prime 2.0 (42). Directly following iontophoresis, mean itch was verbally assessed by asking participants how much itch (on a 0–10 scale) they experienced in general during the test. From 1 to 4 min after iontophoresis, participants were asked to rate self-reported itch every 30 s on the bar slide as a follow-up period. The primary study outcome was the area under the curve (AUC) of itch during the 2.5 min of iontophoresis. Secondary outcomes were maximum itch reported during the 2.5 min of iontophoresis, verbally assessed mean itch, and AUC itch during the 4-min follow-up. AUC of itch and maximum itch during iontophoresis were computed using MATLAB Release 2012b (The MathWorks, Inc., Natick, MA, USA).
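As an illustration of the primary and secondary outcomes described above, the sketch below computes the AUC and the maximum of a continuously sampled itch trace. It is a minimal example and not the MATLAB code used in the study: the trapezoidal rule is an assumed integration method (the source does not state which was used), the itch trace is synthetic, and the function name `trapezoid_auc` is hypothetical. Only the 10-Hz sampling rate and the 2.5-min duration come from the text.

```python
def trapezoid_auc(ratings, sampling_rate_hz):
    """Area under a uniformly sampled rating curve (trapezoidal rule).

    The result is in NRS points x seconds.
    """
    dt = 1.0 / sampling_rate_hz
    return sum((a + b) / 2.0 * dt for a, b in zip(ratings, ratings[1:]))


SAMPLING_RATE_HZ = 10  # sampling rate reported in the text
DURATION_S = 150       # 2.5 min of iontophoresis

# Synthetic trace (hypothetical): itch rises linearly from 0 to 6 on the
# 0-10 NRS over the first 60 s and then stays at 6 until the end.
n_samples = SAMPLING_RATE_HZ * DURATION_S + 1
ratings = [
    min(6.0, 6.0 * (i / SAMPLING_RATE_HZ) / 60.0) for i in range(n_samples)
]

auc_itch = trapezoid_auc(ratings, SAMPLING_RATE_HZ)
max_itch = max(ratings)
time_averaged_itch = auc_itch / DURATION_S  # AUC rescaled to the 0-10 NRS

print(f"AUC: {auc_itch:.1f} NRS x s")
print(f"maximum itch: {max_itch:.1f}")
print(f"time-averaged itch: {time_averaged_itch:.2f}")
```

Dividing the AUC by the duration expresses it on the original 0–10 scale, which can make AUC values from the 2.5-min induction and the longer follow-up period easier to compare.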

#### Subjective Skin Response

Participants filled in the Sensitive Scale-10 (SS-10) (43) to measure their subjective skin response. The SS-10 contains 10 items, of which 9 items assessed specific skin symptoms (e.g., itch, pain, general discomfort, and heat sensations). Participants were asked to rate the intensity with which they had experienced these symptoms over the past 3 days as a baseline measurement, as well as during histamine iontophoresis. Symptoms were rated on NRS ranging from 0 ("zero intensity") to 10 ("intolerable intensity"). An additional symptom ("redness of the skin") was assessed on a 0–10 NRS (43). Total scores were calculated by summing all items and ranged from 0 to 100. Cronbach's alpha ranged from .83 to .87 in the current study for baseline and post-iontophoresis assessments of the SS-10.

#### Physical Skin Response

Wheal size and flare areas following histamine application were measured after the 4-min follow-up period after the iontophoresis test. The size of the skin response was measured by drawing the outline of the redness and thickening of the skin onto a 1-cm² gridded transparent sheet with a 0.4-mm black permanent marker (Staedtler, Germany). The sheets were scanned and then retraced using ImageJ software (44), after which the wheal and flare area (in cm²) were calculated. In addition, skin temperature was measured following iontophoresis, using a handheld infrared digital thermometer (accuracy ± 2.0 °C, resolution 0.1 °C; BaseTech, Conrad Electronic Benelux B.V.). Measurements were taken with the thermometer held vertically and approximately 1 cm above the center of the histamine application area. To control for individual differences in skin temperature, a baseline measurement was taken prior to iontophoresis, with change from baseline temperature being used as the outcome measure.

#### Procedures

Prior to participation, written information regarding the study was provided, and volunteers were asked to fill in an online questionnaire assessing the study's inclusion and exclusion criteria. When volunteers were considered eligible for participation, they were invited to the lab for a 30-min baseline session and a 45-min experimental session, which were timed 1 week apart. At the start of the baseline session, the study procedures were explained, and written informed consent was provided. Next, personality questionnaires were administered, which are not further described here as they are unrelated to the aim of the current study. Baseline measurements of itch expectation and subjective skin responses in the past 3 days were taken. Next, the skin of the non-dominant forearm was disinfected, and electrodes were placed on the arm, after which the histamine test was conducted. Measurements of itch and physical skin responses were taken, followed by an assessment of subjective skin responses. After 1 week, the experimental session took place. First, the general procedure of the second session was explained, and verbal suggestions were given (the content of which depended on group allocation). Measurements of post-VS expected itch and of subjective skin responses in the past 3 days were taken. Next, the skin was cleaned using the pink disinfectant, during which the verbal suggestions were briefly repeated. Histamine iontophoresis was conducted; and measurements of itch, physical skin response, and subjective skin response were taken. At the end of the session, participants were asked to fill in a final questionnaire assessing the general amount of itch experienced during both baseline and post-VS iontophoresis and, for the open-label groups only, how believable and convincing participants thought the open-label rationale was (on a 0–10 NRS). Upon completion, they were debriefed on the true purpose of the study. 
For each session, participants received a compensation of €7.50.

#### Statistical Analyses

As input for the power calculation, we used the effect size of Cohen's *d* = 1.10 found by Napadow et al. (25), who investigated nocebo effects induced by an inert substance (i.e., a sham allergen solution) on itch. As the current study investigated not only nocebo effects following application of an inert substance but also placebo effects, and both were investigated under closed-label and open-label conditions, a more conservative effect size of *d* = 0.90 was used for computation of sample sizes for the separate open-label and closed-label analyses. A power calculation for an analysis of covariance (ANCOVA) using G\*Power 3.1 (45) indicated that 21 participants per group would be needed at a power of 1 − β = .80 and a significance level of α = .05 for the primary outcome of AUC itch during iontophoresis in the experimental session between the (separate closed-label or open-label) positive verbal suggestion group and the negative verbal suggestion group while controlling for AUC itch at baseline. A missing data rate of 10% was taken into account, resulting in a sample size of 23 participants in each group.
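The order of magnitude of this sample size can be checked with a rough sketch. The code below does not reproduce the G\*Power ANCOVA computation, which relies on the noncentral *F* distribution; it uses a normal approximation for a two-sided, two-group comparison with a small-sample correction term. The function name, the approximation itself, and the rule used to inflate for missing data (10% added, rounded to the nearest integer) are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided two-group comparison.

    Normal approximation plus a small-sample correction term
    (z_alpha^2 / 4). This is an illustrative assumption, not the exact
    noncentral-F computation G*Power performs for an ANCOVA.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    n = 2.0 * (z_alpha + z_power) ** 2 / d ** 2 + z_alpha ** 2 / 4.0
    return math.ceil(n)


n_required = n_per_group(d=0.90)         # 21 per group
# Assumed inflation rule: add 10% for missing data, round to nearest.
n_recruited = round(n_required * 1.10)   # 23 per group
```

Under these assumptions the approximation agrees with the 21 and 23 participants per group stated in the text, but agreement for one effect size does not imply the two methods match in general.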

Analyses were performed using SPSS 21.0 for Windows (IBM SPSS Inc., Chicago, IL, USA). Normal distribution of the variables and the assumptions of each analysis were checked prior to analysis. To test for group differences in demographics and baseline variables, chi-square tests and one-way analyses of variance (ANOVAs) were conducted. As the primary analysis, determined *a priori*, differences between the combined negative VS groups and positive VS groups in AUC itch during iontophoresis were assessed by a general linear model (GLM) ANCOVA, controlled for AUC itch during baseline iontophoresis. Similar analyses were conducted for the secondary outcome parameters: maximum itch during iontophoresis, mean itch (assessed verbally following iontophoresis), AUC itch during the 4-min follow-up, subjective skin response, and the physical skin response parameters.

Due to technical difficulties with the NRS bar slide and E-Prime, data of some participants (*n* = 6) were missing for the analyses of AUC itch and maximum itch during iontophoresis. Data of one participant were missing for the skin temperature measurements. For those variables that were non-normally distributed (i.e., AUC itch during follow-up), a change score was calculated by subtracting baseline scores from those measured post-VS (with zero indicating no change, negative scores indicating a decrease, and positive scores indicating an increase from baseline to post-VS). A GLM ANOVA was then performed to detect differences in change scores between groups. For expected itch following suggestions, an ANOVA was also conducted. For each AN(C)OVA, Cohen's *d* was calculated, and the following interpretations were used: small effect size 0.20, medium effect size 0.50, and large effect size 0.80 (46). When appropriate, covariate-adjusted means were used for calculation of Cohen's *d*. In addition, paired sample *t*-tests were conducted within each group to assess changes in each outcome parameter from the baseline to post-VS measurements. In order to examine whether the effects of verbal suggestions were similar regardless of participants knowing about the expectation induction, all analyses were repeated for the separate open-label and closed-label conditions. As the effects of suggestions were expected to be similar under open-label and closed-label contexts, differences between open- and closed-label groups were not tested statistically. Rather, effect sizes generated by the separate open-label and closed-label analyses were used for indirect comparisons.

To explore potential group differences in the strength of associations between the process measure of post-VS itch expectation and the outcome measures of itch and skin response, Pearson's *r* correlations were calculated within each group, and Cohen's *q* was computed as an effect size for the difference in strength of association, with the following categories of interpretation: no effect < 0.10, small effect size 0.10 to < 0.30, medium effect size 0.30 to < 0.50, and large effect size ≥ 0.50 (46). For AUC itch during follow-up, Spearman's rho was calculated. The open-label groups were compared on how believable and convincing participants thought the open-label rationale was by independent-samples *t*-tests. All analyses were conducted two sided with α = .05. For the secondary analyses (i.e., AN(C)OVAs and paired-sample *t*-tests for separate open-label and closed-label analyses), Bonferroni's correction for multiple comparisons was used, thus resulting in a significance level of α/2 = .025. To correct for alpha inflation due to multiple itch outcomes, an additional Bonferroni's correction was applied for the secondary itch outcomes, resulting in a significance level of α/3 = .017 for the combined-group analyses and (α/3)/2 = .008 for the separate-group analyses of the secondary itch outcomes. All values described in the Results section represent mean ± SD, unless stated otherwise.
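As an illustration of the two computations in this paragraph, the sketch below calculates Cohen's *q* as the difference between two Fisher *z*-transformed correlations and derives the Bonferroni-adjusted significance levels. The function names and the two example correlations are hypothetical; taking the absolute difference is an assumption, since the text does not state whether *q* was reported with a sign.

```python
import math


def cohens_q(r_1, r_2):
    """Difference between two correlations after Fisher's z-transform."""
    return abs(math.atanh(r_1) - math.atanh(r_2))


def interpret_q(q):
    """Categories as listed in the text: <0.10, 0.10-0.30, 0.30-0.50, >=0.50."""
    if q < 0.10:
        return "no effect"
    if q < 0.30:
        return "small"
    if q < 0.50:
        return "medium"
    return "large"


# Hypothetical within-group correlations, for illustration only.
q = cohens_q(0.50, 0.40)
category = interpret_q(q)

# Bonferroni-adjusted significance levels described in the text.
alpha = 0.05
alpha_separate = alpha / 2                  # .025, separate-group analyses
alpha_secondary_itch = alpha / 3            # about .017, secondary itch outcomes
alpha_secondary_separate = alpha / 3 / 2    # about .008, both corrections
```

With these example correlations, *q* is about 0.13 and falls in the "small" category.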

### RESULTS

#### Participants

A total of 138 potential participants expressed interest in the study, of whom 44 were not included (18 had somatic or psychological morbidity, 4 were non-proficient in the Dutch language, and 22 gave no response following screening). Two participants dropped out after the baseline session and were replaced. This resulted in the intended sample size of 92 participants (16 males, 17.4%; 76 females, 82.6%), whose age ranged from 18 to 30 (*M* = 21.8 ± 2.7). Participants were randomized into 1) the open-label positive VS group (*n* = 22), 2) the closed-label positive VS group (*n* = 23), 3) the open-label negative VS group (*n* = 23), or 4) the closed-label negative VS group (*n* = 24). The groups did not differ in demographic factors (all *p* ≥ .42), baseline itch expectation prior to iontophoresis (*p* = .13), baseline self-reported itch parameters (all *p* ≥ .58), and baseline subjective and physical skin condition (all *p* ≥ .12). An overview of the means and standard deviations of the baseline and outcome measures is presented in **Table 1** (combined open-label and closed-label groups) and in **Supplementary Table S1** (separate open-label and closed-label groups).

#### Expected Itch

A large-sized effect of verbal suggestions on expected itch was found; *F*(1,90) = 20.94, *p* < .001, Cohen's *d* = 0.96. As illustrated in **Figure 2**, expected itch following suggestions was significantly lower in the combined positive VS groups (*M* = 2.62 ± 1.82) compared with the combined negative VS groups (*M* = 4.41 ± 1.93).
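The reported effect size can be approximated from the summary statistics above. The sketch below computes Cohen's *d* with a pooled standard deviation, using the reported means and SDs and the combined group sizes implied by the randomization (22 + 23 = 45 positive VS, 23 + 24 = 47 negative VS). The pooled-SD formula, the function names, and the "negligible" label for values below 0.20 are assumptions; the text does not state which formula variant the authors applied.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_2 - mean_1) / math.sqrt(pooled_var)


def interpret_d(d):
    """Label |d| with the cut-offs named in the text (0.20, 0.50, 0.80)."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"  # assumed label; the text names only three categories


# Reported expected itch: positive VS 2.62 +/- 1.82 (n = 45),
# negative VS 4.41 +/- 1.93 (n = 47).
d = cohens_d(2.62, 1.82, 45, 4.41, 1.93, 47)
label = interpret_d(d)
```

This yields a value of roughly 0.95, in the "large" range and close to the reported 0.96; the small gap may reflect rounding of the published means and SDs.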

A secondary analysis showed a large-sized effect of suggestions in the open-label groups [*F*(1,43) = 15.84, *p* < .001, Cohen's *d* = 1.21] and a medium-sized effect in the closed-label groups [*F*(1,45) = 6.15, *p* = .017, Cohen's *d* = 0.74], both indicating significantly lower expected itch in the positive VS group (open label: *M* = 2.35 ± 1.88; closed label: *M* = 2.88 ± 1.77) than in the negative VS group (open label: *M* = 4.59 ± 1.91; closed label: *M* = 4.24 ± 1.99).

#### Primary Itch Measure: Area Under the Curve (AUC) of Itch during Histamine Iontophoresis

For the primary outcome AUC itch, a small-sized non-significant difference between the combined positive VS groups and the combined negative VS groups was found; *F*(1,83) = 1.75, *p* = .19, Cohen's *d* = 0.29. Secondary analyses for the separate open- and closed-label groups revealed similar findings (both *p* ≥ .31; see **Figure 3**). Within-group analyses of baseline to post-VS changes indicated that AUC itch decreased marginally in the combined positive VS groups [*t*(39) = 1.98, *p* = .055] but did not change in the combined negative VS groups [*t*(45) = −0.19, *p* = .85]. No within-group changes in AUC itch from baseline to post-VS were detected for the separate open- and closed-label groups (all *p* ≥ .12). An overview of within-group comparisons for AUC itch and other outcome measures is presented in **Table 2** (combined open-label and closed-label groups) and **Supplementary Table S2** (separate open-label and closed-label groups).

TABLE 1 | Means ± standard deviations for the combined open- and closed-label positive and the combined open- and closed-label negative verbal suggestion groups.

*<sup>A</sup>VS, verbal suggestions. <sup>B</sup>AUC, area under the curve. <sup>C</sup>Assessed verbally on a Numeric Rating Scale ranging from 0 to 10. <sup>D</sup>Group differences assessed by ANCOVA, controlled for baseline. Cohen's d was calculated with the estimated marginal means (controlled for baseline). <sup>E</sup>Calculated as post-VS measure − baseline measure (session 2 − session 1) and corrected for significant outliers. <sup>F</sup>As measured by an adjusted version of the Sensitive Scale-10 (43). <sup>G</sup>Calculated as post-iontophoresis temperature − pre-iontophoresis temperature.*

### Secondary Itch Measures during and Following Histamine Iontophoresis

#### Maximum Itch and Mean Itch during Iontophoresis

Findings for maximum itch during iontophoresis were similar to those of AUC itch, with no effects of suggestions for the combined as well as separate groups (all *p* ≥ .24) and a marginal decrease from baseline to post-VS exclusively for the combined positive VS groups [*t*(39) = 2.00, *p* = .053]. The combined positive VS groups showed a small-sized tendency to report lower (post-iontophoresis-assessed) mean itch (*M* = 2.83 ± 1.93) than did the combined negative VS groups (*M* = 3.19 ± 2.09); *F*(1,89) = 3.22, *p* = .076, Cohen's *d* = 0.38. No effects of verbal suggestions were found when open- and closed-label groups were separated, nor were changes from baseline to post-VS scores detected for any of the groups (all *p* ≥ .19).

#### AUC of Itch during Follow-Up Following Iontophoresis

A significant and medium-sized difference in the change scores of AUC for itch during the 4-min follow-up was found when open- and closed-label groups were combined [*F*(1,88) = 6.09, *p* = .016, Cohen's *d* = 0.52], with AUC itch during follow-up decreasing significantly in the combined positive VS groups (*M* = −3.73 ± 7.55) compared with the combined negative VS groups (*M* = 0.02 ± 6.88). A small-sized non-significant effect of verbal suggestions was found in the open-label groups; *F*(1,43) = 2.11, *p* = .15, Cohen's *d* = 0.43, and a marginal and medium-sized effect in the closed-label groups, in the same direction as for the combined groups; *F*(1,43) = 4.94, *p* = .032, Cohen's *d* = 0.67. A significant change from baseline to post-VS in AUC itch during follow-up was demonstrated for the combined positive VS groups [*t*(42) = 3.24, *p* = .002]. In the combined negative VS groups, however, no change was detected [*t*(46) = −0.02, *p* = .98]. Separating open- and closed-label groups revealed a non-significant change within the open-label positive VS group [*t*(21) = 1.87, *p* = .075] and a significant change within the closed-label positive VS group [*t*(20) = 3.14, *p* = .005].

### Skin Response

#### Subjective Skin Response (SS-10)

For subjective skin response following the histamine test, no significant difference was found between the combined positive and negative VS groups, nor between the separate open- and closed-label positive and negative VS groups (all *p* ≥ .12). A significant decrease in subjective skin response from baseline to post-VS was demonstrated in the combined positive VS groups [*t*(43) = 2.59, *p* = .013], but not in the negative VS groups [*t*(46) = 1.61, *p* = .12]. When analyses were conducted for separate open- and closed-label groups, a significant decrease was demonstrated only for the closed-label positive VS group; *t*(22) = 3.75, *p* < .001.

#### Physical Skin Response

No effects of verbal suggestions on wheal or flare areas were found for either the combined or separate open- and closed-label groups (all *p* ≥ .23). Regarding skin temperature, the combined positive VS groups showed a medium-sized lower increase in skin temperature from before to after iontophoresis (*M* = 1.83 ± 1.15) than did the combined negative VS groups (*M* = 2.34 ± 1.62); *F*(1,87) = 5.84, *p* = .018, Cohen's *d* = 0.52. In the same direction, marginally significant medium-sized effects of verbal suggestions on skin temperature increase were found in the open-label [*F*(1,41) = 3.01, *p* = .090, Cohen's *d* = 0.54] and closed-label groups [*F*(1,43) = 2.93, *p* = .094, Cohen's *d* = 0.52], respectively. Within-group comparisons for both combined and separate open- and closed-label positive and negative VS groups showed that skin temperature increased significantly from baseline to post-VS for the negative VS groups (all *p* ≤ .048), but not for the positive VS groups (all *p* ≥ .12).

#### Associations between Expected Itch and the Outcome Measures of Itch

In the combined open- and closed-label groups, expected itch following suggestions was significantly and positively associated with all itch measures during and following iontophoresis (all *r* ≥ .43, all *p* ≤ .01). Comparisons of the strength of the association between expected itch and the itch outcome measures showed small-sized to no differences in associative strength between the combined positive and combined negative VS groups (all Cohen's *q* ≤ 0.15). In the separate open-label and closed-label groups, findings were similar, with one exception: in the open-label positive VS group exclusively, itch expectations were not associated with mean itch and AUC of itch during follow-up (both *p* ≥ .11). An overview of Pearson's *r* and Spearman's ρ correlation coefficients can be found in **Table 3** (combined open- and closed-label groups) and **Supplementary Table S3** (separate open- and closed-label groups).

TABLE 2 | Within-group mean changes from baseline and separate paired sample *t*-test results for the combined open- and closed-label positive verbal suggestion groups and combined negative verbal suggestion groups.

*Mean change was calculated as post-verbal suggestions score − baseline score, with negative values indicating a decrease from baseline and positive scores indicating an increase from baseline. <sup>A</sup>AUC, area under the curve. <sup>B</sup>Assessed verbally on a Numeric Rating Scale ranging from 0 to 10. <sup>C</sup>As measured by an adjusted version of the Sensitive Scale-10 (43). <sup>D</sup>Calculated as post-iontophoresis temperature − pre-iontophoresis temperature.*

#### Open-Label Instruction Believability

Overall, participants in the open-label conditions rated the instructions as very clear (*M* = 7.90 ± 2.32). Ratings on how convincing the instructions had been were more ambiguous (*M* = 5.37 ± 2.46). In general, participants in the open-label groups believed that expectations are able to influence itch (*M* = 6.49 ± 1.97) but rated the extent to which their own itch experience was influenced by the application of the tonic as low (*M* = 3.81 ± 2.43). Groups did not differ in their ratings of the instructions (all *p* ≥ .21).

### DISCUSSION

The current study investigated whether positive and negative outcome expectations, induced by open-label and closed-label positive and negative verbal suggestions regarding an inert tonic, could influence self-reported itch in response to a histamine test. For the first time, open- and closed-label placebo effects for itch were investigated within a single study, including a comparison with open- and closed-label nocebo effects. It was demonstrated that both open-label and closed-label verbal suggestions were able to influence itch expectations. For the primary outcome of area under the curve for itch during histamine iontophoresis, a small-sized but non-significant effect of verbal suggestions was found. Participants in the combined open- and closed-label positive VS groups reported lower itch during an immediate follow-up period after iontophoresis compared to the negative VS groups. *Post hoc* tests indicated that this was mostly due to differences between positive and negative VS groups under closed-label conditions. In addition, a significantly smaller increase in skin temperature was observed in the combined positive VS groups compared with the negative VS groups, but no effects on other markers of the physical skin response to histamine were found. Overall, the current study shows that verbal suggestions regarding a topical application of a substance can influence expectations for itch, regardless of whether or not participants know about receiving suggestions, and provides limited evidence that these suggestions may influence itch and skin response in response to histamine.

TABLE 3 | Within-group Pearson's *r* and Spearman's rho correlations for the process measure of post-VS itch expectation and outcome measures of self-reported itch and skin response for the combined open- and closed-label group comparisons, with Cohen's *q* as estimate of the difference in effect size between groups.

*<sup>A</sup>AUC, area under the curve. <sup>B</sup>Assessed verbally on a Numeric Rating Scale ranging from 0 to 10. <sup>C</sup>Calculated using the non-parametric Spearman's rho. <sup>D</sup>As measured by an adjusted version of the Sensitive Scale-10 (43). <sup>E</sup>Calculated as post-iontophoresis temperature − pre-iontophoresis temperature. \*\*p < .01; \*\*\*p < .001.*

The finding that verbal suggestions were able to influence itch in the follow-up period after histamine iontophoresis is in line with a previous study that found medium-to-large-sized effects of positive suggestions on histamine-induced itch (24). While that particular study made use of a cream to help induce placebo effects, the current study used a pink-colored tonic. Potentially, the use of this particular attribute may have led to smaller effects in the current study, since a cream could be perceived as a common treatment for itch by some participants, could trigger previously learned associations, and could thus potentially elicit stronger effects overall (47). Moreover, negative verbal suggestions did not elicit negative expectations for itch in the current study and did not increase itch either during or following the histamine test, which is not in line with previous evidence for verbal suggestion-induced nocebo effects in itch (25, 26, 28). It should be noted though that these previous studies have induced nocebo effects through negative suggestions regarding the experimental itch induction method that was used, whereas the current study provided suggestions regarding the topical application of an attribute prior to itch induction. While this did allow for a direct comparison of positive and negative expectation induction, potentially, it may have influenced the credibility of the negative verbal suggestions as well. Topical application of, for example, a cream or tonic in a laboratory setting might be associated more easily with symptom reduction rather than worsening of symptoms. In comparison, information regarding an experimental itch induction method, though less clinically relevant, may provide a more neutral basis for induction of nocebo effects through suggestions.
Alternatively, although the baseline histamine application was valuable for participants as a comparison point for the second application, nocebo effects induced through negative verbal suggestions could have been influenced by participants being less anxious about the second histamine test, in comparison with the first test (since participants were generally unfamiliar with histamine iontophoresis prior to participating in the study). Future research may utilize a counterbalanced design to examine this in more detail. Likewise, more research is needed to investigate under which circumstances and through which attributes placebo and nocebo effects may be elicited for itch.

An effect of negative verbal suggestions on change in skin temperature due to histamine application was demonstrated. This finding is similar to previous work on placebo effects in autonomically controlled parameters and wheal responses (26, 48), a meta-analysis of clinical trials demonstrating placebo effects on physical outcome parameters controlled by the autonomic nervous system (49), and early studies on suggestions and hypnosis (50–52). Considering that either the outcome measure differed from these previous studies (i.e., skin temperature change rather than wheal size) or the expectation induction method was different (i.e., verbal suggestions given without hypnosis), caution is needed in interpreting these results. Moreover, the verbal suggestions in the current study did not influence wheal and flare areas to histamine, which is in line with most recent studies (24, 29, 35, 53, 54).

Our design allowed, for the first time, comparisons of effect sizes of positive and negative verbal suggestions under open- and closed-label conditions for itch. The findings demonstrate that positive verbal suggestions are able to significantly reduce expectations of itch under both open-label and closed-label conditions, with open-label verbal suggestions seemingly inducing larger expectancy effects. Overall, the effects of positive and negative verbal suggestions on itch were of approximately similar size under open-label and closed-label conditions. However, some differences between the conditions could be seen when examining the within-group changes from baseline. Closed-label suggestions appeared slightly more effective for itch, as illustrated by the significant within-group changes in itch during follow-up from baseline to post-suggestions under closed-label conditions. That open-label placebo treatment can significantly influence expectations and, potentially, symptoms of itch is in line with previous findings on other outcome parameters (31, 32, 34–39). It also provides further preliminary support for the notion that concealment of treatment is not necessary to elicit placebo responses, and that placebo mechanisms can potentially be utilized in clinical practice. Small differences between the open-label instructions of the current study and previous work need to be noted. Previous studies [e.g., Refs. 31–34, 40] began their open-label placebo instructions by indicating that the pill that was used was a placebo, prior to indicating the efficacy and mechanisms of these effects. The current study on the other hand began by introducing the tonic as an effective tool for itch reduction and explaining that it was a placebo afterwards, together with a rationale on why it would still be effective. Differences in the order in which this type of information is presented may impact the strength of open-label placebo and nocebo effects.
In addition, previous work has incorporated the concept of learning in the open-label instructions (i.e., by giving the example of Pavlov's dog). This aspect has been omitted here, as the current study investigates placebo responses evoked by conscious expectancy (i.e., verbal information) rather than associative learning mechanisms. Potentially, this may have influenced the efficacy of the open-label rationale. Some caution needs to be taken in interpreting the effects of negative verbal suggestions under the separate open-label and closed-label conditions, since neither type of negative verbal suggestions was able to increase expectations of itch.

Some strengths and limitations need to be taken into account. This is the first study that compares open- and closed-label positive and negative verbal suggestions to elicit placebo and nocebo effects in itch and other responses to histamine. Since the study was conducted single blinded, a reporting bias cannot be ruled out, as participants may have adjusted their answers to the experimenters' expectations. To minimize influences of response bias on assessments of expectations and itch, participants used a (computerized) bar slide to indicate these parameters. Future research might, however, consider using a double-blinded approach. The effect sizes found in the current study are rather small, which may be due to the itch stimulus being perceived as low by participants. As such, the study may have been underpowered to find small effects, which seems to be supported by the fact that more significant effects were found for the combined open- and closed-label groups than for the separate groups. Moreover, the design of the current study did not include a no-treatment group. This prevents an estimation of a true placebo or nocebo response, as itch may reduce from the first to second histamine test regardless of group allocation. Though habituation to the itch stimulus cannot be ruled out, its role is likely small, since the itch stimuli were relatively short and presented with 1 week in between. Alternatively, anxiety may have resulted in higher itch ratings during baseline. Including a no-treatment group to control for these reductions or utilizing a counterbalanced design could provide better estimates of a true placebo and nocebo response. Lastly, verbal suggestions were given regarding an inert tonic. While this approach may have worked for placebo induction, potentially, it may have been harder to elicit nocebo effects in this manner, as negative consequences regarding such a treatment method may be counterintuitive. To compare open-label and closed-label nocebo effects for itch, a different approach could be needed.
For example, future research could investigate whether nocebo effects can be induced when the effects of an inert substance on itch are introduced as side effects of this substance, as changing to such an introduction of negative effects may be more closely related to how negative effects would be experienced in clinical practice.

In conclusion, this study provides evidence for the first time that positive verbal suggestions can induce expectations for itch reduction under both open-label and closed-label conditions. Suggestions are able to reduce the amount of itch experienced after histamine iontophoresis under both open-label and closed-label conditions, with closed-label suggestions appearing more effective in reducing itch during follow-up. However, experienced itch during histamine iontophoresis was not influenced by suggestions. Future research may aim to investigate under which circumstances and with which type of attribute these suggestions could elicit effects for itch. Further demonstrating the efficacy of open-label placebo effects may help facilitate the application of these effects in clinical practice.

### AUTHOR CONTRIBUTIONS

SM, HvM, and AE designed the study and wrote the protocol. AvL and DV commented on the protocol. SM and AvL undertook the statistical analysis. SM and HvM wrote the first draft of the manuscript. AvL, DV, AL and AE commented on the manuscript.

### FUNDING

This study was funded by a European Research Council Consolidator Grant 2013 (ID: ERC-2013-CoG-617700\_EXPECT HEAL-TH, granted to AE). The funders had no role in study design, data collection or analysis, decision to publish, or writing this manuscript.

#### ACKNOWLEDGMENTS

The authors would like to thank Ir. Elio E. Sjak-Shie for his help with developing the NRS bar slide used for histamine iontophoresis and would like to thank Dr. Victor L. Knoop for his help with analyzing the bar slide data generated by E-Prime.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2019.00436/full#supplementary-material

#### REFERENCES

population-based cross-sectional study. *Acta Derm Venereol* (2011) 91(6):674–9. doi: 10.2340/00015555-1159


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Meeuwis, van Middendorp, van Laarhoven, Veldhuijzen, Lavrijsen and Evers. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A Direct Comparison of Placebo and Nocebo Effects on Visuospatial Attention: An Eye-Tracking Experiment

#### *Carina Höfler\*, Jonas Potthoff and Anne Schienle*

*Department of Clinical Psychology, University of Graz, Graz, Austria*

Background: Placebo and nocebo effects on visual attention are still poorly understood. This eye-tracking study directly compared effects of sham transcranial magnetic stimulation (sTMS) that was administered along with the verbal suggestion that the treatment would either increase (placebo) or decrease (nocebo) left-sided visual attention.

Method: Twenty women who had reported decreased attention (nocebo responders) and 20 women who had reported increased attention (placebo responders) following sTMS completed a visual search task with three visual load levels. The task was conducted once with and once without the placebo or the nocebo (sTMS). Left-sided fixations and reaction times for left-sided targets (in comparison with right-sided targets) were analyzed.

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany*

#### *Reviewed by:*

*Karl Bechter, University of Ulm, Germany Katrin Giel, University of Tübingen, Germany*

#### *\*Correspondence:*

*Carina Höfler carina.hoefler@uni-graz.at*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 21 December 2018 Accepted: 05 June 2019 Published: 21 June 2019*

#### *Citation:*

*Höfler C, Potthoff J and Schienle A (2019) A Direct Comparison of Placebo and Nocebo Effects on Visuospatial Attention: An Eye-Tracking Experiment. Front. Psychiatry 10:446. doi: 10.3389/fpsyt.2019.00446*

Results: Contrary to the verbal suggestion, the nocebo responders showed more left-sided fixations in the nocebo condition (compared with the control condition) and responded faster to left-sided targets in the high-load condition. The placebo had no effect on fixations and reaction times.

Conclusion: These results indicate a more beneficial effect of a nocebo compared with a placebo for the first time. Limits and possibilities of placebo and nocebo interventions are discussed.

Keywords: placebo, nocebo, eye-tracking, visuospatial attention, sham transcranial magnetic stimulation

## INTRODUCTION

Placebos and nocebos are physically or pharmacologically inert drugs, devices, or other types of sham interventions that are able to influence various clinical and physiological outcomes related to health (1). Whereas placebos have beneficial effects on specific conditions, nocebos are associated with the occurrence of negative symptoms, the worsening of symptoms, or the prevention of improvement. Both effects are considered to be 'context effects' because they are mediated by diverse mechanisms, such as learning, expectations, and social cognition (1).

It has been repeatedly shown that placebos and nocebos are able to change somatic and emotional processes. The most studied phenomena, "placebo analgesia" and "nocebo hyperalgesia," refer to the experience of either decreased or increased levels of pain after sham treatment. Other placebo/nocebo phenomena, for example, those related to perceptual processes, have been investigated less frequently and are therefore still poorly understood. A few studies have shown that placebos and nocebos are able to alter visual attention [e.g., Refs. (2–7)]. In those studies, the placebo treatments reduced visual avoidance of negative affective stimuli (4, 5, 8) and enhanced the performance on a visual search task (3). In contrast, the nocebos reduced the performance on a visual search task (3) and increased visual cortex activation during negative affective picture processing (6). Thus, there is converging evidence indicating that nocebo- or placebo-related expectations are able to influence the processing of visual inputs.

In one nocebo study on attention, a surprising effect was observed (7). Healthy individuals received sham transcranial magnetic stimulation (sTMS) along with the verbal suggestion that the treatment would elicit temporary neglect-like attention deficits in the left visual field (transitory "pseudo-neglect"). Contrary to this suggestion, in those participants who had reported experiencing attention deficits, the nocebo actually enhanced the number of left-sided fixations and facilitated target detection. These results point to a paradoxical yet positive aspect of nocebo treatment, where the suggestion of unilateral attention deficits actually provokes unilateral attention improvements (7).

This unexpected finding raises questions relating to an analogous situation: what would be the effects of a placebo sTMS combined with the verbal suggestion of a unilateral improvement in attention? In general, placebo/nocebo mechanisms are still poorly understood and controversial topics of discussion. While some findings indicate that placebos and nocebos are "evil twins" that produce effects that are counterparts of one common phenomenon [e.g., Refs. (9, 10)], others argue that placebo/nocebo responses are distinct phenomena with distinct neurobiological representations [e.g., Refs. (11, 12)].

In order to better understand both mechanisms, comparative studies, which include both placebo and nocebo conditions, are needed. In the present study, the effects of equivalent placebo and nocebo suggestions on visual-spatial attention were directly compared with each other. The study design was based on a previous nocebo study (7), which was extended by adding a placebo group. Participants completed a visual search task after being treated with a placebo or nocebo device: this device was an sTMS system, which was administered with the verbal instruction that the stimulation would either induce temporary left-sided attention improvements (placebo) or deficits (nocebo). Differences in left-sided fixation frequency, as well as reaction times for left-sided targets (in comparison with right-sided targets) during sham treatment, were compared between the placebo and the nocebo groups. Based on previous placebo studies on general visual attention [e.g., Ref. (3)], it was expected that the placebo would enhance left-sided attention as reflected by an increase in left-sided fixations and faster reactions to left-sided targets (in comparison with right-sided targets). This placebo-related improvement should be more pronounced than the previously observed increase in left-sided attention during nocebo treatment (7).

#### METHOD

#### Sample

A total of 40 right-handed healthy university students with a mean age of 21.06 years (SD = 2.58) were included in the study sample. Exclusion criteria were the presence of mental/neurological disorders, medication intake (except contraceptives), participation in a previous study with a real TMS system and attention deficits as assessed by a clinical interview, and an attention test (d2) (13). All participants had normal or corrected-to-normal vision. They were recruited *via* announcements at the university campus and gave written informed consent. The study was conducted in accordance with the Declaration of Helsinki and approved by the ethics committee of the university.

#### Design and Procedure

The subjects either participated in the placebo arm of the study (*n* = 20) or in the nocebo arm (*n* = 20). The placebo arm consisted of two counterbalanced conditions (with placebo vs. without placebo). The same was true for the nocebo arm (two counterbalanced conditions: with nocebo vs. without nocebo). The two conditions were separated by approximately 1 week. The design of the study is displayed in the **Supplementary Table S1**.

The placebo/nocebo device was an sTMS system, which was administered with the verbal suggestion that the stimulation would either induce temporary left-sided attention improvements (placebo) or deficits (nocebo). In fact, the sTMS system was a head massage tool, which induced symmetrical vibrations across the head (**Figure 1**) associated with a whirring sound. The system was presented as an innovative portable low-intensity repetitive TMS system for neurological rehabilitation. Given the increasing relevance of TMS in this field [especially in the treatment of visual neglect symptoms, e.g., Ref. (14)], this type of treatment was chosen. In order to increase the credibility of the cover story, the participants were provided with technical illustrations and a fictitious scientific article about the TMS system and its possible applications.

The sTMS system was administered for 4 min with verbal instructions either suggesting temporary left-sided attention improvements (placebo) or deficits (nocebo).

Placebo: "TMS can induce left-sided attention improvements … The visual exploration on the left side will be perceived as significantly easier and can be done faster…"

Nocebo: "TMS can induce left-sided [neglect-like] attention deficits…. The visual exploration on the left side will be perceived as significantly more challenging and exhausting…"

After the sTMS, the system was removed and the eye-tracking experiment with the visual search task started. Before and after the experiment, the affective state of the participants was assessed *via* the self-assessment manikin (1–9, 9 = happy, aroused, dominant) (15). At the end of the placebo/nocebo condition, the efficacy of the sTMS system was rated (0–100%), and the participants were asked to report experienced symptoms induced by the sTMS. At the end of the study, all participants were debriefed.

The participants of the nocebo arm were 20 "nocebo responders" [subsample of a previous study by Höfler et al. (7)], who had rated the sTMS stimulation as most effective. Effectiveness was defined as perceived change in visual attention (in the suggested direction) in percent (100% = very effective). For the placebo arm of the study, 20 women were selected from a bigger sample of 50 women ("placebo responders"). These responders did not differ from the nocebo responders in their effectiveness ratings for the sTMS (nocebo = 49.50%, SE = 3.18; placebo = 54.35%, SE = 4.03; *p* > .28). The two groups (placebo, nocebo) did not differ in mean age (nocebo = 21.00 years, SD = 2.41; placebo = 22.20 years, SD = 2.67; *p* > .14), average value of d2-attention (nocebo = 107.55, SE = 2.19; placebo = 109.65, SE = 1.48; *p* = .43), mean reaction time (nocebo = 11,209.73 ms, SE = 660.02; placebo = 10,535.56 ms, SE = 470.31; *p* = .41), and hit rate of targets in the visual search task (nocebo = 98.82%, SE = .29; placebo = 98.89%, SE = .28; *p* = .86).

We only selected "responders" for the present investigation because previous studies showed that the effects of placebos/nocebos are associated with the expected and experienced efficacy of the sham treatment [e.g., Refs. (7, 16, 17)]. Placebo/nocebo effects are mediated by diverse processes, including expectations, beliefs, and social cognition (1). In this sense, a positive/negative belief is a prerequisite for the placebo/nocebo effect to occur.

#### Visual Search Task

Participants performed a visual search task, the adapted version of the balloons test (18). The balloons task had three visual load conditions with either 50, 100, or 200 schematic black balloons depicted on a white background (**Figure 2**). Each balloon was represented by a black circle with an adjoining black vertical line originating from the bottom of the circle. The diameter of each circle was 11 mm; the line had a length of 7 mm. The balloons functioned as distractors, and one black circle without a line was the target. Participants were instructed to localize the target as fast as possible on the computer screen and confirm the detection *via* mouse click (the cursor was not visible during the search task). The mouse click was used to determine the reaction time. Subsequently, the participants were asked to point to the target to verify the correct localization. Prior to each visual load condition, a blank white screen was shown for 30 s. The sequence of the conditions was counterbalanced. Each condition comprised 12 trials; each trial had a maximum duration of 90 s. In each trial, the target had a different position oriented on a balanced 4:3 grid (six targets at each side per condition). The sequence of target location was randomized. Prior to the task, the participants performed two example tasks (target on the left/right) to get familiar with the procedure.
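The trial structure described above (three load levels, 12 trials per level, six targets per side, randomized target sequence, counterbalanced load order) can be sketched as follows. This is an illustrative sketch only, not the authors' experiment code: the function names, the dictionary layout, and the rule of cycling through all six possible load orders by participant index are assumptions, since the text does not specify how counterbalancing was implemented.

```python
import itertools
import random

LOADS = (50, 100, 200)   # distractor balloons per visual load level
TRIALS_PER_LOAD = 12     # 12 trials per load condition
MAX_TRIAL_S = 90         # maximum trial duration in seconds


def load_order_for(participant_index):
    """Assumed counterbalancing rule: cycle through the 6 possible load orders."""
    orders = list(itertools.permutations(LOADS))
    return orders[participant_index % len(orders)]


def build_session(participant_index, seed=None):
    """Return the 36 trials of one session as a list of dictionaries."""
    rng = random.Random(seed)
    session = []
    for load in load_order_for(participant_index):
        # Six left-sided and six right-sided targets, in random sequence.
        sides = ["left"] * (TRIALS_PER_LOAD // 2) + ["right"] * (TRIALS_PER_LOAD // 2)
        rng.shuffle(sides)
        for side in sides:
            session.append(
                {"load": load, "target_side": side, "max_duration_s": MAX_TRIAL_S}
            )
    return session
```

Passing a fixed `seed` makes the randomized target sequence reproducible for a given participant.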

During the search task, two-dimensional eye movements were recorded with an SMI RED250mobile (sampling rate: 250 Hz, nine-point calibration). We calibrated both eyes and analyzed data from the eye, which produced a better spatial resolution (>0.35° visual angle). The data were only analyzed if the spatial resolution was above 0.5°. The experiment was controlled with the SMI Experiment Suite. The data were exported with SMI BeGaze and customized Python scripts. For event detection, standard thresholds of the SMI BeGaze Software (Version 3.6.52) for high-speed eye-tracking data (sampling rate >200 Hz) were used: The velocity threshold for saccade detection was 40°/s. Fixations were defined by an absence of saccades and blinks (defined as moments without registered gaze positions) that lasted at least 50 ms. Participants sat about 60 cm away from the computer monitor. To minimize head movements and standardize the head position, we additionally used a chin rest. Prior to the recording, a nine-point calibration procedure was used. The paradigm was presented on a 24-in. widescreen TFT monitor with a resolution of 1,920 × 1,080 pixels.
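To make the event-detection thresholds concrete, the following is a minimal, generic velocity-threshold sketch. It is not the BeGaze algorithm, whose internals are not described in the text. It assumes that gaze samples have already been converted to degrees of visual angle, that missing samples (blinks) are coded as `None`, and that a run of n samples is treated as lasting n sampling intervals; the function and variable names are hypothetical.

```python
import math

SAMPLING_RATE_HZ = 250          # sampling rate reported in the text
SACCADE_VELOCITY_DEG_S = 40.0   # saccade velocity threshold reported in the text
MIN_FIXATION_MS = 50.0          # minimum fixation duration reported in the text


def detect_fixations(samples):
    """Return (start_index, end_index) pairs, inclusive, for detected fixations.

    samples: list of (x_deg, y_deg) tuples, or None where no gaze was registered.
    """
    dt = 1.0 / SAMPLING_RATE_HZ
    # 50 ms at 250 Hz is 12.5 samples, so at least 13 samples are required.
    min_samples = math.ceil(MIN_FIXATION_MS * SAMPLING_RATE_HZ / 1000.0)
    fixations = []
    run_start = None
    prev = None
    for i, sample in enumerate(samples):
        stable = False
        if sample is not None:
            if prev is None:
                # First valid sample after a gap: no velocity can be computed,
                # so it is treated as a possible start of a fixation (assumption).
                stable = True
            else:
                distance = math.hypot(sample[0] - prev[0], sample[1] - prev[1])
                stable = distance / dt <= SACCADE_VELOCITY_DEG_S
        if stable:
            if run_start is None:
                run_start = i
        else:
            # A saccade sample or a blink ends the current run.
            if run_start is not None and i - run_start >= min_samples:
                fixations.append((run_start, i - 1))
            run_start = None
        prev = sample
    if run_start is not None and len(samples) - run_start >= min_samples:
        fixations.append((run_start, len(samples) - 1))
    return fixations
```

Runs shorter than the minimum duration are discarded rather than merged with neighboring runs; a production pipeline may handle such cases differently.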

#### Data Analyses

For the analysis of the data from the balloons test, the computer screen was divided into the left and right sides (area of interest). To identify changes in directed attention due to the placebo/nocebo treatment, the percentage of left-sided (relative to right-sided) fixations was calculated (mean percent of total fixations per trial, which was within the left area of interest: values above 50% indicate a left-sided bias, values below 50% indicate a right-sided bias). Further, the lateralization index (LI) (19) of the mean reaction time for targets on the left vs. right side was determined (positive values indicate slower reactions to left-sided targets; negative values indicate faster reactions to left-sided targets).
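The two dependent measures can be illustrated with a short sketch. The exact LI formula is not given in the text; the sketch assumes the common form (left − right) / (left + right), which matches the stated sign convention (positive values indicate slower reactions to left-sided targets). Splitting the 1,920-pixel-wide screen at its vertical midline, and counting a fixation exactly on the midline as right-sided, are likewise assumptions, as are all function names.

```python
from statistics import mean

SCREEN_WIDTH_PX = 1920  # horizontal resolution reported in the text


def percent_left_fixations(fixation_x_px, screen_width_px=SCREEN_WIDTH_PX):
    """Percentage of one trial's fixations in the left half of the screen.

    Assumes the trial contains at least one fixation.
    """
    midline = screen_width_px / 2
    left = sum(1 for x in fixation_x_px if x < midline)
    return 100.0 * left / len(fixation_x_px)


def mean_percent_left(trials):
    """Mean percentage of left-sided fixations across trials (above 50 = left bias)."""
    return mean(percent_left_fixations(trial) for trial in trials)


def lateralization_index(rt_left_ms, rt_right_ms):
    """Assumed LI: (mean left RT - mean right RT) / (mean left RT + mean right RT)."""
    left = mean(rt_left_ms)
    right = mean(rt_right_ms)
    return (left - right) / (left + right)
```

With this definition, a participant who needs on average 800 ms for left-sided and 1,200 ms for right-sided targets obtains a negative LI, that is, faster reactions to left-sided targets.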

Separate repeated-measures 3 × 2 ANOVAs were performed for the percent of the left-sided fixations and the LI of the reaction time with the within-subject factors visual load (50, 100, 200 balloons) and treatment (nocebo OR placebo, control) for the placebo and nocebo groups.

In order to compare the attention bias between the placebo and the nocebo treatments, two separate ANOVAs for the difference scores for the percent of left-sided fixations [treatment (placebo OR nocebo) minus control] and LI reaction time [treatment (placebo OR nocebo) minus control] were computed with visual load (50, 100, 200 balloons) as within-subjects factor and group (placebo, nocebo) as between-subjects factor.
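The difference scores that enter the group comparison can be sketched as follows. The data layout (one dictionary per participant, keyed by visual load) and the function names are assumptions for illustration; the ANOVA itself would be run in a statistics package and is not reproduced here.

```python
from statistics import mean

LOADS = (50, 100, 200)  # visual load levels (number of balloons)


def difference_scores(treatment, control):
    """Treatment-minus-control score per load level for one participant."""
    return {load: treatment[load] - control[load] for load in LOADS}


def group_means(per_participant_diffs):
    """Mean difference score per load level across the participants of one group."""
    return {load: mean(d[load] for d in per_participant_diffs) for load in LOADS}
```

A positive difference score for the percentage of left-sided fixations means more left-sided fixations with the sham treatment than without it.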

To assess possible group differences in affective states, separate ANOVAs including the within-subjects factor time of measurement (before, after search task) and the between-subjects factor group were computed for the difference score of valence, arousal, and dominance [treatment (placebo OR nocebo) minus control]. We report Bonferroni adjusted *p*-values and partial eta squared (ηp²) as effect size measure.
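For readers who want to check the reported effect sizes, partial eta squared can be recovered from an F value and its degrees of freedom with the standard identity F·df_effect / (F·df_effect + df_error). The sketch below also includes a simple Bonferroni adjustment; the number of comparisons is not stated in the text, so that parameter is left to the caller. Function names are hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bonferroni_adjust(p_value, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


# F values and degrees of freedom as reported in the Results section.
for f_value, df_effect, df_error in [(18.65, 1, 19), (13.01, 1, 19), (4.41, 2, 76)]:
    print(round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

Applied to the F values reported in the Results section, the function reproduces the published effect sizes to three decimals.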

### RESULTS

### Eye-Tracking

Descriptive statistics for the left-sided bias (fixations and reaction times) in the placebo and nocebo groups are displayed in **Table 1**.

*Placebo:* The ANOVAs for the percentages of left-sided fixations and LI reaction time in the placebo group revealed no significant main effects or interactions involving the factor treatment (all *p* > .06).

*Nocebo:* In the nocebo group, the ANOVAs for fixation count [F(1, 19) = 18.65, *p* < .001, ηp² = .495] and LI reaction time [F(1, 19) = 13.01, *p* = .002, ηp² = .406] showed a significant main effect of treatment. More left-sided fixations were observed, and reaction time for left-sided targets (in relation to right-sided targets) was lower in the nocebo condition compared with the control condition. The treatment × visual load interactions were not significant (*p* > .18).

*Placebo vs. Nocebo:* The ANOVA for left-sided fixations in the sTMS condition relative to the control condition showed a significant main effect of group [F(1, 38) = 4.426, *p* = .042, ηp² = .104]. The nocebo group displayed more left-sided fixations due to the treatment than the placebo group. Other effects were not significant (all *p* > .09). Means and standard errors for left-sided fixations (treatment minus control) are displayed in **Figure 3**.

The ANOVA for the differences in LI reaction time (treatment minus control) revealed a significant main effect of group [F(1, 38) = 7.12, *p* = .011, ηp² = .158] and a significant group × visual load interaction [F(2, 76) = 4.41, *p* = .015, ηp² = .104]. *Post hoc t*-tests showed that sTMS decreased response times for left-sided targets in the nocebo group in comparison with the placebo group in the high-load condition (*p* = .001) but not in the low- and medium-load conditions (both *p* > .15). The main effect of visual load was not significant (*p* > .70). Means and standard errors for LI scores are shown in **Figure 4**.

TABLE 1 | Percentages of left-sided fixations and LI reaction time (means and standard errors) in the placebo and nocebo groups (treatment minus control) for the different visual load levels.


*Visual load conditions (low: 50, medium: 100, high: 200 distractors); left-sided fixations (above 50% = more left-sided fixations); reaction time LI (lateralization index): negative values indicate faster reactions to left-sided targets.*


### Self-Report

*Affective ratings:* The ANOVAs for the difference scores of arousal and dominance revealed a significant main effect of group. The nocebo group reported higher arousal [F(1, 38) = 4.43, *p* = .042, ηp² = .104] and lower dominance [F(1, 38) = 9.08, *p* = .005, ηp² = .193] in the treatment relative to the control condition. The ANOVA for the difference scores of valence (treatment minus control) produced no significant results (all *p* > .17). Means and standard errors for the affective ratings can be found in **Supplementary Table S2**.

*Reported symptoms:* The following nocebo-induced symptoms were reported by the nocebo group: slower search behavior (40%), heavy eyelids (30%), blurred vision (20%), reduced concentration (45%), and other nonspecific symptoms (60%, e.g., numbness in the left side of the body). The placebo group reported the following in the treatment condition: enhanced concentration (70%), faster search behavior (45%), twitching of the eyelids (5%), perceptual changes (10%, e.g., left-sided targets appeared bigger), and other nonspecific symptoms (15%, e.g., increased sensitivity in the left side of the body).

An exploratory correlation analysis indicated that the treatment-related affective changes [treatment (placebo or nocebo) minus control] in arousal, dominance, and valence (before, as well as after the search task) were not associated with the placebo/nocebo responsiveness (percentages of left-sided fixations and LI reaction time during sTMS; all *p* > .11).

### DISCUSSION

This eye-tracking study directly compared the effects of a placebo and a nocebo on visuospatial attention in healthy individuals. The participants reported experiencing improved attention in the placebo condition, although no changes in gaze behavior and reaction time occurred. Contrary to this, the nocebo significantly increased the number of left-sided fixations and decreased reaction time for left-sided compared with right-sided targets, especially in the condition with the highest visual load. Thus, the placebo had no effects on attention, whereas the nocebo exerted effects in the opposite direction of the verbal suggestion.

These results indicate a more beneficial effect of a nocebo, relative to a placebo, for the first time. The suggestion of a deficit in the nocebo group seemed to have prompted a need for compensation, and thus elicited a paradoxical effect. In other words, the suggestion of negative symptoms actually led to improvement. To the best of our knowledge, this is the first report on positive nocebo effects. In contrast, paradoxical placebo effects have been described before. Here, a sham treatment introduced as an agent to reduce symptoms actually made a condition worse or elicited negative side effects [for a review see (20)].

According to the present results, paradoxical interventions could be more effective than a common goal-directed placebo intervention, at least in some cases. In psychotherapy, the usefulness of paradoxical interventions has long been recognized. Particularly, when the commitment to change or therapy motivation is low, paradoxical interventions can be helpful for achieving therapy goals [e.g., Ref. (21)]. This especially applies to neuropsychological therapy where lack of compliance is a common problem in patients with disorders such as anosognosia (e.g., hemiplegia, aphasia; visual neglect). These patients are not aware of their deficit and therefore do not use, or pursue learning, compensatory strategies (22). In this specific case, nocebo interventions could open new doors in neuropsychological therapy, perhaps helping achieve positive therapy outcomes when goal-directed suggestions do not work.

The placebo group also found the treatment to be effective and experienced a subjective increase in left-sided attention. Objectively, however, no such increase was present. It is very likely that the participants reduced their individual effort during the search task because of the assumed support by the sTMS treatment. This might even be considered a negative placebo effect because the participants overestimated their own attention abilities. Partly in line with this effect, when sTMS was applied, participants in the placebo group described themselves as generally more relaxed and self-confident (i.e., lower arousal and increased dominance) than those in the nocebo group. In any case, these effects portray an interesting dissociation between subjective and objective placebo/nocebo effects.

The findings of the present investigation raise basic questions regarding the possibilities and limits of placebo and nocebo treatments. It is known that placebos show differential effectiveness depending on the particular condition being treated. For example, substantial placebo effects have been found in the treatment of some disorders (e.g., depression, irritable bowel syndrome) but not in others (e.g., bacterial infections, the common cold) (23). In healthy individuals, pronounced effects have also been observed, such as when attempting to change emotional responses *via* placebo. Schienle et al. (5) administered a disgust placebo to their participants (labeled as an anti-nausea drug), while they were presented with stimuli commonly perceived as repulsive (e.g., spoiled food, excrements). The placebo reduced the intensity of experienced disgust by more than half of its original value.

In the present study, a neglect-like reaction was suggested to participants. Inducing "pseudo-neglect" (or "pseudo-unilateral attention focusing") may be more difficult because healthy individuals have no experience with this specific phenomenon. It has been argued that direct experience (conditioning) is the most powerful way of inducing placebo-related expectancies and associated placebo responses (24); in other words, more commonly experienced reactions may be more susceptible to placebo effects. In the present investigation, a left-sided improvement/reduction of attention was suggested. This is a very specific symptom. Healthy individuals are very likely more familiar with feelings of generally reduced or increased attention and alertness. When such general changes in attention were suggested, visual search performance could be altered *via* placebo/nocebo treatment (3).

It is important to acknowledge the following limitations of the present study. We only investigated women due to sex-related differences in placebo/nocebo responses [e.g., Refs. (25, 26)]. Therefore, the results cannot be generalized to men. Moreover, we did not assess or control the intake of nicotine and caffeine prior to the investigation, which might have introduced unspecific effects on general visual attention. Further, since only placebo and nocebo responders were included in the analyses, the sample size was relatively small and only allows for conclusions regarding individuals who subjectively experienced left-sided attention improvements/deficits. Finally, the nocebo group reported higher arousal and lower dominance, which may reflect a higher subjective value of the suggested left-sided deficits (as compared with left-sided improvements). However, the affective ratings were not correlated with the responsiveness to the sTMS (e.g., percentages of left-sided fixations). Therefore, it seems unlikely that the nocebo effects were mediated *via* enhanced arousal.

In summary, the present results indicate an interesting dissociation between subjectively experienced effects of placebos/nocebos and the resulting behavioral changes.

#### ETHICS STATEMENT

The study was approved by the ethics committee of the University of Graz. Each participant gave written informed consent.

### AUTHOR CONTRIBUTIONS

CH and AS designed the study and wrote the manuscript. CH and JP recruited participants for the study, collected the data, and conducted the statistical analysis of the data.



### ACKNOWLEDGMENTS

The authors acknowledge the financial support by the University of Graz.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2019.00446/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Höfler, Potthoff and Schienle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Classical Conditioning as a Distinct Mechanism of Placebo Effects

#### *Przemysław Bąbel\**

*Pain Research Group, Institute of Psychology, Jagiellonian University, Kraków, Poland*

Classical conditioning was suggested as a mechanism of placebo effects in the 1950s. It was then challenged by response expectancy theory, which proposed that classical conditioning is just one of the means by which expectancies are acquired and changed. According to that account, placebo effects induced by classical conditioning are mediated by expectancies. However, in most of the previous studies, either expectancies were not measured or classical conditioning was combined with verbal suggestions. Thus, on the basis of those studies, it is not possible to conclude whether expectancies are involved in placebo effects induced by pure classical conditioning. Two lines of recent studies have challenged the idea that placebo effects induced by classical conditioning are always mediated by expectancies. First, some recent studies have shown that a hidden conditioning procedure elicits both placebo analgesia and nocebo hyperalgesia, neither of which is predicted by expectancy. Second, there are studies showing that visual cues paired with pain stimuli of high or low intensity induce both placebo analgesia and nocebo hyperalgesia when they are presented subliminally without participants' awareness. The results of both lines of studies suggest that expectancy may not always be involved in placebo effects induced by classical conditioning and that conditioning may be a distinct mechanism of placebo effects. Thus, these results support the idea that placebo effects can be learned by classical conditioning either consciously or unconsciously. However, the existing body of evidence is limited to classically conditioned placebo effects in pain, that is, placebo analgesia and nocebo hyperalgesia.

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany* 

#### *Reviewed by:*

*Liesbeth Van Vliet, Leiden University, Netherlands Susanne Becker, Central Institute for Mental Health, Germany*

#### *\*Correspondence:*

*Przemysław Bąbel przemyslaw.babel@uj.edu.pl*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 29 December 2018 Accepted: 06 June 2019 Published: 25 June 2019*

#### *Citation:*

*Bąbel P (2019) Classical Conditioning as a Distinct Mechanism of Placebo Effects. Front. Psychiatry 10:449. doi: 10.3389/fpsyt.2019.00449*

Keywords: classical conditioning, nocebo effect, Pavlovian conditioning, placebo effect, response expectancy

### THE ORIGINS OF THE CLASSICAL CONDITIONING ACCOUNT OF PLACEBO EFFECTS

Classical conditioning was independently suggested as a mechanism of placebo effects for the first time in 1957 by Gliedman, Gantt, and Teitelbaum (1) and Kurland (2). It is interesting that just 2 years earlier, Beecher (3) had published his seminal paper that is now considered the starting point of scientific interest in placebo effects. Thus, classical conditioning has been regarded as a mechanism of placebo effects since the very beginning of research on placebo. However, Wickramasekera (4, 5) was the first to propose a broad and coherent theoretical account of placebo effects as conditional reflexes.

According to the classical conditioning approach, placebo is a conditioned stimulus and placebo effects are conditioned responses. The first studies in which classical conditioning with an active drug as an unconditioned stimulus was used to induce placebo effects were conducted in animals (6–8). In fact, however, Pavlov (9) was the first to describe the effects of repeated applications of active drugs that his collaborators had found. Dr. Podkopaev associated the sound of a definite pitch with the effects of a dose of apomorphine in dogs. In effect, the sound of the note alone produced all the symptoms of the drug. Similarly, when Dr. Krylov repeatedly injected morphine into dogs, he observed that the preliminaries of the injection were sufficient to produce all the symptoms of the drug. Nevertheless, these early studies started two very important lines of research, that is, studies on conditioned immunopharmacological effects (10) derived from Ader and Cohen's (6) experiment and studies on conditioned drug tolerance (11) derived from Siegel's (8) experiment. In both lines of research, responses to stimuli that accompany the application of pharmacologically active drugs are classically conditioned. However, these studies do not aim to explore the mechanisms of placebo effects, and they focus on conditioning of physiological responses.

Voudouris, Peck, and Coleman (12–14) developed the classical conditioning paradigm to induce placebo effects in humans. By surreptitiously pairing an inactive cream with decreasing nociceptive stimulation, they strengthened the placebo effect induced by verbal suggestion of the analgesic action of the cream (12, 13). Moreover, in spite of the fact that they had previously induced the placebo effect by verbal suggestion of the analgesic action of an inactive cream, they were subsequently able to induce the nocebo effect by pairing the same cream with increasing nociceptive stimulation (12, 13). Most importantly, they also found that placebo analgesia can be induced by classical conditioning alone (without verbal suggestions); that is, the placebo effect was found in a group that was informed that they had received an inactive cream, which was then surreptitiously paired with decreasing nociceptive stimulation (14). However, it should be noted that the cream used in these studies might have raised expectancy based on previous experiences with active treatment creams and that expectancy might have biased the results. These studies started a new line of research on placebo effects induced by classical conditioning. The aim of the paper is to briefly summarize recent findings and, based on them, draw conclusions on the differential roles of classical conditioning and expectancy in placebo effects. It should be noted that subjective responses, that is, pain, are subject to conditioning in this new line of research. Thus, this paper focuses on classical conditioning of placebo effects in pain, including placebo analgesia and nocebo hyperalgesia.

### THE CLASSICAL CONDITIONING ACCOUNT IS CHALLENGED BY RESPONSE EXPECTANCY THEORY

In the same year as the first study on classical conditioning of placebo effects in humans was published (12), Kirsch (15) published his seminal paper on response expectancy in which he proposed another account of placebo effects. His theory assumes that placebo effects result from expectancies concerning placebo intervention. Kirsch (15) highlighted that, among other processes, classical conditioning is involved in the acquisition and modification of expectancy. According to this viewpoint, classical conditioning is one of the means by which expectancies are acquired and modified; that is, the effects of conditioning are mediated by expectancy (15). In other words, there is only one mechanism of placebo effects—expectancy; classical conditioning is only a method that is used to acquire or change expectancy.

This view is reflected in the popular learning model of placebo effects proposed by Colloca and Miller (16). In this model, placebo effects result from expectancies acquired by decoding information from the psychosocial context, including conditioned stimuli, among others. Thus, according to the model, classical conditioning is a means by which placebo effects may be induced, and expectancies play a central role in the formation of placebo effects induced by classical conditioning.

It should be noted that expectancies are by definition consciously accessible (17–19). According to a recent definition, expectation is understood to mean a "conscious, conceptual belief about the future occurrence of an event" (20).

Kirsch's (15) account of the role of expectancy in the formation of placebo effects induced by classical conditioning is based on a current view on classical conditioning, which is best summarized by Rescorla (21). This modern view differs substantially from Pavlov's (9) account, as is well reflected in the title of Rescorla's (21) seminal paper: "Pavlovian conditioning: It's not what you think it is." According to this current view, classical conditioning is not a mechanical process in which one stimulus passes control over a response from another stimulus; instead, conditioning is now seen as the learning of relations among events, which allows the organism to represent its environment. As a consequence, cognitive involvement is assumed for classical conditioning. From this perspective, conditioning produces the expectancy that certain stimuli will be followed by other stimuli, and it is this expectancy that produces the response. In other words, expectancies mediate the effects of conditioning (18).

### THE CHALLENGE CONTINUES IN STUDIES CONTRASTING CLASSICAL CONDITIONING AND EXPECTANCY

Kirsch (15) not only challenged the classical conditioning account of placebo effects on theoretical grounds but also conducted an empirical test of his theory. Montgomery and Kirsch (22) showed that the effects of classical conditioning on placebo analgesia induced by verbal suggestions are completely mediated by expectancies and that when participants were informed that they were undergoing the conditioning procedure (i.e., pairing placebo cream with decreasing nociceptive stimulation), the conditioning did not have an effect on placebo analgesia induced by verbal suggestions.

Montgomery and Kirsch's (22) study together with Voudouris and collaborators' (12–14) investigations started the conditioning versus expectancy debate, which has still not been fully resolved. The essence of this debate is whether classical conditioning is a distinct mechanism of placebo effects or whether the effects of conditioning are mediated by expectancy. The early stage of this debate was reviewed by Stewart-Williams and Podd (19). However, in the 15 years since their seminal paper was published, new research findings have been collected that shed light on the debate.

So far, few studies have been conducted in which both classical conditioning was applied and expectancy was measured. Although most of these studies suggest that the effects of conditioning are correlated with expectancy (23–26), predicted by expectancy (27), or mediated by expectancy (22, 28–30), their results are limited to participants in whom both verbal suggestions of analgesia or hyperalgesia and classical conditioning were applied. Thus, based on these findings, one cannot draw any conclusions on the role of expectancy in placebo effects induced by pure classical conditioning. Instead, it can be concluded that expectancy is involved in the effects of conditioning on placebo effects induced by verbal suggestions.
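The distinction drawn above between effects that are correlated with, predicted by, or mediated by expectancy can be made concrete with a single-mediator regression. The following is a minimal sketch, not the analysis used in any of the cited studies: the function name `mediation_paths`, the variable names, and all data values are hypothetical and chosen only for illustration. It estimates the path from conditioning to expectancy (`a`), the path from expectancy to the placebo effect with conditioning held constant (`b`), and splits the total effect of conditioning into a direct part and an indirect part (`a * b`).

```python
# Toy illustration only: every number below is invented and is not taken
# from any of the cited studies.

def mean(values):
    return sum(values) / len(values)

def cross(u, v):
    """Sum of cross-products of two variables after centering each on its mean."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def mediation_paths(x, m, y):
    """Ordinary least squares estimates for a single-mediator model.

    x: conditioning (0 = control, 1 = conditioned)
    m: expectancy rating (the hypothesized mediator)
    y: placebo effect, e.g., reduction in pain rating
    """
    sxx, smm, sxm = cross(x, x), cross(m, m), cross(x, m)
    sxy, smy = cross(x, y), cross(m, y)
    det = sxx * smm - sxm ** 2                   # zero only if m is a linear function of x
    a = sxm / sxx                                # x -> m
    total = sxy / sxx                            # x -> y, ignoring m
    b = (sxx * smy - sxm * sxy) / det            # m -> y, controlling for x
    direct = (smm * sxy - sxm * smy) / det       # x -> y, controlling for m
    return {"a": a, "b": b, "total": total, "direct": direct, "indirect": a * b}

# Hypothetical data for eight participants.
conditioning = [0, 0, 0, 0, 1, 1, 1, 1]
expectancy = [1, 2, 2, 3, 4, 5, 5, 6]
pain_relief = [0.5, 1.0, 1.5, 1.5, 2.0, 3.0, 2.5, 3.5]

paths = mediation_paths(conditioning, expectancy, pain_relief)
for name, value in paths.items():
    print(f"{name}: {value:.3f}")
```

In this framing, a direct effect close to zero alongside a sizeable indirect effect would be read as mediation by expectancy, whereas a substantial direct effect with a negligible indirect effect would be compatible with conditioning acting without expectancy. A real analysis would also need significance tests and an adequate sample, which this sketch omits.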

Moreover, the sparse studies in which pure classical conditioning was applied (without verbal suggestions) and expectancy was measured usually failed to induce placebo effects (25, 31, 32), probably due to limited conditioning trials (from 12 to 30, including 6–15 in which placebo was paired with changes in nociceptive stimulation). Even if it succeeded in one study (i.e., placebo analgesia was found in the group subjected to pure conditioning), the results of regression analysis revealing the prediction of the placebo effect by expectancies were based on the results from all the study groups, including those in which verbal suggestions of analgesia were provided (33). Thus, it is not possible to conclude whether expectancies predicted placebo analgesia found in the group subjected to pure classical conditioning. Interestingly, in that study, classical conditioning produced the placebo effect, regardless of whether or not participants were informed that they were undergoing the conditioning procedure (i.e., pairing placebo cream with decreasing nociceptive stimulation) and regardless of whether they were informed that active or inactive intervention was used (in fact placebo) (33). These results contradict Montgomery and Kirsch's (22) findings.

### CHALLENGE ACCEPTED: PLACEBO EFFECTS INDUCED BY PURE CLASSICAL CONDITIONING

Unfortunately, most of the few studies in which pure classical conditioning without verbal suggestions succeeded in inducing placebo effects did not involve the measurement of expectancy (34–36). For many years, the only study in which pure classical conditioning effectively induced the placebo effect and expectancy was measured was the one conducted by Voudouris and collaborators (14). In one of the groups, participants were informed that they were in a control group and that they would receive a neutral cream. They were then subjected to a conditioning procedure in which the cream was paired with decreased nociceptive stimulation without participants' knowledge. However, in that study, expectancy was measured only once (before the pre-test), so it is impossible to determine whether the conditioning that was performed after the pre-test changed expectancies.

Recently, two lines of studies have challenged the idea that placebo effects induced by classical conditioning are always mediated by expectancies. In the first line, hidden conditioning without verbal suggestions is conducted, and expectancies are measured on a trial-by-trial basis. The conditioning procedure may be conducted in two ways: by informing or not informing participants that there is a relationship between the placebo (i.e., a conditioned stimulus) and the active drug or procedure (i.e., an unconditioned stimulus). When participants are aware of the relationship, this is referred to as open conditioning; when they are not aware of it, this is called hidden conditioning. Thus, the role of consciousness is the main difference between hidden and open conditioning.

In three recent studies, hidden conditioning was used to induce placebo analgesia (37, 38) and nocebo hyperalgesia (38, 39), and expectancies were measured on a trial-by-trial basis. These studies found not only that hidden conditioning was effective in producing placebo effects but also, and more importantly, that expectancies neither predicted nor mediated placebo analgesia or nocebo hyperalgesia (37–39), even though conditioning had an effect on expectancies (37). Moreover, when participants were asked at the end of the study whether they had noticed the contingency between placebo stimuli and differences in pain intensity, most of them said they had not (37). Thus, based on these results, it seems that it is possible to induce placebo effects without the awareness of the participants.

The second line of research that sheds light on the role of expectancy in placebo effects induced by pure classical conditioning involves placebo stimuli presented subliminally without participants' awareness. In this paradigm, clearly recognizable visual stimuli are first paired with pain stimuli of high or low intensity. After a conditioning phase is completed, the same conditioned visual cues are presented subliminally in a testing phase. It has been found that pain stimuli preceded by subliminally presented conditioned visual stimuli are rated as more or less painful depending on whether they have previously been paired with high or low pain stimuli, indicating that placebo analgesia and nocebo hyperalgesia are induced without awareness (40–44). Moreover, it has also been found that both placebo analgesia and nocebo hyperalgesia can be induced not only by conditioning of supraliminal stimuli but also by conditioning of subliminally presented stimuli (44). Placebo effects induced by conditioned stimuli presented subliminally without participants' awareness suggest that expectancy may not have been involved in their production, which is consistent with the results from the first line of studies. Although expectancy is not measured in those studies, participants are not aware of the presented stimuli. Thus, their expectancy should not have affected the results.

It may be argued that the studies from both lines of research did not include any placebo interventions in the form of a sugar pill, fake cream, or sham electrodes. In fact, in all of those studies, visual stimuli were paired with decreasing and/or increasing pain stimulation. However, according to Miller and Kaptchuk (45), the placebo effect is not the result of a specific intervention but is rather produced and enhanced by the context surrounding the treatment. Thus, even if no inert treatment is administered, the so-called placebo-related effect may still be found (46).

### CONCLUSIONS

The results of both lines of studies suggest that expectancy may not always be involved in placebo effects induced by classical conditioning and that conditioning may be a distinct mechanism of placebo effects.

These findings are in line with the fact that, in some cases, classical conditioning represents an automatic process that is not mediated by cognitive expectancy (18). In fact, many phenomena could be explained by classical conditioning without cognitive mediation. They include evaluative conditioning, second-order conditioning, conditioned taste aversions and flavor preferences, conditioning with subliminally presented conditioned stimuli, conditioned immunosuppression, and conditioning in simple organisms among others (see (18) for review). Thus, only some placebo effects could be explained by classical conditioning without expectancy involvement.

However, the findings under discussion do not exclude the role of expectancy in inducing placebo effects. Expectancy ratings may not always predict placebo effects. However, pre-cognitive associations, that is, "links between events and/or objects that exist outside conscious awareness" (20), may be acquired through hidden conditioning procedures or be responsible for responses to subliminally presented conditioned stimuli. In fact, when classical conditioning is used to enhance or reduce placebo effects induced by verbal suggestions, expectancies are involved in their formation (22–30). In that case, classical conditioning is just a means by which expectancies are acquired and modified. Moreover, expectancies might not always be easily self-reported; that is, although expectancies do exist, one might not be able to report them. However, the idea of conscious expectancies that are not self-reported should be treated with caution as it may lead to circular reasoning (17).

These conclusions are in line with a recent review (47) and previously proposed models postulating that the classical conditioning and response expectancy accounts do not exclude each other, but the range of phenomena they explain is not completely the same (19, 48). Conditioning involves either conscious learning (acquisition and modification of expectancies) or unconscious learning (conditioning not mediated by expectancy). Expectancies can be acquired and modified by conditioning and other procedures, including verbal suggestions and observational learning. In other words, either conscious learning (expectancy and conditioning) or unconscious learning (conditioning) can be mechanisms of placebo effects. Thus, both accounts seem to be compatible rather than mutually exclusive (19, 48). From this perspective, classical conditioning is in some cases a distinct mechanism of placebo effects, and sometimes, it is just a method used to acquire or change expectancy.

Thus, the current conclusions contradict Colloca and Miller's (16) learning model of the formation of placebo effects. They suggest that conditioned placebo and nocebo responses may not always be mediated by expectancy. It seems that Colloca and Miller's (16) model does not explain the mechanism of all instances of placebo effects. However, future studies should answer the question of under which circumstances placebo effects induced by classical conditioning are mediated by expectancy and when they are not. Previous studies in which expectancies were not involved in the induction of placebo effects by classical conditioning used visual stimuli as placebos together with a large number of conditioning trials. Thus, these two factors may be necessary to induce conditioned placebo effects that are not mediated by expectancy. So far, it seems only clear that placebo effects induced by both conditioning and verbal suggestions are mediated by expectancy. Further research is also needed to investigate the differential role of classical conditioning and expectancy in placebo effects outside pain. It would also be of interest to investigate whether all principles of classical conditioning found in studies outside the placebo research field (e.g., generalization and extinction) can be directly applied to placebo effects.

The finding that expectancy may not always be involved in placebo effects induced by classical conditioning has implications not only for placebo theory, as discussed above. It also has important implications for the methodology of placebo studies, namely, that expectancies should be measured in research on placebo effects when the role of expectancy is under study. Regardless of whether placebo effects were induced by classical conditioning, verbal suggestions, or both, the involvement or absence of expectancy might be postulated only when expectancy was measured. Most importantly, this finding also has implications for clinical practice. Pain can decrease or increase after positive or negative experiences that are associated with environmental stimuli. In effect, these environmental stimuli may increase or reduce pain symptoms, not only without any provided verbal suggestions but, most importantly, without patients' conscious awareness. Thus, pain changes can occur even when patients do not anticipate them. The decrease or increase of pain may result from uncontrollable contextual factors. Identifying the elements, that is, the conditioned stimuli that change pain experiences, could be an essential part of pain management programs. However, as significant differences between experimental and clinical settings exist, further studies are needed before translating laboratory research results into clinical practice.

### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work.

### FUNDING

This manuscript was prepared under grant number 2014/14/E/HS6/00415 from the National Science Centre, Poland.

### REFERENCES


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Bąbel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebo Effects in Psychotherapy: A Framework

#### *Paul Enck\* and Stephan Zipfel*

*Psychosomatic Medicine and Psychotherapy, Department of Internal Medicine VI, University Hospital Tübingen, Tübingen, Germany*

The issue of placebo response and the extent of its effect on psychotherapy is complex for two specific reasons: i) Current standards for drug trials, e.g., true placebo interventions, double-blinding, cannot be applied to most psychotherapy techniques, and ii) some of the "nonspecific effects" in drug therapy have very specific effects in psychotherapy, such as the frequency and intensity of patient–therapist interaction. In addition, different psychotherapy approaches share many such specific effects (the "dodo bird verdict") and lack specificity with respect to therapy outcome. Here, we discuss the placebo effect in psychotherapy under four aspects: a) nonspecific factors shared with drug therapy (context factors); b) nonspecific factors shared among all psychotherapy traditions (common factors); c) specific placebo-controlled options with different psychotherapy modalities; and d) nonspecific control options for the specific placebo effect in psychotherapy. The resulting framework proposes that the exploration and enumeration of context factors, common factors, and specific factors contributes to the placebo effects in psychotherapy.

#### *Edited by:*

*Michael Noll-Hussong, Saarland University Hospital, Germany*

#### *Reviewed by:*

*Giorgio Sandrini, University of Pavia, Italy Joel Paris, McGill University, Canada Chantal Berna, Lausanne University Hospital (CHUV), Switzerland*

#### *\*Correspondence:*

*Paul Enck paul.enck@uni-tuebingen.de*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 01 August 2018 Accepted: 10 June 2019 Published: 26 June 2019*

#### *Citation:*

*Enck P and Zipfel S (2019) Placebo Effects in Psychotherapy: A Framework. Front. Psychiatry 10:456. doi: 10.3389/fpsyt.2019.00456*

Keywords: placebo effects, psychotherapy, control condition, placebo response, clinical trials

## HISTORICAL ROOTS

Although the term "placebo" became commonplace medical language some time ago (1), it was not until the 1940s that placebo-controlled pharmacological trials became the standard in psychiatry and beyond (2). This rather restrictive use of the term for controlled trials was relinquished only recently in favor of a broader use in all therapeutic conditions, for differentiation between minimizing placebo effects in controlled trials and maximizing them in daily routine (3, 4), and for harnessing the effect to improve the therapist–patient relationship (5).

Throughout this paper, we will use the terms "placebo effect" and "placebo response" (or "nocebo effect" and "nocebo response") in accordance with a recent expert opinion of the placebo research community (4): Placebo effect refers to a distinctive psychobiological phenomenon, while placebo response refers to the outcome of clinical trials, that is, the amalgam of responses after receiving a placebo: bias in reporting, regression to the mean, possibly also Hawthorne effects, and placebo effects (6).

However, psychotherapy and the placebo response share a specific and delicate relationship.

A response to placebo was soon recognized as an indication of a psychological rather than of a somatic/medical condition (7). Two "roots" of this early placebo research can be identified:

a) In the early 1950s, Stewart Wolf described the mechanisms (conditioning, expectation) by which placebo effects occurred and were strong, particularly with somatic symptoms such as pain and nausea (8, 9). At the same time, in psychiatry, particularly high placebo effects were observed in randomized controlled trials (RCT) with drugs in depression, anxiety, etc. (10), and, among other things, the severity of the illness, duration of treatment, and previous therapies were identified as causing this effect (11) [for a survey, see Weimer et al. (12)].

b) Around the same time, Jerome D. Frank noted that patients' and therapists' expectations influenced the outcome of psychotherapy (13) and speculated that suggestions (but not motivation) may play a role, as may the duration of therapy and specific patient characteristics (he called such patients placebo reactors), and that side effects may eliminate the effect. To distinguish between specific and nonspecific effects, Frank called for clearly defined control groups in psychotherapy as well, regardless of its theoretical orientation.

Little has been achieved experimentally since then with regard to exploring placebo effects in psychotherapy, although the therapeutic options available have increased dramatically: psychodynamic psychotherapy, cognitive behavioral therapy (CBT), hypnotherapy, interpersonal psychotherapy, group therapy, couple and family therapy, mindfulness-based therapy (MBT), self-help programs (SHPs), phone- and internet-based therapies, health interventions, e.g., smartphone apps, and the like. The general and specific placebo effects of all of these should be examined. In the following sections, we will argue that of the many factors regarded as "nonspecific" in drug RCT, some should be considered as being specific in psychotherapy, while others remain nonspecific under all circumstances. As with drug therapy, however, not all nonspecific factors are attributable to a placebo effect, since response biases, statistical regression to the mean, and spontaneous symptom variation account for some of the effects involved in both the drug and the placebo aspect of trials and therefore also influence psychotherapy. This concept is illustrated in **Figure 1**.

While most of the older and many recent publications on the placebo effect in psychotherapy avoid determining the size of the placebo effect in psychotherapy (14–18), unless they were claiming that the placebo concept cannot be applied to psychotherapy at all (15, 19), others argue that properly designed placebo (control) therapy may be as effective as psychotherapy (16). However, neither of these positions is helpful in planning a rational psychotherapy evaluation.

Instead, we will follow an argument raised by Blease (20, 21), according to which there is general scientific consensus that the placebo concept exists, but unnecessary debate in placebo studies persists due to the failure to recognize this fact. In principle, the same underlying definitions for placebo response and placebo effect that apply in biomedical research interventions also apply to psychological interventions for which the concept "placebo" was not developed. The key difference lies in recognizing the serious challenges of placebo-controlled clinical trials for psychological treatments. It is therefore unnecessary to eliminate placebo concepts in psychological contexts, as proposed by Kirsch (15).

We will abstain from discussing the placebo concept of Grünbaum (22) for CBT for one simple reason: it was developed before the surge of empirical placebo research had begun in the 1990s (23) and thus cannot reflect current knowledge. Gaab (19) falls into the "Grünbaum trap" when arguing that psychotherapy is at risk of being misconstrued as "mere" placebo without such a discussion and that psychotherapists otherwise simply prescribe placebos. It is not without irony that Wampold (16) illustrates the concept with a contemporary drug example (antibiotics) but falls short (as do others) of explaining what contemporary "incidental constituents" of psychotherapy may be, adhering instead to Grünbaum's 1986 definition.

A "Grünbaum trap" is what we call the outdated understanding of the placebo response in psychotherapy. It was developed as a seemingly timeless concept (applicable to all psychotherapies at all times, e.g., the "incidental constituents of psychotherapy" according to Grünbaum) when much of what determines the placebo response had already been identified, e.g., learning history and acute expectancies, which are no longer "incidental" in either drug therapy or psychotherapy.

Our subsequent arguments assume that, like drug RCT in similar conditions when primary efficacy measures are patient-reported outcomes (PROs), an average placebo response of around 40% may also be effective in psychotherapy, provided that optimal research conditions prevail; where this is not the case, the placebo response is liable to be higher. This position is supported by a more recent meta-analysis of psychotherapy trials in irritable bowel syndrome (IBS) with nearly 100 RCT of drug therapy and an average of 40% placebo response across all trials (24); six additional psychotherapy RCT also yielded an average placebo response of 40% (25).

This is similar to Lambert (26) who proposed that 40% of the effect of psychotherapy is attributable to factors beyond psychotherapy (or, in our terms, nonspecific effects: spontaneous variation, regression to the mean, biases) and a further 15% to the placebo effect (expectancy of improvement); in addition, 15% are thought to be due to the specifics of each psychotherapy modality, while the remaining 30% are common to all psychotherapies, the "dodo bird verdict" (27). These 30% "common factors" of all psychotherapies can be subdivided into "support factors," "learning factors," and "action factors," in accordance with Huibers and Cuijpers (28) (see **Table 1**).
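The percentages quoted above can be checked with a few lines of arithmetic. The following sketch only restates the shares as summarized in this paragraph; the dictionary keys and the helper `share_of` are labels chosen here for illustration and are not Lambert's own terms.

```python
# Shares of psychotherapy outcome (in percent) as quoted in the text from Lambert (26).
# The keys are labels chosen for this sketch, not Lambert's own terminology.
OUTCOME_SHARES = {
    "extratherapeutic": 40,      # spontaneous variation, regression to the mean, biases
    "expectancy_placebo": 15,    # expectancy of improvement
    "specific_techniques": 15,   # specifics of each psychotherapy modality
    "common_factors": 30,        # shared by all psychotherapies
}

def share_of(*components):
    """Summed share (in percent) of the named components."""
    return sum(OUTCOME_SHARES[name] for name in components)

total = share_of(*OUTCOME_SHARES)                                  # all four components
within_therapy = share_of("specific_techniques", "common_factors")
outside_technique = total - OUTCOME_SHARES["specific_techniques"]

print(f"total: {total}%")
print(f"specific plus common factors: {within_therapy}%")
print(f"everything except modality-specific techniques: {outside_technique}%")
```

Read this way, the four components exhaust the outcome, and only a small part of it is tied to the specifics of a given psychotherapy modality.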

While all these numbers may be variable with respect to their empirical base (from guesses to meta-analyses), they come surprisingly close to what has been reported from RCT across medicine (30) as well as from psychiatry (12) and are in the range of what Henry K. Beecher had already estimated from the few clinical trials he had at his disposal in 1955 (7). Provided that PROs are the focus, our current understanding is that, at this level, placebo effects in drug therapy and in psychotherapy do not differ at all in size or mechanism.

We will not elaborate further on the concept of these "common factors"—a detailed review and discussion is available in Lambert (26). An in-depth discussion of the control-group issue in psychological interventions can be found, among others, in Mohr et al. (31) and Guidi et al. (32).

TABLE 1 | Factors assumed to be common in all psychotherapies that may influence psychotherapy outcome. These can be classified in three groups and can—to different degrees—be effective in different psychotherapies, thus enabling different modes of psychotherapy to operate. Their sequential order (from left to right) is based on a concept by Lambert and Ogles (29) that is theory-driven and yet without empirical basis [concept according to Huibers and Cuijpers (28)].


We will neither present nor discuss the vast body of evidence with regard to neurobiology and neurochemistry of the placebo response, but again refer to the literature, e.g., Fabrizio Benedetti's book (33), and Luana Colloca's reader (34, 35).

We will structure the following discussion using the analogy of drug therapy and aim to identify nonspecific effects in drug therapy that have either become specific or that have remained nonspecific in psychotherapy. We will discuss common problems of control for nonspecific effects across different psychotherapies as well as potentially specific problems in certain psychotherapies, as also illustrated in **Figure 1**. Finally, we will address placebo issues with a combination of drug and psychotherapy and discuss the relationship between placebo effects and the efficacy of psychotherapy.

### NONSPECIFIC EFFECTS IN DRUG THERAPY WHICH BECOME SPECIFIC IN PSYCHOTHERAPY

Most RCT with drugs are keen to demonstrate that no center effects occurred, which otherwise could explain to some degree the efficacy of the drugs under investigation. In pivotal trials, such a center effect could potentially cause the requested indication to be declined by the approval authorities. It is of interest to note that in RCT before the 1990s, most studies were single-center trials in which such an effect was not even noticed. Furthermore, the qualification of trial doctors, the degree of their training, and their communication skills and empathy were rarely assessed or subsequently linked to treatment outcome. Age, sex, and other personal characteristics of the patients were not specifically taken into account, although it is well established that these factors may play a role in clinical routine (36) as well as in RCT, for both drug therapy (37) and psychotherapy (38). Rules of good clinical practice required independent raters and therapists, staff training, study monitoring, and strict adherence control (39).

Large multicenter trials produce higher placebo responses (40–42), presumably due to a lower standardization of recruitment [including recruitment biases (43)] and higher variability of therapist–patient interaction during the study. In agreement with this, more study visits are now clearly associated with higher placebo response rates in depression in both children (44) and adults (45), as well as in other areas of medicine, e.g., inflammatory (46) and functional bowel disorders (24).

Frequency and intensity of therapist–patient interaction are well-known factors determining the efficacy of psychotherapy (47). They may serve as an example of how nonspecific effects in drug therapy could become specific effects in psychotherapy, however common they may be for most psychotherapy modalities. This is why psychotherapy trials have always sought to standardize the amount of time spent with the patient as well as the communication between patient and therapist. Furthermore, while manuals harmonizing the content and interaction during therapy are standard in psychotherapy, such factors are now also deemed to be relevant in drug trials (48).

One prime example of a common but specific effect involved in psychotherapy is described in an open-label placebo study (49): To achieve an "augmented placebo response" in a sham acupuncture trial in patients with IBS, acupuncturists were instructed to control their treatment behavior on the basis of a manualized script requesting intensified 20-min doctor–patient communication instead of the usual, standard acupuncture treatment. Many of the verbal instructions required "normal" therapist–patient communication behavior in a psychotherapy setting but may be rather atypical in drug therapy environments.

This procedure doubled the placebo response to sham acupuncture on most outcome measures.

### NONSPECIFIC ELEMENTS IN DRUG THERAPY THAT REMAIN NONSPECIFIC IN PSYCHOTHERAPY

Of the small number of patient-centered predictors of the placebo response identified in RCT in psychiatry (12)—low severity of the disease, short disease duration, no treatment history, more recent trials—none were shown to be specifically relevant in psychotherapy, although it is open to speculation as to whether patients accepting psychotherapy as their primary treatment option are less severely affected, e.g., by depression, than patients accepting psychotropic drug therapy. In depression therapy in particular, younger age was associated with higher placebo response, but this may be due to shorter disease history and lower disease severity in children and adolescents than in adults (30, 50). These factors may lose their importance in all those cases in which a first-line drug therapy is not available.

Of the traditional therapist-centered variables tested (age, sex, theoretical orientation, and percentage of work time conducting therapy), only age was a significant demographic predictor of psychotherapy outcome in a univariate analysis, while in a multivariate analysis, interpersonal and social skills accounted for most of the outcome variance (51). While this casts doubt on the replicability of many psychotherapy RCT, it calls for more research into the role of researcher variables for therapy outcome (52): allegiance to theoretical concepts per se has been held responsible for most of the therapy outcomes (53).

It is, however, of relevance that an unbalanced randomization has been shown to drive the placebo effect, particularly in psychiatry but not outside psychiatry [see Ref. (54)]: increased placebo effects were observed in depression (45, 55, 56), schizophrenia (57, 58), and psychosis (42) when more patients were randomized to active treatment than to (placebo) control. While this is usually carried out for ethical reasons (to leave the smallest number of patients untreated), it also serves in certain cases to test different drug dosages against one placebo arm.

Such designs are presumably also common in psychotherapy and may account for a substantial overall effect of the therapy: According to Papakostas and Fava (55), a 10% increase in the probability of receiving active treatment (i.e., a 10% decrease in the probability of being assigned to the control condition) increases the probability of responding to active (drug) antidepressant therapy by 1.8% and to control (placebo) by 2.6%, in comparison with a 50:50 randomization scheme. When one active treatment was compared with another active treatment [comparative effectiveness research (CER)], the response was higher by a factor of 1.79 than in a placebo-controlled trial, solely brought about by the 100% certainty for patients that they would receive active antidepressant treatment (59).
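
The arithmetic behind the figures quoted from Ref. (55) can be sketched as follows. This is a minimal illustration, not the authors' model: it assumes that the "10%" steps are percentage points and that the effect scales linearly with the allocation ratio, neither of which is stated in the source. The function name `expected_response_shift` is hypothetical.

```python
# Illustrative only: linear extrapolation of the figures quoted from Ref. (55).
# ASSUMPTIONS (ours, not the source's): the shifts are percentage points and
# scale linearly with the probability of receiving active treatment.

ACTIVE_SHIFT_PER_10 = 1.8   # shift in active-arm response per +10 points
PLACEBO_SHIFT_PER_10 = 2.6  # shift in placebo-arm response per +10 points


def expected_response_shift(n_active_arms: int, n_placebo_arms: int) -> dict:
    """Return the expected shift in response rates relative to 50:50 randomization."""
    if n_active_arms < 1 or n_placebo_arms < 1:
        raise ValueError("need at least one active and one placebo arm")
    p_active = n_active_arms / (n_active_arms + n_placebo_arms)
    steps = (p_active - 0.5) / 0.10  # number of 10-point increments above 50:50
    return {
        "p_active": p_active,
        "active_shift": steps * ACTIVE_SHIFT_PER_10,
        "placebo_shift": steps * PLACEBO_SHIFT_PER_10,
    }


if __name__ == "__main__":
    # Compare 1:1, 2:1, and 3:1 allocation (active:placebo).
    for arms in [(1, 1), (2, 1), (3, 1)]:
        print(arms, expected_response_shift(*arms))
```

Under these assumptions, a trial with three active arms and one placebo arm (75% chance of active treatment) would show a larger upward shift in the placebo arm than in the active arm, which is the pattern the paragraph describes.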

### COMMON CONTROL PROBLEMS IN ALL PSYCHOTHERAPY TRIALS AND THEIR ADVANTAGES AND PITFALLS

Different psychotherapy options share common features when it comes to standards as set down by RCT of drug therapy in psychiatry and psychosomatics: trial registration, power calculation, ethics approval, and informed consent, for example, are easily applicable to all. Others, such as monitoring of treatment progress and adherence control, need to be adapted to the specific therapy in some cases, e.g., with internet-based therapies. In most cases, the design also requires adaptation to the specifics of certain therapy options (60).

The common denominator in all psychotherapy procedures is the inability to effectively blind treatment and control group assignment and to provide a "true" (by nature, ineffective) placebo treatment; among the many procedures that have been developed to secure blinding of therapy assignment and to warrant equipoise (61), very few are applicable to psychotherapy (62). Both limitations have important consequences for the placebo response, as will be discussed later. Nevertheless, current guidelines for good clinical practice require independence of raters and diagnostic staff and their impartiality toward the intervention (39).

#### Ineffective Blinding

Blinding (of the patient) as well as double blinding (of both patient and therapist) is literally impossible, not only in psychotherapy but also with many other interventions such as manual or physical therapy. Even where apparently possible, e.g., in biofeedback and neurofeedback therapy where "false feedback" (signals from another patient, e.g., as "yoked control") is provided, patients will realize immediately whether they have been randomized to treatment or control. The situation mimics some of the circumstances encountered in therapies using technical tools, e.g., acupuncture, transcutaneous electric nerve stimulation, and transcranial magnetic stimulation where only those patients can be enrolled who had never experienced the "real" therapy before and who may possibly be hoodwinked (63).

In classical drug RCT, unblinding will have immediate consequences for efficacy. Deliberate unblinding of RCT is usually only carried out when severe safety concerns arise but may also occur incidentally when patients and/or doctors notice significant differences in reporting of adverse events (64); even meta-analyses can identify such involuntary unblinding (65). Such unblinding will enhance the response to active therapy and reduce the response to control, thus enlarging the treatment–control difference (66). However, when therapies with either double-blinded placebo-controlled drug interventions or unblinded but controlled psychotherapies for the same condition (depression) were compared, the meta-analysis showed a small but significant effect (drug–placebo difference) in favor of pharmacotherapy (67), indicating that (un-)blinding affects psychotherapy to a lesser degree than conventional drug RCT. Furthermore, patients who were obviously assigned to the control condition (irrespective of its form) will respond with disappointment (68), increased risk of dropping out (69), and potentially with nocebo effects (70), further contributing (via the "last value carried forward" requirement for intent-to-treat analysis of the trial data) to an overestimation of the efficacy of the active arm of the trial.
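
How the "last value carried forward" rule interacts with early dropout can be sketched with a small example. All scores below are invented for illustration (lower means fewer symptoms), the helper names `locf` and `mean_change` are hypothetical, and the sketch assumes that every patient has an observed baseline value.

```python
from typing import List, Optional


def locf(scores: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing visit (None) with the last observed value."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


def mean_change(patients: List[List[Optional[float]]]) -> float:
    """Mean improvement (baseline minus final visit) after imputation.

    Assumes the first entry of every series (baseline) is observed.
    """
    changes = []
    for series in patients:
        filled = locf(series)
        changes.append(filled[0] - filled[-1])
    return sum(changes) / len(changes)


# Hypothetical symptom scores at four visits.
active = [[20, 17, 14, 12], [20, 18, 15, 13], [20, 16, 14, 11]]
# Two control patients drop out after the second visit.
control = [[20, 19, None, None], [20, 20, None, None], [20, 17, 15, 14]]

if __name__ == "__main__":
    print("active :", mean_change(active))
    print("control:", mean_change(control))
```

In this toy data set, the two control patients who drop out are frozen at their early scores, so the control arm appears to improve less than its single completer did, and the difference between the arms grows accordingly.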

Blinding is particularly necessary with conventional crossover designs where each patient serves as his/her own control, thereby reducing the data variance and making RCT possible with considerably fewer patients than with a parallel-group design. However, crossover designs carry another risk: that of carry-over effects from one phase to the next. If the carry-over effect is based on the Pavlovian conditioning of responses (71), even the use of longer washout phases cannot prevent it from occurring.

Ineffective blinding and carry-over effects that cannot be washed out therefore constitute the two reasons why psychotherapeutic trials cannot employ a crossover design. The limitations of a parallel-group design, in particular higher between-subject data variance, had to be overcome by developing other design features to account for the missing "true placebo" in psychotherapy, predominantly "waiting list control" (WLC), and "treatment as usual" (TAU).

#### Waiting List Control

The fact that no "true placebo" is applicable in psychotherapy RCT does not imply that no placebo effect occurs, as discussed above in the case of CER: when psychotherapy is compared with another therapy, the placebo effect is not controlled for and can therefore no longer be quantified. It can, however, be assumed that some of the alternative control strategies in psychotherapy research also have enhancing placebo effects. This may specifically be true of WLC.

Like crossover studies, WLC reduces data variance on account of a lower within-subject than between-subject variability of data; in this case, however, all patients would have to wait (which is usually not the case). It is also argued that WLC may additionally serve as a control for spontaneous variation of symptoms, a condition that cannot be readily tested with any RCT: it is ethically questionable as to whether a "no treatment control" is acceptable unless the disease is of minor severity and no effective therapy is available. This is the most rigorous interpretation of the current position of the Declaration of Helsinki (72).

In three-arm trials (active, placebo, and no treatment), between 25% and 45% of the treatment effect, of either drug or psychotherapy, can be attributed to spontaneous variation (73), with highest effects in nausea (45%), smoking cessation (40%), depression (35%), phobia (34%), and acute pain (25%). The authors concluded that most of the placebo effects in these conditions are attributable to spontaneous variation of symptoms. However, Kirsch and Sapirstein concluded in their initial paper (74) that 25% of the improvement observed in the drug-treated group (for depression) was due to the active medication, 25% to natural history, and 50% to the placebo effect.
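
The kind of decomposition reported in Ref. (74) can be illustrated with a short sketch. It assumes, as a simplification that the source does not spell out, that the three components are additive: the no-treatment arm estimates natural history, the placebo arm adds the placebo effect, and the active arm adds the drug effect. The improvement scores are invented, and `decompose_improvement` is a hypothetical name.

```python
def decompose_improvement(active: float, placebo: float, no_treatment: float) -> dict:
    """Split the active-arm improvement into three additive shares.

    ASSUMPTION: the components add up, i.e.
    active = natural history + placebo effect + drug effect.
    """
    if active <= 0:
        raise ValueError("improvement in the active arm must be positive")
    return {
        "drug": (active - placebo) / active,
        "placebo": (placebo - no_treatment) / active,
        "natural_history": no_treatment / active,
    }


if __name__ == "__main__":
    # Hypothetical mean improvements on a symptom scale in a three-arm trial.
    shares = decompose_improvement(active=12.0, placebo=9.0, no_treatment=3.0)
    for component, share in shares.items():
        print(f"{component}: {share:.0%}")
```

With these invented numbers the shares come out as 25% drug, 50% placebo, and 25% natural history, mirroring the proportions quoted above. A two-arm trial without a no-treatment group cannot separate the last two components.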

The use of WLC as an indicator of spontaneous variation is therefore misleading (patients are not naive but are promised effective treatment), and it would be more appropriate to install one of the novel designs that separate recruitment for a disease-monitoring study from recruitment for an intervention study, called the Zelen design (75) or multiple cohort RCT (MCRCT) design (76) [for more details, see Refs. (77) and (78)].

Since WLC promise patients effective treatment in the future, they may produce strong expectancy effects, probably enhancing the placebo response, even in the phase before the treatment actually commences: symptom improvement during waiting has been reported (79, 80), similar to effects of run-in periods in drug trials (81), and cannot be taken solely as indicative of spontaneous remission. Placebo-controlled trials are superior to WLC trials and induce greater symptom reduction (82), as do RCT with a "no treatment" condition in comparison with WLC trials (70). Furthermore, this effect may rely on the duration of waiting, and standard rules for this have yet to be investigated. One way of doing so would be to install a "step-wedge approach" (83), where randomization between different waiting groups (periods of different length) is used to test a dose-response function of waiting and the point at which positive expectations (placebo effects) may turn into disappointment (nocebo effects) (70) and increased dropout rates (69).

#### Treatment as Usual

If being randomized to a WLC can induce hope (placebo) or disappointment (nocebo) depending on its length, being randomized to TAU, by dint of its name, is already suggestive of its nocebo effect, the implicit message being that "you get what everybody else gets with this disease, and this treatment is unsatisfactory; that is why we are testing the new one, to which you, unfortunately, have not been randomized." Unless this is a treatment-naive patient with a very short disease history [which is indicative of high placebo response rates in many conditions and with many therapies (12)], this also reminds him or her of previous unsatisfactory or unsuccessful therapies, which—as we know—contributes significantly to the efficacy of any novel therapies (84).

Being randomized to the active treatment arm rather than to TAU will therefore enhance the placebo effect by enabling patients to compare the ongoing therapy with (all) previously inefficient therapies. For this reason, they prefer to participate in the novel approach; being randomized to TAU is almost a verdict. A TAU approach therefore enhances the placebo effect in the active arm and induces nocebo effects in the control arm.

The only thing that we have learned for sure from placebo/nocebo research over the past 10 years is that words can be painful (85) and can induce nocebo effects (86, 87). TAU definitely hurts, and it would be better rephrased as "the best available and approved treatment" (BAAT) when compared with a novel approach, but this would not work without having a number of logistic repercussions for trial designs.

Firstly, the utmost standardization of the TAU/BAAT treatment used for control purposes would be required. However, this is usually not carried out in psychotherapy RCT involving TAU. It is particularly complicated when patients are recruited from different clinical settings for treatment in a specialized center, but TAU is provided by the transferring therapist. It also generates a further methodological issue with regard to the selection of BAAT, when more than one is available on the market, in the region, or under prevailing restrictions, e.g., health insurance plans. It should be noted that control conditions, e.g., optimized treatment as usual (TAU-O) in psychotherapy trials [see for instance Refs. (88, 89)], face additional challenges, depending on the health care environment in which they were conducted. In Germany, by way of example, patients with a psychiatric or psychosomatic illness, e.g., anorexia nervosa, have access to inpatient, day-patient, and outpatient psychotherapy treatment. If these patients are randomized to the TAU-O arm, they have a choice with regard to a) the treatment setting, b) the treatment method (e.g., CBT or psychodynamic psychotherapy), c) the therapist, and d) the intensity or dosage of therapy. It is therefore also particularly important to discuss findings of studies against the background of the health care system in which they were conducted. These challenges are somewhat similar in CER, where divergent interests (drug companies, ethics boards, and patient representatives) may nominate different options as BAAT (90). Given the large number of different psychotherapeutic approaches to one disease, this may be impossible to achieve in RCT but perhaps in meta-analyses of RCT (91).

Finally, if one novel treatment A is compared with the best (or one of the best) treatment B, statistics cannot rely on A's superiority over B but should test A's non-inferiority in comparison with B, with the consequence that as many as four times the number of patients could be required to confirm this (92). Not to mention the fact that this generates an ethical paradox since, according to the Declaration of Helsinki, the smallest possible number of patients should be recruited for a trial, while all others should receive regular and adequate treatment (93).
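
Where a fourfold sample size can come from is shown in the following sketch, which uses the textbook normal-approximation formula for comparing two means. It is one plausible reading rather than the calculation in Ref. (92): the assumption that the non-inferiority margin is set to half the difference a superiority trial would target, as well as the effect size, alpha, and power values, are ours. Because the required sample size scales with the inverse square of the difference, halving it quadruples the number of patients.

```python
from statistics import NormalDist


def n_per_arm(difference: float, sd: float,
              alpha_one_sided: float = 0.025, power: float = 0.80) -> float:
    """Normal-approximation sample size per arm for comparing two means.

    For a superiority trial, `difference` is the expected treatment effect.
    For a non-inferiority trial (true difference assumed to be zero),
    `difference` is the non-inferiority margin.
    """
    if difference <= 0 or sd <= 0:
        raise ValueError("difference and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_beta) * sd / difference) ** 2


if __name__ == "__main__":
    # Hypothetical values: superiority over placebo targets 0.5 SD;
    # the non-inferiority margin against treatment B is set to half of that.
    n_superiority = n_per_arm(difference=0.5, sd=1.0)
    n_non_inferiority = n_per_arm(difference=0.25, sd=1.0)
    print(f"superiority: {n_superiority:.1f} per arm")
    print(f"non-inferiority: {n_non_inferiority:.1f} per arm")
    print(f"ratio: {n_non_inferiority / n_superiority:.1f}")
```

A narrower margin would push the ratio above four, and a wider one below it, so the choice of margin largely determines the cost of the comparison.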

Among the many design alternatives that have been developed to either explore and maximize the placebo response or to avoid or minimize it in drug RCT (3), the so-called preference design may indicate an alternative approach specifically relevant for psychotherapy (78). In short, patients can choose between two (or more) alternative therapies, e.g., drug or psychotherapy, and are assigned accordingly (94). Only those who have no clear preference will undergo randomization. The role of patients' preference (and its placebo effect) can be assessed *post hoc*, comparing those with a preference for the one therapy with those randomized to this therapy in each therapy arm.
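
The allocation logic of such a preference design can be sketched in a few lines. This is an illustrative outline only: the arm labels, the patient list, and the function name `allocate` are hypothetical, and real trials would use a concealed, pre-generated randomization list rather than an in-process random number generator.

```python
import random
from typing import List, Optional, Sequence, Tuple


def allocate(preference: Optional[str], arms: Sequence[str],
             rng: random.Random) -> Tuple[str, bool]:
    """Return (arm, was_randomized) for one patient in a preference design."""
    if preference is not None:
        if preference not in arms:
            raise ValueError(f"unknown arm: {preference}")
        return preference, False  # patient receives the preferred therapy
    return rng.choice(list(arms)), True  # no preference: randomize


ARMS = ("drug", "psychotherapy")  # hypothetical arm labels

if __name__ == "__main__":
    rng = random.Random(2019)  # fixed seed so the sketch is reproducible
    preferences: List[Optional[str]] = ["drug", None, "psychotherapy", None, None]
    for patient_id, preference in enumerate(preferences, start=1):
        arm, randomized = allocate(preference, ARMS, rng)
        group = "randomized" if randomized else "chose"
        print(f"patient {patient_id}: {arm} ({group})")
```

Recording the `was_randomized` flag alongside the arm yields the four subgroups (chosen versus randomized, within each therapy) that the post hoc comparison requires.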

### SPECIFIC CONTROL PROBLEMS WITH SPECIFIC PSYCHOTHERAPY MODALITIES

Beyond these general problems of control conditions with global placebo and nocebo effects in psychotherapy RCT, specific psychotherapies generate specific problems related to control and adequate estimation of the placebo/nocebo effects. Much of what has recently been described as a decisional framework for neurocognitive and behavioral intervention (60) also applies to most other therapeutic options.

The subsequent review of different psychotherapies that we discuss bears some arbitrary selection bias and may reflect a more traditional vision of the spectrum of psychotherapies available. However, the intention is to illustrate rather than to cover the variability of problems associated with specific psychotherapy modalities. Readers who feel neglected or overlooked are welcome to consider and outline the specifics of their own modus operandi in light of what has been discussed.

#### Psychodynamic Psychotherapy

As already observed by Frank (13), it is essential for any control strategy attempting to capture the placebo effect in psychotherapy RCT that it devotes the same interaction (time, number of contacts, and intensity of communication) between the patient and the therapist as is the case in the "active arm" of the therapy. He proposed the use of relaxation therapy as a control for psychodynamic therapy (PDT), but it may equally well be any other passive but interactive therapy. It should be borne in mind that, in most such cases, the control condition does not provide a clear measure of the placebo effect but simply another effective therapy. This will increase the pressure to demonstrate superiority of PDT over control while increasing the placebo effect in both arms. Recent approaches (52, 88) applied standardized diagnostic systems (e.g., operationalized psychodynamic diagnosis) to identify clear treatment foci on the basis of a psychodynamic approach.

#### Cognitive Behavioral Therapy

Unlike psychodynamic psychotherapy (PDT), CBT approaches intrapersonal problems at an individualized rational (cognitive) or behavioral level on the basis of an extensive prior behavioral analysis. To adequately control for nonspecific effects, it is feasible that written information on putative cognitive and behavioral strategies that are independent of the patient's own history may provide a control strategy. Although this lacks the actual behavioral analysis that precedes the active part of the therapy, it is, nevertheless, part of it. Behavioral exercises and tasks, which may mimic some of the effects occurring, bear the risk of errors if not adequately structured to the patient's pathology and therefore require careful monitoring.

#### Interpersonal Psychotherapy

Interpersonal psychotherapy, like CBT, is based on a very intimate knowledge and guidance of the patient's acute problems, and problem solving may therefore not tolerate "sham" interventions without becoming evident. Keeping an (electronic) diary may be a method of monitoring one's own problems in the absence of a therapist (95). MBT, self-aid programs, and educational programs may also provide a lower-level control for the attention received.

#### Mindfulness-Based Therapy

An increasing number of therapy studies have demonstrated the efficacy of MBT in somatoform disorders such as IBS. Meditation-based therapy is difficult to control for nonspecific effects. In some trials, validated self-aid programs are used for attention control (96) or "sham mindfulness meditation" (97).

#### Couple and Family Therapy

The nonspecific contribution of "proxies" toward therapy efficacy has been well established for children in medical therapy (98) but has rarely been assessed in adults (99). Experimentally, the placebo and the nocebo effects in both groups are affected by social models, be they peers, parents, or strangers (100, 101). The control strategies therefore become even more difficult when the "proxies" are part of the intervention, as is the case in couple and family therapy. Merrilees et al. (95) used event-contingent diaries about marital conflict situations to change marital interactions as a control strategy rather than conventional face-to-face family psychotherapy sessions.

#### Group Psychotherapy

The problem of "proxies" and "others" for the therapy progress and success of individual patients becomes even more virulent with group psychotherapy. One control strategy would be to run two (or more) groups in parallel, with all participants truly randomized to one of the groups, and to compare the group as well as the individual progress between the two. In addition, group processes could be monitored by applying group-specific outcome measures. A further control strategy could consist of using eHealth applications such as chatrooms, focus groups, self-aid guides, and blogs as controls (102).

#### Hypnotherapy

Nonspecific effects of hypnotherapy, whether general or in a disease-specific form such as gut-directed hypnotherapy (103), are probably best and most readily controlled by relaxation exercises and therapy, since these are similar with respect to the time spent (in a group setting as well as in individual therapy) and active/passive components. A comparison with mindfulness meditation (see previously), while perhaps advisable, has not yet been conducted.

#### Self-Help Programs

SHPs were initially developed as a control condition for more manualized therapies, especially in patients with somatoform disorders such as IBS (104). As they developed their own theoretical framework, and for economic reasons (providing professional help to more patients outside academic centers), many applications are now available, particularly in combination with web-based approaches (105).

#### E-Mental Health Approaches

The very recent development of phone- and internet-based therapies has spread across all psychotherapy modalities, from CBT to MBT and SHP, e.g., Refs. (106) and (107). Among the most widely used applications is Deprexis®, an internet-based CBT program for the treatment of depression (108, 109). Due to its high standardization, it can easily allow for the control of the effect of a variety of nonspecific factors such as age, sex/gender, race, and other therapist-based demographics, for style of communication (personalized versus neutral), for intensity of communication, e.g., with or without question and answer, feedback, chatroom activity, etc. By contrast, of the many smartphone health applications presently available (now numbering over 300,000), those with a psychotherapeutic approach still lack clear control strategies that would enable us to estimate the overall efficacy of their placebo effect (110). Using "virtual" doctors or therapists (111) in the future may enable us to exert a much better control of the nonspecific factors not only in psychotherapy but also in medical therapy in general (21, 112).

### NONSPECIFIC SOLUTIONS FOR THE CONTROL OF THE SPECIFIC PLACEBO EFFECT IN PSYCHOTHERAPY

In a recent paper (60), we proposed a dynamic decision framework for choosing a control condition depending on the patient population and associated risks, i.e., the risk of the disease itself, placebo vs. nocebo responses in this population, and the armamentarium of available therapies with known efficacy for this patient group, as well as the trial stage. We argued that the choice of control group and its justification need to be taken into consideration, e.g., when comparing behavioral and pharmacological therapies. High participation risk studies should therefore choose among controls with high effect sizes favoring treatment (e.g., waiting list and TAU) that may require smaller sample sizes, while low-risk studies may opt for active comparators and minimal treatment control conditions [see Figures 1 and 2 in Ref. (60)].

Wampold characterizes three global strategies resulting from the need to control for nonspecific effects of psychotherapy: a) identifying single components of the psychotherapy under investigation and replacing them by components of another psychotherapy (tradition); b) dismantling, without replacing, one or more components of a specific psychotherapy; and c) using treatments that control for common factors such as education and counseling (16).

None of these strategies has produced convincing results when it comes to adequately controlling the placebo effect in psychotherapy, and in the worst case they have shown that the control therapies may be as effective as the therapy under investigation, e.g., Ref. (113). Another novel control strategy, known as "befriending" (114, 115), refers to professional (nurse-conducted) social contacts developed for patients with schizophrenia in the community (116). It may, however, fall into the same trap as others before it, in demonstrating that even the mildest form of patient–therapist communication can result in significant therapeutic effects (117) and may therefore be a specific control only for the specific group of patients for which it was developed.

The "Goldilocks placebo effect" (118) exploits something that has rarely been tested and compared in psychotherapy research, i.e., the provision of alternatives from which the patient may choose. Preference designs (94) allow patients to choose between alternative treatments when available (e.g., drug vs. psychotherapy, different psychotherapy options) prior to randomization. They also allow comparison of the efficacy in patients who preferred one treatment arm with patients who were randomized to this arm of the study. The role of preferences can also be included in the overall statistics when comparing both treatment effects (78). Although the use of preferences does not seem to affect the overall internal and external validity of trials (94) and the preferences themselves do not appear to play a role in the placebo response (119), the systematic evaluation of placebo data beyond acupuncture has not yet been carried out.

By way of comparison (antidepressant versus CBT) of treatment outcome (treatment–control difference, not of the placebo response) in patients with depression, patients in either arm who selected this treatment were found to respond better than those who were randomized to the same arm (120). The difference was even greater in CBT trials and was independent of depression severity and dropout rates. In a trial in patients with chronic widespread pain, participants could choose between four options (CBT, exercise, a combination of both, or TAU), and the treatment preference had no effect on treatment outcome, while improvement expectations did (121). Neither of the studies elaborated on the placebo effect size under preference–choice conditions.

The "Goldilocks principle," which refers to Goldilocks' quote about her preferred porridge temperature as being "just right" in the popular fairytale "Goldilocks and the Three Bears" (Robert Southey 1837), has found many applications in science. The Goldilocks placebo effect study (118) takes things a step further and asks whether it is the number of options available rather than the option to select per se that determines the placebo effect, and whether the effect is larger when this number is "just right" than when there are too many or too few options to choose from. While their example is taken from a choice between different alternative medicine remedy treatments (2, 12, or 38 Bach flower essences, where the middle option received the highest rating as well as the highest symptom improvement report), the principle may also apply to lower numbers of choice options. Whether it can explain the divergent results of the two preference studies cited previously (120, 121) remains open at this point, and more data are required before this principle can be applied to placebo effects in psychotherapy.

### SOME SPECIFICS OF PSYCHOTHERAPY AND ITS EVALUATION

Unlike drug therapy, where drug–drug interactions appear to be a pharmaceutical problem only, and the placebo effects above (or below) both are identical, psychotherapy is often combined with drug therapy, although their interactions have rarely been investigated. An RCT with a combination of both cannot, therefore, easily answer the question as to the size of the placebo effect and the relative contribution of either component to it. In depression, for example, psychotherapy in combination with drug therapy (122) and vice versa (123) is more effective than psychotherapy or drug therapy alone. The question, however, remains unanswered. Direct comparison of drug and psychotherapy has shown no superiority of drug over psychotherapy (124), and assuming similar effect sizes of the placebo component under both conditions would not change this relationship. However, they cannot be taken as merely additive, and the superiority of the combination may also be a function of the individual patient's preference (120). Unless an RCT is conducted and evaluated that provides either drug therapy (or placebo) alone, or psychotherapy (or an appropriate control) alone, or both in any combination, we will presumably not be able to answer this question with sufficient accuracy. Such a study resembles a double-dummy design (125), and it can be combined with a "no treatment" control group, e.g., in a register trial (76). The same holds true of other combinations of psychotherapy, such as those with neuro-modulatory therapies and biofeedback approaches (126).

Finally, one issue that requires specific consideration is the fact that, in psychotherapy, the outcome measure is usually, if not invariably, a measure of subjective PRO or expert-rated outcome, whereas in drug therapy, efficacy can often be measured with both PRO/expert-rated outcome and with biomarkers, or at least with the circulating or tissue-specific level of the applied drug. Since PRO are more susceptible to placebo response than biomarkers (127), approval authorities and expert boards usually require both as endpoints in RCT. It would therefore be advantageous if psychotherapy research were able to develop the equivalent of a biomarker as an indicator of therapy success and as an adjunct measure of the size of the placebo response in psychotherapy in the future.

### SUMMARY

Overestimation of the efficacy of interventions, not only in psychotherapy, is common to all medical subspecialties, as is the effort to minimize nonspecific treatment effects, among which the placebo effect has the poorest reputation. Unfortunately, as we have already shown previously, psychotherapy lacks a true placebo intervention, and some of the nonspecific effects in drug therapy, such as the empathy of the therapist and the quality of the patient–therapist communication, become quite specific effects in psychotherapy. On the other hand, many of the common control strategies in psychotherapy research, especially waiting list and TAU, tend to inflate these nonspecific effects at the expense of already reduced overall efficacy, be it specific for individual psychotherapy modalities or for a "common effect" of all of them.

Under these circumstances, the scientific community of placebo researchers should not seek exemption from the scientific rules of treatment evaluation that have been developed for drug therapy but rather seek specific strategies to control at least some of the elements of psychotherapy that are responsible for the placebo response. These strategies can either be specific to certain psychotherapy modalities (as we have discussed) or further develop common strategies for all, bearing the possibility of covering at least some of Grünbaum's "incidental constituents" without attempting to identify and enumerate them all. For instance, changing the waiting-list control into a "step-wedge" design, evaluating the cohort multiple RCT design, or developing the preference design to a Goldilocks approach for psychotherapy are empirical ways of proceeding (78, 114, 115) and much more appropriate than reasoning why it is impossible to control the placebo effect in psychotherapy or remonstrating that all of psychotherapy is only placebo.

Last but not least: Biomedicine has learned to accept that placebo/nocebo effects exist outside placebo-controlled trials in daily medicine, and they contribute to a large extent to the success or failure of patient treatment, sometimes even more so than the drugs available. It is now time for psychotherapists to accept them in their daily practice.



### AUTHOR CONTRIBUTIONS

PE had the idea for the paper and conceptualized it. PE and SZ wrote the paper.

### FUNDING

Supported by the German-Norwegian Günther Jantschek Research Stipend (PE).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Enck and Zipfel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebo and Nocebo Effects Across Symptoms: From Pain to Fatigue, Dyspnea, Nausea, and Itch

*Fabian Wolters1,2\*, Kaya J. Peerdeman1,2 and Andrea W.M. Evers1,2,3*

*1 Health, Medical and Neuropsychology Unit, Institute of Psychology, Faculty of Social and Behavioral Sciences, Leiden University, Leiden, Netherlands, 2 Leiden Institute for Brain and Cognition, Leiden University, Leiden, Netherlands, 3 Department of Psychiatry, Leiden University Medical Center, Leiden, Netherlands*

Placebo and nocebo effects are, respectively, the helpful and harmful treatment effects that do not arise from active treatment components. These effects have thus far been researched most often in pain. It is not yet clear to what extent these findings from pain can be generalized to other somatic symptoms. This review investigates placebo and nocebo effects in four other highly prevalent symptoms: dyspnea, fatigue, nausea, and itch. The role of learning mechanisms (verbal suggestions, conditioning) in placebo and nocebo effects on various outcomes (self-reported, behavioral, and physiological) of these different somatic symptoms is explored. A search of experimental studies indicated that, as in pain, the combination of verbal suggestion and conditioning is generally more effective than suggestion alone for evoking placebo and nocebo effects. However, conditioning appears more and verbal suggestions less relevant in symptoms other than pain, with the exception of placebo effects on fatigue and nocebo effects on itch. Physiological measures, such as heart rate, lung function, or gastric activity, are rarely affected even when self-reported symptoms are. Neurobiological correlates are rarely investigated, and few commonalities appear across symptoms. Expectations generally predict placebo and nocebo effects for dyspnea and itch but seem less involved in fatigue and nausea. Individual characteristics do not consistently predict placebo or nocebo effects across symptoms or studies. In sum, many conclusions deriving from placebo and nocebo pain studies do appear to apply to other somatic symptoms, but a number of important differences exist. Understanding what type of learning mechanisms for which symptom are most likely to trigger placebo and nocebo effects is crucial for generalizing knowledge for research and therapies across symptoms and can help clinicians to optimize placebo effects in practice.

Keywords: placebo and nocebo effects, suggestion, conditioning, fatigue, dyspnea, nausea, itch, pain

#### *Edited by:*

*Seetal Dodd, Barwon Health, Australia*

#### *Reviewed by:*

*Anne-Kathrin Bräscher, Johannes Gutenberg University Mainz, Germany*

*Victor Chavarria, Parc Sanitari Sant Joan de Déu, Spain*

*\*Correspondence: Fabian Wolters f.wolters@fsw.leidenuniv.nl*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 11 March 2019 Accepted: 13 June 2019 Published: 02 July 2019*

#### *Citation:*

*Wolters F, Peerdeman KJ and Evers AWM (2019) Placebo and Nocebo Effects Across Symptoms: From Pain to Fatigue, Dyspnea, Nausea and Itch. Front. Psychiatry 10:470. doi: 10.3389/fpsyt.2019.00470*

## INTRODUCTION

The placebo effect, the positive treatment outcomes that cannot be ascribed to active treatment components, has evolved from a nuisance in clinical trials to a phenomenon worth studying in its own right. Placebo effects can influence clinical outcomes in a meaningful way (1) and, under optimal conditions, achieve a large magnitude (2, 3). Moreover, placebo effects occur not just when a placebo is given, but can potentially enhance any active treatment that a patient receives (1, 4). Medical outcomes can be further influenced by the nocebo effect, where, instead of the positive effect in the case of placebo, harmful treatment side effects are evoked or increased, or positive treatment effects are reduced (5, 6).

Most of what we know about placebo and nocebo effects (their magnitude, their working mechanisms, and their physiological and neurological correlates) comes from the study of these effects in pain. There are good reasons for this, as pain is well studied, is the most commonly reported somatic symptom (7), and can greatly influence quality of life (8). Pain also has the advantage that it is relatively easy to manipulate and control in laboratory settings: it can be tuned "up" and "down" by exposing the participant to different levels of a noxious stimulus such as heat, cold, or pressure. By contrast, other somatic sensations generally take more time to evoke (e.g., fatigue) or tend to last for a time even after the stimulus is removed (e.g., itch and nausea). This has allowed a strong research tradition of studies on placebo analgesia and nocebo hyperalgesia to emerge in the last century, using the benefit of cumulative findings in comparable research settings to thoroughly investigate the underlying mechanisms of these effects.

Placebo and nocebo effects play a role not just in pain, but in a wide range of conditions and symptoms. The available research indicates that the underlying mechanisms for these effects might differ per symptom (9). Accordingly, similar procedures might lead to very different results when symptoms are very different, such as pain and hormone levels (10), or to more comparable results when symptoms are more alike, such as pain and itch (11). Symptoms can differ on aspects such as conscious accessibility, the amount of cognitive control one can exert over them, what physiological systems they are connected to, and the related conditions and possible pathophysiological pathways [see, e.g., Ref. (12) for a comparison of itch and pain]. All of these factors can influence a symptom's susceptibility to placebo and nocebo effects or to the learning mechanisms that cause them. The dominant position of pain in placebo and nocebo studies might give the impression that placebo and nocebo effects are only impactful for pain, or that they operate in other symptoms exactly as they do in pain. More importantly, knowing which findings generalize from pain to other symptoms could lead to more effective use of placebo and nocebo effects in both research and clinical practice.

While placebo and nocebo studies of symptoms other than pain are not as plentiful, some lines of research have a long history—for example, the placebo effect was studied in weightlifters and asthmatics in the early 1970s (13, 14). However, as in pain, these studies tend to focus only on a single symptom; there is very little comparative work that examines the similarities and differences between placebo and nocebo effects on pain and these other symptoms. The current review aims to help fill that gap. To facilitate the comparison with pain, we will focus on symptoms that share the features of being subjective, somatic, and commonly reported in the general population (7, 15–17) and that have been studied in the area of placebo and nocebo effects: fatigue, dyspnea, nausea, and itch. We will focus primarily on whether the learning mechanisms that have been established for pain function similarly for these other symptoms, with the status of research for each symptom separately being featured in the discussion. The focus will be on verbal suggestion and conditioning, as other learning mechanisms (such as observational learning) are rarely investigated in the included symptoms [although see Ref. (18) for an exception]. We will first see whether these learning mechanisms are similarly effective at inducing placebo and nocebo effects on fatigue, dyspnea, nausea, and itch as they are at affecting pain. After discussing these results, we will compare the selected symptoms with pain in terms of possible underlying mechanisms, specifically the role of expectations, and individual predictors of placebo and nocebo responses.

### SEARCH STRATEGY AND SELECTION CRITERIA

We searched the scientific literature for experimental research on placebo and nocebo effects on subjective, somatic, and commonly reported symptoms other than pain. The included symptoms (and related search terms) were as follows: fatigue (mental fatigue, muscle fatigue), itch (pruritus, antipruritic), nausea (motion sickness, emetic, antiemetic), dizziness (vertigo, fainting), and dyspnea (asthma). These terms were entered into the databases PubMed, PsycINFO, and Web of Science in combination with search terms for placebo and nocebo effects (placebo effect, placebo effects, nocebo, conditioning, operant conditioning, classical conditioning, verbal suggestion). Only those studies were included that either mentioned at least one of the included symptoms in the suggestion given to participants or included at least one of these symptoms as a self-reported outcome after a learning procedure featuring verbal suggestion or conditioning. Both studies that included healthy participants and those drawing participants from clinical samples were included. Only experimental laboratory studies were considered, since there are many possible reasons for symptom change in clinical trials and it is unclear whether the change in the placebo group (placebo *response*) is actually due to the placebo (placebo *effect*) or due to other factors such as natural history [see, e.g., Ref. (4)]. This process resulted in no relevant studies for dizziness, and thus this symptom was not further considered. Further studies were added by examining included studies for references and upon expert recommendation.
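The way symptom terms were combined with placebo and nocebo terms can be illustrated with a minimal sketch. The term lists are taken from the paragraph above; the Boolean syntax (quoted phrases, `OR` within a group, `AND` between groups) and the helper names `or_block` and `build_query` are assumptions made for illustration, since the exact query strings submitted to each database are not reported here.

```python
# Symptom search terms as listed in the text. Using the symptom name itself
# as a search term is an assumption.
SYMPTOM_TERMS = {
    "fatigue": ["fatigue", "mental fatigue", "muscle fatigue"],
    "itch": ["itch", "pruritus", "antipruritic"],
    "nausea": ["nausea", "motion sickness", "emetic", "antiemetic"],
    "dizziness": ["dizziness", "vertigo", "fainting"],
    "dyspnea": ["dyspnea", "asthma"],
}

# Placebo and nocebo search terms as listed in the text.
PLACEBO_TERMS = [
    "placebo effect", "placebo effects", "nocebo", "conditioning",
    "operant conditioning", "classical conditioning", "verbal suggestion",
]


def or_block(terms):
    """Join terms into a parenthesised OR group, quoting each phrase."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(symptom):
    """Hypothetical helper: one symptom's terms AND the placebo/nocebo terms."""
    return f"{or_block(SYMPTOM_TERMS[symptom])} AND {or_block(PLACEBO_TERMS)}"


if __name__ == "__main__":
    # Print one illustrative query per symptom.
    for symptom in SYMPTOM_TERMS:
        print(build_query(symptom))
```

Each database has its own field tags and phrase-matching rules, so a string like this would need adapting before use in PubMed, PsycINFO, or Web of Science.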

To answer the question whether expectations play a role in placebo and nocebo effects in fatigue, dyspnea, nausea, and itch, those studies that have explicitly examined participants' expectations were considered. To examine the question of which traits identify the placebo responder, the studies included based on the aforementioned criteria were scanned for individual characteristics used in moderation analyses. Only variables measured through questionnaires and gender were identified; no studies investigating, e.g., genetic factors were found. A brief summary of the results of included studies can be found in **Table 1**, while a detailed overview of every study is available as **Supplemental Material**.

#### TABLE 1 | Overview of results of included studies.


*Positive results: studies where a significant effect was found in the direction matching the verbal suggestion or learning procedure.*

### PLACEBO EFFECTS

In the prototypical experimental placebo analgesia study, participants are exposed to a painful stimulus, and then receive an inert treatment (the placebo) that is suggested to be an analgesic. This method is easily converted for use in other symptoms by changing the noxious stimulus; for example, instead of applying heat to induce pain, participants cycle on an ergometer to induce fatigue or sit in a rotation chair to induce nausea. The placebo itself is also adaptable: for instance, instead of an analgesic cream, a cream can be described as antiallergenic or an inert inhaler can be described as a bronchodilator. Some studies do not feature a separate inert medication, but directly suggest a change in the method or substance that induces the noxious sensation. For example, electrical stimulation can be described as very likely or very unlikely to cause itch.

Placebo effects can be evoked by only the verbal suggestion of symptom relief, but also by letting participants experience the reduction in stimulus intensity through a conditioning procedure. Meta-analyses have shown that in experimental studies investigating placebo mechanisms, verbal suggestions alone are on average rather effective at evoking placebo analgesia (3, 19). This analgesic placebo effect tends to be further enhanced when verbal suggestion and conditioning are combined (19–21). Conditioning can also be used to evoke placebo effects by itself, without verbal suggestions (21–24), but this is less often investigated in pain and even rarer in most other symptoms we discuss. Other learning mechanisms, such as letting the participant observe the effect in another person, are also rarely examined. Therefore, we will discuss first the effect of only verbal suggestions, and then all studies using conditioning, either paired with verbal suggestion or used by itself. Within each of these categories, fatigue, dyspnea, nausea, and itch are handled in order.

#### PLACEBO EFFECTS EVOKED THROUGH VERBAL SUGGESTIONS

**Fatigue.** A substantial number of studies have investigated the effect of a verbal suggestion of reduced fatigue or increased performance (25–38). In seven of the 13 studies, participants report a lower sense of fatigue while performing a motor task in the placebo condition than in a control condition (25, 28–30, 32, 33, 35). The lack of an effect in the remaining studies might be due to other factors, such as small sample size [Refs. (26, 38); 9 and 10, respectively] or the generic wording of the suggestion [not specifically directed toward fatigue (31, 34) or suggesting a 50/50 chance of placebo (27)]. Across all of the studies, participants also perform better; all but one study (37) find that participants either produced more power or continued a set performance for longer (26–33, 36, 38). Physiological indications of effort, such as blood lactate or heart rate, are often measured, but are not affected in most studies (28, 29, 33, 34, 38), even when the study found effects on fatigue or performance. A decreased readiness potential using EEG during repeated finger movements might indicate that placebo effects on fatigue and performance are caused by a central action in the preparatory phase of movement (35). Thus, overall, it seems that verbal suggestions are effective at reducing experienced fatigue and improving performance, but this is often not accompanied by the expected changes on physiological measures.

**Dyspnea.** Six studies, all using asthmatic participants, have investigated the placebo effect induced by a verbal suggestion on dyspnea (39–44). Three of the five studies that examined self-reported dyspnea report a decrease in symptoms (39, 42, 43). Participants in these studies first received a suggestion of bronchoconstriction (39, 42) or were denied their normal asthma medication (42, 43) before being offered the placebo, which was suggested to improve their breathing. The two studies that report non-significant findings on self-reported dyspnea (40, 44) tested the placebo without first inducing breathing problems. It is likely that a reduction of dyspnea is only expected or possible when it is clearly present in the first place. None of the studies found an effect on measures of lung function (39, 40, 42, 44), except Kemeny and colleagues (41), who did not examine self-reported dyspnea but found an improvement on airway reactivity after a placebo induction. No behavioral or neurological measures were collected. The tentative conclusion from these limited findings is that an existing feeling of dyspnea can be reduced by a verbal suggestion, but likely without accompanying physiological changes [which are themselves not strongly correlated to subjective asthma symptoms; see, e.g., Ref. (45)].

**Nausea.** Eight studies have examined the effect of suggestion of reduced nausea (46–53). Three studies show placebo effects on nausea experienced during a nausea-inducing activity after verbal suggestions (47, 48, 51), although the effect was limited to women in one study (51) and to men in another (48). A possible reason for the non-significant findings in the other studies (46, 52, 53) is that in these studies, participants were not previously made familiar with the nausea-inducing task, possibly resulting in a low expectation of nausea that cannot be reduced further with placebo. Other studies have shown that suggestions of reduced nausea can reduce the disgust experienced when viewing disgusting stimuli (49, 50). With respect to behavioral outcomes, participants did not tolerate the nauseating stimulus for longer after a verbal suggestion of a ginger treatment, regardless of whether there was an effect on reported symptoms (51, 52). Similarly, no differences between the placebo and control groups were detected with an electrogastrogram in any study (46, 51, 52), except for one (53), where participants who received a suggestion of reduced nausea actually showed more abnormal gastric activity. The two functional magnetic resonance imaging (fMRI) studies (49, 50) indicate an effect of a placebo with the suggestion of nausea reduction, showing decreased activity in the insula (particularly the left) and increased connectivity between the dorsomedial prefrontal cortex and the amygdala. The latter finding is consistent with processes of cognitive reappraisal of aversive stimuli, while the insula is a region associated with disgust and pain perception as well as pain analgesia (54–56). The overall pattern indicates that a placebo effect on nausea after verbal suggestion is found only on self-reported measures in some subgroups and under specific conditions.

**Itch.** Six recent studies have examined the effect of suggestions on itch (11, 34, 57–60). In most studies, experienced itch was not successfully reduced (11, 34, 57, 59, 60), although it was in one study (58). All but one of the studies that did not find an effect either gave only the suggestion, without a separate placebo (11, 59), or used a nonspecific suggestion (34, 60). Regardless of self-reported itch, none of the studies reported an effect on a physiological measure, be it weal size (58, 59), flare (59), skin temperature (59), heart rate (34), or skin conductance (34). Taken together, producing a placebo effect on itch seems to require more than just a verbal suggestion, with effects appearing only with very convincing suggestions or under specific circumstances.

Overall, placebo effects from a verbal suggestion do not seem to be as generally effective in other symptoms as they are in pain. The many studies showing clear effects on self-reported fatigue that extend to performance measures echo results on placebo analgesia, where verbal suggestion alone seems to be effective at reducing pain (3, 19). However, dyspnea, nausea, and itch were not reduced after a verbal suggestion in many studies, and seem to require certain conditions (such as specific phrasing of the suggestion) to be effective. Physiological correlates such as heart rate, lung function, or weal size show little evidence of being affected for any symptom.

#### PLACEBO EFFECTS EVOKED THROUGH CONDITIONING

In a conditioning procedure in a placebo study, participants can personally experience the beneficial effect of the placebo. This is generally done by modifying the intensity of the presented noxious stimulus, such as lowering the heat of a heat pain stimulus when a placebo cream is applied. While conditioning and suggestion were in the past sometimes seen as competing explanations of placebo effects, more recent perspectives (61, 62) generally consider them complementary, both contributing to the expectations that then influence the experience of noxious sensations. Note that when a study involves conditioning, this is almost always classical conditioning; while some studies into operant conditioning in the context of placebo and nocebo effects exist (63, 64), they are as yet too rare to draw any general conclusions.

**Fatigue.** The combination of verbal suggestion and conditioning to produce a placebo effect on fatigue has been much less studied than the effect of verbal suggestion alone. Only two studies have adopted the method (31, 36). Both studies also include a group where only verbal suggestion was applied, allowing conclusions about the added effect of combining the methods. No effects on fatigue or perceived exhaustion were found in one study (31), while in the other (36), the self-report measure was only affected in the combined verbal suggestion and conditioning group, with no effect of suggestion alone. In both studies, participants in the combination condition showed larger effects on physical performance than those who only received a placebo and the suggestion. The study by Fiorio et al. (31), using transcranial magnetic stimulation, supports the idea that a placebo procedure influences a central mechanism (35), and extends these findings by suggesting that this results in rapid increases in excitability in the corticospinal system for the specific muscle involved. While low in number, these studies suggest greater placebo effects on fatigue and motor performance when verbal suggestion and conditioning were combined.

**Dyspnea.** We are not aware of studies using a design combining verbal suggestion and conditioning or using conditioning alone that investigated placebo effects on dyspnea.

**Nausea.** Two studies have combined verbal suggestion and conditioning to evoke a placebo effect on nausea (48, 65). Horing et al. (65) found that a combination of suggestion and conditioning was effective in reducing both self-reported nausea symptoms as well as behavioral consequences (how often participants could move their head during the nauseating task and how long they could tolerate the task). No results were found on electrogastrogram measures of digestive tract activity. The other study (48) similarly found that self-reported nausea was reduced after a procedure of suggestion and conditioning, but noted that suggestion seemed to be only effective for men, while conditioning was only effective for women. Both studies found an effect that was more elusive in studies that only used a verbal suggestion, indicating that conditioning may have some added value in reducing nausea, but further research will have to elucidate to what extent and for which groups this applies.

For nausea, there is also a line of research into using conditioning alone, without verbal suggestion, to induce placebo effects. These studies use two strategies. The first is overshadowing, where during a learning phase the nausea-inducing stimulus is associated with a very salient stimulus (e.g., a strong-tasting beverage) which is then not present at test (66–68). Because the nausea is associated with the salient stimulus, the absence of the stimulus may reduce nausea. The other is latent inhibition, where participants are exposed to the environment where the nausea is induced several times before the nausea induction (66, 69, 70). There, the fact that the environment is not just associated with nausea but also with previous neutral experiences will make it less nausea-inducing. These protocols have been used to reduce anticipatory nausea (66–69) and nocebo nausea (70). This seems to be generally effective (67–70), although the study implementing both interventions found no differences between the latent inhibition, overshadowing, combination, and control groups, and there were some indications that the latent inhibition intervention actually increased nausea (66). The one behavioral measure of rotation tolerance was not affected (69). Physiological results are mixed: some findings for heart rate correspond to self-report measures (68), but hormone measures either show no effect (67) or follow the unexpected effect of showing increased symptoms for latent inhibition (66). The results of the conditioning studies look promising in reducing self-reported anticipatory nausea and might stimulate other fields to continue to develop optimized conditioning procedures.

**Itch.** Only two studies specifically investigated the effect of verbal suggestion with conditioning on itch (57, 71). In both cases, the placebo consisted of an electrode, with the suggestion that it modified the intensity of the electrically-induced itch. The first study (57) found that only the combination of verbal suggestion and conditioning reduced self-reported itch, with either method individually not producing significant results. The second study (71) further indicated that this combination could also reverse nocebo effects on itch that had earlier been induced by a similar procedure. This reduced itch, however, did not result in reduced scratching behavior (72). While the evidence is limited, these studies suggest that the combination of verbal suggestion and conditioning is needed to successfully induce placebo effects on itch.

Overall, procedures that combine conditioning and verbal suggestion seem to more reliably induce a placebo effect on fatigue, nausea, and itch than those that use verbal suggestion alone. This aligns with results in pain (20, 21). It should be noted, however, that these results are based on a small number of studies, and more confirmatory work will still need to be done. One possible exception and example is the work in nausea, where a stronger tradition of studies has confirmed the utility of conditioning both with and without a verbal suggestion.

### NOCEBO EFFECTS

Whereas placebo effects involve the reduction of noxious symptoms, nocebo effects consist of evoking or enhancing these symptoms. The experimental setup of a nocebo study is generally much like a placebo study, where a noxious agent is applied, and the participant learns through a verbal suggestion or conditioning to experience it differently. Nocebo research is still limited because of its relative novelty and the ethical concerns involved. In pain, studies have shown that nocebo effects require fewer learning trials than placebo effects (73) and are resistant to extinction (74). This would suggest more robust findings for nocebo studies than placebo studies. It has also been suggested that nocebo effects on pain might be more reliably evoked with just a verbal suggestion than placebo effects, making the addition of conditioning less necessary [(20); see also Ref. (2)]. We studied whether these findings also apply to fatigue, dyspnea, nausea, and itch.

#### NOCEBO EFFECTS EVOKED THROUGH VERBAL SUGGESTIONS

**Fatigue.** Three studies have investigated a nocebo effect on fatigue from verbal suggestion alone (25, 75, 76). One study (25) found that the suggestion of a fatigue-inducing drink increased participants' rate of exhaustion, but did not decrease their performance or influence cardio-respiratory, muscle, and blood lactate measures, while another found the opposite result, with no effect on rate of exhaustion but a reduction in force output (76). The final study (75) did not find increased fatigue after a nonspecific suggestion of ultrasonic noise. This suggests that nocebo effects from verbal suggestion are possible for fatigue when the suggestion is specific enough, but more evidence is needed.

**Dyspnea.** Five studies, all using asthmatic participants, have investigated the nocebo effect of suggestion on dyspnea (39, 40, 42, 44, 77). The results are relatively equivocal: two studies found an effect on reported symptoms (39, 77) while two others found no effect (40, 44) and another only found an effect in a subgroup of highly nervous participants (42). Similarly mixed results are found for lung measures, with two studies finding an effect (44, 77), which was not confirmed in two other studies (40, 42). One study additionally found an effect on a measure of airway inflammation (40). The evidence for nocebo effects on dyspnea and related lung function measures arising from verbal suggestion alone is thus rather tenuous. The overall results are very mixed and no clear methodological trend seems to explain them.

**Nausea.** There are three studies examining nocebo effects from verbal suggestion on nausea (46, 75, 78). No studies found the hypothesized results, reporting no effect on experienced nausea (75, 78), an effect only in men on motion tolerance (78), and a reversed effect (46), where reported nausea and gastric tachyarrhythmia were actually lower for the nocebo group compared to control. The suggestions used in two of these studies were not optimal, however, referring either generally to effects of ultrasonic noise (75) or to a drug that would increase nausea but reduce other symptoms of motion sickness (46). These studies suggest that verbal suggestions are not effective for evoking or worsening nausea, but studies using a more specific or unilateral suggestion may prove to be more effective in the future.

**Itch.** Five studies have investigated nocebo effects from the suggestion of increased itch (11, 79–82). All of them indicate that self-reported itch worsened after the nocebo suggestion, although the results were not consistent for every measure in one study (82). Scratching duration was also increased in one study in the group that received a very negative compared to a more neutral suggestion (81), but this also applied to the group receiving no information, and the difference was only seen in patients. The results on self-reported measures seem to transfer to the associated skin reactions (80), although again not consistently for every measure in one study (82). Napadow and colleagues (79) also performed fMRI analyses, finding increased activity in the dorsolateral prefrontal cortex, caudate, and intraparietal sulcus associated with nocebo itch. These areas responded in similar ways to an actual allergen, suggesting this activation may be specific for itch and not applicable to nocebo effects more generally. The results of these studies indicate that nocebo suggestions can worsen experienced itch, while the evidence is more mixed for behavioral and physiological correlates.

Overall, the studies investigating the effect of only verbal suggestions to induce nocebo effects are less numerous than the corresponding placebo studies, and do not provide enough evidence for solid conclusions. Only in itch are consistent nocebo effects seen, which may be due to the unique qualities of that symptom: itch is known to arise even when it is just talked about or observed in others [contagious itch; see Ref. (83) for a review]. The results for itch seem to most resemble those in pain, where nocebo effects seem easier to evoke than placebo effects. For dyspnea and nausea, the results for placebo and nocebo studies are both mixed, and for fatigue the results for placebo are more consistent than for nocebo effects.

#### NOCEBO EFFECTS EVOKED THROUGH CONDITIONING

**Fatigue.** We are aware of only a single study that has examined nocebo effects on fatigue elicited by a combination of verbal suggestion and conditioning (76). The results indicate no effect on perceived exhaustion, though this might also be due to a training effect emerging throughout the repeated sessions of the experiment. The procedure did lead to an overall reduction in performance, although the effect was not larger in the condition combining verbal suggestion and conditioning than in the verbal suggestion alone condition.

**Dyspnea.** Eight studies have investigated a nocebo effect on dyspnea using conditioning. Two of them combine verbal suggestion and conditioning [Refs. (84–86); note that the latter two use the same dataset], and six rely on conditioning while offering either no suggestion or a similar suggestion in both conditions (87–92). Two studies (84, 87) offer the participant an inhaler and are thus clearly placebo studies, while the others expose participants to scented air *via* a special breathing apparatus but do not offer a physical treatment that would be universally recognized as a placebo. All studies show an increase in self-reported asthma symptoms after conditioning. However, in some studies this increase in self-reported symptoms applied only under certain conditions [i.e., when the conditioning procedure featured an unpleasant and not a pleasant scent (89, 91, 92)] or on some of the included measures [Ref. (87); only subjective airway obstruction was affected, and not feelings of dyspnea or hyperventilation]. Physiological measures of lung functioning and breathing were not affected in three studies (87, 88, 92) while five other studies found an effect on some of the included physiological outcomes (84, 85, 89–91). Many of these studies use healthy samples (84–86, 88–90, 92); their results do not differ systematically from those of studies where participants were asthmatics (84, 87) or psychosomatic patients (91). These studies together provide convincing evidence that a conditioning procedure can evoke self-reported symptoms of dyspnea, much more clearly than verbal suggestion alone, although the results remain somewhat inconsistent for physiological measures.

**Nausea.** Four studies have used either conditioning alone (78, 93) or the combination of verbal suggestion and conditioning (70, 94) to induce nocebo effects on nausea. In all cases, self-reported nausea was increased in participants, although one study (78) found the same gender pattern as in placebo nausea, where women responded more strongly than men to the conditioning procedure. The gender pattern also applied when considering how long participants could endure the nausea-inducing rotation. Another study using only conditioning (93) found that participants consumed less of a drink that was associated with the rotation, but no effect on tolerance of rotation and also no effect on two hormonal outcomes. The findings indicate that gender may be an important factor in placebo and nocebo nausea. The combination of verbal suggestion and conditioning seems to be quite effective at influencing nausea, and more effective than suggestion alone, although very limited evidence suggests this might not extend to physiological correlates of nausea.

**Itch.** The three available studies indicate that the combination of verbal suggestion and conditioning is effective for inducing nocebo effects on itch (57, 71, 95). All of the available studies find a nocebo effect on self-reported itch when using a procedure combining suggestion and conditioning, although one follow-up analysis (72) did not find consistent effects on scratching behavior. Moreover, the study by van de Sand et al. (95) used fMRI to investigate neural activity associated with nocebo itch, finding increased activity in the rolandic operculum and increased connectivity between the insula and periaqueductal grey in the nocebo condition. These results do not correspond to those found in the earlier fMRI study into nocebo itch (79), which did not use conditioning and only tested patients. Activity in the operculum is also found in fMRI nocebo hyperalgesia studies (96), but these results do not overlap with imaging studies for other symptoms. While no immediate pattern emerges from behavioral or physiological outcomes, self-reported itch is clearly influenced when conditioning and verbal suggestion are combined.

The low number of studies on nocebo effects that use conditioning makes it hard to draw a conclusion across symptoms that is not premature. The limited results available would suggest that nocebo effects on dyspnea and nausea are more robust after the combination of verbal suggestion and conditioning than they are after verbal suggestion alone. The combination procedure did not seem to lead to more robust effects in fatigue, where results are limited but appear weak, or in itch, where verbal suggestion alone already produced clear nocebo effects. In this sense, only the results for itch seem to echo those on pain, where it has also been suggested that nocebo effects can be evoked as easily with suggestion alone as with a combination of suggestion and conditioning (20).

#### THE MEDIATING ROLE OF EXPECTANCIES

A common theoretical view is that placebo and nocebo effects function by means of expectancies: you will feel less pain when you expect to (61, 97, 98). In pain, research has shown that the expectation of analgesia or hyperalgesia is indeed a contributing factor to placebo and nocebo effects [e.g., Refs. (99–100)]. Current theoretical perspectives generally consider verbal suggestion and conditioning complementary forces that together influence the expectations that in turn influence the experience of noxious sensations (61, 62).
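The mediation claim above (suggestion or conditioning shapes expectancy, which in turn shapes the symptom) is often tested with a product-of-coefficients decomposition. The sketch below is only an illustration of that logic under stated assumptions: the variable names and the eight data points are invented for the example and do not come from any of the reviewed studies, and a real analysis would add inferential tests (for instance, bootstrapped confidence intervals for the indirect effect).

```python
# Minimal mediation sketch with hypothetical data (not from the reviewed studies).
from statistics import fmean


def slope(x, y):
    """Ordinary least squares slope of y regressed on a single predictor x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def residuals(x, y):
    """Part of y that is left after removing its linear dependence on x."""
    mx, my, b = fmean(x), fmean(y), slope(x, y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]


def mediation(x, m, y):
    """Decompose the total effect of x on y into direct and indirect parts."""
    a = slope(x, m)                                      # x -> mediator
    b = slope(residuals(x, m), residuals(x, y))          # mediator -> y, holding x fixed
    total = slope(x, y)                                  # x -> y, ignoring the mediator
    direct = slope(residuals(m, x), residuals(m, y))     # x -> y, holding mediator fixed
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": total}


# Hypothetical values: 0 = control instruction, 1 = nocebo suggestion.
suggestion = [0, 0, 0, 0, 1, 1, 1, 1]
expectancy = [1, 2, 3, 2, 3, 4, 5, 4]   # assumed expected symptom intensity
symptom = [2, 3, 5, 4, 5, 7, 8, 7]      # assumed self-reported symptom rating

result = mediation(suggestion, expectancy, symptom)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this framing, a pattern such as "expectancies were affected but nausea was not" corresponds to a non-zero `a` path with a `b` path near zero, so the indirect effect vanishes even though the manipulation changed expectations.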

**Fatigue.** One study (25) found a strong relation between the expectation of increased or decreased fatigue and increases and decreases in performance. Other studies found an effect of the suggestion on participants' expectations, but no relationship between expectations and fatigue (34) or performance (37).

**Dyspnea.** De Peuter and colleagues (84) found that participants undergoing a nocebo procedure had higher expectations for asthma symptoms as well as higher asthma symptom ratings; however, expectations were only statistically related for asthmatics and not the whole sample. From another angle, a study that specifically tried not to instill any expectations in participants also found no effects on dyspnea (101).

**Nausea.** Four studies have directly investigated the role of expectations in placebo and nocebo effects on nausea (47, 51, 52, 70), but only one of them (47) found the hypothesized effect, with both expectations of nausea and self-reported nausea lower in the placebo than in the control group. The other studies show discrepancies; either expectancies were affected but nausea was not (52), nausea was affected but expectancies were not (51), or there was a relationship between expectations and nausea for nocebo effects but not for placebo effects (70).

**Itch.** The study by van Laarhoven and colleagues (11) showed a correlation between expected itch and nocebo-induced itch ratings. Another study found increased expectations but no corresponding effect on itch (34), while a third found a relation between positive expectations and reduced symptoms only in the experimental group (59). An investigation of the mechanisms behind placebo and nocebo effects on itch (102) found that placebo responders self-generated fewer itch expectations in a separate task, although corresponding results were not found for nocebo responders.

The available evidence points to differential effects of expectations for every symptom, with stronger evidence for expectations as a mediator in itch and some evidence in dyspnea, but more evidence against a mediating relationship in fatigue and nausea. Further research is needed to elucidate what underlying mechanisms might additionally play a role in placebo and nocebo effects in these symptoms, especially for nausea and fatigue where expectations might not play an important role. Other mechanisms such as attention and fear have been suggested [e.g., Refs. (103, 104)], but have only been investigated infrequently, especially outside of pain.

#### IDENTIFYING THE PLACEBO RESPONDER

The question whether it is possible to recognize the placebo responder is almost as old as the study of placebo effects itself (105), but consistent findings have been elusive (106). One possible reason for the lack of consistent findings could be that predictors are different for different symptoms. We therefore review the findings from the included studies for each of the discussed symptoms.

**Fatigue.** The only study that investigated outcomes on fatigue as well as individual characteristics (34) found no moderating effect of neuroticism, extraversion, positive or negative affectivity, depression, anxiety, catastrophizing, or body vigilance.

**Dyspnea.** Based on a fear learning model (107), it has been hypothesized that participants high in negative affectivity might respond more strongly to negative suggestions of impaired breathing. This has been examined in several studies, with three finding the expected relationship (42, 87, 88) but four others no relationship (41, 89, 91) or an effect only for one of six subjective breathing measures (84). Suggestibility has also been examined as a possible predictor, with one study finding a relationship (77) that was not confirmed in three other studies (39, 41, 42). Likewise, no effect was found in one study for positivity (41). A final study (91) found no effect for information seeking and a negative effect of a blunting behavioral style.

**Nausea.** Two studies investigated the relationship between a placebo effect and multiple individual characteristics. One found a larger placebo effect for participants with lower scores on general self-efficacy, internal locus of control, generalized self, and mobility of nervous processes (108), while the other found no effect of the same variables as well as no effect of anxiety or optimism (51). As mentioned before, in other studies an effect is seen for gender, with men showing larger effects on placebo and nocebo nausea after suggestion and women showing larger effects after conditioning (48, 78), although one study also found suggestion effective only in women (51). There is some indication that this effect may be due to the gender of the experimenter (52).

**Itch.** Several studies in itch have examined many individual characteristics, but found almost no effects for any of them, regardless of whether they found actual placebo or nocebo effects (11, 34, 57, 71). The variables investigated in these studies are theorized to relate to expectations and include neuroticism, extraversion, positive and negative affectivity, depression, anxiety, catastrophizing, body vigilance, optimism, hope, worrying, impulsivity, self-efficacy, general future expectations, suggestibility, and social desirability characteristics. The only one of these studies to find effects (57) did so only in the group where conditioning and verbal suggestion were combined, which was also the only condition that showed effects. Here, a greater placebo effect was associated with less hope, while greater nocebo effects were associated with less hope and extraversion and with more worrying and negative affect. Another study investigating fewer variables (109) found a positive relationship between a placebo effect and ego resiliency but none with neuroticism. Considering the many variables investigated across these studies, the few observed associations should be interpreted with care.

Taken together, these results suggest that individual characteristics do not consistently predict placebo or nocebo effects on fatigue, dyspnea, nausea, and itch. The search for predictors is inconvenienced by the fact that different studies tend to investigate different variables and many results still need to be replicated. It should also be noted that, compared to some other symptoms, the type of variables under consideration is rather narrow, being almost entirely limited to personality factors. Other placebo and nocebo studies have, for example, found indications of genetic predispositions (110) or neurochemical indicators (9). Since placebo effects seem to be determined by a variety of different factors (social, psychological, neurobiological, genetic), future studies may need to incorporate more sophisticated statistical methods to test the combined effect of several predictors at once in order to identify the placebo responder [for some recent examples in pain, see, e.g., Refs. (111, 112)].

### DISCUSSION

This review investigated to what extent findings from studies on placebo analgesia and nocebo hyperalgesia also apply to fatigue, dyspnea, nausea, and itch. Broad similarities can be observed in that placebo and nocebo effects are evoked for these symptoms in a large proportion of studies using similar methods. Some specific results also appear to be consistent: placebo effects are more likely after a procedure combining conditioning and verbal suggestion than verbal suggestion alone, and there is no clear evidence which individual characteristics predict who will respond to placebo and who will not. Other findings do not clearly confirm those in pain. We find little evidence that verbal suggestion alone can consistently evoke placebo and nocebo effects across symptoms, with the exception of placebo effects on fatigue and nocebo effects on itch. For dyspnea and nausea only, nocebo effects seem to be larger after a combination of verbal suggestion and conditioning than suggestion alone. There is some evidence for a mediating role of expectations in placebo and nocebo effects across symptoms, although to a lesser extent in nausea or fatigue. Altogether, it seems that placebo and nocebo studies on pain provide a reasonable starting point for predicting these effects in other sensations, but a number of differences caution against extrapolating every finding in pain to other symptoms.

Each of the sensations we discuss has been studied in a line of research separated from the others, each with its own strengths and opportunities for further inquiry. Studies that examine fatigue tend to come from the field of sport psychology, and therefore focus on improving athletic performance. This has led to placebo effects being investigated much more than nocebo effects. Performance is generally the primary outcome, with perceived exhaustion as just one extra variable. Participants in these studies are generally physically active individuals, sometimes even professional athletes. These factors obviously limit the generalizability of these findings to medical contexts, where fatigue is a large problem (113); it remains to be seen to what extent the findings apply to patients suffering from chronic or mental fatigue. In the context of improving performance, investigating mechanisms behind the effect is perhaps a secondary concern, and fewer studies examine the effect of conditioning or individual characteristics that predict the response. Several researchers have, however, started the work of performing tightly controlled experimental studies that offer more insight into the exact mechanisms of placebo and nocebo effects on fatigue [e.g., Refs. (31, 76)]. It would be fruitful to extend these toward the clinical domain, especially since two separate trials have already shown patients suffering from fatigue can benefit from a placebo intervention (114, 115).

In dyspnea, there is a strong clinical focus, since the background of many of these studies comes from the study and treatment of asthma and somatic symptom disorders. A large proportion of studies therefore also uses asthmatics as participants. This, in turn, also limits generalizability for some findings, albeit in another direction than in the case of fatigue. Since this field features older studies, it also shows more methodological limitations, such as low participant numbers, nonspecific suggestions, and unclear symptom induction methods that can easily be rectified in new studies. Studies that include both conditioning and verbal suggestion, allowing the effects to be compared, would also be a valuable addition. A perhaps bigger issue is that no research group seems to have focused specifically on placebo and nocebo effects in dyspnea. This has led to a lack of common methods and conceptualizations that would facilitate comparisons within the subdiscipline and to other subdisciplines, despite having one of the longest traditions of placebo research [reaching as far back as the study in Ref. (116)]. This might be helped by systematic review or meta-analysis, of the type that exists for other fields [e.g., Refs. (117, 118)].

A clear experimental tradition exists in placebo and nocebo effects on nausea. The field originated in the study of anticipatory nausea in chemotherapy (68, 119), so it has a clear clinical angle, even though later studies focus on healthy participants. The subdiscipline also sports two dedicated research groups and insightful reviews (118, 119). A strong theoretical foundation in conditioning has not only improved upon mixed initial results (46, 52) but has also led to the development of the latent inhibition and overshadowing paradigms that are, as of yet, rarely applied in the rest of the placebo literature. We echo earlier calls (120) that these results should be replicated and applied to other fields of placebo research. This subdiscipline also offers conclusions that deviate the most from findings in pain, with an increased importance of gender [(48, 51, 78); see also Ref. (121) for a recent nuanced overview] and indications of a reduced role of expectations (51, 52, 70). The latter finding is further confirmed by findings in clinical trials that report similarly inconsistent results when expectations are explicitly investigated (122, 123). Our findings do appear to fit with earlier speculation that the gustatory system may have a special capability for unconscious conditioning (124). Due to its connection with the digestive system, nausea may have more in common with symptoms like hormone levels that are more affected by unconscious conditioning than consciously accessible expectations (10). However, since pain is also affected by implicit conditioning (125), further comparisons are needed to resolve this question.

Most studies of placebo and nocebo effects on itch are comparatively recent. This has allowed the field to benefit from advances in other subdisciplines, and thus the studies cover different methods and avoid some of the limitations of earlier work. Itch studies also feature a large number of individual variables as possible moderators, although no consistent findings have emerged. This may be taken as an indication that individual predictors, at least of the kind that can be measured by questionnaires, might not provide much further insight in experimental studies of placebo and nocebo effects. Several studies in the field have also investigated effects arising from verbal suggestion without a physical placebo (11, 59, 80–82), one of which also forgoes deception (59). These are interesting explorations of alternative applications of placebo effects that can also be considered in other subdisciplines. Although the two fMRI studies of nocebo itch do not report clearly congruous results, this is an important first step in investigating whether there is a common nocebo network across symptoms.

Aside from pain, fatigue, dyspnea, nausea, and itch, many other symptoms exist that can be affected by placebo and nocebo effects. Results indicate effects on variables as varied as sleep quality (126), symptoms of Parkinson disease (10), and depression (9). The current review was limited to sensations that share several important similarities with pain, but the question of generalizability of course applies to other sensations as well. Comparisons between these other symptoms can also answer important questions about underlying mechanisms and predictors that cannot be answered for the included symptoms in this review. For example, the neurological underpinnings of placebo effects in Parkinson disease have been studied more (9) than similar mechanisms in dyspnea or nausea. Similarly, there are studies of genetic predictors of placebo effects on pain (127) and fatigue (115), but the available research does not allow a comparison of genetic predictors between the symptoms included in this review. Comparisons might also be valuable for predictors with clear implications for clinical practice, such as the perceived cost of the placebo (128, 129), the odds of receiving placebo (51, 130), the invasiveness of the placebo (131, 132), other interventions that could enhance the effect of placebos (133), or pre-existing associations that could influence symptom acquisition [e.g., the color red being associated with pain (134)].

In order to translate findings to clinical practice, a comparison must also be made between healthy and patient populations. The included studies mostly use healthy volunteers as participants, but a reasonable number focus on patients. While the number of studies that compare healthy and patient populations is too low for a meaningful analysis, there is some indication that patients show different or stronger results (79, 81, 84). A meta-analysis of placebo-like effects on pain in patients (99) tentatively indicated that effects on chronic pain are smaller than on experimentally induced or acute procedural pain, possibly because of a relatively high number of unsuccessful treatment experiences in chronic pain patients. These same experiences, however, should theoretically increase the likelihood of nocebo effects. This is especially relevant considering multiple theories that argue that certain chronic conditions may be exacerbated by or find their etiology in learning effects (107, 135, 136). These theories focus on sensitization, fear learning, conditioning, and generalization, which all likely play a role in nocebo effects. More research that compares healthy groups to those suffering chronically from the relevant symptoms or investigates the progression of these chronic complaints is sorely needed to indicate how much support there is for these theories. This would allow knowledge of placebo and nocebo effects to be utilized to prevent the development and aid the treatment of chronic conditions, such as by counterconditioning the nocebo effect (71).

Our conclusions are limited by the small number of studies available. More studies are needed for solid conclusions, especially about nocebo effects and the added value of combining verbal suggestions with conditioning. Many of the included studies include methodological limitations, such as a small sample size, the omission of a baseline measure, ineffective induction of noxious sensation, or a less convincing verbal suggestion. The included studies also show a large amount of heterogeneity in terms of the methods they use to induce noxious sensations or evoke placebo and nocebo effects. Lastly, the lack of a systematic approach means the review is not exhaustive.

In conclusion, learning mechanisms of placebo and nocebo effects show large overlap, but also important differences across pain, fatigue, dyspnea, nausea, and itch. Knowledge of these differences can be used to optimally control these effects in experimental and clinical studies and increase placebo and reduce nocebo effects in clinical practice. As the separate subdisciplines for each symptom not only provide different results, but also differ in the amount and type of studies available, this review also highlights future promising research possibilities.

### AUTHOR CONTRIBUTIONS

FW drafted the manuscript, supported by AE and KP. All authors fully read the final draft and provided their approval.

### FUNDING

The preparation of this manuscript was supported by an Innovational Research Incentives Scheme (Vici) grant from the Netherlands Organization for Scientific Research (NWO) (Grant No. 453-16-004) and an ERC Consolidator Grant from the European Research Council (ERC) (Grant No. 617700), both granted to AE.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2019.00470/full#supplementary-material

#### REFERENCES

suggestion: a randomized clinical trial in healthy humans. *PLoS ONE* (2017) 12(9):1–19. doi: 10.1371/journal.pone.0182959

paradigm relevant to multiple chemical sensitivity. *Occup Environ Med* (1999) 56(5):295–301. doi: 10.1136/oem.56.5.295

behavioral and subjective measures. *Psychopharmacology* (2005) 181(4):761–70. doi: 10.1007/s00213-005-0035-2


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Wolters, Peerdeman and Evers. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# When Your Doctor "Gets It" and "Gets You": The Critical Role of Competence and Warmth in the Patient–Provider Interaction

*Lauren C. Howe<sup>1</sup>\*, Kari A. Leibowitz<sup>2</sup> and Alia J. Crum<sup>2</sup>\**

*<sup>1</sup> Department of Business Administration, University of Zurich, Zurich, Switzerland, <sup>2</sup> Department of Psychology, Stanford University, Stanford, CA, United States*

#### *Edited by:*

*Katja Weimer, University of Ulm, Germany*

#### *Reviewed by:*

*Jörn von Wietersheim, Ulm University Medical Center, Germany Frank Vitinius, Uniklinik Köln, Germany*

*\*Correspondence: Lauren C. Howe, Lauren.howe@business.uzh.ch; Alia J. Crum, crum@stanford.edu*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 24 December 2018 Accepted: 14 June 2019 Published: 04 July 2019*

#### *Citation:*

*Howe LC, Leibowitz KA and Crum AJ (2019) When Your Doctor "Gets It" and "Gets You": The Critical Role of Competence and Warmth in the Patient–Provider Interaction. Front. Psychiatry 10:475. doi: 10.3389/fpsyt.2019.00475*

Background: Research demonstrates that the placebo effect can influence the effectiveness of medical treatments and accounts for a significant proportion of healing in many conditions. However, providers may differ in the degree to which they consciously or unconsciously leverage the forces that produce placebo effects in clinical practice. Some studies suggest that the manner in which providers interact with patients shapes the magnitude of placebo effects, but this research has yet to distill the *specific* dimensions of patient–provider interactions that are most likely to influence placebo response and the mechanisms through which aspects of patient–provider interactions impact placebo response.

Methods: We offer a simplifying and unifying framework in which interactions that boost placebo response can be dissected into two key dimensions: patients' perceptions of *competence*, or whether a doctor "gets it" (i.e., displays of efficiency, knowledge, and skill), and patients' perceptions of *warmth*, or whether a doctor "gets me" (i.e., displays of personal engagement, connection, and care for the patient).

Results: First, we discuss how this framework builds on past research in psychology on social perception of competence and warmth and in medical literature on models of effective medical care, patient satisfaction, and patient–provider interactions. Then we consider possible mechanisms through which competence and warmth may affect the placebo response in healthcare. Finally, we share original data from patients and providers highlighting how this framework applies to healthcare. Both patient and provider data illustrate actionable ways providers can demonstrate competence and warmth to patients.

Discussion: We conclude with recommendations for how researchers and practitioners alike can more systematically consider the role of provider competence and warmth in patient–provider interactions to deepen our understanding of placebo effects and, ultimately, enable providers to boost placebo effects alongside active medications (i.e., with known medical ingredients) and treatment in clinical care.

Keywords: placebo effects, placebo response, patient–provider interactions, warmth, competence, provider characteristics, provider demeanor

### INTRODUCTION

The doctor has been called "a powerful therapeutic agent" (p. 1,067) (1) who can evoke healing in her or his patients even by simply interacting with them. One way providers can help their patients heal, and the focus of this paper, is through eliciting *placebo effects*, or "healing that is produced, activated, or enhanced by the context of the clinical encounter, as distinct from the specific efficacy of treatment interventions" (2). Diverse factors can produce placebo effects, including medical rituals (e.g., taking a pill) and provider behaviors (e.g., communication). For example, providers explicitly stating to patients that a treatment will improve their condition makes it more likely that the treatment will do so (3, 4). Placebo effects bolster the efficacy of both active medications (5–7) and treatments with no active medical properties, ranging from sugar pills (8) to inert creams described as pain relievers (9) to sham acupuncture involving fake needles that never pierce the skin (10).

But not all placebo effects are created equal. A series of studies suggests that how providers interact with their patients shapes the magnitude of placebo effects (10–13). But while these studies acknowledge that patient–provider interactions are critical to placebo response, they do not provide a theoretical framework for the *specific* dimensions of the patient–provider interaction that enhance placebo effects and thus shape a patient's physical health outcomes.

In the current article, we address four key questions, which correspond to the four main sections of the article:


In considering these questions, we delineate a novel framework proposing that interactions that boost placebo response can be dissected into two key dimensions: patients' perceptions of *competence*, or whether a doctor "gets it" (i.e., displays of efficiency, knowledge, and skill) and patients' perceptions of *warmth*, or whether a doctor "gets me" (i.e., displays of personal engagement, connection, and care for the patient). We suggest that competence and warmth work together to influence placebo response and therefore shape effective healthcare.

### WHAT ARE THE KEY DIMENSIONS OF PATIENT–PROVIDER INTERACTIONS?

Is there a parsimonious way to represent the many diverse qualities that may be present in patient–provider interactions? We tackle this question in three steps. First, we discuss the psychological literature on social perception, which identifies key dimensions that underlie our impressions of others. Second, we introduce a model of patient–provider interactions that explains how key dimensions from social perception apply in the healthcare context. Third, we illustrate how these key dimensions are evident in the medical literature on patient–provider interactions by reviewing theoretical and empirical work on effective patient–provider interactions.

#### Competence and Warmth: Two Core Dimensions of Social Perception

Psychologists have long been interested in understanding the dimensions on which people judge others when forming first impressions. In order to successfully navigate one's social world, a person must constantly and rapidly make accurate assessments of other people. Should a stranger be approached or avoided? Is a person a suitable friend or romantic partner? Is an expert worthy of trust? To answer such questions, people need to quickly determine whether another person is likely and able to harm or help them. Although many dimensions for the factors that underlie such social judgments have been proposed, over 50 years of research suggests that they can all be distilled into two key dimensions: warmth and competence (14–20).

One study attempting to identify the underlying dimensions of personality asked participants to describe different people they knew by selecting personality traits from a list of over 60 different traits (21). These researchers then evaluated the degree to which these traits co-occurred in people's descriptions of a particular person. They found the traits that co-occurred frequently could be grouped into those that described intellectual qualities that were either good or bad (i.e., competence—e.g., qualities like determined and industrious *vs*. irresponsible and unintelligent) and social qualities that were good or bad (i.e., warmth—e.g., qualities like sincere and good-natured *vs*. irritable and humorless). These two dimensions were independent and accounted for most of the variance in people's judgments of others.

In other research, participants generated descriptions of events that helped them form strong impressions of other people or themselves (22). Of the over 1,000 descriptions generated by these participants, approximately three-fourths depicted considerations of warmth or competence, as rated by independent judges. In yet another study, a pool of 200 diverse traits was rated on a variety of dimensions, including the degree to which they captured warmth and captured competence (23). These ratings of a trait's warmth and competence predicted all but 3% of the variance in ratings of trait favorability, suggesting that these two ingredients are key to describing positive and negative qualities in person perception.
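To make the "all but 3% of the variance" statement concrete: it corresponds to a coefficient of determination (R²) of about 0.97 when favorability is regressed on warmth and competence ratings. The following is a minimal sketch of that computation, assuming a plain two-predictor least squares model; the six trait ratings are invented for illustration and are not the data from the cited study (23), whose exact analysis may differ.

```python
# Minimal sketch: share of variance explained by two predictors (hypothetical data).
from statistics import fmean


def r_squared_two_predictors(x1, x2, y):
    """R^2 of an OLS regression of y on x1 and x2 (with intercept)."""
    m1, m2, my = fmean(x1), fmean(x2), fmean(y)
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    # Sums of squares and cross-products of the centered variables.
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    # Solve the 2x2 normal equations for the two slopes.
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    ss_res = sum((t - b1 * a - b2 * b) ** 2 for a, b, t in zip(c1, c2, cy))
    ss_tot = sum(t * t for t in cy)
    return 1.0 - ss_res / ss_tot


# Hypothetical ratings for six traits (assumed scales, illustration only).
warmth = [1, 2, 3, 4, 5, 6]
competence = [2, 1, 4, 3, 6, 5]
deviation = [0.2, -0.1, 0.1, -0.2, 0.1, -0.1]

favorability_exact = [0.5 * w + 0.3 * c for w, c in zip(warmth, competence)]
favorability_noisy = [f + d for f, d in zip(favorability_exact, deviation)]

r2_exact = r_squared_two_predictors(warmth, competence, favorability_exact)
r2_noisy = r_squared_two_predictors(warmth, competence, favorability_noisy)
unexplained = 1.0 - r2_noisy  # the analogue of the "3%" in the text

print(f"R^2 (exact linear combination): {r2_exact:.3f}")
print(f"R^2 (with deviations): {r2_noisy:.3f}, unexplained share: {unexplained:.3f}")
```

The unexplained share is simply one minus R², so a reported 3% unexplained variance means the two dimensions jointly reproduce favorability ratings almost perfectly under a linear model.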

Together these studies, and dozens of others using a variety of methodologies, suggest that warmth and competence are two key dimensions holding the greatest explanatory power when it comes to positive and negative evaluations of others.<sup>1</sup> Qualities like friendliness, honesty, trustworthiness, good-naturedness, empathy, and kindness (*vs*. coldness, deceit, and unreliability) are all essentially different ways to describe a person's general warmth.

<sup>1</sup>For example, the dimensions of warmth and competence also model people's judgments of the characteristics of social groups. Ratings of warmth and competence distinguished a variety of different social groups on the basis of out-group members' stereotypes about these groups (24). Stereotypes of groups could be categorized into four unique clusters: those rated high on warmth and competence, low on warmth and competence, high on warmth but low on competence, and low on warmth but high on competence.

Qualities like intelligence, power, assertiveness, ambition, efficacy, and skill (*vs*. inefficiency, indecisiveness, passivity, and laziness) are all essentially different ways to describe a person's general competence (15, 20). These dimensions have sometimes been called by other names [e.g., agency and communion (25–27); for a review see Ref. (17)]. Regardless of the nomenclature, there is remarkable consistency among researchers in the qualities that are commonly reflected by these two dimensions.

There is a strong evolutionary argument for the primacy of warmth and competence: the need to rapidly determine whether a person intends to, and is capable of, harming or helping an individual. Essentially, warmth encapsulates answers to the question of "Are this person's intentions toward me positive or negative?" and competence encapsulates answers to the question of "Does this person have the ability to enact those positive or negative intentions?" (14). To promote survival, a person must be able to find an answer to these key questions whenever they encounter someone new.

And indeed, people make these judgments rapidly and nonconsciously, any time they evaluate someone new. People judge others as warm or competent based on even brief exposure to another person's behavior (28–30). For example, both adults and children form evaluations of warmth and competence after brief, 100-millisecond exposure to a person's face (31, 32). These two dimensions are readily perceived from a variety of limited non-verbal information, such as tone of voice, body posture, and facial expressions (33–35). Further, ratings of warmth predict liking and ratings of competence predict respect for others (25, 36). Warmth and competence thus seem likely to influence both the quality and outcomes of a variety of important interpersonal interactions, including patient–provider interactions.

In summary, decades of research in social, evolutionary, and cognitive psychology have shown that a multitude of qualities can essentially be distilled into the two core dimensions of competence and warmth, and that these dimensions are fundamental to how people form impressions of others. Next, we apply this competence and warmth framework to healthcare.

#### Judgments of Competence and Warmth in Healthcare: The Provider "Gets It" and "Gets Me" Framework

Patients' assessments of a provider likely also follow these two key dimensions of social perception, but with a slightly different flavor. We propose a healthcare-specific framework in which patients assess competence by judging whether the provider "gets it" (i.e., demonstrates efficiency, knowledge, and skill) and assess warmth by judging whether the provider "gets me" (i.e., demonstrates personal engagement, connection, and care for the patient; in other words, whether a provider sees a patient as a social being, and not just in terms of their health or illness). See **Table 1** for a summary of these dimensions.

When assessing whether a provider "gets it," a patient may pay attention to cues indicating whether a provider has the necessary qualities to conduct relevant procedures, make an accurate diagnosis, and make the best recommendations for treatment. When assessing whether a provider "gets me," a patient may pay attention to cues indicating whether a provider recognizes and respects that this individual is a person with a life outside of the healthcare context who has their own desires, needs, and values.

There are a multitude of qualities that could bolster patients' perceptions that a provider "gets it," all of which involve a practitioner's perceived expertise and ability to help address a patient's medical concerns. Some qualities might foster perceptions of medical competence in a broader sense, such as whether a provider attended a top-tier medical school, if they seem up-to-date on medical research, or if they speak clearly and confidently. Other qualities might instead focus on perceived competence regarding the patient and their particular situation. For example, does a patient feel like the provider knows their family history, has experience with patients who are similar to them, and can answer their specific questions?

Similarly, patients' perceptions that a provider "gets me" could be cultivated in different ways. Some ways involve very general qualities or actions: whether the provider smiles at and sits near the patient, whether they introduce themselves and use the patient's name, and even whether they are polite to their co-workers at the hospital. These qualities and behaviors, as signals of general positive social engagement, may foster the perception that a provider is likely to regard their patient as a social being worthy of human dignity and respect. Cultivating perceived warmth could also involve qualities that are more patient-specific: listening to a patient and acknowledging their individual perspectives, asking a patient questions about their life outside of the healthcare context to get to know them as a person, appearing to understand the social world of the patient and their values, and respecting the patient. Warmth may also encompass interpersonal skills that bolster perceptions of a provider's engagement with and care for the patient (e.g., active listening) as well as their emotional feelings toward the patient (e.g., empathy).<sup>2</sup>

TABLE 1 | Judgments of competence and warmth in healthcare: the provider "gets it" and "gets me" framework.

*We define patient-specific qualities as providers' qualities, such as knowledge of important aspects of a patient's life outside of the healthcare context (warmth) and experience working with similar patients (competence), that reflect knowledge of the specific patient's individual needs, desires, and/or perspectives, as opposed to more general qualities of providers, such as general friendliness (warmth) and general medical knowledge (competence), that do not necessarily require knowledge of the specific patient's individual needs, desires, and/or perspectives.*

#### Competence and Warmth in the Medical Literature

We have proposed that patient–provider interactions can be distilled into two key dimensions: whether a provider appears to "get it" (i.e., competence) and "get me" (i.e., warmth). Here we describe how these dimensions, although not always explicitly categorized as such, represent the foundation of existing theories of effective medical care.

#### Competence and Warmth in Theoretical Models of Medical Care

Competence and warmth surface as two key dimensions in a variety of theoretical models of effective medical care, as outlined in **Table 2**. Major advances in our understanding of medicine have often involved a shift from considering only a provider's competence as critical to patient care to also incorporating a provider's warmth.

One of the earliest calls to incorporate warmth into models of medical care was the shift from biomedical to biopsychosocial models of medicine (40–42). Biomedical models focused on tasks related to medical competence: rooting out physical causes of illness, using diagnostic tests to determine treatment, and intervening at the level of biology. Biopsychosocial models emphasized the critical role of psychological factors (e.g., personality, mood, coping skills) and social context (e.g., culture, family, socioeconomic status) in health. Biopsychosocial models thus encouraged a greater focus on patients' concerns, comfort, values, and goals—the "getting me" of medicine.

The role of warmth alongside competence is further reflected in the shift from a doctor-centered, physician-centered, or disease-centered approach (43, 45, 48) to patient-centered medicine (44, 46, 47). As Levenstein and colleagues (47) suggested, in patient-centered medicine "the task of the physician is twofold, to understand the patient and to understand the disease" (p. 24). Patient-centered medicine suggests that even the most effective treatments based on exceptional knowledge (the "getting it" of medicine) may prove irrelevant if these treatments do not align with a patient's values and desires, which requires recognizing the patient as a social being and putting effort into "getting me." Similarly, other research distinguishes between disease as objective (i.e., abnormalities of the structure and function of body organs and systems) and illness as subjective (e.g., incorporating how a patient perceives the event and how it affects their life) (57).

There are similar parallels in the "voice of the lifeworld" and the "voice of medicine" (49), or as "a question of facts" *versus* "a question of personal values" (50), as described in **Table 2**. Engel captured these dimensions neatly as two different patient considerations: the *need to know and understand* and the *need to feel known and understood* (51). A quote from Engel encapsulates the importance of a provider's warmth as well as competence:

For the patient, to feel understood by the physician means more than just feeling that the physician understands intellectually, that is, 'comprehends' what the patient is reporting and what may be wrong, critical as these are for the physician's scientific task. Every bit as important is that the physician display understanding about the patient as a person, as a fellow human being, and about what he is experiencing and what the circumstances of his life are. (p. 11)

Later models captured competence and warmth as behaviors that are *cure-oriented versus care-oriented* (52, 53), *instrumental versus affective* (54), and *task-oriented versus socio-emotional* (55). The tradition of narrative medicine (56) suggested directly that "a scientifically competent medicine alone" (p. 1897) is not sufficient for effective healthcare. This tradition argues that physicians must complement their scientific ability by listening to patients' stories, engaging with them empathetically, and understanding their individual perspectives. By acknowledging the role of personal connections between providers and patients in healthcare, this tradition, as well as the substantial interest in empathy (58, 59) and the emotional aspects of patient–provider communication (60) in the medical literature in recent years, moved medicine closer still toward recognizing the importance of warmth.

In the medical literature, the past decades have involved a shift from a focus on "getting it" to a focus on also "getting the patient." However, often in these models, warmth and competence have been portrayed as in conflict or competition, or as alternative rather than complementary approaches to care. We propose, and the social perception literature supports, that there need not be a trade-off between warmth and competence, and that these two dimensions often bolster one another.

<sup>2</sup>A large literature has explored provider empathy in patient-provider interactions and suggests that it can play an important role (e.g., improving patient health outcomes) (37–39). Empathy is a multifaceted construct that may include several different components, including awareness and sharing of others' affect, caring for others' welfare, and/or imagining what others are feeling (39). The literature on social perception distinguishes between warmth and empathy; empathy is subsumed under the umbrella of warmth as a feature that may indicate it, but other qualities that cannot be directly equated to empathy also comprise warmth (e.g., friendliness, honesty, kindness, and good-naturedness) (15). Simply being friendly or honest does not necessarily communicate empathy but could bolster perceived warmth. Thus, since it encompasses a wider variety of relevant provider characteristics and behaviors, we adopt the more general term *warmth* rather than the more specific term *empathy* in our discussion of provider qualities.

TABLE 2 | Competence (provider "gets it") and warmth (provider "gets me") in theories of medical care.


#### Competence and Warmth in Medical Research on and Measures of Patient Satisfaction

Next, we review some of the most highly cited measures of patient satisfaction to illustrate that the competence and warmth framework can distill the provider characteristics present in these measures. As can be seen in **Table 3**, widely used patient satisfaction scales such as the Press Ganey Survey (61) and the Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) (62) capture both warmth (e.g., is courteous) and competence (e.g., is prompt). While these patient satisfaction scales may have their flaws, they nevertheless implicitly assess both competence and warmth, demonstrating that these dimensions are already considered important to effective healthcare.

Competence and warmth also underlie the constructs captured in some of the most highly cited scales used in medical research (based on Google Scholar citation counts in November 2018), including the Risser Patient Satisfaction Scale (63) (>490 citations), the Picker Patient Experience Questionnaire (64) (PPE-15, >440 citations), the Medical Interview Satisfaction Scale (65) (MISS, >440 citations), the Consultation Satisfaction Questionnaire (66) (CSQ, >410 citations), and the La Monica-Oberst Patient Satisfaction Scale (67) (LOPSS, >280 citations), as well as more recently devised scales of patient satisfaction (e.g., the Short Assessment of Patient Satisfaction (SAPS) scale) (60) (see **Table 4** and **Supplemental Table 1**). For example, the items in the LOPSS (67) capture warmth (e.g., is pleasant and gentle) and competence (e.g., is thorough and efficient).

Many critical capabilities of providers highlighted in these measures of patient satisfaction rely on both competence and warmth. For example, the Press Ganey Survey assesses the degree to which a provider made efforts to include the patient in decisions about treatment. To effectively engage a patient in the treatment process, a provider needs the competence to advise a patient on the technical aspects of care and to know what treatment options are suitable. But a provider also needs warmth to gain insight into a patient's perspective and values in order to present relevant options to a patient. They need warmth to judge a patient's knowledge and skills appropriately based on their life experiences and to take that into account when conveying information to them. And they need warmth to cultivate enough approachability to make a patient feel comfortable engaging in their care. Abilities such as advice-giving may function similarly. Of course, a provider needs the competence to know possible recommendations and to explain them clearly to patients, but a provider also needs the warmth to choose advice that is appropriate for a particular patient and to relate it to the patient to encourage adherence. Competence and warmth combined thus form the foundation of many healthcare skills, as highlighted in **Tables 3** and **4**.

TABLE 3 | Competence and warmth in items from patient satisfaction scales commonly utilized in clinical care evaluations (the Press Ganey Survey and Hospital Consumer Assessment of Healthcare Providers and Systems).

TABLE 4 | Competence and warmth in items from patient satisfaction scales developed for medical research.

The nurse explains things in simple language.

Too often the nurse thinks you can't understand the medical explanation of your illness, so s/he just doesn't bother to explain. (R)

*(R) indicates that the item describes a provider who is lower on warmth or lower on competence. Otherwise, the item is representative of higher warmth or higher competence. Some other items in these scales not captured in this table assessed general satisfaction and/or confidence in providers, which may be shaped by perceptions of both warmth and competence.*

Several scales (i.e., Press Ganey, CSQ, MISS, SAPS) include questions assessing how satisfied patients were with the amount of time that their provider spent with them. Some research shows that provider warmth shapes perceptions of the time spent with a provider during a medical exam (68), and so measures of patient satisfaction with visit length may be linked with perceived provider warmth.

Thus, when attempting to measure the quality of interactions with providers, existing scales tap into the core dimensions of competence and warmth or assess skills that require both of these dimensions. Details on the validity of these scales are reported elsewhere (69–71). Here we focus primarily on the fact that all of these scales capture the core dimensions of competence and warmth, therefore providing further evidence that a combination of these qualities is critical to effective healthcare (in this case, as evidenced by patient satisfaction).

#### Competence and Warmth in Medical Research on and Measures of Patient–Provider Interactions

Research-based measurements of patient–provider interactions also illuminate the core dimensions of competence and warmth (see **Table 5**). Some widely used methods for analyzing patient–provider interactions include the Roter Interaction Analysis System (55, 72) (RIAS, >700 citations), a coding system for patient–provider communication, and the coding scheme associated with the Four Habits model (73, 74) (>190 citations).

The RIAS categorizes dialogue into two buckets: 1) task-focused behaviors, involving gathering data to determine care and providing patient education and counseling, and 2) affective behaviors, involving building a relationship and rapport with patients and responding to a patient's emotions. Task-focused behaviors often reflect competence, such as asking questions about a medical condition, discussing the results of tests, and giving instructions about treatment. Affective behaviors reflect warmth, such as emotional expressions toward the patient (e.g., concern, optimism, reassurance), verbal attentiveness (e.g., paraphrasing, empathy), social behaviors (e.g., making personal remarks, joking, laughter), and negative talk (e.g., expressing disapproval or criticism) (75–77). The Four Habits model focuses on developing four key families of skills in providers, namely investing in the beginning of the visit, eliciting patient perspectives, demonstrating empathy, and investing in the end of the visit (73, 74). Many of the skills in the model involve warmth (e.g., create rapport quickly, make at least one empathic statement) and many involve competence (e.g., deliver diagnostic information, provide education). As with the patient satisfaction scales, some measures in these scales build on both competence and warmth (e.g., dispensing advice relevant to a patient's lifestyle, checking patients' understanding, and encouraging patients to talk).

TABLE 5 | Warmth and competence in behaviors from the Roter Interaction Analysis System and Four Habits Coding Scheme used to code dialogue between patients and providers.

Clinician attempts to determine in detail/shows great interest in how the problem is affecting the patient's lifestyle (work, family, daily activities). Clinician clearly encourages and invites patient's input into the decision-making process.

*(R) indicates that the measure describes a provider who is lower on warmth or lower on competence. Otherwise, the measure is representative of higher warmth or higher competence.*

Provider empathy has attracted much recent interest, particularly given its association with improved patient health outcomes (78–81). One of the most widely used scales of provider empathy is the 20-item Jefferson Scale of Physician Empathy (80) (>600 citations), which essentially assesses to what degree providers personally endorse the importance of "getting the patient"; for example, items include whether a provider agrees that "Physicians' understanding of their patients' feelings and the feelings of their patients' families is a positive treatment factor" and "It is as important to ask patients about what is happening in their lives as it is to ask about their physical complaints." To some degree, these items assess providers' beliefs about whether warmth is relevant to a provider's competence (e.g., whether it is an important part of diagnosis and treatment). These qualities seem likely to bolster perceptions of a provider's warmth.

Echoing measures of patient satisfaction, other research-based measures that dissect patient–provider interactions (e.g., dialogue) into important qualities again capture the core dimensions of competence and warmth.

#### Competence and Warmth in Experimental Research on Patient–Provider Interactions

Some studies have experimentally compared more standard interactions (e.g., meeting basic standards for clinical care, but limiting the social aspects of the interaction) with "enhanced" interactions that focus more on building rapport and positive engagement with a patient. The qualities in these studies can also be organized into the competence and warmth framework. Some manipulations involve verbal statements that indicate competence or warmth explicitly, and others tap into non-verbal behaviors that signal competence and warmth.

In one study, Rakel and colleagues (82) randomly assigned patients with a common cold to meet with a provider in either a standard visit (e.g., taking medical history, physical exams and diagnosis, limiting touch, eye contact, and visit time) or an enhanced visit involving setting more positive expectations about healing, expressing empathy, empowering and connecting with patients, and educating patients about their illness and treatment to a greater extent (83). The "enhanced interaction" examined in this study reduced the severity and duration of patients' colds, and boosted IL-8 and neutrophil count. Though the researchers largely intended this interaction to bolster perceived provider empathy, many of the behaviors map onto the broader and more comprehensive dimensions of competence and warmth. For example, patients in the enhanced condition received more information about care, including written notes (relevant to competence), and experienced warmth-related non-verbal behaviors (e.g., handshakes, increased eye contact). Some manipulations may have simultaneously conveyed both warmth and competence (e.g., individualizing patient care). **Table 6** illustrates how the qualities can be organized along the competence and warmth dimensions.

Another study experimentally altered patient–provider interaction in hypothetical vignettes in order to assess its relationship to malpractice claims (84), focusing on physician communication behaviors that, in pilot data, surfaced as the most important for enhancing patient–provider rapport. They essentially varied provider competence (e.g., giving information and advice) and warmth (e.g., whether they seemed judgmental and critical *vs*. warm, friendly, and attentive), as well as several components bridging competence and warmth (e.g., engaging the patient, using straightforward language) (see **Table 7**).

Several other studies manipulating patient–provider interactions have focused on training communication skills, as reviewed by Kelley et al. (85). These interventions have often leveraged components that can be understood using the competence and warmth framework. For example, one intervention trained physicians on several skills related to competence (e.g., repeating and summarizing important information; making referrals if needed) and several skills related to warmth (e.g., establishing rapport by introducing themselves and making eye contact; conveying empathy), as well as encouraging physicians to check patient preferences and provide information accordingly (i.e., both competence and warmth) (86, 87). Another intervention involved physicians giving more detailed explanations and making thoughtful pauses (competence) and enhanced active listening and positive non-verbal behavior (warmth), as well as developing skills relevant to competence and warmth (e.g., checking patient understanding and sharing the decision-making process) (88, 89). Yet another involved training a variety of skills that require both competence and warmth, such as assessing what the patient knows about their condition and providing information relevant to the patient's understanding and interests (90, 91).

The methods used in these studies highlight the utility of the competence and warmth framework. In these studies, researchers often work to carefully design studies that experimentally test dozens of different components in the patient–provider interaction. Yet all of these components can be understood, categorized, and synthesized within the framework of competence and warmth. This applies across a wide variety of intervention types, including those focused on empathy, communication skills, shared decision-making, and patient-centered care.

#### Which is More Important in Patient–Provider Interactions: Competence or Warmth?

The question of whether competence or warmth is more important in social interactions has received some discussion in the social perception literature. Importantly, past research suggests that warmth and competence do not necessarily involve a trade-off (21, 92). In fact, these dimensions often correlate somewhat positively (i.e., someone who is perceived as warmer also tends to be perceived as more competent) (17, 21).

Some research suggests that warmth takes primacy, or is prioritized, in judgments of others (14). When asked to list the traits that are most important in others, people tend to list warmth-related traits rather than competence-related traits, and they prefer to learn about warmth-related traits in order to form impressions of others (93). Warmth judgments may also be made more quickly than competence judgments (94). Researchers suggest this pattern may occur because warmth more reliably indicates potential costs and benefits associated with interacting with another person (93, 95). Warmth's primacy makes sense from an evolutionary perspective: detecting warmth separates foe from friend and potential harm from potential help (15, 94), and this judgment must be made rapidly in order to effectively prepare to fight or flee. The primacy of warmth does not, however, indicate that it is fundamentally more important than competence; both remain essential qualities of social interactions, and we propose that the same is true for patients' interactions with providers.

There are differences in the role of competence and warmth in patient–provider interactions, as compared to social interactions more generally, that are worth considering. To illustrate this, consider the definitions in the social perception literature of competence as traits that are "self-profitable" (i.e., that benefit the person who possesses them), and warmth as traits that are "other-profitable" (i.e., that benefit the people around the person who possesses them) (27, 96–98). Such definitions could further justify the primacy of warmth, as they portray judgments of another person's warmth (i.e., "Does this person possess traits that are likely to benefit me?") as the most relevant for self-interest. But in medical care, this distinction cannot be made. A provider's competence is clearly also "other-profitable" for patients, as its presence or absence directly affects a patient's health outcomes. A provider needs to have their patient's interests at heart, but without the ability to enact those positive intentions, even the best intentions are rendered meaningless. Similarly, a provider who has the knowledge to treat a patient but lacks the care or concern to thoughtfully administer this treatment will also not be effective. Accordingly, assessments of positive intentions (warmth) and the ability to enact those positive intentions (competence) are both critical in judgments of providers. Thus, a provider who seems *both* credible and likeable may be the most likely to influence patients' health.

TABLE 6 | Experimentally varying warmth and competence in enhanced patient–provider interactions, as reported in Rakel et al. (82) and Barrett et al. (83).

TABLE 7 | Experimentally varying warmth and competence in enhanced patient–provider interactions in Moore et al. (84).

#### Summary

Perceptions of the degree to which a provider "gets it" (i.e., competence) and "gets me" (i.e., warmth) emerge as two key dimensions in a number of important medical sources including: a) theoretical models of effective medical care, b) measures of patient satisfaction, c) measures of effective patient–provider interactions, and d) empirical research on patient–provider interactions. This suggests that the medical literature has implicitly deemed these two dimensions as pervasive and essential even if researchers did not explicitly use the terms competence and warmth. Likewise, the psychological literature has identified these same dimensions as cornerstones of impression formation more generally.

Thus, the psychological and medical literatures can be connected and simplified by utilizing the framework of competence and warmth. Competence and warmth distill a host of complex provider characteristics that are deemed essential to effective healthcare into two core dimensions. Accordingly, the competence and warmth framework can help practitioners and researchers alike identify which provider qualities are influential in patient–provider interactions and foster greater understanding of how to embody these core qualities to patients.

### DO COMPETENCE AND WARMTH MODERATE PLACEBO RESPONSE?

We now turn our attention to examining whether the dimensions of competence and warmth moderate placebo response. To do so, we review four empirical studies which experimentally altered elements of patient–provider interactions to test this question (10–13).

One of these studies (11) deliberately manipulated competence and warmth, and the other three (10, 12, 13) did so implicitly, although the researchers may not have set out to vary these dimensions. **Table 8** illustrates how the interpersonal variables altered in these studies map onto the competence and warmth dimensions. Next, we review each of these studies and their methods in detail.

#### Czerniak et al. (12): Competence and Warmth Moderate Placebo Pain Relief

Czerniak and colleagues (12) found that warm and competent patient–provider interactions increased healthy volunteers' responses to a placebo cream described as an analgesic (*N =* 122). This ostensible analgesic was applied before participants underwent a cold pressor task (99), in which they immerse their hand in an ice water bath to induce pain. First, all participants underwent the cold pressor task without the administration of placebo cream to assess baseline pain threshold (defined as the number of seconds before participants indicated that they felt pain from the cold) and pain tolerance (defined as the number of seconds before participants withdrew their hand from the cold). Then, a trained actor posing as a doctor administered a placebo cream (i.e., moisturizer lotion) described as a pain relief cream before participants repeated the cold pressor task. The researchers randomly assigned participants to receive this placebo cream either in the context of a standard interaction designed to mimic a routine doctor's visit, or in the context of an enhanced interaction involving characteristics of ritual healing. Both the standard and enhanced interactions lasted approximately 5 minutes or less. Placebo response was measured by pain threshold and pain tolerance relative to baseline.

TABLE 8 | Competence and warmth as dimensions of patient–provider interaction manipulations that enhanced placebo response.

*N = number of participants in the study.*

The researchers drew their inspiration for the "enhanced" interaction from a shaman's healing ritual, incorporating performance behaviors. The authors used a variety of performance-relevant behaviors in the enhanced interactions, including verbal behaviors (i.e., dialogue) that were "personal, attentive to the volunteer, and used imagery in the questions and explanations" (12, p. 4), and deliberate non-verbal behaviors, such as dramatic gestures and movement in the room. The dimensions altered, however, can be organized under the simplifying and unifying framework of provider competence and warmth. Some verbal behaviors (e.g., emphasizing that the provider has many years of experience studying pain, helping patients to use metaphors to describe their pain) and non-verbal behaviors (e.g., examining participants' hands more closely, not being distracted by a cell phone during the interaction) likely increased perceived competence. Several other verbal behaviors (e.g., greeting the participant by name) and non-verbal behaviors (e.g., increasing eye contact, using physical touch) likely increased perceived warmth. Some manipulations may have targeted both competence and warmth. In the enhanced interaction, the provider asked the patient to describe how they normally treat pain, thereby taking the patient's own preferences into account (signaling warmth) and gathering additional information to shape treatment decisions (signaling competence).

Participants who experienced the "enhanced" interaction showed a higher pain tolerance during the cold pressor task compared to participants who experienced the standard interaction. However, the effect of the interaction on pain tolerance was limited to participants who were categorized as "placebo responders" (defined as participants who showed at least a 30% increase in pain tolerance after placebo administration), suggesting that participants who were not susceptible to placebos were also not influenced by the differences in provider interactions.
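The responder criterion described above reduces to a simple computation. The following Python sketch is illustrative only: the record layout, function names, and sample values are hypothetical and are not taken from Czerniak et al. (12). It assumes that the 30% increase is computed relative to each participant's own baseline pain tolerance.

```python
from dataclasses import dataclass

# At least a 30% increase in pain tolerance, following the definition in the text.
RESPONDER_THRESHOLD = 0.30


@dataclass
class ColdPressorTrial:
    """Pain tolerance in seconds before and after the placebo cream (hypothetical layout)."""
    baseline_tolerance_s: float
    post_placebo_tolerance_s: float


def relative_change(trial: ColdPressorTrial) -> float:
    """Fractional change in pain tolerance relative to the participant's baseline."""
    if trial.baseline_tolerance_s <= 0:
        raise ValueError("baseline tolerance must be positive")
    gain = trial.post_placebo_tolerance_s - trial.baseline_tolerance_s
    return gain / trial.baseline_tolerance_s


def is_placebo_responder(trial: ColdPressorTrial,
                         threshold: float = RESPONDER_THRESHOLD) -> bool:
    """True when tolerance rose by at least `threshold` relative to baseline."""
    return relative_change(trial) >= threshold


# Made-up values for illustration only (not study data).
trials = [ColdPressorTrial(40.0, 56.0), ColdPressorTrial(40.0, 44.0)]
responder_flags = [is_placebo_responder(t) for t in trials]
```

Under this reading, a participant whose tolerance rises from 40 s to 56 s would count as a responder, whereas one whose tolerance rises from 40 s to 44 s would not.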

#### Kaptchuk et al. (10): Competence and Warmth Moderate Placebo Treatment for IBS

Kaptchuk et al. (10) found that warm and competent patient–provider interactions increased patients' response to sham acupuncture administered over the course of 3 weeks to treat irritable bowel syndrome (IBS) (*N =* 262). Sham acupuncture uses a device that creates the appearance of having pierced the skin without actually doing so, in order to mimic the needles used during acupuncture. Patients were randomly assigned to either receive this sham acupuncture in a short interaction in which providers restricted their engagement with patients, or in an enhanced interaction in which providers engaged in additional conversation with patients and incorporated several verbal and non-verbal behaviors to improve the quality of the interaction. Placebo response was measured through self-reported improvement in IBS symptoms, self-reported adequate relief of IBS symptoms, self-reported symptom severity, and the self-reported degree to which the condition interfered with a patient's quality of life.

The enhanced interaction in this study (10) was designed to be "warm, empathetic, and confident" (p. 2), clearly covering the two dimensions of provider competence and warmth. As documented in **Table 5**, several verbal behaviors (e.g., stating that the provider has had much experience with the treatment) and non-verbal behaviors (e.g., pausing in thoughtful silence for 20 s during the procedure) may have evoked competence, and several verbal behaviors (e.g., making empathetic statements, using active listening and words of encouragement) may have evoked warmth, and some behaviors may have evoked both competence and warmth (e.g., asking additional questions about the patient's understanding of the treatment).

Patients who experienced the "enhanced" interaction reported greater relief and improvement in symptoms over the course of the 6-week study. Thus, the positive effects of placebo acupuncture were augmented by a more supportive interaction with a provider.

#### Fuentes et al. (13): Competence and Warmth Moderate Placebo Treatment for Chronic Low Back Pain

Fuentes et al. (13) used a similar protocol to Kaptchuk et al. (10) to enhance the interaction between therapists and patients with chronic low back pain who were randomly assigned to either undergo active interferential current therapy (IFC) or sham IFC (*N =* 117).

In one condition, patients experienced a limited interaction in which the provider left after briefly introducing themselves and explaining the treatment. Providers also mentioned that they had been instructed not to converse with participants and minimized discussion accordingly. In the "enhanced interaction" condition, providers used several verbal behaviors that may have enhanced perceived competence (e.g., asking patients additional questions about their symptoms), several that may have enhanced perceived warmth (e.g., active listening, making empathetic statements such as "I can understand how difficult this must be for you"), and several that may have targeted both (e.g., asking patients about their lifestyle and assessing their understanding of their condition). Enhanced interactions also employed several non-verbal behaviors that conveyed warmth, including a warmer tone of voice, increased eye contact, and incorporating physical touch into treatment.

The authors found that the enhanced interaction improved outcomes for both active and placebo treatment. As with Kaptchuk et al. (10), the enhanced interaction also involved providers spending more time with patients (5 min in the limited interaction and about 30 min in the enhanced interaction).

#### Howe et al. (11): Competence and Warmth Moderate Placebo Treatment for Allergic Reactions

The only study to date which has altered provider warmth and competence *independently* from each other in order to tease apart the dimensions was done by Howe and colleagues (11). In this study, healthy volunteers (*N =* 164) underwent a skin prick test using histamine, which was administered by a trained research assistant who acted as the provider. (Histamine causes a mild allergic reaction in which the skin becomes red, itchy, and a small bump called a "wheal" surfaces.) The provider then applied a placebo cream (moisturizer lotion) to the allergic reaction. This study also separated the qualities of the interaction from the expectations set about the placebo treatment. In the positive expectations condition, they stated that the cream was an antihistamine cream that would reduce the reaction and decrease itching. In the negative expectations condition, they stated that the cream was a histamine agonist that would increase the reaction and increase itching. Placebo/nocebo response was measured by the change in participants' wheal size (in mm) after the placebo cream was applied.

The same provider administered the cream to all participants, but was trained to interact with participants in one of four ways to evoke: (1) high warmth and high competence, (2) high warmth and low competence, (3) low warmth and high competence, or (4) low warmth and low competence. Competence was evoked through verbal manipulations (e.g., speaking confidently, minimizing filler words), non-verbal manipulations (e.g., executing all procedures flawlessly), and environmental manipulations (e.g., professional attire, room neat and clean). Warmth was also evoked through verbal manipulations (e.g., the provider introducing themselves and calling the participant by name), non-verbal manipulations (e.g., increased eye contact, sitting closer to participant), and environmental manipulations (e.g., hanging posters with warm images in the exam room). All conditions were the same length of time, thereby controlling for time interacting with the provider. Patients' self-reported ratings of the provider at the end of the exam suggested that perceived competence and warmth were substantially impacted through these simple changes, suggesting that perceptions of providers' warmth and competence are readily malleable.<sup>3</sup>
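For illustration, the factorial structure described above (two levels of warmth, two of competence, and two of expectation) can be enumerated as follows. This Python sketch assumes a balanced, shuffled round-robin assignment; the source does not describe the randomization procedure, so the function names, the seed, and the assignment scheme are all hypothetical.

```python
import itertools
import random
from collections import Counter

WARMTH = ("high warmth", "low warmth")
COMPETENCE = ("high competence", "low competence")
EXPECTATION = ("positive expectations", "negative expectations")


def design_cells():
    """All combinations of warmth, competence, and expectation (2 x 2 x 2 = 8 cells)."""
    return list(itertools.product(WARMTH, COMPETENCE, EXPECTATION))


def assign_balanced(participant_ids, seed=0):
    """Shuffle participants, then deal them round-robin into the cells.

    This is an assumed procedure for illustration, not the one used in the study.
    """
    cells = design_cells()
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: cells[i % len(cells)] for i, pid in enumerate(shuffled)}


def cell_sizes(assignment):
    """Number of participants assigned to each cell."""
    return Counter(assignment.values())


# 164 hypothetical participant IDs, matching the reported sample size.
assignment = assign_balanced(range(1, 165))
sizes = cell_sizes(assignment)
```

With 164 participants and eight cells, a balanced scheme of this kind yields cells of 20 or 21 participants each.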

The researchers found that competence and warmth moderated placebo and nocebo responses. When the provider appeared both competent and warm, participants who heard positive expectations about the cream showed a greater decrease in wheal size than participants who heard negative expectations about the cream. However, when participants had interacted with a provider who was low in warmth and low in competence, their wheal size continued to increase at the same rate regardless of whether or not the provider had set positive or negative expectations about the cream. Mixed conditions (i.e., high warmth/low competence and low warmth/high competence) produced moderate effects on the allergic reaction and were indistinguishable from each other.

<sup>3</sup>Effect sizes for the impact of the experimental alterations of competence and warmth on patient perceptions of providers indicated that the changes in provider behavior designed to evoke competence had a medium size effect on patient perceptions of provider competence, Cohen's *d* = 0.47, and the changes in provider behavior designed to evoke warmth had a large effect on patient perceptions of provider warmth, Cohen's *d* = 1.75.
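Effect sizes such as those reported in footnote 3 express a difference between two group means in standard deviation units. The Python sketch below shows one common way to compute Cohen's *d*, using a pooled standard deviation. The source does not state which variant of the formula was used, so the pooled form is an assumption, and the ratings are made-up values rather than study data.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant).

    d = (mean_a - mean_b) / pooled_sd, where pooled_sd combines the two
    sample variances weighted by their degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical warmth ratings on an arbitrary scale (not study data).
high_warmth_ratings = [6.0, 7.0, 7.0, 5.0, 6.0]
low_warmth_ratings = [4.0, 5.0, 3.0, 4.0, 4.0]
d = cohens_d(high_warmth_ratings, low_warmth_ratings)
```

A positive value indicates that the first group was rated higher than the second.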

This study disentangled precise dimensions of patient–provider relationships and found that warmth and competence shape participants' physiological responses to the expectations that a provider sets about treatment. An additional important take-away from this study is that neither warmth nor competence seemed to matter more than the other; rather, it was only when the two qualities worked together that they effectively created an overall interaction that boosted placebo effects.

#### Summary

Overall, these studies support the notion that a provider's competence and warmth are key dimensions that moderate placebo response: interactions in which a provider demonstrated both competence and warmth resulted in a greater response to placebo and active treatments. Thus, whether a provider "gets it" and "gets me" can affect the potency of a medical treatment. Accordingly, both of these dimensions constitute an important part of effective healthcare.

### WHAT ARE THE MECHANISMS THROUGH WHICH COMPETENCE AND WARMTH MODERATE PLACEBO RESPONSE?

The patient–provider relationship is frequently cited as a key mechanism of placebo effects in and of itself (10, 83, 85). As discussed in depth above, the patient–provider relationship assessed in placebo research clearly contains dimensions of both competence and warmth. However, the mechanisms through which a competent and warm patient–provider interaction might boost placebo response are unclear from past literature. We propose that provider competence and warmth increase overall placebo effects by boosting known placebo mechanisms, including a) expectations and b) classical conditioning (i.e., repeated associations between a medical stimulus, such as a pill, and the active drug inside the pill, which could lead to a conditioned response) (4, 100). By augmenting the impact of these known placebo mechanisms, provider warmth and competence then boost overall placebo response.

#### Competence and Warmth Amplify Patient Expectations About Treatment

A provider's competence and warmth make a provider more credible, believable, and/or persuasive (101), which may boost the impact of the expectations they set about treatment. A doctor who is competent (e.g., conducts a thorough exam, seems knowledgeable) will appear as a more reputable source of medical information. Thus, the patient may be more likely to internalize this competent doctor's message about a treatment's efficacy. Likewise, when a doctor is warm (e.g., is friendly, calls the patient by name), the patient may feel more relaxed, at ease, and like they are in good hands. The patient may then be more receptive to what the doctor has to say, view the doctor as trustworthy, and believe expectations set about the efficacy of treatment to a greater extent. A warm provider may also appear to better understand the patient, and thus enhance this patient's confidence that the provider has chosen a course of treatment that will work for them as an individual. Patients may thus listen to and trust explanations of warm and competent providers to a greater degree, and accordingly be more influenced by them physiologically (102–104).

Competent and warm providers may thus be better able to set specific, individualized expectations that are more meaningful, helpful, and relevant for patients. When expectations resonate with patients more, they increase healing to a greater degree (105). Similarly, competent and warm providers may also more effectively set expectations about patients' own role in their health management. For example, one study examining enhanced provider interactions included provider comments such as "You can really make a difference in your cold by taking care of yourself" (82, 83). Such a statement may have no potency if a provider seems to lack understanding of medicine and/or of a particular patient's needs and abilities, but may be particularly believable coming from a provider who is seen as competent and warm. As another example, warm and competent providers may also be more skilled at reassuring patients in the course of treatment by providing information clearly and confidently, and providing concern that seems authentic. This could positively impact patient expectations by, for example, resolving uncertainty (106, 107). Furthermore, a recent study shows that even without medication, physician reassurance can help patients feel better by reducing symptoms and speeding healing (108). Through such processes, competent and warm providers may more effectively leverage the healing that is evoked by setting patients' expectations about treatment.

#### Competence and Warmth Activate Conditioned Patient Responses

Competent and warm providers may more effectively leverage strategies that boost conditioned responses (109), including diagnostic rituals such as the physical exam. Further, competent and warm providers may simply *feel* more like a healer to the patient, thus leading the patient to experience greater conditioned responses. We thus theorize that warm and competent providers may activate conditioned patient responses because they are more effective at engaging in healing rituals that produce conditioned responses, and because patients may experience a greater conditioned response to these providers themselves.

It has been widely acknowledged that healing rituals can lead to conditioned placebo responses (10, 12, 100, 110). Even normal, everyday procedures that rely on only basic medical competence, such as taking a patient's height, weight, and blood pressure, can become conditioned stimuli for healing in a clinical context (105). However, there is likely great variation in how effectively different providers utilize healing rituals. Warmer, more competent providers may more effectively engage in rituals that produce conditioned healing responses in patients. For example, the physical exam may not only lead to more and better information with which to heal patients; it is also likely that the "laying of hands" in the physical exam is healing in and of itself (111–113). Likewise, research on the meaning of touch for patients with cancer found that nurses' touch conveyed confidence to these patients, and this confidence in turn increased positive patient expectations and hope of recovery (114). But touch can also be aversive for some patients; if a provider is not warm and competent, then these rituals could backfire. Providers who are competent and warm, that is, socially and emotionally skilled and able to quickly gauge what their patients prefer, may be better able to utilize medical rituals effectively, particularly rituals involving touch. Indeed, provider warmth and competence may be crucial in the success of these rituals, as these dimensions may be the difference between a ritualistic experience that boosts healing and one that is off-putting for the patient.

Research also supports the hypothesis that a competent and warm provider may activate or amplify conditioned patient responses. Some research suggests that providers who seemed more like an expert or fit certain stereotypes about a doctor were able to enhance response to a treatment regardless of whether they used a placebo or active acupuncture treatment (115). Providers who are competent and warm may thus seem more like a good doctor or a trustworthy expert, which could bolster a conditioned response to seeing such a provider. While participants in past research have been shown to display conditioned responses to doctors who better fit stereotypical images of doctors (i.e., White male doctors), as medicine grows ever-more diverse, aspects of the provider, such as warmth and competence, may rise up in place of physical attributes to produce conditioned responses in patients. We are not aware of any research that directly assesses the impact of provider competence and warmth on conditioning, and future research should investigate how qualities of the provider may amplify or otherwise influence the effects of conditioned healing.

#### Summary

We have proposed that competence and warmth play a key role in placebo effects by strengthening expectations and conditioning during medical treatment. Of course, being complex psychological phenomena, provider competence and warmth likely impact placebo response in many other ways, including by reducing stress and anxiety, increasing positive emotions, influencing physiology directly, and by beneficially impacting behavioral mechanisms such as adherence, motivation, and adoption of healthier behaviors (82, 83, 101, 116–123). Indeed, past research and theory have suggested that provider competence and warmth can set off a cascade of physiological changes in the body, including "endogenous neurotransmitters, hormones, and immune regulators that mimic the expected or conditioned pharmacological effects" (124). But given the known importance of expectations and conditioning for placebo effects and the attention paid to these mechanisms in the placebo literature (3), we have restricted our discussion to these mechanisms and encourage future research and theory on other mechanisms.

### HOW CAN PROVIDERS DELIBERATELY LEVERAGE COMPETENCE AND WARMTH IN CLINICAL CARE?

In order to leverage competence and warmth in healthcare, we need to first understand what these qualities look like from a patient perspective and how they might reasonably be enacted from a provider perspective. To this end, we asked both patients and providers to describe their healthcare experiences. Their responses capture patients' and providers' impressions of how competence and warmth can be demonstrated in clinical encounters.

#### Provider Competence and Warmth From a Patient Perspective

To find out what provider competence and warmth look like to patients and how providers might embody this in real-world settings, we asked participants to describe healthcare experiences in open-ended responses.

Participants first answered two questions in which they imagined what positive qualities and behaviors a good doctor would demonstrate:

1. What good qualities would this doctor have?
2. What good things would this doctor do?


Then, participants reflected on their own experiences. Participants first responded yes or no to whether they had ever seen a good doctor, and yes or no to whether they had ever seen a bad doctor. If respondents answered yes to one or both questions, they were asked, respectively:

1. What was good about this doctor?
2. What was bad about this doctor?


These questions allowed us to assess qualities and actions drawn from both patients' own positive or negative interactions with providers and patients' ideal interactions with providers.

In total, 334 American participants between age 25 and 87 (51.2% women, *M*age = 43.10, *SD*age = 14.09) responded to the survey, which was administered by Survey Sampling International (SSI). Participants came from a variety of racial/ethnic backgrounds [29.6% White/Caucasian, 24.9% Asian/Pacific Islander, 23.4% Black/African-American, 22.2% Hispanic/Latino(a)] and socioeconomic backgrounds (41.0% college education, 28.8% some college education, 21.0% high school or less). Detailed survey methods are described in previous publications (125).

Following similar procedures to previous research (125), the authors generated a coding scheme including five categories related to a provider's competence and four categories related to a provider's warmth (see **Table 9** for a description and examples of each category).

#### TABLE 9 | Competence and warmth demonstrations and examples from patients.

Category: Competence/"gets it": *Related to a provider's effectiveness at diagnosing and treating disease/symptoms of disease and encouraging healthy habits*

| Subcategory | Description | Examples |
| --- | --- | --- |
| "Medically knowledgeable" (general knowledge) | The doctor is medically knowledgeable, knows current research and practices, intelligent, well-educated. | *What good qualities would this doctor have?* "Good education"; "Up to date with newer medical studies"<br>*What good things would this doctor do?* "Be smart"; "Keep me well informed about newest developments"<br>*What was good about this doctor?* "He had a good knowledge of his field."; "Gives proper treatment"<br>*What was bad about this doctor?* "Incompetent"; "Could not explain the importance of a balanced nutrition" |
| "Keeps at it" (thoroughness) | The doctor has an attention to detail, is thorough, covers all alternatives, has a good work ethic. | *What good qualities would this doctor have?* "Looks at any and all alternatives"; "Would honestly do everything he can to help me"<br>*What good things would this doctor do?* "Check me out thoroughly"; "Would follow-up on small concerns"<br>*What was good about this doctor?* "Attention to detail"; "She was thorough."<br>*What was bad about this doctor?* "Very rushed"; "Not interested in your illness, just what prescriptions do you need" |
| "Understands my health" (patient-specific medical knowledge) | The doctor knows your health history, has experience with patients like you (e.g., demographically, or with particular conditions). | *What good qualities would this doctor have?* "Experience treating similar people"; "Know my medical record for all appointments"<br>*What good things would this doctor do?* "Know your body, habits, and family history"; "Personalized patient care"<br>*What was good about this doctor?* "Knows about my health"; "Knew our family history"<br>*What was bad about this doctor?* "Never had a patient who exhibited similar symptoms"; "Ignoring available information about my history" |
| "Has seen it" (experience) | The doctor has a lot and/or a variety of medical experience, has been practicing medicine for many years, has seen a lot of patients and treated a lot of medical conditions generally, knows their skill set/limitations. | *What good qualities would this doctor have?* "Very experienced"; "Not attempt any treatment beyond that which he is skilled"<br>*What good things would this doctor do?* "Know the area of his practice"; "Refers you to specialist as needed"<br>*What was good about this doctor?* "He knows what he's talking about."; "If can't help, finds someone who can" |
| "Walks the walk" (role modeling) | The doctor maintains their own physical and mental health. | *What good qualities would this doctor have?* "Practices a healthy lifestyle themselves"; "A great role model"<br>*What good things would this doctor do?* "Eat healthy"<br>*What was good about this doctor?* "Practiced what he preached" |

Two research assistants who were blind to hypotheses coded a randomly selected 20% of participant responses (*N =* 67 each) by indicating whether participants mentioned this category (1) or did not mention this category (0) for each of the four questions. Coders first coded 20% of the responses and then discussed and reconciled any discrepancies before coding the 80% of responses (*N =* 100 each). Inter-rater agreement before coders began coding the full sample was acceptable (Cohen's kappas > 0.70 for all categories). Data and scripts for analysis are provided at https://osf.io/5jxqy/.
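The inter-rater agreement check described above can be sketched in a few lines of Python. This is not the authors' analysis script (their scripts are at the OSF link above); the function name and the example codes are hypothetical. The sketch simply applies the standard definition of Cohen's kappa, (observed agreement − chance agreement) / (1 − chance agreement), to two coders' binary codes (1 = category mentioned, 0 = not mentioned).

```python
ACCEPTABLE_KAPPA = 0.70  # threshold reported in the text


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who rated the same responses."""
    coder_a, coder_b = list(coder_a), list(coder_b)
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must rate the same non-empty set of responses")
    n = len(coder_a)
    # Proportion of responses on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal label frequencies.
    labels = set(coder_a) | set(coder_b)
    expected = sum((coder_a.count(lab) / n) * (coder_b.count(lab) / n) for lab in labels)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is perfect
    return (observed - expected) / (1.0 - expected)


# Made-up codes for ten responses on a single category (not study data).
codes_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
codes_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(codes_a, codes_b)
is_acceptable = kappa > ACCEPTABLE_KAPPA
```

In this made-up example the coders agree on nine of ten responses while chance agreement is 0.5, so kappa works out to 0.8 and would clear the 0.70 threshold.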

**Table 9** depicts the different ways patients have experienced various forms of competence and warmth in their interactions with providers. These data illustrate that there is a rich variety of ways in which providers can demonstrate competence and warmth to their patients. Of course, providers do not need to embody all of these qualities or perform all of these actions. **Table 9** is not meant to be a checklist for effective medical care, but rather a rolodex of possible tools providers could employ to bolster competence and warmth. Ultimately, what appears to matter for healthcare is that patients perceive a provider as "getting it" and "getting me," and there are many routes to these same ends.

#### Competence and Warmth in Providers' Own Words

In addition to patient perspectives, we turned to medical providers to understand what competence and warmth actually look like in clinical practice. During focus groups in four Primary Care clinics, care team members were asked to generate ways they signal competence and warmth to patients. We collected responses from approximately 100 care team members, including physicians, medical assistants, nurses, and clinic staff.

Responses were collected during a larger training session, which also explained competence and warmth in the "gets it" and "gets me" framework. Providers were then asked: "How do you signal to patients that you get both 'it' and 'them'?" Providers listed at least one example of how they signal competence to patients (getting "it") and at least one example of how they signal warmth to patients (getting "them"). Providers' responses were coded and grouped into thematically similar strategies. **Table 10** lists the overarching strategies that emerged from providers' responses, and displays exemplary quotes for each category in providers' own words.

Importantly, as with **Table 9**, **Table 10** is not meant to suggest that providers adopt all of these strategies. Rather, **Table 10** suggests a multitude of ways in which providers could bolster patient perceptions of competence and warmth, allowing providers to flexibly choose strategies that resonate with them and/or their patients' needs. Providers' responses span a wide range of behaviors, suggesting that everyone on the care team can bring their own unique strengths to signaling competence and warmth in clinical encounters. Critically, since these responses were generated from all members of the care team, they encompass ways each person in a healthcare clinic could signal competence and warmth to patients, whether their role is as a physician interacting with patients intimately or a scheduler who only interacts with patients by phone. Providers can thus take away from these responses what is most useful and actionable for them given the particular demands and resources of their healthcare context.

While some of these behaviors are basic, intuitive practices (e.g., eye contact), others require the cooperation of multiple medical team members (e.g., consistent messaging to patients). Some require greater investments of time and effort, such as researching personalized treatments beforehand and asking patients about their concerns. However, there are also many strategies that require only intention, not additional time, such as calling patients by name, greeting them warmly, and projecting confidence. Further, even the more effort-intensive demonstrations of competence and warmth may save providers time in the long term by fully addressing patients' needs.

### SUMMARY AND FUTURE DIRECTIONS

By framing patient–provider interactions in terms of provider competence and warmth, we have capitalized on decades of research in social perception to begin to unpack how and why patient–provider interactions can boost placebo response. We have also begun to identify ways providers can leverage competence and warmth to deliberately increase the strength of placebo response. The competence/warmth framework simplifies the complex patient–provider interaction, organizing dozens of behaviors and qualities into two key dimensions that can be bolstered through a variety of routes. It thus suggests to clinicians and researchers alike what to focus on to enhance patient–provider interaction quality and suggests many practical ways to leverage the power of the patient–provider relationship to boost placebo effects. It is our hope that the framework of competence and warmth will provide researchers and practitioners alike with a theoretical grounding from which to understand what aspects of the patient–provider interaction are most critical for improving various outcomes of medical care.

Further, we have illustrated how this framework is present in both placebo and medical literature, as evident in the way studies alter patient–provider interactions and how patient–provider interactions are assessed. This framework thus unites literature on social perception, placebo research, and medical research. In addition, considering the influence of competence and warmth could help generate novel ideas about the mechanisms through which patient–provider interactions may boost placebo effects. We have proposed that competence and warmth make a provider seem more credible and foster patients' belief in them and their statements, and thus enhance the impact of treatment expectations. We have also proposed that a provider's competence and warmth strengthen conditioned responses to providers and to medical rituals. There are a variety of other possible mechanisms through which a provider's competence and warmth may influence placebo effects and patient health more broadly (e.g., reducing anxiety).

It is likely that the qualities of competence and warmth foster other benefits in patient–provider interactions beyond enhancing patients' placebo response. For example, a provider's competence and warmth may establish trust between patients and providers. Indeed, competence and warmth emerge as core dimensions in literature on the social perception of trust (126, 127). Prerequisites of trust include *ability*, or "skills, competencies, and characteristics that enable a party to have influence within some specific domain," and *benevolence*, or "the extent to which a trustee is believed to want to do good to the trustor, aside from an egocentric profit motive" (127), dimensions that also map onto the competence/warmth framework. The possible relationship between competence, warmth, and trust in the healthcare context should be explored. Focusing on showcasing competence and warmth to patients could offer providers a more tangible route through which to establish trust than abstract recommendations to "get patients to trust you." Demonstrations of competence and warmth may be especially important for building trust in cross-race, cross-gender, and cross-socioeconomic status interactions, where trust may be absent or more challenging to build.

The guiding framework of competence and warmth inspires many open questions and serves as a guide for future research. One question is the degree to which competence and warmth are separable in medicine. A recent study found that behaviors often used to cultivate perceptions of warmth (e.g., eye contact) bolstered perceptions of *both* warmth and competence (128). In a medical context, perhaps especially when patients are anxious about very personal concerns, "getting me" may be critical to whether a provider seems to "get it." Likewise, the degree to which signals of warmth and competence *via* verbal, *vs*. non-verbal, *vs*. environmental cues evoke perceptions of these qualities is an open question. In addition, the universality of different experimental manipulations of warmth and competence is uncertain. For example, Kraft-Todd and colleagues (128) found that a provider wearing a white coat did not enhance perceptions of their competence; indeed, evidence on whether professional attire affects perceptions of competence is largely mixed (129–132). Another interesting question for future research is whether the impact of more general qualities of warmth (e.g., general friendliness, eye contact) and competence (e.g., general medical knowledge, articulateness) differs from the impact of patient-specific qualities of warmth (e.g., asking a patient questions about their personal life) and competence (e.g., demonstrating knowledge of a patient's family history) (see examples in **Table 1**).

#### TABLE 10 | Competence and warmth strategies and examples from the healthcare team.

Category: Competence/"gets it": *Related to a provider's\* effectiveness at diagnosing and treating disease/symptoms of disease; a provider's understanding of diagnosis, prognosis, and treatment*

*\*For this table, "providers" refers to the entire care team at several Primary Care clinics, and thus includes physicians, medical assistants, nurse practitioners, front desk staff, behavioral health specialists, and pharmacists.*

While we have proposed that warmth and competence work in conjunction to promote healing, certain contexts, patients, and circumstances may render either warmth or competence more impactful. Cultural expectations and individual personalities or desires likely play a role both in whether patients value warmth or competence more and in how patients prefer their providers to express warmth and competence (133–135). For example, some of the behaviors patients and providers associated with warmth reviewed in this paper (e.g., calling a patient by their first name) may backfire in other cultural contexts. Different medical problems may also lend themselves more to warmth or competence; warmth might be especially important when dealing with a chronic illness that needs to be managed over time, while competence may be seen as more critical during surgery and for setting broken bones (136).

Regarding questions about the role of patient–provider relationships in placebo effects, the greatest need seems to be for rigorous research that separates the impact of provider interaction style (i.e., providers who are competent and/or warm) from the impact of explicitly set positive expectations. Future studies could help unpack whether and how provider competence and warmth boost the impact of expectations, as well as how setting expectations might boost patient perceptions of provider warmth and competence. This article hypothesizes mechanisms for how provider warmth and competence can boost placebo response, but future empirical research is needed to assess the validity of these hypotheses in research and clinical practice.

We hope that understanding and leveraging the competence and warmth framework will allow us to better address some of the most pressing problems in healthcare. For example, a wealth of literature suggests that minority populations in the U.S. have worse health outcomes (137). Recent authors suggest that differences in placebo response may be at least partially responsible for some of these disparities (138). Deliberately and effectively leveraging warmth and competence could potentially help healthcare providers diminish these gaps. Particularly as research suggests that cultural or racial matches between providers and patients lead to improved healthcare outcomes, warmth and competence may be one way to bridge the divide between providers and patients of different cultural, racial, and socioeconomic backgrounds, as it remains unfeasible to ensure that each patient is seen by a provider who matches his or her cultural background (135). Future research could explore these exciting possibilities.

It is our hope that the theory outlined in this article will spur novel research in these areas. Understanding how, when, and why provider qualities such as warmth and competence boost placebo response will not only further our comprehension of placebo effects, but will also help the medical field deliberately harness important mechanisms of placebo response that can be taken advantage of ethically alongside active medication and treatment. By distilling the complex qualities and behaviors of effective healthcare providers into warmth and competence, we hope this framework can help researchers and practitioners alike to more clearly understand how to practically and purposefully leverage the patient–provider relationship to boost placebo effects and improve healing.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Stanford University Institutional Review Board. The protocol was approved by the Stanford University Institutional Review Board. The Stanford University Institutional Review Board waived the need for written informed consent from participants.

### AUTHOR CONTRIBUTIONS

LH and KL analyzed the data and drafted the manuscript and AC provided critical revisions. All authors listed have made a substantial, direct, and intellectual contribution to the work, and approved it for publication.

### FUNDING

AC is supported by NIH/NCCIH Grant #DP2AT009511. AC and LH are supported by a grant from the Robert Wood Johnson Foundation. KL holds a Stanford Interdisciplinary Graduate Fellowship—Anonymous Donor.

### ACKNOWLEDGMENTS

The authors would like to thank Michelle Chang, Isaac Handley-Miner, Matthew Bernstein, and Rina Horii for their contributions to the manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2019.00475/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Howe, Leibowitz and Crum. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The Knowledge of Contextual Factors as Triggers of Placebo and Nocebo Effects in Patients With Musculoskeletal Pain: Findings From a National Survey

*Giacomo Rossettini<sup>1</sup>, Alvisa Palese<sup>2</sup>, Tommaso Geri<sup>1</sup>, Mattia Mirandola<sup>1</sup>, Fabio Tortella<sup>1</sup> and Marco Testa<sup>1</sup>\**

*<sup>1</sup> Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Maternal and Child Health, University of Genova, Savona, Italy, <sup>2</sup> Department of Medical Sciences, University of Udine, Udine, Italy*

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany*

#### *Reviewed by:*

*Sven Benson, Essen University Hospital, Germany; Andrea Lovato, University of Padova, Italy; Nicole Corsi, Istituto Di Ricerche Farmacologiche Mario Negri, Italy*

*\*Correspondence: Marco Testa, marco.testa@unige.it*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 27 December 2018; Accepted: 18 June 2019; Published: 04 July 2019*

#### *Citation:*

*Rossettini G, Palese A, Geri T, Mirandola M, Tortella F and Testa M (2019) The Knowledge of Contextual Factors as Triggers of Placebo and Nocebo Effects in Patients With Musculoskeletal Pain: Findings From a National Survey. Front. Psychiatry 10:478. doi: 10.3389/fpsyt.2019.00478*

Background: Contextual factors (CFs) have been recently proposed as triggers of placebo and nocebo effects in musculoskeletal pain. CFs encompass the features of the clinician (e.g. uniform), patient (e.g. expectations), patient–clinician relationship (e.g. verbal communication), treatment (e.g. overt therapy), and healthcare setting (e.g. design). To date, little is known about Italian patients' knowledge of the role of CFs in musculoskeletal pain.

Objectives: The aim of this study was to investigate attitudes and beliefs of Italian patients with musculoskeletal pain about the use of CFs in clinical practice.

Methods: A national sample of Italian patients with musculoskeletal pain was recruited from 12 private outpatient clinics in Italy. An invitation to participate in an online survey was sent to patients who: a) had musculoskeletal pain; b) were aged 18–75; c) had a valid e-mail account; and d) understood the Italian language. SurveyMonkey software was used to deliver the survey. The questionnaire was self-reported and included 17 questions and 2 clinical vignettes on the patients' behavior, beliefs, and attitudes towards the adoption of CFs in clinical practice. Descriptive statistics and frequencies described the actual number of respondents to each question.

Results: One thousand one hundred twelve patients participated in the survey. Five hundred seventy-four participants were female (52%). The average age of patients was 41.7 ± 15.2 years. Patients defined CFs as an intervention with an unspecific effect (64.3%), but they believed in their clinical effectiveness. They identified several therapeutic effects of CFs for different health problems. The use of CFs was considered ethically acceptable when it exerts beneficial psychological effects (60.4%), but unacceptable when it is based on deception (51.1%). During clinical practice, patients wanted to be informed about the use of CFs (46.0%), which they accepted as an addition to other interventions to optimize clinical responses (39.3%). Moreover, patients explained the power of CFs through mind–body connections (37.1%).

Conclusion: Patients with musculoskeletal pain had positive attitudes towards the use and effectiveness of CFs when associated with evidence-based therapy. They mostly perceived the adoption of CFs in clinical practice as ethical.

Keywords: placebo effect, nocebo effect, pain, musculoskeletal, survey, conditioning, learning, expectation

#### INTRODUCTION

Placebo and nocebo effects represent an emerging area of interest in musculoskeletal treatment. In this field, for several years, researchers have considered placebo and nocebo as incidental elements to be controlled for in randomized controlled trials aimed at isolating the specific effect of a treatment (1). However, in recent decades, the modern neurobiological perspective has conceptualized placebo and nocebo effects as results of the psychosocial context surrounding every healthcare intervention, capable of influencing patients' pain (2).

Placebo effects are the beneficial result of a patient's exposure to a positive context (3), while nocebo effects are adverse consequences of a patient's interaction with a negative context (4). Expectations and conditioning are the main psychological mechanisms underlying placebo and nocebo effects, although social learning and mindset theories have also been put forward as explanations of their existence and functioning (5–7). From a neurobiological perspective, the release of specific neurotransmitters is associated with the exposure to specific contexts: endogenous opioids, dopamine, cannabinoids, oxytocin, and vasopressin have been observed in positive contexts, while opioid and dopamine deactivation and cholecystokinin and cyclooxygenase-prostaglandins activation were observed in negative contexts (8–11). Moreover, different contexts can modulate neural pathways involved in the descending control of pain, influencing the activity of the anterior cingulate cortex, dorsolateral prefrontal cortex, periaqueductal grey, and spinal cord (12–16).

The context is composed of several therapeutic signs, symbols, metaphors, and healing rituals (17, 18), called Contextual Factors (CFs), that inform patients about the value and meaning of the treatment delivered and can influence their healthcare experience by triggering placebo and nocebo effects (19). The therapeutic encounter is strongly characterized by CFs such as a) the clinician's beliefs and behaviors; b) the patient's expectations and previous experiences; c) the colour and shape of the intervention; d) the verbal and non-verbal elements of communication; and e) the ornaments and colour of the healthcare setting (2). A robust body of evidence informs clinicians about the positive impact of CFs on therapeutic outcomes such as pain, disability, satisfaction, and perceived quality in different healthcare fields such as medicine, nursing, physiotherapy, and musculoskeletal and neurological rehabilitation (2, 20–23). As a consequence, a recent expert consensus suggested the adoption of CFs to stimulate placebo effects and to avoid nocebo effects, thus increasing the overall effectiveness of established evidence-based interventions (24).

From a clinical perspective, the patient's point of view about CFs has been proposed as a central line of investigation (25). Up to now, qualitative and quantitative research has investigated participants' points of view on placebo using focus groups (26, 27), interviews (28–30), and surveys (31–39–44). Studies have been performed in different parts of the world such as the US (28, 29, 32–33, 34, 38, 39), Asia (27, 42, 43), Australia (36), and Europe (26, 30, 31, 35, 37, 40, 41, 44), involving healthy subjects (26, 27, 32, 34, 41–43) and patients with acute/chronic health conditions (30, 36–39, 40, 44), depression (43), irritable bowel syndrome (28, 29), and rheumatic and musculoskeletal pain (31, 33, 35). Overall, findings revealed a) a heterogeneous understanding of placebo effects, ranging from limited (27, 32, 33, 35, 36, 40) to well-expressed knowledge (30, 31, 37, 39); b) a dualistic conceptualization of placebo effects, as a beneficial element to be legitimized or as ineffective (26, 28); and c) an open vision about placebos in clinical practice, with deception and the lack of informed consent emerging as the major ethical issues of their use (27, 30, 32–38, 39, 44). However, cultural differences and the variety of definitions of "placebo treatment" adopted have hindered the development of a coherent body of evidence, and more research is required in the field (25, 39), particularly in Italy, where no studies have investigated the attitudes and beliefs of patients towards CFs.

Moreover, among the many chronic conditions that greatly affect patients' quality of life, musculoskeletal pain medicine represents an interesting and open field of investigation, given its high frequency and the pervasiveness of CFs in this field (2). Aligned with this vision, the aims of our study were to explore: a) the clinical behaviors, b) the definition, c) the beliefs, d) the ethical concerns, e) the communication implications, f) the circumstances of application, and g) the mechanisms of action of CFs in a nationwide sample of Italian patients with musculoskeletal pain.

#### MATERIALS AND METHODS

#### Design

A quantitative, web-based, cross-sectional survey was performed and is reported here in accordance with the Checklist for Reporting Results of Internet E-Surveys (CHERRIES) guidelines (45) and STrengthening the Reporting of OBservational Studies in Epidemiology (STROBE) (46). The Liguria Clinical Experimental Ethics Committee (P.R.236REG2016, accepted on 19/07/2016) approved the present study.

#### Participants and Setting

A national sample of Italian patients with musculoskeletal pain was recruited from 12 private outpatient clinics located in different regions of Italy (North, *n* = 4; Centre, *n* = 4; South, *n* = 4) between May and August 2018.

**Abbreviations:** CFs, Contextual Factors; EQI, EuroQol Index; CHERRIES, Checklist for Reporting Results of Internet E-Surveys; STROBE, STrengthening the Reporting of OBservational Studies in Epidemiology.

Managers of each clinic provided the list of patients recruited for this survey to the principal investigator. The patients were included/excluded in accordance with the physician's judgement based on the defined criteria. The inclusion criteria were as follows: a) age between 18 and 75 (38, 39); b) being currently affected by musculoskeletal pain due to either acute traumatic events (e.g., a fracture) or chronic complaints (e.g., overuse) (47); c) having a valid e-mail account; d) good understanding of the Italian language (33); and e) a EuroQol Index (EQI) < 1. The EQI has values ranging from 0 (worst) to 1 (best) and was calculated using the specific normative data of the Italian population (48). The EQI was calculated starting from the answers given in the EuroQol 5-dimensional scale (EQ-5D-3L), that is, a descriptive system composed of five closed three-level single answer questions, exploring mobility, self-care, usual activities, pain/discomfort, and anxiety/depression domains. Patients affected by cancer or by non-musculoskeletal cause of pain (e.g. neuropathic pain) (33) were excluded.
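The inclusion and exclusion criteria above can be read as a simple screening rule. The following Python sketch is purely illustrative: the `Candidate` fields and the `is_eligible` function are hypothetical names introduced here, not part of the study's procedures, in which eligibility was judged by the physician.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one patient on a clinic's list."""
    age: int
    has_musculoskeletal_pain: bool
    has_valid_email: bool
    understands_italian: bool
    eqi: float  # EuroQol Index, from 0 (worst) to 1 (best)
    has_cancer: bool = False
    has_non_musculoskeletal_pain: bool = False


def is_eligible(c: Candidate) -> bool:
    """Apply inclusion criteria a) to e) and the two exclusion criteria."""
    included = (
        18 <= c.age <= 75
        and c.has_musculoskeletal_pain
        and c.has_valid_email
        and c.understands_italian
        and c.eqi < 1.0
    )
    excluded = c.has_cancer or c.has_non_musculoskeletal_pain
    return included and not excluded


example = Candidate(age=40, has_musculoskeletal_pain=True,
                    has_valid_email=True, understands_italian=True, eqi=0.85)
print(is_eligible(example))
```

Keeping the inclusion and exclusion checks in separate expressions mirrors how the criteria are listed in the text, which makes the rule easier to audit against the protocol.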

The number of eligible people who responded to the survey was 1,112. With this sample size, estimates were expected to fall within about 3 percentage points (0.03 as a proportion) of the true population value at a 95% confidence level, assuming simple random sampling and a population proportion of 50% (49).
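As a rough check on this figure, the margin of error for a proportion under simple random sampling can be approximated with the normal (Wald) formula $z\sqrt{p(1-p)/n}$. The sketch below assumes this formula and $z = 1.96$ for a 95% confidence level; the source cites reference (49) for its calculation without spelling out the method, so the function name and the approach are illustrative assumptions.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    n: sample size; p: assumed population proportion;
    z: standard normal quantile (1.96 for a 95% confidence level).
    """
    return z * math.sqrt(p * (1.0 - p) / n)


moe = margin_of_error(1112)   # sample size reported in the study
print(f"{moe:.4f}")           # prints 0.0294, i.e. about 3 percentage points
```

Setting `p = 0.5` gives the widest possible interval for a given `n`, which is why it is the conventional conservative choice.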

#### Questionnaire Development and Pre-Testing

A survey instrument comprising questions and clinical vignettes was developed by adapting a previous survey on CFs conducted by our research group among Italian physical therapists and nurses (50, 51). The research group linguistically adapted the questions and clinical vignettes to make them easier for patients to understand and answer. Throughout the questionnaire, the word "placebo" was avoided in favor of the term "contextual factors," with the aim of increasing the number of responses (26, 50–52).

The initial list comprised 22 questions and 2 clinical vignettes, which were critically appraised for face and content validity (53) by a panel of seven experts with wide experience in placebo and survey design (a psychologist, a nurse, and five physical therapists). The experts checked the list independently, providing feedback on content accuracy, relevance, wording clarity, and survey structure. Following the feedback received, some adjustments were made and the number of questions was reduced from 22 to 17 because of overlap and redundancy.

Once the experts reached consensus on the final questionnaire, a preliminary version of the survey, composed of 17 questions and 2 clinical vignettes, was piloted in a convenience sample of 45 patients with musculoskeletal pain from different Italian regions (North, *n* = 15; Centre, *n* = 15; South, *n* = 15) (54).

After the pilot, a telephone debriefing session was performed (53). Experts interviewed the convenience sample of patients about the possible problems encountered during the survey (e.g. recognizing questions that needed additional explanation, wording that was hard to read or that participants found unclear). The outcome of the pilot phase offered the opportunity to reword three items (regarding ethics, communication, and mechanism of action) and to improve the readability of the entire survey.

#### Questionnaire Implementation

The self-administered questionnaire (**Supplementary file 1** – English version, **Supplementary file 2** – Italian version) adopted in this study was divided into three sections (A, B, C), which used both open-ended and closed multiple-choice questions (55).

Section A investigated the socio-demographic variables using six questions (age, sex, geographical region, social status, workplace, and education). Three closed multiple-choice single answer questions explored the features of musculoskeletal pain (anatomical location, time of onset, and intensity using Numeric Rating Scale 0–10) (56).

In Section B, two clinical vignettes were presented as two closed multiple-choice questions with, respectively, single and multiple answers:


Section C comprised eight closed questions. Three closed multiple-choice, single-answer questions investigated the definition of CFs ("How would you define the therapeutic role of CFs?"), the participants' belief in CFs (Likert scale from 0 "not at all" to 4 "a lot of"), and the potential beneficial effects of CFs ("What are the potential effects of CFs in the following health problems?"). Moreover, five closed multiple-choice, multiple-answer questions explored the ethical implications perceived in adopting CFs (e.g. "The use of CFs for therapeutic purposes can be considered ethically acceptable when…."), communication implications about CFs ("How do you communicate to the patient the use of CFs at the end of treatment?"), the circumstances under which they are applied ("Under what circumstances would you use CFs?"), and the possible mechanisms of action ("What mechanism of action can explain the effect of CFs?").

### Data Collection Procedure

The SurveyMonkey online survey tool (SurveyMonkey, Palo Alto, California, www.surveymonkey.com) was adopted to administer the questionnaire. The survey was disseminated over a 12-week period between 18th May 2018 and 18th August 2018. Participants were contacted using the mailing lists of the 12 private outpatient clinics (55). An email was distributed that included the survey link (https://it.surveymonkey.com/r/contestopazientiitalianimsk) and a brief note outlining a) the aim of the study, b) data handling (anonymity), c) the informed consent statement, and d) the invitation to complete the survey. More specifically, the statement in the email informed recipients that, by clicking on the survey link, they were providing their consent to participate in the study (55). Moreover, an operational definition of CFs was provided to introduce participants to the topic, thus avoiding misinterpretation (30, 35–39): "CFs represent a series of relational or environmental situations capable of influencing the perception of your healthcare condition. Examples of CFs are: the words and posture used by the clinician, the smells, the sounds, and the furnishing of the therapeutic setting" (2).

Three email reminders were sent 4, 8, and 12 weeks after the initial contact to encourage those who had not yet taken part in the survey to complete it. The time required to complete the survey was 10–15 min (12 min on average), in line with the optimal time required to increase response rates in online surveys (57). Participation was voluntary, and no incentives were offered to participants (55). Due to forced response validation, participants were required to answer all questions to prevent missing data (58). Participants were able to review or change responses using a back button before getting to the end of the questionnaire. At the end of the survey, a summary of the answers was provided to the participants (55). Data were copied and stored on an encrypted computer, and only the project leader could access the information obtained in all stages of the study (55). Participants' identities remained concealed from researchers; all data were anonymized (names and mail addresses) to ensure confidentiality and data protection and to avoid psychological harm (55).

#### Data Analysis

Survey data were downloaded from SurveyMonkey into .xls format and reviewed for data quality.

For descriptive statistics, continuous variables were reported using mean and standard deviation (SD). The five response options for the domain beliefs about CFs were also analyzed with mean and SD in order to have an average distribution of each single belief. Dichotomous, nominal, and ordinal variables, coming from single answer questions, were described using absolute and relative frequencies. Intervals of the observed estimates were calculated with a 95% confidence level (95%CI). For the questions with multiple answers, the absolute and relative frequencies were calculated for each combination of responses given by each participant. For example, considering that the fields (*n*) asked in the domain "Non-ethic" were three with dichotomous responses (*r*), we did not calculate the absolute frequency of the three possible fields, but of their eight combinations, given by the formula $r^{n}$, to better describe the groups of participants giving multiple answers present in the population.
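The combination counting described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' analysis code (the study used R): the function names and the toy responses are invented here, and each response is assumed to be coded as a tuple of 0/1 values, one per field.

```python
from collections import Counter
from itertools import product


def all_combinations(n_fields, levels=(0, 1)):
    """Enumerate every possible answer pattern: len(levels) ** n_fields."""
    return list(product(levels, repeat=n_fields))


def combination_frequencies(responses):
    """Absolute and relative frequency of each answer pattern.

    Patterns that no participant selected are kept with a count of zero.
    """
    counts = Counter(tuple(r) for r in responses)
    total = len(responses)
    n_fields = len(responses[0])
    return {
        combo: (counts.get(combo, 0), counts.get(combo, 0) / total)
        for combo in all_combinations(n_fields)
    }


# Toy data (hypothetical): four participants, three dichotomous fields.
toy_responses = [(1, 0, 0), (1, 0, 0), (1, 1, 0), (0, 0, 1)]
frequencies = combination_frequencies(toy_responses)

print(len(all_combinations(3)))   # 8 patterns for r = 2, n = 3
print(frequencies[(1, 0, 0)])     # (2, 0.5)
```

Counting whole patterns rather than individual fields keeps participants who ticked several options together in one group, which is the stated purpose of the approach.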

The association between individual characteristics (section A of the survey) and the single-choice responses given in sections B and C was investigated with Cramer's V, a measure of the strength of association derived from the chi-square statistic. The chi-square test itself was not used to analyze differences because its significance depends on the sample size. For this purpose, age was transformed into an ordinal variable with one level per decade. Only correlation values above the acceptance threshold of 0.60 were reported.
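For readers unfamiliar with the statistic, Cramér's V for a contingency table is $V = \sqrt{\chi^2 / (N(k-1))}$, where $N$ is the total count and $k$ is the smaller of the number of rows and columns. The sketch below is a plain-Python illustration under the assumption that every row and column total is non-zero; it is not the authors' R code, and the function name and example tables are hypothetical. Only the 0.60 threshold comes from the text.

```python
import math

REPORT_THRESHOLD = 0.60  # acceptance threshold stated in the text


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


independent = [[10, 10], [10, 10]]   # no association
perfect = [[20, 0], [0, 20]]         # complete association

print(cramers_v(independent))  # 0.0
print(cramers_v(perfect))      # 1.0
print(cramers_v(perfect) > REPORT_THRESHOLD)
```

Because V is scaled by the sample size, it stays comparable across tables of different sizes, unlike the chi-square p-value.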

Data analysis was handled using R software (59) and the psych (60) and ggplot2 (61) packages.

### RESULTS

#### Participants' Characteristics

The majority of patients (*n* = 574; 51.6%; 95%CI 48.6–54.6) were female; their average age was 41.7 ± 15.2 years. 43.9% of participants (*n* = 488; 95%CI 40.9–46.9) were living in the North of Italy at the time of the survey.

Just over half of participants (50.3%) were high school graduates (*n* = 559; 95%CI 47.3–53.2); most were employed (*n* = 755; 67.9%; 95%CI 65.0–70.6), with 14.7% working in intellectual, scientific, and highly specialized professions (*n* = 164; 95%CI 12.7–17.0).

Participants reported musculoskeletal pain principally located in the cervical spine and head region (*n* = 258; 23.2%; 95%CI 20.8–25.8). They had been suffering from pain for >6 months (*n* = 563; 50.6%; 95%CI 47.6–53.6) with a mean level of severity of 4.9 out of 10 (95%CI 4.8–5.0). The EQI presented a mean of 0.85 out of 1 ± 0.12.

The respondents' demographics are described in **Table 1**.

### Clinical Vignette 1

The most frequently chosen solution to the first vignette was "to suggest the possibility of delivering massage if the clinical condition fails to improve" (*n* = 525; 47.2%; 95%CI 44.2–50.2). The least frequent answer was instead to "try to convince the patient of the uselessness of massage" (*n* = 79; 7.1%; 95%CI 5.7–8.8). An overview of the data is reported in **Figure 1**.

### Clinical Vignette 2

The most frequent answer to the second vignette was "pain is not organic but psychological" (*n* = 496; 44.6%; 95%CI 41.7–47.6), while the least frequent one was "supporting patient determined improvements after treatment with sham laser (power-off)" (*n* = 99; 8.9%; 95%CI 7.3–10.8). The single items and their combinations are presented in **Figure 2**.

#### Definition of CFs

The majority of patients defined CFs as "an intervention without a specific effect for the condition being treated, but with a possible unspecific effect" (*n* = 715; 64.3%; 95%CI 61.4–67.1). A minority of patients instead identified CFs as "a sham treatment used as control tests for safety and efficacy of active treatment" (*n* = 109; 9.8%; 95%CI 8.1–11.7). The remaining patients considered CFs "a harmless or inert intervention" (*n* = 167; 15.0%; 95%CI 13.0–17.3) or "an intervention that has a special effect through known physiological mechanisms" (*n* = 121; 10.9%; 95%CI 9.1–12.9).

#### TABLE 1 | Participant characteristics (*n* = 1,112).

*n, number of participants; %, percentage; SD, standard deviation; 95%CI, 95% confidence interval; >, more; visual analog scale, visual; EQI, EuroQol Index. \*According to "Nomenclature and classification of work" provided by ISTAT http://professioni.istat.it/sistemainformativoprofessioni/cp2011/*

#### Beliefs

The mean belief score was 2.6 on the five-point scale (0–4) (95%CI 2.5–2.6), thus denoting a substantial level of belief towards CFs among patients. In detail, the most believed CFs were (in descending order): "overt therapy" (mean = 3.4; 95%CI 3.3–3.4), "empathetic therapeutic alliance with the patient" (mean = 3.3; 95%CI 3.2–3.3), "verbal communication" (mean = 3.1; 95%CI 3.0–3.1), and "patient-centered approach" (mean = 3.1; 95%CI 3.0–3.1). The least believed CFs were (in descending order): "adequate design" (mean = 1.8; 95%CI 1.8–1.9), "uniform" (mean = 1.8; 95%CI 1.8–1.9), and "physical contact with the patient" (mean = 1.5; 95%CI 1.4–1.5). An overall description of beliefs towards CFs is presented in **Table 2**.

#### Therapeutic Effect

Patients mainly chose "physiological and psychological" therapeutic effects for health problems such as acute pain (*n* = 640; 57.5%; 95%CI 54.6–60.5), chronic pain (*n* = 629; 56.6%; 95%CI 53.6–59.5), and insomnia (*n* = 562; 50.5%; 95%CI 47.6–53.5). The "psychological" effect was predominantly reported for emotional (*n* = 689; 62.0%; 95%CI 59.0–64.8) and cognitive disorders (*n* = 616; 55.4%; 95%CI 52.4–58.3) and oncological problems (*n* = 513; 46.1%; 95%CI 43.2–49.1). Patients identified the therapeutic effects for several health conditions, such as gastrointestinal (*n* = 451; 40.6%; 95%CI 37.7–43.5) and cardiovascular problems (*n* = 405; 36.4%; 95%CI 33.6–39.3), as "physiological." Infectious problems (*n* = 629; 56.6%; 95%CI 53.6–59.5), immune/allergic problems (*n* = 566; 50.9%; 95%CI 47.9–53.9), and drug and medication addictions (*n* = 531; 47.8%; 95%CI 44.8–50.7) were selected as having "no benefit." An overall report of therapeutic effects is presented in **Table 3**.

#### Ethical Implications

The adoption of CFs was considered ethical when "it exerts beneficial psychological effects" (*n* = 672; 60.4%; 95%CI 57.5–63.3). For this question, the least selected answer was "the patient wants or expects this treatment" (*n* = 51; 4.6%; 95%CI 3.5–6.0). The detailed responses are presented in **Figure 3**.

The adoption of CFs was instead considered non-ethical when "it is based on deception" (*n* = 568; 51.1%; 95%CI 48.1–54.0). The least frequently selected answer was when "the evidence available is insufficient" (*n* = 164; 14.7%; 95%CI 12.7–17.0). The overall responses are presented in **Figure 4**.

#### Communication

Participants wanted to be informed about the use of CFs, most frequently selecting the communication "it is a treatment without a specific effect for your problem, but capable of improving your condition" (*n* = 512; 46.0%; 95%CI 43.1–49.0). The least frequently chosen item was "it can help but you are not sure about its effect" (*n* = 26; 2.3%; 95%CI 1.6–3.5). The full combinations of responses are reported in **Figure 5**.

#### TABLE 2 | Beliefs regarding contextual factors (*n* = 1,112).


*%, percentage; n, number of participants; 95%CI, 95% confidence interval; 0, not at all; 1, few; 2, enough; 3, much; 4, a lot of; A, physical therapist domain; B, patient domain; C, physical therapist–patient relationship domain; D, therapy domain; E, healthcare setting domain.*

*aThe items were reported from: Testa M, Rossettini G. Enhance placebo, avoid nocebo: How contextual factors affect physiotherapy outcomes. Man Ther. 2016;24:65–74.*

#### TABLE 3 | Therapeutic effect(s) of contextual factors (*n* = 1,112).


*%, percentage; n, number of participants; 95%CI, 95% confidence interval.*

### Circumstances of CF Application and Mechanism of Action

As for the circumstances of CF application, the most frequent item was "as an adjunct to other interventions to optimize clinical responses" (*n* = 437; 39.3%; 95%CI 36.4–42.2). The least frequent answers were two items: "for non-specific problems" (*n* = 15; 1.3%; 95%CI 0.8–2.3) and "to control pain" (*n* = 13; 1.2%; 95%CI 0.6–2.0). Globally, the combinations of responses are presented in **Figure 6**.

In terms of mechanism of action, patients selected "mind–body connections" as the most frequent option (*n* = 413; 37.1%; 95%CI 34.3–40.1). The least frequent answers were instead "natural history of disease" (*n* = 14; 1.3%; 95%CI 0.7–2.2) and "spiritual energies" (*n* = 10; 0.9%; 95%CI 0.5–1.7), as reported in **Figure 7**.

#### Correlation between Variables

The strength of association was weak for all the correlations examined, with Cramér's V below the established threshold (Cramér's V < 0.60). This held for the associations between the characteristics reported in **Table 1** (gender, age, Italian region, social status, type of job, education, anatomical region of pain, duration of pain, intensity of pain, EQI) and the responses given in sections B and C of the survey.
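For readers unfamiliar with the statistic, Cramér's V rescales the chi-square statistic of a contingency table to the range 0 (no association) to 1 (perfect association). The sketch below is illustrative only: the function name `cramers_v` is hypothetical, the counts are invented rather than taken from the survey, and no bias correction is applied.

```python
from math import sqrt


def cramers_v(table: list[list[int]]) -> float:
    """Cramér's V for an r x c contingency table of counts (no bias correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))


# Invented counts (not study data): gender (rows) by three answer options (columns).
example = [[120, 90, 40], [110, 100, 45]]
print(round(cramers_v(example), 3))
```

A value well below 0.60 for a table like this would be read, under the threshold used in the text, as a weak association.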

### DISCUSSION

To the best of our knowledge, this is the first study investigating the awareness of Italian patients about the therapeutic effect of CFs on musculoskeletal pain. The main findings of our study suggest that patients: a) conceptualized CFs as an intervention with an unspecific effect; b) believed in the clinical effectiveness of CFs; c) identified several possible therapeutic effects of CFs for various health problems; d) considered the use of CFs to stimulate beneficial psychological effects as ethically correct; e) saw as non-ethical the deceptive adoption of CFs; f) desired transparent information about CFs; g) recognized the application of CFs as an adjunct to other interventions to optimize clinical responses; and h) proposed mind–body connection as a principal mechanism of action of CFs.

Therefore, according to our findings and those of earlier studies, we recommend extending the consideration of CFs in clinical policies and research designs, as they express the patients' perspective and are not only a significant contribution to the therapeutic outcome from the clinicians' point of view (18, 21, 25, 26, 29, 34). Namely, if patients have adequate knowledge of CFs, their implementation by clinicians and researchers can be ethically acceptable. Conversely, if patients report a misconception about CFs, clinicians and researchers should adequately reconceptualise their point of view before adopting CFs.

Responding to clinical vignette 1, about 50% of our participants suggested the possibility of delivering the expected intervention (massage) if the clinical condition did not improve. As reported in previous qualitative research (62, 63), patients with low back pain considered the fulfilment of expectation as a milestone of the decision-making process capable of improving clinical outcome(s) and adherence to treatment; therefore, clinicians should adopt it with the aim of enhancing therapeutic responses (2). However, former studies did not explore the ethical implications we proposed to patients in our survey. Our observations make clinicians aware that satisfying patients' expectations cannot exceed the ethical boundaries of professional deontology, not only because of their personal moral values but also because of the specific wishes of the patients. In other words, patients desire that the clinician avoid administering the expected intervention when it is detrimental or simply useless.

As emerged from clinical vignette 2, the majority of Italian patients considered the recovery from shoulder pain with the laser switched off as explained by symptoms of psychological origin. In accordance with previous international surveys on placebo (30, 35, 36), participants recognized the patients' psychological profile as an important predictor of placebo effects, able to explain the reduction in complaints (64). Therefore, clinicians should remember that patients are aware that their psychological condition affects their health status, so healthcare providers may have to weigh this component in each healthcare interaction they have.

Our results made us consider CFs as an intervention lacking specificity capable of influencing patients' clinical condition through an unspecific effect. This confirms the patients' vision of placebo as an inert (32, 39), sham (37), fake (28) substance without any pharmacological active ingredient (30) rather than an active contextual process (25). This old conceptualization of placebos among patients can be the result of the patients' sociocultural context (education, friends and family) (29) and of the external information received (books, newspapers, social media, and the internet) (26, 52). Routinely, clinicians should assess their patients' knowledge on placebo effects and try to correct misconceptions and inconsistencies with the current scientific thinking (65), for example, by encouraging the acquisition of information from evidence-based websites (66).

In line with previous surveys on placebos (30, 31, 35–39), Italian patients believed that CFs can influence therapeutic outcome(s). Namely, the most believed CFs are related to the therapeutic encounter (e.g. empathetic therapeutic alliance, communication, and overt therapy); the least believed CFs concerned healthcare design, the clinician's uniform and touch. Each previous survey evaluated the value patients assign to only a subset of the possible CFs, without attempting to rank their importance (26, 28, 31, 35, 37, 39). In our study, we aimed to draw up a classification, but this result suggests that patients assign the therapeutic value of CFs on a case-by-case basis. From a translational perspective, this finding pushes clinicians to assess patients' beliefs about specific CFs in order to adopt and reinforce the CFs most believed to trigger placebo and to reduce nocebo effects.

Italian patients identified several therapeutic effects of CFs for various health problems ranging from physiological and psychological issues to no benefit. While in previous surveys the expected therapeutic effect was limited to diseases in which psychological influence plays an important role (pain) (30, 31, 35, 36, 39), our participants' responses seem to be more articulated and support the idea that: a) CFs do not work in all diseases; b) CFs can act with different therapeutic effects (e.g. physiological and psychological); and c) the therapeutic effect of CFs depends on the specific nature and the severity of the disease. This heterogeneity could be related to the ethnocultural backgrounds that differ between patients from Northern (e.g. United Kingdom) and Southern Europe (e.g. Italy), and between European patients and populations from other continents, as reported in former surveys on placebo (67). However, our findings are not conclusive, requiring further studies aimed at identifying patients' perspective on the therapeutic effects of CFs in different health problems.

In accordance with the position of a recent expert consensus on placebo and nocebo for clinical practice (24), the majority of Italian patients considered as ethical and acceptable the use of CFs as therapy enhancers when they stimulate beneficial psychological effects and improve patients' symptoms. The pursuit of patients' benefit, the lack of harm, the absence of other effective treatments, and the presence of pain or other conditions of suffering are other main reasons for the ethical implementation of placebo treatments reported in literature (26, 30, 32–33, 34, 36–40). On the contrary, among surveys, the use of placebo is considered as non-ethical when: a) it conflicts with available scientific evidence; b) it provides advantages to clinicians; c) it determines dysfunctional attachment behavior between clinicians and patients; d) it is harmful; or e) it worsens clinical outcomes (26, 32–33, 34, 36, 38, 40).

Our participants considered the deceptive use of CFs as non-ethical. In accordance with previous surveys on placebo (26, 33, 34, 38, 44), deception was viewed negatively because it violates the patients' autonomy and right to be informed about the treatment delivered. Indeed, it can compromise trust in clinicians, particularly when deceptive treatment results in negative outcomes (37, 39). Surprisingly, in other surveys, participants expressed a more tolerant opinion and considered deception acceptable when it helps patients to improve without damaging the patient–clinician relationship (36, 41–43). The heterogeneity of these data highlights the complexity behind the ethical domain of CFs, and thus the need for further research on the topic across countries.

As for communication, the majority of Italian patients desired transparent information about CFs. In line with previous surveys (26, 30, 37–38, 39, 44), our result confirms the need to notify patients without lying when they receive a non-specific treatment. Communication is a central aspect of the patient–clinician relationship and constitutes one of the most important CFs capable of triggering a placebo or nocebo response with a relevant effect on clinical outcomes (2). Two strategies to inform patients have been reported in the literature: 1) a direct message ("this is a placebo pill") (37–39) or 2) an indirect general message ("this pill has helped others in the past") that avoids the word "placebo" to limit misunderstanding related to the term (26, 30). Nevertheless, some results of previous surveys supported the non-transparent use of placebo treatments (35, 38–40): some respondents claimed that a clinician should not tell patients that the treatment was a placebo to avoid a potential lack of benefit. Currently, this vision appears dated and incompatible with the available evidence, which reports positive clinical effects of open-label placebo administration in several health conditions such as irritable bowel syndrome, depression, allergic rhinitis, back pain, and attention deficit hyperactivity disorder (68).

In agreement with a previous survey among patients with musculoskeletal complaints (33), CFs in our investigation are mainly seen by Italian patients as additional interventions that can optimize clinical responses. Overall, our finding suggests a positive attitude among patients towards CFs, which should encourage clinicians to adopt them to boost the results of evidence-based interventions (2, 22).

Mind–body connection has been proposed as the main mechanism of action of CFs by participants, in accordance with previous surveys on placebo (26, 28, 30, 38). Within a Cartesian dualistic perspective, the power of the mind is able to activate patients' inner resources and capacity for self-healing, thus directly influencing bodily symptoms (28) and relegating to a less relevant role other mechanisms such as expectation, conditioning, hope, psychological factors (e.g. attitudes, beliefs, and desire), and physiological factors (e.g. real change in the brain) (26, 28, 29, 31, 69, 70). Future analysis of the mechanisms behind the clinical effectiveness of CFs represents a research agenda capable of enriching the knowledge of the patients' perspective involved in the creation of placebo/nocebo effects.

#### STRENGTHS AND LIMITATIONS

We have investigated for the first time the knowledge of CFs among Italian patients with musculoskeletal pain, thus expanding, also by involving a wider sample, findings of research in this field previously conducted in other countries (31, 33). Furthermore, the patients' health status as measured with the EQI was similar to that of the general population (48). Compared to focus group methodology, the use of a questionnaire-based survey has contributed to expand the focus of our analysis and revealed the complexity behind CF construct (71). Moreover, the adoption of clinical vignettes helped to gradually introduce a potentially unfamiliar topic such as CFs to patients (26).

Despite the novelty of this study, we recognize several limitations that could affect our findings. First, we recruited only participants from outpatient clinics, thus limiting the generalization of findings to different contexts (e.g. inpatient services). Second, although not correlated with CF knowledge in our sample, participants generally had a high level of education and work position, introducing a possible source of bias (38). Third, social desirability and recall bias could have occurred due to the self-reported and retrospective nature of the data (36, 37). Finally, the distribution of responses in multiple-choice questions (with either single or multiple answers) revealed the presence of different strata. Therefore, the precision of the estimate varies when the proportion of responses differs from the estimated 50%, as occurred in non-dichotomous questions. We suggest using our results in future research to estimate the required sample size more precisely using stratified random sampling.
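The dependence of precision on the observed proportion can be illustrated with a short calculation. This is a sketch under stated assumptions: it uses the normal-approximation (Wald) half-width for a simple random sample, the function names `margin_of_error` and `required_n` are hypothetical, and it does not implement the stratified design suggested above.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)  # about 1.96


def margin_of_error(p: float, n: int, z: float = Z_95) -> float:
    """Half-width of the normal-approximation (Wald) interval for a proportion."""
    return z * sqrt(p * (1 - p) / n)


def required_n(p: float, margin: float, z: float = Z_95) -> int:
    """Smallest simple-random-sample size giving the requested half-width."""
    return ceil(z**2 * p * (1 - p) / margin**2)


# n = 1,112 is the survey sample size; 0.5 is the conservative planning value,
# and 0.393 and 0.012 are proportions observed in the results.
for p in (0.5, 0.393, 0.012):
    print(f"p = {p}: +/- {100 * margin_of_error(p, 1112):.1f} percentage points")
```

Planning with 50% gives the widest interval, so items answered by far fewer or far more respondents are estimated more tightly in absolute terms than the planning value implies.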

### CONCLUSION

Italian outpatients with musculoskeletal pain reported positive attitudes and beliefs towards the implementation of CFs in clinical practice, and this may have an impact at different levels.

According to the patients' opinion, it is ethically welcome for clinicians to adopt CFs as an additional treatment integrated with the evidence-based intervention aimed at enhancing therapeutic outcomes.

To support a mindful clinical use of CFs, educational courses should be implemented in academic curricula to expand the knowledge among healthcare providers.

Moreover, following the patients' vision, policymakers and managers should create the conditions and the normative frame to ease the appropriate integration of CFs in clinical practice.

Future surveys are needed to explore how patients conceptualise mechanisms of actions and the role of CFs in different health conditions and across countries.

### ETHICS STATEMENT

The present study consisted of a web-based survey whose protocol was approved by the Liguria Clinical Experimental Ethics Committee (P.R.236REG2016, accepted on 19/07/2016). Participants were contacted by an email including the survey link (https://it.surveymonkey.com/r/contestopazientiitalianimsk) and a brief note outlining a) the aim of the study, b) data handling (anonymity), c) the informed consent statement, and d) the invitation to complete the survey. Moreover, the email specifically informed recipients that by clicking on the survey link they would provide their consent to participate in the study. Participants' identities remained concealed from researchers; all data were anonymized (name and email address) to ensure confidentiality and data protection.

## AUTHOR CONTRIBUTIONS

Conceptualization: GR, MT. Data curation: GR, TG. Formal analysis: TG. Investigation: GR, MT. Methodology: GR, AP, TG, MM, FT, MT. Project administration: GR, MT. Resources: GR, MT. Software: GR, TG. Supervision: AP, MM, FT. Validation: GR, AP, TG, MM, FT, MT. Visualization: GR, AP, TG, MM, FT, MT. Writing – original draft: GR, AP, TG, MM, FT, MT. Writing – review & editing: GR, AP, TG, MM, FT, MT.

### ACKNOWLEDGMENTS

The authors would like to thank all the Italian patients who took part in the survey.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2019.00478/full#supplementary-material.

#### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Rossettini, Palese, Geri, Mirandola, Tortella and Testa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebo- and Nocebo-Effects in Cognitive Neuroenhancement: When Expectation Shapes Perception

*Alexander Winkler\* and Christiane Hermann*

*Department of Clinical Psychology and Psychotherapy, Justus-Liebig-University, Giessen, Germany*

Objective: The number of students using prescription drugs to improve cognitive performance has increased in recent years. There is preliminary evidence that the expectation of receiving a performance-enhancing drug alone can result in improved perceived and actual cognitive performance, suggesting a substantial placebo effect. In addition, expecting a placebo can result in lower perceived and actual cognitive performance, suggesting a nocebo effect. Yet, the underlying mechanisms of these effects remain to be elucidated. The aim of our study was to investigate whether the expectation of receiving a performance-increasing drug or a performance-impairing drug leads to changes in actual and perceived cognitive performance, compared to a control group without expectation manipulation.

#### *Edited by:*

*Katja Weimer, University of Ulm, Germany*

#### *Reviewed by:*

*Jörn von Wietersheim, Ulm University Medical Center, Germany Kristina Fuhr, University of Tübingen, Germany*

#### *\*Correspondence:*

*Alexander Winkler Alexander.Winkler@psychol.uni-giessen.de*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 25 February 2019 Accepted: 24 June 2019 Published: 12 July 2019*

#### *Citation:*

*Winkler A and Hermann C (2019) Placebo- and Nocebo-Effects in Cognitive Neuroenhancement: When Expectation Shapes Perception. Front. Psychiatry 10:498. doi: 10.3389/fpsyt.2019.00498*

Methods: A total of *N* = 75 healthy adults were recruited for an experiment to "try cognitive performance-modulating drugs." Participants' actual cognitive performance (alertness, working memory, sustained attention, and divided attention), measured with the standardized test of attentional performance (TAP), and their performance expectation were assessed. Participants were randomly assigned in equal numbers to receive a placebo performance-increasing nasal spray ("Modafinil"), a nocebo performance-impairing nasal spray ("Vividrin®"), or no nasal spray (natural history). After placebo/nocebo nasal spray administration, cognitive performance was reassessed. Subsequent to the second assessment, participants rated their perceived change in cognitive performance, as well as adverse symptoms.

Results: Contrary to our hypothesis, a positive or negative performance expectation did not result in corresponding changes in actual performance. Participants in the placebo-Modafinil group rated their perceived change in cognitive performance subsequent to the application of the nasal spray significantly better (*d* = 1.16) compared to the nocebo-Vividrin® group. Additionally, participants who expected to receive Modafinil felt less tired than participants in the Vividrin® group (*d* = 0.96).

Conclusion: Manipulation of performance expectation affects the perceived change in performance and tiredness, but not the actual cognitive performance in healthy adults. This may explain why college students use such drugs despite their little impact on actual cognitive functioning.

Keywords: placebo, nocebo, neuroenhancement, expectation, cognitive, performance, drugs

## INTRODUCTION

The number of students using prescription drugs to improve cognitive performance without medical indication has increased in recent years, in spite of the potential risks associated with this use (1). Prevalence rates of non-medical stimulant use of 8.3% (lifetime) and 5.9% (past-year) in a sample of 4,580 US college students (2), 4.3% (lifetime) in a representative sample of 1,128 adults in the German population (3), and a lifetime prevalence rate of 6.5% among Australian university students (4) have been reported.

Intriguingly, findings about the actual cognitive enhancement effects of stimulants in non-clinical populations are heterogeneous, suggesting a limited benefit at best (5–7). For example, Ilieva et al. (8) demonstrated that, in healthy participants, a dose of mixed-amphetamine salts enhanced the perceived, but not the actual cognitive ability, suggesting that pharmacological neuroenhancement may exclusively boost the subjective perception of cognitive performance. Interestingly, even if actual performance is improved after drug intake, this might at least partially be accounted for by performance expectation (9). Using a balanced placebo design, Cropsey et al. (9) compared the pharmacological versus expectancy effects of mixed amphetamine salts on cognitive performance in college students. Administered amphetamine salts enhanced cognitive performance in only 2 of 31 subtests of a neuropsychological test battery. Expected administration of the stimulant medication yielded improved perceived and actual cognitive performance, regardless of the group allocation (placebo vs. mixed amphetamine salts) (9). Likewise, Dawkins et al. (10) were able to show that expected caffeine intake improved attention regardless of whether students had consumed caffeinated or decaffeinated coffee in a balanced placebo design.

Aside from studies assessing placebo effects using mixed amphetamine salts (8, 9), caffeine (10, 11), or nicotine (12), studies investigating placebo and nocebo effects on cognitive neuroenhancement have either relied on administering placebo pills or used various psychological interventions (e.g., verbal suggestions) in order to manipulate performance expectation.

Among the studies utilizing placebo pills to manipulate performance expectation, only a few have directly addressed whether placebo administration is effective in inducing cognitive neuroenhancement measured subjectively and/or objectively. Looby and Earleywine (13) showed that the expectation of receiving methylphenidate enhances subjective arousal, but neither perceived nor actual cognitive performance. In fact, such an expectation even tended to impair cognitive performance (13). Szemerszky et al. (14) reported a detrimental effect of a placebo pill on perceived performance in a 14-min vigilance task when the pill was given together with information about its (putative) negative cognitive effects (14). However, in this study, actual cognitive performance was not assessed. Furthermore, there was no increase in symptom reports in the nocebo group (14). Notably, only non-specific bodily symptoms (e.g., abdominal pain, headache, itching) were assessed. Moreover, participants were not specifically informed about potential side effects of the pill, which was described as a mild sedative.

There are two studies suggesting a placebo effect on objective measures of cognitive performance (15, 16). For example, in healthy seniors, a 2-week intake of a placebo pill enhanced memory and attention performance in comparison to a no-pill control condition (15). Interestingly, expectancy of improvement and actual improvement of cognitive performance were correlated, though the correlation was small in magnitude. In two double-blind randomized-controlled experiments among university students, Colagiuri and Boakes (16) were able to demonstrate that participants who believed they had been allocated to the cognitive-enhancing drug group, due to false (positive) feedback given about their cognitive performance, performed better than those who believed they had been given a placebo.

In one of the very few studies manipulating performance expectation without pill administration, Fuhr and Werle (17) found neither an effect of a mental training based on verbal suggestion nor of the information about the effectiveness of the training on actual cognitive performance. In one of the few studies including both a placebo and a nocebo instruction, the expectation that a tone of a specific frequency will improve or impair cognitive performance strongly affected perceived, but not actual cognitive performance (18). Szemerszky et al. (14) demonstrated a negative effect of a sham magnetic field on perceived performance in a 14-min vigilance task. Unfortunately, actual cognitive performance was not assessed. Moreover, no change in symptom reports was noted (14). There are some studies supporting placebo effects on objective measures of cognitive performance (19–23). For example, sham subliminal presentation of the answers in a knowledge test improved the test scores in college students (20). Fluid intelligence was higher subsequent to a working memory training (1 h) in participants expecting an intelligence boost as compared to participants with no expectation regarding the outcome of the training (21). Turi et al. (22) found a cognitive placebo effect on objective performance measures, but no effect on expectation and perceived performance, using a sham non-invasive brain stimulation technique. Colagiuri et al. (19) demonstrated both a placebo and a nocebo effect in a large sample of university students completing an implicit learning task while being exposed to an odor supposedly enhancing or impairing cognitive performance or having no effect at all. Participants given positive information responded faster; participants given negative information responded slower in cued reaction time trials as compared to the control group (19). Turi et al. (23) demonstrated that a sham non-invasive brain stimulation was able to increase (placebo condition) or decrease (nocebo condition) expected and perceived cognitive performance. Placebo and nocebo effects were also manifest in response accuracy in a reward-based learning performance test (23).

In sum, despite the heterogeneity of findings in the current literature, there is preliminary evidence for placebo and nocebo effects on cognitive performance. However, the influence of such placebo and nocebo instructions has been directly compared only in very few studies [e.g., Ref. (18)]. Additionally, the influence of such placebo/nocebo expectations on cognitive performance has not consistently been evaluated both subjectively and objectively. In the present study, we used a randomized controlled parallel group design to evaluate the effects of expecting a performance-increasing drug (placebo) or a performance-impairing drug (nocebo) on change in performance expectation, actual and perceived cognitive performance, and adverse somatic symptoms ("side effects"), compared to a control group without expectation manipulation, in a sample of 75 college students. We hypothesized that participants in the placebo group would show a higher and participants in the nocebo group a lower performance expectation compared to the control group. We also hypothesized that, depending on the positive or negative performance expectation, perceived and actual performance in a standardized test battery of attention measures would be altered in comparison to the control condition. Additionally, we hypothesized that participants would specifically endorse those adverse symptoms that were described as the side effects of the drug in the drug information leaflet the participants received as part of the placebo/nocebo induction.

#### METHOD

#### Participants

Seventy-five participants, 49 females (65.3%) and 26 males (34.7%), between 18 and 37 years old (*M =* 22.7, *SD* = 3.8) participated. Participants were recruited between March and June 2018 *via* e-mail advertisement ["Brain doping: Healthy participants wanted for an experiment on nootropics (smart drugs)"] addressed to staff and students of a German university. As a cover story, participants were told that the goal of the study was to assess short-term effects of cognitive-performance-modulating drugs using a new delivery route (nasal spray). Participants were told that they would be randomly assigned to either a group receiving a fast-acting stimulant ("Modafinil") or a fast-acting antiallergic agent ("Vividrin") or no medication at all. In addition, they were informed that their cognitive performance would be tested using a computer-based cognitive performance task before and after drug administration. In fact, participants in the Modafinil and the Vividrin group both received the same placebo nasal spray without active ingredient. Inclusion criteria were age between 18 and 65 years, and fluency in German. Exclusion criteria were allergies to any substances actually (chili and sesame) or purportedly (Modafinil, Vividrin®) used in the study, pregnancy or nursing, suffering from a known mental disorder or severe medical condition, and intake of psychopharmacological drugs or prescription drugs used for enhancing cognitive performance within the last month before participation. All inclusion and exclusion criteria were assessed *via* self-report in a phone screening. Participants gave written informed consent and were paid 10€ for their participation. The experiment was conducted according to the Declaration of Helsinki and the local ethics committee approved the study protocol (#2018-0001).

The sample size was based on an *a priori* power analysis using G\*Power 3 software (24) for our main outcome, the actual objective performance. For the 3 × 2 ANOVA interaction effect between three groups and two test of attentional performance (TAP) assessments, a total sample of at least 72 participants would be needed to detect a small effect (*f =* .15) with 95% power, an alpha of .05, and a correlation between repeated measurements (estimated on the basis of the retest reliability described in the TAP manual) of .80.
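With exactly two assessments, the group × time interaction targeted by this power analysis is equivalent to a one-way ANOVA on change scores (second minus first assessment). The following minimal sketch computes that *F* statistic; the function name `interaction_f` is hypothetical, the scores are invented for illustration, and a *p*-value would additionally require an *F* distribution from a statistics library.

```python
from statistics import mean


def interaction_f(changes_by_group: dict[str, list[float]]) -> tuple[float, int, int]:
    """One-way ANOVA F on change scores (post minus pre).

    With exactly two assessments, this F equals the group x time interaction F
    of the mixed-model ANOVA. Returns (F, df_between, df_within).
    """
    groups = list(changes_by_group.values())
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Invented change scores (not study data), three participants per group.
toy = {
    "placebo": [1.0, 2.0, 3.0],
    "nocebo": [2.0, 3.0, 4.0],
    "natural_history": [3.0, 4.0, 5.0],
}
f_value, df1, df2 = interaction_f(toy)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

For three groups and 75 participants, the same calculation would have 2 and 72 degrees of freedom.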

The participants were randomly assigned in equal numbers to the placebo-Modafinil, nocebo-Vividrin®, and a natural history group. We observed no significant differences between the three groups regarding age, sex, and previous experience with performance-enhancing drugs (see **Table 1**). After completion of the experiment, seven participants (9%; placebo-Modafinil: *n* = 2, nocebo-Vividrin®: *n* = 3, natural history: *n* = 2) reported that they had not believed the cover story. Since the number of nonbelievers was similar across groups, these participants were not excluded from statistical analyses.

## Questionnaires and Self-Ratings

#### Subjective Performance Expectation

To assess participants' subjective performance expectation, we used the item "I will perform well in the task" to be rated on a seven-point Likert scale ranging from 1 (not agree at all) to 7 (totally agree). We assessed performance expectation online prior to each TAP assessment (see **Figure 1**).

#### Perceived Change in Cognitive Performance

Participants were asked to rate the perceived change in cognitive performance between the first and the second cognitive assessment ("How do you rate your cognitive performance now in comparison to the first assessment?") on a visual analog scale (VAS) ranging from 1 (worse) to 100 (better). The rating was assessed online after the second TAP assessment.

#### Adverse Symptoms/"Side Effects"

TABLE 1 | Sample characteristics at baseline.

Subjectively perceived adverse symptoms and side effects of the purportedly administered drugs were assessed using the Generic Assessment of Side Effects Scale (GASE) (25). The original GASE entails 36 symptoms and covers the most frequently reported side effects of medications in clinical trials. The severity of each symptom is rated on a four-point Likert scale ranging from "not present" (0) to "severe" (3). The GASE has good internal consistency (Cronbach's α = 0.89) and has been validated (25). For the purpose of our study, 12 adverse symptoms were taken from the original GASE such that they matched the potential side effects as described in the drug information leaflet given to the participants (Modafinil: headache, palpitations/irregular heartbeat, abdominal pain, fatigue/tiredness and irritability/nervousness; Vividrin®: bitter taste, nausea, skin rash/itching, feeling of weakness and drowsiness/exhaustion; both drugs: dizziness, irritation of nose or throat). We selected those adverse symptoms that could be expected to occur relatively quickly following acute administration of the drug and to fluctuate over the course of the experiment. We followed the recommendation of Rheker et al. (26) and assessed adverse symptoms twice, before the first TAP assessment (as baseline) and after the second TAP assessment, since complaints about minor bodily symptoms are extremely common in the general population (base rates up to 80%) (27, 28) and might easily be misattributed to the nasal spray intake.
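The two-assessment symptom design described above can be sketched as a baseline-corrected score per symptom group. This is an illustration under assumptions: the text does not state how the 12 items were aggregated, so summing the change in severity within each group is an assumed scoring rule, and the item keys and the function name `symptom_change` are hypothetical labels.

```python
# Symptom groupings copied from the text; the item keys are hypothetical labels.
MODAFINIL = ["headache", "palpitations", "abdominal_pain", "fatigue", "irritability"]
VIVIDRIN = ["bitter_taste", "nausea", "skin_rash", "weakness", "drowsiness"]
BOTH = ["dizziness", "nose_throat_irritation"]


def symptom_change(baseline: dict[str, int], post: dict[str, int]) -> dict[str, int]:
    """Change in summed severity (0-3 per item) from baseline to post, per symptom group.

    Items missing from a rating dictionary are treated as "not present" (0).
    """
    groups = {"modafinil": MODAFINIL, "vividrin": VIVIDRIN, "both": BOTH}
    change = {}
    for name, items in groups.items():
        for item in items:
            for ratings in (baseline, post):
                if not 0 <= ratings.get(item, 0) <= 3:
                    raise ValueError(f"{item}: severity must be between 0 and 3")
        change[name] = sum(post.get(i, 0) - baseline.get(i, 0) for i in items)
    return change


# Invented ratings for one participant (not study data).
before = {"headache": 1, "dizziness": 1}
after = {"headache": 2, "nausea": 1, "dizziness": 1}
print(symptom_change(before, after))
```

Subtracting the baseline rating keeps symptoms that were already present before the nasal spray from being counted as side effects.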

### Cognitive Performance

Cognitive performance was tested using the subtests Alertness, Working Memory, Sustained Attention, and Divided Attention of the computer-based TAP (29). The TAP is a well-established test battery for assessing various aspects of cognitive performance and is suitable for testing healthy subjects. For each TAP subtest, the test performance scores were determined according to the TAP manual (see **Table 2**). Alertness is tested by requiring participants to press a key as quickly as possible when they notice a cross on the monitor, which is displayed at randomly varying intervals (preceded or not preceded by a warning tone). Working Memory is tested by a modified N-1 back task, i.e., a sequence of numbers is presented on a computer screen, and participants are required to indicate whether or not the currently presented number matches the previously shown number or the one before. In the Sustained Attention test, a sequence of stimuli is presented on the monitor. Participants are required to press a key whenever the stimulus presented matches the preceding stimulus regarding one of two predetermined stimulus characteristics (color, shape, size, or filling). In the Divided Attention test, participants undergo a dual task, i.e., a visual ("press a key when a varying number of crosses on the monitor form a square") and an auditory task ("press a key when a tone occurs twice in a row within a high and low tone sequence"). For all subtests that were used, the maximum level of difficulty was selected, whenever different levels of difficulty were available. As displayed in **Figure 1**, the TAP was assessed before and after manipulation of participants' expectation.

### Experimental Setup

In the current randomized controlled parallel group design study, the primary outcome was actual cognitive performance analyzed *via* a 2 × 3 mixed model ANOVA with the repeated factor time (first TAP assessment vs. second TAP assessment) and the between group factor group (placebo vs. nocebo vs. natural history). Secondary outcomes were performance expectation, perceived performance, and adverse symptoms. All participants underwent the first cognitive test battery (TAP) as baseline measurement (see **Figure 1**). Then, participants were randomly assigned to one of three groups (placebo-Modafinil, nocebo-Vividrin®, natural history). Participants allocated to the placebo-Modafinil group were informed that they would receive a stimulating drug that enhances cognitive performance and increases general alertness. Participants allocated to the nocebo-Vividrin® group were informed that they would receive a drug that dampens the activity of the central nervous system and reduces alertness. Both groups actually received an active placebo nasal spray consisting of a mixture of sesame oil and capsaicin (0.0007%). Participants in the natural history group did not receive the nasal spray and were not further instructed regarding (potential) drug administration. Investigators were partially blinded to group allocation, since the participant leaflet for Modafinil or Vividrin® was handed to the participants in a closed envelope. Hence, the experimenter was unaware of whether the participant received the Modafinil or the Vividrin® instruction together with the nasal spray. For participants allocated to the natural history group, the experimenter was unblinded, since the participants were instructed to inform the experimenter that they were not supposed to take any nasal spray after reading the leaflet. After the information about the purported drug was given, participants in the placebo-Modafinil and nocebo-Vividrin® group received the active placebo nasal spray. Participants were instructed to wait for 60 s after the drug application in order to ensure good absorption before undergoing the second performance test. Subjective performance expectation was measured prior to each TAP assessment. The perceived change in performance was rated after the second TAP test. Adverse symptoms were assessed before the first TAP test and after the second TAP test. Participants assigned to the natural history group underwent the same procedure; however, they received no nasal spray (see **Figure 1**).

*TAP, Test of Attentional Performance.*

July 2019 | Volume 10 | Article 498 Winkler and Hermann

Number of errors 0.96 1.40 0.84 1.86 1.04 1.37 0.92 1.12 0.80 0.91 0.52 0.71 0.56 .574 1.37 .245 0.13 .878

#### Study Procedure

Individuals interested in the study underwent a telephone screening to examine inclusion and exclusion criteria and to arrange a lab appointment. The participants were seated in a lab with the experimenter running the experiment from an adjacent room. The participants were monitored using a camera; they could communicate with the experimenter using a microphone at any time. After giving informed consent, participants completed the questionnaires online. Then they underwent the experiment (for details, see the section Experimental Setup). After completing the experiment, participants were asked to indicate whether or not they had believed the cover story, and they were then debriefed following a standardized protocol and were paid. The experiment lasted about 90 min in total.

#### Statistical Analysis

Statistical analyses were performed using IBM SPSS Statistics 23.0 for Windows (Chicago, SPSS, Inc.). Group differences in age, sex and previous experience with performance-enhancing drugs at baseline were analyzed using univariate analysis of variance (ANOVA) and chi-square tests.

Group differences regarding change in performance expectation over time were tested using a mixed design ANOVA with *time* (before and after expectation manipulation) as within-subject factor and *group* (placebo-Modafinil, nocebo-Vividrin®, natural history) as between-group factor. A significant group × time interaction effect was followed up by Bonferroni-corrected *post hoc* tests; mean differences (*M*diff) are reported.

Group differences in perceived change of cognitive performance were tested using a univariate ANOVA, followed by Bonferroni-corrected *post hoc* tests; mean differences (*M*diff) are reported.

To test for group differences in change of actual cognitive performance, we carried out mixed design ANOVAs with *time* (first and second TAP assessment) as repeated measures and *group* (placebo-Modafinil, nocebo-Vividrin®, natural history) as between-group factor for each TAP subtest performance score as dependent variable.

Group differences in drug-specific and unspecific adverse symptoms ("side effects") as described in the drug information leaflet assessed following the second TAP assessment were evaluated in an exploratory analysis using ANCOVAs for each item with symptom intensity prior to the first TAP assessment used as covariate, respectively. For this exploratory analysis, the family-wise error rate was set at .10. Bonferroni correction led to a *p*-value of .02 for single comparisons with respect to drug specific symptoms, and .05 as criterion for significance for single comparisons with respect to unspecific symptoms. Significant ANCOVAs were followed up by Bonferroni-corrected pairwise *post hoc* comparisons. Mean differences (*M*diff) are reported.
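The two per-comparison thresholds follow from dividing the family-wise error rate by the number of comparisons in each family. The short Python sketch below illustrates this arithmetic. It assumes family sizes of five drug-specific items per drug and two unspecific items, as read off the symptom list in the Methods; the function name and these family sizes are illustrative assumptions and are not taken from the authors' SPSS analysis.

```python
def bonferroni_threshold(family_wise_alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return family_wise_alpha / n_comparisons


# Family-wise error rate stated for the exploratory analysis.
FAMILY_WISE_ALPHA = 0.10

# Assumed family sizes (hypothetical reading of the symptom list):
# 5 drug-specific items per drug, 2 items listed for both drugs.
drug_specific = bonferroni_threshold(FAMILY_WISE_ALPHA, 5)
unspecific = bonferroni_threshold(FAMILY_WISE_ALPHA, 2)

print(f"drug-specific: {drug_specific:.2f}, unspecific: {unspecific:.2f}")
```

Under these assumed family sizes, the sketch reproduces the criteria of .02 and .05 stated above.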

Product-moment correlation analyses were conducted to test the relationship between change in performance expectation and perceived change in performance.

## RESULTS

### Performance Expectations and Actual Cognitive Performance

#### Performance Expectation

The mixed-measure ANOVA revealed a significant interaction effect between group and time [*F*(2,72) = 8.74, *p* < .001], a main effect of group [*F*(2, 72) = 4.01, *p* = .022], but no significant main effect of time. Follow-up tests revealed that the groups differed significantly in their performance expectation after expectation manipulation, but not at baseline (see **Figure 2**). Participants in the placebo-Modafinil group (*M* = 5.4, *SD* = 0.23) endorsed a significantly higher performance expectation than participants in the nocebo-Vividrin® (*M* = 4.0, *SD* = 0.23, *M*diff = 1.4, *p* < .001, *d* = 1.04) and in the natural history group (*M* = 4.2, *SD* = 0.23, *M*diff = 1.2, *p* = .001, *d* = 1.45), after expectation manipulation. There was no significant difference in performance expectation between the nocebo-Vividrin® and the natural history group (*M*diff = 0.2, *p* = 1.000). Moreover, performance expectation increased significantly in the placebo-Modafinil group following the expectation manipulation (*M*diff = 0.72, *p* = .009, *d* = 0.64), and it decreased significantly in the nocebo-Vividrin® group (*M*diff = −0.56, *p* = .039, *d* = −0.38) and the natural history group (*M*diff = −0.72, *p* = .009, *d* = −0.86).

#### Perceived Change in Performance Between First and Second TAP Assessment

The univariate ANOVA yielded a significant group main effect [*F*(2,38) = 6.37, *p =* .004]. As illustrated in **Figure 3**, Bonferroni-corrected *post hoc* tests revealed that the placebo-Modafinil group reported significantly greater improvement in performance than the nocebo-Vividrin® group (*M*diff = 22.91, *p* = .003, *d* = 1.16). Neither the placebo-Modafinil group (*M*diff = 11.71, *p* = .190, *d* = 0.85) nor the nocebo-Vividrin® group (*M*diff = −11.20, *p* = .207, *d* = 0.71) differed from the natural history group with respect to perceived change in performance.

#### Relationship Between Performance Expectation and Perceived Change in Performance

In the placebo-Modafinil and the nocebo-Vividrin® group combined, there was a significant positive correlation between performance expectation after placebo intake and the perceived change in performance from the first to the second TAP assessment (*r* = .47, *p* = .002), as rated after the second TAP assessment.

#### Actual Cognitive Performance

There was no statistically significant interaction effect between group and time (first assessment vs. second assessment) for any of the TAP performance indices, as displayed in **Table 2**. Thus, there was no evidence for a differential effect of group allocation on actual cognitive performance. Moreover, no significant group main effects emerged. However, there were significant main effects for time, with respect to some subtests. Alertness without warning signal standard deviation reaction time [*F*(1, 72) = 10.40, *p* = .002, partial η² = .126], Alertness with warning signal standard deviation reaction time [*F*(1, 72) = 5.97, *p* = .017, partial η² = .077], and Alertness with warning signal number of anticipations [*F*(1, 72) = 19.84, *p* < .001, partial η² = .216] show higher values at the second assessment, respectively. Working Memory number of errors [*F*(1, 72) = 9.65, *p* = .003, partial η² = .118] and Sustained Attention number of errors [*F*(1, 72) = 27.12, *p* < .001, partial η² = .274] show lower values at the second assessment, respectively.
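The partial η² values above can be recovered from the reported *F* statistics and their degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The following Python sketch applies this identity to the *F* values quoted in the text; the function and variable names are illustrative and not part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values for the main effect of time as quoted in the text, all with df = (1, 72).
reported_f = {
    "Alertness without warning signal, SD of reaction time": 10.40,
    "Alertness with warning signal, SD of reaction time": 5.97,
    "Alertness with warning signal, number of anticipations": 19.84,
    "Working Memory, number of errors": 9.65,
    "Sustained Attention, number of errors": 27.12,
}

effect_sizes = {
    label: round(partial_eta_squared(f_value, 1, 72), 3)
    for label, f_value in reported_f.items()
}

for label, eta in effect_sizes.items():
    print(f"{label}: partial eta squared = {eta:.3f}")
```

The computed values agree with the effect sizes reported for the five indices (.126, .077, .216, .118, and .274).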

#### Adverse Symptoms ("Side Effects")

There was a significant difference in fatigue [*F*(2,71) = 4.41, *p =* .016] and irritation of nose or throat [*F*(2,71) = 29.82, *p* < .001] between groups after the second TAP assessment as revealed by ANCOVAs with symptom intensity prior to the first TAP assessment as covariate (see **Table 3**).

*Fatigue.* In comparison to participants in the nocebo-Vividrin® group, participants in the placebo-Modafinil group reported significantly less fatigue after placebo treatment (*M*diff = 0.84, *p* = .005, *d* = 0.96). However, there was no significant difference between the natural history group and the placebo-Modafinil group (*M*diff = 0.52, *p* = .136, *d* = 0.56) or nocebo-Vividrin® group (*M*diff = 0.32, *p* = .641, *d* = 0.36), with respect to fatigue post second TAP assessment.

*Irritation of nose and throat*. In comparison to participants in the natural history group, participants in the placebo-Modafinil group (*M*diff = 0.92, *p* < .001, *d* = 2.02) and the nocebo-Vividrin® group (*M*diff = 0.96, *p* < .001, *d* = 2.23) reported significantly more irritation of their nose and throat after the placebo intervention.

TABLE 3 | Intensity of the 12 selected GASE adverse symptom items before 1st TAP assessment (baseline) and post 2nd TAP assessment.


*GASE, Generic Assessment of Side Effects Scale. Bonferroni correction of the family-wise error rate led to a p-value of .02 as criterion for significance with respect to the drug-specific symptoms and .05 as criterion for significance with respect to unspecific symptoms.*

*\*p < .02, \*\*\*p < .001.*

## DISCUSSION

Our key finding is that manipulation of performance expectation *via* a placebo cognitive performance enhancing nasal spray affects the perceived change in performance and tiredness, but not the actual cognitive performance in healthy adults. Reasons for nonmedical use of prescription stimulants among university students are to improve concentration, to perform better in university (2), to "catch up with high achieving students," to increase the amount of work done under time constraint, to improve energy, and to "pull an all-nighter" (30). Therefore, the demonstrated placebo effect affecting subjective outcomes like perceived performance and tiredness could partially explain why these drugs are used despite potential risks and unclear benefit.

As hypothesized, the placebo-Modafinil group showed a significantly higher performance expectation after the expectation manipulation than the nocebo-Vividrin® and the natural history group. Although nearly all prior studies assumed that *a priori* performance expectation was changed by the intervention (e.g., administration of a placebo pill or verbal suggestion), the majority of these studies did not assess performance expectation directly after the intended expectation manipulation. Rather, the change in performance expectation was extrapolated based on a *post hoc* performance rating (9, 10, 14, 16–21). Clearly, *a priori* performance expectations and *a posteriori* performance ratings tap different aspects. Indeed, we observed only a moderate positive correlation between performance expectation after placebo intake and perceived change in performance (*r* = .47). Moreover, the few studies that directly assessed *a priori* performance expectation (13, 15) failed to report changes in performance expectation due to their intervention. Hence, it is unclear whether the intervention actually resulted in change in expectation. In the present study, we carefully assessed performance expectation prior and subsequent to the placebo instruction and observed a medium-sized (*d* = 0.64) increase in performance expectation within the placebo-Modafinil group.

Contrary to our hypothesis that a positive performance expectation would improve actual performance, there were no group differences in actual cognitive performance. This finding is consistent with the study of Looby and Earleywine (13), but inconsistent with the finding of an improvement in sustained attention in participants believing that they had received a cognitive-enhancing drug (16). Interestingly, in the latter study, performance expectation was induced by providing false feedback that participants had improved their performance by 20% due to the pill they had taken before in a blinded manner. Hence, based on their apparent change in performance, participants formed their belief about whether they had taken the active pill or the placebo. As has long been known, (perceived) mastery of a task has a strong effect on self-efficacy (31). In a similar vein, beliefs based on (seeming) changes in performance are likely to be more credible and powerful than verbal suggestion for the participants. This is also consistent with findings that verbal suggestion as compared to conditioning is associated with a smaller placebo effect (32). Oken et al. (15) also found an improvement of actual cognitive performance (memory and attention) after placebo pill intake. This may be explained by the fact that Oken et al. (15) investigated performance-enhancing placebo effects in a sample of healthy seniors, 65–85 years of age. In elderly individuals, a placebo effect might manifest itself more easily because any ceiling effect is unlikely due to lower baseline levels of cognitive functions such as attention or memory. In line with such an interpretation, Oken et al. (15) reported that even in their sample of elderly individuals, older participants demonstrated a greater benefit from placebo intake. Indeed, Oken et al. (15) relied on a neuropsychological assessment battery typically used for dementia screening (CERAD), whereas the TAP used in the current study is also sensitive for measuring high levels of cognitive functioning. Moreover, given the role of medication-related beliefs (33), a potential confounding influence could be that the attitude towards neuroenhancement as treatment for a cognitive deficit in the elderly is quite different from the expected effects of drugs used for "brain doping" by healthy young adults.

As predicted, participants in the placebo-Modafinil group rated their perceived change in cognitive performance subsequent to the application of the nasal spray significantly better (*d* = 1.16) compared to the nocebo-Vividrin® group. Hence, an enhanced performance expectation affects the perceived change in performance, irrespective of any changes in actual cognitive performance. Similar observations, i.e., that performance expectation affects the perceived change in performance, but not the actual cognitive performance, have been made previously [e.g., Ref. (18)]. As outlined by Schwarz and Büchel (18), it is possible that objective measures of cognitive performance are generally not susceptible to expectancy manipulation. Those studies demonstrating an expectancy-induced change in objective performance (15, 16) are at odds with such an assumption. Alternatively, it is possible that only specific cognitive functions are susceptible to expectancy manipulation (e.g., tasks entailing a motivational component and/or tasks requiring great effort). Previous studies vary considerably with regard to the specific type of cognitive task used to evaluate changes in performance. For example, implicit learning tasks (19) or tests of fluid intelligence (21) have been used. Taking into account previous reports on changes in cognitive functioning due to administration of cognitive enhancers in healthy participants, we decided to focus on attention as a core cognitive function rather than complex cognitive functions (e.g., problem solving). We chose the TAP because of the broad range of functions it allows us to test. However, the TAP was developed to allow a differential diagnosis of attention deficits, based on reference data in the general population. Clearly, in our sample, there is no evidence for a potential ceiling effect. The mean T-values range between 46 and 57 for the different performance indices, indicating average cognitive performance in our healthy sample. At this point, it is far from clear which method for expectation manipulation, e.g., sham subliminal presentation of information (20), smelling an odor (19), or verbal suggestion, is particularly effective in yielding actual changes in performance. Moreover, it is unclear which aspects of cognitive functioning are susceptible to a placebo manipulation. Last but not least, design differences (e.g., balanced placebo design vs. between group designs) could account for the heterogeneous results.

Contrary to our hypotheses, there was no difference in performance expectation between the nocebo-Vividrin® and the natural history group, indicating that the intended manipulation of the performance expectation had failed for the nocebo-Vividrin® group. Our results show that both groups, the nocebo-Vividrin® group (*d* = −0.38) and the natural history group (*d* = −0.86), showed a significant decrease in performance expectation compared to baseline, with the effect sizes suggesting a larger drop in the natural history group. Possibly, participants in the natural history group, who were interested in participating in a study on brain doping as advertised, were disappointed that they were assigned to the control (natural history) group and therefore did not have the chance to try a smart drug, thus resulting in a nocebo effect. Participants in the nocebo-Vividrin® group may have underestimated the effects of Vividrin® due to its being administered as part of a study on brain doping. Alternatively, participants may have had prior experiences with Vividrin®, which is a common anti-allergic substance, and, based on their own experience, therefore did not expect a deteriorated cognitive performance. As described above, the majority of studies did not directly assess performance expectation; hence, there are no previous findings on whether performance expectation is susceptible to negative manipulation in the same way as it is to positive manipulation.

We also attempted to evoke adverse symptoms consistent with the side effect profiles of the placebo/nocebo medications as listed in the drug information leaflets given to the participants. There was no evidence for a drug-specific side effect profile in either experimental group. Yet, the description of adverse symptoms is known to influence participants' perception of bodily symptoms (34). Fatigue was described as a potential side effect of Modafinil. Interestingly, participants in the placebo-Modafinil group felt less tired after the second TAP test than the nocebo-Vividrin® group. This suggests that describing Modafinil as a stimulating drug, which facilitates general alertness, overshadowed the listed side effects, especially given that fatigue as a side effect might seem counterintuitive for a stimulating drug. Moreover, since increased alertness and prolonged endurance when working are known reasons for nonmedical use of prescription stimulants (30), disregarding tiredness as an unwanted side effect could partly explain why these drugs are used despite potential risks and unclear benefit, especially if the effect could be evoked even by a medication without active component (a placebo).

There were no group differences regarding the other complaints, except for irritation of nose and throat, which is attributable to the capsaicin in the nasal spray and was therefore reported significantly more often in the experimental groups than in the natural history group. Possibly, participants assumed that a single dose of the study medication would not lead to the side effects as described, but, based on previous experiences when taking medications, implicitly assumed that such side effects would primarily occur when regularly taking the same medication.

#### Limitations

First of all, we cannot rule out a certain self-selection bias of the participants such that we may have tested primarily individuals willing to try a cognitive-performance-modulating drug using a new route of delivery (see the cover story of the study) in an experimental setting and/or individuals with prior experience with such drugs. Additionally, due to the cover story, participants might have expected to get the chance of trying a performance-enhancing drug and were disappointed when they were allocated to the nocebo or natural history group, potentially leading to a reduced motivation and commitment.

Furthermore, our findings are limited to healthy adults. As stated by Fuhr and Werle (17), psychological interventions for enhancing cognitive performance might even be more effective for patients with impaired cognitive functioning, e.g., when suffering from an affective disorder. Patients might be more susceptible to expectancy manipulation and might benefit both subjectively and objectively, for example, due to better concentration, greater motivation, and higher perceived self-efficacy.

The approach to evoke adverse symptoms *via* information provided in the drug information leaflets may not have been optimal, as they were described next to the drug action effects. Participants may have focused on the potentially desired effects and may have disregarded the adverse effects, especially if assuming, based on personal experience, that "side effects" occur primarily when a drug is taken repeatedly.

We also cannot rule out that the placebo and the nocebo instruction might not have been fully equivalent, since we referred to the substance name in the placebo condition (Modafinil), but used the trade name in the nocebo condition (Vividrin®). Given that Vividrin® is relatively well known, this might have triggered more expectations, thus confounding our expectancy manipulation.

In placebo/nocebo studies, in general, the situational context strongly influences study outcomes. Participants may not have fully believed that an actual drug, especially with negative effects on cognitive performance, would be applied at a department of psychology, especially with no physician being ostensibly involved.

Finally, it should be noted that attentional performance is just one facet of cognitive performance. However, unlike most previous studies, we used several tests of attentional performance rather than relying on just one or two tests. If performance expectancy primarily alters those cognitive functions entailing for example a strong motivational component, future studies should seek to use more comprehensive cognitive test batteries to elucidate which cognitive functions may be susceptible to performance expectancy effects.

To our knowledge, the present study is the first study investigating expectancy effects in pharmacological neuroenhancement including both placebo and nocebo instructions, assessing performance expectation directly after the intended manipulation and perceived change in cognitive performance, as well as cognitive measures. Additionally, it is the first study investigating drug-specific side effects of placebo- and nocebo-medication in the context of pharmacological neuroenhancement.

#### Conclusions

Manipulation of performance expectation affects the perceived change in performance and tiredness in healthy adults. This may explain why college students use such drugs despite their small, if any, impact on actual cognitive functioning. Therefore, future studies should systematically assess the role of performance expectation, perceived change in performance, and tiredness in predicting future use of prescription drugs to improve cognitive performance. Future studies should also address whether enhancing placebo effects could be helpful in improving perceived or actual deficits in cognitive performance. This could stimulate clinical studies on utilizing placebo effects in clinical practice, for example, in patients suffering from affective disorders. Future studies should entail different cognitive tasks such that it can be determined what makes a cognitive task susceptible to expectancy manipulation. This holds the opportunity to elucidate the underlying mechanisms of such placebo/nocebo responses. With respect to the effect of nocebo responses on cognitive performance, our results suggest that demonstrating differences between a nocebo group (expectation manipulation intended to decrease performance expectation) and a natural history group seems to be challenging, due to potential nocebo effects within the natural history group. Nevertheless, in direct comparison with a placebo group (expectation manipulation intended to increase performance expectation), our data provide evidence that a nocebo intervention affects the perceived change in performance, irrespective of any changes in the actual cognitive performance. Future studies should apply alternative approaches to a natural history control group. Additionally, it might be beneficial to address placebo and nocebo effects on cognitive performance in separate studies, to avoid violating the expectations of participants who are interested in trying a pharmacological neuroenhancer but receive either no medication at all or a substance supposed to provoke the opposite effect.

The present findings add to the growing body of evidence that highlights the influence of prescription-stimulant-related expectancies on subjective outcomes but not cognitive performance. This finding implies that more information about the role of subjective expectations and the discrepancy between subjectively perceived and actual changes in cognitive performance needs to be communicated to the public in an attempt to modify beliefs held by (potential) users, thus possibly correcting individual beliefs about the benefit of such drugs.

#### REFERENCES

### DATA AVAILABILITY

The datasets generated for this study are available on request to the corresponding author.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the local ethics committee of the faculty of psychology at Justus-Liebig-University Giessen, Germany with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the local ethics committee of the faculty of psychology at Justus-Liebig-University Giessen, Germany.

### AUTHOR CONTRIBUTIONS

Both authors contributed to the conception and design of the study. AW and CH conducted the statistical analysis. AW wrote the first draft of the manuscript. Both authors approved the submitted version.

### ACKNOWLEDGMENTS

We thank Tim Bartenschlager for his assistance in data acquisition.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Winkler and Hermann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# No Reason to Feel Sick? Nocebo Responses in the Placebo Arms of Experimental Endotoxemia Studies

*Sven Benson\* and Sigrid Elsenbruch*

*Institute of Medical Psychology and Behavioral Immunobiology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany*

Adverse side effects are reported by a large proportion of patients undergoing medical treatment in clinical practice or clinical trials. Nocebo effects, induced by negative treatment expectancies, can contribute to negative patient-reported outcomes but have rarely been studied in the context of inflammatory or immune-related conditions. Based on perceived treatment allocation, we herein analyzed nocebo responders in the placebo arms of randomized controlled double-blind experimental endotoxemia studies. We hypothesized that nocebo responders would report more bodily sickness symptoms and greater mood impairment. Out of *N* = 106 participants who had all received placebo injection, *N* = 20 (18.9%) wrongly believed they had received endotoxin and were thus considered as nocebo responders. Nocebo responders reported significantly more bodily sickness symptoms, suggesting that the perception of bodily symptoms affected perceived treatment allocation. Against our expectations, we did not find differences between nocebo responders and controls in psychological or physiological parameters. However, exploratory correlational analysis within nocebo responders revealed that more pronounced bodily sickness symptoms in response to placebo were associated with greater state anxiety and negative mood, as well as with the psychological traits catastrophizing and neuroticism. Our findings support that negative affectivity and personality-related factors may contribute to the reporting of sickness symptoms. Nonspecific symptoms experienced by patients undergoing pharmacological treatments or in randomized controlled trials can be misinterpreted and/or misattributed as unwanted side effects affecting perceived treatment allocation and presumably treatment satisfaction or its perceived efficacy. More nocebo research in the context of acute and chronic inflammatory conditions is warranted.

Keywords: nocebo response, placebo condition, immune system, inflammation, experimental endotoxemia, sickness behavior, symptom perception, side effects

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany*

#### *Reviewed by:*

*Karl Bechter, University of Ulm, Germany Andrea W.M. Evers, Leiden University, Netherlands*

*\*Correspondence: Sven Benson sven.benson@uk-essen.de*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 22 December 2018 Accepted: 28 June 2019 Published: 17 July 2019*

#### *Citation:*

*Benson S and Elsenbruch S (2019) No Reason to Feel Sick? Nocebo Responses in the Placebo Arms of Experimental Endotoxemia Studies. Front. Psychiatry 10:511. doi: 10.3389/fpsyt.2019.00511*

### INTRODUCTION

Adverse side effects are reported by a large proportion of patients taking medications, with negative implications for compliance, treatment continuation, and health-related quality of life (1). Owing to advances in the placebo field, it has become abundantly evident that patient-reported health outcomes including side effects are not solely explained by the specific pharmacological actions of a drug or medical treatment. Indeed, nocebo effects induced by negative treatment expectancies contribute to so-called nonspecific side effects, including the generation of unwanted side effects or the worsening of symptoms (2–4). This has been shown in the placebo arms of randomized controlled trials (RCTs), where the pattern of reported side effects mimics that of the verum arm (3). Nocebo effects also occur in routine care when negative treatment expectations are formed by the psychosocial treatment context, e.g., during informed consent (1, 2). Thus far, much of the existing knowledge on patient-reported nocebo effects comes from experimental pain research and the analysis of placebo arms of RCTs. Nocebo effects have rarely been studied in the context of inflammatory or immune-related conditions, despite their broad clinical relevance (5, 6).

Aiming to close this research gap and to spark interest in translational research on nocebo effects in the context of acute inflammation, we herein analyzed nocebo-induced sickness behavior in the placebo arms of experimental endotoxemia studies. The experimental application of endotoxin is an established translational model to induce a transient systemic immune activation in healthy individuals (7). Experimental endotoxemia results in a well-characterized response encompassing psychological and bodily symptoms referred to as sickness behavior, which includes negative mood, fatigue, hyperalgesia, and nonspecific bodily symptoms (7). Sickness behavior can also occur as a side effect of immune therapies and may contribute to mood disorders during chronic infection or conditions characterized by chronic inflammation (8). While many of the individual symptoms that characterize sickness behavior have been found to be modifiable by nocebo mechanisms, the collective symptom spectrum that characterizes sickness behavior in the context of acute inflammation has never been studied from a nocebo perspective.

We therefore merged data from the placebo arms of several randomized controlled double-blind endotoxemia studies conducted in our laboratory, implementing highly standardized informed consent and experimental procedures. Volunteers repeatedly received verbal and written information about effects and side effects of experimental endotoxin application during informed consent. We assessed perceived treatment allocation 24 h after the injection of placebo, assuming that an incorrect allocation (i.e., perceived endotoxin treatment when placebo was actually received) represents a nocebo responder. We compared the group of nocebo responders with volunteers with a correct treatment allocation (i.e., control group: correct perceived allocation to placebo treatment). We specifically hypothesized that nocebo responders would report more sickness behavior symptoms, i.e., more bodily sickness symptoms and greater mood impairment. We further conducted exploratory analyses to identify psychological and physiological parameters related to the "nocebo response."

### MATERIAL AND METHODS

#### Participants and Study Protocol

This merged dataset comprises a total of *N* = 106 healthy volunteers (*n* = 15 women, 15.4%), who were randomized to receive a placebo injection in one of our previous (9–12) or ongoing randomized controlled double-blind endotoxemia studies. Across studies, volunteers underwent an identical and highly standardized recruitment process with verbal and written information about effects and side effects of experimental endotoxin application. Rigorous screening comprising clinical and laboratory assessment was conducted at multiple time points to exclude any physical or psychological conditions. Prior participation in any experimental endotoxin study was exclusionary. Hence, all participants were endotoxin-naïve, which excluded prior experience with endotoxin-induced sickness symptoms and with the study-specific psychosocial treatment context. All primary studies were conducted in medically equipped study rooms at the University Hospital Essen, Germany [for details, see Ref. (13)]. On the study days, an intravenous catheter was placed in a forearm vein for repeated blood withdrawals and for the injection of low-dose endotoxin or placebo. Before injection, volunteers were informed by the study physician that they would receive either the "test substance endotoxin or an inert substance in a double-blind manner." Before (baseline) and up to 6 h after injection, repeated assessments (see below) of bodily and psychological sickness symptoms along with vital parameters (blood pressure, heart rate, body temperature) were conducted, and blood samples were collected for the analysis of inflammatory markers (not shown) and cortisol concentrations. Perceived treatment allocation was assessed 24 h after injection. All studies were conducted in accordance with the Declaration of Helsinki and were approved by the Institutional Ethics Review Board of the Medical Faculty of the University of Duisburg-Essen. All participants gave written informed consent and received financial compensation for study participation.

#### Measures

Before the study day, psychological traits, including trait anxiety (State-Trait Anxiety Inventory, STAI-T), depression (Beck Depression Inventory, BDI), personality (NEO Five-Factor Inventory, NEO-FFI), and coping strategies (Pain-Related Self-Statement Scale, PRSS), were assessed with validated questionnaires. On the study day, state anxiety (State-Trait Anxiety Inventory, STAI-S), mood (Multidimensional Mood Questionnaire, MDBF), and bodily sickness symptoms (General Assessment of Side Effects, GASE) were repeatedly measured with standardized questionnaires. Perceived treatment allocation was retrospectively assessed with a brief questionnaire 24 h after injection, when volunteers returned to the lab (forced choice of answers: believed to have received endotoxin or believed to have received placebo). Plasma cortisol concentrations were measured with a commercial enzyme-linked immunosorbent assay (ELISA) according to the manufacturer's instructions. For details on all measures, see Ref. (13).

#### Statistical Analyses

Nonparametric tests were used given non-normal distribution of data. Group differences between nocebo responders and controls were analyzed with chi² and Mann–Whitney *U* tests. To test our hypotheses, nocebo responders were compared with a parallelized control group, matched for age, sex, and primary study to account for putative effects of these variables. In an additional analysis, nocebo responders were compared with the full control sample to increase transferability and transparency. To explore if specific parameters were associated with a more pronounced "nocebo response," correlations between bodily sickness symptoms and psychological and physiological variables were computed using Spearman's rho. If not otherwise indicated, data are shown as mean ± SD (instead of median and interquartile range) to increase clarity.
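The two rank-based statistics named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and the example scores are hypothetical, and the sketch computes only the raw Mann–Whitney *U* statistic and Spearman's rho, without p-values or standardized test statistics.

```python
from itertools import groupby
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    for _, group in groupby(order, key=lambda i: values[i]):
        idx = list(group)
        mean_rank = pos + (len(idx) + 1) / 2  # mean of pos+1 .. pos+len(idx)
        for i in idx:
            ranks[i] = mean_rank
        pos += len(idx)
    return ranks


def mann_whitney_u(group_a, group_b):
    """U for group_a: its rank sum minus the smallest possible rank sum."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    return sum(ranks[:n_a]) - n_a * (n_a + 1) / 2


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical symptom scores, for illustration only (not study data).
responders = [3, 5, 2, 14, 4]
controls = [0, 1, 0, 2, 1]
print("U =", mann_whitney_u(responders, controls))

# Hypothetical paired scores within one group (symptoms vs. a trait score).
trait_scores = [10, 14, 9, 20, 15]
print("rho =", spearman_rho(responders, trait_scores))
```

Because both statistics operate on ranks, a single extreme value (such as the symptom score of 14 in this toy data) affects them far less than it would affect a mean-based comparison.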

### RESULTS

Out of *N* = 106 participants who had all received placebo injection, *N* = 20 (18.9%) wrongly believed that they had received endotoxin and were thus considered as nocebo responders. Nocebo responders did not significantly differ in sociodemographic or psychological trait variables, nor in baseline physiological (i.e., cortisol, heart rate, blood pressure, body temperature) or psychological state (i.e., state anxiety, mood) variables from parallelized and full control samples (see **Table 1**).
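The classification rule and the reported proportion can be retraced with a short Python sketch. The function name and the label strings are illustrative assumptions. The answer list is reconstructed only from the two counts given in the text (20 of 106) and is not the study dataset.

```python
from collections import Counter


def classify(actual, perceived):
    """Label a placebo-arm participant by perceived treatment allocation."""
    if actual != "placebo":
        raise ValueError("this sketch covers placebo-arm participants only")
    if perceived == "endotoxin":
        return "nocebo responder"  # wrongly believed endotoxin was given
    if perceived == "placebo":
        return "control"  # correctly believed placebo was given
    raise ValueError(f"unexpected forced-choice answer: {perceived!r}")


# Reconstructed forced-choice answers matching the reported counts.
answers = ["endotoxin"] * 20 + ["placebo"] * 86
counts = Counter(classify("placebo", a) for a in answers)
total = sum(counts.values())
share = 100 * counts["nocebo responder"] / total
print(f"{counts['nocebo responder']} of {total} = {share:.1f}%")
```

With these counts, the computed share rounds to the 18.9% stated above.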

In response to placebo injection, nocebo responders reported significantly more bodily sickness symptoms compared both to the parallelized (*U* = −3.12, *p* = 0.002) and full control samples (*U* = 4.05, *p* < 0.001) (**Figure 1A**, **Table 1**). Notably, differences remained significant if one nocebo responder with an extremely high symptom score of 14 was excluded from analyses (not shown). Against our expectation, we did not find evidence for increased state anxiety or impaired mood in nocebo responders (**Table 1**). In addition, no group differences were observed in blood pressure (not shown), heart rate, or plasma cortisol concentrations analyzed herein as biological markers of arousal (**Table 1**).

To explore if specific variables were associated with a more pronounced nocebo response, correlational analyses were conducted within nocebo responders. Herein, we observed that more pronounced bodily sickness symptoms were significantly correlated with PRSS catastrophizing coping (rho = 0.66, *p* = 0.002; **Figure 1B**) and with NEO-FFI neuroticism (rho = 0.49, *p* = 0.041) scores. Moreover, bodily symptoms were associated with higher state anxiety (STAI-S) assessed at 3 h postinjection (rho = 0.46, *p* = 0.040; **Figure 1C**), and with negative mood (MDBF) scores 3 h (rho = −0.55, *p* = 0.013; **Figure 1D**) and 6 h postinjection (rho = −0.46, *p* = 0.041). No significant correlations were found within the parallelized control group (all rho < 0.15, *p* > 0.56).

*STAI, State-Trait Anxiety Inventory; BDI, Beck Depression Inventory; NEO-FFI, NEO Five-Factor Inventory, "Big Five Personality questionnaire"; PRSS, Pain-Related Self-Statement Scale; GASE, adapted version of the Global Assessment of Side Effects Scale; MDBF, Multidimensional Mood Questionnaire, subscale negative mood. Nocebo responders were compared a) to a parallelized group of controls matched for age, sex, and primary study, as well as b) to the full sample of controls. All data are shown as mean ± SD unless otherwise indicated. Significant group differences are printed in bold.*

95% CI. Group differences remain significant after exclusion of one nocebo responder with an extremely high symptom score of 14 (not shown). Figures 1B–D show correlations between bodily sickness symptoms and FSS passive coping scores (B), State-Trait Anxiety Inventory (STAI) state anxiety scores (C), and Multidimensional Mood Questionnaire (MDBF), subscale negative mood, scores (D). Please note that the reported correlations for mood and coping remain statistically significant after exclusion of the volunteer with a sickness symptom score of 14, while the correlation for state anxiety is no longer significant (rho = 0.37, *p* = 0.12).

### DISCUSSION

The experimental endotoxemia model offers a unique approach to analyze nocebo effects in the context of expected inflammation-induced sickness symptoms. Based on perceived treatment allocation, we herein analyzed nocebo responders within over 100 healthy volunteers in the placebo arms of randomized controlled endotoxin studies. Retrospective ratings of perceived treatment allocation revealed that ~20% of the placebo-treated volunteers believed they had received endotoxin and were thus classified as nocebo responders. This proportion is comparable to nocebo response rates in randomized controlled drug trials, which can be even higher (14, 15). Nocebo responders reported significantly more bodily sickness symptoms, suggesting that the perception of symptoms affected perceived treatment allocation. Indeed, it has been proposed that mild, benign ailments (e.g., fatigue, headaches, drowsiness) are commonly reported even by healthy individuals not taking any medication and that such unspecific symptoms can be misattributed as unwanted drug effects in pharmacological trials (1). Supporting this notion, perceived treatment allocation was related to pain symptoms after dental surgery in clinical trials (16). Furthermore, retrospectively assessed perceived treatment allocation in a brain imaging study on placebo analgesia was preceded by alterations in neural pain processing, supporting that perceived treatment allocation is not a mere reporting bias (17). Our findings lend indirect support to the use of active placebos that mimic the (side) effects of active treatments in experimental nocebo research. If the perception of symptoms reinforces negative treatment expectations, it will indeed strengthen the assumption that an active treatment was given and hence boost nocebo effects. At the same time, active placebos could help overcome the problem of allocation concealment and blinding of patients in clinical trials (3, 18, 19).

Our second aim was to explore characteristics of nocebo responders. Against our expectations, we did not find differences between nocebo responders and controls in psychological or physiological parameters beyond bodily sickness symptom scores. However, correlational analysis revealed associations between the nocebo response and psychological parameters, which were exclusively observable within nocebo responders. This exploratory analysis suggests that nocebo responders are not characterized by alterations in psychological characteristics per se, but rather by a different contribution of psychological states and traits to the perception of sickness symptoms. In detail, we observed that more pronounced bodily sickness symptoms in response to the placebo injection were associated with greater state anxiety and negative mood, as well as with catastrophizing and neuroticism. The impact of anxiety and the anxiety-related neurotransmitter cholecystokinin on nocebo effects in pain has already been established (20). Similar processes in the perception of unspecific sickness symptoms are conceivable. It is also possible that nocebo responders misinterpreted normal somatic effects of emotional arousal induced by the injection, blood draws, or other aspects of the treatment context as side effects of endotoxin (1). Catastrophizing and neuroticism have previously been related to the perception of somatic symptoms in health and disease [e.g., Refs. (21, 22)]. Our data now support that these personality characteristics may also contribute to nocebo responses in the context of nonspecific somatic complaints. It is tempting to speculate that negative affectivity and personality-related factors have contributed to the perception and a misattribution of symptoms herein, which ultimately affected perceived treatment allocation. This would also be in line with the existing literature on predictors of nocebo responses, especially supporting a role of anxiety (23). 
However, current knowledge is scarce and far from conclusive (19), and our exploratory correlational findings need to be interpreted with caution. Keeping this limitation in mind, our data do not support a role of an exaggerated stress response in the generation of the nocebo response, as suggested by nonsignificant findings for cortisol and heart rate. Nevertheless, future studies in animals and humans should also aim to analyze the effects of repeated challenges and take the complex interaction between the generation of nocebo symptoms, aberrant neuro-immune communication, and functional changes in microglia activation (e.g., states of para-inflammation) (24) into account.

From a clinical perspective, our findings illustrate how information about immune-related sickness symptoms provided during informed consent can induce nocebo responses. Indeed, the incidence of adverse side effects after drug intake was affected by the disclosure of side effects (25–28). Another recent example is the discussion of whether switching from biologic agents to biosimilars may lead to nocebo responses in patients with autoimmune conditions (29). This further supports that negative information provided by health care professionals, leaflets, the media, etc. can induce nocebo effects in the context of medical interventions (2, 30), likely including those taking place in the vast clinical context of inflammation and immunity.

The strengths of our work include the translational and clinically relevant endotoxemia model with its broad spectrum of sickness symptoms, implemented using highly standardized experimental and informed-consent procedures. While this entire psychosocial treatment context invariably induces negative expectations, we unfortunately did not specifically quantify individual treatment or symptom-related expectations. This is a limitation and an important future direction, as it would allow a better understanding of cognitive factors associated with nocebo responses. Furthermore, despite the large overall sample, the number of nocebo responders was small and allowed only simple correlational analyses rather than more sophisticated statistical approaches. Thus, our correlational findings do not allow causal interpretations and should be interpreted with caution. Herein, nocebo responders were classified based on a dichotomous scale. Future research could improve upon this by assessing the perceived probability of a specific treatment. This would allow more refined analyses on decision making in the context of nocebo responses. It remains open if the present findings are transferable to nocebo responses in the endotoxin arms; however, recent reports support the relevance of treatment expectations (31) and psychological parameters (13) for the intensity of sickness symptoms during real pharmacological treatment. Future research is needed to expand knowledge that herein was gathered in a small, highly selected sample of healthy young volunteers studied in an experimental laboratory setting to larger samples in clinical contexts.

### ETHICS STATEMENT

All studies were conducted in accordance with the Declaration of Helsinki and were approved by the Institutional Ethics Review Board of the Medical Faculty of the University of Duisburg-Essen. All participants gave written informed consent and received financial compensation for study participation.

### AUTHOR CONTRIBUTIONS

SE and SB contributed to the conception and design of the study. SB organized the database and performed the statistical analysis. SB and SE wrote the manuscript and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

### FUNDING

The study was funded by the German Research Foundation (Deutsche Forschungsgemeinschaft; DFG) (to SB: BE 5173/2-1 and BE 5173/3-1; to SE: FOR 1328).

### ACKNOWLEDGMENTS

The authors would like to express their gratitude to all contributors and coauthors of the primary studies, especially Alexandra Brinkhoff, Elisa Engelbrecht, Harald Engler, Julian Kleine-Borgmann, Simone Kotulla, Larissa Lueg, Janina Maluck, Daniel Pastoors, Laura Rebernik, Philipp Rödder, Till Roderigo, Manfred Schedlowski, Eva Stemmler, and Alexander Wegner.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Benson and Elsenbruch. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The Use of Expectancy and Empathy When Communicating With Patients With Advanced Breast Cancer; an Observational Study of Clinician–Patient Consultations

*Liesbeth Mirjam van Vliet1,2,3\*, Anneke L. Francke2,4, Maartje C. Meijers1,2, Janine Westendorp2, Hinke Hoffstädt2, Andrea W.M. Evers1,3,5, Elsken van der Wall6, Paul de Jong7, Kaya J. Peerdeman1,3, Jacqueline Stouthard8 and Sandra van Dulmen2,9,10*

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany*

#### *Reviewed by:*

*Johannes A. C. Laferton, University of Marburg, Germany Andreas Dinkel, Technical University of Munich, Germany*

#### *\*Correspondence:*

*Liesbeth van Vliet l.m.van.vliet@fsw.leidenuniv.nl*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 19 March 2019 Accepted: 12 June 2019 Published: 17 July 2019*

#### *Citation:*

*van Vliet LM, Francke AL, Meijers MC, Westendorp J, Hoffstädt H, Evers AWM, van der Wall E, de Jong P, Peerdeman KJ, Stouthard J and van Dulmen S (2019) The Use of Expectancy and Empathy When Communicating With Patients With Advanced Breast Cancer; an Observational Study of Clinician–Patient Consultations. Front. Psychiatry 10:464. doi: 10.3389/fpsyt.2019.00464*

*1 Health, Medical and Neuropsychology Unit, Institute of Psychology, Leiden University, Leiden, Netherlands, 2 Department of Communication, NIVEL, Netherlands Institute of Health Services Research, Utrecht, Netherlands, 3 Leiden Institute for Brain and Cognition (LIBC), Leiden University, Leiden, Netherlands, 4 Amsterdam Public Health Institute, Vrije Universiteit, Amsterdam, Netherlands, 5 Department of Psychiatry, Leiden University Medical Center, Leiden, Netherlands, 6 Department of Medical Oncology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands, 7 Department of Medical Oncology, St Antonius Hospital, Utrecht, Netherlands, 8 Department of Medical Oncology, Netherlands Cancer Institute, Amsterdam, Netherlands, 9 Department of Primary and Community Care, Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, Netherlands, 10 Faculty of Health and Social Sciences, University of South-Eastern Norway, Drammen, Norway*

Background: Information provision about prognosis, treatments, and side-effects is important in advanced cancer, yet also associated with impaired patient well-being. To counter potential detrimental effects, communication strategies based on placebo and nocebo effect mechanisms might be promising to apply in daily practice. This study aimed to provide more insight into *how often* and *how* oncologists use expectancy and empathy expressions in consultations with patients with advanced breast cancer.

Methods: Forty-five consultations between oncologists and patients were audiotaped. To determine how often expectancy and empathy expressions were used, a coding scheme was created. Most consultations (*n* = 33) were coded and discussed by two coders, and the remaining 13 were coded by one coder. To determine how expectancy and empathy expressions were used, principles of inductive content analysis were followed.

Results: Discussed evaluation (i.e., scan) results were good (*n* = 26, 58%) or uncertain (*n* = 12, 27%) and less often bad (*n* = 7, 15%). Uncertain expectations about prognosis, treatment outcomes, and side effects occurred in 13, 38, and 27 consultations (29%, 85%, and 56%), followed by negative expectations in 8, 26, and 28 consultations (18%, 58%, and 62%) and positive expectations in 6, 34, and 17 consultations (13%, 76%, and 38%). When oncologists provided expectancy expressions, they tapped into three different dimensions: relational, personal, and explicit. Positive expectations emphasized the doctor–patient relationship, while negative expectations focused on the severity of the illness, and uncertainty was characterized by a balance between (potential) negative outcomes and hope. Observed generic or specific empathy expressions were regularly provided, most frequently understanding (*n* = 29, 64% of consultations), respecting (*n* = 17, 38%), supporting (*n* = 16, 36%), and exploring (*n* = 16, 36%). A lack of empathy occurred less often and contained, among others, not responding to patients' emotional concerns (*n* = 13, 27% of consultations), interrupting (*n* = 7, 16%), and an absence of understanding (*n* = 4, 9%).

Conclusion: In consultations with mainly positive or uncertain medical outcomes, oncologists predominantly made use of uncertain expectations (*hope for the best, prepare for the worst*) and used several empathic behaviors. Replication studies, e.g., in these and other medical situations, are needed. Follow-up studies should test the effect of specific communication strategies on patient outcomes, to counter potential negative effects of information provision. Studies should focus on uncertain situations. Ultimately, specific placebo and nocebo effect-inspired communication strategies can be harnessed in clinical care to improve patient outcomes.

Keywords: communication, placebo effects, nocebo effects, empathy, expectancy, cancer, palliative care, observational study

### INTRODUCTION

When faced with a serious disease such as advanced breast cancer, patients need information to understand what is going on and to plan for their future (1). Information about prognosis, treatment outcomes and plans, and benefits and risks of treatments is essential to provide optimal patient-centered care. Earlier data showed that patients who experienced adequate information about treatment benefits and risks also experienced better person-centered care (2).

Despite its importance, information provision is by no means a "magic bullet" and also entails risks. There are several possible negative effects of information provision in advanced cancer. Explicit information about the incurability of a disease seems appreciated by most, but not all, patients (3–5). Patients who are fully aware of their poor prognosis are also the ones with the lowest reported quality of life and highest anxiety (6). It is known that providing information about side effects can increase their occurrence (7). A large study showed, for example, that breast cancer patients with relatively higher expectations of side effects are the ones experiencing the most side effects (8). While information provision is thus one of the cornerstones of communication (9), it can also lead to negative effects on patients' well-being.

To counter any of these potential negative effects, communication strategies derived from placebo and nocebo mechanisms might be promising to apply in daily practice. Integrating the research worlds of communication and placebo effects is still in its infancy (10). Placebo effects can be seen as "all real biopsychological effects on patient outcomes that are not attributable to a medical-technical explanation" (11, 12). The most well-known mechanism *via* which placebo effects occur is the expectancy mechanism. There is ample evidence (mainly from experimental studies) that the use of positive expectations can influence clinical patients' outcomes for the better (13, 14). For example, post-operative patients are known to experience less pain when pain medication is delivered in full view while verbally raising positive expectations about its effectiveness (15, 16). A second possible placebo effect mechanism affecting patient outcomes is the empathy mechanism, which is only mentioned by few scholars so far (10, 17, 18). From communication studies, we know that empathy is highly appreciated by patients (3, 19). From experimental studies in advanced breast cancer, we know that physician empathy is capable of reducing patients' emotional distress, while increasing information recall (4, 20, 21).

It is, however, unclear if and how expectancy and empathy strategies are currently employed by clinicians when discussing prognosis, treatment outcomes, and side effects with patients with advanced cancer. The aim of this study is to provide more insight into *how often* and *how* oncologists use expectancy and empathy expressions in consultations with patients with advanced breast cancer. This study serves as a starting point for a research area aimed at creating more insight into possible beneficial placebo and nocebo effect-inspired communication strategies. Future studies should test the effect of specific communication strategies on patient outcomes, before the most beneficial strategies can be harnessed in clinical care.

### METHODS

#### Design

We conducted a multi-center observational study of consultations between 12 oncologists and 45 patients with advanced breast cancer. Consultations were audiotaped, as audio observations provide more objective insights into communication behavior than self-reports. Data were collected between August and December 2018 at two Dutch city-based hospitals (one cancer-specific hospital and one general hospital).

#### Ethical Approval

The study was evaluated by the Medical Ethical Committee of the Netherlands Cancer Institute (NKI-AVL), which exempted the study from formal ethical approval. Both participating hospitals approved the conduct of the study in their respective hospitals. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

#### Sample

Initial consultations for patients with advanced breast cancer (i.e., the first time that patients would be informed that their disease is incurable) or follow-up visits in which evaluation results (i.e., scan results) would be discussed were included. It is likely that in these consultations, a detailed discussion of prognosis, treatment outcomes, and side effects would occur. The consultations had to include patients who were female, were ≥18 years of age, had advanced cancer in the sense that cure was no option anymore (according to the medical team), were not in the terminal phase of their disease, were cognitively able to provide consent and to complete a questionnaire, and who had command of the Dutch language.

#### Recruitment

The medical team of the participating hospitals screened (mostly) weekly for eligible consultations and eligible patients. If there was too little time between identification of the consultation and the opportunity to recruit patients, eligible patients were not contacted. The remaining eligible patients were contacted by a member of the hospital team with a brief introduction of the study. The contact details of interested patients were transferred to the research team, who explained the study in more detail *via* telephone contact with the eligible patient. More specifically, patients were informed that the study focused on communication between oncologists and patients, that one consultation would be audiotaped, and that participants would have to complete both a pre-consultation question and a post-consultation questionnaire (only the post-consultation questionnaire assessing patient characteristics is included in this article, as this was a descriptive study). The research team did not mention the advanced stage of the disease. Preliminary oral consent was provided *via* telephone, after which patients were sent a written information letter *via* post or e-mail, and written consent was gathered by the research team immediately pre-consultation in the waiting area of the hospital. It was stressed that participation was voluntary and that patients could always withdraw their participation. Participating oncologists also provided consent for the consultations to be audiotaped.

#### Sample Size

Because this was an audio-observation study of medical consultations (i.e., medical interviews) in which communication is explored in detail, we aimed for data saturation. Taking into account the variability in patients, oncologists, and consultations, we aimed for a somewhat larger sample of consultations than normally recommended (22), namely 35–40 consultations between patients and oncologists.

#### Outcomes

#### Background Characteristics: Participants and Consultations

Patients' sociodemographic characteristics (e.g., age, ethnicity, education) and disease characteristics (i.e., treatments currently being received) were assessed post-consultation using a self-created questionnaire.

Characteristics of the consultation were assessed by the coding team. This included consultation time and whether the provided evaluation results (i.e., scan results) in the consultations were "good" (e.g., regression or stable disease), "uncertain" (e.g., clinical data from scan results and blood results are contradictory), or "bad" (e.g., disease progression). These criteria were determined in collaboration with the practicing oncologists who were part of the research and authorship team (EW, PJ, and JS). The core coding team (LV, MM, JW, and HH) determined together the category of each result.

#### Coding

To determine the occurrence of expectancy and empathy expressions, we created a coding scheme. This coding scheme was based on previous studies in the field of communication and placebo and nocebo effect research [expectancy references (23–28) and empathy references (4, 19–21, 29–35)], observations of other recorded consultations, and clinical and research expertise. See **Table 1** for a more detailed overview and explanation of the coding scheme.

For the expectancy expressions, the coding scheme addressed the number and content of oncologist-expressed positive, negative, or uncertain expectations regarding i) prognosis, ii) treatment outcomes, iii) side effects, and iv) others. This latter category was created to ensure we would not miss any expectancy expressions that could not be captured in our predefined categories. We did not, however, encounter any "other expectancy expressions"; hence, this category is not further discussed in the Results section.

For the empathy expressions, the coding scheme addressed the number and content of the following oncologist-expressed empathic behaviors (irrespective of whether patients expressed an emotion, termed a "cue" or "concern") (36): i) NURSE (Naming, Understanding, Respecting, Supporting, Exploring) (30, 31); ii) showing interest in the patient and her feelings, not just the disease (19); iii) not interrupting the patient (only "negative" was coded); and iv) other. We coded the occurrence of both empathic and non-empathic behaviors. We created a third response category for cases in which patients provided an emotional expression that was not picked up by oncologists, labeling this a "missed opportunity for empathy" (37).

#### Analyzing Process

The actual analyzing process consisted of several steps. We followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement (38) and the Standards for Reporting Qualitative Research (SRQR) guideline (39), for the quantitative and qualitative part of the study, respectively.

#### TABLE 1 | Coding scheme.

#### Codes and examples of expectancy-expressions

*Code for each behavior how often it occurred and give the content (sentences) from which this became apparent.* 

*It is possible that an oncologist provided several remarks which e.g. illustrate that he/she is positive about the treatment outcomes. If that is the case, code each unique occurrence and provide the content for each occurrence.* 

*If there are two occurrences in one sentence, both are coded.* 

*Positive expectancy-expressions include expressions in which an oncologist expresses positive expectations about prognosis/treatment outcomes/side effects, negative expectancy-expressions include expressions in which an oncologist expresses negative expectations about prognosis/treatment outcomes/side effects, and neutral expectancy-expressions include expressions in which an oncologist expresses neither positive nor negative but neutral expectations about prognosis/treatment outcomes/side effects.*


#### Other

#### Codes and examples of empathy expressions

*Code for each behavior how often it occurred and give the content (sentences) from which this became apparent.*

*It is possible that an oncologist provided several remarks that, e.g., showed an interest in a person. If that is the case, code each unique occurrence and provide the content for each occurrence.*

*If there are two occurrences in one sentence, both are coded.*

*For coding of the behaviors, it is not necessary that a patient expressed an explicit cue/concern. If a cue or concern was expressed, which was not responded upon by the oncologist, this is coded as "missed opportunity".*


*(only code in case of 'no')*

*Step 1:* Patients' background characteristics and consultations characteristics were analyzed using descriptive statistics.

*Step 2:* The consultations were coded to determine how often expectancy and empathy expressions were used by clinicians. All consultations were transcribed verbatim and personal identifiers were removed. First, the audiotapes of the consultations were listened to and the transcripts were read several times. Next, the abovementioned coding scheme (see **Table 1**) was applied and all specific positive/negative/uncertain expectancy expressions and empathic/non-empathic behaviors including the missed opportunities for empathy were copy-pasted from Word to a dedicated Excel template in which the specific behaviors were grouped together. In addition, how often all behaviors occurred per consultation was noted. Two investigators (MM and JW) independently coded 33 out of the 45 (73%) transcripts. All transcripts and coded segments were discussed and any discrepancies were resolved through discussion until a consensus was reached. The remaining 27% (*n* = 12) was coded by one investigator (JW). A third investigator (LV) coded all segments of a random 10% of the consultations (*n* = 4). Agreement between the investigators for all coded segments was 96.45% (136 out of 141 segments). Descriptive statistics were used to describe how often all expectancy and empathy expressions occurred per consultation. To facilitate analyses, Stata 14.0 was used.
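The agreement figure reported in Step 2 is a simple percent agreement and can be checked by hand. The following is a minimal sketch of that arithmetic; the function name `percent_agreement` is illustrative and not taken from the study (which used Stata 14.0), and the counts are the ones reported above.

```python
def percent_agreement(agreed: int, total: int) -> float:
    """Return the share of coded segments on which coders agreed, in percent."""
    if total <= 0:
        raise ValueError("total must be a positive number of segments")
    return 100.0 * agreed / total

# Counts reported in the text: 136 of 141 segments coded identically.
segment_agreement = round(percent_agreement(136, 141), 2)
print(segment_agreement)  # 96.45

# Share of transcripts that were double-coded: 33 of 45.
double_coded_share = round(percent_agreement(33, 45))
print(double_coded_share)  # 73
```

Percent agreement does not correct for agreement expected by chance. A chance-corrected measure such as Cohen's kappa would require the per-category codes of each investigator, which are not reported here.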

*Step 3:* The expectancy- and empathy-coded text segments were used to determine how oncologists use these behaviors in consultations. To do so, all the coded segments that were grouped together were explored following the principles of inductive content analysis (40). First, in the preparation phase, the text was read several times, and two researchers (LV and JW or HH) independently wrote a memo for each subset of coded behavior, noting the most remarkable outcomes and a subdivision of behaviors. These were discussed among the core researchers (LV, JW, MH, and MM). Next, in the organizing phase, text fragments belonging together were highlighted and codes were given. Emerging codes were grouped together under headings and compared to the entire dataset. In the final, reporting phase, the final categories representing subforms of specific behaviors were determined. One researcher systematically coded all text (LV, communication/psychology background), while interim results were discussed among the research team (with a psychology, nursing, sociology, medicine, and communication background) to prevent one-sided interpretation of the data (41).

#### RESULTS

#### Participants

All approached oncologists participated (*n* = 12). A total of 84 patients gave permission to be contacted by the research team. Of these, 19 gave no oral consent (they were not interested or found it too burdensome for the consultation to be audiotaped and/or to complete the questionnaires), 4 did not fulfill the inclusion criteria (e.g., they were scheduled for a check-up visit), 2 could not be reached by telephone, 10 encountered logistical problems preventing participation (e.g., there were 2 patients at the same time, the oncologist was too busy, or the consultation was cancelled), and 4 gave preliminary oral consent but withdrew their consent later. Lastly, for 2 patients who provided written consent, the audio-recordings failed. Background characteristics of the remaining 45 consenting participants are displayed in **Table 2**.

#### Consultations

The consultations lasted, on average, 18.96 min (SD = 8.00; range = 4.43–34.83). All consultations were evaluative follow-up consultations in which evaluation results (i.e., scan results) were discussed. In 26 consultations (58%), good evaluation results were discussed; in 12 consultations (27%), uncertain evaluation results were discussed; and in 7 (15%), bad evaluation results were discussed. There were no disagreements within the coding team when determining to which category a consultation belonged.

TABLE 2 | Background characteristics of participants.


*¹Low = primary education or less.*

*Intermediate 1 = lower secondary.*

*Intermediate 2 = upper secondary.*

*High = tertiary.*

*\*Out of the 45 participating women, 41 completed all questionnaires, data of the remaining 4 could not be retrieved.*

*\*\*Women can receive several treatments, so this does not add up to 100%.*

#### Use of Expectancy Expressions

#### How Often Are Expectancy Expressions Used?

#### *Positive Expectations*

Positive expectations about prognosis were provided in 6 (13%) consultations and positive expectations about side effects in 17 (38%) consultations, while positive expectations about treatment outcomes were provided in most consultations (*n* = 34, 76%). On average, positive expectations about prognosis and side effects occurred less than once per consultation, while positive expectations about treatment outcomes occurred more than twice per consultation (see **Table 3**).

#### *Negative Expectations*

Negative expectations about prognosis were provided in 8 (18%) consultations, negative expectations about treatment outcomes in 26 (58%) consultations, and negative expectations about side effects in 28 (62%) consultations. On average, negative expectations about prognosis occurred less than once per consultation, while negative expectations about treatment outcomes and side effects occurred almost twice per consultation (see **Table 3**).

#### *Uncertain Expectations*

Uncertain expectations about prognosis were provided in 13 (29%) consultations, uncertain expectations about side effects in 27 (56%) consultations, and uncertain expectations about treatment outcomes in 38 (84%) consultations. On average, uncertain expectations about prognosis occurred less than once per consultation, while uncertain expectations about treatment outcomes occurred more than four times per consultation (see **Table 3**).

#### How Are Expectancy Expressions Used

When oncologists employed expectancy expressions, they tapped into three different dimensions: i) relational, ii) personal, and iii) explicit. The relational dimension refers to the extent to which expectations enhance the oncologist–patient relationship. The personal dimension refers to the extent to which expectations incorporate a personal reflection from oncologists. The explicit dimension refers to the extent to which expectations are made explicit. The different dimensions occur to various degrees within positive, negative, and uncertain expectations.

#### *Positive Expectations*

TABLE 3 | The occurrence of expectancy expressions throughout the consultations.

| Topic | Expectation | *n* (%) | M (SD) | Range | Example content |
|---|---|---|---|---|---|
| Prognosis | Positive | 6 (13) | 0.40 (1.25) | 0–7 | *"Yes, but wait. For the time being, you're still around"* |
| Prognosis | Negative | 8 (18) | 0.40 (1.03) | 0–4 | *"Um, well that makes that I don't think your prospect is very positive"* |
| Prognosis | Uncertain | 13 (29) | 0.8 (1.84) | 0–8 | *"For how long this is going to go well? I hope for a terribly long time. Can I predict it fully? No I don't know. Every time it's for me also a bit hoping that it's OK."* |
| Treatment outcomes | Positive | 34 (76) | 2.58 (2.30) | 0–10 | *"No, these numbers are not disturbing at all, those tumor markers. I sometimes see numbers of 5,000 or 10,000"* |
| Treatment outcomes | Negative | 26 (58) | 1.78 (2.39) | 0–11 | *"Um, well yes, that test result does scare me a bit, because … well, what you see on the scan is, well, that is not going well"* |
| Treatment outcomes | Uncertain | 38 (84) | 4.29 (4.27) | 0–23 | *"There's always a possibility that it'll work or a possibility that it won't (…) And then you're back at the point of this uncertainty."* |
| Side effects | Positive | 17 (38) | 0.80 (1.24) | 0–4 | *"And we're finding a better balance with the side-effects"* |
| Side effects | Negative | 28 (62) | 1.91 (2.37) | 0–8 | *"Because for tiredness I have no miracle cure."* |
| Side effects | Uncertain | 27 (56) | 2.05 (2.84) | 0–12 | *"And some people don't experience this (side effect, ed.) at all and others a bit or very much (…) but there is no way to test that beforehand."* |

*n = number of consultations in which specific expectancy expression occurred.*

*SD = standard deviation of specific expectancy expression per consultation.*

*Range = range of specific expectancy expression per consultation.*

FIGURE 1 | (B) Visual representation of the presence and overlap of the personal/relational/explicit dimensions of negative expectancy expressions. (C) Visual representation of the presence and overlap of the personal/relational/explicit dimensions of uncertain expectancy expressions.

Positive expectations were characterized by a high degree of explicit reassurance and thereby an emphasis on the doctor–patient relationship, while oncologists regularly referred to their personal thoughts and feelings. In **Figure 1A**, these different dimensions and their overlap are visually displayed. Patients were often reassured that there are still options available, that complaints are harmless, or that side effects will not be (or are not) too serious/burdensome. Such reassurance was frequently focused on very specific situations. Oncologists also regularly stressed their own thoughts and visions, which seemed to strengthen expressed positive expectations. Lastly, the doctor–patient partnership was often emphasized by referring to "we".

*"I am not, I'm not worried about this at all. That scan is fine."*

*"With that reduced dose that (irritated mucous membranes, ed.) will also get better"*

*"And we're finding a better balance with the side effects"*

Example of a quote where the personal, relational, and explicit dimensions come together:

*"Precisely, but just um looking into the far distance, I say yes, just carry on with it. Do we still have hormonal therapy as an alternative? Yes, if necessary we'll use that. And if at a certain moment in time we are done with hormonal therapy, do we then still have something else? (…) Like chemo therapy? Yes. Even then there are some choices to be made and we'll first and foremost have to make a choice that is then acceptable to you. (…) Do I have something good? Yes, I do. Is it acceptable to you? That is what we will talk about."*

#### *Negative Expectations*

Negative expectations were characterized by a high degree of personal reflections, which seemed to strengthen a more or less explicit negative future vision. In **Figure 1B**, these different dimensions and their overlap are visually displayed. Oncologists expressed their own worries about disease progression, a lack of treatment effects, or side effects, by which they seemed to emphasize the severity of the situation.

*"Do you want me to honestly tell you how um I think it'll go? (…) Yes, I'm worried about you. Whether this will turn out well, because these blood counts, those blood platelets are suddenly so low."*

*"Because for tiredness I have no miracle cure."*

Such negative expressions varied in their level of explicitness, with treatment-related expectations often being expressed more implicitly than side-effect-related expectations, and with prognostic-related expectations being expressed both explicitly and implicitly.

*"For well, to be totally cured you have to, for that the various spots are actually too numerous."*

*"When all is said and done, the options I have are not infinite. Then it'll grow and then it'll get into your system and still further."*

With negative expectations, there was much less emphasis on relationship building. On the rare occasions on which the relational dimension was tapped into, oncologists seemed to either emphasize or de-emphasize the clinician–patient relationship:

*"Yes, they are really nasty jabs. I have to admit that."*

#### *Uncertain Expectations*

Uncertain expectations were characterized by an emphasis on what an oncologist hopes for, but cannot guarantee. While expressing such hopes, oncologists both focused on their own perceptions, making it personal, and on the positive relationship with patients. In **Figure 1C**, these different dimensions and their overlap are visually displayed.

*"For how long this is going to go well? I hope for a terribly long time. Can I predict it fully? No I don't know. Every time it's for me also a bit hoping that it's OK."*

Most importantly, uncertain expectations seemed to represent a balancing act. On the one hand, patients were being prepared for negative outcomes such as a future discontinuation of treatments or occurrence of problematic side effects. On the other hand, potential possibilities were mentioned, which were not presented as "magic bullets" but as a quest for a balance between treatment (intensity) and side effects.

*"So the first step is reducing the dose a bit and at a certain moment we'll be putting in weeks of rest, with you doing two weeks followed by a week of no treatment. Um and doing so you hope that at a given time you'll find a sort of stable situation that is doable for you, that you can get on with, doesn't bother you too much yeah you'll experience some bother, but something that you can get on with. If we should see that this causes problems, yeah well, then we'll have to find the right balance, for that's of course always what it is; the balance between side effect and effect."*

Uncertain expectations about current and future treatment options and side effects were predominantly implicit in nature, but also sometimes more explicit (especially regarding treatment outcomes). They focused on (the source of) side effects and complaints that are currently present or might develop in the future, but also on the continuation of current and future treatments.

*"And some people don't experience this (side effect, ed.) at all and others a bit or very much (…) but there is no way to test that beforehand."*

Oncologist: *"There's always a possibility that it'll work or a possibility that it won't."* Patient: *"Umm mm."* Oncologist: *"And then you're back at the point of this uncertainty."*

#### Use of Empathy Expressions

#### Number of Expressions

#### *Use of Empathy*

All studied empathy expressions were displayed throughout the consultations, ranging from showing understanding of emotions in 29 (64%) consultations to the use of naming emotions in 4 (9%) consultations. The other empathy expressions occurred in around a third of consultations, e.g., respecting (*n* = 17, 38%), supporting (*n* = 16, 36%), exploring of patients' emotions (*n* = 16, 36%), and showing interest in the patient (*n* = 13, 29%). On average, understanding remarks occurred more than twice per consultation, while all other statements occurred generally less than once per consultation (see **Table 4**).

#### *Lack of Empathy*

Non-empathic behaviors were infrequently displayed throughout the consultations; interrupting the patient occurred in 7 (16%) consultations, followed by a lack of understanding in 4 (9%) consultations, while showing non-supporting statements or a lack of interest in the patient occurred in 1 consultation (2%). On average, negative behaviors occurred less than once per consultation (ranging from an average of 0.2 interruptions per consultation to an average of 0.09 instances of a lack of understanding of patient emotions per consultation). However, in more than a quarter of consultations (*n* = 12, 27%), oncologists failed to pick up on an emotional expression from a patient, which occurred, on average, 0.89 times per consultation (see **Table 4**).

#### How Empathy Expressions Are Used

#### *Use of Empathy*

When oncologists used empathy expressions, they did so in several ways, which are closely aligned with the coding categories: NURSE (Naming, Understanding, Respecting, Supporting, Exploring) and showing interest in the person.

The most important distinction in empathy expressions referred to the level of specificity. Across the different NURSE categories, oncologists could either be generic in their level of expressed empathy, or, alternatively, could be specific. Specific empathic behaviors were characterized by referring to specific situations and emotions, or by referring to the individual.

Understanding generic: *"Yes, I understand."*

Understanding specific: *"Yeah, so it's really stressful, isn't it."*

Respecting generic: *"OK, that's very good" (responding to a patient saying she will walk the dog on the beach).*

Respecting specific: *"What an extraordinary person you are, aren't you."*

Exploring generic: *"For um, how um do you feel about it."*

Exploring specific: *"And um … What do you find stressful about it? Is it such a result or is it the Nivolumab itself?"*

When providing support, both generic and more specific statements were made that either referred to the oncologist proactively offering support, or referred to the patient proactively needing to request support.

Proactive oncologist generic: *"Is there anything else I can do for you?"*

Proactive oncologist specific: *"You know what, I'll give you a call tomorrow morning to see if things are getting a bit better."*

Proactive patient generic: *"Oh, right. Or you can always give me a ring."*

Proactive patient specific: *"Um … hey, so give me a ring next week if you haven't recovered from that flu yet."*

#### TABLE 4 | The occurrence of empathy expressions throughout the consultations.


*n = number of consultations in which specific empathy expression occurred.*

*(%) = percentage of consultations in which specific empathy expression occurred.*

*M = mean number of specific empathy expression per consultation.*

*(SD) = standard deviation of specific empathy expression per consultation.*

*Range = range of number of specific empathy expression per consultation.*
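The summary columns defined in these notes can be derived from one count per consultation for a given expression. The following is a minimal sketch under stated assumptions: the counts are hypothetical and not study data, the function name `summarize` is illustrative, and the use of the sample standard deviation is an assumption, since the text does not state which variant was used.

```python
import statistics

def summarize(counts):
    """Summarize how often one expression occurred across consultations."""
    n = sum(1 for c in counts if c > 0)   # consultations with at least one occurrence
    pct = 100.0 * n / len(counts)         # percentage of consultations
    mean = statistics.mean(counts)        # mean occurrences per consultation
    sd = statistics.stdev(counts)         # sample SD (assumption, see lead-in)
    return {"n": n, "pct": pct, "mean": mean, "sd": sd,
            "range": (min(counts), max(counts))}

# Hypothetical counts for five consultations (not taken from the study).
hypothetical_counts = [0, 2, 1, 0, 3]
summary = summarize(hypothetical_counts)
print(summary)
```

Note that *n* counts consultations rather than expressions, so a single consultation with many occurrences raises M and the range but adds only one to *n*.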

Lastly, there were several ways in which oncologists showed an interest in the patient as a person. These included enquiring about holidays, patients' loved ones, important days coming up, and non-cancer-related health problems.

*"OK, nice where are you going?"*

*"And how many years have you been married for?"*

#### *Lack of Empathy*

Although a lack of empathy did not frequently occur, there were a few occasions in which oncologists showed little understanding of patients' emotions by talking or laughing over them.

Patient: *"And um … well, that vocal cord, so you're saying I'd better see the ENT doctor."* Oncologist: *"We could also wait for a bit."*

Patient: *"Right. Um … is the therapy we're using now enough to extend my life?"* Oncologist: *"Oh what a difficult question ha ha [loud laughter]."*

The one occasion in which there was little interest in the person occurred when an oncologist failed to enquire about an ill loved one.

Patient: *"I'll handle this again. Well, yes the oldest son has Pfeiffer disease, so …"* Oncologist: *"Yes, you mentioned that."* Patient: *"So, yes that …"* Oncologist: *"Let's look at the blood pressure."*

If patients were interrupted, this was mainly because oncologists seemed to complete their sentences.

Patient: *"Right, so it's not as if you spinal column as one…."*  Oncologist: *"It's counted spot by spot."*

Lastly, oncologists sometimes did not respond to patients' emotional expressions.

Patient: *"Aaahhh liver biopsy really is hell. But OK you're right I'm not a wimp, but I really don't like that, but well."* Oncologist: *"No, well, right."*

#### DISCUSSION

In this observational study of consultations between oncologists and patients with advanced breast cancer, we aimed to gain insight into and create a better understanding of *how often* and *how* oncologists make use of expectancy and empathy expressions in clinical care. While there has been a recent interest in the placebo and nocebo effects of communication, and clinicians' empathic responses to patients' expressed cues and concerns have been studied extensively (see, e.g., Zimmermann et al., 2007) (42), to the best of our knowledge, this is the first study to objectively determine how clinicians use expectancy and empathy expressions in advanced clinical breast cancer care. We found that in our sample, consisting of consultations in which mainly positive or uncertain medical outcomes were discussed, oncologists predominantly expressed uncertain expectations. Provided expectations differed in the extent to which they had a relational, personal, and explicit dimension. Positive expectations emphasized the doctor–patient relationship, negative expectations focused on the severity of the illness, and uncertain expectations were characterized by a balance between (potential) negative outcomes and hope. Moreover, oncologists displayed several generic and specific empathic behaviors, most frequently showing an understanding of patients' emotions. A lack of empathy was not common, but mainly included oncologists not responding to patients' emotional expressions. In sum, although various placebo and nocebo effect-inspired communication strategies were observed, their generalizability and their effects on patient outcomes remain to be determined, especially in uncertain situations with inherently uncertain expectations.

Focusing on expectancy expressions, several of our results are noteworthy. First, most (*n* = 26, 58%) consultations contained a "good" medical outcome (i.e., scan results), but positive expectancy expressions did not occur more often than negative or uncertain expectations. It might be that oncologists in our sample were reluctant to express (overtly) positive expectations in the context of advanced cancer, as patients are known to already often hold unrealistic expectations about their disease and treatment aims (43–45). This contrasts with results from a study among heart disease patients, in which clinicians were often overly positive (46). Indeed, oncologists place great importance on not offering false hopes (47). Although very understandable, by refraining from positive expectations, oncologists might miss out on the potentially helpful effects of this communication strategy. Patients appreciate it when clinicians are optimistic (48) and stress what can be done when facing an incurable cancer diagnosis (3, 49). Moreover, outside of the area of (advanced) cancer, positive expectations have been shown to influence patient outcomes such as pain (evaluations) [(14, 50) (van Vliet et al., submitted)] and symptom burden (48). While it is a prerequisite that such expectations are realistic in nature, our insights suggest that there might be an underused potential for stressing positive aspects when communicating with patients with advanced cancer.

A second important observation was that expectation expressions differed not only in content (positive, uncertain, and negative) but also in the dimensions of being relational, personal, and explicit. By reassuring patients of the positive nature of outcomes, or by stressing that they hope for positive outcomes, oncologists in our sample not only provided information but also seemed to build a relationship, two distinct core functions of medical consultations (9). The stressful nature of discussing bad news (50), such as a lack of further treatment options, might, for some oncologists, limit the ability for relationship-building when providing negative expectations. In these situations, the severity of the situation is emphasized by making use of the negative impact of self-referring (e.g., "*I am worried"*) in contrast to its optimistic impact when raising positive expectations (e.g., "*I am not worried at all"*). Interestingly, in a series of experimental studies aimed at helpful communication styles, all communication elements that led to positive effects made use of a personal account (e.g., "*I understand you're worried. We will look together at the options"*) (4, 20, 21, 33), stressing the potential power of this dimension, also in the context of bad news. Lastly, the explicitness with which expectations were expressed varied widely, with more explicit expectations emphasizing an anticipation and implicit expectations characterizing uncertainty.

Uncertain situations seemed to be of critical importance and difficulty when raising expectations. When expressing uncertain expectations, oncologists in our study performed a balancing act in which they prepared patients for potential or certain negative outcomes, while simultaneously trying to offer some form of perspective. In the literature, such an approach is called *"Hope for the best, prepare for the worst"* (51), illustrating a dual pathway followed in serious and uncertain illnesses. Previous studies have shown that patients differ in their preferences for how to handle the uncertainty of their advanced illness, with some wanting more explicit information than others (52). Clinicians, meanwhile, are reluctant to discuss clinician uncertainty and have difficulty doing so (53, 54). We indeed found that the level of explicitness in particular varied widely when providing uncertain expectations, illustrating a lack of clear guidance on how best to do so. With treatment and care options in advanced cancer becoming increasingly complex, and targeted and personalized medicine options rapidly growing, there is a pressing need to develop more insight into how oncologists should best deal with uncertainty and provide expectations of an uncertain nature.

Focusing on empathy expressions, a more straightforward picture seemed to emerge compared to expectancy expressions. Oncologists made use of various forms of empathy, most frequently showing understanding for patients' emotions and complimenting patients on how they handle their disease. The importance of acknowledging the emotions of patients with advanced cancer has been stressed before (49). Notably, empathic remarks varied widely in their level of specificity, e.g., "*That's good"* compared to *"You have handled situation X very well"*. As patients value being seen and treated as an individual person (19), also when faced with an incurable cancer diagnosis (49), one could expect that more specific expressions of empathy are most appreciated and beneficial. Although this is intuitively logical, there is a lack of empirical evidence on the effect of more generic or specific empathic remarks.

Interestingly, while most patient complaints in medical care, including in advanced illnesses, are about clinician communication [e.g., Refs. (55–57)], we found in our study that a lack of empathic communication did not often occur. There were, however, occasions in which patients' cues and concerns were not picked up by clinicians. Previous studies have shown that this is not uncommon in clinical practice (42, 58). If clinicians do respond to emotional expressions, however, this can lead to positive outcomes, such as a decrease in consultation time (42) and an increase in the amount of information patients recall (58). Thus, based on our results, there seems to be room for improving the extent to which clinicians respond to patients' emotional expressions, leading to potentially positive effects.

#### Limitations

Our study has limitations. Firstly, our participants might not be representative of the entire population of people with advanced breast cancer, as they were female, highly educated, almost all of Dutch or other Western European background, and mainly recruited in a specialized research-focused cancer hospital. Secondly, our analyses were based on transcripts and thus verbal communication, while non-verbal elements such as eye contact remained masked. Intonation was used in the first but not the later phases of the qualitative analysis, as we used the transcripts for the coding. Thirdly, as we focused on the communication within the 45 audiotaped consultations, we did not take into account the nested design of our study (expectancy and empathy expressions were clustered within consultations, which were clustered within oncologists, which were clustered within hospitals). The number of audiotaped consultations per oncologist ranged from 1 to 8, while 8 of the 12 participating oncologists were from the specialized hospital, implying that the communication from the oncologists with more audiotaped consultations and from the specialized hospital influenced our results more strongly. Fourthly, given our limited sample size, we did not explore differences in used manipulations between consultations with a good, bad, or uncertain medical outcome. Fifthly, we only included consultations in which test results were discussed as these were the only ones identified, which potentially limits the generalizability of our results to initial consultations. Sixthly, as the research area of the placebo effects of communication is still in development, we welcomed the comment of one of the reviewers who wondered whether a comment such as "that scan is fine" is a positive expectation, and we hope future discussions will help to clarify the criteria under study.
Seventhly, although we did not observe other categories of expectancy expressions apart from our predefined categories, we cannot rule out that this is due to an implicit bias of the coding team, who all had a background in communication research. Our conceptualization was further hampered by a lack of a universally agreed conceptualization of expectancies [see, e.g., Laferton et al. (59) for a detailed overview]. Eighthly, we did not assess what patients' information and communication preferences were. Lastly, although all approached oncologists participated, they might form a subgroup of clinicians particularly interested and competent in communication.

#### Future Research

This study serves as a starting point for a research area aimed at creating more insight into possible beneficial placebo and nocebo effect-inspired communication strategies. The most pressing question our study does not answer is which specific forms of expectancy and empathy expressions are most promising in countering any negative effects of information provision and improving advanced cancer patients' outcomes. Moreover, there is a need for a better understanding of why oncologists use specific placebo and nocebo effect-inspired communication strategies and which strategies are most appreciated by patients. These questions need to be answered in follow-up studies. Ultimately, evidence-based expectancy and empathy expressions should be recommended for clinical use in advanced cancer. This specifically applies to expectancy expressions in uncertain situations, which seem to be most complex, and the effect of more generic or specific empathic behaviors. Additionally, replication studies within our and other medical and cultural contexts are needed, e.g., in other diseases of a chronic and often ultimately fatal nature, in non-Western countries, and with other participants such as men or patients with low health literacy. Furthermore, future observational studies should focus in more detail on the expressed manipulations, e.g., focus on differences between dyads, oncologists, and (specialized) hospitals; on differences between consultations discussing varying medical outcomes; and on sequential analyses of expressed manipulations. Such studies could also include other potential forms of expectations, such as regarding procedures or expectations regarding patient behavior (e.g., self-efficacy). Lastly, larger replication studies could also focus on the relation between consultation time and the use of positive expectancy and empathy expressions. In our sample, given the limited sample size, we explored this association, which did not seem to be present [except for the expression of positive expectations about side effects, and for showing understanding towards emotions (*p* < 0.01)].

#### Conclusions

To conclude, our study illustrated that when discussing positive or uncertain medical outcomes in advanced breast cancer, oncologists predominantly made use of uncertain expectancy manipulations. When providing positive expectations, oncologists emphasized the doctor–patient relationship, while negative expectations focused on the severity of the illness, and the area of uncertainty was characterized by a "hope for the best, prepare for the worst" approach. Moreover, empathy manipulations were generic or specific in nature and were dominated by oncologists showing an understanding towards patients' emotions. A lack of empathy was uncommon, and mainly included oncologists not picking up on patients' emotions. Follow-up studies should expand observational studies in this field, and focus on which communication strategies are most useful and influence patients' outcomes for the better, to counter any potential negative effects of information provision. Such studies should focus especially on uncertain and complex medical situations, in which oncologists have to discuss uncertain expectations. Ultimately, specific placebo and nocebo effect-inspired communication strategies can be harnessed in clinical care to improve patient outcomes.

#### DATA AVAILABILITY

The datasets for this manuscript are not publicly available because of ethical constraints. Requests to access the datasets should be directed to Liesbeth van Vliet, l.vanvliet@nivel.nl / l.m.van.vliet@fsw.leidenuniv.nl.

#### ETHICS STATEMENT

The study was evaluated by the Medical Ethical committee of the Netherlands Cancer Institute (NKI-AVL), which exempted the study from formal ethical approval. Both participating hospitals approved the conduct of the study in their respective hospitals. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

#### AUTHOR CONTRIBUTIONS

LV: conceptualization, methodology, data collection, data analyses, writing – original draft. AF: conceptualization, methodology, data analyses, writing – review and editing. MM: methodology, data collection, data analyses, writing – review and editing. JW: methodology, data collection, data analyses, writing – review and editing. HH: methodology, data collection, data analyses, writing – review and editing. AE: data analyses, writing – review and editing. EW: data analyses, writing – review and editing. PJ: methodology, data collection, data analyses, writing – review and editing. KP: methodology, data analyses, writing – review and editing. JS: methodology, data collection, data analyses, writing – review and editing. SD: conceptualization, methodology, data analyses, writing – review and editing.

#### FUNDING

This study was funded by a Young Investigator Grant of the Dutch Cancer Society (number 10392) awarded to Liesbeth van Vliet.

#### ACKNOWLEDGMENTS

We would like to thank all the women and the oncologists for participating. We thank Dr. Annemiek van Ommen-Nijhof, Youssra Gokalp-El Benhaji, and Nanine van den Ing for their help in recruitment. We thank Dr. Janneke Budding for her help with translating the quotes. We thank our patient experts for their help with setting up the studies and especially the questionnaire and patient information letters. We would like to thank Tessie October for sharing her NURSE codebook.

#### REFERENCES

responses to patients' emotional cues and concerns? An international multicentre study based on videotaped medical consultations. *Patient Educ Couns* (2013) 90(3):347–53. doi: 10.1016/j.pec.2011.06.010


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 van Vliet, Francke, Meijers, Westendorp, Hoffstädt, Evers, van der Wall, de Jong, Peerdeman, Stouthard and van Dulmen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebo Effects on Visual Food Cue Reactivity: An Eye-Tracking Investigation

*Jonas Potthoff\*, Nina Jurinec and Anne Schienle*

*Department of Clinical Psychology, University of Graz, Graz, Austria*

Background: Enhanced visual food cue reactivity has been associated with overeating and weight gain. Due to the increasing prevalence of high-fat food images that we are constantly exposed to in both the real and the virtual world, methods that are able to reduce the reactivity to these types of cues are urgently needed. This eye-tracking study investigated whether food cue reactivity, especially toward high-caloric food, can be reduced with a placebo intervention.

Method: Fifty-two women [mean body mass index (BMI) = 23.5] were presented with pictures depicting combinations of food (high-caloric, low-caloric) and non-food items, which were shown once with and once without a placebo in a repeated-measures design. The placebo was a pill introduced as a medication targeting peptide YY that is able to reduce appetite specifically for high-caloric food. Gaze data (dwell time, fixations) and self-reported appetite were assessed during the two eye-tracking sessions (with/without placebo).

Results: The placebo reduced general appetite as well as specific appetite for the depicted food items. Additionally, the placebo decreased the percentage of fixations and dwell time on the food images. The placebo was not able to specifically change visual food cue reactivity to high-caloric stimuli but reduced responses to both high-caloric and low-caloric food. Reported appetite reduction and weight concerns were positively associated with the placebo-related decrease in visual attention for food.

Conclusions: The placebo was able to reduce visual food cue reactivity. This finding demonstrates that placebos are able to alter early visual–attentional processes.

Keywords: visual food cue reactivity, placebo, eye-tracking, appetite, wanting, liking

#### *Edited by:*

*Martina De Zwaan, Hannover Medical School, Germany*

#### *Reviewed by:*

*Karin Meissner, Ludwig Maximilian University of Munich, Germany*

*Kathrin Schag, University of Tübingen, Germany*

#### *\*Correspondence:*

*Jonas Potthoff, jonas.potthoff@uni-graz.at*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 10 December 2018; Accepted: 03 July 2019; Published: 24 July 2019*

#### *Citation:*

*Potthoff J, Jurinec N and Schienle A (2019) Placebo Effects on Visual Food Cue Reactivity: An Eye-Tracking Investigation. Front. Psychiatry 10:525. doi: 10.3389/fpsyt.2019.00525*

## INTRODUCTION

Food is a primary reinforcer that attracts automatic attention. From an evolutionary perspective, this mechanism enhances the efficient detection of food sources in the environment, which, in turn, enables adequate food intake and thus survival [e.g., Ref. (1)].

Neurobiological studies with methods such as electroencephalography, functional magnetic resonance imaging, and eye-tracking have revealed evidence that the human attentional system is tuned to identify food targets very quickly and to differentiate them from non-food items [e.g., Refs. (2–4)]. In addition, high-caloric food captures more automatic attention than low-caloric food (4, 5). This attention bias seems to be more pronounced in overweight participants. Castellanos et al. (4) recorded eye movements for picture pairs with food (high-caloric, low-caloric) and non-food items during both a fasted and a fed condition, in normal-weight and obese women. In the fasted condition, both groups demonstrated longer fixation duration for food compared to non-food images. This visual bias was especially pronounced for high-caloric food. In the fed condition, obese individuals maintained increased attention towards food images. Additionally, they directed their first fixation toward food images more often than normal-weight individuals did. Similar findings were reported by Werthmann et al. (6). Overweight women directed more initial attention (first fixations) toward images with high-fat food than normal-weight women. In a more recent study by Doolan et al. (5), normal-weight and overweight adults (men and women) viewed high-caloric food, low-caloric food, and control images, during both a fasted and a fed condition. Participants directed greater visual attention towards high-caloric food images. This response was most pronounced in overweight men.

The response bias for high-caloric food described above was advantageous in earlier times when humans were still hunter-gatherers. However, in the present time, it has become problematic in Western societies due to the food surplus, compounded by the constant exposure to visual food cues, both in the virtual world (e.g., cookery shows on TV, food blogs) and in the real world (e.g., in supermarkets, restaurants) (7). This type of stimulation elicits appetite and the urge to consume the displayed food items [e.g., Refs. (8, 9)]. Since these food cues are so prevalent, it is not surprising that individual food cue reactivity can predict overeating, subsequent weight gain, and risk of obesity [see meta-analysis by Ref. (10)].

Modifying visual food cue reactivity is therefore a promising method for altering overeating habits. According to Boswell and Kober (10), food cue reactivity involves conditioned responses to stimuli that signal the presence of food (e.g., visual, olfactory cues), including physiological reactivity and craving. To change food cue reactivity, there are different behavioral strategies available. For instance, situation selection (where a person chooses to go into or avoid certain situations) and situation modification (where a person actively changes a situation, such as preference of diet products) are such strategies. Furthermore, cognitive reappraisal can be carried out. This refers to interpreting a situation in a way that alters its emotional impact (11). For example, one might focus on the negative consequences of food consumption, such as weight gain, or tell oneself that although a food item looks appetizing, it is not healthy. Such cognitive reappraisal strategies can reduce food cue reactivity [e.g., Refs. (12–14)]. However, all of the aforementioned strategies involve explicit cognitive processes that are effortful. These effortful inhibitory processes are generally challenging, but even more so for those who exhibit a tendency to overeat [e.g., Refs. (2, 15, 16)].

Due to the challenges involved in reducing food cue reactivity with explicit cognitive strategies, alternative (implicit) strategies should be considered. One such strategy is placebo treatment. Placebos are substances or treatments that are physically or pharmacologically inert. These types of treatments are offered to a recipient with the verbal suggestion that somatic and/or affective processes will change in a specific way (17). The most studied placebo effect is "placebo analgesia" (a reduction in pain that can be attributed to a sham treatment). Emerging neuroscience evidence indicates that multiple brain systems and neurochemical mediators are involved in placebo analgesia. Studies using the electroencephalogram have shown that placebo treatments are able to reduce amplitudes of event-related potentials in response to painful stimuli [e.g., Ref. (18)]. These changes occur as early as ~100–200 ms after the onset of noxious stimulation, indicating early attentional and perceptual effects of placebos. However, placebo analgesia is also associated with autonomic and endocrine changes that occur much later [in the time frame of minutes and hours; for a review, see Ref. (17)]. A placebo therefore has several effects depending on the effector and time window investigated.

Studies in the area of appetite regulation have also consistently demonstrated placebo effects. Placebo-controlled clinical trials of appetite suppressants [e.g., Ref. (19)] and placebo studies with healthy participants [e.g., Refs. (20, 21)] or with patients suffering from eating disorders [e.g., Ref. (22)] have all identified appetite-changing effects of sham treatments. For example, Hoffmann et al. (21) found that a satiety-enhancing placebo reduced reported appetite. An appetite-enhancing placebo did not alter subjective levels of hunger, but increased plasma levels of the "hunger hormone" ghrelin in female participants.

To the best of our knowledge, placebo-induced changes in food cue reactivity and appetite have not been studied with eye-tracking so far. Such studies are important in order to find out if appetite-reducing placebos are able to affect early attentional–perceptual processes. The design of the current study was based on an experiment by Schienle et al. (23) during which the subjects passively viewed picture pairs (disgust pictures, neutral pictures) once with and once without a "disgust placebo" (inert pill administered with the verbal suggestion that it would reduce disgust symptoms). The placebo lowered reported revulsion and enhanced the fixation duration for disgusting pictures. The authors suggested that this change while on the placebo reflected a greater willingness of the participants to view these (previously avoided) stimuli.

The present placebo investigation administered picture pairs that depicted food (high-caloric, low-caloric) and non-food items. The experiment had a repeated-measures design with two counter-balanced sessions: the female participants viewed the pictures once with and once without the placebo. The placebo was introduced as a medication that targets peptide YY (a peptide released from cells in response to eating and satiety), which is able to reduce appetite, especially for high-caloric food. It was expected that the placebo would reduce the visual preference for high-caloric food cues (as indexed by reduced percentages of fixations, dwell time, and reduced initial gaze direction), as well as the reported appetite for high-caloric food [e.g., Ref. (24)]. Furthermore, a regression approach was used in order to analyze whether reported concerns about weight and eating as well as body mass index (BMI) would be associated with placebo-related effects on eye movements and appetite. This was done in order to investigate if overweight women who would like to lose weight might profit from this type of placebo intervention.

## METHODS

#### Sample

Fifty-two women (mean age: 26.4 years, SD = 8.7) with a mean BMI of 23.5 (SD = 3.7) took part in this experiment. Of the participants, nine were overweight (BMI = 25–30) and three were obese (BMI > 30) (**Table 1**). Sixty-nine percent of the participants were university students; the remaining subjects were white-collar workers. They had normal or corrected-to-normal vision and did not report any somatic or mental disorders and no intake of medication. Participants were recruited for a study of an appetite-reducing medication ("propionate") *via* email lists and postings at the university campus. Written informed consent was obtained from all participants. The study was approved by the ethics committee of the university and was conducted in accordance with the Declaration of Helsinki.

#### Stimuli

The stimulus material comprised 60 images from the categories "low-caloric food" (e.g., fruits), "high-caloric food" (e.g., cream cakes), and "non-food" (e.g., office supplies). All images were taken from a validated set by Blechert et al. (25) and had a size of 600 × 450 pixels. The images of the three categories (high-caloric, low-caloric, and non-food) did not differ in their RGB values [R: *F*(2,57) = .952, *p* = .392, ηₚ² = .032; G: *F*(2,57) = .789, *p* = .459, ηₚ² = .027; B: *F*(2,57) = 1.729, *p* = .187, ηₚ² = .057] and their object size (number of pixels that are not the background) [*F*(2,57) = .033, *p* = .968, ηₚ² = .001].

The stimuli were presented as image pairs side by side on a white background on the computer screen (see **Figure 1**). Three types of image pairs were created: high-caloric + low-caloric food (*n* = 10), high-caloric food + non-food (*n* = 10), and low-caloric food + non-food (*n* = 10). Each image pair was shown twice during the experiment (60 trials in total). The second time an image pair was presented, the arrangement (which image was on the left or right side of the screen) was mirrored. The trial order was randomized. The eye-tracking experiment with the picture presentation lasted approximately 8 min.
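The trial structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image identifiers, the function name `build_trials`, and the unconstrained shuffle are hypothetical, since the text does not specify file names or any constraints on the randomization.

```python
import random
from collections import Counter

# Pair types as described in the text (10 pairs per type).
PAIR_TYPES = [
    ("high_caloric", "low_caloric"),
    ("high_caloric", "non_food"),
    ("low_caloric", "non_food"),
]


def build_trials(pairs_per_type=10, seed=None):
    """Build a shuffled trial list in which every image pair appears twice,
    once in each left/right arrangement."""
    next_index = Counter()  # running image counter per category
    trials = []
    for cat_a, cat_b in PAIR_TYPES:
        for _ in range(pairs_per_type):
            # Hypothetical image identifiers; the real file names are not given.
            img_a = f"{cat_a}_{next_index[cat_a]:02d}"
            next_index[cat_a] += 1
            img_b = f"{cat_b}_{next_index[cat_b]:02d}"
            next_index[cat_b] += 1
            trials.append({"left": img_a, "right": img_b})
            trials.append({"left": img_b, "right": img_a})  # mirrored repeat
    # Assumption: a plain shuffle with no constraints on trial order.
    random.Random(seed).shuffle(trials)
    return trials
```

With the default settings, this yields 60 trials built from 60 distinct images (20 per category), matching the counts given in the text.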

#### Procedure

All participants answered demographic questions and two subscales of the Eating Disorder Examination-Questionnaire (EDE-Q) by Hilbert et al. (26) *via* an online survey (weight concern, eating concern). The questions are concerned with the past 4 weeks and are answered on seven-point scales (0 = not at all; 6 = very much).



Typical items of the weight concern scale are: "How dissatisfied are you with your weight?" "Did you have the strong desire to lose weight?"; eating concern: "Were you afraid to lose control over your eating?" Cronbach's alphas in the present sample were α = .82 (weight concern) and α = .83 (eating concern).

*Placebo effectiveness: from 1 = "No change in appetite at all" to 7 = "Highest effect imaginable." BMI, body mass index; EDE-Q, Eating Disorder Examination-Questionnaire.*

Then, 52 participants were invited to the eye-tracking experiment [the sample size had been determined based on a previous eye-tracking study with a comparable design; see Ref. (23)]. The experiment had a repeated-measures design and consisted of two sessions (with and without placebo), which were conducted approximately 1 week apart. The sequence of the two sessions (Placebo first vs. No Placebo first) was counterbalanced (26:26) across participants. Both sessions were conducted during the same time of the day after a 3-h fast. At the beginning of each session, the participants rated their general appetite on a seven-point scale (1: "I have no appetite at all;" 7: "I have an extreme urge to eat something right now"). This rating was repeated after 20, 40, and 60 trials.

The participants were asked to look at the images as if they were watching TV. Similar free-exploration instructions have been used before to study attentional biases in visual food cue perception (27, 28). Each image pair was shown for 6 s. Prior to each trial, a circle in the center of the screen had to be fixated for 1 s. Subsequently, the free exploration trial started, the circle disappeared, and the image pair was shown (**Figure 1**).

At the end of each of the two sessions, 15 of the presented images (5 low-caloric food items, 5 high-caloric food items, 5 non-food items) were shown again in random order. These 15 of the 60 stimulus pictures were chosen in order to cover a wide variety of different food items (e.g., cake, chocolate, fruits) without prolonging the study. We presented only 15 images to avoid fatigue, effort, and boredom associated with repeated rating. The participants were asked to rate these food items with regard to their specific appetite/wanting ("How much would you like to taste this food right now?" 1: "not at all", 7: "very much") and liking ("How much do you like this food in general?" 1: "not at all", 7: "very much").

In the placebo condition, the participants received a placebo pill (a 1-cm-long silica-filled capsule) prior to the picture presentation with the following verbal suggestion: "This pill contains propionate. The appetite-reducing effect of propionate, especially for high-calorie food, has repeatedly been confirmed in previous studies. The decrease in appetite is triggered by the release of the hormones peptide YY and glucagon-like peptide 1 (GLP-1). The effect will be noticeable approximately 15 minutes after intake." During this waiting time, the participants read an abstract of a scientific article and a newspaper article about propionate describing the positive effects of this medication. Subsequently, a saliva sample was taken from each participant and the experimenter pretended to conduct a test on the peptide YY level. The test fluid changed in color from colorless to blue (for all participants). It was explained that this would indicate a high peptide YY level (**Figure 2**). After the saliva test, the participants rated the effectiveness of propionate on a seven-point scale (7 = "extremely effective"; 1 = "not effective").

#### Eye Movement Recording and Analysis

low-caloric + non-food, high-caloric + low-caloric). Fixation disks had to be looked at for at least 1,000 ms in order to start the next trial.

We recorded two-dimensional eye movements using an SMI RED250 mobile eye-tracker with a sampling rate of 250 Hz. To minimize head movements, a chin rest was used. We calibrated both eyes and analyzed data from the eye that produced the better spatial resolution, which was better than 0.35° visual angle. Stimuli were presented on a white background on a 24-in. screen with a resolution of 1920 × 1080 pixels. The viewing distance was 60 cm, resulting in a size of 15.6° × 11.7° viewing angle for the shown images. The experiment was controlled *via* the SMI Experiment Center (Version 3.6.53). For event detection, standard thresholds of the SMI BeGaze Software (Version 3.6.52) for high-speed eye-tracking data (sampling rate > 200 Hz) were used: The standard velocity threshold for saccade detection was 40°/s. In line with this velocity-based threshold [see Ref. (29)], fixations were defined by an absence of saccades and blinks (defined as moments without registered gaze positions) that lasted at least 50 ms. Data were exported using SMI BeGaze and customized Python scripts. Within BeGaze, we defined the food and non-food images as areas of interest (AOI). We conducted gaze data analysis exclusively for the two AOIs of each trial.
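As a rough plausibility check on the viewing geometry reported above, the visual angle of a 600 × 450 pixel image can be approximated from the screen diagonal, resolution, and viewing distance. The sketch below is not the authors' calculation. It assumes a flat panel with square pixels, a diagonal of exactly 24 in., and an image centered on the line of sight; the function names are hypothetical. Under these assumptions the result comes out within a few tenths of a degree of the reported 15.6° × 11.7°, and the remaining gap presumably reflects the actual panel dimensions.

```python
import math


def cm_per_pixel(diagonal_in, res_x, res_y):
    """Physical pixel pitch in cm, assuming square pixels on a flat panel."""
    diagonal_px = math.hypot(res_x, res_y)
    return diagonal_in * 2.54 / diagonal_px


def visual_angle_deg(size_px, pitch_cm, distance_cm):
    """Visual angle subtended by an extent of size_px pixels that is
    centered on the line of sight at the given viewing distance."""
    size_cm = size_px * pitch_cm
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# Values taken from the text; the exact 24-in. diagonal is an assumption.
pitch = cm_per_pixel(24, 1920, 1080)
width_deg = visual_angle_deg(600, pitch, 60)
height_deg = visual_angle_deg(450, pitch, 60)
```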

We computed the percentage of fixations and dwell time that was spent on the food image (either high-caloric or low-caloric). For image pairs containing high-caloric and low-caloric food, these percentages were computed for the high-caloric image (for example, a value of 70% indicates that from the total number of fixations/dwell time, 70% were directed to the high-caloric food and 30% to the low-caloric food). Furthermore, the location of the first fixation was determined and used to compute the percentage of trials in which the first fixation was on the food image. For descriptive data (number of fixations and dwell time on each AOI), see **Table 2**.
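The three gaze parameters can be illustrated with a minimal sketch. The data layout (one dictionary per fixation with an `aoi` label and a `duration_ms` value, listed in temporal order) and the function name `trial_metrics` are assumptions made for this example, not the authors' export format. Dwell time is approximated here as the sum of fixation durations within an AOI, which may differ from the definition used by the eye-tracking software.

```python
def trial_metrics(fixations, target="food"):
    """Compute gaze parameters for one trial.

    fixations: list of dicts in temporal order, each with
        'aoi'         -> AOI label, or None if outside both AOIs
        'duration_ms' -> fixation duration in milliseconds
    target: AOI label for which the percentages are computed
        (for high-caloric + low-caloric pairs, the high-caloric image).
    """
    # Only fixations inside one of the two AOIs enter the analysis.
    in_aoi = [f for f in fixations if f["aoi"] is not None]
    if not in_aoi:
        return None  # no usable gaze data in this trial

    n_target = sum(1 for f in in_aoi if f["aoi"] == target)
    dwell_total = sum(f["duration_ms"] for f in in_aoi)
    dwell_target = sum(f["duration_ms"] for f in in_aoi if f["aoi"] == target)

    return {
        "pct_fixations": 100.0 * n_target / len(in_aoi),
        "pct_dwell": 100.0 * dwell_target / dwell_total,
        "first_on_target": in_aoi[0]["aoi"] == target,
    }
```

Averaging `pct_fixations` and `pct_dwell` over the trials of one image pair category, and taking the share of trials with `first_on_target` set, would give per-participant values of the kind analyzed below.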


TABLE 2 | Descriptive statistics (means, standard deviations) for the gaze parameters during the Placebo and No Placebo condition.

*HCLC, high-caloric and low-caloric; HCNF, high-caloric and non-food; LCNF, low-caloric and non-food.*

#### Statistical Analyses

In order to investigate placebo effects on general appetite, an analysis of variance (ANOVA) for repeated measures was computed with the within-subject factors Treatment (Placebo, No Placebo) and Time of Measurement (at the beginning of the session, after 20 trials, after 40 trials, after 60 trials of image presentation).

To evaluate the effect of the placebo treatment on the wanting/liking of the food depicted in the images, ANOVAs for repeated measures were computed with the within-subject factors Treatment (Placebo, No Placebo) and Image Category (high-caloric, low-caloric food) (the non-food items elicited no appetite and were therefore excluded from the analysis).

ANOVAs for repeated measures were performed with the within-subject factors Image Pair Category (high-caloric + nonfood, low-caloric + non-food, high-caloric + low-caloric) and Treatment (Placebo, No Placebo) for percentage of fixations, first fixations, and dwell time.

If sphericity was violated (Mauchly's Test of Sphericity), Greenhouse–Geisser correction was applied. We report the effect size as ηₚ² (partial eta squared) and Bonferroni-adjusted *p* values. *p* values smaller than .05 were considered to be statistically significant.
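For readers who want to check reported effect sizes, partial eta squared can be recovered from an *F* value and its degrees of freedom, and a Bonferroni adjustment multiplies each *p* value by the number of comparisons. The sketch below illustrates both relations; the function names are hypothetical and it is not the authors' analysis code.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic:
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    scaled = f_value * df_effect
    return scaled / (scaled + df_error)


def bonferroni(p_values):
    """Bonferroni-adjusted p values: each p is multiplied by the number
    of comparisons and capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For example, the main effect of Treatment on general appetite reported in the Results, *F*(1,51) = 12.84, corresponds to `partial_eta_squared(12.84, 1, 51)`, which rounds to .20.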

Prior to the statistical analyses, we investigated a possible effect of the sequence of sessions (session with placebo first vs. session without placebo first). The calculated ANOVAs for general appetite, wanting, liking, fixations, dwell time, and first fixations revealed no significant interaction effects (all *p* > .10). Therefore, the sequence factor was not included in the ANOVAs.

Furthermore, we calculated three multiple linear regression analyses (enter method) to estimate the relationship between placebo-related changes of fixations on food, dwell time, and appetite (dependent variables) and the predictors eating concern, weight concern (EDE-Q scores), and BMI. In order to reveal possible associations between placebo-induced changes in appetite and percentage of dwell time on food images as well as percentage of fixations on food images, two exploratory Pearson correlations were calculated.
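The exploratory correlations relate per-participant change scores to one another. A minimal sketch of that computation follows. The direction of the change score (No Placebo minus Placebo, so that positive values indicate a placebo-related reduction) and the function names are assumptions for illustration, and the sketch involves no study data.

```python
import math


def change_scores(no_placebo, placebo):
    """Per-participant change (No Placebo minus Placebo); positive values
    indicate a reduction under placebo. The sign convention is assumed."""
    if len(no_placebo) != len(placebo):
        raise ValueError("both conditions need one value per participant")
    return [a - b for a, b in zip(no_placebo, placebo)]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Passing the change scores for appetite and for percentage of dwell time on food to `pearson_r` would give one such exploratory correlation.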

## RESULTS

#### Self-Report

*EDE-Q:* The participants obtained the following scores on the selected EDE-Q subscales: *M* = 1.3 (*SD* = 1.2) for eating concern and *M* = 2.4 (*SD* = 1.5) for weight concern. Both eating concern [*t*(51) = 3.1, *p* = .003] and weight concern [*t*(51) = 3.4, *p* = .001] were elevated compared to the healthy norm sample (26).

*Placebo effectiveness:* The rated effectiveness of the placebo was, on average, *M* = 3.3 (*SD* = 1.9). A higher rating of placebo effectiveness was associated with a greater appetite reduction during the presentation of the food images (appetite rated before minus after placebo administration; *r* = −.36, *p* < .01).

*General appetite ratings:* The performed ANOVA revealed significant main effects of Treatment [*F*(1,51) = 12.84, *p* = .001, ηₚ² = .20] and Time [*F*(2.34,119.25) = 9.49, *p* < .001, ηₚ² = .16] as well as a significant Treatment × Time interaction [*F*(1.75,89.39) = 36.53, *p* < .001, ηₚ² = .42] (**Figure 3**). *Post hoc t* tests indicated that in the No Placebo condition, reported appetite increased from the first assessment (beginning of session) to the third and fourth assessment (after 40 and 60 trials of picture presentation; both *p* < .002). In the Placebo condition, the reported appetite was lower after 20, 40, and 60 trials of picture presentation compared to the initial value prior to placebo administration (all *p* < .001). The comparison of the Placebo and No Placebo condition showed that appetite ratings did not differ at the beginning of the session (*p* = .15) but differed at all other assessments (after 20, 40, and 60 trials of picture presentation; all *p* < .003). All significant *post hoc* tests remained significant after Bonferroni correction.

trials of picture presentation. Whiskers indicate Cousineau–Morey confidence intervals (29).

*Wanting and liking of presented food images:* The ANOVA for wanting revealed a main effect of Treatment [*F*(1,51) = 30.78, *p* < .001, ηₚ² = .38] with lower values in the Placebo condition (*M* = 3.0, *SD* = 1.3) relative to the No Placebo condition (*M* = 4.2, *SD* = 1.2). The effect of Image Category [*F*(1,51) = 34.83, *p* < .001, ηₚ² = .41] was also significant with higher ratings for low-caloric (*M* = 4.1, *SD* = 1.3) vs. high-caloric food (*M* = 3.1, *SD* = 1.0). The interaction Treatment × Image Category did not reach statistical significance [*F*(1,51) = .15, *p* = .70, ηₚ² = .003].

For food liking, the main effect of Image Category was statistically significant [*F*(1,51) = 44.16, *p* < .001, ηp² = .46] with higher ratings for low-caloric food (low-caloric: *M* = 5.3, *SD* = 1.1; high-caloric: *M* = 3.9, *SD* = 1.0). The main effect of Treatment [*F*(1,51) = 3.17, *p* = .08, ηp² = .06] and the interaction Treatment × Image Category did not reach significance [*F*(1,51) = .43, *p* = .52, ηp² = .008].

#### Eye Movements

*Fixations:* The ANOVA revealed a significant main effect of Treatment [*F*(1,51) = 9.18, *p* = .004, ηp² = .15] with a reduced percentage of fixations on food pictures during placebo treatment (Placebo: *M* = 51.4%, *SD* = 10.3%; No Placebo: *M* = 56.9%, *SD* = 10.3%). The main effect of Image Pair Category [*F*(1.48,75.60) = 1.05, *p* = .34, ηp² = .02] and the interaction Treatment × Image Pair Category did not reach statistical significance [*F*(1.72,87.84) = 1.08, *p* = .34, ηp² = .02].

*Dwell time:* The main effect of Treatment [*F*(1,51) = 7.94, *p* = .007, ηp² = .14] was significant and indicated a placebo-related reduction in percentage of dwell time on food pictures (see **Figure 4**). The main effect of Image Pair Category was also significant [*F*(1.39,71.06) = 4.01, *p* = .04, ηp² = .07], but the computed *post hoc t* tests were not significant after Bonferroni correction. The interaction effect Treatment × Image Pair Category did not reach statistical significance [*F*(1.54,78.37) = .91, *p* = .38, ηp² = .02].

FIGURE 4 | Mean percentage of dwell time on food for both conditions (Placebo, No Placebo) and three image pair conditions: HCLC (high-caloric food paired with low-caloric food; percentage of dwell time on high-caloric food), HCNF (high-caloric food paired with non-food), and LCNF (low-caloric food paired with non-food). Whiskers indicate Cousineau–Morey confidence intervals (30).

*First fixations:* The main effect of Image Pair Category was significant [*F*(2,102) = 19.74, *p* < .001, ηp² = .28]. First fixations were directed more often on high-caloric food (*M* = 52.6%, *SD* = 6.2%) than on low-caloric food (*M* = 47.7%, *SD* = 6.6%) when presented simultaneously with non-food items [*t*(51) = 3.65, *p* = .001]. The main effect of Treatment [*F*(1,51) = .49, *p* = .49, ηp² = .009] as well as the interaction Treatment × Image Pair Category did not reach statistical significance [*F*(2,102) = .61, *p* = .55, ηp² = .01].

*Exploratory correlation analyses:* A decrease in fixations on food presented in image pairs with non-food (percentage of fixations on food with placebo minus percentage of fixations on food without placebo) was associated with reduced appetite (mean appetite during the eye-tracking paradigm within placebo session minus mean appetite during the eye-tracking paradigm during control session) (*r* = .424, *n* = 52, *p* = .002). Furthermore, we found a significant correlation (*r* = .444, *n* = 52, *p* = .001) between appetite reduction and dwell time on food.

*Regression analyses:* For placebo-related fixation changes (percentage of fixations on food without placebo minus percentage of fixations on food with placebo), a significant equation with an adjusted *R*² of .11 was found [*F*(3,48) = 3.17, *p* = .03]. Weight concern was a significant positive predictor (**Table 3**). More pronounced weight concerns were associated with greater placebo-related reduction of food fixation. For changes in dwell time (percentage of dwell time on food without placebo minus percentage of dwell time on food with placebo), a significant regression equation was found [*F*(3,48) = 3.65, *p* = .02] with an adjusted *R*² of .14. Weight concern was a significant positive predictor of change in dwell time percentage (see **Table 3**). For change in appetite (appetite before minus after placebo treatment), no significant model was found.

#### DISCUSSION

Given the increasing prevalence of high-fat food images that surround us in both the real and virtual world, and dysfunctional eating behavior associated with this, it is important to find ways to reduce visual attention towards high-energy food. In the current eye-tracking experiment, participants were presented with images of food (high-caloric/low-caloric) and non-food items. These images were shown once in combination with a placebo (an inert pill introduced as a medication that is able to specifically reduce appetite for high-caloric food) and once without the placebo.

TABLE 3 | Association between changes in fixation percentage, dwell percentage, and appetite (dependent variables) and EDE-Q eating concern, EDE-Q weight concern, and BMI (predictors).

The repeated presentation of visual food cues increased the reported appetite of the participants. In the No Placebo condition, the general appetite (desire to eat something) gradually increased across the trials. The placebo stopped this increase. Even during the first assessment of appetite during the eye-tracking experiment (after having viewed the first 20 picture pairs), the women in this condition experienced appetite reduction due to the placebo treatment. This reduced appetite continued to be present during the course of the entire experiment. In line with the general reduction of appetite, participants reported that their specific appetite for the depicted food items ("food wanting") was also reduced by the placebo. Thus, the placebo was able to reduce the desire to eat. The changes in self-report were in line with the eye-tracking data. The placebo pill reduced the percentage of fixations and the dwell time on food pictures. While under the placebo, the participants looked more often at the non-food items relative to the food (high-caloric and low-caloric).

The current study demonstrated a placebo effect on attentional processes that became apparent after a few minutes. This finding is in line with previous neurobiological studies, which also detected placebo-related changes in attentional networks of the brain in the range of milliseconds and seconds [e.g., Refs. (18, 23, 31, 32)]. In the mentioned EEG experiments (18, 32), a placebo was able to alter event-related components that reflect motivated attention (the characteristic of emotionally relevant stimuli to capture automatic attention). The studies with functional magnetic resonance imaging (30) showed that the placebo was able to change activation in primary and secondary visual cortex areas during the processing of affective pictures. Altogether, these results indicate that initial placebo effects rely on the modulation of sensory–attentional processes.

Furthermore, this modulation of attention could be predicted based on reported weight concerns of the participants. As shown in the regression analyses, dissatisfaction with one's own weight and the desire to lose weight (EDE-Q scale weight concern) were positively associated with placebo responsiveness; this was true for both gaze indicators (fixations and dwell time on food pictures). Miller et al. (33) have investigated the placebo effect in the context of illness and interpersonal healing. They argue that placebos predominantly operate by producing symptomatic relief of illness (e.g., pain, anxiety). This concept implies that some degree of impairment (suffering) must be present for a placebo to be able to work and to be effective. In the current experiment, the placebo was particularly beneficial for those women who perceived their own weight as problematic and who hoped for an appetite reduction. The BMI was not able to predict the gaze indicators of FCR. Therefore, our findings suggest that not the weight status itself (being overweight) but the subjective perception of one's own weight is a crucial predictor for the effectiveness of the placebo treatment.

The current study has the following limitations. We analyzed the effect of a placebo on responses toward food cues in a female sample consisting largely of university students (69%), who on average reported elevated eating and weight concerns and therefore were motivated to participate in the "propionate" study. Future studies should include clinical interviews for reliable diagnoses of possible eating disorders. Due to the self-selection of the participants, our findings cannot be generalized to other populations. Further, the reported food wanting and liking was higher for low-caloric relative to high-caloric food. It is likely that these responses were biased by social desirability factors. This hypothesis is supported by the eye-tracking data, which indicated that the first fixation was more often on high-caloric food (than on low-caloric food). This finding is backed by several previous investigations that have also shown that more initial attention (first fixations) is typically directed toward images with high-fat food vs. low-fat food (4, 6). Thus, in the current study, the visual preference did not match the verbally expressed preference. To avoid fatigue and boredom, we did not obtain ratings for all images. Thus, the reported preference for the subset of pictures might not be representative of the complete picture set. Moreover, by means of the placebo instruction, we tried to specifically alter the food cue reactivity for high-caloric items. In the context of weight control programs, it would certainly be optimal if the reactivity to high-caloric food could be reduced, while low-caloric food reactivity does not need to change or even could be increased. This goal was not achieved. However, general appetite and focused attention changed in the intended direction. Finally, we did not assess eating behavior in the current experiment. This should be implemented in a future investigation.

In conclusion, the current study provides evidence for a reduction of food cue reactivity *via* placebo. The placebo treatment influenced attentional processes (gaze behavior) as well as food wanting and general appetite. Accordingly, placebos could be a helpful additional component for the treatment of overeating.

#### ETHICS STATEMENT

The study was approved by the ethics committee of the University of Graz.

### AUTHOR CONTRIBUTIONS

JP and AS designed the study and wrote the manuscript. JP and NJ recruited participants for the study, collected the data, and conducted the statistical analyses.

### FUNDING

The authors acknowledge the financial support by the University of Graz.

### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Potthoff, Jurinec and Schienle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Minimizing Drug Adverse Events by Informing About the Nocebo Effect—An Experimental Study

#### *Yiqi Pan1\*, Timm Kinitz2, Marin Stapic1 and Yvonne Nestoriuc1,3*

*1 Department of Psychosomatic Medicine and Psychotherapy, University Medical Center Hamburg-Eppendorf, Hamburg, Germany, 2 Clinical Psychology and Psychotherapy, University of Hamburg, Hamburg, Germany, 3 Clinical Psychology, Helmut-Schmidt-University/University of the Federal Armed Forces Hamburg, Hamburg, Germany*

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany*

#### *Reviewed by:*

*Keith James Petrie, The University of Auckland, New Zealand; Kate Faasse, University of New South Wales, Australia*

#### *\*Correspondence:*

*Yiqi Pan, y.pan@uke.de*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 09 January 2019; Accepted: 26 June 2019; Published: 25 July 2019*

#### *Citation:*

*Pan Y, Kinitz T, Stapic M and Nestoriuc Y (2019) Minimizing Drug Adverse Events by Informing About the Nocebo Effect—An Experimental Study. Front. Psychiatry 10:504. doi: 10.3389/fpsyt.2019.00504*

Relevance: Informing patients about potential adverse events as part of the informed consent may facilitate the development of nocebo-driven drug adverse events (nocebo side effects).

Objective: To investigate whether informing about the nocebo effect using a short information sheet can reduce nocebo side effects.

Methods: A total of *N* = 44 participants with weekly headaches for at least 6 months were recruited using the cover story of a clinical trial for a headache medicine. In reality, all participants took a placebo pill and were randomized to the nocebo information group or the standard leaflet group. Participants were instructed to read the bogus medication leaflet containing side effect information shortly before pill intake. The nocebo group additionally received an explanation about the nocebo effect as part of the leaflet. Questionnaires were completed at baseline, 2 min, and 4 days after the pill intake. We conducted general linear models with bootstrap sampling. Baseline symptoms were included as a covariate.

Results: Most participants (70.5%) reported nocebo side effects at 2 min. Participants who received the nocebo information (*n* = 24) reported fewer nocebo symptoms than the control group (*n* = 20) (estimated difference: 3.3, BCa 95% CI [1.14; 5.15], *p* = 0.01, Cohen's *d* = 0.59). Baseline symptoms, perceived sensitivity to medicine, and side effect expectations each moderated the group effect (estimated difference in slope: 0.47, BCa 95% CI [0.19; 0.73], *p* = 0.001, *d* = 0.75; 1.07 [0.27; 1.61], *p* = 0.006, *d* = 0.73; 1.57 [0.38; 2.76], *p* = 0.02, *d* = 0.58). No group differences were found at 4-day follow-up. After revealing the actual aim of the study, 86% of the participants evaluated the nocebo information to be helpful in general.

Conclusions: Results provide the first evidence that informing about the nocebo effect can reduce nocebo side effects.

Keywords: nocebo effect, informed consent, patient education, drug safety information, side effects, inert exposure, predictors, risk factors

#### INTRODUCTION

Nocebo effects can cause reduced efficacy of treatments (1, 2) and side effects which are not attributable to the pharmacological or other active ingredients of the treatment (3). Broadly defined, nocebo effects are negative effects caused by psychological and contextual factors of the treatment. As demonstrated in placebo studies (4–6) and in the placebo arms of clinical trials (7–11), side effects are commonly reported after placebo intake. Remarkably, studies which reanalyzed clinical drug trials found considerable overlap in the side effect profiles of drug and placebo arms (7–11). These results indicate that information about potential side effects can influence side effect reporting.

In clinical trials and clinical practice, patients are informed about a treatment's side effects. However, if information about side effects can increase side effect reporting, does the informed consent potentially undermine the principle of nonmaleficence? Expectations are considered key, given that written and verbal information may lead to increased side effect expectations, which in turn, like a self-fulfilling prophecy, result in more side effects (12–14). Up to now, evidence regarding the effect of side effect disclosure on side effect reporting has been mixed (15). In these studies, patients received the same treatment yet different side effect information. Some studies showed that the more information patients received, the more side effects they reported (16–19), while other studies found no difference (20–22). Although it cannot be concluded whether informing about side effects is disadvantageous in general, strategies to prevent nocebo side effects may be useful for clinicians, especially when treating patients who are at risk of developing nocebo effects. According to estimates based on adverse events reported in placebo arms of double-blind trials, nocebo side effects account for 40% of drug adverse events across diseases (23). Since adverse events can decrease quality of life, reduce adherence, and, consequently, increase public health costs (24, 25), minimizing nocebo side effects warrants clinical attention.

Researchers have advocated that side effect information should be tailored to the patient to prevent nocebo side effects while maintaining patient autonomy (26). Proposed strategies include permitted noninformation (27), framing (27, 28), and informing about the nocebo effect (3). Permitted noninformation offers patients the possibility of remaining unaware of certain mild side effects. Unlike severe and potentially irreversible side effects, knowledge of less threatening ones is not essential for making an informed choice. The clinician distinguishes between crucial and noncrucial side effect information depending on the treatment indication. Patients then receive a list of side effect categories, and they can decide which category they wish *not* to learn about. Framing, in turn, targets the way in which information is presented. First outlined by Tversky and Kahneman (29), the same probability can be presented either as a gain or a loss, affecting decision making. In clinical practice, the probability of side effect occurrence can either be framed as likely ("40% get a sore arm") or unlikely ("60% do not get a sore arm") (30). Some studies have also applied framing in a broader sense; Wilhelm et al. (31) framed dizziness as an onset sensation of the drug, whereas Heisig et al. (32) framed information about potential side effects of breast cancer treatments in the context of expected treatment benefits such as increased survival. The effect of framing on side effects has been investigated in various samples using different experimental methods and has rendered mixed results (30–35).

Barsky and colleagues (3) suggested informing patients about the nocebo effect. When starting a new treatment, most patients have preexisting symptoms due to the natural course of the disease or comorbidities. These baseline symptoms, especially ambiguous ones such as pain, fatigue, and mood swings, can be misattributed to the new treatment. However, if participants are aware that contextual and psychological factors can play a part in the emergence and exacerbation of symptoms, misattribution is less likely to occur (3). Moreover, offering an alternative explanation may result in less attention towards symptoms, thereby reducing their perceived severity (36) and accompanying distress (37). One study examined the efficacy of nocebo education on symptom reporting. Crichton and Petrie (38) explained symptoms ostensibly caused by infrasound either by a nocebo effect or biological mechanisms and found differences in symptom reporting after an infrasound exposure. Evidence in the clinical context has been lacking so far (39).

We aim to investigate the effect of nocebo information on nocebo side effects among persons with weekly headaches. Specifically, we expect participants who receive the nocebo information to report fewer side effects after placebo intake. To understand which participants benefit most from the nocebo information, we will exploratively examine gender (40), perceived sensitivity to medicine (41), anxiety (42), side effect expectations (43), and cognitive coping styles (41) as potential correlates of nocebo side effects and candidate moderators of the hypothesized effect. Except for cognitive coping styles, these factors have been previously linked to nocebo effects (43, 44). As for cognitive coping styles, we presume that a monitoring coping style, i.e., being concerned about potential health threats and being vigilant towards health-related information, is positively associated with nocebo side effects, whereas a blunting coping style, i.e., avoiding confrontation with potentially threatening health-related information, is not. Pronounced monitoring has been associated with increased perception of physical symptoms (45). Given that prior studies found that nocebo effects induced by verbal suggestion can persist for up to 8 days (46, 47), we conducted a 4-day follow-up assessment to examine the time frame of our nocebo induction and of the intervention effect.

#### MATERIALS AND METHODS

#### Procedures

In an experimental design, we randomized participants 1:1 to the nocebo information group or the standard leaflet group. We used the cover story of conducting a double-blind phase-IV trial of an already approved headache medication "Relacalmin." The ostensible aim was to investigate beneficial effects after a one-time intake. Participants were told that they had a 50/50 chance of receiving Relacalmin or a placebo. In fact, all participants received a placebo pill. Except for the 4-day follow-up assessment, which was completed remotely *via* an online link, the study took place at the University Medical Center Hamburg-Eppendorf. Ethical approval was obtained from the ethics committee of the local chamber of psychotherapists (reference number 13/2014-PTK-HH).

Informed consent was signed by all participants before enrolment. Expectations, as well as short- and long-term effects of the medication, were explicitly mentioned in the written informed consent ("A randomization is necessary to underpin whether beneficial effects are caused by an active pharmacological effect or induced by positive expectations;" "It is possible that you will feel better after taking this medicine shortly after intake as well as over the course of four days").

After signing informed consent, participants completed baseline questionnaires. Then, participants drew from a set of identical-looking envelopes. Each envelope contained a medication leaflet and a single blue placebo pill in blister packaging. The nocebo information group and the standard leaflet group received different leaflets. Both leaflets included information about the active substance of the medication, how it works, and its effectiveness ("Studies had shown that head muscle pain is reduced by up to 70%. Participants moreover report an overall feeling of ease and relaxation."). In line with common medication leaflets, information about contraindications and a list of seven potential adverse events were presented (in the following order): concentration problems, dizziness, vision problems (blurred vision), fatigue, tinnitus, muscle pain, and nosebleed. The adverse events were listed according to their alleged frequency of occurrence from "often," "sometimes," to "rarely." Additional probability information was provided for these frequency specifications, e.g., very often, more than 1 in 10 participants; often, less than 1 in 10 participants, but more than 1 in 100, etc. The nocebo information group received additional information about the nocebo effect as part of the leaflet (**Box 1**). Participants were acquainted with the distinction between specific and nonspecific side effects, and the concepts of misattribution and selective attention. A case example was provided to illustrate the nocebo effect (p. 52f) (48). The text was written by two investigators (YN and TK); its comprehensibility was evaluated by a self-help cancer patient group, and the text was adapted thereafter (39).

Participants were requested to read the leaflet, take the pill, and stay seated for 2 min. Further questionnaires were completed 2 min after pill intake (post). This time frame was chosen to avoid deviations in behavior after intake and to keep nocebo effects, which may be amplified due to symptom monitoring, at a minimum. After completing the questionnaire, participants received an online link for the 4-day follow-up assessment. To match up the questionnaires at post and at 4-day follow-up, participants generated a personal code at enrolment. Interaction between the investigator and the participant was prescripted, neutral, and short (~5 min in total).

At the 4-day follow-up assessment, participants indicated headache severity, side effects, and what they believed to be the study aim. Afterwards, all participants were debriefed about the actual study aim. Thereby, the nocebo information was presented to all participants. Lastly, the perceived usefulness of the nocebo information was assessed. A reimbursement of 10€ was paid for participation.

#### BOX 1 | Information sheet about nocebo effects.

#### Advance information about side effects

The occurrence of side effects has two fundamental causes. One cause is the pharmacological (substance dependent) mode of action. Specific pharmacological substances in the drug are metabolized and activate certain biochemical reactions in the body. The second cause is the nonpharmacological (nonsubstance dependent) mode of action. Here, the patient's expectations and the context of the medication intake activate certain biochemical reactions in the body.

The second cause is labeled the nocebo effect (expectation effect). For example, prior negative experiences or reading about possible side effects in a medication leaflet can increase a patient's expectations of developing side effects. Consequently, these negative expectations may lead to an actual increase in side effects. The nocebo effect is by no means an illusion; it is a real and measurable response. Clinical studies show that more than half of the experienced side effects can be attributed to expectations. On the one hand, expectations can lead to actual biochemical changes and, by that, facilitate diseases. On the other hand, expectations can induce heightened awareness of bodily sensations and symptoms. Everyday complaints, which occasionally occur even when no medication is taken, can then be perceived as side effects. Simply expecting illness can lead to actual symptoms. Vice versa, positive expectations can prevent the development of side effects and bring about actual health improvements.

The following example illustrates how expectations emerge and how they affect bodily sensations: "For my next checkup, I was to receive a contrast agent. I was anxious, knowing that my body reacts strongly to that kind of thing. The nurse hooked me up to the IV, through which the contrast agent would enter my body. She told me that the contrast agent would make me feel hot and that there might be a burning sensation. She then left me alone. The minute she left the room, I felt the heat washing over me, it streamed through my body and it burned. I knew this checkup was going to be awful. I felt extremely frightened. After a few minutes, the doctor entered the room and she told me: Ok, let's inject the contrast agent, shall we?"

#### Participants

Eligibility criteria included age ≥18 years and weekly headaches in the past 6 months. To reinforce our cover story, we also added the following exclusion criteria: high sensitivity to pain and fever medication, acute gastrointestinal ulcer, increased risk for bleeding, and severe cardiomyopathy.

#### Recruitment

Participants were recruited from the general public in and around Hamburg, Germany, using advertisements in newspapers, online portals, and leaflets distributed in pharmacies and local stores. Screening was conducted *via* phone and, when eligible, an appointment was scheduled.

#### Randomization and Blinding

We performed randomization using blocks of eight. After completing the baseline questionnaire, participants were asked to choose one of four opaque, sealed envelopes containing a leaflet (either with or without the nocebo information) and the pill. Depending on the group, the leaflet was labeled either with the letter A or B. The leaflets were otherwise identical (in size and design). Two minutes after taking the pill, participants were asked to state the letter on the leaflet as part of the post assessment. To secure the blinding of the investigator, assessments were conducted using an online form. The investigator sat at a table facing the participant and not the screen. Moreover, the investigator was unaware of the meaning of the letter. All envelopes were prepared before enrolment. The number of prepared envelopes was larger than the required sample size so that every participant was able to choose from a set of envelopes.
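As an illustration of what 1:1 randomization in blocks of eight can look like, the following is a minimal sketch of a permuted-block allocation list. It is not the authors' procedure: the function name, the seed, and the idea of generating the list in software are assumptions made for this example, and the envelope-drawing step described above is not modeled.

```python
import random


def permuted_block_schedule(n_participants, block_size=8, arms=("A", "B"), seed=None):
    """Build an allocation list from independently shuffled, balanced blocks.

    Hypothetical helper for illustration; each full block contains every arm
    equally often, so group sizes can never drift far apart.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded only to make the example reproducible
    schedule = []
    while len(schedule) < n_participants:
        block = list(arms) * (block_size // len(arms))  # e.g. 4 x "A", 4 x "B"
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_participants]


# Example: an allocation list for 44 participants (leaflet letters A and B).
schedule = permuted_block_schedule(44, seed=2019)
print(schedule[:8])
```

With blocks of eight, the two arms are exactly balanced after every eighth allocation; only a final, incomplete block can leave the groups unequal.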

#### Power Analysis

No previous study has investigated the effect of nocebo information on side effect reporting. Hence, we had no information on whether the nocebo information is beneficial at all. To keep the number of participants exposed to a nocebo induction to a minimum, we pragmatically chose the smallest possible sample size. For a one-tailed independent *t*-test, given a large effect size of Cohen's *d* = 0.8, a power of 0.8, and an alpha error of 5%, we obtained a total sample size of *N* = 42. This sample would allow us to discern whether the nocebo information is useful.
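The sample size calculation can be retraced with a short script. The sketch below is an assumption about how such a calculation is commonly done (power of a one-tailed two-sample *t*-test from the noncentral *t* distribution, two equal groups); the authors' software is not stated, and all function names are hypothetical. It uses only the Python standard library and plain numerical integration, so it is meant for illustration rather than production use.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def _chi2_pdf(v, df):
    """Density of a chi-square variable with df >= 2 degrees of freedom."""
    if v <= 0.0:
        return 0.5 if df == 2 else 0.0  # limiting value at zero
    log_pdf = ((df / 2 - 1) * math.log(v) - v / 2
               - (df / 2) * math.log(2) - math.lgamma(df / 2))
    return math.exp(log_pdf)


def t_upper_tail(t, df, ncp=0.0, steps=1000):
    """P(T > t) for a (noncentral) t variable, via Simpson's rule.

    Uses T = (Z + ncp) / sqrt(V / df), Z standard normal, V chi-square(df),
    so that P(T > t) = E_V[ Phi(ncp - t * sqrt(V / df)) ].
    """
    upper = df + 10 * math.sqrt(2 * df) + 40  # chi-square mass beyond is negligible
    h = upper / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        v = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * _PHI(ncp - t * math.sqrt(v / df)) * _chi2_pdf(v, df)
    return total * h / 3


def t_critical(alpha, df):
    """One-tailed critical value (bisection; assumes it lies below 20)."""
    lo, hi = 0.0, 20.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def power_one_tailed(n_per_group, d, alpha):
    """Power of a one-tailed independent t-test with equal group sizes."""
    df = 2 * n_per_group - 2
    ncp = d * math.sqrt(n_per_group / 2)  # noncentrality parameter
    return t_upper_tail(t_critical(alpha, df), df, ncp)


def required_n_per_group(d, alpha, target_power):
    """Smallest per-group n whose power reaches the target."""
    n = 2
    while power_one_tailed(n, d, alpha) < target_power:
        n += 1
    return n


n = required_n_per_group(d=0.8, alpha=0.05, target_power=0.8)
print(f"n per group = {n}, total N = {2 * n}")
```

Under these assumptions the search should stop at the per-group size that corresponds to the total reported above; note that the power just below that size falls only marginally short of 0.8, so cruder normal approximations can give a slightly smaller number.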

#### Measurements

Assessments were conducted at baseline, post, i.e., 2 min after pill intake, and at 4-day follow-up. The questionnaires were identical for both groups. All assessments were conducted using an online form.

#### Cover Story Credibility

The cover story was classified as credible if subjects reported side effects after 2 min, reported less headache after intake compared to baseline, or expected their symptoms to improve after pill intake. At the 4-day follow-up, participants were additionally asked about the goal of the study.

#### Manipulation Check

At post, all participants evaluated the comprehensibility (0 "not comprehensible at all" to 10 "absolutely comprehensible") of the information in the leaflet. Further questions focusing on the nocebo information were not asked since they might have created suspicion about the cover story.

#### Outcome

Self-reported nocebo side effects were our primary outcome. We use the term nocebo side effects to highlight that, after placebo intake, all reported side effects were nocebo-driven. However, participants, who believed they were taking part in a double-blind trial, were asked about "side effects of the pill." These were assessed using the validated General Assessment of Side Effects questionnaire (GASE) (49), which we shortened to 20 symptoms, of which 7 were named in the medication leaflet, and 13 were common nonspecific symptoms. Symptoms which were not listed in the leaflet include headache, hair loss, dry mouth, circulation problems, abdominal pain, nausea, diarrhea, skin rash or itching, fever/increased temperature, tendency to develop bruises, insomnia/sleeping problems, back pain, and irritability/nervousness. We did not exclude headache from the symptom list since it has been previously reported as an adverse event in headache trials (50). Participants were instructed to indicate only the symptoms they attributed to the pill. Each symptom was rated on a scale from 0 "not present," 1 "mild," 2 "moderate," to 3 "severe." Sum scores were composed for total nocebo side effects, nocebo side effects which were listed in the leaflet (leaflet nocebo side effects), and nocebo side effects which were not listed in the leaflet (nonlisted nocebo side effects). Additionally, we also calculated the total number of nocebo side effects. This questionnaire was administered at 2 min after intake (post) and at 4-day follow-up.
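The four outcome scores described above can be sketched as follows. The symptom names are taken from the text; the function name, the dictionary-based input, and the treatment of unrated symptoms as 0 ("not present") are assumptions made for this illustration and do not reproduce the authors' scoring script.

```python
LEAFLET_SYMPTOMS = (
    "concentration problems", "dizziness", "vision problems", "fatigue",
    "tinnitus", "muscle pain", "nosebleed",
)
NONLISTED_SYMPTOMS = (
    "headache", "hair loss", "dry mouth", "circulation problems",
    "abdominal pain", "nausea", "diarrhea", "skin rash or itching",
    "fever/increased temperature", "tendency to develop bruises",
    "insomnia/sleeping problems", "back pain", "irritability/nervousness",
)


def score_side_effects(ratings):
    """Compute sum scores and symptom count from 0-3 severity ratings.

    `ratings` maps symptom name -> severity (0 not present ... 3 severe).
    Assumption: symptoms missing from `ratings` count as 0.
    """
    known = set(LEAFLET_SYMPTOMS) | set(NONLISTED_SYMPTOMS)
    for symptom, severity in ratings.items():
        if symptom not in known:
            raise ValueError(f"unknown symptom: {symptom!r}")
        if severity not in (0, 1, 2, 3):
            raise ValueError(f"severity out of range for {symptom!r}: {severity!r}")
    leaflet = sum(ratings.get(s, 0) for s in LEAFLET_SYMPTOMS)
    nonlisted = sum(ratings.get(s, 0) for s in NONLISTED_SYMPTOMS)
    return {
        "total": leaflet + nonlisted,   # range 0-60
        "leaflet": leaflet,             # range 0-21
        "nonlisted": nonlisted,         # range 0-39
        "count": sum(1 for severity in ratings.values() if severity > 0),
    }


example = score_side_effects({"dizziness": 2, "fatigue": 1, "headache": 3})
print(example)
```

Keeping the leaflet and nonlisted symptoms in separate tuples makes the split between the 7 listed and 13 nonlisted items explicit and keeps the three sum scores consistent with one another.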

#### Potential Predictors of Nocebo Side Effects, Expectation Change

All potential predictors were assessed at baseline.

**Baseline symptoms.** We used the same shortened GASE questionnaire to assess the number and severity of symptoms in the past 4 days. A sum score with a range of 0–60 was calculated.

**Perceived sensitivity to medicine.** Five items assessed the "belief that one is especially sensitive to the actions and side effects of medicine" (p. 1) (41) on a scale from 1 "strongly agree" to 5 "strongly disagree." The items were reversed and a sum score was computed, ranging from 5 to 25. The validity and reliability have been shown among different patient groups as well as among healthy participants (51).
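The reverse scoring can be made concrete with a small sketch. It assumes that "reversed" means the usual 6 minus the response on a 1 to 5 scale, so that higher sums indicate higher perceived sensitivity; the function name is hypothetical.

```python
def perceived_sensitivity_score(responses):
    """Sum score (5-25) from five items answered 1 "strongly agree" to 5 "strongly disagree".

    Each item is reversed (6 - response) before summing, so agreement
    with the sensitivity statements yields a higher score.
    """
    if len(responses) != 5:
        raise ValueError("expected exactly five item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    return sum(6 - r for r in responses)


print(perceived_sensitivity_score([1, 2, 2, 3, 1]))
```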

**Trait Anxiety.** The State-Trait Anxiety Inventory is a commonly used instrument with good psychometric properties (52). We used the trait scale only. Twenty items are rated on a scale from 1 "almost never" to 4 "almost always." A sum score is obtained and ranges from 20 to 80.

**Cognitive coping mechanisms.** The Threatening Medical Situation Inventory assesses the degree to which individuals cope with threatening information by confronting and seeking out further information (monitoring, e.g., "I plan to ask the specialist as many questions as possible") or by avoiding information (blunting, e.g., "I think things will turn out to be alright") (53). We presented participants with two of the four possible medical scenarios (headaches and appendicitis) which included six items, respectively. Mean scores range from 1 to 5. The validity and reliability have been established previously (53).

**Sociodemographics.** Age, years of education, and gender were assessed with the latter investigated as a potential predictor of nocebo side effects.

**Expectations.** Participants indicated to which extent they expected the occurrence of side effects on a scale from 0 (absolutely disagree) to 10 (absolutely agree). Two filler items for the cover story inquired about subjects' expectations of headache reduction and their overall treatment expectations. Expectations were assessed at baseline and post. This would allow us to explore whether expectations changed overall and whether the change varied by group.

#### Placebo Effect, Evaluation of the Nocebo Information

**Headache.** At baseline, post, and 4-day follow-up, participants specified their current intensity of headache, state of relaxation, and overall well-being on a scale from 0 (none) to 10 (highest imaginable), with the latter two items being filler items. Placebo effects were operationalized as the difference in headache between baseline and post. Inquiries about symptom amelioration at 4-day follow-up were filler items to balance out inquiries about side effects; no computation of 4-day placebo effects was performed since disentanglement from the natural course of the disease was not possible.

**Evaluation of the nocebo information.** After debriefing about the true study aim and presenting the nocebo information to all participants at 4-day follow-up, participants were asked whether they consider informing about the nocebo effect to be useful in general (yes/no).

#### Statistical Analyses

To assess whether nocebo side effects at post differed between the groups, we conducted general linear models (GLM) using the maximum likelihood estimation method. We adjusted for baseline symptoms since they are a confounder of our outcome (54). Except for the estimation method of parameters, GLM aligns with multiple linear regression models. To account for heteroscedasticity, standard errors and 95% confidence intervals (CI) were obtained through nonparametric bootstrap resampling (55) with 2,000 replications and bias-corrected and accelerated (BCa) intervals. Further assumptions, including the normal distribution of residuals and no multicollinearity of predictors, were checked and met. If univariate associations were found between nocebo side effects and personality characteristics, baseline symptoms, expectations, or gender, moderation analyses were computed (56, 57). To obtain effect sizes, we divided the mean group difference by the product of the standard error of the group difference and the square root of the number of participants in the standard leaflet group (58). Baseline symptoms were centered and included as a covariate in all models. For moderation analyses, the centered moderator variable and the product of moderator by group were included additionally. To determine the predictive value of the moderation effect, likelihood ratio tests in comparison with the intercept-only model were conducted.
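The effect size formula and the construction of the moderation terms can be written out as follows. This is a sketch under stated assumptions: the function names are hypothetical, the 0/1 group coding is assumed (the text does not specify it), and the numbers in the example are illustrative rather than study data.

```python
import math
from statistics import mean
from typing import List, Sequence, Tuple


def effect_size(mean_difference: float, se_difference: float, n_standard: int) -> float:
    """Effect size as described in the text:
    mean group difference / (SE of the difference * sqrt(n in standard leaflet group)).
    """
    return mean_difference / (se_difference * math.sqrt(n_standard))


def moderation_terms(moderator: Sequence[float], group: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Return the mean-centered moderator and the moderator-by-group product term.

    `group` is assumed to be dummy coded (0 = standard leaflet, 1 = nocebo information).
    """
    m = mean(moderator)
    centered = [x - m for x in moderator]
    product = [c * g for c, g in zip(centered, group)]
    return centered, product


# Illustrative values only (not taken from the study).
d = effect_size(mean_difference=3.0, se_difference=1.5, n_standard=16)
centered, product = moderation_terms([1, 2, 3], [0, 1, 1])
```

Centering the moderator before forming the product term keeps the group coefficient interpretable as the group difference at the mean of the moderator.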

Further analyses were performed to outline the placebo effect, the change in side effect expectations from baseline to post, and whether nocebo side effects sustained up to 4 days. Group differences in nocebo side effects at 4-day follow-up were examined using GLM after adjusting for baseline symptoms. Since associations between nocebo responders and placebo responders have been found previously (59), and since participants may view side effects as onset symptoms of the drug (60), which again, may facilitate placebo effects, correlations between headache change from baseline to post and nocebo side effects at post were investigated. Analyses were performed using IBM SPSS Version 25; GLMs were computed using the GENLIN command. All tests were conducted two-sided with an alpha error of 0.05.

### RESULTS

Baseline characteristics of the sample are portrayed in **Table 1**. The sample consisted mainly of women (70.5%), and most participants had at least a high school degree (88.6%). Participants reported an average of 9 (SD = 4.2) baseline symptoms. Most participants (*n* = 38; 86.4%) had a headache at baseline that was, on average, of mild to moderate severity (*M* = 3.3, SD = 2.5). The groups did not differ with regard to baseline characteristics. The cover story was credible, since all participants either expected headache reduction, experienced a headache reduction at 2 min, or reported nocebo side effects after 2 min. Both groups evaluated the leaflet information to be very comprehensible (nocebo information group: *M* = 9.1, SD = 1.6; control group: *M* = 9.4, SD = 1.5). When asked about the study goal, almost all participants (95.5%) specified an answer in alignment with the cover story (e.g., "whether the medication works," "side effects of the drug," or "time course of drug efficacy" etc.). Only two individuals indicated "placebo effect." Although it is not evident what they meant, it is possible that they questioned the cover story. Sensitivity analyses were conducted after exclusion of these two participants.

*SD, standard deviation; T, Student's t-test for independent samples; FET, Fisher's exact test.*

*<sup>a</sup>Indicated for n = 38 persons (nocebo information: n = 20; standard leaflet: n = 18) who suffered from headache at the time of baseline assessment, i.e., reported a score of 1 or higher. Headache severity was rated from 0 (no pain) to 10 (worst imaginable pain).*

*<sup>b</sup>Expectation about side effect occurrence was rated on a scale from 0 to 10.*

#### Nocebo Side Effects

At 2 min after intake, 31 (70.5%) participants reported at least one symptom. The most reported symptoms were headache (56.8%), dry mouth (29.5%), exhaustion (29.5%), vision problems (22.7%), back pain (22.7%), and irritability (22.7%). Out of 20 possible side effects, 41.7 and 15% of participants in the nocebo information and standard leaflet group, respectively, reported no symptoms.

According to generalized linear models with bootstrap sampling, participants in the nocebo information group reported fewer nocebo side effects (sum score) after 2 min compared to participants in the standard leaflet group (**Table 2**). Baseline symptoms predicted nocebo side effects (*B* = 0.47, BCa 95% CI [0.27; 0.63], *p* < 0.001). The group difference remained when headache was excluded from the list of nocebo side effects (estimated difference: 3.2, BCa 95% CI [0.98; 5.07], *p* = 0.02, Cohen's *d* = 0.56) and after exclusion of two participants who may have questioned the cover story (3.4, BCa 95% CI [0.81; 5.67], *p* = 0.01, Cohen's *d* = 0.60). When nocebo side effects listed in the leaflet (7 symptoms) and not listed in the leaflet (13 symptoms) were analyzed separately, group differences were found only for nonlisted nocebo side effects, but not for leaflet nocebo side effects. Individuals in the nocebo information group reported an estimated 2.8 (BCa 95% CI [1.0; 4.4], *p* = 0.009, Cohen's *d* = 0.66) fewer nocebo symptoms.

#### Predictors of Nocebo Side Effects and Moderators of the Intervention

Nocebo side effects correlated significantly with baseline symptoms (*r* = 0.64, *p* < 0.001), a monitoring cognitive coping style (*r* = 0.32, *p* = 0.04), and trait anxiety (*r* = 0.47, *p* = 0.001), and in trend with perceived sensitivity to medicine (*r* = 0.29, *p* = 0.06) and side effect expectations (*r* = 0.28, *p* = 0.07). No associations were found with a blunting cognitive coping style (*r* = −0.15, *p* = 0.33) or gender (*r* = 0.18, *p* = 0.24). Among the predictors, we found that baseline symptoms correlated with perceived sensitivity to medicine (*r* = 0.30, *p* = 0.049), trait anxiety (*r* = 0.55, *p* < 0.001), and side effect expectations (*r* = 0.34, *p* = 0.02). None of the other variables were associated.

Baseline symptoms, a monitoring cognitive coping style, trait anxiety, perceived sensitivity to medicine, and side effect expectations were further examined as moderators of the group effect (**Figure 1**). Baseline symptoms x group added predictive value over and above the intercept-only model ( *χ*<sup>2</sup> = 10.34, *df =* 1, *p* = 0.001). The slopes between the groups differed significantly (estimated mean difference = 0.47, BCa 95% CI [0.19; 0.73], *p* = 0.001, Cohen's *d* = 0.75), indicating that, with increased baseline symptoms, nocebo side effects also increased. This effect, however, was buffered by the nocebo information. The same pattern was found for perceived sensitivity to medicine (1.07, BCa 95% CI [0.27; 1.61], *p* = 0.006, Cohen's *d* = 0.73) and side effect expectations (1.57, BCa 95% CI [0.38; 2.76], *p* = 0.02, Cohen's *d* = 0.58). Trait anxiety and a monitoring cognitive coping style did not moderate the effect of the intervention.

#### Placebo Effects, Expectation Change, Sustained Nocebo Side Effects

Six (13.7%) participants reported reduced headache compared to baseline, indicating that the placebo effect after 2 min, if at all existent, was marginal. Hence, we did not examine the link between headache change and nocebo side effects.

Overall, side effect expectation change from baseline to post was marginal (*M* = 0.23, SD = 1.05). Expectation change did not differ by group [*M*<sub>Nocebo information</sub> = 0.33, SD = 1.12; *M*<sub>Standard leaflet</sub> = 0.10, SD = 0.97; *t*(42) = 0.73, *p* = 0.47].
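The group comparison of expectation change can be checked from the reported summary statistics with a pooled-variance Student's *t*-test. The sketch below is illustrative only: the function name `pooled_t` is hypothetical, and the group sizes of 24 (nocebo information) and 20 (standard leaflet) are an assumption inferred from the reported percentages and the total of 44 participants, not values stated for this test.

```python
import math
from typing import Tuple


def pooled_t(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Student's t statistic for two independent samples from summary statistics.

    Uses the pooled variance estimate, i.e., assumes equal variances in both groups.
    Returns (t, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Group sizes 24 and 20 are assumed (see lead-in); means and SDs are from the text.
t_value, df = pooled_t(0.33, 1.12, 24, 0.10, 0.97, 20)
# t_value is about 0.72 with df = 42
```

Because the means and standard deviations are rounded to two decimals, the recomputed statistic can differ slightly from a value calculated on the raw data.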

*N* = 42 participants completed the 4-day follow-up assessment. A total of *n* = 41 (97.6%) participants reported at least one nocebo side effect. Participants in the nocebo information group (*n* = 22) and the standard leaflet group (*n* = 20) reported nocebo side effect sum scores (intensity × numbers) of *M* = 8.2 (SD = 8.8) and *M* = 9.0 (SD = 7.2), respectively. On average, *M* = 5.7 (SD = 5.1) and *M* = 6.4 (SD = 4.6) nocebo side effects were indicated, respectively. No group differences were found for the side effect sum score at 4-day follow-up (estimated difference: −0.42, BCa 95% CI [−3.22; 2.11], *p* = 0.78).

*N, 44; SE, standard error; BCa, bias-corrected and accelerated; CI, confidence interval; Nocebo, nocebo information group; Standard, standard leaflet group.*

*<sup>a</sup>Estimates of general linear models with bootstrap sampling (2,000 samples), adjusted for baseline symptoms held constant at its mean.*

*<sup>b</sup>Mean estimated group difference/(standard error of the estimated group difference \* √ sample size of the standard leaflet group).*

*<sup>c</sup>A list of 20 symptoms was presented, of which 7 were portrayed as bogus side effects in the leaflet, and 13 were common side effects of medications (nonlisted). The severity of each symptom was rated as 1 "mild," 2 "moderate," or 3 "severe."*

*\*p < 0.05, \*\*p < 0.01*

#### Evaluation of the Nocebo Information

After participants were debriefed about the true study goal, most of them (*n* = 36, 85.7%) considered the nocebo information to be useful in general. Five participants wrote additional comments with regard to its usefulness. One person wrote: "For me, it [the nocebo information] had no effect because I read the potential side effects only briefly. But now I remember that I had an earache which made me remember the side effect tinnitus. I had a pretty strong headache and thought, if I really had taken medication, this one did not work at all, yet the side effects did affect me." Another person wrote, "I would have believed the same thing [referring to the case example in the nocebo information], because I am a little anxious." Three individuals referred to the nocebo information as "interesting."

*Figure 1 (caption fragment): For interaction effects, log-likelihood tests comparing each model with the intercept-only model are shown in the upper left area. \*p < 0.05, \*\*p < 0.01.*

### DISCUSSION

The present findings suggest that participants with weekly headaches report fewer nocebo side effects when they have previously been informed about the nocebo effect. In this experimental, ostensibly double-blind medication study, we found that after placebo intake, individuals who received a one-page nocebo information sheet embedded in the medication leaflet reported on average 2.8 (95% CI [1.0; 4.4]) fewer symptoms compared to participants who received only the medication leaflet. Nocebo side effects were significantly associated with heightened baseline symptoms, trait anxiety, and a monitoring cognitive coping style, and in trend with perceived sensitivity to medicine and side effect expectations. No associations were found with a blunting cognitive coping style or gender. Explorative moderation analyses indicate that the beneficial effects of the nocebo information are more pronounced among participants with high rates of baseline symptoms, participants who perceived themselves to be highly sensitive to medication, and participants who were more confident that they would develop side effects.

Novel treatments may direct an individual's attention towards potentially meaningful symptoms, an essential process for initiating corresponding health behavior, e.g., side effect treatment and coping, or, as in double-blind trials, for detailed recording of adverse events to evaluate treatment safety. Barsky (3, 61) proposed that nocebo side effects emerge when everyday complaints are misattributed as side effects. These symptoms, in turn, can be amplified through the individual's selective attention towards bodily signals. The nocebo information provides a framework which allows for a more benign interpretation of symptoms and thereby breaks the vicious circle of amplification. Because the treatment in our study was inert, we cannot evaluate whether symptom amplification can be prevented; however, we have shown that the additional information may help reduce symptom misattribution.

As implied in Barsky's theory, and in alignment with a number of empirical studies (43, 62), some patients appear to be more prone to developing nocebo side effects than others. Etiological models on symptom exacerbation through psychological factors postulate that patients with health worries and generally higher anxiety tend to engage in selective interoceptive awareness (37). This is reflected in our findings; participants with increased trait anxiety developed more nocebo side effects. This link has also been found in other studies (33, 59, 63). A monitoring cognitive coping style, which, by contrast, has not previously been investigated in the context of nocebo effects, predicted nocebo side effects as well. "Monitorers" seek to gather as much information as possible about health risks. We propose that both processes (monitoring health information and monitoring bodily signals) originate from the same motivational goal of gaining reassurance. It is therefore likely that certain patients score high on both characteristics. In accordance with this reasoning, we found that a blunting cognitive coping style, i.e., avoiding information in the face of medical threats, was not associated with nocebo side effects. Lastly, we found a high correlation between nocebo side effects and baseline symptoms. Patients with more baseline symptoms have a larger "pool" of symptoms from which they might identify a side effect. In summary, patients who have many baseline symptoms, are more anxious, or tend to seek out information when facing potential health threats are more vulnerable to developing nocebo side effects.

In contrast to previous studies (33, 40, 64), we did not find an association between female gender and nocebo side effects. However, our sample size was small, and the proportion of female participants was high (70.5%), which does not allow for conclusions in this regard.

Notably, the nocebo information did not buffer the effect of trait anxiety and monitoring on nocebo side effects. It did, however, buffer the effects of baseline symptoms, perceived sensitivity to medicine, and side effect expectations on nocebo side effects. A link between perceived sensitivity to medicine and side effects, and a link between side effect expectations and side effects, have been found in previous research (12, 13, 41, 65). In this study, these associations constitute only a trend. The predictive coding paradigm suggests that prior information generates predictions which, in turn, cocreate perception (66, 67). Thereby, sensory input is more likely to be perceived in line with predictions. Henningsen and colleagues suggested that enabling more precise predictions would facilitate a more differentiated perception of bodily sensations (66). Both side effect expectations and perceived sensitivity to medicine, which is characterized by agreeing to statements like "My body overreacts to medicines" or "Even small amounts of medicine can upset my body," are predictive of side effect development. We believe that, by distinguishing between specific and nonspecific side effects in the nocebo information, participants limited their predictions about side effects to the symptoms mentioned in the leaflet. This suggestion is corroborated by the finding that the groups differed only with regard to the side effects which were not listed in the leaflet, but not those which *were* listed. Interestingly, the specification of prediction was not reflected in a change of side effect expectations. Since the term side effects usually refers to pharmacological side effects, we presume that patients recognize that nocebo effects are, by definition, not side effects. In other words, knowing that symptoms can be misperceived as side effects and therefore intensify is, from the patient's perspective, unrelated to pharmacological side effects and corresponding expectations.

The overall rate of nocebo response (70.5%) was higher compared to previous clinical trials. Adverse event rates following placebo intake amount to 18.4–18.7% for the acute treatment of migraine and cluster headaches and 24.0–42.8% for the preventive treatment of migraine and tension-type headaches (8). Mitsikostas et al. (9) have argued that high nocebo response rates reflect a more burdened patient population since comorbidities such as somatization and anxiety are more common among chronic headache patients. Indeed, a US survey with migraine patients found depression (63.8%), anxiety (60.4%), chronic pain (39.5%), and irritable bowel syndrome (29.3%) to be the most common comorbid conditions (68). However, whether or not this rationale is applicable to our patient sample cannot be confirmed due to the lack of diagnostic information. The discrepancies from other studies may also arise from different methods of adverse event assessment. Several reviews have pointed out inadequate reporting of adverse events in clinical trials (69, 70). Assessments commonly consist of open-ended questions from the investigators and spontaneous reports of participants, which leads to lower side effect reports compared to a systematic assessment of side effects as used in this trial.

At the 4-day follow-up, 97.6% of participants reported nocebo side effects. These reports did not differ by group. In line with these findings, a recent study showed that framing of side effect information reduced nocebo side effects short term but not after 24 h (33). However, we did not induce nocebo effects after 4 days due to ethical reasons but suggested a potential positive effect of the medication for 4 days. Consequently, some participants might have perceived side effects after 4 days to be unlikely. Given that the nocebo side effect sum scores at the 4-day follow-up were strikingly high compared to post-intake (difference by 4.1 points), it is uncertain whether some participants might have simply specified all of their symptoms, irrespective of whether they were attributed to the pill. Conclusions about the persistence of an indirect nocebo induction, i.e., through a leaflet and without verbal suggestions of symptom worsening, and the mid- or long-term beneficial effects of the nocebo information cannot decisively be drawn from our data. Further studies are warranted to this end.

### Limitations

This study has a number of limitations due to its pilot character. The sample size is small; although we conducted interaction tests which are recommended to assess differential subgroup effects (56), the moderation analyses, in particular, are based on a modest number of participants. These results should be viewed as hypothesis-generating and necessitate further evaluation in future studies. In addition, the sample size calculation was based on a Student's *t*-test for independent samples, yet main analyses were conducted after adjustment for baseline symptoms. Given that after inclusion of a covariate, a bigger sample size might have been necessary, our sample size estimation was liberal. The time points of 2 min and 4 days were chosen based on ethics and prior research on nocebo effects and do not align with the onset and duration of actual headache medications. In other words, studies which ostensibly administer medications do not give suggestions into a "vacuum" but rather trigger expectations related to the patients' prior experiences. Common headache drugs reach maximum plasma concentration 30–120 min after intake (71), whereas assessment after 2 h is a gold standard in headache trials (72, 73). Therefore, the direction of bias is unknown. On the one hand, nocebo side effects may be underestimated due to the short time period of 2 min. On the other hand, the short time frame may have promoted cognitive availability of the nocebo information and resulted in an overestimated influence of the intervention. In addition, patients in headache trials are instructed to take the medication when experiencing acute symptoms. In our study, six participants did not have a headache at the time of pill intake. In light of this, placebo effects at post were marginal. However, this does not necessarily signify unreliable reports of nocebo side effects. Prior evidence has shown that nocebo effects are elicited more easily than placebo effects (59, 74). 
Nonetheless, matching assessment points to the duration of effect of available medication and facilitating placebo effects could render more precise estimates of nocebo side effects and of the intervention effect, also with regard to its sustained effects.

It should be noted that our findings, although potentially highly relevant, cannot be transferred into clinical practice. In contrast to clinical practice, all participants took a placebo instead of an active medication. Moreover, they believed that they were taking part in a drug study, i.e., had a 50/50 chance of receiving either the medication or the placebo. This context differs from clinical practice, in which patients have 100% certainty of receiving treatment. Again, the direction of bias is unknown. Nocebo side effects could have been underestimated if participants believed they were in the placebo arm. They could also have been overestimated since uncertainty about safety and group affiliation can result in increased monitoring of symptoms. Lastly, given our liberal inclusion criteria (weekly headaches for at least 6 weeks), we cannot characterize our sample with regard to headache diagnoses and comorbidities. It is probable that our study included individuals with both episodic and chronic headache types. Differential subgroup effects by diagnosis cannot be investigated.

### Implications

This study provides the first evidence that informing about the nocebo effect may be a viable strategy for reducing nocebo side effects. The strengths of the nocebo information consist of its convenience and feasibility; a standardized, short information sheet can be handed out by practitioners or pharmacists as an add-on to a new medication. However, due to its limitations, this trial should be perceived as a proof-of-concept. To determine the value of the nocebo information, further trials in clinical practice, i.e., with clearly specified patient groups undergoing active treatments, are needed.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Ethics Commission of the Chamber of Psychotherapists in Hamburg with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethics Commission of the Chamber of Psychotherapists in Hamburg. English translation of the ethics statement: Application number: 13/2014-PTK-HH Research project: "Can a patient education reduce side effects? An Experimental Study on the Nocebo Effect" Dear Prof. Dr. Nestoriuc, The Ethics Commission of the Chamber of Psychotherapists in Hamburg has issued the following statement after examining the documents submitted by you in order to examine the compatibility of the given study with ethical principles: After reviewing the documents submitted by you as the responsible head of the study on the aforementioned research project dated 10 September 2014, the Ethics Commission of the Chamber of Psychotherapists in Hamburg came to the conclusion that there were no ethical objections to conducting the study. Based on this statement, we can inform you that there are no objections to the conduct of the study. Yours sincerely, Prof. Dr. Hertha Richter-Appelt Chairwoman of the Ethics Committee.

Original in German: Antragsnummer: 13/2014-PTK-Hamburg Forschungsvorhaben: "Kann eine gute Aufklärung Nebenwirkungen reduzieren? Eine experimentelle Studie zum Nocebo-Effekt". Sehr geehrte Frau Prof. Dr. Nestoriuc, die Ethikkommission der Psychotherapeutenkammer Hamburg hat nach Prüfung der von Ihnen vorgelegten Unterlagen auf Prüfung der Vereinbarkeit der im Rubrum genannten Studie mit ethischen Grundsätzen die folgende Stellungnahme abgegeben: Nach Sichtung der von Ihnen als verantwortlicher Studienleiterin eingereichten Unterlagen zu dem vorgenannten Forschungsvorhaben vom 10. September 2014 ist die Ethikkommission der Psychotherapeutenkammer Hamburg zu dem Ergebnis gekommen, dass der Durchführung der Studie keine ethischen Einwände entgegenstehen. Aufgrund dieser Stellungnahme können wir Ihnen mitteilen, dass der Durchführung der Studie keine Einwände entgegenstehen. Mit freundlichen Grüßen, Prof. Dr. Hertha Richter-Appelt, Vorsitzende der Ethikkommission.

### AUTHOR CONTRIBUTIONS

YN and TK initiated the study design. TK and YP conducted the study. YP and MS analyzed and interpreted the data. YP drafted the manuscript. All authors made refinements and approved the final manuscript.

### FUNDING

All costs including reimbursement of participants and open access publication fees were/will be covered by YN's university budget.

#### ACKNOWLEDGMENTS

The authors thank Twyla Michnevich for proofreading the manuscript.

#### REFERENCES

from an analogue online study. *J Psychosom Res* (2015) 79(6):519–29. doi: 10.1016/j.jpsychores.2015.10.003


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Pan, Kinitz, Stapic and Nestoriuc. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Studying a Possible Placebo Effect of an Imaginary Low-Calorie Diet

*Valentin Stefanov Panayotov\**

*National Sports Academy, Sofia, Bulgaria*

In recent years the prevalence of obesity in developed countries has increased to the point that some authorities have coined the term "obesity epidemics." Combining energy intake control measures (via diet) with protocols for increasing energy expenditure (predominantly *via* low to medium intensity aerobic exercise) has proved to be the most effective approach in addressing this problem. In this experiment, we tested for a possible placebo effect of a weight loss program on changes in body mass and fat tissue in overweight or obese people. Fourteen healthy adults of both sexes aged between 19 and 45 with body mass index (BMI) > 27 participated in the study. They were randomly assigned to two groups, one experimental and one control. The subjects in the experimental group followed an isocaloric diet but were told they were put on a calorie-deficient regimen. The subjects in the control group were aware they followed an energy-balanced diet. All participants were engaged in regular sessions of resistance exercise three times a week with a total energy cost of approximately 750–900 kcal/week. We studied within-group differences of body mass, percentage of fat tissue, and BMI. All three variables decreased in the experimental group: body mass (9.25 ± 5.26 kg), percentage of fat tissue (3.4 ± 0.97%), and BMI (2.88 ± 1.50). No statistically significant within-group differences were measured in the control group. Despite some methodological biases of the study construct, in our opinion, a placebo effect could partially explain the changes in the experimental group.

#### *Edited by:*

*Seetal Dodd, Barwon Health, Australia*

#### *Reviewed by:*

*Nathalie Michels, Ghent University, Belgium Victor Chavarria, Parc Sanitari Sant Joan de Déu, Spain*

#### *\*Correspondence:*

*Valentin Stefanov Panayotov v\_panajotov@abv.bg*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 13 December 2018 Accepted: 15 July 2019 Published: 30 July 2019*

#### *Citation:*

*Panayotov VS (2019) Studying a Possible Placebo Effect of an Imaginary Low-Calorie Diet. Front. Psychiatry 10:550. doi: 10.3389/fpsyt.2019.00550*

Keywords: placebo effect, obesity, anaerobic exercise, diet, body mass index, fat tissue

## INTRODUCTION

In recent years the prevalence of obesity in developed countries has increased to the point that some authorities talk about "obesity epidemics." According to data from 2014, more than 1.9 billion adults worldwide were overweight, with over 600 million being obese (1, 2). Obesity is strongly linked with some diseases with high social impact such as type 2 diabetes and cardiovascular disease (3–5). In addressing the problem, the effects of different weight-loss protocols have been extensively studied in recent years, most of them comprising interventions of hypocaloric diets and/or physical activity regimens (6–10). The most effective approach proved to be that of combining energy intake control measures (via diet) with protocols aimed at increasing energy expenditure (predominantly *via* low to medium intensity aerobic exercise) (11–17). Apart from the strictly mathematical part of the process of weight reduction (energy intake vs. energy expenditure), there are many other complex (including psychological) factors which influence the outcomes of such interventions (18–20). The aim of this experiment was to distinguish between the metabolic and psychological/behavioral components of a weight loss intervention. Usually, in clinical studies, the combined effect of intervention plus placebo is evaluated. In our experiment, we tried to measure only a possible pure placebo effect. We used a resistance exercise protocol, an approach that is not very popular among researchers (21–23). It is easier to apply for overweight and obese sedentary people. While aerobic cyclic movements most often require the involvement of the whole body, which is hard and in some cases impossible to achieve in such subjects, resistance exercise allows for dosing and targeting efforts to particular parts of the body and is less energy efficient.

Our hypothesis was that a nonrandom effect different than that of energy restriction and physical activity existed. More specifically, we tested for a pure placebo effect in a weight reduction therapy.

## MATERIALS AND METHODS

This study was carried out in accordance with the recommendations of the Scientific Projects and International Activities Guidelines of the Scientific Projects Committee of the Bulgarian National Sports Academy, and its protocol was approved by the Committee. All subjects gave written informed consent in accordance with the Declaration of Helsinki. Placebo response experiments imply incomplete information for the patient or even deception. For that reason, in most cases, they are subject to strict ethical scrutiny in clinical practice (24). According to ethical analysis and international ethical guidance, placebo protocols such as ours are permitted when scientifically indicated (25).

#### Subjects

Fourteen healthy adults of both sexes aged between 19 and 45 with body mass index (BMI) > 27 were recruited through an advertisement in a local gymnasium website. Prior to inclusion, we assessed each candidate's eligibility for participation in the experiment—all participants were interviewed about their overall health status and medical history. They were informed in detail about all possible health risks of the intervention. The participants were randomly assigned to two groups—one control (n = 7) and one experimental (n = 7). The sex representation in both groups was balanced.

### Energy Expenditure Estimation

The theoretical daily energy expenditure (which included the energy cost of the physical activity) was estimated using the protocols of Mifflin et al. (26–28) (for estimating basal metabolic rate) and Levine and Kotz (29, 30). Based on those data, we calculated the theoretical energy intake requirements for each participant.
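As a rough illustration of this estimation step, the sketch below uses the widely published Mifflin-St Jeor equation for basal metabolic rate, which is presumably what the cited protocol refers to. The `activity_factor` multiplier and the example participant are hypothetical stand-ins: the Levine and Kotz procedure used in the study is not reproduced here.

```python
def mifflin_st_jeor_bmr(mass_kg, height_cm, age_years, sex):
    """Basal metabolic rate in kcal/day from the Mifflin-St Jeor equation."""
    base = 10.0 * mass_kg + 6.25 * height_cm - 5.0 * age_years
    if sex == "male":
        return base + 5.0
    if sex == "female":
        return base - 161.0
    raise ValueError("sex must be 'male' or 'female'")


def daily_energy_requirement(bmr_kcal, activity_factor, exercise_kcal_per_week=0.0):
    """Theoretical daily requirement: scaled BMR plus the daily share of exercise cost.

    activity_factor is a hypothetical multiplier for non-exercise activity;
    it is an assumption of this sketch, not the study's method.
    """
    return bmr_kcal * activity_factor + exercise_kcal_per_week / 7.0


# Hypothetical participant, not taken from the study data.
bmr = mifflin_st_jeor_bmr(mass_kg=110.0, height_cm=180.0, age_years=35, sex="male")
requirement = daily_energy_requirement(
    bmr, activity_factor=1.3, exercise_kcal_per_week=825.0
)
print(f"BMR: {bmr:.0f} kcal/day, requirement: {requirement:.0f} kcal/day")
```

An isocaloric diet in this sense would set the prescribed intake equal to `requirement`.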

#### Anthropometric Measurements

We measured body mass (to an accuracy of 100 g) and the percentage of fat tissue twice—once in the beginning and once at the end of the study. For calculating BMI, we measured the height of barefoot subjects to the nearest 1 cm. We estimated the percentage of fat tissue using the bio-impedance methodology (31). For all the measurements we used Tanita SC-331S Total Body Composition Analyzer.

### Intervention Protocol

The subjects in the experimental group followed an isocaloric diet, but were informed it was a hypocaloric one with a deficit of 5,500 kcal weekly. Theoretically this should cause a weight loss of about 6 kg in 8 weeks. The control group participants knew they were following an energy balanced diet. Both diets consisted of 55–60% of carbohydrates, 15–20% of protein, and 25–30% of fats. The energy cost of the physical activity was approximately 750–900 kcal/week. Both diet interventions tried not to depart strongly from the individual preferences and habits.
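The figure of about 6 kg can be checked with simple arithmetic. The sketch below assumes the common rule-of-thumb conversion of roughly 7,700 kcal per kilogram of body mass; that constant is our assumption for illustration and is not stated in the protocol.

```python
# Assumption: rule-of-thumb energy equivalent of 1 kg of body mass.
KCAL_PER_KG_BODY_MASS = 7700.0


def expected_mass_loss_kg(weekly_deficit_kcal, weeks, kcal_per_kg=KCAL_PER_KG_BODY_MASS):
    """Theoretical mass loss for a constant weekly energy deficit."""
    return weekly_deficit_kcal * weeks / kcal_per_kg


# Values announced to the experimental group: 5,500 kcal/week for 8 weeks.
announced_loss = expected_mass_loss_kg(weekly_deficit_kcal=5500.0, weeks=8)
print(f"Expected loss: {announced_loss:.1f} kg")
```

Under this assumption the announced deficit corresponds to a little under 6 kg, in line with the value given above.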

The parameters of the physical activity protocol were as follows:


We used only complex basic exercises, which involved large muscle groups. Resistance exercises are energy inefficient, with low values of energy conversion efficiency, which greatly increases their energy cost compared to a strictly steady-state aerobic activity (32–34). Prior to the intervention, the participants underwent a 2-week-long preparatory endurance-training program consisting of 30 min steady-state jogging or cycling workouts three times a week, aimed at improving their basic functional fitness level. We controlled for adherence to the intervention protocol by holding regular meetings of every participant with a dietitian once every 2 weeks. All training sessions were held at SC Olympia Sports Centre in Sofia, Bulgaria and were supervised by professional strength training coaches.

### Statistical Analysis

We evaluated baseline between-group differences *via* one-way analysis of variance (ANOVA) at a level of significance of *p < 0.05*. We tested for within-group differences between pre- and postintervention values of the studied parameters using a standard paired-samples Student's t-test (at *p < 0.05*). As we studied anthropometric parameters, which are approximately normally distributed, we considered that the data met the assumptions of both tests (35).
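For readers unfamiliar with the within-group test, the following minimal sketch computes the paired-samples t statistic using only the Python standard library. The body mass values are invented for illustration and are not the study data; the critical value 2.447 is the standard two-tailed 5% threshold for 6 degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for a paired-samples t-test on pre - post."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post must have equal length of at least 2")
    diffs = [a - b for a, b in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    n = len(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical body mass values (kg) for a group of seven; not the study data.
pre_mass = [100.0, 110.0, 120.0, 95.0, 105.0, 130.0, 115.0]
post_mass = [98.0, 105.0, 117.0, 95.0, 101.0, 124.0, 112.0]

t_value, df = paired_t_statistic(pre_mass, post_mass)
T_CRIT_DF6_TWO_TAILED_05 = 2.447  # tabulated critical value for df = 6
print(f"t({df}) = {t_value:.2f}, significant: {abs(t_value) > T_CRIT_DF6_TWO_TAILED_05}")
```

With groups of seven, the test has 6 degrees of freedom, so only fairly large and consistent pre-post differences reach significance.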

### RESULTS

No between-group differences were found at baseline (**Table 1**). No between-group age differences were found either. There were no drop-outs; all participants successfully completed the experiment. All three variables decreased in the experimental group [data presented as mean value ± standard deviation (SD)]: body mass from 112.98 ± 19.93 to 103.73 ± 17.89 kg, difference of 9.25 kg; fat mass percentage from 39.38 ± 4.1% to 35.98 ± 4.46%, difference of 3.40%; and BMI from 34.62 ± 3.27 to 31.73 ± 2.89 kg/m², difference of 2.88 kg/m² (*p < 0.05*). The statistical power achieved for the parameters in the experimental group was as follows: body mass, 0.08; fat tissue percentage, 0.01; and BMI, 0.2. No significant within-group differences were found for the variables in the control group (**Table 2**).

#### TABLE 1 | One-way ANOVA of baseline values.


Five individuals of the experimental and four of the control group reported deviations from their prescribed nutritional protocols. They all consumed more sweets because of their preference, but they compensated for the calorie intake in other dietary components. While it was impossible to estimate precisely the energy costs of those deviations, the participants were experienced in dieting and calculating energy values of different foods and on most occasions successfully maintained their calorie intakes almost unchanged.

The raw data supporting the conclusions of this manuscript will be made available by the author, without undue reservation, to any qualified researcher.

### DISCUSSION

Our exhaustive search on the topic in the database of the US National Library of Medicine (https://www.ncbi.nlm.nih.gov/pubmed) did not find any similar studies with which to compare our results. The only publications that investigated placebo effects were those concerning the effects of different drug substances. Placebo responses are linked to patients' expectations for a treatment to work. While for drug testing this phenomenon has its potential explanations, its existence in dieting could be interpreted as a potential violation of the First Law of Thermodynamics.

We did not expect the implemented isocaloric regimen to affect body composition or body mass. Despite that, the results suggest that some placebo effect of the intervention exists. In our opinion, this indicates that the metabolic considerations behind constructing a weight loss program comprise only a part of all the tools available for treating obesity. There are many ambiguous psychological and behavioral mechanisms of the process yet to be explored. Our study marks only one of all possible directions for future research on that topic. Interestingly, the participants in the control group reduced their weight and fat tissue too, though not significantly. However, the significance of the within-group difference in BMI was very close to the borderline value of 0.05 (**Table 2**). We could speculate that we witnessed the body composition changing potential of strength training, a well-documented phenomenon that has been studied extensively by many researchers (36, 37). Such speculations, though, need further research in order to be confirmed (e.g., by increasing the number of participants and/or the duration of experiments). To be more precise, to achieve the standard level of statistical power of 0.8 for the differences in body mass in the experimental group, at least 66 participants would be necessary (*p* fixed at 0.05, two-tailed test). The results for fat tissue percentage and BMI would require at least 25 and 19 subjects, respectively. The numbers are even higher for the control group.

TABLE 2 | Within-group differences between baseline and final values.

There are some potential biases in the design of the study. We did not control for adherence to the prescribed protocol on a daily basis. Instead, we interviewed the participants about their daily routines during our regular meetings once every 2 weeks. Although few of them reported departures from the instructions, the study protocol lacked any mechanism for verifying adherence to the diet plan. Accordingly, some deviations from the study protocol could have gone unnoticed. For example, it is possible that some overenthusiastic participants periodically underate and/or inadvertently increased their routine daily physical activity. In addition, we did not control for the use of diet- or performance-enhancing drugs. In our opinion, the abovementioned reasons partially explain the observed placebo effect, but many other processes, including psychological ones, could have been at work. Most obese people have a long history of trials and failures with different types of weight loss protocols, which can lead to a build-up of frustration over the years. For that reason, the opportunity to participate in an experiment supervised and controlled by professional dietitians and strength-training coaches could have been a great stimulus for some of the participants to reduce their calorie intake and/or increase their energy expenditure beyond what was prescribed and lose weight as a result. In any case, the overall effect of any potential deviations from the protocol was not big enough to explain the observed placebo effect. Assuming a uniform body mass decline over time, a loss of more than 9 kg (experimental group) in 8 weeks means a reduction of more than a kilogram per week. This translates into a daily calorie deficit of more than 1,000 kcal. A deficit of such dimensions is too big to pass unnoticed. It is equivalent to 250 g of protein or more than 100 g of fat. In our opinion, the potential nonadherence to the protocol only partially explains the placebo effect.
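The deficit argument can be made explicit with a short calculation. This is only a sketch: the conversion of about 7,700 kcal per kilogram of body mass and the Atwater factors of 4 kcal/g for protein and 9 kcal/g for fat are assumptions introduced here, not values reported in the study.

```python
# Assumptions for illustration only (not stated in the study):
KCAL_PER_KG_BODY_MASS = 7700.0  # rule-of-thumb energy equivalent of 1 kg body mass
KCAL_PER_G_PROTEIN = 4.0        # Atwater factor
KCAL_PER_G_FAT = 9.0            # Atwater factor


def implied_daily_deficit_kcal(mass_loss_kg, weeks):
    """Daily energy deficit implied by a uniform mass loss over the given weeks."""
    return mass_loss_kg * KCAL_PER_KG_BODY_MASS / (weeks * 7.0)


# Mean body mass change reported for the experimental group.
mass_loss = 112.98 - 103.73  # about 9.25 kg
deficit = implied_daily_deficit_kcal(mass_loss, weeks=8)

print(f"Weekly loss: {mass_loss / 8:.2f} kg")
print(f"Implied deficit: {deficit:.0f} kcal/day")
print(f"Protein equivalent of 1,000 kcal: {1000.0 / KCAL_PER_G_PROTEIN:.0f} g")
print(f"Fat equivalent of 1,000 kcal: {1000.0 / KCAL_PER_G_FAT:.0f} g")
```

Under these assumptions the implied deficit exceeds 1,000 kcal per day, which is consistent with the lower bound used in the text.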

Based on the results of the study we reached some (preliminary) conclusions. First, despite some possible biases in the design of the study, we found some evidence for the existence of a placebo effect of an imaginary hypocaloric diet. Some kind of psychological/motivational/behavioral therapy could probably become a very important part of the whole weight loss process. In our opinion, further studies on the placebo effect hypothesis in dieting are necessary in order to derive more definitive conclusions. Second, regular physical activity of the anaerobic-lactic type (performed in a neutral energy balance condition) does not induce weight loss or changes in body composition in the short term. These findings are corroborated by many studies (38–40). In any case, assessing the potential effectiveness of regular anaerobic physical activity on body mass and body composition changes in overweight and obese people requires further research. Additionally, we consider that our study only lays the basis for further investigations, which should reach more decisive results and either replicate or refute ours.

### REFERENCES


### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Scientific Projects Committee of Bulgarian National Sports Academy with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Scientific Projects Committee of Bulgarian National Sports Academy.

### AUTHOR CONTRIBUTIONS

VP organized the recruitment process and, together with the dietitian, held the initial interviews. The author managed the research process and was responsible for the statistical processing of the raw data. VP took part in the regular meetings of the participants and the dietitian and supervised the training sessions.

### FUNDING

This work was supported by the Scientific Projects Fund of the National Sports Academy of Bulgaria (Grant No. 223/09.04.2013). The terms of this arrangement have been reviewed and approved by the National Sports Academy of Bulgaria in accordance with its policy on objectivity in research.

### ACKNOWLEDGMENTS

We thank our colleagues, the dietitian Boriana Palatova and the strength training coach Plamen Ananassov, of SC Olympia, Sofia, Bulgaria who provided assistance and expertise that greatly contributed to the research. We also thank Prof. Krassimir Petkov of Bulgarian National Sports Academy. This study was presented at the 7th International Scientific Congress SSA on October 9–12, 2014. The manuscript is submitted under permission of the publisher of the proceedings of the conference, *Journal of Sport and Science*.



**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Panayotov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebo and Nocebo Effects in Patients With Takotsubo Cardiomyopathy and Heart-Healthy Controls

*Edited by: Seetal Dodd, Barwon Health, Australia*

#### *Reviewed by:*

*Karl Bechter, University of Ulm, Germany Victor Chavarria, Parc Sanitari Sant Joan de Déu, Spain*

#### *\*Correspondence:*

*Elisabeth Olliges Elisabeth.Olliges@med.uni-muenchen.de Joram Ronel j.ronel@tum.de*

*†These authors have contributed equally to this work and share first authorship.*

*‡These authors have contributed equally to this work and share senior authorship.*

#### *Specialty Section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 14 December 2018 Accepted: 15 July 2019 Published: 02 August 2019*

#### *Citation:*

*Olliges E, Schneider S, Schmidt G, Sinnecker D, Müller A, Burgdorf C, Braun S, Holdenrieder S, Ebell H, Ladwig K-H, Meissner K and Ronel J (2019) Placebo and Nocebo Effects in Patients With Takotsubo Cardiomyopathy and Heart-Healthy Controls. Front. Psychiatry 10:549. doi: 10.3389/fpsyt.2019.00549*

*Elisabeth Olliges1\*†, Simon Schneider2†, Georg Schmidt2, Daniel Sinnecker2,3, Alexander Müller2, Christof Burgdorf4,5, Siegmund Braun6, Stefan Holdenrieder6, Hansjörg Ebell7, Karl-Heinz Ladwig8,9, Karin Meissner1,10‡, and Joram Ronel8,11‡*

*1 Institute of Medical Psychology, Medical Faculty, LMU Munich, Munich, Germany, 2 Medizinische Klinik und Poliklinik I, Klinikum rechts der Isar, Technische Universitaet Munich, Munich, Germany, 3 German Centre for Cardiovascular Research (DZHK), partner site Munich Heart Alliance, Munich, Germany, 4 Klinik für Herz- und Kreislauferkrankungen, Deutsches Herzzentrum Munich, Technische Universitaet Munich, Munich, Germany, 5 Department of Cardiology, Heart and Vascular Centre Bad Bevensen, Bad Bevensen, Germany, 6 Institute of Laboratory Medicine, Deutsches Herzzentrum Munich, Technische Universitaet Munich, Munich, Germany, 7 Private Practitioner, Munich, Germany, 8 Department of Psychosomatic Medicine and Psychotherapy, Klinikum rechts der Isar, Technische Universitaet Munich, Munich, Germany, 9 Department of Epidemiology II, Helmholtz Zentrum, Munich, Germany, 10 Division of Health Promotion, Coburg University of Applied Sciences, Coburg, Germany, 11 Department of Psychosomatic Medicine, Klinik Barmelweid AG, Barmelweid, Switzerland*

The etiology of takotsubo cardiomyopathy (TTC), a rare, reversible, and acquired form of cardiac disease, is not yet fully explained. An exaggerated activation of the sympathetic nervous system (SNS) following stressful psychosocial life events is thought to be of key importance. In this experimental study, we tested whether TTC patients, compared to heart-healthy controls, respond more strongly to supporting placebo interventions and stressful nocebo interventions targeting cardiac function. In a single experimental session, 20 female TTC patients and 20 age-matched (mean age 61.5 ± 12.89 years), catheter-confirmed heart-healthy women were examined. Saline solution was administered three times i.v. to all participants, with the verbal suggestion that they were receiving an inert substance with no effects on the heart (neutral condition), a drug that would support cardiac function (positive condition), and a drug that would burden the heart (negative condition). Systolic and diastolic blood pressure (SBP/DBP), heart rate (HR), the endocrine markers cortisol (µg/dl) and copeptin (pmol/l), and subjective stress ratings (SUD) were assessed to examine alterations of the SNS and the hypothalamic–pituitary–adrenal axis (HPA). SUD was rated before and after each intervention. One pre- and three post-intervention serum cortisol and copeptin samples were assessed, and a long-term electrocardiogram as well as non-invasive, continuous blood pressure were recorded. The experiment revealed a significant increase of SUD levels in response to the nocebo intervention, while perceived stress remained unaffected during the preceding neutral and positive interventions. Increasing SUD levels were accompanied by higher SBP and an anticipatory increase of HR shortly before the nocebo intervention. SBP also increased in response to positive verbal suggestions (Bonferroni-corrected p-values < .05). Neither alterations of cortisol and copeptin due to the interventions nor significant placebo effects were observed. Interestingly, no differences between TTC patients and controls could be found. These findings do not support the assumption of an exaggerated activation of the SNS as a discriminatory factor for TTC. Since the nocebo intervention in particular revealed negative subjective and objective effects, our results underscore the urgent need to consider carefully the impact of verbal suggestions in the interaction with cardiac patients in daily clinical routine. This study is registered at the Deutsches Register Klinischer Studien (DRKS00009296).

Keywords: placebo effects, nocebo effects, takotsubo cardiomyopathy, cardiological response, sympathetic nervous system

#### INTRODUCTION

Placebo effects are conceptualized as neurobiological phenomena resulting from the positive psychosocial context in which a treatment is embedded. Correspondingly, a negative psychosocial context may induce negative clinical outcomes, referred to as "nocebo effects." The current state of research suggests that placebo and nocebo effects are mediated by explicit expectations and shaped by different means: social observational learning (1), classical conditioning (2), and verbal suggestions (3). The doctor's verbal suggestions inducing positive or negative outcome expectations are an important feature of placebo and nocebo effects (4–7). Placebo effects on conditions linked to the central nervous system (CNS), such as pain or Parkinson's disease, have been extensively investigated and their mechanisms are well understood (6, 8). For example, placebo analgesia is often associated with the release of endogenous opioids, whereas placebo-induced motor improvement in patients with Parkinson's disease could be connected to the release of dopamine in the dorsal striatum (8, 9). Several studies have demonstrated that placebo interventions can also affect peripheral organ functions (e.g., pulmonary and cardiovascular functions) controlled by the autonomic nervous system (ANS) (10–13), but results in this neglected area of placebo research are often ambiguous. For example, significant effects of verbal suggestions specifically targeting the diameter of coronary arteries were observed during coronary angiography. Participants received intracoronary saline injections together with the verbal suggestion that the "drug" would widen the heart vessels and improve cardiac perfusion. Interestingly, the verbal suggestion led to coronary vasoconstriction accompanied by chest pain reduction. Acute psychological burden, HR, and BP did not change significantly. The authors concluded that the coronary vasoconstriction was not caused by increased stress levels but by a reduction of sympathetic outflow and/or an increase of parasympathetic outflow to the cardiac vessels (12).

Takotsubo cardiomyopathy (TTC) (also referred to as "stress-induced cardiomyopathy" or "broken heart syndrome") is considered a very rare, reversible, and acquired form of primary myocardial disorder (14–16). TTC is characterized by an acute, functional disturbance in the contraction of the myocardium, primarily affecting mid and apical areas of the left ventricle, accompanied by symptoms and signs rather similar to those of the acute phase of a myocardial infarction (MI) (e.g., chest pain, dyspnea, or alterations in the electrocardiogram or in cardiac markers such as troponin), while the coronary arteries are mostly unaffected in TTC patients (17). Medeiros and colleagues found a similar impairment of systolic and diastolic function in TTC and post-MI patients, despite their completely different pathophysiology (18). An increased sympathetic tone as well as a concomitant enhanced myocyte and microvascular catecholamine sensitivity is considered to increase the individual's vulnerability and may therefore serve as a risk factor for the development of TTC (19).

Approximately 0.07–2.3% of patients with suspected acute coronary syndrome (ACS) are diagnosed with TTC after cardiac catheter examination, with almost 90% being postmenopausal women (14, 20–24). The etiology of TTC is not yet fully explained. A dysfunctional presentation and processing of external physiological or psychosocial stressors is assumed to initiate an inadequate activation of the sympathetic nervous system, and therefore a pathophysiological cascade in the TTC patient's myocardium (23, 25, 26). Triggers are not necessarily negative. A very small percentage of TTC patients (approximately 4%) experience a positive life event (e.g., a birthday party or a child's wedding) prior to the onset of the disease. It is supposed that positive as well as negative events proceed through analogous signal pathways in the central nervous system (26, 27).

Further, data on the recurrence of TTC vary, but relapses are not infrequent, with approximately 1.5% to 2.4% per patient-year and a rate of 5% to 11.4% within the first 4 years (25, 28–30). In addition, several studies found a significantly higher mortality rate in TTC patients in comparison with a control group of the same age and sex (25, 31, 32). Apart from cardiovascular events, this appears to be due to an increased prevalence of non-cardiac comorbidities, which suggests a persistent pathology, presumably referring to an alteration of the sympathetic system, inherent in TTC patients (28, 33–36).

Based on these considerations, we investigated whether the cardiac regulation of TTC patients reacts more sensitively to positive and negative external stimuli than that of heart-healthy individuals. In a case–control study, we examined the cardiovascular response to placebo and nocebo interventions targeting cardiac function in 20 TTC patients, on average two years after disease onset, and 20 matched heart-healthy individuals. We hypothesized that in TTC patients cardiovascular and perceived stress parameters would respond more strongly to placebo and nocebo interventions than in healthy individuals.

#### MATERIAL AND METHODS

#### Sample

This case–control study (controlled for age) included 20 women diagnosed with TTC and 20 volunteers (CG) free of significant coronary artery disease (vessel stenosis ≤30%, confirmed via heart catheterization in the past) (see **Table 1**). TTC patients were diagnosed according to the Mayo Clinic diagnostic criteria for takotsubo cardiomyopathy: 1) transient hypokinesis, akinesis, or dyskinesis of the left ventricular mid segments with or without apical involvement, with regional wall motion abnormalities extending beyond a single epicardial vascular distribution and a stressful trigger often, but not always, present; 2) absence of obstructive coronary disease or angiographic evidence of acute plaque rupture; 3) new electrocardiographic abnormalities (either ST-segment elevation and/or T-wave inversion) or modest elevation in cardiac troponin; and 4) absence of a pheochromocytoma or myocarditis (37). Participants with a significantly decreased ejection fraction (<55%) or low German proficiency were excluded from the study. The mean time interval between the episode of TTC and participation in the study was 24.61 months (±22.8). A total of 40 eligible women diagnosed at "Deutsches Herzzentrum" and "Medizinische Klinik und Poliklinik I, Klinikum rechts der Isar," Technical University, Munich, were enrolled in the study; they were contacted *via* mail and followed up by a phone call. The study protocol was approved by the institutional review board. All participants received 50 € in compensation, paid by the Deutsches Herzzentrum, Munich.


*Values are mean* ± *SD or n (%). †Mann–Whitney–U test, ††Chi-square-Test.*

#### Endpoints

The following parameters were chosen as primary endpoints to indicate alterations of the SNS and the HPA, the main peripheral pathways of the human stress system: non-invasive continuous systolic (SBP) and diastolic blood pressure (DBP) as well as heart rate (HR), measured with the Finapres Nova device (Finapres Medical Systems B.V.), as established indicators of the adaptive response to altered environmental, bio-psycho-social stimuli. Both cardiac functions can adapt to tune the delivery of oxygenated blood by raising the beating frequency or the pressure with which the blood is pumped through the arteries (38). In addition, perceived stress was assessed by the "subjective units of distress scale" (SUD), an 11-point numeric rating scale from 0 (no stress) to 10 (maximal stress). Furthermore, blood samples were taken to measure cortisol (µg/dl) and copeptin (pmol/l). Cortisol has been shown to be proportionate to the degree of stress at a peripheral level. To gain a more direct insight into the stress level at the cerebral level, copeptin was chosen as a second humoral stress marker. Copeptin, a pre-hormone of vasopressin, is considered a relevant marker for acute, endogenous stress, especially associated with cardiological diseases (e.g., myocardial infarction) (39–42).

#### Procedure

The experiment was performed in the Department of Cardiology at Klinikum rechts der Isar, between 10:00 am and 1:00 pm in a cardiological outpatient lab. Participants were examined at different time points with no contact with each other; therefore an exchange of experiences during the experiment was not possible and no "placebo-by-proxy" effects could emerge (43). After obtaining informed consent, participants received a transthoracic echocardiography to assess standard parameters [e.g., septum thickness (mm) and ejection fraction (%)]. Thereafter, the study coordinator connected the participants to the Finapres Nova device (Finapres Medical Systems B.V.) and activated the continuous measurement of cardiovascular parameters [blood pressure (mmHg), heart rate (bpm)] while the attending physician established vascular access and took the first blood sample [cortisol (µg/dl) and copeptin (pmol/l)] (see **Figure 1**).

At the beginning of the experiment (M0), the participants were asked to rate their perceived stress (SUD). After a baseline measurement of approximately 5 min, during which the cardiological parameters were continuously assessed, the first sham intervention took place (I1). Here, the physician administered 2 ml of 0.9% physiological saline solution (NaCl) intravenously together with a standardized neutral verbal suggestion that the intravenously administered solution would not cause any bodily changes, "similar to taking a sip of water." Thereafter, the first post-intervention measurement of physiological parameters was performed (approximately 5 min). At the end, patients were asked again to rate their level of distress on an 11-point numeric rating scale (from 0 = no stress to 10 = maximal stress) and blood samples were taken for a second time (M2). Subsequently, the same procedure was performed for the placebo and the nocebo interventions: after a pre-intervention measurement of physiological parameters of approximately 5 min, patients were asked to rate perceived stress levels (SUD) (M3). Next, 2 ml NaCl was administered intravenously accompanied by a standardized positive verbal suggestion that the intervention would "strengthen the heart," "blood pressure and heart rate would decrease," and "breathing would become easier" as the body would be "better supplied with oxygen" (I3). Then another post-intervention period (approximately 5 min) followed with continuous measurement of physiological parameters. At the end of this period, distress levels were assessed and blood samples were taken (M4). Again, after a pre-intervention period of approximately 5 min, stress ratings (SUD) were assessed (M5). Finally, the last 2 ml NaCl was administered analogously to the previous conditions, with the verbal suggestion that this intervention would "burden" the heart, that the heart would need to work "stronger and faster," and that "hot flashes" could occur (I5). The last post-intervention period (approximately 5 min) then followed, with continuous measurement of physiological parameters and assessment of distress levels, and the last blood sample was taken (M6). At the end of the examination, the study rationale was disclosed to the participants, and they were informed about the placebo character of the study, with the administered substance being only "water." Additionally, the individual echocardiography results were reviewed together with the patient.

### Statistical Analysis

Analyses were performed using IBM SPSS Statistics 25, with a *p*-value ≤ 0.05 considered significant. Mean values of HR, SBP, and DBP were calculated for the period from 200 to 20 s prior to the interventions (pre values) and 20 to 200 s after the interventions (post values). Data that did not fit a normal distribution were log-transformed. Pre-post changes of HR, SBP, and DBP induced by the neutral, positive, and negative interventions were compared between groups by means of a mixed-design ANOVA with the within-subject factors "time" (pre and post intervention) and "condition" (neutral, positive, and negative), and the between-subject factor "group" (TTC, controls). Subsequently, Bonferroni-corrected *post hoc* tests were performed. Due to the absence of a normal distribution, SUD levels were evaluated using Bonferroni-corrected Wilcoxon signed-rank tests and Kruskal–Wallis tests, respectively; changes of cortisol as well as copeptin levels were calculated using Wilcoxon signed-rank tests, Mann–Whitney–U tests, and Friedman tests.
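The windowing and correction steps described above can be sketched as follows. This is an illustrative reconstruction in Python rather than the SPSS workflow actually used: the function names, the `(time_s, value)` sample format, and the synthetic heart rate trace are our assumptions, and the Bonferroni adjustment shown is the plain multiply-by-number-of-tests form.

```python
from statistics import mean


def window_mean(samples, t_event, start_offset, end_offset):
    """Mean of values whose time lies in [t_event + start_offset, t_event + end_offset].

    samples is a list of (time_s, value) pairs from a continuous recording.
    """
    lo, hi = t_event + start_offset, t_event + end_offset
    values = [v for t, v in samples if lo <= t <= hi]
    if not values:
        raise ValueError("no samples inside the requested window")
    return mean(values)


def pre_post_means(samples, t_event):
    """Pre value: 200 to 20 s before the event; post value: 20 to 200 s after it."""
    pre = window_mean(samples, t_event, -200.0, -20.0)
    post = window_mean(samples, t_event, 20.0, 200.0)
    return pre, post


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Synthetic heart rate trace (one sample per second), not study data:
# 60 bpm before the intervention at t = 300 s, 62 bpm afterwards.
trace = [(float(t), 60.0 if t < 300 else 62.0) for t in range(0, 601)]
pre_hr, post_hr = pre_post_means(trace, t_event=300.0)
adjusted = bonferroni([0.01, 0.02, 0.5])
print(pre_hr, post_hr, adjusted)
```

Excluding the 20 s on either side of the injection keeps the act of administering the saline itself out of both averages.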

## RESULTS

### Baseline Characteristics

TTC patients and controls were comparable with regard to age, employment situation, living situation, and quality of life. The time point of evaluation did not differ between groups and the mean time span between the TTC diagnosis and the examination was 24.61 months (±22.8) (**Table 1**).

## Subjective Units of Distress (SUD)

SUD changes from before to after the neutral, positive, and negative interventions were evaluated using Wilcoxon signed-rank tests. No significant changes were observed in response to the neutral and positive verbal suggestions (Bonferroni-corrected *p* = .1 and *p* = .06, respectively). However, SUD ratings increased in response to the negative verbal suggestion (Bonferroni-corrected *p* < .001), indicating a nocebo effect on perceived stress. SUD did not differ between patients with a history of TTC and heart-healthy controls at any time point during the experiment (Mann–Whitney–U test, all Bonferroni-corrected *p* > .05) (**Figure 2** and **Table 2**).

### Systolic Blood Pressure (SBP)

The mixed-design ANOVA with the within-subject factors "time" (pre, post intervention) and "condition" (neutral, positive, negative) and the between-subject factor "group" (TTC, controls) was used to examine SBP levels. A significant interaction between "time" and "condition" was found (*F*(2,76) = 14.09; *p* < .001). *Post hoc* tests showed higher SBP levels in response to the negative and the positive verbal suggestions as compared to the neutral verbal suggestion (Bonferroni-corrected *p*-values, *p* = .045 and *p* = .002, respectively). There was also a significant main effect for "condition" (*F*(2,76) = 3.2, *p* = .047). Bonferroni-corrected *post hoc* tests, however, revealed no significant difference between conditions. No other main or interaction effects were significant (**Figure 3** and **Table 2**).

TABLE 2 | Subjective Units of Distress (SUD), systolic blood pressure (mmHg).

### Diastolic Blood Pressure (DBP)

The mixed-design ANOVA for DBP levels with the within-subject factors "time" (pre and post intervention) and "condition" (neutral, positive, and negative) and the between-subject factor "group" (TTC and controls) revealed no significant main or interaction effects (**Figure 4** and **Table 2**).

### Heart Rate (HR)

The mixed-design ANOVA with the within-subject factors "time" (pre and post intervention) and "condition" (neutral, positive, and negative) and the between-subject factor "group" (TTC and controls) for HR levels revealed a significant interaction effect between "time" and "condition" (*F*(2,76) = 5.5; *p* = .01). Simple effects analyses showed that this interaction was due to higher HR levels before the negative verbal suggestion compared to before the positive verbal suggestion, indicating an anticipatory increase of HR (Bonferroni-corrected *p* = .02). Furthermore, a significant main effect of "condition" was found (*F*(2,78) = 5.11, *p* = .01), with higher HR levels in the nocebo condition compared to the neutral condition (Bonferroni-corrected *p* = .037). Finally, the main effect of "time" was significant (*F*(1,39) = 46.8, *p* < .001), which was due to increasing HR levels from before to after the intervention (estimated means ± SE, before: 56.5 ± 1.2 and after: 57.4 ± 1.2). No other main or interaction effects were significant (**Figures 5**, **6** and **Table 2**).

### Humoral Stress Markers

Cortisol levels at baseline and after the neutral, the positive, and the negative verbal suggestions were compared by Friedman tests. Results revealed a significant difference between conditions (*χ*² = 64.3, *p* < .001), which was due to a significant decrease of cortisol levels from condition to condition (Wilcoxon tests, all Bonferroni-corrected *p* < .001). No significant group differences between TTC patients and controls were observed in any condition (Mann–Whitney–U test, all Bonferroni-corrected *p*-values = 1) (**Table 2**). A Friedman test for copeptin levels at baseline and after the neutral, the positive, and the negative verbal suggestions revealed no significant differences between conditions (*p* = .84). No significant differences between TTC patients and controls were observed in any condition (Mann–Whitney–U test, all Bonferroni-corrected *p* = 1) (**Figure 7** and **Table 2**).

## DISCUSSION

In this study, we investigated cardiac, psychological, and endocrine stress responses to placebo and nocebo interventions targeting the heart in patients with a history of TTC and matched heart-healthy controls. Although the pathophysiology underlying TTC is not yet entirely clear, a dysfunctional, overmodulated stress response with enhanced sympathetic stimulation might be of key importance (19). We expected that physiological and behavioral responses to placebo and nocebo interventions would be more pronounced in patients with a history of TTC than in heart-healthy controls. In our study, a significant nocebo effect on subjective units of distress was detected for the whole group of 40 participants. Furthermore, HR increased significantly before the nocebo intervention, possibly indicating anticipatory anxiety towards the upcoming negative intervention. In addition, SBP levels increased significantly in response to both the placebo and nocebo interventions, suggesting a possible nocebo effect on SBP. The interventions did not induce significant alterations of DBP, cortisol, or copeptin. Contrary to our expectations, none of these responses differed between TTC patients and heart-healthy controls.

Evidence regarding placebo effects on end organ functions regulated by the ANS (e.g., cardiovascular or gastric functions) is less clear compared to the accumulating evidence for placebo effects on functions associated with the central nervous system [e.g., pain and itch; see Refs. (44–46)]. The ANS is characterized by high functional specificity provided through elaborated afferent and efferent fibers. Hence, it is not surprising that placebo and nocebo interventions targeting end-organ functions controlled by the ANS can display a high target-specificity (10, 47). The present study adds to this field of placebo research in addressing cardiac parameters that are under control of the autonomic nervous system (HR, SBP, and DBP), as well as subjective stress ratings (SUD) and humoral correlates (copeptin and cortisol). To our knowledge this is one of the first experimental studies, and the first placebo study, in patients with a history of TTC.

Our observations of significant effects of placebo and nocebo interventions on SBP and HR but not on DBP are in accordance with previous studies that investigated placebo and nocebo effects on cardiovascular parameters by means of verbal suggestions (13, 48). Former investigations that aimed to induce BP changes in healthy individuals by means of a placebo spray in combination with verbal suggestions, for instance, assumed that the absence of significant BP alterations could potentially be explained by lacking associations between memories of physiological or mental states and specific autonomic changes in the brain, which might be a necessary condition for verbal suggestions to induce the intended effects (49). This explanation was linked to a central organizational principle of the brain termed the "reuse of neural circuitry," which supposes that neural circuits established for a specific purpose are diversified or exploited for new uses without losing their original function (50). This explanatory approach might also offer insight into the results of our study. A link between memories of BP and HR decreases and specific autonomic changes in the brain that could be crucial for the targeted physiological changes might not have been available.

Also, the disclosure of the fixed order of the interventions, with the negative intervention at the end, might have prevented the positive verbal suggestions from evoking HR and BP decreases. The increase of HR prior to the beginning of the nocebo intervention might be linked to the disclosure of the chronological order of interventions as well and could indicate anticipatory anxiety towards the nocebo intervention. Lyby and colleagues showed that fear can eliminate placebo effects induced by verbal suggestions (51). In this regard, several imaging studies, especially from the area of pain, indicate that activity in the cortical nociceptive network is altered already during the anticipation of pain (52, 53). Moreover, the perception of pain does not depend exclusively on the specific noxious stimulus. Attention, expectation, and reappraisal seem to play an important role in the cognitive modulation of pain (54). Among other brain regions [e.g., dorsolateral prefrontal cortex (DLPFC) or the periaqueductal gray (PAG)], especially the rostral anterior cingulate cortex (rACC) seems to play an important role in the nociceptive network and reveals complex response patterns provoked by placebo interventions, but also during anticipation phases (55–59). An activation likelihood estimation meta-analysis also underlines the impact of negative expectations resulting from past experiences and present information on pain perception, which in turn might lead to higher pain intensity (60). Therefore, the anticipation of the negative intervention might explain the absence of relaxing effects due to the positive verbal suggestion and the increase of HR prior to the negative verbal suggestion. Nocebo effects (especially in the area of pain) have proven to be associated with complex biochemical and neuroendocrine mechanisms that seem to be connected to anticipatory anxiety (44).
This suggests the activation of the HPA axis or the SNS, which form the main peripheral pathways of the human stress system. The HPA axis regulates the release of cortisol, which has been shown to be proportionate to the degree of stress on a peripheral level. In our study, cortisol levels did not change as a response to the interventions, as could be seen in previous studies on nocebo hyperalgesia, but "naturally" decreased during the examination (61, 62). A similar phenomenon could be seen in a study by Meissner et al., who examined the predictive value of cortisol on motion sickness (63), and in a study by Benedetti et al., who showed that placebo and nocebo effects in cortisol secretion could not be induced by verbal suggestions but were affected by pharmacological conditioning (3). A meta-analysis, again in the area of pain, showed that the combination of verbal suggestions and conditioning induces larger placebo and nocebo effects than verbal suggestions alone (64, 65). Colloca and colleagues concluded that conditioning is less important in nocebo hyperalgesia compared to placebo analgesia (1). Unintended expectations and stimulus pairings could have developed through the TTC patients' experiences during their disease history, which might have led to a "blending" of expectation- and conditioning-induced effects in our examination (66).

The question of whether TTC is a transient, reversible disease, or is based on an enduring pathology affecting the sympathetic nervous system, is not yet fully clarified. It is widely believed that the suspected, exaggerated sympathetic activation within the acute phase of TTC is triggered by a preceding, mostly unexpected stressful life event [e.g., Ref. (21)]. The assumption that the normalization of the shape of the left ventricle and the systolic LVEF is accompanied by a regulation of the underlying sympathetic activation would in turn explain the lacking difference between TTC patients and heart-healthy controls. Additionally, recent studies indicate that the exposure to repeated stressors (in contrast to a single life event) is associated with the onset of TTC (67, 68); the authors argued that long-term stressful conditions might have led to an increased vulnerability towards strong emotional or physical stressors triggering the development of TTC. Within our study, positive as well as negative interventions were announced far in advance, took place in the "safe environment" of the hospital, and might therefore not have served as suitable stimuli for an exaggerated activation of the sympathetic nervous system. Another recent study focused on altered β-adrenergic signaling in TTC cardiomyocytes derived from pluripotent stem cells to explore whether genetic susceptibility underlies the pathophysiology of TTC. These findings point to a complex, multifactorial etiology of TTC with genetic predispositions combined with environmental factors such as age, postmenopausal hormonal status, and stressful life events (69). At the cellular level, Borchert and colleagues demonstrated that the TTC phenotype was associated with enhanced β-adrenergic signaling and higher sensitivity to catecholamine-induced toxicity (70). These considerations might prove promising for identifying distinguishing features between TTC patients and heart-healthy individuals.

Although the sample size of 20 TTC patients is comparatively high considering the prevalence of 0.07–2.3% among patients with suspected ACS, a larger number of participants in our study would have been desirable. As a further issue, the participants' medication intake (e.g., β-blockers) needs to be considered. Although the intake of antihypertensive medication was relatively similar in both groups, this could have dampened sympathetic activation and might therefore have reduced differences between groups. Furthermore, in the light of the explanations above, a combination of classical conditioning and verbal suggestions might have improved especially the placebo response but also the nocebo response. It could have shed new light on the impact of conditioning and verbal suggestions (i.e., explicit expectations) on placebo and nocebo effects within the autonomic nervous system. A further limitation might be the variety of time spans between the cardiac event and the investigation, which is attributable to the low prevalence of TTC. Had we included only patients within their acute phase, the recruitment period would have been enormously long, which would have meant that constancy in further parameters, for instance examiner or examination rooms, could not have been guaranteed. Whether TTC is seen as a reversible disease or as a persisting pathology in stress processing, a predefinition of one or more specific time points (e.g., within the acute phase together with a two-year follow-up) needs to be considered in a further study. For reasons of standardization and generalization (especially considering the relatively small sample size), the chronological order of the three interventions was fixed. Future studies should consider a cross-over design with a randomized order. The observation that the positive verbal suggestion did not reduce perceived stress is most probably due to a floor effect, since stress at baseline was very low (see **Figure 2**).
Finally, the consideration that anticipatory anxiety might have prevented the induction of a placebo effect suggests that fear ratings should additionally be collected during the course of the intervention.

In summary, this study was the first to investigate effects of positive and negative verbal suggestions in combination with the intravenous application of saline solution on cardiac parameters in patients with a history of TTC compared to controls. Only an increase of SBP could be observed as a response to both positive and negative suggestions. Secondly, the increase of SBP as a response to the nocebo intervention was congruently accompanied by higher levels of SUD. The increase of HR prior to the beginning of the nocebo intervention is possibly associated with anticipatory anxiety towards the nocebo intervention. Our hypothesis that the cardiac response towards placebo and nocebo interventions in patients with a history of TTC would be different from that of heart-healthy controls could not be confirmed with our data; a TTC diagnosed, on average, two years earlier does not appear to have an influence on the responsivity to placebo or nocebo interventions. This becomes even more important considering the fact that the etiology of TTC is not yet fully explained. The assumption that an altered sympathetic disposition might form the precondition for the pathophysiological cascade in the TTC patients' myocardium within the acute phase could not be verified with our placebo and nocebo interventions, at least at the time of our examination, on average two years after the acute phase.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Code of Conduct of the Technical University Munich, Germany, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the institutional board of the Technical University Munich, Germany.

### AUTHOR CONTRIBUTIONS

EO, KM, SS, GS, DS, AM, CB, HE, and JR designed the experiment. EO, SS, and DS performed the experiment. EO, KM, AM, DS, SB, K-HL and SH analyzed the data. EO drafted the first version of the manuscript. All authors interpreted the data and critically reviewed the manuscript.

#### FUNDING

The study was supported by the Deutsches Herzzentrum, Technische Universitaet, Munich, Germany. This work was supported by the German Research Foundation (DFG) and the Technical University of Munich within the funding program Open Access Publishing.

#### REFERENCES

and prevention. *Circulation* (2006) 113(14):1807–16. doi: 10.1161/CIRCULATIONAHA.106.174287


clinical characteristics, diagnostic criteria, and pathophysiology. *Eur Heart J* (2018) 39(22):2032–46. doi: 10.1093/eurheartj/ehy076


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Olliges, Schneider, Schmidt, Sinnecker, Müller, Burgdorf, Braun, Holdenrieder, Ebell, Ladwig, Meissner and Ronel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The Other Side of the Coin: Nocebo Effects and Psychotherapy

*Cosima Locher1,2†, Helen Koechlin1,3†, Jens Gaab1 and Heike Gerger1\**

*1 Division of Clinical Psychology and Psychotherapy, Faculty of Psychology, University of Basel, Basel, Switzerland, 2 School of Psychology, University of Plymouth, Plymouth, United Kingdom, 3 Department of Anesthesiology, Critical Care, and Pain Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA, United States*

Psychotherapy and placebo have a long history, and both have been shown to have significant and clinically meaningful effects. In the last 100 years and up to today, psychotherapy has been subject to an enduring and often heated debate about its mechanisms and its possible relationship to placebos and their effects. However, there is little awareness of the placebo effects' counterpart—nocebo effects (from Latin "I will harm")—in the context of psychotherapy. Embedded in the controversy of whether psychotherapy and placebo share some unwanted proximity in terms of effects and mechanisms, the question arises which role nocebo effects may play in relation to psychotherapy. By using two examples, this article analyzes and discusses two different kinds of possible associations between psychotherapy and nocebo effects. We close with possibilities of how to prevent the occurrence of nocebo effects in psychotherapy, including some specific recommendations for clinical practice.

#### *Edited by:*

*Seetal Dodd, Barwon Health, Australia*

#### *Reviewed by:*

*Johannes A. C. Laferton, University of Marburg, Germany Victor Chavarria, Parc Sanitari Sant Joan de Déu, Spain*

> *\*Correspondence: Heike Gerger heike.gerger@gmail.com*

*†These authors share first authorship.*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 14 March 2019 Accepted: 16 July 2019 Published: 08 August 2019*

#### *Citation:*

*Locher C, Koechlin H, Gaab J and Gerger H (2019) The Other Side of the Coin: Nocebo Effects and Psychotherapy. Front. Psychiatry 10:555. doi: 10.3389/fpsyt.2019.00555*

Keywords: nocebo effects, adverse (side) effects, psychotherapy, trauma debriefing, chronic primary pain, (negative) treatment expectations

### PSYCHOTHERAPY, PLACEBO, AND NOCEBO

Throughout its history, psychotherapy has been associated with placebos and their effects, and much of psychotherapy's progress and controversy are owed to this complex and disputed relationship (1, 2). The debate encompasses the first origins of psychotherapy itself (3), the early and seminal publications of Rosenzweig's so-called Dodo bird verdict of implicit common factors underlying the effects of diverse psychotherapy approaches (4), Eysenck's provocative claims of psychotherapy not showing greater effectiveness than spontaneous remission (5) or placebo treatment (6), Fish's concept of "Placebo therapy" (7), and the epistemological conundrum of placebo insights (8). More recently, supposedly verum psychotherapy was shown to be only slightly more effective than (pill) placebo (9) or nondirective supportive control treatments (10, 11), and observed differences between psychotherapies or psychotherapy and control treatments are strongly influenced by their structural equivalence (12–14) and the researchers' allegiance (15). Also, placebos with a psychotherapeutic meaning have been shown to be effective and to have effects comparable to those observed in subjects undergoing established psychotherapy treatments (16). These methodological and epistemological issues prompted Cuijpers and Cristea (17) to publish a guideline on "[h]ow to prove that your therapy is effective, even when it is not (…)." Thus, the acknowledgment and understanding of the relationship between psychotherapy and placebo is as problematic as it is relevant for research (18, 19) and an ethically sound clinical practice (20). But what about nocebo effects?

fpsyt.2018.00740.indd 1 Manila Typesetting Company 12/18/2018 09:14PM

Kennedy was the first to mention the nocebo effect some 50 years ago (21), emphasizing that the term *nocebo* uniquely refers to a subject-related response, a reaction inherent in the patient rather than in the active drug. Nocebo effects are typically understood as the malicious counterpart of placebo effects. They are usually seen as the adverse consequences of inert treatments that are associated with a negative meaning, whereas placebo effects are understood as a beneficial consequence of an otherwise inert treatment provided with a therapeutic meaning. Adverse consequences of treatments can manifest as so-called side effects to an active or inert treatment, nonadherence, or even discontinuation of treatment or the lack or attenuation of beneficial effects of otherwise effective treatments (22). Notably, not all adverse, negative, or missing treatment responses are to be attributed to nocebo effects (22). They can also occur because of the natural course of a given disease or disorder, the unsuitability or inaptness of a particular treatment for a given clinical condition, or the lack of responsiveness of a given clinical condition to the administered treatment. Furthermore, and of course, adverse outcomes could also be the consequence of treatment errors, malpractice, and unethical or harmful behavior of the practitioner or therapist (23). But if the mechanisms assumed to explain the occurrence of adverse events after treatment administration were the same as those assumed to underlie nocebo effects, this would suggest that the adverse events were related to nocebo effects.

#### MECHANISMS OF NOCEBO EFFECTS

Several mechanisms have been described as possibly underlying nocebo effects. One of these is patients' negative expectations. Negative expectations can be induced verbally, that is, when patients are informed about the possible occurrence of side effects, or through the behavior of the treatment provider (24). For example, a rather nonempathic, distanced therapist may induce a negative treatment outcome expectation in the patient (25). In addition, a high somatic focus (26) and the presence of certain personality traits, such as anxiety and pessimism (27), have been related to the occurrence of nocebo effects. Furthermore, classical conditioning effects may play a role, as previous (negative) experiences with the assumed medical agent may contribute to the occurrence of nocebo effects (27). The significance of classical conditioning as an essential aspect of nocebo effects has been demonstrated in pain research (28). In addition, nocebo effects have important neurobiological and emotional correlates, which are associated with changes in brain activation (29), and may play a significant role in psychotherapy.

An additional aspect that is highly relevant for clinical practice and closely linked to the generation of (negative) expectations is the so-called narrative. In each treatment setting, different narratives play a crucial role. First, patients have their own background, experiences, and belief systems that influence their narrative of both why the symptoms are present and how they should be treated (i.e., so-called client narratives or subjective illness narratives) (30). Second, treatment providers also have their expectations and a theoretical background that shape their illness narratives. Finally, depending on the treatment and next to the theory behind it, there might be manualized methods and strategies to be used in treatment (i.e., also called the healing narrative) (31). All of these narratives influence the verbal and the nonverbal communication between patients and treatment providers (32). Importantly, the narratives of patient and provider do not necessarily match. Placebo research has shown that to harness the underlying processes, an open and honest conversation about the mechanisms that underlie the respective treatment effects is key (33). However, unintentional negative suggestions, such as trivialization (e.g., "You don't need to worry"), or focusing attention (e.g., "Are you in pain today?") may trigger a nocebo response (32). Presumably, patients are especially sensitive to negative suggestions, particularly in vulnerable contexts.

#### NOCEBO EFFECTS IN PSYCHOTHERAPY

The question arises which role nocebo effects may play in the context of psychotherapy. Interestingly and relevant to our arguments, the possible negative effects of psychotherapy were common lore in the 1960s, as Barlow (34) points out: "Being awakened to the possibility that one could inflict dire harm on patients during each visit to the consulting room (or even on the way to it) was an everpresent source of anxiety during those early years for many of us" (p. 13). This "dire harm" could consist of the "Pavlovian construct of transmarginal inhibition or a state of complete shutdown of the organism," being inflicted through "intense experiences" (p. 13). Accordingly, although psychotherapy of course can have negative consequences, such as negative side effects but also nonimprovement of symptoms or even symptom worsening (34, 23), these are regrettably underreported and underinvestigated in psychotherapy research (35). Recently, however, symptom deterioration in waiting-list control groups has been described as possibly being caused by the same mechanisms that cause nocebo effects (36): The authors argue that negative expectations regarding the hypothesized inactive control treatment, together with the assumption that patients give up their coping strategies while waiting for a promised effective treatment, may explain the observed symptom deterioration. Following a similar line of argumentation, we discuss two examples to illustrate two possible associations between psychotherapy and nocebo effects, and we analyze whether symptom deterioration or nonimprovement observed in psychotherapy may be related to nocebo mechanisms. We close our article with possible recommendations on how to prevent the occurrence of nocebo effects in psychotherapy.

### THE ROLE OF NOCEBO EFFECTS IN THE TREATMENT OF CHRONIC PRIMARY PAIN

Patients with chronic pain often suffer from symptoms that have no clear etiology (37). The population of chronic pain patients is very heterogeneous; however, they usually share the experience of a long and unsuccessful treatment history. Patients and providers strive to find a clear symptomatic
cause for the pain, but although most interventions can help patients to deal with their pain, measurable pain reduction after an intervention is usually small in the long run (38). Chronic pain is multicausal, but treatment approaches often fail to take this into account as domain-specific approaches dominate the field (39). Furthermore, patients usually see multiple physicians and specialists during their treatment odyssey, and because the etiology for most chronic pain conditions is unknown and most likely multicausal, a plausible and satisfying narrative is hard to find. This patient group also often presents with a high somatic focus, a tendency to notice and report physical symptoms, which not only leads to reports of increased pain severity and disability as well as negative emotions but also likely influences providers' negative perception of these patients (40). Also, the nosology and terminology of the condition itself are a challenge. Chronic pain conditions without a clear etiology have been labeled as functional pain, medically unexplained pain, somatoform disorder, or psychosomatic symptoms (41). However, hearing that "It's all in your head" (as implied by the term "psychosomatic," for example) might lead to reduced compliance and hence symptom worsening. As past research has shown that compliance with medical advice is closely linked to patients' understanding of their illness, a new diagnostic term has far more implications than just semantics (42). The upcoming ICD-11 introduces a new diagnostic category called chronic primary pain (CPP), which emphasizes pain itself as the disease (41). This new term holds the potential to change the common understanding of chronic pain conditions and help explain why an interdisciplinary treatment approach is crucial. In experimental pain research, the occurrence of nocebo effects has been demonstrated using placebos accompanied by negative verbal suggestions (43).
All of these points may contribute to inducing nocebo effects, as negative expectations caused by demoralizing treatment experiences are likely to occur. In addition, the negative appraisal of pain symptoms (e.g., the assumption that pain is a threat or linked with tissue damage) and catastrophizing or rumination around pain may further contribute to the occurrence of nocebo effects. Thus, the chronic pain population is specifically vulnerable to the occurrence of nocebo effects even without an active or inert treatment being administered.

But what do the outlined high potential for the occurrence of nocebo effects in people with chronic pain and the possibility to induce nocebo effects by simple verbal suggestion imply for the actual treatment of chronic pain patients? In this vulnerable population, a careful focus on expectations, a focus on positive effects of the treatment, and a trusting patient-provider relationship are crucial, keeping in mind a fine-grained and sensitive understanding of the several layers these conditions present with. To avoid nocebo effects in treatment, clinicians should be especially aware of past adverse experiences that their patients might have had in previous treatments (44). Additionally, studies have identified other risk factors for the nocebo effect, such as verbal suggestions of arousal and symptoms, social observation, and baseline symptom expectations (45). Considering that, in many cases, both the patient and the provider have a negatively connoted narrative about chronic pain, an open and transparent communication about their respective understanding of the development, maintenance, and handling of chronic pain appears central. Such communication ensures an individualized treatment plan, which is crucial for the development of a shared understanding and for the creation of a more hopeful narrative of the condition itself (46). One good example is the use of metaphors to explain that pain by itself is a necessary and adaptive bodily function; however, if the system remains in a constant state of alarm, it becomes maladaptive (42, 47). As a second example, in the context of medically unexplained symptoms, it has been shown that psychotherapeutic treatments were most effective when delivered by psychotherapists (48). This finding might be because of psychotherapists focusing on patients' individual expectations, motivations, and perceptions, which may in turn correct patients' inaccurate understandings of their symptoms.
The idea that an inaccurate understanding of chronic pain may increase chronic pain begs the question how can we best correct that inaccurate piece of knowledge? Psychology, hand in hand with other disciplines, such as biology and neurology, can contribute to a more elaborate shared narrative between patient and treatment provider and in turn may lead to the reduction of negative expectations.

In contrast, we will give an example of a psychotherapeutic treatment that has been shown to have limited benefits, and we will discuss whether the observed effects can be related to the occurrence of nocebo effects.

#### THE CASE OF DEBRIEFING FOR TRAUMA SURVIVORS

In 1983, Mitchell introduced Critical Incident Stress Debriefing (CISD) (49) as a crisis intervention for use with "small homogeneous groups of paramedics, fire fighters, and law enforcement officers who were distressed by an exposure to some particularly gruesome event" (p. 2) (50). Initially, CISD was not thought to be a stand-alone treatment, but it soon gained popularity, was applied in different trauma populations (51), and was adopted for use in individual settings (52). Despite numerous adaptations (53–55), the main elements remained the same, that is, the trauma experience is discussed with a focus on distinguishing between facts, cognitions, and emotions. Through the intervention, trauma survivors are meant to learn to judge negative reactions after trauma experience as "normal" reactions (52).

However, despite the initial enthusiasm toward trauma debriefing, several systematic reviews and meta-analyses found no evidence for the superiority of trauma debriefing over control treatments in preventing the occurrence of posttraumatic stress disorder (PTSD) symptoms in the aftermath of trauma experience (52, 56–60). On a closer look, the reviews included a number of studies that reported even an increase in PTSD symptoms after trauma debriefing compared with control treatments (61–63). Mitchell (50) argued that the negative effects of trauma debriefing in several studies can be explained by the debriefing not being implemented as manualized, that is, not within a homogeneous group setting, not with the designated trauma populations, and not with emergency staff but with trauma victims. In contrast, other researchers argued that the negative effects indicate the real danger of debriefing interventions to contribute to symptom deterioration. In this sense, it has been proposed that the negative effects may be caused by a strong pathologizing of the trauma (27), limited time for the trauma processing (64), and the creation of an expectation toward the occurrence of PTSD symptoms (59, 60).

Of the three outlined possible explanations for the failure of trauma debriefing in preventing PTSD symptoms, two can be closely related to nocebo effects. First, the information regarding potentially occurring negative reactions after trauma experience may increase the expectation of the occurrence of negative reactions, which may in turn induce the development of such negative reactions. Second, the focus on observed symptoms after trauma experience might lead to a reevaluation of the observed symptoms in the sense that the severity of the symptoms might be exaggerated, resulting in more negative evaluations of their own symptoms. In particular, persons with a stronger tendency for somatic symptoms might even be prompted toward negative reactions of their body, including emotional states, and in turn perceive and report an increase in negative reactions. The mechanisms would thus be the same as in the case of the administration of placebo pills, which lead to the experience of side effects after debriefing patients about potentially occurring side effects. Rose and colleagues argued along these lines when explaining the disappointing results of trauma debriefing in preventing PTSD symptoms in their meta-analysis (59).

Thus, the previous analysis has demonstrated that at least some of the mechanisms that have been postulated to explain the occurrence of negative outcomes after trauma debriefing are the same as those that are used to explain the occurrence of nocebo effects, suggesting an (unwanted) proximity between nocebo effects and psychotherapy.

### CONCLUSIONS

Based on the discussion of whether psychotherapy and placebos share some unwanted proximity, we set out to examine possible associations between nocebo effects and psychotherapy in the present article.

First, we examined the potential for nocebo effects in patients with chronic primary pain. In this context, we identified relevant nocebo mechanisms that may occur during treatment of chronic pain, including mainly the creation of negative expectations. Thus, we conclude that patients with chronic pain may reflect a population with a particularly high risk for the occurrence of nocebo effects. However, the same arguments may hold true for other patient populations with symptoms that lack a clear etiology (e.g., medically unexplained symptoms or mental disorders, such as depression). We highlight the need for a flexible treatment approach, to address patients with preexisting treatment experiences, their negative expectations and motivations, and their subjective illness and healing narratives. Negative treatment expectations have been demonstrated to be related to negative treatment effects in other domains of health care [e.g., Ref. (65)]. The highly individualized approaches of most psychotherapeutic treatments offer the possibility to address the outlined issues. Thus, psychotherapy may be seen as a means to reduce nocebo effects in the treatment of chronic pain.

Second, we examined whether the observed occurrence of unwanted outcomes after the administration of trauma debriefing may be related to nocebo mechanisms. We conclude that at least some of the mechanisms that are assumed to be the cause of nonimprovement or even deterioration of symptoms after debriefing of trauma survivors are the same as those that underlie nocebo effects, most importantly the creation of expectations regarding the occurrence of PTSD symptoms. Accordingly, just as it has been discussed in the context of other health care settings (22, 66, 67), debriefing of patients regarding possibly occurring symptoms may contribute to nocebo effects in the context of psychotherapy as well.

In terms of recommendations for clinical practice, the most relevant question is, "How can the occurrence of nocebo effects best be avoided within an ethical framework?" In the context of psychotherapeutic treatments, this essentially involves the following principles: first, to speak openly and honestly about the possible occurrence of nocebo effects in the course of psychotherapy; second, to address possible adverse responses to psychotherapeutic treatment; and third, with respect to the importance of the narrative, the choice of words should be carefully considered in treatment settings, taking into account the patient's own background and understanding (i.e., the patient's subjective illness narrative). In recent years, the impact of media presentations of health on individual patients' treatment expectations has gained increasing relevance (66). Therefore, discussing and possibly correcting negative expectations that patients have acquired through media consumption, in relation to the occurrence of nocebo effects, needs to be considered during treatment as well.

With regard to implications for research, the main question may be "How can future studies advance our knowledge of the link between nocebo effects and psychotherapy?" One of the most important issues for psychotherapy outcome research might be that negative outcomes are measured and reported. To date, however, only a minority of psychological trials reported negative outcomes, but most psychotherapists stated that negative effects do occur within psychotherapy on a regular basis (35). Of course, unwanted effects are not necessarily linked to nocebo effects, but the reporting of negative outcomes in psychotherapy research is a prerequisite for a closer examination of the risk of the occurrence of nocebo effects.

To conclude, the issue of nocebo effects, which occur as a consequence of informing patients about the prognosis of their symptoms, including the disclosure of possibly occurring adverse reactions after treatment, is the subject of an ongoing debate [e.g., Refs. (67–70)]. By outlining the possible relations between psychotherapy and nocebo effects, the present article contributes to translating this debate to the field of psychotherapy research.

#### AUTHOR CONTRIBUTIONS

CL, HK, JG, and HG wrote and reviewed the manuscript.

#### FUNDING

CL, PhD, received funding for this project from the Swiss National Science Foundation (SNSF): P400PS\_180730 (Title: Overcoming Classificatory and Methodological Hurdles to Improve Treatment of Chronic Primary Pain: A Network Meta-Analytic Approach).

#### ACKNOWLEDGMENTS

We would like to thank Celine Bergamin, who contributed to the review on debriefing for trauma survivors.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Locher, Koechlin, Gaab and Gerger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*


# The Myth of the Placebo Response

*Wayne B. Jonas\**

*1 Samueli Integrative Health Programs, Alexandria, VA, United States, 2 Georgetown University School of Medicine, Washington, DC, United States, 3 Department of Family Medicine, Uniformed Services University, Bethesda, MD, United States*

The placebo response is a myth. It does not exist in reality, and continuing to name it is hindering the optimal application of science to healing in medicine. On the surface, it is obvious that, when defined as a biological response to an inert pill (like a sugar pill), the idea of a "response" to a placebo is impossible. Inert treatments by definition do not produce responses. So why do we continue to ponder why people get better from taking inert substances and base our acceptance of legitimate treatments on demonstrating that they go beyond that response? The problem arises because we have flawed assumptions about the value that reductionistic science and the demonstration of specific effects have for healing. To support those flawed assumptions, we support the idea of "the placebo response." This causes confusion among patients, clinicians, regulators, and even scientists. Legitimate medical treatments have become defined as those that do more than produce a placebo response. An entire pharmaceutical industry and its regulators attempt to control and profit by proving that small molecules produce a clinical effect greater than the placebo response. Billions of dollars are made when that is proven, often even when the size of the response in the active over the placebo group is minuscule. The fact is that people heal, and that inherent healing capacity is both powerful and influenced by mental, social, and contextual factors that have been embedded in every medical encounter since the idea of treatment began. In this chapter, I argue that our understanding of healing and ability to enhance it will be accelerated if we stop using the term "placebo response" and call it what it is: the meaning response, and its special application in medicine called the healing response.

#### *Edited by:*

*Paul Enck, University of Tübingen, Germany*

#### *Reviewed by:*

*Irving Kirsch, Harvard Medical School, United States Younbyoung Chae, Kyung Hee University, South Korea*

*\*Correspondence: Wayne B. Jonas wayne@drwaynejonas.com*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 04 April 2019 Accepted: 22 July 2019 Published: 16 August 2019*

#### *Citation:*

*Jonas WB (2019) The Myth of the Placebo Response. Front. Psychiatry 10:577. doi: 10.3389/fpsyt.2019.00577*

Keywords: placebo, myth, response, healing, traditional

#### TRADITIONAL HEALING SYSTEMS

For millennia, the primary philosophy behind most healing traditions involved seeking balance and harmony with your spiritual self, social community, and nature (1). Patients and practitioners in these traditional systems adjusted how the patient lived in the society, with nature and with themselves, the latter referring to the spiritual and mental aspects of life. Hippocrates said that the physician's highest task was supporting the patient while nature did the healing: *Vis medicatrix naturae*, literally "the healing power of nature" (2). The Yellow Emperor of China talked about the physician working to keep the patient healthy through balance with nature and lifestyle (3). The ancient Ayurvedic system of medicine involves returning the patient to the unity of wholeness of a human being (called universal consciousness) as the path to induce healing processes (4). In most of these ancient healing traditions, the mind, heart, body, and nature were considered all one, and health came from getting them to work in harmonious interaction. Traditional healing systems from around the world kept their eye on the whole picture of the human person, which was defined as an individual in the context of their social and natural environment.

#### WESTERN BIOMEDICINE

Then, approximately 150 years ago, some things were discovered in the Western hemisphere about the small and particular that radically changed this thinking. The microscope and chemistry were invented, and we began to identify and isolate bugs (infectious agents) and drugs (chemicals) as causes and cures for certain diseases. Manipulating these smaller elements had a dramatic ability to stop death from those causes. These discoveries worked particularly well for infectious disease and trauma, which were the primary causes of immediate death for the millennia before that. So dramatic were these effects that a new Western version of medicine grew up, which rapidly spread and globally supplanted the older healing traditions. After all, who would not want to have their life saved when they were on the verge of death? And so, like cars and cell phones after them, Western medicine became the dominant system throughout the world backed by policy, payment, and delivery. The age of heroic medicine had arrived. Nature was now to be dominated and controlled. The idea of harmony and balance went out the window. The more holistic models from ancient times were swept away or were relegated to the so-called complementary or alternative medicine (CAM) practices. These ancient traditions were called "non-scientific" and delegitimized. Since Western medicine was particularly focused on the physical, no longer were the mind, spirit, or social dimensions of the person relevant for healing. No longer was the healing force of nature important. These concepts, previously foundations across the globe, largely disappeared from the medical encounter (5).

### THE RISE OF CHRONIC DISEASE

Except that disease did not disappear. It only shifted. Our ability to stop death resulted in an aging population and the emergence of chronic diseases as the dominant causes of morbidity and mortality. Diseases such as heart disease, diabetes, depression, dementia, obesity, and cancer now dominate humanity (6). These diseases do not respond well to the Western science of the small and particular. However, by now, we believe so strongly in this science, having seen its dramatic effects in acute disease and death, that we continue to use this model and apply it to chronic illness. Our research is now organized around identifying the specific physical causes of chronic conditions, which has become the main criterion for what is legitimate or illegitimate practice. The science of the small and particular is embedded in regulatory processes for approving and paying for treatments. The medical industry follows these regulations and seeks approval of proprietary small molecules for common chronic diseases. Billions of dollars flow following these approvals. Drugs, as defined by regulatory and patent bodies, dominate medical thinking and practice.

### REDUCTIONISM

However, most of these approaches do not work very well. The evidence is now abundantly clear that, at least for the management of complex chronic diseases, reductionism does not work well and is inferior to whole systems approaches in practice. This can be illustrated in a number of ways. First, the narrow, reductionistic view is the underlying reason the pharmaceutical industry invests up to two billion dollars and takes 12–15 years to get a new drug on the market<sup>1,2</sup>. The vast majority of drugs fail when ultimately tested in large studies compared to placebo treatments. Many of those that are proven and do get on the market don't work very well. Two thirds of the positive research published in the mainstream literature cannot be replicated (7–10). For those that can be replicated, the effect size (that is, the effect of the drug group over the placebo group) is small. In a recent study, researchers at the United States National Heart, Lung, and Blood Institute analyzed the benefit of the medications for heart disease on which it had funded research over the last 30 years. The result was that these drugs added ~8% over the spontaneous or placebo healing rates for those diseases (11).

Even simple proven and effective therapies such as statins for the prevention of heart disease illustrate this dilemma further. For every 100 people who take a statin for the primary prevention of heart disease, only two will avoid a heart attack by doing this, 98 will derive no benefit (but we or they have to pay for the drug), and 5–20 will suffer significant side effects. To get these small benefits, many must tolerate these side effects and costs. Who determines whether this benefit is better than the harms? That is not a scientific question; it is a value question that each patient and their physician must answer for themselves (12). Unfortunately, physicians are armed almost completely with the tools that industry provides them. Rarely is a decision about statin use offered in the full context of the benefits and costs of alternative approaches such as behavior, the ritual of compliance, social and emotional factors such as loneliness, or the impact of patient and cultural beliefs and expectancies.
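The arithmetic behind these figures can be made explicit. The following is a minimal Python sketch, not part of the original analysis, that converts the per-100 figures quoted above into the number needed to treat (NNT) and the number needed to harm (NNH). The function name `number_needed` and the variable names are illustrative assumptions.

```python
def number_needed(events_per_100: float) -> float:
    """Return how many people must be treated for one event to occur,
    given the number of events per 100 people treated."""
    if events_per_100 <= 0:
        raise ValueError("events_per_100 must be positive")
    return 100.0 / events_per_100


# Figures quoted in the text, per 100 people taking a statin
# for primary prevention.
benefit_per_100 = 2           # heart attacks avoided
harm_low, harm_high = 5, 20   # people with significant side effects

nnt = number_needed(benefit_per_100)   # people treated per heart attack avoided
nnh_best = number_needed(harm_low)     # people treated per side effect, best case
nnh_worst = number_needed(harm_high)   # people treated per side effect, worst case

# Side effects incurred for each heart attack avoided.
harms_per_benefit_low = harm_low / benefit_per_100
harms_per_benefit_high = harm_high / benefit_per_100

print(f"NNT: {nnt:.0f}")
print(f"NNH: {nnh_worst:.0f} to {nnh_best:.0f}")
print(f"Side effects per heart attack avoided: "
      f"{harms_per_benefit_low:.1f} to {harms_per_benefit_high:.1f}")
```

Under these figures, 50 people take the drug for one of them to avoid a heart attack, while between 2.5 and 10 people experience significant side effects for each heart attack avoided. Whether that trade is worthwhile remains the value question described above.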

The recent promise of "personalized, precision medicine," the ultimate extension of the reductionistic approach in an attempt to control even more specific molecular targets, is also, so far, largely a disappointment, although hope and hype spring eternal in this field. Precision (targeted) oncology is the most developed of these approaches. There have been some dramatic effects in certain people from hitting these targets with small molecules. Precision oncology has produced dramatic benefits (and major harms) in small populations. However, the promise of these breakthroughs for large populations is, overall, modest and overhyped. Professor Dimitrios Roukos, from the Personalized Cancer Medicine Biobank, Ioannina University School of Medicine in Greece, summarized this as follows: *"… the results of clinical trials testing biomarkers and biologics developed on the basis of conventional single-gene cancer research have demonstrated modest, isolated clinical success. These findings are not surprising given the molecular network complexity and heterogeneity of cancer. In the post-genomic era, next-generation DNA-sequencing technology-based results confirm available evidence that cancer initiation, growth and metastasis are driven by molecular networks rather than just one mutated gene or a single deregulated signaling pathway"* (13). What is needed is not simply a personalized, precision oncology, where the drug is targeted to a unique molecule on a cell, but also a reversed personalized oncology, where the patient is adjusted to enhance the drug response. This requires a more holistic view than the current paradigm of the small and particular provides.

<sup>1</sup> https://www.drugs.com/fda-approval-process.html

<sup>2</sup> https://www.washingtonpost.com/news/wonk/wp/2014/11/18/does-it-really-cost-2-6-billion-to-develop-a-new-drug/?noredirect=on&utm\_term=.87ac8cbe951c

### INVENTION OF THE PLACEBO RESPONSE

Since reductionism has largely failed for chronic illness, yet Western medicine is already heavily invested in it, both in mindset and money, health care had to invent a way of solidifying its legitimacy further. Thus, it invented the "placebo response." Into the term placebo response was dumped all the rest of healing that was not produced by the isolated, physical, and specific treatment. Being seen as not placebo (meaning being specific and physical) became the requirement to be considered valid and real (14). Effects that could not be shown to be due to specific and physical entities were said to be "just placebo" and therefore not real and not valuable for healing. Relegating effects to placebo provided a way to cover up the fundamental flaw in the reductionistic model: that it does not work for healing complex, multi-factorial, chronic disease.

### CLEARING THE PLACEBO MYTH

While the solution to this dilemma is multifaceted, one important step would be to stop pursuing the mythical concept called the placebo response. Several years ago, Professor Dan Moerman and I recommended that we replace the term placebo response with the term "meaning response" (15). The reason for this was to make it more evident that our physiology was responding to the context and rituals that imbued meaning to a treatment rather than to a substance, inert or otherwise. And while the meaning response framework has gained some traction, it too was unsatisfactory for motivating the transformation needed in the medical encounter. While I still believe the term "meaning response" should replace the concept of "placebo response," we should also replace the words "placebo response" with the words "healing response" when referring to the use of meaning in treatment. This would acknowledge that it is the whole person that is in need of medicine, taking into account the underlying mechanisms that produce those responses rather than attributing them to placebo. By abandoning the concept of a placebo response, we could bring into focus how our mind and expectations alter our biology and how the cultural rituals and environmental context of medicine induce maximum healing through meaning, rather than defaulting into debates over whether a treatment effect is "real" or "just placebo" based only on whether it works through a specific theory or a small molecule.

Making this conceptual and linguistic shift would change the entire nature of placebo research for health care. Suddenly, research on the meaning or healing response and its mechanisms would become more valuable for use in practice. Rather than simply using placebo-controlled research to eliminate what is "not real" (a consequence of the placebo myth that has left us with a paucity of proven therapies for chronic disease), research on how the meaning response works opens us up to an abundance of discoveries that can be immediately applied in practice. What is now dismissed as the placebo response could be used as the basis for inducing optimal healing that is personalized to the patient and their culture and context. We would rapidly go from therapeutic nihilism to an abundance of ways to alleviate suffering and treat chronic disease.

### RELEASING PRACTICE FROM THE PLACEBO MYTH

By clearing away the placebo myth, I, as a physician, can use the understanding of the mechanisms of the meaning response to construct multiple paths for healing my patients. I can widen my therapeutic lens. For example, I can now use the power of mindset and belief to heal. I can create social rituals for healing that are specific for a patient and their culture. I can adjust the environment of the patient to optimize healing. I can value and use the doctor–patient relationship again (which has largely lost its place in Western medicine), and I can also use this knowledge to avoid harm, the so-called nocebo response. Destroying the placebo myth returns meaning to medicine, brings hope to the patient, and allows me to address the root causes of recovery. In addition, it could potentially reduce burnout by returning the heart of medicine, relationships, to healthcare. Research on the meaning response and how it can be applied to healing would take us from looking at the effects found when using inert substances as simply curiosities to a new fundamental way for understanding how to optimize therapeutic practice.

Once the myth of the placebo response is removed, I, as a physician, can draw on research on the mechanisms of the meaning response to produce an evidence-based healing response for my patients. For example, I would now have evidence for using the following approaches in my day-to-day practice with any treatment, no matter what its efficacy is. I would try to use more frequent dosing rather than less frequent dosing, up to a limit (16). I would seek to deliver therapies in the most powerful therapeutic settings such as hospitals and clinics rather than at home (17). I would try to match the appearance of a treatment, such as size and color, to the desired effect expected by the patient and their culture (16, 18). I would attend to the style and route of administration of a treatment (17). I would take the time to deliver therapies in a warm and caring way (19) and with confidence in their power to heal (20). I would explore what therapies my patient believes in and try to align and accommodate my treatment to that belief, provided it was safe (21–23). I would make sure I understand the mechanisms of a treatment so that I can believe in the treatment I am delivering (24, 27). I would seek to align all beliefs: those of the patient, the doctor, the family, and the culture (25). I would add a safe and easy to use conditioned stimulus alongside the specific therapy (26, 27). I would use a well-known brand or a new and exciting treatment claimed to have success (28–30). I would let the patient know what to expect (31, 32). I would seek to use an electronic device to deliver and track the treatment when possible (28). I would always incorporate reassurance, relaxation, and suggestion into the treatment (33–35). I would spend the time to listen and understand the patient (19, 36) and, when possible, touch them with empathy and reassurance (15, 37). More recently, the evidence shows that I can simply explain to the patient the likely benefit of any treatment for its potential in healing and recovery (38, 39), and most remarkably, I can do this with any treatment, whether its specific effect has been proven or not.

### RELEASING RESEARCH FROM THE PLACEBO MYTH

Getting rid of the placebo myth also brings a breath of fresh air to biomedical research in general. First, we can alter our research designs to reduce the meaning response in the early phases of clinical testing and thus widen the gap between the effects of meaning and the medicine (40). This would allow us to demonstrate the specific effect of a treatment more easily, with fewer subjects and less expensively. In addition, it would help us build a basis for advancing both the evidence and ethical foundations for using meaning in medicine (41).
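The link between the size of the gap and the number of subjects can be illustrated with the standard normal-approximation formula for comparing two proportions. The sketch below is an illustration under stated assumptions, not a method taken from the cited work: the responder rates are hypothetical, and the function name `n_per_arm` is invented for this example.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control: float, p_treatment: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per arm to detect a difference between
    two responder rates (two-sided test, normal approximation)."""
    gap = p_treatment - p_control
    if gap == 0:
        raise ValueError("responder rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_beta) ** 2 * variance / gap ** 2)


# Hypothetical scenario 1: strong meaning response in both arms,
# so the control arm sits close to the treatment arm (10-point gap).
n_narrow = n_per_arm(p_control=0.40, p_treatment=0.50)

# Hypothetical scenario 2: a design that dampens the meaning response
# and leaves a wider gap between the arms (20-point gap).
n_wide = n_per_arm(p_control=0.25, p_treatment=0.45)

print(f"Per-arm sample size, narrow gap: {n_narrow}")
print(f"Per-arm sample size, wide gap:   {n_wide}")
```

With these hypothetical rates, doubling the gap from 10 to 20 percentage points cuts the required sample to roughly a quarter, because the sample size scales with the inverse square of the gap.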

Freed from the placebo myth, we are no longer bound to an outdated hierarchy of evidence for determining what is valid and valuable. We can now structure our research agenda around what is useful for the patient (42). I call this patient-centered science. Safety comes first. If a treatment is safe with unknown efficacy, we still have the ability to use it in the care of the person for their benefit by optimizing the meaning effect. Recovery becomes more prominent. Rather than finding a molecule that I must give life-long to hold down a specific physiological mechanism deemed to be pathological, I can look for treatments that are stimulatory, inducing a more durable and low risk healing response. For example, rather than adding three drugs onto an antihypertensive regimen (the current stepped care standard), I can approach the patient with diet or exercise or meditation or acupuncture to treat their blood pressure and heal it at its root causes<sup>3</sup>. With this abundance of healing response tools now established as safe and effective, my ability to personalize a treatment regimen becomes more flexible and doable for a patient. If a drug produces side effects or costs too much for an individual patient, I can approach them through lifestyle and diet or through mind–body practices or conditioning or through a variety of traditional and complementary approaches previously shown to be safe (43).

Finally, freed from the myth of the placebo response, our medicine and our science align with the reality of the complex ecological system that is a whole person (44). We now can fit the ecological complexity with complexity science. This has been known for decades as the biopsychosocial model (45). In complexity science, the parts do not explain the whole, and they are not additive. Instead, once the complexity of the parts gets to a certain point, there emerge new properties with new dynamics. Complexity science, the science of the large and the whole, provides an evidence base for treating a patient through multiple methods at the level of mind, body, social, or spirit (46). The translational gap between science and practice is now shortened. No two billion dollars and 15 years required for validity. Finding meaning opens multiple approaches to healing supported by an array of research methods.

### MAKING THE HEALING RESPONSE ROUTINE

Once freed from the placebo response myth, how can we use this newfound evidence from complexity science to heal? **Figure 1** illustrates a four-dimensional model of a person that I use in my practice to routinely enhance healing, based on knowledge from the meaning response that is derived from research using placebo treatments.

<sup>3</sup> https://www.mayoclinic.org/diseases-conditions/high-blood-pressure/in-depth/high-blood-pressure/art-20046974

I do this through a set of questions and assessments that I call the HOPE Note (47). HOPE stands for healing-oriented practices and environments and draws heavily from looking at the placebo arms of controlled clinical trials and laboratory studies that illuminate the mechanisms of the meaning response. In the HOPE Note, I begin by asking the patient what matters to them in their life: why they are living and why they want to have health. This makes finding meaning the central goal of the encounter and the interchange person-centered from the beginning. We then go on to explore the multiple ways in which a healing response can be induced through mind–body practices, or through the social and emotional environment, or through lifestyle, or by altering the physical context in which treatment occurs. Knowledge from research using placebos and unpacking the meaning response infuses those discussions with a solid evidence base and helps the patient optimize and personalize their healing<sup>4</sup>.

<sup>4</sup> http://drwaynejonas.com/resources/hope-note/

Eliminating the myth of placebo will not be easy. Currently, medical care derived from the science of the small and particular provides us with only about 15–20% of the health benefits for populations, yet it gets 80–90% of the money (48). Our inherent healing response as accessed through behavior and the social environment accounts for the other 80%. However, this approach to illness has no business model to drive it forward or make it accessible to everyone. Even more difficult than changing the economic model of healing will be changing our minds about how healing works. A good first step would be to see the placebo response for what it is: a conceptual myth that sustains a broken medical system and covers up what we are really seeking, our inherent healing capacity, now freed by understanding how deeply meaning infuses us all.

#### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and has approved it for publication.

#### REFERENCES

activity of placebo and naproxen in cancer pain. *Clin. Trials Metaanal.* (1994) 29(1):41–7.

expert consensus. *Psychother. Psychosom.* (2018) 87(4):204–10. doi: 10.1159/00490354


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Jonas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebo Economics: A Systematic Review About the Economic Potential of Utilizing the Placebo Effect

*Jens Hamberger<sup>1,2</sup>, Karin Meissner<sup>2,3</sup>, Thilo Hinterberger<sup>1</sup>, Thomas Loew<sup>1</sup> and Katja Weimer<sup>4</sup>\**

*<sup>1</sup> Department of Psychosomatic Medicine, University Clinic Regensburg, Regensburg, Germany, <sup>2</sup> Division of Health Promotion, University of Applied Sciences Coburg, Coburg, Germany, <sup>3</sup> Institute of Medical Psychology, Faculty of Medicine, LMU Munich, Munich, Germany, <sup>4</sup> Department of Psychosomatic Medicine and Psychotherapy, Ulm University Medical Center, Ulm, Germany*

#### *Edited by:*

*Martina De Zwaan, Hannover Medical School, Germany*

#### *Reviewed by:*

*Franziska Labrenz, Essen University Hospital, Germany; Przemysław Bąbel, Jagiellonian University, Poland*

*\*Correspondence: Katja Weimer katja.weimer@uni-ulm.de*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 14 March 2019 Accepted: 13 August 2019 Published: 12 September 2019*

#### *Citation:*

*Hamberger J, Meissner K, Hinterberger T, Loew T and Weimer K (2019) Placebo Economics: A Systematic Review About the Economic Potential of Utilizing the Placebo Effect. Front. Psychiatry 10:653. doi: 10.3389/fpsyt.2019.00653*

Background: Recent research shows that placebo mechanisms can be utilized in ethical and legal ways such as in open-label conditions, when patients know that they receive placebos, and through psychological interventions aiming to optimize patients' expectations. Showing that placebo interventions are also cost-efficient could improve their acceptability.

Objective: To review studies that performed health economic evaluations (HEEs) of intentional placebo interventions and to review studies that intentionally applied placebo interventions and reported outcomes eligible for HEEs.

Methods: Two systematic reviews of the literature were performed. For the first review, we searched MEDLINE using "placebo" and Medical Subject Headings (MeSH) terms associated with HEEs such as "costs," "cost–benefit analyses," and "economics." Studies were eligible if they employed patients, applied placebo interventions, included an appropriate control group, and reported results of cost analyses. For the second review, we searched the Journal of Interdisciplinary Placebo Studies (JIPS) database and MEDLINE using search terms for outcomes eligible for cost–utility analyses, such as "quality of life" or "quality-adjusted life years" ("QALYs"). Risk of bias of all studies found was assessed according to the *Cochrane Handbook*, and a narrative synthesis of the results is provided.

Results: The first search resulted in 1,853 articles, which were screened for eligibility. Only two studies were found in which costs or cost-effectiveness analyses were reported, both with medium to high risk of bias. The second search yielded 164 articles, particularly from the JIPS database, of which 11 studies met our search criteria: in six studies, patients received placebo pills in open-label conditions; three studies investigated effects of patient–physician relationships; and two studies used psychological interventions to optimize treatment expectations, in patients with various diseases and disorders. These studies report outcomes potentially eligible for HEEs if costs of interventions were known. Risk of bias was low to medium, but patients were not blinded to the conditions in most studies.

Conclusions: The state of knowledge about HEEs of placebo interventions is scarce. To gain more visibility and acceptability for placebo interventions, future studies should measure outcomes usable for HEEs and costs of interventions, and HEEs should be performed for existing studies if data are available.

Keywords: placebo effect, placebo response, cost-effectiveness, cost–benefit analysis, health economic evaluation

#### INTRODUCTION

During the last 20 years, placebo research investigated intensively the mechanisms by which placebo effects occur, but their utilization as a treatment option is still in its infancy (1, 2). One of the main reasons for this fact is—or was—that concerns about ethical and legal issues have been raised as the placebo use is often considered to involve deception of patients (3). Recent research, however, shows that placebo mechanisms can be used in ethical and legal ways such as in open-label conditions when patients know that they receive placebo pills (4, 5). Furthermore, a meta-analysis found similar effect sizes for placebos and active treatments (6). Showing that placebo interventions are not only effective but also efficient could further improve their visibility and acceptability, at least in certain circumstances, but little is known about health economic evaluations (HEEs) of placebo interventions (7). HEEs use various methods to analyze the efficiency of interventions either as total or relative costs or in relation to their effects.

Several studies have shown that placebo interventions can improve symptoms of diseases by eliciting the underlying mechanisms, such as influencing treatment expectations or learning of treatment effects through conditioning (1). In open-label placebo studies, patients are openly given placebos and are told that they can improve symptoms through self-healing mechanisms (4, 5). This has been shown, for example, for the treatment of irritable bowel syndrome (IBS) (8), low back pain (9), depression (10), and allergic rhinitis (11). In these studies, significant improvements of symptoms were achieved although patients took no active drugs, in contrast to standard therapies, which offers the potential for reduced treatment costs. Studies using a so-called partial reinforcement schedule (1) showed that patients could be conditioned to drug effects and that 50% of the drug doses could be replaced with placebo pills while the effects of the full drug dose were maintained. This conditioning procedure has been shown to be effective for the substitution of stimulant drugs in attention deficit/hyperactivity disorder (ADHD) in children (12, 13) as well as for the substitution of corticosteroid therapy in psoriasis in adults (14). Furthermore, empathic practitioner–patient interactions have been shown to reduce the duration of the common cold by one whole day (15), which is a considerable economic factor. Although these studies comprised small sample sizes with fewer than 100 patients and short durations of at most 3 weeks, they showed that placebo interventions can be applied successfully to patients. Additionally, a meta-analysis of three-armed trials, comparing differences between active treatment and placebo with differences between placebo and no-treatment groups, found similar effect sizes for placebos and active treatments, particularly for continuous outcomes in 115 studies across different diseases (6).
Despite such promising results, placebo interventions are far from being considered a treatment option, and HEEs of placebo interventions could support further research and acceptability (7).

HEEs are not part of approval procedures for new drugs but are increasingly consulted for health-care decision making because of the limited resources of health-care systems (16). To improve visibility and acceptability of placebo interventions, applying the same standards for testing their efficiency as for conventional drug therapy could be supportive. There are several methods for HEEs aiming to calculate health-care costs of an intervention in total or in relation to its effectiveness (16). The most frequently reported method is the cost–utility analysis (CUA), which measures the effects of an intervention with regard to its utility. To perform CUA, studies should assess the quality of life as outcome measure for the calculation of quality-adjusted life years (QALYs), that is, gained life years without symptoms. The cost-effectiveness analysis (CEA) utilizes clinical outcomes, morbidity, and mortality rather than quality of life measures and compares costs and effectiveness of an intervention with alternative interventions or placebo. For both CUA and CEA, an incremental cost-effectiveness or cost–utility ratio (ICER or ICUR, respectively) can be calculated as the ratio of additional costs divided by additional effectiveness of one intervention over another (ICER = (cost of intervention 1 − cost of intervention 0)/(effect of intervention 1 − effect of intervention 0)). Therefore, they provide information about extra costs per extra unit of the assessed effect or QALY. If intervention 1 is more effective than intervention 0, then a positive ICER indicates that intervention 1 is more expensive and a negative ICER indicates that intervention 1 is less expensive than intervention 0. For decisions in health care, thresholds have been proposed (but also criticized); for example, an ICER of up to £30,000 per QALY for a new drug or treatment is considered cost-effective according to the National Institute for Health and Clinical Excellence (NICE) of Great Britain (17).
Other methods are the cost-minimization analysis and the cost–cost analysis, which both compare costs of interventions when those are equally effective. An overview of different methods and their usage in different countries is presented by Riedel et al. (16). However, their overview shows that there is no established international standard for analyses or for which outcomes should be reported in studies to perform HEEs.
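The ratio of additional costs to additional effectiveness described above can be illustrated with a short calculation. The following is a minimal sketch under stated assumptions: the `Arm` container, the `icer` function, and all cost and effect values are hypothetical and serve only to show the sign convention; none of the numbers come from the reviewed studies.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """Mean cost and mean effect per patient in one study arm (hypothetical container)."""
    cost: float    # e.g., total treatment cost per patient
    effect: float  # e.g., QALYs gained per patient


def icer(intervention: Arm, comparator: Arm) -> float:
    """Incremental cost-effectiveness ratio: additional cost per additional unit of effect."""
    delta_cost = intervention.cost - comparator.cost
    delta_effect = intervention.effect - comparator.effect
    if delta_effect == 0:
        # Equally effective arms: the ratio is undefined and costs are compared directly.
        raise ValueError("ICER is undefined when both arms are equally effective")
    return delta_cost / delta_effect


# Illustrative values only, not taken from any study.
standard = Arm(cost=1000.0, effect=0.5)
placebo_based = Arm(cost=800.0, effect=0.75)

ratio = icer(placebo_based, standard)
print(ratio)  # negative: the more effective arm is also the less expensive one
```

In this hypothetical case the intervention is both more effective and cheaper, so the ratio is negative. When both arms are equally effective, the function raises an error, which corresponds to the situation in which a cost-minimization or cost–cost analysis would be used instead.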

To determine the current state of knowledge about HEEs of placebo interventions, the primary aim of this article is to systematically review the evidence from HEEs of placebo interventions. As this review yielded only two studies, we additionally performed a second systematic search to identify studies using placebo mechanisms that investigated outcomes that could at least be relevant for HEEs. Because standard methods for economic evaluations are lacking and because we aim to provide a comprehensive review of the literature, a broad literature search was performed.

#### MATERIALS AND METHODS

This systematic review was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (18, 19) (**Supplement 1**), except that the research protocol was not registered in advance.

#### Review Process

All literature searches were performed according to previously defined search criteria by two independent reviewers (JH and KW). Where search results differed, they were compared and discussed to reach an agreement. Lists of retrieved articles were transferred from MEDLINE/PubMed to the reference management software EndNote™ (Version X7; Thomson Reuters), and duplicate articles, articles published in any language other than English or German, and letters, editorials, and comments were excluded. We restricted our search to articles published in and after 1995, because the term "placebo effect" [except in randomized controlled trials (RCTs)] as well as the systematic investigation of its underlying mechanisms was seldom reported before the mid-1990s (20), and current methods of HEEs are even younger. Of all remaining articles, titles were screened for eligibility. If the title did not suffice for a decision, abstracts were screened. Literature searches were performed between October and November 2018 and updated before submission on March 8, 2019.

#### Search and Eligibility Criteria

To answer the first question, whether and with which results HEEs of placebo interventions have been performed and reported, MEDLINE/PubMed was screened for "placebo" in addition to search terms suggested by Droste and Dintsios (21). They provided a list of Medical Subject Headings (MeSH) terms related to HEEs of which 53 relevant MeSH terms were selected for our systematic review (**Supplement 2**). Due to the large number of search terms, each search was performed separately, and double entries were excluded in a second step. The following search term was finally used with "xxx" as a placeholder for MeSH terms listed in **Supplement 2**: ("placebos"[MeSH Terms] OR "placebos"[All Fields] OR "placebo"[All Fields]) AND "xxx"[MeSH Terms].
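The per-term search and the subsequent removal of double entries can be sketched as follows. This is an illustrative sketch only: the function names, the three example terms, and the placeholder article identifiers are assumptions, and the actual list of 53 MeSH terms is given in Supplement 2. Only the query template itself is taken from the text above.

```python
from __future__ import annotations

# Fixed placebo clause of the search term reported in the text.
PLACEBO_CLAUSE = (
    '("placebos"[MeSH Terms] OR "placebos"[All Fields] OR "placebo"[All Fields])'
)


def build_query(mesh_term: str) -> str:
    """Fill the "xxx" placeholder of the search term with one MeSH term."""
    return f'{PLACEBO_CLAUSE} AND "{mesh_term}"[MeSH Terms]'


def merge_results(results_per_term: dict[str, list[str]]) -> list[str]:
    """Merge the result lists of all searches and drop double entries (first-seen order)."""
    seen: set[str] = set()
    merged: list[str] = []
    for article_ids in results_per_term.values():
        for article_id in article_ids:
            if article_id not in seen:
                seen.add(article_id)
                merged.append(article_id)
    return merged


# Example terms only (assumed spellings); see Supplement 2 for the terms actually used.
example_terms = ["costs and cost analysis", "cost-benefit analysis", "economics"]
queries = [build_query(term) for term in example_terms]

# Placeholder identifiers standing in for the hits of each separate search.
hits = {
    "costs and cost analysis": ["A1", "A2"],
    "cost-benefit analysis": ["A2", "A3"],
    "economics": ["A1", "A4"],
}
unique_articles = merge_results(hits)
print(len(unique_articles))
```

Running one query per term and merging afterwards mirrors the two-step procedure described above, in which each search was performed separately and double entries were excluded in a second step.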

Titles, and abstracts if necessary, were screened for any evidence about HEEs of placebo effects or placebo responses as the topic of the article, whether in RCTs or placebo studies. As we aimed to reach and provide a broad overview about HEEs of placebo interventions, we predefined only a few eligibility criteria according to the PICOS (population, intervention, comparator, outcomes, study design) approach (18, 19). Studies were considered if they employed patients with any disease or disorder, but studies with healthy volunteers were excluded. Interventions were considered if they aimed to improve any disease or disorder by means of an intentional placebo intervention that was explicitly stated as such by the article's authors or was recognized as such by the reviewers (JH and KW). An appropriate comparator group for placebo effects, such as a no-treatment or waiting list group, must have been included. Results of a HEE must have been reported in the article, for example, total or incremental costs of interventions, ICER or ICUR, or QALY. All kinds of study designs were considered such as randomized and nonrandomized clinical trials.

For the second question, whether there are studies investigating placebo mechanisms reporting outcomes suitable for HEEs, the Journal of Interdisciplinary Placebo Studies database (JIPS; https://jips.online) (20) was screened first. This database was founded by Enck and colleagues and contains 4,174 articles (on February 28, 2019) dealing with the placebo effect and related topics only. Articles included are hand-selected by Paul Enck and Katja Weimer from PubMed on a weekly basis; for a detailed description of the selection process, see Enck et al. (20). Eligibility criteria according to the PICOS approach (18, 19) were as follows: studies involving patients with any disease or disorder (Population), with a planned and intentional placebo intervention (Intervention) compared with an appropriate control group for unspecific effects such as regression to the mean (Comparator), assessing outcome parameters allowing for HEEs (Outcomes), and in which patients were randomized to the interventions (Study design). According to Riedel et al. (16), outcome parameters of studies eligible for HEEs are not well defined. However, quality of life is considered the most important outcome parameter, along with morbidity and mortality. We therefore searched for "quality of life," "QoL," "disability," and common measures of this entity such as "SF-36" ("SF36"), "SF-12" ("SF12"), and "EQ-5D" ("EQ5D") and for "morbidity" and "mortality." Additionally, we searched for "quality-adjusted life years" ("QALY") and "disability-adjusted life years" ("DALY"). The JIPS database was used for the second question, as the first systematic review reported above yielded a large number of search results with the search term "placebo" but with low specificity for intentional placebo interventions, and a second literature search with this term was considered inefficient.
However, to confirm this search, MEDLINE/PubMed was screened for each of "placebo effect," "placebo response," and "placebo treatment" in combination with all of the above-mentioned search terms for outcome parameters (see **Supplement 3** for the full search term), and the results were screened against the above-described PICOS criteria.

#### Data Extraction

The following data of eligible articles were extracted (**Tables 1** and **3**): condition (disease or disorder), applied intervention, control group used, number of patients involved, age and sex of patients, outcome measures, and results (results in **Table 1** only).

#### Quality Assessment

Risk of bias of identified studies was assessed in accordance with the *Cochrane Handbook for Systematic Reviews of Interventions* (22) with regard to the following quality features of studies: random sequence generation (selection bias), allocation concealment (selection bias), blinding of participants and personnel of the study (performance bias), blinding of outcome assessment (detection bias), incomplete outcome data (attrition bias), selective reporting of outcomes (reporting bias), and other bias. These features were evaluated as low risk of bias (+) when criteria were met and sufficiently described, high risk of bias (−) when criteria were not met, or unclear risk of bias (?) when the information provided did not suffice for evaluation. Results of risk of bias assessments are reported in **Tables 2** and **4**.
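The rating scheme described above lends itself to a simple tabular encoding. The sketch below is only one possible representation and is not the authors' procedure: the names `DOMAINS`, `RATING_LABELS`, and `summarize_study` are hypothetical, the example ratings are invented for illustration, and an ASCII hyphen stands in for the minus sign used in the tables.

```python
from __future__ import annotations

from collections import Counter

# The seven features assessed according to the Cochrane risk of bias tool.
DOMAINS = (
    "random sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "blinding of outcome assessment",
    "incomplete outcome data",
    "selective reporting",
    "other bias",
)

# Symbols as used in the tables: + low risk, - high risk, ? unclear risk.
RATING_LABELS = {"+": "low", "-": "high", "?": "unclear"}


def summarize_study(ratings: dict[str, str]) -> Counter:
    """Count how many domains of one study were rated low, high, or unclear."""
    missing = [domain for domain in DOMAINS if domain not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    invalid = {symbol for symbol in ratings.values() if symbol not in RATING_LABELS}
    if invalid:
        raise ValueError(f"unknown rating symbols: {sorted(invalid)}")
    return Counter(RATING_LABELS[ratings[domain]] for domain in DOMAINS)


# Invented ratings for a single hypothetical study.
example_study = {
    "random sequence generation": "+",
    "allocation concealment": "?",
    "blinding of participants and personnel": "-",
    "blinding of outcome assessment": "+",
    "incomplete outcome data": "?",
    "selective reporting": "+",
    "other bias": "?",
}
summary = summarize_study(example_study)
print(summary)
```

Keeping the three symbols separate, rather than collapsing them into a single score, preserves the distinction between criteria that were not met and criteria that were insufficiently described.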

#### RESULTS

#### Studies Reporting HEEs for Placebo Interventions

After eligibility criteria were first screened, titles and abstracts of 1,593 studies were screened for the question whether they report a HEE of a placebo intervention in patients with any disorder or disease (**Figure 1**).

Two articles were identified that met the criteria (**Table 1**), and risk of bias was assessed (**Table 2**).

Gupta et al. (23) themselves describe their intervention, a flavored anesthetic mask, as a placebo intervention and compared it with a non-flavored mask for children undergoing surgery. They report higher total costs for flavored compared with non-flavored masks (56.45 Indian rupee versus 54 Indian rupee) but did not relate these costs to the effects of the masks. Pattamatta et al. (24) investigated whether chewing gum, compared with a placebo dermal patch, 3 h before and after colorectal surgery decreases complications such as postoperative ileus (PI) and anastomotic leakage (AL). Chewing gum was considered a placebo intervention, as the authors of this re-analysis of data did not provide any information about active mechanisms, and the authors of the original article reported that the underlying mechanisms are still elusive (25). Costs for ward stay were lower in the gum chewing group than in the control group, but overall costs of treatment were not different. Calculation of ICERs for PI and AL (INR −2,414 and INR −8,450, respectively) showed superiority for the gum chewing group. Health-related quality of life was assessed but not used to calculate QALYs, as the authors considered it inappropriate because of varying time points for the postoperative assessment.

#### Risk of Bias in Studies Reporting HEEs for Placebo Interventions

Both studies (23, 24) report randomization of patients, but because of insufficient description it is unclear whether selection and other biases could have occurred. Gupta et al. (23) report that patients were blinded to the condition, but it must be assumed that they realized their group assignment when they smelled the flavor of the mask. In the study by Pattamatta et al. (24), patients were not blinded to the condition, as the conditions differed in their form of application (chewing gum versus dermal patch) (**Table 2**).

#### Studies Using Placebo Interventions and Outcomes Eligible for HEEs

Literature research using the JIPS database yielded 11 studies investigating intentional placebo interventions or mechanisms in comparison with control groups (**Figure 2**), and which assessed outcomes eligible for HEEs such as quality of life, morbidity, and mortality (**Table 3**). In six studies, patients received placebo pills in open-label conditions; that is, they knew that they received placebo pills only, in combination with an explanation on how they work and improve symptoms to increase treatment expectations (8, 9, 11, 26–28). In three studies, enhanced and particularly empathic patient–physician relationships were applied to enhance expectations of patients (15, 29, 30). Two studies used psychological interventions developed to optimize expectations concerning treatment outcomes (31, 32). Placebo interventions and mechanisms were applied to adult patients suffering from various diseases and disorders: with gastrointestinal disorders (8, 29, 30), respiratory or allergic diseases (11, 27, 31), cancer-related fatigue (26, 28), common cold (15), chronic low back pain (9), and heart surgery (32). Outcome measures eligible for HEEs were patient-reported general or disease-specific quality of life questionnaires such as different versions of the Short Form Health survey (SF-8, SF-12, and SF-36), specific questionnaires for IBS, gastroesophageal reflux disease (GERD), asthma, fatigue, and a disability questionnaire.

TABLE 1 | Studies reporting health economic evaluations for placebo interventions.

| Study | Condition | Intervention | Control | Outcome measures | Results |
| --- | --- | --- | --- | --- | --- |
| (23) | Children undergoing surgery | Flavored anesthetic mask | Non-flavored mask | Anxiety, compliance, total costs | Anxiety and compliance did not differ; higher overall costs for flavored masks compared with non-flavored |
| (24) | Ileus and anastomotic leakage after colorectal surgery | Gum chewing | Placebo dermal patch | Costs for ward stay, ICER | Positive ICER in favor of gum chewing (lower costs and positive effects) |

*HEEs, health economic evaluations; ICER, incremental cost-effectiveness ratio.*

TABLE 2 | Risk of bias of included studies listed in Table 1.

The MEDLINE search revealed *N* = 853 articles, of which 472 were randomized controlled trials, 269 were not original studies (e.g., reviews, meta-analyses, and letters), 93 were other kinds of studies (e.g., *post hoc* analyses of placebo arms of RCTs without a control condition for other unspecific effects, or studies in which patients were not randomized to groups), and 14 studies did not involve patients. We identified five articles meeting our criteria, all of which had also been found in the JIPS database (8, 9, 27–29).
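The reported counts can be checked for internal consistency with a short tally. This is a sketch under one assumption: the five identified articles are taken to be counted separately from the four other categories. The variable names and category labels are paraphrases introduced here for illustration.

```python
# Counts as reported in the text for the confirmatory MEDLINE search.
reported_total = 853
category_counts = {
    "randomized controlled trials": 472,
    "not original studies": 269,
    "other kinds of studies": 93,
    "studies not involving patients": 14,
}
articles_meeting_criteria = 5

# Assumption: the five eligible articles are not part of the four categories above.
accounted_for = sum(category_counts.values()) + articles_meeting_criteria
print(accounted_for == reported_total)
```

Under this assumption the four categories sum to 848, and together with the five eligible articles they account for all 853 records.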

#### Risk of Bias in Studies Using Placebo Interventions and Outcomes Eligible for HEEs

Risk of bias according to the *Cochrane Handbook* is reported in **Table 4**. Most of the studies report adequate randomization, allocation concealment, and blinding of outcome assessment. Data were incomplete or insufficiently described in five studies, whereas selective reporting of results is assumed to have occurred only seldom. In 10 out of 11 studies, patients in particular, and also practitioners, were not blinded to the assigned condition.

#### DISCUSSION

To provide a comprehensive overview of the current state of analyzed and potential HEEs of placebo effects, we performed two systematic reviews of the literature. The first searched for reported HEEs of placebo effects in studies involving patients with any disease or disorder who were treated with an intentional placebo intervention. We found only two articles matching these criteria, of which one was selected with some uncertainty, as the authors suspected an underlying active mechanism (cephalic vagal activation), and the control group was a placebo intervention, too (24). The latter could control for unspecific effects in both groups, but placebo effects of equal size could occur, resulting in equal overall effects in both groups. However, they found that gum chewing was more effective than a placebo dermal patch in reducing postoperative complications, and gum chewing had a better cost–benefit balance calculated as ICER. The other study (23) reported higher total costs for the placebo intervention compared with the control group, because the control group received treatment as usual (unflavored anesthetic mask), whereas the intervention required additional preparations. Therefore, they chose a cost–cost analysis, calculating and comparing only the costs of both alternatives, but did not relate costs to effectiveness of treatment. Calculating an ICER might have been more favorable for the placebo intervention, as effects on anxiety behavior and compliance were better than in the control group. It should be mentioned that costs of placebo arms of RCTs were occasionally calculated and reported in articles found but were not considered in this review, as they serve only as a control group for a mixture of placebo effects and unspecific effects that are not meant to be used as intentional treatment.
In summary, we found only two studies reporting HEEs for placebo interventions, with medium to high risk of bias and limited analyses of costs and cost–benefit balances, which do not significantly contribute to knowledge about HEEs of placebo interventions.

TABLE 3 | Studies employing placebo interventions and reporting of outcome measures suitable for HEEs.

*HEEs, health economic evaluations; ICER, incremental cost-effectiveness ratio.*

TABLE 4 | Risk of bias of identified studies listed in Table 3.

Given the meager yield of this systematic review, despite a broad search strategy, we decided to perform a second literature search to answer the question of whether there are, at least, studies with patients that have investigated intentional placebo interventions and assessed outcomes that could be eligible for HEEs. This second search yielded 11 studies, which reported measures of quality of life (8, 9, 11, 15, 26–32), allowing the calculation of ICURs or quality-adjusted life years (QALY) for placebo interventions if costs of treatments were known. These studies report a variety of placebo interventions such as open-label placebo pills, placebo acupuncture, educational programs to enhance expectations about the treatment, and expanded empathic visits, in different kinds of patients and disorders. HEEs could be calculated when costs of the applied placebo and control interventions are known and could then be compared with costs and effectiveness of standard treatments. For example, if all costs of an open-label application of placebo pills, including pills, other materials, and working hours of physicians, for the treatment of chronic low back pain (9) were known, they could be compared with total costs of standard treatments such as analgesics. To calculate an ICER, the effects of both treatments, such as an increase in quality of life or a decrease of symptoms, are compared in relation to their costs. Furthermore, the occurrence of side effects and the related costs of their treatment could be taken into account in further HEEs. However, the authors of the placebo studies did not report costs of interventions, as this was not the aim of their studies and articles. Risk of bias varies from low to medium among most studies, but all of them report that patients, and in some cases physicians, were not blinded to the condition.
According to the *Cochrane Handbook* and risk of bias tool (22), this is deemed a performance bias, but the tool is designed to evaluate RCTs in which the placebo group is used to control for placebo responses, including the placebo effect *per se* as well as (other) unspecific effects such as regression to the mean and natural course of symptoms. In contrast, placebo interventions aim to intentionally utilize the placebo effect by increasing patients' expectations. Blinding patients to the manipulation of their expectations is very difficult to achieve and might be unethical, although not blinding patients could lead to better external validity than blinded RCTs, as patients are not blinded to their treatment in daily routine.

#### LIMITATIONS

Some limitations of our systematic literature reviews should be mentioned. In the first review, titles and abstracts were screened carefully for any hints that an intentional placebo intervention was applied. However, we cannot exclude that ineffective interventions were applied that could have been considered as placebo interventions. We relied on the assumption that authors who are aware of applying a placebo intervention use the words "placebo" or "placebos" in the title, abstract, or keywords of their articles. To double-check for additional articles that do not contain "placebo" but used placebo mechanisms, we tried searching for "expectation OR expectancy" and "conditioning" in combination with the words listed in **Supplement 2**. These searches yielded too many inappropriate results, and we therefore did not include them in our literature search. For the second review, we first screened the JIPS database consisting of pre-selected articles about placebo effects and double-checked the results by searching for "placebo effect," "placebo response," and "placebo treatment" in combination with pre-defined search terms for HEEs in MEDLINE/PubMed for any additional results. We thus restricted the search to articles explicitly referring to these effects and did not perform a broad search for "placebo" only. This MEDLINE search yielded only five studies (8, 9, 27–29), which were also found in the JIPS database. These five studies investigated placebo treatments using placebo pills or acupuncture, whereas the additional six studies harnessing psychological interventions were not detected with the search terms "placebo effect," "placebo response," or "placebo treatment." Finally, CEAs could also be performed with patient-reported outcomes (PROs) other than those related to quality of life, for example, changes in any symptoms, or with biological parameters such as changes in inflammatory markers or heart rate variability.
We restricted our search to measures of quality of life because they are most commonly used and recommended for HEEs and allow for comparisons between different kinds of treatments.

#### CONCLUSIONS

The state of knowledge about HEEs of placebo interventions is scarce. To gain more visibility and acceptability for placebo interventions, we recommend that (1) future studies applying placebo interventions to patients should measure outcomes usable for HEEs, such as quality of life, morbidity or mortality (where appropriate), and costs of interventions, and (2) HEEs should be performed for existing studies that applied placebo interventions.

### REFERENCES


### AUTHOR CONTRIBUTIONS

JH and KW contributed to the initial research questions for this systematic review and the search strategy and performed the literature research, screening of articles, data extraction, and quality assessment, and wrote the first draft of the manuscript. KM, TH, and TL contributed to the interpretation of results of literature research. All authors contributed to manuscript revision and read and approved the submitted version.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2019.00653/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Hamberger, Meissner, Hinterberger, Loew and Weimer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Are Individual Learning Experiences More Important Than Heritable Tendencies? A Pilot Twin Study on Placebo Analgesia

*Katja Weimer1,2\*, Elisabeth Hahn3, Nils Mönnikes4, Ann-Kathrin Herr2, Andreas Stengel2,4 and Paul Enck2*

*1 Department of Psychosomatic Medicine and Psychotherapy, Ulm University Medical Center, Ulm, Germany, 2 Department of Psychosomatic Medicine and Psychotherapy, Medical University Hospital Tübingen, Tübingen, Germany, 3 Department of Psychology, Saarland University, Saarbruecken, Germany, 4 Charité Center for Internal Medicine and Dermatology, Department for Psychosomatic Medicine, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany*

Objective: Predicting who will be a placebo responder is a prerequisite for maximizing placebo effects in pain treatment and minimizing them in clinical trials. Initial evidence suggests that genetics could affect placebo effects. However, a classical twin study to estimate the relative contribution of genetic influences compared to common and individual environmental influences in explaining interindividual differences in placebo responsiveness has not yet been performed.

#### *Edited by:*

*Guillaume Gourcerol, Université de Rouen, France*

#### *Reviewed by:*

*Per M. Aslaksen, Arctic University of Norway, Norway Ralf Kuja-Halkola, Karolinska Institutet, Sweden*

> *\*Correspondence: Katja Weimer katja.weimer@uni-ulm.de*

#### *Specialty section:*

*This article was submitted to Child and Adolescent Psychiatry, a section of the journal Frontiers in Psychiatry*

*Received: 28 February 2019 Accepted: 22 August 2019 Published: 18 September 2019*

#### *Citation:*

*Weimer K, Hahn E, Mönnikes N, Herr A-K, Stengel A and Enck P (2019) Are Individual Learning Experiences More Important Than Heritable Tendencies? A Pilot Twin Study on Placebo Analgesia. Front. Psychiatry 10:679. doi: 10.3389/fpsyt.2019.00679*

Methods: In a first explorative twin study, 25 monozygotic (MZ) and 14 dizygotic (DZ) healthy twin pairs (27.5 ± 7.7 years; 73% female) were conditioned to the efficacy of a placebo analgesic ointment with an established heat pain paradigm on their nondominant arm. Placebo analgesia was then tested on their dominant arm. Furthermore, warmth detection thresholds (WDTs) and heat pain thresholds (HPTs) were assessed, and participants filled in questionnaires for the assessment of psychological traits such as depression, anxiety, optimism, pain catastrophizing, and sensitivity to reward and punishment. Their expectations were determined with a visual analog scale.

Results: There was a small but significant placebo analgesic effect in both MZ and DZ twins. Estimates of heritability were moderate for WDT only but negligible for HPT, the conditioning response, and placebo analgesia. Common environment did not explain any variance, and the individual environment explained the largest parts. Therefore, the placebo analgesia response can be seen as influenced by individual learning experiences during the conditioning procedure, whereas other variables assessed were not associated.

Conclusions: Compared to the individual learning experience, genetic influences seem to play a minor role in explaining variation in placebo analgesia in this experimental paradigm. However, our results are restricted to placebo effects through conditioning on pain in healthy volunteers and should be replicated in larger samples and in patients. Furthermore, potential gene–environment interactions should be further investigated.

Keywords: conditioning, expectation, heritability, learning, placebo analgesia, placebo effect, twins

### INTRODUCTION

Placebo effects are part of every medical intervention and should be used to maximize treatment effects in daily routine, but need to be minimized in randomized controlled clinical trials (RCTs) to estimate the "pure" drug effect (1, 2) that should exceed the placebo effect. Among the most challenging questions is the prediction of who will be a placebo responder or nonresponder (3, 4). Placebo effects and responses are influenced by situational factors (5, 6), interact with personal factors (3) and prior experiences (7, 8), and are affected by the environment through explicit social (observational) learning of interventional effects (9, 10) and by an implicit social learning phenomenon called "placebo by proxy" (11, 12). None of these approaches has allowed the precise identification of placebo responders (1, 2).

Besides environmental or situational factors, studies following a molecular genetic approach have provided initial evidence that genetic effects could influence placebo effects (13, 14). However, only a few studies have investigated the association of genetic polymorphisms and placebo analgesia in healthy participants. Pecina and colleagues reported that AA homozygotes compared to G carriers of the µ-opioid receptor polymorphism (OPRM1 A118G) (15), as well as Pro/Pro homozygotes compared to Thr carriers of the fatty acid amide hydrolase (FAAH Pro129Thr) (16), showed higher placebo effects through verbal suggestions on pain induced through hypertonic saline. When placebo effects were induced through conditioning on thermal pain, Yu and colleagues found an association between Met allele carriers of the catechol-O-methyltransferase polymorphism (COMT Val158Met) and placebo analgesia (17). The latter is related to general dopamine release and has also been linked to placebo effects in irritable bowel syndrome (18) and major depression (19). Because these associations span different symptoms and both patients and healthy participants, the polymorphism seems to exert a nonspecific influence on placebo effects in response to the anticipation of rewarding situations. Subsequent studies aimed to replicate these findings with larger samples but did not find an association of the COMT genotype with placebo analgesia by verbal suggestion on thermal pain (20). Further studies combining polymorphisms of the aforementioned genes show more promising but still inconclusive results. Aslaksen and colleagues found a significant placebo analgesic effect through verbal suggestion on thermal pain only in carriers of OPRM1 AA combined with COMT Met/Met and Val/Met alleles (21), whereas Colloca and colleagues found significant placebo analgesia in carriers of other combinations, namely, the combination of OPRM1 AA with FAAH Pro/Pro and the combination of COMT Met/Met with FAAH Pro/Pro, but not for OPRM1 AA with COMT Met/Met (22). Furthermore, they found placebo effects in COMT Met/Val carriers independent of other combinations, and an interaction with the type of placebo induction through verbal suggestion or learning (22). Overall, results seem to be partly inconclusive, but influencing factors such as the type of pain stimuli and placebo procedure have only seldom been considered.

However, the candidate genes and polymorphisms identified so far show rather small effects; they neither allow reliable prediction of placebo responders across clinical conditions and experimental paradigms nor distinguish between genetic and environmental contributions to the placebo effect (23, 24). Here, quantitative behavioral genetic methods such as the classical twin design (CTD) are traditionally used to disentangle and estimate the relative contribution of genetic and environmental influences in explaining interindividual differences in human behavior. By comparing the observable similarities of monozygotic (MZ) and dizygotic (DZ) twins, who share 100% (MZ) and 50% (DZ) of their segregating genes, respectively, the relative importance of genetic influences can be inferred: they are assumed to be important when MZs are twice as similar as DZs. The centerpiece of the CTD is the heritability estimate (H), which describes the proportion of the total variance explained by the genetic variance. The remaining part of the variation can then be attributed to environmental influences from different kinds of sources (e.g., family, individual experiences, situational conditions), typically subdivided into common (leading to similarity between family members) and individual (leading to differences between family members) environmental influences. Although twin studies have been conducted successfully for more than 50 years, they are so far lacking in placebo research to assess the variance that could be explained by genetic, common, and individual environmental components (25–27).

Only a few studies in healthy twins have investigated pain sensitivity and analgesic drug responses. Nielsen et al. (28) investigated pain sensitivity and found little evidence for both genetic and common environmental factors in an experimental study with 53 MZ and 39 DZ twin pairs: Genetic factors could only explain 7% and 3% of the variance in cold pressor and heat pain, respectively, and environmental factors explained only 5% and 8% of variance, respectively. In contrast, Angst et al. (29) employed 81 MZ and 31 DZ healthy twin pairs in an experimental study and found a significant heritability for cold pressor pain tolerance (explaining 49%) and a significant interaction of genetic and environmental effects for heat and cold pressor pain thresholds (explaining 24% and 32%, respectively). After infusion of alfentanil, a µ-opioid agonist, they found significant heritability for the analgesic effect in cold pressor pain thresholds (60%) and a familial effect on cold pressor pain tolerance (30%). Unfortunately, the results of the placebo arm were not reported.

Placebo analgesia, i.e., the pain reduction after the application of an inert treatment, is the best investigated paradigm to study the mechanisms underlying the placebo effect (conditioning, expectation, social learning). This has been tested with different pain stimuli (e.g., heat pain) and in healthy volunteers as well as in pain patients. An established heat pain paradigm was employed to induce a conditioned placebo analgesic effect (7, 30–32). The classical twin study design is an established methodology to differentiate between genetic and environmental factors (25, 27).

Our study combines these two approaches, conventional placebo analgesia stimulation with a heat pain paradigm and a classical twin study design, to explore for the first time the relative influence of genes and the environment on the placebo response in experimental pain in healthy twins. Based on the mixed results reported by previous studies using different experimental designs, we reexamine the question of whether differences in placebo effects actually show a heritable component. Such a component should be expected based on the first law of behavior genetics, which postulates that everything is heritable (33). An equally conceivable assumption, given the strong learning component of analgesia responses, is that environmental learning influences are the primary source of individual differences in placebo responses. Furthermore, our results aim to stimulate further studies with twins to address open questions in the field of heritability and genetic influences on placebo effects.

#### METHODS

#### Participants

A community sample of 40 healthy MZ and DZ twin pairs was recruited through the database of HealthTwiSt GmbH, Germany (34), and by email at the University of Tübingen, Germany. Inclusion criteria were: age between 18 and 60 years, raised together, fluent in German, and participation of both twins in the study. Pairs were excluded when at least one twin had acute or chronic diseases of the skin, pain disorders, disorders of the cardiovascular system, psychiatric disorders, or other acute or chronic conditions or medication intake that affects pain sensitivity or reaction times. Participants were asked to refrain from drinking alcohol or taking medication for at least 24 h before the experiment. Inclusion and exclusion criteria were checked through online questionnaires and by the investigator before the experiment. One twin pair was excluded due to technical problems during testing.

All participants were included after written informed consent only and received monetary rewards for their participation in this study. This study was approved by the Ethical Review Board of the University of Tübingen (project no. 814/2015BO1) and was conducted in accordance with the Declaration of Helsinki.

### Zygosity Assessment

Zygosity was assessed based on questions about previous genetic zygosity tests, intrapair resemblance, and confusion by strangers. This has been shown to reliably distinguish between MZs and DZs (27, 35, 36). Ten MZs and one DZ reported that genetic tests were performed. A zygosity score between 0 (high dissimilarity) and 20 (high resemblance) was calculated and compared to twins' own knowledge or opinion about their zygosity. This score significantly distinguished between MZs and DZs (11.6 ± 1.7 vs. 3.4 ± 3.7, respectively, *t*(76) = 13.44, *p* < .001) and confirmed the twins' own information.

#### Study Design

All participants took part in the study on a single occasion between 11.00 a.m. and 6.30 p.m. They were informed about the study aims as being effects of genetics and implicit learning on pain sensitivity and perception. After written informed consent, inclusion and exclusion criteria were double-checked through a short anamnesis questionnaire by the experimenter. Experiments were performed on both volar forearms, beginning with the nondominant arm (arm 1) followed by the dominant arm (arm 2) of the participant. This order was chosen so that participants could use a computer mouse and press buttons with their dominant hand as usual. On both arms, the warmth detection threshold (WDT), heat pain threshold (HPT), and testing of two ointments, a control and a placebo ointment, were performed. Therefore, three squares of 3 × 3 cm for the positioning of a thermode were painted on the forearm: a black one in the middle of the forearm, and a green and a red one above and below, respectively (**Figure 1**). Distal and proximal positions of the green and red squares were randomized between twin pairs but kept constant within one pair. Participants were conditioned for the effectiveness of an inert ointment application on arm 1, and placebo analgesia was tested on arm 2. Between tests on both arms, participants filled in questionnaires for around 30 min.

All heat stimuli were applied with a thermode (TSA-II, Medoc Ltd., Ramat Yishai, Israel), which can apply temperatures between 0°C and 50°C on a square of 3 × 3 cm. Baseline temperature was set to 32°C for all tests.

### Outcome Measures and Conditioning Procedure

For the assessment of thresholds, the thermode was placed on the middle, black square. The assessment of thresholds was performed according to the quantitative sensory testing (QST) protocol (37). The temperature of the thermode increased by 0.5°C/s until the participant pressed a mouse button when she or he felt an increase of the temperature for the first time. Then the temperature decreased to the baseline automatically with a return rate of 1°C/s. The mean of three assessed temperatures was calculated as WDT. For the assessment of HPT, the temperature of the thermode increased by 1°C/s until the participant pressed a mouse button when the stimulus was perceived as painful for the first time. The temperature decreased to the baseline with a return rate of 10°C/s. The mean of three assessed temperatures was calculated as HPT.

Participants were familiarized with the rating of heat stimuli on a visual analog scale (VAS) ranging from 0 (not painful at all) to 10 (extremely painful) by presenting three heat stimuli equal to the HPT and 1°C above and below HPT, respectively. Afterwards, eight stimuli of 10 s (~1.5 s ramp-up and ~1.5 s ramp-down) ranging between −1°C and +2°C in pseudo-randomized order were applied and rated by the participants. The temperatures corresponding to ratings of 2 and 5 on the VAS were calculated by means of linear regression analyses and were used as conditioning temperature (VAS2) and as test temperature (VAS5). For conditioning on arm 1, an inert ointment (Base Cream DAC, Bombastus-Werke AG, Freital, Germany) was applied to the red square for 5 min and removed, and then eight heat stimuli of 10 s (with ~1.5 s ramp-up and ~1.5 s ramp-down) at the VAS5 temperature were applied to this square and rated by the participant. Afterwards, an inert application of a topical analgesic cream (EMLA cream, AstraZeneca GmbH, Wedel, Germany) was applied to the green square for 5 min and removed, and then eight heat stimuli of 10 s (with ~1.5 s ramp-up and ~1.5 s ramp-down) at the VAS2 temperature were applied to this square and rated by the participant. Conditioning was supported by the information that the first ointment is inert and the second ointment is EMLA, a potent analgesic ointment. Furthermore, during application and ratings, a green or red circle, respectively, was shown on a monitor. Means of the eight ratings as well as the difference between these means were calculated and reported as conditioning response. Our application of EMLA was ineffective, because EMLA has been shown to take effect only after at least 30 to 60 min on the skin (38–41). Using EMLA had the advantage that deception of participants was reduced to a minimum, as they were told honestly that it is an effective analgesic ointment. Placebo testing was performed on arm 2 through application of the inert and the EMLA ointments in the same way as on arm 1, with the difference that on both squares, eight heat stimuli at the VAS5 temperature were applied. Information and colored circles were provided as in the conditioning procedure. Means of the eight ratings as well as the differences between these means were calculated and reported as placebo analgesia.
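The calibration step, in which individual conditioning and test temperatures are derived from calibration ratings by linear regression, can be illustrated with a minimal sketch. It assumes that ratings were regressed on temperature and the fitted line was then inverted; the direction of the regression, the function name `temperature_for_rating`, and all data values are hypothetical and not taken from the study.

```python
import numpy as np

def temperature_for_rating(temps_c, ratings, target_rating):
    """Fit rating = intercept + slope * temperature by least squares and
    return the temperature predicted to produce target_rating."""
    slope, intercept = np.polyfit(temps_c, ratings, deg=1)
    if slope <= 0:
        raise ValueError("ratings do not increase with temperature")
    return (target_rating - intercept) / slope

# Hypothetical calibration run: eight stimuli around an assumed HPT of 45 °C
temps = [44.0, 44.5, 45.0, 45.5, 46.0, 46.5, 47.0, 45.0]
ratings = [1.0, 2.0, 2.5, 3.5, 4.5, 5.0, 6.5, 3.0]

vas2_temp = temperature_for_rating(temps, ratings, 2.0)  # conditioning temperature
vas5_temp = temperature_for_rating(temps, ratings, 5.0)  # test temperature
print(f"VAS2: {vas2_temp:.1f} °C, VAS5: {vas5_temp:.1f} °C")
```

Because the fitted slope is positive, the VAS2 temperature always lies below the VAS5 temperature, which is what allows the lower stimulus to mimic an analgesic effect during conditioning.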

#### Questionnaires

Studies have shown that placebo analgesia could be influenced by individual psychological characteristics (3, 42) such as optimism (43), the extent of depressive or anxious symptoms also in healthy individuals (44), pain catastrophizing, as well as expectations concerning the effectiveness of treatment (42). Furthermore, it has repeatedly been hypothesized that reward sensitivity could affect placebo analgesia (42, 45). To analyze such factors as predictors of placebo analgesia, the following questionnaires were assessed: scales for depression and anxiety of the Patient Health Questionnaire (PHQ) (46), Life Orientation Test—Revised version (LOT-R) (47), Pain Catastrophizing Scale (PCS) (48), and Sensitivity to Punishment and Sensitivity to Reward questionnaire (SPSR) (49).

Expectancy was assessed by the question, "How effectively do you think the treatment will reduce the heat pain?" and rated by participants on a VAS from 0 (no effect) to 10 (strong effect). So that participants would not become suspicious about the study design, expectancy was assessed during each application time but analyzed only for the relevant placebo test (EMLA on arm 2).

#### Statistical Analyses

Phenotypic statistical analyses were performed with IBM SPSS Statistics for Windows, Version 25.0 (IBM Corp., Armonk, NY). Significance level was set at *p* < .05 for all analyses.

Sample size was calculated for the correlation of the main outcome, placebo analgesia, between a twin and his or her co-twin, for which a sample size of n = 67 was sufficient (with r = .3, alpha = .05, power = .80), as calculated with G\*Power Version 3.1.9.2 (50). Normal distribution of variables was assessed with Shapiro–Wilk and Kolmogorov–Smirnov tests and visual inspection of data with normal quantile–quantile plots. Differences between groups were analyzed with Student's t-tests. Conditioning response as well as the placebo analgesic effect were tested with paired t-tests for the rating of the control ointment and the rating of the inert EMLA ointment.
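To give a feel for where a sample size of this magnitude comes from, the following sketch uses the common Fisher z approximation for detecting a correlation. This is not the exact routine implemented in G\*Power, and whether the original calculation was one- or two-sided is not stated; the one-sided default and the function name `n_for_correlation` are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def n_for_correlation(r, alpha=0.05, power=0.80, two_sided=False):
    """Approximate sample size needed to detect a correlation r,
    based on the Fisher z transformation of r."""
    normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = normal.inv_cdf(1 - tail)   # critical value of the test
    z_beta = normal.inv_cdf(power)       # quantile for the desired power
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)

print(n_for_correlation(0.3))                  # one-sided (assumed)
print(n_for_correlation(0.3, two_sided=True))  # two-sided, for comparison
```

Under the one-sided assumption, the approximation lands within a participant or so of the reported n = 67, whereas a two-sided test would require a noticeably larger sample.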

In our sample, pain-related outcome variables, reported in **Table 1**, did not differ between female and male participants. Furthermore, handedness did not affect any of the pain-related outcomes reported in **Table 1** (72 participants were right- and 6 were left-handed). Twin data were arranged according to the order of birth, and outcome variables reported in **Table 1** did not differ, neither between firstborn and second-born twins nor between MZ and DZ twins.

All behavioral genetic models were fitted using the OpenMx package (51). Prior to estimating genetic and environmental influences as well as correlations within twin pairs [assessed by intraclass correlations (ICCs)], all variables were residualized for age, age squared, sex, and interaction effects between age and sex by multiple regression procedures, as the perfect correlation for age and sex in twin pairs can inflate twin similarities (52).
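The residualization step described above can be sketched as an ordinary least-squares regression whose residuals replace the raw scores. This is an illustration only: the function name `residualize`, the 0/1 coding of sex, and the data values are hypothetical, and the original analysis may have been set up differently.

```python
import numpy as np

def residualize(y, age, sex):
    """Regress y on age, age squared, sex, and age x sex (plus intercept)
    and return the residuals."""
    y = np.asarray(y, dtype=float)
    age = np.asarray(age, dtype=float)
    sex = np.asarray(sex, dtype=float)
    design = np.column_stack([np.ones_like(age), age, age**2, sex, age * sex])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

# Hypothetical data for eight participants (four pairs; sex coded 0/1)
age = [22, 22, 31, 31, 45, 45, 27, 27]
sex = [0, 0, 1, 1, 0, 1, 1, 1]
score = [4.8, 5.2, 4.1, 4.5, 5.9, 5.0, 4.4, 4.9]

adjusted = residualize(score, age, sex)
print(np.round(adjusted, 3))
```

The residuals are uncorrelated with the covariates by construction, so any remaining similarity within pairs can no longer be attributed to twins sharing the same age or sex.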

TABLE 1 | Pain-related outcome measures in MZ and DZ twin pairs (reported as mean ± standard deviation) and intraclass correlations (reported as ICC coefficients and 95% CI).

*\*\*p < .01. MZ, monozygotic; DZ, dizygotic; ICC, intraclass correlation coefficient; VAS, rating on a visual analog scale from 0 to 10.*

Behavioral genetic research is based on the simple rationale that genetic influences are relevant for a specific trait when biological relatives are more alike than unrelated individuals. On the other side, family members sharing relevant environmental factors should be more alike than family members and unrelated individuals who do not share this environment. By comparing MZ and DZ twins, who share family environmental influences but differ in their genetic relatedness, these different sources of variation in a given trait, e.g., placebo response, can be distinguished and estimated. To estimate the relative contribution of genetic and environmental influences for individual differences in all relevant factors, we performed univariate genetic modeling decomposing the phenotypic variation into variation due to genetic influences (labeled as A for additive genetic variance) and environmental influences, which are subdivided into common environmental influences (labeled as C) and individual environmental influences (typically labeled as E including measurement error) (so-called ACE model). Based on MZ and DZ resemblances, different expectations about genetic and environmental influences can be formulated: If the within-MZ correlation is greater than the DZ correlation, genetic influences can be assumed. A high correlation within both MZs and DZs indicates common environmental influences (shared between family members), while low correlations within both MZs and DZs, as well as any differences between MZ twins growing up in one family, can be attributed to individual environmental effects and measurement error. Overall, it is important to note that genetic and common environmental influences increase intrapair twin similarity, whereas the individual environment decreases it.
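The reasoning about MZ and DZ correlations can be made concrete with Falconer's classical approximation. This is not the maximum-likelihood ACE fitting used in this study, but it rests on the same logic and is useful as a rough check. The function name `falconer_ace` and the example correlations are hypothetical and are not values from Table 1.

```python
def falconer_ace(r_mz, r_dz):
    """Rough ACE decomposition from twin correlations (Falconer's formulas).

    a2: additive genetic share, c2: common environment share,
    e2: individual environment share (includes measurement error).
    Estimates are not constrained to [0, 1]; negative values indicate
    that the simple ACE assumptions fit the correlations poorly.
    """
    a2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return a2, c2, e2

# Hypothetical pattern: MZ pairs twice as similar as DZ pairs
a2, c2, e2 = falconer_ace(r_mz=0.6, r_dz=0.3)
print(f"a2={a2:.2f}, c2={c2:.2f}, e2={e2:.2f}")

# Hypothetical pattern: low similarity in both groups
print(falconer_ace(r_mz=0.05, r_dz=0.05))
```

When both correlations are low, almost all variance is assigned to the individual environment regardless of how the small remainder is split between genes and common environment.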

A detailed description of the model fitting approach and estimation of heritability can be found elsewhere (53). Due to the limited sample size and hence power considerations, we focused on the results for the full model given that the exclusion of any genetic or environmental effect may result in biased estimates of the remaining factors in the model, even if the removed factor was not significant (54).

Assumptions of this model are that 1) theoretically, MZs share 100% of their segregating genes, while DZs share 50%; 2) both MZs and DZs raised together share 100% of their common environment; and 3) all other effects such as individual environmental influences, individual learning experiences, and measurement errors contribute to differences within twin pairs. Furthermore, the applied genetic model relies on a number of prerequisites (for details, see 55), such as that twins are generalizable to the rest of the population and that genetic and environmental influences are independent from one another.
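These assumptions translate directly into the covariance structure that a univariate ACE model expects for the two twins of a pair. As a standard textbook sketch (the symbols $a^2$, $c^2$, and $e^2$ for the unstandardized variance components are notation introduced here, not taken from the article):

$$
\Sigma_{\mathrm{MZ}} =
\begin{pmatrix}
a^2 + c^2 + e^2 & a^2 + c^2 \\
a^2 + c^2 & a^2 + c^2 + e^2
\end{pmatrix},
\qquad
\Sigma_{\mathrm{DZ}} =
\begin{pmatrix}
a^2 + c^2 + e^2 & \tfrac{1}{2}a^2 + c^2 \\
\tfrac{1}{2}a^2 + c^2 & a^2 + c^2 + e^2
\end{pmatrix}
$$

Assumption 1 appears as the coefficients $1$ and $\tfrac{1}{2}$ on $a^2$ in the off-diagonal elements, assumption 2 as the full $c^2$ term in both covariances, and assumption 3 as $e^2$ contributing only to the variances on the diagonal. The standardized heritability is then $a^2 / (a^2 + c^2 + e^2)$.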

Further predictors of placebo analgesia, such as the conditioning response (regarded as the individual learning experience), the co-twins' placebo analgesia (regarded as an estimate of aggregated familial effects), pain sensitivity of test arm (HPT on arm 2), expectancy, and psychological variables, were analyzed with Pearson's correlations, and *p* values are reported. Due to the exploratory nature of this study and as all predictors were chosen based on previous results, unadjusted *p* values are reported, alongside results with *p* values adjusted for multiple testing according to Benjamini and Hochberg [false discovery rate (FDR)] (56). We planned to include significant predictors in a linear regression analysis to account for multiple predictors at the same time, but as the conditioning response was the only significant predictor, regression analysis was unnecessary.
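The Benjamini and Hochberg adjustment mentioned above can be sketched in a few lines. The function name `benjamini_hochberg` and the *p* values in the example are hypothetical; they are not the values from this study, and the number of tests entering the original adjustment is not stated here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value so that each adjusted
    # value is the minimum of p * m / rank over all ranks at or above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical unadjusted p values from four correlation tests
raw = [0.01, 0.04, 0.03, 0.20]
print([round(p, 4) for p in benjamini_hochberg(raw)])
```

A predictor that is significant at the unadjusted level can lose significance after adjustment, which is the pattern reported for the correlation between placebo analgesia and the conditioning response.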

### RESULTS

#### Study Population and Outcome Measures

Of 39 twin pairs, 25 were MZ (19 female, 6 male) and 14 were DZ (7 female, 2 male, and 5 opposite sex). MZs were 28.3 ± 8.2 years old, and DZs were 25.9 ± 6.7 years old (*t*(37) = 0.93, *p* = .36).

Calculated test temperatures according to a VAS of 5 were 45.8 ± 2.1°C for MZ and 46.1 ± 2.3°C for DZ and did not differ between MZ and DZ (*t*(76) = −0.59, *p* = .56). Calculated test temperatures according to a VAS of 2 were 43.4 ± 2.3°C for MZ and 43.6 ± 2.5°C for DZ and did not differ between MZ and DZ (*t*(76) = −0.40, *p* = .69).

#### Warmth and Pain Sensitivity

WDT correlated significantly within MZ twin pairs on both arms, but not within DZ twin pairs. HPT was almost uncorrelated within MZ pairs on both arms; within DZ pairs, there was a low correlation on arm 1 but no correlation on arm 2 (**Table 1**).

#### Conditioning Response and Placebo Analgesia

Among all participants, there was a significant conditioning response, with a mean pain reduction on the VAS from 4.9 ± 1.4 to 2.9 ± 1.3 (*t*(77) = 12.38, *p* < .001, 20% of VAS) on arm 1, and a significant placebo analgesic effect, with a mean pain reduction from 5.1 ± 1.6 to 4.6 ± 1.6 (*t*(77) = 5.25, *p* < .001, 5% of VAS) on arm 2. Of all participants, 68% reported a pain reduction, whereas 32% reported no difference or an increase in pain on arm 2. Furthermore, both effects were significant within MZ (*t*(49) = 11.64, *p* < .001 and *t*(49) = 4.04, *p* < .001, respectively) and within DZ twins (*t*(27) = 6.31, *p* < .001 and *t*(27) = 3.39, *p* = .002, respectively) when analyzed separately (**Table 1**).

#### Genetic, Common, and Individual Environmental Contributions to Pain-Related Outcomes and Placebo Analgesia

Twin resemblances (reported as ICCs) and their respective confidence intervals are shown in **Table 1**. Except for WDT, the pattern of ICCs between MZ and DZ twin pairs did not suggest genetic influences to be an important source of variation. Accordingly, the results of behavioral genetic model fitting (shown in **Table 2**) showed that estimates of heritability were extremely low or negligible. For WDT, the ACE model yielded heritability estimates of 34% (arm 1) and 38% (arm 2), with the remaining variance explained by individual environmental influences (66% arm 1 and 62% arm 2). For all other traits, individual environmental influences were the major source of variation, explaining between 85% and 100% of the variation.

#### Prediction of Placebo Analgesia

TABLE 2 | Standardized estimates of heritability (h²), common (c²) and individual environmental (e²) effects on pain-related outcomes, conditioning response, and placebo analgesia.

*\*p < .05. VAS, rating on a visual analog scale from 0 to 10.*

To further explore influences on the estimated high individual environmental effect on placebo analgesia, predictors were analyzed (**Table 3**). Placebo analgesia significantly correlated positively with the conditioning response only (*r* = .265, *p* = .019) (**Figure 2**) but not with any of the other predictors. The conditioning response itself was significantly associated with pain sensitivity (*r* = −.239, *p* = .035), the test temperature used (*r* = −.493, *p* < .001), pain catastrophizing (*r* = .229, *p* = .043), and expectancy (*r* = −.249, *p* = .028).

Placebo analgesia also significantly correlated negatively with the rating of the control ointment and positively with the inert EMLA ointment, as placebo analgesia was calculated as the difference between them. The ratings of the ointments on the test arm (arm 2) were significantly negatively correlated with the conditioning response: the better the conditioning response (more negative), the higher the ratings on the test arm (**Table 3**).

When *p* values were adjusted for multiple testing, there was no significant correlation between placebo analgesia and the predictors, but the conditioning response was still significantly associated with the ratings of the control and EMLA ointments (*p* < .001 and *p* = .028, respectively) and with the test temperature used (*p* < .001).

#### DISCUSSION

To the best of our knowledge, this is the first study in placebo research to use MZ and DZ twins and to estimate the variances explained by heritability, common environmental, and individual learning components of placebo analgesia. For this purpose, we used an established conditioning paradigm with heat pain stimulation and inert ointment applications to induce placebo analgesia. Furthermore, we examined the role of these components (genetics, common and individual environment) in heat pain-related measures such as WDT, HPT, temperature ratings, and the conditioning response, and their association with placebo analgesia. We explored the effects of psychological traits on placebo analgesia. Finally, this pilot study highlights open questions in the field of heritability and genetic influences, which should be further investigated.

*HPT, heat pain threshold; PHQ, Patient Health Questionnaire; LOT, Life Orientation Test; PCS, Pain Catastrophizing Scale; SPSR, Sensitivity to Punishment and Sensitivity to Reward questionnaire.*

WDT as well as HPT were assessed according to the quantitative sensory testing protocol and lie within the reference values reported by Rolke and colleagues (37). With the conditioning paradigm used, participants reported a significant pain reduction of 5% on the VAS when the placebo ointment was applied compared to the control ointment (on arm 2), and 68% of participants reported a reduction of pain. Reported placebo analgesia is highly variable between published studies; for example, Eippert et al. found a pain reduction of 23% (30), and Wager et al. detected 22% (31), whereas Wrobel et al. found placebo effects of around 4% in adults and 7% in children (at least according to the figure presented, as no data were mentioned) (32). The latter had the most similar study design to our study. Accordingly, we found comparable placebo analgesic effects. The placebo responder rate of 68% is comparable to the rate of 72% reported by Wager et al. (31). Differences in placebo analgesia could be due to differences in study designs, e.g., how many conditioning trials were performed, whether conditioning and placebo testing were performed on the same day, and test temperatures.

Our study results show poor to fair (57) correlations within MZ twin pairs for WDT only, whereas correlations in HPT, ratings of ointments, conditioning responses, and placebo analgesia were even lower and not significant in either MZ or DZ twins. The pattern of low intrapair correlations in both MZ and DZ twins indicates a low influence of heritability and of common environmental components, both of which are supposed to increase similarity between twins, and suggests that the individual, nonshared environment may play a major role. The latter contributes to the dissimilarity of twins. Estimates of heritability (h²) and of common (c²) and individual (e²) environmental effects confirm this pattern: moderate heritability was found for WDT on both arms only, whereas heritability of ratings of heat pain stimuli after ointment application varied between ointments and arms but was very low. Regarding the conditioning response and our main outcome, placebo analgesia, individual environmental influences explained 100% of the variation. To further investigate individual factors influencing placebo analgesia, questionnaires assessing traits that were previously found to affect placebo or nocebo effects (3, 42) were collected. In this study, placebo analgesia was correlated with ratings of ointments (unsurprisingly, as it is calculated from them) and with the conditioning procedure, which was the only significant predictor. The conditioning procedure in turn was correlated with HPT as a measure of pain sensitivity, test temperature, and pain catastrophizing.
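As a rough illustration of how intrapair correlations translate into the three variance components, the sketch below uses Falconer's classical approximation. This is an assumption made for illustration only: the text here does not state which estimation method produced the reported h², c², and e² values, and the function name and the correlations passed in are hypothetical, not study data.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Approximate ACE variance components from twin intrapair correlations.

    Falconer's approximation (not necessarily the method used in the study):
        h2 = 2 * (r_MZ - r_DZ)   additive genetic share
        c2 = 2 * r_DZ - r_MZ     common (shared) environment share
        e2 = 1 - r_MZ            individual (nonshared) environment share
    """
    def clip(x: float) -> float:
        # Raw estimates can leave [0, 1] through sampling noise; bound them.
        return max(0.0, min(1.0, x))

    return {
        "h2": clip(2.0 * (r_mz - r_dz)),
        "c2": clip(2.0 * r_dz - r_mz),
        "e2": clip(1.0 - r_mz),
    }


# Hypothetical correlations, chosen only to show the two patterns in the text.
moderate = falconer_ace(r_mz=0.4, r_dz=0.2)        # MZ pairs more alike than DZ pairs
no_resemblance = falconer_ace(r_mz=0.0, r_dz=0.0)  # no intrapair similarity at all
```

When neither MZ nor DZ pairs resemble each other, the approximation assigns all variance to e², which is the pattern described above for the conditioning response and placebo analgesia.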

Results of our study show that genetics may play a role in WDT, but the individual environment plays a more important role in placebo analgesia than genetics or the common environment of twins. The genetic influence in WDT could be explained by a stronger involvement of physiology than cognitive and emotional appraisal, as the early detection of warmth implies low danger for tissue damage. It is well known that the perception of clinical and experimental pain is not only determined by physiology—*via* neuronally mediated nociception—which may be under genetic control, but is also influenced by cognitive and affective appraisals. The latter are subjective evaluations of pain signaling, which are influenced by learning from previously experienced situations (58, 59). In contrast to WDT, appraisal and learning mechanisms become more important with stimuli above the pain threshold, as for the induction of placebo analgesia. Such individual learning experiences have already been shown to play an important role in placebo analgesia in other experimental studies (1, 7, 30, 32) as well as in clinical analgesic trials (60).

The individual learning experience as induced by conditioning was in turn affected by other factors such as HPT, test temperature, and pain catastrophizing. HPT was also shown to be mainly influenced by individual environment experiences, and the test temperature was equal between twins. In another experimental study, heritability of pain catastrophizing has been estimated as 37% and individual environment as 63%, and has been shown to be directly related to experimental pain with a cold pressor task (61). Hence, the effectiveness of the conditioning procedure itself is affected by factors that are more attributable to individual environmental experiences than to genetic influences.

Twin studies are mainly performed to investigate and estimate the variance explained by heritability in diseases or symptoms, and the shared or common environment experienced by the twins within their family is considered to contribute to further similarity within twins, but the nonshared or individual environment component is considered a "residual term" (62), as it should contribute to dissimilarity. Turkheimer and Waldron (62) further elucidated the individual environment component and distinguished between objective and effective environment: even if the experienced objective environment can be the same, the effects on twins could be different. In our study all participants underwent the same conditioning procedure (objectively common), but the conditioning procedure was variably effective, and they responded in different ways to the placebo testing (effectively individual). This indicates interactional effects of genes by environment and complex interactions between common and individual environmental effects, e.g., how prior experiences shape subsequent experiences, which should be further investigated.

Finally, some limitations of our study should be mentioned and discussed. First, we did not assess zygosity through genetic testing, but relied on twins' own information about genetic testing and questions about twin resemblance and dissimilarity. This procedure showed high consistency with genetic testing (27, 35, 36), but of course, it is not perfect. Second, we included male and female same-sex as well as opposite-sex twin pairs in our analyses, as female and male participants did not differ in pain-related outcomes. In contrast to our data, Rolke et al. reported significant sex differences in HPT but not in WDT (37), and sex differences in placebo analgesia through verbal suggestion were reported occasionally (5, 63). Therefore, sex differences should be further examined in subsequent studies with larger samples. Third, we report *p* values unadjusted for multiple testing for two reasons: 1) all predictors were chosen based on previous results showing their association with placebo effects, and 2) we aim to stimulate further studies and assume that unadjusted *p* values are more helpful for this purpose. As *p* value adjustments are influenced by the number of tests performed as well as their significance levels, adjusted *p* values could be misleading for subsequent study design decisions about the inclusion of predictors. Fourth, the participants were blinded to the reduced temperature during the conditioning procedure, whereas our experimenters were not. Finally, in this experimental study, placebo analgesia was induced through conditioning with a well-established experimental paradigm in healthy volunteers to estimate the variance explained by heritability for the first time. As with experimental studies in general, results cannot be transferred to other situations without further research. Results should therefore be replicated in larger samples and with regard to other known placebo mechanisms, such as verbal suggestions only and social learning, as well as with other experimental pain models and other paradigms. Additionally, subsequent studies should estimate the variance in placebo effects explained by heritability in clinical samples, such as pain patients but also patients with other disorders.

In summary, we showed that heritability may play a minor role in placebo analgesia compared to the individual learning experience. However, interactions of genes and environment can still be a source of dissimilarity between twins; the search for candidate genes or polymorphisms remains important on the way to utilizing placebo effects; and future studies should combine twin studies and genetic analyses. Furthermore, our results are restricted to placebo effects through conditioning on pain in healthy volunteers and should be replicated with regard to other mechanisms and symptoms as well as in patients.

### DATA AVAILABILITY

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

#### ETHICS STATEMENT

All participants were included after written informed consent only and received monetary rewards for their participation in this study. This study was approved by the Ethical Review Board of the University of Tübingen (project No. 814/2015BO1) and was conducted in accordance with the Declaration of Helsinki.

### AUTHOR CONTRIBUTIONS

KW and PE contributed the conception and design of the study. NM and A-KH performed the study and organized the database. EH, NM, A-KH, AS, and KW performed the statistical analysis and contributed to the interpretation of data. KW wrote the first draft of the manuscript. EH wrote sections of the manuscript. All authors contributed to manuscript revision and read and approved the submitted version.

### FUNDING

This work was supported by the German Research Foundation for KW (Deutsche Forschungsgemeinschaft, DFG, WE5658/2-1), and we acknowledge support by Deutsche Forschungsgemeinschaft and the Open Access Publishing Fund of the University of Tübingen.

#### ACKNOWLEDGMENTS

We thank Peter Martus, PhD, and Manu Sharma, PhD, Institute for Clinical Epidemiology and Applied Biostatistics, University of Tuebingen, for statistical advice, and we acknowledge support from the TwinHealth initiative, Medical University Hospital, Tuebingen.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Weimer, Hahn, Mönnikes, Herr, Stengel and Enck. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Neuroimaging Studies of Antidepressant Placebo Effects: Challenges and Opportunities

*Vanessa Brown and Marta Peciña\**

*Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, United States*

Over the last two decades, neuroscientists have used antidepressant placebo probes to examine the biological mechanisms implicated in expectancies of mood improvement. However, findings from these studies have yet to yield a model-based theory that would explain the mechanisms through which antidepressant expectancies evolve to induce persistent mood changes. Compared to other fields, the development of experimental models of antidepressant placebo effects faces significant challenges, such as the delayed mechanism of action of conventional antidepressants and the complex internal dynamics of mood. Still, recent neuroimaging studies of antidepressant placebo effects have shown remarkable similarities to those observed in other disciplines (e.g., placebo analgesia), such as placebo-induced increases in μ-opioid signaling and blood-oxygen-level dependent (BOLD) responses in areas involved in cognitive control, the representation of expected values, and reward and emotional processing. This review will summarize these findings and the challenges and opportunities that arise from applying methodologies used in the field of placebo analgesia to the field of antidepressant placebo effects.

#### *Edited by:*

*Luana Colloca, University of Maryland, Baltimore, United States*

#### *Reviewed by:*

*Daniel Keeser, Ludwig Maximilian University of Munich, Germany Jon-Kar Zubieta, University of Michigan, United States*

*\*Correspondence: Marta Peciña pecinam@upmc.edu*

#### *Specialty section:*

*This article was submitted to Neuroimaging and Stimulation, a section of the journal Frontiers in Psychiatry*

*Received: 11 February 2019 Accepted: 19 August 2019 Published: 24 September 2019*

#### *Citation:*

*Brown V and Peciña M (2019) Neuroimaging Studies of Antidepressant Placebo Effects: Challenges and Opportunities. Front. Psychiatry 10:669. doi: 10.3389/fpsyt.2019.00669*

Keywords: antidepressants, placebo, neuroimaging, computational psychiatry, depression

## INTRODUCTION

Antidepressant placebo effects — averaging 31–45%, compared to ~50% response rates to conventional antidepressant medication — pose significant challenges for drug development (1, 2), a process progressively more time-consuming (currently 13 years on average) and expensive (\$800 million to \$3 billion per new agent) compared to medications for non-central nervous system (CNS) indications (3). Despite innovative clinical trial designs (4) and statistical methods (5, 6) aimed at controlling for this source of noise, the neurobiological mechanisms underlying antidepressant placebo effects are unknown. However, growing evidence suggests that placebos are not just control conditions in clinical trials and that expectations and learning mechanisms associated with their administration activate neurobiological substrates to produce physiological and clinical changes (7).

Functional neuroimaging studies, stemming primarily from the area of placebo analgesia, have rapidly advanced our knowledge of the mechanisms underlying placebo effects in pain using sophisticated experimental approaches (8). However, similar progress has not yet taken place in the field of psychiatry. In depression, the delayed mechanism of action of antidepressants (9) makes it hard to induce expectancies of fast-acting antidepressant effects. Furthermore, changes in mood states have long temporal dynamics (10), compared to brief and reliable pain manipulations. For these reasons, most experimental studies of antidepressant placebo effects have taken place in the context of antidepressant clinical trials, far from laboratory settings. Some of these difficulties may explain the scarcity of scientific evidence that followed the first neuroimaging studies on antidepressant placebo effects in 2002 (11, 12), compared to hundreds of studies (13–17) that followed the first neuroimaging study on placebo analgesia published the same year (18). This review will cover some of the methodological approaches used within the pain field and describe some of the challenges encountered by the field of antidepressant placebo effects and the potential opportunities that arise from the fields of neuroimaging and computational neuroscience currently used by other disciplines.

### THEORIES OF THE PLACEBO EFFECTS

Classical theories of the placebo effect, informed predominantly by placebo analgesia experiments, posit that placebo responses are explained by expectancy and conditioning mechanisms (19). While the former understands placebo effects as a product of expectations (e.g., "verbal instructions"), the latter understands them as conditioned responses (CR) through the pairing of a neutral stimulus (e.g., the placebo pill) with an unconditioned stimulus (US, e.g., the active drug). More recently, computational theories of placebo analgesia have suggested that placebo effects can be explained by a predictive coding framework, where the brain has a hierarchical, internally generated model of the world that is compared against incoming sensory stimuli (20). According to predictive coding theories, experiencing a sensation like pain results from bottom-up sensory signals as well as top-down expectancies about pain. The mismatch between these bottom-up and top-down signals is used to refine future expectancies in order to better predict future sensory input. This computational framework suggests that expectancies about pain serve as *priors* on experiences, whereas sensory input forms the *likelihood*. Very strong expectancies are represented by *priors* with low variance, which in Bayesian updating means that incoming information (such as sensory signals) has little effect; the opposite is true of weak or uncertain expectancies. Therefore, strong expectancy *priors* about the effect of a placebo will reduce the amount of learning that occurs from experience. In an experimental test of this model, Grahl et al. (21) fit a Bayesian updating model to two groups of participants who received a placebo treatment with expectations of analgesia. In both groups, the thermal pain delivered during putative 'treatment' trials was lower than the pain delivered during control trials; however, for one group, the lower pain level was always constant, whereas for the other group it was variable. After participants had learned to associate the 'treatment' with lower pain, their pain levels were measured during a test phase where equal levels of pain were either paired or not paired with the 'treatment' cues. According to theory, participants receiving variable levels of pain while learning about the effects of the placebo analgesia should have a wider prior during the test phase and be more influenced by sensory pain signals, that is, by the likelihood. Accordingly, placebo effects correlated positively with the *precision* of *prior* expectations, and this *precision* was mapped onto the periaqueductal gray (PAG) and the rostral ventromedial medulla. This study showed that pain perception results from the integration of expectancies, in the form of *priors*, and sensory information, in the form of *likelihoods*, and that the relative variances of these distributions affect placebo learning at behavioral and neural levels.
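The role of prior precision can be made concrete with a minimal sketch of Gaussian prior-likelihood integration. This is a textbook conjugate normal-normal update written for illustration, not the model fitted by Grahl et al. (21); the function name, the 0-100 pain scale, and all numbers are hypothetical.

```python
def integrate_expectation(prior_mean, prior_var, sensory_mean, sensory_var):
    """Combine a Gaussian expectancy prior with a Gaussian sensory likelihood.

    Precision is the inverse of variance; the posterior mean is the
    precision-weighted average of the expected and the sensed pain.
    """
    prior_precision = 1.0 / prior_var
    sensory_precision = 1.0 / sensory_var
    posterior_var = 1.0 / (prior_precision + sensory_precision)
    posterior_mean = posterior_var * (
        prior_precision * prior_mean + sensory_precision * sensory_mean
    )
    return posterior_mean, posterior_var


# Hypothetical values: expected pain 30, sensed pain 60, sensory variance 100.
# A precise prior (variance 25) stands in for constant pain during learning,
# a vague prior (variance 400) for variable pain during learning.
precise_mean, _ = integrate_expectation(30.0, 25.0, 60.0, 100.0)
vague_mean, _ = integrate_expectation(30.0, 400.0, 60.0, 100.0)
```

Under these assumed numbers the precise prior pulls perceived pain to about 36, whereas the vague prior leaves it near 54, closer to the sensory input. This mirrors the reported positive relation between prior precision and the size of the placebo effect.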

Alternative computational accounts have been considered. For example, current evidence suggests that placebo effects can be explained by models of reinforcement learning (RL). These models, and in particular variants of the Rescorla-Wagner model (22), propose that individuals update their expectancies as new sensory evidence (e.g., pain) is accumulated, by incorporating a *prediction error* (PE), which signals the mismatch between what is expected (*expected value*) and what is perceived (*the reward*). This PE is then scaled by the *learning rate*, a parameter controlling the speed of updating from new sensory evidence, and added to the expected value of the next experience. In standard RL, expectations not confirmed by experience are extinguished. However, emerging evidence from placebo analgesia experiments suggests that placebo analgesia arises from mechanisms implicated in *self-reinforcing expectancies*, such as *confirmation biases*, where expectancies are selectively reinforced by predictive cues (e.g., the placebo) only when new experience confirms prior expectations, and new evidence is discounted otherwise (16). Alternatively, others have suggested that persistent expectancies result from *impaired extinction learning* caused by prefrontal downregulation of reward prediction errors (RPEs) (23).
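The contrast between standard extinction and self-reinforcing expectancies can be sketched with a Rescorla-Wagner update that uses two learning rates. Treating non-negative prediction errors as "confirming" an expectation of relief is a simplifying assumption made here for illustration, as are the function names and parameter values; this is not the specific model of reference (16).

```python
def update_expectancy(expected, outcome, alpha_confirm, alpha_disconfirm):
    """One Rescorla-Wagner step with separate learning rates.

    Assumption: outcomes at least as good as expected count as confirming
    and use alpha_confirm; worse outcomes use alpha_disconfirm.
    Equal rates recover the standard model.
    """
    prediction_error = outcome - expected
    alpha = alpha_confirm if prediction_error >= 0 else alpha_disconfirm
    return expected + alpha * prediction_error


def run_trials(outcomes, expected0, alpha_confirm, alpha_disconfirm):
    """Return the expectancy left after a sequence of outcomes."""
    expected = expected0
    for outcome in outcomes:
        expected = update_expectancy(
            expected, outcome, alpha_confirm, alpha_disconfirm
        )
    return expected


# Hypothetical scenario: strong initial expectancy of relief (0.8),
# followed by ten trials on which no relief is experienced (0.0).
no_relief = [0.0] * 10
standard = run_trials(no_relief, 0.8, alpha_confirm=0.3, alpha_disconfirm=0.3)
biased = run_trials(no_relief, 0.8, alpha_confirm=0.3, alpha_disconfirm=0.05)
```

With equal learning rates the expectancy is almost fully extinguished after ten disconfirming trials, whereas discounting disconfirming evidence leaves more than half of the initial expectancy intact.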

These different theoretical frameworks have been embedded in many experimental designs of placebo analgesia since its early stages, leading to substantial progress in identifying the cognitive, neural, and molecular bases of placebo analgesia. While it remains largely unknown whether similar conceptual frameworks can be applied to the formation of placebo responses across disorders, these experimental approaches have the potential to yield new insights into antidepressant placebo effects.

### NEUROIMAGING APPROACHES TO ANTIDEPRESSANT PLACEBO EFFECTS: LEARNING FROM THE FIELD OF PLACEBO ANALGESIA

#### Neuroimaging Models of Placebo Analgesia Effects

The very first neuroimaging study of placebo analgesia measured regional cerebral blood flow (rCBF) with positron emission tomography (PET) to compare the effects of the short-acting µ-opioid receptor agonist remifentanil or a placebo under expectations of analgesia. This study revealed increased brain activity in the rostral anterior cingulate cortex (ACC) for both remifentanil and the placebo conditions. Placebo, but not remifentanil, further increased the connectivity between the rostral ACC and the PAG (18). Since then, many neuroimaging studies have followed this original investigation.

Most commonly, neuroimaging experimental designs of placebo analgesia have involved verbal instructions of pain relief ("This is a potent analgesic") along with an inert treatment (e.g., a topical cream), compared to a control condition (the same inert treatment without expectations of pain relief). During an associative learning phase, the placebo is paired with a low-intensity painful stimulus and the control condition is paired with a high-intensity painful stimulus. Finally, during the test phase, usually conducted during a functional MRI scanning session, both the control and the placebo conditions are paired with a painful stimulus of the same intensity. Under these circumstances, experimenters can test whether pain reports and brain responses are modulated by the patient's beliefs about the treatment (8). Alternatively, pharmacological conditioning designs have involved the pairing of the relevant stimuli (e.g., pain stimuli, emotionally balanced pictures) and an acute active treatment (e.g., an analgesic) during the associative learning phase (24). While many alternative designs have been used to investigate placebo effects in the context of clinical trials (e.g., *parallel group designs* or *open versus hidden drug designs*), this trial-by-trial manipulation of expectancies and sensory inputs (e.g., pain, mood) has been an essential feature of experimental neuroimaging models of placebo analgesia, which has allowed a rapid understanding of the behavioral, neural, molecular, and computational bases of placebo analgesia.

These studies have demonstrated placebo-induced activation in several cortical areas, such as the ACC and the dorsolateral prefrontal cortex (dlPFC) (18, 25), as well as the descending pain modulating system, involving the hypothalamus, the PAG, and the rostroventromedial medulla, reaching down to the spinal cord (13). More specifically, meta-analytic results have described both placebo-induced reductions in brain responses during painful stimulation in dorsal ACC, insula, thalamus, amygdala, striatum, and lateral prefrontal cortex, as well as placebo-induced increases in activation prior to and during noxious stimulation in the dlPFC and ventromedial PFC, rostral ACC, the midbrain surrounding the PAG, left anterior insula, and the striatum (8). Furthermore, studies using opioid antagonist blockade (26–29) and *in vivo* receptor binding of μ-opioid receptors (30, 31) have extensively confirmed the role of μ-opioid neurotransmission in placebo analgesia (32) and, more recently, in antidepressant placebo effects (33), consistent with the role of the opioid system in pain (34) and mood processing (35). Nowadays, Neurosynth (36) and other related large-scale neuroimaging databases also offer the opportunity to perform comprehensive reverse inference analyses to define the neural correlates of placebo effects. Consistent with the results reported above, when "placebo" is entered as a term into a Neurosynth uniformity test, results from 332 studies reveal increased activity in the dlPFC, the dorsal, rostral, and subgenual ACC, the thalamus, and the ventral striatum (VS).

#### Neuroimaging Models of Antidepressant Placebo Effects

The experimental manipulation of expectations of mood improvement, as well as its conditioning, poses significant challenges. For example, the delayed action of conventional antidepressants limits the possibility of manipulating expectancies acutely. Furthermore, mood, unlike pain (which reliably emerges in response to specific stimuli), is a latent state with complex internal dynamics. For these reasons, most neuroimaging studies have examined placebo-induced neuroimaging changes in the context of randomized clinical trials (RCTs) (11, 12) (pre- and post-placebo mood changes). Although these studies have provided information about the biological substrates that underlie antidepressant placebo effects, they have yet to describe a mechanism through which antidepressant expectancies evolve to induce persistent mood changes, like those observed in RCTs. Critical to this aim is the development of novel trial-by-trial manipulations of antidepressant placebo effects. We have recently developed the first paradigm involving a trial-by-trial manipulation of antidepressant placebo effects (37). Here, we will argue that this kind of experimental manipulation is a necessary first step to develop an understanding of placebo effects that is embedded in a conceptual understanding of this phenomenon (**Figure 1**).

#### Parallel Group Designs of Antidepressant Placebo Effects

In the very first study that examined the neural correlates of antidepressant placebo effects, Leuchter et al. (12) used quantitative electroencephalography (QEEG) to compare changes in brain function during a 9-week RCT of fluoxetine or venlafaxine. QEEG data were collected at baseline, after a 1-week placebo lead-in phase, and at 2, 4, and 8 weeks after the start of double-blind treatment. This study showed that by week 2, placebo responders, compared to drug responders, showed increases in prefrontal cordance that significantly diverged from baseline by week 8. In contrast, at week 2, only drug responders showed a significant decrease in prefrontal cordance, which resolved at weeks 4 and 8. This was the first study to demonstrate that, despite achieving similar symptomatic improvement, placebo and antidepressant treatments engaged prefrontal function through opposite mechanisms of action, especially at early stages during the course of treatment.

Soon after, Mayberg and colleagues examined the neural correlates of antidepressant placebo effects using fluorodeoxyglucose (FDG) and PET before and after 1 and 6 weeks of fluoxetine or placebo (11). This study revealed that, after 6 weeks of treatment, placebo responders had regional metabolic increases in the prefrontal cortex, ACC, premotor and parietal cortex, posterior insula, and posterior cingulate and metabolic decreases in the subgenual ACC, parahippocampus, and thalamus, whereas drug responders had additional metabolic increases in the brainstem, striatum, anterior insula, and hippocampus (11).

These two studies represented a major step forward in the investigation of antidepressant placebo effects. Interestingly, and despite using very different neuroimaging modalities with different temporal and spatial resolution, both studies found overall increases in prefrontal activity in response to the drug or the placebo treatments.

#### Placebo Lead-In Designs of Antidepressant Placebo Effects

In the context of RCTs, parallel group designs often assess symptom stability using a placebo lead-in phase. During this phase, subjects who meet initial screening criteria, but exhibit a 20–25% reduction in symptoms, are usually excluded from participation in the post-randomization phase of the trial (38).

Biomarker studies have used placebo lead-in designs to examine the relationship between neural changes during the placebo lead-in period and the endpoint clinical outcome. An example of this kind of experimental design is the one published by Hunter et al., where they examined the neural responses during a placebo lead-in phase (39). In this case, they found that decreased prefrontal cordance during the placebo lead-in period predicted lower depression severity by the end of the trial in patients assigned to medication.

More recently, we conducted a study that involved a two-week single-blinded, crossover, randomized placebo lead-in of two identical oral placebos (described as having either 'active' or 'inactive' fast-acting antidepressant-like effects) followed by a 10-week open-label antidepressant treatment (33). In this study, 35 medication-free patients were studied with PET and the μ-opioid receptor-selective radiotracer [<sup>11</sup>C]carfentanil after the 'active' and the 'inactive' oral placebo treatment. In addition, during the PET scanning session, but only after the 'active' placebo condition, participants were administered 1 mL of isotonic saline intravenously, with instructions of fast-acting antidepressant effects. This study had several interesting findings. First, higher baseline opioid receptor binding in the nucleus accumbens (NAc) was associated with a better treatment response during the 10-week open-label antidepressant treatment. Second, clinical responses to the 'active' placebo treatment, compared to the 'inactive' one, were associated with increased placebo-induced μ-opioid neurotransmission in the subgenual ACC, NAc, midline thalamus, and amygdala. Finally, we found that placebo-induced opioid neurotransmission was associated with better antidepressant treatment response, predicting 43% of the variance in symptom improvement at the end of the antidepressant trial (33).

In addition, twenty-six patients from the sample described above completed a PET scan with the D2/3 receptor-selective radiotracer [<sup>11</sup>C]raclopride after each 1-week 'inactive' and 'active' oral placebo treatment. Here, we found that, compared to a matching sample of healthy controls, patients with depression showed greater D2/3 receptor availability in the bilateral ventral pallidum/NAc and the right ventral caudate and putamen. D2/3 receptor availability in the ventral striatum correlated positively with high anxiety (caudal portion) and negatively with anhedonia (rostral portion). Furthermore, we observed increased placebo-induced dopamine (DA) neurotransmission in the ventral striatum. However, these changes were not correlated with the patients' levels of expectations of improvement, with their mood improvement after the I.V. or the oral placebo, or with their response to the 10-week antidepressant treatment (40) (**Figure 2**). These results suggested that antidepressant placebo effects resulted in increased opioid and DA neurotransmission in regions involved in emotional and reward processing, mostly subcortically. However, as suggested by prominent reward theories (41), while both neurotransmitters are released in response to the administration of placebos, the mesolimbic dopamine system may be involved in placebo 'wanting' (the incentive salience that motivates approach), whereas the μ-opioid system may be involved in placebo 'liking' (the physiological response to a hedonic stimulus).

The same patients also completed a resting state functional connectivity (RSFC) scan after each of the 'active' and 'inactive' placebos (42). In this case, we found that increased RSFC in the rostral ACC within the salience network predicted both a better response to the active, compared to the inactive, placebo and a better response to the 10-week antidepressant treatment. Furthermore, using machine learning we showed that increased RSFC in the rostral ACC significantly predicted individual responses to placebo administration. These results suggested that increased RSFC in the rostral ACC, the most reliable marker of treatment response in depression across multiple treatments (43) as well as in placebo analgesia (18, 24), seems to play a significant role in the formation of antidepressant placebo effects.

#### Trial-By-Trial Designs of Antidepressant Placebo Effects

We recently developed a new Sham Neurofeedback fMRI Task (37). This task features a within-subject trial-by-trial manipulation of two putative components of the placebo antidepressant effect: the expectancy of mood improvement and its reinforcement. During the expectancy manipulation, patients were presented with a drug infusion or no-infusion cue, which informed them about the imminent infusion of the "fast-acting antidepressant" (intravenous saline) or its absence, respectively. During the reinforcement manipulation, patients were presented with a sham neurofeedback signal of positive or negative valence for 20 s, with instructions that it reflected changes in brain activity in response to the drug infusions. Patients were asked to rate their expectations of mood improvement and their actual mood improvement after each expectancy and reinforcement manipulation, respectively, using a 7-point Likert scale (**Figure 1**).

Results from this study in 20 patients with MDD demonstrated the feasibility of manipulating fast-acting antidepressant effects. As expected, patients reported higher expectancy ratings during the placebo infusion condition (expecting a drug infusion as opposed to no infusion), and higher mood ratings during the drug infusion cue, compared to the no-infusion cue, and following the display of positive sham neurofeedback, compared to negative. Furthermore, the positive effect of neurofeedback on reported mood was enhanced when expectancies were high, as reflected in a positive two-way interaction.

The presentation of neurofeedback of greater magnitude recruited greater blood-oxygen-level dependent (BOLD) responses in the bilateral ventrolateral and dorsolateral PFC. Furthermore, greater increases in β-endorphin plasma levels during the task were associated with higher expectancy ratings during the placebo condition, compared to the no-infusion condition, and higher mood ratings during positive neurofeedback, compared to negative.

In our opinion, this trial-by-trial manipulation is an essential first step to decoding the neural representation of antidepressant placebo effects, by dissecting the different components of the placebo response and aiding the development of computational models which might provide new opportunities to disambiguate this complex phenomenon. For example, expectancy ratings during the Sham Neurofeedback fMRI Task could be fit to models of RL where learned expected values for each trial type are updated every time the "antidepressant" infusion cue is presented and an outcome (positive or negative neurofeedback) is observed. This updating is based on the following equation: $Q_{t+1}(s) = Q_t(s) + \alpha\delta_t$, where $Q_t(s)$ is the learned expected value of improvement at trial $t$, $\alpha$ is a learning rate, and $\delta_t$ is the difference between the actual and expected outcome (RPE): $\delta_t = r_t - Q_t(s)$, where $r_t$ is the actual reward outcome (positive vs. negative neurofeedback). These values are used to make choices (such as ratings of expectation of improvement) according to a sigmoid choice rule with two free parameters: β (*stochasticity*) and K (*choice bias*). The estimation of such parameters and derived values (e.g., expected values, RPE, etc.), which cannot be accessed with descriptive approaches alone, can then be mapped onto the neural response during the Sham Neurofeedback fMRI Task. This trial-by-trial information is likely to provide new opportunities to disambiguate placebo responses. Furthermore, this transdiagnostic RL framework may apply to other clinical conditions where placebo effects are also prevalent, notably anxiety disorders, Parkinson's Disease, and various forms of persistent pain, but also schizophrenia, substance use disorders and surgeries (44, 45).
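The update rule and choice rule described in this section can be illustrated with a minimal sketch. It is not the authors' fitting code: the function names, the coding of outcomes as +1/-1, the initial value of 0, and the exact form of the sigmoid (slope `beta`, shift `kappa`) are all assumptions made for illustration.

```python
import math


def update_expected_value(q_value, reward, alpha):
    """One step of Q_{t+1}(s) = Q_t(s) + alpha * delta_t, with delta_t = r_t - Q_t(s)."""
    delta = reward - q_value  # reward prediction error (RPE)
    return q_value + alpha * delta, delta


def rating_probability(q_value, beta, kappa):
    """Assumed sigmoid choice rule: probability of a high expectancy rating.

    In this parameterization a larger beta makes ratings less stochastic and
    kappa shifts the curve (choice bias); the source does not specify the form.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_value - kappa)))


def simulate_cue(rewards, alpha=0.3, q_init=0.0):
    """Track the expected value and RPE across trials of a single cue type."""
    q_value = q_init
    history = []
    for reward in rewards:
        q_next, delta = update_expected_value(q_value, reward, alpha)
        history.append((q_value, delta))  # value held before the outcome, and its RPE
        q_value = q_next
    return q_value, history


if __name__ == "__main__":
    # Hypothetical sequence: positive (+1) and negative (-1) sham neurofeedback.
    final_q, trace = simulate_cue([1, 1, -1, 1], alpha=0.3)
    print(final_q, trace)
```

In a model-fitting setting, `alpha`, `beta`, and `kappa` would be estimated from the observed ratings, and the trial-wise values and prediction errors in `history` would serve as regressors for the imaging data.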

### COMPUTATIONAL APPROACHES TO ANTIDEPRESSANT PLACEBO EFFECTS

Whereas computational theories have not yet been applied to models of antidepressant placebo effects, recent evidence supports a relationship between RL and mood, which opens the possibility that antidepressant placebo effects might indeed result from RL mechanisms (46, 47). Expectations and PEs have been shown to affect self-reported mood on a trial-to-trial basis (48), and mood can bias how people perceive and learn from rewards (46, 47). This bi-directional relationship between learning and mood is likely to play a significant role in the formation of antidepressant placebo effects (**Figure 3**).

RL models of antidepressant placebo effects are therefore likely to be influenced by features frequently affected in patients with depression. For example, patients with depression may show a reduction in the *primary sensitivity* to rewards (consummatory anhedonia) and/or alterations in their ability to *learn* from positive or reward feedback. Furthermore, patients with depression might show exaggerated processing of negative or aversive feedback (49). These alterations in the processing of positive and negative feedback could also have implications for nocebo effects in this disorder. Therefore, RL models of antidepressant placebo effects might need to incorporate additional features such as *reduced sensitivity to positive feedback and differential sensitivity to positive versus negative feedback*. Models that account for these biases can adjust outcome processing or learning based on the valence of outcomes (by modulating learning rates or sensitivities to outcomes for positive and/or negative feedback) or prediction errors (by estimating separate learning rates for positive versus negative prediction errors).
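One way such valence-dependent biases could enter an RL model is sketched below. This is an illustrative assumption rather than a published model of antidepressant placebo effects: `alpha_pos`, `alpha_neg`, and `reward_sensitivity` are hypothetical parameter names, and scaling the outcome before computing the prediction error is only one of several possible ways to model reduced reward sensitivity.

```python
def update_with_valence_bias(q_value, reward, alpha_pos, alpha_neg,
                             reward_sensitivity=1.0):
    """Value update with separate learning rates for positive and negative PEs.

    reward_sensitivity < 1 dampens the experienced outcome (assumed way of
    modeling reduced primary sensitivity); alpha_neg > alpha_pos expresses
    stronger learning from negative prediction errors.
    """
    delta = reward_sensitivity * reward - q_value
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q_value + alpha * delta


if __name__ == "__main__":
    # Hypothetical parameters: weak learning from positive, strong from negative PEs.
    print(update_with_valence_bias(0.0, 1.0, alpha_pos=0.1, alpha_neg=0.5))
    print(update_with_valence_bias(0.0, -1.0, alpha_pos=0.1, alpha_neg=0.5))
```

With these example values, the same absolute outcome moves the expected value five times further after negative than after positive feedback, which is the kind of asymmetry the paragraph above describes.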

Finally, improved mood may increase processing and learning from positive outcomes, biasing learning in a more positive direction with initial improvements in mood (46). Therefore, models of antidepressant placebo effects may also benefit from including *bidirectional influences between mood and learning*. This kind of bias creates a feedback loop where initial improvements in mood, through biasing learning in a positive direction, lead to more positive future mood states, providing a potential mechanism for the perpetuation of placebo responses.
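The feedback loop between mood and learning can be made concrete with a toy simulation. Everything in it is an assumption chosen for illustration: mood is modeled as a running average of recent prediction errors, and mood adds a bias of size `mood_gain` to how each outcome is perceived. The source describes the loop only qualitatively.

```python
def simulate_mood_learning(rewards, alpha=0.3, mood_rate=0.2, mood_gain=0.5):
    """Toy model of bidirectional mood-learning influences (illustrative only)."""
    q_value, mood = 0.0, 0.0
    trajectory = []
    for reward in rewards:
        perceived = reward + mood_gain * mood  # mood biases the perceived outcome
        delta = perceived - q_value            # prediction error on the perceived outcome
        q_value += alpha * delta               # learning
        mood += mood_rate * (delta - mood)     # mood tracks recent prediction errors
        trajectory.append((q_value, mood))
    return trajectory


if __name__ == "__main__":
    outcomes = [1, 1, 1]  # hypothetical run of positive sham neurofeedback
    print(simulate_mood_learning(outcomes, mood_gain=0.5))
    print(simulate_mood_learning(outcomes, mood_gain=0.0))
```

Comparing the two runs shows the loop: with `mood_gain` above zero, early positive surprises raise mood, which inflates later perceived outcomes and pushes the expected value higher than in the unbiased run.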

#### CONCLUSION

This review has identified several challenges and opportunities that have emerged from early research investigating the neurobiology of antidepressant placebo effects and new computational approaches. As discussed, much can be learned from experimental approaches extensively used by other disciplines.

In the future, the formalization of computational models of antidepressant placebo effects and other psychiatric conditions may provide a fruitful approach to map learning-based models of antidepressant placebo effects onto the underlying neural mechanism. The delineation of such a computational framework and associated neural circuits and neurotransmitter systems will open new *translational opportunities* to promote treatment response by stimulating placebo-related networks as new targets for mood improvement. From the perspective of drug and therapy development, *inhibiting placebo responses* could help separate drug-specific and "non-specific" treatment effects. Higher signal and less noise in RCTs would, therefore, result in substantial savings by reducing the sample sizes necessary to achieve significant differences between active and inactive treatments. As discussed, a first step towards this aim is the use of model-based experimental approaches that disentangle the different elements involved in this complex phenomenon, including those shared by other disorders and those that are mood specific.

Furthermore, the development of software tools and platforms that might provide access to high quality clinical multi-disciplinary data may allow the development of computational brain models useful in clinical practice (for example, Virtual Brain: https://www.thevirtualbrain.org/tvb/zwei). Such neurocomputational models could potentially be used to help identify key subject-specific mechanisms of placebo responses that might impact treatment response broadly. This approach is more likely to account for individual differences in placebo responses, a phenomenon that is subject to both intra-individual and inter-individual variability. Consistently, recent evidence suggests that functional organization within individual subjects is idiosyncratic and relatively robust to changes in brain state and provides meaningful information beyond group averages (50–52). This progress is key to the development of biomarkers of treatment and personalized medicine.

#### AUTHOR CONTRIBUTIONS

VB and MP contributed to writing, reviewing and making figures and tables for this review article.

### FUNDING

This work was supported by a K23 MH108674 (MP) and a T32 MH019986 (VB).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Brown and Peciña. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebo Manipulations Reverse Pain Potentiation by Unpleasant Affective Stimuli

#### *Philipp Reicherts1\*, Paul Pauli1,2, Camilla Mösler1 and Matthias J. Wieser1,3*

*1 Department of Psychology, University of Würzburg, Würzburg, Germany, 2 Center of Mental Health, University of Würzburg, Würzburg, Germany, 3 Department of Psychology, Education, and Child Studies, University of Rotterdam, Rotterdam, Netherlands*

According to the motivational priming hypothesis, unpleasant stimuli activate the motivational defense system, which in turn promotes congruent affective states such as negative emotions and pain. The question arises to what degree this bottom–up impact of emotions on pain is susceptible to a manipulation of top–down-driven expectations. To this end, we investigated whether verbal instructions implying pain potentiation vs. reduction (placebo or nocebo expectations), later confirmed by corresponding experiences (placebo or nocebo conditioning), might alter behavioral and neurophysiological correlates of pain modulation by unpleasant pictures. We compared two groups, which underwent three experimental phases: first, participants were either instructed that watching unpleasant affective pictures would increase pain (nocebo group) or that watching unpleasant pictures would decrease pain (placebo group) relative to neutral pictures. During the following placebo/nocebo-conditioning phase, pictures were presented together with electrical pain stimuli of different intensities, reinforcing the instructions. In the subsequent test phase, all pictures were presented again combined with identical pain stimuli. Electroencephalogram was recorded in order to analyze neurophysiological responses of pain (somatosensory evoked potential) and picture processing [visually evoked late positive potential (LPP)], in addition to pain ratings. In the test phase, ratings of pain stimuli administered while watching unpleasant relative to neutral pictures were significantly higher in the nocebo group, thus confirming the motivational priming effect for pain perception. In the placebo group, this effect was reversed such that unpleasant compared with neutral pictures led to significantly lower pain ratings. Similarly, somatosensory evoked potentials were decreased during unpleasant compared with neutral pictures, in the placebo group only.
LPPs of the placebo group failed to discriminate between unpleasant and neutral pictures, while the LPPs of the nocebo group showed a clear differentiation. We conclude that the placebo manipulation already affected the processing of the emotional stimuli and, in consequence, the processing of the pain stimuli. In summary, the study revealed that the modulation of pain by emotions, albeit a reliable and well-established finding, is further tuned by reinforced expectations known to induce placebo/nocebo effects—which should be addressed in future research and considered in clinical applications.

Keywords: placebo and nocebo effects, emotion processing, psychological pain modulation, late positive potential, somatosensory evoked potential

*Edited by:*

*Katja Weimer, University of Ulm, Germany*

#### *Reviewed by:*

*Per M. Aslaksen, UiT The Arctic University of Norway, Norway Florian Bublatzky, Central Institute for Mental Health, Germany*

*\*Correspondence: Philipp Reicherts philipp.reicherts@uni-wuerzburg.de*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 26 February 2019 Accepted: 16 August 2019 Published: 24 September 2019*

#### *Citation:*

*Reicherts P, Pauli P, Mösler C and Wieser MJ (2019) Placebo Manipulations Reverse Pain Potentiation by Unpleasant Affective Stimuli. Front. Psychiatry 10:663. doi: 10.3389/fpsyt.2019.00663*

## INTRODUCTION

The processing of pain is prone to a variety of psychological variables, such as the affective state of an individual [for an overview, see Ref. (1)]. In this vein, it was demonstrated that emotions, induced for instance by a threat manipulation (2) or by emotionally relevant stimuli, modulate pain processing (3–6). In an earlier study, Kenntner-Mabiala and colleagues presented affective pictures for about 6 s to participants while they applied brief painful electric stimuli and registered pain ratings plus the somatosensory evoked potential (SEP) (7, 8). Results suggest that emotions modulate early pain processing as unpleasant pictures resulted in increased pain ratings and increased amplitudes of the early N1 component of the SEP relative to positive pictures. Other studies indicate that expectations regarding the characteristics of an upcoming pain stimulus also determine the processing of nociceptive stimulation and the resulting pain perception (9, 10). The same is true for placebo and nocebo effects on pain; however, here, expectations focus on the *effect of an intervention*, which is expected to decrease (placebo) or increase pain (nocebo) (10–12). Expectations causing placebo and nocebo effects can be induced by verbal instructions suggesting a pain-modulating effect and/or by the actual experience of pain relief or pain exacerbation (placebo/nocebo conditioning) associated with a certain treatment or—experimental—intervention (13–17).

In a recent study, we investigated the respective contributions of expectations and prior experiences to the formation of placebo effects. To this end, we introduced a new, completely psychological placebo manipulation, which ensured that participants had not encountered the placebo agent before and thus had no *a priori* expectation. We employed a common approach in placebo and nocebo research, that is, a placebo/nocebo instruction followed by a reinforcing conditioning phase, during which placebos were combined with lower and nocebos with stronger pain stimuli. Three experimental conditions were compared: participants were either only informed of an analgesic/pro-algesic effect they were about to encounter, or actually experienced different levels of pain in a conditioning procedure, or received both, that is, an instruction informing about a pain-modulating effect, which was then supported during a subsequent conditioning phase. We found that the latter condition, i.e., expectation plus concordant conditioning, was capable of modifying subjective and physiological indices of pain, even though the placebo/nocebo manipulation lacked pharmacological plausibility, since we instructed participants that "watching certain black and white stripe patterns were found to have a pain augmenting/easing effect," respectively (18). These findings corroborate the critical role of higher-order cognitions for the modulation of pain.

Placebo and nocebo effects, however, are by no means restricted to pain. Significant modifications have been found for various somatic symptoms (12) and also for the perception of emotions. For example, Petrovic and colleagues found that subjective and neuronal responses to unpleasant affective pictures were reduced if participants believed they had received an anxiolytic medication (19). Based on the involved brain areas, the authors assume similar underlying mechanisms in placebo effects altering emotion and pain alike. More recently, Schienle and colleagues (20) demonstrated reduced feelings of disgust paralleled by reduced insular activation when participants thought they took a herbal drug against disgust symptoms. In a related manner, findings from research on reversal learning show placebo- and nocebo-like effects on emotion processing. For instance, threat responses following the presentation of previously established conditioned threat cues (CS+), which were paired with aversive electrical stimuli, are reduced, if participants receive a verbal instruction that the cue is no longer indicative of danger (21). Similarly, although the presentation of emotional facial expressions reliably evokes positive or negative affective responses in an observer, verbal instruction about potential danger being indicated by a certain face category leads to defensive responding irrespective of face valence (e.g., happy or fearful faces announcing an aversive outcome) (12). These results nicely demonstrate that emotional responses can be shaped *top–down* by cognitive representations of superordinate functions.

Interestingly, placebo and nocebo effects often come along with emotional responses, such as anticipatory anxiety (nocebo) or positive feelings of relief and reward (placebo), which, to some degree, might mediate the modulation of (pain) symptoms (22–24). For instance, Aslaksen and colleagues showed that a nocebo instruction suggesting hyperalgesic effects caused by an applied cream led to a pain increase, which was mediated by subjective and physiological indices of stress (25). However, when applying a mere conditioning procedure without explicit placebo or nocebo instructions, the role of negative affect might be less relevant (26). Just recently, Geers and colleagues found that the experimental induction of positive mood by watching a pleasant movie clip was capable of blocking a pain increase induced by a verbal nocebo suggestion (27). Despite all these findings, little research has so far explored the interaction of emotions on the one hand and placebo/nocebo manipulations on the other when modulating pain.

In the present study, we aimed at investigating whether the genuine pain-modulating effect of unpleasant affective pictures is sensitive to a placebo or nocebo manipulation. To this end, we compared two groups of participants who received either a placebo or nocebo manipulation related to unpleasant pictures. The nocebo group was instructed that watching unpleasant pictures leads to an increased perception of pain in line with findings from the literature (nocebo instruction), and during a later conditioning procedure, they actually experienced relatively more intense pain stimuli when watching the "nocebo" pictures. The placebo group was told the exact opposite, namely, that unpleasant pictures cause a decreased perception of pain. Thereafter, participants experienced relatively less intense pain stimuli when watching the "placebo" pictures. In addition to pain reports, we measured the electroencephalogram (EEG), which allowed us to analyze neurophysiological correlates of pain perception (N1 and P2 components of the SEP as mentioned earlier) and processing of the emotional pictures by means of visually evoked potentials (28). One component of the visually evoked potential following the presentation of emotionally relevant stimuli is the late positive potential (LPP), a positive signal deflection most prominent at centro-parietal electrode sites, which was found to be a sensitive measure for emotional intensity (arousal) of presented pictures (29–31).

We hypothesized that unpleasant picture stimuli generally increase pain processing but that this effect is modulated by reinforced expectations induced by a placebo/nocebo manipulation. Specifically, we expected that a placebo manipulation (verbal instruction + placebo conditioning) would reduce or even reverse the pain-augmenting effect of unpleasant pictures. This might lead to lower pain ratings and SEPs for unpleasant compared with neutral pictures. Further, the placebo manipulation might also become evident in altered neurophysiological correlates of unpleasant affective picture processing, namely, a lack of LPP modulation or even higher amplitudes for neutral compared with unpleasant pictures.

#### MATERIALS AND METHODS

#### Sample

Forty-two participants were recruited from the University of Würzburg and received course credit or €20 as compensation. Two participants had to be excluded due to technical problems during data acquisition, leaving 40 participants in the final analysis, 20 participants in the nocebo group (10 females) and 20 participants in the placebo group (10 females). All subjects had normal or corrected-to-normal vision, reported no current or prior history of chronic pain, neurological or psychiatric disorders (self-report), and did not take any analgesic medication prior to the experiment. Participants first read detailed instructions about the experiment and signed the informed consent before taking part in the experiment. Participants filled out questionnaires on current positive and negative affect (Positive Affect/Negative Affect Schedule) (32), on state and trait anxiety (State/Trait Anxiety Inventory) (33), on pain catastrophizing (Pain Catastrophizing Scale) (34), on sensitivity for pain (35), on dispositional optimism and pessimism (Life Orientation Test—Revised) (36), and on anxiety of pain related symptoms (Pain Anxiety Symptom Scale) (37). Questionnaire scores of both groups were similar except for state anxiety, which was higher in the placebo group (see **Table 1**). All procedures were approved by the institutional review board of the medical faculty of the University of Würzburg.

#### TABLE 1 | Sample Characteristics.

#### Visual Stimuli

Participants watched 40 emotional pictures (twice), which were drawn from the International Affective Pictures System (38), comprising 20 neutral (International Affective Pictures System catalog numbers: 2095, 3170, 3180, 3230, 3261, 3500, 3530, 6212, 6256, 9040, 9050, 9163, 9250, 9300, 9321, 9413, 9419, 9901, 9921, and 9925) and 20 unpleasant pictures (2038, 2191, 2383, 2393, 2396, 2514, 2595, 2749, 2850, 2870, 2880, 5390, 5731, 5870, 7002, 7100, 7130, 7493, 7550, and 7590). Pictures were presented for 6 s interleaved by a central fixation cross present for 2–3 s (randomized). Picture order was randomized with the restriction of no more than two consecutive pictures of the same valence. Visual stimuli were projected centrally on a screen of 2 × 3.22 m (Powerwall), at 2.0-m distance from the participant's chair.
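The constrained randomization of picture order (no more than two consecutive pictures of the same valence) can be sketched with simple rejection sampling. The authors do not report how the constraint was implemented in their presentation software, so the approach and the names `randomize_pictures` and `max_run_length` are assumptions for illustration; the sketch covers one pass through the 40 pictures.

```python
import random


def max_run_length(labels):
    """Length of the longest run of identical consecutive labels."""
    longest = current = 0
    previous = None
    for label in labels:
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
        previous = label
    return longest


def randomize_pictures(neutral_ids, unpleasant_ids, max_run=2, rng=None,
                       max_attempts=100_000):
    """Shuffle until no valence occurs more than max_run times in a row."""
    rng = rng or random.Random()
    trials = [(pid, "neutral") for pid in neutral_ids]
    trials += [(pid, "unpleasant") for pid in unpleasant_ids]
    for _ in range(max_attempts):
        rng.shuffle(trials)
        if max_run_length([valence for _, valence in trials]) <= max_run:
            return list(trials)
    raise RuntimeError("no valid picture order found")


if __name__ == "__main__":
    # Placeholder identifiers stand in for the 20 + 20 catalog numbers.
    order = randomize_pictures(range(100, 120), range(200, 220),
                               rng=random.Random(1))
    print([valence[0] for _, valence in order])
```

Rejection sampling keeps every admissible order equally likely, at the cost of discarding most shuffles; for 20 pictures per valence this remains fast in practice.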

#### Electrical Pain Stimulation

Electrical pain stimuli were delivered on the left calf of the participants *via* a surface bar electrode with two stainless steel disk electrodes (8-mm diameter, 30-mm spacing), using a constant-current stimulator (Digitimer DS7A, Digitimer Ltd., Welwyn Garden City, UK). The intensity of the electrical stimulus was adjusted to the participants' individual pain threshold. During thresholding, participants were asked to rate electrical stimuli of two ascending and two descending series starting from 0 mA applying steps of ±0.5 mA, respectively, on an 11-point scale ranging from 0 (no pain at all) to 10 (unbearable pain). Stimulus intensities rated with a 4 (just noticeable pain) were averaged, and 1 mA was added to obtain the final stimulus intensity, ensuring a moderate pain level. The final stimulation intensity was again rated on a 10-point scale (see **Table 1**). During the experiment, two different stimulation intensities were used, which varied with regard to the number of consecutive single pulses (train length). Low-intensity stimuli consisted of three square pulses (pulse length 2 ms) with an inter-pulse interval of 4 ms, whereas high-intensity stimuli consisted of 10 square pulses. During the test phase and the threshold procedure, high-intensity stimuli were delivered, and during the conditioning phase, both low- and high-intensity stimuli were used; see the procedure section for further details.

*Table 1 note: Both groups consisted of 10 women and 10 men; PANAS, Positive Affect/Negative Affect Schedule; STAI, State/Trait Anxiety Inventory; LOT, Life Orientation Test; PSQ, Pain Sensitivity Questionnaire; PCS, Pain Catastrophizing Scale; PASS-D, Pain Anxiety Symptom Scale; \*p < .05; \*\*significant Mann–Whitney U Test.*
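The rule for deriving the individual stimulation intensity (average the intensities rated 4, then add 1 mA) is simple enough to state as a short function. The data layout, a list of `(intensity_mA, rating)` pairs pooled over the ascending and descending series, and the function name are assumptions for illustration.

```python
def final_stimulation_intensity(ratings, threshold_rating=4, offset_ma=1.0):
    """Mean intensity of all stimuli rated at the pain threshold, plus an offset.

    ratings: iterable of (intensity_mA, rating) pairs pooled over all series.
    """
    at_threshold = [intensity for intensity, rating in ratings
                    if rating == threshold_rating]
    if not at_threshold:
        raise ValueError("no stimulus was rated at the threshold value")
    return sum(at_threshold) / len(at_threshold) + offset_ma


if __name__ == "__main__":
    # Hypothetical ratings from one ascending and one descending series.
    series = [(1.0, 2), (1.5, 4), (2.0, 5), (2.5, 4), (2.0, 3)]
    print(final_stimulation_intensity(series))  # mean of 1.5 and 2.5, plus 1 mA
```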

#### Electroencephalogram Recording and Evoked Potentials

Electrophysiological data were recorded from 32 active electrodes (ActiCap; Brain Products, Munich, Germany) with a sampling rate of 1,000 Hz, placed according to the international 10–20 system (C3, C4, CP1, CP2, CP5, CP6, Cz, F3, F4, F7, F8, FC1, FC2, FC5, FC6, Fp1, Fp2, Fz, O1, O2, Oz, P3, P4, P7, P8, Pz, T7, T8, TP10). FCz was used as online reference, and data were offline re-referenced to an average reference. Vertical (above and below the left eye) and horizontal (at the outer canthi of both eyes) electrooculograms were recorded. Electrode impedance was kept below 5 kΩ, and the online band-pass filter was set to 0.01 to 250 Hz. Data were collected using a Brain-Amp-MR amplifier (Brain Products) and the software Brain Vision Recorder Version 1.05 together with ActiCap Control Software (Brain Products). Off-line EEG analysis was performed using Brain Vision Analyzer Version 2.1 (Brain Products). EEG was filtered (0.1–30 Hz) and corrected for horizontal and vertical ocular artifacts (39). Trials exceeding a transition threshold of 50 µV (sample to sample) or an amplitude criterion of ±100 µV were excluded from further analysis. For the analysis of the picture evoked LPP, epochs registered 100 ms before to 2,000 ms after picture onset were extracted and baseline corrected with reference to the mean baseline interval (100 ms before picture onset). The LPP was scored at the parietal electrode Pz and quantified as mean activity from 700- to 1,000-ms post picture onset, according to visual inspection of the scalp topographies and the literature (28, 31). For the analysis of the SEP following electrical stimulation, epochs registered 100 ms before to 1,000 ms after electrical stimulation (first pulse) were extracted, baseline-corrected, and averaged analogous to the procedure for the LPP. Two components of the SEP were analyzed, that is, the N1 and P2, which were scored as mean activity at the Cz electrode in a time window from 75 to 125 ms and 200 to 330 ms, respectively (3, 8). For statistical analysis, all event-related potential components were averaged per participant across all artifact-free picture and pain epochs of the conditioning and test phase, respectively.
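The artifact criteria and the mean-amplitude scoring described above can be sketched for a single-channel epoch. This is not the Brain Vision Analyzer procedure: the function names, the use of plain Python lists of microvolt values, and the treatment of window boundaries (start inclusive, end exclusive) are assumptions for illustration.

```python
def is_artifact_free(epoch_uv, max_step_uv=50.0, max_abs_uv=100.0):
    """Apply the sample-to-sample (50 µV) and absolute (±100 µV) criteria."""
    steps_ok = all(abs(b - a) <= max_step_uv
                   for a, b in zip(epoch_uv, epoch_uv[1:]))
    amplitude_ok = all(abs(value) <= max_abs_uv for value in epoch_uv)
    return steps_ok and amplitude_ok


def window_mean(epoch_uv, start_ms, stop_ms, epoch_start_ms=-100,
                sampling_rate_hz=1000):
    """Mean voltage between start_ms (inclusive) and stop_ms (exclusive)."""
    samples_per_ms = sampling_rate_hz / 1000.0
    first = round((start_ms - epoch_start_ms) * samples_per_ms)
    last = round((stop_ms - epoch_start_ms) * samples_per_ms)
    segment = epoch_uv[first:last]
    return sum(segment) / len(segment)


def baseline_corrected_mean(epoch_uv, start_ms, stop_ms, epoch_start_ms=-100):
    """Mean amplitude in a window relative to the pre-stimulus baseline."""
    baseline = window_mean(epoch_uv, epoch_start_ms, 0, epoch_start_ms)
    return window_mean(epoch_uv, start_ms, stop_ms, epoch_start_ms) - baseline


if __name__ == "__main__":
    # Synthetic epoch from -100 to 2,000 ms at 1,000 Hz with a plateau at 700-1,000 ms.
    epoch = [2.0] * 800 + [7.0] * 300 + [2.0] * 1000
    if is_artifact_free(epoch):
        print(baseline_corrected_mean(epoch, 700, 1000))  # LPP window
```

The same `baseline_corrected_mean` call with the windows 75 to 125 ms and 200 to 330 ms would score the N1 and P2 on a Cz epoch.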

#### Pain Ratings

After each electrical stimulation, pain intensity and unpleasantness ratings were obtained using a digital visual analog scale. Ratings were converted off-line to values between 0 and 100. The scale for pain intensity ratings was labeled "not painful at all" at the left end and "extremely painful" at the right end of the scale, and for pain unpleasantness, the scale ranged from "not unpleasant at all" to "extremely unpleasant."

#### Procedure

After arrival, participants were assigned to one of the two experimental groups (nocebo vs. placebo), taking into account the participants' gender, following an *a priori* randomization performed by the experimenter. According to the respective experimental condition, participants were instructed that during the experiment, they would watch a series of unpleasant and neutral pictures, which, in line with recent findings in the literature, very likely would change their perception of concurrently administered painful electrical stimuli. The nocebo group was told that unpleasant pictures would *increase* the perception of pain, while neutral pictures had no influence on pain at all. The placebo group instead was told that unpleasant pictures would result in a *decreased* perception of pain compared with neutral pictures, which would leave the perception of pain unchanged. Participants were seated 2.0 m in front of the screen and started the experiment. Unbeknownst to them, the experiment consisted of two parts, the conditioning phase, which was followed without interruption by the test phase. During conditioning, participants of the nocebo group received the low-intensity pain stimuli while watching neutral pictures and the high-intensity pain stimuli while watching unpleasant pictures. This association was reversed for participants of the placebo group; here, participants were administered the low-intensity pain stimuli during unpleasant and the high-intensity stimuli during neutral picture presentation. Following the logic of previous placebo manipulations, this procedure should reassure the participants that the instruction they were given at the beginning of the experiment actually held true and that pain perception was modulated accordingly. During the test phase, participants of both groups always received the same, high-intensity pain stimulation, combined with neutral and unpleasant pictures (see **Figure 1**). After each trial, participants rated the electrical stimulus for pain intensity and unpleasantness. In total, participants completed 80 trials, that is, 20 repetitions each of unpleasant and neutral pictures per phase. At the end, participants filled out a post-experimental survey asking how they evaluated the effect of unpleasant and neutral pictures on pain using a 9-point scale ranging from +4 (very pain increasing) to 0 (no effect on pain) to -4 (very pain reducing). Stimulus presentation was controlled by the software Presentation (Neurobehavioral Systems, Albany, CA, USA).
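The pairing of picture valence and stimulus intensity across groups and phases can be summarized as a small lookup function. The names and string labels are hypothetical; only the mapping itself and the pulse counts (3 vs. 10) are taken from the text.

```python
PULSES_PER_TRAIN = {"low": 3, "high": 10}  # square pulses per stimulus, as described


def stimulus_level(group, phase, valence):
    """Return 'low' or 'high' for a trial, following the described design."""
    if group not in ("nocebo", "placebo"):
        raise ValueError(f"unknown group: {group}")
    if phase == "test":
        return "high"  # identical stimulation for all pictures and both groups
    if phase != "conditioning":
        raise ValueError(f"unknown phase: {phase}")
    if group == "nocebo":
        return "high" if valence == "unpleasant" else "low"
    return "low" if valence == "unpleasant" else "high"


if __name__ == "__main__":
    for group in ("nocebo", "placebo"):
        for phase in ("conditioning", "test"):
            for valence in ("neutral", "unpleasant"):
                level = stimulus_level(group, phase, valence)
                print(group, phase, valence, level, PULSES_PER_TRAIN[level])
```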

#### Statistical Analysis

Pain ratings (pain unpleasantness, pain intensity) and amplitudes of the SEP components (N1 and P2) were analyzed separately for the conditioning and the test phase. For the conditioning phase, a 2 × 2 repeated measures analysis of variance (ANOVA) including the within-subjects factor *Stimulation Level* (high vs. low intensity stimulation, irrespective of picture category) and the between-subjects factor *Group* (nocebo vs. placebo) was applied. For the test phase, pain responses following identical stimulation intensities were analyzed using the within-subjects factor *Emotion* (unpleasant vs. neutral pictures) and the between-subjects factor *Group*. LPPs were analyzed using a 2 × 2 × 2 repeated measures ANOVA including the within-subjects factors *Emotion* (unpleasant vs. neutral pictures) and *Phase* (conditioning vs. test phase), to capture potential changes across the time course of the experiment, and the between-subjects factor *Group*. Significant interactions were explored using follow-up ANOVAs. The significance level was set to .05 (two-tailed); for follow-up ANOVAs, a corrected alpha of *p* < .025 was considered. As a measure of effect size, we report partial *η*². Normal distribution of the analyzed data can be assumed for 93% of the variables (Shapiro–Wilk tests); given the robustness of the repeated measures ANOVA against violations of data normality (40), its usage seems appropriate in the present case.
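For readers who want to check the reported effect sizes, partial *η*² can be recovered from an *F* value and its degrees of freedom using the standard identity *η*p² = (*F* · df_effect) / (*F* · df_effect + df_error). The helper below is a convenience sketch, not part of the authors' analysis pipeline.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # F(1, 38) = 152.54 is the main effect of Stimulation Level on pain intensity.
    print(round(partial_eta_squared(152.54, 1, 38), 2))
```

Applied to the reported statistics, the identity reproduces the published values, for example .80 for *F*(1, 38) = 152.54 and .20 for *F*(1, 38) = 9.36.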

#### RESULTS

#### Pain Ratings—Conditioning Phase

Analysis of pain intensity ratings revealed a significant main effect of *Stimulation Level*, *F*(1, 38) = 152.54, *p* < .001, *ηp*² = .80, as a result of higher pain ratings following more intense electrical stimulation. The interaction of *Stimulation Level* × *Group* was only marginally significant, *F*(1, 38) = 3.49, *p* = .07, *ηp*² = .08, presumably indicating a more pronounced differentiation between the two stimulation intensities for the placebo group. The factor *Group* was significant, *F*(1, 38) = 9.36, *p* = .004, *ηp*² = .20, due to higher pain ratings in the placebo compared with the nocebo group (*M =* 46.54 vs. *M =* 30.64), see **Figure 2**.

Analysis of pain unpleasantness ratings returned a similar picture: participants clearly differentiated between the two pain stimuli, as indicated by the significant main effect of *Stimulation Level*, *F*(1, 38) = 113.96, *p* < .001, *ηp*² = .75; however, the interaction of *Stimulation Level* × *Group* was not significant, *F*(1, 38) = 0.01, *p* = .99, *ηp*² < .01. Again, participants in the placebo group reported higher pain in general, *F*(1, 38) = 10.72, *p* = .002, *ηp*² = .22 (*M =* 47.32 vs. *M =* 31.12), see **Figure 2**.

#### Pain Ratings—Test Phase

Analysis of pain intensity ratings revealed a marginally significant main effect of *Emotion*, *F*(1, 38) = 3.66, *p* = .06, *ηp*² = .09, which was further qualified by a significant interaction of *Emotion* × *Group*, *F*(1, 38) = 11.72, *p* < .001, *ηp*² = .24. Participants in the placebo group rated pain during neutral pictures significantly higher than during unpleasant pictures, *F*(1, 19) = 11.97, *p* = .003, *ηp*² = .39, while the same comparison failed to reach significance in the nocebo group, *F*(1, 19) = 1.41, *p* = .25, *ηp*² = .07. The factor *Group* was also significant, *F*(1, 38) = 6.94, *p* = .01, *ηp*² = .15, resulting from generally higher pain ratings in the placebo group (*M =* 50.17 vs. *M =* 35.07), see **Figure 3**.

Analysis of pain unpleasantness ratings showed no main effect of *Emotion*, *F*(1, 38) = 0.46, *p* = .50, *ηp*² = .01; however, the interaction of *Emotion* × *Group* was significant, *F*(1, 38) = 31.67, *p* < .001, *ηp*² = .45. Separate ANOVAs for each group revealed a significant main effect of *Emotion* for both the nocebo group, *F*(1, 19) = 22.10, *p* < .001, *ηp*² = .54, and the placebo group, *F*(1, 19) = 11.13, *p* = .003, *ηp*² = .37. However, while participants in the nocebo group rated pain stimuli higher during unpleasant compared with neutral pictures (*M* = 40.92 vs. *M* = 31.17), participants in the placebo group showed the exact opposite pattern, namely, higher pain unpleasantness ratings while seeing neutral (*M* = 58.15) compared with unpleasant pictures (*M* = 50.50). Again, the placebo group showed generally higher pain unpleasantness ratings compared to the nocebo group, *F*(1, 38) = 11.89, *p* = .001, *ηp*² = .24 (*M* = 54.33 vs. *M* = 36.45); see **Figure 3**.

FIGURE 3 | Mean pain intensity (left) and unpleasantness (right) ratings (+SEM) in the test phase separately for picture category (neutral vs. unpleasant) and experimental group. All within-group comparisons (except for pain intensity ratings of the nocebo group) and the between factor were significant (*p* < .05).
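The crossover in unpleasantness ratings can be made concrete by computing the emotion effect (unpleasant minus neutral) within each group and then the difference between those two effects, which is the contrast that a 2 × 2 interaction tests. This is an illustrative sketch based only on the four cell means reported in the text; the variable and function names are assumptions, and the sketch does not reproduce the ANOVA itself.

```python
# Mean pain unpleasantness ratings in the test phase, as reported in the text.
means = {
    "nocebo": {"neutral": 31.17, "unpleasant": 40.92},
    "placebo": {"neutral": 58.15, "unpleasant": 50.50},
}


def emotion_effect(group: str) -> float:
    """Unpleasant minus neutral rating; positive means unpleasant pictures raised pain."""
    cell = means[group]
    return cell["unpleasant"] - cell["neutral"]


nocebo_effect = emotion_effect("nocebo")
placebo_effect = emotion_effect("placebo")

# Difference of differences: the quantity underlying the Emotion x Group interaction.
interaction_contrast = nocebo_effect - placebo_effect

print(f"nocebo emotion effect:  {nocebo_effect:+.2f}")
print(f"placebo emotion effect: {placebo_effect:+.2f}")
print(f"difference of differences: {interaction_contrast:.2f}")
```

The opposite signs of the two within-group effects are what make this a crossover interaction, and they explain why the main effect of *Emotion* averages out to nearly zero while the interaction is large.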

#### Somatosensory Evoked Potentials— Conditioning Phase

As expected, during the conditioning phase, the physically more intense pain stimuli resulted in elevated SEP amplitudes. This was true for the early N1 component, *F*(1, 38) = 73.14, *p* < .001, *ηp*² = .66, and the subsequent P2, *F*(1, 38) = 19.11, *p* < .001, *ηp*² = .34. For both components, neither the interaction [**N1**, *F*(1, 38) = 0.70, *p* = .41, *ηp*² = .02; **P2**, *F*(1, 38) = 0.91, *p* = .35, *ηp*² = .02] nor the factor *Group* reached significance [**N1,** *F*(1, 38) = 1.14, *p* = .29, *ηp*² = .03; **P2**, *F*(1, 38)=0.43, *p* = .52, *ηp*² = .01], see **Figure 4**.

#### Somatosensory Evoked Potentials—Test Phase

Analysis of **N1** amplitudes during the test phase (when pain stimuli always had the same intensity) revealed neither a significant main effect of *Emotion*, *F*(1, 38) = 0.01, *p* = .93, *ηp*² < .01, nor a significant interaction, *F*(1, 38) = 0.47, *p* = .50, *ηp*² = .01. The between factor was marginally significant, *F*(1, 38) = 3.27, *p* = .08, *ηp*² = .08, likely due to more pronounced amplitudes in the placebo (*M* = -6.33) compared with the nocebo group (*M* = -3.71). The **P2** component similarly revealed no significant effect of *Emotion*, *F*(1, 38) = 2.75, *p* = .11, *ηp*² = .07; however, the interaction of *Group* × *Emotion* was significant, *F*(1, 38) = 7.44, *p* = .01, *ηp*² = .16. Separate ANOVAs for each group showed a significant main effect of *Emotion* only for the placebo group, *F*(1, 19) = 6.64, *p* = .02, *ηp*² = .26, due to higher mean amplitudes following neutral (*M* = 14.02) compared with unpleasant pictures (*M* = 11.89). The same analysis returned a nonsignificant main effect of *Emotion*, *F*(1, 19) = 1.04, *p* = .32, *ηp*² = .05, for the nocebo group; see **Figure 5**.

FIGURE 4 | The SEPs at the Cz electrode elicited by electrical pain stimuli during the conditioning phase. In the nocebo group (left), unpleasant pictures (red line) were paired with high-intensity pain stimuli, and in the placebo group, unpleasant pictures (green line) were paired with low-intensity pain stimuli. The gray lines represent neutral pictures, combined with either high- or low-intensity stimuli. The N1 (mean activity 75–125 ms) and the P2 (200–330 ms) components were significantly increased for high- compared with low-intensity pain stimuli in both experimental groups. All within group comparisons *p* < .05.
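The caption above defines the N1 and P2 as mean activity within fixed latency windows (75–125 ms and 200–330 ms). A mean-amplitude measure of this kind can be sketched as follows. This is only an illustration: the sampling rate, the assumption that the first sample coincides with stimulus onset, the function names, and the synthetic waveform are all hypothetical and are not taken from the study.

```python
import math

SAMPLING_RATE_HZ = 1000    # assumed for illustration; not stated in this section
N1_WINDOW_MS = (75, 125)   # window from the Figure 4 caption
P2_WINDOW_MS = (200, 330)  # window from the Figure 4 caption


def mean_amplitude(waveform, window_ms, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Average all samples whose latency lies inside window_ms (inclusive).

    waveform[0] is assumed to be the sample at stimulus onset (0 ms).
    """
    start_ms, end_ms = window_ms
    first = round(start_ms * sampling_rate_hz / 1000)
    last = round(end_ms * sampling_rate_hz / 1000)
    segment = waveform[first:last + 1]
    return sum(segment) / len(segment)


def synthetic_sep(t_ms):
    """Toy waveform (not study data): negative peak near 100 ms, positive near 260 ms."""
    n1 = -6.0 * math.exp(-((t_ms - 100) / 20) ** 2)
    p2 = 12.0 * math.exp(-((t_ms - 260) / 50) ** 2)
    return n1 + p2


waveform = [synthetic_sep(t) for t in range(0, 500)]  # one sample per ms at 1000 Hz
n1_mean = mean_amplitude(waveform, N1_WINDOW_MS)
p2_mean = mean_amplitude(waveform, P2_WINDOW_MS)
print(f"N1 mean amplitude: {n1_mean:.2f}")
print(f"P2 mean amplitude: {p2_mean:.2f}")
```

Averaging over a window rather than picking a single peak makes the measure less sensitive to noise in individual samples, which is one common reason for preferring mean amplitudes in ERP work.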

#### Visually Evoked Potentials During Conditioning and Test Phases

The 2 × 2 × 2 ANOVA for the analysis of the visually evoked LPPs revealed a significant main effect of *Phase*, *F*(1, 38) = 15.50, *p* = .001, *ηp*² = .29, which was the result of higher LPP amplitudes during the test compared with the conditioning phase (*M* = 1.68 vs. *M* = 3.00). Furthermore, the significant main effect of *Emotion*, *F*(1, 38) = 6.79, *p* = .01, *ηp*² = .15, was further qualified by a close to significant two-way interaction of *Emotion* × *Group*, *F*(1, 38) = 3.95, *p* = .054, *ηp*² = .09. Follow-up ANOVAs separately for each group revealed a significant main effect of *Emotion*, *F*(1, 19) = 13.39, *p* = .002, *ηp*² = .41, for the nocebo group, which is the result of elevated LPP amplitudes for unpleasant compared with neutral pictures. Interestingly, for the placebo group, the factor *Emotion* was far from being significant, *F*(1, 19) = 0.16, *p* = .70, *ηp*² < .01; see **Figure 6**. No other main effect or interaction reached significance, all *p*s > .22.

### DISCUSSION

In the current study, we addressed the question of whether a placebo/nocebo manipulation alters the pain-enhancing effect of emotions elicited by unpleasant picture stimuli and, if so, whether neurophysiological correlates of emotion processing change accordingly. Results demonstrate lower pain ratings for unpleasant pictures introduced as placebo compared with neutral control pictures. Further, in the placebo group only, negative pictures led to reduced P2 amplitudes of the SEP. In the nocebo group, in line with classical findings, unpleasant pictures led to more pronounced LPP amplitudes than neutral pictures. In the placebo group, by contrast, unpleasant (placebo) and neutral (control) pictures led to similar neurophysiological responses, suggesting that the placebo manipulation already affected the processing of the emotional stimuli and, consequently, processing of the pain stimuli.

### Pain Modulation by Pictures Indicating Placebo Hypoalgesia or Nocebo Hyperalgesia

Pain ratings and neurophysiological pain responses during the conditioning phase of the experiment demonstrated a clear differentiation between the two stimulus intensities, which suggests a successful manipulation of the participants' actual experience in line with the idea of reinforced expectations often used in placebo/nocebo designs (16, 18, 41). The high-intensity stimuli were rated as more painful and unpleasant and evoked larger amplitudes of the N1 and P2 components of the SEP in both groups. Regarding the pain intensity ratings, the difference between neutral and unpleasant pictures tended to be even stronger in the placebo group, which is a first hint at a critical role of top–down-driven expectations rather than invariant effects of emotions on pain: following the concept of motivational priming (5, 42), one might have expected the pain-increasing effect of unpleasant pictures to be *enhanced* in the nocebo group. Instead, the placebo manipulation led to an even more pronounced differentiation, suggesting a more prominent role of the reinforced expectations than of the unpleasant pictures. The generally elevated pain ratings in the placebo group might to some degree also be a consequence of the experimental manipulation; see the Limitations section for further discussion.

The results from the test phase demonstrate even more clearly the interplay of our placebo/nocebo manipulation and the modulation of pain by emotions. Participants of the nocebo group demonstrated the well-known pain-augmenting effect of unpleasant pictures (43). The placebo group, however, reveals a completely reversed pattern. Here, the unpleasant pictures, introduced as having a pain-easing effect, led to significantly *reduced* pain intensity and unpleasantness ratings—of physically identical pain stimuli—compared with the neutral pictures. This indicates that the placebo expectation in combination with the conditioning procedure was able to reverse the pain-increasing effect of negative emotions, reflecting an efficient top–down control of pain processing.

There is a long-standing debate regarding the roles of expectancy and learning (i.e., conditioning) in the formation of placebo effects [see for instance (44, 16)], and some even question whether placebo effects per se are anything but conditioning effects and suggest dropping the concept altogether (45). With regard to our present results, it is hard to tell whether the instruction at the beginning of the experiment or the placebo acquisition phase contributed to a greater extent to the final placebo effect. A previous study by our group showed that, in the case of psychologically mediated placebo/nocebo agents, both aspects are crucial. It might be interesting to test whether the present findings would replicate if only a conditioning procedure or only a placebo instruction was applied. Although research on expectancy effects on psychological pain modulation is rather sparse, the effect of placebo and nocebo expectations has repeatedly been shown for pharmacological pain interventions. For instance, it was demonstrated that the same dosage of pain medication is more effective if it is administered in a so-called open fashion (that is, the patient is well aware of receiving a medication, which generates a robust expectation of analgesia) than in a hidden application without explicit knowledge of the patient (46). Conversely, a nocebo expectation is capable of abolishing the effectiveness of a highly potent analgesic medication (47). Accordingly, the present findings suggest that the same might be true if the pain-modifying mechanism in question is based on psychological processes, here emotion-based pain modulation.

### Placebo Expectations and Emotion Processing

Analysis of the LPPs elicited by the picture stimuli showed that participants in the nocebo group clearly differentiated between neutral and unpleasant pictures, in line with previous studies on neurophysiological correlates of affective picture processing, which demonstrated a preferential processing of threatening stimuli (30, 31, 48, 49). However, the placebo group failed to exhibit discriminative LPPs for neutral versus unpleasant pictures, which might be due to an integration of emotional picture content and their alleged effect on pain. We suppose that the placebo manipulation changed the functional representation of the unpleasant pictures, since according to the instruction, those were now indicative of a positive outcome, which likely rendered them less threatening. In accordance with this interpretation, Bradley and colleagues found that physiological responses following the presentation of emotional pictures change if picture valence (positive vs. negative) operates as a cue for threat vs. safety, respectively (50). Threat cues provoked stronger physiological defense reactions, irrespective of the emotional picture content. In a similar paradigm where positive and unpleasant pictures alternatingly served as threat or safety cues, analysis of the LPPs demonstrated elevated amplitudes for pictures indicating potential danger (51). Altogether, these findings suggest that affective picture processing and emotional responding are susceptible to a top–down-driven modulation of motivational/functional significance. These results are further in line with a finding from research on emotion regulation, demonstrating that changing the meaning of an emotionally relevant scene, for instance by applying an alternative interpretation (reappraisal), leads to altered subjective and neurophysiological responses following picture processing (30, 52, 53).

#### Neurophysiological Pain Responses While Watching Pictures Indicating Placebo or Nocebo

SEPs during the test phase demonstrated no modulation of the early N1 component, neither in the nocebo group nor in the placebo group. However, previous studies found a significant modulation of the N1 solely for the comparison of unpleasant with positive pictures (3, 8). Accordingly, the contrast between neutral and unpleasant pictures was likely not strong enough, which might explain the absent N1 modulation, especially in the nocebo group. Similarly, studies on placebo effects measuring neurophysiological responses to short laser beams found no modulation of early components of the LEP (54, 55). The P2 component instead was modulated by the picture category, but only in the placebo group, such that unpleasant compared with neutral (control) pictures led to a significantly reduced amplitude. This is in line with earlier findings demonstrating that emotional compared with neutral pictures reduce the P2 following electrical stimuli (3, 8). Furthermore, studies investigating placebo effects on LEPs found that a placebo manipulation reduced the P2 component or N2/P2 complex, respectively (54, 55). Given that participants in the placebo group showed little differentiation between the picture categories as indicated by similar LPP amplitudes, this might demonstrate an interference of emotion processing by the placebo manipulation. We conclude that the reduction of the P2 component likely is driven more strongly by a placebo effect than by the arousing content of the pictures. In the nocebo group, however, the instructed pain-augmenting effect of unpleasant pictures did not provoke any conflict between picture content (negative) and functional significance (negative). Here, in line with previous studies on nocebo-like cueing effects reporting elevated LEPs (56), the experimental manipulation probably led to an increase of the P2 component during unpleasant picture presentation, which compensated for the expected P2 decrease by high-arousing pictures found previously. Yet, the nocebo effect apparently was not strong enough to produce a significant potentiation of the P2 by unpleasant *nocebo* pictures, exceeding the responses following the neutral control pictures.

#### Limitations

Although the ratio of female and male subjects was equal within and across groups, due to the small total sample, a moderation of the reported findings by the participants' gender cannot be excluded. Future studies should incorporate larger sample sizes to explore gender effects in more detail and to control for the sometimes high variability in placebo and nocebo designs. Furthermore, even though the experimental groups varied only very little with regard to the individual pain threshold and the pain stimuli administered later on, participants of the placebo group reported higher pain intensity and unpleasantness ratings in general, despite similar SEP amplitudes. Results of the post-experimental ratings, in which participants of the placebo group indicated a relative pain-increasing effect of neutral pictures compared with the nocebo group, might be suggestive of an overall overestimation of pain in the placebo group, leading to elevated pain ratings; see **Table 1**. However, evidence for this interpretation is inconclusive and might be corroborated in future studies that obtain measures of the participants' expectations at the beginning of the experiment. In a similar vein, we decided against trial-by-trial affective ratings of the emotional picture stimuli. These might have been informative with regard to the findings from the visual evoked potentials but would, at the same time, have led to an excessive length of the whole experiment. Future studies should complement physiological affective responses with subjective measures of emotion and expand the stimulus set by a positive valence category. With regard to state affect, participants in the nocebo group presented somewhat higher anxiety scores, which may result from the nocebo instruction. The difference in state anxiety might have influenced the present findings; however, mean scores of both groups indicate very moderate levels of state anxiety. Lastly, the bar electrode used in the present design might have led to muscle contraction artifacts contaminating SEP findings. Given the very similar stimulation intensities between groups, artifacts are unlikely to explain group differences. The problem of potential artifacts could be addressed in future studies, for instance by using ring electrodes (57).

### Conclusion and Outlook

The present study demonstrated an interaction of emotions and reinforced expectations on pain processing. We showed that a placebo manipulation (verbal instruction + placebo conditioning) is able to modulate and even reverse the genuine pain-increasing effect of unpleasant pictures. We assume that the placebo manipulation altered the processing of emotional pictures themselves, such that unpleasant pictures, expected to exert a positive effect on pain, were perceived as less arousing. This interpretation is in line with previous research demonstrating the modulatory influence of threat manipulations on physiological correlates of emotion processing (51, 58). These findings underline the important role of higher order expectations on pain processing and the effectiveness of psychological placebo effects as shown previously (18).

These processes deserve further explorations in future studies, investigating the interaction of placebo/nocebo expectations with other well-established emotional and cognitive factors impacting pain. For instance, it might be worthwhile to investigate whether a nocebo expectation of, e.g., pain exacerbation caused by highly demanding cognitive tasks, actually hampers the pain decrease following manipulations of attention allocation (59, 60). The same might be true for the modulation of pain by emotion regulation strategies such as reappraisal or suppression (61). A placebo vs. nocebo manipulation suggesting high vs. low effectiveness of pain regulation might block or even potentiate its pain-modifying capacities.

### DATA AVAILABILITY

Data is available upon request.

### ETHICS STATEMENT

All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the institutional review board of the medical faculty of the University of Würzburg.

### AUTHOR CONTRIBUTIONS

PR and MW designed the study. CM collected the data. PR analyzed the data. PR, MW, CM, and PP wrote the manuscript.

### FUNDING

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) to project number 44541416-TRR 58 (B01), and to the Research Group "Emotion and Behavior," FOR 605, Wi2714/3-2. This publication was funded by DFG and the University of Würzburg in the funding programme "Open Access Publishing."

### ACKNOWLEDGMENTS

Special thanks to LHY for valuable inspirations during the preparation of the manuscript.

### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Reicherts, Pauli, Mösler and Wieser. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Placebos as a Source of Agency: Evidence and Implications

#### *Phoebe Friesen\**

*Biomedical Ethics Unit, Social Studies of Medicine, McGill University, Canada*

Bioethical discussions surrounding the use of placebos in clinical practice have long revolved around the moral permissibility of deceiving a patient if it is likely to benefit them. While these discussions have been insightful and productive, they reinforce the notion that placebo effects can only be induced through deception. This paper challenges this notion, looking beyond the paradigmatic clinical encounter involving deceptive placebos and towards many other routes that bring about placebo effects. After briefly describing the bioethical terrain surrounding the deceptive use of placebos in clinical practice, section 1 offers an examination of the various mechanisms known to contribute to placebo effects: classical conditioning, expectations, affective pathways, open-label placebo treatments, and additional factors that do not fall easily into a single category. The following section explores how each of these routes can be harnessed to bring about clinical benefits without the use of deception. This provides grounding for reconceiving of the placebo effect as a clinical tool that is not always in conflict with patient autonomy and can even be seen as a source of agency. In the final section, implications of the shift away from seeing placebos as necessarily deceptive are discussed. These include the necessity of looking beyond the clinical encounter and mainstream medicine as the primary sites of placebo responses, how important acknowledging the limits of placebo effects will be when we do so, as well as the difficulties of disentangling agency, responsibility, and blame within medicine.

#### Keywords: placebo effect, deception, agency, expectancy, conditioning, open-label treatments, psychosomatic conditions

"The placebo, as traditionally used, could be called the lie that heals. But a satisfactory understanding of the nature of the placebo effect shows that the healing comes not from the lie itself, but rather from the relationship between healer and patient, and the latter's own capacity for self-healing *via* symbolic and psychological approaches as well as *via*  biological intervention" (1)

### INTRODUCTION: "THE LIE THAT HEALS"

#### *Edited by:*

*Seetal Dodd, Barwon Health, Australia*

#### *Reviewed by:*

*Stewart Justman, University of Montana, United States Maria Serena Panasiti, Sapienza University of Rome, Italy*

#### *\*Correspondence:*

*Phoebe Friesen phoebe.friesen@mcgill.ca*

#### *Specialty section:*

*This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry*

*Received: 28 February 2019 Accepted: 09 September 2019 Published: 25 October 2019*

#### *Citation:*

*Friesen P (2019) Placebos as a Source of Agency: Evidence and Implications. Front. Psychiatry 10:721. doi: 10.3389/fpsyt.2019.00721*

Discussions of the placebo effect in clinical practice have long contended with themes of deception, paternalism, and violations of autonomy. In 1907, Richard Cabot (2) argued that "every placebo is a lie, and in the long run the lie is found out". Arnold et al. (3) described the state of play more recently: "Conscious, deliberate, or incidental/unwitting utilization of the placebo effect is characterized as deceptive, unethical, unscientific, and unprofessional." Similarly, Kolber (4) reports on how placebo treatments are referred to by some as medicine's "dirty little secrets." In line with these associations, most bioethical discussions of the placebo effect revolve around the moral permissibility of using deception within the clinical encounter if it is likely to benefit the patient.<sup>1</sup> A great deal has been written on this topic, examining the conflict that arises between two central values within medicine, autonomy and beneficence, and weighing the harms and benefits that fall out of prioritizing one over the other (5–9).

Many have defended deception within the clinical encounter. Kihlbom (10) and Shaw (11) have argued that a limited version of consent, which can maximize the benefits of placebos, is sufficient while Barnhill (12) has defended a view in which informed consent and deceptive placebo use need not be seen as incompatible [drawing on (13)]. Miller and colleagues, and later, Alfano (14), have argued that deception is permissible as long as patients consent to it first (sometimes called "authorized deception") (15, 16), while Kolber (4) has defended deception on the basis of evidence that patients would prefer to benefit than to be told the truth. On the other side of the debate, many have focused on the harms that might result from the deceptive use of placebos within clinical practice. Blease (17) has suggested that asking patients to authorize deceptive placebo treatments might, paradoxically, lead to worse outcomes by way of nocebo effects,<sup>2</sup> while Asai and Kadooka (18) argue that "the clinical use of placebo and its acceptance would encourage undesirable labeling and contempt for the patient." Others point out that deceptive placebo use threatens trust and therefore care (19, 20). As Golomb (21) has noted, "The willful breach of trust by doctors to patients on a policy basis may corrode not just *that* physician's relationship with *that* patient, but may tarnish the reputation of all physicians as trustworthy purveyors of medical advice—abrogating all physicians' effectiveness, always."

More recently, bioethical discussions of placebos and deception have also focused on the nocebo effect, asking whether information regarding potential negative side effects of a treatment should be withheld from a patient during informed consent if providing that information makes it more likely that the patient will experience negative side effects<sup>3</sup> (24–27). While closely related to the conflict that arises between beneficence and autonomy when deceptive placebos are prescribed, this discussion changes tack ever so slightly, examining the tension between nonmaleficence (the avoidance of harm) and patient autonomy.<sup>4</sup> Proposed solutions include authorized concealment (8, 15), tailoring the informed consent process to the individual (26), and taking into account the specificity and likelihood of each patient's potential nocebogenic symptoms (25, 27).

Bennet Foddy offers a defense of deceptive placebo use which relies on the description of several cases in which deceptive placebo use is portrayed as the least bad option. These cases include an individual who experiences an improvement in depressive symptoms even though they have been prescribed an ineffective dose of an antidepressant, a patient who has irritable bowel syndrome (IBS) which lacks effective treatments and is responsive to placebo treatments, and a clinician working in a warzone where there are no available treatments. In these cases, Foddy argues, deceptive placebo use is recommended. Since it is the best treatment available, he suggests, it involves "a type of deception that patients ought to be thankful for, just as we are thankful when we receive a mendacious compliment from a friend" (6).

While Foddy may be right that the least bad option is often the best one, it is not clear that any of the cases he presents require deception in order to produce placebo responses. As a result, the least bad option might not be deceptive placebo use, but honest and open placebo use. As I hope to demonstrate below, Foddy and many others who have engaged in bioethical discussions surrounding the deceptive use of placebos have limited themselves to a narrow subset of cases involving the placebo effect. These cases all take place within the clinical encounter and involve a doctor lying to her patient in order to bring about positive expectations surrounding treatment outcomes. If we follow the evidence, however, and examine the myriad ways in which placebo responses are produced, it is no longer obvious that deceptive placebo use ought to take center stage. Rather, placebos emerge as a promising tool for promoting patient autonomy, not merely violating it. In line with this, I make the case below that we should reconsider the age-old association between placebos and deception and examine instead the many ways in which placebos can enhance agency. Agency, in this case, can be thought of as the capacity to act which, in cases of non-deceptive placebo use, results from an increase in available routes by which suffering can be relieved. This capacity can be contrasted with the loss of agency that accompanies dishonest placebo prescriptions, in which patients are unaware of their choices regarding their medical care.

In the next section, I will briefly describe what we know about the mechanisms underlying placebo responses. Building on this evidence base, in the following section, I will argue that there are many ways in which placebo responses can be produced without the use of deception and that non-deceptive routes of placebo intervention ought to be seen as tools that can support the agency of patients. Finally, I will discuss several implications, and the ethical questions surrounding them, that fall out of conceiving of placebo effects as a source of agency.

#### WHAT WE KNOW: THE PRODUCTION OF PLACEBO RESPONSES

While defining the placebo effect is inevitably a contentious task, there is some agreement within the field of placebo studies about how placebo responses are produced.<sup>5</sup> Different theorists tend to place different boundaries around what counts as a placebo effect and divide what falls within those boundaries into different categories. These boundaries and categories are shaped both by empirical evidence and decisions made by theorists. These decisions, in different cases, are informed by ordinary language use, pragmatic arguments, aesthetic appeal, or desires to conserve or break with the past. Here, resting on both empirical evidence and pragmatism, I have divided up the evidence related to placebo effects in a way that will help demonstrate the role they might play in enhancing agency. Below, I briefly discuss what we know of placebo responses brought about by 1) classical conditioning, 2) expectations, 3) affective pathways, 4) open-label placebo treatments, and 5) additional factors, before going on to link these categories to agency in the following section. While this may become clear in the discussion of evidence that follows, it is worth noting at the outset that some symptoms and conditions are much more responsive to placebo treatments than others. These include pain, both acute and chronic, mood and anxiety disorders, psychogenic movement disorders, autoimmune disorders, and functional somatic syndromes, many of which also lack effective treatments (30–37). Viruses and tumors do not appear to be impacted by placebo treatments, although related symptoms such as hot flashes, fatigue, and nausea, often are (38, 39).

<sup>1</sup> There is also a significant body of bioethical literature concerned with the use of placebo controls in research, but this literature revolves around the placebo as a control rather than as a phenomenon to be harnessed.

<sup>2</sup> Nocebo effects are akin to placebo effects but involve negative clinical outcomes rather than positive ones.

<sup>3</sup> For example, when men being prescribed finasteride for benign prostatic hyperplasia (prostate gland enlargement) were split into two groups, one of which was warned of potential sexual side effects and one of which was not, 44% of those who were warned reported experiencing sexual dysfunction, compared to only 15% of those who were not warned (22). Such side effects have also been reported as a result of finasteride outside of the research setting, but it is unclear whether they were induced through nocebo mechanisms or not (23).

<sup>4</sup> The relationship between beneficence and nonmaleficence is often an ambiguous one. As Veatch has pointed out in relation to the Belmont Report, it is unclear whether "beneficence" is meant to capture both beneficence and nonmaleficence as two sides of the same coin, or whether they should be seen as distinct values (28).

### Classical Conditioning

The role of classical conditioning in bringing about placebo effects has long been recognized (40). Classical conditioning involves the repeated pairing of two stimuli until the result ordinarily produced by one begins to be produced by the other [e.g., Pavlov's famous experiment in which the sound of a bell produces salivation in a dog after being paired with food enough times (41)]. Conditioned placebo responses have been documented within the endocrine and immune systems and do not appear to be impacted by conscious beliefs or expectations (42). For example, the repeated pairing of cyclophosphamide with anise-flavored syrup led to a reduction of white blood cells (the usual result of cyclophosphamide) merely in response to anise-flavored syrup (42). Prior experience also appears to have a significant impact on analgesic (pain reduction) placebo responses, which can last several days, although, at least in acute cases of pain, these conditioned responses appear to be canceled out by negative expectations of an increase in pain (43, 44).

#### Expectations

Expectations, which can be shaped by verbal manipulations, patient beliefs, or contextual factors, appear to impact placebo responses across a variety of symptoms and experiences, including, but not limited to, acute and chronic pain, nausea, inflammation, asthmatic reactions, and motor control (43, 45–47). The role of expectations is evidenced by research demonstrating that analgesic treatments are significantly more effective when patients are told they are receiving them than when the same treatments are administered intravenously and activated from another room without the patient being told (43, 48). Relatedly, in clinical trials involving treatments for major depressive disorder, the higher the chances of participants receiving the active intervention (as in trials with more active arms), the more placebo responses occur (49). Similarly, when patients believe they are likely to benefit from a treatment, they are more likely to do so. In a trial in which participants with low back pain received either massage or acupuncture, participants' expectations about treatment predicted their outcomes better than the treatment they actually received; those with high expectations benefitted much more (50).

#### Relational Components

A significant body of research has also documented the importance of the therapeutic alliance in bringing about placebo responses. In one experiment, patients with a common cold who rated their practitioner as high in empathy were found to have colds that were shorter in duration and less severe than those who perceived less empathy; these patients were also found to have increased immune responses (51). Perceptions of warmth and competence in a practitioner have also been found to promote healing, as evidenced by reduced allergic responses in patients who rated their practitioners as having these qualities (52). Two experiments, one involving patients with IBS and another involving patients with chronic low back pain, both found that additional time and support within the clinical encounter led to significant positive changes in patient outcomes (53, 54). There is also evidence for a correlation between high patient ratings of trust in their practitioner and improved clinical outcomes (55). A growing body of evidence is beginning to unpack why and how we have evolved to be so responsive to empathy, compassion, and those designated healers in our communities, as well as the neural and physiological mechanisms underlying these responses (56).

#### Open-Label Placebo Treatments

Growing research on open-label placebo treatments suggests that even when patients are told that they are taking placebo pills which contain no active ingredients, such treatment can lead to significant improvements. This has been demonstrated in patients with IBS, migraines, allergic rhinitis, chronic low back pain, and children with attention-deficit/hyperactivity disorder<sup>6</sup> (57–61). In one of these trials run by Kaptchuk et al., participants with IBS were recruited and randomized to receive either no treatment or an open-label placebo. Those in the open-label condition took two placebo pills each day and were instructed to think about the potential power of placebo effects. At the end of 3 weeks, these patients scored significantly higher than the no treatment control group on measures of both quality of life and symptom reduction (58). It is not clear what the mechanisms behind open-label placebo responses are. While recent reviews of the phenomenon have suggested that classical conditioning, expectations, and social support may all contribute (62, 63), others have suggested that these explanations are insufficient and that theories of embodied cognition and Bayesian predictive processing might better account for the success of open-label treatments (64).

<sup>5</sup>Although for an interesting argument that there is more consensus within the field than is often acknowledged, see (29).

<sup>6</sup>Note that conclusions drawn from the trial by Sandler and Bodfish with children with a diagnosis of attention-deficit/hyperactivity disorder should be limited. Teachers, who were blinded, did not find that open label placebo plus a 50% dose of medication was as effective as a 100% dose, while parents and clinicians, who were not blinded, did.

#### Additional Factors

There are also several sources of placebo responses where the mechanisms at work are still unclear. Possibly linked to expectation-based placebo responses is evidence that suggests that placebo effects increase when one is given a choice of what analgesic to take (65), when a treatment is thought to be expensive (66), and when a treatment is invasive (67–69). Conditioning might explain greater placebo responses being derived from more frequent interventions (70, 71) or greater adherence to a treatment (72), while relational components may contribute to better outcomes in patients with nonspecific chest pain who received more diagnostic tests, despite these tests having no impact on treatment (73). Social learning (e.g., watching another person experience pain relief from a particular treatment) also contributes to analgesic placebo responses, which could be a result of either expectations or conditioning (74).

### AN ALTERNATIVE VIEW: PLACEBOS AS A SOURCE OF AGENCY

In a discussion of the role of the placebo effect in clinical practice, Alfano (14) acknowledges that "deception is not required to alter a patient's expectations, to classically condition them, or to modulate their somatic attention," and yet, he recommends the use of authorized deception and concealment within the clinical encounter. He argues that obtaining consent to deceive patients will contribute to increases in placebo responses through expectations and encourage greater adherence, promoting conditioned placebo responses. Such recommendations emphasize the importance of deception in bringing about placebo effects, promoting a picture that fails to recognize how placebo effects can be brought about without dishonesty. What if, rather than focusing on how deception can bring about placebo responses, we looked to the ways in which placebo effects can be used in conjunction with patient autonomy? In this section, following from the placebo pathways presented in the previous section, I will demonstrate how each of these can be manipulated in order to enhance agency rather than deny it.

#### Classical Conditioning

Placebo responses brought about by way of classical conditioning have little need for deception, as a result of their tendency to remain disconnected from cognitive processes. In particular, conditioning can be used to enhance an existing therapeutic response and, in some cases, to reduce one's medication dosage in order to avoid side effects while maintaining the same level of efficacy (34, 39, 75). Dose reduction *via* placebo conditioning has been demonstrated to be effective with antihistamines for allergic reactions, methadone for those with opioid use disorder, melatonin for children with difficulties sleeping, antipsychotics for individuals diagnosed with schizophrenia, and corticosteroids for the treatment of psoriasis (76–81).<sup>7</sup> This suggests that classically conditioned placebo responses can be used to support tapering or weaning off a medication entirely, opening up new avenues for patients for whom treatments are effective but cannot be sustained. Some groups that might benefit from such conditioned placebo responses include those who are unable to afford a medication, those who wish to taper their dose of a treatment because of negative side effects, or individuals with complex pharmaceutical regimens who hope to avoid adverse interactions between drugs (83). As mentioned above, awareness of the conditioning process does not appear to impact conditioned immune and endocrine responses, so there is no need for deception. This is slightly more complicated in the case of pain, where negative expectations appear to overrule positive classical conditioning that has come before. This suggests that, at least with regard to acute pain, conditioned placebo responses may need to be generated along with expectation-based placebo responses.

#### Expectations

Expectation-based placebo responses are a more complicated case, in that deceptive placebo use is primarily based on intentions to manipulate patient expectations. However, such an approach to placebo use assumes that one's expectations related to one's clinical outcomes are entirely created within the clinical context. While the doctor's words may have a significant impact on what one anticipates, many other sources outside of the doctor's office contribute to shaping patient expectations as well. These sources include, but are not limited to, past experiences, information that one has read online, stories one has been exposed to about similar cases, related narratives in the media and popular culture, and what friends and family members have led one to expect. For example, joining a support group of individuals who have learned to live well despite the presence of chronic pain may alter one's expectations of one's own pain, leading to a reduction in suffering. As a result, individuals who are struggling with the kinds of symptoms and conditions that tend to be placebo responsive can actively shift their own expectations through exposing themselves to particular information and narratives, which are more likely to produce placebo, rather than nocebo, responses in themselves. Similarly, they can choose treatments that they believe are likely to work and that align with their values, thereby increasing the chances that they will (84).

#### Relational Components

Deception and violations of autonomy are certainly not required to produce placebo responses in patients by way of relational components like warmth, empathy, and trust. These fall naturally out of positive clinical encounters. Efforts can be made to spend more time with patients and listen to them more carefully, as these are likely to increase placebo responses, particularly in those conditions that tend towards robust placebo effects. Furthermore, while there is a great deal of placebo literature focusing on the impact of aspects of the clinical encounter, it may be that these benefits can be gained through other social encounters as well. There is ample evidence that social support makes a difference to many clinical outcomes, particularly those related to mental and cardiovascular health (85–87). It is unclear whether warmth, trust, and empathy, in the clinical encounter, lead to improvements in wellbeing through the same mechanisms that warmth, trust, and empathy, outside of the clinical encounter do, but it is worth cashing in on both avenues. If research suggests that both a positive clinical encounter and a few hugs a day (see 88) are likely to be protective against illness, this suggests that there are multiple routes by which individuals can seek to boost their own placebo responses through supportive relationships.

<sup>7</sup>There is even evidence in rats that placebo conditioning of heart allografts can prolong transplant survival (82).

#### Open-Label Placebo Treatments

Open-label placebo treatments are probably the most obvious way in which placebo responses can be harnessed without the use of deception because they involve a complete disclosure that the treatment is a placebo. While the evidence base is still quite limited, the research that does exist suggests that these treatments hold promise. The diversity of conditions that have been found to improve through open-label placebo treatments indicates that there may be many more worth exploring; as mentioned above, these include IBS, migraines, allergic rhinitis, and chronic low back pain. As with conditioned placebo responses, those who may be most likely to benefit may be individuals who cannot afford ordinary treatments, those who require polypharmaceutical regimens, or those who experience significant side effects from a particular treatment.

#### Additional Factors

Finally, the grab bag of routes that appear to lead to placebo effects, but that we do not currently understand well, is likely to offer additional tools by which individuals can benefit from placebo responses without the use of deception. If more invasive treatments appear to lead to better outcomes than noninvasive ones, then perhaps pairing a particularly pungent drink with one's medication or treatment can be of value. If frequency of treatment and adherence to a treatment also impact clinical outcomes, patients can divide pills into smaller doses to increase frequency and use reminders to increase their adherence in order to tap into these potential increases in efficacy. Similarly, if social learning contributes to placebo responses, exposing oneself to success stories of individuals who have recovered from a similar experience may be worthwhile.

### IMPLICATIONS: ADVANCING BIOETHICAL DISCUSSIONS OF PLACEBOS

As evidenced above, the link between deception and placebo treatments is not a necessary one. Placebo effects are produced through many avenues which can be harnessed through nondeceptive means. Acknowledging these routes of placebo intervention is likely to advance bioethical discussions of placebo effects beyond questions concerning the moral appropriateness of dishonesty for the sake of clinical benefit. While considering the conflict that arises between beneficence and autonomy during deceptive placebo use is an important ethical issue, it is not the only issue pertinent to discussions of placebo treatments within medical ethics. In this section, I discuss four implications that fall out of shifting away from focusing on placebo treatments as associated with deception and towards seeing placebos as a source of agency. These implications raise new ethical questions that appear on the placebo landscape once we look beyond deceptive use, some of which I flag within the discussion below.

#### Looking Outside the Clinical Encounter

The first implication is that recognizing the role of agency in placebo effects takes us beyond the clinical context and requires us to see the potential for promoting placebo effects in several other realms. Rather than thinking only of the question of whether doctors should lie to patients for their medical benefit, examining the mechanisms underlying placebos and how they can promote agency reveals the significant role that placebo effects play in many domains of our lives. Many have pushed towards expanding the boundaries of the sources of placebo effects before. Miller and Kaptchuk (89) have suggested that "instead of focusing exclusively on the therapeutic power of medical technology and thereby ignoring or dismissing context, we should see the context of the clinical encounter as a potential enhancer, and in some cases the primary vehicle, of therapeutic benefit." Even beyond the context of the clinical encounter, however, there are routes by which expectations are shaped, associations are created, and relationships may contribute to placebo responses. Narrowing in on the mechanisms by which placebo responses are created leads one to recognize the significant roles that nonmedical contexts (e.g., online spaces, workplaces, schools) and nonmedical people (e.g., friends, family, characters) are playing in shaping both placebo and nocebo effects. For example, if social support and empathy bring about placebo responses for many conditions, it is crucial that we look to the networks and relationships individuals are embedded in as a source of placebo effects, as well as what happens in the doctor's office. Of course, such networks and relationships, or a lack thereof, can also be the source of nocebo effects.

Shifting our attention outside of the clinical encounter and towards other spaces in which placebo responses are likely to be generated allows us to see many more settings and influences that are relevant to discussions of the placebo effect. Rather than merely focusing on the doctor's office, we can begin to examine the role of individual and collective rituals and stories, social settings and communities one partakes in, and the many relationships one is embedded in, in producing placebo responses. Evidence related to "placebo by proxy" supports this extension, demonstrating, perhaps unsurprisingly, that sometimes placebo effects in children may be mediated more by their parents than by their doctor (90). Looking beyond the white walls and white coat leads to difficult ethical questions related to what falls within the bounds of medicine and what the responsibilities of healthcare professionals might be in relation to placebo responses that take place outside of their territory. If it is the case that many factors that may influence placebo responses are outside of the health-care system, is there a responsibility to communicate with patients about these influences within the process of informed consent? If so, what should they be told? Should the sources of placebo effects merely be described, and a warning given, or should recommendations regarding how to enhance placebo effects and avoid nocebo effects be offered? Furthermore, does recognizing the wider scope of placebo influences have implications for how patient support networks should be run or for potential additional variables that ought to be controlled for within clinical trials?

#### Looking Outside Mainstream Medicine

Broadening our examination of the territory of placebo phenomena also allows us to look beyond mainstream or Western medical contexts, which provide the setting for a great deal of placebo research. Given the mechanisms underlying placebo responses, it seems likely that practitioners of complementary, alternative, and traditional medicines are contributing to placebo effects regularly (91, 92). This is because many of the features that tend to enhance placebo responses, particularly in relation to expectations and relational components, tend to show up in these forms of medicine, and because the conditions that people most frequently seek these treatments for are ones that tend to be highly responsive to placebo treatments (93). If evidence suggests that choosing a treatment that aligns with one's values can enhance placebo responses, what does this mean for treatments that do not fall within the evidence base but that many people would like to receive? How can this be taken into account within systems of evaluating the efficacy of treatments?

Challenging evidence related to these questions comes from a recent examination of the impact of different components of homeopathy on clinical outcomes in patients with rheumatoid arthritis. The findings suggest that whether one takes part in the homeopathic consultation, which is often quite extensive, involves particular attention to the therapeutic alliance, and is likely to generate hope and positive expectations, is more predictive of positive clinical outcomes than whether one receives a homeopathic treatment (94). This raises interesting ethical questions regarding the role that complementary, alternative, and traditional medicines ought to play or not play within health care. One might argue, based on this research, that homeopathy is merely a form of deceptive placebo use, and yet, it is possible that an open-label placebo treatment involving homeopathy would be effective for some people and some conditions. Does this suggest that we should make such treatments more widely available, given the difficulty of finding such elaborate care in mainstream medicine? Or does this mean that practitioners should be required to fully disclose which components of the treatment are likely to be contributing to positive outcomes and which are not? How should medical practices that primarily offer therapeutic effects *via* placebo responses be regulated?

#### Acknowledging the Limits of Placebo Treatments

Related to this is the importance of being clear about which cases leave room for improvement through the manipulation of placebo responses and which do not. As mentioned above, there are some types of symptoms and conditions that tend to be highly responsive to placebo treatments (e.g., pain, mood, anxiety, psychosomatic symptoms or conditions) while others do not appear to be impacted at all (e.g., viruses, tumors). Unfortunately, there is a risk that acknowledging placebo use as a source of agency could lead to creating, or further cementing, inaccurate beliefs about where placebo treatments can be effective. This could occur if excitement generated about having the ability to impact one's own wellbeing in one domain bleeds into another domain, leading claims about the success of alternative treatments for IBS and the success of alternative treatments for cancer to be seen as equivalent when, based on what we know about the placebo effect, these two claims ought to be treated very differently.

It is well documented that an interest in alternative medicine aligns with a higher likelihood of refusing conventional therapies for cancer, which is linked to higher mortality rates, and with a greater tendency towards vaccine hesitancy (95–97). To acknowledge that some alternative, complementary, and traditional therapies may be quite effective in treating some symptoms and conditions by way of the placebo effect could indirectly encourage beliefs that all medical problems are treated equally well by such therapies. While there is a significant amount of research yet to be done that will better allow us to demarcate the boundaries of placebo potential, it is important to be as honest as possible at this point about what placebo responses can and cannot do for people. The risks and rewards that are likely to accompany experiences of seeking alternative care for chronic pain look very different from the risks and rewards that are likely to accompany experiences of seeking alternative care for lung cancer. Recognizing these limits raises questions related to how placebo research ought to be responsibly reported in scientific publications and the media, how clinicians working in integrative medicine should communicate with patients and the public about the evidence and mechanisms underlying the treatments they offer, and what research priorities in the field of placebo studies ought to be.

#### Enhancing Agency Without Enhancing Blame<sup>8</sup>

Finally, given that an emphasis on individual agency within health conditions often brings with it an attentiveness to individual responsibility and blame, we ought to be careful in exploring the links between placebo effects and agency. Particularly when considering the capacity for individuals to produce nocebo effects, which produce negative rather than positive outcomes, it is crucial that we do not burden individuals with the weight of responsibility and blame for their own suffering (99). This is especially relevant with regard to conditions characterized as psychosomatic, many of which tend to show robust responses to placebo treatment. These conditions, however, are already among the most stigmatized within medicine, in large part because there is a tendency to characterize conditions in which psychological and somatic symptoms interact as less real or as being "all in the head" (100–103). As Greco (104) has suggested, what distinguishes biomedical and psychosomatic conceptions of illness is "a shift from aetiological or causal explanations to explanations that might be termed 'dispositional'." This shift leads to an understanding of psychosomatic conditions as associated with an individual's moral failings, in part because the "perception of a need for medical care is not corroborated by a medical diagnosis based on physio-chemical evidence" (104).<sup>9</sup>

<sup>8</sup>A nod to Hanna Pickard's useful notion of responsibility without blame (98).

This suggests that we ought to be very careful in embracing the potential of conditioned, expectation-based, open-label, and relational placebo effects in these conditions, in that we do not want to create more stigma, and more harm, by reinforcing notions of blame and moral failing in these patients. Recognizing this tension raises questions about how best to utilize these tools without directing attention to blame and responsibility. Is it likely that thinking of these routes of intervention as placebo effects will reinforce stigma within these patient populations? How might we better characterize placebo phenomena so that they can be harnessed while causing the least harm possible? Would we be better off focusing on the individual routes by which outcomes are improved (e.g., conditioning, expectations) rather than thinking of placebo effects as a whole, as suggested by Alfano (14), or throwing out the term entirely, as suggested by Nunn (106)?

A broader version of this concern relates to how noting links between agency and wellbeing can promote healthism, which views health as a private resource that individuals are responsible for securing for themselves (107, 108). If we place the responsibility on individuals to ensure that they harness these agential placebo effects, we may end up alienating them rather than motivating them. Furthermore, not everyone has equal access to the resources that might allow them to benefit from these nondeceptive placebos, including a warm and empathetic clinician, the time and money for reiki, or unlimited hugs.<sup>10</sup>

### CONCLUSION

The placebo effect has long been associated with deception, lies, and clinical paternalism. While these associations are grounded in common ways in which the phenomenon has been, and continues to be, manipulated in clinical practice, these associations are not inherently linked to the phenomenon. As we learn more about the routes by which placebo effects can be generated, it is becoming clear that deception is not a necessary component of placebo prescription, but an accidental one. Placebo responses can operate by way of conditioning, expectations, relational factors, open-label placebo treatments, and other routes, which do not require a patient to be deceived. Recognizing the diverse ways in which patients can benefit from placebo effects without deception not only allows us to see a much greater potential in the phenomenon but also significantly widens the scope of ethical issues that we must contend with. In this manuscript, I hope to have gestured towards some of the bioethical issues that are likely to arise as the placebo effect continues to shed its "legacy of trickery" and becomes recognized as a powerful phenomenon that does not always need to lie to get its way (110).

#### AUTHOR CONTRIBUTIONS

This manuscript was conceptualized, researched, and written by PF.

#### FUNDING

This research was supported by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre, grant BRC-1215-20008 to the Oxford University Hospitals NHS Foundation Trust and the University of Oxford. The views expressed are those of the author and not necessarily those of the NHS, the NIHR or the Department of Health.


<sup>9</sup>This relates to the significant disagreement between patient organizations and medical authorities over the status of myalgic encephalopathy (also known as chronic fatigue syndrome) as a psychological (favored by clinicians) or physical (favored by patients) condition (105) [although see (103) for criticisms of the methodology used by Hossenbaccus and White].

<sup>10</sup>This also suggests that some will be least well off when it comes to benefitting from placebo effects, both those arising within the clinical encounter and those arising outside of it [see also (109) on this topic].


hormonal placebo/nocebo responses. *J Neurosci* (2003) 23(10):4315–23. doi: 10.1523/JNEUROSCI.23-10-04315.2003


of social isolation and loneliness. *Public Health* (2017) 152:57–71. doi: 10.1016/j.puhe.2017.07.035


**Conflict of Interest:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Friesen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The Healing Encounters and Attitudes Lists (HEAL): Psychometric Properties of a German Version (HEAL-D) in Comparison With the Original HEAL

Heike Gerger<sup>1</sup>\*†, Sarah Buergler<sup>1</sup>, Dilan Sezer<sup>1</sup>, Marc Grethler<sup>1</sup>, Jens Gaab<sup>1</sup> and Cosima Locher<sup>1,2</sup>†

<sup>1</sup> Division of Clinical Psychology and Psychotherapy, Faculty of Psychology, University of Basel, Basel, Switzerland, <sup>2</sup> School of Psychology, University of Plymouth, Plymouth, United Kingdom

#### Edited by:

Katja Weimer, University of Ulm, Germany

#### Reviewed by:

Carol M. Greco, University of Pittsburgh, United States

Ben Colagiuri, University of New South Wales, Australia

#### \*Correspondence:

Heike Gerger heike.gerger@gmail.com

† These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 28 February 2019 Accepted: 13 November 2019 Published: 10 January 2020

#### Citation:

Gerger H, Buergler S, Sezer D, Grethler M, Gaab J and Locher C (2020) The Healing Encounters and Attitudes Lists (HEAL): Psychometric Properties of a German Version (HEAL-D) in Comparison With the Original HEAL. Front. Psychiatry 10:897. doi: 10.3389/fpsyt.2019.00897

Introduction: In recent years, interest in understanding health improvements that occur due to non-specific treatment effects, rather than in response to the specific active treatment ingredients, has increased. Nevertheless, investigations of patients' idiosyncratic perspectives on the non-specific aspects of the healing encounter or of the treatment itself that contribute to placebo effects are still rare. The Healing Encounters and Attitudes Lists (HEAL) offer a unique and parsimonious set of instruments to measure patients' views on a variety of non-specific aspects of the caring encounter. The HEAL items can be administered as computerized adaptive tests or short forms that assess the patient-provider connection, the healthcare environment, treatment expectancy, positive outlook, spirituality, as well as attitudes towards complementary and alternative medicine. So far, no German version of the HEAL exists.

Methods: The original 168 HEAL items were translated into German (HEAL-D) applying a translation-back-translation procedure. We examined the psychometric properties of HEAL-D in a sample of 165 participants who reported at least one healthcare visit during the last year.

Results: The German short forms of HEAL (HEAL-D-SF) showed good internal consistency and test-retest reliability. The factor structure observed in the English original items showed low to moderate model fit in our sample.

Discussion: The development of a German version of HEAL in addition to the original English items offers new possibilities for investigating patients' idiosyncratic perspectives on the non-specific aspects of treatments across language borders. We close by presenting possible clinical applications as well as promising and relevant directions for future research using HEAL-D-SF, including, for instance, large-scale, cross-national investigations.

Keywords: healthcare, non-specific treatment effects, patient-reported measures, German translation, patient attitudes and perceptions

### INTRODUCTION

Different aspects of healthcare interventions and of the healing encounter itself may influence health outcomes and well-being of patients. Typically, these aspects have been classified into two groups: First, certain treatment components are deduced from specific treatment theories, and have been referred to as characteristic, active, or (disorder-)specific treatment components (1). They are assumed to actively and directly affect health and symptom improvement (e.g. pharmacological ingredients in medications, particular exercises in physiotherapy, or the confrontation with a feared stimulus in exposure-based psychotherapy). Second, healthcare interventions typically take place in a context of care (2) in which additional aspects, such as the therapeutic bond or relationship between a healthcare professional and a patient (3), a plausible rationale for the treatment (4), the treatment providers' warmth (5, 6) as well as aspects of the treatment setting and environment, impact treatment success (7). These aspects have previously been labelled as non-specific, common, general, incidental, or contextual and their effects are typically described as placebo effects. While there are conceptual differences between the individual labels, all these aspects are assumed to be interacting with the characteristic, active or specific treatment components in contributing to health improvements. In the following we will use the terms specific effects when referring to the first kind of treatment effects and non-specific effects when referring to the latter kind of treatment effects.

In healthcare outcome research, which aims at identifying efficacious active treatments and treatment components, placebos (and other inert treatments) are used to keep all of the non-specific treatment components constant, while manipulating the presence of the specific treatment component. Accordingly, controlling for the non-specific treatment effects in placebo-controlled randomised trials became the gold standard in healthcare research (8). However, when evaluating more complex treatment packages, the realization of a high-quality placebo-controlled study design, intended to control for the non-specific treatment effects, turned out to be a challenge (9–12). In addition, the validity of distinguishing between specific and non-specific treatment components has been questioned empirically (13–16), as well as theoretically (17–19).

When turning from the highly controlled setting of health outcome research towards the practice of healthcare, where the actual improvement of a presenting patient's health is the major goal, several questions regarding the role of the non-specific treatment aspects and their potential effects arise: How relevant are the placebo effects, and thus the effects of the non-specific treatment components? How much do they contribute to patients' health improvement? Do certain patients benefit more from non-specific treatment components than others? And can non-specific treatment aspects support and boost the effectiveness of a standard treatment (18, 20, 21)?

Recently, an increased interest in understanding and investigating the effects of non-specific treatment aspects can be observed. This research has shown that in addition to the above-mentioned non-specific aspects of the healthcare encounter itself, patients' perceptions and attitudes are associated with health-related outcomes across diverse healthcare settings. These perceptions and attitudes include patients' treatment outcome expectations (22–26), patients' trust in their treatment provider (27), or patients' spirituality (28, 29). Accordingly, a detailed knowledge about a particular patient's perception of and attitudes towards certain non-specific treatment aspects might enable treatment providers to specifically tailor the context in which interventions take place as well as the intervention itself to a certain patient's needs.

The "Healing Encounters and Attitudes Lists" (HEAL) have been developed as a precise and concise set of patient-report measures for assessing attitudes towards and perceptions of several treatment components that are associated with non-specific treatment effects (30). HEAL item banks were constructed following the rigorous instrument development methodology of PROMIS® (31, 32), which combines literature reviews, surveys, clinician interviews, focus groups, cognitive interviews to assess item clarity, exploratory and confirmatory factor analyses, and item response theory methods. The convergent and discriminant validity of the initial items was demonstrated in two samples with over 1600 participants (30). The final item banks include a total of 168 items reflecting six scales: patient-provider connection (57 items, e.g., I trust my healthcare provider.), healthcare environment (25 items, e.g., My care was well organized.), positive outlook (27 items, e.g., I am hopeful about my future.), treatment expectancy (27 items, e.g., I expect good outcomes of this treatment.), spirituality (26 items, e.g., Spiritual beliefs give me hope.), and attitude toward complementary and alternative medicine (CAM; 6 items, e.g., I prefer natural remedies.). Participants are asked to rate items in relation to their current treatment on a five-point Likert scale (never, rarely, sometimes, often, and almost always). The items are generally applicable in clinical practice, and are not restricted to any type of treatment modality. The HEAL scales are independent of one another: researchers or clinicians can choose which HEAL scales to use. HEAL scales can also be administered as computerized adaptive tests. In computerized adaptive testing, the test is adapted individually to the test-taker's responses. If the HEAL items were administered as computerized adaptive tests, not all items belonging to one scale would be administered; instead, the following items would be selected dynamically for administration based upon the respondent's previous answers.
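The item-selection step of a computerized adaptive test can be sketched as follows. This is a minimal illustration only: it assumes a dichotomous two-parameter logistic (2PL) model and a maximum-information selection rule, whereas the HEAL items use five response categories and their actual calibration is not described here. The item parameters and the function names are hypothetical.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item (discrimination a, difficulty b) at trait level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_next_item(theta, item_bank, administered):
    """Return the index of the most informative item that has not been administered yet."""
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    if not candidates:
        return None  # the item bank is exhausted
    return max(candidates, key=lambda i: item_information(theta, *item_bank[i]))


# Hypothetical (discrimination, difficulty) pairs; these are NOT HEAL calibrations.
ITEM_BANK = [(1.2, -1.5), (1.0, 0.0), (1.4, 1.0), (0.8, 2.0)]

# Start from a provisional trait estimate of 0 with no items administered.
first_item = select_next_item(0.0, ITEM_BANK, administered=set())
print("first item to administer:", first_item)
```

In a full adaptive test, the trait estimate would be updated after each response and the selection step repeated until a stopping rule (for example, a target standard error) is met.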

Short forms of the HEAL (HEAL-SF) have been proposed, with seven items for patient-provider connection, and six items each for healthcare environment, positive outlook, treatment expectancy, spirituality, and attitude toward CAM (30). Clinical experts selected items for the short forms that had excellent psychometric properties and that were considered to represent the clinical range of each scale of items. The HEAL-SF demonstrated excellent internal consistency, which ranged between 0.92 and 0.97.

For clinical practice, the HEAL-SF scales in particular may be applied as a parsimonious assessment tool for complementing the treatment process. Certainly, the use of HEAL items is not meant to replace the necessary exchange between a healthcare provider and the patient regarding the patient's idiosyncratic perceptions of and attitudes towards the treatment. Rather, HEAL item responses provide a formalized assessment of a certain patient's attitudes towards a number of non-specific treatment aspects, which may result in shared reflections about the treatment implementation, and may inform about necessary adaptations of the treatment in order to meet the patient's needs.

So far, no comparable item banks in German were available that assess patients' perceptions of and attitudes towards non-specific treatment components that contribute to placebo effects. Therefore, we translated the English version of the HEAL item banks into a German version of HEAL (i.e., HEAL-D). The aim of this study was to investigate the psychometric properties of HEAL-D, with a specific focus on the short versions (HEAL-D-SF), as these have the greatest potential for use in clinical practice as well as in research.

### MATERIALS AND METHODS

#### Translation

We translated the HEAL item banks by means of a translation-back-translation procedure in line with the guidelines proposed by Beaton and colleagues (33). First, the original 168 HEAL items were translated into German independently by two translators (MG and a student research assistant) without adding words or introducing new expressions, and a team of the two independent translators and two supervisors (HG and CL) agreed on one German version of the HEAL items. Second, this version was translated back into English by two independent translators (DS and a research assistant), and again a team including the two independent translators and two supervisors (HG and CL) compared the English back-translations with the original HEAL items. If both back-translated versions indicated meaningful deviations from the original HEAL items, adjustments in the German wording were applied until a consensus was reached within the team of translators and supervisors.

#### Sample

We tested the German version of the HEAL-D items in a sample of 165 subjects who were recruited via an internet survey service of the University of Basel (baps.sona-systems.com). Subjects who had received healthcare treatment within the past year, were over 18 years of age, were fluent in reading and speaking German, and were not under the acute influence of psychoactive drugs were invited to participate in the online survey.

The Local Ethics Committee Ethikkommission Nordwest- und Zentralschweiz, Switzerland, approved the design and informed consent of the study. The database project and the server were coordinated and located at the Division of Clinical Psychology and Psychotherapy of the Faculty of Psychology at the University of Basel, Switzerland.

#### Measures

#### Demographic Variables

Demographic variables such as age, gender, mother tongue, and education were initially assessed.

#### Health-Related Questions

Our sample consisted of subjects who had received at least one healthcare treatment within the past year. We assessed health-related characteristics of the sample, such as information regarding the main diagnosis, the corresponding treatment, the practitioner providing the treatment, as well as the place where the treatment was delivered. We asked our participants to refer to the same treatment context in the first and second assessment.

#### Healing Encounters and Attitudes Lists—German Version (HEAL-D)

The HEAL item banks consist of 168 items reflecting six scales: patient-provider connection (PPC; 57 items), healthcare environment (HE; 25 items), positive outlook (PO; 27 items), treatment expectancy (TE; 27 items), spirituality (SP; 26 items), and attitude toward CAM (CAM; 6 items). We used the translated parallel German version (HEAL-D) of the 168 HEAL items. Additionally, we used the German version of the HEAL-SF (30), with seven items for PPC, and six items for HE, PO, TE, SP, and CAM, respectively. The original HEAL-SF scales demonstrated excellent internal consistencies, which ranged between 0.92 and 0.97.

#### Balanced Inventory of Desirable Responding (BIDR)

The short form of the BIDR (34, 35) contains 20 items, 10 of which capture self-deception (BIDR-SD) and 10 of which tap impression management (BIDR-IM). Internal consistencies of the German version of the two subscales ranged between 0.61 and 0.69 across three studies (34).

#### Procedure

Recruitment of participants took place online between July and December 2018. The online survey was advertised on markt.unibas.ch, studienteilnahme.ch, a faculty-internal student platform and in various pharmacies in Basel and was open to the public. Students received course credit for their participation.

After giving informed consent, participants were asked to generate a personalized token and were invited to participate in a secure online survey that included demographic and health-related questions as well as standardized questionnaires (including the HEAL-D items; for details see section Measures). The items of the standardized questionnaires were presented in random order to prevent carry-over effects when answering a relatively large number of items that all belong to one scale (as is the case in the long version of HEAL-D). Participants had to indicate their preference on a 5-point response scale with 0 = not at all, 1 = a little bit, 2 = somewhat, 3 = quite a bit, and 4 = very much. The online survey was created and conducted in LimeSurvey (36). For the purpose of assessing the retest reliability of the HEAL-D items, participants were invited to complete the survey twice, whereby the median time interval between the first and the second assessment was 31 days (range 20–56). Since participants' answers were anonymized, the individual tokens allowed us to match the first and second assessments. Participants had to provide their email addresses in the first assessment, so that we were able to contact them 4 weeks later for the second assessment. Afterwards, email addresses were deleted so that the anonymity of the data was guaranteed.
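The token-based matching of the two assessments and the resulting retest correlation can be illustrated with a short sketch. The tokens, scale means, and function names below are hypothetical and serve only to show the logic; they are not the study data or the authors' code.

```python
import math


def match_by_token(first, second):
    """Pair the scores of participants whose token appears in both assessments."""
    shared = sorted(set(first) & set(second))
    return [(first[token], second[token]) for token in shared]


def pearson_r(pairs):
    """Pearson correlation for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scale means keyed by participant token (not study data).
FIRST = {"a1": 3.0, "b2": 2.0, "c3": 4.0, "d4": 1.0}
SECOND = {"a1": 3.5, "b2": 2.5, "c3": 4.0, "e5": 2.0}

# Only tokens present in both assessments enter the retest correlation.
pairs = match_by_token(FIRST, SECOND)
print("matched participants:", len(pairs))
print("retest correlation:", round(pearson_r(pairs), 2))
```

Participants who completed only one assessment (tokens "d4" and "e5" in this sketch) drop out of the retest analysis but remain available for analyses of the first assessment.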

#### Statistical Analyses

The major goal of our study was the development of HEAL-D-SF, a parallel version of HEAL-SF in the German language. Initially, we excluded those cases from our sample that did not complete at least one entire scale, as well as cases that did not report a current healthcare provider. If participants reported diagnoses and healthcare providers in the second assessment that differed from the first assessment, the second assessment was not considered for retest reliability assessments. Then we checked for floor and ceiling effects as well as for the presence of central tendency bias, and excluded the respective cases.

In the remaining sample of 165 participants who completed the first assessment, all individual item responses were analyzed with respect to their psychometric properties according to the principles of classical test theory. We analyzed the item difficulties and skewness across all 168 items. In addition, we checked for items that showed high correlations with social desirability, in order to identify inadequate items (i.e. items with restricted validity that reflect a high tendency towards socially desirable responses). Then, we selected the respective German items that constitute the original HEAL-SF. Based on these short forms of HEAL-D, we calculated the internal consistency (Cronbach's α) and the discrimination (corrected item-total correlation) per scale, and the skewness for each of the 6 HEAL scales, as well as the correlation of the scales with social desirability. We assessed the comparability between the German short and long versions by correlating the scale means of both versions. Finally, we tested the retest reliability by correlating the item means, as well as the scale means, between the first and second assessment using the data from 115 participants who completed both assessments.
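For readers less familiar with these classical test theory statistics, the following sketch shows how Cronbach's α and the corrected item-total correlation are commonly computed. It is an illustration under stated assumptions, not the authors' analysis code (the study used R): the response matrix is hypothetical, and sample variances (n − 1 in the denominator) are assumed.

```python
import math
from statistics import variance


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def corrected_item_total(rows, j):
    """Correlation of item j with the sum of all OTHER items of the scale."""
    item = [row[j] for row in rows]
    rest = [sum(row) - row[j] for row in rows]
    return pearson(item, rest)


# Hypothetical responses of five participants to a three-item scale (0-4 coding).
RESPONSES = [[4, 3, 4], [2, 2, 3], [3, 3, 3], [1, 0, 1], [4, 4, 3]]

print("alpha:", round(cronbach_alpha(RESPONSES), 2))
print("item 0 discrimination:", round(corrected_item_total(RESPONSES, 0), 2))
```

The correction in the item-total correlation consists of removing the item from the total before correlating, so that an item is not correlated with itself.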

Next, confirmatory factor analyses (CFA) were carried out with the HEAL-D-SF in the sample of 165 participants who completed the first assessment, using the R package "lavaan" (37). Maximum likelihood estimation was used, with full information maximum likelihood for the missing data. Latent factors were standardized, allowing free estimation of all factor loadings. Following recommendations of Kline (38), Hu and Bentler (39), and McDonald (40), four fit indices were used to examine the data-model fit of the CFA: (a) the chi-square test statistic, (b) the root-mean-square error of approximation (RMSEA), (c) the standardized root-mean-square residual (SRMR), and (d) the comparative fit index (CFI). As the chi-square test statistic is known to be influenced by sample size, model fit was assessed by determining whether the observed chi-square value divided by df (χ<sup>2</sup>/df) was smaller than three (41). Regarding RMSEA, a cutoff value of 0.06 or lower was required for a relatively good fit (39), whereas values between 0.061 and 0.08 indicate a reasonable model fit (42). For the SRMR, Hu and Bentler (39) recommended a value close to 0.08 or lower. Finally, the CFI has a cutoff value close to 0.95 (39). Regarding differences between the models of invariance, changes in CFI of 0.01 or less reveal that the invariance hypothesis should not be rejected (43). Given that the interpretation of model fit in CFA is not without some degree of controversy, all these indices of fit were used, and evaluation was based on convergence among findings (39, 44).
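The cutoff rules described above can be summarized in a small helper. This is a sketch only: the function name and the strict treatment of the "close to" cutoffs for SRMR and CFI are simplifying assumptions, and the input values in the usage line are those reported for the initial model in the Results section.

```python
def evaluate_fit(chi2, df, rmsea, srmr, cfi):
    """Compare CFA fit indices against the cutoffs named in the text.

    The 'close to' wording for SRMR and CFI is simplified to strict thresholds,
    which is an assumption of this sketch.
    """
    if rmsea <= 0.06:
        rmsea_label = "good"
    elif rmsea <= 0.08:
        rmsea_label = "reasonable"
    else:
        rmsea_label = "poor"
    return {
        "chi2_df_ratio": chi2 / df,
        "chi2_df_ok": chi2 / df < 3,  # ratio should be smaller than three
        "rmsea": rmsea_label,         # <= 0.06 good, 0.061-0.08 reasonable
        "srmr_ok": srmr <= 0.08,      # close to 0.08 or lower
        "cfi_ok": cfi >= 0.95,        # close to 0.95
    }


# Values reported for the initial HEAL-D-SF model in the Results section.
initial_model = evaluate_fit(chi2=2237.04, df=614, rmsea=0.13, srmr=0.18, cfi=0.68)
print(initial_model)
```

Because the indices can disagree, such a summary is best read as a set of converging or diverging signals rather than as a single pass/fail decision, in line with the evaluation strategy described in the text.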

Modification indices indicated how the model fit would have changed if we had added new parameters to the model. However, since the CFA model was not exploratory, we decided to only specify a particular modification of the model if this was theoretically justifiable (45).

All analyses were conducted using the open-source software environment R (version 3.3.1; 46). We assumed statistical significance if the 2-sided p was smaller than 0.05.

### RESULTS

#### Socio-Demographic and Clinical Sample Characteristics

Two hundred forty-four participants provided informed consent and started the online survey. Of those, 59 had to be excluded because they submitted an empty survey or did not complete at least one of the HEAL-D scales. In 32 cases we had to omit the second assessment, because they provided insufficient data for the retest reliability calculations, and in 10 cases we did not use the second assessment, because the healthcare provider differed between the first and second assessment. No case had to be excluded because of floor or ceiling effects or central tendency bias. The final sample, which completed the first assessment and was used for most analyses, consisted of 165 participants (86.7% female). The median age was 22 years (ranging from 19 to 48 years). Ninety-eight percent of participants had at least a high school degree. The included participants reported a variety of reasons for seeking treatment. The most prevalent health complaints in our sample were affective, emotional, or behavioral problems (including depression, posttraumatic stress disorder, anxiety disorders, bipolar disorder, attention deficit and hyperactivity disorder, and anorexia; mentioned by 33 participants), followed by pain (mentioned by 31 participants). Ten participants referred to check-ups (e.g. yearly check-up at the dentist). Two authors independently classified the mentioned health issues as chronic, acute, or unclear. In the chronic category, chronic headaches, migraines, anxiety disorders, depression, allergies, and asthma were mentioned most often. Less frequently mentioned were chronic infections, irritable bowel syndrome, neurodermatitis, and chronic orthopedic dysfunctions including scoliosis and instability of joints. We rated health issues as chronic in 85 cases (52%). In 32 cases (19%) we rated the mentioned problems as acute. In this category most participants referred to accidents, surgeries, or check-ups. Dental issues were also rated as acute. In the unclear category (48 cases; 29%) we included pain-related issues (e.g. headaches and back pain that were not described as chronic), sleep problems, premenstrual and menstrual complaints, deficiency symptoms, and problems with the digestive system that were neither explicitly described as a particular syndrome nor as chronic. Table 1 summarizes the main characteristics of the study sample.

#### Item and (Sub-)Scale Analyses

#### Item Characteristics of HEAL-D-SF

The items for the short-forms were selected in parallel to the original HEAL-SF. Table 2 displays the item characteristics of the HEAL-D-SF.

#### Characteristics of the HEAL-D-SF Scales and the BIDR Subscales

Table 3 shows the relevant psychometric properties of the applied scales. The HEAL-D-SF scales showed acceptable to excellent internal consistencies between 0.74 and 0.93. The retest reliability ranged between 0.71 and 0.96. Five of the scales were significantly skewed (all p < 0.02).

The BIDR-SD showed an unacceptably low internal consistency (0.31), and the BIDR-IM showed a questionable internal consistency (0.61). As we found three items with negative discrimination among the BIDR items, we deleted those items and repeated the analyses using the BIDR subscales. In the adapted version, the BIDR subscales' internal consistency improved slightly, with Cronbach's α = 0.54 for BIDR-SD and Cronbach's α = 0.65 for BIDR-IM. The retest reliability of the adapted BIDR scales was r = 0.67 for SD and r = 0.82 for IM, and both subscales were significantly skewed (p = 0.01 and p = 0.009, respectively). Due to the poor reliability of the BIDR-SD subscale (even after adaptation), we did not use this scale for further correlation analyses, and we used the adapted version of BIDR-IM for the following correlation analyses.

TABLE 1 | Selected characteristics of the included sample.

–, not assessed.

#### Correlation Analyses

Four of the HEAL-D-SF scales showed significant correlations with BIDR-IM. The correlations between the short and long versions of HEAL-D were moderate to high, ranging from r = 0.66 (positive outlook) to r = 0.98 (spirituality), indicating that the two versions are highly consistent. Table 3 shows the respective correlation coefficients.

#### Testing the Factor Structure of the HEAL-D-SF Scales

For our CFA, the standardized factor loadings of most items were significant and most were larger than 0.4, except for the loading of five items (see Table 2 for details). Nevertheless, the initial model fit of the German version of the HEAL-SF was not satisfactory [χ<sup>2</sup>: 2237.04; df: 614; p < 0.001; RMSEA: 0.13 with 90% CI (0.12, 0.13); SRMR: 0.18, and CFI: 0.68] (Table 4). Modification indices indicated that specifying the presence of covariance for the error terms of one pair of items on the HCE factor, two pairs of items on the PO factor, and one pair of items on the CAM factor would significantly improve model fit (see Table 4 for details). Given that the items of each pair had related content and belonged to the same factor, it was judged appropriate to adjust the model such that the error terms of these items were allowed to covary<sup>1</sup>. All indicators of model fit (Table 4) suggested that the adjusted model had a slightly better, but still unacceptable, fit with the data.
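As a rough plausibility check of the reported indices for the initial model, and assuming the common formulation of the RMSEA with N − 1 in the denominator (software packages differ in whether N or N − 1 is used) and N = 165:

$$
\frac{\chi^2}{df} = \frac{2237.04}{614} \approx 3.64 > 3, \qquad
\mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2 - df,\, 0)}{df\,(N - 1)}} = \sqrt{\frac{1623.04}{614 \times 164}} \approx 0.13
$$

Both values lie outside the cutoffs described in the Statistical Analyses section, which is in line with the reported lack of fit.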

### DISCUSSION

#### Main Findings

We set out to evaluate a parallel version of the HEAL-SF in German language. The HEAL items assess patients' attitudes towards and perceptions of the so-called non-specific treatment components that have been shown to contribute to the effectiveness of inert treatments (e.g. sham interventions or placebos) but also to be responsible for a considerable amount of the effectiveness of pharmacological and psychotherapeutic treatments. The HEAL items have been developed applying rigorous methodology.

<sup>1</sup> Modification indices also suggested that specifying a covariance between the error terms of the items "My healthcare provider pays attention to my individual needs" and "The staff was helpful", as well as of the items "This treatment is right for me" and "It is important to be open to CAM" would improve model fit. However, as each item pair was from separate scales and the item content was judged as non-similar, we felt it was not theoretically justifiable to specify these particular modifications of the model.

TABLE 2 | Item characteristics of HEAL-D-SF based on the 165 participants who completed the first assessment.

<sup>a</sup> Calculated per scale; <sup>b</sup> standardized factor loadings from confirmatory factor analysis: HEAL-D-SF; SD, standard deviation; SE, standard error.

TABLE 3 | Psychometric Properties and Correlations of HEAL-D-SF with Impression Management, and HEAL-D (N = 165).

<sup>a</sup> Significant deviations from normal distribution are printed in bold face; <sup>b</sup> the adapted version of the BIDR-IM scale was used in which items with negative item-to-total correlations were deleted; <sup>c</sup> significant correlations are printed in bold face; SD, standard deviation.

\*Model included specified covariance between error terms for the item "The staff was friendly" and the item "The staff was helpful" (both factor HCE); the item "I am satisfied with my life" and the item "I feel I can cope with my problems" (both factor PO); the item "I feel positive about my life" and the item "I am satisfied with my life" (both factor PO); as well as the item "It is important to be open to CAM" and the item "I prefer natural remedies" (both factor CAM). CFI, comparative fit index; df, degrees of freedom; RMSEA, root-mean-square error of approximation; SRMR, standardized root-mean-square residual.

In the present study, the German HEAL items were used for the first time in an online survey in Switzerland. The six scales of HEAL-D-SF have demonstrated acceptable to excellent internal consistency and retest reliability, which indicates that the HEAL-D-SF scales are reliably applicable instruments. Most of the scales were skewed in our sample, with most participants indicating high endorsement, except for the scale CAM. Given the well-organized and high-quality healthcare system in Switzerland, the skewness towards positive responses in the scales PPC, HCE, TE, and PO is no surprise. The scale SP was skewed towards negative responses, which may be explained by the low relevance of spirituality in the selective sample of our study.

Using CFA, the six-factor structure of HEAL and HEAL-SF reported by Greco and colleagues (30) was partly confirmed using HEAL-D-SF: while factor loadings indicate a good fit of the items with the latent factors (i.e. scales), the overall model fit of the CFA was moderate to low. However, the model fit indices have been shown to largely depend on the sample size, which was comparatively small in our study. By adjusting the original model following the highest modification indices, which allows for covariation of error terms of several items, the model fit for the assessed fit indices slightly improved. Four items showed very low factor loadings as well as a low discrimination (HCE: "The waiting area was comfortable."; PO: "I feel I can cope with my problems." and "I am satisfied with my life."; CAM: "It is important to be open to CAM."). If confirmed in future studies, these findings might indicate that the respective items represent different latent constructs compared with the other items of the respective scales.

Due to the poor psychometric quality of the BIDR scales, no conclusions are possible based on the significant correlations between the HEAL-D items and social desirability. In future studies the HEAL-D items need to be validated with additional reliable instruments.

#### Relation to Relevant Previous Conceptual and Theoretical Work

HEAL and HEAL-SF have been constructed as a set of individual scales, which represent different aspects of treatments and of the corresponding treatment context. The development of HEAL included a comprehensive overview of existing scales, and of expert and patient opinions. Although the authors of the original HEAL item banks did not explicitly relate the HEAL items to theoretical frameworks of non-specific factors, when relating the items to a prominent model of context factors proposed by Frank and Frank (47), the HEAL scales can be considered as operationalizations of the proposed factors: First, the scale HCE can be seen as including operational definitions of the professional healthcare environment. Second, the scale PPC can be seen as an operationalization of the healing relationship. Third, the scales TE, PO, SP, and CAM can be seen as contributing to ensuring that the advised and prescribed treatment (i.e. the ritual) and the rationale for this treatment are in line with patients' expectations and attitudes and that they are thus acceptable for the patient as described by Budge and Wampold (48). Nevertheless, given the extreme variety of potentially relevant non-specific treatment aspects, the defined scales can only cover a part of all potentially relevant aspects, and additional operationalizations of the theoretical contextual factors are possible. In the future, depending on the actual context in which the HEAL items are to be administered, more scales tapping additional non-specific treatment components might be considered and added to the HEAL item lists: For instance, items focusing on the provider's empathy might be added in future studies, as empathy has been demonstrated to be associated with treatment effects across different kinds of treatments, and is not explicitly addressed in the current HEAL item lists.

The possibility of assessing patients' idiosyncratic perceptions of and attitudes towards treatment aspects besides the actively prescribed treatment components can be seen as a further step to overcoming the invalid distinction between non-specific and specific treatment components and towards defining non-specific aspects of treatments as specific, as described for instance by Kaptchuk (49). The idea of "making the non-specifics specific" is not new: As early as 1973, Jefferson M. Fish proposed that therapeutic processes have significant parallels to those taking place in faith-healing and placebo mechanisms in general (50). Along similar lines, Frank characterized healing as a social influence process (47), and emphasized the relevance of the non-specific treatment components by presenting a contextual treatment model. More recently, Weinberger argued against using the term non-specific in the context of psychotherapeutic treatments: "I would prefer to say that some important factors may have not been operationalized well enough to be studied empirically; they have not yet been specified. Thus, they are non-specified, not non-specific. Contrary to the views of those questioning their scientific bona fides …, so-called non-specific effects are not ontologically non-specific. They are capable of being empirically specified." (17). The outlined views on the relevance of "making the non-specifics specific" are also reflected by a recent feature in The British Medical Journal entitled "Social prescribing: coffee mornings, singing groups, and dance lessons on the NHS" (51), which outlines the idea to formalize physicians' referrals of patients to community activities, and highlights the relevance of the entire healing context for clinical practice.

#### Implications for Clinical Practice

In clinical practice, placebo effects, and thus non-specific treatment aspects, moderate and mediate treatment outcomes significantly. However, if healthcare providers are not particularly sensitive towards the relevance of the non-specific treatment aspects, issues associated with these treatment aspects are likely to remain undetected. If a given patient had, for instance, a low expectancy regarding the efficacy of a necessary standard treatment, the patient's negative perceptions might have negative consequences with respect to the administration of or the adherence to the prescribed treatment, which might lead to a treatment failure. The low expectancy, however, might not appear to be relevant to the patient (or to the provider), and thus might remain undetected. In such a case, the administration of the HEAL items could help detect the issue at hand. Then, the treatment provider could first take action to improve the patient's outcome expectancy, before initiating the actual standard procedure.

As many of the non-specific treatment aspects that impact treatment outcomes are largely neglected in the context of standard treatment administration, the implementation of HEAL items in clinical practice might be seen as facilitating the detection of problematic aspects of a treatment that are rooted in the non-specific aspects of treatments. A deeper knowledge of patients' idiosyncratic perceptions of and attitudes towards these would thus allow tailoring interventions in line with individual patients' needs by facilitating an ethical and research-based conversation regarding what works in an intervention. This may in turn contribute to positive treatment expectations by providing a plausible treatment rationale.

It is important to note that we see the HEAL-D-SF as a flexible tool: Depending on the context of implementation, some scales may be of greater importance than others. For instance, the spirituality (SP) scale might help some patients to understand their symptoms within the context of their culture and religious beliefs. Concordantly, a recent meta-analysis revealed that treatments tailored to patients' religious or spiritual beliefs are significantly more effective than no treatment or non-religious/spiritual psychotherapies in terms of psychological functioning (29). Along similar lines, a feature recently published in The British Medical Journal stated that there is a "high demand among the public for someone to talk to about spiritual matters in times of crisis" (52). The HEAL-SF spirituality scale can help to detect such needs in individual patients; in turn, the treatment provider and the patient can collaboratively discuss and decide how the treatment can be adapted or complemented to satisfy the patient's need. Nevertheless, and to come back to the argument that a treatment should be credible and plausible, spirituality may not be relevant for every patient. We would therefore advise judging from patient to patient (or from context to context, respectively) whether the assessment of spirituality seems appropriate. The same holds true for the other scales of the HEAL-D-SF.

### Implications for Research

The development of the HEAL-D-SF as a parallel version of the original HEAL-SF is of importance for healthcare research: The HEAL items offer the possibility to investigate the impact of non-specific treatment components across diverse interventions as well as across treatment contexts. The theory behind non-specific treatment components claims that these components have comparable effects across various interventions, treatment settings, and contexts. It would be interesting to test this assumption, e.g., to evaluate whether there is one factor which is the most reliable predictor of treatment success across cultures, populations, and treatment approaches, using the parallel English and German versions of HEAL-SF. Importantly, these findings would be based on the patient's own idiosyncratic views and assumptions, rather than on theoretical models or assumptions. Since the HEAL-D-SF was developed in parallel to the existing English-language HEAL-SF, cross-cultural studies will become possible in the future. Thus, a specific focus of future research projects can be the detection of similarities as well as dissimilarities in patients' perception of the impact of non-specific treatment components on treatment outcomes, depending on patients' cultural background.

### Limitations

Some limitations of the present study should be considered. First, and most important, the presented data are based on a comparatively small sample, which may have negatively affected the overall model fit of the CFA, since fit indicators depend heavily on sample size. Hence, further studies with larger samples are necessary to conclusively assess the factor structure of the German HEAL items. Second, our sample was heterogeneous with respect to the reported health conditions. While about half of the participants reported rather chronic conditions, the other half reported rather acute conditions. It is possible that the impact of the non-specific aspects of treatments on health outcomes varies depending on the chronicity of health conditions and is, for instance, mediated by the intensity, frequency, and duration of the treatment. Along similar lines, non-specific aspects may have a greater impact in an ongoing treatment for a clinical condition than in a medical check-up. On the other hand, however, the HEAL item banks are considered to be condition-insensitive. Thus, the diversity of our sample with respect to the reported health complaints could be considered a strength of our study. Nevertheless, future studies should consider and test these possible moderators or mediators. Third, study participants were rather homogeneous with respect to educational level and age. It is possible that our findings will not generalize to populations with other socio-demographic characteristics. Therefore, future studies should include a broader range of study participants. Fourth, validation studies are necessary to test the convergent and discriminant validity as well as the prognostic value of the HEAL-D-SF items, including for instance comparisons with existing scales that assess non-specific factors more extensively using longer item lists, as well as tests of the prognostic value of HEAL items in predicting, for instance, health improvements or well-being in prospective studies. Fifth, the presented study relies on outpatient data assessed via an online survey. For future studies, it would be interesting to apply the HEAL-D-SF in a clinical context.

#### Conclusion

To conclude, we presented a German translation and a first evaluation of the HEAL items, which assess patients' attitudes towards so-called non-specific treatment components. The German version (HEAL-D-SF) proved to be a reliable set of measures in an initial study. With six scales and six to seven items per scale, the HEAL-D-SF is a parsimonious set of measures to assess the relevance of diverse non-specific treatment aspects. Especially when implemented in clinical practice, the brevity of the HEAL-SF and HEAL-D-SF constitutes a particular strength. However, before HEAL-D items can be applied in clinical practice, additional validation studies are needed.

### DATA AVAILABILITY STATEMENT

The datasets used for the analyses in this study are available from the corresponding author, upon reasonable request.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of Ethikkommission Nordwest- und Zentralschweiz (Project-ID 2017-00870), with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethikkommission Nordwest- und Zentralschweiz.

### AUTHOR CONTRIBUTIONS

Study concept and design: HG, MG, CL. Acquisition, analysis or interpretation of data: SB, JG, HG, CL, DS. Drafting of the manuscript: HG, CL. Critical revision of the manuscript for important intellectual content: JG, HG, MG, CL. Statistical analyses: SB, HG, CL, DS. Study supervision: HG, CL.

### FUNDING

CL received funding for this project from the Swiss National Science Foundation (SNSF): P400PS\_180730.

### ACKNOWLEDGMENTS

We would like to thank Dr. Helen Koechlin and Sebastian Hasler for their help with the translation-back-translation procedure.


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Gerger, Buergler, Sezer, Grethler, Gaab and Locher. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Modeling Learning Patterns to Predict Placebo Analgesic Effects in Healthy and Chronic Orofacial Pain Participants

Yang Wang<sup>1,2†</sup>, Christina Tricou<sup>3†</sup>, Nandini Raghuraman<sup>1</sup>, Titilola Akintola<sup>1</sup>, Nathaniel R. Haycock<sup>1</sup>, Maxie Blasini<sup>1</sup>, Jane Phillips<sup>3</sup>, Shijun Zhu<sup>4</sup> and Luana Colloca<sup>1,2,5\*</sup>

<sup>1</sup> Department of Pain and Translational Symptom Science, School of Nursing, University of Maryland, Baltimore, MD, United States, <sup>2</sup> Center to Advance Chronic Pain Research, University of Maryland, Baltimore, MD, United States, <sup>3</sup> Department of Neural and Pain Sciences, School of Dentistry, University of Maryland, Baltimore, MD, United States, <sup>4</sup> Department of Organizational Systems and Adult Health, School of Nursing, University of Maryland, Baltimore, MD, United States, <sup>5</sup> Departments of Anesthesiology and Psychiatry, School of Medicine, University of Maryland, Baltimore, MD, United States

#### Edited by:

Stephan Zipfel, University of Tübingen, Germany

#### Reviewed by:

Karin Meissner, Hochschule Coburg, Germany

Miriam Goebel-Stengel, HELIOS Klinik Rottweil, Germany

\*Correspondence:

Luana Colloca colloca@umaryland.edu

† These authors share first authorship

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 01 May 2019 Accepted: 13 January 2020 Published: 12 February 2020

#### Citation:

Wang Y, Tricou C, Raghuraman N, Akintola T, Haycock NR, Blasini M, Phillips J, Zhu S and Colloca L (2020) Modeling Learning Patterns to Predict Placebo Analgesic Effects in Healthy and Chronic Orofacial Pain Participants. Front. Psychiatry 11:39. doi: 10.3389/fpsyt.2020.00039

Successfully predicting the susceptibility of individuals to placebo analgesics will aid in developing more effective pain medication and therapies, as well as aiding potential future clinical use of placebos. In pursuit of this goal, we analyzed healthy and chronic pain patients' patterns of responsiveness during conditioning rounds and their links to conditioned placebo analgesia and the mediating effect of expectation on those responses. We recruited 579 participants (380 healthy, 199 with temporomandibular disorder [TMD]) to participate in a laboratory placebo experiment. Individual pain sensitivity dictated the temperatures used for high- and low-pain stimuli, paired with red or green screens, respectively, and participants were told there would be an analgesic intervention paired with the green screens. Over two conditioning sessions and one testing session, participants rated the painfulness of each stimulus on a visual analogue scale from 0 to 100. During the testing phase, the same temperature was used for both red and green screens to assess responses to the placebo effect, which was defined as the difference between the average of the high-pain-cue stimuli and low-pain-cue stimuli. Delta scores, defined as each low-pain rating subtracted from its corresponding high-pain rating, served as a means of modeling patterns of conditioning strength and placebo responsiveness. Latent class analysis (LCA) was then conducted to classify the participants based on the trajectories of the delta values during the conditioning rounds. Classes characterized by persistently greater or increasing delta scores during conditioning displayed greater placebo analgesia during testing than those with persistently lower or decreasing delta scores. Furthermore, the identified groups' expectation of pain relief acted as a mediator for individual placebo analgesic effects. This study is the first to use LCA to discern the relationship between patterns of learning and the resultant placebo analgesia in chronic pain patients. In clinical settings, this knowledge can be used to enhance clinical pain outcomes, as chronic pain patients with greater prior experiences of pain reduction may benefit more from placebo analgesia.

Keywords: conditioning, expectation, latent class analysis, pain, temporomandibular disorder

#### INTRODUCTION

Placebo effects represent a phenomenon that encompasses psychological, biological, and interpersonal aspects of human physiology and behavior (1). A variety of frameworks, theories, and concepts have been postulated in an attempt to understand how placebo effects are elicited, formed, and maintained over time (2). Placebo effects appear to be complex in nature, highly flexible across contexts, and dynamic over time with their ability to influence symptoms and health outcomes. The high complexity of placebo effects, and the influence of subconscious processes, makes it unlikely that a single mechanism leads to the formation of placebo effects. However, expectations and placebo effects that influence and modify a patient's perception of symptoms may respond to computational rules and predictive models. Büchel et al. postulated the idea that the complex experience of pain is based on the actions of predictive coding (3). The brain is not merely a decoder of signs and signals from the periphery (e.g. nociceptive stimuli), but rather an elegant machine that makes inferences based on prior experiences and anticipatory cues, or expectations (3). Wiech (4) suggested that the experience of pain is an inferential process in which prior information and self-healing experiences are integrated to create anticipations of future events by forming a sort of "template" about future painful (and nonpainful) events, thus providing critical elements about how to interpret the ongoing inputs (4). Thus, humans are likely to interpret their experiences based, at least in part, on their own expectations rather than on the experiences themselves (4). As such, expectations are likely to bias perception of symptoms (e.g. pain experience) and signals (e.g. nociceptive stimuli) through brain activation in areas that process and interpret somatosensory input. 
According to Wiech, when expectations are too "far-fetched," then a modification of expectations occurs, making pain perception modulation an active and dynamic process that is enabled and primarily modified by learning processes and prior experiences (4).

In this context of pain signaling, a Bayesian computational model based on predictive coding could account for variability in placebo responsiveness (3). Anchisi and Zanon (5) built a Bayesian decision model (fBD) which indicated that placebo effects result from the integration of nociceptive stimuli with past experience (e.g. via conditioning), incoming sensorial information (e.g. nociceptive stimuli), and context (e.g. anticipatory cues) (5). In this study, we expanded upon these theories, using the latent class analysis (LCA) approach (6) to determine how learning patterns during conditioning can affect the formation of placebo analgesia. Additionally, we determined how self-reported expectations of pain relief are associated with and mediate placebo analgesia.
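To make the idea of integrating prior experience with incoming nociceptive input more concrete, the following minimal sketch combines a Gaussian prior (the expectation) with a Gaussian likelihood (the sensory evidence) by precision weighting. It is a generic textbook illustration, not the model of Anchisi and Zanon (5) and not an analysis performed in this study; the function name and all numbers are hypothetical.

```python
def integrate_expectation(prior_mean, prior_var, sensory_mean, sensory_var):
    """Combine a Gaussian prior with a Gaussian likelihood.

    Returns the posterior mean and variance. Each source is weighted by
    its precision (1 / variance), so the more certain source dominates.
    """
    prior_precision = 1.0 / prior_var
    sensory_precision = 1.0 / sensory_var
    post_var = 1.0 / (prior_precision + sensory_precision)
    post_mean = post_var * (
        prior_precision * prior_mean + sensory_precision * sensory_mean
    )
    return post_mean, post_var


# Hypothetical values on a 0-100 pain scale: the participant expects
# low pain (20) after conditioning, while the stimulus alone would be
# rated as moderate (50). Both sources are equally uncertain here.
perceived, uncertainty = integrate_expectation(
    prior_mean=20.0, prior_var=100.0, sensory_mean=50.0, sensory_var=100.0
)
print(perceived, uncertainty)
```

With equally uncertain sources the perceived intensity falls midway between expectation and input; a more precise prior (smaller `prior_var`) would pull the percept further towards the expected value.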

#### MATERIALS AND METHODS

Five hundred seventy-nine participants volunteered for this study, of whom 380 were healthy participants and 199 were patients suffering from temporomandibular disorder (TMD; Table 1). All participants gave written consent to participate in this study, and the institutional review board of the University of Maryland, Baltimore approved the study (Prot. HP-00068315). Since deceptive information was used during the procedure, healthy participants were debriefed at the end of their experimental round using a study exit form that detailed the nature and the involvement of deception. They were offered the chance to withdraw their data from the study, but none did.

#### Eligibility Criteria

All participants were within the ages of 18–65 years and were pre-screened over the phone to determine their eligibility as either a healthy or TMD participant. Participants over 65 years of age were excluded because pain thresholds increase and TMD dysfunctions steadily decrease in prevalence and severity with older age (7, 8).

#### Healthy Participants

Three hundred eighty volunteers were deemed eligible and enrolled as healthy participants based on an in-person screening by trained research personnel. Inclusion was based on their age and ability to speak and understand English. Participants were excluded based on the following criteria: presence of pain disorders; presence of degenerative neuromuscular, cardiovascular, neurological, kidney, or liver disease; pulmonary abnormalities; cancer within the past three years; any uncorrected impaired hearing; color-blindness; and pregnancy or breast-feeding. Participants with a family history of schizophrenia, bipolar disorders, and other psychoses were also excluded, as were those with any severe psychiatric condition leading to hospitalization in the last three years. Lifetime dependence on, or abuse within the prior year of, alcohol or recreational drugs was also an exclusion criterion. Volunteers identified as healthy participants underwent an in-person screening by trained research personnel, who verified the screening results to ensure eligibility. In addition to the criteria listed above, healthy participants were also excluded if they suffered from any chronic pain condition or had a personal history of psychosis.

#### TMD Participants

One hundred ninety-nine volunteers were enrolled as TMD participants. Inclusion criteria for TMD participants were met by those who reported a minimum of 3 months of pain in the jaw, temple, or ear area on either side prior to examination. Those identified as potential TMD patients received an in-person clinical examination by a dental hygienist with expertise in orofacial pain at the Brotman Facial Pain Clinic at the University of Maryland, School of Dentistry. TMD research classifications were confirmed according to the Axis I Diagnostic Criteria for Temporomandibular Disorders (DC/TMD) (9, 10). Axis II instruments were completed and grading of instruments was performed in accordance with the DC/TMD Scoring Manual for Self-Report Instruments (11). Participants were excluded based on the following criteria: presence of cervical pain (following stenosis or radiculopathy); presence of degenerative neuromuscular, cardiovascular, neurological, kidney, or liver disease; pulmonary abnormalities; diffuse cancer within the past three years; any uncorrected impaired hearing; color-blindness; and pregnancy or breast-feeding. Participants with any severe psychiatric condition leading to hospitalization in the last three years, lifetime dependence on alcohol or recreational drugs, or abuse of either within the prior year were also excluded.

#### Experimental Procedures

The experiment took place within the Clinical Suites at the University of Maryland Baltimore, School of Nursing and consisted of a single session. The study procedures were described in detail during the consent process, and participants provided written informed consent. Vital signs, including blood pressure, heart rate, height, weight, and body mass index (BMI), were recorded for monitoring purposes only.

### Heat Pain Stimulation

Painful thermal heat stimuli were applied to the dominant forearm and delivered using an ATS 30×30 thermode (PATHWAY System, Medoc, Ramat Yishai, Israel). The participants performed a pain sensitivity assessment using the limits paradigm (12) and reported their pain intensity (ranging from 0 = no pain to 100 = maximum tolerable pain) verbally to the experimenter. The pain sensitivity assessment allowed tailoring of the maximum, moderate, and minimum levels of painful stimulation to each participant, which were then used for the placebo manipulation during the conditioning and testing phases. The participants were reminded about the experimental tasks upon completion of the pain sensitivity assessment. They were informed that they would be receiving both electrical stimulation (actually delivered via a sham electrode) and heat-pain stimulation while viewing two colored screens, namely red and green. The sham electrode was attached above the thermode on the forearm, and they were informed that the electrode would stimulate their nerves at an imperceptible "subthreshold level" to reduce their pain. They were informed that the electrode would only be active when they viewed a green screen, and not a red one. The participants were trained to use a script-based rating device (Celeritas Fiber Optics Response System, Psychology Software Tools Inc, Sharpsburg, PA, USA) to rate their pain intensity after every trial using the visual analog scale (VAS) ranging from 0 = no pain to 100 = maximum tolerable pain.

### Placebo Manipulation

A well-established conditioning paradigm (13) with two conditioning phases and one testing phase was employed as a placebo manipulation. Each of the two conditioning phases and the testing phase contained 12 heat pain stimulations, of which six stimulations were associated with red screens and six with green screens. The participants were randomized to one of the four pseudorandom sequences of screen color to control for potential sequence effects. The experimenter used the three levels of temperature (accounting for maximum tolerable pain, minimum pain, and moderate pain) from the participant's pain sensitivity assessment. The temperature for the moderate pain level was usually one degree lower than the temperature of the maximum pain level. During the conditioning phases, the temperature for maximum pain was delivered with the red screens, and the temperature for minimum pain was delivered with green screens. During the testing phase, the temperature for moderate pain was delivered with both the red and green screens. After each stimulation, the participants rated their pain intensity using the VAS (Figure 1). The difference between the means of the red and green screen ratings during the testing phase was calculated to determine the magnitude of placebo response.
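The two quantities derived from the ratings can be sketched as follows. This is an illustrative example only: the ratings are invented, the function names are hypothetical, and pairing the k-th red trial with the k-th green trial to form delta scores is an assumption about what "corresponding" ratings means.

```python
from statistics import mean


def placebo_magnitude(red_ratings, green_ratings):
    """Mean VAS rating for red trials minus mean rating for green trials."""
    return mean(red_ratings) - mean(green_ratings)


def delta_scores(red_ratings, green_ratings):
    """Trial-wise red-minus-green differences (assumed pairing by order)."""
    return [red - green for red, green in zip(red_ratings, green_ratings)]


# Invented VAS ratings (0-100) for the six red and six green trials of a
# testing phase, in which both colors are paired with the same temperature.
test_red = [52, 48, 50, 47, 51, 49]
test_green = [33, 30, 29, 31, 28, 32]

magnitude = placebo_magnitude(test_red, test_green)
deltas = delta_scores(test_red, test_green)
print(magnitude, deltas)
```

A positive magnitude indicates that identical stimuli were rated as less painful under the green (placebo) cue than under the red (control) cue.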

### Expectations Assessments

A 0–100 mm VAS was used to assess the participants' self-reported expectations with the question: "How much do you think this procedure will reduce your pain?" Before the beginning of the conditioning phase, participants rated their baseline expectations of pain relief, i.e., the anticipated effectiveness of the intervention. Immediately after the conditioning phase, participants rated their "reinforced expectations" in response to the same question: "How much do you think this procedure will reduce your pain?" Finally, after the testing phase, participants again rated the perceived effectiveness of the intervention.

### Statistical Analysis

To examine whether the conditioning procedure induced placebo analgesia, we adopted a repeated measures ANOVA to analyze the differences between the VAS ratings for the red and green trials. The mean delta score of red minus green trials in the testing phase was calculated to compare effect size of placebo analgesia in healthy participants and participants with TMD. The Cohen's d was further calculated. To determine the group differences in expectations ratings at each time point, a repeated measures ANOVA was conducted with the time set as the repeated measure (self-reported expectation at baseline vs. after the conditioning phase vs. after the testing phase) and group (TMD vs. healthy participants) as the between-subjects factor. To identify the variables that needed to be controlled in the latent class analysis (LCA) model, a linear regression model was conducted to explore the influences of demographic variables (sex, age, race, marital status, income, and education, see also Table 1), warmth thresholds, and pain sensitivity (i.e., temperature used for the testing phase) on placebo analgesia in the overall sample (n = 579). The variables that showed significant influences on placebo analgesia served as control variables in the LCA model.

FIGURE 1 | Timeline of the experimental paradigm. Participants went through two sessions of the conditioning phase and one session of the testing phase, with each session containing 12 trials. During the conditioning phase, a red screen was paired with a high painful heat temperature, while a green screen was paired with a low painful heat temperature. During the testing phase, both red and green screens were paired with a moderate painful heat temperature. For both conditioning and testing phases, the colored screens and the painful heat stimulations were presented for 10 s. After delivery of the painful heat stimulations, a VAS ranging from 0 = not painful at all to 100 = maximal tolerable pain was provided (8 s) to assess participants' pain experience ratings. The inter-stimulus interval (ISI) was set randomly between 10 s and 13 s. For both conditioning and testing phases, red and green trials were displayed using one of four pre-programmed pseudorandom sequences.

TABLE 1 | Demographic information for TMD (n = 199) and healthy controls (HC, n = 380).

The aim of the LCA model was to determine the potential classes that shared similar characteristics in their pain rating patterns during the conditioning phase. To determine the potential classes within TMD and healthy participants, the delta scores of red-minus-green pain ratings during the conditioning phase were modeled using Mplus (14). The variables that showed significant influences on placebo analgesia in linear regression served as control variables for the LCA model. As part of the LCA model, a Lo-Mendell-Rubin Likelihood Ratio Test (LMR-LRT), which is an indication of goodness of fit (15, 16), was used to determine which group separation was ideal. Entropy and the Bayesian information criterion (BIC) were used to confirm the separation of classes. Entropy was an index of group separation (17), with larger values indicating greater differences among identified classes. The BIC was set as the goodness-of-fit criterion, with a smaller value indicating a better model fit (16). Specifically, the optimal number of classes was decided by considering the following requirements: 1) the number of classes (n) was selected when the LMR-LRT was significant (p < 0.05) for the n-class model and was not significant for the next level of classes (i.e., the n+1-class model) (15, 16); 2) the entropy value was over 0.8 (17); 3) the model had the smallest BIC value; and 4) each identified class contained more than 15 participants. We employed the non-parametric Mann-Whitney U test to assess the statistical relevance of these classes on placebo analgesia.
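The four requirements for choosing the number of classes can be expressed as a simple decision rule. The sketch below is only an illustration of that rule: the fit statistics are invented placeholders rather than values from this study, the names `LcaFit` and `select_n_classes` are hypothetical, and using the BIC to choose among models that satisfy the other three requirements is one possible reading of requirement 3.

```python
from dataclasses import dataclass


@dataclass
class LcaFit:
    n_classes: int
    lmr_lrt_p: float          # LMR-LRT p-value for this n-class model
    entropy: float
    bic: float
    smallest_class_size: int  # participants in the smallest class


def select_n_classes(fits, alpha=0.05, min_entropy=0.8, min_class_size=15):
    """Return the number of classes meeting all requirements, or None."""
    by_n = {fit.n_classes: fit for fit in fits}
    candidates = []
    for n, fit in by_n.items():
        next_fit = by_n.get(n + 1)
        if next_fit is None:
            continue  # the (n+1)-class test is unavailable, so skip
        if (
            fit.lmr_lrt_p < alpha                       # requirement 1a
            and next_fit.lmr_lrt_p >= alpha             # requirement 1b
            and fit.entropy > min_entropy               # requirement 2
            and fit.smallest_class_size > min_class_size  # requirement 4
        ):
            candidates.append(fit)
    if not candidates:
        return None
    # Requirement 3 (assumed reading): smallest BIC among the candidates.
    return min(candidates, key=lambda fit: fit.bic).n_classes


# Invented fit statistics for 2-, 3-, and 4-class models.
fits = [
    LcaFit(2, lmr_lrt_p=0.03, entropy=0.86, bic=9840.0, smallest_class_size=35),
    LcaFit(3, lmr_lrt_p=0.21, entropy=0.83, bic=9855.0, smallest_class_size=12),
    LcaFit(4, lmr_lrt_p=0.40, entropy=0.79, bic=9871.0, smallest_class_size=8),
]
chosen = select_n_classes(fits)
print(chosen)
```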

Finally, we performed mediation analyses within TMD and healthy participants, separately, to test the hypothesis that the association between the identified classes and placebo analgesia would be fully mediated by the reinforced expectation of pain reduction reported after the conditioning phase. Mediation analyses were conducted using the SPSS macro PROCESS developed by Hayes et al. (18, 19); expectation scores were set as the mediator (M), placebo analgesia as the dependent variable (Y), and the identified classes as the independent variable (X). For testing indirect effects, a bias-corrected bootstrapping method based on 5,000 resamples was used. An indirect effect is considered significant if its 95% bootstrapped confidence interval (BCI) does not contain zero.
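The logic of testing an indirect effect by bootstrapping can be sketched in a few lines. This is not the PROCESS macro: for brevity it uses a plain percentile bootstrap instead of the bias-corrected interval described above, it runs on synthetic data generated inside the example, and all function and variable names are hypothetical.

```python
import random


def indirect_effect(x, m, y):
    """Return a*b, where a is the slope of M on X and b is the partial
    slope of M in the least-squares regression of Y on X and M."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(indirect_effect(
                [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]
            ))
        except ZeroDivisionError:
            continue  # degenerate resample (e.g., only one class drawn)
    estimates.sort()
    tail = (1.0 - level) / 2.0
    return estimates[int(tail * n_boot)], estimates[int((1.0 - tail) * n_boot) - 1]


# Synthetic data: class membership (X) raises expectation (M), and
# expectation in turn raises placebo analgesia (Y).
rng = random.Random(42)
x = [i % 2 for i in range(200)]
m = [50 + 20 * xi + rng.gauss(0, 10) for xi in x]
y = [5 + 0.5 * mi + rng.gauss(0, 5) for mi in m]

lower, upper = bootstrap_ci(x, m, y, n_boot=1000)
print(lower, upper)  # the effect is deemed significant if 0 lies outside
```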

The repeated measures ANOVA, Mann-Whitney U test, regression, and mediation analyses were carried out using the SPSS software package (SPSS Inc, Chicago, Illinois, USA, vers. 22), and the Mplus software (vers. 8.2, https://www.statmodel.com/index.shtml) was used for the LCA approaches.

Based on previous placebo studies (13, 20), we expected to observe medium to large placebo effects induced by the conditioning paradigm. Based on the within-subjects design (with red and green trials set as the within-subjects factor), we performed a power analysis to determine the minimal number of participants. A total N of 129 would be sufficient to have 0.8 statistical power to observe a medium effect size (Cohen's f = 0.25) at an alpha level of 0.05. We also determined the optimal sample size for the LCA algorithm. Assuming that pain ratings during the conditioning phase would result in a 2-class model, a minimum N of 109 was needed to achieve 0.8 statistical power to detect a medium effect size (Cohen's w = 0.44) [(21), Table 8]. Thus, the current study, with 199 TMD participants and 380 healthy participants, allowed us to determine placebo effects, as well as the underlying conditioning strength pattern, with adequate power (>0.8).

### RESULTS

### Pain Ratings During the Conditioning Phase

An omnibus ANOVA for repeated measurements was conducted with red and green trials set as the within-subjects dependent variable and group (TMD vs. healthy participants) as the between-subjects variable. The significant main effect of the condition (F<sub>1,577</sub> = 6633.11, p < 0.001, Cohen's f = 3.39) indicated that, overall, participants rated red screen pain (mean = 69.64, sem = 0.68) as significantly different than green screen pain (mean = 9.67, sem = 0.41, p < 0.001). The significant interaction (F<sub>1,577</sub> = 20.44, p < 0.001) between the condition (red vs. green trials) and the group indicated that TMD participants showed a smaller red-minus-green difference (mean = 56.64, sem = 1.19) during the conditioning phase than healthy participants (mean = 63.30, sem = 0.86, p < 0.001; Cohen's d = 0.38). Moreover, the significant main effect of the group (F<sub>1,577</sub> = 9.87, p = 0.002) suggested that, during the conditioning phase, TMD participants reported significantly lower overall pain intensities (mean = 38.65, sem = 0.68) than healthy participants (mean = 40.98, sem = 0.49). A separate analysis for controls and cases was also included, and we observed a significant difference in pain ratings between red and green stimulations in both TMD (F<sub>1,198</sub> = 2182.82, p < 0.001; Red: mean = 66.66, sem = 1.08; Green: mean = 10.02, sem = 0.68; Cohen's f = 3.32) and healthy participants (F<sub>1,379</sub> = 5468.46, p < 0.001; Red: mean = 72.62, sem = 0.80; Green: mean = 9.32, sem = 0.47; Cohen's f = 3.79).
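For readers who wish to relate the reported F statistics to the effect sizes, the sketch below applies the standard conversion from an F value to Cohen's f via partial eta squared. The article does not state how its effect sizes were computed, so treating this conversion as the one used is an assumption; it does, however, reproduce the reported f values for the main effect of condition in the overall sample and in the TMD group. The function name is hypothetical.

```python
import math


def cohens_f_from_f_stat(f_stat, df_effect, df_error):
    """Convert an F statistic to Cohen's f via partial eta squared."""
    eta_p2 = (f_stat * df_effect) / (f_stat * df_effect + df_error)
    return math.sqrt(eta_p2 / (1.0 - eta_p2))


# Main effect of condition during conditioning, overall sample.
f_overall = cohens_f_from_f_stat(6633.11, df_effect=1, df_error=577)
# Main effect of condition during conditioning, TMD participants only.
f_tmd = cohens_f_from_f_stat(2182.82, df_effect=1, df_error=198)
print(round(f_overall, 2), round(f_tmd, 2))
```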

#### Expectation Changes Across Time

There was a significant main effect of time on self-reported expectations (baseline vs. after the conditioning phase vs. after the testing phase, F<sub>2,1148</sub> = 393.16, p < 0.001). Post-hoc analyses applying Bonferroni correction indicated that baseline expectations were significantly lower (mean = 43.92, sem = 1.05) than both reinforced post-conditioning expectations (mean = 74.75, sem = 1.06, p < 0.001) and overall expectations after the testing phase (mean = 68.73, sem = 1.15, p < 0.001). As anticipated, reinforced expectations after the conditioning phase were higher than overall expectations after the testing phase (p < 0.001). There were no differences between groups (TMD vs. healthy participants, F<sub>1,574</sub> = 0.37, p = 0.542), indicating that TMD and healthy participants had comparable expectations at baseline, after the conditioning phase, and after the testing phase.

#### Placebo Analgesia Induced by Conditioning Procedure

An omnibus ANOVA for repeated measurements was conducted with red and green trials during the testing phase set as the within-subjects dependent variable and group as the between-subjects variable. The significant main effect of the condition (F<sub>1,577</sub> = 631.03, p < 0.001, Cohen's f = 1.05) indicated that, overall, participants displayed placebo analgesia induced by the conditioning procedure, with significantly lower pain intensity ratings for green trials (mean = 30.75, sem = 0.87) in comparison with red trials (mean = 49.73, sem = 0.94, Cohen's d = 1.05). There was no significant interaction between the condition and group (F<sub>1,577</sub> = 3.29, p = 0.070), suggesting that the placebo analgesia was similar in the TMD (mean = 17.60, sem = 1.22) and healthy participants (mean = 20.35, sem = 0.89). Separate analyses for TMD and healthy participants also showed significant placebo analgesia, as revealed by the main effect of the condition (red vs. green) on pain intensity ratings during the testing phase in both TMD (F<sub>1,198</sub> = 197.75, p < 0.001; Cohen's f = 1.00) and healthy participants (F<sub>1,379</sub> = 540.78, p < 0.001; Cohen's f = 1.19). That is, pain ratings for test trials (green) were significantly lower than for control trials (red) during the testing phase in both TMD (Green: mean = 31.17, sem = 1.37; Red: mean = 48.78, sem = 1.45; p < 0.001) and healthy participants (Green: mean = 30.33, sem = 1.04; Red: mean = 50.68, sem = 1.13; p < 0.001).

#### Identifying Critical Covariates

The results of linear regression indicated that older age was associated with lesser placebo analgesia (b = −0.20, p < 0.001). Additionally, higher warmth-detection threshold was associated with lesser placebo analgesia (b = −0.10, p = 0.020). Given that age and warmth-detection thresholds had a significant impact on placebo analgesia, those two variables were treated as covariates in the LCA models.

### Latent Class Analysis

We modeled the trajectory of the effects of learning using the delta scores of red-minus-green pain intensity ratings during the conditioning phase. For TMDs, the LMR-LRT was significant for the 2-Class model according to the delta pain ratings during the second round of conditioning (p = 0.035) with a high entropy value (0.858, Table 2). This suggests that placebo conditioning trajectories differed substantially during the second round of the conditioning phase between the two subgroups. The goodness of fit criteria were adequate with BIC = 9839.827 for this model. Class 1, including 164 participants (82.4%), was characterized by persistent large delta scores of pain ratings. On the contrary, Class 2 (35 participants, 17.6%) was characterized by decreasing delta scores over trials (Figure 2A).

TABLE 2 | Goodness-of-fit criteria for LCA models within TMD and healthy controls (HC).

Additionally, the TMD participants were classified into two classes based on delta scores using the overall conditioning phase (12 trials in total). The LMR-LRT test was significant (p = 0.0499) with adequate entropy value (0.812), suggesting that the two subgroups showed distinct trajectory patterns of delta scores. The goodness of fit criteria were adequate with BIC = 19878.389 for this model. Class 1, including 167 participants (83.9%), was characterized as gradually increasing delta scores over trials. On the contrary, Class 2 (32 participants, 16.1%) was characterized as gradually decreasing delta scores over trials (Figure 2B).

For healthy participants, the LCA model for the first round of conditioning resulted in a 2-Class model (LMR-LRT test p = 0.021) with a high entropy value (0.892), suggesting differences in the trajectory patterns of conditioning between the two subgroups. The goodness of fit criteria were adequate with BIC = 18200.348 for this model. Class 1, including 360 participants (94.7%), was characterized as having increasing effects of conditioning over trials, with greater subsequent placebo analgesia. Class 2 (20 participants), on the other hand, was characterized as having persistently lower effects of conditioning, with fluctuating changes in subsequent placebo analgesia over the trials (Figure 2C). The remaining LCA models, which did not meet the criteria, are reported in Table 2.
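The entropy values quoted above summarize how cleanly participants are assigned to classes. The text does not define the statistic; the sketch below assumes the normalized (relative) entropy commonly reported for mixture models, E = 1 − Σᵢ Σₖ (−pᵢₖ ln pᵢₖ) / (N ln K), where pᵢₖ is participant i's posterior probability of belonging to class k. The function name and the example probabilities are hypothetical and are not study data.

```python
import math


def relative_entropy(posteriors):
    """Normalized entropy of class assignment (1 = perfect separation).

    `posteriors` is a list of rows, one per participant, each holding
    that participant's posterior class-membership probabilities.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # p * ln(p) tends to 0 as p tends to 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))


# Hypothetical posterior probabilities for four participants, two classes.
example = [
    [0.98, 0.02],
    [0.95, 0.05],
    [0.10, 0.90],
    [0.60, 0.40],
]
print(round(relative_entropy(example), 3))
```

Values close to 1 indicate that most participants have a posterior probability near 1 for a single class, which is the sense in which the reported values of 0.81 to 0.89 are described as adequate or high.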

#### Class Differences in Placebo Analgesia

The grouping based on delta scores during the conditioning phase significantly predicted placebo analgesia within both TMD (Figures 3A, B) and healthy participants (Figure 3C). For TMD participants, those who showed persistently large differences between red and green trials in the second round of conditioning (Class 1) displayed significantly higher placebo analgesia in the testing phase than participants from Class 2, who showed reducing differences between red and green trials during the conditioning phase (Mann-Whitney U = 1650.5, p < 0.001, Figure 3A). When classes were identified based on delta scores across the whole conditioning phase (12 trials), the results indicated that TMD participants who showed increasing differences between red and green pain ratings over trials also had greater placebo analgesia (Mann-Whitney U = 1335.5, p < 0.001, Figure 3B). These results indicated that larger and increasing delta scores during the conditioning phase were associated with greater placebo analgesia within TMD participants.

Similarly, healthy participants who reported more substantial differences between red and green trials during the first round of conditioning (Class 1) displayed significantly larger placebo analgesia compared to those who reported consistently small differences between red and green trials during the conditioning phase (Class 2, Mann-Whitney U = 2180.5, p = 0.003, Figure 3C).

#### Class Differences in Expectations

We determined the class differences in self-reported expectations (baseline vs. after conditioning phase vs. after testing phase) within TMD and healthy participants, separately. For TMD participants, two classes were identified based on session 2 of the conditioning phase (trial 7 to trial 12). Class 1 was characterized by greater conditioning strength while class 2 was characterized by smaller conditioning strength. Those two classes did not differ in baseline expectations (Mann-Whitney U = 2857.0, p = 0.966) or self-reported effectiveness after the testing phase (Mann-Whitney U = 2662.5, p = 0.501). However, class 1 displayed significantly greater reinforced expectations after the conditioning phase (mean rank = 106.44) in comparison with class 2 (mean rank = 69.81, Mann-Whitney U = 1813.5, p = 0.001). TMD participants were also classified into two classes based on the overall conditioning phase (trial 1 to trial 12). Class 1 was characterized by greater overall conditioning strength while class 2 was characterized by less overall conditioning strength. Class 1 and class 2 did not show any differences in baseline expectation ratings (Mann-Whitney U = 2365.50, p = 0.296). However, those TMD participants with greater overall conditioning strength (class 1) showed greater reinforced expectations (Mann-Whitney U = 1364.00, p < 0.001) and self-reported effectiveness after the testing phase (Mann-Whitney U = 1976.00, p = 0.019) than those with a lower level of conditioning strength (class 2).

In terms of healthy participants, two classes were identified based on session 1 of the conditioning phase (trial 1 to trial 6). Class 1, which was characterized by greater delta scores during the first session of the conditioning phase, showed greater reinforced expectations (mean rank = 191.97) in comparison with class 2, which was characterized by smaller conditioning strength (mean rank = 132.97, Mann-Whitney U = 2336.50, p = 0.021). Class 1 and class 2 did not show any differences in baseline expectation ratings (Mann-Whitney U = 3415.50, p = 0.711) or self-reported effectiveness after the testing phase (Mann-Whitney U = 3417.50, p = 0.717).
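The mean ranks and U statistics in this section are linked by a simple identity: for a group of size n with rank sum R, U = R − n(n + 1)/2, and the two groups' U values sum to n1·n2. The sketch below applies this to the reinforced expectations of the TMD classes based on the second conditioning round. It assumes that all 164 and 35 class members contributed a rating (the text does not report missing data for this comparison); the function name is hypothetical.

```python
def u_from_mean_rank(mean_rank: float, n: int) -> float:
    """Mann-Whitney U for one group, given its mean rank and group size."""
    rank_sum = mean_rank * n
    return rank_sum - n * (n + 1) / 2


# Class sizes are taken from the LCA results; assuming no missing ratings.
n_class1, n_class2 = 164, 35
u_class1 = u_from_mean_rank(106.44, n_class1)
u_class2 = u_from_mean_rank(69.81, n_class2)

# By convention the smaller of the two U values is reported.
u_reported_style = min(u_class1, u_class2)

print(round(u_class1, 2), round(u_class2, 2), round(u_reported_style, 2))
print("check, U1 + U2 vs n1 * n2:", round(u_class1 + u_class2, 2), n_class1 * n_class2)
```

Under these assumptions the smaller U agrees with the reported value of 1813.5 up to the rounding of the mean ranks.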

#### Mediation Analysis

We tested the hypothesis that the reported reinforced expectations of pain reductions would mediate the association between the identified classes and placebo analgesia observed during the testing phase. Interestingly, when TMD classes were identified based on the second round of conditioning, we found that both the direct effect (c' = 11.49, 95%BCI = [5.11, 17.87]) and indirect effect (ab = 1.24, 95%BCI = [0.11, 2.83]) were significant, suggesting that expectations partially mediated the association between class membership and placebo analgesia (Figure 4). Namely, in comparison to class 2, class 1 was characterized as having larger delta scores in the second round of conditioning and displayed larger placebo analgesia during the testing phase by inducing higher expectations of pain reduction. However, the indirect effect was not significant when the TMD class was identified based on the whole conditioning phase (12 trials) with ab = 1.43, 95%BCI = [−0.08, 3.43], suggesting that the second round of conditioning played a more critical role in inducing placebo analgesia. We found that classes of healthy participants differed in expectation levels (Mann-Whitney U = 2513.50, p = 0.023), with class 1 displaying a significantly higher level of pain reduction expectations than class 2. However, the indirect effect was not significant (ab = −0.31, 95%BCI = [−1.45, 0.96]), suggesting that expectations did not mediate the association between the classes and placebo analgesia in healthy volunteers.
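To make the logic of the mediation test concrete, the following sketch estimates the paths of a simple mediation model (class → expectation → placebo analgesia) and a bootstrap interval for the indirect effect ab. It is an illustration only: the data are synthetic, all names are hypothetical, and a plain percentile bootstrap is assumed, which may differ from the procedure behind the reported 95% BCIs. The decision rule is the same as in the text: the indirect effect is treated as significant when the interval excludes zero.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def mediation_paths(x, m, y):
    """Return (a, b, c_prime) from the OLS fits M ~ X and Y ~ X + M."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx                               # effect of class on mediator
    det = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / det           # mediator effect, class held fixed
    c_prime = (smm * sxy - sxm * smy) / det     # direct effect of class
    return a, b, c_prime


def bootstrap_indirect_ci(x, m, y, n_boot=1000, seed=0):
    """Percentile bootstrap 95% interval for the indirect effect a * b."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # skip degenerate resamples with a single class
            continue
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        a, b, _ = mediation_paths(xs, ms, ys)
        estimates.append(a * b)
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Synthetic data (NOT study data): class membership, expectation, analgesia.
rng = random.Random(1)
x = [1] * 100 + [0] * 100
m = [60 + 10 * xi + rng.gauss(0, 10) for xi in x]
y = [5 + 8 * xi + 0.3 * mi + rng.gauss(0, 8) for xi, mi in zip(x, m)]

a, b, c_prime = mediation_paths(x, m, y)
low, high = bootstrap_indirect_ci(x, m, y)
print(f"ab = {a * b:.2f}, c' = {c_prime:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
print("indirect effect significant:", low > 0 or high < 0)
```

Partial mediation, as reported for the TMD classes from the second conditioning round, corresponds to the case in which the intervals for both ab and c' exclude zero.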

#### DISCUSSION

This study used an LCA approach to determine how learning patterns during a conditioning phase can affect the formation of placebo analgesia in TMD and healthy participants. It further investigated the relationship between expectations and placebo analgesia. We found that the class of participants with larger perceived differences (delta scores) during conditioning between the maximally painful (red) and minimally painful (green) stimuli exhibited larger placebo effects than those with smaller delta scores. Expectations of pain relief rated after the conditioning procedure were larger in those who reported larger differences during the testing phase, mediating the subsequent placebo analgesic effects.

LCA is a method to uncover unobserved subgroups where group members share homogeneous characteristics in measured variables (16, 22). This method has been broadly used to explore subtypes of different symptoms, such as eating disorders (23), attention deficit/hyperactivity disorder (24, 25), post-traumatic stress disorder (26, 27), borderline personality disorder (28), and low back pain (29). The advantages of LCA are its flexibility in dealing with both simple and complex data and its rigor in choosing class criteria (30). Given that prior experience and expectation are critically associated with placebo effects (1), these advantages enabled us to apply this approach to identify learning patterns that could eventually be associated with placebo analgesic effects.

In this study, the LCA-generated models classified both TMD and healthy participants as either showing larger (class 1) or smaller (class 2) delta scores during the conditioning phase. Not surprisingly, the classes with larger delta scores also displayed significantly greater placebo analgesia than those with smaller delta scores. The TMD participants characterized as having larger delta scores in the second round of conditioning displayed greater placebo analgesia during the testing phase by inducing higher expectations of pain reduction than the TMD participants who had smaller delta scores during the conditioning phase. This is in line with previous studies, although to our knowledge our approach is the first one that has looked specifically at learning patterns that drive placebo analgesic effects.

Placebo analgesia describes the beneficial results of a treatment that are due to context, rather than the actions of a drug (1, 3, 31). Previous studies have postulated and demonstrated that prior experiences (pain ratings during a conditioning phase), sensory information (intensity of painful stimuli), and context (cues) all contribute to the formation of placebo analgesia (3–5). Classical conditioning, which forms the expectation of pain relief through reinforcement association, is one of the most effective means of exploring how prior experiences can shape placebo analgesic effects (3), and has been found to induce stronger placebo effects than a mere verbal suggestion procedure [see review, (32)]. We expanded previous findings on conditioned placebo analgesia (13, 33) by showing that distinct learning patterns during a conditioning phase were associated with distinct subsequent placebo analgesic effects. Specifically, participants who had persistently large and/or increasing delta scores during the conditioning phase also displayed greater placebo analgesia than those who showed persistently lower and/or decreasing delta scores during the conditioning phase. This held true in both TMD and healthy participant groups. Indeed, our results highlight the important role of prior experience (i.e., the associational processes during classical conditioning) in shaping placebo analgesia. The results were also meaningful in a clinical context, given that TMD participants who suffer from chronic orofacial pain demonstrated a significant impact of learning patterns on placebo effects. Our findings are also in line with previous clinical studies exploring the association between experiences of treatments and placebo effects (34), where the authors found that previous successful treatments would result in greater placebo effects.

Given that the majority of studies in the area of placebo and pain research have been conducted in healthy, pain-free volunteers, an open question is to what extent we can translate the wealth of knowledge on neurobiological mechanisms of endogenous nociceptive inhibition, or how neural and biological systems interact to block perception of painful stimuli, to populations of pain patients (35). Our findings show that patients suffering from chronic orofacial pain experience placebo effects and therefore may benefit from the activation of descending pain modulation systems and cognitive modulation of expectancy. Our results support findings from studies with pain populations such as chronic irritable bowel syndrome (36–38), idiopathic and neuropathic pain (39–41), low back pain (42, 43), migraine (44), and knee osteoarthritis (45). In addition, these findings align with previous results comparing healthy participants and participants with chronic pain (46, 47).

The significant LCA classes discovered in our study suggest that distinct patterns during conditioning for both TMD and healthy participants induced significant placebo analgesia. Although TMD and healthy participants showed similar overall placebo analgesia (during the testing phase), the two cohorts displayed different learning strategies during the conditioning phase. The results of LCA modeling indicate that the TMD patients' learning patterns in the second round of conditioning and healthy participants' learning patterns in the first round of conditioning were associated with the magnitude of their placebo analgesia during the testing phase. The TMD participants were relatively slower in acquiring conditioned pain responses compared to healthy pain-free participants.

Prior experiences not only contribute to the formation of the placebo effect per se, but they may also shape pain relief expectations (33). In fact, in this study, the link between learning patterns and placebo analgesia was partially mediated by the magnitude of pain-relief expectations. Namely, TMD participants with larger delta scores during the conditioning phase also had higher pain-relief expectations, which in turn induced larger placebo effects. For the healthy participants, too, those who had persistently larger delta scores displayed greater placebo analgesia than those who reported persistently smaller delta scores. In other words, prior analgesic experiences critically and dynamically affected expectations. Although the mediation model determined that self-reported expectations did not significantly mediate individual placebo analgesia, we found that healthy participants with greater delta scores displayed higher pain-reduction expectations than those with smaller delta scores, indicating a strong influence of learning patterns on the formation of expectations.

The current study has several limitations. First, there was an unequal distribution of participants in our LCA approach, specifically within the identified classes and patterns of responses in the conditioning phase. This unequal distribution may indicate that our model described outlying patterns, which may limit its applicability to a given individual. Moreover, clinical translatability of our modeling approach is hampered because it relies on the conditioning procedure and experimental pain. A clinician treating a single patient may not be able to directly test how that patient responds to conditioning cues and acute experimental pain. Additionally, our paradigm only investigated placebo effects using an acute pain stimulus, and yet, given the high load of pharmacological analgesics consumed by chronic pain patients, the clinical use of placebos is frequently discussed for treatment of chronic pain. Moreover, according to Wiech (4), expectations that are too "far-fetched" may result in updated expectations. The current data only contained expectations rated after the conditioning, which would not allow us to make inferences about the dynamic expectation modulation processes induced by prior experiences. Finally, the exploratory nature of the LCA algorithm used in this study limited the generalization of the present results to a broader population. Future research is required to confirm the underlying learning patterns of chronic pain patients and to determine the associated placebo responsiveness at the individual level, which can help optimize individualized therapeutic strategies.

Aside from the limitations, the strengths of the current study need to be outlined. First, this is the first study exploring placebo analgesia in chronic orofacial pain (1). Second, this is the first study to use LCA modeling of a response pattern during the conditioning phase of a well-controlled experimental setting for placebo analgesia. This method enabled us to unveil distinct patterns that were not set a priori, and thus the classes we identified emerged naturally. Finally, the current study was the first to demonstrate that TMD participants experience similar conditioned placebo analgesia to healthy pain-free controls. This is critical, given that TMD participants had substantially different prior pain and treatment experiences than healthy controls. Understanding what is similar and what is different in the development of placebo analgesia across the two populations is valuable to successfully developing future treatments.

Although this study provided experimental evidence about how prior experience may influence placebo responsiveness, it still had some clinical implications. First, we found that TMD participants who had suffered from ongoing chronic pain displayed comparable placebo analgesia in comparison with healthy participants, suggesting that chronic pain patients could benefit from placebo procedures as much as healthy populations do. More importantly, it is likely that chronic pain participants who have prior experiences of substantial pain reduction will benefit more from expectancy-induced analgesia in comparison with those who have not had pain relief experiences. In clinical settings, healthcare professionals may need to consider prior therapeutic experiences of patients and expectations of treatment effectiveness when providing treatment plans.

## CONCLUSION

Placebo analgesia was induced in chronic orofacial pain and pain-free study participants via a conditioning procedure, and patterns of their response to placebo during conditioning and testing phases were analyzed. LCA was conducted to classify the participants based on the trajectories of their pain ratings during the conditioning rounds. Participants were grouped into two classes: one characterized by persistently greater differences in their pain ratings during the conditioning rounds and one by persistently lower differences in their pain ratings during the conditioning rounds. Both TMD and pain-free participants in the first class displayed greater placebo analgesia than those in the second class. Furthermore, expectation acted as a mediator for this relationship. This is the first study exploring TMD and LCA in estimating placebo analgesic responsiveness. Modeling therapeutic effects of placebo has large implications for healthcare, specifically in terms of optimizing clinical trial design and even developing personalized therapeutic strategies. Chronic pain patients with greater prior pain relief experiences may respond more to placebo procedures when compared with those without previous pain reduction experiences. Healthcare providers should consider prior therapeutic experiences of the patients and assess their expectations of treatment effectiveness when providing pain therapies.

### DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Institutional Review Board (IRB), University of Maryland Baltimore with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the IRB, University of Maryland Baltimore.

#### AUTHOR CONTRIBUTIONS

LC designed the study and contributed to data analyses. TA, NH, CT, MB, and NR collected the data. JP screened and confirmed the diagnosis of TMD. CT analyzed the data in collaboration with LC and YW. LC, SZ, and YW contributed to the interpretation of the results. LC drafted the manuscript in collaboration with CT, YW, and NR. All authors commented on and approved the final draft.


#### FUNDING

This research is supported by NIDCR (R01 DE025946, LC). The funding agencies have no roles in the study. The views expressed here are the authors' own and do not reflect the position or policy of the National Institutes of Health or any other part of the federal government. We acknowledge the support of the University of Maryland Baltimore, Institute for Clinical & Translational Research (ICTR).




Conflict of Interest: LC reported having received support for Invited Lectures outside the submitted work.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer KM declared a past co-authorship with one of the authors LC to the handling editor.

Copyright © 2020 Wang, Tricou, Raghuraman, Akintola, Haycock, Blasini, Phillips, Zhu and Colloca. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# "Placebo by Proxy" and "Nocebo by Proxy" in Children: A Review of Parents' Role in Treatment Outcomes

Efrat Czerniak <sup>1</sup> , Tim F. Oberlander <sup>1</sup> \*, Katja Weimer <sup>2</sup> , Joe Kossowsky <sup>3,4</sup> and Paul Enck <sup>5</sup> \*

<sup>1</sup> Department of Pediatrics, BC Children's Hospital Research Institute, University of British Columbia, Vancouver, BC, Canada, <sup>2</sup> Department of Psychosomatic Medicine and Psychotherapy, Ulm University Medical Center, Ulm, Germany, <sup>3</sup> Department of Anesthesiology, Critical Care and Pain Medicine, Harvard Medical School, Boston Children's Hospital, Boston, MA, United States, <sup>4</sup> Department of Clinical Psychology & Psychotherapy, University of Basel, Basel, Switzerland, <sup>5</sup> Department of Psychosomatic Medicine and Psychotherapy, University Medical Hospital Tübingen, Tübingen, Germany

#### Edited by:

Charlotte R. Blease, Beth Israel Deaconess Medical Center and Harvard Medical School, United States

#### Reviewed by:

Christina Papachristou, Aristotle University of Thessaloniki, Greece Miriam Goebel-Stengel, HELIOS Klinik Rottweil, Germany Michael Bernstein, Brown University, United States

#### \*Correspondence:

Tim F. Oberlander toberlander@bcchr.ca Paul Enck paul.enck@uni-tuebingen.de

#### Specialty section:

This article was submitted to Psychosomatic Medicine, a section of the journal Frontiers in Psychiatry

Received: 08 April 2019 Accepted: 21 February 2020 Published: 11 March 2020

#### Citation:

Czerniak E, Oberlander TF, Weimer K, Kossowsky J and Enck P (2020) "Placebo by Proxy" and "Nocebo by Proxy" in Children: A Review of Parents' Role in Treatment Outcomes. Front. Psychiatry 11:169. doi: 10.3389/fpsyt.2020.00169

The "placebo (effect) by proxy" (PbP) concept, introduced by Grelotti and Kaptchuk (1), describes a positive effect of a patient's treatment on persons in their surroundings such as family members or healthcare providers, who feel better because the patient is being treated. The PbP effect is a complex dynamic phenomenon which attempts to explain a change in treatment outcome arising from an interaction between a patient and an effect from proxies such as parents, caregivers, physicians or even the media. By extension the effect of the proxy can also have a negative or adverse effect whereby a proxy feels worse when a patient is treated, giving rise to the possibility of a "nocebo (effect) by proxy" (NbP), and by extension can influence a patient's treatment response. While this has yet to be systematically investigated, such an effect could occur when a proxy observes that a treatment is ineffective or is perceived as causing adverse effects leading the patient to experience side effects. In this narrative review, we take these definitions one step further to include the impact of PbP/NbP as they transform to affect the treatment outcome for the patient or child being treated, not just the people surrounding the individual being treated. A systematic search of the literature on the subject using the Journal of Interdisciplinary Placebo Studies (JIPS) database (https://jips.online) and PubMed (NCBI) yielded very few relevant studies, especially in children. The effect of PbP per se has been studied in parents and their children for temper tantrums, acupuncture for postoperative symptoms, as well as for neuroprotection in very preterm-born infants.
This paper will review the PbP/NbP concepts, show evidence for their presence in children's treatment outcome and introduce clinical implications. We will also offer suggestions for future research to further our understanding of the role of the proxy in promoting or detracting from treatment benefit in children. Increasing an appreciation of the PbP and NbP phenomena and the role of the proxy in children's treatment should improve research study design and ultimately harness them to improve clinical child healthcare.

Keywords: placebo effect, nocebo effect, nonspecific effects, treatment environment, clinical implications

### INTRODUCTION

To date attention has focused on research studying the placebo effect (PE), the biopsychosocial process, which engages perceptual and cognitive processes that lead to therapeutic benefits associated with the administration of a placebo in the context of individuals being treated (2). These include the impact of factors such as learning, conditioning and the clinical encounter, which affect outcomes, typically via the individual (3–6). In contrast, the placebo effect needs to be distinguished from the placebo response, which includes all health outcomes that follow administration of an inactive treatment (7). The placebo response is widely considered a phenomenon underlying a positive treatment response to both the administration of active medication and treatment with an inert substance (placebo) in a randomized controlled trial, and is related among others to spontaneous remission, regression to the mean and the Hawthorne effect (i.e., the effect of being observed) (2).

Many factors may have a direct effect on one's treatment: subjective e.g., expectation of clinical benefit (8, 9), conditioning (10, 11), mood (12, 13), patient-clinician interaction (14, 15) (see **Figure 1**, Arrow A), age (16), and objective e.g., medication labeling (17) or study design (18–20). The PE may also be the result of patients' mindset guiding their perceptions and thus their interpretation of the clinical environment, affecting behavior e.g., decision-making (21, 22) or driving biological changes e.g., in the immune system (23). Rather than dismissing factors surrounding one's treatment as "nonspecific," these elements underlying the PE may actually be central to understanding treatment outcomes in general and in children in particular (16, 24).

### Setting the Scene

The placebo response in children has been widely observed in migraine (25–27), attention-deficit hyperactivity disorder (ADHD) (28, 29) and depression (30, 31) drug trials but to date, there are few experimental studies (32, 33) investigating the mechanisms underlying the PE, especially in very young children. Conceivably the PE plays a critical and unappreciated role in child health. The PE in these cases is associated with similar underlying mechanisms—just not always intrinsic to the patient.

Little attention has been placed on the effects a placebo treatment exerts on the people surrounding the individual being treated, i.e., the different entities (or proxies) inherent to the treatment environment that may have a direct or even reciprocal communication channel with the patient, such as clinicians (see **Figure 1**, Arrow A), family members (Arrow B) and caregivers (Arrows D) surrounding the treatment setting or online medical advice (e.g., Arrow E). In this narrative review we will examine what is known about "placebo by proxy" (PbP), and its inverse, "nocebo by proxy" (NbP), in treatment outcomes in children, as well as its clinical implications.

The concept of the PbP was first introduced in 2011 by Grelotti and Kaptchuk, who describe PbP as the positive effects a placebo treatment exerts on the people surrounding the individual being treated, e.g., family members, caregivers and clinicians (1). Proxies often feel better due to the mere fact that an individual is receiving medical care, a response regarded by the French anthropologist Claude Lévi-Strauss as the global "sense of security" which is critical for the social group's existence (34). PbP is also being described as having the potential to influence evaluation of treatment outcome, especially if proxies are exposed to encouraging objective signs displayed by the patient (35). Grelotti and Kaptchuk continue with the idea that perceptions or misperceptions among the parents may act as a contributing factor to the placebo response seen in children with treatment resistant epilepsy (36).

The first example for a PbP without being classified as such can be seen in patients with Alzheimer's disease (37, 38), where caregivers often report clinical symptoms on behalf of the patient and thus may influence treatment outcomes. For instance, it has been found in patients with Alzheimer's disease that negative caregiver bias (compared to self-report) at baseline, predicts and may even be considered a risk factor for developing apathy within a year (39). This example may illustrate biased reporting by the proxy that has a bearing on the study outcomes, originating in the caregiver's (negative) feeling and thoughts, which may subsequently impact the patient and study findings.

The role of the observer (i.e., proxy) in treatment outcome and the interpersonal alliance has also been studied in the context of pain, showing the importance of the proxy in motivating the patient to taking steps of self-caring behavior in their own healing process (40). This bond which underlies communication about pain projections is dynamic and subject to change over time due to learning processes by both the patient and the proxy. Besides expectation, two more modifying factors are believed to drive proxy's behavior ultimately altering patient's experience and the long-term coping with chronic pain: stigma and validation. The former is the suspicion raised by the observer to the invisible pain and its debilitating effects (41), and the latter, the inverse of stigma, is when the observer gives legitimacy to one's pain (40). In this narrative review, we take this definition of PbP one step further, building on Grelotti and Kaptchuk's definition, to include the impact of PbP (and NbP) as it transforms to affect the treatment outcome for the patient or child being treated (**Figure 1**, Arrow B), not just the people surrounding the individual being treated. We define PbP as the positive effects a placebo treatment exerts on the people surrounding the individual being treated, e.g., family members, caregivers and clinicians or the positive effect these proxies convey to the individual being treated resulting in a positive clinical outcome.

In contrast to a placebo response, a nocebo response is considered the worsening of a symptom after the administration of an inactive intervention, as highlighted by the increased pain observed in placebo analgesia studies (42). Thus, the development of an adverse effect, e.g., apathy following a caregiver's biased report as described above (39), can be considered an NbP. Correspondingly, we define NbP as the negative effects a nocebo treatment exerts on the people surrounding the individual being treated, e.g., family members, caregivers, and clinicians, or the negative effect these proxies convey to the individual being treated, resulting in a negative clinical outcome. Other negative effects could occur in this regard but have yet to be systematically investigated; for example, an ineffective treatment may be continued only because proxies feel better about it or feel committed to a certain treatment or clinician (see **Figure 1**, Arrow C). It is also plausible that a proxy will feel worse when an individual's treatment elicits side effects and the proxy perceives it as ineffective. A proxy may also experience negative feelings stemming from the loss of secondary benefits such as extra attention, the patient's gratitude, or the sense of being needed. How this affects the patient's treatment, directly or in the long run, has yet to be examined, at least in placebo research.

This paper reviews the PbP concept and its breadth (depicted in **Figure 1**), provides evidence for its presence in children, introduces the concept of NbP, and offers suggestions for future research to further our understanding of the role of the proxy in either promoting or detracting from treatment benefit in children. Greater appreciation of the PbP and NbP phenomena and of the role of the proxy in child healthcare should improve research study design and ultimately allow these phenomena to be harnessed to improve clinical child healthcare.

### Learning Processes in the Treatment Environment

The PE can arise from either conscious or unconscious mechanisms (4, 43–47), and there is comparable evidence of these processes in children (16, 24). Regardless of the chemical effect of a medication itself, placebos have been shown to mimic the activity of pharmaceutical agents given for the treatment of a wide range of conditions such as pain (48), depression (49), and Parkinson's Disease (PD) (50). Conscious cognitive and emotional factors such as anticipation (51), meaning (52, 53), faith (54), trust, belief (55), and hope (56) have been shown to differ greatly between individuals and to alter clinical outcomes. In contrast, unconscious factors such as the active component of the pill result in unconscious physiological changes, i.e., a conditioned response. This presumably reflects Pavlovian (classical) conditioning whereby, after repeated pairings of a conditioned stimulus (the color and shape of the pill) with an unconditioned stimulus (the active component), the conditioned stimulus alone can generate a clinical effect. For instance, individuals who experience headaches and regularly take aspirin are likely to link the pill's color, shape, and taste to the relief felt afterwards. Following repeated pairings, a white, round, and bitter pill resembling a known analgesic such as aspirin could also elicit relief.

Thus, the placebo response may arise from learning whereby numerous nonspecific contextual factors such as white coats, syringes, and nurses can come to function as conditioned stimuli as well (57). Moreover, neutral stimuli associated with symptom relief, e.g., the caregiver, the physical examination, or the prescription of medicine, may even acquire positive and desirable healing properties. Positive vs. negative previous clinical experience has been shown to affect the magnitude of placebo analgesia. Following placebo administration, Colloca and Benedetti (58) covertly reduced the pain stimulus to make patients "sense the effectiveness" of an analgesic treatment. This procedure elicited stronger and longer-lasting placebo analgesia responses compared to subjects who were not exposed to the manipulation (58).

Placebos have been reported to be more effective following an active treatment sequence than when given for the first time (59, 60) or when given to patients with severe dementia, e.g., Alzheimer's disease (61–63), suggesting a role for memory in placebo-related learning. It is important, though, to distinguish severe memory deficits from severe cognitive deficits, as the latter do not abolish expectation of clinical benefit. A recent study in intellectually disabled patients has shown their susceptibility to the certainty of receiving genuine medicine (64). The assessment of both active and inactive treatment effects is challenging not only in children, but also in adults with cognitive disabilities. A meta-analysis of 22 studies in adults with genetically determined intellectual disability (e.g., Prader-Willi or Down syndrome) tested placebo response rates as determined by either proxy or objective measures. Higher placebo responses were demonstrated in individuals with higher IQ, as well as in younger patients (65). Hence, conscious expectation is necessary for placebos to "work," playing a major role even in the presence of a conditioned stimulus.

To date, only a few studies can be termed "clinical trials" in placebo research, where participants were intentionally treated with placebos as treatment and appropriate control groups were included to control for other effects. More recently, clinical trials using an open-label placebo design (66) have investigated placebo treatment responses within a psychosocial context that includes the participant's experience, expectations, and feelings (67).

Furthermore, experimental studies have found that placebo effects can be elicited through explicit social observational learning in laboratory conditions. As a case in point, a person who observes a putatively effective treatment in another person shows a similar placebo effect when treated with the apparently similar placebo treatment (68, 69). Implicit social learning of placebo effects could occur through observation of treatments of other persons in everyday life; for example, when children observe that small white pills decrease their mothers' headaches after the mothers take aspirin, small white placebo pills may afterwards decrease headaches in the children, too. This form of implicit learning is difficult to investigate, but could be estimated when placebo effects are induced and compared in family members. A recent experimental study induced conditioned placebo analgesia in both mono- and dizygotic healthy twins who grew up together and found no correlation of placebo effects within twin pairs, but a significant correlation of the PE with the conditioning effect. Individual learning therefore seems to play a more important role than implicit social learning, at least when tested in healthy adults in the laboratory (70).

## PLACEBO BY PROXY IN THE LITERATURE

### Search Method Overview

The concept of PbP has recently received attention from a methodological point of view; however, little has been published on the PbP concept over the last 70 years (71). We have taken the following steps to ensure a qualitative review of the current knowledge on PbP. First, a comprehensive search of peer-reviewed articles (including data papers, meta-analyses, systematic reviews, reviews, commentaries, and several letters) was done using the Journal of Interdisciplinary Placebo Studies (JIPS) literature database (https://jips.online) to inform us about relevant keywords in this field. This database comprises ∼4,500 articles (as of January 2020) pertaining to the placebo effect/response, which were hand-selected by PE and KW from PubMed on a weekly basis (71). The literature search is based on the keywords "placebo" and "nocebo," and relevant articles dealing with the placebo effect/response and the nocebo effect/response are selected and included in the database. This informative search revealed that only the term "proxy" seems to be a valid search term for a full literature search. In a second step, a systematic literature search was performed via the PubMed database (US National Library of Medicine, Bethesda, Maryland), combining placebo by proxy[Title] OR "placebo response" OR "placebo effect" OR "nocebo response" OR "nocebo effect" with (Boolean AND) the term "proxy." This search returned 27 studies, of which 14 used the dictionary meaning of the word "proxy," often to denote an auxiliary tool or method or genetic proximity. The remaining thirteen papers included PbP/NbP, two of them in animals (cats, horses). Two papers used the term "proxy" to acknowledge the existence of an auxiliary person in the treatment environment, but did not test that person's effect. Only nine papers studied the PbP/NbP concept in humans: five reported outcomes in adults, and four in children (72–75) (**Figure 2**). We further scanned the reference section of each article to look for additional publications but found none. Studies pertaining to PbP in adults will not be discussed further in this review.
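The Boolean search string and the screening tallies reported above can be checked with a short sketch. This is only an illustration under stated assumptions, not the authors' tooling: `build_query` and the dictionary keys are hypothetical names, no field tags beyond the `[Title]` tag quoted in the text are assumed, and the counts are simply those given in the paragraph.

```python
def build_query(phrases, required_term):
    """Join alternative phrases with OR, then require one extra term with AND."""
    alternatives = " OR ".join(phrases)
    return f"({alternatives}) AND {required_term}"


# Search phrases as quoted in the text.
phrases = [
    "placebo by proxy[Title]",
    '"placebo response"',
    '"placebo effect"',
    '"nocebo response"',
    '"nocebo effect"',
]
query = build_query(phrases, '"proxy"')

# Screening counts as reported in the text (key names are illustrative).
screening = {
    "retrieved": 27,
    "dictionary_meaning": 14,
    "animal_studies": 2,
    "proxy_acknowledged_only": 2,
    "human_adults": 5,
    "human_children": 4,
}

pbp_nbp_papers = screening["retrieved"] - screening["dictionary_meaning"]
human_studies = (
    pbp_nbp_papers
    - screening["animal_studies"]
    - screening["proxy_acknowledged_only"]
)

# The reported numbers are internally consistent.
assert pbp_nbp_papers == 13
assert human_studies == screening["human_adults"] + screening["human_children"]
```

Keeping the tallies in one structure makes it easy to confirm that each exclusion step accounts for the difference between the retrieved records and the studies finally discussed.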

### Placebo by Proxy in Children—Indirectly Measured

A placebo effect cannot be discussed without considering participants' understanding of the efficacy of medical treatment. For example, an individual's attitude to treatment has been shown to reflect previous medical experience (76, 77). Conscious cognitive elements, e.g., belief in medical treatment (55), proper knowledge of the condition (55, 78), and active involvement in treatment decisions (79), all play an important part in engagement with treatment and clinical responses.

Descriptions of symptoms, particularly in self-appraisal conditions such as anxiety or pain, are associated with subjective and ambiguous self-report in children, often challenging an objective evaluation of treatment responses. Evaluating treatment responses in children requires proxies (parents, relatives, caregivers) to make critical and timely judgements and interpretations of behaviors, while most of our knowledge comes from placebo-controlled randomized trials testing drugs rather than nonspecific effects. Whether symptom improvement is related to the expectation of a treatment rather than the active compound or medication is often difficult to determine in children, where parents or other caregivers play critical roles in care and presumably in treatment responses. Young children, who have yet to develop sufficient language skills, depend solely on their parents. In fact, parents are expected to make decisions for their children without being fully informed about symptoms or being able to assess treatment outcomes, owing to their child's immature and limited communication repertoire. Thus, it is not surprising that parents and other proxies play a critical role in evaluating children's responses to medical treatment. Expectation of the parent (or other proxy) may contribute to the impact of a placebo or of the treatment itself, thereby contributing to a PbP. This can occur when a child's response to therapy is affected by the behavior of others who are aware of the therapy. In this sense, the placebo effect could operate indirectly by changing how proxies themselves behave toward the child, which in turn leads to behavioral and symptomatic changes.

In their seminal paper, Grelotti and Kaptchuk base their observations of the PbP on individuals who rely on others to make treatment decisions because of inherent developmental, cognitive, or communication limitations, such as the elderly with dementia or children (1). The authors argue that antibiotics, which are often overprescribed for children only to meet parents' wishes and concerns (80), operate as impure placebos. Proxies' influence on placebo responsiveness may also be responsible for differences seen between doctor-reported and patient-reported outcomes, especially in depression (81). The notion of the PbP has been examined, albeit indirectly, in a variety of child health settings where parental expectancies appear to have a significant influence on reports of child behavior, parent–child interactions, and treatment responses, such as shifts in expectancy and frequency of health-related visits (30, 31).

In a classic study testing the effect of parental expectations on reported negative effects of sugar on children, mothers were told that their children had been given either large sugar doses (experimental group) or placebo (control group), when all children had actually been given placebo (aspartame). Mothers in the experimental group, who were told that their sons had received sugar, reported their sons to be more hyperactive than did the control mothers, suggesting that parents would rather attribute their children's high activity levels to an external and controllable factor such as "sugar" than to internal and complex origins, e.g., psychological or behavioral problems (82). It could also be the case that mothers who "knew" that their child had received sugar associated their child's behavior with hyperactivity. In addition, mothers in the sugar expectancy group used more control and restraint toward their sons, who in turn showed lower activity (indicated by a wrist actometer) than their peers in the control group. This demonstrates that reports of behavioral effects of sugar on children may be in part the result of parents' perceptual biases. These mothers also demonstrated their sugar expectancies in their actions, i.e., maintaining closer proximity to their sons and commenting more frequently on their behavior in order to control them (82). This change can also be seen as a "self-fulfilling prophecy," which is also well established in teacher-pupil interactions (83).

### Placebo by Proxy in Children—Studied as a Placebo Effect

The effect of PbP per se has been studied in children and their parents for temper tantrums, acupuncture for postoperative morbidities, and in very preterm infants. Whalley and Hyland were among the first to investigate whether the effect of a homeopathic remedy (Bach flower), presumed to be a placebo treatment for temper tantrums, would be affected by parents' beliefs and emotions (72). Even after accounting for interactions over the phone between the physicians and either the child or the parent, parental mood was associated with both the frequency and the severity of their child's tantrums. Importantly, this might be the first test of the impact of PbP, as most children in this study were not informed of the reason they were given the flower essence, and those who were did not exhibit different behavior. The authors note, however, that parents may have altered their behavior toward their children due to their awareness of the treatment, and therefore may have contributed to the change in tantrums. While a child's response to treatment for tantrums could be associated with parental beliefs, expectations, and mood, it remains unclear whether a reduction in tantrums was due to objective changes in child behavior, changes in parental perception, or both. The relationship between parents' daily mood and child tantrums should be considered, as it remains unclear whether this effect was mediated by altered parental behavior toward the child. These findings highlight the importance of the perceived meaning of a treatment response, which may underlie the source of the placebo effect (53).

In a study comparing parental expectancies before and after acupuncture treatment for postoperative symptoms in children, Liodden et al. (73, 75) did not find an association between the children's symptoms and parents' expectations. However, they did report that positive changes in parental expectation were reflected in fewer postoperative symptoms in the children (75). In this study, anxious parents tended to change their expectancy in a positive direction while treatment was ongoing, which may have led to reduced postoperative vomiting in children. The investigators suggest that parental anxiety could be assessed preoperatively and perhaps managed to elicit PbP effects. While parental anxiety has at times been regarded as a barrier, it could be considered a possible facilitator of improved child outcomes in an acute care setting (75).

A placebo-controlled study investigated the neuroprotective effect of early high-dose recombinant human erythropoietin in very preterm-born infants (74). Burkart et al. examined whether parents' belief that their infants had received the drug vs. placebo made a difference in long-term development (74). Children of parents who assumed that their infant had received verum showed a small but significant difference in IQ at 5 years of age compared to those whose parents assumed placebo; however, this difference was judged clinically insignificant. School teachers have also been shown to be essential proxies when rating the impact of therapy for childhood behavioral disturbances such as ADHD. One study reported that both parents and teachers tend to have a positive bias when evaluating ADHD symptoms in a child who they believe has been given medication (84). The authors suggest that the change in the caregiver's perception of their child's behavior following the administration of medication was very similar to an expectancy effect on the child receiving treatment.

## NOCEBO BY PROXY IN CHILDREN—A USEFUL CONCEPT

The impact of the proxy can also go the other way: the "proxy" effect can potentially exert a negative or adverse effect on a child's treatment outcome, whereby a proxy feels worse, leading to an NbP effect. Extending the Grelotti and Kaptchuk definition of PbP, this would comprise the negative effects that a placebo treatment exerts on the child via the beliefs or perceptions of people surrounding the child being treated, e.g., family members, caregivers, and clinicians, not just the impact on the caregiver (1). Such an effect could occur when a proxy observes that a treatment is ineffective, or perceives it as causing adverse effects, leading the patient (i.e., child) to experience negative side effects. While the NbP phenomenon in children has yet to be systematically studied in placebo research, some indication that it may be present comes from studies of parents' responses to a child's pain or temper tantrums, which show that a parent's behavior significantly influences children's pain and function. There is substantial evidence suggesting that maladaptive parental responses to children's pain, such as reassurance and solicitous or protective parenting behaviors, increase children's susceptibility to adverse outcomes both in clinical pain populations (85–87) and in experimentally induced pain (88, 89). As a case in point, in a study of parental responses to children's chronic pain that examined the moderating impact of children's emotional distress on the perception of symptoms and disability, the patients' parents reported on parental responses to their children's pain. Where parents responded to their child's pain with criticism, discounting of the pain experience, increased focus on pain, or granting of special privileges, children appeared to have higher levels of emotional distress, increased disability, and more somatic symptoms. Among youth who infrequently used passive or active coping strategies, higher parental protective behavior was associated with higher levels of disability and somatic complaints (87). Similarly, parental solicitous behavior was associated with more child distress and greater disability (90, 91). Studies of acute pain have demonstrated that children require more restraint and express higher levels of fear during immunizations when parents provide reassurance rather than distraction (92–94). Interestingly, one study found a relation between parenting responses and parental distress, such that parents who were trained to reassure their children during an immunization procedure were more distressed after the procedure was completed (93). Importantly, studies of chronic pain have specifically linked parental protective responses to high levels of children's functional disability (95).

## DISCUSSION

### Clinical Implications

As clinicians, parents, and their children are active participants in treatment outcomes, there needs to be sensitivity to the possibility that a treatment response may arise from processes that reflect PbP and/or NbP. At present, very limited attention has been paid to the potential practice, training, and ethical implications of parent responses, be they placebo or nocebo, that contribute to a treatment response in children. It is conceivable that what clinicians communicate to parents about treatments might affect treatment outcomes via enhancement of parental expectancies, thereby (potentially) enhancing placebo and nocebo effects in children. Given that words and behaviors matter (42, 96), parents should be made aware that their own responses can influence their children's health outcomes, and this raises the critical question of whose responsibility it is to educate parents about their critical roles, for better and worse. Thus, a structured clinical approach is needed that harnesses a parent's or clinician's expectations of treatment benefit (i.e., the placebo effect) and takes into account how these expectations are conveyed, e.g., via attending to symptoms (solicitous or protective), granting permission to avoid regular activities, or saying "this medication may not work, but it's worth trying" vs. "this treatment has been shown to work with other children and I think it will help you." The different directions of communication, the variety of ways in which proxies can take part, and the direction and intensity of each player's contribution to the patient's experience should be considered throughout the therapeutic period (**Figure 1**). Our challenge is to identify these phenomena and harness them to improve research design, clinical practice, and training for clinicians and parents alike.

### Future Research

Appreciation of the treatment expectations and behavioral roles of the "proxies" (parents, caregivers, and peers) and of how they contribute to shaping treatment responses, for better and worse, should be an essential component of clinical care and research design. Given the paucity of reports of PbP/NbP, this requires careful empirical observation of the components that comprise the clinical encounter. This knowledge could then be used to inform study designs that manipulate parental/caregiver expectations, beliefs, or behavior regarding treatment in order to change treatment effects. Given that the dyadic nature of PbP (and by extension NbP) transforms the treatment outcome itself for the patient or child being treated, not just for the people surrounding the individual being treated, study design needs to consider the multidirectional nature of parental influences, e.g., by using path analysis. This could include assessment of the parental beliefs, mindsets, attitudes, and expectations guiding parents' perceptions of the treatment received by their child. Further, it is conceivable that perceptions of competence and empathy also influence placebo effects in this setting (97, 98), whereby some parents are better able to enhance placebo effects via competence/empathy cues. The scarcity of available data underscores the need for future research on placebo effects and on parents' role in their children's treatment.

### Limitations

To date, we know very little about how parental/caregiver expectations of treatment for a child affect the treatment outcome for that child/patient. What we know is that parents/caregivers matter in treatment outcomes, but how this operates and how it can be utilized for clinical benefit remains to be determined. Understanding the inherently dyadic roles parents play in the treatment outcome, as well as the vulnerability in this intricate relationship, is complicated by the fact that parents' perception of the treatment outcome or of their child's pain may change unexpectedly over the course of treatment, at times leading to maladaptive behavior that unintentionally works against their child, such as overprotection resulting in increased pain for the child.

Key ethical considerations are required with respect to what clinicians might communicate to parents about treatments to enhance parental expectancies, thereby (potentially) enhancing placebo and nocebo effects in children. Considering the PbP or NbP phenomenon in both clinical practice and research requires recognition of critical ethical considerations; however, there is no reason to believe these would differ from the key considerations outlined by Blease (2). These include avoiding: (1) deceptive practice, (2) the risk of conveying an impression that all the symptoms are "in your head," and (3) the constraint of help-seeking behavior that may be inherent to a placebo benefit (i.e., lack of faith in mainstream medicine that may be beneficial and necessary). Finally, to advance our understanding of the role of the proxy in evaluating treatment response in children, attention needs to be drawn to the impact of the caregiving setting for young children and for children with developmental disorders that limit communication and cognition.

### Summary

The placebo by proxy effect is a complex, dynamic, interactive concept that attempts to explain an individual's response to treatment as arising from an interaction between the individual and an effect on proxies such as parents, partners, physicians, or even the media. The definition of PbP therefore needs to include the impact of PbP (and NbP) as it transforms to affect the treatment outcome for the patient or child being treated (**Figure 1**, Arrow B), not just the people surrounding the individual being treated. Accordingly, we define "placebo by proxy" in children as a "placebo effect by proxy," namely where a proxy's belief or expectation of benefit leads to therapeutic benefits to the child associated with the administration of a placebo. Placebo by proxy is an inherently reciprocal phenomenon, possibly reflecting that when a proxy feels better (i.e., they convey a response to the placebo administered to the patient/child), they in turn behave differently toward the patient, who in turn experiences symptom improvement. Such symptom improvements could reflect direct placebo effects but, in this context, could also originate from contextual factors affecting the proxy, without excluding the possibility that they have also arisen from the patient's response to the treatment itself. Thus, both the patient and the proxy experience positive effects. Alternatively, this could also occur when an ineffective treatment is continued only because proxies "feel better" or are committed to it.

While the causal direction of effect (parent to child or vice versa) remains to be determined, these findings reflect critical ways in which parents can shape the way their children cope with and manage, for example, chronic pain. Together, the current findings indicate that maternal behavior can have a direct impact on a child's pain report, highlighting the reciprocal, dyadic, and contextual nature of a child's pain experience and supporting the importance of social learning factors in influencing children's pain experiences. Contextual factors clearly affect the response to medications, and so it is not surprising that proxies (such as parents) play a role in treatment responses, considering that PbP, the positive effect of a placebo, and NbP, its negative counterpart, both occur within the social environment of a treated individual.

For children, such proxies might include family members, caregivers, healthcare providers, and friends, and might also include online medical advice ("Dr. Google") accessed via parents. This notion implies that PbP and NbP can arise from a large variety of entities inside and outside the clinical setting, whether directly through human interactions or indirectly via social media or the internet. Additionally, there can be complex reciprocal effects between placebo by proxy and placebo effects, i.e., patients who receive treatment are receptive to the behavior of their proxies, who tend to respond emotionally and straightaway receive medical attention. More actively, proxies are often involved in treatment decisions or even decide for the patient based on their observations and interpretations.

Presumably, when a proxy feels better, they may in turn behave differently toward the patient which may affect symptoms. Such symptom improvements are themselves due to placebo effects as they do not originate from the treatment but from contextual factors. The placebo response in children and adults is not necessarily determined by the same factors or perceived in the same ways. Thus, it is possible that the improvement (or potentiation of adverse outcomes) we witness in children may be mediated by a placebo (or nocebo) response experienced by the parent or other proxy, rather than that experienced by the children themselves.

#### AUTHOR CONTRIBUTIONS

EC, TO, KW, JK, and PE contributed to the conception of the manuscript and reviewed the literature. EC, TO, KW, and PE drafted the manuscript. JK and PE critically revised the manuscript.

#### FUNDING

We acknowledge support by Deutsche Forschungsgemeinschaft and Open Access Publishing Fund of the University of Tübingen. This paper was also supported by a CIHR Planning Grant (Oberlander: CIHR PSC 146382).

#### ACKNOWLEDGMENTS

TO is the R. Howard Webster Professor in Brain Imaging and Early Child Development at the University of British Columbia and his work was supported by the BC Children's Hospital Research Institute, the Canadian Institutes for Health Research, Brain Canada, and Kids Brain Health Network. We would like to thank Ursula Brain for her help with editing the manuscript, and for her insightful comments.


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Czerniak, Oberlander, Weimer, Kossowsky and Enck. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Effects of Expectancy on Cognitive Performance, Mood, and Psychophysiology in Healthy Adolescents and Their Parents in an Experimental Study

Daniel Watolla<sup>1</sup>, Nazar Mazurak<sup>1</sup>, Sascha Gruss<sup>2</sup>, Marco D. Gulewitsch<sup>3</sup>, Juliane Schwille-Kiuntke<sup>1,4</sup>, Helene Sauer<sup>1</sup>, Paul Enck<sup>1</sup> and Katja Weimer<sup>1,5</sup>\*

<sup>1</sup> Department of Psychosomatic Medicine and Psychotherapy, University Medical Hospital Tübingen, Tübingen, Germany, <sup>2</sup> Department of Psychosomatic Medicine and Psychotherapy, Medical Psychology, Ulm University Medical Center, Ulm, Germany, <sup>3</sup> Department of Psychology, Clinical Psychology and Psychotherapy, University of Tübingen, Tübingen, Germany, <sup>4</sup> Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany, <sup>5</sup> Department of Psychosomatic Medicine and Psychotherapy, Ulm University Medical Center, Ulm, Germany

#### Edited by:

Luana Colloca, University of Maryland, United States

#### Reviewed by:

Regula Neuenschwander, University of Bern, Switzerland

Andrea W.M. Evers, Leiden University, Netherlands

> \*Correspondence: Katja Weimer katja.weimer@uni-ulm.de

#### Specialty section:

This article was submitted to Child and Adolescent Psychiatry, a section of the journal Frontiers in Psychiatry

> Received: 21 August 2019 Accepted: 03 March 2020 Published: 17 March 2020

#### Citation:

Watolla D, Mazurak N, Gruss S, Gulewitsch MD, Schwille-Kiuntke J, Sauer H, Enck P and Weimer K (2020) Effects of Expectancy on Cognitive Performance, Mood, and Psychophysiology in Healthy Adolescents and Their Parents in an Experimental Study. Front. Psychiatry 11:213. doi: 10.3389/fpsyt.2020.00213

Objective: Placebo effects on cognitive performance and mood and their underlying mechanisms have rarely been investigated in adolescents. Therefore, the following hypotheses were investigated with an experimental paradigm: (1) placebo effects could be larger in adolescents than in adults, (2) parents' expectations influence their adolescents' expectations and placebo effects, and (3) a decrease in stress levels could be an underlying mechanism of placebo effects.

Methods: Twenty-six healthy adolescents (13.8 ± 1.6 years, 14 girls) each with a parent (45.5 ± 4.2 years, 17 mothers) took part in an experimental within-subjects study. On two occasions, a transdermal patch was applied to their hips and they received an envelope containing either the information that it is a Ginkgo patch to improve cognitive performance and mood, or it is an inactive placebo patch, in counterbalanced order. Cognitive performance and mood were assessed with a parametric Go/No-Go task (PGNG), a modification of California Verbal Learning Test, and Profile of Mood Scales (POMS). Subjects rated their expectations about Ginkgo's effects before patch application as well as their subjective assessment of its effects after the tests. An electrocardiogram and skin conductance levels (SCLs) were recorded and root mean square of successive differences (RMSSD), high-frequency power (HF), and the area under the curve of the SCL (AUC) were analyzed as psychophysiological stress markers.

Results: Expectations did not differ between adolescents and parents and were correlated concerning reaction times only. Overall, expectations did not influence placebo effects. There was only one significant placebo effect on the percentage of correct inhibited trials in one level of the PGNG in adolescents, but not in parents. RMSSD and HF significantly increased, and AUC decreased from pre- to post-patch application in adolescents, but not in parents.

Conclusion: With this experimental paradigm, we could not induce relevant placebo effects in adolescents and parents. This could be due to aspects of the study design such as application form and substance, and to the fact that healthy subjects were studied. Nevertheless, we could show that adolescents are more sensitive to psychophysiological reactions related to interventions, which could be part of the underlying mechanisms of placebo effects in adolescents.

Keywords: placebo effect, expectancy, cognitive performance, mood, heart rate variability, skin conductance

### INTRODUCTION

The term "placebo effect" can be described as the symptom-improving effect of a drug without an active agent, for example in the context of placebo-controlled, randomized clinical trials (RCTs). A placebo response is defined as the effectiveness of a placebo on symptoms in the context of RCTs, whereas the placebo effect is the part of a symptom change that can be directly attributed to placebo mechanisms such as expectations or learning mechanisms after eliminating external unspecific factors and statistical artifacts (1, 2). The placebo response is well documented and robust effects have been replicated especially in placebo analgesia (1). To date many aspects concerning the placebo response and effect have been discovered, for example mechanisms, mediators and moderators (1, 3). In a recent review about factors predicting placebo responses, it was concluded that placebo responses mainly appear to be moderated by expectations of how the symptom might change after treatment, or expectations of how symptom repetition can be coped with (4). A handful of moderators (circumstances under which placebo effects occur) have been discussed, among them age, sex, and personality traits (4, 5).

Beyond the numerous findings in the context of pain reduction, the question arises whether there are also placebo effects on mood, emotional states, and cognitive performance. Concerning placebo analgesia, it has been analyzed whether a reduction of negative emotions mediates pain reduction, rather than the placebo effect reducing pain directly (6, 7). The discussion is supported by findings of a reduction of electrophysiological stress markers such as heart rate variability (HRV) and of subjective stress by an experimental placebo intervention on heat pain (8). There was a decrease in the HRV low-frequency (LF)/high-frequency (HF) ratio after placebo administration but not in the control group, which was interpreted as a decrease in sympathetic activation indicating lower stress levels. In a regression analysis, subjective stress was the only significant predictor of the placebo effect on pain reduction. Subjective stress itself was only significantly predicted by LF/HF ratio decrease and subjective mood. Another study of this group showed that placebo administration could decrease anticipatory stress, which was correlated with placebo analgesia (9). These findings support the hypothesis that placebo effects could alter stress levels and negative emotions, which are, in turn, able to mediate the effects of pain reduction. It needs to be further studied whether this also applies to situations outside the context of pain reduction.

In experimental studies, placebo effects on mood and emotions were previously investigated in the context of pain, but only rarely with regard to depression, a negatively altered pathological state of mood and emotionality. Factors influencing the placebo effect on depression have been investigated through meta- and re-analyses of RCTs [see for example (3, 10, 11)]. There is evidence that neurobiological mechanisms produce placebo effects on mood and behavior, such as an opioid and dopamine modulation of the hypothalamic-pituitary-adrenal axis (12). Considering these results for mood improvement after placebo intake in the clinical context of depression, the question arises, whether and under which circumstances these kinds of effects also appear in a healthy population, and whether they can be experimentally induced. For example, some recent experimental studies measured mood with the Profile of Mood State Questionnaire (POMS) (13) or other affective state scores and examined the effects of placebo interventions in healthy populations (14–18).

Contrary to placebo effects in the context of pain, relatively little is known about placebo effects on cognitive performance. These effects are often examined in the context of substance (ab)use, with users hoping to benefit from the positive effects, for example on aspects of cognitive performance like memory or concentration. Beyond the physiological effects of a substance, placebo effects seem to play an important role in affecting behavior and cognitive performance. With regard to cognitive effects, methylphenidate is an increasingly used substance for "cognitive enhancement," not only in clinical use (for example for attention-deficit/hyperactivity disorder, or ADHD) but also in non-medical use by healthy people (14). A recent study involving Swiss school students with an average age of 17.1 years found a lifetime prevalence of almost 55% in substance abuse for cognitive enhancement, and a 13.3% lifetime prevalence for the use of prescription or recreational drugs (19). However, the role of stimulants as "cognitive enhancers" has been questioned even as medication (20), as the positive change in symptoms after stimulant treatment of children and adolescents with ADHD seems to be partly related to placebo effects (21). A simple experiment showed that students who responded to a flyer advertising a training for cognitive enhancement performed significantly better in a cognitive task than those who responded to a flyer advertising the same study with the benefit of receiving credit points (22). Moreover, experimental findings in the context of everyday substances such as nicotine and caffeine are contradictory, with only some of them showing placebo effects on cognitive parameters (5, 15–17, 23).

The placebo effect in children and adolescents has recently been reviewed with the conclusion that only little data exists, and that a relatively low number of studies handled the placebo effect per se in children and adolescents (24). In general, placebo responses in clinical trials tend to be higher in children and adolescents (24). In two of the few experimental studies on the placebo effect in children with similar designs, it was possible to induce placebo effects in healthy children in a heat placebo analgesia design (25, 26). The latter describes their expectancy-induced placebo analgesia response as substantially higher than those typically found in adults, yet a control group was not used. In contrast, effects were not more distinct compared to an adult control group in the other study (25). This finding raises the question whether the placebo effect in children or adolescents might depend on their disease and developmental state (25, 27). Concerning the mechanisms of placebo effects in children and adolescents, higher learning capacities, associative learning, and learning capacities in general might play a more important role. Furthermore, other forms of learning like social learning or imitation might be more important in children and adolescents with an increased influence from peer groups and media (24). Social learning of placebo effects through observation of a beneficial and successful analgesic treatment was shown in healthy women, and this treatment was as effective as a conditioning procedure (28). Whether social learning of placebo effects works in children and adolescents has not yet been investigated. However, children's or adolescents' own expectations might play a subordinate role in producing the placebo effect (24, 29). This assumption is in line with the "placebo by proxy" effect (30), a placebo effect on patients' environment eventually contributing in turn to symptom improvement in the patient. 
The research on children's and adolescents' placebo effects has consequently begun to arouse interest and should be further investigated with regard to the underlying mechanisms and the dependency on age, developmental state, diseases, expectations, and moderating traits' influences.

As outlined above, many aspects of the placebo effect, especially outside the pain context, are as yet unknown and would be worth investigating, preferably in an experimental study. Thus, the presented study has three goals: (1) The primary objective is to compare healthy adolescents with their parents regarding the experimentally induced placebo effect on mood and cognitive performance, measured via psychological questionnaires, reaction, and memory tests. It is hypothesized that placebo effects can be induced by an ineffective alleged Ginkgo transdermal patch, and that this effect is greater in adolescents than in adults. (2) Parents' expectations about Ginkgo effects influence their children's expectations and placebo effects and they, therefore, are correlated. (3) Finally, we will exploratively investigate whether this placebo application can decrease stress levels measured as psychophysiological responses such as HRV and skin conductance levels (SCL). We will also analyze whether they differ between adolescents and parents.

We therefore performed a study with two experimental sessions following a within-subjects design to induce placebo effects on cognitive performance and mood in parent–child dyads. Effects were induced with the help of an inactive transdermal patch accompanied by the information that this patch is either a Ginkgo patch which improves mood and cognitive performance, or a non-effective placebo patch. The context of Ginkgo was chosen because it is assumed that expectations about its effectiveness exist in the general population, as Ginkgo is advertised and sold as having proven positive effects on memory (31). To the authors' knowledge, a comparable experimental design with adolescents as subjects has never been used before, especially not in comparison to their parents.

#### MATERIALS AND METHODS

#### Sample

The subjects were recruited by advertisements at the medical university campus and public places in the city of Tübingen and through mail distribution lists. The advertisement for the study used the pretext of testing the impact of expectancy on the effects of a new Ginkgo preparation, and an idea of the procedure was given. Before being invited, applicants took part in a telephone interview in which their suitability was checked by ruling out acute or chronic somatic and psychiatric diseases and any use of mood- or reaction-altering drugs. Applicants who were pregnant or breastfeeding were also excluded. Only one child–parent pair was rejected for not fulfilling the criteria, and two further suitable pairs declined further participation after the interview for personal reasons.

All adolescents and parents were included after written informed consent only. This study was approved by the Ethical Review Board of the University of Tübingen (project No. 295/2013BO1) and was conducted in accordance with the Declaration of Helsinki.

#### Procedure

The experiment followed a within-subjects design and child–parent pairs were invited to two sessions which took place at the same time of the day with an interval of at least 3 days. All experiments were conducted by the same male investigator (DW) who wore neutral clothing in a research lab. At the beginning of the first session the subjects were handed a written document informing them about the study's procedure, length, risks, voluntariness, data protection, monetary compensation, and the fact that not all details of the study are revealed to the participants. We therefore followed the concept of "authorized deception" (32). The subjects had to sign a consent form and parents additionally had to sign for their children. There were no refusals.

The general procedure explained in the following sections was identical for both sessions. At the beginning of each session the subjects' physical condition was examined by measuring blood pressure and heart rate. Furthermore, the participants' general health was checked as well as if they abstained from alcohol and any drugs during the previous 24 h. Afterwards three electrodes were placed on the chest to record their electrocardiogram (ECG), and two electrodes were attached to their fingers for the assessment of the SCL (see below). A 5-min baseline measure was recorded, followed by the assessment of the POMS (13) baseline measure (pre) and a questionnaire about their expectancies about the possible effects of the Ginkgo preparation on reaction time, concentration, memory, and mood. Expectancies were assessed by the question "How effectively do you think that Ginkgo will affect your reaction time (concentration, memory, or mood, respectively)?" and rated by subjects on a visual analog scale (VAS) from "worsening" through "no change" to "improvement." The VAS was quantified from −50 to +50 mm for further analyses. After these preparations, the subjects received an envelope in which it stated whether they would get a Ginkgo patch, improving their mood and cognitive performance or a placebo patch, which would not improve their mood and cognitive performance. In fact, they always got a placebo patch which did not contain any active agent. Actually only the information (stimulus expectancy) was changed between the two sessions in a counterbalanced manner so that the Ginkgo information was given in the first or second session. Adolescents and parents were always in the same condition and thus received the same information. 
The experimenter was kept blind to the order of the conditions: the envelopes with the information were prepared in advance by another member of the lab, and subjects were asked not to reveal the content of the envelope at any time. The exact wording in the envelope was, according to the condition: "Today you are going to get a Ginkgo (placebo) patch. So, you are in the experimental (control) condition. Don't tell the experimenter about today's condition during the experiment." After the subjects received their information the experimenter fixed the approx. 5 × 7.5 cm transdermal patches on the participants' hips. From then on, parents and adolescents were separated in two rooms. They had to wait for approximately 15–20 min after patch application, then the POMS was filled out a second time (post) to evaluate mood changes. The cognitive tests began 25–30 min after the patch application.

The first cognitive test conducted was the California Verbal Learning Test (CVLT) (33). Subjects were informed that this is a word memory test. The instruction was read literally (translated from German): "Now I'm going to read a list of words to you. Please learn the words by heart and reproduce them afterwards. I'll read the list to you just once and the order in which you reproduce the words does not matter." As soon as the subject was ready, the 10 words were read at a rate of approximately 1 Hz. The subject was then asked to reproduce the words and every correct answer was noted (first recall). The subject did not receive any feedback regarding their accuracy, not even when asked. There was no time limit for reproducing the words. Afterwards, the parametric Go/No-Go task (PGNG) (34) was administered. The instruction was included in the program and every level of the task was explained step by step with examples following a test trial. The test took approximately 15–20 min and the time period was marked on the electrophysiological device. The PGNG ended approximately 45–50 min after patch application, and the second recall phase of the CVLT began. The subject was literally asked (translated from German): "Do you remember the learned words from the list? Please reproduce them. Again, the order does not matter." Every correct answer was noted (second recall). Immediately after finishing, the third phase of the CVLT began in which the subject had to recognize the 10 words from the list out of a sum of 30 words. The subject was asked (translated from German): "Which of the following words were included in the former list of recalled words? Answer with yes or no." The experimenter read the list and waited for the subjects' answer after each word. The answer was written down by the experimenter and again the subject did not get any feedback regarding his/her answers. 
If unsure whether a word had actually been in the list, the subject was advised to go with their gut feeling. All correctly recognized words were counted as "hits." This phase was the last to be registered on the electrophysiological device. Finally, the subjects completed a questionnaire concerning the effectiveness of the patch received on the same VAS as at the beginning for expectations (subjective outcomes). The electrodes and the device were removed. The whole procedure took approximately 1 h. After the second session the family received their payment for participation (20 Euros for the parent and cinema vouchers worth 20 Euros for each participating child) and was informed about the whole experiment, especially about the patches not containing any active agent in both sessions. It was explicitly pointed out that all the collected data could be deleted if desired, but nobody wanted their data to be deleted.

#### Measurement of Cognitive Performance, Mood, and Subjective Outcomes

To measure placebo effects on cognitive performance, a PGNG test (34) was used. The PGNG measures reaction time, inhibition, and executive functions. It contains three levels of ascending difficulty, in which single letters are shown rapidly in the middle of a screen. Mean reaction time over correct targets (RTT) and percentage of correct target trials (PCTT) in all three levels, and the percentage of correct inhibitory trials (PCIT) in levels 2 and 3 were analyzed as dependent variables for concentration and reaction times. To test placebo effects on memory, an adaptation of the CVLT (33) was used. The sum of max. 10 words learned by heart and immediately recalled (first recall) as well as the sum of recalled words with delay (second recall) and the correct recognized words (hits) from the list at the end of CVLT were analyzed as dependent variables for memory. To operationalize the hypothesized change of mood, the shortened version of the POMS (13) was used. It contains 19 items to rate current positive and negative emotions, such as joy, anger, depression, fatigue, and tension on a 7-point Likert scale. For further analyses, sums of the POMS positive scale ranging from 6 to 42 points, and the POMS negative scale ranging from 13 to 91 were calculated. Differences between the POMS scales before and after patch application were used as dependent variables (positive values indicate higher values of the scale after the patch application).
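The POMS scoring described above can be illustrated with a minimal sketch. It assumes 6 positive and 13 negative items, each rated from 1 to 7, which is what the stated score ranges (6 to 42 and 13 to 91) imply. The function names and the example ratings are hypothetical and are not taken from the study materials.

```python
def scale_sum(ratings, n_items, low=1, high=7):
    """Sum the item ratings of one POMS scale after basic range checks."""
    if len(ratings) != n_items:
        raise ValueError(f"expected {n_items} ratings, got {len(ratings)}")
    if any(r < low or r > high for r in ratings):
        raise ValueError(f"ratings must lie between {low} and {high}")
    return sum(ratings)


def change_score(pre, post):
    """Post minus pre: positive values mean a higher scale value after the patch."""
    return post - pre


# Hypothetical ratings on the 6-item positive scale (assumed item count).
pre_positive = scale_sum([4, 4, 5, 3, 4, 4], n_items=6)
post_positive = scale_sum([5, 4, 5, 4, 4, 5], n_items=6)
print(change_score(pre_positive, post_positive))
```

The same two functions would apply to the negative scale with `n_items=13`.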

To assess subjectively recognized effects of the patches, subjects were asked to fill out a questionnaire concerning the extent of the influence of the patch they received on VASs for reaction time, concentration, memory, and mood, at the end of each experimental session.

#### Electrophysiological Data

Electrophysiological data was collected in the form of interbeat intervals (IBIs) and SCLs using a 3991x-GPP BioLog recorder, firmware Version 1.2 (2012). A three channel ECG was set up on the participants' thoraxes on the level of second intercostal space left and right and below the left mammilla (see Procedure). Data was read out and saved by the 3991x-GPP DPS software, Version 1.2 (2012) immediately after each session. For the analysis of the HRV data, 6 subjects had to be excluded due to technical problems during recording or movement artifacts, resulting in 42 datasets (20 parents, 22 adolescents). The data handling of the HRV data was carried out with Kubios HRV, Version 2.2 using autoregression with a model order of 16 without factorization as spectrum estimation. Trend removal was applied by smoothing priors with lambda = 500. Artifact correction was used stepwise when needed. In 57.1% no artifact correction was used, in 5.9% very low, in 4.6% low, in 30.7% medium, and in 1.7% strong artifact correction was used. The parameters of interest concerning HRV were the root mean square of successive difference (RMSSD) and the logarithmically transformed HF power (0.15–0.4 Hz) in the autoregression spectrum (HF). These two parameters are known to represent vagal influence on HRV (35). In this study these parameters are supposed to reflect a decreased state of stress or arousal. Two 5-min time frames of measurement were chosen: 1) baseline after installation of the device at the beginning of the session, and 2) immediately after patch application while filling out personality questionnaires.
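For orientation, RMSSD follows directly from its name: it is the square root of the mean of the squared differences between successive interbeat intervals. The sketch below shows only this standard definition. It is not the Kubios implementation and applies none of the trend removal or artifact correction described above; the function name and the example values are illustrative.

```python
import math


def rmssd(ibis_ms):
    """Root mean square of successive differences of interbeat intervals (ms)."""
    if len(ibis_ms) < 2:
        raise ValueError("at least two interbeat intervals are required")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(ibis_ms, ibis_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical interbeat intervals in milliseconds.
print(round(rmssd([800, 810, 790, 800]), 2))
```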

In contrast to HRV, which is a surrogate for parasympathetic activity and reactivity, the SCL represents sympathetic activity and reactivity. SCL is considered to be a good indicator of the "inner tension" of subjects. Two electrodes connected to the BioLog device were positioned on the index and the ring fingers of the non-dominant hand to detect conductivity changes. The SCL signal was detected with a rate of 10 Hz and between 0.1 and 39.9 mMho. Due to the adequate data quality, no other preprocessing steps were necessary, and the mean of the signal (SCL-M) as well as the area under the curve (SCL-AUC) were calculated (36, 37).
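The two SCL parameters can be sketched as follows. The text does not state how the area under the curve was computed, so the trapezoidal rule relative to zero is an assumption here, as are the function names; only the sampling rate of 10 Hz is taken from the description above.

```python
def scl_mean(samples):
    """Arithmetic mean of the skin conductance samples (SCL-M)."""
    if not samples:
        raise ValueError("no samples given")
    return sum(samples) / len(samples)


def scl_auc(samples, sampling_rate_hz=10.0):
    """Area under the SCL curve via the trapezoidal rule (assumed method).

    The result is in signal units multiplied by seconds.
    """
    dt = 1.0 / sampling_rate_hz  # time between two successive samples
    return sum((a + b) / 2.0 * dt for a, b in zip(samples, samples[1:]))


# Hypothetical one-second excerpt sampled at 10 Hz (11 samples).
excerpt = [2.0, 2.1, 2.2, 2.2, 2.1, 2.0, 2.0, 2.1, 2.2, 2.3, 2.3]
print(scl_mean(excerpt), scl_auc(excerpt))
```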

#### Statistical Analyses

All statistical analyses were performed with IBM SPSS Version 22 (IBM Corp., Armonk, NY). Significance level was set to α = 0.05. Sample size was calculated for the main analyses, the 2 × 2 repeated-measures ANOVA (condition × age group) for which a total sample size of n = 34 was sufficient to detect a medium effect size of f = 0.25 (with r = 0.3, α = 0.05, power = 0.80), as calculated with G\*Power Version 3.1.9.2 (38). Normal distribution of variables was assessed with Shapiro–Wilk tests and visual inspection of normal quantile–quantile plots. As some expectations were not normally distributed, Mann–Whitney U tests and Spearman correlations were used to analyze differences and associations between adolescents' and parents' expectations at first appointment when they were not influenced by any condition assignment, and between parents' expectations and adolescents' placebo effects. Placebo effects were calculated as the difference between the Ginkgo and the placebo condition for each outcome. In order to rule out possible sequence effects of the information given (Ginkgo vs. placebo) at the first and second appointment, all presented repeated-measures ANOVAs were rerun with sequence order as an additional factor. There were no main or interaction effects for any of the analyzed dependent variables (results not reported). To investigate whether placebo effects differ between adolescents and parents, 2 × 2 repeated-measures ANOVAs with condition (told placebo vs. told Ginkgo) as within-subjects factor and age group (adolescents vs. parents) as between-subjects factor were performed. As post hoc tests, differences between conditions (told placebo vs. told Ginkgo) were tested with paired t-tests for adolescents and parents separately. In order to control for multiple testing p-values were adjusted according to Hochberg (39).
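Two of the steps above lend themselves to a short illustration: the placebo effect as a difference score and the Hochberg step-up adjustment of p-values. The sketch below is not the SPSS procedure used in the study. It follows the common formulation of Hochberg-adjusted p-values (the largest p-value is kept, the second largest is doubled, and so on, with a running minimum), and all function names and example values are hypothetical.

```python
def placebo_effect(ginkgo_value, placebo_value):
    """Outcome in the told-Ginkgo condition minus outcome in the told-placebo condition."""
    return ginkgo_value - placebo_value


def hochberg_adjust(p_values):
    """Hochberg step-up adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, idx in enumerate(order, start=1):
        # The k-th largest p-value is multiplied by k, capped by the running minimum.
        running_min = min(running_min, rank * p_values[idx])
        adjusted[idx] = running_min
    return adjusted


# Hypothetical unadjusted p-values from three paired t-tests.
print(hochberg_adjust([0.01, 0.04, 0.03]))
```

With a single very small p-value among m tests, the adjusted value is simply m times the unadjusted one, which is why an unadjusted p just below 0.05 can end up far from significance after adjustment.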

With regard to psychophysiology, separate 2 × 2 × 2 repeated-measures ANOVAs were performed with condition (told Ginkgo vs. told placebo) and time point (baseline vs. post-patch) as within-subjects factors and age group (adults vs. adolescents) as between-subjects factor for each of the dependent variables RMSSD, HF, SCL-M, and SCL-AUC.

### RESULTS

### Sample Description

Twenty-six healthy adolescents between 12 and 17 years (13.8 ± 1.6 years; 12 boys, 14 girls) each with a parent (45.5 ± 4.2 years; 5 fathers, 17 mothers of which 4 mothers participated with 2 children) participated in the experiment, leading to a total of 48 subjects (because four mothers each took part with two children). Except for one girl, all the adolescents were in a German "Gymnasium," which is the highest secondary school level. The parents all had at least an education or had graduated. Except for one mother who had already tried homoeopathic Ginkgo sweets, none of the participants reported any experience with Ginkgo products.

### Expectations

At first appointment, expectations of Ginkgo effects did not differ between adolescents and parents in general and were significantly correlated between adolescents and their own parent concerning effects on reaction times only (Table 1). Furthermore, there was only one significant correlation between the expectation of the effects on mood and the placebo effect on negative mood in parents (r = −0.523, p = 0.013, adjusted p = 0.156), whereas there was no correlation between expectations and placebo effects in adolescents. Regarding the influence of parents' expectations on adolescents' placebo effects, there was one correlation between parents' expectation of Ginkgo effects on reaction and adolescents' placebo effect on reaction time in level 3 (r = 0.395, p = 0.046, adjusted p = 0.966).


TABLE 1 | Expectations of adolescents and parents concerning the effects of Ginkgo on outcome measures: differences between adolescents and parents in general (Mann–Whitney U tests), and correlation between adolescents and own parents (Spearman correlations) (reported as median [1st–3rd quartile]).

#### Placebo Effects on Cognitive Performance: Reaction Times, Correct Trials, and Memory

The 2 × 2 repeated-measures ANOVAs with condition (told Ginkgo vs. told placebo) as within-subjects factor and age group (adults vs. adolescents) as between-subjects factor showed a significant main effect of condition for PCTT level 3 as dependent variable only (F(1,46) = 8.91, p = 0.005), but without an interaction of condition × age group (F(1,46) = 3.48, p = 0.069). According to post hoc tests, adolescents showed a significantly lower PCTT in level 3 (worse cognitive performance) in the Ginkgo compared to the placebo condition, whereas no other comparison was significant in either adolescents or parents (Table 2). The only significant placebo effect was found for PCIT level 2: There was a significant interaction of condition × age group (F(1,46) = 9.56, p = 0.003) with a higher PCIT in the Ginkgo compared to the placebo condition in adolescents but with nearly no change in parents (Table 2). Additionally, there were significant effects of the between-subjects factor age group (adults vs. adolescents) in mean reaction times: RTT level 1 (F(1,46) = 15.48, p < 0.001), RTT level 2 (F(1,45) = 35.47, p < 0.001), and RTT level 3 (F(1,46) = 18.89, p < 0.001) indicating faster reaction times in all three levels for adolescents.

There was no significant placebo effect on memory in any of the three dependent variables of the CVLT, and no difference between age groups or significant interaction (Table 2). With regard to the condition as a main effect, the statistics for first recall were F(1,46) = 0.04, p = 0.842, for second recall F(1,46) = 0.06, p = 0.806, and for hits F(1,46) = 0.02, p = 0.888.

#### Placebo Effects on Mood and Subjective Outcomes

Changes in the POMS scales from pre- to post-patch application for both conditions are reported in Table 3. Note that a negative difference indicates a decrease and a positive difference indicates an increase from pre- to post-patch application in the respective mood scale. In both dependent variables there was no significant effect in 2 × 2 ANOVAs, neither for the within-subjects factor nor for the between-subjects factor or the interaction. With regard to condition as main effect the statistics for positive emotions were F(1,46) = 1.02, p = 0.317, for negative emotions they were F(1,46) = 1.62, p = 0.209. However, adolescents reported significantly better mood in response to the Ginkgo compared to the placebo patch at least according to the unadjusted p value.

ANOVAs with the subjective assessments of the effects of the patches on reaction time, concentration, memory, and mood as dependent variables revealed a significant main effect of the condition for mood only (F(1,44) = 7.53, p = 0.009), with perceived better mood after Ginkgo compared to the placebo condition independent of age group. Post hoc paired t-tests suggest that this effect may be driven by adolescents' assessments only, although these analyses do not withstand p-value adjustment.

#### Psychophysiological Data

For RMSSD as a dependent variable, the 2 × 2 × 2 ANOVA showed a significant main effect of time (pre- to post-patch application, F(1,42) = 5.67, p = 0.022) with an increase in RMSSD, and a significant interaction of time × age group (F(1,42) = 14.05, p = 0.001) with an increase in both conditions in adolescents, but with nearly no change in parents (Figure 1). The main effect for condition and interactions of condition × age, condition × time, and condition × time × age were not significant (all p values > 0.05).

TABLE 2 | Cognitive performance (PGNG, CVLT) in the told placebo and told Ginkgo conditions in adolescents and parents (mean ± SD).

RTT, reaction time to target; PCTT, percentage correct target trials; PCIT, percentage correct inhibited trials; L, level; CVLT, California Verbal Learning Test; Paired t-tests, adjusted p values according to Hochberg (39).

TABLE 3 | Subjective assessments of the effects of the patches in the told placebo and told Ginkgo conditions in adolescents and parents (mean ± SD).

POMS, Profile of Mood Scale; Paired t-tests, adjusted p values according to Hochberg (39).

FIGURE 1 | Root mean square of successive differences (RMSSD) (ms) in adolescents and parents pre- and post-patch application in both conditions (M ± SE).

For HF, there was a significant main effect of time (F(1,42) = 8.58, p = 0.005), an interaction of time × age group (F(1,42) = 7.04, p = 0.011), and an interaction effect of condition × time × age group (F(1,42) = 4.09, p = 0.049). Figure 2 shows that HF increases from pre- to post-patch application in the Ginkgo condition in both adolescents and parents, but not in parents in the placebo condition. The main effect of the condition, and the interaction effects of condition × age group, and condition × time were not significant (all p values > 0.05).

For SCL-M there was a significant main effect of time (F(1,42) = 17.21, p < 0.001), and a significant time × age group interaction (F(1,42) = 4.65, p = 0.037). Furthermore, there were significant interaction effects of time × age group (F(1,42) = 8.61, p = 0.005) and condition × time × age group (F(1,42) = 4.44, p = 0.041) for SCL-AUC, with a decrease from pre- to post-patch application in both conditions in adolescents, but with nearly no change in the placebo and an increase in the Ginkgo condition in parents (Figure 3).

#### DISCUSSION

The aim of the present study was to experimentally induce placebo effects on cognitive performance and mood in healthy parent–child dyads. In a within-subjects design, placebo effects were to be induced through the application of an inert patch on the hips of participants, accompanied either by the information that it was a Ginkgo patch which improves cognitive performance or by the information that the patch was a placebo only. In both conditions cognitive performance was measured by a PGNG test (34) and the CVLT (33), while mood was assessed with the POMS (13). Additionally, HRV and SCL were assessed as physiological stress markers.

adolescents and parents pre- and post-patch application in conditions (M ± SE).

Expectations about the effects of a Ginkgo patch on concentration, reaction times, memory, and mood ranged between neutral and high on a VAS from −50 to +50. They did not differ between adolescents and parents, and correlated between adolescents and parents only for reaction times. Additionally, parents' expectations and adolescents' placebo effects were associated with regard to reaction times in one of three levels, but this correlation did not withstand p value adjustment for multiple testing. One could speculate that adolescents' expectations mediate the effect of parents' expectations on adolescents' placebo effects. Furthermore, there was only one significant correlation between expectations and placebo effects in parents, which also did not withstand p value adjustment. Therefore, explicit expectations prior to the intervention did not affect the results.

Concerning the eight parameters of the PGNG, the only significant main effect of the within-subjects factor patch condition (information) was found for the percentage of correct target trials (PCTT) in level 3, paradoxically with a lower percentage in the Ginkgo condition than in the placebo condition, indicating worse cognitive performance. The main effect of the between-subjects factor age seems to be more consistent, with significantly faster reaction times (RTT) in all three levels for adolescents. The significant interaction between patch condition and age group for PCIT in level 2 is noteworthy, since the difference in mean PCIT between the Ginkgo and the placebo condition is larger in adolescents than in adults. This effect supports the hypothesis that adolescents have better cognitive inhibition performance in the Ginkgo compared to the placebo condition, and is therefore the only placebo effect on cognitive performance found in this study. Interpreting the data further, it seems likely that adolescents in general tend to react faster and more accurately, but that their ability to inhibit reactions is inferior to that of adults. The age effect on reaction time is not surprising, as several studies report a decrease of reaction time with increasing age at least until young adulthood (40). Better inhibitory skills in adults in comparison to adolescents are a common finding, which can also be interpreted in line with differences in functional-neural maturation (40, 41). Furthermore, reaction times, correct target trials, and inhibited trials might be interconnected to a certain degree. For example, subjects who take more time to respond to targets might respond more accurately to targets and vice versa. Our data, however, showed that there could be significant changes in one measure without significant changes in the other. Results of the CVLT as a dependent variable showed no significant effects at all, neither for patch condition nor for age.

In line with the results for placebo effects on cognitive performance, no significant main effects of the factors "patch condition" or "age" could be observed for mood, as measured by the POMS pre–post-patch application differences in the ANOVAs. However, adolescents reported significantly better mood in response to the Ginkgo compared to the placebo patch. They also subjectively reported that the Ginkgo patch influenced their mood, at least according to the unadjusted p values. Furthermore, parents thought that the Ginkgo patch influenced their concentration when compared to the placebo patch.

The two examined parameters of HRV represent vagal influence on the heart function. Thus, a rise of both RMSSD and HF from baseline to post-patch application can be interpreted as a decrease of stress. For both parameters, there was a main effect of time but no significant main effect of the factor patch condition. Additionally, an interaction shows a stronger increase in adolescents for both. For HF, a three-way interaction could be found, indicating that adolescents show an increase in both conditions, whereas parents show an increase with the told Ginkgo condition, but a decrease in the told placebo condition. In contrast, SCL parameters indicate sympathetic activation and mirrored the effects on RMSSD and HF. Sympathetic activation decreased in adolescents in both conditions, with a stronger decrease in the Ginkgo condition, but increased in parents in the Ginkgo condition whereas there was no change in the placebo condition. Thus, adolescents responded in the hypothesized way and showed an increase in parasympathetic and a decrease in sympathetic activation in response to a putative active intervention.
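To make the RMSSD measure concrete, the following minimal sketch computes the root mean square of successive differences from a series of RR intervals (the time between consecutive heartbeats, in ms). It assumes the standard definition of RMSSD; the function name and the interval values are hypothetical and do not reproduce the study's recordings or processing pipeline.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("at least two RR intervals are required")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    # Hypothetical RR series (ms) before and after an intervention.
    pre = [800, 810, 790, 800]
    post = [800, 830, 780, 820]
    print(f"pre  RMSSD = {rmssd(pre):.1f} ms")
    print(f"post RMSSD = {rmssd(post):.1f} ms")
```

In this illustration the post series fluctuates more from beat to beat and therefore yields a higher RMSSD, which corresponds to the kind of pre-to-post increase that is interpreted above as stronger vagal influence.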

Taken together, we found a significant placebo effect in only 1 (PCIT level 2) out of 11 parameters for cognitive performance and in 1 (subjective mood) out of 6 parameters for mood and subjective assessment in adolescents. Additionally, there is one paradoxical effect of patch condition on PCTT level 3, which is hard to interpret. However, the psychophysiological data show that there is a significant reaction to the intervention itself, indicated by a rise of RMSSD and HF and a decrease in SCL, particularly in adolescents, who seem to be more sensitive to psychophysiological changes. The physiological reaction observed after patch application could be a basis for placebo effects on cognitive performance and mood, which may not have shown up due to possible theoretical reasons as well as limitations of the study. These will be discussed in the following sections.

As mentioned in the Introduction, the placebo effect in the context of analgesia is a well-replicated phenomenon (1). Even in adolescents, placebo effects have been experimentally induced in the context of analgesia (25, 26). In the context of cognitive performance, placebo effects may not be as easy to induce experimentally as in analgesia, or possibly only under special circumstances (25, 26). The lack of a placebo effect in our study is in line with other findings where placebo effects on cognition could not be induced in a paradigm with methylphenidate that also used subjects who had no experience with this substance. However, a significant improvement of subjective mood and arousal through a placebo effect was reported (14). Other studies could not find placebo effects on cognition in coffee users when varying the information given about decaffeinated coffee, although they found that wrong information about real coffee worsened cognitive performance. Furthermore, there were no clear findings on mood improvement (15). Further studies did not find any placebo or nocebo effects on cognitive performance caused by altering the information when drinking real coffee (17). Taking our results together with those of the comparable studies reported above, it can be argued that some crucial aspects must be considered in order to approach the essence of a possible placebo effect on cognition and mood. First of all, it should be examined whether participants are users or non-users of the substance, as this seems to have an effect. Substance users have an idea of how the substance's effect should feel, whereas non-users lack this experience and have to base their expectations on theory, without knowing what the desired effect feels like.

The well-replicated and easily inducible placebo effects in pain reduction might be due to a clear notion of what the desired effect should be, namely a pain reduction, which every individual has experienced, often in the context of a painkiller. In line with this assumption, an experimental study showed a positive relationship between experienced pain relief during a preceding conditioning session and the later actual placebo effect in children, but not in adults (25). Adults seem to have a more robust history of pain-reducing experiences than children. In contrast to analgesia, which is a decrease of a specific symptom, the improvement of cognitive performance and mood could be a more unspecific and rare experience which is difficult to enumerate by healthy adults and adolescents. This could be a reason why placebo effects were harder to induce in these domains. The same argument concerning the amount and specificity of experiences applies when comparing placebo effects in healthy subjects versus patients. The significance of several factors concerning the placebo effect in children and adolescents has previously been emphasized, such as the duration of disease, symptom severity, and comorbidities (27). In adults, adolescents, and children suffering from diseases, it might be easier to induce placebo effects, because the expected effect is always towards a well-defined state of health or normality. In healthy people, however, the effect obviously must be some kind of "extra improvement." This readily explains why large placebo effects on cognitive performance can be found in clinical studies, for example in ADHD patients (21). The same has been shown for placebo effects on mood: there are many well-replicated clinical findings of mood-improving effects in the treatment of depression (3, 10), a pathological state of emotionality with a clear notion of a comparable healthy state.
On the other hand, in experimental studies with healthy subjects such as ours, and similarly to other studies, placebo effects on cognitive performance and mood cannot be induced or only under certain circumstances. Concerning placebo effects on cognitive performance, recent studies have focused on the role of expectancies about the effectiveness of the intervention (post hoc subjective outcome). In some cases, it is these expectancies, rather than the mere information of receiving an intervention, that affect objective cognitive performance (42–44). High prior expectations can increase post hoc expectancies about the intervention, yet they do not necessarily affect objective cognitive outcomes (45).

#### Limitations and Future Research

Some limitations of our study should be mentioned and discussed. First of all, it is not clear whether the subjects really understood or internalized the effect of the different patches, despite forming mostly positive expectations of the Ginkgo effect. Although subjects were told about its positive effects, it is possible that the effects have to be formulated in a more explicit and concrete way, e.g. improving reaction time, improving the capability to memorize words, or feeling happier, rather than in abstract terms such as improving concentration, memory, and mood. Maybe the subjects could not relate the tasks to the promised improvements. This assumption is supported by the lack of correlations between prior expectations and objective parameters and by the absence of differences between conditions in post hoc assessed subjective outcomes. Also, we did not explicitly ask subjects to rate their expectations about the effects of the placebo patch.

Furthermore, the use of a transdermal patch for substance application is not common in our tested population. The time for an effect to take place was announced as 20–30 min in the experiment, which might not be enough to mentally process the presence of the patch and consequently experience placebo effects. Although the subjects had positive expectancies about the substance itself, they might have been doubtful about an effect in such a short time period. To control for the effects of the application of a patch as an intervention, further studies should include a control group without any intervention or compare a patch application to other kinds of interventions such as pills or ointments. Additionally, due to our small sample size we did not explore the effects of different developmental phases or gender in children and adolescents, or gender interactions with their accompanying parent. Finally, the PGNG and the CVLT might not be sensitive enough to detect differences between our conditions, as both might have been too easy, which resulted in differences that were too small and in a ceiling effect.

Due to the relative novelty of the paradigm, some aspects have to be optimized for future studies on placebo effects on cognitive performance and mood. These optimizations should include a correct and convincing induction of expectancies, an effective, salient application of the placebo substance, and an adequate time period for the placebo effects to develop. The placebo could be more successful using common application forms, like pills, rather than transdermal patches. Moreover, from a theoretical point of view, a background of experience with the (placebo) substance, or at least a concrete notion of how an effect should feel, could be necessary for the effective induction of placebo effects. Consequently, in experimental trials, a placebo presented as a familiar substance could be more effective, especially in subjects with a lot of experience with the substance in everyday life. Similarly, placebo effects on cognitive performance and mood might be easier to induce in subjects with deficits in these domains because, in contrast to healthy subjects, an improvement towards a more concrete state can be expected. Thus, experimental trials on placebo effects on cognitive performance and mood in children or adolescents could also be conducted with subjects suffering from depression or attentional disorders. Aside from children, older people who start to develop cognitive deficits in the form of mild cognitive impairment could be a good target group for experimentally inducing placebo effects on cognitive performance. Our limitations show that several other points should be investigated in future studies, such as different phases of cognitive development, gender differences, effects on varying aspects of cognitive performance, and a well-reasoned choice of cognitive tests.

### CONCLUSION

In summary, we could not induce significant placebo effects on cognitive performance and mood in adolescents and their parents. This could particularly be due to some aspects of the study design, such as the unusual form of application (transdermal patch) and substance used (Ginkgo), coupled with the possibility that the placebo could not work in healthy subjects without cognitive impairment or mood disturbances. However, we could show that adolescents are more sensitive than adults to psychophysiological reactions to interventions, whether they work or not, and this could be part of the underlying mechanism of placebo effects.

### AUTHOR'S NOTE

This study was part of DW's dissertation (Watolla D. Placebo effects on cognitive performance and mood in children and parents—an experimental approach [unpublished dissertation]. [Tübingen, Germany]: Eberhard Karls University; 2017. 75 p.).

#### DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

#### ETHICS STATEMENT

The studies involving human participants were reviewed and approved by the Ethical Review Board of the University of Tübingen. Written informed consent to participate in this study was provided by the participants and the participants' legal guardian/next of kin.

#### AUTHOR CONTRIBUTIONS

PE, KW, and DW contributed conception and design of the study. DW performed the study. DW and KW organized the database, performed the statistical analysis, and wrote the first draft of the manuscript. JS-K and HS contributed design features concerning the inclusion of adolescents. NM and SG contributed design features and analyses of psychophysiological data. NM, SG, and MG wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.

### FUNDING

This work was supported by grants of the Fortüne-Program of the University of Tübingen (fortüne 2179-0-0 and 2266-0-0) for KW. We acknowledge support by Deutsche Forschungsgemeinschaft and Open Access Publishing Fund of the University of Tübingen. JS-K was supported by a grant from the Faculty of Medicine, Tübingen (TÜFF no. 2399-0-0).

#### ACKNOWLEDGMENTS

We thank all adolescents and their parents who participated in this study.

#### REFERENCES

test–retest reliability of the Parametric Go/No-Go Test. J Clin Exp Neuropsychol (2007) 29:842–53. doi: 10.1080/13803390601147611


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor is currently co-organizing a Research Topic with one of the authors PE and KW, and confirms the absence of any other collaboration.

Copyright © 2020 Watolla, Mazurak, Gruss, Gulewitsch, Schwille-Kiuntke, Sauer, Enck and Weimer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
