State of the Art of Interpersonal Physiology in Psychotherapy: A Systematic Review

Introduction: The fast expanding field of Interpersonal Physiology (IP) focuses on the study of co-ordination or synchronization dynamics between the physiological activities of two, or more, individuals. IP has been associated with various relational features (e.g., empathy, attachment security, rapport, closeness…) that overlap with desirable characteristics of clinical relationships, suggesting that the relevant studies might provide objective, economical, and theory-free techniques to investigate the clinical process. The goal of the present work is to systematically retrieve and review the literature on IP in the field of psychotherapy and psychological intervention, in order to consolidate the knowledge of this research domain, highlight its critical issues, and delineate possible developments. Method: Following the guidelines by Okoli and Schabram (2010), a systematic literature search was performed in Scopus, Web of Science, PsycINFO, and PubMed databases by means of multiple keyword combinations; the results were integrated with references to the retrieved articles' bibliography as well as to other published reviews on IP. Results: All the retrieved documents reported clinical interactions that are characterized, at least partially, by IP phenomena. They appear to use fragmented and sometimes ambiguous terminology and show a lack of both specific theory-informed hypotheses and sound analytical procedures. Conclusion: Although the psychological nature of IP and its role in the clinical relationship are still mostly unknown, the potential value of a physiology-based measure of implicit exchanges in psychotherapy drives an acceleration in this research field. On the basis of the highlighted critical issues, possible future directions for clinical IP researchers are discussed.


INTRODUCTION
There is growing interest in the study of co-regulation of nonverbal behavior and physiological activations in interpersonal contexts. In the last few years various reviews on the topic have been published (Riess, 2011; Delaherche et al., 2012; Butler and Randall, 2013; Koole and Tschacher, 2016; Palumbo et al., 2016; Finset and Ørnes, 2017), vouching for the existence of a general phenomenon of dynamic regulation between interacting people, which is expressed through a wide range of modalities (body movement, facial expressions, eye gaze direction, face blushing, pupil dilation, skin conductance activity, heart rate variability, breathing rate, paraverbal behaviors, language style, and more), and most often described in loosely defined terms of synchrony. Among the many modalities through which these coordination phenomena are expressed, the physiological ones, under the umbrella term of interpersonal physiology (IP), are of special interest in the clinical field, as: (a) they are not directly observable by the clinician or a trained observer and therefore promise an additional layer of information on the clinical process; (b) controlling them voluntarily requires significant effort, and generally is much more difficult in comparison to behavioral forms of synchronization; (c) in most cases, they are outside of the individuals' awareness.
These characteristics highlight the clear interest that such phenomena hold for the field of psychotherapy research, the focus of which is slowly shifting from the efficacy of approach-specific interventions to the study of relational variables, such as empathy, alliance, mutual affective regulation, and, more generally, common factors and micro-processes (e.g., Messer and Wampold, 2002; Orlinsky et al., 2004). Indeed, while research on psychotherapy has widely demonstrated the average efficacy and effectiveness of psychological treatments, the specific factors that drive and inhibit individual change are still not well understood (Lambert, 2013). Whilst research on IP is still in an early phase, it could potentially lead to the development of a tool able to detect the moment-to-moment implicit adjustments that occur between patient and therapist. Such an achievement would represent a major paradigm advancement for research on the clinical process, offering a new set of objective and (in most cases) automatic measurements.
The goal of the present work is to systematically retrieve and review the literature on IP in the field of psychotherapy and psychological intervention, in order to consolidate the knowledge of this research domain, highlight its critical issues, and delineate possible developments.

METHOD
Following the guidelines by Okoli and Schabram (2010), a systematic literature review was performed with the purpose of identifying the literature on IP phenomena in a clinical context, analyzing its characteristics, and highlighting both its strengths and its weaknesses. Given the great variety of terms employed to describe IP, three sets of keywords were chosen to identify the pertinent papers, based on the general reviews: a first set assessing the subject of synchronization (synchron*, concordance, attunement, linkage, interpersonal physiol*, interpersonal autonom*, mimic*, mirror*, entrainment), a second set specifying the physiological nature (physiol*, psychophysiol*, neurophysiol*, autonom*, sympath*, parasympath*, heart, skin conductance, galvanic skin, gsr, hrv, eeg, ecg, rsa, electroenceph*, electromyo*, electrocardio*, pupil*, blush*), and a third set specifying the clinical context (psychotherap*, rapport, alliance, clinical relation*, therapeutic relation*, alliance, client, counsel*). A wildcard symbol (*) was employed to generalize those keywords typically characterized by varying suffixes (e.g., one paper might exclusively employ one of the forms "physiology," "physiologic," or "physiological"; the wildcard form "physiol*" would match them all), but not for acronyms and words that are most commonly employed in a single form (e.g., "alliance"). The search was performed by fixing a logical conjunction (AND) relationship between the three sets, meaning that each result was required to contain at least one member of each set. Search areas included the "title," "abstract," and "keywords" fields through the following databases: Scopus, Web of Science (core collection), PsycINFO (EBSCO), PubMed. The search results were individually inspected, and only original research articles and case reports, written in English and published in international peer-reviewed journals, were considered.
Furthermore, to be considered a match, the studies had to explicitly focus on simultaneous physiological activation of persons involved in a therapeutic or otherwise clinical interaction.
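The logic of the keyword search can be sketched in a few lines of Python. This is only an illustration of the AND-between-sets, OR-within-set structure described above: the helper names (`or_group`, `build_query`) are hypothetical, and the exact query syntax accepted by each database (field tags, quoting, wildcards inside phrases) is an assumption that would need to be adapted per database.

```python
# Keyword sets as listed in the Method section (wildcards written as "*").
SYNCHRONIZATION = [
    "synchron*", "concordance", "attunement", "linkage",
    "interpersonal physiol*", "interpersonal autonom*",
    "mimic*", "mirror*", "entrainment",
]
PHYSIOLOGY = [
    "physiol*", "psychophysiol*", "neurophysiol*", "autonom*", "sympath*",
    "parasympath*", "heart", "skin conductance", "galvanic skin", "gsr",
    "hrv", "eeg", "ecg", "rsa", "electroenceph*", "electromyo*",
    "electrocardio*", "pupil*", "blush*",
]
CLINICAL = [
    "psychotherap*", "rapport", "alliance", "clinical relation*",
    "therapeutic relation*", "alliance", "client", "counsel*",
]


def or_group(terms):
    """Join one keyword set with OR, dropping repeated terms and quoting phrases."""
    unique = list(dict.fromkeys(terms))  # keeps order, removes duplicates
    quoted = [f'"{t}"' if " " in t else t for t in unique]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*keyword_sets):
    """Require at least one member of every set (logical AND between sets)."""
    return " AND ".join(or_group(s) for s in keyword_sets)


QUERY = build_query(SYNCHRONIZATION, PHYSIOLOGY, CLINICAL)
print(QUERY)
```

The resulting string is generic boolean syntax; restricting it to the title, abstract, and keywords fields is done differently in each database and is not shown here.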
A second in-depth research step was performed by inspecting under the same criteria the works referenced in the articles that were found both in the database search, and in the existing reviews on interpersonal co-regulation (Riess, 2011; Delaherche et al., 2012; Butler and Randall, 2013; Koole and Tschacher, 2016; Palumbo et al., 2016; Finset and Ørnes, 2017).
All the matching articles were retained irrespective of their methodological quality or year of publication. From each retrieved article, the following information was extracted: (a) the terms employed to define IP; (b) the theoretical framework employed to explain IP; (c) the physiological measurements employed; (d) the clinical sample and/or the general study design; (e) the specific psychological or clinical constructs hypothesized for their connection to IP; (f) the methodology through which IP was assessed; (g) the general findings.

RESULTS
As of September 2017, the keyword-based search returned the following number of documents: Scopus = 500, Web of Science = 98, PsycINFO = 163, PubMed = 98. Among these, 10 articles matched the inclusion criteria. Following the in-depth research step, 9 additional studies were identified; all 19 included studies are shown in Table 1.

Trends
The first relevant observation on the literature corpus is its time trend. Figure 1 shows how, after an initial exploration of the phenomenon in the second half of the twentieth century, [...]

[Table 1 fragment: categories "Tension," "Tension release," "Neutral," "Disagreement," "Antagonism"; graphical comparison of physiological activations (between-sessions within-subject correlations); patient's and therapist's HRV was similar in "Tension," "Tension release," and "Neutral" segments, but opposite in "Antagonism" segments.]

Terminology
Among the many differences, the variety of terms employed to denote the general idea of IP is a major obstacle to the development of the field. It is symptomatic that the broad keyword-based search returned only about half of the retrieved articles (10 of 19), and that the only recurring terms are concordance (n = 7) and synchrony/synchronization (n = 5), combined with a vast number of different descriptors referring to the physiological domain. Indeed, although the overall number of articles retrieved is relatively small, none of them (nor the previous reviews on the topic, presented in the introduction) managed to report them all.
In order to overcome this difficulty in the future and to aid the establishment of a research field identity, I suggest that further literature follow the advice of the excellent review by Palumbo et al. (2016) to employ interpersonal physiology as the most inclusive general domain term.
Even more serious is the fact that the terminological ambiguity extends to the methodologies. While the term concordance is mainly associated with the procedure described in Marci and Orr (2006) and Marci et al. (2007), the same term was previously employed to describe simple correlation (Di Mascio et al., 1955), and the same procedure has also been called physiological synchronization (Palmieri et al., in press), skin conductance resonance (Stratford et al., 2009), embodied synchrony, and therapeutic index (Stratford et al., 2012). It is imperative that new contributions propose and rely on operationally defined procedures that explicitly point to the type of IP assessed.
As an example, studies employing the procedure developed by Marci and colleagues should use Interpersonal Physiology in the title, to enable quick identification and simpler search, and should explicitly state which IP measures were employed (Marci's ratio or Marci's index) in the abstract and methodological sections, instead of vaguely speaking of synchronization, concordance, attunement, etc. In the same way, authors would specify whether they used a Student's t-test or a Mann-Whitney U-test procedure instead of referring to a generic comparison of means.

Theoretical Interpretation
Except for McCarron and Appel (1971), none of the studies employed IP as a predictor of a precise theoretical dimension. Five studies did not specify any theoretical interpretation for their data, 2 referred generically to embodiment theory, 2 to system models, 2 to alliance, and the largest group (n = 6) to empathy, reflecting the general trend of IP literature outside the clinical setting. This lack of specificity is a major obstacle to understanding the clinical meaning of IP. For instance, while its role in clinical relationships is undisputed, empathy is neither a simple nor a single process; instead, it is known to consist of the interaction of multiple components, ranging from basic affective processes up to complex cognitive and social dimensions, each of them being characterized by specific neurobiological activations (e.g., Coutinho et al., 2014). Knowing that IP correlates with patients' perceived empathy (Marci et al., 2007) has been a great incentive to foster this line of research but is, on the other hand, a very general and quite uninformative kind of observation. In order to reach the exciting scientific and clinical achievements promised by IP, we need to start asking more specific questions, such as: which specific empathic components are responsible for IP? What fundamental intersubjective processes does IP represent? Furthermore, specifically with regard to the clinical setting, we need to start informing our hypotheses with theoretical constructs of our clinical models (for a broader discussion of this topic, see Salvatore, 2011), for instance: which constructs from Dynamic Psychology or Cognitive Behavioral Therapy are associated with IP? Is IP a mechanism involved in interpersonal emotional regulation? In active listening? In transference and counter-transference processes? In projective identification?
And we need to proceed beyond the correlational logic by asking whether these theoretical constructs map exactly to IP, or only partially, and where any discrepancies come from. Such a form of reasoning could lead to an evidence-based refinement of the theoretical constructs, but could also inform and enrich the ways we observe and measure IP. Indeed, only after identifying the precise therapeutic dynamics (if any) that IP represents within an established model of psychotherapy might we be able to employ it to build empirical bridges across models.
Worthy of note in this direction is the effort by Ham and Tronick (2009): bringing together data from psychotherapy and mother-infant research, the authors propose a theoretical framework in which IP phenomena in therapeutic exchanges can be interpreted in the broader context of attachment theory and dyad models. Although still mostly speculative, the arguments presented in the article are a good and fertile example of how IP research can benefit from, and in return empower, clinical theory.

Correlation Assumptions
Aside from the studies relying on graphical comparison, and with one exception (Orsucci et al., 2016), the quantitative assessment of IP in this literature relies on correlational methods. Yet there are potential dangers in their blind adoption. Indeed, none of the retrieved studies directly assessed potential violations of the methodological assumptions and the associated risk of obtaining spurious correlations. For instance, the linearity and homoscedasticity assumptions of the Pearson correlation translate, in the time series domain, into the assumption of stationarity. This means not only that the individual signals are assumed to contain no relevant trends or drifts, but also that the linear association between them is constant over time, which, in data collected from human behavior, is almost never the case. Furthermore, if two signals are strongly autocorrelated (i.e., strongly dependent on their previous values), a correlation between them might produce inflated results, even when the signals are independent. The most frequently employed physiological signal in the reviewed papers, skin conductance (SC), is indeed highly nonstationary and autocorrelated, and requires either signal processing techniques, such as detrending or deconvolution (e.g., Benedek and Kaernbach, 2010), or specific analytic approaches. The most widely employed strategy in the retrieved literature was the windowing procedure (Boker et al., 2002), a nonparametric approach which substitutes the correlation's assumption of global stationarity with that of local stationarity by calculating correlation over shorter, usually overlapping windows; other authors suggested dynamical correlation as a more flexible alternative to the Pearson index. Yet no rigorous comparison and validation of these techniques has been published to date.
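The inflation of correlations between independent but autocorrelated signals can be illustrated with a small simulation. The sketch below is not taken from any of the reviewed studies: it uses random walks as a deliberately extreme stand-in for a drifting, autocorrelated signal, and first differencing as one simple (assumed) way of removing that dependence. The function name `mean_abs_r` and all parameter values are hypothetical.

```python
import numpy as np


def mean_abs_r(kind, n_pairs=200, n_samples=500, seed=0):
    """Average |Pearson r| between pairs of signals that are independent by construction."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(n_pairs):
        a = rng.standard_normal(n_samples)
        b = rng.standard_normal(n_samples)
        if kind in ("walk", "differenced_walk"):
            # Cumulative sums are nonstationary and strongly autocorrelated.
            a, b = np.cumsum(a), np.cumsum(b)
        if kind == "differenced_walk":
            # First differencing removes the drift again.
            a, b = np.diff(a), np.diff(b)
        values.append(abs(np.corrcoef(a, b)[0, 1]))
    return float(np.mean(values))


results = {kind: mean_abs_r(kind) for kind in ("white", "walk", "differenced_walk")}
for kind, value in results.items():
    print(f"{kind:>17}: mean |r| = {value:.3f}")
```

Although every pair is independent, the random-walk pairs show much larger absolute correlations than the white-noise or differenced pairs, which is the kind of spurious result the stationarity assumption is meant to exclude.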
Furthermore, the correlational approach reflects a general data-driven approach to IP, which has the advantage of providing a very direct and un-mediated assessment of the phenomenon, but, at the same time, the disadvantage of being only a measure of very simple linear association. Other methodologies, e.g., system dynamics (Ferrer and Helm, 2013), might be better suited to model more sophisticated and theoretically informed relationships between the dyad's signals.

Lag Analysis
Analogously, in the reviewed articles, the analysis of lag and of the direction of influence was performed through very simple procedures, such as applying a constant lag to the whole time series and comparing the resulting windowed cross-correlations across various lag settings. This approach has two main limitations. First, it does not model the lag fluctuations during the interaction (especially in longer interactions characterized by multiple turn changes, such as a psychotherapy session, the assumption of a constant lag might lead to spurious results). Second, while the lagged correlation can assess a temporal association between two phenomena (such as SC peaks), the method does not allow causation to be inferred (i.e., one person's activation causing another's at a later time), and thus it cannot be established whether a high synchronization at a specific lag is caused by leading-pacing dynamics or just by synchronization at different phases. This latter limitation can be addressed by employing specific causality tests such as Granger causality (Gourévitch et al., 2006; Liu and Molenaar, 2016). Granger causality implies not only a time-lagged association between two signals, but also that knowing the previous states of the first signal (the leading signal) allows a better prediction of the second signal (the pacing signal) than knowing only the previous states of the second signal. Like most parametric analytic tools for time series, Granger causality has strong assumptions, and might return spurious results on nonstationary and cointegrated series, requiring a cautious implementation in IP data. Just as for the cross-correlation approach, Granger causality in nonstationary data can be analyzed by using a windowing technique (Hesse et al., 2003), or through specific solutions (Toda and Yamamoto, 1995).
In conclusion, the technique has already been used to assess IP directionality in one publication on choir singers (Müller and Lindenberger, 2011), yet the concrete advantages over the simpler approaches and the overall validity of the procedure in this field are still unclear.
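The prediction-based idea behind Granger causality can be sketched as a comparison of two least-squares autoregressive models. This is a minimal illustration under strong assumptions (stationary simulated signals, a fixed model order, no windowing and no correction for cointegration); the names `lagged_matrix`, `granger_f`, `leader`, and `follower` are hypothetical, and the code returns only the F statistic, which would still have to be compared against an F distribution to obtain a p-value.

```python
import numpy as np


def lagged_matrix(series, order):
    """Columns hold series[t-1], ..., series[t-order] for t = order, ..., n-1."""
    n = len(series)
    return np.column_stack([series[order - k:n - k] for k in range(1, order + 1)])


def granger_f(cause, effect, order):
    """F statistic for: do past values of `cause` improve the prediction of `effect`?"""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    target = effect[order:]
    intercept = np.ones((len(target), 1))
    # Restricted model: the effect is predicted from its own past only.
    restricted = np.hstack([intercept, lagged_matrix(effect, order)])
    # Full model: the past of the candidate leading signal is added.
    full = np.hstack([restricted, lagged_matrix(cause, order)])

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        return float(residuals @ residuals)

    rss_restricted, rss_full = rss(restricted), rss(full)
    dof = len(target) - full.shape[1]
    return ((rss_restricted - rss_full) / order) / (rss_full / dof)


# Simulated dyad (an assumption for illustration): follower depends on leader two steps back.
rng = np.random.default_rng(2)
n = 1000
leader, follower = np.zeros(n), np.zeros(n)
noise_l, noise_f = rng.standard_normal(n), rng.standard_normal(n)
for t in range(2, n):
    leader[t] = 0.5 * leader[t - 1] + noise_l[t]
    follower[t] = 0.3 * follower[t - 1] + 0.8 * leader[t - 2] + noise_f[t]

f_forward = granger_f(leader, follower, order=3)
f_backward = granger_f(follower, leader, order=3)
print(f"leader -> follower: F = {f_forward:.1f}")
print(f"follower -> leader: F = {f_backward:.1f}")
```

In this simulation the statistic is large only in the direction in which the dependence was built, which is the asymmetry a plain lagged correlation cannot establish. On real physiological data the stationarity caveats discussed above apply in full.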

Choice of Parameters
Another issue concerns the many degrees of freedom that multivariate time-series analysis allows. In the windowed cross-correlation approach, the window size and increment, the lag interval size and range, and the analytic approach for lag are parameters that the researcher must fix, and they can be combined in a vast range of possible configurations, possibly altering the results in a radical way. For instance, using 30 s windows and 10 s lags might give very different results than using 10 s windows with 30 s lags. While most papers employing this methodology reported the same settings, originally proposed by Marci and Orr (2006), offering at least the comparability of their results, the authors of that original paper did not provide a rationale backing their choice of parameters.
The blind adoption of these settings can lead to seriously skewed conclusions, and generally hinders the possibility of moving from exploratory to confirmatory designs, as clearly explained in the methodological commentary "The garden of forking paths" by Gelman and Loken (2013). A new generation of research should establish its procedures on an empirical basis, for instance by selecting parameters that maximize the effect size of IP of real dyads in comparison to random data, and by questioning whether universally optimal parameters for human interaction exist, or whether context-specific variations are to be employed.
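To make the role of these parameters concrete, the following sketch implements a basic windowed cross-correlation and compares a simulated "real" dyad with a surrogate one under two parameter settings. It is an illustration only: the function names, the simulated signals, and the settings (expressed in samples, not seconds) are assumptions, and the surrogate is simply an unrelated random signal, not a validated surrogate procedure.

```python
import numpy as np


def windowed_cross_correlation(x, y, window, step, max_lag):
    """Pearson r between x[s:s+window] and y shifted by each lag, for every window start s.

    A positive lag pairs x with later samples of y (x leading y).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    starts = np.arange(max_lag, len(x) - window - max_lag + 1, step)
    r = np.empty((len(starts), len(lags)))
    for i, s in enumerate(starts):
        segment_x = x[s:s + window]
        for j, lag in enumerate(lags):
            segment_y = y[s + lag:s + lag + window]
            r[i, j] = np.corrcoef(segment_x, segment_y)[0, 1]
    return starts, lags, r


def peak_of_mean(x, y, window, step, max_lag):
    """Lag with the highest correlation averaged over windows, and that average."""
    _, lags, r = windowed_cross_correlation(x, y, window, step, max_lag)
    mean_r = r.mean(axis=0)
    best = int(np.argmax(mean_r))
    return int(lags[best]), float(mean_r[best])


# Simulated dyad: y repeats x five samples later, plus a little noise.
rng = np.random.default_rng(1)
base = rng.standard_normal(605)
x = base[5:]
y = base[:-5] + 0.1 * rng.standard_normal(600)
surrogate = rng.standard_normal(600)  # unrelated signal standing in for "random data"

summary = {}
for window, max_lag in [(30, 10), (10, 30)]:
    real = peak_of_mean(x, y, window, step=5, max_lag=max_lag)
    fake = peak_of_mean(x, surrogate, window, step=5, max_lag=max_lag)
    summary[(window, max_lag)] = (real, fake)
    print(f"window={window}, max_lag={max_lag}: real={real}, surrogate={fake}")
```

Comparing the real and surrogate peaks across a grid of settings is one simple way of grounding the parameter choice in data instead of convention. With noisier, nonstationary signals the two settings would not be expected to agree as neatly as in this idealized case.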

Interpretation of Negative Results
Interpreting the analytic results is a critical issue too. In the correlational approaches, the resulting values for a given period of time can be either positive, zero, or negative; it is not clear whether high negative correlations (i.e., toward −1) should be considered as a part of IP just like positive values (an approach followed by the Motion Energy Analysis literature, e.g., Ramseyer and Tschacher, 2011), or as the lack of IP [as in the index described by Marci and Orr (2006), and used in several following studies], or possibly as a different form of co-regulation underlying different behaviors (a yet unexplored direction). These can be important differentiators of measured IP processes, which have mostly been neglected in current publications and, where addressed, are described with a terminological variety similar to that of the main construct. For instance, among the selected papers only Di Mascio et al. (1955) explicitly acknowledged the phenomenon of an inverse correlation between patients' and therapists' physiology, and labeled it discordance, while other authors, outside the clinical field, employed different terms such as anti-phase physiological linkage (Reed et al., 2013) or complementarity (Dale et al., 2013). The lack of observable IP (i.e., very small or close to zero correlation) has not been explicitly addressed in the reviewed literature, while in other IP publications it has been referred to as asynchrony (Dale et al., 2013); instead, the term desynchronization, which is very common in neuroscience (e.g., "alpha desynchronization," or "circadian system desynchronization"), is not used in the IP field.
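How much the treatment of negative values matters can be shown by summarizing the same set of window correlations in different ways. The sketch below is purely illustrative: the three summaries and the example values are assumptions chosen to contrast the interpretations discussed above, and the log-ratio is written only in the spirit of indices that count negative correlations against IP; it is not claimed to reproduce the index of Marci and Orr (2006).

```python
import numpy as np


def summarize_window_correlations(r):
    """Three ways of collapsing per-window correlations into one session value."""
    r = np.asarray(r, dtype=float)
    positive = r[r > 0].sum()
    negative = np.abs(r[r < 0]).sum()
    if positive > 0 and negative > 0:
        log_ratio = float(np.log(positive / negative))
    else:
        log_ratio = float("nan")  # undefined when one side is empty
    return {
        "mean_signed": float(r.mean()),            # negatives cancel positives
        "mean_absolute": float(np.abs(r).mean()),  # negatives count as coupling
        "log_ratio": log_ratio,                    # negatives count against coupling
    }


# Hypothetical per-window correlations for three idealized sessions.
in_phase = [0.8, 0.7, 0.9, -0.1]
anti_phase = [-0.8, -0.7, -0.9, 0.1]
unrelated = [0.05, -0.04, 0.03, -0.05]

summaries = {
    name: summarize_window_correlations(values)
    for name, values in [("in_phase", in_phase), ("anti_phase", anti_phase), ("unrelated", unrelated)]
}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Under the absolute-value summary the in-phase and anti-phase sessions are indistinguishable, while under the signed and log-ratio summaries they sit at opposite extremes, so the choice of summary already encodes a position on what negative correlations mean.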

How Much Synchronization?
Finally, the implicit assumption present in most studies that "more IP" (in whatever way one may choose to measure it) is always better, might be an oversimplification. In the broader literature on interpersonal synchrony, some authors found that high synchrony was not necessarily connected with good interpersonal outcomes (Levenson and Ruef, 1992), and a study on mother-infant dyads found both very high and very low levels of prosodic synchronization to be predictive of insecure attachment (Jaffe et al., 2001).
In conclusion, these results suggest that IP in psychotherapy should be studied with a higher degree of sophistication, testing theory-driven hypotheses on what amount of IP is advisable and in what context, on whether the lack of IP or phase opposition could have clinical meanings, and on how to correctly assess and interpret lagged synchrony and causal direction.

Design
The nature of IP and its role in psychotherapy are still mostly unknown, as are its relationship with the main psychological dimensions and its dependence on the individual characteristics of the dyad's members, such as gender, diagnosis, or attachment style (just to name a few). Yet 11 out of 19 studies employed a nomothetic design, averaging across dyads or sessions, and comparing group measures to questionnaire data and other constructs in a confirmatory fashion. The results of such an approach are low-power studies that are unable to make strong points. Research on IP in psychotherapy, at this stage, could probably benefit more from an idiographic type of design, in which physiological dynamics are used either:
- In comparison to theoretically informed analysis of the clinical content. Specifically, models that describe micro-processes and use empirically sound and manualized content analysis procedures are promising candidates. Examples of those are ruptures and repairs (Safran et al., 2011), innovative moments (Goncalves et al., 2011), or the patient attachment coding system (Talia et al., 2017).
- In data-driven, bottom-up procedures of content classification. Therapy sessions contain vast amounts of information (verbal content, prosody, behavior, individual physiology, etc.) that can be paralleled to IP data to identify significant clusters and dynamics.
Among the available statistical tools to deal with these huge datasets, the Markovian transition matrix procedure (Orsucci et al., 2016) and the Conditional Inference Tree classification approach (Hothorn et al., 2006) can be suggested.
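As a generic illustration of the first of these tools, a first-order transition matrix can be estimated from any sequence of coded states. The sketch below does not reproduce the procedure of Orsucci et al. (2016); the state labels, the example sequence, and the function name `transition_matrix` are hypothetical.

```python
import numpy as np


def transition_matrix(states, labels):
    """Row-normalized counts of transitions between consecutive coded states."""
    index = {label: i for i, label in enumerate(labels)}
    counts = np.zeros((len(labels), len(labels)))
    for current, following in zip(states[:-1], states[1:]):
        counts[index[current], index[following]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that are never left stay at zero instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Hypothetical coding of consecutive session segments by type of coupling.
labels = ["in_phase", "anti_phase", "none"]
sequence = ["in_phase", "in_phase", "anti_phase", "none", "in_phase", "in_phase"]

matrix = transition_matrix(sequence, labels)
print(np.round(matrix, 2))
```

Each row gives the estimated probability of moving from one coded state to the next, which offers a compact description of session dynamics that can then be related to clinical content.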

CONCLUSIONS
IP in psychotherapy is a phenomenon reported by numerous independent scholars, and its existence (at least with regard to SC and cardiac activity) can be considered an established fact.
Its dynamics and clinical meaning, however, are still almost completely unknown, although there are strong suggestions that the phenomenon might be related to some primitive form of affective empathy such as emotional contagion (Coutinho et al., 2014). As highlighted by the growing number of publications and by most authors' arguments, there is the perception that this line of investigation holds great value for the field of psychotherapy research. The automation, objectivity, and ecology of the autonomic measures, and their ability to detect implicit intersubjective dynamics, which mostly occur beyond the conscious control of patients and therapists, outline the potential of IP as a research and clinical tool. Yet, to be able to fully benefit from these qualities, a significant amount of basic, idiographic, and/or bottom-up research must be performed. The literature search identified only 19 articles since the 1950s which assess IP in a clinical context, and most of them lack theoretically founded hypotheses, explicitly defined constructs, and operationally defined and empirically validated procedures.
To overcome these difficulties, the research community should strive to converge on common terminologies, to focus on the key questions on the nature and clinical interpretation of IP, and to embrace more sophisticated data analysis approaches. It is going to be a long road, but it will be worth it.

AUTHOR CONTRIBUTIONS
The author confirms being the sole contributor of this work and approved it for publication.