Heart Rate Variability and Cardiac Vagal Tone in Psychophysiological Research – Recommendations for Experiment Planning, Data Analysis, and Data Reporting

Psychophysiological research integrating heart rate variability (HRV) has increased during the last two decades, particularly because HRV is able to index cardiac vagal tone. Cardiac vagal tone, which represents the contribution of the parasympathetic nervous system to cardiac regulation, is acknowledged to be linked with many phenomena relevant for psychophysiological research, including self-regulation at the cognitive, emotional, social, and health levels. The ease of HRV collection and measurement, coupled with the fact that it is relatively affordable, non-invasive, and pain free, makes it widely accessible to many researchers. This ease of access should not obscure the difficulty of interpreting HRV findings, which can easily be misconstrued; however, this can be controlled to some extent through correct methodological processes. Standards of measurement were developed two decades ago by a Task Force within HRV research, and recent reviews updated several aspects of the Task Force paper. However, many methodological aspects related to HRV in psychophysiological research still have to be considered if one aims to draw sound conclusions, which makes it difficult to interpret findings and to compare results across laboratories. Those methodological issues have mainly been discussed in separate outlets, making it difficult to get a grasp on them, and thus this paper aims to address this issue. It provides psychophysiological researchers with recommendations and practical advice concerning experimental designs, data analysis, and data reporting. This should help ensure that researchers starting a project with HRV and cardiac vagal tone are well informed regarding methodological considerations, so that their findings can contribute to knowledge advancement in their field.


INTRODUCTION
Thanks to more accessible technology (hardware and software) and since the establishment of standards by the Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology two decades ago (Malik, 1996), followed by concurring guidelines from the Society for Psychophysiological Research (Berntson et al., 1997), heart rate variability (HRV), representing the change in the time interval between successive heartbeats (see Figure 1), became a strong focus of psychophysiological research. This is because HRV provides an index of the parasympathetic nervous system (Malik, 1996; Chapleau and Sabharwal, 2011). This is of particular interest, given the association of the parasympathetic nervous system with many aspects relevant for psychophysiology, such as self-regulation mechanisms linked to cognitive, affective, social, and health phenomena (Porges, 2007; Thayer et al., 2009; McCraty and Shaffer, 2015). The vagus nerve is the main nerve of the parasympathetic nervous system (Brodal, 2010); therefore, we refer to parasympathetic activity as vagal tone from now on. More specifically, in this paper we refer to cardiac vagal tone as assessed by HRV measurement (also referred to as cardiac vagal control, given that it reflects the contribution of the vagus nerve to cardiac functioning). Even though this paper makes broad recommendations on HRV, where possible we will specify when our recommendations apply specifically to vagal tone, given its relevance for psychophysiology.
FIGURE 1 | Heart rate variability (HRV). This figure displays the way HRV is calculated based on the R-R intervals of the QRS complex extracted from the electrocardiogram (ECG) signal.
Advances in technology and computer science have made data collection and analysis of HRV very accessible to psychophysiology researchers interested in the phenomenon. Moreover, HRV represents a non-invasive, pain-free, economical, and simple measurement, which again attracts many researchers. However, this ease of access to HRV collection often obscures the complicated nature of understanding and correctly interpreting the huge range of information that is provided by the numerous HRV parameters. The very nature of HRV itself may therefore have led to some confusion about its use in psychophysiological research so far. HRV measurement is very sensitive to several methodological aspects, which hinders the comparison between studies. Therefore, there is a need for researchers to be aware of methodological issues related to HRV measurement, so that results can be compared across laboratories around the world. The aim of this paper is to provide an overview of the methodological aspects to consider for the use of HRV in psychophysiological research, with specific recommendations for experiment planning, data analysis, and data reporting. Before going further, we clearly state that our recommendations are not intended to become recognized standards or to enforce experimental procedures to assess HRV and cardiac vagal tone, because the answers will ultimately depend on the research questions. Instead, we aim to take the researcher by the hand and provide a comprehensive overview of the important methodological issues one has to consider for psychophysiological research on vagal tone. We therefore provide the rationale for practical recommendations so that the reader can make educated choices on those aspects.
Given this overarching aim, we will not extensively cover every single topic in depth; instead, we provide the reader with Supplementary Materials in order to visualize suitable resources and signpost to other relevant research.
When a budding researcher first searches for research conducted on HRV, they are faced with a rather extensive list. On the 11th of January 2017, a search on the Web of Science with the keyword "HRV" returned 21,621 results, "parasympathetic" returned 10,983 results, "vagal" returned 16,339 results, while a combination of the three keywords returned 40,731 unique results. A psychophysiology researcher who wants to start investigating HRV and vagal tone is thus faced with a huge puzzle and may start by consulting previous summative work (Malik, 1996; Berntson et al., 1997; Chambers and Allen, 2007; Quintana and Heathers, 2014; Shaffer et al., 2014; Billman et al., 2015; Sassi et al., 2015; Quintana, 2016; Quintana et al., 2016).
As mentioned earlier, measurement and interpretation issues were discussed two decades ago by a Task Force on HRV (Malik, 1996) and by further researchers (Berntson et al., 1997), offering the first solid foundation for HRV research. The recommendations of the Task Force are very useful and provide information on which HRV parameters to take into account and what their significance is at the physiological level. However, researchers in psychophysiology could not see a direct interpretation regarding the phenomena they were used to working with. Fortunately, some pioneers then contributed to the establishment of links between HRV and psychophysiological phenomena (Grossman and Taylor, 2007; Porges, 2007; Thayer et al., 2009; McCraty and Childre, 2010; Lehrer, 2013), giving rise to five psychophysiological theories that we detail in the next section. At the methodological level, the Task Force left researchers in psychophysiology to make their own decisions concerning experiment planning and the practical realization of experiments. In addition, data analysis and data reporting were also omitted from the Task Force report, although both can have a key influence on the results being reported and their interpretation.
A decade after the Task Force, an important milestone for research in psychophysiology was published. A special issue on cardiac vagal control addressed theoretical and methodological issues directly relating to psychophysiology (Chambers and Allen, 2007). Finally, an updated vision on HRV, including the theoretical and methodological progress of the last two decades, was delivered by several research teams. A critical review of the new HRV analysis methods that emerged since the Task Force was provided by Sassi et al. (2015). Further, methodological issues were debated in a special issue led by Tak et al. (2009) and Billman et al. (2015), interpretation and theoretical issues were presented in Shaffer et al. (2014), guidelines for reporting HRV experiments were presented in Quintana et al. (2016), statistical considerations for HRV case-control studies were presented in Quintana (2016), and methodological considerations for biobehavioral research were presented in Quintana and Heathers (2014). These reviews were helpful as they updated the knowledge acquired over two decades regarding HRV; however, they still did not fully answer the needs of psychophysiological researchers. The review of Sassi et al. (2015) concluded that the new methods to analyze HRV did not allow for a better understanding of the physiological systems underlying HRV, such as vagal tone, and therefore did not represent a meaningful added value for psychophysiologists. Billman et al. (2015) focused on methodological concerns mainly linked to research in physiology, which sometimes do not entirely reflect the needs of psychophysiology researchers. Tak et al. (2009) focused on the methodological quality of HRV studies within a specific topic (i.e., functional somatic disorders). Shaffer et al. (2014) did focus on psychophysiological aspects by reviewing the psychophysiological theories, but did not address issues related to measurement and data analysis. Quintana et al. (2016) provided a useful checklist for reporting articles on HRV for psychiatry; however, this very focus on data reporting in psychiatry left many questions open for psychophysiological researchers. One of the key differences between the needs of psychiatry researchers and those of psychophysiological researchers is the focus on cases vs. controls for the former, and the focus on tasks vs. baseline for the latter. This focus on cases vs. control participants was also found in Quintana (2016), regarding the statistical considerations about reporting and planning HRV studies. Finally, Quintana and Heathers (2014) provided very helpful methodological recommendations for psychophysiological research regarding within-subject design, controlling for respiration, and baseline measurement. However, they did not account for many additional issues that need to be addressed by a psychophysiological researcher in order to plan, measure, analyze, interpret, and report HRV data correctly. Several of these aspects were later attended to by Quintana et al. (2016); however, there is still a need to provide a comprehensive overview. This overview should cover the majority of methodological challenges faced by a psychophysiological researcher willing to investigate HRV and vagal tone, a need which this paper aims to fill.
In summary, the initial ease of access to HRV often hides how complicated it is to understand and correctly interpret which information is actually relevant. HRV measurement is very sensitive to several methodological aspects, such as the body position in which the baseline is recorded (Young and Leicht, 2011), which hinders the comparison between studies. Moreover, the inconsistent reporting of HRV parameters and analyses in HRV research papers may contribute to the confusion in the use of HRV and the conclusions drawn regarding vagal tone in psychophysiological research so far, as pointed out by Quintana et al. (2016). Thus the aim of this paper is to provide recommendations for the assessment of HRV in psychophysiological research. It will endeavor to cover all aspects of HRV research in psychophysiology, from experiment planning and measurement to data analysis and data reporting. In other words, it will act as a set of recommendations to enable psychophysiological researchers to conduct a full research project with HRV with a focus on vagal tone.

HRV IN PSYCHOPHYSIOLOGICAL RESEARCH: A FOCUS ON VAGAL TONE
As reviewed by Shaffer et al. (2014), there are five theories involving HRV in psychophysiological research: the neurovisceral integration model, the polyvagal theory (Porges, 2007), the biological behavioral model (Grossman and Taylor, 2007), the resonance frequency model (Lehrer, 2013), and the psychophysiological coherence model (McCraty and Childre, 2010). The neurovisceral integration model assumes a connection between the prefrontal cortex and the heart through the central autonomic network and the vagus nerve. The main assumption of this model is that the higher the vagal tone, the better the executive cognitive performance, as well as emotional and health regulation. Porges (2007), who developed the polyvagal theory, assumed that a higher vagal tone is associated with better social functioning. The biological behavioral model (Grossman and Taylor, 2007) focuses on the fact that vagal tone plays a primary role in the regulation of energy exchange by synchronizing respiratory and cardiovascular processes during metabolic and behavioral changes. A higher resting vagal tone is seen as adaptive, given "it reflects a functional energy reserve capacity from which the organism can draw during more active states" (Grossman and Taylor, 2007, p. 279). Lehrer (2013) put forward the resonance frequency breathing model, which states that an efficient way to increase vagal tone is through slow paced breathing at the resonance frequency. Finally, the psychophysiological coherence model (McCraty and Childre, 2010) shares similarities with Lehrer's model, in that a higher vagal tone can be achieved through slow paced breathing. Its authors also postulate that slow paced breathing coupled with positive emotions will produce a broad range of positive outputs linked to personal, social, and global health (McCraty and Childre, 2010). One common ground of those five theories regarding HRV is their focus on vagal tone, which also constitutes one of the main focuses of HRV research.
Given the focus on vagal tone of all existing psychophysiological theories, its measurement and interpretation will receive particular attention within this paper.

PLANNING A RESEARCH PROJECT WITH HRV
In this first section on experiment planning we will address the issues concerning the HRV variables to assess, the choice of within-subject vs. between-subject design, sample size, experiment structure, variables to control and the choice regarding considering HRV as a dependent or independent variable.

HRV Variables to Assess
The first question that the researcher wants to answer is the following: out of the more than 70 variables that can be calculated from HRV analysis (Bravi et al., 2011; Smith et al., 2013a,b), which are the variables of interest for psychophysiological research?
The answer will depend on the phenomenon of interest and the research question itself. As an overview, HRV analysis can be performed in the time-domain, in the frequency-domain, and finally with non-linear indices. We will present the main variables of interest for psychophysiological researchers together with the physiological systems that they reflect (Table 1), with a specific interest in variables depicting vagal tone, given the theoretical focus on those variables.
In the time-domain, the standard deviation of all R-R intervals (SDNN) reflects all the cyclic components responsible for variability in the period of recording (Malik, 1996). The root mean square of successive differences (RMSSD) reflects vagal tone (Thayer and Lane, 2000; Kleiger et al., 2005) and is highly correlated with high-frequency (HF) HRV (Kleiger et al., 2005). In addition, RMSSD is relatively free of respiratory influences, contrary to HF parameters. The percentage of successive normal sinus R-R intervals that differ by more than 50 ms (pNN50) is correlated with RMSSD and HF power and is thus also assumed to reflect vagal tone (Shaffer et al., 2014). However, RMSSD typically provides a better assessment of vagal tone and is normally preferred to pNN50 (Otzenberger et al., 1998). Beyond those traditional variables, additional analyses based on the time-domain properties of the ECG signal can be used to infer vagal tone, such as the peak-valley analysis (Grossman et al., 1990), also known as peak-to-trough analysis (Lewis et al., 2012), which acts as a time-domain filter dynamically centered at the exact ongoing respiratory frequency (Grossman et al., 1990). Finally, another way of quantifying vagal tone is the Porges-Bohrer method (Lewis et al., 2012), which displayed interesting statistical properties in comparison to other metrics.
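The three traditional time-domain variables described above can be illustrated with a minimal sketch. It assumes an artifact-free list of R-R intervals in milliseconds; the function name `time_domain_hrv` and the example values are hypothetical, and the sample standard deviation is used for SDNN as one common convention.

```python
import math
import statistics


def time_domain_hrv(rr_ms):
    """Return SDNN, RMSSD and pNN50 for a list of R-R intervals (ms).

    Assumes the series has already been cleaned of artifacts.
    """
    if len(rr_ms) < 3:
        raise ValueError("need at least three R-R intervals")
    # Successive differences between adjacent R-R intervals
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # SDNN: standard deviation of all R-R intervals (sample SD assumed here)
    sdnn = statistics.stdev(rr_ms)
    # RMSSD: root mean square of the successive differences
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # pNN50: percentage of successive differences larger than 50 ms
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return {"sdnn": sdnn, "rmssd": rmssd, "pnn50": pnn50}


rr = [800, 810, 790, 860, 805, 795]  # illustrative values only
metrics = time_domain_hrv(rr)
print(metrics)
```

In practice these values would normally be obtained from dedicated HRV software after artifact correction; the sketch only makes the definitions concrete.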
In the frequency-domain, the analysis requires filtering the signal into different bands (for a comprehensive visual display of the filtered frequencies, see Shaffer et al., 2014, p. 8). The ultra-low frequency (ULF) band is located below 0.0033 Hz. It reflects circadian oscillations, core body temperature, metabolism, and the renin-angiotensin system (Berntson et al., 1997). It can only be assessed with 24 h recordings (Kleiger et al., 2005). The very-low frequency (VLF) range is located between 0.0033 and 0.04 Hz. This band represents long-term regulation mechanisms, thermoregulation, and hormonal mechanisms (Malik, 1996; Berntson et al., 1997). The low-frequency (LF) band ranges between 0.04 and 0.15 Hz. The LF band reflects a mix of sympathetic and vagal influences (Malik, 1996; Berntson et al., 1997). HF, specifically between 0.15 and 0.40 Hz (Malik, 1996), reflects vagal tone. This is also frequently called the respiratory band because it corresponds to the heart rate variations related to the respiratory cycle (Eckberg and Eckberg, 1982). HF is influenced by breathing when breathing rates are between 9 cycles per minute (0.15 Hz) and 24 cycles per minute (0.40 Hz) (Malik, 1996). When the breathing rate remains within this range, the respiration-related HRV stays within the boundaries of the HF band, thus reflecting vagal tone. Bands might need to be adjusted to the population of interest: for example, children and infants breathe faster, and a recommendation for them would be to move the boundaries of the band to 0.24-1.04 Hz at rest. In a similar vein, athletes usually present slower respiratory rates that may interfere with the measured HF band (Saboul et al., 2014). Therefore population characteristics should always be considered when bands are chosen, either by looking at previous research or by calculating the respiratory rates of the sample under investigation.
In this case it is always important to couple the HRV frequency analysis with time-domain parameters assumed to index vagal tone, to see to what extent they correlate, for example with RMSSD, which is assumed to be less affected by respiratory influences. Heart rate accelerates during inspiration and slows down during expiration, a phenomenon that is called respiratory sinus arrhythmia (RSA). Hence, in the literature the term respiratory sinus arrhythmia is often used instead of HF, as it is assumed to reflect vagal tone (Eckberg, 1983). However, for the sake of clarity we would recommend using HF when referring to vagal tone, and using RSA to depict the heart rate variations accompanying inspiration and expiration, respectively accelerating and slowing down (Eckberg and Eckberg, 1982). Finally, the LF/HF ratio was long considered as representing the sympatho-vagal balance, that is, the balance between the sympathetic and parasympathetic systems. However, this view has been highly criticized (Eckberg, 1997; Billman, 2013). Among the most critical aspects are the loose relationship between LF power and sympathetic nerve activation, and the non-linear and non-reciprocal relationship between sympathetic and parasympathetic nerve activity (Billman, 2013). Hence, there is now a consensus that the precise physiological underpinning of the LF/HF ratio is unclear, which lowers its predictive value. Although around 65% of HRV papers still base their conclusions on it (Heathers, 2014), we strongly recommend that researchers adopt HRV indices that reflect clearly identified physiological systems with a theoretical underpinning, such as the indices of vagal tone (i.e., RMSSD, peak-valley, and HF-HRV).
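The relation between breathing rate and the HF band discussed above is simple arithmetic (cycles per minute divided by 60 gives Hz), and can be sketched as a quick check of whether a sample's respiratory rate falls inside the chosen band. The constant and function names below are hypothetical, and the band limits are those quoted in the text.

```python
# Band limits in Hz as quoted in the text (names are illustrative)
HF_BAND_ADULT_HZ = (0.15, 0.40)
HF_BAND_CHILD_HZ = (0.24, 1.04)  # suggested for children and infants at rest


def breaths_per_min_to_hz(breaths_per_min):
    """Convert a respiratory rate in cycles per minute to Hz."""
    return breaths_per_min / 60.0


def respiration_within_band(breaths_per_min, band_hz=HF_BAND_ADULT_HZ):
    """Return True if the respiratory frequency lies inside the band."""
    low, high = band_hz
    return low <= breaths_per_min_to_hz(breaths_per_min) <= high


print(respiration_within_band(12))                     # 0.20 Hz, adult band
print(respiration_within_band(6))                      # 0.10 Hz, slow paced breathing
print(respiration_within_band(30, HF_BAND_CHILD_HZ))   # 0.50 Hz, child band
```

A participant breathing at 6 cycles per minute, for instance, would fall below the adult HF band, which is one reason to check respiratory rates before interpreting HF as vagal tone.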
In addition, we could mention some non-linear indices that can be obtained from the interbeat interval (IBI) series. As the autonomic nervous system is characterized by complex and erratic fluctuations, some researchers suggest that non-linear analyses might be more adequate and precise for HRV analysis than the prevalent linear measures (Piskorski and Guzik, 2005). One of those non-linear indices is the Poincaré plot. The plot itself displays the correlation of R-R intervals (which are usually measured in milliseconds) by plotting each interval as a function of the preceding interval (autocorrelation). The result is a plot which illustrates quantitative and qualitative patterns of one's individual HRV in the shape of an ellipse. Two further parameters are derived from the ellipse, namely the two standard deviations resulting from the orthogonal distances between the scatter and the elliptical diameters: the first crosswise (SD1) and the second lengthwise (SD2) to the ellipse. SD1 is assumed to be more sensitive to quick, high-frequency changes, whereas SD2 is viewed as an indicator of long-term changes (Piskorski and Guzik, 2005). As research results suggest, Poincaré plots can be seen as indicators of vagal activity and of reduced cardiac vagal control, which are associated not only with physiological but also with psychological strain and stress (Collins and Karasek, 2010; Melillo et al., 2011). However, some caution is still required regarding non-linear indices, and their utility for predicting psychophysiological phenomena has still to be demonstrated (Sassi et al., 2015). Hence, we would suggest not using them as single indicators, but rather as complementary HRV indicators.
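As a minimal sketch of the Poincaré descriptors described above, SD1 and SD2 can be computed by rotating each pair of successive R-R intervals by 45 degrees and taking the standard deviation across and along the line of identity. This is one common formulation, assumed here for illustration; the function name `poincare_sd1_sd2` is hypothetical and the input is assumed to be artifact-free.

```python
import math
import statistics


def poincare_sd1_sd2(rr_ms):
    """Return (SD1, SD2) of the Poincaré plot for R-R intervals in ms."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three R-R intervals")
    # Each point of the plot is (current interval, next interval)
    pairs = list(zip(rr_ms[:-1], rr_ms[1:]))
    # Coordinates perpendicular to the line of identity (crosswise)
    across = [(nxt - cur) / math.sqrt(2) for cur, nxt in pairs]
    # Coordinates along the line of identity (lengthwise)
    along = [(nxt + cur) / math.sqrt(2) for cur, nxt in pairs]
    sd1 = statistics.stdev(across)
    sd2 = statistics.stdev(along)
    return sd1, sd2


# A steadily lengthening series has no beat-to-beat scatter around its trend,
# so SD1 is zero while SD2 captures the slower drift.
sd1, sd2 = poincare_sd1_sd2([800, 810, 820, 830])
print(sd1, sd2)
```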
Finally, a critical review of the new HRV analysis methods that emerged since the Task Force report (Malik, 1996) was conducted by Sassi et al. (2015), and the conclusion was that the new methods did not bring any additional information regarding the physiological underpinnings of HRV, and therefore no additional information for vagal tone.

Within-Subject Design vs. Between-Subject Design
Deciding on the experimental design is crucial for HRV experiments. Given high inter-individual variations and the complex interactions influencing HRV, within-subject designs are highly recommended (Quintana and Heathers, 2014). Within-subject designs offer optimal experimental control, contribute to the elimination of individual differences in respiratory rates (though there is still a need to control for them, which will be covered later), require fewer participants given that they offer increased statistical power, and reduce the impact of external factors such as medication, alcohol, smoking, etc. (Quintana and Heathers, 2014). If testing occurs on different days, the time at which the experiment is conducted should be kept constant; this should also apply to between-subject designs, with participants taking part in the experiment at the same time of day (Massin et al., 2000; van Eekelen et al., 2004). One limitation of within-subject designs is the habituation to the experimental conditions and the learning effect that can be observed in some experimental tasks. In this case we would therefore recommend, whenever possible, the use of non-identical correlated tasks investigating similar constructs. For example, response inhibition can be investigated through the use of concurrent tasks such as the Stroop color naming task and a stop signal task (Miyake et al., 2000). Alternatively, if the task needs to remain the same and the conditions change, it is possible to counterbalance the conditions in order to reduce confounding effects (e.g., low pressure vs. high pressure, such as in Laborde et al., 2014).
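Counterbalancing, as mentioned above, can be sketched as cycling participants through every possible order of the conditions. This is only one simple scheme under assumed conditions (full counterbalancing, which is practical for a small number of conditions); the function name `counterbalanced_orders` and the condition labels are illustrative.

```python
from itertools import permutations


def counterbalanced_orders(conditions, n_participants):
    """Assign each participant one order of conditions, cycling through
    all possible orders so that each order is used about equally often."""
    orders = list(permutations(conditions))
    return [list(orders[i % len(orders)]) for i in range(n_participants)]


schedule = counterbalanced_orders(["low pressure", "high pressure"], 4)
for participant, order in enumerate(schedule, start=1):
    print(participant, order)
```

With two conditions and four participants, two participants start with each condition. In a real study the mapping of participants to orders would typically also be randomized.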

Sample Size
An effect size distribution analysis of close to 300 HRV effect sizes (Quintana, 2016) revealed that HRV studies are generally underpowered, and that Cohen's guidelines (Cohen, 1988) to interpret effect sizes should be adapted. More particularly, instead of interpreting 0.20, 0.50, and 0.80 as, respectively, small, medium, and large effect sizes, we should rather interpret 0.25, 0.50, and 0.90 as representing, respectively, small, medium, and large effect sizes. In terms of sample sizes, the effect size distribution analysis suggests that in order to achieve 80% power, samples of 233, 61, and 21 participants are required, respectively, to detect small, medium, and large effect sizes (Quintana, 2016). If another effect size or statistical power level is desired, a power analysis can be conducted, for example with the help of G*Power 3 (Faul et al., 2007, 2009) or the "pwr" package (Champely, 2016) available for the R statistical environment, which facilitates the calculation of the required sample size.

Experiment Structure and the Three Rs: Resting, Reactivity, Recovery
Following the previous section, the next question is to consider the structure of the experiment. But first of all, we need to understand the concepts of tonic and phasic HRV, which are recognized to be important in terms of adaptation (Porges, 2007; Thayer et al., 2012). Tonic HRV has also been referred to as resting HRV or baseline HRV and refers to HRV measured at one time point. Phasic HRV shows how the system reacts and has been named reactivity, stimulus-response, change delta HRV, and vagal withdrawal, in order to represent the change in HRV between two different time points. When assessing HRV in terms of these two concepts, tonic and phasic, it is important to consider what the level of or changes in HRV represent. For example, when measuring tonic HRV it is clear from the literature that a higher resting vagal tone is beneficial in most cases.
There are some exceptions to this general rule (e.g., Stein et al., 2005; Peschel et al., 2016), for example when a higher resting vagal tone level is observed in the case of eating disorders such as bulimia nervosa (Peschel et al., 2016), which may be due to a decreased resting metabolic rate originating from limited calorie intake (Martin et al., 2007). Assessing the phasic level may require a little more interpretation of the vagal activity in order to determine whether it is adaptive or not. For example, a high level of vagal withdrawal (decrease in HRV) may be seen as adaptive or not depending on the situation. It may be seen as adaptive when the individual is facing a physical stressor or a mental stressor that does not involve executive function, as this demonstrates the individual's ability to provide the organism with the necessary energy to face the stressor (Porges, 2007), as shown experimentally (Neumann et al., 2004; Rottenberg et al., 2005; Lewis et al., 2007; Messerotti Benvenuti et al., 2015). However, when the stressor faced by the individual requires executive functioning, then a higher level of vagal withdrawal is seen as maladaptive, as shown experimentally (e.g., Marcovitch et al., 2010; Laborde et al., 2014, 2015b; Park et al., 2014). Related to this, an interesting recent study by Park et al. (2014) showed that tonic HRV might influence phasic HRV. In a selective attention task involving fearful and neutral faces acting as distractors, lower tonic vagal tone was associated with phasic vagal tone withdrawal when fearful distractors were used, under both low and high perceptual load. In contrast, higher tonic vagal tone was associated with phasic vagal tone enhancement under low perceptual load and an absence of phasic HRV suppression under high perceptual load.
As a consequence for researchers, this means that both tonic and phasic HRV values need to be taken into account because their interaction can shed light on findings that would otherwise remain unclear.
Based on the respective roles of tonic and phasic vagal tone, we would advise researchers to adopt the following structure in their experimental designs: three time points referred to as baseline, event, and post-event (e.g., Berna et al., 2014). We suggest this experimental structure to subsequently introduce the three Rs of HRV: resting, reactivity, recovery (see Figure 2). The three Rs structure allows for the investigation of tonic HRV at each of the three measurement points (i.e., baseline, event, post-event). In addition, it allows for a measure of phasic HRV, as we can measure the change between baseline and event (that we coin here as "reactivity"), the change between event and post-event (that we coin here as "recovery"), and the change between baseline and post-event, according to the research questions. The change in HRV, for reactivity and recovery, can be reported either in absolute values or as a percentage (see Duschek et al., 2009).
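The reactivity and recovery scores described above can be sketched as simple difference scores. The sketch assumes one HRV value (for example RMSSD in ms) per measurement point and, as an assumption not specified in the text, expresses the percentage change relative to the earlier of the two time points; the function names are hypothetical.

```python
def phasic_change(before, after):
    """Return (absolute change, percentage change) between two time points.

    The percentage is expressed relative to the earlier value (assumption).
    """
    absolute = after - before
    percent = 100.0 * absolute / before
    return absolute, percent


def three_rs(baseline, event, post_event):
    """Phasic HRV scores derived from the three tonic measurement points."""
    return {
        "reactivity": phasic_change(baseline, event),        # baseline -> event
        "recovery": phasic_change(event, post_event),        # event -> post-event
        "baseline_to_post": phasic_change(baseline, post_event),
    }


# Illustrative RMSSD values in ms
result = three_rs(baseline=40.0, event=30.0, post_event=36.0)
print(result)
```

With these illustrative values, reactivity is a 10 ms (25%) decrease, which would correspond to vagal withdrawal during the event.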

Variables to Control
Now that the structure of HRV experiments has been outlined, it is important to consider the confounding variables influencing HRV that can be controlled. In the Supplementary Materials (Data Sheet 1) we provide the reader with an example of a demographic form that can be adapted and used within experiments. The form recommends collecting the following information and potentially controlling for it, according to the research question. We therefore suggest that the researcher considers the following stable and transient participant variables.
Stable variables:
• Age and gender (Umetani et al., 1998)
• Smoking (Hayano et al., 1990; Sjoberg and Saint, 2011)
• Habitual levels of alcohol consumption (Quintana et al., 2013a,b)
• Weight, height, and waist-to-hip ratio (Yi et al., 2013)
• Cardioactive medication, such as antidepressant (Kemp et al., 2010), antipsychotic (Cohen et al., 2001), or antihypertensive (Schroeder et al., 2003) medication. Among psychotropic medication, a systematic review revealed that only tricyclic antidepressants and clozapine were found to statistically significantly influence HRV (Alvares et al., 2016), but we would still recommend documenting any cardio-related medication taken by the participants.
• Oral contraceptive intake for female participants: it might not influence HRV during rest conditions (Rebelo et al., 2011; Nisenbaum et al., 2014), but it may influence the response to stressful conditions (Kirschbaum et al., 1999).
FIGURE 2 | Typical experiment structure for HRV experiments, depicting the three Rs: resting, reactivity, and recovery. HRV, Heart rate variability.
Frontiers in Psychology | www.frontiersin.org
Transient variables:
• Follow a normal sleep routine the day before the experiment; record the typical bed time and typical waking time (Stein and Pu, 2012)
• No intense physical training the day before the experiment (Stanley et al., 2013a)
• No meal in the last 2 h before the experiment (Lu et al., 1999)
• No coffee or other caffeinated drinks such as energy drinks (Zimmermann-Viehoff et al., 2015) or tea (Inoue et al., 2003) in the 2 h before the experiment
• Ask if they need to use the bathroom before the experiment begins (Heathers, 2014)
• No alcohol for 24 h prior to the experiment (Quintana et al., 2013a,b)
Keeping track of all the potential confounding variables may allow the researcher to exclude participants prior to data collection or to understand outliers within the data after collection. As a note, individuals may give inaccurate information when reporting many of these demographic factors. Researchers should therefore bear in mind that the ideal procedure would be to obtain objective measures of these potentially confounding factors whenever possible. For instance, rather than asking participants if they have any blood pressure conditions, taking a direct measure of blood pressure would be more accurate. Also, instead of relying on self-reported physical or mental illnesses, it would be preferable for participants to receive a physical examination by a clinician and screening for psychiatric disorders.
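As an illustration of how the transient variables listed above could be checked before a session, the following sketch flags deviations from the listed guidelines. The record structure `PreSessionReport`, its field names, and the function `protocol_deviations` are hypothetical and are not taken from the demographic form in the Supplementary Materials; only the time thresholds come from the list.

```python
from dataclasses import dataclass


@dataclass
class PreSessionReport:
    """Self-reported transient variables for one participant (hypothetical)."""
    slept_normally: bool
    intense_training_day_before: bool
    hours_since_meal: float
    hours_since_caffeine: float
    hours_since_alcohol: float


def protocol_deviations(report):
    """Return the list of transient-variable guidelines that were not met."""
    deviations = []
    if not report.slept_normally:
        deviations.append("sleep routine")
    if report.intense_training_day_before:
        deviations.append("intense training")
    if report.hours_since_meal < 2:
        deviations.append("meal within 2 h")
    if report.hours_since_caffeine < 2:
        deviations.append("caffeine within 2 h")
    if report.hours_since_alcohol < 24:
        deviations.append("alcohol within 24 h")
    return deviations


compliant = PreSessionReport(True, False, 3.0, 5.0, 48.0)
flagged = PreSessionReport(True, False, 1.0, 5.0, 12.0)
print(protocol_deviations(compliant))
print(protocol_deviations(flagged))
```

Whether a flagged participant is excluded, rescheduled, or simply documented would depend on the research question.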

HRV as a Dependent or Independent Variable
Heart rate variability has been used as either a dependent or an independent variable by researchers. As a dependent variable, it is used to see how it relates to other variables in correlation and regression analyses (e.g., Laborde et al., 2011; Jennings et al., 2015), or how it differentiates groups split according to other criteria such as individual differences (e.g., Schwerdtfeger and Derakshan, 2010; Laborde et al., 2015a) or experimental conditions (e.g., Egizio et al., 2008; Laborde and Raab, 2013).
As an independent variable, much research has considered resting HRV as an individual difference per se, and groups were then created with a median split, for example low and high RMSSD (for a review, see Thayer et al., 2009). The rationale for considering HRV as an individual difference can be seen as more than a statistical trick, because it also has some theoretical and empirical premises. For example, high resting HRV, which often reflects high resting vagal tone, is consistently associated with positive outcomes. Furthermore, resting levels of HRV, and particularly of vagal tone, are reasonably stable over time (Bertsch et al., 2012), and finally, cardiac vagal control is partially heritable (Neijts et al., 2014).
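As an illustration of the median split mentioned above, the following minimal sketch assigns participants to a low or high resting RMSSD group. The function name, the sample values, and the rule for values exactly equal to the median are assumptions made for the example, not part of any cited procedure.

```python
import numpy as np

def median_split(resting_rmssd):
    """Label each participant 'low' or 'high' relative to the sample median."""
    values = np.asarray(resting_rmssd, dtype=float)
    cutoff = np.median(values)
    # Assumption: values exactly equal to the median go to the 'low' group.
    labels = np.where(values > cutoff, "high", "low")
    return cutoff, labels

# Hypothetical resting RMSSD values (ms) for six participants
cutoff, groups = median_split([22.0, 35.5, 41.0, 58.2, 63.7, 80.1])
print(cutoff, groups.tolist())
```

Because a median split discards within-group variation, it may be worth reporting the continuous values alongside the group labels.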

MEASUREMENT STANDARDS
In this category we address the measurement standards associated with best practice within HRV research. These include: issues associated with the HRV recording devices and the signal to record, electrocardiogram (ECG) sampling, duration of recording, baseline recording, measurement in ambulatory settings, measurement with movement, and respiration.

HRV Recording Device and Signal
Several recording techniques exist to measure HRV: ECG recordings, IBI recordings, or photoplethysmography. Using an ECG allows researchers to obtain HRV data directly from the electrical activity of the heart, seen as the QRS complex (the graphical depiction of ventricular depolarization, i.e., a heartbeat). This can be collected through more traditional ECG equipment or through modern technologies such as the eMotion Faros device (Mega Electronics, Kuopio, Finland), which uses only two electrodes. ECG recordings are more accurate in terms of artifact correction because they allow the researcher to see the QRS complex, and hence the heartbeats, leading to very precise correction. As we will detail later, artifact correction is a very important step in the pre-processing of the signal (Berntson and Stowell, 1998; Shaffer and Combatalade, 2013). When electrodes are used, it is necessary to follow standard recommendations regarding electrode positioning according to the device used (Kligfield et al., 2007), and also to consider skin preparation, such as cleaning or hair removal, in order to improve signal quality.
When measuring the IBI, the researcher collects only the time between heartbeats. One method of collecting IBI data is through chest belts. The advantage of these belts coupled to heart rate monitors is that they are widely available, and some specific heart rate monitors can record the IBI properly (Weippert et al., 2010). The disadvantage of chest belts is that they create more artifacts than electrodes due to friction against the skin. In addition, they can only measure the IBI and not the ECG signal, which makes them less accurate than ECG.
Photoplethysmography involves shining a small light onto an area where capillaries are easy to access, such as the finger or ear lobe, using a sensor. The light reflected back to the sensor depicts blood volume in the vessel and thus forms the basis for detecting a heartbeat. Photoplethysmography measures pulse-to-pulse interval data, which is a mixture of the IBI and pulse transit time. Photoplethysmography is considered to represent an accurate approximation of the IBI (Gil et al., 2010). A comparative review of photoplethysmography against ECG stated that photoplethysmography can be used during rest but not during stress (Schafer and Vagedes, 2013). This is because stress induces changes in pulse transit time (the time the blood pressure wave takes to travel from the heart to the periphery), which result from changes in the elasticity of the arteries and cannot be detected through IBI (Shaffer et al., 2014). Furthermore, the curved peak of the blood volume pulse signal is harder to detect accurately than the sharp upward spike of the R wave of the QRS complex, which can best be determined through ECG.
Finally, the emergence of the quantified self movement (Swan, 2013), in which lay people track and monitor their own psychophysiological data, is giving birth to many consumer devices aiming to measure HRV. However, the majority of these consumer devices have not been validated against ECG measures, and they present several drawbacks. These usually include reporting a proprietary metric rather than a standard metric, not providing access to raw data, and not offering technical details of correction methods. Therefore, for research purposes we would recommend using electrodes instead of a chest belt or finger sensors. Researchers should aim to obtain the ECG recording to allow for precise editing of the signal for artifact correction.
Importantly, in case the device used does not allow for the use of time markers, a very precise time protocol of every experimental event should be kept in order to allow for later analysis (see an example in Supplementary Material, Data Sheet 2).
At the applied level, with the growing impulse to quantify psychology and with the growth of biofeedback, it is not surprising that many practitioners look to HRV smartphone apps to meet this need. Smartphone apps connected to either a chest belt or a photoplethysmography sensor to measure HRV, such as Ithlete, have been shown to provide reliable HRV measurement (Flatt and Esco, 2013). However, this was based on the RR interval, and for research purposes we recommend the use of devices able to record the ECG signal, as mentioned earlier. We strongly recommend avoiding smartphone apps that use the camera of the smartphone to detect the finger pulse, because their sampling rate is simply too low to provide a reliable assessment of HRV. Considering sampling rates is crucial to ensure accurate ECG recording, as we will discuss in the next section.

ECG Sampling
Together with the choice of HRV recording technique, it is also important to make an informed choice of ECG sampling rate. Temporal accuracy is crucial to successfully calculate the variance of a time series, and thus to identify as precisely as possible the fiducial points (i.e., landmarks) of the ECG complex. The accuracy of HRV measurements is primarily determined by the sampling rate of the data acquisition system, which should be set at a minimum of 200 Hz according to the Task Force (Malik, 1996). More conservative guidelines advise between 500 and 1000 Hz (Riniolo and Porges, 1997; Berntson et al., 2007), which may be particularly relevant in case of lower amplitude RSA (Riniolo and Porges, 1997). However, recent developments on this topic, relying on Monte Carlo-based analysis of false positive rates, showed that when R-peak interpolation (i.e., mathematical estimation of the digitized signal performed to enhance the R-wave fiducial point) was applied prior to HRV calculation, the sampling rate could even be lower than 100 Hz, and that without R-peak interpolation the analysis could be considered reliable down to 125 Hz for all measures, and far lower for specific measures (Ellis et al., 2015). Based on these recent developments, we advise an ECG sampling rate of at least 125 Hz for psychophysiological research in order to ensure a reliable assessment of HRV parameters, but researchers who would like to anticipate issues that may arise with low amplitude RSA could choose a conservative lower boundary of 500 Hz.
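To make the link between sampling rate and timing precision concrete, the sketch below computes the spacing between samples and refines a detected peak with a parabola fitted through the peak sample and its two neighbours. Parabolic fitting is only one simple form of R-peak interpolation and is not necessarily the method evaluated by Ellis et al. (2015). The function names and the synthetic signal are assumptions made for the example.

```python
import numpy as np

def sample_period_ms(fs_hz):
    """Spacing between samples, i.e., the timing resolution of an uninterpolated R peak."""
    return 1000.0 / fs_hz

def refine_peak(signal, peak_index):
    """Sub-sample peak location from a parabola through the peak and its two neighbours.

    Returns the refined location in (fractional) samples.
    """
    y_prev = signal[peak_index - 1]
    y_peak = signal[peak_index]
    y_next = signal[peak_index + 1]
    denom = y_prev - 2.0 * y_peak + y_next
    if denom == 0:  # flat top: no refinement possible
        return float(peak_index)
    return peak_index + 0.5 * (y_prev - y_next) / denom

# Synthetic stand-in for an R wave: a parabola whose true apex lies between two samples
fs = 125.0                      # Hz
true_peak = 10.3                # apex position, in samples
samples = np.arange(21)
ecg_like = 1.0 - 0.05 * (samples - true_peak) ** 2

coarse = int(np.argmax(ecg_like))     # nearest sample to the apex
fine = refine_peak(ecg_like, coarse)  # interpolated apex position
print(sample_period_ms(fs), coarse, fine)
```

At 125 Hz the samples are 8 ms apart, so an uninterpolated peak can be misplaced by up to half that spacing. A real R wave is not an exact parabola, so the refinement there is approximate.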

Duration of Recording
As duration of recording might have an influence on HRV parameters, above all for the time domain, the Task Force created a set of gold standards for recording durations. For short-term recordings they recommend a duration of 5 min (Malik, 1996) in order to ensure comparability of results across studies and laboratories. The basis for the recommendation is that the recording should last for at least 10 times the wavelength of the lower frequency bound of the investigated component. In specific cases, for example to meet the needs of an experimental design, recordings could be made shorter. However, the argumentation to do so needs to be strong, and 1 min should be seen as the absolute minimum to obtain a reliable assessment of HF (Malik, 1996). More recently this 5 min gold standard has been challenged with shorter recording durations. For example, a 1 min recording of the natural log of RMSSD has been shown to offer good reliability in comparison to the classical 5 min RMSSD (Esco and Flatt, 2014). Furthermore, recordings between 10 and 50 s, depending on which HRV parameters are considered, have been shown to be reliable under certain conditions (Salahuddin et al., 2007). Very recent work by Munoz et al. (2015) investigated the validity of ultra-short and short recordings on HRV measurements in a very large adult sample (N = 3,387). They found that it is unnecessary to use recordings longer than 120 s to obtain accurate measures of RMSSD. In addition, even a single 10 s (standard ECG) recording was found to yield a valid RMSSD measurement, although an average over multiple 10 s ECGs is preferable. Those 10 s periods do not need to be contiguous (i.e., in succession), so it is possible to obtain a good estimate from several 10 s measurements spread over a trial or the experiment. In any case, researchers would need to carefully justify their choice of period duration and location within the experiment when presenting their data analysis strategy.
To summarize the suggestions presented for duration of recording, we recommend, in line with the Task Force (Malik, 1996), keeping a 5 min recording when possible to enable comparison between clinical studies. Depending on the research question, a minimum duration of 1 min is needed to allow frequency analysis when vagal tone is targeted, and shorter recordings can be considered if RMSSD is used as an index of vagal tone. This would reduce the duration of HRV experiments and allow for ultra-short measurements in specific cases, for example enabling genetic epidemiological studies to be performed on a large scale.
For certain indicators, longer recordings of 24 h could be interesting. The Task Force mentions that the 24 h indices seem to be "stable and free of placebo effect, (which may make them) ideal variables with which to assess intervention therapies" (Malik, 1996, p. 363). However, one main difference has to be taken into consideration for the analysis, namely whether analyses should be based on a single segment of 24 h or on 5 min epochs over a 24 h period. Single analyses of 24 h suffer from several problems (Malik, 1996). Firstly, they violate stationarity: if the mechanisms responsible for heart period modulation of a certain frequency remain unchanged during the whole recording period, then (and only then) the corresponding frequency component of HRV may be used as a measure of these modulations; otherwise the interpretation cannot be ascertained. Secondly, they do not reflect the activity of the autonomic nervous system. For example, Roach et al. (1998, 2004) and Raj et al. (2004) have shown that low frequency HRV measures (e.g., ultra-low frequency and SDANN) reflect physical activity and functional capacity of patients and not strictly autonomic nervous system activity. Therefore, interpretations of differences between individuals on these measures as indicative of autonomic nervous system differences are problematic. Using 5 min epochs over a 24 h period can be very advantageous, as we can examine circadian variation and night values, which in some cases can be more predictive than daytime values (e.g., Jarczok et al., 2012). Those 5 min epochs can be used differently: either using a moving window, for example a 5 min moving window to calculate HRV parameters at 1 min intervals (e.g., Fenton-O'Creevy and Lins, 2012), or using strict intervals, for example 5.35 min blocks (Jarczok et al., 2012). We would advise using strict intervals, given that the moving average suffers from interpretive difficulties.
Afterward, the average over 24 h can be calculated, as well as day and night averages according to the research question (Jarczok et al., 2012). Bulky Holter devices were previously used for 24 h monitoring, which may cause some discomfort for participants (e.g., Soares-Miranda et al., 2014). However, the new generation of ECG devices weighing under 15 g (for example, the Faros devices from Mega Electronics, Kuopio, Finland) remains almost unnoticed by participants.
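The "10 times the wavelength" rule can be worked through numerically. The short sketch below assumes the HF lower bound of 0.15 Hz used in this paper and, as an additional assumption not stated in this section, the conventional LF lower bound of 0.04 Hz. The function name is hypothetical.

```python
def min_recording_seconds(lower_bound_hz, n_cycles=10):
    """Shortest recording covering n_cycles periods of the slowest oscillation in a band."""
    return n_cycles / lower_bound_hz

hf_min = min_recording_seconds(0.15)  # HF lower bound: about 67 s, i.e., roughly 1 min
lf_min = min_recording_seconds(0.04)  # assumed LF lower bound: 250 s, a little over 4 min
print(round(hf_min, 1), lf_min)
```

This is consistent with 1 min being treated as the minimum for HF and with the 5 min standard covering the LF band.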
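The strict-interval approach can be sketched as follows: beats are assigned to non-overlapping epochs, RMSSD is computed within each epoch, and the epoch values are then averaged. This is a minimal illustration under stated assumptions rather than the procedure of any cited study; the function names, the `min_beats` cut-off, and the synthetic data are hypothetical.

```python
import numpy as np

def rmssd(ibi_ms):
    """Root mean square of successive differences of inter-beat intervals (ms)."""
    diffs = np.diff(np.asarray(ibi_ms, dtype=float))
    return float(np.sqrt(np.mean(diffs ** 2)))

def epoch_rmssd(beat_times_s, ibi_ms, epoch_s=300.0, min_beats=30):
    """RMSSD per strict, non-overlapping epoch. Returns {epoch_index: rmssd_ms}."""
    beat_times_s = np.asarray(beat_times_s, dtype=float)
    ibi_ms = np.asarray(ibi_ms, dtype=float)
    # Epoch 0 starts at the first beat; each beat belongs to exactly one epoch.
    epoch_index = np.floor((beat_times_s - beat_times_s[0]) / epoch_s).astype(int)
    result = {}
    for idx in np.unique(epoch_index):
        segment = ibi_ms[epoch_index == idx]
        if len(segment) >= min_beats:  # assumption: skip nearly empty epochs
            result[int(idx)] = rmssd(segment)
    return result

# Synthetic recording: 1200 beats alternating between 800 and 820 ms (about 16 min)
ibi = np.tile([800.0, 820.0], 600)
beat_times = np.cumsum(ibi) / 1000.0
per_epoch = epoch_rmssd(beat_times, ibi)
overall = float(np.mean(list(per_epoch.values())))
print(per_epoch, overall)
```

Day and night averages could be obtained the same way by averaging only the epochs whose start times fall in the relevant period.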

Baseline Recording
The recording of an accurate baseline is crucial (Quintana and Heathers, 2014), for several reasons. In order to standardize baseline measurement, it should be made as consistent as possible to ensure comparability of results across samples, experiments, and laboratories. The way in which it is carried out should be precisely described (e.g., body posture and instructions given). In the majority of studies, the baseline recording is taken while sitting with knees at a 90° angle, both feet flat on the floor, hands on thighs and eyes closed, similar to what is recommended for blood pressure procedures (Pickering et al., 2008; Ghuman et al., 2009). With regard to hand position, we recommend palms facing upward, given that palms facing downward could introduce interoceptive effects if participants feel their radial (wrist) pulse. Other postures for baseline [e.g., supine (lying down), standing] might be used if they make sense regarding the experimental conditions, for example in sleep research (Neijts et al., 2014). Whatever the posture chosen, it is important that the participant has been in this posture for at least 5 min before the baseline measurement (Ghuman et al., 2009). This can be referred to more generally as acclimatization to the recording environment, which is accomplished by using an analysis period starting later than the start of the recording. This ensures that the potential anxiety and increased attention to respiration and heart rate that may occur when people are told that recording is starting have had time to fade. In addition, this would support not announcing the start of recording, to protect the validity of the measurement period of interest. During baseline recording participants have to stay seated without speaking or making any movements; they are asked to relax and to breathe spontaneously.
It is best to control for the time of day of the assessment (van Eekelen et al., 2004) and to keep procedures consistent across participants.
Some solutions were created to standardize baseline with the double aim of avoiding mind-wandering and disruptive thoughts and of proposing a baseline that is closer to experimental conditions. Firstly, some researchers attempted to standardize baseline recording using a video with neutral stimuli, like an aquatic video, which may be more comfortable for some participants than sitting with their eyes closed (Piferi et al., 2000). However, researchers have to be conscious that this can also influence an individual's cardiac reactivity to the experimental tasks. Secondly, a passive and restful baseline is often compared to experimental tasks that involve performance of a psychomotor, cognitive, or stressful task, for example. This might then conflate the difference between passive rest and paying attention to the experimental task with the difference between passive rest and the specific experimental task demands (Quintana and Heathers, 2014). An alternative to this forced relaxation could then be to perform the Vanilla baseline (Jennings et al., 1992), where participants have to perform a task requiring sustained attention but minimal cognitive load. However, researchers have to make this decision carefully with regard to their experimental task, as sustained attention is still linked to HRV.
To summarize, some researchers argue that an ideal HRV baseline recording does not exist, as there is no single list of parameters that would apply under all circumstances. Instead, a baseline recording can be defined as "the non-task situation that best controls for the presence of task comparison" (Quintana and Heathers, 2014, p. 6), and we agree that this is suitable where experiments are concerned. When we want to consider resting HRV as a predictor, we still need to standardize the procedure, which is generally the one indicated earlier (i.e., sitting, knees at a 90° angle, hands on thighs, palms facing upward, eyes closed). We recommend that the body position used in the experimental condition be as close as possible to the one used during baseline, for example a sitting baseline compared to a sitting cognitive task, or a standing baseline compared to a standing psychomotor task. Finally, when baseline recording is used to assess HRV as a trait, aggregation across at least two measurements is recommended in order to discard situation-specific variance (Bertsch et al., 2012). Regarding this last point, some authors recommend measuring HRV in supine rest first thing in the morning so that it is the least influenced by external factors (Buchheit et al., 2005).

HRV in Ambulatory Settings
We have previously mentioned that standardization for HRV experiments is crucial, and therefore running reliable lab experiments is important in HRV research. However, researchers in psychophysiology may wish to consider HRV in ecologically valid environments and thus may ask themselves whether ambulatory, long-term measurement of HRV is possible. The answer is yes, and it can provide very interesting information, for example when assessing 24 h recordings. When assessing HRV in situ, the predictive value of HRV is increased when controlling for respiration and physical activity. Ambulatory settings will introduce the issue of dealing with movement, which we address in the next section.

HRV Recording and Movement
In line with the previous section, we now consider whether it is possible to record HRV with movement, either inside or outside the lab. As HRV reflects the activity of the sympathetic and parasympathetic nervous systems, HRV is immediately affected when the individual starts to move, as both systems are involved in meeting physical demands (Brodal, 2010). Therefore movement will influence HRV parameters, and in addition it may cloud the regulation linked to cognitive, emotional, social and health processes. The guidelines of the Task Force (Malik, 1996) are extremely clear on this matter: for an unambiguous interpretation of the physiological mechanisms underlying HRV, the measurement needs to be carried out without physical activity. If the research question requires it, it may be possible to perform ambulatory measurements of HRV while controlling for respiration and physical activity; however, researchers have to be aware that in this case a clear interpretation of vagal tone will not be possible.
Another issue with movement is that we risk more artifacts within the data set. Currently, there is no generally available algorithm able to separate the influence of movement on HRV from the influence of other regulatory processes (but see Verkuil et al., 2016, for a new approach to this issue). The most common strategy in case of movement is to collect accelerometer data together with the HRV measurement and then to delete the sections where movement was excessive (e.g., Hansen et al., 2003; Johnsen et al., 2003). Specific algorithms like continuous wavelet transformation minimize motion artifacts (Villarejo et al., 2013); however, this only addresses one aspect of the issue, because concerns related to interpretation of the data remain (Malik, 1996). As an alternative, HRV can be assessed directly before the task involving movement, for example when used as a precompetitive marker before sport competition, where a decrease in vagal tone is generally observed, as before a swimming competition (Cervantes Blásquez et al., 2009) and a bike (i.e., BMX) competition (Mateo et al., 2012). HRV assessed before physical performance could then potentially serve to some extent as a predictor of the following motor performance. Building on this, some studies provided evidence for the predictive role of HRV measured during the task when some movement was involved, for example in a police shooting simulator (Saus et al., 2006) or with a navigation simulator (Saus et al., 2012). This is encouraging for future studies aiming to reproduce ecologically valid situations with HRV.
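The accelerometer-based strategy described above (deleting sections where movement was excessive) can be sketched in a few lines. The function name, the movement measure, and the threshold value are all assumptions made for the example; what counts as "excessive" has to be defined and justified for each study.

```python
import numpy as np

def drop_movement_beats(beat_times_s, ibi_ms, accel_times_s, accel_mag, threshold):
    """Keep only the beats recorded while the movement measure was at or below threshold."""
    beat_times_s = np.asarray(beat_times_s, dtype=float)
    ibi_ms = np.asarray(ibi_ms, dtype=float)
    # Linearly interpolate the accelerometer magnitude at each beat time.
    movement_at_beat = np.interp(beat_times_s, accel_times_s, accel_mag)
    still = movement_at_beat <= threshold
    return beat_times_s[still], ibi_ms[still]

# Hypothetical data: a burst of movement between roughly 3 and 4 s
accel_times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
accel_mag = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0]   # arbitrary units
beat_times = [0.8, 1.6, 2.4, 3.2, 4.0, 4.8]
ibi = [800.0] * 6

kept_times, kept_ibi = drop_movement_beats(beat_times, ibi, accel_times, accel_mag, threshold=0.5)
print(kept_times.tolist())
```

After deletion the remaining beats are no longer contiguous, so measures based on successive differences should be computed within each uninterrupted section rather than across the gaps.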

Respiration
Controlling for respiration is a long-standing debate within HRV research. The proposed reason for controlling respiration is that HRV could be affected in certain circumstances by respiratory depth, the amount of air taken into the lungs (Hirsch and Bishop, 1981), and by respiratory frequency, the number of breaths per minute (Brown et al., 1993; Houtveen et al., 2002). It could also be affected by the central respiratory drive, estimated through partial pressure of CO2 (Houtveen et al., 2002). Hence, in order to accurately assess vagal function, it has been proposed to "correct" HRV for respiration (Grossman, 1992), with these respiratory factors controlled either online during the experiment or offline after the experiment with post hoc statistical analyses. However, the routine control of respiration is problematic for several reasons that we and others have described in detail (Denver et al., 2007; Larsen et al., 2010; Thayer et al., 2011; Lewis et al., 2012). Briefly, researchers (Larsen et al., 2010; Thayer et al., 2011; Dick et al., 2014) suggest a common basis for HRV and respiration, with bi-directional communication between the respiratory and cardiovascular systems. This would make a routine correction of HRV for respiration problematic. Nevertheless, we detail some of the issues associated with respiration and HRV below.
Regarding respiratory depth, respiratory sinus arrhythmia [which reflects HF when the breathing frequency is between 9 and 24 cycles per minute (Malik, 1996)] shows greater amplitude during higher tidal volumes and lower respiratory frequencies (Hirsch and Bishop, 1981; Shaffer et al., 2014). Respiratory depth is linked to tidal volume, and controlling for tidal volume can be done with pneumotachography (Quintana and Heathers, 2014). This process allows tidal volume to be measured; however, it requires a face mask, which may not be practical to use and might create interference in experimental psychophysiological research. A post hoc approach could be to use a dedicated algorithm to control for tidal volume (e.g., Schulz et al., 2009). However, the effect of respiratory depth or tidal volume on HRV has been shown to account for less than 5% of the variance in several measures of HRV, but more than 10% of the variance when using the peak-to-trough method (Lewis et al., 2012).
Regarding respiratory frequency, one component that may be heavily influenced by it is vagal tone, because HF is deemed to reflect vagal tone only when breathing frequency is higher than nine cycles per minute (Malik, 1996; Berntson et al., 1997). More specifically, HF corresponds to vagal tone when between 0.15 and 0.40 Hz, which means a respiratory rate between 9 and 24 cycles per minute. Therefore, for any respiratory rate below or above this interval, HF may no longer accurately depict vagal tone. In comparison to HF, RMSSD has been shown to be less affected by respiratory rate. Nonetheless, we still need to have knowledge of the respiratory rate in order to determine whether the changes we observe in HRV values are primarily due to changes in respiratory frequency (Kuehl et al., 2015).
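The correspondence between the HF band and respiratory rate is a simple unit conversion (cycles per minute divided by 60 gives Hz). A minimal check of whether a measured respiratory rate falls within the HF band could look like this; the function names are hypothetical.

```python
HF_BAND_HZ = (0.15, 0.40)  # HF band limits as used in this paper

def breaths_per_min_to_hz(breaths_per_min):
    """Convert a respiratory rate in cycles per minute to a frequency in Hz."""
    return breaths_per_min / 60.0

def respiration_within_hf(breaths_per_min, band_hz=HF_BAND_HZ):
    """True if the respiratory frequency lies inside the HF band."""
    frequency_hz = breaths_per_min_to_hz(breaths_per_min)
    return band_hz[0] <= frequency_hz <= band_hz[1]

for rate in (6, 15, 30):
    print(rate, breaths_per_min_to_hz(rate), respiration_within_hf(rate))
```

A participant breathing at 6 cycles per minute (0.1 Hz), for instance, falls below the band, so HF power for that recording may not be interpretable as vagal tone.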
It is possible to control online and offline for respiratory rate. Doing this online would require using a strain gage during the experiment (Quintana and Heathers, 2014). When this is not possible, and considering the ambiguity of some strain gage estimates (Thayer et al., 2002), there is still an option to control this offline with a post hoc estimation. An estimate of respiratory rate can for example be derived from the central frequency of the HF component detected in an autoregressive analysis of HR. The central frequency of HF-HRV is highly correlated with strain gage measures of respiration (Thayer et al., 2002). However, the limitation of this method is that there should be an observable HF component. If there is no observable peak, it is questionable whether there is really any true HF power, or if it is just noise. This is one reason why the autoregressive analysis of HR is preferred to Fast Fourier Transform in this case (Thayer et al., 2002). Another estimation method, originally developed by Moody et al. (1985), is available with Kubios to estimate from ECG data the respiratory frequency from changes in R-wave amplitude, which is called the ECG derived respiration (Tarvainen et al., 2014). As a conclusion regarding the measure of respiratory rate, a strain gage would provide more accurate results given it would account for any non-cyclical respiration patterns (e.g., sudden sighs, coughs). However, if no strain gage is available, the offline methods based on the calculation of the HF peak obtained with autoregressive analysis or on the ECG derived respiration could be used.
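The offline estimate based on the central frequency of the HF component can be illustrated with a small autoregressive spectrum. The sketch below fits the AR model with the Yule-Walker equations, which is one of several possible estimators and not necessarily the one used by Thayer et al. (2002) or by Kubios. It assumes the IBI series has already been resampled to an even rate (4 Hz here), and the function names and synthetic signal are hypothetical.

```python
import numpy as np

def ar_psd(x, fs, order=16, n_freqs=1024):
    """Power spectrum of an evenly sampled series from a Yule-Walker AR model."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    # Yule-Walker equations: Toeplitz(r[0..order-1]) @ a = r[1..order]
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a = np.linalg.solve(r[lags], r[1:])
    noise_var = r[0] - np.dot(a, r[1:])
    freqs = np.linspace(0.0, fs / 2.0, n_freqs)
    k = np.arange(1, order + 1)
    transfer = 1.0 - np.exp(-2j * np.pi * np.outer(freqs, k) / fs) @ a
    return freqs, noise_var / (fs * np.abs(transfer) ** 2)

def respiratory_rate_from_hf_peak(x, fs, band=(0.15, 0.40), order=16):
    """Frequency of the largest AR spectral value inside the HF band, in cycles per minute."""
    freqs, psd = ar_psd(x, fs, order)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak_hz = freqs[in_band][np.argmax(psd[in_band])]
    return peak_hz * 60.0

# Synthetic, evenly resampled IBI series with a 0.25 Hz (15 cycles/min) oscillation
rng = np.random.default_rng(0)
fs = 4.0
t = np.arange(0, 300, 1 / fs)
ibi = 850 + 40 * np.sin(2 * np.pi * 0.25 * t) + rng.normal(0, 10, t.size)
rate = respiratory_rate_from_hf_peak(ibi, fs)
print(round(rate, 1))
```

Note that this function always returns the largest value within the band, even when there is no genuine peak; as discussed above, the spectrum should be inspected to confirm that an observable HF component exists before the estimate is trusted.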
Another approach would be to force participants to breathe at a specific rate during the experiment (Grossman and Taylor, 2007). A compromise could be to control for respiration rate during the experiment by measuring a participant's natural breathing rate, and then using the derived frequency for respiratory pacing (Elstad, 2012). While these approaches could still work for baseline measurement, this procedure may influence HRV during emotional or cognitive tasks. If the participant has to consciously follow the pacing cue, in addition to paying attention to the experimental task, this might then influence task output (Quintana and Heathers, 2014). Moreover, paced breathing, even at the pace of spontaneous breathing, is problematic, as it has been shown to either increase, decrease, or not change estimates of HF HRV in a manner that is not predictable (Larsen et al., 2010), while in some cases paced breathing provides similar results to spontaneous breathing (e.g., Bertsch et al., 2012).
A totally different approach is to let participants breathe spontaneously because forcing participants to breathe at a specific pace would suppress an important influence on HRV (Denver et al., 2007). As reviewed by Thayer et al. (2011), there is an aggregate of evidence from behavior genetics, neuroimaging, cardiorespiratory coupling, and psychophysiological studies suggesting that the removal of variance associated with respiration from HRV would remove variance associated with the common neural origin of HRV and respiration.
Whereas there is continued controversy regarding respiratory control, several things are clear: (1) different indices of HRV are differentially affected by breathing, with the peak-to-trough method being most affected (e.g., Penttila et al., 2001; Lewis et al., 2012); (2) when analyzed appropriately, even measures derived from the peak-to-trough method can be reliable indicators of HRV without additional respiratory control (Lewis et al., 2012); (3) it has been repeatedly shown that the effects of respiration on parasympathetic indices of HRV recorded under resting state conditions are minimal at best, and resting state HRV is recorded best under conditions of spontaneous breathing (e.g., Larsen et al., 2010; Bertsch et al., 2012). Controlling for respiration when examining HRV indices will remove variability associated with neural control over the heartbeat, and therefore some of the variance that the researcher is actually interested in would be removed (see Larsen et al., 2010 for a thorough review); (4) it is useful to have some indication of respiration to aid the interpretation of HRV and to ascertain that participants were breathing "normally." In sum, based on the most recent evidence on this topic (Thayer et al., 2011), we recommend that researchers do not engage in routine correction of HRV for respiration in case of spontaneous breathing. However, we still recommend monitoring respiration, in order to foster the understanding of the neurobiological mechanisms and contextual factors responsible for the complex interactions between the respiratory and cardiovascular systems. Additionally, researchers should check whether respiratory frequency remains between 9 and 24 cycles per minute (corresponding to the HF band, 0.15-0.40 Hz). If conclusions have to be drawn regarding vagal tone, it is important to have no differences in respiratory frequency between experimental tasks or between case and control groups.

HRV DATA ANALYSIS
In this section we will provide recommendations regarding HRV software, artifact correction, normality of HRV data, HRV frequency-domain analysis, and which HRV variables to analyze.

HRV Software
The analysis of HRV data has been made very accessible through a popular free software package, Kubios (Tarvainen et al., 2014), which is currently the most used by researchers. New software packages like gHRV (Rodriguez-Linares et al., 2014), a package for the R statistical environment, and ARTiiFACT (Kaufmann et al., 2011) are in development and offer additional tools, such as different analysis options and the ability to visualize and export the HRV data, as well as the possibility to edit the source code for gHRV. Having access to the original source code is an advantage over proprietary software, as the source code can be used to interpret what is being measured.

Artifact Correction
Any HRV data sets requires a signal pre-processing before proceeding to the analysis, with the objective to identify the fiducial points (typically the R peak) from a normal ECG QRS complex. Hence, all abnormal beats not generated by sinus node depolarisations should be eliminated from the record. HRV data from ambulatory recordings generally contain more artifacts that can be either of physiological or technical origins. Technical artifacts may result from poorly attached electrodes or to excessive motion from the individual. Physiological artifacts may include ectopic beats, atrial fibrillations sighs and coughs.
When recording the IBI this only allows you the possibility to carry out an automatic artifact correction, given the fiducial points of an ECG are not recorded. For example Kubios will allow you to automatically filter your data, the purpose of this is to detect RR intervals that differ "abnormally" from the normal mean RR interval which may represent an artifact (Tarvainen and Niskanen, 2012). The different threshold levels for artifact correction in Kubios are the following: very low = 0.45 s, low = 0.35 s, medium = 0.25 s, strong = 0.15 s, very strong = 0.05 s. This procedure is very commonly used and considered sufficient in most cases for example when data was recorded in good conditions and little movement or electrode/belt movements. However, we would strongly advise to record ECG signal because in doing this the researcher is able to edit the data and modify the artifact correction manually afterward by performing a visual inspection of the ECG signal. We therefore highly recommend not to rely only on an automatic artifact correction like the one offered by Kubios, because artifacts detected by the automatic procedure of Kubios may correspond to real heartbeats, as displayed in the Supplementary Materials (see Data Sheet 3), and recommend instead to visually inspecting the ECG signal. The consequence of deleting a real heart beat that is assumed to be an artifact may have critical consequences. These consequences include a substantial influence on the HRV values and losing precious information regarding the variability of the heart rate signal. In the Supplementary Material example we see that using the automatic correction option of Kubios (very low filter) could lead to an 11% error rate in the evaluation of vagal tone. This is eloquently explained by Berntson and Stowell (1998) who stated that only one heartbeat makes a difference in the analysis. 
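To show how a threshold-based automatic filter behaves, the sketch below flags RR intervals that deviate from a local reference by more than the selected threshold. It uses the threshold levels listed above, but it is not Kubios's actual implementation: the use of a running median as the local reference, the window length, and the function name are assumptions made for the example.

```python
import numpy as np

# Threshold levels (in seconds) as listed in the text for Kubios
THRESHOLDS_S = {"very low": 0.45, "low": 0.35, "medium": 0.25,
                "strong": 0.15, "very strong": 0.05}

def flag_artifacts(rr_s, level="medium", window=11):
    """Flag RR intervals deviating from the local median by more than the threshold.

    Assumption: the local reference is a running median over `window` beats,
    with the edges padded by repeating the first and last values.
    """
    rr_s = np.asarray(rr_s, dtype=float)
    half = window // 2
    padded = np.pad(rr_s, half, mode="edge")
    local_median = np.array([np.median(padded[i:i + window]) for i in range(len(rr_s))])
    return np.abs(rr_s - local_median) > THRESHOLDS_S[level]

# Hypothetical RR series (s) with one very long and one very short interval
rr = [0.80, 0.82, 0.81, 0.79, 1.60, 0.80, 0.83, 0.40, 0.81, 0.80]
flag_medium = np.flatnonzero(flag_artifacts(rr, "medium")).tolist()
flag_very_low = np.flatnonzero(flag_artifacts(rr, "very low")).tolist()
print(flag_medium, flag_very_low)
```

In this example the short interval is flagged at the medium level but not at the very low level, which illustrates why the flagged beats depend on the chosen threshold and why a visual check against the ECG remains advisable.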
Additional software, such as ARTiiFACT (Kaufmann et al., 2011), can assist in detecting artifacts in ECG signals. We also refer the reader to a useful guide on how to prepare the HRV recording prior to the analysis (Shaffer and Combatalade, 2013). As a last remark, if the experiment involves different conditions, it is advantageous for the person handling the HRV data, particularly when performing the artifact analysis, to be unaware of the experimental conditions. This reduces experimenter bias and the risk of steering the data toward particular or expected results.
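To make the threshold logic concrete, the following Python sketch flags RR intervals that deviate from a local median by more than one of the threshold levels listed above. It is an illustration only and not the algorithm implemented in Kubios: the local-median reference, the window length, and the function name flag_suspect_intervals are assumptions introduced here. In line with the recommendation to inspect the ECG visually, the function only returns the indices of suspect intervals for review and never deletes or corrects them.

```python
from statistics import median

# Threshold levels in seconds, as listed in the text for Kubios.
THRESHOLDS_S = {
    "very low": 0.45,
    "low": 0.35,
    "medium": 0.25,
    "strong": 0.15,
    "very strong": 0.05,
}


def flag_suspect_intervals(rr_s, level="medium", window=11):
    """Return indices of RR intervals (in seconds) that differ from the
    median of their neighbours by more than the chosen threshold.

    Assumption: a local median over `window` beats is used as the
    reference; this is a hypothetical choice made for illustration.
    """
    rr_s = list(rr_s)
    threshold = THRESHOLDS_S[level]
    half = window // 2
    flagged = []
    for i, rr in enumerate(rr_s):
        lo = max(0, i - half)
        hi = min(len(rr_s), i + half + 1)
        # Exclude the interval under test from its own reference.
        neighbours = rr_s[lo:i] + rr_s[i + 1:hi]
        if neighbours and abs(rr - median(neighbours)) > threshold:
            flagged.append(i)
    return flagged


# Toy series: one long (1.10 s) interval among intervals of about 0.80 s.
rr_example = [0.80, 0.82, 0.79, 0.81, 1.10, 0.80, 0.83, 0.78, 0.81]
for level in ("low", "medium"):
    print(level, flag_suspect_intervals(rr_example, level))
```

In this toy series the 1.10 s interval is flagged at the medium level but not at the low level, which illustrates how the chosen threshold alone can decide whether a possibly genuine beat is treated as an artifact.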

Normality of HRV Parameters
In many studies we observe a non-normal distribution of HRV parameters. In this case it is necessary to transform the data prior to analysis. A common procedure is to log-transform the data to adjust for unequal variance, and many studies report, for example, the natural logarithm of RMSSD (Stanley et al., 2013b) or the natural logarithm of the power values in ms² (e.g., Prinsloo et al., 2011).
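As a minimal sketch of this transformation, the Python snippet below computes RMSSD (the root mean square of successive RR-interval differences) and its natural logarithm. The function names and the example values are hypothetical, and the RR intervals are assumed to be artifact-free and expressed in milliseconds.

```python
import math


def rmssd_ms(rr_ms):
    """Root mean square of successive differences of RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("at least two RR intervals are required")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def ln_rmssd(rr_ms):
    """Natural logarithm of RMSSD, a common transformation before
    parametric analysis."""
    return math.log(rmssd_ms(rr_ms))


# Hypothetical RR intervals in milliseconds.
rr_example = [800, 810, 790, 805]
print(rmssd_ms(rr_example), ln_rmssd(rr_example))
```

Whether the transformed values are sufficiently close to normal should still be checked for each data set, since the log transformation is a common choice rather than a guarantee.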

Frequency Analysis of HRV Parameters
Regarding the frequency domain, the researcher is faced with variables expressed in different units. In line with the Task Force (Malik, 1996), we recommend always presenting both the absolute power and the normalized units, even though normalized units can be problematic (Heathers, 2014). Normalized units represent the relative value of each power component in proportion to the total power minus the VLF component.
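The definition of normalized units given above can be written out as a short Python sketch. The function name and the power values are hypothetical; the inputs are assumed to be absolute band powers in ms² obtained from the same spectrum.

```python
def normalized_units(lf_ms2, hf_ms2, total_ms2, vlf_ms2):
    """Return (LF n.u., HF n.u.): each component as a percentage of
    total power minus the VLF component."""
    denominator = total_ms2 - vlf_ms2
    if denominator <= 0:
        raise ValueError("total power must exceed VLF power")
    return 100.0 * lf_ms2 / denominator, 100.0 * hf_ms2 / denominator


# Hypothetical absolute powers in ms^2.
lf_nu, hf_nu = normalized_units(lf_ms2=600.0, hf_ms2=400.0,
                                total_ms2=1500.0, vlf_ms2=500.0)
print(lf_nu, hf_nu)
```

Because the normalized values depend only on ratios, two recordings with very different absolute powers can yield identical normalized units, which is one reason for reporting both.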
Another choice in frequency analysis is between the two main methods: Fast Fourier Transform (FFT) or autoregressive (AR) modeling. Both techniques usually correlate highly (between r = 0.86 and r = 0.91) in the HF band (Hayano et al., 1991). FFT has been one of the most utilized techniques so far, but AR is gaining interest. With regard to the visual display of data, AR demonstrated better resolution of sharp peaks than FFT and produces a smoother, more interpretable curve (Cowan et al., 1992). Moreover, several authors observed that FFT overestimated the HF component compared with AR analysis (Badilini et al., 1988; Fagard et al., 1998; Pichon et al., 2006). Therefore we recommend focusing on AR analysis for HF band calculation. The model order chosen to perform AR should always be indicated (Malik, 1996) and should be no lower than 16 for short-term recordings (Boardman et al., 2002).
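For readers unfamiliar with AR spectral estimation, the following Python sketch shows one common way to obtain an AR spectrum: the Yule-Walker equations solved with the Levinson-Durbin recursion, here with a model order of 16. It is not the implementation of any particular HRV package. The function names, the 4 Hz resampling rate, the 0.15 to 0.40 Hz HF band limits, and the synthetic test signal are assumptions made for illustration, and the RR series is assumed to have been interpolated to even spacing beforehand.

```python
import cmath
import math
import random


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the mean-removed series."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations.
    Returns (a, err) with a[0] = 1 and err the prediction-error variance."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = sum(a[k] * r[m - k] for k in range(m))
        refl = -acc / err  # reflection coefficient
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + refl * prev[m - k]
        a[m] = refl
        err *= 1.0 - refl * refl
    return a, err


def ar_psd(x, fs_hz, order=16, df_hz=0.005):
    """One-sided AR power spectral density from 0 to fs/2."""
    a, err = levinson_durbin(autocorrelation(x, order), order)
    n_bins = int(round((fs_hz / 2) / df_hz))
    freqs = [i * df_hz for i in range(n_bins + 1)]
    psd = []
    for f in freqs:
        denom = sum(a[k] * cmath.exp(-2j * math.pi * f * k / fs_hz)
                    for k in range(order + 1))
        psd.append(2.0 * err / (fs_hz * abs(denom) ** 2))
    return freqs, psd


def band_power(freqs, psd, low_hz, high_hz):
    """Trapezoidal integration of the PSD between low_hz and high_hz."""
    total = 0.0
    for i in range(len(freqs) - 1):
        if freqs[i] >= low_hz and freqs[i + 1] <= high_hz:
            total += 0.5 * (psd[i] + psd[i + 1]) * (freqs[i + 1] - freqs[i])
    return total


# Synthetic, evenly sampled RR series (s): 0.25 Hz oscillation plus noise.
rng = random.Random(0)
fs = 4.0
rr = [0.8 + 0.05 * math.sin(2 * math.pi * 0.25 * i / fs) + rng.gauss(0, 0.01)
      for i in range(1200)]
freqs, psd = ar_psd(rr, fs, order=16)
print("HF power (s^2):", band_power(freqs, psd, 0.15, 0.40))
```

Changing the order argument changes the smoothness and peak resolution of the resulting spectrum, which is why the model order should be reported alongside the HF values.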

HRV Variables to Analyze
If researchers choose to conduct a psychophysiological research project based on one of the five theories we presented in section "HRV in Psychophysiological Research: A Focus on Vagal Tone," they will be interested in assessing vagal tone. In that case, our recommendation is to analyze one of the main variables reflecting it (i.e., RMSSD, peak-valley, or HF). Moreover, to avoid bias, we recommend that, in addition to the main analysis performed with one variable reflecting vagal tone, researchers run the same analyses with the other variables depicting vagal tone, to check whether the findings are consistent across these variables.

HRV Data Reporting
Recent guidelines regarding the reporting of HRV studies were introduced by Quintana et al. (2016), who insist on the need to consistently report key experimental elements in order to advance the field. The guidelines cover four main elements: participant selection, IBI collection, IBI analysis and cleaning, and HRV calculation. The authors provide an easy-to-follow 13-item checklist that will prove very useful for psychophysiological researchers, even though the guidelines focus on psychiatry. Given the extensive description of HRV reporting in those guidelines, we do not elaborate in detail on this topic here, and instead refer the reader to this very informative paper. However, we would like to stress a crucial point regarding the reporting of HRV data in scientific papers: currently, readers may find themselves frustrated when reading the results sections of HRV psychophysiological experiments, because the variables used to index vagal tone often differ between studies. This makes it extremely hard to compare results across studies and complicates the integration of findings into a comprehensive review or a meta-analysis, which subsequently hinders the development of the field.

Table 2 (excerpt):
HRV variables to analyze: If the research question is based on vagal tone, perform the analyses with one main variable indexing vagal tone (e.g., RMSSD, peak-valley, or HF); perform the same analyses with the other variables depicting vagal tone to check whether results are consistent.
Data reporting, HRV variables to report: In the paper, present one main variable illustrating vagal tone for comprehension purposes (e.g., RMSSD, peak-valley, HF AR, or HF FFT); then submit as Supplementary Material all raw data as well as the analyses run with the other HRV parameters, to contribute to the development of HRV metrics and guidelines.

We understand that scientific journals have space restrictions and that not every HRV variable can be displayed in every table. However, as more and more journals now allow files to be attached as supplementary online material, we recommend that researchers, whenever possible, upload their raw HRV data files as well as the statistical analyses performed with the other HRV variables, having first ensured that the ethics committee and the participants explicitly agreed to public data sharing, even in anonymous form. In addition, if novel HRV methods are used, researchers should follow recent recommendations and always present the new HRV measures together with more traditional measures of HRV (Sassi et al., 2015). This will allow many researchers to contribute to the development of HRV metrics and guidelines and subsequently to develop the field.

CONCLUSION
The aim of this paper was to provide the field of psychophysiology with practical recommendations concerning research conducted with HRV, specifically highlighting its ability to index cardiac vagal tone, which is relevant for many aspects of psychophysiology, such as self-regulation mechanisms linked to cognitive, affective, social, and health phenomena (Porges, 2007; Thayer et al., 2009; McCraty and Shaffer, 2015). These recommendations cover experiment planning, measurement standards, data analysis, and data reporting. Our endeavor here was not to establish standards surrounding HRV, but rather to offer the reader a comprehensive overview of the different issues to consider. We could not be exhaustive or address in depth all the issues raised in this paper, but we have referred the reader throughout to useful contributions that will help them make educated choices. This can be considered a timely contribution, as this paper falls two decades after the Task Force on HRV (Malik, 1996) and one decade after a major milestone driving HRV research in psychophysiology, the special issue on cardiac vagal tone edited by Chambers and Allen (2007), and complements recent overview works on HRV research (Quintana and Heathers, 2014; Shaffer et al., 2014; Sassi et al., 2015; Quintana et al., 2016). A summary of the recommendations is presented in Table 2.
Again, these recommendations should not be considered imperative guidelines, given the breadth of potential research questions to be addressed. Instead, they are intended to accompany the researcher in psychophysiology and to ease the use of HRV to investigate phenomena related to self-regulation at the cognitive, emotional, social, and health levels. Our recommendations also aim to guide researchers through the complicated interpretation that follows HRV data collection, which is often clouded by methodological concerns. We therefore endeavored to cover the relevant areas concerning HRV research within psychophysiology, with a focus on vagal tone given its theoretical relevance, in order to help make HRV one of the pillars of psychophysiological research in the 21st century.

AUTHOR CONTRIBUTIONS
SL prepared the first draft. EM and JT provided insightful comments that critically improved the manuscript quality.

FUNDING
This project was financed by the German Sport University Cologne (Grant Number: HIFF 920147).