Validity of resting heart rate derived from contact-based smartphone photoplethysmography compared with electrocardiography: a scoping review and checklist for optimal acquisition and reporting

Background: With the rise of smartphone ownership and increasing evidence to support the suitability of smartphone usage in healthcare, the light source and smartphone camera could be utilized to perform photoplethysmography (PPG) for the assessment of vital signs, such as heart rate (HR). However, until rigorous validity assessment has been conducted, PPG will have limited use in clinical settings.
Objective: We aimed to conduct a scoping review assessing the validity of resting heart rate (RHR) acquisition from PPG utilizing contact-based smartphone devices. Our four specific objectives of this scoping review were to (1) conduct a systematic search of the published literature concerning contact-based smartphone device-derived PPG, (2) map study characteristics and methodologies, (3) identify if methodological and technological advancements have been made, and (4) provide recommendations for the advancement of the investigative area.
Methods: ScienceDirect, PubMed and SPORTDiscus were searched for relevant studies between January 1st, 2007, and November 6th, 2022. Filters were applied to ensure only literature written in English was included. Reference lists of included studies were manually searched for additional eligible studies.
Results: In total, 10 articles were included. Articles varied in terms of methodology, including study characteristics, index measurement characteristics, criterion measurement characteristics, and experimental procedure. Additionally, there were variations in reporting details, including primary outcome measure and measure of validity. However, all studies reached the same conclusion, with agreement ranging from good to very strong and correlations ranging from r = .98 to 1.
Conclusions: Smartphone applications measuring RHR derived from contact-based smartphone PPG appear to agree with gold standard electrocardiography (ECG) in healthy subjects. However, agreement was established under highly controlled conditions. Future research could investigate their validity and consider effective approaches that transfer these methods from laboratory conditions into the “real-world”, in both healthy and clinical populations.


Introduction

Rationale
Photoplethysmography (PPG) can provide important clinical outcome measures and has been used for the diagnosis, monitoring, and screening of various diseases and disorders (1). "Photoplethysmography" consists of "photo," meaning light; "plethysmo," meaning volume; and "graphy," meaning recording (2). PPG was first suggested as a technique for measuring blood volume changes by Hertzman in 1937 (3,4). PPG is a measurement of light either absorbed (transmissive photoplethysmography) or reflected (reflective photoplethysmography) by human tissues (1), and is based on optical properties such as absorption, scattering and transmission (5). Transmissive PPG measures light that passes through the various human tissues and is mainly used at the distal parts of the body where those tissues are thin, for example at the fingers, toes, and earlobes. Reflective PPG measures light scattered back from irradiated skin tissue at a reduced intensity (6). While transmissive PPG exhibits more stable performance (7), since the reflective signal is more degraded, reflective PPG has the advantage of a greater number of measurement sites, such as the forehead, wrist, carotid artery, and esophagus, where transmissive PPG would be difficult (8,9).
As such, PPG data are explained by the Beer-Lambert law, which defines resultant light intensity by the extinction coefficient, concentration, and optical path length of a medium when light passes through it (10). The Beer-Lambert law can be described by:

$$I = I_0 \, e^{-\epsilon(\lambda)\,\rho\,d}$$

Whereby the transmitted light intensity ($I$) through a medium decreases exponentially relative to the irradiated light intensity ($I_0$), where $\epsilon(\lambda)$ is the specific absorptivity, characteristic of the traversed tissue and dependent on the light wavelength $\lambda$, $\rho$ is the density of the tissue, and $d$ is the light pathlength (6).
Since most of these factors are constant for a given tissue, signal quality is mainly influenced through the latter part of the equation, namely λ, ρ and d, which can be modified through wavelength selection, measurement site selection and contact pressure, resulting in a reduced ϵ. This could explain why fingers and earlobes are preferred.
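To illustrate the dependence on path length described above, the following is a minimal sketch that assumes the exponential form I = I0 · exp(−ϵ(λ) · ρ · d) implied by the definitions in the text. The function name and all numerical values are hypothetical and unitless; they are chosen only to show the direction of the effect and are not taken from any reviewed study.

```python
import math


def transmitted_intensity(i0, epsilon, rho, d):
    """Beer-Lambert attenuation, assumed form: I = I0 * exp(-epsilon * rho * d)."""
    return i0 * math.exp(-epsilon * rho * d)


# Hypothetical, unitless values: same tissue and wavelength, different path lengths.
i0 = 1.0
thin = transmitted_intensity(i0, epsilon=0.5, rho=1.0, d=1.0)   # short optical path
thick = transmitted_intensity(i0, epsilon=0.5, rho=1.0, d=3.0)  # long optical path

print(f"thin site: {thin:.3f}, thick site: {thick:.3f}")
```

Under these assumptions, a shorter optical path leaves more light to reach the detector, which is consistent with the preference for thin measurement sites.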
Various PPG devices have been utilized in clinical practice (1). However, since the release of the first iPhone in 2007, smartphones have been widely adopted globally (11) and are now considered a tool with high utility, avoiding some major pitfalls of traditional data collection techniques. The traditional approach, where an individual's health is monitored periodically, often by appointment, may not accurately represent the variations in physiological measurements that occur longitudinally (12,13). Moreover, smartphone technology and embedded cameras allow PPG acquisition without the need for additional, potentially costly, external devices (14) and could be suitable for targeting traditionally underserved populations (15), particularly those whose demographic, geographic, or economic characteristics negatively affect health care access and delivery (16,17). Telemedicine technologies are therefore becoming more widely adopted in practice, especially since the COVID-19 pandemic, which highlighted the need for vital signs to be evaluated using telemonitoring (14,18,19). As a result, smartphone-based telemedicine appears to be here to stay and could address the United Nations Sustainable Development Goals (UN SDGs) (20), in particular UN SDG 3 (21).
Smartphone PPG has previously been utilized to estimate resting heart rate (RHR) through the measurement of distal pulse rate (PR) at rest, during exercise, and whilst completing mental tasks (1). However, at the time of writing, there is no consensus on what metric should be used to establish the validity of smartphone-based PPG or under what conditions. Another issue is that converting the PPG signal requires a mathematical algorithm, which affects not only smartphone performance but also validity and reliability. This is problematic given the proliferation of telemedicine, and it is therefore essential that mobile health (mHealth) technologies are shown to be reliable and valid compared to gold standard measurements before universal adoption (22). In this context, De Ridder et al. (23) conducted a meta-analysis of articles published between 1st January 2009 and 7th December 2016 investigating the use of smartphones to measure PR by performing PPG in comparison with a range of methods, including ECG, pulse oximetry and radial pulse. Although these methods suffer various pitfalls, comparisons with multiple validation methods could strengthen smartphone device and application validity. Results revealed good agreement between smartphone-derived RHR (HR-PPG) and validated method-derived RHR. These authors therefore concluded that RHR obtained from a smartphone PPG signal could be used as an alternative to traditional methods, such as ECG, in an adult population, in the right context. However, De Ridder et al. (23) highlighted several limitations to the included studies. Firstly, there was high statistical heterogeneity between studies, ostensibly due to participant characteristics, measurement conditions, and the smartphone devices utilized (23). Secondly, the latest iOS device reviewed was the iPhone 5 (released 2012) and the latest Android device was the Samsung Galaxy S4 (released 2013). Emerging evidence suggests advancements in technology, such as the availability of various camera positions (i.e., front-facing vs. rear-facing) and the advent of multiple lenses, could result in improvements in PPG acquisition (14).
These technological enhancements are promising for the telemedicine sphere as HR-PPG could be considered a population-level biomarker, utilized for screening, surveillance, and to monitor responses to policy interventions in epidemiology and public health. Population-level biomarkers are easy to measure in the real-world, low-cost and scalable (24). RHR has considerable population-level applicability and can predict adverse outcomes and the development of disease. As smartphone ownership is increasing [80% of over 65-year-olds own a smartphone in the UK (25, 26)], and smartphone HR-PPG removes the barrier to scalability of "wearable" ownership, valid contact-based HR-PPG from a smartphone device has significant scope for public health surveillance. However, before that goal is reached, it is imperative to consider the existing literature in terms of HR-PPG validity.
Two approaches to measuring PR via PPG are known: contact and non-contact. With contact PPG, PR is measured by placing a finger on the phone's rear camera, while in non-contact imaging photoplethysmography (iPPG) the signal is extracted from the face, without the need for direct skin contact. iPPG has some advantages over contact-based PPG, such as detecting PR in crowds and at long distance (27,28). However, in general, contact PPG exhibits better accuracy than non-contact PPG (29). Given this, and our group's interest in this methodology, we were interested in the validity of RHR acquisition from PPG utilizing contact-based smartphone devices.

Objectives
As a result of the importance of using validated PPG for telemedicine, and the rapidly improving technology, we aimed to conduct a scoping review assessing the validity of RHR acquisition from PPG (referred to as HR-PPG) utilizing contact-based smartphone devices against gold standard ECG (referred to as HR-ECG). Our four specific objectives of this scoping review were to (1) conduct a systematic search of the published literature concerning contact-based smartphone device-derived PPG, (2) map study characteristics and methodologies, (3) identify if methodological and technological advancements have been made, and (4) provide recommendations for the advancement of the investigative area.

Protocol and registration
The review was not preregistered, as scoping reviews typically are not. This review was conducted and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines (30).

Eligibility criteria
Studies were included if the measurement of HR-PPG was conducted via the front- or rear-facing camera of a smartphone by contact-based PPG. Only studies that compared HR-PPG with the gold standard measurement [electrocardiography (ECG)] were included. Studies were excluded if the index measurement was conducted with a device connected to a smartphone, such as a mobile sensor, medical device or wearable device; the paper did not include validity assessment of HR-PPG and HR-ECG as an outcome measurement; the study used a clinical population (we assumed a healthy population unless stated otherwise); the paper was not an original article (i.e., utilized a database from a secondary source); the paper was a review; or there was no abstract or full text available.

Literature search
We conducted a systematic literature search of ScienceDirect, PubMed and SPORTDiscus from January 1st, 2007, to November 6th, 2022, with the following search key: ((((("validity") AND ("mobile")) AND ("photoplethysmography")) OR ("PPG")) AND ("heart rate")) NOT ("wearable") AND [2007:2022(pdat)], which was developed through examination of previously published original and review articles. Filters were applied to ensure only literature written in English was included. Reference lists of included studies were manually searched for additional eligible studies.

Study selection
Studies were identified by the first author and evaluated by JDM and LDH independently and compared in an unblinded and standardized manner. Once database searches were complete, all studies were downloaded to a single reference list [utilizing Zotero software (version 6.0.26)] and duplicates were removed. First, titles and abstracts were screened for eligibility (JDM). Full-text articles were then read and coded in relation to exclusion criteria, utilizing "tags" in Zotero, and this coding was reviewed by the second author (LDH). This process involved a thorough assessment of all eligibility criteria, with authors JDM and LDH confirming inclusion and exclusion. Disagreements were addressed by a third reviewer (NFS).

Data extraction
Data extracted from each study included author(s), sample size, participant sex, country of study, age, skin pigmentation, if participants were considered healthy, smartphone model, name of application utilized, whether the application was commercially available, index measurement sampling rate, camera position and resolution, flash (torch) settings, channel used for computations, ECG device utilized, electrode placement, ECG processing information, instructions given to participants, dietary control, participant posture, region of interest, breathing pattern, environmental conditions, stabilization period, duration of measurement, number of attempts or trials, primary outcome measures and measures of validity.

Outcome measures
Our primary interests were measurements of validity and mean differences between heart rate via gold standard ECG measurement (HR-ECG), and pulse rate measured by contact-based smartphone PPG (HR-PPG). Additionally, issues that arose regarding the reporting and conducting of HR-PPG validity assessment were compiled into a checklist (Table 1).

Study selection
Following initial database searches, 1,401 articles were identified. After duplicates (n = 36) were removed, 1,365 titles and abstracts were screened for inclusion, resulting in 251 full-text articles being assessed. Of these, 247 were excluded and four remained. A further six articles were identified manually by consulting the reference lists of these four articles; therefore, a total of 10 articles were included in the analysis (Figure 1).

Study characteristics
Of the ten studies included in the review, all (100%) reported the country of study, all of which were upper-middle- to high-income countries.

Principal findings
This scoping review provided an overview of existing literature regarding the acquisition and validity of HR-PPG in healthy subjects at rest utilizing smartphone devices, with the aim of facilitating improvements in future research and clinical practice. In relation to our objective of assessing the validity of HR-PPG acquisition from PPG measurement utilizing contact-based smartphone devices, this review highlighted several methodological and reporting discrepancies between studies, which can lead to differing results that do not reflect the true outcome of the comparison (22). As there is currently no consensus on what metric should be used to establish the validity of smartphone-based PPG or under what conditions, the reviewed research appears to have utilized an exploratory approach. However, with the rapid development in technology and an improved understanding of this research area, we have highlighted key considerations for reporting contact-based PPG RHR acquisition with smartphones (Table 1).

Target population considerations
With regard to the general study information reported (Table 2), results revealed only one study (10%) (37) met the guidelines for validating heart rate devices (albeit wearables) suggested by Mühlen et al. (40). Overall reporting was poor, with small and unjustified sample sizes, and few studies adequately reported sex, skin color, or age of participants. An expert consensus suggested that studies validating HR-PPG should determine sample size based on an expected mean absolute difference, expected SD of differences and a pre-defined clinical maximum difference needed to obtain a power of 80% or 90% to assess agreement with sufficient precision (41). If no a priori level of "in agreement" is specified, a sample size of 45 is recommended (42). However, all but one study had a sample size of n < 45, and therefore results could be underpowered (43). We suggest sample sizes should be carefully calculated during study design utilising current guidelines (43) and these calculations should be presented in the methods section.
Comorbidities were poorly reported, particularly those that might affect pulse rate or amplitude, such as arterial stiffening or conditions affecting cardiac electrophysiology (33, 44). At a minimum, studies should report either that participants were free from such health conditions, or clearly state their health conditions, if their aim is to validate HR-PPG in a particular population.
Additionally, there was inadequate reporting of participant skin color within this review. Where it was reported, the Felix von Luschan chromatic scale (VLCS) (range 1-36) was utilized, which is a validated method of skin color evaluation (45). Skin color is an important consideration during PPG acquisition as skin tone may affect the accuracy of measurements (40). However, a recent systematic review of wrist-worn devices, which utilize reflective PPG, stated that the evidence is inconclusive, possibly due to small sample sizes and the requirement for a more objective way of identifying participants' skin tone (46). Nevertheless, the authors suggested HR-PPG detection may be less accurate in darker skin tones (46). Since the papers in this review failed to adequately report skin tone, this cannot be corroborated with regard to camera-based methods. Although Yan et al. (36) measured skin tone (Table 2), participants fell within the mid-range of the skin tone spectrum (range 19-25), where 1 represents light skin and 36 represents dark skin (45). Consequently, it is not clear if participants' skin tone influenced the results of the studies in the present review. Therefore, human factors such as skin color should be recorded (47) and an appropriate light wavelength should be selected (48). Moreover, it is evident that more research is required investigating the effect of darker skin tones on signal quality.

Index measurement considerations
Interestingly, the majority of studies reported the use of a single smartphone device (11, 32-37, 39). This of course maximizes the internal validity of each study, but does somewhat hamper ecological validity and generalizability, given the vast range of smartphones available at the time of writing. Additionally, since a major advantage of mHealth technologies is their reach, it is advisable to assess the index measurement's validity cross-platform, with a minimum of one phone from each platform. Moreover, the most recent article was Nemcova et al. published in 2021 (28), suggesting future measurements could improve through the utilization of newer technology (14).
As highlighted in the results, smartphone models and applications were heterogeneous, and although authors reported the name of the smartphone application, no studies reported the specific programming code utilized for beat detection. This could be due to financial, security and/or privacy reasons, as some applications were commercially available. This makes direct comparisons between apps and devices difficult, as there is no guarantee that two apps used the same code. Notably, around half of the studies stated that the application utilized was developed specifically for the intended research; in these cases the algorithm could have been described or the code made available. Consequently, validation of specific algorithms within this review was not possible, although this could be feasible in future if algorithms and build versions were made explicit (49). Moreover, it is difficult to extrapolate these data to the real world without testing the efficacy of those applications outside the stringent conditions of a laboratory. Identification of certain smartphones or applications which produce better PPG signals could lead to improvements in HR measures (23). However, as there is currently no consensus on what metric should be used to establish the validity of smartphone-based PPG or under what conditions, protocols vary dramatically and identifying optimal device(s) and application(s) is difficult. Therefore, we present a checklist (Table 1) to facilitate superior acquisition of HR-PPG via smartphone devices.
Although there has been a considerable increase in the number of mobile apps, many have been designed without regulation regarding development, risk mitigation, and quality control. Therefore, we advise future developers to adhere to the guidelines proposed by Llorens-Vernet and Miro (50), which consist of 36 important criteria and outline standards for mobile health-related applications. These criteria are grouped into eight categories including usability, privacy, security, appropriateness and suitability, transparency and content, safety, technical support and updates, and technology.
Most studies reported which camera recorded smartphone PPG measurements (32-39); all of these utilized the rear-facing camera with the torch (flash) turned on. However, recent research investigating rear- vs. front-facing PPG smartphone measurement revealed the front-facing camera to be more advantageous when considering greater control over the emitted light and finger detection. It is possible that previous research has not utilized this method as smartphone devices with front-facing camera capabilities are a newer technology that is still under development (14). However, regardless of the camera selected, it is advisable to state this, as camera selection clearly influences PPG signal quality.
Over half of the studies reported camera resolution (32-35, 37, 39). However, it was not clear if the reported resolution referred to the smartphone camera's hardware settings or if the resolution was selected through the application's capture settings. Raposo et al. (14) suggest resolution should be set to its minimum value to reduce computational load. Moreover, interpolation techniques can be used to improve fiducial point detection through improvements in temporal resolution (51). This could influence device selection, as future research could utilize devices with theoretically suboptimal resolution. For example, where a device would impose a high computational load at its default capture resolution yet offers other PPG performance advantages, the resolution could be manually reduced within the app's capture settings, potentially improving PPG signal quality and reducing computational load. For this reason, it is important to report what the resolution is and how it was acquired, since newer devices often provide multiple rear-facing lenses, some of which have "slow-motion" technology, providing potentially enhanced sampling rate capabilities.
Smartphone sampling rate was reported in most studies (11, 31-35, 37-39). Sampling rate can be as high as 1,000 Hz for medical equipment (52); however, for most smartphone cameras, it is typically less than 30 Hz (53). As outlined in our results, sampling rate was generally 20-30 Hz. For context, the latest smartphone model in the reviewed studies was the iPhone 6s (released 2015), which has a sampling rate of 30, 60 or 240 Hz, depending on the resolution settings during recording. Inappropriate sampling frequency selection could result in inaccurate waveform analysis (54) and HR-PPG determination. Beres and Hejjel (51) investigated the minimum sampling frequency requirements for HR-PPG parameters in healthy individuals and concluded that a minimum of 5 Hz is sufficient, without interpolation, for pulse rate determination. However, although lower sampling frequencies minimize the computational load and therefore the power consumption, extending battery life (51), they can also deteriorate the accuracy of fiducial point detection in HR-PPG and/or HR-ECG, decreasing signal accuracy. Moreover, applications intending to measure other parameters, for example those related to HRV, would require higher sampling rates with possible interpolation (51). As sampling rate is largely determined by smartphone make/model, we advise future research to utilise devices with higher sampling rate capabilities and/or implement interpolation techniques. When designing an application, it is important to consider the parameter being measured (higher sampling rates are required for HRV than for HR analysis) and the target demographic, as applications that are compatible with both newer and older smartphone models could provide broader scope, especially for those in low- and middle-income countries (LMIC) who may not have access to adequate healthcare.
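The trade-off between sampling rate and fiducial point timing can be illustrated with a minimal sketch. It assumes a synthetic, noise-free 70 bpm sinusoid sampled at 30 Hz, naive local-maximum peak detection, and three-point parabolic interpolation as one simple interpolation technique; none of these choices is taken from the reviewed studies or from the cited work, and all function names are hypothetical.

```python
import math


def detect_peaks(signal):
    """Indices of simple local maxima in a sampled signal."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] >= signal[i + 1]]


def refine_peak(signal, i):
    """Sub-sample peak position from a parabola through three neighbouring samples."""
    y0, y1, y2 = signal[i - 1], signal[i], signal[i + 1]
    denom = y0 - 2 * y1 + y2
    return i if denom == 0 else i + 0.5 * (y0 - y2) / denom


def beat_intervals(signal, fs, interpolate):
    """Inter-beat intervals in seconds, with or without sub-sample refinement."""
    peaks = detect_peaks(signal)
    times = [(refine_peak(signal, i) if interpolate else i) / fs for i in peaks]
    return [b - a for a, b in zip(times, times[1:])]


# Synthetic, noise-free 70 bpm pulse sampled at 30 Hz for 10 s (illustrative only).
FS, TRUE_BPM = 30, 70
true_interval = 60 / TRUE_BPM
signal = [math.sin(2 * math.pi * (TRUE_BPM / 60) * n / FS) for n in range(10 * FS)]

for interpolate in (False, True):
    ibis = beat_intervals(signal, FS, interpolate)
    mean_hr = 60 / (sum(ibis) / len(ibis))
    worst_ms = 1000 * max(abs(ibi - true_interval) for ibi in ibis)
    print(f"interpolate={interpolate}: mean HR {mean_hr:.2f} bpm, "
          f"worst single-beat timing error {worst_ms:.2f} ms")
```

In this idealised case the mean HR is close to the true value either way, whereas individual beat intervals are limited by the 33 ms sample spacing unless interpolation is applied. This is in keeping with the point above that beat-to-beat parameters such as HRV are more demanding than mean HR. Real PPG signals are noisy and would need filtering and a more robust beat detector.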
As various wavelengths interact differently with blood and tissues (55), careful consideration must be given to wavelength selection (56) (i.e., red, green or blue colour channels). Emerging research suggests green wavelengths demonstrated stronger cardiac pulse signals in comparison with red or blue bands during remote PPG imaging (37). However, this was demonstrated in wrist-worn devices and more research is required in smartphone-derived PPG. Finally, improvements in pulse signal could be attained through optimization of the pixel averaging region (32), whereby the video area closest to the light source is analysed, increasing the overall gain of the signal and therefore improving signal quality (14).
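Colour channel selection and the pixel averaging region can be sketched as follows. This is a minimal illustration that assumes each video frame is available as a nested list of (R, G, B) tuples; the function names, the tiny 2 × 2 frames and the region coordinates are all hypothetical, and a real application would read frames from the camera API and choose the region relative to the flash position.

```python
def roi_channel_mean(frame, channel, top, left, height, width):
    """Mean of one colour channel (0 = R, 1 = G, 2 = B) over a rectangular region."""
    values = [frame[y][x][channel]
              for y in range(top, top + height)
              for x in range(left, left + width)]
    return sum(values) / len(values)


def ppg_series(frames, channel, roi):
    """One averaged intensity value per frame, forming the raw PPG time series."""
    return [roi_channel_mean(frame, channel, *roi) for frame in frames]


# Two hypothetical 2 x 2 frames; the top row is assumed to be closest to the light source.
frame_a = [[(200, 40, 10), (210, 42, 12)],
           [(190, 38, 9), (60, 20, 5)]]
frame_b = [[(196, 39, 10), (206, 41, 11)],
           [(188, 37, 9), (61, 20, 5)]]

roi = (0, 0, 1, 2)  # top, left, height, width: the top row only
red_series = ppg_series([frame_a, frame_b], channel=0, roi=roi)
print(red_series)
```

Averaging over a region rather than a single pixel reduces pixel-level noise, and restricting the region to the best-illuminated area excludes dim pixels (such as the bottom-right pixel here) that would otherwise dilute the pulsatile signal.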

Experimental procedure considerations
Firstly, when describing the experimental procedure, studies described the technical computer science methods well. However, their relationship to physiology (i.e., what variable they are measuring and the relationship between the signal capture and the underpinning physiology) was not described in as much detail. Nearly all studies provided participant instructions (11, 31-38); however, some study designs were hard to follow and not enough detail was provided to allow accurate replication. Studies that provided sufficient detail utilized schematic diagrams and detailed subsections within the methods covering index and criterion measurements, experimental procedure, and participant instructions.
Over half the studies reported participant postures, with the seated posture being the most frequently utilized measurement position. Postural changes can result in deviations in cardiovascular measurements, such as HR (57,58). Therefore, participant measurement posture should be reported when describing the experimental procedure. In addition to measurement posture, measurement site is also an important consideration. Hartmann et al. (59) investigated the effect of measurement site on HR-PPG waveform characteristics utilizing a reflective PPG sensor with a peak wavelength of 880 nm, comparable with reflective wavelengths utilized in smartphone devices that utilize an infrared light wavelength (880-940 nm) (6). The authors determined that under normal and deep breathing conditions the finger produced the most analyzable waveforms (95% and 86% analyzable, respectively) in terms of mean amplitude, pulse peak time (Tp), dicrotic notch time (Tn), and the reflection index (RI) (all p < 0.001), which could be due to higher sensitivity to volumetric fluctuations in the cutaneous vascular walls of the finger compared with other measurement sites (59).
The application of pressure at the measurement site should also be considered, as this is the fundamental principle of blood pressure measurement (i.e., an increase in pressure eventually results in occlusion). Variations in contact pressure can result in changes in several waveform characteristics (60). Increased contact pressure decreases the optical path length through the tissue, increasing AC amplitude. AC amplitude reaches its maximum when transmural pressure, defined as the difference between intra-arterial pressure on the vessel wall and contact pressure, reaches zero (61,62). Additional pressure beyond this begins to occlude the vessel, reducing amplitude until no signal is visible. Conversely, contact pressure applied too softly increases the optical path length through the tissue, decreasing AC amplitude. Considering this, applying enough pressure to create conditions where transmural pressure is zero could be beneficial for RHR determination, as this could make peaks more easily identifiable. While this paragraph briefly outlines the underlying physiology and the AC amplitude changes resulting from varying contact pressures, from a technical standpoint, Apple stopped incorporating the strain gauge array under the screen (3D Touch) from ~2017 onwards. Therefore, no force measures can be obtained directly from the device. For this reason, our in-house pilot testing has suggested that providing the app user with the real-time PPG signal (i.e., visual feedback) can enhance the quality of the PPG signal. This approach has been previously used by Nemcova et al. (38), who reported that they provided app feedback (visual peaks presented on the smartphone display) to enhance signal quality during measurement conditions. These authors stated that quality was evaluated visually by the users; quasi-periodic peaks/spikes must be seen in the signals. A flat signal, or a signal with many peaks/spikes but without quasi-periodicity, represents a low-quality signal. The user should iteratively change the position of the smartphone according to the feedback of the application. Therefore, applying a contact pressure that yields a signal displaying key pulse wave fiducial points and many quasi-periodic peaks would be considered ideal.
Previous research stated that environmental conditions such as ambient light or motion can influence HR measurement (49,63). In addition, careful consideration of the environmental temperature has the benefit of reducing possible HR-ECG and HR-PPG noise due to shivering (64). Of course, these environmental conditions ultimately influence participant temperature, and the temperature of the measurement site (i.e., skin temperature). However, no study included in this review reported skin temperature. From a technical standpoint, the device temperature sensors are only designed for management of the CPU and battery, so measurement of environmental or skin temperature is beyond the scope of those sensors. Thus, skin temperature reporting would require an additional device such as a skin thermometer. From our in-house pilot testing, we have observed that having cold hands can reduce the quality of the PPG signal (by "quality" we mean a signal that displays key pulse wave fiducial points and many quasi-periodic peaks). This in-house pilot testing is supported by previous work suggesting that both increased and decreased skin temperature can alter PPG amplitude and total signal, PPG waveform amplitude, and PPT (60, 65-68). Research suggests ambient light may also affect light-sensitive diodes; however, the size of the effect is currently unknown (40). Allen (47) suggests correct positioning of the device and the use of light modulation filtering can reduce ambient light interference.
We identified that HR-ECG and HR-PPG were generally recorded simultaneously for short durations (<3 min), which is acceptable. Nemcova et al. (39) suggest ultrashort- (<5 min) and short-term (~5 min) measurements have several advantages over longer term measurements, including minimal risk of data loss during measurement, subject comfort (including flash/torch burn risk) and reduced computational demands that influence battery capacity and memory. Definitions of short- and ultrashort-term vary depending on the intended research; a 10 s duration is commonly cited within the literature as the most appropriate for HR-PPG acquisition. However, although all studies in the current review reported measurement duration (Table 6), no study compared the effect of increased or reduced measurement duration on signal quality. Therefore, the impact of measurement duration in the present review is unclear.
The time taken for a pulse wave to travel along a fixed arterial length is considered the pulse transit time (PTT). The moment that pulse arrives, known as the pulse arrival time (PAT), is represented by a peak in the HR-PPG signal; however, because of the PTT, there is a misalignment, or "time lag", between the R wave of the HR-ECG signal and the HR-PPG peak (69). A recent review of open-source beat detection algorithms describes a method of time alignment in which HR-ECG- and HR-PPG-derived beats less than 150 ms apart were considered correctly identified. The time lag between beats was manipulated by offsetting the beats in increments of 20 ms. The time lag that resulted in the most correctly identified beats (the most HR-ECG and HR-PPG beats less than 150 ms apart) was considered the "true lag" (70). Time alignment allows for direct beat comparison and ensures that not only the same time frames but also the same beats are being analysed, improving validity assessment. However, only Bolkhovsky et al. (31) explicitly stated that HR-ECG and HR-PPG were aligned during post-recording data analysis.
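The lag search described above can be illustrated with a minimal Python sketch. It is not the implementation used in the cited review (70): the function names, the 0 to 1,000 ms search range, the nearest-beat matching rule, and the tie-break (preferring the lag with the smallest total timing error when several lags match the same number of beats) are all assumptions made for illustration. Only the 150 ms tolerance and the 20 ms step come from the description above.

```python
from bisect import bisect_left


def match_beats(ecg_beats_ms, ppg_beats_ms, lag_ms, tolerance_ms=150):
    """Count ECG beats with a lag-corrected PPG beat less than tolerance_ms away.

    Returns (number of matched beats, summed absolute timing error in ms).
    Simplification (assumption): each ECG beat is compared with its nearest
    PPG beat; one-to-one pairing is not enforced.
    """
    shifted = sorted(t - lag_ms for t in ppg_beats_ms)
    matched = 0
    total_error = 0
    for r_time in ecg_beats_ms:
        i = bisect_left(shifted, r_time)
        # The nearest shifted PPG beat sits at index i - 1 or i.
        candidates = shifted[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        error = min(abs(c - r_time) for c in candidates)
        if error < tolerance_ms:
            matched += 1
            total_error += error
    return matched, total_error


def estimate_lag(ecg_beats_ms, ppg_beats_ms, max_lag_ms=1000, step_ms=20,
                 tolerance_ms=150):
    """Try lags from 0 to max_lag_ms in step_ms increments; keep the best one.

    "Best" means most matched beats; ties are broken by the smallest summed
    timing error (the tie-break is an assumption, not part of the source).
    """
    best_key, best_lag = None, 0
    for lag in range(0, max_lag_ms + 1, step_ms):
        matched, total_error = match_beats(
            ecg_beats_ms, ppg_beats_ms, lag, tolerance_ms)
        key = (matched, -total_error)
        if best_key is None or key > best_key:
            best_key, best_lag = key, lag
    return best_lag, best_key[0]


# Hypothetical beat times in ms (not data from any included study):
# PPG peaks trail the ECG R waves by a constant 260 ms.
ecg = [0, 800, 1650, 2450, 3300]
ppg = [t + 260 for t in ecg]
print(estimate_lag(ecg, ppg))  # (260, 5)
```

Without a tie-break, every lag within 150 ms of the true value would match all beats in this clean example, so the tolerance alone does not identify a unique lag; how the cited review resolves such ties is not stated here.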
Finally, the number of attempts allowed per participant was inadequately reported (Table 6). Holmes et al. (71) suggest the number of attempts should be limited to three, as additional measurements would counteract the advantages of ultrashort-term measures outlined above. We argue that there is a compromise to be made between end-user burden/acceptability and reliability/precision. Whilst it is likely that more trials per participant will increase the chances of acquiring a good signal and therefore improve validity, the more trials a user completes, the greater the data entry burden (72), which could reduce usability and adherence.

Primary outcome and statistical measures of validity considerations
Results of this scoping review highlight the agreement between HR-PPG and HR-ECG (Table 8). However, confidence intervals (CIs), limits of agreement (LoA), or bias [from which LoA can be derived (LoA = bias ± 1.96 SD)] were not reported in all studies, even though Mühlen et al. (40) state that 95% CIs and LoA should be provided for between-device comparisons. Interestingly, although a large number of studies reported correlation coefficients, no paper defined the guidelines used to determine the strength of coefficients (73, 74). We conducted a post hoc interpretation, and six articles (11, 31, 34, 36–38) exceeded the minimum requirements for "high" or "strong" correlation using previously reported guidelines (73, 74). However, it was unclear whether these studies examined mean HR agreement rather than time-aligned, beat-to-beat agreement.
Nam et al. (35) stated that PPG measured from the green wavelength (HR-Green) demonstrated "good agreement" (Table 8) in comparison with HR-ECG; however, neither the coefficient itself nor the criteria for this qualitative assessment were provided. In summary, statistical interpretation could be improved in future research by utilizing the Bland-Altman method (75) to test agreement between HR-ECG and HR-PPG, rather than the relationship between the two (agreement and relationship are different concepts). We also propose greater transparency in statistical reporting, including precise coefficients and a priori thresholds for interpretation (e.g., "poor", "good", "very good").
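To illustrate why agreement and relationship are different concepts, the following minimal Python sketch computes the bias and 95% LoA (LoA = bias ± 1.96 SD of the paired differences) alongside a Pearson correlation coefficient. The function names and the heart rate values are hypothetical and chosen purely for illustration; they are not data from any included study.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(index_values, criterion_values):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    bias is the mean of (index - criterion); LoA = bias +/- 1.96 * SD,
    where SD is the sample standard deviation of the paired differences.
    """
    if len(index_values) != len(criterion_values):
        raise ValueError("Paired measurements must have equal length")
    differences = [i - c for i, c in zip(index_values, criterion_values)]
    bias = mean(differences)
    sd = stdev(differences)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical values in beats per minute: the index device (HR-PPG) tracks
# the criterion (HR-ECG) closely but reads roughly 5 to 6 bpm high.
hr_ecg = [60, 62, 65, 70, 72]
hr_ppg = [66, 67, 71, 75, 78]

bias, lower, upper = bland_altman(hr_ppg, hr_ecg)
print(f"r = {pearson_r(hr_ppg, hr_ecg):.3f}")
print(f"bias = {bias:.2f} bpm, LoA = {lower:.2f} to {upper:.2f} bpm")
```

In this constructed example the correlation is very high even though the index device systematically overestimates HR, which is the kind of discrepancy a correlation coefficient alone would not reveal.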

TABLE 1
Items to consider when reporting validity protocols for the acquisition of RHR via contact-based PPG, using smartphone devices.

TABLE 2
General study information of investigations concerning smartphone rear-facing PPG measurement and ECG for the determination of heart rate (pulse rate) and descriptive statistics of participants.

TABLE 4
Index device hardware technical specifications for included studies.

TABLE 5
Methodology of included studies.
*Represents significant difference (p < 0.05) between supine and tilt position with paired samples t-test.

TABLE 8
Results for heart rate: correlations, measures of validity and summary of results.