Reproducibility of Left Ventricular Dyssynchrony Indices by Three-Dimensional Speckle-Tracking Echocardiography: The Impact of Sub-optimal Image Quality

Background: 3D speckle-tracking echocardiography (3D-STE) is a novel method to quantify left ventricular (LV) mechanical dyssynchrony. 3D-STE is influenced by image quality, but studies on the magnitude of its effect on 3D-STE derived LV systolic dyssynchrony indices (SDIs) and their test-retest reproducibility are limited. Methods: 3D-STE was performed in two groups, each comprising 18 healthy volunteers with good echocardiographic windows. In study 1, optimal and inferior-quality images, the latter obtained by intentionally poor echocardiographic technique, were acquired. In study 2, sub-optimal quality images were acquired by impairing ultrasound propagation using neoprene rubber sheets (thickness 2, 3, and 4 mm) mimicking mildly, moderately, and severely impaired images, respectively. Measures (normalized to cardiac cycle duration) were volume- and strain-based SDIs, defined as the standard deviation of time to minimum segmental values, and volume- and strain-derived dispersion indices. For both studies test-retest reproducibility was assessed. Results: Test-retest reproducibility was better for most indices when restricting the analysis to good quality images; nevertheless, only volume-, circumferential strain-, and principal tangential strain-derived LV dyssynchrony indices achieved fair to good reliability. There was no evidence of systematic bias due to sub-optimal image quality. Volume-, circumferential strain-, and principal tangential strain-derived SDIs correlated closely. Radial strain- and longitudinal strain-SDI correlated moderately and weakly with volume-SDI, respectively. Conclusions: Sub-optimal image quality compromised the reliability of 3D-STE derived dyssynchrony indices but did not introduce systematic bias in healthy individuals. Even with optimal quality images, only 3D-STE indices based on volume, circumferential strain, and principal tangential strain showed acceptable test-retest reliability.


INTRODUCTION
Synchronous contraction is important for overall left ventricular (LV) systolic performance (1). Widening of the QRS duration on ECG is widely used as a marker of intra-ventricular dyssynchrony. However, LV mechanical dyssynchrony (LVMD) (i.e., dyssynchronous contraction and relaxation of the myocardium) may occur in the absence of ECG evidence (2). LVMD assessed using different imaging modalities has been shown to be an independent predictor of poor prognosis in cardiac disease (3-9). Hence, it may have advantages in guiding cardiac resynchronization therapy (CRT), although current guidelines are still based on electrical dyssynchrony criteria (10).
LVMD can be measured by several echocardiographic imaging modalities such as tissue-Doppler imaging (TDI), 2D speckle-tracking echocardiography (STE), 3D echocardiography (3DE), and 3D-STE (11). TDI-based dyssynchrony indices do not improve patient selection for CRT (12), but 2D-STE based dyssynchrony indices have demonstrated added value in selecting potential candidates for CRT even when QRS duration is borderline (13,14). Despite promising results, 2D-STE is limited by the requirement for non-simultaneous measurements, which introduces beat-to-beat variability (15). Furthermore, measurements are restricted to a single plane, and complex LV dyssynchrony patterns may be overlooked by a 2D method (13,15).
3D-STE has also emerged as a promising method for quantifying LVMD, with 3DE systolic dyssynchrony index (SDI-volume) having been proposed as a useful measure to assess LVMD and guide CRT (16). In addition to SDI-volume, parameters calculated from different myocardial strain vectors [i.e., longitudinal strain (LS), radial strain (RS), circumferential strain (CS), and recently area strain (AS)] have been suggested as potentially useful measures of myocardial mechanics in LVMD (15).
For a test to be useful it must be reproducible, i.e., show acceptable variability between measures (good reliability) and demonstrate no systematic differences between measurement occasions (no bias) (17). Good reproducibility of 3DE dyssynchrony indices has been reported in a meta-analysis of several studies (16), but one study reported that poor image quality impaired reliability of 3D dyssynchrony by ∼12-21% (18). However, these studies (15,16,19-22), including the study of image quality (18), were based on re-reading the same scans. Re-reading scans may substantially underestimate scan-rescan variability and has limited ability to detect bias; test-retest reproducibility is usually more relevant to chronic studies (17). The effect of image quality on test-retest reproducibility of 3D-STE derived dyssynchrony indices has not been reported and evidence on systematic bias related to image quality is limited. It is difficult to assess this bias in observational studies examining correlations between reliability and scan quality since these associations are confounded by factors such as age, adiposity, or other (potentially unmeasured) risk factors that may jointly influence scan quality and dyssynchrony measures (23). Experimental modification of image quality avoids this confounding but evidence on the controlled effect of changing image quality on 3D-STE derived dyssynchrony indices is lacking.
We aimed therefore to (1) quantitate the impact of intentionally distorted image quality on reliability and bias of LV dyssynchrony indices by 3D-STE; (2) assess the association between SDI-volume and strain-based LV SDIs. By design, the study was conducted in healthy individuals with good echocardiographic windows since this allowed us to achieve a realistic degree of intentional image impairment to compare with good quality reference images. Some of these data have been presented previously in abstract form (24,25).

Study Population
Two prospective sub-studies (study 1 and study 2; conducted at different times) were performed. In study 1, 23 healthy individuals with no previous cardiac medical history were recruited to undergo 3DE. Only individuals with excellent/optimal echocardiographic windows were included in these studies, so 5 individuals were excluded in study 1 due to sub-optimal echocardiographic windows, leaving a final sample size of 18. In study 2, an additional 21 healthy individuals were recruited to further quantify the impact of sub-optimal image quality. Eighteen participants were finally included after excluding 3 participants with sub-optimal echocardiographic windows. The institutional review board approved the study protocol and informed consent was obtained from all participants at the time of examination.

Image Acquisition and Analysis
3DE examination was performed using an EPIQ7 ultrasound machine (Philips Medical Systems, Andover, MA) equipped with an X5-1 Xmatrix-array transducer by an experienced, British Society of Echocardiography accredited sonographer as previously described (26). Following a standard protocol, 4 LV-focused wedge-shaped sub-volumes were acquired over 4 consecutive cardiac cycles during a single breath-hold from the apical window using harmonic imaging and multiple-beat 3DE mode (27). During the acquisition, special care was taken to include the entire LV cavity within the pyramidal sector volume.
In study 1, two gated wide-angled 3DE LV full-volume datasets were obtained per participant. The acquisition of the first dataset was performed according to EAE/ASE recommendations (27). Machine settings including gain, sector width, and depth were adjusted by the operator to maximize the quality of images, ensuring clear visualization of LV endocardial borders and avoiding echo drop-out. A good 3DE image was defined as clear visualization of the endocardium in all 16 segments in both end-diastolic and end-systolic frames. The quality of the second dataset was impaired by using an intentionally sub-optimal echocardiographic technique. This was achieved by a combination of scanning the participants while lying supine, which increased the distance between the transducer and the heart, and omitting gel to create an air-tissue interface that produced multiple reflections and acoustic shadowing artifacts. This resulted in echo drop-out, shadow artifacts, or poor visualization of the endocardium. A sub-optimal 3DE image was defined as the presence of at least one of the following (Figure 1B): (1) poor visualization of the endocardium of up to 7 segments throughout the cardiac cycle in a 16-segment LV model; (2) the presence of echo drop-out.
An identical acquisition protocol (optimal and sub-optimal images) under the same conditions was repeated on the same day (typically within 1 h of the original scan) to assess the test-retest reproducibility of optimal and sub-optimal images.
In study 2, the quality of the 3DE images was impaired by attenuating ultrasound propagation in a graded and reproducible manner to better quantitate the impact of sub-optimal image quality. To achieve this, we used three sheets of an ultrasound-attenuating material, neoprene rubber, with different thicknesses (2, 3, and 4 mm) to mimic mild, moderate, and severe impairment in image quality, respectively (Figure 1C). Neoprene is a polymer of chloroprene (2-chloro-1,3-butadiene). We chose it because many of its acoustic properties are similar to those of soft biological tissues, it is durable, and it has a comparatively high attenuation coefficient (28,29). Each sheet of neoprene rubber was placed between the skin and the transducer with ultrasound gel on both sides, after which images were captured. In total, 4 gated 3DE LV full-volume datasets including an optimal quality reference image (no neoprene) were acquired per participant (Figure 1C). We avoided images with stitching artifacts and ensured good quality ECG signals. We maintained a constant frame rate between scans with a minimum acquisition rate of 18 frames/second (30) and optimal machine settings.
FIGURE 1 | Three-dimensional echocardiographic (3DE) strain derived systolic dyssynchrony index (SDI) and impaired quality 3DE images. (A) SDI derived from principal tangential strain. (B) An example of a good and sub-optimal 3DE image quality obtained from the same participant in study 1. (C) An example of a 3DE with an optimal quality reference (no neoprene), mild (2 mm neoprene), moderate (3 mm neoprene), and severe (4 mm neoprene) impairment of 3DE image quality obtained from the same participant in study 2.
Image analysis was performed using 4D LV-Analysis© software (TomTec Imaging Systems GmbH, Germany, 2015) by a single blinded reader. In total, 4 datasets per participant for each sub-study were analyzed. To quantify read-to-read variability, a second read of each scan was performed ∼2-3 months after the initial read. At the start of each analysis, three standard apical views and one short-axis view were automatically selected and displayed by the software. The longitudinal axis of the LV in all apical views was further aligned manually if needed using the mitral valve annulus and the apex as anatomical landmarks. The LV endocardial borders in all three apical views were then automatically defined by the software at end-diastole and tracked throughout the cardiac cycle in 3D space, from which the 3D LV endocardial shell was constructed. Further manual adjustments were kept to a minimum. The software then subdivided the LV into 16 segments using a standard model (31) and provided curves as well as maps of global and segmental volumetric and deformation indices. LV dyssynchrony indices were the following (Figure 1A):
- Volume-based SDI, defined as the SD of time to minimum segmental volumes over 16 LV segments.
- Strain-based SDI, calculated as the SD of time to peak segmental strain over 16 LV segments from LS, CS, RS, and principal tangential strain (PTS), which is a composite 3D measure of CS and LS.
- Volume- and strain-derived dispersion indices (Di), calculated as the difference between minimum and maximum time to peak of segmental volume/strain values over 16 LV segments.
All indices were normalized to cardiac cycle length and reported as %.
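The index definitions above can be illustrated with a short sketch. It assumes that segmental curves are available as an array of 16 rows (segments) by n frames with known frame times. The function name `dyssynchrony_indices`, its parameters, and the use of the sample SD (`ddof=1`) are illustrative assumptions, not a description of the vendor software. The `peak` argument is also an assumption, included because the peak of a positive-going trace such as RS would be a maximum rather than a minimum.

```python
import numpy as np


def dyssynchrony_indices(curves, frame_times_ms, cycle_length_ms, peak="min"):
    """Return (SDI, Di) in % of cardiac cycle length.

    curves          : array (n_segments, n_frames) of segmental volume or strain
    frame_times_ms  : array (n_frames,) with the time of each frame in ms
    cycle_length_ms : duration of the cardiac cycle in ms
    peak            : "min" or "max" (hypothetical switch for trace polarity)
    """
    curves = np.asarray(curves, dtype=float)
    frame_times_ms = np.asarray(frame_times_ms, dtype=float)
    if curves.ndim != 2 or curves.shape[1] != frame_times_ms.size:
        raise ValueError("curves must be (n_segments, n_frames)")

    if peak == "min":
        idx = np.argmin(curves, axis=1)
    elif peak == "max":
        idx = np.argmax(curves, axis=1)
    else:
        raise ValueError("peak must be 'min' or 'max'")

    # Time at which each segment reaches its extreme value
    t_peak = frame_times_ms[idx]

    # SDI: SD of segmental times; Di: range of segmental times.
    # Both are normalized to cycle length and expressed as a percentage.
    sdi = np.std(t_peak, ddof=1) / cycle_length_ms * 100.0
    di = (t_peak.max() - t_peak.min()) / cycle_length_ms * 100.0
    return sdi, di
```

For example, if eight segments reach their minimum at 400 ms and eight at 500 ms in a 1,000 ms cycle, the dispersion index is 10% and the SDI is roughly 5.2%.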

Statistical Analysis
Continuous variables are presented as mean ± SD. Categorical variables are presented as counts and percentages. Analysis was performed using mixed linear modeling to assess bias and reliability (fixed effects: scan occasion and image quality; random effect: participant identity). Test-retest reliability was quantified by intraclass correlation coefficients (ICC). Reliability was classified as follows: ICC < 0.4 = poor, 0.4 ≤ ICC < 0.75 = fair to good, and ICC ≥ 0.75 = excellent (32). Measurement error was assessed as the standard error of estimates derived from the mixed linear modeling as advocated by Popović and Thomas (33). Bias was assessed using Bland-Altman analysis and presented as the mean difference with limits of agreement between scans of different image quality. Intra-observer reproducibility was based on re-reading the same (good quality) scans and was assessed by the first reader (LA), blinded to the original measurements, after an interval of 2-3 months. Over the same interval, inter-observer reproducibility was assessed by a second reader (CP) blinded to the first reader's measurements. Linear correlations were analyzed and summarized using Pearson's correlation coefficient, r. The sample size was chosen to ensure a lower limit of the one-sided confidence interval ≤0.15 assuming an ICC = 0.8 (34). With this sample size we could also detect a bias ≥ 1 SD (alpha = 0.05) with 96% power. All analyses were performed in Stata version 15.1 (StataCorp LLC, USA).
Data are means (95% confidence intervals), P Bon are Bonferroni adjusted P values. Abbreviations as in Table 1.
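As an illustration of the reliability classification, the sketch below estimates an ICC from repeated measurements and maps it onto the categories given above. It uses a one-way random-effects ANOVA estimator, which is an assumption made for simplicity; the study itself derived ICCs from mixed linear models in Stata, and the two approaches need not give identical values. The function names are hypothetical.

```python
import numpy as np


def icc_oneway(x):
    """One-way random-effects ICC for a (n_subjects, k_repeats) array.

    Assumption: a simple ANOVA-based estimator, not the mixed-model
    estimate used in the study.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    subject_means = x.mean(axis=1)

    # Between-subject and within-subject mean squares
    msb = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    msw = np.sum((x - subject_means[:, None]) ** 2) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)


def classify_icc(icc):
    """Apply the thresholds stated in the text (poor / fair to good / excellent)."""
    if icc < 0.4:
        return "poor"
    if icc < 0.75:
        return "fair to good"
    return "excellent"
```

Called on an array with one row per participant and one column per scan occasion, `classify_icc(icc_oneway(x))` returns the reliability category for that index.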

Test-Retest and the Impact of Image Quality on Reliability and Bias
Under optimal conditions (i.e., good quality images), only volume, CS and PTS 3D-STE derived dyssynchrony indices achieved fair to good test-retest reliability, whereas LS and RS derived dyssynchrony indices showed poor test-retest reliability (Table 2). Reduced 3DE image quality impaired the reliability of 3D-STE derived LV dyssynchrony indices, with test-retest reliability being poor for all indices when images were sub-optimal (Table 2).
There was no evidence of systematic bias due to sub-optimal image quality in any of the 3D-STE derived LV dyssynchrony indices in study 1 (Table 3). In study 2 using neoprene sheets, there was evidence of a small degree of systematic underestimation in volume derived dyssynchrony indices with increasingly poor image quality (Table 4). Bland and Altman analysis by image quality is shown in Table S1 and Figure S1. Sub-optimal image analyses showed higher mean difference (±SD) and wider limits of agreement for all 3D-STE derived LV dyssynchrony indices compared to good image analyses.
Intra-observer reproducibility based on re-reading the same (good quality) scans showed good to excellent reproducibility of all dyssynchrony indices except for LS derived dyssynchrony indices (Table 5). Reproducibility was fair to good for LS-SDI, but poor for LS-Di. Inter-observer reproducibility was excellent for volume-based dyssynchrony indices, and CS and RS derived SDIs, but only fair to good for PTS and LS derived SDIs (Table 6). Overall, strain-derived SDIs showed better inter-observer reproducibility than strain-derived dispersion indices (Table 6). Only CS-Di and RS-Di showed fair to good inter-observer reproducibility; it was poor for LS-Di and PTS-Di (Table 6).
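The mean difference and limits of agreement referred to here follow the standard Bland-Altman construction, which can be sketched as follows. The function name and the conventional multiplier of 1.96 are assumptions for illustration; the source does not state the exact settings used in Stata.

```python
import numpy as np


def bland_altman(a, b, z=1.96):
    """Return (bias, lower_loa, upper_loa) for paired measurements a and b.

    bias is the mean of the paired differences (a - b); the limits of
    agreement are bias +/- z * SD of the differences (sample SD, ddof=1).
    """
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bias = d.mean()
    sd = d.std(ddof=1)
    return bias, bias - z * sd, bias + z * sd
```

A bias close to zero with wide limits of agreement corresponds to the pattern described in the text: no systematic shift, but greater scatter between scans.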

DISCUSSION
3D-STE has theoretical advantages over other ultrasound-based methods for quantification of LVMD as it allows comprehensive evaluation of active contraction of the myocardium along different myocardial vectors simultaneously, and may therefore provide a better representation of LVMD (15). However, to be useful, 3D-STE evaluation of LVMD needs to achieve acceptable reproducibility in the context of a clinically-relevant range of image qualities. We assessed the effect of image quality on test-retest reproducibility of 3D-STE derived LV dyssynchrony indices and assessed the correlation between strain-derived indices and SDI-volume. We found that sub-optimal image quality did not introduce a systematic bias in 3D-STE derived strain-based LV dyssynchrony indices, although there was some evidence of underestimation of volume-based dyssynchrony indices with poorer image quality. However, under optimal conditions only volume, CS and PTS 3D-STE derived dyssynchrony indices achieved fair to good test-retest reliability. Under sub-optimal conditions the reliability of all 3D-STE LV dyssynchrony indices was poor.
SDI-volume by 3DE has previously been reported to be a feasible and reliable measure to assess LVMD (16,35). In a large meta-analysis, 3DE demonstrated 94% feasibility for the assessment of LVMD, and SDI-volume showed good intra- and inter-observer reproducibility based on re-reading scans [interobserver ICC = 0.92 (95% CI 0.88, 0.95) and intraobserver ICC = 0.95 (95% CI 0.93, 0.97)] (16). However, the authors highlighted the lack of estimates of test-retest reliability, which is the relevant measure for follow-up assessments of LVMD. We show that when image quality is optimal, volume-based LV dyssynchrony indices have acceptable test-retest reliability (albeit noticeably lower than re-reading reliability), but that the test-retest reliability of volume-based LV dyssynchrony indices is substantially lower for sub-optimal images. Comprehensive evaluation of LVMD from different myocardial vectors including LS, RS, CS, and more recently AS (principal tangential strain or 3D-strain) may provide a better representation of the active contraction of the myocardium than volume changes (15).
Despite this, only a limited number of studies have assessed the reliability of strain-based LV dyssynchrony indices by 3D-STE (15,19-22). Comprehensive assessment of LVMD in all myocardial directions by 3D-STE has been investigated by Thebault et al., although the authors only provided a re-read reliability assessment of AS derived dyssynchrony indices (21); the test-retest reliability of these indices has not been previously assessed. We show that LV dyssynchrony indices based on CS and PTS have fair to good test-retest reliability only when images are optimal; reliability is generally poor for sub-optimal images. Our data add to previous observations by Russo et al., who showed that poor image quality impaired the reliability of LV dyssynchrony indices assessed by 3DE even when re-reading the same images (18). We suggest that reliability estimates based on re-reading scans are likely to be over-optimistic and propose that test-retest estimates are a sounder basis for estimating sample sizes required to examine changes in dyssynchrony indices in follow-up studies.
Different components of myocardial mechanics reflect the contributions of different layers of the myocardium (36); these may be differentially affected by the extent and etiology of disease (37). It is possible therefore that LV dyssynchrony indices of different myocardial vectors may show differential associations with 3D-STE volume-based dyssynchrony indices. We found that SDI-volume correlated well with SDI derived from either PTS or CS and correlated moderately with RS-SDI, while SDI-volume correlated weakly with LS-SDI. This could be consistent with these measures reflecting different myocardial activation patterns, but equally the poor correlation between LS-SDI and SDI-volume could simply reflect the poor reliability of LS-SDI. Further work is required to establish whether differences in 3D-STE LV dyssynchrony indices have any prognostic importance.
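The possibility that a weak correlation reflects poor reliability rather than different physiology can be illustrated with the classical attenuation relationship, under which the observed correlation between two noisy measures equals the underlying correlation scaled by the square root of the product of their reliabilities. The sketch below is a minimal illustration; the function name is hypothetical and the reliability values used in the example are hypothetical inputs, not estimates from this study.

```python
import math


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Expected observed correlation given measurement reliabilities.

    Classical attenuation formula:
        r_observed = r_true * sqrt(reliability_x * reliability_y)
    Reliabilities are expected to lie in (0, 1].
    """
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return true_r * math.sqrt(reliability_x * reliability_y)


# Hypothetical example: even a perfect underlying association (r = 1.0)
# would be observed as about 0.4 if the two measures had reliabilities
# of 0.64 and 0.25.
example = attenuated_correlation(1.0, 0.64, 0.25)
```

Under this relationship, a measure with poor test-retest reliability cannot correlate strongly with any other index, regardless of the underlying physiology.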

Limitations
We acknowledge a number of limitations in this study. We only examined short-term test-retest reliability of 3D-STE derived LV dyssynchrony indices, and how they are influenced by sub-optimal image quality in healthy individuals. The use of healthy individuals makes generalization to specific cardiac pathologies difficult, but at least some of our findings are likely to be relevant to studies of diseased populations since sub-optimal image quality is more likely in unhealthy individuals. Image quality is only one factor influencing the clinical utility of 3D-STE in assessing LVMD. While our approach produced impairments of image quality that were qualitatively similar to those seen in older or more obese patients, echocardiographic images may be impaired by other factors such as emphysema or surgical scars, which may be more difficult to simulate. 3D-STE is also constrained by limited temporal resolution; the effect of frame rate on reproducibility was not studied as this has been described by others (30), but we ensured that an adequate and consistent frame rate was maintained in all studies to avoid any bias from this source. The study was performed by a single observer, which is a strength in that it avoids the influence of inter-observer variability but limits its applicability to typical clinical practice.
Our study was performed on healthy individuals with good image quality (before image degradation) to maximize our ability to detect effects. Similar studies on individuals with cardiac disease, particularly dyssynchrony, would be valuable in future. We used software from a single vendor. 3D-STE LV dyssynchrony indices have been reported to be uninfluenced by hardware or software (16), but other 3D-STE measures are vendor-dependent (35); therefore, our results should not be assumed to generalize to software from other vendors.

CONCLUSION
Under optimal conditions in healthy individuals, the reliability of volume-derived and some 3D-STE strain-based LV dyssynchrony indices by test-retest was fair to good, but reliability was substantially compromised by poor image quality. Previous studies using re-reading of images as a measure of reliability have probably over-estimated the reliability of 3D-STE derived LV dyssynchrony indices.

DATA AVAILABILITY STATEMENT
All datasets relevant to this study are included in the manuscript/Supplementary Files.

ETHICS STATEMENT
This study was carried out in accordance with the recommendations of World Medical Association (WMA). All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the UCL REC.

AUTHOR CONTRIBUTIONS
Each author has contributed extensively to the manuscript. LA and AH conceived and designed the study. LA performed the data collection and the statistical analysis and drafted the manuscript. CP and AH assisted in data interpretation, critically reviewed, and amended the manuscript.

ACKNOWLEDGMENTS
We are grateful to all the volunteers who participated in this study.