Diagnostic and Prognostic Accuracy of Aortic Valve Calcium Scoring in Patients With Moderate-to-Severe Aortic Stenosis

Background: Assessing the true severity of aortic stenosis (AS) remains a challenge, particularly when echocardiography yields discordant results. Recent European and American guidelines recommend measuring aortic valve calcium (AVC) by multidetector row computed tomography (MDCT) to improve this assessment. Aim: To define, using a standardized MDCT scanning protocol, the optimal AVC load criteria for truly severe AS in patients with concordant echocardiographic findings, to establish the ability of these criteria to predict clinical outcomes, and to investigate their ability to delineate truly severe AS in patients with discordant echocardiographic AS grading. Methods and Results: Two hundred and sixty-six patients with moderate-to-severe AS and normal left ventricular ejection fraction (LVEF) prospectively underwent MDCT and Doppler echocardiography to assess AS severity. In patients with concordant AS grading, ROC analysis identified optimal cut-off values for diagnosing severe AS using different AVC load criteria. In these patients, 4-year event-free survival was better with low AVC load (60–63%) by these criteria than with high AVC load (23–26%, log rank p < 0.001). Patients with discordant AS grading had higher AVC load than those with moderate AS but lower AVC load than those with severe high-gradient AS. Between 36 and 55% of patients with severe low-gradient AS (LG-AS) met AVC load criteria for severe AS. Although AVC load predicted outcome in these patients as well, its prognostic impact was smaller than in patients with concordant AS grading. Conclusions: Assessment of AVC load accurately identifies truly severe AS and provides powerful prognostic information. Our data further indicate that patients with discordant AS grading constitute a heterogeneous group, as evidenced by their large range of AVC load. MDCT can also differentiate between truly severe and pseudo-severe AS in this population, although the prognostic implications thereof are less pronounced than in patients with concordant AS grading.


INTRODUCTION
Several recent retrospective studies have indicated that, in elderly patients and particularly in elderly women with severe aortic stenosis (AS), physicians are frequently confronted with lower than expected mean transvalvular gradients, even in the presence of a preserved left ventricular ejection fraction (LVEF) (1)(2)(3)(4)(5). To differentiate this new form of severe AS from the classical "low flow (LF)-low gradient" (LG) form seen in patients with LV dysfunction (1), the term "paradoxical LG-AS" was recently proposed (2,3).
There is considerable debate as to the clinical significance of severe paradoxical LG-AS. Because it is frequently associated with concentric LV remodeling (5), low transvalvular flow rates (2), increased interstitial fibrosis (6), reduced LV long-axis function (5,7), and guarded prognosis (2,4,8-10), several authors have hypothesized that it could represent a more advanced form of severe AS. On the other hand, the results of recent natural history studies have indicated that severe paradoxical LG-AS usually evolves into severe high-gradient (HG)-AS over time (11)(12)(13) and that its clinical outcome resembles that of moderate AS (12,14), thus challenging the former hypothesis. It was further suggested that severe paradoxical LG-AS could be an intermediary stage between moderate AS and severe HG-AS (11).
To gain further insight into the pathophysiology of this challenging condition, an alternate method for assessing AS severity is highly desirable. We and others have previously shown that aortic valve calcification (AVC) load is a fundamental marker of the severity of the aortic valve (AV) lesions seen in "degenerative" AS and that it can be accurately quantified by use of X-ray computed tomography modalities, such as Electron Beam Computed Tomography (EBCT) and Multidetector Computed Tomography (MDCT) (15,16). Based on these observations, the most recent European Society of Cardiology (ESC) guidelines have recommended using this approach to delineate the severity of AS in patients with discordant grading by echocardiography (17), and have proposed specific AVC load thresholds for diagnosing truly severe AS in this setting. Unfortunately, the proposed thresholds were derived from a single multicenter study, which used a wide variety of scanning protocols (18)(19)(20), some of which have been shown to significantly affect the resulting AVC load values. Accordingly, the aims of the present study were to define, using a standardized MDCT scanning protocol, the optimal AVC load criteria for truly severe AS in patients with concordant echocardiographic findings, to establish the potential of these criteria to predict clinical outcomes, and to investigate their ability to delineate truly severe AS in patients with discordant echocardiographic AS grading.

Patient Population
Between February 1st, 2013 and August 31st, 2015, 584 consecutive patients with LVEF > 50% and at least moderate native AS, defined as an effective orifice area (EOA) < 1.5 cm² and an indexed EOA (EOAi) < 0.9 cm²/m² by transthoracic echocardiography, were prospectively identified in the valvular clinic of the Cliniques Universitaires St-Luc and approached for inclusion in the IRB-approved study (2014/29Nov/560). Patients were included in the study after giving written informed consent. Patients with rheumatic AS, LV dysfunction, more than mild aortic regurgitation or mitral valve disease, poor quality of echocardiographic images, or a life expectancy < 1 year in the absence of severe AS were not considered for inclusion. The final study population consisted of 266 patients (Figure 1). After consenting to participate in the study, all patients underwent a comprehensive echocardiographic assessment of their AS and an MDCT examination within 10 ± 19 days of their echocardiographic examination.

Doppler Echocardiography Measurements
Echocardiographic data were obtained by experienced sonographers using commercially available ultrasound systems (IE33 or EPIC, Philips Medical Systems, Andover, MA). All patients underwent a comprehensive examination, including M-mode, 2-dimensional and Doppler examinations, according to ASE/EACVI recommendations.
For assessment of AS, multiple transducer positions were systematically used to record peak aortic jet velocities. The left ventricular outflow tract (LVOT) diameter was obtained from the parasternal long-axis view in mid-systole, parallel to the valve plane and immediately adjacent to the aortic leaflet insertion into the annulus. The LVOT velocity was recorded from the apical window by placing the pulsed-wave Doppler sample volume in the LVOT, proximal to the aortic valve. Proper positioning of the sample volume was ensured by verifying the presence of smooth spectral velocity curves associated with an aortic valve closing click. Care was taken to optimize the alignment of the ultrasound beam with blood flow and to avoid sampling in the transvalvular jet or the proximal flow convergence region by excluding velocity curves with spectral broadening at peak ejection. The maximal velocity across the aortic valve was measured with continuous-wave Doppler from multiple positions (apical, right parasternal, suprasternal, and subxiphoid). The highest velocity signal was used to calculate peak and mean gradients. The EOA was calculated by use of the continuity equation, assuming that the LVOT area had a circular shape. LV volumes and LVEF were calculated by use of the biplane Simpson method, and left atrial volume by use of the biplane area-length method. In case of atrial fibrillation, 5-10 consecutive beats were systematically averaged.
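As an illustration of the continuity-equation calculation described above, the following minimal Python sketch computes the EOA from the LVOT diameter and the LVOT and transvalvular velocity-time integrals (VTI), assuming a circular LVOT. The function names, the use of VTIs as the flow terms, the stroke volume formula and the example values are illustrative assumptions, not the software or data used in the study.

```python
import math


def lvot_area_cm2(lvot_diameter_cm: float) -> float:
    """Cross-sectional LVOT area, assuming a circular outflow tract."""
    return math.pi * (lvot_diameter_cm / 2.0) ** 2


def effective_orifice_area_cm2(lvot_diameter_cm: float,
                               vti_lvot_cm: float,
                               vti_av_cm: float) -> float:
    """Continuity equation: EOA = LVOT area x VTI(LVOT) / VTI(aortic valve)."""
    return lvot_area_cm2(lvot_diameter_cm) * vti_lvot_cm / vti_av_cm


def stroke_volume_ml(lvot_diameter_cm: float, vti_lvot_cm: float) -> float:
    """Doppler stroke volume (assumed formula): LVOT area x VTI(LVOT)."""
    return lvot_area_cm2(lvot_diameter_cm) * vti_lvot_cm


def indexed(value: float, bsa_m2: float) -> float:
    """Index any measurement to body surface area (BSA)."""
    return value / bsa_m2


# Hypothetical patient: LVOT diameter 2.0 cm, VTI 20 cm (LVOT) and
# 100 cm (aortic valve), BSA 1.8 m2.
eoa = effective_orifice_area_cm2(2.0, 20.0, 100.0)
eoai = indexed(eoa, 1.8)
svi = indexed(stroke_volume_ml(2.0, 20.0), 1.8)
print(f"EOA {eoa:.2f} cm2, EOAi {eoai:.2f} cm2/m2, SVi {svi:.1f} mL/m2")
```

Because the LVOT diameter enters the formula squared, a small error in its measurement has a disproportionate effect on the computed EOA.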
Severe AS was defined as an EOAi < 0.6 cm²/m² and was further stratified into subgroups with high or paradoxically low transvalvular gradients according to whether the mean transvalvular pressure gradient (MPG) was ≥ 40 or < 40 mmHg. On the basis of EOAi and MPG, patients were categorized into 3 groups: 2 groups with concordant AS grading (moderate AS with an EOAi > 0.6 cm²/m² and an MPG < 40 mmHg, and severe high-gradient AS with an EOAi ≤ 0.6 cm²/m² and an MPG ≥ 40 mmHg) and 1 group with discordant AS grading (severe paradoxical LG-AS with an EOAi < 0.6 cm²/m² and an MPG < 40 mmHg). Patients with severe paradoxical LG-AS were further stratified into low-flow (LF) and normal-flow (NF) subgroups according to whether the indexed stroke volume was < 35 or ≥ 35 mL/m².
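The grading rules above can be sketched as a small Python function. This is only an illustration: the function name and returned labels are hypothetical, an EOAi of exactly 0.6 cm²/m² is arbitrarily treated as non-severe because the text is ambiguous at that boundary, and the combination of a larger EOAi with an MPG ≥ 40 mmHg, which the text does not describe, is returned as "unclassified".

```python
def grade_as(eoai_cm2_m2: float, mpg_mmhg: float, svi_ml_m2: float) -> str:
    """Assign an echocardiographic AS group from EOAi, MPG and indexed stroke volume."""
    severe_area = eoai_cm2_m2 < 0.6      # assumption: 0.6 itself is not severe
    high_gradient = mpg_mmhg >= 40.0

    if severe_area and high_gradient:
        return "severe HG-AS"            # concordant grading
    if severe_area:
        # Discordant grading: small indexed area but low gradient.
        if svi_ml_m2 < 35.0:
            return "severe paradoxical LF-LG-AS"
        return "severe paradoxical NF-LG-AS"
    if not high_gradient:
        return "moderate AS"             # concordant grading
    return "unclassified"                # combination not described in the text


# Hypothetical examples.
print(grade_as(0.45, 52.0, 40.0))
print(grade_as(0.50, 30.0, 30.0))
print(grade_as(0.75, 22.0, 42.0))
```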

Multidetector Computed Tomography Measurements
All MDCT examinations were performed by use of a helical 256-slice CT scanner (Brilliance iCT, Philips Healthcare, Cleveland, Ohio, USA). Acquisition parameters were set as follows: tube potential of 120 kV, tube current of 250 mA, gantry rotation time of 330 ms, detector configuration of 32 × 0.625 mm, and pitch of 0.14-0.18. Contiguous non-overlapping slices of 2.5 mm were acquired in a craniocaudal direction during an inspiratory breath-hold, using prospective ECG triggering at 75% of the R-R interval and a CB filter. No contrast enhancement was needed and no beta-blocker was administered for the purpose of the examination. The average total estimated effective radiation dose per CT scan was 0.89 ± 0.08 mSv and the average dose-length product was 64 ± 6 mGy.cm.
All measurements of AVC were performed on dedicated workstations using validated, commercially available software (Heartbeat Calcium Scoring; Philips Medical Systems). Calcifications were identified by using a CT attenuation threshold of 130 Hounsfield Units (HU), based on the Agatston scoring method (21). Measurements were made in the axial view by a single investigator who identified the calcifications corresponding to the aortic valve leaflets. For this purpose, the aortic valve was visualized in multiple planes, including the cross-sectional valve plane, to accurately exclude contiguous calcium in the mitral valve annulus, the aortic wall and the coronary arteries. The Agatston score was reported in Agatston units (AU). The AVC index was computed as the Agatston score divided by body surface area (BSA), and AVC density as the Agatston score divided by the LVOT cross-sectional area (measured from echocardiographic data). The accuracy of our measurements was demonstrated in a previous study by anatomical validation, using in vivo (r = 0.86, p < 0.001) and ex vivo (r = 0.93, p < 0.001) AVC measurements (16).
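The two derived indices defined above are simple ratios. The following minimal Python sketch shows the arithmetic, assuming a circular LVOT area computed from the echocardiographic LVOT diameter; the function names and the example values are hypothetical.

```python
import math


def avc_index_au_m2(agatston_au: float, bsa_m2: float) -> float:
    """AVC index: Agatston score divided by body surface area."""
    return agatston_au / bsa_m2


def avc_density_au_cm2(agatston_au: float, lvot_diameter_cm: float) -> float:
    """AVC density: Agatston score divided by the (assumed circular) LVOT area."""
    lvot_area_cm2 = math.pi * (lvot_diameter_cm / 2.0) ** 2
    return agatston_au / lvot_area_cm2


# Hypothetical patient: Agatston score 2,000 AU, BSA 2.0 m2, LVOT diameter 2.0 cm.
print(f"AVC index   {avc_index_au_m2(2000.0, 2.0):.0f} AU/m2")
print(f"AVC density {avc_density_au_cm2(2000.0, 2.0):.0f} AU/cm2")
```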

Outcome
Follow-up events were obtained for all patients between September and December 2018 by recalling physicians, cardiologists or patients themselves. Causes of death were established by autopsy records if the patient died in hospital, and otherwise by the referring physician. The primary outcome was the time to first event of death or aortic valve replacement, including both open surgical and transcatheter procedures. Decisions to proceed to aortic valve replacement were made according to international clinical guidelines and independently of MDCT calcium scoring, the results of which were not made available to the multidisciplinary discussion team. Patients in whom a decision to refer to aortic valve replacement had been made prior to the CT calcium scoring were excluded from the outcome analyses (n = 78).

Statistical Methods
All analyses were performed using SPSS v19.0 software (SPSS Inc., IBM, Chicago, IL). Normality was assessed by use of the Kolmogorov-Smirnov test. Continuous variables were expressed as mean ± 1 SD and were compared among groups using ANOVA when normally distributed, or otherwise using the Kruskal-Wallis test. Individual differences among groups were compared post hoc using the Tukey-Kramer test for normally distributed data with equal variances, the Games-Howell test for normally distributed data with unequal variances, and Mann-Whitney U tests (with Bonferroni correction for multiple comparisons) for non-normally distributed data. Categorical variables were expressed as counts and percentages and were compared among groups using the χ² or the Fisher exact test.
In patients with concordant echocardiographic data, receiver operating characteristic (ROC) curves were used to evaluate AVC load thresholds and to identify the optimum thresholds for severe AS. Their predictive value was evaluated by computing the area under the ROC curve. Kaplan-Meier curves and Cox proportional hazards regression analyses were used to determine the ability of these AVC load thresholds to predict adverse clinical events.
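To make the ROC step concrete, the sketch below scans candidate cut-offs and keeps the one that maximizes the Youden index (sensitivity + specificity - 1), and computes the area under the curve as the probability that a severe case scores higher than a non-severe one. The text does not state which optimality criterion was applied, so the Youden index is an assumption here, and the scores are invented for illustration only.

```python
def best_cutoff(scores, is_severe):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    A patient is called test-positive when score >= cutoff. On ties, the
    lowest cutoff is kept.
    """
    n_pos = sum(1 for y in is_severe if y)
    n_neg = len(is_severe) - n_pos
    best = None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, is_severe) if y and s >= cut)
        tn = sum(1 for s, y in zip(scores, is_severe) if not y and s < cut)
        sens, spec = tp / n_pos, tn / n_neg
        youden = sens + spec - 1.0
        if best is None or youden > best[0]:
            best = (youden, cut, sens, spec)
    return best[1], best[2], best[3]


def roc_auc(scores, is_severe):
    """AUC as the fraction of (severe, non-severe) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, is_severe) if y]
    neg = [s for s, y in zip(scores, is_severe) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented Agatston scores (AU): four moderate AS, four severe HG-AS.
scores = [300, 800, 1200, 1700, 1500, 2100, 2600, 3400]
severe = [False, False, False, False, True, True, True, True]
print(best_cutoff(scores, severe), roc_auc(scores, severe))
```

In practice such an analysis would be run separately in women and men, since the thresholds reported in this study are sex-specific.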
Where appropriate, collinearity of variables was assessed before inclusion in the multivariable model. All tests were two-sided and a p-value of < 0.05 was considered indicative of a statistically significant difference.
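For readers unfamiliar with the Kaplan-Meier method used for the event-free survival analyses, the following minimal Python sketch implements the product-limit estimator. It is a didactic illustration with invented follow-up data, not the SPSS procedure used in the study; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of event-free survival.

    times  : follow-up duration of each patient (e.g., months)
    events : True if the patient had an event (death or valve replacement),
             False if the patient was censored at that time
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Invented follow-up (months); False marks censored patients.
months = [6, 12, 12, 24, 36, 48]
had_event = [True, True, False, True, False, False]
for t, s in kaplan_meier(months, had_event):
    print(f"{t:>2} months: {s:.3f}")
```

Censored patients contribute to the number at risk until their last follow-up but never lower the survival estimate themselves.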

Baseline Clinical, Hemodynamic, and Echocardiographic Characteristics
The final study population consisted of 266 patients [151 men (57%), mean age: 77 ± 10 years], of whom 61 (23%) presented with moderate AS, 58 (22%) with severe paradoxical NF-LG-AS, 47 (18%) with severe paradoxical LF-LG-AS and 100 (38%) with severe HG-AS (Figure 1). The clinical and demographic characteristics of these 4 groups are shown in Table 1 and their echocardiographic parameters in Table 2. Overall, the clinical, demographic and echocardiographic characteristics were similar between groups, except for the glomerular filtration rate, which was significantly higher in patients with moderate AS than in those with severe paradoxical NF-LG-AS, and atrial fibrillation, which was more prevalent in patients with severe paradoxical LF-LG-AS than in those with severe HG-AS. In addition, patients with moderate AS had a higher prevalence of hypertension. Patients with severe paradoxical NF-LG-AS also exhibited larger EOAi values than the other severe AS groups, whereas patients with severe paradoxical LF-LG-AS displayed lower LV volume indexes than the other groups.

AVC Load in the Different AS Groups
As shown in Table 3 and Figure 2, patients with moderate AS displayed significantly lower Agatston score, AVC density and AVC index than patients with severe HG-AS. The Agatston score, AVC density and AVC index of patients with severe paradoxical LG-AS were intermediate between those of patients with moderate AS and those of patients with severe HG-AS. Among patients with severe paradoxical LG-AS, no differences in Agatston score, AVC density or AVC index were found between those with NF and those with LF (Table 3). These observations were made in both men and women, although the Agatston score, AVC density and AVC index were systematically higher in men than in women.

MDCT Diagnostic Thresholds in Patients With Concordant AS Grading
As shown in Table 4, using ROC analysis, the best cut-off values to identify severe AS were an Agatston score ≥ 1,577 AU for women and ≥ 2,238 AU for men, an AVC density ≥ 495 AU/cm² for women and ≥ 581 AU/cm² for men, and an AVC index ≥ 891 AU/m² for women and ≥ 1,130 AU/m² for men. AVC density was associated with the highest area under the curve (AUC) both for women and men (AUC: 0.98 and 0.96, respectively), followed by the Agatston score (AUC: 0.94 each) and the AVC index (AUC: 0.96 and 0.93, respectively).
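The sex-specific cut-off values reported above can be applied as a simple lookup, as in the Python sketch below. The threshold values are those given in the text; the dictionary layout, key names and function name are illustrative assumptions.

```python
# Cut-off values for severe AS reported in this study (Table 4).
SEVERE_AS_THRESHOLDS = {
    "agatston_au":        {"female": 1577.0, "male": 2238.0},
    "avc_density_au_cm2": {"female": 495.0,  "male": 581.0},
    "avc_index_au_m2":    {"female": 891.0,  "male": 1130.0},
}


def has_high_avc_load(criterion: str, value: float, sex: str) -> bool:
    """True when the value meets the sex-specific cut-off for severe AS."""
    return value >= SEVERE_AS_THRESHOLDS[criterion][sex]


# Hypothetical patients.
print(has_high_avc_load("agatston_au", 1800.0, "female"))        # meets 1,577 AU
print(has_high_avc_load("agatston_au", 1800.0, "male"))          # below 2,238 AU
print(has_high_avc_load("avc_density_au_cm2", 581.0, "male"))    # equals cut-off
```

The same Agatston score can therefore classify a woman as having a high AVC load and a man as having a low one, which is why sex must always accompany the raw score.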

Prediction of Adverse Events by AVC Load Indices
Event-free survival was assessed in the 188 patients in whom the decision to proceed to surgery had not yet been made at the time of the MDCT investigation. Over a mean follow-up of 31 months (range 1-48 months), 50 died and 94 underwent aortic valve replacement. The overall event-free survival of this cohort was 72 ± 3%, 63 ± 4%, and 46 ± 4% at, respectively, 1, 2, and 4 years. As shown in Figure 3, event-free survival was better in patients with moderate AS or severe paradoxical LG-AS than in those with severe HG-AS.
The impact of the different AVC load indices on event-free survival was tested in the entire population (Figure 4) as well as in the subgroups of patients with concordant and discordant echocardiographic AS grading (Figure 5). To delineate the factors independently associated with the combined end-point of death and aortic valve replacement, different Cox proportional hazards regression models were generated. Using AVC load indices as continuous variables (Table 5, model 1), Cox analysis identified the AVC index as the sole independent predictor of outcome. Using the best AVC load cut-off values, as determined by the ROC curve analyses (Table 5, model 2), Cox analysis identified the Agatston score, age and the effective orifice area as independent predictors of outcome. Table 5 compares the sensitivity, specificity and overall accuracy of different AVC load thresholds published in the literature to those found in our study. Overall, the sensitivity of these thresholds was similar (from 91 to 93% in women and from 87 to 89% in men). Differences in specificity were nonetheless observed, the thresholds recommended by the ESC guidelines and those proposed by Clavel et al. being less specific than those proposed by Pawade et al. or those found in the present study.

Comparison With Previously Published AVC Load Thresholds
Using univariate Cox proportional hazards analyses, we also compared the ability of the different AVC load thresholds to predict outcome. As shown in Tables 6 and 7, all AVC load thresholds were highly predictive of the combined end-point of death and need for aortic valve replacement. Yet, the model based on the thresholds found in the present study was the most powerful, as shown by its higher χ² and its lower AIC and SBC. Figure 6 shows the proportion of patients with truly severe AS based on the AVC load thresholds found in our study. As shown, the different AVC load indices correctly identified > 90% of patients with severe HG-AS and > 85% of patients with moderate AS. Depending on the parameter used, 36-55% of patients with severe paradoxical LG-AS also met AVC load criteria for severe AS.

DISCUSSION
The aims of the present study were to define, using a standardized MDCT scanning protocol, the optimal AVC load criteria for truly severe AS in patients with concordant echocardiographic AS grading, to establish the potential of these criteria to predict clinical outcomes, and to investigate their ability to delineate truly severe AS in patients with discordant echocardiographic AS grading. Our results can be summarized as follows:
- AVC load increases from moderate AS, through severe paradoxical LG-AS, to severe HG-AS.
- In patients with concordant echocardiographic AS grading, all AVC load indices distinguish between moderate AS and severe HG-AS with a similar overall accuracy of 87-91%.
- The observed AVC load thresholds also predict which patients will die or require aortic valve replacement.
- Depending on the criteria used, between 36 and 55% of patients with severe paradoxical LG-AS meet AVC load criteria for severe AS, the lowest proportion being found using our own thresholds, and the highest proportion being obtained using the 2017 ESC guidelines thresholds or those proposed by Clavel et al. and Pawade et al.
- The prognostic impact of AVC load thresholds is smaller in patients with discordant AS grading than in those with concordant AS grading.
Patients with severe paradoxical LG-AS despite normal LVEF pose diagnostic and management challenges that are distinctly different from those posed by the majority of patients with severe HG-AS. Both the European and American guidelines recognize the complexity of reaching a final decision in these patients and consider it reasonable to propose aortic valve replacement in symptomatic patients, provided that clinical, hemodynamic, and anatomic data support valve obstruction as the most likely cause of symptoms (17,23). According to the most recent ESC guidelines, this can be best achieved by measuring AVC load using MDCT (17).

AVC Load to Assess the Severity of AS
Earlier studies have shown a definite association between AVC load by EBCT and hemodynamic indices of AS severity (15). More recently, AVC load has also been evaluated with non-ECG-gated MDCT (24-26), using a slightly modified Agatston scoring system in order to provide values comparable to the original EBCT quantification. With the introduction of ECG-gating, the quality of cardiac MDCT imaging has improved even further, so that today, MDCT has become the preferred method for assessing AVC load. Its accuracy has been validated in several anatomical studies (16,26). In the present study, we used this approach to calculate AVC load thresholds that best discriminate between moderate AS and severe HG-AS. Depending on the parameter used (Agatston score, AVC density and AVC index), severe AS was identified with a sensitivity of 87-91%, a specificity of 87-90% and an overall accuracy of 87-91%. (22). Although these last authors found somewhat lower threshold values than in our study, a recent study by Clavel et al. demonstrated that the thresholds of AVC load that best identify adverse outcomes are higher than those proposed in the guidelines and quite similar to those found in the present study (around 1,500 AU in women and 2,250 AU in men) (27).

Which AVC Load Criteria Should We Use to Assess the Severity of AS?
Since absolute AVC load differs between bicuspid and tricuspid valves, but AVC density does not (16), the latter should probably be preferred to avoid misinterpretations of AVC load in patients in whom the underlying valve morphology is uncertain. The use of AVC density could also avoid underestimation or overestimation of AS severity in patients with small or large annuli, as highlighted by several authors (16,18,27). However, current guidelines do not provide any recommendation in this regard. The present study shows that AVC density has the highest accuracy in identifying truly severe AS in patients with concordant AS grading. The thresholds found in our study are also similar to those that were recently found by Clavel et al. to be associated with poor outcomes (430 AU/cm² in women and 560 AU/cm² in men) (27).
Further studies are needed to confirm the potential interest in using this parameter instead of the more commonly used Agatston score. Our survival analyses nonetheless suggest that it does allow better prediction of clinical outcomes than the Agatston score.

AVC Load in Patients With Severe Paradoxical LG-AS
An important finding of this study is that AVC load is significantly lower in patients with severe paradoxical LG-AS than in those with severe HG-AS, irrespective of the flow pattern. It is also higher than in patients with moderate AS. Similar results were reported by Clavel et al. (18) and more recently by Kamperidis et al. (28).
In an earlier analysis of the same cohort, we had already observed that a higher AVC load was needed to define severe AS on the basis of an MPG ≥ 40 mmHg or a Vmax > 4 m/s than on the basis of an EOAi < 0.6 cm²/m². We then hypothesized that use of the continuity equation to assess AS severity was responsible for these observations, as the EOAi derived from Doppler echocardiography is usually smaller than the anatomic valve area measured by planimetry, autopsy, or cardiac catheterization. Although the differences between the anatomic and effective valve areas are commonly explained by the continuing convergence of streamlines beyond the anatomical orifice, we have recently shown that, in reality, they are largely due to the underestimation of subvalvular flow when a circular LVOT area is entered into the continuity equation (29). Since guidelines for grading AS severity were initially derived from invasive measurements reflecting anatomic valve area, inconsistent grading of AS severity on the basis of mean pressure gradients (or Vmax) and EOAi was to be expected. The present data indicate that use of AVC load might be helpful to better define AS severity, particularly when Doppler echocardiographic data are the most discordant, i.e., in patients with severe paradoxical LG-AS. Indeed, when using the above-described AVC load thresholds to define severe AS, 36-55% of patients with severe paradoxical LG-AS meet AVC load criteria for severe AS. These findings are in line with those of Clavel et al., who also found that a substantial proportion (45-53%) of patients with severe paradoxical LG-AS meet AVC load criteria for severe AS (18). This confirms that patients with severe paradoxical LG-AS constitute a heterogeneous population, and that use of MDCT to measure AVC load makes it possible to differentiate those with truly severe AS from those with moderate or pseudo-severe AS.
It should nonetheless be emphasized that the prognostic implications thereof seem to be less pronounced in this population than in patients with severe HG-AS. As shown in Figure 4, the event-free survival of patients with severe paradoxical LG-AS and high AVC load is indeed significantly better than that of similar patients with severe HG-AS.

Study Limitations
This study has limitations that should be acknowledged. First, we had to exclude a significant number of patients from the outcome analyses because they were already scheduled to undergo surgery at the time of their MDCT evaluation. Nevertheless, we were still able to assess clinical outcomes in 188 patients, including a large number of patients with concordant or discordant echocardiographic AS grading. Second, we did not perform subgroup analyses in patients with bicuspid vs. tricuspid valves. This is because of the relative inability of echocardiography to accurately identify bicuspid valves when they are heavily calcified (30). This can potentially be problematic when assessing AVC load by use of the Agatston score, since bicuspid valves are usually larger than tricuspid valves and therefore tend to display larger Agatston scores. As previously shown, this limitation can be easily overcome by use of AVC density instead of the absolute Agatston score (16). Finally, we did not investigate the potential impact of AVC load on post-operative or post-interventional outcomes. Some studies have recently indicated that the presence and amount of calcium in the left ventricular outflow tract are important determinants of outcomes after transcatheter aortic valve replacement (TAVR) (31). Unfortunately, the small number of patients undergoing TAVR in our study precluded any meaningful statistical analysis. Further studies will be needed to address this issue.

CONCLUSIONS
Assessment of AVC load accurately identifies truly severe AS and provides powerful prognostic information. Our data further indicate that patients with discordant AS grading constitute a heterogeneous group, as evidenced by their large range of AVC load. MDCT can also differentiate between truly severe and pseudo-severe AS in this population, although the prognostic implications thereof are not as pronounced as in patients with concordant AS grading.

SHORT SUMMARY
Using a standardized MDCT scanning protocol, we identified optimal AVC load criteria for diagnosing truly severe AS. In patients with concordant echocardiographic results, 4-year event-free survival was considerably better with low AVC load by these criteria than with high AVC load. Among patients with discordant AS grading, between 36 and 55% met AVC load criteria for severe AS. Yet, the prognostic implications thereof were less pronounced than in patients with concordant AS grading.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article can be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Comité d'éthique hospitalo-facultaire. The patients/participants provided their written informed consent to participate in this study.