Accuracy of the Radiographic Assessment of Lung Edema Score for the Diagnosis of ARDS

Background: Bilateral opacities on chest radiographs are part of the Berlin Definition for Acute Respiratory Distress Syndrome (ARDS) but have poor interobserver reliability. The “Radiographic Assessment of Lung Edema” (RALE) score was recently proposed for evaluation of the extent and density of alveolar opacities on chest radiographs of ARDS patients. The current study determined the accuracy of the RALE score for the diagnosis and the prognosis of ARDS. Methods: Post-hoc analysis of a cohort of invasively ventilated intensive care unit (ICU) patients expected to need invasive ventilation for >24 h. The Berlin Definition was used as the gold standard. The RALE score was calculated for the first available chest radiograph after start of ventilation in the ICU. The primary endpoint was the diagnostic accuracy of the RALE score for ARDS. Secondary endpoints included the prognostic value of the RALE score for ICU and hospital mortality, and the association with ARDS severity and the PaO2/FiO2. Receiver operating characteristic (ROC) curves were constructed, and the optimal cutoff was used to determine sensitivity, specificity, and the negative and positive predictive values of the RALE score for ARDS. Results: The study included 131 patients, of whom 30 had ARDS (11 mild, 15 moderate, and 4 severe ARDS). The first available chest radiograph was obtained a median of 0 [0 to 1] days after start of invasive ventilation in the ICU. Compared to patients without ARDS, a higher RALE score was found in patients with ARDS (24 [interquartile range (IQR) 16–30] vs. 6 [IQR 3–11]; P < 0.001), with RALE scores of 20 [IQR 14–24], 26 [IQR 16–32], and 32 [IQR 19–36] for mild, moderate, and severe ARDS, respectively (P = 0.166). The area under the ROC curve for ARDS was excellent (0.91 [0.86–0.96]). The best cutoff for ARDS diagnosis was 10, with 100% sensitivity, 71% specificity, 51% positive predictive value, and 100% negative predictive value. The RALE score was not associated with ICU or hospital mortality, and correlated weakly with the PaO2/FiO2. Conclusion: In this cohort of invasively ventilated ICU patients, the RALE score had excellent diagnostic accuracy for ARDS.


INTRODUCTION
The chest radiograph is a frequently used imaging tool in intensive care unit (ICU) patients (Trotman-Dickenson, 2003; Graat et al., 2006), although its clinical value has been disputed (Graat et al., 2005). Findings on chest radiographs are an important part of the Berlin Definition for acute respiratory distress syndrome (ARDS Definition Task Force et al., 2012), despite their low interobserver reliability, which does not improve with training (Rubenfeld et al., 1999; Goddard et al., 2018). Also, the description of chest radiograph findings remains mostly subjective. For this reason, the "Radiographic Assessment of Lung Edema" (RALE) score was recently proposed (Warren et al., 2018), a numeric scoring system in which the chest is divided into four quadrants that are each scored on a numerical scale for extent of consolidation and density of opacification. The RALE score is calculated by summing, over the four quadrants, the product of the scores for consolidation and density of opacification, and can range from 0 to 48.
While the first description of the RALE score focused on validating the score against gravimetric quantification and testing the association between the score and outcome in patients with ARDS (Warren et al., 2018), the score may also have discriminating properties for diagnosing ARDS in invasively ventilated ICU patients who may or may not have ARDS. In addition, with every new scale or score, it is necessary to externally validate its capacity, feasibility, and reliability (Patrick and Chiang, 2000; Keszei et al., 2010; Kottner et al., 2011).
The objective of the current study was twofold. The first objective was to determine whether the RALE score has diagnostic properties for ARDS, and prognostic properties in ICU patients. The second objective was to assess the feasibility and interobserver reliability of the RALE score. These objectives were studied using the chest radiographs of patients in a well-defined cohort of invasively ventilated ICU patients (Vercesi et al., 2018). The hypotheses tested were that the RALE score has good diagnostic accuracy for ARDS, and that the RALE score has prognostic value in invasively ventilated ICU patients, independent of the diagnosis of ARDS.

Study Design and Settings
This study was a post-hoc analysis of a single-center observational study performed in the ICU of the Amsterdam University Medical Centers, location Academic Medical Center (AMC) between November 2016 and June 2017 (Vercesi et al., 2018; Pisani et al., 2019). The Institutional Review Board of the AMC approved the original study and waived the need for informed consent from individual patients because data used in this study had been collected as part of standard care for patients with acute respiratory failure (approval W17_353 # 17.411).

Inclusion and Exclusion Criteria
Patients were eligible for participation in the original study if they: (a) were expected to receive invasive ventilation for at least 24 h at the moment of screening; (b) received ventilation with a minimum of 5 cm H2O positive end-expiratory pressure (PEEP); and (c) had a chest radiograph or lung CT scan within the first 24 and 48 h of start of invasive ventilation, respectively. As the original study focused on the diagnostic value of lung ultrasound plus pulse oximetry for moderate or severe ARDS, the original study had two exclusion criteria, namely: (a) no lung ultrasound study made within 48 h of start of invasive ventilation; and (b) conditions potentially compromising reliability of pulse oximetry, including carbon monoxide poisoning. The number of patients excluded for these reasons, however, was very low. An additional exclusion criterion for the current analysis was the absence of a chest radiograph during the first 2 days of invasive ventilation in the ICU.

Data Collection
Collection of data involved demographic characteristics including age, gender, height, weight, and body mass index; disease severity scores including the acute physiology and chronic health evaluation IV score and the simplified acute physiology score II; and ventilation characteristics including FiO2, minute volume, PEEP, maximum airway pressure (Pmax), respiratory rate, tidal volume, and blood gas analysis results.

ARDS Diagnosis
Acute respiratory distress syndrome was diagnosed according to the Berlin Definition for ARDS (ARDS Definition Task Force et al., 2012). For this, a panel of independent experienced clinicians assessed presence or absence of ARDS, strictly using the 4 components of the Berlin Definition for ARDS, i.e., new or worsening respiratory symptoms within 1 week of a known clinical insult; a PaO2/FiO2 < 300 mm Hg at a minimum of 5 cm H2O PEEP; bilateral opacities on the chest radiograph or computed tomography (CT) exam, not explained by effusions, collapse, or nodules; and respiratory failure not fully explained by cardiac failure or fluid overload. Of note, the clinicians applying the criteria in the Berlin Definition for ARDS could not calculate the RALE score, as this score was developed and reported in the literature after their assessments.

RALE Score
Two independent researchers (CZ and VL) scored the first available chest radiograph after start of mechanical ventilation in ICU patients. These researchers were unaware of clinical information and of the presence or absence of ARDS, as well as of the results of the assessments by the above-mentioned physicians who applied the criteria in the Berlin Definition. In short, as shown in Figure 1, the lung fields on the chest radiograph were divided into four quadrants by a vertical line over the spine and a horizontal line at the level of the first branch of the left main bronchus, exactly as described in the seminal publication on the RALE score (Warren et al., 2018). Each quadrant was assigned a number, and the extent of alveolar opacities (the consolidation score, from 0 to 4) and the density of alveolar opacities (the density score, from 1 to 3) were determined. If the consolidation score was 0, the density score was 0. The final RALE score was the sum of the products of the consolidation and density scores for each quadrant. Thus, the final RALE score could range from a minimum of 0 to a maximum of 48.
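As an illustration of the arithmetic described above, the following minimal Python sketch computes a RALE score from four quadrant scores. The function name `rale_score` and the input layout (a list of four `(consolidation, density)` pairs) are assumptions made for this example; they are not part of the original study or of the seminal publication.

```python
from typing import Sequence, Tuple


def rale_score(quadrants: Sequence[Tuple[int, int]]) -> int:
    """Sum of consolidation x density over four quadrants (range 0 to 48).

    `quadrants` is a hypothetical input layout: four (consolidation, density)
    pairs, with consolidation in 0..4 and density in 1..3.
    """
    if len(quadrants) != 4:
        raise ValueError("exactly four quadrants are expected")
    total = 0
    for consolidation, density in quadrants:
        if not 0 <= consolidation <= 4:
            raise ValueError("consolidation score must be between 0 and 4")
        if consolidation == 0:
            # As stated in the text: no consolidation means a density of 0.
            density = 0
        elif not 1 <= density <= 3:
            raise ValueError("density score must be between 1 and 3")
        total += consolidation * density
    return total


# Maximum possible score: every quadrant fully consolidated and dense.
print(rale_score([(4, 3), (4, 3), (4, 3), (4, 3)]))
```

A radiograph without any opacities scores 0, and the maximum of 48 is reached only when all four quadrants have a consolidation score of 4 and a density score of 3.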

Endpoints
The primary endpoint was the diagnostic accuracy of the RALE score for ARDS. Secondary endpoints included the prognostic value of the RALE score for ICU and hospital mortality, the correlation between the RALE score and ARDS severity, the interobserver reliability of the RALE scoring, and the correlation with the PaO2/FiO2 at the moment the chest radiograph was obtained.

Statistical Analysis
Demographic data, and clinical and outcome variables were presented as frequencies with percentages for categorical variables and as medians with interquartile ranges for continuous variables.
To determine the reliability of the RALE score, the interobserver variability (Keszei et al., 2010) between the primary scorer and a second independent investigator was tested in the entire cohort of patients. For this, a two-way mixed, consistency, average-measures intraclass correlation coefficient (ICC) was calculated. A Bland-Altman plot and a scatter plot were used to visualize the agreement between the two independent scorers. For the primary analysis, only the scores attributed by the primary scorer were used.
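The study computed the ICC in SPSS. For readers who want to see what a consistency, average-measures ICC involves, the sketch below uses the standard two-way ANOVA formulation, (MS_rows − MS_error) / MS_rows. The function name `icc_consistency_average` and the input layout (one row per radiograph, one column per scorer) are assumptions for this example, which is not the authors' code.

```python
from typing import Sequence


def icc_consistency_average(ratings: Sequence[Sequence[float]]) -> float:
    """Two-way, consistency, average-measures ICC.

    `ratings` holds one row per subject and one column per rater.
    Computed as (MS_rows - MS_error) / MS_rows from a two-way ANOVA.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 subjects and 2 raters, no gaps")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    if ms_rows == 0:
        raise ValueError("ICC is undefined when all subjects score the same")
    return (ms_rows - ms_error) / ms_rows
```

Because this is a consistency (not absolute agreement) coefficient, a scorer who is systematically higher by a fixed amount does not lower the ICC; only disagreement in the ranking and spacing of the scores does.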
To determine the diagnostic accuracy of the RALE score for ARDS, the Area Under the Receiver Operating Characteristic curve (AUROC) with 95% confidence intervals (CI) was calculated. Diagnostic accuracy was considered "excellent" if AUROC was between 0.9 and 1, "very good" between 0.8 and 0.9, "good" between 0.7 and 0.8, "sufficient" between 0.6 and 0.7, and "bad" between 0.5 and 0.6 (Šimundić, 2009). The best cutoff, i.e., the point with the maximum difference between the true-positive rate and the false-positive rate, was obtained with the Youden index (sensitivity + specificity − 1) (Youden, 1950). Sensitivity, specificity, and positive and negative predictive values were calculated using this cutoff.
Next, RALE scores were compared between patients without ARDS and patients with mild, moderate, or severe ARDS, and local polynomial regression (LOWESS curve fitting) was used to assess the correlation of the RALE score with PaO2/FiO2, PEEP, FiO2, and Pmax.
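The quantities named above can be sketched in a few lines of plain Python. This is an illustrative example only, not the SPSS procedure used in the study: the function names are hypothetical, the AUROC is computed through the pairwise (Mann-Whitney) formulation, and a radiograph is treated as test-positive when its score is greater than or equal to the cutoff, which is an assumption because the text does not state whether the cutoff is inclusive.

```python
from typing import Dict, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def diagnostic_metrics(scores: Sequence[float], labels: Sequence[int],
                       cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV for 'score >= cutoff'."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted = s >= cutoff  # assumption: the cutoff is inclusive
        if predicted and y:
            tp += 1
        elif predicted:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }


def best_youden_cutoff(scores: Sequence[float],
                       labels: Sequence[int]) -> Tuple[float, float]:
    """Cutoff maximizing J = sensitivity + specificity - 1 (lowest on ties)."""
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        m = diagnostic_metrics(scores, labels, cutoff)
        j = m["sensitivity"] + m["specificity"] - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

Note that predictive values, unlike sensitivity and specificity, depend on the prevalence of ARDS in the cohort, so a cutoff chosen this way may yield different positive and negative predictive values in a population with a different proportion of ARDS.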
Finally, to determine the prognostic accuracy for ICU or hospital mortality, ROCs were constructed and analyzed in the same way as for determining the diagnostic accuracy for ARDS.
A P value < 0.05 was considered statistically significant. All analyses were performed using IBM SPSS Statistics 24.0, and graphs were built using Prism 8 (GraphPad software, version 8.4.2).

Patients
Patient flow is shown in Figure 2. Of the 152 patients in the original cohort, 131 patients fulfilled the additional criteria for participation in the current analysis. Of them, 101 were diagnosed as not having ARDS, and 30 fulfilled the Berlin Definition for ARDS (11, 15, and 4 patients with mild, moderate and severe ARDS, respectively). Demographic and ventilatory characteristics are presented in Table 1.
The ICC for applying the RALE score was excellent (0.95 [95%-CI 0.92-0.96]). Both the Bland-Altman plot and the scatter plot showed a high degree of agreement between the two independent researchers (Supplementary Figure 1).
Values are medians [interquartile range] or numbers (percentage). Abbreviations: ARDS, acute respiratory distress syndrome; SAPS, simplified acute physiology score; SOFA, sequential organ failure assessment; APACHE, acute physiology and chronic health; VFD 28, ventilator-free days at day 28; FiO2, fraction of inspired oxygen; PEEP, positive end-expiratory pressure; Pmax, maximum airway pressure; RR, respiratory rate; VT, tidal volume; PBW, predicted body weight; SpO2, pulse oximetry saturation; PaO2, arterial oxygen pressure; and PaCO2, arterial carbon dioxide pressure.

The Prognostic Value of the RALE Score
The prognostic capacity of the RALE score for ICU and hospital mortality was poor (Figure 4).

Correlation With PaO2/FiO2
The correlation between RALE score and PaO2/FiO2 was weak (linear R2 = 0.21; Supplementary Figure 2). No meaningful association was detectable between the RALE score and PEEP levels recorded at the moment of the chest radiograph.
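For context on the reported value, the sketch below shows how a linear R2 can be computed, assuming it denotes the coefficient of determination of a simple linear fit, which equals the squared Pearson correlation. The function name `linear_r_squared` is hypothetical, and this is not the authors' code.

```python
from typing import Sequence


def linear_r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R^2 of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when x or y is constant")
    return (sxy * sxy) / (sxx * syy)
```

Under this reading, an R2 of 0.21 means that a straight-line relationship with the RALE score accounts for about one fifth of the variance in PaO2/FiO2.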

DISCUSSION
The findings of this post-hoc analysis of a well-defined cohort of invasively ventilated critically ill patients who were not expected to be extubated within 24 h can be summarized as follows: (a) the RALE score is higher in patients with ARDS than in patients not fulfilling the Berlin Definition for ARDS; (b) the diagnostic performance of the RALE score for ARDS is excellent, with a cutoff of 10 showing excellent sensitivity and moderate specificity; (c) the RALE score nevertheless has poor prognostic value in a mixed cohort of patients who may or may not have ARDS; (d) the RALE score increases from mild to severe ARDS, though this finding was not statistically significant; and (e) the RALE score correlates weakly with the PaO2/FiO2.
This study has several strengths. It used the data of a prospective study that included consecutive patients expected to be intubated for at least 24 h. The original study as well as the current re-analysis had only a few exclusion criteria, increasing generalizability. Only eight patients were excluded because of a missing chest radiograph. The chest radiographs used for calculating the RALE score were obtained as close as possible to the start of invasive ventilation in the ICU, and always with a PEEP ≥ 5 cm H2O. ARDS was diagnosed using the present "gold standard," i.e., the Berlin Definition for ARDS, applied by independent physicians with extensive experience in using it. Finally, as a measure against bias, clinicians involved in applying the criteria in the Berlin Definition for ARDS were unaware of the RALE score and, vice versa, the investigators calculating the RALE score remained blinded to the presence or absence of ARDS.
One salient finding was the high agreement between the two researchers with regard to the RALE score in individual cases. This new numeric score seemed easy to learn and calculate, and gave a uniform interpretation of chest radiographs, in line with the seminal report on use of the RALE score (Warren et al., 2018). Notably, interobserver reliability in applying the radiographic criterion of the Berlin Definition was low and did not improve with training (Goddard et al., 2018). Thus, one could argue for using this new score to make diagnosing ARDS easier.
The findings of the current study are at least in part in line with the finding in the seminal study on this new score, i.e., that higher RALE scores are found in patients with more injured lungs, according to the PaO2/FiO2. One difference between the two studies was that in the current study the RALE score was calculated in a much broader population of invasively ventilated ICU patients, i.e., not only patients with ARDS, but also patients at risk of this complication. The RALE score demonstrated excellent diagnostic accuracy for ARDS, and may be taken into consideration in future refinements of the radiological criteria of the Berlin Definition of ARDS. The increase in RALE score from mild to moderate and severe ARDS was not statistically significant, in agreement with a recent study focusing on the evolution of the RALE score in 108 patients with ARDS (Kotok et al., 2020). However, it must be mentioned that the number of patients with ARDS, in particular severe ARDS, was low.
Although we could not find an association between baseline RALE score and mortality, a recent study suggests that the change in RALE score over the first days is associated with survival in ARDS (Jabaudon et al., 2020). Also, in patients with pneumonia from coronavirus disease, both visually assessed RALE scores and scores computed by artificial intelligence algorithms were associated with poor outcomes (Ebrahimian et al., 2021).
Although the RALE score was only weakly associated with ARDS categories based on the degree of hypoxemia, the score could independently improve diagnostic performance and outcome prediction. This should be tested in future cohorts of invasively ventilated ICU patients. This study also has limitations. The study included a relatively small number of patients, resulting in a low number of patients with ARDS, and especially few patients with severe ARDS. In addition, this was a single-center study in which all available patients were used, without a prior formal power calculation. It will be important to confirm the results of the current study by applying the RALE score in a multicenter setting.
In conclusion, the RALE score provides a reliable interpretation of signs of lung edema on chest radiographs in invasively ventilated ICU patients. The RALE score has excellent diagnostic accuracy for ARDS in such patients but has only a weak correlation with PaO2/FiO2 and no association with patient outcomes. Additional validation of the cutoff and performance of the RALE score is needed in larger cohorts.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
MJS, LW, CC, MW, SG, CZ, and LP contributed to conception and design of the study. CZ, LP, and VL organized the database. CZ and LP performed the statistical analysis. CZ wrote the first draft of the manuscript. LP, VL, AA, and MRS wrote sections of the manuscript. MJS and LP supervised the project and revised the present manuscript. All authors contributed to the manuscript revision, read, and approved the submitted version.