Diagnostic Errors Are Common in Acute Pediatric Respiratory Disease: A Prospective, Single-Blinded Multicenter Diagnostic Accuracy Study in Australian Emergency Departments

Background: Diagnostic errors are a global health priority and a common cause of preventable harm. There is limited data available for the prevalence of misdiagnosis in pediatric acute-care settings. Respiratory illnesses, which are particularly challenging to diagnose, are the most frequent reason for presentation to pediatric emergency departments. Objective: To evaluate the diagnostic accuracy of emergency department clinicians in diagnosing acute childhood respiratory diseases, as compared with expert panel consensus (reference standard). Methods: Prospective, multicenter, single-blinded, diagnostic accuracy study in two well-resourced pediatric emergency departments in a large Australian city. Between September 2016 and August 2018, a convenience sample of children aged 29 days to 12 years who presented with respiratory symptoms was enrolled. The emergency department discharge diagnoses were reported by clinicians based upon standard clinical diagnostic definitions. These diagnoses were compared against consensus diagnoses given by an expert panel of pediatric specialists using standardized disease definitions after they reviewed all medical records. Results: For 620 participants, the sensitivity and specificity (%, [95% CI]) of the emergency department compared with the expert panel diagnoses were generally poor: isolated upper respiratory tract disease (64.9 [54.6, 74.4], 91.0 [88.2, 93.3]), croup (76.8 [66.2, 85.4], 97.9 [96.2, 98.9]), lower respiratory tract disease (86.6 [83.1, 89.6], 92.9 [87.6, 96.4]), bronchiolitis (66.9 [58.6, 74.5], 94.3 [80.8, 99.3]), asthma/reactive airway disease (91.0 [85.8, 94.8], 93.0 [90.1, 95.3]), clinical pneumonia (63·9 [50.6, 75·8], 95·0 [92·8, 96·7]), focal (consolidative) pneumonia (54·8 [38·7, 70·2], 86.2 [79.3, 91.5]). Only 59% of chest x-rays with consolidation were correctly identified. Between 6.9 and 14.5% of children were inappropriately prescribed based on their eventual diagnosis. Conclusion: In well-resourced emergency departments, we have identified a previously unrecognized high diagnostic error rate for acute childhood respiratory disorders, particularly in pneumonia and bronchiolitis. These errors lead to the potential of avoidable harm and the administration of inappropriate treatment.

Background: Diagnostic errors are a global health priority and a common cause of preventable harm. There is limited data available for the prevalence of misdiagnosis in pediatric acute-care settings. Respiratory illnesses, which are particularly challenging to diagnose, are the most frequent reason for presentation to pediatric emergency departments.
Objective: To evaluate the diagnostic accuracy of emergency department clinicians in diagnosing acute childhood respiratory diseases, as compared with expert panel consensus (reference standard).
Methods: Prospective, multicenter, single-blinded, diagnostic accuracy study in two well-resourced pediatric emergency departments in a large Australian city. Between September 2016 and August 2018, a convenience sample of children aged 29 days to 12 years who presented with respiratory symptoms was enrolled. The emergency department discharge diagnoses were reported by clinicians based upon standard clinical diagnostic definitions. These diagnoses were compared against consensus diagnoses given by an expert panel of pediatric specialists using standardized disease definitions after they reviewed all medical records.
Results: For 620 participants, the sensitivity and specificity (%, [95% CI]) of the emergency department compared with the expert panel diagnoses were generally poor: isolated upper respiratory tract disease (64.9 [54.6, 74.4]

INTRODUCTION
Diagnostic errors are the most common form of medical error and are defined as "the failure to make an accurate and timely explanation of the patient's health problem or to communicate that explanation to the patient" (1). Although diagnostic errors are classified by the World Health Organization (WHO) as a global health priority and are regarded as a "moral, professional and public health imperative" by the U.S. National Academy of Medicine (1), there are a lack of published data and reliable measures in this area, particularly in child health.
Diagnostic error research has been hindered, in part, because clinicians are generally poor at recognizing their own mistakes, particularly if they have not been formally trained to identify errors (2)(3)(4). With these findings in mind, reported error rates are likely to be underestimates. In addition, physician confidence levels have been shown to be insensitive to their diagnostic accuracy or the case difficulty (5). In pediatrics, where a diagnosis is dependent on having an accessible and communicative caregiver, there could be even greater potential for error, which might contribute to the almost 7 million global childhood deaths each year (6). When surveyed, 15-77% of pediatricians reported making at least one diagnostic error a month, and 45% reported making at least one harmful error each year (7)(8)(9).
The consequences of diagnostic errors include preventable harm for individuals and cost to the public health sector. In the United States, an estimated one million people per year experience harm from misdiagnosis (10) with the potential level of harm being moderate to severe in up to 86% of cases (11,12). In total, the cost of medical errors is estimated to be USD 19.5 billion per annum, with an economic impact approaching USD 1 trillion annually (13). Diagnostic errors are also an important contributor in medical malpractice (14).
Acute respiratory conditions are the most common reason for presentation to pediatric emergency departments (E.D.s) (15). However, there is little evidence in the literature to confirm the diagnostic accuracy of clinicians in these units. The differential diagnosis of childhood respiratory illness is challenging as it relies on a complex mix of clinical and interpretative skills in patient history, examination, and investigation. In the chaotic environment of a busy E.D., there is a greater potential for making errors. Many respiratory disorders have similar clinical features, such as breathlessness and wheeze, which can contribute to misdiagnoses (16). Auscultation underpins the diagnosis of many lower respiratory tract conditions, but this relies on clinician experience and interpretation and has been shown to have high inter-rater variation (17,18). In primary care settings, respiratory diseases, including pneumonia and asthma, are commonly misdiagnosed; however, the error rates in E.D.s have not been reported (11,19,20). Due to the potential for serious harm arising from missing childhood respiratory diseases, there is a need to define how frequently errors occur in E.D.s. Once the breadth of the problem is defined, remedial approaches can be developed.
The present study aimed to determine the diagnostic error rates for acute childhood respiratory diseases in two wellresourced E.D.s in Western Australia.

Ethics and Study Conduct
Approval to conduct the study was granted by the Human Research Ethics Committees of Ramsay Health Care WA|SA (Reference 1501) and the Child and Adolescent Health Service of Western Australia (Reference 2015030EP). We obtained written consent from the parents of all participants. No adverse events were reported. The study did not interfere with any aspect of clinical care or diagnosis.

Study Design and Participants
This was a multicenter, prospective, single-blind diagnostic accuracy study. We recruited a convenience sample of children aged 29 days to 12 years who attended a study site with signs or symptoms of a respiratory illness (Box 1).
Participating sites were two hospitals in Western Australia: a tertiary pediatric facility with 75,000 ED presentations per year and a metropolitan general hospital with 29,000 pediatric (109,000 total) E.D. presentations per year. Both hospitals are optimally resourced units with complete treatment, laboratory, and radiology services. The units deliver team-based clinical care led by Emergency Medicine Physicians and Pediatric and Emergency Fellowship registrars. At least two supervising consultants were always present in the departments.

Collected Data
We collected demographic data, clinical measures and symptoms, examination findings, cough-sound recordings, and diagnostic and treatment response reports. We recorded BOX 1 | Study inclusion and exclusion criteria.
Inclusion criteria (at least one of the following) A research nurse recorded cough sounds onto a standard iPhone 6. Each participant provided five cough sound events (spontaneous or voluntary) whilst they were in the E.D.

E.D. Diagnosis
The E.D. diagnosis was recorded by the treating team at patient discharge from the unit, either to home or to an inpatient ward. The department's clinicians reported x-rays performed in the E.D. The treating team could record more than one diagnosis for each participant. E.D. clinicians were blinded to the consensus panel diagnoses.

Expert Panel Consensus Diagnosis
To reach a consensus diagnosis, we assembled an expert panel comprising four acute-care pediatricians (Fellows of the Royal Australasian College of Physicians: median 15 years of specialist practice). All hospital charts and databases were available to the panel, including test results and treatment responses. The panel was able to access information related to the final discharge diagnosis from the E.D. and inpatient teams. Specialist radiologists reported all radiology, and the results were confirmed by the panel. For participants admitted to inpatient wards, data from their clinical course after the E.D. visit was available. The panel members were allowed to listen to the recorded coughs when considering a diagnosis of croup.
Two members of the panel reviewed each participant independently. A third member acted as a tiebreaker in the event of non-agreement. Each participant was scored as positive, negative, or unsure for each of the study conditions. A consensus diagnosis could not be assigned when there was insufficient information in the medical notes or if relevant tests were not performed.

Study Disease Definitions
Study diagnostic definitions ( Table 1) were developed from international guidelines and are consistent with the diagnostic pathways used at the two study sites (21)(22)(23)(24)(25)(26). In addition to the specific diseases of "no respiratory disease found, " "croup, " "asthma/reactive airway disease" (asthma/RAD), "bronchiolitis, " and "pneumonia, " we defined two broader groups to differentiate children with isolated upper respiratory disease (iURTD) from those with any form of lower respiratory tract disease (LRTD; disease below the trachea, including all chest infections and bronchodilator responsive conditions). Only children under 24 months were considered for a diagnosis of bronchiolitis (21,25). The asthma/RAD group included children with acute asthma and viral-induced wheezy episodes that were bronchodilator responsive. A positive diagnosis of asthma/RAD required the participant to have a positive bronchodilator response documented by the clinical team during the treatment visit.
Two sets of pneumonia were defined. The term "clinical pneumonia" was applied when radiology was not required in making the diagnosis, as is recommended in cases of mild severity by the Pediatric Infectious Diseases Society and the Infectious Diseases Society of America (27). Focal (consolidative) pneumonia was examined as a separate group due to the likelihood of being bacterial in etiology. To be positive for focal (consolidative) pneumonia, a specialist radiologist report indicating new consolidation or pleural effusion was necessary along with only focal, or no, auscultatory findings. To standardize radiology interpretation for the expert panel, we used the WHO criteria for interpreting chest x-rays developed to report postvaccination pneumonia prevalence (28,29).
As was the case with the E.D. diagnosis, participants could have more than one study diagnosis as some were dependent subsets. For example, patients with pneumonia also received the broader diagnosis of LRTD.

Statistical Analysis
Our primary measures of diagnostic accuracy were sensitivity and specificity, with expert panel consensus used as the reference standard. False-negative and false-positive rates were then computed as 1-sensitivity and 1-specificity, respectively. We calculated 95% confidence intervals around these parameters using the method of Clopper-Pearson. When reporting demographic details, median and interquartile ranges were provided, given the skewed distribution.
All data were analyzed by an independent statistician using Stata 16.1 (StataCorp, College Station, Texas).

Demographics and Expert Panel Diagnoses
We enrolled 620 children across the two study sites from September 2016 to August 2018−125 at the tertiary pediatric hospital and 495 at the metropolitan general hospital. There were no differences in age (47 vs. 47 months p = 0.45) or sex (male: 60.2% vs. 53.6%, p = 0.18) between enrolment sites. The proportion of participants positive for each study diagnosis were similar at both sites, except for bronchiolitis, where more participants were recruited at the tertiary site (120/145 vs. 25/35, p = 0.016, subjects <2 years old, data not shown). Participant flow through the study along with the reasons for any excluded cases per study disease is shown in Figure 1.

Comparison Between the Emergency Department and Expert Panel Diagnoses
The sensitivity and specificity between E.D. and panel diagnoses are shown in Table 2, along with the number of missed cases (false negative) and false-alarm cases (false positive). There were no differences between the two study sites. Asthma/RAD was the most reliably diagnosed condition (91% correctly identified as positive) followed by LRTD (86.6%), croup (76.8%), bronchiolitis (66.9%) and iURTD (64.9%). Pneumonia was the least reliably diagnosed condition, with clinical pneumonia identified in 63.9% of participants, while focal (consolidative) pneumonia was correctly identified in only 54.8% of participants. The specificity for all study diseases was higher than the sensitivity.
We explored the missed and false alarm cases by study disease, including the diagnoses erroneously attributed to each participant ( Table 2). There were common errors with differentiating asthma/RAD from bronchiolitis, croup from iURTD, and iURTD from LRTD. In 10 out of 42 cases of focal (consolidative) pneumonia, the E.D. did not detect any type of LRTD, and in six instances, no respiratory disease was detected. E.D. clinicians diagnosed the presence of a LRTD (10/19) or a wheezy condition (4/19) when they missed focal (consolidative) pneumonia.

Diagnostic errors cost health systems billions of dollars each
year and result in a notable amount of preventable harm. Our findings show that acute respiratory disorders are frequently misdiagnosed in optimally resourced E.D.s despite them being the most common reason for presentation. While asthma (91%) and LRTDs (86%) were well identified, over 45% of focal (consolidative) pneumonia, 35% of iURTD, 23% of croup and 33% of bronchiolitis cases were missed. Each diagnostic error was compounded by participants receiving an incorrect diagnosis, including 8% of LRTDs being assigned as iURTD and 14% of focal (consolidative) pneumonias classified as "no respiratory disease." In the assessment of respiratory disorders, the first decision point is the distinction of isolated upper airway from lower respiratory tract diseases such as pneumonia and asthma. Doctors are more likely to prioritize assessment and treatment for LRTDs as these conditions tend to be more severe. In our study, E.D. clinicians correctly identified a LRTD in 87% of cases, however then had trouble differentiating the more specific diagnoses of bronchiolitis and pneumonia. In addition, 15% of children with iURTDs were misdiagnosed as having a LRTD. Although this overly cautious approach could be considered safe, it does result in unnecessary investigations and treatments which carry adverse consequences. Many study participants who did not have pneumonia received unnecessary antibiotics (7.2-14.5%) and x-rays (3.6-17.4%), resulting in poor antibiotic stewardship and ionizing radiation exposure. Economically, the annual cost of such defensive medicine and over-investigation to guard against malpractice lawsuits is estimated to be at least USD 25-60 billion (30).
Due to overlapping features, it can be challenging for clinicians to differentiate between conditions that present with acute wheeze such as bronchiolitis and asthma. This is even more complicated when assessing young children who cannot always communicate their symptoms. In order to differentiate between wheezy conditions, physicians rely on bronchodilator tests and professional experience. In resource-poor areas where bronchodilator testing is not usually done, up to 50% of children under the age of 5 years who are diagnosed with pneumonia using WHO/Integrated management of childhood illness guidelines could subsequently be reclassified as having asthma (31)(32)(33). In our study, the accurate identification of asthma was excellent (91% of cases) except in younger children (<24 months) in which 9% of bronchiolitis cases were misdiagnosed as asthma. These diagnostic errors may have resulted from individual clinicians' interpretation of bronchodilator tests.
The accurate diagnosis of pneumonia is dependent on the availability of clinical expertise and diagnostic support resources (34)(35)(36)(37). Without radiology, clinicians must rely on auscultatory and clinical findings. However, the diagnosis  Sensitivity and Specificity data are presented as percentage and 95% confidence intervals. False-positive and false-negative data presented as absolute ratio and percentages of condition negative and positive, respectively. Numbers do not equal total missed or total false alarm cases as more than one study diagnosis could be allocated for each participant.
Frontiers in Pediatrics | www.frontiersin.org of community-acquired pneumonia using presenting clinical features alone is often inadequate, with primary care providers missing up to 71% of cases in adults (38)(39)(40). In our definition of clinical pneumonia, we included patients with generalized auscultatory changes along with patients with only focal changes to capture viral, atypical (mycoplasma) and bacterial infections. This approach can increase the possibility of misdiagnosis, as these features are also associated with conditions such as asthma and bronchiolitis. Viral and atypical chest infections typically have generalized signs, while bacterial infections are typically more localized, but there is considerable overlap between groups. Using this broad definition of pneumonia, we found that E.D. clinicians missed 36% of cases. Eight of the missed cases (n = 22) were instead diagnosed with "no respiratory disease" while six were diagnosed with an isolated URTD. Conversely, 96% of patients erroneously diagnosed with clinical pneumonia had another LRTD, including 26% who had asthma and did not receive appropriate treatment.
The most frequently missed condition in the study was focal (consolidative) pneumonia (sensitivity 55%, specificity 86%), which was also the most potentially serious condition because of probable bacterial etiology. As early detection and commencement of antibiotic therapy is essential for best outcomes these errors represent a serious cause of preventable harm. Globally, pneumonia is the leading cause of childhood death under 5 years of age, with the highest mortality rates seen in low-resourced areas (41). Although our study sites had access to specialist trained pediatric doctors and full diagnostic imaging support, they missed nearly one in two cases of focal (consolidative) pneumonia. In comparison, the symptom-based algorithm developed by the WHO/IMCD to detect pneumonia in low-resourced settings without radiology or expert healthcare has been found to have a similar sensitivity of 67% but a lower specificity of 60% to our study when compared to diagnosis by lung ultrasound (42). A 2021 study in Tanzania found that the algorithm had a much lower diagnostic sensitivity of 25%, and in a Canadian study using a population more comparable to our study, the positive and negative predictive values were also 25% (43,44). Studies in adult patients with suspected communityacquired pneumonia have reported higher false-positive rates than our results. A study across three hospitals that included 800 patients admitted from the E.D. with a community-acquired pneumonia diagnosis revealed 219 (27%) had a non-pneumonia diagnosis upon discharge (45). In another, a cohort of 195 adult patients admitted to a hospital ward with an E.D. diagnosis of pneumonia included 18% with normal chest x-rays (46). In addition, 29% of patients had a discordant discharge diagnosis, with 54% of these patients being diagnosed with an upper respiratory infection and 30% classified as having no infection at all.
We found a substantial error rate in the interpretation of chest x-rays. Only 59% of x-rays with consolidation were correctly diagnosed in E.D., while 14% of normal x-rays were erroneously thought to have consolidation. This error was a significant contributing factor to the poor diagnostic performance for focal (consolidative) pneumonia in our study. Difficulties associated with identifying pneumonia and consolidation on x-ray have been described in the adult literature, including poor interrater agreement (34)(35)(36)(37) between E.D. doctors and radiologists. Lung ultrasound examination has emerged as an effective tool. When used by trained operators a meta-analysis found a pooled sensitivity of 95% and a specificity of 95%. Diagnostic error rates for pneumonia could be reduced by having radiologists work with E.D. teams in real-time, but this approach would be resource-intensive and not possible in small or regional hospitals.
Diagnostic errors are often caused by human factors such as cognition, tiredness, distraction and miscommunication (47). Emergency departments are by nature distracting environments where clinicians are required to see multiple patients concurrently, often while tired. Strategies to lessen these influences have led to the development of objective digital technologies using artificial intelligence (A.I.) and machine learning. AI-based systems which mimic the clinical diagnostic process have shown some promising results. A system used to interrogate electronic medical records has been reported to accurately identify many pediatric conditions and differentiate upper from lower respiratory disease with over 87% accuracy (48). This approach, however, relies on the quality and generalizability of the training data sets and can only be used after all information has been entered into the patient's medical record. Studies of algorithms that use point-of-care automated cough-centered analysis have reported good diagnostic accuracy for respiratory diseases, including pneumonia, correctly identifying 87% of children and 86% of adults without the need for clinical examination or investigations (49)(50)(51)(52). In settings with limited resources, such algorithms might provide a diagnostic method equal to the accuracy of wellresourced E.D.s. The development of automated algorithms to detect pneumonia based on ultrasound patterns has also shown promise, potentially reducing the need for trained operators and chest x-rays (53).
To reduce the risk of harm associated with diagnostic errors, further studies should focus on objective methods to improve diagnostic accuracy, such as AI-based systems. Repeating our study in environments with limited clinical and diagnostic support resources would help to establish a baseline from which to assess the usefulness of new diagnostic modalities.

LIMITATIONS
We were confronted with several challenges during the design of our study. As some of the study diseases do not have clear objective diagnostic markers or tests, we articulated strict definitions for the adjudication panel to use based on internationally published criteria (21)(22)(23)(24)(25). We have found no other studies that have used reference diagnostic criteria as stringent as this. There were limitations relating to wheezy conditions, such as asthma, in which an accurate diagnosis relies on bronchodilator test responsiveness. However, unless formal lung function tests are conducted, interpretation of the response to bronchodilators is subjective (54). As it is not feasible to perform lung function tests in young children in E.D., nor is it recommended for children younger than six, we acknowledge that the interpretation of bronchodilator tests in our study was subjective. However, this is the current standard of care in clinical practice. To improve accuracy, we insisted a bronchodilator test be administered in the ED and the response recorded before assessing for either asthma or bronchiolitis. For children aged 1-2 years, the differentiation between bronchiolitis and asthma depends on bronchodilator test response, with bronchiolitis patients being unresponsive. We excluded cases (n = 10) from our analysis if bronchodilator tests were not done to avoid diagnostic classification errors.
To minimize the over-diagnosis of focal (consolidative) pneumonia we defined a study group to reflect focal, lobar pathology by requiring a specialist radiologist to report chest x-rays and ensured the absence of any generalized wheezy conditions. For this group, we used the radiological definitions developed by the WHO to detect bacterial lung disease in post-vaccination surveillance programs (28). We adopted this approach to avoid diagnosing focal (consolidative) pneumonia in cases of generalized LRTDs, such as asthma, bronchiolitis, and pneumonitis, when radiology was inappropriately performed. Assessment guidelines recommend against diagnostic imaging in bronchiolitis and asthma (21,26,55). Despite this, chest x-rays were inappropriately ordered in our study for these conditions, resulting in unnecessary radiation and antibiotic use when pneumonia was mistakenly diagnosed (56)(57)(58)(59).
We used a panel of three experts to provide a diagnostic consensus. Although this is a non-reference standard and one that is not attainable in real-life clinical practice, it provided us with the best achievable diagnostic standard. Studies that have used a single reviewer have been found to lack validity, and even those using two reviewers have shown low agreement (60)(61)(62)(63). Other studies have reported little benefit in using more than three assessors, and many studies have only engaged a single or second independent reviewer (11,64).
An analysis of diagnostic error studies that rely on chart review shows that the clinical data needed to make diagnoses are often missing in cases where an error has been made (65,66). Our panel, who did not have an opportunity to examine the patients or order tests, relied on medical charts and cough recordings. As the E.D. and panel recorded their diagnoses using the same examination findings, this could have led to underestimating the true frequency of diagnostic errors.

CONCLUSION
To the best of our knowledge, this is the first study to report on diagnostic error rates in well-resourced E.D.s for undifferentiated acute childhood respiratory diseases. Although these conditions represent the most common reasons for children to be taken to an ED and account for some of the more serious pediatric disorders, they are frequently misdiagnosed. At the time of discharge from the ED, over 45% of pneumonia, 35% of iURTD, 23% of croup and 33% of bronchiolitis cases had not been correctly diagnosed. The high diagnostic error rate for pneumonia is particularly concerning. Further, our results point to the risks posed to individuals caused by these errors, including the prescription of inappropriate tests and treatments whilst appropriate therapy is delayed or not given. More specifically, study participants received unnecessary antibiotics for bronchiolitis (14.5%), asthma (7.3%) and iURTD (6.9%) and were subjected to ionizing radiation. Incorrect antibiotic use carries implications for both individuals and the broader community in terms of antibiotic resistance and resource allocation.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ramsay Health Care WA|SA (Reference 1501) and the Child and Adolescent Health Service of Western Australia (Reference 2015030EP). Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.