Abdominal Ultrasound and Its Diagnostic Accuracy in Diagnosing Acute Appendicitis: A Meta-Analysis

Background: Acute appendicitis (AA) is a common cause of abdominal pain in emergency departments, and misdiagnosis leads to unnecessary surgery. The present meta-analysis assesses the accuracy of abdominal ultrasound in suspected acute appendicitis in terms of sensitivity, specificity, and post-test odds for positive and negative results. Materials and Methods: A systematic search was conducted in Medline (via PubMed), CINAHL (via EBSCO), Scopus, and Web of Science from 2010 to the end of March 2021. Two authors independently screened studies for inclusion, extracted results, and conducted analyses. Histopathological examination of the tissue collected during appendectomy served as the gold standard for the final diagnosis of appendicitis. Accuracy was determined by evaluating sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic odds ratio. Results: Out of 3,193 references, 18 studies were selected. The overall sensitivity was 77.2% (95% CI 75.4–78.9%) and the overall specificity was 60% (95% CI 58–62%). The diagnostic odds ratio was 6.88 (95% CI 1.99–23.82). Conclusion: Abdominal ultrasound shows significant accuracy of diagnosis in patients with suspected acute appendicitis.


INTRODUCTION
Acute appendicitis (AA) is considered one of the most common causes of surgical emergencies worldwide (1). The reported mortality rate ranges from <1% in younger patients up to 5% in the elderly (2,3). Acute appendicitis is one of the most common causes of abdominal pain, yet 34% of cases (4,5) are still misdiagnosed, which results in unnecessary surgery. This high rate of negative appendectomy can be decreased by careful and accurate diagnosis, which also helps prevent acute appendicitis from progressing to perforation and peritonitis (6).
Abdominal ultrasound (US), computed tomography (CT), and magnetic resonance imaging (MRI) have all been used to identify or exclude AA. Reported sensitivity and specificity for identifying AA are 71 to 92% and 83%, respectively, for US; 98 and 91% for standard contrast-enhanced CT; and 97 and 93% for MRI (7)(8)(9).
Computed tomography (CT) is the preferred diagnostic imaging modality for ruling out AA in the adult population. Its accuracy is high, with sensitivities ranging from 90 to 96% and specificities ranging from 94 to 98%. However, it has limitations, including radiation exposure, the risks of contrast administration, increased resource utilization, high cost (7,8), and the development of future malignancies (9). To avoid these constraints while still limiting negative appendectomy and perforation rates, clinicians often choose abdominal ultrasound (US) as an alternative diagnostic approach in suspected appendicitis, in both children and adults, because it is easy to perform, inexpensive, portable, and has high precision (10).
CT or US did not improve the diagnostic precision of AA (3). Despite its confirmed low diagnostic accuracy, US has been listed as a potential method for diagnosing AA because it does not require radiation. However, even though it is a non-ionizing technique, the question remains whether US can contribute to the management of patients with suspected AA without causing further delays. Patients with abdominal pain who do not have AA are exposed to invasive surgery if the condition is misdiagnosed, which can happen in up to 34% of cases (4,5).

Rationale
When patients with AA are misdiagnosed as not having the condition, a necessary appendectomy may be postponed, and severe complications may occur, with a mortality rate of about 1.5% (2). Delayed or incorrect diagnoses leading to adverse outcomes have also resulted in legal claims against both non-surgical and surgical subspecialties. It is therefore essential to correctly identify AA in patients who exhibit symptoms and signs suggestive of the condition.

Objective
The present study compares the diagnostic accuracy of abdominal ultrasound against histopathology, which is considered the gold standard in acute appendicitis (AA), in terms of sensitivity and specificity for positive and negative US results.

MATERIALS AND METHODS
We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) normative recommendations in this study with the registration number NU#/IRB/2020/1022.

Search Strategy
The present meta-analysis is based on an extensive search conducted in Medline (via PubMed), CINAHL (via EBSCO), Scopus, and Web of Science from 2010 to the end of March 2021. The search used keywords related to diagnostic accuracy, abdominal ultrasound, acute appendicitis, diagnosis, decreased CT use, and ultrasonography. All articles were selected according to PRISMA guidelines. Selection was irrespective of language, publication status, and whether the study was conducted prospectively or retrospectively. Table 1 summarizes the demographic details of the studies included from the search query of the Medline database with the considered variables. The primary focus of the present study was to assess the efficacy of ultrasound in acute appendicitis in all age groups. To assess the effectiveness of ultrasound in acute appendicitis, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic odds ratio were calculated from the true positive (TP), false positive (FP), true negative (TN), and false negative (FN) values.
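The accuracy measures named above follow directly from the four cells of a 2 × 2 table. The following is a minimal sketch, not the authors' code: the class name `TwoByTwo` and the example counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of ultrasound results against the histopathology reference."""
    tp: int  # ultrasound positive, histopathology positive
    fp: int  # ultrasound positive, histopathology negative
    fn: int  # ultrasound negative, histopathology positive
    tn: int  # ultrasound negative, histopathology negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def dor(self) -> float:
        # Diagnostic odds ratio; undefined when fp or fn is zero.
        return (self.tp * self.tn) / (self.fp * self.fn)


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=80, fp=10, fn=20, tn=40)
print(f"sensitivity={example.sensitivity():.3f}")
print(f"specificity={example.specificity():.3f}")
print(f"PPV={example.ppv():.3f}  NPV={example.npv():.3f}")
print(f"DOR={example.dor():.2f}")
```

Note that PPV and NPV depend on the prevalence of appendicitis in the sample, whereas sensitivity and specificity do not, which is one reason the latter two are the measures usually pooled across studies.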
Two authors (JF and XZ) independently screened the sources for relevant studies. Full texts were retrieved for any publication that at least one of the reviewers considered relevant, and these full texts were used to exclude further irrelevant references. Abstracts were only used if they included enough information for the analysis. Two researchers (LC and SL) independently extracted data from the included studies.

Inclusion and Exclusion Criteria
Studies were included if they evaluated the diagnostic accuracy of ultrasound for acute appendicitis in any age group and were published from 2010 to 2021. The histopathology report was the reference standard. Only full-text data were included. Exclusion criteria were insufficient data, a reference standard other than the histopathology report, publication before 2010, and inclusion of pregnant females with acute appendicitis; these exclusions were intended to reduce the risk of bias and heterogeneity.

Evaluation of the Analytical Standard
The Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) (28) was used to determine the methodological quality of the included studies. Two reviewers (JF and XZ) evaluated methodological quality independently, and SL resolved any disagreements between them.

STATISTICAL ANALYSIS
A 2 × 2 table was constructed for each study, from which pooled sensitivity, specificity, and diagnostic odds ratio (DOR) were calculated using the DerSimonian-Laird method. A higher DOR indicates better diagnostic accuracy of the test. Heterogeneity among the included studies was evaluated with the Cochran Q statistic and the I² index. Meta-DiSc software was used to create forest plots. We also presented the data from the individual studies as summary points of sensitivity and specificity in summary receiver operating characteristic (SROC) space, with corresponding 95% confidence regions, created using Review Manager 5 (29).
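To make the pooling step concrete, the following sketch applies DerSimonian-Laird random-effects pooling to the log diagnostic odds ratio and reports the Cochran Q statistic and the I² index. It is an illustration under stated assumptions rather than a reproduction of the software output: the three studies are hypothetical, and the inverse-variance weighting on the log DOR scale and the 0.5 continuity correction for zero cells are common conventions that the text does not specify.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio and its approximate variance for one study."""
    # Assumption: add 0.5 to every cell when any cell is zero.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    log_dor = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return log_dor, variance


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate with Cochran Q, tau^2, and I^2 (%)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Method-of-moments between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "i2": i2}


# Hypothetical studies as (TP, FP, FN, TN); not data from the included studies.
studies = [(45, 12, 10, 30), (60, 20, 15, 25), (30, 5, 12, 40)]
pairs = [log_dor_and_variance(*s) for s in studies]
result = dersimonian_laird([p[0] for p in pairs], [p[1] for p in pairs])

low = math.exp(result["pooled"] - 1.96 * result["se"])
high = math.exp(result["pooled"] + 1.96 * result["se"])
print(f"pooled DOR = {math.exp(result['pooled']):.2f} (95% CI {low:.2f} to {high:.2f})")
print(f"Q = {result['q']:.2f}, I2 = {result['i2']:.1f}%")
```

Pooling is done on the log scale because the log DOR is closer to normally distributed than the DOR itself; the pooled value and its confidence limits are exponentiated at the end.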

Sensitivity Analysis
Excluding participants with ambiguous findings can lead to an overestimation of diagnostic test accuracy. We therefore conducted a sensitivity analysis in which uninterpretable results were included and diagnostic accuracy was re-evaluated. All uninterpretable results were counted as incorrect, and the outcomes were compared with those of the principal analysis, which excluded uninterpretable results.
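One way to read "counted as incorrect" is that an uninterpretable scan in a patient with histologically confirmed appendicitis becomes a false negative, and an uninterpretable scan in a patient without appendicitis becomes a false positive. The sketch below follows that reading; the function names and all counts are hypothetical and serve only to show how the worst-case estimates compare with the principal analysis.

```python
def sens_spec(tp, fp, fn, tn):
    """Return (sensitivity, specificity) for one 2 x 2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def worst_case_sens_spec(tp, fp, fn, tn, unint_diseased, unint_non_diseased):
    """Recompute accuracy with uninterpretable results counted as errors.

    Assumption: uninterpretable scans in diseased patients are added to the
    false negatives, those in non-diseased patients to the false positives.
    """
    return sens_spec(tp, fp + unint_non_diseased, fn + unint_diseased, tn)


# Hypothetical counts, for illustration only.
principal = sens_spec(tp=80, fp=10, fn=20, tn=40)
worst = worst_case_sens_spec(
    tp=80, fp=10, fn=20, tn=40, unint_diseased=5, unint_non_diseased=10
)
print(f"principal analysis: sensitivity={principal[0]:.3f}, specificity={principal[1]:.3f}")
print(f"worst case:         sensitivity={worst[0]:.3f}, specificity={worst[1]:.3f}")
```

Because the uninterpretable results can only add to the error cells, the worst-case estimates are never higher than those of the principal analysis; the size of the gap indicates how sensitive the conclusions are to the handling of ambiguous scans.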

Investigation of Sources of Heterogeneity
We used meta-regression to investigate heterogeneity among the included studies, introducing potential sources of heterogeneity as covariates and fitting a bivariate model. We used the likelihood ratio test to determine whether a covariate had a significant effect on the summary sensitivity and specificity. For all subgroups, a p-value below 0.05 was considered statistically significant. The sources of heterogeneity that we investigated were: full-text publication vs. abstracts, high vs. low risk of bias in included studies, prospective vs. retrospective studies, studies that included only adults vs. those that included mixed adult and pediatric populations, proportion of female participants, proportion of obese patients, type of ultrasound probe, and ultrasonographer experience.
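The likelihood ratio test compares the maximized log-likelihood of the bivariate model without the covariate to that of the model with it. The sketch below shows only this final comparison and assumes the two log-likelihoods are already available from a fitted model; the values used are hypothetical. It also assumes that a covariate adds two parameters (one for sensitivity, one for specificity), giving two degrees of freedom, and it uses closed-form chi-square tail probabilities that hold for one and two degrees of freedom only.

```python
import math


def chi2_survival(statistic, df):
    """Upper-tail probability of a chi-square variable (df of 1 or 2 only)."""
    if statistic < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise ValueError("closed form implemented for df = 1 or df = 2 only")


def likelihood_ratio_test(loglik_null, loglik_alt, df):
    """Return (statistic, p_value) for nested models."""
    # Guard against tiny negative values caused by optimizer tolerance.
    statistic = max(0.0, 2 * (loglik_alt - loglik_null))
    return statistic, chi2_survival(statistic, df)


# Hypothetical log-likelihoods from a model without and with the covariate.
stat, p_value = likelihood_ratio_test(loglik_null=-100.0, loglik_alt=-97.0, df=2)
print(f"LR statistic = {stat:.2f}, p = {p_value:.4f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

For other degrees of freedom a statistics library would be needed for the chi-square tail probability.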

Literature Search Results
Through electronic searches, we found a total of 3,193 studies. We excluded 973 after reading titles and abstracts and 2,035 as invalid references. Of the remaining 185 studies, 131 were excluded as duplicates. Full texts of the remaining 54 publications were retrieved for final screening, of which 36 were excluded based on the inclusion criteria. The meta-analysis therefore included 18 studies that met the inclusion criteria, i.e., that reported the accuracy of abdominal ultrasound in acute appendicitis, as shown in Figure 1. Inappropriate comparison criteria and inadequate data to create 2 × 2 tables were the main reasons for exclusion.
Table 1 shows the demographic details of the studies included in the present meta-analysis: study author, year of publication, study type, study duration, total sample size, age, gender, details of the sonographer, type of US probe used, the sample size in which ultrasound was conducted, the histopathology report that was considered the gold standard for comparison, and any other method of correlation used for diagnosis. A total of 4,209 patients were included in the present meta-analysis. All studies were published as full-text papers; six were prospective, eight were retrospective, and four were cross-sectional. The participants' ages ranged from 14 to 60 years, and the majority of studies provided information about the operator's background (13 out of 18) or the kind of US probe used (12 out of 18).

Risk of Bias Assessment
The 95% confidence intervals of the pooled estimates ranged from 75.4 to 78.9% for sensitivity and from 58 to 62% for specificity. According to the QUADAS-2 tool, all included studies had a low risk of bias (Table 2).

Meta-Analysis Results
The overall sensitivity of the abdominal ultrasound scan in acute appendicitis was 77.2% (95% CI 75.4–78.9%) when correlated to histopathology, as shown in Figure 2.
The overall specificity was 60% (95% CI 58–62%). In the subgroup analysis, clinical probability of acute appendicitis was among the covariates that showed statistically significant effects on summary outcomes (Table 3).
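The pooled sensitivity (77.2%) and specificity (60%) reported above can be converted into likelihood ratios and, for a given pre-test probability, into post-test probabilities. The sketch below shows the arithmetic. The pre-test probability of 50% is a hypothetical value chosen for illustration, and deriving likelihood ratios from separately pooled sensitivity and specificity is an approximation rather than a result reported by the study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity as proportions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled estimates from this meta-analysis.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.772, specificity=0.60)

# Hypothetical pre-test probability of acute appendicitis.
pre_test = 0.50
p_after_positive = post_test_probability(pre_test, lr_pos)
p_after_negative = post_test_probability(pre_test, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"post-test probability after a positive scan: {p_after_positive:.3f}")
print(f"post-test probability after a negative scan: {p_after_negative:.3f}")
```

A positive likelihood ratio close to 1 shifts the probability of disease only modestly, which is consistent with the low pooled specificity.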

DISCUSSION
Definitive diagnosis of acute appendicitis has always been challenging because its non-specific symptoms, signs, and laboratory findings can mimic several other pathologies (30). It can be misdiagnosed, especially in young women, children, and elderly patients. It is also one of the most common reasons for emergency abdominal surgery. To reduce the negative appendectomy rate, computed tomography (CT) is considered the gold standard for preoperative diagnosis of acute appendicitis, and preoperative imaging with CT has lowered negative appendectomy rates (NARs) to 1.7% (31,32). However, CT exposes patients to ionizing radiation, is expensive and time-consuming, and has its own diagnostic insufficiencies (33).
The present meta-analysis was an effort to assess the efficacy of abdominal ultrasound in diagnosing suspected acute appendicitis in all age groups. It was a systematic update covering 2010 to 2021, and a total of 18 articles were selected to evaluate sensitivity, specificity, PPV, NPV, and diagnostic odds ratio. When correlated with histopathology, the present analysis showed an overall sensitivity of 77%, with a 95% CI of 75 to 79%. Sensitivity in the individual studies varied widely, from 50 to 100% (95% CI 41–100%). The overall specificity was 60%, with a 95% CI of 58 to 62%. Specificity in the individual studies also varied widely, from 0 to 97% (95% CI 0–98%). These results can be compared with those of previous studies.
Doria et al. (33) compared CT and ultrasound in pediatric and adult populations, with surgery or follow-up as the gold standard. In the adult population, the combined sensitivity and specificity were 83 and 93%, respectively. Giljaca et al. (34) reported a sensitivity of 69% and a specificity of 81%, which differ from the present study. The present study showed a higher sensitivity than Giljaca et al. (34), indicating a better ability to identify patients with acute appendicitis. Another similar meta-analysis by Orr et al. (35) showed sensitivity and specificity of 84.7 and 92.1%; the specificity reported by Orr et al. was much higher than in the present meta-analysis, indicating a greater ability to identify patients without acute appendicitis. Yu et al. (37) reviewed only Korean papers. Although most of the included participants were followed up, surgery and histopathology do not seem to have been the reference standard. US had a sensitivity and specificity of 86.7 and 90.0%, respectively.
van Randen et al. (38) directly compared CT and US; however, surgery was not the reference standard in all patients, and some were followed up without surgery. US had a sensitivity and specificity of 78 and 83%, respectively. Carroll et al. (39) evaluated US performed by surgeons against histopathology or US performed by a radiologist, and reported a sensitivity and specificity of 92 and 96%, respectively.
Only the histopathology report of the surgical specimen served as the reference standard in our research. As a result, our study's sensitivity and specificity are much lower than previously reported. This disparity may result from the strictly applied reference standard, under which all patients underwent surgery. In addition, this may lead to an underestimation of sensitivity in our sample because the patient group was more chronically ill.
A limitation of the present study is the variability in the type of sonographer, since skilled and experienced radiologists can reduce the chance of false-negative results. The diagnostic accuracy of US could also be compared with that of other imaging methods to assess the variability. Future analyses of studies that add Color Doppler to the ultrasound examination, or that combine ultrasound with scoring systems based on the patient's history, physical examination, and laboratory tests, may further improve the diagnostic accuracy of ultrasound in acute appendicitis.

CONCLUSION
Although imaging with CT has significantly lowered negative appendectomy rates, its high cost, ionizing radiation exposure, and complexity of interpretation make ultrasound an efficient diagnostic aid, mainly in suspected cases in children, young females, and elderly patients. In addition, ultrasound is a simple, non-invasive, non-ionizing technique, and its easy availability makes it an effective diagnostic alternative for reducing the rate of unnecessary surgeries in acute appendicitis.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
JF: conceived and designed the study. XZ: analyzed the data and drafted the manuscript. LC: collected the data and helped with data analysis. SL: proofread and edited the final manuscript and is its guarantor. All authors read and approved the final version of the manuscript.