Use of Artificial Intelligence to Improve the Quality Control of Gastrointestinal Endoscopy

With the rapid development of science and technology, artificial intelligence (AI) systems are becoming ubiquitous, and their utility in gastrointestinal endoscopy is beginning to be recognized. Digestive endoscopy is a conventional and reliable method of examining and diagnosing digestive tract diseases. However, as the number and types of endoscopic procedures have increased, problems such as a shortage of skilled endoscopists and differences in skill among doctors with different degrees of experience have become increasingly apparent. Most studies thus far have focused on using computers to detect and diagnose lesions, but improving the quality of the endoscopic examination process itself is the basis for improving the detection rate and correctly diagnosing diseases. In the present study, we review the role of AI in monitoring systems, mainly in monitoring the endoscopic examination time, reducing the blind spot rate, improving the success rate for detecting high-risk lesions, evaluating intestinal preparation, increasing the detection rate of polyps, and automatically capturing images and writing reports. AI can even perform quality control evaluations for endoscopists, improve the detection rate of endoscopic lesions and reduce the burden on endoscopists.


INTRODUCTION
Artificial intelligence (AI) is a new and powerful technology. In contrast to machines, the human brain may make mistakes in long-term work due to fatigue and stress, among other distractions; AI technology can therefore compensate for the limited capabilities of humans. Over the past few decades, AI has received increasing attention in the field of biomedicine. A multidisciplinary meeting was held on September 28, 2019, where academic, industry and regulatory experts from different fields discussed technological advances in AI in gastroenterology research and agreed that AI will transform the field of gastroenterology, especially in endoscopy and image interpretation (1). In fact, there are many cases of missed lesion detection due to low-quality endoscopy, which can be greatly reduced with the help of AI.
Thus far, AI has mainly been applied to the field of endoscopy in two aspects: computer-aided detection (CADe) and computer-aided diagnosis (CADx) (2). Although many of the advantageous features of AI seem promising for routine endoscopy, endoscopy still depends heavily on the technical skills of the endoscopist. Improving the quality of endoscopy is thus needed to improve the detection rate and ensure the correct diagnosis of diseases.
In this review, we summarize the literature on AI in gastrointestinal endoscopy, focusing on the role of AI in monitoring (Figure 1): monitoring the endoscopy time, reducing endoscopic blind spots, improving the success rate of high-risk lesion detection, evaluating bowel preparation, increasing the polyp detection rate, and automatically taking pictures and writing reports. The goal is to improve the quality of daily endoscopy and to make AI a powerful assistant to endoscopists in the detection and diagnosis of disease.

Terms Related to AI
In recent years, the proliferation of AI-based applications has rapidly changed the way we work and live. AI refers to the ability of a machine or computer to learn and solve problems by imitating the human mind with human-like cognition and task execution (3).
Machine learning (ML) and deep learning (DL) can be considered subsets of AI. Machine learning is a fundamental concept in AI, which can be described as the study of computer algorithms that are automatically improved through training and practice over time (4). This approach requires human input of meaningful image features into a trainable prediction algorithm, such as a classifier (5). Deep learning (DL) is a transformative machine-learning technique that enables transfer learning, where parameters in each layer are changed based on representations in previous layers, and can be effectively applied even when the new task has a limited training data set (6).
Artificial neural networks (ANNs) are supervised models that are very similar to the organization of the human central nervous system. Convolutional neural networks (CNNs) are an even more advanced digital DL technique widely used in image and pattern recognition. CNNs are similar to the human brain in their approach to thinking and use large image data sets for learning. Usually, the data set is divided randomly, and a subset is reserved for cross-validation (7).
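The random division of a data set described above can be illustrated with a minimal sketch. It assumes a single hold-out split (k-fold cross-validation would repeat the same idea over several folds); the function name `split_dataset`, the 20% validation fraction and the image identifiers are all hypothetical and are not taken from any of the cited studies.

```python
import random


def split_dataset(items, validation_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and reserve a fraction for validation."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(items)
    # A seeded generator makes the random split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    # The first n_validation items are held out; the rest are used for training.
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical image identifiers standing in for labeled endoscopic images.
image_ids = [f"img_{i:04d}" for i in range(100)]
train_ids, validation_ids = split_dataset(image_ids)
```

Keeping the held-out subset disjoint from the training subset is what allows the reported accuracy to reflect images the model has not seen.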

Identifying Anatomy
For upper gastrointestinal endoscopy, the European Society of Gastrointestinal Endoscopy (ESGE) has proposed the collection of images of eight specific upper gastrointestinal (UGI) landmarks (8), and several similar classification methods have been developed. AI has proven useful for identifying and labeling anatomical sites of the upper digestive tract. Takiyama et al. designed a CNN to identify the anatomical location of esophagogastroduodenoscopy (EGD) images. They collected 27,335 EGD images for training and divided them into four main anatomical parts (larynx, esophagus, stomach and duodenum) with three sub-classifications of the stomach (upper, middle and lower). The accuracy rate was found to be 97%, but the clinical application was limited (9). The WISENSE AI system designed by Wu et al. classified 26 EGD sites and monitored blind spots in real time through reinforcement learning, achieving an accuracy rate of 90.02% and making significant progress toward real-time monitoring (10,11). Seong Ji Choi et al. developed an AI-driven quality control system for EGD using CNNs with 2,599 retrospectively collected and labeled images obtained from 250 EGD procedures. The developed model classified the EGD images into 8 locations with an accuracy of 97.58% and a sensitivity of 97.42% (12).
In the lower digestive tract, an AI system can automatically identify the cecum and monitor the speed of endoscopic withdrawal. Samarasena et al. developed a CNN that can automatically detect equipment during endoscopy, such as snares, forceps, argon plasma coagulation catheters, endoscopic auxiliary equipment, anatomical caps, clamps, dilating balloons, rings and injection needles. The accuracy, sensitivity and specificity of the CNN in detecting these devices were 0.97, 0.95 and 0.97, respectively (13). Building on this device recognition, the AI system can further help accurately measure the size of a polyp and aid the endoscopist in quickly determining whether to leave it in place or remove and discard it. Karnes et al. developed a CNN to automatically identify the cecum (13), and the ENDOANGEL is further able to monitor the withdrawal speed and the intubation and withdrawal times and to alert the endoscopist to blind spots caused by endoscope slipping (14). Identifying the anatomical parts of the digestive tract and accurately classifying them can help inexperienced endoscopists correctly locate the examination site as well as reduce the blind spot rate.

Reducing the Blind Spot Rate of Endoscopy
Gastric and esophageal cancers are common cancers of the digestive tract but can easily be missed during endoscopy, especially in countries where the incidence of the disease is low and training is limited. The 5-year survival rate of gastric cancer is highly correlated with the stage of the cancer at the time of diagnosis, so it is very important to improve the detection rate of early gastrointestinal (GI) cancer. Some blind spots in the gastric mucosa, such as the antrum and the lesser curvature of the fundus, may be hidden from the endoscopist, and whether they are examined depends to a large extent on the competence of the endoscopist.
To reduce the blind spot rate of EGD procedures, Wu et al. built a real-time quality improvement system known as WISENSE. After training on 34,513 stomach images, the system detected blind spots in real EGD videos with an accuracy of 90.40%. In a single-center randomized controlled trial, the blind spot rates of the WISENSE group and the control group were 5.86 and 22.46%, respectively, indicating a significant reduction in the blind spot rate with WISENSE. In addition, WISENSE can automatically create photo files, thus improving the quality of daily endoscopy (10).
In a prospective, single-blind, randomized controlled trial, 437 patients were randomly assigned to unsedated ultrathin transoral endoscopy (U-TOE), unsedated conventional esophagogastroduodenoscopy (c-EGD) or sedated c-EGD, and each group was divided into two subgroups according to the presence or absence of assistance from an AI system. Among all groups, the blind spot rate in the AI-assisted group was 3.42%, which was much lower than that in the control group (22.46%), and the addition of AI had the greatest effect on the sedated c-EGD group (11).
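As an illustration of how a blind spot rate could be derived from per-frame site predictions, the following minimal sketch assumes that the rate is the percentage of predefined anatomical sites never observed during a procedure, using the 26 EGD sites mentioned earlier as the default. The function name `blind_spot_rate` and the example predictions are hypothetical; the cited systems may define and compute the rate differently.

```python
def blind_spot_rate(observed_sites, total_sites=26):
    """Return the percentage of predefined sites that were never observed.

    `observed_sites` is any iterable of site indices (0 .. total_sites - 1),
    for example one predicted site per video frame. This definition is an
    assumption made for illustration only.
    """
    # Keep each valid site once; repeated frames of the same site do not count twice.
    observed = {s for s in observed_sites if 0 <= s < total_sites}
    return 100.0 * (total_sites - len(observed)) / total_sites


# Hypothetical per-frame predictions from a site classifier.
frame_predictions = [0, 1, 1, 2, 5, 5, 7, 2, 11, 25]
rate = blind_spot_rate(frame_predictions)  # 7 distinct sites seen, 19 of 26 missed
```

A real-time system could recompute this value as each frame is classified and prompt the endoscopist about the sites still missing.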

Guided Biopsy
Squamous cell carcinoma of the pharynx and esophagus is a common disease, and one randomized controlled study indicated that the specificity for detecting esophageal carcinoma was no more than 42.1%, while the sensitivity was only 53% for inexperienced physicians (15,16). Seattle protocols and evolving imaging technologies can assist in the diagnosis, but some issues remain, such as the need for expert handling, a low sensitivity and sampling errors (17,18).
The American Society for Gastrointestinal Endoscopy endorses the use of advanced imaging technology to switch from a random biopsy to a targeted biopsy under certain circumstances: imaging techniques with targeted biopsies for detecting high-grade dysplasia (HGD) or early esophageal adenocarcinoma (EAC) should achieve ≥90% sensitivity, negative predictive values of ≥98% and sufficiently high specificity (80%) to reduce the number of biopsies (19). However, this requires a long learning period, and only experienced endoscopists can reach this level.
An AI system can help endoscopists switch from a random biopsy to a targeted biopsy and improve the detection rate of endoscopic lesions without the need for complicated training procedures. To improve the detection of early esophageal tumors, de Groof et al. validated a DL-based CADe system using five independent datasets. The CAD system classified images as neoplasms or non-dysplastic BE with 89% accuracy, 90% sensitivity and 88% specificity. In addition, in 2 other validation datasets, the system accurately located the best location for biopsy in 97 and 92% of cases (20). The CNN constructed by Shichijo et al. was used for Helicobacter pylori detection by classifying the anatomical parts of the stomach (21,22). The sensitivity, specificity and accuracy were increased compared with endoscopists, improving the choice of the biopsy location (21,23).
Traditionally, a biopsy has been used to assess the nature of lesions. However, CADx systems can help predict histology, even in the absence of biopsy. Endocytoscopy is a contact microscopy procedure that allows for the real-time assessment of cell, tissue and blood vessel atypia in vivo. EndoBRAIN, a combination of endocytoscopy and narrow-band imaging (NBI), is a platform for performing automated optical biopsies that was validated and evaluated on 100 images of colorectal lesions resected endoscopically and subjected to pathology; the EndoBRAIN system showed an accuracy of 90% (24). Using laser-induced autofluorescence spectroscopy, which incorporates optical fibers into standard biopsy forceps and triggers upon contact, the WAVSTAT4 system provides a real-time, in vivo automatic optical biopsy of colon polyps. When validated prospectively in 137 polyps, the accuracy of the WAVSTAT4 system was found to be 85% (25). The use of CADx systems can help reduce variability among observers, thereby improving standardization and enabling wider adoption by less-experienced endoscopists (26).

Determining the Depth and Boundary of Gastric Cancer Invasion
Gastric cancer is a common cancer of the digestive tract, and early detection is particularly important. However, an early endoscopic diagnosis is difficult, as most early gastric cancers show only a slight depression or bulge with a faint red color. Predicting the depth of infiltration of the gastric wall is a difficult task, and making an optical diagnosis using image enhancement techniques such as flexible spectral imaging color enhancement (FICE) or blue-laser imaging (BLI) has proven useful, provided that the endoscopist has a great deal of expertise. AI can help compensate when endoscopists have too little experience (27).
To investigate the depth of esophageal squamous cell carcinoma (ESCC) invasion, two Japanese research groups each developed and trained a CADx system. The sensitivity and accuracy of the system studied by Nakagawa et al. for distinguishing pathological mucosal and submucosal microinvasive carcinoma from submucosal deep invasive carcinoma were 90.1 and 91.0%, respectively, and the specificity was 95.8%. The system was compared with 16 experienced endoscopic specialists, and its performance was shown to be comparable (28). Tokai et al.'s CADx system detected 95.5% of ESCCs (279/291) in the test images within 10 s and correctly estimated the depth of infiltration with a sensitivity of 84.1% and an accuracy of 80.9%, which was better than the accuracy of 12 of the 13 endoscopic experts (29). Kubota et al. developed a CADx model for diagnosing the depth of early gastric cancer invasion on gastroscopic images. About 800 images were used for computer learning, and the overall accuracy rate was 64.7%. The diagnostic accuracy rates for the T1, T2, T3 and T4 stages were 77.2, 49.1, 51.0 and 55.3%, respectively (30). Zhu et al. designed a CNN algorithm using 790 endoscopic images for training and another 203 for verification to assess the depth of invasion of gastric cancer. The accuracy of the system was 89.2%, the sensitivity was 74.5%, and the specificity was 95.6% (31).
Using magnified NBI images, Kanesaka et al. developed a CADe tool that can be used for detection as well as for delineating the border between cancerous and non-cancerous gastric lesions, with 96.3% accuracy, 96.7% sensitivity and 95% specificity (32). Miyaki et al. developed a support vector machine (SVM)-based analysis system for the quantitative identification of gastric cancer together with BLI endoscopy. The training set consisted of 587 images of gastric cancer and 503 images of surrounding normal tissue, and the validation set consisted of 100 EGC images from 95 patients. These images were all obtained by BLI magnification using the laser endoscopy system. The results showed that the average SVM output value of cancerous lesions was 0.846 ± 0.220, that of red lesions was 0.381 ± 0.349, and that of the surrounding tissue was 0.219 ± 0.277. The SVM output value of cancerous lesions was significantly greater than that of the red lesions or surrounding tissue. The mean output of undifferentiated cancer was greater than that of differentiated cancer (33). Ito et al. developed an endoscopic CNN to distinguish the depth of invasion of malignant colon polyps. The sensitivity, specificity and accuracy of the system for the diagnosis of deep invasion (cT1b) were 67.5, 89.0 and 81.2%, respectively. The use of a computer-assisted endoscopic diagnostic support system allows for a quantitative diagnosis to be made without relying on the skills and experience of the endoscopist (37).

Identifying and Characterizing Colorectal Lesions
The use of AI systems as clinical adjunct support devices allows for more extensive use of "leave in place" and "remove and discard" strategies for managing small colorectal polyps. Chen et al. developed a CADx system with a DNN-CAD for classifying colorectal polyps smaller than 5 mm in size as neoplastic or hyperplastic. The training set consisted of 1,476 images of neoplastic polyps and 681 images of hyperplastic polyps, and the test set consisted of 96 images of hyperplastic polyps and 188 images of small neoplastic polyps. The system achieved 96.3% sensitivity, 78.1% specificity and 90.1% accuracy in differentiating neoplastic from hyperplastic polyps. The DNN-CAD system was able to classify polyps more quickly than either specialists or non-specialists (38).
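The relationship between the three reported figures can be checked with a short worked example. The sketch below treats neoplastic polyps as the positive class and uses confusion-matrix counts that were back-calculated from the reported percentages and test set sizes (181 of 188 neoplastic and 75 of 96 non-neoplastic images correctly classified); these counts are an assumption for illustration and are not stated in the text.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of positives correctly identified
    specificity = tn / (tn + fp)                 # share of negatives correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # share of all cases correctly identified
    return sensitivity, specificity, accuracy


# Assumed counts, inferred from the reported test set (188 neoplastic, 96 non-neoplastic).
sens, spec, acc = diagnostic_metrics(tp=181, fn=7, tn=75, fp=21)
```

With these assumed counts, the three values are consistent with the reported 96.3% sensitivity, 78.1% specificity and 90.1% accuracy, which shows how overall accuracy is weighted toward the larger neoplastic class.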

Automated Assessment of Bowel Cleansing
The adenoma detection rate (ADR) is a widely accepted measure of the quality of colonoscopy, defined as the percentage of patients who have at least one adenoma detected during colonoscopy performed by an endoscopist. The ADR is negatively correlated with the risk of interval colorectal cancer, and there is a strong positive correlation between the quality of bowel preparation and the colon ADR. A variety of tools have been developed to assess bowel preparation, such as the Boston Bowel Preparation Scale (BBPS) and the Ottawa Bowel Preparation Scale, but these are subject to subjective bias and to differences among endoscopists. Bowel preparation is another indicator that can be automatically evaluated by AI, with good results achieved. A proof-of-concept study using AI models to evaluate quality measures such as the mucosal surface area and bowel preparation score examined the sufficiency of colonic dilation and clarity of endoscopic views (39). Another study used a deep CNN to develop a novel system called the ENDOANGEL to evaluate bowel preparation. The ENDOANGEL ultimately achieved 93.33% accuracy in 120 images and 89.04% in 20 real-time inspection videos, which is higher than the accuracy rate of the endoscopists consulted for the study. The accuracy rate in 100 images with bubbles also reached 80.00% (40).
The software program developed by Filip et al. to provide feedback on the quality of colonoscopy works in three ways: measuring the sharpness of the image from the video in real time, assessing the withdrawal speed and determining the degree of bowel preparation. Fourteen screening colonoscopy videos were analyzed, and the results were compared with those of three gastroenterology experts. Across all colonoscopy video samples, the median quality ratings for the automated system and reviewers were 3.45 and 3.00, respectively. In addition, the better the endoscopist's withdrawal speed score, the higher the automated overall quality score (41).
In a recent study, Gong et al. (42) established a real-time intelligent digestive endoscopy quality control system capable of retrospectively analyzing endoscopy data and helping endoscopists understand inspection-related indicators, such as the inspection time, blind spot rate, ADR and bowel preparation success rate. The report can be generated automatically, and these data can be further used to analyze trends in the detection rate of colonoscopic adenomas and precancerous lesions, helping endoscopists to identify their own shortcomings and make improvements.
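The ADR definition given above translates directly into a small calculation. The following sketch assumes that each colonoscopy is summarized by the number of adenomas found in that patient; the function name `adenoma_detection_rate` and the example counts are hypothetical.

```python
def adenoma_detection_rate(adenomas_per_patient):
    """Percentage of colonoscopies in which at least one adenoma was detected."""
    if not adenomas_per_patient:
        raise ValueError("at least one procedure is required")
    # A patient counts once, no matter how many adenomas were found.
    with_adenoma = sum(1 for n in adenomas_per_patient if n >= 1)
    return 100.0 * with_adenoma / len(adenomas_per_patient)


# Hypothetical adenoma counts for ten screening colonoscopies by one endoscopist.
counts = [0, 2, 0, 1, 0, 0, 3, 0, 1, 0]
adr = adenoma_detection_rate(counts)  # 4 of 10 patients had at least one adenoma
```

Because the measure counts patients rather than lesions, finding a second or third adenoma in the same patient does not raise the ADR.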

Identifying and Characterizing UGI Tract Lesions
Advanced esophageal and gastric cancers often have a poor prognosis, so early upper gastrointestinal (UGI) endoscopic detection is especially important. In Europe, the missed diagnosis rate for UGI cancers has been reported to range from 5 to 11%, while the rate for early-stage Barrett's neoplasia has been reported to be as high as 40% (43). AI systems could help endoscopists detect upper digestive tract tumors and improve the detection rate. However, these systems are still experimental in design, and there is still uncertainty about their clinical applicability.
In order to explore the diagnostic performance of AI in detecting and characterizing UGI tract lesions, Julia Arribas et al. searched relevant databases before July 2020 and analyzed and evaluated the comprehensive diagnostic accuracy, sensitivity and specificity of AI. According to the meta-analysis, the AI system showed high accuracy in detecting UGI tumor lesions, and its high performance covered all ranges of UGI tumor lesions [including esophageal squamous cell neoplasia (ESCN), Barrett's esophagus-related neoplasia (BERN), and gastric adenocarcinoma (GCA)]. The sensitivity of AI to detect UGI tumors was 90%, the specificity was 89%, and the total AUC was 0.95 (CI 0.93-0.97) (43).
Leonardo Frazzoni et al. evaluated the accuracy of endoscopists in identifying UGI tumors using the AI validation research framework, with an AUC of 0.90 for ESCN (95% CI 0.88-0.92) and 0.86 for BERN (95% CI 0.84-0.88). The results showed that the accuracy of endoscopists in identifying UGI tumors was not particularly good and suggested that AI validation studies could be used as a framework for evaluating endoscopists' capabilities in the future (44).
In order to explore the clinical applicability of AI in improving the detection rate of early esophageal cancer, we designed a prospective, randomized, single-blind, parallel controlled trial to evaluate the effectiveness of the AI system ENDOANGEL in improving the detection of high-risk lesions in the esophagus (Figure 2). ENDOANGEL is an AI model based on a deep learning algorithm that recognizes and flags high- and low-risk esophageal lesions under NM-NBI. It outlines the range of suspicious lesions with a prompt box and gives a risk rating. We hope ENDOANGEL can increase the detection rate of high-risk esophageal lesions during electronic esophagogastroscopy. At present, this clinical study is in progress. In the early stage, we used a large number of gastroscopy videos of high-risk esophageal lesions to train the model. In the pre-experimental stage, we found that the model made misjudgments at the cardia; that is, the dentate line was mistaken for a lesion and framed. To reduce the misjudgment rate, we trained the model further, and this problem has been much improved after learning. At the same time, as in other studies, this model occasionally mistakes bubbles and mucus for lesions. For now, AI is not perfect, but as with the problem encountered in this experiment, deeper learning and continuous training should gradually decrease the error rate and ensure a high correct detection rate.

CONCLUSION
In gastrointestinal endoscopy, computer-aided detection and diagnosis have made some progress. Table 1 summarizes the key research on the diverse functions of AI in gastrointestinal endoscopy. At present, CADe and CADx have helped endoscopists improve detection rates for many diseases, but there are still many limitations to their implementation and use. First, research on AI is still in the early stages, and static images are usually used to validate computer-aided models. Most of these studies are retrospective, and prospective trials are lacking. Second, computer-aided endoscopy systems are often plagued by false positives caused by air bubbles, mucus, feces and exposure artifacts. Third, most of these systems are developed and designed by a single institution for use in certain patient groups, so their extension to other populations may be difficult. However, it is undeniable that the prospects for the auxiliary application of AI in GI endoscopy are bright. In remote or underdeveloped areas, the quality of endoscopy is difficult to guarantee, and the skills of endoscopists grow slowly. Computer-aided examination can help reduce the high rates of missed diagnosis and misdiagnosis. It is worth noting that AI systems cannot completely replace endoscopists, even with further improvements in the future.
Most current AI systems are tested for specific diseases in specific areas. In the future, we expect that AI can improve the detection rate of a variety of digestive tract diseases in gastrointestinal examination, and serve clinical work better as a quality control system.

AUTHOR CONTRIBUTIONS
All authors contributed to the writing and editing of the manuscript and approved the submitted version.