Clinical Applicable AI System Based on Deep Learning Algorithm for Differentiation of Pulmonary Infectious Disease

Objective: To assess the performance of a novel deep learning (DL)-based artificial intelligence (AI) system in classifying computed tomography (CT) scans of pneumonia patients into different groups, and to present an effective, clinically relevant machine learning (ML) system based on medical image identification and clinical feature interpretation to assist radiologists in triage and diagnosis. Methods: The 3,463 CT images of pneumonia used in this multi-center retrospective study were divided into four categories: bacterial pneumonia (n = 507), fungal pneumonia (n = 126), common viral pneumonia (n = 777), and COVID-19 (n = 2,053). We used image-based DL methods to distinguish pulmonary infections. An ML model for risk interpretation was developed using key imaging features (learned by the DL methods) and clinical features. The algorithms were evaluated using the areas under the receiver operating characteristic curves (AUCs). Results: The median AUCs of the DL models for differentiating pulmonary infections were 99.5% (COVID-19), 98.6% (viral pneumonia), 98.4% (bacterial pneumonia), and 99.1% (fungal pneumonia). By combining chest CT results and clinical symptoms, the ML model performed well, with an AUC of 99.7% for SARS-CoV-2, 99.4% for common virus, 98.9% for bacteria, and 99.6% for fungus. Regarding clinical feature interpretation, the model revealed distinctive characteristics associated with specific types of pneumonia: ground-glass opacity (GGO) with COVID-19 [92.5%; odds ratio (OR), 1.76; 95% confidence interval (CI): 1.71–1.86]; larger lesions in the right upper lung (75.0%; OR, 1.12; 95% CI: 1.03–1.25) with viral pneumonia; older age (57.0 years ± 14.2; OR, 1.84; 95% CI: 1.73–1.99) with bacterial pneumonia; and consolidation (95.8%; OR, 1.29; 95% CI: 1.05–1.40) with fungal pneumonia. Conclusion: Our AI system has shown promising results for classifying common types of pneumonia and assessing the influential factors for triage. Our ultimate goal is to assist clinicians in making quick and accurate diagnoses, creating the potential for early therapeutic intervention.


INTRODUCTION
Pneumonia is a leading cause of death, with mortality among older individuals (70 years) increasing by 33.6 percent between 2007 and 2017 (1). Bacterial, viral, fungal, and parasitic pneumonia are the four types of pneumonia (2), each of which requires different treatment and has a different prognosis. Rapid pathogen detection and identification are critical for guiding prompt and successful pneumonia therapy, resulting in faster clinical benefit, fewer complications, and lower hospital costs. Existing pathogen testing methods for pneumonia have various flaws, including low sensitivity and accuracy, long wait times, and high labor costs. Non-specific medications, such as broad-spectrum antibiotics, might worsen illness and raise hospital expenses (3). More effective diagnostic methods with improved accuracy are required to reduce over-treatment.
Computed tomography (CT) plays an important role in the diagnosis of pneumonia. In the absence of a pathogen-specific imaging presentation, identifying pneumonia pathogens early and precisely is a major challenge (4). Because the imaging signs of different types of pneumonia are similar, it is difficult for radiologists to identify and distinguish them with the naked eye. Furthermore, inter-rater variability among radiologists may lead to conflicting conclusions. Artificial intelligence (AI) technologies, particularly deep learning (DL), offer a promising solution for medical image interpretation, rapid identification, and classification, which can not only avoid variability between doctors but also achieve higher diagnostic accuracy rapidly and automatically. Recent work using AI for the automated diagnosis of pneumonia has also yielded promising results (5–8). In pediatric chest X-rays, DL was used to identify and discriminate between bacterial and viral pneumonia (9,10). Other studies (5,11) used CT images to build DL models to identify COVID-19 and distinguish it from community-acquired pneumonia (CAP) and other lung diseases. However, because these studies were designed to focus solely on COVID-19 and normal CT, additional pneumonia manifestations such as bacterial pneumonia were not examined, and this setting does not reflect the real-world situation. Furthermore, these studies only looked at the imaging manifestations of pneumonia and ignored the accompanying clinical factors, although CT in conjunction with clinical presentation can achieve a high detection rate. Moreover, these approaches do not provide an interpretative study of the factors learned by the model, so the resulting prediction models may not be useful in guiding early and quick identification of various pulmonary infections. Some studies (5,9–11) utilized class activation maps (12), a type of heat map overlaid on CT scans to indicate the areas that are important for model predictions. Although intuitive, these heat maps do not offer radiologists useful information for describing features or interpreting fundamental clinical indications.
CT characteristics, also referred to as key imaging features or clinical indicators, include the number, location, and extent of different pulmonary lesions, such as ground-glass opacity (GGO) and consolidation. In recent studies of COVID-19 pneumonia, some of these CT characteristics, such as lesions, have been used to monitor disease progression (13). Others, such as lesion location, were found to be risk factors for poor outcome (14). Although machine learning-based algorithms have already made accurate and automated quantification of these CT characteristics possible, few studies have tried to help radiologists understand the predictions produced by such systems.
In this retrospective study, we aimed to develop and validate a CT-based DL system to classify pneumonia patients into four pathogenic types: common virus, bacteria, fungus, and SARS-CoV-2. This method should facilitate faster diagnosis and, subsequently, more suitable treatment for pneumonia patients. Furthermore, we extracted a range of quantitative CT features or clinical indications, such as lesion number and location. To help radiologists interpret CT scans of pneumonia patients, we evaluated the relative importance of each imaging feature in determining the pathogenic source of pneumonia in a standard machine learning (ML) classifier.

Patient Cohort and Data Collection
The ethics committees approved this multi-center retrospective study, and written informed consent was waived because the data used for system development were de-identified by removing personal information. Patients with respiratory symptoms suggestive of pulmonary infection (fever, cough, and sputum production) who underwent chest CT scanning and received laboratory confirmation of the underlying pathology of pneumonia (SARS-CoV-2, common virus, bacterium, or fungus) were enrolled in this research. The four pathogens of pneumonia were identified using reverse transcriptase-polymerase chain reaction (RT-PCR) and by culture and microscopic inspection of sputum, blood, or lung tissue samples. From January 2011 to February 2020, we gathered 7,487 anonymized lung CT images from 2,195 individuals using these initial criteria. We then excluded individuals who had previously undergone thoracic surgery, had severe tuberculosis (TB), or had no radiological indications of pneumonia. We also excluded individuals whose CT images had respiratory artifacts, fewer than three slices, or a slice thickness of more than 3 mm.
Finally, a total of 1,431 patients from three institutions were included in this study to establish the classification system. Figure 1 presents the inclusion and exclusion criteria as a flowchart. To evaluate the robustness of our AI system in various clinical settings, the CT data used in this study came from a range of vendors, including Toshiba Medical Systems, Japan; GE Healthcare, USA; United Imaging, China; and Siemens Healthineers, Germany. All CT scans were obtained at a resolution of 512 × 512, with slice spacing ranging from 0.625 to 3 mm in the axial direction. A tube voltage of 120 kVp was used for CT examinations. Automated tube current modulation was used to control the tube current (30-70 mAs). The examinations were carried out in helical mode with a helical pitch of 0.8125-0.984 mm.

Overview of the AI System
We proposed a four-pathogen classification AI system for pneumonia that takes CT images as input and explains the interactions between the factors learned by the model (images and clinical records) to help clinicians make accurate and efficient predictions (Figure 2). The proposed classifier consists of three tasks: the first two were trained using a deep learning model (DL system) with a convolutional neural network (CNN) based on the PyTorch framework [(15); Figure 2B], and the third used a machine learning method (ML system) (Figure 2D). Based on radiologists' recommendations, the first two CNN classifiers were developed as a bi-classifier for differentiating viral from non-viral pneumonia and a quad-classifier for the four pathogenic types. The data were split into training, validation, and test sets in an 8:1:1 ratio, while the third task combined image and medical record information to explain the clinical indicators.
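The 8:1:1 partition can be illustrated with a minimal sketch. It assumes the split is made at the patient level, so that scans from one patient never fall into more than one set; the source does not say whether the split was made by patient or by scan. The function name `split_patients` and the integer patient IDs are hypothetical.

```python
import random


def split_patients(patient_ids, ratios=(8, 1, 1), seed=0):
    """Shuffle unique patient IDs and split them into train/val/test by ratio."""
    ids = sorted(set(patient_ids))      # de-duplicate and fix the starting order
    rng = random.Random(seed)           # local generator for reproducibility
    rng.shuffle(ids)
    total = sum(ratios)
    n_train = len(ids) * ratios[0] // total
    n_val = len(ids) * ratios[1] // total
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]        # the remainder goes to the test set
    return train, val, test


# Illustration only: 1,431 placeholder IDs standing in for the study cohort.
train_ids, val_ids, test_ids = split_patients(range(1431))
print(len(train_ids), len(val_ids), len(test_ids))
```

Splitting by patient rather than by image is a design choice made here to avoid leakage between sets; it is not a statement about how the original pipeline was implemented.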

Construction of the Deep Learning System
The pipeline of our deep learning-based system included four key components: (a) an abnormal-slice identification model (normal or abnormal), (b) a segmentation model that segmented the lung lobes and the contours of the lesions, (c) a classification model that investigated multiple indicators of pneumonia and differentiated the types (bi-classifier or quad-classifier), and (d) a voting model that merged the slice-wise CT scores to generate a patient-level CT volume prediction. The abnormal CT slices with pneumonia-related lesions were used to train a convolutional neural network (CNN)-based classifier for the pneumonia pathogens. Specifically, for CT volumes, we developed modified ResNet-50 networks (16) for radiological abnormality identification. We also developed a novel lesion segmentation network architecture for contour extraction of lesions and lobes, based on the trained backbone parameters and further fusing the extracted features (lesion size and count) to imitate physicians' diagnostic practice. To extract 3D context information for a given lesion slice, this module took continuous multi-slice CT images as input to learn the weights of different layers, and it adaptively modified the network's learning parameters depending on spatial changes in the lesions. Furthermore, the model was designed for multiple resolutions, and the information gathered at the various resolutions was combined adaptively in order to provide a more complete basis of information on lesions of varying sizes. The high cost of data collection and labeling makes modeling pneumonia more difficult. Transfer learning was therefore used to address the problem of insufficient training data, by first learning the weights of the neural network on a source data set such as ImageNet (17) and then re-learning the appropriate weights on instances from the target data set.
The final scores of the CNN classifier's predictions for all abnormal CT slices were merged by majority voting to generate a patient-level CT volume prediction. In the validation cohort, we preprocessed each CT scan in the same way as in the training cohort. The preprocessed images were then passed to the backbone for prediction and majority voting. The code for reproducing the study's findings is available at https://github.com/chiehchiu/CAAS.
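The slice-to-patient aggregation step can be sketched as a simple majority vote over slice-level labels. This is an illustration under assumptions, not the released implementation: the function name `patient_level_vote` is hypothetical, and the tie-breaking rule (the label seen first wins) is an assumption, since the source does not describe how ties are resolved.

```python
from collections import Counter


def patient_level_vote(slice_predictions):
    """Return the most frequent slice-level label for one CT volume.

    Ties are broken in favor of the label that appears first in the input
    (an assumption made for this sketch).
    """
    if not slice_predictions:
        raise ValueError("no abnormal slices to vote on")
    counts = Counter(slice_predictions)
    return counts.most_common(1)[0][0]


# Hypothetical slice-level outputs for the abnormal slices of one patient.
slices = ["COVID-19", "viral", "COVID-19", "bacterial", "COVID-19"]
print(patient_level_vote(slices))
```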

Construction of the Machine Learning System
The DL system was built only to evaluate medical images, neglecting the complementary nature of medical records and images, as well as the need to view and understand the problem from several perspectives. A written medical record reflects the patient's health status, whereas the image depicts the condition from the perspective of pathogenesis. Combining the two gives a more complete picture of the patient's overall condition and reduces misdiagnosis.
To provide a comprehensive diagnosis that draws on both image and case information, our machine learning-based system analyzes all data samples, combining quantitative CT characteristics obtained from the images, such as GGO count and location, with other clinical indicators such as sex and age, and explains the interactions between the factors learned by the model. We applied SHapley Additive exPlanations (SHAP) (18) to the XGBoost classifier (19) to analyze the contribution of each feature in detecting pneumonia pathogens (Figure 2D). The most important step in this model is the filtering of key features.
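SHAP attributes a prediction to individual features using Shapley values, that is, each feature's marginal contribution averaged over all orderings in which features can be added. The following minimal sketch computes exact Shapley values for a toy scoring function to show the idea. It does not use the SHAP library or XGBoost, and the feature names and score increments are invented purely for illustration.

```python
from itertools import permutations


def shapley_values(features, value_fn):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        included = set()
        for f in order:
            before = value_fn(frozenset(included))
            included.add(f)
            after = value_fn(frozenset(included))
            totals[f] += after - before
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_score(subset):
    """Hypothetical model output given the set of features that are known."""
    score = 0.0
    if "ggo_count" in subset:
        score += 0.4
    if "age" in subset:
        score += 0.2
    if "ggo_count" in subset and "lesion_location" in subset:
        score += 0.2                    # interaction between two features
    return score


phi = shapley_values(["ggo_count", "age", "lesion_location"], toy_score)
print(phi)
```

The attributions sum to the difference between the score with all features and the score with none, and the interaction term is shared equally between the two features involved. Exact enumeration scales factorially with the number of features, which is why practical tools rely on model-specific or approximate algorithms.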
Several of the features obtained in the previous step (quantitative CT characteristics) must be filtered to remove those that may bias the model and those with low correlation with the target. The specific methods were as follows: (a) screening based on statistical properties such as variance; (b) using the maximum correlation and minimum redundancy feature selection method and the lasso feature selection method to regress the features highly correlated with the prediction target and obtain key features with high stability, discrimination, and independence; and (c) using the lasso feature selection method to retain the best K features. To counteract the class imbalance in our dataset during model training, we also used down-sampling and oversampling as needed.
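Two of the steps above, variance-based screening in (a) and oversampling of minority classes, can be sketched with the standard library alone. This is a simplified illustration under assumptions: the threshold, the helper names `drop_low_variance` and `oversample`, and the toy feature table are all hypothetical, and the maximum correlation and minimum redundancy and lasso steps are not reproduced here.

```python
import random
from statistics import pvariance


def drop_low_variance(rows, names, threshold=1e-8):
    """Keep only feature columns whose population variance exceeds threshold."""
    keep = [j for j in range(len(names))
            if pvariance([row[j] for row in rows]) > threshold]
    kept_names = [names[j] for j in keep]
    kept_rows = [[row[j] for j in keep] for row in rows]
    return kept_rows, kept_names


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for y, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels


# Toy feature table (invented values): the second column is constant.
names = ["ggo_count", "constant_flag", "age"]
rows = [[3, 1.0, 57], [5, 1.0, 43], [0, 1.0, 61], [4, 1.0, 38]]
labels = ["covid", "covid", "covid", "fungal"]

rows_f, names_f = drop_low_variance(rows, names)
rows_b, labels_b = oversample(rows_f, labels)
print(names_f, labels_b)
```

Resampling of this kind is normally applied to the training set only, so that validation and test sets keep their original class distribution.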

Expert Performance Assessment
To establish a comparative baseline for our AI system, two groups of radiologists with different levels of experience (three junior radiologists [3-4 years of experience] and three senior radiologists [7-8 years of experience]) were asked to evaluate pneumonia cases independently and blindly, based solely on CT scans. Within each group of three readers, annotated lesions were treated as positive samples, and lesions identified by two or more radiologists were considered true lesions.

Statistical Analysis
The following measures were used to assess the performance of our classifiers: area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity (20). The DeLong method (21) was used to calculate the 95% confidence intervals (CIs) for the AUC. Continuous variables are reported as the median and interquartile range (IQR) with a 95% CI. The ANOVA test was used to determine whether there was a difference between the two or four pathogenic categories of pneumonia patients (22). For categorical characteristics, the χ² or Fisher exact test (23) was used to compare the pathogenic groups. All statistical tests were two-tailed, with statistical significance set at p < 0.05.
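The main metrics can be written out in a short sketch. It assumes binary labels (1 for the class of interest, 0 otherwise), which for a four-class problem corresponds to a one-vs-rest evaluation; that framing is an assumption here. The AUC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The DeLong confidence interval is not reproduced, and the toy labels and scores are invented.

```python
def auc_score(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5             # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = auc_score(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(auc, sens, spec)
```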

Deep Learning-Based Pathogen Identification
The performance of our pneumonia pathogen classification system was assessed on the test data and is summarized in Table 2. To identify positive cases, we also established cutoff values on the output probability based on these findings: a high-sensitivity cutoff giving 98% sensitivity for patient-wise classification and a high-specificity cutoff giving 98% specificity. From this result, operating thresholds were defined as a probability of 0. 15 [...] In the observer performance test, the AI system performed much better than all reader groups in four-type classification. [...] CI, 0.828-0.846); specificity, 0.980 (95% CI, 0.980-0.982)]. We plotted the receiver operating characteristic curves of our quad-classifier for each pathogenic category, as shown in Figure 3, which showed a similar trend.
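Choosing an operating point for a target sensitivity can be sketched as follows. The function returns the largest probability threshold at which the proportion of positive cases scoring at or above the threshold reaches the target. The helper name `high_sensitivity_cutoff` and the toy data are hypothetical, and the example uses a 75% target so that the result can be checked by hand; the study's own cutoffs target 98%.

```python
def high_sensitivity_cutoff(labels, scores, target=0.98):
    """Largest threshold whose sensitivity (score >= threshold) meets target."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    pos = sorted((s for y, s in zip(labels, scores) if y == 1), reverse=True)
    if not pos:
        raise ValueError("need at least one positive case")
    # Walking down the sorted positive scores, the k-th score is the highest
    # threshold that still captures at least k positive cases.
    for k, threshold in enumerate(pos, start=1):
        if k / len(pos) >= target:
            return threshold


# Toy data, invented for illustration.
labels = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.5, 0.1]
cutoff = high_sensitivity_cutoff(labels, scores, target=0.75)
print(cutoff)
```

A high-specificity cutoff can be derived in the same way by applying the procedure to the negative cases with the comparison reversed.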

Machine Learning-Based Feature Analysis
In this study, we used ML algorithms to integrate chest CT results (quantified by the DL system) with clinical symptoms in order to promptly diagnose patients who tested positive for the four forms of pneumonia (Table 3). Using these features, we then determined the contribution of each feature to the prediction of pneumonia type. On the test set, we assessed the ML models and compared their performance with that of the DL system and the two groups of radiologists with different levels of expertise. The AUROCs were calculated for both human readers and the two models in Figure 3 and [...] Table 3), there was no significant difference in terms of sex (p = 0.80) or age (p = 0.6). The four pathogenic groups differed in most of the CT characteristics (p < 0.001; Figure 4). Patient age and lesion features such as GGO count, presence of lung nodules, and lesion density type were significant features associated with SARS-CoV-2 status. The GGO features were identified as the most significant contributor to distinguishing COVID-19 from the other pneumonia types [odds ratio (OR), 1.76; 95% CI: 1.71-1.86; p = 0.003]. Clinical parameters relating to lesion location (right upper lung; OR, 1.12; 95% CI: 1.03-1.25; p = 0.01) contributed to the prediction of viral pneumonia.
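For readers less familiar with the statistics reported here, the sketch below shows one common way to obtain an odds ratio and a Wald 95% confidence interval from a 2 × 2 table. The source does not state how its odds ratios were derived (they may come from a fitted model rather than raw counts), so this is illustrative only, and the counts are invented.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: feature present, outcome present    b: feature present, outcome absent
    c: feature absent, outcome present     d: feature absent, outcome absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
print(or_value, ci_low, ci_high)
```

An interval that excludes 1 indicates an association at the 5% significance level under this approximation.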

DISCUSSION
In this study, we presented an effective, clinically relevant AI system for medical image identification and clinical feature interpretation, built on real-world datasets. The AUCs of our AI system for distinguishing the four common types of pneumonia were high [COVID-19 (99.7%), common viral pneumonia (99.4%), bacterial pneumonia (98.9%), and fungal pneumonia (99.6%)]. Furthermore, using a specialized CT analysis, we extracted dozens of quantitative CT features from the study cohort as CT findings or clinical indications. The GGO characteristics were found to be the most important contributors in identifying the four pneumonia types. Notably, COVID-19 patients had more GGO lesions, patients with common viral pneumonia were less likely to have bilateral lung infection, and patients with fungal pneumonia had a modest number of consolidation lesions. Our AI system outperformed intermediate-level radiologists in differentiating pulmonary infections based on CT scans. This fast imaging-based triaging system has the potential to serve as a non-culture technique for identifying common types of pneumonia, which would promote timely, targeted antibiotic treatment for pneumonia patients and thus help reduce antimicrobial resistance, treatment side effects, and costs. During the COVID-19 pandemic, this system can also help stratify pneumonia patients for proper care or quarantine and thus lessen the burden of diagnosing numerous potentially infected patients. As more fine-grained pneumonia data become available, this system could be extended to recognize new strains or sub-strains of pulmonary infections.
Our DL system performed well in differentiating the four major kinds of pneumonia, and our results are somewhat more accurate than those of a prior CT-based AI study for COVID-19 diagnosis (24). Although CT is an essential tool for early detection of pneumonia, it is less accurate in identifying the virus in the absence of clinical symptoms. In the ML system, our joint AI model incorporates CT and clinical data, demonstrating that clinical information plays a role in the accurate diagnosis of pneumonia at an early stage. Compared with radiologists, our CT image-based AI system can identify the likely pathogen of infectious pneumonia more quickly and with much improved accuracy, so that it can guide clinical medication in a timely manner and maximize the benefit to patients. The timeliness and accuracy of AI can not only enable patients to receive correct treatment decisions at an early stage, shorten hospitalization, and save treatment costs, but also significantly reduce the incidence of complications caused by diagnosis and treatment decisions that are delayed while waiting for pathogen detection (25). The quantitative CT characteristics extracted from the study cohort by a dedicated CT analysis can help physicians better interpret the CT scan and the prediction made by our system, for example more GGO lesions in COVID-19, less bilateral lung infection in common viral pneumonia, and a moderate number of consolidation lesions in fungal pneumonia. The SHAP explainer applied to the XGBoost classifier built from these CT features also supported this statistical observation, revealing the top 20 most important CT characteristics for predicting the pathogens of pneumonia, including age, GGO ratio, lesion position, and consolidation. Listing these CT features together with their relative importance to the pathogen classification gives a clinician immediately useful information about a chest CT scan, rather than a bare diagnostic suggestion, which can help them make an informed decision on the final diagnosis and treatment. It can also serve as a training tool to help junior radiologists interpret CT scans and make better judgments.
The GGO features were identified as the most significant contributor in identifying the four pneumonia types. GGO has traditionally been regarded as non-specific and can be seen in all types of pneumonia (26,27), but a recent study found subtle differences in GGO between different diseases (28). Our study found statistically significant differences in the distribution of GGO among the four pneumonia types, showing that our AI system could distinguish subtle differences in GGO among them.
Our AI system combines the clinical advantages of CT with the strengths of AI and has good prospects for application in clinical practice. Compared with etiological tests such as RT-PCR, CT has some advantages. Although RT-PCR is the gold standard, it is also somewhat unstable. RT-PCR and other etiological tests can only confirm or exclude a single pathogen in the diagnosis of pneumonia; that is, RT-PCR can only detect COVID-19 infection. CT, on the other hand, can identify a variety of pathogens simultaneously during the diagnostic process. In addition, the detection rate of RT-PCR and other etiological tests is susceptible to factors such as variation in detection rate between manufacturers, low patient viral load, and unreliable clinical sampling (29). RT-PCR is prone to false-negative results and may require repeated testing (30). Compared with RT-PCR, CT is therefore more economical and faster, and CT scans have shown more stable results and higher sensitivity (31). Although our AI system was more accurate than human experts in this respect and has many advantages, it cannot completely replace the gold standard set by laboratory tests. Future research could address the issues mentioned above. For instance, our AI system could benefit from more data samples in the bacteria and fungus groups. Clinical or laboratory information (such as exposure history and blood biochemical examination) may be incorporated into our CT-based AI system as an additional information source to boost classification accuracy.
There are some limitations in our research. Firstly, the incidence of fungal pneumonia is much lower than that of other types of pneumonia, so the volume of fungal pneumonia data is much smaller; in subsequent studies, we will further expand the fungal pneumonia data. Secondly, our data contain different examinations of the same patient during one admission, although they do not contain different scan reconstructions of the same examination. The reasons for this are as follows: (1) the amount of data is small; and (2) for a given patient, examinations that took place at different times were included in our study. Across different examinations during the same admission, CT findings present different characteristics according to the phase of the disease course. Therefore, training and testing with CT images of the same pathogen infection at different periods (progression/improvement) can improve the performance and robustness of the model. This approach might have some limitations with regard to metric calculation, but other studies (5) have adopted a similar approach. Moreover, compared with their method, our data have broader inclusion criteria and are more in line with real clinical scenarios. Because the training and test sets were randomly selected when we built the model, we do not expect this to have a great influence on the final result. Thirdly, due to geographical and other factors, we could not obtain data from other countries, so we only analyzed data from Chinese patients, which is a limitation. However, we have released the code (https://github.com/chiehchiu/CAAS) and welcome researchers from other countries to use more diverse data for further research.
In conclusion, we proposed a CT-based AI system that can assist clinicians in classifying patients into four pathogenic types efficiently and accurately by listing quantitative CT characteristics and their importance for making the prediction. This study takes the first step in developing a rapid, CT-based, non-culture diagnostic method to triage pneumonia patients for timely targeted treatment. The proposed classifier may be used in pre-screening patients to conduct triage and fast-track decision making before RT-PCR.

DATA AVAILABILITY STATEMENT
The data used in this work is subject to the following licenses/restrictions: data sets cannot be made public. Access to these datasets should be requested through Wei Chen at cwjxl_2006@163.com or Jian Wang at wangjian_811@yahoo.com.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committees of the First Affiliated Hospital of Army Medical University, PLA (Approval Number: KY2020036). Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
JW, WC, Y-hZ, and X-fH all helped to conceptualize and design the study. X-mQ and W-bZ recruited patients. X-qW and H-rL organized the data. This manuscript was primarily written by Y-hZ, X-fH, and J-cM. Y-hZ, X-fH, J-cM, Y-zY, Z-fW, SZ, D-jS, WC, and JW analyzed the data. Y-hZ, X-fH, J-cM, SZ, WC, and JW provided feedback on prior drafts of the work. The final manuscript was read and approved by all authors.