Medical decision support system using weakly-labeled lung CT scans

Purpose: To determine and develop an effective set of models that leverage artificial intelligence techniques to build a system able to support clinical practitioners working with COVID-19 patients. The system is a pipeline comprising classification, lung and lesion segmentation, and lesion quantification of axial lung CT studies.
Approach: A deep neural network architecture based on DenseNet is introduced for the classification of weakly-labeled, variable-sized (and possibly sparse) axial lung CT scans. The models are trained and tested on aggregated, publicly available data sets with over 10 categories. To further assess the models, a data set was collected from multiple medical institutions in Colombia; it includes healthy patients, COVID-19 patients, and patients with other diseases. It comprises 1,322 CT studies from a diverse set of CT machines and institutions, totaling over 550,000 slices. Each CT study was labeled based on a clinical test, and no per-slice annotation took place. This enabled a classification into Normal vs. Abnormal patients and, for those considered abnormal, an extra classification step into Abnormal (other diseases) vs. COVID-19. Additionally, the pipeline features a methodology to segment and quantify lesions of COVID-19 patients on the complete CT study, enabling easier localization and progress tracking. Moreover, multiple ablation studies were performed to assess the elements composing the classification pipeline.
Results: The best-performing lung CT study classification models achieved 0.83 accuracy, 0.79 sensitivity, 0.87 specificity, 0.82 F1 score and 0.85 precision for the Normal vs. Abnormal task. For the Abnormal vs. COVID-19 task, the model obtained 0.86 accuracy, 0.81 sensitivity, 0.91 specificity, 0.84 F1 score and 0.88 precision. The ablation studies showed that using the complete CT study in the pipeline resulted in greater classification performance, reaffirming that relevant COVID-19 patterns toward the top and bottom of the lung volume cannot be ignored.
Discussion: The lung CT classification architecture introduced here can handle weakly-labeled, variable-sized and possibly sparse axial lung studies, reducing the need for expert annotations at a per-slice level.
Conclusions: This work presents a working methodology that can guide the development of decision support systems for clinical reasoning in future interventional or prospective studies.


Example Output from the Medical Decision Support System
The main components of the pipeline are:
• Lung segmentation and removal of non-lung slices.
• Classification of stacks of successive slices in the CT volume.
• Segmentation and quantification of lesions in patients classified as COVID-19.
In this example, we take a patient with COVID-19 and pass the CT volume through the developed pipeline. The attached documents Annex1.pdf, Annex2.pdf and Annex3.pdf contain the results of each phase.
Annex1.pdf shows the study once it has been segmented, non-lung slices have been removed and the remaining volume has been divided into stacks of 30 slices, which are then used as input to the classification model: Healthy vs. Unhealthy. Each page in the document contains the successive slices that were used to get the probability or confidence level (shown in the page's header) that such lung region shows patterns that suggest that the patient is unhealthy. When the probability is less than or equal to 0.5, the lung region is considered healthy; otherwise, it is considered unhealthy.
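The stacking and thresholding described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the stacks are assumed to be consecutive and non-overlapping, and a shorter final stack is simply kept, since the text does not say how a remainder is handled. The per-stack probability is assumed to come from the Healthy vs. Unhealthy model.

```python
from typing import List, Sequence

STACK_SIZE = 30   # slices per stack, as stated in the text
THRESHOLD = 0.5   # probabilities <= 0.5 are read as "healthy"


def split_into_stacks(slices: Sequence, stack_size: int = STACK_SIZE) -> List[Sequence]:
    """Divide the remaining lung slices into stacks of successive slices.

    Assumptions (not confirmed by the source): stacks do not overlap and
    a shorter final stack is kept as-is.
    """
    return [slices[i:i + stack_size] for i in range(0, len(slices), stack_size)]


def label_region(probability: float, threshold: float = THRESHOLD) -> str:
    """Apply the stated rule: <= 0.5 is healthy, otherwise unhealthy."""
    return "healthy" if probability <= threshold else "unhealthy"
```

For example, a study with 70 lung slices would yield stacks of 30, 30 and 10 slices under these assumptions, and a stack scored exactly 0.5 would be labeled healthy.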
Annex2.pdf shows the study once it has been determined that the patient is unhealthy, and we want to determine whether the patient shows patterns that suggest COVID-19 or a different disease. In this case, the axial CT volume has also been segmented and divided into stacks of successive lung slices, which are then used as input to the classification model: Unhealthy vs. COVID-19. Each page in the document contains the successive slices that were used as input and the confidence level (shown in the page's header) that the patient has patterns that suggest COVID-19. When the confidence level is less than or equal to 0.5, the lung region is considered unhealthy (other disease); otherwise, the region is classified as COVID-19.
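The two-step decision can be summarized in a short sketch. The function name and label strings are hypothetical, and the source does not say how per-stack outputs are aggregated into a patient-level decision, so the sketch only combines two probabilities that are assumed to be already available at the same level.

```python
from typing import Optional


def cascade_label(p_unhealthy: float,
                  p_covid: Optional[float] = None,
                  threshold: float = 0.5) -> str:
    """Combine the outputs of the two classifiers into one label.

    p_unhealthy: output of the Healthy vs. Unhealthy model.
    p_covid:     output of the Unhealthy vs. COVID-19 model, which is only
                 consulted when the first step does not return "healthy".
    """
    if p_unhealthy <= threshold:
        return "healthy"
    if p_covid is None:
        raise ValueError("second-stage probability is required for unhealthy cases")
    return "unhealthy (other disease)" if p_covid <= threshold else "COVID-19"
```

Keeping the second probability optional mirrors the text: the COVID-19 model is only applied after the first step has flagged the case as unhealthy.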
Annex3.pdf contains the segmented and quantified lesions of the patient's lungs once it has been determined that the patient is unhealthy and has COVID-19. In the document, each page shows the original axial CT slice on the left-hand side and, on the right-hand side, the same image once both the lung and the lesions have been segmented. Furthermore, the header of each page shows an approximate quantification of the proportion of lung that has been affected.
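One simple way to obtain such a proportion from the two segmentations is a voxel ratio. The sketch below assumes binary lung and lesion masks of identical shape, given as nested lists (slices, rows, columns); the function name and the ratio itself are illustrative assumptions, since the text does not specify how the quantification is computed.

```python
def affected_fraction(lung_masks, lesion_masks) -> float:
    """Fraction of lung voxels that are also marked as lesion.

    Both arguments are nested sequences of 0/1 values with the same shape.
    Lesion voxels outside the lung mask are ignored.
    """
    lung_voxels = 0
    lesion_voxels = 0
    for lung_slice, lesion_slice in zip(lung_masks, lesion_masks):
        for lung_row, lesion_row in zip(lung_slice, lesion_slice):
            for in_lung, in_lesion in zip(lung_row, lesion_row):
                if in_lung:
                    lung_voxels += 1
                    if in_lesion:
                        lesion_voxels += 1
    if lung_voxels == 0:
        raise ValueError("lung mask is empty; the fraction is undefined")
    return lesion_voxels / lung_voxels
```

Passing a single slice (a list with one 2D mask) gives a per-slice value, while passing the whole volume gives a study-level value.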
Annex4.pdf, Annex5.pdf, Annex6.pdf contain the segmented and quantified lesion for other randomly selected COVID-19 patients. In Annex4.pdf, it is possible to see an example of a patient where the lesions suggestive of COVID-19 are not particularly easy to spot; and in some portions of the volume, the lung does not show lesions, making it clear why it is necessary to consider the whole scan to return a diagnosis.

Public Dataset
The following tables contain relevant performance metrics of the introduced models (pre-fine-tuning setting) using different diagnostic metrics on the test set of an aggregated dataset of axial lung CT scans collected from multiple public sources.

Colombian Hospitals Dataset
The following tables contain relevant performance metrics of the introduced models using different diagnostic metrics on the test set of the retrospectively collected dataset of axial CT scans from patients in multiple Colombian health institutions.
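For reference, the diagnostic metrics reported for both datasets (accuracy, sensitivity, specificity, precision and F1 score) follow the standard confusion-matrix definitions. The sketch below uses a hypothetical function name and assumes the positive class is the second label of each task (Abnormal, or COVID-19) and that no denominator is zero.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary metrics from confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives,
    false negatives. Assumes every denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }
```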

Developed Website
The following is a set of screenshots of the website that is going to be deployed as a decision support system for medical professionals.
[Screenshot: Model Home Screen]
[Screenshot: Disclaimers]