AUTHOR=Zhu Chen-Yu, Wang Yu-Kun, Chen Hai-Peng, Gao Kun-Lun, Shu Chang, Wang Jun-Cheng, Yan Li-Feng, Yang Yi-Guang, Xie Feng-Ying, Liu Jie

TITLE=A Deep Learning Based Framework for Diagnosing Multiple Skin Diseases in a Clinical Environment

JOURNAL=Frontiers in Medicine

VOLUME=Volume 8 - 2021

YEAR=2021

URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2021.626369

DOI=10.3389/fmed.2021.626369

ISSN=2296-858X

ABSTRACT=Background
Numerous studies have attempted to apply artificial intelligence (AI) in dermatology, mainly to the classification and segmentation of various dermatoses. However, research conducted under real clinical settings is scarce.
Objectives
This study aimed to construct a novel deep learning framework, trained on a dataset representing the real clinical environment of a tertiary hospital in China, to better adapt AI applications to clinical practice among Asian patients.
Methods
Our dataset was composed of 13,603 dermatologist-labelled dermoscopic images covering 14 categories of disease, namely lichen planus (LP), rosacea (Rosa), viral warts (VW), acne vulgaris (AV), keloid and hypertrophic scar (KAHS), eczema and dermatitis (EAD), dermatofibroma (DF), seborrheic dermatitis (SD), seborrheic keratosis (SK), melanocytic nevus (MN), hemangioma (Hem), psoriasis (Pso), port wine stain (PWS) and basal cell carcinoma (BCC). We used Google’s EfficientNet-b4, with weights pretrained on ImageNet, as the backbone of our CNN architecture. The final fully-connected classification layer was replaced with 14 output neurons. We added 7 auxiliary classifiers to each of the intermediate layer groups. The modified model was retrained on our dataset and implemented in PyTorch. We constructed saliency maps to visualize the regions of the input images that the network attended to when making its predictions. To explore the visual characteristics of the different clinical classes, we also examined the internal image features learned by the proposed framework using t-SNE (t-distributed Stochastic Neighbor Embedding).
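The abstract does not state how the outputs of the auxiliary classifiers were used during training. One common pattern for auxiliary heads is deep supervision, in which each head contributes a down-weighted cross-entropy term to the training loss. The following is a minimal, framework-free sketch of that pattern, not the authors' implementation. The function names, the choice of seven auxiliary heads in total, and the weight `aux_weight=0.3` are illustrative assumptions.

```python
import math

NUM_CLASSES = 14  # the 14 disease categories listed in the abstract


def cross_entropy(logits, target):
    """Cross-entropy of one sample: log-sum-exp of the logits minus the target logit."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def deep_supervision_loss(main_logits, aux_logits_list, target, aux_weight=0.3):
    """Main-head loss plus a weighted sum of auxiliary-head losses.

    aux_weight is a hypothetical value chosen for illustration;
    the abstract does not report one.
    """
    main_loss = cross_entropy(main_logits, target)
    aux_loss = sum(cross_entropy(aux, target) for aux in aux_logits_list)
    return main_loss + aux_weight * aux_loss


# Toy example: one main head and seven auxiliary heads (assumed count),
# each emitting uninformative, all-zero logits for a sample of class 3.
main_logits = [0.0] * NUM_CLASSES
aux_logits_list = [[0.0] * NUM_CLASSES for _ in range(7)]
loss = deep_supervision_loss(main_logits, aux_logits_list, target=3)
```

In a sketch like this, only the main head would be used at inference time; the auxiliary terms serve to pass gradient signal to earlier layer groups during training.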
Results
Test results showed that the proposed framework achieved a high level of classification performance, with an overall accuracy of 0.948, a sensitivity of 0.934 and a specificity of 0.950. We also compared our algorithm with three of the most widely used CNN models; our model outperformed them, with the highest AUC of 0.985. We further compared the model with 280 board-certified dermatologists, and the results showed a comparable level of performance in an 8-class diagnostic task.
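For a multi-class task, sensitivity and specificity are usually computed per class in a one-vs-rest fashion and then aggregated. The abstract does not say which aggregation was used, so the sketch below assumes an unweighted (macro) average over classes. The function name and the toy labels are hypothetical and are not data from the study.

```python
def one_vs_rest_metrics(y_true, y_pred, num_classes):
    """Overall accuracy plus macro-averaged one-vs-rest sensitivity and specificity."""
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    accuracy = sum(t == p for t, p in pairs) / n

    sensitivities, specificities = [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        tn = n - tp - fn - fp
        if tp + fn:  # skip classes with no positive samples
            sensitivities.append(tp / (tp + fn))
        if tn + fp:  # skip classes with no negative samples
            specificities.append(tn / (tn + fp))

    return {
        "accuracy": accuracy,
        # Macro averaging is an assumption, not the paper's stated method.
        "sensitivity": sum(sensitivities) / len(sensitivities),
        "specificity": sum(specificities) / len(specificities),
    }


# Toy labels for a 3-class problem (illustrative only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = one_vs_rest_metrics(y_true, y_pred, num_classes=3)
```

With many classes, the negatives greatly outnumber the positives in each one-vs-rest split, which is why specificity tends to be high in tasks of this kind.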
Conclusions
The proposed framework, retrained on a dataset representing the real clinical environment in our department, could accurately classify most of the common dermatoses that we encountered in outpatient practice, including infectious and inflammatory dermatoses and benign and malignant cutaneous tumours.