AUTHOR=Zhu Chen-Yu, Wang Yu-Kun, Chen Hai-Peng, Gao Kun-Lun, Shu Chang, Wang Jun-Cheng, Yan Li-Feng, Yang Yi-Guang, Xie Feng-Ying, Liu Jie
TITLE=A Deep Learning Based Framework for Diagnosing Multiple Skin Diseases in a Clinical Environment
JOURNAL=Frontiers in Medicine
VOLUME=Volume 8 - 2021
YEAR=2021
URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2021.626369
DOI=10.3389/fmed.2021.626369
ISSN=2296-858X
ABSTRACT=Background Numerous studies have attempted to apply artificial intelligence (AI) in dermatology, mainly to the classification and segmentation of various dermatoses. However, research conducted in real clinical settings is scarce. Objectives This study aimed to construct a novel deep learning framework, trained on a dataset representing the real clinical environment of a tertiary hospital in China, to better adapt AI applications to clinical practice among Asian patients. Methods Our dataset was composed of 13,603 dermatologist-labelled dermoscopic images covering 14 categories of diseases, namely lichen planus (LP), rosacea (Rosa), viral warts (VW), acne vulgaris (AV), keloid and hypertrophic scar (KAHS), eczema and dermatitis (EAD), dermatofibroma (DF), seborrheic dermatitis (SD), seborrheic keratosis (SK), melanocytic nevus (MN), hemangioma (Hem), psoriasis (Pso), port wine stain (PWS) and basal cell carcinoma (BCC). We used Google’s EfficientNet-b4, with weights pretrained on ImageNet, as the backbone of our CNN architecture. The final fully-connected classification layer was replaced with one that has 14 output neurons. We added 7 auxiliary classifiers to each of the intermediate layer groups. The modified model was retrained on our dataset and implemented in PyTorch. We constructed saliency maps to visualize the regions of each input image that the network attended to when making its prediction.
To explore the visual characteristics of the different clinical classes, we also examined the internal image features learned by the proposed framework using t-SNE (t-distributed Stochastic Neighbor Embedding). Results On the test set, the proposed framework achieved a high level of classification performance, with an overall accuracy of 0.948, a sensitivity of 0.934 and a specificity of 0.950. We also compared our algorithm with three of the most widely used CNN models; our model outperformed them, achieving the highest AUC of 0.985. We further compared the model with 280 board-certified dermatologists, and the results showed comparable performance on an 8-class diagnostic task. Conclusions The proposed framework, retrained on a dataset representing the real clinical environment in our department, could accurately classify most common dermatoses encountered in our outpatient practice, including infectious and inflammatory dermatoses and benign and malignant cutaneous tumours.
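
The Results above report overall accuracy, sensitivity and specificity for a multi-class classifier. As a minimal illustration of how such figures can be derived from a confusion matrix, the Python sketch below computes one-vs-rest sensitivity and specificity per class and then macro-averages them. The abstract does not state how the per-class values were aggregated, so the macro-averaging here is an assumption, and the function names and the 3-class toy labels are hypothetical and not taken from the study.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def one_vs_rest_metrics(cm: List[List[int]]) -> List[Tuple[float, float]]:
    """Return (sensitivity, specificity) for each class, treated as class-vs-rest."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted as another class
        fp = sum(cm[r][k] for r in range(n)) - tp   # another class, predicted as k
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        results.append((sensitivity, specificity))
    return results


def summarize(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Overall accuracy plus macro-averaged sensitivity and specificity (assumed aggregation)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    per_class = one_vs_rest_metrics(cm)
    macro_sensitivity = sum(s for s, _ in per_class) / n
    macro_specificity = sum(sp for _, sp in per_class) / n
    return accuracy, macro_sensitivity, macro_specificity


# Hypothetical toy labels for a 3-class problem (not data from the study).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy, macro_sensitivity, macro_specificity = summarize(cm)
```

With 14 classes, each one-vs-rest specificity is dominated by the many true negatives from the other classes, which is one reason a macro-averaged specificity can exceed the overall accuracy.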