Technology and Code Article (provisionally accepted; the full text will be published soon)

Front. Pharmacol. | doi: 10.3389/fphar.2019.00971

ATC-NLSP: prediction of the classes of anatomical therapeutic chemicals using a network-based label space partition method

  • 1. Shanghai Jiao Tong University, China
  • 2. School of Life Sciences and Biotechnology, State Key Laboratory of Microbial Metabolism, Shanghai Jiao Tong University, China

The Anatomical Therapeutic Chemical (ATC) classification system proposed by the World Health Organization is a widely accepted drug classification scheme in both academia and industry. It is a multi-label system that categorizes drugs into multiple classes according to their therapeutic, pharmacological, and chemical attributes. In this study, we adopted a data-driven network-based label space partition (NLSP) method to predict the ATC classes of a given compound within the multi-label learning framework. The proposed method, ATC-NLSP, is trained on similarity-based features, such as the chemical-chemical interaction, structural, and fingerprint similarities of a compound to other compounds belonging to the different ATC categories. The NLSP method trains a predictor for each label cluster (clusters may overlap) detected by community detection algorithms and takes the ensemble of the predicted labels for a compound as the final prediction. Experimental evaluation based on the jackknife test on the benchmark dataset demonstrated that our method boosted the absolute true rate, which is the most stringent evaluation metric in this study, from 0.6330 to 0.7497 relative to the state-of-the-art approaches. Moreover, the community structures of the label relation graph were detected through the label propagation method. The advantage of multi-label learning over single-label models was shown by label-wise analysis. Our study indicates that the proposed method ATC-NLSP, which adopts ideas from the network research community and captures the correlation of labels in a data-driven manner, is the top-performing model in the ATC prediction task. We believe that the power of NLSP remains to be unleashed for multi-label learning tasks in drug discovery. The source code is freely available at https://github.com/dqwei-lab/ATC.
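The label space partition idea described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation (see the repository linked above for that). All function names, the toy label sets, and the tie-breaking rule are assumptions made for illustration. The sketch produces disjoint clusters, whereas the abstract notes that clusters may intersect, and it assumes that the "absolute true rate" is the exact-match ratio between predicted and true label sets.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_graph(label_sets):
    """Build a weighted label graph: edge weight = number of samples carrying both labels."""
    graph = defaultdict(dict)
    for labels in label_sets:
        for label in labels:
            graph[label]  # touch the key so isolated labels still become nodes
        for a, b in combinations(sorted(labels), 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def label_propagation(graph, max_iter=100):
    """Deterministic label propagation (hypothetical variant): each node adopts the
    community with the largest total edge weight among its neighbours. Ties keep the
    current community if possible, otherwise the alphabetically smallest one."""
    community = {node: node for node in graph}
    for _ in range(max_iter):
        changed = False
        for node in sorted(graph):
            if not graph[node]:
                continue  # isolated label stays in its own cluster
            votes = Counter()
            for neighbour, weight in graph[node].items():
                votes[community[neighbour]] += weight
            best = max(votes.values())
            candidates = sorted(c for c, v in votes.items() if v == best)
            new = community[node] if community[node] in candidates else candidates[0]
            if new != community[node]:
                community[node] = new
                changed = True
        if not changed:
            break
    clusters = defaultdict(set)
    for node, c in community.items():
        clusters[c].add(node)
    return sorted(sorted(members) for members in clusters.values())


def ensemble_predict(predictors, x):
    """Union of the label sets returned by the per-cluster predictors for sample x."""
    predicted = set()
    for predict in predictors:
        predicted |= set(predict(x))
    return predicted


def absolute_true(true_sets, predicted_sets):
    """Fraction of samples whose predicted label set equals the true set exactly."""
    hits = sum(1 for t, p in zip(true_sets, predicted_sets) if set(t) == set(p))
    return hits / len(true_sets)


# Toy label sets (made up): A, B, C co-occur with each other, as do D and E.
toy_labels = [{"A", "B"}, {"A", "B", "C"}, {"B", "C"}, {"D", "E"}, {"D", "E"}, {"A", "C"}]
toy_clusters = label_propagation(cooccurrence_graph(toy_labels))
print(toy_clusters)  # [['A', 'B', 'C'], ['D', 'E']]
```

In a full pipeline, one multi-label classifier would be trained per detected cluster on the similarity-based features, and `ensemble_predict` would merge their outputs. The per-cluster classifiers are left as plain callables here because the abstract does not specify them.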

Keywords: drug classification, multi-label classification, label correlation, label space partition, label propagation

Received: 10 Jun 2019; Accepted: 29 Jul 2019.

Copyright: © 2019 Wang, Wang, Xu, Xiong and Wei. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Prof. Yi Xiong, State Key Laboratory of Microbial Metabolism, Shanghai Jiao Tong University, School of Life Sciences and Biotechnology, Shanghai, 200240, China, xiongyi@sjtu.edu.cn
Prof. Dongqing Wei, State Key Laboratory of Microbial Metabolism, Shanghai Jiao Tong University, School of Life Sciences and Biotechnology, Shanghai, 200240, China, dqwei@sjtu.edu.cn