<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Pharmacol.</journal-id>
<journal-title>Frontiers in Pharmacology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Pharmacol.</abbrev-journal-title>
<issn pub-type="epub">1663-9812</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">648805</article-id>
<article-id pub-id-type="doi">10.3389/fphar.2021.648805</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Pharmacology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Predicting Drug-Induced Liver Injury Using Machine Learning on a Diverse Set of Predictors</article-title>
<alt-title alt-title-type="left-running-head">Adeluwa et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Predicting DILI with Machine Learning</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Adeluwa</surname>
<given-names>Temidayo</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="fn" rid="fn1">
<sup>&#x2020;</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1290427/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>McGregor</surname>
<given-names>Brett A.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="fn" rid="fn1">
<sup>&#x2020;</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/476568/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Guo</surname>
<given-names>Kai</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/851007/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Hur</surname>
<given-names>Junguk</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/154107/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>Department of Biomedical Sciences, University of North Dakota, <addr-line>Grand Forks</addr-line>, <addr-line>ND</addr-line>, <country>United&#x20;States</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>Department of Neurology, University of Michigan, <addr-line>Ann Arbor</addr-line>, <addr-line>MI</addr-line>, <country>United&#x20;States</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/43250/overview">Pawe&#x142; P &#x141;abaj</ext-link>, Jagiellonian University, Poland</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1220618/overview">Joaquim Aguirre-Plans</ext-link>, Pompeu Fabra University, Spain</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/304873/overview">Minjun Chen</ext-link>, National Center for Toxicological Research (FDA), United&#x20;States</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Junguk Hur, <email>junguk.hur@med.und.edu</email>
</corresp>
<fn fn-type="equal" id="fn1">
<label>
<sup>&#x2020;</sup>
</label>
<p>These authors contributed equally.</p>
</fn>
<fn fn-type="other">
<p>This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Pharmacology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>18</day>
<month>08</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>12</volume>
<elocation-id>648805</elocation-id>
<history>
<date date-type="received">
<day>02</day>
<month>01</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>07</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Adeluwa, McGregor, Guo and Hur.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Adeluwa, McGregor, Guo and Hur</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Safety and toxicity concerns arising from drug side effects are a major challenge in drug development. One such side effect, drug-induced liver injury (DILI), is considered a primary factor in regulatory clearance. The goal of the Critical Assessment of Massive Data Analysis (CAMDA) 2020 CMap Drug Safety Challenge was to develop prediction models based on gene perturbation of six preselected cell lines (CMap L1000), extended structural information (MOLD2), toxicity data (TOX21), and FDA reporting of adverse events (FAERS). Four types of DILI classes were targeted, including two clinically relevant scores and two control classifications, designed by the CAMDA organizers. The L1000 gene expression data had variable drug coverage across cell lines, with only 247 out of 617 drugs in the study measured in all six cell types. We addressed this coverage issue by using Kru-Bor ranked merging to generate a single drug expression signature across all six cell lines. These merged signatures were then narrowed down to the top and bottom 100, 250, 500, or 1,000 genes most perturbed by drug treatment. These signatures were subjected to feature selection using Fisher&#x2019;s exact test to identify genes predictive of DILI status. Models based solely on expression signatures had varying results for clinical DILI subtypes, with accuracy ranging from 0.49 to 0.67 and Matthews Correlation Coefficient (MCC) values ranging from -0.03 to 0.1. Models built using FAERS, MOLD2, and TOX21 had similar results in predicting clinical DILI scores, with accuracy ranging from 0.56 to 0.67 and MCC scores ranging from 0.12 to 0.36. To incorporate these various data types with expression-based models, we utilized soft, hard, and weighted ensemble voting methods using the top three performing models for each DILI classification. These voting models achieved a balanced accuracy up to 0.54 and 0.60 for the clinically relevant DILI subtypes. 
Overall, our results suggest that traditional machine learning approaches may not be optimal classification methods for the current&#x20;data.</p>
</abstract>
<kwd-group>
<kwd>DILI</kwd>
<kwd>Connectivity Map</kwd>
<kwd>Tox21</kwd>
<kwd>FAERS</kwd>
<kwd>machine learning</kwd>
<kwd>Mold2</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Adverse drug reactions (ADRs) are a common concern of novel drugs and therapeutics. One of the more common targets of ADRs is the liver, owing to its role in the metabolism of compounds; the resulting liver damage is termed drug-induced liver injury (DILI) (<xref ref-type="bibr" rid="B11">Daly, 2013</xref>; <xref ref-type="bibr" rid="B3">Atienzar et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B30">Marzano et&#x20;al., 2016</xref>). DILI is a unique challenge in drug development because results from animal models do not reliably translate to human clinical trials in treatment populations. Assessing DILI risk has been approached in multiple ways during drug development; however, officials often rely on post-marketing surveillance to detect possible long-term side effects such as DILI (<xref ref-type="bibr" rid="B4">Berlin et&#x20;al., 2008</xref>). The <xref ref-type="bibr" rid="B12">U.S. Food and Drug Administration (2021)</xref> has established the DILIrank dataset, the largest reference drug list ranked for DILI risk in humans, to facilitate the development of predictive models by enhancing drug label DILI annotation with weighted causal evidence (<xref ref-type="bibr" rid="B7">Chen et&#x20;al., 2016b</xref>). This dataset assigns 1,036&#x20;FDA-approved drugs to four classifications: most, less, ambiguous, and no-DILI concern. Additionally, predicting DILI is difficult due to the absence of specific and reliable biomarkers. Traditional biomarkers, including alanine aminotransferase, total bilirubin levels, aspartate aminotransferase, and gamma-glutamyl transferase (among others), are not specific enough to separate DILI from other forms of liver injury (<xref ref-type="bibr" rid="B34">Ozer et&#x20;al., 2008</xref>). 
For this reason, in 2016 the FDA approved investigations into glutamate dehydrogenase and microRNA-122 as potential biomarkers (<xref ref-type="bibr" rid="B2">Andrade et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B29">L&#xf3;pez-Longarela et al., 2020</xref>). Messner et&#x20;al. characterized exosomal microRNA-122 in methotrexate- and acetaminophen-induced toxicity in HepaRG hepatic stem cells and confirmed that microRNA-122 can be used as a sensitive biomarker for DILI (<xref ref-type="bibr" rid="B32">Messner et&#x20;al., 2020</xref>).</p>
<p>Predictive markers of DILI, determined by compound properties and known variables rather than preclinical studies, would facilitate drug development in a wide variety of ways (<xref ref-type="bibr" rid="B13">Garc&#xed;a-Cort&#xe9;s et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B40">Saini et&#x20;al., 2018</xref>). Multiple groups have attempted to predict DILI using drug compounds or proposed drug properties. Chemical structures (<xref ref-type="bibr" rid="B41">Shin et&#x20;al., 2020</xref>), gene expression response (<xref ref-type="bibr" rid="B28">Liu et&#x20;al., 2020</xref>), and patient genetic data have been previously used for DILI prediction using traditional machine learning algorithms. Xu et&#x20;al. proposed a deep learning model built on a &#x201c;combined data set&#x201d; gathered from a variety of sources and used a molecular structural encoding approach for the chemical structures of the drugs in their data (<xref ref-type="bibr" rid="B46">Xu et&#x20;al., 2015</xref>). Kohonen et&#x20;al. proposed a &#x201c;big data compacting and data fusion&#x201d; concept (<xref ref-type="bibr" rid="B19">Kohonen et&#x20;al., 2017</xref>). In their approach, the authors utilized data from the Connectivity Map (CMap; Broad Institute) database, the Open Toxicogenomics Project-Genomics Assisted Toxicity Evaluation Systems (TG-GATEs; National Institutes of Biomedical Innovation, Japan), the US National Cancer Institute 60 tumor cell line screening (NCI-60), and the US FDA Liver Toxicity Knowledge Base (LTKB). Using these databases, they modeled a predictive toxicogenomics space that captured all possible well-known hepato-pathological changes (<xref ref-type="bibr" rid="B19">Kohonen et&#x20;al., 2017</xref>).</p>
<p>Building upon these previous efforts to accurately predict DILI, the Critical Assessment of Massive Data Analysis (CAMDA), in collaboration with the Intelligent Systems for Molecular Biology (ISMB), proposed the CMap Drug Safety Challenge for their annual conferences in 2018, 2019, and 2020 (<xref ref-type="table" rid="T1">Table&#x20;1</xref>). The previous challenges in 2018 and 2019, while sharing a similar goal of predicting potential liver toxicity, had distinct parameters. The DILI classification to be predicted in 2018 was a binary positive or negative DILI status, while the 2019 challenge focused on potential DILI risk ranging from no concern to most concern, with four classifications reflecting the DILIrank dataset (<xref ref-type="bibr" rid="B7">Chen et&#x20;al., 2016b</xref>). The data used for predicting the DILI classification of drugs in the 2018 challenge were limited to microarray data from MCF7 and PC3 cell lines. In the 2018 challenge, Chierici et&#x20;al. employed deep learning techniques on the microarray data from 276 compounds but only achieved Matthews Correlation Coefficient (MCC) values of &#x3c;0.2 (<xref ref-type="bibr" rid="B8">Chierici et&#x20;al., 2020</xref>). Sumsion et&#x20;al., in the same challenge year, utilized more traditional classification algorithms along with soft voting but reached a maximum MCC of 0.2 and a maximum accuracy of 70%, and the voting model never outperformed the best individual models (<xref ref-type="bibr" rid="B43">Sumsion et&#x20;al., 2020</xref>). Both studies cite struggles with the small sample size and imbalanced datasets; however, resampling in this case led to overfitting rather than improved testing accuracy.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Previous CAMDA Drug Safety Challenge Summary. The CMap Drug Safety Challenge has been a repeated effort by CAMDA to develop predictive models for DILI. Previous studies are cited by their year of publication and leading author while also describing the year in which the challenge was administered by CAMDA and relevant data sources and DILI classifications for prediction.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Authors</th>
<th align="center">CAMDA drug safety challenge</th>
<th align="center">Data sources</th>
<th align="center">DILI conditions</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Current: Adeluwa et&#x20;al.</td>
<td align="char" char=".">2020</td>
<td align="left">CMap L1000, MOLD2, FAERS, TOX21</td>
<td align="left">DILI1, DILI3, DILI5, DILI6</td>
</tr>
<tr>
<td align="left">2021: Liu et&#x20;al.</td>
<td align="char" char=".">2019</td>
<td align="left">CMap L1000, SMILES strings, SIDER 4.1</td>
<td rowspan="3" align="left">Most-DILI concern, Less-DILI concern, ambiguous DILI concern, No-DILI concern</td>
</tr>
<tr>
<td align="left">2021: Aguirre-Plans et&#x20;al.</td>
<td align="char" char=".">2019</td>
<td align="left">CMap L1000, DisGeNET, GUILDify, SMILES, DGIdb, HitPick, SEA</td>
</tr>
<tr>
<td align="left">2021: Lesinski et&#x20;al.</td>
<td align="char" char=".">2019</td>
<td align="left">CMap L1000, SMILES, annotated images</td>
</tr>
<tr>
<td align="left">2020: Chierici et&#x20;al.</td>
<td align="char" char=".">2018</td>
<td align="left">Affymetrix GeneChip (MCF7, PC3)</td>
<td rowspan="2" align="left">DILI-1, DILI-0</td>
</tr>
<tr>
<td align="left">2020: Sumsion et&#x20;al.</td>
<td align="char" char=".">2018</td>
<td align="left">Affymetrix GeneChip (MCF7, PC3)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The CMap Drug Safety Challenge expanded in 2019 by including not only expression data from L1000 CMap but also by allowing a wide variety of external data sources to be incorporated into each study. Lesinski et&#x20;al. achieved their best predictive results by incorporating molecular drug properties along with the most informative variables from five of 13 cell line expression models <italic>via</italic> a super learner method (<xref ref-type="bibr" rid="B23">Lesi&#x144;ski et&#x20;al., 2021</xref>). Including molecular property information improved the accuracy of their cell line models, which originally ranged from 55 to 61%, to as high as 73% with a random forest algorithm. Liu et&#x20;al. built support vector machine and random forest models using chemical descriptions from DILIrank annotation along with expression values from predicted protein targets (<xref ref-type="bibr" rid="B27">Liu et&#x20;al., 2021</xref>). This approach produced models with an accuracy of 75.9% that were also able to correctly identify targets associated with the mechanism of action and toxicity of nonsteroidal anti-inflammatory drugs, a class of drugs commonly associated with DILI. Aguirre-Plans et&#x20;al. utilized the widest array of predictive data, including L1000 CMap expression, drug-target associations, structural data, phenotype-associated gene signatures, protein-protein interactions, and drug targets data (<xref ref-type="bibr" rid="B1">Aguirre-Plans et&#x20;al., 2021</xref>). Their models&#x2019; accuracy remained comparable to other study results at 70%, but they also identified structural dissimilarities within the DILI risk labels used. All three published studies from the 2019 CMap drug safety challenge cited data limitations within their study, including complex dosage-related toxicity, a small sample size, and a small number of compounds with hepatotoxicity annotation.</p>
<p>The current CAMDA 2020 challenge was structured to address the previous limitations, while also redefining the relevant DILI classifications. The challenge aimed to predict or classify positive and negative classes within each of four DILI designations, namely DILI1, DILI3, DILI5, and DILI6. DILI1 and DILI3 were clinical classifications based on specific severity scores or established FDA warnings and precautions, while DILI5 and DILI6 served as a negative and positive control class, respectively (<xref ref-type="table" rid="T2">Table&#x20;2</xref>). Drug class labels were assigned by the CAMDA 2020 challenge organizers. DILI1 was described as a severity score &#x2265;6, which is associated with high risk based on the DILIrank dataset and LTKB (<xref ref-type="bibr" rid="B6">Chen et&#x20;al., 2016a</xref>). DILI3 was described as drugs withdrawn, given boxed warnings, or given warnings and precautions by the FDA due to either known risk factors or adverse event reporting. DILI5 served as a randomly assigned negative control, while DILI6 was constructed as a positive control based on molecular weight, with positive compounds weighing &#x3e;320&#xa0;g/mol. The drug list for the study was expanded to 617 drug compounds to improve on the sample size limitations of previous studies; however, these datasets remained highly imbalanced.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Drug-Induced Liver Injury Classifications. Four binary classes of DILI were provided by the CAMDA organizers. DILI1 positive compounds were based on the clinical severity score associated with liver necrosis. DILI3 positive compounds were based on drugs already associated with warnings and precautions or that have been withdrawn due to liver toxicity. DILI5 was a random assignment from the organizers as a negative control group while the DILI6 classification was based on molecular weight (&#x3e;320&#xa0;g/mol) to serve as a positive control.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">Targets</th>
<th align="center">Positive group</th>
<th align="center">Negative group</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">DILI1</td>
<td align="left">DILI severity score &#x2265;6 (N &#x3d; 141)</td>
<td align="left">DILI severity score &#x3c;6 (N &#x3d; 476)</td>
</tr>
<tr>
<td align="left">DILI3</td>
<td align="left">Withdrawn, box warning, warning and precaution (N &#x3d; 227)</td>
<td align="left">Adverse events and no match (N &#x3d; 390)</td>
</tr>
<tr>
<td align="left">DILI5</td>
<td align="left">Assigned DILI endpoint 1 (N &#x3d; 308 positive)</td>
<td align="left">(N &#x3d; 309 negative)</td>
</tr>
<tr>
<td align="left">DILI6</td>
<td align="left">Assigned DILI endpoint 2 (N &#x3d; 318 positive)</td>
<td align="left">(N &#x3d; 299 negative)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Note1: DILI5/DILI6 are controls; DILI5 is randomly split; DILI6 is the positive control, dividing compounds based on their molecular weight &#x3e; 320&#xa0;g/mol.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>The imbalance within the clinically relevant DILI data is expected considering that many approved drugs do not have a significant hepatotoxicity risk; however, the control classes of DILI5 and DILI6 were structured in a balanced manner (<xref ref-type="table" rid="T3">Table&#x20;3</xref>). For this challenge, L1000 drug expression signatures from primary human hepatocytes (PHH), liver carcinoma (HepG2), immortalized kidney cells (HA1E), human skin melanoma (A-375), breast cancer (MCF7), and adenocarcinoma (PC-3) were used as inferred from landmark genes defined by Connectivity Map (<xref ref-type="bibr" rid="B42">Subramanian et&#x20;al., 2017</xref>). These expression responses were simplified to one specific dose at one specific treatment time in order to yield the largest available dataset for training and testing while also addressing previous dosage toxicity concerns. Other non-gene expression data provided included molecular descriptors encoding two-dimensional chemical structure information from MOLD2 (<xref ref-type="bibr" rid="B16">Hong et&#x20;al., 2008</xref>), post-marketing drug adverse event information from FAERS (FDA adverse event reporting system (FAERS), 2015), and high-throughput liver toxicity screening results from TOX21 (<xref ref-type="bibr" rid="B17">Huang et&#x20;al., 2016</xref>). While previous studies also utilized external data sources to improve model performance, the current study focuses on the various types of data processed and provided by the CMap drug safety challenge.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Training Data Imbalance. The data used for the clinical DILI classes of DILI1 and DILI3 were imbalanced, which negatively influenced the models built to predict these classes.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">DILI class</th>
<th align="center">Negative</th>
<th align="center">Positive</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">DILI1</td>
<td align="char" char=".">326</td>
<td align="char" char=".">96</td>
</tr>
<tr>
<td align="left">DILI3</td>
<td align="char" char=".">262</td>
<td align="char" char=".">160</td>
</tr>
<tr>
<td align="left">DILI5</td>
<td align="char" char=".">218</td>
<td align="char" char=".">204</td>
</tr>
<tr>
<td align="left">DILI6</td>
<td align="char" char=".">197</td>
<td align="char" char=".">225</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>We constructed models to predict each drug&#x2019;s DILI class (positive or negative) within the four DILI classifications (DILI1, DILI3, DILI5, and DILI6) by first evaluating the performance of each dataset in predicting DILI and also by employing ensemble voting with the top three performing models across data types. The gene expression data presented a unique challenge in that not all drugs were tested in each cell line or even in liver-relevant cell lines. To address this, we utilized a Kru-Bor merging method to merge the expression signatures across cell lines into one representative drug signature (<xref ref-type="bibr" rid="B18">Iorio et&#x20;al., 2010</xref>; <xref ref-type="bibr" rid="B25">Lin, 2010</xref>). These expression signatures were narrowed down to the top and bottom 100, 250, 500, and 1,000 ranked genes and subjected to feature selection <italic>via</italic> a Fisher&#x2019;s exact test based on their involvement in DILI positive/negative assigned drugs for each DILI class. FAERS, MOLD2, and TOX21 datasets were also used to construct DILI predictive models, and to address the imbalance of these data we tested resampling techniques. Various traditional classifier algorithms were used to build models on these datasets, and the models were evaluated on a blinded test set by the CAMDA committee. Based on the training area under the curve (AUC) values of these models, the top three algorithms for each datatype (cell expression, FAERS, MOLD2, and TOX21) for each DILI class were included in our ensemble voting model. We tested hard, soft, and weighted voting across these datasets to see if the varying dimensions of data can improve predictive performance.</p>
</sec>
<sec sec-type="materials|methods" id="s2">
<title>Materials and Methods</title>
<sec id="s2-1">
<title>Data Processing</title>
<p>The overall workflow of our study is shown in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>. Initially, the overlap of drugs included in each of the gene expression cell data sets was investigated using VennDetail (<xref ref-type="bibr" rid="B14">Guo and McGregor, 2020</xref>) to create a Venn pie chart showing the various drug testing subsets across the six cell lines (<xref ref-type="fig" rid="F2">Figure&#x20;2</xref>). Each of the non-gene expression datasets (FAERS, MOLD2, and TOX21) was treated as an individual dataset, while the gene expression data were merged across cell lines to build classifier models. In general, we used standard preprocessing techniques, including removing zero-variance features and missing values. DILI1 and DILI3 suffered from class imbalance (<xref ref-type="table" rid="T3">Table&#x20;3</xref>). To mitigate this issue for all non-gene expression data, we attempted three oversampling techniques: synthetic minority oversampling technique (SMOTE) (<xref ref-type="bibr" rid="B5">Chawla et al., 2002</xref>), random oversampling examples (ROSE) (<xref ref-type="bibr" rid="B31">Menardi and Torelli, 2014</xref>), and random upsampling of the minority classes. SMOTE balances data by randomly creating artificial samples between two nearest-neighbor samples, while ROSE uses a smoothed bootstrap technique to resample the data (<xref ref-type="bibr" rid="B5">Chawla et al., 2002</xref>; <xref ref-type="bibr" rid="B31">Menardi and Torelli, 2014</xref>). For comparison, models were built using imbalanced data as well. Before training, the non-gene expression datasets were standardized to have a mean of zero and a standard deviation of one. Preprocessing details specific to each dataset as well as some characteristics of the data are discussed&#x20;below.</p>
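<p>As an illustration only, the standardization and SMOTE-style interpolation described above can be sketched in a few lines of Python. This is not the R code used in the study: the function names <monospace>standardize</monospace> and <monospace>smote_like_sample</monospace> are hypothetical, the nearest-neighbor search of SMOTE is omitted, and the sample standard deviation is assumed.</p>
<preformat>
```python
import random
import statistics


def standardize(column):
    """Scale one feature to mean 0 and (sample) standard deviation 1."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)  # assumption: sample SD (n - 1 denominator)
    return [(value - mean) / sd for value in column]


def smote_like_sample(point, neighbor, rng):
    """Create one synthetic sample on the segment between two minority samples.

    Real SMOTE first picks `neighbor` among the k nearest minority-class
    neighbors of `point`; that search is left out of this sketch.
    """
    gap = rng.random()  # interpolation fraction in [0, 1)
    return [a + gap * (b - a) for a, b in zip(point, neighbor)]


rng = random.Random(0)
scaled = standardize([2.0, 4.0, 6.0])
synthetic = smote_like_sample([0.0, 0.0], [1.0, 2.0], rng)
```
</preformat>
<p>In this sketch the synthetic point always lies on the line segment joining the two minority samples, which is the property that distinguishes SMOTE from plain duplication of existing rows.</p>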
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Study Workflow. Data were separated into expression-based datasets and non-expression-based datasets (FAERS, MOLD2, TOX21) for testing. Non-expression data were evaluated with the resampling methods ROSE and SMOTE as well as in unbalanced form. Expression-based datasets were merged across cell lines into one representative expression signature per drug. These signatures were tested as the top and bottom 100, 250, 500, and 1,000 ranked genes for each drug. Following signature formation, feature selection using Fisher&#x2019;s exact test was used to determine significant predictors of DILI classification. Machine learning was used on predictors for both expression-based and non-expression models, which were evaluated based on training AUC values as well as testing performance. The top three performing models for each DILI type were utilized in ensemble voting models in an effort to incorporate both expression and non-expression datasets.</p>
</caption>
<graphic xlink:href="fphar-12-648805-g001.tif"/>
</fig>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Drug Testing Cell Distribution. The Venn-Pie diagram depicts the overlap of drugs tested between each of the six cell lines used in this study. Each bar within the Venn-Pie represents an individual dataset while the color of the bars indicates the overlapping group of compounds across datasets. While 247 of the 617 drugs included in the training and test data were tested in all six cell lines, some compounds were only tested in a singular cell line and others did not have any expression information provided.</p>
</caption>
<graphic xlink:href="fphar-12-648805-g002.tif"/>
</fig>
</sec>
<sec id="s2-2">
<title>Food and Drug Administration Adverse Event Reporting System</title>
<p>The CAMDA organizers provided us with FAERS data for all 617 drug compounds. Of these, 422 were grouped as &#x201c;training data&#x201d;. This dataset contains 20 features describing the percentage of reported adverse events for each drug compound by gender and age group. After removing highly correlated features, we addressed the class imbalance by upsampling, that is, by randomly sampling with replacement from the minority class until it matched the size of the majority class. An additional preprocessing step was to create two new variables, namely &#x201c;male ratio&#x201d; and &#x201c;female ratio&#x201d;, taking into account all reported events irrespective of gender, all reported DILI events irrespective of gender, and the percentage of reported DILI events by gender.</p>
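<p>A minimal Python sketch of random upsampling with replacement is given below for illustration. The function name <monospace>upsample_minority</monospace> and the list-based data layout are assumptions of the sketch, not the implementation used in the study.</p>
<preformat>
```python
import random


def upsample_minority(rows, labels, rng):
    """Resample smaller classes with replacement until all classes match in size."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        # Draw the missing number of rows at random, with replacement.
        extra = rng.choices(members, k=target - len(members))
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels
```
</preformat>
<p>Because upsampling duplicates existing rows, it should be applied to training data only; duplicated rows that leak into a held-out fold would inflate performance estimates.</p>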
</sec>
<sec id="s2-3">
<title>Toxicology in the 21st Century</title>
<p>In addition to the FAERS dataset, we were provided with concentration-response information for 600 drugs. Of these, 412 were designated as &#x201c;training data&#x201d;. Thirty-two features corresponded to concentration-response curve ranks. Of the 412 training drugs, 57 were removed because of missing values. In addition, we removed highly correlated features using an arbitrary cutoff of 0.82 and addressed the class imbalance by using SMOTE.</p>
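<p>One simple way to remove highly correlated features at a fixed cutoff is sketched below in Python. This greedy filter is an assumption made for illustration; the exact procedure used in the study is not specified here, and the names <monospace>pearson</monospace> and <monospace>drop_correlated</monospace> are hypothetical. Zero-variance features are assumed to have been removed beforehand.</p>
<preformat>
```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, cutoff=0.82):
    """Greedy filter: keep a feature only if its absolute correlation with
    every feature kept so far does not exceed the cutoff."""
    kept = {}
    for name, values in features.items():
        if all(cutoff >= abs(pearson(values, other)) for other in kept.values()):
            kept[name] = values
    return list(kept)
```
</preformat>
<p>The result of a greedy filter depends on the order in which features are visited, so a different ordering can retain a different member of a correlated pair.</p>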
</sec>
<sec id="s2-4">
<title>Molecular Descriptors from 2D Structures</title>
<p>Alongside the FAERS and TOX21 data provided, we had access to the 2D molecular descriptors, or structural information, of these 617 drug compounds. Of these, 422 were designated for training. There were 777 features for each drug compound, each corresponding to a MOLD2 descriptor. To address the class imbalance, we tested random upsampling of the minority classes as well as ROSE and SMOTE.</p>
</sec>
<sec id="s2-5">
<title>Connectivity Map L1000 Gene Expression Data</title>
<p>The L1000 assay used in this study is a high-throughput gene expression assay that measures the mRNA transcript abundance of 978 landmark genes and uses an inference algorithm to infer the expression of 11,350 additional genes in the transcriptome (<xref ref-type="bibr" rid="B42">Subramanian et&#x20;al., 2017</xref>). Simulation has shown that this reduced representation of the transcriptome can recapitulate around 80% of the relationships captured by measuring the entire transcriptome directly. In this study, 12,328 deidentified predictor genes were provided by the CAMDA organizers with Z scores to indicate transcript abundance. The treatment time and dosage of each drug were selected by the CAMDA committee to produce the largest available dataset for both test and training&#x20;data.</p>
</sec>
<sec id="s2-6">
<title>Kruskal-Borda Merging</title>
<p>Since not all drugs were tested in every cell line for which data were made available, we utilized the Kruskal-Borda (Kru-Bor) merging algorithm in the GeneExpressionSignature R package (<xref ref-type="bibr" rid="B24">Li et&#x20;al., 2013</xref>). This approach allowed us to generate a unified drug-induced expression signature across cell types, which was necessary because many drugs were not tested in the PHH or HepG2 liver cell lines. The Kruskal algorithm (<xref ref-type="bibr" rid="B20">Kruskal, 1956</xref>) finds a minimum spanning forest of an undirected edge-weighted graph, while the Borda merging method (<xref ref-type="bibr" rid="B39">Saari, 2000b</xref>; <xref ref-type="bibr" rid="B38">2000a</xref>) uses options ranked in order of preference to determine the outcome. Thus, the closest neighbors in rank are merged one by one until a unified signature is formed. Following merging, the top and bottom 100, 250, 500, and 1,000 ranked genes were selected as drug signatures for feature selection.</p>
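<p>The Borda step can be illustrated with the simplified Python sketch below. It is not the GeneExpressionSignature implementation: it merges all ranked lists in a single pass and therefore omits the Kruskal-guided pairwise merge order, and it assumes that every list ranks the same set of genes. The names <monospace>borda_merge</monospace> and <monospace>signature</monospace> are hypothetical.</p>
<preformat>
```python
def borda_merge(rankings):
    """Merge ranked gene lists (most up-regulated first) into one consensus ranking.

    Assumption: every ranking contains exactly the same genes.
    """
    scores = {}
    for ranking in rankings:
        for position, gene in enumerate(ranking):
            scores[gene] = scores.get(gene, 0) + position
    # A lower summed position means the gene sits consistently near the top.
    # Ties are broken alphabetically to keep the output deterministic.
    return sorted(scores, key=lambda gene: (scores[gene], gene))


def signature(merged, k):
    """Return the top-k and bottom-k genes of a merged ranking."""
    return merged[:k], merged[-k:]
```
</preformat>
<p>With <monospace>k</monospace> set to 100, 250, 500, or 1,000, the second function would yield signatures of the sizes described above.</p>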
</sec>
<sec id="s2-7">
<title>Feature Selection</title>
<p>Feature selection across the merged signatures produced <italic>via</italic> our Kru-Bor merging was based on each gene&#x2019;s significance (<italic>p</italic>-value &#x3c; 0.01) in predicting DILI status <italic>via</italic> a Fisher&#x2019;s exact test. For each drug, a gene was assigned a one (True) if it was included in the top or bottom 100, 250, 500, or 1,000 ranked list (depending on the model data) and a zero (False) if it fell outside that range. The class label for each type of DILI was likewise coded as one (DILI positive) or zero (DILI negative). We used these binary indicators and labels to test whether the highly perturbed genes were predictive of a drug being DILI positive or DILI negative, with a <italic>p</italic>-value cutoff of&#x20;0.01.</p>
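<p>For a single gene, the test described above reduces to a two-by-two contingency table of signature membership against DILI label. The Python sketch below shows one way to compute a two-sided Fisher&#x2019;s exact test from such a table using only the standard library. The function names and the two-sided convention (summing all tables no more probable than the observed one) are assumptions of the sketch.</p>
<preformat>
```python
from math import comb


def contingency(in_signature, dili_positive):
    """Count drugs by (gene in signature?, DILI positive?) from 0/1 lists."""
    pairs = list(zip(in_signature, dili_positive))
    a = sum(1 for g, y in pairs if g and y)          # in signature, DILI positive
    b = sum(1 for g, y in pairs if g and not y)      # in signature, DILI negative
    c = sum(1 for g, y in pairs if not g and y)      # not in signature, DILI positive
    d = sum(1 for g, y in pairs if not g and not y)  # not in signature, DILI negative
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(low, high + 1))
               if observed * (1 + 1e-7) >= p)
```
</preformat>
<p>Under this sketch, a gene would be retained as a feature when the returned <italic>p</italic>-value falls below the 0.01 cutoff.</p>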
</sec>
<sec id="s2-8">
<title>Machine Learning</title>
<p>The prediction of DILI was treated as a binary classification problem for each DILI type. That is, for each of DILI1, DILI3, DILI5, and DILI6, outcomes were split between &#x201c;positive&#x201d; and &#x201c;negative&#x201d;. We used 5-fold cross-validation repeated 100 times and a random search strategy to find the best hyperparameters for each model. The data were provided with the training and test sets already defined. Importantly, we did not have access to the correct labels for the test data. Models were built using traditional machine learning algorithms within the caret (<xref ref-type="bibr" rid="B21">Kuhn, 2008</xref>) package in R version 4.0.0 (<xref ref-type="bibr" rid="B36">R Core Team, 2020</xref>).</p>
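<p>The resampling scheme can be pictured with the short Python sketch below, which generates the train/held-out index splits for repeated <italic>k</italic>-fold cross-validation and draws random hyperparameter candidates. It is an illustration under stated assumptions, not the authors&#x2019; caret configuration: the function names, the seed, and the scoring callback are hypothetical, and the splits here are not stratified by class.</p>
<preformat><![CDATA[
```python
import random


def repeated_kfold(n_samples, k=5, repeats=100, seed=0):
    """Yield (train_indices, held_out_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        folds = [indices[i::k] for i in range(k)]
        for held_out in range(k):
            train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
            yield train, folds[held_out]


def random_search(param_space, score_fn, n_iter=10, seed=0):
    """Evaluate n_iter random parameter draws and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in param_space.items()}
        score = score_fn(params)  # e.g. mean cross-validated AUC (user supplied)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```
]]></preformat>
<p>With <monospace>k=5</monospace> and <monospace>repeats=100</monospace>, each candidate model would be fitted and scored on 500 train/held-out splits.</p>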
<p>The machine learning algorithms we used are suitable for classification tasks. They include Logistic Regression (LR) (<xref ref-type="bibr" rid="B10">Cox, 1958</xref>), Linear Discriminant Analysis (LDA) (<xref ref-type="bibr" rid="B44">Li and Jain, 2009</xref>), Decision Trees (DT) (<xref ref-type="bibr" rid="B35">Quinlan, 1986</xref>), Support Vector Machines (SVM) (<xref ref-type="bibr" rid="B9">Cortes and Vapnik, 1995</xref>), Na&#xef;ve Bayes (NB) (<xref ref-type="bibr" rid="B15">Hand and Yu, 2001</xref>), a One-layer Neural Network (Nnet), and a Random Forest (RF) algorithm. LR and LDA are generally categorized as linear classification models. Given a set of predictors, LR models the log-odds of class membership as a linear combination of these predictors, with coefficients estimated by maximizing the likelihood. LDA assumes that the predictors follow a normal distribution within each class and combines these class-conditional densities with the prior probability of belonging to a class to estimate posterior probabilities using Bayes&#x2019; Theorem. DT and RF are often classified as tree- and rule-based algorithms. Given a set of predictors, a decision tree uses if-else conditions to build a definitive set of rules through successive splits. The challenge usually lies in determining the optimal points at which to apply a &#x201c;then&#x201d;-clause (or a split). RF uses similar conditional statements. However, instead of using the entire sample of data to build a single tree, RF builds many decision trees, each on a bootstrap subsample of the training data, and classifies an observation by a majority vote across the trees. Neural networks and SVMs are generally grouped as non-linear algorithms. Neural networks (in our case, a multilayer perceptron, i.e., a neural network with one hidden layer) are loosely modeled after how neurons in the human brain work. The prediction is a linear combination of the hidden-layer outputs transformed by a non-linear activation function.
Several activation functions are available, and the choice depends on whether the problem is a regression or classification problem. In our case, we used a sigmoidal (logistic) function, since we were dealing with a classification problem. SVM aims to find the decision boundary (hyperplane) that separates the classes with the largest possible margin. The support vectors are the data points that lie closest to this boundary; intuitively, they are the most difficult points to separate and are the ones that determine where the boundary lies. There are different flavors of SVMs depending on the kernel used (kernels play a role loosely analogous to the non-linear activation functions used in neural networks, allowing non-linear class boundaries). In the current study, we used polynomial, linear, and radial basis function kernels.</p>
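<p>To make the one-hidden-layer network concrete, the following Python sketch computes the predicted probability for a single observation. It assumes logistic activations in both the hidden and output layers; the function names and all weights are hypothetical, and no training step is shown.</p>
<preformat><![CDATA[
```python
from math import exp


def logistic(z):
    """Sigmoidal activation mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def predict_probability(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a network with one hidden layer and one output unit."""
    # Each hidden unit applies the activation to a weighted sum of the predictors.
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(hidden_weights, hidden_biases)]
    # The output unit does the same with the hidden-layer outputs.
    return logistic(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Two predictors, two hidden units, made-up weights.
p_positive = predict_probability(
    [0.5, -1.2],
    hidden_weights=[[0.8, -0.4], [-0.3, 0.9]],
    hidden_biases=[0.1, -0.2],
    output_weights=[1.5, -2.0],
    output_bias=0.05,
)
```
]]></preformat>
<p>The returned value can be read as the estimated probability of the positive class and thresholded (for example at 0.5) to obtain a class label.</p>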
</sec>
<sec id="s2-9">
<title>Model Evaluation</title>
<p>To evaluate the performance of our models, we focused on the area under the ROC curve (AUC) as well as the specificity, sensitivity, accuracy, and Matthews correlation coefficient (MCC) of the models on the test set. The AUC is a widely used metric in binary classification problems. An AUC value of one indicates a perfect classifier, i.e., a model that separates both classes perfectly, while an AUC value of 0.5 indicates a model that predicts at random. Depending on the application domain, AUC values of 0.7 and above are usually considered acceptable. Specificity is the proportion of all negative observations that the model correctly identifies as negative, while sensitivity is the proportion of all positive observations that the model correctly identifies as positive. These metrics depend on which class is designated as positive when the target labels are passed to the algorithm, and they range from 0 to 1. Additionally, we evaluated the performance of our models on the test set by calculating the balanced accuracy of prediction. Balanced accuracy is the average of the sensitivity and specificity, that is, the average of the fractions of observations predicted correctly within each class. We used this metric because we observed class imbalance within our datasets regardless of DILI&#x20;type.</p>
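<p>These definitions translate directly into code. The Python sketch below is illustrative only (the function names are our own, and labels are assumed to be coded 1 for positive and 0 for negative); the AUC is computed in its rank-based form, as the probability that a randomly chosen positive observation receives a higher score than a randomly chosen negative one.</p>
<preformat><![CDATA[
```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for 0/1 labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    return (pairs.count((1, 1)), pairs.count((0, 0)),
            pairs.count((0, 1)), pairs.count((1, 0)))


def sensitivity(tp, fn):
    return tp / (tp + fn)  # fraction of positives correctly identified


def specificity(tn, fp):
    return tn / (tn + fp)  # fraction of negatives correctly identified


def balanced_accuracy(tp, tn, fp, fn):
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


def auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```
]]></preformat>
<p>For example, a sensitivity of 0.38 and a specificity of 0.95 average to a balanced accuracy of 0.665, even though the model misses most positive drugs.</p>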
<p>MCC is particularly useful for datasets with unequal class distributions (imbalanced data) because it takes into account all four entries of the confusion matrix: true positives, true negatives, false positives, and false negatives. Its values range from &#x2212;1 to &#x2b;1, with &#x2b;1 indicating perfect classification, 0 indicating performance no better than random classification, and &#x2212;1 indicating complete disagreement between the observed and predicted classes.</p>
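<p>For reference, the MCC can be computed from the four confusion-matrix counts as in the Python sketch below. The function name is our own, and returning 0 when the denominator vanishes is a common convention adopted here as an assumption rather than something stated in the source.</p>
<preformat><![CDATA[
```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # A whole row or column of the confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```
]]></preformat>
<p>A classifier that labels every drug as negative has no true or false positives, so its MCC is 0 under this convention however high its raw accuracy is.</p>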
</sec>
<sec id="s2-10">
<title>Ensemble Voting Machine Learning</title>
<p>In an attempt to improve the classification accuracy of our models, we used three ensemble voting approaches, namely soft voting, hard voting, and weighted voting. These ensemble methods work best when the algorithms are diverse, i.e., when they make different underlying assumptions about the data, and when each one has reasonable predictive power (<xref ref-type="bibr" rid="B22">Kuncheva, 2002</xref>; <xref ref-type="bibr" rid="B45">Van Erp et&#x20;al., 2002</xref>). Using the gene expression data provided by the CAMDA 2018 organizers, Sumsion and others (<xref ref-type="bibr" rid="B43">Sumsion et&#x20;al., 2020</xref>) used hard and soft voting ensemble methods in an attempt to improve prediction accuracy on DILI risk. As an extension of their work, we hypothesized that, with access to larger and more diverse datasets, we could capture different aspects of predicting DILI types and use these ensemble methods to improve prediction.</p>
<p>Hard voting, also known as majority voting, takes into account the predicted class labels of each classifier (or voter) (<xref ref-type="bibr" rid="B37">Ruta and Gabrys, 2005</xref>). Voting is done by counting how many times each class label was predicted across all classifiers. The class label with the highest count is taken to be the predicted class label for that observation. On the other hand, soft voting considers the probability assigned to each class label by each classifier (<xref ref-type="bibr" rid="B26">Lin et&#x20;al., 2003</xref>). In other words, it considers how certain each classifier is about the class labels. For each class label, the probabilities are averaged across classifiers, and the label with the highest average probability is taken as the predicted class label for the observation.</p>
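<p>The difference between the two schemes is easiest to see on a small example. The Python sketch below is a minimal illustration with hypothetical function names and made-up probabilities; it is not the code used in the study.</p>
<preformat><![CDATA[
```python
from collections import Counter


def hard_vote(labels):
    """labels: the class label predicted by each classifier for one observation."""
    # With an odd number of binary classifiers there can be no tie.
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """probabilities: one dict {class_label: probability} per classifier."""
    n = len(probabilities)
    mean = {label: sum(p[label] for p in probabilities) / n
            for label in probabilities[0]}
    return max(mean, key=mean.get)


# Two classifiers lean weakly negative, one is strongly positive.
probs = [
    {"positive": 0.45, "negative": 0.55},
    {"positive": 0.45, "negative": 0.55},
    {"positive": 0.95, "negative": 0.05},
]
labels = [max(p, key=p.get) for p in probs]
hard_result = hard_vote(labels)
soft_result = soft_vote(probs)
```
]]></preformat>
<p>Here hard voting returns the negative class (two votes to one), whereas soft voting returns the positive class because the mean positive probability is about 0.62.</p>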
<p>The third approach to voting uses weights to skew predictions towards the best-performing models. In our approach, we used the AUC of each classifier as the weighting parameter for its output probabilities. This was done to take into account that some classifiers might have better predictive power and should be given preference in determining the outcome of the voting. To weight each probability, we multiplied the probability of each predicted class by the AUC of the model and divided the product by the AUC minus one. Afterward, the weighted probabilities were treated just as in soft voting, by taking the average of all resulting weighted probabilities belonging to each class. The class label with the higher average was taken as the predicted class for that observation. Therefore, the predicted class, <inline-formula id="inf1">
<mml:math id="m1">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>, of an observation, given the set of class membership probabilities output by the models, <inline-formula id="inf2">
<mml:math id="m2">
<mml:mi>P</mml:mi>
</mml:math>
</inline-formula>, is given by:<disp-formula id="equ1">
<mml:math id="m3">
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>argmax</mml:mi>
<mml:mo>&#xa0;</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>m</mml:mi>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1,2</mml:mn>
<mml:mo>&#x2026;</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mi>w</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>m</mml:mi>
</mml:msubsup>
<mml:mo>&#x2217;</mml:mo>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mi>w</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>m</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf3">
<mml:math id="m4">
<mml:mi>m</mml:mi>
</mml:math>
</inline-formula> is a model, <inline-formula id="inf4">
<mml:math id="m5">
<mml:mrow>
<mml:msubsup>
<mml:mi>w</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>m</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> is the weighting parameter (here, the AUC) of that model, and <inline-formula id="inf5">
<mml:math id="m6">
<mml:mrow>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> is the predicted probability of membership in class <italic>c</italic>.</p>
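<p>A minimal Python sketch of AUC-weighted voting is given below. One point is an explicit assumption on our part: the printed denominator, the weight minus one, is negative for any AUC below one, so the sketch uses the positive odds-style factor <italic>w</italic>/(1 &#x2212; <italic>w</italic>) so that probabilities from models with higher AUC count for more. The function name, class labels, probabilities, and AUC values are all hypothetical.</p>
<preformat><![CDATA[
```python
def weighted_vote(probabilities, aucs):
    """Return the class with the highest mean AUC-weighted probability.

    probabilities: one dict {class_label: probability} per model.
    aucs: one AUC per model, each strictly between 0 and 1.
    """
    if any(not 0.0 < w < 1.0 for w in aucs):
        raise ValueError("each AUC must lie strictly between 0 and 1")
    m = len(probabilities)
    scores = {}
    for label in probabilities[0]:
        # ASSUMPTION: factor w / (1 - w); the equation as printed divides by (w - 1).
        scores[label] = sum(w * p[label] / (1.0 - w)
                            for p, w in zip(probabilities, aucs)) / m
    return max(scores, key=scores.get)


# Two weak models lean negative; the strongest model leans positive.
probs = [
    {"positive": 0.40, "negative": 0.60},
    {"positive": 0.40, "negative": 0.60},
    {"positive": 0.65, "negative": 0.35},
]
model_aucs = [0.55, 0.60, 0.80]
weighted_result = weighted_vote(probs, model_aucs)
```
]]></preformat>
<p>With these numbers the unweighted mean positive probability is below 0.5, so plain soft voting would predict the negative class, while the weighted vote follows the model with the highest AUC and predicts the positive class.</p>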
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<sec id="s3-1">
<title>Food and Drug Administration Adverse Event Reporting System Modeling</title>
<p>The performance of the models built on FAERS data in predicting each of the DILI types can be seen in the bar plots in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref>. While we built many models, we compared them and picked the best three, based on their AUC values, to predict DILI class on the test set. Using the raw data (without resampling), models achieved classification accuracies between 0.51 and 0.55 and MCCs between 0.04 and 0.14 on the training set and did not do noticeably better on the test set (accuracy: 0.49&#x2013;0.59, MCC: &#x2212;0.03 to 0.22). On the other hand, using resampled datasets improved the accuracy of the models on the training set to a range of 0.61&#x2013;0.94 (MCC: 0.47&#x2013;0.89). Using these models to predict the DILI class of the test set showed a slight improvement in accuracy (0.52&#x2013;0.62). The MCC, however, remained between 0.04 and&#x20;0.24.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>FAERS Model Performance. Performance evaluation of the DILI predictive models built using the FAERS reporting data was conducted on both the original unbalanced and the resampled/balanced datasets. The best-performing algorithms, as determined by AUC, were selected from among GLM, LDA, NB, NNET, RF, RPART, and SVMPoly. For DILI1 and DILI3, the highest accuracy was 0.62 with MCC values of 0.21 and 0.24, respectively.</p>
</caption>
<graphic xlink:href="fphar-12-648805-g003.tif"/>
</fig>
</sec>
<sec id="s3-2">
<title>TOX21 Modeling</title>
<p>The top three models built using TOX21 data (using the AUC as the criterion) were evaluated on the test set (<xref ref-type="fig" rid="F4">Figure&#x20;4</xref>). Using the data as is, without resampling, the accuracy on the training data was between 0.50 and 0.57 (MCC: &#x2212;0.02 to 0.17). As expected, the models failed to generalize to the test set (accuracy: 0.50&#x2013;0.59, MCC: &#x2212;0.04 to 0.19). Again, we observed that resampling slightly improved the accuracies of these models on the training set (accuracy: 0.62&#x2013;0.76, MCC: 0.25&#x2013;0.54). Yet, there was no major improvement on the test set (accuracy: 0.50&#x2013;0.58, MCC: &#x2212;0.01 to 0.20).</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>TOX21 Model Performance. The performance of DILI predictive models built using the toxicology information provided from TOX21. The three best-performing algorithms, based on training AUC and based on whether resampling was used or not, are presented in the bar&#x20;plots.</p>
</caption>
<graphic xlink:href="fphar-12-648805-g004.tif"/>
</fig>
</sec>
<sec id="s3-3">
<title>MOLD2 Modeling</title>
<p>As with the FAERS data, we selected the top three performing models built using MOLD2 data in each category (resampled or non-resampled) to predict the DILI class of the test data (<xref ref-type="fig" rid="F5">Figure&#x20;5</xref>). Models built using the non-resampled MOLD2 dataset gave training accuracies of 0.50&#x2013;0.54, showing that the models were predicting the classes essentially at random (MCC: 0.00&#x2013;0.17). Performance on the test set was similar, with a slight improvement (accuracy: 0.50&#x2013;0.66, MCC: &#x2212;0.01 to 0.36). As we observed with the FAERS data, resampling the dataset improved both the accuracy and the MCC on the training set (accuracy: 0.71&#x2013;0.78, MCC: 0.56&#x2013;0.76), but these models did not generalize to the test set any better than those built on non-resampled MOLD2 data (accuracy: 0.51&#x2013;0.67, MCC: 0.14&#x2013;0.36).</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>MOLD2 Model Performance. The chemical structural information from MOLD2 was imbalanced between DILI positive and negative samples. Predictive models were evaluated on both the unbalanced and resampled/balanced datasets. The three best-performing models for each DILI type, based on AUC and resampling methods, are depicted in the bar graphs.</p>
</caption>
<graphic xlink:href="fphar-12-648805-g005.tif"/>
</fig>
</sec>
<sec id="s3-4">
<title>Connectivity Map L1000 Cell Expression Modeling</title>
<p>Cellular RNA expression levels in the form of microarray data have previously been investigated for their ability to predict DILI, with limited predictive power (<xref ref-type="bibr" rid="B8">Chierici et&#x20;al., 2020</xref>). In the current study, the L1000 data from the Connectivity Map was used, including both the measured landmark genes and the inferred transcriptome. We built models using the expression data from each cell line to investigate which cell lines were most successful in predicting DILI. <xref ref-type="table" rid="T4">Table&#x20;4</xref> summarizes the model results based on our training data. However, because each cell line only provided expression response data for a subset of the drugs (<xref ref-type="fig" rid="F2">Figure&#x20;2</xref>) in the training and test data, accuracy based on test data was not meaningful. Additional processing steps for this data involved merging across the six cell lines to generate a representative signature, testing different cutoffs for the number of highest- and lowest-ranked genes to use, and feature selection to determine predictor&#x20;genes.</p>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Training Performance on Independent Cell Line based Models. Each of the six cell lines with L1000 expression data was used to build predictive models of the four DILI classes. Training performance results for the best-performing model for each cell type and DILI class are shown, as well as the number of predictors following feature selection as described in the Methods section.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">DILI class</th>
<th align="center">Cell type tested</th>
<th align="center">ML algorithm</th>
<th align="center">Predictors</th>
<th align="center">AUC-ROC</th>
<th align="center">Sensitivity</th>
<th align="center">Specificity</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td rowspan="6" align="left">DILI 1</td>
<td align="left">PHH</td>
<td align="left">SVM</td>
<td align="char" char=".">60</td>
<td align="char" char=".">0.969</td>
<td align="char" char=".">0.912</td>
<td align="char" char=".">0.945</td>
</tr>
<tr>
<td align="left">Hep G2</td>
<td align="left">SVM</td>
<td align="char" char=".">72</td>
<td align="char" char=".">0.922</td>
<td align="char" char=".">0.924</td>
<td align="char" char=".">0.693</td>
</tr>
<tr>
<td align="left">HA1E</td>
<td align="left">GLM</td>
<td align="char" char=".">40</td>
<td align="char" char=".">0.781</td>
<td align="char" char=".">0.903</td>
<td align="char" char=".">0.389</td>
</tr>
<tr>
<td align="left">A-375</td>
<td align="left">GLM</td>
<td align="char" char=".">178</td>
<td align="char" char=".">0.627</td>
<td align="char" char=".">0.826</td>
<td align="char" char=".">0.17</td>
</tr>
<tr>
<td align="left">MCF7</td>
<td align="left">GLM</td>
<td align="char" char=".">65</td>
<td align="char" char=".">0.722</td>
<td align="char" char=".">0.898</td>
<td align="char" char=".">0.222</td>
</tr>
<tr>
<td align="left">PC3</td>
<td align="left">RF</td>
<td align="char" char=".">315</td>
<td align="char" char=".">0.589</td>
<td align="char" char=".">1.000</td>
<td align="char" char=".">0</td>
</tr>
<tr>
<td rowspan="6" align="left">DILI 3</td>
<td align="left">PHH</td>
<td align="left">NB</td>
<td align="char" char=".">50</td>
<td align="char" char=".">0.931</td>
<td align="char" char=".">0.547</td>
<td align="char" char=".">0.957</td>
</tr>
<tr>
<td align="left">Hep G2</td>
<td align="left">RF</td>
<td align="char" char=".">75</td>
<td align="char" char=".">0.913</td>
<td align="char" char=".">0.942</td>
<td align="char" char=".">0.625</td>
</tr>
<tr>
<td align="left">HA1E</td>
<td align="left">SVM</td>
<td align="char" char=".">176</td>
<td align="char" char=".">0.922</td>
<td align="char" char=".">0.953</td>
<td align="char" char=".">0.788</td>
</tr>
<tr>
<td align="left">A-375</td>
<td align="left">SVM</td>
<td align="char" char=".">3,610</td>
<td align="char" char=".">0.833</td>
<td align="char" char=".">0.869</td>
<td align="char" char=".">0.607</td>
</tr>
<tr>
<td align="left">MCF7</td>
<td align="left">SVM</td>
<td align="char" char=".">74</td>
<td align="char" char=".">0.861</td>
<td align="char" char=".">0.872</td>
<td align="char" char=".">0.742</td>
</tr>
<tr>
<td align="left">PC3</td>
<td align="left">SVM</td>
<td align="char" char=".">345</td>
<td align="char" char=".">0.844</td>
<td align="char" char=".">0.863</td>
<td align="char" char=".">0.606</td>
</tr>
<tr>
<td rowspan="6" align="left">DILI 5</td>
<td align="left">PHH</td>
<td align="left">GLM</td>
<td align="char" char=".">8</td>
<td align="char" char=".">0.723</td>
<td align="char" char=".">0.484</td>
<td align="char" char=".">0.761</td>
</tr>
<tr>
<td align="left">Hep G2</td>
<td align="left">RF</td>
<td align="char" char=".">17</td>
<td align="char" char=".">0.719</td>
<td align="char" char=".">0.984</td>
<td align="char" char=".">0.229</td>
</tr>
<tr>
<td align="left">HA1E</td>
<td align="left">GLM</td>
<td align="char" char=".">20</td>
<td align="char" char=".">0.711</td>
<td align="char" char=".">0.693</td>
<td align="char" char=".">0.513</td>
</tr>
<tr>
<td align="left">A-375</td>
<td align="left">GLM</td>
<td align="char" char=".">24</td>
<td align="char" char=".">0.724</td>
<td align="char" char=".">0.786</td>
<td align="char" char=".">0.561</td>
</tr>
<tr>
<td align="left">MCF7</td>
<td align="left">RF</td>
<td align="char" char=".">38</td>
<td align="char" char=".">0.679</td>
<td align="char" char=".">0.803</td>
<td align="char" char=".">0.355</td>
</tr>
<tr>
<td align="left">PC3</td>
<td align="left">GLM</td>
<td align="char" char=".">14</td>
<td align="char" char=".">0.661</td>
<td align="char" char=".">0.255</td>
<td align="char" char=".">0.961</td>
</tr>
<tr>
<td rowspan="6" align="left">DILI 6</td>
<td align="left">PHH</td>
<td align="left">GLM</td>
<td align="char" char=".">2</td>
<td align="char" char=".">0.574</td>
<td align="char" char=".">0.087</td>
<td align="char" char=".">0.990</td>
</tr>
<tr>
<td align="left">Hep G2</td>
<td align="left">RF</td>
<td align="char" char=".">31</td>
<td align="char" char=".">0.686</td>
<td align="char" char=".">0.000</td>
<td align="char" char=".">1.000</td>
</tr>
<tr>
<td align="left">HA1E</td>
<td align="left">RF</td>
<td align="char" char=".">27</td>
<td align="char" char=".">0.688</td>
<td align="char" char=".">0.247</td>
<td align="char" char=".">0.949</td>
</tr>
<tr>
<td align="left">A-375</td>
<td align="left">GLM</td>
<td align="char" char=".">16</td>
<td align="char" char=".">0.619</td>
<td align="char" char=".">0.181</td>
<td align="char" char=".">0.945</td>
</tr>
<tr>
<td align="left">MCF7</td>
<td align="left">RF</td>
<td align="char" char=".">24</td>
<td align="char" char=".">0.689</td>
<td align="char" char=".">0.186</td>
<td align="char" char=".">0.975</td>
</tr>
<tr>
<td align="left">PC3</td>
<td align="left">RF</td>
<td align="char" char=".">53</td>
<td align="char" char=".">0.724</td>
<td align="char" char=".">0.159</td>
<td align="char" char=".">0.986</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The models built using the merged expression signatures with the highest AUC on the training data were evaluated on the test set. The training and test results are summarized in the bar plots in <xref ref-type="fig" rid="F6">Figure&#x20;6</xref>. None of the cell expression signatures performed well when predicting DILI3, DILI5, or DILI6, with accuracies ranging from 0.39 to 0.64 and MCC values ranging from &#x2212;0.03 to 0.1. These models did have some limited success predicting DILI1, with the merged SVM 1000 model performing the best, reaching an accuracy of 0.67 but an MCC of only 0.10 (<xref ref-type="sec" rid="s9">Supplementary Table&#x20;1</xref>). The poor predictability of DILI3 status by these models was unexpected, with the best model reaching an accuracy of 0.49 with 0.33 sensitivity and 0.66 specificity. The limited success in predicting DILI5 and DILI6 was expected based on the positive and negative control construction of these DILI classes, which are not reflected in the gene expression&#x20;data.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Cell Expression Model Performance. A single cell expression signature for each drug was generated using Kru-Bor merging across all cell lines in which the drug was tested, as described in the Methods. Following merging, feature selection using a Fisher&#x2019;s exact test was performed on expression signatures of the top and bottom 100, 250, 500, and 1,000 ranked genes. Models built on these predictors were evaluated, and the top-performing ones, based on AUC, are shown in the training set bar&#x20;graph.</p>
</caption>
<graphic xlink:href="fphar-12-648805-g006.tif"/>
</fig>
</sec>
<sec id="s3-5">
<title>Ensemble Voting Models Performance</title>
<p>Since the top three individual models did not perform well on the test set (<xref ref-type="sec" rid="s9">Supplementary Table&#x20;1</xref>), we asked whether aggregating the top three models in an ensemble approach could improve the accuracy. To test this, we applied three ensemble voting methods, namely soft voting, hard voting, and weighted voting. Hard voting gave accuracies of 0.39 and 0.37 on DILI1 and DILI3, respectively, while soft voting gave accuracies of 0.44 and 0.40 for DILI1 and DILI3, respectively (<xref ref-type="fig" rid="F7">Figure 7</xref>). Soft voting slightly improved the accuracy of these models, most likely because it considers membership probabilities rather than predicted class labels. We observed that weighted voting improved the accuracy further, to 0.54 for DILI1 and 0.60 for DILI3. Our weighted approach considers both the probabilities and the AUC of the models and emphasizes the contribution of models with higher AUCs. Sumsion and others used similar approaches (soft and hard voting) with gene expression data, resulting in decreased accuracies (<xref ref-type="bibr" rid="B43">Sumsion et&#x20;al., 2020</xref>). Compared to their study, our approach improved the accuracies of the models. However, we do not report MCCs for these methods because we do not have access to the true positives, true negatives, false positives, and false negatives in the test&#x20;data.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Ensemble Voting Method Performance. To incorporate the various types of data provided, ensemble methods including hard, soft, and weighted voting were tested using the top three performing models for each DILI type.</p>
</caption>
<graphic xlink:href="fphar-12-648805-g007.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>CAMDA 2020 was a collaborative challenge to establish predictive models of DILI using gene expression data as well as a combination of data from clinically reported events, drug structure, and toxicology. In our study, we evaluated how well these datasets predict four DILI types, namely, DILI1 (severity score &#x2265;6), DILI3 (withdrawn, box warning, warning, and precaution), DILI5 (negative control), and DILI6 (positive control). These datasets included gene expression/perturbation data on six cell lines (PHH, HepG2, HA1E, A375, MCF7, and PC3), concentration-response or toxicology information, 2D molecular descriptors of the drug structure, and reported adverse events. To assess the predictive abilities of these datasets, we used various traditional machine learning algorithms. For non-gene expression datasets, we corrected the class imbalance using well-known techniques such as SMOTE, ROSE, and upsampling of the minority&#x20;class.</p>
<p>While CAMDA has previously posed the problem of predicting DILI, the data provided and the scope of the challenge have improved significantly each year. In 2018, the challenge data only included microarray expression data from cell lines not relevant to the liver on 276 compounds with a binary DILI classification. Published results from the 2018 challenge indicate limited success from both deep learning and soft voting approaches, which achieved a maximum accuracy of 0.7 and MCC values &#x3c; 0.2 (<xref ref-type="bibr" rid="B8">Chierici et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B43">Sumsion et&#x20;al., 2020</xref>). When the CMap drug safety challenge was readministered in 2019, the data expanded to L1000 transcriptomic data on 13 cell lines, and participants were allowed to use external data sources such as protein-protein interactions, drug-protein targets, and chemical descriptors. The DILI classifications for this challenge also changed from binary to four categories (most, less, ambiguous, and no DILI concern), in line with the FDA DILIrank dataset. Multiple distinct approaches to the 2019 challenge often yielded similar accuracies of around 0.70 (<xref ref-type="bibr" rid="B1">Aguirre-Plans et&#x20;al., 2021</xref>; <xref ref-type="bibr" rid="B23">Lesi&#x144;ski et&#x20;al., 2021</xref>; <xref ref-type="bibr" rid="B27">Liu et&#x20;al., 2021</xref>). While it is difficult to make a direct comparison across the years of these challenges, considering how fundamental elements of predictive modeling such as the data sources and classifications have changed, the goal of the challenge has remained the same: modeling the risk that a drug leads to liver injury in patients. The data structure of the challenge has also improved in each iteration, attempting to expand both the predictive power of the data and the sample size to allow for more robust modeling.
However, as in previous years, the highest accuracy we were able to achieve in the current study was 0.67 for DILI1 and DILI3 with the highest MCC value of 0.36. This suggests that there is still room for improvement in both model construction and developing robust predictive data, which captures the scope of DILI.</p>
<p>In our study, we developed models with gene expression data using individual cell lines, as well as a merging of these datasets. No single cell line dataset included all the drugs, which reduced the size of the training data and made it difficult to evaluate each of them on the test set. Therefore, we merged these datasets into one expression signature across cell types. Further, we selected the 100, 250, 500, and 1,000 most upregulated and downregulated genes as arbitrary signature cutoffs for the genes most perturbed by drug treatment. However, our approach failed to capture predictive differences between the positive and negative classes in each DILI type. Although we achieved an accuracy of 0.67 for DILI1 (on the test set), a sensitivity of 0.38 showed that our models were not learning the positive class well enough. Usually, this problem is due to not having sufficient training examples for a particular class. In contrast, we could obtain specificity as high as 0.95, showing that the model could learn the negative class well since there were more DILI negative drugs in the training set. <xref ref-type="table" rid="T5">Table&#x20;5</xref> summarizes the best performances on the test set. We observed that many of these models failed to generalize to the test set, i.e., they showed poor predictability on the test set (see <xref ref-type="sec" rid="s9">Supplementary Table&#x20;1</xref> for all models). Since the individual models did not perform well on the test set, we attempted ensemble (voting) methods to improve prediction accuracy. We used soft voting, hard voting, and weighted voting approaches. In weighted-voting methods, there are diverse ways through which importance can be attached to each model.
Weight-based ensemble methods tend to outperform single models, and even soft voting, because in addition to the posterior probabilities produced by the models, they take into consideration an importance or weighting factor (<xref ref-type="bibr" rid="B33">Mu et&#x20;al., 2009</xref>). Although these methods could not improve test accuracy beyond that of the individual models, weighted voting performed better than soft and hard voting because it weights the predicted probabilities of the test examples by the performance of each&#x20;model.</p>
<table-wrap id="T5" position="float">
<label>TABLE 5</label>
<caption>
<p>Testing Performance of Top Models. The testing result metrics from the best model built using each dataset as well as the ensemble voting&#x20;model.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Dataset</th>
<th align="center">DILI class</th>
<th align="center">Algorithm</th>
<th align="center">Test sensitivity</th>
<th align="center">Test specificity</th>
<th align="center">Test MCC</th>
<th align="center">Test balanced accuracy</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td rowspan="4" align="left">Merged expression</td>
<td align="center">DILI1</td>
<td align="left">SVM</td>
<td align="char" char=".">0.38</td>
<td align="char" char=".">0.95</td>
<td align="char" char=".">0.1</td>
<td align="char" char=".">0.67</td>
</tr>
<tr>
<td align="center">DILI3</td>
<td align="left">SVM</td>
<td align="char" char=".">0.33</td>
<td align="char" char=".">0.66</td>
<td align="char" char=".">-0.03</td>
<td align="char" char=".">0.49</td>
</tr>
<tr>
<td align="center">DILI5</td>
<td align="left">SVM</td>
<td align="char" char=".">0.58</td>
<td align="char" char=".">0.7</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">0.64</td>
</tr>
<tr>
<td align="center">DILI6</td>
<td align="left">SVM</td>
<td align="char" char=".">0.48</td>
<td align="char" char=".">0.53</td>
<td align="char" char=".">0</td>
<td align="char" char=".">0.51</td>
</tr>
<tr>
<td rowspan="4" align="left">FAERS</td>
<td align="center">DILI1</td>
<td align="left">NNET</td>
<td align="char" char=".">0.51</td>
<td align="char" char=".">0.73</td>
<td align="char" char=".">0.21</td>
<td align="char" char=".">0.62</td>
</tr>
<tr>
<td align="center">DILI3</td>
<td align="left">RF</td>
<td align="char" char=".">0.54</td>
<td align="char" char=".">0.71</td>
<td align="char" char=".">0.24</td>
<td align="char" char=".">0.62</td>
</tr>
<tr>
<td align="center">DILI5</td>
<td align="left">RPART</td>
<td align="char" char=".">0.51</td>
<td align="char" char=".">0.57</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">0.54</td>
</tr>
<tr>
<td align="center">DILI6</td>
<td align="left">RF</td>
<td align="char" char=".">0.72</td>
<td align="char" char=".">0.47</td>
<td align="char" char=".">0.2</td>
<td align="char" char=".">0.6</td>
</tr>
<tr>
<td rowspan="4" align="left">MOLD2</td>
<td align="center">DILI1</td>
<td align="left">SVMPoly</td>
<td align="char" char=".">0.33</td>
<td align="char" char=".">0.88</td>
<td align="char" char=".">0.24</td>
<td align="char" char=".">0.61</td>
</tr>
<tr>
<td align="center">DILI3</td>
<td align="left">SVMPoly</td>
<td align="char" char=".">0.55</td>
<td align="char" char=".">0.8</td>
<td align="char" char=".">0.36</td>
<td align="char" char=".">0.67</td>
</tr>
<tr>
<td align="center">DILI5</td>
<td align="left">SVMPoly</td>
<td align="char" char=".">0.38</td>
<td align="char" char=".">0.64</td>
<td align="char" char=".">0.01</td>
<td align="char" char=".">0.51</td>
</tr>
<tr>
<td align="center">DILI6</td>
<td align="left">SVMPoly</td>
<td align="char" char=".">0.95</td>
<td align="char" char=".">0.99</td>
<td align="char" char=".">0.94</td>
<td align="char" char=".">0.97</td>
</tr>
<tr>
<td rowspan="4" align="left">TOX21</td>
<td align="center">DILI1</td>
<td align="left">NNET</td>
<td align="char" char=".">0.3</td>
<td align="char" char=".">0.82</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">0.56</td>
</tr>
<tr>
<td align="center">DILI3</td>
<td align="left">GLM</td>
<td align="char" char=".">0.43</td>
<td align="char" char=".">0.76</td>
<td align="char" char=".">0.19</td>
<td align="char" char=".">0.59</td>
</tr>
<tr>
<td align="center">DILI5</td>
<td align="left">GLM</td>
<td align="char" char=".">0.3</td>
<td align="char" char=".">0.75</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">0.53</td>
</tr>
<tr>
<td align="center">DILI6</td>
<td align="left">QDA</td>
<td align="char" char=".">0.63</td>
<td align="char" char=".">0.62</td>
<td align="char" char=".">0.26</td>
<td align="char" char=".">0.63</td>
</tr>
<tr>
<td rowspan="4" align="left">Ensemble voting</td>
<td align="center">DILI1</td>
<td align="left">Weighted voting</td>
<td align="char" char=".">0.16</td>
<td align="char" char=".">0.92</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">0.54</td>
</tr>
<tr>
<td align="center">DILI3</td>
<td align="left">Weighted voting</td>
<td align="char" char=".">0.3</td>
<td align="char" char=".">0.89</td>
<td align="char" char=".">0.24</td>
<td align="char" char=".">0.6</td>
</tr>
<tr>
<td align="center">DILI5</td>
<td align="left">Weighted voting</td>
<td align="char" char=".">0.28</td>
<td align="char" char=".">0.71</td>
<td align="char" char=".">-0.01</td>
<td align="char" char=".">0.5</td>
</tr>
<tr>
<td align="center">DILI6</td>
<td align="left">Weighted voting</td>
<td align="char" char=".">0.96</td>
<td align="char" char=".">0.97</td>
<td align="char" char=".">0.93</td>
<td align="char" char=".">0.96</td>
</tr>
</tbody>
</table>
</table-wrap>
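<p>For readers who wish to reproduce the kinds of metrics reported in Table&#x20;5, the following Python sketch computes sensitivity, specificity, balanced accuracy, and MCC from confusion-matrix counts using their standard definitions. The function name and the example counts are hypothetical and are not taken from this study; returning an MCC of 0 when the denominator is zero is a common convention that we assume here.</p>
<preformat><![CDATA[
```python
import math
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    balanced_accuracy = (sensitivity + specificity) / 2
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: MCC is set to 0 when the denominator is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in classification_metrics(tp=6, fn=4, tn=24, fp=6).items():
        print(f"{name}: {value:.2f}")
```
]]></preformat>
<p>As a check of the definition against Table&#x20;5, the merged expression DILI1 row reports a sensitivity of 0.38 and a specificity of 0.95; their mean, (0.38 + 0.95) / 2 = 0.665, agrees with the reported balanced accuracy of 0.67 to within rounding.</p>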
<p>One challenge was that the training set was perhaps too small to be further split into training and validation sets, whereas machine learning algorithms benefit from having sufficient examples. For some datasets, such as the gene expression datasets, we did not have information on all 617 drugs, which further reduced the size of the training data. In addition, the training data were highly imbalanced (<xref ref-type="table" rid="T3">Table&#x20;3</xref>). For instance, for DILI1 there were 96 positive and 326 negative examples. This imbalance resulted in many of our models having low sensitivity, since the positive examples were insufficient. To address this problem, we employed resampling techniques (SMOTE, ROSE, and upsampling of the minority class) to balance the datasets. However, models built using the balanced (resampled) data clearly overfitted the training set. A possible reason is that, because of our resampling approach, some training examples were also used in the validation stage during cross-validation. Finally, because the datasets were blinded, we could not explore how the features influenced the models.</p>
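<p>One common way to avoid the leakage described above is to resample only the training portion of each cross-validation fold, so that duplicated or synthetic examples never reach the validation portion. The Python sketch below illustrates this idea with simple random upsampling rather than SMOTE or ROSE. It is not the pipeline used in this study; the function names, the unstratified fold assignment, and the toy labels are assumptions made for illustration.</p>
<preformat><![CDATA[
```python
import random
from typing import List, Sequence, Tuple


def kfold_indices(n: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle indices 0..n-1 and return k (train, validation) index pairs."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [
        ([j for f, fold in enumerate(folds) if f != i for j in fold], folds[i])
        for i in range(k)
    ]


def upsample_minority(
    train_idx: Sequence[int], labels: Sequence[int], seed: int = 0
) -> List[int]:
    """Duplicate minority-class training indices until both classes match in size."""
    rng = random.Random(seed)
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        return list(train_idx)  # nothing to duplicate in this fold
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(train_idx) + extra


if __name__ == "__main__":
    # Toy labels only: 6 positives and 18 negatives, not the study data.
    labels = [1] * 6 + [0] * 18
    for train_idx, val_idx in kfold_indices(len(labels), k=3):
        balanced = upsample_minority(train_idx, labels)
        # Duplicates are drawn from the training fold only, so no validation
        # example can appear in the resampled training set.
        assert set(balanced).isdisjoint(val_idx)
        print(len(train_idx), len(balanced), len(val_idx))
```
]]></preformat>
<p>Resampling the full training set before the folds are created, by contrast, can place copies of the same drug on both sides of a split, which would inflate cross-validated performance in the manner suggested above.</p>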
<p>In summary, our study suggests that currently available data, including mRNA quantification, molecular descriptors, clinically reported events, and toxicology profiles, may not capture enough information to separate DILI classes in real-world scenarios. Larger datasets may also be needed to support the application of deep learning algorithms, which typically perform better with more data. We also suggest an additional focus or challenge on predicting DILI-specific biomarkers using various omics data, for instance single-cell data and metabolomics signatures.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>Data are available for download as provided by the CAMDA organizers at <ext-link ext-link-type="uri" xlink:href="http://camda2020.bioinf.jku.at/doku.php/contest_dataset">http://camda2020.bioinf.jku.at/doku.php/contest_dataset</ext-link>. The full data processing code used to obtain the results in this article can be found at <ext-link ext-link-type="uri" xlink:href="https://github.com/hurlab/CAMDA-Challenge-2020-Drug-Induced-Liver-Injury">https://github.com/hurlab/CAMDA-Challenge-2020-Drug-Induced-Liver-Injury</ext-link>.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>Participated in research design: TA, BM, KG, JH. Performed data analysis: TA, BM, KG, JH. Wrote or contributed to the writing of the manuscript: TA, BM. Overall supervision of the project:&#x20;JH.</p>
</sec>
<sec sec-type="COI-statement" id="s7">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s8">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>We would like to thank the CAMDA committee for organizing this challenge and providing the opportunity to present our results at ISMB&#x20;2020.</p>
</ack>
<sec id="s9">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fphar.2021.648805/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fphar.2021.648805/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="Table1.xlsx" id="SM1" mimetype="application/xlsx" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aguirre-Plans</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Pi&#xf1;ero</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Souza</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Callegaro</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Kunnen</surname>
<given-names>S. J.</given-names>
</name>
<name>
<surname>Sanz</surname>
<given-names>F.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>An Ensemble Learning Approach for Modeling the Systems Biology of Drug-Induced Injury</article-title>. <source>Biol. Direct</source> <volume>16</volume>, <fpage>1</fpage>&#x2013;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.1186/s13062-020-00288-x</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andrade</surname>
<given-names>R. J.</given-names>
</name>
<name>
<surname>Chalasani</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Bj&#xf6;rnsson</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Suzuki</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kullak-Ublick</surname>
<given-names>G. A.</given-names>
</name>
<name>
<surname>Watkins</surname>
<given-names>P. B.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Drug-induced Liver Injury</article-title>. <source>Nat. Rev. Dis. Primers</source> <volume>5</volume>. <pub-id pub-id-type="doi">10.1038/s41572-019-0105-0</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Atienzar</surname>
<given-names>F. A.</given-names>
</name>
<name>
<surname>Blomme</surname>
<given-names>E. A.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Hewitt</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Kenna</surname>
<given-names>J.&#x20;G.</given-names>
</name>
<name>
<surname>Labbe</surname>
<given-names>G.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>Key Challenges and Opportunities Associated with the Use of <italic>In Vitro</italic> Models to Detect Human DILI: Integrated Risk Assessment and Mitigation Plans</article-title>. <source>Biomed. Res. Int.</source> <volume>2016</volume>, <fpage>9737920</fpage>. <pub-id pub-id-type="doi">10.1155/2016/9737920</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Berlin</surname>
<given-names>J.&#x20;A.</given-names>
</name>
<name>
<surname>Glasser</surname>
<given-names>S. C.</given-names>
</name>
<name>
<surname>Ellenberg</surname>
<given-names>S. S.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Adverse Event Detection in Drug Development: Recommendations and Obligations beyond Phase 3</article-title>. <source>Am. J.&#x20;Public Health</source> <volume>98</volume>, <fpage>1366</fpage>&#x2013;<lpage>1371</lpage>. <pub-id pub-id-type="doi">10.2105/AJPH.2007.124537</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chawla</surname>
<given-names>N. V.</given-names>
</name>
<name>
<surname>Bowyer</surname>
<given-names>K. W.</given-names>
</name>
<name>
<surname>Hall</surname>
<given-names>L. O.</given-names>
</name>
<name>
<surname>Kegelmeyer</surname>
<given-names>W. P.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>SMOTE: Synthetic Minority Over-sampling Technique</article-title>. <source>J.&#x20;Artif. Intell. Res.</source> <volume>16</volume>, <fpage>321</fpage>&#x2013;<lpage>357</lpage>. <pub-id pub-id-type="doi">10.1613/jair.953</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Borlak</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Tong</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2016a</year>). <article-title>A Model to Predict Severity of Drug-Induced Liver Injury in Humans</article-title>. <source>Hepatology</source> <volume>64</volume>, <fpage>931</fpage>&#x2013;<lpage>940</lpage>. <pub-id pub-id-type="doi">10.1002/hep.28678</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Suzuki</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Thakkar</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Tong</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2016b</year>). <article-title>DILIrank: The Largest Reference Drug List Ranked by the Risk for Developing Drug-Induced Liver Injury in Humans</article-title>. <source>Drug Discov. Today</source> <volume>21</volume>, <fpage>648</fpage>&#x2013;<lpage>653</lpage>. <pub-id pub-id-type="doi">10.1016/j.drudis.2016.02.015</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chierici</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Francescatto</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bussola</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Jurman</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Furlanello</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Predictability of Drug-Induced Liver Injury by Machine Learning</article-title>. <source>Biol. Direct</source> <volume>15</volume>, <fpage>3</fpage>. <pub-id pub-id-type="doi">10.1186/s13062-020-0259-4</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cortes</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Vapnik</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>1995</year>). <article-title>Support-vector Networks</article-title>. <source>Mach. Learn.</source> <volume>20</volume>, <fpage>273</fpage>&#x2013;<lpage>297</lpage>. <pub-id pub-id-type="doi">10.1007/BF00994018</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cox</surname>
<given-names>D. R.</given-names>
</name>
</person-group> (<year>1958</year>). <article-title>The Regression Analysis of Binary Sequences</article-title>. <source>J.&#x20;R. Stat. Soc. Ser. B (Methodological)</source> <volume>20</volume>, <fpage>215</fpage>&#x2013;<lpage>232</lpage>. <pub-id pub-id-type="doi">10.1111/j.2517-6161.1958.tb00292.x</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Daly</surname>
<given-names>A. K.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Pharmacogenomics of Adverse Drug Reactions</article-title>. <source>Genome Med.</source> <volume>5</volume>, <fpage>5</fpage>. <pub-id pub-id-type="doi">10.1186/gm409</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garc&#xed;a-Cort&#xe9;s</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ortega-Alonso</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lucena</surname>
<given-names>M. I.</given-names>
</name>
<name>
<surname>Andrade</surname>
<given-names>R. J.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Drug-induced Liver Injury: a Safety Review</article-title>. <source>Expert Opin. Drug Saf.</source> <volume>17</volume>, <fpage>795</fpage>&#x2013;<lpage>804</lpage>. <pub-id pub-id-type="doi">10.1080/14740338.2018.1505861</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Guo</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>McGregor</surname>
<given-names>B. A.</given-names>
</name>
</person-group> (<year>2020</year>). <source>VennDetail: A Package for Visualization and Extract Details</source>. <comment>R package version 1.8.0</comment>, <ext-link ext-link-type="uri" xlink:href="https://github.com/hurlab/VennDetail">https://github.com/hurlab/VennDetail</ext-link>.</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hand</surname>
<given-names>D. J.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Idiot&#x27;s Bayes? Not So Stupid after All?</article-title>. <source>Int. Stat. Rev.</source> <volume>69</volume>, <fpage>385</fpage>&#x2013;<lpage>398</lpage>. <pub-id pub-id-type="doi">10.2307/1403452</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hong</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Ge</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Qian</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Fang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>L.</given-names>
</name>
<etal/>
</person-group> (<year>2008</year>). <article-title>Mold2, Molecular Descriptors from 2D Structures for Chemoinformatics and Toxicoinformatics</article-title>. <source>J.&#x20;Chem. Inf. Model.</source> <volume>48</volume>, <fpage>1337</fpage>&#x2013;<lpage>1344</lpage>. <pub-id pub-id-type="doi">10.1021/ci800038f</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Sakamuru</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Shahane</surname>
<given-names>S. A.</given-names>
</name>
<name>
<surname>Attene-Ramos</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>Modelling the Tox21 10&#x20;K Chemical Profiles for <italic>In Vivo</italic> Toxicity Prediction and Mechanism Characterization</article-title>. <source>Nat. Commun.</source> <volume>7</volume>, <fpage>10425</fpage>. <pub-id pub-id-type="doi">10.1038/ncomms10425</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Iorio</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Bosotti</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Scacheri</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Belcastro</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Mithbaokar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Ferriero</surname>
<given-names>R.</given-names>
</name>
<etal/>
</person-group> (<year>2010</year>). <article-title>Discovery of Drug Mode of Action and Drug Repositioning from Transcriptional Responses</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>107</volume>, <fpage>14621</fpage>&#x2013;<lpage>14626</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1000138107</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kohonen</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Parkkinen</surname>
<given-names>J.&#x20;A.</given-names>
</name>
<name>
<surname>Willighagen</surname>
<given-names>E. L.</given-names>
</name>
<name>
<surname>Ceder</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wennerberg</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Kaski</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>A Transcriptomics Data-Driven Gene Space Accurately Predicts Liver Cytopathology and Drug-Induced Liver Injury</article-title>. <source>Nat. Commun.</source> <volume>8</volume>, <fpage>15932</fpage>. <pub-id pub-id-type="doi">10.1038/ncomms15932</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kruskal</surname>
<given-names>J.&#x20;B.</given-names>
</name>
</person-group> (<year>1956</year>). <article-title>On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem</article-title>. <source>Proc. Am. Math. Soc.</source> <volume>7</volume> (<issue>1</issue>), <fpage>48</fpage>. <pub-id pub-id-type="doi">10.2307/2033241</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kuhn</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Building Predictive Models in R Using the caret Package</article-title>. <source>J.&#x20;Stat. Softw.</source> <volume>28</volume>, <fpage>1</fpage>&#x2013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v028.i05</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kuncheva</surname>
<given-names>L. I.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>A Theoretical Study on Six Classifier Fusion Strategies</article-title>. <source>IEEE Trans. Pattern Anal. Machine Intell.</source> <volume>24</volume>, <fpage>281</fpage>&#x2013;<lpage>286</lpage>. <pub-id pub-id-type="doi">10.1109/34.982906</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lesi&#x144;ski</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Mnich</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Goli&#x144;ska</surname>
<given-names>A. K.</given-names>
</name>
<name>
<surname>Rudnicki</surname>
<given-names>W. R.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Integration of Human Cell Lines Gene Expression and Chemical Properties of Drugs for Drug Induced Liver Injury Prediction</article-title>. <source>Biol. Direct</source> <volume>16</volume>, <fpage>2</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1186/s13062-020-00286-z</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2013</year>). <article-title>GeneExpressionSignature: An R Package for Discovering Functional Connections Using Gene Expression Signatures</article-title>. <source>OMICS: A J.&#x20;Integr. Biol.</source> <volume>17</volume>, <fpage>116</fpage>&#x2013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.1089/omi.2012.0087</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Space Oriented Rank-Based Data Integration</article-title>. <source>Stat. Appl. Genet. Mol. Biol.</source> <volume>9</volume>. <pub-id pub-id-type="doi">10.2202/1544-6115.1534</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yacoub</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Burns</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Simske</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Performance Analysis of Pattern Classifier Combination by Plurality Voting</article-title>. <source>Pattern Recognition Lett.</source> <volume>24</volume>, <fpage>1959</fpage>&#x2013;<lpage>1969</lpage>. <pub-id pub-id-type="doi">10.1016/S0167-8655(03)00035-7</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Walter</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wright</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Bartosik</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Dolciami</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Elbasir</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Prediction and Mechanistic Analysis of Drug-Induced Liver Injury (DILI) Based on Chemical Structure</article-title>. <source>Biol. Direct</source> <volume>16</volume>, <fpage>1</fpage>&#x2013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1186/s13062-020-00285-0</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Weng</surname>
<given-names>Z.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Machine-Learning Prediction of Oral Drug-Induced Liver Injury (DILI) <italic>via</italic> Multiple Features and Endpoints</article-title>. <source>Biomed. Res. Int.</source> <volume>2020</volume>, <fpage>1</fpage>&#x2013;<lpage>10</lpage>. <pub-id pub-id-type="doi">10.1155/2020/4795140</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>L&#xf3;pez-Longarela</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Morrison</surname>
<given-names>E. E.</given-names>
</name>
<name>
<surname>Tranter</surname>
<given-names>J.&#x20;D.</given-names>
</name>
<name>
<surname>Chahman-Vos</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>L&#xe9;onard</surname>
<given-names>J.-F.</given-names>
</name>
<name>
<surname>Gautier</surname>
<given-names>J.-C.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Direct Detection of miR-122 in Hepatotoxicity Using Dynamic Chemical Labeling Overcomes Stability and isomiR Challenges</article-title>. <source>Anal. Chem.</source> <volume>92</volume>, <fpage>3388</fpage>&#x2013;<lpage>3395</lpage>. <pub-id pub-id-type="doi">10.1021/acs.analchem.9b05449</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marzano</surname>
<given-names>A. V.</given-names>
</name>
<name>
<surname>Borghi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Cugno</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Adverse Drug Reactions and Organ Damage: The Skin</article-title>. <source>Eur. J.&#x20;Intern. Med.</source> <volume>28</volume>, <fpage>17</fpage>&#x2013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejim.2015.11.017</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Menardi</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Torelli</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Training and Assessing Classification Rules with Imbalanced Data</article-title>. <source>Data Min Knowl Disc</source> <volume>28</volume>, <fpage>92</fpage>&#x2013;<lpage>122</lpage>. <pub-id pub-id-type="doi">10.1007/s10618-012-0295-5</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Messner</surname>
<given-names>C. J.</given-names>
</name>
<name>
<surname>Premand</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Gaiser</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Kluser</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>K&#xfc;bler</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Suter-Dick</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Exosomal microRNAs Release as a Sensitive Marker for Drug-Induced Liver Injury <italic>In Vitro</italic></article-title>. <source>Appl. Vitro Toxicol.</source> <volume>6</volume>, <fpage>77</fpage>&#x2013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.1089/aivt.2020.0008</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Watta</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Hassoun</surname>
<given-names>M. H.</given-names>
</name>
</person-group> (<year>2009</year>). &#x201c;<article-title>Weighted Voting-Based Ensemble Classifiers with Application to Human Face Recognition and Voice Recognition</article-title>,&#x201d; in <conf-name>Proceedings of the International Joint Conference on Neural Networks</conf-name>, <fpage>2168</fpage>&#x2013;<lpage>2171</lpage>. <pub-id pub-id-type="doi">10.1109/IJCNN.2009.5178708</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ozer</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ratner</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Shaw</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bailey</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Schomaker</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>The Current State of Serum Biomarkers of Hepatotoxicity</article-title>. <source>Toxicology</source> <volume>245</volume>, <fpage>194</fpage>&#x2013;<lpage>205</lpage>. <pub-id pub-id-type="doi">10.1016/j.tox.2007.11.021</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Quinlan</surname>
<given-names>J.&#x20;R.</given-names>
</name>
</person-group> (<year>1986</year>). <article-title>Induction of Decision Trees</article-title>. <source>Mach. Learn.</source> <volume>1</volume>, <fpage>81</fpage>&#x2013;<lpage>106</lpage>. <pub-id pub-id-type="doi">10.1007/BF00116251</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="book">
<collab>R Core Team</collab> (<year>2020</year>). <article-title>R: A Language and Environment for Statistical Computing</article-title>. <publisher-loc>Vienna, Austria</publisher-loc>: <publisher-name>R Foundation for Statistical Computing</publisher-name> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ruta</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Gabrys</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Classifier Selection for Majority Voting</article-title>. <source>Inf. Fusion</source> <volume>6</volume>, <fpage>63</fpage>&#x2013;<lpage>81</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2004.04.008</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saari</surname>
<given-names>D. G.</given-names>
</name>
</person-group> (<year>2000a</year>). <article-title>Mathematical Structure of Voting Paradoxes</article-title>. <source>Econ. Theor.</source> <volume>15</volume>, <fpage>1</fpage>&#x2013;<lpage>53</lpage>. <pub-id pub-id-type="doi">10.1007/s001990050001</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saari</surname>
<given-names>D. G.</given-names>
</name>
</person-group> (<year>2000b</year>). <article-title>Mathematical Structure of Voting Paradoxes: II. Positional Voting</article-title>. <source>Econ. Theor.</source> <pub-id pub-id-type="doi">10.1007/s001990050002</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saini</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Bakshi</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>In-silico Approach for Drug Induced Liver Injury Prediction: Recent Advances</article-title>. <source>Toxicol. Lett.</source> <volume>295</volume>, <fpage>288</fpage>&#x2013;<lpage>295</lpage>. <pub-id pub-id-type="doi">10.1016/j.toxlet.2018.06.1216</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shin</surname>
<given-names>H. K.</given-names>
</name>
<name>
<surname>Kang</surname>
<given-names>M.-G.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Yoon</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Development of Prediction Models for Drug-Induced Cholestasis, Cirrhosis, Hepatitis, and Steatosis Based on Drug and Drug Metabolite Structures</article-title>. <source>Front. Pharmacol.</source> <volume>11</volume>, <fpage>1</fpage>. <pub-id pub-id-type="doi">10.3389/fphar.2020.00067</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Subramanian</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Narayan</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Corsello</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Peck</surname>
<given-names>D. D.</given-names>
</name>
<name>
<surname>Natoli</surname>
<given-names>T. E.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>X.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles</article-title>. <source>Cell</source> <volume>171</volume>, <fpage>1437</fpage>&#x2013;<lpage>1452</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2017.10.049</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sumsion</surname>
<given-names>G. R.</given-names>
</name>
<name>
<surname>Bradshaw</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Beales</surname>
<given-names>J.&#x20;T.</given-names>
</name>
<name>
<surname>Ford</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Caryotakis</surname>
<given-names>G. R. G.</given-names>
</name>
<name>
<surname>Garrett</surname>
<given-names>D. J.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Diverse Approaches to Predicting Drug-Induced Liver Injury Using Gene-Expression Profiles</article-title>. <source>Biol. Direct</source> <volume>15</volume>. <pub-id pub-id-type="doi">10.1186/s13062-019-0257-6</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="book">
<person-group person-group-type="editor">
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A.</given-names>
</name>
</person-group> (Editors) (<year>2009</year>). &#x201c;<article-title>LDA (Linear Discriminant Analysis)</article-title>,&#x201d; in <source><italic>Encyclopedia of Biometrics</italic></source> (<publisher-loc>Boston, MA</publisher-loc>: <publisher-name>Springer US</publisher-name>), <fpage>899</fpage>. <pub-id pub-id-type="doi">10.1007/978-0-387-73003-5_349</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="book">
<collab>U.S. Food and Drug Administration</collab> (<year>2021</year>). <article-title>FDA Adverse Event Reporting System</article-title>. <publisher-loc>Silver Spring, MD</publisher-loc>. </citation>
</ref>
<ref id="B45">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Van Erp</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Vuurpijl</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Schomaker</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2002</year>). &#x201c;<article-title>An Overview and Comparison of Voting Methods for Pattern Recognition</article-title>,&#x201d; in <conf-name>Proceedings - International Workshop on Frontiers in Handwriting Recognition</conf-name>, <publisher-loc>Niagara-on-the-Lake, ON</publisher-loc>: <comment>IWFHR</comment>, <fpage>195</fpage>&#x2013;<lpage>200</lpage>. <pub-id pub-id-type="doi">10.1109/IWFHR.2002.1030908</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Pei</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Deep Learning for Drug-Induced Liver Injury</article-title>. <source>J.&#x20;Chem. Inf. Model.</source> <volume>55</volume>, <fpage>2085</fpage>&#x2013;<lpage>2093</lpage>. <pub-id pub-id-type="doi">10.1021/acs.jcim.5b00238</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>