<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "archivearticle.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="systematic-review">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neurosci.</journal-id>
<journal-title>Frontiers in Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-453X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnins.2022.906290</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Systematic Review</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Explainable AI: A review of applications to neuroimaging data</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Farahani</surname> <given-names>Farzad V.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/649757/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Fiok</surname> <given-names>Krzysztof</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1995014/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Lahijanian</surname> <given-names>Behshad</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2106367/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Karwowski</surname> <given-names>Waldemar</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/211136/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Douglas</surname> <given-names>Pamela K.</given-names></name>
<xref ref-type="aff" rid="aff5"><sup>5</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/54672/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Biostatistics, Johns Hopkins University</institution>, <addr-line>Baltimore, MD</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Industrial Engineering and Management Systems, University of Central Florida</institution>, <addr-line>Orlando, FL</addr-line>, <country>United States</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Industrial and Systems Engineering, University of Florida</institution>, <addr-line>Gainesville, FL</addr-line>, <country>United States</country></aff>
<aff id="aff4"><sup>4</sup><institution>H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology</institution>, <addr-line>Atlanta, GA</addr-line>, <country>United States</country></aff>
<aff id="aff5"><sup>5</sup><institution>School of Modeling, Simulation, and Training, University of Central Florida</institution>, <addr-line>Orlando, FL</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Takashi Hanakawa, Kyoto University, Japan</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Andreas Holzinger, Medical University Graz, Austria; Anna Saranti, University of Natural Resources and Life Sciences Vienna, Austria</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Farzad V. Farahani <email>ffaraha2&#x00040;jhu.edu</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Neural Technology, a section of the journal Frontiers in Neuroscience</p></fn></author-notes>
<pub-date pub-type="epub">
<day>01</day>
<month>12</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>16</volume>
<elocation-id>906290</elocation-id>
<history>
<date date-type="received">
<day>28</day>
<month>03</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>14</day>
<month>11</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Farahani, Fiok, Lahijanian, Karwowski and Douglas.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Farahani, Fiok, Lahijanian, Karwowski and Douglas</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Deep neural networks (DNNs) have transformed the field of computer vision and currently constitute some of the best models for representations learned <italic>via</italic> hierarchical processing in the human brain. In medical imaging, these models have shown human-level and even superior performance in the early diagnosis of a wide range of diseases. However, the goal is often not only to accurately predict group membership or a diagnosis but also to provide explanations that support the model decision in a context that a human can readily interpret. The limited transparency of these models has hindered the adoption of DNN algorithms across many domains. Numerous explainable artificial intelligence (XAI) techniques have been developed to peer inside the &#x0201C;black box&#x0201D; and make sense of DNN models, taking somewhat divergent approaches. Here, we suggest that these methods may be considered in light of the interpretation goal, including functional or mechanistic interpretations, developing archetypal class instances, or assessing the relevance of certain features or mappings on a trained model in a <italic>post-hoc</italic> capacity. We then focus on reviewing recent applications of <italic>post-hoc</italic> relevance techniques as applied to neuroimaging data. Moreover, this article suggests a method for comparing the reliability of XAI methods, especially in deep neural networks, along with their advantages and pitfalls.</p></abstract>
<kwd-group>
<kwd>explainable AI</kwd>
<kwd>interpretability</kwd>
<kwd>artificial intelligence (AI)</kwd>
<kwd>deep learning</kwd>
<kwd>neural networks</kwd>
<kwd>medical imaging</kwd>
<kwd>neuroimaging</kwd>
</kwd-group>
<counts>
<fig-count count="6"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="181"/>
<page-count count="26"/>
<word-count count="19493"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>Machine learning (ML) and deep learning (DL; also known as hierarchical learning or deep structured learning) models have revolutionized computational analysis (Bengio et al., <xref ref-type="bibr" rid="B13">2013</xref>; LeCun et al., <xref ref-type="bibr" rid="B91">2015</xref>; Schmidhuber, <xref ref-type="bibr" rid="B135">2015</xref>) across a variety of fields such as text parsing, facial reconstruction, recommender systems, and self-driving cars (Cheng et al., <xref ref-type="bibr" rid="B22">2016</xref>; Richardson et al., <xref ref-type="bibr" rid="B127">2017</xref>; Young et al., <xref ref-type="bibr" rid="B174">2018</xref>; Grigorescu et al., <xref ref-type="bibr" rid="B47">2020</xref>). These models are particularly successful when applied to images, achieving human-level performance on visual recognition tasks (Kriegeskorte, <xref ref-type="bibr" rid="B81">2015</xref>; LeCun et al., <xref ref-type="bibr" rid="B91">2015</xref>). Among the demanding domains confronting ML/DL researchers are healthcare and medicine (Litjens et al., <xref ref-type="bibr" rid="B96">2017</xref>; Miotto et al., <xref ref-type="bibr" rid="B106">2017</xref>; Shen et al., <xref ref-type="bibr" rid="B140">2017</xref>; Kermany et al., <xref ref-type="bibr" rid="B73">2018</xref>). In medical imaging and neuroimaging, in particular, deep learning has been used to make new discoveries in various domains. For example, conventional wisdom for radiologists was that little or no prognostic information is contained within a tumor, and therefore one should examine its borders. However, a recent deep learning approach coupled to texture analysis found predictive information within the tumor itself (Alex et al., <xref ref-type="bibr" rid="B2">2017</xref>). As another example, Esteva et al. 
(<xref ref-type="bibr" rid="B39">2017</xref>) demonstrated that a single convolutional neural network could classify skin cancer with high predictive performance, on par with that of dermatologists. Numerous other studies have been conducted across intelligent medical imaging, from diabetic retinopathy (Ting et al., <xref ref-type="bibr" rid="B156">2017</xref>) to lung cancer (Farahani et al., <xref ref-type="bibr" rid="B40">2018</xref>) and Alzheimer&#x00027;s disease (Tang et al., <xref ref-type="bibr" rid="B154">2019</xref>), all of which demonstrate good predictive performance.</p>
<p>However, in practice, data artifacts might compromise the high performance of ML/DL models, making it difficult to find a suitable problem representation (Leek et al., <xref ref-type="bibr" rid="B93">2010</xref>). Ideally, though, these algorithms could be leveraged for both prediction and explanation, where the latter may drive human discovery of improved ways to solve problems (Silver et al., <xref ref-type="bibr" rid="B143">2016</xref>; H&#x000F6;lldobler et al., <xref ref-type="bibr" rid="B58">2017</xref>). Thus, strategies for comprehending and explaining what the model has learned are crucial for delivering a robust validation scheme (Do&#x00161;ilovi&#x00107; et al., <xref ref-type="bibr" rid="B31">2018</xref>; Lipton, <xref ref-type="bibr" rid="B95">2018</xref>; Montavon et al., <xref ref-type="bibr" rid="B112">2018</xref>), particularly in medicine (Caruana et al., <xref ref-type="bibr" rid="B20">2015</xref>) and neuroscience (Sturm et al., <xref ref-type="bibr" rid="B151">2016</xref>), where models must be built on the correct features. For example, brain tumor resection requires an interpretation in a feature space that humans can readily understand, such as images or text, to leverage that information in an actionable capacity (Mirchi et al., <xref ref-type="bibr" rid="B107">2020</xref>; Pfeifer et al., <xref ref-type="bibr" rid="B122">2021</xref>).</p>
<p>Conventionally, to compare a new ML/DL technique with the existing gold standard in medicine (i.e., the human expert in most applications), the sensitivity, specificity, and predictive values are first calculated for each modality. Confusion matrices can then be constructed for both the new technique and the clinician (e.g., radiologist) and ultimately compared with each other. However, a significant weakness of this comparison is that it ignores the similarity between the support features of the ML/DL model (e.g., voxels, pixels, edges) and the features examined by the radiologist (e.g., hand-drawn or eye-tracking features). Accordingly, it is impossible to determine whether the model has learned from the embedded signals or from artifacts or noise that covary with the target (Goodfellow I. J. et al., <xref ref-type="bibr" rid="B46">2014</xref>; Montavon et al., <xref ref-type="bibr" rid="B111">2017</xref>; Douglas and Farahani, <xref ref-type="bibr" rid="B33">2020</xref>). In other words, the presence of adversarial noise, which can be introduced simply by instrumentation, prevents a robust explanation of model decisions.</p>
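For concreteness, the conventional comparison described above can be sketched in a few lines of Python; the counts below are hypothetical, purely for illustration, and do not come from any study reviewed here:

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic metrics from the four cells of a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    return sensitivity, specificity, ppv, npv

# Hypothetical counts for a model and a radiologist rating the same test cases
model_metrics = diagnostic_metrics(tp=90, fn=10, fp=20, tn=80)
radiologist_metrics = diagnostic_metrics(tp=85, fn=15, fp=10, tn=90)
```

Comparing the two tuples reproduces the conventional evaluation; as the paragraph notes, such a comparison says nothing about whether the model and the clinician relied on the same supporting features.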
<p>It is widely accepted that the different architectures of DL methods, e.g., recurrent neural network (RNN), long short-term memory (LSTM), deep belief network (DBN), convolutional neural network (CNN), and generative adversarial network (GAN), which are well-known for their high predictive performance, are effectively considered to be black boxes, with internal inference engines that users cannot interpret (Guidotti et al., <xref ref-type="bibr" rid="B49">2018b</xref>). Therefore, the limited transparency and explainability of such non-linear methods have prevented their adoption throughout the sciences; as a result, simpler models with higher interpretability (e.g., shallow decision trees, linear regression, or non-negative matrix factorization) remain more popular than complex models in many applications, including bioinformatics and neuroscience, even though these choices often reduce predictive performance (Ma et al., <xref ref-type="bibr" rid="B100">2007</xref>; Devarajan, <xref ref-type="bibr" rid="B29">2008</xref>; Allen et al., <xref ref-type="bibr" rid="B3">2012</xref>; Haufe et al., <xref ref-type="bibr" rid="B54">2014</xref>; Bologna and Hayashi, <xref ref-type="bibr" rid="B18">2017</xref>). Some researchers have traditionally held that there is a trade-off between predictive performance and explainability for commonly used ML/DL models (Gunning and Aha, <xref ref-type="bibr" rid="B50">2019</xref>). In this view, decision trees exhibit the highest explainability but are the least likely to deliver accurate results, whereas DL methods offer the best predictive performance but the worst explainability. However, it is essential to underline that this trade-off is not a proven linear relationship; it can be bent for specific models/methods and sophisticated setups (Yeom et al., <xref ref-type="bibr" rid="B172">2021</xref>), increasing both predictive performance and explainability.</p>
<p>Recently, this notion has been strongly challenged by novel explainable AI studies, in which well-designed interpretation techniques have shed light on many deep non-linear machine learning models (Simonyan et al., <xref ref-type="bibr" rid="B144">2013</xref>; Zeiler and Fergus, <xref ref-type="bibr" rid="B175">2014</xref>; Bach et al., <xref ref-type="bibr" rid="B9">2015</xref>; Nguyen et al., <xref ref-type="bibr" rid="B115">2016</xref>; Ribeiro et al., <xref ref-type="bibr" rid="B126">2016</xref>; Selvaraju et al., <xref ref-type="bibr" rid="B138">2017</xref>; Montavon et al., <xref ref-type="bibr" rid="B112">2018</xref>; Hall and Gill, <xref ref-type="bibr" rid="B53">2019</xref>; Holzinger et al., <xref ref-type="bibr" rid="B65">2019</xref>). Remarkably, in healthcare and medicine, there is a growing demand for AI approaches that both perform well and guarantee transparency and interpretability to medical experts (Douglas et al., <xref ref-type="bibr" rid="B34">2011</xref>; Holzinger et al., <xref ref-type="bibr" rid="B65">2019</xref>). Additionally, researchers suggest that keeping humans in the loop&#x02014;incorporating expert knowledge when interpreting ML/DL results&#x02014;builds user trust and helps identify points of model failure (Holzinger, <xref ref-type="bibr" rid="B60">2016</xref>; Magister et al., <xref ref-type="bibr" rid="B101">2021</xref>). In recognition of the importance of transparency in models built for medical imaging data, dedicated datasets and XAI exploration environments have recently been proposed (Holzinger et al., <xref ref-type="bibr" rid="B67">2021</xref>). 
Given the nascent state of the neuroimaging field and its extensive use in deep learning studies, techniques such as magnetic resonance imaging (MRI), functional MRI (fMRI), computed tomography (CT), and ultrasound have considerably piqued the interest of XAI researchers (Zhu et al., <xref ref-type="bibr" rid="B180">2019</xref>; van der Velden et al., <xref ref-type="bibr" rid="B161">2022</xref>).</p>
<p>The present work provides a systematic review of recent neuroimaging studies that have introduced, discussed, or applied <italic>post-hoc</italic> explainable AI methods. <italic>Post-hoc</italic> methods take a fitted, trained model and extract information about the relationships between the model's inputs and its decisions (with no effect on model performance). In contrast, model-based approaches alter the model to allow for mechanistic (functional) or archetypal explanations. In this work, we focus on <italic>post-hoc</italic> methods because of their importance for practitioners and researchers who work with deep neural networks and neuroimaging techniques. Hence, standard data analysis methods can be utilized to evaluate the extracted information and provide tangible outcomes to the end users. The remaining sections are organized as follows. Section Background: Approaches for interpreting DNNs summarizes the existing techniques for interpreting deep neural networks, categorized into saliency (e.g., gradients, signal, and decomposition) and perturbation methods. Section Methodology describes our search strategy for identifying relevant publications, the inclusion criteria, and the assessment of risks to validity. Section Results presents the results of the literature search, study characteristics, a reliability analysis of XAI methods, and a quality assessment of the included studies. Finally, section Discussion discusses the major limitations of XAI-based techniques in medical domains and highlights several challenging issues and future perspectives in this emerging field of research.</p>
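To make the post-hoc idea concrete, the sketch below applies permutation feature importance, one simple post-hoc relevance technique, to an already-trained scoring function. The function names and toy setup are our own illustration, not a method from the studies reviewed here:

```python
import numpy as np

def permutation_importance(score_fn, X, y, n_repeats=10, rng=None):
    """Post-hoc relevance: measure the drop in performance when each feature
    is shuffled. score_fn(X, y) returns the accuracy of an already-trained
    model; the model itself is never modified, which is what makes the
    procedure post-hoc."""
    rng = np.random.default_rng(rng)
    baseline = score_fn(X, y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # destroy feature j's relationship to y
            drops.append(baseline - score_fn(Xp, y))
        importances[j] = np.mean(drops)
    return importances
```

A feature whose shuffling leaves performance unchanged receives zero importance; a feature the model depends on produces a large drop. The same input-output logic underlies the perturbation methods reviewed in the next section.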
</sec>
<sec id="s2">
<title>Background: Approaches for interpreting DNNs</title>
<p>A variety of explainable AI (XAI) techniques, taking quite different approaches, have been developed in recent years. For example, some XAI methods are model agnostic, and some take a local rather than a global approach. Some render heatmaps based on &#x0201C;digital staining&#x0201D; or on combining weights from feature maps in the last hidden layer (Cruz-Roa et al., <xref ref-type="bibr" rid="B27">2013</xref>; Xu et al., <xref ref-type="bibr" rid="B169">2017</xref>; H&#x000E4;gele et al., <xref ref-type="bibr" rid="B52">2020</xref>). Here we suggest that these methods should be distinguished based on the goal of the explanation: functional, archetypal, or <italic>post-hoc</italic> (relevance) approximation (<xref ref-type="fig" rid="F1">Figure 1A</xref>).</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>(A)</bold> Explainable AI methods taxonomy. <bold>(B)</bold> Functional approaches attempt to disclose the algorithm&#x00027;s mechanistic aspects. <bold>(C)</bold> Archetypal approaches, like generative methods, seek to uncover input patterns that yield the best model response. <bold>(D)</bold> <italic>Post-hoc</italic> perturbation relevance approaches generally change the inputs or the model&#x00027;s components and then attribute relevance in proportion to the change in model output. <bold>(E)</bold> <italic>Post-hoc</italic> decomposition relevance approaches are propagation-based techniques that explain an algorithm&#x00027;s decisions by redistributing the function value (i.e., the neural network&#x00027;s output) to the input variables, often in a layer-by-layer fashion.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-16-906290-g0001.tif"/>
</fig>
<p><italic>Functional</italic> approaches (<xref ref-type="fig" rid="F1">Figure 1B</xref>) examine the learned representations in the graph to reveal mechanistic aspects of the algorithm (Khaligh-Razavi and Kriegeskorte, <xref ref-type="bibr" rid="B75">2014</xref>; Kriegeskorte and Douglas, <xref ref-type="bibr" rid="B82">2018</xref>). One goal of these approaches is to shed light on how the feature maps or filters learned through the layers help the model achieve its goal, or support its global decision structure. <italic>Archetypal</italic> methods (<xref ref-type="fig" rid="F1">Figure 1C</xref>) attempt to find an input pattern <italic>x</italic> that is a prototypical exemplar of <italic>y</italic>. A simple example is activation maximization, whereby an initial input is randomized and the algorithm searches for input patterns that produce a maximal response from the model (Erhan et al., <xref ref-type="bibr" rid="B37">2009</xref>). A variety of generative methods have been developed for archetypal purposes, such as generative adversarial network models (Goodfellow I. et al., <xref ref-type="bibr" rid="B45">2014</xref>). <italic>Post-hoc</italic> (or <italic>relevance</italic>) methods (<xref ref-type="fig" rid="F1">Figures 1D,E</xref>) attempt to determine which aspects of input <italic>x</italic> make it likely to take on a group membership or provide supporting evidence for a particular class. In general, relevance methods fall into three classes: feature ranking, perturbation methods, and decomposition methods. Feature ranking and feature selection methods have existed for many years (e.g., Guyon and Elisseeff, <xref ref-type="bibr" rid="B51">2003</xref>), though their importance or lack thereof in high-dimensional medical imaging datasets has been debated (Chu et al., <xref ref-type="bibr" rid="B24">2012</xref>; Kerr et al., <xref ref-type="bibr" rid="B74">2014</xref>). 
<italic>Perturbation</italic> relevance methods (<xref ref-type="fig" rid="F1">Figure 1D</xref>) provide a local estimate of the importance of an image region or feature. This class of techniques involves altering some aspect of the inputs or the model and then assigning relevance in proportion to the magnitude of the resulting change in the model output. A classic example is pixel flipping (Bach et al., <xref ref-type="bibr" rid="B9">2015</xref>; Samek et al., <xref ref-type="bibr" rid="B131">2017a</xref>), whereby small local regions of the image are altered, and the ensuing changes to the output are mapped back as relevance scores for the altered pixels. Alternatively, perturbation methods may alter the weights and observe how this affects the output <italic>f(x)</italic>. Perturbation methods can be model agnostic or not, depending on their implementation. Lastly, <italic>decomposition</italic> or <italic>redistribution</italic> methods (<xref ref-type="fig" rid="F1">Figure 1E</xref>) for relevance assignment attempt to determine the share of relevance through the model layers by examining the model structure and are thus typically model dependent. These attribution methods involve message passing, propagating relevance backward through the model and viewing prediction as the output of a computational graph.</p>
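A minimal sketch of the perturbation idea (occlusion, in the spirit of pixel flipping) on a 2D input, assuming only a black-box <italic>predict</italic> function that returns a scalar class score; this is our own illustration under those assumptions, not an implementation of any specific published method:

```python
import numpy as np

def occlusion_relevance(predict, image, patch=4, baseline=0.0):
    """Perturbation relevance: occlude each local patch with a baseline value
    and record how much the model's class score drops. Model agnostic, since
    only the black-box predict(image) -> float function is queried."""
    h, w = image.shape
    base_score = predict(image)
    relevance = np.zeros_like(image, dtype=float)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            perturbed = image.copy()
            perturbed[i:i + patch, j:j + patch] = baseline  # occlude one patch
            # Relevance of the patch = drop in score caused by removing it
            relevance[i:i + patch, j:j + patch] = base_score - predict(perturbed)
    return relevance
```

Regions whose occlusion barely changes the score receive near-zero relevance; regions the model depends on produce a large drop and hence a high relevance value, which is what a perturbation heatmap displays.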
<sec>
<title>History of <italic>post-hoc</italic> explanation techniques</title>
<p>The earliest work in XAI can be traced to 1960&#x02013;1980, when some expert systems were equipped with rules that could interpret their results (McCarthy, <xref ref-type="bibr" rid="B103">1960</xref>; Shortliffe and Buchanan, <xref ref-type="bibr" rid="B141">1975</xref>; Scott et al., <xref ref-type="bibr" rid="B136">1977</xref>). Although the logical inference of such systems was easily readable by humans, many of these systems were never used in practice because of poor predictive performance, lack of generalizability, and the high cost of maintaining their knowledge bases (Holzinger et al., <xref ref-type="bibr" rid="B65">2019</xref>). The emergence of ML techniques, especially those based on deep neural networks, has overcome many traditional limitations, although their interpretability to users remains their primary challenge (Lake et al., <xref ref-type="bibr" rid="B85">2017</xref>).</p>
<p>Accordingly, in recent years, AI researchers have devoted considerable attention to peering inside the black box of DNNs and enhancing their transparency (Baehrens et al., <xref ref-type="bibr" rid="B10">2009</xref>; Anderson et al., <xref ref-type="bibr" rid="B6">2012</xref>; Haufe et al., <xref ref-type="bibr" rid="B54">2014</xref>; Simonyan and Zisserman, <xref ref-type="bibr" rid="B145">2014</xref>; Springenberg et al., <xref ref-type="bibr" rid="B148">2014</xref>; Zeiler and Fergus, <xref ref-type="bibr" rid="B175">2014</xref>; Bach et al., <xref ref-type="bibr" rid="B9">2015</xref>; Yosinski et al., <xref ref-type="bibr" rid="B173">2015</xref>; Nguyen et al., <xref ref-type="bibr" rid="B115">2016</xref>; Kindermans et al., <xref ref-type="bibr" rid="B79">2017</xref>; Montavon et al., <xref ref-type="bibr" rid="B111">2017</xref>; Smilkov et al., <xref ref-type="bibr" rid="B147">2017</xref>; Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref>; Zintgraf et al., <xref ref-type="bibr" rid="B181">2017</xref>). In the following, we review the latest methods used to interpret deep learning models, including perturbation and decomposition (or redistribution) approaches. While unified in purpose, i.e., revealing the relationship between the inputs and outputs (or higher levels) of the underlying model, these methods diverge widely in outcome and explanation mechanism. The <italic>post-hoc</italic> XAI methods we found in our review are listed in <xref ref-type="table" rid="T1">Table 1</xref> and discussed in the following subsections.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p><italic>Post-hoc</italic> methods for interpreting deep neural networks.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left" colspan="2"><bold>Taxonomy</bold></th>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="left"><bold>References</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Perturbation</td>
<td valign="top" align="left">Gradients (sensitivity)</td>
<td valign="top" align="left">N/A (gradient-based)</td>
<td valign="top" align="left">Baehrens et al., <xref ref-type="bibr" rid="B10">2009</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Saliency maps</td>
<td valign="top" align="left">Simonyan et al., <xref ref-type="bibr" rid="B144">2013</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Class activation mapping (CAM)</td>
<td valign="top" align="left">Zhou et al., <xref ref-type="bibr" rid="B179">2016</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Gradient-weighted CAM (Grad-CAM)</td>
<td valign="top" align="left">Selvaraju et al., <xref ref-type="bibr" rid="B138">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Guided Grad-CAM</td>
<td valign="top" align="left">Selvaraju et al., <xref ref-type="bibr" rid="B139">2016</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">3D CAM</td>
<td valign="top" align="left">Yang et al., <xref ref-type="bibr" rid="B170">2018</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">3D Grad-CAM</td>
<td valign="top" align="left">Yang et al., <xref ref-type="bibr" rid="B170">2018</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Respond-CAM</td>
<td valign="top" align="left">Zhao et al., <xref ref-type="bibr" rid="B178">2018</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Multiscale CAM</td>
<td valign="top" align="left">Hu et al., <xref ref-type="bibr" rid="B70">2020</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">SmoothGrad (SG)</td>
<td valign="top" align="left">Smilkov et al., <xref ref-type="bibr" rid="B147">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Correlation maps</td>
<td valign="top" align="left">Schirrmeister et al., <xref ref-type="bibr" rid="B134">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Testing with concept activation vectors (TCAV)</td>
<td valign="top" align="left">Kim et al., <xref ref-type="bibr" rid="B77">2018</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Automated concept-based explanation (ACE)</td>
<td valign="top" align="left">Ghorbani et al., <xref ref-type="bibr" rid="B43">2019a</xref>,<xref ref-type="bibr" rid="B44">b</xref></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Signal</td>
<td valign="top" align="left">Guided backpropagation (GBP)</td>
<td valign="top" align="left">Springenberg et al., <xref ref-type="bibr" rid="B148">2014</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">DeConvNet (occlusion maps)</td>
<td valign="top" align="left">Zeiler and Fergus, <xref ref-type="bibr" rid="B175">2014</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Inversion-based</td>
<td valign="top" align="left">Mahendran and Vedaldi, <xref ref-type="bibr" rid="B102">2015</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Inversion-based</td>
<td valign="top" align="left">Dosovitskiy and Brox, <xref ref-type="bibr" rid="B32">2016</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">PatternNet</td>
<td valign="top" align="left">Kindermans et al., <xref ref-type="bibr" rid="B79">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">PatternAttribution</td>
<td valign="top" align="left">Kindermans et al., <xref ref-type="bibr" rid="B79">2017</xref></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Model agnostic</td>
<td valign="top" align="left">Local interpretable model-agnostic explanations (LIME)</td>
<td valign="top" align="left">Ribeiro et al., <xref ref-type="bibr" rid="B126">2016</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Submodular pick LIME (SP-LIME)</td>
<td valign="top" align="left">Ribeiro et al., <xref ref-type="bibr" rid="B126">2016</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">anchor-LIME (aLIME)</td>
<td valign="top" align="left">Tulio Ribeiro et al., <xref ref-type="bibr" rid="B159">2016</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Model agnostic globally interpretable explanations</td>
<td valign="top" align="left">Puri et al., <xref ref-type="bibr" rid="B123">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">SHapley additive exPlanations (SHAP)</td>
<td valign="top" align="left">Lundberg and Lee, <xref ref-type="bibr" rid="B98">2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Decomposition (redistribution)</td>
<td/>
<td valign="top" align="left">Layer-wise relevance propagation (LRP)</td>
<td valign="top" align="left">Bach et al., <xref ref-type="bibr" rid="B9">2015</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Deep Taylor decomposition</td>
<td valign="top" align="left">Montavon et al., <xref ref-type="bibr" rid="B111">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Deep learning important FeaTures (DeepLIFT)</td>
<td valign="top" align="left">Shrikumar et al., <xref ref-type="bibr" rid="B142">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Integrated gradients (IG)</td>
<td valign="top" align="left">Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Gradient &#x000D7; input</td>
<td valign="top" align="left">Shrikumar et al., <xref ref-type="bibr" rid="B142">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Prediction difference analysis (PDA)</td>
<td valign="top" align="left">Zintgraf et al., <xref ref-type="bibr" rid="B181">2017</xref></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Graph LRP</td>
<td valign="top" align="left">Chereda et al., <xref ref-type="bibr" rid="B23">2021</xref></td>
</tr>
</tbody>
</table>
</table-wrap>
<sec>
<title>Perturbation approach</title>
<p>The perturbation-based approach is broadly divided into model-specific (e.g., gradient- and signal-based) and model-agnostic methods. Gradient/sensitivity-based methods examine how a slight shift in the input affects the classification score for the output of interest; examples include the techniques introduced by Baehrens et al. (<xref ref-type="bibr" rid="B10">2009</xref>) and Simonyan et al. (<xref ref-type="bibr" rid="B144">2013</xref>), as well as Class Activation Mapping (CAM; Zhou et al., <xref ref-type="bibr" rid="B179">2016</xref>), Gradient-weighted CAM (Grad-CAM; Selvaraju et al., <xref ref-type="bibr" rid="B138">2017</xref>), SmoothGrad (SG; Smilkov et al., <xref ref-type="bibr" rid="B147">2017</xref>), and Multiscale CAM (Hu et al., <xref ref-type="bibr" rid="B70">2020</xref>). These techniques are easily implemented in DNNs because the gradient is generally computed by backpropagation (Rumelhart et al., <xref ref-type="bibr" rid="B130">1986</xref>; Swartout et al., <xref ref-type="bibr" rid="B153">1991</xref>). Signal methods typically visualize input patterns that stimulate neuron activation in higher layers, resulting in so-called feature maps. DeConvNet (Zeiler and Fergus, <xref ref-type="bibr" rid="B175">2014</xref>), Guided BackProp (Springenberg et al., <xref ref-type="bibr" rid="B148">2014</xref>), PatternNet (Kindermans et al., <xref ref-type="bibr" rid="B79">2017</xref>), and inversion-based techniques (Mahendran and Vedaldi, <xref ref-type="bibr" rid="B102">2015</xref>; Dosovitskiy and Brox, <xref ref-type="bibr" rid="B32">2016</xref>) are some examples of this group. Mahendran and Vedaldi (<xref ref-type="bibr" rid="B102">2015</xref>) showed that, moving from the shallower to the deeper layers, the feature maps reveal increasingly complex patterns of the input [e.g., in explaining a human face: from (1) lines and edges, to (2) eyes, nose, and ears, then to (3) complex facial structures].</p>
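The gradient/sensitivity idea above can be illustrated with a minimal numpy sketch. The toy one-hidden-layer ReLU network and all its dimensions here are hypothetical, chosen only to show how a saliency map is the gradient of a class score with respect to the input, obtained by one backward pass:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer ReLU network: scores = W2 @ relu(W1 @ x)
W1 = rng.standard_normal((8, 16))   # hidden x input
W2 = rng.standard_normal((3, 8))    # classes x hidden

def saliency(x, target):
    """Gradient of the target class score w.r.t. the input,
    the quantity visualized by sensitivity/saliency methods."""
    h_pre = W1 @ x                        # forward pass (pre-activation)
    dh = W2[target] * (h_pre > 0)         # backward pass through ReLU mask
    return dh @ W1                        # same shape as the input, (16,)

x = rng.standard_normal(16)
s = saliency(x, target=0)

# Finite-difference check that the backward pass is correct
eps, i = 1e-6, 4
x2 = x.copy(); x2[i] += eps
score = lambda v: (W2 @ np.maximum(W1 @ v, 0.0))[0]
numeric = (score(x2) - score(x)) / eps
assert abs(numeric - s[i]) < 1e-4
```

For a real CNN the same gradient is obtained automatically by the framework's backpropagation, as the text notes.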
<p>On the other hand, model agnostic methods explore the prediction of interest to infer the relevance of the input features toward the output (Ribeiro et al., <xref ref-type="bibr" rid="B126">2016</xref>; Alvarez-Melis and Jaakkola, <xref ref-type="bibr" rid="B4">2018</xref>). Two of the most popular techniques in this category are Local Interpretable Model-agnostic Explanations (LIME; Ribeiro et al., <xref ref-type="bibr" rid="B126">2016</xref>) and SHapley Additive exPlanations (SHAP; Lundberg and Lee, <xref ref-type="bibr" rid="B98">2017</xref>).</p>
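The local-surrogate idea behind LIME can be sketched as follows. The black-box function, sampling width, and kernel here are all hypothetical simplifications (a faithful implementation would use interpretable components and the original sampling scheme), but the core loop is the same: perturb the instance, query the black box, and fit a proximity-weighted linear model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical black-box classifier returning a probability
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - X[:, 1] ** 2)))

def lime_weights(x, n_samples=2000, width=0.5):
    """LIME-style local surrogate: sample around x, weight samples
    by proximity, fit a weighted linear model to the black box."""
    Z = x + rng.normal(scale=width, size=(n_samples, x.size))
    y = black_box(Z)
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / width ** 2)  # proximity kernel
    A = np.hstack([Z, np.ones((n_samples, 1))])             # add intercept
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1]   # local feature attributions (drop intercept)

x = np.array([0.5, 1.0])
phi = lime_weights(x)
# Locally, feature 0 pushes the score up and feature 1 (via -x1^2) down
assert phi[0] > 0 and phi[1] < 0
```

SHAP unifies this kind of additive attribution with Shapley values from cooperative game theory, as discussed in Table 2.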
</sec>
<sec>
<title>Decomposition approach</title>
<p>Decomposition-based methods seek to identify important features (pixels) in a particular input by decomposing the network classification decision into contributions of the input elements. The earliest study in this class goes back to Bach et al. (<xref ref-type="bibr" rid="B9">2015</xref>), who introduced the Layer-Wise Relevance Propagation (LRP) technique, which interprets DNN decisions using heatmaps (or relevance maps). Using a set of propagation rules, LRP performs a separate backward pass for each possible target class, satisfying a layer-wise conservation principle (Landecker et al., <xref ref-type="bibr" rid="B87">2013</xref>; Bach et al., <xref ref-type="bibr" rid="B9">2015</xref>). As a result, each intermediate layer up to the input layer is assigned relevance scores. The sum of the scores in each layer equals the prediction output for the class under consideration. This conservation principle is one of the significant differences between decomposition-based and gradient-based methods.</p>
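The conservation principle can be verified directly on a toy example. This is a minimal sketch assuming a hypothetical bias-free network with positive weights and inputs, so a simple epsilon-rule redistribution conserves relevance exactly (real LRP implementations use dedicated rules for biases and mixed-sign contributions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy bias-free network with positive weights: conservation is exact
W1 = np.abs(rng.standard_normal((6, 10)))
W2 = np.abs(rng.standard_normal((3, 6)))

def lrp(x, target, eps=1e-9):
    """Epsilon-rule LRP: redistribute the target class score backwards,
    conserving total relevance at every layer."""
    a1 = np.maximum(W1 @ x, 0.0)
    scores = W2 @ a1
    # Output -> hidden: share the score by each unit's contribution
    z2 = W2 * a1
    R1 = z2[target] / (z2[target].sum() + eps) * scores[target]
    # Hidden -> input: redistribute each hidden relevance
    z1 = W1 * x                                   # per-input contributions
    denom = z1.sum(axis=1, keepdims=True) + eps
    R0 = ((z1 / denom) * R1[:, None]).sum(axis=0)
    return scores[target], R0

x = np.abs(rng.standard_normal(10))
score, R0 = lrp(x, target=1)
# Conservation: input relevances sum to the explained class score
assert abs(R0.sum() - score) < 1e-6
```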
<p>In another technique, Montavon et al. (<xref ref-type="bibr" rid="B111">2017</xref>) demonstrated how the propagation rules derived from deep Taylor decomposition relate to those heuristically defined by Bach et al. (<xref ref-type="bibr" rid="B9">2015</xref>). Recently, several studies have used LRP to interpret and visualize their network decisions in various applications such as text analysis (Arras et al., <xref ref-type="bibr" rid="B7">2017a</xref>), speech recognition (Becker et al., <xref ref-type="bibr" rid="B12">2018</xref>), action recognition (Srinivasan et al., <xref ref-type="bibr" rid="B149">2017</xref>), and neuroimaging (Thomas et al., <xref ref-type="bibr" rid="B155">2019</xref>). Other recently developed decomposition-based methods include DeepLIFT (Shrikumar et al., <xref ref-type="bibr" rid="B142">2017</xref>), Integrated Gradients (Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref>), Gradient &#x000D7; Input (Shrikumar et al., <xref ref-type="bibr" rid="B142">2017</xref>), and Prediction Difference Analysis (PDA; Zintgraf et al., <xref ref-type="bibr" rid="B181">2017</xref>). In recent years, various studies have attempted to test the reliability of explanation techniques against each other by introducing several properties such as fidelity (or sensitivity), consistency, stability, and completeness (Bach et al., <xref ref-type="bibr" rid="B9">2015</xref>; Kindermans et al., <xref ref-type="bibr" rid="B79">2017</xref>; Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref>; Alvarez-Melis and Jaakkola, <xref ref-type="bibr" rid="B4">2018</xref>). In the following sections, we address this issue.</p>
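The completeness property mentioned above can be checked directly for Integrated Gradients on a toy analytic model. This is a hedged sketch (a hypothetical quadratic function with a hand-written gradient, not the authors' implementation): attributions integrate the gradient along a straight-line path from a baseline to the input, and their sum equals the change in model output:

```python
import numpy as np

# Toy differentiable model and its analytic gradient
def f(x):
    return np.sum(x ** 2) + 3.0 * x[0]

def grad_f(x):
    return 2.0 * x + np.array([3.0, 0.0, 0.0])

def integrated_gradients(x, baseline, steps=200):
    """Riemann-sum (midpoint) approximation of Integrated Gradients
    along the straight-line path from baseline to x."""
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

x = np.array([1.0, -2.0, 0.5])
baseline = np.zeros(3)
ig = integrated_gradients(x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline)
assert abs(ig.sum() - (f(x) - f(baseline))) < 1e-6
```

In a DNN, `grad_f` would be supplied by the framework's autodiff rather than written by hand.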
</sec>
</sec>
</sec>
<sec sec-type="methods" id="s3">
<title>Methodology</title>
<p>This systematic review was conducted according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement guidelines (Moher et al., <xref ref-type="bibr" rid="B108">2009</xref>). To reduce the effect of research expectations on the review, we first identified research questions and search strategies. Moreover, this systematic review adhered to the Cochrane Collaboration methodology (Higgins et al., <xref ref-type="bibr" rid="B56">2011</xref>) to mitigate the risk of bias and error.</p>
<p>Based on the objectives outlined in the abstract, the following research questions were derived and form the cornerstone of our study:</p>
<list list-type="bullet">
<list-item><p>What are the main challenges in AI that have limited its implementation in medical imaging applications, particularly in neuroimaging, despite its high prediction performance?</p></list-item>
<list-item><p>How can we overcome the black-box property of complex and deep neural networks for the user in critical areas such as healthcare and medicine?</p></list-item>
<list-item><p>How have recent advances in explainable AI affected machine/deep learning in medical imaging and neuroimaging?</p></list-item>
<list-item><p>How can one assess the reliability and generalizability of interpretation techniques?</p></list-item>
</list>
<sec>
<title>Search strategy</title>
<p>The current and seminal studies in the realm of XAI with a focus on healthcare and medicine were considered critical sources for this systematic review. A bibliographic search for this work was carried out across the following scientific databases and search engines: PubMed, Scopus, Web of Science, Google Scholar, ScienceDirect, IEEE Xplore, SpringerLink, and arXiv, using the following keyword combinations in the title, keywords, or abstract: (&#x0201C;explainable AI&#x0201D; or &#x0201C;XAI&#x0201D; or &#x0201C;explainability&#x0201D; or &#x0201C;interpretability&#x0201D;) and (&#x0201C;artificial intelligence&#x0201D; or &#x0201C;machine learning&#x0201D; or &#x0201C;deep learning&#x0201D; or &#x0201C;deep neural networks&#x0201D;) and (&#x0201C;medical imaging&#x0201D; or &#x0201C;neuroimaging&#x0201D; or &#x0201C;MRI&#x0201D; or &#x0201C;fMRI&#x0201D; or &#x0201C;CT&#x0201D;). Moreover, the reference lists of the retrieved studies were also screened to find relevant published works.</p>
</sec>
<sec>
<title>Inclusion criteria</title>
<p>Published original articles with the following features were included in the current study: (a) be written in English; AND [(b) introduce, identify, or describe XAI-based techniques for visualizing and/or interpreting ML/DL decisions; OR (c) be related to the application of XAI in healthcare and medicine]. The exclusion criteria were: (a) book chapters; (b) papers that upon review were not related to the research questions; (c) opinions, viewpoints, anecdotes, letters, and editorials. The eligibility criteria were independently assessed by two authors (FF and KF), who screened the titles and abstracts to establish the relevant articles based on the selection criteria. Any discrepancies were resolved through discussion or referral to a third reviewer (BL or WK).</p>
</sec>
<sec>
<title>Data extraction</title>
<p>We developed a data extraction sheet, pilot-tested this sheet on randomly selected studies, and refined the sheet appropriately. During a full-text review process, one review author (KF) extracted the following data from the selected studies, and a second review author (FF) crosschecked the collected data, which included: taxonomic topic, first author (year of publication), key contributions, XAI model used, and sample size (if applicable). Disagreements were resolved by discussion between the two review authors, and if necessary, a third reviewer was invoked (BL or WK).</p>
</sec>
<sec>
<title>Additional analyses</title>
<p>We performed a co-occurrence analysis to analyze text relationships between the shared components of the reviewed studies, including XAI methods, imaging modalities, diseases, and frequently used ML/DL terms. Creating a co-occurrence network entails finding keywords in the text, computing the frequency of co-occurrences, and analyzing the networks to identify word clusters and locate central terms (Segev, <xref ref-type="bibr" rid="B137">2020</xref>). Furthermore, to provide a critical view of the extracted XAI techniques, we carried out an additional subjective examination of articles that have proposed quality tests for evaluating the reliability of these methods.</p>
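The co-occurrence counting step described above can be sketched with the standard library alone. The three-abstract corpus and keyword list below are purely hypothetical illustrations; the actual analysis followed Segev (2020):

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus of abstracts (illustration only)
abstracts = [
    "grad-cam explanations for mri classification with cnn",
    "lrp relevance maps for fmri deep learning models",
    "grad-cam and lrp compared on mri data",
]
keywords = {"grad-cam", "lrp", "mri", "fmri", "cnn"}

def cooccurrence(docs, vocab):
    """Count how often each pair of keywords appears in the same
    document; these counts are the edge weights of the network."""
    pairs = Counter()
    for doc in docs:
        present = sorted(vocab & set(doc.split()))
        pairs.update(combinations(present, 2))
    return pairs

net = cooccurrence(abstracts, keywords)
assert net[("grad-cam", "mri")] == 2   # co-occur in the 1st and 3rd abstracts
```

Clusters and central terms can then be read off the resulting weighted network with any graph library.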
</sec>
<sec>
<title>Quality assessment</title>
<p>The risk of bias in individual studies was ascertained independently by two reviewers (FF and KF) following the Cochrane Collaboration&#x00027;s tool (Higgins et al., <xref ref-type="bibr" rid="B56">2011</xref>). The Cochrane Collaboration&#x00027;s tool assesses random sequence generation, allocation concealment, blinding of participants, blinding of outcome assessment, incomplete outcome data, and selective outcome reporting, and ultimately rates the overall quality of the studies as weak, fair, or good. To appraise the quality of evidence across studies, we examined the included studies for lack of completeness (publication bias) and for missing data (selective reporting within studies). The risk of missing studies is highly dependent on the chosen keywords and the limitations of the search engines. To alleviate this risk, a set of highly cited articles was used to create the keyword search list in an iterative process. Disagreements were resolved by discussion between the study authors.</p>
</sec>
</sec>
<sec sec-type="results" id="s4">
<title>Results</title>
<sec>
<title>Literature search</title>
<p>Following the PRISMA guidelines (Moher et al., <xref ref-type="bibr" rid="B108">2009</xref>), a summary of the process used to identify, screen, and select studies for inclusion in this review is illustrated in <xref ref-type="fig" rid="F2">Figure 2</xref>. First, 357 papers were identified through the initial search; after the removal of duplicates, 263 unique articles remained. Only &#x0007E;5% were published before 2010, indicating the novelty of the terminology and the research area. Afterwards, the more relevant studies were identified from the remaining papers by applying inclusion and exclusion criteria. The inclusion criteria at this step required the research to (a) be written in English; AND [(b) introduce, identify, or describe <italic>post-hoc</italic> XAI techniques for visualizing and/or interpreting ML/DL decisions; OR (c) be related to the application of <italic>post-hoc</italic> XAI in neuroimaging]. The exclusion criteria included: (a) book chapters; (b) papers that upon review were not related to the research questions; (c) opinions, viewpoints, anecdotes, letters, and editorials. The implementation of these criteria yielded 126 eligible studies (&#x0007E;48% of the original articles). Subsequently, the full text of these 126 papers was scrutinized in detail to reaffirm the criteria described in the previous step. Eventually, 78 publications remained for systematic review.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>The flow diagram of the methodology and selection processes used in this systematic review follows the PRISMA statement (Moher et al., <xref ref-type="bibr" rid="B108">2009</xref>).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-16-906290-g0002.tif"/>
</fig>
</sec>
<sec>
<title>Study characteristics</title>
<p>The included studies&#x02014;published from 2005 to 2021&#x02014;were binned into three major taxonomies (<xref ref-type="fig" rid="F3">Figure 3A</xref>). The first category was focused on the researchers&#x00027; efforts to introduce <italic>post-hoc</italic> XAI techniques and theoretical concepts for the visualization and interpretation of deep neural network predictions (blue slice) and accounted for 37% of the selected articles. In the second category, articles discussing the neuroimaging applications of XAI were collected and reviewed (green slice), accounting for 42% of the selected papers. Finally, the last group consisted of perspective and review studies in the field (yellow slice), either methodological or medical, which accounted for 21% of the selected articles. <xref ref-type="fig" rid="F3">Figure 3B</xref>, in particular, illustrates the classification of XAI applications in neuroimaging in terms of the <italic>post-hoc</italic> method they used (along with their percentage). As mentioned in the literature, these methods can be divided into decomposition-based and perturbation-based approaches; the latter can be further classified into gradient-based, signal-based, and model-agnostic ones.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Study characteristics. <bold>(A)</bold> Categorization of included studies, <bold>(B)</bold> XAI in medical imaging, and <bold>(C)</bold> a bubble plot that shows mentioned studies by type of XAI method, imaging modality, sample size, and publication trend in recent years.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-16-906290-g0003.tif"/>
</fig>
<p>Moreover, <xref ref-type="fig" rid="F3">Figure 3C</xref> characterizes the applied XAI studies in neuroimaging along several dimensions: method, imaging modality, sample size, and publication trend in recent years. In this plot, each circle represents a study; its color indicates the imaging modality, such as EEG, fMRI, MRI, CT, and other (PET, ultrasound, histopathological scans, blood film, electron cryotomography, A&#x003B2; plaques, tissue microarrays, etc.), and its size is logarithmically related to the number of people/scans/images used in that study. By focusing on each feature, compelling general information can be extracted from this figure. For example, it can be noted that gradient-based methods cover most imaging modalities well, or that model-agnostic methods are suitable for studies with large sample sizes.</p>
<p>A summary of the reviewed articles is provided in <xref ref-type="table" rid="T2">Table 2</xref>, which contains the author&#x00027;s name, publication year, XAI model examined, and key contributions, ordered by taxonomy and article date. The table is constructed to provide the reader with a complete picture of the framework and nature of components contributing to XAI in medical applications.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Summary of included articles by taxonomy, authors (year), XAI model, and key contributions.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>References</bold></th>
<th valign="top" align="left"><bold>XAI model</bold></th>
<th valign="top" align="left"><bold>Key contributions</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="3"><bold>XAI: Technical and theoretical concepts</bold></td>
</tr>
<tr>
<td valign="top" align="left">Robnik-&#x00160;ikonja and Kononenko (<xref ref-type="bibr" rid="B128">2008</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">The paper proposes a method to explain predictions made by various ML models that output prediction probabilities separately for individual instances. The authors propose three types of explanations: the instance explanation, i.e., the explanation as to why a given instance is classified as it is, the model explanation, i.e., &#x0201C;averages of explanations over many training instances,&#x0201D; and the domain explanation, achievable &#x0201C;if the accuracy of the model is high.&#x0201D; The work also proposes a way to visualize the resulting explanations. However, the method is limited to a case in which only a single change in the variables is sufficient to change the prediction. The quality of the proposed explanations is also correlated with the quality of the model. An example of method performance is presented for a well-known Titanic dataset.</td>
</tr>
<tr>
<td valign="top" align="left">Erhan et al. (<xref ref-type="bibr" rid="B37">2009</xref>)</td>
<td valign="top" align="left">AM</td>
<td valign="top" align="left">The work elaborates on techniques to visualize higher-layer features learned by DL models. The authors recall that the visualization of early layers (first, second) is commonly possible, especially in AI vision, by visualizing the filters learned by the DL models. They also propose an optimization technique called activation maximization (AM). Two other previously known methods, i.e., a sampling technique and a method that creates a linear combination of selected filters from previous layers, are also analyzed. The methods are tested on the MNIST dataset and a large collection of tiny 12 &#x000D7; 12 natural images. The results show that AM allows the visualization of features learned by deeper layers of the DL architecture and that these layers, in fact, learn more complex patterns than earlier layers of the DL model.</td>
</tr>
<tr>
<td valign="top" align="left">Gaonkar and Davatzikos (<xref ref-type="bibr" rid="B42">2013</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">The paper proposes an &#x0201C;analytical short-cut&#x0201D; for the computation of explanations of SVM predictions. The method is demonstrated by analyzing fMRI data. In opposition to computation-expensive permutation tests, this method allows the data required to visualize 3D brain maps of statistically significant regions that contribute to the SVM prediction to be computed within a few seconds. The method is evaluated on simulated data, Alzheimer&#x00027;s disease data, and fMRI lie-detection data.</td>
</tr>
<tr>
<td valign="top" align="left">Simonyan et al. (<xref ref-type="bibr" rid="B144">2013</xref>)</td>
<td valign="top" align="left">AM, Gradient-based</td>
<td valign="top" align="left">This work addresses the visualization of CNNs trained for image classification by two techniques: AM and a method for computing sample-wise saliency maps. The AM method is derived from the literature, and the method for obtaining saliency maps is based on the computation of a single back-propagation pass. The results of both methods are visualized on randomly selected images from the ILSVRC-2013 test set. The work also demonstrates that the created saliency maps can be used for &#x0201C;weakly supervised&#x0201D; object segmentation.</td>
</tr>
<tr>
<td valign="top" align="left">Zeiler and Fergus (<xref ref-type="bibr" rid="B175">2014</xref>)</td>
<td valign="top" align="left">DeConvNets</td>
<td valign="top" align="left">The paper proposes a new method for visualizing the function of intermediate feature layers in DL CNNs. The study profits from previous work, which introduced &#x0201C;deconvolutional networks.&#x0201D; This work compares the output from the proposed method to that of a simpler sensitivity analysis. The article presents clear figures that aid the reader&#x00027;s understanding of the whole method. For various input images, it presents example outputs at different layers of the analyzed CNN. Another interesting result is the presentation of how learned features vary across layers and CNN training epochs.</td>
</tr>
<tr>
<td valign="top" align="left">Bach et al. (<xref ref-type="bibr" rid="B9">2015</xref>)</td>
<td valign="top" align="left">LRP</td>
<td valign="top" align="left">The paper introduces Layer-Wise Relevance Propagation (LRP), an algorithm that generates heatmap visualizations of the contribution of single pixels in predictions made by non-linear DL classifiers. Relevance scores are computed by the method and assigned to the analyzed pixels of the input image. The authors present numerous examples for various datasets (e.g., Pascal VOC2009, MNIST digits, ILSVRC) and classifiers. In addition, model and LRP performance when inferring based on images with flipped pixels are analyzed. The authors present how the choice of the hyperparameters alpha and beta influences the LRP output.</td>
</tr>
<tr>
<td valign="top" align="left">Ribeiro et al. (<xref ref-type="bibr" rid="B126">2016</xref>)</td>
<td valign="top" align="left">LIME</td>
<td valign="top" align="left">This work proposes &#x0201C;local interpretable model-agnostic explanations&#x0201D; (LIME), a method that explains predictions &#x0201C;for any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.&#x0201D; LIME is demonstrated by explaining various models for text analysis and image classification. LIME is based on local simplifications of complex non-linear global solutions learned by the explained model. The authors also propose a variant of the method, called SP-LIME, that addresses the problem of &#x0201C;trusting the model&#x0201D; by selecting representative instances with explanations. The paper also names the desired characteristics for explainers, e.g., interpretability, local fidelity, and being model-agnostic.</td>
</tr>
<tr>
<td valign="top" align="left">Samek et al. (<xref ref-type="bibr" rid="B132">2016</xref>)</td>
<td valign="top" align="left">LRP</td>
<td valign="top" align="left">This paper summarizes LRP, originally published in another paper. This technique explains predictions carried out by DNN models in the image analysis domain.</td>
</tr>
<tr>
<td valign="top" align="left">Zhou et al. (<xref ref-type="bibr" rid="B179">2016</xref>)</td>
<td valign="top" align="left">CAM</td>
<td valign="top" align="left">This paper focuses on the advantages of the global average pooling (GAP) layer proposed for CNNs. The study shows that even when only given image-level labels, the CNN is capable of acquiring a remarkable localization ability during the learning process. The findings are proven and visualized by a class activation mapping (CAM) technique that enables instance-level visualization of features that contribute to the decision of the classifier. The authors argue that their visualization method is superior to previous methods, e.g., because it retains most of the CNN performance by only removing the fully-connected layers. In this work, CAM is compared to a similar approach based on global max pooling (GMP) and proves to be superior in localization. The performance of the method is demonstrated on various image datasets and classification tasks.</td>
</tr>
<tr>
<td valign="top" align="left">Arras et al. (<xref ref-type="bibr" rid="B7">2017a</xref>)</td>
<td valign="top" align="left">LRP</td>
<td valign="top" align="left">This paper applies LRP to explain the predictions carried out by CNN and bag-of-words-based SVM models in text classification tasks. In addition to utilizing the novel technique in a new domain, the authors utilize scores mapped based on words provided by the method to generate novel vector-based representations of texts. Based on these representations, a measure of &#x0201C;model explanatory power&#x0201D; is introduced, which shows for the two analyzed models, that CNN provided a slightly higher level of explainability.</td>
</tr>
<tr>
<td valign="top" align="left">Arras et al. (<xref ref-type="bibr" rid="B8">2017b</xref>)</td>
<td valign="top" align="left">LRP</td>
<td valign="top" align="left">This work extends the usage of the LRP method to RNN networks, e.g., LSTMs and GRUs. The method outputs text heatmaps to visualize explanations. For example, a trained biLSTM text analysis model is used to carry out a five-class sentiment classification task, and the model&#x00027;s predictions are analyzed by the proposed technique and by sensitivity analysis (SA), previously used in the domain. The outputs of both methods are compared and discussed. In addition, a quantitative analysis of extended LRP and SA is carried out by validation in a word-deletion experiment in which the two most informative words are deleted from the inferred text, and the trained model is asked to repeat its prediction.</td>
</tr>
<tr>
<td valign="top" align="left">Zintgraf et al. (<xref ref-type="bibr" rid="B181">2017</xref>)</td>
<td valign="top" align="left">PDA</td>
<td valign="top" align="left">The work identifies two types of methods for explaining DL image analysis predictions: AM and saliency methods. The authors introduce a saliency method called &#x0201C;prediction difference analysis.&#x0201D; For a given picture, the method highlights pixels that have played an important role for or against the prediction that the DL model has made. The method relies on assessing the probability of obtaining a correct prediction given a partially occluded input image and turns out to be time-consuming, requiring a minimum of 20 min of computational time per image. It is possible to apply the method to selected layers of the DL architecture. The authors present example results for natural images as well as brain MRI scans.</td>
</tr>
<tr>
<td valign="top" align="left">Montavon et al. (<xref ref-type="bibr" rid="B111">2017</xref>)</td>
<td valign="top" align="left">DTD</td>
<td valign="top" align="left">This paper introduces a saliency method for explaining the predictions of DL architectures, especially image classification, and is called &#x0201C;Deep Taylor Decomposition (DTD).&#x0201D; The method is compared to sensitivity analysis and is shown to provide more insight when applied to MNIST digits and ILSVRC images as well as CaffeNet and GoogleNet pre-trained models.</td>
</tr>
<tr>
<td valign="top" align="left">Kindermans et al. (<xref ref-type="bibr" rid="B79">2017</xref>)</td>
<td valign="top" align="left">GI, IG, DTD,<break/> SG, LRP</td>
<td valign="top" align="left">This work elaborates on the imperfections of the explanations of DL predictions provided by saliency methods. The paper quotes other research stating that before adopting an explanation, the method providing the explanation should be tested for completeness, implementation invariance, and sensitivity. &#x0201C;Input invariance&#x0201D; is proposed as another requirement. The authors carry out a comparison of various methods on MNIST images and find that many of these tested saliency methods do not maintain reasonable results after a simple shift in the input image. They also provide examples of deliberate misleading manipulation.</td>
</tr>
<tr>
<td valign="top" align="left">Sundararajan et al. (<xref ref-type="bibr" rid="B152">2017</xref>)</td>
<td valign="top" align="left">IG</td>
<td valign="top" align="left">The authors define axioms for explaining methods that must be satisfied as sensitivity and implementation invariance. Furthermore, the authors show that previous methods break either one axiom or the other. The authors also propose their own method, called Integrated Gradients, which aims to satisfy both axioms. The paper provides instructions for use and examples in the medical domain. The Integrated Gradients method is also shown to be effective in the NLP domain.</td>
</tr>
<tr>
<td valign="top" align="left">Lundberg and Lee (<xref ref-type="bibr" rid="B98">2017</xref>)</td>
<td valign="top" align="left">SHAP</td>
<td valign="top" align="left">This paper presents a unified framework for interpreting predictions from ML and DL algorithms that is called SHapley Additive exPlanations (SHAP). The authors define a new, general group of explainability models called additive feature attribution models and show that each of the six methods, namely LIME, DeepLIFT, LRP, Shapley regression values, Shapley sampling values, and Quantitative Input Influence (QII), fits the generalization. The work further elaborates on the computation of SHAP values and presents a comparison of feature importance as computed by the various methods.</td>
</tr>
<tr>
<td valign="top" align="left">Samek et al. (<xref ref-type="bibr" rid="B131">2017a</xref>)</td>
<td valign="top" align="left">SA, LRP</td>
<td valign="top" align="left">This article argues that the explanation of DL predictions should be a requirement, based on the following rationale: the need for verification, improvement, learning from the system, and compliance with legislation. Later, this paper compares heatmaps computed by SA and LRP for a DNN trained for image classification and presents LRP relevances per frame in human action recognition in videos.</td>
</tr>
<tr>
<td valign="top" align="left">Selvaraju et al. (<xref ref-type="bibr" rid="B138">2017</xref>)</td>
<td valign="top" align="left">Grad-CAM</td>
<td valign="top" align="left">The paper proposes a technique for creating visual explanations for inferences made by CNNs, called gradient-weighted class activation mapping (Grad-CAM). The paper identifies that &#x0201C;interpretability matters,&#x0201D; debates &#x0201C;what makes a good explanation?&#x0201D; and discusses a trade-off between faithfulness and interpretability. The authors evaluate their method &#x0201C;via human studies.&#x0201D; Grad-CAM is a generalization of CAM that can be applied to a broad range of CNN architectures, including networks used not only for image classification but also for image captioning or visual question answering. The authors use guided backpropagation with Grad-CAM for fine-grained feature importance.</td>
</tr>
<tr>
<td valign="top" align="left">Shrikumar et al. (<xref ref-type="bibr" rid="B142">2017</xref>)</td>
<td valign="top" align="left">DeepLIFT</td>
<td valign="top" align="left">The article introduces a method called deep learning important features (DeepLIFT) that assigns importance scores to elements of input images, thus explaining the predictions made by a DNN. The authors divide similar methods into two groups, namely: (1) perturbation-based forward propagation approaches, and (2) backpropagation-based approaches, and categorize DeepLIFT into the second group, which is less expensive computationally. The paper further discusses critical previous backpropagation-based approaches. The proposed solution is based on the concept of difference-from-reference.</td>
</tr>
<tr>
<td valign="top" align="left">Doshi-Velez and Kim (<xref ref-type="bibr" rid="B30">2017</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This work focuses on the need to establish a consensus on what model interpretability is and to define an objective measure of interpretability. The authors divide current interpretability evaluations into two groups: (1) if the system is useful, then it must be interpretable, and (2) some model classes claim to be interpretable, i.e., rule list or sparse linear models. Later, the paper names the various desirable features of ML: fairness, unbiasedness, privacy, reliability, robustness, causality, usability, and trust. The authors argue that &#x0201C;the need for interpretability stems from an incompleteness in the problem formalization.&#x0201D; The paper also provides a taxonomy of interpretability evaluation and discusses open problems in the science of interpretability.</td>
</tr>
<tr>
<td valign="top" align="left">Smilkov et al. (<xref ref-type="bibr" rid="B147">2017</xref>)</td>
<td valign="top" align="left">SG, Gradients, IG, GBP</td>
<td valign="top" align="left">This paper introduces an improved version of gradient-based visual instance-level explanations for CNN predictions, called Smoothgrad (SG). The improvement visually sharpens the output of previous gradient-based methods. The method is based on the assumption that the ReLU activation functions commonly used in CNNs are usually not continuously differentiable and, as a result, proposes &#x0201C;smoothing&#x0201D; local discontinuities by computing &#x0201C;a simple stochastic approximation.&#x0201D; The method is compared to gradients, IG and Guided BackProp and is demonstrated to provide improved visualizations.</td>
</tr>
<tr>
<td valign="top" align="left">Adebayo et al. (<xref ref-type="bibr" rid="B1">2018</xref>)</td>
<td valign="top" align="left">Gradients, GI,<break/> IG, GBP, Guided Grad-CAM,<break/> SG</td>
<td valign="top" align="left">This article reflects critically on various gradient-based methods for generating saliency maps for image-based predictions. The authors define two concrete tests for the scope and quality of explanation methods (a model parameter randomization test and a data randomization test) and conduct extensive experiments with their use. According to the results of randomizing the weights in some layers of the tested CNN, some methods are independent both of the model and of the data and thus can provide misleading information. For example, Grad-CAM and Gradients passed the proposed tests, whereas Guided Grad-CAM failed.</td>
</tr>
<tr>
<td valign="top" align="left">Do&#x00161;ilovi&#x00107; et al. (<xref ref-type="bibr" rid="B31">2018</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This paper reviews the recent progress in XAI. The authors define notions that are important for this field: trust, interpretability, comprehensibility, explainability, and transparency and observe that some of these notions are nearly synonyms, while interpretability lacks a unique definition. This work recognizes two approaches to interpretability: integrated in the model structure and <italic>post-hoc</italic>. For the first group, a trade-off between model performance and &#x0201C;readability&#x0201D; is observed. In the second group, the authors distinguish methods that try to establish interpretability, predominantly, on a model-level, whereas another focuses on explainability at the instance-level. The utility of abstracted explanations for artificial general intelligence (AGI) is also discussed.</td>
</tr>
<tr>
<td valign="top" align="left">Hoffman et al. (<xref ref-type="bibr" rid="B57">2018</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This paper tries to explain questions regarding how the quality of XAI explanations can be measured. The paper depicts a conceptual model of the process of explaining AI that takes into consideration various XAI measurement categories: XAI goodness and user satisfaction, the user&#x00027;s mental model (the user&#x00027;s understanding of the AI system), curiosity (understood as XAI&#x00027;s ability to stimulate the user&#x00027;s curiosity, leading to improvement, for example, in the user&#x00027;s mental model), trust (the authors believe that trust in XAI will always be exploratory, i.e., will be based on allowing the user to explore the AI decision system), and performance. The authors indicate that the evaluation of the performance of an XAI system &#x0201C;cannot be neatly divorced from the evaluation of the performance of the user, or from the performance of the human-machine system as a whole.&#x0201D;</td>
</tr>
<tr>
<td valign="top" align="left">Montavon et al. (<xref ref-type="bibr" rid="B112">2018</xref>)</td>
<td valign="top" align="left">SA, AM, DTD, DeConvNets, GBP, LRP</td>
<td valign="top" align="left">The paper focuses on both the <italic>post-hoc</italic> interpretability of a pre-trained model, as opposed to incorporating interpretability in the model structure, and functional understanding instead of &#x0201C;mechanistic or algorithmic understanding.&#x0201D; The paper also analyses the model by explaining individual predictions. The authors create prototype data instances that will maximize activations (AM) of the analyzed DNN. The authors state that heatmap visualizations are more complete than images obtained through sensitivity analysis. The paper also describes good practices regarding DNN design if interpretability is to be achieved.</td>
</tr>
<tr>
<td valign="top" align="left">Ghorbani et al. (<xref ref-type="bibr" rid="B43">2019a</xref>)</td>
<td valign="top" align="left">ACE</td>
<td valign="top" align="left">This work proposes to go beyond per-sample-based explanations of ML predictions, discussing the principles and desiderata for concept-based explanations, before proposing a new algorithm for the automated concept-based explanation (ACE) of visual concepts. The ACE for an analyzed class analyzes many sample images, for each carrying out multi-resolution segmentation, clustering of similar segments, and computation of importance scores for each cluster by means of testing with the concept activation vectors (TCAV) method.</td>
</tr>
<tr>
<td valign="top" align="left">Lapuschkin et al. (<xref ref-type="bibr" rid="B90">2019</xref>)</td>
<td valign="top" align="left">LRP, SpRAy</td>
<td valign="top" align="left">This study presents how some example image classifiers achieve correct predictions through wrong features attributed to the dataset and not the objects themselves. The visualization of explanations is carried out with LRP. The work also uncovers a locally correct solution for AI trained to play a pinball game, which is not globally correct. Finally, the work proposes spectral relevance analysis (SpRAy), a semi-automated method for the analysis of AI behavior in large datasets.</td>
</tr>
<tr>
<td valign="top" align="left">Bosse et al. (<xref ref-type="bibr" rid="B19">2022</xref>)</td>
<td valign="top" align="left">CRP</td>
<td valign="top" align="left">Relevance Propagation (CRP) - A XAI method leveraging and extending the earlier LRP method and the concept of Activation Maximization to provide insights regarding both image-particular features contributing to the model&#x00027;s prediction as global features that the model learned to value and interpret. Owing to the design, the method allows one to quickly understand the general concept the model considers essential and the realization of that concept in the particular image in question.</td>
</tr>
<tr>
<td valign="top" align="left" colspan="3"><bold>XAI in medical imaging applications</bold></td>
</tr>
<tr>
<td valign="top" align="left">Mour&#x000E3;o-Miranda et al. (<xref ref-type="bibr" rid="B114">2005</xref>)</td>
<td valign="top" align="left">Spatial maps</td>
<td valign="top" align="left">This work compares the performance of SVM and Fisher Linear Discriminant (FLD), in the assessment of brain states based on fMRI data without the prior selection of spatial features. Two functional tasks are analyzed, and PCA is employed for dimensionality reduction. During training, classifiers identify which voxels constitute &#x0201C;discriminating volumes,&#x0201D; i.e., features, that provide the most information needed for the classification of brain state. Provides visualizations of differences between classifiers in terms of the chosen discriminating volumes.</td>
</tr>
<tr>
<td valign="top" align="left">Kriegeskorte et al. (<xref ref-type="bibr" rid="B84">2006</xref>)</td>
<td valign="top" align="left">Searchlight</td>
<td valign="top" align="left">The paper introduces a method for the visualization of fMRI data, called &#x0201C;searchlight,&#x0201D; which provides an answer to the question &#x0201C;where in the brain the regional spatial activity pattern differs across experimental conditions.&#x0201D; Searchlight allows a continuous map in which informative regions are marked by moving a spherical multivariate &#x0201C;searchlight&#x0201D; through the measured volume to be obtained. This work demonstrates the utility of the method in regard to artificial fMRI signals and then further applies the method to real fMRI signals.</td>
</tr>
<tr>
<td valign="top" align="left">Wang et al. (<xref ref-type="bibr" rid="B164">2007</xref>)</td>
<td valign="top" align="left">Spatial maps</td>
<td valign="top" align="left">The authors analyze fMRI scans from multiple subjects by means of SVM and random effects analysis. The authors extract differences in brain activity between tasks in the form of a spatial discriminance map (SDM, also called &#x0201C;discriminating volume&#x0201D;) by means of SVM. To assess between-subject differences, the authors utilize both random effects analysis and permutation testing. The authors also propose group-level analysis, which is applied to a sensory-motor task with fMRI data.</td>
</tr>
<tr>
<td valign="top" align="left">Blankertz et al. (<xref ref-type="bibr" rid="B16">2011</xref>)</td>
<td valign="top" align="left">Spatial maps</td>
<td valign="top" align="left">This paper elaborates on the classification of human activity based on event-related potentials (ERP) in EEG data. This paper proposes a framework for signal preprocessing and highlights shrinkage estimators as a tool for improving later linear discriminant analysis (LDA). The improvements are presented in an evaluation experiment carried out on continuous EEG signals with the results compared to other models from the LDA family.</td>
</tr>
<tr>
<td valign="top" align="left">Haufe et al. (<xref ref-type="bibr" rid="B54">2014</xref>)</td>
<td valign="top" align="left">Spatio-spectral decomposition</td>
<td valign="top" align="left">Given the functional brain analysis domain and the struggle to model brain signals, this paper proposes a procedure for transforming &#x0201C;backward models&#x0201D; into &#x0201C;forward models&#x0201D; to enable the neurophysiological interpretation of the parameters of linear &#x0201C;backward models.&#x0201D; The considerations are valid for both EEG and fMRI data. The authors demonstrated on simulated and real fMRI and EEG data that the simple analysis of extraction filters may lead to severe misinterpretation in practice, whereas the proposed method for analyzing activation patterns resolves the problem.</td>
</tr>
<tr>
<td valign="top" align="left">Sturm et al. (<xref ref-type="bibr" rid="B151">2016</xref>)</td>
<td valign="top" align="left">LRP</td>
<td valign="top" align="left">This study proposes the application of DNNs with LRP for the first time for EEG data analysis. With the use of LRP, sample DNN decisions are transformed into relevance heatmaps. The authors found that DNN&#x00027;s performance is comparable to that of previously utilized approaches with CSP and LDA methods and that for low-performing subjects, transferring the learning of a DNN from another subject can improve the results.</td>
</tr>
<tr>
<td valign="top" align="left">Schirrmeister et al. (<xref ref-type="bibr" rid="B134">2017</xref>)</td>
<td valign="top" align="left">Correlation maps</td>
<td valign="top" align="left">This work studies the application of CNNs for decoding imagined or executed tasks from raw EEG data and compared the CNN&#x00027;s performance to a broadly used approach that utilizes a filter bank common spatial patterns (FBCSP) algorithm. The paper also benefits from visualization techniques that enable explanations regarding which features in the EEG data are informative for the CNN. A shallow and deep version of the CNN is designed for the study, in addition to a mixture of these two models, called &#x0201C;hybrid CNN,&#x0201D; and a ResNet-like CNN. Using an example dataset, the authors demonstrate that CNN&#x00027;s performance is similar to that of the FBCSP approach; however, applying recent improvements in the design and training of the CNNs allows this approach to achieve superior results.</td>
</tr>
<tr>
<td valign="top" align="left">Herent et al. (<xref ref-type="bibr" rid="B55">2018</xref>)</td>
<td valign="top" align="left">Correlations maps, occlusion-based heatmaps</td>
<td valign="top" align="left">This study analyses a very large sample (almost 1,600 patients) of MRI data to train brain age predictors. Various ML models are trained. Many preprocessing techniques are presented. In addition, raw data are used to fine-tune a CNN model. To explain the causes of predictions, the authors utilize tools like correlation maps, weights maps, and heatmaps with occlusion methods. A 2D CNN model provides the smallest error of 3.6 years of brain age.</td>
</tr>
<tr>
<td valign="top" align="left">Li et al. (<xref ref-type="bibr" rid="B94">2018</xref>)</td>
<td valign="top" align="left">Corrupting</td>
<td valign="top" align="left">The work focuses on the application of DL in the prediction of autism spectrum disorder (ASD) from fMRI data. The authors observe that recent studies have applied DL in this domain; however, such studies have also lacked model transparency. This paper proposes a simple CNN network and a time-window sliding technique to capture the time-spatial characteristics of the analyzed fMRI. The proposed approach is based on &#x0201C;corrupting&#x0201D; sections of the original images and assessing the change in the prediction of a model that has been previously trained on original non-corrupted data. The framework is first tested with synthetic data and then with real fMRI data.</td>
</tr>
<tr>
<td valign="top" align="left">Paschali et al. (<xref ref-type="bibr" rid="B120">2018</xref>)</td>
<td valign="top" align="left">Extreme cases</td>
<td valign="top" align="left">This work tests models created for carrying out predictions in the medical field by assessing inferences obtained from images with &#x0201C;extreme cases of noise, outliers and ambiguous input data.&#x0201D; The rationale for such specific testing is provided as &#x0201C;existing model evaluation routines look deeply into over-fitting but insufficiently into scenarios of model sensitivity to variations of the input.&#x0201D; The authors present several strategies for creating &#x0201C;adversarial examples&#x0201D; of images and state that even though the human eye can hardly catch the difference in a manipulated image from its original, nonetheless a pre-trained model can be easily fooled to change its prediction.</td>
</tr>
<tr>
<td valign="top" align="left">Yang et al. (<xref ref-type="bibr" rid="B170">2018</xref>)</td>
<td valign="top" align="left">SA-3DUCM, 3D-CAM, 3D-Grad-CAM</td>
<td valign="top" align="left">This paper proposes 3D extensions of methods for creating 2D visual explanations in CNN image analysis and applies these methods to the 3D CNN analysis of MRI scans from patients suffering from Alzheimer&#x00027;s disease. The authors propose three methods for explaining the black-box predictions of the utilized 3D CNNs: sensitivity analysis in 3D (SA-3DUCM), 3D class activation mapping (3D-CAM), and 3D gradient-weighted class activation mapping (3D-Grad-CAM). The authors observe that each explanation technique produces very different visualizations and conclude that improvement is still needed in the field of 3D DL explainability tools.</td>
</tr>
<tr>
<td valign="top" align="left">Zhao et al. (<xref ref-type="bibr" rid="B178">2018</xref>)</td>
<td valign="top" align="left">3D-Grad-CAM, Respond-CAM</td>
<td valign="top" align="left">This work proposes a novel method called Respond-CAM for creating explanations and their visualizations of predictions inferred by DL 3D image analysis models. The method is tested on 3D images of macromolecular complex structures obtained from Cellular Electron Cryo-Tomography (CECT). The authors argue that this dataset provides large variations in shapes and sizes, which is favorable for performance validation. Tests are conducted on two CNN models with the results compared with the Grad-CAM 3D visualization method. According to the results, the introduced Respond-CAM outperforms the 3D Grad-CAM visualization method.</td>
</tr>
<tr>
<td valign="top" align="left">Qin et al. (<xref ref-type="bibr" rid="B124">2018</xref>)</td>
<td valign="top" align="left">Attention maps</td>
<td valign="top" align="left">This article describes autofocus convolutional layer (ACL) for 3D CNNs to provide scale-invariance, especially in the biomedical domain of fMRI and CT data. The authors propose the ACLs can be added to existing 3D CNN models. The ACLs compute attention maps, which are utilized for creating the visualizations of features that are important in the model&#x00027;s decision process. As an example, a 3D CNN model is modified with ACLs to demonstrate the whole concept on the real fMRI and CT data.</td>
</tr>
<tr>
<td valign="top" align="left">Couture et al. (<xref ref-type="bibr" rid="B26">2018</xref>)</td>
<td valign="top" align="left">Cropping</td>
<td valign="top" align="left">This paper discusses Multiple Instance (MI) learning for breast tumor histology. The authors propose a method of interpretability that by cropping areas of selected size of the original image and learning the classifier based on them can output the importance of each single prediction to the final &#x0201C;bag&#x0201D; prediction. The final prediction is obtained through aggregating instance predictions by pooling with here proposed quantile function. The importance of image augmentation is also highlighted.</td>
</tr>
<tr>
<td valign="top" align="left">Thomas et al. (<xref ref-type="bibr" rid="B155">2019</xref>)</td>
<td valign="top" align="left">DeepLight, LRP</td>
<td valign="top" align="left">This work introduces DeepLight, a CNN feature extractor&#x0002B;LSTM-based DL framework for the analysis of fMRI scans. The scanned brain slices are aligned in a sequence processed by the LSTM. Visual explainability is maintained by the use of LRP. The method is tested on a large, 100-patient fMRI dataset. The method is compared to the baseline General Linear Model (GLM), Searchlight, and whole-brain lasso models. The authors underscore that the proposed LSTM method is an improvement over previous techniques due to its use of a data-driven identification of the time component in the spatial fMRI analysis instead of hand-crafted methods for the identification of time dependency.</td>
</tr>
<tr>
<td valign="top" align="left">Tang et al. (<xref ref-type="bibr" rid="B154">2019</xref>)</td>
<td valign="top" align="left">Guided Grad-CAM, feature occlusion</td>
<td valign="top" align="left">This paper proposes a proof-of-concept DL framework for the classification of images that are important for the diagnosis of Alzheimer&#x00027;s disease and the visualization of explanations of DL predictions. Initially, this study compares the framework of a human expert analysis of images related to predicting Alzheimer&#x00027;s disease with CNN models. The authors test many CNN architectures and hyperparameters and note that a relatively shallow CNN has achieved &#x0201C;strong classification performance.&#x0201D; The obtained results show impressive performance of the automated method. In addition, heatmap explanations generated by Guided Grad-CAM and feature occlusion are discussed.</td>
</tr>
<tr>
<td valign="top" align="left">Lee et al. (<xref ref-type="bibr" rid="B92">2019</xref>)</td>
<td valign="top" align="left">CAM</td>
<td valign="top" align="left">This paper proposes a DL framework with an ensemble of four pre-trained CNN models to address the task of predicting acute intracranial hemorrhage based on a dataset of 904 CT non-contrast head scans. The framework&#x00027;s original design allows it to mimic the human radiological workflow at the image preprocessing stage. The model showed a comparable performance to that of radiologists. This work provided activation maps generated by the CAM method, which were separately validated by radiologists. They also report an attempt to train a 3D model to carry out the same tasks; however, the authors report a very lower mean average precision (mAP) of the 3D model, attributing this fact to &#x0201C;the curse of dimensionality.&#x0201D;</td>
</tr>
<tr>
<td valign="top" align="left">Wang et al. (<xref ref-type="bibr" rid="B163">2019</xref>)</td>
<td valign="top" align="left">Activation patterns</td>
<td valign="top" align="left">This paper trains a CNN model to classify six hepatic tumor entities using 494 lesions on multi-phasic MRI. A <italic>post-hoc</italic> algorithm inferred the presence of imaging features in a test set of 60 lesions by analyzing activation patterns of the pre-trained CNN model and scoring of radiological features. The developed system accomplishes 76.5% positive predictive value in identifying the correct radiological features present in each test lesion for liver tumor diagnosis.</td>
</tr>
<tr>
<td valign="top" align="left">Palatnik de Sousa et al. (<xref ref-type="bibr" rid="B118">2019</xref>)</td>
<td valign="top" align="left">LIME</td>
<td valign="top" align="left">In this research LIME method is used to provide explanations to two CNN models deployed on the task of classification of lymph node metastases based on medical images from a publicly available data set.</td>
</tr>
<tr>
<td valign="top" align="left">B&#x000F6;hle et al. (<xref ref-type="bibr" rid="B17">2019</xref>)</td>
<td valign="top" align="left">LRP, Guided Backpropagation (GB)</td>
<td valign="top" align="left">The study uses LRP and GB XAI method to provide explanations of predictions carried out by a Deep Neural Network deployed on MRI data regarding Alzheimer&#x00027;s disease. The authors demonstrate that the LRP technique is superior to GB in the analyzed task.</td>
</tr>
<tr>
<td valign="top" align="left">Papanastasopoulos et al. (<xref ref-type="bibr" rid="B119">2020</xref>)</td>
<td valign="top" align="left">IG, SG</td>
<td valign="top" align="left">This paper applies XAI techniques, including IG and SG, to the regions-of-interest from the training set. They trained a CNN for the classification of estrogen receptor status (ER&#x0002B; and ER&#x02013;) to aid in the molecular classification of breast cancer based on MRI medical imaging. Their model lets the CNN select features from various complementary characteristics of the same patient images.</td>
</tr>
<tr>
<td valign="top" align="left">Essemlali et al. (<xref ref-type="bibr" rid="B38">2020</xref>)</td>
<td valign="top" align="left">Saliency maps</td>
<td valign="top" align="left">This work introduces a XAI experiment to better understand the connectomic structure of the Alzheimer&#x00027;s disease. They showed that deep learning over structural connectomes are a prevailing method to leverage connectomes in the complex structure derived from diffusion MRI tractography. The article understands the brain connectivity based on the different brain&#x00027;s alteration with dementia with saliency map extraction. The introduced procedure revealed that no single region is responsible for Alzheimer&#x00027;s disease, but the combined effect of several cortical regions.</td>
</tr>
<tr>
<td valign="top" align="left">Windisch et al. (<xref ref-type="bibr" rid="B167">2020</xref>)</td>
<td valign="top" align="left">Grad-CAM</td>
<td valign="top" align="left">This paper focuses on a neural network to differentiate between MRI slices containing either a vestibular schwannoma, a glioblastoma, or no tumor for a basic brain tumor detection. The Grad-CAM is implemented in their study to find the areas that the neural network based its predictions on. To assess the confidence of the model in its predictions, the Bayesian neural network approach is considered.</td>
</tr>
<tr>
<td valign="top" align="left">Meske and Bunde (<xref ref-type="bibr" rid="B104">2020</xref>)</td>
<td valign="top" align="left">LIME</td>
<td valign="top" align="left">The study discusses how XAI allows to improve the degree of AI transparency on the example of detecting malaria from medical images. A simple Multi-Layer Perceptron and CNN models are trained and LIME is used to demonstrate heatmaps on the original input images.</td>
</tr>
<tr>
<td valign="top" align="left">Nigri et al. (<xref ref-type="bibr" rid="B116">2020</xref>)</td>
<td valign="top" align="left">Swap Test</td>
<td valign="top" align="left">This research proposes a novel Swap Test technique to provide heatmaps that depict areas of the brain most indicative of the Alzheimer&#x00027;s disease based on predictions from CNN models carried out on MRI brain data. The new technique is compared to the occlusion test and by measures of continuity and selectivity is determined to be superior.</td>
</tr>
<tr>
<td valign="top" align="left">El-Sappagh et al. (<xref ref-type="bibr" rid="B36">2021</xref>)</td>
<td valign="top" align="left">SHAP</td>
<td valign="top" align="left">This article is developed a two-layer model with random forest (RF) as classifier algorithm that enhances the clinical understanding of Alzheimer&#x00027;s disease diagnosis and progression processes. The developed model provides physicians with accurate decisions along with a set of explanations for every decision. They implemented 22 explainers for each layer based on a decision tree classifier and a fuzzy rule-based system.</td>
</tr>
<tr>
<td valign="top" align="left">Lee et al., <xref ref-type="bibr" rid="B92">2019</xref></td>
<td valign="top" align="left">SLIC, LIME</td>
<td valign="top" align="left">This study introduces a DL classification system for breast cancer detection based on ultrasound images. The models were trained on a data set of images obtained from 153 patients. The proposed approach merges a known pixel segmentation method (named, SLIC) and LIME XAI technique and applied them to an ultrasound image already segmented by a trained DL segmentation model which allows LIME to highlight the meaningful fragment of the segmented image.</td>
</tr>
<tr>
<td valign="top" align="left">Binder et al. (<xref ref-type="bibr" rid="B14">2021</xref>)</td>
<td valign="top" align="left">LRP</td>
<td valign="top" align="left">This paper studied a XAI technique for the integrated profiling of morphological, molecular and clinical features from breast cancer histology. The LRP-heatmaps they compute diverge because attention computes a weight in the forward pass without considering the final prediction made further in the predictor. It facilitates the quantitative evaluation of histomorphological features, the prediction of multiple molecular markers for subsets of cases with high accuracy (&#x0003E;95%) and can relate morphological and molecular properties in terms of cancer biology.</td>
</tr>
<tr>
<td valign="top" align="left">Pennisi et al. (<xref ref-type="bibr" rid="B121">2021</xref>)</td>
<td valign="top" align="left">Grad-CAM, Var-Grad</td>
<td valign="top" align="left">This study proposes a novel deep learning approach to classification of COVID-19 based on CT scans. The experiments are verified with use of saliency maps generated by a mixture of two XAI methods.</td>
</tr>
<tr>
<td valign="top" align="left">Zhang et al. (<xref ref-type="bibr" rid="B177">2021</xref>)</td>
<td valign="top" align="left">3D Grad-CAM</td>
<td valign="top" align="left">A novel model is proposed in this study which is capable of predicting Alzheimer&#x00027;s disease based on 3D structure MRI data. The proposed approach allows end-to-end learning, automated diagnosis and provides 3D class activation mapping heat-maps. The method achieves superior results when compared to selected 3D Deep Neural Networks.</td>
</tr>
<tr>
<td valign="top" align="left" colspan="3"><bold>Perspective and review studies</bold></td>
</tr>
<tr>
<td valign="top" align="left">Holzinger et al. (<xref ref-type="bibr" rid="B63">2014</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This review gathers thoughts regarding knowledge discovery and interactive datamining in bioinformatics and elaborates on the enormous amounts of data available and the means to benefit from them. One thought is that given the abundance of data, there is insufficient bandwidth to focus on all of it. Another thought points out that the recent ML and AI algorithms lack interpretability, and as a result, should be treated with caution. Usability and interaction with the introduced methods are also named as important challenges. Describes four future areas of research in the domain: interactive data integration, data fusion and preselection of data-sets; interactive sampling, cleansing, preprocessing, mapping; interactive advanced datamining methods, pattern discovery; and interactive visualization, human-computer interaction, analytics, decision support.</td>
</tr>
<tr>
<td valign="top" align="left">Holzinger (<xref ref-type="bibr" rid="B60">2016</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This work elaborates on ML in the medical field, especially in interactive machine learning (iML), a group of algorithms that can interact with agents, possibly humans, for optimization of their learning process. The author argues that achieving a fully automatic ML is difficult in medical applications due to the fact that biomedical data-sets are full of uncertainty, incompleteness, and other flaws. The author states that a space for &#x0201C;human in-the-loop&#x0201D; methods is created, in which expert knowledge can help ML algorithms. The paper identifies the basis of iML as reinforcement learning (RL), preference learning (PL), and active learning (AL).</td>
</tr>
<tr>
<td valign="top" align="left">Biran and Cotton (<xref ref-type="bibr" rid="B15">2017</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This study reviews research concerning the explainability and justification of ML. The paper identifies the connected terms, interpretability, explainability, and justification and provides information regarding the historical approaches to the treatment of those terms in many ML-related fields. This review identifies two areas of work regarding explainability: the interpretation and justification of predictions and interpretable models. Model-specific and model-agnostic solutions are also identified. The authors observed that, especially in the NLP field, research has been focused on selecting a small part of the input as evidence to justify the predictions. The paper also identifies a model approximation, which &#x0201C;focuses on deriving a simple, interpretable model that approximates a more complex, uninterpretable one,&#x0201D; both in NLP and image classification.</td>
</tr>
<tr>
<td valign="top" align="left">Holzinger et al. (<xref ref-type="bibr" rid="B62">2017</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">The authors dispute the need for research on XAI and name the European General Data Protection Regulation as one of the reasons. The authors also observe a trade-off between model performance and explainability. The authors further elaborate on explainability and other notions such as functional understanding, interpretation, and causality. Explainable models are divided into <italic>post-hoc</italic> and <italic>ante-hoc</italic>. The authors name Amplitude Modulation Frequency Modulation (AM-FM) decompositions as an example of creating explainable features in a medical image domain.</td>
</tr>
<tr>
<td valign="top" align="left">Lipton (<xref ref-type="bibr" rid="B95">2018</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">The work discusses various notions regarding the interpretability of ML. The author underscores the fact that interpretability does not reference a monolithic concept. Importantly, the authors argue that especially in the medical field, &#x0201C;the short-term goal of building trust with doctors by developing transparent models might clash with the longer-term goal of improving health care.&#x0201D; The article also presents a warning regarding blindly trusting <italic>post-hoc</italic> interpretations, because they can potentially be misleading. The paper motivates researchers to clearly define notions regarding general interpretability each time they publish results and claim to achieve it.</td>
</tr>
<tr>
<td valign="top" align="left">Holzinger (<xref ref-type="bibr" rid="B61">2018</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This study is an overview of development from ML to XAI with an emphasis on the importance of human-computer interaction. The author introduces terms such as Automatic ML (aML), interactive ML (iML), and Human-Computer Interaction and Knowledge Discovery/Data Mining (HCI-KDD) and elaborates on several important elements of HCI-KDD: (1) data preprocessing and integration, (2) learning algorithms, (3) data visualization, (4) issues of data protection safety and security, (5) graph-based data mining, (6) topology-based data mining, and (7) entropy-based data mining. The author also mentions the struggle to achieve XAI.</td>
</tr>
<tr>
<td valign="top" align="left">Hosny et al. (<xref ref-type="bibr" rid="B68">2018</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This opinion article aims to familiarize a general understanding of AI methods, especially in regard to image-based tasks in radiology and oncology. The authors elaborate on AI capabilities in detection, characterization (a medical term referring to &#x0201C;the segmentation, diagnosis and staging of a disease&#x0201D;), monitoring, and other opportunities, e.g., data preprocessing or integrated diagnostics. The requirements regarding managing medical data are also discussed (Health Insurance Portability and Accountability Act - HIPAA), and positive examples are named.</td>
</tr>
<tr>
<td valign="top" align="left">Tjoa and Guan (<xref ref-type="bibr" rid="B157">2020</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This paper is a review of a non-exhaustive list of works regarding XAI in general and, specifically, in the medical domain. The authors find that many works that propose interpretability methods assume that such method provide obvious results that do not require human testing, which is believed to not always be true. The paper introduces various types of interpretability in AI and lists a few risks of their application in the medical domain.</td>
</tr>
<tr>
<td valign="top" align="left">Holzinger et al. (<xref ref-type="bibr" rid="B65">2019</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This work elaborates on the importance in the medical field of the ability to explain the elements that cause AI to output a given prediction. Differences between explainablity and causability are discussed. The paper distinguishes <italic>post-hoc</italic> and <italic>ante-hoc</italic> systems that enable the explanation of versions parts of the AI decision process. LIME is given as an example of the <italic>post-hoc</italic> system, whereas <italic>ante-hoc</italic> systems are interpretable based on their design, e.g., decision trees or fuzzy inference systems. The authors mention tools that are useful when interpreting DL predictions: uncertainty, attribution, activation maximization. The paper also presents an example of <italic>post-hoc</italic> and <italic>ante-hoc</italic> explanations by a human expert in a histopathological use-case.</td>
</tr>
<tr>
<td valign="top" align="left">Lundervold and Lundervold (<xref ref-type="bibr" rid="B99">2019</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This article is a survey of ML and DL applications in the medical domain, with a particular emphasis on the MRI field. The authors name numerous areas of AI application in medicine and observe that DL allows for a significant increase of the speed of computations and improvements in computational quality. It is noted that DL can be used for both diagnosis and signal processing. The authors also list medical imaging datasets in Arxiv and Github as the newest-information hubs, as well as various medical imaging competitions.</td>
</tr>
<tr>
<td valign="top" align="left">Langlotz et al. (<xref ref-type="bibr" rid="B88">2019</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This work identifies future key research areas in the AI Medical domain. The paper recalls the achievement of super-human performance by AI models in the classification of objects in 2015 during the ImageNet Large-Scale Visual Recognition Challenge. In addition, research opportunities in AI for medical imaging related to data sharing and data availability are discussed. Among various other topics, the need for &#x0201C;machine learning methods that can explain the advice they provide to human users&#x0201D; is highlighted.</td>
</tr>
<tr>
<td valign="top" align="left">Xu et al. (<xref ref-type="bibr" rid="B168">2019</xref>)</td>
<td valign="top" align="left">Various</td>
<td valign="top" align="left">This shallow review introduces the history of XAI, starting from expert systems and ML and progressing to the latest progress in DL-related XAI. The article gives an example of a DARPA-funded program for the development of XAI and depicts elements of current state-of-the-art and desiderata for future development, with a focus on the image analysis domain. This work also recalls the errors in trained models, as discovered by XAI. The paper ends with a discussion on the challenges and future directions.</td>
</tr>
<tr>
<td valign="top" align="left">Kohoutov&#x000E1; et al. (<xref ref-type="bibr" rid="B80">2020</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This paper elaborates on interpreting ML models in neuroimaging. The authors classify ways to interpret these complex brain models, they should (i) be understandable to humans, (ii) deliver valuable information about what mental or behavioral constructs are represented in particular brain regions, and (iii) establish that they are based on the relevant neurobiological signal, not artifacts or confounds. The provided protocol will support more interpretable neuroimaging models, also the users should be familiar with basic programming in MATLAB or Python.</td>
</tr>
<tr>
<td valign="top" align="left">Singh et al. (<xref ref-type="bibr" rid="B146">2020</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This study contributes a common framework for comparison of 13 XAI attribution methods used for Ophthalmic Disease Classification based on medical imaging. It presents a thorough comparison of both quantitative and qualitative performance of the methods.</td>
</tr>
<tr>
<td valign="top" align="left">Lucieri et al. (<xref ref-type="bibr" rid="B97">2021</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This study reviews XAI methods and studies applied to the dermatology domain. It identifies four main groups of explanation approaches as Visual Relevance Localization, Dermoscopic Feature Prediction and Localization, Similarity Retrieval and Intervention.</td>
</tr>
<tr>
<td valign="top" align="left">Hryniewska et al. (<xref ref-type="bibr" rid="B69">2021</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This research carries out a systematic review of numerous studies predicting COVID-19 from medical images based on deep learning architectures which utilize XAI techniques and focuses on mistakes made at different stages of model development. Among other, the study highlights example errors in XAI explanations as observer by trained radiologist.</td>
</tr>
<tr>
<td valign="top" align="left">Joshi et al. (<xref ref-type="bibr" rid="B72">2021</xref>)</td>
<td valign="top" align="left">N/A</td>
<td valign="top" align="left">This paper describes and reviews the present literature to present a comprehensive survey and commentary on the different explainability methods and techniques in a multimodal deep neural net especially image and text modalities in vision and language settings. The paper covers numerous topics on multimodal AI and its applications for generic domains including the significance, datasets, fundamental building blocks of the methods and techniques, challenges, applications, and future trends in this domain.</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>N/A, not applicable.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>Co-occurrence analysis</title>
<p>Co-occurrence analysis counts matched data within a collection unit. <xref ref-type="fig" rid="F4">Figure 4</xref> visualizes the co-occurrences of the key vocabularies of XAI/AI concepts, XAI methodologies, imaging modalities, and diseases from our reviewed papers. In this figure, word clusters are represented by different colors, the bubble size denotes the number of publications, and the connection width reflects the frequency of co-occurrence. We only considered the abstracts in our co-occurrence analysis due to the large number of words within the full texts and the problem of overlapping labels.</p>
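The counting step behind such a network can be sketched in a few lines of Python (a minimal illustration with made-up abstracts and a made-up vocabulary, not the actual pipeline used to produce Figure 4): for each abstract, record which vocabulary terms appear, then increment a counter for every pair of terms that co-occurs.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_counts(abstracts, vocabulary):
    """Count how often each pair of vocabulary terms co-occurs in the same abstract."""
    counts = Counter()
    for text in abstracts:
        tokens = set(re.findall(r"[a-z0-9-]+", text.lower()))
        present = sorted(term for term in vocabulary if term in tokens)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

# Hypothetical abstracts and vocabulary, purely for illustration:
abstracts = [
    "We apply LRP to fMRI data for Alzheimer's disease classification.",
    "Grad-CAM heatmaps on MRI scans reveal tumor regions.",
    "LRP and Grad-CAM are compared on MRI data.",
]
vocab = ["lrp", "grad-cam", "mri", "fmri"]
counts = cooccurrence_counts(abstracts, vocab)
```

In a network visualization, each pair's count would become an edge weight, and each term's total frequency a bubble size.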
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Co-occurrence network of the commonly used words in reviewed studies.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-16-906290-g0004.tif"/>
</fig>
</sec>
<sec>
<title>Reliability analysis: XAI tests/measures</title>
<p>XAI methods are likely to be unreliable with respect to some factors that do not affect the model outcome. The output of XAI methods, for instance, could be significantly altered by a slight transformation of the input data, even though the model remains robust to these changes (Kindermans et al., <xref ref-type="bibr" rid="B78">2019</xref>). Accordingly, we conducted a subjective examination of articles that have introduced properties such as completeness, implementation invariance, input invariance, and sensitivity (Bach et al., <xref ref-type="bibr" rid="B9">2015</xref>; Kindermans et al., <xref ref-type="bibr" rid="B79">2017</xref>; Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref>; Alvarez-Melis and Jaakkola, <xref ref-type="bibr" rid="B4">2018</xref>) to evaluate the reliability of these methods (<xref ref-type="table" rid="T3">Table 3</xref>).</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>XAI tests and measures.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>XAI methods</bold></th>
<th valign="top" align="left"><bold>Test/measure</bold></th>
<th valign="top" align="left"><bold>Outcome</bold></th>
<th valign="top" align="left"><bold>References</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">GI, IG, LRP</td>
<td valign="top" align="left">Ground-truth</td>
<td valign="top" align="left">LRP was shown to perform best, followed by IG.</td>
<td valign="top" align="left">Osman et al., <xref ref-type="bibr" rid="B117">2020</xref></td>
</tr>
<tr>
<td valign="top" align="left">LRP, LIME</td>
<td valign="top" align="left">Relevance structural similarity</td>
<td valign="top" align="left">Superiority of decomposition-based algorithms such as LRP over the others</td>
<td valign="top" align="left">Douglas and Farahani, <xref ref-type="bibr" rid="B33">2020</xref></td>
</tr>
<tr>
<td valign="top" align="left">LIME, SHAP</td>
<td valign="top" align="left">Robustness/Lipschitz Estimate</td>
<td valign="top" align="left">Both methods were robust for SVM, unstable for NN and RF; LIME was more unstable than SHAP</td>
<td valign="top" align="left">Alvarez-Melis and Jaakkola, <xref ref-type="bibr" rid="B4">2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">Saliency maps, GI, IG, LRP, Occlusion Sensitivity, LIME</td>
<td valign="top" align="left">Robustness/Lipschitz Estimate</td>
<td valign="top" align="left">IG performed best, LIME very bad, rest satisfactory</td>
<td valign="top" align="left">Alvarez-Melis and Jaakkola, <xref ref-type="bibr" rid="B4">2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">Gradients, GI, SG, DeConvNets, GBP, DTD, IG</td>
<td valign="top" align="left">Input invariance</td>
<td valign="top" align="left">DTD and IG pass &#x0201C;contingent on the choice of reference and the type of transformation considered&#x0201D;</td>
<td valign="top" align="left">Kindermans et al., <xref ref-type="bibr" rid="B79">2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">DeepLift, LRP, IG</td>
<td valign="top" align="left">Implementation invariance</td>
<td valign="top" align="left">DeepLift and LRP fail, IG pass</td>
<td valign="top" align="left">Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Gradients, DeConvNets, GBP, DeepLift, LRP, IG</td>
<td valign="top" align="left">Sensitivity</td>
<td valign="top" align="left">Gradients, DeConvNets, GBP fail, DeepLift, LRP, IG pass</td>
<td valign="top" align="left">Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">Gradients, GI, IG, GBP, Guided Grac-CAM, SG</td>
<td valign="top" align="left">Model parameter randomization test</td>
<td valign="top" align="left">Gradients and Grad-CAM pass, rest fail</td>
<td valign="top" align="left">Adebayo et al., <xref ref-type="bibr" rid="B1">2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">Gradients, GI, IG, GBP, Guided Grac-CAM, SG</td>
<td valign="top" align="left">Data randomization test</td>
<td valign="top" align="left">GI, IG pass, rest fail</td>
<td valign="top" align="left">Adebayo et al., <xref ref-type="bibr" rid="B1">2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">Gradients, IG, DeepLIFT</td>
<td valign="top" align="left">Fragility/adversarial input samples</td>
<td valign="top" align="left">Gradients and DeepLIFT are more fragile</td>
<td valign="top" align="left">Ghorbani et al., <xref ref-type="bibr" rid="B43">2019a</xref></td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>Quality assessment</title>
<p>The Cochrane Collaboration&#x00027;s tool (Higgins et al., <xref ref-type="bibr" rid="B56">2011</xref>) was used to assess the risk of bias in each trial (<xref ref-type="fig" rid="F5">Figure 5</xref>). The articles were categorized, for each domain, as having: (a) low risk of bias, (b) high risk of bias, or (c) unclear risk of bias. Using the Cochrane Collaboration&#x00027;s tool, we judged most domains to be unclear or not reported. Finally, the overall quality of the studies was classified as weak, fair, or good if &#x0003C;3, 3, or &#x02265;4 domains, respectively, were rated as low risk. Among the 78 studies included in the systematic review, 22 were categorized as good quality, 50 as fair quality, and 6 as low quality.</p>
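The weak/fair/good rule described above is simple enough to express directly; the following Python sketch (with hypothetical per-domain ratings, not data from the review itself) mirrors it:

```python
def overall_quality(domain_ratings):
    """Map per-domain risk-of-bias ratings ('low', 'high', 'unclear') to an
    overall quality class, following the rule in the text:
    fewer than 3 low-risk domains -> weak, exactly 3 -> fair, 4 or more -> good."""
    n_low = sum(1 for rating in domain_ratings if rating == "low")
    if n_low >= 4:
        return "good"
    if n_low == 3:
        return "fair"
    return "weak"

# Hypothetical ratings for a single study across six Cochrane domains:
quality = overall_quality(["low", "low", "low", "low", "unclear", "high"])
```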
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Assessing the risk of bias using the Cochrane Collaboration&#x00027;s tool.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-16-906290-g0005.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="discussion" id="s5">
<title>Discussion</title>
<p>The current study provides an overview of applications of <italic>post-hoc</italic> XAI techniques in neuroimaging analysis. We focused on <italic>post-hoc</italic> approaches since interpreting weight vectors has historically been the standard practice when applying encoding and decoding models to functional imaging and neuroimaging data. However, it is generally challenging to interpret decoding and encoding models (Kriegeskorte and Douglas, <xref ref-type="bibr" rid="B83">2019</xref>). In light of this, <italic>post-hoc</italic> procedures provide a novel strategy that would make it possible to use these techniques for predictions and gain scientific and/or neuroscientific knowledge during the interpretation step.</p>
<p>For many years, ML and DL algorithms have established a strong presence in medical imaging research, with examples of performance at least equaling that of radiologists (Khan et al., <xref ref-type="bibr" rid="B76">2001</xref>; Hosny et al., <xref ref-type="bibr" rid="B68">2018</xref>; Li et al., <xref ref-type="bibr" rid="B94">2018</xref>; Lee et al., <xref ref-type="bibr" rid="B92">2019</xref>; Lundervold and Lundervold, <xref ref-type="bibr" rid="B99">2019</xref>; Tang et al., <xref ref-type="bibr" rid="B154">2019</xref>). In contrast to linear models, many practitioners regard DNNs as a &#x0201C;black box,&#x0201D; and this lack of transparency has hindered the adoption of deep learning methods in certain domains where explanations are crucial (Guidotti et al., <xref ref-type="bibr" rid="B49">2018b</xref>). Transparency builds trust, underpins the evaluation of fairness, and helps identify points of model failure (Kindermans et al., <xref ref-type="bibr" rid="B79">2017</xref>; Rajkomar et al., <xref ref-type="bibr" rid="B125">2018</xref>; Vayena et al., <xref ref-type="bibr" rid="B162">2018</xref>; Wilson et al., <xref ref-type="bibr" rid="B166">2019</xref>). In many cases, trustworthy models may be essential to verify that the model is not exploiting artifacts in the data or operating on spurious attributes that covary with meaningful support features (Leek et al., <xref ref-type="bibr" rid="B93">2010</xref>; Lapuschkin et al., <xref ref-type="bibr" rid="B89">2016</xref>; Montavon et al., <xref ref-type="bibr" rid="B112">2018</xref>).</p>
<p>The need for interpreting the black-box decisions of DNNs (Holzinger, <xref ref-type="bibr" rid="B59">2014</xref>; Biran and Cotton, <xref ref-type="bibr" rid="B15">2017</xref>; Doshi-Velez and Kim, <xref ref-type="bibr" rid="B30">2017</xref>; Lake et al., <xref ref-type="bibr" rid="B85">2017</xref>; Lipton, <xref ref-type="bibr" rid="B95">2018</xref>) was answered by leveraging a variety of <italic>post-hoc</italic> explanation techniques in recent years. These models can assign relevance to inputs for the predictions carried out by trained deep learning models either for each instance separately (Robnik-&#x00160;ikonja and Kononenko, <xref ref-type="bibr" rid="B128">2008</xref>; Zeiler and Fergus, <xref ref-type="bibr" rid="B175">2014</xref>; Ribeiro et al., <xref ref-type="bibr" rid="B126">2016</xref>; Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref>) or on the class or model level (Datta et al., <xref ref-type="bibr" rid="B28">2016</xref>; Guidotti et al., <xref ref-type="bibr" rid="B48">2018a</xref>; Staniak and Biecek, <xref ref-type="bibr" rid="B150">2018</xref>; Ghorbani et al., <xref ref-type="bibr" rid="B44">2019b</xref>). Because of the successful applications of CNNs in image analysis, particularly in the medical domain, several XAI methods were proposed solely for explaining predictions of 2D (Springenberg et al., <xref ref-type="bibr" rid="B148">2014</xref>; Bach et al., <xref ref-type="bibr" rid="B9">2015</xref>; Smilkov et al., <xref ref-type="bibr" rid="B147">2017</xref>) and 3D images (Yang et al., <xref ref-type="bibr" rid="B170">2018</xref>; Zhao et al., <xref ref-type="bibr" rid="B178">2018</xref>; Thomas et al., <xref ref-type="bibr" rid="B155">2019</xref>).</p>
<p>This systematic review also reveals the need to involve medical personnel in developing ML, DL, and XAI for the medical domains. Without feedback from actively participating clinicians, it is unlikely that ML models dedicated solely to the medical fields can be created (Ustun and Rudin, <xref ref-type="bibr" rid="B160">2016</xref>; Lamy et al., <xref ref-type="bibr" rid="B86">2019</xref>). Familiarizing AI researchers with the original needs and point of view of specialists from the medical domain and its subdomains (Tonekaboni et al., <xref ref-type="bibr" rid="B158">2019</xref>) would be beneficial, because it would allow focusing on the detailed shortcomings of state-of-the-art XAI methods, followed by their significant improvement.</p>
<sec>
<title>Why is XAI needed in neuroimaging?</title>
<p>From the perspective of health stakeholders (e.g., patients, physicians, pharmaceutical firms and government), interpretability is an integral part of choosing the optimal model. As shown in <xref ref-type="fig" rid="F6">Figure 6</xref>, interpretability could also be used to ensure other significant desiderata of medical intelligent systems such as transparency, causality, privacy, fairness, trust, usability, and reliability (Doshi-Velez and Kim, <xref ref-type="bibr" rid="B30">2017</xref>). In this sense, <italic>transparency</italic> indicates how a model reached a given result; <italic>causality</italic> examines the relationships between model variables; <italic>privacy</italic> assesses the possibility of original training data leaking out of the system; <italic>fairness</italic> shows whether there is bias aversion in a learning model; <italic>trust</italic> indicates how assured a model is in the face of trouble; <italic>usability</italic> is an indicator of how efficient the interaction between the user and the system is; and <italic>reliability</italic> is about the stability of the outcomes under similar settings (Doshi-Velez and Kim, <xref ref-type="bibr" rid="B30">2017</xref>; Miller, <xref ref-type="bibr" rid="B105">2019</xref>; Barredo Arrieta et al., <xref ref-type="bibr" rid="B11">2020</xref>; Jim&#x000E9;nez-Luna et al., <xref ref-type="bibr" rid="B71">2020</xref>; Fiok et al., <xref ref-type="bibr" rid="B41">2022</xref>).</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Requirement for interpretability in medical intelligent systems.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-16-906290-g0006.tif"/>
</fig>
<p>Even though AI methods have successfully been utilized in medical research and neuroimaging studies, they still have not advanced into everyday real-life applications. Researchers name several reasons for this: (1) The lack of interpretability, transparency, trust, and clear causality of black-box AI models continues to be a vital issue (Holzinger et al., <xref ref-type="bibr" rid="B62">2017</xref>; Do&#x00161;ilovi&#x00107; et al., <xref ref-type="bibr" rid="B31">2018</xref>; Hoffman et al., <xref ref-type="bibr" rid="B57">2018</xref>) despite the research already carried out on XAI. (2) Speeding up model convergence while maintaining predictive performance is crucial for neuroimaging research, because the data (e.g., fMRI, MRI sequences, PET, CT), unlike natural images, are relatively homogeneous owing to their uniform structure and spatial normalization (Eitel et al., <xref ref-type="bibr" rid="B35">2021</xref>). (3) Despite improvements in medical data availability for AI training (Hosny et al., <xref ref-type="bibr" rid="B68">2018</xref>; Lundervold and Lundervold, <xref ref-type="bibr" rid="B99">2019</xref>), an insufficient amount/quality of data for training ML and DL solutions remains a significant limitation, with the result that many studies are carried out on small samples of subjects (13 in Blankertz et al., <xref ref-type="bibr" rid="B16">2011</xref>; 10 in Sturm et al., <xref ref-type="bibr" rid="B151">2016</xref>; 10 in Tonekaboni et al., <xref ref-type="bibr" rid="B158">2019</xref>). (4) It is believed that trained AI models that achieve super-human performance on data from one distribution (e.g., a specific hospital) cannot adapt appropriately to unseen data drawn from other medical units, since those data come from a different distribution (Yasaka and Abe, <xref ref-type="bibr" rid="B171">2018</xref>). (5) Compliance with legislation that calls for the &#x0201C;right to explanation&#x0201D; is also considered (Holzinger et al., <xref ref-type="bibr" rid="B62">2017</xref>; Samek et al., <xref ref-type="bibr" rid="B133">2017b</xref>) a limiting factor for using ML and DL without the ability to provide explanations for each use case.</p>
<p>Understanding of the need for XAI seems to be spreading rapidly in the DNN community, or may already be considered widespread. However, the importance of XAI, particularly for the medical domain, is still underestimated. When human health and life are at stake, it is insufficient to decide based solely on a &#x0201C;black box&#x0201D; prediction, even one obtained from a superhuman model. Classification alone is far from enough; interpretation is the key, for example when XAI manages to deliver a complete and exhaustive description of the voxels that constitute part of a tumor. The potential for XAI in medicine is exceptional, as it can answer why we should believe that a diagnosis is correct.</p>
</sec>
<sec>
<title>Evaluation of explanation methods</title>
<p>In recent years, various computational techniques have been proposed (<xref ref-type="table" rid="T3">Table 3</xref>) to objectively evaluate explainers based on accuracy, fidelity, consistency, stability, and completeness (Robnik-&#x00160;ikonja and Kononenko, <xref ref-type="bibr" rid="B128">2008</xref>; Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref>; Molnar, <xref ref-type="bibr" rid="B110">2020</xref>; Mohseni et al., <xref ref-type="bibr" rid="B109">2021</xref>). <italic>Accuracy</italic> and <italic>fidelity</italic> (also called sensitivity or correctness in the literature) are closely related; the former refers to how well an explainer predicts unseen data, and the latter indicates how well an explainer detects the relevant components of the input that the black-box model operates upon (notably, when model accuracy and explainer fidelity are both high, the explainer also has high accuracy). <italic>Consistency</italic> refers to the explainer&#x00027;s ability to capture the common components of different models trained on the same task with similar predictions. However, high consistency is not desirable when the models&#x00027; architectures are not functionally equivalent but their decisions are the same (due to the &#x0201C;Rashomon effect&#x0201D;). While consistency compares explanations between models, <italic>stability</italic> compares explanations for a fixed model under various transformations of, or adversarial perturbations to, its input. Stability examines how slight variations in the input affect the explanation (assuming the model predictions are the same for both the original and transformed inputs). Finally, <italic>completeness</italic> reveals how complete a picture of the features essential for decisions an explanation gives, and thus how well humans can understand it. Completeness is something of an elephant in the room: somewhat abstruse to measure, but very important to get right in future research. Particularly in medicine, we need a holistic picture of the disease, such as a complete and exhaustive description of the voxels that are part of a tumor. What if an altered medial temporal lobe shape covaries with a brain tumor (because the tumor somehow displaces it)? Should we then resect the temporal lobe? Thus, further research is needed on this property.</p>
<p>In <italic>post-hoc</italic> explanation, fidelity has been studied more than accuracy (in fact, high accuracy is only important when an explanation is itself used for predictions). In this respect, Bach et al. (<xref ref-type="bibr" rid="B9">2015</xref>) and Samek et al. (<xref ref-type="bibr" rid="B131">2017a</xref>) suggested a framework to evaluate saliency explanation techniques by repeatedly flipping pixels of an input image (in order of their relevance) and quantifying the effect of this perturbation on the classifier&#x00027;s prediction. Their framework inspired many other studies (Ancona et al., <xref ref-type="bibr" rid="B5">2017</xref>; Lundberg and Lee, <xref ref-type="bibr" rid="B98">2017</xref>; Sundararajan et al., <xref ref-type="bibr" rid="B152">2017</xref>; Chen et al., <xref ref-type="bibr" rid="B21">2018</xref>; Morcos et al., <xref ref-type="bibr" rid="B113">2018</xref>). The common denominator of fidelity metrics is that the greater the change in prediction performance, the more accurate the relevance. However, this approach may lead to unreliable predictions when the model receives out-of-distribution input images (Osman et al., <xref ref-type="bibr" rid="B117">2020</xref>). To solve this problem, Osman et al. (<xref ref-type="bibr" rid="B117">2020</xref>) developed a synthetic dataset with explanation ground-truth masks and two relevance accuracy measures for evaluating explanations. Their approach provides an unbiased and transparent comparison of XAI techniques, and it uses data with a distribution similar to that seen during model training.</p>
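The pixel-flipping idea can be illustrated with a short sketch (a toy stand-in: here `predict` is a dummy model that scores the mean intensity of a fixed region of interest, not a real classifier, and `relevance` is an idealized map): pixels are flipped in order of decreasing relevance, and a faster decay of the model score suggests a more faithful explanation.

```python
import numpy as np

def pixel_flipping_curve(image, relevance, predict, n_steps=10, flip_value=0.0):
    """Flip pixels in order of decreasing relevance and record the model score
    after each batch of flips. A steeper score decay suggests a more faithful
    relevance map."""
    flat_order = np.argsort(relevance.ravel())[::-1]   # most relevant first
    perturbed = image.copy().ravel()
    scores = [predict(perturbed.reshape(image.shape))]
    step = max(1, len(flat_order) // n_steps)
    for i in range(0, len(flat_order), step):
        perturbed[flat_order[i:i + step]] = flip_value
        scores.append(predict(perturbed.reshape(image.shape)))
    return scores

# Toy "model": score is the mean intensity over a fixed region of interest.
roi = np.zeros((8, 8))
roi[2:5, 2:5] = 1.0
predict = lambda img: float((img * roi).sum() / roi.sum())

image = np.ones((8, 8))
relevance = roi  # a "perfect" explanation highlights exactly the ROI
scores = pixel_flipping_curve(image, relevance, predict)
```

With the perfect relevance map, the score drops to zero as soon as the ROI pixels are flipped; a random relevance map would decay much more slowly.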
<p>Another possible way of appraising explanations is to leverage the saliency maps for object detection, e.g., by setting a threshold on the relevance and then calculating the Jaccard index (also known as Intersection over Union) with respect to bounding-box annotations as a measure of relevance accuracy (Simonyan et al., <xref ref-type="bibr" rid="B144">2013</xref>; Zhang et al., <xref ref-type="bibr" rid="B176">2018</xref>). However, since this approach assumes that the classifier&#x00027;s decision is based solely on the object and not the background (contrary to the real world), the evaluation could be misleading. In many other cases, comparison of a new explainer with state-of-the-art techniques is utilized to measure explanation quality (Lundberg and Lee, <xref ref-type="bibr" rid="B98">2017</xref>; Ross et al., <xref ref-type="bibr" rid="B129">2017</xref>; Shrikumar et al., <xref ref-type="bibr" rid="B142">2017</xref>; Chu et al., <xref ref-type="bibr" rid="B25">2018</xref>).</p>
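A minimal sketch of this relevance/annotation overlap measure (assuming a binary ground-truth mask rather than literal bounding-box coordinates, and a made-up relevance map):

```python
import numpy as np

def relevance_iou(relevance, gt_mask, threshold=0.5):
    """Jaccard index (Intersection over Union) between a thresholded
    relevance map and a ground-truth annotation mask."""
    pred = relevance >= threshold
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(intersection) / union if union else 1.0

gt = np.zeros((4, 4))
gt[1:3, 1:3] = 1            # annotated object: a 2x2 region
rel = np.zeros((4, 4))
rel[1:3, 1:4] = 0.9         # explanation spills one column to the right
iou = relevance_iou(rel, gt)  # intersection 4 pixels, union 6 pixels
```

Here the spill-over outside the annotation lowers the score to 4/6, which is exactly the penalty the caveat in the text warns about: relevance falling on the background may be legitimate model behavior, yet it is scored as error.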
<p>On the other hand, Kindermans et al. (<xref ref-type="bibr" rid="B78">2019</xref>) proposed an <italic>input invariance</italic> property. They revealed that explainers might have instabilities in their results after slight image transformations, and consequently, their saliency maps could be misleading and unreliable. They assessed the quality of interpretation methods such as Gradients, GI, SG, DeConvNets, GBP, Taylor decomposition, and IG. Only Taylor decomposition and IG passed this property, subject to the choice of reference and type of transformation. In another study, Sundararajan et al. (<xref ref-type="bibr" rid="B152">2017</xref>) introduced two measures for evaluating the reliability of XAI methods, one called <italic>sensitivity</italic> (or fidelity) and the other <italic>implementation invariance</italic> (i.e., a requirement that models with different architectures that achieve the same results should also provide the same explanations). In their paper, the sensitivity test was failed by Gradients, DeConvNets, and GBP, while DeepLift, LRP, and IG passed the test; in contrast, the implementation invariance test was failed by DeepLift and LRP, while IG passed. To extend a similar idea, Adebayo et al. (<xref ref-type="bibr" rid="B1">2018</xref>) proposed another evaluation approach (to test Gradients, GI, IG, GBP, Guided Grad-CAM, and SG methods) by applying <italic>randomization tests</italic> on the model parameters and input data, to confirm that the explanation relies on both these factors. Here, Gradients and Grad-CAM methods succeeded in the former and GI and IG in the latter. While these assessments can serve as a first sanity check for explanations, they cannot directly evaluate the explanation&#x00027;s adequacy.</p>
<p>One more known approach for evaluating visual explanations is to expose the input data to adversarial perturbations, whether unintentional or malicious, which are generally imperceptible to the human eye (Paschali et al., <xref ref-type="bibr" rid="B120">2018</xref>; Douglas and Farahani, <xref ref-type="bibr" rid="B33">2020</xref>). For example, Douglas and Farahani (<xref ref-type="bibr" rid="B33">2020</xref>) developed a structural similarity analysis and compared the reliability of explanation techniques by adding small amounts of Rician noise to structural MRI data (in the real world, this kind of adversary can arise from physical and temporal variability across instrumentation). Although the CNN&#x00027;s prediction performance did not change significantly between the original and attacked images, the resulting relevance heatmaps showed the superiority of decomposition-based algorithms such as LRP over the others. In another study, Alvarez-Melis and Jaakkola (<xref ref-type="bibr" rid="B4">2018</xref>) proposed a Lipschitz estimate to evaluate explainers&#x00027; stability by adding Gaussian noise to the input data. The authors showed that SHAP was more stable than LIME when a random forest was considered. They also assessed explanations provided by LIME, IG, GI, Occlusion sensitivity, Saliency Maps, and LRP over CNNs, reporting acceptable results for all methods except LIME (IG was the most stable). Finally, Ghorbani et al. (<xref ref-type="bibr" rid="B43">2019a</xref>) proposed to measure fragility, i.e., the degree to which an XAI method&#x00027;s behavior changes given an adversarial (perturbed) input image. In their work, Gradients and DeepLIFT were found to be more fragile than IG.</p>
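The Gaussian-noise stability test can be approximated with a short empirical routine. The following is a minimal sketch under our own assumptions: <monospace>explainer</monospace> stands for any attribution function mapping an input to a relevance vector, and the sample count and noise scale are arbitrary defaults rather than values from the cited paper.

```python
import numpy as np

def lipschitz_estimate(explainer, x, n_samples=100, sigma=0.01, seed=0):
    """Empirical local Lipschitz estimate of an explanation function.

    Draws perturbations x' ~ N(x, sigma^2 I) and returns the worst
    observed ratio ||e(x) - e(x')|| / ||x - x'||; larger values mean
    a less stable, more noise-sensitive explainer.
    """
    rng = np.random.default_rng(seed)
    e_x = explainer(x)
    worst = 0.0
    for _ in range(n_samples):
        x_p = x + rng.normal(scale=sigma, size=x.shape)
        ratio = (np.linalg.norm(e_x - explainer(x_p))
                 / np.linalg.norm(x - x_p))
        worst = max(worst, ratio)
    return worst
```

Comparing this estimate across explainers on the same inputs mirrors the SHAP-vs-LIME comparison above: the method with the smaller worst-case ratio is the more stable one.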
<p>While there is an ongoing discussion regarding the virtues that XAI should exhibit, no consensus has been reached so far, even regarding fundamental notions such as interpretability (Lipton, <xref ref-type="bibr" rid="B95">2018</xref>). Terms such as completeness, trust, causality, explainability, robustness, fairness, and many others are actively brought up and discussed by different authors (Biran and Cotton, <xref ref-type="bibr" rid="B15">2017</xref>; Doshi-Velez and Kim, <xref ref-type="bibr" rid="B30">2017</xref>; Lake et al., <xref ref-type="bibr" rid="B85">2017</xref>), as researchers still struggle to agree on common definitions of the most important XAI nomenclature (Doshi-Velez and Kim, <xref ref-type="bibr" rid="B30">2017</xref>). Given the research community&#x00027;s activity in this field, it is very likely that additional requirements and test proposals will be formulated shortly. Moreover, new XAI methods will undoubtedly emerge. We also note that no single XAI method passed all proposed tests, and not all tests were conducted with all available algorithms. These reasons lead us to conclude that the currently available XAI methods (Holzinger et al., <xref ref-type="bibr" rid="B64">2022a</xref>,<xref ref-type="bibr" rid="B66">b</xref>) exhibit significant potential but remain immature. Therefore, we agree with Lipton (<xref ref-type="bibr" rid="B95">2018</xref>), who clearly warns against blindly trusting XAI interpretations because they can potentially be misleading.</p>
</sec>
</sec>
<sec sec-type="conclusions" id="s6">
<title>Conclusion</title>
<p>AI has already and inevitably changed perspectives in medical research, but without explaining the rationale behind its decisions, it cannot provide the high level of trust required in medical applications. With current developments in XAI techniques, this is about to change. Research on fighting cardiovascular disease (Weng et al., <xref ref-type="bibr" rid="B165">2017</xref>), hypoxemia during surgery (Lundberg and Lee, <xref ref-type="bibr" rid="B98">2017</xref>), Alzheimer&#x00027;s disease (Tang et al., <xref ref-type="bibr" rid="B154">2019</xref>), breast cancer (Lamy et al., <xref ref-type="bibr" rid="B86">2019</xref>), acute intracranial hemorrhage (Lee et al., <xref ref-type="bibr" rid="B92">2019</xref>), and coronavirus disease (Wang et al., <xref ref-type="bibr" rid="B163">2019</xref>) can serve as examples of successful AI&#x0002B;XAI systems that managed to adequately explain their decisions and pave the way to many other medical applications, notably neuroimaging studies. However, XAI in this research field is still young and immature. If we expect to overcome XAI&#x00027;s current imperfections, great effort is still needed to foster XAI research. Finally, the needs of medical AI and XAI cannot be met without keeping medical practitioners in the loop.</p>
</sec>
<sec sec-type="data-availability" id="s7">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.</p>
</sec>
<sec id="s8">
<title>Author contributions</title>
<p>FF and KF conducted the literature search and prepared the initial draft of the paper. FF, KF, BL, and PD were involved in study conception and contributed to intellectual content. WK and PD supervised all aspects of manuscript preparations, revisions, editing, and final intellectual content. FF, KF, and BL edited the final draft of the paper. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Adebayo</surname> <given-names>J.</given-names></name> <name><surname>Gilmer</surname> <given-names>J.</given-names></name> <name><surname>Muelly</surname> <given-names>M.</given-names></name> <name><surname>Goodfellow</surname> <given-names>I.</given-names></name> <name><surname>Hardt</surname> <given-names>M.</given-names></name> <name><surname>Kim</surname> <given-names>B.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Sanity checks for saliency maps,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems</source>, Vol. 31.<pub-id pub-id-type="pmid">35910188</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alex</surname> <given-names>V.</given-names></name> <name><surname>KP</surname> <given-names>M. S.</given-names></name> <name><surname>Chennamsetty</surname> <given-names>S. S.</given-names></name> <name><surname>Krishnamurthi</surname> <given-names>G.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Generative adversarial networks for brain lesion detection,&#x0201D;</article-title> in <source>Proc.SPIE.</source><pub-id pub-id-type="pmid">35282087</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Allen</surname> <given-names>J. D.</given-names></name> <name><surname>Xie</surname> <given-names>Y.</given-names></name> <name><surname>Chen</surname> <given-names>M.</given-names></name> <name><surname>Girard</surname> <given-names>L.</given-names></name> <name><surname>Xiao</surname> <given-names>G.</given-names></name></person-group> (<year>2012</year>). <article-title>Comparing statistical methods for constructing large scale gene networks</article-title>. <source>PLoS ONE</source> <volume>7</volume>, <fpage>e29348</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0029348</pub-id><pub-id pub-id-type="pmid">22272232</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alvarez-Melis</surname> <given-names>D.</given-names></name> <name><surname>Jaakkola</surname> <given-names>T. S.</given-names></name></person-group> (<year>2018</year>). <article-title>On the robustness of interpretability methods</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1806.08049.</citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ancona</surname> <given-names>M.</given-names></name> <name><surname>Ceolini</surname> <given-names>E.</given-names></name> <name><surname>&#x000D6;ztireli</surname> <given-names>C.</given-names></name> <name><surname>Gross</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>Towards better understanding of gradient-based attribution methods for deep neural networks</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1711.06104.</citation>
</ref>
<ref id="B6">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Anderson</surname> <given-names>A.</given-names></name> <name><surname>Han</surname> <given-names>D.</given-names></name> <name><surname>Douglas</surname> <given-names>P. K.</given-names></name> <name><surname>Bramen</surname> <given-names>J.</given-names></name> <name><surname>Cohen</surname> <given-names>M. S.</given-names></name></person-group> (<year>2012</year>). <article-title>&#x0201C;Real-time functional MRI classification of brain states using Markov-SVM hybrid models: peering inside the rt-fMRI black box,&#x0201D;</article-title> in <source>Machine Learning and Interpretation in Neuroimaging</source>, eds G. Langs, I. Rish, M. Grosse-Wentrup, and B. Murphy (<publisher-loc>Berlin, Heidelberg</publisher-loc>: <publisher-name>Springer Berlin Heidelberg</publisher-name>), <fpage>242</fpage>&#x02013;<lpage>255</lpage>.</citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arras</surname> <given-names>L.</given-names></name> <name><surname>Horn</surname> <given-names>F.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2017a</year>). <article-title>&#x0201C;What is relevant in a text document?&#x0201D;: An interpretable machine learning approach</article-title>. <source>PLoS ONE</source> <volume>12</volume>, <fpage>e0181142</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0181142</pub-id><pub-id pub-id-type="pmid">28800619</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arras</surname> <given-names>L.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2017b</year>). <article-title>Explaining recurrent neural network predictions in sentiment analysis</article-title>. <source>arXiv [Preprint].</source> arXiv: 1706.07206.</citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bach</surname> <given-names>S.</given-names></name> <name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Klauschen</surname> <given-names>F.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2015</year>). <article-title>On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</article-title>. <source>PLoS ONE</source> <volume>10</volume>, <fpage>1</fpage>&#x02013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0130140</pub-id><pub-id pub-id-type="pmid">26161953</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baehrens</surname> <given-names>D.</given-names></name> <name><surname>Schroeter</surname> <given-names>T.</given-names></name> <name><surname>Harmeling</surname> <given-names>S.</given-names></name> <name><surname>Kawanabe</surname> <given-names>M.</given-names></name> <name><surname>Hansen</surname> <given-names>K.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name></person-group> (<year>2009</year>). <article-title>How to explain individual classification decisions</article-title>. <source>arXiv [Preprint].</source> arXiv: 0912.1128.</citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barredo Arrieta</surname> <given-names>A.</given-names></name> <name><surname>D&#x000ED;az-Rodr&#x000ED;guez</surname> <given-names>N.</given-names></name> <name><surname>Del Ser</surname> <given-names>J.</given-names></name> <name><surname>Bennetot</surname> <given-names>A.</given-names></name> <name><surname>Tabik</surname> <given-names>S.</given-names></name> <name><surname>Barbado</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI</article-title>. <source>Inf. Fusion</source> <volume>58</volume>, <fpage>82</fpage>&#x02013;<lpage>115</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2019.12.012</pub-id></citation>
</ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Becker</surname> <given-names>S.</given-names></name> <name><surname>Ackermann</surname> <given-names>M.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2018</year>). <article-title>Interpreting and explaining deep neural networks for classification of audio signals</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1807.03418.</citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bengio</surname> <given-names>Y.</given-names></name> <name><surname>Courville</surname> <given-names>A.</given-names></name> <name><surname>Vincent</surname> <given-names>P.</given-names></name></person-group> (<year>2013</year>). <article-title>Representation learning: a review and new perspectives</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell</source>. <volume>35</volume>, <fpage>1798</fpage>&#x02013;<lpage>1828</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2013.50</pub-id><pub-id pub-id-type="pmid">23787338</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Bockmayr</surname> <given-names>M.</given-names></name> <name><surname>H&#x000E4;gele</surname> <given-names>M.</given-names></name> <name><surname>Wienert</surname> <given-names>S.</given-names></name> <name><surname>Heim</surname> <given-names>D.</given-names></name> <name><surname>Hellweg</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Morphological and molecular breast cancer profiling through explainable machine learning</article-title>. <source>Nat. Mach. Intell</source>. <volume>3</volume>, <fpage>355</fpage>&#x02013;<lpage>366</lpage>. <pub-id pub-id-type="doi">10.1038/s42256-021-00303-4</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Biran</surname> <given-names>O.</given-names></name> <name><surname>Cotton</surname> <given-names>C.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Explanation and justification in machine learning: A survey,&#x0201D;</article-title> in <source>IJCAI-17 workshop on explainable AI (XAI), Vol. 8</source>. p. <fpage>8</fpage>&#x02013;<lpage>13</lpage>.</citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blankertz</surname> <given-names>B.</given-names></name> <name><surname>Lemm</surname> <given-names>S.</given-names></name> <name><surname>Treder</surname> <given-names>M.</given-names></name> <name><surname>Haufe</surname> <given-names>S.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name></person-group> (<year>2011</year>). <article-title>Single-trial analysis and classification of ERP components - A tutorial</article-title>. <source>Neuroimage</source> <volume>56</volume>, <fpage>814</fpage>&#x02013;<lpage>825</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2010.06.048</pub-id><pub-id pub-id-type="pmid">20600976</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>B&#x000F6;hle</surname> <given-names>M.</given-names></name> <name><surname>Eitel</surname> <given-names>F.</given-names></name> <name><surname>Weygandt</surname> <given-names>M.</given-names></name> <name><surname>Ritter</surname> <given-names>K.</given-names></name></person-group> (<year>2019</year>). <article-title>Layer-wise relevance propagation for explaining deep neural network decisions in MRI-Based alzheimer&#x00027;s disease classification</article-title>. <source>Front. Aging Neurosci</source>. <volume>11</volume>, <fpage>194</fpage>. <pub-id pub-id-type="doi">10.3389/fnagi.2019.00194</pub-id><pub-id pub-id-type="pmid">31417397</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bologna</surname> <given-names>G.</given-names></name> <name><surname>Hayashi</surname> <given-names>Y.</given-names></name></person-group> (<year>2017</year>). <article-title>Characterization of symbolic rules embedded in deep DIMLP networks: A challenge to transparency of deep learning</article-title>. <source>J. Artif. Intell. Soft Comput. Res</source>. <volume>7</volume>, <fpage>265</fpage>&#x02013;<lpage>286</lpage>. <pub-id pub-id-type="doi">10.1515/jaiscr-2017-0019</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bosse</surname> <given-names>S.</given-names></name> <name><surname>Wiegand</surname> <given-names>T.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name></person-group> (<year>2022</year>). <article-title>From &#x0201C;where&#x0201D; to &#x0201C;what&#x0201D;: Towards human-understandable explanations through concept relevance propagation</article-title>. <source>arXiv [Preprint]</source>. arXiv: 2206.03208.</citation>
</ref>
<ref id="B20">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Caruana</surname> <given-names>R.</given-names></name> <name><surname>Lou</surname> <given-names>Y.</given-names></name> <name><surname>Gehrke</surname> <given-names>J.</given-names></name> <name><surname>Koch</surname> <given-names>P.</given-names></name> <name><surname>Sturm</surname> <given-names>M.</given-names></name> <name><surname>Elhadad</surname> <given-names>N.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Intelligible models for healthcare: predicting pneumonia risk and hospital 30-day readmission,&#x0201D;</article-title> in <source>Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD &#x00027;15</source>. (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>1721</fpage>&#x02013;<lpage>1730</lpage>.</citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>J.</given-names></name> <name><surname>Song</surname> <given-names>L.</given-names></name> <name><surname>Wainwright</surname> <given-names>M. J.</given-names></name> <name><surname>Jordan</surname> <given-names>M. I.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Learning to explain: An information-theoretic perspective on model interpretation,&#x0201D;</article-title> in <source>International Conference on Machine Learning (PMLR)</source>, <fpage>883</fpage>&#x02013;<lpage>892</lpage>.</citation>
</ref>
<ref id="B22">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>H.-T.</given-names></name> <name><surname>Koc</surname> <given-names>L.</given-names></name> <name><surname>Harmsen</surname> <given-names>J.</given-names></name> <name><surname>Shaked</surname> <given-names>T.</given-names></name> <name><surname>Chandra</surname> <given-names>T.</given-names></name> <name><surname>Aradhye</surname> <given-names>H.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>&#x0201C;Wide &#x00026; deep learning for recommender systems,&#x0201D;</article-title> in <source>Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, DLRS 2016</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>7</fpage>&#x02013;<lpage>10</lpage>.</citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chereda</surname> <given-names>H.</given-names></name> <name><surname>Bleckmann</surname> <given-names>A.</given-names></name> <name><surname>Menck</surname> <given-names>K.</given-names></name> <name><surname>Perera-Bel</surname> <given-names>J.</given-names></name> <name><surname>Stegmaier</surname> <given-names>P.</given-names></name> <name><surname>Auer</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Explaining decisions of graph convolutional neural networks: patient-specific molecular subnetworks responsible for metastasis prediction in breast cancer</article-title>. <source>Genome Med</source>. <volume>13</volume>, <fpage>42</fpage>. <pub-id pub-id-type="doi">10.1186/s13073-021-00845-7</pub-id><pub-id pub-id-type="pmid">33706810</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chu</surname> <given-names>C.</given-names></name> <name><surname>Hsu</surname> <given-names>A.-L.</given-names></name> <name><surname>Chou</surname> <given-names>K.-H.</given-names></name> <name><surname>Bandettini</surname> <given-names>P.</given-names></name> <name><surname>Lin</surname> <given-names>C.</given-names></name></person-group> (<year>2012</year>). <article-title>Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images</article-title>. <source>Neuroimage</source> <volume>60</volume>, <fpage>59</fpage>&#x02013;<lpage>70</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2011.11.066</pub-id><pub-id pub-id-type="pmid">22166797</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chu</surname> <given-names>L.</given-names></name> <name><surname>Hu</surname> <given-names>X.</given-names></name> <name><surname>Hu</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Pei</surname> <given-names>J.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Exact and consistent interpretation for piecewise linear neural networks: a closed form solution,&#x0201D;</article-title> in <source>Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &#x00026; Data Mining, KDD &#x00027;18</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>1244</fpage>&#x02013;<lpage>1253</lpage>.</citation>
</ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Couture</surname> <given-names>H. D.</given-names></name> <name><surname>Marron</surname> <given-names>J. S.</given-names></name> <name><surname>Perou</surname> <given-names>C. M.</given-names></name> <name><surname>Troester</surname> <given-names>M. A.</given-names></name> <name><surname>Niethammer</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Multiple instance learning for heterogeneous images: training a CNN for histopathology,&#x0201D;</article-title> in <source>Medical Image Computing and Computer Assisted Intervention &#x02013; MICCAI 2018</source>, eds A. F. Frangi, J. A. Schnabel, C. Davatzikos, C. Alberola-L&#x000F3;pez, and G. Fichtinger (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>254</fpage>&#x02013;<lpage>262</lpage>.</citation>
</ref>
<ref id="B27">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cruz-Roa</surname> <given-names>A. A.</given-names></name> <name><surname>Arevalo Ovalle</surname> <given-names>J. E.</given-names></name> <name><surname>Madabhushi</surname> <given-names>A.</given-names></name> <name><surname>Gonz&#x000E1;lez Osorio</surname> <given-names>F. A.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;A deep learning architecture for image representation, visual interpretability and automated basal-cell carcinoma cancer detection,&#x0201D;</article-title> in <source>Medical Image Computing and Computer-Assisted Intervention &#x02013; MICCAI 2013</source>, eds K. Mori, I. Sakuma, Y. Sato, C. Barillot, and N. Navab (<publisher-loc>Berlin, Heidelberg</publisher-loc>: <publisher-name>Springer Berlin Heidelberg</publisher-name>), <fpage>403</fpage>&#x02013;<lpage>410</lpage>.</citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Datta</surname> <given-names>A.</given-names></name> <name><surname>Sen</surname> <given-names>S.</given-names></name> <name><surname>Zick</surname> <given-names>Y.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Algorithmic transparency via quantitative input influence: theory and experiments with learning systems,&#x0201D;</article-title> in <source>Proc. - 2016 IEEE Symp. Secur. Privacy, SP 2016</source>, <fpage>598</fpage>&#x02013;<lpage>617</lpage>.</citation>
</ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Devarajan</surname> <given-names>K.</given-names></name></person-group> (<year>2008</year>). <article-title>Nonnegative matrix factorization: an analytical and interpretive tool in computational biology</article-title>. <source>PLoS Comput. Biol</source>. <volume>4</volume>, <fpage>e1000029</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1000029</pub-id><pub-id pub-id-type="pmid">18654623</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Doshi-Velez</surname> <given-names>F.</given-names></name> <name><surname>Kim</surname> <given-names>B.</given-names></name></person-group> (<year>2017</year>). <article-title>Towards a rigorous science of interpretable machine learning</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1702.08608.</citation></ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Do&#x00161;ilovi&#x00107;</surname> <given-names>F. K.</given-names></name> <name><surname>Br&#x0010D;i&#x00107;</surname> <given-names>M.</given-names></name> <name><surname>Hlupi&#x00107;</surname> <given-names>N.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Explainable artificial intelligence: a survey,&#x0201D;</article-title> in <source>2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)</source>, <fpage>210</fpage>&#x02013;<lpage>215</lpage>.<pub-id pub-id-type="pmid">33079674</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dosovitskiy</surname> <given-names>A.</given-names></name> <name><surname>Brox</surname> <given-names>T.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Inverting visual representations with convolutional networks,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>4829</fpage>&#x02013;<lpage>4837</lpage>.</citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Douglas</surname> <given-names>P. K.</given-names></name> <name><surname>Farahani</surname> <given-names>F. V.</given-names></name></person-group> (<year>2020</year>). <article-title>On the similarity of deep learning representations across didactic and adversarial examples</article-title>. <source>arXiv [Preprint].</source> arXiv: 2002.06816.</citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Douglas</surname> <given-names>P. K.</given-names></name> <name><surname>Harris</surname> <given-names>S.</given-names></name> <name><surname>Yuille</surname> <given-names>A.</given-names></name> <name><surname>Cohen</surname> <given-names>M. S.</given-names></name></person-group> (<year>2011</year>). <article-title>Performance comparison of machine learning algorithms and number of independent components used in fMRI decoding of belief vs. disbelief</article-title>. <source>Neuroimage</source> <volume>56</volume>, <fpage>544</fpage>&#x02013;<lpage>553</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2010.11.002</pub-id><pub-id pub-id-type="pmid">21073969</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eitel</surname> <given-names>F.</given-names></name> <name><surname>Albrecht</surname> <given-names>J. P.</given-names></name> <name><surname>Weygandt</surname> <given-names>M.</given-names></name> <name><surname>Paul</surname> <given-names>F.</given-names></name> <name><surname>Ritter</surname> <given-names>K.</given-names></name></person-group> (<year>2021</year>). <article-title>Patch individual filter layers in CNNs to harness the spatial homogeneity of neuroimaging data</article-title>. <source>Sci. Rep</source>. <volume>11</volume>, <fpage>24447</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-03785-9</pub-id><pub-id pub-id-type="pmid">34961762</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>El-Sappagh</surname> <given-names>S.</given-names></name> <name><surname>Alonso</surname> <given-names>J. M.</given-names></name> <name><surname>Islam</surname> <given-names>S. M. R.</given-names></name> <name><surname>Sultan</surname> <given-names>A. M.</given-names></name> <name><surname>Kwak</surname> <given-names>K. S.</given-names></name></person-group> (<year>2021</year>). <article-title>A multilayer multimodal detection and prediction model based on explainable artificial intelligence for Alzheimer&#x00027;s disease</article-title>. <source>Sci. Rep</source>. <volume>11</volume>, <fpage>2660</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-82098-3</pub-id><pub-id pub-id-type="pmid">33514817</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Erhan</surname> <given-names>D.</given-names></name> <name><surname>Bengio</surname> <given-names>Y.</given-names></name> <name><surname>Courville</surname> <given-names>A.</given-names></name> <name><surname>Vincent</surname> <given-names>P.</given-names></name></person-group> (<year>2009</year>). <source>Visualizing Higher-Layer Features of a Deep Network</source>. <publisher-loc>Montreal, QC</publisher-loc>: <publisher-name>University of Montreal</publisher-name>.</citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Essemlali</surname> <given-names>A.</given-names></name> <name><surname>St-Onge</surname> <given-names>E.</given-names></name> <name><surname>Descoteaux</surname> <given-names>M.</given-names></name> <name><surname>Jodoin</surname> <given-names>P.-M.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Understanding Alzheimer disease&#x00027;s structural connectivity through explainable AI,&#x0201D;</article-title> in <source>Proceedings of the Third Conference on Medical Imaging with Deep Learning, Proceedings of Machine Learning Research</source>, eds T. Arbel, I. Ben Ayed, M. de Bruijne, M. Descoteaux, H. Lombaert, and C. Pal (PMLR), <fpage>217</fpage>&#x02013;<lpage>229</lpage>.</citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Esteva</surname> <given-names>A.</given-names></name> <name><surname>Kuprel</surname> <given-names>B.</given-names></name> <name><surname>Novoa</surname> <given-names>R. A.</given-names></name> <name><surname>Ko</surname> <given-names>J.</given-names></name> <name><surname>Swetter</surname> <given-names>S. M.</given-names></name> <name><surname>Blau</surname> <given-names>H. M.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Dermatologist-level classification of skin cancer with deep neural networks</article-title>. <source>Nature</source> <volume>542</volume>, <fpage>115</fpage>&#x02013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.1038/nature21056</pub-id><pub-id pub-id-type="pmid">28117445</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Farahani</surname> <given-names>F. V.</given-names></name> <name><surname>Ahmadi</surname> <given-names>A.</given-names></name> <name><surname>Zarandi</surname> <given-names>M. H. F.</given-names></name></person-group> (<year>2018</year>). <article-title>Hybrid intelligent approach for diagnosis of the lung nodule from CT images using spatial kernelized fuzzy c-means and ensemble learning</article-title>. <source>Math. Comput. Simul</source>. <volume>149</volume>, <fpage>48</fpage>&#x02013;<lpage>68</lpage>. <pub-id pub-id-type="doi">10.1016/j.matcom.2018.02.001</pub-id></citation>
</ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fiok</surname> <given-names>K.</given-names></name> <name><surname>Farahani</surname> <given-names>F. V.</given-names></name> <name><surname>Karwowski</surname> <given-names>W.</given-names></name> <name><surname>Ahram</surname> <given-names>T.</given-names></name></person-group> (<year>2022</year>). <article-title>Explainable artificial intelligence for education and training</article-title>. <source>J. Def. Model. Simul</source>. <volume>19</volume>, <fpage>133</fpage>&#x02013;<lpage>144</lpage>.</citation>
</ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gaonkar</surname> <given-names>B.</given-names></name> <name><surname>Davatzikos</surname> <given-names>C.</given-names></name></person-group> (<year>2013</year>). <article-title>Analytic estimation of statistical significance maps for support vector machine based multi-variate image analysis and classification</article-title>. <source>Neuroimage</source> <volume>78</volume>, <fpage>270</fpage>&#x02013;<lpage>283</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.03.066</pub-id><pub-id pub-id-type="pmid">23583748</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ghorbani</surname> <given-names>A.</given-names></name> <name><surname>Abid</surname> <given-names>A.</given-names></name> <name><surname>Zou</surname> <given-names>J.</given-names></name></person-group> (<year>2019a</year>). <article-title>&#x0201C;Interpretation of neural networks is fragile,&#x0201D;</article-title> in <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>, <fpage>3681</fpage>&#x02013;<lpage>3688</lpage>.</citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ghorbani</surname> <given-names>A.</given-names></name> <name><surname>Zou</surname> <given-names>J.</given-names></name> <name><surname>Wexler</surname> <given-names>J.</given-names></name> <name><surname>Kim</surname> <given-names>B.</given-names></name></person-group> (<year>2019b</year>). <article-title>&#x0201C;Towards automatic concept-based explanations,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems 32</source>.</citation>
</ref>
<ref id="B45">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Goodfellow</surname> <given-names>I.</given-names></name> <name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name> <name><surname>Mirza</surname> <given-names>M.</given-names></name> <name><surname>Xu</surname> <given-names>B.</given-names></name> <name><surname>Warde-Farley</surname> <given-names>D.</given-names></name> <name><surname>Ozair</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>&#x0201C;Generative adversarial nets,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems 27</source>, eds Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (<publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>2672</fpage>&#x02013;<lpage>2680</lpage>.</citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goodfellow</surname> <given-names>I. J.</given-names></name> <name><surname>Shlens</surname> <given-names>J.</given-names></name> <name><surname>Szegedy</surname> <given-names>C.</given-names></name></person-group> (<year>2014</year>). <article-title>Explaining and harnessing adversarial examples</article-title>. <source>arXiv [Preprint].</source> arXiv: 1412.6572.</citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grigorescu</surname> <given-names>S.</given-names></name> <name><surname>Trasnea</surname> <given-names>B.</given-names></name> <name><surname>Cocias</surname> <given-names>T.</given-names></name> <name><surname>Macesanu</surname> <given-names>G.</given-names></name></person-group> (<year>2020</year>). <article-title>A survey of deep learning techniques for autonomous driving</article-title>. <source>J. Field Robot</source>. <volume>37</volume>, <fpage>362</fpage>&#x02013;<lpage>386</lpage>. <pub-id pub-id-type="doi">10.1002/rob.21918</pub-id></citation>
</ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guidotti</surname> <given-names>R.</given-names></name> <name><surname>Monreale</surname> <given-names>A.</given-names></name> <name><surname>Ruggieri</surname> <given-names>S.</given-names></name> <name><surname>Pedreschi</surname> <given-names>D.</given-names></name> <name><surname>Turini</surname> <given-names>F.</given-names></name> <name><surname>Giannotti</surname> <given-names>F.</given-names></name></person-group> (<year>2018a</year>). <article-title>Local rule-based explanations of black box decision systems</article-title>. <source>arXiv [Preprint].</source> arXiv: 1805.10820.</citation>
</ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guidotti</surname> <given-names>R.</given-names></name> <name><surname>Monreale</surname> <given-names>A.</given-names></name> <name><surname>Ruggieri</surname> <given-names>S.</given-names></name> <name><surname>Turini</surname> <given-names>F.</given-names></name> <name><surname>Giannotti</surname> <given-names>F.</given-names></name> <name><surname>Pedreschi</surname> <given-names>D.</given-names></name></person-group> (<year>2018b</year>). <article-title>A survey of methods for explaining black box models</article-title>. <source>ACM Comput. Surv</source>. <volume>51</volume>, <fpage>1</fpage>&#x02013;<lpage>42</lpage>. <pub-id pub-id-type="doi">10.1145/3236009</pub-id></citation>
</ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gunning</surname> <given-names>D.</given-names></name> <name><surname>Aha</surname> <given-names>D.</given-names></name></person-group> (<year>2019</year>). <article-title>DARPA&#x00027;s explainable artificial intelligence (XAI) Program</article-title>. <source>AI Mag</source>. <volume>40</volume>, <fpage>44</fpage>&#x02013;<lpage>58</lpage>. <pub-id pub-id-type="doi">10.1145/3301275.3308446</pub-id></citation>
</ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guyon</surname> <given-names>I.</given-names></name> <name><surname>Elisseeff</surname> <given-names>A.</given-names></name></person-group> (<year>2003</year>). <article-title>An introduction to variable and feature selection</article-title>. <source>J. Mach. Learn. Res</source>. <volume>3</volume>, <fpage>1157</fpage>&#x02013;<lpage>1182</lpage>.</citation>
</ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>H&#x000E4;gele</surname> <given-names>M.</given-names></name> <name><surname>Seegerer</surname> <given-names>P.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>Bockmayr</surname> <given-names>M.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>Klauschen</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Resolving challenges in deep learning-based analyses of histopathological images using explanation methods</article-title>. <source>Sci. Rep</source>. <volume>10</volume>, <fpage>6423</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-020-62724-2</pub-id><pub-id pub-id-type="pmid">32286358</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hall</surname> <given-names>P.</given-names></name> <name><surname>Gill</surname> <given-names>N.</given-names></name></person-group> (<year>2019</year>). <source>An Introduction to Machine Learning Interpretability</source>. <publisher-name>O&#x00027;Reilly Media, Incorporated</publisher-name>.</citation>
</ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haufe</surname> <given-names>S.</given-names></name> <name><surname>Meinecke</surname> <given-names>F.</given-names></name> <name><surname>G&#x000F6;rgen</surname> <given-names>K.</given-names></name> <name><surname>D&#x000E4;hne</surname> <given-names>S.</given-names></name> <name><surname>Haynes</surname> <given-names>J.-D.</given-names></name> <name><surname>Blankertz</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>On the interpretation of weight vectors of linear models in multivariate neuroimaging</article-title>. <source>Neuroimage</source> <volume>87</volume>, <fpage>96</fpage>&#x02013;<lpage>110</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.10.067</pub-id><pub-id pub-id-type="pmid">24239590</pub-id></citation></ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Herent</surname> <given-names>P.</given-names></name> <name><surname>Jegou</surname> <given-names>S.</given-names></name> <name><surname>Wainrib</surname> <given-names>G.</given-names></name> <name><surname>Clozel</surname> <given-names>T.</given-names></name></person-group> (<year>2018</year>). <article-title>Brain age prediction of healthy subjects on anatomic MRI with deep learning: going beyond with an &#x0201C;explainable AI&#x0201D; mindset</article-title>. <source>bioRxiv.</source> 413302. <pub-id pub-id-type="doi">10.1101/413302</pub-id></citation>
</ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Higgins</surname> <given-names>J. P. T.</given-names></name> <name><surname>Altman</surname> <given-names>D. G.</given-names></name> <name><surname>G&#x000F8;tzsche</surname> <given-names>P. C.</given-names></name> <name><surname>J&#x000FC;ni</surname> <given-names>P.</given-names></name> <name><surname>Moher</surname> <given-names>D.</given-names></name> <name><surname>Oxman</surname> <given-names>A. D.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>The Cochrane Collaboration&#x00027;s tool for assessing risk of bias in randomised trials</article-title>. <source>BMJ</source> <volume>343</volume>, <fpage>d5928</fpage>. <pub-id pub-id-type="doi">10.1136/bmj.d5928</pub-id><pub-id pub-id-type="pmid">22008217</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoffman</surname> <given-names>R. R.</given-names></name> <name><surname>Mueller</surname> <given-names>S. T.</given-names></name> <name><surname>Klein</surname> <given-names>G.</given-names></name> <name><surname>Litman</surname> <given-names>J.</given-names></name></person-group> (<year>2018</year>). <article-title>Metrics for explainable AI: Challenges and prospects</article-title>. <source>arXiv [Preprint].</source> arXiv:1812.04608.</citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>H&#x000F6;lldobler</surname> <given-names>S.</given-names></name> <name><surname>M&#x000F6;hle</surname> <given-names>S.</given-names></name> <name><surname>Tigunova</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Lessons learned from alphago,&#x0201D;</article-title> in <source>YSIP</source>. p. <fpage>92</fpage>&#x02013;<lpage>101</lpage>.</citation>
</ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <article-title>Trends in interactive knowledge discovery for personalized medicine: cognitive science meets machine learning</article-title>. <source>IEEE Intell. Inform. Bull.</source> <volume>15</volume>, <fpage>6</fpage>&#x02013;<lpage>14</lpage>.</citation>
</ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name></person-group> (<year>2016</year>). <article-title>Interactive machine learning for health informatics: when do we need the human-in-the-loop?</article-title> <source>Brain Informat.</source> <volume>3</volume>, <fpage>119</fpage>&#x02013;<lpage>131</lpage>. <pub-id pub-id-type="doi">10.1007/s40708-016-0042-6</pub-id><pub-id pub-id-type="pmid">27747607</pub-id></citation></ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;From machine learning to explainable AI,&#x0201D;</article-title> in <source>Proceedings of the 2018 IEEE World Symposium on Digital Intelligence for Systems and Machines (DISA)</source>, <fpage>55</fpage>&#x02013;<lpage>66</lpage>.</citation>
</ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name> <name><surname>Biemann</surname> <given-names>C.</given-names></name> <name><surname>Pattichis</surname> <given-names>C. S.</given-names></name> <name><surname>Kell</surname> <given-names>D. B.</given-names></name></person-group> (<year>2017</year>). <article-title>What do we need to build explainable AI systems for the medical domain?</article-title>. <source>arXiv [Preprint].</source> arXiv: 1712.09923.<pub-id pub-id-type="pmid">35679107</pub-id></citation></ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name> <name><surname>Dehmer</surname> <given-names>M.</given-names></name> <name><surname>Jurisica</surname> <given-names>I.</given-names></name></person-group> (<year>2014</year>). <article-title>Knowledge discovery and interactive data mining in bioinformatics - state-of-the-art, future challenges and research directions</article-title>. <source>BMC Bioinformat.</source> <volume>15</volume>, <fpage>I1</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-15-S6-I1</pub-id><pub-id pub-id-type="pmid">25078282</pub-id></citation></ref>
<ref id="B64">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name> <name><surname>Goebel</surname> <given-names>R.</given-names></name> <name><surname>Fong</surname> <given-names>R.</given-names></name> <name><surname>Moon</surname> <given-names>T.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2022a</year>). <article-title>&#x0201C;xxAI-beyond explainable artificial intelligence,&#x0201D;</article-title> in <source>International Workshop on Extending Explainable AI Beyond Deep Models and Classifiers</source> (<publisher-name>Springer</publisher-name>), <fpage>3</fpage>&#x02013;<lpage>10</lpage>.</citation>
</ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name> <name><surname>Langs</surname> <given-names>G.</given-names></name> <name><surname>Denk</surname> <given-names>H.</given-names></name> <name><surname>Zatloukal</surname> <given-names>K.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>H.</given-names></name></person-group> (<year>2019</year>). <article-title>Causability and explainability of artificial intelligence in medicine</article-title>. <source>Wiley Interdiscip. Rev. Data Min. Knowl. Discov</source>. <volume>9</volume>, <fpage>1</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1002/widm.1312</pub-id><pub-id pub-id-type="pmid">32089788</pub-id></citation></ref>
<ref id="B66">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name> <name><surname>Saranti</surname> <given-names>A.</given-names></name> <name><surname>Molnar</surname> <given-names>C.</given-names></name> <name><surname>Biecek</surname> <given-names>P.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2022b</year>). <article-title>&#x0201C;Explainable AI methods-a brief overview,&#x0201D;</article-title> in <source>International Workshop on Extending Explainable AI Beyond Deep Models and Classifiers</source>, eds A. Holzinger, R. Goebel, R. Fong, T. Moon, K.-R. M&#x000FC;ller, and W. Samek (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>13</fpage>&#x02013;<lpage>38</lpage>.</citation>
</ref>
<ref id="B67">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holzinger</surname> <given-names>A.</given-names></name> <name><surname>Saranti</surname> <given-names>A.</given-names></name> <name><surname>Mueller</surname> <given-names>H.</given-names></name></person-group> (<year>2021</year>). <article-title>KANDINSKYPatterns&#x02013;An experimental exploration environment for pattern analysis and machine intelligence</article-title>. <source>arXiv [Preprint]</source>. arXiv: 2103.00519.</citation>
</ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hosny</surname> <given-names>A.</given-names></name> <name><surname>Parmar</surname> <given-names>C.</given-names></name> <name><surname>Quackenbush</surname> <given-names>J.</given-names></name> <name><surname>Schwartz</surname> <given-names>L. H.</given-names></name> <name><surname>Aerts</surname> <given-names>H. J. W. L.</given-names></name></person-group> (<year>2018</year>). <article-title>Artificial intelligence in radiology</article-title>. <source>Nat. Rev. Cancer</source> <volume>18</volume>, <fpage>500</fpage>&#x02013;<lpage>510</lpage>. <pub-id pub-id-type="doi">10.1038/s41568-018-0016-5</pub-id><pub-id pub-id-type="pmid">29777175</pub-id></citation></ref>
<ref id="B69">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hryniewska</surname> <given-names>W.</given-names></name> <name><surname>Bombi&#x00144;ski</surname> <given-names>P.</given-names></name> <name><surname>Szatkowski</surname> <given-names>P.</given-names></name> <name><surname>Tomaszewska</surname> <given-names>P.</given-names></name> <name><surname>Przelaskowski</surname> <given-names>A.</given-names></name> <name><surname>Biecek</surname> <given-names>P.</given-names></name></person-group> (<year>2021</year>). <article-title>Checklist for responsible deep learning modeling of medical images based on COVID-19 detection studies</article-title>. <source>Pattern Recognit</source>. <volume>118</volume>, <fpage>108035</fpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2021.108035</pub-id><pub-id pub-id-type="pmid">34054148</pub-id></citation></ref>
<ref id="B70">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>S.</given-names></name> <name><surname>Gao</surname> <given-names>Y.</given-names></name> <name><surname>Niu</surname> <given-names>Z.</given-names></name> <name><surname>Jiang</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>L.</given-names></name> <name><surname>Xiao</surname> <given-names>X.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Weakly supervised deep learning for COVID-19 infection detection and classification from CT images</article-title>. <source>IEEE Access</source> <volume>8</volume>, <fpage>118869</fpage>&#x02013;<lpage>118883</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2020.3005510</pub-id><pub-id pub-id-type="pmid">34905733</pub-id></citation></ref>
<ref id="B71">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jim&#x000E9;nez-Luna</surname> <given-names>J.</given-names></name> <name><surname>Grisoni</surname> <given-names>F.</given-names></name> <name><surname>Schneider</surname> <given-names>G.</given-names></name></person-group> (<year>2020</year>). <article-title>Drug discovery with explainable artificial intelligence</article-title>. <source>Nat. Mach. Intell</source>. <volume>2</volume>, <fpage>573</fpage>&#x02013;<lpage>584</lpage>. <pub-id pub-id-type="doi">10.1038/s42256-020-00236-4</pub-id></citation>
</ref>
<ref id="B72">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Joshi</surname> <given-names>G.</given-names></name> <name><surname>Walambe</surname> <given-names>R.</given-names></name> <name><surname>Kotecha</surname> <given-names>K.</given-names></name></person-group> (<year>2021</year>). <article-title>A review on explainability in multimodal deep neural nets</article-title>. <source>IEEE Access</source>. <volume>9</volume>, <fpage>59800</fpage>&#x02013;<lpage>59821</lpage>.</citation>
</ref>
<ref id="B73">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kermany</surname> <given-names>D. S.</given-names></name> <name><surname>Goldbaum</surname> <given-names>M.</given-names></name> <name><surname>Cai</surname> <given-names>W.</given-names></name> <name><surname>Valentim</surname> <given-names>C. C. S.</given-names></name> <name><surname>Liang</surname> <given-names>H.</given-names></name> <name><surname>Baxter</surname> <given-names>S. L.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Identifying medical diagnoses and treatable diseases by image-based deep learning</article-title>. <source>Cell</source> <volume>172</volume>, <fpage>1122</fpage>&#x02013;<lpage>1131.e9</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2018.02.010</pub-id><pub-id pub-id-type="pmid">29474911</pub-id></citation></ref>
<ref id="B74">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kerr</surname> <given-names>W. T.</given-names></name> <name><surname>Douglas</surname> <given-names>P. K.</given-names></name> <name><surname>Anderson</surname> <given-names>A.</given-names></name> <name><surname>Cohen</surname> <given-names>M. S.</given-names></name></person-group> (<year>2014</year>). <article-title>The utility of data-driven feature selection: Re: Chu et al. 2012</article-title>. <source>Neuroimage</source> <volume>84</volume>, <fpage>1107</fpage>&#x02013;<lpage>1110</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.07.050</pub-id><pub-id pub-id-type="pmid">23891886</pub-id></citation></ref>
<ref id="B75">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Khaligh-Razavi</surname> <given-names>S.-M.</given-names></name> <name><surname>Kriegeskorte</surname> <given-names>N.</given-names></name></person-group> (<year>2014</year>). <article-title>Deep supervised, but not unsupervised, models may explain IT cortical representation</article-title>. <source>PLoS Comput. Biol</source>. <volume>10</volume>, <fpage>e1003915</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1003915</pub-id><pub-id pub-id-type="pmid">25375136</pub-id></citation></ref>
<ref id="B76">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Khan</surname> <given-names>J.</given-names></name> <name><surname>Wei</surname> <given-names>J. S.</given-names></name> <name><surname>Ringn&#x000E9;r</surname> <given-names>M.</given-names></name> <name><surname>Saal</surname> <given-names>L. H.</given-names></name> <name><surname>Ladanyi</surname> <given-names>M.</given-names></name> <name><surname>Westermann</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2001</year>). <article-title>Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks</article-title>. <source>Nat. Med</source>. <volume>7</volume>, <fpage>673</fpage>&#x02013;<lpage>679</lpage>. <pub-id pub-id-type="doi">10.1038/89044</pub-id><pub-id pub-id-type="pmid">11385503</pub-id></citation></ref>
<ref id="B77">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>B.</given-names></name> <name><surname>Wattenberg</surname> <given-names>M.</given-names></name> <name><surname>Gilmer</surname> <given-names>J.</given-names></name> <name><surname>Cai</surname> <given-names>C.</given-names></name> <name><surname>Wexler</surname> <given-names>J.</given-names></name> <name><surname>Viegas</surname> <given-names>F.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Interpretability beyond feature attribution: quantitative testing with concept activation vectors (TCAV),&#x0201D;</article-title> in <source>International Conference on Machine Learning (PMLR)</source>, <fpage>2668</fpage>&#x02013;<lpage>2677</lpage>.</citation>
</ref>
<ref id="B78">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kindermans</surname> <given-names>P.-J.</given-names></name> <name><surname>Hooker</surname> <given-names>S.</given-names></name> <name><surname>Adebayo</surname> <given-names>J.</given-names></name> <name><surname>Alber</surname> <given-names>M.</given-names></name> <name><surname>Sch&#x000FC;tt</surname> <given-names>K. T.</given-names></name> <name><surname>D&#x000E4;hne</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>&#x0201C;The (un)reliability of saliency methods,&#x0201D;</article-title> in <source>Explainable AI: Interpreting, Explaining and Visualizing Deep Learning</source>, eds W. Samek, G. Montavon, A. Vedaldi, L. K. Hansen, and K.-R. M&#x000FC;ller (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>267</fpage>&#x02013;<lpage>280</lpage>.</citation>
</ref>
<ref id="B79">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kindermans</surname> <given-names>P. J.</given-names></name> <name><surname>Sch&#x000FC;tt</surname> <given-names>K. T.</given-names></name> <name><surname>Alber</surname> <given-names>M.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name> <name><surname>Erhan</surname> <given-names>D.</given-names></name> <name><surname>Kim</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Learning how to explain neural networks: PatternNet and PatternAttribution</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1705.05598.</citation>
</ref>
<ref id="B80">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kohoutov&#x000E1;</surname> <given-names>L.</given-names></name> <name><surname>Heo</surname> <given-names>J.</given-names></name> <name><surname>Cha</surname> <given-names>S.</given-names></name> <name><surname>Lee</surname> <given-names>S.</given-names></name> <name><surname>Moon</surname> <given-names>T.</given-names></name> <name><surname>Wager</surname> <given-names>T. D.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Toward a unified framework for interpreting machine-learning models in neuroimaging</article-title>. <source>Nat. Protoc</source>. <volume>15</volume>, <fpage>1399</fpage>&#x02013;<lpage>1435</lpage>. <pub-id pub-id-type="doi">10.1038/s41596-019-0289-5</pub-id><pub-id pub-id-type="pmid">32203486</pub-id></citation></ref>
<ref id="B81">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kriegeskorte</surname> <given-names>N.</given-names></name></person-group> (<year>2015</year>). <article-title>Deep neural networks: a new framework for modeling biological vision and brain information processing</article-title>. <source>Annu. Rev. Vis. Sci</source>. <volume>1</volume>, <fpage>417</fpage>&#x02013;<lpage>446</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-vision-082114-035447</pub-id><pub-id pub-id-type="pmid">28532370</pub-id></citation></ref>
<ref id="B82">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kriegeskorte</surname> <given-names>N.</given-names></name> <name><surname>Douglas</surname> <given-names>P. K.</given-names></name></person-group> (<year>2018</year>). <article-title>Cognitive computational neuroscience</article-title>. <source>Nat. Neurosci</source>. <volume>21</volume>, <fpage>1148</fpage>&#x02013;<lpage>1160</lpage>. <pub-id pub-id-type="doi">10.1038/s41593-018-0210-5</pub-id><pub-id pub-id-type="pmid">30127428</pub-id></citation></ref>
<ref id="B83">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kriegeskorte</surname> <given-names>N.</given-names></name> <name><surname>Douglas</surname> <given-names>P. K.</given-names></name></person-group> (<year>2019</year>). <article-title>Interpreting encoding and decoding models</article-title>. <source>Curr. Opin. Neurobiol</source>. <volume>55</volume>, <fpage>167</fpage>&#x02013;<lpage>179</lpage>. <pub-id pub-id-type="doi">10.1016/j.conb.2019.04.002</pub-id><pub-id pub-id-type="pmid">31039527</pub-id></citation></ref>
<ref id="B84">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kriegeskorte</surname> <given-names>N.</given-names></name> <name><surname>Goebel</surname> <given-names>R.</given-names></name> <name><surname>Bandettini</surname> <given-names>P.</given-names></name></person-group> (<year>2006</year>). <article-title>Information-based functional brain mapping</article-title>. <source>Proc. Natl. Acad. Sci. U. S. A</source>. <volume>103</volume>, <fpage>3863</fpage>&#x02013;<lpage>3868</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0600244103</pub-id><pub-id pub-id-type="pmid">16537458</pub-id></citation></ref>
<ref id="B85">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lake</surname> <given-names>B. M.</given-names></name> <name><surname>Ullman</surname> <given-names>T. D.</given-names></name> <name><surname>Tenenbaum</surname> <given-names>J. B.</given-names></name> <name><surname>Gershman</surname> <given-names>S. J.</given-names></name></person-group> (<year>2017</year>). <article-title>Building machines that learn and think like people</article-title>. <source>Behav. Brain Sci</source>. <volume>40</volume>, <fpage>e253</fpage>. <pub-id pub-id-type="doi">10.1017/S0140525X16001837</pub-id><pub-id pub-id-type="pmid">27881212</pub-id></citation></ref>
<ref id="B86">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lamy</surname> <given-names>J.-B.</given-names></name> <name><surname>Sekar</surname> <given-names>B.</given-names></name> <name><surname>Guezennec</surname> <given-names>G.</given-names></name> <name><surname>Bouaud</surname> <given-names>J.</given-names></name> <name><surname>S&#x000E9;roussi</surname> <given-names>B.</given-names></name></person-group> (<year>2019</year>). <article-title>Explainable artificial intelligence for breast cancer: a visual case-based reasoning approach</article-title>. <source>Artif. Intell. Med</source>. <volume>94</volume>, <fpage>42</fpage>&#x02013;<lpage>53</lpage>. <pub-id pub-id-type="doi">10.1016/j.artmed.2019.01.001</pub-id><pub-id pub-id-type="pmid">30871682</pub-id></citation></ref>
<ref id="B87">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Landecker</surname> <given-names>W.</given-names></name> <name><surname>Thomure</surname> <given-names>M. D.</given-names></name> <name><surname>Bettencourt</surname> <given-names>L. M. A.</given-names></name> <name><surname>Mitchell</surname> <given-names>M.</given-names></name> <name><surname>Kenyon</surname> <given-names>G. T.</given-names></name> <name><surname>Brumby</surname> <given-names>S. P.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Interpreting individual classifications of hierarchical networks,&#x0201D;</article-title> in <source>Proceedings of the 2013 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)</source>, <fpage>32</fpage>&#x02013;<lpage>38</lpage>.</citation></ref>
<ref id="B88">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Langlotz</surname> <given-names>C. P.</given-names></name> <name><surname>Allen</surname> <given-names>B.</given-names></name> <name><surname>Erickson</surname> <given-names>B. J.</given-names></name> <name><surname>Kalpathy-Cramer</surname> <given-names>J.</given-names></name> <name><surname>Bigelow</surname> <given-names>K.</given-names></name> <name><surname>Cook</surname> <given-names>T. S.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>A roadmap for foundational research on artificial intelligence in medical imaging: from the 2018 NIH/RSNA/ACR/The Academy Workshop</article-title>. <source>Radiology</source> <volume>291</volume>, <fpage>781</fpage>&#x02013;<lpage>791</lpage>. <pub-id pub-id-type="doi">10.1148/radiol.2019190613</pub-id><pub-id pub-id-type="pmid">30990384</pub-id></citation></ref>
<ref id="B89">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Muller</surname> <given-names>K.-R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Analyzing classifiers: fisher vectors and deep neural networks,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>2912</fpage>&#x02013;<lpage>2920</lpage>.</citation>
</ref>
<ref id="B90">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>W&#x000E4;ldchen</surname> <given-names>S.</given-names></name> <name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name></person-group> (<year>2019</year>). <article-title>Unmasking Clever Hans predictors and assessing what machines really learn</article-title>. <source>Nat. Commun</source>. <volume>10</volume>, <fpage>1</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1038/s41467-019-08987-4</pub-id><pub-id pub-id-type="pmid">30858366</pub-id></citation></ref>
<ref id="B91">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>LeCun</surname> <given-names>Y.</given-names></name> <name><surname>Bengio</surname> <given-names>Y.</given-names></name> <name><surname>Hinton</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>Deep learning</article-title>. <source>Nature</source> <volume>521</volume>, <fpage>436</fpage>&#x02013;<lpage>444</lpage>. <pub-id pub-id-type="doi">10.1038/nature14539</pub-id><pub-id pub-id-type="pmid">26017442</pub-id></citation></ref>
<ref id="B92">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>H.</given-names></name> <name><surname>Yune</surname> <given-names>S.</given-names></name> <name><surname>Mansouri</surname> <given-names>M.</given-names></name> <name><surname>Kim</surname> <given-names>M.</given-names></name> <name><surname>Tajmir</surname> <given-names>S. H.</given-names></name> <name><surname>Guerrier</surname> <given-names>C. E.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>An explainable deep-learning algorithm for the detection of acute intracranial haemorrhage from small datasets</article-title>. <source>Nat. Biomed. Eng</source>. <volume>3</volume>, <fpage>173</fpage>&#x02013;<lpage>182</lpage>. <pub-id pub-id-type="doi">10.1038/s41551-018-0324-9</pub-id><pub-id pub-id-type="pmid">30948806</pub-id></citation></ref>
<ref id="B93">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leek</surname> <given-names>J. T.</given-names></name> <name><surname>Scharpf</surname> <given-names>R. B.</given-names></name> <name><surname>Bravo</surname> <given-names>H. C.</given-names></name> <name><surname>Simcha</surname> <given-names>D.</given-names></name> <name><surname>Langmead</surname> <given-names>B.</given-names></name> <name><surname>Johnson</surname> <given-names>W. E.</given-names></name> <etal/></person-group>. (<year>2010</year>). <article-title>Tackling the widespread and critical impact of batch effects in high-throughput data</article-title>. <source>Nat. Rev. Genet</source>. <volume>11</volume>, <fpage>733</fpage>&#x02013;<lpage>739</lpage>. <pub-id pub-id-type="doi">10.1038/nrg2825</pub-id><pub-id pub-id-type="pmid">20838408</pub-id></citation></ref>
<ref id="B94">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Dvornek</surname> <given-names>N. C.</given-names></name> <name><surname>Zhuang</surname> <given-names>J.</given-names></name> <name><surname>Ventola</surname> <given-names>P.</given-names></name> <name><surname>Duncan</surname> <given-names>J. S.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Brain biomarker interpretation in ASD using deep learning and fMRI,&#x0201D;</article-title> in <source>Medical Image Computing and Computer Assisted Intervention &#x02013; MICCAI 2018</source>, eds A. F. Frangi, J. A. Schnabel, C. Davatzikos, C. Alberola-L&#x000F3;pez, and G. Fichtinger (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>206</fpage>&#x02013;<lpage>214</lpage>.</citation></ref>
<ref id="B95">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lipton</surname> <given-names>Z. C.</given-names></name></person-group> (<year>2018</year>). <article-title>The mythos of model interpretability</article-title>. <source>Commun. ACM</source> <volume>61</volume>, <fpage>36</fpage>&#x02013;<lpage>43</lpage>. <pub-id pub-id-type="doi">10.1145/3233231</pub-id></citation>
</ref>
<ref id="B96">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Litjens</surname> <given-names>G.</given-names></name> <name><surname>Kooi</surname> <given-names>T.</given-names></name> <name><surname>Bejnordi</surname> <given-names>B. E.</given-names></name> <name><surname>Setio</surname> <given-names>A. A. A.</given-names></name> <name><surname>Ciompi</surname> <given-names>F.</given-names></name> <name><surname>Ghafoorian</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>A survey on deep learning in medical image analysis</article-title>. <source>Med. Image Anal</source>. <volume>42</volume>, <fpage>60</fpage>&#x02013;<lpage>88</lpage>. <pub-id pub-id-type="doi">10.1016/j.media.2017.07.005</pub-id></citation></ref>
<ref id="B97">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lucieri</surname> <given-names>A.</given-names></name> <name><surname>Dengel</surname> <given-names>A.</given-names></name> <name><surname>Ahmed</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>Deep learning based decision support for medicine&#x02013;a case study on skin cancer diagnosis</article-title>. <source>arXiv [Preprint]</source>. arXiv: 2103.05112.</citation>
</ref>
<ref id="B98">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lundberg</surname> <given-names>S. M.</given-names></name> <name><surname>Lee</surname> <given-names>S. I.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;A unified approach to interpreting model predictions,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems</source>, Vol. 30.</citation>
</ref>
<ref id="B99">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lundervold</surname> <given-names>A. S.</given-names></name> <name><surname>Lundervold</surname> <given-names>A.</given-names></name></person-group> (<year>2019</year>). <article-title>An overview of deep learning in medical imaging focusing on MRI</article-title>. <source>Z. Med. Phys</source>. <volume>29</volume>, <fpage>102</fpage>&#x02013;<lpage>127</lpage>. <pub-id pub-id-type="doi">10.1016/j.zemedi.2018.11.002</pub-id><pub-id pub-id-type="pmid">30553609</pub-id></citation></ref>
<ref id="B100">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ma</surname> <given-names>S.</given-names></name> <name><surname>Song</surname> <given-names>X.</given-names></name> <name><surname>Huang</surname> <given-names>J.</given-names></name></person-group> (<year>2007</year>). <article-title>Supervised group Lasso with applications to microarray data analysis</article-title>. <source>BMC Bioinformat</source>. <volume>8</volume>, <fpage>60</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-8-60</pub-id><pub-id pub-id-type="pmid">17316436</pub-id></citation></ref>
<ref id="B101">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Magister</surname> <given-names>L. C.</given-names></name> <name><surname>Kazhdan</surname> <given-names>D.</given-names></name> <name><surname>Singh</surname> <given-names>V.</given-names></name> <name><surname>Li&#x000F2;</surname> <given-names>P.</given-names></name></person-group> (<year>2021</year>). <article-title>GCExplainer: Human-in-the-loop concept-based explanations for graph neural networks</article-title>. <source>arXiv [Preprint].</source> arXiv: 2107.11889.</citation>
</ref>
<ref id="B102">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mahendran</surname> <given-names>A.</given-names></name> <name><surname>Vedaldi</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Understanding deep image representations by inverting them,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>5188</fpage>&#x02013;<lpage>5196</lpage>.</citation>
</ref>
<ref id="B103">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>McCarthy</surname> <given-names>J.</given-names></name></person-group> (<year>1960</year>). <source>Programs with Common Sense</source>. <publisher-name>RLE and MIT Computation Center</publisher-name>.</citation>
</ref>
<ref id="B104">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Meske</surname> <given-names>C.</given-names></name> <name><surname>Bunde</surname> <given-names>E.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Transparency and trust in human-AI-interaction: the role of model-agnostic explanations in computer vision-based decision support,&#x0201D;</article-title> in <source>Artificial Intelligence in HCI</source>, eds H. Degen, and L. Reinerman-Jones (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>54</fpage>&#x02013;<lpage>69</lpage>.</citation>
</ref>
<ref id="B105">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Miller</surname> <given-names>T.</given-names></name></person-group> (<year>2019</year>). <article-title>Explanation in artificial intelligence: insights from the social sciences</article-title>. <source>Artif. Intell</source>. <volume>267</volume>, <fpage>1</fpage>&#x02013;<lpage>38</lpage>. <pub-id pub-id-type="doi">10.1016/j.artint.2018.07.007</pub-id></citation>
</ref>
<ref id="B106">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Miotto</surname> <given-names>R.</given-names></name> <name><surname>Wang</surname> <given-names>F.</given-names></name> <name><surname>Wang</surname> <given-names>S.</given-names></name> <name><surname>Jiang</surname> <given-names>X.</given-names></name> <name><surname>Dudley</surname> <given-names>J. T.</given-names></name></person-group> (<year>2017</year>). <article-title>Deep learning for healthcare: review, opportunities and challenges</article-title>. <source>Brief. Bioinformat</source>. <volume>19</volume>, <fpage>1236</fpage>&#x02013;<lpage>1246</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbx044</pub-id><pub-id pub-id-type="pmid">28481991</pub-id></citation></ref>
<ref id="B107">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mirchi</surname> <given-names>N.</given-names></name> <name><surname>Bissonnette</surname> <given-names>V.</given-names></name> <name><surname>Yilmaz</surname> <given-names>R.</given-names></name> <name><surname>Ledwos</surname> <given-names>N.</given-names></name> <name><surname>Winkler-Schwartz</surname> <given-names>A.</given-names></name> <name><surname>Del Maestro</surname> <given-names>R. F.</given-names></name></person-group> (<year>2020</year>). <article-title>The virtual operative assistant: an explainable artificial intelligence tool for simulation-based training in surgery and medicine</article-title>. <source>PLoS ONE</source> <volume>15</volume>, <fpage>e0229596</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0229596</pub-id><pub-id pub-id-type="pmid">32106247</pub-id></citation></ref>
<ref id="B108">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moher</surname> <given-names>D.</given-names></name> <name><surname>Liberati</surname> <given-names>A.</given-names></name> <name><surname>Tetzlaff</surname> <given-names>J.</given-names></name> <name><surname>Altman</surname> <given-names>D. G.</given-names></name> <collab>PRISMA Group</collab></person-group>. (<year>2009</year>). <article-title>Preferred reporting items for systematic reviews and meta-analyses: the PRISMA Statement</article-title>. <source>Ann. Intern. Med</source>. <volume>151</volume>, <fpage>264</fpage>&#x02013;<lpage>269</lpage>. <pub-id pub-id-type="doi">10.7326/0003-4819-151-4-200908180-00135</pub-id><pub-id pub-id-type="pmid">20171303</pub-id></citation></ref>
<ref id="B109">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mohseni</surname> <given-names>S.</given-names></name> <name><surname>Zarei</surname> <given-names>N.</given-names></name> <name><surname>Ragan</surname> <given-names>E. D.</given-names></name></person-group> (<year>2021</year>). <article-title>A multidisciplinary survey and framework for design and evaluation of explainable AI systems</article-title>. <source>ACM Trans. Interact. Intell. Syst.</source> <volume>11</volume>, <fpage>1</fpage>&#x02013;<lpage>45</lpage>.</citation>
</ref>
<ref id="B110">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Molnar</surname> <given-names>C.</given-names></name></person-group> (<year>2020</year>). <source>Interpretable Machine Learning</source>. <publisher-name>Lulu.com</publisher-name>.</citation>
</ref>
<ref id="B111">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name></person-group> (<year>2017</year>). <article-title>Explaining nonlinear classification decisions with deep Taylor decomposition</article-title>. <source>Pattern Recognit</source>. <volume>65</volume>, <fpage>211</fpage>&#x02013;<lpage>222</lpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2016.11.008</pub-id></citation>
</ref>
<ref id="B112">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name></person-group> (<year>2018</year>). <article-title>Methods for interpreting and understanding deep neural networks</article-title>. <source>Digit. Signal Process. A Rev. J</source>. <volume>73</volume>, <fpage>1</fpage>&#x02013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1016/j.dsp.2017.10.011</pub-id></citation>
</ref>
<ref id="B113">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Morcos</surname> <given-names>A. S.</given-names></name> <name><surname>Barrett</surname> <given-names>D. G.</given-names></name> <name><surname>Rabinowitz</surname> <given-names>N. C.</given-names></name> <name><surname>Botvinick</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>On the importance of single directions for generalization</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1803.06959.</citation>
</ref>
<ref id="B114">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mour&#x000E3;o-Miranda</surname> <given-names>J.</given-names></name> <name><surname>Bokde</surname> <given-names>A. L. W.</given-names></name> <name><surname>Born</surname> <given-names>C.</given-names></name> <name><surname>Hampel</surname> <given-names>H.</given-names></name> <name><surname>Stetter</surname> <given-names>M.</given-names></name></person-group> (<year>2005</year>). <article-title>Classifying brain states and determining the discriminating activation patterns: support vector machine on functional MRI data</article-title>. <source>Neuroimage</source> <volume>28</volume>, <fpage>980</fpage>&#x02013;<lpage>995</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2005.06.070</pub-id><pub-id pub-id-type="pmid">16275139</pub-id></citation></ref>
<ref id="B115">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Nguyen</surname> <given-names>A.</given-names></name> <name><surname>Dosovitskiy</surname> <given-names>A.</given-names></name> <name><surname>Yosinski</surname> <given-names>J.</given-names></name> <name><surname>Brox</surname> <given-names>T.</given-names></name> <name><surname>Clune</surname> <given-names>J.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Synthesizing the preferred inputs for neurons in neural networks via deep generator networks,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems 29</source>, eds D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (<publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>3387</fpage>&#x02013;<lpage>3395</lpage>.</citation>
</ref>
<ref id="B116">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nigri</surname> <given-names>E.</given-names></name> <name><surname>Ziviani</surname> <given-names>N.</given-names></name> <name><surname>Cappabianco</surname> <given-names>F.</given-names></name> <name><surname>Antunes</surname> <given-names>A.</given-names></name> <name><surname>Veloso</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Explainable deep CNNs for MRI-based diagnosis of Alzheimer&#x00027;s Disease,&#x0201D;</article-title> in <source>2020 International Joint Conference on Neural Networks (IJCNN)</source>, <fpage>1</fpage>&#x02013;<lpage>8</lpage>.</citation>
</ref>
<ref id="B117">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Osman</surname> <given-names>A.</given-names></name> <name><surname>Arras</surname> <given-names>L.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2020</year>). <article-title>Towards ground truth evaluation of visual explanations</article-title>. <source>arXiv [Preprint]</source>. arXiv-2003.</citation>
</ref>
<ref id="B118">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Palatnik de Sousa</surname> <given-names>I.</given-names></name> <name><surname>Maria Bernardes Rebuzzi Vellasco</surname> <given-names>M.</given-names></name> <name><surname>Costa da Silva</surname> <given-names>E.</given-names></name></person-group> (<year>2019</year>). <article-title>Local interpretable model-agnostic explanations for classification of lymph node metastases</article-title>. <source>Sensors</source>. <volume>19</volume>, <fpage>2969</fpage>. <pub-id pub-id-type="doi">10.3390/s19132969</pub-id><pub-id pub-id-type="pmid">31284419</pub-id></citation></ref>
<ref id="B119">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papanastasopoulos</surname> <given-names>Z.</given-names></name> <name><surname>Samala</surname> <given-names>R. K.</given-names></name> <name><surname>Chan</surname> <given-names>H.-P.</given-names></name> <name><surname>Hadjiiski</surname> <given-names>L.</given-names></name> <name><surname>Paramagul</surname> <given-names>C.</given-names></name> <name><surname>Helvie</surname> <given-names>M. A.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>&#x0201C;Explainable AI for medical imaging: deep-learning CNN ensemble for classification of estrogen receptor status from breast MRI,&#x0201D;</article-title> in <source>Proc. SPIE</source>.</citation>
</ref>
<ref id="B120">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Paschali</surname> <given-names>M.</given-names></name> <name><surname>Conjeti</surname> <given-names>S.</given-names></name> <name><surname>Navarro</surname> <given-names>F.</given-names></name> <name><surname>Navab</surname> <given-names>N.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Generalizability vs. robustness: investigating medical imaging networks using adversarial examples,&#x0201D;</article-title> in <source>Medical Image Computing and Computer Assisted Intervention &#x02013; MICCAI 2018</source>, eds A. F. Frangi, J. A. Schnabel, C. Davatzikos, C. Alberola-L&#x000F3;pez, and G. Fichtinger (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>493</fpage>&#x02013;<lpage>501</lpage>.</citation>
</ref>
<ref id="B121">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pennisi</surname> <given-names>M.</given-names></name> <name><surname>Kavasidis</surname> <given-names>I.</given-names></name> <name><surname>Spampinato</surname> <given-names>C.</given-names></name> <name><surname>Schinina</surname> <given-names>V.</given-names></name> <name><surname>Palazzo</surname> <given-names>S.</given-names></name> <name><surname>Salanitri</surname> <given-names>F. P.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>An explainable AI system for automated COVID-19 assessment and lesion categorization from CT-scans</article-title>. <source>Artif. Intell. Med</source>. <volume>118</volume>, <fpage>102114</fpage>. <pub-id pub-id-type="doi">10.1016/j.artmed.2021.102114</pub-id><pub-id pub-id-type="pmid">34412837</pub-id></citation></ref>
<ref id="B122">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pfeifer</surname> <given-names>B.</given-names></name> <name><surname>Baniecki</surname> <given-names>H.</given-names></name> <name><surname>Saranti</surname> <given-names>A.</given-names></name> <name><surname>Biecek</surname> <given-names>P.</given-names></name> <name><surname>Holzinger</surname> <given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>Graph-guided random forest for gene set selection</article-title>. <source>arXiv [Preprint]</source>.</citation>
</ref>
<ref id="B123">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Puri</surname> <given-names>N.</given-names></name> <name><surname>Gupta</surname> <given-names>P.</given-names></name> <name><surname>Agarwal</surname> <given-names>P.</given-names></name> <name><surname>Verma</surname> <given-names>S.</given-names></name> <name><surname>Krishnamurthy</surname> <given-names>B.</given-names></name></person-group> (<year>2017</year>). <article-title>MAGIX: model agnostic globally interpretable explanations</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1706.07160.</citation>
</ref>
<ref id="B124">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Qin</surname> <given-names>Y.</given-names></name> <name><surname>Kamnitsas</surname> <given-names>K.</given-names></name> <name><surname>Ancha</surname> <given-names>S.</given-names></name> <name><surname>Nanavati</surname> <given-names>J.</given-names></name> <name><surname>Cottrell</surname> <given-names>G.</given-names></name> <name><surname>Criminisi</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>&#x0201C;Autofocus layer for semantic segmentation,&#x0201D;</article-title> in <source>Medical Image Computing and Computer Assisted Intervention &#x02013; MICCAI 2018</source>, eds A. F. Frangi, J. A. Schnabel, C. Davatzikos, C. Alberola-L&#x000F3;pez, and G. Fichtinger (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>603</fpage>&#x02013;<lpage>611</lpage>.</citation>
</ref>
<ref id="B125">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rajkomar</surname> <given-names>A.</given-names></name> <name><surname>Oren</surname> <given-names>E.</given-names></name> <name><surname>Chen</surname> <given-names>K.</given-names></name> <name><surname>Dai</surname> <given-names>A. M.</given-names></name> <name><surname>Hajaj</surname> <given-names>N.</given-names></name> <name><surname>Hardt</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Scalable and accurate deep learning with electronic health records</article-title>. <source>npj Digit. Med</source>. <volume>1</volume>, <fpage>18</fpage>. <pub-id pub-id-type="doi">10.1038/s41746-018-0029-1</pub-id><pub-id pub-id-type="pmid">31304302</pub-id></citation></ref>
<ref id="B126">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ribeiro</surname> <given-names>M. T.</given-names></name> <name><surname>Singh</surname> <given-names>S.</given-names></name> <name><surname>Guestrin</surname> <given-names>C.</given-names></name></person-group> (<year>2016</year>). <article-title>Nothing else matters: model-agnostic explanations by identifying prediction invariance</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1611.05817.</citation>
</ref>
<ref id="B127">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Richardson</surname> <given-names>E.</given-names></name> <name><surname>Sela</surname> <given-names>M.</given-names></name> <name><surname>Or-El</surname> <given-names>R.</given-names></name> <name><surname>Kimmel</surname> <given-names>R.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Learning detailed face reconstruction from a single image,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>1259</fpage>&#x02013;<lpage>1268</lpage>.</citation>
</ref>
<ref id="B128">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Robnik-&#x00160;ikonja</surname> <given-names>M.</given-names></name> <name><surname>Kononenko</surname> <given-names>I.</given-names></name></person-group> (<year>2008</year>). <article-title>Explaining classifications for individual instances</article-title>. <source>IEEE Trans. Knowl. Data Eng</source>. <volume>20</volume>, <fpage>589</fpage>&#x02013;<lpage>600</lpage>. <pub-id pub-id-type="doi">10.1109/TKDE.2007.190734</pub-id></citation>
</ref>
<ref id="B129">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ross</surname> <given-names>A. S.</given-names></name> <name><surname>Hughes</surname> <given-names>M. C.</given-names></name> <name><surname>Doshi-Velez</surname> <given-names>F.</given-names></name></person-group> (<year>2017</year>). <article-title>Right for the right reasons: Training differentiable models by constraining their explanations</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1703.03717.</citation>
</ref>
<ref id="B130">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rumelhart</surname> <given-names>D. E.</given-names></name> <name><surname>Hinton</surname> <given-names>G. E.</given-names></name> <name><surname>Williams</surname> <given-names>R. J.</given-names></name></person-group> (<year>1986</year>). <article-title>Learning representations by back-propagating errors</article-title>. <source>Nature</source> <volume>323</volume>, <fpage>533</fpage>&#x02013;<lpage>536</lpage>. <pub-id pub-id-type="doi">10.1038/323533a0</pub-id></citation>
</ref>
<ref id="B131">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.</given-names></name></person-group> (<year>2017a</year>). <article-title>Evaluating the visualization of what a deep neural network has learned</article-title>. <source>IEEE Trans. Neural Networks Learn. Syst</source>. <volume>28</volume>, <fpage>2660</fpage>&#x02013;<lpage>2673</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2016.2599820</pub-id><pub-id pub-id-type="pmid">27576267</pub-id></citation></ref>
<ref id="B132">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name></person-group> (<year>2016</year>). <article-title>Interpreting the predictions of complex ML models by layer-wise relevance propagation</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1611.08191.</citation></ref>
<ref id="B133">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>Wiegand</surname> <given-names>T.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name></person-group> (<year>2017b</year>). <article-title>Explainable artificial intelligence: understanding, visualizing and interpreting deep learning models</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1708.08296.</citation>
</ref>
<ref id="B134">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schirrmeister</surname> <given-names>R. T.</given-names></name> <name><surname>Springenberg</surname> <given-names>J. T.</given-names></name> <name><surname>Fiederer</surname> <given-names>L. D. J.</given-names></name> <name><surname>Glasstetter</surname> <given-names>M.</given-names></name> <name><surname>Eggensperger</surname> <given-names>K.</given-names></name> <name><surname>Tangermann</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Deep learning with convolutional neural networks for EEG decoding and visualization</article-title>. <source>Hum. Brain Mapp</source>. <volume>38</volume>, <fpage>5391</fpage>&#x02013;<lpage>5420</lpage>. <pub-id pub-id-type="doi">10.1002/hbm.23730</pub-id><pub-id pub-id-type="pmid">28782865</pub-id></citation></ref>
<ref id="B135">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schmidhuber</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>Deep learning in neural networks: an overview</article-title>. <source>Neural Netw.</source> <volume>61</volume>, <fpage>85</fpage>&#x02013;<lpage>117</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2014.09.003</pub-id><pub-id pub-id-type="pmid">25462637</pub-id></citation></ref>
<ref id="B136">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Scott</surname> <given-names>A. C.</given-names></name> <name><surname>Clancey</surname> <given-names>W. J.</given-names></name> <name><surname>Davis</surname> <given-names>R.</given-names></name> <name><surname>Shortliffe</surname> <given-names>E. H.</given-names></name></person-group> (<year>1977</year>). <source>Explanation Capabilities of Production-Based Consultation Systems</source>. <publisher-loc>Stanford, CA</publisher-loc>: <publisher-name>Stanford University, Department of Computer Science</publisher-name>.</citation>
</ref>
<ref id="B137">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Segev</surname> <given-names>E.</given-names></name></person-group> (<year>2020</year>). <article-title>Textual network analysis: detecting prevailing themes and biases in international news and social media</article-title>. <source>Sociol. Compass</source> <volume>14</volume>, <fpage>e12779</fpage>. <pub-id pub-id-type="doi">10.1111/soc4.12779</pub-id></citation>
</ref>
<ref id="B138">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Selvaraju</surname> <given-names>R. R.</given-names></name> <name><surname>Cogswell</surname> <given-names>M.</given-names></name> <name><surname>Das</surname> <given-names>A.</given-names></name> <name><surname>Vedantam</surname> <given-names>R.</given-names></name> <name><surname>Parikh</surname> <given-names>D.</given-names></name> <name><surname>Batra</surname> <given-names>D.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Grad-cam: visual explanations from deep networks via gradient-based localization,&#x0201D;</article-title> in <source>Proceedings of the IEEE International Conference on Computer Vision</source>, <fpage>618</fpage>&#x02013;<lpage>626</lpage>.</citation>
</ref>
<ref id="B139">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Selvaraju</surname> <given-names>R. R.</given-names></name> <name><surname>Das</surname> <given-names>A.</given-names></name> <name><surname>Vedantam</surname> <given-names>R.</given-names></name> <name><surname>Cogswell</surname> <given-names>M.</given-names></name> <name><surname>Parikh</surname> <given-names>D.</given-names></name> <name><surname>Batra</surname> <given-names>D.</given-names></name></person-group> (<year>2016</year>). <article-title>Grad-CAM: Why did you say that?</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1611.07450.</citation>
</ref>
<ref id="B140">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shen</surname> <given-names>D.</given-names></name> <name><surname>Wu</surname> <given-names>G.</given-names></name> <name><surname>Suk</surname> <given-names>H.-I.</given-names></name></person-group> (<year>2017</year>). <article-title>Deep learning in medical image analysis</article-title>. <source>Annu. Rev. Biomed. Eng</source>. <volume>19</volume>, <fpage>221</fpage>&#x02013;<lpage>248</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-bioeng-071516-044442</pub-id><pub-id pub-id-type="pmid">28301734</pub-id></citation></ref>
<ref id="B141">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shortliffe</surname> <given-names>E. H.</given-names></name> <name><surname>Buchanan</surname> <given-names>B. G.</given-names></name></person-group> (<year>1975</year>). <article-title>A model of inexact reasoning in medicine</article-title>. <source>Math. Biosci</source>. <volume>23</volume>, <fpage>351</fpage>&#x02013;<lpage>379</lpage>. <pub-id pub-id-type="doi">10.1016/0025-5564(75)90047-4</pub-id></citation>
</ref>
<ref id="B142">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shrikumar</surname> <given-names>A.</given-names></name> <name><surname>Greenside</surname> <given-names>P.</given-names></name> <name><surname>Kundaje</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Learning important features through propagating activation differences,&#x0201D;</article-title> in <source>Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML&#x00027;17</source>. (JMLR.org), <fpage>3145</fpage>&#x02013;<lpage>3153</lpage>.</citation>
</ref>
<ref id="B143">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Silver</surname> <given-names>D.</given-names></name> <name><surname>Huang</surname> <given-names>A.</given-names></name> <name><surname>Maddison</surname> <given-names>C. J.</given-names></name> <name><surname>Guez</surname> <given-names>A.</given-names></name> <name><surname>Sifre</surname> <given-names>L.</given-names></name> <name><surname>van den Driessche</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Mastering the game of Go with deep neural networks and tree search</article-title>. <source>Nature</source> <volume>529</volume>, <fpage>484</fpage>&#x02013;<lpage>489</lpage>. <pub-id pub-id-type="doi">10.1038/nature16961</pub-id><pub-id pub-id-type="pmid">26819042</pub-id></citation></ref>
<ref id="B144">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Simonyan</surname> <given-names>K.</given-names></name> <name><surname>Vedaldi</surname> <given-names>A.</given-names></name> <name><surname>Zisserman</surname> <given-names>A.</given-names></name></person-group> (<year>2013</year>). <article-title>Deep inside convolutional networks: visualising image classification models and saliency maps</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1312.6034.</citation>
</ref>
<ref id="B145">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Simonyan</surname> <given-names>K.</given-names></name> <name><surname>Zisserman</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <article-title>Very deep convolutional networks for large-scale image recognition</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1409.1556.</citation>
</ref>
<ref id="B146">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Singh</surname> <given-names>A.</given-names></name> <name><surname>Sengupta</surname> <given-names>S.</given-names></name> <name><surname>Balaji</surname> <given-names>J. J.</given-names></name> <name><surname>Mohammed</surname> <given-names>A. R.</given-names></name> <name><surname>Faruq</surname> <given-names>I.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>&#x0201C;What is the optimal attribution method for explainable ophthalmic disease classification?,&#x0201D;</article-title> in <source>Ophthalmic Medical Image Analysis</source>, eds H. Fu, M. K. Garvin, T. MacGillivray, Y. Xu, and Y. Zheng (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>21</fpage>&#x02013;<lpage>31</lpage>.</citation>
</ref>
<ref id="B147">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smilkov</surname> <given-names>D.</given-names></name> <name><surname>Thorat</surname> <given-names>N.</given-names></name> <name><surname>Kim</surname> <given-names>B.</given-names></name> <name><surname>Vi&#x000E9;gas</surname> <given-names>F.</given-names></name> <name><surname>Wattenberg</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>SmoothGrad: removing noise by adding noise</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1706.03825.</citation>
</ref>
<ref id="B148">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Springenberg</surname> <given-names>J. T.</given-names></name> <name><surname>Dosovitskiy</surname> <given-names>A.</given-names></name> <name><surname>Brox</surname> <given-names>T.</given-names></name> <name><surname>Riedmiller</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Striving for simplicity: the all convolutional net</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1412.6806.</citation>
</ref>
<ref id="B149">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Srinivasan</surname> <given-names>V.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>Hellge</surname> <given-names>C.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2017</year>). <article-title>Interpretable human action recognition in compressed domain</article-title>. <source>ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. Proc</source>., <fpage>1692</fpage>&#x02013;<lpage>1696</lpage>.</citation>
</ref>
<ref id="B150">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Staniak</surname> <given-names>M.</given-names></name> <name><surname>Biecek</surname> <given-names>P.</given-names></name></person-group> (<year>2018</year>). <article-title>Explanations of model predictions with live and breakDown packages</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1804.01955.</citation>
</ref>
<ref id="B151">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sturm</surname> <given-names>I.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name></person-group> (<year>2016</year>). <article-title>Interpretable deep neural networks for single-trial EEG classification</article-title>. <source>J. Neurosci. Methods</source> <volume>274</volume>, <fpage>141</fpage>&#x02013;<lpage>145</lpage>. <pub-id pub-id-type="doi">10.1016/j.jneumeth.2016.10.008</pub-id><pub-id pub-id-type="pmid">27746229</pub-id></citation></ref>
<ref id="B152">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sundararajan</surname> <given-names>M.</given-names></name> <name><surname>Taly</surname> <given-names>A.</given-names></name> <name><surname>Yan</surname> <given-names>Q.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Axiomatic attribution for deep networks,&#x0201D;</article-title> in <source>Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML&#x00027;17</source> (<publisher-loc>JMLR.org</publisher-loc>), <fpage>3319</fpage>&#x02013;<lpage>3328</lpage>.</citation>
</ref>
<ref id="B153">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Swartout</surname> <given-names>W.</given-names></name> <name><surname>Paris</surname> <given-names>C.</given-names></name> <name><surname>Moore</surname> <given-names>J.</given-names></name></person-group> (<year>1991</year>). <article-title>Explanations in knowledge systems: design for explainable expert systems</article-title>. <source>IEEE Expert</source> <volume>6</volume>, <fpage>58</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1109/64.87686</pub-id></citation>
</ref>
<ref id="B154">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname> <given-names>Z.</given-names></name> <name><surname>Chuang</surname> <given-names>K. V.</given-names></name> <name><surname>DeCarli</surname> <given-names>C.</given-names></name> <name><surname>Jin</surname> <given-names>L. W.</given-names></name> <name><surname>Beckett</surname> <given-names>L.</given-names></name> <name><surname>Keiser</surname> <given-names>M. J.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Interpretable classification of Alzheimer&#x00027;s disease pathologies with a convolutional neural network pipeline</article-title>. <source>Nat. Commun</source>. <volume>10</volume>, <fpage>1</fpage>&#x02013;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.1038/s41467-019-10212-1</pub-id><pub-id pub-id-type="pmid">31092819</pub-id></citation></ref>
<ref id="B155">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thomas</surname> <given-names>A. W.</given-names></name> <name><surname>Heekeren</surname> <given-names>H. R.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K. R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2019</year>). <article-title>Analyzing neuroimaging data through recurrent deep learning models</article-title>. <source>Front. Neurosci</source>. <volume>13</volume>, <fpage>1</fpage>&#x02013;<lpage>18</lpage>. <pub-id pub-id-type="doi">10.3389/fnins.2019.01321</pub-id><pub-id pub-id-type="pmid">31920491</pub-id></citation></ref>
<ref id="B156">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ting</surname> <given-names>D. S. W.</given-names></name> <name><surname>Cheung</surname> <given-names>C. Y.-L.</given-names></name> <name><surname>Lim</surname> <given-names>G.</given-names></name> <name><surname>Tan</surname> <given-names>G. S. W.</given-names></name> <name><surname>Quang</surname> <given-names>N. D.</given-names></name> <name><surname>Gan</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes</article-title>. <source>JAMA</source> <volume>318</volume>, <fpage>2211</fpage>&#x02013;<lpage>2223</lpage>. <pub-id pub-id-type="doi">10.1001/jama.2017.18152</pub-id><pub-id pub-id-type="pmid">29234807</pub-id></citation></ref>
<ref id="B157">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tjoa</surname> <given-names>E.</given-names></name> <name><surname>Guan</surname> <given-names>C.</given-names></name></person-group> (<year>2020</year>). <article-title>A survey on explainable artificial intelligence (XAI): toward medical XAI</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst</source>. <volume>32</volume>, <fpage>4793</fpage>&#x02013;<lpage>4813</lpage>.<pub-id pub-id-type="pmid">33079674</pub-id></citation></ref>
<ref id="B158">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tonekaboni</surname> <given-names>S.</given-names></name> <name><surname>Joshi</surname> <given-names>S.</given-names></name> <name><surname>McCradden</surname> <given-names>M. D.</given-names></name> <name><surname>Goldenberg</surname> <given-names>A.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;What clinicians want: contextualizing explainable machine learning for clinical end use,&#x0201D;</article-title> in <source>Machine Learning for Healthcare Conference (PMLR)</source>, <fpage>359</fpage>&#x02013;<lpage>380</lpage>.</citation>
</ref>
<ref id="B159">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tulio Ribeiro</surname> <given-names>M.</given-names></name> <name><surname>Singh</surname> <given-names>S.</given-names></name> <name><surname>Guestrin</surname> <given-names>C.</given-names></name></person-group> (<year>2016</year>). <article-title>Nothing else matters: model-agnostic explanations by identifying prediction invariance</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1611.05817.</citation>
</ref>
<ref id="B160">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ustun</surname> <given-names>B.</given-names></name> <name><surname>Rudin</surname> <given-names>C.</given-names></name></person-group> (<year>2016</year>). <article-title>Supersparse linear integer models for optimized medical scoring systems</article-title>. <source>Mach. Learn</source>. <volume>102</volume>, <fpage>349</fpage>&#x02013;<lpage>391</lpage>. <pub-id pub-id-type="doi">10.1007/s10994-015-5528-6</pub-id></citation>
</ref>
<ref id="B161">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>van der Velden</surname> <given-names>B. H. M.</given-names></name> <name><surname>Kuijf</surname> <given-names>H. J.</given-names></name> <name><surname>Gilhuijs</surname> <given-names>K. G. A.</given-names></name> <name><surname>Viergever</surname> <given-names>M. A.</given-names></name></person-group> (<year>2022</year>). <article-title>Explainable artificial intelligence (XAI) in deep learning-based medical image analysis</article-title>. <source>Med. Image Anal</source>. <volume>79</volume>, <fpage>102470</fpage>. <pub-id pub-id-type="doi">10.1016/j.media.2022.102470</pub-id><pub-id pub-id-type="pmid">35576821</pub-id></citation></ref>
<ref id="B162">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vayena</surname> <given-names>E.</given-names></name> <name><surname>Blasimme</surname> <given-names>A.</given-names></name> <name><surname>Cohen</surname> <given-names>I. G.</given-names></name></person-group> (<year>2018</year>). <article-title>Machine learning in medicine: addressing ethical challenges</article-title>. <source>PLoS Med</source>. <volume>15</volume>, <fpage>e1002689</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pmed.1002689</pub-id><pub-id pub-id-type="pmid">30399149</pub-id></citation></ref>
<ref id="B163">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>C. J.</given-names></name> <name><surname>Hamm</surname> <given-names>C. A.</given-names></name> <name><surname>Savic</surname> <given-names>L. J.</given-names></name> <name><surname>Ferrante</surname> <given-names>M.</given-names></name> <name><surname>Schobert</surname> <given-names>I.</given-names></name> <name><surname>Schlachter</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Deep learning for liver tumor diagnosis part II: convolutional neural network interpretation using radiologic imaging features</article-title>. <source>Eur. Radiol</source>. <volume>29</volume>, <fpage>3348</fpage>&#x02013;<lpage>3357</lpage>. <pub-id pub-id-type="doi">10.1007/s00330-019-06214-8</pub-id><pub-id pub-id-type="pmid">31093705</pub-id></citation></ref>
<ref id="B164">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Childress</surname> <given-names>A. R.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Detre</surname> <given-names>J. A.</given-names></name></person-group> (<year>2007</year>). <article-title>Support vector machine learning-based fMRI data group analysis</article-title>. <source>Neuroimage</source> <volume>36</volume>, <fpage>1139</fpage>&#x02013;<lpage>1151</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2007.03.072</pub-id><pub-id pub-id-type="pmid">17524674</pub-id></citation></ref>
<ref id="B165">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weng</surname> <given-names>S. F.</given-names></name> <name><surname>Reps</surname> <given-names>J.</given-names></name> <name><surname>Kai</surname> <given-names>J.</given-names></name> <name><surname>Garibaldi</surname> <given-names>J. M.</given-names></name> <name><surname>Qureshi</surname> <given-names>N.</given-names></name></person-group> (<year>2017</year>). <article-title>Can machine-learning improve cardiovascular risk prediction using routine clinical data?</article-title> <source>PLoS ONE</source>. <volume>12</volume>, <fpage>e0174944</fpage>.<pub-id pub-id-type="pmid">28376093</pub-id></citation></ref>
<ref id="B166">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wilson</surname> <given-names>B.</given-names></name> <name><surname>Hoffman</surname> <given-names>J.</given-names></name> <name><surname>Morgenstern</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>Predictive inequity in object detection</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1902.11097.</citation>
</ref>
<ref id="B167">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Windisch</surname> <given-names>P.</given-names></name> <name><surname>Weber</surname> <given-names>P.</given-names></name> <name><surname>F&#x000FC;rweger</surname> <given-names>C.</given-names></name> <name><surname>Ehret</surname> <given-names>F.</given-names></name> <name><surname>Kufeld</surname> <given-names>M.</given-names></name> <name><surname>Zwahlen</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Implementation of model explainability for a basic brain tumor detection using convolutional neural networks on MRI slices</article-title>. <source>Neuroradiology</source> <volume>62</volume>, <fpage>1515</fpage>&#x02013;<lpage>1518</lpage>. <pub-id pub-id-type="doi">10.1007/s00234-020-02465-1</pub-id><pub-id pub-id-type="pmid">32500277</pub-id></citation></ref>
<ref id="B168">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>F.</given-names></name> <name><surname>Uszkoreit</surname> <given-names>H.</given-names></name> <name><surname>Du</surname> <given-names>Y.</given-names></name> <name><surname>Fan</surname> <given-names>W.</given-names></name> <name><surname>Zhao</surname> <given-names>D.</given-names></name> <name><surname>Zhu</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Explainable AI: a brief survey on history, research areas, approaches and challenges,&#x0201D;</article-title> in <source>Natural Language Processing and Chinese Computing</source>, eds J. Tang, M.-Y. Kan, D. Zhao, S. Li, and H. Zan (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>563</fpage>&#x02013;<lpage>574</lpage>.</citation>
</ref>
<ref id="B169">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Y.</given-names></name> <name><surname>Jia</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>L.-B.</given-names></name> <name><surname>Ai</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>F.</given-names></name> <name><surname>Lai</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Large scale tissue histopathology image classification, segmentation, and visualization via deep convolutional activation features</article-title>. <source>BMC Bioinforma.</source> <volume>18</volume>, <fpage>281</fpage>. <pub-id pub-id-type="doi">10.1186/s12859-017-1685-x</pub-id><pub-id pub-id-type="pmid">28549410</pub-id></citation></ref>
<ref id="B170">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>C.</given-names></name> <name><surname>Rangarajan</surname> <given-names>A.</given-names></name> <name><surname>Ranka</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Visual explanations from deep 3D convolutional neural networks for Alzheimer&#x00027;s disease classification,&#x0201D;</article-title> in <source>AMIA Annu. Symp. Proc. 2018</source>, <fpage>1571</fpage>&#x02013;<lpage>1580</lpage>.<pub-id pub-id-type="pmid">30815203</pub-id></citation></ref>
<ref id="B171">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yasaka</surname> <given-names>K.</given-names></name> <name><surname>Abe</surname> <given-names>O.</given-names></name></person-group> (<year>2018</year>). <article-title>Deep learning and artificial intelligence in radiology: current applications and future directions</article-title>. <source>PLoS Med</source>. <volume>15</volume>, <fpage>2</fpage>&#x02013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1371/journal.pmed.1002707</pub-id><pub-id pub-id-type="pmid">30500815</pub-id></citation></ref>
<ref id="B172">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yeom</surname> <given-names>S.-K.</given-names></name> <name><surname>Seegerer</surname> <given-names>P.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Wiedemann</surname> <given-names>S.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Pruning by explaining: a novel criterion for deep neural network pruning</article-title>. <source>Pattern Recognit</source>. <volume>115</volume>, <fpage>107899</fpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2021.107899</pub-id></citation>
</ref>
<ref id="B173">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yosinski</surname> <given-names>J.</given-names></name> <name><surname>Clune</surname> <given-names>J.</given-names></name> <name><surname>Nguyen</surname> <given-names>A.</given-names></name> <name><surname>Fuchs</surname> <given-names>T.</given-names></name> <name><surname>Lipson</surname> <given-names>H.</given-names></name></person-group> (<year>2015</year>). <article-title>Understanding neural networks through deep visualization</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1506.06579.</citation>
</ref>
<ref id="B174">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Young</surname> <given-names>T.</given-names></name> <name><surname>Hazarika</surname> <given-names>D.</given-names></name> <name><surname>Poria</surname> <given-names>S.</given-names></name> <name><surname>Cambria</surname> <given-names>E.</given-names></name></person-group> (<year>2018</year>). <article-title>Recent trends in deep learning based natural language processing</article-title>. <source>IEEE Comput. Intell. Mag</source>. <volume>13</volume>, <fpage>55</fpage>&#x02013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.1109/MCI.2018.2840738</pub-id><pub-id pub-id-type="pmid">34324453</pub-id></citation></ref>
<ref id="B175">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zeiler</surname> <given-names>M. D.</given-names></name> <name><surname>Fergus</surname> <given-names>R.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;Visualizing and understanding convolutional networks,&#x0201D;</article-title> in <source>European Conference on Computer Vision</source>, eds D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>818</fpage>&#x02013;<lpage>833</lpage>.</citation>
</ref>
<ref id="B176">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Bargal</surname> <given-names>S. A.</given-names></name> <name><surname>Lin</surname> <given-names>Z.</given-names></name> <name><surname>Brandt</surname> <given-names>J.</given-names></name> <name><surname>Shen</surname> <given-names>X.</given-names></name> <name><surname>Sclaroff</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <article-title>Top-down neural attention by excitation backprop</article-title>. <source>Int. J. Comput. Vis</source>. <volume>126</volume>, <fpage>1084</fpage>&#x02013;<lpage>1102</lpage>. <pub-id pub-id-type="doi">10.1007/s11263-017-1059-x</pub-id><pub-id pub-id-type="pmid">32070856</pub-id></citation></ref>
<ref id="B177">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Han</surname> <given-names>L.</given-names></name> <name><surname>Zhu</surname> <given-names>W.</given-names></name> <name><surname>Sun</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>D.</given-names></name></person-group> (<year>2021</year>). <article-title>An explainable 3D residual self-attention deep neural network for joint atrophy localization and Alzheimer&#x00027;s disease diagnosis using structural MRI</article-title>. <source>IEEE J. Biomed. Health Inform</source>., 1. <pub-id pub-id-type="doi">10.1109/JBHI.2021.3066832</pub-id><pub-id pub-id-type="pmid">33735087</pub-id></citation></ref>
<ref id="B178">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>G.</given-names></name> <name><surname>Zhou</surname> <given-names>B.</given-names></name> <name><surname>Wang</surname> <given-names>K.</given-names></name> <name><surname>Jiang</surname> <given-names>R.</given-names></name> <name><surname>Xu</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Respond-CAM: analyzing deep models for 3D imaging data by visualizations,&#x0201D;</article-title> in <source>Medical Image Computing and Computer Assisted Intervention &#x02013; MICCAI 2018</source>, eds A. F. Frangi, J. A. Schnabel, C. Davatzikos, C. Alberola-L&#x000F3;pez, and G. Fichtinger (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>485</fpage>&#x02013;<lpage>492</lpage>.</citation>
</ref>
<ref id="B179">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>B.</given-names></name> <name><surname>Khosla</surname> <given-names>A.</given-names></name> <name><surname>Lapedriza</surname> <given-names>A.</given-names></name> <name><surname>Oliva</surname> <given-names>A.</given-names></name> <name><surname>Torralba</surname> <given-names>A.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Learning deep features for discriminative localization,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>2921</fpage>&#x02013;<lpage>2929</lpage>.</citation>
</ref>
<ref id="B180">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>G.</given-names></name> <name><surname>Jiang</surname> <given-names>B.</given-names></name> <name><surname>Tong</surname> <given-names>L.</given-names></name> <name><surname>Xie</surname> <given-names>Y.</given-names></name> <name><surname>Zaharchuk</surname> <given-names>G.</given-names></name> <name><surname>Wintermark</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>Applications of deep learning to neuro-imaging techniques</article-title>. <source>Front. Neurol</source>. <volume>10</volume>, <fpage>869</fpage>. <pub-id pub-id-type="doi">10.3389/fneur.2019.00869</pub-id><pub-id pub-id-type="pmid">31474928</pub-id></citation></ref>
<ref id="B181">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zintgraf</surname> <given-names>L. M.</given-names></name> <name><surname>Cohen</surname> <given-names>T. S.</given-names></name> <name><surname>Adel</surname> <given-names>T.</given-names></name> <name><surname>Welling</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>Visualizing deep neural network decisions: prediction difference analysis</article-title>. <source>arXiv [Preprint]</source>. arXiv: 1702.04595.</citation>
</ref>
</ref-list>
</back>
</article>