TECHNOLOGY AND CODE article
Sec. Predictive Toxicology
TOXPANEL: A Gene-Set Analysis Tool to Assess Liver and Kidney Injuries
- 1DoD Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Development Command, Fort Detrick, MD, United States
- 2The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., Bethesda, MD, United States
Gene-set analysis is commonly used to identify trends in gene expression when cells, tissues, organs, or organisms are subjected to conditions that differ from those within the normal physiological range. However, tools for gene-set analysis to assess liver and kidney injury responses are less common. Furthermore, most websites for gene-set analysis lack the option for users to customize their gene-set database. Here, we present the ToxPanel website, which allows users to perform gene-set analysis to assess liver and kidney injuries using activation scores based on gene-expression fold-change values. The results are graphically presented to assess constituent injury phenotypes (histopathology), with interactive result tables that identify the main contributing genes to a given signal. In addition, ToxPanel offers the flexibility to analyze any set of custom genes based on gene fold-change values. ToxPanel is publicly available online at https://toxpanel.bhsai.org. ToxPanel allows users to access our previously developed liver and kidney injury gene sets, which we have shown in previous work to yield robust results that correlate with the degree of injury. Users can also test and validate their customized gene sets using the ToxPanel website.
ToxPanel is a web-based tool to assess liver and kidney injury from in vitro or in vivo genomic data. In the field of toxicogenomics, a common assumption is that toxicity is associated with a change in the expression of either a single gene or a set of genes (i.e., a module or a gene signature) (Hamadeh et al., 2002; Segal et al., 2004; Fielden et al., 2005; Minowa et al., 2012; Sahini et al., 2014; Ippolito et al., 2015; Parmentier et al., 2017; Sutherland et al., 2019; Wang et al., 2019). Using a toxicogenomic approach, we previously derived 11 liver- and 8 kidney-injury modules (Te et al., 2016) from the Open Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System (TG-GATEs) database (Igarashi et al., 2015), where each injury module is uniquely associated with a specific organ-injury phenotype (see Table 1). The TG-GATEs database contains gene-expression data from Sprague Dawley rats exposed to different chemicals for 4–29 days with corresponding documented and graded histopathological injury phenotypes.
TABLE 1. List of liver and kidney injury modules grouped into general classes with the number of genes in each module.
Using TG-GATEs, we applied in silico approaches to identify common gene responses (injury modules) that correlated with the severity of injury, including fibrosis. Table 1 summarizes the injury modules we identified in previous studies (Te et al., 2016). For biological interpretation, we categorized the histological endpoints by their pathological responses: inflammation, degeneration, and proliferation. The gene module approach outperforms individual genes in predicting the severity of histological damage (AbdulHameed et al., 2014; Tawa et al., 2014; Te et al., 2016; Schyman et al., 2020b).
The adverse outcome pathway (AOP) is a recent development in toxicology that emphasizes a mechanism-based approach to toxicological evaluation as an aid in developing alternatives to animal testing (Ankley et al., 2010). It typically summarizes a complex toxicological phenotype in a flow chart-like diagram consisting of molecular initiating events (MIE), key events (KE), and adverse outcomes (AO) (Vinken, 2013). This type of mechanistic outline allows for the development of new in vitro tests that capture the adverse outcome caused by in vivo chemical exposures (Kleinstreuer et al., 2018). We and others have shown that gene expression data can be used to gain insights into the key events of an AOP at a molecular level (Oki et al., 2016; AbdulHameed et al., 2019). The modules listed in Table 1 represent gene sets that have been associated with adverse outcomes. The focus of the current paper is on the development of a web-based tool that will allow any user to access and evaluate the activation of these gene modules for their own data. The output from ToxPanel can also be construed as a molecular-level readout for the activation of a key event in an adverse outcome pathway. Our injury modules complement Wiki-AOPs as they offer an interpretation of an adverse biological response that is not chemical specific. However, they do not offer detailed mechanistic insights, which KEGG pathways or wiki-pathways can provide (Kanehisa and Goto, 2000; Martens et al., 2020). We have shown that combining our modular approach for identifying key injury phenotypes with pathway analysis, both provided in ToxPanel, can be useful for understanding the underlying molecular mechanisms in, e.g., liver or kidney injury (Schyman et al., 2020a; Schyman et al., 2020b).
We previously validated these injury modules in vivo by treating Sprague Dawley rats with thioacetamide (Schyman et al., 2018), an organosulfur compound extensively used in animal studies as a fibrosis-promoting liver toxicant. Our ToxPanel approach correctly identified cellular infiltration and fibrogenesis as the primary liver-injury phenotypes induced by thioacetamide (Figure 1). Figure 1 shows the increased injury module activations over time related to inflammation and proliferation in accord with the progression of the fibrosis injury phenotype.
FIGURE 1. Gene-expression changes in the rat liver 8 and 24 h after thioacetamide (TAA) [100 mg/g] exposure. The right panel shows a strong inflammatory response 24 h after TAA exposure with cellular infiltration and fibrogenesis as the primary liver-injury phenotypes.
Furthermore, we have found that our injury modules can predict in vivo injury endpoints from in vitro RNA sequencing (RNA-seq) data with a strong correlation (R² > 0.6) (Schyman et al., 2019). In that study, we compared in vivo rat data with in vitro cellular data 24 h after treatment with thioacetamide. The top-ranked liver-injury modules identified by our in vitro studies agreed with those identified in vivo using thioacetamide, indicating that in vitro cell injury was also associated with changes in the expression levels of fibrogenic genes.
Analysis of gene sets typically involves the use of tools for the enrichment analysis of specific biological pathways in gene annotation databases, such as KEGG (Kanehisa and Goto, 2000) and GO terms (The Gene Ontology Consortium, 2018). Pathway enrichment analysis tools are readily accessible in many widely used web applications, such as GSEA (Subramanian et al., 2005) and DAVID (Huang et al., 2008). An alternative approach involves analyzing activation scores derived from the aggregated fold-change (FC) values of the genes in a gene set or pathway and comparing them with scores from a background set of FC values. Although this gene-set activation approach provides robust results (Ackermann and Strimmer, 2009; Yu et al., 2017), it is not available in most web applications.
Here, we present a web application that uses two gene-set activation methods, which we denote as aggregated FC (AFC) and aggregated absolute FC (AAFC). These methods are not limited to FC values per se, as they can also accept beta-values from Kallisto-Sleuth output (Bray et al., 2016; Pimentel et al., 2017) or z-score values as inputs. Figure 2 shows a schematic of ToxPanel's input and output files. AAFC and AFC can be used for predefined or custom-designed gene sets. In the application, the current default gene sets for these methods are liver- and kidney-injury modules, which are gene sets associated with specific injury phenotypes, such as liver fibrosis and kidney necrosis (Ippolito et al., 2015; AbdulHameed et al., 2016; Te et al., 2016; Schyman et al., 2018; Schyman et al., 2019; Wang et al., 2019; Schyman et al., 2020a; Schyman et al., 2020b). We also offer access to the rat and human KEGG pathways, as determined using Entrez gene IDs (Maglott et al., 2011). The gene-set format is compatible with MSigDB (Liberzon et al., 2011) and can be uploaded to the ToxPanel website for analysis. In a recent study in rats, we showed that our injury modules could link genomic responses to observed organ injuries (Schyman et al., 2018; Schyman et al., 2020a), demonstrating the promise of the modular approach in predicting rat in vivo results from rat and human in vitro genomic responses (Schyman et al., 2019; Schyman et al., 2020b).
FIGURE 2. Schematic illustration of typical User Input and the optional Custom Gene Set file formats. The ToxPanel Output presents the calculated AFC and AAFC values for each gene set based on the log FC values in the User Input file.
Aggregated Fold-Change Activation
Detailed descriptions and performance characteristics of the aggregated fold change (AFC) activation method can be found in the original literature (Ackermann and Strimmer, 2009; Yu et al., 2017). In this method, we define the gene-set or KEGG pathway score as the sum of the log-transformed FC values of all genes in the set or pathway. We then use the pathway scores to perform null hypothesis tests and estimate the significance of each pathway by its p-value, defined as the probability that the pathway score for a random data set is greater than the score from the actual data set. The z-score is the number of standard deviations by which the actual gene-set score differs from the mean score obtained from 10,000 random selections of FC values. The sign of the gene-set score represents the direction of regulation: we consider the pathway up-regulated (overexpressed genes) if the net sum of the gene-expression levels after treatment is increased relative to control and down-regulated (suppressed genes) if it is decreased.
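The AFC calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the R script that ToxPanel runs: the function name `afc_test`, the dictionary input, and the choice to draw each random set without replacement from all uploaded genes at the same size as the gene set are ours.

```python
import random
import statistics


def afc_test(log_fc, gene_set, n_random=10_000, seed=0):
    """Hypothetical sketch of an aggregated fold-change (AFC) test.

    log_fc   : dict mapping gene ID -> log-transformed fold change
    gene_set : iterable of gene IDs (IDs absent from log_fc are ignored)
    Returns (score, z_score, p_value).
    """
    rng = random.Random(seed)
    members = [g for g in gene_set if g in log_fc]
    # Gene-set score: sum of the log FC values of the genes in the set.
    score = sum(log_fc[g] for g in members)

    # Null distribution (assumption): scores of random gene sets of the
    # same size, drawn without replacement from all uploaded FC values.
    background = list(log_fc.values())
    null = [sum(rng.sample(background, len(members))) for _ in range(n_random)]

    mean = statistics.fmean(null)
    sd = statistics.pstdev(null)
    z_score = (score - mean) / sd if sd > 0 else 0.0
    # p-value as defined in the text: fraction of random scores that are
    # greater than the actual score.
    p_value = sum(s > score for s in null) / n_random
    return score, z_score, p_value
```

With this definition, a strongly down-regulated gene set yields a negative z-score and a p-value close to 1, so the sign of the z-score carries the direction of regulation.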
Aggregated Absolute Fold-Change Activation
We recently used the aggregated absolute fold-change (AAFC) activation method to calculate the activation score of a gene set (Schyman et al., 2018; Schyman et al., 2019). This method identifies gene sets that are significantly changed or disrupted without considering the direction of change. The method, which takes the absolute values of the log-transformed FC values, performs well in identifying significantly altered pathways (Ackermann and Strimmer, 2009). Its potential shortcoming is that it disregards information about the direction of change in a pathway (i.e., whether the pathway is up- or down-regulated, as indicated by whether the sum of the activation scores of its genes increases or decreases relative to control).
The AAFC method first reads a list of gene FC values uploaded by the user and takes the absolute value of the log-transformed FC value for each gene. For each gene set, it then sums all of the absolute values to calculate the total absolute FC value. Subsequently, we use the gene-set scores to perform null hypothesis tests and estimate the significance of each gene set by its p-value, defined as the probability that the score for randomly selected FC values (10,000 times) is greater than the score from the actual gene set. A small p-value implies that the gene-set value is significant. As in the AFC method, the z-score is the number of standard deviations by which the actual gene-set value differs from the mean of the randomly selected FC values (10,000 times). The AAFC method, however, considers only positive z-score values, as negative z-score values indicate FC values smaller than the average absolute FC value.
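As a minimal sketch of the AAFC steps just described, the Python function below takes absolute log FC values, sums them per gene set, and compares the sum with randomly selected values. The name `aafc_test`, the input format, and the sampling scheme (same-size sets drawn without replacement from all uploaded genes) are assumptions made for illustration; they are not taken from the ToxPanel source code.

```python
import random
import statistics


def aafc_test(log_fc, gene_set, n_random=10_000, seed=0):
    """Hypothetical sketch of an aggregated absolute fold-change (AAFC) test.

    log_fc   : dict mapping gene ID -> log-transformed fold change
    gene_set : iterable of gene IDs (IDs absent from log_fc are ignored)
    Returns (score, z_score, p_value).
    """
    rng = random.Random(seed)
    # Step 1: absolute value of the log-transformed FC for every gene.
    abs_fc = {gene: abs(value) for gene, value in log_fc.items()}
    members = [g for g in gene_set if g in abs_fc]
    # Step 2: total absolute FC for the gene set.
    score = sum(abs_fc[g] for g in members)

    # Step 3 (assumption): null scores from random same-size selections
    # of absolute FC values, drawn without replacement.
    background = list(abs_fc.values())
    null = [sum(rng.sample(background, len(members))) for _ in range(n_random)]

    mean = statistics.fmean(null)
    sd = statistics.pstdev(null)
    z_score = (score - mean) / sd if sd > 0 else 0.0
    p_value = sum(s > score for s in null) / n_random
    return score, z_score, p_value
```

A gene set whose members move strongly in opposite directions has a signed sum near zero but a large AAFC score, which is the case this method is designed to detect.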
Implementation of the Web-Application
The ToxPanel web application is delivered over Hypertext Transfer Protocol Secure (HTTPS) and can be accessed at toxpanel.bhsai.org. The implementation consists of a controller, a database, and a front view. The controller is written in Java and runs on JDK 1.8. It handles interaction with the user, from file uploading to job submission. When a job is submitted, the controller stores a record in the database and queues the job, which runs an R script for the analysis. After the job completes, the controller stores the result and notifies the user by email. On the database side, PostgreSQL 10.5 provides data storage and retrieval. The front view is implemented with the PrimeFaces 7.0 and BootsFaces 1.3.0 libraries, together with ChartJS 2.9.3 and customized Cascading Style Sheets (CSS). The two libraries provide convenient syntax and a wide range of user interface components, and they serve as the backbone of the web user interface. ChartJS 2.9.3 provides more advanced chart drawing and allows further tuning. The web service runs on Tomcat 8.5 inside a Docker container, which allows a speedy recovery if the web service ever encounters a critical failure.
Upon visiting the site, the user is directed to the login page. The user can log in either with a registered account or as a guest. The guest account is primarily for demonstration purposes, but all features are available. Once logged in, the user can upload gene expression data, specify job variables, and submit a job. The job is queued, and once it has completed, the user can visit the result page through the history table.
Results and Discussion
The main purpose of the ToxPanel website is to offer a platform to provide access to our liver- and kidney-injury modules and to calculate gene-set activation scores for gene-set analysis using log-transformed FC values. The website also allows users to upload their own gene sets or pathways. Figure 3 shows the job submission page with supported input file formats for gene expression data and customized gene sets. For each gene set, the program calculates the z-scores and p-values for both the AFC and AAFC methods. If the user provides gene-level p-values in the input file, it also calculates the aggregated p-value for a gene set, based on Fisher’s probability test (Fisher, 1932).
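For readers unfamiliar with Fisher's method, the sketch below shows how gene-level p-values can be combined into one gene-set p-value. The statistic X = -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the null hypothesis, and for even degrees of freedom its upper tail has a closed form, so no external library is needed. The function name `fisher_combined_p` and the floor applied to p-values of zero are our assumptions; how ToxPanel handles such edge cases is not described here.

```python
import math


def fisher_combined_p(p_values, floor=1e-300):
    """Combine k independent p-values with Fisher's method.

    Uses the closed-form chi-square survival function for 2k degrees
    of freedom: P(X > x) = exp(-x/2) * sum_{j<k} (x/2)^j / j!
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    # half_x = X / 2 = -sum(ln p_i); the floor (an assumption) avoids log(0).
    half_x = -sum(math.log(max(p, floor)) for p in p_values)

    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return min(1.0, math.exp(-half_x) * total)
```

For a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula.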
FIGURE 3. The figure illustrates the two input file formats for Step 1 and Step 3. The Step 1 input file with the gene expression data is required, but the p-values in column C are optional. The Step 3 Custom Gene Sets file is optional if one uses Human or Rat Entrez ID but required if Other gene IDs are used. The Custom Gene Sets format follows the .gmt format illustrated in the figure.
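The .gmt layout used for custom gene sets is, in its MSigDB form, a tab-delimited text file with one gene set per line: the set name, a description field, and then the gene IDs. A minimal reader might look like the sketch below; the function name `parse_gmt` and the gene-set names and IDs in the usage example are made up for illustration, and ToxPanel's own parser may treat malformed lines differently.

```python
def parse_gmt(lines):
    """Parse .gmt-formatted lines into {gene_set_name: [gene IDs]}.

    Each line is tab-delimited: name, description, gene1, gene2, ...
    Lines with fewer than three fields are skipped (an assumption).
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue
        name, _description, *genes = fields
        # Drop empty cells left by trailing tabs.
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


# Usage with made-up gene-set names and gene IDs:
example = "DEMO_SET_A\tna\t1001\t1002\t1003\nDEMO_SET_B\tna\t2001\t2002\n"
custom_sets = parse_gmt(example.splitlines())
```

The resulting dictionary maps each gene-set name to its list of gene IDs, which is the form a gene-set scoring function would typically consume.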
Users can view all of the results on the ToxPanel website or download them for offline analysis. Figure 4 shows a typical output for changes in gene expression following exposure to thioacetamide. By clicking on the name of a gene set, the user can view the genes in that gene set and their corresponding FC values. This is useful for identifying the main genes contributing to a gene set. For each KEGG pathway, we offer a link to its webpage. The main results are shown under the headings of Aggregate Fold Change and Aggregate Absolute Fold Change. We display both the z-score and p-value for each gene set so that users can easily identify significantly activated gene sets. In the example shown in Figure 4, the gene sets are ranked by the z-score of the AAFC method. The top-ranked gene set is Cellular infiltration for liver injuries, with an AAFC z-score of 13.97.
FIGURE 4. Screenshot of typical results for gene sets activated by the liver toxicant, thioacetamide. Headings: Gene Set—name of gene set, injury module, or pathway; Filter—type of gene set (e.g., KEGG pathway) displayed when selected; Aggregate Fold Change z-Score—positive for an up-regulated gene set and negative for a down-regulated gene set; Aggregate Absolute Fold Change z-Score—gene-set activation, as calculated by summing the changes in the expression levels for all genes within the gene set; Aggregate p-Value—Fisher’s combined p-value for all genes in the gene set.
In this paper, we introduced ToxPanel as a new tool for assessing liver and kidney injury based on gene expression data. Furthermore, ToxPanel complements existing gene and pathway analysis tools by providing a platform for users to access the AFC and AAFC methods. We have shown that the gene sets provided in ToxPanel can be used to predict the occurrence of liver and kidney injury in rats before the damage appears (Schyman et al., 2018; Schyman et al., 2020a), and that rat and human in vitro gene expression data correlate with in vivo injury observed in rats (Schyman et al., 2019; Schyman et al., 2020b). Thus, ToxPanel can potentially be used in early drug discovery and chemical safety evaluations to assess chemical-induced liver and kidney injury from in vitro gene expression data.
Data Availability Statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
PS and AW made substantial contributions to the conception and design of the work. ZX and VD implemented the methods and designed the website. PS contributed to drafting the manuscript. PS, ZX, VD, and AW contributed to revising and editing the manuscript for important intellectual content. All authors read and approved the final manuscript.
The authors were supported by the U.S. Army Medical Research and Development Command (Fort Detrick, MD), and the Defense Threat Reduction Agency grant CBCall14-CBS-05-2-0007.
Conflict of Interest
Authors PS, ZX, and VD were employed by the company The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc.
The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The authors gratefully acknowledge the support of the U.S. Army Medical Research and Development Command (Fort Detrick, MD), and the Defense Threat Reduction Agency grant CBCall14-CBS-05-2-0007. The opinions and assertions contained herein are the private views of the authors and are not to be construed as official or as reflecting the views of the U.S. Army, the U.S. Department of Defense, or The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc. This paper has been approved for public release with unlimited distribution.
AbdulHameed, M. D. M., Ippolito, D. L., Stallings, J. D., and Wallqvist, A. (2016). Mining kidney toxicogenomic data by using gene co-expression modules. BMC Genom. 17 (1), 790. doi:10.1186/s12864-016-3143-y
AbdulHameed, M. D. M., Pannala, V. R., and Wallqvist, A. (2019). Mining public toxicogenomic data reveals insights and challenges in delineating liver steatosis adverse outcome pathways. Front. Genet. 10, 1007. doi:10.3389/fgene.2019.01007
AbdulHameed, M. D., Tawa, G. J., Kumar, K., Ippolito, D. L., Lewis, J. A., Stallings, J. D., et al. (2014). Systems level analysis and identification of pathways and networks associated with liver fibrosis. PLoS One 9 (11), e112193. doi:10.1371/journal.pone.0112193
Ankley, G. T., Bennett, R. S., Erickson, R. J., Hoff, D. J., Hornung, M. W., Johnson, R. D., et al. (2010). Adverse outcome pathways: a conceptual framework to support ecotoxicology research and risk assessment. Environ. Toxicol. Chem. 29 (3), 730–741. doi:10.1002/etc.34
Fielden, M. R., Eynon, B. P., Natsoulis, G., Jarnagin, K., Banas, D., and Kolaja, K. L. (2005). A gene expression signature that predicts the future onset of drug-induced renal tubular toxicity. Toxicol. Pathol. 33 (6), 675–683. doi:10.1080/01926230500321213
Hamadeh, H. K., Knight, B. L., Haugen, A. C., Sieber, S., Amin, R. P., Bushel, P. R., et al. (2002). Methapyrilene toxicity: anchorage of pathologic observations to gene expression alterations. Toxicol. Pathol. 30 (4), 470–482. doi:10.1080/01926230290105712
Huang, D. W., Sherman, B. T., and Lempicki, R. A. (2008). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4 (1), 44–57. doi:10.1038/nprot.2008.211
Igarashi, Y., Nakatsu, N., Yamashita, T., Ono, A., Ohno, Y., Urushidani, T., et al. (2015). Open TG-GATEs: a large-scale toxicogenomics database. Nucleic Acids Res. 43 (D1), D921–D927. doi:10.1093/nar/gku955
Ippolito, D. L., AbdulHameed, M. D. M., Tawa, G. J., Baer, C. E., Permenter, M. G., McDyre, B. C., et al. (2015). Gene expression patterns associated with histopathology in toxic liver fibrosis. Toxicol. Sci. 149 (1), 67–88. doi:10.1093/toxsci/kfv214
Kleinstreuer, N. C., Hoffmann, S., Alépée, N., Allen, D., Ashikaga, T., Casey, W., et al. (2018). Non-animal methods to predict skin sensitization (II): an assessment of defined approaches (*). Crit. Rev. Toxicol. 48 (5), 359–374. doi:10.1080/10408444.2018.1429386
Liberzon, A., Subramanian, A., Pinchback, R., Thorvaldsdóttir, H., Tamayo, P., and Mesirov, J. P. (2011). Molecular signatures database (MSigDB) 3.0. Bioinformatics 27 (12), 1739–1740. doi:10.1093/bioinformatics/btr260
Minowa, Y., Kondo, C., Uehara, T., Morikawa, Y., Okuno, Y., Nakatsu, N., et al. (2012). Toxicogenomic multigene biomarker for predicting the future onset of proximal tubular injury in rats. Toxicology 297 (1–3), 47–56. doi:10.1016/j.tox.2012.03.014
Oki, N. O., Nelms, M. D., Bell, S. M., Mortensen, H. M., and Edwards, S. W. (2016). Accelerating adverse outcome pathway development using publicly available data sources. Curr. Environ. Health Rep. 3, 53–63. doi:10.1007/s40572-016-0079-y
Parmentier, C., Couttet, P., Wolf, A., Zaccharias, T., Heyd, B., Bachellier, P., et al. (2017). Evaluation of transcriptomic signature as a valuable tool to study drug-induced cholestasis in primary human hepatocytes. Arch. Toxicol. 91, 2879–2893. doi:10.1007/s00204-017-1930-0
Sahini, N., Selvaraj, S., and Borlak, J. (2014). Whole genome transcript profiling of drug induced steatosis in rats reveals a gene signature predictive of outcome. PLoS One 9, e114085. doi:10.1371/journal.pone.0114085
Schyman, P., Printz, R. L., AbdulHameed, M. D. M., Estes, S. K., Shiota, C., Shiota, M., et al. (2020a). A toxicogenomic approach to assess kidney injury induced by mercuric chloride in rats. Toxicology 442, 152530. doi:10.1016/j.tox.2020.152530
Schyman, P., Printz, R. L., Estes, S. K., Boyd, K. L., Shiota, M., and Wallqvist, A. (2018). Identification of the toxicity pathways associated with thioacetamide-induced injuries in rat liver and kidney. Front. Pharmacol. 9, 1272. doi:10.3389/fphar.2018.01272
Schyman, P., Printz, R. L., Estes, S. K., O’Brien, T. P., Shiota, M., and Wallqvist, A. (2019). Assessing chemical-induced liver injury in vivo from in vitro gene expression data in the rat: the case of thioacetamide toxicity. Front. Genet. 10, 1233. doi:10.3389/fgene.2019.01233
Schyman, P., Printz, R. L., Estes, S. K., O’Brien, T. P., Shiota, M., and Wallqvist, A. (2020b). Concordance between thioacetamide-induced liver injury in rat and human in vitro gene expression data. Int. J. Mol. Sci. 21, 4017. doi:10.3390/ijms21114017
Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., et al. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U.S.A. 102, 15545–15550. doi:10.1073/pnas.0506580102
Sutherland, J. J., Stevens, J. L., Johnson, K., Elango, N., Webster, Y. W., Mills, B. J., et al. (2019). A novel open access web portal for integrating mechanistic and toxicogenomic study results. Toxicol. Sci. 170, 296–309. doi:10.1093/toxsci/kfz101
Tawa, G. J., AbdulHameed, M. D., Yu, X., Kumar, K., Ippolito, D. L., Lewis, J. A., et al. (2014). Characterization of chemically induced liver injuries using gene co-expression modules. PLoS One 9, e107230. doi:10.1371/journal.pone.0107230
Te, J. A., AbdulHameed, M. D. M., and Wallqvist, A. (2016). Systems toxicology of chemically induced liver and kidney injuries: histopathology‐associated gene co‐expression modules. J. Appl. Toxicol. 36, 1137–1149. doi:10.1002/jat.3278
Wang, H., Liu, R., Schyman, P., and Wallqvist, A. (2019). Deep neural network models for predicting chemically induced liver toxicity endpoints from transcriptomic responses. Front. Pharmacol. 10, 42. doi:10.3389/fphar.2019.00042
Keywords: predictive toxicology, systems toxicology, toxicogenomics, nephrotoxicity, hepatotoxicity, RNA-seq
Citation: Schyman P, Xu Z, Desai V and Wallqvist A (2021) TOXPANEL: A Gene-Set Analysis Tool to Assess Liver and Kidney Injuries. Front. Pharmacol. 12:601511. doi: 10.3389/fphar.2021.601511
Received: 01 September 2020; Accepted: 08 January 2021;
Published: 09 February 2021.
Edited by: Monika Batke, Fraunhofer Institute for Toxicology and Experimental Medicine (FHG), Germany
Reviewed by: Alberto Mantovani, National Institute of Health (ISS), Italy
Sekena Hassanien Abdel-Aziem, National Research Centre, Egypt
Copyright © 2021 Schyman, Xu, Desai and Wallqvist. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Patric Schyman, email@example.com