Comprehensive Evaluation of Organotypic and Microphysiological Liver Models for Prediction of Drug-Induced Liver Injury

Drug-induced liver injury (DILI) is a major concern for the pharmaceutical industry and constitutes one of the most important reasons for the termination of promising drug development projects. Reliable prediction of DILI liability in preclinical stages is difficult, as current experimental model systems do not accurately reflect the molecular phenotype and functionality of the human liver. As a result, multiple drugs that passed preclinical safety evaluations failed due to liver toxicity in clinical trials or postmarketing stages in recent years. To improve the selection of molecules that are taken forward into the clinics, the development of more predictive in vitro systems that enable high-throughput screening of hepatotoxic liabilities and allow for investigative studies into DILI mechanisms has gained growing interest. Specifically, it became increasingly clear that the choice of cell types and culture method both constitute important parameters that affect the predictive power of test systems. In this review, we present current 3D culture paradigms for hepatotoxicity tests and critically evaluate their utility and performance for DILI prediction. In addition, we highlight possibilities of these emerging platforms for mechanistic evaluations of selected drug candidates and present current research directions towards the further improvement of preclinical liver safety tests. We conclude that organotypic and microphysiological liver systems have provided an important step towards more reliable DILI prediction. Furthermore, we expect that the increasing availability of comprehensive benchmarking studies will facilitate model dissemination that might eventually result in their regulatory acceptance.


INTRODUCTION
Drug-induced liver injury (DILI) constitutes a rare but life-threatening adverse drug reaction (ADR). While it only occurs in 4-19 per 100,000 individuals, it is nevertheless the most common cause of acute liver failure in the US and Europe (Sgro et al., 2002; Bjornsson et al., 2013). Of the affected patients, 9.4% die or require liver transplantation and 18.9% show persistent liver damage 6 months after DILI diagnosis (Fontana et al., 2014). Acetaminophen overdose is the most common cause of acute liver failure, whereas around 13% of DILI cases in the clinic are idiosyncratic (Ostapowicz et al., 2002). In addition to its clinical importance, DILI constitutes a major problem for the pharmaceutical industry and regulatory authorities. In the last 30 years, 14 drugs have been withdrawn from the US and European markets due to hepatotoxicity in post-marketing stages (Table 1). Withdrawal of medications represents a major financial burden for the pharmaceutical industry. An impressive example is provided by the later withdrawn antidiabetic drug troglitazone, which achieved sales of 750 million USD per year in 1998; however, when regulators in the UK announced that risks of troglitazone therapy outweighed benefits, share prices of the manufacturer Warner-Lambert dropped by 18.5%, corresponding to a loss of approximately 10 billion USD in company value (Gale, 2006). In addition to post-marketing withdrawals, DILI was responsible for the discontinuation of numerous drug development programs across all clinical stages, including the phase 3 terminations of aplaviroc (Nichols et al., 2008) and fasiglifam (Marcinak et al., 2018).
Against this background, it is clear that the assessment of potential safety liabilities is an essential element of preclinical drug development in order to reduce the risk of expensive late project failures. Importantly, projects with preclinical safety signals are often closed in the clinic due to safety issues, emphasizing the need to thoroughly evaluate and act upon preclinical safety profiles (Cook et al., 2014). The mandatory core battery of pharmacological safety tests is formalized in guidelines by the International Conference on Harmonization (ICH) of Technical Requirements for Registration of Pharmaceuticals for Human Use and is implemented in Europe (EC), Japan (PMDA), US (FDA), Canada (Health Canada), and Switzerland (Swissmedic). Notably, these directives only specify tests covering central nervous, cardiovascular, and respiratory toxicity as mandatory (https://www.ich.org/products/guidelines/safety/article/safetyguidelines.html), whereas evaluation of DILI liability is not explicitly required. In 2010, however, the EMEA released a white paper to assess the risk of DILI during preclinical stages (https://www.ema.europa.eu/en/documents/scientific-guideline/reflection-paper-non-clinical-evaluation-drug-induced-liver-injury-dili_en.pdf), and these developments can be interpreted as the intention of regulators to formalize specific guidance for DILI in the near future.
Before first-in-human trials, drug safety is evaluated in preclinical animal models. However, 38-51% of compounds with hepatotoxic liabilities are not detected in preclinical tests (Hughes, 2008), and the development of systems that more faithfully capture human liver toxicity is thus a major focus of both academic research and commercial development. Early attempts to test and benchmark methods for their ability to predict human toxicity were carried out in the 1990s within the framework of the Multicentre Evaluation of In Vitro Cytotoxicity (MEIC) project (Bondesson et al., 1989). In the course of this project, 29 laboratories tested the cytotoxicity of 50 reference chemicals in 61 cell models, including the hepatoma cell line HepG2. Importantly, the authors found that human cell lines outperformed animal cell culture systems for the prediction of human lethal peak concentrations (Ekwall et al., 1998), thus setting the stage for the adoption of human cell models for toxicity studies.
Fueled by improvements in hepatocyte isolation methods and cryopreservation protocols, as well as the increasing appreciation that hepatoma cell lines express only very low levels of drug metabolizing enzymes, the focus shifted to primary human hepatocytes (PHH) as the gold standard cell model for predictive toxicology. While freshly isolated PHH closely resemble hepatocytes in situ, their phenotype rapidly deteriorates in conventional 2D culture in a process called dedifferentiation. The earliest perturbations at the transcript level are already apparent after 30 min, and more than 4,000 transcripts are differentially expressed (FDR = 0.01) during the first 24 h of culture. Significantly affected pathways include the tricarboxylic acid (TCA) cycle, oxidative phosphorylation, fatty acid metabolism, and urea synthesis. Furthermore, 2D cultured PHH lose expression of important drug metabolizing enzymes and drug transporters, which hampers their usefulness for studies of liver biology and pathobiology, as well as for drug metabolism and toxicity (Rowe et al., 2013).
TABLE 1 | Footnotes: *Withdrawal pertains to the oral formulation. Bromfenac is still marketed as an ophthalmic solution. NSAID, non-steroidal anti-inflammatory drug. Note that drugs that were only withdrawn in individual European countries are not listed here.
To overcome these problems, PHH can be cultured in sandwich configuration, in which 2D cultured hepatocytes are overlaid with an additional layer of Matrigel. In this culture paradigm, PHH retain their morphology and polarity, resulting in physiologically relevant expression levels of transporters and the formation of functional bile canaliculi networks (Bi et al., 2006; Swift et al., 2010; De Bruyn et al., 2013; Oorts et al., 2016). While longevity of cells is prolonged in this culture configuration, proteomic analyses revealed clear indications of hepatocyte dedifferentiation and declining expression levels of multiple CYPs in sandwich cultures (Kimoto et al., 2012; Rowe et al., 2013; Bell et al., 2018). In acute toxicity studies, sandwich-cultured PHH achieved a sensitivity of 50-60% with a specificity of 100% (n = 200 DILI positive and n = 144 DILI negative compounds) (Xu et al., 2008). Underlying reasons for the high number of false negatives might be the limited mass transfer of compounds through the extracellular matrix (ECM) overlay and the fact that cells in 2D sandwich culture cannot be cultivated long enough to capture the delayed onset of toxicity of many hepatotoxic compounds. Thus, while PHH sandwich cultures provide useful tools for pharmacological studies, particularly for the analysis of hepatobiliary transport and cholestatic toxicity, they are only of limited utility for the mechanism-agnostic screening of liver toxicity.
In an attempt to improve phenotypic maintenance of liver cells in culture, a vast number of 3D culture methods have been developed during recent years (Lin and Khetani, 2016; Underhill and Khetani, 2018; Lauschke et al., 2019). In the following sections we outline important considerations for the development of such organotypic liver models, and reflect on how 3D culture systems can promote the maintenance of differentiation characteristics of the parent organ. Next, we provide an updated comprehensive overview of current hepatic organotypic and microphysiological in vitro models for the prediction of DILI, including spheroid models, micropatterned co-cultures (MPCC), bioreactors, and liver-on-a-chip devices, and critically discuss their strengths and weaknesses. In the last part of the review, we highlight emerging applications beyond the evaluation of hepatotoxicity for which organotypic liver models promise to provide conceptual advancements over conventional 2D cultures.

FEATURES OF ORGANOTYPIC LIVER MODELS
The most important feature of organotypic liver models is the preservation of robust hepatic functions and physiological expression levels of drug metabolizing enzymes and drug transporters in order to facilitate faithful hepatotoxicity predictions (Figure 1). For this reason, PHH cultures are more predictive than hepatoma cell lines, such as HepG2, which exhibit expression levels of drug metabolizing enzymes that are often orders of magnitude lower than in freshly isolated PHH (Donato et al., 2009). Importantly, PHH can be cryopreserved and maintain their metabolic activity and sensitivity to drugs upon thawing, which allows biologically meaningful replicate experiments using material from the same donor (McGinnity et al., 2004; Hallifax et al., 2008; Sison-Young et al., 2017). Furthermore, hepatic models should capture the human liver in vivo microenvironment, including the phenotypes of and interactions with relevant non-parenchymal cells (NPCs). Additionally, viability and molecular signatures of the cultured cells should be stable. As such, it is crucial that the rapid dedifferentiation observed within hours in 2D monolayer PHH cultures is prevented. This is particularly important for assessments of drug hepatotoxicity, as toxicity for the majority of compounds manifests with a delayed onset and, additionally, drug metabolizing enzymes are among the first to be lost during dedifferentiation.
In addition to the utilized cell models and culture paradigms, the composition of the culture medium has strong influence on both the phenotypes and stability of the cell model of choice. Specific consideration should be given to nutrient concentrations (sugars, amino acids, and vitamins) and hormone levels, most importantly insulin. Furthermore, while the use of serum might be supporting overall cell health, its use can complicate result interpretation due to incomplete information regarding its active constituents. Similar arguments apply to the use of media with proprietary non-disclosed formulations.
Lastly, the influence of substratum materials and, where applicable, scaffolds on the bioavailability of tested compounds should also be noted. Particularly, it is crucial that these systems exhibit low absorption of small molecules. This issue of drug absorption limits the predictive power of many culture systems, particularly those relying on the elastomer polydimethylsiloxane (PDMS) (Toepke and Beebe, 2006;Li et al., 2009;Gomèz-Sjoberg et al., 2010;Wang et al., 2012). In addition, the material of choice should exhibit minimal leaching of uncrosslinked oligomers into the culture medium, which can be bioactive and modulate drug action (McDonald et al., 2008). Oxygen concentrations in the medium can influence the molecular phenotype of hepatic cells, and it is thus important that environmental oxygen concentrations, platform design, and gas permeability of the substratum material are harmonized to guarantee that cells are exposed to relevant oxygen levels. Another key parameter for liver chips is the stiffness and topography of the substratum material, which can affect molecular phenotypes of cultured hepatocytes via mechanosensing pathways (Natarajan et al., 2015;Yang et al., 2017). Moreover, optical transparency of the material should be considered to render the platform compatible with imaging applications.

Scaffold-Free Spheroids in Multi-Well Formats
Scaffold-free spheroids are 3D cell aggregates that form by self-aggregation of cells in suspension in the absence of added substrates that promote cell attachment (Figure 2A). Spheroids of controlled sizes can be formed over the course of multiple days, either in hanging drops or in multi-well plates with ultra-low attachment (ULA) surfaces. Additionally, spheroids can be formed on a large scale in stirred tank bioreactors, which we review in the section Scaffold-Free Spheroids in Perfused Stirred-Tank Bioreactors. The earliest reports of spheroid hepatocyte culture used newborn rat hepatocytes on poly-HEMA or proteoglycan-coated ULA surfaces and demonstrated that such spheroids were viable for multiple weeks and functionally superior compared to 2D monolayer culture (Landry et al., 1985;Koide et al., 1989;Tong et al., 1992). Building on these seminal findings, a plethora of studies established and characterized 3D spheroids using hepatoma cells, stem cell-derived hepatocyte-like cells (HLCs), or primary hepatocytes. In the following sections we will introduce different cell models for studies of liver function and drug metabolism and provide a comprehensive overview of their application for drug toxicity prediction.

Spheroids From Hepatic Cell Lines
Human liver cell lines, such as HepG2, HepaRG, and Huh-7, are commonly used for studies of drug metabolism and toxicity due to their low cost, unlimited availability, and vast available knowledge base. Interestingly, spheroid culture of hepatoma cells results in down-regulation of ECM, cytoskeleton, and adhesion molecules, cortical actin organization, increased activity of drug metabolizing enzymes, and enhanced liver-specific functions, such as apolipoprotein and albumin secretion (Chang and Hughes-Fulford, 2009; Takahashi et al., 2015).
FIGURE 1 | The choice of cells, media, and culture substrata markedly influences the phenotype and functional relevance of organotypic liver models. Important points that should be considered for model design are listed. Each newly established model should closely mimic the molecular phenotype and function of human liver tissue. Furthermore, cellular phenotypes should be stable over the course of multiple weeks in culture and assays should be reproducible, i.e., experiments using cells from the same donor should have low technical variability. In addition, ideal organotypic liver models should be sufficiently accessible, allowing time course measurements and the sampling of various end points, and should be compatible with high-throughput screening (HTS).
HepG2 constitutes arguably the most extensively characterized hepatoma cell line. HepG2 spheroid culture resulted in polarized expression of MRP2 and MDR1 transporters, improved albumin secretion compared to HepG2 monolayer culture for several weeks, and increased susceptibility to acetaminophen-induced toxicity (TC50 of 7.2 mM in 3D culture compared to 33.8 mM in 2D culture) (Gaskell et al., 2016). Moreover, HepG2 spheroids are technically compatible with high-throughput screening (HTS) (Sirenko et al., 2016) and 3D culture improved prediction of hepatotoxicity of amiodarone, diclofenac, metformin, phenformin, and valproic acid (Fey and Wrzesinski, 2012). Importantly, however, the sensitivity to hepatotoxins in 3D spheroid culture was still considerably lower than in 3D PHH culture or in vivo, in accordance with low expression and activity of drug metabolizing enzymes (Wilkening et al., 2003). Of note, as cancer-derived cells, HepG2 models were found to be sensitive to compounds with anti-proliferative effects that do not rely on hepatic metabolism (Sirenko et al., 2016).
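To illustrate how a TC50 value such as those quoted above can be read off a viability curve, the following minimal Python sketch interpolates the 50% viability crossing on a log-concentration axis. The function name `tc50_log_interp` and the example concentrations and viabilities are hypothetical and are not data from the cited study, which may have used a different curve-fitting procedure. Only the two TC50 values (7.2 mM and 33.8 mM) are taken from the text.

```python
import math

def tc50_log_interp(concs_mM, viability_pct):
    """Estimate the TC50 (concentration giving 50% viability).

    Uses linear interpolation of viability against log10(concentration)
    between the two tested concentrations that bracket 50% viability.
    Concentrations must be positive. Returns None if the curve never
    crosses 50% within the tested range.
    """
    points = sorted(zip(concs_mM, viability_pct))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_tc50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_tc50
    return None

# Hypothetical dose-response data (NOT from the cited study).
example_concs = [1.0, 10.0, 100.0]       # mM
example_viability = [100.0, 50.0, 0.0]   # % of untreated control
example_tc50 = tc50_log_interp(example_concs, example_viability)

# Fold shift between the 2D and 3D TC50 values reported in the text.
fold_shift = 33.8 / 7.2
print(f"example TC50: {example_tc50:.1f} mM; 2D/3D fold shift: {fold_shift:.1f}")
```

Under these assumptions, the reported values correspond to a roughly 4.7-fold lower TC50 in 3D than in 2D HepG2 culture. A four-parameter logistic fit would be a common alternative to the simple interpolation used here.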
Compared to HepG2, HepaRG cells exhibit higher metabolic activity (Hart et al., 2010; Nelson et al., 2017; Yokoyama et al., 2018). HepaRG spheroids can be cultured for up to 7 weeks with polarized transporters and bile canalicular structures. Furthermore, activities of CYP1A2, CYP2B6, CYP2C9, CYP3A4, and UGT enzymes remained stable and substantially (2- to 20-fold) higher in 3D compared to 2D culture (Wang et al., 2015; Ott et al., 2017). In accordance with these functional data, spheroid culture improved sensitivity to acetaminophen and aflatoxin B1, which both require metabolic activation (Gunness et al., 2013; Mueller et al., 2014). HepaRG spheroids furthermore distinguished trovafloxacin and troglitazone from their less toxic structural analogues (Ramaiahgari et al., 2014). In a small set of 10 hepatotoxins and 2 non-toxins tested in a multiplexed HepaRG spheroid assay, the platform correctly flagged 7/10 compounds as hepatotoxic after exposure for 7 days, whereas 6/10 drugs were successfully identified in the corresponding monolayer culture (Ott et al., 2017). Effects of the culture paradigm on drug sensitivity appear to be compound-specific, as 2D HepaRG monolayer culture was the more sensitive model for the detection of troglitazone, tamoxifen, and chlorpromazine toxicity (Gunness et al., 2013; Mueller et al., 2014; Ott et al., 2017).
In addition to their use as a model for hepatocellular toxicity, HepaRG-stellate cell co-cultures can be used to detect stellate cell activation and drug-induced liver fibrosis caused by allyl alcohol and methotrexate, as Leite and colleagues recently demonstrated (Leite et al., 2016). Furthermore, the authors found that acetaminophen could activate stellate cells, which was confirmed in vivo in mice, thus providing a useful experimental tool for the prediction of drug-induced liver fibrosis.
FIGURE 2 | [...] Comparison of CYP metabolism determined by mass spectrometry between 2D and 3D spheroid cultures of the same donors (n = 3) demonstrates that metabolic activities are 10- to 900-fold higher in 3D PHH spheroids after one or more weeks in culture. FC: fold change; FIC: freshly isolated cells. (E) Toxicity assessment of 70 hepatotoxic drugs (DILI positive; red dots) and 53 DILI negative controls (green dots) in PHH spheroids. Viability as determined by ATP quantifications relative to untreated controls is shown for three exposure levels (1×, 5×, and 20× therapeutic Cmax). The dashed line indicates viability of the respective control spheroids (100%). Notably, 48 of the 70 compounds with DILI liabilities in humans were successfully flagged as toxic (69% sensitivity), whereas none of the DILI negative drugs indicated hepatotoxic liabilities (100% specificity). Error bars indicate SD. **, ***, and **** indicate p < 0.01, p < 0.001, and p < 0.0001 in a two-tailed heteroscedastic t-test, respectively.

Spheroids From Primary Human Liver Cells
PHH serve as the gold standard cell type for predicting drugs that are hepatotoxic in humans (Gómez-Lechón et al., 2014). However, their rapid dedifferentiation in conventional 2D monolayer culture limited their phenotypic advantages compared to cell lines to short-term studies. Strikingly, organotypic spheroid culture provided a major conceptual advancement. In spheroids, PHH remain viable with stable albumin secretion for at least 5 weeks (Messner et al., 2013; Bell et al., 2016). Moreover, they can be co-cultured with NPCs and are capable of glycogen storage (Figure 2B). Importantly, we could show that proteomic signatures in 3D spheroids closely resemble the human liver in vivo, whereas cultures from the same donors rapidly deteriorated in 2D monolayer or sandwich culture (Figure 2C). Similarly, PHH spheroids retained their transcriptomic and metabolomic profiles for multiple weeks (Bell et al., 2017; Bell et al., 2018). In contrast, transcriptomic patterns of other emerging cell models differed substantially and 8,148 out of 17,462 genes analyzed were differentially expressed in PHH spheroids compared to HepaRG cells and stem cell-derived HLCs (Bell et al., 2017). In accordance with these phenotypic differences, PHH spheroids exhibited substantially increased activity of CYP1A2, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, and CYP3A4 compared to HepG2, HepaRG, and 2D cultured PHH (Figure 2D and references Berger et al., 2016; Vorrink et al., 2017).
These phenotypic improvements over other cell models and culture paradigms rendered PHH spheroids a promising tool for hepatotoxicity studies. Ogihara et al. demonstrated that PHH spheroids could detect hepatotoxicity of compounds with diverse toxicity mechanisms, such as acetaminophen, chlorpromazine, diclofenac, flutamide, imipramine, ticlopidine, and troglitazone, at clinically relevant concentrations after 2 weeks of exposure using aspartate aminotransferase (AST) leakage as readout (Ogihara et al., 2015; Ogihara et al., 2017). In addition, spheroid assays have been presented that are tailored to identify and study specific hepatotoxicity mechanisms, such as cholestasis and steatosis (Kozyra et al., 2018).
Recently, two studies evaluated the predictive power of PHH spheroids offered by InSphero AG in media with proprietary composition to detect hepatic liabilities of selected compounds. Proctor et al. analyzed 110 drugs (69 DILI positive and 41 DILI negative) at concentrations up to 100× the therapeutic exposure levels for 2 weeks in undisclosed media formulations and reported sensitivity and specificity of 59% and 80%, respectively (Proctor et al., 2017). The same model was compared to 2D sandwich culture using 12 DILI positive test compounds (Richert et al., 2016). Surprisingly, the authors found that 2D sandwich culture correctly identified 11/12 DILI-positive compounds already after 3 days, whereas the spheroid model only detected 9/12 compounds after 14 days (amiodarone, bosentan, and ximelagatran as false negatives).
In contrast to the InSphero model, we used spheroids in chemically defined serum-free conditions (CD-spheroids) to evaluate the toxicity of 123 drugs (70 DILI positive and 53 DILI negative) (Vorrink et al., 2018). Of these drugs, 38 overlapped with the study by Proctor and colleagues. The model achieved 69% sensitivity and 100% specificity at exposure levels of 20× therapeutic Cmax (Figure 2E). Moreover, CD-spheroids correctly flagged amiodarone and bosentan as hepatotoxins. Combined, these results emphasize that PHH spheroids provide accurate tools for the preclinical prediction of hepatic liabilities of compounds with diverse patterns of liver damage from various therapeutic areas.
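The sensitivity and specificity figures quoted in this section follow directly from the counts of correctly and incorrectly classified compounds. As a minimal sketch, the Python snippet below recomputes the CD-spheroid values; the class name `ScreenCounts` is a hypothetical helper introduced only for illustration, and the count of 48 flagged DILI positive compounds is taken from the Figure 2E legend.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScreenCounts:
    """Confusion-matrix counts for a binary DILI screen (illustrative helper)."""
    true_pos: int   # DILI positive drugs flagged as toxic
    false_neg: int  # DILI positive drugs that were missed
    true_neg: int   # DILI negative drugs not flagged
    false_pos: int  # DILI negative drugs wrongly flagged

    @property
    def sensitivity(self) -> float:
        # Fraction of DILI positive drugs that the assay detects.
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        # Fraction of DILI negative drugs that the assay leaves unflagged.
        return self.true_neg / (self.true_neg + self.false_pos)

# CD-spheroid panel: 48 of 70 DILI positive drugs flagged,
# none of the 53 DILI negative drugs flagged.
cd_spheroids = ScreenCounts(true_pos=48, false_neg=22, true_neg=53, false_pos=0)
print(f"sensitivity: {cd_spheroids.sensitivity:.0%}")
print(f"specificity: {cd_spheroids.specificity:.0%}")
```

The same helper can be applied to any other panel discussed here, provided the underlying counts rather than only the percentages are available.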

Spheroids From Stem Cell-Derived Hepatocyte-Like Cells
HLCs can be obtained by differentiation from human induced pluripotent stem cells (hiPS-Hep) and have been discussed as a promising alternative in vitro model for studies of human liver function and drug toxicity. To date, various differentiation protocols have been presented. While the overall differentiation state of HLCs has been improved, current protocols still fall short of producing cells that closely resemble the molecular phenotype of mature PHH (Schwartz et al., 2014; Baxter et al., 2015; Goldring et al., 2017). Importantly, however, while their phenotype remains immature, HLCs offer the appealing advantage that they can be generated from material from patients with specific phenotypes of interest, for instance from those that experienced rare idiosyncratic drug reactions, thus providing the unique opportunity to investigate patient-specific risk factors.
As for other cell models, spheroid culture seems to improve hepatic phenotypes and support expression of marker genes, although the levels reached are still orders of magnitude lower than in PHH (Meier et al., 2017). Despite these beneficial effects, only few studies have used HLC spheroids for hepatotoxicity assessments. Takayama and colleagues cultured HLC spheroids in multi-well plates with nanopillars at the bottom of each well to facilitate spheroid formation (Takayama et al., 2013). Notably, in this platform, HLC spheroids were overall more sensitive than HepG2 spheroids to a set of 24 hepatotoxins. However, sensitivity was considerably lower than in PHH monolayer cultures. By contrast, Sirenko et al. found that HepG2 spheroids were much more sensitive than HLC spheroids for 10/23 evaluated compounds following a single 72-h exposure (Sirenko et al., 2016). In summary, while 3D culture seems to improve hepatic phenotypes, its effects on sensitivity to hepatotoxins remain to be determined.

Spheroids From Primary Animal Hepatocytes
Organotypic liver models using animal cells can have multiple applications. Most commonly, primary animal hepatocytes, particularly from rat, are used to study human drug response, providing a cheaper and more easily accessible cell source. As for PHH, these spheroids possess structural polarity and functional bile canaliculi with MDR1 and MRP2 being localized at the canalicular membranes. Furthermore, rat hepatocyte spheroids retain the expression of hepatic genes, including albumin, drug metabolizing enzymes, and clotting factors, for multiple weeks (Abu-Absi et al., 2002; Brophy et al., 2009). Exposure of rat hepatocyte spheroids to acetaminophen decreased glutathione levels and intracellular ATP, albeit at concentrations vastly exceeding toxic levels in humans (Sanoh et al., 2014). This relative insensitivity might be due to the drastic downregulation of Cyp3a2 and Cyp2e1, orthologues of human CYP3A4 and CYP2E1 that mediate the formation of the hepatotoxic reactive acetaminophen metabolite NAPQI. Rat spheroids have furthermore been used to test the hepatotoxicity of methotrexate. Interestingly, the authors found that spheroids were less sensitive to methotrexate hepatotoxicity than 2D monolayer cultures, likely due to increased MRP2-mediated export (Walker et al., 2000; Yin et al., 2009).
Various strategies have been proposed to use animal hepatocytes as proxies for specific human genotypes. For instance, as cats lack UGT1A6 and have reduced activity of UGT2B isoforms (Court, 2013), cat hepatocytes could be useful to model the reduced glucuronidation capacity in individuals with poor UGT activity, for instance for mimicking the increased susceptibility to acetaminophen hepatotoxicity in individuals with Gilbert's syndrome (de Morais et al., 1992). Similarly, as dogs lack the N-acetyltransferases NAT1 and NAT2 (Trepanier et al., 1997), dog hepatocytes could model the effects of decreased NAT activity, which is common in humans (up to 40% are slow NAT metabolizers). Importantly, however, while such experiments might inform about the contribution of specific metabolic pathways, drastic differences in overall isoform composition, expression, and catalytic activities of drug metabolizing enzymes limit the predictive power of organotypic animal liver models for the prediction of human hepatotoxicity (Martignoni et al., 2006).

Scaffold-Free Spheroids in Perfused Stirred-Tank Bioreactors
Perfused stirred-tank bioreactors feature stirring to promote cellular aggregation. Similar to scaffold-free spheroids in multi-well formats, spheroids in bioreactors displayed spontaneous assembly of functional bile canaliculi networks, physiologically relevant expression of hepatocyte-specific markers, and sustained CYP activity (Tostões et al., 2012). Furthermore, the model has been extended to include co-culture with an outer layer of stromal human bone marrow mesenchymal stem cells, and the authors report increased expression of CYP3A4 and CYP1A2 (Rebelo et al., 2017).
Bioreactors allow tight control of parameters such as dissolved oxygen and pH and enable spheroid formation at scales that are difficult to achieve in conventional multi-well plate formats. Key parameters to consider are the type and size of paddle and impeller as well as the rate of mixing in order to balance the establishment and maintenance of cell-cell contacts with shear stress. Furthermore, it must also be ensured that autocrine and paracrine signals are not disrupted by high perfusion rates (Ebrahimkhani et al., 2014). Importantly, perfused stirred-tank bioreactors do not allow the testing of different conditions without transfer to separate compartments, as spheroids are formed in a single reaction chamber. As such, the main application of these systems is to feed multiplexed high-throughput platforms for hepatotoxicity testing or other applications. Problematic in this context, however, is the large variability in spheroid sizes that can negatively impact reproducibility and test standardization.

Scaffold-Based Culture Models
Organotypic liver cultures can be supported by a variety of scaffolds, i.e., natural or synthetic components that can facilitate the formation and maintenance of cell-cell contacts, cell polarity, and tissue organization. In recent years, particularly alginate, collagen and nanofibrous poly-L-lactic acid (PLLA) scaffolds have received growing attention in the context of hepatic spheroids, as they facilitate the deposition of essential components of the hepatic microenvironment, including collagens and fibronectin (Selden et al., 2000). Alginate microencapsulation consequently improves viability and metabolic capacity of HepG2 and HepaRG spheroids (Elkayam et al., 2006;Rebelo et al., 2015). Alginate and alginate-collagen composite hydrogel furthermore enhance hepatic functions of primary rat hepatocyte spheroids formed using high-throughput emulsion droplet microfluidics (Chan et al., 2016). Similarly, PLLA nanoscaffolds had beneficial effects on albumin secretion and slightly improved expression of Cyp1a2 and Cyp2b2 of primary rat hepatocytes after 7 days in culture compared to 2D monolayers (Bierwolf et al., 2011). Also for PHH, beneficial scaffold effects on hepatic function have been demonstrated, manifesting in improved clearance predictions of low clearance compounds, such as warfarin and coumarin (Phillips et al., 2018).
Only few studies have analyzed the effects of scaffolds in the context of hepatotoxicity assessments. Spheroid culture on polystyrene scaffolds increased the activity of Cyp1a2, Cyp2b1, and Cyp3a2 more than fourfold and improved the sensitivity of rat hepatocyte spheroids to acetaminophen (Schutte et al., 2011). TC50 values (approximately 40 mM), however, were still substantially higher than in scaffold-free spheroid culture or in vivo (approximately 1 mM). In contrast, HepG2 spheroids in ECM-based hydrogel ceased to proliferate and acquired a more differentiated phenotype, indicated by gradually increased expression of albumin, hepatic transcription factors, phase I and II drug metabolism enzymes, and transporters up to day 28, whereas culture in sandwich configuration or on porous polystyrene scaffold did not result in phenotypic improvements (Ramaiahgari et al., 2014; Luckert et al., 2017). Consequently, HepG2 spheroids in Matrigel identified the toxicity of eight hepatotoxins, whereas four control compounds not implicated in DILI were correctly flagged as nontoxic (Ramaiahgari et al., 2014). Liu et al. established HepaRG spheroids using decellularized rat liver as scaffold. They showed that this liver biomatrix scaffold enhances expression of phase I and phase II enzymes, drug transporters, and nuclear receptors for up to 28 days compared to scaffold-free spheroid culture (Liu et al., 2018). However, expression levels of most genes were still considerably lower than in PHH. In accordance with these improved molecular phenotypes, toxicity testing of 14 hepatotoxins revealed that the addition of scaffolds conferred a higher DILI sensitivity. Furthermore, various other culture scaffolds, such as galactosylated cellulosic sponges or poly-ethylene-glycol-diacrylate, increased long-term hepatic functionality of spheroids generated from human liver cell lines or stem cell-derived hepatocytes (Khalil et al., 2001; Tasnim et al., 2016; Wang et al., 2016; Mun et al., 2019). Moreover, hepatotoxicity tests using small panels of drugs indicated that these models can successfully identify the hepatotoxicity of selected drugs in vitro.
While the highlighted studies indicate beneficial effects of various scaffold materials on the phenotypes of hepatic spheroids, comprehensive benchmarking experiments in which the molecular signatures of scaffold-based hepatic cultures have been compared on transcriptomic, proteomic, and metabolomic levels to the intact liver have not been conducted. Furthermore, batch-to-batch variability of scaffold materials can impair reproducibility and comparability, thus complicating the application of these models for DILI predictions. Additionally, the effects of scaffolds on drug diffusion should be considered and exposure concentrations should be adjusted accordingly before applying scaffold-based models to pharmacological and toxicological analyses.

Micropatterned Co-Culture
As an alternative to spheroid culture, Bhatia and colleagues developed a micropatterned co-culture (MPCC) system in which hepatic cells are seeded on micropatterned ECM islands and are supported by surrounding murine fibroblasts (Figure 3A). PHH in such MPCC form bile canaliculi networks and retain expression of phase I and phase II drug metabolizing enzymes, nuclear receptors, as well as drug transporters for at least 6 weeks in culture (Khetani and Bhatia, 2008). MPCC are widely used and have been shown to be a valuable tool for different phases of drug development by closely mimicking the complexity of human liver.
For hepatotoxicity tests, the model achieved 65% sensitivity and 90% specificity when analyzing a panel of 45 drugs (35 DILI positive and 10 DILI negative) using glutathione and ATP levels, as well as albumin secretion and urea synthesis as endpoints (Khetani et al., 2013). In addition to PHH, MPCC has been used to culture HLCs derived from iPSCs (iMPCC) overlaid with a thin Matrigel layer. The majority of hepatic genes in iMPCCs were expressed at levels between 10% and 200% of suspension cultured hepatocytes. Moreover, global gene expression profiling revealed that iMPCC transcriptomic signatures remained relatively stable between day 9 and day 21 of culture (R² between 0.6 and 0.96). However, functional parameters, such as the activities of CYP2A6 and CYP3A4, as well as glucuronidation and sulfation reactions, were drastically decreased to 5-25% of PHH levels. The system successfully distinguished between four toxic and four nontoxic compounds (Figures 3B, C), and later extension to a panel of 47 drugs yielded 65% sensitivity and 100% specificity, which was similar to the result in the PHH MPCC system on the same panel of test drugs, thus corroborating the value of iMPCC for DILI predictions.
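The sensitivity and specificity figures quoted for these compound panels follow from the usual confusion-matrix definitions. The short Python sketch below illustrates the arithmetic; the function name and the individual counts are illustrative assumptions, chosen only to be consistent with a panel of 35 DILI-positive and 10 DILI-negative drugs, and are not taken from the cited studies.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: DILI-positive compounds flagged / missed by the model.
    tn/fp: DILI-negative compounds correctly cleared / wrongly flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative compound")
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative counts (assumption): 35 DILI-positive and 10 DILI-negative drugs.
tp, fn = 23, 12
tn, fp = 9, 1

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Because panels differ in size and composition between studies, such percentages are only directly comparable when the same compounds are tested.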

Hollow Fiber Bioreactors
In hollow fiber bioreactors, cells are cultured in tubular modules, each permeated by a network of intratubular capillaries, which allow perfusion of medium and gases (Figure 4A). Hepatic cells are cultured in the surrounding extra-capillary space, which permits the mimicking of capillary blood-tissue exchange, thus providing a pseudovascularized tissue model in which cells are protected from direct media flow (Williams et al., 2013).
Hollow fiber bioreactors for artificial liver support developed by Gerlach and colleagues require around 10¹⁰ cells in a cell compartment of 800-ml volume, which renders such approaches incompatible with parallel drug testing (Gerlach et al., 1994; Gerlach et al., 2003; Sauer et al., 2003; Pless et al., 2006). In an attempt to make this setup accessible for pharmacological studies, the bioreactor setup was miniaturized to cell compartment volumes of 0.5 to 2 ml accommodating 10⁷ to 10⁸ cells. Notably, miniaturization did not negatively impact hepatic functions and metabolic activities were preserved for multiple weeks (Zeilinger et al., 2011; Hoffmann et al., 2012). Transporter expression of SLCO1B1 (OATP1B1) in HepaRG bioreactor cultures was 95% lower than in PHH bioreactors, whereas CYP3A4 levels were found to be similar. While all major in vivo metabolites of diclofenac were detected in both PHH and HepaRG bioreactors, metabolic routes of the investigational peroxisome proliferator activated receptor alpha (PPARA) agonist AZD6610 differed in HepaRG bioreactors, whereas they were similar between PHH and in vivo. Furthermore, Lübberstedt et al. demonstrated that serum-free culture of PHH in the miniaturized bioreactor resulted in increased expression of various CYPs, phase II enzymes, and drug transporters compared to cultures maintained with fetal calf serum (Lübberstedt et al., 2015).
To further facilitate mass exchange of nutrients and metabolites between cells and culture medium, de Bartolo and colleagues arranged the fibers of their bioreactor in cell plates with bilateral sinusoidal structures (De Bartolo et al., 2009). The bioreactor features two types of alternating cross-assembled hollow fiber membranes with different molecular weight cutoffs and physicochemical properties. These differing membrane properties allow the two membrane systems to mimic the in vivo arterial and venous blood vessels. When cryopreserved PHH were cultured in the system, urea synthesis and albumin production were maintained at levels approximately 8-fold higher than in the liver support system and diazepam biotransformation was observable for 18 days.
Owing to the large cell requirements, hollow fiber bioreactors have rarely been used for hepatotoxicity evaluations. To our knowledge, only two studies have evaluated the utility of hollow fiber bioreactors for hepatotoxicity testing, both using fresh rat hepatocytes. In their first study, Shen and colleagues demonstrated that bioreactor cultures of rat hepatocytes were more sensitive than the corresponding 2D cultures for the detection of acetaminophen hepatotoxicity, likely due to higher CYP2E1 activity (Shen et al., 2006). However, as rats are highly resistant to acetaminophen toxicity in vivo (McGill et al., 2012), the relevance of these results can be questioned. In a follow-up work, the same group tested 48-h exposures to eight drugs (Figure 4B). The main finding was that rat hepatocytes were more sensitive when cultured in polysulfone-g-poly(ethylene glycol) hollow fiber bioreactors compared to those cultured either in bioreactors with polysulfone hollow fibers or without hollow fibers (Shen et al., 2010).
In summary, hollow fiber bioreactors provide an excellent culture system for the long-term maintenance of hepatic phenotypes and functions. However, they are expensive, difficult to establish, and even in the miniaturized setup bioreactors require cell numbers that are orders of magnitude higher than in other organotypic 3D culture paradigms (Planchamp et al., 2003). Furthermore, toxicological and pharmacological studies in hollow fiber bioreactors are complicated by the absorption of hydrophobic drugs by system components and limited accessibility to biochemical or imaging-based monitoring of functional parameters.

Perfused Liver Chips
Liver cells can be cultured on microfluidic "chip-like" devices, which incorporate media perfusion to improve nutrient supply and mimic physiological shear stress (Cui and Wang, 2019). These liver chips typically feature a layered architecture in which layers of hepatocytes and NPCs are cultured adjacently, separated by ECM deposits (Figure 5A). Importantly, molecular phenotype and function of hepatocytes differ along the lobular axis; levels of urea synthesis, gluconeogenesis, and beta-oxidation are highest in the oxygen-rich periportal zone (approx. 15% oxygen) and decrease towards the oxygen-poor region around the central vein (approx. 6% oxygen), whereas glycolysis, bile acid synthesis, and CYP metabolism show inverse profiles (Kietzmann, 2017). Notably, perfusion can be used to recapitulate this complex zonal architecture of the liver, resulting in graded expression of zone-specific genes, such as CYP3A4, CYP2E1, A1AT, and Arginase 1, and recapitulation of the perivenous pattern of acetaminophen hepatotoxicity (Lee-Montiel et al., 2017; Tomlinson et al., 2019).
The LiverChip manufactured by CN Bio consists of multiple units of collagen-coated reactor compartments in which hepatocytes assemble into 3D microtissues, as well as medium reservoirs and integrated micropumps (Figure 5B; Powers et al., 2002; Domansky et al., 2010). PHH in the LiverChip form bile canaliculi and exhibit detectable levels of CYP activity. However, expression levels of various CYPs (CYP2C8 and CYP2E1), drug transporters (BSEP and OATP1B3), and nuclear receptors (CAR and PXR) decline considerably over the course of 10 days (Vivares et al., 2014). Nevertheless, hierarchical clustering of lactate dehydrogenase (LDH) leakage, albumin secretion and urea production distinguished the hepatotoxins fialuridine, acetaminophen, and clozapine from the nontoxins olanzapine and entecavir. In a co-culture configuration of PHH and Kupffer cells, the platform remained metabolically stable for 2 weeks and could model the modulation of CYP3A4 activity by the monoclonal antibody tocilizumab and its effects on simvastatin hydroxy acid pharmacokinetics (Long et al., 2016). Interestingly, the platform has furthermore proven useful for the analysis of interactions of the hepatic niche with infiltrating breast cancer cells (Wheeler et al., 2014; Clark et al., 2017).
The HµREL chip is a polycarbonate microdevice that consists of multiple microfluidically interconnected cell compartments, fluid reservoirs, and peristaltic pumps. This system has been successfully used for the relatively accurate prediction of drug clearance in man (Chao et al., 2009;Novik et al., 2010).
Interestingly, flow seemed to improve CYP1A2, CYP2B6, and CYP3A4 activity only when PHH were co-cultured with NPCs (Novik et al., 2010). A possible explanation for these findings is the secretion of collagen by stellate cells, as collagen has been demonstrated to be essential for the enhanced functional response of hepatocytes cultured under flow (Hegde et al., 2014). Furthermore, the model was recently used to screen 19 compounds (12 DILI positive, 7 DILI negative) with known clinical hepatotoxic liabilities (Novik et al., 2017). Using 100× human Cmax as threshold, the authors successfully flagged 10 out of 12 compounds as hepatotoxic. However, sensitivity was substantially lower (i.e., TC50 values were substantially higher) than in spheroids and MPCC (compare Khetani et al., 2013; Vorrink et al., 2018).
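The 100× Cmax criterion mentioned above can be written as a simple decision rule. The following Python sketch is a minimal illustration under stated assumptions: the function name, the use of "less than or equal" at the boundary, and all compound values are hypothetical and do not reproduce the analysis of the cited study.

```python
def flag_hepatotoxic(tc50_um, cmax_um, margin=100.0):
    """Flag a compound when its in vitro TC50 lies at or below margin x Cmax.

    tc50_um: TC50 in µM, or None if no toxicity was observed in the tested range.
    cmax_um: human plasma Cmax in µM.
    margin:  safety margin; 100 corresponds to the 100x Cmax threshold.
    """
    if tc50_um is None:
        return False
    return tc50_um <= margin * cmax_um


# Hypothetical compounds (placeholder values, not measured data).
panel = {
    "compound_A": {"tc50_um": 50.0, "cmax_um": 1.0},   # threshold 100 µM
    "compound_B": {"tc50_um": 200.0, "cmax_um": 0.5},  # threshold 50 µM
    "compound_C": {"tc50_um": None, "cmax_um": 1.0},   # no TC50 reached
}

flags = {name: flag_hepatotoxic(**values) for name, values in panel.items()}
for name, flagged in flags.items():
    print(name, "flagged" if flagged else "not flagged")
```

A narrower margin lowers the number of false positives but also reduces sensitivity, which is why the chosen multiple of Cmax should be reported alongside any sensitivity figure.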
Emulate Bio has recently presented a PDMS-based chip consisting of two parallel channels coated with a proprietary ECM mixture and separated by a porous membrane. The upper channel contains monolayers of hepatocytes while the lower channel features liver sinusoidal endothelial cells (LSEC) (Peel et al., 2019). Notably, the chip layout is standardized and compatible with automated microscopy to enable high-throughput evaluation of imaging-based endpoints (Figure 5C). Hepatotoxicity assessment of acetaminophen and fialuridine resulted in dose- and time-dependent increases in miR-122 and α-GST release as well as decreases in albumin secretion (Foster et al., 2019). Notably though, hepatotoxicity of fialuridine (TC50 in chip: 84 µM; TC50 in sandwich culture: 45-90 µM) and acetaminophen (TC50 in chip: 2.4 mM; TC50 in sandwich culture: 2-3 mM) was similar to previous reports in sandwich cultured PHH (Bell et al., 2018), and thus, the added value of perfusion and LSEC co-culture for DILI prediction is not immediately clear.
FIGURE 4 | Hollow fiber bioreactors for organotypic hepatocyte culture. (A) Schematic representation of a hollow fiber bioreactor setup. Perfusion capillaries pass through a bioreactor module supplying the cells in the extracapillary space with oxygen and nutrients. (B) Hepatotoxicity of eight drugs was evaluated in bioreactors with hollow fibers made from polysulfone-g-poly(ethylene glycol) (PSf-g-PEG) or polysulfone (PSf) and was compared to cultures in cylindrical gels without hollow fibers. PSf-g-PEG fibers drastically reduced protein absorption and supported cellular functions as well as drug sensitivity. Error bars indicate ± SD; *p < 0.05, **p < 0.01, and ***p < 0.001. Figure modified with permission from (Shen et al., 2010; Williams et al., 2013).
Draper recently presented a liver-on-a-chip system made of oxygen permeable thermoplastic that consists of 96 units, each containing sandwich cultured hepatocytes maintained in recirculating microfluidic conditions (Tan et al., 2019). The authors demonstrated that flow substantially increases albumin secretion and CYP3A4 activity but have not yet presented the platform's performance for pharmacological or toxicological applications. Mimetas provides microfluidic plates consisting of culture chambers with three juxtaposed inlets and outlets in which perfusion is generated by gravity on a rocking device. While this platform has been extensively used for renal and neural cultures, only a single proof-of-concept study has been presented to date, which indicated that hepatic cells (HepaRG) remained largely viable and polarized for 14 days.
FIGURE 5 | [...] Cells in compartment one are continually perfused through the scaffold by a diaphragm micropump, which circulates medium between the two wells. (C) Automated imaging workflow for the Emulate liver chip. The CV7000 automated microscope first performs low-resolution bright field scans of up to eight chips. On the basis of these images, relevant fields of view can be determined, which are subsequently used for high-resolution Z-stack imaging that allows accurate separation of the hepatocyte and endothelial cell layers. (D-F) Platform for the perfusion of hiPS-Hep spheroids developed by Bhatia and colleagues. (E) PDMS chips were designed with C-shaped features of 500 μm that allow entrapment of spheroids across a range of perfusion rates. (F) Albumin secretion could be detected after 1 week of perfusion with flow rates up to 540 µl/h. Error bars indicate SEM; *p ≤ 0.05. Figure modified with permission from (Domansky et al., 2010; Schepers et al., 2016; Lee-Montiel et al., 2017; Peel et al., 2019).
An interesting device was developed by Schepers and colleagues in which spheroidal hiPS-Hep aggregates are entrapped in perfusable C-shaped features (Figures 5D-F). Cells in this model remain viable and secrete albumin for at least 38 days, and the platform tolerates large variations in perfusion rate (from 24 up to 500 µl/h) (Schepers et al., 2016). Jin and colleagues presented a microfluidic device featuring rocker-based gravity-driven perfusion and vascularized liver organoids (Jin et al., 2018). Organoids were formed from HLCs generated by direct transcription factor-mediated reprogramming of murine fibroblasts. Notably, co-culture with HUVECs and culture in a decellularized porcine liver ECM hydrogel improved expression profiles, as well as albumin secretion, urea production, and CYP3A4 activity of these induced hepatic cells compared to monocultures without supportive ECM scaffold. The model exhibited dose-dependent hepatotoxicity of ethanol and acetaminophen, albeit at concentrations >10-fold higher than clinically toxic levels. In addition, the authors present proof-of-concept data showing the integration of the system with gastric and small intestinal organoids for pharmacokinetic studies.
A plethora of other liver chips have also been presented. However, these models are still in their infancy and no comprehensive characterization data have been presented with regard to molecular hepatic phenotypes or predictive power for DILI. For further details on hepatic chip platforms, we refer the interested reader to recent reviews on the topic (Beckwitt et al., 2018; Cui and Wang, 2019).

Bioprinting
3D bioprinting encompasses a variety of methodologically distinct modalities that can create complex biological tissue structures by depositing cells and extracellular factors with high spatial and temporal precision. Droplet-based bioprinting (DBB), also termed inkjet bioprinting, refers to the precise deposition of cells dispensed in small droplets (Gudapati et al., 2016). Advantages of this method are its high resolution, down to 2 µm for hydrogels and down to 50 µm for cells, and its high throughput, which allows printing of cm-sized tissue constructs. For liver cells, DBB has been used to produce heterocellular systems of hepatic (HepG2) and endothelial (HUVEC) cell lines in which albumin secretion and CYP3A4 activity were increased up to threefold compared to HepG2 monocultures (Matsusaki et al., 2013). Furthermore, this modality does not impair the pluripotency of embryonic stem cells or iPSCs and is compatible with post-printing differentiation into HLCs (Faulkner-Jones et al., 2015).
Extrusion-based bioprinting (EBB) results in the deposition of continuous filaments of desired shapes, differing from the individual droplets of inkjet-based bioprinters, and constitutes arguably the most widely used bioprinting modality (Ozbolat and Hospodiuk). This method is highly versatile and compatible with a wide range of bioink properties. Furthermore, it currently constitutes the only modality compatible with the printing of scaffold-free cell systems, such as spheroids (Skardal et al., 2015;Bhise et al., 2016;Kizawa et al., 2017). However, EBB is rather slow and can result in considerable shear stress during the printing process. Furthermore, resolution is more limited than in other bioprinting modalities (>100 µm), rendering EBB incompatible with printing capillary networks. In the context of liver cells, EBB has been successfully used for a variety of applications in pharmacology, toxicology, and regenerative medicine. Multilayered 3D structures of HepG2 cells using alginate scaffolds expressed detectable levels of hepatic markers, such as ALB, TAT, and ASGR1 (Jeon et al., 2017). Yet, increased expression levels of the fetal liver marker AFP indicate that mature hepatic phenotypes are not achieved in this model.
Notably, EBB also has important prospects for tissue engineering. In an effort to incorporate vascularization, Lee et al. (2016) printed primary rat hepatocytes, endothelial HUVECs, and human lung fibroblasts in collagen bioink and demonstrated that the heterotypic interactions with NPCs improved albumin secretion and urea synthesis by >20-fold over the course of 10 days compared to hepatocyte monocultures. Furthermore, HLCs differentiated from mouse iPSCs and bioprinted using EBB into 3D hepatic structures further matured in vivo when transplanted into recipient mice in a liver injury model, suggesting the utility of such approaches for regenerative medicine applications (Kang et al., 2018).
Light-assisted bioprinting (LAB) constitutes the least prevalent bioprinting modality, primarily due to its high cost, difficult handling, and lack of commercially available off-the-shelf solutions. However, due to its high resolution and printing speed coupled with low cell stress during the printing process, LAB constitutes a promising platform for tissue engineering. LAB allows bioprinting of structures mimicking the microscale hexagonal architecture of the human liver in which cells display improved molecular marker signatures compared to conventional 2D monolayer culture (Grix et al., 2018). Furthermore, co-culture of hiPS-Heps in this system together with HUVEC and adipose-derived stem cells resulted in improved hepatic maturation for at least 1 week following bioprinting (Ma et al., 2016).
Bioink is a key component of the bioprinting process that should offer mechanical support for the printed structure while mimicking the natural microenvironment of the encapsulated cells. Recent years have seen tremendous progress in bioink development (Gopinathan and Noh, 2018;Gungor-Ozkerim et al., 2018). For hepatic cells, however, only alginate, Matrigel, gelatin methacrylamide, and decellularized ECM-based bioinks have been employed to date. Alginate is used for its similarity to glycosaminoglycans of human ECM and is fully biocompatible (Axpe and Oyen, 2016). Furthermore, alginate has beneficial gelation properties and is relatively cheap. Alginate encapsulated HepG2 cells exhibited higher urea synthesis compared to 2D monolayer cultures despite reduced cell viability caused by dispensing pressure and nozzle diameter (Chang et al., 2008;Hwang et al., 2010). However, when alginate cellulose nanocrystal hybrid bioink was used for printing of human hepatoma cells in 3D structures resembling liver sinusoids, only minor effects of the encapsulation and printing process on cell viability were observed (Wu et al., 2018).
Promising data have been presented for DILI tests using bioprinted liver models. Specifically, bioprinted livers reproduced hepatotoxicity of acetaminophen at clinically relevant concentrations and successfully distinguished trovafloxacin from its nontoxic structural analogue levofloxacin (Knowlton and Tasoglu, 2016; Nguyen et al., 2016). Furthermore, bioprinted liver has successfully been used for mimicking TGF-β1 and methotrexate induced fibrogenesis (Norona et al., 2019). Combined, these studies indicate that bioprinting constitutes an emerging technology that supports the maintenance of hepatic phenotypes and functionality. Furthermore, the possibility to precisely mimic the architecture of the intact organ in vitro opens a multitude of promising liver-related applications. However, while promising proof-of-concept data have been presented, the added value of bioprinted liver models over other 3D culture models for the prediction of drug metabolism and toxicity has not yet been demonstrated.

Critical Comparison of DILI Prediction Models
While a large number of studies using organotypic and microphysiological liver models have been presented that provide proof-of-concept for the detection of hepatotoxicity, we are only aware of 11 studies that have screened a larger (>10) number of compounds (Table 2). Of these, four models were based on hepatoma cells, four on PHH, and three on hiPS-Heps. Most hepatotoxicity screens were published using spheroids (8/11), whereas only two and one used MPCC and liver chips, respectively. Notably, as of yet, no DILI prediction study with more than 10 compounds has been published using bioreactors or bioprinting modalities.
Importantly, whereas the utilized endpoints are homogeneous across studies (10/11 studies used ATP quantifications as proxy for cell viability), the analyzed test compounds were highly dissimilar. This is particularly important with regard to the definition of human hepatotoxicity. Most studies focused on compounds with specific hepatotoxic liability, whereas a few others used drugs that were generally cytotoxic, such as anthracyclines, taxanes, or colchicine, which complicates model comparisons (Sirenko et al., 2016). Analogously, negative control compounds range from non-hepatotoxic structural analogues of hepatotoxic drugs, which can be considered the gold standard for test specificity, to commonly used media additives (e.g., streptomycin or dexamethasone) and dietary components (such as sucrose or sorbitol).
The compound that was most commonly used for the evaluation of hepatotoxicity was acetaminophen. Therapeutic exposure levels of acetaminophen are 139 µM, whereas overdose plasma concentrations higher than 0.7 mM require immediate intervention (Vale and Proudfoot, 1995). Notably, only a few liver models could detect acetaminophen toxicity at clinically relevant concentrations (Table 3). PHH spheroid cultures were most sensitive and multiple studies have been presented in which acetaminophen toxicity was detected upon repeated exposure for 2 weeks with TC50 values <1 mM. Heterogeneous results were obtained for HepaRG spheroids. Whereas some studies reported very high sensitivity (TC50 around 1 mM; Leite et al., 2016) already after a 24-h exposure, other reports indicated considerably lower sensitivity (7.6 mM; Wang et al., 2015). HepG2 spheroids were considerably less sensitive with TC50 values of 7.2 mM after 4 days of repeated exposures. Lastly, two studies in hiPS-Hep spheroids did not detect acetaminophen toxicity within the analyzed concentration range (up to 10-20 mM).
Interestingly, hiPS-Hep cultured as MPCC were able to detect acetaminophen toxicity, albeit at relatively high concentrations (TC50 = 4.5 mM), indicating that this culture paradigm might be specifically useful for the cultivation of stem cell-derived cells. However, acute exposure of PHH-MPCC did not result in relevant toxicity (TC50 = 35 mM). Liver-on-a-chip studies using acetaminophen as a benchmark have only recently been published and exhibit modest sensitivity (TC50 = 2-5.59 mM). Importantly, model sensitivity to acetaminophen increased overall with increasing exposure time up to 2 weeks, whereas acetaminophen hepatotoxicity is clinically most relevant in the context of acute acetaminophen poisoning, which occurs within hours. Acetaminophen depletes the cellular reduced glutathione (GSH) stores and excessive necrosis only manifests once GSH levels drop below 40% (Mitchell et al., 1973). The lack of acetaminophen toxicity at clinical overdose concentrations in the acute setting could be at least in part due to high levels of bioactive supplements in common culture media with antioxidant properties, such as ascorbic acid (vitamin C), retinol (vitamin A), and α-tocopherol (vitamin E), which might delay hepatocellular GSH depletion.
In conclusion, PHH spheroids appear to be the most sensitive model system for the detection of acetaminophen toxicity, followed by liver-on-chip systems. MPCC seem to be particularly useful for the functional support of hiPS-Heps, whereas the sensitivity of PHH-MPCC to acute single dose exposures to acetaminophen is surprisingly low.
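To make the comparison of acetaminophen sensitivities more tangible, the Python sketch below expresses each TC50 value quoted in this section as a multiple of the 0.7 mM overdose plasma concentration. The dictionary keys and the helper name are illustrative assumptions, "around 1 mM" is entered as 1.0 mM, and ranges or upper bounds (PHH spheroids, liver chips) are omitted because they are not single values; exposure times differ between entries, so the ranking is indicative only.

```python
OVERDOSE_MM = 0.7  # overdose plasma concentration cited in the text (mM)

# TC50 values (mM) as quoted in this section; labels are shorthand.
tc50_mm = {
    "HepaRG spheroids (Leite et al., 2016)": 1.0,  # reported as "around 1 mM"
    "HepaRG spheroids (Wang et al., 2015)": 7.6,
    "HepG2 spheroids": 7.2,
    "hiPS-Hep MPCC": 4.5,
    "PHH-MPCC (acute exposure)": 35.0,
}


def fold_over_overdose(tc50, overdose=OVERDOSE_MM):
    """Return TC50 expressed as a multiple of the clinical overdose level."""
    return tc50 / overdose


folds = {model: fold_over_overdose(value) for model, value in tc50_mm.items()}

# List models from most to least sensitive (lowest fold first).
for model, fold in sorted(folds.items(), key=lambda item: item[1]):
    print(f"{model:40s} TC50 = {tc50_mm[model]:5.1f} mM  ({fold:.1f}x overdose level)")
```

A fold value close to 1 indicates that toxicity is detected near clinically relevant overdose concentrations, whereas large values indicate that the model underestimates acetaminophen liability.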

Guidance for the Selection of Animal Safety Models
The safety of drugs and drug candidates can vary markedly between species and toxicity of the hepatobiliary system constitutes one of the areas with most pronounced species differences (Olson et al., 2000). Hepatotoxicity of acetaminophen constitutes arguably the most studied example. Although the metabolic fate of acetaminophen is similar between species, hepatotoxicity manifests in mice and humans whereas rats and cynomolgus monkeys are protected (McGill et al., 2012;Yu et al., 2015). While safety tests in at least one rodent and one non-rodent species are strongly recommended by regulators before a compound can progress into clinical stages, the choice of appropriate animal model is less controlled and often based on historical in-house data or anecdotal evidence.
Importantly, species-specific differences can be mimicked in 3D primary hepatocyte cultures. A recent study demonstrated that primary hepatocytes from commonly used preclinical rodent (mouse and rat) and non-rodent (minipig and rhesus monkey) animal models cultured as spheroids can recapitulate species-specific toxicity at clinically relevant concentrations (Vorrink et al., 2018). The system recapitulated acetaminophen hepatotoxicity in mouse and human hepatocyte spheroids, whereas hepatocytes from rats and primates in identical conditions were protected. These data show that cross-species comparisons of 3D primary hepatocyte spheroids can provide guidance for the selection of the preclinical model species that most closely recapitulates human liver toxicity for the specific compound in question. In contrast, only minor species differences were observed in the HµREL liver chip and acetaminophen was flagged as hepatotoxic in both human and rat hepatocytes (Novik et al., 2017).

Metabolite Identification
Hepatic biotransformation of drugs can generate circulating metabolites, which may possess a safety profile different from the parent molecule. Importantly, the panorama of generated metabolites can differ both quantitatively and qualitatively across species. Thus, the identification and safety assessment of metabolites that are found at disproportionate levels in humans compared to animals constitute an important facet of preclinical toxicology. Current regulatory guidance defines disproportionate metabolites that mandate safety evaluation as metabolites that constitute >10% of circulating drug levels in humans and that are either not detected in animals or for which animal exposure is at least 50% lower [ICH M3 (R2), 2010; The US FDA, 2016]. However, certain metabolite classes, such as stable glutathione or N-acetylcysteine conjugates, might be exempt on a case-by-case basis (Luffer-Atlas and Atrakchi, 2017).
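The definition of a disproportionate metabolite summarized above can be read as a two-step check. The Python sketch below encodes that reading only; it is a simplified illustration rather than an implementation of the regulatory guidance, and the function name, argument names, and example values are hypothetical.

```python
def is_disproportionate(human_fraction, human_exposure, animal_exposure):
    """Apply the two criteria as summarized in the text.

    human_fraction:  metabolite as a fraction of circulating drug levels in humans.
    human_exposure:  human exposure to the metabolite (any consistent unit).
    animal_exposure: animal exposure in the same unit, or None if not detected.
    """
    # Criterion 1: metabolite must exceed 10% of circulating drug levels in humans.
    if human_fraction <= 0.10:
        return False
    # Criterion 2a: metabolite not detected in the animal species.
    if animal_exposure is None or animal_exposure == 0:
        return True
    # Criterion 2b: animal exposure at least 50% lower than human exposure.
    return animal_exposure <= 0.5 * human_exposure


# Hypothetical examples (placeholder values).
print(is_disproportionate(0.15, 100.0, None))   # human-specific metabolite
print(is_disproportionate(0.15, 100.0, 40.0))   # animal exposure 60% lower
print(is_disproportionate(0.15, 100.0, 80.0))   # adequately covered in animals
print(is_disproportionate(0.05, 100.0, None))   # below the 10% threshold
```

Case-by-case exemptions, such as those mentioned for stable glutathione or N-acetylcysteine conjugates, are deliberately not modeled here.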
3D culture models in which human hepatic cells maintain physiological levels of drug metabolizing enzymes for days to weeks hold promise to improve metabolite profiling compared to conventional systems, such as liver microsomes, liver S9 fractions, hepatocyte suspension cultures, and liver slices (Anderson et al., 2009; Dalvie et al., 2009). Scaffold-free spheroids have been successfully employed to identify phase I and phase II metabolites of diclofenac, midazolam, acetaminophen, and propranolol (Ohkura et al., 2014). Furthermore, the authors could show that their model detected the human specific metabolites lamotrigine-N-glucuronide and salbutamol-4-O-sulfate, which have not been detected in previous systems. Similarly, PHH MPCC identified 77% (43/56) of the human relevant metabolites of 27 drugs after 7 days of incubation, whereas hepatocyte suspension cultures (55%), S9 fractions (46%), and liver microsomes (39%) detected considerably fewer (Wang et al., 2010).
Metabolic profiles obtained using hollow fiber bioreactors have been demonstrated to accurately mimic the metabolic fate of compounds in vivo. An impressive example is the metabolism of AZD6610, which was primarily hydroxylated in PHH bioreactor cultures and in vivo, whereas glucuronidation was the major pathway in HepaRG cells in the same bioreactor setup. In the same study, the authors identified all major human relevant diclofenac metabolites in both PHH and HepaRG bioreactors.
Phase I and phase II diclofenac metabolites were also identified in the CN Bio LiverChip (Sarkar et al., 2017). This platform was furthermore successfully used for metabolic profiling of hydrocortisone (Sarkar et al., 2015). Moreover, Hultman et al. evaluated metabolite formation of quinidine, S-warfarin, metoprolol, acetaminophen, lorazepam, and oxaprozin in hepatocyte suspension cultures and the static HµREL liver chip (Hultman et al., 2016). Notably, of the 20 metabolites identified using HµREL, only 9 were identified in suspension cultures of the same donor. Similar results were obtained in two other studies for the metabolic profiles of timolol, meloxicam, linezolid, and XK469 (Burton et al., 2018), as well as for five additional unspecified test compounds (Cassidy and Yi, 2018). Unfortunately, the sets of test compounds rarely overlap between the different culture paradigms and thus the currently available data do not allow a direct methodological cross-comparison.

Elucidation of Hepatotoxicity Mechanisms
Comprehensive molecular profiling using omics technologies constitutes a powerful tool box for deciphering the molecular events leading to overt hepatotoxicity of a given drug or drug candidate. However, such studies in 2D monolayer cultures are restricted to the investigation of acute toxicity events at high exposure levels (Grinberg et al., 2014;El-Hachem et al., 2016;Ogese et al., 2019). Long-term stable 3D hepatocyte culture paradigms offer the opportunity to comprehensively profile molecular changes upon long-term exposure at clinically relevant subtoxic doses. Ware and colleagues used PHH in MPCC as the experimental paradigm to evaluate transcriptomic alterations upon exposure to the hepatotoxins troglitazone, nefazodone, ibufenac, and tolcapone or their nontoxic structural analogues rosiglitazone, buspirone, ibuprofen, and entacapone . Interestingly, transcriptomic changes in cells treated with hepatotoxins were consistently higher than for the nontoxic analogues, and multiple transcripts were identified that were unique to the liver toxic compounds. Specifically, the authors identified bile acid biosynthesis, fatty acid metabolism, and PPAR signaling as pathways modulated by troglitazone, which correspond to suspected mechanisms of troglitazoneinduced hepatotoxicity in vivo (Smith, 2003).
In a parallel study, Bell et al. (2017) analyzed the transcriptomic alterations elicited by three hepatotoxic compounds with different toxicity mechanisms (chlorpromazine, amiodarone, and aflatoxin B1). Importantly, the authors identified both unique and shared toxicity signatures for the compounds that accurately captured the molecular changes observed in patients. The cholestatic agent chlorpromazine significantly downregulated bile acid biosynthesis, mirroring the inhibition of CYP7A1 observed in patients with cholestasis, while amiodarone increased PPAR activity, aligning with perturbed fatty acid metabolism in vivo. In contrast, aflatoxin B1 induced genes associated with nucleotide excision repair and DNA replication, matching the genotoxicity of aflatoxin. In summary, these data indicate that organotypic liver models constitute promising tools for unravelling the molecular underpinnings of mechanistically diverse hepatotoxic liabilities.

CONCLUSIONS AND FUTURE PERSPECTIVES
Since the publication of the first seminal studies regarding preclinical hepatotoxicity tests using in vitro systems, a plethora of different models have been presented that differ in culture paradigm, cell model, medium composition, and substratum. Particularly, the development of organotypic and microphysiological liver models using primary liver cells has drastically improved the predictive power of hepatotoxicity tests and a variety of spin-off companies have been founded that aim to facilitate their commercial dissemination (Table 4). Furthermore, the possibility to use cellular material from donors with genotypes, phenotypes, or pathologies of interest combined with the possibility to stably maintain these cells for multiple weeks has opened up a multitude of applications beyond the testing of acute liver toxicity (Ingelman-Sundberg and Lauschke, 2018). As outlined in this review, this progress has been made possible by advancements in the fields of cell biology, microengineering, and materials science and, as these fields continue to develop, we expect further breakthroughs. Specifically, we anticipate that further optimization of emerging technologies, such as 3D bioprinting, will refine the use of hepatic models for various aspects of liver physiology. Furthermore, the integration of organotypic liver systems with other tissue models in perfusion devices constitutes an important avenue towards in vitro systems toxicology (Ewart et al., 2018;Prantil-Baun et al., 2018;Ronaldson-Bouchard and Vunjak-Novakovic, 2018;Wang et al., 2018).
Table 4 (row fragment and footnote): Bioprinting (Nguyen et al., 2016;Norona et al., 2019). Companies are shown in alphabetical order. Note that only references pertaining to primary human liver models are shown.
Frontiers in Pharmacology | www.frontiersin.org | September 2019 | Volume 10 | Article 1093
Notably, while the field of organotypic liver models is developing very rapidly, the adoption of these models into drug development pipelines and regulatory advice lags behind, primarily because systematic validation and benchmarking of most of these novel systems are currently lacking. Comprehensive evaluation of molecular phenotypes and direct comparison to human livers provide important information for judging the phenotypic relevance of a model system. Most importantly, there is a need for data that allow drug developers and regulators to directly compare the predictive power of the respective model to the current state of the art and to other emerging model systems. To this end, benchmarking of model sensitivity and specificity using a sufficiently large standardized set of prototypic hepatotoxic compounds with diverse toxicity mechanisms would provide important arguments for progressing from proof-of-concept studies towards regulatory acceptance (Roth and Singer, 2014).
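To make the benchmarking metrics named above concrete, the following minimal sketch computes sensitivity and specificity from a set of binary model calls. It is illustrative only: the `BenchmarkResult` type, the function name, and the placeholder compounds and calls are assumptions introduced here and do not represent data from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One compound in a hypothetical benchmarking set."""
    compound: str
    hepatotoxic_in_humans: bool  # reference classification (assumed known)
    flagged_by_model: bool       # binary call made by the in vitro model


def sensitivity_specificity(results):
    """Return (sensitivity, specificity) for a list of BenchmarkResult."""
    tp = sum(r.hepatotoxic_in_humans and r.flagged_by_model for r in results)
    fn = sum(r.hepatotoxic_in_humans and not r.flagged_by_model for r in results)
    tn = sum(not r.hepatotoxic_in_humans and not r.flagged_by_model for r in results)
    fp = sum(not r.hepatotoxic_in_humans and r.flagged_by_model for r in results)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one toxic and one nontoxic compound")
    sensitivity = tp / (tp + fn)  # fraction of hepatotoxins correctly flagged
    specificity = tn / (tn + fp)  # fraction of nontoxic compounds correctly cleared
    return sensitivity, specificity


# Placeholder compounds and calls, invented purely for illustration.
example = [
    BenchmarkResult("toxic_A", True, True),
    BenchmarkResult("toxic_B", True, True),
    BenchmarkResult("toxic_C", True, True),
    BenchmarkResult("toxic_D", True, False),
    BenchmarkResult("nontoxic_A", False, False),
    BenchmarkResult("nontoxic_B", False, False),
    BenchmarkResult("nontoxic_C", False, False),
    BenchmarkResult("nontoxic_D", False, True),
]

sens, spec = sensitivity_specificity(example)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting both values on the same standardized compound set, rather than a single accuracy figure, is what would allow different culture paradigms to be compared directly, because the proportion of toxic to nontoxic compounds may differ between published test sets.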

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct, and intellectual contribution to the work, and approved it for publication.

FUNDING
VML is co-founder and owner of HepaPredict AB. The work in the authors' laboratory is supported by the Swedish Research Council (grant agreement numbers: 2016-01153 and 2016-01154), by the Strategic Research Programme in Diabetes at Karolinska Institutet, by the Lennart Philipson Foundation, and by the Harald and Greta Jeansson Foundation. No writing assistance was utilized in the production of this manuscript.