Emergent Transcriptomic Technologies and Their Role in the Discovery of Biomarkers of Liver Transplant Tolerance

Liver transplantation offers a unique window into transplant immunology due, in part, to the considerable proportion of recipients who develop immunological tolerance to their allograft. Biomarkers able to identify and predict such a state of tolerance, and thereby to establish suitable candidates for the minimization of hazardous immunosuppressive therapies, are not only of great potential clinical benefit but might also shed light on the immunological mechanisms underlying tolerance and rejection. Here, we review the emergent transcriptomic technologies serving as drivers of biomarker discovery, we appraise efforts to identify a molecular signature of liver allograft tolerance, and we consider the implications of this work for the mechanistic understanding of immunological tolerance.


Introduction
The liver represents a unique window to the immune system. Unlike other transplanted organs, it exhibits immunoregulatory, tolerogenic properties, enabling an allograft to be more readily spontaneously accepted. The phenomenon of operational tolerance, i.e., stable allograft function in spite of complete discontinuation of immunosuppressive therapy, while rarely achieved in cases of renal transplantation, for instance, is relatively commonplace in liver transplant recipients. Indeed, the prevalence of operational tolerance following liver transplantation appears to be far greater than previously appreciated. Until recent clinical trial evidence to the contrary, an average estimate was that approximately 20% of liver allograft recipients were able to successfully be weaned off immunosuppression, and thereby to achieve a state of induced operational tolerance (1,2). Benítez and colleagues, however, showed that a remarkable 42% of 98 liver allograft recipients undergoing weaning of immunosuppression achieved operational tolerance. Furthermore, the propensity to tolerance was noted to develop over time, with those who had had their graft for 10.6 years or more achieving tolerance in 79.2% of cases (3). While these results should be taken with some caution, the inescapable implication is that a significant proportion of liver transplant recipients, particularly if in the second decade of graft survival, are unnecessarily subjected to immunosuppressive therapy and the significant risks associated with it. The incentive, therefore, to search for a biomarker by which to identify patients amenable to drug minimization, becomes clear. Furthermore, the pursuit of such biomarkers might aid in the fuller characterization of the immunological phenotype associated with tolerance, and so offer a mechanistic understanding of the processes by which tolerance is achieved and might be induced.
Major advances have been made in recent years in the fields of genetic and molecular biology. Large international collaborations such as the human genome and proteome projects enabled the further development of high-throughput technologies. Broadly, new technologies give rise to new investigative paradigms. Reductionist scientific approaches have been overtaken, to an extent, by the generation of vast biological datasets enabling the study of complete sets of molecules. The umbrella neologism "omics" has appeared in order to describe these changes and to classify emerging fields: metabolomics, proteomics, transcriptomics, and so on. The systems biology approach has developed to offer a computational and mathematical framework enabling the integration and analysis of data from these seemingly disparate fields. The application of these developing fields to the arena of transplantation, with a view to the personalized treatment of patients, has recently been dubbed "transplantomics" (4,5).
Liver transplant tolerance transcriptomic biomarkers Frontiers in Immunology | www.frontiersin.org
Regarding the identification of biomarkers of tolerance, of all the emerging work in transplantomics, the high-throughput measurement of the transcriptome has shown the greatest promise and formed a focus of research. Here, we review the application of transcriptomic technologies to the unique window proffered by liver transplantation tolerance. We set out an overview of the technologies and of the associated analytical tools. We review recent progress in developing novel biomarkers of tolerance, and look at their application in trials of immunosuppression withdrawal. We discuss the limitations and pitfalls associated with high-throughput transcriptomic research. Finally, we consider the implications of these tools on our mechanistic understanding of operational tolerance and how this might guide future therapeutic developments.

Principles of Transcriptome Analysis
The "transcriptome" describes the complete set of messenger RNA (mRNA) and non-coding RNA (ncRNA) transcripts, the latter including microRNA (miRNA), small nuclear RNA (snRNA), and small nucleolar RNA (snoRNA), among others. Comprehensive understanding of the transcriptome must also take into consideration further complexities, for example splicing isoforms, gene-fusion transcripts, post-transcriptional modifications, and epigenetic controls. Thus, "transcriptomics" can be understood as the large-scale study of transcriptional products as well as their regulation and modification (5,6).

Transcriptome Profiling
Transcriptome profiling can be subdivided, for simplicity, into two general approaches: the candidate gene strategy focuses on single gene transcripts, while high-throughput approaches allow for the simultaneous measurement of thousands of transcripts. The first candidate gene-based studies utilized the Northern blot (7). In this method, RNA is separated by electrophoresis and fixed on a solid support, and the presence and abundance of the RNA species of interest are then deduced by hybridization with radioactively labeled complementary nucleic acid probes. The low throughput and the requirement for large quantities of input RNA made this technique cumbersome. Reverse transcriptase polymerase chain reaction (RT-PCR) is now the method most commonly used for candidate gene transcript measurement and has broad applications in the clinical setting (8). In this approach, mRNA is reverse transcribed to complementary DNA (cDNA) and amplified by PCR with primers specific for the gene of interest. Quantitative measures of mRNA abundance are made possible by monitoring the accumulation of PCR product (9). While the method requires only small quantities of input RNA and is robust, cost-effective, and rapid, the throughput remains in the order of hundreds of known transcripts at a time and so is not amenable to transcriptome-wide investigations (10).
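To illustrate how monitoring PCR product accumulation can yield a relative abundance estimate, the following minimal Python sketch applies the widely used comparative threshold-cycle (2^-ΔΔCt) calculation. It is an illustrative example rather than a method drawn from the studies reviewed here: it assumes approximately 100% amplification efficiency and a single, stably expressed reference gene, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript in a sample versus a control (2^-ddCt).

    Assumes ~100% PCR efficiency, so that each cycle doubles the product.
    """
    # Normalize the target to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between the two conditions.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses the threshold two cycles earlier
# in the sample than in the control, with an unchanged reference gene.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)  # 4.0
```

A difference of two cycles corresponds to a fourfold difference in starting template only under the stated efficiency assumption; in practice, efficiency would need to be checked for each primer pair.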
Microarray technology, on the other hand, has enabled the rapid, simultaneous measurement of the whole transcriptome. mRNA is hybridized to an array of oligonucleotide or cDNA probes that are robotically spotted onto a solid support chip, thereby allowing the identity of each probe to be defined by its location. Hybridization intensity to a particular probe is related to the abundance of corresponding transcript (11,12). Microarray technology has been applied to the gamut of transplantation biology over the last decade, including studies of acute and chronic rejection, and more relevant herein, the understanding of immune tolerance and identification of biomarkers. Microarrays have become the best standardized, most affordable, and widely accessible of the high-throughput omics technologies (13).

Microarray Experimental Design and Analysis
A typical microarray generates expression levels for thousands of genes, thereby producing vast quantities of data. The major challenge is to analyze and understand these data, to distinguish true from misleading signals on the one hand, and to uncover clinically relevant findings on the other. The steps typically involved in a microarray experiment are (i) experimental design, (ii) sample preparation and processing, and (iii) data analysis and interpretation. Careful experimental design is crucial. It depends heavily on the array technology used and, of course, on the research objectives (14,15). Objectives are often characterized as either "class comparison" or "class prediction" (16). In this setting, the former describes attempts to identify genes differentially expressed between operationally tolerant recipients and another comparison group. The latter involves the development of multi-gene formulae able to predict which patients might exhibit tolerance based on their expression profiles. The high-dimensional datasets generated cannot be adequately analyzed with conventional comparative statistics. The complexity of analysis and the potential pitfalls require a team approach and a good understanding of the relevant software required for the steps of quality control, normalization, clustering, classification, and pathway analyses (8,17).

More Data Herald More Challenges…
Rigorous quality control criteria help to ensure that the data collected from arrays are of high quality, reproducible, and comparable. The MicroArray Quality Control project (MAQC), an unprecedented, community-wide effort to appraise microarray reliability and quality control metrics, reported that, with careful experimental design and appropriate data transformation and analysis, data can be reproducible and comparable across laboratories, institutions, and researchers (18). A number of commercial software packages have been developed to aid the quality control process (19–22). Specialist software is also available to aid with data normalization, a crucial step in the conversion of raw data into scaled relative expression levels (8,23,24). Statistical packages are also utilized for calculating differential expression, controlling for false positives, selecting significance cut-offs, clustering genes thought to be similar or co-regulated, and performing final pathway analyses, which enable the identification of gene sets associated with specific biological functions (25–27).
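As an illustration of what a normalization step can involve, the sketch below implements quantile normalization, one commonly used approach that forces every array to share the same distribution of intensities. The choice of method is an assumption made for the example, since the text does not specify a particular algorithm; the function name and input values are hypothetical, and tied values are not averaged in order to keep the sketch short.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (each a list of expression values).

    All samples must measure the same number of genes, in the same order.
    """
    n_genes = len(samples[0])
    if any(len(sample) != n_genes for sample in samples):
        raise ValueError("all samples must have the same number of genes")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(sample) for sample in samples]
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_genes)
    ]

    normalized = []
    for sample in samples:
        # Gene indices ordered from lowest to highest expression.
        order = sorted(range(n_genes), key=lambda gene: sample[gene])
        result = [0.0] * n_genes
        for rank, gene in enumerate(order):
            # Replace each value with the reference value of the same rank.
            result[gene] = reference[rank]
        normalized.append(result)
    return normalized


# Two hypothetical arrays measuring the same three genes.
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(raw))  # [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization, both arrays contain the same set of values, and only the ranking of genes within each array is retained.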
Much of the complexity involved in the statistical analyses stems not only from the high numbers of genes measured per sample (the "curse of dimensionality") but also from the disproportion between this and the limited numbers of samples available for testing (the "curse of scarcity"), a difficulty often faced in biomarker research. This is overcome, in part, through the use of adjusted p-values (q-values) that control the false discovery rate (5). Tools for analysis are often free to download and are now widely used. They include significance analysis of microarrays (SAM), GenePattern, and GenMAPP (5). Despite robust analytical tools, an undiscerning researcher can erroneously use data to "discover" sets of genes that are able to differentiate the samples on which the gene algorithm modeling was based, even when the data are completely random. This problem should be circumvented by ensuring that a gene model is tested on a validation group that is independent of the training set used to create the model in the first place. This approach is more desirable than the cross-validation techniques sometimes employed (28–30). Further, technical validation of microarray results on a different transcriptional platform, usually RT-PCR, is also recommended to minimize inter- or intra-platform variability in hybridization noise that may arise between batches or laboratories.
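The adjustment of p-values for multiple testing can be sketched with the Benjamini-Hochberg procedure, a standard way of controlling the false discovery rate. This is a minimal illustration under the assumption that this procedure is the one applied; the tools named above implement their own, often more elaborate, estimators, and the p-values used here are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four genes.
pvals = [0.01, 0.04, 0.03, 0.005]
print([round(q, 3) for q in benjamini_hochberg(pvals)])  # [0.02, 0.04, 0.04, 0.02]
```

Genes whose adjusted value falls below a chosen threshold (for example 0.05) would be reported as differentially expressed, with the expected proportion of false positives among them held at or below that threshold.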
In order to verify the reproducibility of analyses and to corroborate clinical validity, public microarray databases serve as essential repositories. In the transplant setting, where studies often include only small numbers of recipients, these resources are especially important. The Functional Genomics Data (FGED) Society (formerly the MGED Society), a non-profit, volunteer-run organization promoting the sharing of high-throughput research data, helped to define the Minimum Information About a Microarray Experiment (MIAME) guidelines for data content standards. The Society also set the standard data exchange format, known as the MicroArray Gene Expression Markup Language (MAGE-ML). Thorough reviews of the numerous databases in existence have been set out in the literature (31).
Microarray data output is necessarily dependent on the quality of the original biological samples. RNA is considerably more susceptible to rapid enzymatic degradation than DNA, thereby making efficient processing and appropriate storage using robust protocols essential. Microarrays offer snapshots of gene expression. The kinetics of transcripts and the variability of changing levels of expression in relation to their baseline remain little understood and so are not amenable to statistical interpretation (32,33). Matters are further complicated by tissue heterogeneity, as is the case in blood samples, for instance. This heterogeneity limits the anatomical resolution of the microarray approach, in that it is difficult to know which cells' gene expression profiles are being analyzed. Cell sorting and microdissection are ways to tackle this difficulty, as is the application of statistical deconvolution methods such as the cell-specific significance analysis of microarrays (csSAM) (34). While peripheral blood has been at the forefront of efforts to identify biomarkers, the possibility of interrogating RNA extracted from paraffin-embedded biopsies is a useful addition to investigative efforts.
It becomes clear, then, that to discern biological fact from mere noise, it is essential that due attention is paid to the analytical complexities involved in microarray interpretation. Although, as we will see, microarray profiling has yielded important data in the pursuit of biomarkers of tolerance, and the technology is becoming more commonplace in transplantation research, the promise of emerging next-generation sequencing (NGS) technologies is likely to eclipse many microarray applications. In essence, NGS involves the sequential identification of the bases of small fragments of DNA from signals that are emitted when each fragment is resynthesized from a DNA template strand. By extending this process across millions of reactions in parallel, the technology enables rapid sequencing of large stretches of DNA base-pairs spanning entire genomes (35).
In part, the promise of NGS stems from sidestepping some of the aforementioned problems inherent in microarray technology. NGS is highly reliable and has a greater dynamic range, as it directly quantifies discrete digital sequencing readouts as opposed to relying on hybridization steps. Loss of specificity due to cross-hybridization is controlled; the detection of rare and low abundance transcripts is made more achievable; the unbiased detection of novel transcripts is made possible, since the transcript-specific probes required by microarrays become redundant; and errors in probe design, which are relatively common in microarray chips, are avoided. In addition to these technical considerations, NGS technology is advancing at such a pace that the prospect of "sequencing everything" (genome, epigenome, transcriptome) in a timely and cost-effective manner is well within reach. In the 4 years between 2007 and 2011, a single sequencing run's output increased 1000×, far outstripping Moore's law, while the cost of sequencing the entire genome has fallen from over 150,000 USD in 2009 to less than 5000 USD in 2014 (36). Of course, NGS presents its own technological and bioinformatics challenges, which have been comprehensively reviewed elsewhere (37,38).
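The comparison with Moore's law can be made concrete with a short calculation. The sketch below uses only the figures quoted above and assumes, for the sake of the comparison, that Moore's law corresponds to one doubling every two years; the function name is hypothetical, and the cost figures are treated as exact although the text gives them as bounds.

```python
import math


def doubling_time_years(fold_change, years):
    """Years needed for one doubling (or halving), given a fold change over a period."""
    return years / math.log2(fold_change)


# Figures quoted in the text.
output_fold, output_years = 1000.0, 4.0                     # run output, 2007 to 2011
cost_start, cost_end, cost_years = 150_000.0, 5_000.0, 5.0  # USD, 2009 to 2014

# Assumption: Moore's law taken as one doubling every two years.
moore_fold = 2.0 ** (output_years / 2.0)
cost_fold = cost_start / cost_end

print(moore_fold)                                              # 4.0
print(round(doubling_time_years(output_fold, output_years), 2))  # 0.4
print(cost_fold)                                               # 30.0
print(round(doubling_time_years(cost_fold, cost_years), 2))      # 1.02
```

On these assumptions, sequencing output doubled roughly every five months, against a fourfold increase expected from Moore's law over the same four years, while the cost of a genome halved approximately once a year.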

Identification of Tolerance Biomarkers
Much hope has been placed upon transcriptomic technologies as the drivers of a "new era of individualized therapy" (4). The application of these technologies in the discovery of novel diagnostic and predictive markers has spanned diverse transplantation research fields, including the development of predictors of allograft risk, the identification of biomarkers of acute and chronic allograft injury, the assessment of organ suitability and viability during the preservation period, and, forming the focus here, the discovery of biomarkers of tolerance.
Much of this work is in its infancy; transcriptomic investigation of biomarkers of liver allograft tolerance began less than a decade ago (39). Already, though, biomarker-based diagnostic tests have gained regulatory approval and have reached the market (40–42). A diagnostic kit based on an 11-transcript set identified with microarray technology is used to non-invasively identify rejection in heart transplant recipients (41). As we will see, biomarkers of liver transplant tolerance have also yielded extremely promising results, showing good potential for clinical translation in the near future.
Martínez-Llordella and colleagues were the first to use microarray technology for the gene expression profiling of blood samples from operationally tolerant liver transplant patients (39). This retrospective, cross-sectional study compared 16 operationally tolerant recipients to 16 recipients failing to undergo immunosuppression withdrawal, and found 462 positively and 166 negatively regulated genes. Functional analysis revealed that tolerance expression profiles were enriched in gamma-delta T (γδ T) cells and natural killer (NK) cells (see Table 1). Genes involved in mRNA processing, protein biosynthesis, DNA repair, cell cycle control, interleukin-2 receptor signaling, and transcription regulation were also noted to be differentially expressed. While this was a first step toward proof of principle, there were considerable methodological limitations. One difficulty common to all studies of tolerant patients is the selection of an appropriate control group. Stable transplant patients are sometimes used as a control, but the immunosuppressive medications they receive may skew any comparative interpretations.
Another approach is to use healthy controls to circumvent the concerns with immunosuppressive therapy, but in this case the absence of transplantation becomes a significant limitation in itself. Without a perfect control population, one reasonable approach is to use multiple control groups. In an attempt to address other methodological limitations of their first study, Martínez-Llordella's group followed up with a more robust analysis of a larger cohort of patients and incorporated both training and validation sets, as well as the necessary cross-validation checkpoint procedures (43). Of 1932 differentially expressed genes identified in this follow-up study, RT-PCR validation of 68 promising candidate genes was performed, with good correlation shown between platforms. Utilizing a novel modeling approach based on the misclassification-penalized posterior (MiPP) algorithm, three optimally parsimonious gene signatures were identified, containing 2, 6, and 7 genes, respectively, and altogether comprising 12 different genes (see Table 1). These signatures were shown to be capable of accurately predicting the clinical status not only of the group of recipients from whom they were derived but also of an independent validation cohort of 23 patients. When these gene signatures were evaluated against a cohort of stable recipients on maintenance immunosuppression, they predicted that 26% of these patients would be tolerant, a prediction that is roughly equivalent to the prevalence of tolerance indicated by the literature (1,44,45). Mechanistic interpretations of these findings were hampered by the retrospective study design and the lack of simultaneous molecular analyses of allograft tissue. These issues were addressed in a prospective, multi-center immunosuppression withdrawal trial in liver transplant recipients, as reported by Bohne and colleagues (48).
Of 75 recipients completing the trial, 42 underwent rejection, while 33 were successfully weaned off immunosuppression, thereby achieving a tolerant state. Microarray and RT-PCR analyses of both peripheral blood and the grafts themselves were conducted. While previous conclusions regarding peripheral blood mononuclear cell (PBMC) enrichment in NK and γδ T cells were corroborated, of special interest is the fact that, in side-by-side comparisons, liver tissue-derived transcriptional signatures proved more robust, accurate, and reproducible than PBMC-derived signatures. The intragraft expression profile was mainly enriched with genes involved in iron homeostasis, and showed no overlap with genes identified from PBMC. Iron redistribution is a well-established antimicrobial strategy and has been shown to play a significant part in pathogenic infection of the liver, where iron overload is associated with poorer outcomes (50–52). Whether this is a property mediated through effects on pathogen growth or on the host immune response itself is unclear. What this study is the first to highlight, though, is the possibility that the dampening of alloreactive immune responses required for the establishment of tolerance may be dependent on the iron-store status of the allograft.
Liver biopsy tissue was also analyzed in a more recent study by Zhao and colleagues looking at a cohort of pediatric patients (53). While previous work had already identified the enrichment of γδ T cell subpopulations and the genes associated with their expression in the peripheral blood of tolerant recipients, Zhao's group examined these cells at the transcriptional level within the graft itself. Two prominent subsets of γδ T cells have been defined based on their δ chain: Vδ1 and Vδ2 T cells. Vδ2 cells are normally the predominant subset in blood and are involved in the inflammatory response. Vδ1 cells normally reside and are predominant in mucosal surfaces, possess potent immunoregulatory and suppressive capacities, and have been shown to emerge into the peripheral blood to a degree that gives them predominance over Vδ2 cells in tolerant liver transplant recipients (39,54). Zhao et al. showed that Vδ1 cells also accumulated within the grafts of operationally tolerant recipients in an antigen-driven process, and that the complementarity-determining region 3 (CDR3) sequences of the δ chain of these Vδ1 cells showed oligoclonal expansion, thereby suggesting that tolerance might be identified through sequencing analysis of these intragraft cells.
In the largest analysis of transcriptomic data pertaining to transplant tolerance, Li and colleagues extended previous work by developing a tolerance signature independent of recipient age, donor source, cause of end-stage liver disease, or concomitant viral infection (49). This was achieved through the amalgamation of living and deceased donor data, and of pediatric as well as adult data, from across different clinical centers. The 13-gene tolerance signature identified (Table 1) was highly associated with NK cells, corroborating earlier work, and proved to have striking predictive accuracy, exhibiting 100% sensitivity and 83% specificity. This degree of predictive capacity would appear to obviate the need for the biopsy-derived gene signatures, thought to be of superior utility as biomarkers of tolerance in earlier studies (43).
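For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity are derived from a confusion matrix in which a tolerant recipient counts as a "positive". The counts are hypothetical and were chosen only to reproduce the reported percentages; they are not taken from the study by Li and colleagues.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    # Sensitivity: proportion of truly tolerant recipients predicted tolerant.
    sensitivity = tp / (tp + fn)
    # Specificity: proportion of non-tolerant recipients predicted non-tolerant.
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical cohort: 10 tolerant recipients, all correctly identified,
# and 12 non-tolerant recipients, of whom 2 are wrongly predicted tolerant.
sens, spec = sensitivity_specificity(tp=10, fn=0, tn=10, fp=2)
print(f"{sens:.0%} sensitivity, {spec:.0%} specificity")
# 100% sensitivity, 83% specificity
```

In the weaning context, the false positives are the clinically costly errors, since they correspond to recipients predicted to be tolerant who would go on to reject if immunosuppression were withdrawn.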
The benefits of identifying robust non-invasive biomarkers over those derived from biopsy tissue are self-evident. Non-coding transcripts such as miRNAs have been shown to be more stable in peripheral blood than mRNA, have been implicated in the control of genes relevant to alloreactive immune responses, and, with the advent of NGS techniques, offer the promise of novel PBMC-derived tolerance signatures (55,56). Using miRNA TaqMan low-density arrays targeting 381 human miRNAs, Danger et al. reported on the modulation of expression of eight miRNAs in peripheral blood samples from nine tolerant kidney transplant recipients as compared with 10 patients with stable renal function under immunosuppression (57). They noted that B cells from the operationally tolerant group overexpressed miR-142-3p, and that this expression was not modulated by immunosuppression. The stability of miRNA in biofluids allowed Lorenzen et al. to investigate miRNA levels in the urine of a small retrospective cohort of kidney transplant recipients, and to identify miR-210 as a reliable marker of acute rejection and predictor of long-term graft function (58). In a multicenter cohort of renal allograft recipients, Suthanthiran and colleagues prospectively validated a three-gene urinary mRNA signature [interferon inducible protein 10 (IP-10) mRNA, 18S rRNA, and CD3ε mRNA] (59). Their results represent a major step toward achieving non-invasive diagnosis and prediction of acute allograft rejection, and highlight the utility of pursuing biomarkers across varied tissue and biofluid samples. The success of miRNA biomarkers in studies of renal allograft tolerance and rejection has helped to instigate some early work in rodent models of liver transplant tolerance, while human studies are still awaited (60,61).
As highlighted by miRNA biomarkers, it would be remiss to conclude this review of transcriptomic research into biomarkers of liver transplant tolerance without reference to the important cross-fertilization of ideas, of methodological approaches, and of data and sample sharing with research groups investigating kidney transplantation tolerance. Transcriptomic research into kidney transplantation faces some unique challenges, the scarcity of patients able to achieve operational tolerance being one important example. Fewer than 200 cases of kidney operational tolerance have been described over the last 40 years (62). Nevertheless, with the successful development of research consortia in this field, a number of transcriptional studies have been successfully undertaken (47,63–69). Very broadly, these reports presented gene lists converging toward a B cell signature of tolerance, and in so doing corroborated other data showing that both the percentage and the absolute number of B cells are increased in operationally tolerant kidney allograft recipients (64,65,70,71). Despite efforts to coordinate these studies, reports on kidney transplant tolerance have been extremely heterogeneous in terms of the techniques used, the control groups drawn upon, and the various clinical profiles of the patients studied. Unsurprisingly, then, overlap between the gene markers identified by different research groups has been poor, raising questions about their reliability and about their eventual applicability in clinical contexts (72).
Similarly, while identifying shared features of tolerance between kidney and liver transplant recipients would be helpful in finding common mechanistic processes underpinning tolerance induction, in developing therapeutic strategies, and in identifying novel biomarkers, comparisons of data from disparate studies can be problematic. Array platforms often vary significantly in the probes they have in common; Lymphochip and Affymetrix chips, for instance, have only a few probes in common (43). Furthermore, in their direct comparison employing the same transcriptional technology, Lozano et al. revealed an absence of significant overlap in blood phenotypic and transcriptional patterns between operationally tolerant liver and kidney recipients (47). Nevertheless, in recent work, the power of transplantomic technologies coupled with novel statistical techniques has helped to overcome many of these difficulties. This was exemplified by the recent identification of a common rejection module (CRM) across multiple transplanted organs (liver, kidney, heart, and lung) by Khatri and colleagues, who were able to compare and integrate data from several transcriptional studies by meta-analysis (73). The CRM consists of an 11-gene signature able to diagnose acute rejection with high sensitivity and specificity, and it could accurately predict future injury to a graft across all four organs. In recent months, a similar methodological approach was applied to integrate five of the aforementioned disparate kidney transcriptional datasets, in order to define a robust gene signature of operational tolerance (72). The meta-analytical methodology was able to reconcile the lack of overlap between the five studies, and to identify a gene signature involving proliferation of B and CD4 T cells, and inhibition of CD14 monocytes.
This gene signature, narrowed down to 20 biomarkers, underwent full cross-validation, and was shown to be highly predictive in new samples and new patients, independent of the array technology used. It is critical that similar meta-analyses are performed on the liver tolerance datasets discussed here. The proof of the clinical utility of all these predictive biomarker sets rests on their successful application in prospective studies of biomarker-targeted immunosuppression weaning within a randomized, controlled setting. This, precisely, is the purpose of a large European trial currently underway called "BIOmarker-Driven personalized IMmunosuppression," or BIO-DrIM (www.biodrim.eu).
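The general idea behind integrating a gene's expression change across several independent datasets, as in the meta-analyses discussed in this section, can be sketched with a fixed-effect, inverse-variance weighted pooling of per-study effect sizes. This is a textbook illustration and not necessarily the statistical approach used in the cited studies (72,73); the function name and the per-study values are hypothetical.

```python
import math


def pool_fixed_effect(effects, standard_errors):
    """Inverse-variance weighted (fixed-effect) pooling of per-study effects.

    Returns the pooled effect and its standard error.
    """
    # More precise studies (smaller standard error) receive larger weights.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log2 fold changes for one gene (tolerant versus non-tolerant)
# in three studies, with their standard errors.
effects = [1.0, 2.0, 1.5]
standard_errors = [0.5, 0.5, 1.0]

pooled, pooled_se = pool_fixed_effect(effects, standard_errors)
print(round(pooled, 3), round(pooled_se, 3))  # 1.5 0.333
```

Because the pooled standard error is smaller than that of any single study, a gene with a modest but consistent effect across platforms can reach significance even when it fails to do so in each study taken alone. A random-effects model would be the more cautious choice where between-study heterogeneity is substantial.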

Conclusion
The unique characteristics of the liver transplant setting, alongside the technological advances in transplantomic disciplines, which have enabled the discrimination of operational tolerance at a molecular level, present researchers with the opportunities to decipher the immunological mechanisms underlying drug-free allograft survival and to develop therapeutic targets aimed toward tolerance induction strategies.
The understanding that a large proportion of liver transplant recipients, particularly those living with their graft for a number of years, are over-immunosuppressed, must act to incentivize the translation of biomarker discovery into everyday clinical practice.
Emergent technologies, including next-generation sequencing, must be capitalized upon to provide insights into normal, pathological, and pharmacological processes. As the diverse omics fields become more elaborate and produce ever more data, collaboration between researchers, laboratories, hospitals, and other institutions, and the integration of clinical and molecular data, become essential to advancing the field of transplantation and developing personalized therapy.

Author Contributions
SM, MM-L, and AS-F all contributed to the conception, drafting, critical revision, and final approval of this work. All are accountable for its content.